CN114565816B - Multi-mode medical image fusion method based on global information fusion - Google Patents

Multi-mode medical image fusion method based on global information fusion

Info

Publication number
CN114565816B
CN114565816B · CN202210202366.1A
Authority
CN
China
Prior art keywords
fusion
module
nth
convolution
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210202366.1A
Other languages
Chinese (zh)
Other versions
CN114565816A (en)
Inventor
陈勋
张静
刘爱萍
张勇东
吴枫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology of China USTC
Original Assignee
University of Science and Technology of China USTC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology of China USTC filed Critical University of Science and Technology of China USTC
Priority to CN202210202366.1A priority Critical patent/CN114565816B/en
Publication of CN114565816A publication Critical patent/CN114565816A/en
Application granted granted Critical
Publication of CN114565816B publication Critical patent/CN114565816B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F 18/2155 Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/088 Non-supervised learning, e.g. competitive learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a multi-modal medical image fusion method based on global information fusion, which comprises the following steps: 1. preprocessing the original multi-modal medical images by color space conversion and image cropping; 2. establishing modal branch networks that interact at multiple scales through fusion modules, and building a Transformer-based fusion module to combine multi-modal feature information; 3. establishing a reconstruction module that synthesizes a fused image from the multi-scale multi-modal features; 4. training and evaluating the model on a public data set; 5. performing the medical image fusion task with the trained model. Through the Transformer fusion module and the interactive modal branch networks, the invention can fully fuse multi-modal semantic information and achieve a fine-grained fusion effect, preserving the structure and texture information of the original images well while also alleviating the mosaic artifacts caused by low-resolution medical images.

Description

Multi-mode medical image fusion method based on global information fusion
Technical Field
The invention relates to the technical field of image fusion, in particular to a medical image fusion technology based on deep learning.
Background
Medical images help doctors better understand human structures and tissues, and are widely used in clinical applications such as disease diagnosis, treatment planning and surgical guidance. Because of differences in imaging mechanisms, medical images of different modalities differ in which human organs and tissues they emphasize. A single-modality medical image often cannot provide comprehensive and sufficient information, so doctors frequently need to examine several images at the same time to judge a condition accurately, which complicates diagnosis. Owing to these limitations of single-modality medical images, multi-modality medical image fusion is a necessary area of research. Multi-modality medical image fusion refers to combining the important information of medical images of different modalities of the same scene into one synthesized image.
In general, medical images can be divided into anatomical images and functional images. Anatomical images have high spatial resolution and can image organ anatomy clearly, but they cannot display functional changes in human metabolism; examples are computed tomography (Computed Tomography, CT) and magnetic resonance (Magnetic Resonance, MR) imaging. Functional images, conversely, display function and metabolism well but have low resolution and cannot accurately describe the anatomical details of organs; examples are positron emission tomography (Positron Emission Tomography, PET) and single-photon emission computed tomography (Single-Photon Emission Computed Tomography, SPECT). Even though CT and MR are both anatomical modalities and PET and SPECT are both functional modalities, the information they emphasize is not the same. CT mainly reflects the position of human bones and implants, while MR mainly provides clear detailed information about soft tissues and other structures. MR comprises multiple sequences that focus on sub-regions with different properties; the common sequences are T1-weighted (denoted T1), contrast-enhanced T1-weighted (denoted T1c), T2-weighted (denoted T2), and fluid-attenuated inversion recovery (denoted FLAIR). PET mainly reflects tumor function and metabolic information, while SPECT mainly provides blood-flow information of organs and tissues.
Most multi-modality medical image fusion methods can be summarized as three stages: feature extraction, fusion and reconstruction. To achieve medical image fusion, researchers at home and abroad have proposed a variety of algorithms over the last three decades. Broadly, these methods fall into two main categories: traditional fusion methods and deep-learning-based fusion methods.
In the traditional medical image fusion framework, researchers have proposed many decomposition or transformation methods to extract features of the source images, then select a fusion strategy to fuse the features, and finally apply the inverse transform to the fused features to reconstruct a fused image. Traditional methods can be divided into four classes according to the feature extraction scheme: (1) sparse-representation-based methods; (2) methods based on multi-scale decomposition, such as pyramids and wavelets; (3) subspace-based methods, such as independent component analysis; (4) salient-feature-based methods. Traditional medical image fusion methods achieve good fusion results, but they have shortcomings that limit further improvement of fusion performance. First, the fusion performance of traditional approaches relies heavily on hand-crafted features, which limits the generalization of these approaches to other fusion tasks. Second, different features may require different fusion strategies to work well. Third, for sparse-representation-based fusion methods, dictionary learning is relatively time-consuming, so synthesizing one fused image takes considerable time.
In recent years, deep-learning-based methods have become a new research focus in the field of image fusion. Deep learning models represented by the convolutional neural network (Convolutional Neural Network, CNN) and the generative adversarial network (Generative Adversarial Network, GAN) have been successfully applied to multi-focus and infrared-visible image fusion problems; they do not require manually defined features and fusion strategies, and thus show advantages over traditional fusion methods. However, because no reference image of the fusion result can be constructed for supervised learning, and because the complexity and diversity of human structures and tissues make the imaging characteristics of each modality hard to describe quantitatively, research on deep-learning-based medical image fusion is still relatively scarce and remains at an early stage.
It has been found that existing medical image fusion methods typically use either a manually defined fusion strategy or a convolution-based network to fuse multi-modal image features. However, such fusion strategies cannot effectively extract the global semantic information of the multi-modal images. In addition, current deep-learning-based medical image fusion methods make insufficient and imprecise use of multi-modal image information. Most methods use the multi-modality images in a simplistic manner, most commonly by stacking the original images of different modalities (or their separately extracted shallow features) along the channel dimension and then feeding them directly into a network model for fusion.
Disclosure of Invention
The invention provides a medical image fusion method based on global information fusion, which aims to combine the global information of multi-modal features through a self-attention mechanism and to exploit the information of different modalities as fully as possible through interactive modal branch networks, thereby achieving a high-quality medical image fusion effect.
In order to achieve the aim of the invention, the invention adopts the following technical scheme:
the invention discloses a multi-mode medical image fusion method based on global information fusion, which is characterized by comprising the following steps:
Step one, acquiring original medical images of M different modalities and performing YCbCr color space conversion to obtain the Y-channel images {I_1, ..., I_m, ..., I_M} of all modalities, where I_m denotes the Y-channel image of the m-th modality, m ∈ {1, 2, ..., M}; cropping the Y-channel images {I_1, ..., I_m, ..., I_M} of all modalities to obtain the image block sets {S_1, ..., S_m, ..., S_M} of all modalities, where S_m denotes the image block set of the m-th modality;
Step two, constructing a fusion network Transfusion comprising M modal branch networks, N fusion modules and a reconstruction module, and inputting the image block sets {S_1, ..., S_m, ..., S_M} of all modalities into the fusion network Transfusion:
step 2.1, constructing M modal branch networks and N fusion modules:
step 2.1.1, constructing M modal branch networks:
The m-th modal branch network among the M modal branch networks consists of N convolution modules, denoted ConvBlock_{m1}, ..., ConvBlock_{mn}, ..., ConvBlock_{mN}, where ConvBlock_{mn} denotes the n-th convolution module of the m-th modal branch network, n ∈ {1, 2, ..., N};
When n = 1, the n-th convolution module ConvBlock_{mn} of the m-th modal branch network consists of X_{mn} two-dimensional convolution layers;
When n = 2, 3, ..., N, the n-th convolution module ConvBlock_{mn} of the m-th modal branch network consists of one max-pooling layer and X_{mn} two-dimensional convolution layers;
The x-th two-dimensional convolution layer of the n-th convolution module of the m-th modal branch network has convolution kernels of size ks_{mnx} × ks_{mnx} and kn_{mnx} convolution kernels, x ∈ {1, 2, ..., X_{mn}};
Step 2.1.2, constructing N fusion modules:
Any n-th fusion module among the N fusion modules is a Transformer network composed of L self-attention modules; the l-th self-attention module among the L self-attention modules comprises 1 multi-head attention layer, 2 layer normalizations and 1 fully connected layer;
Step 2.2, inputting the image block sets {S_1, ..., S_m, ..., S_M} of all modalities into the M modal branch networks and performing information fusion through the N fusion modules:
When n = 1, the image block set S_m of the m-th modality is input into the n-th convolution module ConvBlock_{mn} of the m-th modal branch network and, after its X_{mn} two-dimensional convolution layers, a feature map of size H_n × W_n × D_n is output, where H_n, W_n and D_n denote the height, width and number of channels of the feature map output by the m-th modal branch network at the n-th convolution module; the output feature maps of the M modal branch networks at the n-th convolution module are thus obtained;
The output feature maps of the M modal branch networks at the n-th convolution module are processed by the n-th fusion module, which outputs M fused feature maps, the m-th of which is the m-th feature map output by the n-th fusion module;
The m-th feature map output by the n-th fusion module is added to the feature map output by the n-th convolution module of the m-th modal branch network, yielding the feature map of the m-th modal branch network after interaction at the n-th convolution module; the feature maps of the M modal branch networks after interaction at the n-th convolution module are thus obtained;
When n = 2, 3, ..., N, the feature map of the m-th modal branch network after interaction at the (n-1)-th convolution module is downsampled by the max-pooling layer of the n-th convolution module of the m-th modal branch network, yielding the downsampled feature map of the n-th convolution module of the m-th modal branch network; the downsampled feature map is input into the first two-dimensional convolution layer of the n-th convolution module of the m-th modal branch network and processed successively by the X_{mn} two-dimensional convolution layers, which output a feature map; the feature maps output by the n-th convolution modules of the M modal branch networks are thus obtained; these feature maps are processed by the n-th fusion module, which outputs M fused feature maps; the m-th feature map output by the n-th fusion module is added to the feature map output by the n-th convolution module of the m-th modal branch network to obtain the added feature map, so that M added feature maps are obtained; proceeding in this way up to n = N yields the feature maps output by the N-th fusion module;
Step 2.3, the reconstruction module is composed of an N-level convolution network; and outputting the characteristic graphs of the N fusion modulesInputting the primary fusion image F 'into the reconstruction module to obtain a primary fusion image F':
step 2.3.1, outputting all feature graphs output by the nth fusion moduleAdding to obtain a fusion characteristic diagram phi n The method comprises the steps of carrying out a first treatment on the surface of the Thereby obtaining fusion characteristic diagrams { phi } of N fusion modules 1 ,...,Φ n ,...,Φ N };
Step 2.3.2, constructing a reconstruction module formed by an N-level convolution network, and fusing a feature map phi of an nth fusion module n N-th stage convolutional network input to reconstruction module:
when n=1, the nth stage convolutional network includes: b (B) n Each convolution module RConvBlock n1 ,...,RConvBlock nb ,...,
When n=2, 3, N, the nth level convolutional network includes: b (B) n Each convolution module RConvBlock n1 ,...,RConvBlock nb ,...,And B n +1 upsampling layers Upsample n1 ,...,UpSample nb ,...,The b-th convolution module RConvBlock of the nth level convolution network nb Consists of Y two-dimensional convolutional layers, B e {1, 2., B n };
When n=1 and b=1, the fusion profile Φ of the nth fusion module is calculated n A b-th convolution module RConvBlock input to the n-th level convolution network nb And outputs a characteristic map ΦR nb
When n=2, 3,..n and b=1, the fusion profile Φ of the nth fusion module is taken n B-th up-sampling layer Upsample input to nth stage convolutional network nb After the up-sampling process, an up-sampled characteristic diagram phi U is output nb The method comprises the steps of carrying out a first treatment on the surface of the Thereby obtaining the up-sampled characteristic diagram { phi U (phi) of the convolution network from the level 2 to the level N-1 2b ,...,ΦU nb ,...,ΦU Nb };
When n=2, 3, N and b=2, 3, B n When in use, the fusion characteristic diagram phi of the nth fusion module is obtained n Output characteristic diagram { ΦR of front b-1 convolution modules of nth-stage convolution network n1 ,...,ΦR n(b-1) The first b up-sampled feature maps { ΦU } of the n+1st level convolutional network (n+1)1 ,...,ΦU (n+1)b After splicing, obtaining a spliced characteristic diagram; inputting the spliced characteristic diagram to a b-th convolution module RConvBlock of an n-th level convolution network nb And outputs an output characteristic diagram ΦR of a b-th convolution module of the nth-stage convolution network nb The method comprises the steps of carrying out a first treatment on the surface of the Thereby obtaining the B-th of the level 1 convolutional network 1 Output feature map of each convolution module
B of the level 1 convolutional network 1 Output feature map of each convolution moduleAfter processing of a convolution layer, a primary fusion image F' is obtained;
Step three, constructing a loss function and training the network to obtain an optimal fusion model:
Step 3.1, computing the entropy of each image block set among the image block sets {S_1, ..., S_m, ..., S_M} of all modalities to obtain the corresponding entropy values {e_1, ..., e_m, ..., e_M}, where e_m denotes the entropy value of the image block set of the m-th modality;
Step 3.2, normalizing the entropy values {e_1, ..., e_m, ..., e_M} to obtain the weights {ω_1, ..., ω_m, ..., ω_M} of the image block sets {S_1, ..., S_m, ..., S_M} of all modalities, where ω_m denotes the weight of the image block set of the m-th modality;
Step 3.3, constructing the total loss function Loss by formula (1):
Loss = Σ_{m=1}^{M} ω_m · L_ssim(S_m, F′)   (1)
In formula (1), L_ssim(S_m, F′) denotes the structural similarity loss function between the image block set S_m of the m-th modality and the preliminary fused image F′;
Step 3.4, minimizing the total loss function Loss with an optimizer so as to optimize all parameters of the fusion network Transfusion and obtain the optimal fusion model;
Step four, processing the Y-channel images {I_1, I_2, ..., I_M} of all modalities with the optimal fusion model and outputting the preliminary fused image F′; the preliminary fused image F′ is converted into the RGB color space to obtain the final fused image F.
The multi-modal medical image fusion method based on global information fusion is further characterized in that the n-th fusion module in step 2.2 performs its processing as follows:
Step 2.2.1, the n-th fusion module concatenates and flattens the feature maps output by the n-th convolution modules of the M modal branch networks into a flattened feature vector of size (M*H_n*W_n) × D_n; the flattened feature vector is added to a trainable vector of the same size to obtain the input feature vector of the n-th fusion module;
Step 2.2.2, when l = 1, the 1st self-attention module of the n-th fusion module linearly maps the input feature vector to obtain three matrices Q_{nl}, K_{nl}, V_{nl}, and then computes the multi-head attention result Z_{nl} among Q_{nl}, K_{nl}, V_{nl}; the multi-head attention result Z_{nl} is input into the fully connected layer of the l-th self-attention module of the n-th fusion module to obtain the output sequence vector of the l-th self-attention module of the n-th fusion module;
When l = 2, 3, ..., L, the output sequence vector of the (l-1)-th self-attention module of the n-th fusion module is input into the l-th self-attention module of the n-th fusion module to obtain the output sequence vector of the l-th self-attention module of the n-th fusion module; the output sequence vector of the L-th self-attention module of the n-th fusion module is thus obtained;
Step 2.2.3, the output sequence vector is split into M modalities, and each part is reshaped to size H_n × W_n × D_n to obtain the output feature maps of the n-th fusion module.
Compared with the prior art, the invention has the beneficial effects that:
1. The invention provides an unsupervised anatomical and functional image fusion method. The method introduces a Transformer structure as the fusion strategy; the Transformer uses its self-attention mechanism to combine the global information of the multi-modal medical images and to fully fuse the multi-modal semantic information, thereby achieving a fine-grained fusion effect. The invention not only preserves the structure and texture information of the anatomical image well, but also alleviates the mosaic artifacts caused by the low resolution of the functional image.
2. The present invention proposes a modal branching network that interacts on multiple scales. The network can extract multi-scale complementary characteristics of each modal image and fully utilizes multi-modal image information. The interactive branching network enhances the anatomic and functional image fusion effect.
Drawings
FIG. 1 is a schematic flow chart of the method of the present invention, wherein "ks×ks, kn" represents one convolution layer with a kernel size of ks×ks and a kernel number of kn;
FIG. 2 is a block diagram of a modal branching network and a fusion module provided by an embodiment of the present invention;
fig. 3 is a structural diagram of a reconstruction module according to an embodiment of the present invention.
Detailed Description
In this embodiment, a multi-mode image fusion method based on global information fusion, as shown in fig. 1, includes the following steps:
Step one, acquiring original medical images of M different modalities and performing color space conversion and image cropping preprocessing to obtain the preprocessed image block sets {S_1, S_2, ..., S_M} of all modalities, where S_m denotes the image block set of the m-th modality, m ∈ {1, 2, ..., M}:
Step 1.1, acquiring the original medical images of the required modalities from the Harvard medical image dataset website (http://www.med.harvard.edu/AANLIB/home.html); this embodiment collects medical images of M = 2 modalities from the public dataset, comprising 279 pairs of MR-T1 and PET images and 318 pairs of MR-T2 and SPECT images, where MR-T1 and MR-T2 are gray-scale anatomical images with 1 channel, and PET and SPECT are functional images in the RGB color space with 3 channels;
Step 1.2, converting the images in the RGB color space into the YCbCr space according to formula (1):
In formula (1), R, G and B are the three channels of the RGB color space, Y is the luminance channel, and Cb and Cr are the two chrominance channels;
Step 1.3, in order to enlarge the number of samples, the gray-scale images and the Y-channel images are cropped into image blocks to obtain the image block sets {S_1, S_2, ..., S_M}; in this embodiment, the size of the cropped image blocks is 64 × 64;
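A minimal sketch of this preprocessing step is given below for reference only. The ITU-R BT.601 conversion coefficients and the non-overlapping patch extraction are assumptions, since the body of formula (1) and the cropping stride are not reproduced above.

```python
import numpy as np

def rgb_to_ycbcr(rgb: np.ndarray) -> np.ndarray:
    """Convert an H x W x 3 RGB image in [0, 255] to YCbCr.

    BT.601 coefficients are assumed; the patent only states that formula (1)
    maps R, G, B to a luminance channel Y and two chrominance channels Cb, Cr.
    """
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.1687 * r - 0.3313 * g + 0.5 * b + 128.0
    cr =  0.5 * r - 0.4187 * g - 0.0813 * b + 128.0
    return np.stack([y, cb, cr], axis=-1)

def extract_patches(img: np.ndarray, size: int = 64, stride: int = 64) -> np.ndarray:
    """Crop a 2-D image (Y channel or gray-scale) into size x size blocks."""
    h, w = img.shape
    patches = [img[i:i + size, j:j + size]
               for i in range(0, h - size + 1, stride)
               for j in range(0, w - size + 1, stride)]
    return np.stack(patches)

# Example: a functional image (PET/SPECT) contributes only its Y channel,
# while a gray-scale anatomical image (MR) is used directly.
pet_rgb = np.random.randint(0, 256, (256, 256, 3)).astype(np.float32)
mr_gray = np.random.rand(256, 256).astype(np.float32) * 255
S_pet = extract_patches(rgb_to_ycbcr(pet_rgb)[..., 0])   # image block set S_1
S_mr  = extract_patches(mr_gray)                          # image block set S_2
print(S_pet.shape, S_mr.shape)                            # (16, 64, 64) each
```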
Step two, constructing a fusion network Transfusion comprising M modal branch networks, N fusion modules and a reconstruction module, and inputting the image block sets {S_1, ..., S_m, ..., S_M} of all modalities into the fusion network Transfusion:
Step 2.1, constructing M modal branch networks and N fusion modules:
Step 2.1.1, constructing M modal branch networks:
The m-th modal branch network among the M modal branch networks consists of N convolution modules, denoted ConvBlock_{m1}, ..., ConvBlock_{mn}, ..., ConvBlock_{mN}, where ConvBlock_{mn} denotes the n-th convolution module of the m-th modal branch network, n ∈ {1, 2, ..., N};
When n = 1, the n-th convolution module ConvBlock_{mn} of the m-th modal branch network consists of X_{mn} two-dimensional convolution layers;
When n = 2, 3, ..., N, the n-th convolution module ConvBlock_{mn} of the m-th modal branch network consists of one max-pooling layer and X_{mn} two-dimensional convolution layers;
The x-th two-dimensional convolution layer of the n-th convolution module of the m-th modal branch network has convolution kernels of size ks_{mnx} × ks_{mnx} and kn_{mnx} convolution kernels, x ∈ {1, 2, ..., X_{mn}};
In this embodiment, N = 4, the kernel size of all max-pooling layers is 2 × 2 with stride 2, and X_{mn}, ks_{mnx} and kn_{mnx} are as shown in Fig. 2;
Step 2.1.2, constructing N fusion modules:
Any n-th fusion module among the N fusion modules is a Transformer network composed of L self-attention modules; the l-th self-attention module among the L self-attention modules comprises 1 multi-head attention layer, 2 layer normalizations and 1 fully connected layer; in this embodiment, L = 1;
Step 2.2, inputting the image block sets {S_1, ..., S_m, ..., S_M} of all modalities into the M modal branch networks and performing information fusion through the N fusion modules:
When n = 1, the image block set S_m of the m-th modality is input into the n-th convolution module ConvBlock_{mn} of the m-th modal branch network and, after its X_{mn} two-dimensional convolution layers, a feature map of size H_n × W_n × D_n is output, where H_n, W_n and D_n denote the height, width and number of channels of the feature map output by the m-th modal branch network at the n-th convolution module; the output feature maps of the M modal branch networks at the n-th convolution module are thus obtained. In this embodiment, (H_1, W_1, D_1) = (64, 64, 64), (H_2, W_2, D_2) = (32, 32, 128), (H_3, W_3, D_3) = (16, 16, 256), (H_4, W_4, D_4) = (8, 8, 512);
The output feature maps of the M modal branch networks at the n-th convolution module are processed by the n-th fusion module according to formula (2), which outputs M fused feature maps, the m-th of which is the m-th feature map output by the n-th fusion module:
In formula (2), Transformer_n denotes the n-th fusion module, which is realized by the following steps:
Step 2.2.1, the n-th fusion module concatenates and flattens the feature maps output by the n-th convolution modules of the M modal branch networks into a flattened feature vector of size (M*H_n*W_n) × D_n; the flattened feature vector is added to a trainable vector of the same size to obtain the input feature vector of the n-th fusion module;
Step 2.2.2 the self-attention mechanism module of the nth fusion module takes the feature vectorLinear mapping to three matrices, Q n ,K n ,V n
In the formula (3), the amino acid sequence of the compound,is a trainable matrix with the size of D n ×D n
Step 2.2.3, Q n ,K n ,V n Respectively dividing into h heads to obtaini e {1, 2..h }, and then calculating the multi-head attention according to equation (4) -equation (6), resulting in the result Z:
in the formula (5), concat represents a splicing operation,is a trainable matrix; in formula (6), layerNorm represents layer normalization;
Step 2.2.4, the multi-head attention result Z_n is input into the fully connected layer according to formula (7) to obtain the output sequence vector of the n-th fusion module:
In formula (7), MLP denotes the fully connected layer;
Step 2.2.5, the output sequence vector is split into M modalities, and each part is reshaped to size H_n × W_n × D_n to obtain the output feature maps of the n-th fusion module, the m-th of which is the m-th feature map output by the n-th fusion module;
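Steps 2.2.1 through 2.2.5 correspond to a standard Transformer encoder layer applied to the concatenated, flattened multi-modal features. The PyTorch sketch below is an illustration under stated assumptions: the number of heads h, the MLP width, the GELU activation and the exact placement of the two layer normalizations are not specified in the text and are chosen here as common defaults.

```python
import torch
import torch.nn as nn

class FusionModule(nn.Module):
    """Transformer fusion module at one scale (L = 1 self-attention module).

    Flattens the M modal feature maps into a sequence of length M*H_n*W_n,
    adds a trainable position vector, applies multi-head self-attention with
    two layer normalizations and a fully connected (MLP) layer, and reshapes
    the result back into M modality-specific feature maps.
    """
    def __init__(self, dim, seq_len, heads=8, mlp_ratio=2):
        super().__init__()
        self.pos = nn.Parameter(torch.zeros(1, seq_len, dim))  # trainable vector of the same size
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, mlp_ratio * dim),
                                 nn.GELU(),
                                 nn.Linear(mlp_ratio * dim, dim))

    def forward(self, feats):                      # feats: list of M tensors, B x D x H x W
        B, D, H, W = feats[0].shape
        seq = torch.cat([f.flatten(2).transpose(1, 2) for f in feats], dim=1)  # B x (M*H*W) x D
        seq = seq + self.pos
        z = self.norm1(seq + self.attn(seq, seq, seq, need_weights=False)[0])  # attention + residual
        z = self.norm2(z + self.mlp(z))                                         # MLP + residual
        out = z.split(H * W, dim=1)                # back to M modality-specific sequences
        return [o.transpose(1, 2).reshape(B, D, H, W) for o in out]

if __name__ == "__main__":
    # scale n = 4 of the embodiment: (H_4, W_4, D_4) = (8, 8, 512), M = 2
    fm = FusionModule(dim=512, seq_len=2 * 8 * 8)
    f1, f2 = torch.randn(1, 512, 8, 8), torch.randn(1, 512, 8, 8)
    out1, out2 = fm([f1, f2])
    print(out1.shape, out2.shape)
```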
The m-th feature map output by the n-th fusion module is added to the feature map output by the n-th convolution module of the m-th modal branch network, yielding the feature map of the m-th modal branch network after interaction at the n-th convolution module; the feature maps of the M modal branch networks after interaction at the n-th convolution module are thus obtained;
When n = 2, 3, ..., N, the feature map of the m-th modal branch network after interaction at the (n-1)-th convolution module is downsampled by the max-pooling layer of the n-th convolution module of the m-th modal branch network, yielding the downsampled feature map of the n-th convolution module of the m-th modal branch network; the downsampled feature map is input into the first two-dimensional convolution layer of the n-th convolution module of the m-th modal branch network and processed successively by the X_{mn} two-dimensional convolution layers, which output a feature map; the feature maps output by the n-th convolution modules of the M modal branch networks are thus obtained; these feature maps are processed by the n-th fusion module, which outputs M fused feature maps; the m-th feature map output by the n-th fusion module is added to the feature map output by the n-th convolution module of the m-th modal branch network to obtain the added feature map, so that M added feature maps are obtained; proceeding in this way up to n = N yields the feature maps output by the N-th fusion module;
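Putting step 2.2 together, the branch networks and the fusion modules can be made to interact as sketched below. This sketch is not self-contained: it composes the hypothetical ModalBranch and FusionModule classes from the earlier sketches, and the element-wise addition of the fused map back onto each branch feature is the interaction described in this step.

```python
import torch
import torch.nn as nn

class InteractiveEncoder(nn.Module):
    """M modal branches interacting through a shared fusion module at every scale.

    At scale n, each branch output is processed by the fusion module; the
    returned modality-specific fused map is added back to the branch feature
    before it enters the next (pooled) convolution module.
    """
    def __init__(self, branches, fusions):
        super().__init__()
        self.branches = nn.ModuleList(branches)   # M ModalBranch instances
        self.fusions = nn.ModuleList(fusions)     # N FusionModule instances

    def forward(self, images):                    # list of M tensors, B x 1 x H x W
        feats = images
        phis = []
        for n in range(len(self.fusions)):
            # branch convolution module n for every modality
            feats = [self.branches[m].blocks[n](feats[m]) for m in range(len(feats))]
            # global fusion across modalities at this scale
            fused = self.fusions[n](feats)
            # fused feature map of scale n (step 2.3.1): sum over modalities
            phis.append(torch.stack(fused, dim=0).sum(dim=0))
            # interaction: add the fused maps back onto the branch features
            feats = [f + g for f, g in zip(feats, fused)]
        return phis                               # inputs to the reconstruction module
```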
Step 2.3, constructing a reconstruction module formed by an N-level convolution network, and outputting characteristic graphs of the N fusion modulesInputting the primary fusion image F 'into a reconstruction module to obtain a primary fusion image F':
step 2.3.1, outputting all feature graphs output by the nth fusion moduleAdding to obtain a fusion characteristic diagram phi n The method comprises the steps of carrying out a first treatment on the surface of the Thereby obtaining fusion characteristic diagrams { phi } of N fusion modules 1 ,...,Φ n ,...,Φ N };
Step 2.3.2, constructing a reconstruction module formed by an N-level convolution network, and fusing a feature map phi of an nth fusion module n N-th stage convolutional network input to reconstruction module:
when n=1, the nth stage convolutional network includes: b (B) n Multiple convolution modesBlock RConvBlock n1 ,RConvBlock n2 ,...,
When n=2, 3, N, the nth level convolutional network includes: b (B) n Each convolution module RConvBlock n1 ,RConvBlock n2 ,...,And B n +1 upsampling layers Upsample n1 ,UpSample n2 ,...,The b-th convolution module RConvBlock of the nth level convolution network nb Consists of Y two-dimensional convolutional layers, B e {1, 2., B n };
When n=1 and b=1, the fusion profile Φ of the nth fusion module is taken n The b-th convolution module RConvBlock input to the nth stage convolution network nb And outputs a characteristic map ΦR nb
When n=2, 3,..n and b=1, the fusion profile Φ of the nth fusion module is taken n B-th up-sampling layer Upsample input to nth stage convolutional network nb After the up-sampling process, an up-sampled characteristic diagram phi U is output nb The method comprises the steps of carrying out a first treatment on the surface of the Thereby obtaining the up-sampled characteristic diagram { phi U (phi) of the convolution network from the level 2 to the level N-1 2b ,...,ΦU nb ,...,ΦU Nb };
When n=2, 3, N and b=2, 3, B n When in use, the fusion characteristic diagram phi of the nth fusion module is obtained n Output characteristic diagram { ΦR of front b-1 convolution modules of nth-stage convolution network n1 ,...,ΦR n(b-1) The first b up-sampled feature maps { ΦU } of the n+1st level convolutional network (n+1)1 ,...,ΦU (n+1)b After splicing, obtaining a spliced characteristic diagram; inputting the spliced characteristic diagram to a b-th convolution module RConvBlock of an n-th level convolution network nb And outputs the b-th of the nth stage convolutional networkOutput characteristic diagram phi R of convolution module nb The method comprises the steps of carrying out a first treatment on the surface of the Thereby obtaining the B-th of the level 1 convolutional network 1 Output feature map of each convolution module
B of the level 1 convolutional network 1 Output feature map of each convolution moduleAfter processing of a convolution layer, a primary fusion image F' is obtained;
in this embodiment, the reconstruction module is shown in fig. 3, where y=2, b 1 =3,B 2 =2,B 3 =1,B 4 =0;
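For orientation, a deliberately simplified decoder sketch is given below: it only upsamples the deepest fused map and progressively concatenates it with the shallower fused maps, whereas the actual embodiment of Fig. 3 uses the denser multi-branch wiring with B_1 = 3, B_2 = 2, B_3 = 1, B_4 = 0. The bilinear upsampling and 3 × 3 kernels are assumptions.

```python
import torch
import torch.nn as nn

class RConvBlock(nn.Module):
    """One reconstruction convolution module (Y = 2 two-dimensional convolutions)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True))

    def forward(self, x):
        return self.block(x)

class SimpleReconstruction(nn.Module):
    """Simplified stand-in for the N-level reconstruction network of Fig. 3.

    Each deeper fused map is upsampled and concatenated with the next
    shallower fused map before a convolution module; a final convolution
    layer produces the preliminary fused image F'.
    """
    def __init__(self, widths=(64, 128, 256, 512)):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.blocks = nn.ModuleList(
            RConvBlock(widths[i] + widths[i + 1], widths[i]) for i in range(len(widths) - 1))
        self.head = nn.Conv2d(widths[0], 1, kernel_size=1)

    def forward(self, phis):                 # phis[n]: fused map of scale n+1, shallow -> deep
        x = phis[-1]
        for i in range(len(phis) - 2, -1, -1):
            x = self.blocks[i](torch.cat([phis[i], self.up(x)], dim=1))
        return self.head(x)                  # preliminary fused image F'

if __name__ == "__main__":
    phis = [torch.randn(1, 64, 64, 64), torch.randn(1, 128, 32, 32),
            torch.randn(1, 256, 16, 16), torch.randn(1, 512, 8, 8)]
    print(SimpleReconstruction()(phis).shape)   # torch.Size([1, 1, 64, 64])
```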
Step three, constructing a loss function and training a network to obtain an optimal fusion model:
Step 3.1, computing the entropy of each image block set among the image block sets {S_1, S_2, ..., S_M} of all modalities according to formulas (8)-(9) to obtain the corresponding entropy values {e_1, e_2, ..., e_M}, where e_m denotes the entropy value of the image block set of the m-th modality:
e_m = Entropy(S_m)   (8)
Entropy(S_m) = −Σ_l p_l · log2(p_l)   (9)
In formula (9), p_l is the probability of the l-th gray value;
Step 3.2, normalizing the entropy values {e_1, e_2, ..., e_M} according to formula (10) to obtain the weights {ω_1, ω_2, ..., ω_M} of the image block sets {S_1, S_2, ..., S_M} of all modalities, where ω_m denotes the weight of the image block set of the m-th modality:
In formula (10), η is a temperature parameter; in this embodiment, η = 1;
Step 3.3, constructing the total loss function Loss by formula (11):
Loss = ω_1 · L_ssim(S_1, F′) + ω_2 · L_ssim(S_2, F′)   (11)
L_ssim(S_j, F′) = 1 − SSIM(S_j, F′)   (12)
In formula (11), L_ssim(S_m, F′) denotes the structural similarity loss function between the image block set S_m of the m-th modality and the preliminary fused image F′; in formula (12), SSIM is the structural similarity function;
Step 3.4, minimizing the total loss function Loss with an AdamW optimizer so as to optimize all parameters of the fusion network Transfusion and obtain the optimal fusion model;
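The training objective of step three can be sketched as follows. This is an illustration under assumptions: the SSIM here is a simplified global (single-window) variant rather than the locally windowed SSIM of formula (12), and the softmax-with-temperature normalization stands in for formula (10), whose body is not reproduced above.

```python
import numpy as np
import torch

def entropy(patches: np.ndarray, bins: int = 256) -> float:
    """Shannon entropy of a set of image patches in [0, 1] (cf. formulas (8)-(9)), in bits."""
    hist, _ = np.histogram(patches, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def modality_weights(entropies, eta: float = 1.0) -> torch.Tensor:
    """Normalize entropies into weights ω_m; a softmax with temperature η is assumed."""
    return torch.softmax(torch.tensor(entropies) / eta, dim=0)

def ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """Global SSIM between two image batches in [0, 1] (simplified stand-in for formula (12))."""
    mx, my = x.mean(dim=(1, 2, 3)), y.mean(dim=(1, 2, 3))
    vx, vy = x.var(dim=(1, 2, 3)), y.var(dim=(1, 2, 3))
    cov = ((x - mx[:, None, None, None]) * (y - my[:, None, None, None])).mean(dim=(1, 2, 3))
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def fusion_loss(sources, fused, weights):
    """Loss = Σ_m ω_m · (1 - SSIM(S_m, F')), cf. formulas (11)-(12)."""
    return sum(w * (1.0 - ssim(s, fused)).mean() for w, s in zip(weights, sources))

if __name__ == "__main__":
    s1, s2 = torch.rand(4, 1, 64, 64), torch.rand(4, 1, 64, 64)
    w = modality_weights([entropy(s1.numpy()), entropy(s2.numpy())], eta=1.0)
    fused = torch.rand(4, 1, 64, 64, requires_grad=True)   # placeholder for the network output F'
    opt = torch.optim.AdamW([fused], lr=1e-4)
    loss = fusion_loss([s1, s2], fused, w)
    loss.backward()
    opt.step()
    print(float(loss))
```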
Step four, processing the Y-channel images or gray-scale images {I_1, I_2, ..., I_M} with the optimal fusion model and outputting the preliminary fused image F′; the preliminary fused image F′ is spliced with the Cb and Cr channels and converted into the RGB color space to obtain the final fused image F;
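For completeness, the inverse step can be sketched as below: the fused luminance F′ replaces the Y channel of the functional image before conversion back to RGB. The BT.601 inverse coefficients are again an assumption, matching the preprocessing sketch above.

```python
import numpy as np

def ycbcr_to_rgb(ycbcr: np.ndarray) -> np.ndarray:
    """Inverse of the BT.601-style conversion assumed in the preprocessing sketch."""
    y, cb, cr = ycbcr[..., 0], ycbcr[..., 1] - 128.0, ycbcr[..., 2] - 128.0
    r = y + 1.402 * cr
    g = y - 0.344136 * cb - 0.714136 * cr
    b = y + 1.772 * cb
    return np.clip(np.stack([r, g, b], axis=-1), 0, 255)

def merge_fused_luminance(fused_y: np.ndarray, functional_ycbcr: np.ndarray) -> np.ndarray:
    """Splice the preliminary fused image F' (new Y channel) with the Cb/Cr channels
    of the functional image, then convert back to RGB to obtain the final fused image F."""
    out = functional_ycbcr.copy()
    out[..., 0] = fused_y
    return ycbcr_to_rgb(out)
```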
step five, evaluating the performance of the invention:
In the specific implementation, the invention is compared with the traditional method CSMCA and the deep learning methods DDcGAN and EMFusion. In addition, to illustrate the effectiveness of the Transformer-based fusion module and the interactive modal branch networks of the invention, two comparison experiments are set up: the first experiment removes the Transformer fusion module, and the second experiment replaces the interactive modal branch networks with weight-shared modal branch networks. Mutual information, the average gradient, the edge-based similarity measure Q^AB/F and the visual perception index Q_CV are used as evaluation indices; the larger the mutual information, average gradient and Q^AB/F, and the smaller Q_CV, the higher the quality of the fused image. The average fusion quality over 30 pairs of MR-T1 and PET test images and 30 pairs of MR-T2 and SPECT test images is shown in the following table:
TABLE 1 fusion performance of different methods
The experimental results show that the invention is optimal on all four indices: mutual information, average gradient, Q^AB/F and Q_CV. The Transformer fusion module of the invention contributes an improvement of 5.10%-10.02% in mutual information, 2.59%-5.28% in average gradient, 3.04%-4.36% in Q^AB/F and 1.43%-12.66% in Q_CV; the interactive modal branch networks of the invention contribute an improvement of 18.39%-19.91% in mutual information, 1.06%-6.69% in average gradient, 7.68%-11.02% in Q^AB/F and 27.69%-62.22% in Q_CV.
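Two of the reported indices can be computed with short NumPy routines. The implementations below (histogram-based mutual information and the average gradient) are common formulations given only as a reference; the patent does not specify the exact estimators, and Q^AB/F and Q_CV require the longer reference implementations from the literature.

```python
import numpy as np

def mutual_information(a: np.ndarray, b: np.ndarray, bins: int = 256) -> float:
    """Histogram-based mutual information between a source image and the fused image (bits)."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px, py = pxy.sum(axis=1, keepdims=True), pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())

def average_gradient(img: np.ndarray) -> float:
    """Average gradient of the fused image, a common sharpness/texture measure."""
    gy, gx = np.gradient(img.astype(np.float64))
    return float(np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2.0)))
```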

Claims (2)

1. A multimode medical image fusion method based on global information fusion is characterized by comprising the following steps:
Step one, acquiring original medical images of M different modalities and performing YCbCr color space conversion to obtain the Y-channel images {I_1, ..., I_m, ..., I_M} of all modalities, where I_m denotes the Y-channel image of the m-th modality, m ∈ {1, 2, ..., M}; cropping the Y-channel images {I_1, ..., I_m, ..., I_M} of all modalities to obtain the image block sets {S_1, ..., S_m, ..., S_M} of all modalities, where S_m denotes the image block set of the m-th modality;
Step two, constructing a fusion network Transfusion comprising M modal branch networks, N fusion modules and a reconstruction module, and inputting the image block sets {S_1, ..., S_m, ..., S_M} of all modalities into the fusion network Transfusion:
step 2.1, constructing M modal branch networks and N fusion modules:
step 2.1.1, constructing M modal branch networks:
The m-th modal branch network among the M modal branch networks consists of N convolution modules, denoted ConvBlock_{m1}, ..., ConvBlock_{mn}, ..., ConvBlock_{mN}, where ConvBlock_{mn} denotes the n-th convolution module of the m-th modal branch network, n ∈ {1, 2, ..., N};
When n = 1, the n-th convolution module ConvBlock_{mn} of the m-th modal branch network consists of X_{mn} two-dimensional convolution layers;
When n = 2, 3, ..., N, the n-th convolution module ConvBlock_{mn} of the m-th modal branch network consists of one max-pooling layer and X_{mn} two-dimensional convolution layers;
The x-th two-dimensional convolution layer of the n-th convolution module of the m-th modal branch network has convolution kernels of size ks_{mnx} × ks_{mnx} and kn_{mnx} convolution kernels, x ∈ {1, 2, ..., X_{mn}};
Step 2.1.2, constructing N fusion modules:
Any n-th fusion module among the N fusion modules is a Transformer network composed of L self-attention modules; the l-th self-attention module among the L self-attention modules comprises 1 multi-head attention layer, 2 layer normalizations and 1 fully connected layer;
Step 2.2, inputting the image block sets {S_1, ..., S_m, ..., S_M} of all modalities into the M modal branch networks and performing information fusion through the N fusion modules:
When n = 1, the image block set S_m of the m-th modality is input into the n-th convolution module ConvBlock_{mn} of the m-th modal branch network and, after its X_{mn} two-dimensional convolution layers, a feature map of size H_n × W_n × D_n is output, where H_n, W_n and D_n denote the height, width and number of channels of the feature map output by the m-th modal branch network at the n-th convolution module; the output feature maps of the M modal branch networks at the n-th convolution module are thus obtained;
The output feature maps of the M modal branch networks at the n-th convolution module are processed by the n-th fusion module, which outputs M fused feature maps, the m-th of which is the m-th feature map output by the n-th fusion module;
The m-th feature map output by the n-th fusion module is added to the feature map output by the n-th convolution module of the m-th modal branch network, yielding the feature map of the m-th modal branch network after interaction at the n-th convolution module; the feature maps of the M modal branch networks after interaction at the n-th convolution module are thus obtained; when n = 2, 3, ..., N, the feature map of the m-th modal branch network after interaction at the (n-1)-th convolution module is downsampled by the max-pooling layer of the n-th convolution module of the m-th modal branch network, yielding the downsampled feature map of the n-th convolution module of the m-th modal branch network; the downsampled feature map is input into the first two-dimensional convolution layer of the n-th convolution module of the m-th modal branch network and processed successively by the X_{mn} two-dimensional convolution layers, which output a feature map; the feature maps output by the n-th convolution modules of the M modal branch networks are thus obtained; these feature maps are processed by the n-th fusion module, which outputs M fused feature maps; the m-th feature map output by the n-th fusion module is added to the feature map output by the n-th convolution module of the m-th modal branch network to obtain the added feature map, so that M added feature maps are obtained; proceeding in this way up to n = N yields the feature maps output by the N-th fusion module;
Step 2.3, the reconstruction module is composed of an N-level convolution network; and outputting the characteristic graphs of the N fusion modulesInputting the primary fusion image F 'into the reconstruction module to obtain a primary fusion image F':
step 2.3.1, outputting all feature graphs output by the nth fusion moduleAdding to obtain a fusion characteristic diagram phi n The method comprises the steps of carrying out a first treatment on the surface of the Thereby obtaining fusion characteristic diagrams { phi } of N fusion modules 1 ,…,Φ n ,…,Φ N };
Step 2.3.2, constructing a reconstruction module formed by an N-level convolution network, and fusing a feature map phi of an nth fusion module n N-th stage convolutional network input to reconstruction module:
when n=1, the nth stage convolutional network includes: b (B) n Each convolution module
When n=2, 3, …, N, the nth stage convolutional network comprises: b (B) n Each convolution module And B n +1 upsampling layers-> The b-th convolution module RConvBlock of the nth level convolution network nb Consists of Y two-dimensional convolution layers, B epsilon {1,2, …, B n };
When n=1 and b=1, the fusion profile Φ of the nth fusion module is calculated n A b-th convolution module RConvBlock input to the n-th level convolution network nb And outputs a characteristic map ΦR nb
When n=2, 3, …, N and b=1, the fusion profile Φ of the nth fusion module is calculated n B-th up-sampling layer Upsample input to nth stage convolutional network nb After the up-sampling process, an up-sampled characteristic diagram phi U is output nb The method comprises the steps of carrying out a first treatment on the surface of the Thereby obtaining the up-sampled characteristic diagram { phi U (phi) of the convolution network from the level 2 to the level N-1 2b ,…,ΦU nb ,…,ΦU Nb };
When n=2, 3, …, N and b=2, 3, …, B n When in use, the fusion characteristic diagram phi of the nth fusion module is obtained n Output characteristic diagram { ΦR of front b-1 convolution modules of nth-stage convolution network n1 ,…,ΦR n(b-1 ) The first b up-sampled feature maps { ΦU } of the n+1st level convolutional network (n+1)1 ,…,ΦU (n+1)b After splicing, obtaining a spliced characteristic diagram; inputting the spliced characteristic diagram to a b-th convolution module RConvBlock of an n-th level convolution network nb And outputs an output characteristic diagram ΦR of a b-th convolution module of the nth-stage convolution network nb The method comprises the steps of carrying out a first treatment on the surface of the Thereby obtaining the B-th of the level 1 convolutional network 1 The outputs of the convolution modulesFeature drawing
B of the level 1 convolutional network 1 Output feature map of each convolution moduleAfter processing of a convolution layer, a primary fusion image F' is obtained;
Step three, constructing a loss function and training the network to obtain an optimal fusion model:
Step 3.1, computing the entropy of each image block set among the image block sets {S_1, ..., S_m, ..., S_M} of all modalities to obtain the corresponding entropy values {e_1, ..., e_m, ..., e_M}, where e_m denotes the entropy value of the image block set of the m-th modality;
Step 3.2, normalizing the entropy values {e_1, ..., e_m, ..., e_M} to obtain the weights {ω_1, ..., ω_m, ..., ω_M} of the image block sets {S_1, ..., S_m, ..., S_M} of all modalities, where ω_m denotes the weight of the image block set of the m-th modality;
Step 3.3, constructing the total loss function Loss by formula (1):
Loss = Σ_{m=1}^{M} ω_m · L_ssim(S_m, F′)   (1)
In formula (1), L_ssim(S_m, F′) denotes the structural similarity loss function between the image block set S_m of the m-th modality and the preliminary fused image F′;
Step 3.4, minimizing the total loss function Loss with an optimizer so as to optimize all parameters of the fusion network Transfusion and obtain the optimal fusion model;
Step four, processing the Y-channel images {I_1, I_2, ..., I_M} of all modalities with the optimal fusion model and outputting the preliminary fused image F′; the preliminary fused image F′ is converted into the RGB color space to obtain the final fused image F.
2. The multi-modal medical image fusion method based on global information fusion according to claim 1, wherein the nth fusion module in step 2.2 is processed according to the following procedures:
Step 2.2.1, the n-th fusion module concatenates and flattens the feature maps output by the n-th convolution modules of the M modal branch networks into a flattened feature vector of size (M*H_n*W_n) × D_n; the flattened feature vector is added to a trainable vector of the same size to obtain the input feature vector of the n-th fusion module;
Step 2.2.2, when l = 1, the 1st self-attention module of the n-th fusion module linearly maps the input feature vector to obtain three matrices Q_{nl}, K_{nl}, V_{nl}, and then computes the multi-head attention result Z_{nl} among Q_{nl}, K_{nl}, V_{nl}; the multi-head attention result Z_{nl} is input into the fully connected layer of the l-th self-attention module of the n-th fusion module to obtain the output sequence vector of the l-th self-attention module of the n-th fusion module;
When l = 2, 3, ..., L, the output sequence vector of the (l-1)-th self-attention module of the n-th fusion module is input into the l-th self-attention module of the n-th fusion module to obtain the output sequence vector of the l-th self-attention module of the n-th fusion module; the output sequence vector of the L-th self-attention module of the n-th fusion module is thus obtained;
Step 2.2.3, the output sequence vector is split into M modalities, and each part is reshaped to size H_n × W_n × D_n to obtain the output feature maps of the n-th fusion module.
CN202210202366.1A 2022-03-03 2022-03-03 Multi-mode medical image fusion method based on global information fusion Active CN114565816B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210202366.1A CN114565816B (en) 2022-03-03 2022-03-03 Multi-mode medical image fusion method based on global information fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210202366.1A CN114565816B (en) 2022-03-03 2022-03-03 Multi-mode medical image fusion method based on global information fusion

Publications (2)

Publication Number Publication Date
CN114565816A CN114565816A (en) 2022-05-31
CN114565816B true CN114565816B (en) 2024-04-02

Family

ID=81717119

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210202366.1A Active CN114565816B (en) 2022-03-03 2022-03-03 Multi-mode medical image fusion method based on global information fusion

Country Status (1)

Country Link
CN (1) CN114565816B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115115523B (en) * 2022-08-26 2022-11-25 中加健康工程研究院(合肥)有限公司 CNN and Transformer fused medical image depth information extraction method
CN115134676B (en) * 2022-09-01 2022-12-23 有米科技股份有限公司 Video reconstruction method and device for audio-assisted video completion
CN115511767B (en) * 2022-11-07 2023-04-07 中国科学技术大学 Self-supervised learning multi-modal image fusion method and application thereof
CN117173525B (en) * 2023-09-05 2024-07-09 北京交通大学 Universal multi-mode image fusion method and device
CN117853856B (en) * 2024-01-09 2024-07-30 中国矿业大学 Low-light night vision scene understanding method based on multi-mode image fusion
CN118038222A (en) * 2024-01-19 2024-05-14 南京邮电大学 Image fusion model and method based on secondary image decomposition and attention mechanism

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113469094A (en) * 2021-07-13 2021-10-01 上海中科辰新卫星技术有限公司 Multi-mode remote sensing data depth fusion-based earth surface coverage classification method
US11222217B1 (en) * 2020-08-14 2022-01-11 Tsinghua University Detection method using fusion network based on attention mechanism, and terminal device
CN114049408A (en) * 2021-11-15 2022-02-15 哈尔滨工业大学(深圳) Depth network model for accelerating multi-modality MR imaging

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20240069802A (en) * 2018-11-16 2024-05-20 Snap Inc. Three-dimensional object reconstruction

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11222217B1 (en) * 2020-08-14 2022-01-11 Tsinghua University Detection method using fusion network based on attention mechanism, and terminal device
CN113469094A (en) * 2021-07-13 2021-10-01 上海中科辰新卫星技术有限公司 Multi-mode remote sensing data depth fusion-based earth surface coverage classification method
CN114049408A (en) * 2021-11-15 2022-02-15 哈尔滨工业大学(深圳) Depth network model for accelerating multi-modality MR imaging

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Indoor crowd detection network based on multi-level features and a hybrid attention mechanism; 沈文祥; 秦品乐; 曾建潮; Journal of Computer Applications (Issue 12); full text *
Deep tumor segmentation method for nasopharyngeal carcinoma MR images with multi-modal multi-dimensional information fusion; 洪炎佳; 孟铁豹; 黎浩江; 刘立志; 李立; 徐硕瑀; 郭圣文; Journal of Zhejiang University (Engineering Science) (Issue 03); full text *

Also Published As

Publication number Publication date
CN114565816A (en) 2022-05-31

Similar Documents

Publication Publication Date Title
CN114565816B (en) Multi-mode medical image fusion method based on global information fusion
Lundervold et al. An overview of deep learning in medical imaging focusing on MRI
Li et al. VolumeNet: A lightweight parallel network for super-resolution of MR and CT volumetric data
CN111932529B (en) Image classification and segmentation method, device and system
CN113506222B (en) Multi-mode image super-resolution method based on convolutional neural network
Wang et al. SK-Unet: An improved U-Net model with selective kernel for the segmentation of multi-sequence cardiac MR
Liu et al. An automatic cardiac segmentation framework based on multi-sequence MR image
Hu et al. Recursive decomposition network for deformable image registration
Tawfik et al. Multimodal medical image fusion using stacked auto-encoder in NSCT domain
Chai et al. Synthetic augmentation for semantic segmentation of class imbalanced biomedical images: A data pair generative adversarial network approach
Wu et al. Slice imputation: Multiple intermediate slices interpolation for anisotropic 3D medical image segmentation
Atek et al. SwinT-Unet: hybrid architecture for medical image segmentation based on Swin transformer block and Dual-Scale Information
CN117853547A (en) Multi-mode medical image registration method
Qiao et al. Cheart: A conditional spatio-temporal generative model for cardiac anatomy
CN117475268A (en) Multimode medical image fusion method based on SGDD GAN
Yang et al. Hierarchical progressive network for multimodal medical image fusion in healthcare systems
CN116757982A (en) Multi-mode medical image fusion method based on multi-scale codec
Yang et al. Adaptive zero-learning medical image fusion
Zhu et al. A novel full-convolution UNet-transformer for medical image segmentation
CN116309754A (en) Brain medical image registration method and system based on local-global information collaboration
He et al. LRFNet: A real-time medical image fusion method guided by detail information
Wang et al. Multimodal parallel attention network for medical image segmentation
Shihabudeen et al. Autoencoder Network based CT and MRI Medical Image Fusion
Wu et al. Convolutional neural network with coarse-to-fine resolution fusion and residual learning structures for cross-modality image synthesis
Zhou et al. Balancing High-performance and Lightweight: HL-UNet for 3D Cardiac Medical Image Segmentation

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant