CN112488976B - Multi-modal medical image fusion method based on DARTS network - Google Patents

Multi-modal medical image fusion method based on DARTS network

Info

Publication number
CN112488976B
CN112488976B (application number CN202011467496.5A)
Authority
CN
China
Prior art keywords
network
fusion
darts
image
medical image
Prior art date
Legal status
Active
Application number
CN202011467496.5A
Other languages
Chinese (zh)
Other versions
CN112488976A (en)
Inventor
张旭明 (Zhang Xuming)
叶少壮 (Ye Shaozhuang)
Current Assignee
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date
Filing date
Publication date
Application filed by Huazhong University of Science and Technology filed Critical Huazhong University of Science and Technology
Priority to CN202011467496.5A priority Critical patent/CN112488976B/en
Priority to US18/001,295 priority patent/US11769237B2/en
Priority to PCT/CN2021/074531 priority patent/WO2022121100A1/en
Publication of CN112488976A publication Critical patent/CN112488976A/en
Application granted granted Critical
Publication of CN112488976B publication Critical patent/CN112488976B/en

Classifications

    • G06T 5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06T 3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T 3/60 Rotation of whole images or parts thereof
    • G06T 5/20 Image enhancement or restoration using local operators
    • G06T 5/60 Image enhancement or restoration using machine learning, e.g. neural networks
    • G06T 7/0012 Biomedical image inspection
    • G06V 10/454 Local feature extraction integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G06V 10/764 Recognition or understanding using pattern recognition or machine learning, using classification, e.g. of video objects
    • G06V 10/771 Feature selection, e.g. selecting representative features from a multi-dimensional feature space
    • G06V 10/806 Fusion of extracted features at the sensor, preprocessing, feature extraction or classification level
    • G06V 10/82 Recognition or understanding using neural networks
    • G16H 30/40 ICT specially adapted for the handling or processing of medical images, e.g. editing
    • A61B 8/5238 Ultrasonic diagnostic data processing for combining image data of a patient, e.g. merging several images from different acquisition modes into one image
    • G06T 2207/10081 Computed x-ray tomography [CT]
    • G06T 2207/10088 Magnetic resonance imaging [MRI]
    • G06T 2207/10104 Positron emission tomography [PET]
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G06T 2207/20221 Image fusion; Image merging
    • G06T 2207/30016 Brain (biomedical image processing)
    • G06V 2201/03 Recognition of patterns in medical or anatomical images
    • Y02T 10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Public Health (AREA)
  • Primary Health Care (AREA)
  • Epidemiology (AREA)
  • Quality & Reliability (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Image Analysis (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

The invention discloses a multi-modal medical image fusion method based on a DARTS network, belonging to the technical field of image fusion in image processing and analysis. The method adopts a Differentiable Architecture Search (DARTS) network to extract features of multi-modal medical images. In the search stage, the network learns under the guidance of the gradient of the network weights with respect to the loss function and selects, among different convolution operations and connections between nodes, the network structure best suited to the current dataset, so that the extracted features are more detailed. Meanwhile, the invention adopts, as the network loss function, multiple indexes that characterize image gray-scale information, correlation, detail information, structural features and image contrast; it can therefore achieve effective fusion of medical images through unsupervised learning in the absence of a gold standard and, compared with existing methods, achieves a better fusion effect in terms of preserving image detail and improving image contrast.

Description

Multi-modal medical image fusion method based on DARTS network
Technical Field
The invention belongs to the technical field of image fusion in image processing and analysis, and particularly relates to a multi-modal medical image fusion method based on a DARTS network.
Background
With the development of medical imaging technology, more and more imaging modalities, such as ultrasound (US), computed tomography (CT), magnetic resonance imaging (MRI), positron emission tomography (PET) and single-photon emission computed tomography (SPECT), are applied to the diagnosis and evaluation of diseases of the various organs of the human body. Each modality has its own advantages and disadvantages. CT imaging, for example, covers all anatomical parts of the human body with high density resolution, but its spatial resolution is low, its display of joints and muscles is poor, and artifacts exist. MR imaging offers high soft-tissue resolution and is free of bone and overlap artifacts, but its scan time is long and its spatial resolution is lower than that of CT. Multi-modal medical image fusion integrates the modality-specific information of images acquired under different modalities into one image, which facilitates observation and diagnosis by doctors. Multi-modal medical image fusion therefore plays an important role in clinical practice; for example, US and MRI are fused to guide prostate needle biopsy, and PET and CT are fused for lung cancer detection.
Traditional image fusion algorithms are mainly divided into pixel-level, feature-level and decision-level methods. Pixel-level fusion achieves high precision and is simple to implement, but it has poor anti-interference capability and is sensitive to noise; in medical image fusion it easily produces artifacts that affect the accuracy of a doctor's diagnosis. Feature-level fusion relies mainly on manually designed features which, given the complexity of medical images, are difficult to adapt to different medical images. Decision-level fusion is insensitive to noise but has poor fusion precision, is difficult to implement, and yields fused images containing little information, so important information is easily lost when it is used for medical image fusion.
Deep learning can learn the feature information of samples from large sample sets and has in recent years been widely used for image processing and analysis tasks such as image segmentation, image registration and image fusion. For image fusion, since there is no gold standard, deep-learning-based methods fall into two categories. The first extracts features with a deep learning method and then fuses the images with a traditional fusion strategy based on the extracted features; concretely, features are extracted with a pre-trained deep learning model, and feature fusion and the subsequent reconstruction of the fused image follow traditional strategies. The second is end-to-end fusion, which can be further divided into supervised and unsupervised methods; the approach is to design an end-to-end network structure, train it on an image fusion dataset with measures such as MSE and SSIM, and use the trained model directly for image fusion.
The first kind of method is simple to implement: a pre-trained model and traditional fusion and reconstruction strategies are used directly, and the training step is omitted. However, because the pre-trained model fits one specific dataset, such methods generalize poorly and are unsuitable for diverse datasets or multi-task image fusion. The second kind of method can realize end-to-end image fusion well by using a suitable network structure, training on the dataset to improve the network's ability to extract image features, and adding steps such as convolutional up-sampling. Here the designs of the network structure and of the loss function are crucial; existing fusion methods usually adopt simple CNN or ResNet structures and simple measures such as MSE and SSIM, which makes the quality of the fused image difficult to guarantee.
Disclosure of Invention
In view of the above drawbacks or improvement needs of the prior art, the present invention provides a multi-modal medical image fusion method based on a DARTS network, which aims to improve the image quality after fusion of multi-modal medical images.
In order to achieve the above object, the present invention provides a multi-modal medical image fusion method based on a DARTS network, comprising:
S1, performing a network structure search on a preset DARTS model by using multi-modal medical image data, to obtain a DARTS network structure suited to the multi-modal medical image data; the DARTS network structure includes one or more cell bodies connected in series; each cell body comprises a plurality of nodes; each node is connected, through different convolution operations, to the outputs of the previous two cell bodies or to other nodes within the current cell body; the channels of all nodes are merged as the output of the cell body;
S2, constructing a multi-modal medical image fusion network; the multi-modal medical image fusion network comprises a multichannel DARTS network module, a feature fusion module and an up-sampling module; the multichannel DARTS network module is composed of a plurality of parallel DARTS network structures;
each DARTS network structure down-samples an input image to obtain the corresponding feature map; the feature fusion module performs feature fusion on the feature maps output by the multichannel DARTS network module; the up-sampling module performs convolutional up-sampling on the fused features to obtain a fusion result of the same size as the input image;
S3, training the multi-modal medical image fusion network with the multi-modal medical image data by an unsupervised learning method;
and S4, inputting the multi-modal medical images to be fused into the trained multi-modal medical image fusion model to obtain the fusion result.
Further, structural similarity, multi-scale structural similarity, edge preservation degree, difference cross-correlation, mutual information, and mutual information based on structural representation are used as the loss functions for training the multi-modal medical image fusion network.
Further, the loss function for training the multi-modal medical image fusion network is:

Loss = (L_MI + L_SCD) + λ1 (L_SSIM + L_MS-SSIM) + λ2 (L_Q^AB/F + L_SR-MI)

wherein L_SSIM denotes the structural similarity loss, L_SCD the difference cross-correlation loss, L_MI the mutual information loss, L_MS-SSIM the multi-scale structural similarity loss, L_Q^AB/F the edge preservation loss, and L_SR-MI the mutual information loss based on structural representation; λ1 and λ2 are respectively the weights of the groups of loss terms.
Further, the different convolution operations in step S1 include: depth-separable convolution with kernel size 3, depth-separable convolution with kernel size 5, standard convolution with kernel size 3, standard convolution with kernel size 5, dilated (hole) convolution with kernel size 3, dilated convolution with kernel size 5, and skip connection.
Further, the DARTS network structure includes one cell body.
Further, the cell body includes four nodes.
Further, the convolution stride in the cell body is set to 1, and padding is used to keep the feature map the same size as the input image.
Further, the feature fusion module performs feature fusion on the feature maps output by the multichannel DARTS network module; specifically, feature map fusion is realized by channel merging.
Further, the method also comprises performing data enhancement on the multi-modal medical image data.
Further, the data enhancement includes translation, rotation, and non-rigid deformation.
In general, compared with the prior art, the above technical solutions contemplated by the present invention can achieve the following advantageous effects.
The method adopts a Differentiable Architecture Search (DARTS) network to extract features of multi-modal medical images. In the search stage, the network learns under the guidance of the gradient of the network weights with respect to the loss function and selects, among different convolution operations and connections between nodes, the network structure best suited to the current dataset, so that the extracted features are more detailed. Meanwhile, the invention adopts, as the network loss function, multiple indexes that characterize image gray-scale information, correlation, detail information, structural features and image contrast; it can therefore achieve effective fusion of medical images through unsupervised learning in the absence of a gold standard and, compared with existing methods, achieves a better fusion effect in terms of preserving image detail and improving image contrast.
Drawings
FIG. 1 is a structural diagram of a multimodal medical image fusion method based on DARTS network provided by the invention;
FIG. 2 is a cell structure of DARTS model under CIFAR-10 data set;
FIG. 3(a) is the source CT image used by the embodiment of the present invention and the comparison methods;
FIG. 3(b) is the source MR image used by the embodiment of the present invention and the comparison methods;
FIG. 4(a) is a fused image obtained by comparison method 1 NSCT-SR;
FIG. 4(b) is a fused image obtained by comparison method 2 DTCWT-SR;
FIG. 4(c) is a fused image obtained by the comparative method 3 NSST-PAPCNN;
FIG. 4(d) is a fused image obtained by comparison method 4 DenseFuse;
FIG. 4(e) is a fused image obtained by the comparison method 5 VIFNet;
FIG. 4(f) is a fused image obtained by comparison method 6 NestFuse;
FIG. 5 is a fused image obtained by the method of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Referring to fig. 1, the invention provides a multimodal medical image fusion method based on a DARTS network, comprising:
S1, performing a network structure search on a preset DARTS model by using multi-modal medical image data, to obtain a DARTS network structure suited to the multi-modal medical image data; the DARTS network structure comprises a plurality of cell bodies connected in series; each cell body comprises a plurality of nodes; each node is connected, through different convolution operations, to the outputs of the previous two cell bodies or to other nodes within the current cell body; the channels of all nodes are merged as the output of the cell body;
in the embodiment of the invention, the DARTS network is a DARTS model pre-trained by a data set CIFAR-10, but the full connection layer and the classifier part of the DARTS network are removed, in order to ensure that the network parameters are greatly reduced while the image characteristics are extracted, the number of the cells is set to be 1, the number of the nodes in each cell is set to be 4, the specific structure is shown in figure 2, the nodes are connected by depth separable convolution, jump connection and hole convolution, except for the several convolution operations, the node connection optional convolution operation can also be a node connection optional convolution operation which can also be depth separable convolution with a convolution kernel size of 5, the standard convolution with a convolution kernel size of 3 and the standard convolution with a convolution kernel size of 5. These convolution methods can be used to characterize image features well in an end-to-end network structure.
The channel-wise concatenation of the four nodes is taken as the cell's down-sampling output; the convolution stride in the cell is set to 1, and padding keeps the feature map the same size as the input image, to avoid information loss.
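For illustration only, the following is a minimal PyTorch sketch of such a cell. The operation assignment per node is hypothetical (the actual searched assignment is the one shown in FIG. 2); what it demonstrates is the structure just described: four nodes, stride-1 padded convolutions, and channel concatenation as the cell output.

```python
import torch
import torch.nn as nn

# Candidate operations named in the text; stride 1 plus padding keeps H and W unchanged.
def sep_conv(c, k):
    """Depth-separable convolution with kernel size k."""
    return nn.Sequential(
        nn.Conv2d(c, c, k, padding=k // 2, groups=c, bias=False),
        nn.Conv2d(c, c, 1, bias=False),
        nn.BatchNorm2d(c), nn.ReLU(inplace=True))

def dil_conv(c, k, d=2):
    """Dilated ("hole") convolution with kernel size k and dilation d."""
    return nn.Sequential(
        nn.Conv2d(c, c, k, padding=d * (k // 2), dilation=d, bias=False),
        nn.BatchNorm2d(c), nn.ReLU(inplace=True))

class Cell(nn.Module):
    """One DARTS cell: four nodes, each combining two earlier tensors;
    the cell output is the channel concatenation of the four node outputs."""
    def __init__(self, c):
        super().__init__()
        # Hypothetical operation assignment standing in for the searched result in FIG. 2.
        self.ops = nn.ModuleList([
            sep_conv(c, 3), dil_conv(c, 3),   # node 0 <- (input 0, input 1)
            sep_conv(c, 5), nn.Identity(),    # node 1 <- (input 0, node 0), skip connection
            dil_conv(c, 5), sep_conv(c, 3),   # node 2 <- (node 0, node 1)
            nn.Identity(), dil_conv(c, 3),    # node 3 <- (node 1, node 2)
        ])

    def forward(self, s0, s1):
        n0 = self.ops[0](s0) + self.ops[1](s1)
        n1 = self.ops[2](s0) + self.ops[3](n0)
        n2 = self.ops[4](n0) + self.ops[5](n1)
        n3 = self.ops[6](n1) + self.ops[7](n2)
        return torch.cat([n0, n1, n2, n3], dim=1)  # channels of all nodes merged
```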
S2, constructing a multi-modal medical image fusion network. The multi-modal medical image fusion network comprises a multichannel DARTS network module, a feature fusion module and an up-sampling module; the multichannel DARTS network module consists of multiple parallel DARTS network structures, each of which down-samples an input image to obtain the corresponding feature map. The feature fusion module fuses the feature maps output by the DARTS branches; optionally, fusion is realized by channel merging. Channel merging lets the subsequent convolutions perform a weighted fusion of the source-image features; with a properly designed network loss function, channel merging achieves a better fusion effect and retains more of the useful information of the source images. The up-sampling module performs convolutional up-sampling on the fused features to obtain a fusion result of the same size as the input image; the up-sampling operation uses 3×3 convolutions with stride 1 and padding, and reduces the number of channels of the convolutional neural network from 128 to 1.
The two-channel DARTS network shown in FIG. 1 is merely a practical example used to present the overall structure of the network; multiple parallel channels can also be provided, depending on the number of modalities of the images to be fused, as the sketch below illustrates.
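To make the arrangement concrete, a minimal PyTorch sketch of the two-branch configuration follows. Cell is the class sketched above; the 1-channel stem, the Sigmoid output and the intermediate 64/32 channel counts are assumptions, with only the 128-to-1 reduction stated in the text.

```python
import torch
import torch.nn as nn

class FusionNet(nn.Module):
    """Two parallel DARTS branches -> channel merging -> 3x3 convolutional
    up-sampling head that reduces 128 channels to the 1-channel fused image."""
    def __init__(self, c=16):
        super().__init__()
        self.stem = nn.Conv2d(1, c, 3, padding=1)   # maps each 1-channel input to c channels
        self.branch1, self.branch2 = Cell(c), Cell(c)
        self.head = nn.Sequential(                  # 4c + 4c = 128 channels in
            nn.Conv2d(8 * c, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, 3, padding=1), nn.Sigmoid())

    def forward(self, img1, img2):
        s1, s2 = self.stem(img1), self.stem(img2)
        f1 = self.branch1(s1, s1)                   # one branch per modality
        f2 = self.branch2(s2, s2)
        merged = torch.cat([f1, f2], dim=1)         # feature fusion by channel merging
        return self.head(merged)                    # same H x W as the inputs

# usage: fused = FusionNet()(ct, mr) with tensors of shape (N, 1, H, W)
```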
S3, training the multi-modal medical image fusion network by using the multi-modal medical image data through an unsupervised learning method;
in order to realize network training, the loss functions adopted by the invention comprise Structure Similarity (SSIM), multi-scale structure similarity (MS-SSIM) and edge protection degree
Figure BDA0002831873680000061
Differential cross-correlation (SCD), Mutual Information (MI), and structural characterization based mutual information (SR-MI).
The present invention combines the above evaluation indexes into a loss function as follows:

Loss = (L_MI + L_SCD) + λ1 (L_SSIM + L_MS-SSIM) + λ2 (L_Q^AB/F + L_SR-MI)

The loss comprises three groups of terms: the first group, MI and SCD, relates to image gray-scale information and correlation; the second group relates to image edges and texture information; and the third group consists of additional terms for constraining edges. λ1 and λ2 are respectively the weights of the groups of loss functions.
The difference cross-correlation SCD reflects the sum of the correlations between the fused image and the source images, and is calculated as:

L_SCD = -(R(F - S2, S1) + R(F - S1, S2))

wherein R denotes the correlation computation:

R(X, Y) = Σ_i Σ_j (X(i,j) - μ_X)(Y(i,j) - μ_Y) / sqrt( Σ_i Σ_j (X(i,j) - μ_X)² · Σ_i Σ_j (Y(i,j) - μ_Y)² )

where μ_X and μ_Y are the mean values of X and Y.
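A NumPy sketch of this term, with R realized as the Pearson correlation given above:

```python
import numpy as np

def corr(x, y):
    """Pearson correlation R between two images."""
    x = x - x.mean()
    y = y - y.mean()
    return float((x * y).sum() / np.sqrt((x ** 2).sum() * (y ** 2).sum() + 1e-12))

def scd_loss(f, s1, s2):
    """L_SCD = -(R(F - S2, S1) + R(F - S1, S2))."""
    return -(corr(f - s2, s1) + corr(f - s1, s2))
```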
the MI representation fusion image comprises the information quantity of two source images, and the specific formula is as follows:
LMI=-(I(S1,F)+I(S2,F))
the formula of the mutual information I is as follows:
Figure BDA0002831873680000072
wherein p(s) and p (f) are edge distributions, and p (s, f) is a combined distribution of s and f.
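A histogram-based NumPy sketch of this term (the 64-bin estimate is an assumption):

```python
import numpy as np

def mi_loss(f, s1, s2, bins=64):
    """L_MI = -(I(S1, F) + I(S2, F)), with p(s), p(f), p(s, f) estimated
    from a joint gray-level histogram."""
    def mutual_info(a, b):
        joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
        p = joint / joint.sum()                # joint distribution p(s, f)
        ps = p.sum(axis=1, keepdims=True)      # marginal p(s)
        pf = p.sum(axis=0, keepdims=True)      # marginal p(f)
        nz = p > 0
        return float((p[nz] * np.log(p[nz] / (ps @ pf)[nz])).sum())
    return -(mutual_info(s1, f) + mutual_info(s2, f))
```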
The SSIM loss is computed from the structural similarity between the fused image and each source image:

L_SSIM = 1 - (SSIM(S1, F) + SSIM(S2, F)) / 2

SSIM is the structural similarity, calculated from the luminance, contrast and structure information of two images:

SSIM(S, F) = (2 μ_S μ_F + c1)(2 σ_SF + c2) / ((μ_S² + μ_F² + c1)(σ_S² + σ_F² + c2))

wherein μ_S and μ_F are the mean values of S and F, σ_S² and σ_F² are their variances, σ_SF is the covariance of S and F, and c1 and c2 are small constants.
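A global-statistics NumPy sketch of SSIM and this loss (c1 and c2 take the conventional (0.01)² and (0.03)² values for images scaled to [0, 1]):

```python
import numpy as np

def ssim(s, f, c1=1e-4, c2=9e-4):
    """Global-statistics SSIM from the means, variances and covariance above."""
    mu_s, mu_f = s.mean(), f.mean()
    var_s, var_f = s.var(), f.var()
    cov = ((s - mu_s) * (f - mu_f)).mean()
    return float((2 * mu_s * mu_f + c1) * (2 * cov + c2)
                 / ((mu_s ** 2 + mu_f ** 2 + c1) * (var_s + var_f + c2)))

def ssim_loss(f, s1, s2):
    return 1.0 - 0.5 * (ssim(s1, f) + ssim(s2, f))
```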
MS-SSIM is the multi-scale structural similarity: an index obtained by repeatedly down-scaling the images by factors that are powers of 2 and computing the structural similarity at each scale.
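Continuing the sketches above, one simple realization (plain averaging across dyadic scales is an assumption; the true MS-SSIM weights each scale separately):

```python
import numpy as np

def ms_ssim(s, f, levels=3):
    """Average SSIM over dyadic down-scalings via 2x2 mean pooling
    (ssim() is the sketch above)."""
    vals = []
    for _ in range(levels):
        vals.append(ssim(s, f))
        h, w = (s.shape[0] // 2) * 2, (s.shape[1] // 2) * 2
        s = s[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        f = f[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return float(np.mean(vals))

def ms_ssim_loss(f, s1, s2):
    return 1.0 - 0.5 * (ms_ssim(s1, f) + ms_ssim(s2, f))
```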
The edge preservation degree Q^AB/F estimates, by means of a local metric, the degree to which the salient information (i.e. edge information) of the inputs is expressed in the fused image. It is calculated as:

Q^AB/F = Σ_i Σ_j [ Q^(S1F)(i,j) w^(S1)(i,j) + Q^(S2F)(i,j) w^(S2)(i,j) ] / Σ_i Σ_j [ w^(S1)(i,j) + w^(S2)(i,j) ]

wherein Q^(S1F) and Q^(S2F) respectively represent the preservation degrees of the fused image with respect to the edges of the two source images, calculated as:

Q^(SF)(i,j) = Q_g^(SF)(i,j) · Q_α^(SF)(i,j)

wherein Q_g^(SF) and Q_α^(SF) respectively represent the length (strength) preservation value and the direction preservation value of the edge, calculated as:

Q_g^(SF)(i,j) = Γ_g / (1 + exp(κ_g (G^(SF)(i,j) - σ_g)))
Q_α^(SF)(i,j) = Γ_α / (1 + exp(κ_α (A^(SF)(i,j) - σ_α)))

wherein Γ_α, Γ_g, κ_α, κ_g, σ_g and σ_α are constants, and A^(SF) and G^(SF) are respectively the direction correlation value and the length correlation value between the source image S and the fused image F, calculated as:

G^(SF)(i,j) = g_F(i,j)/g_S(i,j) if g_S(i,j) > g_F(i,j), and g_S(i,j)/g_F(i,j) otherwise
A^(SF)(i,j) = 1 - |α_S(i,j) - α_F(i,j)| / (π/2)

wherein g_S, α_S, g_F and α_F are respectively the edge lengths (strengths) and angles of the source image and of the fused image. In the calculation of Q^AB/F, the weights w^S(i,j) = [g_S(i,j)]^L (with L a constant) are used as the weights of the edge preservation degree of the fused image relative to each source image.
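A NumPy sketch of this metric and the corresponding loss. Sobel filters provide the edge strength and angle; the constants Γ_g, κ_g, σ_g, Γ_α, κ_α, σ_α take values commonly used in the literature for Q^AB/F and, like L = 1, are assumptions here.

```python
import numpy as np
from scipy.ndimage import sobel

def edge_strength_angle(img):
    """Sobel edge length g and angle alpha per pixel."""
    gx = sobel(img.astype(float), axis=1)
    gy = sobel(img.astype(float), axis=0)
    return np.hypot(gx, gy), np.arctan(gy / (gx + 1e-12))

def q_sf(s, f, Tg=0.9994, kg=-15.0, Dg=0.5, Ta=0.9879, ka=-22.0, Da=0.8):
    """Per-pixel Q^(SF) = Q_g * Q_alpha, plus the weight map g_S."""
    g_s, a_s = edge_strength_angle(s)
    g_f, a_f = edge_strength_angle(f)
    eps = 1e-12
    G = np.where(g_s > g_f, g_f / (g_s + eps), g_s / (g_f + eps))  # length correlation
    A = 1.0 - np.abs(a_s - a_f) / (np.pi / 2)                      # direction correlation
    Qg = Tg / (1.0 + np.exp(kg * (G - Dg)))                        # length preservation
    Qa = Ta / (1.0 + np.exp(ka * (A - Da)))                        # direction preservation
    return Qg * Qa, g_s

def qabf_loss(f, s1, s2, L=1.0):
    q1, w1 = q_sf(s1, f)
    q2, w2 = q_sf(s2, f)
    w1, w2 = w1 ** L, w2 ** L                                      # weights w^S = g_S^L
    q = (q1 * w1 + q2 * w2).sum() / ((w1 + w2).sum() + 1e-12)
    return 1.0 - float(q)                                          # higher Q^AB/F, lower loss
```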
SR-MI denotes the mutual information computed on structural representation results, where the structural representation is obtained with a PCANet network.
The embodiment of the invention selects brain CT and MR images from the ATLAS dataset (the Whole Brain Atlas of Harvard Medical School, hosted at http://www.med.harvard.edu/aanlib/home, which collects human brain CT, MRI, PET and SPECT images under normal conditions and in different disease states), applies translation, rotation and non-rigid transformation to them, and uses the resulting 30000 image pairs as the training set; meanwhile, brain CT and MR images not belonging to the training set are selected to construct the test set, and the multi-modal medical image fusion network is trained and tested;
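For illustration, a SciPy-based sketch of such paired augmentation; the translation and rotation ranges and the elastic-field parameters are assumptions, not values disclosed in the patent:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates, rotate, shift

def augment_pair(img1, img2, rng=None):
    """Apply one shared translation, rotation and elastic (non-rigid)
    deformation to a registered CT/MR pair."""
    rng = rng or np.random.default_rng()
    dy, dx = rng.uniform(-10, 10, size=2)            # translation in pixels
    ang = rng.uniform(-15, 15)                       # rotation in degrees
    h, w = img1.shape
    # one smooth random displacement field, reused for both images
    fy = gaussian_filter(rng.uniform(-1, 1, (h, w)), sigma=8) * 6
    fx = gaussian_filter(rng.uniform(-1, 1, (h, w)), sigma=8) * 6
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    coords = np.array([yy + fy, xx + fx])
    out = []
    for img in (img1, img2):
        a = shift(img, (dy, dx), order=1, mode="nearest")
        a = rotate(a, ang, reshape=False, order=1, mode="nearest")
        a = map_coordinates(a, coords, order=1, mode="nearest")
        out.append(a)
    return out[0], out[1]
```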
and S4, inputting the multi-modal medical images to be fused into the trained multi-modal medical image fusion model to obtain a fusion result.
To demonstrate the superiority of the method of the invention, comparison tests were performed against six algorithms. The six comparison algorithms are:
1. NSCT-SR: NSCT transform image fusion algorithm based on sparse representation.
2. DTCWT-SR: dual-tree complex wavelet transform (DTCWT) image fusion algorithm based on sparse representation.
3. NSST-PAPCNN: a medical image fusion method based on a parameter-adaptive pulse-coupled neural network in the non-subsampled shearlet transform (NSST) domain.
4. DenseFuse: an image fusion algorithm using a DenseNet-based encoder-decoder structure with fusion strategies such as addition and L1-norm.
5. VIFNet: an unsupervised deep-learning image fusion method with DenseNet encoding and convolutional decoding, using MSE and weighted SSIM as the loss function.
6. NestFuse: a deep-learning image fusion method based on a nest-connection network structure.
Objective evaluation indexes (EN, FMI_pixel, MI, MS_SSIM, SSIM, Q^AB/F and SCD) were selected to evaluate the fusion results. Among them, FMI_pixel denotes pixel-level feature mutual information, a mutual information computation based on image features, where the features can be characterized by gradients, Sobel edge operators, etc.
EN (entropy) denotes the information entropy of an image; the larger EN is, the richer the information contained in the image. The formula is:

H(X) = Σ_i P(x_i) I(x_i) = -Σ_i P(x_i) log2 P(x_i)

wherein H(X) denotes the information entropy of the image, P(x_i) denotes the probability of occurrence of pixels with gray level x_i, and I(x_i) = -log2 P(x_i) is the self-information of that probability.
The fusion evaluation results of the different methods on the same pair of ATLAS images are shown in Table 1. It can be seen that the proposed DARTS-based fusion scores markedly higher than the other methods on the SCD, Q^AB/F and EN indexes, and also has certain advantages on indexes such as MI and MS_SSIM, which shows that the method retains the information of the two source images to a greater extent and has certain advantages for structures such as edge information.
TABLE 1 (objective evaluation indexes of the different fusion methods; the table values are rendered as an image in the source document)
In order to show the superiority of the present invention over the other methods more intuitively, the embodiment further provides visual results of the fused images for the method of the present invention and the comparison methods: FIG. 3(a) is the source CT image and FIG. 3(b) the source MR image; FIGS. 4(a) to 4(f) are the fused images obtained by comparison methods 1 to 6, respectively; and FIG. 5 is the fused image obtained by the method of the present invention. As can be seen from the figures, the method of the present invention performs well in contrast and especially in detail, with rich edge information, which again shows that the images it obtains contain more detail and a larger amount of information.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (8)

1. A multimodal medical image fusion method based on DARTS network is characterized by comprising the following steps:
S1, performing a network structure search on a preset DARTS model by using multi-modal medical image data, to obtain a DARTS network structure suited to the multi-modal medical image data; the DARTS network structure includes one or more cell bodies connected in series; each cell body comprises a plurality of nodes; each node is connected, through different convolution operations, to the outputs of the previous two cell bodies or to other nodes within the current cell body; the channels of all nodes are merged as the output of the cell body;
S2, constructing a multi-modal medical image fusion network; the multi-modal medical image fusion network comprises a multichannel DARTS network module, a feature fusion module and an up-sampling module; the multichannel DARTS network module is composed of a plurality of parallel DARTS network structures;
each DARTS network structure down-samples an input image to obtain the corresponding feature map; the feature fusion module performs feature fusion on the feature maps output by the multichannel DARTS network module; the up-sampling module performs convolutional up-sampling on the fused features to obtain a fusion result of the same size as the input image;
S3, training the multi-modal medical image fusion network by using the multi-modal medical image data and adopting an unsupervised learning method; the loss function for training the multi-modal medical image fusion network is:

Loss = (L_MI + L_SCD) + λ1 (L_SSIM + L_MS-SSIM) + λ2 (L_Q^AB/F + L_SR-MI)

wherein L_SSIM denotes the structural similarity loss, L_SCD the difference cross-correlation loss, L_MI the mutual information loss, L_MS-SSIM the multi-scale structural similarity loss, L_Q^AB/F the edge preservation loss, and L_SR-MI the mutual information loss based on structural representation, the structural representation result being obtained with a PCANet network; λ1 and λ2 are respectively the weights of the groups of loss terms;
the difference cross-correlation SCD reflects the sum of the correlations between the fused image and the source images, and is calculated as:

L_SCD = -(R(F - S2, S1) + R(F - S1, S2))

wherein S1 and S2 respectively denote the two source images, F denotes the fused image, and R denotes the image correlation computation;

the mutual information MI represents the amount of information of the two source images contained in the fused image:

L_MI = -(I(S1, F) + I(S2, F))

wherein I denotes the mutual information computation between two images;
the structural similarity loss is calculated as:

L_SSIM = 1 - (SSIM(S1, F) + SSIM(S2, F)) / 2
the edge preservation loss is calculated as:

L_Q^AB/F = 1 - Σ_i Σ_j [ Q^(S1F)(i,j) w^(S1)(i,j) + Q^(S2F)(i,j) w^(S2)(i,j) ] / Σ_i Σ_j [ w^(S1)(i,j) + w^(S2)(i,j) ]

wherein Q^(S1F) and Q^(S2F) represent the edge preservation degrees of the fused image F with respect to the two source images S1 and S2, and w^(S1) and w^(S2) are the corresponding edge-strength weights;
and S4, inputting the multi-modal medical images to be fused into the trained multi-modal medical image fusion model to obtain a fusion result.
2. The method for fusing multimodal medical images based on a DARTS network as claimed in claim 1, wherein the different convolution operations in step S1 include: depth-separable convolution with kernel size 3, depth-separable convolution with kernel size 5, standard convolution with kernel size 3, standard convolution with kernel size 5, dilated (hole) convolution with kernel size 3, dilated convolution with kernel size 5, and skip connection.
3. The method of claim 1, wherein the DARTS network structure comprises one cell body.
4. The multimodal medical image fusion method based on DARTS network as claimed in claim 3, wherein the cell body comprises four nodes.
5. The multimodal medical image fusion method based on DARTS network as claimed in claim 4, wherein the convolution stride in the cell body is set to 1, and padding is used to keep the feature map the same size as the input image.
6. The multimodal medical image fusion method based on DARTS network as claimed in claim 1, wherein the feature fusion module performs feature fusion on the feature maps output by the multichannel DARTS network module; specifically, feature map fusion is realized by channel merging.
7. The DARTS network-based multimodal medical image fusion method according to any of claims 1-6, wherein the method further comprises data enhancement of multimodal medical image data.
8. The method of claim 7, wherein the data enhancement comprises translation, rotation, and non-rigid deformation.
CN202011467496.5A 2020-12-11 2020-12-11 Multi-modal medical image fusion method based on DARTS network Active CN112488976B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202011467496.5A CN112488976B (en) 2020-12-11 2020-12-11 Multi-modal medical image fusion method based on DARTS network
US18/001,295 US11769237B2 (en) 2020-12-11 2021-01-30 Multimodal medical image fusion method based on darts network
PCT/CN2021/074531 WO2022121100A1 (en) 2020-12-11 2021-01-30 Darts network-based multi-modal medical image fusion method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011467496.5A CN112488976B (en) 2020-12-11 2020-12-11 Multi-modal medical image fusion method based on DARTS network

Publications (2)

Publication Number Publication Date
CN112488976A CN112488976A (en) 2021-03-12
CN112488976B true CN112488976B (en) 2022-05-17

Family

ID=74916879

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011467496.5A Active CN112488976B (en) 2020-12-11 2020-12-11 Multi-modal medical image fusion method based on DARTS network

Country Status (3)

Country Link
US (1) US11769237B2 (en)
CN (1) CN112488976B (en)
WO (1) WO2022121100A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114974518A (en) * 2022-04-15 2022-08-30 浙江大学 Multi-mode data fusion lung nodule image recognition method and device
CN115222724B (en) * 2022-08-05 2023-05-23 兰州交通大学 Medical image fusion method based on NSST domain mixed filtering and ED-PCNN
CN115767040B (en) * 2023-01-06 2023-04-25 松立控股集团股份有限公司 360-degree panoramic monitoring automatic cruising method based on interactive continuous learning
CN116563189B (en) * 2023-07-06 2023-10-13 长沙微妙医疗科技有限公司 Medical image cross-contrast synthesis method and system based on deep learning
CN117710227B (en) * 2023-12-14 2024-06-11 北京长木谷医疗科技股份有限公司 Modal fusion method and device based on multi-modal medical image
CN117611473B (en) * 2024-01-24 2024-04-23 佛山科学技术学院 Synchronous denoising image fusion method and related equipment thereof

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102867296A (en) * 2012-08-20 2013-01-09 华中科技大学 Medical image fusion method based on pulsed cortical model
CN108416802A (en) * 2018-03-05 2018-08-17 Huazhong University of Science and Technology Multi-modal medical image non-rigid registration method and system based on deep learning
CN111612754A (en) * 2020-05-15 2020-09-01 复旦大学附属华山医院 MRI tumor optimization segmentation method and system based on multi-modal image fusion
CN111860495A (en) * 2020-06-19 2020-10-30 上海交通大学 Hierarchical network structure searching method and device and readable storage medium

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9158965B2 (en) * 2012-06-14 2015-10-13 The Board Of Trustees Of The Leland Stanford Junior University Method and system for optimizing accuracy-specificity trade-offs in large scale visual recognition
US9922272B2 (en) 2014-09-25 2018-03-20 Siemens Healthcare Gmbh Deep similarity learning for multimodal medical images
CN106682424A (en) * 2016-12-28 2017-05-17 上海联影医疗科技有限公司 Medical image adjusting method and medical image adjusting system
GB201709672D0 (en) * 2017-06-16 2017-08-02 Ucl Business Plc A system and computer-implemented method for segmenting an image
US10699414B2 (en) * 2018-04-03 2020-06-30 International Business Machines Corporation Image segmentation based on a shape-guided deformable model driven by a fully convolutional network prior
US11756160B2 (en) * 2018-07-27 2023-09-12 Washington University ML-based methods for pseudo-CT and HR MR image estimation
CN110852168A (en) 2019-10-11 2020-02-28 西北大学 Pedestrian re-recognition model construction method and device based on neural framework search
CN110827200B (en) * 2019-11-04 2023-04-07 Oppo广东移动通信有限公司 Image super-resolution reconstruction method, image super-resolution reconstruction device and mobile terminal
DE202020101012U1 (en) * 2020-02-25 2020-03-08 Robert Bosch Gmbh Device for predicting a suitable configuration of a machine learning system for a training data set
CN111553480B (en) 2020-07-10 2021-01-01 腾讯科技(深圳)有限公司 Image data processing method and device, computer readable medium and electronic equipment
CN111882514B (en) 2020-07-27 2023-05-19 中北大学 Multi-mode medical image fusion method based on double-residual ultra-dense network

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102867296A (en) * 2012-08-20 2013-01-09 华中科技大学 Medical image fusion method based on pulsed cortical model
CN108416802A (en) * 2018-03-05 2018-08-17 Huazhong University of Science and Technology Multi-modal medical image non-rigid registration method and system based on deep learning
CN111612754A (en) * 2020-05-15 2020-09-01 复旦大学附属华山医院 MRI tumor optimization segmentation method and system based on multi-modal image fusion
CN111860495A (en) * 2020-06-19 2020-10-30 上海交通大学 Hierarchical network structure searching method and device and readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on Medical Image Segmentation Algorithms Based on Deep Learning; Zhao Xiangming; China Master's Theses Full-text Database (Information Science and Technology); 2020-03-15; full text *

Also Published As

Publication number Publication date
US20230196528A1 (en) 2023-06-22
US11769237B2 (en) 2023-09-26
WO2022121100A1 (en) 2022-06-16
CN112488976A (en) 2021-03-12

Similar Documents

Publication Publication Date Title
CN112488976B (en) Multi-modal medical image fusion method based on DARTS network
CN109584254B (en) Heart left ventricle segmentation method based on deep full convolution neural network
CN109978850B (en) Multi-modal medical image semi-supervised deep learning segmentation system
CN110232383B (en) Focus image recognition method and focus image recognition system based on deep learning model
CN111709953B (en) Output method and device in lung lobe segment segmentation of CT (computed tomography) image
CN110276736B (en) Magnetic resonance image fusion method based on weight prediction network
CN111047594A (en) Tumor MRI weak supervised learning analysis modeling method and model thereof
CN115496771A (en) Brain tumor segmentation method based on brain three-dimensional MRI image design
CN110782427B (en) Magnetic resonance brain tumor automatic segmentation method based on separable cavity convolution
Mahapatra et al. Crohn's disease tissue segmentation from abdominal MRI using semantic information and graph cuts
CN111597946A (en) Processing method of image generator, image generation method and device
Liu et al. Multimodal MRI brain tumor image segmentation using sparse subspace clustering algorithm
CN112364920A (en) Thyroid cancer pathological image classification method based on deep learning
CN112150564A (en) Medical image fusion algorithm based on deep convolutional neural network
Fan et al. TR-Gan: multi-session future MRI prediction with temporal recurrent generative adversarial Network
Cai et al. Triple multi-scale adversarial learning with self-attention and quality loss for unpaired fundus fluorescein angiography synthesis
Abdeltawab et al. A new 3D CNN-based CAD system for early detection of acute renal transplant rejection
CN115409843B (en) Brain nerve image feature extraction method based on scale equalization coupling convolution architecture
CN116152235A (en) Cross-modal synthesis method for medical image from CT (computed tomography) to PET (positron emission tomography) of lung cancer
Bhaiya et al. Classification of MRI brain images using neuro fuzzy model
Muthiah et al. Fusion of MRI and PET images using deep learning neural networks
Sun et al. Brain tumor segmentation based on AMRUNet++ neural network
CN114266738A (en) Longitudinal analysis method and system for mild brain injury magnetic resonance image data
Chan et al. Automated quality controlled analysis of 2d phase contrast cardiovascular magnetic resonance imaging
CN113205472A (en) Cross-modal MR image mutual generation method based on cyclic generation countermeasure network cycleGAN model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant