CA3104607A1 - Contrast-agent-free medical diagnostic imaging - Google Patents

Contrast-agent-free medical diagnostic imaging

Info

Publication number
CA3104607A1
Authority
CA
Canada
Prior art keywords
image
network
pixel
tumor
level
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CA3104607A
Other languages
French (fr)
Inventor
Shuo Li
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
London Health Sciences Centre Research Inc
Original Assignee
London Health Sciences Centre Research Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by London Health Sciences Centre Research Inc filed Critical London Health Sciences Centre Research Inc
Priority to CA3104607A priority Critical patent/CA3104607A1/en
Publication of CA3104607A1 publication Critical patent/CA3104607A1/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0475Generative networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/09Supervised learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/094Adversarial learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/002D [Two Dimensional] image generation
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H30/00ICT specially adapted for the handling or processing of medical images
    • G16H30/40ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00Measuring for diagnostic purposes; Identification of persons
    • A61B5/05Detecting, measuring or recording for diagnosis by means of electric currents or magnetic fields; Measuring using microwaves or radio waves 
    • A61B5/055Detecting, measuring or recording for diagnosis by means of electric currents or magnetic fields; Measuring using microwaves or radio waves  involving electronic [EMR] or nuclear [NMR] magnetic resonance, e.g. magnetic resonance imaging
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/004Artificial life, i.e. computing arrangements simulating life
    • G06N3/006Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Epidemiology (AREA)
  • Medical Informatics (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Magnetic Resonance Imaging Apparatus (AREA)
  • Image Analysis (AREA)

Abstract

Described herein is medical imaging technology for concurrent and simultaneous synthesis of a medical CA-free-AI-enhanced image and medical diagnostic image analysis comprising: receiving a medical image acquired by a medical scanner in absence of contrast agent enhancement; providing the medical image to a computer-implemented machine learning model; concurrently performing a medical CA-free-AI-enhanced image synthesis task and a medical diagnostic image analysis task with the machine learning model;
and reciprocally communicating between the image synthesis task and the image analysis task for mutually dependent training of both tasks. Methods, systems, and non-transitory computer readable media are described for execution of concurrent and simultaneous synthesis of a medical CA-free-AI-enhanced image and medical diagnostic image analysis.

Description

CONTRAST-AGENT-FREE MEDICAL DIAGNOSTIC IMAGING
BACKGROUND OF THE INVENTION
Field of the Invention
The present invention relates to medical imaging, and more particularly to medical imaging acquired without administration of a contrast agent.
Description of the Related Art
Contrast agent (CA) administration to a patient is a frequent prerequisite for medical imaging.
Taking heart imaging as an example, gadolinium-based CA is administered to a patient for cardiac magnetic resonance (MR) imaging, for example as part of current ischemic heart disease (IHD) treatment workflow in cardiac radiology (Beckett et al., 2015). IHD diagnosis/treatment is a relevant example of cardiac MR imaging as a majority of patients undergoing cardiac MR imaging are being evaluated for possible myocardial ischemia, and IHD in general and subtypes of IHD may be distinguished according to patterns of contrast enhancement. CA imaging uses chemical substances in MR scans. After the CA is injected into the body, CA imaging produces a late gadolinium enhancement (LGE) image to illustrate IHD scars that are invisible under regular MR imaging and improves the clarity of other internal and surrounding cardiac tissues (i.e., muscles, cavities, and even blood).
Terminology of early versus late gadolinium enhancement references a lapsed time after injection for acquiring imaging data. An advantage of LGE is due to a relative contrast enhancement change between healthy and diseased tissue at a later time after injection of CA
favoring enhancement of diseased tissue. For example, at an early time (1-3 min post-injection) gadolinium resides primarily in the blood pool and healthy myocardium. At a later time (5-20 min post-injection) gadolinium is relatively cleared from healthy tissue and is relatively retained by diseased tissue.
After the CA imaging, manual segmentation helps radiologists to segment multiple cardiac tissues to delineate diagnosis-related tissues (scars, myocardium, etc.), and the subsequent quantitative evaluation of these segmented tissues results in various diagnosis metrics to accurately report the presence or the progression of IHD.
However, with this workflow (i.e., CA imaging first followed by manual segmentation second), there are still concerns regarding toxicity, high inter-observer variability, and ineffectiveness. 1) CAs have been highlighted in numerous clinical papers showing their potential toxicity, retention in the human body, and importantly, their potential to induce fatal nephrogenic systemic fibrosis (Ordovas and Higgins, 2011). 2) Manual segmentation has well-known issues regarding high inter-observer variability and non-reproducibility, which are caused by the difference in expertise among clinicians (Ordovas and Higgins, 2011). 3) CA imaging followed by segmentation leads to additional time and effort for patient and clinician, as well as high clinical resource costs (labor and equipment).
To date, a few initial CA-free and automatic segmentation methods have been reported.
However, even the state-of-the-art methods only produce a binary scar image that fails to provide a credible diagnosis (Xu et al., 2018a; 2018b).
As another example of medical imaging acquired with CA administration, MR examination of the liver relies heavily on CA injection. For example, in liver cancer diagnosis, non-contrast enhanced MR imaging (NCEMRI) obtained without CA injection can barely distinguish areas of hemangioma (a benign tumor) and hepatocellular carcinoma (HCC, a malignant tumor).
In contrast, contrast-enhanced MRI (CEMRI) obtained with CA injection shows the area of hemangioma with gradual central filling and a bright edge, and the area of HCC as entirely or mostly bright throughout the whole tumor, which provides an accurate and straightforward way to diagnose hemangioma and HCC.
However, gadolinium-based CA has inherent shortcomings: it is high-risk, time-consuming, and expensive. The high-risk disadvantage is due to the potential toxic effect of gadolinium-based CA injection. The time-consuming disadvantage comes from the MRI process itself and the waiting time after CA injection. The expensive disadvantage mainly comes from the cost of CA; in the USA alone, conservatively, if each dose of CA is $60, the direct material expense alone equates to roughly $1.2 billion in 2016 (statistics from IQ-AI Limited Company, USA).
Accordingly, there is a need for contrast-agent-free medical diagnostic imaging.
SUMMARY OF THE INVENTION
In an aspect there is provided, a medical imaging method for concurrent and simultaneous synthesis of a medical CA-free-AI-enhanced image and medical diagnostic image analysis comprising:
receiving a medical image acquired by a medical scanner in absence of contrast agent enhancement;
providing the medical image to a computer-implemented machine learning model;
concurrently performing a medical CA-free-AI-enhanced image synthesis task and a medical diagnostic image analysis task with the machine learning model;
reciprocally communicating between the image synthesis task and the image analysis task for mutually dependent training of both tasks.
In another aspect there is provided, a medical imaging method for concurrent and simultaneous synthesis and segmentation of a CA-free-AI-enhanced image comprising:
receiving a magnetic resonance (MR) image acquired by a medical MR scanner in absence of contrast agent enhancement;
providing the MR image to a progressive framework of a plurality of generative adversarial networks (GAN);
inputting the MR image into a first GAN;
obtaining a coarse tissues mask from the first GAN;
inputting the coarse tissues mask and the MR image into a second GAN;
obtaining a CA-free-AI-enhanced image from the second GAN;
inputting the CA-free-AI-enhanced image and the MR image into a third GAN;
obtaining a diagnosis-related tissue segmented image from the third GAN.
In yet another aspect there is provided, a medical imaging method for concurrent and simultaneous synthesis of a CA-free-AI-enhanced image and tumor detection comprising:
receiving a magnetic resonance (MR) image acquired by a medical MR scanner in absence of contrast agent enhancement;
providing the MR image to a tripartite generative adversarial network (GAN) comprising a generator network, a discriminator network and a detector network;
inputting the MR image into the generator network;
obtaining a CA-free-AI-enhanced image and an attention map of tumor specific features from the generator network;
inputting the CA-free-AI-enhanced image and the attention map into the detector network;
obtaining a tumor location and a tumor classification extracted from the CA-free-AI-enhanced image by the detector network;
training the generator network by both adversarial learning with the discriminator network and back-propagation with the detector network.
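By way of illustration only, the following Python sketch (using the PyTorch library) shows one possible wiring of the three associated networks of this aspect during a single training step; the module definitions, layer sizes, loss weights and tensor shapes are hypothetical simplifications introduced here for clarity and are not the networks of the experimental examples described below.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleGenerator(nn.Module):
    """Maps a non-contrast MR image to a synthetic enhanced image plus an attention map."""
    def __init__(self):
        super().__init__()
        self.body = nn.Conv2d(1, 8, 3, padding=1)
        self.to_image = nn.Conv2d(8, 1, 3, padding=1)
        self.to_attention = nn.Conv2d(8, 1, 3, padding=1)
    def forward(self, x):
        h = F.relu(self.body(x))
        return torch.tanh(self.to_image(h)), torch.sigmoid(self.to_attention(h))

class SimpleDiscriminator(nn.Module):
    """Scores an image as real (CA-enhanced) or synthetic."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(1, 8, 4, stride=2), nn.ReLU(),
                                 nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 1))
    def forward(self, x):
        return self.net(x)

class SimpleDetector(nn.Module):
    """Predicts a tumor class and a coarse box from the synthetic image and attention map."""
    def __init__(self, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(nn.Conv2d(2, 8, 3, padding=1), nn.ReLU(),
                                      nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.cls_head = nn.Linear(8, num_classes)   # e.g. hemangioma vs. HCC
        self.box_head = nn.Linear(8, 4)             # (x, y, w, h) of the tumor region
    def forward(self, image, attention):
        h = self.features(torch.cat([image, attention], dim=1))
        return self.cls_head(h), self.box_head(h)

# One illustrative training step on dummy tensors.
G, D, R = SimpleGenerator(), SimpleDiscriminator(), SimpleDetector()
opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
x = torch.randn(2, 1, 64, 64)          # non-contrast input
real = torch.randn(2, 1, 64, 64)       # paired CA-enhanced ground truth
cls_gt = torch.tensor([0, 1])          # tumor class labels
box_gt = torch.rand(2, 4)              # tumor box labels

fake, attn = G(x)
adv_loss = F.binary_cross_entropy_with_logits(D(fake), torch.ones(2, 1))  # fool the discriminator
cls_logits, box_pred = R(fake, attn)
det_loss = F.cross_entropy(cls_logits, cls_gt) + F.l1_loss(box_pred, box_gt)
rec_loss = F.l1_loss(fake, real)
(adv_loss + det_loss + rec_loss).backward()   # detector gradients also reach the generator
opt_g.step()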
In further aspects there are provided, systems and non-transitory computer readable media for execution of concurrent and simultaneous synthesis of a medical CA-free-AI-enhanced image and medical diagnostic image analysis described herein.
For example, there is provided, a medical imaging system for concurrent and simultaneous synthesis of a medical CA-free-AI-enhanced image and medical diagnostic image analysis comprising:
an interface device configured for receiving a medical image acquired by a medical scanner in absence of contrast agent enhancement;
a memory configured for storing the medical image and a computer-implemented machine learning model;
a processor configured for:
inputting the medical image to the computer-implemented machine learning model;
concurrently performing a medical CA-free-AI-enhanced image synthesis task and a medical diagnostic image analysis task with the machine learning model;
reciprocally communicating between the image synthesis task and the image analysis task for mutually dependent training of both tasks.
As another example there is provided, a non-transitory computer readable medium embodying a computer program for concurrent and simultaneous synthesis of a medical CA-free-AI-enhanced image and medical diagnostic image analysis comprising:
computer program code for receiving a medical image acquired by a medical scanner in absence of contrast agent enhancement;
computer program code for providing the medical image to a computer-implemented machine learning model;
computer program code for concurrently performing a medical CA-free-AI-enhanced image synthesis task and a medical diagnostic image analysis task with the machine learning model;
computer program code for reciprocally communicating between the image synthesis task and the image analysis task for mutually dependent training of both tasks.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 shows a schematic of a contrast-agent-free (CA-free) medical imaging system.
Figure 2 shows a flow diagram of a CA-free medical imaging method.
Figure 3 shows a schematic of advantages of the present CA-free medical imaging technology (Fig. 3A) compared to existing deep learning-based methods (Fig. 3B) and existing clinical CA-based imaging and manual segmentation by medical experts (Fig. 3C).
Figure 4 shows a schematic of a variant of the present CA-free medical technology formed as a progressive sequential causal GAN (PSCGAN) framework.
Figure 5 shows a schematic of a sequential causal learning network (SCLN) component integrated within the PSCGAN framework shown in Figure 4.
Figure 6 shows that by integrating SCLN into the GAN architecture as the encoder of cine MR images in the generator, the SCLN-based GAN improves the learning effectiveness of interest distribution from the latent space of cine MR images, thereby effectively improving the generating performance.
Figure 7 shows a schematic of data flow within the PSCGAN framework and three linked GANs executing three progressive phases of PSCGAN: priori generation GAN (panel a), conditional synthesis GAN (panel b), and segmentation GAN (panel c). GANs in the three phases leverage adversarial training and dedicated loss terms to enhance the performance of image synthesis and segmentation tasks. The conditional synthesis GAN and segmentation GAN leverage the output of the respective previous GANs to guide the training of the next GAN as part of its input.
Figure 8 shows PSCGAN-based synthesis of high-quality late-gadolinium-enhanced-equivalent (LGE-equivalent) images and PSCGAN-based accurate diagnosis-related tissue segmentation images. In LGE-equivalent images, the scar (dashed box, the high contrast area in the left ventricle (LV) wall) has a clear and accurate presentation when compared to the real LGE image. This high contrast area is invisible in cine MR images without CA injection. In diagnosis-related tissue segmentation images, the segmented scar (light grey), healthy myocardium (dark grey), and blood pool (intermediate grey) from our method are highly consistent with the ground truth in terms of shape, location, and size.
Figure 9 shows that PSCGAN generated an accurate diagnosis-related tissue segmentation image and that each component in the PSCGAN effectively improved the segmentation accuracy.
Figure 10 shows that PSCGAN calculated scar sizes and scar ratios are highly consistent with those from the current clinical workflow as shown by comparisons with Bland-Altman analysis.
Figure 11 shows that each component in the PSCGAN effectively improves LGE-equivalent image quality.
Figure 12 shows that a hinge adversarial loss term in the PSCGAN improved performance in LGE-equivalent image synthesis.
Figure 13 shows that PSCGAN corrects the overestimation and boundary error issues in existing state-of-the-art scar segmentation methods.
Figure 14 shows that two-stream pathways and the weighing unit in the SCLN
effectively improve segmentation accuracy, as does multi-scale, causal dilated convolution.
Figure 15 shows visual examples of the synthesis and segmentation, including both good and bad cases (arrows). The segmented scar appears as light grey and intermediate grey areas in our method and the ground truth, respectively. The segmented myocardium appears as dark grey and white areas in our method and the ground truth, respectively.
Figure 16 shows a schematic of advantages of the present CA-free medical imaging technology (synthetic CEMRI) compared to existing non-contrast enhanced MRI (NCEMRI) methods and compared to existing contrast enhanced MRI (CEMRI) methods. There are four cases of synthesizing CEMRI from NCEMRI. Subject1 and Subject2 are hemangioma, a benign tumor. Subject3 and Subject4 are hepatocellular carcinoma (HCC), a malignant tumor.
Figure 17 shows a schematic of a variant of the present CA-free medical technology formed as a Tripartite-GAN that generates synthetic CEMRI for tumor detection by the combination of three associated-task networks: the attention-aware generator, the CNN-based discriminator and the R-CNN-based detector. The R-CNN-based detector directly detects tumors from the synthetic CEMRI and improves the accuracy of synthetic CEMRI generation via back-propagation. The CNN-based discriminator urges the generator to generate more realistic synthetic CEMRI via an adversarial learning strategy.
Figure 18 shows that the generator in the Tripartite-GAN aims to synthesize accurate and realistic synthetic CEMRI. It uses a hybrid convolution including standard convolution layers, dilated convolution layers, and deconvolution layers. The dilated convolution is utilized to enlarge receptive fields. The two standard convolution layers and two deconvolution layers are connected to the front and back of dilated convolution, which reduces the size of feature maps to expand the receptive fields more efficiently. Following the hybrid convolution, the dual attention module (DAM including MAM and GAM) enhances the detailed feature extraction and aggregates long-range contextual information of the generator, which improves the detailed synthesis of the specificity of the tumor and the spatial continuity of the multi-class liver MRI.
Figure 19 shows a schematic of MAM that enhances the detailed feature extraction by utilizing the interdependencies between channel maps X.
Figure 20 shows a schematic of GAM that captures global dependencies of multi-class liver MRI by encoding global contextual information into local features.
Figure 21 shows a schematic of a CNN architecture of the discriminative network that receives the ground truth of CEMRI and the synthetic CEMRI, and then outputs the discriminative results of real or fake. Its adversarial strategy supervises the attention-aware generator to find its own mistakes, which increases the authenticity of the synthetic CEMRI.
Figure 22 shows a schematic of architecture of the tumor detection network that receives synthetic CEMRI and then accurately localizes the Region of Interest (RoI) of the tumor and classifies the tumor. Attention maps from the generator newly added into the detector in the manner of residual connection improve VGG-16 based convolution operation to extract tumor information better, which improves the performance of tumor detection.
Meanwhile, the back-propagation of Lch prompts the generator to focus on the specificity between two types of tumors. Adding Lch into the Tripartite-GAN achieves a win-win between the detector and the generator via back-propagation.
Figure 23 shows that Tripartite-GAN generated synthetic CEMRI has an equal diagnostic value to real CEMRI. In the first and second rows, it is clear that the area of hemangioma shows gradual central filling and is bright at the edge in synthetic CEMRI, and the area of HCC becomes entirely or mostly bright through the whole tumor. The dark grey and light grey windows/boxes represent the hemangioma and HCC, respectively, and are enlarged on the right.
The third (bottom) row is the synthesis result of healthy subjects.
Figure 24 shows that Tripartite-GAN outperforms three other methods in terms of detailed expression of the tumor and highly realistic synthetic CEMRI. The pixel intensity curve and zoomed local patches of the tumor area show that our Tripartite-GAN is more accurate than the three other methods.
Figure 25 shows ablation studies of no discriminator, no DAM, no detector, no dilated convolution, and no residual learning, which demonstrate the contribution of various components of Tripartite-GAN to generation of synthetic CEMRI. The pixel intensity curve and zoomed local patches of the tumor area demonstrate that our Tripartite-GAN is more accurate and more powerful in the detailed synthesis. The horizontal coordinate denotes pixel positions of the white line drawn in the ground truth, and the vertical coordinate is the pixel intensity of the corresponding pixel.
Figure 26 shows contributions of DAM, GAM, and MAM in Tripartite-GAN generated synthetic CEMRI. Subject1 demonstrates that DAM enhances the detailed synthesis of anatomy specificity and the spatial continuity. Subject2 demonstrates that GAM improves the spatial continuity of CEMRI synthesis. Subject3 demonstrates that MAM enhances the detailed feature extraction to improve the discrimination of hemangioma and HCC. Subject3 shows the failure to differentiate HCC and hemangioma when MAM is removed, which incorrectly synthesizes the specificity of hemangioma into the specificity of HCC. The dark grey windows/boxes of zoomed local patches represent the tumor area. From left to right, they are the NCEMRI, the synthetic CEMRI without attention module, the synthetic CEMRI with attention module, and the ground truth, respectively.
Figure 27 shows two examples of CEMRI synthesis. The dark grey windows/boxes marked in the feature maps represent the difference of spatial continuity between Tripartite-GAN
with and without GAM. The light grey windows/boxes marked in feature maps represent the difference of detailed feature extraction between Tripartite-GAN with and without MAM. The last two columns show the synthesis results and zoomed local patches of the tumor area. It is clear that MAM helps Tripartite-GAN enhance detailed synthesis, and GAM helps Tripartite-GAN improve the spatial continuity of synthetic CEMRI.
Figure 28 shows that the generator of Tripartite-GAN with residual learning has lower training loss compared with the Tripartite-GAN without residual learning.
Figure 29 shows that attention maps not only focus on the tumor but also attend to extracting all features of all anatomical structures in liver MRI for multi-class liver MRI synthesis. The feature maps of VGG-16 without attention maps are more focused on tumor information. The feature maps of VGG-16 with attention maps also focus on tumor information but are more accurate and detailed than those without attention maps.
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
With reference to the drawings, a system and method for CA-free-AI-enhanced imaging devoid of CA administration is described. The system and method compare favourably with current CA imaging techniques. The full wording of the term CA-free-AI-enhanced is contrast-agent-free-artificial-intelligence-enhanced with the CA-free component indicative of image or scan data acquired without CA administration and the AI-enhanced component indicative of machine learning enhancement of image/scan data acquired without CA
administration.
Fig. 1 shows an example of a computer implemented imaging system 2, incorporating an MR scanner 4. The MR scanner 4 typically comprises a static field magnet, a gradient coil, and a radio frequency (RF) coil disposed in a cylindrical housing of a gantry and an adjustable, often motorized, support or table for maintaining a subject in a desired position (for example, a prone or supine position) in an open central chamber or bore formed in the gantry during a scan procedure.
The static field magnet of the gantry is typically substantially in cylindrical form, and generates a static magnetic field inside the open central chamber of the gantry which is an imaging region of a subject (patient) using electric current provided from a static magnetic field power source in an excitation mode. The gradient coil is also typically substantially in cylindrical form, and located interior to the static field magnet. The gradient coil applies gradient magnetic fields to the subject in the respective directions of the X axis, the Y axis and the Z axis, by using the electric currents supplied from the gradient magnetic field power sources.
The RF coil transmits RF pulses toward the subject and receives MR signals as RF radiation emitted from the subject due to nuclear spin excitation and relaxation. RF pulse transmission includes an RF pulse synthesizer and pulse amplifier communicative with an RF coil, while MR signal reception includes an RF coil communicative with a signal amplifier and signal processor. One or more RF
coils may be used for RF pulse transmission and MR signal reception, such that the RF coil for RF pulse transmission and MR signal reception may be the same or different.
The static field magnet, the gradient coil and the RF coil are driven by one or more controllers.
Directed by a data acquisition scheme, the one or more controllers coordinate a scan of the subject by driving gradient magnetic fields, RF pulse transmission and MR
signal reception, and then communicating the received scan data to a data acquisition component 6.
The data acquisition component 6 incorporates a data acquisition scheme or data acquisition computer code that receives, organizes and stores MR scan data from the RF
coil/controller of the MR scanner. The scan data is sent to an image reconstruction component 8 incorporating an image reconstruction computer code. The scan data can then be processed using the image reconstruction computer code resulting in image data including multiple images of predetermined sampling site(s) of the subject. The image reconstruction computer code can easily be varied to accommodate any available MR imaging technique. The image data can then be processed by a machine learning image synthesis component 10 incorporating image synthesis computer code tasked with processing of image data to generate a CA-free-AI-enhanced image.
The image data can be concurrently processed by a machine learning image analysis component
12 incorporating image analysis computer code tasked with processing of image data to generate a diagnostic image analysis, such as a tissue segmentation or a tumour detection. The image synthesis component 10 and image analysis component 12 are communicative to reciprocally guide their respective CA-free-AI-enhanced image synthesis and image analysis tasks, such that a synthesized CA-free-AI-enhanced image or a precursor thereof generated by the image synthesis component 10 is communicated to the image analysis component 12 to guide the diagnostic image analysis task, and conversely a diagnostic image result or precursor thereof generated by image analysis component 12 is communicated to the image synthesis component
10 to guide the image synthesis task.
The imaging system 2 is controlled by one or more computers 16 with data and operational commands communicated through bus 14. The imaging system 2 may include any additional component as desired for CA-free-AI-enhanced image synthesis and image analysis including multiplexers, digital/analog conversion boards, microcontrollers, physical computer interface devices, input/output devices, display devices, data storage devices and the like. The imaging system 2 may include controllers dedicated to different components of the MR scanner 4, such as a sequence controller to provide power and timing signals to control the gradient coil magnetic field, RF pulse transmission and/or MR signal reception, or such as a table controller to provide power and timing signals to a table motor to control table position and thereby control position of a subject in the gantry by moving the subject along a z-axis through an opening of the gantry communicative with the interior open chamber of the gantry.
Fig. 2 shows a computer implemented method 20 for contrast agent-free medical imaging.
The method 20 comprises a pre-scan preparation 30 and positioning of a subject for MR scanning of a desired sampling site or anatomical region of interest. Once the subject is prepared and positioned within an MR scanner, MR scanning 40 is performed to acquire scan data at the sampling site. The scan data is processed to reconstruct 50 image data from the scan data. The image data is then concurrently processed in an image synthesis task 60 and a diagnostic image analysis task 70. The image synthesis task and the image analysis task reciprocally communicate for mutually dependent training of both tasks.
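By way of illustration only, a minimal Python sketch of the orchestration of method 20 is given below; the function names, the use of stand-in callables, and the particular order of the reciprocal guidance passes are assumptions made for illustration rather than a definition of the method.

from dataclasses import dataclass

@dataclass
class Method20Result:
    enhanced_image: object   # CA-free-AI-enhanced image (task 60 output)
    analysis: object         # diagnostic image analysis (task 70 output)

def run_method_20(scan, reconstruct, synthesis_task, analysis_task):
    """Illustrative orchestration of the method of Fig. 2: acquire scan data,
    reconstruct image data, then run the two machine learning tasks,
    letting each task's output guide the other (reciprocal communication)."""
    scan_data = scan()
    image_data = reconstruct(scan_data)
    enhanced = synthesis_task(image_data, guidance=None)       # first pass, no guidance
    analysis = analysis_task(image_data, guidance=enhanced)    # analysis guided by synthesis
    enhanced = synthesis_task(image_data, guidance=analysis)   # synthesis guided by analysis
    return Method20Result(enhanced, analysis)

# Toy usage with stand-in callables.
result = run_method_20(
    scan=lambda: "k-space data",
    reconstruct=lambda s: "cine MR image stack",
    synthesis_task=lambda img, guidance: f"enhanced({img}, guided by {guidance})",
    analysis_task=lambda img, guidance: f"segmentation({img}, guided by {guidance})",
)
print(result)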
The contrast agent-free medical imaging system and method have been validated by experimental testing. Experimental testing results demonstrate the ability of the contrast agent-free medical imaging system and method to concurrently provide CA-free-AI-enhanced image synthesis and diagnostic image analysis. The following experimental examples are for illustration purposes only and are not intended to be a limiting description.

Experimental Example 1.
The details of Experimental Example 1 are extracted from a prior scientific publication (Xu et al., (2020) "Contrast agent-free synthesis and segmentation of ischemic heart disease images using progressive sequential causal GANs", Medical Image Analysis, Vol.
62: article 101668), and this scientific publication is incorporated herein by reference in its entirety. In the event of inconsistency between the incorporated material and the express disclosure of the current document, the incorporated material should be considered supplementary to that of the current document; for irreconcilable inconsistencies, the current document controls.
In this Experimental Example 1, a CA-free image is an image that is synthesized from image data acquired in absence of contrast agent (CA) administration by a machine learning model to achieve an imaging equivalent to CA-enhanced imaging for purposes of a concurrent diagnostic image analysis by the machine learning model achieving diagnostic results comparable to human expert diagnosis using CA-enhanced imaging. Therefore, in Experimental Example 1, the term CA-free can be used interchangeably with the term CA-free-AI-enhanced (or contrast-agent-free-artificial-intelligence-enhanced); for example the term CA-free image can be used interchangeably with the term CA-free-AI-enhanced image.
Current state-of-the-art CA-free segmentation methods only produce a binary scar image that fails to provide a credible diagnosis (Xu et al., 2018a; 2018b). As shown in Fig. 3B, this binary scar image can only indicate two categories of pixels: scar and background. This limited resolution thus fails to highlight several relevant tissues (e.g., myocardium and healthy myocardium, blood pool) recommended according to the clinical protocols of comprehensive IHD evaluation. Subsequently, it fails to help radiologists quantitatively assess multiple tissues to obtain the most powerful metrics for a credible IHD diagnosis (for example as shown in Fig. 3C, scar ratio = size of the scar/size of the myocardium). Because the use of multiple metrics based on multiple tissues results in far greater accuracy than using only a metric based on scar tissue alone in a credible IHD diagnosis, the limitations of existing segmentation methods need to be addressed. Thus, clinicians desire development of more advanced CA-free technology that can produce an LGE-equivalent image (i.e., an image that is equivalent to an LGE
image in terms of usefulness in an IHD diagnosis or from which clinical metrics can be obtained without CA
injections) and a segmented image (including diagnosis-related tissues, i.e., scar, healthy myocardium, and blood pools, as well as other pixels) (Leiner, 2019).
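By way of illustration only, the clinical metrics mentioned above (scar size and scar ratio = size of the scar / size of the myocardium) can be computed from a multi-tissue segmentation mask as in the following Python sketch; the label values, the pixel area, and the choice of counting scar pixels as part of the myocardium are assumptions made for illustration.

import numpy as np

# Illustrative label convention for a diagnosis-related tissue segmentation image.
SCAR, HEALTHY_MYOCARDIUM, BLOOD_POOL, OTHER = 1, 2, 3, 0

def scar_metrics(seg_mask: np.ndarray, pixel_area_mm2: float = 1.0):
    """Compute scar size and scar ratio = size of the scar / size of the myocardium,
    where the myocardium is taken here as scar plus healthy myocardium."""
    scar_px = np.count_nonzero(seg_mask == SCAR)
    myo_px = scar_px + np.count_nonzero(seg_mask == HEALTHY_MYOCARDIUM)
    scar_size = scar_px * pixel_area_mm2
    scar_ratio = scar_px / myo_px if myo_px else 0.0
    return scar_size, scar_ratio

# Toy 4-class mask: a patch of myocardium containing a blood pool and a small scar.
mask = np.zeros((8, 8), dtype=int)
mask[2:6, 2:6] = HEALTHY_MYOCARDIUM
mask[3:5, 3:5] = BLOOD_POOL
mask[2, 2:4] = SCAR
print(scar_metrics(mask, pixel_area_mm2=1.5))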
However, it is very challenging to synthesize an LGE-equivalent image and accurately segment diagnosis-related tissues (i.e., scar, healthy myocardium and blood pools) from 2D+T
cine MR images. The pixel-level understanding of LGE images by representation learning of the 2D+T cine MR images faces numerous challenges. The differences in the enhancement effects of the CAs on different cardiac cells result in each of the numerous pixels of the LGE image requiring a definite non-linear mapping from the cine MR images.
Representation learning of the 2D+T cine MR has a number of high-complexity issues. The time series characteristics of 2D+T
cine MR images result in each non-linear mapping requiring a complex mixing of the spatial and temporal dependencies of a mass of pixels in the images, especially since these pixels often have high local variations. More importantly, a pixel-level understanding of LGE
images is needed to differentiate between pixels that have very similar appearances (Xu et al., 2017). The highly similar intensity of pixels within the tissue on an LGE image often results in high similarities between the learned spatial and temporal dependencies of these pixels and often causes interference and inaccuracy during mixing.
Existing CA-free automated IHD-diagnosing methods are inefficient in the representation learning of cine MR images, as they must contend with a fixed local observation in both spatial dependency and temporal dependency extraction (e.g., only adjacent temporal frames of optical flow and a fixed spatial convolutional kernel size for deep learning).
However, pixels in 2D+T
cine MR images often have high local variations (i.e., different positions and motion ranges in different regions and timestamps). Furthermore, current spatial-temporal feature learning methods still struggle with constant learning weights during the mixing of spatial dependencies with temporal dependencies (e.g., both 3DConv and ConvLSTM often simply treat the two dependencies on each pixel as equal during learning) (Xu et al., 2017).
However, different pixels have different selection requirements in terms of temporal dependencies and spatial dependencies.
Existing progressive networks. Recently, progressive generative adversarial networks (GAN) have shown great potential in the tasks of image synthesis and segmentation (Huang et al., 2017; Karras et al., 2017; Zhang et al., 2018b). Progressive GAN inherit the advantage of adversarial semi-supervised learning from GAN to effectively learn to map from a latent space to a data distribution of interest. More importantly, the progressive framework of such progressive GAN stacks multiple sub-GAN networks as different phases to take advantage of the result of the previous phase to guide the performance of the next phase and greatly stabilize training.
However, current progressive GAN are designed to train on a single task because they lack a two-task generation scheme to handle the synthesis task and segmentation task.
Existing generative adversarial networks (GANs). GANs (Goodfellow et al., 2014) have become one of the most promising deep learning architectures for either image segmentation tasks or synthesis tasks in recent years but may face inefficient and unstable results when two or more tasks must be solved. GAN comprises two networks, a generator and a discriminator, where one is pitted against the other. The generator network learns to map from a latent space to a data distribution of interest, while the discriminative network distinguishes candidates produced by the generator from the true data distribution. However, a GAN may learn an erroneous data distribution or suffer a gradient explosion when the latent spaces of the distributions of the two tasks interfere with each other. Conditional GAN, a type of GAN implementation, has the potential to learn reciprocal commonalities of the two tasks to avoid interference with each other because of its considerable flexibility in how two hidden representations are composed (Mirza and Osindero, 2014). In conditional GAN, a conditioned parameter y is added to the generator to generate the corresponding data using the following equation:
$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}\left[\log D(x|y)\right] + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z|y))\right)\right] \quad (1)$$

where $p_{data}(x)$ represents the distribution of the real data, and $p_z$ represents the distribution of the generator.
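By way of illustration only, a minimal Python (PyTorch) sketch of the conditional objective of Equation (1) is given below, with the condition y concatenated to the inputs of both networks; the tiny fully connected networks, the dimensions, and the non-saturating form of the generator loss are assumptions made for illustration.

import torch
import torch.nn as nn
import torch.nn.functional as F

z_dim, y_dim, x_dim = 16, 4, 32

G = nn.Sequential(nn.Linear(z_dim + y_dim, 64), nn.ReLU(), nn.Linear(64, x_dim))
D = nn.Sequential(nn.Linear(x_dim + y_dim, 64), nn.ReLU(), nn.Linear(64, 1))

def d_loss(x_real, y, z):
    """Discriminator ascends V(D, G): log D(x|y) + log(1 - D(G(z|y)))."""
    fake = G(torch.cat([z, y], dim=1)).detach()
    real_term = F.logsigmoid(D(torch.cat([x_real, y], dim=1))).mean()
    fake_term = torch.log1p(-torch.sigmoid(D(torch.cat([fake, y], dim=1)))).mean()
    return -(real_term + fake_term)          # minimize the negative of V

def g_loss(y, z):
    """Generator lowers log(1 - D(G(z|y))), here via the usual non-saturating form."""
    fake = G(torch.cat([z, y], dim=1))
    return -F.logsigmoid(D(torch.cat([fake, y], dim=1))).mean()

# Toy batch: the condition y could encode, e.g., a coarse tissue mask or a class label.
x_real = torch.randn(8, x_dim)
y = torch.randn(8, y_dim)
z = torch.randn(8, z_dim)
print(d_loss(x_real, y, z).item(), g_loss(y, z).item())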
Existing attention model. An attention model successfully weighs the positions that are highly related to the task, thereby improving the performance of the application in various tasks (Vaswani et al., 2017). It is inspired from the way humans observe images, wherein more attention is paid to a key part of the image in addition to understanding an image as a whole.
Such a model uses convolutional neural networks as basic building blocks and calculates long-range representations that respond to all positions in the input and output images. It then determines the key parts that have high responses in the long-range representations and weights these parts to motivate the networks to better learn the images. Recent work on attention models embedded an auto regressive model to achieve image synthesis and segmentation by calculating the response at a position in a sequence through attention to all positions within the same
sequence (Zhang et al., 2018a). This model has also been integrated into GANs by attending to internal model states to efficiently find global, long-range dependencies within the internal representations of the images. The attention model has been formalized as a non-local operation to model the spatial-temporal dependencies in video sequences (Wang et al., 2018). Despite this progress, the attention model has not yet been explored for the internal effects of different spatial and temporal combinations on synthesis and segmentation in the context of GANs.
A novel progressive sequential causal GAN. A novel progressive sequential causal GAN
(PSCGAN) described herein provides a CA-free technology capable of both synthesizing an LGE-equivalent image and segmenting a diagnosis-related tissue segmentation image (for example, scar, healthy myocardium, and blood pools, as well as other pixels) from cine MR
images to diagnose IHD. As shown schematically in Fig. 3A, this is the first technology to synthesize an image equivalent to a CA-based LGE-image and to segment multiple tissues equivalently to the manual segmentation performed by experts. A further advantage of the described technology is that it is capable of performing concurrent or simultaneous synthesis and segmentation.
PSCGAN builds three phases in a step-by-step cascade of three independent GANs (i.e., priori generation GAN, conditional synthesis GAN, and enhanced segmentation GAN). The first phase uses the priori generation GAN to train the network on a coarse tissue mask; the second phase uses the conditional synthesis GAN to synthesize the LGE-equivalent image; and the third phase uses the enhanced segmentation GAN to segment the diagnosis related tissue image. The PSCGAN creates a pipeline to leverage the commonalities between the synthesis task and the segmentation task, which takes pixel categories and distributions in the coarse tissues mask as a priori condition to guide the LGE-equivalent image synthesis and the fine texture in the LGE-equivalent image as a priori condition to guide the diagnosis-related tissue segmentation.
PSCGAN use these two reciprocal guidances between the two tasks to gain an unprecedentedly high performance in both tasks while performing stable training.
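By way of illustration only, the progressive cascade can be expressed at inference time as in the following Python sketch, assuming each phase's trained generator is available as a callable; the function names and stand-in generators are hypothetical.

import torch

def pscgan_inference(cine_mr, g_pri, g_sys, g_seg):
    """Progressive cascade: cine MR -> coarse tissue mask -> LGE-equivalent image
    -> diagnosis-related tissue segmentation, each phase conditioned on the last."""
    coarse_mask = g_pri(cine_mr)                          # Phase I: priori generation GAN
    lge_equivalent = g_sys(cine_mr, coarse_mask)          # Phase II: conditional synthesis GAN
    tissue_segmentation = g_seg(cine_mr, lge_equivalent)  # Phase III: enhanced segmentation GAN
    return coarse_mask, lge_equivalent, tissue_segmentation

# Toy usage with stand-in generators on a 2D+T input of 25 frames of 64 x 64 pixels.
cine = torch.randn(1, 25, 64, 64, 1)
identity_phase = lambda x, cond=None: torch.zeros(1, 64, 64, 1)
mask, lge, seg = pscgan_inference(cine, identity_phase, identity_phase, identity_phase)
print(mask.shape, lge.shape, seg.shape)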
The PSCGAN further includes the following novel features: (1) a novel sequential causal learning network (SCLN) and (2) the adoption of two specially designed loss terms. First, the SCLN creatively builds a two-stream dependency-extraction pathway and a multi-attention weighing unit. The two-stream pathway multi-scale extracts the spatial and temporal dependencies separately in the spatiotemporal representation of the images to include short-range to long-range scale variants; the multi-attention weighing unit computes the responses within and between spatial and temporal dependencies at task output as a weight and mixes them according
to assigned weights. This network also integrates with the GAN architecture to further facilitate the learning of interest dependencies of the latent space of cine MR images in all phases. Second, the two specially designed loss terms are a synthetic regularization loss term and a self-supervised segmentation auxiliary loss term for optimizing the synthesis task and the segmentation task, respectively. The synthetic regularization loss term uses a sparse regularization learned from the group relationship between the intensity of the pixels to avoid noise during synthesis, thereby improving the quality of the synthesized image, while the self-supervised segmentation auxiliary loss term uses the number of pixels in each tissue as a compensating output rather than only the shape of the tissues to improve the discrimination performance of the segmented image and thereby improve segmentation accuracy.
Overview of PSCGAN. As depicted in Fig. 4, PSCGAN cascades three GANs to build three phases and connect them by taking the output of the previous GAN as an input of the next GAN. Moreover, to reduce the randomness during training, all three GANs encode the cine MR
images by using the same foundational network architecture, a SCLN-based GAN
that includes an encoder-decoder generator and a discriminator to specially design and handle time-series images. Thus, PSCGAN not only have great training stability by using divide-and-conquer to separate the segmentation task and synthesis task into different phases but also undergo effective training by progressively taking the output of the previous phase as the priori condition input to guide the next phase.
Phase I: priori generation GAN. This phase uses the priori generation GAN
(Pri) to generate a coarse tissue mask (Mpri) from the cine MR images X by adversarial training. This coarse segmented image is a rich priori condition, as it contains all pixel categories and tissue shapes, locations, and boundaries.
Phase II: conditional synthesis GAN. This phase uses the conditional synthesis GAN
(Sys) to integrate the coarse tissue mask and the cine MR image to build a conditional joint mapping to use the obtained pixel attributes and distributions from the mask to guide image synthesis for generating a high-quality LGE-equivalent image.
Phase III: enhanced segmentation GAN. This phase uses the enhanced segmentation GAN
(Seg) to introduce the synthesized image from Sys as a priori condition to generate the diagnosis-related tissue segmentation image ISeg. The synthesized image and all detailed textures effectively guide the classification of the tissue boundary pixels.
A component of the PSCGAN is a novel SCLN. SCLN improves the accuracy of time-series image representations by task-specific dependence selecting between and within extracted
spatial and temporal dependencies. By integrating SCLN into the GAN
architecture as the encoder of cine MR images in the generator, the SCLN-based GAN improves the learning effectiveness of the interest distribution from the latent space of cine MR
images, thereby effectively improving the generating performance on adversarial training.
Sequential causal learning network (SCLN). The SCLN uses a two-stream structure that includes a spatial perceptual pathway, a temporal perceptual pathway and a multi-attention weighing unit. This network gains diverse and accurate spatial and temporal dependencies for improving the representation of the time-series images. In addition, this is a general layer that can be used individually or stacked flexibly as the first or any other layer.
Two-stream structure for multi-scale spatial and temporal dependency extraction. As shown in Fig. 5, a two-stream structure, which includes a spatial perceptual pathway and a temporal perceptual pathway, correspondingly match the two-aspect dependencies in the time-series image. It uses two independent, stacked dilated convolutions as multi-scale extractors to respectively focus the spatial dependencies and the temporal dependencies in the time-series images. Dilated convolution includes sparse filters that use skip points during convolution to exponentially grow the receptive field to aggregate multi-scale context information. It improves the diversity of both spatial dependencies and temporal dependencies to include all short-range to long-range scale variants. The 1D/2D dilated convolutions are formulated as:
$$\text{1D:}\quad (\mathrm{kernel} *_{l} x)_t = \sum_{s=-D_c}^{D_c} \mathrm{kernel}_s \, x_{t - l s} \quad (2)$$

$$\text{2D:}\quad (x *_{l} \mathrm{kernel})(p) = \sum_{s + l t = p} x(s)\, \mathrm{kernel}(t) \quad (3)$$

where x is the 1D/2D signal/image, and l is the dilation rate.
The spatial perceptual pathway uses 2D dilated convolution, and the temporal perceptual pathway uses 1D dilated convolution. The inputs of both pathways are cine MR
images. The spatial perceptual pathway regards 2D + T cine MR images as multiple (time t to time t + n) independent 2D images. Each input image is learned by a 2D dilated convolution, where the number of 2D dilated convolution is the same as the number of frames. The output of the 2D
dilated convolution in time t is the spatial feature convolved with the frame of time t only. Thus, the spatial feature of 2D + T cine MR images can be effectively captured when combining all 2D
dilated convolution from time t to time t + n. By contrast, the temporal perceptual pathway regards 2D + T cine MR images as a whole 1D data. This 1D data is learned by 1D
dilated convolutions according to its order, where the hidden units of the 1D dilated convolution that are the same length as the 1D form of each frame (the length of a 64x64 frame is 4096). The output of each 1D
dilated convolution time t is the temporal feature convolved with the frame of time t and the earlier time in the previous layer. Thus, the temporal feature of 2D + T cine MR can be effectively captured when the 1D dilated convolution process reaches the time t + n.
In this experimental example, both pathways initially stack 6 dilated convolutions, and the corresponding dilation rate is [1, 1, 2, 4, 6, 8]. This setting allows the learned representation to include all 3 x 3 to 65 x 65 motion and deformation scales. Note that the stack number still varies with the spatial and temporal resolution of the time-series image during encoding. Moreover, both spatial and temporal perceptual pathways stack 3 stacked dilated convolutions (1D/2D) again to build a residual block framework for deepening the network layers and enriching hierarchical features. In this experimental example, both paths also adopt a causal padding to ensure that the output at time t is only based on the convolution operation at the previous time.
This causal-based convolution means that there is no information leakage from the future to the past. Advantages of this two-stream structure include: 1) two pathways used to focus on two aspect dependencies independently; 2) dilated convolution with residual blocks and shortcut connections used to extract multiscale and multilevel dependencies and 3) causal padding used to understand the time order within the dependencies.
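By way of illustration only, a minimal Python (PyTorch) sketch of the two-stream extractor described above is given below: a 1D temporal pathway with causal (left-only) padding and dilation rates [1, 1, 2, 4, 6, 8], and a per-frame 2D dilated convolution for the spatial pathway; channel counts and the toy input are assumptions made for illustration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalDilatedConv1d(nn.Module):
    """1D convolution padded only on the left, so the output at time t never sees later times."""
    def __init__(self, channels, kernel_size=3, dilation=1):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)
    def forward(self, x):                      # x: (batch, channels, time)
        return self.conv(F.pad(x, (self.pad, 0)))

class TemporalPathway(nn.Module):
    """Stack of causal dilated convolutions with dilation rates [1, 1, 2, 4, 6, 8]."""
    def __init__(self, channels):
        super().__init__()
        self.layers = nn.ModuleList(
            CausalDilatedConv1d(channels, dilation=d) for d in (1, 1, 2, 4, 6, 8))
    def forward(self, x):
        for layer in self.layers:
            x = F.relu(layer(x) + x)           # residual shortcut around each layer
        return x

class SpatialPathway(nn.Module):
    """2D dilated convolution applied to each frame independently."""
    def __init__(self, channels, dilation=2):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=dilation, dilation=dilation)
    def forward(self, frames):                 # frames: (batch, time, channels, H, W)
        b, t, c, h, w = frames.shape
        out = self.conv(frames.reshape(b * t, c, h, w))
        return out.reshape(b, t, c, h, w)

# Toy 2D+T input: 25 frames of a 64 x 64 image, flattened to one long signal for the 1D stream.
frames = torch.randn(1, 25, 1, 64, 64)
temporal = TemporalPathway(channels=1)(frames.reshape(1, 1, -1))   # treat 2D+T as 1D data
spatial = SpatialPathway(channels=1)(frames)
print(temporal.shape, spatial.shape)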
Multi-attention weighing unit for task-specific dependence selection. As shown in Fig. 5, the multi-attention weighing unit includes three independent self-attention layers and an add operator to adaptively weigh the high-contribution dependences between and within spatial and temporal dependencies at the output to perform accurate task-specific dependence selection (Vaswani et al., 2017). Two self-attention layers first embed behind both the spatial perceptual pathway and the temporal perceptual pathway to adaptively compute the response of each pathway's dependence at the output as their weights; then, the add operator element-wise fuses the weighed spatial and temporal dependencies; finally, the third self-attention layer determines which of the fused spatial-temporal dependences is the task-specific dependence. The spatial dependencies from the spatial perceptual pathway are defined as $F_{SConv} \in \mathbb{R}^{C \times N}$, where C is the number of channels and N is the number of dependencies.
The spatial self-attention layer first maps these spatial dependencies into two feature spaces $f(F_{SConv}) = W_f F_{SConv}$ and $g(F_{SConv}) = W_g F_{SConv}$. It calculates the weight $\alpha_i$ for the i-th dependency, where $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_i, \ldots, \alpha_N) \in \mathbb{R}^{C \times N}$:

$$\alpha_i = \frac{\exp(s_i)}{\sum_{i=1}^{N} \exp(s_i)}, \quad \text{where } s_i = f(F_{SConv,i})^{\top}\, g(F_{SConv,i}) \quad (4)$$

The weighed spatial dependencies $\alpha F_{SConv}$ are as follows:

$$\alpha F_{SConv} = \sum_{i=1}^{N} \alpha_i\, h(F_{SConv,i}) \quad (5)$$

$$h(F_{SConv,i}) = W_h F_{SConv,i} \quad (6)$$

where $W_g$, $W_f$, $W_h$ are the learned weight matrices. For memory efficiency, $W_g, W_f, W_h \in \mathbb{R}^{\bar{C} \times C}$, where $\bar{C}$ is the reduced channel number and $\bar{C} = C/8$. Note that 8 is a hyperparameter.

By the same token, the temporal self-attention layer enhances the temporal dependencies $F_{TConv}$ from the temporal perceptual pathway to an attention-weighted $\beta F_{TConv} \in \mathbb{R}^{C \times N}$, where $\beta = (\beta_1, \beta_2, \ldots, \beta_i, \ldots, \beta_N) \in \mathbb{R}^{C \times N}$ are the weights of the temporal dependencies.

The add operator element-wise fuses the weighed spatial dependencies and temporal dependencies:

$$F_{STConv} = \alpha F_{SConv} + \beta F_{TConv} \quad (7)$$

The fused self-attention layer weighs the fused spatial-temporal dependencies $F_{STConv}$.
The output of this layer is $\hat{F}_{STConv} \in \mathbb{R}^{C \times N}$. This output is further added to the input of the map layer after modification with a learnable scalar $\gamma$. Therefore, the final output is given by $O_{STConv} = \gamma \hat{F}_{STConv} + F_{STConv}$.
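By way of illustration only, a minimal Python (PyTorch) sketch of one self-attention weighing layer in the style of Equations (4) to (6), with reduced channels C/8 and the learnable scalar γ, and of the element-wise fusion of Equation (7), is given below; it follows a standard self-attention formulation and the tensor shapes are assumptions made for illustration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfAttentionWeighing(nn.Module):
    """Weighs N dependencies of C channels each (Eqs. 4-6), with reduced channels C/8."""
    def __init__(self, channels):
        super().__init__()
        reduced = max(channels // 8, 1)
        self.f = nn.Conv1d(channels, reduced, 1)   # W_f
        self.g = nn.Conv1d(channels, reduced, 1)   # W_g
        self.h = nn.Conv1d(channels, channels, 1)  # W_h
        self.gamma = nn.Parameter(torch.zeros(1))  # learnable scalar for the residual output
    def forward(self, x):                          # x: (batch, C, N)
        scores = torch.bmm(self.f(x).transpose(1, 2), self.g(x))   # (batch, N, N) responses
        weights = F.softmax(scores, dim=-1)                        # Eq. (4)
        weighted = torch.bmm(self.h(x), weights.transpose(1, 2))   # Eqs. (5)-(6)
        return self.gamma * weighted + x                           # weighted output plus input

# Fusion of the two pathways (Eq. 7): element-wise add, then a third attention layer.
C, N = 32, 64
spatial = SelfAttentionWeighing(C)(torch.randn(2, C, N))
temporal = SelfAttentionWeighing(C)(torch.randn(2, C, N))
fused = SelfAttentionWeighing(C)(spatial + temporal)
print(fused.shape)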
Implementation of an SCLN-based GAN for the basic network architecture. This network stacks 4 SCLNs and 4 corresponding up-sampling blocks to build a generator.
The network further stacks 5 convolutional layers to build a discriminator. Both the generator and discriminator use conditional adversarial training to effectively perform the segmentation and synthesis. As shown in Fig. 6, the generator is an encode-decode 2D+T to 2D
framework modified from U-Net (Ronneberger, O., Fischer, P., Brox, T., 2015. U-net:
Convolutional networks for biomedical image segmentation, in: International Conference on Medical image computing and computer-assisted intervention, Springer. pp. 234-241). It first encodes the input $X \in \mathbb{R}^{25 \times 64 \times 64 \times 1}$ (25 frames, image size per frame 64 x 64 x 1) by using 4 SCLNs with 2, 2, 2, 2 strides on the spatial perceptual pathway and 4, 4, 4, 4 strides on the temporal perceptual pathway. The first SCLN uses two copies of X as the inputs into its spatial perceptual pathway and temporal perceptual pathway. Thus, beginning from the second SCLN, the generator takes the spatial and temporal perceptual pathway outputs of the previous SCLN as the input and encodes a 25 x 4 x 4 x 128 feature from the multi-attention weighing unit output of the fourth SCLN. Then, this encoded feature is further reduced to 1 x 1 x 4096 by a fully connected layer and is then passed to another fully connected layer to reshape the encoded feature into a 4 x 4 x 256 feature. Four upsampling blocks (Upsampling-Conv2D-LN) then use this reshaped feature to encode an image (i.e., the coarse tissue mask, the LGE-equivalent image or the diagnosis-related tissue segmentation image) $\in \mathbb{R}^{64 \times 64 \times 1}$. Moreover, the generator also uses a dot layer to reduce the first dimension of the multi-attention weighing unit output from the first to the third SCLN
and a skip connection that is the same as the U-Net to feed the corresponding upsampling block with the same feature map size.
The discriminator encodes the output of the generator of the corresponding phase and determines whether this output is consistent with the domain of its ground truth. All 5 convolutional layers have strides of 2. Note that the attention layer is added between the second convolutional layer and the third convolutional layer. These attention layers endow the discriminator with the ability to verify that highly detailed features in distant portions of the image are consistent with each other and to improve the discrimination performance.
An advantage of this SCLN-based GAN is an accurate encoding of the dependencies of interest from the latent space of the cine MR images.
Phase I: priori generation GAN for coarse tissue mask generation. The priori generation GAN (Pri) is built with the same architecture as the SCLN-based GAN, as shown in Fig. 7 (part a). It includes a generator $G_{Pri}$ and a discriminator $D_{Pri}$. This GAN generates a coarse tissue mask $M_{Pri}$, which focuses on drawing the shape, contour and correct categories for four classes (scar, healthy myocardium, blood pool, and other pixels). This GAN does not seek a final result in one step but takes advantage of the shape, contour, and categories of this rough segmentation as a priori information to guide the next module to learn the attributes and distributions of the pixels. Training of this generator uses a multi-class cross-entropy loss.
Although $M_{Pri}$ contains four classes, the generator treats the samples of each class as a single classification problem by encoding both the generator output and the ground truth as one-hot class vectors. The generator loss can be formulated as follows:
$$\mathcal{L}^{G_{Pri}} = \sum_{n=1}^{N} \ell_{mce}\left(G_{Pri}(X),\, I_{Seg}\right) \qquad (8)$$

$$\ell_{mce} = -\frac{1}{N}\sum_{n=1}^{N}\left[I_{Seg}\log M_{Pri} + \left(1 - I_{Seg}\right)\log\left(1 - M_{Pri}\right)\right] \qquad (9)$$

where $I_{Seg}$ is the ground truth of $M_{Pri}$, and $N = 4$.
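A rough PyTorch rendering of the per-class cross-entropy of equations (8)-(9) might look as follows; the tensor layout (B x 4 x H x W), the sigmoid activation on the generator logits and the numerical epsilon are assumptions of this sketch.

```python
import torch


def multiclass_ce_loss(mask_logits: torch.Tensor, target_onehot: torch.Tensor) -> torch.Tensor:
    """Per-class binary cross-entropy over the N = 4 one-hot classes, in the spirit of eqs. (8)-(9).

    mask_logits:   B x 4 x H x W raw generator outputs for the coarse mask M_Pri
    target_onehot: B x 4 x H x W one-hot ground truth I_Seg
    """
    probs = torch.sigmoid(mask_logits)                    # each class treated as its own problem
    per_class = -(target_onehot * torch.log(probs + 1e-8)
                  + (1.0 - target_onehot) * torch.log(1.0 - probs + 1e-8))
    return per_class.mean()                               # average over classes, pixels and batch
```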
The discriminator training uses the adversarial loss $\mathcal{L}_{Adv}^{D_{Pri}}$, which adopts the recently developed hinge adversarial loss (Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I., 2017. Attention is all you need, in: Advances in Neural Information Processing Systems, pp. 5998-6008). This hinge adversarial loss maps the true sample to a range greater than 1 and maps the false sample to an interval less than -1. It better converges to the Nash equilibrium between the discriminator and generator, thus resulting in less mode collapsing and more stable training performance than other GAN losses. It can be formulated as follows:

$$\mathcal{L}_{Adv}^{D_{Pri}} = -\mathbb{E}_{I_{Seg}\sim p_{data}}\left[\min\left(0,\, -1 + D_{Pri}\left(I_{Seg}\right)\right)\right] - \mathbb{E}_{X\sim p_X}\left[\min\left(0,\, -1 - D_{Pri}\left(G_{Pri}(X)\right)\right)\right]$$

$$\mathcal{L}_{Adv}^{G_{Pri}} = -\mathbb{E}_{X\sim p_X}\left[D_{Pri}\left(G_{Pri}(X)\right)\right] \qquad (10)$$
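A minimal sketch of this hinge formulation (equation (10)) follows; the function names are illustrative, and the real and fake scores are assumed to be raw discriminator outputs.

```python
import torch


def d_hinge_loss(real_scores: torch.Tensor, fake_scores: torch.Tensor) -> torch.Tensor:
    """Discriminator hinge loss: push real scores above +1 and fake scores below -1."""
    loss_real = torch.relu(1.0 - real_scores).mean()      # -E[min(0, -1 + D(real))]
    loss_fake = torch.relu(1.0 + fake_scores).mean()      # -E[min(0, -1 - D(fake))]
    return loss_real + loss_fake


def g_hinge_loss(fake_scores: torch.Tensor) -> torch.Tensor:
    """Generator hinge loss: -E[D(G(X))], i.e. raise the score of generated samples."""
    return -fake_scores.mean()
```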
Phase II: conditional synthesis GAN for high-quality LGE-equivalent image synthesis. The conditional synthesis GAN (Sys) includes a generator $G_{Sys}$ and a discriminator $D_{Sys}$ to generate an LGE-equivalent image $I_{Sys}$. As shown in Fig. 7 (part b), this GAN introduces the previously generated coarse tissue mask to guide the network training by modifying the SCLN-based GAN with a fully connected layer in the generator that concatenates the 1 x 1 x 4096 feature and the mask; the output of this layer is then fed into the following fully connected layer and 4 upsampling blocks. Thus, this GAN builds a conditional joint mapping space between the segmentation and the synthesis to use the basic attributes and distributions (i.e., shape, contour, location, and categories) of the tissues to disentangle different tissue-feature learning in the cine MR images, and it allows the generator to perform accurate and detailed synthesis. A sketch of this conditioning step is given below.
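The sketch assumes a single-channel 64 x 64 coarse mask that is flattened and concatenated with the 1 x 1 x 4096 feature before the second fully connected layer; the dimensions and the class name are illustrative assumptions.

```python
import torch
import torch.nn as nn


class ConditionalBottleneck(nn.Module):
    """Concatenates the flattened coarse mask with the 1 x 1 x 4096 feature (conditional
    joint mapping space) and produces the 4 x 4 x 256 feature fed to the upsampling blocks."""

    def __init__(self, feat_dim: int = 4096, mask_hw: int = 64):
        super().__init__()
        self.fc = nn.Linear(feat_dim + mask_hw * mask_hw, 4 * 4 * 256)

    def forward(self, feature_4096: torch.Tensor, coarse_mask: torch.Tensor) -> torch.Tensor:
        cond = torch.cat([feature_4096, coarse_mask.flatten(1)], dim=1)
        return torch.relu(self.fc(cond)).view(-1, 256, 4, 4)
```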
The generator uses the synthetic regularization loss $\mathcal{L}^{G_{Sys}}$ for the training. This loss incorporates an L2-regularization term and an overlapping group sparsity anisotropic operator into the recently developed total variation loss to improve the quality of the synthesized image (Pumarola, A., Agudo, A., Martinez, A.M., Sanfeliu, A., Moreno-Noguer, F., 2018. Ganimation: Anatomically-aware facial animation from a single image, in: Proceedings of the European Conference on Computer Vision, pp. 818-833). The total variation loss has recently shown the ability to significantly reduce the noise in the synthesized image during image synthesis. L2-regularization is further incorporated into the total variation loss to measure the computational complexity and prevent overfitting by penalizing this complexity. The overlapping group sparsity anisotropic operator is further incorporated into the total variation loss. It takes into account the group sparsity characteristics of image intensity derivatives, thereby avoiding the staircase artifacts that erroneously render smooth regions as piecewise regions. This loss is formulated as follows:
$$\mathcal{L}^{G_{Sys}} = \mathbb{E}\left[\left\|I_{Sys} - \hat{I}_{Sys}\right\|_2^2\right] + v\left[\varphi\left(\nabla_h I_{Sys}\right) + \varphi\left(\nabla_v I_{Sys}\right)\right] \qquad (11)$$

where $i$ and $j$ are the $i$th and $j$th pixel entries of $I_{Sys}$, $v > 0$ is a regularization parameter, and $\varphi(\cdot)$ is the overlapping group sparsity function. The overlapping group sparsity anisotropic operator is described as

$$\varphi(I) = \sum_{i,j=1}^{n}\left\|\tilde{I}_{i,j,K}\right\|_2 \qquad (12)$$

$$\tilde{I}_{i,j,K} = \left[I_{i-m_1+l_1,\, j-m_1+l_2}\right]_{0\le l_1,\, l_2\le K-1} \qquad (13)$$

where $K$ is the group size, $m_1 = \left\lfloor\frac{K-1}{2}\right\rfloor$, and $m_2 = \left\lceil\frac{K-1}{2}\right\rceil$.
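Because equations (11)-(13) are only partially legible in the source text, the following sketch should be read as an approximation of the stated intent (an L2 reconstruction term plus an overlapping-group-sparsity anisotropic total variation penalty) rather than the exact loss; the group size, the weighting v and the padding scheme are assumptions.

```python
import torch
import torch.nn.functional as F


def overlapping_group_sparsity(x: torch.Tensor, k: int = 3) -> torch.Tensor:
    """phi(.): sum over pixels of the L2 norm of the K x K group centred on each pixel."""
    pad = k // 2
    groups = F.unfold(F.pad(x, (pad, pad, pad, pad)), kernel_size=k)  # B x (C*k*k) x (H*W)
    return groups.norm(p=2, dim=1).sum(dim=1).mean()                  # mean over the batch


def synthetic_regularization_loss(i_sys: torch.Tensor, i_gt: torch.Tensor,
                                  v: float = 0.01, k: int = 3) -> torch.Tensor:
    """L2 reconstruction term plus overlapping-group-sparsity anisotropic TV (sketch of eq. (11))."""
    l2 = F.mse_loss(i_sys, i_gt)                                      # L2-regularization term
    grad_h = i_sys[:, :, :, 1:] - i_sys[:, :, :, :-1]                 # horizontal intensity derivatives
    grad_v = i_sys[:, :, 1:, :] - i_sys[:, :, :-1, :]                 # vertical intensity derivatives
    ogs_tv = overlapping_group_sparsity(grad_h, k) + overlapping_group_sparsity(grad_v, k)
    return l2 + v * ogs_tv
```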
The discriminator is trained using an adversarial loss term and a synthetic content loss term: 1) the synthesis adversarial loss $\mathcal{L}_{Adv}^{D_{Sys}}$ adopts the hinge adversarial loss and can be formulated as:

$$\mathcal{L}_{Adv}^{D_{Sys}} = -\mathbb{E}_{\hat{I}_{Sys}\sim p_{data}}\left[\min\left(0,\, -1 + D_{Sys}\left(\hat{I}_{Sys}\right)\right)\right] - \mathbb{E}_{X\sim p_X}\left[\min\left(0,\, -1 - D_{Sys}\left(G_{Sys}\left(X \mid M_{Pri}\right)\right)\right)\right]$$

$$\mathcal{L}_{Adv}^{G_{Sys}} = -\mathbb{E}_{X\sim p_X}\left[D_{Sys}\left(G_{Sys}\left(X \mid M_{Pri}\right)\right)\right] \qquad (14)$$

where $\hat{I}_{Sys}$ is the ground truth (i.e., the LGE image);
2) the synthetic content loss $\mathcal{L}_{Cont}^{D_{Sys}}$ uses the feature maps of the 2nd, 3rd and 4th convolution layers outputted from the discriminator to evaluate $I_{Sys}$ by comparing it to its ground truth $\hat{I}_{Sys}$. This multiple feature map evaluation allows the discriminator to discriminate the image in terms of both the general detail content and the higher detail abstraction activated in the deeper layers, thereby improving the discriminator performance. It is defined as follows:

$$\mathcal{L}_{Cont}^{D_{Sys}} = \mathbb{E}_{\hat{I}_{Sys}\sim p_{data}}\left[\sum_{i=2}^{4}\frac{1}{W_i H_i}\sum_{x=1}^{W_i}\sum_{y=1}^{H_i}\left(D_{Sys}^{Conv_i}\left(\hat{I}_{Sys}\right)_{x,y} - D_{Sys}^{Conv_i}\left(G_{Sys}\left(X \mid M_{Pri}\right)\right)_{x,y}\right)^2\right] \qquad (15)$$

where $D_{Sys}^{Conv_i}$ is the feature map, and $W_i$ and $H_i$ are its width and height, obtained from the $i$th convolution layer (after activation).
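A compact sketch of this multi-feature-map comparison (equation (15)) follows. The feature lists are assumed to come from the discriminator's 2nd-4th convolution layers (for example, the feats list returned by the discriminator sketch above), and the exact normalization is an assumption of the sketch.

```python
import torch
from typing import List


def synthetic_content_loss(real_feats: List[torch.Tensor],
                           fake_feats: List[torch.Tensor]) -> torch.Tensor:
    """Squared feature-map differences, normalized by each map's W_i * H_i, summed over layers."""
    loss = torch.zeros(())
    for real_f, fake_f in zip(real_feats, fake_feats):
        _, _, h_i, w_i = real_f.shape
        layer = ((real_f - fake_f) ** 2).sum(dim=(1, 2, 3)) / (w_i * h_i)  # per-sample layer term
        loss = loss + layer.mean()                                         # expectation over the batch
    return loss
```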
Advantages of the conditional synthesis GAN include: 1) the coarse tissue mask is used as an a priori condition to guide the accurate synthesis of the tissues, 2) the synthetic regularization loss is used to reduce the image noise during synthesis, and 3) the synthetic content loss is used to improve the detail restoration in the image synthesis.
Phase III: enhanced segmentation GAN for accurate diagnosis-related tissue segmentation. The enhanced segmentation GAN (Seg) includes a generator $G_{Seg}$ and a discriminator $D_{Seg}$ to generate an accurate diagnosis-related tissue segmentation image $I_{Seg}$, as shown in Fig. 7 (part c). Compared to the basic SCLN-based GAN, this GAN has the following two differences: 1) it adds a fully connected layer into the generator, at the same position as that of the conditional synthesis GAN, to introduce the synthesized image output from phase II as a condition to guide the segmentation (the synthesized image already includes all detailed textures of the tissues, which effectively aids the fine classification of the tissue boundary pixels), and 2) it adds a linear layer at the end of the discriminator to regress the size (number of pixels) of the 4 different segmentation categories in order to perform a self-supervised segmentation auxiliary loss. This self-supervised loss prevents the discriminator from judging the segmented image based only on the segmentation shape, causing the discriminator to extract a compensate feature from the input image to improve its discrimination performance. A sketch of this size-regression idea is given below.
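Because the corresponding auxiliary loss (equation (17) below) is only partially legible in the source, the sketch simply assumes an L1 regression between the sizes predicted by the discriminator's linear head and the per-class pixel counts of the segmented image; this form is an assumption, not the exact patented loss.

```python
import torch
import torch.nn.functional as F


def category_sizes(seg_onehot: torch.Tensor) -> torch.Tensor:
    """S_i: number of pixels in each of the 4 segmentation categories (returns B x 4)."""
    return seg_onehot.sum(dim=(2, 3))


def auxiliary_size_loss(predicted_sizes: torch.Tensor, seg_onehot: torch.Tensor) -> torch.Tensor:
    """Self-supervised auxiliary loss sketch: regress per-class pixel counts (L1 assumed)."""
    return F.l1_loss(predicted_sizes, category_sizes(seg_onehot))
```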
The generator with multi-class cross-entropy loss and the discriminator with segmentation adversarial loss can be formulated as follows:
$$\mathcal{L}^{G_{Seg}} = \sum_{n=1}^{N}\ell_{mce}\left(G_{Seg}\left(X \mid I_{Sys}\right),\, I_{Seg}\right)$$

$$\mathcal{L}_{Adv}^{D_{Seg}} = -\mathbb{E}_{I_{Seg}\sim p_{data}}\left[\min\left(0,\, -1 + D_{Seg}\left(I_{Seg}\right)\right)\right] - \mathbb{E}_{X\sim p_X}\left[\min\left(0,\, -1 - D_{Seg}\left(G_{Seg}\left(X \mid I_{Sys}\right)\right)\right)\right]$$

$$\mathcal{L}_{Adv}^{G_{Seg}} = -\mathbb{E}_{X\sim p_X}\left[D_{Seg}\left(G_{Seg}\left(X \mid I_{Sys}\right)\right)\right] \qquad (16)$$

The discriminator with self-supervised segmentation auxiliary loss can be formulated as follows:

$$\mathcal{L}_{Aux}^{D_{Seg}} = \mathbb{E}_{\left(I_{Sys},\, I_{Seg}\right)\sim p_{data}}\left[\log D_{Seg}^{Aux}\left(S_i \mid G_{Seg}(X)\right)\right] \qquad (17)$$

where $S_i = (S_{i1}, S_{i2}, S_{i3}, S_{i4})$ is the size of the 4 segmentation categories of pixels in the image outputted from the linear layer of the discriminator $D_{Seg}^{Aux}$. Advantages of the enhanced segmentation GAN include: 1) the boundaries of tissues within the synthesized images are used to guide the segmentation of the tissue boundaries, and 2) the self-supervised segmentation auxiliary loss is used to improve the adversarial segmentation discrimination.
Materials and implementation for Experimental Example 1.
A total of 280 (230 IHD and 50 normal control) patients with short-axis cine MR images were selected. Cardiac cine MR images were obtained using a 3-T MRI system (Verio, Siemens, Erlangen, Germany). Retrospectively gated balanced steady-state free-precession non-enhanced

Claims (111)

WHAT IS CLAIMED IS:
1. A medical imaging method for concurrent and simultaneous synthesis of a medical CA-free-AI-enhanced image and medical diagnostic image analysis comprising:
receiving a magnetic resonance (MR) image acquired by a medical MR scanner in absence of contrast agent enhancement;
providing the MR image to a computer-implemented machine learning model;
concurrently performing a medical CA-free-AI-enhanced image synthesis task and a medical diagnostic image analysis task with the machine learning model;
reciprocally communicating between the image synthesis task and the image analysis task for mutually dependent training of both tasks.
2. The method of claim 1, wherein the machine learning model is trained by deep reinforcement learning.
3. The method of claim 1, wherein the machine learning model comprises a neural network.
4. The method of claim 3, wherein the neural network comprises at least one generative adversarial network (GAN).
5. The method of claim 1, wherein the medical diagnostic image analysis is a tissue segmented image and the machine learning model is a plurality of machine learning components, the method further comprising:
inputting the MR image into a first machine learning component;
obtaining a coarse tissues mask from the first machine learning component;
inputting the coarse tissues mask and the MR image into a second machine learning component;
obtaining a CA-free-AI-enhanced image from the second machine learning component;
inputting the CA-free-AI-enhanced image and the MR image into a third machine learning component;
obtaining a diagnosis-related tissue segmented image from the third machine learning component.

6. The method of claim 5, wherein at least one of the plurality of machine learning components is trained by deep reinforcement learning.
7. The method of claim 5, wherein each of the plurality of machine learning components comprises a neural network.
8. The method of claim 7, wherein the plurality of machine learning components comprises a first generative adversarial network (GAN), a second GAN and a third GAN.
9. The method of claim 8, further comprising a sequential causal learning network (SCLN) connected to a generator network of each of the first, second and third GANs, the SCLN
configured as an encoder of the MR image.
10. The method of claim 9, wherein the SCLN comprises a two-stream structure of a spatial perceptual pathway and a temporal perceptual pathway to independently extract spatial and temporal dependencies from the MR image.
11. The method of claim 10, wherein each of the spatial perceptual pathway and the temporal perceptual pathway includes a dilated convolutional network.
12. The method of claim 10, further comprising a multi-attention weighing unit embedded within each of the spatial perceptual pathway and the temporal perceptual pathway to compute and select task-specific dependence within the spatial and temporal dependencies.
13. The method of claim 12, wherein the multi-attention weighing unit comprises a first attention layer embedded in the spatial perceptual pathway to compute weights for spatial dependencies, a second attention layer embedded in the temporal perceptual pathway to compute weights for temporal dependencies, an add operator to fuse the weighted spatial and temporal dependencies, and a third attention layer to determine task-specific dependence of the fused spatial-temporal dependencies.

14. The method of claim 8, wherein a generator network of the second GAN is trained using a synthetic regularization loss term to improve quality of image synthesis, and a discriminator network of the second GAN is trained using a synthetic content loss term.
15. The method of claim 14, wherein a discriminator network of the third GAN
is trained using a self-supervised segmentation auxiliary loss term causing the discriminator network to extract a tissue-related compensate feature from the CA-free-AI-enhanced image.
16. The method of claim 5, wherein the MR image is a time-series of MR images.
17. The method of claim 16, wherein the time-series of MR images are cine MR
images.
18. The method of claim 16, wherein the time-series of MR images are cardiac MR images.
19. The method of claim 18, further comprising a neural network with a heart localization layer configured to automatically crop cardiac MR image to a region-of-interest.
20. The method of claim 4, wherein the medical diagnostic image analysis is a tumor detection and the at least one GAN is a tripartite generative adversarial network (GAN) comprising a generator network, a discriminator network and a detector network, the method further comprising:
inputting the MR image into the generator network;
obtaining a CA-free-AI-enhanced image and an attention map of tumor specific features from the generator network;
inputting the CA-free-AI-enhanced image and the attention map into the detector network;
obtaining a tumor location and a tumor classification extracted from the CA-free-AI-enhanced image by the detector network;
training the generator network by both adversarial learning with the discriminator network and back-propagation with the detector network.
21. The method of claim 20, wherein the generator network includes a dual attention module that produces the attention map.

22. The method of claim 21, wherein the dual attention module includes first and second attention modules in parallel, the first attention module providing feature representation learning of tumor specificity and the second attention module providing global context learning of a multi-class aspect of the MR image.
23. The method of claim 22, wherein information from the first attention module and the second attention module is fused to generate the attention map.
24. The method of claim 20, wherein the generator network is an attention-aware generator, the discriminator network is a convolutional neural network-based (CNN-based) discriminator, and the detector network is a region-based convolutional neural network-based (R-CNN-based) detector.
25. The method of claim 20, wherein the tripartite-GAN incorporates a tripartite loss function relating to three tasks of synthesis of the CA-free-AI-enhanced image, discrimination of the CA-free-AI-enhanced image and tumor classification of the CA-free-AI-enhanced image.
26. The method of claim 20, wherein the MR image is a time-series of MR
images.
27. The method of claim 20, wherein the MR image is a liver MR image.
28. The method of claim 2, wherein the medical diagnostic image analysis is a tumor detection and the machine learning model is a pixel-level graph reinforcement learning model comprising a plurality of pixel-level agents equaling the number of pixels, each of the plurality of pixel-level agents associated with a single pixel, and a graph convolutional network communicative with all of the plurality of pixel-level agents, the method further comprising:
inputting the MR image into the pixel-level graph reinforcement learning model;
determining an intensity value for each pixel of the MR image with the plurality of pixel-level agents according to a learned policy;
outputting a plurality of pixel-level actions, a single pixel-level action for each pixel of the MR
image, to change the MR image to synthesize a CA-free-AI-enhanced image.

29. The method of claim 28, wherein the graph convolutional network comprises a state-behavior network for generating pixel-level candidate actions and a state-evaluator network for generating pixel-level average actions and a reward function communicative with both the state-behavior network and the state-evaluator network, the method further comprising:
generating, for each pixel of the MR image, a pixel-level candidate action and a corresponding pixel-level average action;
comparing each pixel-level candidate action with each corresponding pixel-level average action using the reward function and selecting an action for each corresponding pixel;
reciprocally training the state-behavior network and the state-evaluator network by communicating a parameter of the selected action to both the state-behavior network and the state-evaluator network.
30. The method of claim 29, wherein the communication of the parameter of the selected action trains the state-behavior network to improve estimates of the pixel-level candidate action and trains the state-evaluator network to improve prediction of pixel-level average action.
31. The method of claim 30, wherein the reward function is a dual-level reward function combination of a pixel-level reward function and a regional-level reward function.
32. The method of claim 31, wherein the pixel-level reward function is a Euclidean distance-based pixel-level reward function and the regional-level reward function is a Wasserstein distance-based region-level reward function.
33. The method of claim 2, wherein the medical diagnostic image analysis is a tumor detection and the machine learning model is a weakly-supervised teacher-student network comprising a teacher module and a student module, the method further comprising:
inputting the MR image into a detection component of the student module;
obtaining a fused tumor detection box locating a tumor in the MR image based on two tumor detection boxes generated by two detection strategies of the detection component;
inputting the MR image and the fused tumor detection box into a segmentation component of the student module;
obtaining a tumor segmented MR image.

34. The method of claim 33, wherein the student detection component is a student dual-strategy deep reinforcement learning model comprising a first pair of cooperating relative-entropy biased actor-critic components; and the student segmentation component is a dense U-net.
35. The method of claim 33, wherein the student detection component is guided by a tumor detection strategy provided by a detection component of the teacher module, and the student segmentation component is guided by a tumor mask provided by a segmentation component of the teacher module.
36. The method of claim 35, wherein the detection component of the teacher module is a teacher dual-strategy deep reinforcement learning model comprising a second pair of cooperating relative-entropy biased actor-critic components trained by learning the tumor detection strategy from contrast-agent(CA)-enhanced MR image, and the segmentation component of the teacher module is a self-ensembling component including an uncertainty-estimation trained to learn tumor segmentation and generate a tumor mask.
37. The method of claim 36, wherein the detection component of the teacher module inputs the CA-enhanced MR image and outputs a teacher-fused tumor detection box, and the segmentation component of the teacher module inputs the CA-enhanced MR image and the teacher-fused tumor detection box and outputs the tumor mask.
38. A medical imaging system for concurrent and simultaneous synthesis of a medical CA-free-AI-enhanced image and medical diagnostic image analysis comprising:
an interface device configured for receiving a magnetic resonance (MR) image acquired by a medical MR scanner in absence of contrast agent enhancement;
a memory configured for storing the MR image and a computer-implemented machine learning model;
a processor configured for:
inputting the MR image to the computer-implemented machine learning model;
concurrently performing a medical CA-free-AI-enhanced image synthesis task and a medical diagnostic image analysis task with the machine learning model;
reciprocally communicating between the image synthesis task and the image analysis task for mutually dependent training of both tasks.

39. The system of claim 38, wherein the machine learning model is trained by deep reinforcement learning.
40. The system of claim 38, wherein the machine learning model comprises a neural network.
41. The system of claim 40, wherein the neural network comprises at least one generative adversarial network (GAN).
42. The system of claim 38, wherein the medical diagnostic image analysis is a tissue segmented image and the machine learning model is a plurality of machine learning components, wherein the processor is configured for:
inputting the MR image into a first machine learning component;
obtaining a coarse tissues mask from the first machine learning component;
inputting the coarse tissues mask and the MR image into a second machine learning component;
obtaining a CA-free-AI-enhanced image from the second machine learning component;
inputting the CA-free-AI-enhanced image and the MR image into a third machine learning component;
obtaining a diagnosis-related tissue segmented image from the third machine learning component.
43. The system of claim 42, wherein at least one of the plurality of machine learning components is trained by deep reinforcement learning.
44. The system of claim 42, wherein each of the plurality of machine learning components comprises a neural network.
45. The system of claim 44, wherein the plurality of machine learning components comprises a first generative adversarial network (GAN), a second GAN and a third GAN.
46. The system of claim 45, further comprising a sequential causal learning network (SCLN) connected to a generator network of each of the first, second and third GANs, the SCLN
configured as an encoder of the MR image.

47. The system of claim 46, wherein the SCLN comprises a two-stream structure of a spatial perceptual pathway and a temporal perceptual pathway to independently extract spatial and temporal dependencies from the MR image.
48. The system of claim 47, wherein each of the spatial perceptual pathway and the temporal perceptual pathway includes a dilated convolutional network.
49. The system of claim 47, further comprising a multi-attention weighing unit embedded within each of the spatial perceptual pathway and the temporal perceptual pathway to compute and select task-specific dependence within the spatial and temporal dependencies.
50. The system of claim 49, wherein the multi-attention weighing unit comprises a first attention layer embedded in the spatial perceptual pathway to compute weights for spatial dependencies, a second attention layer embedded in the temporal perceptual pathway to compute weights for temporal dependencies, an add operator to fuse the weighted spatial and temporal dependencies, and a third attention layer to determine task-specific dependence of the fused spatial-temporal dependencies.
51. The system of claim 45, wherein a generator network of the second GAN is trained using a synthetic regularization loss term to improve quality of image synthesis, and a discriminator network of the second GAN is trained using a synthetic content loss term.
52. The system of claim 51, wherein a discriminator network of the third GAN
is trained using a self-supervised segmentation auxiliary loss term causing the discriminator network to extract a tissue-related compensate feature from the CA-free-AI-enhanced image.
53. The system of claim 42, wherein the MR image is a time-series of MR
images.
54. The system of claim 53, wherein the time-series of MR images are cine MR
images.
55. The system of claim 53, wherein the time-series of MR images are cardiac MR images.

56. The system of claim 55, further comprising a neural network with a heart localization layer configured to automatically crop cardiac MR image to a region-of-interest.
57. The system of claim 41, wherein the medical diagnostic image analysis is a tumor detection and the at least one GAN is a tripartite generative adversarial network (GAN) comprising a generator network, a discriminator network and a detector network, wherein the processor is configured for:
inputting the MR image into the generator network;
obtaining a CA-free-AI-enhanced image and an attention map of tumor specific features from the generator network;
inputting the CA-free-AI-enhanced image and the attention map into the detector network;
obtaining a tumor location and a tumor classification extracted from the CA-free-AI-enhanced image by the detector network;
training the generator network by both adversarial learning with the discriminator network and back-propagation with the detector network.
58. The system of claim 57, wherein the generator network includes a dual attention module that produces the attention map.
59. The system of claim 58, wherein the dual attention module includes first and second attention modules in parallel, the first attention module providing feature representation learning of tumor specificity and the second attention module providing global context learning of a multi-class aspect of the MR image.
60. The system of claim 59, wherein information from the first attention module and the second attention module is fused to generate the attention map.
61. The system of claim 57, wherein the generator network is an attention-aware generator, the discriminator network is a convolutional neural network-based (CNN-based) discriminator, and the detector network is a region-based convolutional neural network-based (R-CNN-based) detector.

62. The system of claim 57, wherein the tripartite-GAN incorporates a tripartite loss function relating to three tasks of synthesis of the CA-free-AI-enhanced image, discrimination of the CA-free-AI-enhanced image and tumor classification of the CA-free-AI-enhanced image.
63. The system of claim 57, wherein the MR image is a time-series of MR
images.
64. The system of claim 57, wherein the MR image is a liver MR image.
65. The system of claim 39, wherein the medical diagnostic image analysis is a tumor detection and the machine learning model is a pixel-level graph reinforcement learning model comprising a plurality of pixel-level agents equaling the number of pixels, each of the plurality of pixel-level agents associated with a single pixel, and a graph convolutional network communicative with all of the plurality of pixel-level agents, wherein the processor is configured for:
inputting the MR image into the pixel-level graph reinforcement learning model;
determining an intensity value for each pixel of the MR image with the plurality of pixel-level agents according to a learned policy;
outputting a plurality of pixel-level actions, a single pixel-level action for each pixel of the MR
image, to change the MR image to synthesize a CA-free-AI-enhanced image.
66. The system of claim 65, wherein the graph convolutional network comprises a state-behavior network for generating pixel-level candidate actions and a state-evaluator network for generating pixel-level average actions and a reward function communicative with both the state-behavior network and the state-evaluator network, the processor configured for:
generating, for each pixel of the MR image, a pixel-level candidate action and a corresponding pixel-level average action;
comparing each pixel-level candidate action with each corresponding pixel-level average action using the reward function and selecting an action for each corresponding pixel;
reciprocally training the state-behavior network and the state-evaluator network by communicating a parameter of the selected action to both the state-behavior network and the state-evaluator network.

67. The system of claim 66, wherein the communication of the parameter of the selected action trains the state-behavior network to improve estimates of the pixel-level candidate action and trains the state-evaluator network to improve prediction of pixel-level average action.
68. The system of claim 67, wherein the reward function is a dual-level reward function combination of a pixel-level reward function and a regional-level reward function.
69. The system of claim 68, wherein the pixel-level reward function is a Euclidean distance-based pixel-level reward function and the regional-level reward function is a Wasserstein distance-based region-level reward function.
70. The system of claim 39, wherein the medical diagnostic image analysis is a tumor detection and the machine learning model is a weakly-supervised teacher-student network comprising a teacher module and a student module, the processor configured for:
inputting the MR image into a detection component of the student module;
obtaining a fused tumor detection box locating a tumor in the MR image based on two tumor detection boxes generated by two detection strategies of the detection component;
inputting the MR image and the fused tumor detection box into a segmentation component of the student module;
obtaining a tumor segmented MR image.
71. The system of claim 70, wherein the student detection component is a student dual-strategy deep reinforcement learning model comprising a first pair of cooperating relative-entropy biased actor-critic components; and the student segmentation component is a dense U-net.
72. The system of claim 70, wherein the student detection component is guided by a tumor detection strategy provided by a detection component of the teacher module, and the student segmentation component is guided by a tumor mask provided by a segmentation component of the teacher module.
73. The system of claim 72, wherein the detection component of the teacher module is a teacher dual-strategy deep reinforcement learning model comprising a second pair of cooperating relative-entropy biased actor-critic components trained by learning the tumor detection strategy from contrast-agent(CA)-enhanced MR image, and the segmentation component of the teacher module is a self-ensembling component including an uncertainty-estimation trained to learn tumor segmentation and generate a tumor mask.
74. The system of claim 73, wherein the detection component of the teacher module inputs the CA-enhanced MR image and outputs a teacher-fused tumor detection box, and the segmentation component of the teacher module inputs the CA-enhanced MR image and the teacher-fused tumor detection box and outputs the tumor mask.
75. A non-transitory computer readable medium embodying a computer program for concurrent and simultaneous synthesis of a medical CA-free-AI-enhanced image and medical diagnostic image analysis comprising:
computer program code for receiving a magnetic resonance (MR) image acquired by a medical MR scanner in absence of contrast agent enhancement;
computer program code for providing the MR image to a computer-implemented machine learning model;
computer program code for concurrently performing a medical CA-free-AI-enhanced image synthesis task and a medical diagnostic image analysis task with the machine learning model;
computer program code for reciprocally communicating between the image synthesis task and the image analysis task for mutually dependent training of both tasks.
76. The computer readable medium of claim 75, wherein the machine learning model is trained by deep reinforcement learning.
77. The computer readable medium of claim 75, wherein the machine learning model comprises a neural network.
78. The computer readable medium of claim 77, wherein the neural network comprises at least one generative adversarial network (GAN).
79. The computer readable medium of claim 75, wherein the medical diagnostic image analysis is a tissue segmented image and the machine learning model is a plurality of machine learning components, the computer readable medium further comprising:

computer program code for inputting the MR image into a first machine learning component;
computer program code for obtaining a coarse tissues mask from the first machine learning component;
computer program code for inputting the coarse tissues mask and the MR image into a second machine learning component;
computer program code for obtaining a CA-free-AI-enhanced image from the second machine learning component;
computer program code for inputting the CA-free-AI-enhanced image and the MR
image into a third machine learning component;
computer program code for obtaining a diagnosis-related tissue segmented image from the third machine learning component.
80. The computer readable medium of claim 79, wherein at least one of the plurality of machine learning components is trained by deep reinforcement learning.
81. The computer readable medium of claim 79, wherein each of the plurality of machine learning components comprises a neural network.
82. The computer readable medium of claim 81, wherein the plurality of machine learning components comprises a first generative adversarial network (GAN), a second GAN and a third GAN.
83. The computer readable medium of claim 82, further comprising computer program code for a sequential causal learning network (SCLN) connected to a generator network of each of the first, second and third GANs, the SCLN configured as an encoder of the MR image.
84. The computer readable medium of claim 83, wherein the SCLN comprises a two-stream structure of a spatial perceptual pathway and a temporal perceptual pathway to independently extract spatial and temporal dependencies from the MR image.
85. The computer readable medium of claim 84, wherein each of the spatial perceptual pathway and the temporal perceptual pathway includes a dilated convolutional network.

86. The computer readable medium of claim 84, further comprising computer program code for a multi-attention weighing unit embedded within each of the spatial perceptual pathway and the temporal perceptual pathway to compute and select task-specific dependence within the spatial and temporal dependencies.
87. The computer readable medium of claim 86, wherein the multi-attention weighing unit comprises a first attention layer embedded in the spatial perceptual pathway to compute weights for spatial dependencies, a second attention layer embedded in the temporal perceptual pathway to compute weights for temporal dependencies, an add operator to fuse the weighted spatial and temporal dependencies, and a third attention layer to determine task-specific dependence of the fused spatial-temporal dependencies.
88. The computer readable medium of claim 82, wherein a generator network of the second GAN
is trained using a synthetic regularization loss term to improve quality of image synthesis, and a discriminator network of the second GAN is trained using a synthetic content loss term.
89. The computer readable medium of claim 88, wherein a discriminator network of the third GAN is trained using a self-supervised segmentation auxiliary loss term causing the discriminator network to extract a tissue-related compensate feature from the CA-free-AI-enhanced image.
90. The computer readable medium of claim 79, wherein the MR image is a time-series of MR
images.
91. The computer readable medium of claim 90, wherein the time-series of MR
images are cine MR images.
92. The computer readable medium of claim 90, wherein the time-series of MR
images are cardiac MR images.
93. The computer readable medium of claim 92, further comprising computer program code for a neural network with a heart localization layer configured to automatically crop cardiac MR image to a region-of-interest.

94. The computer readable medium of claim 78, wherein the medical diagnostic image analysis is a tumor detection and the at least one GAN is a tripartite generative adversarial network (GAN) comprising a generator network, a discriminator network and a detector network, the computer readable medium further comprising:
computer program code for inputting the MR image into the generator network;
computer program code for obtaining a CA-free-AI-enhanced image and an attention map of tumor specific features from the generator network;
computer program code for inputting the CA-free-AI-enhanced image and the attention map into the detector network;
computer program code for obtaining a tumor location and a tumor classification extracted from the CA-free-AI-enhanced image by the detector network;
computer program code for training the generator network by both adversarial learning with the discriminator network and back-propagation with the detector network.
95. The computer readable medium of claim 94, wherein the generator network includes a dual attention module that produces the attention map.
96. The computer readable medium of claim 95, wherein the dual attention module includes first and second attention modules in parallel, the first attention module providing feature representation learning of tumor specificity and the second attention module providing global context learning of a multi-class aspect of the MR image.
97. The computer readable medium of claim 96, wherein information from the first attention module and the second attention module is fused to generate the attention map.
98. The computer readable medium of claim 94, wherein the generator network is an attention-aware generator, the discriminator network is a convolutional neural network-based (CNN-based) discriminator, and the detector network is a region-based convolutional neural network-based (R-CNN-based) detector.
99. The computer readable medium of claim 94, wherein the tripartite-GAN
incorporates a tripartite loss function relating to three tasks of synthesis of the CA-free-AI-enhanced image, discrimination of the CA-free-AI-enhanced image and tumor classification of the CA-free-AI-enhanced image.
100. The computer readable medium of claim 94, wherein the MR image is a time-series of MR
images.
101. The computer readable medium of claim 94, wherein the MR image is a liver MR image.
102. The computer readable medium of claim 76, wherein the medical diagnostic image analysis is a tumor detection and the machine learning model is a pixel-level graph reinforcement learning model comprising a plurality of pixel-level agents equaling the number of pixels, each of the plurality of pixel-level agents associated with a single pixel, and a graph convolutional network communicative with all of the plurality of pixel-level agents, the computer readable medium further comprising:
computer program code for inputting the MR image into the pixel-level graph reinforcement learning model;
computer program code for determining an intensity value for each pixel of the MR image with the plurality of pixel-level agents according to a learned policy;
computer program code for outputting a plurality of pixel-level actions, a single pixel-level action for each pixel of the MR image, to change the MR image to synthesize a CA-free-AI-enhanced image.
103. The computer readable medium of claim 102, wherein the graph convolutional network comprises a state-behavior network for generating pixel-level candidate actions and a state-evaluator network for generating pixel-level average actions and a reward function communicative with both the state-behavior network and the state-evaluator network, the computer readable medium further comprising:
computer program code for generating, for each pixel of the MR image, a pixel-level candidate action and a corresponding pixel-level average action;
computer program code for comparing each pixel-level candidate action with each corresponding pixel-level average action using the reward function and selecting an action for each corresponding pixel;

computer program code for reciprocally training the state-behavior network and the state-evaluator network by communicating a parameter of the selected action to both the state-behavior network and the state-evaluator network.
104. The computer readable medium of claim 103, wherein the communication of the parameter of the selected action trains the state-behavior network to improve estimates of the pixel-level candidate action and trains the state-evaluator network to improve prediction of pixel-level average action.
105. The computer readable medium of claim 104, wherein the reward function is a dual-level reward function combination of a pixel-level reward function and a regional-level reward function.
106. The computer readable medium of claim 105, wherein the pixel-level reward function is a Euclidean distance-based pixel-level reward function and the regional-level reward function is a Wasserstein distance-based region-level reward function.
107. The computer readable medium of claim 76, wherein the medical diagnostic image analysis is a tumor detection and the machine learning model is a weakly-supervised teacher-student network comprising a teacher module and a student module, the computer readable medium further comprising:
computer program code for inputting the MR image into a detection component of the student module;
computer program code for obtaining a fused tumor detection box locating a tumor in the MR
image based on two tumor detection boxes generated by two detection strategies of the detection component;
computer program code for inputting the MR image and the fused tumor detection box into a segmentation component of the student module;
computer program code for obtaining a tumor segmented MR image.
108. The computer readable medium of claim 107, wherein the student detection component is a student dual-strategy deep reinforcement learning model comprising a first pair of cooperating relative-entropy biased actor-critic components; and the student segmentation component is a dense U-net.
109. The computer readable medium of claim 107, wherein the student detection component is guided by a tumor detection strategy provided by a detection component of the teacher module, and the student segmentation component is guided by a tumor mask provided by a segmentation component of the teacher module.
110. The computer readable medium of claim 109, wherein the detection component of the teacher module is a teacher dual-strategy deep reinforcement learning model comprising a second pair of cooperating relative-entropy biased actor-critic components trained by learning the tumor detection strategy from contrast-agent(CA)-enhanced MR image, and the segmentation component of the teacher module is a self-ensembling component including an uncertainty-estimation trained to learn tumor segmentation and generate a tumor mask.
111. The computer readable medium of claim 110, wherein the detection component of the teacher module inputs the CA-enhanced MR image and outputs a teacher-fused tumor detection box, and the segmentation component of the teacher module inputs the CA-enhanced MR image and the teacher-fused tumor detection box and outputs the tumor mask.

CA3104607A 2020-12-30 2020-12-30 Contrast-agent-free medical diagnostic imaging Pending CA3104607A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CA3104607A CA3104607A1 (en) 2020-12-30 2020-12-30 Contrast-agent-free medical diagnostic imaging

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CA3104607A CA3104607A1 (en) 2020-12-30 2020-12-30 Contrast-agent-free medical diagnostic imaging

Publications (1)

Publication Number Publication Date
CA3104607A1 true CA3104607A1 (en) 2022-06-30

Family

ID=82196654

Family Applications (1)

Application Number Title Priority Date Filing Date
CA3104607A Pending CA3104607A1 (en) 2020-12-30 2020-12-30 Contrast-agent-free medical diagnostic imaging

Country Status (1)

Country Link
CA (1) CA3104607A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116681991A (en) * 2023-07-24 2023-09-01 北京航空航天大学 Time sequence two-dimensional coding-based tightly-combined navigation fault detection method
CN116681991B (en) * 2023-07-24 2023-10-31 北京航空航天大学 Time sequence two-dimensional coding-based tightly-combined navigation fault detection method

Similar Documents

Publication Publication Date Title
US11837354B2 (en) Contrast-agent-free medical diagnostic imaging
Yurt et al. mustGAN: multi-stream generative adversarial networks for MR image synthesis
Xu et al. Contrast agent-free synthesis and segmentation of ischemic heart disease images using progressive sequential causal GANs
Liu et al. Multimodal MR image synthesis using gradient prior and adversarial learning
US9361686B2 (en) Method and apparatus for the assessment of medical images
Zhang et al. Can signal-to-noise ratio perform as a baseline indicator for medical image quality assessment
JP2023540910A (en) Connected Machine Learning Model with Collaborative Training for Lesion Detection
US20210097690A1 (en) Protocol-Aware Tissue Segmentation in Medical Imaging
CN110619635B (en) Hepatocellular carcinoma magnetic resonance image segmentation system and method based on deep learning
Wang et al. JointVesselNet: Joint volume-projection convolutional embedding networks for 3D cerebrovascular segmentation
JP2022077991A (en) Medical image processing apparatus, medical image processing method, medical image processing program, model training apparatus, and training method
Fan et al. TR-Gan: multi-session future MRI prediction with temporal recurrent generative adversarial Network
Mouches et al. Unifying brain age prediction and age-conditioned template generation with a deterministic autoencoder
Sun et al. Double U-Net CycleGAN for 3D MR to CT image synthesis
Marin et al. Numerical surrogates for human observers in myocardial motion evaluation from SPECT images
Zuo et al. HACA3: A unified approach for multi-site MR image harmonization
CN114881848A (en) Method for converting multi-sequence MR into CT
CN116630463B (en) Enhanced CT image generation method and system based on multitask learning
Poonkodi et al. 3D-MedTranCSGAN: 3D medical image transformation using CSGAN
CA3104607A1 (en) Contrast-agent-free medical diagnostic imaging
WO2020056196A1 (en) Fully automated personalized body composition profile
Xu et al. Applying cross-modality data processing for infarction learning in medical internet of things
CN113052840B (en) Processing method based on low signal-to-noise ratio PET image
CN113658116B (en) Artificial intelligence method and system for generating medical images with different body positions
Alogna et al. Brain magnetic resonance imaging generation using generative adversarial networks