US12394187B2 - Synthesis of medical images of brain tumors using 3D-2D GANs - Google Patents
Synthesis of medical images of brain tumors using 3D-2D GANs
- Publication number
- US12394187B2 (Application No. US18/161,186)
- Authority
- US
- United States
- Prior art keywords
- tumor
- training
- image
- medical image
- mask
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/50—Image enhancement or restoration using two or more images, e.g. averaging or subtraction
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/70—Denoising; Smoothing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H30/00—ICT specially adapted for the handling or processing of medical images
- G16H30/40—ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H40/00—ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices
- G16H40/60—ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the operation of medical equipment or devices
- G16H40/67—ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the operation of medical equipment or devices for remote operation
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/50—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2200/00—Indexing scheme for image data processing or generation, in general
- G06T2200/04—Indexing scheme for image data processing or generation, in general involving 3D image data
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20212—Image combination
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20212—Image combination
- G06T2207/20221—Image fusion; Image merging
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30016—Brain
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30096—Tumor; Lesion
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/03—Recognition of patterns in medical or anatomical images
Definitions
- the present invention relates generally to medical image synthesis, and in particular to synthesis of medical images of brain tumors using 3D (three-dimensional)-2D (two-dimensional) GANs (generative adversarial networks).
- machine learning and deep learning based models have been proposed for brain tumor classification, segmentation, and detection to facilitate diagnosis and treatment.
- Such machine learning and deep learning based models are data driven such that their performance and robustness are highly dependent upon the size, quality, and diversity of the training data.
- collecting sufficiently large amounts of such training data is both costly and time-consuming.
- systems and methods for generating synthesized medical images of a tumor are provided.
- a 3D mask of an anatomical structure generated from a 3D medical image and a 3D image of a plurality of concentric spheres are received.
- a 3D mask of a tumor is generated based on the 3D mask of the anatomical structure and the 3D image of the plurality of concentric spheres using a first 3D generator network.
- a 3D intensity map of the tumor is generated based on the 3D mask of the tumor and the 3D image of the plurality of concentric spheres using a second 3D generator network.
- a 3D synthesized medical image of the tumor is generated based on one or more 2D slices of the 3D intensity map of the tumor and one or more 2D slices of the 3D medical image using a 2D generator network.
- the 3D synthesized medical image of the tumor is output.
- a machine learning based network is trained for performing a medical imaging analysis task based on the 3D synthesized medical image of the tumor.
- FIG. 1 shows a method for generating synthesized medical images of a tumor, in accordance with one or more embodiments
- FIG. 2 shows a workflow for generating synthesized medical images of a tumor, in accordance with one or more embodiments
- FIG. 3 shows a workflow for training 3D-2D networks for generating synthesized medical images of a tumor, in accordance with one or more embodiments
- FIG. 4 shows a workflow for generating 3D training images of a plurality of concentric spheres and 3D training intensity maps of the tumor, in accordance with one or more embodiments
- FIG. 5 shows synthesized medical images of lesions generated in accordance with embodiments described herein;
- FIG. 7 shows exemplary images generated in accordance with embodiments described herein
- FIG. 8 shows images comparing synthesized lesions generated by the “transposed convolution” and the “up-sampling+convolution” strategies with the corresponding intermediate quantized intensity maps, in accordance with embodiments described herein;
- FIG. 9 shows an exemplary artificial neural network that may be used to implement one or more embodiments
- FIG. 10 shows a convolutional neural network that may be used to implement one or more embodiments.
- FIG. 11 shows a high-level block diagram of a computer that may be used to implement one or more embodiments.
- the present invention generally relates to methods and systems for synthesis of medical images of brain tumors using a type of 3D (three-dimensional)-2D (two-dimensional) GANs (generative adversarial networks).
- Embodiments of the present invention are described herein to give a visual understanding of such methods and systems.
- a digital image is often composed of digital representations of one or more objects (or shapes).
- the digital representation of an object is often described herein in terms of identifying and manipulating the objects.
- Such manipulations are virtual manipulations accomplished in the memory or other circuitry/hardware of a computer system. Accordingly, it is to be understood that embodiments of the present invention may be performed within a computer system using data stored within the computer system.
- Embodiments described herein provide for a GAN based network architecture for synthesizing 3D medical images of brain tumors.
- the GAN based network architecture comprises a series of 3D generators for generating 3D intermediate representations of brain tumors and a 2D generator for generating 2D synthetic images of brain tumors using the 3D intermediate representations for step-by-step guidance.
- the 2D synthetic images are stacked to form a 3D synthetic image of the brain tumor.
- the brain tumors in the final 3D synthetic image are generated with configurable parameters for controlling, for example, the location, size, structure, heterogeneity, and contrast of the synthesized brain tumors.
- the 3D intermediate representations generated by the 3D generators preserve interslice continuity in all three dimensions, while the final 3D synthetic image is generated by the 2D generator trained using a 2D perceptual loss to ensure that realistic brain tumors are generated with high perceptual quality.
- the 3D synthesized medical images synthesized in accordance with embodiments described herein may be utilized for, e.g., data augmentation during training machine learning networks for classifying, segmenting, detecting, etc. brain tumors.
- FIG. 1 shows a method 100 for generating synthesized medical images of a tumor, in accordance with one or more embodiments.
- the steps of method 100 may be performed by one or more suitable computing devices, such as, e.g., computer 1102 of FIG. 11 .
- FIG. 2 shows a workflow 200 for generating synthesized medical images of a tumor, in accordance with one or more embodiments. FIG. 1 and FIG. 2 will be described together.
- 3D mask 202 of the anatomical structure provides a voxel-wise identification of the anatomical structure in 3D medical image 206 .
- 3D mask 202 may be a 3D binary segmentation mask where, e.g., voxels having an intensity value of 1 correspond to the anatomical structure while voxels having an intensity value of 0 do not correspond to the anatomical structure.
- the anatomical structure is a brain of a patient or a healthy subject (e.g., a person).
- the anatomical structure may be any other suitable organ, vessel, bone, etc. of the patient or healthy subject.
- 3D mask 202 of the anatomical structure may be generated from 3D medical image 206 using any suitable approach.
- 3D mask 202 of the anatomical structure is automatically generated from 3D medical image 206 using a machine learning based segmentation network.
- 3D mask 202 of the anatomical structure is manually generated from 3D medical image 206 by a user.
- 3D medical image 206 may be an MRI (magnetic resonance imaging) image, a CT (computed tomography) image, an ultrasound image, or a 3D medical image of any other suitable modality.
- 3D medical image 206 is received and 3D mask 202 of the anatomical structure is generated from 3D medical image 206 .
- 3D image 204 of the plurality of concentric spheres encodes the properties of the tumor to be generated in the plurality of concentric spheres.
- the center of the concentric spheres defines the mass center location of the tumor
- the size of the outermost sphere defines the overall size of the tumor
- the ratio of the sizes and intensity values of the concentric spheres defines the structure of the tumor.
- the location and/or size of the concentric spheres may be randomly selected or user defined.
- 3D image 204 depicts three concentric spheres.
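- As an illustration of this encoding, below is a minimal numpy sketch (not the patent's implementation) that rasterizes a plurality of concentric spheres into a 3D volume; the volume shape, center, radii, and intensity values are hypothetical parameters chosen for the example.

```python
import numpy as np

def concentric_spheres(shape, center, radii, intensities):
    """Rasterize concentric spheres into a 3D volume.

    The common center encodes the tumor's mass center, the outermost
    radius its overall size, and the radius/intensity ratios its structure.
    """
    zz, yy, xx = np.indices(shape)
    dist = np.sqrt((zz - center[0]) ** 2 + (yy - center[1]) ** 2
                   + (xx - center[2]) ** 2)
    vol = np.zeros(shape, dtype=np.float32)
    # draw from outermost to innermost so smaller spheres overwrite larger ones
    for r, v in sorted(zip(radii, intensities), reverse=True):
        vol[dist <= r] = v
    return vol

# three concentric spheres, as depicted in 3D image 204
spheres = concentric_spheres((128, 128, 128), center=(64, 64, 64),
                             radii=(20, 12, 6), intensities=(0.33, 0.66, 1.0))
```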
- a 3D mask of a tumor is generated based on the 3D mask of the anatomical structure and the 3D image of the plurality of concentric spheres using a first 3D generator network.
- 3D generator network G binary 210 receives as input 3D mask 202 of the anatomical structure and 3D image 204 of a plurality of concentric spheres and generates as output 3D mask 212 of the tumor.
- a 3D intensity map of the tumor is generated based on the 3D mask of the tumor and the 3D image of the plurality of concentric spheres using a second 3D generator network.
- 3D generator network G quantize 214 receives as input 3D mask 212 of the tumor and 3D image 204 of a plurality of concentric spheres and generates as output 3D intensity map 216 of the tumor at several discrete gray levels.
- 3D intensity map 216 is formed of a plurality of 2D cross-sectional slices, as shown in FIG. 2 .
- Each 2D synthesized medical image 220 thus depicts a recreation of a respective slice 208 with a cross-sectional view of the synthesized tumor overlaid thereon.
- 2D synthesized medical images 220 represent cross-sectional 2D slices that together form a 3D synthesized medical image of the tumor.
- they are stacked into the 3D synthesized medical image of the tumor.
- the 3D synthesized medical images are post-processed.
- lesion 3D smoothing 222 is applied to smooth the synthesized tumor in the 3D synthesized medical image (formed by 2D synthesized medical images 220 ) using a 3D Gaussian kernel.
- Lesion blending 224 is then performed by extracting the smoothed synthesized tumors from the smoothed 3D synthesized medical image and blending the extracted smoothed synthesized tumors into the original 3D medical image 206 using 3D mask 212 to generate a 3D blended synthesized medical image.
- contrast enhancement 228 is performed to adjust (e.g., enhance or reduce) the contrast of the tumor in 3D blended synthesized medical image 226 as needed to generate the final 3D synthesized medical image 230 of the tumor.
- the contrast adjustment may be random or user-defined.
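- A minimal sketch of this post-processing chain is given below, using scipy for the 3D Gaussian smoothing; the kernel width sigma, the mask-based blending rule, and the contrast-scaling scheme are assumptions for illustration rather than the patent's exact operations.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def postprocess(synth_vol, original_vol, tumor_mask, sigma=1.0, contrast=1.2):
    # lesion 3D smoothing: smooth the synthesized volume with a 3D Gaussian kernel
    smoothed = gaussian_filter(synth_vol, sigma=sigma)
    # lesion blending: paste the smoothed tumor into the original image via the mask
    blended = np.where(tumor_mask > 0, smoothed, original_vol)
    # contrast adjustment: scale the tumor's deviation from the original background
    region = tumor_mask > 0
    blended[region] = (original_vol[region]
                       + contrast * (blended[region] - original_vol[region]))
    return np.clip(blended, 0.0, 1.0)
```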
- the 3D synthesized medical image of the tumor is output.
- the 3D synthesized medical image of the tumor (e.g., generated at step 108 and/or step 110) can be output by displaying it on a display device of a computer system, storing it on a memory or storage of a computer system, or transmitting it to a remote computer system.
- the 3D synthesized medical image of the tumor is stored as part of an augmented training dataset and used for training a machine learning based network for performing a medical imaging analysis task, such as, e.g., detection, segmentation, classification, etc. of a tumor.
- 3D generator network G binary 210 for generating 3D mask 212 of the tumor and 3D generator network G quantize 214 for generating 3D intensity map 216 are implemented using 3D neural networks, enabling the synthesized tumor generated in the 3D synthesized medical image (formed by stacking 2D synthesized medical images 220 ) to have continuous structure in all three dimensions, while 2D generator network G syn 218 for generating 2D synthesized medical images 220 is trained with a 2D perceptual loss so that the visual perception of synthesized tumor is realistic.
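- The 3D-to-2D data flow described above can be sketched in PyTorch as follows; the single-convolution stand-ins are only meant to illustrate tensor shapes and the slice-wise 2D synthesis plus re-stacking, whereas the actual networks are UNet-like 3D generators and a 2D encoder-decoder.

```python
import torch
import torch.nn as nn

# stand-in generators (the real G_binary/G_quantize are 3D UNet-like networks,
# and G_syn is a 2D encoder-decoder; one conv layer each suffices for shapes)
g_binary   = nn.Conv3d(2, 1, kernel_size=3, padding=1)  # brain mask + spheres -> tumor mask
g_quantize = nn.Conv3d(2, 1, kernel_size=3, padding=1)  # tumor mask + spheres -> intensity map
g_syn      = nn.Conv2d(2, 1, kernel_size=3, padding=1)  # intensity slice + MR slice -> synth slice

brain_mask = torch.rand(1, 1, 64, 128, 128)  # (batch, channel, D, H, W)
spheres    = torch.rand(1, 1, 64, 128, 128)
mr_image   = torch.rand(1, 1, 64, 128, 128)

with torch.no_grad():
    tumor_mask    = torch.sigmoid(g_binary(torch.cat([brain_mask, spheres], dim=1)))
    intensity_map = g_quantize(torch.cat([tumor_mask, spheres], dim=1))
    # run the 2D generator slice by slice, then stack back into a volume
    slices = [g_syn(torch.cat([intensity_map[:, :, d], mr_image[:, :, d]], dim=1))
              for d in range(intensity_map.shape[2])]
    synth_volume = torch.stack(slices, dim=2)  # (1, 1, 64, 128, 128)
```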
- FIG. 3 shows a workflow 300 for training 3D-2D networks for generating synthesized medical images of a tumor, in accordance with one or more embodiments.
- Workflow 300 of FIG. 3 is performed during a prior offline or training stage.
- the trained networks are utilized during an online or inference stage (e.g., to perform one or more steps of method 100 of FIG. 1 or one or more operations of workflow 200 of FIG. 2 ) for generating synthesized medical images of a tumor.
- Workflow 300 comprises 3D generator network G binary 310 , 3D generator network G quantize 314, and 2D generator network G syn 318 .
- 3D generator network G binary 310 may be the first 3D generator network utilized at step 104 of FIG. 1 or 3D generator network G binary 210 of FIG. 2
- 3D generator network G quantize 314 may be the second 3D generator network utilized at step 106 of FIG. 1 or 3D generator network G quantize 214 of FIG. 2
- 2D generator network G syn 318 may be the 2D generator network utilized at step 108 of FIG. 1 or 2D generator network G syn 218 of FIG. 2 .
- 3D generator network G binary 310 , 3D generator network G quantize 314, and 2D generator network G syn 318 may be implemented using any suitable machine learning based network.
- 3D generator network G binary 310 and 3D generator network G quantize 314 are implemented using 3D UNet or UNet-like networks or 3D encoder-decoder networks and 2D generator network G syn 318 is implemented using a 2D encoder-decoder network.
- 3D generator network G binary 310 is trained using 1) 3D training mask 302 of an anatomical structure generated from 3D training medical image 306 , 2 ) 3D training image 304 of a plurality of concentric spheres, and 3) 3D training mask 312 of a tumor.
- 3D generator network G quantize 314 is trained using 1) 3D training mask 312 of a tumor, 2) 3D training image 304 of a plurality of concentric spheres, and 3) 3D training intensity maps 316 of the tumor.
- 2D generator network G syn 318 is trained using 1) 2D training intensity maps 316 of the tumor, 2) 2D training slices 308 extracted from 3D training medical image 306 , and 3) 2D training synthesized medical images 320 of the tumor.
- 3D training image 304 of a plurality of concentric spheres and 3D training intensity maps 316 are generated according to workflow 400 of FIG. 4 , described in further detail below.
- 2D training intensity maps 316 are generated by unstacking the 3D training intensity maps 316 into 2D slices.
- 3D generator network G binary 310 , 3D generator network G quantize 314, and 2D generator network G syn 318 are trained according to one or more loss functions 324 .
- 3D generator network G binary 310 and 3D generator network G quantize 314 are trained with a 3D L1 loss
- 2D generator network G syn 318 is trained with a 2D L1 loss, a GAN loss, and a 2D perceptual loss.
- the 2D L1 loss, the GAN loss, and the 2D perceptual loss may be applied to 2D training synthesized medical images 320 of the tumor along one axis or along all three perpendicular axes for training 2D generator network G syn 318 .
- 2D generator network G syn 318 is trained with adversarial loss using 2D discriminator network D syn 322 .
- 2D discriminator network D syn 322 attempts to distinguish between 2D training synthesized medical images 320 and 2D training slices 308 . Accordingly, 2D discriminator network D syn 322 guides 2D generator network G syn 318 to generate realistic 2D training synthesized medical images 320 that are indistinguishable from the real 2D training slices 308 .
- 2D discriminator network D syn 322 is only utilized during the training stage and is not utilized during the inference stage.
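- A sketch of such a combined generator objective is given below; the loss weights and the choice of feature extractor for the 2D perceptual loss are hypothetical, and the adversarial term uses a standard non-saturating binary cross-entropy formulation as one plausible instantiation of the GAN loss.

```python
import torch
import torch.nn.functional as F

# hypothetical weights; the relative weighting of the terms is not specified here
LAMBDA_L1, LAMBDA_GAN, LAMBDA_PERC = 100.0, 1.0, 1.0

def g_syn_loss(fake_2d, real_2d, d_syn, features):
    """Combined 2D L1 + GAN + 2D perceptual loss for training G_syn."""
    l1 = F.l1_loss(fake_2d, real_2d)
    # adversarial term: G_syn tries to make D_syn label its output as real
    logits = d_syn(fake_2d)
    gan = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
    # perceptual term: feature distance under a fixed pretrained network
    perc = F.l1_loss(features(fake_2d), features(real_2d))
    return LAMBDA_L1 * l1 + LAMBDA_GAN * gan + LAMBDA_PERC * perc
```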
- FIG. 4 shows a workflow 400 for generating 3D training images of a plurality of concentric spheres and 3D training intensity maps, in accordance with one or more embodiments.
- the 3D training images of a plurality of concentric spheres and 3D training intensity maps may be utilized in workflow 300 of FIG. 3 for training 3D-2D networks for generating synthesized medical images of a tumor.
- multi-Otsu thresholding 406 calculates two thresholds.
- FIG. 4 shows 3D training intensity map 408 with voxels of the tumor classified to one of three classes.
- Volume match 410 is then performed on 3D training intensity map 408 to generate 3D training image 412 of three concentric spheres.
- each of the concentric spheres is associated with a respective category, and the volume of voxels in each category in 3D training intensity map 408 determines the volume of the associated sphere in 3D training image 412 .
- the center of the concentric spheres in 3D training image 412 is defined as the mass center of the tumor in 3D training intensity map 408 .
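- The following sketch reproduces this preparation with skimage's multi-Otsu implementation; the assignment of brighter classes to inner spheres and the radius-from-volume rule r = (3V/(4π))^(1/3) are plausible assumptions for illustration, not details disclosed above.

```python
import numpy as np
from skimage.filters import threshold_multiotsu

def spheres_from_tumor(tumor_voxels, n_classes=3):
    """Quantize tumor voxel intensities into n_classes gray levels and
    derive concentric-sphere radii whose volumes match each class."""
    thresholds = threshold_multiotsu(tumor_voxels, classes=n_classes)  # two thresholds for 3 classes
    labels = np.digitize(tumor_voxels, bins=thresholds)  # class index 0 .. n_classes-1
    radii, cumulative = [], 0
    # innermost sphere matches the brightest class; each outer sphere
    # additionally encloses the voxels of the next-darker class
    for c in range(n_classes - 1, -1, -1):
        cumulative += int(np.sum(labels == c))
        radii.append((3.0 * cumulative / (4.0 * np.pi)) ** (1.0 / 3.0))
    return thresholds, labels, radii  # radii ordered innermost to outermost
```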
- Embodiments described herein were experimentally validated to synthesize brain MR images of brain metastases.
- the validation dataset comprised T1-weighted 3D MP-RAGE (magnetization-prepared rapid gradient-echo) post-contrast brain MR images of 800 human subjects with a total of 2688 metastasis lesions.
- the clinically treated metastases were annotated by radiation oncologists, while the untreated metastases were annotated by radiologists.
- All images were resized to 256×256×(the original number of axial slices).
- Each 3D MR volume's intensities were normalized to [0,1] over the range from the minimum voxel intensity to the 98th percentile.
- Brain masks were generated using a deep learning model.
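- A sketch of the intensity normalization described above, assuming values above the 98th percentile are clipped:

```python
import numpy as np

def normalize_volume(vol):
    """Map [minimum voxel intensity, 98th percentile] of a 3D MR volume to [0, 1]."""
    lo, hi = vol.min(), np.percentile(vol, 98)
    # the small epsilon guards against a degenerate constant volume
    return np.clip((vol - lo) / (hi - lo + 1e-8), 0.0, 1.0)
```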
- the network was trained with 10%, 30%, 50%, and 100% of the synthetic data.
- two network up-sampling strategies for generating the quantized intensity map were evaluated: “transposed convolution” and “up-sampling+convolution.”
- During inference, a set of concentric spheres from the dataset was randomly sampled and used as the starting point to generate brain metastasis lesions on another set of randomly sampled brain MR images.
- a radiologist reviewed 1500 synthetic images in each combination of the training data scale and network architecture to evaluate the synthetic metastasis images.
- a synthetic metastasis image was deemed indistinguishable if the radiologist could not tell whether it was a synthetic lesion or real lesion.
- FIG. 5 shows synthesized medical images 500 of lesions generated in accordance with embodiments described herein.
- the lesions in synthesized medical images 500 are generated with different locations, sizes, structures, up-sampling strategies, and with and without contrast enhancement.
- Each synthesized lesion is at the center of each image in axial, coronal, and sagittal views.
- images 500 confirm the spatial continuity and realistic appearance of the synthesized lesions in different configurations and demonstrate the effectiveness of the configurable parameters according to embodiments described herein.
- Embodiments described herein are described with respect to the claimed systems as well as with respect to the claimed methods.
- Features, advantages or alternative embodiments herein can be assigned to the other claimed objects and vice versa.
- claims for the systems can be improved with features described or claimed in the context of the methods.
- the functional features of the method are embodied by objective units of the providing system.
- a trained machine learning based network mimics cognitive functions that humans associate with other human minds.
- the trained machine learning based network is able to adapt to new circumstances and to detect and extrapolate patterns.
- a trained machine learning based network can comprise a neural network, a support vector machine, a decision tree, and/or a Bayesian network, and/or the trained machine learning based network can be based on k-means clustering, Q-learning, genetic algorithms, and/or association rules.
- a neural network can be a deep neural network, a convolutional neural network, or a convolutional deep neural network.
- a neural network can be an adversarial network, a deep adversarial network and/or a generative adversarial network.
- FIG. 9 shows an embodiment of an artificial neural network 900 , in accordance with one or more embodiments.
- Alternative terms for “artificial neural network” are “neural network”, “artificial neural net” or “neural net”.
- Machine learning networks described herein such as, e.g., first 3D generator network utilized at step 104 , the second 3D generator network utilized at step 106 , and the 2D generator network utilized at step 108 of FIG. 1 ; 3D generator network G binary 210 , 3D generator network G quantize 214 , 2D generator network G syn 218 of FIG. 2 ; and 3D generator network G binary 310 , 3D generator network G quantize 314 , 2D generator network G syn 318 , and 2D discriminator network D syn 322 of FIG. 3 , may be implemented using artificial neural network 900 .
- the artificial neural network 900 comprises nodes 902 - 922 and edges 932 , 934 , . . . , 936 , wherein each edge 932 , 934 , . . . , 936 is a directed connection from a first node 902 - 922 to a second node 902 - 922 .
- While the first node 902 - 922 and the second node 902 - 922 are usually different nodes 902 - 922 , it is also possible that the first node 902 - 922 and the second node 902 - 922 are identical. For example, in FIG. 9 :
- the edge 932 is a directed connection from the node 902 to the node 906
- the edge 934 is a directed connection from the node 904 to the node 906
- An edge 932 , 934 , . . . , 936 from a first node 902 - 922 to a second node 902 - 922 is also denoted as “ingoing edge” for the second node 902 - 922 and as “outgoing edge” for the first node 902 - 922 .
- the nodes 902 - 922 of the artificial neural network 900 can be arranged in layers 924 - 930 , wherein the layers can comprise an intrinsic order introduced by the edges 932 , 934 , . . . , 936 between the nodes 902 - 922 .
- edges 932 , 934 , . . . , 936 can exist only between neighboring layers of nodes.
- the number of hidden layers 926 , 928 can be chosen arbitrarily.
- the number of nodes 902 and 904 within the input layer 924 usually relates to the number of input values of the neural network 900
- the number of nodes 922 within the output layer 930 usually relates to the number of output values of the neural network 900 .
- a (real) number can be assigned as a value to every node 902 - 922 of the neural network 900 .
- x_i^{(n)} denotes the value of the i-th node 902 - 922 of the n-th layer 924 - 930 .
- the values of the nodes 902 - 922 of the input layer 924 are equivalent to the input values of the neural network 900
- the value of the node 922 of the output layer 930 is equivalent to the output value of the neural network 900 .
- w_{i,j}^{(m,n)} denotes the weight of the edge between the i-th node 902 - 922 of the m-th layer 924 - 930 and the j-th node 902 - 922 of the n-th layer 924 - 930 .
- the abbreviation w_{i,j}^{(n)} is defined for the weight w_{i,j}^{(n,n+1)} .
- the input values are propagated through the neural network.
- the function f is a transfer function (another term is “activation function”).
- transfer functions are step functions, sigmoid function (e.g. the logistic function, the generalized logistic function, the hyperbolic tangent, the Arctangent function, the error function, the smoothstep function) or rectifier functions.
- the transfer function is mainly used for normalization purposes.
- the values are propagated layer-wise through the neural network, wherein values of the input layer 924 are given by the input of the neural network 900 , wherein values of the first hidden layer 926 can be calculated based on the values of the input layer 924 of the neural network, wherein values of the second hidden layer 928 can be calculated based on the values of the first hidden layer 926 , etc.
- training data comprises training input data and training output data (denoted as t_i).
- the neural network 900 is applied to the training input data to generate calculated output data.
- the training data and the calculated output data comprise a number of values, said number being equal to the number of nodes of the output layer.
- a comparison between the calculated output data and the training data is used to recursively adapt the weights within the neural network 900 (backpropagation algorithm).
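- As a toy illustration of this procedure (one training pair, sigmoid transfer function, a single hidden layer), the numpy sketch below follows the propagation, delta, and weight-update formulas collected under Description further down; the layer sizes and learning rate are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
f  = lambda z: 1.0 / (1.0 + np.exp(-z))  # sigmoid transfer function
df = lambda z: f(z) * (1.0 - f(z))       # its first derivative f'

W1, W2 = rng.normal(size=(3, 4)), rng.normal(size=(4, 2))   # 3 -> 4 -> 2 network
x, t   = rng.normal(size=3), np.array([0.0, 1.0])           # training input / target t_i
gamma  = 0.1                                                # learning rate

# forward pass: x_j^(n+1) = f(sum_i x_i^(n) * w_ij^(n))
z1 = x @ W1; h = f(z1)
z2 = h @ W2; y = f(z2)

# backward pass: output-layer deltas, then the recursion to the hidden layer
delta2 = (y - t) * df(z2)
delta1 = (delta2 @ W2.T) * df(z1)

# weight update: w'_ij = w_ij - gamma * delta_j * x_i
W2 -= gamma * np.outer(h, delta2)
W1 -= gamma * np.outer(x, delta1)
```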
- FIG. 10 shows a convolutional neural network 1000 , in accordance with one or more embodiments.
- Machine learning networks described herein such as, e.g., first 3D generator network utilized at step 104 , the second 3D generator network utilized at step 106 , and the 2D generator network utilized at step 108 of FIG. 1 ; 3D generator network G binary 210 , 3D generator network G quantize 214 , 2D generator network G syn 218 of FIG. 2 ; and 3D generator network G binary 310 , 3D generator network G quantize 314 , 2D generator network G syn 318 , and 2D discriminator network D syn 322 of FIG. 3 , may be implemented using convolutional neural network 1000 .
- the convolutional neural network 1000 comprises an input layer 1002 , a convolutional layer 1004 , a pooling layer 1006 , a fully connected layer 1008 , and an output layer 1010 .
- the convolutional neural network 1000 can comprise several convolutional layers 1004 , several pooling layers 1006 , and several fully connected layers 1008 , as well as other types of layers.
- the order of the layers can be chosen arbitrarily, usually fully connected layers 1008 are used as the last layers before the output layer 1010 .
- the nodes 1012 - 1020 of one layer 1002 - 1010 can be considered to be arranged as a d-dimensional matrix or as a d-dimensional image.
- the value of the node 1012 - 1020 indexed with i and j in the n-th layer 1002 - 1010 can be denoted as x^{(n)}[i,j] .
- the arrangement of the nodes 1012 - 1020 of one layer 1002 - 1010 does not have an effect on the calculations executed within the convolutional neural network 1000 as such, since these are given solely by the structure and the weights of the edges.
- a convolutional layer 1004 is characterized by the structure and the weights of the incoming edges forming a convolution operation based on a certain number of kernels.
- the k-th kernel K_k is a d-dimensional matrix (in this embodiment a two-dimensional matrix), which is usually small compared to the number of nodes 1012 - 1018 (e.g. a 3×3 matrix, or a 5×5 matrix).
- For a kernel being a 3×3 matrix, there are only 9 independent weights (each entry of the kernel matrix corresponding to one independent weight), irrespective of the number of nodes 1012 - 1020 in the respective layer 1002 - 1010 .
- the number of nodes 1014 in the convolutional layer is equivalent to the number of nodes 1012 in the preceding layer 1002 multiplied by the number of kernels.
- nodes 1012 of the preceding layer 1002 are arranged as a d-dimensional matrix
- using a plurality of kernels can be interpreted as adding a further dimension (denoted as “depth” dimension), so that the nodes 1014 of the convolutional layer 1004 are arranged as a (d+1)-dimensional matrix.
- nodes 1012 of the preceding layer 1002 are already arranged as a (d+1)-dimensional matrix comprising a depth dimension, using a plurality of kernels can be interpreted as expanding along the depth dimension, so that the nodes 1014 of the convolutional layer 1004 are arranged also as a (d+1)-dimensional matrix, wherein the size of the (d+1)-dimensional matrix with respect to the depth dimension is by a factor of the number of kernels larger than in the preceding layer 1002 .
- The advantage of using convolutional layers 1004 is that the spatially local correlation of the input data can be exploited by enforcing a local connectivity pattern between nodes of adjacent layers, in particular by each node being connected to only a small region of the nodes of the preceding layer.
- the input layer 1002 comprises 36 nodes 1012 , arranged as a two-dimensional 6×6 matrix.
- the convolutional layer 1004 comprises 72 nodes 1014 , arranged as two two-dimensional 6×6 matrices, each of the two matrices being the result of a convolution of the values of the input layer with a kernel. Equivalently, the nodes 1014 of the convolutional layer 1004 can be interpreted as arranged as a three-dimensional 6×6×2 matrix, wherein the last dimension is the depth dimension.
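- This 6×6-input, two-kernel example can be reproduced directly; the kernel values below are arbitrary illustrative choices, and scipy's convolve2d implements the convolution formula given under Description.

```python
import numpy as np
from scipy.signal import convolve2d

x  = np.arange(36, dtype=float).reshape(6, 6)                # input layer: 36 nodes as a 6x6 matrix
K1 = np.array([[0., 1., 0.], [1., -4., 1.], [0., 1., 0.]])   # 3x3 kernel: 9 independent weights
K2 = np.full((3, 3), 1.0 / 9.0)                              # a second (averaging) kernel

# each kernel yields one 6x6 feature map; stacking them adds the depth dimension
maps = np.stack([convolve2d(x, K, mode='same') for K in (K1, K2)], axis=-1)
print(maps.shape)  # (6, 6, 2): 72 nodes arranged as a 6x6x2 matrix
```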
- a pooling layer 1006 can be characterized by the structure and the weights of the incoming edges and the activation function of its nodes 1016 forming a pooling operation based on a non-linear pooling function f.
- the number of nodes 1014 , 1016 can be reduced by replacing a number d1×d2 of neighboring nodes 1014 in the preceding layer 1004 with a single node 1016 calculated as a function of the values of said neighboring nodes in the pooling layer.
- the pooling function f can be the max-function, the average or the L2-Norm.
- the weights of the incoming edges are fixed and are not modified by training.
- the advantage of using a pooling layer 1006 is that the number of nodes 1014 , 1016 and the number of parameters is reduced. This leads to the amount of computation in the network being reduced and to a control of overfitting.
- the pooling layer 1006 is a max-pooling layer, replacing four neighboring nodes with a single node whose value is the maximum of the values of the four neighboring nodes.
- the max-pooling is applied to each d-dimensional matrix of the previous layer; in this embodiment, the max-pooling is applied to each of the two two-dimensional matrices, reducing the number of nodes from 72 to 18.
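- A sketch of this 2×2 max-pooling (each 6×6 matrix shrinks to 3×3, so 72 nodes over the two feature maps become 18):

```python
import numpy as np

def max_pool(x, d1=2, d2=2):
    """Replace each d1 x d2 block of neighboring nodes with their maximum."""
    H, W = x.shape  # assumes H % d1 == 0 and W % d2 == 0
    return x.reshape(H // d1, d1, W // d2, d2).max(axis=(1, 3))

fmap = np.random.rand(6, 6)
print(max_pool(fmap).shape)  # (3, 3): 36 nodes reduced to 9 per matrix
```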
- a fully-connected layer 1008 can be characterized by the fact that a majority, in particular, all edges between nodes 1016 of the previous layer 1006 and the nodes 1018 of the fully-connected layer 1008 are present, and wherein the weight of each of the edges can be adjusted individually.
- the nodes 1016 of the preceding layer 1006 of the fully-connected layer 1008 are displayed both as two-dimensional matrices and additionally as unrelated nodes (indicated as a line of nodes, wherein the number of nodes was reduced for better presentability).
- the number of nodes 1018 in the fully connected layer 1008 is equal to the number of nodes 1016 in the preceding layer 1006 .
- the number of nodes 1016 , 1018 can differ.
- the values of the nodes 1020 of the output layer 1010 are determined by applying the Softmax function onto the values of the nodes 1018 of the preceding layer 1008 .
- By applying the Softmax function, the sum of the values of all nodes 1020 of the output layer 1010 is 1, and all values of all nodes 1020 of the output layer are real numbers between 0 and 1.
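- For illustration, a numerically stable sketch of this output normalization:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # subtracting the max avoids overflow
    return e / e.sum()

probs = softmax(np.array([2.0, 1.0, 0.1]))
print(probs, probs.sum())    # all values in (0, 1), summing to 1
```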
- a convolutional neural network 1000 can also comprise a ReLU (rectified linear units) layer or activation layers with non-linear transfer functions.
- the number of nodes and the structure of the nodes contained in a ReLU layer are equivalent to the number of nodes and the structure of the nodes contained in the preceding layer.
- the value of each node in the ReLU layer is calculated by applying a rectifying function to the value of the corresponding node of the preceding layer.
- the input and output of different convolutional neural network blocks can be wired using summation (residual/dense neural networks), element-wise multiplication (attention) or other differentiable operators. Therefore, the convolutional neural network architecture can be nested rather than being sequential if the whole pipeline is differentiable.
- convolutional neural networks 1000 can be trained based on the backpropagation algorithm.
- methods of regularization include, e.g., dropout of nodes 1012 - 1020 , stochastic pooling, use of artificial data, weight decay based on the L1 or the L2 norm, and max norm constraints.
- Different loss functions can be combined for training the same neural network to reflect the joint training objectives.
- a subset of the neural network parameters can be excluded from optimization to retain the weights pretrained on other datasets.
- Systems, apparatuses, and methods described herein may be implemented using digital circuitry, or using one or more computers using well-known computer processors, memory units, storage devices, computer software, and other components.
- a computer includes a processor for executing instructions and one or more memories for storing instructions and data.
- a computer may also include, or be coupled to, one or more mass storage devices, such as one or more magnetic disks, internal hard disks and removable disks, magneto-optical disks, optical disks, etc.
- the server may transmit a request adapted to cause a client computer to perform one or more of the steps or functions of the methods and workflows described herein, including one or more of the steps or functions of FIGS. 1 - 4 .
- Certain steps or functions of the methods and workflows described herein, including one or more of the steps or functions of FIGS. 1 - 4 may be performed by a server or by another processor in a network-based cloud-computing system.
- Certain steps or functions of the methods and workflows described herein, including one or more of the steps of FIGS. 1 - 4 may be performed by a client computer in a network-based cloud computing system.
- the steps or functions of the methods and workflows described herein, including one or more of the steps of FIGS. 1 - 4 may be performed by a server and/or by a client computer in a network-based cloud computing system, in any combination.
- Computer program instructions can be implemented as computer executable code programmed by one skilled in the art to perform the method and workflow steps or functions of FIGS. 1 - 4 . Accordingly, by executing the computer program instructions, the processor 1104 executes the method and workflow steps or functions of FIGS. 1 - 4 .
- Computer 1102 may also include one or more network interfaces 1106 for communicating with other devices via a network.
- Computer 1102 may also include one or more input/output devices 1108 that enable user interaction with computer 1102 (e.g., display, keyboard, mouse, speakers, buttons, etc.).
Description
$$x_j^{(n+1)} = f\Big(\sum_i x_i^{(n)} \cdot w_{i,j}^{(n)}\Big)$$

$$w_{i,j}^{\prime\,(n)} = w_{i,j}^{(n)} - \gamma \cdot \delta_j^{(n)} \cdot x_i^{(n)}$$

wherein γ is a learning rate, and the numbers δ_j^{(n)} can be recursively calculated as

$$\delta_j^{(n)} = \Big(\sum_k \delta_k^{(n+1)} \cdot w_{j,k}^{(n+1)}\Big) \cdot f'\Big(\sum_i x_i^{(n)} \cdot w_{i,j}^{(n)}\Big)$$

based on δ_j^{(n+1)}, if the (n+1)-th layer is not the output layer, and

$$\delta_j^{(n)} = \big(x_j^{(n+1)} - t_j^{(n+1)}\big) \cdot f'\Big(\sum_i x_i^{(n)} \cdot w_{i,j}^{(n)}\Big)$$

if the (n+1)-th layer is the output layer 930, wherein f' is the first derivative of the activation function, and t_j^{(n+1)} is the comparison training value for the j-th node of the output layer 930.

$$x_k^{(n)}[i,j] = (K_k * x^{(n-1)})[i,j] = \sum_{i'}\sum_{j'} K_k[i',j'] \cdot x^{(n-1)}[i-i',\,j-j']$$

$$x^{(n)}[i,j] = f\big(x^{(n-1)}[i d_1,\, j d_2], \ldots, x^{(n-1)}[i d_1 + d_1 - 1,\, j d_2 + d_2 - 1]\big)$$
Claims (20)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US18/161,186 US12394187B2 (en) | 2022-08-22 | 2023-01-30 | Synthesis of medical images of brain tumors using 3D-2D GANs |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202263373122P | 2022-08-22 | 2022-08-22 | |
| US18/161,186 US12394187B2 (en) | 2022-08-22 | 2023-01-30 | Synthesis of medical images of brain tumors using 3D-2D GANs |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| US20240062523A1 US20240062523A1 (en) | 2024-02-22 |
| US12394187B2 true US12394187B2 (en) | 2025-08-19 |
Family
ID=89907002
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US18/161,186 Active 2044-05-02 US12394187B2 (en) | 2022-08-22 | 2023-01-30 | Synthesis of medical images of brain tumors using 3D-2D GANs |
Country Status (1)
| Country | Link |
|---|---|
| US (1) | US12394187B2 (en) |
Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20190197358A1 (en) * | 2017-12-21 | 2019-06-27 | International Business Machines Corporation | Generative Adversarial Network Medical Image Generation for Training of a Classifier |
| US20200364864A1 (en) * | 2019-04-25 | 2020-11-19 | GE Precision Healthcare LLC | Systems and methods for generating normative imaging data for medical image processing using deep learning |
| US10937540B2 (en) * | 2017-12-21 | 2021-03-02 | International Business Machines Coporation | Medical image classification based on a generative adversarial network trained discriminator |
| US20210327054A1 (en) * | 2020-04-15 | 2021-10-21 | Siemens Healthcare Gmbh | Medical image synthesis of abnormality patterns associated with covid-19 |
| US20210383537A1 (en) * | 2020-06-09 | 2021-12-09 | Siemens Healthcare Gmbh | Synthesis of contrast enhanced medical images |
| US20240257339A1 (en) * | 2021-05-11 | 2024-08-01 | Quantum Surgical | Method for generating rare medical images for training deep-learning algorithms |
| US20250148601A1 (en) * | 2022-07-13 | 2025-05-08 | Hyperfine Operations, Inc. | Simulating structures in images |
Non-Patent Citations (21)
Also Published As
| Publication number | Publication date |
|---|---|
| US20240062523A1 (en) | 2024-02-22 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| FEPP | Fee payment procedure |
Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
| AS | Assignment |
Owner name: SIEMENS MEDICAL SOLUTIONS USA, INC., PENNSYLVANIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:ZHAO, GENGYAN;YOO, YOUNGJIN;RE, THOMAS;AND OTHERS;SIGNING DATES FROM 20230130 TO 20230208;REEL/FRAME:062690/0453 |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
| AS | Assignment |
Owner name: NATIONAL INSTITUTES OF HEALTH (NIH), U.S. DEPT. OF HEALTH AND HUMAN SERVICES (DHHS), U.S. GOVERNMENT, MARYLAND Free format text: CONFIRMATORY LICENSE;ASSIGNOR:SIEMENS HEALTHCARE TECHNOLOGY CENTER;REEL/FRAME:064480/0037 Effective date: 20230208 Owner name: NATIONAL INSTITUTES OF HEALTH (NIH), U.S. DEPT. OF HEALTH AND HUMAN SERVICES (DHHS), U.S. GOVERNMENT, MARYLAND Free format text: CONFIRMATORY LICENSE;ASSIGNOR:SIEMENS HEALTHCARE TECHNOLOGY CENTER;REEL/FRAME:064479/0997 Effective date: 20230208 Owner name: NATIONAL INSTITUTES OF HEALTH (NIH), U.S. DEPT. OF HEALTH AND HUMAN SERVICES (DHHS), U.S. GOVERNMENT, MARYLAND Free format text: CONFIRMATORY LICENSE;ASSIGNOR:SIEMENS HEALTHCARE TECHNOLOGY CENTER;REEL/FRAME:064480/0171 Effective date: 20230208 |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT RECEIVED |
|
| STPP | Information on status: patent application and granting procedure in general |
Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED |
|
| STCF | Information on status: patent grant |
Free format text: PATENTED CASE |