US12456202B2 - Methods and systems for automated image segmentation of anatomical structure - Google Patents
Info
- Publication number
- US12456202B2 US18/213,931 US202318213931A
- Authority
- US
- United States
- Prior art keywords
- image
- training
- processed
- training image
- processed training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/12—Edge-based segmentation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0475—Generative networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/094—Adversarial learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—Three-dimensional [3D] image rendering
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/771—Feature selection, e.g. selecting representative features from a multi-dimensional feature space
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/64—Three-dimensional [3D] objects
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/03—Recognition of patterns in medical or anatomical images
- G06V2201/031—Recognition of patterns in medical or anatomical images of internal organs
Definitions
- the disclosure herein generally relates to the field of image segmentation, and more specifically to methods and systems for automated image segmentation of an anatomical structure such as heart.
- Segmentation of images associated with anatomical structures finds many applications in medical imaging field for diagnosis, treatment and so on.
- These anatomical structures are sensitive and complex, and demand analysis through to the last image slice.
- conventional techniques in the art, including sophisticated and deep learning techniques, are limited and inaccurate in image segmentation through to the last slice.
- cardiovascular diseases are one of the most fatal diseases in the world. Quantification of volumetric changes in the heart during the cardiac cycle is essential for the diagnosis and monitoring of diseases. Clinical manifestation of the cardiac structure such as changes in size, mass, geometry, regional wall motion, and function of the heart can be assessed timely and monitored non-invasively by cardiovascular magnetic resonance imaging (CMRI). Cardiac image segmentation plays a vital role in the diagnosis of cardiac diseases, quantification of volume, and image-guided interventions.
- CMRI cardiovascular magnetic resonance imaging
- LV left ventricular
- RV right ventricular
- ED end-diastolic
- ES end-systolic
- Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems.
- a processor-implemented method for automated image segmentation of an anatomical structure including the steps of: receiving a plurality of 3-dimensional (3-D) training images corresponding to the anatomical structure and a ground-truth 3-D image associated with each of the plurality of 3-D training images, wherein the plurality of 3-D training images is associated with a plurality of classes of the anatomical structure; pre-processing the plurality of 3-D training images, to obtain a plurality of pre-processed training images; forming one or more mini-batches from the plurality of pre-processed training images, based on a predefined mini-batch size, wherein each mini-batch comprises one or more pre-processed training images; training a segmentation network model, with the one or more pre-processed training images present in each mini-batch at a time, until the one or more mini-batches are completed for a predefined training epochs, to obtain a trained segmentation network model, wherein the segment
- a system for automated image segmentation of an anatomical structure includes: a memory storing instructions; one or more Input/Output (I/O) interfaces; and one or more hardware processors coupled to the memory via the one or more I/O interfaces, wherein the one or more hardware processors are configured by the instructions to: receive a plurality of 3-dimensional (3-D) training images corresponding to the anatomical structure and a ground-truth 3-D image associated with each of the plurality of 3-D training images, wherein the plurality of 3-D training images is associated with a plurality of classes of the anatomical structure; pre-process the plurality of 3-D training images, to obtain a plurality of pre-processed training images; form one or more mini-batches from the plurality of pre-processed training images, based on a predefined mini-batch size, wherein each mini-batch comprises one or more pre-processed training images; train a segmentation network model, with the one or more pre-processed
- one or more non-transitory machine-readable information storage mediums comprising one or more instructions which when executed by one or more hardware processors cause: receiving a plurality of 3-dimensional (3-D) training images corresponding to the anatomical structure and a ground-truth 3-D image associated with each of the plurality of 3-D training images, wherein the plurality of 3-D training images is associated with a plurality of classes of the anatomical structure: pre-processing the plurality of 3-D training images, to obtain a plurality of pre-processed training images; forming one or more mini-batches from the plurality of pre-processed training images, based on a predefined mini-batch size, wherein each mini-batch comprises one or more pre-processed training images; training a segmentation network model, with the one or more pre-processed training images present in each mini-batch at a time, until the one or more mini-batches are completed for a predefined training epochs, to obtain a trained segment
- pre-processing each 3-dimensional training image to obtain a corresponding pre-processed training image comprises: sequentially performing at least one of: (i) an image orientation normalization, (ii) a region of interest (ROI) extraction, (iii) a size normalization, (iv) a pixel value normalization, and (v) an image data augmentation, on each 3-D training image.
- the segmentation network model is a generative adversarial network (GAN) comprising the generator and the patch-based discriminator, wherein the generator comprises the encoder network, the bottleneck network, the decoder network, and a set of skip connections between the encoder network and the decoder network.
- GAN generative adversarial network
- the loss function of the segmentation network model for each pre-processed training image comprises a generator loss and a discriminator loss, wherein the generator loss comprises a class-weighted generalized dice loss and an adversarial loss, and the discriminator loss comprises a real loss and a fake loss, and wherein: the class-weighted generalized dice loss is calculated between the ground-truth 3-D image of the corresponding pre-processed training image and the predicted segmented image corresponding to the pre-processed training image, wherein the class-weighted generalized dice loss is calculated using pixel-based distribution technique; the adversarial loss is calculated between the ground-truth 3-D image of the corresponding pre-processed training image and the predicted segmented image corresponding to the pre-processed training image; the real loss is calculated between the corresponding pre-processed training image and the ground-truth 3-D image of the corresponding pre-processed training image; and the fake loss is calculated between the corresponding pre-processed training image and predicted segment
- the class-weighted generalized dice loss is defined with one or more class weights that are associated with the plurality of classes of the anatomical structure.
- a learning rate and a dropout of the segmentation network model are dynamically adjusted between the predefined training epochs during the training, based on the value of the loss function at each predefined training epoch.
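- As a rough illustration of a class-weighted generalized dice loss, the Python sketch below uses the commonly used generalized Dice form, 1 − 2·Σ_c w_c·Σ_i p_ic·g_ic / Σ_c w_c·Σ_i (p_ic + g_ic). The function name, the explicit `class_weights` argument, the channel-last array layout, and the smoothing term `eps` are assumptions made for illustration; the text does not state how the class weights are derived.

```python
import numpy as np

def class_weighted_generalized_dice_loss(pred, target, class_weights, eps=1e-7):
    """Generalized Dice loss with one weight per class.

    pred, target: arrays of shape (X, Y, Z, C); target is one-hot and
    pred holds per-class probabilities (assumed layout).
    """
    spatial_axes = tuple(range(pred.ndim - 1))       # sum over every voxel
    w = np.asarray(class_weights, dtype=float)       # one weight per class
    intersection = np.sum(pred * target, axis=spatial_axes)
    total = np.sum(pred + target, axis=spatial_axes)
    dice = (2.0 * np.sum(w * intersection) + eps) / (np.sum(w * total) + eps)
    return 1.0 - dice

# Hypothetical usage: background plus three classes (e.g. LV, RV, myocardium).
rng = np.random.default_rng(0)
labels = rng.integers(0, 4, size=(8, 8, 4))
ground_truth = np.eye(4)[labels]                     # one-hot, shape (8, 8, 4, 4)
weights = [0.1, 1.0, 1.0, 1.0]                       # hypothetical class weights
loss_perfect = class_weighted_generalized_dice_loss(ground_truth, ground_truth, weights)
```

- A prediction identical to the ground truth gives a loss close to 0, and a prediction with no overlap gives a loss close to 1.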
- FIG. 1 is an exemplary block diagram of a system for automated image segmentation of an anatomical structure, in accordance with some embodiments of the present disclosure.
- FIGS. 2A-2C illustrate exemplary flow diagrams of a processor-implemented method for automated image segmentation of the anatomical structure, in accordance with some embodiments of the present disclosure.
- FIG. 3 shows a high-level block diagram of the segmentation network model, in accordance with some embodiments of the present disclosure.
- FIG. 4 shows an exemplary block diagram of the generator, in accordance with some embodiments of the present disclosure.
- FIG. 5 shows an exemplary block diagram of the patch-based discriminator, in accordance with some embodiments of the present disclosure.
- FIG. 6 shows performance results of the trained segmentation network model with Blind-testing on Multi-Centre, Multi-Vendor & Multi-Disease Cardiac Image Segmentation Challenge (M&Ms) dataset of different vendors, in accordance with some embodiments of the present disclosure.
- M&Ms Multi-Centre, Multi-Vendor & Multi-Disease Cardiac Image Segmentation Challenge
- FIG. 7 shows a segmented output from a trained segmentation network model for basal, mid-ventricular, and apex ED slices, in accordance with some embodiments of the present disclosure.
- the segmentation process of cardiac imaging is broadly divided into two stages, i.e., localization, and segmentation.
- for localization, some of the conventional techniques use variance, circular Hough and Fourier transforms, and so on, to locate the heart.
- DL deep learning
- an M-net-based architecture for segmenting LV, RV, and myocardium; an ensemble of U-Net inspired architectures for segmenting LV, RV, and myocardium at each time instance of the cardiac cycle.
- a one-stage U-Net for segmentation of the heart is proposed. Further, there are challenge outcomes on the ACDC dataset, with results from DL methods provided by several research groups for the segmentation task and for the classification task.
- the present disclosure solves the technical problems in the art for automated 3-D image segmentation of the anatomical structure such as heart, by proposing a new Generative Adversarial Network (GAN) based architecture for the segmentation (for example LV, RV, and myocardium of heart) from 3-D volume data with high accuracy.
- the proposed 3-D GAN based architecture is capable of storing the 3-D contextual information for the image segmentation of the anatomical structure.
- Referring to FIG. 1 through FIG. 7, where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments, and these embodiments are described in the context of the following exemplary systems and/or methods.
- FIG. 1 is an exemplary block diagram of a system 100 for automated image segmentation of an anatomical structure, in accordance with some embodiments of the present disclosure.
- the system 100 includes or is otherwise in communication with one or more hardware processors 104, communication interface device(s) or input/output (I/O) interface(s) 106, and one or more data storage devices or memory 102 operatively coupled to the one or more hardware processors 104.
- the one or more hardware processors 104, the memory 102, and the I/O interface(s) 106 may be coupled to a system bus 108 or a similar mechanism.
- the I/O interface(s) 106 may include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, and the like.
- the I/O interface(s) 106 may include a variety of software and hardware interfaces, for example, interfaces for peripheral device(s), such as a keyboard, a mouse, an external memory, a plurality of sensor devices, a printer and the like. Further, the I/O interface(s) 106 may enable the system 100 to communicate with other devices, such as web servers and external databases.
- the I/O interface(s) 106 can facilitate multiple communications within a wide variety of networks and protocol types, including wired networks, for example, local area network (LAN), cable, etc., and wireless networks, such as Wireless LAN (WLAN), cellular, or satellite.
- the I/O interface(s) 106 may include one or more ports for connecting a number of computing systems with one another or to another server computer. Further, the I/O interface(s) 106 may include one or more ports for connecting a number of devices to one another or to another server.
- the one or more hardware processors 104 may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions.
- the one or more hardware processors 104 are configured to fetch and execute computer-readable instructions stored in the memory 102.
- the expressions ‘processors’ and ‘hardware processors’ may be used interchangeably.
- the system 100 can be implemented in a variety of computing systems, such as laptop computers, portable computers, notebooks, hand-held devices, workstations, mainframe computers, servers, a network cloud and the like.
- the memory 102 may include any computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes.
- the memory 102 includes a plurality of modules 102a and a repository 102b for storing data processed, received, and generated by one or more of the plurality of modules 102a.
- the plurality of modules 102a may include routines, programs, objects, components, data structures, and so on, which perform particular tasks or implement particular abstract data types.
- the plurality of modules 102a may include programs or computer-readable instructions or coded instructions that supplement applications or functions performed by the system 100.
- the plurality of modules 102a may also be used as signal processor(s), state machine(s), logic circuitries, and/or any other device or component that manipulates signals based on operational instructions.
- the plurality of modules 102a can be used by hardware, by computer-readable instructions executed by the one or more hardware processors 104, or by a combination thereof.
- the plurality of modules 102a can include various sub-modules (not shown in FIG. 1).
- the memory 102 may include information pertaining to input(s)/output(s) of each step performed by the processor(s) 104 of the system 100 and methods of the present disclosure.
- the repository 102b may include a database or a data engine. Further, the repository 102b, amongst other things, may serve as a database or include a plurality of databases for storing the data that is processed, received, or generated as a result of the execution of the plurality of modules 102a. Although the repository 102b is shown internal to the system 100, it will be noted that, in alternate embodiments, the repository 102b can also be implemented external to the system 100, where the repository 102b may be stored within an external database (not shown in FIG. 1) communicatively coupled to the system 100. The data contained within such external database may be periodically updated.
- data may be added into the external database and/or existing data may be modified and/or non-useful data may be deleted from the external database.
- the data may be stored in an external system, such as a Lightweight Directory Access Protocol (LDAP) directory and a Relational Database Management System (RDBMS).
- LDAP Lightweight Directory Access Protocol
- RDBMS Relational Database Management System
- the data stored in the repository 102b may be distributed between the system 100 and the external database.
- FIGS. 2A-2C illustrate exemplary flow diagrams of a processor-implemented method 200 for automated image segmentation of the anatomical structure, in accordance with some embodiments of the present disclosure.
- Although steps of the method 200, including process steps, method steps, techniques or the like, may be described in a sequential order, such processes, methods and techniques may be configured to work in alternate orders. In other words, any sequence or order of steps that may be described does not necessarily indicate a requirement that the steps be performed in that order.
- the steps of processes described herein may be performed in any practical order. Further, some steps may be performed simultaneously, or some steps may be performed alone or independently.
- the one or more hardware processors 104 of the system 100 are configured to receive a plurality of 3-dimensional (3-D) training images corresponding to the anatomical structure and a ground-truth 3-D image associated with each of the plurality of 3-D training images.
- the plurality of 3-D training images is associated with a plurality of classes of the anatomical structure.
- the plurality of classes is associated with the plurality of substructures of the anatomical structure.
- the plurality of classes (substructures) includes a left ventricle (LV), a right ventricle (RV) and a myocardium, which are of interest for the segmentation.
- the ground-truth 3-D image associated with each of the plurality of 3-D training images refers to the segmented image of the associated anatomical structure.
- the plurality of 3-dimensional (3-D) training images may be received in forms including, but not limited to, magnetic resonance imaging (MRI), computerized tomography (CT) or any other 3-D form.
- each 3-D training image includes 4 channels of size 160×160×16×4.
- the plurality of 3-dimensional (3-D) training images and the corresponding ground-truth 3-D images are stored in a repository 102b of the system 100.
- the one or more hardware processors 104 of the system 100 are configured to pre-process the plurality of 3-D training images received at step 202 of the method 200, to obtain a plurality of pre-processed training images.
- Each 3-D training image of the plurality of 3-D training images is pre-processed to obtain the corresponding pre-processed training image and the plurality of pre-processed training images are obtained from the plurality of 3-D training images.
- pre-processing each 3-dimensional training image comprises: sequentially performing at least one of: (i) an image orientation normalization, (ii) a region of interest (ROI) extraction, (iii) a size normalization, (iv) a pixel value normalization, and (v) an image data augmentation, on each 3-D training image.
- the on-the-go 3-D image data augmentation is performed to increase the data size and reduce the storage dependency.
- the on-the-go 3-D image data augmentation comprises: (i) flipping the image over one of the three axes x, y and z, (ii) rotating the image over x, y and z-axis randomly between 0-30 degrees, (iii) deforming the image using elastic deformation, and (iv) altering the brightness of the image using a power-law gamma transformation.
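- As a rough illustration of the on-the-go augmentation listed above, the Python sketch below applies two of the four operations, a random flip over one axis and a power-law gamma change, to a single volume using NumPy. The function name `augment`, the gamma range of 0.7 to 1.5, and the rescaling of intensities to [0, 1] before the gamma step are assumptions made for illustration; rotation and elastic deformation are left out to keep the sketch free of further dependencies.

```python
import numpy as np

def augment(volume, rng):
    """Randomly flip a 3-D volume and alter its brightness with a gamma curve."""
    # (i) flip over one randomly chosen axis (0 = x, 1 = y, 2 = z)
    axis = int(rng.integers(0, 3))
    flipped = np.flip(volume, axis=axis)

    # (iv) power-law gamma transformation on intensities scaled to [0, 1]
    lo, hi = flipped.min(), flipped.max()
    scaled = (flipped - lo) / (hi - lo + 1e-8)
    gamma = float(rng.uniform(0.7, 1.5))             # hypothetical range
    return scaled ** gamma, axis, gamma

# Hypothetical usage on a random 160 x 160 x 16 volume.
rng = np.random.default_rng(42)
volume = rng.random((160, 160, 16))
augmented, flip_axis, gamma = augment(volume, rng)
```

- Because the augmented copy is generated at training time from the stored volume, no augmented copies need to be written to storage.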
- the one or more hardware processors 104 of the system 100 are configured to form one or more mini-batches from the plurality of pre-processed training images, based on a predefined mini-batch size.
- Each mini-batch includes one or more pre-processed training images out of the plurality of pre-processed training images obtained at step 204 of the method 200.
- for example, if the predefined mini-batch size is 16, then each mini-batch includes 16 pre-processed training images.
- a single pre-processed training image is not part of multiple mini-batches, i.e., each mini-batch comprises unique pre-processed training images.
- the number of the one or more pre-processed training images present in the last mini-batch may or may not be equal to the predefined mini-batch size, based on the number of remaining samples available.
- the predefined mini-batch size is defined based on the resource availability such as hardware, graphic processing unit (GPU) capacity, and memory present in the system 100 .
- the one or more hardware processors 104 of the system 100 are configured to train a segmentation network model, with the one or more pre-processed training images present in each mini-batch at a time.
- the training of the segmentation network model is performed until the one or more mini-batches are completed for a predefined number of training epochs, to obtain a trained segmentation network model. When the training of the segmentation network model is completed with all of the one or more mini-batches, it is termed one training epoch, and in the next training epoch the one or more mini-batches are formed again for the training.
- the one or more pre-processed training images present in one mini-batch associated with one training epoch need not be the same as the one or more pre-processed training images present in another mini-batch associated with another training epoch.
- the predefined mini-batch size is uniform across all the training epochs. In an embodiment, the predefined number of training epochs is 2500.
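- The mini-batch formation described above can be sketched in Python as follows. The function name `form_mini_batches`, the use of image indices rather than image data, and the per-epoch seeding are assumptions made for illustration. Each index appears in exactly one mini-batch per epoch, the last mini-batch holds whatever samples remain, and the batches are formed again at every epoch.

```python
import random

def form_mini_batches(num_images, batch_size, seed):
    """Partition image indices into disjoint mini-batches after shuffling."""
    indices = list(range(num_images))
    random.Random(seed).shuffle(indices)             # new grouping per seed
    return [indices[i:i + batch_size]
            for i in range(0, num_images, batch_size)]

# Hypothetical usage: 50 images, mini-batch size 16, three epochs.
for epoch in range(3):
    batches = form_mini_batches(num_images=50, batch_size=16, seed=epoch)
    for batch in batches:
        pass  # one training step of the segmentation network would go here
```

- With 50 images and a mini-batch size of 16, each epoch has three mini-batches of 16 images and a last mini-batch of 2 images.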
- FIG. 3 shows a high-level block diagram of the segmentation network model 300, in accordance with some embodiments of the present disclosure.
- the segmentation network model is a generative adversarial network (GAN) and includes a generator 302 and a patch-based discriminator 304 pitted one against the other.
- the generative adversarial network (GAN) is used to generate new synthetic instances of data that can pass for the real data.
- the generator 302 further includes an encoder network 302a, a bottleneck network 302b, and a decoder network 302c.
- the encoder network 302a and the decoder network 302c are connected to the bottleneck network 302b.
- a set of skip connections (not shown in FIG. 3) is present between the encoder network 302a and the decoder network 302c.
- FIG. 4 shows an exemplary block diagram of the generator 302, in accordance with some embodiments of the present disclosure.
- the encoder network 302 a includes four 3-D convolutional blocks namely 3DConv E1, 3DConv E2, 3DConv E3, and 3DConv E4.
- the bottleneck network 302 b includes four 3-D convolutional blocks namely 3DConv B1, 3DConv B2, 3DConv B3, and 3DConv B4.
- the decoder network 302 c includes four 3-D transposed convolutional blocks namely 3D trans Conv D1, 3D trans Conv D2, 3D trans Conv D3, and 3D trans Conv D4.
- FIG. 5 shows an exemplary block diagram of the patch-based discriminator 304, in accordance with some embodiments of the present disclosure.
- the patch-based discriminator 304 includes five 3-D convolutional blocks namely 3DConv D1, 3DConv D2, 3DConv D3, 3DConv D4, and 3DConv D5.
- Each 3-D convolutional block ((i) of the four 3-D convolutional blocks namely 3DConv E1, 3DConv E2, 3DConv E3, and 3DConv E4 of the encoder network 302a, or (ii) of the four 3-D convolutional blocks namely 3DConv B1, 3DConv B2, 3DConv B3, and 3DConv B4 of the bottleneck network 302b, or (iii) of the four 3-D transposed convolutional blocks namely 3D trans Conv D1, 3D trans Conv D2, 3D trans Conv D3, and 3D trans Conv D4 of the decoder network 302c) contains an identical layer structure comprising a convolutional layer, a padding layer, a pooling or stride layer, a batch normalisation layer, an activation function, and a dropout layer.
- each pre-processed training image present in the mini-batch is passed to the encoder network 302a, to obtain a set of patched feature maps and a set of encoded feature maps, corresponding to the pre-processed training image.
- each pre-processed training image is specifically passed to the first 3-D convolutional block 3DConv E1.
- the patched feature maps are extracted from each of the four 3-D convolutional blocks namely 3DConv E1, 3DConv E2, 3DConv E3, and 3DConv E4.
- the patched feature maps P1(X×Y×1) are extracted from the 3-D convolutional block 3DConv E1.
- the patched feature maps P2(X×Y×1) are extracted from the 3-D convolutional block 3DConv E2.
- the patched feature maps P3(X×Y×1) are extracted from the 3-D convolutional block 3DConv E3.
- the patched feature maps P4(X×Y×1) are extracted from the 3-D convolutional block 3DConv E4.
- in the patched feature maps P2(X×Y×1), the value of X and Y is 10.
- the size of the patched feature maps P1(X×Y×1) is 10×10×1 and such patched feature maps are extracted from each of the four 3-D convolutional blocks of the encoder network 302a.
- each of the four 3-D convolutional blocks of the encoder network 302a includes a 3-D down-sampling convolutional layer with a kernel of 4×4×4, a stride of 2, and a leaky ReLU activation function.
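- To make the down-sampling arithmetic concrete, the Python sketch below traces the spatial size of a 160×160×16 input through four convolutions with a kernel of 4 and a stride of 2, using the standard output-size formula floor((n + 2p − k) / s) + 1. The padding of 1 on every side is an assumption, as the padding is not stated here, and the function names are hypothetical. Under that assumption the fourth block outputs 10×10×1, which would be consistent with the X = Y = 10 map size mentioned above; how patches are taken from the larger outputs of the earlier blocks is not covered by this sketch.

```python
def conv_out(n, kernel=4, stride=2, padding=1):
    """Output length of a convolution along one spatial dimension."""
    return (n + 2 * padding - kernel) // stride + 1

def encoder_shapes(shape=(160, 160, 16), blocks=4):
    """Spatial shape at the input and after each encoder block."""
    shapes = [shape]
    for _ in range(blocks):
        shape = tuple(conv_out(n) for n in shape)
        shapes.append(shape)
    return shapes

for name, shape in zip(["input", "E1", "E2", "E3", "E4"], encoder_shapes()):
    print(name, shape)
# input (160, 160, 16)
# E1 (80, 80, 8)
# E2 (40, 40, 4)
# E3 (20, 20, 2)
# E4 (10, 10, 1)
```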
- the set of patched feature maps (P1(X×Y×1), P2(X×Y×1), P3(X×Y×1), and P4(X×Y×1)) and the set of encoded feature maps (from the last 3-D convolutional block 3DConv E4) of each pre-processed training image, obtained at step 208a, are concatenated channel-wise through the bottleneck network 302b, to obtain a concatenated feature map for the corresponding pre-processed training image. As shown in FIG.
- the patched feature map P1(X×Y×1) and the set of encoded feature maps (from the last 3-D convolutional block 3DConv E4) are concatenated first through the first 3-D convolutional block 3DConv B1 of the bottleneck network 302b, to obtain first intermediate feature maps.
- the first intermediate feature maps and the patched feature map P2(X×Y×1) are concatenated through the second 3-D convolutional block 3DConv B2 to obtain second intermediate feature maps.
- the second intermediate feature maps and the patched feature map P3(X×Y×1) are concatenated through the third 3-D convolutional block 3DConv B3 to obtain third intermediate feature maps.
- the third intermediate feature maps and the patched feature map P4(X×Y×1) are concatenated through the fourth 3-D convolutional block 3DConv B4 to obtain the concatenated feature map for the corresponding pre-processed training image.
- the concatenated feature map is obtained for each pre-processed training image present in the mini-batch at this step.
- the bottleneck network 302 b includes 3-D convolution layers with kernel size, distribution, and activation function similar to the encoder network 302 a, but with stride 1.
- the 3-D convolution layers in the bottleneck network 302 b help reduce the number of parameters in the network ( 302 b ) while still allowing it to be deep and to represent many feature maps.
- the depth of the bottleneck network 302 b and the encoder network 302 a are identical, as feature maps of X×Y×1 (for example, 10×10×1) from each encoder network output layer are concatenated with each bottleneck layer using the patch extraction.
- Skip-connections between the encoder network 302 a and the decoder network 302 c are also applied.
- the skip connections and the bottleneck network 302 b help add back the features lost during down-sampling, thereby preserving the specific and crucial information that is essential in the medical image domain.
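The chained concatenation through 3DConv B1 to B4 can be followed by tracking tensor shapes only. The sketch below is illustrative: the channel counts (512 and 64) and the helper names are hypothetical, since the text does not give filter counts, and the stride-1 convolution is assumed to preserve the spatial size.

```python
def concat_channels(a, b):
    """Channel-wise concatenation of two (X, Y, Z, C) shapes."""
    if a[:3] != b[:3]:
        raise ValueError("spatial sizes must match for channel-wise concatenation")
    return a[:3] + (a[3] + b[3],)

def conv_stride1(shape, filters):
    """A stride-1 convolution (assumed 'same' padding) keeps X, Y, Z and sets C."""
    return shape[:3] + (filters,)

def bottleneck_shapes(encoded, patches, filters):
    """Concatenate each patched feature map in turn, then apply one bottleneck block."""
    shapes = []
    current = encoded
    for patch, f in zip(patches, filters):
        current = conv_stride1(concat_channels(current, patch), f)
        shapes.append(current)
    return shapes

# Hypothetical channel counts; only the 10x10x1 spatial size comes from the text.
encoded = (10, 10, 1, 512)
patches = [(10, 10, 1, 64)] * 4          # P1..P4
result = bottleneck_shapes(encoded, patches, filters=[512] * 4)
print(result[-1])
```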
- the concatenated feature map of each pre-processed training image is passed to the decoder network 302 c , to predict a segmented image corresponding to each pre-processed training image. More specifically, the concatenated feature map of each pre-processed training image is passed to the first 3-D transposed convolutional block 3D trans Conv D1 of the decoder network 302 c . The predicted segmented image corresponding to each pre-processed training image is obtained from the last 3-D transposed convolutional block 3D trans Conv D4 of the decoder network 302 c . Hence the predicted segmented image is obtained for each pre-processed training image present in the mini-batch at this step.
- the decoder network 302 c includes 3-D transpose up-sampling layers with the rest similar to the encoder network 302 a .
- the segmentation network model 300 is the segmented network and the classifier network
- the last output layer is a 3-D transpose layer with a 4×4×4 kernel, stride 2, and a soft-max activation function for segmenting and classifying the substructures of the anatomical structure (for example, the 4 substructures of the heart include LV, RV, myocardium, and the background).
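The role of the soft-max output can be illustrated for a single voxel. This is a minimal sketch: the class ordering and the example logits are hypothetical, and a real network would apply the same operation to every voxel of the output volume.

```python
import math

# The ordering of the four classes is an assumption made for illustration.
CLASSES = ["background", "LV", "RV", "Myocardium"]

def softmax(logits):
    """Numerically stable soft-max over a list of class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify_voxel(logits):
    """Return the most probable class label and the class probabilities."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs

label, probs = classify_voxel([0.1, 2.0, -1.0, 0.5])
print(label, [round(p, 3) for p in probs])
```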
- a probability value corresponding to each pre-processed training image is predicted, through the patch-based discriminator 304 .
- to predict the probability value corresponding to each pre-processed training image, (i) the predicted segmented image corresponding to the pre-processed training image obtained at step 208 c, and (ii) the ground-truth 3-D image of the corresponding pre-processed training image received at step 202 of the method 200, are passed to the patch-based discriminator 304.
- the patch-based discriminator 304 includes the 3-D convolutional layers with parameters similar to the encoder network 302 a of the generator 302 .
- the patch-based discriminator 304 is built on patch GAN architecture style.
- the patch-based discriminator 304 takes two inputs, the original image and the predicted segmented image output from the generator 302 .
- the patch-based discriminator 304 splits the raw input image into small local patches of size 24×24×6, then runs a general discriminator convolutionally on every patch, declaring whether the patch is real or fake.
- the final prediction is the average of all the patch responses.
- a value of a loss function of the segmentation network model 300 is calculated, for the one or more pre-processed training images present in each mini-batch, using the predicted probability value corresponding to each pre-processed training image obtained at step 208 d .
- the value of the loss function of the segmentation network model is first calculated for each pre-processed training image present in the mini-batch and the value of the loss function is then aggregated for all the pre-processed training images present in the mini-batch.
- the loss function of the segmentation network model 300 for each pre-processed training image is a summation of a generator loss (L G ) and a discriminator loss (L D ).
- the generator loss (L G ) comprises a class-weighted generalized dice loss and an adversarial loss.
- the discriminator loss comprises a real loss and a fake loss.
- the class-weighted generalized dice loss is calculated between the ground-truth 3-D image of the corresponding pre-processed training image and the predicted segmented image corresponding to the pre-processed training image, wherein the class-weighted generalized dice loss is calculated using pixel-based distribution technique.
- the adversarial loss is calculated between the ground-truth 3-D image of the corresponding pre-processed training image and the predicted segmented image corresponding to the pre-processed training image.
- the class weights are incorporated while calculating the class-weighted generalized dice loss of the generator 302, to counter the poor training of certain segmentation classes caused by class imbalance (for example, the myocardium in the heart) in the classification network of the generator 302. This ensures that the minority class can be detected correctly.
- These class weights are computed based on the plurality of 3-dimensional (3-D) training images (training dataset in general) received at step 202 of the method 200 .
- the class-weighted generalized dice loss is defined with one or more class weights that are associated with plurality of classes of the anatomical structure.
- $$\text{Dice coefficient} = \frac{2\,|A \cap B|}{|A| + |B|} \times \text{class weights} \qquad (3)$$
- Class weights are calculated as follows: Suppose there are P 3-dimensional (3-D) training images $\{I_1, I_2, I_3, I_4, \ldots, I_P\}$ and j unique labels $\{\alpha_1, \alpha_2, \alpha_3, \alpha_4, \ldots, \alpha_j\}$, considering j substructures in each 3-D training image; then the weight of the label j in the i-th 3-D training image is represented as $W\alpha_{ji}$ and defined as in equation 4:
- $$W\alpha_{ji} = \frac{\sum m:\ m \text{ pixels of label } \alpha_j \text{ distributed over image } i}{\sum N:\ N \text{ is total number of pixels in image } i} \qquad (4)$$
- $$\text{Total weights} = \Big[\sum_{i}\big(W\alpha_{11}, W\alpha_{12}, \ldots, W\alpha_{1i}\big),\ \sum_{i}\big(W\alpha_{21}, W\alpha_{22}, \ldots, W\alpha_{2i}\big),\ \ldots,\ \sum_{i}\big(W\alpha_{j1}, W\alpha_{j2}, \ldots, W\alpha_{ji}\big)\Big]$$
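The per-image label weights of equation 4 and their sum over the training set can be sketched as below. Label images are represented here as flat lists of integer labels purely for illustration; the function names are hypothetical, and how the summed weights are subsequently turned into loss weights is not specified in the text, so that step is left out.

```python
from collections import Counter

def label_weights(label_image, labels):
    """Fraction of pixels carrying each label in one image (equation 4)."""
    counts = Counter(label_image)
    total = len(label_image)
    return {lab: counts.get(lab, 0) / total for lab in labels}

def total_weights(label_images, labels):
    """Sum each label's per-image weight over all training images."""
    totals = {lab: 0.0 for lab in labels}
    for image in label_images:
        for lab, weight in label_weights(image, labels).items():
            totals[lab] += weight
    return totals

# Two toy ground-truth images with labels 0..3 (illustrative values only).
images = [[0, 0, 1, 2], [0, 1, 1, 3]]
labels = [0, 1, 2, 3]
per_image = label_weights(images[0], labels)
totals = total_weights(images, labels)
print(per_image, totals)
```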
- the real loss is calculated between the corresponding pre-processed training image and the ground-truth 3-D image of the corresponding pre-processed training image.
- the fake loss is calculated between the corresponding pre-processed training image and predicted segmented image corresponding to the pre-processed training image.
- L D discriminator loss
- the patch-based discriminator splits the raw input image into small local patches, runs a general discriminator convolutionally on every patch, and averages all the responses to obtain the final output indicating whether the input image is fake or not.
- the main difference between the patch-based discriminator and a regular GAN discriminator is that the latter maps an input image to a single scalar output in the range of [0,1], indicating the probability of the image being real or fake, while the patch-based discriminator provides an array as the output with each entry signifying whether its corresponding patch is real or fake.
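The difference between an array of patch responses and a single scalar can be made concrete with a small sketch. It assumes non-overlapping 24×24×6 tiles with the edge tiles padded, applied to a 160×160×16 volume; in a typical PatchGAN the patches arise implicitly from the receptive field of the convolutions and may overlap, so the tiling here is an illustrative simplification and the function names are hypothetical.

```python
import math

def patch_grid(volume_shape, patch_shape=(24, 24, 6)):
    """Number of patches along each axis, assuming non-overlapping tiles."""
    return tuple(math.ceil(v / p) for v, p in zip(volume_shape, patch_shape))

def average_patch_response(responses):
    """Average a 3-D nested list of per-patch real/fake scores into one value."""
    flat = [r for plane in responses for row in plane for r in row]
    return sum(flat) / len(flat)

grid = patch_grid((160, 160, 16))
# Toy responses: every patch judged real (1.0) except one judged fake (0.0).
responses = [[[1.0 for _ in range(grid[2])] for _ in range(grid[1])]
             for _ in range(grid[0])]
responses[0][0][0] = 0.0
score = average_patch_response(responses)
print(grid, score)
```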
- the weights of the segmentation network model 300 are updated through backpropagation based on the calculated value of the loss function of the segmentation network model 300, obtained at step 208 e.
- the training of the segmentation network model 300 is performed until the one or more mini-batches are completed for the predefined training epochs, to obtain the trained segmentation network model.
- the trained segmentation model obtained at step 208 of the method 200 is then used for various applications where the segmentation image of the anatomical structure is required, and especially where the last slice of the anatomical structure such as heart, is of high importance.
- the one or more hardware processors 104 of the system 100 are configured to receive an input 3-D training image corresponding to the anatomical structure for which the segmentation is required.
- the received input 3-D training image is pre-processed as explained at step 204 of the method 200 and the pre-processed input 3-D training image is obtained.
- the one or more hardware processors 104 of the system 100 are configured to pass the pre-processed input 3-D training image, to the trained segmentation model obtained at step 208 of the method 200 , to predict the segmented image corresponding to the input 3-D training image of the anatomical structure. It is implicit that the pre-processed input 3-D training image is passed to the encoder network 302 a of the trained segmentation model and the predicted segmented image corresponding to the input 3-D training image is obtained from the decoder network 302 c of the trained segmentation model.
- a learning rate and a dropout of the segmentation network model 300 are dynamically adjusted between the predefined training epochs during the training, based on the value of the loss function at each predefined training epoch.
- the generator uses an Adam optimizer with a learning rate (l r ) of 2e−4 and a beta of 0.5.
- the discriminator uses RMSprop with an l r of 1e−3; the discounting factor for the coming gradient is set to 0.5.
- Different dropout values are applied to the networks of the generator and the discriminator.
- the generator has a lower dropout of 0.3 as compared to the discriminator, which has 0.5. Giving a higher dropout to the discriminator makes it more dynamic so that training does not go into mode collapse, a common problem while training a GAN. Also, the low dropout of the generator helps convergence and avoids the vanishing gradient problem.
- the predicted segmented image corresponding to the anatomical structure in this step only gives the segmented information such as the type of segments (for example, the left ventricle (LV), the right ventricle (RV) and the myocardium of the heart) present in the segmented image.
- the anomalies in the segments (substructures) are unknown.
- the one or more hardware processors 104 of the system 100 are configured to obtain one or more domain features of the predicted segmented image corresponding to the input 3-dimensional training image.
- a domain feature extraction technique is employed to extract the one or more domain features of the predicted segmented image corresponding to the anatomical structure.
- the domain features are derived from the segmented areas in the predicted segmented image.
- the domain features can be used as clinical values that aid in diagnosis, or as features for automated algorithms that detect anomalies.
- the standard clinical values like ejection fraction, myocardial mass, volume information for disease classification, etc.
- the standard features mostly include used volumes of LV, RV, and Myocardium during ES and ED phases, stroke volume for LV and RV, height, weight etc.
- the domain features may also be textural, geometrical, and radiomics approach-based features. In some pathologies, the difference in volume plays a critical role: in dilated cardiomyopathy, the LV volume is more than the myocardium volume in the ED phase. Similarly, in hypertrophic cardiomyopathy, the myocardium has a significantly larger volume than the LV during the ED phase. Hence, these observations are incorporated during the ES and ED phases across images, and some standard features together with new ratio and subtraction volume features are considered. Overall, 28 domain features are identified and shown in Table 1:
- LV means left ventricle; RV means right ventricle
- MY means myocardium
- ES means end-systole; ED means end-diastole
- EF means ejection fraction; SV means stroke volume.
- the one or more hardware processors 104 of the system 100 are configured to pass the one or more domain features of the predicted segmented image, obtained at step 214 of the method 200 , to a classification network model, to predict an anomaly class of the plurality of anomaly classes associated with the anomaly substructures.
- the 28 domain features are identified at step 214 of the method 200 . These domain features are used for training a random forest classifier (RFC), a supervised learning technique that is a meta estimator that fits a number of decision tree classifiers on various sub-samples of the dataset, to obtain the classification network model after the training.
- RFC random forest classifier
- the individual decision trees are generated using an attribute selection indicator such as information gain, which is a measure of how much information can be gathered from a piece of data.
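Information gain as an attribute selection indicator can be illustrated with the standard entropy-based definition. The sketch below is a minimal example and not the classifier used in the disclosure; the anomaly labels "DCM" and "HCM" and the splits are illustrative only.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the child nodes."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted

parent = ["DCM", "DCM", "HCM", "HCM"]
# A split that separates the two classes perfectly versus one that does not.
perfect_gain = information_gain(parent, [["DCM", "DCM"], ["HCM", "HCM"]])
useless_gain = information_gain(parent, [["DCM", "HCM"], ["DCM", "HCM"]])
print(perfect_gain, useless_gain)
```

A decision tree built this way would prefer the feature threshold that yields the larger gain at each node.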
- the patch-based feature extraction at the encoder network is concatenated with the bottleneck layers of the bottleneck network to maintain the finer and spatial details for the semantic segmentation.
- as the input to the GAN based segmentation network model of the present disclosure is a 3-D image, it is advantageous for capturing and maintaining minute feature details, voxel spacing and volume information.
- the segmentation network model of the present disclosure is computationally efficient. It captures contextual information and has potential to implicitly learn local dependence between the pixels. Due to the dynamic network parameters, such as the learning rate and the dropout of the segmentation network, the training process converges fast, with better segmented output and efficiently handles mode collapse and vanishing gradient problems.
- the GAN based segmentation network model of the present disclosure is robust and efficient for segmenting the anatomical structure with good accuracy.
- the methods and systems of the present disclosure can robustly detect the substructures even in the presence of anatomical and pathological structural anomalies. Further, the methods and systems of the present disclosure are fast and consistent, with high accuracy.
- Automated Cardiac Diagnosis Challenge is a MICCAI 2017 challenge dataset acquired at the university hospital of Dijon, France. This consists of cardiac short-axis MRI images with the corresponding ground truth (GT) of LV, LV myocardium and RV for 100 patients. Each case contains all phases of 4D images: however, manual reference images are provided only in ED (end-diastole) and ES (end-systole) cardiac phases. The dataset is divided into 5 evenly distributed subgroups: normal case, heart failure with infarction, dilated cardiomyopathy, hypertrophic cardiomyopathy and abnormal right ventricle. Since we do not have the GT for the 50 test subjects, we divide the training dataset of 100 patients into 80 subjects for training and 20 subjects (4 from each subgroup) for testing. The dice score in this paper is the average performance results of 5-fold cross-validation.
- the cardiac MR images comprise the heart and the surrounding chest cavity, such as the lungs and diaphragm. To narrow the region of interest and localize the heart region (LV center), the images are cropped from the center to 150×150×original depth and then zero-padded to 160×160×16 as per the network requirement. The pixel values are normalized to the range [0, 1]. To increase the training samples and reduce storage dependency, on-the-go 3-D data augmentation is applied: it randomly flips the image over one of the three axes, rotates the 3-D image about the x, y, and z axes by a random angle between 0 and 30 degrees, deforms the image using elastic deformation, and alters the brightness using a power-law gamma transformation. The techniques listed in the pre-processing are randomly selected for a specific time.
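The crop, pad, and normalization arithmetic described above can be sketched on index ranges and flat value lists. This is a minimal illustration: the symmetric distribution of the zero padding, the example input size of 216, and the function names are assumptions, since the text only gives the target sizes.

```python
def center_crop_bounds(length, target):
    """Start and end indices of a centered crop along one axis."""
    start = max((length - target) // 2, 0)
    return start, min(start + target, length)

def symmetric_pad(length, target):
    """Zero-padding (before, after) needed to reach the target length."""
    total = max(target - length, 0)
    before = total // 2
    return before, total - before

def min_max_normalize(values):
    """Scale values linearly into [0, 1]; a constant input maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

crop_xy = center_crop_bounds(216, 150)   # 216 is a hypothetical in-plane size
pad_xy = symmetric_pad(150, 160)         # 150 -> 160 in-plane
pad_z = symmetric_pad(10, 16)            # a hypothetical depth of 10 -> 16
normalized = min_max_normalize([0, 5, 10])
print(crop_xy, pad_xy, pad_z, normalized)
```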
- the proposed approach is implemented using TensorFlow and OpenCV.
- the generator uses an Adam optimizer with a learning rate (l r ) of 2e−4 and a beta of 0.5.
- the discriminator uses RMSprop with an l r of 1e−3; the discounting factor for the coming gradient is set to 0.5.
- Different dropout values are applied to the networks of the generator and the discriminator.
- the generator has a lower dropout of 0.3 as compared to the discriminator, which has 0.5. Giving a higher dropout to the discriminator makes it more dynamic so that training does not go into mode collapse, a common problem while training a GAN. Also, the low dropout of the generator helps convergence and avoids the vanishing gradient problem.
- the segmentation network model is trained with combined ES and ED dataset for 2500 epochs on a GPU.
- Dice score is used to evaluate the performance of the present disclosure with the state-of-the-art techniques.
- the model is trained on the ES and ED datasets separately as well as on the combined dataset. It is observed that training on ES and ED separately with the proposed segmentation network model gives no major improvement.
- the present disclosure (the proposed segmentation network model) is robust enough to learn from both ED and ES images trained together and the below reported results are from combined training.
- FIG. 6 shows performance results of the trained segmentation network model with Blind-testing on Multi-Centre, Multi-Vendor & Multi-Disease Cardiac Image Segmentation Challenge (M&Ms) dataset of different vendors, in accordance with some embodiments of the present disclosure.
- M&Ms Multi-Centre, Multi-Vendor & Multi-Disease Cardiac Image Segmentation Challenge
- the present disclosure provides consistent results across all scanners from multiple vendors A, B, C, and D.
- region 1 is LV
- 2 is Myocardium
- 3 is RV.
- FIG. 7 shows a segmented output from a trained segmentation network model for basal, mid-ventricular, and apex ED slices, in accordance with some embodiments of the present disclosure.
- the predicted output is close to ground truth (GT) shared by clinical experts for all three regions across different slice levels.
- GT ground truth
- the various shades of gray are the 3 different substructures as shown in the first GT image.
- the embodiments of the present disclosure herein address the unresolved problem of accurately achieving 3-D image segmentation by using the GAN based segmentation network model.
- the performance results also show that the present disclosure for the 3-D image segmentation is efficient and accurate, and provides minute segmentation up to the last slice.
- the embodiment of the present disclosure is explained mainly with the example of the heart as the anatomical structure, as it is quite complex and needs minute segmentation up to the last slice.
- the scope of the present disclosure is not limited to the heart; other anatomical structures such as the lungs, abdomen, and so on can also be segmented using the system and method of the present disclosure.
- Such computer-readable storage means contain program-code means for implementation of one or more steps of the method, when the program runs on a server or mobile device or any suitable programmable device.
- the hardware device can be any kind of device which can be programmed including e.g., any kind of computer like a server or a personal computer, or the like, or any combination thereof.
- the device may also include means which could be e.g., hardware means like e.g., an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a combination of hardware and software means, e.g., an ASIC and an FPGA, or at least one microprocessor and at least one memory with software processing components located therein.
- the means can include both hardware means and software means.
- the method embodiments described herein could be implemented in hardware and software.
- the device may also include software means. Alternatively, the embodiments may be implemented on different hardware devices, e.g., using a plurality of CPUs.
- the embodiments herein can comprise hardware and software elements.
- the embodiments that are implemented in software include but are not limited to, firmware, resident software, microcode, etc.
- the functions performed by various components described herein may be implemented in other components or combinations of other components.
- a computer-usable or computer readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
- a computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored.
- a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein.
- the term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e., be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
Abstract
Description
L_G = MSE(D01(Ori, Pred)) + ∝ * GDL(gt, Gpred) (1)
where GDL(gt, Gpred) represents the class-weighted generalized dice loss, MSE(D01(Ori, Pred)) represents the adversarial loss, MSE is the mean square error, D01 is the patch-based discriminator output compared with the tensor of ones, Ori is the actual image (a 3-D training image of the plurality of 3-dimensional (3-D) training images received at step 202), Pred is the respective predicted output, gt is the corresponding ground-truth 3-D image received at step 202, Gpred is the segmented predicted output of the decoder network 302 c (from the generator 302), and ∝ is a crucial hyperparameter: a scalar coefficient that works as a regularization parameter and penalizes the network accordingly. In an embodiment, the value of ∝ is chosen as 8 based on the experimental analysis.
Dice loss = 1 − dice coefficient (2)
and the dice coefficient is calculated using equation 3:
wherein |A∩B| is the element-wise multiplication between the predicted segmented image and the ground truth image followed by the sum of the resulting matrix, and |A|+|B| represents the total pixel sum of the predicted segmented image and the ground truth image.
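A class-weighted dice loss along the lines of equations 2 and 3 can be sketched as follows. Masks are flat lists of 0/1 values, one per class. The way the class weights are combined (a weighted mean of the per-class dice coefficients, normalized by the sum of the weights) is an assumption made for illustration, since the text only states that the dice coefficient is multiplied by class weights; the function names are hypothetical.

```python
def dice_coefficient(pred, truth):
    """2|A∩B| / (|A| + |B|) for two flat masks of equal length."""
    intersection = sum(p * t for p, t in zip(pred, truth))
    denominator = sum(pred) + sum(truth)
    # Two empty masks are treated as a perfect match.
    return 1.0 if denominator == 0 else 2.0 * intersection / denominator

def class_weighted_dice_loss(pred_masks, truth_masks, class_weights):
    """1 - weighted mean of per-class dice (the normalization is an assumption)."""
    weighted = sum(w * dice_coefficient(p, t)
                   for p, t, w in zip(pred_masks, truth_masks, class_weights))
    return 1.0 - weighted / sum(class_weights)

# Toy two-class example with illustrative masks and weights.
pred = [[1, 1, 0, 0], [0, 0, 1, 1]]
truth = [[1, 0, 0, 0], [0, 1, 1, 1]]
loss = class_weighted_dice_loss(pred, truth, [1.0, 3.0])
perfect = class_weighted_dice_loss(truth, truth, [1.0, 3.0])
print(loss, perfect)
```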
I 1 ={Wα 11 ,Wα 21 ,Wα 31 , . . . ,Wα j1}
I 2 ={Wα 12 ,Wα 22 ,Wα 32 , . . . ,Wα j2}
I i ={Wα 1i ,Wα 2i ,Wα 3i , . . . ,Wα ji}
L_D = 0.5 * (L_R + L_F) (5)
wherein L_R is the real loss and L_F is the fake loss.
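The generator loss of equation 1 and the discriminator loss, which the text describes as combining a real loss and a fake loss, can be sketched with plain lists of patch responses. The target of ones for the generator's adversarial term and the weight of 8 come from the text; using a target of ones for the real loss and zeros for the fake loss is an assumption in the usual least-squares GAN style, and the function names are hypothetical.

```python
def mse(values, target):
    """Mean square error between patch responses and a constant target."""
    return sum((v - target) ** 2 for v in values) / len(values)

ALPHA = 8  # weight of the dice term, as stated in the text

def generator_loss(disc_out_on_pred, dice_loss, alpha=ALPHA):
    """Adversarial term (responses compared with ones) plus the weighted dice loss."""
    return mse(disc_out_on_pred, 1.0) + alpha * dice_loss

def discriminator_loss(disc_out_on_real, disc_out_on_pred):
    """Half the sum of the real loss and the fake loss."""
    real_loss = mse(disc_out_on_real, 1.0)
    fake_loss = mse(disc_out_on_pred, 0.0)  # zeros target is an assumption
    return 0.5 * (real_loss + fake_loss)

g_loss = generator_loss([1.0, 1.0], dice_loss=0.25)
d_loss_ideal = discriminator_loss([1.0, 1.0], [0.0, 0.0])
d_loss_unsure = discriminator_loss([0.5], [0.5])
print(g_loss, d_loss_ideal, d_loss_unsure)
```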
- (i) Stroke volume (SV) is defined as the volume ejected between the end of diastole and the end of systole.
SV=EDV−ESV
wherein EDV is the end-diastolic volume and ESV is the end-systolic volume.
- (ii) LV mass: the density of cardiac muscle is about 1.05 g/mL, so the LV mass can be computed as LV myocardial volume×1.05.
- (iii) Ejection fraction (EF) is a measurement, expressed as a percentage, of how much blood the left ventricle pumps out with each contraction.
- (iv) Body Surface Area (BSA): Normalisation of the derived physiological values is done using BSA, which is calculated using Mosteller's formula.
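The four clinical values can be computed as in the sketch below. The SV and LV mass expressions follow the text. The EF expression (stroke volume divided by end-diastolic volume, as a percentage) and the Mosteller expression (square root of height in cm times weight in kg divided by 3600) are the standard forms of those formulas rather than reproductions from this document, and the function names and example values are illustrative.

```python
import math

def stroke_volume(edv_ml, esv_ml):
    """SV = EDV - ESV, in mL."""
    return edv_ml - esv_ml

def ejection_fraction(edv_ml, esv_ml):
    """EF as a percentage of the end-diastolic volume (standard definition)."""
    return 100.0 * (edv_ml - esv_ml) / edv_ml

def lv_mass(myocardial_volume_ml, density_g_per_ml=1.05):
    """LV mass in grams from the LV myocardial volume."""
    return myocardial_volume_ml * density_g_per_ml

def bsa_mosteller(height_cm, weight_kg):
    """Body surface area in square metres (standard Mosteller formula)."""
    return math.sqrt(height_cm * weight_kg / 3600.0)

sv = stroke_volume(120.0, 50.0)
ef = ejection_fraction(120.0, 50.0)
mass = lv_mass(100.0)
bsa = bsa_mosteller(180.0, 80.0)
print(sv, round(ef, 2), mass, bsa)
```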
TABLE 1
| Standard Features | Proposed Subtraction Features | Proposed Ratio Features |
|---|---|---|
| LVES & LVED vol | LVED − MYED (S1) | LVED/RVED (R1) |
| RVES & RVED vol | LVES − MYES (S2) | LVES/RVES (R2) |
| MYES & MYED vol | LVED − RVED (S3) | MYED/LVED (R3) |
| LV & RV SV | LVES − RVES (S4) | MYES/LVES (R4) |
| LV & RV EF | MYED − RVED (S5) | MYED/RVED (R5) |
| LV Mass (LVM) | MYES − RVES (S6) | MYES/RVES (R6) |
| Height (Ht), weight (Wt) | | MYED/MYES (R7) |
| BMI & BSA | | |
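The proposed subtraction and ratio features of Table 1 follow directly from the six segmented volumes. The sketch below is illustrative: the dictionary layout, the function name, and the example volumes are hypothetical, and R7 is treated as a ratio feature.

```python
def derived_features(v):
    """Subtraction (S1-S6) and ratio (R1-R7) features from ED/ES volumes in mL."""
    features = {
        "S1": v["LVED"] - v["MYED"],
        "S2": v["LVES"] - v["MYES"],
        "S3": v["LVED"] - v["RVED"],
        "S4": v["LVES"] - v["RVES"],
        "S5": v["MYED"] - v["RVED"],
        "S6": v["MYES"] - v["RVES"],
        "R1": v["LVED"] / v["RVED"],
        "R2": v["LVES"] / v["RVES"],
        "R3": v["MYED"] / v["LVED"],
        "R4": v["MYES"] / v["LVES"],
        "R5": v["MYED"] / v["RVED"],
        "R6": v["MYES"] / v["RVES"],
        "R7": v["MYED"] / v["MYES"],
    }
    return features

# Illustrative volumes only; not taken from any dataset.
volumes = {"LVED": 150.0, "LVES": 60.0, "RVED": 160.0,
           "RVES": 80.0, "MYED": 120.0, "MYES": 130.0}
features = derived_features(volumes)
print(features["S1"], features["R1"], features["R7"])
```

Together with the 15 standard features listed in the first column, these 13 derived features give the 28 domain features mentioned in the text.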
Claims (17)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| IN202221037838 | 2022-06-30 | ||
| IN202221037838 | 2022-06-30 |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| US20240005512A1 US20240005512A1 (en) | 2024-01-04 |
| US12456202B2 true US12456202B2 (en) | 2025-10-28 |
Family
ID=87002941
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US18/213,931 Active 2044-07-22 US12456202B2 (en) | 2022-06-30 | 2023-06-26 | Methods and systems for automated image segmentation of anatomical structure |
Country Status (2)
| Country | Link |
|---|---|
| US (1) | US12456202B2 (en) |
| EP (1) | EP4300365A3 (en) |
Families Citing this family (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US12456202B2 (en) * | 2022-06-30 | 2025-10-28 | Tata Consultancy Services Limited | Methods and systems for automated image segmentation of anatomical structure |
| KR102924100B1 (en) * | 2022-11-22 | 2026-02-06 | 한국과학기술원 | Apparatus and method for generating 3d object texture map |
| US20250148777A1 (en) * | 2023-11-08 | 2025-05-08 | Qualcomm Incorporated | Systems and methods for segmentation map error correction |
| CN118570473B (en) * | 2024-05-30 | 2025-01-17 | 广东海洋大学 | Improved U-net apple image segmentation method and device |
| CN118968070B (en) * | 2024-10-12 | 2025-02-07 | 南方医科大学南方医院 | Dynamic pulmonary vessel segmentation method based on operation video |
Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20190261945A1 (en) * | 2018-02-26 | 2019-08-29 | Siemens Medical Solutions Usa, Inc. | Three-Dimensional Segmentation from Two-Dimensional Intracardiac Echocardiography Imaging |
| US20210012885A1 (en) * | 2019-07-12 | 2021-01-14 | The Regents Of The University Of California | Fully automated four-chamber segmentation of echocardiograms |
| US20210248747A1 (en) * | 2020-02-11 | 2021-08-12 | DeepVoxel, Inc. | Organs at risk auto-contouring system and methods |
| US20220012890A1 (en) * | 2020-07-01 | 2022-01-13 | University Of Iowa Research Foundation | Model-Based Deep Learning for Globally Optimal Surface Segmentation |
| US20220036128A1 (en) * | 2020-08-03 | 2022-02-03 | International Business Machines Corporation | Training machine learning models to exclude ambiguous data samples |
| US20240005512A1 (en) * | 2022-06-30 | 2024-01-04 | Tata Consultancy Services Limited | Methods and systems for automated image segmentation of anatomical structure |
| US20250104221A1 (en) * | 2023-09-27 | 2025-03-27 | GE Precision Healthcare LLC | System and method for one-shot anatomy localization with unsupervised vision transformers for three-dimensional (3d) medical images |
-
2023
- 2023-06-26 US US18/213,931 patent/US12456202B2/en active Active
- 2023-06-27 EP EP23181767.7A patent/EP4300365A3/en active Pending
Non-Patent Citations (5)
| Title |
|---|
| Cirillo et al., "Vox2Vox: 3D-GAN for Brain Tumour Segmentation," Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries, (2020). |
| Gatti, A.A., Maly, M.R. Automatic knee cartilage and bone segmentation using multi-stage convolutional neural networks: data from the osteoarthritis initiative. Magn Reson Mater Phy 34, 859-875 (Jun. 2021). https://doi.org/10.1007/s10334-021-00934-z (Year: 2021). * |
| Le et al., "Auto Whole Heart Segmentation from CT images Using an Improved Unet-GAN," (2021). |
| Morris et al., "Cardiac substructure segmentation with deep learning for improved cardiac sparing," Med Phys, 47(2):576-586 (2020). |
| Morris, Eric D., et al. "Cardiac substructure segmentation with deep learning for improved cardiac sparing." Medical physics 47.2 (2020): 576-586. (Year: 2020). * |
Also Published As
| Publication number | Publication date |
|---|---|
| EP4300365A3 (en) | 2024-01-17 |
| EP4300365A2 (en) | 2024-01-03 |
| US20240005512A1 (en) | 2024-01-04 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US12456202B2 (en) | Methods and systems for automated image segmentation of anatomical structure | |
| US11182896B2 (en) | Automated segmentation of organ chambers using deep learning methods from medical imaging | |
| Zotti et al. | Convolutional neural network with shape prior applied to cardiac MRI segmentation | |
| CN110475505B (en) | Automatic segmentation using full convolution network | |
| Khened et al. | Densely connected fully convolutional network for short-axis cardiac cine MR image segmentation and heart diagnosis using random forest | |
| US9968257B1 (en) | Volumetric quantification of cardiovascular structures from medical imaging | |
| US20190279361A1 (en) | Automatic quantification of cardiac mri for hypertrophic cardiomyopathy | |
| CN113012173A (en) | Heart segmentation model and pathology classification model training, heart segmentation and pathology classification method and device based on cardiac MRI | |
| CN108603922A (en) | Automatic cardiac volume is divided | |
| Shoaib et al. | An overview of deep learning methods for left ventricle segmentation | |
| He et al. | Automatic left ventricle segmentation from cardiac magnetic resonance images using a capsule network | |
| Zotti et al. | Novel deep convolution neural network applied to MRI cardiac segmentation | |
| US20250006346A1 (en) | Method and system for magnetic resonance (mr) image analysis | |
| Kanakatte et al. | 3D cardiac substructures segmentation from CMRI using generative adversarial network (GAN) | |
| Baumgartner et al. | Fully convolutional networks in medical imaging: Applications to image enhancement and recognition | |
| Abdelrauof et al. | Light-weight localization and scale-independent multi-gate UNET segmentation of left and right ventricles in MRI images | |
| Attar et al. | High throughput computation of reference ranges of biventricular cardiac function on the UK Biobank population cohort | |
| EP4445329A1 (en) | Selecting training data for annotation | |
| Yang et al. | Not all areas are equal: Detecting thoracic disease with chestwnet | |
| Hamedian | Automatic classification of cardiomegaly using deep convolutional neural network | |
| Zhang et al. | Image quality assessment for population cardiac magnetic resonance imaging | |
| Shilpa et al. | A review on cardiovascular disease detection using machine learning algorithms | |
| Muthulakshmi et al. | Pelican optimized extreme learning machine based prognosis of heart failure using textural patterns in CMR images | |
| Galati | Cardiac Image Segmentation: towards better reliability and generalization | |
| Appavu | A Deep Learning Method for Precise Rare Disease Diagnosis Through Enhanced Clinical Imagery | |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AS | Assignment | Owner name: TATA CONSULTANCY SERVICES LIMITED, INDIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: KANAKATTE GURUMURTHY, APARNA; GHOSE, AVIK; BHATIA, DIVYA MANOHARLAL; AND OTHERS; SIGNING DATES FROM 20220724 TO 20220803; REEL/FRAME: 064053/0361 |
| | FEPP | Fee payment procedure | Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: ALLOWED -- NOTICE OF ALLOWANCE NOT YET MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT RECEIVED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED |
| | STCF | Information on status: patent grant | Free format text: PATENTED CASE |