EP3704636A1 - System and method for interactive representation learning transfer through deep learning of feature ontologies - Google Patents
System and method for interactive representation learning transfer through deep learning of feature ontologies
- Publication number
- EP3704636A1 (application EP18804491.1A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- input image
- learning
- image dataset
- cnn
- imaging modality
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Links
- 230000013016 learning Effects 0.000 title claims abstract description 150
- 238000000034 method Methods 0.000 title claims abstract description 50
- 230000002452 interceptive effect Effects 0.000 title claims abstract description 36
- 238000012546 transfer Methods 0.000 title claims abstract description 25
- 238000013135 deep learning Methods 0.000 title description 13
- 238000013527 convolutional neural network Methods 0.000 claims abstract description 125
- 238000013507 mapping Methods 0.000 claims abstract description 81
- 230000006870 function Effects 0.000 claims abstract description 78
- 238000003384 imaging method Methods 0.000 claims abstract description 66
- 238000012549 training Methods 0.000 claims abstract description 25
- 238000013526 transfer learning Methods 0.000 claims description 13
- 238000012545 processing Methods 0.000 claims description 7
- 230000003190 augmentative effect Effects 0.000 claims description 4
- 230000008569 process Effects 0.000 claims description 3
- 210000003484 anatomy Anatomy 0.000 description 16
- 238000010801 machine learning Methods 0.000 description 8
- 238000002591 computed tomography Methods 0.000 description 6
- 238000013170 computed tomography imaging Methods 0.000 description 5
- 230000008901 benefit Effects 0.000 description 4
- 238000010586 diagram Methods 0.000 description 4
- 210000004556 brain Anatomy 0.000 description 3
- 238000012512 characterization method Methods 0.000 description 3
- 238000005516 engineering process Methods 0.000 description 3
- 210000002216 heart Anatomy 0.000 description 3
- 210000003734 kidney Anatomy 0.000 description 3
- 210000004185 liver Anatomy 0.000 description 3
- 210000004072 lung Anatomy 0.000 description 3
- 230000011218 segmentation Effects 0.000 description 3
- 210000002784 stomach Anatomy 0.000 description 3
- 238000012285 ultrasound imaging Methods 0.000 description 3
- 238000012800 visualization Methods 0.000 description 3
- 238000013528 artificial neural network Methods 0.000 description 2
- 238000013461 design Methods 0.000 description 2
- 238000009472 formulation Methods 0.000 description 2
- 239000000203 mixture Substances 0.000 description 2
- 210000000857 visual cortex Anatomy 0.000 description 2
- 241001465754 Metazoa Species 0.000 description 1
- 206010028980 Neoplasm Diseases 0.000 description 1
- 230000004913 activation Effects 0.000 description 1
- 230000004075 alteration Effects 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- 230000006399 behavior Effects 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 230000018109 developmental process Effects 0.000 description 1
- 230000008014 freezing Effects 0.000 description 1
- 238000007710 freezing Methods 0.000 description 1
- 230000001939 inductive effect Effects 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 238000003058 natural language processing Methods 0.000 description 1
- 210000002569 neuron Anatomy 0.000 description 1
- 230000008520 organization Effects 0.000 description 1
- 230000008447 perception Effects 0.000 description 1
- 238000007781 pre-processing Methods 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 238000013519 translation Methods 0.000 description 1
- 238000002604 ultrasonography Methods 0.000 description 1
- 210000001835 viscera Anatomy 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24133—Distances to prototypes
- G06F18/24137—Distances to cluster centroïds
- G06F18/2414—Smoothing the distance, e.g. radial basis function networks [RBFN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/254—Fusion techniques of classification results, e.g. of results related to same input data
- G06F18/256—Fusion techniques of classification results, e.g. of results related to same input data of results relating to different input data, e.g. multimodal recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
- G06N5/022—Knowledge engineering; Knowledge acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/30—Determination of transform parameters for the alignment of images, i.e. image registration
- G06T7/33—Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
- G06V10/449—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
- G06V10/451—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
- G06V10/454—Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10081—Computed x-ray tomography [CT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10088—Magnetic resonance imaging [MRI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10104—Positron emission tomography [PET]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/03—Recognition of patterns in medical or anatomical images
- G06V2201/031—Recognition of patterns in medical or anatomical images of internal organs
Definitions
- Embodiments of the present specification relate generally to a system and method for generating transferable representation learnings of medical image data of varied anatomies obtained from varied imaging modalities for use in learning networks. Specifically, the system and method are directed to determining a representation learning of the medical image data as a set of feature primitives based on the physics of a first and/or second imaging modality and the biology of the anatomy to configure new convolutional networks for learning problems such as classification and segmentation of the medical image data from other imaging modalities.
- machine learning is the subfield of computer science that "gives computers the ability to learn without being explicitly programmed."
- Machine learning explores the study and construction of algorithms that can learn from and make predictions on data.
- feature learning or representation learning is a set of techniques that transform raw data input into a representation that can be effectively exploited in machine learning tasks.
- Representation learning is motivated by the fact that machine learning tasks such as classification often require input that is mathematically and computationally convenient to process.
- real-world data such as images, video, and sensor measurements is usually complex, redundant, and highly variable.
- manual feature identification methods require expensive human labor and rely on expert knowledge.
- manually generated representations normally do not lend themselves well to generalization, thereby motivating the design of efficient representation learning techniques to automate and generalize feature or representation learning.
- CNN convolutional neural network
- ConvNet convolutional neural network
- a deep convolutional neural network is a type of feed-forward artificial neural network in which the connectivity pattern between its neurons is inspired by the organization of the animal visual cortex.
- Convolutional neural networks are biologically inspired variants of multilayer perceptrons, designed to emulate the behavior of a visual cortex with minimal amounts of preprocessing.
- MLP multilayer perceptron network
- An MLP includes multiple layers of nodes in a directed graph, with each layer fully connected to the next one.
- CNNs have wide applications in image and video recognition, recommender systems, and natural language processing.
- CNNs are also known as shift invariant or space invariant artificial neural networks (SIANNs) based on their shared weights architecture and translation invariance characteristics.
- SIANNs shift invariant or space invariant artificial neural networks
- a convolutional layer is the core building block.
- Parameters associated with a convolutional layer include a set of learnable filters or kernels. During a forward pass, each filter is convolved across the width and height of the input volume, computing the dot product between the entries of the filter and the input and producing a two-dimensional (2D) activation map of that filter.
- 2D two-dimensional
- the network learns filters that are activated when the network detects a specific type of feature at a given spatial position in the input.
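- For readers who want the mechanics, a minimal sketch of this forward pass in Python/NumPy follows; the language and all names are illustrative choices, not part of the specification.

```python
# Minimal sketch (illustrative, not from the specification): the forward
# pass of one convolutional filter, computing the dot product between the
# filter entries and each input patch to build a 2D activation map.
import numpy as np

def conv2d_single_filter(image, kernel):
    """Valid-mode 2D cross-correlation of one learnable filter."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            patch = image[y:y + kh, x:x + kw]
            out[y, x] = np.sum(patch * kernel)  # dot product -> activation
    return out

activation_map = conv2d_single_filter(np.random.rand(8, 8), np.random.rand(3, 3))
print(activation_map.shape)  # (6, 6)
```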
- Deep CNN architectures are typically purpose-built for deep learning problems. Deep learning is known to be effective in many tasks that involve human-like perception and decision making. Typical applications of deep learning are handwriting recognition, image recognition, and speech recognition.
- Deep learning techniques construct a network model by using training data and outcomes that correspond to the data. Once the network model is constructed, it can be used on new data for determining outcomes. Moreover, it will be appreciated that a deep learning network once learnt for a specific outcome may be reused advantageously for related outcomes.
- CNNs may be used for supervised learning, where an input dataset is labelled, as well as for unsupervised learning, where the input dataset is not labelled.
- a labelled input dataset is one where the elements of the dataset are pre-associated with a classification scheme, represented by labels.
- the CNN is trained with a labelled subset of the dataset, and may be tested with another subset to verify an accurate classification result.
- a deep CNN architecture is multi-layered, where the layers are hierarchically connected. As the number of input to output mappings and filters for each layer increases, a multi-layer deep CNN may result in a huge number of parameters that need to be configured for its operation. If training data for such a CNN is scarce, the learning problem is under-determined. In this situation, it is advantageous to transfer certain parameters from a pre-learned CNN model. Transfer learning reduces the number of parameters to be optimized by freezing the pre-learned parameters in a subset of layers and provides a good initialization for tuning the remaining layers.
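- A minimal PyTorch sketch of this freezing strategy follows; the stand-in network and the choice of which layers to freeze are assumptions for illustration only.

```python
# Illustrative sketch: transfer learning by freezing pre-learned layers.
# The model and the frozen/tuned split are hypothetical.
import torch
import torch.nn as nn

pretrained = nn.Sequential(                  # stand-in for a pre-learned CNN
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 10),
)

# Freeze every layer except the final head; the frozen weights provide
# a good initialization and shrink the number of parameters to optimize.
for layer in list(pretrained.children())[:-1]:
    for p in layer.parameters():
        p.requires_grad = False

optimizer = torch.optim.Adam(
    (p for p in pretrained.parameters() if p.requires_grad), lr=1e-3)
```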
- a method for interactive representation learning transfer to a convolutional neural network includes obtaining at least a first input image dataset from a first imaging modality and a second input image dataset from a second imaging modality.
- the method includes performing at least one of: jointly training a first supervised learning CNN based on labels associated with the first input image dataset and a second supervised learning CNN based on labels associated with the second input image dataset to generate one or more common feature primitives and corresponding mapping functions; and jointly training a first unsupervised learning CNN with the first input image dataset and a second unsupervised learning CNN with the second input image dataset to learn compressed representations of the input image datasets, where the compressed representations include one or more common feature primitives and corresponding mapping functions. Additionally, the method includes storing at least the one or more common feature primitives and the corresponding mapping functions in a feature primitive repository.
- an interactive representation learning transfer (IRLT) unit for interactive representation learning transfer to a convolutional neural network (CNN) is presented.
- the IRLT unit includes an interactive learning network configurator configured to obtain at least a first input image dataset from a first imaging modality and a second input image dataset from a second imaging modality and to perform at least one of: jointly training a first supervised learning CNN based on labels associated with the first input image dataset and a second supervised learning CNN based on labels associated with the second input image dataset to generate one or more common feature primitives and corresponding mapping functions; and jointly training a first unsupervised learning CNN with the first input image dataset and a second unsupervised learning CNN with the second input image dataset to learn compressed representations of the input image datasets, where the compressed representations include one or more common feature primitives and corresponding mapping functions.
- the IRLT unit includes a feature primitive repository configured to store at least the one or more common feature primitives and the corresponding mapping functions.
- a multimodality transfer learning system includes a processor unit and a memory unit communicatively operatively coupled to the processor unit.
- the multimodality transfer learning system includes an interactive representation learning transfer (IRLT) unit operatively coupled to the processor unit and including an interactive learning network configurator configured to obtain at least a first input image dataset from a first imaging modality and a second input image dataset from a second imaging modality and to perform at least one of: jointly training a first supervised learning CNN based on labels associated with the first input image dataset and a second supervised learning CNN based on labels associated with the second input image dataset to generate one or more common feature primitives and corresponding mapping functions; and jointly training a first unsupervised learning CNN with the first input image dataset and a second unsupervised learning CNN with the second input image dataset to learn compressed representations of the input image datasets, where the compressed representations include one or more common feature primitives and corresponding mapping functions.
- the IRLT unit includes a feature primitive repository configured to store at least the one or more common feature primitives and the corresponding mapping functions.
- FIG. 1 is a schematic diagram of an exemplary system for interactive representation learning transfer through deep learning of feature ontologies, in accordance with aspects of the present specification.
- FIG. 2 is a flowchart illustrating a method for building a set of feature primitives from unlabeled image data to augment a feature primitive repository, in accordance with aspects of the present specification.
- FIG. 3 is a flowchart illustrating a method for building a set of feature primitives from labeled image data to augment a feature primitive repository, in accordance with aspects of the present specification.
- FIG. 4 is a flowchart illustrating a method for pre-configuring a CNN with mapping functions to learn an unseen dataset based on a selection of feature primitives, in accordance with aspects of the present specification.
- FIG. 5 is a schematic diagram of one embodiment of an interactive learning network configurator, in accordance with aspects of the present specification.
- transfer learning or “inductive transfer” as used in the present specification is intended to mean an approach of machine learning that focuses on applying the knowledge or learning gained while solving one problem to a different, related problem.
- This knowledge or learning may be characterized and/or represented in various ways, typically by combinations and variations of transfer functions, mapping functions, graphs, matrices, and other primitives.
- transfer learning primitive as used in the present specification is intended to mean the characterization and/or representation of knowledge or learning gained by solving a machine learning problem as described hereinabove.
- feature primitive as used in the present specification is intended to mean a characterization of an aspect of an input dataset, typically, an appearance, shape geometry, or morphology of a region of interest (ROI) corresponding to an image of an anatomy, where the image may be obtained from an imaging modality such as an ultrasound imaging system, a computed tomography (CT) imaging system, a positron emission tomography-CT (PET-CT) imaging system, a magnetic resonance (MR) imaging system, and the like.
- CT computed tomography
- PET-CT positron emission tomography-CT
- MR magnetic resonance
- the anatomy may include an internal organ of the human body such as, but not limited to, the lung, liver, kidney, stomach, heart, brain, and the like.
- an exemplary system 100 for interactive representation learning transfer through deep learning of feature ontologies, in accordance with aspects of the present specification, is illustrated in FIG. 1.
- the system 100 includes a multimodality transfer learning (MTL) subsystem 102.
- the MTL subsystem 102 includes an interactive representation learning transfer (IRLT) unit 104, a processor unit 108, a memory unit 110, and a user interface 106.
- the processor unit 108 is communicatively coupled to the memory unit 110.
- the user interface 106 is operatively coupled to the IRLT unit 104.
- the IRLT unit 104 is operatively coupled to the processor unit 108 and the memory unit 110.
- the system 100 and/or the MTL subsystem 102 may include a display unit 134. It may be noted that the MTL subsystem 102 may include other components or hardware, and is not limited to the components shown in FIG. 1.
- the user interface 106 is configured to receive user input 130 corresponding to characteristics of an input image 128.
- the user input 130 may include aspects or characteristics of the input image 128, such as, but not limited to, the imaging modality, the anatomy generally represented by the input image 128, the appearance of the ROI corresponding to the input image 128, and the like.
- the IRLT unit 104 may be implemented as software systems or computer instructions executable via one or more processor units 108 and stored in the memory unit 110.
- the IRLT unit 104 may be implemented as a hardware system, for example, via FPGAs, custom chips, integrated circuits (ICs), Application Specific ICs (ASICs), and the like.
- the IRLT unit 104 may include an interactive learning network configurator (ILNC) 112, one or more CNNs configured for unsupervised learning (unsupervised learning CNN) 114, one or more CNNs configured for supervised learning (supervised learning CNN) 116, and a feature primitive repository 118.
- the ILNC 112 is operatively coupled to the unsupervised learning CNNs 114, the supervised learning CNNs 116, and the feature primitive repository 118.
- the ILNC 112 may be a graphical user interface subsystem configured to enable a user to configure one or more supervised learning CNNs 116 or unsupervised learning CNNs 114.
- the feature primitive repository 118 is configured to store one or more feature primitives corresponding to an ROI in the input image 128 and one or more corresponding mapping functions.
- mapping function as used in the present specification is intended to represent a transfer function or a CNN filter that maps the ROI to a compressed representation such that an output of the CNN is a feature primitive that characterizes the ROI of the input image 128 based on an aspect of the ROI.
- the aspect of the ROI may include a shape geometry, an appearance, a morphology, and the like.
- the feature primitive repository 118 is configured to store one or more mapping functions. These mapping functions, in conjunction with the corresponding feature primitives, may be used to pre-configure a CNN to learn a new training set.
- the feature primitive repository 118 is configured to store feature primitives and mapping functions which are transfer learnings obtained from other CNNs to pre-configure a new CNN to learn an unseen dataset.
- some non-limiting examples of feature primitives that the feature primitive repository 118 is configured to store include feature primitives characterizing an appearance 120 corresponding to the ROI of the image 128, a shape geometry 124 corresponding to the ROI of the image 128, and an anatomy 126 corresponding to the image 128.
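- A hypothetical sketch of such a repository is given below; the field names and lookup interface are assumptions made for illustration and are not structures defined by the specification.

```python
# Hypothetical sketch of a feature primitive repository: an index of
# learned primitives and their mapping functions, keyed by the modality,
# anatomy, and ROI aspect they characterize. All fields are assumptions.
from dataclasses import dataclass, field

@dataclass
class FeaturePrimitive:
    aspect: str             # e.g. "appearance", "shape_geometry", "anatomy"
    modality: str           # e.g. "CT", "MR", "PET-CT", "ultrasound"
    anatomy: str            # e.g. "liver", "lung", "brain"
    mapping_weights: bytes  # serialized CNN filters / transfer function

@dataclass
class FeaturePrimitiveRepository:
    entries: list = field(default_factory=list)

    def store(self, primitive: FeaturePrimitive) -> None:
        self.entries.append(primitive)

    def lookup(self, **criteria) -> list:
        # Return every primitive matching all requested attributes.
        return [e for e in self.entries
                if all(getattr(e, k) == v for k, v in criteria.items())]
```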
- the ILNC 112 is configured to present a user with various tools and options to interactively characterize one or more aspects of the ROI corresponding to the input image 128.
- An exemplary embodiment of the ILNC 112 is shown in FIG. 5 and the working of the ILNC 112 will be described in greater detail with reference to FIGs. 4 and 5.
- the system 100 is configured to develop a portfolio of feature primitives and mapping functions across one or more anatomies and imaging modalities to be stored in the feature primitive repository 118. Subsequently, the system 100 may provide the user with an ILNC 112 to allow the user to pre-configure a CNN for learning a new, unseen image dataset based on a selection of modality, anatomy, shape geometry, morphology, and the like corresponding to one or more ROIs of the unseen image dataset.
- one learning outcome of the CNN may be a classification scheme categorizing the image dataset. Other non-limiting examples of learning outcomes may include pixel-level segmentation, regression, and the like. The working of the system 100 will be described in greater detail with reference to FIGs. 2-5.
- the system 100 and/or the MTL subsystem 102 may be configured to visualize one or more of the feature primitives, the mapping functions, the image datasets, and the like on the display unit 134.
- referring to FIG. 2, a flowchart 200 generally representative of a method for building a set of feature primitives from unlabeled image data to augment a feature primitive repository, in accordance with aspects of the present specification, is presented. The method 200 is described with reference to the components of FIG. 1.
- the flowchart 200 illustrates the main steps of the method for building a set of feature primitives from unlabeled image data to augment a feature primitive repository such as the feature primitive repository 118.
- steps 202-208 may be performed by the processor unit 108 in conjunction with the memory unit 110 and the ILNC 112 of the IRLT unit 104.
- step 210 may be performed by the processor unit 108 in conjunction with the memory unit 110 and one or more supervised learning CNNs 116.
- steps 214-216 may be performed by the processor unit 108 in conjunction with the memory unit 110 and one or more unsupervised learning CNNs 114.
- the method 200 starts at step 202 where at least a first input image dataset 220 and a second input image dataset 222 corresponding to a first imaging modality and a second imaging modality are obtained.
- the first and second input image datasets 220, 222 may correspond to an anatomy of the human body, such as the liver, lung, kidney, heart, brain, stomach, and the like.
- the first and second imaging modalities may include an ultrasound imaging system, an MR imaging system, a CT imaging system, a PET-CT imaging system, or combinations thereof.
- at step 204, a check is carried out to determine whether the input image datasets 220, 222 are labelled.
- labels referencing the input image datasets may be generally representative of a classification scheme or score that characterizes an aspect of the input images, such as the shape geometry, appearance, morphology, and the like. If, at step 204, it is determined that the input image datasets are labelled, control passes to step 210, where a first CNN and a second CNN are configured for supervised learning of the input image datasets 220, 222. Step 210 will be described in greater detail with reference to FIG. 3. Subsequent to step 210, the method 200 is terminated, as indicated by step 212.
- the second input image dataset 222 corresponding to the second imaging modality is augmented with additional data.
- the additional data for augmenting the second input image dataset 222 may be obtained by processing the first input image dataset 220 corresponding to the first imaging modality via an intensity mapping function.
- the intensity mapping function may include a regression framework that takes multi-modality acquisitions via use of a first imaging modality and a second imaging modality corresponding to one or more subjects, for example, CT and MR, and learns a patch level mapping from the first modality to the second modality, using deep learning, hand crafted intensity features, or a combination thereof to map the intensities of the first modality to the second modality. Subsequently, control passes to step 214.
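- A minimal sketch of such a patch-level intensity mapping, assuming paired CT/MR slices from the same subjects and a simple ridge regressor (the specification fixes neither the regressor nor the patch size):

```python
# A minimal sketch of a patch-level intensity mapping, assuming paired
# CT/MR slices; the regressor choice and patch size are assumptions,
# not specified by the patent.
import numpy as np
from sklearn.linear_model import Ridge

def extract_patches(volume: np.ndarray, size: int = 3) -> np.ndarray:
    """Flatten every size x size patch of a 2D slice into a row vector."""
    h, w = volume.shape
    rows = [volume[y:y + size, x:x + size].ravel()
            for y in range(h - size + 1) for x in range(w - size + 1)]
    return np.array(rows)

ct_slice, mr_slice = np.random.rand(32, 32), np.random.rand(32, 32)
X = extract_patches(ct_slice)                 # CT patches (inputs)
y = extract_patches(mr_slice)[:, 4]           # MR patch centers (targets)
mapper = Ridge().fit(X, y)                    # learns CT -> MR intensities
synthetic_mr = mapper.predict(X)              # augments the MR dataset
```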
- a first unsupervised learning CNN and a second unsupervised learning CNN are jointly trained with the first input image dataset 220 and the second input image dataset 222 to learn compressed representations of the input image datasets 220, 222, where the compressed representations include one or more common feature primitives and corresponding mapping functions.
- the one or more feature primitives characterize aspects of the images of the first input image dataset 220. It may be noted that the mapping functions map the input image dataset to the corresponding feature primitives. In one embodiment, the mapping function may be defined in accordance with equation (1): H_CT = f(P_CT, w) (1)
- H_CT is a set of feature primitives obtained when a region of interest of an image P_CT is mapped using a mapping function f and weights w.
- the image P_CT corresponds to an image obtained via use of a CT imaging system.
- one or more mapping functions corresponding to the first imaging modality and the second imaging modality are generated. It may be noted that these mapping functions map the first and second input image datasets 220, 222 to the same feature primitives.
- a second mapping function may be defined in accordance with equation (2): H_MR = f(P_MR, w) (2)
- H_MR is a set of feature primitives obtained when a region of interest of an image P_MR is mapped using a mapping function f and weights w.
- the image P_MR is obtained using an MR imaging system.
- at step 218, at least the one or more feature primitives and the corresponding mapping functions are stored in the feature primitive repository 118. Control is then passed to step 212 to terminate the method 200.
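- The joint unsupervised training of steps 214-216 might be sketched as two convolutional autoencoders whose latent codes are pulled together so that H_CT and H_MR agree; the architecture, loss weighting, and batch pairing below are assumptions, not the specification's design.

```python
# Hedged sketch of jointly training two unsupervised CNNs (autoencoders)
# so that CT and MR images map to a shared set of feature primitives,
# i.e. H_CT = f(P_CT, w) and H_MR = f(P_MR, w) agree.
import torch
import torch.nn as nn

def make_autoencoder():
    enc = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(8, 4, 3, padding=1))
    dec = nn.Sequential(nn.Conv2d(4, 8, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(8, 1, 3, padding=1))
    return enc, dec

enc_ct, dec_ct = make_autoencoder()
enc_mr, dec_mr = make_autoencoder()
params = (list(enc_ct.parameters()) + list(dec_ct.parameters())
          + list(enc_mr.parameters()) + list(dec_mr.parameters()))
opt = torch.optim.Adam(params, lr=1e-3)

p_ct = torch.rand(16, 1, 32, 32)   # paired CT / MR batches (placeholders)
p_mr = torch.rand(16, 1, 32, 32)

h_ct, h_mr = enc_ct(p_ct), enc_mr(p_mr)
loss = (nn.functional.mse_loss(dec_ct(h_ct), p_ct)      # reconstruct CT
        + nn.functional.mse_loss(dec_mr(h_mr), p_mr)    # reconstruct MR
        + nn.functional.mse_loss(h_ct, h_mr))           # shared primitives
opt.zero_grad(); loss.backward(); opt.step()
```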
- referring to FIG. 3, a flowchart 300 generally representative of a method for building a set of feature primitives from labeled image data to augment a feature primitive repository, in accordance with aspects of the present specification, is presented. More particularly, the method 300 describes step 210 of FIG. 2 in greater detail. Also, the method 300 is described with reference to the components of FIG. 1.
- the flowchart 300 illustrates the main steps of the method for building a set of feature primitives from labeled image data to augment a feature primitive repository.
- steps 302-308 may be performed by the processor unit 108 in conjunction with the memory unit 110 and the ILNC 112 of the IRLT unit 104.
- steps 310-314 may be performed by the processor unit 108 in conjunction with the memory unit 110 and the one or more supervised learning CNNs 116.
- the method 300 starts at step 302 where at least a first input image dataset 316 and a second input image dataset 318 corresponding to at least a first imaging modality and a second imaging modality are obtained.
- the first and second input image datasets 316, 318 may correspond to an anatomy of the human body, for example, liver, lung, kidney, heart, brain, stomach, and the like.
- the first and second imaging modalities may include an ultrasound imaging system, an MR imaging system, a CT imaging system, a PET-CT imaging system, or combinations thereof.
- a learning outcome of the one or more supervised CNNs may be a classification of images in the first and second input image datasets 316, 318.
- a check is carried out to determine if the first input image dataset 316 and the second input image dataset 318 include sufficient data to adequately train one or more CNNs. If, at step 304, it is determined that the first and second input image datasets 316, 318 include sufficient data, control is passed to step 308. However, if, at step 304, it is determined that the first and second input image datasets 316, 318 do not have sufficient data, control is passed to step 306. In one example, it may be determined at step 304 that the first input image dataset 316 includes sufficient data and the second input image dataset 318 does not include sufficient data. Accordingly, in this example, at step 306, the second input image dataset 318 corresponding to the second modality is augmented with additional data.
- the additional data is obtained by processing the first input image dataset 316 corresponding to the first imaging modality via an intensity mapping function.
- the intensity mapping function may include a regression framework that takes multi-modality acquisitions via use of a first imaging modality and a second imaging modality corresponding to one or more subjects, for example, CT and MR, and learns a patch level mapping from the first modality to the second modality, using deep learning, hand crafted intensity features, or a combination thereof to map the intensities of the first modality to the second modality. Subsequently, control is passed to step 308.
- a first supervised learning CNN and a second supervised learning CNN are jointly trained based on labels associated with the first input image dataset 316 and labels associated with the second input image dataset 318 to generate one or more feature primitives and corresponding mapping functions.
- the learning outcome may include one or more feature primitives that characterize aspects of the images of the first input image dataset 316 and aspects of the images of the second input image dataset 318 and corresponding mapping functions where the mapping functions map the corresponding first input image dataset 316 and the second input image dataset 318 to the one or more feature primitives.
- the feature primitives are independent of the imaging modality used to acquire the first input image dataset 316 and the second input image dataset 318.
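- A hedged sketch of the joint supervised training of step 308 follows: two modality-specific encoders feed a shared classification head, encouraging primitives that do not depend on the acquiring modality. The architecture and class count are assumptions.

```python
# Hedged sketch of joint supervised training (step 308): two CNN
# encoders with a shared classifier head, trained on the labels of each
# modality so the learned primitives are modality-independent.
import torch
import torch.nn as nn

def encoder():
    return nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten())  # -> primitives

enc1, enc2 = encoder(), encoder()       # first / second imaging modality
head = nn.Linear(8, 3)                  # shared classification head
opt = torch.optim.Adam([*enc1.parameters(), *enc2.parameters(),
                        *head.parameters()], lr=1e-3)

x1, y1 = torch.rand(8, 1, 32, 32), torch.randint(0, 3, (8,))
x2, y2 = torch.rand(8, 1, 32, 32), torch.randint(0, 3, (8,))
loss = (nn.functional.cross_entropy(head(enc1(x1)), y1)
        + nn.functional.cross_entropy(head(enc2(x2)), y2))
opt.zero_grad(); loss.backward(); opt.step()
```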
- the one or more feature primitives and the corresponding mapping functions are stored in a feature primitive repository.
- the methods 200 and 300 described hereinabove enable the creation of a portfolio of feature primitives and mapping functions corresponding to images generated for a plurality of anatomies across a plurality of modalities.
- the feature primitives and the mapping functions are stored in the feature primitive repository 118.
- this portfolio of feature primitives and mapping functions characterizes the learning gained in the training of the CNNs with the input image datasets.
- the learning may be transferred to pre-configure new CNNs to obtain learning outcomes for different, unseen datasets.
- FIG. 4 illustrates a flowchart 400 depicting a method for pre-configuring a CNN with mapping functions to learn an unseen dataset based on a selection of feature primitives, in accordance with aspects of the present specification.
- the method 400 is described with reference to FIGs. 1, 2 and 3. It may be noted that the flowchart 400 illustrates the main steps of the method 400 for pre-configuring a CNN with mapping functions to learn an unseen dataset based on a selection of feature primitives.
- steps 402-406 may be performed by the processor unit 108 in conjunction with the memory unit 110 and the ILNC 112 of the IRLT unit 104.
- steps 408-410 may be performed by the processor unit 108 in conjunction with the memory unit 110 and the one or more supervised learning CNNs 116.
- the method 400 starts at step 402, where at least one input image dataset 404 may be obtained.
- the at least one input image dataset 404 is representative of an unseen input image dataset.
- at least one learning parameter 406 and a learning outcome 408 corresponding to the input image dataset 404 may be obtained.
- the input image dataset 404, the learning parameter 406, and the learning outcome 408 may be obtained as user input 410.
- the learning parameter 406 may include an imaging modality, an image anatomy, or a combination thereof.
- the learning outcome 408 may include a classification scheme, a regression scheme, or a pixel level output like segmentation.
- At step 412, at least one feature primitive and a corresponding mapping function, both corresponding to the learning parameter 406 and the learning outcome 408, are obtained from the feature primitive repository 118.
- a CNN is configured for learning the input image dataset 404 using the at least one feature primitive and the at least one mapping function, as indicated by step 414.
- the configuration of the CNN may entail setting one or more of its filters to the mapping functions obtained from the feature primitive repository 118.
- a pre-configured CNN is generated. Further, at step 416, the pre-configured CNN is optimized with at least a training subset of the input image dataset 404.
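- Steps 414-416 might be sketched as follows: copy stored mapping-function weights into the filters of a new CNN, keep them fixed, and fine-tune the remaining parameters on the training subset. All names and tensor shapes are illustrative.

```python
# Hedged sketch of steps 412-416: pull stored mapping-function weights
# from the repository, set them as the filters of a new CNN, then
# fine-tune on a training subset of the unseen dataset.
import torch
import torch.nn as nn

new_cnn = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                        nn.Linear(8, 2))

stored_filters = torch.rand(8, 1, 3, 3)      # placeholder for repository lookup
with torch.no_grad():
    new_cnn[0].weight.copy_(stored_filters)  # pre-configure with primitives
new_cnn[0].weight.requires_grad = False      # keep transferred filters fixed

opt = torch.optim.Adam((p for p in new_cnn.parameters() if p.requires_grad))
x, y = torch.rand(4, 1, 32, 32), torch.randint(0, 2, (4,))
loss = nn.functional.cross_entropy(new_cnn(x), y)
opt.zero_grad(); loss.backward(); opt.step()
```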
- a trained convolutional autoencoder (CAE) for supervised learning that uses labelled data corresponding to a first imaging modality is adapted for the input image dataset 404, with a few parameters, in accordance with equation (3): H_1 = f(P_1, w(α)) (3)
- H_1 is a set of feature primitives obtained when a region of interest of an image P_1 corresponding to the first imaging modality is mapped using a mapping function f and weights w(α), where α is a sparse set of the CAE parameters.
- the CAE parameters may be further optimized with at least the training subset of the input image dataset 404.
- the framework for learning may be defined in accordance with the following formulation.
- the mapping function f obtained in equation (3) is applied over the mapping function f_α corresponding to the region of interest of an image corresponding to the second imaging modality.
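- Since the displayed formulation itself was not reproduced in this text, one hedged reconstruction of the composition, using the symbols of equation (3), is:

```latex
% Hedged reconstruction; the exact symbols of the original displayed
% formulation are not reproduced in this text.
H_2 = f\bigl(f_\alpha(P_2),\, w(\alpha)\bigr)
```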
- the input image dataset 404 is processed via the optimized CNN to obtain a learning outcome 420, corresponding to the requested learning outcome 408.
- FIG. 5 is a schematic diagram 500 of one embodiment of the interactive learning network configurator 112 of the interactive representation learning transfer unit 104 of FIG. 1, in accordance with aspects of the present specification.
- the block diagram 500 is generally representative of the ILNC 112 as shown in FIG. 1.
- Reference numerals 502-508 are generally representative of visualizations of feature primitives respectively corresponding to an imaging modality, anatomy, appearance, and shape geometry.
- the data for the visualizations 502-508 may be obtained from the feature primitive repository 118 of FIG. 1.
- the ILNC 500 provides a user with a selection of interactive menus.
- the user may select one or more aspects of an unseen image dataset to be learned by a CNN.
- Reference numerals 510-516 are generally representative of interactive menu options that may be available to a user to aid in the characterization of the unseen image dataset.
- reference numeral 510 may pertain to the imaging modality of the unseen image dataset.
- the menu options of block 510 may include CT, MR, PET, ultrasound, and the like.
- reference numerals 512-516 may show menu options pertaining to the appearance, shape geometry, and anatomy respectively of the unseen image dataset.
- Reference numeral 518 is generally representative of a visualization of the pre-configured CNN.
- reference numeral 518 may correspond to one or more supervised learning CNNs 116 of the IRLT unit 104 of FIG. 1.
- the user may graphically browse, visualize, and combine feature primitives of blocks 502-508 to create the pre-configured CNN 518.
- the systems and methods for interactive representation learning transfer through deep learning of feature ontologies presented hereinabove provide a transfer learning paradigm where a portfolio of learned feature primitives and mapping functions may be combined to configure CNNs to solve new medical image analytics problems.
- the CNNs may be trained for learning appearance and morphology of images.
- tumors may be classified into blobs, cylinders, disks, bright/dark or banded, and the like.
- networks may be trained for various combinations of anatomy, modality, appearance, and morphology (shape geometry) to generate a rich portfolio configured to immediately provide a transference of pre-learnt features to a new problem at hand.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Multimedia (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Biophysics (AREA)
- Biodiversity & Conservation Biology (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Image Analysis (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
IN201741039221 | 2017-11-03 | ||
PCT/US2018/058855 WO2019090023A1 (en) | 2017-11-03 | 2018-11-02 | System and method for interactive representation learning transfer through deep learning of feature ontologies |
Publications (1)
Publication Number | Publication Date |
---|---|
EP3704636A1 (en) | 2020-09-09 |
Family
ID=64332416
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP18804491.1A Withdrawn EP3704636A1 (en) | 2017-11-03 | 2018-11-02 | System and method for interactive representation learning transfer through deep learning of feature ontologies |
Country Status (4)
Country | Link |
---|---|
EP (1) | EP3704636A1 (en) |
JP (1) | JP7467336B2 (en) |
CN (1) | CN111316290B (en) |
WO (1) | WO2019090023A1 (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
- CN110210486B (en) * | 2019-05-15 | 2021-01-01 | 西安电子科技大学 | Generative adversarial transfer learning method based on sketch annotation information |
- CN110186375A (en) * | 2019-06-06 | 2019-08-30 | 西南交通大学 | Intelligent high-speed rail body-in-white assembly welding feature detection device and detection method |
US11941497B2 (en) | 2020-09-30 | 2024-03-26 | Alteryx, Inc. | System and method of operationalizing automated feature engineering |
CN112434602B (en) * | 2020-11-23 | 2023-08-29 | 西安交通大学 | Fault diagnosis method based on movable common feature space mining |
CN113707312A (en) * | 2021-09-16 | 2021-11-26 | 人工智能与数字经济广东省实验室(广州) | Blood vessel quantitative identification method and device based on deep learning |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2011114243A1 (en) * | 2010-03-18 | 2011-09-22 | Koninklijke Philips Electronics N.V. | Functional image data enhancement and/or enhancer |
JP6235610B2 (en) * | 2012-12-26 | 2017-11-22 | ボルケーノ コーポレイション | Measurement and enhancement in multi-modality medical imaging systems |
US9922272B2 (en) * | 2014-09-25 | 2018-03-20 | Siemens Healthcare Gmbh | Deep similarity learning for multimodal medical images |
CN105930877B (en) * | 2016-05-31 | 2020-07-10 | 上海海洋大学 | Remote sensing image classification method based on multi-mode deep learning |
US10242443B2 (en) | 2016-11-23 | 2019-03-26 | General Electric Company | Deep learning medical systems and methods for medical procedures |
US10127659B2 (en) | 2016-11-23 | 2018-11-13 | General Electric Company | Deep learning medical systems and methods for image acquisition |
CN106909905B (en) * | 2017-03-02 | 2020-02-14 | 中科视拓(北京)科技有限公司 | Multi-mode face recognition method based on deep learning |
CN106971174B (en) * | 2017-04-24 | 2020-05-22 | 华南理工大学 | CNN model, CNN training method and CNN-based vein identification method |
CN107220337B (en) * | 2017-05-25 | 2020-12-22 | 北京大学 | Cross-media retrieval method based on hybrid migration network |
2018
- 2018-11-02 WO PCT/US2018/058855 patent/WO2019090023A1/en unknown
- 2018-11-02 CN CN201880071649.9A patent/CN111316290B/en active Active
- 2018-11-02 EP EP18804491.1A patent/EP3704636A1/en not_active Withdrawn
- 2018-11-02 JP JP2020524235A patent/JP7467336B2/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN111316290B (en) | 2024-01-12 |
CN111316290A (en) | 2020-06-19 |
JP7467336B2 (en) | 2024-04-15 |
WO2019090023A1 (en) | 2019-05-09 |
JP2021507327A (en) | 2021-02-22 |
Similar Documents
Publication | Title |
---|---|
Tajbakhsh et al. | Embracing imperfect datasets: A review of deep learning solutions for medical image segmentation |
US11393229B2 | Method and system for artificial intelligence based medical image segmentation |
JP7467336B2 | Method, processing unit and system for studying medical image data of anatomical structures obtained from multiple imaging modalities |
Altaf et al. | Going deep in medical image analysis: concepts, methods, challenges, and future directions |
US20210012486A1 | Image synthesis with generative adversarial network |
Dalca et al. | Anatomical priors in convolutional networks for unsupervised biomedical segmentation |
Fritscher et al. | Deep neural networks for fast segmentation of 3D medical images |
EP3273387B1 | Medical image segmentation with a multi-task neural network system |
Conze et al. | Current and emerging trends in medical image segmentation with deep learning |
US20210012162A1 | 3D image synthesis system and methods |
Srinivasu et al. | Self-Learning Network-based segmentation for real-time brain MR images through HARIS |
Xu et al. | BMAnet: Boundary mining with adversarial learning for semi-supervised 2D myocardial infarction segmentation |
Biswas et al. | Data augmentation for improved brain tumor segmentation |
Khan et al. | Segmentation of shoulder muscle MRI using a new region and edge based deep auto-encoder |
Naga Srinivasu et al. | Variational Autoencoders-Based Self-Learning Model for Tumor Identification and Impact Analysis from 2-D MRI Images |
Ogiela et al. | Natural user interfaces in medical image analysis |
Liu et al. | An automatic cardiac segmentation framework based on multi-sequence MR image |
Jyotiyana et al. | Deep learning and the future of biomedical image analysis |
Quan et al. | An intelligent system approach for probabilistic volume rendering using hierarchical 3D convolutional sparse coding |
Mano et al. | Method of multi-region tumour segmentation in brain MRI images using grid-based segmentation and weighted bee swarm optimisation |
Huang et al. | A two-level dynamic adaptive network for medical image fusion |
Jena et al. | Review of neural network techniques in the verge of image processing |
Silva-Rodríguez et al. | Towards foundation models and few-shot parameter-efficient fine-tuning for volumetric organ segmentation |
Ramtekkar et al. | A comprehensive review of brain tumour detection mechanisms |
Ullah et al. | DSFMA: Deeply supervised fully convolutional neural networks based on multi-level aggregation for saliency detection |
Legal Events
Code | Title | Description |
---|---|---|
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: UNKNOWN |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE |
17P | Request for examination filed | Effective date: 20200428 |
AK | Designated contracting states | Kind code of ref document: A1; Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
AX | Request for extension of the european patent | Extension state: BA ME |
DAV | Request for validation of the european patent (deleted) | |
DAX | Request for extension of the european patent (deleted) | |
P01 | Opt-out of the competence of the unified patent court (upc) registered | Effective date: 20230528 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: EXAMINATION IS IN PROGRESS |
17Q | First examination report despatched | Effective date: 20230907 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE APPLICATION HAS BEEN WITHDRAWN |
18W | Application withdrawn | Effective date: 20231118 |