WO2019090023A1 - System and method for interactive representation learning transfer through deep learning of feature ontologies - Google Patents
- Publication number
- WO2019090023A1 (PCT/US2018/058855)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- input image
- learning
- image dataset
- cnn
- imaging modality
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24133—Distances to prototypes
- G06F18/24137—Distances to cluster centroïds
- G06F18/2414—Smoothing the distance, e.g. radial basis function networks [RBFN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/254—Fusion techniques of classification results, e.g. of results related to same input data
- G06F18/256—Fusion techniques of classification results, e.g. of results related to same input data of results relating to different input data, e.g. multimodal recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
- G06N5/022—Knowledge engineering; Knowledge acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/30—Determination of transform parameters for the alignment of images, i.e. image registration
- G06T7/33—Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
- G06V10/449—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
- G06V10/451—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
- G06V10/454—Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10081—Computed x-ray tomography [CT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10088—Magnetic resonance imaging [MRI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10104—Positron emission tomography [PET]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/03—Recognition of patterns in medical or anatomical images
- G06V2201/031—Recognition of patterns in medical or anatomical images of internal organs
Definitions
- Embodiments of the present specification relate generally to a system and method for generating transferable representation learnings of medical image data of varied anatomies obtained from varied imaging modalities for use in learning networks. Specifically, the system and method are directed to determining a representation learning of the medical image data as a set of feature primitives based on the physics of a first and/or second imaging modality and the biology of the anatomy to configure new convolutional networks for learning problems such as classification and segmentation of the medical image data from other imaging modalities.
- machine learning is the subfield of computer science that "gives computers the ability to learn without being explicitly programmed."
- Machine learning explores the study and construction of algorithms that can learn from and make predictions on data.
- feature learning or representation learning is a set of techniques that transform raw data input into a representation that can be effectively exploited in machine learning tasks.
- Representation learning is motivated by the fact that machine learning tasks such as classification often require input that is mathematically and computationally convenient to process.
- real-world data such as images, video, and sensor measurement is usually complex, redundant, and highly variable.
- manual feature identification methods require expensive human labor and rely on expert knowledge.
- manually generated representations normally do not lend themselves well to generalization, thereby motivating the design of efficient representation learning techniques to automate and generalize feature or representation learning.
- CNN convolutional neural network
- ConvNet convolutional neural network
- a deep convolutional neural network is a type of feed-forward artificial neural network in which the connectivity pattern between its neurons is inspired by the organization of the animal visual cortex.
- Convolutional neural networks are biologically inspired variants of multilayer perceptrons, designed to emulate the behavior of a visual cortex with minimal amounts of preprocessing.
- MLP multilayer perceptron network
- An MLP includes multiple layers of nodes in a directed graph, with each layer fully connected to the next one.
- CNNs have wide applications in image and video recognition, recommender systems, and natural language processing.
- CNNs are also known as shift invariant or space invariant artificial neural networks (SIANNs) based on their shared weights architecture and translation invariance characteristics.
- SIANNs shift invariant or space invariant artificial neural networks
- a convolutional layer is the core building block.
- Parameters associated with a convolutional layer include a set of learnable filters or kernels. During a forward pass, each filter is convolved across the width and height of the input volume, computing the dot product between the entries of the filter and the input and producing a two-dimensional (2D) activation map of that filter.
- 2D two-dimensional
- the network learns filters that are activated when the network detects a specific type of feature at a given spatial position in the input.
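The forward pass described above can be illustrated with a minimal NumPy sketch (the image and the vertical-edge filter are hypothetical examples, not taken from the specification): sliding a filter across the input and computing the dot product at each spatial position yields a 2D activation map that is large wherever the filter's feature is present.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide a filter across the input, computing the dot product at each
    spatial position to produce a 2D activation map ("valid" padding)."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

# A vertical-edge filter activates where intensity changes left to right.
image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=float)
activation_map = conv2d_valid(image, kernel)  # shape (2, 2)
```

In a trained CNN the kernel entries are learned rather than hand-set, but the mechanics of producing the activation map are the same.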
- Deep CNN architectures are typically purpose-built for deep learning problems. Deep learning is known to be effective in many tasks that involve human-like perception and decision making. Typical applications of deep learning are handwriting recognition, image recognition, and speech recognition.
- Deep learning techniques construct a network model by using training data and outcomes that correspond to the data. Once the network model is constructed, it can be used on new data for determining outcomes. Moreover, it will be appreciated that a deep learning network once learnt for a specific outcome may be reused advantageously for related outcomes.
- CNNs may be used for supervised learning, where an input dataset is labelled, as well as for unsupervised learning, where the input dataset is not labelled.
- a labelled input dataset is one where the elements of the dataset are pre-associated with a classification scheme, represented by labels.
- the CNN is trained with a labelled subset of the dataset, and may be tested with another subset to verify an accurate classification result.
- a deep CNN architecture is multi-layered, where the layers are hierarchically connected. As the number of input to output mappings and filters for each layer increases, a multi-layer deep CNN may result in a huge number of parameters that need to be configured for its operation. If training data for such a CNN is scarce, the learning problem is under-determined. In this situation, it is advantageous to transfer certain parameters from a pre-learned CNN model. Transfer learning reduces the number of parameters to be optimized by freezing the pre-learned parameters in a subset of layers and provides a good initialization for tuning the remaining layers.
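Parameter freezing in transfer learning can be sketched as follows. This is a pure-Python illustration; the layer names and parameter counts are invented for the example and each layer is represented by a plain dict rather than a real network layer.

```python
# Each layer records its parameter count and whether it is trainable.
pretrained = [
    {"name": "conv1", "params": 1_728,  "trainable": True},
    {"name": "conv2", "params": 36_864, "trainable": True},
    {"name": "fc",    "params": 4_096,  "trainable": True},
]

def freeze(layers, names):
    """Freeze the named pre-learned layers; their weights act as a fixed
    initialization and are excluded from optimization."""
    for layer in layers:
        if layer["name"] in names:
            layer["trainable"] = False
    return layers

def trainable_params(layers):
    return sum(l["params"] for l in layers if l["trainable"])

freeze(pretrained, {"conv1", "conv2"})
# Only the final layer's parameters remain to be tuned on the new dataset,
# shrinking the under-determined learning problem described above.
remaining = trainable_params(pretrained)
```

Freezing the early layers reduces the parameters to be optimized from 42,688 to 4,096 in this toy setting, which is the mechanism by which transfer learning copes with scarce training data.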
- a method for interactive representation learning transfer to a convolutional neural network includes obtaining at least a first input image dataset from a first imaging modality and a second input image dataset from a second imaging modality.
- the method includes performing at least one of: jointly training a first supervised learning CNN based on labels associated with the first input image dataset and a second supervised learning CNN based on labels associated with the second input image dataset to generate one or more common feature primitives and corresponding mapping functions; and jointly training a first unsupervised learning CNN with the first input image dataset and a second unsupervised learning CNN with the second input image dataset to learn compressed representations of the input image datasets, where the compressed representation includes one or more common feature primitives and corresponding mapping functions. Additionally, the method includes storing at least the one or more common feature primitives and the corresponding mapping functions in a feature primitive repository.
- an interactive representation learning transfer (IRLT) unit for interactive representation learning transfer to a convolutional neural network (CNN) is presented.
- the IRLT unit includes an interactive learning network configurator configured to obtain at least a first input image dataset from a first imaging modality and a second input image dataset from a second imaging modality and perform at least one of jointly training a first supervised learning CNN based on labels associated with the first input image dataset and a second supervised learning CNN based on labels associated with the second input image dataset to generate one or more common feature primitives and corresponding mapping functions and jointly training a first unsupervised learning CNN with the first input image dataset and a second unsupervised learning CNN with the second input image dataset to learn compressed representations of the input image datasets, where the compressed representation includes one or more common feature primitives and corresponding mapping functions.
- the IRLT unit includes a feature primitive repository configured to store at least the one or more common feature primitives and the corresponding mapping functions.
- a multimodality transfer learning system includes a processor unit and a memory unit communicatively operatively coupled to the processor unit.
- the multimodality transfer learning system includes an interactive representation learning transfer (IRLT) unit operatively coupled to the processor unit and including an interactive learning network configurator configured to obtain at least a first input image dataset from a first imaging modality and a second input image dataset from a second imaging modality and perform at least one of jointly training a first supervised learning CNN based on labels associated with the first input image dataset and a second supervised learning CNN based on labels associated with the second input image dataset to generate one or more common feature primitives and corresponding mapping functions and jointly training a first unsupervised learning CNN with the first input image dataset and a second unsupervised learning CNN with the second input image dataset to learn compressed representations of the input image datasets, where the compressed representation includes one or more common feature primitives and corresponding mapping functions.
- the IRLT unit includes a feature primitive repository configured to store the at least one or more common feature primitives and the corresponding mapping functions.
- FIG. 1 is a schematic diagram of an exemplary system for interactive representation learning transfer through deep learning of feature ontologies, in accordance with aspects of the present specification.
- FIG. 2 is a flowchart illustrating a method for building a set of feature primitives from unlabeled image data to augment a feature primitive repository, in accordance with aspects of the present specification.
- FIG. 3 is a flowchart illustrating a method for building a set of feature primitives from labeled image data to augment a feature primitive repository, in accordance with aspects of the present specification.
- FIG. 4 is a flowchart illustrating a method for pre-configuring a CNN with mapping functions to learn an unseen dataset based on a selection of feature primitives, in accordance with aspects of the present specification.
- FIG. 5 is a schematic diagram of one embodiment of an interactive learning network configurator, in accordance with aspects of the present specification.
- transfer learning or “inductive transfer” as used in the present specification is intended to mean an approach of machine learning that focuses on applying the knowledge or learning gained while solving one problem to a different, related problem.
- This knowledge or learning may be characterized and/or represented in various ways, typically by combinations and variations of transfer functions, mapping functions, graphs, matrices, and other primitives.
- transfer learning primitive as used in the present specification is intended to mean the characterization and/or representation of knowledge or learning gained by solving a machine learning problem as described hereinabove.
- feature primitive as used in the present specification is intended to mean a characterization of an aspect of an input dataset, typically, an appearance, shape geometry, or morphology of a region of interest (ROI) corresponding to an image of an anatomy, where the image may be obtained from an imaging modality such as an ultrasound imaging system, a computed tomography (CT) imaging system, a positron emission tomography-CT (PET-CT) imaging system, a magnetic resonance (MR) imaging system, and the like.
- CT computed tomography
- PET-CT positron emission tomography-CT
- MR magnetic resonance
- the anatomy may include an internal organ of the human body such as, but not limited to, the lung, liver, kidney, stomach, heart, brain, and the like.
- An exemplary system 100 for interactive representation learning transfer through deep learning of feature ontologies, in accordance with aspects of the present specification, is illustrated in FIG. 1.
- the system 100 includes a multimodality transfer learning (MTL) subsystem 102.
- the MTL subsystem 102 includes an interactive representation learning transfer (IRLT) unit 104, a processor unit 108, a memory unit 110, and a user interface 106.
- the processor unit 108 is communicatively coupled to the memory unit 110.
- the user interface 106 is operatively coupled to the IRLT unit 104.
- the IRLT unit 104 is operatively coupled to the processor unit 108 and the memory unit 110.
- the system 100 and/or the MTL subsystem 102 may include a display unit 134. It may be noted that the MTL subsystem 102 may include other components or hardware, and is not limited to the components shown in FIG. 1.
- the user interface 106 is configured to receive user input 130 corresponding to characteristics of an input image 128.
- the user input 130 may include aspects or characteristics of the input image 128, such as, but not limited to, the imaging modality, the anatomy generally represented by the input image 128, the appearance of the ROI corresponding to the input image 128, and the like.
- the IRLT unit 104 may be implemented as a software system or computer instructions executable via one or more processor units 108 and stored in the memory unit 110.
- the IRLT unit 104 may be implemented as a hardware system, for example, via FPGAs, custom chips, integrated circuits (ICs), Application Specific ICs (ASICs), and the like.
- the IRLT unit 104 may include an interactive learning network configurator (ILNC) 112, one or more CNNs configured for unsupervised learning (unsupervised learning CNN) 114, one or more CNNs configured for supervised learning (supervised learning CNN) 116, and a feature primitive repository 118.
- the ILNC 112 is operatively coupled to the unsupervised learning CNNs 114, the supervised learning CNNs 116, and the feature primitive repository 118.
- the ILNC 112 may be a graphical user interface subsystem configured to enable a user to configure one or more supervised learning CNNs 116 or unsupervised learning CNNs 114.
- the feature primitive repository 118 is configured to store one or more feature primitives corresponding to an ROI in the input image 128 and one or more corresponding mapping functions.
- mapping function as used in the present specification is intended to represent a transfer function or a CNN filter that maps the ROI to a compressed representation such that an output of the CNN is a feature primitive that characterizes the ROI of the input image 128 based on an aspect of the ROI.
- the aspect of the ROI may include a shape geometry, an appearance, a morphology, and the like.
- the feature primitive repository 118 is configured to store one or more mapping functions. These mapping functions, in conjunction with the corresponding feature primitives, may be used to pre-configure a CNN to learn a new training set.
- the feature primitive repository 118 is configured to store feature primitives and mapping functions which are transfer learnings obtained from other CNNs to pre-configure a new CNN to learn an unseen dataset.
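One way to picture the feature primitive repository is as a store keyed by modality, anatomy, and ROI aspect. The sketch below is an illustration of that idea only, not the patented implementation; the primitive name and the identity mapping function are hypothetical stand-ins for learned CNN outputs and filters.

```python
class FeaturePrimitiveRepository:
    """Illustrative store of feature primitives and their mapping functions,
    keyed by (imaging modality, anatomy, ROI aspect)."""

    def __init__(self):
        self._store = {}

    def add(self, modality, anatomy, aspect, primitive, mapping_fn):
        self._store[(modality, anatomy, aspect)] = (primitive, mapping_fn)

    def lookup(self, modality, anatomy, aspect):
        return self._store.get((modality, anatomy, aspect))

repo = FeaturePrimitiveRepository()
# A hypothetical primitive: lesion appearance learned from CT liver images,
# with an identity function standing in for a learned mapping function.
repo.add("CT", "liver", "appearance", "lesion_texture_v1", lambda roi: roi)
primitive, mapping_fn = repo.lookup("CT", "liver", "appearance")
```

Keying on modality and anatomy mirrors how the specification later retrieves primitives from the repository when pre-configuring a new CNN.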
- some non-limiting examples of feature primitives that the feature primitive repository 118 is configured to store include feature primitives characterizing an appearance 120 corresponding to the ROI of the image 128, a shape geometry 124 corresponding to the ROI of the image 128, and an anatomy 126 corresponding to the image 128.
- the ILNC 112 is configured to present a user with various tools and options to interactively characterize one or more aspects of the ROI corresponding to the input image 128.
- An exemplary embodiment of the ILNC 112 is shown in FIG. 5 and the working of the ILNC 112 will be described in greater detail with reference to FIGs. 4 and 5.
- the system 100 is configured to develop a portfolio of feature primitives and mapping functions across one or more anatomies and imaging modalities to be stored in the feature primitive repository 118. Subsequently, the system 100 may provide the user with an ILNC 112 to allow the user to pre-configure a CNN for learning a new, unseen image dataset based on a selection of modality, anatomy, shape geometry, morphology, and the like corresponding to one or more ROIs of the unseen image dataset.
- one learning outcome of the CNN may be a classification scheme categorizing the image dataset. Other non-limiting examples of learning outcomes may include pixel-level segmentation, regression, and the like. The working of the system 100 will be described in greater detail with reference to FIGs. 2-5.
- system 100 and/or the MTL subsystem 102 may be configured to visualize one or more of the feature primitives, the mapping functions, the image datasets, and the like on the display unit 134.
- Referring now to FIG. 2, a flowchart 200 generally representative of a method for building a set of feature primitives from unlabeled image data to augment a feature primitive repository, in accordance with aspects of the present specification, is presented. The method 200 is described with reference to the components of FIG. 1.
- the flowchart 200 illustrates the main steps of the method for building a set of feature primitives from unlabeled image data to augment a feature primitive repository such as the feature primitive repository 118.
- steps 202-208 may be performed by the processor unit 108 in conjunction with memory unit 110 and the ILNC 112 of the IRLT unit 104.
- step 210 may be performed by the processor unit 108 in conjunction with the memory unit 110 and one or more supervised learning CNNs 116.
- steps 214-216 may be performed by the processor unit 108 in conjunction with the memory unit 110 and one or more unsupervised learning CNNs 114.
- the method 200 starts at step 202 where at least a first input image dataset 220 and a second input image dataset 222 corresponding to a first imaging modality and a second imaging modality are obtained.
- the first and second input image datasets 220, 222 may correspond to an anatomy of the human body, such as the liver, lung, kidney, heart, brain, stomach, and the like.
- the first and second imaging modalities may include an ultrasound imaging system, an MR imaging system, a CT imaging system, a PET-CT imaging system, or combinations thereof.
- At step 204, a check is carried out to determine whether the input image datasets 220, 222 are labelled.
- labels referencing the input image datasets may be generally representative of a classification scheme or score that characterizes an aspect of the input images, such as the shape geometry, appearance, morphology, and the like. If, at step 204, it is determined that the input image datasets are labelled, control passes to step 210, where a first CNN and a second CNN are configured for supervised learning of the input image datasets 220, 222. Step 210 will be described in greater detail with reference to FIG. 3. Subsequent to step 210, the method 200 is terminated, as indicated by step 212.
- the second input image dataset 222 corresponding to the second imaging modality is augmented with additional data.
- the additional data for augmenting the second input image dataset 222 may be obtained by processing the first input image dataset 220 corresponding to the first imaging modality via an intensity mapping function.
- the intensity mapping function may include a regression framework that takes multi-modality acquisitions via use of a first imaging modality and a second imaging modality corresponding to one or more subjects, for example, CT and MR, and learns a patch level mapping from the first modality to the second modality, using deep learning, hand crafted intensity features, or a combination thereof to map the intensities of the first modality to the second modality. Subsequently, control passes to step 214.
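A minimal sketch of such a patch-level intensity mapping follows, assuming a plain least-squares regression from first-modality patch intensities to second-modality intensities; the actual framework may instead use deep learning, hand-crafted intensity features, or a combination thereof, and the paired-patch data here is synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Paired patches from co-registered acquisitions of the same subjects:
# rows are flattened 3x3 patches from modality A (e.g., CT); targets are
# the corresponding modality-B (e.g., MR) intensities (synthetic setup).
patches_a = rng.uniform(0, 1, size=(200, 9))
true_w = rng.uniform(-1, 1, size=9)        # hidden ground-truth mapping
targets_b = patches_a @ true_w

# Learn the patch-level mapping by least squares.
w, *_ = np.linalg.lstsq(patches_a, targets_b, rcond=None)

# Map a new modality-A patch to a modality-B intensity, producing the
# additional data used to augment the scarce second-modality dataset.
new_patch = rng.uniform(0, 1, size=9)
predicted_b = new_patch @ w
```

The learned weights play the role of the intensity mapping function: once fitted, every first-modality patch can be pushed through it to synthesize second-modality training data.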
- a first unsupervised learning CNN and a second unsupervised learning CNN are jointly trained with the first input image dataset 220 and the second input image dataset 222 to learn compressed representations of the input image datasets 220, 222, where the compressed representations include one or more common feature primitives and corresponding mapping functions.
- the one or more feature primitives characterize aspects of the images of the first input image dataset 220. It may be noted that the mapping functions map the input image dataset to the corresponding feature primitive. In one embodiment, the mapping function may be defined in accordance with equation (1).
- HCT = f(PCT, w) (1)
- where HCT is a set of feature primitives obtained when a region of interest of an image PCT is mapped using a mapping function f and weights w.
- the image PCT corresponds to an image obtained via use of a CT imaging system.
- one or more mapping functions corresponding to the first imaging modality and the second imaging modality are generated. It may be noted that these mapping functions map the first and second input image datasets 220, 222 to the same feature primitives.
- a second mapping function may be defined in accordance with equation (2).
- HMR = f(PMR, w) (2)
- where HMR is a set of feature primitives obtained when a region of interest of an image PMR is mapped using a mapping function f and weights w.
- the image PMR is obtained using an MR imaging system.
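The relationship expressed by equations (1) and (2), two modality-specific mappings producing the same feature primitives, can be sketched numerically. In this illustration, linear mappings stand in for CNN filters and a shared synthetic latent array stands in for the common feature primitives; none of the shapes or values come from the specification.

```python
import numpy as np

rng = np.random.default_rng(1)

# Modality-specific ROIs of the same underlying anatomy (synthetic):
# a shared latent "anatomy" is rendered differently per modality.
latent = rng.uniform(size=(50, 4))            # common feature primitives H
roi_ct = latent @ rng.uniform(size=(4, 6))    # PCT: CT rendering
roi_mr = latent @ rng.uniform(size=(4, 6))    # PMR: MR rendering

# Learn modality-specific weights so that f(PCT, w_ct) and f(PMR, w_mr)
# both recover the same primitives H (here via least squares).
w_ct, *_ = np.linalg.lstsq(roi_ct, latent, rcond=None)
w_mr, *_ = np.linalg.lstsq(roi_mr, latent, rcond=None)

h_ct = roi_ct @ w_ct  # HCT = f(PCT, w_ct)
h_mr = roi_mr @ w_mr  # HMR = f(PMR, w_mr)
```

Although the CT and MR renderings differ, both learned mappings land on the same primitive representation, which is what allows the primitives to be reused across modalities.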
- At step 218, at least the one or more feature primitives and the corresponding mapping functions are stored in the feature primitive repository 118. Control is then passed to step 212 to terminate the method 200.
- Referring now to FIG. 3, a flowchart 300 generally representative of a method for building a set of feature primitives from labeled image data to augment a feature primitive repository, in accordance with aspects of the present specification, is presented. More particularly, the method 300 describes step 210 of FIG. 2 in greater detail. Also, the method 300 is described with reference to the components of FIG. 1.
- the flowchart 300 illustrates the main steps of the method for building a set of feature primitives from labeled image data to augment a feature primitive repository.
- steps 302-308 may be performed by the processor unit 108 in conjunction with memory unit 110 and the ILNC 112 of the IRLT unit 104.
- steps 310-314 may be performed by processor unit 108 in conjunction with memory unit 110 and the one or more supervised learning CNNs 116.
- the method 300 starts at step 302 where at least a first input image dataset 316 and a second input image dataset 318 corresponding to at least a first imaging modality and a second imaging modality are obtained.
- the first and second input image datasets 316, 318 may correspond to an anatomy of the human body, for example, liver, lung, kidney, heart, brain, stomach, and the like.
- the first and second imaging modalities may include an ultrasound imaging system, an MR imaging system, a CT imaging system, a PET-CT imaging system, or combinations thereof.
- a learning outcome of the one or more supervised CNNs may be a classification of images in the first and second input image datasets 316, 318.
- At step 304, a check is carried out to determine whether the first input image dataset 316 and the second input image dataset 318 include sufficient data to adequately train one or more CNNs. If, at step 304, it is determined that the first and second input image datasets 316, 318 include sufficient data, control is passed to step 308. However, if it is determined that the first and second input image datasets 316, 318 do not include sufficient data, control is passed to step 306. In one example, it may be determined at step 304 that the first input image dataset 316 includes sufficient data while the second input image dataset 318 does not. Accordingly, in this example, at step 306, the second input image dataset 318 corresponding to the second imaging modality is augmented with additional data.
- the additional data is obtained by processing the first input image dataset 316 corresponding to the first imaging modality via an intensity mapping function.
- the intensity mapping function may include a regression framework that takes multi-modality acquisitions via use of a first imaging modality and a second imaging modality corresponding to one or more subjects, for example, CT and MR, and learns a patch level mapping from the first modality to the second modality, using deep learning, hand crafted intensity features, or a combination thereof to map the intensities of the first modality to the second modality. Subsequently, control is passed to step 308.
- a first supervised learning CNN and a second supervised learning CNN are jointly trained based on labels associated with the first input image dataset 316 and labels associated with the second input image dataset 318 to generate one or more feature primitives and corresponding mapping functions.
- the learning outcome may include one or more feature primitives that characterize aspects of the images of the first input image dataset 316 and aspects of the images of the second input image dataset 318 and corresponding mapping functions where the mapping functions map the corresponding first input image dataset 316 and the second input image dataset 318 to the one or more feature primitives.
- the feature primitives are independent of the imaging modality used to acquire the first input image dataset 316 and the second input image dataset 318.
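Joint supervised training can be sketched with a deliberately simplified stand-in: two labelled modality datasets training one shared weight vector via logistic regression, where the shared weights play the role of the common, modality-independent feature primitives. The data is synthetic and the model is far simpler than a CNN; only the joint-training structure (one combined update driven by both labelled sets) reflects the step described above.

```python
import numpy as np

rng = np.random.default_rng(2)

# Labelled datasets from two modalities; both label the same anatomy aspect.
x_ct = rng.normal(size=(100, 3))
x_mr = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y_ct = (x_ct @ w_true > 0).astype(float)
y_mr = (x_mr @ w_true > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Jointly train a single shared weight vector on both labelled sets:
# each pass applies a gradient step from each modality's supervised loss.
w = np.zeros(3)
lr = 0.5
for _ in range(500):
    for x, y in ((x_ct, y_ct), (x_mr, y_mr)):
        grad = x.T @ (sigmoid(x @ w) - y) / len(y)
        w -= lr * grad

acc_ct = np.mean((sigmoid(x_ct @ w) > 0.5) == (y_ct == 1))
acc_mr = np.mean((sigmoid(x_mr @ w) > 0.5) == (y_mr == 1))
```

Because the same parameters must explain labels from both modalities, the training pressure favors a representation that neither modality can claim exclusively, which is the sense in which the resulting primitives are modality-independent.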
- the one or more feature primitives and the corresponding mapping functions are stored in a feature primitive repository.
- the methods 200 and 300 described hereinabove enable the creation of a portfolio of feature primitives and mapping functions corresponding to images generated for a plurality of anatomies across a plurality of modalities.
- the feature primitives and the mapping functions are stored in the feature primitive repository 118.
- this portfolio of feature primitives and mapping functions characterizes the learning gained in the training of the CNNs with the input image datasets.
- the learning may be transferred to pre-configure new CNNs to obtain learning outcomes for different, unseen datasets.
- FIG. 4 illustrates a flowchart 400 depicting a method for pre-configuring a CNN with mapping functions to learn an unseen dataset based on a selection of feature primitives, in accordance with aspects of the present specification.
- the method 400 is described with reference to FIGs. 1, 2 and 3. It may be noted that the flowchart 400 illustrates the main steps of the method 400 for pre-configuring a CNN with mapping functions to learn an unseen dataset based on a selection of feature primitives.
- steps 402-406 may be performed by the processor unit 108 in conjunction with the memory unit 110 and the ILNC 112 of the IRLT unit 104.
- steps 408-410 may be performed by the processor unit 108 in conjunction with the memory unit 110 and the one or more supervised learning CNNs 116.
- the method 400 starts at step 402, where at least one input image dataset 404 may be obtained.
- the at least one input image dataset 404 is representative of an unseen input image dataset.
- at least one learning parameter 406 and a learning outcome 408 corresponding to the input image dataset 404 may be obtained.
- the input image dataset 404, the learning parameter 406, and the learning outcome 408 may be obtained as user input 410.
- the learning parameter 406 may include an imaging modality, an image anatomy, or a combination thereof.
- the learning outcome 408 may include a classification scheme, a regression scheme, or a pixel-level output such as segmentation.
- At step 412, at least one feature primitive and a corresponding mapping function, each corresponding to the learning parameter 406 and the learning outcome 408, are obtained from the feature primitive repository 118.
- a CNN is configured for learning the input image dataset 404 using the at least one feature primitive and the at least one mapping function, as indicated by step 414.
- the configuration of the CNN may entail setting one or more filters of the CNN to the mapping functions obtained from the feature primitive repository 118.
- a pre-configured CNN is generated. Further, at step 416, the pre-configured CNN is optimized with at least a training subset of the input image dataset 404.
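Steps 414 and 416 can be sketched with a toy two-stage model: a frozen projection standing in for the filters pulled from the repository, and a small trainable head fitted on the training subset. The layer sizes, the synthetic data, and the least-squares fit are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Pre-learned mapping functions (stand-in for repository filters): they
# project raw inputs onto feature primitives and stay frozen during tuning.
W_frozen = rng.normal(size=(16, 4))

# Training subset of the unseen dataset: inputs and target outcomes that
# happen to be expressible in the frozen feature space.
X = rng.normal(size=(100, 16))
true_head = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ W_frozen @ true_head

# Only the small output head is optimized (step 416); the frozen layer
# supplies the representation, so few parameters remain to fit.
features = X @ W_frozen                     # frozen forward pass
head, *_ = np.linalg.lstsq(features, y, rcond=None)
```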
- a trained convolutional autoencoder (CAE) for supervised learning that uses labelled data corresponding to a first imaging modality is adapted for the input image dataset 404, with a few parameters, in accordance with equation (3): H_1 = f(P_1, w(α, w))
- in equation (3), H_1 is a set of feature primitives obtained when a region of interest of an image P_1 corresponding to the first imaging modality is mapped using a mapping function f and weights w(α, w), where α is a sparse set of the CAE parameters.
- the CAE parameters may be further optimized with at least the training subset of the input image dataset 404.
- the framework for learning may be defined in accordance with the following formulation: the mapping function f obtained in equation (3) is applied over the mapping function f_2 corresponding to the region of interest of an image corresponding to the second imaging modality.
- the input image dataset 404 is processed via the optimized CNN to obtain a learning outcome 420, corresponding to the requested learning outcome 408.
- FIG. 5 is a schematic diagram 500 of one embodiment of the interactive learning network configurator 112 of the interactive representation learning transfer unit 104 of FIG. 1, in accordance with aspects of the present specification.
- the block diagram 500 is generally representative of the ILNC 112 as shown in FIG. 1.
- Reference numerals 502-508 are generally representative of visualizations of feature primitives respectively corresponding to an imaging modality, anatomy, appearance, and shape geometry.
- the data for the visualizations 502-508 may be obtained from the feature primitive repository 118 of FIG. 1.
- the ILNC 112 provides a user with a selection of interactive menus.
- the user may select one or more aspects of an unseen image dataset to be learned by a CNN.
- Reference numerals 510-516 are generally representative of interactive menu options that may be available to a user to aid in the characterization of the unseen image dataset.
- reference numeral 510 may pertain to the imaging modality of the unseen image dataset.
- the menu options of block 510 may include CT, MR, PET, ultrasound, and the like.
- reference numerals 512-516 may show menu options pertaining to the appearance, shape geometry, and anatomy respectively of the unseen image dataset.
- Reference numeral 518 is generally representative of a visualization of the pre-configured CNN.
- reference numeral 518 may correspond to one or more supervised learning CNNs 116 of the IRLT unit 104 of FIG. 1.
- the user may graphically browse, visualize, and combine feature primitives of blocks 502-508 to create the pre-configured CNN 518.
- the systems and methods for interactive representation learning transfer through deep learning of feature ontologies presented hereinabove provide a transfer learning paradigm where a portfolio of learned feature primitives and mapping functions may be combined to configure CNNs to solve new medical image analytics problems.
- the CNNs may be trained for learning appearance and morphology of images.
- tumors may be classified into blobs, cylinders, disks, bright/dark or banded, and the like.
- networks may be trained for various combinations of anatomy, modality, appearance, and morphology (shape geometry) to generate a rich portfolio configured to immediately provide a transference of pre-learnt features to a new problem at hand.
Abstract
A method for interactive representation learning transfer to a convolutional neural network (CNN) is presented. The method includes obtaining at least first and second input image datasets from first and second imaging modalities. Furthermore, the method includes performing at least one of: jointly training a first supervised learning CNN based on labels associated with the first input image dataset and a second supervised learning CNN based on labels associated with the second input image dataset to generate one or more common feature primitives and corresponding mapping functions; and jointly training a first unsupervised learning CNN and a second unsupervised learning CNN with the first and second input image datasets, respectively, to learn compressed representations of the input image datasets, including common feature primitives and corresponding mapping functions. The method further includes storing the common feature primitives and the corresponding mapping functions in a feature primitive repository.
Description
SYSTEM AND METHOD FOR INTERACTIVE REPRESENTATION LEARNING TRANSFER THROUGH DEEP LEARNING OF FEATURE
ONTOLOGIES
BACKGROUND
[0001] Embodiments of the present specification relate generally to a system and method for generating transferable representation learnings of medical image data of varied anatomies obtained from varied imaging modalities for use in learning networks. Specifically, the system and method are directed to determining a representation learning of the medical image data as a set of feature primitives based on the physics of a first and/or second imaging modality and the biology of the anatomy to configure new convolutional networks for learning problems such as classification and segmentation of the medical image data from other imaging modalities.
[0002] As will be appreciated, machine learning is the subfield of computer science that "gives computers the ability to learn without being explicitly programmed." Machine learning explores the study and construction of algorithms that can learn from and make predictions on data. In machine learning, feature learning or representation learning is a set of techniques that transform raw data input into a representation that can be effectively exploited in machine learning tasks. Representation learning is motivated by the fact that machine learning tasks such as classification often require input that is mathematically and computationally convenient to process. However, real-world data such as images, video, and sensor measurement is usually complex, redundant, and highly variable. Thus, it is desirable to identify useful features or representations from raw data. Currently, manual feature identification methods require expensive human labor and rely on expert knowledge. Also, manually generated representations normally do not lend themselves well to generalization,
thereby motivating the design of efficient representation learning techniques to automate and generalize feature or representation learning.
[0003] Moreover, in machine learning, a convolutional neural network (CNN or ConvNet) or a deep convolutional neural network is a type of feed-forward artificial neural network in which the connectivity pattern between its neurons is inspired by the organization of the animal visual cortex. Convolutional neural networks (CNNs) are biologically inspired variants of multilayer perceptrons, designed to emulate the behavior of a visual cortex with minimal amounts of preprocessing. It may be noted that a multilayer perceptron network (MLP) is a feed-forward artificial neural network model that maps sets of input data onto a set of appropriate outputs. An MLP includes multiple layers of nodes in a directed graph, with each layer fully connected to the next one. In current deployments, CNNs have wide applications in image and video recognition, recommender systems, and natural language processing. CNNs are also known as shift invariant or space invariant artificial neural networks (SIANNs) based on their shared weights architecture and translation invariance characteristics.
[0004] It may be noted that in deep CNN architectures, a convolutional layer is the core building block. Parameters associated with a convolutional layer include a set of learnable filters or kernels. During a forward pass, each filter is convolved across the width and height of the input volume, computing the dot product between the entries of the filter and the input and producing a two-dimensional (2D) activation map of that filter. Thus, the network learns filters that are activated when the network detects a specific type of feature at a given spatial position in the input. Deep CNN architectures are typically purpose-built for deep learning problems. Deep learning is known to be effective in many tasks that involve human-like perception and decision making. Typical applications of deep learning are handwriting recognition, image recognition, and speech recognition. Deep learning techniques construct a network model by using training data and outcomes that correspond to the data. Once the network model is
constructed, it can be used on new data for determining outcomes. Moreover, it will be appreciated that a deep learning network once learnt for a specific outcome may be reused advantageously for related outcomes.
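The convolutional forward pass described in paragraph [0004], sliding a filter over the input and taking a dot product at each spatial position to build an activation map, can be sketched directly; the 5x5 input and the vertical-edge filter are illustrative choices:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Convolve kernel over image with 'valid' padding and stride 1,
    taking the dot product at each spatial position."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical edge in the input strongly activates a vertical-edge filter,
# illustrating a filter firing at a specific spatial feature.
image = np.zeros((5, 5))
image[:, 2:] = 1.0                          # dark left half, bright right half
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)   # simple vertical-edge detector
activation_map = conv2d_valid(image, kernel)
```

Positions whose window straddles the edge respond with the maximal value, while windows over uniform regions give zero response.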
[0005] Furthermore, CNNs may be used for supervised learning, where an input dataset is labelled, as well as for unsupervised learning, where the input dataset is not labelled. A labelled input dataset is one where the elements of the dataset are pre-associated with a classification scheme, represented by labels. Thus, the CNN is trained with a labelled subset of the dataset, and may be tested with another subset to verify an accurate classification result.
[0006] A deep CNN architecture is multi-layered, where the layers are hierarchically connected. As the number of input to output mappings and filters for each layer increases, a multi-layer deep CNN may result in a huge number of parameters that need to be configured for its operation. If training data for such a CNN is scarce, the learning problem is under-determined. In this situation, it is advantageous to transfer certain parameters from a pre-learned CNN model. Transfer learning reduces the number of parameters to be optimized by freezing the pre-learned parameters in a subset of layers and provides a good initialization for tuning the remaining layers. In the medical image problem domain, using transfer learning to pre-configure CNNs for various problems of classification and identification is advantageous in ameliorating the situations where data is scarce, the challenges of heterogenous data types due to the plurality of imaging modalities and anatomies, and other clinical challenges.
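The parameter-count argument in paragraph [0006] can be made concrete with a back-of-the-envelope calculation; the layer sizes below are arbitrary illustrations, not taken from the specification:

```python
# Each (n_in, n_out) pair is a fully connected layer; parameters are
# weights plus biases.
layers = [(4096, 512), (512, 128), (128, 10)]
params = [n_in * n_out + n_out for n_in, n_out in layers]
total = sum(params)

# Transfer learning: freeze the two pre-learned lower layers and tune only
# the final layer, leaving a far smaller optimization problem for the
# scarce training data.
trainable_after_freeze = params[-1]
reduction = 1.0 - trainable_after_freeze / total
```

Over 99.9% of the parameters are supplied by the pre-learned model, which is why freezing them turns an under-determined learning problem into a tractable one.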
BRIEF DESCRIPTION
[0007] In accordance with one aspect of the present specification, a method for interactive representation learning transfer to a convolutional neural network (CNN) is presented. The method includes obtaining at least a first input image
dataset from a first imaging modality and a second input image dataset from a second imaging modality. Furthermore, the method includes performing at least one of jointly training a first supervised learning CNN based on labels associated with the first input image dataset and a second supervised learning CNN based on labels associated with the second input image dataset to generate one or more common feature primitives and corresponding mapping functions and jointly training a first unsupervised learning CNN with the first input image dataset and a second unsupervised learning CNN with the second input image dataset to learn compressed representations of the input image datasets, where the compressed representation includes one or more common feature primitives and corresponding mapping functions. Additionally, the method includes storing at least the one or more common feature primitives and the corresponding mapping functions in a feature primitive repository.
[0008] In accordance with another aspect of the present specification, an interactive representation learning transfer (IRLT) unit for interactive representation learning transfer to a convolutional neural network (CNN) is presented. The IRLT unit includes an interactive learning network configurator configured to obtain at least a first input image dataset from a first imaging modality and a second input image dataset from a second imaging modality and perform at least one of jointly training a first supervised learning CNN based on labels associated with the first input image dataset and a second supervised learning CNN based on labels associated with the second input image dataset to generate one or more common feature primitives and corresponding mapping functions and jointly training a first unsupervised learning CNN with the first input image dataset and a second unsupervised learning CNN with the second input image dataset to learn compressed representations of the input image datasets, where the compressed representation includes one or more common feature primitives and corresponding mapping functions. Moreover, the IRLT unit includes a feature primitive repository configured to store the at least one or more common feature primitives and the corresponding mapping functions in a feature primitive repository.
[0009] In accordance with yet another aspect of the present specification, a multimodality transfer learning system, is presented. The multimodality transfer learning system includes a processor unit and a memory unit communicatively operatively coupled to the processor unit. Furthermore, the multimodality transfer learning system includes an interactive representation learning transfer (IRLT) unit operatively coupled to the processor unit and including an interactive learning network configurator configured to obtain at least a first input image dataset from a first imaging modality and a second input image dataset from a second imaging modality and perform at least one of jointly training a first supervised learning CNN based on labels associated with the first input image dataset and a second supervised learning CNN based on labels associated with the second input image dataset to generate one or more common feature primitives and corresponding mapping functions and jointly training a first unsupervised learning CNN with the first input image dataset and a second unsupervised learning CNN with the second input image dataset to learn compressed representations of the input image datasets, where the compressed representation includes one or more common feature primitives and corresponding mapping functions. In addition, the IRLT unit includes a feature primitive repository configured to store the at least one or more common feature primitives and the corresponding mapping functions.
DRAWINGS
[0010] These and other features and aspects of embodiments of the present specification will become better understood when the following detailed description is read with references to the accompanying drawings in which like characters represent like parts throughout the drawings, wherein:
[0011] FIG. 1 is a schematic diagram of an exemplary system for interactive representation learning transfer through deep learning of feature ontologies, in accordance with aspects of the present specification;
[0012] FIG. 2 is a flowchart illustrating a method for building a set of feature primitives from unlabeled image data to augment a feature primitive repository, in accordance with aspects of the present specification;
[0013] FIG. 3 is a flowchart illustrating a method for building a set of feature primitives from labeled image data to augment a feature primitive repository, in accordance with aspects of the present specification;
[0014] FIG. 4 is a flowchart illustrating a method for pre-configuring a CNN with mapping functions to learn an unseen dataset based on a selection of feature primitives, in accordance with aspects of the present specification; and
[0015] FIG. 5 is a schematic diagram of one embodiment of an interactive learning network configurator, in accordance with aspects of the present specification.
DETAILED DESCRIPTION
[0016] As will be described in detail hereinafter, various embodiments of an exemplary system and method for interactive representation learning transfer through deep learning of feature ontologies are presented. In an effort to provide a concise description of these embodiments, all features of an actual implementation may not be described in the specification. It should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation- specific decisions must be made to achieve the developer's specific goals such as compliance with system-related and business-related constraints.
[0017] When describing elements of the various embodiments of the present specification, the articles "a," "an," "the," and "said" are intended to mean that there are one or more of the elements. The terms "comprising," "including" and "having" are intended to be inclusive and mean that there may be additional elements other than the listed elements. Furthermore, the terms "build" and "construct" and their variations
are intended to mean a mathematical determination or computation of mathematical constructs. The terms "data drawn on arbitrary domains" or "data on arbitrary domains" are intended to mean data corresponding to a domain for example, social media data, sensor data, enterprise data and the like.
[0018] The term "transfer learning" or "inductive transfer" as used in the present specification is intended to mean an approach of machine learning that focuses on applying the knowledge or learning gained while solving one problem to a different, related problem. This knowledge or learning may be characterized and/or represented in various ways, typically by combinations and variations of transfer functions, mapping functions, graphs, matrices, and other primitives. Also, the term "transfer learning primitive" as used in the present specification is intended to mean the characterization and/or representation of knowledge or learning gained by solving a machine learning problem as described hereinabove.
[0019] Moreover, the term "feature primitive" as used in the present specification is intended to mean a characterization of an aspect of an input dataset, typically, an appearance, shape geometry, or morphology of a region of interest (ROI) corresponding to an image of an anatomy, where the image may be obtained from an imaging modality such as an ultrasound imaging system, a computed tomography (CT) imaging system, a positron emission tomography-CT (PET- CT) imaging system, a magnetic resonance (MR) imaging system, and the like. Also, the anatomy may include an internal organ of the human body such as, but not limited to, the lung, liver, kidney, stomach, heart, brain, and the like.
[0020] An exemplary system 100 for interactive representation learning transfer through deep learning of feature ontologies, in accordance with aspects of the present specification, is illustrated in FIG. 1. In a presently contemplated configuration of FIG. 1, the system 100 includes a multimodality transfer learning (MTL) subsystem 102. The MTL subsystem 102 includes an interactive representation learning transfer
(IRLT) unit 104, a processor unit 108, a memory unit 110, and a user interface 106. The processor unit 108 is communicatively coupled to the memory unit 110. The user interface 106 is operatively coupled to the IRLT unit 104. Also, the IRLT unit 104 is operatively coupled to the processor unit 108 and the memory unit 110. The system 100 and/or the MTL subsystem 102 may include a display unit 134. It may be noted that the MTL subsystem 102 may include other components or hardware, and is not limited to the components shown in FIG. 1.
[0021] The user interface 106 is configured to receive user input 130 corresponding to characteristics of an input image 128. The user input 130 may include aspects or characteristics of the input image 128, such as, but not limited to, the imaging modality, the anatomy generally represented by the input image 128, the appearance of the ROI corresponding to the input image 128, and the like.
[0022] In certain embodiments, the IRLT unit 104 may be implemented as software systems or computer instructions executable via one or more processor units 108 and stored in the memory unit 110. In other embodiments, the IRLT unit 104 may be implemented as a hardware system, for example, via FPGAs, custom chips, integrated circuits (ICs), Application Specific ICs (ASICs), and the like.
[0023] As illustrated in FIG. 1, the IRLT unit 104 may include an interactive learning network configurator (ILNC) 112, one or more CNNs configured for unsupervised learning (unsupervised learning CNN) 114, one or more CNNs configured for supervised learning (supervised learning CNN) 116, and a feature primitive repository 118. The ILNC 112 is operatively coupled to the unsupervised learning CNNs 114, the supervised learning CNNs 116, and the feature primitive repository 118. In one embodiment, the ILNC 112 may be a graphical user interface subsystem configured to enable a user to configure one or more supervised learning CNNs 116 or unsupervised learning CNNs 114.
[0024] The feature primitive repository 118 is configured to store one or more feature primitives corresponding to an ROI in the input image 128 and one or more corresponding mapping functions. The term "mapping function" as used in the present specification is intended to represent a transfer function or a CNN filter that maps the ROI to a compressed representation such that an output of the CNN is a feature primitive that characterizes the ROI of the input image 128 based on an aspect of the ROI. In one example, the aspect of the ROI may include a shape geometry, an appearance, a morphology, and the like. Accordingly, the feature primitive repository 118 is configured to store one or more mapping functions. These mapping functions, in conjunction with the corresponding feature primitives, may be used to pre-configure a CNN to learn a new training set. Thus, the feature primitive repository 118 is configured to store feature primitives and mapping functions which are transfer learnings obtained from other CNNs to pre-configure a new CNN to learn an unseen dataset. As shown in FIG. 1, some non-limiting examples of feature primitives that the feature primitive repository 118 is configured to store include feature primitives characterizing an appearance 120 corresponding to the ROI of the image 128, a shape geometry 124 corresponding to the ROI of the image 128, and an anatomy 126 corresponding to the image 128.
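One way to picture a mapping function is as a small filter bank followed by global pooling, compressing an ROI into a short feature vector; the two toy filters below (overall brightness and left/right contrast) are invented for illustration:

```python
import numpy as np

def map_roi_to_primitive(roi, filters):
    """Apply each filter to the ROI and globally average the response,
    compressing the patch into one scalar per filter."""
    return np.array([float(np.mean(roi * f)) for f in filters])

# Two toy 4x4 filters: overall brightness, and a left/right contrast
# pattern that responds to lateral appearance changes.
filters = [
    np.ones((4, 4)),
    np.hstack([-np.ones((4, 2)), np.ones((4, 2))]),
]
bright_roi = np.ones((4, 4))
primitive = map_roi_to_primitive(bright_roi, filters)
```

A uniformly bright ROI yields a high brightness response and zero contrast response, so the two-element vector is a compressed characterization of the ROI's appearance.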
[0025] In the presently contemplated configuration of FIG. 1, the ILNC 112 is configured to present a user with various tools and options to interactively characterize one or more aspects of the ROI corresponding to the input image 128. An exemplary embodiment of the ILNC 112 is shown in FIG. 5 and the working of the ILNC 112 will be described in greater detail with reference to FIGs. 4 and 5.
[0026] The system 100 is configured to develop a portfolio of feature primitives and mapping functions across one or more anatomies and imaging modalities to be stored in the feature primitive repository 118. Subsequently, the system 100 may provide the user with an ILNC 112 to allow the user to pre-configure a CNN for
learning a new, unseen image dataset based on a selection of modality, anatomy, shape geometry, morphology, and the like corresponding to one or more ROIs of the unseen image dataset. By way of example, one learning outcome of the CNN may be a classification scheme categorizing the image dataset. Other non-limiting examples of learning outcomes may include pixel-level segmentation, regression, and the like. The working of the system 100 will be described in greater detail with reference to FIGs. 2-5.
[0027] Additionally, the system 100 and/or the MTL subsystem 102 may be configured to visualize one or more of the feature primitives, the mapping functions, the image datasets, and the like on the display unit 134.
[0028] Turning now to FIG. 2, a flowchart 200 generally representative of a method for building a set of feature primitives from unlabeled image data to augment a feature primitive repository, in accordance with aspects of the present specification, is presented. The method 200 is described with reference to the components of FIG. 1.
[0029] It may be noted that the flowchart 200 illustrates the main steps of the method for building a set of feature primitives from unlabeled image data to augment a feature primitive repository such as the feature primitive repository 118. In some embodiments, various steps of the method 200 of FIG. 2, more particularly, steps 202-208 may be performed by the processor unit 108 in conjunction with the memory unit 110 and the ILNC 112 of the IRLT unit 104. Moreover, step 210 may be performed by the processor unit 108 in conjunction with the memory unit 110 and one or more supervised learning CNNs 116. Also, steps 214-216 may be performed by the processor unit 108 in conjunction with the memory unit 110 and one or more unsupervised learning CNNs 114.
[0030] The method 200 starts at step 202 where at least a first input image dataset 220 and a second input image dataset 222 corresponding to a first imaging modality
and a second imaging modality are obtained. In one embodiment, the first and second input image datasets 220, 222 may correspond to an anatomy of the human body, such as the liver, lung, kidney, heart, brain, stomach, and the like. Also, the first and second imaging modalities may include an ultrasound imaging system, an MR imaging system, a CT imaging system, a PET-CT imaging system, or combinations thereof.
[0031] At step 204, a check is carried out to determine whether the input image datasets 220, 222 are labelled. As previously noted, labels referencing the input image datasets may be generally representative of a classification scheme or score that characterizes an aspect of the input images, such as the shape geometry, appearance, morphology, and the like. If, at step 204, it is determined that the input image datasets are labelled, control passes to step 210, where a first CNN and a second CNN are configured for supervised learning of the input image datasets 220, 222. Step 210 will be described in greater detail with reference to FIG. 3. Subsequent to step 210, the method 200 is terminated, as indicated by step 212.
[0032] Referring again to step 204, if it is determined that the input image datasets are unlabeled, control passes to step 206, where a second check is carried out to determine if the first input image dataset 220 and the second input image dataset 222 include sufficient data to adequately train one or more CNNs. If, at step 206, it is determined that the first and second input image datasets 220, 222 include sufficient data, control passes to step 214. However, if, at step 206, it is determined that the first and second input image datasets 220, 222 do not include sufficient data, control passes to step 208.
[0033] In one example, at step 208, it may be determined that the first input image dataset 220 includes sufficient data and the second input image dataset 222 does not include sufficient data. Accordingly, in this example, at step 208, the second input dataset 222 corresponding to the second imaging modality is augmented with additional data. The additional data for augmenting the second input image dataset
222 may be obtained by processing the first input image dataset 220 corresponding to the first imaging modality via an intensity mapping function. One non-limiting example of the intensity mapping function may include a regression framework that takes multi-modality acquisitions, via use of a first imaging modality and a second imaging modality, corresponding to one or more subjects, for example, CT and MR, and learns a patch-level mapping from the first modality to the second modality, using deep learning, hand-crafted intensity features, or a combination thereof to map the intensities of the first modality to the second modality. Subsequently, control passes to step 214.
[0034] At step 214, a first unsupervised learning CNN and a second unsupervised learning CNN are jointly trained with the first input image dataset 220 and the second input image dataset 222 to learn compressed representations of the input image datasets 220, 222, where the compressed representations include one or more common feature primitives and corresponding mapping functions.
[0035] The one or more feature primitives characterize aspects of the images of the first input image dataset 220. It may be noted that the mapping functions map the input image dataset to the corresponding feature primitive. In one embodiment, the mapping function may be defined in accordance with equation (1):

H_CT = f(P_CT, w)   (1)

[0036] In equation (1), H_CT is a set of feature primitives obtained when a region of interest of an image P_CT is mapped using a mapping function f and weights w. In this example, the image P_CT corresponds to an image obtained via use of a CT imaging system.
[0037] It may be noted that, consequent to step 214, one or more mapping functions corresponding to the first imaging modality and the second imaging modality are generated. These mapping functions map the first and second input image datasets 220, 222 to the same feature primitives. In one embodiment, a second mapping function may be defined in accordance with equation (2):

H_MR = f(P_MR, w)   (2)

[0038] In equation (2), H_MR is a set of feature primitives obtained when a region of interest of an image P_MR is mapped using a mapping function f and weights w. In this example, the image P_MR is obtained using an MR imaging system.
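Equations (1) and (2) can be illustrated with a toy pair of mapping functions that send patches from two modalities to the same feature primitive; the affine intensity relation between the modalities and the mean-intensity primitive are invented for this sketch:

```python
import numpy as np

rng = np.random.default_rng(2)

# The same anatomy imaged twice: assume, for illustration only, that the
# second modality's intensities relate to the first by a known affine map.
p_ct = rng.uniform(0.0, 1.0, size=(8, 8))
p_mr = 2.0 * p_ct + 1.0

# Modality-specific mapping functions landing in a shared primitive space;
# here the primitive is simply the normalised mean intensity of the ROI.
def f_ct(p):
    return float(np.mean(p))

def f_mr(p):
    return float(np.mean((p - 1.0) / 2.0))

h_ct = f_ct(p_ct)   # plays the role of H_CT in equation (1)
h_mr = f_mr(p_mr)   # plays the role of H_MR in equation (2)
```

Both modalities land on the same primitive value, which is the point of jointly learning the mapping functions: the feature primitives are modality-independent.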
[0039] Furthermore, at step 218, at least the one or more feature primitives and the corresponding mapping functions are stored in the feature primitive repository 118. Control is then passed to step 212 to terminate the method 200.
[0040] Turning now to FIG. 3, a flowchart 300 generally representative of a method for building a set of feature primitives from labeled image data to augment a feature primitive repository, in accordance with aspects of the present specification, is presented. More particularly, the method 300 describes step 210 of FIG. 2 in greater detail. Also, the method 300 is described with reference to the components of FIG. 1.
[0041] It may be noted that the flowchart 300 illustrates the main steps of the method for building a set of feature primitives from labeled image data to augment a feature primitive repository. In some embodiments, various steps of the method 300 of FIG. 3, more particularly, steps 302-308, may be performed by the processor unit 108 in conjunction with the memory unit 110 and the ILNC 112 of the IRLT unit 104. Moreover, steps 310-314 may be performed by the processor unit 108 in conjunction with the memory unit 110 and the one or more supervised learning CNNs 116.
[0042] The method 300 starts at step 302 where at least a first input image dataset 316 and a second input image dataset 318 corresponding to at least a first imaging modality and a second imaging modality are obtained. In one embodiment, the first and second input image datasets 316, 318 may correspond to an anatomy of the human body, for example, liver, lung, kidney, heart, brain, stomach, and the like. Also, the first and second imaging modalities may include an ultrasound imaging system, an MR imaging system, a CT imaging system, a PET-CT imaging system, or combinations thereof. It may be noted that a learning outcome of the one or more supervised CNNs may be a classification of images in the first and second input image datasets 316, 318.
[0043] At step 304, a check is carried out to determine if the first input image dataset 316 and the second input image dataset 318 include sufficient data to adequately train one or more CNNs. If, at step 304, it is determined that the first and second input image datasets 316, 318 include sufficient data, control is passed to step 308. However, if, at step 304, it is determined that the first and second input image datasets 316, 318 do not include sufficient data, control is passed to step 306. In one example, it may be determined that the first input image dataset 316 includes sufficient data while the second input image dataset 318 does not. Accordingly, in this example, at step 306, the second input image dataset 318 corresponding to the second imaging modality is augmented with additional data. It may be noted that the additional data is obtained by processing the first input image dataset 316 corresponding to the first imaging modality via an intensity mapping function. As previously noted, one non-limiting example of the intensity mapping function may include a regression framework that takes multi-modality acquisitions of one or more subjects via a first imaging modality and a second imaging modality, for example, CT and MR, and learns a patch level mapping from the first modality to the second modality, using deep learning, hand crafted intensity features, or a combination thereof, to map the intensities of the first modality to the second modality. Subsequently, control is passed to step 308.
[0044] Furthermore, at step 308, a first supervised learning CNN and a second supervised learning CNN are jointly trained based on labels associated with the first input image dataset 316 and labels associated with the second input image dataset 318 to generate one or more feature primitives and corresponding mapping functions.
[0045] In one embodiment, the learning outcome may include one or more feature primitives that characterize aspects of the images of the first input image dataset 316 and aspects of the images of the second input image dataset 318 and corresponding mapping functions where the mapping functions map the corresponding first input image dataset 316 and the second input image dataset 318 to the one or more feature primitives. Thus, the feature primitives are independent of the imaging modality used to acquire the first input image dataset 316 and the second input image dataset 318. Referring now to step 312, the one or more feature primitives and the corresponding mapping functions are stored in a feature primitive repository.
[0046] The methods 200 and 300 described hereinabove enable the creation of a portfolio of feature primitives and mapping functions corresponding to images generated for a plurality of anatomies across a plurality of modalities. The feature primitives and the mapping functions are stored in the feature primitive repository 118. Also, this portfolio of feature primitives and mapping functions characterizes the learning gained in the training of the CNNs with the input image datasets. Moreover, the learning may be transferred to pre-configure new CNNs to obtain learning outcomes for different, unseen datasets.
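The feature primitive repository 118 can be pictured as a keyed store of primitives and their mapping functions. The sketch below is a toy stand-in; the key structure and names are assumptions, not the patent's implementation:

```python
from dataclasses import dataclass, field

@dataclass
class FeaturePrimitiveRepository:
    """Toy stand-in for the feature primitive repository: primitives and
    their mapping functions, keyed by learning parameters such as the
    imaging modality and anatomy."""
    entries: dict = field(default_factory=dict)

    def store(self, modality, anatomy, primitives, mapping_fn):
        # Persist the learning gained in training for later transfer.
        self.entries[(modality, anatomy)] = (primitives, mapping_fn)

    def lookup(self, modality, anatomy):
        # Retrieve stored learning to pre-configure a new network.
        return self.entries[(modality, anatomy)]
```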
[0047] With the foregoing in mind, FIG. 4 illustrates a flowchart 400 depicting a method for pre-configuring a CNN with mapping functions to learn an unseen dataset based on a selection of feature primitives, in accordance with aspects of the present specification. The method 400 is described with reference to FIGs. 1, 2 and 3. It may be noted that the flowchart 400 illustrates the main steps of the method 400 for pre-configuring a CNN with mapping functions to learn an unseen dataset based on a
selection of feature primitives. In some embodiments, various steps of the method 400 of FIG. 4, more particularly, steps 402-406, may be performed by the processor unit 108 in conjunction with the memory unit 110 and the ILNC 112 of the IRLT unit 104. Moreover, steps 408-410 may be performed by the processor unit 108 in conjunction with the memory unit 110 and the one or more supervised learning CNNs 116.
[0048] The method 400 starts at step 402, where at least one input image dataset 404 may be obtained. The at least one input image dataset 404 is representative of an unseen input image dataset. Additionally, at least one learning parameter 406 and a learning outcome 408 corresponding to the input image dataset 404 may be obtained. In certain embodiments, the input image dataset 404, the learning parameter 406, and the learning outcome 408 may be obtained as user input 410. In one embodiment, the learning parameter 406 may include an imaging modality, an image anatomy, or a combination thereof. Also, the learning outcome 408 may include a classification scheme, a regression scheme, or a pixel-level output such as segmentation.
[0049] At step 412, at least one feature primitive and a corresponding mapping function corresponding to the learning parameter 406 and the learning outcome 408 are obtained from the feature primitive repository 118. Subsequently, a CNN is configured for learning the input image dataset 404 using the at least one feature primitive and the at least one mapping function, as indicated by step 414. In certain embodiments, the configuration of the CNN may entail setting one or more filters of the CNN to the mapping functions obtained from the feature primitive repository 118. Consequent to the processing of step 414, a pre-configured CNN is generated. Further, at step 416, the pre-configured CNN is optimized with at least a training subset of the input image dataset 404.
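Steps 414 and 416, pre-configuring a network with stored filters and then optimizing it on a training subset, can be sketched in linear form with gradient descent on a squared-error loss (a hypothetical simplification; the patent's CNNs are nonlinear):

```python
import numpy as np

def preconfigure_and_finetune(w_init, x_train, y_train, lr=0.1, steps=500):
    """Start from filters retrieved from the repository (w_init) and
    optimize them on a training subset of the unseen dataset - a linear
    stand-in for the pre-configuration and optimization of steps 414-416."""
    w = w_init.astype(float).copy()
    for _ in range(steps):
        # Gradient of the mean squared error of the linear model.
        grad = x_train.T @ (x_train @ w - y_train) / len(x_train)
        w -= lr * grad
    return w
```

Starting from repository weights rather than a random initialization is what makes this a transfer step: far fewer optimization iterations are needed when the initial filters already encode the relevant primitives.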
[0050] In one embodiment, a trained convolutional autoencoder (CAE) for supervised learning that uses labelled data corresponding to a first imaging modality
is adapted for the input image dataset 404, with a few parameters, in accordance with equation (3).
[0051] In equation (3), H1 is a set of feature primitives obtained when a region of interest of an image P1 corresponding to the first imaging modality is mapped using a mapping function f and weights w(α, w), where α is a sparse set of the CAE parameters. In this way, the number of filters may be reduced. The CAE parameters may be further optimized with at least the training subset of the input image dataset 404. In this way, for a supervised learning problem corresponding to a second imaging modality, the framework for learning may be defined in accordance with the following formulation.
[0052] In formulation (4), the mapping function f obtained in equation (3) is applied over the mapping function fα corresponding to the region of interest of an image corresponding to the second imaging modality. Subsequently, at step 418, the input image dataset 404 is processed via the optimized CNN to obtain a learning outcome 420 corresponding to the requested learning outcome 408.
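A plausible reconstruction of equation (3) and formulation (4), consistent with the descriptions in paragraphs [0051] and [0052]:

```latex
H_{1} = f\big(P_{1}, w(\alpha, w)\big) \tag{3}
H_{2} = f\big(f_{\alpha}(P_{2}), w\big) \tag{4}
```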
[0053] The workflow embodied in the method 400 is described in greater detail with reference to FIG. 5. FIG. 5 is a schematic diagram 500 of one embodiment of the interactive learning network configurator 112 of the interactive representation learning transfer unit 104 of FIG. 1, in accordance with aspects of the present specification. As illustrated in FIG. 5, the block diagram 500 is generally representative of the ILNC 112 as shown in FIG. 1. Reference numerals 502-508 are generally representative of visualizations of feature primitives respectively corresponding to an imaging modality,
anatomy, appearance, and shape geometry. The data for the visualizations 502-508 may be obtained from the feature primitive repository 118 of FIG. 1.
[0054] In FIG. 5, the ILNC 500 provides a user a selection of interactive menus. In particular, using the ILNC 500, the user may select one or more aspects of an unseen image dataset to be learned by a CNN. Reference numerals 510-516 are generally representative of interactive menu options that may be available to a user to aid in the characterization of the unseen image dataset. By way of example, reference numeral 510 may pertain to the imaging modality of the unseen image dataset. The menu options of block 510 may include CT, MR, PET, ultrasound, and the like. In a similar fashion, reference numerals 512-516 may show menu options pertaining to the appearance, shape geometry, and anatomy respectively of the unseen image dataset.
[0055] The selections made by the user from the interactive menus enable the pre- configuration of a CNN with feature primitives and mapping functions corresponding to the menu selections. Reference numeral 518 is generally representative of a visualization of the pre-configured CNN. In certain embodiments, reference numeral 518 may correspond to one or more supervised learning CNNs 116 of the IRLT unit 104 of FIG. 1. Additionally, the user may graphically browse, visualize, and combine feature primitives of blocks 502-508 to create the pre-configured CNN 518.
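The menu-driven selection of FIG. 5 amounts to filtering the stored portfolio by the chosen modality, anatomy, appearance, and shape geometry. A minimal sketch (the keys and values are hypothetical):

```python
def select_primitives(portfolio, **criteria):
    """Filter a portfolio of stored primitives by interactive menu
    selections such as modality, anatomy, appearance, and shape."""
    return [entry for entry in portfolio
            if all(entry.get(key) == value
                   for key, value in criteria.items())]
```

The primitives returned by such a query would then be combined to seed the pre-configured CNN 518.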
[0056] The systems and methods for interactive representation learning transfer through deep learning of feature ontologies presented hereinabove provide a transfer learning paradigm where a portfolio of learned feature primitives and mapping functions may be combined to configure CNNs to solve new medical image analytics problems. Advantageously, the CNNs may be trained for learning appearance and morphology of images. By way of example, tumors may be classified into blobs, cylinders, disks, bright/dark or banded, and the like. Overall, networks may be trained for various combinations of anatomy, modality, appearance, and morphology (shape
geometry) to generate a rich portfolio configured to immediately provide a transference of pre-learnt features to a new problem at hand.
[0057] It is to be understood that not necessarily all such objects or advantages described above may be achieved in accordance with any embodiment. Thus, for example, those skilled in the art will recognize that the systems and techniques described herein may be embodied or carried out in a manner that achieves or improves one advantage or group of advantages as taught herein without necessarily achieving other objects or advantages as may be taught or suggested herein.
[0058] While the technology has been described in detail in connection with only a limited number of embodiments, it should be readily understood that the specification is not limited to such disclosed embodiments. Rather, the technology can be modified to incorporate any number of variations, alterations, substitutions or equivalent arrangements not heretofore described, but which are commensurate with the spirit and scope of the claims. Additionally, while various embodiments of the technology have been described, it is to be understood that aspects of the specification may include only some of the described embodiments. Accordingly, the specification is not to be limited by the foregoing description, but is only limited by the scope of the appended claims.
Claims
1. A method for interactive representation learning transfer to a convolutional neural network (CNN), comprising: obtaining at least a first input image dataset from a first imaging modality and a second input image dataset from a second imaging modality;
performing at least one of jointly training a first supervised learning CNN based on labels associated with the first input image dataset and a second supervised learning CNN based on labels associated with the second input image dataset to generate one or more common feature primitives and corresponding mapping functions and jointly training a first unsupervised learning CNN with the first input image dataset and a second unsupervised learning CNN with the second input image dataset to learn compressed representations of the input image datasets, wherein the compressed representation comprises one or more common feature primitives and corresponding mapping functions; and
storing at least the one or more common feature primitives and the corresponding mapping functions in a feature primitive repository.
2. The method of claim 1, further comprising:
obtaining at least one unseen input image dataset;
obtaining at least one learning parameter and at least one learning outcome corresponding to the at least one unseen input image dataset;
obtaining, via the feature primitive repository, at least one feature primitive and a corresponding mapping function corresponding to the at least one learning parameter and the at least one learning outcome;
configuring a CNN for learning the at least one unseen input image dataset based on the at least one feature primitive and the at least one mapping function;
optimizing the configured CNN with at least a training subset of the unseen input image dataset; and
processing the at least one unseen input image dataset via the optimized CNN to obtain the at least one learning outcome.
3. The method of claim 1, wherein jointly training the first supervised learning CNN and the second supervised learning CNN to generate the one or more common feature primitives and the corresponding mapping functions comprises jointly training the first supervised learning CNN with the first input image dataset and the second supervised learning CNN with the second input image dataset to obtain at least the one or more common feature primitives characterizing the first input image dataset and the second input image dataset, and the corresponding mapping functions of the first supervised learning CNN and the second supervised learning CNN.
4. The method of claim 1, further comprising augmenting the second input image dataset corresponding to the second imaging modality with additional data, wherein the additional data is obtained by processing the first input image dataset corresponding to the first imaging modality via an intensity mapping function.
5. The method of claim 4, wherein processing the first input image dataset corresponding to the first imaging modality via the intensity mapping function comprises:
performing, via a regression framework in the intensity mapping function, multi-modality acquisitions from a first imaging modality and a second imaging modality corresponding to one or more subjects; and
learning, via the regression framework in the intensity mapping function, a patch level mapping from the first imaging modality to the second imaging modality to map intensities of the first imaging modality to the second imaging modality.
6. An interactive representation learning transfer unit for interactive representation learning transfer to a convolutional neural network (CNN), comprising: an interactive learning network configurator configured to:
obtain at least a first input image dataset from a first imaging modality and a second input image dataset from a second imaging modality;
perform at least one of jointly training a first supervised learning CNN based on labels associated with the first input image dataset and a second supervised learning CNN based on labels associated with the second input image dataset to generate one or more common feature primitives and corresponding mapping functions and jointly training a first unsupervised learning CNN with the first input image dataset and a second unsupervised learning CNN with the second input image dataset to learn compressed representations of the input image datasets, wherein the compressed representation comprises one or more common feature primitives and corresponding mapping functions; and
a feature primitive repository configured to store the at least one or more common feature primitives and the corresponding mapping functions.
7. The interactive representation learning transfer unit of claim 6, wherein the interactive learning network configurator is further configured to:
obtain at least one unseen input image dataset;
obtain at least one learning parameter and at least one learning outcome corresponding to the at least one unseen input image dataset;
obtain, via the feature primitive repository, at least one feature primitive and corresponding mapping function corresponding to the at least one learning parameter and the at least one learning outcome;
configure a CNN for learning the at least one unseen input image dataset based on the at least one feature primitive and the at least one mapping function;
optimize the configured CNN with at least a training subset of the unseen input image dataset; and
process the at least one unseen input image dataset via the optimized CNN to obtain the at least one learning outcome.
8. The interactive representation learning transfer unit of claim 6, wherein the interactive learning network configurator is further configured to jointly train the first supervised learning CNN with the first input image dataset and the second supervised learning CNN with the second input image dataset to obtain at least the one or more common feature primitives characterizing the first input image dataset and the second input image dataset, and the corresponding mapping functions of the first supervised learning CNN and the second supervised learning CNN.
9. The interactive representation learning transfer unit of claim 6, wherein the interactive learning network configurator is further configured to augment the second input image dataset corresponding to the second imaging modality with additional data, and wherein the additional data is obtained by processing the first input image dataset corresponding to the first imaging modality via an intensity mapping function.
10. The interactive representation learning transfer unit of claim 9, wherein the intensity mapping function comprises a regression framework configured to: perform multi-modality acquisitions from a first imaging modality and a second imaging modality corresponding to one or more subjects; and
learn a patch level mapping from the first imaging modality to the second imaging modality to map intensities of the first imaging modality to the second imaging modality.
11. A multimodality transfer learning system, comprising:
a processor unit;
a memory unit operatively coupled to the processor unit;
an interactive representation learning transfer unit operatively coupled to the processor unit and comprising:
an interactive learning network configurator configured to:
obtain at least a first input image dataset from a first imaging modality and a second input image dataset from a second imaging modality;
perform at least one of jointly training a first supervised learning CNN based on labels associated with the first input image dataset and a second supervised learning CNN based on labels associated with the second input image dataset to generate one or more common feature primitives and corresponding mapping functions and jointly training a first unsupervised learning CNN with the first input image dataset and a second unsupervised learning CNN with the second input image dataset to learn compressed representations of the input image datasets, wherein the compressed representation comprises one or more common feature primitives and corresponding mapping functions; and a feature primitive repository configured to store the at least one or more common feature primitives and the corresponding mapping functions.
12. The multimodality transfer learning system of claim 11, wherein the interactive learning network configurator is further configured to:
obtain at least one unseen input image dataset;
obtain at least one learning parameter and at least one learning outcome corresponding to the at least one unseen input image dataset;
obtain, via the feature primitive repository, at least one feature primitive and corresponding mapping function corresponding to the at least one learning parameter and the at least one learning outcome;
configure a CNN for learning the at least one unseen input image dataset, based on the at least one feature primitive and the at least one mapping function;
optimize the configured CNN with at least a training subset of the unseen input image dataset; and
process the at least one unseen input image dataset via the optimized CNN to obtain the at least one learning outcome.
13. The multimodality transfer learning system of claim 11, wherein the interactive learning network configurator is further configured to:
jointly train the first supervised learning CNN with the first input image dataset and the second supervised learning CNN with the second input image dataset to obtain at least the one or more common feature primitives characterizing the first input image dataset and the second input image dataset, and the corresponding mapping functions of the first supervised learning CNN and the second supervised learning CNN.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP18804491.1A EP3704636A1 (en) | 2017-11-03 | 2018-11-02 | System and method for interactive representation learning transfer through deep learning of feature ontologies |
JP2020524235A JP7467336B2 (en) | 2017-11-03 | 2018-11-02 | METHOD, PROCESSING UNIT AND SYSTEM FOR STUDYING MEDICAL IMAGE DATA OF ANATOMICAL STRUCTURES OBTAINED FROM MULTIPLE IMAGING MODALITIES - Patent application |
CN201880071649.9A CN111316290B (en) | 2017-11-03 | 2018-11-02 | System and method for interactive representation learning migration through deep learning of feature ontologies |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
IN201741039221 | 2017-11-03 | ||
IN201741039221 | 2017-11-03 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2019090023A1 true WO2019090023A1 (en) | 2019-05-09 |
Family
ID=64332416
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2018/058855 WO2019090023A1 (en) | 2017-11-03 | 2018-11-02 | System and method for interactive representation learning transfer through deep learning of feature ontologies |
Country Status (4)
Country | Link |
---|---|
EP (1) | EP3704636A1 (en) |
JP (1) | JP7467336B2 (en) |
CN (1) | CN111316290B (en) |
WO (1) | WO2019090023A1 (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110186375A (en) * | 2019-06-06 | 2019-08-30 | 西南交通大学 | Intelligent high-speed rail white body assemble welding feature detection device and detection method |
CN110210486A (en) * | 2019-05-15 | 2019-09-06 | 西安电子科技大学 | A kind of generation confrontation transfer learning method based on sketch markup information |
CN112434602A (en) * | 2020-11-23 | 2021-03-02 | 西安交通大学 | Fault diagnosis method based on migratable common feature space mining |
WO2022072150A1 (en) * | 2020-09-30 | 2022-04-07 | Alteryx, Inc. | System and method of operationalizing automated feature engineering |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113707312A (en) * | 2021-09-16 | 2021-11-26 | 人工智能与数字经济广东省实验室(广州) | Blood vessel quantitative identification method and device based on deep learning |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2548172A1 (en) * | 2010-03-18 | 2013-01-23 | Koninklijke Philips Electronics N.V. | Functional image data enhancement and/or enhancer |
JP6235610B2 (en) * | 2012-12-26 | 2017-11-22 | ボルケーノ コーポレイション | Measurement and enhancement in multi-modality medical imaging systems |
US9922272B2 (en) * | 2014-09-25 | 2018-03-20 | Siemens Healthcare Gmbh | Deep similarity learning for multimodal medical images |
CN105930877B (en) * | 2016-05-31 | 2020-07-10 | 上海海洋大学 | Remote sensing image classification method based on multi-mode deep learning |
US10127659B2 (en) | 2016-11-23 | 2018-11-13 | General Electric Company | Deep learning medical systems and methods for image acquisition |
US10242443B2 (en) | 2016-11-23 | 2019-03-26 | General Electric Company | Deep learning medical systems and methods for medical procedures |
CN106909905B (en) * | 2017-03-02 | 2020-02-14 | 中科视拓(北京)科技有限公司 | Multi-mode face recognition method based on deep learning |
CN106971174B (en) * | 2017-04-24 | 2020-05-22 | 华南理工大学 | CNN model, CNN training method and CNN-based vein identification method |
CN107220337B (en) * | 2017-05-25 | 2020-12-22 | 北京大学 | Cross-media retrieval method based on hybrid migration network |
2018
- 2018-11-02 WO PCT/US2018/058855 patent/WO2019090023A1/en unknown
- 2018-11-02 EP EP18804491.1A patent/EP3704636A1/en not_active Withdrawn
- 2018-11-02 CN CN201880071649.9A patent/CN111316290B/en active Active
- 2018-11-02 JP JP2020524235A patent/JP7467336B2/en active Active
Non-Patent Citations (3)
Title |
---|
LLUIS CASTREJON ET AL: "Learning Aligned Cross-Modal Representations from Weakly Aligned Data", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 25 July 2016 (2016-07-25), XP080714696 * |
TADAS BALTRUSAITIS ET AL: "Multimodal Machine Learning: A Survey and Taxonomy", 25 May 2017 (2017-05-25), XP055414490, Retrieved from the Internet <URL:https://arxiv.org/pdf/1705.09406.pdf> [retrieved on 20190128] * |
YUSUF AYTAR ET AL: "Cross-Modal Scene Networks", 27 October 2016 (2016-10-27), XP055549670, Retrieved from the Internet <URL:https://arxiv.org/pdf/1610.09003.pdf> [retrieved on 20190128] * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110210486A (en) * | 2019-05-15 | 2019-09-06 | 西安电子科技大学 | A kind of generation confrontation transfer learning method based on sketch markup information |
CN110210486B (en) * | 2019-05-15 | 2021-01-01 | 西安电子科技大学 | Sketch annotation information-based generation countermeasure transfer learning method |
CN110186375A (en) * | 2019-06-06 | 2019-08-30 | 西南交通大学 | Intelligent high-speed rail white body assemble welding feature detection device and detection method |
WO2022072150A1 (en) * | 2020-09-30 | 2022-04-07 | Alteryx, Inc. | System and method of operationalizing automated feature engineering |
US11941497B2 (en) | 2020-09-30 | 2024-03-26 | Alteryx, Inc. | System and method of operationalizing automated feature engineering |
CN112434602A (en) * | 2020-11-23 | 2021-03-02 | 西安交通大学 | Fault diagnosis method based on migratable common feature space mining |
CN112434602B (en) * | 2020-11-23 | 2023-08-29 | 西安交通大学 | Fault diagnosis method based on movable common feature space mining |
Also Published As
Publication number | Publication date |
---|---|
EP3704636A1 (en) | 2020-09-09 |
JP7467336B2 (en) | 2024-04-15 |
CN111316290A (en) | 2020-06-19 |
JP2021507327A (en) | 2021-02-22 |
CN111316290B (en) | 2024-01-12 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 18804491 Country of ref document: EP Kind code of ref document: A1 |
ENP | Entry into the national phase |
Ref document number: 2020524235 Country of ref document: JP Kind code of ref document: A |
NENP | Non-entry into the national phase |
Ref country code: DE |
ENP | Entry into the national phase |
Ref document number: 2018804491 Country of ref document: EP Effective date: 20200603 |