WO2019090023A1 - System and method for interactive representation learning transfer through deep learning of feature ontologies - Google Patents

System and method for interactive representation learning transfer through deep learning of feature ontologies

Info

Publication number
WO2019090023A1
Authority
WO
WIPO (PCT)
Prior art keywords
input image
learning
image dataset
cnn
imaging modality
Prior art date
Application number
PCT/US2018/058855
Other languages
French (fr)
Inventor
Vivek Prabhakar Vaidya
Rakesh Mullick
Krishna Seetharam Shriram
Sohan Rashmi Ranjan
Pavan Kumar V Annangi
Sheshadri Thiruvenkadam
Chandan Kumar Mallappa Aladahalli
Arathi Sreekumari
Original Assignee
General Electric Company
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by General Electric Company filed Critical General Electric Company
Priority to EP18804491.1A priority Critical patent/EP3704636A1/en
Priority to JP2020524235A priority patent/JP7467336B2/en
Priority to CN201880071649.9A priority patent/CN111316290B/en
Publication of WO2019090023A1 publication Critical patent/WO2019090023A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133Distances to prototypes
    • G06F18/24137Distances to cluster centroïds
    • G06F18/2414Smoothing the distance, e.g. radial basis function networks [RBFN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/254Fusion techniques of classification results, e.g. of results related to same input data
    • G06F18/256Fusion techniques of classification results, e.g. of results related to same input data of results relating to different input data, e.g. multimodal recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/02Knowledge representation; Symbolic representation
    • G06N5/022Knowledge engineering; Knowledge acquisition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/30Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10072Tomographic images
    • G06T2207/10081Computed x-ray tomography [CT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10072Tomographic images
    • G06T2207/10088Magnetic resonance imaging [MRI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10072Tomographic images
    • G06T2207/10104Positron emission tomography [PET]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/03Recognition of patterns in medical or anatomical images
    • G06V2201/031Recognition of patterns in medical or anatomical images of internal organs

Definitions

  • Embodiments of the present specification relate generally to a system and method for generating transferable representation learnings of medical image data of varied anatomies obtained from varied imaging modalities for use in learning networks. Specifically, the system and method are directed to determining a representation learning of the medical image data as a set of feature primitives based on the physics of a first and/or second imaging modality and the biology of the anatomy to configure new convolutional networks for learning problems such as classification and segmentation of the medical image data from other imaging modalities.
  • machine learning is the subfield of computer science that "gives computers the ability to learn without being explicitly programmed."
  • Machine learning explores the study and construction of algorithms that can learn from and make predictions on data.
  • feature learning or representation learning is a set of techniques that transform raw data input into a representation that can be effectively exploited in machine learning tasks.
  • Representation learning is motivated by the fact that machine learning tasks such as classification often require input that is mathematically and computationally convenient to process.
  • real-world data such as images, video, and sensor measurement is usually complex, redundant, and highly variable.
  • manual feature identification methods require expensive human labor and rely on expert knowledge.
  • manually generated representations normally do not lend themselves well to generalization, thereby motivating the design of efficient representation learning techniques to automate and generalize feature or representation learning.
  • CNN convolutional neural network
  • ConvNet convolutional neural network
  • a deep convolutional neural network is a type of feed-forward artificial neural network in which the connectivity pattern between its neurons is inspired by the organization of the animal visual cortex.
  • Convolutional neural networks are biologically inspired variants of multilayer perceptrons, designed to emulate the behavior of a visual cortex with minimal amounts of preprocessing.
  • MLP multilayer perceptron network
  • An MLP includes multiple layers of nodes in a directed graph, with each layer fully connected to the next one.
  • CNNs have wide applications in image and video recognition, recommender systems, and natural language processing.
  • CNNs are also known as shift invariant or space invariant artificial neural networks (SIANNs) based on their shared weights architecture and translation invariance characteristics.
  • SIANNs shift invariant or space invariant artificial neural networks
  • a convolutional layer is the core building block.
  • Parameters associated with a convolutional layer include a set of learnable filters or kernels. During a forward pass, each filter is convolved across the width and height of the input volume, computing the dot product between the entries of the filter and the input and producing a two-dimensional (2D) activation map of that filter.
  • 2D two-dimensional
  • the network learns filters that are activated when the network detects a specific type of feature at a given spatial position in the input.
  • Deep CNN architectures are typically purpose-built for deep learning problems. Deep learning is known to be effective in many tasks that involve human-like perception and decision making. Typical applications of deep learning are handwriting recognition, image recognition, and speech recognition.
  • Deep learning techniques construct a network model by using training data and outcomes that correspond to the data. Once the network model is constructed, it can be used on new data for determining outcomes. Moreover, it will be appreciated that a deep learning network once learnt for a specific outcome may be reused advantageously for related outcomes.
  • CNNs may be used for supervised learning, where an input dataset is labelled, as well as for unsupervised learning, where the input dataset is not labelled.
  • a labelled input dataset is one where the elements of the dataset are pre-associated with a classification scheme, represented by labels.
  • the CNN is trained with a labelled subset of the dataset, and may be tested with another subset to verify an accurate classification result.
  • a deep CNN architecture is multi-layered, where the layers are hierarchically connected. As the number of input to output mappings and filters for each layer increases, a multi-layer deep CNN may result in a huge number of parameters that need to be configured for its operation. If training data for such a CNN is scarce, the learning problem is under-determined. In this situation, it is advantageous to transfer certain parameters from a pre-learned CNN model. Transfer learning reduces the number of parameters to be optimized by freezing the pre-learned parameters in a subset of layers and provides a good initialization for tuning the remaining layers.
  • a method for interactive representation learning transfer to a convolutional neural network includes obtaining at least a first input image dataset from a first imaging modality and a second input image dataset from a second imaging modality.
  • the method includes performing at least one of: jointly training a first supervised learning CNN, based on labels associated with the first input image dataset, and a second supervised learning CNN, based on labels associated with the second input image dataset, to generate one or more common feature primitives and corresponding mapping functions; and jointly training a first unsupervised learning CNN with the first input image dataset and a second unsupervised learning CNN with the second input image dataset to learn compressed representations of the input image datasets, where the compressed representations include one or more common feature primitives and corresponding mapping functions. Additionally, the method includes storing at least the one or more common feature primitives and the corresponding mapping functions in a feature primitive repository.
  • an interactive representation learning transfer (IRLT) unit for interactive representation learning transfer to a convolutional neural network (CNN) is presented.
  • the IRLT unit includes an interactive learning network configurator configured to obtain at least a first input image dataset from a first imaging modality and a second input image dataset from a second imaging modality and to perform at least one of: jointly training a first supervised learning CNN, based on labels associated with the first input image dataset, and a second supervised learning CNN, based on labels associated with the second input image dataset, to generate one or more common feature primitives and corresponding mapping functions; and jointly training a first unsupervised learning CNN with the first input image dataset and a second unsupervised learning CNN with the second input image dataset to learn compressed representations of the input image datasets, where the compressed representations include one or more common feature primitives and corresponding mapping functions.
  • the IRLT unit includes a feature primitive repository configured to store at least the one or more common feature primitives and the corresponding mapping functions.
  • a multimodality transfer learning system includes a processor unit and a memory unit communicatively coupled to the processor unit.
  • the multimodality transfer learning system includes an interactive representation learning transfer (IRLT) unit operatively coupled to the processor unit and including an interactive learning network configurator configured to obtain at least a first input image dataset from a first imaging modality and a second input image dataset from a second imaging modality and to perform at least one of: jointly training a first supervised learning CNN, based on labels associated with the first input image dataset, and a second supervised learning CNN, based on labels associated with the second input image dataset, to generate one or more common feature primitives and corresponding mapping functions; and jointly training a first unsupervised learning CNN with the first input image dataset and a second unsupervised learning CNN with the second input image dataset to learn compressed representations of the input image datasets, where the compressed representations include one or more common feature primitives and corresponding mapping functions.
  • the IRLT unit includes a feature primitive repository configured to store the at least one or more common feature primitives and the corresponding mapping functions.
  • FIG. 1 is a schematic diagram of an exemplary system for interactive representation learning transfer through deep learning of feature ontologies, in accordance with aspects of the present specification
  • FIG. 2 is a flowchart illustrating a method for building a set of feature primitives from unlabeled image data to augment a feature primitive repository, in accordance with aspects of the present specification
  • FIG. 3 is a flowchart illustrating a method for building a set of feature primitives from labeled image data to augment a feature primitive repository, in accordance with aspects of the present specification
  • FIG. 4 is a flowchart illustrating a method for pre-configuring a CNN with mapping functions to learn an unseen dataset based on a selection of feature primitives, in accordance with aspects of the present specification.
  • FIG. 5 is a schematic diagram of one embodiment of an interactive learning network configurator, in accordance with aspects of the present specification.
  • transfer learning or “inductive transfer” as used in the present specification is intended to mean an approach of machine learning that focuses on applying the knowledge or learning gained while solving one problem to a different, related problem.
  • This knowledge or learning may be characterized and/or represented in various ways, typically by combinations and variations of transfer functions, mapping functions, graphs, matrices, and other primitives.
  • transfer learning primitive as used in the present specification is intended to mean the characterization and/or representation of knowledge or learning gained by solving a machine learning problem as described hereinabove.
  • feature primitive as used in the present specification is intended to mean a characterization of an aspect of an input dataset, typically, an appearance, shape geometry, or morphology of a region of interest (ROI) corresponding to an image of an anatomy, where the image may be obtained from an imaging modality such as an ultrasound imaging system, a computed tomography (CT) imaging system, a positron emission tomography-CT (PET-CT) imaging system, a magnetic resonance (MR) imaging system, and the like.
  • CT computed tomography
  • PET-CT positron emission tomography-CT
  • MR magnetic resonance
  • the anatomy may include an internal organ of the human body such as, but not limited to, the lung, liver, kidney, stomach, heart, brain, and the like.
  • An exemplary system 100 for interactive representation learning transfer through deep learning of feature ontologies, in accordance with aspects of the present specification, is illustrated in FIG. 1.
  • the system 100 includes a multimodality transfer learning (MTL) subsystem 102.
  • the MTL subsystem 102 includes an interactive representation learning transfer (IRLT) unit 104, a processor unit 108, a memory unit 110, and a user interface 106.
  • the processor unit 108 is communicatively coupled to the memory unit 110.
  • the user interface 106 is operatively coupled to the IRLT unit 104.
  • the IRLT unit 104 is operatively coupled to the processor unit 108 and the memory unit 110.
  • the system 100 and/or the MTL subsystem 102 may include a display unit 134. It may be noted that the MTL subsystem 102 may include other components or hardware, and is not limited to the components shown in FIG. 1.
  • the user interface 106 is configured to receive user input 130 corresponding to characteristics of an input image 128.
  • the user input 130 may include aspects or characteristics of the input image 128, such as, but not limited to, the imaging modality, the anatomy generally represented by the input image 128, the appearance of the ROI corresponding to the input image 128, and the like.
  • the IRLT unit 104 may be implemented as software systems or computer instructions executable via one or more processor units 108 and stored in the memory unit 110.
  • the IRLT unit 104 may be implemented as a hardware system, for example, via FPGAs, custom chips, integrated circuits (ICs), Application Specific ICs (ASICs), and the like.
  • the IRLT unit 104 may include an interactive learning network configurator (ILNC) 112, one or more CNNs configured for unsupervised learning (unsupervised learning CNN) 114, one or more CNNs configured for supervised learning (supervised learning CNN) 116, and a feature primitive repository 118.
  • the ILNC 112 is operatively coupled to the unsupervised learning CNNs 114, the supervised learning CNNs 116, and the feature primitive repository 118.
  • the ILNC 112 may be a graphical user interface subsystem configured to enable a user to configure one or more supervised learning CNNs 116 or unsupervised learning CNNs 114.
  • the feature primitive repository 118 is configured to store one or more feature primitives corresponding to an ROI in the input image 128 and one or more corresponding mapping functions.
  • mapping function as used in the present specification is intended to represent a transfer function or a CNN filter that maps the ROI to a compressed representation such that an output of the CNN is a feature primitive that characterizes the ROI of the input image 128 based on an aspect of the ROI.
  • the aspect of the ROI may include a shape geometry, an appearance, a morphology, and the like.
  • the feature primitive repository 118 is configured to store one or more mapping functions. These mapping functions, in conjunction with the corresponding feature primitives, may be used to pre-configure a CNN to learn a new training set.
  • the feature primitive repository 118 is configured to store feature primitives and mapping functions which are transfer learnings obtained from other CNNs to pre-configure a new CNN to learn an unseen dataset.
  • some non-limiting examples of feature primitives that the feature primitive repository 118 is configured to store include feature primitives characterizing an appearance 120 corresponding to the ROI of the image 128, a shape geometry 124 corresponding to the ROI of the image 128, and an anatomy 126 corresponding to the image 128.
  • the ILNC 112 is configured to present a user with various tools and options to interactively characterize one or more aspects of the ROI corresponding to the input image 128.
  • An exemplary embodiment of the ILNC 112 is shown in FIG. 5 and the working of the ILNC 112 will be described in greater detail with reference to FIGs. 4 and 5.
  • the system 100 is configured to develop a portfolio of feature primitives and mapping functions across one or more anatomies and imaging modalities to be stored in the feature primitive repository 118. Subsequently, the system 100 may provide the user with the ILNC 112 to allow the user to pre-configure a CNN for learning a new, unseen image dataset based on a selection of modality, anatomy, shape geometry, morphology, and the like corresponding to one or more ROIs of the unseen image dataset.
  • one learning outcome of the CNN may be a classification scheme categorizing the image dataset. Other non-limiting examples of learning outcomes may include pixel-level segmentation, regression, and the like. The working of the system 100 will be described in greater detail with reference to FIGs. 2-5.
  • system 100 and/or the MTL subsystem 102 may be configured to visualize one or more of the feature primitives, the mapping functions, the image datasets, and the like on the display unit 134.
  • In FIG. 2, a flowchart 200 generally representative of a method for building a set of feature primitives from unlabeled image data to augment a feature primitive repository, in accordance with aspects of the present specification, is presented. The method 200 is described with reference to the components of FIG. 1.
  • the flowchart 200 illustrates the main steps of the method for building a set of feature primitives from unlabeled image data to augment a feature primitive repository such as the feature primitive repository 118.
  • steps 202-208 may be performed by the processor unit 108 in conjunction with the memory unit 110 and the ILNC 112 of the IRLT unit 104.
  • step 210 may be performed by the processor unit 108 in conjunction with the memory unit 110 and one or more supervised learning CNNs 116.
  • steps 214-216 may be performed by the processor unit 108 in conjunction with the memory unit 110 and one or more unsupervised learning CNNs 114.
  • the method 200 starts at step 202 where at least a first input image dataset 220 and a second input image dataset 222 corresponding to a first imaging modality and a second imaging modality are obtained.
  • the first and second input image datasets 220, 222 may correspond to an anatomy of the human body, such as the liver, lung, kidney, heart, brain, stomach, and the like.
  • the first and second imaging modalities may include an ultrasound imaging system, an MR imaging system, a CT imaging system, a PET-CT imaging system, or combinations thereof.
  • at step 204, a check is carried out to determine whether the input image datasets 220, 222 are labelled.
  • labels referencing the input image datasets may be generally representative of a classification scheme or score that characterizes an aspect of the input images, such as the shape geometry, appearance, morphology, and the like. If, at step 204, it is determined that the input image datasets are labelled, control passes to step 210, where a first CNN and a second CNN are configured for supervised learning of the input image datasets 220, 222. Step 210 will be described in greater detail with reference to FIG. 3. Subsequent to step 210, the method 200 is terminated, as indicated by step 212.
  • the second input dataset 222 corresponding to the second imaging modality is augmented with additional data.
  • the additional data for augmenting the second input image dataset 222 may be obtained by processing the first input image dataset 220 corresponding to the first imaging modality via an intensity mapping function.
  • the intensity mapping function may include a regression framework that takes multi-modality acquisitions via use of a first imaging modality and a second imaging modality corresponding to one or more subjects, for example, CT and MR, and learns a patch level mapping from the first modality to the second modality, using deep learning, hand crafted intensity features, or a combination thereof to map the intensities of the first modality to the second modality. Subsequently, control passes to step 214.
  • a first unsupervised learning CNN and a second unsupervised learning CNN are jointly trained with the first input image dataset 220 and the second input image dataset 222 to learn compressed representations of the input image datasets 220, 222, where the compressed representations include one or more common feature primitives and corresponding mapping functions.
  • the one or more feature primitives characterize aspects of the images of the first input image dataset 220. It may be noted that the mapping functions map the input image dataset to the corresponding feature primitive. In one embodiment, the mapping function may be defined in accordance with equation (1); an illustrative Python sketch of equations (1) and (2) follows this list.

    H_CT = f(P_CT, w) (1)

  • where H_CT is a set of feature primitives obtained when a region of interest of an image P_CT is mapped using a mapping function f and weights w.
  • the image P_CT corresponds to an image obtained via use of a CT imaging system.
  • one or more mapping functions corresponding to the first imaging modality and the second imaging modality are generated. It may be noted that these mapping functions map the first and second input image datasets 220, 222 to the same feature primitives.
  • a second mapping function may be defined in accordance with equation (2).

    H_MR = f(P_MR, w) (2)

  • where H_MR is a set of feature primitives obtained when a region of interest of an image P_MR is mapped using a mapping function f and weights w.
  • the image P_MR is obtained using an MR imaging system.
  • at step 218, at least the one or more feature primitives and the corresponding mapping functions are stored in the feature primitive repository 118. Control is then passed to step 212 to terminate the method 200.
  • In FIG. 3, a flowchart 300 generally representative of a method for building a set of feature primitives from labeled image data to augment a feature primitive repository, in accordance with aspects of the present specification, is presented. More particularly, the method 300 describes step 210 of FIG. 2 in greater detail. Also, the method 300 is described with reference to the components of FIG. 1.
  • the flowchart 300 illustrates the main steps of the method for building a set of feature primitives from labeled image data to augment a feature primitive repository.
  • steps 302-308 may be performed by the processor unit 108 in conjunction with the memory unit 110 and the ILNC 112 of the IRLT unit 104.
  • steps 310-314 may be performed by the processor unit 108 in conjunction with the memory unit 110 and the one or more supervised learning CNNs 116.
  • the method 300 starts at step 302 where at least a first input image dataset 316 and a second input image dataset 318 corresponding to at least a first imaging modality and a second imaging modality are obtained.
  • the first and second input image datasets 316, 318 may correspond to an anatomy of the human body, for example, liver, lung, kidney, heart, brain, stomach, and the like.
  • the first and second imaging modalities may include an ultrasound imaging system, an MR imaging system, a CT imaging system, a PET-CT imaging system, or combinations thereof.
  • a learning outcome of the one or more supervised CNNs may be a classification of images in the first and second input image datasets 316, 318.
  • a check is carried out to determine if the first input image dataset 316 and the second input image dataset 318 include sufficient data to adequately train one or more CNNs. If, at step 304, it is determined that the first and second input image datasets 316, 318 include sufficient data, control is passed to step 308. However, if, at step 304, it is determined that the first and second input image datasets 316, 318 do not have sufficient data, control is passed to step 306. In one example, at step 304, it may be determined that the first input image dataset 316 includes sufficient data and the second input image dataset 318 does not include sufficient data. Accordingly, in this example, at step 306, the second input image dataset 318 corresponding to the second modality is augmented with additional data.
  • the additional data is obtained by processing the first input image dataset 316 corresponding to the first imaging modality via an intensity mapping function.
  • the intensity mapping function may include a regression framework that takes multi-modality acquisitions via use of a first imaging modality and a second imaging modality corresponding to one or more subjects, for example, CT and MR, and learns a patch level mapping from the first modality to the second modality, using deep learning, hand crafted intensity features, or a combination thereof to map the intensities of the first modality to the second modality. Subsequently, control is passed to step 308.
  • a first supervised learning CNN and a second supervised learning CNN are jointly trained based on labels associated with the first input image dataset 316 and labels associated with the second input image dataset 318 to generate one or more feature primitives and corresponding mapping functions.
  • the learning outcome may include one or more feature primitives that characterize aspects of the images of the first input image dataset 316 and aspects of the images of the second input image dataset 318 and corresponding mapping functions where the mapping functions map the corresponding first input image dataset 316 and the second input image dataset 318 to the one or more feature primitives.
  • the feature primitives are independent of the imaging modality used to acquire the first input image dataset 316 and the second input image dataset 318.
  • the one or more feature primitives and the corresponding mapping functions are stored in a feature primitive repository.
  • the methods 200 and 300 described hereinabove enable the creation of a portfolio of feature primitives and mapping functions corresponding to images generated for a plurality of anatomies across a plurality of modalities.
  • the feature primitives and the mapping functions are stored in the feature primitive repository 118.
  • this portfolio of feature primitives and mapping functions characterizes the learning gained in the training of the CNNs with the input image datasets.
  • the learning may be transferred to pre-configure new CNNs to obtain learning outcomes for different, unseen datasets.
  • FIG. 4 illustrates a flowchart 400 depicting a method for pre-configuring a CNN with mapping functions to learn an unseen dataset based on a selection of feature primitives, in accordance with aspects of the present specification.
  • the method 400 is described with reference to FIGs. 1, 2 and 3. It may be noted that the flowchart 400 illustrates the main steps of the method 400 for pre-configuring a CNN with mapping functions to learn an unseen dataset based on a selection of feature primitives.
  • steps 402-406 may be performed by the processor unit 108 in conjunction with the memory unit 110 and the ILNC 112 of the IRLT unit 104.
  • steps 408-410 may be performed by the processor unit 108 in conjunction with the memory unit 110 and the one or more supervised learning CNNs 116.
  • the method 400 starts at step 402, where at least one input image dataset 404 may be obtained.
  • the at least one input image dataset 404 is representative of an unseen input image data set.
  • at least one learning parameter 406 and a learning outcome 408 corresponding to the input image dataset 404 may be obtained.
  • the input image dataset 404, the learning parameter 406, and the learning outcome 408 may be obtained as user input 410.
  • the learning parameter 406 may include an imaging modality, an image anatomy, or a combination thereof.
  • the learning outcome 408 may include a classification scheme, a regression scheme, or a pixel-level output such as segmentation.
  • At step 412, at least one feature primitive and a corresponding mapping function, both associated with the learning parameter 406 and the learning outcome 408, are obtained from the feature primitive repository 118.
  • a CNN is configured for learning the input image dataset 404 using the at least one feature primitive and the at least one mapping function, as indicated by step 414.
  • the configuration of the CNN may entail setting one or more filters of the CNN to the mapping functions obtained from the feature primitive repository 118.
  • a pre-configured CNN is generated. Further, at step 416, the pre-configured CNN is optimized with at least a training subset of the input image dataset 404.
  • a trained convolutional autoencoder (CAE) for supervised learning that uses labelled data corresponding to a first imaging modality is adapted for the input image dataset 404, with a few parameters, in accordance with equation (3).

    H_1 = f(P_1, w(a, w)) (3)

  • where H_1 is a set of feature primitives obtained when a region of interest of an image P_1 corresponding to the first imaging modality is mapped using a mapping function f and weights w(a, w), where a is a sparse set of the CAE parameters.
  • the CAE parameters may be further optimized with at least the training subset of the input image dataset 404.
  • the framework for learning may be defined in accordance with the following formulation.
  • the mapping function f obtained in equation (3) is applied over the mapping function f_a corresponding to the region of interest of an image corresponding to the second imaging modality.
  • the input image dataset 404 is processed via the optimized CNN to obtain a learning outcome 420, corresponding to the requested learning outcome 408.
  • FIG. 5 is a schematic diagram 500 of one embodiment of the interactive learning network configurator 112 of the interactive representation learning transfer unit 104 of FIG. 1, in accordance with aspects of the present specification.
  • the block diagram 500 is generally representative of the ILNC 112 as shown in FIG. 1.
  • Reference numerals 502-508 are generally representative of visualizations of feature primitives respectively corresponding to an imaging modality, anatomy, appearance, and shape geometry.
  • the data for the visualizations 502-508 may be obtained from the feature primitive repository 118 of FIG. 1.
  • the ILNC 500 provides a user with a selection of interactive menus.
  • the user may select one or more aspects of an unseen image dataset to be learned by a CNN.
  • Reference numerals 510-516 are generally representative of interactive menu options that may be available to a user to aid in the characterization of the unseen image dataset.
  • reference numeral 510 may pertain to the imaging modality of the unseen image dataset.
  • the menu options of block 510 may include CT, MR, PET, ultrasound, and the like.
  • reference numerals 512-516 may show menu options pertaining to the appearance, shape geometry, and anatomy respectively of the unseen image dataset.
  • Reference numeral 518 is generally representative of a visualization of the pre-configured CNN.
  • reference numeral 518 may correspond to one or more supervised learning CNNs 116 of the IRLT unit 104 of FIG. 1.
  • the user may graphically browse, visualize, and combine feature primitives of blocks 502-508 to create the pre-configured CNN 518.
  • the systems and methods for interactive representation learning transfer through deep learning of feature ontologies presented hereinabove provide a transfer learning paradigm where a portfolio of learned feature primitives and mapping functions may be combined to configure CNNs to solve new medical image analytics problems.
  • the CNNs may be trained for learning appearance and morphology of images.
  • tumors may be classified into blobs, cylinders, disks, bright/dark or banded, and the like.
  • networks may be trained for various combinations of anatomy, modality, appearance, and morphology (shape geometry) to generate a rich portfolio configured to immediately provide a transference of pre-learnt features to a new problem at hand.

Abstract

A method for interactive representation learning transfer to a convolutional neural network (CNN) is presented. The method includes obtaining at least first and second input image datasets from first and second imaging modalities. Furthermore, the method includes performing at least one of: jointly training a first supervised learning CNN, based on labels associated with the first input image dataset, and a second supervised learning CNN, based on labels associated with the second input image dataset, to generate one or more common feature primitives and corresponding mapping functions; and jointly training a first unsupervised learning CNN and a second unsupervised learning CNN with the first and second input image datasets, respectively, to learn compressed representations of the input image datasets, including common feature primitives and corresponding mapping functions. The method also includes storing the common feature primitives and the corresponding mapping functions in a feature primitive repository.

Description

SYSTEM AND METHOD FOR INTERACTIVE REPRESENTATION LEARNING TRANSFER THROUGH DEEP LEARNING OF FEATURE ONTOLOGIES
BACKGROUND
[0001] Embodiments of the present specification relate generally to a system and method for generating transferable representation learnings of medical image data of varied anatomies obtained from varied imaging modalities for use in learning networks. Specifically, the system and method are directed to determining a representation learning of the medical image data as a set of feature primitives based on the physics of a first and/or second imaging modality and the biology of the anatomy to configure new convolutional networks for learning problems such as classification and segmentation of the medical image data from other imaging modalities.
[0002] As will be appreciated, machine learning is the subfield of computer science that "gives computers the ability to learn without being explicitly programmed." Machine learning explores the study and construction of algorithms that can learn from and make predictions on data. In machine learning, feature learning or representation learning is a set of techniques that transform raw data input into a representation that can be effectively exploited in machine learning tasks. Representation learning is motivated by the fact that machine learning tasks such as classification often require input that is mathematically and computationally convenient to process. However, real-world data such as images, video, and sensor measurement is usually complex, redundant, and highly variable. Thus, it is desirable to identify useful features or representations from raw data. Currently, manual feature identification methods require expensive human labor and rely on expert knowledge. Also, manually generated representations normally do not lend themselves well to generalization, thereby motivating the design of efficient representation learning techniques to automate and generalize feature or representation learning.
[0003] Moreover, in machine learning, a convolutional neural network (CNN or ConvNet) or a deep convolutional neural network is a type of feed-forward artificial neural network in which the connectivity pattern between its neurons is inspired by the organization of the animal visual cortex. Convolutional neural networks (CNNs) are biologically inspired variants of multilayer perceptrons, designed to emulate the behavior of a visual cortex with minimal amounts of preprocessing. It may be noted that a multilayer perceptron network (MLP) is a feed-forward artificial neural network model that maps sets of input data onto a set of appropriate outputs. An MLP includes multiple layers of nodes in a directed graph, with each layer fully connected to the next one. In current deployments, CNNs have wide applications in image and video recognition, recommender systems, and natural language processing. CNNs are also known as shift invariant or space invariant artificial neural networks (SIANNs) based on their shared weights architecture and translation invariance characteristics.
[0004] It may be noted that in deep CNN architectures, a convolutional layer is the core building block. Parameters associated with a convolutional layer include a set of learnable filters or kernels. During a forward pass, each filter is convolved across the width and height of the input volume, computing the dot product between the entries of the filter and the input and producing a two-dimensional (2D) activation map of that filter. Thus, the network learns filters that are activated when the network detects a specific type of feature at a given spatial position in the input. Deep CNN architectures are typically purpose-built for deep learning problems. Deep learning is known to be effective in many tasks that involve human-like perception and decision making. Typical applications of deep learning are handwriting recognition, image recognition, and speech recognition. Deep learning techniques construct a network model by using training data and outcomes that correspond to the data. Once the network model is constructed, it can be used on new data for determining outcomes. Moreover, it will be appreciated that a deep learning network once learnt for a specific outcome may be reused advantageously for related outcomes.
[0005] Furthermore, CNNs may be used for supervised learning, where an input dataset is labelled, as well as for unsupervised learning, where the input dataset is not labelled. A labelled input dataset is one where the elements of the dataset are pre-associated with a classification scheme, represented by labels. Thus, the CNN is trained with a labelled subset of the dataset, and may be tested with another subset to verify an accurate classification result.
[0006] A deep CNN architecture is multi-layered, where the layers are hierarchically connected. As the number of input to output mappings and filters for each layer increases, a multi-layer deep CNN may result in a huge number of parameters that need to be configured for its operation. If training data for such a CNN is scarce, the learning problem is under-determined. In this situation, it is advantageous to transfer certain parameters from a pre-learned CNN model. Transfer learning reduces the number of parameters to be optimized by freezing the pre-learned parameters in a subset of layers and provides a good initialization for tuning the remaining layers. In the medical image problem domain, using transfer learning to pre-configure CNNs for various problems of classification and identification is advantageous in ameliorating situations where data is scarce, in addressing the challenges of heterogeneous data types arising from the plurality of imaging modalities and anatomies, and in meeting other clinical challenges.
BRIEF DESCRIPTION
[0007] In accordance with one aspect of the present specification, a method for interactive representation learning transfer to a convolutional neural network (CNN) is presented. The method includes obtaining at least a first input image dataset from a first imaging modality and a second input image dataset from a second imaging modality. Furthermore, the method includes performing at least one of: jointly training a first supervised learning CNN, based on labels associated with the first input image dataset, and a second supervised learning CNN, based on labels associated with the second input image dataset, to generate one or more common feature primitives and corresponding mapping functions; and jointly training a first unsupervised learning CNN with the first input image dataset and a second unsupervised learning CNN with the second input image dataset to learn compressed representations of the input image datasets, where the compressed representations include one or more common feature primitives and corresponding mapping functions. Additionally, the method includes storing at least the one or more common feature primitives and the corresponding mapping functions in a feature primitive repository.
[0008] In accordance with another aspect of the present specification, an interactive representation learning transfer (IRLT) unit for interactive representation learning transfer to a convolutional neural network (CNN) is presented. The IRLT unit includes an interactive learning network configurator configured to obtain at least a first input image dataset from a first imaging modality and a second input image dataset from a second imaging modality and to perform at least one of: jointly training a first supervised learning CNN, based on labels associated with the first input image dataset, and a second supervised learning CNN, based on labels associated with the second input image dataset, to generate one or more common feature primitives and corresponding mapping functions; and jointly training a first unsupervised learning CNN with the first input image dataset and a second unsupervised learning CNN with the second input image dataset to learn compressed representations of the input image datasets, where the compressed representations include one or more common feature primitives and corresponding mapping functions. Moreover, the IRLT unit includes a feature primitive repository configured to store at least the one or more common feature primitives and the corresponding mapping functions.

[0009] In accordance with yet another aspect of the present specification, a multimodality transfer learning system is presented. The multimodality transfer learning system includes a processor unit and a memory unit communicatively coupled to the processor unit. Furthermore, the multimodality transfer learning system includes an interactive representation learning transfer (IRLT) unit operatively coupled to the processor unit and including an interactive learning network configurator configured to obtain at least a first input image dataset from a first imaging modality and a second input image dataset from a second imaging modality and to perform at least one of: jointly training a first supervised learning CNN, based on labels associated with the first input image dataset, and a second supervised learning CNN, based on labels associated with the second input image dataset, to generate one or more common feature primitives and corresponding mapping functions; and jointly training a first unsupervised learning CNN with the first input image dataset and a second unsupervised learning CNN with the second input image dataset to learn compressed representations of the input image datasets, where the compressed representations include one or more common feature primitives and corresponding mapping functions. In addition, the IRLT unit includes a feature primitive repository configured to store at least the one or more common feature primitives and the corresponding mapping functions.
DRAWINGS
[0010] These and other features and aspects of embodiments of the present specification will become better understood when the following detailed description is read with reference to the accompanying drawings in which like characters represent like parts throughout the drawings, wherein:
[0011] FIG. 1 is a schematic diagram of an exemplary system for interactive representation learning transfer through deep learning of feature ontologies, in accordance with aspects of the present specification;

[0012] FIG. 2 is a flowchart illustrating a method for building a set of feature primitives from unlabeled image data to augment a feature primitive repository, in accordance with aspects of the present specification;
[0013] FIG. 3 is a flowchart illustrating a method for building a set of feature primitives from labeled image data to augment a feature primitive repository, in accordance with aspects of the present specification;
[0014] FIG. 4 is a flowchart illustrating a method for pre-configuring a CNN with mapping functions to learn an unseen dataset based on a selection of feature primitives, in accordance with aspects of the present specification; and
[0015] FIG. 5 is a schematic diagram of one embodiment of an interactive learning network configurator, in accordance with aspects of the present specification.
DETAILED DESCRIPTION
[0016] As will be described in detail hereinafter, various embodiments of an exemplary system and method for interactive representation learning transfer through deep learning of feature ontologies are presented. In an effort to provide a concise description of these embodiments, all features of an actual implementation may not be described in the specification. It should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developer's specific goals such as compliance with system-related and business-related constraints.
[0017] When describing elements of the various embodiments of the present specification, the articles "a," "an," "the," and "said" are intended to mean that there are one or more of the elements. The terms "comprising," "including," and "having" are intended to be inclusive and mean that there may be additional elements other than the listed elements. Furthermore, the terms "build" and "construct" and their variations are intended to mean a mathematical determination or computation of mathematical constructs. The terms "data drawn on arbitrary domains" or "data on arbitrary domains" are intended to mean data corresponding to a domain, for example, social media data, sensor data, enterprise data, and the like.
[0018] The term "transfer learning" or "inductive transfer" as used in the present specification is intended to mean an approach of machine learning that focuses on applying the knowledge or learning gained while solving one problem to a different, related problem. This knowledge or learning may be characterized and/or represented in various ways, typically by combinations and variations of transfer functions, mapping functions, graphs, matrices, and other primitives. Also, the term "transfer learning primitive" as used in the present specification is intended to mean the characterization and/or representation of knowledge or learning gained by solving a machine learning problem as described hereinabove.
[0019] Moreover, the term "feature primitive" as used in the present specification is intended to mean a characterization of an aspect of an input dataset, typically, an appearance, shape geometry, or morphology of a region of interest (ROI) corresponding to an image of an anatomy, where the image may be obtained from an imaging modality such as an ultrasound imaging system, a computed tomography (CT) imaging system, a positron emission tomography-CT (PET-CT) imaging system, a magnetic resonance (MR) imaging system, and the like. Also, the anatomy may include an internal organ of the human body such as, but not limited to, the lung, liver, kidney, stomach, heart, brain, and the like.
[0020] An exemplary system 100 for interactive representation learning transfer through deep learning of feature ontologies, in accordance with aspects of the present specification, is illustrated in FIG. 1. In a presently contemplated configuration of FIG. 1, the system 100 includes a multimodality transfer learning (MTL) subsystem 102. The MTL subsystem 102 includes an interactive representation learning transfer (IRLT) unit 104, a processor unit 108, a memory unit 110, and a user interface 106. The processor unit 108 is communicatively coupled to the memory unit 110. The user interface 106 is operatively coupled to the IRLT unit 104. Also, the IRLT unit 104 is operatively coupled to the processor unit 108 and the memory unit 110. The system 100 and/or the MTL subsystem 102 may include a display unit 134. It may be noted that the MTL subsystem 102 may include other components or hardware, and is not limited to the components shown in FIG. 1.
[0021] The user interface 106 is configured to receive user input 130 corresponding to characteristics of an input image 128. The user input 130 may include aspects or characteristics of the input image 128, such as, but not limited to, the imaging modality, the anatomy generally represented by the input image 128, the appearance of the ROI corresponding to the input image 128, and the like.
[0022] In certain embodiments, the IRLT unit 104 may be implemented as software systems or computer instructions executable via one or more processor units 108 and stored in the memory unit 110. In other embodiments, the IRLT unit 104 may be implemented as a hardware system, for example, via FPGAs, custom chips, integrated circuits (ICs), Application Specific ICs (ASICs), and the like.
[0023] As illustrated in FIG. 1, the IRLT unit 104 may include an interactive learning network configurator (ILNC) 112, one or more CNNs configured for unsupervised learning (unsupervised learning CNN) 114, one or more CNNs configured for supervised learning (supervised learning CNN) 116, and a feature primitive repository 118. The ILNC 112 is operatively coupled to the unsupervised learning CNNs 114, the supervised learning CNNs 116, and the feature primitive repository 118. In one embodiment, the ILNC 112 may be a graphical user interface subsystem configured to enable a user to configure one or more supervised learning CNNs 116 or unsupervised learning CNNs 114.

[0024] The feature primitive repository 118 is configured to store one or more feature primitives corresponding to an ROI in the input image 128 and one or more corresponding mapping functions. The term "mapping function" as used in the present specification is intended to represent a transfer function or a CNN filter that maps the ROI to a compressed representation such that an output of the CNN is a feature primitive that characterizes the ROI of the input image 128 based on an aspect of the ROI. In one example, the aspect of the ROI may include a shape geometry, an appearance, a morphology, and the like. Accordingly, the feature primitive repository 118 is configured to store one or more mapping functions. These mapping functions, in conjunction with the corresponding feature primitives, may be used to pre-configure a CNN to learn a new training set. Thus, the feature primitive repository 118 is configured to store feature primitives and mapping functions which are transfer learnings obtained from other CNNs to pre-configure a new CNN to learn an unseen dataset. As shown in FIG. 1, some non-limiting examples of feature primitives that the feature primitive repository 118 is configured to store include feature primitives characterizing an appearance 120 corresponding to the ROI of the image 128, a shape geometry 124 corresponding to the ROI of the image 128, and an anatomy 126 corresponding to the image 128.
[0025] In the presently contemplated configuration of FIG. 1, the ILNC 112 is configured to present a user with various tools and options to interactively characterize one or more aspects of the ROI corresponding to the input image 128. An exemplary embodiment of the ILNC 112 is shown in FIG. 5, and the working of the ILNC 112 will be described in greater detail with reference to FIGs. 4 and 5.
[0026] The system 100 is configured to develop a portfolio of feature primitives and mapping functions across one or more anatomies and imaging modalities to be stored in the feature primitive repository 118. Subsequently, the system 100 may provide the user with the ILNC 112 to allow the user to pre-configure a CNN for learning a new, unseen image dataset based on a selection of modality, anatomy, shape geometry, morphology, and the like corresponding to one or more ROIs of the unseen image dataset. By way of example, one learning outcome of the CNN may be a classification scheme categorizing the image dataset. Other non-limiting examples of learning outcomes may include pixel-level segmentation, regression, and the like. The working of the system 100 will be described in greater detail with reference to FIGs. 2-5.
[0027] Additionally, the system 100 and/or the MTL subsystem 102 may be configured to visualize one or more of the feature primitives, the mapping functions, the image datasets, and the like on the display unit 134.
[0028] Turning now to FIG. 2, a flowchart 200 generally representative of a method for building a set of feature primitives from unlabeled image data to augment a feature primitive repository, in accordance with aspects of the present specification, is presented. The method 200 is described with reference to the components of FIG. 1.
[0029] It may be noted that the flowchart 200 illustrates the main steps of the method for building a set of feature primitives from unlabeled image data to augment a feature primitive repository, such as the feature primitive repository 118. In some embodiments, various steps of the method 200 of FIG. 2, more particularly, steps 202-208, may be performed by the processor unit 108 in conjunction with the memory unit 110 and the ILNC 112 of the IRLT unit 104. Moreover, step 210 may be performed by the processor unit 108 in conjunction with the memory unit 110 and one or more supervised learning CNNs 116. Also, steps 214-216 may be performed by the processor unit 108 in conjunction with the memory unit 110 and one or more unsupervised learning CNNs 114.
[0030] The method 200 starts at step 202 where at least a first input image dataset 220 and a second input image dataset 222 corresponding to a first imaging modality and a second imaging modality are obtained. In one embodiment, the first and second input image datasets 220, 222 may correspond to an anatomy of the human body, such as the liver, lung, kidney, heart, brain, stomach, and the like. Also, the first and second imaging modalities may include an ultrasound imaging system, an MR imaging system, a CT imaging system, a PET-CT imaging system, or combinations thereof.
[0031] At step 204, a check is carried out to determine whether the input image datasets 220, 222 are labeled. As previously noted, labels referencing the input image datasets may be generally representative of a classification scheme or score that characterizes an aspect of the input images, such as the shape geometry, appearance, morphology, and the like. If, at step 204, it is determined that the input image datasets are labeled, control passes to step 210, where a first CNN and a second CNN are configured for supervised learning of the input image datasets 220, 222. Step 210 will be described in greater detail with reference to FIG. 3. Subsequent to step 210, the method 200 is terminated, as indicated by step 212.
[0032] Referring again to step 204, if it is determined that the input image datasets are unlabeled, control passes to step 206, where a second check is carried out to determine if the first input image dataset 220 and the second input image dataset 222 include sufficient data to adequately train one or more CNNs. If, at step 206, it is determined that the first and second input image datasets 220, 222 include sufficient data, control passes to step 214. However, if, at step 206, it is determined that the first and second input image datasets 220, 222 do not include sufficient data, control passes to step 208.
[0033] In one example, at step 208, it may be determined that the first input image dataset 220 includes sufficient data and the second input image dataset 222 does not include sufficient data. Accordingly, in this example, at step 208, the second input image dataset 222 corresponding to the second imaging modality is augmented with additional data. The additional data for augmenting the second input image dataset 222 may be obtained by processing the first input image dataset 220 corresponding to the first imaging modality via an intensity mapping function. One non-limiting example of the intensity mapping function is a regression framework that uses multi-modality acquisitions of one or more subjects from the first and second imaging modalities, for example, CT and MR, and learns a patch-level mapping that maps intensities of the first modality to the second modality via deep learning, hand-crafted intensity features, or a combination thereof. Subsequently, control passes to step 214.
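A hedged sketch of such a patch-level intensity mapping follows: a small regression CNN, e.g., in PyTorch, trained on co-registered CT/MR patch pairs. The architecture, patch size handling, and training step are illustrative assumptions, not the specific regression framework of the present specification.

```python
import torch
import torch.nn as nn


class PatchIntensityMapper(nn.Module):
    """Maps single-channel CT patches to MR-like intensity patches."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, kernel_size=3, padding=1),  # predicted MR intensities
        )

    def forward(self, ct_patch: torch.Tensor) -> torch.Tensor:
        return self.net(ct_patch)


def train_step(model, optimizer, ct_patch, mr_patch):
    # Regression loss between predicted and acquired MR patch intensities
    pred = model(ct_patch)
    loss = nn.functional.mse_loss(pred, mr_patch)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Once trained, the mapper can synthesize additional MR-like patches from the more plentiful CT data to augment the second input image dataset 222.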
[0034] At step 214, a first unsupervised learning CNN and a second unsupervised learning CNN are jointly trained with the first input image dataset 220 and the second input image dataset 222 to learn compressed representations of the input image datasets 220, 222, where the compressed representations include one or more common feature primitives and corresponding mapping functions.
[0035] The one or more feature primitives characterize aspects of the images of the first input image dataset 220. It may be noted that the mapping functions map the input image dataset to the corresponding feature primitive. In one embodiment, the mapping function may be defined in accordance with equation (1).
$$h_{CT} = f(P_{CT};\, w) \qquad (1)$$

[0036] In equation (1), $h_{CT}$ is a set of feature primitives obtained when a region of interest of an image $P_{CT}$ is mapped using a mapping function $f$ and weights $w$. In this example, the image $P_{CT}$ corresponds to an image obtained via use of a CT imaging system.

[0037] It may be noted that, consequent to step 214, one or more mapping functions corresponding to the first imaging modality and the second imaging modality are generated. These mapping functions map the first and second input image datasets 220, 222 to the same feature primitives. In one embodiment, a second mapping function may be defined in accordance with equation (2).

$$h_{MR} = f(P_{MR};\, w) \qquad (2)$$

[0038] In equation (2), $h_{MR}$ is a set of feature primitives obtained when a region of interest of an image $P_{MR}$ is mapped using the mapping function $f$ and weights $w$. In this example, the image $P_{MR}$ is obtained using an MR imaging system.
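A minimal sketch of step 214 follows, assuming the two unsupervised learning CNNs are convolutional autoencoders whose encoders share the weights $w$, so that $h_{CT} = f(P_{CT};\, w)$ and $h_{MR} = f(P_{MR};\, w)$ lie in a common feature primitive space. The layer sizes, and the alignment term that presumes paired CT/MR images, are illustrative assumptions.

```python
import torch
import torch.nn as nn


class JointAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Shared mapping function f(.; w): ROI -> compressed feature primitives
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        # One decoder per modality reconstructs that modality's intensities
        self.decode_ct = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 2, stride=2),
        )
        self.decode_mr = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 2, stride=2),
        )

    def forward(self, p_ct, p_mr):
        h_ct = self.encoder(p_ct)  # equation (1)
        h_mr = self.encoder(p_mr)  # equation (2)
        return self.decode_ct(h_ct), self.decode_mr(h_mr), h_ct, h_mr


def joint_loss(model, p_ct, p_mr):
    rec_ct, rec_mr, h_ct, h_mr = model(p_ct, p_mr)
    # Reconstruction terms plus a term pulling the two primitive sets together
    return (nn.functional.mse_loss(rec_ct, p_ct)
            + nn.functional.mse_loss(rec_mr, p_mr)
            + nn.functional.mse_loss(h_ct, h_mr))
```

After training, the encoder weights and the latent activations are, respectively, the mapping function and the common feature primitives stored at step 218.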
[0039] Furthermore, at step 218, at least the one or more feature primitives and the corresponding mapping functions are stored in the feature primitive repository 118. Control is then passed to step 212 to terminate the method 200.
[0040] Turning now to FIG. 3, a flowchart 300 generally representative of a method for building a set of feature primitives from labeled image data to augment a feature primitive repository, in accordance with aspects of the present specification, is presented. More particularly, the method 300 describes step 210 of FIG. 2 in greater detail. Also, the method 300 is described with reference to the components of FIG. 1.
[0041] It may be noted that the flowchart 300 illustrates the main steps of the method for building a set of feature primitives from labeled image data to augment a feature primitive repository. In some embodiments, various steps of the method 300 of FIG. 3, more particularly, steps 302-308, may be performed by the processor unit 108 in conjunction with the memory unit 110 and the ILNC 112 of the IRLT unit 104. Moreover, steps 310-314 may be performed by the processor unit 108 in conjunction with the memory unit 110 and the one or more supervised learning CNNs 116.

[0042] The method 300 starts at step 302, where at least a first input image dataset 316 and a second input image dataset 318 corresponding to at least a first imaging modality and a second imaging modality are obtained. In one embodiment, the first and second input image datasets 316, 318 may correspond to an anatomy of the human body, for example, the liver, lung, kidney, heart, brain, stomach, and the like. Also, the first and second imaging modalities may include an ultrasound imaging system, an MR imaging system, a CT imaging system, a PET-CT imaging system, or combinations thereof. It may be noted that a learning outcome of the one or more supervised CNNs may be a classification of images in the first and second input image datasets 316, 318.
[0043] At step 304, a check is carried out to determine if the first input image dataset 316 and the second input image dataset 318 include sufficient data to adequately train one or more CNNs. If, at step 304, it is determined that the first and second input image datasets 316, 318 include sufficient data, control is passed to step 308. However, if, at step 304, it is determined that the first and second input image datasets 316, 318 do not include sufficient data, control is passed to step 306. In one example, at step 306, it may be determined that the first input image dataset 316 includes sufficient data and the second input image dataset 318 does not include sufficient data. Accordingly, in this example, at step 306, the second input image dataset 318 corresponding to the second imaging modality is augmented with additional data. It may be noted that the additional data is obtained by processing the first input image dataset 316 corresponding to the first imaging modality via an intensity mapping function. As previously noted, one non-limiting example of the intensity mapping function is a regression framework that uses multi-modality acquisitions of one or more subjects from the first and second imaging modalities, for example, CT and MR, and learns a patch-level mapping that maps intensities of the first modality to the second modality via deep learning, hand-crafted intensity features, or a combination thereof. Subsequently, control is passed to step 308.

[0044] Furthermore, at step 308, a first supervised learning CNN and a second supervised learning CNN are jointly trained based on labels associated with the first input image dataset 316 and labels associated with the second input image dataset 318 to generate one or more feature primitives and corresponding mapping functions.
[0045] In one embodiment, the learning outcome may include one or more feature primitives that characterize aspects of the images of the first input image dataset 316 and of the second input image dataset 318, along with corresponding mapping functions that map the first input image dataset 316 and the second input image dataset 318 to the one or more feature primitives. Thus, the feature primitives are independent of the imaging modality used to acquire the first input image dataset 316 and the second input image dataset 318. Referring now to step 312, the one or more feature primitives and the corresponding mapping functions are stored in the feature primitive repository 118.
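A hedged sketch of this joint supervised training follows, assuming a shared encoder (the common mapping function) with one classification head per labeled dataset and one optimizer step per paired batch; these choices are illustrative assumptions rather than the specific training scheme of the present specification.

```python
import torch
import torch.nn as nn

# Shared mapping function f(.; w); its pooled output serves as the
# modality-independent feature primitive (assumed layer sizes)
encoder = nn.Sequential(
    nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)
head_1 = nn.Linear(32, 4)  # label space of the first input image dataset
head_2 = nn.Linear(32, 4)  # label space of the second input image dataset

opt = torch.optim.Adam(
    [*encoder.parameters(), *head_1.parameters(), *head_2.parameters()], lr=1e-3
)


def joint_supervised_step(x1, y1, x2, y2):
    # One joint step over a labeled batch from each modality; both losses
    # update the shared encoder, so the learned primitives serve both datasets.
    loss = (nn.functional.cross_entropy(head_1(encoder(x1)), y1)
            + nn.functional.cross_entropy(head_2(encoder(x2)), y2))
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```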
[0046] The methods 200 and 300 described hereinabove enable the creation of a portfolio of feature primitives and mapping functions corresponding to images generated for a plurality of anatomies across a plurality of modalities. The feature primitives and the mapping functions are stored in the feature primitive repository 118. Also, this portfolio of feature primitives and mapping functions characterizes the learning gained in the training of the CNNs with the input image datasets. Moreover, the learning may be transferred to pre-configure new CNNs to obtain learning outcomes for different, unseen datasets.
[0047] With the foregoing in mind, FIG. 4 illustrates a flowchart 400 depicting a method for pre-configuring a CNN with mapping functions to learn an unseen dataset based on a selection of feature primitives, in accordance with aspects of the present specification. The method 400 is described with reference to FIGs. 1, 2 and 3. It may be noted that the flowchart 400 illustrates the main steps of the method 400 for pre-configuring a CNN with mapping functions to learn an unseen dataset based on a selection of feature primitives. In some embodiments, various steps of the method 400 of FIG. 4, more particularly, steps 402-406, may be performed by the processor unit 108 in conjunction with the memory unit 110 and the ILNC 112 of the IRLT unit 104. Moreover, steps 408-410 may be performed by the processor unit 108 in conjunction with the memory unit 110 and the one or more supervised learning CNNs 116.
[0048] The method 400 starts at step 402, where at least one input image dataset 404 may be obtained. The at least one input image dataset 404 is representative of an unseen input image dataset. Additionally, at least one learning parameter 406 and a learning outcome 408 corresponding to the input image dataset 404 may be obtained. In certain embodiments, the input image dataset 404, the learning parameter 406, and the learning outcome 408 may be obtained as user input 410. In one embodiment, the learning parameter 406 may include an imaging modality, an image anatomy, or a combination thereof. Also, the learning outcome 408 may include a classification scheme, a regression scheme, or a pixel-level output such as segmentation.
[0049] At step 412, at least one feature primitive and a corresponding mapping function, both corresponding to the learning parameter 406 and the learning outcome 408, are obtained from the feature primitive repository 118. Subsequently, a CNN is configured for learning the input image dataset 404 using the at least one feature primitive and the at least one mapping function, as indicated by step 414. In certain embodiments, the configuration of the CNN may entail setting one or more filters of the CNN to the mapping functions obtained from the feature primitive repository 118. Consequent to the processing of step 414, a pre-configured CNN is generated. Further, at step 416, the pre-configured CNN is optimized with at least a training subset of the input image dataset 404.
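The following sketch illustrates steps 412-416 under stated assumptions: the stored mapping function is available as a PyTorch state dictionary whose convolutional filters initialize a fresh CNN, and the pre-configured CNN is then optimized on a training subset. The helper names, the layer sizes, and the three-way classification head are hypothetical.

```python
import torch
import torch.nn as nn


def preconfigure_cnn(primitive_weights: dict, num_classes: int = 3) -> nn.Module:
    # Build a fresh CNN and set its convolutional filters to the stored
    # mapping function; strict=False lets the untrained task head keep
    # its fresh initialization (step 414).
    cnn = nn.Sequential(
        nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(32, num_classes),  # requested learning outcome
    )
    cnn.load_state_dict(primitive_weights, strict=False)
    return cnn


def finetune_step(cnn, opt, x, y):
    # Optimize the pre-configured CNN with a training subset (step 416)
    loss = nn.functional.cross_entropy(cnn(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```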
[0050] In one embodiment, a trained convolutional autoencoder (CAE) for supervised learning that uses labeled data corresponding to a first imaging modality is adapted for the input image dataset 404 with a few parameters, in accordance with equation (3).

$$h_1 = f(P_1;\, w(a, w)) \qquad (3)$$

[0051] In equation (3), $h_1$ is a set of feature primitives obtained when a region of interest of an image $P_1$ corresponding to the first imaging modality is mapped using a mapping function $f$ and weights $w(a, w)$, where $a$ is a sparse set of the CAE parameters. In this way, the number of filters may be reduced. The CAE parameters may be further optimized with at least the training subset of the input image dataset 404. Thus, for a supervised learning problem corresponding to a second imaging modality, the framework for learning may be defined in accordance with formulation (4).

$$h_2 = f\left(f_a(P_2)\right) \qquad (4)$$

[0052] In formulation (4), the mapping function $f$ obtained in equation (3) is applied over the mapping function $f_a$ corresponding to the region of interest of an image $P_2$ corresponding to the second imaging modality. Subsequently, at step 418, the input image dataset 404 is processed via the optimized CNN to obtain a learning outcome 420 corresponding to the requested learning outcome 408.
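One possible reading of equations (3) and (4) is a sparse adapter over frozen, pre-learned filters: only a small set of per-channel parameters $a$ (and the task head) is optimized on the second modality. The sketch below is an assumption about how $w(a, w)$ might be realized, not the specific scheme of the present specification.

```python
import torch
import torch.nn as nn


class SparseAdaptedConv(nn.Module):
    """Wraps a pre-trained convolution; learns only per-channel scales a."""

    def __init__(self, pretrained_conv: nn.Conv2d):
        super().__init__()
        self.conv = pretrained_conv
        self.conv.weight.requires_grad_(False)  # keep pre-learned w fixed
        # a: one scalar per output channel, the sparse set of CAE parameters
        self.a = nn.Parameter(torch.ones(pretrained_conv.out_channels))

    def forward(self, x):
        # Effective filters w(a, w) = a * w, so far fewer parameters are
        # learned than when training the filters from scratch
        w = self.conv.weight * self.a.view(-1, 1, 1, 1)
        return nn.functional.conv2d(x, w, self.conv.bias,
                                    stride=self.conv.stride,
                                    padding=self.conv.padding)
```

Wrapping each encoder convolution of the trained CAE this way keeps the number of trainable parameters small while adapting the mapping to the second imaging modality.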
[0053] The workflow embodied in the method 400 is described in greater detail with reference to FIG. 5. FIG. 5 is a schematic diagram 500 of one embodiment of the interactive learning network configurator 112 of the interactive representation learning transfer unit 104 of FIG. 1, in accordance with aspects of the present specification. As illustrated in FIG. 5, the schematic diagram 500 is generally representative of the ILNC 112 as shown in FIG. 1. Reference numerals 502-508 are generally representative of visualizations of feature primitives respectively corresponding to an imaging modality, anatomy, appearance, and shape geometry. The data for the visualizations 502-508 may be obtained from the feature primitive repository 118 of FIG. 1.
[0054] In FIG. 5, the ILNC 500 provides a user with a selection of interactive menus. In particular, using the ILNC 500, the user may select one or more aspects of an unseen image dataset to be learned by a CNN. Reference numerals 510-516 are generally representative of interactive menu options that may be available to a user to aid in the characterization of the unseen image dataset. By way of example, reference numeral 510 may pertain to the imaging modality of the unseen image dataset. The menu options of block 510 may include CT, MR, PET, ultrasound, and the like. In a similar fashion, reference numerals 512-516 may show menu options pertaining to the appearance, shape geometry, and anatomy, respectively, of the unseen image dataset.
[0055] The selections made by the user from the interactive menus enable the pre-configuration of a CNN with feature primitives and mapping functions corresponding to the menu selections. Reference numeral 518 is generally representative of a visualization of the pre-configured CNN. In certain embodiments, reference numeral 518 may correspond to one or more supervised learning CNNs 116 of the IRLT unit 104 of FIG. 1. Additionally, the user may graphically browse, visualize, and combine feature primitives of blocks 502-508 to create the pre-configured CNN 518.
[0056] The systems and methods for interactive representation learning transfer through deep learning of feature ontologies presented hereinabove provide a transfer learning paradigm where a portfolio of learned feature primitives and mapping functions may be combined to configure CNNs to solve new medical image analytics problems. Advantageously, the CNNs may be trained for learning the appearance and morphology of images. By way of example, tumors may be classified into blobs, cylinders, disks, bright/dark or banded, and the like. Overall, networks may be trained for various combinations of anatomy, modality, appearance, and morphology (shape geometry) to generate a rich portfolio configured to immediately transfer pre-learned features to a new problem at hand.
[0057] It is to be understood that not necessarily all such objects or advantages described above may be achieved in accordance with any embodiment. Thus, for example, those skilled in the art will recognize that the systems and techniques described herein may be embodied or carried out in a manner that achieves or improves one advantage or group of advantages as taught herein without necessarily achieving other objects or advantages as may be taught or suggested herein.
[0058] While the technology has been described in detail in connection with only a limited number of embodiments, it should be readily understood that the specification is not limited to such disclosed embodiments. Rather, the technology can be modified to incorporate any number of variations, alterations, substitutions or equivalent arrangements not heretofore described, but which are commensurate with the spirit and scope of the claims. Additionally, while various embodiments of the technology have been described, it is to be understood that aspects of the specification may include only some of the described embodiments. Accordingly, the specification is not to be limited by the foregoing description, but is only limited by the scope of the appended claims.

Claims

CLAIMS:
1. A method for interactive representation learning transfer to a convolutional neural network (CNN), comprising: obtaining at least a first input image dataset from a first imaging modality and a second input image dataset from a second imaging modality;
performing at least one of jointly training a first supervised learning CNN based on labels associated with the first input image dataset and a second supervised learning CNN based on labels associated with the second input image dataset to generate one or more common feature primitives and corresponding mapping functions and jointly training a first unsupervised learning CNN with the first input image dataset and a second unsupervised learning CNN with the second input image dataset to learn compressed representations of the input image datasets, wherein the compressed representation comprises one or more common feature primitives and corresponding mapping functions; and
storing at least the one or more common feature primitives and the corresponding mapping functions in a feature primitive repository.
2. The method of claim 1, further comprising:
obtaining at least one unseen input image dataset;
obtaining at least one learning parameter and at least one learning outcome corresponding to the at least one unseen input image dataset;
obtaining, via the feature primitive repository, at least one feature primitive and a corresponding mapping function corresponding to the at least one learning parameter and the at least one learning outcome;
configuring a CNN for learning the at least one unseen input image dataset based on the at least one feature primitive and the at least one mapping function; optimizing the configured CNN with at least a training subset of the unseen input image dataset; and
processing the at least one unseen input image dataset via the optimized CNN to obtain the at least one learning outcome.
3. The method of claim 1, wherein jointly training the first supervised learning CNN and the second supervised learning CNN to generate the one or more common feature primitives and the corresponding mapping functions comprises jointly training the first supervised learning CNN with the first input image dataset and the second supervised learning CNN with the second input image dataset to obtain at least the one or more common feature primitives characterizing the first input image dataset and the second input image dataset, and the corresponding mapping functions of the first supervised learning CNN and the second supervised learning CNN.
4. The method of claim 1, further comprising augmenting the second input image dataset corresponding to the second imaging modality with additional data, wherein the additional data is obtained by processing the first input image dataset corresponding to the first imaging modality via an intensity mapping function.
5. The method of claim 4, wherein processing the first input image dataset corresponding to the first imaging modality via the intensity mapping function comprises:
performing, via a regression framework in the intensity mapping function, multi-modality acquisitions from a first imaging modality and a second imaging modality corresponding to one or more subjects; and
learning, via the regression framework in the intensity mapping function, a patch level mapping from the first imaging modality to the second imaging modality to map intensities of the first imaging modality to the second imaging modality.
6. An interactive representation learning transfer unit for interactive representation learning transfer to a convolutional neural network (CNN), comprising: an interactive learning network configurator configured to:
obtain at least a first input image dataset from a first imaging modality and a second input image dataset from a second imaging modality;
perform at least one of jointly training a first supervised learning CNN based on labels associated with the first input image dataset and a second supervised learning CNN based on labels associated with the second input image dataset to generate one or more common feature primitives and corresponding mapping functions and jointly training a first unsupervised learning CNN with the first input image dataset and a second unsupervised learning CNN with the second input image dataset to learn compressed representations of the input image datasets, wherein the compressed representation comprises one or more common feature primitives and corresponding mapping functions; and
a feature primitive repository configured to store the at least one or more common feature primitives and the corresponding mapping functions.
7. The interactive representation learning transfer unit of claim 6, wherein the interactive learning network configurator is further configured to:
obtain at least one unseen input image dataset;
obtain at least one learning parameter and at least one learning outcome corresponding to the at least one unseen input image dataset;
obtain, via the feature primitive repository, at least one feature primitive and corresponding mapping function corresponding to the at least one learning parameter and the at least one learning outcome; configure a CNN for learning the at least one unseen input image dataset based on the at least one feature primitive and the at least one mapping function;
optimize the configured CNN with at least a training subset of the unseen input image dataset; and
process the at least one unseen input image dataset via the optimized CNN to obtain the at least one learning outcome.
8. The interactive representation learning transfer unit of claim 6, wherein the interactive learning network configurator is further configured to jointly train the first supervised learning CNN with the first input image dataset and the second supervised learning CNN with the second input image dataset to obtain at least the one or more common feature primitives characterizing the first input image dataset and the second input image dataset, and the corresponding mapping functions of the first supervised learning CNN and the second supervised learning CNN.
9. The interactive representation learning transfer unit of claim 6, wherein the interactive learning network configurator is further configured to augment the second input image dataset corresponding to the second imaging modality with additional data, and wherein the additional data is obtained by processing the first input image dataset corresponding to the first imaging modality via an intensity mapping function.
10. The interactive representation learning transfer unit of claim 9, wherein the intensity mapping function comprises a regression framework configured to: perform multi-modality acquisitions from a first imaging modality and a second imaging modality corresponding to one or more subjects; and learn a patch level mapping from the first imaging modality to the second imaging modality to map intensities of the first imaging modality to the second imaging modality.
11. A multimodality transfer learning system, comprising:
a processor unit;
a memory unit operatively coupled to the processor unit;
an interactive representation learning transfer unit operatively coupled to the processor unit and comprising:
an interactive learning network configurator configured to:
obtain at least a first input image dataset from a first imaging modality and a second input image dataset from a second imaging modality;
perform at least one of jointly training a first supervised learning CNN based on labels associated with the first input image dataset and a second supervised learning CNN based on labels associated with the second input image dataset to generate one or more common feature primitives and corresponding mapping functions and jointly training a first unsupervised learning CNN with the first input image dataset and a second unsupervised learning CNN with the second input image dataset to learn compressed representations of the input image datasets, wherein the compressed representation comprises one or more common feature primitives and corresponding mapping functions; and a feature primitive repository configured to store the at least one or more common feature primitives and the corresponding mapping functions.
12. The multimodality transfer learning system of claim 11, wherein the interactive learning network configurator is further configured to: obtain at least one unseen input image dataset;
obtain at least one learning parameter and at least one learning outcome corresponding to the at least one unseen input image dataset;
obtain, via the feature primitive repository, at least one feature primitive and corresponding mapping function corresponding to the at least one learning parameter and the at least one learning outcome;
configure a CNN for learning the at least one unseen input image dataset, based on the at least one feature primitive and the at least one mapping function;
optimize the configured CNN with at least a training subset of the unseen input image dataset; and
process the at least one unseen input image dataset via the optimized CNN to obtain the at least one learning outcome.
13. The multimodality transfer learning system of claim 11, wherein the interactive learning network configurator is further configured to:
jointly train the first supervised learning CNN with the first input image dataset and the second supervised learning CNN with the second input image dataset to obtain at least the one or more common feature primitives characterizing the first input image dataset and the second input image dataset, and the corresponding mapping functions of the first supervised learning CNN and the second supervised learning CNN.
PCT/US2018/058855 2017-11-03 2018-11-02 System and method for interactive representation learning transfer through deep learning of feature ontologies WO2019090023A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP18804491.1A EP3704636A1 (en) 2017-11-03 2018-11-02 System and method for interactive representation learning transfer through deep learning of feature ontologies
JP2020524235A JP7467336B2 (en) 2017-11-03 2018-11-02 Method, processing unit and system for studying medical image data of anatomical structures obtained from multiple imaging modalities
CN201880071649.9A CN111316290B (en) 2017-11-03 2018-11-02 System and method for interactive representation learning migration through deep learning of feature ontologies

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN201741039221 2017-11-03
IN201741039221 2017-11-03

Publications (1)

Publication Number Publication Date
WO2019090023A1 true WO2019090023A1 (en) 2019-05-09

Family

ID=64332416

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2018/058855 WO2019090023A1 (en) 2017-11-03 2018-11-02 System and method for interactive representation learning transfer through deep learning of feature ontologies

Country Status (4)

Country Link
EP (1) EP3704636A1 (en)
JP (1) JP7467336B2 (en)
CN (1) CN111316290B (en)
WO (1) WO2019090023A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110186375A (en) * 2019-06-06 2019-08-30 西南交通大学 Intelligent high-speed rail white body assemble welding feature detection device and detection method
CN110210486A (en) * 2019-05-15 2019-09-06 西安电子科技大学 A kind of generation confrontation transfer learning method based on sketch markup information
CN112434602A (en) * 2020-11-23 2021-03-02 西安交通大学 Fault diagnosis method based on migratable common feature space mining
WO2022072150A1 (en) * 2020-09-30 2022-04-07 Alteryx, Inc. System and method of operationalizing automated feature engineering

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113707312A (en) * 2021-09-16 2021-11-26 人工智能与数字经济广东省实验室(广州) Blood vessel quantitative identification method and device based on deep learning

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2548172A1 (en) * 2010-03-18 2013-01-23 Koninklijke Philips Electronics N.V. Functional image data enhancement and/or enhancer
JP6235610B2 (en) * 2012-12-26 2017-11-22 ボルケーノ コーポレイション Measurement and enhancement in multi-modality medical imaging systems
US9922272B2 (en) * 2014-09-25 2018-03-20 Siemens Healthcare Gmbh Deep similarity learning for multimodal medical images
CN105930877B (en) * 2016-05-31 2020-07-10 上海海洋大学 Remote sensing image classification method based on multi-mode deep learning
US10127659B2 (en) 2016-11-23 2018-11-13 General Electric Company Deep learning medical systems and methods for image acquisition
US10242443B2 (en) 2016-11-23 2019-03-26 General Electric Company Deep learning medical systems and methods for medical procedures
CN106909905B (en) * 2017-03-02 2020-02-14 中科视拓(北京)科技有限公司 Multi-mode face recognition method based on deep learning
CN106971174B (en) * 2017-04-24 2020-05-22 华南理工大学 CNN model, CNN training method and CNN-based vein identification method
CN107220337B (en) * 2017-05-25 2020-12-22 北京大学 Cross-media retrieval method based on hybrid migration network

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
LLUIS CASTREJON ET AL: "Learning Aligned Cross-Modal Representations from Weakly Aligned Data", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 25 July 2016 (2016-07-25), XP080714696 *
TADAS BALTRUSAITIS ET AL: "Multimodal Machine Learning: A Survey and Taxonomy", 25 May 2017 (2017-05-25), XP055414490, Retrieved from the Internet <URL:https://arxiv.org/pdf/1705.09406.pdf> [retrieved on 20190128] *
YUSUF AYTAR ET AL: "Cross-Modal Scene Networks", 27 October 2016 (2016-10-27), XP055549670, Retrieved from the Internet <URL:https://arxiv.org/pdf/1610.09003.pdf> [retrieved on 20190128] *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110210486A (en) * 2019-05-15 2019-09-06 西安电子科技大学 A kind of generation confrontation transfer learning method based on sketch markup information
CN110210486B (en) * 2019-05-15 2021-01-01 西安电子科技大学 Sketch annotation information-based generation countermeasure transfer learning method
CN110186375A (en) * 2019-06-06 2019-08-30 西南交通大学 Intelligent high-speed rail white body assemble welding feature detection device and detection method
WO2022072150A1 (en) * 2020-09-30 2022-04-07 Alteryx, Inc. System and method of operationalizing automated feature engineering
US11941497B2 (en) 2020-09-30 2024-03-26 Alteryx, Inc. System and method of operationalizing automated feature engineering
CN112434602A (en) * 2020-11-23 2021-03-02 西安交通大学 Fault diagnosis method based on migratable common feature space mining
CN112434602B (en) * 2020-11-23 2023-08-29 西安交通大学 Fault diagnosis method based on movable common feature space mining

Also Published As

Publication number Publication date
EP3704636A1 (en) 2020-09-09
JP7467336B2 (en) 2024-04-15
CN111316290A (en) 2020-06-19
JP2021507327A (en) 2021-02-22
CN111316290B (en) 2024-01-12

Similar Documents

Publication Publication Date Title
WO2019090023A1 (en) System and method for interactive representation learning transfer through deep learning of feature ontologies
Altaf et al. Going deep in medical image analysis: concepts, methods, challenges, and future directions
US20210110135A1 (en) Method and system for artificial intelligence based medical image segmentation
Maier et al. A gentle introduction to deep learning in medical image processing
Song et al. PET image super-resolution using generative adversarial networks
US20210012486A1 (en) Image synthesis with generative adversarial network
Fritscher et al. Deep neural networks for fast segmentation of 3D medical images
EP3273387B1 (en) Medical image segmentation with a multi-task neural network system
US20210012162A1 (en) 3d image synthesis system and methods
Srinivasu et al. Self-Learning Network-based segmentation for real-time brain MR images through HARIS
Agravat et al. Deep learning for automated brain tumor segmentation in mri images
You et al. Incremental learning meets transfer learning: Application to multi-site prostate mri segmentation
Conze et al. Current and emerging trends in medical image segmentation with deep learning
Khan et al. Segmentation of shoulder muscle MRI using a new region and edge based deep auto-encoder
Ogiela et al. Natural user interfaces in medical image analysis
Agravat et al. A survey and analysis on automated glioma brain tumor segmentation and overall patient survival prediction
Biswas et al. Data augmentation for improved brain tumor segmentation
Anam et al. Classification of scaled texture patterns with transfer learning
Quan et al. An intelligent system approach for probabilistic volume rendering using hierarchical 3D convolutional sparse coding
Jena et al. Review of neural network techniques in the verge of image processing
Huang et al. A two-level dynamic adaptive network for medical image fusion
Ullah et al. DSFMA: Deeply supervised fully convolutional neural networks based on multi-level aggregation for saliency detection
Kishanrao et al. An improved grade based MRI brain tumor classification using hybrid DCNN-DH framework
Robben et al. DeepVoxNet: voxel‐wise prediction for 3D images
Silva-Rodríguez et al. Towards foundation models and few-shot parameter-efficient fine-tuning for volumetric organ segmentation

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18804491

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2020524235

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2018804491

Country of ref document: EP

Effective date: 20200603