US20200286614A1 - A system and method for automated labeling and annotating unstructured medical datasets - Google Patents

A system and method for automated labeling and annotating unstructured medical datasets

Info

Publication number
US20200286614A1
US20200286614A1 (application US16/644,888; US201816644888A)
Authority
US
United States
Prior art keywords
images
image data
organ
dataset
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US16/644,888
Inventor
Synho Do
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
General Hospital Corp
Original Assignee
General Hospital Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by General Hospital Corp
Priority to US16/644,888
Assigned to THE GENERAL HOSPITAL CORPORATION. Assignment of assignors interest (see document for details). Assignors: DO, SYNHO
Publication of US20200286614A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2148Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the process organisation or structure, e.g. boosting cascade
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2155Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217Validation; Performance evaluation; Active pattern learning techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/2431Multiple classes
    • G06K9/00671
    • G06K9/6202
    • G06K9/6259
    • G06K9/6277
    • G06K9/628
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • G06N7/005
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00Computing arrangements based on specific mathematical models
    • G06N7/01Probabilistic graphical models, e.g. probabilistic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/42Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/20Scenes; Scene-specific elements in augmented reality scenes
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H15/00ICT specially adapted for medical reports, e.g. generation or transmission thereof
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H20/00ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
    • G16H20/40ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to mechanical, radiation or invasive therapies, e.g. surgery, laser therapy, dialysis or acupuncture
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H30/00ICT specially adapted for the handling or processing of medical images
    • G16H30/40ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/03Recognition of patterns in medical or anatomical images
    • G06V2201/031Recognition of patterns in medical or anatomical images of internal organs

Definitions

  • Diagnostic medical imaging has become central to the practice of modern medicine, and diagnostic examination volume has increased during the past decade. Additionally, as systems have become more advanced with higher resolution, the number of images in a given study has also increased. The increased demand for diagnostic imaging also presents a growing risk of human error and delayed diagnosis. While computer aided detection (CADe) and diagnosis (CADx) systems can reduce such problems, they remain limited by their reliance upon hand-crafted features. Deep-learning approaches sidestep this problem by extracting these features on their own. Recent advances in deep learning technology have enabled data-driven learning of nonlinear image filters and classifiers, which has improved detection and segmentation in multiple medical applications, including brain infarcts, automated bone age analysis, and skin lesion classification. Despite these advances, large-scale and well-labeled training datasets are essential for deep learning networks to learn representative and hierarchical abstractions.
  • Axial image location classification, which determines where an image lies within a volumetric CT examination, is a fundamental step in many initial classification processes. Classifying the location is a challenging problem because the details of body regions can vary dramatically between patients, such as with brain gyral patterns, cervical vertebral anatomy, pulmonary vessels, and bowel distribution. Degenerative changes can also distort bony anatomy enough to confuse the network. As a result, large training data sets are commonly required for algorithms to achieve sufficient accuracy.
  • Body-part recognition is also important in automatic medical image analysis as it is a prerequisite step for anatomy identification and organ segmentation.
  • Accurate body-part classification facilitates organ detection and segmentation by reducing the search range for an organ of interest.
  • Multiple techniques have been developed using multi-class random regression and decision forests to classify multiple anatomical structures, ranging from 6 to 10 organs, on computed tomography (CT) scans. These classifiers can discriminate between similar structures such as the aortic arch and heart.
  • high-quality training data is important for training neural networks and unlocking their potential to truly improve the clinical use of medical images.
  • creating high-quality training datasets is expensive and time-consuming.
  • the present disclosure addresses the aforementioned drawbacks by providing a system and method for using supervised and unsupervised learning schemes to automatically label medical images for use in subsequent deep learning applications.
  • the system can generate a large labeled dataset from a small initial training set using an iterative snowball sampling scheme.
  • a machine-learning powered, automatic organ classifier for imaging datasets, such as CT datasets, with a deep convolutional neural network (CNN) followed by an organ dose calculation is also provided.
  • This technique can be used for patient-specific organ dose estimation because the locations and sizes of organs for each patient can be calculated independently, unlike other simulation-based methods.
  • a method for automatically processing unstructured medical imaging data to generate classified images.
  • the method includes acquiring medical image data of a subject and subjecting the medical image data of the subject to a neural network to generate classified image data.
  • the method may also include comparing the classified image data to a confidence test, and upon determining that the classified image data does not pass the confidence test, subjecting the classified image data to a variational autoencoder (VAE) that implements a snowball sampling algorithm to refine the classified image data by representing features of the classified image data into latent space with a Gaussian distribution. In some configurations, this is repeated until the classified image data passes the confidence test. Annotated images may then be generated from the classified image data.
  • a method for automatic labeling and annotation for unstructured medical datasets with snowball sampling.
  • the method includes acquiring images of a region of a subject and labeling the images to generate a training dataset with the images.
  • the method also includes training a network, such as a convolutional neural network, with the training dataset and classifying unlabeled images using the trained network.
  • the method may also include determining if a performance threshold is exceeded for the classified images.
  • the dataset may be refined if the threshold is not exceeded by using a variational autoencoder to label the unlabeled images to create labeled images and updating the dataset with the labeled images.
  • a system for automatic labeling and annotation for unstructured medical datasets from medical images with snowball sampling.
  • the system includes a computer system configured to: i) acquire images of a region of a subject and label the images to generate a training dataset with the images; ii) train a convolutional neural network with the training dataset; iii) classify unlabeled images using the trained network; iv) determine if a performance threshold is exceeded for the classified images; and v) refine the dataset if the threshold is not exceeded by using a variational autoencoder to label the unlabeled images to create labeled images and updating the dataset with the labeled images.
  • a method for organ classification for unstructured medical datasets.
  • the method includes acquiring images of a region of a subject and labeling the images to generate a training dataset with the images.
  • the method may also include training a network, such as a convolutional neural network, with the training dataset.
  • a region in the images may be classified using the trained network.
  • the classified images may be segmented using the convolutional neural network to generate segmented images that distinguish between at least two different organs in the classified regions in the images.
  • a report may be generated of a calculated radiation dose for at least one of the organs in the segmented images.
  • FIG. 1 is a schematic diagram of one system in accordance with the present disclosure.
  • FIG. 2 is a schematic diagram showing further details of one, non-limiting example of the system of FIG. 1 .
  • FIG. 3 is a flowchart setting forth some examples of steps for a process in accordance with one aspect of the disclosure.
  • FIG. 4 is a flowchart setting forth some non-limiting examples of steps for a process for utilizing an autoencoder network with four convolutional and deconvolution layers in accordance with one aspect of the present disclosure.
  • FIG. 5 is a graphic illustration of forward and backward propagation using one configuration of an autoencoder where latent and generative losses are minimized in accordance with the present disclosure.
  • FIG. 6A is an image of a coronal reconstruction of a whole-body CT with each body region identified in accordance with the present disclosure.
  • FIG. 6B is a panel of axial image slices corresponding to the body regions identified in FIG. 6A in accordance with the present disclosure.
  • FIG. 7 is a graph providing a scatterplot of feature representations projected onto 2D latent space of a convolutional variational autoencoder in accordance with the present disclosure.
  • FIG. 8 is a series of correlated graphs of 6 example snowball sampling method reflecting increasing accuracy for increasing iterations in accordance with the present disclosure.
  • FIG. 9A is a graph of examples of classification accuracy versus a number of snowball sampling iterations in accordance with the present disclosure.
  • FIG. 9B is a graph of classification accuracy versus training data size for comparing one configuration of a tuned convolutional network with and without snowball sampling in accordance with the present disclosure.
  • FIG. 10A is an image of a circle whose area is the same as that of a patient cross section from FIG. 10B and which may be used to measure a patient effective diameter in accordance with the present disclosure.
  • FIG. 10B is an example CT image of a patient cross section.
  • FIG. 11 is a flowchart setting forth some non-limiting examples of steps for one configuration of an organ dose estimation method in accordance with the present disclosure.
  • the present disclosure provides systems and methods for supervised and unsupervised learning schemes that may be used to automatically label medical images for use in deep learning applications.
  • Large labeled datasets may be generated from a small initial training set using an iterative snowball sampling scheme.
  • a machine learning powered automatic organ classifier for imaging datasets, such as CT datasets, with a deep convolutional neural network (CNN) followed by an organ dose calculation is also provided. This technique can be used for patient-specific organ dose estimation since the locations and sizes of organs for each patient can be calculated independently.
  • a desired classification accuracy may be achieved with a minimal labeling process.
  • the automatic labeling system may include a variational autoencoder (VAE) for the purpose of feature representation, Gaussian mixture models (GMMs) for clustering and refining of mislabeled classes, and deep convolutional neural network (DCNN) for classification.
  • the system and method can also quickly and efficiently identify an organ of interest with higher accuracy than the current text-based body part information in digital imaging and communications in medicine (DICOM) headers.
  • the method selects candidates, classifies them by the DCNN, and then fully refines them by learning features from a VAE and clustering the features by GMM.
  • a computing device 110 can receive multiple types of image data from an image source 102 .
  • the computing device 110 can execute at least a portion of an automatic image labelling system 104 to automatically determine whether a feature is present in images of a subject.
  • the computing device 110 can communicate information about image data received from the image source 102 to a server 120 over a communication network 108 , which can execute at least a portion of the automatic image labelling system 104 to automatically determine whether a feature is present in images of a subject.
  • the server 120 can return information to the computing device 110 (and/or any other suitable computing device) indicative of an output of the automatic image labelling system 104 to determine whether a feature is present or absent.
  • the computing device 110 and/or server 120 can be any suitable computing device or combination of devices, such as a desktop computer, a laptop computer, a smartphone, a tablet computer, a wearable computer, a server computer, a virtual machine being executed by a physical computing device, etc.
  • the automatic image labelling system 104 can extract features from labeled (e.g., labeled as including a condition or disease, or normal) image data using a CNN trained as a general image classifier, and can perform a correlation analysis to calculate correlations between the features corresponding to the image data and a database.
  • the labeled data can be used to train a classification model, such as a support vector machine (SVM), to classify features as indicative of a disease or a condition, or as indicative of normal.
  • the automatic image labelling system 104 can provide features for unlabeled image data to the trained classification model.
  • the image source 102 can be any suitable source of image data, such as an MRI, CT, ultrasound, PET, SPECT, x-ray, or another computing device (e.g., a server storing image data), and the like.
  • the image source 102 can be local to the computing device 110 .
  • the image source 102 can be incorporated with the computing device 110 (e.g., the computing device 110 can be configured as part of a device for capturing and/or storing images).
  • the image source 102 can be connected to the computing device 110 by a cable, a direct wireless link, or the like.
  • the image source 102 can be located locally and/or remotely from the computing device 110 , and can communicate image data to the computing device 110 (and/or server 120 ) via a communication network (e.g., the communication network 108 ).
  • the communication network 108 can be any suitable communication network or combination of communication networks.
  • the communication network 108 can include a Wi-Fi network (which can include one or more wireless routers, one or more switches, etc.), a peer-to-peer network (e.g., a Bluetooth network), a cellular network (e.g., a 3G network, a 4G network, etc., complying with any suitable standard, such as CDMA, GSM, LTE, LTE Advanced, WiMAX, etc.), a wired network, etc.
  • the communication network 108 can be a local area network, a wide area network, a public network (e.g., the Internet), a private or semi-private network (e.g., a corporate or university intranet), other suitable type of network, or any suitable combination of networks.
  • Communications links shown in FIG. 1 can each be any suitable communications link or combination of communications links, such as wired links, fiber optic links, Wi-Fi links, Bluetooth links, cellular links, etc.
  • FIG. 2 shows an example of hardware 200 that can be used to implement the image source 102 , computing device 110 , and/or server 120 in accordance with some aspects of the disclosed subject matter.
  • the computing device 110 can include a processor 202 , a display 204 , one or more inputs 206 , one or more communication systems 208 , and/or memory 210 .
  • the processor 202 can be any suitable hardware processor or combination of processors, such as a central processing unit (CPU), a graphics processing unit (GPU), etc.
  • the display 204 can include any suitable display devices, such as a computer monitor, a touchscreen, a television, etc.
  • the inputs 206 can include any of a variety of suitable input devices and/or sensors that can be used to receive user input, such as a keyboard, a mouse, a touchscreen, a microphone, and the like.
  • the communications systems 208 can include a variety of suitable hardware, firmware, and/or software for communicating information over the communication network 108 and/or any other suitable communication networks.
  • the communications systems 208 can include one or more transceivers, one or more communication chips and/or chip sets, etc.
  • the communications systems 208 can include hardware, firmware and/or software that can be used to establish a Wi-Fi connection, a Bluetooth connection, a cellular connection, an Ethernet connection, etc.
  • the memory 210 can include any suitable storage device or devices that can be used to store instructions, values, etc., that can be used, for example, by the processor 202 to present content using the display 204 , to communicate with the server 120 via the communications system(s) 208 , and the like.
  • the memory 210 can include any of a variety of suitable volatile memory, non-volatile memory, storage, or any suitable combination thereof.
  • the memory 210 can include RAM, ROM, EEPROM, one or more flash drives, one or more hard disks, one or more solid state drives, one or more optical drives, etc.
  • the memory 210 can have encoded thereon a computer program for controlling operation of the computing device 110 .
  • the processor 202 can execute at least a portion of the computer program to present content (e.g., MRI images, user interfaces, graphics, tables, and the like), receive content from the server 120 , transmit information to the server 120 , and the like.
  • the server 120 can include a processor 212 , a display 214 , one or more inputs 216 , one or more communications systems 218 , and/or memory 220 .
  • the processor 212 can be a suitable hardware processor or combination of processors, such as a CPU, a GPU, and the like.
  • the display 214 can include any suitable display devices, such as a computer monitor, a touchscreen, a television, and the like.
  • the inputs 216 can include any suitable input devices and/or sensors that can be used to receive user input, such as a keyboard, a mouse, a touchscreen, a microphone, and the like.
  • the communications systems 218 can include suitable hardware, firmware, and/or software for communicating information over the communication network 108 and/or any other suitable communication networks.
  • the communications systems 218 can include one or more transceivers, one or more communication chips and/or chip sets, and the like.
  • the communications systems 218 can include hardware, firmware and/or software that can be used to establish a Wi-Fi connection, a Bluetooth connection, a cellular connection, an Ethernet connection, and the like.
  • the memory 220 can include any suitable storage device or devices that can be used to store instructions, values, and the like, that can be used, for example, by the processor 212 to present content using the display 214 , to communicate with one or more computing devices 110 , and the like.
  • the memory 220 can include any of a variety of suitable volatile memory, non-volatile memory, storage, or any suitable combination thereof.
  • the memory 220 can include RAM, ROM, EEPROM, one or more flash drives, one or more hard disks, one or more solid state drives, one or more optical drives, and the like.
  • the memory 220 can have encoded thereon a server program for controlling operation of the server 120 .
  • the processor 212 can execute at least a portion of the server program to transmit information and/or content (e.g., MRI data, results of automatic diagnosis, a user interface, and the like) to one or more computing devices 110 , receive information and/or content from one or more computing devices 110 , receive instructions from one or more devices (e.g., a personal computer, a laptop computer, a tablet computer, a smartphone, and the like), and the like.
  • the image source 102 can include a processor 222 , imaging components 224 , one or more communications systems 226 , and/or memory 228 .
  • processor 222 can be any suitable hardware processor or combination of processors, such as a CPU, a GPU, and the like.
  • the imaging components 224 can be any suitable components to generate image data corresponding to one or more imaging modes (e.g., T1 imaging, T2 imaging, fMRI, and the like).
  • An example of an imaging machine that can be used to implement the image source 102 can include a conventional MRI scanner (e.g., a 1.5 T scanner, a 3 T scanner), a high field MRI scanner (e.g., a 7 T scanner), an open bore MRI scanner, a CT system, an ultrasound scanner and the like.
  • the image source 102 can include any suitable inputs and/or outputs.
  • the image source 102 can include input devices and/or sensors that can be used to receive user input, such as a keyboard, a mouse, a touchscreen, a microphone, a trackpad, a trackball, hardware buttons, software buttons, and the like.
  • the image source 102 can include any suitable display devices, such as a computer monitor, a touchscreen, a television, etc., one or more speakers, and the like.
  • the communications systems 226 can include any suitable hardware, firmware, and/or software for communicating information to the computing device 110 (and, in some embodiments, over the communication network 108 and/or any other suitable communication networks).
  • the communications systems 226 can include one or more transceivers, one or more communication chips and/or chip sets, and the like.
  • the communications systems 226 can include hardware, firmware and/or software that can be used to establish a wired connection using any suitable port and/or communication standard (e.g., VGA, DVI video, USB, RS-232, and the like), Wi-Fi connection, a Bluetooth connection, a cellular connection, an Ethernet connection, and the like.
  • the memory 228 can include any suitable storage device or devices that can be used to store instructions, values, image data, and the like, that can be used, for example, by the processor 222 to: control the imaging components 224 , and/or receive image data from the imaging components 224 ; generate images; present content (e.g., MRI images, a user interface, and the like) using a display; communicate with one or more computing devices 110 ; and the like.
  • the memory 228 can include any suitable volatile memory, non-volatile memory, storage, or any suitable combination thereof.
  • the memory 228 can include RAM, ROM, EEPROM, one or more flash drives, one or more hard disks, one or more solid state drives, one or more optical drives, and the like.
  • the memory 228 can have encoded thereon a program for controlling operation of the image source 102 .
  • the processor 222 can execute at least a portion of the program to generate images, transmit information and/or content (e.g., MRI image data) to one or more of the computing devices 110 , receive information and/or content from one or more computing devices 110 , receive instructions from one or more devices (e.g., a personal computer, a laptop computer, a tablet computer, a smartphone, and the like), and the like.
  • Referring to FIG. 3 , a flowchart is provided setting forth some non-limiting example steps for a method of automatically classifying unstructured imaging data in accordance with the present disclosure.
  • the present disclosure provides an iterative snowball sampling that allows for the accurate classification of unstructured imaging data without the need for extensive training datasets that include costly human-annotated information.
  • an initial seed sample size is selected at step 310 .
  • a training dataset is generated at step 320 with sampled data, which is used to train the convolutional neural network at step 330 .
  • the convolutional network is a deep convolutional neural network (DCNN).
  • the trained network classifies unlabeled data at step 340 with the performance evaluated at step 350 . If the desired performance is achieved, then the process may end. For example, at step 350 , the system may evaluate whether the network's ability to identify a feature in an image exceeds a defined threshold of speed, classification accuracy, reproducibility, efficacy, or other performance metric.
  • the labeled data may be refined at step 360 , by determining if a confidence value for the data is above a certain threshold at step 370 .
  • a confidence value may be the same as the desired performance and use the same metrics, or a confidence value may be a classification accuracy that describes the percentage of the time or the frequency with which an image feature or region is identified correctly. If the confidence value does not exceed the threshold, then the network may be used to re-classify the data by repeating the process at step 340 . If the confidence value is exceeded, then a VAE with a model, such as a GMM, may be used at step 380 to add the data back into the training dataset and repeat the process from step 320 .
  • the initial seed annotated data sets used to train the DCNN may be small.
  • the generated labeled data may contain errors in classification or may be generally unstructured.
  • steps 340 - 380 may be implemented in an architecture that includes the VAE and GMMs and implements a snowball sampling algorithm to refine the classification.
  • the VAE represents features of the candidate annotated data set (of size m) in latent space with a Gaussian distribution.
  • the GMMs may then conduct binary clustering within each class across the annotated candidate datasets. Between two clusters consisting of mean and variance vectors, a user may choose the cluster (c*) which is closest to the cluster center (c) of the selected seed sample.
  • the data set with the size m closest to the cluster center (c*) may be selected.
  • the VAE extracts generic features from each cluster and the GMM improves clustering accuracy. This iterative data curation process can increase the quantity and improve the quality of the annotated dataset, and may be repeated for each annotated class. A minimal sketch of the refinement loop is given below.
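  • As an illustrative, non-authoritative sketch of the refinement loop described above, the binary GMM clustering and seed-proximity selection could be implemented roughly as follows. The helper functions train_dcnn, predict_dcnn, and encode_with_vae are hypothetical placeholders for the networks discussed in this disclosure; only the GMM-based selection uses a concrete library call (scikit-learn).

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def refine_candidates(seed_feats, cand_feats):
    """Binary GMM clustering of VAE features for one annotated class.

    seed_feats : (n_seed, 128) latent features of the trusted seed images
    cand_feats : (m, 128)      latent features of DCNN-labeled candidate images
    Returns a boolean mask selecting the cluster c* closest to the seed center c.
    """
    gmm = GaussianMixture(n_components=2, covariance_type="diag", random_state=0)
    labels = gmm.fit_predict(cand_feats)
    seed_center = seed_feats.mean(axis=0)
    best = int(np.argmin(np.linalg.norm(gmm.means_ - seed_center, axis=1)))
    return labels == best

def snowball_iteration(train_x, train_y, unlabeled_x, seed_feats_by_class, classes,
                       confidence_threshold=0.9):
    """One snowball pass: train, classify, refine, and grow the training pool."""
    model = train_dcnn(train_x, train_y)                    # step 330 (placeholder)
    pred, conf = predict_dcnn(model, unlabeled_x)           # step 340 (placeholder)
    for c in classes:
        idx = np.where((pred == c) & (conf > confidence_threshold))[0]
        if idx.size == 0:
            continue
        feats = encode_with_vae(unlabeled_x[idx])           # VAE features (placeholder)
        keep = idx[refine_candidates(seed_feats_by_class[c], feats)]
        train_x = np.concatenate([train_x, unlabeled_x[keep]])
        train_y = np.concatenate([train_y, np.full(keep.size, c)])
    return train_x, train_y                                 # fed back into step 320
```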
  • The learning curve may be modeled as y_p = f(x_p; b), with x = [x_1, x_2, . . . , x_N]^T, y = [y_1, y_2, . . . , y_N]^T, and b = [b_1, b_2, b_3]^T, where N is the number of classes and b_1, b_2, and b_3 represent the bias, learning rate, and decay rate, respectively. Here t_p is the desired output when the input is x_p, y_p = f(x_p; b) is the model's output when the input is x_p, r_p(b) = t_p - y_p is the residual between t_p and y_p, and R is the matrix form of the residuals, so that the weighted objective to be minimized is R^T W R = Σ_p w_p r_p(b)^2.
  • the weight terms w_p in the diagonal matrix W can be determined by the application. In some settings, the weighted nonlinear least-squares estimator may be more appropriate than a regular nonlinear regression method for fitting the learning curve when the measurement errors do not all have the same variance.
  • Classification accuracy using relatively large sizes of training sets may have a lower variance than when using smaller sample sizes (such as 5, 10, 20, and 50).
  • the learning curve may therefore be fitted by higher weighting values at the points of larger data set sizes.
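  • For illustration only, such a weighted nonlinear least-squares fit can be carried out with SciPy as sketched below. The inverse power-law functional form and the accuracy values are assumptions made for the example, not values reported in this disclosure; curve_fit's sigma argument encodes the per-point weighting (sigma = 1/sqrt(w)).

```python
import numpy as np
from scipy.optimize import curve_fit

# Assumed three-parameter learning-curve form: b1 = bias (plateau),
# b2 = learning rate, b3 = decay rate.
def learning_curve(x, b1, b2, b3):
    return b1 - b2 * np.power(x, -b3)

sizes = np.array([5, 10, 20, 50, 100, 200], dtype=float)   # training images per class
acc = np.array([0.62, 0.74, 0.85, 0.93, 0.96, 0.97])       # hypothetical accuracies
weights = sizes / sizes.max()                               # weight larger sets more heavily

params, _ = curve_fit(learning_curve, sizes, acc,
                      p0=[1.0, 1.0, 0.5], sigma=1.0 / np.sqrt(weights))
print("fitted (b1, b2, b3):", params)
print("predicted plateau accuracy:", learning_curve(1e6, *params))
```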
  • an autoencoder can be created that has two complementary networks consisting of an encoder and a decoder.
  • the encoder has a multilayer perceptron neural network allowing it to map input x to a latent representation z, and the decoder maps the latent variable z back to a reconstructed input value x̂.
  • Generative loss describes how accurately the decoder network reconstructs images (x̂) from a latent vector z, and latent loss is derived from how far the encoded distribution q_φ(z|x) departs from the prior distribution over z.
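  • Written out explicitly (this is the standard variational autoencoder objective, stated here for clarity rather than quoted from this disclosure), the two loss terms are:

```latex
\mathcal{L}(\theta, \phi; x) =
\underbrace{\mathbb{E}_{q_{\phi}(z \mid x)}\!\left[-\log p_{\theta}(x \mid z)\right]}_{\text{generative (reconstruction) loss}}
+ \underbrace{D_{\mathrm{KL}}\!\left(q_{\phi}(z \mid x)\,\|\,p(z)\right)}_{\text{latent loss}},
\qquad p(z) = \mathcal{N}(0, I)
```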
  • the iterative snowball sampling process for automatic labeling of training data may also be expressed as:
  • the system learns features of input images using a VAE as an unsupervised learning representation, clusters the features by Gaussian mixture models, and annotates the images by refining candidates pre-labeled or pre-classified by the deep convolutional neural network as supervised learning.
  • the DCNN may be trained for body region classification using a small seed training dataset to create larger annotated training datasets by snowball iterative sampling, leading to higher final accuracy.
  • the system may be used to classify images for any part of the human anatomy, and may be used to classify images beyond restricted regions.
  • a database of CT images was compiled from the clinical PACS at a quaternary referral hospital. Preprocessing software was developed to annotate and categorize these images into 6 different body regions: brain, neck, shoulder, chest, abdomen, and pelvis. Only images that could be clearly defined as one of the aforementioned body regions were used. The intervening areas were excluded from training due to their lack of clear regional definition.
  • Each CT examination has a different noise level because of varying radiation dosages, image reconstruction filters, and CT vendors.
  • Image voxels may also have varying pitches because of differences in the image reconstruction fields. The image slice thickness was greater than the in-plane voxel pitch, so the voxels are anisotropic.
  • the initial seed datasets represent the number of labeled examples used to train the DCNNs the first time.
  • An aim was to define the minimum number of cases per class required to annotate larger data sets with comparable accuracy to results from manually labeled conventional training datasets.
  • Any of a variety of DCNNs may be used.
  • GoogLeNet was selected as it is an efficient, highly performing DCNN. Testing was performed using the NVIDIA Deep Learning GPU Training System (DIGITS) on a DevBox to train the model using each experimental dataset.
  • GoogLeNet uses 22 convolutional layers including 9 Inception modules and 4 different kernel filters (7×7, 5×5, 3×3, and 1×1). The convolutional filters were trained using a stochastic gradient descent (SGD) algorithm with a base learning rate of 0.001, decreased in three steps based on stable convergence of the loss function.
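  • The original training was run through NVIDIA DIGITS; as a rough, non-authoritative stand-in, an equivalent optimizer configuration in PyTorch might look like the following. The torchvision model, the momentum value, and the milestone epochs are illustrative assumptions, not settings stated in this disclosure.

```python
import torch
import torchvision

# Illustrative fine-tuning setup: GoogLeNet with SGD, base learning rate 0.001,
# decreased in three steps (the milestone epochs below are assumptions).
model = torchvision.models.googlenet(weights="DEFAULT")
model.fc = torch.nn.Linear(model.fc.in_features, 6)   # 6 body-region classes

optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[30, 60, 90], gamma=0.1)
criterion = torch.nn.CrossEntropyLoss()

# Typical epoch loop (train_loader yields 3-channel 224x224 crops and labels):
# for epoch in range(num_epochs):
#     for images, labels in train_loader:
#         optimizer.zero_grad()
#         loss = criterion(model(images), labels)
#         loss.backward()
#         optimizer.step()
#     scheduler.step()
```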
  • the customized VAE was constructed to represent the features of the selected samples.
  • the VAE contains four convolutional and deconvolutional layers functioning as encoders and decoders, respectively.
  • Each convolutional layer in the current example had 64 kernel filters (3 ⁇ 3 size) followed by max pooling (2 ⁇ 2).
  • Input images (downscaled to 64 ⁇ 64 from 512 ⁇ 512 for computational efficiency) were compressed to 128-dimensional feature spaces in a Gaussian distribution, ultimately reconstructing the input image using deconvolutional layers and up-sampling layers.
  • the convolutional VAE was implemented using the Keras deep learning library running on a TensorFlow backend. After training the VAE, only the encoder was used as a feature representation, feeding the features into the inputs of Gaussian mixture models (GMMs). Two clusters each having 128-dimensional Gaussian distributions were generated for each snowball iteration. The cluster (c*) which had the closest distance to the cluster center of the Gaussian distribution of the selected seed sample was selected.
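  • A minimal sketch of a convolutional VAE of this shape in Keras on a TensorFlow backend is given below. The layer counts, filter sizes, input size, and latent dimension follow the description above; the optimizer, activation functions, and loss weighting are assumptions made for the example.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

LATENT_DIM = 128  # 128-dimensional feature space, as described above

class Sampling(layers.Layer):
    """Reparameterization trick: draw z from N(z_mean, exp(z_log_var))."""
    def call(self, inputs):
        z_mean, z_log_var = inputs
        eps = tf.random.normal(tf.shape(z_mean))
        return z_mean + tf.exp(0.5 * z_log_var) * eps

class VAELoss(layers.Layer):
    """Adds the generative (reconstruction) and latent (KL) losses."""
    def call(self, inputs):
        x, x_hat, z_mean, z_log_var = inputs
        generative = tf.reduce_mean(tf.reduce_sum(
            keras.losses.binary_crossentropy(x, x_hat), axis=(1, 2)))
        latent = -0.5 * tf.reduce_mean(tf.reduce_sum(
            1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var), axis=1))
        self.add_loss(generative + latent)
        return x_hat

# Encoder: four 3x3 convolutional layers (64 filters), each followed by 2x2 max pooling.
enc_in = keras.Input(shape=(64, 64, 1))              # images downscaled to 64x64
x = enc_in
for _ in range(4):
    x = layers.Conv2D(64, 3, padding="same", activation="relu")(x)
    x = layers.MaxPooling2D(2)(x)
x = layers.Flatten()(x)
z_mean = layers.Dense(LATENT_DIM)(x)
z_log_var = layers.Dense(LATENT_DIM)(x)
z = Sampling()([z_mean, z_log_var])
encoder = keras.Model(enc_in, z_mean, name="encoder")    # used alone after training

# Decoder: mirror the encoder with deconvolutional and up-sampling layers.
dec_in = keras.Input(shape=(LATENT_DIM,))
y = layers.Dense(4 * 4 * 64, activation="relu")(dec_in)
y = layers.Reshape((4, 4, 64))(y)
for _ in range(4):
    y = layers.UpSampling2D(2)(y)
    y = layers.Conv2DTranspose(64, 3, padding="same", activation="relu")(y)
dec_out = layers.Conv2DTranspose(1, 3, padding="same", activation="sigmoid")(y)
decoder = keras.Model(dec_in, dec_out, name="decoder")

vae_out = VAELoss()([enc_in, decoder(z), z_mean, z_log_var])
vae = keras.Model(enc_in, vae_out, name="vae")
vae.compile(optimizer="adam")
# vae.fit(images, epochs=50, batch_size=64)     # images: (N, 64, 64, 1) scaled to [0, 1]
# features = encoder.predict(images)            # 128-d features fed to the GMMs
```

  • In the sketch above, the standalone encoder returns the mean of the latent distribution rather than a sampled z; that deterministic output is a convenient choice for the feature representation handed to the Gaussian mixture models.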
  • Unlabeled datasets of 5,000 images per class were initially annotated by the DCNNs that were trained with labeled seed data (e.g., 5 examples per class).
  • This automatic labeling procedure was conducted according to six different seed sizes (5, 10, 20, 50, 100, and 200/class). During each iteration, annotated image data was added to the next training data pool so that classification accuracy increased gradually. Mislabeled classes were significantly reduced after refining the training.
  • the size of the initial seed influences the overall classification performance, with diminishing returns after 50 cases per class.
  • Each experiment was repeated 10 times by randomly selecting seed samples from labeled training datasets.
  • the trained model was then tested by introducing 1,000 new images of each body class. A total of 6,000 images were used for the performance evaluation in the present example.
  • classification accuracy was at or near 100%. Although the system was not trained on images of transition regions, it was able to infer these areas with considerable accuracy. The network was able to extract and identify similar features at the level of the Inception module despite wide ranges of normal variation in the same anatomic region.
  • generative loss describes how accurately the decoder network g(z) 560 reconstructs images (x̂) 540 from input x 530 and encoder network f(x) 500 through a latent vector z 550 , where latent loss is derived from the encoded distribution q_φ(z|x).
  • When the encoded distribution approximates the prior over z, the latent loss 520 is close to zero. Even though the composed neural network has many unknown weights to estimate, the simple cascade structure of the multilayer neural network makes it possible to improve accuracy by iteration.
  • the input is passed through the forward function to calculate the loss function; the prediction errors are then backpropagated to improve system performance.
  • FIG. 6A depicts a whole-body CT image 600 , such as may be acquired and provided to the above-described systems.
  • body part regions 610 can be classified and labeled. Labeling of the data may be performed with a neural network, and refining of the dataset used to train the neural network may be as described above. Then, as illustrated in FIG. 6B , axial CT images corresponding to the body part regions 610 can be selected and labeled.
  • a brain region 620 has a corresponding axial image 625
  • a neck 630 has axial image 635
  • a shoulder 640 has axial image 645
  • a chest 650 has axial image 655
  • an abdomen 660 has axial image 665
  • a pelvis 670 has axial image 675 .
  • Any number of regions 610 may be identified for a subject, and any number of corresponding axial images may be used.
  • each cluster of data represents different body regions.
  • 128-dimensional latent representations of 6000 cases classified by the convolutional VAE using 200 cases per class were visualized and resulted in 6 body region clusters with areas of overlap.
  • This form of scatter plot display may be used to aid an automated routine in identifying what data corresponds to what body region by accounting for data clustering.
  • Referring to FIG. 8 , examples of fine-tuned DCNN classification accuracy and the number of mislabeled classes during ten snowball sampling iterations, before and after the refining process by GMMs, are shown for six different initial seed sizes. Varying training datasets were annotated using ten snowball iterations from varying initial seeds (5, 10, 20, 50, 100, and 200). During each iteration, annotated image data was added to the next training data pool so that classification accuracy increased gradually, as can be seen in FIG. 8 .
  • Referring to FIGS. 9A and 9B , examples of classification accuracy are plotted with respect to the size of the training data or class.
  • FIG. 9A reflects how the fine-tuned DCNN with transfer learning performs better than the DCNN trained from scratch with random weight initialization. Classification accuracy increased rapidly from seed sizes 5 to 50, while accuracy did not increase significantly from seed size 100 to 200. At this point, the learning curve reached a steady state and did not significantly change in accuracy regardless of the seed size. The learning curve predicted 98% classification accuracy with the observed accuracy at 97.25%.
  • FIG. 9A depicts an example learning curve fit to classification accuracy.
  • FIG. 9B depicts an example of classification accuracy by fine-tuned DCNN model without and with the addition of the snowball sampling iteration.
  • CTDI: computed tomography dose index. PMMA: polymethyl methacrylate.
  • a method is provided for machine-learning-powered personalized patient organ dose estimation, which may take the form of software that estimates each patient's unique organ size and shape.
  • the method can be used to enable organ detection, segmentation, and volume estimation (lungs, liver, kidneys, urinary bladder, muscles, and the like), which may then be used to control or optimize the level of radiation dose to the patients.
  • CTDI vol may be measured to indicate the CT scanner output. Generally, the CTDI vol measurement is conducted by imaging a 16 cm diameter phantom for head and a 32 cm diameter phantom for body in a given patient CT scan. It is measured by following standard protocols.
  • the CTDIvol may be denoted as:

    CTDIvol = CTDIw / pitch

    where CTDIw is the weighted CTDI, n is the number of tomographic sections imaged in a single axial scan (equal to the number of data channels used), T is the width of the tomographic section along the z-axis imaged by one data channel, and pitch is the ratio of the table feed per rotation to the total nominal beam width (n × T).
  • CTDIvol may be provided by commercial CT scanner manufacturers.
  • the value from a GE LightSpeed VCT scanner is used to estimate organ dose.
  • the corresponding scan parameters have 120 kVp tube voltage, 0.98 mm pitch, 0.5 second(s) rotation time, and 40 mm collimation.
  • the normalized CTDIw (denoted by nCTDIw) of the ImPACT CT dosimetry calculator was 9.5 mGy/100 mAs, so that CTDIw is calculated as nCTDIw × mA × s / 100, and finally CTDIvol is determined by dividing CTDIw by the pitch in the above equation.
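  • As a small worked example (the tube current below is illustrative; it is not a value stated in this disclosure), the calculation described above can be written directly in code:

```python
def ctdi_vol(n_ctdi_w_per_100mas, tube_current_ma, rotation_time_s, pitch):
    """CTDIvol from the scanner's normalized weighted CTDI (nCTDIw).

    n_ctdi_w_per_100mas : nCTDIw in mGy per 100 mAs (9.5 for the scanner above)
    """
    ctdi_w = n_ctdi_w_per_100mas * tube_current_ma * rotation_time_s / 100.0
    return ctdi_w / pitch

# Scan parameters given above (120 kVp, 0.98 pitch, 0.5 s rotation); 200 mA is assumed.
print(ctdi_vol(9.5, tube_current_ma=200.0, rotation_time_s=0.5, pitch=0.98))
```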
  • a patient effective diameter may be measured by automatic calculation based on reconstructed axial CT images.
  • FIG. 10B shows a CT image of a patient cross section.
  • FIG. 10A shows the effective diameter, which is defined as the diameter of the circle whose area is the same as that of the patient cross section, assuming the patient has an elliptical cross section as indicated in FIGS. 10A and 10B .
  • Effective Diameter = √(AP × LAT)   (6)
  • the anterior-posterior (AP) dimension 1010 and the lateral (LAT) dimension 1020 represent the thickness and the side-to-side dimension of the body part being scanned, respectively.
  • binary morphologic image techniques, such as image dilation and erosion, may be used to estimate the circle of area equal to that of the patient cross section.
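  • A hedged sketch of this measurement on a single axial slice is shown below; the Hounsfield-unit threshold, the use of a binary closing in place of explicit dilation/erosion, and the assumption of isotropic pixel spacing are all simplifications made for illustration.

```python
import numpy as np
from scipy import ndimage

def effective_diameter(ct_hu, pixel_spacing_mm=1.0, air_threshold=-400):
    """Estimate the patient effective diameter (Eq. 6) from one axial CT slice.

    ct_hu            : 2-D array of Hounsfield units
    pixel_spacing_mm : in-plane pixel pitch, assumed isotropic here
    """
    body = ct_hu > air_threshold                         # everything denser than air
    body = ndimage.binary_closing(body, iterations=3)    # close gaps at the skin line
    body = ndimage.binary_fill_holes(body)               # fill lungs, bowel gas, etc.

    ap = np.any(body, axis=1).sum() * pixel_spacing_mm    # anterior-posterior extent
    lat = np.any(body, axis=0).sum() * pixel_spacing_mm   # lateral extent
    return np.sqrt(ap * lat)                              # Eq. (6)
```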
  • the ImPACT CT dosimetry calculator uses normalized organ conversion factors obtained from a Monte Carlo simulation of a mathematical phantom. It is therefore limited when calculating the organ dose of an actual patient, whose body weight and size, as well as organ shapes and volumes, are unique. In one configuration, to estimate patient organ dose across a range of patient weights, a correction factor (CF) may be applied for each organ using patient clinical data provided by two different manufacturers. The CF is calculated as:
  • D^n_T,RD and D^n_T,IM are the organ doses normalized by CTDIvol, which may be provided by the vendor as described previously, from the eXposure organ dose software of Radimetrics (RD) and from ImPACT (IM), respectively.
  • An automated program may be used to extract CT dose information from DICOM metadata and image data at step 1110 .
  • the DICOM dose report may be retrieved from a PACS, for example.
  • the report may include CTDI vol , dose-length product (DLP), tube current, tube voltage, exposure time, and collimation.
  • the scanner information and scan parameters extracted at step 1120 , together with dose-relevant indexes of the CT examinations and the body part classified by machine learning at step 1130 , may be used to calculate the organ dose with the ImPACT CT dosimetry calculator at step 1140 .
  • the DICOM data may be converted to an image at step 1150 .
  • the DICOM image may also be written to a standard 8-bit gray image format such as PNG or other formats.
  • a patient effective diameter is calculated at step 1160 , and a correction factor as discussed above may be calculated at step 1170 .
  • a coarse-tuning for organ dose estimation may be performed at step 1180 .
  • the converted images through scan ranges may be fed to the inputs of a machine learning network, such as a deep convolutional neural net, for use in identifying and segmenting patient organs at step 1190 .
  • Organ dose estimates may then be fine-tuned at step 1195 once the organ has been identified and properly segmented by attributing the dose more specifically to the appropriate organs.
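  • As a hedged sketch of steps 1110 and 1150, the DICOM dose-field extraction and 8-bit image conversion might look like the snippet below. The attribute names are standard DICOM keywords, but their availability varies by scanner, and a complete implementation would also parse the Radiation Dose Structured Report retrieved from PACS; the soft-tissue window settings are illustrative assumptions.

```python
import numpy as np
import pydicom

def extract_dose_info(dicom_path):
    """Pull dose-relevant fields from a CT image header (step 1110)."""
    ds = pydicom.dcmread(dicom_path)
    return {
        "CTDIvol": float(getattr(ds, "CTDIvol", 0.0)),
        "kvp": float(getattr(ds, "KVP", 0.0)),
        "tube_current_mA": float(getattr(ds, "XRayTubeCurrent", 0.0)),
        "exposure_time_ms": float(getattr(ds, "ExposureTime", 0.0)),
        "collimation_mm": float(getattr(ds, "TotalCollimationWidth", 0.0)),
    }

def dicom_to_8bit(dicom_path, window_center=40, window_width=400):
    """Window a CT slice and return it as an 8-bit gray image (step 1150)."""
    ds = pydicom.dcmread(dicom_path)
    hu = ds.pixel_array * float(ds.RescaleSlope) + float(ds.RescaleIntercept)
    lo = window_center - window_width / 2.0
    img = np.clip((hu - lo) / window_width, 0.0, 1.0)
    return (img * 255).astype(np.uint8)      # can be written out as PNG, etc.
```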
  • the ImPACT dosimetry calculator computed 23 organ doses for each patient.
  • the organ doses estimated by ImPACT were corrected by the correction factor (CF) based on a regression model representing the correlation between the ratio of normalized organ dose and the patient effective diameter.
  • the PODE may be fine-tuned by organ volume and shape through an organ segmentation step.
  • the method may automatically identify which patient organs were included in a given scan region for the ImPACT dosimetry calculation. Sixteen different organs were identified from axial views of CT images and were labeled. A 22-layer deep CNN using the NVIDIA Deep Learning GPU Training System (DIGITS) was trained and validated with a 646 CT scan dataset. The resulting classified organ was automatically mapped to the slab number of a mathematical hermaphrodite phantom to determine the scan range of the ImPACT CT dose calculator.
  • a GoogLeNet network using 22 convolutional layers including 9 inception modules and 4 sizes of basis or kernel filters (7 ⁇ 7, 5 ⁇ 5, 3 ⁇ 3, and 1 ⁇ 1) was used. 75% of images were used for training and 25% for validation.
  • the GoogLeNet was trained using the NVIDIA toolchain of DIGITS and the DevBox with four TITAN GPUs with 7 TFlops of single precision, 336.5 GB/s of memory bandwidth, and 12 GB of memory.
  • GoogLeNet was trained using a stochastic gradient descent (SGD) algorithm for 150 training epochs. Validation data sets were presented at every epoch during the training process. The initial learning rate was 0.01 and was decreased in three steps according to the convergence of the loss function.
  • FIG. 2 is a representative example of classification results of patient organs after chest CT segmentation.
  • the identified organs were labeled from HEAD (Thyroid) to TRUNK (Abdomen1), a region including both kidneys and liver. Based on organ classification, the corresponding scan range for the ImPACT CT dosimetry calculator was determined. For example, the identified thyroid region (HEAD5) was mapped to slab number 171/208 of the adult phantom.
  • the predicted organ location provided by the deep learning driven software also gives information about the volume of an organ in respect to a given scan region.
  • the thyroid can be identified in slices 10 to 138 of the present example with 99% accuracy, greatly improving patient-specific radiation dose estimation.
  • the organ dose normalized by CTDIvol may be assessed according to the patient effective diameter.
  • for the identified organs at a given scan region, the normalized dose may have a linear relationship with the effective diameter, whereas some organs, such as the brain, eye lenses, and salivary glands, may not be identified by a convolutional neural net classifier, so their organ doses are not correlated to the effective diameter.
  • the normalized dose coefficients decreased as the effective diameter increased for all identified organ regions.
  • a best-fit model for each organ may be obtained by a least-squares estimate (LSE), or by any other appropriate estimator; a minimal example is given below.
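  • For illustration, such a per-organ least-squares fit of the normalized dose coefficient against the effective diameter could be as simple as the following; the numeric values are hypothetical and stand in for the vendor clinical data described above.

```python
import numpy as np

# Hypothetical (effective diameter, normalized organ dose) pairs for one organ region.
eff_diam = np.array([18.0, 22.0, 26.0, 30.0, 34.0])    # cm
norm_dose = np.array([1.90, 1.60, 1.35, 1.15, 1.00])   # organ dose / CTDIvol

slope, intercept = np.polyfit(eff_diam, norm_dose, deg=1)   # least-squares line
coefficient_at_28cm = slope * 28.0 + intercept               # interpolated dose coefficient
```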
  • the model may be trained for more organ areas than have been disclosed in the examples, such as covering all organs used in the CT organ dose estimator.
  • These organs may include the pancreas, stomach, gall bladder, and colon, and may facilitate longitudinal organ-specific dose calculations.

Abstract

Supervised and unsupervised learning schemes may be used to automatically label medical images for use in deep learning applications. Large labeled datasets may be generated from a small initial training set using an iterative snowball sampling scheme. A machine learning powered automatic organ classifier for imaging datasets, such as CT datasets, with a deep convolutional neural network (CNN) followed by an organ dose calculation is also provided. This technique can be used for patient-specific organ dose estimation since the locations and sizes of organs for each patient can be calculated independently.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application claims the benefit of U.S. Provisional Patent Application Ser. No. 62/555,799, filed on Sep. 8, 2017, and entitled "A methodology for automated labeling and annotation for unstructured big medical datasets," and U.S. Provisional Patent Application Ser. No. 62/555,767, filed on Sep. 8, 2017, and entitled "Method and apparatus of machine learning based personalized organ dose estimation."
  • STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH
  • N/A
  • BACKGROUND
  • Diagnostic medical imaging has become central to the practice of modern medicine, and diagnostic examination volume has increased during the past decade. Additionally, as systems have become more advanced with higher resolution, the number of images in a given study has also increased. The increased demand for diagnostic imaging also presents a growing risk of human error and delayed diagnosis. While computer aided detection (CADe) and diagnosis (CADx) systems can reduce such problems, they remain limited by their reliance upon hand-crafted features. Deep-learning approaches sidestep this problem by extracting these features on their own. Recent advances in deep learning technology have enabled data-driven learning of nonlinear image filters and classifiers, which has improved detection and segmentation in multiple medical applications, including brain infarcts, automated bone age analysis, and skin lesion classification. Despite these advances, large-scale and well-labeled training datasets are essential for deep learning networks to learn representative and hierarchical abstractions.
  • This labeling requirement is inherently difficult to meet in the medical domain where medical expertise is expensive, labeling is tedious and time-consuming, and examples of certain disease pathologies may be rare. Several automated annotation approaches have been attempted on brain CT, brain MR, and other biomedical image modalities with various feature representation, clustering, and classification algorithms. These approaches are limited because only low-level visual features such as color, edges, and color layouts are extracted. Even with higher-level feature extraction from MR voxels by hierarchical learning using two-layer random forests, segmentation performance is not generally better than features extracted by deep convolutional neural networks. Furthermore, all current methods of annotating medical images still require mid- to large-sized labeled image datasets for obtaining the trained model.
  • Axial image location classification, which identifies the location of an image within a volumetric CT examination, is a fundamental step in many initial classification processes. Classifying the location is a challenging problem because the details of body regions can vary dramatically between patients, such as with brain gyral patterns, cervical vertebral anatomy, pulmonary vessels, and bowel distribution. Degenerative changes can also distort bony anatomy enough to confuse the network. As a result, large training datasets are commonly required for algorithms to achieve sufficient accuracy.
  • Body-part recognition is also important in automatic medical image analysis, as it is a prerequisite step for anatomy identification and organ segmentation. Accurate body-part classification facilitates organ detection and segmentation by reducing the search range for an organ of interest. Multiple techniques have been developed using multi-class random regression and decision forests to classify anatomical structures, ranging from 6 to 10 organs, on computed tomography (CT) scans. These classifiers can discriminate between similar structures such as the aortic arch and heart. However, these prior works focus on general anatomical body part classification.
  • Thus, high-quality training data is important for training neural networks and for unlocking the potential of neural networks to truly improve the clinical use of medical images. However, creating high-quality training datasets is expensive and time-consuming.
  • SUMMARY OF THE DISCLOSURE
  • The present disclosure addresses the aforementioned drawbacks by providing a system and method for using supervised and unsupervised learning schemes to automatically label medical images for use in subsequent deep learning applications. The system can generate a large labeled dataset from a small initial training set using an iterative snowball sampling scheme. A machine-learning powered, automatic organ classifier for imaging datasets, such as CT datasets, with a deep convolutional neural network (CNN) followed by an organ dose calculation is also provided. This technique can be used for patient-specific organ dose estimation because the locations and sizes of organs for each patient can be calculated independently, rather than relying on other simulation-based methods.
  • In one configuration, a method is provided for automatically processing unstructured medical imaging data to generate classified images. The method includes acquiring medical image data of a subject and subjecting the medical image data of the subject to a neural network to generate classified image data. The method may also include comparing the classified image data to a confidence test, and upon determining that the classified image data does not pass the confidence test, subjecting the classified image data to a variational autoencoder (VAE) that implements a snowball sampling algorithm to refine the classified image data by representing features of the classified image data into latent space with a Gaussian distribution. In some configurations, this is repeated until the classified image data passes the confidence test. Annotated images may then be generated from the classified image data.
  • In one configuration, a method is provided for automatic labeling and annotation for unstructured medical datasets with snowball sampling. The method includes acquiring images of a region of a subject and labeling the images to generate a training dataset with the images. The method also includes training a network, such as a convolutional neural network, with the training dataset and classifying unlabeled images using the trained network. The method may also include determining if a performance threshold is exceeded for the classified images. The dataset may be refined if the threshold is not exceeded by using a variational autoencoder to label the unlabeled images to create labeled images and updating the dataset with the labeled images.
  • In one configuration, a system is provided for automatic labeling and annotation for unstructured medical datasets from medical images with snowball sampling. The system includes a computer system configured to: i) acquire images of a region of a subject and label the images to generate a training dataset with the images; ii) train a convolutional neural network with the training dataset; iii) classify unlabeled images using the trained network; iv) determine if a performance threshold is exceeded for the classified images; and v) refine the dataset if the threshold is not exceeded by using a variational autoencoder to label the unlabeled images to create labeled images and updating the dataset with the labeled images.
  • In one configuration, a method is provided for organ classification for unstructured medical datasets. The method includes acquiring images of a region of a subject and labeling the images to generate a training dataset with the images. The method may also include training a network, such as a convolutional neural network, with the training dataset. A region in the images may be classified using the trained network. The classified images may be segmented using the convolutional neural network to generate segmented images that distinguish between at least two different organs in the classified regions in the images. A report may be generated of a calculated radiation dose for at least one of the organs in the segmented images.
  • The foregoing and other aspects and advantages of the present disclosure will appear from the following description. In the description, reference is made to the accompanying drawings that form a part hereof, and in which there is shown by way of illustration a preferred embodiment. This embodiment does not necessarily represent the full scope of the invention, however, and reference is therefore made to the claims and herein for interpreting the scope of the invention.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a schematic diagram of one system in accordance with the present disclosure.
  • FIG. 2 is a schematic diagram showing further details of one, non-limiting example of the system of FIG. 1.
  • FIG. 3 is a flowchart setting forth some examples of steps for a process in accordance with one aspect of the disclosure.
  • FIG. 4 is a flowchart setting forth some non-limiting examples of steps for a process for utilizing an autoencoder network with four convolutional and deconvolution layers in accordance with one aspect of the present disclosure.
  • FIG. 5 is a graphic illustration of forward and backward propagation using one configuration of an autoencoder where latent and generative losses are minimized in accordance with the present disclosure.
  • FIG. 6A is an image of a coronal reconstruction of a whole-body CT with each body region identified in accordance with the present disclosure.
  • FIG. 6B is a panel of axial image slices corresponding to the body regions identified in FIG. 6A in accordance with the present disclosure.
  • FIG. 7 is a graph providing a scatterplot of feature representations projected onto 2D latent space of a convolutional variational autoencoder in accordance with the present disclosure.
  • FIG. 8 is a series of correlated graphs of 6 example snowball sampling method reflecting increasing accuracy for increasing iterations in accordance with the present disclosure.
  • FIG. 9A is a graph of examples of classification accuracy versus a number of snowball sampling iterations in accordance with the present disclosure.
  • FIG. 9B is a graph of classification accuracy versus training data size for comparing one configuration of a tuned convolutional network with and without snowball sampling in accordance with the present disclosure.
  • FIG. 10A is an image of a circle whose area is the same as that of a patient cross section from FIG. 10B and which may be used to measure a patient effective diameter in accordance with the present disclosure.
  • FIG. 10B is an example CT image of a patient cross section.
  • FIG. 11 is a flowchart setting forth some non-limiting examples of steps for one configuration of an organ dose estimation method in accordance with the present disclosure.
  • DETAILED DESCRIPTION
  • The present disclosure provides systems and methods for supervised and unsupervised learning schemes that may be used to automatically label medical images for use in deep learning applications. Large labeled datasets may be generated from a small initial training set using an iterative snowball sampling scheme. A machine learning powered automatic organ classifier for imaging datasets, such as CT datasets, with a deep convolutional neural network (CNN) followed by an organ dose calculation is also provided. This technique can be used for patient-specific organ dose estimation since the locations and sizes of organs for each patient can be calculated independently.
  • In one configuration, a desired classification accuracy may be achieved with a minimal labeling process. Using an iterative snowball sampling approach, a large medical image dataset may be annotated automatically from a smaller training subset. The automatic labeling system may include a variational autoencoder (VAE) for feature representation, Gaussian mixture models (GMMs) for clustering and refining mislabeled classes, and a deep convolutional neural network (DCNN) for classification. The system and method can also quickly and efficiently identify an organ of interest at a higher accuracy when compared to the current text-based body part information in digital imaging and communications in medicine (DICOM) headers. In one configuration, the method selects candidates, classifies them by the DCNN, and then fully refines them by learning features from a VAE and clustering the features by GMM.
  • Referring to FIG. 1, an example of a system 100 is shown for automatically labeling images using image data in accordance with some aspects of the disclosed subject matter. As shown in FIG. 1, a computing device 110 can receive multiple types of image data from an image source 102. In some configurations, the computing device 110 can execute at least a portion of an automatic image labelling system 104 to automatically determine whether a feature is present in images of a subject.
  • Additionally or alternatively, in some configurations, the computing device 110 can communicate information about image data received from the image source 102 to a server 120 over a communication network 108, which can execute at least a portion of the automatic image labelling system 104 to automatically determine whether a feature is present in images of a subject. In such configurations, the server 120 can return information to the computing device 110 (and/or any other suitable computing device) indicative of an output of the automatic image labelling system 104 to determine whether a feature is present or absent.
  • In some configurations, the computing device 110 and/or server 120 can be any suitable computing device or combination of devices, such as a desktop computer, a laptop computer, a smartphone, a tablet computer, a wearable computer, a server computer, a virtual machine being executed by a physical computing device, etc. In some configurations, the automatic image labelling system 104 can extract features from labeled (e.g., labeled as including a condition or disease, or normal) image data using a CNN trained as a general image classifier, and can perform a correlation analysis to calculate correlations between the features corresponding to the image data and a database. In some embodiments, the labeled data can be used to train a classification model, such as a support vector machine (SVM), to classify features as indicative of a disease or a condition, or as indicative of normal. In some configurations, the automatic image labelling system 104 can provide features for unlabeled image data to the trained classification model.
  • In some configurations, the image source 102 can be any suitable source of image data, such as an MRI, CT, ultrasound, PET, SPECT, x-ray, or another computing device (e.g., a server storing image data), and the like. In some configurations, the image source 102 can be local to the computing device 110. For example, the image source 102 can be incorporated with the computing device 110 (e.g., the computing device 110 can be configured as part of a device for capturing and/or storing images). As another example, the image source 102 can be connected to the computing device 110 by a cable, a direct wireless link, or the like. Additionally or alternatively, in some configurations, the image source 102 can be located locally and/or remotely from the computing device 110, and can communicate image data to the computing device 110 (and/or server 120) via a communication network (e.g., the communication network 108).
  • In some configurations, the communication network 108 can be any suitable communication network or combination of communication networks. For example, the communication network 108 can include a Wi-Fi network (which can include one or more wireless routers, one or more switches, etc.), a peer-to-peer network (e.g., a Bluetooth network), a cellular network (e.g., a 3G network, a 4G network, etc., complying with any suitable standard, such as CDMA, GSM, LTE, LTE Advanced, WiMAX, etc.), a wired network, etc. In some configurations, the communication network 108 can be a local area network, a wide area network, a public network (e.g., the Internet), a private or semi-private network (e.g., a corporate or university intranet), other suitable type of network, or any suitable combination of networks. Communications links shown in FIG. 1 can each be any suitable communications link or combination of communications links, such as wired links, fiber optic links, Wi-Fi links, Bluetooth links, cellular links, etc.
  • FIG. 2 shows an example of hardware 200 that can be used to implement the image source 102, computing device 110, and/or server 120 in accordance with some aspects of the disclosed subject matter. As shown in FIG. 2, in some configurations, the computing device 110 can include a processor 202, a display 204, one or more inputs 206, one or more communication systems 208, and/or memory 210. In some configurations, the processor 202 can be any suitable hardware processor or combination of processors, such as a central processing unit (CPU), a graphics processing unit (GPU), etc. In some configurations, the display 204 can include any suitable display devices, such as a computer monitor, a touchscreen, a television, etc. In some configurations, the inputs 206 can include any of a variety of suitable input devices and/or sensors that can be used to receive user input, such as a keyboard, a mouse, a touchscreen, a microphone, and the like.
  • In some configurations, the communications systems 208 can include a variety of suitable hardware, firmware, and/or software for communicating information over the communication network 108 and/or any other suitable communication networks. For example, the communications systems 208 can include one or more transceivers, one or more communication chips and/or chip sets, etc. In a more particular example, the communications systems 208 can include hardware, firmware and/or software that can be used to establish a Wi-Fi connection, a Bluetooth connection, a cellular connection, an Ethernet connection, etc.
  • In some configurations, the memory 210 can include any suitable storage device or devices that can be used to store instructions, values, etc., that can be used, for example, by the processor 202 to present content using the display 204, to communicate with the server 120 via the communications system(s) 208, and the like. The memory 210 can include any of a variety of suitable volatile memory, non-volatile memory, storage, or any suitable combination thereof. For example, the memory 210 can include RAM, ROM, EEPROM, one or more flash drives, one or more hard disks, one or more solid state drives, one or more optical drives, etc. In some configurations, the memory 210 can have encoded thereon a computer program for controlling operation of the computing device 110. In such configurations, the processor 202 can execute at least a portion of the computer program to present content (e.g., MRI images, user interfaces, graphics, tables, and the like), receive content from the server 120, transmit information to the server 120, and the like.
  • In some configurations, the server 120 can include a processor 212, a display 214, one or more inputs 216, one or more communications systems 218, and/or memory 220. In some configurations, the processor 212 can be a suitable hardware processor or combination of processors, such as a CPU, a GPU, and the like. In some configurations, the display 214 can include any suitable display device, such as a computer monitor, a touchscreen, a television, and the like. In some configurations, the inputs 216 can include any suitable input devices and/or sensors that can be used to receive user input, such as a keyboard, a mouse, a touchscreen, a microphone, and the like.
  • In some configurations, the communications systems 218 can include a suitable hardware, firmware, and/or software for communicating information over the communication network 108 and/or any other suitable communication networks. For example, the communications systems 218 can include one or more transceivers, one or more communication chips and/or chip sets, and the like. In a more particular example, the communications systems 218 can include hardware, firmware and/or software that can be used to establish a Wi-Fi connection, a Bluetooth connection, a cellular connection, an Ethernet connection, and the like.
  • In some configurations, the memory 220 can include any suitable storage device or devices that can be used to store instructions, values, and the like, that can be used, for example, by the processor 212 to present content using the display 214, to communicate with one or more computing devices 110, and the like. The memory 220 can include any of a variety of suitable volatile memory, non-volatile memory, storage, or any suitable combination thereof. For example, the memory 220 can include RAM, ROM, EEPROM, one or more flash drives, one or more hard disks, one or more solid state drives, one or more optical drives, and the like. In some configurations, the memory 220 can have encoded thereon a server program for controlling operation of the server 120. In such configurations, the processor 212 can execute at least a portion of the server program to transmit information and/or content (e.g., MRI data, results of automatic diagnosis, a user interface, and the like) to one or more computing devices 110, receive information and/or content from one or more computing devices 110, receive instructions from one or more devices (e.g., a personal computer, a laptop computer, a tablet computer, a smartphone, and the like), and the like.
  • In some configurations, the image source 102 can include a processor 222, imaging components 224, one or more communications systems 226, and/or memory 228. In some embodiments, processor 222 can be any suitable hardware processor or combination of processors, such as a CPU, a GPU, and the like. In some configurations, the imaging components 224 can be any suitable components to generate image data corresponding to one or more imaging modes (e.g., T1 imaging, T2 imaging, fMRI, and the like). An example of an imaging machine that can be used to implement the image source 102 can include a conventional MRI scanner (e.g., a 1.5 T scanner, a 3 T scanner), a high field MRI scanner (e.g., a 7 T scanner), an open bore MRI scanner, a CT system, an ultrasound scanner and the like.
  • Note that, although not shown, the image source 102 can include any suitable inputs and/or outputs. For example, the image source 102 can include input devices and/or sensors that can be used to receive user input, such as a keyboard, a mouse, a touchscreen, a microphone, a trackpad, a trackball, hardware buttons, software buttons, and the like. As another example, the image source 102 can include any suitable display devices, such as a computer monitor, a touchscreen, a television, etc., one or more speakers, and the like.
  • In some configurations, the communications systems 226 can include any suitable hardware, firmware, and/or software for communicating information to the computing device 110 (and, in some embodiments, over the communication network 108 and/or any other suitable communication networks). For example, the communications systems 226 can include one or more transceivers, one or more communication chips and/or chip sets, and the like. In a more particular example, the communications systems 226 can include hardware, firmware and/or software that can be used to establish a wired connection using any suitable port and/or communication standard (e.g., VGA, DVI video, USB, RS-232, and the like), Wi-Fi connection, a Bluetooth connection, a cellular connection, an Ethernet connection, and the like.
  • In some configurations, the memory 228 can include any suitable storage device or devices that can be used to store instructions, values, image data, and the like, that can be used, for example, by the processor 222 to: control the imaging components 224, and/or receive image data from the imaging components 224; generate images; present content (e.g., MRI images, a user interface, and the like) using a display; communicate with one or more computing devices 110; and the like. The memory 228 can include any suitable volatile memory, non-volatile memory, storage, or any of a variety of other suitable combination thereof. For example, the memory 228 can include RAM, ROM, EEPROM, one or more flash drives, one or more hard disks, one or more solid state drives, one or more optical drives, and the like. In some configurations, the memory 228 can have encoded thereon a program for controlling operation of the image source 102. In such configurations, the processor 222 can execute at least a portion of the program to generate images, transmit information and/or content (e.g., MRI image data) to one or more the computing devices 110, receive information and/or content from one or more computing devices 110, receive instructions from one or more devices (e.g., a personal computer, a laptop computer, a tablet computer, a smartphone, and the like), and the like.
  • Referring to FIG. 3, a flowchart is provided setting forth some non-limiting example steps for a method of automatically classifying unstructured imaging data in accordance with the present disclosure. As will be described, the present disclosure provides an iterative snowball sampling that allows for the accurate classification of unstructured imaging data without the need for extensive training datasets that include costly human-annotated information.
  • In particular, an initial seed sample size is selected at step 310. A training dataset is generated at step 320 with sampled data, which is used to train the convolutional neural network at step 330. In some configurations, the convolutional network is a deep convolutional neural network (DCNN). The trained network classifies unlabeled data at step 340 with the performance evaluated at step 350. If the desired performance is achieved, then the process may end. For example, at step 350, the system may evaluate whether the network's ability to identify a feature in an image exceeds a defined threshold of speed, classification accuracy, reproducibility, efficacy, or other performance metric.
  • If a desired level of performance is not achieved, then the labeled data may be refined at step 360, by determining if a confidence value for the data is above a certain threshold at step 370. A confidence value may be the same as the desired performance and use the same metrics, or a confidence value may be a classification accuracy that describes the percentage of the time or the frequency with which an image feature or region is identified correctly. If the confidence value does not exceed the threshold, then the network may be used to re-classify the data by repeating the process at step 340. If the confidence value is exceeded, then a VAE with a model, such as a GMM, may be used at step 380 to add the data back into the training dataset and repeat the process from step 320.
  • In some configurations, the initial seed annotated datasets used to train the DCNN may be small. In these cases, the generated labeled data may contain classification errors or may be generally unstructured. To prevent this, steps 340-380 may be implemented in an architecture that includes the VAE and GMMs and implements a snowball sampling algorithm to refine the classification. The VAE represents features of the candidate annotated dataset (having m data size) into latent space with a Gaussian distribution. The GMMs may then conduct binary clustering within each class across the annotated candidate datasets. Between the two clusters, each consisting of mean and variance vectors, a user may choose the cluster (c*) whose center is closest to the cluster center (c) of the selected seed sample. The data set of size m closest to the cluster center (c*) may then be selected. The VAE extracts generic features from each cluster and the GMM improves clustering accuracy. This iterative data curation process can increase the quantity of the annotated dataset and improve its quality. This may be repeated for each annotated class.
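  • As a non-limiting illustration, the clustering-and-selection step described above might be sketched as follows in Python with scikit-learn; the function name refine_candidates and the use of diagonal covariances are assumptions made only for this example.

```python
# A minimal sketch of the GMM-based refinement step described above.
# Assumes the VAE encoder has already mapped the candidate images of one
# class into latent vectors; names here are illustrative, not from the text.
import numpy as np
from sklearn.mixture import GaussianMixture


def refine_candidates(latent_candidates, latent_seed, m):
    """Binary-cluster candidate latent vectors and keep the m samples
    closest to the cluster whose center best matches the seed data."""
    gmm = GaussianMixture(n_components=2, covariance_type="diag")
    gmm.fit(latent_candidates)

    # Cluster center (c*) closest to the mean of the labeled seed samples.
    seed_center = latent_seed.mean(axis=0)
    best = np.argmin(np.linalg.norm(gmm.means_ - seed_center, axis=1))

    # Keep the m candidates nearest to the selected cluster center.
    dist = np.linalg.norm(latent_candidates - gmm.means_[best], axis=1)
    return np.argsort(dist)[:m]
```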
  • Specifically, deep learning classification accuracy is historically dependent on the size of the initial training datasets. Quantifying the size of the dataset required to achieve a target accuracy is important when trying to decide the feasibility of a system. In many cases, the limitation of data size prevents the development of robust AI algorithms. Learning curve analysis is one approach to model classification performance and predict the sample size needed. The learning curve can be conceptualized as an inverse power law function. Classification accuracy (y) is expressed as a function of the training set size (x) and a set of unknown parameters ($b = (b_1, b_2, b_3)$), expressed as the following equation

  • $y = f(x; b) = b_1 + b_2 \cdot x^{b_3}$   (1)
  • where $x = [x_1, x_2, \ldots, x_N]^T$, $y = [y_1, y_2, \ldots, y_N]^T$, $b = [b_1, b_2, b_3]^T$, and N is the number of classes; $b_1$, $b_2$, and $b_3$ represent the bias, learning rate, and decay rate, respectively. The model fit assumes that the classification accuracy (y) grows asymptotically to $b_1$, the maximum achievable classification performance. With the observed classification accuracy at six different sizes of training sets (5, 10, 20, 50, 100, and 200), the unknown parameters ($b = [b_1, b_2, b_3]^T$) may be estimated using weighted nonlinear regression,
  • $E(b) = \sum_{p=1}^{m} w_p \cdot (t_p - y_p)^2 = \sum_{p=1}^{m} w_p \cdot \left(t_p - f(x_p; b)\right)^2 = \sum_{p=1}^{m} w_p \cdot r_p(b)^2 = R^T W R$   (2)
  • where $t_p$ is the desired output when the input is $x_p$; $y_p = f(x_p; b)$ is the model's output when the input is $x_p$; $r_p(b)$ is the residual between $t_p$ and $y_p$; and R is the residual vector in matrix form. The weight terms $w_p$ in the diagonal matrix W can be determined by the application. In some settings, the weighted nonlinear least-squares estimator may be more appropriate than a regular nonlinear regression method for fitting the learning curve when measurement errors do not all have the same variance.
  • Classification accuracy using relatively large training sets (such as 100 and 200) may have a lower variance than when using smaller sample sizes (such as 5, 10, 20, and 50). The learning curve may therefore be fitted with higher weighting values at the points of larger dataset sizes. For example, the weights may be chosen as $w_p = [1, 1, 1, 1, 100, 150]$ for a large dataset with a learning curve, but may be $w_p = [1, 1, 1, 1, 1, 1]$ for an unweighted nonlinear least-squares estimator.
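  • As a non-limiting illustration, the weighted fit of Equations (1) and (2) might be sketched with SciPy as follows; the accuracy values in the example are hypothetical, and the weights follow the $w_p$ values mentioned above.

```python
# A minimal sketch of fitting the inverse power-law learning curve of
# Equation (1) by weighted nonlinear least squares, as in Equation (2).
import numpy as np
from scipy.optimize import curve_fit


def learning_curve(x, b1, b2, b3):
    # y = b1 + b2 * x^b3; with b2 < 0 and b3 < 0, y rises asymptotically to b1.
    return b1 + b2 * np.power(x, b3)


x = np.array([5, 10, 20, 50, 100, 200], dtype=float)  # training set sizes
y = np.array([0.62, 0.75, 0.86, 0.94, 0.97, 0.98])    # hypothetical accuracies
w = np.array([1, 1, 1, 1, 100, 150], dtype=float)     # weights w_p

# curve_fit treats sigma as per-point uncertainty, so w_p = 1 / sigma_p^2.
b, _ = curve_fit(learning_curve, x, y, p0=[1.0, -1.0, -0.5],
                 sigma=1.0 / np.sqrt(w))
print("b1 (asymptotic accuracy) = %.3f" % b[0])
```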
  • In one configuration, an autoencoder can be created that has two complementary networks consisting of an encoder and a decoder. The encoder has a multilayer perceptron neural network allowing it to map an input x to a latent representation z, and the decoder maps the latent variable z back to a reconstructed input value $\hat{x}$:

  • $z \sim f(x) = q_\phi(z \mid x)$

  • $\hat{x} \sim g(z) = p_\theta(x \mid z)$   (3)
  • where the tunable parameters φ of the encoder and θ of the decoder of the artificial neural networks are optimized according to the variational lower bound below, L(θ, φ, x):

  • $L(\theta, \phi, x) = E_{q_\phi(z \mid x)}\left[\log p_\theta(x \mid z)\right] - D_{KL}\left[q_\phi(z \mid x) \,\|\, p_\theta(z)\right]$   (4).
  • The objective of this cost function is to minimize both the generative and the latent losses. Generative loss describes how accurately the decoder network reconstructs images ($\hat{x}$) from a latent vector z, and latent loss is derived from $q_\phi(z \mid x)$ so that $D_{KL}[\cdot]$ is close to zero. One difference between a typical autoencoder (also called a vanilla autoencoder) and a variational autoencoder (VAE) is that variational autoencoders generate latent vectors approximating a unit Gaussian distribution (i.e., $z \sim N(0, I)$), whereas vanilla autoencoders generate deterministic latent variables z.
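  • As a non-limiting illustration, the two terms of Equation (4) might be computed as follows in TensorFlow, assuming image batches and a diagonal-Gaussian posterior parameterized by a mean and log-variance; the function name is illustrative.

```python
# A minimal sketch of the two loss terms in Equation (4): a generative
# (reconstruction) loss and a KL-divergence latent loss pushing
# q_phi(z|x) toward a unit Gaussian. Written for TensorFlow 2.x and
# assumes x and x_hat are batches of images (batch, height, width, channels).
import tensorflow as tf


def vae_losses(x, x_hat, z_mean, z_log_var):
    # Generative loss: how well the decoder reconstructs x from z.
    generative = tf.reduce_sum(tf.square(x - x_hat), axis=[1, 2, 3])

    # Latent loss: closed-form D_KL[ N(mu, sigma) || N(0, I) ].
    kl = -0.5 * tf.reduce_sum(
        1.0 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var), axis=1)

    return tf.reduce_mean(generative + kl)
```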
  • In some configurations, the iterative snowball sampling process for automatic labeling of training data may also be expressed as:
  • i ← 0
  • $x_s$ ← Select training sample (x) with initial labeled seed size (s)
  • if i = 0: $x_m^i$ ← $x_s$
  • else: $x_m^i$ ← Classify unlabeled training set (M) by $d(x_m^i)$ into an m-sized labeled candidate set
  • repeat
  • $d(x_m^i)$ ← Train deep convolutional neural network with $x_m^i$
  • $f(x_m^i)$, $g(z)$ ← Train encoder and decoder of VAE
  • $z \sim f(x_m^i)$ ← Feature representation in latent space of encoder
  • c ← Binary clustering using GMMs
  • c* ← Select cluster whose center is closest to the seed data ($x_s$)
  • $x_m^i$ ← Select the m data samples closest to c*
  • m = m + s ← Add new label sets into the initial seed
  • i ← i + 1
  • until m = M
  • In some configurations, the system learns features of input images using a VAE as an unsupervised learning representation, clusters the features by Gaussian mixture models, and annotates the images by refining candidates pre-labeled or preclassified by the deep convolutional neural network as supervised learning. The DCNN may be trained for body region classification using a small seed training dataset to create larger annotated training datasets by snowball iterative sampling, leading to higher final accuracy. The system may be used to classify images for any part of the human anatomy, and may be used to classify images beyond restricted regions.
  • Example of Data Set Annotation
  • In one example of the method and system, experiments were conducted using six different training seed sizes (5, 10, 20, 50, 100, 200/class) on whole body CT images. The fine-tuned DCNN model with snowball sampling was compared with two other common learning methods, a DCNN model trained from scratch and a fine-tuned DCNN model with a transfer learning method, to evaluate classification performance. It was observed that the method achieved accuracy (98.79%) from a labeled seed size of only 100 that was comparable to the accuracy achieved from a seed size of 1,000 by the fine-tuned DCNN (98.71%). In the results of this example, the automatic labeling method saves 90% of the labeling effort in body part classification while preserving high accuracy.
  • A database of CT images was compiled from the clinical PACS at a quaternary referral hospital. Preprocessing software was developed to annotate and categorize these images into 6 different body regions: brain, neck, shoulder, chest, abdomen, and pelvis. Only images that could be clearly defined as one of the aforementioned body regions were used. The intervening areas were excluded from training due to their lack of clear regional definition. Each CT examination had a different noise level because of varying radiation dosages, image reconstruction filters, and CT vendors. Image voxels may also have varying pitches because of differences in the image reconstruction fields. Image slice thickness was greater than the in-plane voxel pitch, so voxels were anisotropic.
  • Four training datasets with varying members per class were prepared: unlabeled (M=5000/class), test (1000/class), validation (1000/class), and initial seed data (s=5, 10, 20, 50, 100, and 200/class). The initial seed datasets represent the number of labeled data used to train the DCNNs the first time. An aim was to define the minimum number of cases per class required to annotate larger datasets with accuracy comparable to results from manually labeled conventional training datasets.
  • Any of a variety of DCNNs may be used. In the present example, GoogLeNet was selected as it is an efficient, highly performing DCNN. Testing was performed using the NVIDIA Deep Learning GPU Training System (DIGITS) on a DevBox to train the model using each experimental dataset. GoogLeNet uses 22 convolutional layers including 9 Inception modules and 4 different kernel filters (7×7, 5×5, 3×3, and 1×1). The convolutional filters were trained using a stochastic gradient descent (SGD) algorithm with a base learning rate of 0.001, decreased in three steps based on stable convergence of the loss function. The effect of transfer learning on the snowball sampling method was compared by training one instance of the DCNN from scratch with random weight initialization and another instance with a preloaded, fine-tuned ImageNet pre-trained model. The snowball sampling procedure was iterated 10 times with each initial seed sample size so that the DCNNs were trained and tested a total of 60 times. During each training step, the validation sets (1000/class) were evaluated and the trained GoogLeNet model with the highest accuracy in the third step of learning rate decay was selected.
  • Referring to FIG. 4, at each snowball sampling iteration, the customized VAE was constructed to represent the features of the selected samples. In the illustrated, non-limiting example, the VAE contains four convolutional and deconvolutional layers functioning as encoders and decoders, respectively. One skilled in the art will appreciate that other examples may use more or fewer convolutional or deconvolutional layers, and that any number of layers may be used. Each convolutional layer in the current example had 64 kernel filters (3×3 size) followed by max pooling (2×2). Input images (downscaled to 64×64 from 512×512 for computational efficiency) were compressed to 128-dimensional feature spaces in a Gaussian distribution, ultimately reconstructing the input image using deconvolutional layers and up-sampling layers. The convolutional VAE was implemented using the Keras deep learning library running on a TensorFlow backend. After training the VAE, only the encoder was used as a feature representation, feeding the features into the inputs of Gaussian mixture models (GMMs). Two clusters each having 128-dimensional Gaussian distributions were generated for each snowball iteration. The cluster (c*) which had the closest distance to the cluster center of the Gaussian distribution of the selected seed sample was selected.
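  • As a non-limiting illustration, an encoder of this kind might be sketched in Keras as follows; details not stated above (activations, padding, a single input channel) are assumptions.

```python
# A minimal Keras sketch of an encoder like the one described above:
# four 3x3 convolutional layers of 64 filters, each followed by 2x2 max
# pooling, mapping 64x64 inputs to a 128-dimensional Gaussian latent space.
# Activation, padding, and channel count are assumptions for this sketch.
import tensorflow as tf
from tensorflow.keras import layers

inputs = layers.Input(shape=(64, 64, 1))
x = inputs
for _ in range(4):
    x = layers.Conv2D(64, 3, padding="same", activation="relu")(x)
    x = layers.MaxPooling2D(2)(x)
x = layers.Flatten()(x)

z_mean = layers.Dense(128, name="z_mean")(x)
z_log_var = layers.Dense(128, name="z_log_var")(x)

# Reparameterization trick: z = mu + sigma * epsilon, epsilon ~ N(0, I).
z = layers.Lambda(
    lambda t: t[0] + tf.exp(0.5 * t[1]) * tf.random.normal(tf.shape(t[0]))
)([z_mean, z_log_var])

encoder = tf.keras.Model(inputs, [z_mean, z_log_var, z], name="vae_encoder")
encoder.summary()
```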
  • Unlabeled 5,000/class datasets were initially annotated by the DCNNs that were trained with labeled seed data (e.g., 5 examples per class). The labeled candidates were refined by binary clustering using GMMs, and finally 500/class (m=500) were selected at each snowball iteration. Through ten iterations, all unlabeled datasets were labeled. This automatic labeling procedure was conducted for six different seed sizes (5, 10, 20, 50, 100, and 200/class). During each iteration, annotated image data was added to the next training data pool so that classification accuracy increased gradually. Mislabeled classes were significantly reduced after the refining step. The size of the initial seed influences the overall classification performance, with diminishing returns after 50 cases per class. Each experiment was repeated 10 times by randomly selecting seed samples from labeled training datasets. The trained model was then tested by introducing 1,000 new images of each body class. A total of 6,000 images were used for the performance evaluation in the present example.
  • For all defined body parts, classification accuracy was at or near 100%. Although the system was not trained on images of transition regions, it was able to infer these areas with considerable accuracy. The network was able to extract and identify similar features at the level of the Inception module despite wide ranges of normal variation in the same anatomic region.
  • Referring to FIG. 5, one non-limiting configuration for an autoencoder is shown, where the generative loss $\|x - \hat{x}\|^2$ 510 and the latent loss 520 are controlled and, in some configurations, $D_{KL}[N(\mu(x), \sigma(x)) \,\|\, N(0, I)]$ is minimized. Per Equations (3) and (4), generative loss describes how accurately the decoder network g(z) 560 reconstructs images ($\hat{x}$) 540 from the input x 530 and the encoder network f(x) 500 through a latent vector z 550, while latent loss is derived from $q_\phi(z \mid x)$ so that the $D_{KL}[\cdot]$ latent loss 520 is close to zero. Even though the composed neural network has many unknown weights to estimate, the simple cascade structure of the multilayer neural network makes it possible to improve accuracy by iteration. The input is passed through the forward function and the loss function is calculated. The prediction errors are then backpropagated to improve system performance.
  • Referring to FIGS. 6A and 6B, an example of body part classification created by the above-described systems and processes is shown. FIG. 6A depicts a whole-body CT image 600, such as may be acquired and provided to the above-described systems. Using the above-described techniques, body part regions 610 can be classified and labeled. Labeling of the data may be performed with a neural network, and refining of the dataset used to train the neural network may be as described above. Then, as illustrated in FIG. 6B, axial CT images corresponding to the body part regions 610 can be selected and labeled. As examples, a brain region 620 has a corresponding axial image 625, a neck 630 has axial image 635, a shoulder 640 has axial image 645, a chest 650 has axial image 655, an abdomen 660 has axial image 665, and a pelvis 670 has axial image 675. Any number of regions 610 may be identified for a subject, and any number of corresponding axial images may be used.
  • Referring to FIG. 7, a scatter plot of 2D latent space is shown where each cluster of data represents different body regions. For the example data shown in FIG. 7, 128-dimensional latent representations of 6000 cases classified by the convolutional VAE using 200 cases per class were visualized and resulted in 6 body region clusters with areas of overlap. This form of scatter plot display may be used to aid an automated routine in identifying what data corresponds to what body region by accounting for data clustering.
  • Referring to FIG. 8, an example of fine-tuned DCNN classification accuracy and the number of mislabeled classes during ten snowball sampling iterations, before and after a refining process by GMMs, with six different initial seed sizes is shown. Varying training datasets were annotated using ten snowball iterations from varying initial seeds (5, 10, 20, 50, 100, and 200). During each iteration, annotated image data was added to the next training data pool so that classification accuracy increases gradually, as can be seen in FIG. 8.
  • Referring to FIGS. 9A and 9B, examples of classification accuracy are plotted with respect to size of training data or class. FIG. 9A reflects how the fine-tuned DCNN with transfer learning performs better than the DCNN trained from scratch with random weight initialization. Classification accuracy increased rapidly from seed sizes 5 to 50, while accuracy did not increase significantly from seed size 100 to 200. At this point, the learning curve reached a steady state and did not significantly change in accuracy regardless of the seed size. The learning curve predicted 98% classification accuracy with the observed accuracy at 97.25%. FIG. 9A depicts an example learning curve fit to classification accuracy. FIG. 9B depicts an example of classification accuracy by fine-tuned DCNN model without and with the addition of the snowball sampling iteration.
  • Example of Organ Classifier
  • There are increasing concerns about radiation exposure risk due to the rising number of computed tomography (CT) examinations in medicine. To measure dose from CT procedures, various CT dosimetry metrics have been introduced. The computed tomography dose index (CTDI) and its derivatives, such as the volume CT dose index (CTDIvol), are primary metrics that are measured with polymethyl methacrylate (PMMA) standard phantoms of either 16 cm or 32 cm diameter. However, CTDIvol does not describe the actual dose a patient receives with respect to differences in weight, body shape, and size, and also does not provide organ dose. To estimate organ dose of individual patients, Monte Carlo simulations have been conducted on phantom models using mathematical descriptions or image voxels, such as the Imaging Performance Assessment of CT scanner (ImPACT) CT patient dosimetry. However, none of these organ dose estimation methods provides a dose specific to the organ size and shape. In one configuration, a method is provided for machine learning powered personalized patient organ dose, which may take the form of software to estimate each patient's unique organ size and shape. The method can be used to enable organ detection, segmentation, and volume estimation (lungs, liver, kidneys, urinary bladder, muscles, and the like), which may then be used to control or optimize the level of radiation dose to the patients.
  • Dedicated patient organ dose reports are an important part of modern radiation safety. Current organ dose estimation techniques use Monte Carlo simulations based on phantoms and mathematical description or image voxels. Considering an individual patient's variance in organ position, orientation, and shape, it is often challenging to map a given CT slice to the slab number of a phantom model for accurate organ dose calculation.
  • CTDIvol may be measured to indicate the CT scanner output. Generally, the CTDIvol measurement is conducted by imaging a 16 cm diameter phantom for head examinations and a 32 cm diameter phantom for body examinations for a given patient CT scan. It is measured by following standard protocols. The CTDIvol may be denoted as:
  • $\mathrm{CTDI_{vol}} = \dfrac{\mathrm{CTDI_w}}{\mathrm{pitch}}, \qquad \mathrm{CTDI_w} = \tfrac{1}{3}\,\mathrm{CTDI_{100,center}} + \tfrac{2}{3}\,\mathrm{CTDI_{100,periphery}}, \qquad \mathrm{CTDI_{100}} = \dfrac{1}{nT} \int_{-50\,\mathrm{mm}}^{+50\,\mathrm{mm}} D(z)\, dz$   (5)
  • where n is the number of tomographic sections imaged in a single axial scan (i.e., the number of data channels), T is the width of the tomographic section along the z-axis imaged by one data channel, and pitch is the ratio of the table feed per rotation to the total nominal beam width (n·T).
  • In some configurations, CTDIvol may be provided by commercial CT scanner manufacturers. In one example, the value from a GE LightSpeed VCT scanner is used to estimate organ dose. The corresponding scan parameters are 120 kVp tube voltage, 0.98 pitch, 0.5 s rotation time, and 40 mm collimation. At the given scan parameters, the normalized CTDIw (denoted by nCTDIw) of the ImPACT CT dosimetry calculator was 9.5 (mGy/100 mAs), so that CTDIw is calculated as nCTDIw × mA × s / 100, and CTDIvol is finally determined by dividing CTDIw by the pitch, per Equation (5).
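  • As a non-limiting illustration, this calculation might be sketched as follows; the tube current value is assumed only for the example.

```python
# A minimal sketch of the CTDIvol calculation described above, using the
# example scan parameters from the text (nCTDIw = 9.5 mGy/100 mAs,
# pitch = 0.98, 0.5 s rotation); the 300 mA tube current is illustrative.
def ctdi_vol(n_ctdi_w, tube_current_ma, rotation_time_s, pitch):
    ctdi_w = n_ctdi_w * tube_current_ma * rotation_time_s / 100.0  # mGy
    return ctdi_w / pitch


print(ctdi_vol(n_ctdi_w=9.5, tube_current_ma=300, rotation_time_s=0.5,
               pitch=0.98))  # ~14.5 mGy for the assumed 300 mA tube current
```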
  • Referring to FIGS. 10A and 10B, a patient effective diameter may be measured by automatic calculation based on reconstructed axial CT images. FIG. 10B shows a CT image of a patient cross section. FIG. 10A shows the effective diameter, which is defined as the diameter of the circle whose area is the same as that of the patient cross section, assuming the patient has an elliptical cross section as indicated in FIGS. 10A and 10B.

  • $\text{Effective Diameter} = \sqrt{AP \times LAT}$   (6)
  • where the anterior-posterior (AP) dimension 1010 represents the thickness of the body part of the patient and the lateral (LAT) dimension 1020 represents the side-to-side dimension of the body part being scanned. In some configurations, binary morphologic image techniques, such as image dilation and erosion, may be used to estimate the circle of area equal to that of the patient cross section.
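  • As a non-limiting illustration, Equation (6) and the equal-area circle interpretation might be sketched as follows; the dimensions and pixel spacing are illustrative.

```python
# A minimal sketch of the effective-diameter calculation of Equation (6).
# The AP and LAT dimensions, or a binary body mask (e.g., after
# thresholding, dilation, and erosion), could serve as inputs.
import numpy as np


def effective_diameter(ap_cm, lat_cm):
    return np.sqrt(ap_cm * lat_cm)


def effective_diameter_from_mask(body_mask, pixel_spacing_cm):
    # Diameter of the circle whose area equals the patient cross section.
    area_cm2 = body_mask.sum() * pixel_spacing_cm[0] * pixel_spacing_cm[1]
    return 2.0 * np.sqrt(area_cm2 / np.pi)


print(effective_diameter(ap_cm=22.0, lat_cm=33.0))  # illustrative dimensions
```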
  • The ImPACT CT dosimetry calculator uses normalized organ conversion factors obtained from a Monte Carlo simulation on a mathematical phantom. It is limited in calculating the organ dose of an actual patient, who has a different body weight and size as well as unique organ shapes and volumes. In one configuration, to estimate patient organ dose across various patient weights, a correction factor (CF) may be used for each organ, using patient clinical data provided by two different manufacturers. The CF is calculated as:
  • $CF_{organ} = \dfrac{D^n_{T,RD}}{D^n_{T,IM}}$   (7)
  • where $D^n_{T,RD}$ and $D^n_{T,IM}$ are the organ doses normalized by CTDIvol, which may be provided by the vendor as described previously, such as from the eXposure organ dose software of Radimetrics (RD) and from ImPACT (IM), respectively.
  • Referring to FIG. 11, a flowchart is provided that sets forth some example steps for one configuration of a personalized organ dose estimation (PODE) method. An automated program may be used to extract CT dose information from DICOM metadata and image data at step 1110. The DICOM dose report may be retrieved from a PACS, for example. The report may include CTDIvol, dose-length product (DLP), tube current, tube voltage, exposure time, and collimation. The scanner information and scan parameters extracted at step 1120, along with dose-relevant indexes of the CT examinations and a machine-learning-based body part classification at step 1130, may be used to calculate the organ dose for ImPACT CT dosimetry at step 1140. The DICOM data may be converted to an image at step 1150. The DICOM image may also be written to a standard 8-bit gray image format such as PNG or other formats. A patient effective diameter is calculated at step 1160, and a correction factor as discussed above may be calculated at step 1170. A coarse tuning of the organ dose estimate may be performed at step 1180. The converted images across the scan range may be fed to the inputs of a machine learning network, such as a deep convolutional neural network, for use in identifying and segmenting patient organs at step 1190. Organ dose estimates may then be fine-tuned at step 1195 once the organs have been identified and properly segmented, by attributing the dose more specifically to the appropriate organs.
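  • As a non-limiting illustration, the metadata extraction of step 1110 and the image conversion of step 1150 might be sketched with pydicom as follows; the file name is illustrative, and which DICOM fields are present varies by scanner and dose-report format.

```python
# A minimal sketch of reading dose-relevant fields from DICOM metadata and
# writing an 8-bit grayscale PNG. The file name is illustrative; missing
# attributes are returned as None rather than assumed to exist.
import numpy as np
import pydicom
from PIL import Image

ds = pydicom.dcmread("slice_0001.dcm")  # illustrative file name

dose_info = {
    "CTDIvol": getattr(ds, "CTDIvol", None),
    "KVP": getattr(ds, "KVP", None),
    "XRayTubeCurrent": getattr(ds, "XRayTubeCurrent", None),
    "ExposureTime": getattr(ds, "ExposureTime", None),
}
print(dose_info)

# Convert the pixel data to a standard 8-bit gray image.
pixels = ds.pixel_array.astype(np.float32)
pixels = pixels - pixels.min()
if pixels.max() > 0:
    pixels = pixels / pixels.max() * 255.0
Image.fromarray(pixels.astype(np.uint8)).save("slice_0001.png")
```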
  • In one example, based on the extracted scan parameters, the ImPACT dosimetry calculated 23 organ doses for each patient. The organ doses estimated by ImPACT were corrected by the correction factor (CF) based on a regression model representing the correlation between the ratio of normalized organ dose and the patient effective diameter. After correcting the organ dose to account for patient weight, the PODE may finally be fine-tuned by organ volume and shape through an organ segmentation step.
  • In one example where each image slice was classified as one of 16 organs, the method may automatically identify which patient organs were included in that scan region for the ImPACT dosimetry calculation. These 16 different organs were identified from axial views of CT images and were labeled. A 22-layer deep CNN using an NVIDIA Deep Learning GPU Training System (DIGITS) was trained and validated with a 646 CT scan dataset. The resultant classified organ was automatically mapped to the slab number of a mathematical hermaphrodite phantom to determine the scan range of ImPACT CT dose calculator.
  • A dataset of 12,748 CT images of 63 patients was compiled from the clinical PACS (Picture Archiving and Communication System). Preprocessing software was developed to annotate and categorize these images into 16 different body parts in axial views: Brain; Eye Lens; Nose; Salivary Gland; Thyroid; Upper Lung; Thymus; Heart; Chest; Abdomen 1; Abdomen 2; Pelvis 1; Pelvis 2; Urinary Bladder; Genitals; and Leg. Only the scans of regions that could be clearly defined as one of the aforementioned body parts were used. This is an optimized organ classification choice for the organ dose estimation task. The gaps account for transition regions, which were not used for the training algorithm due to their lack of clear regional definition. Each scan has different background image noise because of radiation dosage level, image reconstruction filter selection, and CT scanner vendors.
  • In the present example with 16-organ recognition, a GoogLeNet network using 22 convolutional layers including 9 Inception modules and 4 sizes of basis or kernel filters (7×7, 5×5, 3×3, and 1×1) was used. 75% of images were used for training and 25% for validation. The GoogLeNet was trained using the NVIDIA toolchain of DIGITS and the DevBox with four TITAN GPUs with 7 TFlops of single precision, 336.5 GB/s of memory bandwidth, and 12 GB of memory. GoogLeNet was trained using a stochastic gradient descent (SGD) algorithm for 150 training epochs. Validation datasets were presented at every epoch during the training process. The initial learning rate was 0.01 and was decreased in three steps according to the convergence of the loss function.
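  • As a non-limiting illustration, an analogous training configuration might be sketched in Keras as follows; Keras does not provide GoogLeNet (Inception v1), so InceptionV3 stands in here, and the epochs at which the learning rate drops are assumptions.

```python
# A rough Keras analogue of the training configuration described above
# (SGD, initial learning rate 0.01, a three-step decay over 150 epochs,
# 16 output classes). InceptionV3 is only a stand-in for GoogLeNet, and
# the decay epochs below are assumptions, not values from the text.
import tensorflow as tf

model = tf.keras.applications.InceptionV3(weights=None, classes=16,
                                          input_shape=(299, 299, 3))
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)

def step_decay(epoch, lr):
    # Drop the learning rate by 10x at three assumed points over 150 epochs.
    return lr * 0.1 if epoch in (50, 100, 140) else lr

callbacks = [tf.keras.callbacks.LearningRateScheduler(step_decay)]
# model.fit(train_images, train_labels, validation_data=val_data,
#           epochs=150, callbacks=callbacks)
```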
  • A total of 646 patients were included in this retrospective study, with a mean age of 66 years old (range 20-95 years). These patients represented a wide spectrum of body habitus, with a mean weight of 85.6 kg (range, 45-181 kg). FIG. 2 is a representative example of classification results of patient organs after chest CT segmentation. The identified organs were labeled from HEAD (Thyroid) to TRUNK (Abdomen1), a region including both kidneys and liver. Based on organ classification, the corresponding scan range for the ImPACT CT dosimetry calculator was determined. For example, the identified thyroid region (HEAD5) was mapped to slab number 171/208 of the adult phantom.
  • The predicted organ location provided by the deep learning driven software also gives information about the volume of an organ in respect to a given scan region. For example, the thyroid (HEAD 5 region) can be identified in slices 10 to 138 of the present example with 99% accuracy, greatly improving patient-specific radiation dose estimation.
  • In some configurations, the ratio of organ dose normalized by CTDIvol may be assessed as a function of the patient effective diameter. The organs identified in a given scan region may show a linear relationship, whereas some organs, such as the brain, eye lenses, and salivary glands, may not be identified by a convolutional neural network classifier, so their organ doses are not correlated to the effective diameter. In the example above, the normalized dose coefficients decreased as the effective diameter increased for all identified organ regions. By assuming a linear relationship ($Y = a_1 X + a_0$) between the normalized dose coefficient and the effective diameter, a best-fit model for each organ may be obtained by the least-squares estimate (LSE), or by any other appropriate estimator.
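  • As a non-limiting illustration, such a least-squares fit might be sketched as follows; the data points are purely illustrative.

```python
# A minimal sketch of the least-squares fit Y = a1*X + a0 described above,
# relating a normalized organ dose coefficient (Y) to the patient effective
# diameter (X). The data points here are purely illustrative.
import numpy as np

effective_diameter_cm = np.array([22.0, 26.0, 30.0, 34.0, 38.0])
normalized_dose_coeff = np.array([1.45, 1.28, 1.12, 0.99, 0.88])

a1, a0 = np.polyfit(effective_diameter_cm, normalized_dose_coeff, deg=1)
print("Y = %.4f * X + %.4f" % (a1, a0))
```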
  • It will be appreciated by one skilled in the art that the model may be trained for more organ areas than have been disclosed in the examples, such as covering all organs used in the CT organ dose estimator. These organs may include the pancreas, stomach, gall bladder, and colon, and may facilitate longitudinal organ-specific dose calculations.
  • The present disclosure has described one or more preferred embodiments, and it should be appreciated that many equivalents, alternatives, variations, and modifications, aside from those expressly stated, are possible and within the scope of the invention.

Claims (19)

1. A method for automatically processing unstructured medical imaging data to generate classified images, the method comprising:
a) acquiring medical image data of a subject;
b) subjecting the medical image data of the subject to a neural network to generate classified image data;
c) comparing the classified image data to a confidence test;
d) upon determining that the classified image data does not pass the confidence test, subjecting the classified image data to a variational autoencoder (VAE) that implements a snowball sampling algorithm to refine the classified image data by representing features of the classified image data into latent space with a Gaussian distribution;
e) repeating steps c) and d) until the classified image data passes the confidence test; and
f) generating annotated images from the classified image data.
2. The method of claim 1 wherein the Gaussian distribution is achieved using Gaussian mixture models (GMMs) to perform binary clustering within each class across the classified image data.
3. The method of claim 1 wherein the confidence test includes a threshold of at least one of speed, classification accuracy, reproducibility, or efficacy.
4. The method of claim 1 wherein the classified image data includes at least one of a body region label, a body organ label, an organ label, an image feature label, or a condition label.
5. The method of claim 1 further comprising training the neural network by selecting an initial seed sample size to generate the training dataset.
6. A method for automatic labeling and annotation for unstructured medical datasets with snowball sampling comprising:
a) acquiring images of a region of a subject and labeling the images to generate a training dataset with the images;
b) training a convolutional neural network with the training dataset;
c) classifying unlabeled images using the trained network;
d) determining if a performance threshold is exceeded for the classified images; and
e) refining the dataset if the threshold is not exceeded by using a variational autoencoder to label the unlabeled images to create labeled images and updating the dataset with the labeled images.
7. The method of claim 6 wherein the variational autoencoder generates latent vectors approximating a unit Gaussian distribution in order to minimize generative losses.
8. The method of claim 6 wherein the variational autoencoder includes an encoder with a multilayer perceptron neural network allowing it to map an input to a latent representation, and a decoder that maps the latent representation to a reconstructed input value.
9. The method of claim 6 further comprising selecting an initial seed sample size to generate the training dataset.
10. The method of claim 6 wherein the performance threshold includes a threshold of at least one of speed, classification accuracy, reproducibility, or efficacy.
11. The method of claim 6 wherein classifying unlabeled images includes at least one of identifying a body region, a body organ, an organ, an image feature, or a condition.
12. The method of claim 6 wherein snowball sampling includes refining the dataset at least twice.
13. A system for automatic labeling and annotation for unstructured medical datasets from medical images with snowball sampling, the system comprising:
a) a computer system configured to:
i) acquire images of a region of a subject and label the images to generate a training dataset with the images;
ii) train a convolutional neural network with the training dataset;
iii) classify unlabeled images using the trained network;
iv) determine if a performance threshold is exceeded for the classified images; and
v) refine the dataset if the threshold is not exceeded by using a variational autoencoder to label the unlabeled images to create labeled images and updating the dataset with the labeled images.
14. The system of claim 13 wherein the variational autoencoder generates latent vectors approximating a unit Gaussian distribution in order to minimize generative losses.
15. The system of claim 13 wherein the variational autoencoder includes an encoder with a multilayer perceptron neural network allowing it to map an input to a latent representation, and a decoder that maps the latent representation to a reconstructed input value.
16. The system of claim 13 further comprising selecting an initial seed sample size to generate the training dataset.
17. The system of claim 13 wherein the performance threshold includes a threshold of at least one of speed, classification accuracy, reproducibility, or efficacy.
18. The system of claim 13 wherein classifying unlabeled images includes at least one of identifying a body region, a body organ, an organ, an image feature, or a condition.
19. The system of claim 13 wherein snowball sampling includes refining the dataset at least twice.
US16/644,888 2017-09-08 2018-09-10 A system and method for automated labeling and annotating unstructured medical datasets Abandoned US20200286614A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/644,888 US20200286614A1 (en) 2017-09-08 2018-09-10 A system and method for automated labeling and annotating unstructured medical datasets

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201762555767P 2017-09-08 2017-09-08
US201762555799P 2017-09-08 2017-09-08
PCT/US2018/050177 WO2019051359A1 (en) 2017-09-08 2018-09-10 A system and method for automated labeling and annotating unstructured medical datasets
US16/644,888 US20200286614A1 (en) 2017-09-08 2018-09-10 A system and method for automated labeling and annotating unstructured medical datasets

Publications (1)

Publication Number Publication Date
US20200286614A1 true US20200286614A1 (en) 2020-09-10

Family

ID=65634435

Family Applications (2)

Application Number Title Priority Date Filing Date
US16/644,888 Abandoned US20200286614A1 (en) 2017-09-08 2018-09-10 A system and method for automated labeling and annotating unstructured medical datasets
US16/645,240 Active 2039-06-30 US11615879B2 (en) 2017-09-08 2018-09-10 System and method for automated labeling and annotating unstructured medical datasets

Family Applications After (1)

Application Number Title Priority Date Filing Date
US16/645,240 Active 2039-06-30 US11615879B2 (en) 2017-09-08 2018-09-10 System and method for automated labeling and annotating unstructured medical datasets

Country Status (2)

Country Link
US (2) US20200286614A1 (en)
WO (2) WO2019051359A1 (en)

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111312372A (en) * 2020-02-26 2020-06-19 广州金域医学检验中心有限公司 Method and device for establishing medical image standard test data set
US20200364520A1 (en) * 2019-05-13 2020-11-19 International Business Machines Corporation Counter rare training date for artificial intelligence
US20210015438A1 (en) * 2019-07-16 2021-01-21 Siemens Healthcare Gmbh Deep learning for perfusion in medical imaging
US20210081822A1 (en) * 2019-09-18 2021-03-18 Luminex Corporation Using machine learning algorithms to prepare training datasets
CN112612898A (en) * 2021-03-05 2021-04-06 蚂蚁智信(杭州)信息技术有限公司 Text classification method and device
KR102247182B1 (en) * 2020-12-18 2021-05-03 주식회사 이글루시큐리티 Method, device and program for creating new data using clustering technique
US11064902B2 (en) * 2018-06-29 2021-07-20 Mayo Foundation For Medical Education And Research Systems, methods, and media for automatically diagnosing intraductal papillary mucinous neoplasms using multi-modal magnetic resonance imaging data
CN113191385A (en) * 2021-03-25 2021-07-30 之江实验室 Unknown image classification automatic labeling method based on pre-training labeling data
US11080484B1 (en) * 2020-10-08 2021-08-03 Omniscient Neurotechnology Pty Limited Natural language processing of electronic records
US20210241037A1 (en) * 2020-01-30 2021-08-05 Canon Medical Systems Corporation Data processing apparatus and method
US20210271914A1 (en) * 2018-11-30 2021-09-02 Fujifilm Corporation Image processing apparatus, image processing method, and program
US20210295108A1 (en) * 2018-07-29 2021-09-23 Zebra Medical Vision Ltd. Systems and methods for automated detection of visual objects in medical images
CN113823385A (en) * 2021-09-03 2021-12-21 青岛海信医疗设备股份有限公司 Method, device, equipment and medium for modifying DICOM image
US11210785B1 (en) * 2019-10-17 2021-12-28 Robert Edwin Douglas Labeling system for cross-sectional medical imaging examinations
US20220067485A1 (en) * 2020-08-31 2022-03-03 Verizon Connect Development Limited Systems and methods for utilizing a machine learning model combining episodic and semantic information to process a new class of data without loss of semantic knowledge
US20220076062A1 (en) * 2019-05-14 2022-03-10 Samsung Electronics Co., Ltd. Image processing device and operation method thereof
US11322256B2 (en) * 2018-11-30 2022-05-03 International Business Machines Corporation Automated labeling of images to train machine learning
US20220198661A1 (en) * 2019-04-24 2022-06-23 Nanjing Turing Microbial Technologies Co. Ltd Artificial intelligence based medical image automatic diagnosis system and method
US20220254029A1 (en) * 2019-03-27 2022-08-11 Nvidia Corporation Image segmentation using a neural network translation model
US20220292673A1 (en) * 2021-03-12 2022-09-15 Siemens Healthcare Gmbh On-Site training of a machine-learning algorithm for generating synthetic imaging data
US11537506B1 (en) 2018-10-26 2022-12-27 Amazon Technologies, Inc. System for visually diagnosing machine learning models
US11556746B1 (en) * 2018-10-26 2023-01-17 Amazon Technologies, Inc. Fast annotation of samples for machine learning model development
DE102021210920A1 (en) 2021-09-29 2023-03-30 Siemens Healthcare Gmbh Apparatus and computer-implemented method for training a machine learning system to associate a scan exam with a standardized identifier code
CN116229442A (en) * 2023-01-03 2023-06-06 武汉工程大学 Text image synthesis and instantiation weight transfer learning method
US11728035B1 (en) * 2018-02-09 2023-08-15 Robert Edwin Douglas Radiologist assisted machine learning
CN117173543A (en) * 2023-11-02 2023-12-05 天津大学 Mixed image reconstruction method and system for lung adenocarcinoma and pulmonary tuberculosis

Families Citing this family (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11669746B2 (en) * 2018-04-11 2023-06-06 Samsung Electronics Co., Ltd. System and method for active machine learning
JP7049974B2 (en) * 2018-10-29 2022-04-07 富士フイルム株式会社 Information processing equipment, information processing methods, and programs
CN110060258B (en) * 2019-03-27 2021-05-04 山东师范大学 Retina SD-OCT image segmentation method and device based on Gaussian mixture model clustering
KR20200129639A (en) 2019-05-09 2020-11-18 삼성전자주식회사 Model training method and apparatus, and data recognizing method
CN110414562B (en) * 2019-06-26 2023-11-24 平安科技(深圳)有限公司 X-ray film classification method, device, terminal and storage medium
CN110260925B (en) * 2019-07-12 2021-06-25 重庆赛迪奇智人工智能科技有限公司 Method and system for detecting quality of driver parking technology, intelligent recommendation method and electronic equipment
US20220335085A1 (en) * 2019-07-30 2022-10-20 Nippon Telegraph And Telephone Corporation Data selection method, data selection apparatus and program
CN110368018A (en) * 2019-08-22 2019-10-25 南京安科医疗科技有限公司 A kind of CT system scanning dynamic regulating method
CN111144550A (en) * 2019-12-27 2020-05-12 中国科学院半导体研究所 Simplex deep neural network model based on homologous continuity and construction method
KR102112859B1 (en) * 2020-01-02 2020-05-19 셀렉트스타 주식회사 Method for training a deep learning model for a labeling task and apparatus using the same
CN111667457B (en) * 2020-04-29 2023-07-18 杭州深睿博联科技有限公司 Automatic identification method, system, terminal and storage medium for vertebral body information based on medical image
US20210383533A1 (en) * 2020-06-03 2021-12-09 Nvidia Corporation Machine-learning-based object detection system
US20220005187A1 (en) * 2020-07-02 2022-01-06 Enlitic, Inc. Medical scan viewing system with roc adjustment and methods for use therewith
US20220139515A1 (en) 2020-11-03 2022-05-05 Nuance Communications, Inc. Communication System and Method
CN112418289B (en) * 2020-11-17 2021-08-03 北京京航计算通讯研究所 Multi-label classification processing method and device for incomplete labeling data
WO2022140440A1 (en) * 2020-12-22 2022-06-30 Nuance Communications, Inc. Ai platform system and method
CA3103872A1 (en) * 2020-12-23 2022-06-23 Pulsemedica Corp. Automatic annotation of condition features in medical images
US20220301156A1 (en) * 2021-03-16 2022-09-22 Shenzhen Keya Medical Technology Corporation Method and system for annotation efficient learning for medical image analysis
US20220309673A1 (en) * 2021-03-26 2022-09-29 Varian Medical Systems, Inc. Using radiation dose information for automatic organ segmentation model training
CN112926682B (en) * 2021-03-29 2024-04-16 华东理工大学 Nuclear magnetic resonance image small sample learning and classifying method based on graph network
US20220351367A1 (en) 2021-04-30 2022-11-03 Avicenna.Ai Continuous update of hybrid models for multiple tasks learning from medical images
US20230081601A1 (en) * 2021-09-10 2023-03-16 GE Precision Healthcare LLC Patient anatomy and task specific automatic exposure control in computed tomography
CN114355907B (en) * 2021-12-22 2024-01-19 东风汽车集团股份有限公司 Cloud-based intelligent garbage identification and cleaning method and system

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007062178A2 (en) 2005-11-21 2007-05-31 The Regents Of The University Of California Method for computing patient radiation dose in computed tomography
EP2651309B1 (en) 2010-12-13 2020-10-14 The Trustees of Columbia University in the City of New York Medical imaging devices, methods, and systems
JP2014528284A (en) * 2011-09-30 2014-10-27 チルドレンズ ホスピタル メディカル センター A method for consistent and verifiable optimization of computed tomography (CT) radiation dose
US20160259888A1 (en) * 2015-03-02 2016-09-08 Sony Corporation Method and system for content management of video images of anatomical regions
US10043261B2 (en) 2016-01-11 2018-08-07 Kla-Tencor Corp. Generating simulated output for a specimen
US10169871B2 (en) * 2016-01-21 2019-01-01 Elekta, Inc. Systems and methods for segmentation of intra-patient medical images
US10098606B2 (en) 2016-02-29 2018-10-16 Varian Medical Systems, Inc. Automatic organ-dose-estimation for patient-specific computed tomography scans
EP3392832A1 (en) * 2017-04-21 2018-10-24 General Electric Company Automated organ risk segmentation machine learning methods and systems

Cited By (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11728035B1 (en) * 2018-02-09 2023-08-15 Robert Edwin Douglas Radiologist assisted machine learning
US11064902B2 (en) * 2018-06-29 2021-07-20 Mayo Foundation For Medical Education And Research Systems, methods, and media for automatically diagnosing intraductal papillary mucinous neoplasms using multi-modal magnetic resonance imaging data
US20210295108A1 (en) * 2018-07-29 2021-09-23 Zebra Medical Vision Ltd. Systems and methods for automated detection of visual objects in medical images
US11776243B2 (en) * 2018-07-29 2023-10-03 Nano-X Ai Ltd. Systems and methods for automated detection of visual objects in medical images
US11556746B1 (en) * 2018-10-26 2023-01-17 Amazon Technologies, Inc. Fast annotation of samples for machine learning model development
US11537506B1 (en) 2018-10-26 2022-12-27 Amazon Technologies, Inc. System for visually diagnosing machine learning models
US20210271914A1 (en) * 2018-11-30 2021-09-02 Fujifilm Corporation Image processing apparatus, image processing method, and program
US11322256B2 (en) * 2018-11-30 2022-05-03 International Business Machines Corporation Automated labeling of images to train machine learning
US20220254029A1 (en) * 2019-03-27 2022-08-11 Nvidia Corporation Image segmentation using a neural network translation model
US20220198661A1 (en) * 2019-04-24 2022-06-23 Nanjing Turing Microbial Technologies Co. Ltd Artificial intelligence based medical image automatic diagnosis system and method
US11176429B2 (en) * 2019-05-13 2021-11-16 International Business Machines Corporation Counter rare training date for artificial intelligence
US20200364520A1 (en) * 2019-05-13 2020-11-19 International Business Machines Corporation Counter rare training date for artificial intelligence
US11954755B2 (en) * 2019-05-14 2024-04-09 Samsung Electronics Co., Ltd. Image processing device and operation method thereof
US20220076062A1 (en) * 2019-05-14 2022-03-10 Samsung Electronics Co., Ltd. Image processing device and operation method thereof
US20210015438A1 (en) * 2019-07-16 2021-01-21 Siemens Healthcare Gmbh Deep learning for perfusion in medical imaging
US11861514B2 (en) * 2019-09-18 2024-01-02 Luminex Corporation Using machine learning algorithms to prepare training datasets
US20210081822A1 (en) * 2019-09-18 2021-03-18 Luminex Corporation Using machine learning algorithms to prepare training datasets
US11210785B1 (en) * 2019-10-17 2021-12-28 Robert Edwin Douglas Labeling system for cross-sectional medical imaging examinations
US20210241037A1 (en) * 2020-01-30 2021-08-05 Canon Medical Systems Corporation Data processing apparatus and method
CN111312372A (en) * 2020-02-26 2020-06-19 广州金域医学检验中心有限公司 Method and device for establishing medical image standard test data set
US20220067485A1 (en) * 2020-08-31 2022-03-03 Verizon Connect Development Limited Systems and methods for utilizing a machine learning model combining episodic and semantic information to process a new class of data without loss of semantic knowledge
US11651195B2 (en) * 2020-08-31 2023-05-16 Verizon Connect Development Limited Systems and methods for utilizing a machine learning model combining episodic and semantic information to process a new class of data without loss of semantic knowledge
US11080484B1 (en) * 2020-10-08 2021-08-03 Omniscient Neurotechnology Pty Limited Natural language processing of electronic records
WO2022073058A1 (en) * 2020-10-08 2022-04-14 Omniscient Neurotechnology Pty Limited Natural language processing of electronic records
KR102247182B1 (en) * 2020-12-18 2021-05-03 주식회사 이글루시큐리티 Method, device and program for creating new data using clustering technique
CN112612898A (en) * 2021-03-05 2021-04-06 蚂蚁智信(杭州)信息技术有限公司 Text classification method and device
US20220292673A1 (en) * 2021-03-12 2022-09-15 Siemens Healthcare Gmbh On-Site training of a machine-learning algorithm for generating synthetic imaging data
CN113191385A (en) * 2021-03-25 2021-07-30 之江实验室 Unknown image classification automatic labeling method based on pre-training labeling data
CN113823385A (en) * 2021-09-03 2021-12-21 青岛海信医疗设备股份有限公司 Method, device, equipment and medium for modifying DICOM image
DE102021210920A1 (en) 2021-09-29 2023-03-30 Siemens Healthcare Gmbh Apparatus and computer-implemented method for training a machine learning system to associate a scan exam with a standardized identifier code
CN116229442A (en) * 2023-01-03 2023-06-06 武汉工程大学 Text image synthesis and instantiation weight transfer learning method
CN117173543A (en) * 2023-11-02 2023-12-05 天津大学 Mixed image reconstruction method and system for lung adenocarcinoma and pulmonary tuberculosis

Also Published As

Publication number Publication date
WO2019051359A1 (en) 2019-03-14
US20200285906A1 (en) 2020-09-10
US11615879B2 (en) 2023-03-28
WO2019051356A1 (en) 2019-03-14

Similar Documents

Publication Publication Date Title
US11615879B2 (en) System and method for automated labeling and annotating unstructured medical datasets
JP7039153B2 (en) Image enhancement using a hostile generation network
US10762398B2 (en) Modality-agnostic method for medical image representation
US10984905B2 (en) Artificial intelligence for physiological quantification in medical imaging
JP7245364B2 (en) sCT Imaging Using CycleGAN with Deformable Layers
Menze et al. The multimodal brain tumor image segmentation benchmark (BRATS)
CN110556178A (en) decision support system for medical therapy planning
Okada et al. Noninvasive differential diagnosis of dental periapical lesions in cone‐beam CT scans
Li et al. DenseX-net: an end-to-end model for lymphoma segmentation in whole-body PET/CT images
Cho et al. Medical image deep learning with hospital PACS dataset
CN111666966B (en) Material decomposition based on artificial intelligence in medical imaging
Egger et al. Vertebral body segmentation with GrowCut: Initial experience, workflow and practical application
Arega et al. Leveraging uncertainty estimates to improve segmentation performance in cardiac MR
US11908568B2 (en) System and methods for radiographic image quality assessment and protocol optimization
de Azevedo Marques et al. Content-based retrieval of medical images: landmarking, indexing, and relevance feedback
Singh et al. Attention-guided residual W-Net for supervised cardiac magnetic resonance imaging segmentation
Qu et al. Advancing diagnostic performance and clinical applicability of deep learning-driven generative adversarial networks for Alzheimer's disease
Lang et al. LCCF-Net: Lightweight contextual and channel fusion network for medical image segmentation
EP3588378A1 (en) Method for determining at least one enhanced object feature of an object of interest
Thool et al. Artificial Intelligence in Medical Imaging Data Analytics using CT Images
Bui et al. DeepHeartCT: A fully automatic artificial intelligence hybrid framework based on convolutional neural network and multi-atlas segmentation for multi-structure cardiac computed tomography angiography image segmentation
Jin A Quality Assurance Pipeline for Deep Learning Segmentation Models for Radiotherapy Applications
Weisman Automatic Quantification and Assessment of FDG PET/CT Imaging in Patients with Lymphoma
Sahlsten Applicability and Robustness of Deep Learning in Healthcare
Pálsson Robust Imaging Biomarkers for Brain Tumors

Legal Events

Date Code Title Description
AS Assignment

Owner name: THE GENERAL HOSPITAL CORPORATION, MASSACHUSETTS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:DO, SYNHO;REEL/FRAME:052093/0977

Effective date: 20180530

STPP Information on status: patent application and granting procedure in general

Free format text: APPLICATION DISPATCHED FROM PREEXAM, NOT YET DOCKETED

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION