WO2018094381A1 - System and method for automatic assessment of a condition using OCT scan data - Google Patents


Info

Publication number
WO2018094381A1
WO2018094381A1
Authority
WO
WIPO (PCT)
Prior art keywords
patient
computer system
classification module
host computer
classification
Prior art date
Application number
PCT/US2017/062747
Other languages
English (en)
Inventor
James HAYASHI
Ravi Starzl
Hugo ANGULO
Abhishek Kar
Ramesh OSWAL
Diego PENAFIEL
Weidong YAUN
Original Assignee
Tecumseh Vision, Llc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tecumseh Vision, Llc filed Critical Tecumseh Vision, Llc
Priority to US16/462,360 priority Critical patent/US20190313895A1/en
Publication of WO2018094381A1 publication Critical patent/WO2018094381A1/fr

Links

Classifications

    • A61B 3/102: Apparatus for testing the eyes; objective types, i.e. instruments for examining the eyes independent of the patients' perceptions or reactions, for optical coherence tomography [OCT]
    • G06F 18/2414: Classification techniques based on distances to training or reference patterns; smoothing the distance, e.g. radial basis function networks [RBFN]
    • G06N 3/045: Neural network architectures; combinations of networks
    • G06N 3/044: Recurrent networks, e.g. Hopfield networks
    • G06N 3/084: Learning methods; backpropagation, e.g. using gradient descent
    • G06T 7/0012: Biomedical image inspection
    • G06V 10/454: Local feature extraction using biologically inspired filters integrated into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G06V 10/82: Image or video recognition or understanding using neural networks
    • G06T 2207/10101: Image acquisition modality: optical tomography; optical coherence tomography [OCT]
    • G06T 2207/20081: Special algorithmic details: training; learning
    • G06T 2207/20084: Special algorithmic details: artificial neural networks [ANN]
    • G06T 2207/30041: Subject of image: eye; retina; ophthalmic
    • G06V 2201/03: Recognition of patterns in medical or anatomical images

Definitions

  • Age-Related Macular Degeneration (ARMD) is the leading cause of blindness in the United States. ARMD is commonly thought to exist in two forms: "dry" and "wet." The wet form often results when a choroidal neovascular membrane (CNVM) has grown beneath the retina. A CNVM often results in sudden, severe vision loss which, if left untreated, is permanent.
  • CNVM: choroidal neovascular membrane
  • AREDS and AREDS2 vitamin therapy has been shown to treat dry macular degeneration somewhat effectively. Vitamin therapy is recommended for people who have moderate and/or severe cases of the disease, although the benefit is minimal for people with mild cases.
  • the current recommendation is that people with dry macular degeneration self-monitor using an Amsler grid, along with an examination of their retina every six months.
  • Another eye disease is diabetic retinopathy, which is the leading cause of visual disability among working-age adults. An estimated 25 million Americans have been diagnosed, a figure that understates the true number of cases. Numerous clinical trials have shown that early intervention in diabetic eye disease, with ophthalmic lasers and anti-vascular endothelial growth factor agents, has a profound beneficial effect on the natural progression of the disease. Current therapies have been shown to be about 90% effective in preventing severe visual loss (visual acuity ≤5/200). The American Academy of Ophthalmology and the American Diabetes Association recommend routine screening protocols. However, despite the proven benefit of early detection, annual exams are only followed approximately 50% of the time, and annual exam rates may be as low as 30% in high-risk groups. Although the treatment of diabetes has led to a decrease in diabetic eye disease prevalence, the overall increase in the prevalence of diabetes has meant that the eye disease burden has not lessened. Because of modernization and the spread of Western dietary practices, diabetes has unfortunately become a worldwide epidemic.
  • POAG: primary open angle glaucoma
  • POAG affects over 2 million Americans and the numbers are expected to increase as the population ages.
  • Primary open angle glaucoma is an ideal disease for screening because it has a reasonably high prevalence in the population and is asymptomatic early in the course of the disease, and visual field loss can be slowed or even eliminated if the disease is detected and treated early. Screening for glaucoma is problematic, though, because measuring the intraocular pressure has been shown to be very ineffective as a screening measure.
  • the present invention is directed to systems and methods for applying machine learning to Optical Coherence Tomography (OCT) scan data of a patient's retina. It is capable of detecting the presence and/or state of disease conditions in the patient, particularly eye-related diseases, such as ARMD, glaucoma (e.g., POAG), and/or diabetic retinopathy.
  • OCT: Optical Coherence Tomography
  • Embodiments of the present invention could also be used to detect other disease conditions from OCT retina scan image data, such as cardiovascular disease, Alzheimer's disease, and/or Parkinson's disease.
  • By using OCT scan data according to the present invention, early detection of the various disease conditions can be improved. Moreover, currently most OCT machines are located at eye doctors' offices. Enhancing the functionality of OCT machines to detect other diseases creates an economic incentive to include OCT machines at the offices of primary care providers. Screening for various eye disease conditions can move from an eye specialist's setting to a primary care doctor's office, all while the patient remains at the primary care doctor's office for a visit. Moreover, such automated assessments can help cover the shortage of retinal specialists who diagnose patients. Additionally, the automated assessments can improve patient convenience (e.g., by having the assessment performed while the patient is at his/her primary care provider's office for an appointment) and compliance (e.g., by better identifying when follow-up treatment is needed).
  • Figure 1 A is a diagram of an apparatus according to various embodiments of the present invention.
  • Figure 1B is a diagram of an OCT scanner according to various embodiments of the present invention.
  • Figure 1C is an example of an OCT scan image of the retina.
  • Figure 1D is a flowchart depicting a process flow of the host computer system of the apparatus of Figure 1A according to various embodiments of the present invention.
  • Figure 2A depicts a graphical representation of LeNet.
  • Figure 2B depicts a graphical representation of AlexNet.
  • Figure 3 depicts a method for developing and training the statistical models of the apparatus of Figure 1A according to various embodiments of the present invention.
  • Figure 4 illustrates an ensemble of statistical methods.
  • Optical Coherence Tomography (OCT) is an established medical imaging technology that uses light, and analysis of the scattering of the light by biological tissue, to produce high-resolution images on the micrometer scale. It can support screening for diseases such as ARMD, glaucoma, and diabetic retinopathy.
  • the present invention, in one embodiment, can effectively leverage traditional machine learning and deep learning methodologies for the detection of diseases, such as ARMD, glaucoma, and/or diabetic retinopathy.
  • embodiments of the present invention can be used to accurately address many of the clinical questions of these conditions, often in a manner not requiring a highly trained specialist, such as a retina doctor, to read the images.
  • questions that can be addressed by the system of the present invention can include, for example: Does the patient have ARMD? If so, will the patient benefit from vitamin therapy? Does the ARMD patient now have wet ARMD? If the patient has wet ARMD, will they require frequent or less frequent injections? And if the patient has been treated and responded to anti-VEGF injections, has there been a recurrence of the CNVM? (hereinafter "the follow-up questions"). Similarly, if the patient is diagnosed with glaucoma, the follow-up questions can include whether the patient's glaucoma is severe, such that it needs to be treated soon, or not so severe, such that treatment can be delayed.
  • FIG 1A is a diagram of a system 400 according to various embodiments of the present invention.
  • the system 400 comprises an OCT scanner 402 and a host computer system 406.
  • the OCT scanner 402 is a medical imaging device that uses light (usually infrared light) to capture micrometer-resolution, three-dimensional images from within optical scattering media (e.g., biological tissue), such as a human's retina.
  • the OCT scanner 402 may include an interferometer (e.g., a Michelson type) with a low coherence, broad bandwidth light source, such as a super-luminescent diode (SLD) or laser.
  • Figure 1B is a diagram of an OCT scanner according to various embodiments of the present invention. Light from the light source (e.g., the SLD) is split by a beam-splitter (BS in Figure 1B). One beam is directed to a reference (REF) and the other beam is directed to the sample (SMP), e.g., the patient's retina.
  • The reflected light from the sample and reference paths is recombined, and a light detector, such as a camera (CAM in Figure 1B) or a photodetector, collects the image data for digital processing.
  • Figure 1C provides an example of an OCT scan of a retina, in this case of a relatively healthy macula portion of the retina.
  • Retina scan image data (or scan data of another body part, depending on the types of diseases being diagnosed) collected by the OCT scanner 402 from a patient (or patients) may be transmitted to the host computer system 406 via a data network 404, such as the Internet, a WAN, or a LAN.
  • the OCT scanner 402 could upload the scan image data to a database 415, such as a network or cloud-based database, and the host computer system 406 could then download the scan image data from the database 415 for processing.
  • the host computer system 406 statistically analyzes the scan data for the patient (or patients) to determine a likelihood that the patient has (or the patients have) the tested-for disease(s), e.g., ARMD, glaucoma, or diabetic retinopathy in one embodiment. If any of those eye diseases is identified, it analyzes additional features of the eye disease (e.g., the "follow-up questions" described above). That is, the host computer system 406 may employ machine learning techniques to classify patients as having an eye disease based on the patient's OCT scan image data and to address the follow-up questions (each being a classification task), using traditional machine learning and/or deep learning techniques. The host computer system 406 may employ an ensemble of traditional machine learning and/or deep learning algorithms to make the classifications, as described below.
  • the host computer system 406 could be co-located with the OCT scanner 402 or remote from the OCT scanner 402.
  • the OCT scanner 402 and the host computer system 406 may be in communication via a wired communication link (e.g., Ethernet) or a wireless communication link (e.g., WiFi).
  • Where the OCT scanner 402 and the host computer system 406 are remote from each other, they can be in communication via the electronic data network 404.
  • the host computer system 406 may comprise one or a number of networked computer devices, such as PCs, laptops, servers, etc. Where the host computer system 406 comprises multiple computer devices, they may be co-located or distributed across a network.
  • the host computer system 406 comprises one or more processor(s) 408 and one or more associated memory units 410 (only one of each is shown in Figure 1A for simplicity) that store software for execution by the processor(s) 408.
  • the memory unit(s) 410 may comprise primary and secondary computer memory.
  • the primary memory may be directly accessible by the processor(s) 408.
  • the processor(s) may continuously read instructions (e.g., software) stored in the primary memory and execute the instructions as required.
  • the primary memory can comprise RAM, ROM, processor registers and processor cache memory.
  • the secondary memory can comprise storage devices that are not directly accessible by the processors, such as HDDs, SSDs, flash memory, optical data storage units, magnetic tape memory, etc. Any data actively operated on by the processor(s) 408 may be stored in the primary and/or secondary memory.
  • the processor(s) 408 preferably comprise multiple processing cores, such as multiple GPU cores. GPU cores operate in parallel and, hence, can typically process data more efficiently than a collection of CPU cores, although all the cores execute the same code at one time. GPUs are particularly well suited for deep neural networks, as described below.
  • the memory unit(s) 410 comprises one or more pre-trained classification modules 412A to 412N.
  • the classification modules 412A-N store computer instructions, e.g., software, that is executed by the processor(s) 408 in order to perform the statistical analysis on the OCT scan data received from the OCT scanner 402.
  • Each classification module 412 can be "tuned” or “trained” to a particular classification question, e.g., whether the patient has a particular disease and/or follow-up questions.
  • a first classification module 412A can assess whether the patient has ARMD
  • a second classification module 412B can assess whether the patient has glaucoma (POAG)
  • a third classification module 412C can assess whether the patient has diabetic retinopathy (DR)
  • other classification modules 412D-N can address relevant follow-up classifications.
  • the classification modules 412A-N (which can each comprise an ensemble of traditional machine learning and/or deep learning algorithms) can make their respective classifications for a patient based on the patient's OCT scans. Accordingly, in (post-training) operation, the host computer system 406 receives the OCT scan data from the OCT scanner 402 for a patient, and then the processor(s) 408 executes the software for the classification modules 412A-N as needed to make their respective determinations, or classifications. To illustrate, the modules 412A-N can determine whether the patient has ARMD, and/or glaucoma, and/or diabetic retinopathy, etc.
  • the determination by the host computer system 406 can include a probability, based on its statistical analysis, that the patient has the tested-for condition, or a binary output (yes or no). If the probability exceeds some threshold (e.g., 50%), or if the result is yes in a binary determination, the classification modules can be executed to make their respective classifications for follow-up questions as needed. For example, if the ARMD classification module 412A determines that the patient likely has ARMD, the classification modules specific to the ARMD diagnosis can be executed (e.g., in the case of wet ARMD: will vitamin therapy help?).
  • the classification modules specific to the glaucoma diagnosis can be executed (e.g., is the glaucoma severe?).
  • the host computer system 406 may then display the determinations on a screen (not shown) of the host computer system and/or transmit data indicative of the determination to another (e.g., remote) computer system 417.
  • the other computer system 417 may be, for example, a computer system associated with the caregiver that performed the OCT scan and/or a computer system associated with the patient's health insurance provider.
  • Figure ID is a flowchart of a process that the host computer system may implement according to various embodiments of the present invention.
  • the host computer system 406 receives the patient OCT scan image data.
  • the host computer system 406 pre-processes the patient OCT scan image data. More details about pre-processing are described below.
  • the classification modules 412A-C can be executed to test for their respective tested-for conditions.
  • the classification modules 412A-C can be executed in parallel as suggested by Figure ID, or they can be executed serially in various embodiments.
  • the follow-up classifications for the positive tested-for conditions can be executed at steps 504A-C. Some conditions may have more follow-up classifications or questions than others. When the processing is complete at step 505, the results can be transmitted at step 506.
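  • The flow above can be sketched in code as follows. This is an illustrative sketch only: the function names, the toy lambda classifiers standing in for the trained modules 412A-N, and the 50% threshold are assumptions for demonstration, not part of the disclosed system.

```python
# Hypothetical sketch of the Figure 1D flow: pre-process the scan
# (step 501), run the disease classifiers (steps 502-503), run
# follow-up classifiers only for positive conditions (steps 504A-C),
# and collect the results for transmission (steps 505-506).

def preprocess(scan):
    # Placeholder for denoising / dimensionality reduction (see below).
    return scan

def run_pipeline(scan, disease_modules, followup_modules, threshold=0.5):
    """disease_modules: {name: classifier};
    followup_modules: {name: {question: classifier}}."""
    features = preprocess(scan)
    results = {}
    for disease, classify in disease_modules.items():
        p = classify(features)  # probability the patient has the condition
        results[disease] = {"probability": p, "followups": {}}
        if p > threshold:       # positive result: run the follow-up questions
            for question, f in followup_modules.get(disease, {}).items():
                results[disease]["followups"][question] = f(features)
    return results

# Toy stand-ins for trained classification modules 412A-N:
modules = {"ARMD": lambda f: 0.9, "POAG": lambda f: 0.2}
followups = {"ARMD": {"wet_form": lambda f: 0.7}}
out = run_pipeline([0.1, 0.2, 0.3], modules, followups)
```

Only the positive ARMD result triggers its follow-up classifier; the negative POAG result does not.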
  • the results of the host computer system 406 are provided shortly after the patient's OCT scan is taken, so that the caregiver can provide and review the results with the patient during the patient's appointment.
  • the OCT scanner 402 could be located in the office or facility of the patient's primary care provider.
  • the classification modules 412A-N may each use an ensemble of traditional machine learning and/or deep learning techniques that are trained on training data to make their respective classifications.
  • the machine learning techniques of the modules 412A-N can comprise, for example, both applied deep learning models and traditional machine learning (i.e., non-deep learning) models.
  • the traditional machine learning models can comprise, for example, decision tree learning, shallow artificial neural networks, support vector machines, and rule-based machine learning.
  • Deep learning, on the other hand, is machine learning based on implicitly learning representations of the data.
  • Deep learning architectures may include several kinds of neural networks, such as deep, feed-forward convolutional networks (i.e., convolutional neural networks (CNNs)) and various recursive neural networks.
  • neurons in a neural network are organized in layers. Different layers may perform different kinds of transformations on their inputs. Signals travel from the first (input) layer to the last (output) layer, possibly after traversing the layers multiple times. Networks with multiple hidden layers are deep neural networks.
  • a neural network, shallow or deep, is a computing system that learns (progressively improves its performance) to do tasks, such as detecting a certain disease or condition in OCT scan image data, by considering examples, generally without task-specific programming.
  • An Artificial Neural Network comprises a collection of connected units called artificial neurons. Each connection between neurons can transmit a signal to another neuron. The receiving neuron can process the signal(s) and then signal downstream neurons connected to it. Neurons may have a state, generally represented by real numbers, typically between 0.0 and 1.0. Neurons and connections may also have a weight that varies as learning proceeds, which can increase or decrease the strength of the signal sent downstream. Further, neurons may have a threshold such that a signal is sent downstream only if the aggregate signal is above (or below) that level.
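  • A minimal sketch of one such artificial neuron follows; the specific weights, bias, and 0.5 threshold are illustrative assumptions, and the sigmoid is one common choice for squashing the state into the 0.0-1.0 range.

```python
import math

# One artificial neuron: weighted inputs plus a bias are aggregated,
# squashed by a sigmoid into the 0.0-1.0 range, and forwarded
# downstream only if the result clears a threshold.

def neuron(inputs, weights, bias, threshold=0.5):
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    activation = 1.0 / (1.0 + math.exp(-total))  # sigmoid squashing
    return activation if activation >= threshold else 0.0  # gate the signal

out = neuron([1.0, 0.0, 1.0], [0.8, -0.4, 0.6], bias=-0.5)  # fires: ~0.71
```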
  • the statistical models for the modules 412A-N are generated from a database or library of OCT scan image training data, where the test subjects whose scans make up the database/library are classified as positive or negative for each classification question (e.g., whether they have ARMD or not, etc.). That is, for example, to generate a classification module 412A for ARMD, there should be sufficient, and roughly equally distributed, amounts of training data in the database/library from test subjects known to have ARMD and known not to have ARMD.
  • the classification module 412A can train each of its one or more statistical models so that, once trained, it can classify whether particular OCT scan data for a patient indicates that the patient has ARMD (or, more particularly, the classification module 412A can compute the likelihood that the patient has ARMD based on its statistical model(s)).
  • Where an ARMD follow-up classification module 412N determines whether the patient would benefit from vitamin therapy (assuming the patient was determined to likely have ARMD by the first classification module 412A), the statistical model(s) of the follow-up classification module 412N can be trained on OCT scan data for ARMD-positive patients that benefitted from vitamin therapy (positive samples) and that did not benefit from vitamin therapy (negative samples).
  • Likewise, for a follow-up classification module 412N that determines whether the ARMD is wet or dry, the statistical model(s) of that module can be trained on OCT scan data for ARMD-positive patients that have wet ARMD (positive samples) and that have dry ARMD (negative samples). And so on for the other eye diseases, follow-up questions, and other classification modules, which can classify other relevant and applicable follow-up questions.
  • the training data preferably is labeled by class (Dry Macular Degeneration, Normal Eye, Wet Macular Degeneration without treatment, Treated Wet Macular Degeneration (needs injection)).
  • each module 412A-N includes an ensemble of machine learning models, with the ensembles comprising both traditional machine learning models as well as deep learning models.
  • Deep learning on large image datasets is an extremely effective technique for classification, but it may require large amounts of data to converge in order to obtain excellent performance.
  • Another reason for deep learning's large training data requirement is the number of parameters to be learned in the training phase. Adding one layer of neurons to a deep learning network introduces a huge number of new parameters (weights) to be learned, which in turn requires large amounts of data.
  • Traditional machine learning models, such as decision trees, random forests, and support vector machines (SVMs), generally require less data to converge to optimal performance but, in some cases, may not achieve the same level of performance, and more importantly the same precision and recall, as deep learning.
  • the classification modules 412A-N preferably include multiple models from both the traditional machine learning and deep learning paradigms, leveraging the relative strengths of each approach in a single ensemble that provides high accuracy as well as high generalizability.
  • the ensemble may also be able to incorporate new image data and new rulesets to improve performance over the lifetime of the system.
  • the training data (see step 501 in Figure 1D) is preferably pre-processed prior to training of the classification modules 412A-N.
  • the pre-processing stage can provide a number of benefits. First, it can reduce the noise in the data. Second, it can compress or reduce the dimensionality of the data so that the classification modules 412A-N can be trained or operate more efficiently.
  • the preprocessing can extract features from the OCT scan data that have been identified as potentially useful in making the desired classification. An incomplete list of methods that obtain these features includes:
  • Principal Component Analysis (PCA)
  • Karhunen-Loève Transform (KLT)
  • Independent Component Analysis (ICA)
  • PCA is a dimension-reduction algorithm/technique that can be used to reduce a large set of independent variables (e.g., in an OCT scan image) to a small set that still contains most of the information in the large set.
  • it transforms a number of (possibly) correlated variables into a (smaller) number of uncorrelated variables called "principal components.”
  • the first principal component accounts for as much of the variability in the data as possible, and each succeeding component accounts for as much of the remaining variability as possible. All the principal components are uncorrelated with (orthogonal to) one another.
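  • The variance-maximizing behavior of PCA can be illustrated on toy 2-D data, where the leading eigenvector of the 2x2 covariance matrix has a closed form. This is a didactic sketch; a real system would apply a library PCA implementation to high-dimensional OCT image data. The data points below are invented.

```python
import math

# Toy 2-D PCA: the first principal component is the direction of
# maximum variance. For a 2x2 covariance matrix [[a, b], [b, c]] the
# leading eigenvector has the closed-form angle 0.5 * atan2(2b, a - c).

def first_principal_component(points):
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    a = sum((x - mx) ** 2 for x, _ in points) / n        # var(x)
    c = sum((y - my) ** 2 for _, y in points) / n        # var(y)
    b = sum((x - mx) * (y - my) for x, y in points) / n  # cov(x, y)
    theta = 0.5 * math.atan2(2 * b, a - c)
    return (math.cos(theta), math.sin(theta))

# Points scattered along the line y = x: the first component should
# point at roughly 45 degrees, i.e. about (0.707, 0.707).
pts = [(0, 0), (1, 1.1), (2, 1.9), (3, 3.05), (4, 4.0)]
vx, vy = first_principal_component(pts)
```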
  • The extracted features can then be classified using K-Nearest Neighbor (KNN), a Decision Tree, or any other suitable traditional machine learning method.
  • a traditional machine learning method can be extremely powerful if the underlying features are representative of the sources of variance in the underlying system. Ideal features should be system bases, or in other words, singular (no redundancy of information between features), and maximally informative (represent the complete variance in the measured dimension).
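  • A minimal sketch of the K-Nearest Neighbor method named above: a query is classified by majority vote among the k closest labeled training examples. The two-value feature vectors and labels are invented placeholders for extracted OCT features.

```python
import math
from collections import Counter

# K-Nearest Neighbor on toy feature vectors: rank the labeled training
# examples by Euclidean distance to the query, then take a majority
# vote among the k nearest.

def knn_classify(train, query, k=3):
    """train: list of (feature_vector, label) pairs."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

train = [([0.1, 0.2], "healthy"), ([0.2, 0.1], "healthy"),
         ([0.9, 0.8], "ARMD"), ([0.8, 0.9], "ARMD"), ([0.85, 0.85], "ARMD")]
label = knn_classify(train, [0.88, 0.82])  # nearest neighbors are all "ARMD"
```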
  • Deep neural networks, a type of applied deep learning method, differ from the traditional machine learning methods described above because they simultaneously transform, explore, and fit mathematical functions to all possible feature derivatives (e.g., using nonlinear transformations like the hyperbolic tangent or a rectified linear unit) from an original set of features.
  • Deep neural networks follow a defined architecture and model performance is very sensitive to network architecture, activation function selection, pooling layer size, and initialization settings. They can be trained, for example, with a backpropagation algorithm, which is a method to calculate the gradient of the loss function with respect to the weights for the nodes and connections in the network. Deep Neural Network performance may be difficult to replicate, even across the same data, unless identical settings and architectures are used.
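  • The gradient-descent weight update at the heart of backpropagation can be illustrated with a single logistic neuron fitted to the AND function; a deep network applies the same chain-rule update layer by layer. The learning rate, epoch count, and task are illustrative assumptions.

```python
import math

# A single logistic neuron trained by gradient descent on AND. The
# update is the innermost step of backpropagation: for log-loss with a
# sigmoid output, the chain rule gives (prediction - target) as the
# gradient with respect to the pre-activation.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w = [0.0, 0.0]
b = 0.0
lr = 0.5  # learning rate

for _ in range(2000):
    for (x1, x2), target in data:
        p = sigmoid(w[0] * x1 + w[1] * x2 + b)
        err = p - target
        w[0] -= lr * err * x1   # step along the negative gradient
        w[1] -= lr * err * x2
        b -= lr * err

preds = [round(sigmoid(w[0] * x1 + w[1] * x2 + b)) for (x1, x2), _ in data]
```

After training, the neuron reproduces the AND truth table.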
  • Both classes of deep neural network, convolutional neural networks (CNNs) and recursive neural networks, may be used in this invention.
  • each of the classification modules 412A-N can include as part of their ensemble one or more deep neural network models combined with one or more traditional machine learning models.
  • Embodiments of the present invention can use various classes of deep neural network, particularly convolutional neural networks, and recursive neural networks such as recurrent neural networks (RNNs), recurrent convolutional neural networks (RCNNs), long short-term memory (LSTM) networks, and Capsule Nets, among others.
  • the dimensions of pooling layers, network initialization states, activation function, and trained network weightings may be unique to the applications for this invention.
  • the Deep Neural Network model architectures may borrow components from popular architectures such as LeNet and/or AlexNet convolutional neural networks.
  • Figure 2A displays a graphical depiction of a LeNet model
  • Figure 2B displays a graphical representation of AlexNet.
  • LeNet and AlexNet are examples of convolutional deep neural networks that can provide high-performance image classification when correctly modified to fit the sizes of the images in the dataset and provided with adequately labeled data.
  • Sparse, convolutional layers and max-pooling are at the heart of the LeNet family of models.
  • the lower layers are composed of alternating convolutional and max-pooling layers.
  • the upper layers, however, are fully connected and correspond to a traditional multilayer perceptron (MLP) (a hidden layer plus logistic regression).
  • MLP multilayer perceptron
  • AlexNet has many layers: the first five are convolutional and the last three are fully connected, with pooling and activation layers in between. More details about LeNet and AlexNet are provided in "Deep Learning Tutorial," Release 0.1, LISA Lab, University of Montreal, September 2015 (available at deeplearning.net/tutorial/lenet.html); Y. LeCun et al., "Gradient-Based Learning Applied to Document Recognition," Proceedings of the IEEE, 86(11):2278-2324, November 1998; and Krizhevsky, A., et al., "ImageNet Classification with Deep Convolutional Neural Networks," NIPS 2012: Neural Information Processing Systems, Lake Tahoe, Nevada (2012), which are incorporated herein by reference in their entirety. As additional data is obtained, GoogLeNet may also be incorporated into the network architecture, as may other suitable neural networks. Furthermore, neural network architectures that perform more optimally on OCT scan data may also be combined with one or more of these architectures.
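The alternating convolution and max-pooling structure described above can be sketched in a few lines of plain Python. This is a toy forward pass over a single channel, not the patent's LeNet/AlexNet-style networks; the 6x6 "image" and averaging kernel are illustrative stand-ins.

```python
def conv2d_valid(img, kernel):
    """'Valid' 2-D convolution (really cross-correlation, as in most
    deep-learning libraries): slide the kernel over the image."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(img) - kh + 1, len(img[0]) - kw + 1
    return [[sum(img[i + u][j + v] * kernel[u][v]
                 for u in range(kh) for v in range(kw))
             for j in range(ow)] for i in range(oh)]

def maxpool2x2(fmap):
    """Non-overlapping 2x2 max-pooling, halving each spatial dimension."""
    return [[max(fmap[i][j], fmap[i][j + 1],
                 fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, len(fmap[0]) - 1, 2)]
            for i in range(0, len(fmap) - 1, 2)]

# A 6x6 "image" with a bright 2x2 blob; a 3x3 averaging kernel followed
# by pooling condenses the 6x6 input to a 2x2 summary feature map.
img = [[1 if 2 <= r <= 3 and 2 <= c <= 3 else 0 for c in range(6)]
       for r in range(6)]
kernel = [[1 / 9.0] * 3 for _ in range(3)]
pooled = maxpool2x2(conv2d_valid(img, kernel))
```

In a LeNet-family model this conv-then-pool pair is stacked several times before the fully connected MLP layers make the final classification.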
  • the models may be modified to leverage any time-series data available as part of a Recurrent Neural Network (RNN), which in practice works more optimally with information-rich sequential data.
  • RNN Recurrent Neural Network
  • the decisions for the various dimensions may depend on empirical determination. More details that may drive dimension decisions are available in (1) Lipton ZC, Berkowitz J, Elkan C, "A critical review of recurrent neural networks for sequence learning," arXiv:1506.00019
  • Aggregation, also known as "bagging," is a method of combining multiple sub-models (e.g., the deep learning and traditional machine learning algorithms in the ensemble) into a single model that retains an optimum level of performance.
  • the sub-models in the ensemble may be any number of deep learning and/or traditional machine learning models as described above.
  • individual models may be trained on random subsets of data repeatedly, and the resulting models are combined into an ensemble, where the resulting models are combined using a simple linear function, such as the maximum votes among ensemble members.
  • the various models may be trained based on the training OCT scan data, as described earlier. There is no inherent limitation to the type or number of models that can be combined in the classification modules 412A-N.
  • a possible ensemble may include, for example, a LeNet architecture model, an AlexNet architecture model, a Decision Tree model, and a KNN model.
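The bagging procedure described above can be sketched as follows. This is a minimal illustration in plain Python with a hypothetical threshold "trainer" standing in for the LeNet, AlexNet, Decision Tree, and KNN members named above; the subset fraction and vote rule are assumptions, not the patent's settings.

```python
import random

def bagged_ensemble(train_fn, data, n_models=5, subset_frac=0.8, seed=0):
    """Train each sub-model on a random subset of the data ("bagging")."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        subset = rng.sample(data, int(len(data) * subset_frac))
        models.append(train_fn(subset))
    return models

def majority_vote(models, x):
    """Combine member predictions with a simple linear rule:
    the class with the maximum number of votes among members wins."""
    votes = [m(x) for m in models]
    return max(set(votes), key=votes.count)

# Hypothetical stand-in "trainer": each sub-model thresholds the input
# at the mean feature value of the positive examples in its subset.
def train_threshold_model(subset):
    pos = [x for x, y in subset if y == 1] or [0.5]
    t = sum(pos) / len(pos)
    return lambda x, t=t: 1 if x >= t else 0

data = [(v / 10.0, 1 if v >= 5 else 0) for v in range(10)]
models = bagged_ensemble(train_threshold_model, data)
decision = majority_vote(models, 0.9)
```

Any trained model exposing a predict function can be dropped into the same `bagged_ensemble` scaffold, which reflects the statement that there is no inherent limit on the type or number of combined models.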
  • the classification modules 412A-N may have more or fewer deep learning and/or traditional machine learning models, and/or different kinds of deep learning and/or traditional machine learning models, such as RNNs, RCNNs, and LSTMs, as described above.
  • FIG. 3 displays an embodiment of the method 500 for developing and training the models that make up the ensemble of any one of the classification modules 412A-N.
  • the method of FIG. 3 may be implemented with a suitably programmed computer system, such as the host computer system 406.
  • the training OCT scan data, which may be in a Digital Imaging and Communications in Medicine (DICOM) format, is pre-processed.
  • the pre-processing can include PCA, ICA, transformation, normalization, mean subtraction, and/or whitening.
  • the data may then be used in training the deep learning models at step 506 and the traditional machine learning models at step 508.
  • DICOM Digital Imaging and Communications in Medicine
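Two of the pre-processing steps named above, mean subtraction and PCA, can be sketched in plain Python. This is a toy 2-D closed-form PCA standing in for the full decomposition that would be run on high-dimensional OCT scan data; the example data is hypothetical.

```python
import math

def mean_subtract(X):
    """Center each feature at zero mean, as in the pre-processing step."""
    n = len(X)
    means = [sum(row[j] for row in X) / n for j in range(len(X[0]))]
    return [[x - m for x, m in zip(row, means)] for row in X], means

def pca_2d(X):
    """First principal axis of 2-D data via the closed-form eigenvector
    of the 2x2 covariance matrix (a toy stand-in for full PCA)."""
    Xc, _ = mean_subtract(X)
    n = len(Xc)
    cxx = sum(r[0] * r[0] for r in Xc) / n
    cyy = sum(r[1] * r[1] for r in Xc) / n
    cxy = sum(r[0] * r[1] for r in Xc) / n
    # Largest eigenvalue of [[cxx, cxy], [cxy, cyy]].
    tr, det = cxx + cyy, cxx * cyy - cxy * cxy
    lam = tr / 2 + math.sqrt(max(tr * tr / 4 - det, 0.0))
    # Corresponding eigenvector, normalized to unit length.
    vx, vy = (cxy, lam - cxx) if abs(cxy) > 1e-12 else (1.0, 0.0)
    norm = math.hypot(vx, vy)
    return vx / norm, vy / norm

# Points scattered along the line y = x: the principal axis should be
# roughly the unit vector (1/sqrt(2), 1/sqrt(2)).
X = [[0.0, 0.1], [1.0, 0.9], [2.0, 2.1], [3.0, 2.9], [4.0, 4.0]]
axis = pca_2d(X)
```

Projecting the centered data onto the leading principal axes reduces dimensionality before the models of steps 506 and 508 are trained.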
  • the models can then be tested on the training data. If a model shows insufficient performance, it is not included in the ensemble; conversely, if the model's performance is sufficient, it can be included. For example, a threshold based on the performance obtained on the training set can segregate the successful models, which are aggregated into the ensemble, from the unsuccessful ones.
  • the performance decision, in this context, can be based on the F-Measure and the Receiver Operating Characteristic (ROC) curve to determine the threshold that a model must exceed in order to be included in the ensemble.
  • the F-Measure is a statistical analysis approach that considers precision and recall, which are fundamental in the medical context, and measures the effectiveness of the model. Additionally, the ROC curve measures the capability of the model to distinguish between two outcomes: it plots the sensitivity (true positive rate) as a function of the false positive rate (1 − specificity).
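The two performance measures above reduce to simple counts over a confusion matrix. The sketch below (plain Python, hypothetical labels) computes the F-Measure and one ROC operating point; a full ROC curve would sweep such points over the model's score threshold.

```python
def confusion(y_true, y_pred):
    """Counts of true/false positives and negatives for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def f_measure(y_true, y_pred):
    """Harmonic mean of precision and recall (the F1 score)."""
    tp, fp, fn, _ = confusion(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def roc_point(y_true, y_pred):
    """One point on the ROC curve: (false positive rate, sensitivity)."""
    tp, fp, fn, tn = confusion(y_true, y_pred)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return fpr, tpr

y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]   # 3 TP, 1 FN, 1 FP, 3 TN
f1 = f_measure(y_true, y_pred)       # precision 0.75, recall 0.75
fpr, tpr = roc_point(y_true, y_pred)
```

A model would then be admitted to the ensemble only if, for example, its `f1` exceeds the chosen threshold on the training set.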
  • This process can be performed for each model that is generated for the training data.
  • the models developed in steps 506 and 508 may then be combined at step 510 to form the ensemble for the particular classification module 412.
  • the classification module 412 uses a decision criterion to combine the results from the various models in the ensemble, such as a predetermined weighting method. For example, each model could be weighted evenly with a majority-rules criterion, such that if a majority of the models in the ensemble classify the patient as having the condition, the decision of the ensemble is that the patient has the condition, and vice versa. Other weighting methods could also be used, such as giving higher weight to models that tend to be more accurate.
  • the classification module 412 can generate "soft" results, such as probabilities that the patient has the tested-for condition, rather than a binary positive-negative decision.
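The weighting and "soft" output ideas above can be combined in one small function. This is an illustrative sketch in plain Python; the member probabilities and the weight values are hypothetical, not tuned parameters of the invention.

```python
def weighted_soft_vote(probs, weights=None):
    """Combine per-model probabilities into a single "soft" ensemble
    probability; weights default to an even split (majority-rule style)."""
    if weights is None:
        weights = [1.0] * len(probs)
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, probs)) / total

# Three hypothetical ensemble members report P(condition present).
probs = [0.9, 0.8, 0.4]

even = weighted_soft_vote(probs)                 # (0.9+0.8+0.4)/3 = 0.7
# Give the historically more accurate first model a larger weight.
skewed = weighted_soft_vote(probs, weights=[3.0, 1.0, 1.0])
binary = 1 if even >= 0.5 else 0                 # hard decision if needed
```

The soft probability can be reported to the caregiver directly, or thresholded into the binary positive/negative classification described above.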
  • the models of the modules 412A-N may continue to be trained after going into the testing stage.
  • the host computer system 406 may include classification modules tuned to other diseases that can be detected from OCT scan image data by such statistical models.
  • the classification modules could be trained to detect non-eye related diseases that are detectable through OCT retina scan image data, such as cardiovascular, Alzheimer's and/or Parkinson's disease. Again, such a classification module would need to be trained with a sufficient number of samples for that particular task/disease.
  • the classification module(s) would preferably include an ensemble of task-specific machine learning models and deep learning models, as described above.
  • the present invention is directed to an apparatus that comprises an OCT scanner 402 and a host computer system 406.
  • the OCT scanner 402 captures patient scan image data of a retina of a patient, where the patient scan image data comprises 3-dimensional image data of the patient's retina.
  • the host computer system 406 receives the patient scan image data of the patient's retina captured by the OCT scanner 402.
  • the host computer system 406 comprises a plurality of classification modules 412A-N that make separate classifications based on the patient scan image data of the patient.
  • the plurality of classification modules 412A-N are pre-trained on labeled OCT scan image training data that is pre-processed prior to training the classification modules, where the preprocessing comprises a principal component analysis (PCA) of the labeled OCT scan image training data.
  • PCA principal component analysis
  • the plurality of classification modules 412A-N comprises: (i) a first classification module 412A that, when executed by the host computer system 406, determines a likelihood that the patient has ARMD; (ii) a second classification module 412B that, when executed by the host computer system 406, determines a likelihood that the patient has glaucoma; and (iii) a third classification module 412C that, when executed by the host computer system 406, determines a likelihood that the patient has diabetic retinopathy.
  • Each of the first, second and third modules 412A-C comprises an ensemble of machine learning algorithms for making their classifications.
  • the host computer system 406 transmits the determinations of the first, second and third classification modules to a remote computer system 417, which may be co-located with the OCT scanner 402.
  • the OCT scanner 402 and the remote computer system 417 could be co-located at a primary care facility of the patient, and the host computer system 406 transmits the determinations of the first, second and third classification modules 412A-C to the remote computer system 417 within 10 to 30 minutes of the OCT scanner 402 capturing the scan image data of the patient's retina.
  • the present invention is directed to a method that comprises the step of pre-processing, by the host computer system 406, labeled OCT scan image training data, where the pre-processing comprises a principal component analysis (PCA) of the labeled OCT scan image training data.
  • the method further comprises the steps of, after pre-processing the labeled OCT scan image training data, training, by the host computer system 406, a plurality of classification modules 412A-N of the host computer system 406, where the plurality of classification modules 412A-N are trained with the pre-processed labeled OCT scan image training data.
  • the plurality of classification modules may comprise: (i) a first classification module 412A that, when executed by the host computer system 406, determines a likelihood that a patient has ARMD; (ii) a second classification module 412B that, when executed by the host computer system 406, determines a likelihood that the patient has glaucoma; and (iii) a third classification module 412C that, when executed by the host computer system 406, determines a likelihood that the patient has diabetic retinopathy.
  • the method further comprises the step of capturing, by the OCT scanner 402, patient scan image data of a retina of a patient, where the patient scan image data comprises 3-dimensional image data of the patient's retina.
  • the method further comprises the step of receiving, by the host computer system 406, the patient scan image data captured by the OCT scanner 402.
  • the method further comprises the steps of: (i) determining, by the host computer system, by execution of the first classification module 412A, a likelihood that the patient has ARMD; (ii) determining, by the host computer system 406, by execution of the second classification module 412B, a likelihood that the patient has glaucoma; and (iii) determining, by the host computer system 406, by execution of the third classification module 412C, a likelihood that the patient has diabetic retinopathy.
  • the method further comprises the step of transmitting, by the host computer system 406, the determinations of the first, second and third classification modules to a remote computer system.
  • the ensembles for each of the first, second and third classification module 412A-C respectively comprises at least one deep learning algorithm and at least one traditional machine learning (i.e., non-deep learning) algorithm.
  • the host computer system 406 comprises a fourth classification module that determines, when executed by the host computer system 406, a feature of the patient's ARMD upon a determination by the first classification module 412A that the likelihood that the patient has ARMD is above a threshold level.
  • the feature may be whether the patient has wet ARMD or whether the patient will benefit from vitamin therapy, for example.
  • the fourth classification module may also comprise an ensemble of machine learning algorithms for making the classification, where the ensemble comprises at least one deep learning algorithm and at least one traditional machine learning algorithm.
  • the host computer system 406 may also transmit the determination of the fourth classification module to the remote computing system 417.
  • the host computer system 406 may also include a classification module that determines, when executed by the host computer system, a feature of the patient's glaucoma upon a determination by the second classification module 412B that the likelihood that the patient has glaucoma is above a threshold level.
  • That classification module may also comprise an ensemble of machine learning algorithms for making the classification, where the ensemble comprises at least one deep learning algorithm and at least one traditional machine learning algorithm.
  • the host computer system 406 may also transmit the determination of the classification module to the remote computing system 417.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Medical Informatics (AREA)
  • Molecular Biology (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Radiology & Medical Imaging (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Surgery (AREA)
  • Ophthalmology & Optometry (AREA)
  • Animal Behavior & Ethology (AREA)
  • Public Health (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Veterinary Medicine (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Quality & Reliability (AREA)
  • Eye Examination Apparatus (AREA)
  • Image Analysis (AREA)

Abstract

Machine learning algorithms are applied to OCT scan image data of a patient's retina to assess the patient for various eye diseases, such as ARMD, glaucoma, and diabetic retinopathy. The classification modules for each disease or condition tested for preferably comprise an ensemble of machine learning algorithms, which preferably include both deep learning and traditional (non-deep) machine learning algorithms. The results of the analysis can be transmitted back to the caregiver facility that used the OCT scanner to scan the patient's retina, while the patient is still present at the caregiver facility for an appointment.
PCT/US2017/062747 2015-11-12 2017-11-21 System and method for automatic assessment of disease condition using OCT scan data WO2018094381A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/462,360 US20190313895A1 (en) 2015-11-12 2017-11-21 System and method for automatic assessment of disease condition using oct scan data

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201662424832P 2016-11-21 2016-11-21
US62/424,832 2016-11-21
US201762524681P 2017-06-26 2017-06-26
US62/524,681 2017-06-26

Publications (1)

Publication Number Publication Date
WO2018094381A1 true WO2018094381A1 (fr) 2018-05-24

Family

ID=62145845

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2017/062747 WO2018094381A1 (fr) 2015-11-12 2017-11-21 System and method for automatic assessment of disease condition using OCT scan data

Country Status (1)

Country Link
WO (1) WO2018094381A1 (fr)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109171670A (zh) * 2018-06-25 2019-01-11 天津海仁医疗技术有限公司 A 3D blood vessel imaging algorithm based on inverse principal component analysis
CN109528155A (zh) * 2018-11-19 2019-03-29 复旦大学附属眼耳鼻喉科医院 An intelligent screening system for open-angle glaucoma concurrent with high myopia and a method for establishing the same
CN109691979A (zh) * 2019-01-07 2019-04-30 哈尔滨理工大学 A deep-learning-based method for classifying lesions in diabetic retina images
CN110059730A (zh) * 2019-03-27 2019-07-26 天津大学 A capsule-network-based method for classifying thyroid nodule ultrasound images
CN110110600A (zh) * 2019-04-04 2019-08-09 平安科技(深圳)有限公司 Method, apparatus and storage medium for identifying lesions in ocular OCT images
CN110188820A (zh) * 2019-05-30 2019-08-30 中山大学 A retinal OCT image classification method based on deep-learning sub-network feature extraction
WO2020150441A1 (fr) * 2019-01-16 2020-07-23 Tecumseh Vision, Llc Using artificial intelligence and biometric data for serial screening exams for medical conditions
CN111488486A (zh) * 2020-04-20 2020-08-04 武汉大学 An electronic music classification method and system based on multiple-sound-source separation
EP3751581A1 (fr) * 2019-06-07 2020-12-16 Welch Allyn, Inc. Digital image screening and/or diagnosis using artificial intelligence

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120184845A1 (en) * 2010-11-11 2012-07-19 University Of Pittsburgh - Of The Commonwealth System Of Higher Education Automated macular pathology diagnosis in threedimensional (3d) spectral domain optical coherence tomography (sd-oct) images
US8474978B2 (en) * 2007-06-15 2013-07-02 University Of Southern California Pattern analysis of retinal maps for the diagnosis of optic nerve diseases by optical coherence tomography
WO2015017536A1 (fr) * 2013-07-31 2015-02-05 The Board Of Trustees Of The Leland Stanford Junior University Method and system for evaluating the progression of age-related macular degeneration
US20150110372A1 (en) * 2013-10-22 2015-04-23 Eyenuk, Inc. Systems and methods for automatically generating descriptions of retinal images
US20150265144A1 (en) * 2012-11-08 2015-09-24 The Johns Hopkins University System and method for detecting and classifying severity of retinal disease

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8474978B2 (en) * 2007-06-15 2013-07-02 University Of Southern California Pattern analysis of retinal maps for the diagnosis of optic nerve diseases by optical coherence tomography
US20120184845A1 (en) * 2010-11-11 2012-07-19 University Of Pittsburgh - Of The Commonwealth System Of Higher Education Automated macular pathology diagnosis in threedimensional (3d) spectral domain optical coherence tomography (sd-oct) images
US20150265144A1 (en) * 2012-11-08 2015-09-24 The Johns Hopkins University System and method for detecting and classifying severity of retinal disease
WO2015017536A1 (fr) * 2013-07-31 2015-02-05 The Board Of Trustees Of The Leland Stanford Junior University Method and system for evaluating the progression of age-related macular degeneration
US20150110372A1 (en) * 2013-10-22 2015-04-23 Eyenuk, Inc. Systems and methods for automatically generating descriptions of retinal images

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109171670A (zh) * 2018-06-25 2019-01-11 天津海仁医疗技术有限公司 A 3D blood vessel imaging algorithm based on inverse principal component analysis
CN109171670B (zh) * 2018-06-25 2021-02-05 天津海仁医疗技术有限公司 A 3D blood vessel imaging algorithm based on inverse principal component analysis
CN109528155A (zh) * 2018-11-19 2019-03-29 复旦大学附属眼耳鼻喉科医院 An intelligent screening system for open-angle glaucoma concurrent with high myopia and a method for establishing the same
CN109691979A (zh) * 2019-01-07 2019-04-30 哈尔滨理工大学 A deep-learning-based method for classifying lesions in diabetic retina images
US20220122730A1 (en) * 2019-01-16 2022-04-21 Tecumseh Vision, Llc Using artificial intelligence and biometric data for serial screening exams for medical conditions
US12014827B2 (en) 2019-01-16 2024-06-18 Tecumseh Vision, Llc Using artificial intelligence and biometric data for serial screening exams for medical conditions
WO2020150441A1 (fr) * 2019-01-16 2020-07-23 Tecumseh Vision, Llc Using artificial intelligence and biometric data for serial screening exams for medical conditions
CN110059730A (zh) * 2019-03-27 2019-07-26 天津大学 A capsule-network-based method for classifying thyroid nodule ultrasound images
CN110110600A (zh) * 2019-04-04 2019-08-09 平安科技(深圳)有限公司 Method, apparatus and storage medium for identifying lesions in ocular OCT images
CN110110600B (zh) * 2019-04-04 2024-05-24 平安科技(深圳)有限公司 Method, apparatus and storage medium for identifying lesions in ocular OCT images
CN110188820B (zh) * 2019-05-30 2023-04-18 中山大学 A retinal OCT image classification method based on deep-learning sub-network feature extraction
CN110188820A (zh) * 2019-05-30 2019-08-30 中山大学 A retinal OCT image classification method based on deep-learning sub-network feature extraction
EP3751581A1 (fr) * 2019-06-07 2020-12-16 Welch Allyn, Inc. Digital image screening and/or diagnosis using artificial intelligence
US11915826B2 (en) 2019-06-07 2024-02-27 Welch Allyn, Inc. Digital image screening and/or diagnosis using artificial intelligence
CN111488486B (zh) * 2020-04-20 2021-08-17 武汉大学 An electronic music classification method and system based on multiple-sound-source separation
CN111488486A (zh) * 2020-04-20 2020-08-04 武汉大学 An electronic music classification method and system based on multiple-sound-source separation

Similar Documents

Publication Publication Date Title
US20190313895A1 (en) System and method for automatic assessment of disease condition using oct scan data
WO2018094381A1 (fr) System and method for automatic assessment of disease condition using OCT scan data
Pires et al. A data-driven approach to referable diabetic retinopathy detection
Ahmed et al. An expert system to predict eye disorder using deep convolutional neural network
Jain et al. Retinal eye disease detection using deep learning
Khanna et al. Deep learning based computer-aided automatic prediction and grading system for diabetic retinopathy
Singh et al. Collaboration of features optimization techniques for the effective diagnosis of glaucoma in retinal fundus images
Elsharif et al. Retina diseases diagnosis using deep learning
Zeng et al. Automated detection of diabetic retinopathy using a binocular siamese-like convolutional network
Maaliw et al. Cataract detection and grading using ensemble neural networks and transfer learning
Abirami et al. A novel automated komodo Mlipir optimization-based attention BiLSTM for early detection of diabetic retinopathy
Modi et al. Smart detection and diagnosis of diabetic retinopathy using bat based feature selection algorithm and deep forest technique
Rafay et al. EyeCNN: exploring the potential of convolutional neural networks for identification of multiple eye diseases through retinal imagery
izza Rufaida et al. Residual convolutional neural network for diabetic retinopathy
Kayadibi et al. A Hybrid R-FTCNN based on principal component analysis for retinal disease detection from OCT images
Gupta et al. Diabetic retinopathy detection using an efficient artificial intelligence method
Elakkiya et al. A comparative analysis of pretrained and transfer-learning model for automatic diagnosis of glaucoma
Sharma et al. Analysis of eye disease classification by fundus images using different machine/deep/transfer learning techniques
Jena et al. A Novel Approach for Diabetic Retinopathy Screening Using Asymmetric Deep Learning Features. Big Data Cogn. Comput. 2023, 7, 25
Viraktamath et al. Detection of Diabetic Maculopathy
Sabi et al. CLASSIFICATION OF AGE-RELATED MACULAR DEGENERATION USING DAG-CNN ARCHITECTURE
Verma et al. Comparative Analysis of CNN Models for Retinal Disease Detection
Datta et al. Critical retinal disease detection from optical coherence tomography images by deep convolutional neural network and explainable machine learning
Fathima et al. Detection of Diabetic Retinopathy Using Deep Learning Models
Mandal et al. Optimizing deep learning based retinal diseases classification on optical coherence tomography scans

Legal Events

Date Code Title Description
DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17870730

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17870730

Country of ref document: EP

Kind code of ref document: A1
