WO2019082203A1 - A system and method for detection and classification of retinal disease - Google Patents

A system and method for detection and classification of retinal disease

Info

Publication number
WO2019082203A1
Authority
WO
WIPO (PCT)
Prior art keywords
retinal
fundus image
processor
retinal disease
convolutional network
Prior art date
Application number
PCT/IN2018/050682
Other languages
French (fr)
Inventor
Lalit Pant
Pradeep WALIA
Rajarajeshwari KODHANDAPANI
Raja Raja LAKSHMI
Mrinal HALOI
Original Assignee
Artificial Learning Systems India Private Limited
Priority date
Filing date
Publication date
Application filed by Artificial Learning Systems India Private Limited
Publication of WO2019082203A1


Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0012 Biomedical image inspection
    • G06T7/0014 Biomedical image inspection using an image reference approach
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H30/00 ICT specially adapted for the handling or processing of medical images
    • G16H30/40 ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30041 Eye; Retina; Ophthalmic

Definitions

  • the invention relates to the field of medical decision support. More particularly, the invention relates to detection of a presence and/or classification of a retinal disease using a fundus image of a patient based on machine learning applications.
  • Vision is an important survival attribute for a human, making the eyes one of the most vital sensory organs. Though most eye diseases may not be fatal, failure to properly diagnose and treat an eye disease may lead to vision loss. Early detection of eye diseases through regular screening may prevent visual loss and blindness amongst patients. Analysis of fundus images of a patient is a very convenient way of screening and monitoring eye diseases. The fundus image of the patient illustrates several elements such as the optic disc, blood vessels, macula, etc. The fundus of the eye provides indications of several diseases, in particular eye diseases like diabetic retinopathy.
  • diabetic retinopathy is one of the primary causes of vision loss in the working-age population. Long-term complications of diabetes include diabetic retinopathy. As the number of patients with diabetes continues to increase, the groundwork required to prevent visual loss due to diabetic retinopathy will become even more deficient. The expertise required is often lacking in areas where the rate of diabetes in populations is high and diabetic retinopathy detection is most needed. Micro-aneurysms are an important feature used for detecting diabetic retinopathy in the fundus image of the patient. Small areas of swelling caused by vascular changes in the retina's blood vessels are known as micro-aneurysms.
  • Micro-aneurysms may sooner or later cause plasma leakage resulting in thickening of the retina. This is known as edema. Thickening of the retina in the macular region may result in vision loss. Proper distinction of features in the fundus image is critical as wrong predictions may lead to wrong treatments causing difficulties to the patient.
  • the present invention discloses a system.
  • the system comprises at least one processor; and one or more storage devices configured to store software instructions configured for execution by the at least one processor in order to cause the system to: receive a fundus image of a patient; identify a plurality of indicators throughout the fundus image using a convolutional network; detect a presence or absence of a retinal disease based on the identified indicators using the convolutional network; and classify a severity of the retinal disease based on the presence or absence of the retinal disease using the convolutional network.
  • Figure 1 illustrates a block diagram of a system in accordance with the invention
  • Figure 2 exemplarily illustrates a convolutional network to compute a presence or absence of a retinal disease and related severity of the retinal disease associated with an input fundus image
  • Figure 3 illustrates a flowchart for determining the presence or absence of the retinal disease and related severity of the retinal disease associated with the input fundus image in accordance with the invention.

Detailed description of the invention
  • FIG. 1 illustrates a block diagram of a system 1000 in accordance with the invention.
  • the system 1000 comprises at least one processor 102; and one or more storage devices 103 configured to store software instructions configured for execution by the at least one processor 102 in order to cause the system 1000 to: receive a fundus image of a patient; identify a plurality of indicators throughout the fundus image using a convolutional network; detect a presence or absence of a retinal disease based on the identified indicators using the convolutional network; and classify a severity of the retinal disease based on the presence or absence of the retinal disease using the convolutional network.
  • the fundus image herein, refers to a two-dimensional array of digital image data, however, this is merely illustrative and not limiting of the scope of the invention.
  • the one or more storage devices 103 is, for example, a database to store a structured collection of data.
  • the one or more storage devices 103 may be an internal part of the system 1000.
  • the one or more storage devices 103 may be remotely located and accessed via a network.
  • the one or more storage devices 103 may be, for example, removable and/or nonremovable data storage such as a tape, a magnetic disk, an optical disk, a flash memory card, etc.
  • the one or more storage devices 103 may comprise, for example, random access memory (RAM), read only memory (ROM), electrically erasable programmable read-only memory (EEPROM), a digital versatile disk (DVD), a compact disk (CD), a flash memory, a magnetic tape, a magnetic disk storage, or any combination thereof that can be used to store and access information and is a part of the system 1000.
  • the indicator is one of an abnormality, a retinal feature or the like.
  • the retinal feature is an optic disc, a macula, a blood vessel or the like.
  • the abnormality is one of a lesion like a venous beading, a venous loop, an intra retinal microvascular abnormality, an intra retinal hemorrhage, a micro aneurysm, a soft exudate (cotton-wool spots), a hard exudate, a vitreous/preretinal hemorrhage, neovascularization, a drusen or the like.
  • the retinal disease is one of diabetic retinopathy, diabetic macular edema, glaucoma, coloboma, retinal tear, retinal detachment or the like.
  • the severity of the retinal disease is represented as levels of increasing seriousness of the retinal disease.
  • the processor 102 of the system 1000 receives a reference dataset from one or more input devices.
  • the reference dataset comprises a plurality of fundus images.
  • the fundus images in the reference dataset are referred to as reference fundus images.
  • the input device is, for example, a camera incorporated into a mobile device such as a smartphone, a server, a network of personal computers, or simply a personal computer, a mainframe, a tablet computer, etc.
  • the system 1000 stores the reference dataset in the one or more storage devices 103 of the system 1000.
  • the reference fundus image is a two-dimensional array of digital image data used for the purpose of training the system 1000.
  • the term 'training' generally refers to a process of developing the system 1000 for the detection and classification of the retinal disease based on the reference dataset and a reference ground-truth file.
  • the reference ground-truth file comprises a label and a reference fundus image identifier for each of the reference fundus images.
  • the label provides information about the reference fundus image such as a presence or absence of a retinal disease, the type of retinal disease and the corresponding severity of the retinal disease identified in the reference fundus image.
  • the reference fundus image identifier of a reference fundus image is, for example, a name or identity assigned to the reference fundus image.
  • an annotator annotates each of the reference fundus images using an annotation platform 101 of the system 1000.
  • the annotation platform 101 is a graphical user interface (GUI) provided for the annotator to interact with the system 1000.
  • the annotator accesses the reference fundus images via the annotation platform 101.
  • the annotator creates a label with information about a presence or absence of a retinal disease, the type of retinal disease and the corresponding severity of the retinal disease based on the annotation.
  • the annotator is usually a trained/certified specialist in accurately annotating the fundus image by analyzing the indicators present in the reference fundus image. In an example, consider that the annotator annotates the reference fundus images for the retinal disease - diabetic retinopathy (DR).
  • the annotator may consider one or more standard DR grading standards such as the American ophthalmology DR grading scheme, the Scottish DR grading scheme, the UK DR grading scheme, etc., to annotate the reference fundus images.
  • the annotator may assign a DR severity grade 0 (representing no DR), grade 1 (representing mild DR), grade 2 (representing moderate DR), grade 3 (representing severe DR) or grade 4 (representing proliferative DR) to each of the reference fundus images.
  • the label of the reference fundus image represents the DR severity level associated with the patient.
  • the annotator labels each of the reference fundus images as one of five severity classes - 'No DR', 'DR1', 'DR2', 'DR3' and 'DR4' - based on an increasing seriousness of DR.
  • 'No DR', 'DR1', 'DR2', 'DR3' and 'DR4' represents the labels indicating different levels of increasing severity of DR associated with the patient.
  • the annotator analyses the indicators in the retinal fundus image and accordingly marks the label. If the annotator detects a microaneurysm, then the annotator considers it as a mild level of DR and marks the label as DR1 for the reference fundus image.
  • upon detection of indicators beyond a mild level, the annotator marks the label as DR2 for the reference fundus image; the label DR2 indicates a moderate level of DR.
  • the annotator marks the label as DR3 for the reference fundus image with a severe level of DR upon detection of multiple hemorrhages, hard or soft exudates, etc., and DR4 for the reference fundus image with a proliferative level of DR upon detection of vitreous hemorrhage, neovascularization, etc.
  • the reference fundus image with no traces of DR is marked with the label as 'No DR' by the annotator.
  • the annotator stores the label and the reference fundus image identifier for each reference fundus image in the reference ground-truth file located in the one or more storage devices 103.
  • the label provides information about the type of retinal disease and the corresponding severity of the retinal disease as annotated by the annotator.
  • the processor 102 identifies the indicators throughout each of the reference fundus images to detect the presence or absence of the retinal disease using image analysis techniques.
  • the processor 102 classifies the severity of the retinal disease based on the presence of the retinal disease using a set of predetermined rules.
  • the predetermined rules comprise considering a type of each of the indicators, a count of each of the indicators, a region of occurrence of each of the indicators, a contrast level of each of the indicators, a size of each of the indicators or any combination thereof to recognize the retinal disease and the severity of the retinal disease.
  • the processor 102 classifies each of the detected retinal diseases according to a corresponding severity grading and generates the label.
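The predetermined severity rules above can be sketched as a small rule table. The indicator names and thresholds below are illustrative assumptions: only the DR1, DR3 and DR4 criteria are stated in this description, and the DR2 threshold (a single hemorrhage) is assumed.

```python
def grade_dr_severity(indicators):
    """Map detected indicator type names to a DR severity label.

    `indicators` is a list of type names, one entry per detected indicator,
    so counts of each indicator type are preserved.
    """
    types = set(indicators)
    # Proliferative signs (vitreous hemorrhage, neovascularization) -> DR4.
    if types & {"vitreous_hemorrhage", "neovascularization"}:
        return "DR4"
    # Multiple hemorrhages or hard/soft exudates -> severe DR.
    if indicators.count("hemorrhage") > 1 or types & {"hard_exudate", "soft_exudate"}:
        return "DR3"
    # A single hemorrhage -> moderate DR (assumed threshold).
    if "hemorrhage" in types:
        return "DR2"
    # A microaneurysm -> mild DR.
    if "microaneurysm" in types:
        return "DR1"
    return "No DR"
```

A real rule set would also weigh the region, contrast and size of each indicator, as the description notes.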
  • the processor 102 communicates with the one or more storage devices 103 to store the label and the reference fundus image identifier for each reference fundus image in the reference ground-truth file.
  • the processor 102 utilizes the reference dataset to train the convolutional network for subsequent detection and classification of the retinal disease in the fundus image.
  • the fundus image which is subsequently analyzed by the processor 102 is referred to as an input fundus image for clarity.
  • the processor 102 pre-processes each of the reference fundus images. For each reference fundus image, the processor 102 executes the following steps as part of the preprocessing.
  • the processor 102 separates any text matter present at the border of the reference fundus image.
  • the processor 102 adds a border to the reference fundus image with border pixel values as zero.
  • the processor 102 increases the size of the reference fundus image by a predefined number of pixels, for example, 20 pixels in width and height.
  • the additional pixels added are of a zero value.
  • the processor 102 next converts the reference fundus image from a RGB color image to a grayscale image.
  • the processor 102 then binarizes the reference fundus image using histogram analysis.
  • the processor 102 applies repetitive morphological dilation with a rectangular element of size [5, 5] to smoothen the binarized reference fundus image.
  • the processor 102 acquires all connected regions, such as the retina and text matter, of the smoothed reference fundus image to separate text matter present in the reference fundus image from a foreground image.
  • the processor 102 determines the largest region among the acquired connected regions as the retina.
  • the retina is assumed to be the connected element with the largest region.
  • the processor 102 calculates a corresponding bounding box for the retina.
  • the processor 102 thus identifies retina from the reference fundus image.
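The "largest connected region is the retina" step above can be sketched in plain Python. A real implementation would run on the binarized, dilated image and use a library connected-components routine; the iterative flood fill here is a minimal stand-in.

```python
def largest_region_bbox(binary):
    """Find the largest 4-connected foreground region and its bounding box.

    `binary` is a list of rows of 0/1 values.  Returns (min_row, min_col,
    max_row, max_col).  The retina is assumed to be the largest connected
    region; smaller regions (e.g. border text matter) are ignored.
    """
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    best = []
    for r in range(h):
        for c in range(w):
            if binary[r][c] and not seen[r][c]:
                # Iterative flood fill collecting one connected region.
                stack, region = [(r, c)], []
                seen[r][c] = True
                while stack:
                    y, x = stack.pop()
                    region.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                if len(region) > len(best):
                    best = region
    rows = [p[0] for p in best]
    cols = [p[1] for p in best]
    return min(rows), min(cols), max(rows), max(cols)
```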
  • once the processor 102 identifies the retina in the reference fundus image, it further blurs the reference fundus image using a Gaussian filter.
  • the processor 102 compares an image width and an image height of the blurred reference fundus image based on Equation 1.
  • the processor 102 calculates a maximum pixel value of a left half, a maximum pixel value of a right half and a maximum background pixel value for the blurred reference fundus image when the image width and the image height of the blurred identified retina satisfies the Equation 1.
  • the maximum background pixel value (Max_background pixel value) is given by the below Equation 2.
  • the term 'max_pixel_left' in Equation 2 is the maximum pixel value of the left half of the blurred identified retina.
  • the term 'max_pixel_right' in Equation 2 is the maximum pixel value of the right half of the blurred reference fundus image.
  • Max_background_pixel_value = max(max_pixel_left, max_pixel_right) — Equation 2
  • the processor 102 further extracts foreground pixel values from the blurred reference fundus image by considering pixel values which satisfy the below Equation 3.
  • the processor 102 calculates a bounding box using the extracted foreground pixel values from the blurred reference fundus image.
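The background-level and foreground-extraction steps can be sketched as below. Two assumptions are flagged: the "left half" and "right half" are taken as narrow border strips (presumed background-only), and since Equation 3 is not reproduced in this excerpt, pixels strictly above the background level are treated as foreground.

```python
def foreground_bbox(image, border=1):
    """Estimate the retina bounding box in a blurred grayscale image.

    Per Equation 2, the background level is the larger of the maximum pixel
    values taken from the left and right sides of the image; treating those
    sides as `border`-column strips is an assumption.  Equation 3 is not
    given in this excerpt, so "pixel value > background level" is an assumed
    foreground test.  Returns (min_row, min_col, max_row, max_col) or None.
    """
    h, w = len(image), len(image[0])
    max_pixel_left = max(image[r][c] for r in range(h) for c in range(border))
    max_pixel_right = max(image[r][c] for r in range(h) for c in range(w - border, w))
    max_background = max(max_pixel_left, max_pixel_right)   # Equation 2
    coords = [(r, c) for r in range(h) for c in range(w)
              if image[r][c] > max_background]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return min(rows), min(cols), max(rows), max(cols)
```

The resulting bounding box would then be resized with cubic interpolation to the target shape, e.g. [256, 256, 3].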
  • the processor 102 processes the bounding box to obtain a resized image using cubic interpolation of shape, for example, [256, 256, 3].
  • the reference fundus image at this stage is referred to as the pre-processed reference fundus image.
  • the processor 102 stores the pre-processed reference fundus images in a pre-processed reference dataset.
  • the ground-truth file associated with the reference dataset holds good even for the pre-processed reference dataset.
  • the processor 102 stores the pre-processed reference dataset in the one or more storage devices 103.
  • the processor 102 splits the pre-processed reference dataset into two sets - a training set and a validation set.
  • the pre-processed reference fundus images in the training set are termed training fundus images and the pre-processed reference fundus images in the validation set are termed validation fundus images for simplicity.
  • the training set is used to train the convolutional network to assess the training fundus images based on the label associated with each of the training fundus image.
  • the validation set is typically used to test the accuracy of the convolutional network.
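The split described above might look like the following sketch; the 80/20 fraction and the fixed seed are assumptions, since the description does not specify a split ratio.

```python
import random

def split_reference_dataset(image_ids, validation_fraction=0.2, seed=0):
    """Shuffle the pre-processed reference dataset and split it into a
    training set and a validation set of image identifiers."""
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)
    n_val = int(len(ids) * validation_fraction)
    return ids[n_val:], ids[:n_val]   # (training set, validation set)
```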
  • the processor 102 augments the training fundus images in the training set.
  • the processor 102 performs the following steps for the augmentation of the training set.
  • the processor 102 randomly shuffles the training fundus images to divide the training set into a plurality of batches. Each batch is a collection of a predefined number of training fundus images.
  • the processor 102 randomly samples each batch of training fundus images.
  • the processor 102 processes each batch of the training fundus images using affine transformations.
  • the processor 102 translates and rotates the training fundus images in the batch randomly based on a coin flip analogy.
  • the processor 102 also adjusts the color and brightness of each of the training fundus images in the batch randomly based on the results of the coin flip analogy.
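The coin-flip augmentation of a batch can be sketched as below. The description uses general affine transformations plus random color and brightness changes; the 90-degree rotation, one-pixel shift and fixed 10% brightness step here are simplified stand-ins for illustration.

```python
import random

def augment(image, rng):
    """Apply coin-flip-gated transforms to a square image (list of pixel rows).

    Each transform is a simplified stand-in for the affine and color/brightness
    adjustments described, gated by an independent coin flip.
    """
    if rng.random() < 0.5:   # coin flip: rotate 90 degrees clockwise
        image = [list(row) for row in zip(*image[::-1])]
    if rng.random() < 0.5:   # coin flip: translate one pixel right, zero fill
        image = [[0] + row[:-1] for row in image]
    if rng.random() < 0.5:   # coin flip: brighten by 10%, clipped to 255
        image = [[min(255, int(p * 1.1)) for p in row] for row in image]
    return image
```

Batches would be formed by randomly shuffling the training fundus images and applying `augment` to each image with a fresh random state.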
  • the processor 102 trains the system 1000 using the batches of augmented training fundus images via the convolutional network.
  • the convolutional network is a class of deep artificial neural networks that can be applied to analyzing visual imagery.
  • the general arrangement of the convolutional network is as follows.
  • the convolutional network comprising 'n' convolutional stacks applies a convolution operation to the input and passes an intermediate result to a next layer.
  • Each convolutional stack comprises a plurality of convolutional layers.
  • a first convolution stack is configured to convolve pixels from an input with a plurality of filters to generate a first indicator map.
  • the first convolutional stack also comprises a first subsampling layer configured to reduce a size and variation of the first indicator map.
  • the first convolutional layer of the first convolutional stack is configured to convolve pixels from the input with a plurality of filters.
  • the first convolutional stack passes an intermediate result to the next layer.
  • each convolutional stack comprises a sub-sampling layer configured to reduce a size (width and height) of the indicator map.
  • the input is analyzed based on reference data to provide a corresponding output.
  • the processor 102 groups the validation fundus images of the validation set into a plurality of batches. Each batch comprises multiple validation fundus images.
  • the processor 102 validates each of the validation fundus images in each batch of the validation set using the convolutional network.
  • the processor 102 compares a result of the validation against a corresponding label of the validation fundus image by referring to the reference ground-truth file. The processor 102 thus evaluates a convolutional network performance of the convolutional network for the batch of validation set.
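Comparing validation results against the reference ground-truth file reduces to a per-batch accuracy computation, for example:

```python
def batch_accuracy(predictions, ground_truth):
    """Fraction of fundus images in a batch whose predicted label matches the
    label recorded in the reference ground-truth file.

    Both arguments map a reference fundus image identifier to a label.
    """
    correct = sum(1 for image_id, label in predictions.items()
                  if ground_truth.get(image_id) == label)
    return correct / len(predictions)
```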
  • the processor 102 optimizes the convolutional network parameters using an optimizer, for example, a Nadam optimizer which is an Adam optimizer with Nesterov Momentum.
  • the optimizer iteratively optimizes the parameters of the convolutional network during multiple iterations using the training set.
  • each iteration refers to a batch of the training set.
  • the processor 102 evaluates a convolutional network performance of the convolutional network after a predefined number of iterations on the validation set.
  • each iteration refers to a batch of the validation set.
  • the processor 102 trains the convolutional network based on the augmented training set and tests the convolutional network based on the validation set. Upon completion of training and validation of the convolutional network based on the convolutional network performance, the system 1000 is ready to assess the input fundus image based on the indicators present in the input fundus image.
  • the processor 102 of the system 1000 receives the input fundus image from one of the input devices.
  • the processor 102 pre-processes the input fundus image similar to that of the reference fundus image.
  • the processor 102 test-time augments the preprocessed input fundus image.
  • the processor 102 performs the following step to test-time augment the preprocessed input fundus image.
  • the processor 102 converts the preprocessed input fundus image into a plurality of test time images, for example, twenty test time images, using deterministic augmentation.
  • the processor 102 follows the same steps to augment the input fundus image as that of the training fundus image, except that the augmentations are deterministic.
  • the processor 102 generates twenty deterministically augmented test time images of the preprocessed input fundus image.
  • the processor 102 processes the twenty deterministically augmented test time images of the preprocessed input fundus image using the convolutional network comprising 'n' convolutional stacks. The predicted probabilities of the twenty test time images are averaged to get a final prediction result.
  • the final prediction result provides a probability value for each of the retinal disease and a corresponding retinal disease severity level associated with the input fundus image.
  • the probability value is an indication of a confidence that identified indicators are of a particular retinal disease and a corresponding severity of the retinal disease.
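Averaging the predicted probabilities over the test time images is an element-wise mean of the per-class probability vectors:

```python
def average_predictions(per_image_probs):
    """Element-wise mean of per-class probability vectors, one vector per
    augmented test time image (twenty of them in this description)."""
    n = len(per_image_probs)
    return [sum(p[i] for p in per_image_probs) / n
            for i in range(len(per_image_probs[0]))]
```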
  • the output indicates a presence or absence of a retinal disease and related severity of the retinal disease associated with the input fundus image.
  • Figure 2 exemplarily illustrates the convolutional network to compute the presence or absence of a retinal disease and related severity of the retinal disease associated with the input fundus image.
  • the twenty deterministically augmented test time images of the preprocessed input fundus image are the input to a first convolutional stack (CS1) of the convolutional network.
  • each of the twenty deterministically augmented test time images is processed by the convolutional network.
  • the deterministically augmented test time image is, for example, represented as a matrix of width 448 pixels and height 448 pixels with '3' channels. That is, the deterministically augmented test time image is a representative array of pixel values of 448 x 448 x 3.
  • the input to the first convolutional stack (CS1) is a color image of size 448 x 448.
  • the first convolution stack (CS1) comprises the following sublayers - a first convolutional layer, a first subsampling layer, a second convolutional layer, a third convolutional layer and a second subsampling layer in the same order.
  • the output of a sublayer is an input to a consecutive sublayer.
  • a subsampling layer is configured to reduce a size and variation of its input and a convolutional layer convolves its input with a plurality of filters, for example, filters of size 3x3.
  • the output of the first convolutional stack (CS1) is a reduced image represented as a matrix of width 112 pixels and height 112 pixels with nl channels. That is, the output of the first convolutional stack (CS1) is a representative array of pixel values 112 x 112 x nl.
  • the second convolutional stack (CS2) comprises the following sublayers - four convolutional layers and a subsampling layer arranged in the same order. Again, the output of a sublayer is an input to a consecutive sublayer.
  • the second convolutional stack (CS2) convolves the representative array of pixel values 112 x 112 x nl and reduces it to a representative array of pixel values of 56 x 56 x n2.
  • the representative array of pixel values of 56 x 56 x n2 is an input to a third convolutional stack (CS3).
  • the third convolutional stack (CS3) comprises the following sublayers - four convolutional layers and a subsampling layer arranged in the same order. Again, the output of a sublayer is an input to a consecutive sublayer.
  • the third convolutional stack (CS3) convolves the representative array of pixel values 56 x 56 x n2 and reduces it to a representative array of pixel values of 28 x 28 x n3.
  • the representative array of pixel values of 28 x 28 x n3 is an input to a fourth convolutional stack (CS4).
  • the fourth convolutional stack (CS4) comprises the following sublayers - four convolutional layers and a subsampling layer arranged in the same order.
  • the fourth convolutional stack (CS4) convolves the representative array of pixel values 28 x 28 x n3 and reduces it to a representative array of pixel values of 14 x 14 x n4.
  • the representative array of pixel values of 14 x 14 x n4 is an input to a fifth convolutional stack (CS5).
  • the fifth convolutional stack (CS5) comprises the following sublayers - four convolutional layers and a subsampling layer arranged in the same order. Again, the output of a sublayer is an input to a consecutive sublayer.
  • the fifth convolutional stack (CS5) convolves the representative array of pixel values 14 x 14 x n4 and reduces it to a representative array of pixel values of 7 x 7 x n5.
  • the representative array of pixel values of 7 x 7 x n5 is a first input to a concatenation block (C).
  • the output of the third convolutional stack (CS3) is an input to a first subsampling block (SSI).
  • the representative array of pixel values of 28 x 28 x n3 is the input to the first subsampling block (SSI).
  • the first subsampling block (SSI) reduces the input with a stride of 4 to obtain an output of a representative array of pixel values of 7 x 7 x n3. This is a second input to the concatenation block (C).
  • the output of the fourth convolutional stack (CS4) is an input to a second subsampling block (SS2).
  • the representative array of pixel values of 14 x 14 x n4 is the input to the second subsampling block (SS2).
  • the second subsampling block (SS2) reduces the input with a stride of 2 to obtain an output of a representative array of pixel values of 7 x 7 x n4. This is a third input to the concatenation block (C).
  • the concatenation block (C) receives the first input from the fifth convolutional stack (CS5), the second input from the first subsampling block (SSI) and the third input from the second subsampling block (SS2).
  • the concatenation block (C) concatenates the three inputs received to generate an output of value 7 x 7 x (n5 + n4 + n3).
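The spatial sizes traced through Figure 2 can be checked with a few lines of arithmetic; the channel counts n1..n5 are left out because the description keeps them symbolic.

```python
def network_shapes(input_hw=448):
    """Trace spatial sizes through the five convolutional stacks of Figure 2."""
    cs1 = input_hw // 4    # CS1 contains two subsampling layers: 448 -> 112
    cs2 = cs1 // 2         # CS2: 112 -> 56
    cs3 = cs2 // 2         # CS3: 56 -> 28
    cs4 = cs3 // 2         # CS4: 28 -> 14
    cs5 = cs4 // 2         # CS5: 14 -> 7
    ss1 = cs3 // 4         # SS1 subsamples the CS3 output with stride 4: 28 -> 7
    ss2 = cs4 // 2         # SS2 subsamples the CS4 output with stride 2: 14 -> 7
    # All three inputs to the concatenation block share the same 7 x 7 size,
    # so concatenating along channels yields 7 x 7 x (n5 + n4 + n3).
    assert cs5 == ss1 == ss2
    return [cs1, cs2, cs3, cs4, cs5]
```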
  • the output of the concatenation block (C) is an input to a probability block (P).
  • the probability block (P) provides a probability of the presence or absence of the retinal disease and related severity of the retinal disease.
  • the predicted probabilities of the twenty test time images are averaged to get a final prediction result.
  • the output of the convolutional network provides a probability value for each of the retinal disease and a corresponding retinal disease severity level associated with the input fundus image.
  • the probability block (P) as shown in the Figure 2 provides five values by considering the retinal disease to be DR.
  • the outputs of the probability block are five values depicting the probability for each DR severity level - DR0 (no DR), DR1 (mild DR level), DR2 (moderate DR level), DR3 (severe DR level) and DR4 (proliferative DR level).
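Turning the five probability values into a reported severity level is an argmax over the averaged vector; the label strings below follow the DR0..DR4 naming used here.

```python
DR_LABELS = ["DR0 (no DR)", "DR1 (mild)", "DR2 (moderate)",
             "DR3 (severe)", "DR4 (proliferative)"]

def most_likely_grade(probs):
    """Return the DR severity label with the highest averaged probability,
    together with that probability (the confidence of the prediction)."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return DR_LABELS[best], probs[best]
```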
  • the system 1000 also considers the image capture device characteristics of an image capture device as one of the parameters to assess the input fundus image.
  • the image capture device characteristics include a resolution, an illumination factor, a field of view or any combination thereof.
  • the image capture device is, for example, a fundus camera, a camera attached to a smartphone, etc., used to capture the input fundus image.
  • the image capture device is the input device used to capture the input fundus image.
  • the processor 102 of the system 1000 considers a manufacturer and version of the image capture device to determine a predefined score for the image capture device characteristics of the image capture device. This predefined score for the image capture device characteristics is used to assess the input fundus image.
  • the predefined score for the image capture device characteristics denotes a superiority of the image capture device characteristics.
  • the predefined score for the image capture device characteristics is a numeric value within the range of [0, 1].
  • 0 denotes the lowest value and
  • 1 denotes the highest value of the predefined score for the image capture device characteristics.
  • the predefined score for the image capture device characteristics for multiple manufacturers of image capture devices is initially stored in the one or more storage devices 103 by an operator of the system 1000. By considering the image capture device characteristics of an image capture device to assess the quality of the input fundus image, the flexibility of the system 1000 is increased, thereby providing customized results for input fundus images captured using image capture devices of multiple manufacturers.
  • the processor 102 considers the output of the convolutional network and the image capture device characteristics of the image capture device to determine the presence or absence of a retinal disease and/or related severity of the retinal disease associated with the input fundus image.
  • the processor 102 thereby considers the quality of the input fundus image along with the output of the convolutional network to determine the presence or absence of a retinal disease and/or related severity of the retinal disease associated with the input fundus image.
  • the processor 102 displays the presence or absence of a retinal disease and/or related severity of the retinal disease associated with the input fundus image via a display 104 to a user. For example, suitable suggestions with a set of instructions to the user may also be included and provided via a pop-up box displayed on a screen.
  • the system 1000 may also generate a report comprising the input fundus image, the type of the retinal disease and the severity of the retinal disease, which is communicated to the patient via electronic mail. The report could also be stored in the one or more storage devices 103 of the system 1000.
  • the processor 102 assesses a quality measure of each of the reference fundus images in the reference dataset.
  • the quality measure is also stored as a part of the label associated with the reference fundus image.
  • the processor 102 trains the convolutional network to learn the quality measure of the reference fundus image along with the identification of the indicators in the reference fundus image.
  • the processor 102 assesses the input fundus image based on the training.
  • the processor 102 identifies the quality measure of the input fundus image using the convolutional network.
  • the processor 102 may also refer to a user defined threshold to define the quality measure of the input fundus image.
  • the user defined threshold is user defined to increase a flexibility of the system 1000.
  • the user defined threshold is the variable factor which may be used to vary the quality measure of the input fundus image to conveniently suit the requirements of the user, for example, medical practitioner.
  • the user defined threshold may be varied to vary the quality measure of the input fundus image based on the doctor's grading experience.
  • the system 1000 may further display 104 a message to an operator to retake another fundus image of the patient when the quality measure of the input fundus image is below a threshold.
  • the system 1000 may further consider characteristics of a device used to capture the input fundus image of the patient as an additional parameter to assess the quality measure of the input fundus image.
  • the device is a fundus camera and the characteristics of the device is a resolution level of the fundus camera.
  • the system 1000 detects the presence of several diseases, for example, diabetes, stroke, hypertension, cardiovascular diseases, etc., not limited to retinal diseases, based on changes in the retinal features.
  • the processor 102 trains the convolutional network to identify and classify the severity of these diseases using the fundus image of the patient.
  • FIG. 3 illustrates a flowchart for determination of the presence or absence of the retinal disease and related severity of the retinal disease associated with the input fundus image in accordance with the invention.
  • the processor 102 receives the fundus image of the patient.
  • the fundus image is the input fundus image.
  • the input fundus image is a two- dimensional array of digital image data.
  • the processor 102 identifies multiple indicators throughout the fundus image using the convolutional network.
  • the indicator is one of an abnormality, an optic disc, a macula, a blood vessel or the like.
  • the abnormality is one of a lesion like a venous beading, an intra retinal microvascular abnormality, an intra retinal hemorrhage, a micro aneurysm, a soft exudate (cotton-wool spots), a hard exudate, a vitreous/preretinal hemorrhage, neovascularization, a drusen or the like.
  • the processor 102 detects the presence or absence of the retinal disease based on the identified indicators using the convolutional network.
  • the retinal disease is one of diabetic retinopathy, diabetic macular edema, glaucoma, coloboma, retinal tear, retinal detachment or the like.
  • the processor 102 classifies the severity of the retinal disease based on the presence or absence of the retinal disease using the convolutional network.
  • the severity of the retinal disease may be classified into several levels depending upon the severity.
  • the general concepts of the current invention are not limited to a particular number of severity levels. In an embodiment, one severity level could be used which satisfies only the detection of the retina disease. In another embodiment, multiple severity levels could be used to classify the retinal disease. In another embodiment, multiple retinal diseases could be detected based on the identified indicators.
  • the system 1000 classifies each of the detected retinal diseases based on the severity.
  • the system 1000, using the convolutional network, emphasizes classifying the entire fundus image as a whole. This improves efficiency and reduces errors in identifying various medical conditions.
  • the system 1000 acts as an important tool in the detection and monitoring of the progression of a retinal disease and/or a response to a therapy.
  • the system 1000 trains the convolutional network to detect all indicative indicators related to multiple retinal diseases.
  • the system 1000 accurately detects indicators throughout the input fundus image which are indicative of disease conditions to properly distinguish indicators of a healthy fundus from indicators which define retinal diseases.
  • the system 1000 may also be used to detect certain conditions such as a laser treated fundus.
  • the system 1000 may be a part of a web cloud with the input fundus image and the report uploaded to the web cloud.
  • the system 1000, involving a computer-based process of supervised learning using the convolutional network as described, can thus be effectively used to screen fundus images.
  • the system 1000 identifies indicators which are further processed to automatically provide indications of relevant retinal disease, in particular indications of DR.
  • the system 1000 increases efficiency by the utilization of the well trained convolutional network for detecting and classifying the retinal diseases thus providing cost-effective early screening and treatment to the patient.
  • the system 1000 reduces the time-consumption involved in a manual process requiring a trained medical practitioner to evaluate digital fundus photographs of the retina.
  • the system 1000 using the convolutional network effectively improves the quality of analysis of the fundus image by detecting indicators of minute size which are often difficult to detect in the manual process of evaluating the fundus image.
  • the present invention described above may be configured to work in a network environment comprising a computer in communication with one or more devices.
  • the present invention may be implemented by computer programmable instructions stored on one or more computer readable media and executed by a processor 102 of the computer.
  • the computer comprises the processor 102, a memory unit, an input/output (I/O) controller, and a display 104 communicating via a data bus.
  • the computer may comprise multiple processors 102 to increase a computing capability of the computer.
  • the processor 102 is an electronic circuit which executes computer programs.
  • the memory unit for example, comprises a read only memory (ROM) and a random access memory (RAM).
  • the memory unit stores the instructions for execution by the processor 102.
  • the one or more storage devices 103 is the memory unit.
  • the memory unit stores the reference dataset and the reference ground-truth file.
  • the memory unit may also store intermediate, static and temporary information required by the processor 102 during the execution of the instructions.
  • the computer comprises one or more input devices, for example, a keyboard such as an alphanumeric keyboard, a mouse, a joystick, etc.
  • the I/O controller controls the input and output actions performed by a user.
  • the data bus allows communication between modules of the computer.
  • the computer directly or indirectly communicates with the devices via an interface, for example, a local area network (LAN), a wide area network (WAN) or the Ethernet, the Internet, a token ring, or the like.
  • each of the devices is adapted to communicate with the computer and may comprise computers with, for example, Sun® processors, IBM® processors, Intel® processors, AMD® processors, etc.
  • the computer readable media comprises, for example, CDs, DVDs, floppy disks, optical disks, magnetic-optical disks, ROMs, RAMs, EEPROMs, magnetic cards, application specific integrated circuits (ASICs), or the like.
  • Each of the computer readable media is coupled to the data bus.
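The decision logic described in the points above can be sketched as follows. The softmax mapping from raw network outputs to the five DR probabilities, the particular device score value and the threshold rule for combining the two signals are assumptions for illustration, since the document does not fix an exact formula.

```python
import numpy as np

def softmax(logits):
    """Map raw network outputs to probabilities that sum to 1."""
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

# Hypothetical raw outputs of the probability block for the five DR levels:
# DR0 (no DR), DR1 (mild), DR2 (moderate), DR3 (severe), DR4 (proliferative).
logits = np.array([2.0, 0.5, 0.1, -1.0, -2.0])
probs = softmax(logits)

# Hypothetical predefined score for the image capture device characteristics,
# a numeric value in [0, 1] where 0 is the lowest and 1 the highest.
device_score = 0.9
QUALITY_THRESHOLD = 0.5   # assumed threshold, not specified in the document

# One possible combination rule: trust the network's prediction only when
# the device score clears the quality threshold.
if device_score >= QUALITY_THRESHOLD:
    severity = int(np.argmax(probs))   # predicted DR severity level, 0 to 4
else:
    severity = None                    # flag the image for recapture
```

With the illustrative values above, the highest probability falls on the first class, so the image would be reported as showing no DR.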

Abstract

A system (1000) is disclosed. The system (1000) comprises at least one processor (102); and one or more storage devices (103) configured to store software instructions configured for execution by the at least one processor (102) in order to cause the system (1000) to: receive a fundus image of a patient; identify a plurality of indicators throughout the fundus image using a convolutional network; detect a presence or absence of a retinal disease based on the identified indicators using the convolutional network; and classify a severity of the retinal disease based on the presence or absence of the retinal disease using the convolutional network.

Description

Title of the invention
[0001] A system and method for detection and classification of retinal disease
Technical field of the invention
[0002] The invention relates to the field of medical decision support. More particularly, the invention relates to detection of a presence and/or classification of a retinal disease using a fundus image of a patient based on machine learning applications.
Background of the invention
[0003] Vision is an important survival attribute for a human, making the eyes one of the most vital sensory organs. Though most eye diseases may not be fatal, failure to properly diagnose and treat an eye disease may lead to vision loss. Early detection of eye diseases through regular screening may prevent visual loss and blindness amongst patients. Analysis of fundus images of a patient is a very convenient way of screening and monitoring eye diseases. The fundus image of the patient illustrates several elements such as the optic disc, blood vessels, macula, etc. The fundus of the eye provides indications of several diseases, in particular eye diseases like diabetic retinopathy.
[0004] Currently, diabetic retinopathy is one of the primary causes of vision loss in the working-age population. Long-term complications of diabetes include diabetic retinopathy. As the number of patients with diabetes continues to increase, the groundwork required to prevent visual loss due to diabetic retinopathy will become even more deficient. The expertise required is often lacking in areas where the rate of diabetes in the population is high and detection of diabetic retinopathy is most needed. Micro-aneurysms are an important feature used for detecting diabetic retinopathy in the fundus image of the patient. Small areas of swelling caused by vascular changes in the retina's blood vessels are known as micro-aneurysms. Micro-aneurysms may sooner or later cause plasma leakage resulting in thickening of the retina, which is known as edema. Thickening of the retina in the macular region may result in vision loss. Proper distinction of features in the fundus image is critical, as wrong predictions may lead to wrong treatments causing difficulties for the patient.
[0005] In recent times, computer-aided screening systems assist doctors in improving the quality of examination of fundus images for screening of eye diseases. Machine learning (ML) algorithms are applied to data to extract and evaluate information. Systems apply ML algorithms to ensure a faster mode of efficient identification and classification of eye diseases using fundus images, which enhances screening of eye diseases. However, the systems currently available for identification and classification of eye diseases using fundus images involving machine learning algorithms are complex and of high cost. This limits the reach of medical eye screening and diagnosis for the common man.
[0006] A simple, comprehensive and cost-effective solution, involving effective use of ML algorithms that enables systems to uncover concealed insights for automated, effective identification and classification of eye diseases using fundus images, is thus essential.
Summary of invention
[0007] The present invention discloses a system. The system comprises at least one processor; and one or more storage devices configured to store software instructions configured for execution by the at least one processor in order to cause the system to: receive a fundus image of a patient; identify a plurality of indicators throughout the fundus image using a convolutional network; detect a presence or absence of a retinal disease based on the identified indicators using the convolutional network; and classify a severity of the retinal disease based on the presence or absence of the retinal disease using the convolutional network.
Brief description of the drawings
[0008] The present invention is described with reference to the accompanying figures. The accompanying figures, which are incorporated herein, are given by way of illustration only and form part of the specification together with the description to explain how to make and use the invention, in which,
[0009] Figure 1 illustrates a block diagram of a system in accordance with the invention;
[00010] Figure 2 exemplarily illustrates a convolutional network to compute a presence or absence of a retinal disease and related severity of the retinal disease associated with an input fundus image; and
[00011] Figure 3 illustrates a flowchart for determination of the presence or absence of the retinal disease and related severity of the retinal disease associated with the input fundus image in accordance with the invention.
Detailed description of the invention
[00012] Figure 1 illustrates a block diagram of a system 1000 in accordance with the invention. The system 1000 comprises at least one processor 102; and one or more storage devices 103 configured to store software instructions configured for execution by the at least one processor 102 in order to cause the system 1000 to: receive a fundus image of a patient; identify a plurality of indicators throughout the fundus image using a convolutional network; detect a presence or absence of a retinal disease based on the identified indicators using the convolutional network; and classify a severity of the retinal disease based on the presence or absence of the retinal disease using the convolutional network.
[00013] The fundus image, herein, refers to a two-dimensional array of digital image data; however, this is merely illustrative and not limiting of the scope of the invention. The one or more storage devices 103 are, for example, a database to store a structured collection of data. In an embodiment, the one or more storage devices 103 may be an internal part of the system 1000. In another embodiment, the one or more storage devices 103 may be remotely located and accessed via a network. The one or more storage devices 103 may be, for example, removable and/or non-removable data storage such as a tape, a magnetic disk, an optical disk, a flash memory card, etc. The one or more storage devices 103 may comprise, for example, random access memory (RAM), read only memory (ROM), electrically erasable programmable read-only memory (EEPROM), a digital versatile disk (DVD), a compact disk (CD), a flash memory, a magnetic tape, a magnetic disk storage, or any combination thereof that can be used to store and access information and is a part of the system 1000.
[00014] The indicator is one of an abnormality, a retinal feature or the like. The retinal feature is an optic disc, a macula, a blood vessel or the like. The abnormality is one of a lesion like a venous beading, a venous loop, an intra-retinal microvascular abnormality, an intra-retinal hemorrhage, a micro-aneurysm, a soft exudate (cotton-wool spots), a hard exudate, a vitreous/preretinal hemorrhage, neovascularization, a drusen or the like. The retinal disease is one of diabetic retinopathy, diabetic macular edema, glaucoma, coloboma, retinal tear, retinal detachment or the like. The severity of the retinal disease is represented as levels of increasing seriousness of the retinal disease.
[00015] The processor 102 of the system 1000 receives a reference dataset from one or more input devices. The reference dataset comprises a plurality of fundus images. Hereafter, the fundus images in the reference dataset are referred to as reference fundus images. The input device is, for example, a camera incorporated into a mobile device such as a smartphone, a server, a network of personal computers, or simply a personal computer, a mainframe, a tablet computer, etc. The system 1000 stores the reference dataset in the one or more storage devices 103 of the system 1000.
[00016] The reference fundus image is a two-dimensional array of digital image data used for the purpose of training the system 1000. In this invention, the term 'training' generally refers to a process of developing the system 1000 for the detection and classification of the retinal disease based on the reference dataset and a reference ground-truth file. The reference ground-truth file comprises a label and a reference fundus image identifier for each of the reference fundus images. The label provides information about the reference fundus image such as a presence or absence of a retinal disease, the type of retinal disease and the corresponding severity of the retinal disease identified in the reference fundus image. The reference fundus image identifier of a reference fundus image is, for example, a name or identity assigned to the reference fundus image.
[00017] In an embodiment, an annotator annotates each of the reference fundus images using an annotation platform 101 of the system 1000. The annotation platform 101 is a graphical user interface (GUI) provided for the annotator to interact with the system 1000. The annotator accesses the reference fundus images via the annotation platform 101. The annotator creates a label with information about a presence or absence of a retinal disease, the type of retinal disease and the corresponding severity of the retinal disease based on the annotation. The annotator is usually a trained/certified specialist in accurately annotating the fundus image by analyzing the indicators present in the reference fundus image. In an example, consider that the annotator annotates the reference fundus images for the retinal disease - diabetic retinopathy (DR). The annotator may consider one or more standard DR grading schemes such as the American ophthalmology DR grading scheme, the Scottish DR grading scheme, the UK DR grading scheme, etc., to annotate the reference fundus images.
The annotator may assign a DR severity grade 0 (representing no DR), grade 1 (representing mild DR), grade 2 (representing moderate DR), grade 3 (representing severe DR) or grade 4 (representing proliferative DR) to each of the reference fundus image. The label of the reference fundus image represents the DR severity level associated with the patient.
[00018] For example, the annotator labels each of the reference fundus images as one of five severity classes - 'No DR', 'DR1', 'DR2', 'DR3' and 'DR4' - based on an increasing seriousness of DR. Here, 'No DR', 'DR1', 'DR2', 'DR3' and 'DR4' represent the labels indicating different levels of increasing severity of DR associated with the patient. The annotator analyses the indicators in the retinal fundus image and accordingly marks the label. If the annotator detects a microaneurysm, then the annotator considers it as a mild level of DR and marks the label as DR1 for the reference fundus image. Similarly, if the annotator detects one or more of the following - a hard exudate, a soft exudate, a hemorrhage, a venous loop, a venous beading, etc. - then the annotator marks the label as DR2 for the reference fundus image. The label DR2 indicates a moderate level of DR. The annotator marks the label as DR3 for the reference fundus image with a severe level of DR upon detection of multiple hemorrhages, hard or soft exudates, etc., and DR4 for the reference fundus image with a proliferative level of DR upon detection of vitreous hemorrhage, neovascularization, etc. The reference fundus image with no traces of DR is marked with the label 'No DR' by the annotator.
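The annotation rules in the example above can be written down as a small decision procedure. This is a simplified sketch that checks the most severe findings first; the indicator names are taken from the paragraph, while the severe-DR criterion (more than one moderate-level finding) is an assumption, as the text only says 'multiple hemorrhages, hard or soft exudates, etc.'.

```python
def grade_dr(indicators):
    """Map a set of detected indicator names to a DR severity label,
    checking the most severe findings first (simplified sketch)."""
    proliferative = {"vitreous hemorrhage", "neovascularization"}
    moderate = {"hard exudate", "soft exudate", "hemorrhage",
                "venous loop", "venous beading"}
    if indicators & proliferative:
        return "DR4"        # proliferative DR
    if len(indicators & moderate) > 1:
        return "DR3"        # severe DR (assumed criterion: several findings)
    if indicators & moderate:
        return "DR2"        # moderate DR
    if "microaneurysm" in indicators:
        return "DR1"        # mild DR
    return "No DR"          # no traces of DR
```

For instance, an image in which only a microaneurysm was detected would be labelled DR1 under this sketch, while one with both hard and soft exudates would be labelled DR3.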
[00019] The annotator stores the label and the reference fundus image identifier for each of reference fundus image in the reference ground-truth file located in the one or more storage devices 103. The label provides information about the type of retinal disease and the corresponding severity of the retinal disease as annotated by the annotator. The reference fundus image identifier of a reference fundus image is, for example, a name or identity assigned to the reference fundus image.
[00020] In another embodiment, the processor 102 identifies the indicators throughout each of the reference fundus images to detect the presence or absence of the retinal disease using image analysis techniques. The processor 102 classifies the severity of the retinal disease based on the presence of the retinal disease using a set of predetermined rules. The predetermined rules comprise considering a type of each of the indicators, a count of each of the indicators, a region of occurrence of each of the indicators, a contrast level of each of the indicators, a size of each of the indicators or any combination thereof to recognize the retinal disease and the severity of the retinal disease. The processor 102 classifies each of the detected retinal diseases according to a corresponding severity grading and generates the label. The processor 102 communicates with the one or more storage devices 103 to store the label and the reference fundus image identifier for each of the reference fundus images in the reference ground-truth file.
[00021] The processor 102 utilizes the reference dataset to train the convolutional network for subsequent detection and classification of the retinal disease in the fundus image. Hereafter, the fundus image which is subsequently analyzed by the processor 102 is referred to as an input fundus image for clarity.
[00022] The processor 102 pre-processes each of the reference fundus images. For each of the reference fundus images, the processor 102 executes the following steps as part of the pre-processing. The processor 102 separates any text matter present at the border of the reference fundus image. The processor 102 adds a border to the reference fundus image with border pixel values as zero. The processor 102 increases the size of the reference fundus image by a predefined number of pixels, for example, 20 pixels in width and height. The additional pixels added are of a zero value. The processor 102 next converts the reference fundus image from an RGB color image to a grayscale image. The processor 102 then binarizes the reference fundus image using histogram analysis. The processor 102 applies repetitive morphological dilation with a rectangular element of size [5, 5] to smooth the binarized reference fundus image. The processor 102 acquires all connected regions, such as the retina and text matter, of the smoothed reference fundus image to separate text matter present in the reference fundus image from a foreground image. The processor 102 determines the largest region among the acquired connected regions as the retina. The retina is assumed to be the connected element with the largest region. The processor 102 calculates a corresponding bounding box for the retina. The processor 102 thus identifies the retina in the reference fundus image.
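The retina-identification steps above can be sketched as follows. This is a simplified stand-in, assuming a fixed binarization threshold in place of the histogram analysis and omitting the morphological dilation; it locates the largest connected foreground region, assumed to be the retina, and returns its bounding box.

```python
import numpy as np
from collections import deque

def largest_region_bbox(gray, threshold=10):
    """Binarize a grayscale image and return the bounding box
    (top, left, bottom, right) of the largest 4-connected foreground
    region, which is assumed to be the retina."""
    binary = gray > threshold
    seen = np.zeros(binary.shape, dtype=bool)
    h, w = binary.shape
    best_size, best_box = 0, None
    for sy in range(h):
        for sx in range(w):
            if binary[sy, sx] and not seen[sy, sx]:
                # Flood-fill one connected region, tracking size and extent.
                queue = deque([(sy, sx)])
                seen[sy, sx] = True
                size = 0
                top, left, bottom, right = sy, sx, sy, sx
                while queue:
                    y, x = queue.popleft()
                    size += 1
                    top, left = min(top, y), min(left, x)
                    bottom, right = max(bottom, y), max(right, x)
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w \
                                and binary[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            queue.append((ny, nx))
                if size > best_size:
                    best_size, best_box = size, (top, left, bottom, right)
    return best_box
```

Smaller connected regions, such as leftover text matter, lose to the retina because only the largest region's bounding box is kept.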
[00023] Once the processor 102 identifies the retina in the reference fundus image, the processor 102 further blurs the reference fundus image using a Gaussian filter. The processor 102 compares an image width and an image height of the blurred reference fundus image based on Equation 1.
Image width > 1.2 × (image height) — Equation 1
[00024] The processor 102 calculates a maximum pixel value of a left half, a maximum pixel value of a right half and a maximum background pixel value for the blurred reference fundus image when the image width and the image height of the blurred reference fundus image satisfy Equation 1. The maximum background pixel value (max_background_pixel_value) is given by Equation 2 below. The term 'max_pixel_left' in Equation 2 is the maximum pixel value of the left half of the blurred reference fundus image. The term 'max_pixel_right' in Equation 2 is the maximum pixel value of the right half of the blurred reference fundus image.
max_background_pixel_value = max(max_pixel_left, max_pixel_right) — Equation 2
[00025] The processor 102 further extracts foreground pixel values from the blurred reference fundus image by considering pixel values which satisfy the below Equation 3.
All pixel values > max_background_pixel_value + 10 — Equation 3
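Equations 1 to 3 can be exercised directly on a grayscale image array. One reading of the text is assumed in this sketch: the 'left half' and 'right half' maxima are taken over narrow border strips outside the retina, so that retinal pixels can actually exceed the background level in Equation 3; the strip width is a hypothetical parameter.

```python
import numpy as np

def foreground_bbox(blurred, strip=10):
    """Apply Equations 1 to 3 to a blurred grayscale image (2-D array) and
    return the bounding box (top, left, bottom, right) of the foreground.
    The 'left half'/'right half' maxima are taken over border strips of
    width `strip`, an assumption made for this sketch."""
    h, w = blurred.shape
    if not (w > 1.2 * h):                          # Equation 1 gate
        return None
    max_pixel_left = blurred[:, :strip].max()
    max_pixel_right = blurred[:, -strip:].max()
    # Equation 2: the background level is the brighter of the two maxima.
    max_background = max(max_pixel_left, max_pixel_right)
    # Equation 3: foreground pixels exceed the background level by 10.
    ys, xs = np.nonzero(blurred > max_background + 10)
    if ys.size == 0:
        return None
    return (int(ys.min()), int(xs.min()), int(ys.max()), int(xs.max()))
```

The returned coordinates correspond to the bounding box that the processor subsequently resizes with cubic interpolation.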
[00026] The processor 102 calculates a bounding box using the extracted foreground pixel values from the blurred reference fundus image. The processor 102 processes the bounding box to obtain a resized image using cubic interpolation of shape, for example, [256, 256, 3]. The reference fundus image at this stage is referred to as the pre-processed reference fundus image. The processor 102 stores the pre-processed reference fundus images in a pre-processed reference dataset. The ground-truth file associated with the reference dataset holds good even for the pre-processed reference dataset. The processor 102 stores the pre-processed reference dataset in the one or more storage devices 103.
[00027] The processor 102 splits the pre-processed reference dataset into two sets - a training set and a validation set. Hereafter, the pre-processed reference fundus images in the training set are termed training fundus images and the pre-processed reference fundus images in the validation set are termed validation fundus images for simplicity. The training set is used to train the convolutional network to assess the training fundus images based on the label associated with each of the training fundus images. The validation set is typically used to test the accuracy of the convolutional network.
[00028] The processor 102 augments the training fundus images in the training set. The processor 102 performs the following steps for the augmentation of the training set. The processor 102 randomly shuffles the training fundus images to divide the training set into a plurality of batches. Each batch is a collection of a predefined number of training fundus images. The processor 102 randomly samples each batch of training fundus images. The processor 102 processes each batch of the training fundus images using affine transformations. The processor 102 translates and rotates the training fundus images in the batch randomly based on a coin flip analogy. The processor 102 also adjusts the color and brightness of each of the training fundus images in the batch randomly based on the results of the coin flip analogy.
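A minimal sketch of this coin-flip augmentation is shown below. The 90-degree rotations, horizontal flips and brightness scaling are simplified stand-ins for the affine transformations and color adjustments named above, not the document's exact transforms.

```python
import numpy as np

def augment_batch(batch, rng):
    """Randomly augment a batch of images (each H x W x C, values in [0, 1])
    using coin flips to decide each transform. The 90-degree rotations,
    flips and brightness scaling stand in for the affine and color
    adjustments described in the text."""
    out = []
    for img in batch:
        if rng.random() < 0.5:                        # coin flip: rotate
            img = np.rot90(img, k=int(rng.integers(1, 4)), axes=(0, 1))
        if rng.random() < 0.5:                        # coin flip: mirror
            img = img[:, ::-1, :]
        if rng.random() < 0.5:                        # coin flip: brightness
            img = np.clip(img * rng.uniform(0.8, 1.2), 0.0, 1.0)
        out.append(img)
    return out

rng = np.random.default_rng(0)
batch = [np.full((256, 256, 3), 0.5) for _ in range(8)]  # one dummy batch
augmented = augment_batch(batch, rng)
```

Each image in the batch receives an independent combination of transforms, so repeated passes over the same training set yield different augmented batches.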
[00029] The processor 102 trains the system 1000 using the batches of augmented training fundus images via the convolutional network. In general, the convolutional network is a class of deep artificial neural networks that can be applied to analyzing visual imagery. The general arrangement of the convolutional network is as follows. The convolutional network, comprising 'n' convolutional stacks, applies a convolution operation to the input and passes an intermediate result to the next layer. Each convolutional stack comprises a plurality of convolutional layers. The first convolutional layer of the first convolutional stack is configured to convolve pixels from the input with a plurality of filters to generate a first indicator map. The first convolutional stack also comprises a first subsampling layer configured to reduce a size and variation of the first indicator map. The first convolutional stack passes an intermediate result to the next layer. Similarly, each convolutional stack comprises a subsampling layer configured to reduce a size (width and height) of the indicator map. The input is analyzed based on reference data to provide a corresponding output.
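The way the spatial size of the indicator maps shrinks through the stacks can be traced with a short helper. The 224 x 224 input resolution and the assumption that each stack halves width and height exactly once are illustrative choices, made because five halvings of 224 yield the 7 x 7 concatenation output mentioned earlier.

```python
def stack_output_sizes(input_size, n_stacks):
    """Trace the spatial size of the indicator maps through n convolutional
    stacks, each ending in a subsampling layer that halves width and height.
    The convolutions are assumed to be padded so that only the subsampling
    changes the spatial size."""
    sizes = [input_size]
    for _ in range(n_stacks):
        sizes.append(sizes[-1] // 2)
    return sizes

# Five stacks on an assumed 224 x 224 input: 224 -> 112 -> 56 -> 28 -> 14 -> 7,
# consistent with the 7 x 7 x (n5 + n4 + n3) concatenation output.
sizes = stack_output_sizes(224, 5)
```

Only the subsampling layers change the spatial size in this model; the channel counts n3, n4 and n5 come from the filters in each stack.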
[00030] Similar to the training set, the processor 102 groups the validation fundus images of the validation set into a plurality of batches. Each batch comprises multiple validation fundus images. The processor 102 validates each of the validation fundus images in each batch of the validation set using the convolutional network. The processor 102 compares a result of the validation against a corresponding label of the validation fundus image by referring to the reference ground-truth file. The processor 102 thus evaluates a convolutional network performance of the convolutional network for the batch of validation set.
[00031] The processor 102 optimizes the convolutional network parameters using an optimizer, for example, a Nadam optimizer which is an Adam optimizer with Nesterov Momentum. The optimizer iteratively optimizes the parameters of the convolutional network during multiple iterations using the training set. Here, each iteration refers to a batch of the training set. The processor 102 evaluates a convolutional network performance of the convolutional network after a predefined number of iterations on the validation set. Here, each iteration refers to a batch of the validation set.
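A minimal sketch of a single Nadam update (Adam moments with a Nesterov-style look-ahead on the first moment) is shown below, applied to a toy quadratic. The learning rate and decay constants are illustrative defaults, not values taken from the document.

```python
import numpy as np

def nadam_step(theta, grad, m, v, t, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    """One Nadam update: Adam's bias-corrected moments plus a
    Nesterov-style look-ahead on the first moment."""
    m = b1 * m + (1 - b1) * grad          # first moment (momentum)
    v = b2 * v + (1 - b2) * grad ** 2     # second moment (scale)
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    # Nesterov look-ahead: blend the corrected moment with the raw gradient.
    m_bar = b1 * m_hat + (1 - b1) * grad / (1 - b1 ** t)
    return theta - lr * m_bar / (np.sqrt(v_hat) + eps), m, v

# Toy example: minimize f(x) = x^2, whose gradient is 2x.
x, m, v = 5.0, 0.0, 0.0
for t in range(1, 2001):
    x, m, v = nadam_step(x, 2.0 * x, m, v, t)
# After 2000 iterations x has moved close to the minimum at 0.
```

In the system, the same kind of update would be applied to every network parameter, with the gradient coming from one batch of the training set per iteration.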
[00032] Thus, the processor 102 trains the convolutional network based on the augmented training set and tests the convolutional network based on the validation set. Upon completion of training and validation of the convolutional network based on the convolutional network performance, the system 1000 is ready to assess the input fundus image based on the indicators present in the input fundus image.

[00033] The processor 102 of the system 1000 receives the input fundus image from one of the input devices. The processor 102 pre-processes the input fundus image similar to the reference fundus image. The processor 102 then test-time augments the preprocessed input fundus image as follows. The processor 102 converts the preprocessed input fundus image into a plurality of test time images, for example, twenty test time images, using deterministic augmentation. The processor 102 follows the same steps to augment the input fundus image as for the training fundus image, except that the augmentations are deterministic. Thus, the processor 102 generates twenty deterministically augmented test time images of the preprocessed input fundus image. The processor 102 processes the twenty deterministically augmented test time images using the convolutional network comprising 'n' convolutional stacks. The predicted probabilities of the twenty test time images are averaged to get a final prediction result. The final prediction result provides a probability value for each retinal disease and a corresponding retinal disease severity level associated with the input fundus image. The probability value is an indication of a confidence that the identified indicators are of a particular retinal disease and a corresponding severity of the retinal disease.
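The deterministic test-time augmentation and averaging can be sketched as follows. The particular transforms (rotations and a horizontal flip, giving eight variants) are illustrative assumptions — the disclosure produces twenty variants using the same augmentations as training, applied deterministically.

```python
import numpy as np

def deterministic_augmentations(image):
    """Deterministic test-time variants: 4 rotations of the image and of
    its mirror (8 variants here; the described system generates 20)."""
    variants = []
    for base in (image, np.fliplr(image)):
        for k in range(4):
            variants.append(np.rot90(base, k))
    return variants

def tta_predict(predict_probs, image):
    """Average the predicted class probabilities over all variants."""
    probs = np.stack([predict_probs(v)
                      for v in deterministic_augmentations(image)])
    return probs.mean(axis=0)  # final prediction result

# stand-in for the trained convolutional network (fixed output for illustration)
fake_net = lambda img: np.array([0.7, 0.1, 0.1, 0.05, 0.05])
final = tta_predict(fake_net, np.zeros((448, 448, 3)))
```

Because the augmentations are deterministic, the same input fundus image always yields the same set of variants and hence the same final prediction.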
The output indicates a presence or absence of a retinal disease and related severity of the retinal disease associated with the input fundus image.
[00034] Figure 2 exemplarily illustrates the convolutional network that computes the presence or absence of a retinal disease and the related severity of the retinal disease associated with the input fundus image. The twenty deterministically augmented test time images of the preprocessed input fundus image are the input to a first convolutional stack (CS1) of the convolutional network. Each of the twenty deterministically augmented test time images is processed by the convolutional network.

[00035] A deterministically augmented test time image is, for example, represented as a matrix of width 448 pixels and height 448 pixels with 3 channels. That is, the deterministically augmented test time image is a representative array of pixel values of 448 x 448 x 3. The input to the first convolutional stack (CS1) is thus a color image of size 448 x 448. The first convolutional stack (CS1) comprises the following sublayers - a first convolutional layer, a first subsampling layer, a second convolutional layer, a third convolutional layer and a second subsampling layer, in the same order. The output of a sublayer is an input to the consecutive sublayer. In general, a subsampling layer is configured to reduce a size and variation of its input, and a convolutional layer convolves its input with a plurality of filters, for example, filters of size 3x3. The output of the first convolutional stack (CS1) is a reduced image represented as a matrix of width 112 pixels and height 112 pixels with n1 channels. That is, the output of the first convolutional stack (CS1) is a representative array of pixel values 112 x 112 x n1.
[00036] This is the input to a second convolutional stack (CS2). The second convolutional stack (CS2) comprises the following sublayers - four convolutional layers and a subsampling layer arranged in the same order. Again, the output of a sublayer is an input to a consecutive sublayer. The second convolutional stack (CS2) convolves the representative array of pixel values 112 x 112 x n1 and reduces it to a representative array of pixel values of 56 x 56 x n2. The representative array of pixel values of 56 x 56 x n2 is an input to a third convolutional stack (CS3).
[00037] The third convolutional stack (CS3) comprises the following sublayers - four convolutional layers and a subsampling layer arranged in the same order. Again, the output of a sublayer is an input to a consecutive sublayer. The third convolutional stack (CS3) convolves the representative array of pixel values 56 x 56 x n2 and reduces it to a representative array of pixel values of 28 x 28 x n3. The representative array of pixel values of 28 x 28 x n3 is an input to a fourth convolutional stack (CS4).

[00038] The fourth convolutional stack (CS4) comprises the following sublayers - four convolutional layers and a subsampling layer arranged in the same order. Again, the output of a sublayer is an input to a consecutive sublayer. The fourth convolutional stack (CS4) convolves the representative array of pixel values 28 x 28 x n3 and reduces it to a representative array of pixel values of 14 x 14 x n4. The representative array of pixel values of 14 x 14 x n4 is an input to a fifth convolutional stack (CS5).
[00039] The fifth convolutional stack (CS5) comprises the following sublayers - four convolutional layers and a subsampling layer arranged in the same order. Again, the output of a sublayer is an input to a consecutive sublayer. The fifth convolutional stack (CS5) convolves the representative array of pixel values 14 x 14 x n4 and reduces it to a representative array of pixel values of 7 x 7 x n5. The representative array of pixel values of 7 x 7 x n5 is a first input to a concatenation block (C).
[00040] The output of the third convolutional stack (CS3) is an input to a first subsampling block (SS1). The representative array of pixel values of 28 x 28 x n3 is the input to the first subsampling block (SS1). The first subsampling block (SS1) reduces the input with a stride of 4 to obtain an output of a representative array of pixel values of 7 x 7 x n3. This is a second input to the concatenation block (C).
[00041] The output of the fourth convolutional stack (CS4) is an input to a second subsampling block (SS2). The representative array of pixel values of 14 x 14 x n4 is the input to the second subsampling block (SS2). The second subsampling block (SS2) reduces the input with a stride of 2 to obtain an output of a representative array of pixel values of 7 x 7 x n4. This is a third input to the concatenation block (C).

[00042] The concatenation block (C) receives the first input from the fifth convolutional stack (CS5), the second input from the first subsampling block (SS1) and the third input from the second subsampling block (SS2). The concatenation block (C) concatenates the three inputs received to generate an output of value 7 x 7 x (n5 + n4 + n3). The output of the concatenation block (C) is an input to a probability block (P).
[00043] The probability block (P) provides a probability of the presence or absence of the retinal disease and the related severity of the retinal disease. The predicted probabilities of the twenty test time images are averaged to get a final prediction result. The output of the convolutional network provides a probability value for each retinal disease and a corresponding retinal disease severity level associated with the input fundus image. The probability block (P) as shown in Figure 2 provides five values by considering the retinal disease to be DR. The output of the probability block comprises five values depicting the probability for each DR severity level - DR0 (no DR), DR1 (mild DR level), DR2 (moderate DR level), DR3 (severe DR level) and DR4 (proliferative DR level).
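The tensor shapes flowing through the five stacks, the two strided subsampling blocks and the concatenation can be traced with a small sketch. The channel counts n1-n5 below are illustrative assumptions; only the spatial sizes are taken from the description.

```python
import numpy as np

n1, n2, n3, n4, n5 = 32, 64, 128, 256, 512   # assumed channel counts

def conv_stack(x, channels, factor=2):
    """Stand-in for a convolutional stack: its subsampling layer(s) reduce
    width and height by `factor` and change the channel count."""
    h, w, _ = x.shape
    return np.zeros((h // factor, w // factor, channels))

x   = np.zeros((448, 448, 3))          # preprocessed test time image
cs1 = conv_stack(x,   n1, factor=4)    # CS1: 112 x 112 x n1 (two subsampling layers)
cs2 = conv_stack(cs1, n2)              # CS2: 56 x 56 x n2
cs3 = conv_stack(cs2, n3)              # CS3: 28 x 28 x n3
cs4 = conv_stack(cs3, n4)              # CS4: 14 x 14 x n4
cs5 = conv_stack(cs4, n5)              # CS5: 7 x 7 x n5
ss1 = cs3[::4, ::4, :]                 # SS1: stride 4 -> 7 x 7 x n3
ss2 = cs4[::2, ::2, :]                 # SS2: stride 2 -> 7 x 7 x n4
cat = np.concatenate([cs5, ss1, ss2], axis=-1)  # C: 7 x 7 x (n5 + n4 + n3)
```

The concatenated 7 x 7 x (n5 + n4 + n3) tensor is what the probability block (P) reduces to the per-class probability values.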
[00044] The system 1000 also considers image capture device characteristics of an image capture device as one of the parameters to assess the input fundus image. The image capture device characteristics are, for example, a resolution, an illumination factor, a field of view, or any combination thereof. The image capture device is, for example, a fundus camera, a camera attached to a smartphone, etc., used to capture the input fundus image. For example, the image capture device is the input device used to capture the input fundus image.

[00045] In an embodiment, the processor 102 of the system 1000 considers a manufacturer and version of the image capture device to determine a predefined score for the image capture device characteristics. This predefined score is used to assess the input fundus image and denotes a superiority of the image capture device characteristics. The predefined score is a numeric value within the range [0, 1], where 0 is the least value and 1 is the highest value. For example, the predefined scores for image capture devices of multiple manufacturers are initially stored in the one or more storage devices 103 by an operator of the system 1000. By considering the image capture device characteristics when assessing the quality of the input fundus image, the flexibility of the system 1000 is increased, thereby providing customized results for input fundus images captured using image capture devices of multiple manufacturers.
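A sketch of the predefined-score lookup keyed by manufacturer and version follows; the table entries, key scheme and default value are hypothetical, standing in for the scores an operator would store in the one or more storage devices 103.

```python
# hypothetical predefined scores stored by an operator of the system
DEVICE_SCORES = {
    ("AcmeOptics", "FundusCam v2"): 0.90,
    ("PhoneLensCo", "ClipOn v1"): 0.40,
}

def device_characteristics_score(manufacturer, version, default=0.5):
    """Predefined score in [0, 1] for the image capture device, where 0 is
    the least value and 1 the highest; unknown devices get an assumed default."""
    score = DEVICE_SCORES.get((manufacturer, version), default)
    return min(max(score, 0.0), 1.0)  # clamp to the documented range [0, 1]
```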
Thus, the processor 102 considers both the output of the convolutional network and the image capture device characteristics of the image capture device to determine the presence or absence of a retinal disease and/or the related severity of the retinal disease associated with the input fundus image. The processor 102 thereby accounts for the quality of the input fundus image along with the output of the convolutional network in making this determination.
[00046] The processor 102 displays the presence or absence of a retinal disease and/or the related severity of the retinal disease associated with the input fundus image via a display 104 to a user. For example, suitable suggestions with a set of instructions to the user may also be provided via a pop-up box displayed on a screen. The system 1000 may also generate a report comprising the input fundus image, the type of the retinal disease and the severity of the retinal disease, and communicate the report to the patient via electronic mail. The report may also be stored in the one or more storage devices 103 of the system 1000.
[00047] In an embodiment, the processor 102 assesses a quality measure of each of the reference fundus images in the reference dataset. The quality measure is also stored as a part of the label associated with the reference fundus image. The processor 102 trains the convolutional network to learn the quality measure of the reference fundus image along with the identification of the indicators in the reference fundus image. The processor 102 assesses the input fundus image based on this training and identifies the quality measure of the input fundus image using the convolutional network. The processor 102 may also refer to a user defined threshold to define the quality measure of the input fundus image. The user defined threshold increases the flexibility of the system 1000: it is a variable factor which may be varied to suit the requirements of the user, for example, a medical practitioner, based on the doctor's grading experience. The system 1000 may further display, via the display 104, a message to an operator to retake another fundus image of the patient when the quality measure of the input fundus image is below the threshold. The system 1000 may further consider characteristics of a device used to capture the input fundus image of the patient as an additional parameter to assess the quality measure of the input fundus image. For example, the device is a fundus camera and the characteristic of the device is the resolution of the fundus camera.
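The user-defined threshold check that triggers the retake message can be sketched as follows; the default threshold value and message wording are assumptions.

```python
def assess_quality(quality_measure, user_threshold=0.5):
    """Return a retake message when the quality measure predicted by the
    network for the input fundus image falls below the user-defined
    threshold; return None when the image is acceptable."""
    if quality_measure < user_threshold:
        return "Image quality too low - please retake the fundus image."
    return None
```

Raising `user_threshold` makes the system stricter about image quality, which is how a medical practitioner could tune the check to their grading experience.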
[00048] In another embodiment, the system 1000 detects, based on changes in retinal features, the presence of several diseases not limited to retinal diseases, for example, diabetes, stroke, hypertension, cardiovascular diseases, etc. The processor 102 trains the convolutional network to identify and classify the severity of these diseases using the fundus image of the patient.
[00049] Figure 3 illustrates a flowchart for determination of the presence or absence of the retinal disease and the related severity of the retinal disease associated with the input fundus image in accordance with the invention. At step 301, the processor 102 receives the fundus image of the patient. The fundus image is the input fundus image. The input fundus image is a two-dimensional array of digital image data. At step 302, the processor 102 identifies multiple indicators throughout the fundus image using the convolutional network. The indicator is one of an abnormality, an optic disc, a macula, a blood vessel or the like. The abnormality is, for example, a lesion such as a venous beading, an intra retinal microvascular abnormality, an intra retinal hemorrhage, a micro aneurysm, a soft exudate (cotton-wool spot), a hard exudate, a vitreous/preretinal hemorrhage, neovascularization, drusen or the like.
[00050] At step 303, the processor 102 detects the presence or absence of the retinal disease based on the identified indicators using the convolutional network. The retinal disease is one of diabetic retinopathy, diabetic macular edema, glaucoma, coloboma, retinal tear, retinal detachment or the like. At step 304, the processor 102 classifies the severity of the retinal disease based on the presence or absence of the retinal disease using the convolutional network. The severity of the retinal disease may be classified into several levels depending upon the severity. The general concepts of the current invention are not limited to a particular number of severity levels. In an embodiment, one severity level could be used, which satisfies only the detection of the retinal disease. In another embodiment, multiple severity levels could be used to classify the retinal disease. In another embodiment, multiple retinal diseases could be detected based on the identified indicators. The system 1000 classifies each of the detected retinal diseases based on the severity.
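Given the five probability values produced for DR, the severity classification of step 304 reduces to selecting the most probable level; a sketch follows (the label strings are assumptions based on the levels named above).

```python
import numpy as np

DR_LEVELS = ["DR0 (no DR)", "DR1 (mild)", "DR2 (moderate)",
             "DR3 (severe)", "DR4 (proliferative)"]

def classify_dr(final_probabilities):
    """Map the averaged probability vector to a severity level and a
    confidence value (the probability of the chosen level)."""
    idx = int(np.argmax(final_probabilities))
    return DR_LEVELS[idx], float(final_probabilities[idx])

level, confidence = classify_dr(np.array([0.05, 0.10, 0.60, 0.20, 0.05]))
```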
[00051] The system 1000, using the convolutional network, emphasizes classifying the entire fundus image as a whole. This improves efficiency and reduces errors in identifying various medical conditions. The system 1000 acts as an important tool in the detection of a retinal disease, in monitoring its progression, and/or in assessing a response to a therapy. The system 1000 trains the convolutional network to detect all indicators related to multiple retinal diseases. The system 1000 accurately detects indicators throughout the input fundus image which are indicative of disease conditions, to properly distinguish indicators of a healthy fundus from indicators which define retinal diseases.

[00052] The system 1000 may also be used to detect certain conditions such as a laser treated fundus. The system 1000 may be a part of a web cloud, with the input fundus image and the report uploaded to the web cloud. The computer-based process of supervised learning using the convolutional network as described can thus be effectively used to screen fundus images. The system 1000 identifies indicators which are further processed to automatically provide indications of relevant retinal disease, in particular indications of DR. The system 1000 increases efficiency by utilizing the well trained convolutional network for detecting and classifying retinal diseases, thus providing cost-effective early screening and treatment to the patient.
[00053] The system 1000 reduces the time-consumption involved in a manual process requiring a trained medical practitioner to evaluate digital fundus photographs of the retina. The system 1000, using the convolutional network, effectively improves the quality of analysis of the fundus image by detecting indicators of minute size which are often difficult to detect in the manual process of evaluating the fundus image.
[00054] The present invention described above, although described functionally, may be configured to work in a network environment comprising a computer in communication with one or more devices. The present invention may be implemented by computer programmable instructions stored on one or more computer readable media and executed by a processor 102 of the computer. The computer comprises the processor 102, a memory unit, an input/output (I/O) controller, and a display 104 communicating via a data bus. The computer may comprise multiple processors 102 to increase a computing capability of the computer. The processor 102 is an electronic circuit which executes computer programs.

[00055] The memory unit, for example, comprises a read only memory (ROM) and a random access memory (RAM). The memory unit stores the instructions for execution by the processor 102. In this invention, the one or more storage devices 103 constitute the memory unit. For instance, the memory unit stores the reference dataset and the reference ground-truth file. The memory unit may also store intermediate, static and temporary information required by the processor 102 during the execution of the instructions. The computer comprises one or more input devices, for example, a keyboard such as an alphanumeric keyboard, a mouse, a joystick, etc. The I/O controller controls the input and output actions performed by a user. The data bus allows communication between modules of the computer. The computer directly or indirectly communicates with the devices via an interface, for example, a local area network (LAN), a wide area network (WAN), the Ethernet, the Internet, a token ring, or the like. Further, each of the devices is adapted to communicate with the computer and may comprise computers with, for example, Sun® processors, IBM® processors, Intel® processors, AMD® processors, etc.
[00056] The computer readable media comprises, for example, CDs, DVDs, floppy disks, optical disks, magnetic-optical disks, ROMs, RAMs, EEPROMs, magnetic cards, application specific integrated circuits (ASICs), or the like. Each of the computer readable media is coupled to the data bus.
[00057] The foregoing examples have been provided merely for the purpose of explanation and do not limit the present invention disclosed herein. While the invention has been described with reference to various embodiments, it is understood that the words are used for illustration and are not limiting. Those skilled in the art may effect numerous modifications thereto, and changes may be made, without departing from the scope and spirit of the invention in its aspects.

Claims

WE CLAIM :-
1. A system 1000 comprising: at least one processor 102; and one or more storage devices 103 configured to store software instructions configured for execution by the at least one processor 102 in order to cause the system 1000 to: receive a fundus image of a patient; identify a plurality of indicators throughout the fundus image using a convolutional network; detect a presence or absence of a retinal disease based on the identified indicators using the convolutional network; and classify a severity of the retinal disease based on the presence or absence of the retinal disease using the convolutional network.
2. The system 1000 as claimed in claim 1, wherein the indicator is one of an abnormality, a retinal feature or the like.
3. The system 1000 as claimed in claim 2, wherein the abnormality is one of a lesion, a venous beading, a venous loop, an intra retinal microvascular abnormality, an intra retinal hemorrhage, a micro aneurysm, a soft exudate, a hard exudate, a vitreous/preretinal hemorrhage, neovascularization or the like.
4. The system 1000 as claimed in claim 1, wherein the retinal disease is one of diabetic retinopathy, diabetic macular edema, glaucoma, coloboma, retinal tear, retinal detachment or the like.
5. The system 1000 as claimed in claim 1, wherein the retinal feature is an optic disc, a macula, a blood vessel or the like.
6. The system 1000 as claimed in claim 1, wherein the processor 102 considers image capture device characteristics of an image capture device to assess the fundus image.
7. A computer-implemented method for determining and classifying a retinal disease, at least a portion of the method being performed by a system 1000 comprising at least one processor 102, the method comprising: receiving a fundus image of a patient;
identifying a plurality of indicators throughout the fundus image using a convolutional network;
detecting a presence or absence of a retinal disease based on the identified indicators using the convolutional network; and
classifying a severity of the retinal disease based on the presence or absence of the retinal disease using the convolutional network.
8. The method as claimed in claim 7, wherein the indicator is one of an abnormality, a retinal feature or the like.
9. The method as claimed in claim 8, wherein the abnormality is one of a lesion, a venous beading, a venous loop, an intra retinal microvascular abnormality, an intra retinal hemorrhage, a micro aneurysm, a soft exudate, a hard exudate, a vitreous/preretinal hemorrhage, neovascularization or the like.
10. The method as claimed in claim 7, wherein the retinal disease is one of diabetic retinopathy, diabetic macular edema, glaucoma, coloboma, retinal tear, retinal detachment, retinal emboli, solar retinopathy, retinal vein occlusion or the like.
PCT/IN2018/050682 2017-10-24 2018-10-22 A system and method for detection and classification of retinal disease WO2019082203A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN201741037450 2017-10-24
IN201741037450 2017-10-24

Publications (1)

Publication Number Publication Date
WO2019082203A1 true WO2019082203A1 (en) 2019-05-02

Family

ID=66247224

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IN2018/050682 WO2019082203A1 (en) 2017-10-24 2018-10-22 A system and method for detection and classification of retinal disease

Country Status (1)

Country Link
WO (1) WO2019082203A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111028230A (en) * 2019-12-24 2020-04-17 贵州大学 Fundus image optic disc and macula lutea positioning detection algorithm based on YOLO-V3

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110242306A1 (en) * 2008-12-19 2011-10-06 The Johns Hopkins University System and method for automated detection of age related macular degeneration and other retinal abnormalities
US20150110368A1 (en) * 2013-10-22 2015-04-23 Eyenuk, Inc. Systems and methods for processing retinal images for screening of diseases or abnormalities



Similar Documents

Publication Publication Date Title
US20220076420A1 (en) Retinopathy recognition system
US11213197B2 (en) Artificial neural network and system for identifying lesion in retinal fundus image
Wang et al. Automated diagnosis and segmentation of choroidal neovascularization in OCT angiography using deep learning
Kauppi et al. The diaretdb1 diabetic retinopathy database and evaluation protocol.
Tang et al. Splat feature classification with application to retinal hemorrhage detection in fundus images
Sopharak et al. Machine learning approach to automatic exudate detection in retinal images from diabetic patients
Kauppi et al. DIARETDB0: Evaluation database and methodology for diabetic retinopathy algorithms
KR20200005407A (en) Diagnostic auxiliary image providing device based on eye image
Mayya et al. Automated microaneurysms detection for early diagnosis of diabetic retinopathy: A Comprehensive review
Aquino Establishing the macular grading grid by means of fovea centre detection using anatomical-based and visual-based features
WO2019180742A1 (en) System and method for retinal fundus image semantic segmentation
Valizadeh et al. Presentation of a segmentation method for a diabetic retinopathy patient’s fundus region detection using a convolutional neural network
Joshi et al. Glaucoma detection using image processing and supervised learning for classification
Xiao et al. Major automatic diabetic retinopathy screening systems and related core algorithms: a review
Al-Jarrah et al. Non-proliferative diabetic retinopathy symptoms detection and classification using neural network
KR20200023029A (en) Diagnosis assistance system and control method thereof
Kanth et al. Identification of different stages of Diabetic Retinopathy using artificial neural network
CN113158821B (en) Method and device for processing eye detection data based on multiple modes and terminal equipment
Jemima Jebaseeli et al. Retinal blood vessel segmentation from depigmented diabetic retinopathy images
Gupta et al. Artifical intelligence with optimal deep learning enabled automated retinal fundus image classification model
WO2019082202A1 (en) A fundus image quality assessment system
Kumar et al. Deep learning-assisted retinopathy of prematurity (ROP) screening
WO2019171398A1 (en) A fundus image analysis system
Jana et al. A semi-supervised approach for automatic detection and segmentation of optic disc from retinal fundus image
WO2019082203A1 (en) A system and method for detection and classification of retinal disease

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18870363

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18870363

Country of ref document: EP

Kind code of ref document: A1