WO2019171398A1 - A fundus image analysis system

A fundus image analysis system

Info

Publication number
WO2019171398A1
Authority
WO
WIPO (PCT)
Prior art keywords
fundus image
fundus
image
label
analysis
Application number
PCT/IN2019/050188
Other languages
French (fr)
Inventor
Pradeep WALIA
Rajarajeshwari KODHANDAPANI
Raja Lakshmi RAJA
Mrinal HALOI
Original Assignee
Artificial Learning Systems India Private Limited
Application filed by Artificial Learning Systems India Private Limited
Publication of WO2019171398A1

Classifications

    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B3/00 Apparatus for testing the eyes; Instruments for examining the eyes
    • A61B3/10 Objective types, i.e. instruments for examining the eyes independent of the patients' perceptions or reactions
    • A61B3/12 Objective types, i.e. instruments for examining the eyes independent of the patients' perceptions or reactions for looking at the eye fundus, e.g. ophthalmoscopes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H15/00 ICT specially adapted for medical reports, e.g. generation or transmission thereof
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H30/00 ICT specially adapted for the handling or processing of medical images
    • G16H30/20 ICT specially adapted for the handling or processing of medical images for handling medical images, e.g. DICOM, HL7 or PACS
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H30/00 ICT specially adapted for the handling or processing of medical images
    • G16H30/40 ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30041 Eye; Retina; Ophthalmic
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30168 Image quality inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/03 Recognition of patterns in medical or anatomical images

Definitions

  • the invention relates to the field of medical decision support. More particularly, the invention relates to the analysis of a fundus image to identify a quality level of the fundus image and aid in the diagnosis of retinal diseases.
  • Vision is an important survival attribute for humans, making the eyes one of the most vital sensory organs. Though most eye diseases are not fatal, failure to properly diagnose and treat an eye disease may lead to vision loss. Early detection of eye diseases through regular screening may prevent visual loss and blindness amongst patients. Analysis of fundus images of a patient is a very convenient way of screening and monitoring eye diseases. The fundus of the eye provides indications of several diseases, in particular eye diseases such as diabetic retinopathy.
  • diabetic retinopathy is one of the primary causes of vision loss.
  • Long-term complications of diabetes include diabetic retinopathy.
  • the groundwork required to prevent visual loss due to diabetic retinopathy will become even more deficient.
  • the expertise required is often lacking in areas where the rate of diabetes in populations is high and diabetic retinopathy detection is most needed.
  • Micro-aneurysms are an important feature used for detecting diabetic retinopathy in the fundus image of the patient. Small areas of swelling caused by vascular changes in the retina's blood vessels are known as micro-aneurysms. Micro-aneurysms may sooner or later cause plasma leakage resulting in thickening of the retina. This is known as edema. Thickening of the retina in the macular region may result in vision loss. Proper distinction of features in the fundus image is critical, as wrong predictions may lead to wrong treatments causing difficulties to the patient. In recent times, computer-aided screening systems assist doctors in improving the quality of examination of fundus images for screening of eye diseases. Machine learning (ML) algorithms are applied to data to extract and evaluate information.
  • An artificial neural network is a computational model comprising a group of interconnected artificial neurons.
  • Convolutional neural network is a feed forward artificial neural network having several applications in pattern recognition and classification.
  • A convolutional neural network comprises collections of neurons, each having a receptive field, which together tile an input space.
  • the systems available for identification and classification of eye diseases using fundus images involving machine learning algorithms are complex and of high cost. Additionally, training of the machine learning algorithm is also challenging, adding to the overall cost of the system. This limits the reach of medical eye screening and diagnosis for the common man.
  • the present invention discloses a computer implemented system for analyzing a fundus image of a patient.
  • the system comprises at least one processor; a non-transitory computer readable storage medium communicatively coupled to the at least one processor, the non-transitory computer readable storage medium configured to store a fundus image analysis application, the at least one processor configured to execute the fundus image analysis application; and the fundus image analysis application comprising: a graphical user interface comprising a plurality of interactive elements configured to enable capture and analysis of the fundus image via a user device; a reception means adapted to receive an input from an image capturing device based on a plurality of parameters of the image capturing device, wherein the input is the fundus image of the patient displayed in a live mode; an interactive fundus image rendering means adapted to dynamically render the input, wherein the dynamically rendered input is configurably accessible on the graphical user interface via the user device using the interactive elements; a fundus image capture means adapted to capture the fundus image based on the dynamically rendered input
  • the system further comprises the second analysis means adapted to generate a third label for the fundus image based on the parameters of the image capturing device used to capture the fundus image using the second convolutional neural network, wherein the second convolutional neural network is previously trained to generate the second label for the fundus image; and the second analysis means adapted to train a third convolutional neural network using the third label.
  • the user device is, for example, a personal computer, a laptop, a tablet computing device, a personal digital assistant, a client device, a web browser, etc.
  • the user defined quality threshold is a quality measure defined by a user for the fundus image based on a user grading experience.
  • the image capturing device refers to a camera for photographing the fundus of the patient.
  • the parameters of the image capturing device are a manufacturer of the image capturing device, a version of the image capturing device and the like.
  • the indicator is one of an abnormality, a retinal feature or the like.
  • the abnormality is one of a lesion like a venous beading, a venous loop, an intra retinal microvascular abnormality, an intra retinal hemorrhage, a micro aneurysm, a soft exudate (cotton-wool spots), a hard exudate, a vitreous/preretinal hemorrhage, neovascularization, a drusen or the like.
  • the retinal disease is one of diabetic retinopathy, diabetic macular edema, glaucoma, coloboma, retinal tear, retinal detachment or the like.
  • the state of the retinal disease indicates a level of seriousness of the retinal disease or a likelihood of developing the retinal disease.
  • Figure 1 illustrates a block diagram of a computer implemented system for analyzing a fundus image of a patient in accordance with the invention;
  • Figure 2 exemplarily illustrates a first convolutional neural network to compute a quality level of an input fundus image;
  • Figure 3 exemplarily illustrates a second convolutional neural network to compute a presence or absence of a retinal disease and related severity of the retinal disease associated with the input fundus image;
  • Figure 4 exemplarily illustrates the architecture of a computer system employed by a fundus image analysis application;
  • Figure 5 exemplarily illustrates a screenshot of a graphical user interface (GUI) provided by the system, displaying a log-in screen of the system;
  • Figure 6 exemplarily illustrates a screenshot of the GUI provided by the system, displaying a menu screen of the system;
  • Figure 7 exemplarily illustrates a screenshot of the GUI provided by the system, displaying an add new patient screen of the system;
  • Figure 8 exemplarily illustrates a screenshot of the GUI provided by the system, displaying an existing patients screen of the system;
  • Figure 9 exemplarily illustrates a screenshot of the GUI provided by the system, displaying a profile screen of an existing patient of the system;
  • Figure 10 exemplarily illustrates a screenshot of the GUI provided by the system, displaying existing images of the existing patient of the system;
  • Figure 11 exemplarily illustrates a screenshot of the GUI provided by the system, displaying a fundus image rendering screen of the system;
  • Figure 12 exemplarily illustrates a screenshot of the GUI provided by the system, displaying a fundus image analysis screen of the system after the fundus image of the patient is captured by the user;
  • Figure 13 exemplarily illustrates a screenshot of the GUI provided by the system, displaying a fundus image upload screen of the system to upload the fundus image of the patient for analysis;
  • Figure 14 exemplarily illustrates a screenshot of the GUI provided by the system, displaying the fundus image analysis screen of the system when the user selects the option “Image Quality and analyse”;
  • Figure 15 exemplarily illustrates a screenshot of the GUI provided by the system, displaying the fundus image analysis screen of the system when the user selects the option “Analyse”;
  • Figure 16 exemplarily illustrates a screenshot of the GUI provided by the system, displaying a report screen of the system.
  • Figure 17 illustrates a flowchart for analyzing the fundus image of the patient in accordance with the invention.
  • FIG. 1 illustrates a block diagram of a computer implemented system for analyzing a fundus image of a patient in accordance with the invention.
  • the system comprises at least one processor; a non-transitory computer readable storage medium communicatively coupled to the at least one processor, the non-transitory computer readable storage medium configured to store a fundus image analysis application 103, the at least one processor configured to execute the fundus image analysis application 103; and the fundus image analysis application 103 comprising: a graphical user interface (GUI) 103k comprising a plurality of interactive elements 103j configured to enable capture and analysis of the fundus image via a user device 101a, 101b or 101c; a reception means 103a adapted to receive an input from an image capturing device based on a plurality of parameters of the image capturing device, wherein the input is the fundus image of the patient displayed in a live mode; an interactive fundus image rendering means 103b adapted to dynamically render the input, wherein the dynamically rendered input is configurably accessible on the GUI 103k via the user device 101a, 101b or 101c using the interactive elements 103j; and a fundus image capture means adapted to capture the fundus image based on the dynamically rendered input.
  • the system 100 further comprises the second analysis means 103i adapted to generate a third label for the fundus image based on the parameters of the image capturing device used to capture the fundus image using the second convolutional neural network, wherein the second convolutional neural network is previously trained to generate the second label for the fundus image; and the second analysis means 103i adapted to train a third convolutional neural network using the third label.
  • the term “patient” refers to an individual receiving or registered to receive medical treatment.
  • the patient is, for example, an individual undergoing a regular health checkup, an individual with a condition of diabetes mellitus, etc.
  • the term “fundus image” refers to a two-dimensional array of digital image data; however, this is merely illustrative and not limiting of the scope of the invention.
  • the computer implemented system comprises at least one processor; a non-transitory computer readable storage medium communicatively coupled to the at least one processor and the fundus image analysis application 103.
  • the non-transitory computer readable storage medium is configured to store the fundus image analysis application 103.
  • the at least one processor is configured to execute the fundus image analysis application 103.
  • the fundus image analysis application 103 is executable by at least one processor configured to enable capture and analysis of the fundus image of the patient via the user device 101a, 101b or 101c.
  • the user device 101a, 101b or 101c is, for example, a personal computer, a laptop, a tablet computing device, a personal digital assistant, a client device, a web browser, etc.
  • the fundus image analysis application 103 is a web application implemented on a web based platform, for example, a website hosted on a server or a setup of servers.
  • the fundus image analysis application 103 is implemented on a web based platform, for example, a fundus image analysis platform 104 as illustrated in Figure 1.
  • the fundus image analysis platform 104 hosts the fundus image analysis application 103.
  • the fundus image analysis application 103 is accessible to one or more user devices 101a, 101b or 101c.
  • the user device 101a, 101b or 101c is, for example, a computer, a mobile phone, a laptop, etc.
  • the user device is accessible over a network such as the internet, a mobile telecommunication network, a Wi-Fi® network of the Wireless Ethernet Compatibility Alliance, Inc., etc.
  • the fundus image analysis application 103 is accessible through browsers such as Internet Explorer® (IE) 8, IE 9, IE 10, IE 11 and IE 12 of Microsoft Corporation, Safari® of Apple Inc., Mozilla® Firefox® of Mozilla Foundation, Chrome of Google, Inc., etc., and is compatible with technologies such as hypertext markup language 5 (HTML5), etc.
  • the fundus image analysis application 103 is configured as a software application, for example, a mobile application downloadable by a user on the user device 101a, 101b or 101c, for example, a tablet computing device, a mobile phone, etc.
  • the term “user” is an individual who operates the fundus image analysis application 103 to capture the fundus images of the patient and generate a report resulting from the analysis of the captured fundus images.
  • the fundus image analysis application 103 is accessible by the user device 101a, 101b or 101c via the GUI 103k provided by the fundus image analysis application 103.
  • the fundus image analysis application 103 is accessible over the network 102.
  • the network 102 is, for example, the internet, an intranet, a wireless network, a wired network, a Wi-Fi® network of the Wireless Ethernet Compatibility Alliance, Inc., a universal serial bus (USB) communication network, a ZigBee® network of ZigBee Alliance Corporation, a general packet radio service (GPRS) network, a global system for mobile (GSM) communications network, a code division multiple access (CDMA) network, a third generation (3G) mobile communication network, a fourth generation (4G) mobile communication network, a wide area network, a local area network, an internet connection network, an infrared communication network, etc., or any combination of these networks.
  • the fundus image analysis application 103 comprises the GUI 103k comprising a plurality of interactive elements 103j configured to enable capture and analysis of the fundus image via the user device 101a, 101b or 101c.
  • the term “interactive elements 103j” refers to interface components on the GUI 103k configured to perform a combination of processes, for example, a retrieval process from the input received from the user, for example, the fundus images of the patient, processes that enable real time user interactions, etc.
  • the interactive elements 103j comprise, for example, clickable buttons.
  • the fundus image analysis application 103 comprises the reception means 103a adapted to receive the input from the image capturing device based on the parameters of the image capturing device.
  • the input is the fundus image of the patient.
  • the input may also be a plurality of fundus images of the patient.
  • the term “image capturing device” refers to a camera for photographing the fundus of the patient.
  • the image capturing device is a Zeiss FF 450+ fundus camera comprising a Charge-Coupled Device (CCD) photographic unit.
  • the image capturing device is a smart phone with a camera capable of capturing the fundus images of the patient.
  • the parameters of the image capturing device are a manufacturer of the image capturing device, a version of the image capturing device and the like.
  • the reception means 103a receives information associated with the patient from the user device, for example, 101a, 101b or 101c via the GUI 103k.
  • the information associated with the patient is, for example, personal details about the patient, medical condition of the patient, etc., as shown in Figure 7.
  • the image capturing device is in communication with the fundus image analysis application 103 via the network 102, for example, the internet, an intranet, a wireless network, a wired network, a Wi-Fi® network of the Wireless Ethernet Compatibility Alliance, Inc., a universal serial bus (USB) communication network, a ZigBee® network of ZigBee Alliance Corporation, a general packet radio service (GPRS) network, a global system for mobile (GSM) communications network, a code division multiple access (CDMA) network, a third generation (3G) mobile communication network, a fourth generation (4G) mobile communication network, a wide area network, a local area network, an internet connection network, an infrared communication network, etc., or any combination of these networks.
  • the fundus image analysis application 103 accesses the image capturing device based on the parameters of the image capturing device to receive the input of the patient.
  • the fundus image analysis application 103 comprises a transmission means to request permission from the image capturing device to control the activities of the image capturing device and capture the input associated with the patient.
  • the image capturing device responds to the request received from the transmission means.
  • the reception means 103a receives the response of the image capturing device.
  • the image capturing device permits the user of the fundus image analysis application 103 to control the activities of the image capturing device via the interactive elements 103j of the GUI 103k.
  • the term “activities” refers to viewing a live mode of the fundus of the patient on a screen of the GUI 103k, focusing a field of view by zooming in or zooming out to observe the fundus of the patient, and capturing the fundus image of the patient from the displayed live mode of the fundus of the patient.
  • the fundus image analysis application 103 adaptably controls the activities specific to the image capturing device based on the parameters, for example, the manufacturer, of the image capturing device.
  • the fundus image analysis application 103 is customizable to suit the parameters of the image capturing device such as the version, the manufacturer, the model details, etc. In other terms, the fundus image analysis application 103 is customizable and can be suitably adapted to capture the fundus images of the patient for different manufacturers of the image capturing device.
  • the user of the fundus image analysis application 103 can view the input of the image capturing device on the screen of the GUI 103k.
  • the interactive fundus image rendering means 103b dynamically renders the input on the GUI 103k.
  • the dynamically rendered input is configurably accessible on the GUI 103k via the user device 101a, 101b or 101c using the interactive elements 103j.
  • the field of view of the image capturing device is displayed on a screen of the GUI 103k via the user device 101a, 101b or 101c.
  • the user can focus the field of view by zooming in or zooming out the field of view to observe the fundus of the patient by using the interactive elements 103j via a user input device such as a mouse, a trackball, a joystick, etc.
  • the user captures the fundus image of the patient from the displayed live mode of the fundus of the patient using the interactive elements 103j of the GUI 103k via the user device 101a, 101b or 101c.
  • the term “live mode” refers to the seamless display of the fundus of the patient in real time via the GUI 103k.
  • the input is an already existing fundus image of the patient stored in the database 104a.
  • the fundus image analysis application 103 comprises a first analysis means 103h configured to determine the quality level of the captured fundus image.
  • the first analysis means 103h comprises the initial quality level detection means to generate the first label for the fundus image using the first convolutional neural network, wherein the initial label is the initial quality level of the fundus image; and the final quality level determination means to determine the final quality level of the fundus image based on the generated first label, the user defined quality threshold and the parameters of the image capturing device.
  • the fundus image analysis application 103 comprises the second analysis means 103i adapted to analyze the fundus image using the second convolutional neural network by considering the determined final quality level based on the user selection criterion, comprising: the indicators identification means to identify multiple indicators throughout the fundus image; and the retinal disease detection means to detect the state of the retinal disease based on the identified indicators.
  • the term “quality level” of the fundus image defines a gradable efficiency of the fundus image.
  • the quality level of the fundus image is based on a plurality of quality factors.
  • the quality factors are, for example, darkness, light, contrast, color accuracy, tone reproduction, distortion, an exposure accuracy, sharpness, noise, lens flare, etc.
  • the term “convolutional neural network” refers to a class of deep artificial neural networks that can be applied to analyzing visual imagery.
  • the initial label defines the initial quality level of the fundus image which is a quality level computed by the first convolutional neural network based on a training provided to the first convolutional neural network.
  • the final quality level determination means considers the generated first label which is the detected initial quality level along with the user defined quality threshold and the parameters of the image capturing device to determine the final quality level of the fundus image.
  • the user defined quality threshold is a user defined parameter to vary the quality level of the fundus image.
  • the user defined quality threshold is based on the user’s confidence and ability to grade the fundus image.
  • the term “indicator” is one of an abnormality, a retinal feature or the like.
  • the retinal feature is an optic disc, a macula, a blood vessel or the like.
  • the abnormality is one of a lesion like a venous beading, a venous loop, an intra retinal microvascular abnormality, an intra retinal hemorrhage, a micro aneurysm, a soft exudate (cotton-wool spots), a hard exudate, a vitreous/preretinal hemorrhage, neovascularization, a drusen or the like.
  • the retinal disease is one of diabetic retinopathy, diabetic macular edema, glaucoma, coloboma, retinal tear, retinal detachment or the like.
  • the state of the retinal disease indicates a presence or absence of the retinal disease represented as levels of increasing seriousness of the retinal disease.
  • the second analysis means 103i analyzes the fundus image using the second convolutional neural network by considering the determined final quality level based on the user selection criterion.
  • the user selection criterion refers to a user’s selection of either considering the quality level of the fundus image before analyzing the fundus image for detection of one or more retinal diseases in the fundus image or analyzing the fundus image for detection of one or more retinal diseases in the fundus image without considering the quality level of the fundus image.
  • the user selection criterion is a selection process of the user which is realized by a clickable event of either the “Analyse” or the “Image Quality and analyse” buttons on the GUI 103k via the user input device such as a mouse, a trackball, a joystick, etc., as shown in Figure 12.
  • the user selection criterion of considering the quality level of the fundus image before analyzing the fundus image for detection of one or more retinal diseases in the fundus image can be activated by the selection of the interactive element “Image quality and analyse” clickable button provided by the GUI 103k as shown in Figure 12.
  • the user selection criterion of analyzing the fundus image for detection of one or more retinal diseases in the fundus image without considering the quality level of the fundus image can be activated by the selection of the interactive element “analyse” clickable button provided by the GUI 103k as shown in Figure 12.
  • the second analysis means 103i takes the determined final quality level into account to analyze the fundus image.
  • the determined final quality level is an output of the first analysis means 103h.
  • the second analysis means 103i considers the output of the first analysis means 103h when the user selection criterion is to consider the quality level of the fundus image before the analysis of the fundus image.
  • if the determined final quality level is ‘bad’, the second analysis means 103i aborts the analysis of the fundus image.
  • if the determined final quality level is ‘good’, the second analysis means 103i continues with the analysis of the fundus image by using the second convolutional neural network.
  • the ‘bad’ final quality level indicates that the fundus image is below a quality threshold
  • the ‘good’ final quality level indicates that the fundus image is above the quality threshold.
  • the purpose of providing the quality level of the fundus image is to detect low quality fundus images whose quality is inadequate for retinal disease screening and discard them.
  • the second analysis means 103i directly analyses the fundus image to identify a plurality of indicators throughout the fundus image and detect the state of the retinal disease based on the identified indicators.
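  • As an illustration only, the following minimal Python sketch shows how such a user selection criterion could gate the analysis. The helper functions assess_quality and analyse_disease are hypothetical stand-ins for the first analysis means 103h and the second analysis means 103i; they are not defined in this specification.

```python
def assess_quality(image, threshold):
    # Hypothetical placeholder: would run the first convolutional neural
    # network and combine its output with the user defined quality threshold
    # and the image capturing device parameters.
    return 'good'

def analyse_disease(image):
    # Hypothetical placeholder: would run the second convolutional neural
    # network to identify indicators and grade the retinal disease.
    return {'disease': 'diabetic retinopathy', 'severity': 'DR1'}

def analyse_fundus_image(image, consider_quality, threshold=0.5):
    """Gate the disease analysis on image quality if the user asked for it."""
    if consider_quality:                     # "Image Quality and analyse" button
        if assess_quality(image, threshold) == 'bad':
            return {'quality': 'bad', 'analysis': None}    # abort the analysis
        return {'quality': 'good', 'analysis': analyse_disease(image)}
    return {'quality': None, 'analysis': analyse_disease(image)}   # "Analyse" button
```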
  • the first convolutional neural network and the second convolutional neural network are convolutional neural networks and correspond to a specific model of an artificial neural network.
  • the first convolutional neural network generates the first label for the fundus image of the patient.
  • the second convolutional neural network generates the second label for the fundus image of the patient.
  • the first label refers to the initial quality level of the fundus image of the patient.
  • the second label for the fundus image of the patient refers to identification of the indicators in the fundus image and determination of the state of the retinal disease in the fundus image of the patient.
  • the convolutional neural network is trained using a first reference dataset of fundus images to accomplish the function associated with the convolutional neural network.
  • the term “function” of the first convolutional neural network refers to the determination of the initial quality level of the fundus image of the patient, and the “function” of the second convolutional neural network refers to the identification of the indicators in the fundus image and determination of the state of the retinal disease in the fundus image of the patient.
  • the fundus image analysis application 103 receives the first reference dataset from one or more devices.
  • the first reference dataset comprises a plurality of fundus images.
  • the fundus images in the first reference dataset are referred to as reference fundus images.
  • the device is, for example, the image capturing device such as a camera incorporated into a mobile device, a server, a network of personal computers, or simply a personal computer, a mainframe, a tablet computer, etc.
  • the fundus image analysis application 103 stores the first reference dataset in a database 104a of the system 100.
  • the system 100 comprises the database 104a in communication with the fundus image analysis application 103.
  • the database 104a is also configured to store patient profile information, patient medical history, the reference fundus images of patients, reports of the patients, etc.
  • a same set of reference fundus images is used to train the first convolutional neural network and the second convolutional neural network.
  • different sets of reference fundus images are used to train the first convolutional neural network and the second convolutional neural network.
  • the term “reference fundus image” is a two-dimensional array of digital image data used for the purpose of training the first convolutional neural network and the second convolutional neural network.
  • the term ‘training’ generally refers to a process of developing the first convolutional neural network for the detection of the initial quality level of the fundus image and the second convolutional neural network for the identification and determination of the state of the retinal disease, based on the first reference dataset and a reference ground-truth file.
  • the reference ground-truth file comprises a label and a reference fundus image identifier for each of the reference fundus images.
  • the label provides information about the reference fundus image such as the quality level of the fundus image, the state of a retinal disease, the type of retinal disease and the corresponding severity of the retinal disease identified in the reference fundus image.
  • the reference fundus image identifier of the reference fundus image is, for example, a name or an identity assigned to the reference fundus image.
  • the first convolutional neural network and the second convolutional neural network have a separate reference ground-truth file.
  • the first convolutional neural network and the second convolutional neural network refer to a common reference ground-truth file for relevant information required to perform the specific function associated with the convolutional neural network.
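  • The specification does not fix a file format for the reference ground-truth file; the short Python sketch below shows one plausible layout (the file name and column names are assumptions) in which each row pairs a reference fundus image identifier with its annotated quality level and DR severity label.

```python
import csv

# Illustrative rows: one reference fundus image identifier per row together
# with its annotated labels.
rows = [
    ("ref_image_001", "good", "DR2"),
    ("ref_image_002", "bad", "No DR"),
]

with open("reference_ground_truth.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["image_id", "quality_level", "dr_severity"])
    writer.writerows(rows)
```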
  • an annotator annotates each of the reference fundus images using the GUI 103k via the user device 101a, 101b or 101c.
  • the term “annotator” refers to a user of the fundus image analysis application 103 who is usually a trained/certified specialist in accurately annotating the fundus image to determine the quality level of the reference fundus image and analyze the indicators present in the reference fundus image.
  • the terms “annotator” and “user” are used interchangeably herein.
  • the annotator accesses the reference fundus images using the GUI 103k.
  • the annotator creates the label with information about the quality level of the fundus image, the state of the retinal disease present in the fundus image, the type of retinal disease and the corresponding severity of the retinal disease based on the annotation.
  • the annotator initially annotates the reference fundus image based on a plurality of quality factors.
  • the term “quality factors” refers to the parameters of the reference fundus image which define a measure of the quality level of the reference fundus image.
  • the quality level is a measure of perceived image degradation as compared to an ideal image reference based on amounts of the multiple quality factors.
  • the quality factors are, for example, darkness, light, contrast, color accuracy, tone reproduction, distortion, an exposure accuracy, sharpness, noise, lens flare, etc.
  • the annotator labels each of the reference fundus images as either ‘good’ or ‘bad’ representing the quality level of the reference fundus image.
  • a reference fundus image with the label comprising ‘good’ indicates the quality level of the reference fundus image with quality factors above a quality threshold.
  • the reference fundus image with the label comprising ‘bad’ indicates the quality level of the reference fundus image with a minimum number of the quality factors below the quality threshold.
  • the label may comprise terms such as either ‘low-quality’ or ‘high-quality’ based on the quality level of the reference fundus image.
  • the label may comprise terms defining five levels of quality - ‘bad’, ‘poor’, ‘fair’, ‘good’ and ‘excellent’.
  • the label may comprise a numeric value representing the degree of quality of the reference fundus image based on the values of each of the associated quality factors.
  • the annotator manually analyses the reference fundus image by partitioning the fundus image into a plurality of partitions.
  • the annotator divides the reference fundus image into nine equal partitions and analyses each of the partitions to determine the quality level of the reference fundus image.
  • the annotator considers the multiple quality factors while analyzing the partitions to finally determine the quality level of the reference fundus image.
  • the annotator determines the quality level of each of the partitions to determine the quality level of the reference fundus image. If the quality level of any one of the partitions is below the quality threshold and the partition comprises a region of interest such as an optic disc and/or a macula of the fundus of the patient, then the annotator determines the quality level of the reference fundus image as ‘bad’.
  • the annotator considers a minimum of two partitions with the quality level below the quality threshold and with an absence of the region of interest to determine the quality level of the reference fundus image as ‘bad’. If the annotator determines the quality level of all the partitions to be above the quality threshold, then the annotator classifies the quality level of the training fundus image as ‘good’. Accordingly, the annotator labels each of the reference fundus images as either ‘good’ or ‘bad’ representing the quality level of the partitions of the reference fundus image.
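  • The annotator's partition rule can be expressed compactly; the sketch below assumes each of the nine partitions has already been assessed for two boolean properties (below the quality threshold, and containing a region of interest), which is an assumption about the data representation rather than part of the specification.

```python
def grade_image_quality(partitions):
    """Apply the annotator's rule to the nine partition assessments.

    Each partition is a dict with two boolean flags (assumed inputs):
      'below_threshold' - quality factors below the quality threshold
      'has_roi'         - partition contains the optic disc and/or macula
    """
    bad_with_roi = sum(1 for p in partitions
                       if p['below_threshold'] and p['has_roi'])
    bad_without_roi = sum(1 for p in partitions
                          if p['below_threshold'] and not p['has_roi'])
    # One low-quality partition containing a region of interest, or at least
    # two low-quality partitions without one, make the whole image 'bad'.
    if bad_with_roi >= 1 or bad_without_roi >= 2:
        return 'bad'
    return 'good'

# Example: nine partitions, one low-quality partition that contains the macula.
parts = [{'below_threshold': False, 'has_roi': False} for _ in range(9)]
parts[4] = {'below_threshold': True, 'has_roi': True}
print(grade_image_quality(parts))   # -> 'bad'
```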
  • the annotator next annotates the reference fundus image to identify multiple indicators throughout the fundus image and to detect the state of the retinal disease based on the identified indicators.
  • the annotator detects the presence of one or more retinal diseases based on the identified indicators.
  • the annotator further updates the label of the fundus image with each type of the retinal disease, the severity of each type of the retinal disease, etc.
  • the annotator may concentrate only on the identification of a particular retinal disease.
  • the annotator annotates the reference fundus images for the retinal disease - diabetic retinopathy (DR).
  • the annotator may consider one or more standard DR grading standards such as the American ophthalmology DR grading scheme, the Scottish DR grading scheme, the UK DR grading scheme, etc., to annotate the reference fundus images.
  • the annotator may assign a DR severity grade - grade 0 (representing no DR), grade 1 (representing mild DR), grade 2 (representing moderate DR), grade 3 (representing severe DR) or grade 4 (representing proliferative DR) - to each of the reference fundus images.
  • the label of the reference fundus image represents the DR severity level associated with the patient.
  • the annotator labels each of the reference fundus images as one of five severity classes - ‘No DR’, ‘DR1’, ‘DR2’, ‘DR3’ and ‘DR4’ - based on an increasing seriousness of DR.
  • ‘No DR’, ‘DR1’, ‘DR2’, ‘DR3’ and ‘DR4’ represent the labels indicating different levels of increasing severity of DR associated with the patient.
  • the annotator analyses the indicators in the retinal fundus image and accordingly marks the label. If the annotator detects a microaneurysm, then the annotator considers it as a mild level of DR and marks the label as DR1 for the reference fundus image.
  • the annotator marks the label as DR2 for the reference fundus image.
  • the label DR2 indicates a moderate level of DR.
  • the annotator marks the label as DR3 for the reference fundus image with a severe level of DR upon detection of multiple hemorrhages, hard or soft exudates, etc., and DR4 for the reference fundus image with a proliferative level of DR upon detection of vitreous hemorrhage, neovascularization, etc.
  • the reference fundus image with no traces of DR is marked with the label ‘No DR’ by the annotator.
  • the annotator stores the label and the reference fundus image identifier for each reference fundus image in the reference ground-truth file located in the database 104a.
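  • The DR labelling rules above can likewise be sketched as a simple mapping from detected indicators to a severity label. The DR3 and DR4 conditions follow the text; the DR2 condition is an assumption (findings beyond microaneurysms that do not meet the DR3/DR4 criteria), since the text does not spell it out.

```python
def grade_dr(indicators):
    """Map a set of detected indicator names to a DR severity label."""
    if indicators & {'vitreous_hemorrhage', 'neovascularization'}:
        return 'DR4'                                 # proliferative DR
    if indicators & {'multiple_hemorrhages', 'hard_exudate', 'soft_exudate'}:
        return 'DR3'                                 # severe DR
    if indicators - {'microaneurysm'}:
        return 'DR2'                                 # moderate DR (assumed rule)
    if 'microaneurysm' in indicators:
        return 'DR1'                                 # mild DR
    return 'No DR'

print(grade_dr({'microaneurysm'}))                   # -> 'DR1'
print(grade_dr({'microaneurysm', 'hard_exudate'}))   # -> 'DR3'
```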
  • the label provides information about the type of retinal disease and the corresponding severity of the retinal disease as annotated by the annotator.
  • the severity of the retinal disease in turn provides the state of the retinal disease.
  • the state of the retinal disease is either a presence or an absence of the retinal disease.
  • the reference fundus image identifier of the reference fundus image is, for example, a name or an identity assigned to the reference fundus image.
  • the first analysis means 103h uses one or more of the known image processing algorithms to detect the quality level of the reference fundus image.
  • the second analysis means 103i identifies the indicators throughout each of the reference fundus images to detect the state of the retinal disease using the known image processing algorithms.
  • the second analysis means 103i classifies the severity of the retinal disease based on the presence of the retinal disease using a set of predetermined rules.
  • the predetermined rules comprise considering a type of each of the indicators, a count of each indicator, a region of occurrence of each of the indicators, a contrast level of each of the indicators, a size of each of the indicators, or any combination thereof to recognize the retinal disease and the severity of the retinal disease.
  • the second analysis means 103i classifies each of the detected retinal diseases according to a corresponding severity grading and generates the label.
  • the second analysis means 103i communicates with the database 104a to store the label and the reference fundus image identifier for each reference fundus image in the reference ground-truth file.
  • the first analysis means 103h utilizes the first reference dataset to train the first convolutional neural network for subsequent detection of the quality level of the fundus image.
  • the second analysis means 103i utilizes the first reference dataset to train the second convolutional neural network for subsequent detection and classification of the retina disease in the fundus image.
  • the fundus image which is subsequently analyzed by the first analysis means 103h and the second analysis means 103i is referred to as an input fundus image for clarity.
  • the fundus image analysis application 103 further comprises a pre-processing means 103d to pre-process each of the reference fundus images.
  • the pre-processing means 103d communicates with the database 104a to access the first reference dataset.
  • the pre-processing means 103d executes the following steps as part of the pre-processing.
  • the pre-processing means 103d separates any text matter present at the border of the reference fundus image.
  • the pre-processing means 103d adds a border to the reference fundus image with border pixel values as zero.
  • the pre-processing means 103d increases the size of the reference fundus image by a predefined number of pixels, for example, 20 pixels width and height. The additional pixels added are of a zero value.
  • the pre-processing means 103d next converts the reference fundus image from an RGB color image to a grayscale image.
  • the pre-processing means 103d now binarizes the reference fundus image using histogram analysis.
  • the pre-processing means 103d applies repetitive morphological dilation with a rectangular element of size [5, 5] to smooth the binarized reference fundus image.
  • the pre-processing means 103d acquires all connected regions, such as the retina and text matter, of the smoothed reference fundus image to separate the text matter present in the reference fundus image from a foreground image.
  • the pre-processing means 103d determines the largest region among the acquired connected regions as the retina. The retina is assumed to be the connected element with the largest region.
  • the pre-processing means 103d calculates a corresponding bounding box for the retina.
  • the pre-processing means 103d thus identifies the retina in the reference fundus image.
  • the pre-processing means 103d further blurs the reference fundus image using a Gaussian filter.
  • the pre-processing means 103d compares an image width and an image height of the blurred reference fundus image based on Equation 1.
  • the pre-processing means 103d calculates a maximum pixel value of a left half, a maximum pixel value of a right half and a maximum background pixel value for the blurred reference fundus image when the image width and the image height of the blurred identified retina satisfies the Equation 1.
  • the maximum background pixel value (Max_background pixel value) is given by the below Equation 2.
  • the term‘max_pixel_left’ in Equation 2 is the maximum pixel value of the left half of the blurred identified retina.
  • the term‘max_pixel_right’ in Equation 2 is the maximum pixel value of the right half of the blurred reference fundus image.
  • Max_background_pixel_value = max(max_pixel_left, max_pixel_right) — Equation 2
  • the pre-processing means 103d further extracts foreground pixel values from the blurred reference fundus image by considering pixel values which satisfy the below Equation 3.
  • the pre-processing means 103d calculates a bounding box using the extracted foreground pixel values from the blurred reference fundus image.
  • the pre-processing means 103d processes the bounding box to obtain a resized image using cubic interpolation of shape, for example, [256, 256, 3].
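  • The pre-processing steps above can be approximated with standard image-processing primitives. The sketch below uses OpenCV (an assumption; the specification names no library), substitutes Otsu thresholding for the histogram-based binarization, and omits the blur-based foreground extraction because Equations 1 and 3 are not reproduced here.

```python
import cv2
import numpy as np

def preprocess_fundus(image_bgr, out_size=(256, 256)):
    """Approximate sketch of the pre-processing pipeline described above."""
    # Add a zero-valued border of 20 pixels around the reference fundus image.
    img = cv2.copyMakeBorder(image_bgr, 20, 20, 20, 20,
                             cv2.BORDER_CONSTANT, value=0)
    # Convert to grayscale and binarize (Otsu thresholding stands in for the
    # histogram analysis described in the text).
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # Smooth with repeated morphological dilation using a [5, 5] rectangle.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
    smoothed = cv2.dilate(binary, kernel, iterations=3)
    # Take the largest connected region as the retina and crop its bounding box.
    _, _, stats, _ = cv2.connectedComponentsWithStats(smoothed)
    largest = 1 + int(np.argmax(stats[1:, cv2.CC_STAT_AREA]))  # skip background
    x, y, w, h, _ = stats[largest]
    retina = img[y:y + h, x:x + w]
    # Resize with cubic interpolation to the target shape, e.g. [256, 256, 3].
    return cv2.resize(retina, out_size, interpolation=cv2.INTER_CUBIC)
```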
  • the reference fundus image at this stage is referred to as the pre-processed reference fundus image.
  • the pre-processing means 103d stores the pre-processed reference fundus images in a pre-processed first reference dataset.
  • the ground-truth file associated with the first reference dataset holds good even for the pre-processed first reference dataset.
  • the pre-processing means 103d stores the pre-processed first reference dataset in the database 104a.

Segregation of the first reference dataset:
  • the fundus image analysis application 103 further comprises a segregation means 103e.
  • the segregation means 103e splits the pre-processed first reference dataset into two sets - a training set and a validation set.
  • the pre-processed reference fundus images in the training set are termed training fundus images
  • the pre-processed reference fundus images in the validation set are termed validation fundus images for simplicity.
  • the training set is used to train the convolutional neural network (the first convolutional neural network and the second convolutional neural network) to assess the reference fundus images based on the label associated with each of the reference fundus image.
  • the validation set is typically used to test the accuracy of the convolutional neural network.
  • the fundus image analysis application 103 further comprises an augmentation means 103f.
  • the augmentation means 103f augments the reference fundus images in the training set.
  • the augmentation means 103f performs the following steps for the augmentation of the training set.
  • the augmentation means 103f randomly shuffles the reference fundus images to divide the training set into a plurality of batches. Each batch is a collection of a predefined number of reference fundus images.
  • the augmentation means 103f randomly samples each batch of reference fundus images.
  • the augmentation means 103f processes each batch of the reference fundus images using affine transformations.
  • the augmentation means 103f translates and rotates the reference fundus images in the batch randomly based on a coin flip analogy.
  • the augmentation means 103f also adjusts the color and brightness of each of the reference fundus images in the batch randomly based on the results of the coin flip analogy.
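  • A minimal sketch of this batch augmentation follows; the rotation, translation and brightness ranges are assumptions, as the specification does not quantify them.

```python
import random
import numpy as np
import cv2

def make_batches(training_images, batch_size=32):
    """Randomly shuffle the training set and split it into fixed-size batches."""
    random.shuffle(training_images)
    return [training_images[i:i + batch_size]
            for i in range(0, len(training_images), batch_size)]

def augment_batch(images):
    """Randomly augment one batch of reference fundus images."""
    out = []
    for img in images:
        h, w = img.shape[:2]
        if random.random() < 0.5:                  # coin flip: affine transform
            angle = random.uniform(-15, 15)        # assumed rotation range
            tx, ty = random.randint(-10, 10), random.randint(-10, 10)
            m = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
            m[:, 2] += (tx, ty)                    # random translation
            img = cv2.warpAffine(img, m, (w, h))
        if random.random() < 0.5:                  # coin flip: colour/brightness
            img = np.clip(img.astype(np.float32) * random.uniform(0.8, 1.2),
                          0, 255).astype(np.uint8)
        out.append(img)
    return out
```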
  • the convolutional neural network comprising ‘n’ convolutional stacks applies a convolution operation to the input and passes an intermediate result to a next layer.
  • Each convolutional stack comprises a plurality of convolutional layers.
  • a first convolution stack is configured to convolve pixels from an input with a plurality of filters to generate a first indicator map.
  • the first convolutional stack also comprises a first subsampling layer configured to reduce a size and variation of the first indicator map.
  • the first convolutional layer of the first convolutional stack is configured to convolve pixels from the input with a plurality of filters.
  • the first convolutional stack passes an intermediate result to the next layer.
  • each convolutional stack comprises a sub-sampling layer configured to reduce a size (width and height) of the indicators stack.
  • the input is analyzed based on reference data to provide a corresponding output.
  • the first analysis means 103h and the second analysis means 103i train the first convolutional neural network and the second convolutional neural network respectively using the batches of augmented reference fundus images.
  • the segregation means 103e groups the validation fundus images of the validation set into a plurality of batches. Each batch comprises multiple validation fundus images.
  • the first analysis means 103h validates each of the validation fundus images in each batch of the validation set using the first convolutional neural network.
  • the first analysis means 103h compares a result of the validation against a corresponding label of the validation fundus image by referring to the reference ground-truth file.
  • the first analysis means 103h thus evaluates a convolutional network performance of the first convolutional neural network for the batch of validation set.
  • the convolutional network performance of the first convolutional neural network refers to the detection of the initial quality level for each of the reference fundus images.
  • the first analysis means 103h optimizes the first convolutional neural network parameters using an optimizer, for example, a Nadam optimizer which is an Adam optimizer with Nesterov Momentum.
  • the optimizer iteratively optimizes the parameters of the convolutional neural network during multiple iterations using the training set.
  • each iteration refers to a batch of the training set.
  • the first analysis means 103h evaluates the convolutional network performance of the first convolutional neural network after a predefined number of iterations on the validation set.
  • each iteration refers to a batch of the validation set.
  • the first analysis means 103h trains the first convolutional neural network based on the augmented training set and tests the convolutional network based on the validation set. Upon completion of training and validation of the first convolutional neural network based on the convolutional network performance, the first analysis means 103h is ready to assess the quality level of the input fundus image.
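  • A minimal sketch of such a training and validation cycle is given below, assuming Keras (the specification names no framework) and hypothetical batch generators train_batches and val_batches; build_quality_cnn is a placeholder for the network construction.

```python
import tensorflow as tf

def train_quality_network(build_quality_cnn, train_batches, val_batches,
                          epochs=10, eval_every=100):
    """Train with the Nadam optimizer, periodically evaluating on validation batches."""
    model = build_quality_cnn()                           # hypothetical model factory
    model.compile(optimizer=tf.keras.optimizers.Nadam(),  # Adam with Nesterov momentum
                  loss='binary_crossentropy', metrics=['accuracy'])
    step = 0
    for _ in range(epochs):
        for images, labels in train_batches():        # one iteration per training batch
            model.train_on_batch(images, labels)
            step += 1
            if step % eval_every == 0:                 # after a predefined number of
                for v_images, v_labels in val_batches():   # iterations, validate
                    loss, acc = model.test_on_batch(v_images, v_labels)
    return model
```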
  • the second analysis means 103i analyzes the fundus image using the second convolutional neural network by considering the determined final quality level based on the user selection criterion and does not consider the parameters of the image capturing device for analysis.
  • the second analysis means 103i validates each of the validation fundus images in each batch of the validation set using the second convolutional neural network.
  • the second analysis means 103i compares a result of the validation against a corresponding label of the validation fundus image by referring to the reference ground-truth file.
  • the second analysis means 103i thus evaluates a convolutional network performance of the second convolutional neural network for the batch of validation set.
  • the convolutional network performance of the second convolutional neural network refers to the identification of the indicators throughout the reference fundus image and detection of the state of the retinal disease based on the identified indicators.
  • the second analysis means 103i optimizes the second convolutional neural network parameters using an optimizer, for example, a Nadam optimizer which is an Adam optimizer with Nesterov Momentum.
  • the optimizer iteratively optimizes the parameters of the second convolutional neural network during multiple iterations using the training set.
  • each iteration refers to a batch of the training set.
  • the second analysis means 103i evaluates the convolutional network performance of the second convolutional neural network after a predefined number of iterations on the validation set.
  • each iteration refers to a batch of the validation set.
  • the second analysis means 103i trains the second convolutional neural network based on the augmented training set and tests the second convolutional neural network based on the validation set. Upon completion of training and validation of the second convolutional neural network based on the convolutional network performance, the second analysis means 103i is ready to analyze the input fundus image.
  • the reception means 103a receives the input fundus image from, for example, the image capturing device.
  • the pre-processing means 103d pre-processes the input fundus image similar to that of the reference fundus image.
  • the fundus image analysis application 103 further comprises a test-time augmentation means 103g to test-time augment the preprocessed input fundus image.
  • the test-time augmentation means 103g converts the preprocessed input fundus image into a plurality of test time images, for example, twenty test time images, using deterministic augmentation.
  • the test-time augmentation means 103g follows the same steps to augment the input fundus image as that of the reference fundus image, except that the augmentations are deterministic.
  • the test-time augmentation means 103g generates deterministically augmented twenty test time images of the preprocessed input fundus image.
  • the test-time augmentation means 103g transmits the deterministically augmented twenty test time images to either the first analysis means 103h or the second analysis means 103i.
  • the test-time augmentation means 103g transmits the deterministically augmented twenty test time images to the first analysis means 103h when the user selection criterion is to consider the quality level of the fundus image before the analysis of the fundus image.
  • the first analysis means 103h processes the deterministically augmented twenty test time images of the preprocessed input fundus image using the first convolutional neural network comprising ‘n’ convolutional stacks.
  • the predicted probabilities of the twenty test time images are averaged over to get a final prediction result.
  • the final prediction result provides a probability value for each grade (for example, good and bad) of quality level associated with the input fundus image.
  • the probability value is an indication of a confidence denoting the quality level of the input fundus image.
  • the output indicates the quality level associated with the input fundus image.
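  • The test-time averaging can be sketched in a few lines; model.predict is assumed to return, per test time image, the two quality probabilities (a ‘goodness’ and a ‘badness’) in the range [0, 1].

```python
import numpy as np

def predict_with_tta(model, test_time_images):
    """Average predictions over the deterministically augmented test time images."""
    probs = model.predict(np.stack(test_time_images))   # e.g. shape (20, 2)
    final = probs.mean(axis=0)                           # average over the copies
    return {'good': float(final[0]), 'bad': float(final[1])}
```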
  • Figure 2 exemplarily illustrates the first convolutional neural network to compute the quality level of the input fundus image.
  • the deterministically augmented twenty test time images of the preprocessed input fundus image are the input to a first convolutional stack (CS1) of the first convolutional neural network.
  • Each of the deterministically augmented twenty test time images of the preprocessed input fundus image is processed by the first convolutional neural network.
  • the deterministically augmented test time image is, for example, represented as a matrix of width 224 pixels and height 224 pixels with 3 channels. That is, the deterministically augmented test time image is a representative array of pixel values of 224 x 224 x 3.
  • the first convolution stack (CS1) is configured to convolve pixels from the deterministically augmented test time image with a filter to generate a first feature map.
  • the first convolutional stack (CS1) also comprises a first subsampling layer configured to reduce a size and variation of the first feature map.
  • the output of the first convolutional stack (CS1) is a reduced input fundus image represented as a matrix of width 64 pixels and height 64 pixels with n1 channels. That is, the output is a representative array of pixel values 64 x 64 x n1.
  • This is the input to a second convolutional stack (CS2), which again convolves the representative array of pixel values 64 x 64 x n1 to generate a second feature map.
  • the second convolutional stack (CS2) comprises a second subsampling layer configured to reduce a size and variation of the second feature map to a representative array of pixel values of 16 x 16 x n2, n2 being the number of channels.
  • the representative array of pixel values of 16 x 16 x n2 is an input to a third convolutional stack (CS3).
  • the third convolutional stack (CS3) convolves the representative array of pixel values 16 x 16 x n2 to generate a third feature map.
  • the third convolutional stack (CS3) comprises a third subsampling layer configured to reduce a size and variation of the third feature map to a representative array of pixel values of 8 x 8 x n3, n3 representing the number of channels.
  • a fourth convolutional stack (CS4) convolves the representative array of pixel values 8 x 8 x n3 to generate a fourth feature map.
  • the fourth convolutional stack (CS4) comprises a fourth subsampling layer configured to reduce a size and variation of the fourth feature map.
  • a probability block (P) provides a probability of the quality level associated with the input fundus image. The predicted probabilities of the twenty test time images are averaged to obtain a final prediction result. The final prediction result is the probability of the initial quality level of the input fundus image, expressed as two values within the range [0, 1] indicating the gradable quality measure - a ‘goodness’ and a ‘badness’ of the input fundus image.
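A minimal PyTorch sketch of a four-stack quality network of this shape is given below. The channel counts n1 to n4, the filter sizes and the internal strides are not disclosed, so small assumed values and adaptive pooling (to reach the stated 64, 16 and 8 pixel spatial sizes) are used purely for illustration; only the stack-and-subsample structure and the two-value probability output follow the description.

```python
# Illustrative sketch of the first (quality) convolutional neural network.
import torch
import torch.nn as nn

def conv_stack(in_ch, out_ch, out_size):
    """Convolve to produce a feature map, then subsample to the stated spatial size."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.AdaptiveMaxPool2d(out_size))           # subsampling layer reducing size and variation

n1, n2, n3, n4 = 32, 64, 128, 128                 # assumed channel counts
quality_net = nn.Sequential(
    conv_stack(3,  n1, 64),                       # CS1: 224x224x3 -> 64x64xn1
    conv_stack(n1, n2, 16),                       # CS2: 64x64xn1  -> 16x16xn2
    conv_stack(n2, n3, 8),                        # CS3: 16x16xn2  -> 8x8xn3
    conv_stack(n3, n4, 1),                        # CS4: further reduction
    nn.Flatten(),
    nn.Linear(n4, 2),                             # P: two grades ("good" and "bad")
    nn.Softmax(dim=1))                            # probabilities within [0, 1]

# Average the predictions of the twenty deterministically augmented views.
final = quality_net(torch.randn(20, 3, 224, 224)).mean(dim=0)
```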
  • the final quality level determination means considers the detected initial quality level, the user defined quality threshold and the parameters of the image capturing device to determine the final quality level of the fundus image.
  • the user defined threshold is the user defined parameter to vary the quality level of the input fundus image.
  • the user defined threshold is user defined to increase flexibility of the system 100.
  • the user defined threshold is the variable factor which may be used to vary the quality level of the input fundus image to conveniently suit the requirements of the user, for example, medical practitioner.
  • the user defined threshold is a numeric value within the range of [0, 1]. Here, 0 is the lowest value and 1 is the highest value of the user defined threshold.
  • the parameters of the image capturing device are, for example, a manufacturer and version of the image capturing device, a resolution, an illumination factor, a field of view etc.
  • the final quality level determination means determines a predefined score for the image capturing device based on the parameters of the image capturing device. This predefined score for the image capturing device characteristics is used to assess the quality of the input fundus image.
  • the predefined score for the image capturing device denotes a superiority of the image capturing device.
  • the predefined score for the image capturing device is a numeric value within the range of [0, 1]. Here, 0 is the lowest value and 1 is the highest value of the predefined score for the image capturing device.
  • the predefined score for the image capturing device for multiple manufacturers of image capturing device is initially stored in the database 104a by the user of the fundus image analysis application 103.
  • the flexibility of the system 100 is increased, thereby providing customized results for the input fundus image captured using the image capturing device of multiple manufacturers.
  • the first analysis means 103h assesses the final quality level of the input fundus image based on the factors - the probability values provided by the first convolutional neural network, the user defined threshold and the parameters of the image capturing device.
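The exact rule for combining these three factors is not disclosed. The following hypothetical sketch shows one plausible combination, in which the device score scales the ‘goodness’ probability before comparison with the user defined threshold; the weighting is an assumption made only to illustrate the idea, not the application's actual rule.

```python
# Hedged sketch of a final quality decision; the combination rule is assumed.
def final_quality_level(p_good: float, user_threshold: float, device_score: float) -> str:
    """
    p_good:          'goodness' probability from the first CNN, in [0, 1]
    user_threshold:  user defined quality threshold, in [0, 1]
    device_score:    predefined score for the image capturing device, in [0, 1]
    """
    # Illustrative rule: a superior device is trusted slightly more, so its score
    # nudges the effective confidence before comparison with the threshold.
    effective_confidence = 0.5 * p_good + 0.5 * (p_good * device_score)
    return "good" if effective_confidence >= user_threshold else "bad"

# Example: a borderline image from a high-end camera with a lenient threshold.
print(final_quality_level(p_good=0.62, user_threshold=0.5, device_score=0.9))  # prints "good"
```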
  • the test-time augmentation means 103g transmits the deterministically augmented twenty test time images to the second analysis means 103i when the user selection criterion is to directly analyze the input fundus image to identify the indicators throughout the fundus image and to detect the state of the retinal disease based on the identified indicators.
  • the second analysis means 103i processes the deterministically augmented twenty test time images of the preprocessed input fundus image using the second convolutional neural network comprising ‘m’ convolutional stacks.
  • the predicted probabilities of the twenty test time images are averaged to obtain a final prediction result.
  • the final prediction result provides a probability value for each retinal disease and a corresponding retinal disease severity level associated with the input fundus image.
  • the probability value is an indication of a confidence that the identified indicators are of a particular retinal disease and of a corresponding severity of the retinal disease.
  • the output indicates a presence or absence of a retinal disease and related severity of the retinal disease associated with the input fundus image.
  • Figure 3 exemplarily illustrates the second convolutional neural network to compute the presence or absence of a retinal disease and related severity of the retinal disease associated with the input fundus image.
  • the deterministically augmented twenty test time images of the preprocessed input fundus image are the input to a first convolutional stack (CS1) of the convolutional network.
  • Each of the deterministically augmented twenty test time images is processed by the convolutional network.
  • the deterministically augmented test time image is, for example, represented as a matrix of width 448 pixels and height 448 pixels with ‘3’ channels. That is, the deterministically augmented test time image is a representative array of pixel values of 448 x 448 x 3.
  • the input to the first convolutional stack (CS1) is a color image of size 448 x 448.
  • the first convolutional stack (CS1) comprises the following sublayers - a first convolutional layer, a first subsampling layer, a second convolutional layer, a third convolutional layer and a second subsampling layer, in that order.
  • the output of a sublayer is an input to a consecutive sublayer.
  • a subsampling layer is configured to reduce a size and variation of its input and a convolutional layer convolves its input with a plurality of filters, for example, filters of size 3x3.
  • the output of the first convolutional stack (CS1) is a reduced image represented as a matrix of width 112 pixels and height 112 pixels with m1 channels. That is, the output of the first convolutional stack (CS1) is a representative array of pixel values 112 x 112 x m1.
  • the second convolutional stack (CS2) comprises the following sublayers - four convolutional layers and a subsampling layer arranged in the same order. Again, the output of a sublayer is an input to a consecutive sublayer.
  • the second convolutional stack (CS2) convolves the representative array of pixel values 112 x 112 x m1 and reduces it to a representative array of pixel values of 56 x 56 x m2.
  • the representative array of pixel values of 56 x 56 x m2 is an input to a third convolutional stack (CS3).
  • the third convolutional stack (CS3) comprises the following sublayers - four convolutional layers and a subsampling layer arranged in the same order. Again, the output of a sublayer is an input to a consecutive sublayer.
  • the third convolutional stack (CS3) convolves the representative array of pixel values 56 x 56 x m2 and reduces it to a representative array of pixel values of 28 x 28 x m3.
  • the representative array of pixel values of 28 x 28 x m3 is an input to a fourth convolutional stack (CS4).
  • the fourth convolutional stack (CS4) comprises the following sublayers - four convolutional layers and a subsampling layer arranged in the same order. Again, the output of a sublayer is an input to a consecutive sublayer.
  • the fourth convolutional stack (CS4) convolves the representative array of pixel values 28 x 28 x m3 and reduces it to a representative array of pixel values of 14 x 14 x m4.
  • the representative array of pixel values of 14 x 14 x m4 is an input to a fifth convolutional stack (CS5).
  • the fifth convolutional stack (CS5) comprises the following sublayers - four convolutional layers and a subsampling layer arranged in the same order. Again, the output of a sublayer is an input to a consecutive sublayer.
  • the fifth convolutional stack (CS5) convolves the representative array of pixel values 14 x 14 x m4 and reduces it to a representative array of pixel values of 7 x 7 x m5.
  • the representative array of pixel values of 7 x 7 x m5 is a first input to a concatenation block (C).
  • the output of the third convolutional stack (CS3) is an input to a first subsampling block (SS1).
  • the representative array of pixel values of 28 x 28 x m3 is the input to the first subsampling block (SS1).
  • the first subsampling block (SS1) reduces the input with a stride of 4 to obtain an output of a representative array of pixel values of 7 x 7 x m3. This is a second input to the concatenation block (C).
  • the output of the fourth convolutional stack (CS4) is an input to a second subsampling block (SS2).
  • the representative array of pixel values of 14 x 14 x m4 is the input to the second subsampling block (SS2).
  • the second subsampling block (SS2) reduces the input with a stride of 2 to obtain an output of a representative array of pixel values of 7 x 7 x m4. This is a third input to the concatenation block (C).
  • the concatenation block (C) receives the first input from the fifth convolutional stack (CS5), the second input from the first subsampling block (SS1) and the third input from the second subsampling block (SS2).
  • the concatenation block (C) concatenates the three inputs received to generate an output of value 7 x 7 x (m5 + m4 + m3).
  • the output of the concatenation block (C) is an input to a probability block (P).
  • the probability block (P) provides a probability of the presence or absence of the retinal disease and related severity of the retinal disease.
  • the predicted probabilities of the twenty test time images are averaged to obtain a final prediction result.
  • the output of the convolutional network provides a probability value for each retinal disease and a corresponding retinal disease severity level associated with the input fundus image.
  • the probability block (P) as shown in Figure 3 provides five values by considering the retinal disease to be DR.
  • the output of the probability block comprises five values depicting the probability for each DR severity level - DR0 (no DR), DR1 (mild DR level), DR2 (moderate DR level), DR3 (severe DR level) and DR4 (proliferative DR level).
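The sketch below condenses the Figure 3 architecture into PyTorch. Only the stated spatial sizes (448 to 112, 56, 28, 14 and 7), the stride-4 and stride-2 subsampling skip paths, the concatenation and the five-value DR output are taken from the description; the channel widths m1 to m5, the exact sublayer counts and the classifier head are assumptions made for illustration.

```python
# Illustrative sketch of the second (disease) convolutional neural network of Figure 3.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch, n_convs):
    """n_convs 3x3 convolutional layers followed by one subsampling layer (halves H and W)."""
    layers = []
    for i in range(n_convs):
        layers += [nn.Conv2d(in_ch if i == 0 else out_ch, out_ch, 3, padding=1),
                   nn.ReLU(inplace=True)]
    layers.append(nn.MaxPool2d(2))
    return nn.Sequential(*layers)

class DiseaseNet(nn.Module):
    def __init__(self, m=(32, 64, 128, 256, 256), n_classes=5):
        super().__init__()
        m1, m2, m3, m4, m5 = m
        # CS1: conv, subsample, conv, conv, subsample (448 -> 112)
        self.cs1 = nn.Sequential(
            nn.Conv2d(3, m1, 3, padding=1), nn.ReLU(inplace=True), nn.MaxPool2d(2),
            nn.Conv2d(m1, m1, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(m1, m1, 3, padding=1), nn.ReLU(inplace=True), nn.MaxPool2d(2))
        self.cs2 = conv_block(m1, m2, 4)                  # 112 -> 56
        self.cs3 = conv_block(m2, m3, 4)                  # 56  -> 28
        self.cs4 = conv_block(m3, m4, 4)                  # 28  -> 14
        self.cs5 = conv_block(m4, m5, 4)                  # 14  -> 7
        self.ss1 = nn.MaxPool2d(kernel_size=4, stride=4)  # SS1: 28 -> 7
        self.ss2 = nn.MaxPool2d(kernel_size=2, stride=2)  # SS2: 14 -> 7
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(m5 + m4 + m3, n_classes))

    def forward(self, x):                                 # x: (N, 3, 448, 448)
        f3 = self.cs3(self.cs2(self.cs1(x)))              # 28 x 28 x m3
        f4 = self.cs4(f3)                                 # 14 x 14 x m4
        f5 = self.cs5(f4)                                 # 7  x 7  x m5
        fused = torch.cat([f5, self.ss1(f3), self.ss2(f4)], dim=1)  # 7 x 7 x (m5 + m4 + m3)
        return self.head(fused)                           # logits for DR0..DR4

# Average the softmax probabilities of the twenty augmented views.
views = torch.randn(20, 3, 448, 448)                      # stands in for the test time images
probs = torch.softmax(DiseaseNet()(views), dim=1).mean(dim=0)
```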
  • the GUI 103k displays the output of either the second analysis means 103i alone or the outputs of both the first analysis means 103h and the second analysis means 103i. That is, the GUI 103k displays the presence or absence of a retinal disease and/or related severity of the retinal disease associated with the input fundus image when the user selection criterion does not involve the detection of the quality level of the input fundus image.
  • the GUI 103k displays the quality level of the input fundus image along with the presence or absence of a retinal disease and/or related severity of the retinal disease associated with the input fundus image.
  • suitable suggestions with a set of instructions to the user may also be included and provided via a pop-up box displayed on a screen.
  • the fundus image analysis application 103 may also generate a report comprising the input fundus image, the type of the retinal disease and the severity of the retinal disease, and communicate it to the patient via electronic mail.
  • the report could also be stored in the database 104a of the system 100.
  • the system 100 detects the presence of several diseases, for example, diabetes, stroke, hypertension, cardiovascular diseases, etc., and is not limited to retinal diseases, based on changes in the retinal features.
  • the second analysis means 103i trains the second convolutional neural network to identify and classify the severity of these diseases in the fundus image of the patient.
  • the second analysis means 103i is initially adapted to generate the second label for the fundus image using the second convolutional neural network.
  • the second label is a state of a retinal disease. That is, the second analysis means 103i initially trains the second convolutional neural network to analyze the fundus image by considering the determined final quality level based on the user selection criterion.
  • the second analysis means 103i is trained to analyze the fundus image to identify the indicators throughout the fundus image and to detect the state of the retinal disease based on the identified indicators.
  • the second analysis means 103i makes use of this knowledge gained for analyzing the fundus image and applies it to a different but related problem, that is, for analyzing the fundus image specific to the parameters of the image capturing device used to capture the fundus image of the patient.
  • the second analysis means 103i extracts the second label associated with the fundus image and transfers it to the third convolutional neural network to generate the third label for the fundus image.
  • the second analysis means 103i “transfers” the learned second label associated with the fundus image to obtain customized results for the fundus image depending on the parameters of the image capturing devices.
  • the third label for the fundus image provides customized analysis of the fundus image stating the state of the retinal disease for specific manufacturer and/or version of the image capturing device, for example, fundus camera.
  • the system 100 can be easily customized to analyse the fundus images captured using different image capturing devices.
  • the system 100 defines the process of transfer learning, that is, transferring the knowledge learned from generic fundus image analysis to analysis of fundus images specific to manufacturers and/or version of the image capturing device used to capture the fundus images.
  • the system 100 transfers the knowledge learned to generate the second label for the fundus image by trickling high level information down to train the third convolutional neural network to generate the third label for the fundus image.
  • the second analysis means 103i refers to a secondary reference dataset to train and validate the third convolutional neural network.
  • the secondary reference dataset is different from the first reference dataset.
  • the secondary reference dataset also comprises a plurality of fundus images but specific to a set of parameters of the image capturing device.
  • the secondary reference dataset comprises the fundus images captured using the image capturing device of a specific manufacturer and version.
  • the second analysis means 103i trains and validates the third convolutional neural network using the secondary reference dataset but the parameters of the third convolutional neural network are initialized from the previously trained second convolutional neural network using the first reference dataset.
  • the third convolutional neural network is ready to detect the state of the retinal disease based on the identified indicators for a specific set of parameters of the image capturing device. This process of transfer learning significantly increases the performance of the third convolutional neural network to provide customized results for the fundus images captured using different image capturing devices.
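As a hedged illustration of this transfer-learning step, the snippet below initialises a third network from the trained second network and fine-tunes it on a manufacturer-specific dataset. It reuses the hypothetical DiseaseNet class from the earlier sketch; the secondary data loader, learning rate and loss are likewise assumptions rather than disclosed details.

```python
# Illustrative transfer learning: initialise the third CNN from the second CNN,
# then fine-tune on the secondary (device-specific) reference dataset.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

second_cnn = DiseaseNet()                             # assumed: already trained on the first reference dataset
third_cnn = DiseaseNet()
third_cnn.load_state_dict(second_cnn.state_dict())    # initialise from the second CNN's parameters

# Placeholder secondary reference dataset (fundus images from one manufacturer/version).
secondary_loader = DataLoader(
    TensorDataset(torch.randn(8, 3, 448, 448), torch.randint(0, 5, (8,))), batch_size=4)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(third_cnn.parameters(), lr=1e-4)   # small learning rate for fine-tuning

third_cnn.train()
for images, labels in secondary_loader:
    optimizer.zero_grad()
    loss = criterion(third_cnn(images), labels)       # DiseaseNet returns logits
    loss.backward()
    optimizer.step()
```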
  • the fundus image analysis application 103 provides suitable interactive elements 103j such as a drop down menu, a button, etc., to select a set of parameters of the image capturing device, such as a specific manufacturer of the image capturing device, while capturing the fundus image of the patient.
  • the fundus image analysis application 103 analyses the fundus image based on the selection of the set of parameters of the image capturing device.
  • an appropriate interactive element 103j is provided by the fundus image analysis application 103.
  • the user can upload an existing fundus image of the patient for analysis as shown in Figure 13.
  • an option is also provided to upload information regarding the set of parameters of the image capturing device.
  • the system 100 provides specific analysis of the fundus image corresponding to the set of parameters of the image capturing device. If no information regarding the set of parameters of the image capturing device is uploaded by the user, the system 100 provides generic analysis of the fundus image.
  • Figure 4 exemplarily illustrates the architecture of a computer system 400 employed by the fundus image analysis application 103.
  • the fundus image analysis application 103 of the computer implemented system 100 exemplarily illustrated in Figure 1 employs the architecture of the computer system 400 exemplarily illustrated in Figure 4.
  • the computer system 400 is programmable using a high level computer programming language.
  • the computer system 400 may be implemented using programmed and purposeful hardware.
  • the fundus image analysis platform hosting the fundus image analysis application 103 communicates with user devices, for example, 101a, 101b, 101c, etc., of a user registered with the fundus image analysis application 103 via the network 102.
  • the network 102 is, for example, the internet, a local area network, a wide area network, a wired network, a wireless network, a mobile communication network, etc.
  • the computer system 400 comprises, for example, a processor 401, a memory unit 402 for storing programs and data, an input/output (I/O) controller 403, a network interface 404, a data bus 405, a display unit 406, input devices 407, fixed disks 408, removable disks 409, output devices 410, etc.
  • processor refers to any one or more central processing unit (CPU) devices, microprocessors, an application specific integrated circuit (ASIC), computers, microcontrollers, digital signal processors, logic, an electronic circuit, a field-programmable gate array (FPGA), etc., or any combination thereof, capable of executing computer programs or a series of commands, instructions, or state transitions.
  • the processor 401 may also be realized as a processor set comprising, for example, a math or graphics co-processor and a general purpose microprocessor.
  • the processor 401 is selected, for example, from the Intel® processors such as the Itanium® microprocessor or the Pentium® processors, Advanced Micro Devices (AMD®) processors such as the Athlon® processor, MicroSPARC® processors, UltraSPARC® processors, hp® processors, International Business Machines (IBM®) processors, the MIPS® reduced instruction set computer (RISC) processor of MIPS Technologies, Inc., RISC based computer processors of ARM Holdings, etc.
  • the computer implemented system 100 disclosed herein is not limited to a computer system 400 employing a processor 401 but may also employ a controller or a microcontroller.
  • the memory unit 402 is used for storing data, programs, and applications.
  • the memory unit 402 is, for example, a random access memory (RAM) or any type of dynamic storage device that stores information for execution by the processor 401.
  • the memory unit 402 also stores temporary variables and other intermediate information used during execution of the instructions by the processor 401.
  • the computer system 400 further comprises a read only memory (ROM) or another type of static storage device that stores static information and instructions for the processor 401.
  • the I/O controller 403 controls input actions and output actions performed by the fundus image analysis application 103.
  • the network interface 404 enables connection of the computer system 400 to the network 102.
  • the fundus image analysis platform hosting the fundus image analysis application 103 connects to the network 102 via the network interface 404.
  • the network interface 404 comprises, for example, one or more of a universal serial bus (USB) interface, a cable interface, an interface implementing Wi-Fi® of the Wireless Ethernet Compatibility Alliance, Inc., a FireWire® interface of Apple, Inc., an Ethernet interface, a digital subscriber line (DSL) interface, a token ring interface, a peripheral component interconnect (PCI) interface, a local area network (LAN) interface, a wide area network (WAN) interface, interfaces using serial protocols, interfaces using parallel protocols, and Ethernet communication interfaces, asynchronous transfer mode (ATM) interfaces, interfaces based on transmission control protocol (TCP)/internet protocol (IP), radio frequency (RF) technology, etc.
  • the data bus 405 permits communications between the means/modules (103a, 103b, 103c, 103d, 103e, 103f, 103g, 103h, 103i, 103j and 103k) of the fundus image analysis application 103.
  • the display unit 406, via the GUI 103k, displays information, display interfaces, interactive elements 103j such as drop down menus, text fields, checkboxes, text boxes, floating windows, hyperlinks, etc., for example, for allowing the user to enter inputs associated with the patient.
  • the display unit 406 comprises a liquid crystal display, a plasma display, etc.
  • the input devices 407 are used for inputting data into the computer system 400.
  • a user for example, an operator, registered with the fundus image analysis application 103 uses one or more of the input devices 407 of the user devices, for example, 101a, 101b, 101c, etc., to provide inputs to the fundus image analysis application 103.
  • a user may enter a patient’s profile information, the patient’s medical history, etc., using the input devices 407.
  • the input devices 407 are, for example, a keyboard such as an alphanumeric keyboard, a touch pad, a joystick, a computer mouse, a light pen, a physical button, a touch sensitive display device, a track ball, etc.
  • Computer applications and programs are used for operating the computer system 400.
  • the programs are loaded onto the fixed disks 408 and into the memory unit 402 of the computer system 400 via the removable disks 409.
  • the computer applications and programs may be loaded directly via the network 102.
  • the output devices 410 output the results of operations performed by the fundus image analysis application 103.
  • the processor 401 executes an operating system, for example, the Linux® operating system, the Unix® operating system, any version of the Microsoft® Windows® operating system, the Mac OS of Apple Inc., the IBM® OS/2, VxWorks® of Wind River Systems, Palm OS®, the Solaris operating system, the Android operating system, the Windows Phone™ operating system developed by Microsoft Corporation, the iOS operating system of Apple Inc., etc.
  • the computer system 400 employs the operating system for performing multiple tasks.
  • the operating system is responsible for management and coordination of activities and sharing of resources of the computer system 400.
  • the computer system 400 recognizes, for example, inputs provided by the user using one of the input devices 407, the output display, files, and directories stored locally on the fixed disks 408.
  • the operating system on the computer system 400 executes different programs using the processor 401.
  • the processor 401 retrieves instructions for executing the modules (103a, 103b, 103c, 103d, 103e, 103f, 103g, 103h, 103i, 103j and 103k) of the fundus image analysis application 103 from the memory unit 402.
  • a program counter determines the location of the instructions in the memory unit 402.
  • the program counter stores a number that identifies the current position in the program of each of the modules (103a, 103b, 103c, 103d, 103e, 103f, 103g, 103h, 103i, 103j and 103k) of the fundus image analysis application 103.
  • the instructions fetched by the processor 401 from the memory unit 402 are decoded.
  • the instructions are stored in an instruction register in the processor 401. After decoding, the processor 401 executes the instructions.
  • Figure 5 exemplarily illustrates the screenshot of the GUI 103k provided by the system 100, displaying a log-in screen of the system 100.
  • the log-in screen comprises text boxes 501 and 502 to permit the user to enter a user name and password, respectively, and a button 503.
  • the log-in process takes place upon selection of button 503 using the user input device, for example, a keyboard, a mouse, etc.
  • the user enters his credentials to log-in to the system 100.
  • the menu screen 600 is displayed to the user of the system 100 as shown in Figure 6.
  • Figure 6 exemplarily illustrates the screenshot of the GUI 103k provided by the system 100, displaying the menu screen 600 of the system 100. This provides flexibility to the user for navigating between the components accessed through the menu screen 600, that is, an add new patient screen 601, an existing patients screen 602 and a report screen 603.
  • Figure 7 exemplarily illustrates the screenshot of the GUI 103k provided by the system 100, displaying the add new patient screen 601 of the system 100.
  • the information associated with the new patient such as the personal details about the patient, medical condition of the patient, etc., are recorded by the user of the system 100 using the add new patient screen 601.
  • Figure 8 exemplarily illustrates the screenshot of the GUI 103k provided by the system 100, displaying the existing patients screen 602 of the system 100.
  • the patients list allows access to individual information in the form of a patient's profile via the “View Profile” option.
  • the reports of the patient comprising the patient's fundus images and analysis details of the patient's fundus images can be accessed via the “View Report” option provided for each patient.
  • an alphabetical list of the patients is presented by default.
  • Each record includes a patient ID, a patient name, a patient email address, a patient mobile number, a patient age, a patient city and a date of creation of the patient's profile.
  • a search option to search for a specific patient in the “Patient List” is also provided in the existing patients screen 602.
  • Figure 9 exemplarily illustrates the screenshot of the GUI 103k provided by the system 100, displaying the profile screen 900 of the existing patient of the system 100.
  • the existing patient profile screen provides an “Edit Patientinfo” option to edit the information related to the existing patient.
  • the existing patient profile screen also provides a “View Report” option to view previous reports of the existing patient.
  • the existing patient profile screen provides a “View all Images” option to view the previously captured fundus images of the existing patient.
  • Figure 10 exemplarily illustrates the screenshot of the GUI 103k provided by the system 100, displaying the existing images 1000 of the existing patient of the system 100.
  • when the user of the system 100 clicks on the “View all Images” option as shown in Figure 9, the previously captured fundus images of the existing patient, along with the state of the retinal disease, that is, DR, are displayed on the GUI 103k provided by the system 100.
  • Figure 11 exemplarily illustrates the screenshot of the GUI 103k provided by the system 100, displaying the fundus image rendering screen 1100 of the system 100. The live mode of the fundus of the patient is displayed in a box 1101 of the fundus image rendering screen 1100.
  • the user has the options to start 1102 and stop 1103 the display of the live mode of the fundus of the patient.
  • the user also has an option to capture 1104 the fundus image during the display of the live mode of the fundus of the patient.
  • the user can select posterior or anterior of the eye along with the details of the eye - left eye or right eye for the captured fundus image.
  • Figure 12 exemplarily illustrates the screenshot of the GUI 103k provided by the system 100, displaying the fundus image analysis screen 1200 of the system 100 after the fundus image of the patient is captured by the user.
  • the fundus image analysis application 103 provides an option to either directly analyze the fundus image of the patient or determine the quality level of the fundus image of the patient before the analysis of the fundus image. Either or both the fundus images (representing the left eye and/or the right eye of the patient) can be selected by the user for analysis.
  • the second analysis means 103i of the fundus image analysis application 103 analyses the fundus images of the patient without considering the final quality level of the fundus image.
  • the second analysis means 103i analyses each of the fundus images of the patient to determine the state of a retinal disease.
  • the second analysis means 103i of the fundus image analysis application 103 analyses the fundus images of the patient considering the final quality level of the fundus image. That is, the second analysis means 103i considers the output of the first analysis means 103h to either continue or abort with the analysis of the fundus image.
  • the first analysis means 103h determines the final quality level of the fundus image and transmits this output to the second analysis means 103i.
  • the second analysis means 103i analyses the fundus image to identify the indicators and determine the state of the retinal disease (in this case, diabetic retinopathy) when the output of the first analysis means 103h indicates that the final quality level of the fundus image is ‘good’.
  • the second analysis means 103i aborts the analysis when the output of the first analysis means 103h indicates that the final quality level of the fundus image is ‘bad’.
  • the buttons “Analyse” and “Image Quality and analyse” are the interactive elements 103j on the GUI 103k for enabling the analysis of the fundus image of the patient.
  • the user selection criterion defines a selection process of the user which is realized by a clickable event of either the “Analyse” or the “Image Quality and analyse” button on the GUI 103k via the user input device such as a mouse, a trackball, a joystick, etc.
  • when the user selects the “Analyse” option, the user in turn triggers the second analysis means 103i of the fundus image analysis application 103 to analyze the fundus image without the additional process of determining the final quality level of the fundus image before the analysis.
  • when the user selects the “Image Quality and analyse” option, the user triggers the second analysis means 103i of the fundus image analysis application 103 to analyze the fundus image considering the final quality level of the fundus image determined by the first analysis means 103h.
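A small hypothetical sketch of the control flow behind the two options is shown below: “Analyse” runs the disease analysis directly, while “Image Quality and analyse” first grades the image and aborts on a ‘bad’ result. The function names are placeholders, not the application's actual API.

```python
# Hypothetical orchestration of the two user selection criteria.
def first_analysis(image):
    """Placeholder for the quality pipeline (first analysis means 103h)."""
    return "good"

def second_analysis(image):
    """Placeholder for the disease pipeline (second analysis means 103i)."""
    return "No abnormalities found"

def analyse_fundus_image(image, check_quality: bool) -> dict:
    if check_quality:                          # "Image Quality and analyse"
        if first_analysis(image) == "bad":
            return {"status": "aborted", "reason": "Bad Image"}
    finding = second_analysis(image)           # "Analyse"
    return {"status": "analysed", "finding": finding}

print(analyse_fundus_image(image=None, check_quality=True))
```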
  • Figure 13 exemplarily illustrates the screenshot of the GUI 103k provided by the system 100, displaying the fundus image upload screen of the system 100 to upload the fundus image of the patient for analysis.
  • the fundus image of the patient is an already existing fundus image and can be uploaded by the user of the fundus image analysis application 103.
  • the “Upload” button 1301 is the interactive element 103j on the GUI 103k using which the user can upload an existing fundus image of the patient.
  • the existing fundus image may be located, for example, on the database 104a.
  • the user may upload the fundus image captured by the image capturing device of a particular manufacturer using the “Upload” button 1301 to analyse the fundus image for the particular manufacturer.
  • Figure 14 exemplarily illustrates the screenshot of the GUI 103k provided by the system 100, displaying the fundus image analysis screen 1200 of the system 100 when the user selects the option “Image Quality and analyse”.
  • the first analysis means 103h detects that the “Right Eye” fundus image of the patient is not gradable and the final quality level is ‘Bad’, and that the “Left Eye” fundus image of the patient is gradable and the final quality level is ‘Good’.
  • the output of the first analysis means 103h is transmitted to the second analysis means 103i of the fundus image analysis application 103 to analyze the fundus images of the patient.
  • the second analysis means 103i aborts the analysis of the “Right Eye” fundus image of the patient.
  • the second analysis means 103i identifies that the “Left Eye” fundus image of the patient comprises indicators denoting a normal eye condition, that is, without DR.
  • the outputs of the second analysis means 103i are displayed along with the fundus images of the patient on the fundus image analysis screen of the system 100.
  • The “Right Eye” fundus image of the patient is indicated as “Bad Image” and the “Left Eye” fundus image of the patient is indicated with the output of the second analysis means 103i, that is, “No abnormalities found”.
  • Figure 15 exemplarily illustrates the screenshot of the GUI 103k provided by the system 100, displaying the fundus image analysis screen 1200 of the system 100 when the user selects the option “Analyse”.
  • the second analysis means 103i of the fundus image analysis application 103 analyzes the fundus images of the patient without considering the final quality level of the fundus images. Now consider that the second analysis means 103i identifies that the “Right Eye” fundus image of the patient comprises indicators denoting DR. Each of the fundus images of the patient is analyzed by the second analysis means 103i. The outputs of the second analysis means 103i are displayed along with the fundus images of the patient on the fundus image analysis screen of the system 100.
  • The “Right Eye” fundus image of the patient is indicated with the output of the second analysis means 103i as “Doctor Review Recommended” and the “Left Eye” fundus image of the patient is indicated with the output of the second analysis means 103i, that is, “No abnormalities found”.
  • Figure 16 exemplarily illustrates the screenshot of the GUI 103k provided by the system 100, displaying the report screen 603 of the system 100.
  • the report summary screen displays the patient information and the details about the state of the retinal disease (diabetic retinopathy) denoted below each of the fundus images of the patient.
  • a plurality of fundus images for the left eye and the right eye of the patient are analyzed and displayed on the report summary screen.
  • An option to view the report in a PDF version, “View PDF”, is provided at the top right of the report summary screen along with the “Print” option to print the report and the “Send as mail” option to send the report of the patient as an email attachment.
  • Figure 17 illustrates a flowchart for analyzing the fundus image of the patient in accordance with the invention.
  • the fundus image analysis application 103 receives the fundus image of the patient.
  • the non-transitory computer readable storage medium is configured to store the fundus image analysis application 103 and at least one processor is configured to execute the fundus image analysis application 103.
  • the fundus image analysis application 103 is thus a part of the system 100 comprising the non-transitory computer readable storage medium communicatively coupled to the at least one processor.
  • the fundus image analysis application 103 comprises the GUI 103k comprising multiple interactive elements 103j configured to enable capture and analysis of the fundus image via a user device 101a, 101b or 101c.
  • the reception means 103a is adapted to receive the input from the image capturing device based on multiple parameters of the image capturing device.
  • the input is the fundus image of the patient displayed in a live mode.
  • the fundus image analysis application 103 is a web application implemented on a web based platform, for example, a website hosted on a server or a setup of servers.
  • the interactive fundus image rendering means 103b is adapted to dynamically render the input.
  • the dynamically rendered input is configurably accessible on the GUI 103k via the user device 101a, 101b or 101c using the interactive elements 103j.
  • the fundus image capture means 103c is adapted to capture the fundus image based on the dynamically rendered input.
  • the fundus image analysis application 103 receives the user selection criterion via the user device.
  • the user selection criterion refers to a user’s selection of either considering the quality level of the fundus image before analyzing the fundus image for detection of one or more retinal diseases in the fundus image (Decision of S4 as YES) or analyzing the fundus image for detection of one or more retinal diseases in the fundus image without considering the quality level of the fundus image (Decision of S4 as NO).
  • the first analysis means 103h is configured to determine the quality level of the captured fundus image.
  • the initial quality level detection means generates the first label for the fundus image using the first convolutional neural network.
  • the initial label is the initial quality level of the fundus image.
  • the final quality level determination means determines the final quality level of the fundus image based on the generated first label, the user defined quality threshold and the parameters of the image capturing device.
  • the initial quality level of the fundus image refers to a quality level computed by the first convolutional neural network based on a training provided to the first convolutional neural network.
  • the final quality level determination means considers the detected initial quality level along with the user defined quality threshold and the parameters of the image capturing device to determine the final quality level of the fundus image.
  • the user defined quality threshold is a user defined parameter to vary the quality level of the fundus image.
  • the user defined quality threshold is based on the user’s confidence and ability to grade the fundus image.
  • the term “indicator” refers to one of an abnormality, a retinal feature or the like.
  • the retinal feature is an optic disc, a macula, a blood vessel or the like.
  • the abnormality is one of a lesion like a venous beading, a venous loop, an intra retinal microvascular abnormality, an intra retinal hemorrhage, a micro aneurysm, a soft exudate (cotton-wool spots), a hard exudate, a vitreous/preretinal hemorrhage, neovascularization, a drusen or the like.
  • the retinal disease is one of diabetic retinopathy (DR), diabetic macular edema, glaucoma, coloboma, retinal tear, retinal detachment or the like.
  • the state of the retinal disease indicates a presence or absence of the retinal disease represented as levels of increasing seriousness of the retinal disease.
  • the second analysis means 103i is configured to generate the second label for the fundus image using the second convolutional neural network.
  • the generation of the second label is to analyze the fundus image.
  • the indicators identification means identifies the multiple indicators throughout the fundus image using the second convolutional neural network.
  • the retinal disease detection means detects the state of the retinal disease based on the identified indicators using the second convolutional neural network.
  • the second analysis means 103i is also adapted to generate the third label for the fundus image based on the parameters of the image capturing device used to capture the fundus image using the second convolutional neural network, wherein the second convolutional neural network is previously trained to generate the second label for the fundus image; and the second analysis means 103i is adapted to train the third convolutional neural network using the third label.
  • the general concepts of the current invention are not limited to a particular number of severity levels.
  • one severity level could be used, which satisfies only the detection of the retinal disease.
  • multiple severity levels could be used to classify the retinal disease.
  • multiple retinal diseases could be detected based on the identified indicators. The system 100 classifies each of the detected retinal diseases based on the severity.
  • the second analysis means 103i of the fundus image analysis application 103, using the second convolutional neural network, focuses on classifying the entire fundus image as a whole. This improves efficiency and reduces errors in identifying various medical conditions.
  • the system 100 acts as an important tool in the detection of the quality level of the fundus image and in monitoring a progression of one or more retinal diseases and/or a response to a therapy.
  • the system 100 trains the second convolutional neural network to detect all indicators indicative of multiple retinal diseases.
  • the system 100 accurately detects indicators throughout the input fundus image which are indicative of disease conditions to properly distinguish indicators of a healthy fundus from indicators which define retinal diseases.
  • the system 100 uses the analyzed fundus images to further train the convolutional neural network.
  • the system 100 refers to the patient profiles to gather information such as age, gender, race, ethnicity, nationality, etc., of existing patients to further train the convolutional neural networks to improve the convolutional network performance and provide customized results to the patients.
  • the system 100 may also be used to detect certain conditions such as a laser treated fundus.
  • the system 100 may be a part of a web cloud with the input fundus image and the report uploaded to the web cloud.
  • the system 100 involving computer-based process of supervised learning using the convolutional network as described can thus be effectively used to screen the fundus images.
  • the system 100 identifies indicators which are further processed to automatically provide indications of relevant retinal disease, in particular indications of DR.
  • the system 100 increases efficiency by the utilization of the well trained convolutional network for detecting and classifying the retinal diseases thus providing cost-effective early screening and treatment to the patient.
  • the system 100 reduces the time-consumption involved in a manual process requiring a trained medical practitioner to evaluate digital fundus photographs of the retina.
  • the system 100, using the convolutional network, effectively improves the quality of analysis of the fundus image by detecting indicators of minute size which are often difficult to detect in the manual process of evaluating the fundus image.
  • Non-transitory computer readable media refer to computer readable media that participate in providing data, for example, instructions that may be read by a computer, a processor or a similar device.
  • Non-transitory computer readable media comprise all computer readable media.
  • Non-volatile media comprise, for example, optical discs or magnetic disks and other persistent memory. Volatile media include a dynamic random access memory (DRAM), which typically constitutes a main memory.
  • Volatile media comprise, for example, a processor cache, a register memory, a random access memory (RAM), etc.
  • Transmission media comprise, for example, coaxial cables, copper wire, fiber optic cables, modems, etc., including wires that constitute a system bus coupled to a processor, etc.
  • Computer readable media comprise, for example, a floppy disk, a flexible disk, a hard disk, magnetic tape, a Blu-ray Disc®, a magnetic medium, a compact disc-read only memory (CD-ROM), a digital versatile disc (DVD), any optical medium, a flash memory card, a laser disc, RAM, a programmable read only memory (PROM), an erasable programmable read only memory (EPROM), an electrically erasable programmable read only memory (EEPROM), a flash memory, any other cartridge, etc.
  • the database 104a is, for example, a structured query language (SQL) database or a not only SQL (NoSQL) database such as the Microsoft® SQL Server®, the Oracle® servers, the MySQL® database of MySQL AB Company, the MongoDB® of 10gen, Inc., the Neo4j graph database, the Cassandra database of the Apache Software Foundation, the HBase™ database of the Apache Software Foundation, etc.
  • the database 104a can also be a location on a file system.
  • the database 104a is any storage area or medium that can be used for storing data and files.
  • the database 104a can be remotely accessed by the fundus image analysis application 103 via the network 102.
  • the database 104a is configured as a cloud based database implemented in a cloud computing environment, where computing resources are delivered as a service over the network 102, for example, the internet.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Primary Health Care (AREA)
  • Epidemiology (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Animal Behavior & Ethology (AREA)
  • Surgery (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Veterinary Medicine (AREA)
  • Computing Systems (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Ophthalmology & Optometry (AREA)
  • Quality & Reliability (AREA)
  • Data Mining & Analysis (AREA)
  • Pathology (AREA)
  • Eye Examination Apparatus (AREA)

Abstract

A computer implemented system for analyzing a fundus image of a patient is disclosed. The system (100) comprises at least one processor coupled to a non-transitory computer readable storage medium configured to store a fundus image analysis application (103), comprising: a graphical user interface (103k) comprising interactive elements (103j) configured to enable capture and analysis of the fundus image; a reception means (103a) adapted to receive an input from an image capturing device based on a plurality of parameters of the image capturing device; an interactive fundus image rendering means (103b) adapted to dynamically render the input; a fundus image capture means (103c) adapted to capture the fundus image based on the dynamically rendered input; a first analysis means (103h) configured to generate a first label for the fundus image; a second analysis means (103i) configured to generate a second label for the fundus image.

Description

1. TITLE OF THE INVENTION
A FUNDUS IMAGE ANALYSIS SYSTEM
Complete specification:
The following specification particularly describes the invention and the manner in which it is to be performed. Technical field of the invention
[0001] The invention relates to the field of medical decision support. More particularly, the invention relates to the analysis of a fundus image to identify a quality level of the fundus image and aid in the diagnosis of retinal diseases.
Background of the invention
[0002] Vision is an important survival attribute for a human, thus making the eyes one of the most vital sensory body parts. Though most eye diseases may not be fatal, failure of proper diagnosis and treatment of an eye disease may lead to vision loss. Early detection of eye diseases through regular screening may prevent visual loss and blindness amongst patients. Analysis of fundus images of a patient is a very convenient way of screening and monitoring eye diseases. The fundus of the eye provides indications of several diseases, in particular eye diseases like diabetic retinopathy.
[0003] Currently, diabetic retinopathy is one of the primary causes of vision loss. Long-term complications of diabetes include diabetic retinopathy. As the number of patients with diabetes continues to increase, the groundwork required to prevent visual loss due to diabetic retinopathy will become even more deficient. The expertise required is often lacking in areas where the rate of diabetes in populations is high and diabetic retinopathy detection is most needed.
[0004] Micro-aneurysms are an important feature used for detecting diabetic retinopathy in the fundus image of the patient. Small areas of swelling caused by vascular changes in the retina's blood vessels are known as micro-aneurysms. Micro-aneurysms may sooner or later cause plasma leakage resulting in thickening of the retina. This is known as edema. Thickening of the retina in the macular region may result in vision loss. Proper distinction of features in the fundus image is critical, as wrong predictions may lead to wrong treatments causing difficulties to the patient.
[0005] In recent times, computer-aided screening systems assist doctors in improving the quality of examination of fundus images for screening of eye diseases. Machine learning (ML) algorithms are applied to data to extract and evaluate information. Systems apply ML algorithms to ensure a faster mode of efficient identification and classification of eye diseases using fundus images, which enhances screening of eye diseases. An artificial neural network is a computational model comprising a group of interconnected artificial neurons. A convolutional neural network is a feed forward artificial neural network having several applications in pattern recognition and classification. A convolutional neural network comprises collections of neurons that have receptive fields and together tile an input space. But currently, the systems available for identification and classification of eye diseases using fundus images involving machine learning algorithms are complex and of high cost. Additionally, training of the machine learning algorithm is also challenging, adding to the overall cost of the system. This limits the reach of medical eye screening and diagnosis to the common man.
[0006] A simple, comprehensive and cost-effective solution involving effective use of ML algorithms, enabling systems to uncover concealed insights for automated, effective identification and classification of eye diseases using fundus images, is thus essential.
Summary of invention
[0007] This summary is provided to introduce a selection of concepts in a simplified form that are further disclosed in the detailed description of the invention. This summary is not intended to identify key or essential inventive concepts of the claimed subject matter, nor is it intended for determining the scope of the claimed subject matter.
[0008] The present invention discloses a computer implemented system for analyzing a fundus image of a patient. The system comprises at least one processor; a non-transitory computer readable storage medium communicatively coupled to the at least one processor, the non-transitory computer readable storage medium configured to store a fundus image analysis application, the at least one processor configured to execute the fundus image analysis application; and the fundus image analysis application comprising: a graphical user interface comprising a plurality of interactive elements configured to enable capture and analysis of the fundus image via a user device; a reception means adapted to receive an input from an image capturing device based on a plurality of parameters of the image capturing device, wherein the input is the fundus image of the patient displayed in a live mode; an interactive fundus image rendering means adapted to dynamically render the input, wherein the dynamically rendered input is configurably accessible on the graphical user interface via the user device using the interactive elements; a fundus image capture means adapted to capture the fundus image based on the dynamically rendered input; a first analysis means adapted to determine a final quality level of the captured fundus image comprising: an initial quality level detection means to generate a first label for the fundus image using a first convolutional neural network, wherein the initial label is an initial quality level of the fundus image; a final quality level determination means to determine the final quality level of the fundus image based on the generated first label, a user defined quality threshold and the parameters of the image capturing device; and a second analysis means adapted to generate a second label for the fundus image using a second convolutional neural network by considering the determined final quality level based on a user selection, wherein the second label is a state of a retinal disease.
[0009] The system further comprises the second analysis means adapted to generate a third label for the fundus image based on the parameters of the image capturing device used to capture the fundus image using the second convolutional neural network, wherein the second convolutional neural network is previously trained to generate the second label for the fundus image; and the second analysis means adapted to train a third convolutional neural network using the third label.
[0010] The user device is, for example, a personal computer, a laptop, a tablet computing device, a personal digital assistant, a client device, a web browser, etc. The user defined quality threshold is a quality measure defined by a user for the fundus image based on a user grading experience. Here, the image capturing device refers to a camera for photographing the fundus of the patient. The parameters of the image capturing device are a manufacturer of the image capturing device, a version of the image capturing device system and the like. The indicator is one of an abnormality, a retinal feature or the like. The abnormality is one of a lesion like a venous beading, a venous loop, an intra retinal microvascular abnormality, an intra retinal hemorrhage, a micro aneurysm, a soft exudate (cotton-wool spots), a hard exudate, a vitreous/preretinal hemorrhage, neovascularization, a drusen or the like. The retinal disease is one of diabetic retinopathy, diabetic macular edema, glaucoma, coloboma, retinal tear, retinal detachment or the like. The state of the retinal disease indicates a level of seriousness of the retinal disease or a likelihood of developing the retinal disease.
Brief description of the drawings
[0011] The present invention is described with reference to the accompanying figures. The accompanying figures, which are incorporated herein, are given by way of illustration only and form part of the specification together with the description to explain how to make and use the invention, in which,
[0012] Figure 1 illustrates a block diagram of a computer implemented system for analyzing a fundus image of a patient in accordance with the invention;
[0013] Figure 2 exemplarily illustrates a first convolutional neural network to compute a quality level of an input fundus image;
[0014] Figure 3 exemplarily illustrates a second convolutional neural network to compute a presence or absence of a retinal disease and related severity of the retinal disease associated with the input fundus image;
[0015] Figure 4 exemplarily illustrates the architecture of a computer system employed by a fundus image analysis application;
[0016] Figure 5 exemplarily illustrates a screenshot of a graphical user interface (GUI) provided by the system, displaying a log-in screen of the system;
[0017] Figure 6 exemplarily illustrates a screenshot of the GUI provided by the system, displaying a menu screen of the system;
[0018] Figure 7 exemplarily illustrates a screenshot of the GUI provided by the system, displaying an add new patient screen of the system;
[0019] Figure 8 exemplarily illustrates a screenshot of the GUI provided by the system, displaying an existing patients screen of the system;
[0020] Figure 9 exemplarily illustrates a screenshot of the GUI provided by the system, displaying a profile screen of an existing patient of the system;
[0021] Figure 10 exemplarily illustrates a screenshot of the GUI provided by the system, displaying existing images of the existing patient of the system;
[0022] Figure 11 exemplarily illustrates a screenshot of the GUI provided by the system, displaying a fundus image rendering screen of the system;
[0023] Figure 12 exemplarily illustrates a screenshot of the GUI provided by the system, displaying a fundus image analysis screen of the system after the fundus image of the patient is captured by the user;
[0024] Figure 13 exemplarily illustrates a screenshot of the GUI provided by the system, displaying a fundus image upload screen of the system to upload the fundus image of the patient for analysis;
[0025] Figure 14 exemplarily illustrates a screenshot of the GUI provided by the system, displaying the fundus image analysis screen of the system when the user selects the option “Image Quality and analyse”;
[0026] Figure 15 exemplarily illustrates a screenshot of the GUI provided by the system, displaying the fundus image analysis screen of the system when the user selects the option “Analyse”;
[0027] Figure 16 exemplarily illustrates a screenshot of the GUI provided by the system, displaying a report screen of the system; and
[0028] Figure 17 illustrates a flowchart for analyzing the fundus image of the patient in accordance with the invention.
Detailed description of the invention
[0029] Figure 1 illustrates a block diagram of a computer implemented system for analyzing a fundus image of a patient in accordance with the invention. The system comprises at least one processor; a non-transitory computer readable storage medium communicatively coupled to the at least one processor, the non-transitory computer readable storage medium configured to store a fundus image analysis application 103, the at least one processor configured to execute the fundus image analysis application 103; and the fundus image analysis application 103 comprising: a graphical user interface (GUI) 103k comprising a plurality of interactive elements 103j configured to enable capture and analysis of the fundus image via a user device 101a, 101b or 101c; a reception means 103a adapted to receive an input from an image capturing device based on a plurality of parameters of the image capturing device, wherein the input is the fundus image of the patient displayed in a live mode; an interactive fundus image rendering means 103b adapted to dynamically render the input, wherein the dynamically rendered input is configurably accessible on the graphical user interface 103k via the user device 101a, 101b or 101c using the interactive elements 103j; a fundus image capture means 103c adapted to capture the fundus image based on the dynamically rendered input; a first analysis means 103h adapted to determine a final quality level of the captured fundus image comprising: an initial quality level detection means to generate a first label for the fundus image using a first convolutional neural network, wherein the initial label is an initial quality level of the fundus image; a final quality level determination means to determine the final quality level of the fundus image based on the generated first label, a user defined quality threshold and the parameters of the image capturing device; and a second analysis means 103i adapted to generate a second label for the fundus image using a second convolutional neural network by considering the determined final quality level based on a user selection, wherein the second label is a state of a retinal disease.
[0030] The system 100 further comprises the second analysis means 103i adapted to generate a third label for the fundus image based on the parameters of the image capturing device used to capture the fundus image using the second convolutional neural network, wherein the second convolutional neural network is previously trained to generate the second label for the fundus image; and the second analysis means 103i adapted to train a third convolutional neural network using the third label.
[0031] As used herein, the term“patient” refers to an individual receiving or registered to receive medical treatment. The patient is, for example, an individual undergoing a regular health checkup, an individual with a condition of diabetes mellitus, etc. As used herein, the term“fundus image” refers to a two-dimensional array of digital image data, however, this is merely illustrative and not limiting of the scope of the invention.
[0032] The computer implemented system comprises at least one processor; a non-transitory computer readable storage medium communicatively coupled to the at least one processor and the fundus image analysis application 103. The non-transitory computer readable storage medium is configured to store the fundus image analysis application 103. The at least one processor is configured to execute the fundus image analysis application 103. The fundus image analysis application 103 is executable by at least one processor configured to enable capture and analysis of the fundus image of the patient via the user device 101a, 101b or 101c. The user device 101a, 101b or 101c is, for example, a personal computer, a laptop, a tablet computing device, a personal digital assistant, a client device, a web browser, etc.
[0033] In an embodiment, the fundus image analysis application 103 is a web application implemented on a web based platform, for example, a website hosted on a server or a setup of servers. For example, the fundus image analysis application 103 is implemented on a web based platform, for example, a fundus image analysis platform 104 as illustrated in Figure 1.
[0034] The fundus image analysis platform 104 hosts the fundus image analysis application 103. The fundus image analysis application 103 is accessible to one or more user devices 101a, 101b or 101c. The user device 101a, 101b or 101c is, for example, a computer, a mobile phone, a laptop, etc. In an example, the user device is accessible over a network such as the internet, a mobile telecommunication network, a Wi-Fi® network of the Wireless Ethernet Compatibility Alliance, Inc., etc. The fundus image analysis application 103 is accessible through browsers such as Internet Explorer® (IE) 8, IE 9, IE 10, IE 11 and IE 12 of Microsoft Corporation, Safari® of Apple Inc., Mozilla® Firefox® of Mozilla Foundation, Chrome of Google, Inc., etc., and is compatible with technologies such as hypertext markup language 5 (HTML5), etc.
[0035] In another embodiment, the fundus image analysis application 103 is configured as a software application, for example, a mobile application downloadable by a user on the user device 101a, 101b or 101c, for example, a tablet computing device, a mobile phone, etc. As used herein, the term“user” is an individual who operates the fundus image analysis application 103 to capture the fundus images of the patient and generate a report resulting from the analysis of the captured fundus images.
[0036] The fundus image analysis application 103 is accessible by the user device 101a, 101b or 101c via the GUI 103k provided by the fundus image analysis application 103. In an example, the fundus image analysis application 103 is accessible over the network 102. The network 102 is, for example, the internet, an intranet, a wireless network, a wired network, a Wi-Fi® network of the Wireless Ethernet Compatibility Alliance, Inc., a universal serial bus (USB) communication network, a ZigBee® network of ZigBee Alliance Corporation, a general packet radio service (GPRS) network, a global system for mobile (GSM) communications network, a code division multiple access (CDMA) network, a third generation (3G) mobile communication network, a fourth generation (4G) mobile communication network, a wide area network, a local area network, an internet connection network, an infrared communication network, etc., or any combination of these networks.
[0037] The fundus image analysis application 103 comprises the GUI 103k comprising a plurality of interactive elements 103j configured to enable capture and analysis of the fundus image via the user device 101a, 101b or 101c. As used herein, the term“ interactive elements 103j” refers to interface components on the GUI 103k configured to perform a combination of processes, for example, a retrieval process from the input received from the user, for example, the fundus images of the patient, processes that enable real time user interactions, etc. The interactive elements 103j comprise, for example, clickable buttons.
[0038] The fundus image analysis application 103 comprises the reception means 103a adapted to receive the input from the image capturing device based on the parameters of the image capturing device. The input is the fundus image of the patient. The input may also be a plurality of fundus images of the patient. As used herein, the term“image capturing device” refers to a camera for photographing the fundus of the patient. In an example, the image capturing device is a Zeiss FF 450+ fundus camera comprising a Charged Coupled Device (CCD) photographic unit. In another example, the image capturing device is a smart phone with a camera capable of capturing the fundus images of the patient. The parameters of the image capturing device are a manufacturer of the image capturing device, a version of the image capturing device and the like.
[0039] The reception means 103a receives information associated with the patient from the user device, for example, 101a, 101b or 101c via the GUI 103k. The information associated with the patient is, for example, personal details about the patient, medical condition of the patient, etc., as shown in Figure 7.
[0040] The image capturing device is in communication with the fundus image analysis application 103 via the network 102, for example, the internet, an intranet, a wireless network, a wired network, a Wi-Fi® network of the Wireless Ethernet Compatibility Alliance, Inc., a universal serial bus (USB) communication network, a ZigBee® network of ZigBee Alliance Corporation, a general packet radio service (GPRS) network, a global system for mobile (GSM) communications network, a code division multiple access (CDMA) network, a third generation (3G) mobile communication network, a fourth generation (4G) mobile communication network, a wide area network, a local area network, an internet connection network, an infrared communication network, etc., or any combination of these networks.
[0041] The fundus image analysis application 103 accesses the image capturing device based on the parameters of the image capturing device to receive the input of the patient. The fundus image analysis application 103 comprises a transmission means to request the image capturing device for a permission to control the activities of the image capturing device to capture the input associated with the patient. The image capturing device responds to the request received from the transmission means. The reception means 103a receives the response of the image capturing device.
[0042] In other words, the image capturing device permits the user of the fundus image analysis application 103 to control the activities of the image capturing device via the interactive elements 103j of the GUI 103k. As used herein, the term “activities” refers to a viewing of a live mode of the fundus of the patient on a screen of the GUI 103k, focusing a field of view by zooming in or zooming out the field of view to observe the fundus of the patient and capturing the fundus image of the patient from the displayed live mode of the fundus of the patient. The fundus image analysis application 103 adaptably controls the activities specific to the image capturing device based on the parameters, for example, the manufacturer, of the image capturing device. That is, the fundus image analysis application 103 is customizable to suit the parameters of the image capturing device such as the version, the manufacturer, the model details, etc. In other terms, the fundus image analysis application 103 is customizable and can be suitably adapted to capture the fundus images of the patient for different manufacturers of the image capturing device.
[0043] Once the fundus image analysis application 103 has the permission to control the activities of the image capturing device, the user of the fundus image analysis application 103 can view the input of the image capturing device on the screen of the GUI 103k. The interactive fundus image rendering means 103b dynamically renders the input on the GUI 103k. The dynamically rendered input is configurably accessible on the GUI 103k via the user device 101a, 101b or 101c using the interactive elements 103j. The field of view of the image capturing device is displayed on a screen of the GUI 103k via the user device 101a, 101b or 101c. The user can focus the field of view by zooming in or zooming out the field of view to observe the fundus of the patient by using the interactive elements 103j via a user input device such as a mouse, a trackball, a joystick, etc. The user captures the fundus image of the patient from the displayed live mode of the fundus of the patient using the interactive elements 103j of the GUI 103k via the user device 101a, 101b or 101c. As used herein, the term “live mode” refers to the seamless display of the fundus of the patient in real time via the GUI 103k. In an embodiment, the input is an already existing fundus image of the patient stored in the database 104a.
[0044] The fundus image analysis application 103 comprises a first analysis means 103h configured to determine the quality level of the captured fundus image. The first analysis means 103h comprises the initial quality level detection means to generate the first label for the fundus image using the first convolutional neural network, wherein the initial label is the initial quality level of the fundus image; and the final quality level determination means to determine the final quality level of the fundus image based on the generated first label, the user defined quality threshold and the parameters of the image capturing device.
[0045] The fundus image analysis application 103 comprises the second analysis means 103i adapted to analyze the fundus image using the second convolutional neural network by considering the determined final quality level based on the user selection criterion, comprising: the indicators identification means to identify multiple indicators throughout the fundus image; and the retinal disease detection means to detect the state of the retinal disease based on the identified indicators.
[0046] As used herein, the term“quality level” of the fundus image defines a gradable efficiency of the fundus image. The quality level of the fundus image is based on a plurality of quality factors. The quality factors are, for example, darkness, light, contrast, color accuracy, tone reproduction, distortion, an exposure accuracy, sharpness, noise, lens flare, etc. As used herein, the term “convolutional neural network” refers to a class of deep artificial neural networks that can be applied to analyzing visual imagery. The initial label defines the initial quality level of the fundus image which is a quality level computed by the first convolutional neural network based on a training provided to the first convolutional neural network. The final quality level determination means considers the generated first label which is the detected initial quality level along with the user defined quality threshold and the parameters of the image capturing device to determine the final quality level of the fundus image. The user defined quality threshold is a user defined parameter to vary the quality level of the fundus image. The user defined quality threshold is based on the user’s confidence and ability to grade the fundus image.
[0047] As used herein, the term“indicator” is one of an abnormality, a retinal feature or the like. The retinal feature is an optic disc, a macula, a blood vessel or the like. The abnormality is one of a lesion like a venous beading, a venous loop, an intra retinal microvascular abnormality, an intra retinal hemorrhage, a micro aneurysm, a soft exudate (cotton-wool spots), a hard exudate, a vitreous/preretinal hemorrhage, neovascularization, a drusen or the like. The retinal disease is one of diabetic retinopathy, diabetic macular edema, glaucoma, coloboma, retinal tear, retinal detachment or the like. The state of the retinal disease indicates a presence or absence of the retinal disease represented as levels of increasing seriousness of the retinal disease.
[0048] The second analysis means 103i analyzes the fundus image using the second convolutional neural network by considering the determined final quality level based on the user selection criterion. The user selection criterion refers to a user’s selection of either considering the quality level of the fundus image before analyzing the fundus image for detection of one or more retinal diseases in the fundus image or analyzing the fundus image for detection of one or more retinal diseases in the fundus image without considering the quality level of the fundus image. The user selection criterion is a selection process of the user which is realized by a clickable event of either the “Analyse” or the “Image Quality and analyse” buttons on the GUI 103k via the user input device such as a mouse, a trackball, a joystick, etc., as shown in Figure 12.
[0049] The user selection criterion of considering the quality level of the fundus image before analyzing the fundus image for detection of one or more retinal diseases in the fundus image can be activated by the selection of the interactive element “Image quality and analyse” clickable button provided by the GUI 103k as shown in Figure 12. The user selection criterion of analyzing the fundus image for detection of one or more retinal diseases in the fundus image without considering the quality level of the fundus image can be activated by the selection of the interactive element “analyse” clickable button provided by the GUI 103k as shown in Figure 12.
[0050] Further, when the user selection criterion is to consider the quality level of the fundus image before the analysis of the fundus image, the second analysis means 103i takes the determined final quality level into account to analyze the fundus image. The determined final quality level is an output of the first analysis means 103h. In other words, the second analysis means 103i considers the output of the first analysis means 103h when the user selection criterion is to consider the quality level of the fundus image before the analysis of the fundus image.
[0051] When the determined final quality level is‘bad’, the second analysis means 103i aborts the analysis of the fundus image. When the determined final quality level is‘good’, the second analysis means 103i continues with the analysis of the fundus image by using the second convolutional neural network. Here, the‘bad’ final quality level indicates that the fundus image is below a quality threshold and the‘good’ final quality level indicates that the fundus image is above the quality threshold. The purpose of providing the quality level of the fundus image is to detect low quality fundus images whose quality is inadequate for retinal disease screening and discard them.
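The gating behaviour described above may be summarised by a short conditional. The following Python sketch is purely illustrative; the function name and string labels are not part of the specification:

```python
def proceed_with_analysis(final_quality_level: str) -> bool:
    """Return True when the second analysis means should analyze the image.

    'good' indicates the fundus image is above the quality threshold;
    'bad' indicates it is below the threshold, so the analysis is aborted.
    """
    return final_quality_level == "good"
```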
[0052] When the user selection criterion is to not consider the quality level of the fundus image before the analysis of the fundus image, the second analysis means 103i directly analyses the fundus image to identify a plurality of indicators throughout the fundus image; and to detect the state of the retinal disease based on the identified indicators.
[0053] The first convolutional neural network and the second convolutional neural network are convolutional neural networks and correspond to a specific model of an artificial neural network. The first convolutional neural network generates the first label for the fundus image of the patient. The second convolutional neural network generates the second label for the fundus image of the patient. The first label refers to the initial quality level of the fundus image of the patient. The second label for the fundus image of the patient refers to identification of the indicators in the fundus image and determination of the state of the retinal disease in the fundus image of the patient.
[0054] In general the convolutional neural network is trained using a first reference dataset of fundus images to accomplish the function associated with the convolutional neural network. Here, the term“function” of the first convolutional neural network refers to the determination of the initial quality level of the fundus image of the patient and the“function” of the second convolutional neural network refers to the identification of the indicators in the fundus image and determination of the state of the retinal disease in the fundus image of the patient.
[0055] The fundus image analysis application 103 receives the first reference dataset from one or more devices. The first reference dataset comprises a plurality of fundus images. Hereafter, the fundus images in the first reference dataset are referred to as reference fundus images. The device is, for example, the image capturing device such as a camera incorporated into a mobile device, a server, a network of personal computers, or simply a personal computer, a mainframe, a tablet computer, etc. The fundus image analysis application 103 stores the first reference dataset in a database 104a of the system 100. The system 100 comprises the database 104a in communication with the fundus image analysis application 103. The database 104a is also configured to store patient profile information, patient medical history, the reference fundus images of patients, reports of the patients, etc.
[0056] In an embodiment, a same set of reference fundus images is used to train the first convolutional neural network and the second convolutional neural network. In another embodiment, different sets of reference fundus images are used to train the first convolutional neural network and the second convolutional neural network. As used herein, the term “reference fundus image” is a two-dimensional array of digital image data used for the purpose of training the first convolutional neural network and the second convolutional neural network. In this invention, the term ‘training’ generally refers to a process of developing the first convolutional neural network for the detection of the initial quality level of the fundus image and the second convolutional neural network for the identification and determination of the state of the retinal disease based on the first reference dataset and a reference ground-truth file.
[0057] The reference ground-truth file comprises a label and a reference fundus image identifier for each of the reference fundus images. The label provides information about the reference fundus image such as the quality level of the fundus image, the state of a retinal disease, the type of retinal disease and the corresponding severity of the retinal disease identified in the reference fundus image. The reference fundus image identifier of the reference fundus image is, for example, a name or an identity assigned to the reference fundus image.
[0058] In an embodiment, the first convolutional neural network and the second convolutional neural network have a separate reference ground-truth file. In another embodiment, the first convolutional neural network and the second convolutional neural network refer to a common reference ground-truth file for relevant information required to perform the specific function associated with the convolutional neural network.
Manual grading of the first reference dataset:
[0059] In an embodiment, an annotator annotates each of the reference fundus images using the GUI 103k via the user device 101a, 101b or 101c. As used herein, the term “annotator” refers to a user of the fundus image analysis application 103 who is usually a trained/certified specialist in accurately annotating the fundus image to determine the quality level of the reference fundus image and analyze the indicators present in the reference fundus image. The terms “annotator” and “user” are used interchangeably herein. The annotator accesses the reference fundus images using the GUI 103k. The annotator creates the label with information about the quality level of the fundus image, the state of the retinal disease present in the fundus image, the type of retinal disease and the corresponding severity of the retinal disease based on the annotation.
[0060] The annotator initially annotates the reference fundus image based on a plurality of quality factors. As used herein, the term “quality factors” refers to the parameters of the reference fundus image which define a measure of the quality level of the reference fundus image. The quality level is a measure of perceived image degradation as compared to an ideal image reference based on amounts of the multiple quality factors. The quality factors are, for example, darkness, light, contrast, color accuracy, tone reproduction, distortion, an exposure accuracy, sharpness, noise, lens flare, etc.
[0061] In an example, the annotator labels each of the reference fundus images as either ‘good’ or ‘bad’ representing the quality level of the reference fundus image. For instance, a reference fundus image with the label comprising ‘good’ indicates the quality level of the reference fundus image with quality factors above a quality threshold. Similarly, the reference fundus image with the label comprising ‘bad’ indicates the quality level of the reference fundus image with a minimum number of the quality factors below the quality threshold. In another embodiment, the label may comprise terms such as either ‘low-quality’ or ‘high-quality’ based on the quality level of the reference fundus image. In another embodiment, the label may comprise terms defining five levels of quality - ‘bad’, ‘poor’, ‘fair’, ‘good’ and ‘excellent’. In another embodiment, the label may comprise a numeric value representing the degree of quality of the reference fundus image based on the values of each of the associated quality factors.
[0062] Consider, for example, that the annotator manually analyses the reference fundus image by partitioning the fundus image into a plurality of partitions. The annotator divides the reference fundus image into nine equal partitions and analyses each of the partitions to determine the quality level of the reference fundus image. The annotator considers the multiple quality factors while analyzing the partitions to finally determine the quality level of the reference fundus image. The annotator determines the quality level of each of the partitions to determine the quality level of the reference fundus image. If the quality level of any one of the partitions is below the quality threshold and comprises a region of interest such as an optic disc and/or a macula of the fundus of the patient, then the annotator determines the quality level of the reference fundus image as ‘bad’. The annotator considers a minimum of two partitions with the quality level below the quality threshold and with an absence of the region of interest to determine the quality level of the reference fundus image as ‘bad’. If the annotator determines the quality level of all the partitions to be above the quality threshold, then the annotator classifies the quality level of the reference fundus image as ‘good’. Accordingly, the annotator labels each of the reference fundus images as either ‘good’ or ‘bad’ representing the quality level of the partitions of the reference fundus image.
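The partition-based rule described above can be expressed as follows. This Python sketch assumes NumPy and treats the per-partition quality scores and region-of-interest flags as the annotator's inputs; the function names and threshold convention are illustrative:

```python
import numpy as np


def nine_partitions(image: np.ndarray):
    """Split a fundus image (H x W x 3) into nine roughly equal tiles."""
    h, w = image.shape[:2]
    return [image[i * h // 3:(i + 1) * h // 3, j * w // 3:(j + 1) * w // 3]
            for i in range(3) for j in range(3)]


def grade_from_partitions(partition_quality, partition_has_roi, quality_threshold):
    """Apply the rule above: 'bad' if any low-quality partition holds a region
    of interest (optic disc and/or macula), or if at least two low-quality
    partitions lack the region of interest; otherwise 'good'."""
    low = [q < quality_threshold for q in partition_quality]
    if any(l and roi for l, roi in zip(low, partition_has_roi)):
        return "bad"
    if sum(l and not roi for l, roi in zip(low, partition_has_roi)) >= 2:
        return "bad"
    return "good"
```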
[0063] The annotator next annotates the reference fundus image to identify multiple indicators throughout the fundus image and to detect the state of the retinal disease based on the identified indicators. The annotator detects the presence of one or more retinal diseases based on the identified indicators. The annotator further updates the label of the fundus image with each type of the retinal disease, the severity of each type of the retinal disease, etc. In an embodiment, the annotator may concentrate only on the identification of a particular retinal disease.
[0064] In an example, consider that the annotator annotates the reference fundus images for the retinal disease - diabetic retinopathy (DR). The annotator may consider one or more standard DR grading standards such as the American ophthalmology DR grading scheme, the Scottish DR grading scheme, the UK DR grading scheme, etc., to annotate the reference fundus images. The annotator may assign a DR severity grade - grade 0 (representing no DR), grade 1 (representing mild DR), grade 2 (representing moderate DR), grade 3 (representing severe DR) or grade 4 (representing proliferative DR) to each of the reference fundus image. The label of the reference fundus image represents the DR severity level associated with the patient.
[0065] For example, the annotator labels each of the reference fundus images as one of five severity classes - ‘No DR’, ‘DR1’, ‘DR2’, ‘DR3’ and ‘DR4’ based on an increasing seriousness of DR. Here, ‘No DR’, ‘DR1’, ‘DR2’, ‘DR3’ and ‘DR4’ represent the labels indicating different levels of increasing severity of DR associated with the patient. The annotator analyses the indicators in the retinal fundus image and accordingly marks the label. If the annotator detects a microaneurysm, then the annotator considers it as a mild level of DR and marks the label as DR1 for the reference fundus image. Similarly, if the annotator detects one or more of the following - a hard exudate, a soft exudate, a hemorrhage, a venous loop, a venous beading, etc., then the annotator marks the label as DR2 for the reference fundus image. The label DR2 indicates a moderate level of DR. The annotator marks the label as DR3 for the reference fundus image with a severe level of DR upon detection of multiple hemorrhages, hard or soft exudates, etc., and DR4 for the reference fundus image with a proliferative level of DR upon detection of vitreous hemorrhage, neovascularization, etc. The reference fundus image with no traces of DR is marked with the label ‘No DR’ by the annotator.
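The indicator-to-severity mapping of the two preceding paragraphs may be captured as a small rule table. The following Python sketch uses hypothetical indicator names and is only one possible encoding of the annotation scheme described above:

```python
def dr_grade(indicators: set) -> str:
    """Map a set of detected indicator names to a DR severity label.

    `indicators` is a set of strings such as {'microaneurysm', 'hard_exudate'};
    the exact names are illustrative, not taken from the specification.
    """
    proliferative = {"vitreous_hemorrhage", "neovascularization"}
    moderate = {"hard_exudate", "soft_exudate", "hemorrhage",
                "venous_loop", "venous_beading"}
    if indicators & proliferative:
        return "DR4"   # proliferative DR
    if "multiple_hemorrhages" in indicators:
        return "DR3"   # severe DR
    if indicators & moderate:
        return "DR2"   # moderate DR
    if "microaneurysm" in indicators:
        return "DR1"   # mild DR
    return "No DR"
```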
[0066] The annotator stores the label and the reference fundus image identifier for each reference fundus image in the reference ground-truth file located in the database 104a. The label provides information about the type of retinal disease and the corresponding severity of the retinal disease as annotated by the annotator. The severity of the retinal disease in turn provides the state of the retinal disease. The state of the retinal disease is either a presence or an absence of the retinal disease. The reference fundus image identifier of the reference fundus image is, for example, a name or an identity assigned to the reference fundus image.
[0067] In another embodiment, the first analysis means 103h uses one or more of the known image processing algorithms to detect the quality level of the reference fundus image. The second analysis means 103i identifies the indicators throughout each of the reference fundus images to detect the state of the retinal disease using the known image processing algorithms. The second analysis means 103i classifies the severity of the retinal disease based on the presence of the retinal disease using a set of predetermined rules. The predetermined rules comprise considering a type of each of the indicators, a count of each of the indicators, a region of occurrence of each of the indicators, a contrast level of each of the indicators, a size of each of the indicators or any combination thereof to recognize the retinal disease and the severity of the retinal disease. The second analysis means 103i classifies each of the detected retinal diseases according to a corresponding severity grading and generates the label. The second analysis means 103i communicates with the database 104a to store the label and the reference fundus image identifier for each reference fundus image in the reference ground-truth file.
[0068] The first analysis means 103h utilizes the first reference dataset to train the first convolutional neural network for subsequent detection of the quality level of the fundus image. The second analysis means 103i utilizes the first reference dataset to train the second convolutional neural network for subsequent detection and classification of the retinal disease in the fundus image. Hereafter, the fundus image which is subsequently analyzed by the first analysis means 103h and the second analysis means 103i is referred to as an input fundus image for clarity.
Pre-processing of the reference fundus image:
[0069] The fundus image analysis application 103 further comprises a pre-processing means 103d to pre-process each of the reference fundus images. The pre-processing means 103d communicates with the database 104a to access the first reference dataset. For each of the reference fundus images, the pre-processing means 103d executes the following steps as part of the pre-processing. The pre-processing means 103d separates any text matter present at the border of the reference fundus image. The pre-processing means 103d adds a border to the reference fundus image with border pixel values as zero. The pre-processing means 103d increases the size of the reference fundus image by a predefined number of pixels, for example, 20 pixels in width and height. The additional pixels added are of a zero value. The pre-processing means 103d next converts the reference fundus image from an RGB color image to a grayscale image. The pre-processing means 103d then binarizes the reference fundus image using histogram analysis. The pre-processing means 103d applies repetitive morphological dilation with a rectangular element of size [5, 5] to smooth the binarized reference fundus image. The pre-processing means 103d acquires all connected regions, such as the retina and text matter, of the smoothed reference fundus image to separate the text matter present in the reference fundus image from a foreground image. The pre-processing means 103d determines the largest region among the acquired connected regions as the retina. The retina is assumed to be the connected element with the largest region. The pre-processing means 103d calculates a corresponding bounding box for the retina. The pre-processing means 103d thus identifies the retina in the reference fundus image.
[0070] Once the pre-processing means 103d identifies the retina in the reference fundus image, the pre-processing means 103d further blurs the reference fundus image using a Gaussian filter. The pre-processing means 103d compares an image width and an image height of the blurred reference fundus image based on Equation 1.
Image width > 1.2(image height)— Equation 1
[0071] The pre-processing means 103d calculates a maximum pixel value of a left half, a maximum pixel value of a right half and a maximum background pixel value for the blurred reference fundus image when the image width and the image height of the blurred identified retina satisfy Equation 1. The maximum background pixel value (Max_background pixel value) is given by the below Equation 2. The term ‘max_pixel_left’ in Equation 2 is the maximum pixel value of the left half of the blurred identified retina. The term ‘max_pixel_right’ in Equation 2 is the maximum pixel value of the right half of the blurred reference fundus image.
Max_background pixel value = max (max_pixel_left, max_pixel_right)— Equation 2
[0072] The pre-processing means 103d further extracts foreground pixel values from the blurred reference fundus image by considering pixel values which satisfy the below Equation 3.
All pixel values > max_background_pixel_value + 10— Equation 3
[0073] The pre-processing means 103d calculates a bounding box using the extracted foreground pixel values from the blurred reference fundus image. The pre-processing means 103d processes the bounding box to obtain a resized image using cubic interpolation of shape, for example, [256, 256, 3]. The reference fundus image at this stage is referred to as the pre-processed reference fundus image. The pre-processing means 103d stores the pre-processed reference fundus images in a pre-processed first reference dataset. The ground-truth file associated with the first reference dataset holds good for the pre-processed first reference dataset as well. The pre-processing means 103d stores the pre-processed first reference dataset in the database 104a.
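The cropping rule of Equations 1 to 3 and the final cubic-interpolation resize may be sketched as follows, assuming OpenCV and NumPy. The Gaussian kernel size and the treatment of the ‘left half’ and ‘right half’ as border strips are assumptions, since the specification does not fix these details:

```python
import cv2
import numpy as np


def crop_and_resize(fundus_rgb: np.ndarray, margin: float = 0.1) -> np.ndarray:
    """Crop the retina using the rule of Equations 1-3, then resize to 256 x 256."""
    gray = cv2.cvtColor(fundus_rgb, cv2.COLOR_RGB2GRAY)
    blurred = cv2.GaussianBlur(gray, (21, 21), 0)        # Gaussian blur (kernel size assumed)
    h, w = blurred.shape

    if w > 1.2 * h:                                       # Equation 1
        strip = max(1, int(margin * w))                   # border-strip interpretation (assumed)
        max_pixel_left = int(blurred[:, :strip].max())
        max_pixel_right = int(blurred[:, -strip:].max())
        max_background = max(max_pixel_left, max_pixel_right)        # Equation 2
        fg_rows, fg_cols = np.where(blurred > max_background + 10)   # Equation 3
        if fg_rows.size:
            y0, y1 = fg_rows.min(), fg_rows.max() + 1
            x0, x1 = fg_cols.min(), fg_cols.max() + 1
            fundus_rgb = fundus_rgb[y0:y1, x0:x1]         # bounding box of the foreground

    # Resize the cropped image to the pre-processed shape [256, 256, 3].
    return cv2.resize(fundus_rgb, (256, 256), interpolation=cv2.INTER_CUBIC)
```

Segregation of the first reference dataset: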
[0074] The fundus image analysis application 103 further comprises a segregation means 103e. The segregation means 103e splits the pre-processed first reference dataset into two sets - a training set and a validation set. Hereafter, the pre-processed reference fundus images in the training set are termed training fundus images and the pre-processed reference fundus images in the validation set are termed validation fundus images for simplicity. The training set is used to train the convolutional neural network (the first convolutional neural network and the second convolutional neural network) to assess the reference fundus images based on the label associated with each of the reference fundus images. The validation set is typically used to test the accuracy of the convolutional neural network.
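A minimal split of the pre-processed first reference dataset might look as follows; the 80/20 ratio and the fixed seed are illustrative choices, not requirements of the specification:

```python
import random


def split_dataset(images, train_fraction=0.8, seed=0):
    """Split pre-processed reference fundus images into a training set and a validation set."""
    shuffled = images[:]                          # copy so the stored dataset is untouched
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]         # (training set, validation set)
```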
Augmentation of the reference fundus images:
[0075] The fundus image analysis application 103 further comprises an augmentation means 103f. The augmentation means 103f augments the reference fundus images in the training set. The augmentation means 103f performs the following steps for the augmentation of the training set. The augmentation means 103f randomly shuffles the reference fundus images to divide the training set into a plurality of batches. Each batch is a collection of a predefined number of reference fundus images. The augmentation means 103f randomly samples each batch of reference fundus images. The augmentation means 103f processes each batch of the reference fundus images using affine transformations. The augmentation means 103f translates and rotates the reference fundus images in the batch randomly based on a coin flip analogy. The augmentation means 103f also adjusts the color and brightness of each of the reference fundus images in the batch randomly based on the results of the coin flip analogy.
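The batching and coin-flip augmentation described above may be sketched as follows, assuming NumPy and OpenCV; the translation, rotation and brightness ranges are illustrative values only:

```python
import random
import cv2
import numpy as np


def make_batches(images, batch_size):
    """Randomly shuffle the training set and divide it into batches."""
    shuffled = images[:]
    random.shuffle(shuffled)
    return [shuffled[i:i + batch_size] for i in range(0, len(shuffled), batch_size)]


def augment_batch(batch, seed=None):
    """Randomly translate/rotate and adjust brightness, each decision by a coin flip."""
    rng = random.Random(seed)
    out = []
    for img in batch:
        h, w = img.shape[:2]
        if rng.random() < 0.5:                               # coin flip: affine transformation
            angle = rng.uniform(-15, 15)
            tx, ty = rng.uniform(-10, 10), rng.uniform(-10, 10)
            m = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
            m[:, 2] += (tx, ty)                              # add random translation
            img = cv2.warpAffine(img, m, (w, h))
        if rng.random() < 0.5:                               # coin flip: brightness adjustment
            img = np.clip(img.astype(np.float32) * rng.uniform(0.8, 1.2), 0, 255)
            img = img.astype(np.uint8)
        out.append(img)
    return out
```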
General arrangement of the convolutional neural network:
[0076] The general arrangement of the convolutional neural network is as follows. The convolutional neural network comprising ‘n’ convolutional stacks applies a convolution operation to the input and passes an intermediate result to a next layer. Each convolutional stack comprises a plurality of convolutional layers. A first convolution stack is configured to convolve pixels from an input with a plurality of filters to generate a first indicator map. The first convolutional stack also comprises a first subsampling layer configured to reduce a size and variation of the first indicator map. The first convolutional layer of the first convolutional stack is configured to convolve pixels from the input with a plurality of filters. The first convolutional stack passes an intermediate result to the next layer. Similarly, each convolutional stack comprises a sub-sampling layer configured to reduce a size (width and height) of the indicator map. The input is analyzed based on reference data to provide a corresponding output.
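One possible realisation of such a convolutional stack, written in PyTorch, is shown below; the specification does not name a framework, so the framework, the layer count and the channel widths are assumptions:

```python
import torch.nn as nn


def conv_stack(in_channels: int, out_channels: int, num_convs: int) -> nn.Sequential:
    """One convolutional stack: `num_convs` 3x3 convolutions followed by a 2x2
    subsampling (max-pooling) layer that halves the width and height."""
    layers = []
    for i in range(num_convs):
        layers += [nn.Conv2d(in_channels if i == 0 else out_channels,
                             out_channels, kernel_size=3, padding=1),
                   nn.ReLU(inplace=True)]
    layers.append(nn.MaxPool2d(kernel_size=2))
    return nn.Sequential(*layers)
```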
Training and validation of the first convolutional neural network:
[0077] The first analysis means 103h and the second analysis means 103i train the first convolutional neural network and the second convolutional neural network respectively using the batches of augmented reference fundus images. The segregation means 103e groups the validation fundus images of the validation set into a plurality of batches. Each batch comprises multiple validation fundus images.
[0078] The first analysis means 103h validates each of the validation fundus images in each batch of the validation set using the first convolutional neural network. The first analysis means 103h compares a result of the validation against a corresponding label of the validation fundus image by referring to the reference ground-truth file. The first analysis means 103h thus evaluates a convolutional network performance of the first convolutional neural network for the batch of the validation set. Here, the convolutional network performance of the first convolutional neural network refers to the detection of the initial quality level for each of the reference fundus images.
[0079] The first analysis means 103h optimizes the first convolutional neural network parameters using an optimizer, for example, a Nadam optimizer which is an Adam optimizer with Nesterov Momentum. The optimizer iteratively optimizes the parameters of the convolutional neural network during multiple iterations using the training set. Here, each iteration refers to a batch of the training set. The first analysis means 103h evaluates the convolutional network performance of the first convolutional neural network after a predefined number of iterations on the validation set. Here, each iteration refers to a batch of the validation set.
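The optimisation and periodic validation described above could be organised as in the following PyTorch sketch; torch.optim.NAdam (Adam with Nesterov momentum) is available in recent PyTorch releases, and the learning rate, batch format and evaluation interval are illustrative:

```python
import torch
import torch.nn as nn


def train_and_validate(model, train_batches, val_batches,
                       epochs: int = 10, eval_every: int = 50):
    """Optimise with NAdam and periodically evaluate on the validation set."""
    optimizer = torch.optim.NAdam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    step = 0
    for _ in range(epochs):
        for images, labels in train_batches:          # each iteration = one training batch
            model.train()
            optimizer.zero_grad()
            loss = loss_fn(model(images), labels)
            loss.backward()
            optimizer.step()
            step += 1
            if step % eval_every == 0:                # evaluate after a predefined number of iterations
                model.eval()
                correct = total = 0
                with torch.no_grad():
                    for v_images, v_labels in val_batches:   # each iteration = one validation batch
                        preds = model(v_images).argmax(dim=1)
                        correct += (preds == v_labels).sum().item()
                        total += v_labels.numel()
                print(f"step {step}: validation accuracy {correct / total:.3f}")
```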
[0080] Thus, the first analysis means 103h trains the first convolutional neural network based on the augmented training set and tests the convolutional network based on the validation set. Upon completion of training and validation of the first convolution neural network based on the convolutional network performance, the first analysis means 103h is ready to assess the quality level of the input fundus image.
Training and validation of the second convolutional neural network:
[0081] The second analysis means 103i analyzes the fundus image using the second convolutional neural network by considering the determined final quality level based on the user selection criterion and does not consider the parameters of the image capturing device for analysis. The second analysis means 103i validates each of the validation fundus images in each batch of the validation set using the second convolutional neural network. The second analysis means 103i compares a result of the validation against a corresponding label of the validation fundus image by referring to the reference ground-truth file. The second analysis means 103i thus evaluates a convolutional network performance of the second convolutional neural network for the batch of the validation set. Here, the convolutional network performance of the second convolutional neural network refers to the identification of the indicators throughout the reference fundus image and detection of the state of the retinal disease based on the identified indicators.
[0082] The second analysis means 103i optimizes the second convolutional neural network parameters using an optimizer, for example, a Nadam optimizer which is an Adam optimizer with Nesterov Momentum. The optimizer iteratively optimizes the parameters of the second convolutional neural network during multiple iterations using the training set. Here, each iteration refers to a batch of the training set. The second analysis means 103i evaluates the convolutional network performance of the second convolutional neural network after a predefined number of iterations on the validation set. Here, each iteration refers to a batch of the validation set.
[0083] Thus, the second analysis means 103i trains the second convolutional neural network based on the augmented training set and tests the second convolutional neural network based on the validation set. Upon completion of training and validation of the second convolutional neural network based on the convolutional network performance, the second analysis means 103i is ready to detect the state of the retinal disease based on the identified indicators.
Test-time augmentation of the input fundus image:
[0084] The reception means 103a receives the input fundus image from, for example, the image capturing device. The pre-processing means 103d pre-processes the input fundus image similar to that of the reference fundus image. The fundus image analysis application 103 further comprises a test-time augmentation means 103g to test-time augment the preprocessed input fundus image. The test-time augmentation means 103g converts the preprocessed input fundus image into a plurality of test time images, for example, twenty test time images, using deterministic augmentation. The test-time augmentation means 103g follows the same steps to augment the input fundus image as that of the reference fundus image, except that the augmentations are deterministic. Thus, the test-time augmentation means 103g generates deterministically augmented twenty test time images of the preprocessed input fundus image.
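Deterministic test-time augmentation with averaged predictions may be sketched as follows; the specification does not list the twenty transformations, so the fixed rotations and flips below are illustrative, as is the callable `predict`:

```python
import numpy as np


def test_time_predict(predict, image: np.ndarray, num_views: int = 20) -> np.ndarray:
    """Average predictions over deterministic augmentations of one input image.

    `predict` maps an H x W x 3 image to a probability vector.  The twenty
    views (fixed 90-degree rotations cycled with horizontal flips) are one
    possible deterministic scheme, not the one mandated by the specification.
    """
    views = []
    for i in range(num_views):
        view = np.rot90(image, k=i % 4)            # deterministic rotation
        if (i // 4) % 2:
            view = view[:, ::-1]                   # deterministic horizontal flip
        views.append(view)
    probs = np.stack([predict(v) for v in views])
    return probs.mean(axis=0)                      # final prediction result
```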
[0085] Based on the user selection criterion, the test-time augmentation means 103g transmits the deterministically augmented twenty test time images to either the first analysis means 103h or the second analysis means 103i.
Evaluate quality level of the input fundus image:
[0086] The test-time augmentation means 103g transmits the deterministically augmented twenty test time images to the first analysis means 103h when the user selection criterion is to consider the quality level of the fundus image before the analysis of the fundus image. The first analysis means 103h processes the deterministically augmented twenty test time images of the preprocessed input fundus image using the first convolutional neural network comprising‘n’ convolutional stacks. The predicted probabilities of the twenty test time images are averaged over to get a final prediction result. The final prediction result provides a probability value for each grade (for example, good and bad) of quality level associated with the input fundus image. The probability value is an indication of a confidence denoting the quality level of the input fundus image. The output indicates the quality level associated with the input fundus image.
[0087] Figure 2 exemplarily illustrates the first convolutional neural network to compute the quality level of the input fundus image. The deterministically augmented twenty test time images of the preprocessed input fundus image are the input to a first convolutional stack (CS1) of the first convolutional neural network. Each of the deterministically augmented twenty test time images of the preprocessed input fundus image is processed by the first convolutional neural network. The deterministically augmented test time image is, for example, represented as a matrix of width 224 pixels and height 224 pixels with ‘3’ channels. That is, the deterministically augmented test time image is a representative array of pixel values of 224 x 224 x 3. The first convolution stack (CS1) is configured to convolve pixels from the deterministically augmented test time image with a filter to generate a first feature map. The first convolutional stack (CS1) also comprises a first subsampling layer configured to reduce a size and variation of the first feature map. The output of the first convolutional stack (CS1) is a reduced input fundus image represented as a matrix of width 64 pixels and height 64 pixels with n1 channels. That is, the output is a representative array of pixel values 64 x 64 x n1. This is the input to a second convolutional stack (CS2), which again convolves the representative array of pixel values 64 x 64 x n1 to generate a second feature map. The second convolutional stack (CS2) comprises a second subsampling layer configured to reduce a size and variation of the second feature map to a representative array of pixel values of 16 x 16 x n2, n2 being the number of channels. The representative array of pixel values of 16 x 16 x n2 is an input to a third convolutional stack (CS3). The third convolutional stack (CS3) convolves the representative array of pixel values 16 x 16 x n2 to generate a third feature map. The third convolutional stack (CS3) comprises a third subsampling layer configured to reduce a size and variation of the third feature map to a representative array of pixel values of 8 x 8 x n3, n3 representing the number of channels. A fourth convolutional stack (CS4) convolves the representative array of pixel values 8 x 8 x n3 to generate a fourth feature map. The fourth convolutional stack (CS4) comprises a fourth subsampling layer configured to reduce a size and variation of the fourth feature map. A probability block (P) provides a probability of the quality level associated with the input fundus image. The predicted probabilities of the twenty test time images are averaged to obtain a final prediction result. The final prediction result is the probability of the initial quality level of the input fundus image, which comprises two values within a range [0, 1] indicating the gradable quality measure - a ‘goodness’ and a ‘badness’ of the input fundus image.
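The stack-by-stack reductions described above (224 x 224 x 3 to 64 x 64 x n1, 16 x 16 x n2, 8 x 8 x n3, then a two-value probability block) could be realised as in the following PyTorch sketch; the channel widths, the adaptive pooling used to reach the stated sizes and the global pooling before the probability block are assumptions:

```python
import torch
import torch.nn as nn


class QualityNet(nn.Module):
    """Sketch of the first (quality) network with four convolutional stacks
    and a probability block over the two quality grades ('good', 'bad')."""

    def __init__(self, n1=32, n2=64, n3=128, n4=256):
        super().__init__()

        def stack(cin, cout, out_size):
            return nn.Sequential(
                nn.Conv2d(cin, cout, kernel_size=3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(cout, cout, kernel_size=3, padding=1), nn.ReLU(inplace=True),
                nn.AdaptiveMaxPool2d(out_size))           # subsampling layer

        self.cs1 = stack(3,  n1, 64)   # 224 x 224 x 3 -> 64 x 64 x n1
        self.cs2 = stack(n1, n2, 16)   # 64 x 64 x n1  -> 16 x 16 x n2
        self.cs3 = stack(n2, n3, 8)    # 16 x 16 x n2  -> 8 x 8 x n3
        self.cs4 = stack(n3, n4, 1)    # 8 x 8 x n3    -> 1 x 1 x n4 (global reduction, assumed)
        self.prob = nn.Sequential(nn.Flatten(), nn.Linear(n4, 2), nn.Softmax(dim=1))

    def forward(self, x):              # x: batch x 3 x 224 x 224
        return self.prob(self.cs4(self.cs3(self.cs2(self.cs1(x)))))
```

For instance, `QualityNet()(torch.randn(1, 3, 224, 224))` would return two values in [0, 1] corresponding to the ‘goodness’ and ‘badness’ probabilities.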
[0088] The final quality level determination means considers the detected initial quality level, the user defined quality threshold and the parameters of the image capturing device to determine the final quality level of the fundus image.
[0089] The user defined threshold is the user defined parameter to vary the quality level of the input fundus image and is provided to increase the flexibility of the system 100. The user defined threshold is the variable factor which may be used to vary the quality level of the input fundus image to conveniently suit the requirements of the user, for example, a medical practitioner. The user defined threshold is a numeric value within the range of [0, 1]. Here, 0 defines the least value and 1 defines the highest value of the user defined threshold.
[0090] The parameters of the image capturing device are, for example, a manufacturer and version of the image capturing device, a resolution, an illumination factor, a field of view etc. The final quality level determination means determines a predefined score for the image capturing device based on the parameters of image capturing device. This predefined score for the image capturing device characteristics is used to assess the quality of the input fundus image. The predefined score for the image capturing device denotes a superiority of the image capturing device. The predefined score for the image capturing device is a numeric value within the range of [0, 1]. Here, 0 defines a least value and 1 defines a highest value of the predefined score for the image capturing device.
[0091] For example, the predefined score for the image capturing device for multiple manufacturers of image capturing device is initially stored in the database 104a by the user of the fundus image analysis application 103. By considering the image capturing device to assess the quality of the input fundus image, the flexibility of the system 100 is increased, thereby providing customized results for the input fundus image captured using the image capturing device of multiple manufacturers.
[0092] Thus, the first analysis means 103h assesses the final quality level of the input fundus image based on the factors - the probability values provided by the first convolutional neural network, the user defined threshold and the parameters of the image capturing device.
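The specification states that the final quality level combines the network's probability, the user defined threshold and the predefined device score, but does not give a formula; the weighting in the following Python sketch is therefore purely illustrative:

```python
def final_quality_level(good_probability: float,
                        user_threshold: float,
                        device_score: float) -> str:
    """Combine the three factors into a final 'good'/'bad' decision.

    good_probability: output of the first convolutional neural network, in [0, 1].
    user_threshold:   user defined quality threshold, in [0, 1].
    device_score:     predefined score for the image capturing device, in [0, 1].
    The blending below is an assumption; the specification only states that
    all three factors are considered.
    """
    adjusted = good_probability * (0.5 + 0.5 * device_score)   # illustrative weighting
    return "good" if adjusted >= user_threshold else "bad"
```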
Analysis of the input fundus image:
[0093] The test-time augmentation means 103g transmits the deterministically augmented twenty test time images to the second analysis means 103i when the user selection criterion is to directly analyze the input fundus image to identify the indicators throughout the fundus image; and to detect the state of the retinal disease based on the identified indicators. The second analysis means 103i processes the deterministically augmented twenty test time images of the preprocessed input fundus image using the second convolutional neural network comprising ‘m’ convolutional stacks. The predicted probabilities of the twenty test time images are averaged to obtain a final prediction result. The final prediction result provides a probability value for each of the retinal diseases and a corresponding retinal disease severity level associated with the input fundus image. The probability value is an indication of a confidence that the identified indicators are of a particular retinal disease and a corresponding severity of the retinal disease. The output indicates a presence or absence of a retinal disease and related severity of the retinal disease associated with the input fundus image.
[0094] Figure 3 exemplarily illustrates the second convolutional neural network to compute the presence or absence of a retinal disease and related severity of the retinal disease associated with the input fundus image. The deterministically augmented twenty test time images of the preprocessed input fundus image are the input to a first convolutional stack (CS1) of the convolutional network. Each of the deterministically augmented twenty test time images is processed by the convolutional network.
[0095] The deterministically augmented test time image is, for example, represented as a matrix of width 448 pixels and height 448 pixels with ‘3’ channels. That is, the deterministically augmented test time image is a representative array of pixel values of 448 x 448 x 3. The input to the first convolutional stack (CS1) is a color image of size 448 x 448. The first convolution stack (CS1) comprises the following sublayers - a first convolutional layer, a first subsampling layer, a second convolutional layer, a third convolutional layer and a second subsampling layer in the same order. The output of a sublayer is an input to a consecutive sublayer. In general, a subsampling layer is configured to reduce a size and variation of its input and a convolutional layer convolves its input with a plurality of filters, for example, filters of size 3x3. The output of the first convolutional stack (CS1) is a reduced image represented as a matrix of width 112 pixels and height 112 pixels with m1 channels. That is, the output of the first convolutional stack (CS1) is a representative array of pixel values 112 x 112 x m1.
[0096] This is the input to a second convolutional stack (CS2). The second convolutional stack (CS2) comprises the following sublayers - four convolutional layers and a subsampling layer arranged in the same order. Again, the output of a sublayer is an input to a consecutive sublayer. The second convolutional stack (CS2) convolves the representative array of pixel values 112 x 112 x m1 and reduces it to a representative array of pixel values of 56 x 56 x m2. The representative array of pixel values of 56 x 56 x m2 is an input to a third convolutional stack (CS3).
[0097] The third convolutional stack (CS3) comprises the following sublayers - four convolutional layers and a subsampling layer arranged in the same order. Again, the output of a sublayer is an input to a consecutive sublayer. The third convolutional stack (CS3) convolves the representative array of pixel values 56 x 56 x m2 and reduces it to a representative array of pixel values of 28 x 28 x m3. The representative array of pixel values of 28 x 28 x m3 is an input to a fourth convolutional stack (CS4).
[0098] The fourth convolutional stack (CS4) comprises the following sublayers - four convolutional layers and a subsampling layer arranged in the same order. Again, the output of a sublayer is an input to a consecutive sublayer. The fourth convolutional stack (CS4) convolves the representative array of pixel values 28 x 28 x m3 and reduces it to a representative array of pixel values of 14 x 14 x m4. The representative array of pixel values of 14 x 14 x m4 is an input to a fifth convolutional stack (CS5).
[0099] The fifth convolutional stack (CS5) comprises the following sublayers - four convolutional layers and a subsampling layer, arranged in that order. Again, the output of a sublayer is an input to the consecutive sublayer. The fifth convolutional stack (CS5) convolves the representative array of pixel values 14 x 14 x m4 and reduces it to a representative array of pixel values of 7 x 7 x m5. The representative array of pixel values of 7 x 7 x m5 is a first input to a concatenation block (C).
[0100] The output of the third convolutional stack (CS3) is an input to a first subsampling block (SS1). The representative array of pixel values of 28 x 28 x m3 is the input to the first subsampling block (SS1). The first subsampling block (SS1) reduces the input with a stride of 4 to obtain an output of a representative array of pixel values of 7 x 7 x m3. This is a second input to the concatenation block (C).
[0101] The output of the fourth convolutional stack (CS4) is an input to a second subsampling block (SS2). The representative array of pixel values of 14 x 14 x m4 is the input to the second subsampling block (SS2). The second subsampling block (SS2) reduces the input with a stride of 2 to obtain an output of a representative array of pixel values of 7 x 7 x m4. This is a third input to the concatenation block (C).
[0102] The concatenation block (C) receives the first input from the fifth convolutional stack (CS5), the second input from the first subsampling block (SS1) and the third input from the second subsampling block (SS2). The concatenation block (C) concatenates the three inputs to generate an output of value 7 x 7 x (m5 + m4 + m3). The output of the concatenation block (C) is an input to a probability block (P).
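The stack, skip and concatenation topology of paragraphs [0095] to [0102] can be sketched as follows. This is an illustrative reconstruction only: the channel widths m1 to m5, the use of max pooling for the subsampling layers and the global-average-pooling classifier head are assumptions, since the disclosure leaves them unspecified. The sketch returns raw scores; the softmax of the probability block is applied outside the module, as in the averaging sketch in paragraph [0093] above.

```python
import torch
import torch.nn as nn

def conv(in_ch, out_ch):
    # 3x3 convolution with padding, so the spatial size is preserved
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
                         nn.ReLU(inplace=True))

class SecondNetwork(nn.Module):
    """Sketch of the stack/skip/concatenate topology of paragraphs [0095]-[0103].
    Channel widths m1..m5 are placeholders; subsampling is modelled as max pooling."""
    def __init__(self, m1=32, m2=64, m3=128, m4=256, m5=512, num_classes=5):
        super().__init__()
        # CS1: conv, subsample, conv, conv, subsample  (448 -> 112)
        self.cs1 = nn.Sequential(conv(3, m1), nn.MaxPool2d(2),
                                 conv(m1, m1), conv(m1, m1), nn.MaxPool2d(2))
        # CS2-CS5: four convolutional layers followed by one subsampling layer each
        self.cs2 = nn.Sequential(conv(m1, m2), conv(m2, m2), conv(m2, m2), conv(m2, m2),
                                 nn.MaxPool2d(2))   # 112 -> 56
        self.cs3 = nn.Sequential(conv(m2, m3), conv(m3, m3), conv(m3, m3), conv(m3, m3),
                                 nn.MaxPool2d(2))   # 56 -> 28
        self.cs4 = nn.Sequential(conv(m3, m4), conv(m4, m4), conv(m4, m4), conv(m4, m4),
                                 nn.MaxPool2d(2))   # 28 -> 14
        self.cs5 = nn.Sequential(conv(m4, m5), conv(m5, m5), conv(m5, m5), conv(m5, m5),
                                 nn.MaxPool2d(2))   # 14 -> 7
        self.ss1 = nn.MaxPool2d(kernel_size=4, stride=4)  # SS1: 28 -> 7 (stride 4)
        self.ss2 = nn.MaxPool2d(kernel_size=2, stride=2)  # SS2: 14 -> 7 (stride 2)
        # Probability block P (softmax applied outside the module at test time)
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(m5 + m4 + m3, num_classes))

    def forward(self, x):                      # x: (N, 3, 448, 448)
        x = self.cs2(self.cs1(x))              # (N, m2, 56, 56)
        c3 = self.cs3(x)                       # (N, m3, 28, 28)
        c4 = self.cs4(c3)                      # (N, m4, 14, 14)
        c5 = self.cs5(c4)                      # (N, m5, 7, 7)
        fused = torch.cat([c5, self.ss1(c3), self.ss2(c4)], dim=1)  # 7 x 7 x (m5+m3+m4)
        return self.head(fused)                # raw scores for the five DR levels
```

The two subsampling blocks let mid-resolution feature maps from CS3 and CS4 bypass the later stacks, so the concatenated 7 x 7 representation combines coarse and fine features before classification.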
[0103] The probability block (P) provides a probability of the presence or absence of the retinal disease and the related severity of the retinal disease. The predicted probabilities of the twenty test time images are averaged to obtain a final prediction result. The output of the convolutional network provides a probability value for each retinal disease and a corresponding retinal disease severity level associated with the input fundus image. The probability block (P), as shown in Figure 3, provides five values by considering the retinal disease to be DR. The output of the probability block comprises five values depicting the probability of each DR severity level - DR0 (no DR), DR1 (mild DR level), DR2 (moderate DR level), DR3 (severe DR level) and DR4 (proliferative DR level).
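For illustration, the five averaged values can be mapped to a reported severity level as follows; the label strings and the argmax rule are assumptions made for the sketch, since the disclosure only states that five probabilities are produced.

```python
DR_LEVELS = ["DR0 (no DR)", "DR1 (mild DR)", "DR2 (moderate DR)",
             "DR3 (severe DR)", "DR4 (proliferative DR)"]

def severity_from_probabilities(final_probs):
    # final_probs: the five averaged probabilities from the probability block (P)
    level = int(final_probs.argmax())
    return DR_LEVELS[level], float(final_probs[level])  # label and its confidence
```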
[0104] Based on the user selection criterion, the GUI 103k displays the output of either the second analysis means 103i alone or the outputs of both the first analysis means 103h and the second analysis means 103i. That is, the GUI 103k displays the presence or absence of a retinal disease and/or the related severity of the retinal disease associated with the input fundus image when the user selection criterion does not involve the detection of the quality level of the input fundus image. When the user selection criterion involves the detection of the quality level of the input fundus image before the analysis of the fundus image, the GUI 103k displays the quality level of the input fundus image along with the presence or absence of a retinal disease and/or the related severity of the retinal disease associated with the input fundus image.

[0105] For example, suitable suggestions with a set of instructions to the user may also be provided via a pop-up box displayed on a screen. The fundus image analysis application 103 may also generate a report comprising the input fundus image, the type of the retinal disease and the severity of the retinal disease, and communicate the report to the patient via electronic mail. The report may also be stored in the database 104a of the system 100.
[0106] In another embodiment, the system 100 detects the presence of several diseases, for example, diabetes, stroke, hypertension, cardiovascular diseases, etc., not limited to retinal diseases, based on changes in the retinal features. The second analysis means 103i trains the second convolutional neural network to identify and classify the severity of these diseases in the fundus image of the patient.
Transfer learning:
[0107] The second analysis means 103i is initially adapted to generate the second label for the fundus image using the second convolutional neural network. The second label is a state of a retinal disease. That is, the second analysis means 103i initially trains the second convolutional neural network to analyze the fundus image by considering the determined final quality level based on the user selection criterion. The second analysis means 103i is trained to analyze the fundus image to identify the indicators throughout the fundus image and to detect the state of the retinal disease based on the identified indicators.
[0108] The second analysis means 103i makes use of the knowledge gained from analyzing the fundus image and applies it to a different but related problem, that is, analyzing the fundus image specific to the parameters of the image capturing device used to capture the fundus image of the patient. The second analysis means 103i extracts the second label associated with the fundus image and transfers it to the third convolutional neural network to generate the third label for the fundus image. Instead of training the third convolutional neural network from scratch, the second analysis means 103i "transfers" the learned second label associated with the fundus image to obtain customized results for the fundus image depending on the parameters of the image capturing devices. In other words, the third label for the fundus image provides a customized analysis of the fundus image stating the state of the retinal disease for a specific manufacturer and/or version of the image capturing device, for example, a fundus camera. This way, the system 100 can be easily customized to analyse the fundus images captured using different image capturing devices.
[0109] For example, the system 100 defines the process of transfer learning, that is, transferring the knowledge learned from generic fundus image analysis to analysis of fundus images specific to manufacturers and/or version of the image capturing device used to capture the fundus images. The system 100 transfers the knowledge learned to generate the second label for the fundus image by trickling high level information down to train the third convolutional neural network to generate the third label for the fundus image.
[0110] The second analysis means 103i refers to a secondary reference dataset to train and validate the third convolutional neural network. The secondary reference dataset is different from the first reference dataset. The secondary reference dataset also comprises a plurality of fundus images but specific to a set of parameters of the image capturing device. For example, the secondary reference dataset comprises the fundus images captured using the image capturing device of a specific manufacturer and version.
[0111] Thus, the second analysis means 103i trains and validates the third convolutional neural network using the secondary reference dataset, but the parameters of the third convolutional neural network are initialized from the second convolutional neural network previously trained using the first reference dataset. Upon completion of training and validation, the third convolutional neural network is ready to detect the state of the retinal disease based on the identified indicators for a specific set of parameters of the image capturing device. This process of transfer learning significantly increases the performance of the third convolutional neural network and provides customized results for the fundus images captured using different image capturing devices.
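A minimal sketch of this initialization and fine-tuning step is shown below, assuming second_network is the trained second convolutional neural network and device_specific_loader iterates over the secondary reference dataset; both names, the optimizer and the loss function are assumptions, not details fixed by the disclosure.

```python
import copy
import torch

# Initialise the third network from the trained second network instead of from scratch.
third_network = copy.deepcopy(second_network)

optimizer = torch.optim.SGD(third_network.parameters(), lr=1e-3, momentum=0.9)
loss_fn = torch.nn.CrossEntropyLoss()

third_network.train()
for images, labels in device_specific_loader:      # fundus images from one camera model
    optimizer.zero_grad()
    logits = third_network(images)                 # raw scores for the severity levels
    loss = loss_fn(logits, labels)                 # labels: device-specific disease states
    loss.backward()
    optimizer.step()                               # fine-tune on the secondary dataset
```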
[0112] For example, the fundus image analysis application 103 provides suitable interactive elements 103j, such as a drop down menu, a button, etc., to select a set of parameters of the image capturing device, such as a specific manufacturer of the image capturing device, while capturing the fundus image of the patient. The fundus image analysis application 103 analyses the fundus image based on the selection of the set of parameters of the image capturing device. When a generic analysis of the fundus image is desired by the user (without consideration of the set of parameters of the image capturing device), an appropriate interactive element 103j is provided by the fundus image analysis application 103. In another example, the user can upload an existing fundus image of the patient for analysis as shown in Figure 13. In this case, an option is also provided to upload information regarding the set of parameters of the image capturing device. When the information regarding the set of parameters of the image capturing device is uploaded by the user, the system 100 provides a specific analysis of the fundus image corresponding to the set of parameters of the image capturing device. If no information regarding the set of parameters of the image capturing device is uploaded by the user, the system 100 provides a generic analysis of the fundus image.
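One possible routing of an uploaded image is sketched below, under the assumption that device-specific (third) networks are kept in a lookup keyed by the selected camera parameters; the function and variable names are hypothetical.

```python
def select_network(device_params, device_specific_networks, generic_network):
    """Return the network used to analyse an uploaded fundus image.

    device_params: manufacturer/version information supplied by the user, or None.
    device_specific_networks: mapping from device parameters to fine-tuned third networks.
    generic_network: the second convolutional neural network trained on the first dataset.
    """
    if device_params and device_params in device_specific_networks:
        return device_specific_networks[device_params]  # customized analysis
    return generic_network                              # generic analysis
```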
[0113] Figure 4 exemplarily illustrates the architecture of a computer system 400 employed by the fundus image analysis application 103. The fundus image analysis application 103 of the computer implemented system 100 exemplarily illustrated in Figure 1 employs the architecture of the computer system 400 exemplarily illustrated in Figure 4. The computer system 400 is programmable using a high level computer programming language. The computer system 400 may be implemented using programmed and purposeful hardware.
[0114] The fundus image analysis platform hosting the fundus image analysis application 103 communicates with user devices, for example, 101a, 101b, 101c, etc., of a user registered with the fundus image analysis application 103 via the network 102. The network 102 is, for example, the internet, a local area network, a wide area network, a wired network, a wireless network, a mobile communication network, etc. The computer system 400 comprises, for example, a processor 401, a memory unit 402 for storing programs and data, an input/output (I/O) controller 403, a network interface 404, a data bus 405, a display unit 406, input devices 407, fixed disks 408, removable disks 409, output devices 410, etc.

[0115] As used herein, the term "processor" refers to any one or more central processing unit (CPU) devices, microprocessors, an application specific integrated circuit (ASIC), computers, microcontrollers, digital signal processors, logic, an electronic circuit, a field-programmable gate array (FPGA), etc., or any combination thereof, capable of executing computer programs or a series of commands, instructions, or state transitions. The processor 401 may also be realized as a processor set comprising, for example, a math or graphics co-processor and a general purpose microprocessor. The processor 401 is selected, for example, from the Intel® processors such as the Itanium® microprocessor or the Pentium® processors, Advanced Micro Devices (AMD®) processors such as the Athlon® processor, MicroSPARC® processors, UltraSPARC® processors, hp® processors, International Business Machines (IBM®) processors, the MIPS® reduced instruction set computer (RISC) processors of MIPS Technologies, Inc., RISC based computer processors of ARM Holdings, etc. The computer implemented system 100 disclosed herein is not limited to a computer system 400 employing a processor 401 but may also employ a controller or a microcontroller.
[0116] The memory unit 402 is used for storing data, programs, and applications. The memory unit 402 is, for example, a random access memory (RAM) or any type of dynamic storage device that stores information for execution by the processor 401. The memory unit 402 also stores temporary variables and other intermediate information used during execution of the instructions by the processor 401. The computer system 400 further comprises a read only memory (ROM) or another type of static storage device that stores static information and instructions for the processor 401.
[0117] The I/O controller 403 controls input actions and output actions performed by the fundus image analysis application 103. The network interface 404 enables connection of the computer system 400 to the network 102. For example, the fundus image analysis platform hosting the fundus image analysis application 103 connects to the network 102 via the network interface 404. The network interface 404 comprises, for example, one or more of a universal serial bus (USB) interface, a cable interface, an interface implementing Wi-Fi® of the Wireless Ethernet Compatibility Alliance, Inc., a FireWire® interface of Apple, Inc., an Ethernet interface, a digital subscriber line (DSL) interface, a token ring interface, a peripheral component interconnect (PCI) interface, a local area network (LAN) interface, a wide area network (WAN) interface, interfaces using serial protocols, interfaces using parallel protocols, Ethernet communication interfaces, asynchronous transfer mode (ATM) interfaces, interfaces based on transmission control protocol (TCP)/internet protocol (IP), radio frequency (RF) technology, etc. The data bus 405 permits communications between the means/modules (103a, 103b, 103c, 103d, 103e, 103f, 103g, 103h, 103i, 103j and 103k) of the fundus image analysis application 103.
[0118] The display unit 406, via the GUI 103k, displays information, display interfaces, and interactive elements 103j such as drop down menus, text fields, checkboxes, text boxes, floating windows, hyperlinks, etc., for example, for allowing the user to enter inputs associated with the patient. In an example, the display unit 406 comprises a liquid crystal display, a plasma display, etc. The input devices 407 are used for inputting data into the computer system 400. A user, for example, an operator, registered with the fundus image analysis application 103 uses one or more of the input devices 407 of the user devices, for example, 101a, 101b, 101c, etc., to provide inputs to the fundus image analysis application 103. For example, a user may enter a patient's profile information, the patient's medical history, etc., using the input devices 407. The input devices 407 are, for example, a keyboard such as an alphanumeric keyboard, a touch pad, a joystick, a computer mouse, a light pen, a physical button, a touch sensitive display device, a track ball, etc.
[0119] Computer applications and programs are used for operating the computer system 400. The programs are loaded onto the fixed disks 408 and into the memory unit 402 of the computer system 400 via the removable disks 409. In an embodiment, the computer applications and programs may be loaded directly via the network 102. The output devices 410 output the results of operations performed by the fundus image analysis application 103.
[0120] The processor 401 executes an operating system, for example, the Linux® operating system, the Unix® operating system, any version of the Microsoft® Windows® operating system, the Mac OS of Apple Inc., the IBM® OS/2, VxWorks® of Wind River Systems, Palm OS®, the Solaris operating system, the Android operating system, the Windows Phone™ operating system developed by Microsoft Corporation, the iOS operating system of Apple Inc., etc.

[0121] The computer system 400 employs the operating system for performing multiple tasks. The operating system is responsible for management and coordination of activities and sharing of resources of the computer system 400. The operating system employed on the computer system 400 recognizes, for example, inputs provided by the user using one of the input devices 407, the output display, and files and directories stored locally on the fixed disks 408. The operating system on the computer system 400 executes different programs using the processor 401. The processor 401 and the operating system together define a computer platform for which application programs in high level programming languages are written.
[0122] The processor 401 retrieves instructions for executing the modules (103a, 103b, 103c, 103d, 103e, 103f, 103g, 103h, 103i, 103j and 103k) of the fundus image analysis application 103 from the memory unit 402. A program counter determines the location of the instructions in the memory unit 402. The program counter stores a number that identifies the current position in the program of each of the modules (103a, 103b, 103c, 103d, 103e, 103f, 103g, 103h, 103i, 103j and 103k) of the fundus image analysis application 103. The instructions fetched by the processor 401 from the memory unit 402 are decoded and stored in an instruction register in the processor 401. After decoding, the processor 401 executes the instructions.
[0123] Figure 5 exemplarily illustrates a screenshot of the GUI 103k provided by the system 100, displaying a log-in screen of the system 100. The log-in screen comprises text boxes 501 and 502 that permit the user to enter a user name and password, respectively, and a button 503. The log-in process takes place upon selection of the button 503 using a user input device, for example, a keyboard, a mouse, etc. The user enters his or her credentials to log in to the system 100. Upon successful log-in, the menu screen 600 is displayed to the user of the system 100 as shown in Figure 6.
[0124] Figure 6 exemplarily illustrates a screenshot of the GUI 103k provided by the system 100, displaying the menu screen 600 of the system 100. The menu screen 600 provides flexibility to the user for navigating between the components accessed through it, that is, an add new patient screen 601, an existing patients screen 602 and a report screen 603.
[0125] Figure 7 exemplarily illustrates a screenshot of the GUI 103k provided by the system 100, displaying the add new patient screen 601 of the system 100. The information associated with a new patient, such as the personal details about the patient, the medical condition of the patient, etc., is recorded by the user of the system 100 using the add new patient screen 601.
[0126] Figure 8 exemplarily illustrates a screenshot of the GUI 103k provided by the system 100, displaying the existing patients screen 602 of the system 100. The patients list allows access to individual information in the form of a patient's profile via the "View Profile" option. The reports of a patient, comprising the patient's fundus images and the analysis details of those fundus images, can be accessed via the "View Report" option provided for each of the patients. In an example, an alphabetical list of the patients is presented by default. Each record includes a patient ID, a patient name, a patient email address, a patient mobile number, a patient age, a patient city and a date of creation of the patient's profile. A search option to search for a specific patient in the "Patient List" is also provided in the existing patients screen 602.
[0127] Figure 9 exemplarily illustrates a screenshot of the GUI 103k provided by the system 100, displaying the profile screen 900 of an existing patient of the system 100. The existing patient profile screen provides an "Edit Patientinfo" option to edit the information related to the existing patient. The existing patient profile screen also provides a "View Report" option to view previous reports of the existing patient. The existing patient profile screen provides a "View all Images" option to view the previously captured fundus images of the existing patient.
[0128] Figure 10 exemplarily illustrates a screenshot of the GUI 103k provided by the system 100, displaying the existing images 1000 of an existing patient of the system 100. When the user of the system 100 clicks on the "View all Images" option as shown in Figure 9, the previously captured fundus images of the existing patient, along with the state of the retinal disease, that is, DR, are displayed on the GUI 103k provided by the system 100.

[0129] Figure 11 exemplarily illustrates a screenshot of the GUI 103k provided by the system 100, displaying the fundus image rendering screen 1100 of the system 100. The live mode of the fundus of the patient is displayed in a box 1101 of the fundus image rendering screen 1100. The user has options to start 1102 and stop 1103 the display of the live mode of the fundus of the patient. The user also has an option to capture 1104 the fundus image during the display of the live mode of the fundus of the patient. The user can select the posterior or anterior of the eye along with the details of the eye - left eye or right eye - for the captured fundus image.
[0130] Figure 12 exemplarily illustrates a screenshot of the GUI 103k provided by the system 100, displaying the fundus image analysis screen 1200 of the system 100 after the fundus image of the patient is captured by the user. The fundus image analysis application 103 provides an option to either directly analyze the fundus image of the patient or determine the quality level of the fundus image of the patient before the analysis of the fundus image. Either or both of the fundus images (representing the left eye and/or the right eye of the patient) can be selected by the user for analysis. When the user selects the "Analyse" button, the second analysis means 103i of the fundus image analysis application 103 analyses the fundus images of the patient without considering the final quality level of the fundus image. The second analysis means 103i analyses each of the fundus images of the patient to determine the state of a retinal disease.
[0131] When the user selects the "Image quality and analyse" button, the second analysis means 103i of the fundus image analysis application 103 analyses the fundus images of the patient considering the final quality level of the fundus image. That is, the second analysis means 103i considers the output of the first analysis means 103h to either continue with or abort the analysis of the fundus image. The first analysis means 103h determines the final quality level of the fundus image and transmits this output to the second analysis means 103i. The second analysis means 103i analyses the fundus image to identify the indicators and determine the state of the retinal disease (in this case, diabetic retinopathy) when the output of the first analysis means 103h indicates that the final quality level of the fundus image is 'good'. The second analysis means 103i aborts the analysis when the output of the first analysis means 103h indicates that the final quality level of the fundus image is 'bad'.
[0132] Here, the "Analyse" and "Image Quality and analyse" buttons are the interactive elements 103j on the GUI 103k for enabling the analysis of the fundus image of the patient. The user selection criterion defines a selection process of the user which is realized by a clickable event of either the "Analyse" or the "Image Quality and analyse" button on the GUI 103k via a user input device such as a mouse, a trackball, a joystick, etc. When the user selects the "Analyse" option, the user in turn triggers the second analysis means 103i of the fundus image analysis application 103 to analyze the fundus image without the additional process of determining the final quality level of the fundus image before the analysis. When the user selects the "Image Quality and analyse" option, the user triggers the second analysis means 103i of the fundus image analysis application 103 to analyze the fundus image considering the final quality level of the fundus image determined by the first analysis means 103h.
[0133] Figure 13 exemplarily illustrates a screenshot of the GUI 103k provided by the system 100, displaying the fundus image upload screen of the system 100 used to upload the fundus image of the patient for analysis. In an embodiment, the fundus image of the patient is an already existing fundus image and can be uploaded by the user of the fundus image analysis application 103. The "Upload" button 1301 is the interactive element 103j on the GUI 103k using which the user can upload an existing fundus image of the patient. The existing fundus image may be located, for example, on the database 104a. For example, the user may upload the fundus image captured by the image capturing device of a particular manufacturer using the "Upload" button 1301 to analyse the fundus image for that particular manufacturer.
[0134] Figure 14 exemplarily illustrates a screenshot of the GUI 103k provided by the system 100, displaying the fundus image analysis screen 1200 of the system 100 when the user selects the "Image Quality and analyse" option. Consider that the first analysis means 103h detects that the "Right Eye" fundus image of the patient is not gradable and its final quality level is 'Bad', and that the "Left Eye" fundus image of the patient is gradable and its final quality level is 'Good'. The output of the first analysis means 103h is transmitted to the second analysis means 103i of the fundus image analysis application 103 to analyze the fundus images of the patient. Since the final quality level of the "Right Eye" fundus image of the patient is 'Bad', the second analysis means 103i aborts the analysis of the "Right Eye" fundus image of the patient. Now consider that the second analysis means 103i identifies that the "Left Eye" fundus image of the patient comprises indicators denoting a normal eye condition, that is, without DR. The output of the second analysis means 103i is displayed along with the fundus images of the patient on the fundus image analysis screen of the system 100. The "Right Eye" fundus image of the patient is indicated as "Bad Image" and the "Left Eye" fundus image of the patient is indicated with the output of the second analysis means 103i, that is, "No abnormalities found".
[0135] Figure 15 exemplarily illustrates a screenshot of the GUI 103k provided by the system 100, displaying the fundus image analysis screen 1200 of the system 100 when the user selects the "Analyse" option. The second analysis means 103i of the fundus image analysis application 103 analyzes the fundus images of the patient without considering the final quality level of the fundus images. Now consider that the second analysis means 103i identifies that the "Right Eye" fundus image of the patient comprises indicators denoting DR. Each of the fundus images of the patient is analyzed by the second analysis means 103i. The output of the second analysis means 103i is displayed along with the fundus images of the patient on the fundus image analysis screen of the system 100. The "Right Eye" fundus image of the patient is indicated with the output of the second analysis means 103i as "Doctor Review Recommended" and the "Left Eye" fundus image of the patient is indicated with the output of the second analysis means 103i, that is, "No abnormalities found".
[0136] Figure 16 exemplarily illustrates a screenshot of the GUI 103k provided by the system 100, displaying the report screen 603 of the system 100. The report summary screen displays the patient information and the details about the state of the retinal disease (diabetic retinopathy) denoted below each of the fundus images of the patient. Here, a plurality of fundus images for the left eye and the right eye of the patient are analyzed and displayed on the report summary screen. A "View PDF" option to view the report in a PDF version is provided at the top right of the report summary screen, along with a "Print" option to print the report and a "Send as mail" option to send the report of the patient as an attachment in an email.
[0137] Figure 17 illustrates a flowchart for analyzing the fundus image of the patient in accordance with the invention. At step S1, the fundus image analysis application 103 receives the fundus image of the patient. The non-transitory computer readable storage medium is configured to store the fundus image analysis application 103 and at least one processor is configured to execute the fundus image analysis application 103. The fundus image analysis application 103 is thus a part of the system 100 comprising the non-transitory computer readable storage medium communicatively coupled to the at least one processor. The fundus image analysis application 103 comprises the GUI 103k comprising multiple interactive elements 103j configured to enable capture and analysis of the fundus image via a user device 101a, 101b or 101c. The reception means 103a is adapted to receive the input from the image capturing device based on multiple parameters of the image capturing device. The input is the fundus image of the patient displayed in a live mode. In an embodiment, the fundus image analysis application 103 is a web application implemented on a web based platform, for example, a website hosted on a server or a setup of servers.
[0138] At step S2, the interactive fundus image rendering means 103b is adapted to dynamically render the input. The dynamically rendered input is configurably accessible on the GUI 103k via the user device 101a, 101b or 101c using the interactive elements 103j. At step S3, the fundus image capture means 103c is adapted to capture the fundus image based on the dynamically rendered input.
[0139] At step S4, the fundus image analysis application 103 receives the user selection criterion via the user device. The user selection criterion refers to the user's selection of either considering the quality level of the fundus image before analyzing the fundus image for detection of one or more retinal diseases in the fundus image (decision of S4 as YES) or analyzing the fundus image for detection of one or more retinal diseases in the fundus image without considering the quality level of the fundus image (decision of S4 as NO).

[0140] At step S5, the first analysis means 103h is configured to determine the quality level of the captured fundus image. The initial quality level detection means generates the first label for the fundus image using the first convolutional neural network. The first label is the initial quality level of the fundus image. At step S6, the final quality level determination means determines the final quality level of the fundus image based on the generated first label, the user defined quality threshold and the parameters of the image capturing device. The initial quality level of the fundus image refers to a quality level computed by the first convolutional neural network based on the training provided to the first convolutional neural network. The final quality level determination means considers the detected initial quality level along with the user defined quality threshold and the parameters of the image capturing device to determine the final quality level of the fundus image. The user defined quality threshold is a user defined parameter to vary the quality level of the fundus image. The user defined quality threshold is based on the user's confidence and ability to grade the fundus image.
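The exact rule for combining the first label with the user defined quality threshold and the device parameters is not given above, so the following sketch is only one possible interpretation: the first network's gradability confidence is compared against the user threshold adjusted by a hypothetical per-device margin (all names are assumptions).

```python
def final_quality_level(initial_quality_prob, user_threshold, device_margin=0.0):
    """Hypothetical combination rule for steps S5 and S6.

    initial_quality_prob: confidence from the first CNN that the image is gradable.
    user_threshold: user defined quality threshold (higher means stricter grading).
    device_margin: adjustment derived from the image capturing device parameters.
    """
    return "good" if initial_quality_prob >= user_threshold + device_margin else "bad"
```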
[0141] As used herein, the term "indicator" refers to one of an abnormality, a retinal feature or the like. The retinal feature is an optic disc, a macula, a blood vessel or the like. The abnormality is one of a lesion such as a venous beading, a venous loop, an intra retinal microvascular abnormality, an intra retinal hemorrhage, a micro aneurysm, a soft exudate (cotton-wool spots), a hard exudate, a vitreous/preretinal hemorrhage, neovascularization, a drusen or the like. The retinal disease is one of diabetic retinopathy (DR), diabetic macular edema, glaucoma, coloboma, retinal tear, retinal detachment or the like. The state of the retinal disease indicates a presence or absence of the retinal disease represented as levels of increasing seriousness of the retinal disease.
[0142] At step S7, the second analysis means 103i is configured to generate the second label for the fundus image using the second convolutional neural network. The generation of the second label constitutes the analysis of the fundus image. The indicators identification means identifies the multiple indicators throughout the fundus image using the second convolutional neural network. At step S8, the retinal disease detection means detects the state of the retinal disease based on the identified indicators using the second convolutional neural network.

[0143] Further, the second analysis means 103i is also adapted to generate the third label for the fundus image based on the parameters of the image capturing device used to capture the fundus image using the second convolutional neural network, wherein the second convolutional neural network is previously trained to generate the second label for the fundus image; and the second analysis means 103i is adapted to train the third convolutional neural network using the third label.
[0144] The general concepts of the current invention are not limited to a particular number of severity levels. In an embodiment, one severity level could be used, which satisfies only the detection of the retinal disease. In another embodiment, multiple severity levels could be used to classify the retinal disease. In another embodiment, multiple retinal diseases could be detected based on the identified indicators. The system 100 classifies each of the detected retinal diseases based on the severity.
[0145] The second analysis means 103i of the fundus image analysis application 103, using the second convolutional neural network, emphasizes classifying the entire fundus image as a whole. This improves efficiency and reduces errors in identifying various medical conditions. The system 100 acts as an important tool in the detection of the quality level of the fundus image and in monitoring a progression of one or more retinal diseases and/or a response to a therapy. The system 100 trains the second convolutional neural network to detect all indicators related to multiple retinal diseases. The system 100 accurately detects indicators throughout the input fundus image which are indicative of disease conditions, to properly distinguish indicators of a healthy fundus from indicators which define retinal diseases.
[0146] In an embodiment, the system 100 uses the analyzed fundus images to further train the convolutional neural network. In another embodiment, the system 100 refers to the patient profiles to gather information such as age, gender, race, ethnicity, nationality, etc., of existing patients to further train the convolutional neural networks, improve the convolutional network performance and provide customized results to the patients.

[0147] The system 100 may also be used to detect certain conditions such as a laser treated fundus. The system 100 may be a part of a web cloud, with the input fundus image and the report uploaded to the web cloud. The system 100, involving a computer-based process of supervised learning using the convolutional network as described, can thus be effectively used to screen the fundus images. The system 100 identifies indicators which are further processed to automatically provide indications of a relevant retinal disease, in particular indications of DR. The system 100 increases efficiency by the utilization of the well trained convolutional network for detecting and classifying the retinal diseases, thus providing cost-effective early screening and treatment to the patient.
[0148] The system 100 reduces the time consumed by a manual process requiring a trained medical practitioner to evaluate digital fundus photographs of the retina. The system 100, using the convolutional neural network, effectively improves the quality of analysis of the fundus image by detecting indicators of minute size which are often difficult to detect in the manual process of evaluating the fundus image.
[0149] The present invention described above, although described functionally or sensibly, may be configured to work in a network environment comprising a computer in communication with one or more devices. It will be readily apparent that the various methods, algorithms, and computer programs disclosed herein may be implemented on computer readable media appropriately programmed for general purpose computers and computing devices. As used herein, the term "computer readable media" refers to non-transitory computer readable media that participate in providing data, for example, instructions that may be read by a computer, a processor or a similar device. Non-transitory computer readable media comprise all computer readable media. Non-volatile media comprise, for example, optical discs or magnetic disks and other persistent memory. Volatile media comprise, for example, a dynamic random access memory (DRAM), which typically constitutes a main memory, a processor cache, a register memory, a random access memory (RAM), etc. Transmission media comprise, for example, coaxial cables, copper wire, fiber optic cables, modems, etc., including wires that constitute a system bus coupled to a processor. Common forms of computer readable media comprise, for example, a floppy disk, a flexible disk, a hard disk, magnetic tape, a Blu-ray Disc®, a magnetic medium, a compact disc-read only memory (CD-ROM), a digital versatile disc (DVD), any optical medium, a flash memory card, a laser disc, a RAM, a programmable read only memory (PROM), an erasable programmable read only memory (EPROM), an electrically erasable programmable read only memory (EEPROM), a flash memory, any other cartridge, etc.
[0150] The database 104a is, for example, a structured query language (SQL) database or a not only SQL (NoSQL) database such as the Microsoft® SQL Server®, the Oracle® servers, the MySQL® database of MySQL AB Company, the MongoDB® of 10gen, Inc., the Neo4j graph database, the Cassandra database of the Apache Software Foundation, the HBase™ database of the Apache Software Foundation, etc. In an embodiment, the database 104a can also be a location on a file system. The database 104a is any storage area or medium that can be used for storing data and files. In another embodiment, the database 104a can be remotely accessed by the fundus image analysis application 103 via the network 102. In another embodiment, the database 104a is configured as a cloud based database implemented in a cloud computing environment, where computing resources are delivered as a service over the network 102, for example, the internet.
[0151] The foregoing examples have been provided merely for the purpose of explanation and do not limit the present invention disclosed herein. While the invention has been described with reference to various embodiments, it is understood that the words are used for illustration and are not limiting. Those skilled in the art may effect numerous modifications thereto, and changes may be made without departing from the scope and spirit of the invention in its aspects.

Claims

We claim:
1. A computer implemented system 100 for analyzing a fundus image of a patient, comprising: at least one processor; a non-transitory computer readable storage medium communicatively coupled to the at least one processor, the non-transitory computer readable storage medium configured to store a fundus image analysis application 103, the at least one processor configured to execute the fundus image analysis application 103; and the fundus image analysis application 103 comprising: a graphical user interface 103k comprising a plurality of interactive elements 103j configured to enable capture and analysis of the fundus image via a user device 101a, 101b or 101c; a reception means 103a adapted to receive an input from an image capturing device based on a plurality of parameters of the image capturing device, wherein the input is the fundus image of the patient; an interactive fundus image rendering means 103b adapted to dynamically render the input, wherein the dynamically rendered input is configurably accessible on the graphical user interface 103k via the user device 101a, 101b or 101c using the interactive elements 103j; a fundus image capture means 103c adapted to capture the fundus image based on the dynamically rendered input; a first analysis means 103h adapted to determine a final quality level of the captured fundus image comprising: an initial quality level detection means to generate a first label for the fundus image using a first convolutional neural network, wherein the first label is an initial quality level of the fundus image; a final quality level determination means to determine the final quality level of the fundus image based on the generated first label, a user defined quality threshold and the parameters of the image capturing device; and a second analysis means 103i adapted to generate a second label for the fundus image using a second convolutional neural network by considering the determined final quality level based on a user selection, wherein the second label is a state of a retinal disease.
2. The system 100 as claimed in claim 1, further comprising: said second analysis means 103i adapted to generate a third label for the fundus image based on the parameters of the image capturing device used to capture the fundus image using the second convolutional neural network, wherein the second convolutional neural network is previously trained to generate the second label for the fundus image; and said second analysis means 103i adapted to train a third convolutional neural network using the third label.
3. The system 100 as claimed in claim 1, wherein the user defined quality threshold is a quality measure defined by a user for the fundus image based on a user grading experience.
4. The system 100 as claimed in claim 1, wherein the parameters of the image capturing device are a manufacturer of the image capturing device, a version of the image capturing device and the like.
5. The system 100 as claimed in claim 1, wherein the indicator is one of an abnormality, a retinal feature or the like.
6. The system 100 as claimed in claim 1, wherein the retinal disease is one of diabetic retinopathy, diabetic macular edema, glaucoma, coloboma, retinal tear, retinal detachment or the like.
7. The system 100 as claimed in claim 1, wherein the state of the retinal disease indicates a level of seriousness of the retinal disease or a likelihood of developing the retinal disease.
8. The system 100 as claimed in claim 1, wherein the user selection criterion is a user action using the interactive elements 103j to either consider or not consider the determined final quality level of the fundus image to generate the second label for the fundus image.
9. A computer implemented method for analyzing a fundus image, said method comprising: providing a fundus image analysis application 103 executable by at least one processor configured to analyze the fundus image of the patient, wherein the fundus image analysis application 103 is accessible by a user device 101a, 101b or 101c via a graphical user interface 103k provided by the fundus image analysis application 103; providing a plurality of interactive elements 103j on the graphical user interface 103k by the fundus image analysis application 103, the interactive elements 103j configured to enable capture and analysis of the fundus image on the graphical user interface 103k via the user device 101a, 101b or 101c; receiving an input from an image capturing device based on a plurality of parameters of the image capturing device by the fundus image analysis application 103 via the graphical user interface 103k, wherein the input is the fundus image of the patient displayed in a live mode; dynamically rendering the input by the fundus image analysis application 103 via the graphical user interface 103k, wherein the dynamically rendered input is configurably accessible on the graphical user interface 103k via the user device 101a, 101b or 101c using the interactive elements 103j; capturing the fundus image based on the dynamically rendered input by the fundus image analysis application 103 via the graphical user interface 103k; determining a quality level of the captured fundus image by the first analysis means 103h, comprising: generating a first label for the fundus image using a first convolutional neural network, wherein the first label is an initial quality level of the fundus image, by an initial quality level detection means; determining the final quality level of the fundus image based on the generated first label, a user defined quality threshold and the parameters of the image capturing device by a final quality level determination means; and generating a second label for the fundus image using a second convolutional neural network by considering the determined final quality level based on a user selection, wherein the second label is a state of a retinal disease, by a second analysis means 103i.
10. The method as claimed in claim 9, further comprising: generating a third label for the fundus image based on the parameters of the image capturing device used to capture the fundus image using the second convolutional neural network, wherein the second convolutional neural network is previously trained to generate the second label for the fundus image by the second analysis means 103i; and training a third convolutional neural network using the third label by the second analysis means 103i.
PCT/IN2019/050188 2018-03-08 2019-03-05 A fundus image analysis system WO2019171398A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN201841008545 2018-03-08
IN201841008545 2018-03-08

Publications (1)

Publication Number Publication Date
WO2019171398A1 true WO2019171398A1 (en) 2019-09-12

Family

ID=67846560

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IN2019/050188 WO2019171398A1 (en) 2018-03-08 2019-03-05 A fundus image analysis system

Country Status (1)

Country Link
WO (1) WO2019171398A1 (en)


Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8787638B2 (en) * 2011-04-07 2014-07-22 The Chinese University Of Hong Kong Method and device for retinal image analysis
WO2018035473A2 (en) * 2016-08-18 2018-02-22 Google Llc Processing fundus images using machine learning models

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112541924A (en) * 2020-12-08 2021-03-23 北京百度网讯科技有限公司 Fundus image generation method, device, equipment and storage medium
CN112541924B (en) * 2020-12-08 2023-07-18 北京百度网讯科技有限公司 Fundus image generation method, fundus image generation device, fundus image generation apparatus, and fundus image storage medium
WO2024128108A1 (en) * 2022-12-12 2024-06-20 DeepEyeVision株式会社 Information processing device, information processing method, and computer-readable recording medium


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19764945

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19764945

Country of ref document: EP

Kind code of ref document: A1