WO2013142219A1 - System and method for crowd-sourced telepathology - Google Patents


Info

Publication number
WO2013142219A1
Authority
WO
WIPO (PCT)
Prior art keywords
cells
images
computer
image
individual cells
Prior art date
Application number
PCT/US2013/031109
Other languages
French (fr)
Inventor
Aydogan Ozcan
Sam Mavandadi
Original Assignee
The Regents Of The University Of California
Priority date
Filing date
Publication date
Application filed by The Regents Of The University Of California filed Critical The Regents Of The University Of California
Publication of WO2013142219A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/101 Collaborative creation, e.g. joint development of products or services
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00 Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/20 Input arrangements for video game devices
    • A63F13/21 Input arrangements for video game devices characterised by their sensors, purposes or types
    • A63F13/212 Input arrangements for video game devices characterised by their sensors, purposes or types using sensors worn by the player, e.g. for measuring heart beat or leg activity
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00 Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/60 Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or game-integrated level editor
    • A63F13/65 Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or game-integrated level editor automatically by game devices or servers from real world data, e.g. measurement in live racing competition
    • A63F13/655 Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or game-integrated level editor automatically by game devices or servers from real world data, e.g. measurement in live racing competition by importing photos, e.g. of the player
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/40 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for data related to laboratory analysis, e.g. patient specimen analysis
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H30/00 ICT specially adapted for the handling or processing of medical images
    • G16H30/20 ICT specially adapted for the handling or processing of medical images for handling medical images, e.g. DICOM, HL7 or PACS
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H40/00 ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices
    • G16H40/60 ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the operation of medical equipment or devices
    • G16H40/67 ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the operation of medical equipment or devices for remote operation
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/50 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00 ICT specially adapted for the handling or processing of medical references
    • G16H70/60 ICT specially adapted for the handling or processing of medical references relating to pathologies
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H80/00 ICT specially adapted for facilitating communication between medical practitioners or patients, e.g. for collaborative diagnosis, therapy or health monitoring
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F2300/00 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
    • A63F2300/10 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by input arrangements for converting player-generated signals into game device control signals
    • A63F2300/1012 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by input arrangements for converting player-generated signals into game device control signals involving biosensors worn by the player, e.g. for measuring heart beat, limb activity
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F2300/00 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
    • A63F2300/60 Methods for processing data by generating or executing the game program
    • A63F2300/61 Score computation
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F2300/00 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
    • A63F2300/60 Methods for processing data by generating or executing the game program
    • A63F2300/69 Involving elements of the real world in the game world, e.g. measurement in live races, real video
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/217 Validation; Performance evaluation; Active pattern learning techniques
    • G06F18/2178 Validation; Performance evaluation; Active pattern learning techniques based on feedback of a supervisor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/254 Fusion techniques of classification results, e.g. of results related to same input data
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Definitions

  • the technical field generally relates to systems and methods for remote telepathology and more specifically, crowd-sourced telepathology.
  • Crowd-sourcing is an emerging concept that has attracted significant attention in recent years as a strategy for solving computationally expensive and difficult problems.
  • pieces of difficult computational problems are distributed to a large number of individuals.
  • Each participant completes one piece of the computational puzzle, sending the results back to a central system where they are all combined together to formulate the overall solution to the original problem.
  • crowd-sourcing is often used as a solution to various pattern-recognition and analysis tasks that may take computers a long time to solve.
  • One of the underlying assumptions of such an approach is that humans are better than machines at certain computational and pattern recognition tasks.
  • Foldit (available on the Internet at fold.it), as an example, is a game in which players attempt to digitally simulate folding of various proteins, helping researchers to achieve better predictions about protein structures.
  • EteRNA is another game, which likewise makes use of crowds to get a better understanding of RNA folding.
  • game play consists of puzzles wherein users attempt to design RNA sequences that fold into a target shape on the user's computer. These puzzles are used to help reveal new principles for designing RNA-based switches and nanomachines.
  • a system and method that transfers microscopic images or portions thereof of specimens to a plurality of computer devices configured to run gaming software thereon (e.g., computer gaming devices).
  • the computer gaming devices may include, for example, personal computers and portable electronic devices such as tablets, mobile phones, or wearable computers.
  • the gaming software receives the microscopic images and prompts the user to provide a response about various aspects of the microscopic image.
  • the microscopic image presented to a user of the gaming software may include individual cells contained within the specimen. The user may be asked to identify those cells that are positive with respect to a particular phenotype or disease state. Likewise, the user may be asked to identify those cells that are negative with respect to a particular phenotype or disease state.
  • all or substantially all of the users that play the game to identify the cells are non-expert users.
  • the non-expert users are not specially trained in cell pathology or microscopy.
  • some (or all) of the users that play the game may be characterized as expert users.
  • the results associated with the expert users may, in some instances, be combined with the results associated with the non-expert users to improve diagnostic results.
  • the results from each user are then transferred from each respective computer gaming device to one or more remotely located computing devices.
  • the remotely located computing devices may include a computer server or multiple computer servers.
  • the results from any particular user may include an identification or label that is associated with a particular image or image frame. In some instances, this information may be binary information such as positive/negative or infected/not infected. In other embodiments, the information may also include additional options such as when the user is unsure of the particular identification, i.e., the user is unsure whether the image is positive or negative.
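The per-image result described here could be represented as a small record transmitted back to the server. The following sketch is illustrative only; the class and field names are assumptions, not part of the original disclosure:

```python
from dataclasses import dataclass
from enum import Enum

class Label(Enum):
    POSITIVE = "positive"   # e.g., infected
    NEGATIVE = "negative"   # e.g., not infected
    UNSURE = "unsure"       # user could not decide

@dataclass
class Response:
    user_id: str     # identifies the gamer
    image_id: str    # identifies the cell image or image frame
    label: Label

# One gamer's response for one cell image.
r = Response(user_id="gamer42", image_id="cell_0017", label=Label.UNSURE)
```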
  • the microscopic images or portions thereof may include known control images in which the identification is known a priori. For example, images of cells that are known to be positive or negative may be presented to the user.
  • the control images may be used to score the accuracy or effectiveness of the particular user.
  • the images presented to the users may not include any control images. For example, experienced users may receive microscopic images or portions thereof with no controls.
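One simple way such control-based scoring could work, assuming ground truth is known only for the embedded control images (the function name and data layout below are illustrative, not specified by the original description):

```python
def control_accuracy(responses, truth):
    """Fraction of control images the user labeled correctly.

    responses: dict mapping image_id -> the user's label
    truth:     dict mapping image_id -> known a priori label
               (only control images appear here)
    """
    controls = [img for img in responses if img in truth]
    if not controls:
        return None  # no control images were shown to this user
    correct = sum(1 for img in controls if responses[img] == truth[img])
    return correct / len(controls)

# The user got one of the two embedded controls right; "x9" is a
# real (unknown) image and does not count toward the score.
score = control_accuracy(
    {"c1": "positive", "c2": "negative", "x9": "positive"},
    {"c1": "positive", "c2": "positive"},
)
```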
  • the gaming software may optionally provide feedback to the users.
  • the feedback may include a performance metric.
  • a metric may include a number or percentage that corresponds to identification accuracy or the like.
  • the gaming software may also provide one or more motivators that encourage users to play the game.
  • Such motivation may include monetary motivation.
  • the user may be paid a small amount of money per identification or "click.”
  • the user may be paid based on his or her expertise level.
  • the monetary motivation may include a donation that is paid to the user's selected charitable organization.
  • the motivator may also include software features that engage the user.
  • the user may self-select data sets that originate from a certain geographical region or area of interest (e.g., user's home country). Any number of motivators are contemplated.
  • results of each game or partial game are transmitted back to the one or more remotely located computing devices.
  • the results are then aggregated or fused and subject to analysis whereby the putative correct label or diagnosis (e.g., positive, infected) is assigned to the particular input image.
  • this process involves decoding the results from all of the users using decoding software or the like.
  • decoding may include a maximum a posteriori (MAP) probability approach.
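A minimal sketch of such a MAP decoder for binary labels, assuming each gamer's accuracy is known (e.g., estimated from control images); the interface and the prior are illustrative assumptions, not the specific decoder of this disclosure:

```python
import math

def map_decode(votes, accuracies, prior_positive=0.5):
    """Fuse binary labels from several gamers with a MAP rule.

    votes:      list of 1 (positive) or 0 (negative) labels
    accuracies: each gamer's probability of labeling correctly,
                strictly between 0 and 1
    Returns the label with the higher posterior probability.
    """
    log_pos = math.log(prior_positive)        # log-posterior if truly positive
    log_neg = math.log(1.0 - prior_positive)  # log-posterior if truly negative
    for vote, p in zip(votes, accuracies):
        # A vote matching the hypothesized truth is correct (prob p),
        # otherwise it is an error (prob 1 - p).
        log_pos += math.log(p if vote == 1 else 1.0 - p)
        log_neg += math.log(p if vote == 0 else 1.0 - p)
    return 1 if log_pos >= log_neg else 0

# Two accurate gamers voting positive outweigh one weaker negative vote.
label = map_decode([1, 1, 0], [0.9, 0.8, 0.6])  # → 1
```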
  • a diagnosis is given based on the labels or diagnosis applied to the individual image.
  • a slide or other sample may include hundreds or thousands of cells. Once all or a portion of these cells have been labeled or diagnosed, the slide or sample may then be diagnosed.
  • the slide contains blood that tests positive for a disease state.
  • red blood cells (RBCs) that are stained with Giemsa stain may be imaged with microscopes whereby images of individual cells within the same are transmitted to different computing devices that are configured to run gaming software.
  • the various images are then classified or identified by the users and the results transmitted back to the central server or servers for decoding.
  • the user identifies those cells that are infected or not infected with the malaria parasite.
  • Each cell is assigned a label based at least in part on the aggregated data from the different users running the gaming software. This information may then be used to assign a diagnosis to a slide or sample. For example, based on the decoded data, the software running on the remote server(s) may output that the slide is viewed as positive for infection with the malaria parasite. If patient identification information is associated with each slide, the software may output that a particular patient has tested positive for the malaria parasite.
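The step from per-cell labels to a slide-level diagnosis could be sketched as follows; the threshold value and names are illustrative assumptions, since the disclosure does not fix a particular decision rule:

```python
def diagnose_slide(cell_labels, threshold=0.002):
    """Call a slide positive when the fraction of cells labeled
    "infected" meets a parasitemia threshold (0.2% here is only
    an illustrative default)."""
    if not cell_labels:
        raise ValueError("no labeled cells for this slide")
    fraction = cell_labels.count("infected") / len(cell_labels)
    return ("positive" if fraction >= threshold else "negative"), fraction

# A smear with 5 infected cells out of 1000 (0.5% parasitemia).
labels = ["infected"] * 5 + ["not_infected"] * 995
diagnosis, fraction = diagnose_slide(labels)
```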
  • a method of analyzing microscope slide images using crowd-sourcing includes obtaining one or more microscopic images of cells on the microscope slide.
  • Image processing is performed to identify groups of cells or individual cells within the image. Images of the groups of cells or individual cells are transferred to a plurality of computer gaming devices associated with different users, the plurality of computer gaming devices configured to run gaming software, the gaming software configured to display the images of the groups of cells or individual cells on a display.
  • the gaming software is used to identify individual cells suspected of having a particular characteristic or phenotype, wherein the identification is performed on the plurality of different computer gaming devices.
  • Identification information is transferred from the plurality of different computer gaming devices to one or more remotely located computing devices.
  • the individual cells are labeled based at least in part on a decoding operation performed by the one or more remotely located computing devices on the transmitted identification information.
  • a method of analyzing microscope slide images using crowd-sourcing includes obtaining one or more microscopic images of cells on the microscope slide.
  • Image processing is performed with at least one computer to identify individual cells or groups of cells within the image.
  • Individual cells or groups of cells suspected of having a particular characteristic or phenotype are automatically identified using a pre-trained machine learning algorithm executed on the at least one computer, wherein the automatically identified cells or groups of cells are those cells having a confidence level above a threshold value.
  • Images of the remaining cells are transferred to a plurality of computer gaming devices associated with different users, the plurality of computer gaming devices configured to run gaming software, the gaming software configured to display images of the cells on a display, wherein the remaining cells are those cells having a confidence level below the threshold value.
  • the gaming software is used to identify individual cells or groups of cells suspected of having the particular characteristic or phenotype, wherein the identification is performed on the plurality of different computer gaming devices.
  • the identification information from the plurality of different computer gaming devices is transferred to the at least one computer.
  • the individual cells are labeled based at least in part on a decoding operation performed by the at least one computer on the transmitted identification information for the cells having the confidence level below the threshold value.
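The confidence-based split in this embodiment could be sketched as follows; the classifier, threshold, and all names here are illustrative assumptions rather than the disclosed implementation:

```python
def triage(cells, classify, threshold=0.9):
    """Split cells into machine-labeled and crowd-bound sets.

    cells:    dict mapping cell_id -> image (any representation)
    classify: callable returning (label, confidence) for an image
    Cells whose confidence falls below `threshold` are deferred
    to the gamers for crowd-sourced labeling.
    """
    machine, crowd = {}, []
    for cell_id, image in cells.items():
        label, confidence = classify(image)
        if confidence >= threshold:
            machine[cell_id] = label
        else:
            crowd.append(cell_id)
    return machine, crowd

# Toy classifier: the "image" is a stain score in [0, 1] and the
# confidence is how far that score sits from the 0.5 decision boundary.
def toy_classify(score):
    return ("infected" if score > 0.5 else "clean", abs(score - 0.5) * 2)

machine, crowd = triage({"a": 0.99, "b": 0.55, "c": 0.05}, toy_classify)
```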
  • a system for analyzing microscope slide images using crowd-sourcing includes a remote computing device configured to receive one or more microscopic images of cells on the microscope slide and further configured to identify groups of cells or individual cells within the image; a plurality of computer gaming devices containing gaming software configured to receive images of the groups of cells or individual cells from the remote computing device, the gaming software further configured to display images of the groups of cells or individual cells on a display and permit user identification of individual cells suspected of having a particular characteristic or phenotype; and wherein the remote computer is configured to receive user identification information transmitted from the plurality of computer gaming devices and further configured to label the individual cells based at least in part on a decoding operation performed by the remotely located computing device on the transmitted identification information.
  • FIG. 1 illustrates a top level view of one method of analyzing microscope slide images using crowd sourcing.
  • FIG. 2 illustrates a schematic or block representation of the major components of the system according to one embodiment.
  • FIG. 3 illustrates a screen view of an exemplary graphical interface of a game that is displayed on a display of the computer gaming device.
  • FIG. 4 illustrates a screen view of another exemplary graphical interface of a game that is displayed on a display of the computer gaming device.
  • FIG. 5 illustrates the framework of the decoding algorithm in which the games are modeled as a communication system consisting of a broadcast unit, multiple repeaters, and a receiver/decoder.
  • FIG. 6 illustrates a hybrid (human plus machine) diagnostics method according to one embodiment.
  • FIG. 7 illustrates a Local Color Peak Histogram. For every window block, a color histogram is calculated.
  • FIG. 8 illustrates an adaptive boosting algorithm.
  • FIG. 9 illustrates various acronyms, their respective term names, and definitions.
  • FIG. 10 illustrates a graph of accuracy and sensitivity for twenty (20) gamers.
  • FIG. 11 illustrates a browser-based interface for remote cell labeling according to another embodiment.
  • FIG. 12 illustrates a forward model used as part of the mixture model.
  • FIG. 13 illustrates a decoding model used as part of the mixture model.
  • FIG. 14 illustrates a graph showing experimental results on the level of agreement among experts. The Y axis represents the % of total images in each category (positive, negative, and questionable) and the X axis represents the number of experts agreeing on the decision.
  • FIG. 15 illustrates experimental results of level of self-inconsistency of each expert within each category.
  • FIG. 16 shows the performance results from nine simulated experts with varying average ensemble accuracies.
  • FIG. 17 illustrates experimental performance metrics of experts. The metrics are calculated after combining the responses of all the experts using EM and then assuming the results to be correct.
  • Accuracy = (TP+TN)/(TP+TN+FP+FN)
  • PPV = TP/(TP+FP)
  • NPV = TN/(TN+FN)
  • FPR = FP/(TN+FP)
  • TP, TN, FP, and FN correspond to the number of true positive, true negative, false positive, and false negative labels, respectively.
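These metrics follow directly from the four counts; a small sketch (function and key names are illustrative):

```python
def metrics(tp, tn, fp, fn):
    """Standard performance metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "ppv": tp / (tp + fp),   # positive predictive value
        "npv": tn / (tn + fn),   # negative predictive value
        "fpr": fp / (tn + fp),   # false positive rate
    }

# Example: 45 true positives, 40 true negatives, 5 false positives,
# 10 false negatives out of 100 labeled cells.
m = metrics(tp=45, tn=40, fp=5, fn=10)
```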
  • FIG. 18 illustrates sample cells classified by the system and method.
  • FIG. 19A illustrates Receiver Operator Characteristic (ROC) curves for smear-level diagnosis with a smear that has a parasitemia level of 0.5%.
  • FIG. 19B illustrates ROC curves for smear-level diagnosis with a smear that has a parasitemia level of 1%.
  • FIG. 20 illustrates an example of an embodiment that is implemented as part of a bioCAPTCHA.
  • FIG. 1 illustrates a top level view of one method of analyzing microscope slide images using crowd sourcing.
  • microscopic images of cells 10 contained on a microscope slide 12 are obtained using a microscope 14.
  • FIG. 1 illustrates a map of the world illustrating that microscopes 14 may be located in a number of different geographic regions.
  • the microscopes 14 are able to digitally capture images of the cells 10 on the microscope slide 12.
  • the cells 10 may be prepared using conventional preparation and staining techniques. For example, as one illustrated embodiment described herein, a thin blood smear is formed on the microscope slide 12. In this embodiment, the thin blood smear is stained with Giemsa stain which is used to detect the presence of malaria parasites in red blood cells (RBCs).
  • Giemsa stain stains the nuclear material in the malaria parasite with a blue tint and does not affect RBC morphology.
  • the microscopes 14 are typically brightfield microscopes that are able to digitally capture images of the stained RBC preparation using a 100X oil-immersion objective lens.
  • the microscopes 14 in this example tend to be located within geographic regions where there is risk of contracting malaria.
  • the microscopes 14 may be located in clinics, hospitals, or other point-of-care facilities.
  • the digitized images of the stained cells are then transmitted back to a remotely located computing device 16.
  • This computing device 16 may include one or more computer servers with ample storage capacity to store imaging and other data received from the geographically dispersed microscopes 14 as well as crowd-source game data as described in more detail herein.
  • the remotely located computing device 16 may be housed in a data center or the like.
  • the image data from the microscopes 14 is transmitted to the remotely located computing device 16 wherein the images are subject to image pre-processing and then distributed out to a plurality of computer gaming devices 20 which are configured to run gaming software utilizing the processed images.
  • the computer gaming devices 20 may include, by way of example, personal computers, laptop computers, hand held computers, tablets, mobile phones, and wearable computers (e.g., glasses, watches, and the like with embedded computers or processors).
  • users of the computer gaming devices 20 are referred to herein as "gamers.”
  • the gamers identify cells 10 having a particular phenotype or characteristic which is then communicated back to the remotely located computing device 16.
  • the individual results from multiple gamers are then fused together to label or identify individual cells 10 having a particular phenotype or characteristic.
  • one particular phenotype or characteristic includes infected or positive cells 10 that have been identified as being infected by the malaria parasite.
  • Another particular phenotype or characteristic may include non-infected or negative cells 10 that have been identified as not being infected with the malaria parasite.
  • Reference to a phenotype or characteristic of a cell 10 may refer to the physical size, shape, or state of the cell 10. It may also refer to whether the cell has a particular disease state such as described above with respect to infection with a parasite. Other disease states are also contemplated such as cancer, sickle cell anemia, immunological disease, and the like. Phenotype or characteristics of a cell 10 may also include abnormalities that may or may not be associated with a diseased state. Stains or the like may or may not be used in connection with the operation of obtaining microscopic images. As explained in more detail below, results from multiple gamers are decoded by the remote computing device 16 to assign a label to each cell 10 (or in some instances groups of cells 10).
  • one label may be infected cells 10 while another label may be non- infected cells.
  • the remote computing device 16 uses statistical-based algorithms to assign these labels based at least in part on the labels provided by the many gamers. For example, rather than use a majority-based system to assign final labels to cells 10, the computing device 16 is able to weight (or de-weight as the case may be) the label results provided by the various gamers to achieve a final label for the cell 10 that closely matches the true state of the cell.
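A minimal sketch of such accuracy-weighted voting, with weights assumed to come from each gamer's score on control images (the function and the example numbers are illustrative, not the disclosed algorithm):

```python
def weighted_vote(labels, weights):
    """Weighted vote: each gamer's label counts in proportion to a
    weight, e.g. that gamer's accuracy on control images."""
    tally = {}
    for label, weight in zip(labels, weights):
        tally[label] = tally.get(label, 0.0) + weight
    return max(tally, key=tally.get)

votes = ["positive", "negative", "negative"]
# A plain majority would say "negative"; de-weighting the two
# low-accuracy gamers flips the final label to "positive".
final = weighted_vote(votes, [0.95, 0.4, 0.4])  # → "positive"
```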
  • Gamers are monitored for their accuracy during game play by the addition of one or more control images where the results are known a priori. This information can be used to score or otherwise track the accuracy to which a particular gamer is labeling the cells 10. This information is then fed into the remote computing device 16 and is used to synthesize a final label determination for a particular cell 10.
  • FIG. 2 illustrates a schematic or block representation of the major components of the system according to one embodiment. Illustrated in FIG. 2 is the remote computing device 16.
  • the remote computing device 16 may include one or more computers (e.g., servers) that include therein processors that are configured to execute software to carry out the methods described herein.
  • the remote computing device 16 may also include storage functionality as well as computational functionality.
  • the remote computing device 16 may include or be associated with storage media (e.g., disk drives) for storing data such as images, labels, identification information, and the like).
  • one or more components contained in the remote computing device 16 may be distributed across multiple locations. For example, storage of image files may take place in a separate location from where the software of the remote computing device 16 is executed.
  • the remote computing device 16 runs software that includes a data pre-processing and collection module 24 or functionality.
  • the data pre-processing and collection module 24 may receive raw data from microscopes 14.
  • FIG. 2 illustrates raw data being transmitted to the remote computing device 16 from a remote source 26 such as point of care clinics or hospitals. While raw data may be communicated from the remote source 26, diagnosis or other information can be communicated from the remote computing device 16 back to the remote source 26 (e.g., clinic or hospital).
  • the data pre-processing and collection module 24 may receive digital images of cells 10 contained on slides 12 and subject the same to lower level processing.
  • the data pre-processing module 24 thus may prepare the raw image files into smaller image files that can then be sent to the gaming community 28 using a game module 30.
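As one illustrative sketch of this pre-processing step, a large slide image could be tiled into small crops for distribution to gamers. The disclosure does not tie itself to a particular scheme (a real pipeline would segment individual cells rather than tile blindly), so the tiling below is an assumption:

```python
def tile_coordinates(width, height, tile=64, stride=64):
    """Yield (x, y, w, h) crop windows covering a large slide image,
    so each small tile can be sent out as its own image file."""
    for y in range(0, height, stride):
        for x in range(0, width, stride):
            # Clip the window at the right and bottom image edges.
            yield (x, y, min(tile, width - x), min(tile, height - y))

# A 200 x 100 pixel image yields 4 columns x 2 rows of tiles.
tiles = list(tile_coordinates(200, 100, tile=64, stride=64))
```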
  • the gaming community 28 may include, in some embodiments, substantially all non-experts 32 or nonprofessionals having no experience in pathology or cytology. In other embodiments, the gaming community 28 may include experts 34 or other trained professionals with experience in examining images of cellular samples.
  • the games that are played by the gaming community 28 on respective computer gaming devices 20 may take any number of forms.
  • the games may be downloaded or otherwise transferred to the gaming device 20 using, for example, the gaming module 30.
  • the game may take the form of an application or "app" that resides on the computer gaming device 20.
  • the game may also run within a web browser (e.g., using JAVA, FLASH, HTML, or the like).
  • the computer gaming device 20 may include, for example, personal computers, portable electronic devices such as tablets, mobile phones, or the like.
  • the games may be run on multi-platforms and are not limited to a particular computer gaming device 20. Some users may play the game on a mobile phone, while others play the game on a personal computer, while still others play the game on their tablet devices.
  • a machine learning module 36 is optionally included in the software running on the remote computing device 16.
  • the machine learning module 36 may contain one or more machine-vision algorithms that are prepared by training using a dataset of known control slides.
  • the machine-vision algorithm is able to automatically label cells 10 based on input images.
  • both the game module 30 and the machine learning module 36 are used together as part of a hybrid system to create a more efficient and accurate labeling platform. For example, results obtained from the various gamers may be used to improve the accuracy of the machine-vision algorithm. It should be understood, however, that in some embodiments, there is no need for the machine learning module 36.
  • game results from the gamer community 28 are returned to the remote computing device 16.
  • Data may be transferred in batch form after the game has ended or the data may be transferred back to the remote computing device 16 as the game is being played.
  • Data transfer between the computer gaming devices 20 and the remote computing device 16 may occur across any number of networks.
  • commercial wireless networks used for voice or data traffic may be used to transfer data to and from the computer gaming devices 20.
  • a wide area network such as the internet may be used to transfer data between the computer gaming devices 20 and the remote computing device 16.
  • Some combination of different networks can also be used.
  • computer gaming devices 20 may be connected to the internet via a WiFi network, Bluetooth, or the like.
  • the computer gaming device 20 includes gaming software in which users label cells 10 as part of the gaming environment. These user-labeled cells 10 are then identified by the gaming software and appropriate labels are then given to the different images. This information is then transmitted back to the remote computing device 16.
  • the game module 30 may include decoding functionality as described in more detail herein wherein label results provided by a plurality of users are "decoded" to find the true or correct label for each particular cell 10. This decoding operation may weight the user-provided labels depending on the particular accuracy or effectiveness of the user that is playing the game. For example, the labels provided by the expert gamer 34 may be weighted higher than those of the non-expert gamer 32. This weighting may be based on the performance of each respective gamer when labeling or otherwise classifying cells. This performance may be monitored, for example, by the use of control images that are interspersed among the test images that are displayed to the user on the computer gaming device 20.
  • FIG. 2 also illustrates that one or more educational or research facilities 38 may exchange data with the remote computing device 16.
  • a diagnosis of a slide or sample is returned to the remote source 26.
  • the remote computing device 16 may be used to form a library of images in which cells 10 are labeled using the crowd-source techniques described herein.
  • This library of images may be used as a teaching tool, for example, to train pathologists or cytologists.
  • the library of images may also be used for research purposes.
  • the library of known images may also be used to train other machine learning algorithms.
  • FIG. 3 illustrates a screen view of an exemplary graphical interface 39 of a game that is displayed on a display of the computer gaming device 20.
  • a user is presented with a plurality of individual images 40 with each image containing approximately a single cell 10.
  • these individual images of cells 10 may be produced by pre-processing of the raw images using the pre-processing module 24.
  • the individual images 40 are presented in an array format although other formats could be used. In this example, twenty four (24) images are presented to the user.
  • the user is required to label those cells 10 that are viewed as infected with the malaria parasite, i.e., those cells 10 viewed as positive.
  • the user would touch the positive button 42 located on one side of the interface and then proceed to touch those cells 10 that the user interpreted as positive for the malaria parasite.
  • Touching the positive button 42 as well as the cells 10 can be accomplished in any number of ways. For example, a cursor could be used to depress buttons 42 and select cells 10. Alternatively, if the screen is a touch screen, a user may simply touch the screen in the location of the button or cell to accomplish the same result.
  • image 40' represents an RBC 10 that is infected by the malaria parasite.
  • RBCs do not have nuclei themselves.
  • the Giemsa stain colors the nuclear material in the parasite with a blue tint but does not affect the RBC morphology.
  • later stage infections by the malaria parasite tend to look like "headphones" which can be seen within the RBC of image 40'.
  • the user would then click image 40' to label the same as "positive" or infected.
  • a graphic or other image may be superimposed over the image 40' to indicate to the user visually that this particular image 40' has already been identified as positive.
  • the image 40' may simply disappear when clicked or touched.
  • After the user has identified all of the positive cells 10 within the array of images 40, the user then labels all of the remaining images as negative. To effectuate this operation, the user depresses the Label All - Negative button 44. When this button 44 is depressed, all of the remaining cells 10 are labeled as "negative" or non-infected. Thus, unlike the positive identification step where each positive cell 10 must be actively clicked or touched, all of the remaining cells 10 are labeled as negative in one operation. This is possible because the large majority of cells 10 are negative. If the nature of the phenotype or characteristic of the cell 10 is more frequent, a user may be required to individually select the negative cells 10 as well.
  • At the top of the interface of FIG. 3 is located a performance bar 46.
  • the performance bar 46 is used to indicate to the user his or her accuracy in identifying the positive and negative cells 10 that are presented to the user. As performance improves, the bar 46 moves to the right. Conversely, as performance declines, the bar 46 moves to the left.
  • the performance bar 46 may also change color as accuracy increases or decreases.
  • the performance bar 46 may also be used as a motivator to motivate the player of the game. For example, the user is also provided with a score 48 that indicates the player's performance. The gamer is motivated to accurately identify and label cells 10 to increase his or her score.
  • gamers may be paid based on the number of images that they label. As a requirement to be paid, players may need to maintain a minimum score.
  • the rate of pay may increase for gamers that attain and maintain higher scores. Donations may also be made to the gamer's charity of choice based on the number of labeled images. Additional motivators include the option of unlocking additional features of the game. For example, after reviewing a minimum number of images or scoring at a certain level, a user may be able to select the geographic region from which the sample was taken. For example, a gamer in India may choose to play games loaded with cellular images taken from a person in Mumbai. Other motivators may also be employed like a high score list, prizes, and the like. Another motivator may be sound or music that accompanies the game. In the user interface 39 of FIG. 4, the user is able to turn music on or off through button 49.
  • the user interface includes an undo button 50 that permits the user to go back and undo a selection that was previously made. For example, the user may decide that an image 40 that was previously viewed as containing a positive cell 10 was, on second thought, negative. The user may depress or otherwise select the undo button 50 to restore the original image 40. Depression of the undo button 50 may result in a decrease in the performance bar 46 and the score 48. The game may be timed or untimed. Timing data may be used to determine those gamers that can most efficiently label cells 10 with the requisite accuracy.
  • FIG. 4 illustrates another view of an exemplary graphical interface 39 of a game that is displayed on a display of the computer gaming device 20.
  • This game environment is similar to that of FIG. 3 with the exception that multiple or groups of cells 10 are contained in each image 40.
  • a user still picks individual cells within the image 40.
  • the selection does not apply to the entire image 40 but rather a point or region of the image 40.
  • the cell 10' is labeled as positive, with a cross or cursor marking where the selection has been made. This may be done using a cursor or in a touch screen environment.
  • the specific cell 10' within the group can be associated with the label because positional data is obtained relative to the image 40. For example, the x and y coordinates within the image 40 are obtained where the user placed the positive marker. This can be used to associate the label with one cell 10' of many cells 10 contained in the single image 40.
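The coordinate-to-cell association described above can be sketched as follows (a minimal illustration; the function name and the bounding-box representation of cell positions are assumptions, not part of the patent):

```python
def assign_label_to_cell(click_xy, cell_boxes):
    """Return the index of the cell whose bounding box contains the click.

    click_xy:   (x, y) coordinates where the user placed the positive marker.
    cell_boxes: list of (x_min, y_min, x_max, y_max) boxes, one per cell 10
                within the image 40.
    Returns None if the click falls outside every cell.
    """
    x, y = click_xy
    for idx, (x0, y0, x1, y1) in enumerate(cell_boxes):
        if x0 <= x <= x1 and y0 <= y <= y1:
            return idx
    return None
```

In a touch-screen environment the same function would be called with the touch coordinates reported by the device.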
  • results from the gamers are transmitted from the gaming community 28 back to the remote computing device 16 to decode the "correct" label.
  • several different gamers have received the same images 40 that are required to be labeled.
  • the game module 30 aspect of the software will then decode what the correct label should be based on the labels applied by the gamers to the images 40.
  • each gamer will output a decision sequence. Ideally, the output of each gamer would yield the correct diagnostic labels for the images. Given that each image corresponds to either a healthy cell or an infected cell, one can use binary labels to identify them: 0 for healthy and 1 for infected. Recasting the system as a communications system, the remote computing device 16 or server acts as a broadcaster of a binary sequence and each gamer acts as a noisy binary channel, retransmitting the symbols back along with some errors. Note that since the gamers do not necessarily make mistakes symmetrically, the probability of a gamer mislabeling a healthy cell may differ from that of mislabeling an infected cell.
  • FIG. 5 illustrates the framework of the decoding algorithm in which the games are modeled as a communication system consisting of a broadcast unit, multiple repeaters, and a receiver/decoder.
  • the repeaters (i.e., the gamers) relay their decisions to the receiver/decoder block, which in turn combines and stores the decisions (image labels).
  • the cell images are essentially treated as binary symbols. The model equates the two image classes, infected and healthy, with the binary symbols 1 and 0, respectively.
  • the sequence of symbols x_1, ..., x_N is broadcast to M repeater units (i.e., the gamers within the crowd) that can be viewed as parallel noisy channels. To be able to decode the outputs of these channels reliably, it is necessary to learn the channels adaptively. As such, some known symbols (i.e., control images) must be embedded in the output of the broadcast unit. Knowing the binary value of certain symbols/images at specific times, one can learn the conditional probabilities P_j(y_i | x_i) as more symbols are transmitted by the broadcast unit and passed through the repeaters/gamers. Additionally, an encoder unit can also be placed after the broadcast unit to increase the redundancy of the transmission and allow for error correction at the decoder.
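The adaptive learning of each gamer's channel from interspersed control images might be sketched as follows (hypothetical function; the Laplace smoothing is an added assumption so that a gamer who has seen only a few controls is not assigned a zero error rate):

```python
from collections import defaultdict


def estimate_channel(control_truth, control_responses):
    """Estimate P_j(y | x) for one gamer from interspersed control images.

    control_truth:     true binary labels x_i of the controls (0 healthy,
                       1 infected).
    control_responses: the gamer's labels y_i for the same control images.
    Returns a dict {(y, x): probability} describing the gamer's asymmetric
    binary channel, with add-one (Laplace) smoothing.
    """
    counts = defaultdict(int)
    totals = defaultdict(int)
    for x, y in zip(control_truth, control_responses):
        counts[(y, x)] += 1
        totals[x] += 1
    # add-one smoothing over the two possible outputs per input symbol
    return {(y, x): (counts[(y, x)] + 1) / (totals[x] + 2)
            for x in (0, 1) for y in (0, 1)}
```

Because the channel can be asymmetric, the estimate keeps P_j(1 | 0) and P_j(0 | 1) separate rather than assuming a single flip probability.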
  • Each gamer is modeled as an independent repeater that behaves as a binary channel.
  • the error probabilities are defined using the notation P_j(y_t | x_t), corresponding to the probability that the j-th user will output the symbol y_t when observing the symbol x_t (i.e., the t-th image).
  • the broadcast unit can also include an encoder to increase the information redundancy prior to transmission to the repeaters/gamers. Given that the symbols being transmitted by the broadcast unit are not known a priori, the appropriate coding scheme is the repetition code, where each symbol is repeated for an odd number of times prior to transmission. At the decoder, a majority vote is taken on the channel outputs.
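The repetition code with majority-vote decoding described above can be sketched as follows (illustrative only; function names are assumptions):

```python
def repetition_encode(symbols, r=3):
    """Repeat each binary symbol r times (r must be odd) before transmission."""
    assert r % 2 == 1
    return [s for s in symbols for _ in range(r)]


def majority_decode(received, r=3):
    """Take a majority vote over each run of r received symbols."""
    decoded = []
    for i in range(0, len(received), r):
        block = received[i:i + r]
        decoded.append(1 if sum(block) > r // 2 else 0)
    return decoded
```

With r = 3, a single flip within a block is corrected by the majority vote at the decoder.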
  • the decoder used in the gaming platform uses a Maximum a Posteriori Probability (MAP) approach.
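A MAP decoder that combines several gamers' labels using their estimated channel probabilities might look like this sketch (the prior on infection and the dictionary representation of P_j(y | x) are illustrative assumptions):

```python
import math


def map_decode(labels, channels, prior_infected=0.1):
    """MAP-decode one cell's true label from several gamers' responses.

    labels:   list of binary labels y_j, one per gamer, for the same cell.
    channels: list of dicts {(y, x): P_j(y | x)}, one per gamer, e.g. as
              estimated from control images.
    prior_infected: prior probability that a cell is infected.
    Returns (decoded_label, posterior probability of that label).
    """
    log_post = {}
    for x, prior in ((0, 1 - prior_infected), (1, prior_infected)):
        lp = math.log(prior)
        for y, ch in zip(labels, channels):
            lp += math.log(ch[(y, x)])
        log_post[x] = lp
    best = max(log_post, key=log_post.get)
    # normalise to report a posterior probability alongside the decision
    z = sum(math.exp(v) for v in log_post.values())
    return best, math.exp(log_post[best]) / z
```

Gamers with lower error probabilities automatically contribute more to the decision, which implements the weighting of expert versus non-expert labels discussed earlier.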
  • FIG. 6 illustrates another embodiment of a hybrid method and system that combines a human and machine-based diagnostic framework.
  • new images that are to be labeled or diagnosed are generated in operation 60.
  • the new images are then subjected to a pre-trained machine learning algorithm 62.
  • Pre-trained machine learning algorithm 62 may be implemented in a machine learning module 36 or other feature like that disclosed in FIG. 1.
  • the diagnosis or label that is produced by the pre-trained machine learning algorithm 62 is associated with a confidence level.
  • the confidence level is then compared against a threshold level (T) as illustrated in operation 64.
  • the threshold may be a number, percentage, or the like that correlates with the confidence level of a diagnosis or label for a particular image.
  • the confidence level associated with the output of the pre-trained machine learning algorithm 62 determines whether the images are crowd-sourced or not. Those images that have a confidence level below the threshold (T) are passed on to the gamers and crowd-sourced as illustrated in operation 66. Conversely, those images that have a confidence level above the threshold (T) are not passed to the gamers and crowd-sourced. In this embodiment, the difficult-to-diagnose images are crowd-sourced while the easy-to-diagnose images are handled by the pre-trained machine learning algorithm 62.
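The confidence-threshold routing of operations 64 and 66 can be sketched as follows (the data shapes and function name are hypothetical):

```python
def route_images(predictions, threshold=0.9):
    """Split machine predictions into auto-accepted and crowd-sourced sets.

    predictions: iterable of (image_id, label, confidence) tuples from the
                 pre-trained classifier, with confidence in [0, 1].
    threshold:   confidence level T at or above which the machine label
                 is kept.
    Returns (accepted, to_crowd): accepted is a dict {image_id: label};
    to_crowd is the list of image ids forwarded to the gamers.
    """
    accepted, to_crowd = {}, []
    for image_id, label, conf in predictions:
        if conf >= threshold:
            accepted[image_id] = label
        else:
            to_crowd.append(image_id)
    return accepted, to_crowd
```

Lowering T sends fewer images to the crowd at the cost of accepting more uncertain machine labels, so T tunes the trade-off between throughput and accuracy.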
  • the results are merged with the diagnosis of the easy-to-diagnose images in merging operation 68.
  • an optional final diagnosis operation 70 may be performed where the sample or slide may be given a diagnosis. Of course, this operation 70 may be omitted where one only needs the merged diagnosis or label information.
  • a set of training data 72 is produced by the merged diagnostic information. This training data 72 is used to improve the self-learning algorithm 62. During each cycle, the self-learning algorithm 62 will improve as a result of added training data.
  • the algorithm extracts local color features from the cell images and feeds them to a classifier.
  • a small subset of cell images in the dataset (i.e., control images) is used as the training set.
  • the advantage of using hard-coded features is that one can use prior knowledge of the physical/structural properties of the parasites. For example, one can look for "ring-shaped objects" within the RBC image as an indicator for the existence of the parasite.
  • the first is a simple color histogram of the image in grayscale. This is a feature that carries information about the general distribution of image values.
  • a second, more complicated color feature, named Local Color Peak Histograms (LCPH), was also employed.
  • the LCPH for an image is formed by first generating highly quantized color histograms in the Hue-Saturation space over local windows. For each window the two most occurring histogram bins are found. Any given pair of bins corresponds to a particular index value. In other words, an index can be assigned to the occurrence of each pair of values in the histogram, and this is the value that is recorded for each local window in the image.
  • a second histogram is generated over the recorded indexes of all the local windows and used as the LCPH features. This feature essentially measures the relative occurrences of various color pair co-occurrences throughout the image.
  • FIG. 7 illustrates LCPHs.
  • a color histogram is calculated. The dominant pair of colors is used to compute an index (e.g., with 5 bins, there are a total of 10 different index values).
  • a histogram of all index values is computed and used as part of the feature vector.
  • a number of more basic image features such as mean, variance, and gradient magnitude histograms are used to form final feature vectors.
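A minimal sketch of the LCPH computation described above, assuming the image has already been reduced to a 2D array of quantized color-bin indices (the window size and bin count here are illustrative; with 5 bins there are C(5,2) = 10 distinct pair indices, matching the example in the text):

```python
import numpy as np
from itertools import combinations


def lcph(quantized, window=10, n_bins=5):
    """Local Color Peak Histogram (sketch).

    quantized: 2D integer array of per-pixel color-bin indices in
               0..n_bins-1, e.g. a coarsely quantized Hue channel.
    For each non-overlapping local window, the two most frequent bins are
    found and mapped to a pair index; the feature is the normalized
    histogram of those pair indices over all windows.
    """
    # every unordered pair of distinct bins gets a unique index
    pair_index = {p: i for i, p in enumerate(combinations(range(n_bins), 2))}
    counts = np.zeros(len(pair_index))
    h, w = quantized.shape
    for r in range(0, h - window + 1, window):
        for c in range(0, w - window + 1, window):
            patch = quantized[r:r + window, c:c + window].ravel()
            hist = np.bincount(patch, minlength=n_bins)
            top2 = tuple(sorted(np.argsort(hist)[-2:]))  # two peak bins
            counts[pair_index[top2]] += 1
    total = counts.sum()
    return counts / total if total else counts
```

The resulting vector would then be concatenated with the grayscale histogram, mean, variance, and gradient-magnitude features mentioned above to form the final feature vector.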
  • In training the classifier for the malaria parasite, a variation of Adaptive Boosting was used. In this method, many weak decision-tree classifiers are trained that together can produce a very strong classifier. In classical Adaptive Boosting, the overall classifier is tested on the complete training dataset at each iteration. Data points that are not correctly classified are then assigned larger weights for the next classifier to be trained. The algorithm used here deviated from the classical algorithm in that, instead of re-weighting the full training set and training a new classifier, the weights were used to probabilistically select a small subset of the training data on which the next weak classifier was trained.
  • FIG. 8 illustrates a summary of the Adaptive Boosting algorithm that was used.
  • the total number of training points is fixed for each weak classifier, i.e., for each weak classifier a total of n_s training vectors are chosen based on the weights W_k(i).
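The described boosting variant, in which the weights W_k(i) probabilistically select a subset of n_s training vectors for each weak learner, might be sketched as follows (decision stumps stand in for the decision trees purely for brevity; all names are assumptions):

```python
import numpy as np


def _stump_predict(X, feature, threshold, polarity):
    return np.where(X[:, feature] >= threshold, polarity, -polarity)


def _best_stump(X, y):
    """Exhaustively pick the lowest-error stump on the given subset."""
    best, best_err = (0, 0.0, 1), np.inf
    for f in range(X.shape[1]):
        for t in np.unique(X[:, f]):
            for pol in (1, -1):
                err = np.mean(_stump_predict(X, f, t, pol) != y)
                if err < best_err:
                    best_err, best = err, (f, t, pol)
    return best


def boost_with_subsets(X, y, n_rounds=10, n_s=20, rng=None):
    """Boosting variant: each weak learner is trained on a subset of n_s
    points drawn probabilistically according to the weights W_k(i), instead
    of re-weighting the full training set. y has labels in {-1, +1}."""
    rng = rng or np.random.default_rng(0)
    n = len(y)
    w = np.full(n, 1.0 / n)
    ensemble = []
    for _ in range(n_rounds):
        idx = rng.choice(n, size=min(n_s, n), p=w)  # weighted subset draw
        stump = _best_stump(X[idx], y[idx])
        pred = _stump_predict(X, *stump)
        err = np.clip(np.sum(w * (pred != y)), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)
        ensemble.append((*stump, alpha))
        w *= np.exp(-alpha * y * pred)               # up-weight mistakes
        w /= w.sum()
    return ensemble


def boosted_predict(X, ensemble):
    score = sum(a * _stump_predict(X, f, t, p) for f, t, p, a in ensemble)
    return np.where(score >= 0, 1, -1)
```

Misclassified points receive larger weights and are therefore more likely to appear in the next subset, preserving the spirit of classical Adaptive Boosting while keeping the per-round training cost fixed at n_s points.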
  • a digital gaming platform was developed through which an unlimited number of gamers from any location in the world were provided the opportunity to access and diagnose images of human RBCs that are potentially infected with P. falciparum.
  • the gaming platform was implemented to be run both on personal computers (using Adobe Flash running on any internet browser such as Internet Explorer, Mozilla Firefox etc.) and on Android- based mobile devices, including mobile-phones as well as tablet PCs.
  • each gamer was given a brief online tutorial explaining the rules of the game and how malaria infected RBCs typically look with some example images.
  • each gamer played a training game that the gamer was required to successfully complete in order to continue playing the rest of the game.
  • This test game consisted of 261 unique RBC images, 20 of which were infected.
  • the gamers were required to achieve >99% accuracy in this training game, and in the case of failure, they were asked to re-play the game until they achieved 99%. This way all the gamers became familiar with the rules of the game and were briefly trained on the diagnostics task.
  • This training game was required only once— when the gamers registered on our platform.
  • Upon registration, a unique user ID was assigned to each gamer and her/his individual diagnostics performance was tracked. Furthermore, this training game provided direct feedback to the players on their performance and their mistakes through a scoring mechanism. Since the labels (i.e., infected cell vs. healthy cell) of all the images were known a priori for the purposes of this training game, the player's score was updated throughout the game (i.e., positive score for correct diagnosis, and negative score for incorrect diagnosis).
  • in each frame there are a certain number of cells whose labels (infected or healthy) are known to the game, but unknown to the gamers.
  • These control cell images allow the system to dynamically estimate the performance of the gamers (in terms of correct and incorrect diagnoses) as they go through each frame and also help in assigning a score for every frame that they pass through. This differs from the training game, where all the images are effectively control images.
  • a score is assigned based on the performance of the gamer only on the control images.
  • These control images (roughly 20% of all the images) along with the scoring system allow the game to provide some feedback to the gamer on their performance such that as the gamers continue to play, they can improve their diagnostics performance.
  • the images and their order of appearance were identical among different gamers, thus allowing us to make a fair comparison among their relative performances.
  • each individual RBC image was cropped and resized to fixed dimensions of 50x50 pixels.
  • a set of images provided by the Center for Disease Control (CDC) was also used, yielding an additional 118 infected and 595 uninfected RBC images.
  • the framework of the games was modeled as a noisy communication network consisting of a broadcast unit, multiple repeaters, and a receiver/decoder unit for the final diagnosis as seen in FIG. 2.
  • receiver/decoder block which in turn computes the optimal "correct" label for each individual unknown RBC image using a Maximum a Posteriori Probability (MAP) approach.
  • an automated computer vision-based algorithm was developed to detect the presence of malaria parasites. In doing so the aim was to ultimately create a hybrid system such that machine vision and human vision can be coupled to each other, creating a more efficient and accurate biomedical diagnostics platform as illustrated in FIG. 6.
  • the automated diagnosis performance of the machine-vision algorithm was tested, which was trained on 1266 RBC images (same as the control images used in experiment #1) and was tested on a total of 5055 unique RBC images (471 positives and 4584 negatives - see Table 1).
  • FIG. 10 summarizes "the effect of the crowd” on diagnosis accuracy and sensitivity, i.e., how the overall performance of the crowd's diagnosis is improved as more gamers are added to the system.
  • the overall diagnosis accuracy also steadily improves as more gamers are added as shown in FIG. 10.
  • This crowd effect may seem like a deviation from the traditional benefits of crowd-sourcing, in that multiple players are inaccurately solving the whole puzzle and then their results are combined to yield a more accurate solution.
  • cell images from a single blood smear slide can be broken up into multiple batches, where each batch is crowd-sourced to a group of players.
  • each unique group of players will focus on one common batch of cell images, and in the end the diagnosis results will be combined once at the group level to boost the accuracies for each cell, and again at the slide level to make a correct overall diagnosis per patient. Therefore, the contribution of the crowd is twofold. First, it allows for the analysis problem to be broken up into smaller batches, and second, the analysis of the same batch by multiple individuals from the crowd allows for significantly higher overall diagnosis accuracies.
  • diagnosis results are for 'individual' RBCs, not for patients.
  • malaria diagnosis using a blood smear sample corresponding to a patient is a relatively easier task compared to single cell diagnosis since a thin blood smear for each patient sample already contains thousands of RBCs on it. Therefore statistical errors in the parasite recognition task could be partially hidden if the diagnostics decisions are made on a per blood-smear slide basis.
  • the system was aimed for the diagnosis of individual RBCs, rather than patients.
  • because any given patient's blood smear slide will be digitally divided into smaller images (containing, e.g., a handful of RBCs per image), and >1,000 RBC images per patient will be distributed to the crowd, one should expect much higher levels of accuracy and sensitivity for diagnosis of individual patients.
  • the single-cell-diagnosis- based gaming approach could also be very useful to estimate the parasitemia rate of patients which can be quite important and valuable for monitoring the treatment of malaria patients.
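Aggregating decoded per-cell labels into a parasitemia estimate and a slide-level call could be sketched as follows (the threshold value is purely illustrative, not clinical):

```python
def slide_level_diagnosis(cell_labels, parasitemia_threshold=0.0005):
    """Aggregate per-cell crowd labels into a per-slide (per-patient) call.

    cell_labels: list of decoded binary labels (1 = infected) for all RBC
                 images from one blood smear slide.
    parasitemia_threshold: fraction of infected RBCs at or above which the
                 slide is called positive (illustrative value only).
    Returns (diagnosis, parasitemia_rate).
    """
    rate = sum(cell_labels) / len(cell_labels)
    diagnosis = "positive" if rate >= parasitemia_threshold else "negative"
    return diagnosis, rate
```

The same parasitemia rate can be tracked over successive smears to monitor a patient's response to treatment, as noted above.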
  • although diagnosis of individual RBCs is described herein, the results may also be used to apply diagnostic results on a per-slide or per-patient basis.
  • This digital hub will allow for the creation of very large databases of microscopic images that can be used for e.g., the purposes of training and fine tuning automated computer vision algorithms. It can also serve as an analysis tool for health-care policy makers toward e.g., better management and/or prevention of pandemics.
  • motivators or other incentives can be used to recruit health-care professionals who are trained and educated to diagnose such biomedical conditions, making them part of the gamer crowd.
  • the gaming platform may serve as an intelligent telemedicine backbone that helps the sharing of medical resources through e.g., remote diagnosis and centralized data collection/processing.
  • the diagnosis can take place by professionals far away from the point-of-care.
  • it also enables the resolution of possible conflicting diagnostics decisions among medical experts, potentially improving the diagnostics outcome.
  • the number of gamers assigned to an image that is waiting to be diagnosed can be significantly lower as compared to the case where "non-professional" gamers are assigned to the same image.
  • the result of the diagnosis can still be very useful to reduce the workload of health-care professionals located at point-of-care offices or clinics where the raw images were acquired. In the case of malaria diagnosis, this is especially relevant since the health-care professional is required to look at > 1,000 RBC images for accurate diagnosis.
  • the proposed methodology can be expanded to include a training platform. Assuming the expansion of this crowd-sourced diagnostics platform and the generation of large image databases with correct diagnostics labels, software can be created to make use of such databases to assist in the training of medical professionals. Through such software, medical students and/or trainees can spend time looking at thousands of images, attempting diagnosis, and getting real-time feedback on their performances.
  • control images with "known" labels were used to estimate the statistical behavior of decisions made by individual gamers, which was then used to combine all the gamers' responses through a MAP estimation.
  • such "gold standard" metrics may be missing. For example, one may not have access to any labeled data, or users may not stay with the gaming platform long enough to track information about their diagnosis accuracy.
  • a system looks at decisions made by a group of trained medical experts.
  • one can combine the decisions made by such experts to generate more reliable diagnostic decisions at the single cell level.
  • one needs to simultaneously learn the image labels and the error probabilities associated with each expert, while maximizing the posterior probability of the observed labels.
  • a three-category mixture model for the original data was assumed and an EM algorithm used to generate the maximum likelihood labels for unknown cell images.
  • FIG. 11 illustrates a browser-based interface for remote cell labeling.
  • the interface is similar to that of FIG. 3 with similar features labeled as such.
  • Each expert is allowed to navigate through the database of cell images eliminating the infected cells and marking those that are questionable with a dedicated button 80. As a result each image can be labeled as positive, negative, or questionable.
  • This dataset of 8,664 images was derived from an original set of 2,888 images; i.e., each original image was rotated at multiples of 90° and randomly distributed within the final dataset.
  • These images were originally captured using different digital microscopes through 100x objective lenses (with a numerical aperture of at least 1.0), and were digitized at 24 bits.
  • These images were then remotely presented to each individual expert through a browser-based web interface as shown in FIG. 11.
  • This interface consists of multiple image frames, each containing a grid of individual RBC images.
  • the size of the grid depends on the screen resolution of the computer accessing the interface and is automatically adjusted.
  • the expert is asked to remove the infected and questionable (e.g., poor image quality, difficult to diagnose, etc.) cell images using the appropriate tools selectable from the side bar. Once all such images have been labelled, the remaining cells can all be labelled as uninfected or healthy using a Label All Negative button on the side bar.
  • the experts are asked to log-in prior to starting the diagnosis, and their individual responses are recorded on servers as they progress through the database of images. The experts were allowed to view and diagnose the images in multiple sessions and were not given any time constraints for completing the diagnosis task. All of the slide readers were certified malaria diagnosticians and had at least two (2) years of clinical experience with reading of thin smears.
  • Every expert is a statistical decision unit, and all the possible sources of error for an individual expert (e.g., relatively weaker training, poor eye sight, low display resolution, etc.) are treated as a lumped entity; and there is no aim to investigate different factors that make up the overall error probability of an individual expert. Instead, one of the main goals of this work was to demonstrate that a group of experts could be digitally combined to significantly boost the accuracy of the final diagnostic decision, when compared to even the best individual of the group.
  • FIG. 12 illustrates the forward model of the embodiment.
  • N+1 images with possible labels from {0, 1, 2} being sent to M+1 experts.
  • the j-th expert labels each image with a certain probability P_j(x | I).
  • the final dataset consists of an
  • FIG. 13 illustrates how the expert responses x_nj are treated as the observed variables and the true image labels I_n as the latent variables in a mixture model with parameters P_j(x | I).
  • Expectation Maximisation (EM) is used to obtain the maximum likelihood solution to the data.
  • γ(z_nk) are the "responsibilities" for component k given the data point x_n (i.e., the observation vector for the n-th image), which are evaluated during the "E" step of the EM algorithm.
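The EM procedure sketched below estimates the responsibilities γ(z_nk) together with per-expert confusion probabilities P_j(x | I), in the spirit of the mixture model described above (the initialisation, smoothing constants, and iteration count are assumptions for illustration):

```python
import numpy as np


def em_label_fusion(responses, n_classes=3, n_iters=50):
    """EM fusion of expert labels without gold-standard controls (sketch).

    responses: (n_images, n_experts) integer array of labels in
               0..n_classes-1 (e.g., negative / positive / questionable).
    Alternates between estimating class priors and per-expert confusion
    probabilities P_j(x | I) (M-step) and the responsibilities gamma(z_nk)
    (E-step); returns the maximum-likelihood label per image.
    """
    n, m = responses.shape
    # initialise responsibilities from the per-image vote fractions
    gamma = np.zeros((n, n_classes))
    for k in range(n_classes):
        gamma[:, k] = (responses == k).mean(axis=1)
    gamma += 1e-6
    gamma /= gamma.sum(axis=1, keepdims=True)
    for _ in range(n_iters):
        # M-step: class priors and per-expert confusion matrices
        pi = gamma.mean(axis=0)
        conf = np.zeros((m, n_classes, n_classes))  # conf[j, true, observed]
        for j in range(m):
            for x in range(n_classes):
                mask = responses[:, j] == x
                conf[j, :, x] = gamma[mask].sum(axis=0)
        conf += 1e-6
        conf /= conf.sum(axis=2, keepdims=True)
        # E-step: recompute the responsibilities gamma(z_nk)
        log_g = np.log(pi)[None, :].repeat(n, axis=0)
        for j in range(m):
            log_g += np.log(conf[j][:, responses[:, j]]).T
        gamma = np.exp(log_g - log_g.max(axis=1, keepdims=True))
        gamma /= gamma.sum(axis=1, keepdims=True)
    return gamma.argmax(axis=1)
```

Because the true labels and the expert error probabilities are learned jointly, no control images are needed, which is the point of this embodiment.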
  • the focus is on the diagnoses of 'single' RBC images by experts since it is the basic task to be repeated e.g., more than 1,000 times toward accurate diagnosis of a single patient's blood smear sample.
  • Single-cell-based analysis of a smear is essential for estimating the parasitemia, which can be quite important and valuable for monitoring the treatment of malaria patients.
  • a slide-level diagnosis is made for a patient (i.e., malaria infection observed, or malaria infection not observed).
  • the sensitivity is the probability of correctly labelling an infected cell as infected.
  • [00115] The motivation of the system and method is not only to create a more accessible platform for telepathology, but also to increase the efficiency and accuracy of remote medical diagnosis. In other words, even relatively poorly trained medical personnel can be digitally and remotely combined to create highly accurate collective decisions (assuming each individual can perform at least better than chance in terms of accuracy). To set the stage in terms of motivation and potential severity of the problem, FIG. 14 shows the experimental results, revealing the level of agreement that exists among nine highly trained medical personnel who are experts in diagnosing malaria.
  • each RBC image in the database was presented three times at rotations of 90° to each expert for labelling.
  • FIG. 15 shows the level of self-inconsistency that each expert exhibits within her/his responses. The most consistent expert has a self-inconsistency of 0.2% and 0.8% for the negative and positive categories, respectively, and the least consistent expert is more than 2% inconsistent in each of those categories.
  • FIG. 18 also illustrates some sample RBC images from the categories that resulted from this consensus. Absolute accuracy is not the best metric to measure the performance of the experts in this setting due to the significant imbalance that exists in the number of healthy and infected cells in our dataset; this imbalance is even more drastic in individual patient samples due to the low parasitemia levels that typically exist in malaria-infected patients.
  • NPV Negative Predictive Value
  • PPV Positive Predictive Value
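The performance metrics referred to here are defined later in the document (caption of FIG. 17) as Accuracy = (TP+TN)/(TP+TN+FP+FN), PPV = TP/(TP+FP), NPV = TN/(TN+FN), and FPR = FP/(TN+FP). They can be computed directly from the label counts; a minimal sketch (function name hypothetical):

```python
def diagnostic_metrics(tp, tn, fp, fn):
    """Per-reader performance metrics from counts of true positive (tp),
    true negative (tn), false positive (fp), and false negative (fn) labels,
    using the definitions given in FIG. 17."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "ppv": tp / (tp + fp),  # Positive Predictive Value
        "npv": tn / (tn + fn),  # Negative Predictive Value
        "fpr": fp / (tn + fp),  # False Positive Rate
    }
```

Because infected cells are rare, PPV and NPV are more informative than raw accuracy for comparing readers on an imbalanced dataset.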
  • As noted herein, there is a distinction between cell-level diagnosis and smear-level diagnosis. Although the former is a necessary step in performing the latter, the two do not correspond to each other in a straightforward linear fashion.
  • the method may be used to score or diagnose slides, sample results from a subject, or even groups of subjects.
  • FIG. 19A shows that when diagnosing a smear that has a parasitemia level of 0.5% (which can be typical), if the expert has a sensitivity (i.e., true positive rate) of 99%, meaning that s/he labels an infected cell correctly 99% of the time, and a false positive rate of 1% (i.e., specificity of 99%), meaning that s/he makes the mistake of calling an uninfected cell infected 1% of the time, s/he would then need to label more than 2,000 individual cells so that s/he would have a smear-level false positive rate less than 10% with a true positive detection rate of 80%.
  • FIG. 19B illustrates the same graph with a parasitemia level of 1.0%.
  • the smear-level diagnosis accuracy improves as the number of labelled cells N is increased.
  • This theoretical analysis may to some extent explain the prevalence of false positive diagnoses in sub-Saharan Africa (sometimes approaching ~60%), since even with extremely high single-cell accuracy levels, professionals can still make mistakes, and unless they observe statistically significant numbers of cells, they cannot avoid making frequent false positive diagnoses.
  • This mathematical framework can be generalized and used to customize and fine tune standard diagnostic procedures depending on the training levels of individual experts. Such action may lead to significant improvements in diagnosis efficiency and cost-effectiveness, especially within a digital telepathology platform.
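One plausible way to reproduce the smear-level analysis of FIGS. 19A-19B is to model the number of positively labelled cells as a binomial count and call the smear positive above a count threshold. The sketch below is an assumption about the underlying model (the text does not give the exact formulation), with hypothetical function names:

```python
def binom_tail(n, p, t):
    """P(X >= t) for X ~ Binomial(n, p), via the pmf recurrence
    pmf(k+1) = pmf(k) * (n-k)/(k+1) * p/(1-p)."""
    if t <= 0:
        return 1.0
    pmf = (1.0 - p) ** n        # P(X = 0)
    cdf = pmf
    for k in range(t - 1):      # accumulate P(X <= t-1)
        pmf *= (n - k) / (k + 1) * p / (1.0 - p)
        cdf += pmf
    return max(0.0, 1.0 - cdf)

def smear_roc_point(n_cells, parasitemia, sens, fpr_cell, threshold):
    """Smear-level (TPR, FPR) when a smear is declared positive once at
    least `threshold` of `n_cells` individually labelled cells are called
    positive, given per-cell sensitivity `sens` and per-cell false positive
    rate `fpr_cell`."""
    # In an infected smear, each cell is labelled positive either because it
    # is infected and detected, or uninfected and falsely flagged.
    p_pos_infected = parasitemia * sens + (1.0 - parasitemia) * fpr_cell
    tpr = binom_tail(n_cells, p_pos_infected, threshold)
    fpr = binom_tail(n_cells, fpr_cell, threshold)
    return tpr, fpr
```

Sweeping `threshold` from 0 to `n_cells` traces out a smear-level ROC curve, which improves as `n_cells` grows, consistent with the observation above that more labelled cells yield better smear-level accuracy.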
  • CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart.
  • the bioCAPTCHA is a biologically oriented CAPTCHA that various entities can incorporate into their websites in place of traditional CAPTCHAs to ensure that visitors are humans and not automated computer software.
  • This task could include any number of tasks including, but not limited to, counting of objects within an image (e.g., counting cells, parasites, etc.) or segmentation of a large image toward classification of different parts that make up the image. Because training of users for a bioCAPTCHA interface would not be feasible, the human tasks that are required will solve only part of the overall image analysis. As one embodiment, the bioCAPTCHA interfaces could be used to help create images to be used in the games for further advanced analysis by one or more trained gamers.
  • certain regions of an imaged cell may be highlighted by the user of the bioCAPTCHA interface. This includes, for example, identifying organelles, cell borders, and stained regions of the cells. Other examples include counting cells or features contained within a presented image (which may be zoomed). The user may also be asked to identify certain abnormalities or inconsistencies within a set of images (e.g., which cell or cells is/are different from the majority of cells). This last example may involve the user passively identifying potentially diseased cells.
  • FIG. 20 illustrates an example of a bioCAPTCHA.
  • the visitor to a website is asked to count and enter the number of cells in two images.
  • the number of cells in image (a) is known a priori as a control measure to both assess the performance of the user, and to ensure that computer software is not allowed to circumvent the bioCAPTCHA.
  • the cell-count in image (b) is unknown.
  • Image (b) is presented to many users as a
  • image (b) can be a part of a much larger image, which is split into small patches, making the task of counting the cells easy enough for the bioCAPTCHA interface.
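A minimal sketch of this control-plus-unknown scheme follows, assuming counts from successive visitors are fused by taking the median; the fusion rule and function names are assumptions, not specified in the text.

```python
from statistics import median

def biocaptcha_check(control_count_true, user_control_count,
                     user_unknown_count, unknown_responses):
    """Gate on the known image (a): if the visitor counts it correctly,
    record their count for the unknown image (b) in `unknown_responses`
    and return the fused (median) estimate across all visitors so far.
    A wrong control count fails the CAPTCHA and discards the response."""
    if user_control_count != control_count_true:
        return None  # failed the human test; nothing recorded
    unknown_responses.append(user_unknown_count)
    return median(unknown_responses)
```

The control image both verifies that the visitor is human and scores their counting performance, while the fused answers for image (b) accumulate into a crowd-sourced count for the larger source image.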


Abstract

A method of analyzing microscope slide images using crowd-sourcing includes obtaining one or more microscopic images of cells. The images are subject to image preprocessing to identify groups of cells or individual cells within the image. Images of the groups of cells or individual cells are transferred to computer gaming devices associated with different users that are configured to run gaming software. The users identify, using the gaming software, individual cells suspected of having a particular characteristic or phenotype. The identification information from the different computer gaming devices is transferred to a remotely located computing device. The remotely located computing device labels the individual cells based at least in part on a decoding operation.

Description

SYSTEM AND METHOD FOR CROWD-SOURCED TELEPATHOLOGY
Related Application
[0001] This Application claims priority to U.S. Provisional Patent Application No.
61/613,396 filed on March 20, 2012. Priority is claimed pursuant to 35 U.S.C. § 119. The above-noted Patent Application is incorporated by reference as if set forth fully herein.
Technical Field
[0002] The technical field generally relates to systems and methods for remote telepathology and more specifically, crowd-sourced telepathology.
Background
[0003] Crowd-sourcing is an emerging concept that has attracted significant attention in recent years as a strategy for solving computationally expensive and difficult problems. In this computing paradigm, pieces of difficult computational problems are distributed to a large number of individuals. Each participant completes one piece of the computational puzzle, sending the results back to a central system where they are all combined together to formulate the overall solution to the original problem. In this context, crowd-sourcing is often used as a solution to various pattern-recognition and analysis tasks that may take computers a long time to solve. One of the underlying assumptions of such an approach is that humans are better than machines at certain computational and pattern recognition tasks.
[0004] There has been work done in the general field of 'gaming' as a method for crowd-sourcing of computational tasks. Digital games have been used as effective means to engage an individual's attention to computational tasks of interest. If a pattern-recognition task can be embedded as part of an engaging game, then a gamer may help in solving this task together with other gamers. One of the most successful crowd-sourced projects is reCAPTCHA, a crowd-sourcing project for digitizing books and non-traditional prints. More recently, a number of gaming platforms have been created to tackle problems in, e.g., biology and medical sciences, allowing non-experts to take part in solving such problems. Foldit (available on the Internet at fold.it), as an example, is a game in which players attempt to digitally simulate folding of various proteins, helping researchers to achieve better predictions about protein structures. EteRNA is another game, which likewise makes use of crowds to get a better understanding of RNA folding. The game play consists of games or puzzles wherein users attempt to design RNA sequences that fold up into a target shape on the user's computer. These puzzles are used to help reveal new principles for designing RNA-based switches and nanomachines.

[0005] Separate from crowd-sourcing of computational tasks, medical imaging has gone through a co-evolution along with the computer industry over the past three decades, with each medical imaging modality benefiting in major ways from the ever-increasing abilities of modern computers. The possibility to capture, store, and manipulate images digitally has brought upon a new age of medical imaging with a significant shift in focus toward more complex analysis software. Through sheer computation and clever mathematical algorithms, modern medical imaging devices are capable of producing higher-quality images faster while exposing patients to much less harmful radiation.
[0006] Another dimension of medical imaging's evolution has been a consequence of rapid advances in telecommunications and the coming of age of the Internet. These days an X-ray or a microscope slide image can be viewed almost instantaneously thousands of miles away from the point of capture by an expert who had no involvement in the imaging procedure. This unprecedented level of access to medical images and data is now opening up new approaches to medical diagnosis, heralding the age of telemedicine, where one can outsource medical diagnosis to doctors in faraway locations, while making it significantly easier to get a second opinion on a particular diagnosis.
[0007] Attempts have also been made to automate the process of detecting disease states in microscopic images of blood. For example, there have been prior attempts based on machine vision algorithms to automate the process of malaria (Plasmodium falciparum) diagnosis in Giemsa-stained thin blood smears using optical microscopy images. However, there are a number of factors that can negatively affect the performance of such algorithms, including variations in blood smear preparation and cell density on the slide, as well as variations in illumination, digital recording conditions, optical aberrations, and lens quality. As a result, these methodologies have not yet been able to find their way into mainstream malaria diagnostics tools to start replacing manual inspection of blood smears.
Summary
[0008] In one embodiment, a system and method is provided that transfers microscopic images or portions thereof of specimens to a plurality of computer devices configured to run gaming software thereon (e.g., computer gaming devices). The computer gaming devices may include, for example, personal computers, or portable electronic devices such as tablets, mobile phones, or wearable computers. The gaming software receives the microscopic images and prompts the user to provide a response about various aspects of the microscopic image. For example, the microscopic image presented to a user of the gaming software may include individual cells contained within the specimen. The user may be asked to identify those cells that are positive with respect to a particular phenotype or disease state. Likewise, the user may be asked to identify those cells that are negative with respect to a particular phenotype or disease state.
[0009] In one embodiment, all or substantially all of the users that play the game to identify the cells are non-expert users. In this regard, the non-expert users are not specially trained in cell pathology or microscopy. In other embodiments however, some (or all) of the users that play the game may be characterized as expert users. The results associated with the expert users may, in some instances, be combined with the results associated with the nonexpert users to improve diagnostic results.
[0010] The results from each user are then transferred from each respective computer gaming device to one or more remotely located computing devices. For example, the remotely located computing devices may include a computer server or multiple computer servers. The results from any particular user may include an identification or label that is associated with a particular image or image frame. In some instances, this information may be binary information such as positive/negative or infected/not infected. In other embodiments, the information may also include additional options such as when the user is unsure of the particular identification, i.e., the user is unsure whether the image is positive or negative.
[0011] In one embodiment, the microscopic images or portions thereof may include known control images in which the identification is known a priori. For example, images of cells that are known to be positive or negative may be presented to the user. The control images may be used to score the accuracy or effectiveness of the particular user. In other embodiments, the images presented to the users may not include any control images. For example, experienced users may receive microscopic images or portions thereof with no controls.
[0012] The gaming software may optionally provide feedback to the users. For example, the feedback may include a performance metric. Such a metric may include a number or percentage that corresponds to identification accuracy or the like. The gaming software may also provide one or more motivators that encourage users to play the game. Such motivation may include monetary motivation. For example, the user may be paid a small amount of money per identification or "click." As another option, the user may be paid based on his or her expertise level. Alternatively, the monetary motivation may include a donation that is paid to the user's selected charitable organization. The motivator may also include software features that engage the user. For example, the user may self-select data sets that originate from a certain geographical region or area of interest (e.g., the user's home country). Any number of motivators are contemplated.
[0013] The results of each game or partial game are transmitted back to the one or more remotely located computing devices. In one embodiment of the invention, the results are then aggregated or fused and subject to analysis whereby the putative correct label or diagnosis (e.g., positive, infected) is assigned to the particular input image. In one aspect of the invention, this process involves decoding the results from all of the users using decoding software or the like. In one embodiment, such decoding may include a Maximum A Posteriori (MAP) probability approach.
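The MAP decoding step is named above without being detailed. One standard formulation, shown here purely as a sketch, assumes conditionally independent gamers with per-gamer accuracies (e.g., estimated from control images) and fuses binary labels in the log-odds domain; the function names and default prior are hypothetical.

```python
import math

def map_label(votes, accuracies, prior_infected=0.01):
    """Fuse binary cell labels (1 = infected, 0 = not infected) from several
    gamers into a single MAP decision. Assumes the gamers label independently
    and that each gamer's accuracy (probability of a correct label) is known,
    e.g., estimated from control images with known ground truth."""
    # Start from the prior log-odds of infection.
    log_odds = math.log(prior_infected / (1.0 - prior_infected))
    for vote, acc in zip(votes, accuracies):
        # A vote from a gamer of accuracy `acc` shifts the log-odds by
        # log(acc / (1 - acc)), toward whichever label the gamer chose.
        shift = math.log(acc / (1.0 - acc))
        log_odds += shift if vote == 1 else -shift
    return 1 if log_odds > 0 else 0
```

Unlike a plain majority vote, this fusion automatically weights accurate gamers more heavily and effectively ignores gamers whose accuracy is near chance.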
[0014] In one embodiment, a diagnosis is given based on the labels or diagnosis applied to the individual image. For example, a slide or other sample may include hundreds or thousands of cells. Once all or a portion of these cells have been labeled or diagnosed, the slide or sample may then be diagnosed. For example, the slide may contain blood that tests positive for a disease state. As one example, red blood cells (RBCs) that are stained with Giemsa stain may be imaged with microscopes whereby images of individual cells within the same are transmitted to different computing devices that are configured to run gaming software. The various images are then classified or identified by the users and the results transmitted back to the central server or servers for decoding. In one example, the user identifies those cells that are infected or not infected with the malaria parasite. Each cell is assigned a label based at least in part on the aggregated data from the different users running the gaming software. This information may then be used to assign a diagnosis to a slide or sample. For example, based on the decoded data, the software running on the remote server(s) may output that the slide is viewed as positive for infection with the malaria parasite. If patient identification information is associated with each slide, the software may output that a particular patient has tested positive for the malaria parasite.
[0015] In one embodiment, a method of analyzing microscope slide images using crowd-sourcing includes obtaining one or more microscopic images of cells on the microscope slide. Image processing is performed to identify groups of cells or individual cells within the image. Images of the groups of cells or individual cells are transferred to a plurality of computer gaming devices associated with different users, the plurality of computer gaming devices configured to run gaming software, the gaming software configured to display the images of the groups of cells or individual cells on a display. The gaming software is used to identify individual cells suspected of having a particular characteristic or phenotype, wherein the identification is performed on the plurality of different computer gaming devices.
Identification information is transferred from the plurality of different computer gaming devices to one or more remotely located computing devices. The individual cells are labeled based at least in part on a decoding operation performed by the one or more remotely located computing devices on the transmitted identification information.
[0016] In another embodiment, a method of analyzing microscope slide images using crowd-sourcing includes obtaining one or more microscopic images of cells on the microscope slide. Image processing is performed with at least one computer to identify individual cells or groups of cells within the image. Individual cells or groups of cells suspected of having a particular characteristic or phenotype are automatically identified using a pre-trained machine learning algorithm executed on the at least one computer, wherein the automatically identified cells or groups of cells are those cells having a confidence level above a threshold value. Images of the remaining cells are transferred to a plurality of computer gaming devices associated with different users, the plurality of computer gaming devices configured to run gaming software, the gaming software configured to display images of the cells on a display, wherein the remaining cells are those cells having a confidence level below the threshold value. The gaming software is used to identify individual cells or groups of cells suspected of having the particular characteristic or phenotype, wherein the identification is performed on the plurality of different computer gaming devices. The identification information from the plurality of different computer gaming devices is transferred to the at least one computer. The individual cells are labeled based at least in part on a decoding operation performed by the at least one computer on the transmitted identification information for the cells having the confidence level below the threshold value.
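The confidence-threshold routing in this hybrid embodiment can be sketched as follows; this is a simplified illustration, and the threshold value and names are assumptions rather than values given in the text.

```python
def route_cells(cell_ids, confidences, threshold=0.95):
    """Split cells between the machine classifier and the crowd: cells the
    pre-trained algorithm labels with confidence at or above `threshold`
    are accepted automatically; the rest are forwarded to the gaming
    platform for human labelling."""
    auto, crowd = [], []
    for cid, conf in zip(cell_ids, confidences):
        (auto if conf >= threshold else crowd).append(cid)
    return auto, crowd
```

In such a hybrid system, the labels decoded from the crowd for the low-confidence cells could also be fed back to retrain the machine learning module, narrowing the set of cells that need human review over time.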
[0017] In another embodiment, a system for analyzing microscope slide images using crowd-sourcing includes a remote computing device configured to receive one or more microscopic images of cells on the microscope slide and further configured to identify groups of cells or individual cells within the image; a plurality of computer gaming devices containing gaming software configured to receive images of the groups of cells or individual cells from the remote computing device, the gaming software further configured to display images of the groups of cells or individual cells on a display and permit user identification of individual cells suspected of having a particular characteristic or phenotype; and wherein the remote computer is configured to receive user identification information transmitted from the plurality of computer gaming devices and further configured to label the individual cells based at least in part on a decoding operation performed by the remotely located computing device on the transmitted identification information.
Brief Description of the Drawings
[0018] FIG. 1 illustrates a top level view of one method of analyzing microscope slide images using crowd sourcing.
[0019] FIG. 2 illustrates a schematic or block representation of the major components of the system according to one embodiment.
[0020] FIG. 3 illustrates a screen view of an exemplary graphical interface of a game that is displayed on a display of the computer gaming device.
[0021] FIG. 4 illustrates a screen view of another exemplary graphical interface of a game that is displayed on a display of the computer gaming device.
[0022] FIG. 5 illustrates the framework of the decoding algorithm in which the games are modeled as a communication system consisting of a broadcast unit, multiple repeaters, and a receiver/decoder.
[0023] FIG. 6 illustrates a hybrid (human plus machine) diagnostics method according to one embodiment.
[0024] FIG. 7 illustrates a Local Color Peak Histogram. For every window block, a color histogram is calculated.
[0025] FIG. 8 illustrates an adaptive boosting algorithm.
[0026] FIG. 9 illustrates various acronyms, their respective term names, and definitions.
[0027] FIG. 10 illustrates a graph of accuracy and sensitivity for twenty (20) gamers.
[0028] FIG. 11 illustrates a browser-based interface for remote cell labeling according to another embodiment.
[0029] FIG. 12 illustrates a forward model used as part of the mixture model.
[0030] FIG. 13 illustrates a decoding model used as part of the mixture model.
[0031] FIG. 14 illustrates a graph showing experimental results on the level of agreement among experts. Y axis represents % of total images in category (positive, negative, and questionable) and X axis represents number of experts agreeing on decision.
[0032] FIG. 15 illustrates experimental results of level of self-inconsistency of each expert within each category.
[0033] FIG. 16 shows the performance results from nine simulated experts with varying average ensemble accuracies.

[0034] FIG. 17 illustrates experimental performance metrics of experts. The metrics are calculated after combining the responses of all the experts using EM and then assuming the results to be correct. Accuracy = (TP+TN)/(TP+TN+FP+FN), PPV = TP/(TP+FP), NPV = TN/(TN+FN), FPR = FP/(TN+FP), where TP, TN, FP, and FN correspond to the number of true positive, true negative, false positive, and false negative labels, respectively.
[0035] FIG. 18 illustrates sample cells classified by the system and method.
[0036] FIG. 19A illustrates Receiver Operator Characteristic (ROC) curves for smear-level diagnosis with a smear that has a parasitemia level of 0.5%.
[0037] FIG. 19B illustrates ROC curves for smear-level diagnosis with a smear that has a parasitemia level of 1%.
[0038] FIG. 20 illustrates an example of an embodiment that is implemented as part of a bioCAPTCHA.
Detailed Description of the Illustrated Embodiments
[0039] FIG. 1 illustrates a top level view of one method of analyzing microscope slide images using crowd sourcing. In this method, microscopic images of cells 10 contained on a microscope slide 12 are obtained using a microscope 14. FIG. 1 illustrates a map of the world illustrating that microscopes 14 may be located in a number of different geographic regions. The microscopes 14 are able to digitally capture images of the cells 10 on the microscope slide 12. The cells 10 may be prepared using conventional preparation and staining techniques. For example, as one illustrated embodiment described herein, a thin blood smear is formed on the microscope slide 12. In this embodiment, the thin blood smear is stained with Giemsa stain which is used to detect the presence of malaria parasites in red blood cells (RBCs). Giemsa stain stains the nuclear material in the malaria parasite with a blue tint and does not affect RBC morphology. The microscopes 14 are typically brightfield microscopes that are able to digitally capture images of the stained RBC preparation using a 100X oil-immersion objective lens.
[0040] As seen in FIG. 1, the microscopes 14 in this example tend to be located within geographic regions where there is risk of contracting malaria. For example, the microscopes 14 may be located in clinics, hospitals, or other point-of-care facilities. The digitized images of the stained cells are then transmitted back to a remotely located computing device 16. This computing device 16 may include one or more computer servers with ample storage capacity to store imaging and other data received from the geographically dispersed microscopes 14 as well as crowd-source game data as described in more detail herein. The remotely located computing device 16 may be housed in a data center or the like. In its broadest sense, the image data from the microscopes 14 is transmitted to the remotely located computing device 16 wherein the images are subject to image pre-processing and then distributed out to a plurality of computer gaming devices 20 which are configured to run gaming software utilizing the processed images. The computer gaming devices 20 may include, by way of example, personal computers, laptop computers, hand held computers, tablets, mobile phones, and wearable computers (e.g., glasses, watches, and the like with embedded computers or processors). Users of the computer gaming devices 20 are referred to herein as "gamers." The gamers identify cells 10 having a particular phenotype or characteristic which is then communicated back to the remotely located computing device 16. At a top level description, the individual results from multiple gamers are then fused together to label or identify individual cells 10 having a particular phenotype or characteristic. In the context of RBC cells 10 stained with Giemsa stain, one particular phenotype or characteristic includes infected or positive cells 10 that have been identified as being infected by the malaria parasite. Another particular phenotype or characteristic may include non-infected or negative cells 10 that have been identified as not being infected with the malaria parasite.
[0041] Reference to a phenotype or characteristic of a cell 10 may refer to the physical size, shape, or state of the cell 10. It may also refer to whether the cell has a particular disease state such as described above with respect to infection with a parasite. Other disease states are also contemplated such as cancer, sickle cell anemia, immunological disease, and the like. Phenotype or characteristics of a cell 10 may also include abnormalities that may or may not be associated with a diseased state. Stains or the like may or may not be used in connection with the operation of obtaining microscopic images. As explained in more detail below, results from multiple gamers are decoded by the remote computing device 16 to assign a label to each cell 10 (or in some instances groups of cells 10). For example, in a binary-based labeling system, one label may be infected cells 10 while another label may be non-infected cells. The remote computing device 16 uses statistical-based algorithms to assign these labels based at least in part on the labels provided by the many gamers. For example, rather than use a majority-based system to assign final labels to cells 10, the computing device 16 is able to weight (or de-weight as the case may be) the label results provided by the various gamers to achieve a final label for the cell 10 that closely approximates or even exceeds the accuracy achieved by a trained or expert cytologist. Gamers are monitored for their accuracy during game play by the addition of one or more control images where the results are known a priori. This information can be used to score or otherwise track the accuracy with which a particular gamer is labeling the cells 10. This information is then fed into the remote computing device 16 and is used to synthesize a final label determination for a particular cell 10.
[0042] FIG. 2 illustrates a schematic or block representation of the major components of the system according to one embodiment. Illustrated in FIG. 2 is the remote computing device 16. As stated above, the remote computing device 16 may include one or more computers (e.g., servers) that include therein processors that are configured to execute software to carry out the methods described herein. The remote computing device 16 may also include storage functionality as well as computational functionality. For example, the remote computing device 16 may include or be associated with storage media (e.g., disk drives) for storing data such as images, labels, identification information, and the like. In some embodiments, one or more components contained in the remote computing device 16 may be distributed across multiple locations. For example, storage of image files may take place in a location separate from where the software of the remote computing device 16 is executed.
[0043] As seen in FIG. 2, the remote computing device 16 runs software that includes a data pre-processing and collection module 24 or functionality. The data pre-processing and collection module 24 may receive raw data from microscopes 14. For example, FIG. 2 illustrates raw data being transmitted to the remote computing device 16 from a remote source 26 such as point-of-care clinics or hospitals. While raw data may be communicated from the remote source 26, diagnosis or other information can be communicated from the remote computing device 16 back to the remote source 26 (e.g., clinic or hospital). In one aspect, the data pre-processing and collection module 24 may receive digital images of cells 10 contained on slides 12 and subject the same to lower level processing. For example, a large image of the entire field of view of the microscope 14 may be broken down into smaller images, where each image contains a single cell 10 or a group of multiple cells 10. The data pre-processing module 24 thus may prepare the raw image files into smaller image files that can then be sent to the gaming community 28 using a game module 30. The gaming community 28 may include, in some embodiments, substantially all non-experts 32 or non-professionals having no experience in pathology or cytology. In other embodiments, the gaming community 28 may include experts 34 or other trained professionals with experience in examining images of cellular samples.
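The pre-processing step of breaking a large field-of-view image into smaller images can be illustrated with a simple non-overlapping tiling. This is a sketch only: a real pre-processing module would segment around detected cells rather than tile on a fixed grid, and the function name is hypothetical.

```python
def split_into_patches(image, patch_h, patch_w):
    """Break a full field-of-view image (a 2-D grid of pixel rows) into
    non-overlapping patch_h x patch_w patches, suitable for distribution
    to gamers as smaller image files. Edge remainders are dropped."""
    h, w = len(image), len(image[0])
    patches = []
    for top in range(0, h - patch_h + 1, patch_h):
        for left in range(0, w - patch_w + 1, patch_w):
            patches.append([row[left:left + patch_w]
                            for row in image[top:top + patch_h]])
    return patches
```

Each resulting patch would then carry an identifier tying it back to its source slide so that per-patch labels can later be fused into a slide-level result.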
[0044] The games that are played by the gaming community 28 on respective computer gaming devices 20 may take any number of forms. The games may be downloaded or otherwise transferred to the gaming device 20 using, for example, the gaming module 30. The game may take the form of an application or "app" that resides on the computer gaming device 20. The game may also run within a web browser (e.g., using JAVA, FLASH, HTML, or the like). The computer gaming device 20 may include, for example, personal computers, portable electronic devices such as tablets, mobile phones, or the like. The games may run on multiple platforms and are not limited to a particular computer gaming device 20. Some users may play the game on a mobile phone, others on a personal computer, and still others on their tablet devices.
[0045] Still referring to FIG. 2, in one aspect of the invention, a machine learning module 36 is optionally included in the software running on the remote computing device 16. The machine learning module 36 may contain one or more machine-vision algorithms that are prepared by training using a dataset of known control slides. The machine-vision algorithm is able to automatically label cells 10 based on input images. In some embodiments of the invention, both the game module 30 and the machine learning module 36 are used together as part of a hybrid system to create a more efficient and accurate labeling platform. For example, results obtained from the various gamers may be used to improve the accuracy of the machine-vision algorithm. It should be understood, however, that in some embodiments, there is no need for the machine learning module 36.
[0046] Still referring to FIG. 2, game results from the gamer community 28 are returned to the remote computing device 16. Data may be transferred in batch form after the game has ended, or the data may be transferred back to the remote computing device 16 as the game is being played. Data transfer between the computer gaming devices 20 and the remote computing device 16 may occur across any number of networks. For example, commercial wireless networks used for voice or data traffic may be used to transfer data to and from the computer gaming devices 20. In other instances, a wide area network such as the internet may be used to transfer data between the computer gaming devices 20 and the remote computing device 16. Some combination of different networks can also be used. For example, computer gaming devices 20 may be connected to the internet via a WiFi network, Bluetooth, or the like.
[0047] The computer gaming device 20 includes gaming software in which users label cells 10 as part of the gaming environment. These user-labeled cells 10 are then identified by the gaming software and appropriate labels are then given to the different images. This information is then transmitted back to the remote computing device 16. As seen in FIG. 2, the game module 30 may include decoding functionality as described in more detail herein, wherein label results provided by a plurality of users are "decoded" to find the true or correct label for each particular cell 10. This decoding operation may weigh the user-provided labels depending on the particular accuracy or effectiveness of the user that is playing the game. For example, the labels provided by the expert gamer 34 may be weighted higher than those of the non-expert 32 gamer. This weighting may be based on the performance of each respective gamer when labeling or otherwise classifying cells. This performance may be monitored, for example, by the use of control images that are interspersed among the test images that are displayed to the user on the computer gaming device 20.
[0048] FIG. 2 also illustrates that one or more educational or research facilities 38 may exchange data with the remote computing device 16. For example, in one embodiment, a diagnosis of a slide or sample is returned to the remote source 26. As another example, the remote computing device 16 may be used to form a library of images in which cells 10 are labeled using the crowd-sourcing techniques described herein. This library of images may be used as a teaching tool, for example, to train pathologists or cytologists. The library of images may also be used for research purposes. The library of known images may also be used to train other machine learning algorithms.
[0049] FIG. 3 illustrates a screen view of an exemplary graphical interface 39 of a game that is displayed on a display of the computer gaming device 20. In this game, a user is presented with a plurality of individual images 40, with each image containing approximately a single cell 10. With reference to FIG. 2, these individual images of cells 10 may be produced by pre-processing of the raw images using the pre-processing module 24. The individual images 40 are presented in an array format, although other formats could be used. In this example, twenty four (24) images are presented to the user. In this game, the user is required to label those cells 10 that are viewed as infected with the malaria parasite, i.e., those cells 10 viewed as positive. To do this, the user would touch the positive button 42 located on one side of the interface and then proceed to touch those cells 10 that the user interprets as positive for the malaria parasite. Touching the positive button 42 as well as the cells 10 can be accomplished in any number of ways. For example, a cursor could be used to depress the button 42 and select cells 10. Alternatively, if the screen is a touch screen, a user may simply touch the screen in the location of the button or cell to accomplish the same result.
[0050] In this instance, image 40' represents a RBC 10 that is infected by the malaria parasite. RBCs do not themselves have nuclei. The Giemsa stain gives the nuclear material in the parasite a blue-colored tint but does not affect the RBC morphology. In addition, later stage infections by the malaria parasite tend to look like "headphones," which can be seen within the RBC of image 40'. In this example, the user would then click image 40' to label the same as "positive" or infected. In some embodiments, a graphic or other image may be superimposed over the image 40' to indicate to the user visually that this particular image 40' has already been identified as positive. In other embodiments, the image 40' may simply disappear when clicked or touched.
[0051] After the user has identified all of the positive cells 10 within the array of images 40, the user then labels all of the remaining images as negative. To effectuate this operation, the user depresses the Label All - Negative button 44. When this button 44 is depressed, all of the remaining cells 10 are labeled as "negative" or non-infected. Thus, unlike the positive identification step where each positive cell 10 must be actively clicked or touched, all of the remaining cells 10 are labeled as negative in one operation. This is possible because the large majority of cells 10 are negative. If the phenotype or characteristic of interest were more frequent, a user could be required to individually select the negative cells 10 as well.
[0052] At the top of the interface of FIG. 3 is located a performance bar 46. The performance bar 46 is used to indicate to the user his or her accuracy in identifying the positive and negative cells 10 that are presented to the user. As performance improves, the bar 46 moves to the right. Conversely, as performance declines, the bar 46 moves to the left. The performance bar 46 may also change color as accuracy increases or decreases. The performance bar 46 may also serve to motivate the player of the game. For example, the user is also provided with a score 48 that indicates the player's performance. The gamer is motivated to accurately identify and label cells 10 to increase his or her score.
[0053] Various other motivators may also be used in connection with the game. For example, in one embodiment gamers may be paid based on the number of images that they label. As a requirement to be paid, players may need to maintain a minimum score.
Alternatively, the rate of pay may increase for gamers that attain and maintain higher scores. Donations may also be made to the gamer's charity of choice based on the number of labeled images. Additional motivators include the option of unlocking additional features of the game. For example, after reviewing a minimum number of images or scoring at a certain level, a user may be able to select the geographic region from which the sample was taken. For example, a gamer in India may choose to play games loaded with cellular images taken from a person in Mumbai. Other motivators may also be employed such as a high score list, prizes, and the like. Another motivator may be sound or music that accompanies the game. In the user interface 39 of FIG. 4, the user is able to turn music on or off through button 49.

[0054] Still referring to FIG. 3, the user interface includes an undo button 50 that permits the user to go back and undo a selection that was previously made. For example, the user may decide that an image 40 that was previously viewed to contain a positive cell 10 was, on second thought, negative. The user may depress or otherwise select this undo button 50 to restore the original image 40. Depression of the undo button 50 may result in a decrease in the performance bar 46 and the score 48. The game may be timed or untimed. Timing data may be used to determine those gamers that can most efficiently label cells 10 with the requisite accuracy.
[0055] FIG. 4 illustrates another view of an exemplary graphical interface 39 of a game that is displayed on a display of the computer gaming device 20. This game environment is similar to that of FIG. 3 with the exception that multiple or groups of cells 10 are contained in each image 40. In this embodiment, a user still picks individual cells within the image 40. In this embodiment, the selection does not apply to the entire image 40 but rather to a point or region of the image 40. In the example of FIG. 4, the cell 10' is labeled as positive, wherein a cross or cursor marks where the selection has been made. The selection may be made using a cursor or a touch screen environment. The specific cell 10' within the group can be associated with the label because positional data is obtained relative to the image 40. For example, the x and y coordinates within the image 40 are obtained where the user placed the positive marker. This can be used to associate the label with one cell 10' of many cells 10 contained in the single image 40.
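One possible way to associate a recorded (x, y) click with a particular cell 10' in a multi-cell image 40 is to compare the click coordinates against cell centroids found during pre-processing. The sketch below is illustrative only; the function name, the centroid representation, and the distance cutoff are assumptions not taken from the source.

```python
import math

def label_cell_at_click(click_xy, cell_centroids, max_dist=25.0):
    """Return the id of the cell whose centroid is nearest to the click,
    or None if no centroid lies within max_dist pixels of the click."""
    best_id, best_d = None, float("inf")
    for cell_id, (cx, cy) in cell_centroids.items():
        d = math.hypot(click_xy[0] - cx, click_xy[1] - cy)
        if d < best_d:
            best_id, best_d = cell_id, d
    return best_id if best_d <= max_dist else None
```

A click far from every centroid is rejected rather than silently attached to the nearest cell, which avoids mislabeling when the user touches empty background.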
[0056] As stated above, results from the gamers are transmitted from the gaming community 28 back to the remote computing device 16 to decode the "correct" label. In this example, several different gamers have received the same images 40 that are required to be labeled. The game module 30 aspect of the software will then decode what the correct label should be based on the labels applied by the gamers to the images 40.
[0057] Since the system combines decisions that are received from many gamers, the users will all be delivered the same set of images to label. Therefore, there is a single sequence of images to be labeled, and each gamer will output a decision sequence. Ideally, the output of each gamer yields the correct diagnostic labels for the images. Given that each image either corresponds to a healthy cell or an infected cell, one can use binary labels to identify them: 0 for healthy and 1 for infected. Recasting the system as a communications system, the remote computing device 16 or server will act as a broadcaster of a binary sequence and each gamer will act as a noisy Binary Channel, retransmitting the symbols back along with some errors. Note that since the gamers may not necessarily make mistakes symmetrically, the probability of a gamer mislabeling a healthy cell may be different from that of mislabeling an infected cell.
[0058] FIG. 5 illustrates the framework of the decoding algorithm in which the games are modeled as a communication system consisting of a broadcast unit, multiple repeaters, and a receiver/decoder. In the ideal scenario, the repeaters (i.e., the gamers) would simply receive a set of incoming symbols (images to be diagnosed) from the broadcast unit, and transmit them to the receiver/decoder block, which in turn combines and stores the decisions (image labels). One can model the source of the microscopic images as a broadcast unit. In this analogy, the cell images are essentially treated as binary symbols. The model assumes the equivalency of the two image classes, infected and healthy, with the binary symbols 1 and 0, respectively.
[0059] The sequence of symbols x_1, ..., x_N is broadcast to M repeater units (i.e., the gamers within the crowd) that can be viewed as parallel noisy channels. To be able to decode the outputs of these channels reliably, it is necessary to learn the channels adaptively. As such, it is necessary to embed some known symbols (i.e., control images) in the output of the broadcast unit. Knowing the binary value of certain symbols/images at specific times, one can learn the conditional probabilities p_j(y_i | x_i) as more symbols are transmitted by the broadcast unit and passed through the repeaters/gamers. Additionally, an encoder unit can also be placed after the broadcast unit to increase the redundancy of the transmission, and allow for error correction at the decoder.
[0060] Each gamer is modeled as an independent repeater that behaves as a binary channel. The error probabilities are defined using the notation p_j(y_i | x_i), corresponding to the probability that the jth user will output the symbol y_i when observing the symbol x_i (i.e., the ith image). In general, the error probabilities are asymmetric, i.e., p(y = x | x) is not the same for different values of x. However, it is difficult to accurately estimate this asymmetric probability in the games due to the imbalance in the available positive and negative training data (which is true not only for malaria diagnosis but also for various biomedical image analysis/diagnosis problems in which disease signatures are relatively rare compared to healthy data), which causes a bias towards better estimating the error probabilities when x = 0 (healthy case). In addition, another practical limitation is the general infeasibility of embedding large numbers of training images within the game. It is therefore more straightforward to estimate a simpler bit-flip probability, assuming a Binary Symmetric Channel (BSC). Reference is made to Cover TM, Thomas JA, Elements of Information Theory, pp. 7-9 (2006), which is incorporated by reference herein. It was observed in experiments that a BSC model performs better than an asymmetric model. This observation stems from the fact that there is an inherent imbalance in the number of positive and negative image samples. Since a limited number of control images are used to estimate the probability densities of the gamers' performances, this imbalance in the control data translates to having a small number of positive samples for accurate estimation of p_01 and p_10.
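Under the BSC simplification, a gamer's single bit-flip probability can be estimated directly from the control images whose true labels are known to the system. A minimal sketch, with an illustrative function name:

```python
def estimate_flip_probability(control_truth, control_answers):
    """Estimate a gamer's bit-flip probability p under the BSC assumption:
    the fraction of control images (true labels known to the system but
    not to the gamer) that the gamer mislabeled."""
    pairs = list(zip(control_truth, control_answers))
    errors = sum(1 for truth, answer in pairs if truth != answer)
    return errors / len(pairs)
```

As more control images pass through each gamer, this estimate can be refreshed, which is the adaptive channel learning described above.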
[0061] The broadcast unit can also include an encoder to increase the information redundancy prior to transmission to the repeaters/gamers. Given that the symbols being transmitted by the broadcast unit are not known a priori, the appropriate coding scheme is the repetition code, where each symbol is repeated an odd number of times prior to transmission. At the decoder, a majority vote is taken on the channel outputs.
[0062] Error Control Coding (ECC) amounts to asking the gamer to play each image an odd number of times, and the most frequently assigned label is taken to be his or her answer for that particular image. This can also be interpreted as the gamer's confidence in the given diagnosis response. In other words, if the gamer is absolutely sure of a particular diagnosis for an image, then he or she will choose the same label on every observation. However, if the image is difficult to diagnose, then there is the chance that the gamer would not be consistent in making a decision, thus producing a lower confidence level.
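The repetition-code decoding with the confidence interpretation described above can be sketched as follows (the function name and the confidence measure as a simple agreement fraction are illustrative assumptions):

```python
from collections import Counter

def decode_repetition(labels):
    """Majority-vote decoding of a repetition code: the same image is
    shown to a gamer an odd number of times. Returns (label, confidence),
    where confidence is the fraction of observations agreeing with the
    winning label."""
    assert len(labels) % 2 == 1, "use an odd number of repetitions"
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)
```

A gamer who answers identically on every repetition yields confidence 1.0; inconsistent answers on a difficult image yield a lower confidence.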
[0063] The decoder used in the gaming platform uses a Maximum a Posteriori Probability (MAP) approach. Suppose that one has N symbols/images x_1, ..., x_N being broadcast and relayed through M repeaters/gamers and received by the decoder. Also assume that one has estimates of the repeater/gamer error probabilities p_j(y_i | x_i) for the jth repeater and the ith symbol. One would then like to estimate x_i given all of the repeater outputs y_i^(1), ..., y_i^(M). For a particular transmitted symbol x*, Bayes' rule gives:

p(x_i = x* | y_i^(1), ..., y_i^(M)) ∝ p(x*) p(y_i^(1), ..., y_i^(M) | x_i = x*)    (1)

[0064] Because the repeaters/gamers act independently, we can then write:

p(y_i^(1), ..., y_i^(M) | x_i = x*) = ∏_{j=1}^{M} p_j(y_i^(j) | x*)    (2)

[0065] The value of the estimate x̂_i is taken to be the x* that maximizes the above posterior. Therefore, we have:

x̂_i = argmax_{x* ∈ {0,1}} p(x*) ∏_{j=1}^{M} p_j(y_i^(j) | x*)    (3)
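As a concrete sketch, the MAP rule of paragraph [0065] can be evaluated per image, here under the BSC simplification where gamer j is characterized by a single flip probability in (0, 1); the function name, prior, and interface are illustrative assumptions:

```python
import math

def map_decode(gamer_outputs, flip_probs, prior_infected=0.5):
    """MAP estimate of one image's binary label x in {0, 1} from M gamer
    outputs y_j, assuming gamer j is a BSC with bit-flip probability
    flip_probs[j]. Implements x_hat = argmax_x p(x) * prod_j p_j(y_j | x),
    computed in log space for numerical stability."""
    best_x, best_logp = None, -math.inf
    for x in (0, 1):
        logp = math.log(prior_infected if x == 1 else 1.0 - prior_infected)
        for y, p in zip(gamer_outputs, flip_probs):
            logp += math.log(1.0 - p if y == x else p)
        if logp > best_logp:
            best_x, best_logp = x, logp
    return best_x
```

Note how an unreliable gamer (flip probability near 0.5) contributes almost nothing to the decision, which is the weighting behavior described for expert versus non-expert gamers.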
[0066] FIG. 6 illustrates another embodiment of a hybrid method and system that combines a human and machine-based diagnostic framework. In this method, new images that are to be labeled or diagnosed are generated in operation 60. The new images are then subjected to a pre-trained machine learning algorithm 62. The pre-trained machine learning algorithm 62 may be implemented in a machine learning module 36 or other feature like that disclosed in FIG. 2. In this hybrid embodiment, the diagnosis or label that is produced by the pre-trained machine learning algorithm 62 is associated with a confidence level. The confidence level is then compared against a threshold level (T) as illustrated in operation 64. The threshold may be a number, percentage, or the like that correlates with the confidence level of a diagnosis or label for a particular image. The confidence level associated with the output of the pre-trained machine learning algorithm 62 determines whether the images are crowd-sourced or not. Those images that have a confidence level below the threshold (T) are passed on to the gamers and crowd-sourced as illustrated in operation 66. Conversely, those images that have a confidence level above the threshold (T) are not passed to the gamers and crowd-sourced. In this embodiment, the difficult-to-diagnose images are crowd-sourced while the easy-to-diagnose images are labeled by the pre-trained machine learning algorithm 62.
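The confidence test of operation 64 amounts to a simple routing step. A minimal sketch, in which the tuple layout of the machine results is an illustrative assumption:

```python
def route_by_confidence(machine_results, threshold):
    """Split machine-labeled images into those kept automatically and
    those forwarded to the gamer crowd (operation 66). machine_results
    holds (image_id, label, confidence) tuples."""
    kept, crowd_sourced = [], []
    for image_id, label, confidence in machine_results:
        if confidence < threshold:
            crowd_sourced.append(image_id)   # difficult: send to gamers
        else:
            kept.append((image_id, label))   # easy: keep machine label
    return kept, crowd_sourced
```

Raising the threshold T sends more images to the crowd and relies less on the machine algorithm; lowering it does the opposite.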
[0067] Still referring to FIG. 6, once the difficult-to-diagnose images are diagnosed by the gamers in operation 66, the results are merged with the diagnosis of the easy-to-diagnose images in merging operation 68. Following the merging operation 68, an optional final diagnosis operation 70 may be performed where the sample or slide may be given a diagnosis. Of course, this operation 70 may be omitted where one only needs the merged diagnosis or label information. As seen in FIG. 6, a set of training data 72 is produced by the merged diagnostic information. This training data 72 is used to improve the self-learning algorithm 62. During each cycle, the self-learning algorithm 62 will improve as a result of added training data.
[0068] With respect to the self-learning algorithm 62, the algorithm extracts local color features from the cell images and feeds them to a classifier. In training the classifier, a small subset of cell images in the dataset (i.e., control images) is used as the training set. In general, there are two possible approaches to extracting features from the acquired microscope images for the purpose of building a digital classifier. One can either attempt to extract very specific, hand-coded features or try to learn discriminative features from a large set of training examples. The advantage of using hand-coded features is that one can use prior knowledge of the physical/structural properties of the parasites. For example, one can look for "ring-shaped objects" within the RBC image as an indicator for the existence of the parasite. Such features lend themselves to very fast implementations and make the job of the classifier much easier. On the other hand, these hand-coded features are difficult to design, and are in general very inflexible to variations in sample preparation or illumination/recording conditions. For example, custom-designed malaria feature-sets that use shape and color may not be easily modified to detect a different parasite type.
[0069] In contrast to hand-coded features, 'learned' features can be very generic and easily modified and applied to similar detection problems. They also take less time and effort to design for a particular problem, and put most of the burden of classification on the classifier itself. In the self-learning algorithm 62 described herein, the second approach (i.e., learned features) was preferred, and a set of generic color-based features was used to train a classifier to discriminate between RBC images that contain P. falciparum and those that do not.
[0070] Two types of histogram-based features were used as input to the classifier. The first is a simple color histogram of the image in grayscale. This is a feature that carries information about the general distribution of image values. A second, more complicated color feature, named Local Color Peak Histograms (LCPH), was also employed. The LCPH for an image is formed by first generating highly quantized color histograms in the Hue-Saturation space over local windows. For each window, the two most frequently occurring histogram bins are found. Any given pair of bins corresponds to a particular index value. In other words, an index can be assigned to the occurrence of each pair of values in the histogram, and this is the value that is recorded for each local window in the image. A second histogram is generated over the recorded indexes of all the local windows and used as the LCPH features. This feature essentially measures the relative occurrences of various color pair co-occurrences throughout the image.
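The LCPH computation can be sketched as follows for a single quantized channel. The window size, the bin edges, the use of one channel rather than the joint Hue-Saturation space, and the tie-breaking for the two peak bins are illustrative assumptions, since the source does not specify them:

```python
import numpy as np

def lcph(channel, n_bins=5, win=10):
    """Illustrative sketch of Local Color Peak Histograms (LCPH):
    quantize a color channel (values in [0, 1)) into n_bins over
    non-overlapping win x win windows, take the two most frequent bins
    per window, map each unordered bin pair to an index, and histogram
    the indices. With n_bins = 5 there are C(5, 2) = 10 index values."""
    h, w = channel.shape
    n_pairs = n_bins * (n_bins - 1) // 2
    counts = np.zeros(n_pairs, dtype=float)
    for r in range(0, h - win + 1, win):
        for c in range(0, w - win + 1, win):
            block = channel[r:r + win, c:c + win]
            bins = np.clip((block * n_bins).astype(int).ravel(), 0, n_bins - 1)
            hist = np.bincount(bins, minlength=n_bins)
            a, b = sorted(np.argsort(hist)[-2:])   # two peak bins, a < b
            idx = a * (2 * n_bins - a - 1) // 2 + (b - a - 1)
            counts[idx] += 1
    return counts / max(counts.sum(), 1.0)         # normalized histogram
```

The triangular index formula enumerates the unordered pairs (a, b) with a < b, matching the pair-to-index mapping described in the text.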
[0071] FIG. 7 illustrates LCPHs. For every window block, a color histogram is calculated. The dominant pair of colors is used to compute an index (e.g., with 5 bins, there are a total of 10 different index values). A histogram of all index values is computed and used as part of the feature vector. In addition to color-based features, a number of more basic image features such as mean, variance, and gradient magnitude histograms are used to form final feature vectors.
[0072] In training the classifier for the malaria parasite, a variation of Adaptive Boosting was used. In this method, many weak decision-tree classifiers are trained that together produce a very strong classifier. In classical Adaptive Boosting, the overall classifier is tested on the complete training dataset at each iteration. Data points that are not correctly classified are then assigned larger weights for the next classifier to be trained. The algorithm used here deviated from the classical algorithm in that, instead of re-weighting the full training set and training a new classifier, the weights were used to probabilistically select a small subset of the training data on which the next weak classifier was trained. This allows for completely disjoint training data for the weak classifiers and results in very fast convergence of the boosted classifier. FIG. 8 illustrates a summary of the Adaptive Boosting algorithm that was used. The total number of training points is fixed for each weak classifier, i.e., for each weak classifier a total of ns training vectors are chosen based on the weights w_k(i).
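The boosting variant summarized in FIG. 8 can be sketched as below. The weak-learner interface and the alpha/weight updates follow standard discrete AdaBoost and are assumptions where the source gives no detail; only the weighted subset sampling of ns points per round is taken from the description above:

```python
import numpy as np

def subset_boost(X, y, train_weak, n_rounds=10, ns=50, seed=0):
    """Boosting variant: rather than re-weighting the full training set,
    each round draws ns points with probability proportional to the
    current weights w_k(i) and trains the next weak classifier on that
    subset only. train_weak: (X_sub, y_sub) -> predict callable."""
    rng = np.random.default_rng(seed)
    n = len(y)
    w = np.full(n, 1.0 / n)
    learners, alphas = [], []
    for _ in range(n_rounds):
        idx = rng.choice(n, size=min(ns, n), replace=True, p=w)
        clf = train_weak(X[idx], y[idx])            # weak learner on subset
        miss = clf(X) != y
        err = np.clip(np.sum(w * miss), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1.0 - err) / err)
        w *= np.exp(np.where(miss, alpha, -alpha))  # up-weight mistakes
        w /= w.sum()
        learners.append(clf)
        alphas.append(alpha)

    def predict(Xq):
        votes = sum(a * (2 * clf(Xq) - 1) for a, clf in zip(alphas, learners))
        return (votes > 0).astype(int)
    return predict
```

Because each round samples only ns points, the weak classifiers can train on nearly disjoint data, which is the property the text credits for the fast convergence.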
[0073] Experimental Results
[0074] A digital gaming platform was developed through which an unlimited number of gamers from any location in the world were provided the opportunity to access and diagnose images of human RBCs that are potentially infected with P. falciparum. The gaming platform was implemented to run both on personal computers (using Adobe Flash running in any internet browser such as Internet Explorer, Mozilla Firefox, etc.) and on Android-based mobile devices, including mobile phones as well as tablet PCs.
[0075] Before starting to play the game, each gamer was given a brief online tutorial explaining the rules of the game and how malaria infected RBCs typically look, with some example images. After this, each gamer played a training game that the gamer was required to complete successfully in order to continue playing the rest of the game. This test game consisted of 261 unique RBC images, where 20 of them were infected. The gamers were required to achieve >99% accuracy in this training game, and in the case of failure, they were asked to re-play the game until they achieved this accuracy. This way all the gamers became familiar with the rules of the game and were briefly trained on the diagnostics task. This training game was required only once, when the gamers registered on our platform. Upon registration, a unique user ID was assigned to each gamer and her/his individual diagnostics performance was tracked. Furthermore, this training game provided direct feedback to the players on their performance and their mistakes through a scoring mechanism. Since the labels (i.e., infected cell vs. healthy cell) of all the images were known a priori for the purposes of this training game, the player's score was updated throughout the game (i.e., positive score for correct diagnosis, and negative score for incorrect diagnosis).
[0076] Given that the focus was not to educate the players but rather to demonstrate the quality of diagnostic results that can be achieved through untrained (nonexpert) individuals, this initial test/training game was designed in a simple repetitive fashion. As the gamer proceeds through game play, the gamer is presented with multiple frames of RBC images. The gamer had the option of using a "syringe" tool to "kill" the infected cells one by one, or a "collect-all" tool to designate all the remaining cells in the current frame as "healthy," which significantly speeds up the cell diagnosis process since most of the RBCs are, in fact, healthy and not infected. Within each frame, there are a certain number of cells whose labels (infected or healthy) are known to the game, but unknown to the gamers. These control cell images allow the system to dynamically estimate the performance of the gamers (in terms of correct and incorrect diagnosis) as they go through each frame and also help in assigning a score for every frame that they pass through. This differs from the training game, where all of the images are effectively control images. Once a frame is completed, a score is assigned based on the performance of the gamer only on the control images. These control images (roughly 20% of all the images) along with the scoring system allow the game to provide feedback to the gamers on their performance such that, as they continue to play, they can improve their diagnostic performance. The images and their order of appearance were identical among different gamers, thus allowing us to make a fair comparison among their relative performances.
[0077] Image Database
[0078] To build a malaria infected RBC database, thin blood smears on slides containing mono-layers of cultured human RBCs infected by Plasmodium falciparum (P. falciparum) were used as the source for our image dataset. These malaria slides were then scanned with a bright-field optical microscope using a 100X oil-immersion objective lens (numerical aperture: 1.25). At each FOV, the captured RBC images were passed on to an infectious disease expert for identification of P. falciparum signatures and digital labeling of each RBC image (positive vs. negative). This process generated a dataset of 7116 unique RBC images, with 1603 of them infected by the malaria parasite. To form the set of images to be used in the games, each individual RBC image was cropped and resized to fixed dimensions of 50x50 pixels. To further increase the total number of images and their diversity (in terms of sample preparation, density and imaging conditions), a set of images provided by the Center for Disease Control (CDC) was also used, yielding an additional 118 infected and 595 uninfected RBC images. With this, there was a total of 7829 characterized human RBC images, with 1721 of them infected with P. falciparum, forming a ground truth database for evaluating the crowd-sourcing, gaming, and machine-vision based diagnostics platform.
[0079] Diagnostic Analysis
[0080] When analyzing the game results, individual performance parameters and diagnoses were accessible (for both the control images and the unknown test images). The results from all gamers that completed a particular game were combined to generate a more accurate set of diagnoses for the test RBC images. Given that each RBC image either corresponds to a healthy cell or an infected cell, one can use binary labels to identify them: 0 for healthy and 1 for infected. Recasting the setup as a communications system, the server will act as a broadcaster of a binary sequence and each gamer will act as a noisy Binary Channel, retransmitting the symbols back along with some errors. Therefore, the framework of the games was modeled as a noisy communication network consisting of a broadcast unit, multiple repeaters, and a receiver/decoder unit for the final diagnosis as seen in FIG. 5. In the ideal scenario, the repeaters (i.e., the gamers) would simply receive a set of incoming symbols (images to be diagnosed) from the broadcast unit (through various light microscopes located in, e.g., point-of-care offices or malaria clinics), and transmit them to the receiver/decoder block, which in turn computes the optimal "correct" label for each individual unknown RBC image using a Maximum a Posteriori Probability (MAP) approach.
[0081] Results and Discussions
[0082] To test the viability of the crowd-sourced gaming-based malaria diagnosis platform, different experiments were run with 31 unique participants (non-experts), ranging between the ages of 18 and 40. In total, five different experiments were performed, the results of which are summarized in Table 1 below. FIG. 9 illustrates the definitions of various acronyms used in Table 1 below. Table 1
[0083] Initially, the capability of the platform was tested through a game consisting of 5055 images, of which 471 were of infected RBCs and 4584 were of healthy RBCs (see Table 1). Additionally, 1266 (103 positives and 1163 negatives) RBC images were embedded as control images within the same game such that each gamer had to go through 6321 RBC images. The combined accuracy of the gamer diagnoses was 99%, with sensitivity (SE) of 95.1% and specificity (SP) of 99.4%. The positive predictive value (PPV) and negative predictive value (NPV) were also quite high at 94.3% and 99.5% respectively.
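The performance measures reported above (and throughout Table 1) follow directly from the raw counts of true/false positives and negatives. A generic sketch, not code from the source:

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity (SE), specificity (SP), positive
    predictive value (PPV), and negative predictive value (NPV) from raw
    true/false positive/negative counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "SE": tp / (tp + fn),       # fraction of infected cells caught
        "SP": tn / (tn + fp),       # fraction of healthy cells cleared
        "PPV": tp / (tp + fp),      # reliability of a positive call
        "NPV": tn / (tn + fn),      # reliability of a negative call
    }
```

Because infected cells are rare, SE and PPV are the measures most sensitive to small numbers of errors, which is why they move the most between the experiments in Table 1.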
[0084] In addition to the gaming and the crowd-sourcing platform described above, an automated computer vision-based algorithm was developed to detect the presence of malaria parasites. In doing so the aim was to ultimately create a hybrid system such that machine vision and human vision can be coupled to each other, creating a more efficient and accurate biomedical diagnostics platform as illustrated in FIG. 6. For this purpose, independent of the human crowd, the automated diagnosis performance of the machine-vision algorithm was tested, which was trained on 1266 RBC images (same as the control images used in experiment #1) and was tested on a total of 5055 unique RBC images (471 positives and 4584 negatives - see Table 1). This algorithm was able to achieve an overall accuracy of 96.3%, with SE-SP of 69.6%-99.0%, and PPV-NPV of 87.7%-96.9%. In terms of performance, the gamer crowd did better than this machine algorithm as summarized in Table 1.
[0085] However, it should be noted that with an even larger training dataset (containing e.g., >10,000 RBC images) and more advanced classifiers, it is possible to significantly improve the performance of the automated algorithm. This feat may be achieved through the coupling of statistical learning and crowd-sourcing into a hybrid model as illustrated in FIG. 6, where a feedback exists between the gamers and the automated algorithm, yielding an ever-enlarging training dataset as more games are played. This uni-directional feedback loop has the effect of labeling more and more images as training data for the automated algorithm, potentially leaving only the most difficult ones to be labeled by human gamers.
[0086] Following this initial comparison between human vision and machine vision for identification of malaria-infected RBCs, another test was conducted to assess the viability of the hybrid diagnosis methodology discussed above (identified as experiments #3 and #4 in Table 1). Among all the RBC images characterized using the machine-vision algorithm, those with a diagnosis confidence level below 30% of the maximum achieved confidence level (i.e., a total of 459 RBC images that were relatively difficult to diagnose) were extracted. The training dataset (1266 RBC images that were used to train the machine algorithm, which also served as the control images of experiment #1) was then mixed with these 459 "difficult-to-diagnose" RBC images to form a new game that was crowd-sourced to 27 human gamers. This new game (experiment #3) yielded an accuracy of 95.4%, with SE-SP at 97.8%-91.9% and PPV-NPV at 94.7%-96.6% on these 459 difficult-to-diagnose RBC images. Next, the results from the crowd-sourced game (experiment #3) and the machine algorithm (experiment #2) were merged to arrive at an overall accuracy of 98.5%, with SE-SP of 89.4%-99.4% and PPV-NPV of 94.2%-98.9% (see experiment #4, Table 1). Thus, in this hybrid case the sensitivity and positive predictive value increased by 20% and 7%, respectively, and a performance comparable to that of a completely human-labeled system (experiment #1) was achieved, but with only 10% of the cells actually being labeled by humans. This significantly increases the efficiency of the presented gaming platform: the innate visual and pattern-recognition abilities of the human crowd/gamers are put to much better use by focusing only on the "difficult-to-diagnose" images through the hybrid system.
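The routing rule of this hybrid scheme can be sketched as follows. The 30%-of-maximum threshold is the one described above; the confidence values, image names, and function name are illustrative assumptions only:

```python
def split_by_confidence(scores, frac=0.30):
    """Partition images by machine confidence: images scoring below
    frac * max(confidence) are routed to the human crowd, the rest
    keep the machine classifier's label."""
    cutoff = frac * max(scores.values())
    to_crowd = [i for i, s in scores.items() if s < cutoff]
    to_machine = [i for i, s in scores.items() if s >= cutoff]
    return to_crowd, to_machine

# Hypothetical per-image confidence scores from the machine classifier
scores = {"img0": 0.95, "img1": 0.10, "img2": 0.50, "img3": 0.25}
crowd, machine = split_by_confidence(scores)
```

Only the `crowd` subset would then be packaged into a game, which is how the 459 difficult-to-diagnose images out of 5055 (roughly 10%) were selected in the experiment above.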
[0087] In another experiment (experiment #5), the number of infected RBC images in the game was increased three-fold to simulate a scaled-up version of the gaming platform. A total of 7829 unique RBC images were incorporated into the game, of which 784 were taken as control images that were repeatedly inserted into the game, a total of 2349 times. As a result, each gamer would go through 9394 RBC images, a quarter of which (2349) were known control images. Within the remaining 7045 test RBC images, there were 1549 (22%) positive images and 5496 negative images, which were all treated as unknown images to be diagnosed by the human crowd at the single-cell level. The same ratio of positive to negative images was also chosen for the control RBC images in the game to eliminate any unfair estimation biases that may result from differing distributions. Completing this game (i.e., experiment #5) took each gamer less than one hour on average, and one can see in Table 1 that the accuracy of the overall human crowd (non-professionals) is within 1.25% of the diagnostic decisions made by the infectious disease expert. This experiment yielded an SE of 97.8% and an SP of 99.1%. The PPV was 96.7% and the NPV was 99.4%.
[0088] Based on experiment #5, FIG. 10 summarizes "the effect of the crowd" on diagnosis accuracy and sensitivity, i.e., how the overall performance of the crowd's diagnosis improves as more gamers are added to the system. One can see significant boosts in the sensitivity (i.e., the true positive rate) as diagnosis results from more gamers are added into the system. This is quite important, as one of the major challenges in malaria diagnosis in sub-Saharan Africa is the unacceptably high false-positive rate, reaching ~60% of the reported cases. The overall diagnosis accuracy also steadily improves as more gamers are added, as shown in FIG. 10. This crowd effect may seem like a deviation from the traditional benefits of crowd-sourcing, in that each player inaccurately solves the whole puzzle and the players' results are then combined to yield a more accurate solution. However, it should also be noted that cell images from a single blood smear slide can be broken up into multiple batches, where each batch is crowd-sourced to a group of players. In other words, each unique group of players will focus on one common batch of cell images, and in the end the diagnosis results will be combined once at the group level to boost the accuracies for each cell, and again at the slide level to make a correct overall diagnosis per patient. Therefore, the contribution of the crowd is twofold. First, it allows the analysis problem to be broken up into smaller batches, and second, the analysis of the same batch by multiple individuals from the crowd allows for significantly higher overall diagnosis accuracies.
[0089] It should be emphasized that these diagnosis results are for 'individual' RBCs, not for patients. In reality, malaria diagnosis using a blood smear sample corresponding to a patient is a relatively easier task compared to single-cell diagnosis, since a thin blood smear for each patient sample already contains thousands of RBCs on it. Therefore, statistical errors in the parasite recognition task could be partially hidden if the diagnostic decisions are made on a per-blood-smear-slide basis. To better demonstrate the abilities of the gaming-based crowd-sourcing system and method, the system was aimed at the diagnosis of individual RBCs, rather than patients. Since any given patient's blood smear slide will be digitally divided into smaller images (containing, e.g., a handful of RBCs per image), and >1,000 RBC images per patient will be distributed to the crowd, one should expect much higher levels of accuracy and sensitivity for the diagnosis of individual patients. Furthermore, the single-cell-diagnosis-based gaming approach could also be very useful for estimating the parasitemia rate of patients, which can be quite important and valuable for monitoring the treatment of malaria patients. Of course, while diagnosis of individual RBCs is described herein, the results may also be used to apply diagnostic results on a per-slide or per-patient basis.
[0090] This digital hub will allow for the creation of very large databases of microscopic images that can be used for e.g., the purposes of training and fine tuning automated computer vision algorithms. It can also serve as an analysis tool for health-care policy makers toward e.g., better management and/or prevention of pandemics.
[0091] In one embodiment of the system and method, motivational or other incentives (e.g., monetary incentives) can be used to recruit health-care professionals who are trained and educated to diagnose such biomedical conditions, making them part of the gamer crowd. In such a scenario, the gaming platform may serve as an intelligent telemedicine backbone that facilitates the sharing of medical resources through, e.g., remote diagnosis and centralized data collection/processing. In other words, it is a platform whereby the diagnosis can take place by professionals far away from the point-of-care. At the same time, it also enables the resolution of possible conflicting diagnostic decisions among medical experts, potentially improving the diagnostic outcome.
[0092] For this potentially highly trained crowd of "professional" gamers, the final decisions made through the crowd can be used for direct treatment of the patient.
Furthermore, since these are trained medical professionals, the number of gamers assigned to an image that is waiting to be diagnosed can be significantly lower compared to the case where "non-professional" gamers are assigned to the same image. On the other hand, if an image is diagnosed entirely by non-professional gamers, the result of the diagnosis can still be very useful in reducing the workload of health-care professionals located at point-of-care offices or clinics where the raw images were acquired. In the case of malaria diagnosis, this is especially relevant since the health-care professional is required to look at >1,000 RBC images for an accurate diagnosis. Hence, even a non-professional crowd's diagnostic decisions could be highly valuable in guiding the local medical expert through the examination of a malaria slide, such that the most relevant RBC images are quickly screened first, eliminating the need for conducting a manual random scan for rare parasite signatures. Slide images may, in some cases, be pre-screened and pre-labeled for final review and oversight by a trained pathologist or cytologist.
[0093] Finally, the proposed methodology can be expanded to include a training platform. Assuming the expansion of this crowd-sourced diagnostics platform and the generation of large image databases with correct diagnostics labels, software can be created to make use of such databases to assist in the training of medical professionals. Through such software, medical students and/or trainees can spend time looking at thousands of images, attempting diagnosis, and getting real-time feedback on their performances.
[0094] In the above-noted embodiments and experiments, control images with "known" labels were used to estimate the statistical behavior of decisions made by individual gamers, which was then used to combine all the gamers' responses through a MAP estimation. However, in some instances, such "gold standard" metrics may be missing. For example, one may not have access to any labeled data, or users may not stay with the gaming platform long enough to track information about their diagnosis accuracy.
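The role of control images in the MAP combination can be sketched as follows: each gamer's accuracy is estimated on the controls, and the gamers' votes on an unknown image are fused as an accuracy-weighted log-likelihood vote. All names, the symmetric-error assumption, and the toy numbers are illustrative; this is not the platform's actual implementation:

```python
from math import log

def estimate_accuracy(control_truth, control_votes):
    """Fraction of control images a gamer labeled correctly."""
    hits = sum(t == v for t, v in zip(control_truth, control_votes))
    return hits / len(control_truth)

def map_vote(votes, accuracies, prior_pos=0.1):
    """MAP fusion of binary votes (1 = infected), weighting each gamer by
    their control-estimated accuracy; assumes symmetric error rates."""
    log_pos = log(prior_pos)
    log_neg = log(1 - prior_pos)
    for v, a in zip(votes, accuracies):
        a = min(max(a, 1e-6), 1 - 1e-6)  # avoid log(0) for perfect gamers
        log_pos += log(a) if v == 1 else log(1 - a)
        log_neg += log(1 - a) if v == 1 else log(a)
    return 1 if log_pos > log_neg else 0

# Three gamers; accuracies estimated from eight control images
truth = [1, 0, 1, 0, 1, 0, 1, 0]
gamers = [[1, 0, 1, 0, 1, 0, 1, 0],   # perfect on controls
          [1, 0, 1, 0, 1, 0, 0, 1],   # 75% accurate
          [0, 1, 0, 1, 1, 0, 1, 0]]   # 50% accurate (chance level)
accs = [estimate_accuracy(truth, g) for g in gamers]
label = map_vote([1, 1, 0], accs)
```

Here the near-chance gamer contributes almost nothing to the decision, while the reliable gamers dominate, which is exactly why the weighted vote outperforms a simple majority.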
[0095] According to another embodiment, a system is proposed that looks at decisions made by a group of trained medical experts. By using an Expectation Maximization (EM) algorithm, one can combine the decisions made by such experts to generate more reliable diagnostic decisions at the single cell level. In this embodiment, one needs to simultaneously learn the image labels and the error probabilities associated with each expert, while maximizing the posterior probability of the observed labels. To achieve this, a three-category mixture model for the original data was assumed and an EM algorithm used to generate the maximum likelihood labels for unknown cell images.
[0096] FIG. 11 illustrates a browser-based interface for remote cell labeling. The interface is similar to that of FIG. 3, with similar features labeled as such. Each expert is allowed to navigate through the database of cell images, eliminating the infected cells and marking those that are questionable with a dedicated button 80. As a result, each image can be labeled as positive, negative, or questionable.
[0097] A total of 8,664 RBC images were digitally cropped from Giemsa-stained thin blood smears acquired from the U.S. Centers for Disease Control and Prevention (CDC) database. This dataset of 8,664 images was derived from an original set of 2,888 images; i.e., each original image was rotated at multiples of 90° and randomly distributed within the final dataset. These images were originally captured using different digital microscopes through 100x objective lenses (with a numerical aperture of at least 1.0), and were digitized at 24 bits. These images were then remotely presented to each individual expert through a browser-based web interface as shown in FIG. 11. This interface consists of multiple image frames, each containing a grid of individual RBC images. The size of the grid depends on the screen resolution of the computer accessing the interface and is automatically adjusted. The expert is asked to remove the infected and questionable (e.g., poor image quality, difficult to diagnose, etc.) cell images using the appropriate tools selectable from the side bar. Once all such images have been labelled, the remaining cells can all be labelled as uninfected or healthy using a Label All Negative button on the side bar. The experts are asked to log in prior to starting the diagnosis, and their individual responses are recorded on servers as they progress through the database of images. The experts were allowed to view and diagnose the images in multiple sessions and were not given any time constraints for completing the diagnosis task. All of the slide readers were certified malaria diagnosticians and had at least two (2) years of clinical experience with reading of thin smears. There were no controls, nor were any conditions enforced on the viewing devices of the observers. Any inconsistencies in the quality of their viewing hardware and conditions would be reflected in their diagnosis accuracies.
For the mathematical framework used herein, every expert is a statistical decision unit, and all the possible sources of error for an individual expert (e.g., relatively weaker training, poor eye sight, low display resolution, etc.) are treated as a lumped entity; and there is no aim to investigate different factors that make up the overall error probability of an individual expert. Instead, one of the main goals of this work was to demonstrate that a group of experts could be digitally combined to significantly boost the accuracy of the final diagnostic decision, when compared to even the best individual of the group.
[0098] In the general scenario, it was assumed that there are a total of N+1 medical images waiting to be diagnosed by M+1 experts. It was also assumed that the diagnosis is of a binary nature, meaning that it is either positive or negative, as in the case of malaria diagnosis. However, the method also allowed for the possibility that a particular image is of low quality, preventing reliable diagnosis in some cases. As a result, each image can be labelled as positive, negative, or questionable.
[0099] FIG. 12 illustrates the forward model of this embodiment. There are a total of N+1 images with possible labels from {0, 1, 2} being sent to M+1 experts. The jth expert labels each image with a certain probability P_j(x | I). The final dataset consists of an (N+1) × (M+1) matrix of values from the set {0, 1, 2}. As illustrated in FIG. 13, the expert responses x_j are treated as the observed variables, and the true image labels I_n as the latent variables, in a mixture model with parameters P_j(x | I). Expectation Maximisation (EM) is used to obtain the maximum likelihood solution to the data.
[00100] Mixture Model Formulation
[00101] An assumption is made that each image I_n has one of three possible labels from the set {0, 1, 2}, corresponding to the diagnostic decisions: negative, positive, and questionable images, respectively. Therefore each input image belongs to one of three possible distributions corresponding to the three possible labels. This gives us a mixture model with K = 3 components.
[00102] For each component k, one assumes the most general decision model for each user, with six parameters describing the probability of the user's responses given the true labels of the images. Furthermore, one assumes a 1-of-K representation for the true image labels using the variable z_k, where z is a K-tuple with only the kth position set to one and the rest zero. For example, if the image has the label "1" (i.e., infected in this scenario), then it is represented by [0, 1, 0]^T, and thus z_1 = 1 and z_0 = z_2 = 0. Therefore, for any image we have

(4)  P(x_j | z) = ∏_{k=0}^{K−1} [ p_{kj}^{(x_j==0)} q_{kj}^{(x_j==1)} (1 − p_{kj} − q_{kj})^{(x_j==2)} ]^{z_k}

where (x_j == 1) is a Boolean indicator for when user j has labelled the observed image as 1. In other words, if the jth observer labels the image as "1" (i.e., infected in this scenario), then (x_j == 1) = 1 and (x_j == 0) = (x_j == 2) = 0. Now, define p_{kj} and q_{kj} as the probabilities for user j of labelling an image from the kth component as 0 and 1, respectively. We thus have the set of parameters shown in Table 2, below.
Table 2

  True label k          P_j(x = 0)    P_j(x = 1)    P_j(x = 2)
  k = 0 (negative)      p_{0j}        q_{0j}        1 − p_{0j} − q_{0j}
  k = 1 (positive)      p_{1j}        q_{1j}        1 − p_{1j} − q_{1j}
  k = 2 (questionable)  p_{2j}        q_{2j}        1 − p_{2j} − q_{2j}
[00103] Now suppose there is a set of N+1 images I_0, …, I_N, each observed and labelled by a set of M+1 experts, with the labels represented by a matrix X of size (N+1) × (M+1). It is desired to use the EM algorithm to find the correct labels I_n. Assuming the described three-component mixture model, one can write the complete-data log-likelihood as

(5)  ln P(X, Z | π, p, q) = Σ_{n=0}^{N} Σ_{k=0}^{K−1} z_{nk} { ln π_k + Σ_{j=0}^{M} [ (x_{nj} == 0) ln p_{kj} + (x_{nj} == 1) ln q_{kj} + (x_{nj} == 2) ln(1 − p_{kj} − q_{kj}) ] }

where z_{nk} is a 1-of-K representation of the latent variables I_n, (x_{nj} == κ) is a Boolean representing the labelling of κ ∈ {0, 1, 2} by expert j for the nth image, and π_k is the prior probability for the kth mixture component (in this case K = 3). Taking the expectation with respect to the latent variables Z yields
(6)  E_Z[ln P(X, Z | π, p, q)] = Σ_{n=0}^{N} Σ_{k=0}^{K−1} γ(z_{nk}) { ln π_k + Σ_{j=0}^{M} [ (x_{nj} == 0) ln p_{kj} + (x_{nj} == 1) ln q_{kj} + (x_{nj} == 2) ln(1 − p_{kj} − q_{kj}) ] }

where p and q represent the set of all parameters p_{kj} and q_{kj} associated with the accuracy of the experts, and we have defined

(7)  γ(z_{nk}) = E[z_{nk}]

and

(8)  γ(z_{nk}) = [ π_k ∏_{j=0}^{M} p_{kj}^{(x_{nj}==0)} q_{kj}^{(x_{nj}==1)} (1 − p_{kj} − q_{kj})^{(x_{nj}==2)} ] / [ Σ_{k'=0}^{K−1} π_{k'} ∏_{j'=0}^{M} p_{k'j'}^{(x_{nj'}==0)} q_{k'j'}^{(x_{nj'}==1)} (1 − p_{k'j'} − q_{k'j'})^{(x_{nj'}==2)} ]

γ(z_{nk}) are the "responsibilities" for component k given the data point x_n (i.e., the observation vector for the nth image), which are evaluated during the "E" step of the EM algorithm. During the "M" step, we maximise the data log-likelihood with respect to the parameters π, p, and q. This leads to the following update equations:
(9)  π_k = (1 / (N+1)) Σ_{n=0}^{N} γ(z_{nk})

(10)  p_{kj} = [ Σ_{n=0}^{N} γ(z_{nk}) (x_{nj} == 0) ] / [ Σ_{n=0}^{N} γ(z_{nk}) ]

(11)  q_{kj} = [ Σ_{n=0}^{N} γ(z_{nk}) (x_{nj} == 1) ] / [ Σ_{n=0}^{N} γ(z_{nk}) ]

[00104] With regard to the EM algorithm, reference is made to Bishop CM, Pattern Recognition and Machine Learning (Information Science and Statistics), Secaucus, NJ, USA: Springer-Verlag New York, Inc. (2006), which is incorporated by reference as if set forth fully herein.
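The E and M steps above can be sketched numerically as follows. This is a minimal NumPy implementation; the initialisation values, iteration count, and function name are assumptions for illustration, not part of the method as claimed:

```python
import numpy as np

def em_label_fusion(X, K=3, n_iter=50):
    """Fuse expert labels X (N x M matrix with entries in {0, 1, 2}) via
    expectation-maximisation on a K-component mixture; returns the fused
    labels plus the estimated priors and per-expert response model."""
    N, M = X.shape
    # one-hot responses: onehot[n, j, c] = 1 if expert j labelled image n as c
    onehot = (X[:, :, None] == np.arange(K)[None, None, :]).astype(float)
    pi = np.full(K, 1.0 / K)            # mixture priors
    P = np.full((K, M, K), 0.1)         # P[k, j, c] = P_j(x = c | true label k)
    for k in range(K):
        P[k, :, k] = 0.8                # assume experts do better than chance
    for _ in range(n_iter):
        # E step: responsibilities gamma[n, k] ∝ pi_k * prod_j P[k, j, x_nj]
        logg = np.log(pi)[None, :] + np.einsum('njc,kjc->nk', onehot, np.log(P))
        logg -= logg.max(axis=1, keepdims=True)   # stabilise before exp
        gamma = np.exp(logg)
        gamma /= gamma.sum(axis=1, keepdims=True)
        # M step: re-estimate priors and per-expert response probabilities
        pi = gamma.mean(axis=0)
        P = np.einsum('nk,njc->kjc', gamma, onehot) + 1e-9
        P /= P.sum(axis=2, keepdims=True)
    return gamma.argmax(axis=1), pi, P
```

The most probable component per image (the argmax of the responsibilities) serves as the fused maximum-likelihood label; P simultaneously yields each expert's estimated error probabilities, exactly the two quantities the embodiment learns jointly.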
[00105] Simulations
[00106] Since there exist no real ground-truth labels for the type of image data considered in this work (i.e., microscopic images of 'single' RBCs that are potentially infected by malaria parasites), the viability of the EM-based algorithm is demonstrated through simulations. To this end, labels were randomly assigned to a simulated set of 4,000 cell images. A parasitemia of 15% was chosen (i.e., 15% of the labels were 1's), along with a "questionable" probability of 5% (i.e., 5% of the labels were 2's); the remaining labels (i.e., 80%) were set to 0's. Since the most difficult diagnostic task is the identification of true positives, the simulations used more positives than typically occur in practice, to better test the efficacy of the mathematical framework. The responses of a set of nine experts diagnosing the images were then simulated. Each individual was assigned a set of accuracy numbers (i.e., P_j(x | I)) from which their responses were sampled. Once the individual responses were generated, the combined set of diagnoses was computed using Expectation Maximisation, as described above, and was compared to the original simulated cell labels, generating the combined accuracy metrics.
[00107] Throughout this embodiment, the focus is on the diagnoses of 'single' RBC images by experts since it is the basic task to be repeated e.g., more than 1,000 times toward accurate diagnosis of a single patient's blood smear sample. Single-cell-based analysis of a smear is essential for estimating the parasitemia, which can be quite important and valuable for monitoring the treatment of malaria patients. Often in practice however, a slide-level diagnosis is made for a patient (i.e., malaria infection observed, or malaria infection not observed). Since a thin blood smear typically contains hundreds of thousands of intact RBCs on it, slide-level malaria diagnosis using a patient's blood smear slide is relatively easier than cell-level diagnosis, as statistical errors in parasite recognition may be partially hidden. In other words, as long as the overall slide-level diagnosis is correct, the individual cell-level mistakes no longer matter (unless accurate parasitemia measurement is required for e.g., monitoring of a positive patient).
[00108] A systematic translation from the diagnoses of individual RBC images to that of a patient's blood smear is often needed. In the following analysis, a detailed look is given at this important problem; for medical professionals with different levels of expertise, the number of RBC images that need to be diagnosed per blood smear sample should vary based on their abilities, in order to claim an accurate diagnosis per patient slide. This mathematical framework can be rather useful for customizing and fine-tuning standard diagnostic procedures depending on the training level of the experts.
[00109] In analysing a smear and calling it infected vs. uninfected, one can treat the formation of the slide as a stochastic process. It is assumed that the infected and uninfected smears follow two distinct processes with different distributions. In the case of an uninfected slide, there are no physically infected cells on the smear. Therefore, in the ideal deterministic scenario, none of the cells observed under the microscope should be labelled as infected. This however is not necessarily true, due to errors on the part of the individual (e.g., a pathologist) looking at the cells. The observer will have an error probability ζ associated with her/his labels, which defines the probability of mislabelling a healthy cell as infected.
[00110] Assuming N cells are observed (or labelled) from the same blood smear slide, for a healthy smear we will have the following Binomial distribution for the number of cells labelled as infected, L:

[00111]

(12)  P(L = l | healthy smear) = (N choose l) ζ^l (1 − ζ)^{N−l}
[00112] The case of an infected slide (with a parasitemia rate of ξ), however, is much more complicated to analyze since: (1) the total number of truly infected cells (i.e., n) within the smear can range from 0 to N with varying probabilities; and (2) the total number of positive labels assigned to the cells by the medical expert can be due to a combination of truly infected and uninfected cells. As a result, the following distribution exists for the number of infected/positively labelled cells L for an infected smear that has a parasitemia rate of ξ:

(13)  P(L = l | infected smear) = Σ_{n=0}^{N} (N choose n) ξ^n (1 − ξ)^{N−n} Σ_{j=max(0, l+n−N)}^{min(n, l)} (n choose j) η^j (1 − η)^{n−j} (N−n choose l−j) ζ^{l−j} (1 − ζ)^{(N−n)−(l−j)}
[00113] where η is the probability of correctly labelling an infected cell as infected.
[00114] Assuming one knows the true positive and false positive probabilities, one can generate the Receiver Operating Characteristics (ROC) curves for different parasitemia levels ξ and labeled cell counts N.
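The two smear-level distributions above, and the ROC trade-off they imply, can be sketched numerically with exact summations. The function names and the decision rule (call a smear positive once at least `threshold` cells are labelled infected) are illustrative assumptions; the small N in the usage note is chosen only to keep the toy computation fast:

```python
from math import comb

def p_healthy(l, N, zeta):
    """P(L = l | healthy smear): Binomial(N, zeta) false-positive labels."""
    return comb(N, l) * zeta**l * (1 - zeta)**(N - l)

def p_infected(l, N, xi, eta, zeta):
    """P(L = l | infected smear with parasitemia xi), summing over the
    unknown number n of truly infected cells and the split of the l
    positive labels into true hits (j) and false alarms (l - j)."""
    total = 0.0
    for n in range(N + 1):
        pn = comb(N, n) * xi**n * (1 - xi)**(N - n)
        for j in range(max(0, l + n - N), min(n, l) + 1):
            total += pn * (comb(n, j) * eta**j * (1 - eta)**(n - j)
                           * comb(N - n, l - j) * zeta**(l - j)
                           * (1 - zeta)**((N - n) - (l - j)))
    return total

def roc_point(threshold, N, xi, eta, zeta):
    """Slide-level (FPR, TPR) for the rule 'call the smear positive
    when at least `threshold` cells are labelled infected'."""
    fpr = sum(p_healthy(l, N, zeta) for l in range(threshold, N + 1))
    tpr = sum(p_infected(l, N, xi, eta, zeta) for l in range(threshold, N + 1))
    return fpr, tpr
```

Sweeping `threshold` from 0 to N traces the ROC curve for a given parasitemia ξ and labelled cell count N, which is how curves of the kind shown in FIGS. 19A-19B can be generated.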
[00115] Results

[00116] The motivation of the system and method is not only to create a more accessible platform for telepathology, but also to increase the efficiency and accuracy of remote medical diagnosis. In other words, even relatively poorly trained medical personnel can be digitally and remotely combined to create highly accurate collective decisions (assuming each individual performs at least better than chance in terms of accuracy). To set the stage in terms of motivation and the potential severity of the problem, FIG. 14 shows the experimental results revealing the level of agreement that exists among nine highly trained medical personnel who are experts in diagnosing malaria. Given that the image database consisted only of single images of individual cells (totalling more than 8,000 RBC images) without the ability to focus in and out, the experts were asked to label the images as infected by malaria, uninfected by malaria, or questionable (i.e., a certain judgment cannot be made). An interesting observation was the degree of variance in the expert responses as shown in FIG. 14: these nine experts agreed on 93% of the images that they labelled as negative (or uninfected), and only on 12% of what they labelled as positive (or infected).
Furthermore, only 64% of the images labelled as positive received that label from the majority of the experts, which implies a simple majority vote of the experts might lead to highly inefficient and potentially inaccurate diagnoses.
[00117] In addition to the inconsistencies that exist among the different experts, there is a significant amount of self-inconsistency exhibited by 'each' expert. To test the level of self-consistency of experts, each RBC image in the database was presented three times, at rotations of 90°, to each expert for labelling. FIG. 15 shows the level of self-inconsistency that each expert exhibits within her/his responses. The most consistent expert has a self-inconsistency of 0.2% and 0.8% for the negative and positive categories, respectively, and the least consistent expert is more than 2% inconsistent in each of those categories. This self-inconsistency of experts can be interpreted to mean that the diagnosis of an expert— even a highly trained one— is not a deterministic process, and inherently contains a stochastic and thus random component. It also implies that this stochastic nature can be exploited to achieve a higher level of accuracy by combining diagnoses generated by multiple experts.
[00118] Again, for the cell images that were used in the experiments, the true labels are not known. This is a direct consequence of the fact that the performance of the experts who would normally create such ground-truth labels is itself being analysed. As a result, the only practical way to test the applicability and performance of the system and method is through simulations. Toward this goal, a general model of an expert's response was created. A model was assumed with six degrees of freedom through the parameters listed in Table 2. Eight simulation experiments were run, where in each trial a pool of nine experts with varying performances was simulated. The range of overall expert accuracy for each trial was set to 10%, yielding predetermined average accuracies ranging from 55.7% to 81.5%. Running the EM-based algorithm on each of the simulated pools of responses, the combined accuracies for these virtual experts were generated. The results of the simulated experiments are shown in FIG. 16, where one can readily observe that even when the average accuracy of the experts is less than 60%, it is possible to obtain combined accuracies higher than 95%. What is more interesting is the fact that the boost in accuracy resulting from combining the multiple responses (i.e., the back row of bars) does not increase at the same rate as the average accuracies of the individuals. In other words, after a certain number, the subsequent addition of more experts reaches a point of diminishing returns in terms of contribution to the overall accuracy of the combined diagnosis. This can be seen as both a strength and a weakness of the proposed methodology, in that if there exists a lone expert who is extremely accurate compared to his peers within the pool, her/his responses may 'not' have a significant impact on the overall accuracy, and her/his voice may get drowned out by the crowd.
[00119] At the same time, a single incompetent individual cannot have a significant negative influence on the overall results. Another point that must be emphasised with regard to these simulations is that, when generating the results, the simulations did not take into account the possibility that some images may be inherently more difficult to diagnose; furthermore, it was assumed that the errors that the experts make are uniformly distributed across the images. Intuitively, this uniformity assumption gives each image a reasonable chance to receive more correct responses than incorrect ones. If, for example, all of the experts incorrectly diagnose a set of images, then there is no way to correct those errors.
[00120] Returning to the experimental results with the nine malaria experts, taking the EM-based consensus of the crowd to be the ground truth for the cell labels, one can generate a set of experimental performance metrics for each expert as illustrated in FIG. 17. FIG. 18 also illustrates some sample RBC images from the categories that resulted from this consensus. Absolute accuracy is not the best metric for measuring the performance of the experts in this setting, due to the significant imbalance that exists in the number of healthy and infected cells in our dataset— this imbalance is even more drastic in individual patient samples due to the low parasitemia levels that typically exist in malaria-infected patients. As such, two better metrics are the Negative Predictive Value (NPV) and the Positive Predictive Value (PPV), which are indicative of the reliability of the negative and positive labels assigned to the cell images (see FIG. 17). As seen in FIG. 17, even though all the experts achieved very high and similar accuracy levels, their response quality varies significantly in terms of NPV and PPV. An interesting observation can be made by comparing FIG. 15 with FIG. 17: experts 4, 5, and 6, who exhibited the highest levels of self-consistency in their responses to the uninfected and infected cell images, also had the highest PPV levels.
[00121] As noted herein, there is a distinction between cell-level diagnosis and smear-level diagnosis. Although the former is a necessary step in performing the latter task, the two do not correspond to each other in a straight-forward linear fashion. One can use a probabilistic framework to make the transition from cell-level diagnoses to smear-level diagnosis. In doing so, one can see that depending on the expertise level of the medical professional making the diagnosis, to achieve a particular level of certainty when calling a smear slide positive, with a fixed false positive rate, the number of individual cells that need to be examined varies drastically. Of course, the method may be used to score or diagnose slides, sample results from a subject, or even groups of subjects.
[00122] For example, FIG. 19A shows that when diagnosing a smear that has a parasitemia level of 0.5% (which can be typical), if the expert has a sensitivity (i.e., true positive rate) of 99%— meaning that s/he labels an infected cell correctly 99% of the time— and a false positive rate of 1% (i.e., specificity of 99%)— meaning that s/he makes a mistake of calling an uninfected cell as infected 1% of the time— s/he would then need to label more than 2,000 individual cells so that s/he would have a smear-level false positive rate less than 10% with a true positive detection rate of 80%. FIG. 19B illustrates the same graph with a parasitemia level of 1.0%.
[00123] The smear-level diagnosis accuracy improves as the number of labelled cells N is increased. This theoretical analysis may to some extent explain the prevalence of false positive diagnoses in sub-Saharan Africa (sometimes approaching ~60%), since even with extremely high single-cell accuracy levels, professionals can still make mistakes, and unless they observe statistically significant numbers of cells, they cannot avoid making frequent false positive diagnoses. As an example, for a parasitemia of ξ = 0.5%, N = 2000, η = 99%, and ζ = 1%, a true positive rate above 90% cannot be achieved with a false positive rate less than 30% (see FIG. 19A). This mathematical framework can be generalized and used to customize and fine-tune standard diagnostic procedures depending on the training levels of individual experts. Such action may lead to significant improvements in diagnosis efficiency and cost-effectiveness, especially within a digital telepathology platform.
[00124] The cell-level and slide-level diagnosis methodologies described herein were applied to thin smear samples. Under various circumstances, however, thick smear blood samples are also used for the diagnosis of malaria in the field. The multi-expert telediagnosis framework described herein is applicable to thick smears as well. In such a scenario, there will be no cell-level diagnosis; instead, the thick-smear images will be cropped into smaller pieces and then sent to experts for diagnostic labelling. Rather than combining the experts' inputs to extract the infection state of individual cells, in this scenario, the experts' labels will be combined to extract the infection state of different cropped regions of the thick smear image. The methods and systems described herein apply to images that contain parts of a histopathology slide or a smear of cells on a substrate (e.g., a slide).
[00125] In another embodiment, instead of using active gaming-based methods to get individuals to help solve biologically relevant pattern-recognition tasks, passive puzzles or games may be used. In this alternative embodiment, the puzzle or game is implemented as a bioCAPTCHA. CAPTCHA stands for Completely Automated Public Turing Test to Tell Computers and Humans Apart. The bioCAPTCHA is a biologically oriented CAPTCHA that various entities can incorporate into their websites in place of traditional CAPTCHAs to ensure that visitors are humans and not automated computer software. Instead of solving a traditional text- or object-based CAPTCHA, the user is presented with a bioCAPTCHA, which involves the analysis of an image of a specimen to prove that the user is a human while, at the same time, solving some part of a biologically relevant pattern analysis task. This task could include any number of tasks including, but not limited to, counting objects within an image (e.g., counting cells, parasites, etc.) or segmenting a large image toward classification of the different parts that make up the image. Because training users for a bioCAPTCHA interface would not be feasible, the human tasks that are required solve only part of the overall image analysis. In one embodiment, the bioCAPTCHA interfaces could be used to help create images to be used in the games for further advanced analysis by one or more trained gamers. For example, certain regions of an imaged cell may be highlighted by the user of the bioCAPTCHA interface. This includes, for example, identifying organelles, cell borders, and stained regions of the cells. Other examples include counting cells or features contained within a presented image (which may be zoomed). The user may also be asked to identify certain abnormalities or inconsistencies within a set of images (e.g., which cell or cells is/are different from the majority of cells). This last example may involve the user passively identifying potentially diseased cells.
[00126] FIG. 20 illustrates an example of a bioCAPTCHA. In this example, the visitor to a website is asked to count and enter the number of cells in two images. The number of cells in image (a) is known a priori as a control measure, both to assess the performance of the user and to ensure that computer software is not allowed to circumvent the bioCAPTCHA. The cell count in image (b) is unknown. Image (b) is presented to many users as a bioCAPTCHA, and the consensus formed from the many user responses is used to generate an accurate cell count. Note that image (b) can be a part of a much larger image, which is split into small patches, making the task of counting the cells easy enough for the bioCAPTCHA interface.
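The control-plus-unknown scheme of FIG. 20 can be sketched as a small server-side routine; the class name, the pass/fail rule, and the median consensus below are illustrative assumptions rather than the patent's implementation:

```python
from collections import defaultdict
from statistics import median

class BioCaptcha:
    """Minimal sketch of the control-plus-unknown cell-counting bioCAPTCHA."""

    def __init__(self):
        # patch_id -> list of accepted user counts for the unknown image
        self.responses = defaultdict(list)

    def check(self, control_count, user_control_answer, patch_id, user_patch_answer):
        """Pass the CAPTCHA only if the known control image is counted correctly;
        in that case the answer for the unknown patch is also recorded."""
        if user_control_answer != control_count:
            return False  # likely a bot or a careless user; discard both answers
        self.responses[patch_id].append(user_patch_answer)
        return True

    def consensus_count(self, patch_id, min_responses=5):
        """Median of accepted answers once enough users have seen the patch."""
        answers = self.responses[patch_id]
        if len(answers) < min_responses:
            return None
        return median(answers)
```

A user's answer for the unknown patch is recorded only when the control image is counted correctly, so automated software that fails the control contributes nothing to the consensus.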
[00127] While embodiments have been shown and described, various modifications may be made without departing from the scope of the inventive concepts disclosed herein. The invention(s), therefore, should not be limited except by the following claims and their equivalents.

Claims

WHAT IS CLAIMED IS:
1. A method of analyzing microscope slide images using crowd-sourcing comprising:
obtaining one or more microscopic images of cells on the microscope slide;
performing image processing to identify groups of cells or individual cells within the image;
transferring images of the groups of cells or individual cells to a plurality of computer gaming devices associated with different users, the plurality of computer gaming devices configured to run gaming software, the gaming software configured to display the images of the groups of cells or individual cells on a display;
identifying, with the gaming software, individual cells suspected of having a particular characteristic or phenotype, wherein the identification is performed on the plurality of different computer gaming devices;
transmitting the identification information from the plurality of different computer gaming devices to one or more remotely located computing devices; and
labeling the individual cells based at least in part on a decoding operation performed by the one or more remotely located computing devices on the transmitted identification information.
2. The method of claim 1, further comprising diagnosing the slide or a subject based on the labeled cells.
3. The method of claim 1, further comprising generating a library database based on the labeled cells.
4. The method of claim 3, wherein the library database comprises a training database.
5. The method of claim 1, wherein the plurality of computer gaming devices configured to run gaming software comprise at least one of a mobile phone, tablet, personal computer, and a wearable computer.
6. The method of claim 1, wherein the characteristic or phenotype comprises at least one of a diseased or abnormal state.
7. The method of claim 6, wherein the at least one of a diseased or abnormal state comprises the presence of malaria parasites.
8. The method of claim 1, wherein substantially all of the users are non-experts in the field of pathology.
9. The method of claim 1, wherein at least some of the users are experts in the field of pathology.
10. The method of claim 1, wherein the decoding operation comprises a Maximum A Posteriori Probability algorithm executed on the remotely located computing device.
11. The method of claim 1, wherein the gaming software comprises at least one motivator.
12. The method of claim 11, wherein the motivator comprises a monetary award.
13. The method of claim 1, wherein at least some of the images of the groups of cells or individual cells are control images.
14. The method of claim 1, wherein the one or more microscopic images comprise parts of a histopathology slide or a smear of cells.
15. The method of claim 1, wherein the image processing to identify groups of cells or individual cells within the image is performed by the one or more remotely located computing devices.
16. The method of claim 1, wherein the one or more remotely located computing devices comprise one or more computer servers.
17. A method of analyzing microscope slide images using crowd-sourcing comprising:
obtaining one or more microscopic images of cells on the microscope slide;
performing image processing with at least one computer to identify individual cells or groups of cells within the image;
automatically identifying individual cells or groups of cells suspected of having a particular characteristic or phenotype using a pre-trained machine learning algorithm executed on the at least one computer, wherein the automatically identified cells or groups of cells are those cells having a confidence level above a threshold value;
transferring images of the remaining cells to a plurality of computer gaming devices associated with different users, the plurality of computer gaming devices configured to run gaming software, the gaming software configured to display images of the cells on a display, wherein the remaining cells are those cells having a confidence level below the threshold value;
identifying, with the gaming software, individual cells or groups of cells suspected of having the particular characteristic or phenotype, wherein the identification is performed on the plurality of different computer gaming devices;
transmitting the identification information from the plurality of different computer gaming devices to the at least one computer; and
labeling the individual cells based at least in part on a decoding operation performed by the at least one computer on the transmitted identification information for the cells having the confidence level below the threshold value.
18. The method of claim 17, further comprising diagnosing the slide or a subject or a group of subjects based on the labeled cells.
19. The method of claim 17, further comprising generating a library database based on the labeled cells.
20. The method of claim 19, wherein the library database comprises a training database.
21. The method of claim 17, wherein the plurality of computer gaming devices configured to run gaming software comprise at least one of a mobile phone, tablet, personal computer, and a wearable computer.
22. The method of claim 17, wherein the characteristic or phenotype comprises at least one of a diseased state or abnormal state.
23. A system for analyzing microscope slide images using crowd-sourcing comprising:
a remote computing device configured to receive one or more microscopic images of cells on the microscope slide and further configured to identify groups of cells or individual cells within the image;
a plurality of computer gaming devices containing gaming software configured to receive images of the groups of cells or individual cells from the remote computing device, the gaming software further configured to display images of the groups of cells or individual cells on a display and permit user identification of individual cells suspected of having a particular characteristic or phenotype; and
wherein the remote computing device is configured to receive user identification information transmitted from the plurality of computer gaming devices and further configured to label the individual cells based at least in part on a decoding operation performed by the remote computing device on the transmitted identification information.
24. The system of claim 23, wherein the plurality of computer gaming devices configured to run gaming software comprise at least one of a mobile phone, tablet, personal computer, and a wearable computer.
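The two-stage routing recited in claim 17, automatic labelling when a pre-trained classifier is confident and crowd-sourcing otherwise, can be sketched as follows. The function name, threshold value, and the assumed classifier interface returning a (label, confidence) pair are illustrative, not the patent's implementation:

```python
def triage(cell_images, classifier, threshold=0.9):
    """Split cells into an auto-labelled set and a crowd-bound set.

    classifier(img) is assumed to return (label, confidence). Cells the
    pre-trained model labels with confidence at or above the threshold are
    accepted automatically; the rest are queued for the gaming devices."""
    auto_labelled, for_crowd = [], []
    for img in cell_images:
        label, confidence = classifier(img)
        if confidence >= threshold:
            auto_labelled.append((img, label))
        else:
            for_crowd.append(img)  # to be resolved by crowd-sourced labelling
    return auto_labelled, for_crowd
```

Only the low-confidence remainder is transferred to the gaming devices and later decoded, which keeps the human workload proportional to the genuinely ambiguous cells.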
PCT/US2013/031109 2012-03-20 2013-03-14 System and method for crowd-sourced telepathology WO2013142219A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201261613396P 2012-03-20 2012-03-20
US61/613,396 2012-03-20
US201261664010P 2012-06-25 2012-06-25
US61/664,010 2012-06-25

Publications (1)

Publication Number Publication Date
WO2013142219A1 true WO2013142219A1 (en) 2013-09-26

Family

ID=49223219

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2013/031109 WO2013142219A1 (en) 2012-03-20 2013-03-14 System and method for crowd-sourced telepathology

Country Status (1)

Country Link
WO (1) WO2013142219A1 (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2005121863A1 (en) * 2004-06-11 2005-12-22 Nicholas Etienne Ross Automated diagnosis of malaria and other infections
US20100293026A1 (en) * 2009-05-18 2010-11-18 Microsoft Corporation Crowdsourcing
US20110122242A1 (en) * 2009-10-26 2011-05-26 Texas Instruments Incorporated Digital microscopy equipment with image acquisition, image analysis and network communication
EP2348477A1 (en) * 2010-01-06 2011-07-27 Alcatel Lucent Crowdsourcing through mobile network
US20110313820A1 (en) * 2010-06-17 2011-12-22 CrowdFlower, Inc. Using virtual currency to compensate workers in a crowdsourced task


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9767419B2 (en) 2014-01-24 2017-09-19 Microsoft Technology Licensing, Llc Crowdsourcing system with community learning
US10762443B2 (en) 2014-01-24 2020-09-01 Microsoft Technology Licensing, Llc Crowdsourcing system with community learning
US11120373B2 (en) 2014-07-31 2021-09-14 Microsoft Technology Licensing, Llc Adaptive task assignment
US11494901B2 (en) 2019-05-21 2022-11-08 Saint Louis University Digital telepathology and virtual control of a microscope using edge computing

Similar Documents

Publication Publication Date Title
Mavandadi et al. Distributed medical image analysis and diagnosis through crowd-sourced games: a malaria case study
Luo et al. Deep mining external imperfect data for chest X-ray disease screening
Albarqouni et al. Aggnet: deep learning from crowds for mitosis detection in breast cancer histology images
Wang et al. Does non-COVID-19 lung lesion help? investigating transferability in COVID-19 CT image segmentation
Bissoto et al. Deep-learning ensembles for skin-lesion segmentation, analysis, classification: RECOD titans at ISIC challenge 2018
US11721023B1 (en) Distinguishing a disease state from a non-disease state in an image
Qu et al. An experimental study of data heterogeneity in federated learning methods for medical imaging
Punitha et al. Detecting COVID-19 from lung computed tomography images: A swarm optimized artificial neural network approach
WO2013142219A1 (en) System and method for crowd-sourced telepathology
Haloi et al. Towards radiologist-level accurate deep learning system for pulmonary screening
Neggaz et al. Boosting Archimedes optimization algorithm using trigonometric operators based on feature selection for facial analysis
CN116721772B (en) Tumor treatment prognosis prediction method, device, electronic equipment and storage medium
Lee et al. VisCUIT: Visual auditor for bias in CNN image classifier
Zhou et al. Audit to Forget: A Unified Method to Revoke Patients' Private Data in Intelligent Healthcare
CN116645346A (en) Processing method of rotator cuff scanning image, electronic equipment and storage medium
Ferber et al. In-context learning enables multimodal large language models to classify cancer pathology images
Sharma et al. Surya Namaskar: real-time advanced yoga pose recognition and correction for smart healthcare
Rousseau et al. The TrackML challenge
CN113327212A (en) Face driving method, face driving model training device, electronic equipment and storage medium
Heinrich et al. Evaluating viewpoint entropy for ribbon representation of protein structure
Ravin et al. Mitigating domain shift in AI-based TB screening with unsupervised domain adaptation
Ganjdanesh et al. Multi-modal genotype and phenotype mutual learning to enhance single-modal input based longitudinal outcome prediction
Yang et al. Coordinate-wise monotonic transformations enable privacy-preserving age estimation with 3D face point cloud
Viroonluecha et al. COVID19 X-ray image classification using voting ensemble CNNs transfer learning
Wijaya et al. The Design of Convolutional Neural Networks Model for Classification of Ear Diseases on Android Mobile Devices

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13763813

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13763813

Country of ref document: EP

Kind code of ref document: A1