WO2005091203A2 - Systems and methods for treating, diagnosing and predicting the occurrence of a medical condition - Google Patents

Systems and methods for treating, diagnosing and predicting the occurrence of a medical condition

Info

Publication number
WO2005091203A2
Authority
WO
WIPO (PCT)
Prior art keywords
feature
morphometric
stroma
nuclei
patient
Prior art date
Application number
PCT/US2005/008350
Other languages
English (en)
Other versions
WO2005091203A3 (fr
Inventor
Olivier Saidi
David A. Verbel
Mikhail Teverovskiy
Original Assignee
Aureon Laboratories, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US10/991,897 external-priority patent/US7483554B2/en
Priority claimed from US10/991,240 external-priority patent/US7505948B2/en
Priority claimed from US11/067,066 external-priority patent/US7321881B2/en
Application filed by Aureon Laboratories, Inc. filed Critical Aureon Laboratories, Inc.
Priority to EP05728291A priority Critical patent/EP1728211A2/fr
Priority to CA2559241A priority patent/CA2559241C/fr
Publication of WO2005091203A2 publication Critical patent/WO2005091203A2/fr
Publication of WO2005091203A3 publication Critical patent/WO2005091203A3/fr

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0012 Biomedical image inspection

Definitions

  • Embodiments of the invention relate to methods and systems that use clinical information, molecular information and computer-generated morphometric information in a predictive model for predicting the occurrence of a medical condition (e.g., disease or responsiveness or unresponsiveness to treatment).
  • the invention comprises methods and systems that use clinical, molecular and morphometric information to treat, diagnose and predict the recurrence of prostate cancer.
  • BACKGROUND Physicians are required to make many medical decisions ranging from, for example, whether and when a patient is likely to experience a medical condition to how a patient should be treated once the patient has been diagnosed with the condition.
  • Determining an appropriate course of treatment for a patient may increase the patient's chances for, for example, survival and/or recovery.
  • predicting the occurrence of an event advantageously allows individuals to plan for the event. For example, predicting whether a patient is likely to experience occurrence (e.g., recurrence) of a disease may allow a physician to recommend an appropriate course of treatment for that patient.
  • physicians rely heavily on their expertise and training to treat, diagnose and predict the occurrence of medical conditions.
  • pathologists use the Gleason scoring system to evaluate the level of advancement and aggression of prostate cancer, in which cancer is graded based on the appearance of prostate tissue under a microscope as perceived by a physician.
  • Although Gleason grading is widely considered by pathologists to be reliable, it is a subjective scoring system. In particular, different pathologists viewing the same tissue samples may make conflicting interpretations.
  • Conventional tools for assisting physicians in medical diagnostics are limited in scope and application. For example, tools for assisting physicians with decisions regarding prostate cancer treatment after a patient has undergone radical prostatectomy are limited to serum-based PSA screening tests and generalized nomograms.
  • One postoperative nomogram, developed by Kattan et al. (U.S. Patent No. 6,409,664), is widely used by urologists and allows prediction of the 7-year probability of disease recurrence for patients treated by radical prostatectomy.
  • This nomogram provides information about the likelihood of biochemical failure only (i.e., an increase in PSA level), and does not predict clinical failure (death). Moreover, this nomogram only predicts whether a patient's condition is likely to recur within 7 years, and does not predict when in that interval the patient's condition might recur.
  • Prognostic variables used in this nomogram include pre-treatment serum PSA levels, Gleason score, and microscopic assessment by a pathologist of prostate capsular invasion, surgical margins, seminal vesicle invasion, and lymph node status. Treatment failure is recorded when there is clinical evidence of disease recurrence, a rising serum PSA, or initiation of adjuvant therapy.
  • these nomograms have several limitations.
  • a feature X is said to be continuous-valued if, for some A < B, the set of values for the feature includes all numbers x between A and B.
  • Cancer image analysis systems have been developed for images taken from cytological specimens [2] [3]. However, such systems only capture cells and thus do not utilize all of the architectural information observable at the tissue level, let alone combine that information with clinical and molecular information. Cancer image analysis systems have not been provided for analyzing the structure of different pathological elements at the tissue level, which often plays a more important role in diagnosis (e.g., in Gleason analysis) than the appearance of individual cells. Thus, pathologists have resorted to manual techniques for analyzing the shape and size of the prostate gland to determine the pathologic grade of the cancer [4].
  • predicting an occurrence of a medical condition may include, for example, predicting whether and/or when a patient will experience occurrence (e.g., recurrence) of disease such as cancer, predicting whether a patient is likely to respond to one or more therapies (e.g., a new pharmaceutical drug), and predicting the occurrence of any other suitable medical condition. Predictions by embodiments of the present invention may be used by physicians or other individuals to, for example, select an appropriate course of treatment for a patient and/or to diagnose a medical condition in the patient. In an aspect of the present invention, systems and methods are provided for generating a model that predicts the occurrence of a medical condition.
  • Generating a predictive model may include using an analytical tool to train a support vector machine (SVM) or a neural network with data for a cohort of patients whose outcomes are at least partially known.
  • the training data includes clinical data, molecular data, and computer-generated morphometric data.
  • data of a particular type (e.g., clinical, molecular, or morphometric) may include one or more features of that type.
  • morphometric data is defined to include any computer-generated data associated with or derived from an electronic (digital) image of tissue, including but not limited to data regarding structural properties of the tissue or portion thereof (e.g., area, length, width, compactness, and density), spectral properties of the tissue or portion thereof (e.g., red, green, blue (RGB) color channel values, brightness and channel histograms), and fractal properties of the tissue image and/or identified tissue components (e.g., fractal dimension of intraepithelial interface, lumen outline), statistical properties of wavelet decomposition coefficients and/or other image data transforms.
  • the training data includes computer-generated morphometric data only or the combination of clinical data and computer-generated morphometric data.
  • systems and methods are provided for generating a predictive model based on one or more computer-generated morphometric features related to stroma, cytoplasm, epithelial nuclei, stroma nuclei, lumen, red blood cells, tissue artifacts, or tissue background, or a combination thereof.
  • the predictive model may be generated based on the computer-generated morphometric features alone or in combination with one or more of the clinical features listed in Table 4 and/or one or more of the molecular features listed in Table 6.
  • the one or more features may be input to an analytical tool that determines an effect of the features on the ability of an associated model to predict a medical condition.
  • Features that increase the predictive power of the model may be included in the final model, whereas features that do not increase (or that decrease) the predictive power may be removed from consideration.
  • Using the above-described morphometric features alone or in combination with the clinical and/or molecular features listed in Tables 4 and/or 6, respectively, as a basis for developing a predictive model may focus the resources of physicians, other individuals, and/or automated processing equipment (e.g., a tissue image analysis system) on obtaining data for patient features that are more likely to be correlated with outcome and therefore useful in the final predictive model.
  • a predictive model that evaluates a dataset for a patient in order to evaluate the risk of occurrence of a medical condition in the patient, where the predictive model is based on computer-generated morphometric data alone or in combination with clinical data and/or molecular data.
  • the predictive model may receive the dataset for the patient as input, and may output a "score" indicating the likelihood that the patient will experience one or more outcomes related to the medical condition.
  • a predictive model for predicting occurrence or recurrence of disease, where the model is based on one or more computer-generated morphometric features related to stroma, cytoplasm, epithelial nuclei, stroma nuclei, lumen, red blood cells, tissue artifacts, or tissue background, or a combination thereof.
  • the predictive model may be based on these computer-generated morphometric features alone or in combination with one or more of the clinical features listed in Table 4 and/or one or more of the molecular features listed in Table 6.
  • a predictive model for predicting prostate cancer recurrence, where the model is based on one or more of the clinical and/or molecular features set forth in Figure 6 and one or more morphometric features for one or more of the following pathological objects: red blood cell, epithelial nuclei, stroma, lumen, cytoplasm, and tissue background.
  • a predictive model is provided for predicting prostate cancer recurrence, where the model is based on one or more of the clinical and/or molecular features set forth in Figure 9 and one or more morphometric features for one or more of the following pathological objects: red blood cell, epithelial nuclei, stroma, lumen, and cytoplasm.
  • a predictive model for predicting prostate cancer survivability, where the model is based on one or more of the clinical and/or molecular features set forth in Figure 11 and one or more morphometric features for one or more of the following pathological objects: red blood cell, epithelial nuclei, and stroma.
  • the predictive model may determine whether a tissue sample is normal or abnormal or may predict whether a patient is likely to experience clinical failure post prostatectomy.
  • systems and methods are provided in which data for a patient is measured at each of a plurality of points in time and evaluated by a predictive model of the present invention. A diagnosis or treatment of the patient may be based on a comparison of the results from each evaluation.
  • Such a comparison may be summarized in, for example, a report output by a computer for use by a physician or other individual.
  • systems and methods may be provided for screening for an inhibitor compound of a medical condition.
  • a first dataset for a patient may be evaluated by a predictive model, where the model is based on clinical data, molecular data, and computer-generated morphometric data.
  • a test compound may be administered to the patient.
  • a second dataset may be obtained from the patient and evaluated by the predictive model.
  • the results of the evaluation of the first dataset may be compared to the results of the evaluation from the second dataset.
  • a change in the results for the second dataset with respect to the first dataset may indicate that the test compound is an inhibitor compound.
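The screening comparison described above can be sketched in a few lines. This is only an illustration under stated assumptions: `screen_compound`, `model` and `min_drop` are hypothetical names, and the sketch assumes that a higher model score means higher risk, which the text does not specify.

```python
from typing import Callable, Mapping

Dataset = Mapping[str, float]


def screen_compound(
    model: Callable[[Dataset], float],
    before: Dataset,
    after: Dataset,
    min_drop: float = 0.0,
) -> bool:
    """Return True if the model score falls by more than `min_drop`.

    Assumption: a higher score means higher risk, so a drop after the test
    compound is administered is read as a sign of inhibition.
    """
    score_before = model(before)  # evaluation of the first dataset
    score_after = model(after)    # evaluation of the second dataset
    return (score_before - score_after) > min_drop
```

In practice the threshold `min_drop` would have to be chosen so that ordinary measurement variation between the two datasets is not mistaken for an effect of the compound.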
  • a test kit for treating, diagnosing and/or predicting the occurrence of a medical condition.
  • a test kit may be situated in a hospital, other medical facility, or any other suitable location.
  • the test kit may receive data for a patient (e.g., including clinical data, molecular data, and/or computer-generated morphometric data), compare the patient's data to a predictive model (e.g., programmed in memory of the test kit) and output the results of the comparison.
  • the molecular data and/or the computer-generated morphometric data may be at least partially generated by the test kit.
  • the molecular data may be generated by an analytical approach subsequent to receipt of a tissue sample for a patient.
  • the morphometric data may be generated by segmenting an electronic image of the tissue sample into one or more objects, classifying the one or more objects into one or more object classes (e.g., stroma, lumen, red blood cells, etc.), and determining the morphometric data by taking one or more measurements for the one or more object classes.
  • the test kit may include an input for receiving, for example, updates to the predictive model. In some embodiments, the test kit may include an output for, for example, transmitting data, such as data useful for patient billing and/or tracking of usage, to another device or location.
  • Figures 1A and 1B are block diagrams of systems that use a predictive model to treat, diagnose or predict the occurrence of a medical condition
  • Figure 1C is a block diagram of a system for generating a predictive model
  • Figure 2 shows illustrative results for a patient that may be output by a predictive model
  • Figure 3 is a flowchart of illustrative stages involved in processing tissue images
  • Figure 4 is a flowchart of illustrative stages involved in screening for an inhibitor compound of a medical condition
  • Figures 5a and 5b show grayscale digital images of healthy and abnormal prostate tissue specimens, respectively, after image segmentation and classification
  • Figure 6 shows various clinical, molecular, and computer-generated morphometric features used by a model to predict prostate cancer recurrence
  • Embodiments of this invention relate to methods and systems that use computer-generated morphometric information alone or in combination with clinical information and/or molecular information in a predictive model for predicting the occurrence of a medical condition.
  • clinical, molecular and computer-generated morphometric information is used to predict the recurrence of prostate cancer.
  • the teachings provided herein are used to predict the occurrence of other medical conditions such as, for example, other types of disease (e.g., epithelial and mixed-neoplasms including breast, colon, lung, bladder, liver, pancreas, renal cell, and soft tissue) and the responsiveness or unresponsiveness of a patient to one or more therapies (e.g., pharmaceutical drugs).
  • an analytical tool including a support vector machine (SVM) and/or a neural network may be provided that determines correlations between clinical, molecular, and computer-generated morphometric features and a medical condition.
  • the correlated features may form a model that can be used to predict the occurrence or recurrence of the condition.
  • an analytical tool may be used to generate a predictive model based on data for a cohort of patients whose outcomes with respect to a medical condition (e.g., time to recurrence of cancer) are at least partially known. The model may then be used to evaluate data for a new patient in order to predict the occurrence of the medical condition for the new patient.
  • only a subset of the three data types may be used by the analytical tool to generate the predictive model.
  • the clinical, molecular, and/or morphometric data used by embodiments of the present invention may include any clinical, molecular, and/or morphometric data that is relevant to the diagnosis, treatment and/or prediction of a medical condition.
  • Features analyzed for correlations with prostate cancer recurrence and survival in order to generate predictive models are described below in connection with, for example, Tables 1, 2, 4 and/or 6.
  • these features may provide a basis for developing predictive models for other medical conditions (e.g., breast, colon, lung, bladder, liver, pancreas, renal cell, and soft tissue).
  • one or more of the features in Tables 1, 2, 4 and/or 6 may be assessed for patients having some other medical condition and then input to an analytical tool that determines whether the features correlate with the medical condition.
  • Features that increase the ability of the model to predict the occurrence of the medical condition may be included in the final model, whereas features that do not increase (or that decrease) the predictive power of the model may be removed from consideration.
  • Using the features in Tables 1, 2, 4 and/or 6 as a basis for developing a predictive model may focus the resources of physicians, other individuals, and/or automated processing equipment (e.g., a tissue image analysis system) on obtaining patient data that is more likely to be correlated with outcome and therefore useful in the final predictive model.
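The selection rule described above (a feature stays only if it improves predictive power) can be sketched as a greedy forward search. This is a minimal illustration and not the procedure used in the source: `greedy_forward_selection` and the `score` callback are hypothetical names, and the scoring function (for example, a cross-validated accuracy measure) is assumed to be supplied by the caller.

```python
from typing import Callable, Iterable, List, Sequence


def greedy_forward_selection(
    candidates: Iterable[str],
    score: Callable[[Sequence[str]], float],
) -> List[str]:
    """Add features one at a time, keeping each only if it raises the score.

    `score` is a hypothetical callback that returns the predictive power of a
    model trained on the given feature subset (higher is better).
    """
    selected: List[str] = []
    best = score(selected)
    remaining = list(candidates)
    improved = True
    while improved and remaining:
        improved = False
        # Evaluate every remaining feature when added to the current subset.
        trials = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(trials, key=lambda t: t[0])
        if top_score > best:
            # The best candidate helps: keep it and search again.
            selected.append(top_feature)
            remaining.remove(top_feature)
            best = top_score
            improved = True
    return selected
```

Features that never improve the score are simply left in `remaining`, which mirrors the idea of removing them from consideration.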
  • the features determined to be correlated with prostate cancer recurrence and survival are shown in Figures 6, 9, and 11. It will be understood that these features may be included directly in final models predictive of prostate cancer recurrence and/or survival, and/or used for developing predictive models for other medical conditions.
  • the morphometric data may include computer-generated data indicating various structural and/or spectral properties of, for example, tissue specimens.
  • the morphometric data may include data for morphometric features of stroma, cytoplasm, epithelial nuclei, stroma nuclei, lumen, red blood cells, tissue artifacts, tissue background, or a combination thereof.
  • a tissue image analysis system is provided for obtaining measurements of the morphometric features from a tissue image.
  • Such a system may be the MAGIC™ system which uses the Definiens Cellenger software.
  • Such a system may receive an H&E stained image as input, and may output various measurements of morphometric features for pathological objects in the image. Additional details regarding systems and methods for obtaining morphometric features from an image are described below in connection with Figure 3.
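As a rough illustration of the measurement step, the sketch below computes one simple morphometric feature, the area fraction of each object class, from an image whose pixels have already been segmented and classified. It is not the object-based method of the system named above; `class_area_fractions` and the label strings are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def class_area_fractions(label_image: List[List[str]]) -> Dict[str, float]:
    """Return the fraction of pixels assigned to each object class.

    `label_image` is a hypothetical 2-D grid in which every pixel already
    carries a class label such as "stroma" or "lumen".
    """
    counts = Counter(label for row in label_image for label in row)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}
```

Other structural features mentioned in the text, such as length, width or compactness, would need per-object geometry rather than per-class pixel counts.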
  • Clinical features may include or be based on data for one or more patients such as age, race, weight, height, medical history, genotype and disease state, where disease state refers to clinical and pathologic staging characteristics and/or other clinical features gathered specifically for the disease process at hand.
  • clinical data is gathered by a physician during the course of examining a patient and/or the tissue or cells of the patient.
  • the clinical data may also include clinical data that may be more specific to a particular medical context.
  • the clinical data may include data indicating blood concentration of prostate specific antigen (PSA), the result of a digital rectal exam, Gleason score, and/or other clinical data that may be more specific to prostate cancer.
  • other histologic disease-specific features/manifestations may include regions of necrosis (e.g., ductal carcinoma in situ for the breast), size, shape and regional pattern/distribution of epithelial cells (e.g., breast, lung), degree of differentiation (e.g., squamous differentiation with non-small cell lung cancer (NSCLC), mucin production as seen with various adenocarcinomas seen in both breast and colon), morphological/microscopic distribution of the cells (e.g., lining ducts in breast cancer, lining bronchioles in NSCLC), and degree and type of inflammation (e.g., having different characteristics for breast and NSCLC in comparison to prostate).
  • the molecular features may include or be based on data indicating the presence, absence, relative increase or decrease or relative location of biological molecules including nucleic acids, polypeptides, saccharides, steroids and other small molecules or combinations of the above, for example, glycoproteins and protein-RNA complexes.
  • the locations at which these molecules are measured may include glands, tumors, stroma, and/or other locations, and may depend on the particular medical context.
  • molecular data is gathered using common molecular biological and biochemical techniques including Southern, Western, and Northern blots, polymerase chain reaction (PCR), immunohistochemistry, and immunofluorescence. Further, in situ hybridization may be used to show both the relative abundance and location of molecular biological features.
  • Figures 1A and 1B show illustrative systems that use a predictive model to predict the occurrence of a medical condition in a patient.
  • the arrangement in Figure 1A may be used when, for example, a medical diagnostics lab provides support for a medical decision to a physician or other individual associated with a remote access device.
  • predictive model 102 is located in diagnostics facility 104.
  • Predictive model 102 may include any suitable hardware, software, or combination thereof for receiving data for a patient, evaluating the data in order to predict the occurrence (e.g., recurrence) of a medical condition for the patient, and outputting the results of the evaluation.
  • model 102 may be used to predict the responsiveness of a patient to one or more particular therapies.
  • Diagnostics facility 104 may receive data for a patient from remote access device 106 via Internet service provider (ISP) 108 and communications networks 110 and 112, and may input the data to predictive model 102 for evaluation.
  • Other arrangements for receiving and evaluating data for a patient from a remote location are of course possible (e.g., via another connection such as a telephone line or through the physical mail).
  • the remotely located physician or individual may acquire the data for the patient in any suitable manner and may use remote access device 106 to transmit the data to diagnostics facility 104.
  • the data for the patient may be at least partially generated by diagnostics facility 104 or another facility.
  • diagnostics facility 104 may receive a digitized version of an H&E stained image from remote access device 106 or other device and may generate morphometric data for the patient based on the image.
  • actual tissue samples may be received and processed by diagnostics facility 104 in order to generate the morphometric data.
  • a third party may receive an image or tissue for a new patient, generate morphometric data based on the image or tissue, and provide the morphometric data to diagnostics facility 104.
  • a suitable image processing tool for generating morphometric data from tissue images and/or samples is described below in connection with Figure 3.
  • Diagnostics facility 104 may provide the results of the evaluation to a physician or individual associated with remote access device 106 through, for example, a transmission to remote access device 106 via ISP 108 and communications networks 110 and 112 or in another manner such as the physical mail or a telephone call.
  • the results may include a diagnostic "score" (e.g., an indication of the likelihood that the patient will experience one or more outcomes related to the medical condition such as the predicted time to recurrence of the event), information indicating one or more features analyzed by predictive model 102 as being correlated with the medical condition, information indicating the sensitivity and/or specificity of the predictive model, or other suitable diagnostic information or a combination thereof.
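For readers unfamiliar with the two accuracy measures named above, the following sketch shows how sensitivity and specificity are conventionally computed from known and predicted binary outcomes. The function name and argument layout are illustrative assumptions and are not taken from the source.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    actual: Sequence[bool], predicted: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary outcomes.

    sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    A rate with an empty denominator is returned as NaN.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)
    fn = sum(1 for a, p in pairs if a and not p)
    tn = sum(1 for a, p in pairs if not a and not p)
    fp = sum(1 for a, p in pairs if not a and p)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity
```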
  • Figure 2 shows an example of a report for a fictional patient that may be output by the predictive model.
  • the report maps the patient's probability of outcome (e.g., recurrence of prostate cancer; i.e., y-axis) to time in months (x-axis).
  • the patient has a score of "520" which places the patient in a high-risk category.
  • Such a report may be used by a physician or other individual to assist in determining a more refined clinical-diagnostic tumor grade, develop an effective means to sub-classify patients and finally generate more accurate (and appropriate) treatment option algorithms for the individual patient.
  • the report may also be useful in that it may help the physician or individual to explain the patient's risk to the patient.
  • Remote access device 106 may be any remote device capable of transmitting and/or receiving data from diagnostics facility 104 such as, for example, a personal computer, a wireless device such as a laptop computer, a cell phone or a personal digital assistant (PDA), or any other suitable remote access device.
  • diagnostics facility 104 may include a server capable of receiving and processing communications to and/or from remote access device 106.
  • Such a server may include a distinct component of computing hardware and/or storage, but may also be a software application or a combination of hardware and software.
  • the server may be implemented using one or more computers.
  • Each of communications links 110 and 112 may be any suitable wired or wireless communications path or combination of paths such as, for example, a local area network, wide area network, telephone network, cable television network, intranet, or Internet.
  • Some suitable wireless communications networks may be a global system for mobile communications (GSM) network, a time-division multiple access (TDMA) network, a code-division multiple access (CDMA) network, a Bluetooth network, or any other suitable wireless network.
  • Test kit 122 may include any suitable hardware, software, or combination thereof (e.g., a personal computer) that is adapted to receive data for a patient (e.g., at least one of clinical, morphometric and molecular data), evaluate the patient's data with a predictive model (e.g., programmed in memory of the test kit), and output the results of the evaluation.
  • test kit 122 may include a computer readable medium encoded with computer executable instructions for performing the functions of the predictive model.
  • test kit 122 may optionally include an image processing tool capable of generating data corresponding to morphometric features from, for example, a tissue sample or image. A suitable image processing tool is described below in connection with Figure 3.
  • test kit 122 may receive pre-packaged data for the morphometric features as input from, for example, an input device (e.g., keyboard) or another device or location.
  • Test kit 122 may optionally include an input for receiving, for example, updates to the predictive model.
  • the test kit may also optionally include an output for transmitting data, such as data useful for patient billing and/or tracking of usage, to a main facility or other suitable device or location.
  • the billing data may include, for example, medical insurance information for a patient evaluated by the test kit (e.g., name, insurance provider, and account number). Such information may be useful when, for example, a provider of the test kit charges for the kit on a per-use basis and/or when the provider needs patients' insurance information to submit claims to insurance providers.
  • Figure 1C shows an illustrative system for generating a predictive model.
  • the system includes analytical tool 132 (e.g., including a support vector machine (SVM) and/or a neural network) and database 134 of patients whose outcomes are at least partially known.
  • Analytical tool 132 may include any suitable hardware, software, or combination thereof for determining correlations between the data from database 134 and a medical condition.
  • the system in Figure 1C may also include image processing tool 136 capable of generating morphometric data based on, for example, a digitized version of an H&E stained tissue image, an actual tissue sample, or both.
  • Tool 136 may generate morphometric data for, for example, the known patients whose data is included in database 134.
  • a suitable image processing tool 136 is described below in connection with Figure 3.
  • Database 134 may include any suitable patient data such as data for clinical features, morphometric features, molecular features, or a combination thereof.
  • Database 134 may also include data indicating the outcomes of patients such as whether and when the patients have experienced disease recurrence.
  • database 134 may include uncensored data for patients (i.e., data for patients whose outcomes are completely known) such as data for patients who have experienced a recurrence of a medical condition.
  • Database 134 may alternatively or additionally include censored data for patients (i.e., data for patients whose outcomes are not completely known) such as data for patients who have not shown signs of disease recurrence in one or more follow-up visits to a physician.
  • the use of censored data by analytical tool 132 may increase the amount of data available to generate the predictive model and, therefore, may advantageously improve the reliability and predictive power of the model.
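One way to picture the distinction between censored and uncensored data is a record that stores a time together with a flag saying whether the event was actually observed. The sketch below is only an assumed layout: `PatientRecord`, its field names and `split_by_censoring` do not come from the source.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class PatientRecord:
    """One training example; all field names are hypothetical."""
    features: Dict[str, float]
    time_months: float      # time to event if observed, else time of last follow-up
    event_observed: bool    # True = uncensored, False = censored


def split_by_censoring(
    records: List[PatientRecord],
) -> Tuple[List[PatientRecord], List[PatientRecord]]:
    """Return (uncensored, censored) records without discarding either group."""
    uncensored = [r for r in records if r.event_observed]
    censored = [r for r in records if not r.event_observed]
    return uncensored, censored
```

Keeping the censored records, rather than dropping them, is what enlarges the training set in the way the text describes.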
  • analytical tool 132 may include a support vector machine (SVM).
  • tool 132 preferably includes an SVM capable of performing support vector regression on censored data (SVRc).
  • a novel modified loss/penalty function is provided for use within an SVM that may allow the SVM to utilize censored data.
  • Data including clinical, molecular and/or morphometric features of known patients from database 134 may be input to the SVM to determine parameters for a predictive model.
  • the parameters may indicate the relative importance of input features, and may be adjusted in order to maximize the ability of the SVM to predict the outcomes of the known patients. Additional details regarding the use of SVM to determine correlations of features with a medical condition are described in [5] and [6].
  • the use of SVRc by analytical tool 132 may include obtaining from database 134 multi-dimensional, non-linear vectors of information indicative of status of patients, where at least one of the vectors lacks an indication of a time of occurrence of an event with respect to a corresponding patient. Analytical tool 132 may then perform regression using the vectors to produce a kernel-based model that provides an output value related to a prediction of time to the event based upon at least some of the information contained in the vectors of information.
  • Analytical tool 132 may use a loss function for each vector containing censored data that is different from a loss function used by tool 132 for vectors comprising uncensored data.
  • a censored data sample may be handled differently because it may provide only "one-sided information." For example, in the case of survival time prediction, a censored data sample typically only indicates that the event has not happened within a given time, and there is no indication of when it will happen after the given time, if at all.
  • the loss function used by analytical tool 132 for censored data may be as follows:
  • $W$ is a vector in $F$
  • $\Phi(x)$ maps the input $x$ to a vector in $F$.
  • the $W$ and $b$ are obtained by solving an optimization problem, the general form of which is: $$\min_{W,b} \ \frac{1}{2} W^{T} W \quad \text{s.t.} \quad y_i - \left(W^{T}\Phi(x_i) + b\right) \le \varepsilon, \quad \left(W^{T}\Phi(x_i) + b\right) - y_i \le \varepsilon$$
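The one-sided treatment of censored samples can be illustrated with a small sketch. The function below is not the SVRc loss function of this disclosure; it is a hypothetical epsilon-insensitive loss, assuming that an uncensored sample is penalized on both sides of the tolerance band while a censored sample is penalized only when the predicted time falls short of the observed follow-up time. The names `svr_loss`, `eps` and `cost` are illustrative assumptions.

```python
def svr_loss(predicted, observed, censored, eps=1.0, cost=1.0):
    """Hypothetical epsilon-insensitive loss with one-sided handling of censored samples."""
    error = predicted - observed
    if censored:
        # Censored: the event had not occurred by `observed`, so only
        # predictions earlier than (observed - eps) are penalized.
        return cost * max(0.0, -error - eps)
    # Uncensored: penalize deviations beyond eps in either direction.
    return cost * max(0.0, abs(error) - eps)
```

With these assumed defaults, a censored patient followed for 5 months and predicted to recur at 10 months incurs no loss, whereas the same prediction for an uncensored patient who recurred at 5 months is penalized.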
  • analytical tool 132 may include a neural network.
  • tool 132 preferably includes a neural network that is capable of utilizing censored data.
  • the neural network preferably uses an objective function substantially in accordance with an approximation (e.g., derivative) of the concordance index (CI) to train an associated model (NNci).
  • the CI has long been used as a performance indicator for survival analysis [7]
  • the use of the CI to train a neural network has not been proposed previously.
  • the difficulty of using the CI as a training objective function in the past is that the CI is non-differentiable and cannot be optimized by gradient-based methods.
  • U.S. Patent Application No. / filed February 25, 2005, and entitled "Methods and Systems for Predicting Occurrence of an
  • this obstacle may be overcome by using an approximation of the CI as the objective function.
  • analytical tool 132 includes a neural network that is used to predict prostate cancer recurrence
  • the neural network may process input data for a cohort of patients whose outcomes with respect to prostate cancer recurrence are at least partially known in order to produce an output.
  • the particular features selected for input to the neural network may be selected through the use of the above-described SVRc (e.g., implemented with a support vector machine of analytical tool 132) or using another suitable feature selection process.
  • An error module of tool 132 may determine an error between the output and a desired output corresponding to the input data (e.g., the difference between a predicted outcome and the known outcome for a patient).
  • Analytical tool 132 may then use an objective function substantially in accordance with an approximation of the CI to rate the performance of the neural network.
  • Analytical tool 132 may adapt the weighted connections (e.g., relative importance of features) of the neural network based upon the results of the objective function. Additional details regarding adapting the weighted connections of a neural network in order to adjust the correlations of features with a predicted outcome are described in [8] and [9].
  • the concordance index may be expressed in the form:
  • $\Omega$ consists of all the pairs of patients $\{i, j\}$ who meet either of the following conditions: both patients $i$ and $j$ experienced recurrence, and the recurrence time $t_i$ of patient $i$ is shorter than patient $j$'s recurrence time $t_j$; or only patient $i$ experienced recurrence and $t_i$ is shorter than patient $j$'s follow-up visit time $t_j$.
  • the numerator of the CI represents the number of times that the patient predicted to recur earlier by the neural network actually does recur earlier.
  • the denominator is the total number of pairs of patients who meet the predetermined conditions.
  • in general, the higher the CI, the more accurate the model.
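The pairwise definition above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the studies; it assumes the convention that a higher prognostic score means earlier predicted recurrence, and the function and argument names are hypothetical.

```python
def concordance_index(scores, times, events):
    """Fraction of comparable pairs that the scores order correctly.

    scores: prognostic scores (assumed: higher score = earlier predicted recurrence)
    times:  recurrence time if events[i] is True, else last follow-up time
    events: True if the patient experienced recurrence
    """
    concordant = 0
    comparable = 0
    n = len(scores)
    for i in range(n):
        if not events[i]:
            continue  # patient i must have experienced recurrence
        for j in range(n):
            if i == j or times[i] >= times[j]:
                continue
            # Pair (i, j) is comparable: i recurred before j's
            # recurrence time or follow-up time.
            comparable += 1
            if scores[i] > scores[j]:
                concordant += 1
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

Both conditions on a pair reduce to the same check in code: patient i recurred, and did so before patient j's recorded time, whether that time is a recurrence or a follow-up visit.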
  • An embodiment of the present invention provides an approximation of the CI as follows:
  • each $R(\hat{t}_i, \hat{t}_j)$ is weighted by the difference between $\hat{t}_i$ and $\hat{t}_j$.
  • This mechanism effectively overcomes over-fitting of the data during training of the model and makes the optimization preferably focus on only moving more pairs of samples in $\Omega$ to satisfy $\hat{t}_i - \hat{t}_j > \gamma$.
  • the influence of the training samples is adaptively adjusted according to the pair- wise comparisons during training.
  • the positive margin $\gamma$ in $R$ is preferable for improved generalization performance. In other words, the parameters of the neural network are adjusted during training by calculating the CI after all the patient data has been entered. The neural network then adjusts the parameters with the goal of minimizing the objective function and thus maximizing the CI.
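Because the exact approximation is not reproduced in this text, the following is only a sketch of how a differentiable, margin-based surrogate of the CI might look. It assumes a penalty of the form (gamma - difference) raised to a power whenever a comparable pair's score difference falls below the margin gamma, and it omits the additional weighting by the score difference mentioned above. The names `comparable_pairs`, `surrogate_objective`, `gamma` and `power` are hypothetical.

```python
def comparable_pairs(times, events):
    """Pairs (i, j) where patient i recurred before patient j's recorded time."""
    n = len(times)
    return [(i, j) for i in range(n) for j in range(n)
            if i != j and events[i] and times[i] < times[j]]


def surrogate_objective(scores, times, events, gamma=0.1, power=2):
    """Hypothetical smooth objective to minimize; lower values mean better ordering."""
    pairs = comparable_pairs(times, events)
    if not pairs:
        raise ValueError("no comparable pairs")
    total = 0.0
    for i, j in pairs:
        diff = scores[i] - scores[j]
        if diff < gamma:
            # Pair is mis-ordered or inside the margin: penalize smoothly.
            total += (gamma - diff) ** power
    return total / len(pairs)
```

Pairs that are already separated by more than the margin contribute nothing, so a gradient-based optimizer spends its effort only on pairs that are mis-ordered or too close.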
  • over-fitting generally refers to the complexity of the neural network. Specifically, if the network is too complex, the network will react to "noisy" data. Overfitting is risky in that it can easily lead to predictions that are far beyond the range of the training data. Additional details regarding systems and methods for using an objective function substantially in accordance with an approximation of the CI to train a neural network are described in above-incorporated U.S. Patent Application No.
  • Figure 3 is a flowchart of illustrative functions of a suitable image processing tool. The functions in Figure 3 relate primarily to the segmentation of tissue images in order to classify pathological objects in the images (e.g., classifying objects as cytoplasm, lumen, nuclei, stroma, background, artifacts, and red blood cells).
  • the image processing tool may include a light microscope that captures tissue images at 20X magnification using a SPOT Insight QE Color Digital Camera (KAI2000) and produces images with 1600 x 1200 pixels.
  • the images may be stored as images with 24 bits per pixel in Tiff format.
  • the image processing tool may also include any suitable hardware, software, or combination thereof for segmenting and classifying objects in the captured images, and then measuring morphometric features of the objects.
  • the image processing tool may include the commercially-available Definiens Cellenger Developer Studio (v.
  • the image processing tool may measure various morphometric features of the objects including spectral-based characteristics (red, green, blue (RGB) channel characteristics, such as mean values, standard deviations, etc.), position, size, perimeter, shape (asymmetry, compactness, elliptic fit, etc.) and relationships to neighboring objects (contrast).
  • the image processing tool may measure these features for every instance of every identified pathological object in the image and may output these features for evaluation by, for example, predictive model 102 (Figure 1A), test kit 122 (Figure 1B), or analytical tool 132 (Figure 1C).
  • the image processing tool may also output an overall statistical summary for the image for each of the measured features. Additional details regarding measuring morphometric features of the classified pathological objects are described below in connection with Tables 1 and 2. The following is a description of the functions shown in Figure 3 of the image processing tool.
  • Initial Segmentation. In a first stage, the image processing tool may segment an image (e.g., an H&E stained tissue microarray (TMA) image or an H&E stained image of a whole tissue section) into small groups of contiguous pixels known as objects.
  • the size of the objects can be varied by adjusting a few parameters [11].
  • an object rather than a pixel is typically the smallest unit of processing.
  • all morphometric feature calculations and operations may be performed with respect to objects. For example, when a threshold is applied to the image, the feature values of the object are subject to the threshold. As a result, all the pixels within an object are assigned to the same class.
  • the size of objects may be controlled to be 10-20 pixels at the finest level. Based on this level, subsequent higher and coarser levels are built by forming larger objects from the smaller ones in the lower level.
  • Background Extraction. The image processing tool may segment the image tissue core from the background (transparent region of the slide) using an intensity threshold and a convex hull.
  • the intensity threshold is an intensity value that separates image pixels into two classes: "tissue core" and "background". Any pixel with an intensity value greater than or equal to the threshold is classified as a "tissue core" pixel; otherwise the pixel is classified as a "background" pixel.
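As a minimal sketch of the thresholding rule just stated, the function below labels each pixel of a small intensity grid. The function name and the threshold value used in the example are assumptions; the disclosure does not specify a threshold or how the intensity is scaled.

```python
def classify_pixels(intensity, threshold):
    """Label each pixel following the stated rule: >= threshold is "tissue core"."""
    return [["tissue core" if value >= threshold else "background" for value in row]
            for row in intensity]


# Example with a hypothetical threshold of 128 on a 2 x 2 grid.
labels = classify_pixels([[10, 200], [128, 127]], threshold=128)
```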
  • the convex hull of a geometric object is the smallest convex set (polygon) containing that object.
  • a set S is convex if, whenever two points P and Q are inside S, then the whole line segment PQ is also in S.
  • Coarse Segmentation. The image processing tool may re-segment the foreground (e.g., TMA core) into rough regions corresponding to nuclei and white spaces.
  • the main characterizing feature of nuclei in H&E stained images is that they are stained blue compared to the rest of the pathological objects. Therefore, the difference in the red and blue channels (R-B) intensity values may be used as a distinguishing feature. Particularly, for every image object obtained in the initial segmentation step, the difference between average red and blue pixel intensity values may be determined. The length/width ratio may also be used to determine whether an object should be classified as nuclei area.
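The red-minus-blue test can be sketched as follows. This is an illustration only: the threshold value, the assumption that a predominantly blue object yields a low (negative) R-B difference, and the function names are not taken from the disclosure, and the length/width ratio check is omitted.

```python
def mean(values):
    return sum(values) / len(values)


def is_nuclei_candidate(pixels, rb_threshold=-20.0):
    """pixels: list of (r, g, b) tuples belonging to one image object.

    Returns True when mean red minus mean blue is below the hypothetical
    threshold, i.e. the object is predominantly blue.
    """
    mean_red = mean([p[0] for p in pixels])
    mean_blue = mean([p[2] for p in pixels])
    return (mean_red - mean_blue) < rb_threshold
```

Because the test is applied to the object's average channel values rather than to single pixels, every pixel in the object receives the same class.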
  • a green channel threshold can be used to classify objects in the tissue core as white spaces.
  • Tissue stroma is dominated by the color red.
  • the white space regions may correspond to both lumen (pathological object) and artifacts (broken tissue areas) in the image. The smaller white space objects (area less than 100 pixels) are usually artifacts.
  • the image processing tool may apply an area filter to classify them as artifacts.
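A sketch of such an area filter is shown below, using the 100-pixel figure mentioned above as the default cutoff. The dictionary layout, the function name and the "lumen candidate" label are illustrative assumptions; larger white-space objects may still turn out to be artifacts.

```python
def label_white_space(objects, min_lumen_area=100):
    """objects: list of dicts with an 'area' key giving the pixel count."""
    return ["artifact" if obj["area"] < min_lumen_area else "lumen candidate"
            for obj in objects]
```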
  • Nuclei De-fusion and Classification. In the stage of coarse segmentation, the nuclei area is often obtained as contiguous fused regions that encompass several real nuclei. Moreover, the nuclei region might also include surrounding misclassified cytoplasm. Thus, these fused nuclei areas may need to be de-fused in order to obtain individual nuclei.
  • the image processing tool may use two different approaches to de-fuse the nuclei. The first approach may be based on a region growing algorithm that fuses the image objects constituting nuclei area under shape constraints (roundness). This approach has been determined to work well when the fusion is not severe.
  • the image processing tool may use a different approach based on supervised learning.
  • This approach involves manual labeling of the nuclei areas by an expert (pathologist).
  • the features of image objects belonging to the labeled nuclei may be used to design statistical classifiers.
  • feature selection may be performed on the training set using two different classifiers: the Bayesian classifier and the k nearest neighbor classifier [12].
  • the leave-one-out method [13] may be used for cross-validation, and the sequential forward search algorithm may be used to choose the best features.
  • two Bayesian classifiers may be designed with number of features equal to 1 and 5, respectively.
  • the class-conditional distributions may be assumed to be Gaussian with diagonal covariance matrices.
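A Bayesian classifier with Gaussian class-conditional distributions and diagonal covariance matrices can be sketched in plain Python as follows. This is a generic illustration of that model family, not the classifiers designed in the study; the function names, the variance floor and the toy data are assumptions.

```python
import math


def fit_gaussian_classes(samples, labels):
    """Estimate per-class priors, feature means and variances (diagonal covariance)."""
    model = {}
    for label in set(labels):
        rows = [s for s, l in zip(samples, labels) if l == label]
        n_features = len(rows[0])
        means = [sum(r[k] for r in rows) / len(rows) for k in range(n_features)]
        # A small floor keeps every variance strictly positive.
        variances = [max(sum((r[k] - means[k]) ** 2 for r in rows) / len(rows), 1e-9)
                     for k in range(n_features)]
        model[label] = (len(rows) / len(samples), means, variances)
    return model


def predict(model, sample):
    """Return the class with the highest log posterior."""
    def log_posterior(prior, means, variances):
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
            for x, m, v in zip(sample, means, variances))
        return math.log(prior) + log_likelihood
    return max(model, key=lambda label: log_posterior(*model[label]))


# Toy two-feature training set with hypothetical class names.
model = fit_gaussian_classes(
    [[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.8]],
    ["rest", "rest", "nuclei", "nuclei"])
```

The diagonal covariance assumption means each feature contributes an independent term to the log likelihood, which keeps the number of estimated parameters small when few labeled objects are available.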
  • the input image may include different kinds of nuclei: epithelial nuclei, fibroblasts, basal nuclei, endothelial nuclei, apoptotic nuclei and red blood cells. Since the number of epithelial nuclei is typically regarded as an important feature in grading the extent of the tumor, it may be important to distinguish the epithelial nuclei from the others.
  • the image processing tool may accomplish this by classifying the detected nuclei into two classes: epithelial nuclei and "the rest” based on shape (eccentricity) and size (area) features.
  • the image processing tool may measure various morphometric features subsequent to the segmenting and classifying of objects in the image by the tool. These morphometric features may be indicative of one or more properties and/or statistics.
  • the object properties may include both spectral properties (e.g., color channel mean values, standard deviations and brightness) and structural/shape properties (e.g., area, length, width, compactness, density).
  • the statistics may include minimum, maximum, mean and standard deviation and may be computed for each property of an image object.
  • Tables 1 and 2 (appended hereto) show various examples of morphometric features that may be measured in accordance with the present invention. The morphometric features in these tables are named using a convention that indicates the various properties and/or statistics measured by these features. The particular naming convention shown in Tables 1 and 2 is adapted from the commercially-available Definiens software product described above and, therefore, will be understood by one of ordinary skill in the art. It will be understood that the computer-generated morphometric features shown in Tables 1 and 2 are only illustrative and that any computer-generated morphometric features may be utilized without departing from the scope of the present invention.
  • Tables 1 and 2 include different sets of morphometric features.
  • the reduced and modified set of features in Table 2 resulted from additional experimentation in the field of prostate cancer recurrence and survival from the time that the study involving Table 1 was performed.
  • the additional experimentation provided additional insight regarding the types of features which may be more likely to correlate with outcome.
  • the inventors expect that continued experimentation and/or the use of other suitable hardware, software, or combination thereof will yield various other sets of computer-generated features (e.g., a subset of the features in Table 2) that may correlate with these and other medical conditions. Referring to Tables 1 and 2, the feature "Lumen.StdDevAreaPxl" illustrates the naming convention:
  • "Lumen" indicates a type of image object
  • "StdDev" indicates a statistic (standard deviation) to be computed using all instances of the identified Lumen
  • "AreaPxl" indicates a feature of an object instance (area as a number of pixels) to be evaluated by the statistic.
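The object / statistic / property naming pattern can be illustrated with a short sketch that builds such feature names from a list of classified objects. Only "Lumen.StdDevAreaPxl" is taken from the text; the `Min`, `Max` and `Mean` names, the dictionary layout, and the use of the population standard deviation are assumptions made for illustration.

```python
import statistics


def summarize(objects, object_class, prop):
    """objects: list of dicts such as {'class': 'Lumen', 'AreaPxl': 120}."""
    values = [o[prop] for o in objects if o["class"] == object_class]
    return {
        f"{object_class}.Min{prop}": min(values),
        f"{object_class}.Max{prop}": max(values),
        f"{object_class}.Mean{prop}": statistics.mean(values),
        # Population standard deviation is an assumption; the source does
        # not say which estimator the software uses.
        f"{object_class}.StdDev{prop}": statistics.pstdev(values),
    }


features = summarize(
    [{"class": "Lumen", "AreaPxl": 100},
     {"class": "Lumen", "AreaPxl": 200},
     {"class": "Stroma", "AreaPxl": 50}],
    "Lumen", "AreaPxl")
```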
  • An image processing tool may measure morphometric features for all the objects previously segmented and classified in the image. For example, the image processing tool may measure morphometric features for objects including "Background," "Cytoplasm," "Epithelial nuclei," "Lumen," "Stroma," "Stroma nuclei" and "Red blood cells." "Background" includes portions of the digital image that are not occupied by tissue.
  • "Cytoplasm" refers to the cytoplasm of a cell, which may be an amorphous area (e.g., pink area that surrounds an epithelial nucleus in an image of, for example, H&E stained tissue).
  • "Epithelial nuclei" refers to the nucleus present within epithelial cells/luminal and basal cells of the glandular unit, which appear as "round" objects surrounded by cytoplasm.
  • "Lumen" refers to central glandular space where secretions are deposited by epithelial cells, which appear as enclosed white areas surrounded by epithelial cells.
  • the lumen can be filled by prostatic fluid (which typically appears pink in H&E stained tissue) or other "debris" (e.g., macrophages, dead cells, etc.). Together the lumen and the epithelial cytoplasm and nuclei form a gland unit.
  • Stroma refers to a form of connective tissue with different density that maintains the architecture of the prostatic tissue. Stroma tissue is present between the gland units, and appears as red to pink in H&E stained tissue.
  • Stroma nuclei are elongated cells with no or minimal amounts of cytoplasm (fibroblasts).
  • This category may also include endothelial cells and inflammatory cells, and epithelial nuclei may also be found scattered within the stroma if cancer is present.
  • Red blood cells are small red round objects usually located within the vessels (arteries or veins), but can also be found dispersed throughout tissue.
  • C2EN in the below tables is a relative ratio of nucleus area to the cytoplasm. The more anaplastic/malignant the epithelial cell is, the more area is occupied by the nucleus.
  • EN2SN is the percent or relative amount of epithelial to stroma cells present in the digital tissue image.
  • L2Core is the number or area of lumen present within the tissue.
  • FIG. 1 is a flowchart of illustrative stages involved in screening for an inhibitor compound in accordance with an embodiment of the present invention.
  • a first dataset for a patient may be obtained that includes one or more of clinical data, morphometric data and molecular data.
  • a test compound may be administered to the patient at stage 404.
  • a second dataset may be obtained from the patient at stage 406.
  • the second dataset may or may not include the same data types (i.e., features) included in the first dataset.
  • the second dataset may be compared to the first dataset, where a change in the second dataset following administration of the test compound indicates that the test compound is an inhibitor compound.
  • Stage 408 of comparing the datasets may include, for example, comparing an output generated by a predictive model of the present invention responsive to an input of the first dataset with an output generated by the predictive model responsive to an input of the second dataset.
  • the inhibitor compound may be a given drug and the present invention may determine whether the drug is effective as a medical treatment for a medical condition.
  • the present invention used clinical and morphometric data to predict the recurrence of prostate cancer
  • an embodiment of the present invention used clinical, morphometric, and molecular data to predict the recurrence of prostate cancer and overall survivability.
  • an embodiment of the present invention was used to predict the occurrence of aggressive disease subsequent to a patient prostatectomy.
  • an embodiment of the present invention was used to predict liver toxicology.
  • Prostate cancer is the leading cause of death among men in the United States with an anticipated 230,000 newly diagnosed cases and nearly 30,000 deaths in 2004.
  • the expanded use of serum-based screening with PSA has offered physicians the ability to detect prostate cancer at an earlier stage (i.e., T1a-c, T2), either localized to the prostate or regionally spread, while only a small percentage are detected at the metastatic stage.
  • the reported benefits of early detection and diagnosis have placed enormous pressure on both the patient and the urologist in selecting the course of treatment.
  • the need for accurate prognosis is critical when selecting initial therapeutic intervention, as the majority of tumors are indolent and require minimal intervention (i.e. 'watchful waiting') while others are more aggressive and early intervention (i.e.
  • Embodiments of the present invention provide a 'Systems Pathology' approach to successfully improve upon the accuracy of a predictive model for PSA / BCR post prostatectomy. This represents an 'individualized' view of the patient's own tumor sample, including quantitative assessment of cellular and microanatomic morphometric characteristics, clinical profiles and molecular markers to create a highly accurate and integrative model of prediction.
  • CD34 is a transmembrane glycoprotein which is present on endothelial cells which line vessels in the human body. Further studies are underway to better understand these observations and the potential impact on predicting prostate cancer progression. Also of note were the selected image segmentation and morphometric characteristics which represent in part a highly accurate, non-subjective and quantitative Gleason Score in addition to several novel tissue descriptors which were important in model development and accuracy.
  • the defined morphometric features relating to the Gleason Scoring System include in part the overall appearance of the glandular structures, shape and size (cytoplasmic composition) of the epithelial cells, epithelial cell nuclei and the demonstration of single epithelial cells admixed in the stroma.
  • the androgen receptor (AR) protein receives naturally occurring androgenic hormones (testosterone and its 5α-reduced metabolite, dihydrotestosterone) after these hormones are synthesized by the Leydig cells of the male testes. Particularly, after being synthesized, these hormones circulate throughout the body and bind to the AR. Androgens, acting through the receptor AR, stimulate development of the male genitalia and accessory sex glands in the fetus, virilization and growth in the pubertal male, and maintenance of male virility and reproductive function in the adult.
  • the androgen receptor, together with other steroid hormone receptors, constitutes a family of trans-acting transcriptional regulatory proteins that control gene transcription through interactions with specific gene sequences.
  • Sovak et al. U.S. Patent No. 6,472,415 proposes that growth of prostate cancer in early stages is androgen driven and can, at least temporarily, be stopped by androgen deprivation.
  • French et al. U.S. Patent No. 6,821,767 proposes various ways for measuring AR that may allow for the use of androgen receptor assays in the diagnostic evaluation of prostate cancer by physicians.
  • these studies have not proposed using measurements of AR in conjunction with automated models that predict the occurrence of prostate cancer, as disclosed herein.
  • Example 1: Prediction of Prostate Cancer Recurrence. Clinical and Morphometric Data
  • the full set of raw features was chosen agnostically to avoid disregarding potentially useful features.
  • all of these morphometric features were not likely to be equally informative, and a prediction model built based on the full feature set would be likely to have poor predictive performance due to the "curse of dimensionality" [13]. So a dimensionality reduction procedure was applied, and a set of eight morphometric features was finally selected.
  • PSA prostate specific antigen
  • BCR biochemical recurrence
  • the four specific clinical measures, or features, considered in this study were (1) the biopsy Gleason grade, (2) the biopsy Gleason score, (3) the post-operative Gleason grade, and (4) the post-operative Gleason score.
  • the morphometric features were analyzed separately from the clinically derived Gleason score feature to predict both the probability and the time to PSA/BCR recurrence.
  • the image and Gleason score (features) were then combined to establish a recurrence and time to recurrence prediction. Improved prediction accuracy achieved by this joint set of features indicated that the image features indeed provided additional information and thus enhanced the recurrence prediction rate and the overall prediction model.
  • the predictive performance of the morphometric features is comparable with that of the Gleason scores, and the combination of the morphometric features and the Gleason scores achieves a higher predictive rate, which confirms that the morphometric features extracted by the tissue image analysis system indeed provide extra information beyond the Gleason scores. Therefore, the use of the morphometric measurements can enhance overall recurrence prediction.
  • Example 2: Prediction of Prostate Cancer Recurrence and Overall Survival. Clinical, Morphometric and Molecular Data. Two studies were conducted which successfully predicted prostate specific antigen (PSA) recurrence with 88% and 87% predictive accuracies, respectively. By combining clinical, molecular, and morphometric features with machine learning, a robust platform was created which has broad applications in patient diagnosis, treatment management and prognostication. A third study was conducted to predict overall survival of prostate cancer patients, where the outcome of interest was death due to any cause. A cohort of 539 patients who underwent radical prostatectomy was studied incorporating high-density tissue microarrays (TMAs) constructed from prostatectomy specimens.
  • H&E hematoxylin and eosin
  • IHC immunohistochemistry
  • TMAs Tissue microarrays
  • Missing values for clinical features were imputed with flexible additive regression models containing all of the features to estimate the value of the missing feature without reference to outcome, and only those patients with complete clinical (after imputation), morphometric, and molecular data, as well as non-missing outcome information, were further studied.
  • the effective sample size for Study 1 consisted of 132 patients. The primary classification of interest was whether or not a patient recurred after surgery for prostate cancer. Patients who had two observed consecutive elevations in PSA > 0.2 ng/mL were considered to have recurrent prostate cancer. If a patient did not recur as of his last visit, or the patient outcome was unknown as of his most recent visit (i.e.
  • Time to recurrence was defined as the time (in months) from radical prostatectomy until PSA (biochemical) recurrence.
  • Study 2 was performed using 268 patients from the original 539 patient cohort including 129 of the 132 patients from Study 1. Instead of utilizing H&E images derived from TMA cores, whole sections from radical prostatectomies were analyzed.
  • Study 3 examined the same 268-patient cohort but was used to predict overall survival, where the outcome of interest was death due to any cause. Image Analysis and Morphometry Studies. Representative areas of the original tumor tissue retrieved from each patient, either from a tissue core or whole section, were digitized and analyzed using the H&E stained slides.
  • a panel of 12 biomarkers including Cytokeratin 18 (luminal cells), Cytokeratin 14 (basal cells), CD45 (lymphocytes), CD34 (endothelial cells), CD68 (macrophages), Ki67 (proliferation), PSA (hK-3, kallikrein), PSMA (growth receptor), Cyclin D1 (cell cycle), p27 (cell cycle), Androgen Receptor (endocrine) and Her-2/neu (signaling) were applied across all 7 TMA blocks with standard chromogenic immunohistochemistry.
  • Antigen retrieval was performed with a 0.01M citrate buffer (pH 6) for 30 min in a pressure cooker for all antibodies. Illustrative methods and systems relating to such a process are described in above-incorporated U.S. Patent Application No. 10/624,233, filed July 21, 2003, and entitled "Methods and compositions for the preparation and use of fixed-treated cell-lines and tissue in fluorescence in situ hybridization."
  • Primary antibodies (shown in Table 5) were diluted in Tris-buffered saline with 0.1% Tween and applied for 16 h at 4 °C followed by biotinylated secondary antibodies (Vector) at 1:1000 dilution for 1 h.
  • Negative control slides received normal mouse serum (DAKO) as the primary antibody. Slides were counterstained with Harris hematoxylin and reviewed by two independent pathologists with all discrepancies resolved by a third pathologist. The recorded IHC data from all 539 patients and their respective triplicate cores included the percentage and intensity (0-3+) of cells which stained for a particular antigen under investigation. Where applicable, these two measures were combined to create a Staining Index for that particular biomarker (Table 6, below, shows an exemplary list of molecular features). A Staining Index was calculated for AR (Androgen Receptor), CK14 (Cytokeratin 14), Cyclin D1, PSA
  • the Staining Index ranged from 0-300, and was calculated as follows: 1*(the percentage of cells staining positive with 1+ intensity for a biomarker) + 2*(the percentage of cells staining positive with 2+ intensity for the biomarker) + 3*(the percentage of cells staining positive with 3+ intensity for the biomarker), where the percentage of cells staining positive refers to the number of positive cells identified per every 100 cells counted. Additional details regarding this staining index are described in [19]. Such a staining index is only illustrative and any other suitable way for measuring molecular features may be used without departing from the scope of the present invention.
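The Staining Index formula translates directly into code. The sketch below follows the stated weighting; the function name and the input validation are assumptions added for illustration.

```python
def staining_index(pct_1plus, pct_2plus, pct_3plus):
    """Staining Index = 1*%(1+) + 2*%(2+) + 3*%(3+), ranging from 0 to 300.

    Each argument is the percentage of cells (per 100 counted) staining
    positive at the given intensity.
    """
    percentages = (pct_1plus, pct_2plus, pct_3plus)
    if any(p < 0 for p in percentages) or sum(percentages) > 100:
        raise ValueError("percentages must be non-negative and sum to at most 100")
    return 1 * pct_1plus + 2 * pct_2plus + 3 * pct_3plus
```

For example, a core in which 20% of cells stain at 1+, 30% at 2+ and 10% at 3+ has a Staining Index of 20 + 60 + 30 = 110.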
  • p27 belongs to the family of cell cycle regulators called cyclin-dependent kinase inhibitors, which bind to cyclin-CDK complexes and cause cell cycle arrest in the G1 phase.
  • the biomarker p27 is postulated to promote apoptosis and play a role in terminal differentiation of some tissues.
  • By immunohistochemistry the loss of nuclear p27 expression is associated with a more aggressive phenotype.
  • Her2/neu is a member of the EGFR family of receptor tyrosine kinases and plays an important role in the pathogenesis of certain human cancers. The over-expression of Her2/neu by immunohistochemistry on cellular membranes has been associated with a more aggressive type of breast cancer.
  • Ki67 is one of many proliferative markers that stains the nucleus with varying degrees of intensity and is utilized to assess a proliferative index or measure of cellular activity of the tumor sample in question.
  • CD45 is a cell surface antigen that is used to identify cells that are destined to become immune cells such as lymphocytes (T cells, B-cells, NK cells etc.). The intensity is believed not to be as important as its distribution / presence and association with other histological elements.
  • CD68 is a cytoplasmic antigen closely associated with lysosomes. It is expressed throughout the monocyte differentiation cascade but is usually more intense in macrophages than monocytes.
  • the concordance index is based on pairwise comparisons between the prognostic scores of two randomly selected patients who meet any one of the following criteria: both patients experienced the event and the event time of the first patient is shorter than that of the second patient or only the first patient experienced the event and his event time is shorter than the second patient's follow-up time.
  • the CI estimates the probability that a patient with the higher prognostic score from the model will experience the event within a shorter time than a patient with a lower score and is tightly associated with the area under the ROC curve (AUC).
  • Other metrics may also be used to measure the ability of a predictive model. For example, sensitivity and specificity may be used in assessing diagnostics.
  • a "p-value" may be used that represents the probability that chance alone is responsible for, for example, the observed differences between strata (e.g., see Figures 8, 10, and 12). Therefore, the lower the p-value, the more likely there is a true statistical association with outcome. Typically, the standard is that any p-value less than or equal to 0.05 is statistically significant.
  • Study 1. In this analysis, the above-described SVRc model was applied sequentially to the clinical, molecular, and morphometric data, with the clinical features first serving as an anchor for a "greedy-forward" feature selection ("FS") algorithm via SVRc run on the molecular data.
  • a second SVRc greedy-forward feature selection algorithm on the morphometric data was run, using the combination of the clinical and selected molecular features as the anchor.
  • the last step involved running a greedy-backward selection algorithm on the combination of the clinical, selected molecular and selected morphometric features to derive a final model.
  • the criterion to determine whether a feature was entered (or kept) in the model was based on whether the presence (or absence) of that feature increased the concordance index, i.e. added predictive information.
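  • The anchored greedy-forward step described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the SVRc procedure itself: `score_fn` is a hypothetical callable standing in for "train a model on these features and return its concordance index", and the feature names and gains in the toy example are invented for demonstration.

```python
def greedy_forward(anchor, candidates, score_fn):
    """Add candidate features one at a time while the score keeps increasing.

    anchor:     features that are always kept (e.g., the clinical features)
    candidates: features that may be added
    score_fn:   hypothetical callable mapping a feature list to a score
    """
    selected = list(anchor)
    remaining = list(candidates)
    best = score_fn(selected)
    while remaining:
        # Score every one-feature extension of the current model.
        scored = [(score_fn(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(scored, key=lambda pair: pair[0])
        if top_score <= best:
            break  # no remaining feature adds predictive information
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected, best


# Toy scoring function (assumption): score in percentage points.
TOY_GAINS = {"clinical": 20, "AR_index": 10, "Ki67": 0, "noise": -5}


def toy_score(features):
    return 50 + sum(TOY_GAINS[f] for f in features)


selected, best = greedy_forward(["clinical"], ["AR_index", "Ki67", "noise"], toy_score)
```

  • A greedy-backward pass would mirror this loop, removing a feature whenever its removal increases the score.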
  • the model was evaluated for predictive accuracy using both internal and external validation. Internal validation was performed using five-fold cross-validation.
  • a neural network of the type described above was used, in which the network was trained using an objective function substantially in accordance with an approximation of the concordance index.
  • the output of this final model was used to estimate individual future patient risk for PSA recurrence.
  • Study 2. The goals of this study were identical to those of Study 1; however, different feature selection and validation procedures were used. Instead of using the anchoring approach, all of the features were ranked by their association with time to PSA recurrence (measured by the concordance index) and those features which passed a certain pre-determined threshold (CI > 0.60) were selected. This was done after the number of imaging features was reduced by our domain experts, and these features were then evaluated in a series of n-feature models (e.g.
  • Patent No. 6,409,664 which is hereby incorporated by reference herein in its entirety.
  • the output of the final model was used to estimate individual future patient risk for PSA recurrence.
  • Study 3 The goal of this study was to identify features predictive of overall survival using the same cohort and feature set analyzed in Study 2 as well as the same feature selection algorithm.
  • the output of the final model was used to estimate individual future patient risk for death due to any cause.
  • RESULTS. The general approach was to apply systems pathology (the combination of morphometric analyses, molecular signatures and patient clinical profiles) to develop predictive models for PSA recurrence and overall survival in a cohort of prostate cancer patients status post prostatectomy.
  • Biopsy Gleason Score (two columns of counts and percentages per score): score 2: 0 (0.0%), 1 (0.4%); score 3: 0 (0.0%), 0 (0.0%); score 4: 6 (4.6%), 7 (2.6%); score 5: 27 (20.5%), 56 (20.9%); score 6: 41 (31.1%), 97 (36.2%); score 7: 48 (36.4%), 90 (33.6%); score 8: 7 (5.3%), 13 (4.9%); score 9: 3 (2.3%), 4 (1.5%).
  • Percent Ploidy Fraction (two columns): mean 3.4, 3.5; median 2.6, 2.4; range 0.0 - 20.0, 0.0 - 20.0.
  • Figures 5a and 5b illustrate digitized images of healthy and abnormal prostate tissue, respectively, obtained after segmentation and classification in accordance with the present invention.
  • Various pathological objects have been labeled in the tissue for illustration.
  • a total of 496 morphometric features (shown in Table 1, above) were generated by the image analysis software. Of the 496 morphometric features, the 10 morphometric features shown in Figure 6 were selected as being predictive of PSA recurrence.
  • the morphometric features of length for red blood cell, radius of smallest enclosure and border length for lumen, border length for cytoplasm, density for stroma (e.g., square root of the area covered by a stroma divided by its radius), and area for background were determined to correlate with outcome.
  • the morphometric features of compactness, width, green channel value, and radius of smallest enclosure for epithelial nuclei were also determined to correlate with outcome.
  • For the radius of smallest enclosure of the epithelial nuclei, for example, an ellipse with the same area as the object is created and then enlarged until it completely encloses the epithelial nuclei, and the ratio of the radius of the smallest enclosing ellipse to the radius of the original ellipse is computed.
  • the morphometric feature of compactness of the epithelial nuclei may be a reflection of the 'back to back' nature of epithelial cells in a circumferential pattern which would suggest a loss of glandular and lumen formation / differentiation and therefore be consistent with a higher Gleason grade (i.e., higher disease progression).
  • the morphometric feature of the radius of smallest enclosure of the lumen relates to the overall size of the lumen which is dramatically reduced and diminished as the Gleason grade increases.
  • the correlations determined in this study may be at least partially explained by the hypothesis that epithelial nuclei typically become less diverse in shape (e.g., more round with less variations) and size (e.g., area and border length) and have less color variation as the epithelial nuclei invade the stroma. This invasion of the stroma may also explain why morphometric features of the stroma have been determined to be correlated with disease progression. Particularly, cancerous images are typically characterized by a small amount of stroma because the stroma area is replaced by epithelial cell cytoplasm as cancer progresses.
  • biomarkers encompassing 14 specific molecular features were selected as being associated with PSA recurrence.
  • Some examples of the more highly selected molecular features are annotated as follows (biomarker - # times selected by the model) and include: AR Staining Index - tumor (93), AR Staining Index - atrophic gland (54), CD34 - associated Tumor / PIN (22), Ki-67 - tumor (18) and CD45 - associated with PIN (17), where PIN is an abbreviation for prostatic intraepithelial neoplasm.
  • Figures 7a and 7b illustrate representative fields demonstrating expression profiles for AR and CD34, respectively.
  • Biopsy Gleason Score: the summarized Gleason grades (dominant and secondary) which are assigned to the multiple Needle Biopsy Tissue Samples received by a pathologist.
  • the Gleason scoring system was developed to create a standardized, somewhat subjective, means of representing the architecture of prostatic adenocarcinoma by histology with the production of individual grades.
  • the grades range from 1 - 5 based on the degree of differentiation of the glandular units and epithelial cells.
  • the dominant (primary) and sub-dominant (secondary) patterns are added together to create a Gleason Summary.
  • the features of overall stromal compactness, epithelial cell size and nuclear features are occasionally considered in the overall grading system.
  • Race: e.g., African American, Caucasian, etc.
  • UICC Stage: International Union against Cancer TNM staging system used to define clinical staging for cancer, where "T" stands for Tumor size, "N" stands for lymph node involvement and "M" stands for metastasis to a distant site.
  • DNA content which is a reflection of the overall DNA content within the prostate cancer epithelial cells. Benign cells and well-behaved tumor cells grow and divide in an orderly fashion. In the resting state, they contain one complete set of chromosomes (this is the diploid condition). This complete set of chromosomes consists of 23 chromosomes (or N) from Ma and 23 (N again) chromosomes from Pa (equaling a total of 2N). A cell must double the number of its chromosomes before it can divide, creating two complete sets of chromosomes (this is 4N, or the tetraploid state).
  • After division is completed, each new cell receives half of the genetic material and therefore becomes diploid (2N) once again. If DNA ploidy analysis were to be performed on a group of these cells, one would see that most of the cells would be diploid and a small fraction of them (those getting ready to divide) would be tetraploid. Additionally, in measuring and creating a graph of the amount of genetic material in each cell, one would see a dominant diploid peak and a minor tetraploid peak.
  • the amount of DNA in a cell can be measured by staining it with a dye that binds to the genetic material. The concentration and distribution of this dye (Feulgen stain) can be measured by image analysis microscopy. When tumors worsen they tend to not divide as orderly as they once did.
  • the resting state may only have a set and a half. Such cells would have a DNA content that was neither diploid nor tetraploid but mid-way between. Plotting these cells on the above-described graph would yield an aneuploid peak midway between the other two peaks. Studies have shown that tumors that have a significant aneuploid peak do not behave as well as those that do not. This is not surprising because a strong correlation exists between ploidy status and nuclear grade. A nuclear grade can be assessed by any pathologist with enough experience with prostate cancer.
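  • The diploid, tetraploid and aneuploid peaks described above can be illustrated with a short sketch that bins per-cell DNA content. It is a simplified illustration, not the image analysis procedure used in the studies; the function names, the expression of DNA content in units of N, and the tolerance `tol` around the 2N and 4N peaks are all assumptions, and every cell outside both windows is simply labeled aneuploid.

```python
from collections import Counter


def classify_ploidy(dna_content_n, tol=0.25):
    """Label one cell from its DNA content in units of N (tol is a hypothetical window)."""
    if abs(dna_content_n - 2.0) <= tol:
        return "diploid"
    if abs(dna_content_n - 4.0) <= tol:
        return "tetraploid"
    return "aneuploid"


def ploidy_fractions(cells, tol=0.25):
    """Fraction of cells falling in each ploidy class."""
    counts = Counter(classify_ploidy(c, tol) for c in cells)
    total = len(cells)
    return {k: counts.get(k, 0) / total for k in ("diploid", "tetraploid", "aneuploid")}


# Toy measurements (invented): mostly diploid, one dividing cell, one in between.
fractions = ploidy_fractions([2.0, 2.1, 1.9, 4.0, 3.0])
```

  • A sample with a large aneuploid fraction would correspond to the significant aneuploid peak discussed above.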
  • DRE Result: result from a digital rectal exam (e.g., negative or positive), which is utilized to determine extent of disease both within the prostate as well as extra-prostatic extension by palpation.
  • Lymph Node Involvement: a measure of the extent to which lymph nodes contain tumor cells (e.g., prostate cancer epithelial cells), which can be assessed either by clinical / surgical inspection or at the time of a prostatectomy.
  • Dominant Biopsy Gleason Grade: see above description of Biopsy Gleason Score. This reflects the dominant Gleason grading pattern seen on either a biopsy or a prostatectomy specimen.
  • Percent Ploidy in S Phase: represents a fraction of the cellular content which is in a proliferative or S phase of the cell cycle and reflects the growth potential of the tumor.
  • Post-operative Gleason Score: scoring of tissue taken after surgery from various regions of the prostate resection sample.
  • TNM Stage: Tumor, Node and Metastasis based on the UICC criteria post prostatectomy and based on pathologic examination of tissue samples.
  • Dominant Post-operative Gleason Grade: the dominant Gleason grade which represents the most predominant histologic feature present in the prostatectomy specimen.
  • Surgical Margin Involvement: involvement of the surgical margins by tumor, which reflects the extent to which the bed from which the tumor/prostate was removed at the time of surgery contained tumor cells.
  • 17. Extracapsular Involvement: extension of the tumor beyond the capsule of the prostate.
  • Molecular Features
  • AR - tumor: Androgen Receptor (AR) Staining Index for a tumor, which is a measure of the percentage and intensity of cells staining positive for AR. With respect to prostate cancer, the staining index may represent the degree of brown reaction product which is detected in the nuclei of epithelial cells in the prostate samples evaluated.
  • AR - gland: AR Staining Index for a tumor, which is present within a glandular structure.
  • CD34 - tumor/PIN: the localization of CD34 to the endothelial cells of vessels which are associated with tumor and PIN.
  • Ki67 - tumor 2: the identification of ki67 positive nuclei in tumor epithelial cell nuclei.
  • CD45 - PIN 3: the identification of CD45 positive lymphocytes in association with PIN.
  • CD34 - tumor/stroma: the localization of CD34 vessels which are associated with tumor.
  • Ki-67 - tumor 3: see above.
  • p27 - tumor: the identification of p27 in the nuclei of tumor epithelial cells.
  • C14 - PIN: the identification of cytokeratin 14 in the (epithelial) basal cells of the glandular unit.
  • CD34 - tumor: the localization of CD34 to vessels which are associated with the tumor.
  • PSA - gland: the identification of PSA to the luminal epithelial cells of the gland unit.
  • PSMA - PIN: the identification of PSMA to the glandular / luminal cells of regions identified as PIN.
  • CD34 - PIN/stroma: the localization of CD34 to vessels associated with PIN.
  • CD45 - tumor 3: the identification of CD45 positive lymphocytes which are associated with tumor.
  • Figure 9 shows that, of the 350 features, 6 morphometric features were selected as being predictive of PSA recurrence, where these morphometric features related to the pathological objects of epithelial nuclei, stroma, cytoplasm, red blood cell, and lumen (i.e., EpithelialNucleiMinCompactne0215, StromaMaxStddevChannel30569, CytoplasmStddevMaxDiff0148, RedBloodCellMeanAreaPxl0386,
  • AR Staining Index. Each number in Figure 9 represents the concordance index of a predictive model based on the corresponding feature and all other feature(s) in Figure 9 having smaller number(s).
  • 0.8483 is the CI of a model based on features TNM Clinical Stage, Surgical Margins, EpithelialNucleiMinCompactne0215, Lymph Nodes, and StromaMaxStddevChannel30569.
  • the CI of a model based on the same 5 features plus AR Staining Index (tumor) is 0.8528. In other words, the addition of the AR Staining Index molecular feature to the model increases the predictive power of the model.
  • Molecular Analysis. No additional immunohistochemistry studies were necessary. The data originally collected was used as described in Materials and Methods (see Appendix, Tables 9a, 9b, and 9c for a complete summary of the molecular features).
  • a single molecular feature was selected as being predictive of PSA recurrence: AR Staining Index - tumor.
  • the resulting output of the SVRc model can also be interpreted as a relative risk estimate of PSA recurrence for an individual patient.
  • the morphometric features of mean value of red color channel, mean value of blue color channel and max difference for stroma were determined to be correlated with outcome.
  • the morphometric features of mean and standard deviation of red channel, mean and standard deviation of green channel and elliptic fit for red blood cell were determined to be correlated with outcome.
  • To determine the morphometric feature of elliptic fit, an ellipse with the same area as the red blood cell was created, the area of the red blood cell outside the ellipse was compared with the area inside the ellipse that was not filled with the red blood cell, and a value of 0 was assigned where there was no fit whereas a value of 1 was assigned for a complete fitting object.
  • the morphometric features of border length, area and elliptic fit for epithelial nuclei were determined to be correlated with outcome. Various possible reasons for at least some of these correlations are described above in connection with Example 1 and/or Study 1. For example, the overall shape of the epithelial nuclei reflects a histologic appearance of a higher Gleason grade. Additionally, in this study, the correlation with respect to stroma may be explained by the understanding that stroma will exhibit a reduced contrast (as measured by the max difference morphometric feature) as cancer progresses due to its interruption with epithelial cells. Molecular Analysis. The same set of molecular features from Study 2 was used in this study.
  • psapsi refers to the staining index for prostate specific antigen (PSA) in the prostatic intraepithelial neoplasm (PIN).
  • Each number in Figure 11 represents the concordance index of a predictive model based on the corresponding feature and all other feature(s) in Figure 11 having smaller number(s).
  • 0.6804 is the CI of a model based on StromaMinMeanChannel10535
  • 0.7362 is the CI when the model is based on both StromaMinMeanChannel10535 and TNM.
  • the resulting output of the SVRc model can also be interpreted as a relative risk estimate of death for an individual patient.
  • risk groups of patients were created; the Kaplan-Meier estimates of recurrence for each risk group as predicted by the SVRc model are presented in Figure 12. Using the log-rank test, a significant difference in survival was observed between risk groups (p < 0.0001).
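  • The Kaplan-Meier estimate referred to above can be sketched for a single risk group as follows. This is a minimal, textbook-style illustration rather than the software used in the studies; the function name `kaplan_meier` and the toy data are assumptions. In practice the function would be applied separately to the patients in each risk group.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times:  event time, or follow-up time for censored patients
    events: 1 if the event was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all patients sharing the same time point.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        # Patients with an event or censored at t leave the risk set.
        n_at_risk -= n_removed
    return curve


# Toy risk group (invented): the patient followed to time 2 is censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

  • Censored patients lower the number at risk without producing a step in the curve, which is why no point is emitted at time 2 in the toy data.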
  • Discussion of Results (Example 2). The observed reduction of (composite) selected features from Study 1 (41) to Study 2 (10) while retaining the predictive accuracy of the model emphasized the precision and filtering attributes that were achieved through different machine learning algorithms.
  • the concordance index of the model that was developed in the 268-patient cohort was 0.87; by comparison, when the Kattan nomogram [20] is applied to this cohort it achieved a concordance index of 0.78.
  • the successful end result was the ability to predict with 80% accuracy an individual's overall survival and time to death utilizing a total of 14 combined domain features. Although limited by the small number of events (7% dead from any cause) and absence of a comparable published nomogram, the results further support the use of a systems approach for developing these types of predictive tests. Additional efforts are underway with respect to expanding this 'overall survival' analysis to include clinical measures of poor outcome (i.e., metastasis and/or death due to prostate cancer) utilizing a retrospective multi-institutional population with an independent external validation study. In addition, a 'Systems Pathology' approach recently has been initiated to interrogate diagnostic needle biopsies in order to have an impact on treatment issues prior to surgery.
  • Example 3: Prediction of Aggressive Disease Subsequent to Prostatectomy. Clinical and Morphometric Data. This study was undertaken to predict aggressive disease (i.e., clinical failure as demonstrated by a positive bone scan representing metastatic prostate cancer to bone) subsequent to a patient having a prostatectomy. Prior to the present invention, no accurate analytical tools existed for providing such a prediction. As described above, the systems pathology approach of the present invention has been shown to accurately predict PSA recurrence. This study demonstrates that the present invention can also be used to accurately predict distant bone metastasis after prostatectomy. A cohort of 119 patients who underwent radical prostatectomy was studied incorporating tissue microarrays (TMAs) constructed from prostatectomy specimens.
  • Morphometric (i.e., image analysis) studies were performed using hematoxylin and eosin (H&E) stained tissue sections, and biological determinants were assessed with immunohistochemistry (IHC) utilizing a series of biomarkers selected for their potential biological relevance for prostate cancer progression.
  • a predictive model for clinical failure i.e., positive bone scan
  • Predictive performance of the model was estimated using the concordance index (CI) with generated scores used to define risk groups.
  • Example 4: Liver Toxicology. Morphometric Data. This study was undertaken to demonstrate image analysis and statistical modeling capabilities in the area of toxicology. Specifically, the study called for the acquisition and analysis of sections of rat liver with the overall objective being to classify the sections as normal or abnormal. Being able to automate this process while simultaneously achieving a high level of classification accuracy could allow for the creation of a high-throughput platform used to objectively screen for toxicities in pre-clinical studies. The study was divided into two phases. The initial phase used a set of 100 rat liver sections as a training set; 80 normal liver sections and 20 abnormal. This set of sections was used to develop an image analysis application using the tissue image analysis system described above as well as perform feature and model selection to classify the sections.
  • the established image analysis process was then applied to an unlabeled set of 100 rat liver sections in the second phase of the study in which the statistical models designed in the training phase were tested. Segmentation Accuracy The global segmentation accuracy for all objects, as measured by a pathologist's assessment, was 80% - 90%. Statistics The statistical component of the study involved two steps. The first step involved selecting features from the imaging data generated by the image analysis of the sections. Reducing the number of features used for classification may improve the robustness and reliability of the classification of the sections. The second step involved both training a model using the selected feature set and labels for each section (abnormal, normal) and then testing the model by predicting the classification of an independent set of rat liver sections where the labels were unknown.
  • Feature Selection. The statistical measurements generated for each of the above objects were:
    - Number of objects
    - Relative area (percent, in relation to total area of image)
    - Minimum size (in pixels)
    - Maximum size (in pixels)
    - Average size (in pixels)
    - Standard deviation of the size
  • Since multiple images were analyzed per section, these measures were themselves averaged across all images for an individual rat liver section. The total number of original features was 378. Feature selection also involved two steps. The first step utilized domain expertise. A pathologist selected features from the original feature list generated by the image analysis of the sections. The decision to include or exclude features was based on the understanding of the pathology of the liver and potential abnormalities/toxicities that could be encountered.
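  • The six per-object measurements and their averaging across the images of a section can be sketched as follows. This is an illustration under stated assumptions, not the image analysis application described above: the function names and input layout are hypothetical, the population standard deviation is assumed because the text does not say which form was used, and an image with no objects of the class is assumed to contribute zeros.

```python
from statistics import mean, pstdev


def image_features(object_sizes, image_area):
    """Six measurements for one object class in one image (sizes and area in pixels)."""
    if not object_sizes:
        # Assumption: an image with no objects of this class contributes zeros.
        return {"count": 0, "relative_area_pct": 0.0, "min_size": 0,
                "max_size": 0, "mean_size": 0.0, "std_size": 0.0}
    return {
        "count": len(object_sizes),
        "relative_area_pct": 100.0 * sum(object_sizes) / image_area,
        "min_size": min(object_sizes),
        "max_size": max(object_sizes),
        "mean_size": mean(object_sizes),
        "std_size": pstdev(object_sizes),  # assumption: population form
    }


def section_features(images):
    """Average each measurement across all images of one liver section.

    images: list of (object_sizes, image_area) tuples, one per image
    """
    per_image = [image_features(sizes, area) for sizes, area in images]
    return {key: mean(f[key] for f in per_image) for key in per_image[0]}


# Toy section (invented numbers): two images of 1000 pixels each.
section = section_features([([10, 20, 30], 1000), ([40], 1000)])
```

  • Repeating this for every object class yields one feature vector per section, which is the input to the feature selection steps described above.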
  • the selected features were then entered into a linear discriminant analysis (LDA) which classified each of the liver sections as abnormal or normal.
  • the output of the model was corrected for potential bias via cross-validation.
  • Neural networks were also explored as a classifier.
  • the selected features were used as the inputs to the neural network model, which is a standard multilayer perceptron (MLP) structure with zero hidden units and direct connection between the input and output layers.
  • the model was trained by trying to directly maximize an approximation to the area under the ROC curve, which is explained below. It was found that the MLP model trained by this criterion achieves better accuracy than an MLP model trained by the typical criteria, e.g., mean square error and cross entropy.
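  • One way such a criterion could be set up is sketched below for a network with zero hidden units, i.e., a linear scoring function. The specific surrogate used here, the mean of a sigmoid applied to every abnormal-minus-normal score difference, is an assumption chosen for illustration and is not taken from the text; the function names, learning rate, step count and toy data are likewise hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(w, x):
    """Linear score of one feature vector (no hidden units)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_auc_linear(pos, neg, lr=0.1, steps=200):
    """Gradient ascent on mean sigmoid(score(pos) - score(neg)) over all pairs.

    The smooth surrogate replaces the step function in the pairwise AUC,
    so its gradient can be followed directly (assumed surrogate).
    """
    n_features = len(pos[0])
    w = [0.0] * n_features
    n_pairs = len(pos) * len(neg)
    for _ in range(steps):
        grad = [0.0] * n_features
        for xp in pos:
            for xn in neg:
                diff = [a - b for a, b in zip(xp, xn)]
                s = sigmoid(score(w, diff))
                for k in range(n_features):
                    grad[k] += s * (1.0 - s) * diff[k] / n_pairs
        w = [wi + lr * g for wi, g in zip(w, grad)]
    return w


# Toy feature vectors (invented): two abnormal and two normal sections.
abnormal = [[2.0, 1.0], [3.0, 0.5]]
normal = [[0.5, 1.0], [1.0, 2.0]]
weights = train_auc_linear(abnormal, normal)
```

  • A bias term is omitted because it cancels in every score difference and therefore does not affect the ranking.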
  • the output from both models was used to create a receiver operating characteristic (ROC) curve and to obtain the area under the ROC curve (AUC).
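  • The area under the ROC curve can be computed without drawing the curve, because it equals the probability that a randomly chosen abnormal section receives a higher model output than a randomly chosen normal section. The sketch below uses that pairwise form; the function name `auc`, the label coding (1 = abnormal, 0 = normal) and the half credit for ties are assumptions.

```python
def auc(scores, labels):
    """Pairwise estimate of the area under the ROC curve.

    scores: model outputs (assumed: higher = more likely abnormal)
    labels: 1 for abnormal, 0 for normal
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # assumption: ties get half credit
    return wins / (len(pos) * len(neg))


# Toy outputs (invented): one abnormal section is ranked below a normal one.
toy_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

  • Three of the four abnormal/normal pairs in the toy data are ordered correctly.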
  • test key labels were compared with the predicted classifications of the linear discriminant function and those of the neural networks. Based on the key, the results are summarized in Tables 11a and 11b as follows:
  • the cut point used for the LDA classifier equaled 0.0031; the cut point used for the NN classifier equaled 0.0002. Both correspond to the system requirements of 100% sensitivity and 90% specificity. Discussion. Based on the sensitivity and specificity of each classifier after applying them to the test set, LDA outperformed NN. The LDA classifier achieved a sensitivity of 86% which means that this classifier correctly labeled the abnormal rat liver sections as abnormal 86% of the time, as opposed to the neural network classifier which achieved a sensitivity of 73%. Specificity for both classifiers was 63%.
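  • The relationship between a cut point and the resulting sensitivity and specificity can be sketched as follows. This is a generic illustration with invented data, not a reproduction of the reported results; the function name and the convention that an output at or above the cut point is called abnormal are assumptions.

```python
def sensitivity_specificity(scores, labels, cut_point):
    """Sensitivity and specificity of a classifier at one cut point.

    scores: model outputs; labels: 1 for abnormal, 0 for normal.
    Assumption: an output >= cut_point is classified as abnormal.
    """
    tp = fn = tn = fp = 0
    for s, l in zip(scores, labels):
        predicted_abnormal = s >= cut_point
        if l == 1:
            if predicted_abnormal:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_abnormal:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Toy outputs (invented): two abnormal and three normal sections.
sens, spec = sensitivity_specificity([0.9, 0.4, 0.2, 0.7, 0.1], [1, 1, 0, 0, 0], 0.3)
```

  • Lowering the cut point raises sensitivity at the expense of specificity, which is why a cut point chosen on the training set may not deliver the same pair of values on an independent test set.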
  • the computer system may be any suitable apparatus, system or device.
  • the computer system may be a programmable data processing apparatus, a general purpose computer, a Digital Signal Processor or a microprocessor.
  • the computer program may be embodied as source code and undergo compilation for implementation on a computer, or may be embodied as object code, for example. It is also conceivable that some or all of the functionality ascribed to the computer program or computer system aforementioned may be implemented in hardware, for example by means of one or more application specific integrated circuits. Suitably, the computer program can be stored on a carrier medium in computer usable form, which is also envisaged as an aspect of the present invention.
  • the carrier medium may be solid-state memory, optical or magneto-optical memory such as a readable and/or writable disk, for example a compact disk (CD) or a digital versatile disk (DVD), or magnetic memory such as disc or tape, and the computer system can utilize the program to configure it for operation.
  • the computer program may also be supplied from a remote source embodied in a carrier medium such as an electronic signal, including a radio frequency carrier wave or an optical carrier wave.
  • a method of evaluating a risk of prostate cancer recurrence in a patient comprising: receiving a patient dataset for the patient; and evaluating the patient dataset with a model predictive of prostate cancer recurrence, wherein the model is based on a staining index of an Androgen Receptor (AR), thereby evaluating the risk of prostate cancer recurrence in the patient.
  • a method of generating a model predictive of a medical condition comprising: receiving for each of two or more subjects whose outcome with respect to the medical condition is at least partially known, a training subject dataset comprising one or more clinical feature(s), one or more molecular feature(s), and one or more computer-generated morphometric feature(s) generated from a tissue image; and performing a multivariate analysis on the training subject datasets, thereby generating the model predictive of the medical condition.
  • one or more of the molecular feature(s) is from the group of molecular features listed in Table 6.
  • the model is based on a selected subset of said one or more clinical feature(s), one or more molecular feature(s), and one or more computer-generated morphometric feature(s) in said training subject datasets, which subset is selected as being predictive of said medical condition.
  • the model is predictive of prostate cancer recurrence and the selected subset of feature(s) comprises one or more of the morphometric feature(s) from the group of morphometric features consisting of a morphometric feature of a red blood cell, a morphometric feature of epithelial nuclei, a morphometric feature of stroma, a morphometric feature of lumen, a morphometric feature of cytoplasm, and a morphometric feature of tissue background.
  • the predictive model comprises a concordance index of at least about 0.88.
  • the predictive model comprises a p value less than about 0.0001 for a log-rank test.
  • the model is predictive of prostate cancer recurrence and the selected subset of feature(s) comprises one or more of the morphometric feature(s) from the group of morphometric features consisting of a morphometric feature of a red blood cell, a morphometric feature of epithelial nuclei, a morphometric feature of stroma, a morphometric feature of lumen, and a morphometric feature of cytoplasm.
  • the selected subset of feature(s) comprises one or more of the clinical feature(s) from the group of clinical features listed in Figure 9.
  • the model is predictive of prostate cancer survival and the selected subset of feature(s) comprises one or more of the morphometric feature(s) from the group of morphometric features consisting of a morphometric feature of a red blood cell, a morphometric feature of epithelial nuclei, and a morphometric feature of stroma.
  • a method of screening for an inhibitor compound comprising: receiving a first dataset for a patient; evaluating the first dataset with a model predictive of a medical condition, wherein the model is based on one or more clinical feature(s), one or more molecular feature(s), and one or more computer-generated morphometric feature(s) generated from one or more tissue image(s); administering to the patient a test compound; receiving a second dataset for the patient following the administering of the test compound; evaluating the second dataset with the model; and comparing results of the evaluation of the first dataset with results of the evaluation of the second dataset, wherein a change in the results for the second dataset with respect to the results for the first dataset indicates the test compound is an inhibitor compound.
  • An apparatus for evaluating a risk of prostate cancer recurrence in a patient comprising: a model predictive of prostate cancer recurrence, wherein the model is based on a staining index of an Androgen Receptor (AR), and wherein the model is configured to: receive a patient dataset for the patient; and evaluate the patient dataset, thereby evaluating the risk of prostate cancer recurrence in the patient.
  • An apparatus for generating a model predictive of a medical condition comprising: an analytical tool configured to: receive for each of two or more subjects whose outcome with respect to the medical condition is at least partially known, a training subject dataset comprising one or more clinical feature(s), one or more molecular feature(s), and one or more computer-generated morphometric feature(s) generated from a tissue image; and perform a multivariate analysis on the training subject datasets, thereby generating the model predictive of the medical condition.
  • said classifying the one or more objects into one or more object classes by the image processing tool comprises classifying by the image processing tool each of one or more of the objects into a class from the group of classes consisting of stroma, cytoplasm, epithelial nuclei, stroma nuclei, lumen, red blood cells, tissue artifacts, and tissue background.
  • said taking one or more measurements pertaining to the one or more object classes with the image processing tool comprises taking with the image processing tool one or more measurements of one or more spectral properties and/or one or more shape properties of the one or more object classes.
  • the model is predictive of prostate cancer recurrence and the selected subset of feature(s) comprises one or more of the morphometric feature(s) from the group of morphometric features consisting of a morphometric feature of a red blood cell, a morphometric feature of epithelial nuclei, a morphometric feature of stroma, a morphometric feature of lumen, a morphometric feature of cytoplasm, and a morphometric feature of tissue background.
  • the selected subset of feature(s) comprises one or more of the clinical feature(s) listed in Figure 6.
  • the model is predictive of prostate cancer recurrence and the selected subset of feature(s) comprises one or more of the morphometric feature(s) from the group of morphometric features consisting of a morphometric feature of a red blood cell, a morphometric feature of epithelial nuclei, a morphometric feature of stroma, a morphometric feature of lumen, and a morphometric feature of cytoplasm.
  • the model is predictive of prostate cancer survival and the selected subset of feature(s) comprises one or more of the morphometric feature(s) from the group of morphometric features consisting of a morphometric feature of a red blood cell, a morphometric feature of epithelial nuclei, and a morphometric feature of stroma.
  • the selected subset of feature(s) comprises one or more of the clinical feature(s) from the group of clinical features listed in Figure 11.
  • the selected subset of feature(s) comprises one or more of the molecular feature(s) from the group of molecular features listed in Figure 11.
  • An apparatus for screening for an inhibitor compound comprising: a predictive model based on one or more clinical feature(s), one or more molecular feature(s), and one or more computer-generated morphometric feature(s) generated from one or more tissue image(s), wherein the predictive model is configured to: receive a first dataset for a patient; evaluate the first dataset according to the model; receive a second dataset for the patient following the administering of a test compound to the patient; and evaluate the second dataset according to the model, wherein a comparison of results from the evaluation of the first dataset with results from the evaluation of the second dataset indicates that the test compound is an inhibitor compound when there is a change in the results for the second dataset with respect to the results for the first dataset.
  • An apparatus for evaluating the risk of occurrence of a medical condition in a patient comprising: a model predictive of the medical condition, wherein the model is based on one or more computer-generated morphometric feature(s) generated from one or more tissue image(s) and wherein the model is configured to: receive a patient dataset for the patient; and evaluate the patient dataset according to the model, thereby evaluating the risk of occurrence of the medical condition in the patient.
  • the patient dataset comprises a patient dataset based on a liver tissue image for the patient and wherein the predictive model is configured to determine whether the liver tissue is normal or abnormal.
  • said model is based on the one or more computer-generated morphometric feature(s) and one or more clinical feature(s).
  • the patient dataset comprises a patient dataset based on a prostate tissue image for the patient and wherein the predictive model is configured to make a prediction with respect to prostate cancer recurrence for the patient.
  • the patient dataset comprises a patient dataset based on a prostate tissue image for the patient and wherein the predictive model is configured to make a prediction with respect to clinical failure for the patient.
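
The inhibitor-screening apparatus recited above evaluates one patient dataset before a test compound is administered and a second one afterwards, then compares the two results. The following is a minimal sketch of that comparison only, not the applicants' implementation. It assumes a simple linear scoring model; the class `LinearRiskModel`, the function `indicates_inhibitor`, the `min_change` tolerance, and every feature name and weight are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class LinearRiskModel:
    """Hypothetical stand-in for a trained predictive model.

    The weights are invented for illustration; in practice they would come
    from a multivariate analysis of training subject datasets.
    """
    weights: Dict[str, float]
    bias: float = 0.0

    def evaluate(self, dataset: Dict[str, float]) -> float:
        # Refuse to score a dataset that lacks any feature the model uses.
        missing = [name for name in self.weights if name not in dataset]
        if missing:
            raise KeyError(f"dataset is missing features: {missing}")
        return self.bias + sum(
            weight * dataset[name] for name, weight in self.weights.items()
        )


def indicates_inhibitor(
    model: LinearRiskModel,
    first_dataset: Dict[str, float],
    second_dataset: Dict[str, float],
    min_change: float = 0.0,
) -> bool:
    """Return True when the score changes between the two evaluations.

    min_change is an assumed tolerance; the claim itself only requires
    "a change in the results".
    """
    first_score = model.evaluate(first_dataset)
    second_score = model.evaluate(second_dataset)
    return abs(second_score - first_score) > min_change


# Illustrative model mixing one clinical, one molecular and one
# morphometric feature (all names and values are hypothetical).
model = LinearRiskModel(
    weights={
        "psa": 0.5,                     # clinical
        "ar_staining_index": 0.25,      # molecular
        "epithelial_nuclei_area": 1.0,  # morphometric
    }
)
before = {"psa": 8.0, "ar_staining_index": 4.0, "epithelial_nuclei_area": 2.0}
after = {"psa": 4.0, "ar_staining_index": 4.0, "epithelial_nuclei_area": 1.0}

print(model.evaluate(before), model.evaluate(after))
print(indicates_inhibitor(model, before, after))
```

A linear score keeps the sketch short. Any model exposing an `evaluate`-style call could be substituted without changing the comparison step, and a non-zero `min_change` would guard against treating measurement noise as a change.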

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Theoretical Computer Science (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)
  • Medical Treatment And Welfare Office Work (AREA)
  • Apparatus For Radiation Diagnosis (AREA)

Abstract

The present invention relates to methods and systems that use clinical information, molecular information and computer-generated morphometric information in a predictive model for predicting the occurrence (e.g., recurrence) of a medical condition, such as cancer.
PCT/US2005/008350 2004-03-12 2005-03-14 Systems and methods for treating, diagnosing and predicting the occurrence of a medical condition WO2005091203A2 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP05728291A EP1728211A2 (fr) 2004-03-12 2005-03-14 Systems and methods for treating, diagnosing and predicting the occurrence of a medical condition
CA2559241A CA2559241C (fr) 2004-03-12 2005-03-14 Systems and methods for treating, diagnosing and predicting the occurrence of a medical condition

Applications Claiming Priority (18)

Application Number Priority Date Filing Date Title
US55249704P 2004-03-12 2004-03-12
US60/552,497 2004-03-12
US57705104P 2004-06-04 2004-06-04
US60/577,051 2004-06-04
US60076404P 2004-08-11 2004-08-11
US60/600,764 2004-08-11
US62051404P 2004-10-20 2004-10-20
US60/620,514 2004-10-20
US10/991,897 2004-11-17
US10/991,240 2004-11-17
US10/991,897 US7483554B2 (en) 2003-11-17 2004-11-17 Pathological tissue mapping
US10/991,240 US7505948B2 (en) 2003-11-18 2004-11-17 Support vector regression for censored data
US64515805P 2005-01-18 2005-01-18
US60/645,158 2005-01-18
US65177905P 2005-02-09 2005-02-09
US60/651,779 2005-02-09
US11/067,066 2005-02-25
US11/067,066 US7321881B2 (en) 2004-02-27 2005-02-25 Methods and systems for predicting occurrence of an event

Publications (2)

Publication Number Publication Date
WO2005091203A2 true WO2005091203A2 (fr) 2005-09-29
WO2005091203A3 WO2005091203A3 (fr) 2006-02-02

Family

ID=34963058

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2005/008350 WO2005091203A2 (fr) 2004-03-12 2005-03-14 Systems and methods for treating, diagnosing and predicting the occurrence of a medical condition

Country Status (3)

Country Link
EP (1) EP1728211A2 (fr)
CA (1) CA2559241C (fr)
WO (1) WO2005091203A2 (fr)

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007123914A2 (fr) * 2006-04-17 2007-11-01 Siemens Medical Solutions Usa, Inc. Prognosis modeling from one or more sources of information
WO2009072098A1 (fr) * 2007-12-04 2009-06-11 University College Dublin, National University Of Ireland System and method for image analysis
US8114615B2 (en) 2006-05-17 2012-02-14 Cernostics, Inc. Method for automated tissue analysis
WO2013158812A1 (fr) * 2012-04-20 2013-10-24 International Business Machines Corporation Combining knowledge and data driven insights for identifying risk factors in healthcare
US9747654B2 (en) 2014-12-09 2017-08-29 Cerner Innovation, Inc. Virtual home safety assessment framework
US10018631B2 (en) 2011-03-17 2018-07-10 Cernostics, Inc. Systems and compositions for diagnosing Barrett's esophagus and methods of using the same
EP1949285B1 (fr) * 2005-10-13 2019-07-24 Fundação D. Anna Sommer Champalimaud E Dr. Carlos Montez Champalimaud Systems and methods for treating, diagnosing and predicting the occurrence of a medical condition
US20200105380A1 (en) * 2018-10-02 2020-04-02 Origent Data Sciences, Inc. Systems and methods for designing clinical trials
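CN111199794A (zh) * 2018-11-19 2020-05-26 复旦大学附属眼耳鼻喉科医院 Intelligent surgical decision-making system suitable for high-myopia cataract and method for establishing the same
CN111554387A (zh) * 2020-04-26 2020-08-18 医渡云(北京)技术有限公司 Method and apparatus for recommending doctor information, storage medium, and electronic device
CN112017771A (zh) * 2020-08-31 2020-12-01 吾征智能技术(北京)有限公司 Method and system for constructing a disease prediction model based on routine semen examination data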
US10962544B2 (en) 2015-11-25 2021-03-30 Cernostics, Inc. Methods of predicting progression of Barrett's esophagus
CN112967809A (zh) * 2021-04-06 2021-06-15 联仁健康医疗大数据科技股份有限公司 Method, apparatus, device and storage medium for determining follow-up users
WO2021207404A1 (fr) * 2020-04-09 2021-10-14 Micron Technology, Inc. Patient monitoring using edge servers having deep learning accelerator and random access memory
US11355175B2 (en) 2020-04-09 2022-06-07 Micron Technology, Inc. Deep learning accelerator and random access memory with a camera interface
CN115116594A (zh) * 2022-06-06 2022-09-27 中国科学院自动化研究所 Method and apparatus for detecting the effectiveness of a medical device
US11461651B2 (en) 2020-04-09 2022-10-04 Micron Technology, Inc. System on a chip with deep learning accelerator and random access memory
CN116646088A (zh) * 2023-07-27 2023-08-25 广东省人民医院 Prediction method, apparatus, device and medium
US11874897B2 (en) 2020-04-09 2024-01-16 Micron Technology, Inc. Integrated circuit device with deep learning accelerator and random access memory
US11887647B2 (en) 2020-04-09 2024-01-30 Micron Technology, Inc. Deep learning accelerator and random access memory with separate memory access connections
CN115116594B (zh) * 2022-06-06 2024-05-31 中国科学院自动化研究所 Method and apparatus for detecting the effectiveness of a medical device

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6025128A (en) 1994-09-29 2000-02-15 The University Of Tulsa Prediction of prostate cancer progression by analysis of selected predictive parameters

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
KIM, K.S. ET AL.: "Automatic Classification of Cells Using Morphological Shape in Peripheral Blood Images", PROCEEDING OF SPIE, vol. 4210, 2000, pages 290 - 298, XP008056791, DOI: doi:10.1117/12.403813

Cited By (36)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1949285B1 (fr) * 2005-10-13 2019-07-24 Fundação D. Anna Sommer Champalimaud E Dr. Carlos Montez Champalimaud Systems and methods for treating, diagnosing and predicting the occurrence of a medical condition
WO2007123914A3 (fr) * 2006-04-17 2008-01-24 Siemens Medical Solutions Prognosis modeling from one or more sources of information
US7805385B2 (en) 2006-04-17 2010-09-28 Siemens Medical Solutions Usa, Inc. Prognosis modeling from literature and other sources
WO2007123914A2 (fr) * 2006-04-17 2007-11-01 Siemens Medical Solutions Usa, Inc. Prognosis modeling from one or more sources of information
US8114615B2 (en) 2006-05-17 2012-02-14 Cernostics, Inc. Method for automated tissue analysis
US8597899B2 (en) 2006-05-17 2013-12-03 Cernostics, Inc. Method for automated tissue analysis
WO2009072098A1 (fr) * 2007-12-04 2009-06-11 University College Dublin, National University Of Ireland System and method for image analysis
US8116551B2 (en) 2007-12-04 2012-02-14 University College, Dublin, National University of Ireland Method and system for image analysis
US10018631B2 (en) 2011-03-17 2018-07-10 Cernostics, Inc. Systems and compositions for diagnosing Barrett's esophagus and methods of using the same
US11221333B2 (en) 2011-03-17 2022-01-11 Cernostics, Inc. Systems and compositions for diagnosing Barrett's esophagus and methods of using the same
WO2013158812A1 (fr) * 2012-04-20 2013-10-24 International Business Machines Corporation Combining knowledge and data driven insights for identifying risk factors in healthcare
US10198780B2 (en) 2014-12-09 2019-02-05 Cerner Innovation, Inc. Virtual home safety assessment framework
US9747654B2 (en) 2014-12-09 2017-08-29 Cerner Innovation, Inc. Virtual home safety assessment framework
US10962544B2 (en) 2015-11-25 2021-03-30 Cernostics, Inc. Methods of predicting progression of Barrett's esophagus
US11139051B2 (en) * 2018-10-02 2021-10-05 Origent Data Sciences, Inc. Systems and methods for designing clinical trials
US20200105380A1 (en) * 2018-10-02 2020-04-02 Origent Data Sciences, Inc. Systems and methods for designing clinical trials
US20220139505A1 (en) * 2018-10-02 2022-05-05 Origent Data Sciences, Inc. Systems and methods for designing clinical trials
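CN111199794B (zh) * 2018-11-19 2024-03-01 复旦大学附属眼耳鼻喉科医院 Intelligent surgical decision-making system suitable for high-myopia cataract and method for establishing the same
CN111199794A (zh) * 2018-11-19 2020-05-26 复旦大学附属眼耳鼻喉科医院 Intelligent surgical decision-making system suitable for high-myopia cataract and method for establishing the same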
US11726784B2 (en) 2020-04-09 2023-08-15 Micron Technology, Inc. Patient monitoring using edge servers having deep learning accelerator and random access memory
US11874897B2 (en) 2020-04-09 2024-01-16 Micron Technology, Inc. Integrated circuit device with deep learning accelerator and random access memory
US11942135B2 (en) 2020-04-09 2024-03-26 Micron Technology, Inc. Deep learning accelerator and random access memory with a camera interface
US11355175B2 (en) 2020-04-09 2022-06-07 Micron Technology, Inc. Deep learning accelerator and random access memory with a camera interface
US11887647B2 (en) 2020-04-09 2024-01-30 Micron Technology, Inc. Deep learning accelerator and random access memory with separate memory access connections
US11461651B2 (en) 2020-04-09 2022-10-04 Micron Technology, Inc. System on a chip with deep learning accelerator and random access memory
WO2021207404A1 (fr) * 2020-04-09 2021-10-14 Micron Technology, Inc. Patient monitoring using edge servers having deep learning accelerator and random access memory
CN111554387A (zh) * 2020-04-26 2020-08-18 医渡云(北京)技术有限公司 Method and apparatus for recommending doctor information, storage medium, and electronic device
CN112017771A (zh) * 2020-08-31 2020-12-01 吾征智能技术(北京)有限公司 Method and system for constructing a disease prediction model based on routine semen examination data
CN112017771B (zh) * 2020-08-31 2024-02-27 吾征智能技术(北京)有限公司 Method and system for constructing a disease prediction model based on routine semen examination data
CN112967809B (zh) * 2021-04-06 2024-01-23 联仁健康医疗大数据科技股份有限公司 Method, apparatus, device and storage medium for determining follow-up users
CN112967809A (zh) * 2021-04-06 2021-06-15 联仁健康医疗大数据科技股份有限公司 Method, apparatus, device and storage medium for determining follow-up users
US12002553B2 (en) 2021-10-04 2024-06-04 Origent Data Sciences, Inc. Systems and methods for designing clinical trials
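CN115116594A (zh) * 2022-06-06 2022-09-27 中国科学院自动化研究所 Method and apparatus for detecting the effectiveness of a medical device
CN115116594B (zh) * 2022-06-06 2024-05-31 中国科学院自动化研究所 Method and apparatus for detecting the effectiveness of a medical device
CN116646088A (zh) * 2023-07-27 2023-08-25 广东省人民医院 Prediction method, apparatus, device and medium
CN116646088B (zh) * 2023-07-27 2023-12-01 广东省人民医院 Prediction method, apparatus, device and medium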

Also Published As

Publication number Publication date
CA2559241C (fr) 2015-10-13
EP1728211A2 (fr) 2006-12-06
WO2005091203A3 (fr) 2006-02-02
CA2559241A1 (fr) 2005-09-29

Similar Documents

Publication Publication Date Title
CA2624970C (fr) Systems and methods for treating, diagnosing and predicting the occurrence of a medical condition
US7461048B2 (en) Systems and methods for treating, diagnosing and predicting the occurrence of a medical condition
US7467119B2 (en) Systems and methods for treating, diagnosing and predicting the occurrence of a medical condition
CA2559241C (fr) Systems and methods for treating, diagnosing and predicting the occurrence of a medical condition
CA2679436C (fr) Systems and methods for treating, diagnosing and predicting the occurrence of a medical condition
US20180096742A1 (en) Systems and methods for treating, diagnosing and predicting the occurrence of a medical condition
US20170351837A1 (en) Systems and methods for treating, diagnosing and predicting the occurrence of a medical condition
US20160253469A1 (en) Systems and methods for predicting favorable-risk disease for patients enrolled in active surveillance
Saltz et al. Spatial organization and molecular correlation of tumor-infiltrating lymphocytes using deep learning on pathology images
CA2732171C (fr) Systems and methods for treating, diagnosing and predicting the occurrence of a medical condition
US20120010528A1 (en) Systems and methods for predicting disease progression in patients treated with radiotherapy
TW200538734A (en) Systems and methods for treating, diagnosing and predicting the occurrence of a medical condition
Joo et al. Artificial intelligence-based non-small cell lung cancer transcriptome RNA-sequence analysis technology selection guide

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 2559241

Country of ref document: CA

NENP Non-entry into the national phase

Ref country code: DE

WWW Wipo information: withdrawn in national office

Country of ref document: DE

WWE Wipo information: entry into national phase

Ref document number: 2005728291

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 2005728291

Country of ref document: EP