WO2012125411A1 - Methods of predicting prognosis in cancer - Google Patents


Info

Publication number
WO2012125411A1
Authority
WO
WIPO (PCT)
Prior art keywords
cancer
tumor
patient
biomarkers
biomarker
Prior art date
Application number
PCT/US2012/028307
Other languages
French (fr)
Inventor
Eric Devroe
Chang-Jiun Terrence WU
Eldar Yehuda GILADI
Yuxun Wang
Sharon Yehudit FRIEDLANDER
William R. BRADLEY
Clayton G. SMALL, III
Nadia GURVICH
Yi Elaine HUANG
William Karl DAHLBERG
Michail SHIPITSIN
Thomas Patrick NIFONG
David L. Rimm
Lynda Chin
Peter Blume-Jensen
Original Assignee
Metamark Genetics, Inc.
Priority date
Filing date
Publication date
Application filed by Metamark Genetics, Inc.
Publication of WO2012125411A1

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53 Immunoassay; Biospecific binding assay; Materials therefor
    • G01N33/574 Immunoassay; Biospecific binding assay; Materials therefor for cancer
    • G01N33/57407 Specifically defined cancers
    • G01N33/5743 Specifically defined cancers of skin, e.g. melanoma
    • G01N2800/00 Detection or diagnosis of diseases
    • G01N2800/50 Determining the risk of developing a disease
    • G01N2800/60 Complex ways of combining multiple protein biomarkers for diagnosis
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/118 Prognosis of disease development
    • C12Q2600/158 Expression markers

Definitions

  • This invention relates to using biomarker panels to predict prognosis in cancer patients.
  • Metastasis involves multiple biological processes driven by an ensemble of genetic alterations (Gupta et al., Cell 127(4): 679-95 (2006)).
  • A long-held view holds that metastasis-conferring genetic events are acquired stochastically as a tumor grows and expands.
  • An alternative view posits that tumors are "hard-wired" with pro-metastatic genetic alterations early in the evolution of tumors and that these alterations also drive the genesis of cancer (Bernards et al, Nature 418(6900):823 (2002)).
  • the current melanoma staging system is primarily based on tumor thickness (Breslow score), ulceration, and sentinel lymph node (SLN) status (Balch, C. M., J. E. Gershenwald, et al. (2003). Chapter 3: Staging and classification. Cutaneous Melanoma. C. M. Balch, A. N. Houghton, A. Sober and S. J. Soong. St. Louis, Quality Medical Publishing: 55-76). While standard clinical risk assessments are capable of stratifying patients into low, intermediate, and high risk of relapse, standard clinical staging approaches have significant limitations.
  • SLNB: sentinel lymph node biopsy
  • SLNB procedures are negative (e.g., no nodal involvement is detected).
  • the relative value of this procedure has been called into question.
  • 10-20% of patients who are clinically assessed as low-risk (e.g., thin Stage I or II tumors; SLNB negative)
  • the instant invention provides molecular signatures that can robustly stratify cancer, including melanoma, patients according to their risk for metastatic progression.
  • the present invention provides a set of biomarkers (e.g., genes and gene products) that can accurately inform about the risk of cancer progression and recurrence, as well as methods of their use. These biomarkers, also denoted Prognosis Determinants (PDs), provide prognostic value for human cancer patients.
  • PDs: Prognosis Determinants
  • the invention provides a method of predicting prognosis of a cancer patient.
  • a cancerous tissue sample from the patient, measures the levels of two or more biomarkers selected from the group consisting of ANLN, CD44, CDH2, CD117, DEPDC1, FSCN1, KIF2C, MMP1, PCNA, and SPARC in the sample, and obtains biomarker scores based on the measured levels, wherein the biomarker scores are indicative of the prognosis of the cancer patient.
  • the patient has melanoma.
  • the prognosis may be that the patient is at a low risk of having metastatic cancer or recurrence of the melanoma.
  • the selected biomarkers may be (1) CD44, ANLN, CD117, MMP1, and KIF2C; or (2) CDH2, SPARC, PCNA, FSCN1, and
  • the prognosis may be that the patient is at a high risk of having metastatic cancer or recurrence of the melanoma.
  • the selected biomarkers may be (1) CD117, CD44, KIF2C, MMP1, and CDH2; or (2) PCNA, ANLN, SPARC, FSCN1, and DEPDC1.
  • the patient may have a negative result in sentinel lymph node biopsy (SLNB), and the selected biomarkers may be (1) ANLN, MMP1, CDH2, KIF2C, and SPARC; or (2) CD117, PCNA, FSCN1, CD44, and DEPDC1.
  • the invention also provides a method of analyzing a cancerous tissue sample from a cancer patient.
  • a cancerous tissue sample from the patient and measures the levels of two or more biomarkers selected from the group consisting of ANLN, CD44, CDH2, CD117, DEPDC1, FSCN1, KIF2C, MMP1, PCNA, and SPARC in the sample.
  • the invention additionally provides a method of identifying a cancer patient in need of adjuvant therapy.
  • a cancerous tissue sample from the patient, measures the levels of two or more biomarkers selected from the group consisting of ANLN, CD44, CDH2, CD117, DEPDC1, FSCN1, KIF2C, MMP1, PCNA, and SPARC in the sample, and obtains biomarker scores based on the measured levels, wherein the biomarker scores indicate that the patient is in need of adjuvant therapy.
  • the adjuvant therapy may be selected from the group consisting of radiation therapy, chemotherapy, immunotherapy, hormone therapy, and targeted therapy.
  • the targeted therapy targets another component of a signaling pathway in which one or more of the selected biomarkers is a component.
  • the targeted therapy targets one or more of the selected biomarkers.
  • the invention also provides a further method of treating a cancer patient.
  • this method one obtains the biomarker scores of two or more biomarkers selected from the group consisting of ANLN, CD44, CDH2, CD117, DEPDC1, FSCN1, KIF2C, MMP1, PCNA, and SPARC in a cancerous tissue sample from the patient, and treats the patient with adjuvant therapy if the biomarker scores indicate that the patient is at a high risk of having metastatic cancer or recurrence of cancer.
  • the adjuvant therapy is an
  • the invention additionally provides a method of identifying a cancer patient in need of a sentinel lymph node biopsy.
  • this method one obtains the biomarker scores of two or more biomarkers selected from the group consisting of ANLN, CD44, CDH2, CD117, DEPDC1, FSCN1, KIF2C, MMP1, PCNA, and SPARC in the sample, and performs a sentinel lymph node biopsy on the patient if the biomarker scores indicate that the patient is at a high risk of having metastatic cancer or recurrence of cancer.
  • the invention conversely provides a method of identifying a cancer patient not in need of a sentinel lymph node biopsy.
  • biomarker scores are obtained by applying a coefficient to the measured levels of the selected biomarkers.
  • biomarker scores are calculated by using one or more algorithms selected from the group consisting of the Greedy Model, the Cox regression algorithm, the LASSO algorithm, and the AIC-Optimizing Stepwise Forward Selection Cox Regression algorithm.
  • the RNA transcript levels of the selected biomarkers are measured.
  • the transcript levels are determined by microarray, quantitative RT-PCR or Nanostring nCounter.
  • the protein levels of the selected biomarkers are measured.
  • the protein levels are measured by antibodies, for example, by immunohistochemistry or immunofluorescence.
  • the protein levels may be measured in subcellular compartments, for example, by measuring the protein levels of biomarkers in the nucleus relative to the protein levels of the biomarkers in the cytoplasm.
  • the protein levels of biomarkers may be measured in the nucleus and/or in the cytoplasm.
  • the levels of the biomarkers may be measured separately.
  • the levels of the biomarkers may be measured in a multiplex reaction.
  • noncancerous cells are excluded from the tissue sample.
  • a cancerous tissue sample is a formalin-fixed, paraffin-embedded tissue sample, a snap-frozen tissue sample, an ethanol-fixed tissue sample, a tissue sample fixed with an organic solvent, a tissue sample fixed with plastic or epoxy, a cross-linked tissue sample, surgically removed tumor tissue, circulating tumor cells, a biopsy sample, or a blood sample.
  • the cancerous tissue is melanoma, prostate cancer, breast cancer, or colon cancer tissue.
  • At least one standard parameter associated with the cancer is measured in addition to the levels of two or more biomarkers selected from the group consisting of ANLN, CD44, CDH2, CD117, DEPDC1, FSCN1, KIF2C, MMP1, PCNA, and SPARC.
  • the at least one standard parameter may be, for example, tumor stage, tumor grade, tumor size, tumor visual characteristics, tumor location, tumor growth, lymph node status, tumor thickness (Breslow score), ulceration, age of onset, PSA level, or Gleason score.
  • one or more pathway context genes, gene transcripts, or gene products are measured in addition to the levels of two or more biomarkers selected from the group consisting of ANLN, CD44, CDH2, CD117, DEPDC1, FSCN1, KIF2C, MMP1, PCNA, and SPARC.
  • the pathway context genes, gene transcripts, or gene products are mutated pathway context genes, gene transcripts, or gene products.
  • the invention provides a kit for measuring the levels of two or more biomarkers selected from the group consisting of ANLN, CD44, CDH2, CD117, DEPDC1, FSCN1, KIF2C, MMP1, PCNA, and SPARC.
  • This kit comprises reagents for specifically measuring the levels of the selected biomarkers.
  • the reagents may be nucleic acid molecules such as PCR primers or hybridizing probes.
  • the reagents may be antibodies or equivalents thereof.
  • Fig. 1 shows quantification of PD mRNA levels by qRT-PCR.
  • Gene expression results represented as log 2-fold change values for the ten PDs in nine melanoma cell lines (bottom row).
  • the melanocyte cell line NHEM neo was used as the comparator (log 2-fold change of 0). Changes in the expression of the endogenous control gene, ZNF592, are shown as well.
  • MMP1 was not detected in NHEM neo and A375
  • c-KIT was not detected in WM-266.4, SK-MEL-2, SK-MEL-24, SK-MEL-31 and A375.
  • CD44-a represents the C-terminal domain, CD44-b the variable domain, and CD44-c the N-terminal domain of CD44.
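The log2-fold change values described for Fig. 1 can be illustrated with a short sketch. The source does not state the formula used, so this assumes the common delta-delta-Ct calculation with roughly 100% amplification efficiency; the function name `log2_fold_change` and the Ct values in the usage note are hypothetical placeholders, not data from the study.

```python
def log2_fold_change(ct_target_sample, ct_control_sample,
                     ct_target_ref, ct_control_ref):
    """Log2 fold change of a target gene in a sample versus a reference line.

    Assumption: delta-delta-Ct method, i.e. each target Ct is first
    normalized to an endogenous control gene (ZNF592 in the text), then
    compared with the comparator cell line (NHEM neo in the text).
    """
    delta_sample = ct_target_sample - ct_control_sample  # normalized sample
    delta_ref = ct_target_ref - ct_control_ref           # normalized comparator
    # A lower Ct means more transcript, hence the sign flip.
    return -(delta_sample - delta_ref)
```

With placeholder Ct values, `log2_fold_change(24.0, 20.0, 27.0, 21.0)` gives 2.0 (a four-fold increase), and comparing the reference line with itself gives 0, matching the "log 2-fold change of 0" convention for the comparator.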
  • FIG. 2 shows quantification of PD protein levels by western blot.
  • Western blot results shown for the 10 PDs in normal melanocyte cell line (NHEM) and nine melanoma cell lines. The values above each band represent quantitative PD mRNA expression levels based on Taqman qRT-PCR (see Fig. 1). Actin was used as a loading control. ND, not detected.
  • FIG. 3 shows qualitative analysis of PD protein levels by conventional
  • FFPE: formalin-fixed, paraffin-embedded
  • Fig. 4 shows analysis of PD protein levels by immunofluorescence.
  • KIF2C (A)
  • CDH2 (B)
  • DEPDC1 (C)
  • CD44 (D)
  • CD117 (E)
  • SPARC (F)
  • FSCN1 (G)
  • PCNA (H)
  • MMP1 (I)
  • ANLN (J)
  • the 10 PDs were detected and amplified using Cy5-tyramide.
  • 20x magnification images were captured in the Cy5 fluorescent channels using the AperioFL ScanScope hardware. Digital images were recorded in the Aperio Spectrum Database and processed using ImageScope software.
  • Fig. 5 shows PD quantification in a tumor compartment defined using the Definiens Composer segmentation and classification tool in Definiens TissueStudio™ software.
  • the Cy3-acquired image (A) was used to determine tissue-background separation (B) and to generate the tumor mask, which excludes non-tumor cells from the analysis and separates the tumor from the stroma (dark versus bright segments in C).
  • the DAPI-acquired image (D) was used to determine the nuclear (E) versus the non-nuclear (F) compartments, and the Cy5-acquired image (G) was used to determine and quantitate target PCNA intensity in the diverse tumor regions of interest (H).
  • Fig. 6 shows PD quantification in a tumor compartment defined using a molecular mask and AQUA.
  • (I) PD markers and nuclei were visualized by multiplex immunofluorescent staining: S-100 labeled with Cy3 (B, F); FSCN1 (C) and SPARC (G) labeled with Cy5; and nuclei labeled with 4,6-diamidino-2-phenylindole (DAPI) (A, E).
  • images were acquired in the CY3, CY5 and DAPI channels, recorded in the Aperio Spectrum Database and analyzed using the AQUA software. The above images were merged in Fig. 6D and H.
  • (II) Representative column chart comparing the distributions of composite AQUA scores for each of the 10 PDs in tumor mask as well as in nuclear and non-nuclear areas within the tumor mask in assorted human melanomas.
  • Fig. 7 shows an overview of the workflow of a clinical study in which 10 PDs were assessed by AQUA using a melanoma TMA cohort.
  • Fig. 8 shows a method to prioritize PDs for inclusion in prognostic models.
  • Left panel: The compartment-specific marker expression levels measured by the AQUA platform were preprocessed and analyzed with two variable-reducing algorithms (each using either continuous or binarized data), employing bootstrapping to develop an importance score for each PD.
  • Right panel: PDs are ranked by the sum of importance scores across the four algorithms. Accordingly, the top ones correspond to the highest absolute aggregated scores.
  • the top-K markers are used to build a multivariate Cox regression algorithm to design a linear melanoma prognostic model.
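The ranking step described for Fig. 8 (summing each marker's importance scores across the four analyses, ordering by absolute aggregated score, and keeping the top K) might be sketched as follows. This is a minimal illustration only: the function name `rank_markers`, the input layout, and the marker names and numbers in the usage note are hypothetical, and the bootstrapped importance scores themselves are taken as given.

```python
def rank_markers(importance_by_algorithm, k):
    """Rank markers by the absolute value of their summed importance scores.

    importance_by_algorithm: list of dicts, one per algorithm/data variant,
    each mapping marker name -> importance score (hypothetical layout).
    Returns (top_k_marker_names, summed_scores).
    """
    totals = {}
    for scores in importance_by_algorithm:
        for marker, score in scores.items():
            totals[marker] = totals.get(marker, 0.0) + score
    # Highest absolute aggregated score first, as described in the text.
    ranked = sorted(totals, key=lambda m: abs(totals[m]), reverse=True)
    return ranked[:k], totals
```

For example, with two placeholder score tables `{"marker_a": 0.5, "marker_b": -1.0, "marker_c": 0.25}` and `{"marker_a": 0.5, "marker_b": -1.0, "marker_c": 0.0}`, the top two markers are `marker_b` (sum -2.0) followed by `marker_a` (sum 1.0).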
  • Fig. 9 shows development of a prognostic model using multivariate Cox regression as described in Fig. 8 to optimize for the identification of low risk subjects.
  • the top-ranked 5 PDs were used to construct a linear melanoma prognostic model by a multivariate Cox proportional hazard regression algorithm.
  • the ability of the model to segregate the high and low risk groups in both the training and validation cohorts is assessed with Kaplan-Meier curves.
  • TR.HiRisk is the high risk training population
  • TR.LoRisk is the low risk training population
  • TE.HiRisk is the high risk testing population
  • TE.LoRisk is the low risk testing population.
  • Numbers in parentheses indicate the number of events (patient deaths)/total population in the high risk and low risk designations.
  • SumScores represents the sum of the scores calculated from methods (A), (B), (C), and (D).
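The Kaplan-Meier curves used to compare the high and low risk groups can be illustrated with a small estimator. This is a generic textbook sketch, not the software used in the study; the function name `kaplan_meier` and the toy follow-up data in the usage note are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times:  follow-up time for each subject.
    events: 1 if the event (e.g. death) was observed, 0 if censored.
    Returns a list of (time, survival_probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set.
        at_risk -= removed
    return curve
```

For four toy subjects with times `[1, 2, 3, 4]` and events `[1, 0, 1, 1]`, the estimate steps to 0.75 at time 1, 0.375 at time 3, and 0.0 at time 4; the censored subject at time 2 only shrinks the risk set.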
  • Fig. 10 shows the development of a prognostic model using stepwise forward selection to optimize for the identification of low risk subjects.
  • Stepwise forward selection was employed with a Cox regression algorithm to select four PDs for the model.
  • the ability of the model to segregate the high and low risk groups of the training and validation cohorts was assessed with Kaplan-Meier curves.
  • the number of "markers" referred to includes those biomarkers in different compartments (nuclear, non-nuclear, or tumor) as separate markers.
  • Fig. 11 shows three different 5-marker models which were obtained using the standard model selection procedure illustrated in Fig. 8, optimized to identify low-risk subjects.
  • the score that each model gives a sample is the linear combination of the log2 AQUA scores for the PDs in the model weighted by the model coefficients.
  • the first model is obtained by choosing the highest ranking markers while the other two are obtained by choosing markers that are further down the list.
  • the performance of the models on the high and low risk groups of the training cohort is illustrated using Kaplan-Meier curves.
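The model score described for Fig. 11, a linear combination of log2 AQUA scores weighted by model coefficients, can be written out directly. The function name `model_score`, the marker names, and the numeric values below are hypothetical; the actual coefficients are given in the figures, not here.

```python
import math


def model_score(aqua_scores, coefficients):
    """Linear model score: sum over markers of coefficient * log2(AQUA score).

    aqua_scores:  dict marker -> raw AQUA score (must be positive).
    coefficients: dict marker -> model coefficient (hypothetical values).
    Only markers present in `coefficients` contribute to the score.
    """
    return sum(coef * math.log2(aqua_scores[marker])
               for marker, coef in coefficients.items())
```

With placeholder inputs `{"marker_a": 8.0, "marker_b": 4.0}` and coefficients `{"marker_a": 0.5, "marker_b": -1.0}`, the score is 0.5 * 3 - 1.0 * 2 = -0.5.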
  • Fig. 12 shows examples of prognostic models using various numbers of PDs. In this case it was used to identify specific marker combinations that all performed well in predicting low risk for progression.
  • the number of "markers" referred to includes those biomarkers in different compartments (nuclear, non-nuclear, or tumor) as separate markers.
  • Fig. 13 presents results for a version of the variable selection algorithm that favors variable combinations that yield models for which the low risk population has a very low FN rate.
  • the figure presents Kaplan-Meier curves for both the training cohort (left panel of each biomarker combination) and testing cohort (right panel of each biomarker combination).
  • the number of "markers" referred to includes those biomarkers in different compartments (nuclear, non-nuclear, or tumor) as separate markers.
  • Hazard Ratio is the increase in hazard in moving between high risk and low risk populations that occurs when a Cox proportional model is matched to these two populations.
  • Fig. 14 presents Kaplan-Meier curves for prognostic models that are sensitive to the identification of high risk patients. Models with three or more parameters perform well in two respects. On the one hand the high risk population contains a relatively high fraction of the recurrent cases, and on the other the follow-up time for the censored cases is generally short, demonstrating that they are likely high risk.
  • the Kaplan-Meier curves for both the training cohort (left panel of each biomarker combination) and testing cohort (right panel of each biomarker combination) are shown. In this figure, the number of "markers" referred to includes those biomarkers in different compartments (nuclear, non-nuclear, or tumor) as separate markers. PPR was calculated as TP/(TP+FP), and specificity was calculated as TN/(TN+FP).
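The two performance measures defined for Fig. 14 follow directly from confusion-matrix counts. The sketch below uses the formulas exactly as stated in the text; the function names and the example counts are hypothetical.

```python
def positive_predictive_rate(tp, fp):
    """PPR as defined in the text: TP / (TP + FP)."""
    if tp + fp == 0:
        raise ValueError("no positive calls: PPR is undefined")
    return tp / (tp + fp)


def specificity(tn, fp):
    """Specificity as defined in the text: TN / (TN + FP)."""
    if tn + fp == 0:
        raise ValueError("no true negatives or false positives: undefined")
    return tn / (tn + fp)
```

For example, 30 true positives and 10 false positives give a PPR of 0.75, and 90 true negatives with the same 10 false positives give a specificity of 0.9.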
  • Fig. 15 shows examples of prognostic models that are able to identify high risk patients in a SLNB-negative patient population. Examples are shown for 1, 2, 3, 4, and 5 PD combinations. The Kaplan-Meier curves for both the training cohort (left panel of each biomarker combination) and testing cohort (right panel of each biomarker combination) are shown. In this figure, the number of "markers" referred to includes those biomarkers in different compartments (nuclear, non-nuclear, or tumor) as separate markers.
  • Fig. 16 presents results for a version of the variable selection algorithm that favors variable combinations that yield models which are very sensitive in detecting the low risk population.
  • Fig. 17 presents results for a version of the variable selection algorithm that favors variable combinations that yield models which are geared towards identifying the high risk population.
  • the algorithm is analogous to the one for identifying the low risk population (see legend of Fig. 16) except that variables that improve the specificity are added at each stage.
  • the table presents the performance of the algorithm on the training data while the Kaplan Meier curves demonstrate the performance on both training and test data for the model with six variables.
  • the high risk population has a relatively high fraction of recurrence events, as well as relatively short follow-up times for the censored cases.
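The stage-wise variable selection described for Fig. 17, in which the variable that most improves a chosen metric (here, specificity) is added at each stage, resembles a generic greedy forward selection. The sketch below is an assumption about the general shape of such a procedure, not the authors' algorithm; `greedy_forward_selection` and the `evaluate` callback (which would fit a model on a variable subset and return the metric) are hypothetical.

```python
def greedy_forward_selection(candidates, evaluate, max_vars):
    """Greedy forward selection of variables.

    candidates: iterable of variable names.
    evaluate:   callable taking a list of variables and returning the
                metric to maximize (e.g. specificity on training data).
    max_vars:   upper bound on the number of selected variables.
    Stops early when no remaining variable improves the metric.
    """
    selected = []
    best_score = float("-inf")
    remaining = list(candidates)
    while remaining and len(selected) < max_vars:
        scored = [(evaluate(selected + [c]), c) for c in remaining]
        top_score, top_var = max(scored, key=lambda pair: pair[0])
        if top_score <= best_score:
            break  # no candidate improves the metric
        selected.append(top_var)
        remaining.remove(top_var)
        best_score = top_score
    return selected, best_score
```

In practice `evaluate` would wrap model fitting and scoring on the training cohort; with a toy additive metric it simply picks variables in order of their contribution and stops when nothing helps.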
  • Fig. 18 illustrates that multiple models can be combined to improve population segregation.
  • the greedy model that segregates well the low risk population is combined with the greedy model that segregates well the high risk population to yield stratification into three classes: high risk, medium risk, and low risk.
  • the model scores consist of a linear combination of the log2 AQUA scores for the markers weighted by the coefficients for the two models indicated in the two tables.
  • the thresholds for the model scores used to segregate populations are indicated in the decision tree below the table. For the training sets, thresholds that maximized sensitivity (for low-risk subjects) or specificity (for high-risk subjects) were chosen.
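The three-class stratification described for Fig. 18 can be sketched as a two-step decision rule. The decision tree itself is only shown in the figure, so the order of the checks, the direction of each comparison, and the threshold values are all assumptions here; `stratify` is a hypothetical name.

```python
def stratify(low_model_score, high_model_score, low_threshold, high_threshold):
    """Assign a risk class from two model scores.

    Assumed rule (not confirmed by the source text):
      1. a low-risk-model score below its threshold means low risk;
      2. otherwise, a high-risk-model score above its threshold means high risk;
      3. everything else is medium risk.
    """
    if low_model_score < low_threshold:
        return "low risk"
    if high_model_score > high_threshold:
        return "high risk"
    return "medium risk"
```

Each model score would be computed as the weighted sum of log2 AQUA scores for that model's markers, and the two thresholds would come from the training cohort.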
  • Fig. 19 demonstrates the effectiveness of the combined model for segregating low and high risk populations illustrated in Fig. 18. Kaplan-Meier curves for both training and test cohorts are presented.
  • Fig. 20 demonstrates that molecular prognostic models are independent of standard clinical parameters.
  • Top panel: the prognostic model dichotomizes thinner or thicker Breslow depth tumors into risk-distinct groups in training and validation cohorts.
  • Bottom panel: the prognostic model also dichotomizes sentinel lymph node negative patients into low- and high-risk groups in both training and validation cohorts. In both cases, the biomarker panels were able to predict low risk and high risk independent of the clinical parameters. The voting algorithm was used for both low risk and high risk.
  • Fig. 21 shows a prognostic model for prostate cancer using PDs.
  • Expression data from PMID: 20579941 were preprocessed in a standard fashion (see, e.g., Example 4) and a Cox model was constructed using ten PDs: ANLN, CDH2, PCNA, KIF2C, DEPDC1, SPARC, FSCN1, MMP1, CD44, and CD117.
  • Kaplan Meier curves illustrate separation between low and high risk populations as determined by biochemical recurrence.
  • Fig. 22 shows a prognostic model for breast cancer using the ten PDs.
  • Data from PMID: 12490681 were preprocessed in a standard fashion (see, e.g., Example 4) and clustered by hierarchical clustering and the two main clusters were used to identify two populations.
  • Kaplan-Meier curves illustrate separation between low and high risk
  • Fig. 23 shows a prognostic model for colon cancer using the ten PDs.
  • Data from PMID: 19996206 were preprocessed in a standard fashion (see, e.g., Example 4) and clustered by k-means clustering using all ten PDs.
  • Kaplan Meier curves illustrate the separation between low and high risk populations.
  • the number of recurrences R and the size of each group S are indicated in the figure title as R/S.
  • Fig. 24 shows quantification of mRNA levels for pathway context genes relevant for malignant melanoma.
  • Gene expression results represented as log 2-fold change values for 3 members of the canonical MAPK pathway (K-RAS, N-RAS, B-RAF) in 9 melanoma cell lines. These 3 genes provide examples of clinically relevant pathway context used to model human melanoma sub-types.
  • the melanocyte cell line NHEM neo was used as the comparator (log 2-fold change of 0). Changes in the expression of the endogenous control gene, ZNF592, are shown as well.
  • Fig. 25 shows quantification of pathway context protein levels by Western blot.
  • Western blot results shown for 3 members of the canonical MAPK pathway (K-Ras, N-Ras, B-Raf) in 9 melanoma cell lines. The values above each band represent quantitative pathway context gene mRNA expression levels based on Taqman qRT-PCR (see Fig. 24). Actin was used as a loading control.
  • Fig. 26 shows detection of pathway context proteins by immunohistochemical staining.
  • Clinically-relevant pathway context markers, K-Ras, N-Ras, and B-Raf can be detected in human melanoma FFPE samples by conventional immunohistochemistry and immunofluorescence staining. The pictures illustrate immunohistochemical and
  • Fig. 27 shows pathway context protein quantification in a tumor compartment defined using a molecular mask and AQUA.
  • the 3 pathway context markers, K-Ras, N-Ras, and B-Raf, and nuclei were visualized by multiplex immunofluorescent staining.
  • S-100 was labeled with Cy3
  • K-Ras, N-Ras, and B-Raf were labeled with Cy5, and nuclei with 4,6-diamidino-2-phenylindole (DAPI), respectively.
  • images were acquired in the CY3, CY5 and DAPI channels, recorded in the Aperio Spectrum Database and analyzed using the AQUA software.
  • a representative column chart showing the AQUA scores for each of the 3 pathway context markers in tumor mask as well as in nuclear and non-nuclear areas within the tumor mask is illustrated.
  • Fig. 28 provides a description of the fields of the tables shown in figures 29-31.
  • LR: Low Risk
  • HR: High Risk
  • Fig. 29 shows performance of selected Low Risk (LR) models in a training subcohort and complementary validation subcohort. A subset of the models is depicted in Fig. 13.
  • Fig. 30 shows performance of selected High Risk (HR) models in a training subcohort and complementary validation subcohort. A subset of the models is depicted in Fig. 14.
  • Fig. 31 shows performance of selected High Risk (HR) models in a training subcohort and complementary validation subcohort with negative sentinel lymph node biopsy (SLNB) results.
  • the present invention is based on the discovery that biomarker panels comprising two or more members from the group consisting of ANLN, CD44, CDH2, CD117, DEPDC1, FSCN1, KIF2C, MMP1, PCNA, and SPARC ("prognosis determinants" or "PD"s; Table 1) are useful in providing molecular, evidence-based reliable prognosis about cancer patients.
  • the levels can be used to predict disease progression (e.g., the metastatic or recurrence potential of a cancer), or efficacy of a cancer therapy (e.g., surgery, radiation therapy or chemotherapy) independent of, or in addition to, traditional, established risk assessment procedures.
  • the levels also can be used to identify patients in need of aggressive cancer therapy (e.g., adjuvant therapy), or to guide further diagnostic tests (e.g., sentinel lymph node biopsy).
  • the levels can also be used to inform patients about which types of therapy they would be most likely to benefit from, and to stratify patients for inclusion in a clinical study.
  • the levels also can be used to identify patients who will not benefit from and/or do not need cancer therapy (e.g., surgery, radiation therapy, chemotherapy, targeted therapy, or adjuvant therapy).
  • the biomarker panels of this invention allow clinicians to optimally manage cancer patients.
  • the biomarker panels of the present invention provide useful prognostic information about a variety of cancers, including, for example, carcinomas (e.g., malignant tumors derived from epithelial cells such as, for example, common forms of breast, prostate, lung, and colon cancer), sarcomas (e.g., malignant tumors derived from connective tissue or mesenchymal cells), lymphomas and leukemias (i.e., malignancies derived from
  • cancers include, without limitation, cancers of: breast, skin, bone, prostate, ovaries, uterus, cervix, liver, lung, brain, spine, larynx, gallbladder, pancreas, rectum, parathyroid, thyroid, adrenal gland, immune system, head and neck, colon, stomach, bronchi, and kidneys.
  • biomarker score: a numeric score for each of the biomarkers based on its measured level.
  • the biomarker scores of a given panel have been correlated with a specific prognosis of the cancer. For example, a particular profile of biomarker scores for a panel can be predictive of a low risk of cancer metastasis or recurrence, while another profile of biomarker scores of the same or a different panel can be predictive of a high risk of cancer metastasis or recurrence.
  • the prognosis determinants of this invention include the ten biomarkers listed in Table 1 below.
  • a "biomarker" or "marker" refers to an analyte (e.g., a nucleic acid, peptide, protein, or metabolite) that can be objectively measured and evaluated as an indicator for a biological process. The inventors have discovered that the expression or activity levels of these ten biomarkers correlate reliably with the prognosis of cancer patients.
  • ANLN stands for anillin.
  • ANLN also may be known in the art as scraps, sera, the actin-binding protein anillin, or anillin (Drosophila Scraps homolog), actin binding protein. It is a scaffold protein that links RhoA with actin and myosin during cytokinesis.
  • An exemplary human ANLN protein contains 1,124 amino acid residues and has the following polypeptide sequence:
  • mRNA sequence for this ANLN polypeptide is:
  • CD44 is a cell-surface glycoprotein that is a receptor for hyaluronic acid.
  • CD44 may also be known in the art as Hermes Antigen, Pgp, PGP-1, Phagocytic glycoprotein 1, PGP-I, Phagocytic glycoprotein I, INLU-related p80 glycoprotein, MIC4, MDU2, MDU3, MC56, HCELL, CSPG8, hyaluronate receptor, heparan sulfate proteoglycan, extracellular matrix receptor III, ECMR-III, HUTCH-I, LHR, GP90 lymphocyte homing/adhesion receptor, CD44R, CDW44, antigen gp90 homing receptor, MUTCH-I, chondroitin sulfate proteoglycan 8, MGC10468, hematopoietic cell E- and L-selectin ligand, Epican, Hermes, Ly-24, lymphocyte antigen 24, CD44A, METAA,
  • CD44 is involved in cell adhesion and migration.
  • An exemplary human CD44 protein contains 742 amino acid residues and has the following polypeptide sequence:
  • CKIT is a cytokine receptor that binds to stem cell factor.
  • CKIT may also be known in the art as CD117, v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog, piebald trait, SCFR, stem cell factor receptor, tyrosine-protein kinase Kit, mast/stem cell growth factor receptor, or mast-cell growth factor receptor.
  • An exemplary human CKIT protein contains 976 amino acid residues and has the following polypeptide sequence:
  • mRNA sequence for this CKIT polypeptide is:
  • DEPDC1 may be a transcriptional co-repressor, and has been shown to localize to the nucleus of bladder cancer cells.
  • DEPDC1 may also be known in the art as DEP domain containing 1, cell cycle control protein SDP352, DEPDC1A, DEPDC1-V2, FLJ20354, 5830484J08Rik, DEP.8, DEP domain-containing protein 1A, or SDP35.
  • An exemplary human DEPDC1 protein contains 811 amino acid residues and has the following polypeptide sequence:
  • FSCN1 is an actin-bundling protein that has been linked to the metastasis of breast cancer.
  • FSCN1 may also be known in the art as fascin homolog 1, P55, SNL, fascin 1, FLJ38511, FAN1, singed-like, 55 kDa actin-bundling protein, HSN, actin bundling protein, singed, drosophila, homolog-like, or singed (Drosophila)-like (sea urchin fascin homolog like).
  • An exemplary human FSCN1 protein contains 493 amino acid residues and has the following polypeptide sequence:
  • KIF2C is a kinesin-like protein that has microtubule-depolymerizing activity.
  • KIF2C may also be known in the art as kinesin family member 2C, mitotic centromere-associated kinesin, MCAK1, kinesin-like 6 (mitotic centromere-associated kinesin), KNSL6, kinesin-like protein KIF2C, kinesin-like protein 6, kinesin-like 6, or RP11-269F19.1.
  • An exemplary human KIF2C protein contains 725 amino acid residues and has the following polypeptide sequence:
  • MMP1 may also be known in the art as matrix metallopeptidase 1, matrix metalloprotease 1, matrix metalloproteinase 1, MMP-1, EC 3.4.24.7, CLG, or CLGN.
  • An exemplary human MMP1 protein contains 469 amino acid residues and has the following polypeptide sequence: MHSFPPLLLLLFWGVVSHSFPATLETQEQDVDLVQKYLEKYYNLKNDGRQVEKRRN SGPVVEKLKQMQEFFGLKVTGKPDAETLKVMKQPRCGVPDVAQFVLTEGNPRWEQ THLTYRIENYTPDLPRADVDHAIEKAFQLWSNVTPLTFTKVSEGQADIMISFVRGDHR DNSPFDGPGGNLAHAFQPGPGIGGDAHFDEDERWTNNFREYNLHRVAAHELGHSLG LSHSTDIGALMYPSYTFSGDVQLAQDDIDGIQAIYGRSQNPVQPIGPQTPKACDSKLT FDAITTIRGEVMFFKDR
  • N-cadherin (NCAD or CDH2) is a calcium-dependent membrane protein that is involved in cell adhesion.
  • N-cadherin may also be known in the art as cadherin 2, type 1, N-cadherin (neuronal), N-cadherin 1, CDHN, neuronal calcium-dependent adhesion protein, neural-cadherin, CD325, neural cadherin2, CD325 antigen, CDw325, or cadherin-2.
  • An exemplary human N-cadherin protein contains 906 amino acid residues and has the following polypeptide sequence:
  • Osteonectin is a protein that is involved in the synthesis of extracellular matrix. Osteonectin may also be known in the art as secreted protein, acidic, cysteine-rich, SPARC, Basement-membrane protein 40, BM-40, ON, cysteine-rich protein, or secreted protein acidic and rich in cysteine.
  • An exemplary human osteonectin protein contains 303 amino acid residues and has the following polypeptide sequence:
  • PCNA is a cofactor for DNA polymerase delta, and, thus, is involved in DNA replication.
  • PCNA may also be known in the art as proliferating cell nuclear antigen,
  • An exemplary human PCNA protein contains 261 amino acid residues and has the following polypeptide sequence:
  • biomarkers of this invention encompass all forms and variants of any specifically described biomarkers, including, but not limited to, polymorphic or allelic variants, isoforms, mutants, derivatives, precursors including nucleic acids and pro-proteins, cleavage products, and structures comprised of any of the biomarkers as constituent subunits of the fully assembled structure.
  • the biomarkers of this invention can be measured in various forms. For example, one may measure the RNA transcript levels (e.g., mRNA or total RNA levels) or gene copy numbers of the biomarkers, or may measure the protein or activity levels of the biomarkers. In some embodiments, one may also measure metabolites (e.g., peptide fragments) of the biomarkers, or surrogates of the biomarkers (e.g., substrates or ligands of the biomarkers, or biological entities downstream in the signaling pathways of the biomarkers).
  • the biomarkers, their metabolites, or surrogates of the biomarkers can also be measured together with genes or gene products, like B-Raf, N-Ras, K-Ras, p16, p53. In some embodiments, the biomarkers, their metabolites, or surrogates of the biomarkers are measured together with the measurement of mutations in genes like B-Raf, N-Ras, K-Ras, p16, and p53.
  • biomarkers may be measured by electrophoresis, Northern and Southern blot analyses, in situ hybridization (e.g., single or multiplex nucleic acid in situ hybridization technology such as Advanced Cell Diagnostics' RNAscope technology), RNAse protection assays, and microarrays (e.g., Illumina BeadArray™ technology; Beads Array for Detection of Gene Expression (BADGE)). Biomarkers may also be measured by polymerase chain reaction (PCR)-based assays, e.g., quantitative PCR, real-time PCR, quantitative real-time PCR (qRT-PCR), and reverse transcriptase PCR (RT-PCR). Other amplification-based methods include, for example, transcript-mediated amplification (TMA), strand displacement amplification (SDA), nucleic acid sequence based amplification
  • Nucleic acid biomarkers also may be measured by sequencing-based techniques such as, for example, serial analysis of gene expression (SAGE), RNA-Seq, high-throughput sequencing technologies (e.g., massively parallel sequencing), and Sequenom MassARRAY® technology. Nucleic acid biomarkers also may be measured by, for example, NanoString nCounter and high coverage expression profiling (HiCEP).
  • Levels of biomarkers also can be determined at the protein level, in whole cells and/or in subcellular compartments (e.g., nucleus, cytoplasm and cell membrane).
  • Exemplary methods include, without limitation, immunoassays such as immunohistochemistry assays (IHC), immunofluorescence assays (IF), enzyme-linked immunosorbent assays (ELISA), immunoradiometric assays, and immunoenzymatic assays.
  • In immunoassays, one may use, for example, antibodies that bind to a biomarker or a fragment thereof.
  • the antibodies may be monoclonal, polyclonal, chimeric, or humanized.
  • antigen-binding fragments of a whole antibody such as single chain antibodies, Fv fragments, Fab fragments, Fab' fragments, F(ab')2 fragments, Fd fragments, single chain Fv molecules (scFv), bispecific single chain Fv dimers, diabodies, domain-deleted antibodies, single domain antibodies, and/or an oligoclonal mixture of two or more specific monoclonal antibodies.
  • Other methods for measuring biomarkers at the protein level include, for example, chromatography, mass spectrometry, Luminex xMAP Technology, microfluidic chip-based assays, surface plasmon resonance, sequencing, Western blot analysis, aptamer binding, molecular imprints, or a combination thereof.
  • AQUA® see, e.g., U.S. Patents 7,219,016, and 7,709,222; Camp et al, Nature Medicine, 8(11): 1323-27 (2002)
  • TissueStudio™ see, e.g., U.S. Patents 7,873,223, 7,801,361, 7,467,159, and
  • post-translational modifications of a biomarker may be relevant to cancer prognosis.
  • modifications include, without limitation,
  • phosphorylation e.g., tyrosine, threonine, or serine phosphorylation
  • glycosylation e.g., O-GlcNAc
  • modifications may be detected, for example, by antibodies specific for the modifications, or by metastable ions in reflector matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF) (Wirth, Proteomics 2(10): 1445-51 (2002)).
  • For biomarker proteins known to have enzymatic activity, their levels can be measured through their activities.
  • assays include, without limitation, kinase assays, phosphatase assays, and reductase assays, among many others.
  • Modulation of the kinetics of enzyme activities can be determined by measuring the Michaelis constant (KM) using known algorithms, such as the Hill plot, Michaelis-Menten equation, linear regression plots such as Lineweaver-Burk analysis, and Scatchard plot.
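As a rough illustration of the Lineweaver-Burk analysis named above, the following minimal sketch fits a straight line to reciprocal rate versus reciprocal substrate concentration and reads Vmax and KM off the intercept and slope. The function name, variable names, and data are hypothetical; the data are synthetic and noise-free, generated from the Michaelis-Menten equation with Vmax = 10 and KM = 2, and are not measurements from this disclosure.

```python
def lineweaver_burk(substrate, rate):
    """Estimate (Vmax, Km) from the double-reciprocal line
    1/v = (Km/Vmax) * (1/[S]) + 1/Vmax, fitted by ordinary least squares."""
    xs = [1.0 / s for s in substrate]   # 1/[S]
    ys = [1.0 / v for v in rate]        # 1/v
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                   # Km / Vmax
    intercept = mean_y - slope * mean_x # 1 / Vmax
    vmax = 1.0 / intercept
    km = slope * vmax
    return vmax, km


# Synthetic, noise-free example (assumed values: Vmax = 10, Km = 2).
substrate = [0.5, 1.0, 2.0, 4.0, 8.0]
rate = [10.0 * s / (2.0 + s) for s in substrate]
vmax, km = lineweaver_burk(substrate, rate)
```

With real, noisy measurements the double-reciprocal transform can exaggerate error at low substrate concentrations, so a direct nonlinear fit of the Michaelis-Menten equation may be preferable in practice.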
  • biomarker protein and nucleic acid metabolites can be measured.
  • the term "metabolite" includes any chemical or biochemical product of a metabolic process, such as any compound produced by the processing, cleavage or consumption of a biomarker.
  • Metabolites can be detected in a variety of ways known to one of skill in the art, including refractive index spectroscopy (RI), ultra-violet spectroscopy (UV), fluorescence analysis, radiochemical analysis, near-infrared spectroscopy (near-IR), nuclear magnetic resonance spectroscopy (NMR), light scattering analysis (LS), mass spectrometry, pyrolysis mass spectrometry, nephelometry, dispersive Raman spectroscopy, gas chromatography combined with mass spectrometry, liquid chromatography combined with mass spectrometry, matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) combined with mass spectrometry, ion spray spectroscopy combined with mass spectrometry, capillary
  • the measured level of a biomarker is normalized against normalizing genes or proteins, including housekeeping genes such as GAPDH, Cynl,
  • a sample utilized in the measurement of biomarker profiles of the invention can be any sample useful for this purpose, e.g., a cancerous tissue sample.
  • a cancerous tissue sample includes, for example, any sample derived from a cancerous tissue of a patient, and from a tissue that is suspected to be cancerous.
  • the sample can be, by way of example, tissue biopsies, blood, serum, plasma, blood cells, endothelial cells, lymphatic fluid, ascites fluid, interstitial fluid, bone marrow, cerebrospinal fluid, saliva, mucous, sputum, sweat, urine, circulating tumor cells, and circulating endothelial cells.
  • the sample may be fresh, frozen (e.g., snap-frozen), fixed (e.g., by formalin, ethanol, or an organic solvent, or with plastic or epoxy), embedded (e.g., in paraffin or wax), and/or cross-linked.
  • the sample may be taken as core biopsies, punch biopsies, fine needle aspirations, surgically removed tumor tissue, or tumor-derived cells grown in vitro or in live animals.
  • the sample may be formalin-fixed paraffin-embedded biopsies.
  • the tissue sample may be collected from a subject that is preferably a mammal.
  • the mammal can be a human, non-human primate, mouse, rat, dog, cat, horse, or cow, but is not limited to these examples. Mammals other than humans can be advantageously used as subjects that represent animal models of tumor metastasis.
  • a subject can be male or female.
  • a subject can be one who has been previously diagnosed or identified as having a primary tumor or a metastatic tumor, and optionally has already undergone, or is undergoing, a therapeutic intervention for the tumor such as surgery.
  • a subject can be one who has not been previously diagnosed as having a primary or metastatic tumor, including one who exhibits one or more risk factors for a primary or metastatic tumor.
  • a subject has a primary tumor, a recurrent tumor, or a metastatic tumor.
  • the sample is taken from a subject that has previously been treated for a tumor. In other embodiments, the sample is taken from a subject prior to being treated for a tumor.
  • Biomarker panels of this invention can be constructed with two or more of the PDs described herein.
  • the particular composition of the panel may depend on the desired prognostic information. For example, a clinician may want to know whether a patient is at a low risk or high risk of cancer metastasis or recurrence, or of disease-specific death. If the patient is determined to be at low risk, the clinician may recommend a less aggressive treatment regimen, to avoid unnecessary side-effects. If the patient is determined not to be at low risk, the clinician may further want to know if the patient is at high risk. If the patient is determined to be at high risk, then the clinician may want to recommend an aggressive treatment regimen, such as post-surgery adjuvant therapy, including radiation, chemotherapy, hormone therapy, and targeted therapy. If the patient is neither, then the clinician may want to recommend active surveillance and regular follow-up on the patient.
  • To build a biomarker panel tailored to provide a particular piece of prognostic information, one can select constituent biomarkers using one or more algorithms that prioritize the candidate biomarkers as well as train the optimal formula to combine the results from multiple biomarkers for a panel.
  • PCA Principal Components Analysis
  • LogReg Logistic Regression
  • LDA Linear Discriminant Analysis
  • ELDA Eigengene Linear Discriminant Analysis
  • SVM Support Vector Machines
  • RF Random Forest
  • RPART Recursive Partitioning Tree
  • SC Shrunken Centroids
  • StepAIC Kth-Nearest
  • biomarker selection algorithms are, e.g., forward selection, backwards selection, stepwise selection, complete enumeration of all potential panels of a given size, genetic algorithms, voting-based algorithms, greedy algorithms, the LASSO algorithm, the AIC-Optimizing Stepwise Forward Selection Cox Regression Model, and other Cox regression algorithms, Weibull models, Kaplan-Meier models, and Greenwood models. Enumeration and ranking of all possible subsets of variables is also considered for variable subset selection.
  • the above algorithms may be coupled with information criteria, such as Akaike's Information Criterion (AIC) or Bayes' Information Criterion (BIC), in order to quantify the tradeoff between additional biomarkers and model improvement, and to aid in minimizing overfit.
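The tradeoff described above can be sketched with the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where k is the number of fitted parameters, n the number of samples, and ln L the maximized log-likelihood. The panel names, log-likelihood values, and sample size below are invented purely for illustration and are not results from this disclosure.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike's Information Criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_samples):
    """Bayes' Information Criterion: penalizes parameters more as n grows."""
    return n_params * math.log(n_samples) - 2 * log_likelihood


# Hypothetical candidates: name -> (maximized log-likelihood, parameter count).
candidates = {
    "2-marker panel": (-120.0, 3),
    "5-marker panel": (-118.5, 6),
}
n_samples = 200  # assumed cohort size

aic_by_panel = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
bic_by_panel = {name: bic(ll, k, n_samples) for name, (ll, k) in candidates.items()}
best_by_aic = min(aic_by_panel, key=aic_by_panel.get)
best_by_bic = min(bic_by_panel, key=bic_by_panel.get)
```

In this made-up case the three extra biomarkers improve the fit only slightly, so both criteria favor the smaller panel.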
  • the resulting predictive models may be validated in other studies, or cross-validated in the study they were originally trained in, using such techniques as Bootstrap, Leave-One-Out (LOO) and 10-Fold cross-validation (10-Fold CV).
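A minimal sketch of how the cross-validation splits mentioned above might be generated is shown here. The helper name `k_fold_indices` is hypothetical; setting the number of folds equal to the number of samples yields Leave-One-Out. The sketch only produces index splits and leaves model fitting and scoring to the caller.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k-fold cross-validation.
    With k == n_samples this reduces to Leave-One-Out."""
    # Assign sample i to fold i % k so fold sizes differ by at most one.
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for test in folds:
        held_out = set(test)
        train = [i for i in range(n_samples) if i not in held_out]
        yield train, test


ten_fold = list(k_fold_indices(50, 10))      # 10-Fold CV on 50 samples
leave_one_out = list(k_fold_indices(4, 4))   # LOO on 4 samples
```

In practice the sample order would usually be shuffled (and possibly stratified by outcome) before splitting; that step is omitted here for brevity.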
  • Scores from several biomarkers can be combined via a linear or non-linear equation to yield an overall score for a patient and the latter can then be used to stratify patients into risk groups.
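One simple way to picture the linear case is a weighted sum followed by two cutoffs. The sketch below is illustrative only: the weights, measured levels, and cutoff values are hypothetical placeholders, not coefficients or thresholds taken from this disclosure.

```python
def combined_score(levels, weights, intercept=0.0):
    """Linear combination of measured biomarker levels."""
    return intercept + sum(weights[name] * levels[name] for name in weights)


def risk_group(score, low_cutoff, high_cutoff):
    """Stratify a patient by overall score using two assumed cutoffs."""
    if score < low_cutoff:
        return "low"
    if score >= high_cutoff:
        return "high"
    return "intermediate"


# Hypothetical weights and measured levels for a two-biomarker panel.
weights = {"nuclear MMP1": 0.8, "tumor PCNA": 0.5}
levels = {"nuclear MMP1": 1.5, "tumor PCNA": 2.0}

score = combined_score(levels, weights)                   # 0.8*1.5 + 0.5*2.0
group = risk_group(score, low_cutoff=1.0, high_cutoff=2.0)
```

A non-linear combination (e.g., a logistic transform of the weighted sum) would slot into `combined_score` without changing the stratification step.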
  • the performance and, thus, usefulness, of biomarker panels may be assessed in multiple ways.
  • the sensitivity, specificity, positive predictive value (or rate), and negative predictive value (or rate) of the panel may be considered.
  • the following variables are used: "true positive" or TP (correctly classifying a subject as diseased in regard to a disease state of interest (e.g., cancer-attributable death at a given time point)); "true negative" or TN (correctly classifying a subject as non-diseased in regard to a disease state of interest (i.e., no cancer-attributable death at a given time point)); "false positive" or FP (i.e., incorrectly classifying a subject as diseased in regard to a disease state of interest); and "false negative" or FN (i.e., incorrectly classifying a subject as non-diseased in regard to a disease state of interest).
  • Sensitivity of a biomarker panel may be calculated by TP/(TP + FN), i.e., the true-positive fraction of diseased subjects.
  • Specificity of a biomarker panel may be calculated by TN/(TN + FP), i.e., the true negative fraction of non-diseased subjects. Sensitivity of 100% and specificity of 100% are ideal, although for practical purposes, sensitivity and/or specificity of more than 70% (e.g., 75%, 80%, 85%, 90%, 95%, or more) may be acceptable.
  • biomarker panels may also be assessed for their "positive predictive value" or "positive predictive rate" (true positive fraction of all positive test results, i.e., TP/(TP + FP)) and "negative predictive value" or "negative predictive rate" (true negative fraction of all negative test results, i.e., TN/(TN + FN)).
  • Positive predictive value of 100% and negative predictive value of 100% are ideal, although for practical purposes, positive and/or negative predictive values of more than 70% (e.g., 75%, 80%, 85%, 90%, 95%, or more) may be acceptable.
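The four formulas above translate directly into code. The following sketch computes them from TP, TN, FP, and FN counts; the function name and the example counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def panel_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # TP / (TP + FN)
        "specificity": tn / (tn + fp),  # TN / (TN + FP)
        "ppv": tp / (tp + fp),          # TP / (TP + FP)
        "npv": tn / (tn + fn),          # TN / (TN + FN)
    }


# Hypothetical counts: 50 diseased subjects (40 detected), 100 non-diseased (80 cleared).
metrics = panel_metrics(tp=40, tn=80, fp=20, fn=10)
```

Note that, unlike sensitivity and specificity, the predictive values depend on how common the outcome is in the tested population.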
  • Various statistical measures may be used to evaluate the performance of a biomarker panel in order to provide an acceptable level of performance.
  • Such statistical measures include, e.g., area under the curve (AUC), goodness-of-fit, or quantitative range of a PD read-out.
  • c-index concordance index
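For a binary outcome, the AUC equals the fraction of (event, non-event) patient pairs in which the patient with the event received the higher score, which is also the concordance index in that setting. The sketch below computes it by direct pair counting; the function name and the scores are hypothetical, and the time-to-event form of the c-index with censoring is not covered.

```python
def concordance_auc(scores_event, scores_no_event):
    """AUC as the fraction of concordant (event, non-event) pairs; ties count half."""
    concordant = 0.0
    for e in scores_event:
        for n in scores_no_event:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(scores_event) * len(scores_no_event))


# Hypothetical panel scores for patients with and without the outcome.
auc = concordance_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2, 0.1])
```

Here 11 of the 12 pairs are ordered correctly, so the AUC is 11/12; a value of 0.5 would indicate no discrimination.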
  • a biomarker panel of this invention may comprise two, three, four, five, six, seven, eight, nine or all ten of the biomarkers of Table 1. The precise combination and weight of the biomarkers may vary depending on the prognostic information being sought.
  • Examples of biomarker panels useful in identifying cancer patients with a poor prognosis may comprise:
  • cytoplasmic (i.e., protein level in the cytoplasm)
  • nuclear (i.e., protein level in the nucleus)
  • tumor (i.e., whole tumor cell)
  • tumor PCNA nuclear MMP1, tumor PCNA, and cytoplasmic SPARC
  • cytoplasmic CD117 cytoplasmic CD44, tumor KIF2C, nuclear MMP1, and tumor PCNA
  • g) nuclear MMP1 and tumor PCNA
  • cytoplasmic ANLN cytoplasmic CD117, nuclear CD44, cytoplasmic DEPDC1, tumor KIF2C, nuclear MMP1, tumor PCNA, and nuclear SPARC, or
  • biomarker panels may be used to identify cancer patients with a favorable prognosis (e.g., patients with a low risk for tumor metastasis). Such panels may comprise:
  • nuclear CDH2 non-nuclear (i.e., protein level outside the nucleus) PCNA, nuclear KIF2C, nuclear CD117, and nuclear DEPDC1,
  • nuclear CDH2 nuclear CDH2, non-nuclear PCNA, nuclear KIF2C, nuclear CD117, and nuclear DEPDC1
  • tumor CD117, tumor KIF2C, nuclear MMP1, nuclear CD117, nuclear SPARC, tumor CD44, and non-nuclear KIF2C
  • tumor SPARC tumor SPARC
  • non-nuclear SPARC tumor CD44
  • non-nuclear CD44 nuclear CD117
  • nuclear FSCN1 nuclear FSCN1
  • non-nuclear KIF2C tumor PCNA
  • nuclear MMP1 nuclear MMP1, non-nuclear KIF2C, tumor DEPDC1, nuclear PCNA, tumor CD44, non-nuclear CD117, tumor PCNA, non-nuclear ANLN, non-nuclear CD44, and nuclear KIF2C
  • cytoplasmic CD117 and cytoplasmic PCNA nuclear MMP1, non-nuclear KIF2C, tumor DEPDC1, nuclear PCNA, tumor CD44, non-nuclear CD117, tumor PCNA, non-nuclear ANLN, non-nuclear CD44, and nuclear KIF2C
  • nuclear CD117 nuclear CD44, cytoplasmic CDH2, tumor DEPDC1, nuclear FSCN1, nuclear KIF2C, nuclear PCNA, and nuclear SPARC, or
  • nuclear CD117 nuclear CD44, cytoplasmic CDH2, tumor DEPDC1, nuclear FSCN1, and nuclear KIF2C.
  • biomarker panels may be used to identify cancer patients with a poor prognosis (e.g., patients with a high risk for tumor metastasis) among patients that test negative in a sentinel lymph node biopsy.
  • Such panels may comprise:
  • c) tumor CDH2, nuclear MMP1, and tumor PCNA, d) nuclear ANLN, nuclear MMP1, tumor PCNA, and cytoplasmic SPARC, or e) cytoplasmic CD117, tumor KIF2C, nuclear MMP1, tumor PCNA, and cytoplasmic SPARC.
  • Exemplary biomarker panels of this invention further include the panels illustrated in Figs. 29-31.
  • To use a biomarker panel of this invention, one may compare the biomarker score profile of the panel (e.g., a reference, baseline, or index value) with the biomarker score profile of a cancerous sample from a patient, where the comparison results provide prognostic information for the patient.
  • a biomarker score is calculated on the basis of a measured level of the biomarker, using one or more of algorithms well known in the art, e.g., the algorithms illustrated herein.
  • a "biomarker score” may be obtained by applying a numeric coefficient (e.g., by multiplication or division) to a measured level of a biomarker. The coefficients may be provided by an algorithm.
  • biomarker scores can be used as cutoff points or threshold values to stratify a patient population (e.g., identifying patients with a high risk for metastatic disease
  • Comparison of the biomarker scores of a tissue sample to a reference, index, or baseline value can be achieved with techniques such as reference limits, discrimination limits, or risk defining thresholds to define cutoff points and abnormal values.
  • Exemplary reference samples, index values, or baseline values may be taken or derived from, for example, a control subject, or a population with known clinical outcomes.
  • the control subject or population may be without cancer or without a clinical outcome being considered (e.g., metastasis, cancer recurrence, or cancer-attributable death by a given time point).
  • the control subject or population may have had cancer or the clinical outcome being considered (e.g., metastasis, cancer recurrence, or cancer-attributable death at a given time point).
  • a reference, index or baseline value is the level of a biomarker in a noncancerous tissue.
  • the noncancerous tissue is derived from a cancer patient (e.g., the cancer patient from whom the cancerous tissue sample is derived).
  • the noncancerous tissue is derived from an individual or population without cancer.
  • the value is the level of a biomarker in a control sample derived from one or more subjects who are asymptomatic and/or lack traditional risk factors for a metastatic tumor or recurrence.
  • such subjects may be monitored and/or periodically retested for a diagnostically relevant period of time ("longitudinal studies") following such test to verify continued absence of a metastatic tumor (disease or event free survival).
  • a diagnostically relevant period of time may be one year, two years, two to five years, five years, five to ten years, ten years, or more than ten years from the initial testing date for determination of the reference value.
  • the value is the level of a biomarker in a control sample derived from a tumor with low metastatic potential or risk of recurrence.
  • the value is the level of a biomarker in a control sample derived from a tumor that has not metastasized or recurred.
  • comparisons can be performed between patient and reference values measured concurrently or at temporally distinct times, e.g., between patient values and values derived from a database of compiled expression information that assembles information about expression levels of cancer-associated genes.
  • a reference, index, or baseline value also may be derived from one or more subjects who have been exposed to a treatment (e.g., adjuvant therapy) and have shown improvements as a result of the treatment. Comparing a cancer patient's biomarker scores/profile with such values may provide useful information in predicting responsiveness of the patient to this cancer treatment.
  • the value may be derived from a patient who has received an initial cancer treatment, and then as the patient receives additional treatments, his biomarker scores/profile will be compared to his original reference, index, or baseline biomarker scores/profile, so as to monitor the progress of the treatments.
  • a reference, index value or baseline value also may be derived from risk prediction algorithms or computed indices from population studies. In general, reference, index or baseline values may vary based on which biomarkers are included in the value.
  • Reference, index or baseline values, as described above, also can be used to generate a "reference biomarker profile."
  • the biomarkers disclosed herein can be used to generate a "subject biomarker profile" taken from subjects who, for example, are at high risk for tumor metastasis or cancer recurrence.
  • the subject biomarker profiles can be compared to a reference biomarker profile to identify, for example, subjects at risk of developing tumor metastasis or cancer recurrence, or to monitor the progression of a cancer or the effectiveness of a cancer treatment.
  • the reference and subject biomarker profiles of the present invention may be contained in a machine-readable medium, such as but not limited to, analog tapes like those readable by a VCR, CD-ROM, DVD-ROM, and USB flash media, among others.
  • Such machine-readable media can also contain additional test results, such as, without limitation, measurements of clinical parameters and traditional laboratory risk factors.
  • the machine-readable media can also comprise subject information such as medical history and any relevant family history.
  • the machine-readable media can also contain information relating to other disease-risk algorithms and computed indices such as those described herein.
  • biomarker panels of this invention may be used in conjunction with additional biomarkers, clinical parameters, or traditional laboratory risk factors known to be present or associated with the clinical outcome of interest.
  • One or more clinical parameters may be used in the practice of the invention as a biomarker input in a formula or as a pre-selection criterion defining a relevant population to be measured using a particular biomarker panel and formula.
  • One or more clinical parameters may also be useful in the biomarker normalization and pre-processing, or in biomarker selection, panel construction, formula type selection and derivation, and formula result post-processing.
  • a similar approach can be taken with the traditional laboratory risk factors.
  • Clinical parameters or traditional laboratory risk factors are clinical features typically evaluated in the clinical laboratory and used in traditional global risk assessment algorithms.
  • Clinical parameters or traditional laboratory risk factors for tumor metastasis may include, for example, tumor stage, tumor grade, tumor size, tumor visual characteristics, tumor location, tumor growth, lymph node status, histology, tumor thickness (Breslow score), ulceration, proliferative index, tumor-infiltrating lymphocytes, age of onset, PSA level, or Gleason score.
  • Other traditional laboratory risk factors for tumor metastasis are known to those skilled in the art.
  • Biomarker panels of the invention provide useful information in prognosis of cancer patients.
  • prognosis refers to the prediction of how a disease will progress, including, for example, likelihood or risk of death attributable to cancer within a given period of time (e.g., six months, twelve months, two years, three years, five years, eight years, ten years, fifteen years, or more), cancer recurrence or metastasis; likelihood of recovery;
  • risk relates to the probability that an event (e.g., a metastatic event) will occur over a specific time period, and can mean a subject's “absolute” risk or “relative” risk.
  • Absolute risk can be measured with reference to either actual observation post-measurement for the relevant time cohort, or with reference to index values developed from statistically valid historical cohorts that have been followed for the relevant time period.
  • Relative risk refers to the ratio of absolute risks of a subject compared either to the absolute risks of low risk cohorts or an average population risk, which can vary dependent on how clinical risk factors are assessed.
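The distinction between absolute and relative risk can be illustrated with a short calculation. In this sketch the cohort sizes and event counts are made up for illustration, and the function names are hypothetical.

```python
def absolute_risk(events, cohort_size):
    """Observed fraction of a cohort experiencing the event in the follow-up period."""
    return events / cohort_size


def relative_risk(risk_group_of_interest, risk_reference_group):
    """Ratio of two absolute risks over the same time period."""
    return risk_group_of_interest / risk_reference_group


# Hypothetical cohorts followed for the same period.
risk_high = absolute_risk(events=30, cohort_size=100)   # panel-positive cohort
risk_low = absolute_risk(events=10, cohort_size=200)    # low-risk reference cohort
rr = relative_risk(risk_high, risk_low)
```

In this invented example the absolute risks are 30% and 5%, giving a relative risk of about 6 against the low-risk cohort; the same subject would have a different relative risk against an average-population reference.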
  • a "poor prognosis" may refer to a prognosis in which one or more negative characteristics of cancer are increased as compared to a reference population of cancer patients.
  • poor prognosis may refer to a decreased chance of survival, an increased risk of cancer recurrence, increased malignancy, increased metastatic potential, increased tumor size and growth rate, increased progression of symptoms, or decreased response to treatment.
  • a patient having a poor prognosis is considered "high risk." High risk patients may, for example, have an increased risk of tumor metastasis or recurrence.
  • a favorable prognosis may refer to an increased chance of survival, a decreased risk of recurrence of disease, decreased malignancy, decreased metastasis, decreased metastatic potential, decreased or static tumor size, decreased tumor growth rate, decreased progression of symptoms, or increased response to treatment as compared to the average cancer characteristic in a population of cancer patients.
  • a patient having a favorable prognosis is considered "low risk."
  • Low risk patients may, for example, have a decreased risk of tumor metastasis or recurrence.
  • Risk calculation is statistical. "Statistically significant" means that an alteration is greater than what may be expected to happen by chance alone (which would be a "false positive"). Statistical significance can be determined by methods well known in the art. Commonly used measures of significance include the p-value, which represents the probability of obtaining a result at least as extreme as a given data point, assuming the data point was the result of chance alone. A result is often considered highly significant at a p-value of 0.05 or less (e.g., 0.005 or 0.0005 or less).
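As one concrete way to obtain such a p-value, the sketch below applies a one-sided Fisher's exact test to a 2x2 table of risk group versus outcome. The choice of test is an assumption made for illustration (the text above does not prescribe one), the function name is hypothetical, and the counts are invented.

```python
from math import comb


def fisher_exact_one_sided(a, b, c, d):
    """One-sided Fisher's exact p-value for the table [[a, b], [c, d]]:
    probability of seeing at least `a` events in the first row by chance alone,
    with all row and column totals held fixed (hypergeometric null)."""
    row1 = a + b          # size of the first group
    col1 = a + c          # total number of events
    n = a + b + c + d
    upper = min(row1, col1)
    total = comb(n, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / total


# Hypothetical counts: 8 of 10 "high risk" patients had the event vs 1 of 10 "low risk".
p_value = fisher_exact_one_sided(8, 2, 1, 9)
```

For these made-up counts the p-value is 460/167960, roughly 0.0027, which falls below the 0.05 threshold mentioned above.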
  • the probability of occurrence of an undesired clinical event in a patient classified as "high risk" is at least 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99%. In some embodiments, the probability of occurrence of an undesired clinical event in a patient classified as "low risk" is no more than 30%, 25%, 20%, 15%, 10%, 5%, 4%, 3%, 2%, or 1%.
  • a biomarker panel also may aid the diagnosis and staging (e.g., stage I, II, III, or IV) of cancer.
  • the biomarker score of a patient can be compared to a reference, index or baseline biomarker value that is obtained from subjects known to have cancer or a particular stage of cancer, or from subjects that are cancer-free, or from subjects with low-risk cancer.
  • the comparison is analyzed to conclude the diagnosis or staging.
  • the comparison may be used to identify a cancer patient in need of a further diagnostic procedure (e.g., a sentinel lymph node biopsy).
  • the comparison may be used to identify a cancer patient that is not in need of a further diagnostic procedure (e.g., a sentinel lymph node biopsy).
  • Identifying a subject with a favorable prognosis empowers a decision to avoid or delay various therapeutic interventions or treatment regimens. Such subjects may not require treatments if their tumors are molecularly hard-wired to remain indolent, cured by surgical excision alone, and/or unlikely to metastasize. Likewise, identifying a subject with a poor prognosis empowers a decision to use more aggressive treatment strategies. Thus, the prognostic and diagnostic information obtained with the present biomarker panels enables the selection and initiation of suitable treatments or therapeutic regimens to delay, reduce, or prevent progression of the cancer while avoiding unnecessary morbidity associated with cancer treatments.
  • the methods and panels of this invention can be used to identify patients in need of "adjuvant therapy," that is, therapy given in conjunction with surgery, after which little or no evidence of residual disease can be detected.
  • Adjuvant therapy is given to reduce the risk of disease recurrence, either local or metastatic.
  • Adjuvant therapy can include, for example, radiation therapy, chemotherapy, hormone therapy, experimental therapy (e.g., as part of a clinical trial), neo-adjuvant therapy (therapy administered prior to the primary therapy), and targeted therapy.
  • Targeted therapy entails the use of biologics that inhibit or enhance the function of a molecular target, or a signaling pathway associated therewith, in cancer cells.
  • Targeted therapy associated with methods of this invention may include therapy that targets one or more PDs described herein and their associated signaling pathways, including molecular inhibitors of key intracellular drug targets operating in cancer survival pathways, like inhibitors of B-Raf (like PLX-4032), the serine/threonine kinase Akt, the serine/threonine kinase MEK, the serine/threonine kinase p90RSK, the lipid kinase PI3K, and the serine/threonine kinase IKK2.
  • test agents can be small molecule inhibitors or therapeutic antibodies directed against cell surface proteins, like receptor tyrosine kinases, including EGFR, HER2/NEU, HER3, IGF-1R, c-Met, VEGFR, Axl, Eph receptors, etc.
  • MEK inhibitors Akt inhibitors
  • PI3K inhibitors PI3K inhibitors
  • CTLA4-directed therapies and Interferon
  • the biomarker panels of this invention can also be used to aid the selection of an appropriate adjuvant therapy. For example, one can obtain a biomarker profile from a patient before a proposed treatment, or from a reference subject (e.g., an individual or population having no cancer, or having non-metastatic cancer, or having improvements in risk factors (e.g., clinical parameters or traditional laboratory risk factors)). A difference in the biomarker scores between the test sample and the reference sample may indicate that a treatment is suitable for administration, whereas a similarity between the two samples may indicate that the treatment is not suitable.
  • the methods of treating a cancer patient may include the selection of suitable treatment for a particular patient that is directed to their unique physiology. Differences in the genetic makeup of cancer patients can result in different abilities to metabolize various drugs, which may modulate the symptoms or risk factors of cancer or metastatic events. Accordingly, a biomarker panel, alone or in combination with known genetic factors for drug metabolism, may be used to predict whether a candidate cancer therapeutic will be suitable for treating a particular cancer patient.
  • embodiments may further comprise predicting or diagnosing adverse side effects associated with administration of the treatment.
  • the biomarker panels of the invention further provide a means for monitoring the progression of cancer in a subject, for example, by screening for changes in marker expression associated with a cancer.
  • these methods comprise determining biomarker levels or scores in a subject-derived sample (e.g., a cancerous tissue sample), comparing these to the biomarker levels or scores in a reference sample, and identifying alterations in amounts of the levels or scores in the subject sample compared to the reference sample. These measurements may be repeated over a clinically relevant period of time, for example, six months, twelve months, two years, three years, five years, eight years, ten years, fifteen years, or more.
  • the reference sample is from a non-cancerous tissue, or from a subject with non-metastatic cancer
  • increasing similarities between the biomarker levels indicate that the cancer is not progressing, or is regressing.
  • increasing differences between the biomarker levels may indicate that the cancer is progressing.
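  • As a non-limiting illustration, the comparison of a subject's biomarker levels to a reference over repeated measurements may be sketched as follows. The use of Euclidean distance as the similarity measure, the function names, and the marker values are assumptions made for illustration; the disclosure does not prescribe a particular distance.

```python
import math


def profile_distance(sample, reference):
    """Euclidean distance between two biomarker profiles keyed by marker name."""
    return math.sqrt(sum((sample[m] - reference[m]) ** 2 for m in reference))


def progression_trend(timepoints, reference, tolerance=1e-9):
    """Compare first and last time points against a reference profile.

    Returns ("diverging" | "converging" | "stable", list_of_distances).
    "diverging" corresponds to increasing differences from the reference,
    "converging" to increasing similarity.
    """
    distances = [profile_distance(s, reference) for s in timepoints]
    change = distances[-1] - distances[0]
    if change > tolerance:
        trend = "diverging"
    elif change < -tolerance:
        trend = "converging"
    else:
        trend = "stable"
    return trend, distances
```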
  • a biomarker panel may be used to monitor the course of treatment in a subject.
  • a biological sample can be provided from a subject undergoing treatment regimens, e.g., drug treatments, for cancer.
  • the biological samples may be obtained from the subject at various time points before, during, or after treatment. Comparison of the levels of the biomarkers at various time points will indicate the effectiveness of the course of treatment. For example, if the biomarker scores or profile of the cancerous tissue from the patient return to a baseline value measured in one or more subjects without a metastatic or recurrent tumor, or in subjects who do not exhibit traditional risk factors for metastatic disease, the treatment may be considered successful.
  • a biomarker panel also may be used in stratifying patients for inclusion in a clinical trial.
  • biomarker levels or scores may be obtained from a cancerous tissue sample and compared to a reference or index value.
  • the biomarker panels of this invention also may be used to provide a report that is useful in a clinical setting.
  • the report may include biomarker scores and associated prognostic information, including likelihood of long-term survival, cancer recurrence, and cancer metastasis, as well as treatment recommendations based on the prognostic information.
  • the biomarker panels can be used to identify useful cancer treatments. By following changes in the biomarker profile of patients over the course of an experimental therapy, one can determine whether the therapy is efficacious in treating these patients.
  • VIII. THE ROLE OF PATHWAY CONTEXT GENES
  • the strong correlation between the present PDs and oncologic clinical outcome indicates that the PDs are involved in tumor progression (e.g., as an oncogenic facilitator or a tumor suppressor) through cellular pathways that lead to or block tumor advancement.
  • the present invention provides methods and compositions that use other molecular entities in these cellular pathways as surrogate biomarkers or therapeutic targets.
  • a particular PD may not be the ultimate driver of tumor progression in a pathway, but it may be downstream or upstream of the true driver in the pathway.
  • certain PDs may be replaced with other molecular biomarkers that can serve as functional readouts of the perturbed pathways causing early stage cancers to progress.
  • the PDs can be said to function in a pathway context.
  • Pathway context is a term used to describe clinically relevant molecular alterations that result in perturbed or deregulated signal transduction pathways in diseased cells.
  • Using measurement methods described herein (e.g., the AQUA® technology) in conjunction with pathway mapping, one may identify the true drivers of tumor progression. For example, one can analyze perturbed or deregulated pathway activity using phosphoproteomics or phosphoantibodies directed against key proteins inside the cells that are regulated by the mutated or otherwise molecularly altered pathway context gene products. Using quantitative technology, for example AQUA, in conjunction with analysis of the signal transduction pathway activity inside aggressive cancer cells (so-called 'pathway analysis'), the true drivers of tumor progression may be identified. These drivers will have an important effect on prognosis and can also be used as therapeutic targets.
  • a PD is a true driver of tumor progression
  • the pathway can be mapped and the PD's role in the pathway can be elucidated. This information may be used as a molecular stratifier of the patient population with a certain tumor type.
  • drug targets can be designed to critical upstream components of the pathway or the PD itself. The efficacy of the drug in inhibiting or activating the pathway can immediately be tested, for example, by using AQUA to quantitate expression of the PD. In this manner, AQUA can be utilized for testing the effectiveness of targeted therapeutics.
  • pathway context is thought to be causally involved in the phenotype of the disease.
  • LOF: loss-of-function
  • GOF: gain-of-function
  • pathway context genes denote the genes that can undergo molecular alterations involved in pathway context. Examples of pathway context genes that are particularly well studied in human cancer include the oncoproteins Ras (for a detailed review see Vigil D, Cherfils J, Rossman KL, Der CJ.; Nat Rev Cancer. 2010;
  • B-Raf can serve as a pathway context gene product, but it is also itself regulated by another pathway context gene product(s), namely N-Ras and K-Ras.
  • B-Raf, N-Ras and K-Ras are mutated in ~50%, 30% and 10% of human melanomas, respectively (Martin MJ, Carling D, Marais R., Cancer Cell. 2009 Mar 3;15(3):163-4). These mutations result in hyperactivation of the Ras-Raf-MEK-ERK signaling cascade.
  • B-Raf itself is a popular drug target for small molecule inhibitors, and such inhibitors efficiently shut down signal transduction in B-Raf-V600E-mutant cancers that are Ras wild type, whilst the effect of inhibiting B-Raf in Ras-GOF-mutant cancers is an activation of its downstream target ERK.
  • the Raf inhibitor PLX-4032 has been shown to have remarkable overall response rate and progression-free survival of 81% and over 6-7 months in B-Raf-V600E mutant patients that are wild type for Ras (Poulikakos PI, Rosen N.; Cancer Cell. 2011 Jan 18; 19(1): 11-5 and references therein). This is an example of N-Ras pathway context function of B-Raf.
  • the present invention provides biomarker panels that are indicative of a general physiological pathway associated with a cancer.
  • the present invention provides biomarker panels that are specifically indicative of a particular pathway context, but not other pathway contexts.
  • an additional pathway component also may be identified with methods known to those skilled in the art, including mass spectroscopy, phosphoproteomics or phosphoantibodies directed against key proteins inside the cells. These other pathway components may be used in a biomarker panel in addition to or as a substitute for one or more of the biomarkers in the panel, provided they share certain defined characteristics of a good biomarker. These characteristics may include analytically important characteristics such as levels of the biomarker that may be measured at a useful signal to noise ratio.
  • the additional pathway component may be used as a target for therapy (e.g., adjuvant therapy).
  • the levels of biomarkers in a panel may be measured using a kit with detection reagents that specifically detect and quantify the biomarker analytes.
  • the detection reagents may have been detectably labeled, or the kit provides labeling reagents for conjugation to the detection reagents.
  • the kit may comprise an array of detection reagents, e.g., antibodies and/or oligonucleotides that can bind to biomarker proteins (or fragments thereof) or nucleic acids, respectively.
  • the biomarkers are proteins and the kit contains antibodies that bind to the biomarkers.
  • the biomarkers are nucleic acids and the kit contains oligonucleotides or aptamers that bind to the biomarkers.
  • the oligonucleotides may be fragments of the biomarker genes.
  • the oligonucleotides can be 200, 150, 100, 50, 25, or fewer nucleotides in length.
  • a kit also may contain in separate containers a nucleic acid or antibody (alone, or already bound to a solid matrix or packaged separately with reagents for binding them to the matrix), control formulations (positive and/or negative), and/or a detectable label such as fluorescein, green fluorescent protein, rhodamine, cyanine dyes, Alexa dyes, quantum dots, luciferase, and radiolabels, among others.
  • Instructions (e.g., written, tape, VCR, CD-ROM, and/or DVD) for carrying out the assay may be included in the kit.
  • the biomarker detection reagents provided in a kit can be immobilized on a solid matrix such as a porous strip to form at least one biomarker detection site.
  • the measurement or detection region of the porous strip may include a plurality of sites containing, for example, a nucleic acid or antibody, and may optionally contain sites for negative and/or positive controls. Alternatively, control sites can be located on a separate strip from the test strip.
  • the different detection sites may contain different amounts of biomarker detection reagents, e.g., a higher amount in the first detection site and lesser amounts in subsequent sites.
  • the number of sites displaying a detectable signal may provide a quantitative indication of the amount or level of biomarkers present in the sample.
  • the detection sites may be configured in any suitably detectable shape and can be in the shape of a bar or dot spanning the width of a test strip.
  • a kit comprises a nucleic acid substrate array comprising one or more nucleic acid sequences that specifically identify one or more biomarker nucleic acid sequences.
  • the substrate array can be on a solid substrate (for example, a "chip” such as a microarray chip (see, e.g., U.S. Patent 5,744,305)).
  • the substrate array can be a solution array, e.g., xMAP (Luminex, Austin, TX), Cyvera (Illumina, San Diego, CA), CellCard (Vitra Bioscience, Mountain View, CA) and Quantum Dots' Mosaic (Invitrogen, Carlsbad, CA).
  • a kit comprises an antibody substrate array comprising one or more antibodies that specifically identify one or more biomarker proteins (e.g., an array for performing an immunoassay such as an ELISA assay or AQUA®).
  • Example 1 Quantification of PD nucleic acid levels in human cancer
  • Cell lines (WM-115, WM-266.4, SK-MEL-2, SK-MEL-5, SK-MEL-24, SK-MEL-28, SK-MEL-31, RPMI-7951, A375 and NHEM neo) were obtained from ATCC and grown using conditions described by the vendor. They were harvested when in log growth phase and frozen at -80°C. Total RNA was isolated from frozen cell pellets containing ~5x10^6 cells using a Qiagen RNeasy Plus Mini kit (#74134), followed by quantification using the
  • RNA samples were analyzed for potential contamination with genomic DNA using an Applied Biosystems (ABI) Taqman assay (PTEN, Hs02621230_s1). Upon verification that the samples analyzed contained no interfering genomic DNA, 2 µg of total RNA was converted to cDNA using the ABI High Capacity cDNA Reverse Transcription Kit (#4368813), according to the manufacturer's instructions.
  • Standard Taqman reagents were used in a total reaction volume of 25 µl containing 20 ng of cDNA per well. Duplicate wells were assayed on an ABI StepOne Plus instrument using universal thermal cycling conditions of 50°C for 2 minutes, 95°C for 10 minutes, followed by 40 cycles of 95°C for 15 seconds and 60°C for 1 minute.
  • ZNF592 was determined to have the lowest variation in expression between all cell lines, and hence was chosen for use as an endogenous control and used to normalize the other expression levels.
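  • As a non-limiting illustration, the selection of an endogenous control by lowest variation and the normalization of the other expression levels to it may be sketched as follows. The delta-Ct formula (which assumes approximately 100% amplification efficiency), the function names, and the Ct values are assumptions made for illustration; the text above states only that the least variable gene was used for normalization.

```python
from statistics import pstdev


def pick_endogenous_control(ct_by_gene):
    """Return the gene whose Ct values vary least across cell lines."""
    return min(ct_by_gene, key=lambda gene: pstdev(ct_by_gene[gene]))


def relative_expression(ct_by_gene, control):
    """Normalize each gene to the control, per cell line.

    Uses 2 ** -(Ct_gene - Ct_control), i.e. the delta-Ct method, which is an
    assumption here rather than the documented procedure.
    """
    reference = ct_by_gene[control]
    return {
        gene: [2.0 ** -(ct - ref) for ct, ref in zip(cts, reference)]
        for gene, cts in ct_by_gene.items()
        if gene != control
    }


# Hypothetical Ct values for three cell lines.
ct_values = {
    "ZNF592": [25.0, 25.1, 24.9],
    "CDH2": [23.0, 27.1, 24.9],
}
control_gene = pick_endogenous_control(ct_values)
normalized = relative_expression(ct_values, control_gene)
```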
  • mRNA expression level for another PD is almost universally increased across all cell lines. This is consistent with the phenomenon of 'cadherin switching', which is defined by decreased E-cadherin and increased N-cadherin expression and is observed in various tumors undergoing a more invasive and metastatic phenotype (Hazan RB, Qiao R, Keren R, Badano I, Suyama K; Ann N Y Acad Sci. 2004;1014:155-63). Given that malignant melanoma cell lines in general have been generated from advanced and highly invasive tumors, the observed expression pattern of PDs would be expected to correlate with higher risk of melanoma progression. Indeed, we observed that the general pattern of PD protein expression in higher risk groups was mostly consistent with PD mRNA expression in malignant melanoma cell lines (compare Figs. 1 and 2).
  • Example 2 Quantification of PD protein levels in human cancer
  • Protein lysates were prepared using RIPA buffer (Thermo Scientific, Rockland, IL) from the nine melanoma cell lines (A375, RPMI7951, SK-MEL2, SK-MEL5, SK-MEL24, SK-MEL28, SK-MEL31, WM115) and the normal melanocyte NHEM cell line.
  • the protein lysates were quantified using Pierce BCA Protein Assay kit (Thermo Scientific). For each sample, 8 µg of protein was loaded onto 4-20% Precast Mini-Protean TGX gels (Bio-Rad, Hercules, CA) for electrophoresis and transferred to PVDF membrane (Bio-Rad).
  • the membrane was blocked in 5% milk TBS-T (0.1% Tween 20, 25 mM Tris, 0.15 M NaCl, pH 7.2) at room temperature for 1 hr. The membrane was then incubated with the primary antibody in 0.3% BSA TBS-T at room temperature for 1 hr or at 4°C overnight.
  • the detailed information of the antibodies is shown in Table 2. Briefly, for the primary antibodies used, a range of 1:100 to 1:10,000 dilutions was chosen based on the manufacturer's recommended protocol. The membrane was washed in TBS-T three times, for 5 min each.
  • the membrane was subsequently incubated with HRP-conjugated secondary antibody (Invitrogen Molecular Probes, Eugene, OR) in 3% milk TBS-T at room temperature for 1 hr.
  • the membrane was washed in TBS-T three times, for 5 min each, before chemiluminescent detection.
  • Negative controls were obtained by omitting the target protein primary antibody. All procedure steps were completed at room temperature. The following dilutions of the indicated primary antibody were used: KIF2C (1:300); CDH2 (1:300); DEPDC1 (1:1K); CD44 (1:300); CD117 (1:100); SPARC (1:3K); FSCN1 (1:1K); PCNA (1:300); MMP1 (1:1K) and ANLN (1:3K). These results demonstrate that PD expression levels vary not only in melanoma derived cell lines, as observed by Western blotting and Taqman qRT-PCR technologies, but importantly, also at the protein level across different human melanoma tumors per se (Fig.
  • the ability to determine PD expression levels is of great importance for prognosis. For example, if the PD is a driver of tumor progression or a tumor suppressor, its high expression levels in an early stage tumor will correlate with poor outcome in patients or favorable prognosis, respectively. Conversely, low expression levels in early stage tumors will correlate with favorable prognosis or poor prognosis, respectively. Thus, our observations further emphasize the critical relevance of developing an unbiased method to quantitate and determine tumor specific protein expression levels for accurate prognostic determination.
  • Endogenous peroxidase activity was blocked with 0.3% hydrogen peroxide for 5 min followed by 10 min incubation with a Background Sniper block.
  • a mouse antibody directed at one of the target proteins multiplexed with rabbit polyclonal anti-S100B (DAKO) was applied to each slide for 60 min, the latter to distinguish the regions corresponding to melanoma from surrounding tissue in the absence of counterstain.
  • Target mouse monoclonal antibodies were as follows: KIF2C (1:300); CDH2 (1:300); DEPDC1 (1:10K); CD44 (1:30K); CD117 (1:100); SPARC (1:30K); FSCN1 (1:1K); PCNA (1:30K); MMP1 (1:1K); and ANLN (1:3K).
  • the secondary antibodies, Alexa 546-conjugated goat anti-rabbit (1:100; Molecular Probes) diluted into Envision anti-mouse (neat; DAKO) were applied for 30 min followed by 10 min Cy5-tyramide signal amplification (Perkin-Elmer Life Sciences) incubation to amplify the target signal. Finally, nuclei were visualized by counterstaining with 4,6-diamidino-2-phenylindole (DAPI) prolong gold. Negative controls were obtained by omitting the target protein primary antibody. All procedure steps were completed at room temperature.
  • Perturbations in cellular signaling pathways drive tumor progression.
  • it is important to selectively quantitate PD expression in the tumor cells while excluding surrounding cells such as stromal cells and infiltrating lymphocytes.
  • the subcellular localization of certain PDs is often very informative. For instance, if a given PD is a transcription factor that has a tumor suppressor role in tumor development, its nuclear localization is likely indicative of its activation status and hence is likely correlated with a favorable prognosis, while its cytoplasmic localization will be correlated with a poor prognosis.
  • matrix metalloproteinase MMP1. This protein is a PD involved in the breakdown of extracellular matrix during metastasis.
  • Definiens TissueStudio™ product is based on cognition network technology, which consists of semantic networks of objects and their mutual relationships (so-called 'object recognition').
  • tissue segmentation, tumor regions of interest, and either nuclear or non-nuclear attributes of the cells are identified automatically using Definiens Composer segmentation and classification tools in TissueStudio software (Definiens AG, Munchen). An example of Definiens TissueStudio applied for PD quantitation is shown in Fig. 5, wherein TissueStudio™ software is used to quantitatively assess the expression level of Fascin within the tumor and its subcellular compartments.
  • Automated Quantitative Analysis (AQUA)
  • Compartmentalization of each sample and quantitation of the target protein signal within each compartment are executed as follows.
  • Alexa 546 signal is used to represent S100 staining and is binary gated (i.e., a threshold is established to "bin" cells into either S100 positive or S100 negative groups) to indicate the tumor mask.
  • the nuclear compartment is defined by applying a rapid exponential subtraction algorithm to the DAPI channel images, which restricts the nuclear compartment assignment to only those pixels that show any positive DAPI signal within the plane of focus.
  • the nonnuclear compartment is then defined by the Pixel-Based Locale Assignment for
  • AQUA scores were calculated as the average AQUA score for each of the individual pixels included in the selected compartment and were reported on a scale of 0-4095.
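  • As a non-limiting illustration, the compartment-based averaging described above may be sketched as follows. This is not the AQUA® software: fixed intensity thresholds stand in for the binary gating of the S100 channel and for the rapid exponential subtraction algorithm applied to the DAPI channel, and the function name, parameter names, and pixel values are hypothetical.

```python
def compartment_score(target, s100, dapi, s100_threshold, dapi_threshold,
                      compartment="tumor"):
    """Mean target-channel intensity over pixels in the chosen compartment.

    target, s100, dapi: equal-length flat sequences of pixel intensities.
    compartment: "tumor" (all pixels under the tumor mask), "nuclear", or
    "non_nuclear". Returns None when no pixel qualifies.
    """
    values = []
    for t, s, d in zip(target, s100, dapi):
        if s < s100_threshold:
            continue  # pixel lies outside the S100-defined tumor mask
        is_nuclear = d >= dapi_threshold
        in_compartment = (
            compartment == "tumor"
            or (compartment == "nuclear" and is_nuclear)
            or (compartment == "non_nuclear" and not is_nuclear)
        )
        if in_compartment:
            values.append(t)
    return sum(values) / len(values) if values else None


# Hypothetical four-pixel image; intensities on a 0-4095 scale.
target_px = [100, 200, 300, 4000]
s100_px = [10, 50, 60, 70]
dapi_px = [0, 5, 40, 0]
tumor_score = compartment_score(target_px, s100_px, dapi_px, 50, 20, "tumor")
nuclear_score = compartment_score(target_px, s100_px, dapi_px, 50, 20, "nuclear")
```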
  • a high AQUA score indicates a high level of expression of the candidate PD in the analyzed tumor section whereas a low AQUA score indicates a low level of expression or an absence of expression in the tumor section.
  • AQUA scores are an advantageous method of quantitation because they are a continuous variable, not restricted to the historical categories (0, 1+, 2+, 3+) of scoring.
  • AQUA's utilization of a continuous scoring scale allows it to capture the diverse and varied pathway deregulations in human cancer. Further, the AQUA score generated for a tumor section is unique to the properties and staining of that individual section.
  • tumor tissue microarrays (TMAs)
  • the discovery array (YTMA59) included single cores from the 192 cutaneous melanomas as well as a series of cell line and human tumor samples as controls.
  • the validation cohort was cored twice and arrayed onto two recipient blocks to create a tissue microarray (TMA) with 2-fold redundancy (YTMA76). It also contained 60 randomly selected individuals from YTMA59 that served as normalization controls for immunofluorescent staining. For details, see Gould-Rothberg et al, J Clin Oncol, 2009;
  • the discovery of these proteins included identification of the genes that were differentially up- or down-regulated in a metastatic melanoma model based on an inducible c-Met overexpressing transgenic mouse on a p19/ARF -/- background versus a non-metastatic inducible H-Ras, p19/ARF -/- transgenic mouse.
  • the differentially regulated genes were subjected to cross-species oncogenomic comparison with human genes that exhibited corresponding copy number alterations in metastatic versus non-metastatic melanoma.
  • Monoclonal antibodies for each of the 10 PDs were optimized for AQUA.
  • each antibody was tested on melanoma test arrays using the HistoRx PM-2000 platform to establish optimal antibody titer.
  • a titration curve analysis was performed in which AQUA score was plotted for a given antibody tested at 1:10, 1:30, 1:100, 1:300, 1:1k, 1:3k, 1:10k, 1:30k, 1:100k, 1:300k, or 1:1000k dilutions, or with no primary antibody. For each dilution, the ratio between the highest and lowest AQUA score was calculated.
  • the subcellular distribution (e.g. nuclear versus cytoplasmic) was quantitatively assessed at multiple dilutions of the primary antibody.
  • the optimal dilution for each antibody was employed to stain the YTMA76 cohort (Fig. 7).
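  • As a non-limiting illustration, the per-dilution ratio between the highest and lowest AQUA score may be computed as sketched below. Treating the dilution with the largest ratio as optimal is an assumption made for illustration, as are the function names and the scores shown; the text above does not state the selection criterion.

```python
def dynamic_range_by_dilution(scores_by_dilution):
    """Map each dilution label to max(score) / min(score) across test cores.

    Assumes all scores are strictly positive.
    """
    return {
        dilution: max(scores) / min(scores)
        for dilution, scores in scores_by_dilution.items()
    }


def best_dilution(scores_by_dilution):
    """Return the dilution with the widest high-to-low ratio (assumed criterion)."""
    ratios = dynamic_range_by_dilution(scores_by_dilution)
    return max(ratios, key=ratios.get)


# Hypothetical AQUA scores from a melanoma test array.
titration = {
    "1:100": [900, 3000],
    "1:1k": [200, 3200],
    "1:10k": [150, 600],
}
chosen = best_dilution(titration)
```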
  • the data was analyzed using the AQUA version 2.3.3.2 software.
  • AQUA scores in the nuclear and non-nuclear compartment, as well as total AQUA scores under the tumor mask were exported for each core.
  • Selection Cox Regression model was deployed on the log-2 transformed AQUA and two-level discretized AQUA data separately, yielding four different algorithmic approaches.
  • each PD was either included or excluded from the derived model.
  • An algorithm-specific biomarker importance score is defined as the number of times an individual PD occurred in the 200 bootstrapping models, weighted by the model's performance (C-index) on the internal validation samples (e.g., the remaining samples not chosen by the resampling).
  • PDs were ranked by the sum of their importance scores (SumScores) across the four algorithms with the top ones corresponding to the highest absolute aggregated scores.
  • the top-K (wherein K can be any number of the ten markers) were subsequently selected for inclusion into the final prognostic models, constructed by a multivariate Cox regression algorithm (see, e.g., Fig. 9).
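  • As a non-limiting illustration, the importance scoring and ranking described above may be sketched as follows. The sketch reads "weighted by the model's performance" as adding the model's C-index each time a PD appears in a bootstrapping model; that reading, the function names, and the example values are assumptions made for illustration.

```python
from collections import defaultdict


def importance_scores(bootstrap_models):
    """Sum C-index over the bootstrap models in which each marker appears.

    bootstrap_models: iterable of (selected_markers, c_index) pairs, one per
    bootstrapping model of a single algorithm.
    """
    scores = defaultdict(float)
    for selected, c_index in bootstrap_models:
        for marker in selected:
            scores[marker] += c_index
    return dict(scores)


def rank_by_sum_score(per_algorithm_scores, top_k):
    """Rank markers by the sum of importance scores across algorithms."""
    totals = defaultdict(float)
    for scores in per_algorithm_scores:
        for marker, value in scores.items():
            totals[marker] += value
    # Highest absolute aggregated score first; marker name breaks ties.
    ranked = sorted(totals, key=lambda m: (-abs(totals[m]), m))
    return ranked[:top_k]


# Two hypothetical algorithms, two bootstrapping models each.
algo_a = importance_scores([({"CDH2", "MMP1"}, 0.7), ({"CDH2"}, 0.6)])
algo_b = importance_scores([({"MMP1", "PCNA"}, 0.65), ({"CDH2"}, 0.7)])
top_two = rank_by_sum_score([algo_a, algo_b], top_k=2)
```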
  • Another approach to variable selection involves a stepwise process. Starting from an empty list, firstly we exhaustively examine Cox regression models composed by all 1-marker and 2-marker combinations. The best model is chosen so as to maximize the segregation of the low risk/high risk population. Hence this approach yields two types of models, one that is sensitive to the low risk population by maximizing the sensitivity and negative prediction rate (NPR) (which is preferably 50% or greater), while the other is sensitive to high risk population by maximizing the specificity and positive prediction rate (PPR). At the next step, one or two additional markers are added to the already established best performing set of markers. Again, the additional markers are chosen so as to maximize the sensitivity/NPR of low risk model or the specificity/PPR of high risk model.
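  • As a non-limiting illustration, the stepwise addition of one or two markers at a time may be sketched as a generic greedy search. The scoring callable stands in for fitting a Cox regression model and evaluating, e.g., sensitivity/NPR or specificity/PPR; it, the function name, and the stopping rule (stop when no addition improves the score) are assumptions made for illustration.

```python
from itertools import combinations


def greedy_forward_selection(markers, score_fn, max_size, step_sizes=(1, 2)):
    """Grow a marker set by repeatedly adding the 1 or 2 markers that most
    improve score_fn(marker_list). Returns (selected_markers, best_score)."""
    selected = []
    best_score = float("-inf")
    while len(selected) < max_size:
        remaining = [m for m in markers if m not in selected]
        candidates = [
            combo
            for k in step_sizes
            if len(selected) + k <= max_size
            for combo in combinations(remaining, k)
        ]
        if not candidates:
            break
        top = max(candidates, key=lambda c: score_fn(selected + list(c)))
        top_score = score_fn(selected + list(top))
        if top_score <= best_score:
            break  # no candidate improves on the current set
        selected += list(top)
        best_score = top_score
    return selected, best_score


# Toy scoring function (hypothetical): additive per-marker contributions.
toy_weights = {"A": 3, "B": 2, "C": -1}
chosen_markers, chosen_score = greedy_forward_selection(
    ["A", "B", "C"], lambda ms: sum(toy_weights[m] for m in ms), max_size=3
)
```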
  • Fig. 16 presents results for a version of the variable selection algorithm that favors variable combinations that yield models which are very sensitive in detecting the low risk population. This is a greedy algorithm which iteratively adds to an already established group the one or two additional variables that improve best the sensitivity of the model in the low risk group.
  • the table presents the results for the training set.
  • the Kaplan-Meier curves present the performance on both the training and test set for the model with six variables. We see that for both sets the sensitivity in the low risk group is 1.
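  • As a non-limiting illustration of how a Kaplan-Meier curve is obtained from follow-up data with censoring, the product-limit estimator may be sketched as follows. The function name and the example data are hypothetical and do not reproduce the cohorts or figures referenced herein.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times: follow-up time per patient.
    events: 1 if the event (e.g., recurrence) was observed, 0 if censored.
    Returns [(time, survival_probability)] at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_removed
    return curve


# Hypothetical follow-up: events at t=1 and t=3, censoring at t=2 and t=4.
km_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 0])
```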
  • Fig. 17 presents results for a version of the variable selection algorithm that favors variable combinations that yield models which are geared towards identifying the high risk population.
  • the algorithm is analogous to the one for identifying the low risk population (see legend of Fig. 16) except that variables that improve the specificity are added at each stage.
  • the table presents the performance of the algorithm on the training data while the Kaplan Meier curves demonstrate the performance on both training and test data for the model with six variables.
  • the high risk population has a relatively high fraction of recurrence events, as well as relatively short follow-up times for the censored cases.
  • Fig. 18 illustrates that multiple models can be combined to improve population segregation.
  • the model that effectively segregates the low risk population is combined with the model that effectively segregates the high risk population to yield stratification into three classes: high risk, medium risk, and low risk.
  • the model scores consist of a linear combination of the log2 AQUA scores for the markers weighted by the coefficients for the two models indicated in the two tables.
  • the thresholds for the model scores used to segregate populations are indicated in the decision tree below the table.
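  • As a non-limiting illustration, the combination of a low-risk model and a high-risk model into a three-class stratification may be sketched as follows. The order in which the two models are consulted, the direction of each threshold, and all coefficients, cutoffs, and names are assumptions made for illustration; the actual values appear in the tables and decision tree of Fig. 18, which are not reproduced here.

```python
import math


def model_score(aqua_scores, coefficients):
    """Linear combination of log2 AQUA scores weighted by model coefficients."""
    return sum(c * math.log2(aqua_scores[m]) for m, c in coefficients.items())


def stratify(aqua_scores, low_model, low_cutoff, high_model, high_cutoff):
    """Assign "low risk", "high risk", or "medium risk" (assumed tree order)."""
    if model_score(aqua_scores, low_model) < low_cutoff:
        return "low risk"
    if model_score(aqua_scores, high_model) > high_cutoff:
        return "high risk"
    return "medium risk"


# Hypothetical patient and hypothetical coefficients.
patient = {"CDH2": 1024, "MMP1": 256}
low_model = {"CDH2": 1.0}   # score = 10.0 for this patient
high_model = {"MMP1": 0.5}  # score = 4.0 for this patient
risk_class = stratify(patient, low_model, 5.0, high_model, 3.0)
```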
  • Fig. 19 demonstrates the effectiveness of the combined model for segregating low and high risk populations illustrated in Fig. 18. Kaplan-Meier curves for both training and test cohorts are presented.
  • the variable selection and model construction phases yield a set of PDs and a set of coefficients.
  • the score of each sample is obtained as a linear combination of the log2 AQUA score of the PDs weighted by the coefficients.
  • Optimal cutoffs are selected to optimize model performance on the training set.
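  • As a non-limiting illustration, a cutoff may be chosen on training data by scanning the observed model scores. The performance measure used below (sensitivity plus specificity minus one, sometimes called Youden's J) is an assumption, since the text above does not name the measure; the sketch also ignores censoring and treats the outcome as a binary event label. The function name and data are hypothetical.

```python
def select_cutoff(scores, events):
    """Return (cutoff, J) maximizing sensitivity + specificity - 1.

    scores: model score per training patient.
    events: 1 if the undesired clinical event occurred, else 0.
    Patients with score >= cutoff are called high risk.
    Assumes at least one patient with and one without the event.
    """
    positives = sum(events)
    negatives = len(events) - positives
    best_cutoff, best_j = None, float("-inf")
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, e in zip(scores, events) if e and s >= cutoff)
        true_neg = sum(1 for s, e in zip(scores, events) if not e and s < cutoff)
        j = true_pos / positives + true_neg / negatives - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical training scores and outcomes.
training_cutoff, training_j = select_cutoff([1, 2, 3, 4], [0, 0, 1, 1])
```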
  • stepwise forward selection was employed with a Cox regression algorithm to select three PDs into a linear model.
  • multiple subcellular compartments, e.g., the nuclear compartment ("NUCLEAR") as well as the total nuclear plus cytoplasmic signal present within the tumor mask ("TUMOR"), for N-cadherin (CDH2) significantly contributed to the model (Fig. 10).
  • prognostic models could be successfully developed using all 10 PDs, or various subsets thereof (Fig. 12). In the latter two cases subsets of PDs are selected from the ranked list of PDs generated by the algorithm of Fig. 8, and coefficients are obtained by training a Cox proportional hazard model using these coefficients.
  • the prognostic models were demonstrated to be independent of standard clinical parameters (Fig. 20). Specifically, the PD-based prognostic models could identify high-risk patients typically considered “low-risk” given a relatively “thin” melanoma tumor at time of diagnosis. Similarly, the PD-based prognostic models identified high risk patients that were SLN negative (Fig. 15 and 20), as discussed above.
  • a linear model combining the expression profiles of the ten PDs was built for the cohort using multivariate Cox proportional hazard regression that assigned a risk score to every patient in the cohort.
  • This risk score threshold dichotomized the cohort into low-risk (lower scores) and high-risk (higher scores) groups for biochemical recurrence (Fig. 21). The significance of our prognostic model is assessed by hazard ratio between the two groups, P values in log-rank test, and C-indexes.
  • the breast cancer dataset was clustered by hierarchical clustering and the two main clusters were used to identify two populations. All 10 PDs were used. The clustering process naturally identified low and high risk populations, and they are illustrated using Kaplan-Meier curves in Fig. 22.
  • the colon cancer dataset was clustered by k-means clustering with two centroids, using only expression values from the ten PDs. The clustering process naturally identified low and high risk populations, as illustrated by Kaplan-Meier analysis in Fig. 23. Taken together, these results demonstrate that the PDs of the present invention are prognostic in diverse cancer types.
  • Example 5 PDs operate within pathway context, which is a molecular stratifier of the patient population
  • PDs may be involved in a number of pathways relevant to cancer progression, which can include proteins with molecular alterations affecting their ability to regulate the normal molecular events of the cell.
  • a loss-of-function (LOF) mutation in a tumor suppressor or a gain-of-function (GOF) mutation in an oncoprotein are examples of pathway context molecular alterations that can cause perturbed signal transduction pathway activity that is associated with cancer aggressiveness.
  • GOF and LOF mutations in these pathway context genes may result in perturbed or deregulated pathway activity that in many instances is thought to be a driver of an aggressive, malignant phenotype.
  • Figs. 13-15 nuclear CD117, nuclear CD44, non-nuclear CDH2, and nuclear KIF2C constitute a particularly powerful PD four-marker combination in identifying patients at low risk for progression based on their quantitative expression levels (Fig. 13).
  • this marker combination is able to perform very well in identifying low risk patients based on levels of expression, they should also be able to accurately predict the complementary, namely high risk patients based on the exact expression levels.
  • non-nuclear CD117, tumor KIF2C, nuclear MMP1, and tumor PCNA is a particularly powerful PD 4-marker combination to identify high risk patients with a minimal number of false positive patients (Fig. 14).
  • yet another PD four-marker combination, namely nuclear ANLN, nuclear MMP1, tumor PCNA, and non-nuclear SPARC, is particularly sensitive for identifying the high risk patients in a SLNB-negative patient population (Fig. 15). While not wishing to be bound by theory, the observation that different four-marker PD combinations are optimal in performance for different prognostic endpoints, e.g., identification of low risk, high risk, and high risk amongst node-negative patients, suggests that the individual PDs and specific combinations of PDs depend on a distinct pathway context.
  • This concept has important implications for the functional role of PDs.
  • the PD may not be the ultimate driver of tumor progression in a pathway, but, rather, could be a downstream or upstream product of the true driver.
  • many of the PDs can be substituted with other protein markers that can serve as functional readouts of the perturbed pathways causing early stage cancers to progress.
  • the true drivers of tumor progression will be identified. These drivers will have an important effect on prognosis.
  • Fig. 24 illustrates an example of quantitative analysis of mRNA levels through qRT-
  • Fig. 25 is a Western blot analysis of the 3 pathway context proteins, B-Raf, N-Ras, and K-Ras in the 9 melanoma cell lines.
  • FIGS. 26 and 27 show qualitative and quantitative analysis of the expression levels of B-Raf, N-Ras, and K-Ras in a human melanoma sample. This is important, as the functional role of PDs in a cancer type will depend on the expression level and/or the activity state of the pathway context proteins. If a pathway context protein that is known to be associated with aggressiveness, for instance B-Raf in malignant melanoma, is activated or mutated, certain PDs will be associated with high risk and cancer progression. However, other PD combination(s) can predict low risk for progression, as they serve a different role in context of a wild-type B-Raf.
  • Through analysis of the expression of pathway context proteins with specific antibodies together with analysis of the signal transduction pathway activity in the pathway context-regulated proteins through standard tools, like pan and phospho-specific antibodies, one can get a good functional assessment of the activity state of pathway context-regulated proteins (for examples, see Andersen et al., 2010. Pathway-based identification of biomarkers for targeted therapeutics: personalized oncology with PI3K pathway inhibitors. Sci Transl Med. 2(43):43ra55).
  • Such 'pathway mapping' analysis of PDs in the pathway context using quantitative means like AQUA or Definiens TissueStudio™ will enable a more precise understanding of the role of a particular PD as a driver of the aggressive cancer phenotype, or just a co-regulated molecule.
  • the information that a particular PD is active in a particular pathway context, but not in another can be linked with a functional characterization of the role of that PD when it is in the active pathway context.

Abstract

The present invention provides a set of biomarkers (e.g., genes and gene products) that can accurately inform about the risk of cancer progression and recurrence, as well as methods of their use. These biomarkers, also denoted Prognosis Determinants (PDs), provide prognostic value for human cancer patients. The biomarkers can be used to supply valuable clinical information that may be useful in determining appropriate and effective methods of treatment for cancer patients.

Description

METHODS OF PREDICTING PROGNOSIS IN CANCER
CROSS REFERENCES TO OTHER APPLICATIONS
[0001] This application claims priority from U.S. Provisional Application 61/452,054, filed March 11, 2011. The disclosure of that application is incorporated by reference herein in its entirety.
SEQUENCE LISTING
[0002] This application contains a Sequence Listing that has been submitted in ASCII format via EFS-Web and is hereby incorporated by reference in its entirety. Said ASCII file, created on February 21, 2012, is named 106672WO.txt and is 107,104 bytes in size.
FIELD OF THE INVENTION
[0003] This invention relates to using biomarker panels to predict prognosis in cancer patients.
BACKGROUND OF THE INVENTION
[0004] More than one million new cancer cases arise each year in the United States. Of these, approximately half are classified as early-stage diseases. As detection technologies improve and strategies for routine screening become widely adopted, the number of early stage cancers with no clear evidence of metastatic spread will increase dramatically.
Unfortunately, while these cancers are classified as "early" stage, up to 20% of these individuals will behave as if they had more advanced disease and will eventually succumb to lethal metastatic cancer. The study of telomere dynamics in carcinogenesis and recent high-resolution genomic profiles of staged human cancers have revealed that early stage primary tumors are endowed with a "complete" genetic profile that dictates their future biological behavior as benign or aggressive. Given the adverse impact of metastasis on morbidity and mortality, there is a critical unmet need for the development of accurate tests that can identify high-risk patients who may otherwise be misidentified as low-risk patients. Such tests can help provide more accurate management of metastatic tumors by guiding patients towards more aggressive treatments, as may be necessary. Conversely, patients with inherently benign early and intermediate stage cancers can be steered away from unnecessarily aggressive treatments, thereby reducing healthcare costs and avoiding the morbidity of such interventions.
[0005] Metastasis involves multiple biological processes driven by an ensemble of genetic alterations (Gupta et al., Cell 127(4): 679-95 (2006)). A long held view holds that metastasis-conferring genetic events are acquired stochastically as a tumor grows and expands. An alternative view posits that tumors are "hard-wired" with pro-metastatic genetic alterations early in the evolution of tumors and that these alterations also drive the genesis of cancer (Bernards et al., Nature 418(6900):823 (2002)). Despite a wealth of knowledge at the molecular and genetic level about major cancer forms in humans, including colorectal, lung, breast, liver, pancreas, and other cancers, there is still a very poor understanding of which of these models accurately reflects the molecular events underpinning tumor progression and metastasis. Accordingly, there remains a desperate need to understand which patients will have recurrence of their tumors and ultimately a lethal outcome, and how early diagnosis and treatment may impact these outcomes (American Cancer Society. Cancer Facts & Figures 2010. Atlanta: American Cancer Society; 2010; world wide web link at cancer.org/acs/groups/content/@nho/documents/document/acspc-024113.pdf).
[0006] Melanoma ranks second only to leukemia in terms of loss of years of potential life, reflecting its median age of diagnosis at 45-55 years and its poor prognosis (Rezaul, K., L. L. Wilson, et al. (2008). "Direct tissue proteomics in human diseases: potential applications to melanoma research." Expert Rev Proteomics 5(3): 405-12). Approximately 80% of melanoma patients present with clinically localized disease (e.g., cutaneous melanoma with no evidence of metastasis at the time of diagnosis) (Ries LAG 2008). The current melanoma staging system is primarily based on tumor thickness (Breslow score), ulceration, and sentinel lymph node (SLN) status (Balch, C. M., J. E. Gershenwald, et al. (2003). Chapter 3: Staging and classification. Cutaneous Melanoma. C. M. Balch, A. N. Houghton, A. Sober and S. J. Soong. St. Louis, Quality Medical Publishing: 55-76). While standard clinical risk assessments are capable of stratifying patients into low, intermediate, and high risk of relapse, standard clinical staging approaches have significant limitations. For melanomas over 1 mm (or more recently, all tumors >0.3 mm with at least one mitotic cell), current practice often includes a sentinel lymph node biopsy (SLNB) and, pending involvement of the draining node, resection of regional lymph nodes. The vast majority (~90%) of SLNB procedures are negative (e.g., no nodal involvement is detected). Given the high rate of negatives and frequently serious side effects associated with this surgical procedure, including pain, scarring, and lymphoedema, the relative value of this procedure has been called into question. Furthermore, 10-20% of patients who are clinically assessed as low-risk (e.g., thin Stage I or II tumors; SLNB negative) nevertheless develop metastasis and die of their disease within 10 years.
[0007] Thus, there is an urgent unmet medical need for molecular biomarkers that can accurately provide prognostic information independent of standard clinical parameters.
Several groups have attempted to identify such molecular biomarkers. Winnepenninckx and colleagues identified a 254-gene expression signature using a small cohort of frozen melanoma samples that could identify patients at risk of developing distant metastasis (J Natl Cancer Inst 98(7): 472-82 (2006)). However, when 23 of the most promising genes were retested using a clinically relevant immunohistochemistry format with antibodies directed against the corresponding gene products, 15 of the markers failed to demonstrate any prognostic significance, and those that did demonstrate statistical significance did not provide a robust separation between patients with high and low risks of recurrence. Thus, the clinical utility of these markers remains dubious. Ryu and colleagues reported a signature of melanoma progression by performing gene expression analysis of a series of melanoma cell lines (PLOS ONE, Issue 7 e594 (2007)). However, this group failed to extend their observations beyond cell lines and into clinically annotated human melanoma tumors; thus, the clinical utility of these observations remains unknown. More recently, Gould-Rothberg and colleagues developed a quantitative immunofluorescence-based model using AQUA® technology (J Clin Oncol 27(34):5772-80 (2009)), yet the 10-year survival rates for the high and low risk groups were not statistically distinct, as illustrated by Kaplan-Meier analysis of the validation cohort. Thus, although the approach appears promising, the specific biomarkers leveraged in this model do not appear to be sufficiently robust for clinical implementation.
[0008] The instant invention provides molecular signatures that can robustly stratify cancer patients, including melanoma patients, according to their risk for metastatic progression.
SUMMARY OF THE INVENTION
[0009] The present invention provides a set of biomarkers (e.g., genes and gene products) that can accurately inform about the risk of cancer progression and recurrence, as well as methods of their use. These biomarkers, also denoted Prognosis Determinants (PDs), provide prognostic value for human cancer patients.
[0010] The invention provides a method of predicting prognosis of a cancer patient. In this method, one obtains a cancerous tissue sample from the patient, measures the levels of two or more biomarkers selected from the group consisting of ANLN, CD44, CDH2, CD117, DEPDC1, FSCN1, KIF2C, MMP1, PCNA, and SPARC in the sample, and obtains biomarker scores based on the measured levels, wherein the biomarker scores are indicative of the prognosis of the cancer patient. In some embodiments, the patient has melanoma. For example, the prognosis may be that the patient is at a low risk of having metastatic cancer or recurrence of the melanoma. In these embodiments, the selected biomarkers may be (1) CD44, ANLN, CD117, MMP1, and KIF2C; or (2) CDH2, SPARC, PCNA, FSCN1, and DEPDC1. In alternative embodiments, the prognosis may be that the patient is at a high risk of having metastatic cancer or recurrence of the melanoma. In these embodiments, the selected biomarkers may be (1) CD117, CD44, KIF2C, MMP1, and CDH2; or (2) PCNA, ANLN, SPARC, FSCN1, and DEPDC1. In embodiments wherein the prognosis is that the patient is at a high risk of having metastatic cancer or recurrence of the melanoma, the patient may have a negative result in sentinel lymph node biopsy (SLNB), and the selected biomarkers may be (1) ANLN, MMP1, CDH2, KIF2C, and SPARC; or (2) CD117, PCNA, FSCN1, CD44, and DEPDC1.
[0011] The invention also provides a method of analyzing a cancerous tissue sample from a cancer patient. In this method, one obtains a cancerous tissue sample from the patient and measures the levels of two or more biomarkers selected from the group consisting of ANLN, CD44, CDH2, CD117, DEPDC1, FSCN1, KIF2C, MMP1, PCNA, and SPARC in the sample.
[0012] The invention additionally provides a method of identifying a cancer patient in need of adjuvant therapy. In this method, one obtains a cancerous tissue sample from the patient, measures the levels of two or more biomarkers selected from the group consisting of ANLN, CD44, CDH2, CD117, DEPDC1, FSCN1, KIF2C, MMP1, PCNA, and SPARC in the sample, and obtains biomarker scores based on the measured levels, wherein the biomarker scores indicate that the patient is in need of adjuvant therapy. For example, the adjuvant therapy may be selected from the group consisting of radiation therapy, chemotherapy, immunotherapy, hormone therapy, and targeted therapy. In some embodiments, the targeted therapy targets another component of a signaling pathway in which one or more of the selected biomarkers is a component. In alternative embodiments, the targeted therapy targets one or more of the selected biomarkers.
[0013] The invention also provides a further method of treating a cancer patient. In this method, one obtains the biomarker scores of two or more biomarkers selected from the group consisting of ANLN, CD44, CDH2, CD117, DEPDC1, FSCN1, KIF2C, MMP1, PCNA, and SPARC in a cancerous tissue sample from the patient, and treats the patient with adjuvant therapy if the biomarker scores indicate that the patient is at a high risk of having metastatic cancer or recurrence of cancer. In some embodiments, the adjuvant therapy is an experimental therapy.
[0014] The invention additionally provides a method of identifying a cancer patient in need of a sentinel lymph node biopsy. In this method, one obtains the biomarker scores of two or more biomarkers selected from the group consisting of ANLN, CD44, CDH2, CD117, DEPDC1, FSCN1, KIF2C, MMP1, PCNA, and SPARC in the sample, and performs a sentinel lymph node biopsy on the patient if the biomarker scores indicate that the patient is at a high risk of having metastatic cancer or recurrence of cancer. The invention conversely provides a method of identifying a cancer patient not in need of a sentinel lymph node biopsy. In this method, one obtains the biomarker scores of two or more biomarkers selected from the group consisting of ANLN, CD44, CDH2, CD117, DEPDC1, FSCN1, KIF2C, MMP1, PCNA, and SPARC in the sample, and does not perform the sentinel lymph node biopsy on the patient if the biomarker scores indicate that the patient is at a low risk of having metastatic cancer or recurrence of cancer.
[0015] In some embodiments, biomarker scores are obtained by applying a coefficient to the measured levels of the selected biomarkers. In some embodiments, biomarker scores are calculated by using one or more algorithms selected from the group consisting of the Greedy Model, the Cox regression algorithm, the LASSO algorithm, and the AIC-Optimizing Stepwise Forward Selection Cox Regression algorithm.
[0016] In some embodiments, the RNA transcript levels of the selected biomarkers are measured. In certain embodiments, the transcript levels are determined by microarray, quantitative RT-PCR or Nanostring nCounter. In alternative embodiments, the protein levels of the selected biomarkers are measured. In certain embodiments, the protein levels are measured by antibodies, for example, by immunohistochemistry or immunofluorescence. In these embodiments, the protein levels may be measured in subcellular compartments, for example, by measuring the protein levels of biomarkers in the nucleus relative to the protein levels of the biomarkers in the cytoplasm. In some embodiments, the protein levels of biomarkers may be measured in the nucleus and/or in the cytoplasm.
[0017] In some embodiments, the levels of the biomarkers may be measured separately. Alternatively, the levels of the biomarkers may be measured in a multiplex reaction.
[0018] In some embodiments, noncancerous cells are excluded from the tissue sample. In some embodiments, a cancerous tissue sample is a formalin-fixed, paraffin-embedded tissue sample, a snap-frozen tissue sample, an ethanol-fixed tissue sample, a tissue sample fixed with an organic solvent, a tissue sample fixed with plastic or epoxy, a cross-linked tissue sample, surgically removed tumor tissue, circulating tumor cells, a biopsy sample, or a blood sample. In some embodiments, the cancerous tissue is melanoma, prostate cancer, breast cancer, or colon cancer tissue.
[0019] In some embodiments, at least one standard parameter associated with the cancer is measured in addition to the levels of two or more biomarkers selected from the group consisting of ANLN, CD44, CDH2, CD117, DEPDC1, FSCN1, KIF2C, MMP1, PCNA, and SPARC. The at least one standard parameter may be, for example, tumor stage, tumor grade, tumor size, tumor visual characteristics, tumor location, tumor growth, lymph node status, tumor thickness (Breslow score), ulceration, age of onset, PSA level, or Gleason score.
[0020] In some embodiments, one or more pathway context genes, gene transcripts, or gene products are measured in addition to the levels of two or more biomarkers selected from the group consisting of ANLN, CD44, CDH2, CD117, DEPDC1, FSCN1, KIF2C, MMP1, PCNA, and SPARC. In certain embodiments, the pathway context genes, gene transcripts, or gene products are mutated pathway context genes, gene transcripts, or gene products.
[0021] The invention provides a kit for measuring the levels of two or more biomarkers selected from the group consisting of ANLN, CD44, CDH2, CD117, DEPDC1, FSCN1, KIF2C, MMP1, PCNA, and SPARC. This kit comprises reagents for specifically measuring the levels of the selected biomarkers. For example, the reagents may be nucleic acid molecules such as PCR primers or hybridizing probes. Alternatively, the reagents may be antibodies or equivalents thereof.
[0022] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Exemplary methods and materials are described below, although methods and materials similar or equivalent to those described herein can also be used in the practice or testing of the present invention. All publications and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control. Although a number of documents are cited herein, this citation does not constitute an admission that any of these documents forms part of the common general knowledge in the art. Throughout this specification and embodiments, the word "comprise," or variations such as "comprises" or "comprising" will be understood to imply the inclusion of a stated integer or group of integers but not the exclusion of any other integer or group of integers. The materials, methods, and examples are illustrative only and not intended to be limiting.
[0023] Other features and advantages of the invention will be apparent from and encompassed by the following detailed description and embodiments.
BRIEF DESCRIPTION OF THE DRAWINGS
[0024] Fig. 1 shows quantification of PD mRNA levels by qRT-PCR. Gene expression results are represented as log 2-fold change values for the ten PDs in nine melanoma cell lines (bottom row). The melanocyte cell line NHEM neo was used as the comparator (log 2-fold change of 0). Changes in the expression of the endogenous control gene, ZNF592, are shown as well. Note that MMP1 was not detected in NHEM neo and A375, and c-KIT was not detected in WM-266.4, SK-MEL-2, SK-MEL-24, SK-MEL-31 and A375. CD44-a represents the C-terminal domain, CD44-b the variable domain and CD44-c the N-terminal domain of CD44.
[0025] Fig. 2 shows quantification of PD protein levels by western blot. Western blot results shown for the 10 PDs in normal melanocyte cell line (NHEM) and nine melanoma cell lines. The values above each band represent quantitative PD mRNA expression levels based on Taqman qRT-PCR (see Fig. 1). Actin was used as a loading control. ND, not detected.
[0026] Fig. 3 shows qualitative analysis of PD protein levels by conventional immunohistochemistry. Formalin-fixed, paraffin-embedded (FFPE) blocks of human melanomas were analyzed using antibodies against each of the 10 PDs: KIF2C (A,B), CDH2 (C,D), DEPDC1 (E,F), CD44 (G,H), CD117 (I,J), SPARC (K,L), FSCN1 (M,N), PCNA (O,P), MMP1 (Q,R) and ANLN (S,T) and stained using the diaminobenzidine (DAB) procedure. Qualitatively, differential expression in human melanoma tumors was classified as intense, moderate, or low/absent. An example of intensely and moderately stained PDs in tumors is shown for each of the PDs.
[0027] Fig. 4 shows analysis of PD protein levels by immunofluorescence. Immunofluorescent staining of FFPE human melanomas was performed for all 10 PDs: KIF2C (A), CDH2 (B), DEPDC1 (C), CD44 (D), CD117 (E), SPARC (F), FSCN1 (G), PCNA (H), MMP1 (I) and ANLN (J). The 10 PDs were detected and amplified using Cy5-tyramide. 20x magnification images were captured in the Cy5 fluorescent channel using the AperioFL ScanScope hardware. Digital images were recorded in the Aperio Spectrum Database and processed using ImageScope software.
[0028] Fig. 5 shows PD quantification in a tumor compartment defined using the Definiens Composer segmentation and classification tool in Definiens TissueStudio software. I) The Cy3-acquired image (A) was used to determine tissue background separation (B) as well as to generate the tumor mask, which excludes non-tumor cells in the sample from the analysis, and to separate it from the stroma (dark versus bright segments in C). The DAPI-acquired image (D) was used to determine the nuclear (E) versus the non-nuclear (F) compartments, and the Cy5-acquired image (G) was used to determine and quantitate target PCNA intensity in the diverse tumor regions of interest (H). II) Representative column chart comparing the distributions of the Definiens score (represented as Intensity/Exposure time, Int/exp) for total PCNA in the tumor compartment ("Tumor mask") as well as PCNA in the nuclear and non-nuclear areas within the tumor compartment in a representative human melanoma. PCNA was predominantly localized to the tumor nucleus, consistent with PCNA's known function.
[0029] Fig. 6 shows PD quantification in a tumor compartment defined using a molecular mask and AQUA. I) PD markers and nuclei were visualized by multiplex immunofluorescent staining for S-100 labeled with Cy3 (B,F); FSCN1 (C) and SPARC (G) labeled with Cy5; and 4',6-diamidino-2-phenylindole (DAPI) (A,E), respectively. For each stain, images were acquired in the Cy3, Cy5 and DAPI channels, recorded in the Aperio Spectrum Database and analyzed using the AQUA software. The above images were merged in Fig. 6D and H. II) Representative column chart comparing the distributions of composite AQUA scores for each of the 10 PDs in tumor mask as well as in nuclear and non-nuclear areas within the tumor mask in assorted human melanomas.
[0030] Fig. 7 shows an overview of the workflow of a clinical study in which 10 PDs were assessed by AQUA using a melanoma TMA cohort.
[0031] Fig. 8 shows a method to prioritize PDs for inclusion in prognostic models. Left panel: The compartment-specific marker expression levels measured by the AQUA platform were preprocessed and analyzed with two variable-reducing algorithms (each using either continuous or binarized data), employing bootstrapping to develop an importance score for each PD. Right panel: PDs are ranked by the sum of importance scores across the four algorithms. Accordingly, the top ones correspond to the highest absolute aggregated scores. The top-K markers are used to build a multivariate Cox regression algorithm to design a linear melanoma prognostic model.
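The ranking step described for Fig. 8 can be sketched as follows. This is a minimal illustration only: the algorithm labels (`algorithm_A_continuous`, and so on), the marker names, and all numeric importance scores are hypothetical placeholders, and the bootstrapping that produces the importance scores is assumed to have been done beforehand.

```python
from collections import defaultdict

def rank_markers(importance_by_algorithm, top_k):
    """Sum per-algorithm importance scores and return the top_k markers.

    importance_by_algorithm maps an algorithm label to a dict of
    {marker_name: importance_score}. All labels here are hypothetical.
    """
    totals = defaultdict(float)
    for scores in importance_by_algorithm.values():
        for marker, score in scores.items():
            totals[marker] += score
    # Rank by absolute aggregated score, largest first; ties broken by name.
    ranked = sorted(totals, key=lambda m: (-abs(totals[m]), m))
    return ranked[:top_k], dict(totals)

# Made-up importance scores, for illustration only.
example_scores = {
    "algorithm_A_continuous": {"CD44_NUCLEAR": 0.9, "KIF2C_NUCLEAR": 0.4, "PCNA_TUMOR": -0.2},
    "algorithm_A_binarized":  {"CD44_NUCLEAR": 0.7, "KIF2C_NUCLEAR": 0.5, "PCNA_TUMOR": -0.1},
    "algorithm_B_continuous": {"CD44_NUCLEAR": 0.8, "KIF2C_NUCLEAR": 0.1, "PCNA_TUMOR": -0.6},
    "algorithm_B_binarized":  {"CD44_NUCLEAR": 0.6, "KIF2C_NUCLEAR": 0.2, "PCNA_TUMOR": -0.5},
}
top_markers, totals = rank_markers(example_scores, top_k=2)
```

Ranking on the absolute value of the summed score follows the legend's statement that the top markers have the highest absolute aggregated scores, so a marker with a consistently negative score can outrank one with a small positive score.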
[0032] Fig. 9 shows development of a prognostic model using multivariate Cox regression as described in Fig. 8 to optimize for the identification of low risk subjects. In this case, the top-ranked 5 PDs were used to construct a linear melanoma prognostic model by a multivariate Cox proportional hazard regression algorithm. The ability of the model to segregate the high and low risk groups in both the training and validation cohorts is assessed with Kaplan-Meier curves. TR.HiRisk is the high risk training population, TR.LoRisk is the low risk training population, TE.HiRisk is the high risk testing population, and TE.LoRisk is the low risk testing population. Numbers in parentheses indicate the number of events (patient deaths)/total population in the high risk and low risk designations. SumScores represents the sum of the scores calculated from methods (A), (B), (C), and (D).
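For readers unfamiliar with the Kaplan-Meier curves used to assess the models, the following is a minimal sketch of the standard product-limit estimator. It is not taken from the specification; the follow-up times and event flags are invented, and the function name `kaplan_meier` is hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  -- follow-up time per patient
    events -- 1 if the event (e.g., death) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after time t.
        at_risk -= removed
    return curve

# Invented follow-up data for one hypothetical risk group.
curve = kaplan_meier(times=[2, 3, 3, 5, 8], events=[1, 1, 0, 1, 0])
```

Computing one such curve per group (for example, the TR.HiRisk and TR.LoRisk populations) and plotting them together gives the kind of comparison shown in the figure.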
[0033] Fig. 10 shows the development of a prognostic model using stepwise forward selection to optimize for the identification of low risk subjects. Stepwise forward selection was employed with a Cox regression algorithm to select four PDs for the model. The ability of the model to segregate the high and low risk groups of the training and validation cohorts was assessed with Kaplan-Meier curves. In this figure, the number of "markers" referred to includes those biomarkers in different compartments (nuclear, non-nuclear, or tumor) as separate markers.
[0034] Fig. 11 shows three different 5-marker models which were obtained using the standard model selection procedure illustrated in Fig. 8, optimized to identify low-risk subjects. The score that each model gives a sample is the linear combination of the log2 AQUA scores for the PDs in the model weighted by the model coefficients. The first model is obtained by choosing the highest ranking markers while the other two are obtained by choosing markers that are further down the list. The performance of the models on the high and low risk groups of the training cohort is illustrated using Kaplan-Meier curves.
[0035] Fig. 12 shows examples of prognostic models using various numbers of PDs. In this case it was used to identify specific marker combinations that all performed well in predicting low risk for progression. In this figure, the number of "markers" referred to includes those biomarkers in different compartments (nuclear, non-nuclear, or tumor) as separate markers.
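The model score described for Fig. 11, a linear combination of log2 AQUA scores weighted by model coefficients, can be sketched as below. The coefficients, AQUA values, and cutoff are hypothetical numbers chosen for easy arithmetic, and the convention that higher scores indicate higher risk is an assumption for this example.

```python
import math

def model_score(aqua_scores, coefficients):
    """Linear combination of log2 AQUA scores weighted by model coefficients."""
    return sum(coef * math.log2(aqua_scores[marker])
               for marker, coef in coefficients.items())

def classify(score, cutoff):
    """Assumed convention: scores above the cutoff are called high risk."""
    return "high risk" if score > cutoff else "low risk"

# Hypothetical coefficients and AQUA scores, for illustration only.
coefficients = {"CD44_NUCLEAR": 0.5, "KIF2C_NUCLEAR": -0.25}
sample = {"CD44_NUCLEAR": 1024.0, "KIF2C_NUCLEAR": 256.0}

score = model_score(sample, coefficients)  # 0.5 * 10 + (-0.25) * 8 = 3.0
label = classify(score, cutoff=2.0)
```

In practice the coefficients would come from fitting a Cox proportional hazard model on the training cohort and the cutoff from optimizing performance on that cohort; neither step is reproduced here.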
[0036] Fig. 13 presents results for a version of the variable selection algorithm that favors variable combinations that yield models for which the low risk population has a very low FN rate. The figure presents Kaplan-Meier curves for both the training cohort (left panel of each biomarker combination) and testing cohort (right panel of each biomarker combination). In this figure, the number of "markers" referred to includes those biomarkers in different compartments (nuclear, non-nuclear, or tumor) as separate markers. Hazard Ratio is the increase in hazard in moving between high risk and low risk populations that occurs when a Cox proportional model is matched to these two populations. Logrank P value and C-index were calculated as in, for example, Modelling Survival Data in Medical Research, Second Edition, 2003, Chapman & Hall CRC; David Collett, pg 42 and Pencina MJ and D'Agostino RB. Statist Med (2004) 23:2109-2123, respectively. NPR was calculated as TN/(TN+FN), and sensitivity was calculated as TP/(TP+FN).
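As an aid to interpreting the C-index values reported with these models, the following is a minimal sketch of a concordance index in the commonly used Harrell style. It is an assumption that this matches the cited reference in every detail (for example, in how tied scores are handled), and the data in the example are invented.

```python
def concordance_index(times, events, risk_scores):
    """Harrell-style C-index: fraction of comparable pairs ordered correctly.

    A higher risk score is expected to go with a shorter time to event.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Invented data: risk scores perfectly ordered against survival time.
c_perfect = concordance_index([1, 2, 3, 4], [1, 1, 0, 1], [4, 3, 2, 1])
# The same data with the risk ordering reversed.
c_reversed = concordance_index([1, 2, 3, 4], [1, 1, 0, 1], [1, 2, 3, 4])
```

A value of 0.5 corresponds to random ordering, so a prognostic model is informative to the extent that its C-index exceeds 0.5.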
[0037] Fig. 14 presents Kaplan-Meier curves for prognostic models that are sensitive to the identification of high risk patients. Models with three or more parameters perform well in two respects. On the one hand the high risk population contains a relatively high fraction of the recurrent cases, and on the other the follow-up time for the censored cases is generally short, demonstrating that they are likely high risk. The Kaplan-Meier curves for both the training cohort (left panel of each biomarker combination) and testing cohort (right panel of each biomarker combination) are shown. In this figure, the number of "markers" referred to includes those biomarkers in different compartments (nuclear, non-nuclear, or tumor) as separate markers. PPR was calculated as TP/(TP+FP), and specificity was calculated as TN/(TN+FP).
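The four ratios defined in the legends of Figs. 13 and 14 can be computed from a confusion table as sketched below. The counts are invented, and the sketch assumes that a "positive" is a patient who had the event and was classified as high risk.

```python
def prognostic_metrics(tp, fp, tn, fn):
    """Compute the four ratios defined in the legends of Figs. 13 and 14.

    Assumed convention: positive = event occurred; predicted positive =
    classified as high risk.
    """
    return {
        "sensitivity": tp / (tp + fn),  # TP/(TP+FN)
        "specificity": tn / (tn + fp),  # TN/(TN+FP)
        "PPR": tp / (tp + fp),          # TP/(TP+FP)
        "NPR": tn / (tn + fn),          # TN/(TN+FN)
    }

# Invented counts, for illustration only.
metrics = prognostic_metrics(tp=30, fp=10, tn=50, fn=10)
```

Under this convention, a model tuned for the low risk population aims for few false negatives (high sensitivity and NPR), whereas a model tuned for the high risk population aims for few false positives (high specificity and PPR).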
[0038] Fig. 15 shows examples of prognostic models that are able to identify high risk patients in a SLNB-negative patient population. Examples are shown for 1, 2, 3, 4, and 5 PD combinations. The Kaplan-Meier curves for both the training cohort (left panel of each biomarker combination) and testing cohort (right panel of each biomarker combination) are shown. In this figure, the number of "markers" referred to includes those biomarkers in different compartments (nuclear, non-nuclear, or tumor) as separate markers.
[0039] Fig. 16 presents results for a version of the variable selection algorithm that favors variable combinations that yield models which are very sensitive in detecting the low risk population. This is a greedy algorithm which iteratively adds to an already established group the one or two additional variables that most improve the sensitivity of the model in the low risk group. The table presents the results for the training set. The Kaplan-Meier curves present the performance on both the training and test set for the model with six variables. For both sets, the sensitivity in the low risk group is 1. P, p-value; C, C-index. Events (nEvents) measured are subject deaths. nCase represents the total population. Train2 is a set to train on two "extreme" population segments, in this case subjects with Breslow thicknesses < 1.5 mm or >= 2 mm. The highlighted panel had the maximum specificity and PPR among the panels tested. All panels had a "perfect" sensitivity of 1.
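The greedy variable selection described for Fig. 16 can be illustrated with a simple forward-selection loop. The sketch below is a simplification under stated assumptions: it adds one variable per step (the text allows one or two), it stops when no candidate improves the objective, and the objective is supplied by the caller as a hypothetical `evaluate` function standing in for the sensitivity of a fitted model. None of the names come from the source.

```python
def greedy_select(candidates, evaluate, max_size):
    """Greedy forward selection.

    candidates: iterable of variable names.
    evaluate:   callable taking a list of variables and returning the
                objective (e.g. sensitivity in the low risk group).
    max_size:   maximum number of variables to select.
    """
    selected = []
    remaining = list(candidates)
    best = float("-inf")
    while remaining and len(selected) < max_size:
        # Score every one-variable extension of the current group.
        scored = [(evaluate(selected + [var]), var) for var in remaining]
        top_score, top_var = max(scored, key=lambda pair: pair[0])
        if top_score <= best:
            break  # no candidate improves the objective
        selected.append(top_var)
        remaining.remove(top_var)
        best = top_score
    return selected, best

# Toy objective with made-up per-variable gains, capped at 1.0.
toy_gain = {"A": 0.5, "B": 0.25, "C": 0.25, "D": 0.0}

def toy_evaluate(variables):
    return min(1.0, sum(toy_gain[v] for v in variables))

chosen, objective = greedy_select(toy_gain, toy_evaluate, max_size=6)
```

In practice `evaluate` would refit the model on the training set for each candidate group; the analogous high risk procedure would substitute specificity for sensitivity.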
[0040] Fig. 17 presents results for a version of the variable selection algorithm that favors variable combinations that yield models which are geared towards identifying the high risk population. The algorithm is analogous to the one for identifying the low risk population (see legend of Fig. 16) except that variables that improve the specificity are added at each stage. The table presents the performance of the algorithm on the training data, while the Kaplan-Meier curves demonstrate the performance on both training and test data for the model with six variables. The high risk population has a relatively high fraction of recurrence events, as well as relatively short follow-up times for the censored cases.
[0041] Fig. 18 illustrates that multiple models can be combined to improve population segregation. Here the greedy model that segregates the low risk population well is combined with the greedy model that segregates the high risk population well to yield stratification into three classes: high risk, medium risk, and low risk. The model scores consist of a linear combination of the log2 AQUA scores for the markers weighted by the coefficients for the two models indicated in the two tables. The thresholds for the model scores used to segregate populations are indicated in the decision tree below the table. For the training sets, thresholds that maximized sensitivity (for low-risk subjects) or specificity (for high-risk subjects) were chosen.
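The three-class stratification described for Fig. 18 can be sketched as a two-step decision rule. The actual decision tree and thresholds appear only in the figure, so everything below is an assumption made for illustration: the order in which the two models are consulted, the direction of each comparison, and the function and parameter names.

```python
def stratify(low_model_score, high_model_score, low_threshold, high_threshold):
    """Assign a subject to low, medium or high risk from two model scores.

    Assumed rule (not taken from the figure): a score below the low risk
    model's threshold gives "low risk"; otherwise a score above the high
    risk model's threshold gives "high risk"; all others are "medium risk".
    """
    if low_model_score < low_threshold:
        return "low risk"
    if high_model_score > high_threshold:
        return "high risk"
    return "medium risk"

# Hypothetical scores and thresholds, for illustration only.
labels = [
    stratify(-1.0, 0.2, low_threshold=0.0, high_threshold=1.0),
    stratify(0.5, 2.0, low_threshold=0.0, high_threshold=1.0),
    stratify(0.5, 0.2, low_threshold=0.0, high_threshold=1.0),
]
```

Each score would be computed as the weighted sum of log2 AQUA scores for the corresponding model before being passed to this rule.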
[0042] Fig. 19 demonstrates the effectiveness of the combined model for segregating low and high risk populations illustrated in Fig. 18. Kaplan-Meier curves for both training and test cohorts are presented.
[0043] Fig. 20 demonstrates that molecular prognostic models are independent of standard clinical parameters. Top panel: the prognostic model dichotomizes thinner or thicker Breslow depth tumors into risk-distinct groups in training and validation cohorts. Bottom panel: the prognostic model also dichotomizes sentinel lymph node negative patients into low- and high-risk groups in both training and validation cohorts. In both cases, the biomarker panels were able to predict low risk and high risk independent of the clinical parameters. The voting algorithm was used for both low risk and high risk.
[0044] Fig. 21 shows a prognostic model for prostate cancer using PDs. Expression data from PMID: 20579941 were preprocessed in a standard fashion (see, e.g., Example 4) and a Cox model was constructed using ten PDs: ANLN, CDH2, PCNA, KIF2C, DEPDC1, SPARC, FSCN1, MMP1, CD44, and CD117. Kaplan-Meier curves illustrate separation between low and high risk populations as determined by biochemical recurrence.
[0045] Fig. 22 shows a prognostic model for breast cancer using the ten PDs. Data from PMID: 12490681 were preprocessed in a standard fashion (see, e.g., Example 4) and clustered by hierarchical clustering, and the two main clusters were used to identify two populations. Kaplan-Meier curves illustrate separation between low and high risk populations.
[0046] Fig. 23 shows a prognostic model for colon cancer using the ten PDs. Data from PMID: 19996206 were preprocessed in a standard fashion (see, e.g., Example 4) and clustered by k-means clustering using all ten PDs. Kaplan-Meier curves illustrate the separation between low and high risk populations. The number of recurrences R and the size of each group S are indicated in the figure title as R/S.
[0047] Fig. 24 shows quantification of mRNA levels for pathway context genes relevant for malignant melanoma. Gene expression results are represented as log2 fold change values for 3 members of the canonical MAPK pathway (K-RAS, N-RAS, B-RAF) in 9 melanoma cell lines. These 3 genes provide examples of clinically relevant pathway context used to model human melanoma sub-types. The melanocyte cell line NHEM neo was used as the comparator (log2 fold change of 0). Changes in the expression of the endogenous control gene, ZNF592, are shown as well.
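The log2 fold change representation used in Fig. 24 can be illustrated with a short calculation. The sketch assumes that a relative expression level is already available for each cell line and for the comparator; how those levels are derived from the qRT-PCR data is not described in this legend, and the cell line labels and values below are hypothetical.

```python
import math

def log2_fold_change(sample_level, comparator_level):
    """log2 of the ratio of a sample's expression level to the comparator's."""
    return math.log2(sample_level / comparator_level)

# Hypothetical relative expression levels for one gene, for illustration only.
comparator = 2.0  # stands in for the comparator cell line
levels = {"line_1": 8.0, "line_2": 1.0, "comparator": comparator}
fold_changes = {name: log2_fold_change(value, comparator)
                for name, value in levels.items()}
```

By construction the comparator itself has a log2 fold change of 0, positive values indicate higher expression than the comparator, and negative values indicate lower expression.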
[0048] Fig. 25 shows quantification of pathway context protein levels by Western blot. Western blot results are shown for 3 members of the canonical MAPK pathway (K-Ras, N-Ras, B-Raf) in 9 melanoma cell lines. The values above each band represent quantitative pathway context gene mRNA expression levels based on Taqman qRT-PCR (see Fig. 24). Actin was used as a loading control.
[0049] Fig. 26 shows detection of pathway context proteins by immunohistochemical staining. Clinically-relevant pathway context markers, K-Ras, N-Ras, and B-Raf, can be detected in human melanoma FFPE samples by conventional immunohistochemistry and immunofluorescence staining. The pictures illustrate immunohistochemical and immunofluorescence staining for B-Raf (A & D), K-Ras (B & E) and N-Ras (C & F), respectively. The top panels illustrate immunohistochemistry; the bottom panels, immunofluorescence.
[0050] Fig. 27 shows pathway context protein quantification in a tumor compartment defined using a molecular mask and AQUA. The 3 pathway context markers, K-Ras, N-Ras, and B-Raf, and nuclei were visualized by multiplex immunofluorescent staining. For tumor cell recognition, S-100 was labeled with Cy3; K-Ras, N-Ras, and B-Raf were labeled with Cy5; and nuclei were labeled with 4',6-diamidino-2-phenylindole (DAPI). For each stain, images were acquired in the Cy3, Cy5 and DAPI channels, recorded in the Aperio Spectrum Database and analyzed using the AQUA software. A representative column chart showing the AQUA scores for each of the 3 pathway context markers in the tumor mask as well as in nuclear and non-nuclear areas within the tumor mask is illustrated.
[0051] Fig. 28 provides a description of the fields of the tables shown in figures 29-31. (A) Fields related to characteristics and results in training samples. (B) Fields related to characteristics and results of validation samples. These relate to the data generated by the greedy algorithm for identifying Low Risk (LR) and High Risk (HR) populations.
[0052] Fig. 29 (A-L) shows performance of selected Low Risk (LR) models in a training subcohort and complementary validation subcohort. A subset of the models is depicted in Fig. 13.
[0053] Fig. 30 (A-L) shows performance of selected High Risk (HR) models in a training subcohort and complementary validation subcohort. A subset of the models is depicted in Fig. 14.
[0054] Fig. 31 (A-L) shows performance of selected High Risk (HR) models in a training subcohort and complementary validation subcohort with negative sentinel lymph node biopsy (SLNB) results. A subset of the models is depicted in Fig. 15.
DETAILED DESCRIPTION
[0055] The present invention is based on the discovery that biomarker panels comprising two or more members from the group consisting of ANLN, CD44, CDH2, CD117, DEPDC1, FSCN1, KIF2C, MMP1, PCNA, and SPARC ("prognosis determinants" or "PD"s; Table 1) are useful in providing molecular, evidence-based reliable prognosis about cancer patients. By measuring the expression or activity levels of the biomarkers in a cancerous tissue sample from a patient, one can reliably predict survival of the patient at a given time point. The levels can be used to predict disease progression (e.g., the metastatic or recurrence potential of a cancer), or efficacy of a cancer therapy (e.g., surgery, radiation therapy or chemotherapy) independent of, or in addition to, traditional, established risk assessment procedures. The levels also can be used to identify patients in need of aggressive cancer therapy (e.g., adjuvant therapy), or to guide further diagnostic tests (e.g., sentinel lymph node biopsy). When used in context with pathway context genes or proteins, the levels can also be used to inform patients about which types of therapy they would be most likely to benefit from, and to stratify patients for inclusion in a clinical study. The levels also can be used to identify patients who will not benefit from and/or do not need cancer therapy (e.g., surgery, radiation therapy, chemotherapy, targeted therapy, or adjuvant therapy). In other words, the biomarker panels of this invention allow clinicians to optimally manage cancer patients.
[0056] The biomarker panels of the present invention provide useful prognostic information about a variety of cancers, including, for example, carcinomas (e.g., malignant tumors derived from epithelial cells such as, for example, common forms of breast, prostate, lung, and colon cancer), sarcomas (e.g., malignant tumors derived from connective tissue or mesenchymal cells), lymphomas and leukemias (i.e., malignancies derived from hematopoietic cells), and germ cell tumors (i.e., tumors derived from totipotent cells). Specific examples of these cancers include, without limitation, cancers of: breast, skin, bone, prostate, ovaries, uterus, cervix, liver, lung, brain, spine, larynx, gallbladder, pancreas, rectum, parathyroid, thyroid, adrenal gland, immune system, head and neck, colon, stomach, bronchi, and kidneys.
[0057] To use a panel of this invention, one may measure the levels of the panel's constituent biomarkers in a cancerous tissue sample from the patient, and then use statistical algorithms described below to calculate a numeric score (i.e., a "biomarker score") for each of the biomarkers based on its measured level. The biomarker scores of a given panel have been correlated with a specific prognosis of the cancer. For example, a particular profile of biomarker scores for a panel can be predictive of a low risk of cancer metastasis or recurrence, while another profile of biomarker scores of the same or a different panel can be predictive of a high risk of cancer metastasis or recurrence.
I. PROGNOSIS DETERMINANTS
[0058] The prognosis determinants of this invention include the ten biomarkers listed in Table 1 below. A "biomarker" or "marker" refers to an analyte (e.g., a nucleic acid, peptide, protein, or metabolite) that can be objectively measured and evaluated as an indicator for a biological process. The inventors have discovered that the expression or activity levels of these ten biomarkers correlate reliably with the prognosis of cancer patients.
Table 1. Prognosis Determinants and Exemplary NCBI Reference Numbers
[Table 1 is reproduced as an image in the original publication: imgf000016_0001]
[0059] ANLN stands for anillin. ANLN also may be known in the art as scraps, sera, the actin-binding protein anillin, or anillin (Drosophila Scraps homolog), actin binding protein. It is a scaffold protein that links RhoA with actin and myosin during cytokinesis. An exemplary human ANLN protein contains 1124 amino acid residues and has the following polypeptide sequence:
MDPFTEKLLERTRARRENLQRKMAERPTAAPRSMTHAKRARQPLSEASNQQPLSGG EEKSCTKPSPSKKRCSDNTEVEVSNLENKQPVESTSAKSCSPSPVSPQVQPQAADTIS DSVAVPASLLGMRRGLNSRLEATAASSVKTRMQKLAEQRRRWDNDDMTDDIPESSL FSPMPSEEKAASPPRPLLSNASATPVGRRGRLANLAATICSWEDDVNHSFAKQNSVQ EQPGTACLSKFSSASGASARINSSSVKQEATFCSQRDGDASLNKALSSSADDASLVNA SISSSVKATSPVKSTTSITDAKSCEGQNPELLPKTPISPLKTGVSKPIVKSTLSQTVPSKG ELSREICLQSQSKDKSTTPGGTGIKPFLERFGERCQEHSKESPARSTPHRTPIITPNTKAI QERLFKQDTSSSTTHLAQQLKQERQKELACLRGRFDKGNIWSAEKGGNSKSKQLET KQETHCQSTPLKKHQGVSKTQSLPVTEKVTENQIPAK SSTEPKGFTECEMTKSSPLK ITLFLEEDKSLKVTSDPKVEQKIEVIREIEMSVDDDDINSSKVINDLFSDVLEEGELDM EKSQEEMDQALAESSEEQEDALNISSMSLLAPLAQTVGVVSPESLVSTPRLELKDTSR SDESPKPGKFQRTRVPRAESGDSLGSEDRDLLYSIDAYRSQRFKETERPSIKQVIVRKE DVTSKLDEKNNAFPCQVNIKQKMQEL NEINMQQTVIYQASQALNCCVDEEHGKGS LEEAEAERLLLIATGKRTLLIDELNKLK EGPQRK KASPQSEFMPSKGSVTLSEIRLP LKADFVCSTVQKPDAANYYYLIILKAGAENMVATPLASTSNSLNGDALTFTTTFTLQ DVSNDFEINIEVYSLVQKKDPSGLDKKK TSKSKAITPKRLLTSITTKSNIHSSVMASP GGL S AVRT SNF AL VGS YTL SLS S VGNTKF VLDKVPFL S SLEGHI YLKIKC Q VNS S VEE RGFLTIFED VS GFG A WHRRWC VL S GNCI S Y WT YPDDEKRK PIGRINL ANCT SRQIEP ANREFCARRNTFELITVRPQREDDRETLVSQCRDTLCVTK WLSADTKEERDLWMQ KLNQVLVDIRLWQPDACYKPIGKP (SEQ ID NO: 1; NCBI Reference No. NP 061155.2)
[0060] An mRNA sequence for this ANLN polypeptide is:
ATGGATCCGTTTACGGAGAAACTGCTGGAGCGAACCCGTGCC AGGCGAGAGAAT CTTCAGAGAAAAATGGCTGAGAGGCCCACAGCAGCTCCAAGGTCTATGACTCAT GCTAAGCGAGCTAGACAGCCACTTTCAGAAGCAAGTAACCAGCAGCCCCTCTCT GGTGGTGAAGAGAAATCTTGTACAAAACCATCGCCATCAAAAAAACGCTGTTCT GACAACACTGAAGTAGAAGTTTCTAACTTGGAAAATAAACAACCAGTTGAGTCG ACATCTGCAAAATCTTGTTCTCCAAGTCCTGTGTCTCCTCAGGTGCAGCCACAAG CAGCAGATACCATCAGTGATTCTGTTGCTGTCCCGGCATCACTGCTGGGCATGAG GAGAGGGCTGAACTCAAGATTGGAAGCAACTGCAGCCTCCTCAGTTAAAACACG TATGCAAAAACTTGCAGAGCAACGGCGCCGTTGGGATAATGATGATATGACAGA TGACATTCCTGAAAGCTCACTCTTCTCACCAATGCCATCAGAGGAAAAGGCTGCT TCCCCTCCCAGACCTCTGCTTTCAAATGCCTCGGCAACTCCAGTTGGCAGAAGGG GCCGTCTGGCCAATCTTGCTGCAACTATTTGCTCCTGGGAAGATGATGTAAATCA CTCATTTGCAAAACAAAACAGTGTACAAGAACAGCCTGGTACCGCTTGTTTATCC AAATTTTCCTCTGCAAGTGGAGCATCTGCTAGGATCAATAGCAGCAGTGTTAAGC AGGAAGCTACATTCTGTTCCCAAAGGGATGGCGATGCCTCTTTGAATAAAGCCCT ATCCTCAAGTGCTGATGATGCGTCTTTGGTTAATGCCTCAATTTCCAGCTCTGTGA AAGCTACTTCTCCAGTGAAATCTACTACATCTATCACTGATGCTAAAAGTTGTGA GGGACAAAATCCTGAGCTACTTCCAAAAACTCCTATTAGTCCTCTGAAAACGGGG GTATCGAAACCAATTGTGAAGTCAACTTTATCCCAGACAGTTCCATCCAAGGGAG AATTAAGTAGAGAAATTTGTCTGCAATCTCAATCTAAAGACAAATCTACGACACC AGGAGGAACAGGAATTAAGCCTTTCCTGGAACGCTTTGGAGAGCGTTGTCAAGA ACATAGCAAAGAAAGTCCAGCTCGTAGCACACCCCACAGAACCCCCATTATTAC TCCAAATACAAAGGCCATCCAAGAAAGATTATTCAAGCAAGACACATCTTCATCT ACTACCCATTTAGCACAACAGCTCAAGCAGGAACGTCAAAAAGAACTAGCATGT CTTCGTGGCCGATTTGACAAGGGCAATATATGGAGTGCAGAAAAAGGCGGAAAC TCAAAAAGCAAACAACTAGAAACCAAACAGGAAACTCACTGTCAGAGCACTCCC CTCAAAAAACACCAAGGTGTTTCAAAAACTCAGTCACTTCCAGTAACAGAAAAG GTGACCGAAAACCAGATACCAGCCAAAAATTCTAGTACAGAACCTAAAGGTTTC ACTGAATGCGAAATGACGAAATCTAGCCCTTTGAAAATAACATTGTTTTTAGAAG AGGACAAATCCTTAAAAGTAACATCAGACCCAAAGGTTGAGCAGAAAATTGAAG TGATACGTGAAATTGAGATGAGTGTGGATGATGATGATATCAATAGTTCGAAAG TAATTAATGACCTCTTCAGTGATGTCCTAGAGGAAGGTGAACTAGATATGGAGA AGAGCCAAGAGGAGATGGATCAAGCATTAGCAGAAAGCAGCGAAGAACAGGAA GATGCACTGAATATCTCCTCAATGTCTTTACTTGC ACC ATTGGCACAAAC AGTTG GTGTGGTAAGTCCAGAGAGTTTAGTGTCCACACCTAGACTGGAATTGAAAGACA CCAGCAGAAGTGATGAAAGTCCAAAACCAGGAAAATTCCAAAGAACTCGTGTCC 
CTCGAGCTGAATCTGGTGATAGCCTTGGTTCTGAAGATCGTGATCTTCTTTACAG CATTGATGCATATAGATCTCAAAGATTCAAAGAAACAGAACGTCCATCAATAAA GCAGGTGATTGTTCGGAAGGAAGATGTTACTTCAAAACTGGATGAAAAAAATAA TGCCTTTCCTTGTCAAGTTAATATCAAACAGAAAATGCAGGAACTCAATAACGAA ATAAATATGCAACAGACAGTGATCTATCAAGCTAGCCAGGCTCTTAACTGCTGTG TTGATGAAGAACATGGAAAAGGGTCCCTAGAAGAAGCTGAAGCAGAAAGACTTC TTCTAATTGCAACTGGGAAGAGAACACTTTTGATTGATGAATTGAATAAATTGAA GAACGAAGGACCTCAGAGGAAGAATAAGGCTAGTCCCCAAAGTGAATTTATGCC ATCCAAAGGATCAGTTACTTTGTCAGAAATCCGCTTGCCTCTAAAAGCAGATTTT GTCTGCAGTACGGTTCAGAAACCAGATGCAGCAAATTACTATTACTTAATTATAC TAAAAGCAGGAGCTGAAAATATGGTAGCCACACCATTAGCAAGTACTTCAAACT CTCTTAACGGTGATGCTCTGACATTCACTACTACATTTACTCTGCAAGATGTATCC AATGACTTTGAAATAAATATTGAAGTTTACAGCTTGGTGCAAAAGAAAGATCCCT CAGGCCTTGATAAGAAGAAAAAAACATCCAAGTCCAAGGCTATTACTCCAAAGC GACTCCTCACATCTATAACCACAAAAAGCAACATTCATTCTTCAGTCATGGCCAG TCCAGGAGGTCTTAGTGCTGTGCGAACCAGCAACTTCGCCCTTGTTGGATCTTAC ACATTATCATTGTCTTCAGTAGGAAATACTAAGTTTGTTCTGGACAAGGTCCCCTT TTTATCTTCTTTGGAAGGTCATATTTATTTAAAAATAAAATGTCAAGTGAATTCCA GTGTTGAAGAAAGAGGTTTTCTAACCATATTTGAAGATGTTAGTGGTTTTGGTGC CTGGCATCGAAGATGGTGTGTTCTTTCTGGAAACTGTATATCTTATTGGACTTATC CAGATGATGAGAAACGCAAGAATCCCATAGGAAGGATAAATCTGGCTAATTGTA CCAGTCGTCAGATAGAACCAGCCAACAGAGAATTTTGTGCAAGACGCAACACTT TTGAATTAATTACTGTCCGACCACAAAGAGAAGATGACCGAGAGACTCTTGTCA GCCAATGCAGGGACACACTCTGTGTTACCAAGAACTGGCTGTCTGCAGATACTAA AGAAGAGCGGGATCTCTGGATGCAAAAACTCAATCAAGTTCTTGTTGATATTCGC CTCTGGCAACCTGATGCTTGCTACAAACCTATTGGAAAGCCTTAA (SEQ ID NO : 2; NCBI Reference No. NM_018685.2)
[0061] CD44 is a cell-surface glycoprotein that is a receptor for hyaluronic acid. CD44 may also be known in the art as Hermes Antigen, Pgp, PGP-1, Phagocytic glycoprotein 1, PGP-I, Phagocytic glycoprotein I, INLU-related p80 glycoprotein, MIC4, MDU2, MDU3, MC56, HCELL, CSPG8, hyaluronate receptor, heparan sulfate proteoglycan, extracellular matrix receptor III, ECMR-III, HUTCH-I, LHR, GP90 lymphocyte homing/adhesion receptor, CD44R, CDW44, antigen gp90 homing receptor, MUTCH-I, chondroitin sulfate proteoglycan 8, MGC10468, hematopoietic cell E- and L-selectin ligand, Epican, Hermes, Ly-24, lymphocyte antigen 24, CD44A, METAA, MGC124941, RHAMM, or hyaluronate binding protein. CD44 is involved in cell adhesion and migration. An exemplary human CD44 protein contains 742 amino acid residues and has the following polypeptide sequence:
MDKFWWHAAWGLCLVPLSLAQIDLNITCRFAGVFHVEKNGRYSISRTEAADLCKAF NSTLPTMAQMEKALSIGFETCRYGFIEGHVVIPRIHPNSICAANNTGVYILTSNTSQYD TYCFNASAPPEEDCTSVTDLPNAFDGPITITIVNRDGTRYVQKGEYRTNPEDIYPSNPT DDDVSSGSSSERSSTSGGYIFYTFSTVHPIPDEDSPWITDSTDRIPATTLMSTSATATET ATKRQETWDWFSWLFLPSESKNHLHTTTQMAGTSSNTISAGWEPNEENEDERDRHL SFSGSGIDDDEDFISSTISTTPRAFDHTKQNQDWTQWNPSHSNPEVLLQTTTRMTDVD RNGTTAYEGNWNPEAHPPLIHHEHHEEEETPHSTSTIQATPSSTTEETATQKEQWFGN RWHEGYRQTPKEDSHSTTGTAAASAHTSHPMQGRTTPSPEDSSWTDFFNPISHPMGR GHQAGRRMDMDSSHSITLQPTANPNTGLVEDLDRTGPLSMTTQQSNSQSFSTSHEGL EEDKDHPTTSTLTSSNRNDVTGGRRDPNHSEGSTTLLEGYTSHYPHTKESRTFIPVTS AKTGSFGVTAVTVGDSNSNVNRSLSGDQDTFHPSGGSHTTHGSESDGHSHGSQEGG ANTTSGPIRTPQIPEWLIILASLLALALILAVCIAVNSR RCGQKKKLVINSGNGAVED RKPSGLNGEASKSQEMVHLVNKESSETPDQFMTADETRNLQNVDMKIGV (SEQ ID NO: 3; NCBI Reference No. NP_000601.3)
[0062] An mRNA sequence for this CD44 polypeptide is:
GAGAAGAAAGCCAGTGCGTCTCTGGGCGCAGGGGCCAGTGGGGCTCGGAGGCAC AGGCACCCCGCGACACTCCAGGTTCCCCGACCCACGTCCCTGGCAGCCCCGATTA TTTACAGCCTCAGCAGAGCACGGGGCGGGGGCAGAGGGGCCCGCCCGGGAGGG CTGCTACTTCTTAAAACCTCTGCGGGCTGCTTAGTCACAGCCCCCCTTGCTTGGGT GTGTCCTTCGCTCGCTCCCTCCCTCCGTCTTAGGTC ACTGTTTTCAACCTCGAATA AAAACTGCAGCCAACTTCCGAGGCAGCCTCATTGCCCAGCGGACCCCAGCCTCT GCCAGGTTCGGTCCGCCATCCTCGTCCCGTCCTCCGCCGGCCCCTGCCCCGCGCC CAGGGATCCTCCAGCTCCTTTCGCCCGCGCCCTCCGTTCGCTCCGGACACCATGG ACAAGTTTTGGTGGCACGCAGCCTGGGGACTCTGCCTCGTGCCGCTGAGCCTGGC GCAGATCGATTTGAATATAACCTGCCGCTTTGC AGGTGTATTCCACGTGGAGAAA AATGGTCGCTACAGCATCTCTCGGACGGAGGCCGCTGACCTCTGCAAGGCTTTCA ATAGCACCTTGCCCACAATGGCCCAGATGGAGAAAGCTCTGAGCATCGGATTTG AGACCTGCAGGTATGGGTTCATAGAAGGGCACGTGGTGATTCCCCGGATCCACC CCAACTCCATCTGTGCAGCAAACAACACAGGGGTGTACATCCTCACATCCAACA CCTCCCAGTATGACACATATTGCTTCAATGCTTCAGCTCCACCTGAAGAAGATTG TACATCAGTCACAGACCTGCCCAATGCCTTTGATGGACCAATTACCATAACTATT GTTAACCGTGATGGCACCCGCTATGTCCAGAAAGGAGAATACAGAACGAATCCT GAAGACATCTACCCCAGCAACCCTACTGATGATGACGTGAGCAGCGGCTCCTCC AGTGAAAGGAGCAGCACTTCAGGAGGTTACATCTTTTACACCTTTTCTACTGTAC ACCCCATCCCAGACGAAGACAGTCCCTGGATCACCGACAGCACAGACAGAATCC CTGCTACCACTTTGATGAGCACTAGTGCTACAGCAACTGAGACAGCAACCAAGA GGCAAGAAACCTGGGATTGGTTTTCATGGTTGTTTCTACCATCAGAGTCAAAGAA TCATCTTCACACAACAACACAAATGGCTGGTACGTCTTCAAATACCATCTCAGCA GGCTGGGAGCCAAATGAAGAAAATGAAGATGAAAGAGACAGACACCTCAGTTTT TCTGGATCAGGCATTGATGATGATGAAGATTTTATCTCCAGCACCATTTCAACCA C AC C AC GGGCTTTTG AC C AC AC AAAAC AG AAC C AGG ACTGG AC CC AGTGG AACC CAAGCCATTCAAATCCGGAAGTGCTACTTCAGACAACCACAAGGATGACTGATG TAGACAGAAATGGCACCACTGCTTATGAAGGAAACTGGAACCCAGAAGCACACC CTCCCCTCATTCACCATGAGCATCATGAGGAAGAAGAGACCCCACATTCTACAA GCACAATCCAGGCAACTCCTAGTAGTACAACGGAAGAAACAGCTACCCAGAAGG AACAGTGGTTTGGCAACAGATGGCATGAGGGATATCGCCAAACACCCAAAGAAG ACTCCCATTCGACAACAGGGACAGCTGCAGCCTCAGCTCATACCAGCCATCCAAT GCAAGGAAGGACAACACCAAGCCCAGAGGACAGTTCCTGGACTGATTTCTTCAA CCCAATCTCACACCCCATGGGACGAGGTCATCAAGCAGGAAGAAGGATGGATAT GGACTCCAGTCATAGTATAACGCTTCAGCCTACTGCAAATCCAAACACAGGTTTG 
GTGGAAGATTTGGACAGGACAGGACCTCTTTCAATGACAACGCAGCAGAGTAAT TCTCAGAGCTTCTCTACATCACATGAAGGCTTGGAAGAAGATAAAGACCATCCA ACAACTTCTACTCTGAC ATC AAGCAATAGGAATGATGTC ACAGGTGGAAGAAGA GACCCAAATCATTCTGAAGGCTCAACTACTTTACTGGAAGGTTATACCTCTCATT ACCCACACACGAAGGAAAGCAGGACCTTCATCCCAGTGACCTCAGCTAAGACTG GGTCCTTTGGAGTTACTGCAGTTACTGTTGGAGATTCCAACTCTAATGTCAATCGT TCCTTATCAGGAGACCAAGACACATTCCACCCCAGTGGGGGGTCCCATACCACTC ATGGATCTGAATCAGATGGAC ACTCAC ATGGGAGTCAAGAAGGTGGAGC AAACA CAACCTCTGGTCCTATAAGGACACCCCAAATTCCAGAATGGCTGATCATCTTGGC ATCCCTCTTGGCCTTGGCTTTGATTCTTGCAGTTTGCATTGCAGTCAACAGTCGAA GAAGGTGTGGGCAGAAGAAAAAGCTAGTGATCAACAGTGGCAATGGAGCTGTG GAGGACAGAAAGCCAAGTGGACTCAACGGAGAGGCCAGCAAGTCTCAGGAAAT GGTGCATTTGGTGAACAAGGAGTCGTCAGAAACTCCAGACCAGTTTATGACAGC TGATGAGACAAGGAACCTGCAGAATGTGGACATGAAGATTGGGGTGTAACACCT ACACCATTATCTTGGAAAGAAACAACCGTTGGAAACATAACCATTACAGGGAGC TGGGACACTTAACAGATGCAATGTGCTACTGATTGTTTCATTGCGAATCTTTTTTA GCATAAAATTTTCTACTCTTTTTGTTTTTTGTGTTTTGTTCTTTAAAGTCAGGTCCA ATTTGTAAAAACAGCATTGCTTTCTGAAATTAGGGCCCAATTAATAATCAGCAAG AATTTGATCGTTCCAGTTCCCACTTGGAGGCCTTTCATCCCTCGGGTGTGCTATGG ATGGCTTCTAACAAAAACTACACATATGTATTCCTGATCGCCAACCTTTCCCCCA CCAGCTAAGGACATTTCCCAGGGTTAATAGGGCCTGGTCCCTGGGAGGAAATTTG AATGGGTCCATTTTGCCCTTCCATAGCCTAATCCCTGGGCATTGCTTTCCACTGAG GTTGGGGGTTGGGGTGTACTAGTTACACATCTTCAACAGACCCCCTCTAGAAATT TTTCAGATGCTTCTGGGAGACACCCAAAGGGTGAAGCTATTTATCTGTAGTAAAC TATTTATCTGTGTTTTTGAAATATTAAACCCTGGATCAGTCCTTTGATCAGTATAA TTTTTTAAAGTTACTTTGTCAGAGGCACAAAAGGGTTTAAACTGATTCATAATAA ATATCTGTACTTCTTCGATCTTCACCTTTTGTGCTGTGATTCTTCAGTTTCTAAACC AGCACTGTCTGGGTCCCTACAATGTATCAGGAAGAGCTGAGAATGGTAAGGAGA CTCTTCTAAGTCTTCATCTCAGAGACCCTGAGTTCCCACTCAGACCCACTCAGCC AAATCTCATGGAAGACCAAGGAGGGCAGCACTGTTTTTGTTTTTTGTTTTTTGTTT TTTTTTTTTGACACTGTCCAAAGGTTTTCCATCCTGTCCTGGAATCAGAGTTGGAA GCTGAGGAGCTTCAGCCTCTTTTATGGTTTAATGGCCACCTGTTCTCTCCTGTGAA AGGCTTTGCAAAGTCACATTAAGTTTGCATGACCTGTTATCCCTGGGGCCCTATTT CATAGAGGCTGGCCCTATTAGTGATTTCCAAAAACAATATGGAAGTGCCTTTTGA TGTCTTACAATAAGAGAAGAAGCCAATGGAAATGAAAGAGATTGGCAAAGGGG AAGGATGATGCCATGTAGATCCTGTTTGAC 
ATTTTTATGGCTGTATTTGTAAACTT AAACACACCAGTGTCTGTTCTTGATGCAGTTGCTATTTAGGATGAGTTAAGTGCC TGGGGAGTCCCTCAAAAGGTTAAAGGGATTCCCATCATTGGAATCTTATCACCAG ATAGGCAAGTTTATGACCAAACAAGAGAGTACTGGCTTTATCCTCTAACCTCATA TTTTCTCCCACTTGGCAAGTCCTTTGTGGCATTTATTCATCAGTCAGGGTGTCCGA TTGGTCCTAGAACTTCC AAAGGCTGCTTGTCATAGAAGCCATTGC ATCTATAAAG CAACGGCTCCTGTTAAATGGTATCTCCTTTCTGAGGCTCCTACTAAAAGTCATTTG TTACCTAAACTTATGTGCTTAACAGGCAATGCTTCTCAGACCACAAAGCAGAAAG AAGAAGAAAAGCTCCTGACTAAATCAGGGCTGGGCTTAGACAGAGTTGATCTGT AGAATATCTTTAAAGGAGAGATGTCAACTTTCTGCACTATTCCCAGCCTCTGCTC CTCCCTGTCTACCCTCTCCCCTCCCTCTCTCCCTCCACTTCACCCCACAATCTTGA AAAACTTCCTTTCTCTTCTGTGAACATCATTGGCCAGATCCATTTTCAGTGGTCTG GATTTCTTTTTATTTTCTTTTCAACTTGAAAGAAACTGGACATTAGGCCACTATGT GTTGTTACTGCCACTAGTGTTCAAGTGCCTCTTGTTTTCCCAGAGATTTCCTGGGT CTGCCAGAGGCCCAGACAGGCTCACTCAAGCTCTTTAACTGAAAAGCAACAAGC CACTCCAGGACAAGGTTCAAAATGGTTACAACAGCCTCTACCTGTCGCCCCAGG GAGAAAGGGGTAGTGATACAAGTCTCATAGCCAGAGATGGTTTTCCACTCCTTCT AGATATTCCCAAAAAGAGGCTGAGACAGGAGGTTATTTTCAATTTTATTTTGGAA TTAAATACTTTTTTCCCTTTATTACTGTTGTAGTCCCTCACTTGGATATACCTCTGT TTTCACGATAGAAATAAGGGAGGTCTAGAGCTTCTATTCCTTGGCCATTGTCAAC GGAGAGCTGGCCAAGTCTTCACAAACCCTTGCAACATTGCCTGAAGTTTATGGAA TAAGATGTATTCTCACTCCCTTGATCTCAAGGGCGTAACTCTGGAAGCACAGCTT GACTACACGTCATTTTTACCAATGATTTTCAGGTGACCTGGGCTAAGTCATTTAA ACTGGGTCTTTATAAAAGTAAAAGGCCAACATTTAATTATTTTGCAAAGCAACCT AAGAGCTAAAGATGTAATTTTTCTTGCAATTGTAAATCTTTTGTGTCTCCTGAAGA CTTCCCTTAAAATTAGCTCTGAGTGAAAAATCAAAAGAGACAAAAGACATCTTC GAATCCATATTTCAAGCCTGGTAGAATTGGCTTTTCTAGCAGAACCTTTCCAAAA GTTTTATATTGAGATTCATAACAACACCAAGAATTGATTTTGTAGCCAACATTCA TTCAATACTGTTATATCAGAGGAGTAGGAGAGAGGAAACATTTGACTTATCTGGA AAAGCAAAATGTACTTAAGAATAAGAATAACATGGTCCATTCACCTTTATGTTAT AGATATGTCTTTGTGTAAATCATTTGTTTTGAGTTTTCAAAGAATAGCCCATTGTT CATTCTTGTGCTGTACAATGACCACTGTTATTGTTACTTTGACTTTTCAGAGCACA CCCTTCCTCTGGTTTTTGTATATTTATTGATGGATCAATAATAATGAGGAAAGCAT GATATGTATATTGCTGAGTTGAAAGC ACTTATTGGAAAATATTAAAAGGCTAAC A TTAAAAGACTAAAGGAAACAGAAAAAAAAAAAAAAAAA (SEQ ID NO: 4; NCBI Reference No. NM 000610.3)
[0063] CKIT (or c-kit) is a cytokine receptor that binds to stem cell factor. CKIT may also be known in the art as CD117, v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog, piebald trait, SCFR, stem cell factor receptor, tyrosine-protein kinase Kit, mast/stem cell growth factor receptor, or mast-cell growth factor receptor. An exemplary human CKIT protein contains 976 amino acid residues and has the following polypeptide sequence:
MRGARGAWDFLCVLLLLLRVQTGSSQPSVSPGEPSPPSIHPGKSDLIVRVGDEIRLLC TDPGFVKWTFEILDETNENKQNEWITEKAEATNTGKYTCTNKHGLSNSIYVFVRDPA KLFLVDRSLYGKEDNDTLVRCPLTDPEVTNYSLKGCQGKPLPKDLRFIPDPKAGIMIK SVKRAYHRLCLHCSVDQEGKSVLSEKFILKVRPAFKAVPVVSVSKASYLLREGEEFT VTCTIKDVSSSVYSTWKRENSQTKLQEKYNSWHHGDFNYERQATLTISSARVNDSG VFMCYANNTFGSANVTTTLEVVDKGFINIFPMINTTVFVNDGENVDLIVEYEAFPKPE HQQWIYMNRTFTDKWEDYPKSENESNIRYVSELHLTRLKGTEGGTYTFLVSNSDVN AAIAFNVYVNTKPEILTYDRLVNGMLQCVAAGFPEPTIDWYFCPGTEQRCSASVLPV DVQTLNSSGPPFGKLVVQSSIDSSAFKHNGTVECKAYNDVGKTSAYFNFAFKGNNK EQIHPHTLFTPLLIGFVIVAGMMCIIVMILTYKYLQKPMYEVQWKVVEEINGNNYVYI DPTQLPYDHKWEFPRNRLSFGKTLGAGAFGKVVEATAYGLIKSDAAMTVAVKMLK PSAHLTEREALMSELKVLSYLGNHMNIVNLLGACTIGGPTLVITEYCCYGDLLNFLRR KRDSFICSKQEDHAEAALYKNLLHSKESSCSDSTNEYMDMKPGVSYVVPTKADKRR SVRIGSYIERDVTPAIMEDDELALDLEDLLSFSYQVAKGMAFLASKNCIHRDLAARNI LLTHGRITKICDFGLARDIK DSNYVVKGNARLPVKWMAPESIFNCVYTFESDVWSY GIFLWELFSLGSSPYPGMPVDSKFYKMIKEGFRMLSPEHAPAEMYDIMKTCWDADPL KRPTFKQIVQLIEKQISESTNHIYSNLANCSPNRQKPVVDHSVRINSVGSTASSSQPLL VHDDV (SEQ ID NO: 5; NCBI Reference No. NP 000213.1)
[0064] An mRNA sequence for this CKIT polypeptide is:
TCTGGGGGCTCGGCTTTGCCGCGCTCGCTGCACTTGGGCGAGAGCTGGAACGTGG
ACCAGAGCTCGGATCCCATCGCAGCTACCGCGATGAGAGGCGCTCGCGGCGCCT
GGGATTTTCTCTGCGTTCTGCTCCTACTGCTTCGCGTCCAGACAGGCTCTTCTCAA CC ATCTGTGAGTCCAGGGGAACCGTCTCCACC ATCCATCCATCC AGGAAAATCAG ACTTAATAGTCCGCGTGGGCGACGAGATTAGGCTGTTATGCACTGATCCGGGCTT TGTCAAATGGACTTTTGAGATCCTGGATGAAACGAATGAGAATAAGCAGAATGA ATGGATCACGGAAAAGGCAGAAGCCACCAACACCGGCAAATACACGTGCACCA ACAAACACGGCTTAAGCAATTCCATTTATGTGTTTGTTAGAGATCCTGCCAAGCT TTTCCTTGTTGACCGCTCCTTGTATGGGAAAGAAGAC AACGAC ACGCTGGTCCGC TGTCCTCTCACAGACCCAGAAGTGACCAATTATTCCCTCAAGGGGTGCCAGGGGA AGCCTCTTCCCAAGGACTTGAGGTTTATTCCTGACCCCAAGGCGGGCATCATGAT CAAAAGTGTGAAACGCGCCTACCATCGGCTCTGTCTGCATTGTTCTGTGGACCAG GAGGGCAAGTCAGTGCTGTCGGAAAAATTCATCCTGAAAGTGAGGCCAGCCTTC AAAGCTGTGCCTGTTGTGTCTGTGTCCAAAGCAAGCTATCTTCTTAGGGAAGGGG AAGAATTCACAGTGACGTGCACAATAAAAGATGTGTCTAGTTCTGTGTACTCAAC GTGGAAAAGAGAAAACAGTCAGACTAAACTACAGGAGAAATATAATAGCTGGC ATCACGGTGACTTCAATTATGAACGTCAGGCAACGTTGACTATCAGTTCAGCGAG AGTTAATGATTCTGGAGTGTTCATGTGTTATGCCAATAATACTTTTGGATCAGCA AATGTCACAACAACCTTGGAAGTAGTAGATAAAGGATTCATTAATATCTTCCCCA TGATAAACACTACAGTATTTGTAAACGATGGAGAAAATGTAGATTTGATTGTTGA ATATGAAGCATTCCCCAAACCTGAACACCAGCAGTGGATCTATATGAACAGAAC CTTCACTGATAAATGGGAAGATTATCCCAAGTCTGAGAATGAAAGTAATATCAG ATACGTAAGTGAACTTCATCTAACGAGATTAAAAGGCACCGAAGGAGGCACTTA CACATTCCTAGTGTCCAATTCTGACGTCAATGCTGCCATAGCATTTAATGTTTATG TGAATACAAAACCAGAAATCCTGACTTACGACAGGCTCGTGAATGGCATGCTCC AATGTGTGGCAGCAGGATTCCCAGAGCCCACAATAGATTGGTATTTTTGTCCAGG AACTGAGCAGAGATGCTCTGCTTCTGTACTGCCAGTGGATGTGCAGACACTAAAC TCATCTGGGCCACCGTTTGGAAAGCTAGTGGTTCAGAGTTCTATAGATTCTAGTG CATTCAAGCACAATGGCACGGTTGAATGTAAGGCTTACAACGATGTGGGCAAGA CTTCTGCCTATTTTAACTTTGCATTTAAAGGTAACAACAAAGAGCAAATCCATCC CCACACCCTGTTCACTCCTTTGCTGATTGGTTTCGTAATCGTAGCTGGCATGATGT GCATTATTGTGATGATTCTGACCTACAAATATTTACAGAAACCCATGTATGAAGT ACAGTGGAAGGTTGTTGAGGAGATAAATGGAAACAATTATGTTTACATAGACCC AACACAACTTCCTTATGATCACAAATGGGAGTTTCCCAGAAACAGGCTGAGTTTT GGGAAAACCCTGGGTGCTGGAGCTTTCGGGAAGGTTGTTGAGGCAACTGCTTAT GGCTTAATTAAGTCAGATGCGGCCATGACTGTCGCTGTAAAGATGCTCAAGCCGA GTGCCC ATTTGAC AGAACGGGAAGCCCTCATGTCTGAACTCAAAGTCCTGAGTTA 
CCTTGGTAATCACATGAATATTGTGAATCTACTTGGAGCCTGCACCATTGGAGGG CCCACCCTGGTCATTACAGAATATTGTTGCTATGGTGATCTTTTGAATTTTTTGAG AAGAAAACGTGATTCATTTATTTGTTCAAAGCAGGAAGATCATGCAGAAGCTGC ACTTTATAAGAATCTTCTGCATTCAAAGGAGTCTTCCTGCAGCGATAGTACTAAT GAGTACATGGACATGAAACCTGGAGTTTCTTATGTTGTCCCAACC AAGGCCGACA AAAGGAGATCTGTGAGAATAGGCTCATACATAGAAAGAGATGTGACTCCCGCCA TCATGGAGGATGACGAGTTGGCCCTAGACTTAGAAGACTTGCTGAGCTTTTCTTA CCAGGTGGCAAAGGGCATGGCTTTCCTCGCCTCCAAGAATTGTATTCACAGAGAC TTGGCAGCCAGAAATATCCTCCTTACTCATGGTCGGATCACAAAGATTTGTGATT TTGGTCTAGCCAGAGACATCAAGAATGATTCTAATTATGTGGTTAAAGGAAACGC TCGACTACCTGTGAAGTGGATGGCACCTGAAAGCATTTTCAACTGTGTATACACG TTTGAAAGTGACGTCTGGTCCTATGGGATTTTTCTTTGGGAGCTGTTCTCTTTAGG AAGCAGCCCCTATCCTGGAATGCCGGTCGATTCTAAGTTCTACAAGATGATCAAG GAAGGCTTCCGGATGCTCAGCCCTGAACACGCACCTGCTGAAATGTATGACATA ATGAAGACTTGCTGGGATGCAGATCCCCTAAAAAGACCAACATTCAAGCAAATT GTTCAGCTAATTGAGAAGCAGATTTCAGAGAGCACCAATCATATTTACTCCAACT TAGCAAACTGCAGCCCCAACCGACAGAAGCCCGTGGTAGACCATTCTGTGCGGA TCAATTCTGTCGGCAGCACCGCTTCCTCCTCCCAGCCTCTGCTTGTGCACGACGAT GTCTGAGCAGAATCAGTGTTTGGGTCACCCCTCCAGGAATGATCTCTTCTTTTGG CTTCCATGATGGTTATTTTCTTTTCTTTCAACTTGCATCCAACTCCAGGATAGTGG GCACCCCACTGCAATCCTGTCTTTCTGAGCACACTTTAGTGGCCGATGATTTTTGT CATCAGCCACCATCCTATTGCAAAGGTTCCAACTGTATATATTCCCAATAGCAAC GTAGCTTCTACCATGAACAGAAAACATTCTGATTTGGAAAAAGAGAGGGAGGTA TGGACTGGGGGCCAGAGTCCTTTCCAAGGCTTCTCCAATTCTGCCCAAAAATATG GTTGATAGTTTACCTGAATAAATGGTAGTAATCACAGTTGGCCTTCAGAACCATC CATAGTAGTATGATGATACAAGATTAGAAGCTGAAAACCTAAGTCCTTTATGTGG AAAACAGAACATCATTAGAACAAAGGACAGAGTATGAACACCTGGGCTTAAGAA ATCTAGTATTTCATGCTGGGAATGAGACATAGGCCATGAAAAAAATGATCCCCA AGTGTGAACAAAAGATGCTCTTCTGTGGACCACTGCATGAGCTTTTATACTACCG ACCTGGTTTTTAAATAGAGTTTGCTATTAGAGCATTGAATTGGAGAGAAGGCCTC CCTAGCCAGCACTTGTATATACGCATCTATAAATTGTCCGTGTTCATACATTTGAG GGGAAAACACCATAAGGTTTCGTTTCTGTATACAACCCTGGCATTATGTCCACTG TGTATAGAAGTAGATTAAGAGCCATATAAGTTTGAAGGAAACAGTTAATACCAT TTTTTAAGGAAACAATATAACCACAAAGCACAGTTTGAACAAAATCTCCTCTTTT AGCTGATGAACTTATTCTGTAGATTCTGTGGAACAAGCCTATCAGCTTCAGAATG 
GCATTGTACTCAATGGATTTGATGCTGTTTGACAAAGTTACTGATTCACTGCATG GCTCCCACAGGAGTGGGAAAACACTGCCATCTTAGTTTGGATTCTTATGTAGCAG GAAATAAAGTATAGGTTTAGCCTCCTTCGCAGGCATGTCCTGGACACCGGGCC AG TATCTATATATGTGTATGTACGTTTGTATGTGTGTAGACAAATATTTGGAGGGGT ATTTTTGCCCTGAGTCCAAGAGGGTCCTTTAGTACCTGAAAAGTAACTTGGCTTT CATTATTAGTACTGCTCTTGTTTCTTTTCACATAGCTGTCTAGAGTAGCTTACCAG AAGCTTCCATAGTGGTGCAGAGGAAGTGGAAGGCATCAGTCCCTATGTATTTGCA GTTCACCTGCACTTAAGGCACTCTGTTATTTAGACTCATCTTACTGTACCTGTTCC TTAGACCTTCCATAATGCTACTGTCTCACTGAAACATTTAAATTTTACCCTTTAGA CTGTAGCCTGGATATTATTCTTGTAGTTTACCTCTTTAAAAACAAAACAAAACAA AACAAAAAACTCCCCTTCCTCACTGCCCAATATAAAAGGCAAATGTGTACATGGC AGAGTTTGTGTGTTGTCTTGAAAGATTCAGGTATGTTGCCTTTATGGTTTCCCCCT TCTACATTTCTTAGACTACATTTAGAGAACTGTGGCCGTTATCTGGAAGTAACCA TTTGCACTGGAGTTCTATGCTCTCGCACCTTTCCAAAGTTAACAGATTTTGGGGTT GTGTTGTCACCCAAGAGATTGTTGTTTGCCATACTTTGTCTGAAAAATTCCTTTGT GTTTCTATTGACTTCAATGATAGTAAGAAAAGTGGTTGTTAGTTATAGATGTCTA GGTACTTCAGGGGCACTTCATTGAGAGTTTTGTCTTGGATATTCTTGAAAGTTTAT ATTTTTATAATTTTTTCTTACATCAGATGTTTCTTTGCAGTGGCTTAATGTTTGAAA TTATTTTGTGGCTTTTTTTGTAAATATTGAAATGTAGCAATAATGTCTTTTGAATA TTCCCAAGCCCATGAGTCCTTGAAAATATTTTTTATATATACAGTAACTTTATGTG TAAATACATAAGCGGCGTAAGTTTAAAGGATGTTGGTGTTCCACGTGTTTTATTC CTGTATGTTGTCCAATTGTTGACAGTTCTGAAGAATTCTAATAAAATGTACATAT ATAAATCAAAAAAAAAAAAAAAA (SEQ ID NO: 6; NCBI Reference No.
NM_000222.2)
[0065] DEPDC1 may be a transcriptional co-repressor, and has been shown to localize to the nucleus of bladder cancer cells. DEPDC1 may also be known in the art as DEP domain containing 1, cell cycle control protein SDP352, DEPDC1A, DEPDC1-V2, FLJ20354, 5830484J08Rik, DEP.8, DEP domain-containing protein 1A, or SDP35. An exemplary human DEPDC1 protein contains 811 amino acid residues and has the following polypeptide sequence:
MESQGVPPGPYRATKLWNEVTTSFRAGMPLRKHRQHFK YGNCFTAGEAVDWLY DLLRNNSNFGPEVTRQQTIQLLRKFLKNHVIEDIKGRWGSENVDDNNQLFRFPATSPL KTLPRRYPELRKNNIENFSKDKDSIFKLRNLSRRTPKRHGLHLSQENGEKIKHEIINED QENAIDNRELSQEDVEEVWRYVILIYLQTILGVPSLEEVINPKQVIPQYIMYNMANTS KRGVVILQNKSDDLPHWVLS AMKCLANWPRSNDMNNPTYVGFERDVFRTIADYFL DLPEPLLTFEYYELFVNILVVCGYITVSDRSSGIHKIQDDPQSSKFLHLNNLNSFKSTE CLLLSLLHREKNKEESDSTERLQISNPGFQERCAKKMQLVNLRNRRVSANDIMGGSC HNLIGLSNMHDLSSNSKPRCCSLEGIVDVPGNSSKEASSVFHQSFPNIEGQNNKLFLES KPKQEFLLNLHSEENIQKPFSAGFKRTSTLTVQDQEELCNGKCKSKQLCRSQSLLLRS STRRNSYINTPVAEIIMKPNVGQGSTSVQTAMESELGESSATINKRLCKSTIELSENSL LPASSMLTGTQSLLQPHLERVAIDALQLCCLLLPPPNRRKLQLLMRMISRMSQNVDM PKLHDAMGTRSLMIHTFSRCVLCCAEEVDLDELLAGRLVSFLMDHHQEILQVPSYLQ TAVEKHLDYLK GHIENPGDGLFAPLPTYSYCKQISAQEFDEQKVSTSQAAIAELLEN IIKNRSLPLKEKRKKLKQFQKEYPLIYQKRFPTTESEAALFGDKPTIKQPMLILRKPKF RSLR (SEQ ID NO: 7; NCBI Reference No. NP 001107592.1)
[0066] An mRNA sequence for this DEPDC1 polypeptide is:
TATGCTATTCAAATCGGCGGCGGGGCCAACGGTTGTGCCGAGACTCGCCACTGCC GCGGCCGCTGGGCCTGAGTGTCGCCTTCGCCGCCATGGACGCCACCGGGCGCTG ACAGACCTATGGAGAGTCAGGGTGTGCCTCCCGGGCCTTATCGGGCCACCAAGC TGTGGAATGAAGTTACCACATCTTTTCGAGCAGGAATGCCTCTAAGAAAACACA GACAACACTTTAAAAAATATGGCAATTGTTTCACAGCAGGAGAAGCAGTGGATT GGCTTTATGACCTATTAAGAAATAATAGCAATTTTGGTCCTGAAGTTACAAGGCA ACAGACTATCCAACTGTTGAGGAAATTTCTTAAGAATCATGTAATTGAAGATATC AAAGGGAGGTGGGGATCAGAAAATGTTGATGATAACAACCAGCTCTTCAGATTT CCTGCAACTTCGCCACTTAAAACTCTACCACGAAGGTATCCAGAATTGAGAAAA AACAACATAGAGAACTTTTCCAAAGATAAAGATAGCATTTTTAAATTACGAAACT TATCTCGTAGAACTCCTAAAAGGCATGGATTACATTTATCTCAGGAAAATGGCGA GAAAATAAAGCATGAAATAATCAATGAAGATCAAGAAAATGCAATTGATAATAG AGAACTAAGCCAGGAAGATGTTGAAGAAGTTTGGAGATATGTTATTCTGATCTAC CTGCAAACCATTTTAGGTGTGCCATCCCTAGAAGAAGTCATAAATCCAAAACAA GTAATTCCCCAATATATAATGTACAACATGGCCAATACAAGTAAACGTGGAGTA GTTATACTACAAAACAAATCAGATGACCTCCCTC ACTGGGTATTATCTGCC ATGA AGTGCCTAGCAAATTGGCCAAGAAGCAATGATATGAATAATCCAACTTATGTTG GATTTGAACGAGATGTATTCAGAACAATCGCAGATTATTTTCTAGATCTCCCTGA ACCTCTACTTACTTTTGAATATTACGAATTATTTGTAAACATTTTGGTTGTTTGTG GCTACATCACAGTTTCAGATAGATCCAGTGGGATACATAAAATTCAAGATGATCC ACAGTCTTCAAAATTCCTTCACTTAAACAATTTGAATTCCTTC AAATC AACTGAGT GCCTTCTTCTCAGTCTGCTTCATAGAGAAAAAAACAAAGAAGAATCAGATTCTAC TGAGAGACTACAGATAAGCAATCCAGGATTTCAAGAAAGATGTGCTAAGAAAAT GCAGCTAGTTAATTTAAGAAACAGAAGAGTGAGTGCTAATGACATAATGGGAGG AAGTTGTCATAATTTAATAGGGTTAAGTAATATGCATGATCTATCCTCTAACAGC AAACCAAGGTGCTGTTCTTTGGAAGGAATTGTAGATGTGCCAGGGAATTCAAGT AAAGAGGCATCCAGTGTCTTTCATCAATCTTTTCCGAACATAGAAGGACAAAATA ATAAACTGTTTTTAGAGTCTAAGCCCAAACAGGAATTCCTGTTGAATCTTCATTC AGAGGAAAATATTCAAAAGCCATTCAGTGCTGGTTTTAAGAGAACCTCTACTTTG ACTGTTCAAGACCAAGAGGAGTTGTGTAATGGGAAATGCAAGTCAAAACAGCTT TGTAGGTCTCAGAGTTTGCTTTTAAGAAGTAGTACAAGAAGGAATAGTTATATCA ATACACCAGTGGCTGAAATTATCATGAAACCAAATGTTGGACAAGGCAGCACAA GTGTGCAAACAGCTATGGAAAGTGAACTCGGAGAGTCTAGTGCCACAATCAATA AAAGACTCTGCAAAAGTACAATAGAACTTTCAGAAAATTCTTTACTTCCAGCTTC TTCTATGTTGACTGGCACACAAAGCTTGCTGCAACCTCATTTAGAGAGGGTTGCC 
ATCGATGCTCTACAGTTATGTTGTTTGTTACTTCCCCCACCAAATCGTAGAAAGCT TCAACTTTTAATGCGTATGATTTCCCGAATGAGTCAAAATGTTGATATGCCCAAA CTTCATGATGCAATGGGTACGAGGTCACTGATGATACATACCTTTTCTCGATGTG TGTTATGCTGTGCTGAAGAAGTGGATCTTGATGAGCTTCTTGCTGGAAGATTAGT TTCTTTCTTAATGGATCATCATCAGGAAATTCTTCAAGTACCCTCTTACTTACAGA CTGCAGTGGAAAAACATCTTGACTACTTAAAAAAGGGACATATTGAAAATCCTG GAGATGGACTATTTGCTCCTTTGCCAACTTACTCATACTGTAAGCAGATTAGTGC TCAGGAGTTTGATGAGCAAAAAGTTTCTACCTCTCAAGCTGCAATTGCAGAACTT TTAGAAAATATTATTAAAAACAGGAGTTTACCTCTAAAGGAGAAAAGAAAAAAA CTAAAACAGTTTCAGAAGGAATATCCTTTGATATATCAGAAAAGATTTCCAACCA CGGAGAGTGAAGCAGCACTTTTTGGTGACAAACCTACAATCAAGCAACCAATGC TGATTTTAAGAAAACCAAAGTTCCGTAGTCTAAGATAACTAACTGAATTAAAAAT TATGTAATACTTGTGGAACTTTGATAAATGAAGCCATATCTGAGAATGTAGCTAC TCAAAAGGAAGTCTGTC ATTAATAAGGTATTTCTAAATAAACACATTATGTAAGG AAGTGCCAAAATAGTTATCAATGTGAGACTCTTAGGAAACTAACTAGATCTCAAT TGAGAGCACATAACAATAGATGATACCAAATACTTTTTGTTTTTAACACAGCTAT CCAGTAAGGCTATCATGATGTGTGCTAAAATTTTATTTACTTGAATTTTGAAAACT GAGCTGTGTTAGGGATTAAACTATAATTCTGTTCTTAAAAGAAAATTTATCTGCA AATGTGC AAGTTCTGAGATATTAGCTAATGAATTAGTTGTTTGGGGTTACTTCTTT GTTTCTAAGTATAAGAATGTGAAGAATATTTGAAAACTCAATGAAATAATTCTCA GCTGCCAAATGTTGCACTCTTTTATATATTCTTTTTCCACTTTTGATCTATTTATAT ATATGTATGTGTTTTTAAAATATGTGTATATTTTATCAGATTTGGTTTTGCCTTAA ATATTATCCCCAATTGCTTCAGTCATTCATTTGTTCAGTATATATATTTTGAATTCT AGTTTTCATAATCTATTAGAAGATGGGGATATAAAAGAAGTATAAGGCAATCAT ATATTCATTCAAAAGATATTTATTTAGCAACTGCTATGTGCCTTTCGTTGTTCCAG ATATGCAGAGACAATGATAAATAAAACATATAATCTCTTCCATAAGGTATTTATT TTTTAATCAAGGGAGATACACCTATCAGATGTTTAAAATAACAACACTACCCACT GAAATCAGGGCATATAGAATCATTCAGCTAAAGAGTGACTTCTATGATGATGGA ACAGGTCTCTAAGCTAGTGGTTTTCAAACTGGTACACATTAGACTCACCCGAGGA ATTTTAAAACAGCCTATATGCCCAGGGCCTAACTTACACTAATTAAATCTGAATT TTGGGGATGTTGTATAGGGATTAGTATTTTTTTTAATCTAGGTGATTCCAATATTC AGCCAACTGTGAGAATCAATGGCCTAAATGCTTTTTATAAACATTTTTATAAGTG TCAAGATAATGGCACATTGACTTTATTTTTTCATTGGAAGAAAATGCCTGCCAAG TATAAATGACTCTCATCTTAAAACAAGGTTCTTCAGGTTTCTGCTTGATTGACTTG GTACAAACTTGAAGCAAGTTGCCTTCTAATTTTTACTCCAAGATTGTTTCATATCT 
ATTCCTTAAGTGTAAAGAAATATATAATGCATGGTTTGTAATAAAATCTTAATGT TTAATGACTGTTCTCATTTCTCAATGTAATTTCATACTGTTTCTCTATAAAATGAT AGTATTCCATTTAACATTACTGATTTTTATTAAAAACCTGGACAGAAAATTATAA ATTATAAATATGACTTTATCCTGGCTATAAAATTATTGAACCAAAATGAATTCTTT CTAAGGCATTTGAATACTAAAACGTTTATTGTTTATAGATATGTAAAATGTGGAT TATGTTGCAAATTGAGATTAAAATTATTTGGGGTTTTGTAACAATATAATTTTGCT TTTGTATTATAGACAAATATATAAATAATAAAGGCAGGCAACTTTCATTTGCACT AATGTACATGCAATTGAGATTACAAAATACATGGTACAATGCTTTAATAACAAAC TCTGCCAGTCAGGTTTGAATCCTACTGTGCTATTAACTAGCTAGTAAACTCAGAC AAGTTACTTAACTTCTCTAAGCCCCAGTTTTGTTATCTATAAAATGAATATTATAA TAGTACCTCTTTTTAGGATTGCGAGGATTAAGCAGGATAATGCATGTAAAGTGTT AGC ACAGTGTCTC ACATAGAATAAGCACTCTATAAATATTTTACTAGAATCACCT AGGATTATAGCACTAGAAGAGATCTTAGCAAAAATGTGGTCCTTTCTGTTGCTTT GG AC AG AC ATG AAC C AAAAC AAAATT AC GG AC AATTG ATG AGC CTT ATT AACT A TCTTTTCATTATGAGACAAAGGTTCTGATTATGCCTACTGGTTGAAATTTTTTAAT CTAGTCAAGAAGGAAAATTTGATGAGGAAGGAAGGAATGGATATCTTCAGAAGG GCTTCGCCTAAGCTGGAACATGGATAGATTCC ATTCTAACATAAAGATCTTTAAG TTCAAATATAGATGAGTTGACTGGTAGATTTGGTGGTAGTTGCTTTCTCGGGATA TAAGAAGCAAAATCAACTGCTACAAGTAAAGAGGGGATGGGGAAGGTGTTGCAC ATTTAAAGAGAGAAAGTGTGAAAAAGCCTAATTGTGGGAATGCACAGGTTTCAC CAGATCAGATGATGTCTGGTTATTCTGTAAATTATAGTTCTTATCCCAGAAATTAC TGCCTCCACCATCCCTAATATCTTCTAATTGGTATCATATAATGACCCACTCTTCT TATGTTATCCAAACAGTTATGTGGCATTTAGTAATGGAATGTACATGGAATTTCC CACTGACTTACCTTTCTGTCCTTGGGAAGCTTAAACTCTGAATCTTCTCATCTGTA AAATGTGAATTAAAGTATCTACCTAACTGAGTTGTGATTGTAGTGAAAGAAAGG CAATATATTTAAATCTTGAATTTAGCAAGCCCACGCTTGATTTTTATGTCCTTTCC TCTTGCCTTGTATTGAGTTTAAGATCTCTACTGATTAAAACTCTTTTGCTATCAAA AAAAAAAAAAAAAAAAAAAAAA (SEQ ID NO: 8; NCBI Reference No.
NM_001114120.1)
[0067] FSCN1 (fascin) is an actin-bundling protein that has been linked to the metastasis of breast cancer. FSCN1 may also be known in the art as fascin homolog 1, P55, SNL, fascin 1, FLJ38511, FAN1, singed-like, 55 kDa actin-bundling protein, HSN, actin bundling protein, singed, drosophila, homolog-like, or singed (Drosophila)-like (sea urchin fascin homolog like). An exemplary human FSCN1 protein contains 493 amino acid residues and has the following polypeptide sequence:
MTANGTAEAVQIQFGLINCGNKYLTAEAFGFKVNASASSLKK QIWTLEQPPDEAGS AAVCLRSHLGRYLAADKDGNVTCEREVPGPDCRFLIVAHDDGRWSLQSEAHRRYFG GTEDRLSCFAQTVSPAEKWSVHIAMHPQVNIYSVTRKRYAHLSARPADEIAVDRDVP WGVDSLITLAFQDQRYSVQTADHRFLRHDGRLVARPEPATGYTLEFRSGKVAFRDC EGRYLAPSGPSGTLKAGKATKVGKDELFALEQSCAQVVLQAANERNVSTRQGMDLS ANQDEETDQETFQLEIDRDTK CAFRTHTGKYWTLTATGGVQSTASSK ASCYFDIE WRDRRITLRASNGKFVTSKK GQLAASVETAGDSELFLMKLINRPIIVFRGEHGFIGC RKVTGTLDANRSSYDVFQLEFNDGAYNIKDSTGKYWTVGSDSAVTSSGDTPVDFFF EFCDYNKVAIKVGGRYLKGDHAGVLKASAETVDPASLWEY (SEQ ID NO: 9, NCBI Reference No. NP 003079.1)
[0068] An mRNA sequence for this FSCN1 polypeptide is:
GCTGCGGAGGGTGCGTGCGGGCCGCGGCAGCCGAACAAAGGAGCAGGGGCGCC GCCGCAGGGACCCGCCACCCACCTCCCGGGGCCGCGCAGCGGCCTCTCGTCTACT GCCACCATGACCGCCAACGGCACAGCCGAGGCGGTGCAGATCCAGTTCGGCCTC ATCAACTGCGGCAACAAGTACCTGACGGCCGAGGCGTTCGGGTTCAAGGTGAAC GCGTCCGCCAGCAGCCTGAAGAAGAAGCAGATCTGGACGCTGGAGCAGCCCCCT GACGAGGCGGGCAGCGCGGCCGTGTGCCTGCGCAGCCACCTGGGCCGCTACCTG GCGGCGGACAAGGACGGCAACGTGACCTGCGAGCGCGAGGTGCCCGGTCCCGAC TGCCGTTTCCTCATCGTGGCGCACGACGACGGTCGCTGGTCGCTGCAGTCCGAGG CGCACCGGCGCTACTTCGGCGGCACCGAGGACCGCCTGTCCTGCTTCGCGCAGAC GGTGTCCCCCGCCGAGAAGTGGAGCGTGCACATCGCCATGCACCCTCAGGTCAA CATCTACAGCGTCACCCGTAAGCGCTACGCGCACCTGAGCGCGCGGCCGGCCGA CGAGATCGCCGTGGACCGCGACGTGCCCTGGGGCGTCGACTCGCTCATCACCCTC GCCTTCCAGGACCAGCGCTACAGCGTGCAGACCGCCGACCACCGCTTCCTGCGCC ACGACGGGCGCCTGGTGGCGCGCCCCGAGCCGGCCACTGGCTACACGCTGGAGT TCCGCTCCGGCAAGGTGGCCTTCCGCGACTGCGAGGGCCGTTACCTGGCGCCGTC GGGGCCCAGCGGCACGCTCAAGGCGGGCAAGGCCACCAAGGTGGGCAAGGACG AGCTCTTTGCTCTGGAGCAGAGCTGCGCCCAGGTCGTGCTGCAGGCGGCCAACG AGAGGAACGTGTCCACGCGCCAGGGTATGGACCTGTCTGCCAATCAGGACGAGG AGACCGACCAGGAGACCTTCCAGCTGGAGATCGACCGCGACACCAAAAAGTGTG CCTTCCGTACCCACACGGGCAAGTACTGGACGCTGACGGCCACCGGGGGCGTGC AGTCCACCGCCTCCAGCAAGAATGCCAGCTGCTACTTTGACATCGAGTGGCGTGA CCGGCGCATCACACTGAGGGCGTCCAATGGCAAGTTTGTGACCTCCAAGAAGAA TGGGCAGCTGGCCGCCTCGGTGGAGACAGCAGGGGACTCAGAGCTCTTCCTCAT GAAGCTCATCAACCGCCCCATCATCGTGTTCCGCGGGGAGCATGGCTTCATCGGC TGCCGCAAGGTCACGGGCACCCTGGACGCCAACCGCTCCAGCTATGACGTCTTCC AGCTGGAGTTCAACGATGGCGCCTACAACATCAAAGACTCCACAGGCAAATACT GGACGGTGGGCAGTGACTCCGCGGTCACCAGCAGCGGCGACACTCCTGTGGACT TCTTCTTCGAGTTCTGCGACTATAACAAGGTGGCCATCAAGGTGGGCGGGCGCTA CCTGAAGGGCGACCACGC AGGCGTCCTGAAGGCCTCGGCGGAAACCGTGGACCC CGCCTCGCTCTGGGAGTACTAGGGCCGGCCCGTCCTTCCCCGCCCCTGCCCACAT GGCGGCTCCTGCCAACCCTCCCTGCTAACCCCTTCTCCGCCAGGTGGGCTCCAGG GCGGGAGGCAAGCCCCCTTGCCTTTCAAACTGGAAACCCCAGAGAAAACGGTGC CCCCACCTGTCGCCCCTATGGACTCCCCACTCTCCCCTCCGCCCGGGTTCCCTACT CCCCTCGGGTCAGCGGCTGCGGCCTGGCCCTGGGAGGGATTTCAGATGCCCCTGC CCTCTTGTCTGCCACGGGGCGAGTCTGGCACCTCTTTCTTCTGACCTCAGACGGCT CTGAGCCTTATTTCTCTGGAAGCGGCTAAGGGACGGTTGGGGGCTGGGAGCCCTG 
GGCGTGTAGTGTAACTGGAATCTTTTGCCTCTCCCAGCCACCTCCTCCCAGCCCC CCAGGAGAGCTGGGCACATGTCCCAAGCCTGTCAGTGGCCCTCCCTGGTGCACTG TCCCCGAAACCCCTGCTTGGGAAGGGAAGCTGTCGGGTGGGCTAGGACTGACCC TTGTGGTGTTTTTTTGGGTGGTGGCTGGAAACAGCCCCTCTCCCACGTGGCAGAG GCTCAGCCTGGCTCCCTTCCCTGGAGCGGCAGGGCGTGACGGCCACAGGGTCTG CCCGCTGCACGTTCTGCCAAGGTGGTGGTGGCGGGCGGGTAGGGGTGTGGGGGC CGTCTTCCTCCTGTCTCTTTCCTTTCACCCTAGCCTGACTGGAAGCAGAAAATGAC CAAATCAGTATTTTTTTTAATGAAATATTATTGCTGGAGGCGTCCCAGGCAAGCC TGGCTGTAGTAGCGAGTGATCTGGCGGGGGGCGTCTCAGCACCCTCCCCAGGGG GTGCATCTCAGCCCCCTCTTTCCGTCCTTCCCGTCCAGCCCCAGCCCTGGGCCTGG GCTGCCGACACCTGGGCCAGAGCCCCTGCTGTGATTGGTGCTCCCTGGGCCTCCC GGGTGGATGAAGCCAGGCGTCGCCCCCTCCGGGAGCCCTGGGGTGAGCCGCCGG GGCCCCCCTGCTGCCAGCCTCCCCCGTCCCCAACATGCATCTCACTCTGGGTGTC TTGGTCTTTTATTTTTTGTAAGTGTCATTTGTATAACTCTAAACGCCCATGATAGT AGCTTCAAACTGGAAATAGCGAAATAAAATAACTCAGTCTGCAGCCCCAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA (SEQ ID NO: 10; NCBI Reference No. NM_003088.2)
[0069] KIF2C is a kinesin-like protein that has microtubule-depolymerizing activity. KIF2C may also be known in the art as kinesin family member 2C, mitotic centromere-associated kinesin, MCAK1, kinesin-like 6 (mitotic centromere-associated kinesin), KNSL6, kinesin-like protein KIF2C, kinesin-like protein 6, kinesin-like 6, or RP11-269F19.1. An exemplary human KIF2C protein contains 725 amino acid residues and has the following polypeptide sequence:
MAMDSSLQARLFPGLAIKIQRSNGLIHSANVRTVNLEKSCVSVEWAEGGATKGKEID FDDVAAINPELLQLLPLHPKDNLPLQENVTIQKQKRRSVNSKIPAPKESLRSRSTRMS TVSELRITAQENDMEVELPAAANSRKQFSVPPAPTRPSCPAVAEIPLRMVSEEMEEQV HSIRGSSSANPVNSVRRKSCLVKEVEKMKNKREEKKAQNSEMRMKRAQEYDSSFPN WEFARMIKEFRATLECHPLTMTDPIEEHRICVCVRKRPLNKQELAKKEIDVISIPSKCL LLVHEPKLKVDLTKYLENQAFCFDFAFDETASNEVVYRFTARPLVQTIFEGGKATCF AYGQTGSGKTHTMGGDLSGKAQNASKGIYAMASRDVFLLKNQPCYRKLGLEVYVT FFEIYNGKLFDLLNKKAKLRVLEDGKQQVQVVGLQEHLVNSADDVIKMIDMGSACR TSGQTFANSNSSRSHACFQIILRAKGRMHGKFSLVDLAGNERGADTSSADRQTRMEG AEINKSLLALKECIRALGQNKAHTPFRESKLTQVLRDSFIGENSRTCMIATISPGISSCE YTLNTLRYADRVKELSPHSGPSGEQLIQMETEEMEACSNGALIPGNLSKEEEELSSQM SSFNEAMTQIRELEEKAMEELKEIIQQGPDWLELSEMTEQPDYDLETFVNKAESALA QQAKHFSALRDVIKALRLAMQLEEQASRQISSKKRPQ (SEQ ID NO: 11; NCBI Reference No. NP_006836.2) [0070] An mRNA sequence for this KIF2C polypeptide is:
ACGCTTGCGCGCGGGATTTAAACTGCGGCGGTTTACGCGGCGTTAAGACTTCGTA GGGTTAGCGAAATTGAGGTTTCTTGGTATTGCGCGTTTCTCTTCCTTGCTGACTCT CCGAATGGCCATGGACTCGTCGCTTCAGGCCCGCCTGTTTCCCGGTCTCGCTATC AAGATCCAACGCAGTAATGGTTTAATTCACAGTGCCAATGTAAGGACTGTGAACT TGGAGAAATCCTGTGTTTCAGTGGAATGGGCAGAAGGAGGTGCCACAAAGGGCA AAGAGATTGATTTTGATGATGTGGCTGCAATAAACCCAGAACTCTTACAGCTTCT TCCCTTACATCCGAAGGACAATCTGCCCTTGCAGGAAAATGTAACAATCCAGAA ACAAAAACGGAGATCCGTCAACTCCAAAATTCCTGCTCCAAAAGAAAGTCTTCG AAGCCGCTCCACTCGCATGTCCACTGTCTCAGAGCTTCGCATCACGGCTCAGGAG AATGACATGGAGGTGGAGCTGCCTGCAGCTGCAAACTCCCGCAAGCAGTTTTCA GTTCCTCCTGCCCCCACTAGGCCTTCCTGCCCTGCAGTGGCTGAAATACCATTGA GGATGGTCAGCGAGGAGATGGAAGAGCAAGTCCATTCCATCCGAGGCAGCTCTT CTGCAAACCCTGTGAACTCAGTTCGGAGGAAATCATGTCTTGTGAAGGAAGTGG AAAAAATGAAGAACAAGCGAGAAGAGAAGAAGGCCCAGAACTCTGAAATGAGA ATGAAGAGAGCTCAGGAGTATGACAGTAGTTTTCCAAACTGGGAATTTGCCCGA ATGATTAAAGAATTTCGGGCTACTTTGGAATGTCATCCACTTACTATGACTGATC CTATCGAAGAGCACAGAATATGTGTCTGTGTTAGGAAACGCCCACTGAATAAGC AAGAATTGGCCAAGAAAGAAATTGATGTGATTTCCATTCCTAGCAAGTGTCTCCT CTTGGTACATGAACCCAAGTTGAAAGTGGACTTAACAAAGTATCTGGAGAACCA AGCATTCTGCTTTGACTTTGCATTTGATGAAACAGCTTCGAATGAAGTTGTCTAC AGGTTCACAGCAAGGCCACTGGTACAGACAATCTTTGAAGGTGGAAAAGCAACT TGTTTTGCATATGGCCAGACAGGAAGTGGCAAGACACATACTATGGGCGGAGAC CTCTCTGGGAAAGCCCAGAATGCATCCAAAGGGATCTATGCC ATGGCCTCCCGG GACGTCTTCCTCCTGAAGAATCAACCCTGCTACCGGAAGTTGGGCCTGGAAGTCT ATGTGACATTCTTCGAGATCTACAATGGGAAGCTGTTTGACCTGCTCAACAAGAA GGCCAAGCTGCGCGTGCTGGAGGACGGCAAGCAACAGGTGCAAGTGGTGGGGCT GCAGGAGCATCTGGTTAACTCTGCTGATGATGTCATCAAGATGATCGACATGGGC AGCGCCTGCAGAACCTCTGGGCAGACATTTGCCAACTCCAATTCCTCCCGCTCCC ACGCGTGCTTCCAAATTATTCTTCGAGCTAAAGGGAGAATGCATGGCAAGTTCTC TTTGGTAGATCTGGCAGGGAATGAGCGAGGCGCGGACACTTCCAGTGCTGACCG GCAGACCCGCATGGAGGGCGCAGAAATCAACAAGAGTCTCTTAGCCCTGAAGGA GTGCATCAGGGCCCTGGGACAGAACAAGGCTCACACCCCGTTCCGTGAGAGCAA GCTGACACAGGTGCTGAGGGACTCCTTCATTGGGGAGAACTCTAGGACTTGCATG ATTGCCACGATCTCACCAGGCATAAGCTCCTGTGAATATACTTTAAACACCCTGA GATATGCAGACAGGGTCAAGGAGCTGAGCCCCCACAGTGGGCCCAGTGGAGAGC AGTTGATTCAAATGGAAACAGAAGAGATGGAAGCCTGCTCTAACGGGGCGCTGA 
TTCCAGGCAATTTATCCAAGGAAGAGGAGGAACTGTCTTCCCAGATGTCCAGCTT TAACGAAGCCATGACTCAGATCAGGGAGCTGGAGGAGAAGGCTATGGAAGAGCT CAAGGAGATCATACAGCAAGGACCAGACTGGCTTGAGCTCTCTGAGATGACCGA GCAGCCAGACTATGACCTGGAGACCTTTGTGAACAAAGCGGAATCTGCTCTGGC CCAGCAAGCCAAGCATTTCTCAGCCCTGCGAGATGTCATCAAGGCCTTGCGCCTG GCCATGCAGCTGGAAGAGCAGGCTAGCAGACAAATAAGCAGCAAGAAACGGCC CCAGTGACGACTGCAAATAAAAATCTGTTTGGTTTGACACCCAGCCTCTTCCCTG GCCCTCCCCAGAGAACTTTGGGTACCTGGTGGGTCTAGGCAGGGTCTGAGCTGGG ACAGGTTCTGGTAAATGCCAAGTATGGGGGCATCTGGGCCCAGGGCAGCTGGGG AGGGGGTCAGAGTGACATGGGACACTCCTTTTCTGTTCCTCAGTTGTCGCCCTCA CGAGAGGAAGGAGCTCTTAGTTACCCTTTTGTGTTGCCCTTCTTTCCATCAAGGG GAATGTTCTCAGCATAGAGCTTTCTCCGCAGCATCCTGCCTGCGTGGACTGGCTG CTAATGGAGAGCTCCCTGGGGTTGTCCTGGCTCTGGGGAGAGAGACGGAGCCTTT AGTACAGCTATCTGCTGGCTCTAAACCTTCTACGCCTTTGGGCCGAGCACTGAAT GTCTTGTACTTTAAAAAAATGTTTCTGAGACCTCTTTCTACTTTACTGTCTCCCTA GAGATCCTAGAGGATCCCTACTGTTTTCTGTTTTATGTGTTTATACATTGTATGTA ACAATAAAGAGAAAAAATAAATCAGCTGTTTAAGTGTGTGGAAAAAAAAAAAA AAAAAA (SEQ ID NO: 12; NCBI Reference No. NM 006845.3) [0071] MMP1 is a matrix metalloproteinase that degrades extracellular matrix proteins.
MMP1 may also be known in the art as matrix metallopeptidase 1, matrix metalloprotease 1 matrix metalloproteinase 1, MMP-1, EC 3.4.24.7, CLG, or CLGN. An exemplary human MMP1 protein contains 469 amino acid residues and has the following polypeptide sequence: MHSFPPLLLLLFWGVVSHSFPATLETQEQDVDLVQKYLEKYYNLKNDGRQVEKRRN SGPVVEKLKQMQEFFGLKVTGKPDAETLKVMKQPRCGVPDVAQFVLTEGNPRWEQ THLTYRIENYTPDLPRADVDHAIEKAFQLWSNVTPLTFTKVSEGQADIMISFVRGDHR DNSPFDGPGGNLAHAFQPGPGIGGDAHFDEDERWTNNFREYNLHRVAAHELGHSLG LSHSTDIGALMYPSYTFSGDVQLAQDDIDGIQAIYGRSQNPVQPIGPQTPKACDSKLT FDAITTIRGEVMFFKDRFYMRTNPFYPEVELNFISVFWPQLPNGLEAAYEFADRDEVR FFKGNKYWAVQGQNVLHGYPKDIYSSFGFPRTVKHIDAALSEENTGKTYFFVANKY WRYDEYKRSMDPGYPKMIAHDFPGIGHKVDAVFMKDGFFYFFHGTRQYKFDPKTK RILTLQKANSWFNCRKN (SEQ ID NO: 13; NCBI Reference No. NP 002412.1)
[0072] An mRNA sequence for this polypeptide is:
AGCATGAGTCAGACAGCCTCTGGCTTTCTGGAAGGGCAAGGACTCTATATATACA GAGGGAGCTTCCTAGCTGGGATATTGGAGCAGCAAGAGGCTGGGAAGCCATCAC TTACCTTGCACTGAGAAAGAAGACAAAGGCCAGTATGCACAGCTTTCCTCCACTG CTGCTGCTGCTGTTCTGGGGTGTGGTGTCTCACAGCTTCCCAGCGACTCTAGAAA CACAAGAGCAAGATGTGGACTTAGTCCAGAAATACCTGGAAAAATACTACAACC TGAAGAATGATGGGAGGCAAGTTGAAAAGCGGAGAAATAGTGGCCCAGTGGTTG AAAAATTGAAGCAAATGCAGGAATTCTTTGGGCTGAAAGTGACTGGGAAACCAG ATGCTGAAACCCTGAAGGTGATGAAGCAGCCCAGATGTGGAGTGCCTGATGTGG CTCAGTTTGTCCTCACTGAGGGGAACCCTCGCTGGGAGCAAACACATCTGACCTA CAGGATTGAAAATTACACGCCAGATTTGCCAAGAGCAGATGTGGACCATGCCAT TGAGAAAGCCTTCCAACTCTGGAGTAATGTCACACCTCTGACATTCACCAAGGTC TCTGAGGGTCAAGCAGACATCATGATATCTTTTGTCAGGGGAGATCATCGGGACA ACTCTCCTTTTGATGGACCTGGAGGAAATCTTGCTCATGCTTTTCAACCAGGCCC AGGTATTGGAGGGGATGCTC ATTTTGATGAAGATGAAAGGTGGACCAACAATTT CAGAGAGTACAACTTACATCGTGTTGCAGCTCATGAACTCGGCCATTCTCTTGGA CTCTCCCATTCTACTGATATCGGGGCTTTGATGTACCCTAGCTACACCTTCAGTGG TGATGTTCAGCTAGCTCAGGATGACATTGATGGCATCCAAGCCATATATGGACGT TCCCAAAATCCTGTCCAGCCCATCGGCCCACAAACCCCAAAAGCGTGTGACAGT AAGCTAACCTTTGATGCTATAACTACGATTCGGGGAGAAGTGATGTTCTTTAAAG ACAGATTCTACATGCGCACAAATCCCTTCTACCCGGAAGTTGAGCTCAATTTCAT TTCTGTTTTCTGGCCACAACTGCCAAATGGGCTTGAAGCTGCTTACGAATTTGCC GACAGAGATGAAGTCCGGTTTTTCAAAGGGAATAAGTACTGGGCTGTTCAGGGA CAGAATGTGCTACACGGATACCCCAAGGACATCTACAGCTCCTTTGGCTTCCCTA GAACTGTGAAGCATATCGATGCTGCTCTTTCTGAGGAAAACACTGGAAAAACCT ACTTCTTTGTTGCTAACAAATACTGGAGGTATGATGAATATAAACGATCTATGGA TCCAGGTTATCCCAAAATGATAGCACATGACTTTCCTGGAATTGGCCACAAAGTT GATGCAGTTTTCATGAAAGATGGATTTTTCTATTTCTTTCATGGAACAAGACAAT ACAAATTTGATCCTAAAACGAAGAGAATTTTGACTCTCCAGAAAGCTAATAGCTG GTTCAACTGCAGGAAAAATTGAACATTACTAATTTGAATGGAAAACACATGGTG TGAGTCCAAAGAAGGTGTTTTCCTGAAGAACTGTCTATTTTCTCAGTCATTTTTAA CCTCTAGAGTCACTGATACACAGAATATAATCTTATTTATACCTCAGTTTGCATAT TTTTTTACTATTTAGAATGTAGCCCTTTTTGTACTGATATAATTTAGTTCCACAAA TGGTGGGTACAAAAAGTCAAGTTTGTGGCTTATGGATTCATATAGGCCAGAGTTG CAAAGATCTTTTCCAGAGTATGCAACTCTGACGTTGATCCCAGAGAGCAGCTTCA GTGACAAACATATCCTTTCAAGACAGAAAGAGACAGGAGACATGAGTCTTTGCC 
GGAGGAAAAGCAGCTCAAGAACACATGTGCAGTCACTGGTGTCACCCTGGATAG GCAAGGGATAACTCTTCTAACACAAAATAAGTGTTTTATGTTTGGAATAAAGTCA ACCTTGTTTCTACTGTTTTATACACTTTCAAAAAAAAAAAAAAAAAAAAAAAAA A (SEQ ID NO: 14; NCBI Reference No. NM 002421.3)
[0073] N-cadherin (NCAD or CDH2) is a calcium-dependent membrane protein that is involved in cell adhesion. N-cadherin may also be known in the art as cadherin 2, type 1, N-cadherin (neuronal), N-cadherin 1, CDHN, neuronal calcium-dependent adhesion protein, neural-cadherin, CD325, neural cadherin2, CD325 antigen, CDw325, or cadherin-2. An exemplary human N-cadherin protein contains 906 amino acid residues and has the following polypeptide sequence:
MCRIAGALRTLLPLLAALLQ AS VEASGEIALCKTGFPEDVYS AVLSKDVHEGQPLLN VKFSNCNGKRKVQYESSEPADFKVDEDGMVYAVRSFPLSSEHAKFLIYAQDKETQE KWQVAVKLSLKPTLTEESVKESAEVEEIVFPRQFSKHSGHLQRQKRDWVIPPINLPEN SRGPFPQELVRIRSDRDKNLSLRYSVTGPGADQPPTGIFIINPISGQLSVTKPLDREQIA RFHLRAHAVDINGNQVENPIDIVINVIDMNDNRPEFLHQVWNGTVPEGSKPGTYVMT VTAIDADDPNALNGMLRYRIVSQAPSTPSPNMFTINNETGDIITVAAGLDREKVQQYT LIIQATDMEGNPTYGLSNTATAVITVTDVNDNPPEFTAMTFYGEVPENRVDIIVANLT VTDKDQPHTPAWNAVYRISGGDPTGRFAIQTDPNSNDGLVTVVKPIDFETNRMFVLT VAAENQVPLAKGIQHPPQSTATVSVTVIDVNENPYFAPNPKIIRQEEGLHAGTMLTTF TAQDPDRYMQQNIRYTKLSDPANWLKIDPVNGQITTIAVLDRESPNVKNNIYNATFL ASDNGIPPMSGTGTLQIYLLDINDNAPQVLPQEAETCETPDPNSINITALDYDIDPNAG PFAFDLPLSPVTIKRNWTITRLNGDFAQLNLKIKFLEAGIYEVPIIITDSGNPPKSNISILR VKVCQCDSNGDCTDVDRIVGAGLGTGAIIAILLCIIILLILVLMFVVWMKRRDKERQA KQLLIDPEDDVRDNILKYDEEGGGEEDQDYDLSQLQQPDTVEPDAIKPVGIRRMDER PIHAEPQYPVRSAAPHPGDIGDFINEGLKAADNDPTAPPYDSLLVFDYEGSGSTAGSL S SLNS S S SGGEQD YD YLND WGPRFKKLADM YGGGDD (SEQ ID NO: 15; NCBI Reference No. NP_001783.2)
[0074] An mRNA sequence for this NCAD polypeptide is:
GGGGAGCGCCATCCGCTCCACTTCCACCTCCACATCCTCCACCGGCCAAGGTCCC CGCCGCTGCATCCCTCGCGGCTTCCGCTGCGCTCCGGGCCGGAGCCGAGCCGCCT GCGCTGCCACAGCAGCCGCCTCCACACACTCGCAGACGCTCACACGCTCTCCCTC CCTGTTCCCCCGCCCCCTCCCCAGCTCCTTGATCTCTGGGTCTGTTTTATTACTCCT GGTGCGAGTCCCGCGGACTCCGCGGCCCGCTATTTGTCATCAGCTCGCTCTCCAT TGGCGGGGAGCGGAGAGCAGCGAAGAAGGGGGTGGGGAGGGGAGGGGAAGGG AAGGGGGTGGAAACTGCCTGGAGCCGTTTCTCCGCGCCGCTGTTGGTGCTGCCGC TGCCTCCTCCTCCTCCGCCGCCGCCGCCGCCGCCGCCGCCTCCTCCGGCTCTTCGC TCGGCCCCTCTCCGCCTCCATGTGCCGGATAGCGGGAGCGCTGCGGACCCTGCTG CCGCTGCTGGCGGCCCTGCTTCAGGCGTCTGTAGAGGCTTCTGGTGAAATCGCAT TATGCAAGACTGGATTTCCTGAAGATGTTTACAGTGCAGTCTTATCGAAGGATGT GCATGAAGGACAGCCTCTTCTCAATGTGAAGTTTAGCAACTGCAATGGAAAAAG AAAAGTACAATATGAGAGCAGTGAGCCTGCAGATTTTAAGGTGGATGAAGATGG CATGGTGTATGCCGTGAGAAGCTTTCCACTCTCTTCTGAGCATGCCAAGTTCCTG ATATATGCCCAAGAC AAAGAGACCCAGGAAAAGTGGCAAGTGGCAGTAAAATTG AGCCTGAAGCCAACCTTAACTGAGGAGTCAGTGAAGGAGTCAGCAGAAGTTGAA GAAATAGTGTTCCCAAGACAATTCAGTAAGCACAGTGGCCACCTACAAAGGCAG AAGAGAGACTGGGTCATCCCTCCAATCAACTTGCCAGAAAACTCCAGGGGACCT TTTCCTCAAGAGCTTGTCAGGATCAGGTCTGATAGAGATAAAAACCTTTCACTGC GGTACAGTGTAACTGGGCC AGGAGCTGACC AGCCTCCAACTGGTATCTTC ATT AT CAACCCCATCTCGGGTCAGCTGTCGGTGACAAAGCCCCTGGATCGCGAGCAGAT AGCCCGGTTTCATTTGAGGGCACATGCAGTAGATATTAATGGAAATCAAGTGGA GAACCCCATTGACATTGTCATCAATGTTATTGACATGAATGACAACAGACCTGAG TTCTTACACCAGGTTTGGAATGGGACAGTTCCTGAGGGATCAAAGCCTGGAACAT ATGTGATGACCGTAACAGCAATTGATGCTGACGATCCCAATGCCCTCAATGGGAT GTTGAGGTACAGAATCGTGTCTCAGGCTCCAAGCACCCCTTCACCCAACATGTTT ACAATCAACAATGAGACTGGTGACATCATCACAGTGGCAGCTGGACTTGATCGA GAAAAAGTGCAACAGTATACGTTAATAATTCAAGCTACAGACATGGAAGGCAAT CCCACATATGGCCTTTCAAACACAGCCACGGCCGTCATCACAGTGACAGATGTCA ATGACAATCCTCCAGAGTTTACTGCCATGACGTTTTATGGTGAAGTTCCTGAGAA CAGGGTAGACATCATAGTAGCTAATCTAACTGTGACCGATAAGGATCAACCCCA TACACCAGCCTGGAACGCAGTGTACAGAATCAGTGGCGGAGATCCTACTGGACG GTTCGCCATCCAGACCGACCCAAACAGCAACGACGGGTTAGTCACCGTGGTCAA ACCAATCGACTTTGAAACAAATAGGATGTTTGTCCTTACTGTTGCTGCAGAAAAT CAAGTGCCATTAGCCAAGGGAATTCAGCACCCGCCTCAGTCAACTGCAACCGTG 
TCTGTTACAGTTATTGACGTAAATGAAAACCCTTATTTTGCCCCCAATCCTAAGAT CATTCGCCAAGAAGAAGGGCTTCATGCCGGTACCATGTTGACAACATTCACTGCT CAGGACCCAGATCGATATATGCAGCAAAATATTAGATACACTAAATTATCTGATC CTGCCAATTGGCTAAAAATAGATCCTGTGAATGGACAAATAACTACAATTGCTGT TTTGGACCGAGAATCACCAAATGTGAAAAACAATATATATAATGCTACTTTCCTT GCTTCTGACAATGGAATTCCTCCTATGAGTGGAACAGGAACGCTGCAGATCTATT TACTTGATATTAATGACAATGCCCCTCAAGTGTTACCTCAAGAGGCAGAGACTTG CGAAACTCCAGACCCCAATTCAATTAATATTACAGCACTTGATTATGACATTGAT CCAAATGCTGGACCATTTGCTTTTGATCTTCCTTTATCTCCAGTGACTATTAAGAG AAATTGGACCATCACTCGGCTTAATGGTGATTTTGCTCAGCTTAATTTAAAGATA AAATTTCTTGAAGCTGGTATCTATGAAGTTCCCATCATAATCACAGATTCGGGTA ATCCTCCCAAATCAAATATTTCCATCCTGCGCGTGAAGGTTTGCCAGTGTGACTC CAACGGGGACTGC ACAGATGTGGACAGGATTGTGGGTGCGGGGCTTGGC ACCGG TGCCATCATTGCCATCCTGCTCTGCATCATCATCCTGCTTATCCTTGTGCTGATGT TTGTGGTATGGATGAAACGCCGGGATAAAGAACGCCAGGCCAAACAACTTTTAA TTGATCCAGAAGATGATGTAAGAGATAATATTTTAAAATATGATGAAGAAGGTG GAGGAGAAGAAGACCAGGACTATGACTTGAGCCAGCTGCAGCAGCCTGACACTG TGGAGCCTGATGCCATCAAGCCTGTGGGAATCCGACGAATGGATGAAAGACCCA TCCACGCCGAGCCCCAGTATCCGGTCCGATCTGCAGCCCCACACCCTGGAGACAT TGGGGACTTCATTAATGAGGGCCTTAAAGCGGCTGACAATGACCCCACAGCTCC ACCATATGACTCCCTGTTAGTGTTTGACTATGAAGGCAGTGGCTCCACTGCTGGG TCCTTGAGCTCCCTTAATTCCTCAAGTAGTGGTGGTGAGCAGGACTATGATTACC TGAACGACTGGGGGCCACGGTTCAAGAAACTTGCTGACATGTATGGTGGAGGTG ATGACTGAACTTCAGGGTGAACTTGGTTTTTGGACAAGTACAAACAATTTCAACT GATATTCCCAAAAAGCATTCAGAAGCTAGGCTTTAACTTTGTAGTCTACTAGCAC AGTGCTTGCTGGAGGCTTTGGCATAGGCTGCAAACCAATTTGGGCTCAGAGGGA ATATCAGTGATCCATACTGTTTGGAAAAACACTGAGCTCAGTTACACTTGAATTT TACAGTACAGAAGCACTGGGATTTTATGTGCCTTTTTGTACCTTTTTCAGATTGGA ATTAGTTTTCTGTTTAAGGCTTTAATGGTACTGATTTCTGAAACGATAAGTAAAA GACAAAATATTTTGTGGTGGGAGCAGTAAGTTAAACCATGATATGCTTCAACACG CTTTTGTTACATTGCATTTGCTTTTATTAAAATACAAAATTAAACAAACAAAAAA ACTCATGGAGCGATTTTATTATCTTGGGGGATGAGACCATGAGATTGGAAAATGT ACATTACTTCTAGTTTTAGACTTTAGTTTGTTTTTTTTTTTTTCACTAAAATCTTAA AACTTACTCAGCTGGTTGCAAATAAAGGGAGTTTTCATATCACCAATTTGTAGCA AAATTGAATTTTTTCATAAACTAGAATGTTAGACACATTTTGGTCTTAATCCATGT 
ACACTTTTTTATTTCTGTATTTTTCCACTTCACTGTAAAAATAGTATGTGTACATA ATGTTTTATTGGCATAGTCTATGGAGAAGTGCAGAAACTTCAGAACATGTGTATG TATTATTTGGACTATGGATTCAGGTTTTTTGCATGTTTATATCTTTCGTTATGGAT AAAGTATTTACAAAACAGTGACATTTGATTCAATTGTTGAGCTGTAGTTAGAATA CTCAATTTTTAATTTTTTTAATTTTTTTATTTTTTATTTTCTTTTTGGTTTGGGGAGG GAGAAAAGTTCTTAGCACAAATGTTTTACATAATTTGTACCAAAAAAAAAAAAA AAGGAAAGGAAAGAAAGGGGTGGCCTGACACTGGTGGCACTACTAAGTGTGTGT TTTTTTAAAAAAAAAATGGAAAAAAAAAAGCTTTTAAACTGGAGAGACTTCTGA CAACAGCTTTGCCTCTGTATTGTGTACCAGAATATAAATGATACACCTCTGACCC CAGCGTTCTGAATAAAATGCTAATTTTGGATCTGGAAAAAAAAAAAAA (SEQ ID NO: 16; NCBI Reference No. NM_001792.3)
[0075] Osteonectin is a protein that is involved in the synthesis of extracellular matrix. Osteonectin may also be known in the art as secreted protein, acidic, cysteine-rich, SPARC, Basement-membrane protein 40, BM-40, ON, cysteine-rich protein, or secreted protein acidic and rich in cysteine. An exemplary human osteonectin protein contains 303 amino acid residues and has the following polypeptide sequence:
MRAWIFFLLCLAGRALAAPQQEALPDETEVVEETVAEVTEVSVGANPVQVEVGEFD DGAEETEEEVVAENPCQNHHCKHGKVCELDENNTPMCVCQDPTSCPAPIGEFEKVC SNDNKTFDSSCHFFATKCTLEGTK GHKLHLDYIGPCKYIPPCLDSELTEFPLRMRD WLKNVLVTLYERDEDNNLLTEKQKLRVKKIHENEKRLEAGDHPVELLARDFEKNYN MYIFPVHWQFGQLDQHPIDGYLSHTELAPLRAPLIPMEHCTTRFFETCDLDNDKYIAL DEWAGCFGIKQKDIDKDLVI (SEQ ID NO: 17; NCBI Reference No. NP 003109.1)
[0076] An mRNA sequence for this polypeptide is:
GTTGCCTGTCTCTAAACCCCTCCACATTCCCGCGGTCCTTCAGACTGCCCGGAGA GCGCGCTCTGCCTGCCGCCTGCCTGCCTGCCACTGAGGGTTCCCAGCACCATGAG GGCCTGGATCTTCTTTCTCCTTTGCCTGGCCGGGAGGGCCTTGGCAGCCCCTCAG CAAGAAGCCCTGCCTGATGAGACAGAGGTGGTGGAAGAAACTGTGGCAGAGGTG ACTGAGGTATCTGTGGGAGCTAATCCTGTCCAGGTGGAAGTAGGAGAATTTGAT GATGGTGCAGAGGAAACCGAAGAGGAGGTGGTGGCGGAAAATCCCTGCCAGAA CCACCACTGCAAACACGGCAAGGTGTGCGAGCTGGATGAGAACAACACCCCCAT GTGCGTGTGCCAGGACCCCACCAGCTGCCCAGCCCCCATTGGCGAGTTTGAGAA GGTGTGCAGCAATGACAACAAGACCTTCGACTCTTCCTGCCACTTCTTTGCCACA AAGTGCACCCTGGAGGGCACCAAGAAGGGCCACAAGCTCCACCTGGACTACATC GGGCCTTGCAAATACATCCCCCCTTGCCTGGACTCTGAGCTGACCGAATTCCCCC TGCGCATGCGGGACTGGCTCAAGAACGTCCTGGTCACCCTGTATGAGAGGGATG AGGACAACAACCTTCTGACTGAGAAGCAGAAGCTGCGGGTGAAGAAGATCCATG AGAATGAGAAGCGCCTGGAGGCAGGAGACCACCCCGTGGAGCTGCTGGCCCGG GACTTCGAGAAGAACTATAACATGTACATCTTCCCTGTACACTGGCAGTTCGGCC AGCTGGACCAGCACCCCATTGACGGGTACCTCTCCCACACCGAGCTGGCTCCACT GCGTGCTCCCCTCATCCCCATGGAGCATTGCACCACCCGCTTTTTCGAGACCTGT GACCTGGACAATGACAAGTACATCGCCCTGGATGAGTGGGCCGGCTGCTTCGGC ATCAAGC AGAAGGATATCGAC AAGGATCTTGTGATCTAAATCCACTCCTTCCACA GTACCGGATTCTCTCTTTAACCCTCCCCTTCGTGTTTCCCCCAATGTTTAAAATGT TTGGATGGTTTGTTGTTCTGCCTGGAGACAAGGTGCTAACATAGATTTAAGTGAA TACATTAACGGTGCTAAAAATGAAAATTCTAACCCAAGACATGACATTCTTAGCT GTAACTTAACTATTAAGGCCTTTTCCACACGCATTAATAGTCCCATTTTTCTCTTG CC ATTTGTAGCTTTGCCCATTGTCTTATTGGC ACATGGGTGGACACGGATCTGCTG GGCTCTGCCTTAAACACACATTGCAGCTTCAACTTTTCTCTTTAGTGTTCTGTTTG AAACTAATACTTACCGAGTCAGACTTTGTGTTCATTTCATTTCAGGGTCTTGGCTG CCTGTGGGCTTCCCCAGGTGGCCTGGAGGTGGGCAAAGGGAAGTAACAGACACA CGATGTTGTCAAGGATGGTTTTGGGACTAGAGGCTCAGTGGTGGGAGAGATCCCT GCAGAACCCACCAACCAGAACGTGGTTTGCCTGAGGCTGTAACTGAGAGAAAGA TTCTGGGGCTGTGTTATGAAAATATAGACATTCTCACATAAGCCCAGTTCATCAC CATTTCCTCCTTTACCTTTCAGTGCAGTTTCTTTTCACATTAGGCTGTTGGTTCAAA CTTTTGGGAGCACGGACTGTCAGTTCTCTGGGAAGTGGTCAGCGCATCCTGCAGG GCTTCTCCTCCTCTGTCTTTTGGAGAACCAGGGCTCTTCTCAGGGGCTCTAGGGAC TGCCAGGCTGTTTCAGCCAGGAAGGCCAAAATCAAGAGTGAGATGTAGAAAGTT GTAAAATAGAAAAAGTGGAGTTGGTGAATCGGTTGTTCTTTCCTCACATTTGGAT 
GATTGTCATAAGGTTTTTAGCATGTTCCTCCTTTTCTTCACCCTCCCCTTTTTTCTT CTATTAATCAAGAGAAACTTCAAAGTTAATGGGATGGTCGGATCTCACAGGCTG AGAACTCGTTCACCTCCAAGCATTTCATGAAAAAGCTGCTTCTTATTAATCATAC AAACTCTCACCATGATGTGAAGAGTTTCACAAATCCTTCAAAATAAAAAGTAATG ACTTAGAAACTGCCTTCCTGGGTGATTTGCATGTGTCTTAGTCTTAGTCACCTTAT TATCCTGACACAAAAACACATGAGCATACATGTCTACACATGACTACACAAATG CAAACCTTTGCAAACACATTATGCTTTTGCACACACACACCTGTACACACACACC GGCATGTTTATACACAGGGAGTGTATGGTTCCTGTAAGCACTAAGTTAGCTGTTT TCATTTAATGACCTGTGGTTTAACCCTTTTGATCACTACCACCATTATCAGCACCA GACTGAGCAGCTATATCCTTTTATTAATCATGGTCATTCATTCATTCATTCATTCA CAAAATATTTATGATGTATTTACTCTGCACCAGGTCCCATGCCAAGCACTGGGGA CACAGTTATGGCAAAGTAGACAAAGCATTTGTTCATTTGGAGCTTAGAGTCCAGG AGGAATACATTAGATAATGACACAATCAAATATAAATTGCAAGATGTCACAGGT GTGATGAAGGGAGAGTAGGAGAGACCATGAGTATGTGTAACAGGAGGACACAG CATTATTCTAGTGCTGTACTGTTCCGTACGGCAGCCACTACCCACATGTAACTTTT TAAGATTTAAATTTAAATTAGTTAACATTCAAAACGCAGCTCCCCAATCACACTA GCAAC ATTTCAAGTGCTTGAGAGCCATGCATGATTAGTGGTTACCCTATTGAATA GGTCAGAAGTAGAATCTTTTCATCATCACAGAAAGTTCTATTGGACAGTGCTCTT CTAGATCATCATAAGACTACAGAGCACTTTTCAAAGCTCATGCATGTTCATCATG TTAGTGTCGTATTTTGAGCTGGGGTTTTGAGACTCCCCTTAGAGATAGAGAAACA GACCCAAGAAATGTGCTCAATTGCAATGGGCCACATACCTAGATCTCCAGATGTC ATTTCCCCTCTCTTATTTTAAGTTATGTTAAGATTACTAAAAC AATAAAAGCTCCT AAAAAATCAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA (SEQ ID NO: 18; NCBI Reference No. NM 003118.2)
[0077] PCNA is a cofactor for DNA polymerase delta, and, thus, is involved in DNA replication. PCNA may also be known in the art as proliferating cell nuclear antigen, MGC8367, or DNA polymerase delta auxiliary protein. An exemplary human PCNA protein contains 261 amino acid residues and has the following polypeptide sequence:
MFEARLVQGSILK VLEALKDLINEACWDISSSGVNLQSMDSSHVSLVQLTLRSEGF DTYRCDRNLAMGVNLTSMSKILKCAGNEDIITLRAEDNADTLALVFEAPNQEKVSD YEMKLMDLDVEQLGIPEQEYSCVVKMPSGEFARICRDLSHIGDAVVISCAKDGVKFS AS GELGNGNIKLS QT SN VDKEEE AVTIEMNEP VQLTF ALRYLNFFTKATPL S ST VTL S MSADVPLVVEYKIADMGHLKYYLAPKIEDEEGS (SEQ ID NO: 19; NCBI Reference No. NP_002583.1)
[0078] An mRNA sequence for this polypeptide is:
GGATGGCCGGAGCTGGCGCCCTGGTTCTGGAGGTAACCGGTTACTGAGGGCGAG AAGCGCCACCCGGAGGCTCTAGCCTGACAAATGCTTGCTGACCTGGGCCAGAGC TCTTCCCTTACGCAAGTCTCAGCCGGTCGTCGCGACGTTCGCCCGCTCGCTCTGA GGCTCCTGAAGCCGAAACCAGCTAGACTTTCCTCCTTCCCGCCTGCCTGTAGCGG CGTTGTTGCCACTCCGCCACCATGTTCGAGGCGCGCCTGGTCCAGGGCTCCATCC TCAAGAAGGTGTTGGAGGCACTCAAGGACCTCATCAACGAGGCCTGCTGGGATA TTAGCTCCAGCGGTGTAAACCTGCAGAGCATGGACTCGTCCCACGTCTCTTTGGT GCAGCTCACCCTGCGGTCTGAGGGCTTCGACACCTACCGCTGCGACCGCAACCTG GCCATGGGCGTGAACCTCACCAGTATGTCCAAAATACTAAAATGCGCCGGCAAT GAAGATATCATTACACTAAGGGCCGAAGATAACGCGGATACCTTGGCGCTAGTA TTTGAAGCACCAAACCAGGAGAAAGTTTCAGACTATGAAATGAAGTTGATGGAT TTAGATGTTGAACAACTTGGAATTCCAGAACAGGAGTACAGCTGTGTAGTAAAG ATGCCTTCTGGTGAATTTGCACGTATATGCCGAGATCTCAGCCATATTGGAGATG CTGTTGTAATTTCCTGTGCAAAAGACGGAGTGAAATTTTCTGCAAGTGGAGAACT TGGAAATGGAAACATTAAATTGTCACAGACAAGTAATGTCGATAAAGAGGAGGA AGCTGTTACCATAGAGATGAATGAACCAGTTCAACTAACTTTTGCACTGAGGTAC CTGAACTTCTTTACAAAAGCCACTCCACTCTCTTCAACGGTGACACTCAGTATGT CTGCAGATGTACCCCTTGTTGTAGAGTATAAAATTGCGGATATGGGAC ACTTAAA ATACTACTTGGCTCCCAAGATCGAGGATGAAGAAGGATCTTAGGCATTCTTAAAA TTCAAGAAAATAAAACTAAGCTCTTTGAGAACTGCTTCTAAGATGCCAGCATATA CTGAAGTCTTTTCTGTCACCAAATTTGTACCTCTAAGTACATATGTAGATATTGTT TTCTGTAAATAACCTATTTTTTTCTCTATTCTCTGCAATTTGTTTAAAGAATAAAG TCCAAAGTCAGATCTGGTCTAGTTAACCTAGAAGTATTTTTGTCTCTTAGAAATA CTTGTGATTTTTATAATACAAAAGGGTCTTGACTCTAAATGCAGTTTTAAGAATT GTTTTTGAATTTAAATAAAGTTACTTGAATTTCAAACATCA (SEQ ID NO: 20; NCBI Reference No NM_002592.2) [0079] The sequences presented above are merely illustrative. The biomarkers of this invention encompass all forms and variants of any specifically described biomarkers, including, but not limited to, polymorphic or allelic variants, isoforms, mutants, derivatives, precursors including nucleic acids and pro-proteins, cleavage products, and structures comprised of any of the biomarkers as constituent subunits of the fully assembled structure.
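The pairing of each mRNA listing with its polypeptide listing above can be checked computationally. The following is a minimal Python sketch and is not part of the disclosed methods: the function names `clean`, `translate_frame`, and `encoding_frame` are hypothetical, the standard genetic code is assumed, and whitespace inside a listing is treated as formatting only. The mRNA is translated in all three forward reading frames and the expected polypeptide is searched for in each, so that an upstream out-of-frame ATG does not mislead the check.

```python
# Illustrative sketch (hypothetical helper names; standard genetic code assumed).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean(seq: str) -> str:
    """Remove whitespace from a sequence listing and upper-case it."""
    return "".join(seq.split()).upper()


def translate_frame(mrna: str, frame: int) -> str:
    """Translate one forward reading frame; '*' marks a stop, 'X' an unknown codon."""
    return "".join(
        CODON_TABLE.get(mrna[i:i + 3], "X")
        for i in range(frame, len(mrna) - 2, 3)
    )


def encoding_frame(mrna_listing: str, protein_listing: str):
    """Return the frame (0, 1, or 2) whose translation contains the protein, else None."""
    mrna = clean(mrna_listing).replace("U", "T")
    protein = clean(protein_listing)
    for frame in range(3):
        if protein in translate_frame(mrna, frame):
            return frame
    return None


# Toy usage with a short fragment resembling the start of a coding region.
print(encoding_frame("CCACC ATGTTCGAG GCGCGC CTGTAG", "MFEA RL"))
```

A result of `None` for a real listing would suggest either a different transcript variant or characters lost during text extraction, and would warrant comparison against the cited NCBI reference record.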
II. MEASUREMENT OF BIOMARKERS
[0080] The biomarkers of this invention can be measured in various forms. For example, one may measure the RNA transcript levels (e.g., mRNA or total RNA levels) or gene copy numbers of the biomarkers, or may measure the protein or activity levels of the biomarkers. In some embodiments, one may also measure metabolites (e.g., peptide fragments) of the biomarkers, or surrogates of the biomarkers (e.g., substrates or ligands of the biomarkers, or biological entities downstream in the signaling pathways of the biomarkers). In some embodiments, the biomarkers, their metabolites, or surrogates of the biomarkers can also be measured together with genes or gene products such as B-Raf, N-Ras, K-Ras, p16, and p53. In some embodiments, the biomarkers, their metabolites, or surrogates of the biomarkers are measured together with the measurement of mutations in genes such as B-Raf, N-Ras, K-Ras, p16, and p53.
[0081] At the nucleic acid level, biomarkers may be measured by electrophoresis, Northern and Southern blot analyses, in situ hybridization (e.g., single or multiplex nucleic acid in situ hybridization technology such as Advanced Cell Diagnostics' RNAscope technology), RNAse protection assays, and microarrays (e.g., Illumina BeadArray™ technology; Beads Array for Detection of Gene Expression (BADGE)). Biomarkers may also be measured by polymerase chain reaction (PCR)-based assays, e.g., quantitative PCR, real-time PCR, quantitative real-time PCR (qRT-PCR), and reverse transcriptase PCR (RT-PCR). Other amplification-based methods include, for example, transcription-mediated amplification (TMA), strand displacement amplification (SDA), nucleic acid sequence based amplification (NASBA), and signal amplification methods such as bDNA. Nucleic acid biomarkers also may be measured by sequencing-based techniques such as, for example, serial analysis of gene expression (SAGE), RNA-Seq, high-throughput sequencing technologies (e.g., massively parallel sequencing), and Sequenom MassARRAY® technology. Nucleic acid biomarkers also may be measured by, for example, NanoString nCounter and high coverage expression profiling (HiCEP).
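As an illustration of how a PCR-based readout can be turned into a relative transcript level, the following Python sketch applies the widely used comparative threshold-cycle (2^-ΔΔCt) calculation. This is an assumption made for illustration only: the text above does not specify a quantification scheme, the function name `relative_expression` and its parameters are hypothetical, and the formula assumes roughly 100% amplification efficiency for both the biomarker and the reference gene.

```python
# Illustrative sketch: comparative Ct (2^-ddCt) relative quantification.
# Assumption: both amplicons double every cycle (about 100% efficiency).
def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of a biomarker transcript in a sample relative to a control."""
    delta_sample = ct_target_sample - ct_ref_sample      # normalize to reference gene
    delta_control = ct_target_control - ct_ref_control   # same normalization in control
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values, chosen only to show the arithmetic.
fold = relative_expression(24.0, 18.0, 26.0, 18.0)
print(fold)
```

A lower normalized Ct in the sample than in the control gives a fold change above 1, indicating a higher relative transcript level under the stated assumptions.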
[0082] Levels of biomarkers also can be determined at the protein level, in whole cells and/or in subcellular compartments (e.g., nucleus, cytoplasm and cell membrane).
Exemplary methods include, without limitation, immunoassays such as
immunohistochemistry assays (IHC), immunofluorescence assays (IF), enzyme-linked immunosorbent assays (ELISA), immunoradiometric assays, and immunoenzymatic assays. In immunoassays, one may use, for example, antibodies that bind to a biomarker or a fragment thereof. The antibodies may be monoclonal, polyclonal, chimeric, or humanized. One may also use antigen-binding fragments of a whole antibody, such as single chain antibodies, Fv fragments, Fab fragments, Fab' fragments, F(ab')2 fragments, Fd fragments, single chain Fv molecules (scFv), bispecific single chain Fv dimers, diabodies, domain-deleted antibodies, single domain antibodies, and/or an oligoclonal mixture of two or more specific monoclonal antibodies.
[0083] Other methods to measure biomarkers at the protein level include, for example, chromatography, mass spectrometry, Luminex xMAP Technology, microfluidic chip-based assays, surface plasmon resonance, sequencing, Western blot analysis, aptamer binding, molecular imprints, or a combination thereof. To determine whole cell and/or subcellular levels of a biomarker, one may also use methods such as AQUA® (see, e.g., U.S. Patents 7,219,016, and 7,709,222; Camp et al, Nature Medicine, 8(11): 1323-27 (2002)), and
Definiens TissueStudio (see, e.g., U.S. Patents 7,873,223, 7,801,361, 7,467,159, and
7,146,380, and Baatz et al, Comb Chem High Throughput Screen, 12(9):908-16 (2009)).
[0084] In some embodiments, post-translational modifications of a biomarker may be relevant to cancer prognosis. Such modifications include, without limitation,
phosphorylation (e.g., tyrosine, threonine, or serine phosphorylation) and glycosylation (e.g., O-GlcNAc). Such modifications may be detected, for example, by antibodies specific for the modifications, or by metastable ions in reflector matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF) (Wirth, Proteomics 2(10): 1445-51 (2002)).
[0085] For biomarker proteins known to have enzymatic activity, their levels can be measured through their activities. Such assays include, without limitation, kinase assays, phosphatase assays, and reductase assays, among many others. Modulation of the kinetics of enzyme activities can be determined by measuring the Michaelis constant (KM) using known algorithms, such as the Hill plot, the Michaelis-Menten equation, linear regression plots such as Lineweaver-Burk analysis, and the Scatchard plot.
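As an illustration of the Lineweaver-Burk analysis mentioned above, the following is a minimal sketch, not a method taken from this specification. It assumes initial reaction velocities have been measured at several substrate concentrations and fits the double-reciprocal line 1/v = (KM/Vmax)(1/[S]) + 1/Vmax by ordinary least squares. The function name and the example data are hypothetical.

```python
def lineweaver_burk(substrate, velocity):
    """Estimate (KM, Vmax) from a least-squares fit of 1/v against 1/[S].

    Hypothetical helper for illustration only. Both inputs are sequences
    of positive numbers of equal length.
    """
    x = [1.0 / s for s in substrate]   # 1/[S]
    y = [1.0 / v for v in velocity]    # 1/v
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Ordinary least-squares slope and intercept of the double-reciprocal line.
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept             # intercept = 1/Vmax
    km = slope * vmax                  # slope = KM/Vmax
    return km, vmax


# Synthetic, noise-free data generated with KM = 2 and Vmax = 10 (assumed values).
conc = [0.5, 1.0, 2.0, 4.0, 8.0]
rate = [10.0 * s / (2.0 + s) for s in conc]
km_est, vmax_est = lineweaver_burk(conc, rate)
```

With real, noisy data the double-reciprocal transform tends to overweight low-concentration points, so a direct nonlinear fit of the Michaelis-Menten equation is often preferred in practice.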
[0086] Alternatively, biomarker protein and nucleic acid metabolites can be measured. The term "metabolite" includes any chemical or biochemical product of a metabolic process, such as any compound produced by the processing, cleavage or consumption of a biomarker. Metabolites can be detected in a variety of ways known to one of skill in the art, including refractive index spectroscopy (RI), ultra-violet spectroscopy (UV), fluorescence analysis, radiochemical analysis, near-infrared spectroscopy (near-IR), nuclear magnetic resonance spectroscopy (NMR), light scattering analysis (LS), mass spectrometry, pyrolysis mass spectrometry, nephelometry, dispersive Raman spectroscopy, gas chromatography combined with mass spectrometry, liquid chromatography combined with mass spectrometry, matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) combined with mass spectrometry, ion spray spectroscopy combined with mass spectrometry, capillary
electrophoresis, NMR and IR detection. See, e.g., International Patent Application
Publication Nos. WO04/056456 and WO04/088309.
[0087] In some embodiments, the measured level of a biomarker is normalized against normalizing genes or proteins, including housekeeping genes such as GAPDH, Cynl,
ZNF592, or actin, to remove sources of variation. Methods of normalization are well known in the art. See, e.g., Park et al, BMC Bioinformatics. 4:33 (2003).
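One common way to normalize a measured level against several normalizing genes or proteins is to divide by the geometric mean of their levels. The sketch below assumes that approach. It is an illustration only and is not stated in this specification; the function name and the numbers are hypothetical.

```python
import math


def normalize_level(raw_level, housekeeping_levels):
    """Divide a raw biomarker level by the geometric mean of the
    housekeeping (normalizer) levels measured in the same sample.

    Hypothetical helper; all levels must be positive.
    """
    log_mean = sum(math.log(v) for v in housekeeping_levels) / len(housekeeping_levels)
    geometric_mean = math.exp(log_mean)
    return raw_level / geometric_mean


# Example: a raw level of 8.0 with two normalizers at 2.0 and 8.0
# (geometric mean 4.0) yields a normalized level of about 2.0.
normalized = normalize_level(8.0, [2.0, 8.0])
```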
III. SAMPLE SOURCES
[0088] One of skill in the art will appreciate that a sample utilized in the measurement of biomarker profiles of the invention can be any sample useful for this purpose, e.g., a cancerous tissue sample. A cancerous tissue sample includes, for example, any sample derived from a cancerous tissue of a patient, or from a tissue that is suspected to be cancerous. The sample can be, by way of example, tissue biopsies, blood, serum, plasma, blood cells, endothelial cells, lymphatic fluid, ascites fluid, interstitial fluid, bone marrow, cerebrospinal fluid, saliva, mucus, sputum, sweat, urine, circulating tumor cells, and circulating endothelial cells.
[0089] The sample may be fresh, frozen (e.g., snap-frozen), fixed (e.g., by formalin, ethanol, or an organic solvent, or with plastic or epoxy), embedded (e.g., in paraffin or wax), and/or cross-linked. The sample may be taken as core biopsies, punch biopsies, fine needle aspirations, surgically removed tumor tissue, or tumor-derived cells grown in vitro or in live animals. In some embodiments, the sample may be formalin-fixed paraffin-embedded biopsies.
[0090] The tissue sample may be collected from a subject that is preferably a mammal. The mammal can be a human, non-human primate, mouse, rat, dog, cat, horse, or cow, but is not limited to these examples. Mammals other than humans can be advantageously used as subjects that represent animal models of tumor metastasis. A subject can be male or female. A subject can be one who has been previously diagnosed or identified as having a primary tumor or a metastatic tumor, and optionally has already undergone, or is undergoing, a therapeutic intervention for the tumor such as surgery. Alternatively, a subject can be one who has not been previously diagnosed as having a primary or metastatic tumor, including one who exhibits one or more risk factors for a primary or metastatic tumor. In some embodiments, a subject has a primary tumor, a recurrent tumor, or a metastatic tumor. In some embodiments, the sample is taken from a subject that has previously been treated for a tumor. In other embodiments, the sample is taken from a subject prior to being treated for a tumor.
IV. CONSTRUCTION OF BIOMARKER PANELS
[0091] Biomarker panels of this invention can be constructed with two or more of the PDs described herein. In some embodiments, the particular composition of the panel may depend on the desired prognostic information. For example, a clinician may want to know
affirmatively whether a patient is at a low risk or high risk of cancer metastasis or recurrence, or of disease-specific death. If the patient is determined to be at low risk, the clinician may recommend a less aggressive treatment regimen, to avoid unnecessary side-effects. If the patient is determined not to be at low risk, the clinician may further want to know if the patient is at high risk. If the patient is determined to be at high risk, then the clinician may want to recommend an aggressive treatment regimen, such as post-surgery adjuvant therapy, including radiation, chemotherapy, hormone therapy, and targeted therapy. If the patient is neither at low risk nor at high risk, then the clinician may want to recommend active surveillance and regular follow-up of the patient.
[0092] In some embodiments, to construct a biomarker panel tailored to provide a particular piece of prognostic information, one can select constituent biomarkers using one or more algorithms that prioritize the candidate biomarkers as well as train the optimal formula to combine the results from multiple biomarkers for a panel. By way of example, one may use linear and non-linear equations and statistical classification analyses to determine the relationship between levels of the biomarkers detected in a training cohort and the cohort's known clinical outcome (e.g., survival at a given time point). One may also use structural and syntactic statistical classification algorithms, and methods of risk index construction, utilizing pattern recognition features, including established techniques such as cross-correlation, Principal Components Analysis (PCA), factor rotation, Logistic Regression (LogReg), Linear Discriminant Analysis (LDA), Eigengene Linear Discriminant Analysis (ELDA), Support Vector Machines (SVM), Random Forest (RF), Random Survival Forests, Recursive Partitioning Tree (RPART), as well as other related decision tree classification techniques, Shrunken Centroids (SC), StepAIC, Kth-Nearest Neighbor, Boosting, Decision Trees, Neural Networks, Bayesian Networks, and Hidden Markov Models, among others. Other biomarker selection algorithms are, e.g., forward selection, backwards selection, stepwise selection, complete enumeration of all potential panels of a given size, genetic algorithms, voting-based algorithms, greedy algorithms, the LASSO algorithm, the AIC-Optimizing Stepwise Forward Selection Cox Regression Model, and other Cox regression algorithms, Weibull models, Kaplan-Meier models, and Greenwood models. Enumeration and ranking of all possible subsets of variables may also be considered for variable subset selection.
[0093] The above algorithms may be coupled with information criteria, such as Akaike's Information Criterion (AIC) or Bayes' Information Criterion (BIC), in order to quantify the tradeoff between additional biomarkers and model improvement, and to aid in minimizing overfit. The resulting predictive models may be validated in other studies, or cross-validated in the study they were originally trained in, using such techniques as Bootstrap, Leave-One-Out (LOO) and 10-Fold cross-validation (10-Fold CV). At various steps, false discovery rates may be estimated by value permutation according to techniques known in the art. Scores from several biomarkers can be combined via a linear or non-linear equation to yield an overall score for a patient and the latter can then be used to stratify patients into risk groups.
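The information criteria named in paragraph [0093] have standard closed forms: AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where k is the number of fitted parameters, n is the number of observations, and ln L is the maximized log-likelihood. The sketch below evaluates both for two candidate panels. The log-likelihoods, parameter counts, and cohort size are made-up numbers chosen only to show how an added biomarker is penalized.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike's Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical comparison: a 3-biomarker model versus a 4-biomarker model
# fitted to the same 100-patient training cohort.
small_model = aic(-100.0, 3)   # 206.0
large_model = aic(-99.5, 4)    # 207.0
prefer_small = small_model < large_model
```

Here the fourth biomarker improves the log-likelihood by only 0.5, which does not offset the penalty for the extra parameter, so the smaller panel would be preferred under AIC.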
[0094] The performance and, thus, usefulness, of biomarker panels may be assessed in multiple ways. For example, the sensitivity, specificity, positive predictive value (or rate), and negative predictive value (or rate) of the panel may be considered. To calculate these parameters, the following variables are used: "true positive" or TP (correctly classifying a subject as diseased in regard to a disease state of interest (e.g., cancer-attributable death at a given time point)); "true negative" or TN (correctly classifying a subject as non-diseased in regard to a disease state of interest (i.e., no cancer-attributable death at a given time point)); "false positive" or FP (i.e., incorrectly classifying a subject as diseased in regard to a disease state of interest); and "false negative" or FN (i.e., incorrectly classifying a subject as non-diseased in regard to a disease state of interest). The performance of a quantitative read-out of a biomarker combination may be based on the algorithm applied.
[0095] "Sensitivity" of a biomarker panel may be calculated by TP/(TP + FN), i.e., the true-positive fraction of diseased subjects. "Specificity" of a biomarker panel may be calculated by TN/(TN + FP), i.e., the true negative fraction of non-diseased subjects. Sensitivity of 100% and specificity of 100% are ideal, although for practical purposes, sensitivity and/or specificity of more than 70% (e.g., 75%, 80%, 85%, 90%, 95%, or more) may be acceptable. Likewise, biomarker panels may also be assessed for their "positive predictive value" or "positive predictive rate" (true positive fraction of all positive test results, i.e., TP/(TP + FP)) and "negative predictive value" or "negative predictive rate" (true negative fraction of all negative test results, i.e., TN/(TN + FN)). Positive predictive value of 100% and negative predictive value of 100% are ideal, although for practical purposes, positive and/or negative predictive values of more than 70% (e.g., 75%, 80%, 85%, 90%, 95%, or more) may be acceptable. Various statistical measures (e.g., area under the curve (AUC), goodness-of-fit, or quantitative range of a PD read-out) may be used to evaluate the performance of a biomarker panel in order to provide an acceptable level of performance. Commonly used metrics to assess the ability of the model that uses the biomarker panel to segregate between low and high risk groups are the log-rank p-value, the hazard ratio, and the concordance index (c-index). For methods of assessing biomarkers' performance, see, e.g., O'Marcaigh et al., Clin Pediatr (Phila). 32(8):485-91 (1993); Pepe et al., Am J Epidemiol. 159(9):882-90 (2004); Shultz, Fundamentals of Clinical Chemistry, Burtis and Ashwood (eds.), 4th Ed., pp. 192-199 (W.B. Saunders, 1996); Zweig et al., Clin Chem. 38(8 Pt 1): 1425-28 (1992); Cook et al., Circulation. 115:928-35 (2007); and Vasan, Circulation. 113:2335-62 (2006).
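The four ratios defined in paragraph [0095] can be computed directly from the TP, TN, FP, and FN counts. The following sketch does exactly that. The function name and the example counts are hypothetical and serve only as a worked example.

```python
def panel_performance(tp, tn, fp, fn):
    """Return sensitivity, specificity, PPV, and NPV from confusion counts."""
    return {
        "sensitivity": tp / (tp + fn),  # TP/(TP + FN)
        "specificity": tn / (tn + fp),  # TN/(TN + FP)
        "ppv": tp / (tp + fp),          # TP/(TP + FP)
        "npv": tn / (tn + fn),          # TN/(TN + FN)
    }


# Made-up cohort: 100 diseased subjects (80 detected) and
# 100 non-diseased subjects (90 correctly classified).
metrics = panel_performance(tp=80, tn=90, fp=10, fn=20)
```

In this example the sensitivity is 0.80 and the specificity is 0.90, both above the 70% level described as potentially acceptable.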
[0096] A biomarker panel of this invention may comprise two, three, four, five, six, seven, eight, nine or all ten of the biomarkers of Table 1. The precise combination and weight of the biomarkers may vary dependent on the prognostic information being sought. Examples of biomarker panels useful in identifying cancer patients with a poor prognosis (e.g., patients with a high risk for tumor metastasis) may comprise:
a) cytoplasmic (i.e., protein level in the cytoplasm) CD117, nuclear (i.e., protein level in the nucleus) CD44, tumor (i.e., whole tumor cell) KIF2C, nuclear MMP1, tumor PCNA, and cytoplasmic SPARC,
b) cytoplasmic CD44 and nuclear KIF2C,
c) cytoplasmic ANLN, nuclear CD117, and cytoplasmic CD44,
d) nuclear MMP1, tumor PCNA, and tumor SPARC,
e) cytoplasmic CD117, tumor KIF2C, nuclear MMP1, and tumor PCNA,
f) cytoplasmic CD117, cytoplasmic CD44, tumor KIF2C, nuclear MMP1, and tumor PCNA,
g) nuclear MMP1 and tumor PCNA,
h) cytoplasmic ANLN, cytoplasmic CD117, nuclear CD44, cytoplasmic DEPDC1, tumor KIF2C, nuclear MMP1, tumor PCNA, and nuclear SPARC, or
i) cytoplasmic CD117, tumor KIF2C, nuclear MMP1, and tumor PCNA.
[0097] In some embodiments, biomarker panels may be used to identify cancer patients with a favorable prognosis (e.g., patients with a low risk for tumor metastasis). Such panels may comprise:
a) nuclear CDH2, non-nuclear (i.e., protein level outside the nucleus) PCNA, nuclear KIF2C, nuclear CD117, and nuclear DEPDC1,
b) nuclear CD117, nuclear CDH2, tumor CDH2, and nuclear KIF2C,
c) nuclear SPARC, non-nuclear KIF2C, tumor MMP1, tumor ANLN, and nuclear FSCN1,
d) tumor DEPDC1, non-nuclear CD44, tumor KIF2C, non-nuclear CDH2, and nuclear SPARC,
e) non-nuclear CD117 and non-nuclear FSCN1,
f) nuclear KIF2C, tumor CD44, non-nuclear CD44, and non-nuclear ANLN,
g) nuclear CDH2, non-nuclear PCNA, nuclear KIF2C, nuclear CD117, and nuclear DEPDC1,
h) tumor CD117, tumor KIF2C, nuclear MMP1, nuclear CD117, nuclear SPARC, tumor CD44, and non-nuclear KIF2C,
i) tumor SPARC, non-nuclear SPARC, tumor CD44, non-nuclear CD44, nuclear CD117, nuclear FSCN1, non-nuclear KIF2C, and tumor PCNA,
j) nuclear MMP1, non-nuclear KIF2C, tumor DEPDC1, nuclear PCNA, tumor CD44, non-nuclear CD117, tumor PCNA, non-nuclear ANLN, non-nuclear CD44, and nuclear KIF2C,
k) cytoplasmic CD117 and cytoplasmic PCNA,
l) nuclear CD44, cytoplasmic CDH2, and tumor KIF2C,
m) tumor ANLN, nuclear CD44, and cytoplasmic CDH2,
n) nuclear CD117, nuclear CD44, cytoplasmic CDH2, and nuclear KIF2C,
o) nuclear CD44, cytoplasmic CDH2, cytoplasmic FSCN1, and tumor SPARC,
p) nuclear CD44 and cytoplasmic CDH2,
q) nuclear CD117, nuclear CD44, cytoplasmic CDH2, tumor DEPDC1, nuclear FSCN1, nuclear KIF2C, nuclear PCNA, and nuclear SPARC, or
r) nuclear CD117, nuclear CD44, cytoplasmic CDH2, tumor DEPDC1, nuclear FSCN1, and nuclear KIF2C.
[0098] In some embodiments, biomarker panels may be used to identify cancer patients with a poor prognosis (e.g., patients with a high risk for tumor metastasis) among patients that test negative in a sentinel lymph node biopsy. Such panels may comprise:
a) cytoplasmic CD44 and nuclear KIF2C,
b) nuclear MMP1, tumor PCNA, and tumor SPARC,
c) tumor CDH2, nuclear MMP1, and tumor PCNA,
d) nuclear ANLN, nuclear MMP1, tumor PCNA, and cytoplasmic SPARC, or
e) cytoplasmic CD117, tumor KIF2C, nuclear MMP1, tumor PCNA, and cytoplasmic SPARC.
[0099] Exemplary biomarker panels of this invention further include the panels illustrated in Figs. 29-31.
V. BIOMARKER SCORES
[0100] To use a biomarker panel of this invention, one may compare the biomarker score profile of the panel (e.g., a reference, baseline, or index value) with the biomarker score profile of a cancerous sample from a patient, where the comparison results provide prognostic information for the patient. A biomarker score is calculated on the basis of a measured level of the biomarker, using one or more algorithms well known in the art, e.g., the algorithms illustrated herein. In some embodiments, a "biomarker score" may be obtained by applying a numeric coefficient (e.g., by multiplication or division) to a measured level of a biomarker. The coefficients may be provided by an algorithm. Use of coefficients allows one to weigh the biomarkers in a panel differently, achieving optimal prognostic results. In some embodiments, biomarker scores can be used as cutoff points or threshold values to stratify a patient population (e.g., identifying patients with a high risk for metastatic disease
progression or a low risk for metastatic disease progression).
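The coefficient-weighted scoring and threshold-based stratification described in paragraph [0100] can be sketched as a linear combination followed by a cutoff comparison. Everything specific below is an assumption made for illustration: the coefficients, the measured levels, the two cutoffs, and the three-way grouping are hypothetical and are not values disclosed in this specification.

```python
def panel_score(levels, coefficients):
    """Combine measured biomarker levels into one score by a weighted sum."""
    return sum(coefficients[name] * levels[name] for name in coefficients)


def stratify(score, low_cutoff, high_cutoff):
    """Assign a risk group by comparing a score with two cutoff values."""
    if score < low_cutoff:
        return "low risk"
    if score >= high_cutoff:
        return "high risk"
    return "intermediate"


# Hypothetical coefficients and measured levels for a two-marker panel.
coefficients = {"nuclear MMP1": 1.0, "tumor PCNA": 0.5}
levels = {"nuclear MMP1": 1.0, "tumor PCNA": 2.0}
score = panel_score(levels, coefficients)                   # 1.0*1.0 + 0.5*2.0 = 2.0
group = stratify(score, low_cutoff=1.0, high_cutoff=1.5)    # "high risk"
```

A non-linear combination would replace the weighted sum with another function of the levels while leaving the cutoff comparison unchanged.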
[0101] Comparison of the biomarker scores of a tissue sample to a reference, index, or baseline value can be achieved with techniques such as reference limits, discrimination limits, or risk defining thresholds to define cutoff points and abnormal values. Exemplary reference samples, index values, or baseline values may be taken or derived from, for example, a control subject, or a population with known clinical outcomes. In certain embodiments, the control subject or population may be without cancer or without a clinical outcome being considered (e.g., metastasis, cancer recurrence, or cancer-attributable death by a given time point). In alternative embodiments, the control subject or population may have had cancer or the clinical outcome being considered (e.g., metastasis, cancer recurrence, or cancer-attributable death at a given time point).
[0102] In some embodiments, a reference, index or baseline value is the level of a biomarker in a noncancerous tissue. In some embodiments, the noncancerous tissue is derived from a cancer patient (e.g., the cancer patient from whom the cancerous tissue sample is derived). In alternative embodiments, the noncancerous tissue is derived from an individual or population without cancer. In some embodiments of the present invention, the value is the level of a biomarker in a control sample derived from one or more subjects who are asymptomatic and/or lack traditional risk factors for a metastatic tumor or recurrence. In further embodiments, such subjects may be monitored and/or periodically retested for a diagnostically relevant period of time ("longitudinal studies") following such test to verify continued absence of a metastatic tumor (disease or event free survival). Such period of time may be one year, two years, two to five years, five years, five to ten years, ten years, or more than ten years from the initial testing date for determination of the reference value.
Furthermore, retrospective measurement of biomarkers in properly banked historical subject samples may be used in establishing these reference values, thus shortening the study time required. In some embodiments, the value is the level of a biomarker in a control sample derived from a tumor with low metastatic potential or risk of recurrence. In some
embodiments, the value is the level of a biomarker in a control sample derived from a tumor that has not metastasized or recurred. In some embodiments, comparisons can be performed between patient and reference values measured concurrently or at temporally distinct times, e.g., between patient values and values derived from a database of compiled expression information that assembles information about expression levels of cancer-associated genes.
[0103] A reference, index, or baseline value also may be derived from one or more subjects who have been exposed to a treatment (e.g., adjuvant therapy) and have shown improvements as a result of the treatment. Comparing a cancer patient's biomarker scores/profile with such values may provide useful information in predicting responsiveness of the patient to this cancer treatment. In other embodiments, the value may be derived from a patient who has received an initial cancer treatment, and then as the patient receives additional treatments, his biomarker scores/profile will be compared to his original reference, index, or baseline biomarker scores/profile, so as to monitor the progress of the treatments. A reference, index value or baseline value also may be derived from risk prediction algorithms or computed indices from population studies. In general, reference, index or baseline values may vary based on which biomarkers are included in the value.
[0104] Reference, index or baseline values, as described above, also can be used to generate a "reference biomarker profile." The biomarkers disclosed herein can be used to generate a "subject biomarker profile" taken from subjects who, for example, are at high risk for tumor metastasis or cancer recurrence. The subject biomarker profiles can be compared to a reference biomarker profile to identify, for example, subjects at risk of developing tumor metastasis or cancer recurrence, or to monitor the progression of a cancer or the effectiveness of a cancer treatment.
[0105] The reference and subject biomarker profiles of the present invention may be contained in a machine-readable medium, such as but not limited to, analog tapes like those readable by a VCR, CD-ROM, DVD-ROM, and USB flash media, among others. Such machine-readable media can also contain additional test results, such as, without limitation, measurements of clinical parameters and traditional laboratory risk factors. Alternatively or additionally, the machine-readable media can also comprise subject information such as medical history and any relevant family history. The machine-readable media can also contain information relating to other disease-risk algorithms and computed indices such as those described herein.
VI. ADDITIONAL PROGNOSTIC FACTORS
[0106] The biomarker panels of this invention may be used in conjunction with additional biomarkers, clinical parameters, or traditional laboratory risk factors known to be present or associated with the clinical outcome of interest. One or more clinical parameters may be used in the practice of the invention as a biomarker input in a formula or as a pre-selection criterion defining a relevant population to be measured using a particular biomarker panel and formula. One or more clinical parameters may also be useful in the biomarker normalization and pre-processing, or in biomarker selection, panel construction, formula type selection and derivation, and formula result post-processing. A similar approach can be taken with the traditional laboratory risk factors. Clinical parameters or traditional laboratory risk factors are clinical features typically evaluated in the clinical laboratory and used in traditional global risk assessment algorithms. Clinical parameters or traditional laboratory risk factors for tumor metastasis may include, for example, tumor stage, tumor grade, tumor size, tumor visual characteristics, tumor location, tumor growth, lymph node status, histology, tumor thickness (Breslow score), ulceration, proliferative index, tumor-infiltrating lymphocytes, age of onset, PSA level, or Gleason score. Other traditional laboratory risk factors for tumor metastasis are known to those skilled in the art.
VII. CLINICAL UTILITY OF BIOMARKER PANELS
[0107] Biomarker panels of the invention provide useful information in prognosis of cancer patients. The term "prognosis" refers to the prediction of how a disease will progress, including, for example, likelihood or risk of death attributable to cancer within a given period of time (e.g., six months, twelve months, two years, three years, five years, eight years, ten years, fifteen years, or more), cancer recurrence or metastasis; likelihood of recovery;
efficacy of a particular treatment; and rate of tumor progression. Typically, survival beyond three years is considered "long-term" survival. As used herein, "risk" relates to the probability that an event (e.g., a metastatic event) will occur over a specific time period, and can mean a subject's "absolute" risk or "relative" risk. Absolute risk can be measured with reference to either actual observation post-measurement for the relevant time cohort, or with reference to index values developed from statistically valid historical cohorts that have been followed for the relevant time period. Relative risk refers to the ratio of absolute risks of a subject compared either to the absolute risks of low risk cohorts or an average population risk, which can vary dependent on how clinical risk factors are assessed.
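The distinction between absolute and relative risk drawn in paragraph [0107] can be shown with a short calculation. In this sketch, absolute risk is estimated as the observed event fraction in a followed cohort, and relative risk is the ratio of two absolute risks. The cohort sizes and event counts are made-up numbers.

```python
def absolute_risk(events, cohort_size):
    """Fraction of a followed cohort that experienced the event."""
    return events / cohort_size


def relative_risk(risk_subject_group, risk_reference_group):
    """Ratio of one absolute risk to a reference absolute risk."""
    return risk_subject_group / risk_reference_group


# Hypothetical cohorts followed for the same period: 30 of 100 patients in
# one group and 10 of 100 in a low-risk reference group had a metastatic event.
risk_group = absolute_risk(30, 100)
risk_reference = absolute_risk(10, 100)
rr = relative_risk(risk_group, risk_reference)   # approximately 3
```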
[0108] A "poor prognosis" may refer to a prognosis in which one or more negative characteristics of cancer is increased as compared to a reference population of cancer patients. For example, poor prognosis may refer to a decreased chance of survival, an increased risk of cancer recurrence, increased malignancy, increased metastatic potential, increased tumor size and growth rate, increased progression of symptoms, or decreased response to treatment. A patient having a poor prognosis is considered "high risk." High risk patients may, for example, have an increased risk of tumor metastasis or recurrence.
[0109] By contrast, a favorable prognosis may refer to an increased chance of survival, a decreased risk of recurrence of disease, decreased malignancy, decreased metastasis, decreased metastatic potential, decreased or static tumor size, decreased tumor growth rate, decreased progression of symptoms, or increased response to treatment as compared to the average cancer characteristic in a population of cancer patients. A patient having a favorable prognosis is considered "low risk." Low risk patients may, for example, have a decreased risk of tumor metastasis or recurrence.
[0110] Risk calculation is statistical. "Statistically significant" refers to an alteration that is greater than what may be expected to happen by chance alone (which would be a "false positive"). Statistical significance can be determined by methods well known in the art. Commonly used measures of significance include the p-value, which represents the probability of obtaining a result at least as extreme as a given data point, assuming the data point was the result of chance alone. A result is often considered highly significant at a p-value of 0.05 or less (e.g., 0.005 or 0.0005 or less). In some embodiments, the probability of occurrence of an undesired clinical event in a patient classified as "high risk" is at least 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99%. In some embodiments, the probability of occurrence of an undesired clinical event in a patient classified as "low risk" is no more than 30%, 25%, 20%, 15%, 10%, 5%, 4%, 3%, 2%, or 1%.
[0111] A biomarker panel also may aid the diagnosis and staging (e.g., stage I, II, III, or IV) of cancer. For example, the biomarker score of a patient can be compared to a reference, index or baseline biomarker value that is obtained from subjects known to have cancer or a particular stage of cancer, or from subjects that are cancer-free, or from subjects with low-risk cancer. The comparison is analyzed to conclude the diagnosis or staging. In certain embodiments, the comparison may be used to identify a cancer patient in need of a further diagnostic procedure (e.g., a sentinel lymph node biopsy). In alternative embodiments, the comparison may be used to identify a cancer patient that is not in need of a further diagnostic procedure (e.g., a sentinel lymph node biopsy).
[0112] Identifying a subject with a favorable prognosis empowers a decision to avoid or delay various therapeutic interventions or treatment regimens. Such subjects may not require treatments if their tumors are molecularly hard-wired to remain indolent, cured by surgical excision alone, and/or unlikely to metastasize. Likewise, identifying a subject with a poor prognosis empowers a decision to use more aggressive treatment strategies. Thus, the prognostic and diagnostic information obtained with the present biomarker panels enables the selection and initiation of suitable treatments or therapeutic regimens to delay, reduce, or prevent progression of the cancer while avoiding unnecessary morbidity associated with cancer treatments.
[0113] Accordingly, the methods and panels of this invention can be used to identify patients in need of "adjuvant therapy," that is, therapy given in conjunction with surgery, after which little or no evidence of residual disease can be detected. Adjuvant therapy is given to reduce the risk of disease recurrence, either local or metastatic. Adjuvant therapy can include, for example, radiation therapy, chemotherapy, hormone therapy, experimental therapy (e.g., as part of a clinical trial), neo-adjuvant therapy (therapy administered prior to the primary therapy), and targeted therapy. Targeted therapy entails the use of biologics that inhibit or enhance the function of a molecular target, or a signaling pathway associated therewith, in cancer cells. Targeted therapy associated with methods of this invention may include therapy that targets one or more PDs described herein and their associated signaling pathways, including molecular inhibitors of key intracellular drug targets operating in cancer survival pathways, such as inhibitors of B-Raf (e.g., PLX-4032), the serine/threonine kinase Akt, the serine/threonine kinase MEK, the serine/threonine kinase p90RSK, the lipid kinase PI3K, and the serine/threonine kinase IKK2. Moreover, the test agents can be small molecule inhibitors or therapeutic antibodies directed against cell surface proteins, such as receptor tyrosine kinases, including EGFR, HER2/NEU, HER3, IGF-1R, c-Met, VEGFR, Axl, Eph receptors, etc. In treatment of malignant melanoma, B-Raf inhibitors, MEK inhibitors, Akt inhibitors, PI3K inhibitors, CTLA4-directed therapies, and interferon are particularly relevant. Stratifying patients into different risk categories allows an informed decision on whether adjuvant therapy is recommended.
[0114] The biomarker panels of this invention can also be used to aid the selection of an appropriate adjuvant therapy. For example, one can obtain a biomarker profile from a patient before a proposed treatment, or from a reference subject (e.g., an individual or population having no cancer, or having non-metastatic cancer, or having improvements in risk factors (e.g., clinical parameters or traditional laboratory risk factors)). A difference in the biomarker scores between the test sample and the reference sample may indicate that a treatment is suitable for administration, whereas a similarity between the two samples may indicate that the treatment is not suitable.
[0115] In some embodiments, the methods of treating a cancer patient may include the selection of suitable treatment for a particular patient that is directed to their unique physiology. Differences in the genetic makeup of cancer patients can result in different abilities to metabolize various drugs, which may modulate the symptoms or risk factors of cancer or metastatic events. Accordingly, a biomarker panel, alone or in combination with known genetic factors for drug metabolism, may be used to predict whether a candidate cancer therapeutic will be suitable for treating a particular cancer patient. These
embodiments may further comprise predicting or diagnosing adverse side effects associated with administration of the treatment.
[0116] The biomarker panels of the invention further provide a means for monitoring the progression of cancer in a subject, for example, by screening for changes in marker expression associated with a cancer. In certain embodiments, these methods comprise determining biomarker levels or scores in a subject-derived sample (e.g., a cancerous tissue sample), comparing these to the biomarker levels or scores in a reference sample, and identifying alterations in amounts of the levels or scores in the subject sample compared to the reference sample. These measurements may be repeated over a clinically relevant period of time, for example, six months, twelve months, two years, three years, five years, eight years, ten years, fifteen years, or more. If the reference sample is from a non-cancerous tissue, or from a subject with non-metastatic cancer, increasing similarities between the biomarker levels indicate that the cancer is not progressing, or is regressing. Conversely, increasing differences between the biomarker levels may indicate that the cancer is progressing.
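The comparison of serial biomarker profiles against a reference can be pictured with a minimal sketch. It assumes, purely for illustration, that each profile is a set of numeric scores keyed by marker, that "similarity" is measured by Euclidean distance, and that the trend is read from the first and last time points; the specification prescribes none of these choices, and the marker names and values are hypothetical.

```python
import math


def profile_distance(subject, reference):
    """Euclidean distance between two score profiles (dicts keyed by marker).

    Assumption: Euclidean distance stands in for "similarity"; the text
    does not name a specific measure.
    """
    return math.sqrt(sum((subject[m] - reference[m]) ** 2 for m in reference))


def similarity_trend(serial_profiles, reference):
    """Return per-timepoint distances and a coarse first-versus-last trend."""
    distances = [profile_distance(p, reference) for p in serial_profiles]
    if distances[-1] < distances[0]:
        trend = "converging"   # increasing similarity to the reference
    elif distances[-1] > distances[0]:
        trend = "diverging"    # increasing difference from the reference
    else:
        trend = "unchanged"
    return distances, trend


# Hypothetical scores: a non-cancerous reference and three serial samples.
reference = {"marker_1": 1.0, "marker_2": 2.0}
serial_profiles = [
    {"marker_1": 4.0, "marker_2": 6.0},
    {"marker_1": 2.5, "marker_2": 4.0},
    {"marker_1": 1.0, "marker_2": 2.0},
]
distances, trend = similarity_trend(serial_profiles, reference)
print(distances, trend)
```

With a reference drawn from non-cancerous tissue, a "converging" trend would correspond to the case described above in which the cancer is not progressing or is regressing; a real analysis would need a statistically justified measure and threshold.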
[0117] In further embodiments, a biomarker panel may be used to monitor the course of treatment in a subject. To monitor treatment, a biological sample can be provided from a subject undergoing treatment regimens, e.g., drug treatments, for cancer. The biological samples may be obtained from the subject at various time points before, during, or after treatment. Comparison of the levels of the biomarkers at various time points will indicate the effectiveness of the course of treatment. For example, if the biomarker scores or profile of the cancerous tissue from the patient return to a baseline value measured in one or more subjects without a metastatic or recurrent tumor, or in subjects who do not exhibit traditional risk factors for metastatic disease, the treatment may be considered successful.
[0118] A biomarker panel also may be used in stratifying patients for inclusion in a clinical trial. In these embodiments, biomarker levels or scores may be obtained from a cancerous tissue sample and compared to a reference or index value. By using the methods provided herein, a more appropriate cohort may be obtained as compared to traditional methods of selecting patients, increasing the likelihood of the trial's success and decreasing its duration and cost.
[0119] The biomarker panels of this invention also may be used to provide a report that is useful in a clinical setting. In some embodiments, the report may include biomarker scores and associated prognostic information, including likelihood of long-term survival, cancer recurrence, and cancer metastasis, as well as treatment recommendations based on the prognostic information. Similarly, the biomarker panels can be used to identify useful cancer treatments. By following changes in the biomarker profile of patients over the course of an experimental therapy, one can determine whether the therapy is efficacious in treating these patients.
VIII. THE ROLE OF PATHWAY CONTEXT GENES
[0120] With the advances in whole genome and 'deep sequencing' efforts, the underlying genetic alterations driving aggressive cancer progression are being unraveled at an unprecedented rate. These molecular, genetic alterations result in signal transduction pathway alterations which ultimately determine the phenotype of specific cancers. This insight constitutes the conceptual foundation for 'pathway context-based drug discovery' in which molecular-targeted therapeutics are developed against both mutant oncoproteins (for example, BCR-Abl) and wild-type effector molecules (for example, mammalian target of rapamycin (mTOR)) that reside upstream of or within deregulated oncogenic pathways (see, e.g., Blume-Jensen, P & Hunter, T, 2001, Oncogenic Kinase signaling, Nature. 2001 May 17;411(6835):355-65; Vogelstein B, Kinzler KW. 2004: Cancer Genes and the Pathways they Control, Nat Med. (8):789-99; Engelman JA, Luo J, Cantley LC. 2006. The evolution of phosphatidylinositol 3-kinases as regulators of growth and metabolism. Nat Rev Genet. (8):606-19; and Andersen, J. N. et al. 2010. Pathway-based identification of biomarkers for targeted therapeutics: personalized oncology with PI3K pathway inhibitors. Sci Transl Med. 2(43):43ra55).
[0121] Numerous targeted agents, either in clinical trials with promising preliminary results or already on the market, are based on the ability to efficiently counter the pathway deregulation at critical molecular junctures in a patient sub-population defined by the molecular pathway context resulting from the mutant oncoproteins. Vivid examples of this include the marketed Imatinib (Gleevec®), which represents a change in our ability to treat chronic myelogenous leukemia and c-Kit-driven gastrointestinal tumors (Druker BJ. 2002. Perspectives on the development of a molecularly targeted agent. Cancer Cell. (1):31-6; and Judson I, Demetri G. 2007. Advances in the treatment of gastrointestinal stromal tumours. Ann Oncol. Suppl 10:x20-4), and the small molecule inhibitor of B-Raf, PLX-4032, currently in advanced clinical trials for B-Raf-mutant malignant melanoma with reported overall response rates of 81% (Poulikakos PI et al. 2010. RAF inhibitors transactivate RAF dimers and ERK signalling in cells with wild-type BRAF. Nature. 464(7287):427-30; and Bollag G, et al. 2010. Clinical efficacy of a RAF inhibitor needs broad target blockade in BRAF-mutant melanoma. Nature. 467(7315):596-9). This clinical proof-of-concept for pathway context as an important predictor of responsiveness to targeted agents would imply that identical proteins in the same type of human cancer, e.g., breast, colorectal, lung, prostate, etc., have different functional roles depending on the molecular pathway context within said tumor. Accordingly, the specific proteins that are involved in an aggressive phenotype in a particular cancer depend on the molecular pathway context, as does the therapeutic response to targeted agents. This has important implications for linking newer types of targeted therapies with high risk cancers as identified by our PDs as they operate in molecularly defined pathway context (McDermott U et al; 2011, N Engl J Med; 364(4):340-50).
[0122] The strong correlation between the present PDs and oncologic clinical outcome indicates that the PDs are involved in tumor progression (e.g., as an oncogenic facilitator or a tumor suppressor) through cellular pathways that lead to or block tumor advancement. Accordingly, the present invention provides methods and compositions that use other molecular entities in these cellular pathways as surrogate biomarkers or therapeutic targets. A particular PD may not be the ultimate driver of tumor progression in a pathway, but it may be downstream or upstream of the true driver in the pathway. Thus, certain PDs may be replaced with other molecular biomarkers that can serve as functional readouts of the perturbed pathways causing early stage cancers to progress. In this sense, the PDs can be said to function in a pathway context. "Pathway context" is a term used to describe clinically relevant molecular alterations that result in perturbed or deregulated signal transduction pathways in diseased cells. Using measurement methods described herein (e.g., the AQUA® technology) in conjunction with pathway mapping, one may identify the true drivers of tumor progression. For example, one can analyze perturbed or deregulated pathway activity using phosphoproteomics or phosphoantibodies directed against key proteins inside the cells that are regulated by the mutated or otherwise molecularly altered pathway context gene products. Using quantitative technology, for example AQUA, in conjunction with analysis of the signal transduction pathway activity inside aggressive cancer cells (so-called 'pathway analysis'), the true drivers of tumor progression may be identified. These drivers will have an important effect on prognosis and can also be used as therapeutic targets.
[0123] Regardless of whether a PD is a true driver of tumor progression, if the PD is correlated with prognosis it must be involved in a pathway leading to or blocking tumor advancement. Through careful experimentation the pathway can be mapped and the PD's role in the pathway can be elucidated. This information may be used as a molecular stratifier of the patient population with a certain tumor type. In addition, drugs can be designed to target critical upstream components of the pathway or the PD itself. The efficacy of the drug in inhibiting or activating the pathway can immediately be tested, for example, by using AQUA to quantitate expression of the PD. In this manner, AQUA can be utilized for testing the effectiveness of targeted therapeutics.
[0124] In many cases, pathway context is thought to be causally involved in the phenotype of the disease. For instance, in cancer, loss-of-function (LOF) of a tumor suppressor or gain-of-function (GOF) of an oncoprotein are examples of pathway context molecular alterations that are clinically relevant and cause perturbed signal transduction pathway activity that often is involved in the cancer aggressiveness. "Pathway context genes" denote the genes that can undergo molecular alterations involved in pathway context. Examples of pathway context genes that are particularly well studied in human cancer include the oncoproteins Ras (for a detailed review see Vigil D, Cherfils J, Rossman KL, Der CJ.; Nat Rev Cancer. 2010; 10(12):842-57) and B-Raf (Davies, H., et al: Nature. 2002 27; 417(6892):949-54; Vakiani E, Solit DB.; J Pathol. 2011; 223(2):219-29), and the tumor suppressor proteins PTEN (Wong K, Engelman JA, Cantley LC; Curr Opin Genet Dev. 2010; 20(1):87-90) and p53 (Vousden KH, Prives C; Cell. 2009; 137(3):413-31). Pathway context genes have often been discovered through the fact that they are very frequently mutated in human cancer, resulting in either gain-of-function (GOF) for oncoproteins or loss-of-function (LOF) for tumor suppressor proteins. For instance, either PTEN or p53 are thought to be functionally inactivated through LOF mutations in up to 95% of all human cancer (Soussi T, Wiman KG., Cancer Cell. 2007; 12(4):303-12).
[0125] Similarly, the three isoforms of Ras, namely N-Ras, K-Ras, and H-Ras together are thought to be mutated in more than 70-75% of human cancers (Vigil D, Cherfils J, Rossman KL, Der CJ.; Nat Rev Cancer. 2010; 10(12):842-57), while 25-30% of papillary thyroid cancer and 5-10% of colorectal cancer patients have a GOF mutation in B-Raf, B-Raf-V600E (Schubbert S, Shannon K, Bollag G., Nat Rev Cancer. 2007; 7(4):295-308; Martin MJ, Carling D, Marais R., Cancer Cell. 2009 3;15(3):163-4). In this context, B-Raf can serve as a pathway context gene product, but it is also itself regulated by other pathway context gene products, namely N-Ras and K-Ras. In malignant melanoma, B-Raf, N-Ras, and K-Ras are mutated in ~50%, 30% and 10% of human melanomas, respectively (Martin MJ, Carling D, Marais R., Cancer Cell. 2009 Mar 3;15(3):163-4). These mutations result in hyperactivation of the Ras-Raf-MEK-ERK signaling cascade. B-Raf itself is a popular drug target for small molecule inhibitors, and such inhibitors efficiently shut down signal transduction in B-Raf-V600E-mutant cancers that are Ras wild type, whilst the effect of inhibiting B-Raf in Ras-GOF-mutant cancers is an activation of its downstream target ERK. Hence, the Raf inhibitor PLX-4032 has been shown to have a remarkable overall response rate of 81% and progression-free survival of over 6-7 months in B-Raf-V600E-mutant patients that are wild type for Ras (Poulikakos PI, Rosen N.; Cancer Cell. 2011 Jan 18; 19(1):11-5 and references therein). This is an example of N-Ras pathway context function of B-Raf. Most patients eventually develop resistance to the B-Raf inhibitor, even those who initially respond well to the drug. Some of the prevailing resistance mechanisms are well characterized and involve acquired mutations in N-Ras. Mutant N-Ras signals through both the MEK-ERK and PI3K signaling pathways, rendering the cells relatively insensitive to B-Raf or MEK inhibitors.
This means that in one context, namely Ras wild type, B-Raf-V600E is a driver of the melanoma phenotype, while in N-Ras-GOF-mutant cells, Ras is more important in driving the phenotype. This exemplifies the concept that proteins may have different functions depending on the pathway context within which they operate. This example also has made it clear that in order to efficiently treat Ras mutant cells, one needs to inhibit both the MEK-ERK and the PI3K pathway simultaneously (Engelman et al, 2008). Other resistance mechanisms to B-Raf inhibitors include upregulation of receptor protein tyrosine kinases, like IGF-1R and PDGFR beta, as well as of the serine/threonine protein kinase Tpl2/COT (Villanueva et al, 2010; Johannessen et al, 2010). In both examples the upregulated proteins confer new pathway context to B-Raf, so it is no longer a driver of the malignant phenotype. In the case of Tpl2/COT upregulation, MEK now becomes activated independently of Raf, and hence MEK becomes a key driver independent of Raf, rendering cells sensitive to MEK inhibition, but not to B-Raf inhibition. These data exemplify that identification of the pathway context of a PD will, therefore, be very useful in providing clinically relevant information, such as a prognosis or an effective therapeutic target.
[0126] In some embodiments, the present invention provides biomarker panels that are indicative of a general physiological pathway associated with a cancer. In some embodiments, the present invention provides biomarker panels that are specifically indicative of a particular pathway context, but not other pathway contexts. In some embodiments, an additional pathway component also may be identified with methods known to those skilled in the art, including mass spectroscopy, phosphoproteomics or phosphoantibodies directed against key proteins inside the cells. These other pathway components may be used in a biomarker panel in addition to or as a substitute for one or more of the biomarkers in the panel, provided they share certain defined characteristics of a good biomarker. These characteristics may include analytically important characteristics such as levels of the biomarker that may be measured at a useful signal to noise ratio. In further embodiments, the additional pathway component may be used as a target for therapy (e.g., adjuvant therapy).
KITS
[0127] The levels of biomarkers in a panel may be measured using a kit with detection reagents that specifically detect and quantify the biomarker analytes. The detection reagents may have been detectably labeled, or the kit may provide labeling reagents for conjugation to the detection reagents. The kit may comprise an array of detection reagents, e.g., antibodies and/or oligonucleotides that can bind to biomarker proteins (or fragments thereof) or nucleic acids, respectively. In some embodiments, the biomarkers are proteins and the kit contains antibodies that bind to the biomarkers. In other embodiments, the biomarkers are nucleic acids and the kit contains oligonucleotides or aptamers that bind to the biomarkers. In some embodiments, the oligonucleotides may be fragments of the biomarker genes. For example, the oligonucleotides can be 200, 150, 100, 50, 25, or fewer nucleotides in length.
[0128] A kit also may contain in separate containers a nucleic acid or antibody (alone, or already bound to a solid matrix or packaged separately with reagents for binding them to the matrix), control formulations (positive and/or negative), and/or a detectable label such as fluorescein, green fluorescent protein, rhodamine, cyanine dyes, Alexa dyes, quantum dots, luciferase, and radiolabels, among others. Instructions (e.g., written, tape, VCR, CD-ROM, and/or DVD) for carrying out the assay may be included in the kit.
[0129] The biomarker detection reagents provided in a kit can be immobilized on a solid matrix such as a porous strip to form at least one biomarker detection site. The measurement or detection region of the porous strip may include a plurality of sites containing, for example, a nucleic acid or antibody, and may optionally contain sites for negative and/or positive controls. Alternatively, control sites can be located on a separate strip from the test strip. Optionally, the different detection sites may contain different amounts of biomarker detection reagents, e.g., a higher amount in the first detection site and lesser amounts in subsequent sites. Upon the addition of test sample, the number of sites displaying a detectable signal may provide a quantitative indication of the amount or level of biomarkers present in the sample. The detection sites may be configured in any suitably detectable shape and can be in the shape of a bar or dot spanning the width of a test strip.
[0130] In some embodiments, a kit comprises a nucleic acid substrate array comprising one or more nucleic acid sequences that specifically identify one or more biomarker nucleic acid sequences. In certain embodiments, the substrate array can be on a solid substrate (for example, a "chip" such as a microarray chip (see, e.g., U.S. Patent 5,744,305)). Alternatively, the substrate array can be a solution array, e.g., xMAP (Luminex, Austin, TX), Cyvera (Illumina, San Diego, CA), CellCard (Vitra Bioscience, Mountain View, CA) and Quantum Dots' Mosaic (Invitrogen, Carlsbad, CA). In alternative embodiments, a kit comprises an antibody substrate array comprising one or more antibodies that specifically identify one or more biomarker proteins (e.g., an array for performing an immunoassay such as an ELISA assay or AQUA®).
EXAMPLES
[0131] Further details of the invention will be described in the following non-limiting Examples. It should be understood that these examples, while indicating preferred embodiments of the invention, are given by way of illustration only, and should not be construed as limiting the appended embodiments. From the present disclosure and these examples, one skilled in the art can ascertain certain characteristics of this invention, and without departing from the spirit and scope thereof, can make various changes and modifications of the invention to adapt it to various usages and conditions.
Example 1: Quantification of PD nucleic acid levels in human cancer
[0132] Cell lines (WM-115, WM-266.4, SK-MEL-2, SK-MEL-5, SK-MEL-24, SK-MEL-28, SK-MEL-31, RPMI-7951, A375 and NHEM neo) were obtained from ATCC and grown using conditions described by the vendor. They were harvested when in log growth phase and frozen at -80°C. Total RNA was isolated from frozen cell pellets containing ~5×10^6 cells using a Qiagen RNeasy Plus Mini kit (#74134), followed by quantification using the NanoDrop 2000.
[0133] Before conversion to cDNA, the total RNA samples were analyzed for potential contamination with genomic DNA using an Applied Biosystems (ABI) Taqman assay (PTEN, Hs02621230_s1). Upon verification that the samples analyzed contained no interfering genomic DNA, 2 μg of total RNA was converted to cDNA using the ABI High Capacity cDNA Reverse Transcription Kit (#4368813), according to the manufacturer's instructions.
[0134] Applied Biosystems (ABI) Assays-On-Demand (AOD) gene specific primer-probe pairs are pre-validated, QC tested and optimized for use on ABI sequence detection instruments. Gene specific primer-probe pairs for 10 PDs (Table 1) were used to determine relative levels of gene expression in each melanoma cell line.
[0135] Standard Taqman reagents were used in a total reaction volume of 25 μl containing 20 ng of cDNA per well. Duplicate wells were assayed on an ABI StepOne Plus instrument using universal thermal cycling conditions of 50°C for 2 minutes, 95°C for 10 minutes, followed by 40 cycles of 95°C for 15 seconds and 60°C for 1 minute.
[0136] To identify a suitable endogenous control gene, Taqman analysis of 5 candidate endogenous control genes (ZNF592, PPIA, TBP, TFRC and RPLP0) was performed on the melanoma cell lines as well as the comparator cell line NHEM neo. Based on this analysis, ZNF592 was determined to have the lowest variation in expression across all cell lines, and hence was chosen as the endogenous control and used to normalize the other expression levels.
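The choice of the least variable control gene can be sketched in a few lines. The sketch assumes that "variation" is summarized as the sample standard deviation of Ct values across cell lines, which the text does not specify; the candidate names and Ct values below are placeholders, not measured data.

```python
from statistics import stdev


def pick_most_stable_control(ct_by_gene):
    """Return the candidate with the smallest Ct spread across cell lines.

    Assumption: spread is the sample standard deviation of Ct values;
    the source only says "lowest variation in expression".
    """
    spread = {gene: stdev(cts) for gene, cts in ct_by_gene.items()}
    best = min(spread, key=spread.get)
    return best, spread


# Placeholder Ct values, one per cell line, for three hypothetical candidates.
candidate_cts = {
    "candidate_A": [26.1, 26.0, 26.2, 26.1],
    "candidate_B": [18.0, 19.5, 17.2, 18.9],
    "candidate_C": [27.0, 28.1, 26.5, 27.7],
}
best, spread = pick_most_stable_control(candidate_cts)
print(best)
```

Other stability measures (for example, a coefficient of variation on linearized expression) would be equally defensible; the standard deviation is used here only because it is the simplest.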
[0137] Relative quantification of gene expression was performed using the formula 2^(-ΔΔCt), with fold change values calculated by dividing melanoma cell line results by NHEM neo results (Schmittgen, T.D.; Livak, K.J., Nature Protocols vol 3, 1101-1108, 2008). Ct values > 35 were considered below the limits of detection. Results are shown in Fig. 1. In comparison with benign NHEM neo cells, all malignant melanoma cell lines demonstrated general overexpression of PD mRNAs, with the notable exception of c-KIT (CD117), which was downregulated across all cell lines. This is consistent with reports that downregulation of c-Kit expression is a common marker of metastatic melanoma and that expression of c-Kit is associated with a more differentiated phenotype. Only in acral, uveal, and mucosal melanomas have overexpression and mutations of c-Kit been associated with a more aggressive phenotype (Prignano F, Gerlini G, Salvatori B, Orlando C, Mazzoli S, Pimpinelli N, Moretti S; Clin Exp Metastasis. 2006;23(3-4):177-86. Epub 2006 Sep 22; Smalley KS, Sondak VK, Weber JS; Histol Histopathol. 2009 May;24(5):643-50). The mRNA expression level for another PD, N-cadherin, is almost universally increased across all cell lines. This is consistent with the phenomenon of 'cadherin switching', which is defined by decreased E-cadherin and increased N-cadherin expression and is observed in various tumors acquiring a more invasive and metastatic phenotype (Hazan RB, Qiao R, Keren R, Badano I, Suyama K; Ann N Y Acad Sci. 2004;1014:155-63). Given that malignant melanoma cell lines in general have been generated from advanced and highly invasive tumors, the observed expression pattern of PDs would be expected to correlate with higher risk of melanoma progression. Indeed, we observed that the general pattern of PD protein expression in higher risk groups was mostly consistent with PD mRNA expression in malignant melanoma cell lines (compare Figs. 1 and 2).
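The relative quantification in Example 1 follows the comparative Ct method of the cited Schmittgen and Livak protocol. A minimal sketch is given below; it assumes that duplicate wells are averaged before the calculation and that a target Ct above 35 is reported as undetected (returned as None). Both choices, along with the function name and the example Ct values, are illustrative rather than taken from the specification.

```python
from statistics import mean

CT_DETECTION_LIMIT = 35.0  # Ct values above this are treated as not detected


def fold_change(target_sample, control_sample, target_ref, control_ref):
    """Comparative Ct fold change of a sample relative to a reference line.

    Each argument is a list of replicate Ct values (e.g., duplicate wells).
    Returns None when either target Ct exceeds the detection limit
    (an assumed way of handling undetected targets).
    """
    ct_target_sample = mean(target_sample)
    ct_target_ref = mean(target_ref)
    if ct_target_sample > CT_DETECTION_LIMIT or ct_target_ref > CT_DETECTION_LIMIT:
        return None
    # Normalize each target Ct to the endogenous control in the same cell line.
    delta_ct_sample = ct_target_sample - mean(control_sample)
    delta_ct_ref = ct_target_ref - mean(control_ref)
    # Compare the melanoma line with the reference (comparator) line.
    delta_delta_ct = delta_ct_sample - delta_ct_ref
    return 2.0 ** (-delta_delta_ct)


# Hypothetical duplicate-well Ct values.
result = fold_change(
    target_sample=[24.0, 24.0], control_sample=[22.0, 22.0],
    target_ref=[27.0, 27.0], control_ref=[22.0, 22.0],
)
print(result)
```

In this made-up case the target crosses threshold three cycles earlier in the sample than in the reference after normalization, which corresponds to an eight-fold increase.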
Example 2: Quantification of PD protein levels in human cancer
Western blotting
[0138] Protein lysates were prepared using RIPA buffer (Thermo Scientific, Rockland, IL) from the nine melanoma cell lines (A375, RPMI7951, SK-MEL2, SK-MEL5, SK-MEL24, SK-MEL28, SK-MEL31, WM115) and the normal melanocyte NHEM cell line. The protein lysates were quantified using Pierce BCA Protein Assay kit (Thermo Scientific). For each sample, 8 μg of protein was loaded onto 4-20% Precast Mini-Protean TGX gels (Bio-Rad, Hercules, CA) for electrophoresis and transferred to PVDF membrane (Bio-Rad). The membrane was blocked in 5% milk TBS-T (0.1% Tween 20, 25 mM Tris, 0.15 M NaCl, pH 7.2) at room temperature for 1 hr. The membrane was then incubated with the primary antibody in 0.3% BSA TBS-T at room temperature for 1 hr or at 4°C overnight. The detailed information of the antibodies is shown in Table 2. Briefly, for the primary antibodies used, a dilution in the range of 1:100 to 1:10,000 was chosen based on the manufacturer's recommended protocol. The membrane was washed in TBS-T three times, for 5 min each. The membrane was subsequently incubated with HRP-conjugated secondary antibody (Invitrogen Molecular Probes, Eugene, OR) in 3% milk TBS-T at room temperature for 1 hr. The membrane was washed in TBS-T three times, for 5 min each, before chemiluminescent detection using the Amersham ECL Plus Western Blotting Detection system (GE Healthcare, Buckinghamshire, UK). The images were captured by FluorChem Q (Cell Biosciences, Santa Clara, CA). As a loading control, β actin antibody was used to blot the same membranes. As shown in Fig. 2, the protein levels detected by Western blotting generally tracked mRNA levels detected by Taqman qRT-PCR, but exceptions were noted. These results are consistent with the notion that certain PDs are regulated at transcriptional, post-transcriptional, and posttranslational levels during tumor progression. Furthermore, this finding suggests different utility of protein-based and mRNA-based markers in different biological contexts. One such example is illustrated by ANLN. While ANLN mRNA levels were up-regulated in the majority of melanoma cell lines compared to normal melanocyte control, ANLN protein showed decreased expression in most melanoma cell lines relative to normal melanocytes. Furthermore, the difference in ANLN band patterns between melanoma and normal melanocytes on the Western blots suggests one or more tumor-specific posttranslational modifications of ANLN protein in melanoma cells.
Table 2. Summary of antibodies
Figure imgf000066_0001
Immunohistochemistry
[0139] Formalin-fixed, paraffin embedded human melanoma specimens were acquired from Origene. All blocks shipped included the following clinical annotations: patient age, gender, TNM data and minimum stage grouping as well as an abstracted pathology report. Five (5) μm sections were cut using a Shandon Finesse 325 tissue microtome, transferred to glass slides, and stored at room temperature (RT). Slides were deparaffinized using two xylene exchanges followed by rehydration through an ethanol gradient. Antigen retrieval was performed by incubating the slides at 95°C in a sealed Thermo Scientific PT Module apparatus containing Tris-EDTA, pH 9.0, for 40 min. Automated IHC staining was done using Labvision Autostainer 360. Endogenous peroxidase activity was blocked with 0.3% hydrogen peroxide for 10 min at room temperature. Immunohistochemical staining was done using the UltraVision LP Detection System kit from Labvision according to the manufacturer's recommendations. Slides were incubated in an Ultra V Block for 5 min, followed by application of a mouse primary antibody (Table 2) directed at one of the target proteins for 60 min. Primary Antibody Enhancer was applied for 10 min, followed by HRP Polymer incubation for 15 min and then standard detection with 3,3'-diaminobenzidine (DAB). Negative controls were obtained by omitting the target protein primary antibody. All procedure steps were completed at room temperature. The following dilutions of the indicated primary antibody were used: KIF2C (1:300); CDH2 (1:300); DEPDC1 (1:1K); CD44 (1:300); CD117 (1:100); SPARC (1:3K); FSCN1 (1:1K); PCNA (1:300); MMP1 (1:1K) and ANLN (1:3K). These results demonstrate that PD expression levels vary not only in melanoma derived cell lines, as observed by Western blotting and Taqman qRT-PCR technologies, but importantly, also at the protein level across different human melanoma tumors per se (Fig. 3 and data not shown). The ability to determine PD expression levels is of great importance for prognosis. For example, if the PD is a driver of tumor progression or a tumor suppressor, its high expression levels in an early stage tumor will correlate with poor outcome in patients or favorable prognosis, respectively. Conversely, low expression levels in early stage tumors will correlate with favorable prognosis or poor prognosis, respectively. Thus, our observations further emphasize the critical relevance of developing an unbiased method to quantitate and determine tumor specific protein expression levels for accurate prognostic determination.
PD Analysis using Immunofluorescence (IF) microscopy
[0140] Slides were deparaffinized using two xylene exchanges followed by rehydration through an ethanol gradient. Antigen retrieval was done by incubating the slides at 102°C in a sealed Thermo Scientific PT Module apparatus containing Tris-EDTA, pH 9.0, for 25 min. Automated immunofluorescence staining was done using Labvision Autostainer 360. Endogenous peroxidase activity was blocked with 0.3% hydrogen peroxide for 5 min followed by 10 min incubation with a Background Sniper block. A mouse antibody directed at one of the target proteins, multiplexed with rabbit polyclonal anti-S100B (DAKO), was applied to each slide for 60 min, the latter to distinguish the regions corresponding to melanoma from surrounding tissue in the absence of counterstain. Target mouse monoclonal antibodies were as follows: KIF2C (1:300); CDH2 (1:300); DEPDC1 (1:10K); CD44 (1:30K); CD117 (1:100); SPARC (1:30K); FSCN1 (1:1K); PCNA (1:30K); MMP1 (1:1K); and ANLN (1:3K). The secondary antibodies, Alexa 546-conjugated goat anti-rabbit (1:100; Molecular Probes) diluted into Envision anti-mouse (neat; DAKO), were applied for 30 min, followed by a 10 min Cy5-tyramide signal amplification (Perkin-Elmer Life Sciences) incubation to amplify the target signal. Finally, nuclei were visualized by counterstaining with 4',6-diamidino-2-phenylindole (DAPI) ProLong Gold. Negative controls were obtained by omitting the target protein primary antibody. All procedure steps were completed at room temperature.
[0141] For each sample, sets of 20x magnification monochromatic, high-resolution images were captured in each of the DAPI, Alexa 546 and Cy5 fluorescent channels using the AperioFL ScanScope hardware (Aperio Technologies, Vista, CA). Digital images were recorded within the Aperio Spectrum Database and were processed with the ImageScope software.
[0142] Consistent with the qualitative analysis using conventional DAB immunohistochemistry staining, differential expression of each PD was observed via immunofluorescence. A representative image for each PD is shown in Fig. 4.
PD Quantification in Tumor Compartment using Definiens TissueStudio Software
[0143] Perturbations in cellular signaling pathways drive tumor progression. Thus, it is important to selectively quantitate PD expression in the tumor cells while excluding surrounding cells such as stromal cells and infiltrating lymphocytes. In addition, the subcellular localization of certain PDs is often very informative. For instance, if a given PD is a transcription factor that has a tumor suppressor role in tumor development, its nuclear localization is likely indicative of its activation status and hence is likely correlated with a favorable prognosis, while its cytoplasmic localization will be correlated with a poor prognosis. Another example is provided by the matrix metalloproteinase MMP1. This protein is a PD involved in the breakdown of extracellular matrix during metastasis. It is secreted as an inactive precursor protein which becomes activated upon cleavage by extracellular proteinases. Accordingly, one possible interpretation of high intracellular levels of MMP1 in a tumor area would be a predisposition for metastasis, while high extracellular levels of expression might be correlated with a poor prognosis. The Definiens TissueStudio product is based on cognition network technology, which uses semantic networks of objects and their mutual relationships (so-called 'object recognition'). Here, tissue segmentation, tumor regions of interest, and either nuclear or non-nuclear attributes of the cells are identified automatically using Definiens Composer segmentation and classification tools in TissueStudio software (Definiens AG, München). An example of Definiens TissueStudio applied for PD quantitation is shown in Fig. 5, wherein TissueStudio software is used to quantitatively assess the expression level of Fascin within the tumor and its subcellular compartments.
Automated Quantitative Analysis (AQUA)
[0144] Automated Quantitative Analysis (AQUA) (previously described by Camp RL, Chung GG, Rimm DL. Automated subcellular localization and quantification of protein expression in tissue microarrays. Nat Med 2002;8:1323-7) is yet another example of an approach to automate tumor segmentation and quantify protein expression.
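The compartment masking and pixel averaging that the remainder of this paragraph describes can be pictured with a toy sketch. It is not the AQUA implementation: it assumes simple fixed thresholds for the tumor mask and the nuclear compartment, omits the rapid exponential subtraction step entirely, and operates on tiny nested lists with made-up intensities. All function and parameter names are hypothetical.

```python
def compartment_scores(s100, dapi, target, s100_threshold, dapi_threshold):
    """Average target-channel intensity per compartment within a tumor mask.

    s100, dapi, target: equally sized 2D lists of pixel intensities for the
    tumor-marker, nuclear, and target (Cy5) channels, respectively.
    Assumption: fixed thresholds stand in for the gating and background
    subtraction performed by the actual method.
    """
    tumor, nuclear, non_nuclear = [], [], []
    for s_row, d_row, t_row in zip(s100, dapi, target):
        for s, d, t in zip(s_row, d_row, t_row):
            if s < s100_threshold:
                continue  # pixel lies outside the tumor mask
            tumor.append(t)
            # Tumor-mask pixels are nuclear if DAPI-positive, else non-nuclear.
            (nuclear if d >= dapi_threshold else non_nuclear).append(t)

    def average(values):
        return sum(values) / len(values) if values else None

    return {
        "tumor": average(tumor),
        "nuclear": average(nuclear),
        "non_nuclear": average(non_nuclear),
    }


# Made-up 2x2 images on a 0-4095 intensity scale.
s100_img = [[200, 200], [10, 200]]
dapi_img = [[150, 5], [150, 5]]
cy5_img = [[1000, 3000], [4000, 2000]]
scores = compartment_scores(s100_img, dapi_img, cy5_img, 100, 100)
print(scores)
```

The bright target pixel at the lower left is excluded from every score because it falls outside the tumor mask, which mirrors the stated goal of quantifying expression in tumor cells only.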
Compartmentalization of each sample and quantitation of the target protein signal within each compartment are executed as follows. Alexa 546 signal is used to represent S100 staining and is binary gated (i.e., a threshold is established to "bin" cells into either S100 positive or S100 negative groups) to indicate the tumor mask. Within the region defined by the tumor, the nuclear compartment is defined by applying a rapid exponential subtraction algorithm to the DAPI channel images, which restricts the nuclear compartment assignment to only those pixels that show any positive DAPI signal within the plane of focus. The nonnuclear compartment is then defined by the Pixel-Based Locale Assignment for Compartmentalization of Expression algorithm as all pixels that are assigned to the tumor mask but are not included within the nuclear compartment. Finally, target antigen expression levels are determined in an automated fashion, following application of the rapid exponential subtraction algorithm to the Cy5 images to obtain a relative pixel intensity restricted to the signal emanating from the plane of focus. The final AQUA score for the entire tumor mask or any of its subcellular compartments is calculated as the average AQUA score for each of the individual pixels included in the selected compartment. Representative human melanoma data comparing the distributions of composite AQUA scores for each of the 10 PDs in an S100-defined tumor mask, as well as in nuclear and non-nuclear areas within the tumor mask, are shown as a column chart in Fig. 6. AQUA scores were calculated as the average AQUA score for each of the individual pixels included in the selected compartment and were reported on a scale of 0-4095. A high AQUA score indicates a high level of expression of the candidate PD in the analyzed tumor section whereas a low AQUA score indicates a low level of expression or an absence of expression in the tumor section. AQUA scoring is an advantageous method of quantitation because the scores are a continuous variable, not restricted to the historical categories (0, 1+, 2+, 3+) of scoring. AQUA's utilization of a continuous scoring scale allows it to capture the diverse and varied pathway deregulations in human cancer. Further, the AQUA score generated for a tumor section is unique to the properties and staining of that individual section.
Example 3: Development of prognostic models for human melanoma
Cohort description
[0145] Two cohorts were assembled: The Yale Melanoma Discovery Cohort, containing 192 primary cutaneous melanomas from white patients, and The Yale Melanoma Validation Cohort, consisting of 246 primary cutaneous melanomas. For The Yale Melanoma Validation Cohort, sentinel node dissection was performed coincident with primary tumor resection. Both cohorts were survival outcome-annotated. For a detailed cohort summary, see Gould-Rothberg et al, J Clin Oncol, 2009; 27(34):5772-80.
Construction of tumor microarrays
[0146] Formalin-fixed paraffin-embedded blocks were used to generate 0.6 mm diameter tumor tissue microarrays (TMAs). The discovery array (YTMA59) included single cores from the 192 cutaneous melanomas as well as a series of cell line and human tumor samples as controls. The validation cohort was cored twice and arrayed onto two recipient blocks to create a tissue microarray (TMA) with 2-fold redundancy (YTMA76). It also contained 60 randomly selected individuals from YTMA59 that served as normalization controls for immunofluorescent staining. For details, see Gould-Rothberg et al, J Clin Oncol, 2009; 27(34):5772-80.
Identification of the 10 PDs
[0147] A total of 38 candidate proteins were analyzed by quantitative immunofluorescence on YTMA59 as described (Gould-Rothberg et al, J Clin Oncol, 2009; 27(34):5772-80). Work described in International Patent Application Publication No. WO2009/158620 using metastatic versus non-metastatic melanoma mouse models and functional protein analyses identified 31 partially non-overlapping human proteins of importance for early melanoma invasion and spreading. The discovery of these proteins included identification of the genes that were differentially up- or down-regulated in a metastatic melanoma model based on an inducible c-Met overexpressing transgenic mouse on a p19/ARF -/- background versus a non-metastatic inducible H-Ras, p19/ARF -/- transgenic mouse. The differentially regulated genes were subjected to cross-species oncogenomic comparison with human genes that exhibited corresponding copy number alterations in metastatic versus non-metastatic melanoma. A subset of the 295 upregulated genes in both mouse and human were ultimately validated in functional in vitro assays for a gain-of-function invasive phenotype in HMEL-468 cells through transient overexpression in Boyden Chamber and trans-well migration assays. For a detailed description see International Patent Application Publication No. WO2009/158620. Based on this work, 31 proteins were identified as positives, and a subset of 17 of these were analyzed further by quantitative immunofluorescence (QIF) on YTMA59. An algorithm-voting approach (see Fig. 8) was ultimately applied to a subset of the 38 candidate proteins stained in the Gould-Rothberg study and the 17 markers from the mouse model studies to prioritize the ten markers that consistently contributed to highly predictive models using various computational/statistical algorithms across extensive resampling perturbations on YTMA59. Of these ten, six markers were identified from the subset of 38, and four from the subset of seventeen. These ten PDs were then assessed by QIF using AQUA on YTMA76, as described in the following paragraph.
Antibody optimization and staining
[0148] Monoclonal antibodies for each of the 10 PDs (Table 2) were optimized for AQUA. Here, each antibody was tested on melanoma test arrays using the HistoRx PM-2000 platform to establish optimal antibody titer. A titration curve analysis was performed in which AQUA score was plotted for a given antibody tested at 1:10, 1:30, 1:100, 1:300, 1:1k, 1:3k, 1:10k, 1:30k, 1:100k, 1:300k, or 1:1000k dilutions, or with no primary antibody. For each dilution, the ratio between the highest and lowest AQUA score was calculated. In addition to global fluorescence, the subcellular distribution (e.g. nuclear versus cytoplasmic) was quantitatively assessed at multiple dilutions of the primary antibody. Collectively, the above experiments identified experimental conditions that maximize signal-to-noise ratios on the AQUA platform across cancer tissues. The optimal dilution for each antibody was employed to stain the YTMA76 cohort (Fig. 7). The data were analyzed using the AQUA version 2.3.3.2 software. AQUA scores in the nuclear and non-nuclear compartments, as well as total AQUA scores under the tumor mask, were exported for each core.
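The titration analysis above can be sketched in a few lines of Python. This is a minimal illustration and not the HistoRx or AQUA software: the function names, the data layout, and the scores are hypothetical, and choosing the dilution with the largest highest-to-lowest score ratio is an assumed reading of how the optimal titer was picked.

```python
def dynamic_range(scores):
    """Ratio between the highest and lowest AQUA score at one dilution."""
    lowest = min(scores)
    if lowest <= 0:
        raise ValueError("AQUA scores must be positive to form a ratio")
    return max(scores) / lowest


def best_dilution(titration):
    """Return the dilution label whose cores span the widest dynamic range.

    `titration` maps a dilution label (e.g. "1:100") to the AQUA scores
    measured across the cores of a test array at that dilution.
    """
    return max(titration, key=lambda d: dynamic_range(titration[d]))


# Made-up scores for three dilutions of one hypothetical antibody.
example_titration = {
    "1:10": [3900.0, 3600.0, 3000.0],    # near saturation, narrow range
    "1:100": [2400.0, 900.0, 300.0],     # widest spread between cores
    "1:1000": [220.0, 180.0, 150.0],     # close to background
}
chosen = best_dilution(example_titration)
```

In this toy data the middle dilution separates strongly and weakly staining cores best, which is the behavior a signal-to-noise criterion would be expected to reward.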
An algorithm-voting approach to prioritize PDs for inclusion in prognostic models
[0149] A voting-based approach was applied to prioritize markers that consistently contribute to highly predictive models using various computational/statistical algorithms across extensive resampling perturbations (Fig. 8). Specifically, two variable-reducing algorithms, LASSO (The Lasso Method for Variable Selection in the Cox Model, Robert Tibshirani, Statistics in Medicine 16 (1997)) and an AIC-Optimizing Stepwise Forward Selection Cox Regression model, were deployed on the log2-transformed AQUA and two-level discretized AQUA data separately, yielding four different algorithmic approaches. During the extensive 200-iteration bootstrapping stage, each PD was either included or excluded from the derived model. An algorithm-specific biomarker importance score is defined as the number of times an individual PD occurred in the 200 bootstrapping models, weighted by the model's performance (C-index) on the internal validation samples (e.g., the remaining samples not chosen by the resampling). PDs were ranked by the sum of their importance scores (SumScores) across the four algorithms with the top ones corresponding to the highest absolute aggregated scores. The top-K (wherein K can be any number of the ten markers) were subsequently selected for inclusion into the final prognostic models, constructed by a multivariate Cox regression algorithm (see, e.g., Fig. 9).
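The voting step can be illustrated with the sketch below. It assumes the bootstrap models and their validation C-indexes have already been computed, and it reads "number of times, weighted by C-index" as the sum of the C-indexes of the models that include a marker; that reading, the function names, and the toy numbers are assumptions for illustration only.

```python
from collections import defaultdict


def importance_scores(bootstrap_models):
    """Sum, per marker, the C-index of every bootstrap model that includes it.

    `bootstrap_models` is a list of (selected_markers, c_index) pairs, one
    pair per bootstrap iteration of a single algorithm.
    """
    scores = defaultdict(float)
    for selected_markers, c_index in bootstrap_models:
        for marker in selected_markers:
            scores[marker] += c_index
    return dict(scores)


def rank_by_sum_score(per_algorithm_models):
    """Rank markers by their importance scores summed across algorithms."""
    totals = defaultdict(float)
    for bootstrap_models in per_algorithm_models.values():
        for marker, score in importance_scores(bootstrap_models).items():
            totals[marker] += score
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Toy input: two algorithms with two bootstrap iterations each (the text
# uses four algorithms and 200 iterations). All values are invented.
toy_runs = {
    "lasso_log2": [({"CD44", "PCNA"}, 0.5), ({"CD44"}, 0.75)],
    "stepwise_log2": [({"PCNA", "SPARC"}, 0.5), ({"CD44", "SPARC"}, 0.25)],
}
ranking = rank_by_sum_score(toy_runs)
top_k = [marker for marker, _ in ranking[:2]]  # K = 2 in this toy case
```

Weighting by C-index means a marker that appears only in poorly performing models accumulates less credit than one that appears equally often in well-performing models.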
Greedy algorithm for generating models that effectively segregate low risk or high risk populations
[0150] Another approach to variable selection involves a stepwise process. Starting from an empty list, we first exhaustively examine Cox regression models composed of all 1-marker and 2-marker combinations. The best model is chosen so as to maximize the segregation of the low risk/high risk population. Hence this approach yields two types of models, one that is sensitive to the low risk population by maximizing the sensitivity and negative prediction rate (NPR) (which is preferably 50% or greater), while the other is sensitive to the high risk population by maximizing the specificity and positive prediction rate (PPR). At the next step, one or two additional markers are added to the already established best performing set of markers. Again, the additional markers are chosen so as to maximize the sensitivity/NPR of the low risk model or the specificity/PPR of the high risk model. If the addition of new markers enhances model performance, the best expanded model replaces the existing model and the expansion step is repeated until the current model cannot be improved by adding 1 or 2 new markers. Exemplification of the construction of these models is illustrated in Figs. 16 and 17. Fig. 16 presents results for a version of the variable selection algorithm that favors variable combinations that yield models which are very sensitive in detecting the low risk population. This is a greedy algorithm which iteratively adds to an already established group the one or two additional variables that best improve the sensitivity of the model in the low risk group. The table presents the results for the training set. The Kaplan-Meier curves present the performance on both the training and test set for the model with six variables. We see that for both sets the sensitivity in the low risk group is 1. Fig. 17 presents results for a version of the variable selection algorithm that favors variable combinations that yield models which are geared towards identifying the high risk population.
The algorithm is analogous to the one for identifying the low risk population (see legend of Fig. 16) except that variables that improve the specificity are added at each stage. The table presents the performance of the algorithm on the training data while the Kaplan Meier curves demonstrate the performance on both training and test data for the model with six variables. The high risk population has a relatively high fraction of recurrence events, as well as relatively short follow-up times for the censored cases. Fig. 18 illustrates that multiple models can be combined to improve population segregation. Here, the model that effectively segregates the low risk population is combined with the model that effectively segregates the high risk population to yield stratification into three classes: high risk, medium risk, and low risk. The model scores consist of a linear combination of the log2 AQUA scores for the markers weighted by the coefficients for the two models indicated in the two tables. The thresholds for the model scores used to segregate populations are indicated in the decision tree below the table. Fig. 19 demonstrates the effectiveness of the combined model for segregating low and high risk populations illustrated in Fig. 18. Kaplan-Meier curves for both training and test cohorts are presented.
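The greedy expansion described in this paragraph can be sketched as follows. The sketch leaves out the Cox regression fit itself: `evaluate` is a hypothetical caller-supplied function that would fit a model on a marker set and return the quantity to maximize (for example sensitivity/NPR for the low risk model, or specificity/PPR for the high risk model). The toy objective below is invented purely so the example runs.

```python
from itertools import combinations


def greedy_select(markers, evaluate):
    """Grow a marker set by adding one or two markers per step.

    Starting from the empty set, every 1- and 2-marker expansion of the
    current best set is scored; the best expansion is kept only if it
    improves the score, otherwise the search stops.
    """
    def expansions(current):
        remaining = [m for m in markers if m not in current]
        for size in (1, 2):
            for extra in combinations(remaining, size):
                yield current + extra

    best_set, best_score = (), float("-inf")
    while True:
        candidates = list(expansions(best_set))
        if not candidates:
            break  # every marker is already in the model
        challenger = max(candidates, key=evaluate)
        challenger_score = evaluate(challenger)
        if challenger_score <= best_score:
            break  # no 1- or 2-marker addition improves the model
        best_set, best_score = challenger, challenger_score
    return best_set, best_score


# Stand-in objective: reward hypothetical "useful" markers, penalize size.
USEFUL = {"CD117", "CD44", "KIF2C"}


def toy_objective(marker_set):
    return sum(2 for m in marker_set if m in USEFUL) - len(marker_set)


selected, score = greedy_select(
    ["ANLN", "CD117", "CD44", "KIF2C", "PCNA"], toy_objective
)
```

Allowing two markers per step lets the search pick up pairs that only help in combination, at the cost of scoring more candidate sets per iteration than a one-at-a-time forward selection.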
[0151] For all algorithms, the variable selection and model construction phase yields a set of PDs and a set of coefficients. The score of each sample is obtained as a linear combination of the log2 AQUA scores of the PDs weighted by the coefficients. Optimal cutoffs are selected to optimize model performance on the training set.
Development and validation of prognostic models
[0152] To examine whether the risk models are able to segregate melanoma tumors into risk-distinct groups, we considered several segregation scenarios with high clinical relevance. In the first scenario we consider identifying the low risk population which could forgo SLN biopsy. Here, we used the greedy model that segregates low risk populations and obtained the results presented in Fig. 13. Another important category of patients is the high risk group among all patients. This latter group would be a candidate for adjuvant therapy and/or more aggressive follow-up monitoring. Here, we used the high risk (greedy) model with results presented in Fig. 14. An additional group of patients is the high risk patients among SLNB-negative patients. This latter group would benefit from adjuvant therapy and/or more aggressive follow-up monitoring, which is not suggested by current standards of care. Fig. 15 demonstrates the ability to identify patients (using the greedy model) in that category.
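Each of these scenarios reduces to computing a model score (a linear combination of log2 AQUA scores) and comparing it with a cutoff, and two such models can be combined into a three-class stratification. The sketch below shows that logic only. The coefficients, cutoffs, marker/compartment keys, and the order in which the two models are consulted are all invented for illustration and are not the values or the decision tree of the figures.

```python
import math


def model_score(aqua_scores, coefficients):
    """Linear combination of log2 AQUA scores weighted by model coefficients."""
    return sum(coef * math.log2(aqua_scores[marker])
               for marker, coef in coefficients.items())


def stratify(aqua_scores, low_model, high_model):
    """Assign 'low', 'high', or 'medium' risk using two thresholded models.

    Each model is a (coefficients, cutoff) pair. Consulting the low-risk
    model first is an assumption about the ordering of the decision tree.
    """
    low_coefs, low_cutoff = low_model
    high_coefs, high_cutoff = high_model
    if model_score(aqua_scores, low_coefs) < low_cutoff:
        return "low"
    if model_score(aqua_scores, high_coefs) > high_cutoff:
        return "high"
    return "medium"


# Invented coefficients and cutoffs, for illustration only.
LOW_MODEL = ({"CD44_nuclear": 1.0, "KIF2C_nuclear": 0.5}, 12.0)
HIGH_MODEL = ({"PCNA_tumor": 1.0, "MMP1_nuclear": 1.0}, 20.0)

patient = {"CD44_nuclear": 256.0, "KIF2C_nuclear": 16.0,
           "PCNA_tumor": 1024.0, "MMP1_nuclear": 512.0}
risk_class = stratify(patient, LOW_MODEL, HIGH_MODEL)
```

A patient who falls below the low-risk cutoff is classified without consulting the high-risk model; patients flagged by neither model land in the medium-risk class.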
[0153] Other computational approaches were also successfully employed to develop and validate prognostic models. In one example, stepwise forward selection was employed with a Cox regression algorithm to select three PDs into a linear model. In this example, multiple subcellular compartments (e.g., the nuclear compartment ("NUCLEAR") as well as the total nuclear plus cytoplasmic signal present within the tumor mask ("TUMOR")) for N-cadherin (CDH2) significantly contributed to the model (Fig. 10). Additional efforts confirmed that multiple distinct combinations of the 10 PDs could generate similarly powerful prognostic models (Fig. 11). Furthermore, prognostic models could be successfully developed using all 10 PDs, or various subsets thereof (Fig. 12). In the latter two cases, subsets of PDs are selected from the ranked list of PDs generated by the algorithm of Fig. 8, and coefficients are obtained by training a Cox proportional hazard model using these subsets.
[0154] Importantly, the prognostic models were demonstrated to be independent of standard clinical parameters (Fig. 20). Specifically, the PD-based prognostic models could identify high-risk patients typically considered "low-risk" given a relatively "thin" melanoma tumor at time of diagnosis. Similarly, the PD-based prognostic models identified high risk patients that were SLN negative (Figs. 15 and 20), as discussed above.
Example 4: Development of prognostic models for other cancers
[0155] The utility of the ten PDs was confirmed in prostate, breast, and colon cancer. The mRNA expression data and survival information were retrieved from the original publications (Prostate cancer: PMID: 20579941, Breast cancer: PMID: 12490681, Colon cancer: PMID: 19996206). The expression profiles from the prostate and colon cancer cohorts were based on the Affymetrix Human Genome U133 Plus 2.0 Array, and those from the breast cancer cohort were based on the Agilent Human 25K platform. The profiles from different individuals were converted to a log2 scale and standardized by quantile normalization within each cohort. The expression levels of genes with multiple probe sets were aggregated by taking the median values. For the prostate data set, a linear model combining the expression profiles of the ten PDs was built for the cohort using multivariate Cox proportional hazard regression that assigned a risk score to every patient in the cohort. To identify the low-risk and high-risk groups, we used an extreme approach that searched for the highest risk score cutoff such that patients with lower-than-threshold risk scores have no recurrence events. This risk score threshold dichotomized the cohort into low-risk (lower scores) and high-risk (higher scores) groups for biochemical recurrence (Fig. 21). The significance of our prognostic model was assessed by the hazard ratio between the two groups, P values in the log-rank test, and C-indexes. The breast cancer dataset was clustered by hierarchical clustering and the two main clusters were used to identify two populations. All 10 PDs were used. The clustering process naturally identified low and high risk populations, and they are illustrated using Kaplan-Meier curves in Fig. 22. The colon cancer dataset was clustered by k-means clustering with two centroids, using only expression values from the ten PDs. The clustering process naturally identified low and high risk populations, as illustrated by Kaplan-Meier analysis in Fig. 23.
Taken together, these results demonstrate that the PDs of the present invention are prognostic in diverse cancer types.
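The "extreme" cutoff search used for the prostate cohort can be sketched as below. It assumes that risk scores have already been produced by a fitted Cox model; the function names and the six-patient data set are hypothetical. Because every patient strictly below the cutoff must be free of recurrence, the highest admissible cutoff is simply the lowest risk score among patients who recurred.

```python
def no_recurrence_cutoff(risk_scores, recurred):
    """Highest cutoff such that every patient scoring below it is event-free.

    Returns the lowest risk score among patients who recurred; patients
    with strictly lower scores form the low-risk group. Returns None when
    no patient recurred, in which case no cutoff is defined.
    """
    event_scores = [s for s, r in zip(risk_scores, recurred) if r]
    return min(event_scores) if event_scores else None


def dichotomize(risk_scores, cutoff):
    """Label each patient 'low' (score below the cutoff) or 'high'."""
    return ["low" if s < cutoff else "high" for s in risk_scores]


# Made-up Cox risk scores and recurrence flags for six patients.
scores = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.5]
events = [False, False, False, True, False, True]
cutoff = no_recurrence_cutoff(scores, events)
groups = dichotomize(scores, cutoff)
```

Note that a cutoff chosen this way is fitted to the cohort it is evaluated on, so the event-free low-risk group is guaranteed by construction on that cohort.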
Example 5: PDs operate within pathway context, which is a molecular stratifier of the patient population
The function of individual PDs or of specific combinations of PDs is pathway context-dependent
[0156] PDs may be involved in a number of pathways relevant to cancer progression, which can include proteins with molecular alterations affecting their ability to regulate the normal molecular events of the cell. By way of example, a loss-of-function (LOF) mutation in a tumor suppressor or a gain-of-function (GOF) mutation in an oncoprotein are examples of pathway context molecular alterations that can cause perturbed signal transduction pathway activity that is associated with cancer aggressiveness. Indeed, GOF and LOF mutations in these pathway context genes may result in perturbed or deregulated pathway activity that in many instances is thought to be a driver of an aggressive, malignant phenotype. An important consequence of this is that the biological role of individual PDs and specific combinations of PDs for progression of a certain tumor type will entirely depend on the pathway context within which they operate. This notion of pathway context-dependent function is indirectly illustrated in Figs. 13-15. In this instance, nuclear CD117, nuclear CD44, non-nuclear CDH2, and nuclear KIF2C constitute a particularly powerful PD four-marker combination in identifying patients at low risk for progression based on their quantitative expression levels (Fig. 13). By inference, if this marker combination is able to perform very well in identifying low risk patients based on levels of expression, it should also be able to accurately predict the complementary group, namely high risk patients, based on the exact expression levels. Instead we find that, for instance, non-nuclear CD117, tumor KIF2C, nuclear MMP1, and tumor PCNA is a particularly powerful PD four-marker combination to identify high risk patients with a minimal number of false positive patients (Fig. 14). Finally, yet another PD four-marker combination, namely nuclear ANLN, nuclear MMP1, tumor PCNA, and non-nuclear SPARC, is particularly sensitive for identifying the high risk patients in a SLNB-negative patient population (Fig. 15). While not wishing to be bound by theory, the observation that different four-marker PD combinations are optimal in performance for different prognostic endpoints, e.g. identification of low risk, high risk, and high risk amongst node-negative patients, suggests that the individual PDs and specific combinations of PDs depend on a distinct pathway context.
[0157] This concept has important implications for the functional role of PDs. The correlation between the expression level of a PD and outcome strongly suggests that the PD plays a critical role in tumor progression regardless of whether it is an oncogenic or suppressive factor. The PD may not be the ultimate driver of tumor progression in a pathway, but, rather, could be a downstream or upstream product of the true driver. Thus, many of the PDs can be substituted with other protein markers that can serve as functional readouts of the perturbed pathways causing early stage cancers to progress. Using quantitative technology in conjunction with pathway analysis, the true drivers of tumor progression will be identified. These drivers will have an important effect on prognosis.
Pathway context characterization and pathway context-dependent PD function.
[0158] Fig. 24 illustrates an example of quantitative analysis of mRNA levels through qRT-PCR of the three clinically relevant pathway context genes for malignant melanoma, B-Raf, N-Ras, and K-Ras, in the nine melanoma cell lines described in Fig. 1 and a control cell line, NHEM-neo, used for normalization. As expected, the expression levels vary across the cell lines, which could imply different functional roles of the PDs independent of the mutational status of the pathway context genes. Fig. 25 is a Western blot analysis of the three pathway context proteins, B-Raf, N-Ras, and K-Ras, in the nine melanoma cell lines. There are noticeable differences in the expression levels of B-Raf across the cell lines, and the band shifts of B-Raf are indicative of different phosphorylation and hence activity states of B-Raf in the different cell lines. Figures 26 and 27, finally, show qualitative and quantitative analysis of the expression levels of B-Raf, N-Ras, and K-Ras in a human melanoma sample. This is important, as the functional role of PDs in a cancer type will depend on the expression level and/or the activity state of the pathway context proteins. If a pathway context protein that is known to be associated with aggressiveness, for instance B-Raf in malignant melanoma, is activated or mutated, certain PDs will be associated with high risk and cancer progression. However, other PD combination(s) can predict low risk for progression, as they serve a different role in the context of a wild-type B-Raf.
[0159] Through analysis of the expression of pathway context proteins with specific antibodies together with analysis of the signal transduction pathway activity in the pathway context-regulated proteins through standard tools, like pan and phospho-specific antibodies, one can get a good functional assessment of the activity state of pathway context-regulated proteins (for examples, see Andersen et al, 2010. Pathway-based identification of biomarkers for targeted therapeutics: personalized oncology with PI3K pathway inhibitors. Sci Transl Med. 2(43):43ra55). Such 'pathway mapping' analysis of PDs in the pathway context using quantitative means like AQUA or Definiens TissueStudio™ will enable a more precise understanding of whether a particular PD is a driver of the aggressive cancer phenotype or just a co-regulated molecule. For instance, the information that a particular PD is active in a particular pathway context, but not in another, can be linked with a functional characterization of the role of that PD when it is in the active pathway context. Through genetic means one can downregulate the expression of the active PD in the relevant pathway context in a cellular model system, and functionally validate the consequences on cell survival and cell invasiveness in vitro and in animal cancer models. This will have direct implications for the ability to link a high risk cancer patient to the most suitable type of therapy. For instance, the effect of a particular drug on a pathway can be directly tested using quantitative analysis of the PD expression level and/or its activity state with a phospho-specific antibody, making it possible to advise high risk patients on the most suitable types of therapy for their cancer.

Claims

What is Claimed is:
1. A method of predicting prognosis of a cancer patient, comprising:
obtaining a cancerous tissue sample from the patient;
measuring the levels of two or more biomarkers selected from the group consisting of ANLN, CD44, CDH2, CD117, DEPDC1, FSCN1, KIF2C, MMP1, PCNA, and SPARC in the sample; and
obtaining biomarker scores based on the measured levels, wherein the biomarker scores are indicative of the prognosis of the cancer patient.
2. The method of claim 1, wherein the patient has melanoma.
3. The method of claim 2, wherein the prognosis is that the patient is at a low risk of having metastatic cancer or recurrence of the melanoma.
4. The method of claim 3, wherein the selected biomarkers comprise: (1) CD44, ANLN, CD117, MMP1, and KIF2C; or (2) CDH2, SPARC, PCNA, FSCN1, and DEPDC1.
5. The method of claim 2, wherein the prognosis is that the patient is at a high risk of having metastatic cancer or recurrence of the melanoma.
6. The method of claim 5, wherein the selected biomarkers comprise: (1) CD117, CD44, KIF2C, MMP1, and CDH2; or (2) PCNA, ANLN, SPARC, FSCN1, and DEPDC1.
7. The method of claim 5, wherein the patient has a negative result in sentinel lymph node biopsy (SLNB), and wherein the selected biomarkers comprise: (1) ANLN, MMP1, CDH2, KIF2C, and SPARC; or (2) CD117, PCNA, FSCN1, CD44, and DEPDC1.
8. A method of analyzing a cancerous tissue sample from a cancer patient, comprising:
obtaining a cancerous tissue sample from the patient; and
measuring the levels of two or more biomarkers selected from the group consisting of ANLN, CD44, CDH2, CD117, DEPDC1, FSCN1, KIF2C, MMP1, PCNA, and SPARC in the sample.
9. A method of identifying a cancer patient in need of adjuvant therapy, comprising:
obtaining a cancerous tissue sample from the patient;
measuring the levels of two or more biomarkers selected from the group consisting of ANLN, CD44, CDH2, CD117, DEPDC1, FSCN1, KIF2C, MMP1, PCNA, and SPARC in the sample; and
obtaining biomarker scores based on the measured levels, wherein the biomarker scores indicate that the patient is in need of adjuvant therapy.
10. The method of claim 9, wherein the adjuvant therapy is selected from the group consisting of radiation therapy, chemotherapy, immunotherapy, hormone therapy, and targeted therapy.
11. The method of claim 10, wherein the targeted therapy targets another component of a signaling pathway in which one or more of the selected biomarkers is a component.
12. The method of claim 10, wherein the targeted therapy targets one or more of the selected biomarkers.
13. A method of treating a cancer patient, comprising:
obtaining the biomarker scores of two or more biomarkers selected from the group consisting of ANLN, CD44, CDH2, CD117, DEPDC1, FSCN1, KIF2C, MMP1, PCNA, and SPARC in a cancerous tissue sample from the patient; and
treating the patient with adjuvant therapy if the biomarker scores indicate that the patient is at a high risk of having metastatic cancer or recurrence of cancer.
14. The method of claim 13, wherein the adjuvant therapy is an experimental therapy.
15. A method of identifying a cancer patient in need of a sentinel lymph node biopsy, comprising:
obtaining the biomarker scores of two or more biomarkers selected from the group consisting of ANLN, CD44, CDH2, CD117, DEPDC1, FSCN1, KIF2C, MMP1, PCNA, and SPARC in the sample; and
performing sentinel lymph node biopsy on the patient if the biomarker scores indicate that the patient is at a high risk of having metastatic cancer or recurrence of cancer.
16. A method of identifying a cancer patient not in need of a sentinel lymph node biopsy, comprising:
obtaining the biomarker scores of two or more biomarkers selected from the group consisting of ANLN, CD44, CDH2, CD117, DEPDC1, FSCN1, KIF2C, MMP1, PCNA, and SPARC in the sample; and
not performing sentinel lymph node biopsy on the patient if the biomarker scores indicate that the patient is at a low risk of having metastatic cancer or recurrence of cancer.
17. The method of any one of the above claims, wherein the biomarker scores are obtained by applying a coefficient to the measured levels of the selected biomarkers.
18. The method of any one of the above claims, wherein the biomarker scores are calculated by using one or more algorithms selected from the group consisting of the Greedy Model, the Cox regression algorithm, the LASSO algorithm, and the AIC-Optimizing Stepwise Forward Selection Cox Regression algorithm.
19. The method of any one of the above claims, wherein the measuring step comprises measuring the RNA transcript levels of the selected biomarkers.
20. The method of claim 18, wherein the RNA transcript levels are determined by microarray, quantitative RT-PCR or Nanostring nCounter.
21. The method of any one of claims 1-18, wherein the measuring step comprises measuring the protein levels of the selected biomarkers.
22. The method of claim 21, wherein the protein levels are measured by antibodies.
23. The method of claim 22, wherein the protein levels are measured by immunohistochemistry or immunofluorescence.
24. The method of claim 23, wherein the measuring step comprises measuring the protein level of a selected biomarker in subcellular compartments.
25. The method of claim 24, wherein the measuring step comprises measuring the protein level of a selected biomarker in the nucleus relative to the protein level of the biomarker in the cytoplasm.
26. The method of claim 24, wherein the measuring step comprises measuring the protein level of a selected biomarker in the nucleus or in the cytoplasm.
27. The method of any one of the above claims, wherein noncancerous cells are excluded from the cancerous tissue sample.
28. The method of any one of the above claims, wherein the measuring step comprises separately measuring the levels of the biomarkers.
29. The method of any one of the above claims, wherein the measuring step comprises measuring the levels of the biomarkers in a multiplex reaction.
30. The method of any one of the above claims, wherein the cancerous tissue sample is a formalin-fixed paraffin embedded tissue sample, a snap-frozen tissue sample, an ethanol-fixed tissue sample, a tissue sample fixed with an organic solvent, a tissue sample fixed with plastic or epoxy, a cross-linked tissue sample, surgically removed tumor tissue, circulating tumor cells, a biopsy sample, or a blood sample.
31. The method of any one of the above claims, wherein the cancerous tissue is melanoma, prostate cancer, breast cancer, or colon cancer tissue.
32. The method of any one of the above claims, further comprising measuring at least one standard parameter associated with the cancer.
33. The method of claim 32, wherein the at least one standard parameter is selected from the group consisting of tumor stage, tumor grade, tumor size, tumor visual characteristics, tumor location, tumor growth, lymph node status, tumor thickness (Breslow score), ulceration, age of onset, PSA level, or Gleason score.
34. A kit for measuring the levels of two or more biomarkers selected from the group consisting of ANLN, CD44, CDH2, CD117, DEPDC1, FSCN1, KIF2C, MMP1, PCNA, and SPARC, comprising reagents for specifically measuring the levels of the selected biomarkers.
35. The kit of claim 34, wherein the reagents are nucleic acid molecules.
36. The kit of claim 35, wherein the nucleic acid molecules are PCR primers or hybridizing probes.
37. The kit of claim 36, wherein the reagents are antibodies.
PCT/US2012/028307 2011-03-11 2012-03-08 Methods of predicting prognosis in cancer WO2012125411A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201161452054P 2011-03-11 2011-03-11
US61/452,054 2011-03-11

Publications (1)

Publication Number Publication Date
WO2012125411A1 true WO2012125411A1 (en) 2012-09-20

Family

ID=46831055

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2012/028307 WO2012125411A1 (en) 2011-03-11 2012-03-08 Methods of predicting prognosis in cancer

Country Status (1)

Country Link
WO (1) WO2012125411A1 (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014158696A1 (en) * 2013-03-14 2014-10-02 Castle Biosciences, Inc. Methods for predicting risk of metastasis in cutaneous melanoma
US9057109B2 (en) 2008-05-14 2015-06-16 Dermtech International Diagnosis of melanoma and solar lentigo by nucleic acid analysis
WO2018035168A1 (en) * 2016-08-15 2018-02-22 Imaging Endpoints II LLC Systems and methods for predicting lung cancer immune therapy responsiveness using quantitative textural analysis
US10332634B2 (en) 2017-03-14 2019-06-25 Imaging Endpoints II LLC Systems and methods for reliably diagnosing breast cancer using quantitative textural analysis
CN112034182A (en) * 2020-09-01 2020-12-04 复旦大学附属中山医院 Method and system for predicting colon cancer metastasis
CN113444797A (en) * 2021-06-29 2021-09-28 北京泱深生物信息技术有限公司 Biomarkers for predicting lung cancer prognosis
US11578373B2 (en) 2019-03-26 2023-02-14 Dermtech, Inc. Gene classifiers and uses thereof in skin cancers

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ONKEN ET AL.: "Gene Expression Profiling in Uveal Melanoma Reveals Two Molecular Classes and Predicts Metastatic Death.", CANCER RES, vol. 64, no. 20, 15 October 2004 (2004-10-15), pages 7205 - 7209 *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11332795B2 (en) 2008-05-14 2022-05-17 Dermtech, Inc. Diagnosis of melanoma and solar lentigo by nucleic acid analysis
US9057109B2 (en) 2008-05-14 2015-06-16 Dermtech International Diagnosis of melanoma and solar lentigo by nucleic acid analysis
US10407729B2 (en) 2008-05-14 2019-09-10 Dermtech, Inc. Diagnosis of melanoma by nucleic acid analysis
US11753687B2 (en) 2008-05-14 2023-09-12 Dermtech, Inc. Diagnosis of melanoma and solar lentigo by nucleic acid analysis
WO2014158696A1 (en) * 2013-03-14 2014-10-02 Castle Biosciences, Inc. Methods for predicting risk of metastasis in cutaneous melanoma
US10577660B2 (en) 2013-03-14 2020-03-03 Castle Biosciences, Inc. Diagnostic test for predicting metastasis and recurrence in cutaneous melanoma
US11434536B2 (en) 2013-03-14 2022-09-06 Castle Biosciences, Inc. Diagnostic test for predicting metastasis and recurrence in cutaneous melanoma
WO2018035168A1 (en) * 2016-08-15 2018-02-22 Imaging Endpoints II LLC Systems and methods for predicting lung cancer immune therapy responsiveness using quantitative textural analysis
US11120888B2 (en) 2016-08-15 2021-09-14 Imaging Endpoints II LLC Systems and methods for predicting lung cancer immune therapy responsiveness using quantitative textural analysis
US10332634B2 (en) 2017-03-14 2019-06-25 Imaging Endpoints II LLC Systems and methods for reliably diagnosing breast cancer using quantitative textural analysis
US11578373B2 (en) 2019-03-26 2023-02-14 Dermtech, Inc. Gene classifiers and uses thereof in skin cancers
CN112034182A (en) * 2020-09-01 2020-12-04 复旦大学附属中山医院 Method and system for predicting colon cancer metastasis
CN113444797A (en) * 2021-06-29 2021-09-28 北京泱深生物信息技术有限公司 Biomarkers for predicting lung cancer prognosis

Similar Documents

Publication Publication Date Title
Larkin et al. Identification of markers of prostate cancer progression using candidate gene expression
de Oca et al. The histone chaperone HJURP is a new independent prognostic marker for luminal A breast carcinoma
JP7365899B2 (en) Cancer classification and prognosis
EP2257810B1 (en) Molecular diagnosis and classification of malignant melanoma
JP6049739B2 (en) Marker genes for classification of prostate cancer
US9435812B2 (en) Expression of ETS related gene (ERG) and phosphatase and tensin homolog (PTEN) correlates with prostate cancer capsular penetration
WO2012125411A1 (en) Methods of predicting prognosis in cancer
US20140336280A1 (en) Compositions and methods for detecting and determining a prognosis for prostate cancer
US20210233611A1 (en) Classification and prognosis of prostate cancer
WO2015073949A1 (en) Method of subtyping high-grade bladder cancer and uses thereof
CA2628390A1 (en) Molecular profiling of cancer
US20210010090A1 (en) Method and system for predicting recurrence and non-recurrence of melanoma using sentinel lymph node biomarkers
US20120329878A1 (en) Phenotyping tumor-infiltrating leukocytes
IL297812A (en) Immunotherapy response signature
WO2012009382A2 (en) Molecular indicators of bladder cancer prognosis and prediction of treatment response
US20080299550A1 (en) Methods and Kits For the Prediction of Therapeutic Success and Recurrence Free Survival In Cancer Therapy
Lerebours et al. Hemoglobin overexpression and splice signature as new features of inflammatory breast cancer?
WO2016118670A1 (en) Multigene expression assay for patient stratification in resected colorectal liver metastases
Kosari et al. Shared gene expression alterations in prostate cancer and histologically benign prostate from patients with prostate cancer
US20140100188A1 (en) Phenotyping tumor-infiltrating leukocytes
Kowalewska et al. Estimation of groin recurrence risk in patients with squamous cell vulvar carcinoma by the assessment of marker gene expression in the lymph nodes
Shiyanbola et al. Long noncoding RNA expression in adrenal cortical neoplasms
CN113943803A (en) Application of HTR6 in diagnosis and prognosis of breast cancer
US20160138108A1 (en) Stroma biomarkers for the diagnosis of prostate cancer
US20130260384A1 (en) Method for determining cancer prognosis and prediction with cancer stem cell associated genes

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 12757067; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 12757067; Country of ref document: EP; Kind code of ref document: A1)