WO2011039757A2 - Compositions and methods for prognosis of renal cancer - Google Patents

Compositions and methods for prognosis of renal cancer


Publication number
WO2011039757A2
Authority
WO
WIPO (PCT)
Prior art keywords
seq
nucleic acid
mir
expression
sequence
Prior art date
Application number
PCT/IL2010/000806
Other languages
French (fr)
Other versions
WO2011039757A3 (en)
Inventor
Moshe Hoshen
Zohar Dotan
Original Assignee
Rosetta Genomics Ltd.
Tel Hashomer Medical Infrastructure And Services Ltd.
Priority date
Filing date
Publication date
Application filed by Rosetta Genomics Ltd. and Tel Hashomer Medical Infrastructure And Services Ltd.
Publication of WO2011039757A2
Publication of WO2011039757A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/118 Prognosis of disease development
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/158 Expression markers
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/178 Oligonucleotides characterized by their use miRNA, siRNA or ncRNA

Definitions

  • the invention relates to compositions and methods for prognosis of renal cancer. Specifically the invention relates to microRNA molecules associated with prognosis of renal cancer, as well as various nucleic acid molecules relating thereto or derived therefrom.
  • MicroRNAs (miRNAs, miRs) have emerged as an important novel class of regulatory RNAs that has a profound impact on a wide array of biological processes. These small (typically 18-24 nucleotides long) non-coding RNA molecules can modulate protein expression patterns by promoting RNA degradation, inhibiting mRNA translation, and also affecting gene transcription. miRs play pivotal roles in diverse processes such as development and differentiation, control of cell proliferation, stress response and metabolism. The expression of many miRs was found to be altered in numerous types of human cancer, and in some cases strong evidence has been put forward in support of the conjecture that such alterations may play a causative role in tumor progression. The remarkable tissue-specificity of miR expression allows the development of novel approaches to molecular classification. There are currently about 1,200 known human miRs.
  • Renal cancers account for more than 3% of adult malignancies and cause more than 13,000 deaths per year in the US alone (Jemal et al., 2008, Cancer statistics, CA Cancer J Clin 58, 71-96).
  • the incidence of renal cancers in the US rose more than 50% between 1983 and 2002, and the estimated number of new cases per year rose from 39,000 in 2006 to 54,000 in 2008.
  • Clear cell carcinoma accounts for around 70% of renal cancers and has the worst prognosis of all RCC types.
  • the best available prognostic indicator is stage, but the current prognostic factors: Fuhrman grade, performance status and histological type, as well as stage, are insufficient to predict patient outcome and cancer aggressiveness. Identification of biomarkers that provide further prognostic information would thus be vital for defining optimal treatment and outcomes.
  • altered expression levels of any of SEQ ID NO: 1-60 or combinations thereof are indicative of renal cancer prognosis: life expectancy of the patient, expected recurrence-free survival, time to progression and risk of recurrence of metastases.
  • a method for determining a prognosis for renal cancer in a subject comprising:
  • the nucleic acid sequence is selected from the group consisting of SEQ ID NOS: 1-3, 6-8, 13-23, 30-40, 45-50, 52-58 and sequences at least about 80% identical thereto, and an increased expression level of any of said nucleic acid sequence compared to the threshold expression level is indicative of good prognosis of said subject.
  • the nucleic acid sequence is selected from the group consisting of SEQ ID NOS: 4, 5, 9-12, 24-29, 41-44, 51, 59-60 and sequences at least about 80% identical thereto, and an increased expression level of any of said nucleic acid sequence compared to the threshold expression level is indicative of poor prognosis of said subject.
  • the subject is a human.
  • the method is used to determine a course of treatment of the subject.
  • the biological sample obtained from the subject is selected from the group consisting of a bodily fluid, a cell line or a tissue sample.
  • the tissue is a fresh, frozen, fixed, wax-embedded or formalin-fixed, paraffin-embedded (FFPE) tissue.
  • said tissue is a renal cancer tissue.
  • the expression levels are determined by a method selected from the group consisting of nucleic acid hybridization, nucleic acid amplification, or a combination thereof.
  • the nucleic acid hybridization is performed using a solid-phase nucleic acid biochip array or in situ hybridization.
  • the nucleic acid amplification method is real-time PCR.
  • the PCR method comprises forward and reverse primers.
  • the reverse primer comprises SEQ ID NO: 61.
  • the forward primer comprises a sequence selected from the group consisting of SEQ ID NOS: 62-78.
  • the real-time PCR method further comprises a probe.
  • the probe is complementary to SEQ ID NOS: 1-60, to a fragment thereof or to a sequence at least about 80% identical thereto.
  • the probe comprises a sequence selected from the group consisting of SEQ ID NOS: 79-92.
  • the invention further provides a kit for prognosis of renal cancer, said kit comprising a forward primer comprising a nucleic acid sequence that is complementary to a sequence selected from SEQ ID NOS: 1-60, to a fragment thereof or to a sequence at least about 80% identical thereto.
  • the kit further comprises a reverse primer.
  • said reverse primer comprises SEQ ID NO: 61.
  • said forward primer comprises a sequence selected from the group consisting of SEQ ID NOS: 62-78.
  • the kit further comprises a probe.
  • the probe comprises a sequence selected from the group consisting of SEQ ID NOS: 79-92.
  • Figure 1 is a scatter plot showing differential expression of miRs (in normalized fluorescence units, as measured by a microarray) in samples obtained from renal cancer patients with good prognosis (survival above two years following surgery) and from renal cancer patients with bad prognosis (survival below two years following surgery), comparing the median values of each miR in all patients in one group with the corresponding median for members of the other group.
  • the y-axis represents patients with good prognosis (46 patients), and the x-axis represents patients with bad prognosis (9 patients).
  • the parallel lines describe a fold change between groups of 1.5 in either direction.
  • the miRs miR-21* (SEQ ID NO: 4), hsa-miR-22 (SEQ ID NO: 5), hsa-miR-26b (SEQ ID NO: 1), hsa-miR-27a (SEQ ID NO: 2) and hsa-miR-23a (SEQ ID NO: 3) are labeled.
  • P-values are calculated by Student t-test.
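  • The comparison described for the scatter plots (group medians, a 1.5-fold change band, and a Student t-test) can be sketched as follows. This is a minimal illustration, not the applicants' pipeline: the function names are hypothetical, the fold change is assumed to be a ratio of medians in linear fluorescence units, and the t statistic assumes equal variances in the two groups.

```python
from math import sqrt
from statistics import mean, median, variance


def median_fold_change(group_a, group_b):
    """Ratio of the two group medians, reported so that it is always >= 1."""
    ratio = median(group_a) / median(group_b)
    return ratio if ratio >= 1 else 1 / ratio


def student_t_statistic(group_a, group_b):
    """Two-sample Student t statistic using a pooled variance estimate."""
    n_a, n_b = len(group_a), len(group_b)
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / n_a + 1 / n_b))


# Hypothetical normalized fluorescence values for one miR in two patient groups.
good_prognosis = [1200, 1500, 1800, 2100]
bad_prognosis = [600, 700, 800, 900]

fold = median_fold_change(good_prognosis, bad_prognosis)
outside_band = fold >= 1.5  # True when the miR lies outside the 1.5-fold lines
t_value = student_t_statistic(good_prognosis, bad_prognosis)
```

  • Converting the t statistic to a p-value requires the t distribution with n_a + n_b - 2 degrees of freedom, which is left to a statistics library.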
  • the left box includes the group of patients with good prognosis, while the right box includes the group of patients with bad prognosis.
  • the line in the box indicates the median value.
  • the box contains 50% of the data and the horizontal lines and crosses (outliers) show the full range of signals in this group.
  • Figures 3A-K are Kaplan-Meier plots, which correct for patients who were censored (subjects that may have dropped out of the study and/or were lost to follow-up or deliberately withdrawn due to the culmination of the study).
  • the y-axis depicts the fraction of surviving patients, and the x-axis depicts months of survival.
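  • How a Kaplan-Meier curve accounts for censored patients can be sketched as below. This is an illustrative product-limit estimator under stated assumptions: the function name and the sample data are hypothetical, and ties between an event and a censoring at the same time are handled by counting both as still at risk at that time.

```python
def kaplan_meier(times, events):
    """Return [(time, surviving_fraction)] at each time with an observed event.

    times  -- follow-up in months for each patient
    events -- True if the event (e.g. death) was observed, False if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        observed = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)
        if observed:
            # Only observed events lower the estimated surviving fraction.
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        # Patients with an event and censored patients both leave the risk set.
        at_risk -= leaving
    return curve


# Hypothetical follow-up data: two patients are censored (months 8 and 20).
months = [5, 8, 12, 12, 20]
event_observed = [True, False, True, True, False]
curve = kaplan_meier(months, event_observed)
```

  • Censored patients never cause a drop in the curve; they only shrink the number at risk for later time points.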
  • Figure 4 is a scatter plot showing differential expression of miRs (in normalized fluorescence units, as measured by microarray) in samples obtained from renal cancer patients with metastases and from renal cancer patients with primary tumors, comparing the median values of each miR in all patients in one group with the corresponding median for members of the other group.
  • the y-axis represents renal cancer patients with metastases to the lung (6 patients), and the x-axis represents patients with primary renal tumors (51 patients).
  • the parallel lines describe a fold change between groups of 1.5 in either direction.
  • miR-10a SEQ ID NO: 11

  • hsa-miR-138 SEQ ID NO: 12
  • hsa-miR-451 SEQ ID NO: 13
  • hsa-miR-27a SEQ ID NO: 2
  • hsa-miR-26b SEQ ID NO: 1
  • hsa-miR-210 SEQ ID NO: 14
  • hsa-miR-155 SEQ ID NO: 15
  • hsa-miR-455-3p SEQ ID NO: 16
  • hsa-miR-16 SEQ ID NO: 17
  • Figure 5 is a scatter plot showing differential expression of miRs (in normalized fluorescence units, as measured by a microarray) in samples obtained from primary renal cancer patients with good prognosis and from primary renal cancer patients with bad prognosis, comparing the median values of each miR in all patients in one group with the corresponding median for members of the other group.
  • the y-axis represents patients with good prognosis (44 patients), and the x-axis represents patients with bad prognosis (5 patients).
  • the parallel lines describe a fold change between groups of 1.5 in either direction.
  • Figures 6A-D are Kaplan-Meier plots for time to progression status of primary renal cancer patients.
  • the y-axis depicts the fraction of progression-free patients, and the x-axis depicts time to progression (months).
  • Figure 7 is a scatter plot showing differential expression of miRs (in normalized fluorescence units, as measured by a microarray) in samples obtained from non-metastatic renal cancer patients with good prognosis (mean follow-up 68 months, range 24-142) and from renal cancer patients with bad prognosis (mean time to progression 12 months, range 1-22 months), comparing the median normalised values of each miR in all patients in one group with the corresponding median for members of the other group.
  • the y-axis represents patients with good prognosis (40 patients), and the x-axis represents patients with bad prognosis (10 patients).
  • the parallel lines describe a fold change between groups of 1.5 in either direction.
  • the miRs miR-21* (SEQ ID NO: 4), hsa-miR-487b (SEQ ID NO: 59), hsa-miR-30e* (SEQ ID NO: 52), hsa-miR-29c* (SEQ ID NO: 57), hsa-miR-345 (SEQ ID NO: 47) and hsa-miR-362-5p (SEQ ID NO: 55) are labeled. P-values are calculated by Student t-test.
  • Figure 8 is a dot-plot presentation comparing distribution of the expression (y-axis) of miR-21* (SEQ ID NO: 4) (p-value 0.0004552, fold change 2.2) in tumor samples obtained from non-metastatic renal cancer patients with bad or good prognosis (as defined in Figure 7).
  • the left plot includes the group of patients with good prognosis, while the right plot includes the group of patients with bad prognosis.
  • the line in the middle indicates the median value.
  • Figures 9A-D are Kaplan-Meier plots for time to progression status of primary renal cancer patients comparing microarray (9A, 9C) and PCR (9B, 9D) data.
  • the y-axis depicts the fraction of progression-free patients, and the x-axis depicts time to progression (months).
  • miRNA expression can serve as a novel tool for the prognosis of patients with renal cancer. More particularly, it may serve for the prediction of survival, risk of recurrence and progression.
  • All the methods of the present invention may optionally further include measuring levels of other renal cancer markers.
  • Other renal cancer markers in addition to said microRNA molecules, useful in the present invention will depend on the cancer being tested and are known to those of skill in the art.
  • Assay techniques that can be used to determine levels of expression of a nucleic acid sequence, such as a nucleic acid sequence of the present invention, in a sample derived from a patient are well known to those of skill in the art.
  • Such assay methods include, but are not limited to, radioimmunoassays, reverse transcriptase PCR (RT-PCR) assays, immunohistochemistry assays, in situ hybridization assays, competitive-binding assays, Northern Blot analyses, ELISA assays, nucleic acid microarrays and biochip analysis.
  • An arbitrary threshold on the expression level of one or more nucleic acid sequences can be set for assigning a sample or tumor sample to one of two groups.
  • expression levels of one or more nucleic acid sequences of the invention are combined by taking ratios of expression levels of two nucleic acid sequences and/or by a method such as logistic regression to define a metric which is then compared to previously measured samples or to a threshold.
  • the threshold for assignment is treated as a parameter, which can be used to quantify the confidence with which samples are assigned to each class.
  • the threshold for assignment can be scaled to favor sensitivity or specificity, depending on the clinical scenario.
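  • The trade-off between sensitivity and specificity when moving the assignment threshold can be sketched as follows. This is an illustration only: the function names, the scores, and the convention that a score above the threshold is assigned to the poor-prognosis group are assumptions, and both groups are assumed to be present in the data.

```python
def assign_group(score, threshold):
    """Assign a sample to one of two groups by comparing its score to a threshold."""
    return "poor prognosis" if score > threshold else "good prognosis"


def sensitivity_specificity(scores, is_poor, threshold):
    """Sensitivity and specificity for detecting poor prognosis at a given threshold."""
    tp = sum(1 for s, y in zip(scores, is_poor) if y and s > threshold)
    fn = sum(1 for s, y in zip(scores, is_poor) if y and s <= threshold)
    tn = sum(1 for s, y in zip(scores, is_poor) if not y and s <= threshold)
    fp = sum(1 for s, y in zip(scores, is_poor) if not y and s > threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical expression-derived scores and known outcomes.
scores = [0.2, 0.4, 0.6, 0.8, 0.9]
is_poor = [False, False, True, False, True]

low_cut = sensitivity_specificity(scores, is_poor, 0.5)    # favors sensitivity
high_cut = sensitivity_specificity(scores, is_poor, 0.85)  # favors specificity
```

  • Lowering the threshold catches more poor-prognosis cases at the cost of more false alarms; raising it does the opposite, which is the scaling the text refers to.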
  • the correlation value to the reference data generates a continuous score that can be scaled and provides diagnostic information on the likelihood that a sample belongs to a certain class of renal subtype.
  • the microRNA signature provides a high level of renal cancer prognostic information.
  • each intervening number therebetween with the same degree of precision is explicitly contemplated.
  • For example, for the range 6-9, the numbers 7 and 8 are contemplated in addition to 6 and 9, and for the range 6.0-7.0, the numbers 6.0, 6.1, 6.2, 6.3, 6.4, 6.5, 6.6, 6.7, 6.8, 6.9 and 7.0 are explicitly contemplated.
  • aberrant proliferation means cell proliferation that deviates from the normal, proper, or expected course.
  • aberrant cell proliferation may include inappropriate proliferation of cells whose DNA or other cellular components have become damaged or defective.
  • Aberrant cell proliferation may include cell proliferation whose characteristics are associated with an indication caused by, mediated by, or resulting in inappropriately high levels of cell division, inappropriately low levels of apoptosis, or both.
  • Such indications may be characterized, for example, by single or multiple local abnormal proliferations of cells, groups of cells, or tissue(s), whether cancerous or non-cancerous, benign or malignant.
  • altered expression encompasses over-expression, under-expression, and ectopic expression.
  • the altered expression level is a change in a score based on a combination of expression levels of nucleic acid sequences or any combinations thereof.
  • antisense refers to nucleotide sequences which are complementary to a specific DNA or RNA sequence.
  • antisense strand is used in reference to a nucleic acid strand that is complementary to the "sense" strand.
  • Antisense molecules may be produced by any method, including synthesis by ligating the gene(s) of interest in a reverse orientation to a viral promoter which permits the synthesis of a complementary strand. Once introduced into a cell, this transcribed strand combines with natural sequences produced by the cell to form duplexes. These duplexes then block either the further transcription or translation. In this manner, mutant phenotypes may be generated.
  • Binding or “immobilized” as used herein to refer to a probe and a solid support means that the binding between the probe and the solid support is sufficient to be stable under conditions of binding, washing, analysis, and removal.
  • the binding may be covalent or non-covalent. Covalent bonds may be formed directly between the probe and the solid support or may be formed by a cross linker or by inclusion of a specific reactive group on either the solid support or the probe or both molecules.
  • Non-covalent binding may be one or more of electrostatic, hydrophilic, and hydrophobic interactions. Included in non-covalent binding is the covalent attachment of a molecule, such as streptavidin, to the support and the non-covalent binding of a biotinylated probe to the streptavidin. Immobilization may also involve a combination of covalent and non-covalent interactions.
  • Biological sample as used herein means a sample of biological tissue or fluid that comprises nucleic acids. Such samples include, but are not limited to, tissue or fluid isolated from subjects. Biological samples may also include sections of tissues such as biopsy and autopsy samples, FFPE samples, frozen sections taken for histological purposes, blood, blood fraction, plasma, serum, sputum, stool, tears, mucus, hair, skin, urine, effusions, ascitic fluid, amniotic fluid, saliva, cerebrospinal fluid, cervical secretions, vaginal secretions, endometrial secretions, gastrointestinal secretions, bronchial secretions, cell line, tissue sample, or secretions from the breast.
  • a biological sample may be provided by removing a sample of cells from a subject but can also be accomplished by using previously isolated cells (e.g., isolated by another person, at another time, and/or for another purpose), or by performing the methods described herein in vivo.
  • Archival tissues such as those having treatment or outcome history, may also be used.
  • Biological samples also include explants and primary and/or transformed cell cultures derived from animal or human tissues.
  • cancer is meant to include all types of cancerous growths or oncogenic processes, metastatic tissues or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness.
  • cancers include but are not limited to solid tumors and leukemias, including: apudoma, choristoma, branchioma, malignant carcinoid syndrome, carcinoid heart disease, carcinoma (e.g., Walker, basal cell, basosquamous, Brown-Pearce, ductal, Ehrlich tumor, clear cell RCC, papillary RCC and chromophobe RCC, non-small cell lung (e.g., lung squamous cell carcinoma, lung adenocarcinoma and lung undifferentiated large cell carcinoma), oat cell, papillary, bronchiolar, bronchogenic, squamous cell, and transitional cell), histiocytic disorders, leukemia (e.g., B cell, mixed cell,
  • cancer prognosis includes the forecast or prediction of any one or more of the following: duration of survival of a patient susceptible to or diagnosed with a cancer, duration of recurrence-free survival, duration of progression-free survival of a patient susceptible to or diagnosed with a cancer, response to treatment in a group of patients susceptible to or diagnosed with a cancer, duration of response to treatment in a patient or a group of patients susceptible to or diagnosed with a cancer.
  • prognostic for cancer means providing a forecast or prediction of the probable course or outcome of the cancer.
  • prognostic for cancer comprises providing the forecast or prediction of (prognostic for) any one or more of the following: duration of survival of a patient susceptible to or diagnosed with a cancer, duration of recurrence-free survival, duration of progression-free survival of a patient susceptible to or diagnosed with a cancer, response to treatment in a group of patients susceptible to or diagnosed with a cancer, and duration of response to treatment in a patient or a group of patients susceptible to or diagnosed with a cancer.
  • classification refers to a procedure and/or algorithm in which individual items are placed into groups or classes based on quantitative information on one or more characteristics inherent in the items (referred to as traits, variables, characters, features, etc) and based on a statistical model and/or a training set of previously labeled items.
  • “Complement” or “complementary” as used herein to refer to a nucleic acid may mean Watson-Crick (e.g., A-T/U and C-G) or Hoogsteen base pairing between nucleotides or nucleotide analogs of nucleic acid molecules.
  • a full complement or fully complementary means 100% complementary base pairing between nucleotides or nucleotide analogs of nucleic acid molecules.
  • the complementary sequence has a reverse orientation (5'-3').
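  • Watson-Crick complementarity with reverse orientation can be sketched as below. This is a minimal illustration for RNA only: the function names and the example sequence are hypothetical, and Hoogsteen pairing and nucleotide analogs are not handled.

```python
# Watson-Crick pairing for RNA: A-U and C-G.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq):
    """Return the complementary RNA strand, written 5' to 3'."""
    return seq.upper().translate(RNA_COMPLEMENT)[::-1]


def is_fully_complementary(seq_a, seq_b):
    """True when every base of seq_a pairs with seq_b (100% complementary)."""
    return reverse_complement_rna(seq_a) == seq_b.upper()


# Hypothetical four-base example, not a sequence from the SEQ ID listing.
probe = reverse_complement_rna("AUGC")
```

  • Reversing after complementing keeps both strands written in the conventional 5' to 3' direction.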
  • CT signals represent the first cycle of PCR where amplification crosses a threshold (cycle threshold) of fluorescence. Accordingly, low values of CT represent high abundance or expression levels of the microRNA.
  • the PCR CT signal is normalized such that the normalized CT remains inversely related to the expression level.
  • the PCR CT signal may be normalized and then inverted such that low normalized-inverted CT represents low abundance or expression levels of the microRNA.
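  • The normalization and inversion of cycle threshold values can be sketched as follows. The source does not give the formulas here, so everything specific is an assumption: the reference-based shift, the offset of 50, and the function names are all hypothetical and chosen only to show the direction of each step.

```python
def normalize_ct(raw_ct, sample_reference_ct, global_reference_ct):
    """Shift a raw cycle threshold so that samples share a common reference level.

    Assumed scheme: subtract the sample's own reference value and add a
    reference value shared by all samples. The result is still inversely
    related to expression (low value = high abundance).
    """
    return raw_ct - sample_reference_ct + global_reference_ct


def invert_ct(normalized_ct, offset=50.0):
    """Invert a normalized cycle threshold so that low values mean low abundance.

    The offset of 50 is an illustrative assumption.
    """
    return offset - normalized_ct


# Hypothetical values: the first miR is more abundant (lower raw value).
abundant = invert_ct(normalize_ct(28.0, 25.0, 24.0))
scarce = invert_ct(normalize_ct(31.0, 25.0, 24.0))
```

  • After inversion the more abundant miR has the higher value, so the signal can be read in the same direction as microarray fluorescence.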
  • a “data processing routine” refers to a process that can be embodied in software that determines the biological significance of acquired data (i.e., the ultimate results of an assay or analysis). For example, the data processing routine can make determination of tissue of origin based upon the data collected. In the systems and methods herein, the data processing routine can also control the data collection routine based upon the results determined. The data processing routine and the data collection routines can be integrated and provide feedback to operate the data acquisition, and hence provide assay-based judging methods.
  • data set refers to numerical values obtained from the analysis. These numerical values associated with analysis may be values such as peak height and area under the curve.
  • data structure refers to a combination of two or more data sets, applying one or more mathematical manipulations to one or more data sets to obtain one or more new data sets, or manipulating two or more data sets into a form that provides a visual illustration of the data in a new way.
  • An example of a data structure prepared from manipulation of two or more data sets would be a hierarchical cluster.
  • Detection means detecting the presence of a component in a sample. Detection also means detecting the absence of a component. Detection also means determining the level of a component, either quantitatively or qualitatively.
  • differential expression means qualitative or quantitative differences in the temporal and/or spatial gene expression patterns within and among cells and tissue.
  • a differentially expressed gene may qualitatively have its expression altered, including an activation or inactivation, in, e.g., normal versus diseased tissue. Genes may be turned on or turned off in a particular state, relative to another state thus permitting comparison of two or more states.
  • a qualitatively regulated gene may exhibit an expression pattern within a state or cell type which may be detectable by standard techniques. Some genes may be expressed in one state or cell type, but not in both.
  • the difference in expression may be quantitative, e.g., in that expression is modulated, up-regulated, resulting in an increased amount of transcript, or down-regulated, resulting in a decreased amount of transcript.
  • the degree to which expression differs needs only be large enough to quantify via standard characterization techniques such as expression arrays, quantitative reverse transcriptase PCR, Northern blot analysis, real-time PCR, in situ hybridization and RNase protection.
  • expression profile is used broadly to include a genomic expression profile, e.g., an expression profile of microRNAs. Profiles may be generated by any convenient means for determining a level of a nucleic acid sequence e.g. quantitative hybridization of microRNA, labeled microRNA, amplified microRNA, cDNA, etc., quantitative PCR, ELISA for quantitation, and the like, and allow the analysis of differential gene expression between two samples.
  • a subject or patient tumor sample e.g., cells or collections thereof, e.g., tissues, is assayed. Samples are collected by any convenient method, as known in the art.
  • Nucleic acid sequences of interest are nucleic acid sequences that are found to be predictive, including the nucleic acid sequences provided above, where the expression profile may include expression data for 5, 10, 20, 25, 50, 100 or more of, including all of the listed nucleic acid sequences.
  • expression profile means measuring the abundance of the nucleic acid sequences in the measured samples.
  • “Expression ratio” as used herein refers to relative expression levels of two or more nucleic acids as determined by detecting the relative expression levels of the corresponding nucleic acids in a biological sample.
  • Fragment is used herein to indicate a non-full length part of a nucleic acid or polypeptide.
  • a fragment is itself also a nucleic acid or polypeptide, respectively.
  • Gene as used herein may be a natural (e.g., genomic) or synthetic gene comprising transcriptional and/or translational regulatory sequences and/or a coding region and/or non-translated sequences (e.g., introns, 5'- and 3'-untranslated sequences).
  • the coding region of a gene may be a nucleotide sequence coding for an amino acid sequence or a functional RNA, such as tRNA, rRNA, catalytic RNA, siRNA, miRNA or antisense RNA.
  • a gene may also be an mRNA or cDNA corresponding to the coding regions (e.g., exons and miRNA) optionally comprising 5'- or 3'-untranslated sequences linked thereto.
  • a gene may also be an amplified nucleic acid molecule produced in vitro comprising all or a part of the coding region and/or 5'- or 3'-untranslated sequences linked thereto.
  • “Groove binder” and/or “minor groove binder” may be used interchangeably and refer to small molecules that fit into the minor groove of double-stranded DNA, typically in a sequence-specific manner.
  • Minor groove binders may be long, flat molecules that can adopt a crescent-like shape and thus, fit snugly into the minor groove of a double helix, often displacing water.
  • Minor groove binding molecules may typically comprise several aromatic rings connected by bonds with torsional freedom such as furan, benzene, or pyrrole rings.
  • Minor groove binders may be antibiotics such as netropsin, distamycin, berenil, pentamidine and other aromatic diamidines, Hoechst 33258, SN 6999, aureolic antitumor drugs such as chromomycin and mithramycin, CC-1065, dihydrocyclopyrroloindole tripeptide (DPI3), 1,2-dihydro-(3H)-pyrrolo[3,2-e]indole-7-carboxylate (CDPI3), and related compounds and analogues, including those described in Nucleic Acids in Chemistry and Biology, 2d ed., Blackburn and Gait, eds., Oxford University Press, 1996, and PCT Published Application No.
  • a minor groove binder may be a component of a primer, a probe, a hybridization tag complement, or combinations thereof. Minor groove binders may increase the Tm of the primer or a probe to which they are attached, allowing such primers or probes to effectively hybridize at higher temperatures.
  • “Identical” or “identity” as used herein in the context of two or more nucleic acids or polypeptide sequences means that the sequences have a specified percentage of residues that are the same over a specified region. The percentage may be calculated by optimally aligning the two sequences, comparing the two sequences over the specified region, determining the number of positions at which the identical residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the specified region, and multiplying the result by 100 to yield the percentage of sequence identity.
  • the residues of a single sequence are included in the denominator but not the numerator of the calculation.
  • thymine (T) and uracil (U) may be considered equivalent.
  • Identity may be determined manually or by using a computer sequence algorithm such as BLAST or BLAST 2.0.
  • In situ detection means the detection of expression or expression levels in the original site, here meaning in a tissue sample such as a biopsy.
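  • The percentage calculation described above can be sketched for two sequences that have already been aligned. This is a simplified illustration, not a replacement for BLAST: the function name is hypothetical, the alignment is assumed to be supplied with "-" marking gaps, and T and U are treated as equivalent as the text allows.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over an aligned region of two nucleic acid sequences.

    Gap positions ("-") count toward the denominator but never toward the
    number of matched positions.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    a = aligned_a.upper().replace("U", "T")  # treat uracil and thymine as equivalent
    b = aligned_b.upper().replace("U", "T")
    matched = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matched / len(a)


# Hypothetical sequences: an RNA compared with a DNA that differs at one position.
identity = percent_identity("ACGUACGUAC", "ACGTACGTAA")
meets_80_percent = identity >= 80.0
```

  • With nine of ten positions matching, this example clears the "at least about 80% identical" level used throughout the claims.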
  • Label as used herein means a composition detectable by spectroscopic, photochemical, biochemical, immunochemical, chemical, or other physical means.
  • useful labels include 32P, fluorescent dyes, electron-dense reagents, enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and other entities which can be made detectable.
  • a label may be incorporated into nucleic acids and proteins at any position.
  • Logistic regression is part of a category of statistical models called generalized linear models. Logistic regression allows one to predict a discrete outcome, such as group membership, from a set of variables that may be continuous, discrete, dichotomous, or a mix of any of these.
  • the dependent or response variable can be dichotomous, for example, one of two possible types of cancer.
  • Logistic regression models the natural log of the odds, i.e. the ratio of the probability of belonging to the first group (P) over the probability of belonging to the second group (1-P), as a linear combination of the different expression levels (or mathematical functions thereof).
  • the logistic regression output can be used as a classifier by prescribing that a case or sample will be classified into the first group if P is greater than 0.5 or 50%.
  • the calculated probability P can be used as a variable in other contexts such as a 1D or 2D threshold classifier.
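The logistic model described above can be sketched as follows. The model sets ln(P / (1 - P)) equal to an intercept plus a weighted sum of expression levels, so P is recovered with the logistic function. The coefficient values, the intercept and the function names here are hypothetical placeholders; in practice they would be fitted to training data.

```python
import math
from typing import Sequence


def logistic_probability(expression: Sequence[float],
                         coefficients: Sequence[float],
                         intercept: float) -> float:
    """Probability P of belonging to the first group.

    The log-odds ln(P / (1 - P)) is modelled as a linear combination of
    the expression levels (hypothetical coefficients).
    """
    if len(expression) != len(coefficients):
        raise ValueError("one coefficient is needed per expression level")
    log_odds = intercept + sum(c * x for c, x in zip(coefficients, expression))
    return 1.0 / (1.0 + math.exp(-log_odds))


def classify(expression: Sequence[float],
             coefficients: Sequence[float],
             intercept: float,
             cutoff: float = 0.5) -> str:
    """Assign the sample to the first group when P exceeds the cutoff."""
    p = logistic_probability(expression, coefficients, intercept)
    return "first group" if p > cutoff else "second group"
```

This sketch does not guard against floating-point overflow for extremely negative log-odds; a fitted model from a statistics library would normally be used instead.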
  • 1D/2D threshold classifier used herein may mean an algorithm for classifying a case or sample such as a cancer sample into one of two possible types such as two types of cancer.
  • In a 1D threshold classifier, the decision is based on one variable and one predetermined threshold value; the sample is assigned to one class if the variable exceeds the threshold and to the other class if the variable is less than the threshold.
  • a 2D threshold classifier is an algorithm for classifying into one of two types based on the values of two variables.
  • a threshold may be calculated as a function (usually a continuous or even a monotonic function) of the first variable; the decision is then reached by comparing the second variable to the calculated threshold, similar to the 1D threshold classifier.
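A minimal sketch of the 1D and 2D threshold classifiers described above is given here. The class labels and function names are hypothetical, and assigning a value exactly equal to the threshold to the second class is an assumption, since the text does not specify that case.

```python
from typing import Callable


def classify_1d(value: float, threshold: float) -> str:
    """One variable compared against one predetermined threshold."""
    # Equality is assigned to "class 2" here; this tie rule is an assumption.
    return "class 1" if value > threshold else "class 2"


def classify_2d(first: float, second: float,
                threshold_fn: Callable[[float], float]) -> str:
    """Two variables: the threshold is computed from the first variable,
    then the second variable is compared to it as in the 1D case."""
    return classify_1d(second, threshold_fn(first))
```

For instance, with a hypothetical linear threshold function `lambda x: 2 * x + 0.5`, a sample with first variable 2.0 is compared against a threshold of 4.5.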
  • NPV: Negative predictive value
  • Nucleic acid or "oligonucleotide” or “polynucleotide”, as used herein means at least two nucleotides covalently linked together.
  • the depiction of a single strand also defines the sequence of the complementary strand.
  • a nucleic acid also encompasses the complementary strand of a depicted single strand.
  • Many variants of a nucleic acid may be used for the same purpose as a given nucleic acid.
  • a nucleic acid also encompasses substantially identical nucleic acids and complements thereof.
  • a single strand provides a probe that may hybridize to a target sequence under stringent hybridization conditions.
  • a nucleic acid also encompasses a probe that hybridizes under stringent hybridization conditions.
  • Nucleic acids may be single stranded or double stranded, or may contain portions of both double stranded and single stranded sequences.
  • the nucleic acid may be DNA, both genomic and cDNA, RNA, or a hybrid, where the nucleic acid may contain combinations of deoxyribo- and ribo-nucleotides, and combinations of bases including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine and isoguanine.
  • Nucleic acids may be obtained by chemical synthesis methods or by recombinant methods.
  • a nucleic acid will generally contain phosphodiester bonds, although nucleic acid analogs may be included that may have at least one different linkage, e.g., phosphoramidate, phosphorothioate, phosphorodithioate, or O-methylphosphoroamidite linkages and peptide nucleic acid backbones and linkages.
  • Other analog nucleic acids include those with positive backbones; non-ionic backbones, and non-ribose backbones, including those described in U.S. Pat. Nos. 5,235,033 and 5,034,506, which are incorporated herein by reference.
  • Nucleic acids containing one or more non-naturally occurring or modified nucleotides are also included within one definition of nucleic acids.
  • the modified nucleotide analog may be located for example at the 5'-end and/or the 3'-end of the nucleic acid molecule.
  • Representative examples of nucleotide analogs may be selected from sugar- or backbone-modified ribonucleotides. It should be noted, however, that also nucleobase-modified ribonucleotides, i.e. ribonucleotides, containing a non-naturally occurring nucleobase instead of a naturally occurring nucleobase such as uridines or cytidines modified at the 5-position, e.g.
  • the 2'-OH group may be replaced by a group selected from H, OR, R, halo, SH, SR, NH2, NHR, NR2 or CN, wherein R is C1-C6 alkyl, alkenyl or alkynyl and halo is F, Cl, Br or I.
  • Modified nucleotides also include nucleotides conjugated with cholesterol through, e.g., a hydroxyprolinol linkage as described in Krutzfeldt et al., Nature 438:685-689 (2005), Soutschek et al., Nature 432:173-178 (2004), and U.S. Patent Publication No. 20050107325, which are incorporated herein by reference.
  • Modifications of the ribose-phosphate backbone may be done for a variety of reasons, e.g., to increase the stability and half-life of such molecules in physiological environments, to enhance diffusion across cell membranes, or as probes on a biochip.
  • the backbone modification may also enhance resistance to degradation, such as in the harsh endocytic environment of cells.
  • the backbone modification may also reduce nucleic acid clearance by hepatocytes, such as in the liver and kidney. Mixtures of naturally occurring nucleic acids and analogs may be made; alternatively, mixtures of different nucleic acid analogs, and mixtures of naturally occurring nucleic acids and analogs may be made.
  • PPV: Positive predictive value
  • PPV may mean a statistical measure of how well a binary classification test correctly identifies a condition, for example the probability of a patient to have specific condition, given a positive diagnosis.
  • the PPV for class A is the proportion of cases that are correctly diagnosed as belonging to class "A” by the test out of the cases that are diagnosed as belonging to class "A”, as determined by some absolute or gold standard.
  • Probe as used herein means an oligonucleotide capable of binding to a target nucleic acid of complementary sequence through one or more types of chemical bonds, usually through complementary base pairing, usually through hydrogen bond formation. Probes may bind target sequences lacking complete complementarity with the probe sequence depending upon the stringency of the hybridization conditions. There may be any number of base pair mismatches which will interfere with hybridization between the target sequence and the single stranded nucleic acids described herein. However, if the number of mutations is so great that no hybridization can occur under even the least stringent of hybridization conditions, the sequence is not a complementary target sequence.
  • a probe may be single stranded or partially single and partially double stranded. The strandedness of the probe is dictated by the structure, composition, and properties of the target sequence. Probes may be directly labeled or indirectly labeled such as with biotin to which a streptavidin complex may later bind.
  • the phrase "reference expression profile” refers to a criterion expression profile to which measured values are compared in order to determine the prognosis of a subject with renal cancer.
  • the reference expression profile may be based on the abundance of the nucleic acids, or may be based on a combined metric score thereof.
  • reference value means a value that statistically correlates to a particular outcome when compared to an assay result.
  • the reference value is determined from statistical analysis of studies that compare microRNA expression with known clinical outcomes.
  • the reference value may be a threshold score value or a cutoff score value. Typically a reference value will be a threshold above which one outcome is more probable and below which an alternative outcome is more probable.
  • sensitivity used herein may mean a statistical measure of how well a binary classification test correctly identifies a condition, for example how frequently it correctly classifies a cancer into the correct type out of two possible types.
  • the sensitivity for class A is the proportion of cases that are determined to belong to class "A” by the test out of the cases that are in class "A", as determined by some absolute or gold standard.
  • Specificity used herein may mean a statistical measure of how well a binary classification test correctly identifies a condition, for example how frequently it correctly classifies a cancer into the correct type out of two possible types.
  • the specificity for class A is the proportion of cases that are determined to belong to class "not A” by the test out of the cases that are in class "not A”, as determined by some absolute or gold standard.
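The sensitivity, specificity, PPV and NPV defined in this section can all be computed from the four counts of a two-class confusion matrix. The sketch below assumes class "A" is the positive class and that the counts come from comparison with a gold standard; the function name and the dictionary keys are illustrative.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Performance measures for class "A" of a binary classification test.

    tp: in class A and called A        fn: in class A but called not-A
    tn: not in A and called not-A      fp: not in A but called A
    """
    return {
        # Cases called A out of all cases truly in A.
        "sensitivity": tp / (tp + fn),
        # Cases called not-A out of all cases truly not in A.
        "specificity": tn / (tn + fp),
        # Cases truly in A out of all cases called A.
        "ppv": tp / (tp + fp),
        # Cases truly not in A out of all cases called not-A.
        "npv": tn / (tn + fn),
    }
```

The sketch does not handle empty denominators (for example, a test that never calls class A); a caller would need to decide how to report those cases.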
  • stage of cancer refers to a numerical measurement of the level of advancement of a cancer. Criteria used to determine the stage of a cancer include, but are not limited to, the size of the tumor, whether the tumor has spread to other parts of the body and where the cancer has spread (e.g., within the same organ or region of the body or to another organ).
  • Stringent hybridization conditions mean conditions under which a first nucleic acid sequence (e.g., probe) will hybridize to a second nucleic acid sequence (e.g., target), such as in a complex mixture of nucleic acids. Stringent conditions are sequence-dependent and will be different in different circumstances. Stringent conditions may be selected to be about 5-10°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm may be the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50% of the probes complementary to the target hybridize to the target sequence at equilibrium (as the target sequences are present in excess, at Tm, 50% of the probes are occupied at equilibrium).
  • Tm: thermal melting point
  • Stringent conditions may be those in which the salt concentration is less than about 1.0 M sodium ion, such as about 0.01-1.0 M sodium ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C for short probes (e.g., about 10-50 nucleotides) and at least about 60°C for long probes (e.g., greater than about 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. For selective or specific hybridization, a positive signal may be at least 2 to 10 times background hybridization.
  • Exemplary stringent hybridization conditions include the following: 50% formamide, 5x SSC, and 1% SDS, incubating at 42°C, or, 5x SSC, 1% SDS, incubating at 65°C, with wash in 0.2x SSC, and 0.1% SDS at 65°C.
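To illustrate the relationship between Tm and a stringent temperature window, the sketch below uses the Wallace rule, Tm ≈ 2 °C per A/T plus 4 °C per G/C, which is a common rule of thumb for short oligonucleotides. The Wallace rule is not taken from this document and is only a rough approximation that ignores salt and formamide; the function names are hypothetical.

```python
from typing import Tuple


def wallace_tm(sequence: str) -> float:
    """Rough Tm estimate in degrees C for a short oligonucleotide.

    Wallace rule (an assumption, not from the source text):
    2 degrees per A or T, 4 degrees per G or C.
    """
    seq = sequence.upper().replace("U", "T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2.0 * at + 4.0 * gc


def stringent_window(tm: float) -> Tuple[float, float]:
    """Temperature range about 5-10 degrees C below Tm, as (low, high)."""
    return (tm - 10.0, tm - 5.0)
```

For real assay design, a nearest-neighbor thermodynamic model with explicit salt correction would normally be preferred over this estimate.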
  • Substantially complementary as used herein means that a first sequence is at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98% or 99% identical to the complement of a second sequence over a region of 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100 or more nucleotides, or that the two sequences hybridize under stringent hybridization conditions.
  • Substantially identical as used herein means that a first and a second sequence are at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98% or 99% identical over a region of 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100 or more nucleotides or amino acids, or with respect to nucleic acids, if the first sequence is substantially complementary to the complement of the second sequence.
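The "substantially complementary" test can be sketched as an identity comparison between the first sequence and the reverse complement of the second. This simplified version assumes an ungapped comparison of two equal-length regions and uses the lowest listed values (60% identity, 8 nucleotides) as defaults; it does not model the alternative hybridization criterion, and the function names are illustrative.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Reverse complement in DNA letters; U is treated as T."""
    return sequence.upper().replace("U", "T").translate(_COMPLEMENT)[::-1]


def substantially_complementary(first: str, second: str,
                                min_identity: float = 60.0,
                                min_region: int = 8) -> bool:
    """True if `first` is at least `min_identity` percent identical to the
    complement of `second` over an ungapped region of equal length."""
    a = first.upper().replace("U", "T")
    b = reverse_complement(second)
    if len(a) != len(b) or len(a) < min_region:
        return False
    matched = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matched / len(a) >= min_identity
```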
  • subtype of cancer refers to different types of cancer that affect the same organ (e.g., spindle cell, cystic and collecting duct carcinomas of the kidney).
  • the term "subject” refers to a mammal, including both human and other mammals.
  • the methods of the present invention are preferably applied to human subjects.
  • Target nucleic acid as used herein means a nucleic acid or variant thereof that may be bound by another nucleic acid.
  • a target nucleic acid may be a DNA sequence.
  • the target nucleic acid may be RNA.
  • the target nucleic acid may comprise a mRNA, tRNA, shRNA, siRNA or Piwi-interacting RNA, or a pri-miRNA, pre-miRNA, miRNA, or anti-miRNA.
  • the target nucleic acid may comprise a target miRNA binding site or a variant thereof.
  • One or more probes may bind the target nucleic acid.
  • the target binding site may comprise 5-100 or 10-60 nucleotides.
  • the target binding site may comprise a total of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30-40, 40-50, 50-60, 61, 62 or 63 nucleotides.
  • the target site sequence may comprise at least 5 nucleotides of the sequence of a target miRNA binding site disclosed in WO2006/092738, U.S. Patent Applications 11/418,718 or 11/429,720, the contents of which are incorporated herein.
  • threshold expression level refers to a reference expression value. Measured values are compared to a corresponding threshold expression level to determine the prognosis of a subject.
  • tissue sample is tissue obtained from a tissue biopsy using methods well known to those of ordinary skill in the related medical arts.
  • the phrase "suspected of being cancerous" as used herein means a cancer tissue sample believed by one of ordinary skill in the medical arts to contain cancerous cells. Methods for obtaining the sample from the biopsy include gross apportioning of a mass, microdissection, laser-based microdissection, or other art-known cell-separation methods.
  • Tumor refers to all neoplastic cell growth and proliferation, whether malignant or benign, and all pre-cancerous and cancerous cells and tissues.
  • Variant, as used herein to refer to a nucleic acid, means (i) a portion of a referenced nucleotide sequence; (ii) the complement of a referenced nucleotide sequence or portion thereof; (iii) a nucleic acid that is substantially identical to a referenced nucleic acid or the complement thereof; or (iv) a nucleic acid that hybridizes under stringent conditions to the referenced nucleic acid, complement thereof, or a sequence substantially identical thereto.
  • wild type sequence refers to a coding, a non-coding or an interface sequence which is an allelic form of sequence that performs the natural or normal function for that sequence. Wild type sequences include multiple allelic forms of a cognate sequence, for example, multiple alleles of a wild type sequence may encode silent or conservative changes to the protein sequence that a coding sequence encodes.
  • the present invention employs miRNAs for the identification, classification and diagnosis of specific cancers and the identification of their tissues of origin.
  • a gene coding for microRNA may be transcribed leading to production of a miRNA primary transcript known as the pri-miRNA.
  • the pri-miRNA may comprise a hairpin with a stem and loop structure.
  • the stem of the hairpin may comprise mismatched bases.
  • the pri-miRNA may comprise several hairpins in a polycistronic structure.
  • the hairpin structure of the pri-miRNA may be recognized by Drosha, which is an RNase III endonuclease. Drosha may recognize terminal loops in the pri-miRNA and cleave approximately two helical turns into the stem to produce a 60-70 nt precursor known as the pre-miRNA. Drosha may cleave the pri-miRNA with a staggered cut typical of RNase III endonucleases yielding a pre-miRNA stem loop with a 5' phosphate and ~2 nucleotide 3' overhang. Approximately one helical turn of stem (~10 nucleotides) extending beyond the Drosha cleavage site may be essential for efficient processing.
  • the pre-miRNA may then be actively transported from the nucleus to the cytoplasm by Ran-GTP and the export receptor Exportin-5.
  • the pre-miRNA may be recognized by Dicer, which is also an RNase III endonuclease. Dicer may recognize the double-stranded stem of the pre-miRNA. Dicer may also cut off the terminal loop two helical turns away from the base of the stem loop leaving an additional 5' phosphate and ~2 nucleotide 3' overhang.
  • the resulting siRNA-like duplex, which may comprise mismatches, comprises the mature miRNA and a similar-sized fragment known as the miRNA*.
  • the miRNA and miRNA* may be derived from opposing arms of the pri-miRNA and pre-miRNA. MiRNA* sequences may be found in libraries of cloned miRNAs but typically at lower frequency than the miRNAs.
  • RISC: RNA-induced silencing complex
  • When the miRNA strand of the miRNA:miRNA* duplex is loaded into the RISC, the miRNA* may be removed and degraded.
  • the strand of the miRNA:miRNA* duplex that is loaded into the RISC may be the strand whose 5' end is less tightly paired. In cases where both ends of the miRNA:miRNA* have roughly equivalent 5' pairing, both miRNA and miRNA* may have gene silencing activity.
  • the RISC may identify target nucleic acids based on high levels of complementarity between the miRNA and the mRNA, especially by nucleotides 2-7 of the miRNA. Only one case has been reported in animals where the interaction between the miRNA and its target was along the entire length of the miRNA. This was shown for mir-196 and Hox B8 and it was further shown that mir-196 mediates the cleavage of the Hox B8 mRNA (Yekta et al 2004, Science 304-594). Otherwise, such interactions are known only in plants (Bartel & Bartel 2003, Plant Physiol 132-709).
  • the target sites in the mRNA may be in the 5' UTR, the 3' UTR or in the coding region.
  • multiple miRNAs may regulate the same mRNA target by recognizing the same or multiple sites.
  • the presence of multiple miRNA binding sites in most genetically identified targets may indicate that the cooperative action of multiple RISCs provides the most efficient translational inhibition.
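As a simplified illustration of target recognition through nucleotides 2-7 of the miRNA, the sketch below scans an mRNA for exact matches to the reverse complement of that six-nucleotide region. Requiring a perfect Watson-Crick match over those six positions is a simplifying assumption, and the function name and example inputs are illustrative rather than taken from the sequence listing.

```python
from typing import List

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def seed_match_positions(mirna: str, mrna: str) -> List[int]:
    """0-based start positions in `mrna` that pair perfectly with
    nucleotides 2-7 of `mirna` (both written 5' to 3')."""
    def normalize(seq: str) -> str:
        return seq.upper().replace("U", "T")

    seed = normalize(mirna)[1:7]  # nucleotides 2-7 in 1-based numbering
    if len(seed) < 6:
        raise ValueError("miRNA must be at least 7 nucleotides long")
    site = seed.translate(_COMPLEMENT)[::-1]  # sequence expected in the mRNA
    target = normalize(mrna)
    return [i for i in range(len(target) - len(site) + 1)
            if target[i:i + len(site)] == site]
```

Several positions returned for one mRNA would correspond to the multiple binding sites mentioned above.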
  • miRNAs may direct the RISC to downregulate gene expression by either of two mechanisms: mRNA cleavage or translational repression.
  • the miRNA may specify cleavage of the mRNA if the mRNA has a certain degree of complementarity to the miRNA. When a miRNA guides cleavage, the cut may be between the nucleotides pairing to residues 10 and 11 of the miRNA.
  • the miRNA may repress translation if the mRNA does not have the requisite degree of complementarity to the miRNA. Translational repression may be more prevalent in animals since animals may have a lower degree of complementarity between the miRNA and binding site.
  • There may be variability in the 5' and 3' ends of any pair of miRNA and miRNA*. This variability may be due to variability in the enzymatic processing of Drosha and Dicer with respect to the site of cleavage. Variability at the 5' and 3' ends of miRNA and miRNA* may also be due to mismatches in the stem structures of the pri-miRNA and pre-miRNA. The mismatches of the stem strands may lead to a population of different hairpin structures. Variability in the stem structures may also lead to variability in the products of cleavage by Drosha and Dicer.
  • Nucleic acids are provided herein.
  • the nucleic acids comprise the sequences of SEQ ID NOS: 1-92 or variants thereof.
  • the variant may be a complement of the referenced nucleotide sequence.
  • the variant may also be a nucleotide sequence that is substantially identical to the referenced nucleotide sequence or the complement thereof.
  • the variant may also be a nucleotide sequence which hybridizes under stringent conditions to the referenced nucleotide sequence, complements thereof, or nucleotide sequences substantially identical thereto.
  • the nucleic acid may have a length of from about 10 to about 250 nucleotides.
  • the nucleic acid may have a length of at least 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 125, 150, 175, 200 or 250 nucleotides.
  • the nucleic acid may be synthesized or expressed in a cell (in vitro or in vivo) using a synthetic gene described herein.
  • the nucleic acid may be synthesized as a single strand molecule and hybridized to a substantially complementary nucleic acid to form a duplex.
  • the nucleic acid may be introduced to a cell, tissue or organ in a single- or double-stranded form or capable of being expressed by a synthetic gene using methods well known to those skilled in the art, including as described in U.S. Patent No. 6,506,559 which is incorporated by reference.
  • Table 1 The nucleic acids of the invention (miRs and related hairpins)
  • the nucleic acid may further comprise one or more of the following: a peptide, a protein, a RNA-DNA hybrid, an antibody, an antibody fragment, a Fab fragment, and an aptamer.
  • the nucleic acid may comprise a sequence of a pri-miRNA or a variant thereof.
  • the pri-miRNA sequence may comprise from 45-30,000, 50-25,000, 100-20,000, 1,000-1,500 or 80-100 nucleotides.
  • the sequence of the pri-miRNA may comprise a pre-miRNA, miRNA and miRNA*, as set forth herein, and variants thereof.
  • the sequence of the pri-miRNA may comprise any of the sequences of SEQ ID NOS: 1-60 or variants thereof.
  • the pri-miRNA may comprise a hairpin structure.
  • the hairpin may comprise a first and a second nucleic acid sequence that are substantially complementary.
  • the first and second nucleic acid sequence may be from 37-50 nucleotides.
  • the first and second nucleic acid sequence may be separated by a third sequence of from 8-12 nucleotides.
  • the hairpin structure may have a free energy of less than -25 Kcal/mole as calculated by the Vienna algorithm with default parameters, as described in Hofacker et al., Monatshefte f. Chemie 125: 167-188 (1994), the contents of which are incorporated herein by reference.
  • the hairpin may comprise a terminal loop of 4-20, 8-12 or 10 nucleotides.
  • the pri-miRNA may comprise at least 19% adenosine nucleotides, at least 16% cytosine nucleotides, at least 23% thymine nucleotides and at least 19% guanine nucleotides.
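The nucleotide composition criteria stated above can be checked with a short routine such as the following sketch. It counts U as T so that RNA-notation input can be used, which is an assumption based on the T/U equivalence noted earlier; the function and constant names are illustrative.

```python
from typing import Dict

# Minimum percentages taken from the composition criteria above.
MIN_PERCENT: Dict[str, float] = {"A": 19.0, "C": 16.0, "T": 23.0, "G": 19.0}


def base_percentages(sequence: str) -> Dict[str, float]:
    """Percentage of A, C, G and T (U counted as T) in the sequence."""
    seq = sequence.upper().replace("U", "T")
    if not seq:
        raise ValueError("sequence is empty")
    return {base: 100.0 * seq.count(base) / len(seq) for base in "ACGT"}


def meets_composition(sequence: str) -> bool:
    """True if every base reaches its minimum percentage."""
    pct = base_percentages(sequence)
    return all(pct[base] >= minimum for base, minimum in MIN_PERCENT.items())
```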
  • the nucleic acid may also comprise a sequence of a pre-miRNA or a variant thereof.
  • the pre-miRNA sequence may comprise from 45-90, 60-80 or 60-70 nucleotides.
  • the sequence of the pre-miRNA may comprise a miRNA and a miRNA* as set forth herein.
  • the sequence of the pre-miRNA may also be that of a pri-miRNA excluding from 0-160 nucleotides from the 5' and 3' ends of the pri-miRNA.
  • the sequence of the pre-miRNA may comprise the sequence of SEQ ID NOS: 1-60 or variants thereof.
  • the nucleic acid may also comprise a sequence of a miRNA (including miRNA*) or a variant thereof.
  • the miRNA sequence may comprise from 13-33, 18-24 or 21-23 nucleotides.
  • the miRNA may also comprise a total of at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 or 40 nucleotides.
  • the sequence of the miRNA may be the first 13-33 nucleotides of the pre-miRNA.
  • the sequence of the miRNA may also be the last 13-33 nucleotides of the pre-miRNA.
  • the sequence of the miRNA may comprise the sequence of SEQ ID NOS: 1-5, 11-27, 45-49, 51, 52, 54-55, 57, 59 or variants thereof.
  • a probe is also provided comprising a nucleic acid described herein. Probes may be used for screening and diagnostic methods, as outlined below. The probe may be attached or immobilized to a solid substrate, such as a biochip.
  • the probe may have a length of from 8 to 500, 10 to 100 or 20 to 60 nucleotides.
  • the probe may also have a length of at least 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 120, 140, 160, 180, 200, 220, 240, 260, 280 or 300 nucleotides.
  • the probe may further comprise a linker sequence of from 10-60 nucleotides.
  • the probe may be a test probe.
  • the test probe may comprise a nucleic acid sequence that is complementary to a miRNA, a miRNA*, a pre-miRNA, or a pri-miRNA.
  • the sequence of the test probe may be complementary to a sequence selected from SEQ ID NOS: 1-60; fragments or variants thereof.
  • the probe may further comprise a linker.
  • the linker may be 10-60 nucleotides in length.
  • the linker may be 20-27 nucleotides in length.
  • the linker may be of sufficient length to allow the probe to be a total length of 45-60 nucleotides.
  • the linker may not be capable of forming a stable secondary structure, or may not be capable of folding on itself, or may not be capable of folding on a non-linker portion of a nucleic acid contained in the probe.
  • the sequence of the linker may not appear in the genome of the animal from which the probe non-linker nucleic acid is derived.
  • Target sequences of a cDNA may be generated by reverse transcription of the target RNA.
  • Methods for generating cDNA may be reverse transcribing polyadenylated RNA or alternatively, RNA with a ligated adaptor sequence.
  • the RNA may be ligated to an adapter sequence prior to reverse transcription.
  • a ligation reaction may be performed by T4 RNA ligase to ligate an adaptor sequence at the 3' end of the RNA.
  • Reverse transcription (RT) reaction may then be performed using a primer comprising a sequence that is complementary to the 3' end of the adaptor sequence.
  • Polyadenylated RNA may be used in a reverse transcription (RT) reaction using a poly(T) primer comprising a 5' adaptor sequence.
  • the poly(T) sequence may comprise 8, 9, 10, 11, 12, 13, or 14 consecutive thymines.
  • the reverse transcript of the RNA may be amplified by real time PCR, using a specific forward primer comprising at least 15 nucleic acids complementary to the target nucleic acid and a 5' tail sequence; a reverse primer that is complementary to the 3' end of the adaptor sequence; and a probe comprising at least 8 nucleic acids complementary to the target nucleic acid.
  • the probe may be partially complementary to the 5' end of the adaptor sequence.
  • the amplification may be by a method comprising PCR.
  • the first cycles of the PCR reaction may have an annealing temp of 56°C, 57°C, 58°C, 59°C, or 60°C.
  • the first cycles may comprise 1-10 cycles.
  • the remaining cycles of the PCR reaction may be 60°C.
  • the remaining cycles may comprise 2-40 cycles.
  • the annealing temperature may cause the PCR to be more sensitive.
  • the PCR may generate longer products that can serve as higher stringency PCR templates.
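The two-phase annealing profile described above can be written out as a per-cycle list of temperatures. This is only a sketch of the stated ranges (1-10 initial cycles at 56-60°C, then 2-40 cycles at 60°C); the function name and the validation behavior are assumptions.

```python
from typing import List


def annealing_schedule(first_cycles: int, first_temp_c: float,
                       remaining_cycles: int,
                       remaining_temp_c: float = 60.0) -> List[float]:
    """Annealing temperature for each PCR cycle, in order."""
    if not 1 <= first_cycles <= 10:
        raise ValueError("first phase should comprise 1-10 cycles")
    if first_temp_c not in (56.0, 57.0, 58.0, 59.0, 60.0):
        raise ValueError("first-phase annealing should be 56-60 degrees C")
    if not 2 <= remaining_cycles <= 40:
        raise ValueError("second phase should comprise 2-40 cycles")
    return [first_temp_c] * first_cycles + [remaining_temp_c] * remaining_cycles
```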
  • the PCR reaction may comprise a forward primer.
  • the forward primer may comprise 15, 16, 17, 18, 19, 20, or 21 nucleotides identical to the target nucleic acid.
  • the 3' end of the forward primer may be sensitive to differences in sequence between a target nucleic acid and a sibling nucleic acid.
  • the forward primer may also comprise a 5' overhanging tail.
  • the 5' tail may increase the melting temperature of the forward primer.
  • the sequence of the 5' tail may comprise a sequence that is non-identical to the genome of the animal from which the target nucleic acid is isolated.
  • the sequence of the 5' tail may also be synthetic.
  • the 5' tail may comprise 8, 9, 10, 11, 12, 13, 14, 15, or 16 nucleotides.
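A forward primer of the kind described above, a 5' tail followed by 15-21 nucleotides identical to the target, could be assembled as in this sketch. Taking the identical region from the 5' end of the target and writing the primer in DNA letters are assumptions made for illustration, as are the function name and the example tail; the sketch does not check that the tail is absent from any genome.

```python
def build_forward_primer(target: str, tail: str, n_identical: int = 18) -> str:
    """Concatenate a 5' tail with the first `n_identical` target nucleotides.

    The primer is returned 5' to 3' in DNA letters (U written as T).
    """
    if not 15 <= n_identical <= 21:
        raise ValueError("identical region should be 15-21 nucleotides")
    if not 8 <= len(tail) <= 16:
        raise ValueError("5' tail should be 8-16 nucleotides")
    dna_target = target.upper().replace("U", "T")
    if len(dna_target) < n_identical:
        raise ValueError("target is shorter than the requested identical region")
    return tail.upper() + dna_target[:n_identical]
```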
  • the PCR reaction may comprise a reverse primer.
  • the reverse primer may be complementary to a target nucleic acid.
  • the reverse primer may also comprise a sequence complementary to an adaptor sequence.
  • the reverse primer may comprise SEQ ID NO: 61 (GCGAGCACAGAATTAATACGAC).
  • a biochip is also provided.
  • the biochip may comprise a solid substrate comprising an attached probe or plurality of probes described herein.
  • the probes may be capable of hybridizing to a target sequence under stringent hybridization conditions.
  • the probes may be attached at spatially defined addresses on the substrate. More than one probe per target sequence may be used, with either overlapping probes or probes to different sections of a particular target sequence.
  • the probes may be capable of hybridizing to target sequences associated with a single disorder appreciated by those in the art.
  • the probes may either be synthesized first, with subsequent attachment to the biochip, or may be directly synthesized on the biochip.
  • the solid substrate may be a material that may be modified to contain discrete individual sites appropriate for the attachment or association of the probes and is amenable to at least one detection method.
  • substrates include glass and modified or functionalized glass, plastics (including acrylics, polystyrene and copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene, polyurethanes, Teflon, etc.), polysaccharides, nylon or nitrocellulose, resins, silica or silica-based materials including silicon and modified silicon, carbon, metals, inorganic glasses and plastics.
  • the substrates may allow optical detection without appreciably fluorescing.
  • the substrate may be planar, although other configurations of substrates may be used as well.
  • probes may be placed on the inside surface of a tube, for flow-through sample analysis to minimize sample volume.
  • the substrate may be flexible, such as flexible foam, including closed cell foams made of particular plastics.
  • the biochip and the probe may be derivatized with chemical functional groups for subsequent attachment of the two.
  • the biochip may be derivatized with a chemical functional group including, but not limited to, amino groups, carboxyl groups, oxo groups or thiol groups. Using these functional groups, the probes may be attached using functional groups on the probes either directly or indirectly using a linker.
  • the probes may be attached to the solid support by either the 5' terminus, 3' terminus, or via an internal nucleotide.
  • the probe may also be attached to the solid support non-covalently.
  • biotinylated oligonucleotides can be made, which may bind to surfaces covalently coated with streptavidin, resulting in attachment.
  • probes may be synthesized on the surface using techniques such as photopolymerization and photolithography.
  • diagnosis refers to classifying pathology, or a symptom, determining a severity of the pathology (grade or stage), monitoring pathology progression, forecasting an outcome of pathology and/or prospects of recovery.
  • the phrase "subject in need thereof" refers to an animal or human subject who is known to have cancer, at risk of having cancer [e.g., a genetically predisposed subject, a subject with medical and/or family history of cancer, a subject who has been exposed to carcinogens, occupational hazard, environmental hazard] and/or a subject who exhibits suspicious clinical signs of cancer [e.g., blood in the stool or melena, unexplained pain, sweating, unexplained fever, unexplained loss of weight up to anorexia, changes in bowel habits (constipation and/or diarrhea), tenesmus (sense of incomplete defecation, for rectal cancer specifically), anemia and/or general weakness]. Additionally or alternatively, the subject in need thereof can be a healthy human subject undergoing a routine well-being check up.
  • Analyzing presence of malignant or pre-malignant cells can be effected in-vivo or ex-vivo, whereby a biological sample (e.g., biopsy) is retrieved.
  • Such biopsy samples comprise cells and may be an incisional or excisional biopsy. Alternatively the cells may be retrieved from a complete resection.
  • treatment regimen refers to a treatment plan that specifies the type of treatment, dosage, schedule and/or duration of a treatment provided to a subject in need thereof (e.g., a subject diagnosed with a pathology).
  • the selected treatment regimen can be an aggressive one which is expected to result in the best clinical outcome (e.g., complete cure of the pathology) or a more moderate one which may relieve symptoms of the pathology yet results in incomplete cure of the pathology. It will be appreciated that in certain cases the treatment regimen may be associated with some discomfort to the subject or adverse side effects (e.g., damage to healthy cells or tissue).
  • the type of treatment can include a surgical intervention (e.g., removal of lesion, diseased cells, tissue, or organ), a cell replacement therapy, an administration of a therapeutic drug (e.g., receptor agonists, antagonists, hormones, chemotherapy agents) in a local or a systemic mode, an exposure to radiation therapy using an external source (e.g., external beam) and/or an internal source (e.g., brachytherapy) and/or any combination thereof.
  • the dosage, schedule and duration of treatment can vary, depending on the severity of pathology and the selected type of treatment, and those
  • a method of diagnosis comprises detecting an expression level of a specific cancer-associated nucleic acid in a biological sample.
  • the sample may be derived from a patient. Diagnosis of a specific cancer state in a patient may allow for prognosis and selection of therapeutic strategy. Further, the developmental stage of cells may be classified by determining temporally expressed specific cancer-associated nucleic acids.
  • In situ hybridization of labeled probes to tissue sections may be performed.
  • the skilled artisan can make a diagnosis, a prognosis, or a prediction based on the findings. It is further understood that the nucleic acid sequences which indicate the diagnosis may differ from those which indicate the prognosis, and molecular profiling of the condition of the cells may lead to distinctions between responsive or refractory conditions or may be predictive of outcomes.
  • kits may comprise a nucleic acid described herein together with any or all of the following: assay reagents, buffers, probes and/or primers, and sterile saline or another pharmaceutically acceptable emulsion and suspension base.
  • the kits may include instructional materials containing directions (e.g., protocols) for the practice of the methods described herein.
  • the kit may further comprise a software package for data analysis of expression profiles.
  • the kit may be a kit for the amplification, detection, identification or quantification of a target nucleic acid sequence.
  • the kit may comprise a poly (T) primer, a forward primer, a reverse primer, and a probe.
  • kits may comprise, in suitable container means, an enzyme for labeling the miRNA by incorporating labeled nucleotide or unlabeled nucleotides that are subsequently labeled. It may also include one or more buffers, such as reaction buffer, labeling buffer, washing buffer, or a hybridization buffer, compounds for preparing the miRNA probes, components for in situ hybridization and components for isolating miRNA.
  • Other kits of the invention may include components for making a nucleic acid array comprising miRNA, and thus, may include, for example, a solid support.
  • FFPE renal tumor Formalin fixed, paraffin embedded
  • Custom microarrays were produced by printing DNA oligonucleotide probes representing 903 human microRNAs. Each probe, printed in triplicate, carried up to a 22-nt linker at the 3' end of the microRNA's complement sequence, in addition to an amine group used to couple the probes to coated glass slides. Each probe (20 µM) was dissolved in 2X SSC + 0.0035% SDS and spotted in triplicate on Schott Nexterion® Slide E-coated microarray slides using a Genomic Solutions® BioRobotics MicroGrid II according to the MicroGrid manufacturer's directions. Fifty-four negative control probes were designed using the sense sequences of different microRNAs.
  • Two groups of positive control probes were designed to hybridize to the microarray: (i) synthetic small RNAs were spiked into the RNA before labeling to verify labeling efficiency; and (ii) probes for abundant small RNA (e.g., small nuclear RNAs (U43, U49, U24, Z30, U6, U48, U44), 5.8s and 5s ribosomal RNA) were spotted on the array to verify RNA quality. The slides were blocked in a solution containing 50 mM ethanolamine, 1 M Tris (pH 9.0) and 0.1% SDS for 20 min at 50°C, then thoroughly rinsed with water and spun dry.
  • RNA-linker 1:47-53 of an RNA-linker, p-rCrU-Cy/dye (Dharmacon, Lafayette, CO), to the 3'-end with Cy3 or Cy5.
  • the labeling reaction contained total RNA, spikes (0.1-20 fmoles), 300 ng RNA-linker-dye, 15% DMSO, 1x ligase buffer and 20 units of T4 RNA ligase (NEB) and proceeded at 4°C for 1 h, followed by 1 h at 37°C.
  • the labeled RNA was mixed with 3x hybridization buffer (Ambion), heated to 95°C for 3 min and then added on top of the miR array. Slides were hybridized 12-16 h at 42°C, followed by two washes at room temperature with 1x SSC and 0.2% SDS and a final wash with 0.1x SSC.
  • Arrays were scanned at 0.01 mm resolution and images read using SpotReader software (Niles Scientific, Portola Valley, CA).
  • the initial data set consisted of signals measured for multiple probes for every sample. For the analysis, signals were used only for probes that were designed to measure the expression levels of known or validated human microRNAs.
  • Triplicate spots were combined into one signal by taking the logarithmic mean of the reliable spots. All data was log-transformed and the analysis was performed in log-space. A reference data vector for normalization, R, was calculated by taking the mean expression level for each probe in two representative samples, one from each tumor type.
  • Measurements of the expression of miRs were log-transformed before all further analysis. Normalization of samples was performed by calculating a median reference vector. For each sample, the best fit to this reference vector was calculated using a second-degree polynomial.
  • N patients are sampled by randomly selecting patients from the entire pool with replacement, so that a patient may appear more than once. The more frequently a feature is selected across such resamples, the more likely it is to be real. Thus stability of the microRNAs' separation of survival patterns was tested by bootstrapping 100 times. The number of times each microRNA gave a logrank p-value less than 0.05 was counted. Multivariate stepwise Cox analysis of the microRNA expression, in concurrence with stage, lymph node involvement, tumour size and demographic data was carried out, using p<0.05 as the inclusion/exclusion criterion. This process was performed over all patients and within stage III patients alone.
  • qRT-PCR was used to further quantify microRNA expression.
  • RNA was extracted from 29 samples and analysed using 16 microRNA probes, 11 of which were taken as being differential using the microarray and 5 were taken as normalisers. All expression readings (CT) were normalised to give equal means for all patients.
  • 50-CT was used.
  • the value from the microarray which created the best separation was translated to 50-CT units (using linear regression) and used to separate the PCR values into good and bad prognosis groups. The groups were then compared using Kaplan-Meier and log-rank.
  • microRNAs are able to predict the prognosis of renal cancer patients
  • the median expression values of hsa-miR-21* (SEQ ID NO: 4) and hsa-miR-22 (SEQ ID NO: 5) were found to be above the median expression values of the patients with good prognosis, whereas the median expression values of hsa-miR-26b (SEQ ID NO: 1), hsa-miR-27a (SEQ ID NO: 2) and hsa-miR-23a (SEQ ID NO: 3) were found to be below the median expression of the patients with good prognosis.
  • hsa-miR-21* SEQ ID NO: 4
  • hsa-miR-22 SEQ ID NO: 5
  • relatively high expression values of hsa-miR-26b SEQ ID NO: 1
  • hsa-miR-27a SEQ ID NO: 2
  • hsa-miR-23a SEQ ID NO: 3
  • hsa-miR-22 SEQ ID NO: 5
  • hsa-miR-21* SEQ ID NO: 4
  • hsa-miR-21 SEQ ID NO: 51
  • hsa-miR-193b SEQ ID NO: 25
  • hsa-miR-26b SEQ ID NO: 1
  • hsa-miR-23a SEQ ID NO: 3
  • hsa-miR-27a SEQ ID NO: 2
  • hsa-miR-27b SEQ ID NO: 45
  • hsa-miR-345 SEQ ID NO: 47
  • hsa-miR-let-7a SEQ ID NO: 49
  • hsa-miR-140-3p SEQ ID NO: 19
  • each of hsa-miR-451 (SEQ ID NO: 13), hsa-miR-27a (SEQ ID NO: 2), hsa-miR-26b (SEQ ID NO: 1), hsa-miR-210 (SEQ ID NO: 14), hsa-miR-155 (SEQ ID NO: 15), hsa-miR-455-3p (SEQ ID NO: 16) and hsa-miR-16 (SEQ ID NO: 17) is higher in patients with primary renal cancer as compared to patients with renal metastases to the lung.
  • hsa-miR-10a SEQ ID NO: 11
  • hsa-miR-138 SEQ ID NO: 12
  • Figure 5 shows differential expression of miRs in samples obtained from primary renal cancer patients.
  • the parallel lines describe a fold change between groups of 1.5 in either direction
  • the expression of each of hsa-miR-143 (SEQ ID NO: 18), hsa-miR-140-3p (SEQ ID NO: 19), hsa-miR-26b (SEQ ID NO: 1), hsa-miR-192 (SEQ ID NO: 21), hsa-miR-194 (SEQ ID NO: 20), hsa-miR-30a* (SEQ ID NO: 22), hsa-miR-204 (SEQ ID NO: 23) and hsa-miR-30e* (SEQ ID NO: 52) is up-regulated in samples obtained from primary cancer renal patients with good prognosis. Accordingly, relatively high expression values of these miRs were demonstrated to be indicative of good prognosis.
  • hsa-miR-21* (SEQ ID NO: 4), hsa-miR-193b (SEQ ID NO: 25), hsa-miR-199b-5p (SEQ ID NO: 26), hsa-miR-451 (SEQ ID NO: 13), and hsa-miR-373* (SEQ ID NO: 27) is up-regulated in samples obtained from primary cancer renal patients with bad prognosis. Accordingly, relatively high expression values of these miRs were demonstrated to be indicative of bad prognosis.
  • Each microRNA is expressed as the median value over all members of the good prognosis group (y-axis) against the median value over all members of the bad prognosis group (x-axis).
  • the six microRNAs from the Sanger dataset which are thus differential (p<0.006, fold change>1.5) and expressed (>300 fluorescence units) in one group are labeled. Their expression levels are listed in Table 3.
  • Figures 6A-D show Kaplan-Meier plots of progression-free survival curves plotted for each of the three expression-level groups.
  • hsa-miR-21* SEQ ID NO: 4
  • hsa-miR-21 SEQ ID NO: 51
  • hsa-miR-30e SEQ ID NO: 54
  • hsa-miR-30a* SEQ ID NO: 22
  • the prognostic miRs were tested by qRT-PCR to determine whether they could serve as prognosticators.
  • the sequences of the RT-PCR primers and probes are presented in Table 4.
  • the best separation by miR-21 was with the log2(expression) value of 16.83, which corresponds to 25.67 in 50-CT units in the PCR experiment.
  • the best separation by miR-21* in the microarray experiment was with value 10.27, which translated to 50-CT as 18.91.
  • the resulting differentiation in the PCR experiment for both groups is significant (p<0.05) for both microRNAs (Figures 9A-9D).
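The translation of a microarray cut-off into 50-CT units by linear regression, as described in the items above, can be illustrated with a minimal sketch. The paired readings below are synthetic and are not data from the study; the function names (`fit_line`, `translate_threshold`, `split_by_threshold`) and the cut-off of 10.2 are hypothetical, and an ordinary least-squares line is assumed because the source says only "linear regression".

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def translate_threshold(array_threshold, slope, intercept):
    """Map a microarray log2(expression) cut-off into 50-CT units."""
    return slope * array_threshold + intercept


def split_by_threshold(pcr_values, pcr_threshold):
    """True where a PCR reading (in 50-CT units) lies above the cut-off."""
    return [v > pcr_threshold for v in pcr_values]


# Synthetic paired readings for the same samples (assumption, not study data).
array_log2 = [9.0, 9.5, 10.0, 10.5, 11.0]
pcr_50_minus_ct = [1.5 * a + 3.0 for a in array_log2]

slope, intercept = fit_line(array_log2, pcr_50_minus_ct)
pcr_threshold = translate_threshold(10.2, slope, intercept)  # hypothetical cut-off
high_group = split_by_threshold(pcr_50_minus_ct, pcr_threshold)
```

The two resulting groups would then be compared by Kaplan-Meier analysis and a log-rank test, which this sketch does not implement.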

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Engineering & Computer Science (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Analytical Chemistry (AREA)
  • Zoology (AREA)
  • Genetics & Genomics (AREA)
  • Wood Science & Technology (AREA)
  • Physics & Mathematics (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Hospice & Palliative Care (AREA)
  • Biophysics (AREA)
  • Oncology (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Pharmaceuticals Containing Other Organic And Inorganic Compounds (AREA)

Abstract

Described herein are compositions and methods for the prognosis of renal cancer patients after surgical operation. Specifically the invention relates to microRNA molecules associated with the prognosis of renal cancer, as well as various nucleic acid molecules relating thereto or derived therefrom.

Description

COMPOSITIONS AND METHODS FOR PROGNOSIS OF RENAL CANCER
CROSS REFERENCE TO RELATED APPLICATIONS
The present application claims priority under 35 U.S.C. § 119(e) to U.S. Provisional Application No. 61/248,458 filed October 4, 2009, which is herein incorporated by reference in its entirety.
FIELD OF THE INVENTION
The invention relates to compositions and methods for prognosis of renal cancer. Specifically the invention relates to microRNA molecules associated with prognosis of renal cancer, as well as various nucleic acid molecules relating thereto or derived therefrom.
BACKGROUND OF THE INVENTION
In recent years, microRNAs (miRNAs, miRs) have emerged as an important novel class of regulatory RNAs that has a profound impact on a wide array of biological processes. These small (typically 18-24 nucleotides long) non-coding RNA molecules can modulate protein expression patterns by promoting RNA degradation, inhibiting mRNA translation, and also affecting gene transcription. miRs play pivotal roles in diverse processes such as development and differentiation, control of cell proliferation, stress response and metabolism. The expression of many miRs was found to be altered in numerous types of human cancer, and in some cases strong evidence has been put forward in support of the conjecture that such alterations may play a causative role in tumor progression. The remarkable tissue-specificity of miR expression allows the development of novel approaches to molecular classification. There are currently about 1,200 known human miRs.
Renal cancers account for more than 3% of adult malignancies and cause more than 13,000 deaths per year in the US alone (Jemal et al., 2008, Cancer statistics, CA Cancer J Clin 58, 71-96). The incidence of renal cancers in the US rose more than 50% between 1983 and 2002, and the estimated number of new cases per year rose from 39,000 in 2006 to 54,000 in 2008. Despite the trend of increased incidence of relatively small and kidney-confined disease, the mortality rate has not changed significantly during the last 2 decades in the U.S. and Europe. Clear cell carcinoma accounts for around 70% of renal cancers and has the worst prognosis of all RCC types. For renal cancers, the best available prognostic indicator is stage, but the current prognostic factors (Fuhrman grade, performance status and histological type, as well as stage) are insufficient to predict patient outcome and cancer aggressiveness. Identification of biomarkers that provide further prognostic information would thus be vital for defining optimal treatment and outcomes.
SUMMARY OF THE INVENTION
According to some embodiments of the present invention, altered expression levels of any of SEQ ID NO: 1-60 or combinations thereof are indicative of renal cancer prognosis: life expectancy of the patient, expected recurrence-free survival, time to progression and risk of recurrence of metastases.
According to one aspect of the invention, a method for determining a prognosis for renal cancer in a subject is provided, the method comprising:
(a) obtaining a biological sample from the subject;
(b) determining the expression level of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 1-60 and sequences at least about 80% identical thereto from said sample; and
(c) comparing said expression level to a threshold expression level,
wherein the level of any of SEQ ID NOS: 1-60 and sequences at least about 80% identical thereto compared to said threshold expression level is indicative of the prognosis of said subject.
In some embodiments, the nucleic acid sequence is selected from the group consisting of SEQ ID NOS: 1-3, 6-8, 13-23, 30-40, 45-50, 52-58 and sequences at least about 80% identical thereto, and an increased expression level of any of said nucleic acid sequence compared to the threshold expression level is indicative of good prognosis of said subject. In other embodiments, the nucleic acid sequence is selected from the group consisting of SEQ ID NOS: 4, 5, 9-12, 24-29, 41-44, 51, 59-60 and sequences at least about 80% identical thereto, and an increased expression level of any of said nucleic acid sequence compared to the threshold expression level is indicative of poor prognosis of said subject.
In certain embodiments, the subject is a human.
In certain embodiments, the method is used to determine a course of treatment of the subject. In certain embodiments the biological sample obtained from the subject is selected from the group consisting of a bodily fluid, a cell line or a tissue sample. In certain embodiments the tissue is a fresh, frozen, fixed, wax-embedded or formalin-fixed, paraffin-embedded (FFPE) tissue.
In certain embodiments said tissue is a renal cancer tissue.
According to some embodiments, the expression levels are determined by a method selected from the group consisting of nucleic acid hybridization, nucleic acid amplification, or a combination thereof. According to some embodiments, the nucleic acid hybridization is performed using a solid-phase nucleic acid biochip array or in situ hybridization.
According to other embodiments, the nucleic acid amplification method is real-time PCR. According to some embodiments, the PCR method comprises forward and reverse primers. According to some embodiments the reverse primer comprises SEQ ID NO: 61.
According to some embodiments the forward primer comprises a sequence selected from the group consisting of SEQ ID NOS: 62-78.
According to some embodiments, the real-time PCR method further comprises a probe. According to some embodiments, the probe is complementary to SEQ ID NOS: 1-60, to a fragment thereof or to a sequence at least about 80% identical thereto. According to some embodiments, the probe comprises a sequence selected from the group consisting of SEQ ID NOS: 79-92.
The invention further provides a kit for prognosis of renal cancer, said kit comprising a forward primer comprising a nucleic acid sequence that is complementary to a sequence selected from SEQ ID NOS: 1-60, to a fragment thereof or to a sequence at least about 80% identical thereto. According to some embodiments, the kit further comprises a reverse primer. According to some embodiments, said reverse primer comprises SEQ ID NO: 61. According to some embodiments, said forward primer comprises a sequence selected from the group consisting of SEQ ID NOS: 62-78. According to some embodiments, the kit further comprises a probe. According to some embodiments, the probe comprises a sequence selected from the group consisting of SEQ ID NOS: 79-92.
These and other embodiments of the present invention will become apparent in conjunction with the figures, description and claims that follow.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a scatter plot showing differential expression of miRs (in normalized fluorescence units, as measured by a microarray) in samples obtained from renal cancer patients with good prognosis (survival above two years following surgery) and from renal cancer patients with bad prognosis (survival below two years following surgery), comparing the median values of each miR in all patients in one group with the corresponding median for members of the other group. The y-axis represents patients with good prognosis (46 patients), and the x-axis represents patients with bad prognosis (9 patients). The parallel lines describe a fold change between groups of 1.5 in either direction. Statistically significant (p-value<0.05) miRs, hsa-miR-21* (SEQ ID NO: 4), hsa-miR-22 (SEQ ID NO: 5), hsa-miR-26b (SEQ ID NO: 1), hsa-miR-27a (SEQ ID NO: 2) and hsa-miR-23a (SEQ ID NO: 3), are labeled. P-values are calculated by Student t-test.
Figures 2A-E are boxplot presentations comparing distribution of the expression of the statistically significant miRs: hsa-miR-26b (SEQ ID NO: 1) (p-value=0.001505, fold change 2.0) (2A), hsa-miR-27a (SEQ ID NO: 2) (p-value=0.000932, fold change 1.8) (2B), hsa-miR-23a (SEQ ID NO: 3) (p-value=0.000729, fold change 1.5) (2C), hsa-miR-22 (SEQ ID NO: 5) (p-value=0.003217, fold change 1.5) (2D), hsa-miR-21* (SEQ ID NO: 4) (p-value=0.008038, fold change 2) (2E), in tumor samples obtained from renal cancer patients with bad or good prognosis (as defined in Figure 1). Two boxes are shown. The left box includes the group of patients with good prognosis, while the right box includes the group of patients with bad prognosis. The line in the box indicates the median value. The box contains 50% of the data and the horizontal lines and crosses (outliers) show the full range of signals in this group.
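The median-versus-median comparison with a 1.5-fold cut-off used in this scatter plot can be sketched as follows. This is a minimal illustration under stated assumptions: the inputs are taken to be log2-transformed signals, the values are synthetic rather than study data, the function names are hypothetical, and the Student t-test p-value is not computed.

```python
from statistics import median


def median_fold_change(good_log2, bad_log2):
    """Return (log2 difference of group medians, fold change as a ratio >= 1)."""
    diff = median(good_log2) - median(bad_log2)
    return diff, 2 ** abs(diff)


def is_differential(good_log2, bad_log2, min_fold=1.5):
    """True if the group medians differ by at least min_fold in either direction."""
    _, fold = median_fold_change(good_log2, bad_log2)
    return fold >= min_fold


# Synthetic log2 signals for one miR in two prognosis groups (not study data).
good = [12.9, 13.1, 13.0, 13.4, 12.8]   # median 13.0
bad = [12.0, 11.8, 12.1]                # median 12.0
diff, fold = median_fold_change(good, bad)
```

A positive `diff` corresponds to a point above the diagonal of the plot (higher in the good-prognosis group), and `fold >= 1.5` corresponds to a point outside the parallel lines.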
Figures 3A-K are Kaplan-Meier plots, which correct for patients who were censored (subjects that may have dropped out of the study and/or were lost to follow-up or deliberately withdrawn due to the culmination of the study). The y-axis depicts the fraction of surviving patients, and the x-axis depicts months of survival. Figure 3A is grouped by the expression levels of hsa-miR-22 (SEQ ID NO: 5) with the solid line representing patients with the highest expression for this miR (n=19, log2(expression) >13.75), the dashed dotted line depicting the intermediate scoring (n=18) and the dashed line depicting the lowest scoring (n=18, log2(expression) < 13.29). Figure 3B is grouped by the expression levels of hsa-miR-26b (SEQ ID NO: 1) with the solid line representing patients with the highest expression for this miR (n=19, log2(expression) >12.53), the dashed dotted line depicting the intermediate scoring (n=18) and the dashed line depicting the lowest scoring (n=18, log2(expression) < 11.99). Figure 3C is grouped by the expression levels of hsa-miR-23a (SEQ ID NO: 3) with the solid line representing patients with the highest expression for this miR (n=19, log2(expression) >15.11), the dashed dotted line depicting the intermediate scoring (n=18) and the dashed line depicting the lowest scoring (n=18, log2(expression) < 14.78). Figure 3D is grouped by the expression levels of hsa-miR-27a (SEQ ID NO: 2) with the solid line representing patients with the highest expression for this miR (n=19, log2(expression) >14.47), the dashed dotted line depicting the intermediate scoring (n=18) and the dashed line depicting the lowest scoring (n=18, log2(expression) < 13.88).
Figure 3E is grouped by the expression levels of hsa-miR-21* (SEQ ID NO: 4) with the solid line representing patients with the highest expression for this miR (n=19, log2(expression) >10.69), the dashed dotted line depicting the intermediate scoring (n=18) and the dashed line depicting the lowest scoring (n=18, log2(expression) < 9.93). Figure 3F is grouped by the expression levels of hsa-miR-27b (SEQ ID NO: 45) with the solid line representing patients with the highest expression for this miR (n=19, log2(expression) >13.09), the dashed dotted line depicting the intermediate scoring (n=18) and the dashed line depicting the lowest scoring (n=18, log2(expression) < 12.69). Figure 3G is grouped by the expression levels of hsa-miR-345 (SEQ ID NO: 47) with the solid line representing patients with the highest expression for this miR (n=19, log2(expression) >9.16), the dashed dotted line depicting the intermediate scoring (n=18) and the dashed line depicting the lowest scoring (n=18, log2(expression) < 8.62). Figure 3H is grouped by the expression levels of hsa-miR-let-7a (SEQ ID NO: 49) with the solid line representing patients with the highest expression for this miR (n=19, log2(expression) >16.18), the dashed dotted line depicting the intermediate scoring (n=18) and the dashed line depicting the lowest scoring (n=18, log2(expression) < 15.73). Figure 3I is grouped by the expression levels of hsa-miR-21 (SEQ ID NO: 51) with the solid line representing patients with the highest expression for this miR (n=19, log2(expression) >17), the dashed dotted line depicting the intermediate scoring (n=18) and the dashed line depicting the lowest scoring (n=18, log2(expression) < 16.32).
Figure 3J is grouped by the expression levels of hsa-miR-193b (SEQ ID NO: 25) with the solid line representing patients with the highest expression for this miR (n=19, log2(expression) >10.35), the dashed dotted line depicting the intermediate scoring (n=18) and the dashed line depicting the lowest scoring (n=18, log2(expression) < 9.23). Figure 3K is grouped by the expression levels of hsa-miR-140-3p (SEQ ID NO: 19) with the solid line representing patients with the highest expression for this miR (n=19, log2(expression) >12.97), the dashed dotted line depicting the intermediate scoring (n=18) and the dashed line depicting the lowest scoring (n=18, log2(expression) < 12.52).
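The grouping into highest, intermediate and lowest expression and the Kaplan-Meier correction for censored patients described for Figures 3A-K can be sketched as below. This is an illustrative sketch only: the split into thirds by rank is an assumption inferred from the group sizes (19/18/18 of 55), the function names are hypothetical, and the follow-up data are synthetic.

```python
def tertile_groups(values):
    """Split sample indices into (low, intermediate, high) thirds by rank.

    Any remainder goes to the high group, so 55 samples give 18/18/19.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    third = len(order) // 3
    return order[:third], order[third:2 * third], order[2 * third:]


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as [(time, surviving fraction)] at each event time.

    events[i] is True for an observed event and False for a censored patient.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # censored patients leave the risk set without an event
    return curve


# Synthetic follow-up in months; False marks a censored patient (not study data).
times = [5, 8, 8, 12, 20]
events = [True, True, False, True, False]
curve = kaplan_meier(times, events)
low, mid, high = tertile_groups(list(range(55)))
```

Each of the three index groups would be passed separately to `kaplan_meier` to obtain one curve per expression level.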
Figure 4 is a scatter plot showing differential expression of miRs (in normalized fluorescence units, as measured by microarray) in samples obtained from renal cancer patients with metastases and from renal cancer patients with primary tumors, comparing the median values of each miR in all patients in one group with the corresponding median for members of the other group. The y-axis represents renal cancer patients with metastases to the lung (6 patients), and the x-axis represents patients with primary renal tumors (51 patients). The parallel lines describe a fold change between groups of 1.5 in either direction. Differential miRs, hsa-miR-10a (SEQ ID NO: 11), hsa-miR-138 (SEQ ID NO: 12), hsa-miR-451 (SEQ ID NO: 13), hsa-miR-27a (SEQ ID NO: 2), hsa-miR-26b (SEQ ID NO: 1), hsa-miR-210 (SEQ ID NO: 14), hsa-miR-155 (SEQ ID NO: 15), hsa-miR-455-3p (SEQ ID NO: 16) and hsa-miR-16 (SEQ ID NO: 17) are marked. P-values are calculated by Student t-test.
Figure 5 is a scatter plot showing differential expression of miRs (in normalized fluorescence units, as measured by a microarray) in samples obtained from primary renal cancer patients with good prognosis and from primary renal cancer patients with bad prognosis, comparing the median values of each miR in all patients in one group with the corresponding median for members of the other group. The y-axis represents patients with good prognosis (44 patients), and the x-axis represents patients with bad prognosis (5 patients). The parallel lines describe a fold change between groups of 1.5 in either direction. Statistically significant (p-value<0.05) miRs, hsa-miR-143 (SEQ ID NO: 18), hsa-miR-140-3p (SEQ ID NO: 19), hsa-miR-26b (SEQ ID NO: 1), hsa-miR-192 (SEQ ID NO: 21), hsa-miR-194 (SEQ ID NO: 20), hsa-miR-30a* (SEQ ID NO: 22), hsa-miR-204 (SEQ ID NO: 23), hsa-miR-30e* (SEQ ID NO: 52), hsa-miR-451 (SEQ ID NO: 13), hsa-miR-21* (SEQ ID NO: 4), hsa-miR-193b (SEQ ID NO: 25), hsa-miR-199b-5p (SEQ ID NO: 26) and hsa-miR-373* (SEQ ID NO: 27) are marked. P-values are calculated by Student t-test.
Figures 6A-D are Kaplan-Meier plots for time to progression status of primary renal cancer patients. The y-axis depicts the fraction of progression-free patients, and the x-axis depicts time to progression (months). Figure 6A is grouped by the expression levels of hsa-miR-21* (SEQ ID NO: 4) with the solid line representing patients with the highest expression for this miR (n=18, log2(expression) >10.68), the dashed dotted line depicting the intermediate scoring (n=16) and the dashed line depicting the lowest scoring (n=16, log2(expression) < 9.96). Figure 6B is grouped by the expression levels of hsa-miR-30e (SEQ ID NO: 54) with the solid line representing patients with the highest expression for this miR (n=18, log2(expression) >11.41), the dashed dotted line depicting the intermediate scoring (n=16) and the dashed line depicting the lowest scoring (n=16, log2(expression) < 11.05). Figure 6C is grouped by the expression levels of hsa-miR-21 (SEQ ID NO: 51) with the solid line representing patients with the highest expression for this miR (n=18, log2(expression) >17.08), the dashed dotted line depicting the intermediate scoring (n=16) and the dashed line depicting the lowest scoring (n=16, log2(expression) < 16.40). Figure 6D is grouped by the expression levels of hsa-miR-30a* (SEQ ID NO: 22) with the solid line representing patients with the highest expression for this miR (n=18, log2(expression) >11.56), the dashed dotted line depicting the intermediate scoring (n=16) and the dashed line depicting the lowest scoring (n=16, log2(expression) < 10.83).
Figure 7 is a scatter plot showing differential expression of miRs (in normalized fluorescence units, as measured by a microarray) in samples obtained from non-metastatic renal cancer patients with good prognosis (mean follow-up 68 months, range 24-142) and from renal cancer patients with bad prognosis (mean time to progression 12 months, range 1-22 months), comparing the median normalised values of each miR in all patients in one group with the corresponding median for members of the other group. The y-axis represents patients with good prognosis (40 patients), and the x-axis represents patients with bad prognosis (10 patients). The parallel lines describe a fold change between groups of 1.5 in either direction. Statistically significant (p-value<0.006) miRs, hsa-miR-21* (SEQ ID NO: 4), hsa-miR-487b (SEQ ID NO: 59), hsa-miR-30e* (SEQ ID NO: 52), hsa-miR-29c* (SEQ ID NO: 57), hsa-miR-345 (SEQ ID NO: 47) and hsa-miR-362-5p (SEQ ID NO: 55) are labeled. P-values are calculated by Student t-test.
Figure 8 is a dot-plot presentation comparing distribution of the expression (y-axis) of miR-21* (SEQ ID NO: 4) (p-value 0.0004552, fold change 2.2) in tumor samples obtained from non-metastatic renal cancer patients with bad or good prognosis (as defined in Figure 7). The left plot includes the group of patients with good prognosis, while the right plot includes the group of patients with bad prognosis. The line in the middle indicates the median value.
Figures 9A-D are Kaplan-Meier plots for time to progression status of primary renal cancer patients comparing microarray (9A, 9C) and PCR (9B, 9D) data. The y-axis depicts the fraction of progression-free patients, and the x-axis depicts time to progression (months). Figure 9A is grouped by the expression levels as detected by microarray of hsa-miR-21* (SEQ ID NO: 4) (p=0.033) with the dashed line representing patients with the highest expression for this miR (n=25, log2(expression) >10.30), and the solid line depicting the lowest scoring (n=25, log2(expression) < 10.27). Figure 9B is grouped by the expression levels as detected by PCR of hsa-miR-21* (SEQ ID NO: 4) (p=0.0081) with the dashed line representing patients with the highest expression for this miR (n=15, 50-CT >18.99), and the solid line depicting the lowest scoring (n=14, 50-CT < 18.91). Figure 9C is grouped by the expression levels as detected by microarray of hsa-miR-21 (SEQ ID NO: 51) with the dashed line representing patients with the highest expression for this miR (n=25, log2(expression) >16.85), and the solid line depicting the lowest scoring (n=25, log2(expression) < 16.83). Figure 9D is grouped by the expression levels as detected by PCR of hsa-miR-21 (SEQ ID NO: 51) with the dashed line representing patients with the highest expression for this miR (n=15, 50-CT >25.89), and the solid line depicting the lowest scoring (n=14, 50-CT < 25.67).
DETAILED DESCRIPTION OF THE INVENTION
According to some embodiments of the present invention miRNA expression can serve as a novel tool for the prognosis of patients with renal cancer. More particularly, it may serve for the prediction of survival, risk of recurrence and progression.
Methods and compositions are provided for determining the prognosis of renal cancer. Other aspects of the invention will become apparent to the skilled artisan by the following description of the invention.
All the methods of the present invention may optionally further include measuring levels of other renal cancer markers. Other renal cancer markers, in addition to said microRNA molecules, useful in the present invention will depend on the cancer being tested and are known to those of skill in the art.
Assay techniques that can be used to determine levels of expression of a nucleic acid, such as the nucleic acid sequences of the present invention, in a sample derived from a patient are well known to those of skill in the art. Such assay methods include, but are not limited to, radioimmunoassays, reverse transcriptase PCR (RT-PCR) assays, immunohistochemistry assays, in situ hybridization assays, competitive-binding assays, Northern Blot analyses, ELISA assays, nucleic acid microarrays and biochip analysis.
An arbitrary threshold on the expression level of one or more nucleic acid sequences can be set for assigning a sample or tumor sample to one of two groups. Alternatively, expression levels of one or more nucleic acid sequences of the invention are combined by taking ratios of expression levels of two nucleic acid sequences and/or by a method such as logistic regression to define a metric which is then compared to previously measured samples or to a threshold. The threshold for assignment is treated as a parameter, which can be used to quantify the confidence with which samples are assigned to each class. The threshold for assignment can be scaled to favor sensitivity or specificity, depending on the clinical scenario. The correlation value to the reference data generates a continuous score that can be scaled and provides diagnostic information on the likelihood that a sample belongs to a certain class of renal subtype. In multivariate analysis, the microRNA signature provides a high level of renal cancer prognostic information.
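By way of non-limiting illustration, the thresholding described above may be sketched as follows. This is a minimal sketch only: the function names, the sample values and the two cutoffs are hypothetical and are not taken from the data of this application. Two log2 expression levels are combined into a log-ratio score, and the threshold is treated as a tunable parameter.

```python
def log_ratio_score(log2_expr_a, log2_expr_b):
    """Ratio of two expression levels, written as a difference of log2 values."""
    return log2_expr_a - log2_expr_b


def assign_group(score, threshold):
    """Assign a sample to one of two groups by comparing its score to a threshold."""
    return "high" if score > threshold else "low"


# Hypothetical samples: (log2 expression of sequence A, log2 expression of sequence B).
samples = {"s1": (10.4, 9.1), "s2": (9.8, 9.9), "s3": (11.2, 10.6)}

# The threshold is a parameter; moving it trades sensitivity against specificity.
for threshold in (0.0, 1.0):
    groups = {
        name: assign_group(log_ratio_score(a, b), threshold)
        for name, (a, b) in samples.items()
    }
    print(threshold, groups)
```

With the lower cutoff more samples fall into the "high" group, and with the higher cutoff fewer do, which is the sense in which the threshold may be scaled for the clinical scenario.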
Definitions
It is to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting. It must be noted that, as used in the specification and the appended claims, the singular forms "a," "an" and "the" include plural referents unless the context clearly dictates otherwise.
For the recitation of numeric ranges herein, each intervening number therebetween with the same degree of precision is explicitly contemplated. For example, for the range of 6-9, the numbers 7 and 8 are contemplated in addition to 6 and 9, and for the range 6.0-7.0, the numbers 6.0, 6.1, 6.2, 6.3, 6.4, 6.5, 6.6, 6.7, 6.8, 6.9 and 7.0 are explicitly contemplated.
aberrant proliferation
As used herein, the term "aberrant proliferation" means cell proliferation that deviates from the normal, proper, or expected course. For example, aberrant cell proliferation may include inappropriate proliferation of cells whose DNA or other cellular components have become damaged or defective. Aberrant cell proliferation may include cell proliferation whose characteristics are associated with an indication caused by, mediated by, or resulting in inappropriately high levels of cell division, inappropriately low levels of apoptosis, or both. Such indications may be characterized, for example, by single or multiple local abnormal proliferations of cells, groups of cells, or tissue(s), whether cancerous or non-cancerous, benign or malignant.
about
As used herein, the term "about" refers to +/-10%.
altered expression
As used herein, the term "altered expression" encompasses over-expression, under-expression, and ectopic expression. According to some embodiments, the altered expression level is a change in a score based on a combination of expression levels of nucleic acid sequences or any combinations thereof.
antisense
The term "antisense," as used herein, refers to nucleotide sequences which are complementary to a specific DNA or RNA sequence. The term "antisense strand" is used in reference to a nucleic acid strand that is complementary to the "sense" strand. Antisense molecules may be produced by any method, including synthesis by ligating the gene(s) of interest in a reverse orientation to a viral promoter which permits the synthesis of a complementary strand. Once introduced into a cell, this transcribed strand combines with natural sequences produced by the cell to form duplexes. These duplexes then block either the further transcription or translation. In this manner, mutant phenotypes may be generated.
attached
"Attached" or "immobilized" as used herein to refer to a probe and a solid support means that the binding between the probe and the solid support is sufficient to be stable under conditions of binding, washing, analysis, and removal. The binding may be covalent or non-covalent. Covalent bonds may be formed directly between the probe and the solid support or may be formed by a cross linker or by inclusion of a specific reactive group on either the solid support or the probe or both molecules. Non-covalent binding may be one or more of electrostatic, hydrophilic, and hydrophobic interactions. Included in non-covalent binding is the covalent attachment of a molecule, such as streptavidin, to the support and the non-covalent binding of a biotinylated probe to the streptavidin. Immobilization may also involve a combination of covalent and non-covalent interactions.
biological sample
"Biological sample" as used herein means a sample of biological tissue or fluid that comprises nucleic acids. Such samples include, but are not limited to, tissue or fluid isolated from subjects. Biological samples may also include sections of tissues such as biopsy and autopsy samples, FFPE samples, frozen sections taken for histological purposes, blood, blood fraction, plasma, serum, sputum, stool, tears, mucus, hair, skin, urine, effusions, ascitic fluid, amniotic fluid, saliva, cerebrospinal fluid, cervical secretions, vaginal secretions, endometrial secretions, gastrointestinal secretions, bronchial secretions, cell line, tissue sample, or secretions from the breast. A biological sample may be provided by removing a sample of cells from a subject but can also be accomplished by using previously isolated cells (e.g., isolated by another person, at another time, and/or for another purpose), or by performing the methods described herein in vivo. Archival tissues, such as those having treatment or outcome history, may also be used. Biological samples also include explants and primary and/or transformed cell cultures derived from animal or human tissues.
cancer
The term "cancer" is meant to include all types of cancerous growths or oncogenic processes, metastatic tissues or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness. Examples of cancers include but are not limited to solid tumors and leukemias, including: apudoma, choristoma, branchioma, malignant carcinoid syndrome, carcinoid heart disease, carcinoma (e.g., Walker, basal cell, basosquamous, Brown-Pearce, ductal, Ehrlich tumor, clear cell RCC, papillary RCC and chromophobe RCC, non-small cell lung (e.g., lung squamous cell carcinoma, lung adenocarcinoma and lung undifferentiated large cell carcinoma), oat cell, papillary, bronchiolar, bronchogenic, squamous cell, and transitional cell), histiocytic disorders, leukemia (e.g., B cell, mixed cell, null cell, T cell, T-cell chronic, HTLV-II-associated, lymphocytic acute, lymphocytic chronic, mast cell, and myeloid), histiocytosis malignant, Hodgkin disease, immunoproliferative small, non-Hodgkin lymphoma, plasmacytoma, reticuloendotheliosis, melanoma, chondroblastoma, chondroma, chondrosarcoma, fibroma, fibrosarcoma, giant cell tumors, histiocytoma, lipoma, liposarcoma, mesothelioma, myxoma, myxosarcoma, osteoma, osteosarcoma, Ewing sarcoma, synovioma, adenofibroma, adenolymphoma, carcinosarcoma, chordoma, craniopharyngioma, dysgerminoma, hamartoma, mesenchymoma, mesonephroma, myosarcoma, ameloblastoma, cementoma, odontoma, teratoma, thymoma, trophoblastic tumor, adenocarcinoma, adenoma, cholangioma, cholesteatoma, cylindroma, cystadenocarcinoma, cystadenoma, granulosa cell tumor, gynandroblastoma, hepatoma, hidradenoma, islet cell tumor, Leydig cell tumor, papilloma, Sertoli cell tumor, theca cell tumor, leiomyoma, leiomyosarcoma, myoblastoma, myosarcoma, rhabdomyoma, rhabdomyosarcoma, ependymoma, ganglioneuroma, glioma, medulloblastoma, meningioma, neurilemmoma, neuroblastoma, neuroepithelioma, neurofibroma, neuroma, paraganglioma, paraganglioma nonchromaffin, angiokeratoma, angiolymphoid hyperplasia with eosinophilia, angioma sclerosing, angiomatosis, glomangioma, hemangioendothelioma, hemangioma, hemangiopericytoma, hemangiosarcoma, lymphangioma, lymphangiomyoma, lymphangiosarcoma, pinealoma, carcinosarcoma, chondrosarcoma, cystosarcoma phyllodes, fibrosarcoma, hemangiosarcoma, leiomyosarcoma, leukosarcoma, liposarcoma, lymphangiosarcoma, myosarcoma, myxosarcoma, ovarian carcinoma, rhabdomyosarcoma, sarcoma (e.g., Ewing, experimental, Kaposi, and mast cell), neurofibromatosis, and cervical dysplasia, and other conditions in which cells have become immortalized or transformed.
cancer prognosis
As used herein, "cancer prognosis" means a forecast or prediction of the probable course or outcome of the cancer. Cancer prognosis includes the forecast or prediction of any one or more of the following: duration of survival of a patient susceptible to or diagnosed with a cancer, duration of recurrence-free survival, duration of progression-free survival of a patient susceptible to or diagnosed with a cancer, response to treatment in a group of patients susceptible to or diagnosed with a cancer, and duration of response to treatment in a patient or a group of patients susceptible to or diagnosed with a cancer. As used herein, "prognostic for cancer" means providing a forecast or prediction of the probable course or outcome of the cancer. In some embodiments, "prognostic for cancer" comprises providing the forecast or prediction of (prognostic for) any one or more of the following: duration of survival of a patient susceptible to or diagnosed with a cancer, duration of recurrence-free survival, duration of progression-free survival of a patient susceptible to or diagnosed with a cancer, response to treatment in a group of patients susceptible to or diagnosed with a cancer, and duration of response to treatment in a patient or a group of patients susceptible to or diagnosed with a cancer.
classification
The term "classification" refers to a procedure and/or algorithm in which individual items are placed into groups or classes based on quantitative information on one or more characteristics inherent in the items (referred to as traits, variables, characters, features, etc.) and based on a statistical model and/or a training set of previously labeled items.
complement
"Complement" or "complementary" as used herein to refer to a nucleic acid may mean Watson-Crick (e.g., A-T/U and C-G) or Hoogsteen base pairing between nucleotides or nucleotide analogs of nucleic acid molecules. A full complement or fully complementary means 100% complementary base pairing between nucleotides or nucleotide analogs of nucleic acid molecules. In some embodiments, the complementary sequence has a reverse orientation (5'-3').
CT
CT signals represent the first cycle of PCR where amplification crosses a threshold (cycle threshold) of fluorescence. Accordingly, low values of CT represent high abundance or expression levels of the microRNA. In some embodiments, the PCR CT signal is normalized such that the normalized CT remains inversely related to the expression level. In other embodiments the PCR CT signal may be normalized and then inverted such that low normalized-inverted CT represents low abundance or expression levels of the microRNA.
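The inversion described above may be illustrated by the following minimal sketch. The offset of 50 mirrors the "50-CT" scale used in Figures 9B and 9D; the function names, the reference-based normalization and the example CT values are assumptions made for illustration only.

```python
def normalized_ct(ct, reference_ct):
    """Normalize a CT value against a reference CT (hypothetical scheme).

    The result is still inversely related to abundance: lower means more abundant.
    """
    return ct - reference_ct


def inverted_ct(ct, offset=50.0):
    """Invert a CT value so that a higher result means higher abundance."""
    return offset - ct


# Hypothetical raw CT values for two samples.
abundant_sample_ct = 24.0  # crosses the fluorescence threshold early
scarce_sample_ct = 31.0    # crosses the fluorescence threshold late

print(inverted_ct(abundant_sample_ct))  # larger inverted value
print(inverted_ct(scarce_sample_ct))    # smaller inverted value
```

After inversion, a cutoff such as "50-CT > 18.99" selects the samples with the highest expression, consistent with the grouping used in Figure 9B.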
data processing routine
As used herein, a "data processing routine" refers to a process that can be embodied in software that determines the biological significance of acquired data (i.e., the ultimate results of an assay or analysis). For example, the data processing routine can make a determination of tissue of origin based upon the data collected. In the systems and methods herein, the data processing routine can also control the data collection routine based upon the results determined. The data processing routine and the data collection routines can be integrated and provide feedback to operate the data acquisition, and hence provide assay-based judging methods.
data set
As used herein, the term "data set" refers to numerical values obtained from the analysis. These numerical values associated with analysis may be values such as peak height and area under the curve.
data structure
As used herein the term "data structure" refers to a combination of two or more data sets, applying one or more mathematical manipulations to one or more data sets to obtain one or more new data sets, or manipulating two or more data sets into a form that provides a visual illustration of the data in a new way. An example of a data structure prepared from manipulation of two or more data sets would be a hierarchical cluster.
detection
"Detection" means detecting the presence of a component in a sample. Detection also means detecting the absence of a component. Detection also means determining the level of a component, either quantitatively or qualitatively.
differential expression
"Differential expression" means qualitative or quantitative differences in the temporal and/or spatial gene expression patterns within and among cells and tissue. Thus, a differentially expressed gene may qualitatively have its expression altered, including an activation or inactivation, in, e.g., normal versus diseased tissue. Genes may be turned on or turned off in a particular state, relative to another state thus permitting comparison of two or more states. A qualitatively regulated gene may exhibit an expression pattern within a state or cell type which may be detectable by standard techniques. Some genes may be expressed in one state or cell type, but not in both. Alternatively, the difference in expression may be quantitative, e.g., in that expression is modulated, up-regulated, resulting in an increased amount of transcript, or down-regulated, resulting in a decreased amount of transcript. The degree to which expression differs needs only be large enough to quantify via standard characterization techniques such as expression arrays, quantitative reverse transcriptase PCR, Northern blot analysis, real-time PCR, in situ hybridization and RNase protection.
expression profile
The term "expression profile" is used broadly to include a genomic expression profile, e.g., an expression profile of microRNAs. Profiles may be generated by any convenient means for determining a level of a nucleic acid sequence e.g. quantitative hybridization of microRNA, labeled microRNA, amplified microRNA, cDNA, etc., quantitative PCR, ELISA for quantitation, and the like, and allow the analysis of differential gene expression between two samples. A subject or patient tumor sample, e.g., cells or collections thereof, e.g., tissues, is assayed. Samples are collected by any convenient method, as known in the art. Nucleic acid sequences of interest are nucleic acid sequences that are found to be predictive, including the nucleic acid sequences provided above, where the expression profile may include expression data for 5, 10, 20, 25, 50, 100 or more of, including all of the listed nucleic acid sequences. According to some embodiments, the term "expression profile" means measuring the abundance of the nucleic acid sequences in the measured samples.
expression ratio
"Expression ratio" as used herein refers to relative expression levels of two or more nucleic acids as determined by detecting the relative expression levels of the corresponding nucleic acids in a biological sample.
FDR
When performing multiple statistical tests, for example in comparing the signal between two groups in multiple data features, there is an increasingly high probability of obtaining false positive results, by random differences between the groups that can reach levels that would otherwise be considered as statistically significant. In order to limit the proportion of such false discoveries, statistical significance is defined only for data features in which the differences reached a p-value (such as determined by two-sided t-test) below a threshold, which is dependent on the number of tests performed and the distribution of p-values obtained in these tests.
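The description above does not name a specific procedure. One widely used procedure consistent with it is the Benjamini-Hochberg step-up method, sketched below as an assumption rather than as the method actually employed herein. The function name, the FDR level of 0.05 and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return booleans marking which features are significant at the given FDR.

    The p-value cutoff depends on the number of tests (m) and on the
    distribution of the p-values themselves, via their ranks.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that the k-th smallest p-value <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank

    # Every feature ranked at or below the cutoff rank is declared significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[idx] = True
    return significant


# Hypothetical p-values for five data features.
p = [0.001, 0.008, 0.039, 0.041, 0.60]
print(benjamini_hochberg(p, fdr=0.05))
```

In this hypothetical example the third and fourth features have p-values below 0.05 but are not declared significant, because the cutoff tightens as the number of tests grows.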
fragment
"Fragment" is used herein to indicate a non-full length part of a nucleic acid or polypeptide. Thus, a fragment is itself also a nucleic acid or polypeptide, respectively.
gene
"Gene" as used herein may be a natural (e.g., genomic) or synthetic gene comprising transcriptional and/or translational regulatory sequences and/or a coding region and/or non-translated sequences (e.g., introns, 5'- and 3'-untranslated sequences). The coding region of a gene may be a nucleotide sequence coding for an amino acid sequence or a functional RNA, such as tRNA, rRNA, catalytic RNA, siRNA, miRNA or antisense RNA. A gene may also be an mRNA or cDNA corresponding to the coding regions (e.g., exons and miRNA) optionally comprising 5'- or 3'-untranslated sequences linked thereto. A gene may also be an amplified nucleic acid molecule produced in vitro comprising all or a part of the coding region and/or 5'- or 3'-untranslated sequences linked thereto.
Groove binder/minor groove binder (MGB)
"Groove binder" and/or "minor groove binder" may be used interchangeably and refer to small molecules that fit into the minor groove of double-stranded DNA, typically in a sequence-specific manner. Minor groove binders may be long, flat molecules that can adopt a crescent-like shape and thus, fit snugly into the minor groove of a double helix, often displacing water. Minor groove binding molecules may typically comprise several aromatic rings connected by bonds with torsional freedom such as furan, benzene, or pyrrole rings. Minor groove binders may be antibiotics such as netropsin, distamycin, berenil, pentamidine and other aromatic diamidines, Hoechst 33258, SN 6999, aureolic antitumor drugs such as chromomycin and mithramycin, CC-1065, dihydrocyclopyrroloindole tripeptide (DPI3), 1,2-dihydro-(3H)-pyrrolo[3,2-e]indole-7-carboxylate (CDPI3), and related compounds and analogues, including those described in Nucleic Acids in Chemistry and Biology, 2d ed., Blackburn and Gait, eds., Oxford University Press, 1996, and PCT Published Application No. WO 03/078450, the contents of which are incorporated herein by reference. A minor groove binder may be a component of a primer, a probe, a hybridization tag complement, or combinations thereof. Minor groove binders may increase the Tm of the primer or a probe to which they are attached, allowing such primers or probes to effectively hybridize at higher temperatures.
identity
"Identical" or "identity" as used herein in the context of two or more nucleic acids or polypeptide sequences mean that the sequences have a specified percentage of residues that are the same over a specified region. The percentage may be calculated by optimally aligning the two sequences, comparing the two sequences over the specified region, determining the number of positions at which the identical residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the specified region, and multiplying the result by 100 to yield the percentage of sequence identity. In cases where the two sequences are of different lengths or the alignment produces one or more staggered ends and the specified region of comparison includes only a single sequence, the residues of the single sequence are included in the denominator but not the numerator of the calculation. When comparing DNA and RNA sequences, thymine (T) and uracil (U) may be considered equivalent. Identity may be determined manually or by using a computer sequence algorithm such as BLAST or BLAST 2.0.
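The calculation described above may be sketched as follows. This is a minimal sketch that assumes the two sequences have already been optimally aligned and padded to equal length with "-" where only one sequence is present; the alignment step itself (e.g., by BLAST) is not shown, and the function name and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over an already-aligned region of equal length.

    Positions where only one sequence is present ("-" in the other) count
    in the denominator but not in the numerator. T and U are treated as
    equivalent so that DNA and RNA sequences can be compared.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")

    def normalize(base):
        base = base.upper()
        return "T" if base == "U" else base

    matches = sum(
        1
        for a, b in zip(aligned_a, aligned_b)
        if a != "-" and b != "-" and normalize(a) == normalize(b)
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical RNA sequence compared with its DNA equivalent.
print(percent_identity("UAGCUUAUCA", "TAGCTTATCA"))
# Hypothetical staggered end: the last position exists in one sequence only.
print(percent_identity("ACGT-", "ACGTA"))
```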
in situ detection
"In situ detection" as used herein means the detection of expression or expression levels at the original site, that is, in a tissue sample such as a biopsy.
label
"Label" as used herein means a composition detectable by spectroscopic, photochemical, biochemical, immunochemical, chemical, or other physical means. For example, useful labels include P32, fluorescent dyes, electron-dense reagents, enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and other entities which can be made detectable. A label may be incorporated into nucleic acids and proteins at any position.
logistic regression
Logistic regression is part of a category of statistical models called generalized linear models. Logistic regression allows one to predict a discrete outcome, such as group membership, from a set of variables that may be continuous, discrete, dichotomous, or a mix of any of these. The dependent or response variable can be dichotomous, for example, one of two possible types of cancer. Logistic regression models the natural log of the odds, i.e. the ratio of the probability of belonging to the first group (P) over the probability of belonging to the second group (1-P), as a linear combination of the different expression levels (or mathematical functions thereof). The logistic regression output can be used as a classifier by prescribing that a case or sample will be classified into the first group if P is greater than 0.5 or 50%. Alternatively, the calculated probability P can be used as a variable in other contexts such as a 1D or 2D threshold classifier.
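The relationship described above between the linear combination, the probability P and the 0.5 cutoff may be illustrated by the following minimal sketch. The coefficients, intercept and expression values are hypothetical and are not fitted to any data of this application; fitting the model is not shown.

```python
import math


def logistic_probability(expression_levels, coefficients, intercept):
    """Probability P of belonging to the first group.

    The natural log of P / (1 - P) is modeled as a linear combination
    of the expression levels.
    """
    log_odds = intercept + sum(
        c * x for c, x in zip(coefficients, expression_levels)
    )
    return 1.0 / (1.0 + math.exp(-log_odds))


def classify(p, cutoff=0.5):
    """Classify into the first group if P is greater than the cutoff."""
    return "first group" if p > cutoff else "second group"


# Hypothetical model: two expression levels with opposite-sign coefficients.
p = logistic_probability([10.0, 9.0], coefficients=[1.0, -1.0], intercept=0.0)
print(round(p, 3), classify(p))
```

The probability P returned here could equally be passed on as the variable of a threshold classifier instead of being compared to 0.5.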
1D/2D threshold classifier
"1D/2D threshold classifier" used herein may mean an algorithm for classifying a case or sample such as a cancer sample into one of two possible types such as two types of cancer. For a 1D threshold classifier, the decision is based on one variable and one predetermined threshold value; the sample is assigned to one class if the variable exceeds the threshold and to the other class if the variable is less than the threshold. A 2D threshold classifier is an algorithm for classifying into one of two types based on the values of two variables. A threshold may be calculated as a function (usually a continuous or even a monotonic function) of the first variable; the decision is then reached by comparing the second variable to the calculated threshold, similar to the 1D threshold classifier.
negative predictive value
"Negative predictive value" (NPV), as used herein, may mean a statistical measure of how well a binary classification test correctly identifies the absence of a condition, for example the probability that a patient does not have a specific condition, given a negative diagnosis. The NPV for class A is the proportion of cases that are correctly diagnosed as belonging to class "not A" by the test out of the cases that are diagnosed as belonging to class "not A", as determined by some absolute or gold standard.
nucleic acid
"Nucleic acid" or "oligonucleotide" or "polynucleotide", as used herein means at least two nucleotides covalently linked together. The depiction of a single strand also defines the sequence of the complementary strand. Thus, a nucleic acid also encompasses the complementary strand of a depicted single strand. Many variants of a nucleic acid may be used for the same purpose as a given nucleic acid. Thus, a nucleic acid also encompasses substantially identical nucleic acids and complements thereof. A single strand provides a probe that may hybridize to a target sequence under stringent hybridization conditions. Thus, a nucleic acid also encompasses a probe that hybridizes under stringent hybridization conditions.
Nucleic acids may be single stranded or double stranded, or may contain portions of both double stranded and single stranded sequences. The nucleic acid may be DNA, both genomic and cDNA, RNA, or a hybrid, where the nucleic acid may contain combinations of deoxyribo- and ribo-nucleotides, and combinations of bases including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine and isoguanine. Nucleic acids may be obtained by chemical synthesis methods or by recombinant methods.
A nucleic acid will generally contain phosphodiester bonds, although nucleic acid analogs may be included that may have at least one different linkage, e.g., phosphoramidate, phosphorothioate, phosphorodithioate, or O-methylphosphoroamidite linkages and peptide nucleic acid backbones and linkages. Other analog nucleic acids include those with positive backbones, non-ionic backbones, and non-ribose backbones, including those described in U.S. Pat. Nos. 5,235,033 and 5,034,506, which are incorporated herein by reference. Nucleic acids containing one or more non-naturally occurring or modified nucleotides are also included within one definition of nucleic acids. The modified nucleotide analog may be located for example at the 5'-end and/or the 3'-end of the nucleic acid molecule. Representative examples of nucleotide analogs may be selected from sugar- or backbone-modified ribonucleotides. It should be noted, however, that also nucleobase-modified ribonucleotides, i.e. ribonucleotides containing a non-naturally occurring nucleobase instead of a naturally occurring nucleobase, such as uridines or cytidines modified at the 5-position, e.g. 5-(2-amino)propyl uridine, 5-bromo uridine; adenosines and guanosines modified at the 8-position, e.g. 8-bromo guanosine; deaza nucleotides, e.g. 7-deaza-adenosine; O- and N-alkylated nucleotides, e.g. N6-methyl adenosine, are suitable. The 2'-OH-group may be replaced by a group selected from H, OR, R, halo, SH, SR, NH2, NHR, NR2 or CN, wherein R is C1-C6 alkyl, alkenyl or alkynyl and halo is F, Cl, Br or I. Modified nucleotides also include nucleotides conjugated with cholesterol through, e.g., a hydroxyprolinol linkage as described in Krutzfeldt et al., Nature 438:685-689 (2005), Soutschek et al., Nature 432:173-178 (2004), and U.S. Patent Publication No. 20050107325, which are incorporated herein by reference. Additional modified nucleotides and nucleic acids are described in U.S. Patent Publication No. 20050182005, which is incorporated herein by reference.
Modifications of the ribose-phosphate backbone may be done for a variety of reasons, e.g., to increase the stability and half-life of such molecules in physiological environments, to enhance diffusion across cell membranes, or as probes on a biochip. The backbone modification may also enhance resistance to degradation, such as in the harsh endocytic environment of cells. The backbone modification may also reduce nucleic acid clearance by hepatocytes, such as in the liver and kidney. Mixtures of naturally occurring nucleic acids and analogs may be made; alternatively, mixtures of different nucleic acid analogs, and mixtures of naturally occurring nucleic acids and analogs may be made.
positive predictive value
"Positive predictive value" (PPV), as used herein, may mean a statistical measure of how well a binary classification test correctly identifies a condition, for example the probability that a patient has a specific condition, given a positive diagnosis. The PPV for class A is the proportion of cases that are correctly diagnosed as belonging to class "A" by the test out of the cases that are diagnosed as belonging to class "A", as determined by some absolute or gold standard.
probe
"Probe" as used herein means an oligonucleotide capable of binding to a target nucleic acid of complementary sequence through one or more types of chemical bonds, usually through complementary base pairing, usually through hydrogen bond formation. Probes may bind target sequences lacking complete complementarity with the probe sequence depending upon the stringency of the hybridization conditions. There may be any number of base pair mismatches which will interfere with hybridization between the target sequence and the single stranded nucleic acids described herein. However, if the number of mutations is so great that no hybridization can occur under even the least stringent of hybridization conditions, the sequence is not a complementary target sequence. A probe may be single stranded or partially single and partially double stranded. The strandedness of the probe is dictated by the structure, composition, and properties of the target sequence. Probes may be directly labeled or indirectly labeled such as with biotin to which a streptavidin complex may later bind.
reference expression profile
As used herein, the phrase "reference expression profile" refers to a criterion expression profile to which measured values are compared in order to determine the prognosis of a subject with renal cancer. The reference expression profile may be based on the abundance of the nucleic acids, or may be based on a combined metric score thereof.
reference value
As used herein the term "reference value" means a value that statistically correlates to a particular outcome when compared to an assay result. In preferred embodiments the reference value is determined from statistical analysis of studies that compare microRNA expression with known clinical outcomes. The reference value may be a threshold score value or a cutoff score value. Typically a reference value will be a threshold above which one outcome is more probable and below which an alternative outcome is more probable.
sensitivity
"Sensitivity" used herein may mean a statistical measure of how well a binary classification test correctly identifies a condition, for example how frequently it correctly classifies a cancer into the correct type out of two possible types. The sensitivity for class A is the proportion of cases that are determined to belong to class "A" by the test out of the cases that are in class "A", as determined by some absolute or gold standard.
specificity
"Specificity" used herein may mean a statistical measure of how well a binary classification test correctly identifies a condition, for example how frequently it correctly classifies a cancer into the correct type out of two possible types. The specificity for class A is the proportion of cases that are determined to belong to class "not A" by the test out of the cases that are in class "not A", as determined by some absolute or gold standard.
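The four measures defined herein (sensitivity, specificity, positive predictive value and negative predictive value) may be computed from paired gold-standard and test labels as in the following minimal sketch. The function name and the example labels are hypothetical and do not represent data of this application.

```python
def binary_metrics(true_labels, predicted_labels, positive="A"):
    """Sensitivity, specificity, PPV and NPV for class `positive`.

    `true_labels` come from the gold standard and `predicted_labels` from
    the test. Any label other than `positive` is treated as "not A".
    A ZeroDivisionError is raised if a denominator is zero, for example
    when the test never predicts class A.
    """
    tp = fp = tn = fn = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if predicted == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return {
        "sensitivity": tp / (tp + fn),  # called "A" out of truly "A"
        "specificity": tn / (tn + fp),  # called "not A" out of truly "not A"
        "ppv": tp / (tp + fp),          # truly "A" out of called "A"
        "npv": tn / (tn + fn),          # truly "not A" out of called "not A"
    }


# Hypothetical gold-standard labels and test calls for ten cases.
truth = ["A", "A", "A", "A", "B", "B", "B", "B", "B", "B"]
calls = ["A", "A", "A", "B", "A", "B", "B", "B", "B", "B"]
print(binary_metrics(truth, calls))
```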
stage of cancer
As used herein, the term "stage of cancer" refers to a numerical measurement of the level of advancement of a cancer. Criteria used to determine the stage of a cancer include, but are not limited to, the size of the tumor, whether the tumor has spread to other parts of the body and where the cancer has spread (e.g., within the same organ or region of the body or to another organ).
stringent hybridization conditions
"Stringent hybridization conditions" as used herein mean conditions under which a first nucleic acid sequence (e.g., probe) will hybridize to a second nucleic acid sequence (e.g., target), such as in a complex mixture of nucleic acids. Stringent conditions are sequence-dependent and will be different in different circumstances. Stringent conditions may be selected to be about 5-10°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm may be the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50% of the probes complementary to the target hybridize to the target sequence at equilibrium (as the target sequences are present in excess, at Tm, 50% of the probes are occupied at equilibrium). Stringent conditions may be those in which the salt concentration is less than about 1.0 M sodium ion, such as about 0.01-1.0 M sodium ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C for short probes (e.g., about 10-50 nucleotides) and at least about 60°C for long probes (e.g., greater than about 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. For selective or specific hybridization, a positive signal may be at least 2 to 10 times background hybridization. Exemplary stringent hybridization conditions include the following: 50% formamide, 5x SSC, and 1% SDS, incubating at 42°C, or, 5x SSC, 1% SDS, incubating at 65°C, with wash in 0.2x SSC, and 0.1% SDS at 65°C.
substantially complementary
"Substantially complementary" as used herein means that a first sequence is at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98% or 99% identical to the complement of a second sequence over a region of 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100 or more nucleotides, or that the two sequences hybridize under stringent hybridization conditions.
substantially identical
"Substantially identical" as used herein means that a first and a second sequence are at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98% or 99% identical over a region of 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100 or more nucleotides or amino acids, or with respect to nucleic acids, if the first sequence is substantially complementary to the complement of the second sequence.
subtype of cancer
As used herein, the term "subtype of cancer" refers to different types of cancer that affect the same organ (e.g., spindle cell, cystic and collecting duct carcinomas of the kidney).
subject
As used herein, the term "subject" refers to a mammal, including both humans and other mammals. The methods of the present invention are preferably applied to human subjects.
target nucleic acid
"Target nucleic acid" as used herein means a nucleic acid or variant thereof that may be bound by another nucleic acid. A target nucleic acid may be a DNA sequence. The target nucleic acid may be RNA. The target nucleic acid may comprise a mRNA, tRNA, shRNA, siRNA or Piwi-interacting RNA, or a pri-miRNA, pre-miRNA, miRNA, or anti-miRNA.
The target nucleic acid may comprise a target miRNA binding site or a variant thereof. One or more probes may bind the target nucleic acid. The target binding site may comprise 5-100 or 10-60 nucleotides. The target binding site may comprise a total of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30-40, 40-50, 50-60, 61, 62 or 63 nucleotides. The target site sequence may comprise at least 5 nucleotides of the sequence of a target miRNA binding site disclosed in WO2006/092738, U.S. Patent Applications 11/418,718 or 11/429,720, the contents of which are incorporated herein.
threshold expression profile
As used herein, the phrase "threshold expression level" refers to a reference expression value. Measured values are compared to a corresponding threshold expression level to determine the prognosis of a subject.
tissue sample
As used herein, a tissue sample is tissue obtained from a tissue biopsy using methods well known to those of ordinary skill in the related medical arts. The phrase "suspected of being cancerous" as used herein means a cancer tissue sample believed by one of ordinary skill in the medical arts to contain cancerous cells. Methods for obtaining the sample from the biopsy include gross apportioning of a mass, microdissection, laser-based microdissection, or other art-known cell-separation methods.
tumor
"Tumor" as used herein, refers to all neoplastic cell growth and proliferation, whether malignant or benign, and all pre-cancerous and cancerous cells and tissues.
variant
"Variant" as used herein referring to a nucleic acid means (i) a portion of a referenced nucleotide sequence; (ii) the complement of a referenced nucleotide sequence or portion thereof; (iii) a nucleic acid that is substantially identical to a referenced nucleic acid or the complement thereof; or (iv) a nucleic acid that hybridizes under stringent conditions to the referenced nucleic acid, complement thereof, or a sequence substantially identical thereto.
wild type
As used herein, the term "wild type" sequence refers to a coding, a non-coding or an interface sequence which is an allelic form of sequence that performs the natural or normal function for that sequence. Wild type sequences include multiple allelic forms of a cognate sequence, for example, multiple alleles of a wild type sequence may encode silent or conservative changes to the protein sequence that a coding sequence encodes.
The present invention employs miRNAs for the identification, classification and diagnosis of specific cancers and the identification of their tissues of origin.
microRNA processing
A gene coding for microRNA (miRNA) may be transcribed leading to production of a miRNA primary transcript known as the pri-miRNA. The pri-miRNA may comprise a hairpin with a stem and loop structure. The stem of the hairpin may comprise mismatched bases. The pri-miRNA may comprise several hairpins in a polycistronic structure.
The hairpin structure of the pri-miRNA may be recognized by Drosha, which is an RNase III endonuclease. Drosha may recognize terminal loops in the pri-miRNA and cleave approximately two helical turns into the stem to produce a 60-70 nt precursor known as the pre-miRNA. Drosha may cleave the pri-miRNA with a staggered cut typical of RNase III endonucleases yielding a pre-miRNA stem loop with a 5' phosphate and ~2 nucleotide 3' overhang. Approximately one helical turn of stem (~10 nucleotides) extending beyond the Drosha cleavage site may be essential for efficient processing. The pre-miRNA may then be actively transported from the nucleus to the cytoplasm by Ran-GTP and the export receptor Exportin-5. The pre-miRNA may be recognized by Dicer, which is also an RNase III endonuclease. Dicer may recognize the double-stranded stem of the pre-miRNA. Dicer may also cut off the terminal loop two helical turns away from the base of the stem loop leaving an additional 5' phosphate and ~2 nucleotide 3' overhang. The resulting siRNA-like duplex, which may comprise mismatches, comprises the mature miRNA and a similar-sized fragment known as the miRNA*. The miRNA and miRNA* may be derived from opposing arms of the pri-miRNA and pre-miRNA. MiRNA* sequences may be found in libraries of cloned miRNAs but typically at lower frequency than the miRNAs.
Although initially present as a double-stranded species with miRNA*, the miRNA may eventually become incorporated as a single-stranded RNA into a ribonucleoprotein complex known as the RNA-induced silencing complex (RISC). Various proteins can form the RISC, which can lead to variability in specificity for miRNA/miRNA* duplexes, binding site of the target gene, activity of miRNA (repress or activate), and which strand of the miRNA/miRNA* duplex is loaded into the RISC.
When the miRNA strand of the miRNA:miRNA* duplex is loaded into the RISC, the miRNA* may be removed and degraded. The strand of the miRNA:miRNA* duplex that is loaded into the RISC may be the strand whose 5' end is less tightly paired. In cases where both ends of the miRNA:miRNA* have roughly equivalent 5' pairing, both miRNA and miRNA* may have gene silencing activity.
The RISC may identify target nucleic acids based on high levels of complementarity between the miRNA and the mRNA, especially by nucleotides 2-7 of the miRNA. Only one case has been reported in animals where the interaction between the miRNA and its target was along the entire length of the miRNA. This was shown for mir-196 and Hox B8 and it was further shown that mir-196 mediates the cleavage of the Hox B8 mRNA (Yekta et al 2004, Science 304-594). Otherwise, such interactions are known only in plants (Bartel & Bartel 2003, Plant Physiol 132-709).
A number of studies have looked at the base-pairing requirement between miRNA and its mRNA target for achieving efficient inhibition of translation (reviewed by Bartel 2004, Cell 116-281). In mammalian cells, the first 8 nucleotides of the miRNA may be important (Doench & Sharp 2004 GenesDev 2004-504). However, other parts of the microRNA may also participate in mRNA binding. Moreover, sufficient base pairing at the 3' end can compensate for insufficient pairing at the 5' end (Brennecke et al, 2005 PLoS 3-e85). Computational studies analyzing miRNA binding on whole genomes have suggested a specific role for bases 2-7 at the 5' end of the miRNA in target binding, but the role of the first nucleotide, found usually to be "A", was also recognized (Lewis et al 2005 Cell 120-15). Similarly, nucleotides 1-7 or 2-8 were used to identify and validate targets by Krek et al (2005, Nat Genet 37-495).
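The seed-based target recognition discussed above can be illustrated with a minimal Python sketch. It assumes perfect Watson-Crick pairing to nucleotides 2-7 of the miRNA; G:U wobble pairs and compensatory 3' pairing are ignored. The function names and the toy sequences are hypothetical and are not taken from this disclosure.

```python
# Minimal sketch: locate candidate seed-match sites in an mRNA.
# Assumption: only canonical pairing to miRNA nucleotides start..end (1-based).

_RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna, start=2, end=7):
    """Return the mRNA sequence (5'->3') that pairs with miRNA nucleotides start..end."""
    seed = mirna.upper().replace("T", "U")[start - 1:end]
    # Pairing is antiparallel, so the site is the reverse complement of the seed.
    return "".join(_RNA_PAIR[n] for n in reversed(seed))


def find_seed_matches(mirna, mrna, start=2, end=7):
    """Return 0-based positions in the mRNA where the seed site occurs."""
    site = seed_site(mirna, start, end)
    mrna = mrna.upper().replace("T", "U")
    return [i for i in range(len(mrna) - len(site) + 1)
            if mrna[i:i + len(site)] == site]


toy_mirna = "UACGGAUCCAGUUAGCAUGCAA"   # hypothetical 22-nt miRNA
toy_mrna = "GGGAUCCGUAAAC"             # hypothetical mRNA fragment
print(seed_site(toy_mirna), find_seed_matches(toy_mirna, toy_mrna))
```

Passing `start=2, end=8` or `start=1, end=7` gives the alternative seed windows mentioned above.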
The target sites in the mRNA may be in the 5' UTR, the 3' UTR or in the coding region. Interestingly, multiple miRNAs may regulate the same mRNA target by recognizing the same or multiple sites. The presence of multiple miRNA binding sites in most genetically identified targets may indicate that the cooperative action of multiple RISCs provides the most efficient translational inhibition.
miRNAs may direct the RISC to downregulate gene expression by either of two mechanisms: mRNA cleavage or translational repression. The miRNA may specify cleavage of the mRNA if the mRNA has a certain degree of complementarity to the miRNA. When a miRNA guides cleavage, the cut may be between the nucleotides pairing to residues 10 and 11 of the miRNA. Alternatively, the miRNA may repress translation if the mRNA does not have the requisite degree of complementarity to the miRNA. Translational repression may be more prevalent in animals since animals may have a lower degree of complementarity between the miRNA and binding site.
It should be noted that there may be variability in the 5' and 3' ends of any pair of miRNA and miRNA*. This variability may be due to variability in the enzymatic processing of Drosha and Dicer with respect to the site of cleavage. Variability at the 5' and 3' ends of miRNA and miRNA* may also be due to mismatches in the stem structures of the pri-miRNA and pre-miRNA. The mismatches of the stem strands may lead to a population of different hairpin structures. Variability in the stem structures may also lead to variability in the products of cleavage by Drosha and Dicer.
Nucleic Acids
Nucleic acids are provided herein. The nucleic acids comprise the sequences of SEQ ID NOS: 1-92 or variants thereof. The variant may be a complement of the referenced nucleotide sequence. The variant may also be a nucleotide sequence that is substantially identical to the referenced nucleotide sequence or the complement thereof. The variant may also be a nucleotide sequence which hybridizes under stringent conditions to the referenced nucleotide sequence, complements thereof, or nucleotide sequences substantially identical thereto. The nucleic acid may have a length of from about 10 to about 250 nucleotides. The nucleic acid may have a length of at least 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 125, 150, 175, 200 or 250 nucleotides. The nucleic acid may be synthesized or expressed in a cell (in vitro or in vivo) using a synthetic gene described herein. The nucleic acid may be synthesized as a single strand molecule and hybridized to a substantially complementary nucleic acid to form a duplex. The nucleic acid may be introduced to a cell, tissue or organ in a single- or double-stranded form or capable of being expressed by a synthetic gene using methods well known to those skilled in the art, including as described in U.S. Patent No. 6,506,559 which is incorporated by reference.
Table 1: The nucleic acids of the invention (miRs and related hairpins)
[Figure imgf000027_0001: first part of Table 1, provided as an image in the original]
miR name          miR SEQ ID NO    hairpin SEQ ID NO
hsa-miR-21        51               9
hsa-miR-30e*      52               53
hsa-miR-30e       54               53
hsa-miR-362-5p    55               56
hsa-miR-29c*      57               58
hsa-miR-487b      59               60
Nucleic acid complexes
The nucleic acid may further comprise one or more of the following: a peptide, a protein, a RNA-DNA hybrid, an antibody, an antibody fragment, a Fab fragment, and an aptamer.
Pri-miRNA
The nucleic acid may comprise a sequence of a pri-miRNA or a variant thereof. The pri-miRNA sequence may comprise from 45-30,000, 50-25,000, 100-20,000, 1,000-1,500 or 80-100 nucleotides. The sequence of the pri-miRNA may comprise a pre-miRNA, miRNA and miRNA*, as set forth herein, and variants thereof. The sequence of the pri-miRNA may comprise any of the sequences of SEQ ID NOS: 1-60 or variants thereof.
The pri-miRNA may comprise a hairpin structure. The hairpin may comprise a first and a second nucleic acid sequence that are substantially complementary. The first and second nucleic acid sequence may be from 37-50 nucleotides. The first and second nucleic acid sequence may be separated by a third sequence of from 8-12 nucleotides. The hairpin structure may have a free energy of less than -25 Kcal/mole as calculated by the Vienna algorithm with default parameters, as described in Hofacker et al., Monatshefte f. Chemie 125: 167-188 (1994), the contents of which are incorporated herein by reference. The hairpin may comprise a terminal loop of 4-20, 8-12 or 10 nucleotides. The pri-miRNA may comprise at least 19% adenosine nucleotides, at least 16% cytosine nucleotides, at least 23% thymine nucleotides and at least 19% guanine nucleotides.
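The nucleotide composition criteria above can be checked with a short Python sketch. The minimum fractions are those stated in the preceding paragraph; the free-energy criterion is not covered because it requires an RNA folding program. Function names and test sequences are hypothetical.

```python
# Minimal sketch: check the base-composition criteria for a candidate pri-miRNA.
from collections import Counter

# Minimum fractions from the text (uracil in RNA input is counted as thymine).
MIN_FRACTION = {"A": 0.19, "C": 0.16, "T": 0.23, "G": 0.19}


def base_fractions(seq):
    """Return the fraction of A, C, G and T in the sequence."""
    seq = seq.upper().replace("U", "T")
    counts = Counter(seq)
    return {base: counts[base] / len(seq) for base in "ACGT"}


def meets_composition(seq):
    """True if every base reaches its minimum fraction."""
    fractions = base_fractions(seq)
    return all(fractions[b] >= MIN_FRACTION[b] for b in MIN_FRACTION)


balanced = "ACGT" * 25          # hypothetical sequence, 25% of each base
a_rich = "A" * 70 + "CGT" * 10  # hypothetical sequence, 70% adenosine
print(meets_composition(balanced), meets_composition(a_rich))
```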
Pre-miRNA
The nucleic acid may also comprise a sequence of a pre-miRNA or a variant thereof. The pre-miRNA sequence may comprise from 45-90, 60-80 or 60-70 nucleotides. The sequence of the pre-miRNA may comprise a miRNA and a miRNA* as set forth herein. The sequence of the pre-miRNA may also be that of a pri-miRNA excluding from 0-160 nucleotides from the 5' and 3' ends of the pri-miRNA. The sequence of the pre-miRNA may comprise the sequence of SEQ ID NOS: 1-60 or variants thereof.
miRNA
The nucleic acid may also comprise a sequence of a miRNA (including miRNA*) or a variant thereof. The miRNA sequence may comprise from 13-33, 18-24 or 21-23 nucleotides. The miRNA may also comprise a total of at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 or 40 nucleotides. The sequence of the miRNA may be the first 13-33 nucleotides of the pre-miRNA. The sequence of the miRNA may also be the last 13-33 nucleotides of the pre-miRNA. The sequence of the miRNA may comprise the sequence of SEQ ID NOS: 1-5, 11-27, 45-49, 51, 52, 54-55, 57, 59 or variants thereof.
Probes
A probe is also provided comprising a nucleic acid described herein. Probes may be used for screening and diagnostic methods, as outlined below. The probe may be attached or immobilized to a solid substrate, such as a biochip.
The probe may have a length of from 8 to 500, 10 to 100 or 20 to 60 nucleotides. The probe may also have a length of at least 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 120, 140, 160, 180, 200, 220, 240, 260, 280 or 300 nucleotides. The probe may further comprise a linker sequence of from 10-60 nucleotides.
Test Probe
The probe may be a test probe. The test probe may comprise a nucleic acid sequence that is complementary to a miRNA, a miRNA*, a pre-miRNA, or a pri-miRNA. The sequence of the test probe may be complementary to a sequence selected from SEQ ID NOS: 1-60; fragments or variants thereof.
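As an illustration of a test probe complementary to a miRNA, the following Python sketch derives a DNA probe as the reverse complement of a target and scores a probe by its ungapped percent identity to the complement of the target, in the spirit of the "substantially complementary" definition given earlier. It is a simplified sketch: only equal-length, ungapped comparison is handled, and the function names and the toy sequence are hypothetical.

```python
# Minimal sketch: design a DNA probe complementary to a miRNA and score it.

_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement_dna(seq):
    """Reverse complement as DNA; RNA input (U) is first converted to T."""
    dna = seq.upper().replace("U", "T")
    return dna.translate(_DNA_COMPLEMENT)[::-1]


def percent_identity(a, b):
    """Ungapped percent identity between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this simple comparison needs equal-length sequences")
    a = a.upper().replace("U", "T")
    b = b.upper().replace("U", "T")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def complementarity_percent(probe, target):
    """Percent identity of the probe to the complement of the target."""
    return percent_identity(probe, reverse_complement_dna(target))


toy_mirna = "UACGGAUCCAGUUAGCAUGCAA"        # hypothetical 22-nt miRNA
probe = reverse_complement_dna(toy_mirna)   # fully complementary DNA probe
print(probe, complementarity_percent(probe, toy_mirna))
```

A probe that scores at or above one of the listed percentages over a region of at least 8 nucleotides would meet the identity-based part of that definition; the hybridization-based alternative cannot be evaluated from sequence alone.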
Linker Sequences
The probe may further comprise a linker. The linker may be 10-60 nucleotides in length. The linker may be 20-27 nucleotides in length. The linker may be of sufficient length to allow the probe to be a total length of 45-60 nucleotides. The linker may not be capable of forming a stable secondary structure, or may not be capable of folding on itself, or may not be capable of folding on a non-linker portion of a nucleic acid contained in the probe. The sequence of the linker may not appear in the genome of the animal from which the probe non-linker nucleic acid is derived.
Reverse Transcription
Target sequences of a cDNA may be generated by reverse transcription of the target RNA. Methods for generating cDNA may include reverse transcribing polyadenylated RNA or, alternatively, RNA with a ligated adaptor sequence.
Reverse Transcription using Adaptor Sequence Ligated to RNA
The RNA may be ligated to an adaptor sequence prior to reverse transcription. A ligation reaction may be performed by T4 RNA ligase to ligate an adaptor sequence at the 3' end of the RNA. A reverse transcription (RT) reaction may then be performed using a primer comprising a sequence that is complementary to the 3' end of the adaptor sequence.
Reverse Transcription using Polyadenylated Sequence Ligated to RNA
Polyadenylated RNA may be used in a reverse transcription (RT) reaction using a poly(T) primer comprising a 5' adaptor sequence. The poly(T) sequence may comprise 8, 9, 10, 11, 12, 13, or 14 consecutive thymines.
RT-PCR of RNA
The reverse transcript of the RNA may be amplified by real time PCR, using a specific forward primer comprising at least 15 nucleic acids complementary to the target nucleic acid and a 5' tail sequence; a reverse primer that is complementary to the 3' end of the adaptor sequence; and a probe comprising at least 8 nucleic acids complementary to the target nucleic acid. The probe may be partially complementary to the 5' end of the adaptor sequence.
PCR of Target Nucleic Acids
Methods of amplifying target nucleic acids are described herein. The amplification may be by a method comprising PCR. The first cycles of the PCR reaction may have an annealing temperature of 56°C, 57°C, 58°C, 59°C, or 60°C. The first cycles may comprise 1-10 cycles. The remaining cycles of the PCR reaction may have an annealing temperature of 60°C. The remaining cycles may comprise 2-40 cycles. The annealing temperature may cause the PCR to be more sensitive. The PCR may generate longer products that can serve as higher stringency PCR templates.
Forward Primer
The PCR reaction may comprise a forward primer. The forward primer may comprise 15, 16, 17, 18, 19, 20, or 21 nucleotides identical to the target nucleic acid.
The 3' end of the forward primer may be sensitive to differences in sequence between a target nucleic acid and a sibling nucleic acid. The forward primer may also comprise a 5' overhanging tail. The 5' tail may increase the melting temperature of the forward primer. The sequence of the 5' tail may comprise a sequence that is non-identical to the genome of the animal from which the target nucleic acid is isolated. The sequence of the 5' tail may also be synthetic. The 5' tail may comprise 8, 9, 10, 11, 12, 13, 14, 15, or 16 nucleotides.
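The forward primer layout described above (a 5' tail followed by 15-21 nucleotides identical to the target) can be sketched in Python as follows. Taking the identical stretch from the 5' end of the target is an assumption made for illustration, as are the function name, the tail sequence and the toy target; none of them is taken from this disclosure.

```python
# Minimal sketch: assemble a forward primer as 5' tail + target-identical stretch.
# Assumption: the identical stretch is the first n nucleotides of the target.


def forward_primer(target, tail, n_identical=18):
    """Return tail + first n_identical target nucleotides, written as DNA."""
    if not 15 <= n_identical <= 21:
        raise ValueError("identical stretch should be 15-21 nucleotides")
    if not 8 <= len(tail) <= 16:
        raise ValueError("5' tail should be 8-16 nucleotides")
    dna = target.upper().replace("U", "T")
    if len(dna) < n_identical:
        raise ValueError("target is shorter than the requested stretch")
    return tail.upper() + dna[:n_identical]


toy_target = "UACGGAUCCAGUUAGCAUGCAA"  # hypothetical miRNA sequence
toy_tail = "CAGTCATTTGGC"              # hypothetical synthetic 12-nt tail
print(forward_primer(toy_target, toy_tail))
```

In practice the tail would also be screened against the genome of the source animal and the primer melting temperature checked, neither of which this sketch attempts.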
Reverse Primer
The PCR reaction may comprise a reverse primer. The reverse primer may be complementary to a target nucleic acid. The reverse primer may also comprise a sequence complementary to an adaptor sequence. The reverse primer may comprise SEQ ID NO: 61 (GCGAGCACAGAATTAATACGAC).
Biochip
A biochip is also provided. The biochip may comprise a solid substrate comprising an attached probe or plurality of probes described herein. The probes may be capable of hybridizing to a target sequence under stringent hybridization conditions. The probes may be attached at spatially defined addresses on the substrate. More than one probe per target sequence may be used, with either overlapping probes or probes to different sections of a particular target sequence. The probes may be capable of hybridizing to target sequences associated with a single disorder appreciated by those in the art. The probes may either be synthesized first, with subsequent attachment to the biochip, or may be directly synthesized on the biochip.
The solid substrate may be a material that may be modified to contain discrete individual sites appropriate for the attachment or association of the probes and is amenable to at least one detection method. Representative examples of substrates include glass and modified or functionalized glass, plastics (including acrylics, polystyrene and copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene, polyurethanes, Teflon, etc.), polysaccharides, nylon or nitrocellulose, resins, silica or silica-based materials including silicon and modified silicon, carbon, metals, inorganic glasses and plastics. The substrates may allow optical detection without appreciably fluorescing.
The substrate may be planar, although other configurations of substrates may be used as well. For example, probes may be placed on the inside surface of a tube, for flow-through sample analysis to minimize sample volume. Similarly, the substrate may be flexible, such as flexible foam, including closed cell foams made of particular plastics. The biochip and the probe may be derivatized with chemical functional groups for subsequent attachment of the two. For example, the biochip may be derivatized with a chemical functional group including, but not limited to, amino groups, carboxyl groups, oxo groups or thiol groups. Using these functional groups, the probes may be attached using functional groups on the probes either directly or indirectly using a linker. The probes may be attached to the solid support by either the 5' terminus, 3' terminus, or via an internal nucleotide.
The probe may also be attached to the solid support non-covalently. For example, biotinylated oligonucleotides can be made, which may bind to surfaces covalently coated with streptavidin, resulting in attachment. Alternatively, probes may be synthesized on the surface using techniques such as photopolymerization and photolithography.
Diagnostics
As used herein the term "diagnosing" refers to classifying pathology, or a symptom, determining a severity of the pathology (grade or stage), monitoring pathology progression, forecasting an outcome of pathology and/or prospects of recovery.
As used herein the phrase "subject in need thereof" refers to an animal or human subject who is known to have cancer, is at risk of having cancer [e.g., a genetically predisposed subject, a subject with medical and/or family history of cancer, a subject who has been exposed to carcinogens, occupational hazard, environmental hazard] and/or a subject who exhibits suspicious clinical signs of cancer [e.g., blood in the stool or melena, unexplained pain, sweating, unexplained fever, unexplained loss of weight up to anorexia, changes in bowel habits (constipation and/or diarrhea), tenesmus (sense of incomplete defecation, for rectal cancer specifically), anemia and/or general weakness]. Additionally or alternatively, the subject in need thereof can be a healthy human subject undergoing a routine well-being check up.
Analyzing presence of malignant or pre-malignant cells can be effected in-vivo or ex-vivo, whereby a biological sample (e.g., biopsy) is retrieved. Such biopsy samples comprise cells and may be an incisional or excisional biopsy. Alternatively the cells may be retrieved from a complete resection.
While employing the present teachings, additional information may be gleaned pertaining to the determination of treatment regimen, treatment course and/or to the measurement of the severity of the disease. As used herein the phrase "treatment regimen" refers to a treatment plan that specifies the type of treatment, dosage, schedule and/or duration of a treatment provided to a subject in need thereof (e.g., a subject diagnosed with a pathology). The selected treatment regimen can be an aggressive one which is expected to result in the best clinical outcome (e.g., complete cure of the pathology) or a more moderate one which may relieve symptoms of the pathology yet results in incomplete cure of the pathology. It will be appreciated that in certain cases the treatment regimen may be associated with some discomfort to the subject or adverse side effects (e.g., damage to healthy cells or tissue). The type of treatment can include a surgical intervention (e.g., removal of lesion, diseased cells, tissue, or organ), a cell replacement therapy, an administration of a therapeutic drug (e.g., receptor agonists, antagonists, hormones, chemotherapy agents) in a local or a systemic mode, an exposure to radiation therapy using an external source (e.g., external beam) and/or an internal source (e.g., brachytherapy) and/or any combination thereof. The dosage, schedule and duration of treatment can vary, depending on the severity of pathology and the selected type of treatment, and those of skill in the art are capable of adjusting the type of treatment with the dosage, schedule and duration of treatment.
A method of diagnosis is also provided. The method comprises detecting an expression level of a specific cancer-associated nucleic acid in a biological sample. The sample may be derived from a patient. Diagnosis of a specific cancer state in a patient may allow for prognosis and selection of therapeutic strategy. Further, the developmental stage of cells may be classified by determining temporally expressed specific cancer-associated nucleic acids.
In situ hybridization of labeled probes to tissue sections may be performed. When comparing the fingerprints between individual samples the skilled artisan can make a diagnosis, a prognosis, or a prediction based on the findings. It is further understood that the nucleic acid sequences which indicate the diagnosis may differ from those which indicate the prognosis, and molecular profiling of the condition of the cells may lead to distinctions between responsive or refractory conditions or may be predictive of outcomes.
Kits
A kit is also provided and may comprise a nucleic acid described herein together with any or all of the following: assay reagents, buffers, probes and/or primers, and sterile saline or another pharmaceutically acceptable emulsion and suspension base. In addition, the kits may include instructional materials containing directions (e.g., protocols) for the practice of the methods described herein. The kit may further comprise a software package for data analysis of expression profiles.
For example, the kit may be a kit for the amplification, detection, identification or quantification of a target nucleic acid sequence. The kit may comprise a poly (T) primer, a forward primer, a reverse primer, and a probe.
Any of the compositions described herein may be comprised in a kit. In a non-limiting example, reagents for isolating miRNA, labeling miRNA, and/or evaluating a miRNA population using an array are included in a kit. The kit may further include reagents for creating or synthesizing miRNA probes. The kits will thus comprise, in suitable container means, an enzyme for labeling the miRNA by incorporating labeled nucleotide or unlabeled nucleotides that are subsequently labeled. It may also include one or more buffers, such as reaction buffer, labeling buffer, washing buffer, or a hybridization buffer, compounds for preparing the miRNA probes, components for in situ hybridization and components for isolating miRNA. Other kits of the invention may include components for making a nucleic acid array comprising miRNA, and thus, may include, for example, a solid support.
The following examples are presented in order to more fully illustrate some embodiments of the invention. They should in no way be construed, however, as limiting the broad scope of the invention.
EXAMPLES
Example 1
Materials and Methods
1. Tumor samples
Patients who were surgically treated for kidney cancer at Tel Hashomer Medical Center between 1992 and 2006 were identified. Patient charts were reviewed for demographic (age, gender, etc.) and clinicopathological (surgical procedures and pathological findings, chemotherapy etc.) information. For Example 5, the inclusion criteria were non-metastatic patients with two years follow up or disease progression within two years. Survival time was defined as time from surgery until death or censoring (death or loss to follow-up). Progression time was defined as the time from surgery until detection of recurrence or censoring.
Fifty-seven formalin-fixed, paraffin-embedded (FFPE) renal tumor samples were obtained from the pathology archives of Sheba Medical Center (Tel-Hashomer, Israel). The study protocol was approved by the Research Ethics Board of the contributing institute. FFPE samples were reviewed by a pathologist with experience in urological pathology for histological type based on hematoxylin-eosin (H&E) stained slides, performed on the first and/or last sections of the sample. Tumor classification was based on the World Health Organization (WHO) guidelines. Tumor content was higher than 50% for all the samples.
2. RNA extraction
Total RNA was isolated from seven to ten 10-μm-thick FFPE tissue sections using the miR extraction protocol developed at Rosetta Genomics. Briefly, the sample was incubated a few times in xylene at 57°C to remove excess paraffin, followed by ethanol washes. Proteins were degraded by proteinase K solution at 45°C for a few hours. The RNA was extracted with acid phenol:chloroform, followed by ethanol precipitation and DNAse digestion. Total RNA quantity and quality were checked by spectrophotometer (Nanodrop ND-1000).
3. MicroRNA profiling
Custom microarrays were produced by printing DNA oligonucleotide probes representing 903 human microRNAs. Each probe, printed in triplicate, carried up to a 22-nt linker at the 3' end of the microRNA's complement sequence, in addition to an amine group used to couple the probes to coated glass slides. Each probe (20 μM) was dissolved in 2X SSC + 0.0035% SDS and spotted in triplicate on Schott Nexterion® Slide E-coated microarray slides using a Genomic Solutions® BioRobotics MicroGrid II according to the MicroGrid manufacturer's directions. Fifty-four negative control probes were designed using the sense sequences of different microRNAs. Two groups of positive control probes were designed to hybridize to the microarray: (i) synthetic small RNAs were spiked into the RNA before labeling to verify labeling efficiency; and (ii) probes for abundant small RNA (e.g., small nuclear RNAs (U43, U49, U24, Z30, U6, U48, U44), 5.8s and 5s ribosomal RNA) were spotted on the array to verify RNA quality. The slides were blocked in a solution containing 50 mM ethanolamine, 1 M Tris (pH 9.0) and 0.1% SDS for 20 min at 50°C, then thoroughly rinsed with water and spun dry.
4. Cy-dye labeling of miRNA for microarray
3.5 μg of total RNA were labeled by ligation (Thomson et al, Nature Methods 2004, 1:47-53) of an RNA-linker, p-rCrU-Cy/dye (Dharmacon, Lafayette, CO), to the 3'-end with Cy3 or Cy5. The labeling reaction contained total RNA, spikes (0.1-20 fmoles), 300 ng RNA-linker-dye, 15% DMSO, 1x ligase buffer and 20 units of T4 RNA ligase (NEB) and proceeded at 4°C for 1 h, followed by 1 h at 37°C. The labeled RNA was mixed with 3x hybridization buffer (Ambion), heated to 95°C for 3 min and then added on top of the miR array. Slides were hybridized 12-16 h at 42°C, followed by two washes at room temperature with 1x SSC and 0.2% SDS and a final wash with 0.1x SSC.
Arrays were scanned at 0.01 mm resolution and images read using SpotReader software (Niles Scientific, Portola Valley, CA).
5. Array data normalization
The initial data set consisted of signals measured for multiple probes for every sample. For the analysis, signals were used only for probes that were designed to measure the expression levels of known or validated human microRNAs.
Triplicate spots were combined into one signal by taking the logarithmic mean of the reliable spots. All data was log-transformed and the analysis was performed in log-space. A reference data vector for normalization, R, was calculated by taking the mean expression level for each probe in two representative samples, one from each tumor type.
For each sample k with data vector $S^k$, a 2nd degree polynomial $F^k$ was found so as to provide the best fit between the sample data and the reference data, such that $R \approx F^k(S^k)$. Remote data points ("outliers") were not used for fitting the polynomials $F$. For each probe in the sample (element $S_i^k$ in the vector $S^k$), the normalized value (in log-space) $M_i^k$ is calculated from the initial value $S_i^k$ by transforming it with the polynomial function $F^k$, so that $M_i^k = F^k(S_i^k)$. Statistical analysis is performed in log-space. For presentation and calculation of fold-change, data is translated back to linear-space by taking the exponent.
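The polynomial normalization described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: it fits the 2nd degree polynomial by ordinary least squares over all probes and does not implement the outlier exclusion mentioned in the text. The function names and the toy data are hypothetical.

```python
# Minimal sketch: normalize one sample to a reference vector with a
# least-squares quadratic fit, all values in log-space.
# Assumption: no outlier removal; every probe contributes to the fit.


def fit_quadratic(x, y):
    """Least-squares fit y ~ a*x^2 + b*x + c; returns (a, b, c)."""
    # Normal equations (X^T X) beta = X^T y with design rows [x^2, x, 1].
    rows = [[xi * xi, xi, 1.0] for xi in x]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    b = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, 3):
            factor = A[r][col] / A[col][col]
            for c in range(col, 3):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    beta = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        tail = sum(A[i][j] * beta[j] for j in range(i + 1, 3))
        beta[i] = (b[i] - tail) / A[i][i]
    return tuple(beta)


def normalize_sample(sample_log, reference_log):
    """Map each log-space signal through the fitted polynomial."""
    a, b, c = fit_quadratic(sample_log, reference_log)
    return [a * s * s + b * s + c for s in sample_log]


sample = [6.0, 7.5, 9.0, 10.5, 12.0, 13.5]     # hypothetical log signals
reference = [6.4, 7.7, 9.1, 10.4, 11.8, 13.1]  # hypothetical reference vector
print(normalize_sample(sample, reference))
```

Fitting in log-space and only exponentiating for presentation mirrors the order of operations stated above.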
6. Statistical analysis
Measurements of the expression of miRs were log-transformed before all further analysis. Normalization of samples was performed by calculating a median reference vector. For each sample, the best fit to this reference vector was calculated using a 2nd degree polynomial.
Patients were divided dichotomously into good and bad prognosis groups (disease-free at 24 months or not). Differential expression between the groups was tested for each microRNA using the Wilcoxon-Mann-Whitney rank-sum test. MicroRNAs were considered differential if p<0.05, the fold-change between groups was at least 1.5, and the median expression in at least one group was above 300. Multiple hypothesis adjustment was performed by the Benjamini-Hochberg method with a false discovery rate (FDR) threshold set at 0.1. Survival analysis of the recurrence-free status was performed using the Kaplan-Meier method, and groups were compared using the log-rank test. Bootstrapping was used to overcome potential biases of small data sets. Here the dataset is sampled repeatedly: at each repeat, N patients are sampled by randomly selecting patients from the entire pool with replacement, so that a patient may appear more than once. The more frequently a feature is selected within such repeats, the more likely it is to be real. Thus the stability of the microRNAs' separation of survival patterns was tested by bootstrapping 100 times, and the number of times each microRNA gave a log-rank p-value less than 0.05 was counted. Multivariate stepwise Cox analysis of the microRNA expression, together with stage, lymph node involvement, tumour size and demographic data, was carried out using p<0.05 as the inclusion and exclusion criterion. This process was performed over all patients and within stage III patients alone.
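The selection criteria and the multiple-testing adjustment described above can be sketched in Python as follows. The Benjamini-Hochberg step-up rule is standard; computing the fold-change as the ratio of group medians in linear space is an assumption made here, since the text does not state how fold-change was derived. Function names and example numbers are hypothetical.

```python
# Minimal sketch: per-microRNA filter plus Benjamini-Hochberg adjustment.


def is_differential(p_value, median_good, median_bad,
                    p_cut=0.05, min_fold=1.5, min_expr=300.0):
    """Apply the three criteria; medians are positive linear-space values."""
    high, low = max(median_good, median_bad), min(median_good, median_bad)
    fold_change = high / low  # assumption: ratio of group medians
    return p_value < p_cut and fold_change >= min_fold and high > min_expr


def benjamini_hochberg(p_values, fdr=0.1):
    """Return booleans marking hypotheses rejected at the given FDR."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


p_values = [0.001, 0.04, 0.03, 0.5]  # hypothetical rank-sum p-values
print(benjamini_hochberg(p_values))
print(is_differential(0.01, 900.0, 400.0))
```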
In order to find miRs which would attain a high predictive value (good performance), different expression thresholds were examined. Since the prevalence of disease recurrence in this cohort is similar to the general prevalence in the population, positive predictive value (PPV) was calculated as the fraction of patients with no recurrence in the group of patients predicted to have no recurrence (according to a specific miR expression threshold). Likewise, negative predictive value (NPV) was calculated as the fraction of patients with recurrence of the disease in the group of patients predicted to have recurrence (according to a specific miR expression threshold). Sensitivity and specificity thresholds were also applied.
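The PPV and NPV definitions above can be written out as a short sketch. The rule that expression at or above the threshold predicts no recurrence for a good-prognosis miR (and the reverse for a bad-prognosis miR) is an assumption made for illustration, as are the function and parameter names.

```python
def predictive_values(expression, recurred, threshold, high_is_good=True):
    """Return (PPV, NPV) as defined in the text for one miR and one threshold.

    expression: per-patient expression values
    recurred:   per-patient booleans, True if the disease recurred
    high_is_good: assumption; True if high expression predicts no recurrence
    """
    predicted_no_recurrence = [(e >= threshold) == high_is_good for e in expression]
    in_no_rec = [r for r, p in zip(recurred, predicted_no_recurrence) if p]
    in_rec = [r for r, p in zip(recurred, predicted_no_recurrence) if not p]
    # PPV: fraction without recurrence among those predicted to have no recurrence
    ppv = sum(1 for r in in_no_rec if not r) / len(in_no_rec) if in_no_rec else None
    # NPV: fraction with recurrence among those predicted to have recurrence
    npv = sum(1 for r in in_rec if r) / len(in_rec) if in_rec else None
    return ppv, npv
```

Scanning a range of candidate thresholds with such a function would show which cut-off yields the desired balance of PPV and NPV for a given miR.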
In both t-tests and Kaplan-Meier analyses, only miRs whose normalized median expression was >300 in at least one of the groups compared were considered.
7. Quantitative real-time PCR (qRT-PCR)
qRT-PCR was used to further quantify microRNA expression. RNA was extracted from 29 samples and analysed using 16 microRNA probes, 11 of which had been found to be differential using the microarray and 5 of which were taken as normalisers. All expression readings (CT) were normalised to give equal means for all patients. To compare expression with microarray results, 50-CT was used. In addition, for each microRNA the value from the microarray which created the best separation (as ascertained by log-rank) was translated to 50-CT units (using linear regression) and used to separate the PCR values into good and bad prognosis groups. The groups were then compared using Kaplan-Meier and log-rank.
The expression values obtained by either method for the most significant microRNAs were used to find the best differentiation into good vs. bad prognosis with logistic regression. Classification by microRNA expression level into good or bad prognosis was evaluated by receiver operating characteristic (ROC) curves. The area under the ROC curve (AUC) was calculated for each microRNA. An AUC>0.8 was considered to indicate a good classifier.
Example 2
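The translation of a microarray threshold into 50-CT units can be sketched as below. The sketch assumes paired measurements of the same samples on both platforms and an ordinary least-squares regression of 50-CT on the log-scale microarray value; the direction of the regression and the function names are assumptions, not details given in the text.

```python
def fit_line(x, y):
    """Ordinary least-squares line y ~ slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def translate_threshold(array_log_values, ct_values, array_threshold):
    """Map a microarray separation value onto the 50-CT scale of the PCR assay."""
    inverted_ct = [50 - ct for ct in ct_values]  # higher value = higher expression
    # Assumption: 50-CT is regressed on the microarray value, sample by sample.
    slope, intercept = fit_line(array_log_values, inverted_ct)
    return slope * array_threshold + intercept
```

PCR samples whose 50-CT value falls on either side of the translated threshold can then be assigned to the two prognosis groups and compared by log-rank.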
Specific microRNAs are able to predict the prognosis of renal cancer patients
The statistical analysis of the microarray results and comparison of the median values of miR expression in tumor samples obtained from renal cancer patients having good prognosis (survival above two years following surgery) (46 patients) with the median values of miR expression in tumor samples obtained from patients with bad prognosis (survival below two years following surgery) (9 patients) revealed significant differences in the expression pattern of specific miRs as shown in Table 2 and Figures 1-2. In the group of patients with bad prognosis, the median expression values of hsa-miR-21* (SEQ ID NO: 4) and hsa-miR-22 (SEQ ID NO: 5) were found to be above the median expression values of the patients with good prognosis, whereas the median expression values of hsa-miR-26b (SEQ ID NO: 1), hsa-miR-27a (SEQ ID NO: 2) and hsa-miR-23a (SEQ ID NO: 3) were found to be below the median expression of the patients with good prognosis. Accordingly, relatively high expression values of hsa-miR-21* (SEQ ID NO: 4) and hsa-miR-22 (SEQ ID NO: 5) were demonstrated to be indicative of poor prognosis of renal cancer, whereas relatively high expression values of hsa-miR-26b (SEQ ID NO: 1), hsa-miR-27a (SEQ ID NO: 2) and hsa-miR-23a (SEQ ID NO: 3) were demonstrated to be indicative of good prognosis.
Table 2
Upregulated in good prognosis vs. bad prognosis
Figure imgf000038_0001
Example 3
miR expression patterns in patients with renal cancer correlate with prognosis
The prognosis of groups of patients with renal cancer, stratified according to the expression levels of individual miRs, was compared. For each miR the samples were divided into tertiles according to high (n=19), intermediate (n=18) or low (n=18) expression level of the miR.
Survival was compared between the two groups with high and low miR expression levels. The miRs associated with significant differences (log-rank p-value<0.05) in survival time are presented in Figures 3A-K. hsa-miR-22 (SEQ ID NO: 5), hsa-miR-21* (SEQ ID NO: 4), hsa-miR-21 (SEQ ID NO: 51) and hsa-miR-193b (SEQ ID NO: 25) were down-regulated in samples obtained from patients with survival above two years following surgery. Accordingly, relatively high expression values of these miRs were demonstrated to be indicative of bad prognosis. In contrast, hsa-miR-26b (SEQ ID NO: 1), hsa-miR-23a (SEQ ID NO: 3), hsa-miR-27a (SEQ ID NO: 2), hsa-miR-27b (SEQ ID NO: 45), hsa-miR-345 (SEQ ID NO: 47), hsa-let-7a (SEQ ID NO: 49) and hsa-miR-140-3p (SEQ ID NO: 19) were up-regulated in samples obtained from patients with survival above two years following surgery. Accordingly, relatively high expression values of these miRs were demonstrated to be indicative of good prognosis.
Example 4
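The division into expression tertiles can be illustrated with a brief sketch. Assigning any remainder to the higher groups, so that 55 samples split into 18 low, 18 intermediate and 19 high, is an assumption chosen to match the group sizes quoted above; the function name is hypothetical.

```python
def split_tertiles(expression):
    """Return (low, intermediate, high) lists of sample indices for one miR."""
    order = sorted(range(len(expression)), key=lambda i: expression[i])
    n = len(order)
    low_n = n // 3                 # assumption: remainder goes to the higher groups
    mid_n = (n - low_n) // 2
    low = order[:low_n]
    mid = order[low_n:low_n + mid_n]
    high = order[low_n + mid_n:]
    return low, mid, high
```

Only the low and high groups would then enter the survival comparison, with the intermediate group set aside.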
High expression of selective miRs is characteristic of patients with primary renal cancer
As indicated in Figure 4, the expression of each of hsa-miR-451 (SEQ ID NO: 13), hsa-miR-27a (SEQ ID NO: 2), hsa-miR-26b (SEQ ID NO: 1), hsa-miR-210 (SEQ ID NO: 14), hsa-miR-155 (SEQ ID NO: 15), hsa-miR-455-3p (SEQ ID NO: 16) and hsa-miR-16 (SEQ ID NO: 17) is higher in patients with primary renal cancer as compared to patients with renal metastases to the lung. In contrast, the expression of hsa-miR-10a (SEQ ID NO: 11) and hsa-miR-138 (SEQ ID NO: 12) is higher in patients with renal metastases to the lung as compared to patients with primary cancer.
Figure 5 shows differential expression of miRs in samples obtained from primary renal cancer patients. The y-axis represents patients with good prognosis (survival above 24 months, n=44), and the x-axis represents patients with bad prognosis (survival below 24 months, n=5). The parallel lines describe a fold change between groups of 1.5 in either direction. The expression of each of hsa-miR-143 (SEQ ID NO: 18), hsa-miR-140-3p (SEQ ID NO: 19), hsa-miR-26b (SEQ ID NO: 1), hsa-miR-192 (SEQ ID NO: 21), hsa-miR-194 (SEQ ID NO: 20), hsa-miR-30a* (SEQ ID NO: 22), hsa-miR-204 (SEQ ID NO: 23) and hsa-miR-30e* (SEQ ID NO: 52) is up-regulated in samples obtained from primary renal cancer patients with good prognosis. Accordingly, relatively high expression values of these miRs were demonstrated to be indicative of good prognosis.
In contrast, the expression of hsa-miR-21* (SEQ ID NO: 4), hsa-miR-193b (SEQ ID NO: 25), hsa-miR-199b-5p (SEQ ID NO: 26), hsa-miR-451 (SEQ ID NO: 13), and hsa-miR-373* (SEQ ID NO: 27) is up-regulated in samples obtained from primary renal cancer patients with bad prognosis. Accordingly, relatively high expression values of these miRs were demonstrated to be indicative of bad prognosis.
Example 5
Time to progression analysis
Of the 50 non-metastatic patients identified, 10 patients had bad prognosis (mean time to progression 12 months, range 1-22 months) and 40 had good prognosis (mean follow-up 68 months, range 24-142). Of the good prognosis group, 8 suffered progression later (at mean time 58 months), while the other 32 did not (mean follow-up 70 months). MicroRNA microarray expression values for each patient were transformed into logarithm space and normalized there by polynomial fit. For presentation purposes the values were transformed back to linear space. The differential expression of normalized microRNA expression between the two groups is displayed in Figure 7. Each microRNA is expressed as the median value over all members of the good prognosis group (y-axis) against the median value over all members of the bad prognosis group (x-axis). The FDR=0.1 threshold set the p-value threshold at p=0.006. The six microRNAs from the Sanger dataset which are thus differential (p<0.006, fold change>1.5) and expressed (>300 fluorescence units) in one group are labeled. Their expression levels are listed in Table 3.
The correlation between miR expression and time to progression in renal cancer patients is indicated in Figures 6A-D, which show Kaplan-Meier plots of progression-free survival curves plotted for each of the three expression-level groups. hsa-miR-21* (SEQ ID NO: 4) and hsa-miR-21 (SEQ ID NO: 51) were both associated with a higher hazard of progression when up-regulated. Accordingly, relatively high expression values of these miRs were demonstrated to be indicative of bad prognosis. In contrast, hsa-miR-30e (SEQ ID NO: 54) and hsa-miR-30a* (SEQ ID NO: 22) were both associated with a higher hazard of progression when down-regulated. Accordingly, relatively high expression values of these miRs were demonstrated to be indicative of good prognosis.
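The way an FDR threshold translates into a p-value cut-off can be sketched with the standard Benjamini-Hochberg step-up rule. The p-values in the usage note are invented for illustration and are unrelated to the study data; the function names and the treatment of fold change as symmetric in either direction are likewise assumptions.

```python
def bh_cutoff(p_values, fdr=0.1):
    """Largest p-value accepted by Benjamini-Hochberg at the given FDR, or None."""
    m = len(p_values)
    cutoff = None
    for rank, p in enumerate(sorted(p_values), start=1):
        if p <= rank * fdr / m:
            cutoff = p  # step-up rule: keep the largest p-value that passes
    return cutoff

def is_differential(p, median_good, median_bad, cutoff,
                    min_fold=1.5, min_expression=300):
    """Apply the three criteria from the text to one microRNA (linear-space medians)."""
    if cutoff is None or p > cutoff:
        return False
    fold = max(median_good, median_bad) / min(median_good, median_bad)
    return fold >= min_fold and max(median_good, median_bad) > min_expression
```

For the hypothetical list `[0.001, 0.008, 0.039, 0.041, 0.045, 0.07, 0.09, 0.2, 0.5, 0.9]` and FDR=0.1, the fifth-ranked value 0.045 is the largest that satisfies p <= rank x 0.1 / 10, so all five smallest p-values are accepted even though the third and fourth fail the rule on their own.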
A dot-plot for miR-21* is presented in Figure 8.
The prognostic miRs were then tested by qRT-PCR to determine whether they could serve as prognosticators. The sequences of the RT-PCR primers and probes are presented in Table 4. The best separation by miR-21 was with the log2(expression) value of 16.83, which corresponds to 25.67 in 50-CT units in the PCR experiment. The best separation by miR-21* in the microarray experiment was with the value 10.27, which translated to 18.91 in 50-CT units. The resulting differentiation between the two groups in the PCR experiment is significant (p<0.05) for both microRNAs (Figures 9A-9D).
Table 3
Figure imgf000041_0001
Expression levels of microRNAs which were differential (at level FDR=0.1) between good (n=40) and bad (n=10) prognosis. P-values were calculated using ranksum, and fold-change is between median values.
Table 4
miR name        SEQ ID NO  Forward primer sequence         SEQ ID NO  Probe                    SEQ ID NO
hsa-miR-21*     4          CAACACCAGTCGATGGGCTGT           62         AAAACCGATAGTGAGTCG       79
hsa-miR-21      51         CATTTGGCTAGCTTATCAGACTGATGTTGA  63         AAAACCGATAGTGAGTCG       79
hsa-miR-30a*    22         CCTTTCAGTCGGATGTTTGCAGC         64         AAAACCGATAGTGAGTCG       79
hsa-miR-30e     54         TCATTTGGCTGTAAACATCCTTGACTGGA   65         AAAACCGATAGTGAGTCG       79
hsa-let-7a      49         CAGTCATTTGGGTGAGGTAGTAGGTTGT    66         CCGTTTTTTTTTTTTAACTATAC  80
hsa-miR-21*     4          CAGTCATTTGGCCAACACCAGTCGATGG    67         CCGTTTTTTTTTTTTACAGCCCA  81
hsa-miR-22      5          CAGTCATTTGGCAAGCTGCCAGTTGAAG    68         CCGTTTTTTTTTTTTACAGTTCT  82
hsa-miR-26b     1          CAGTCATTTGGCTTCAAGTAATTCAGGA    69         CCGTTTTTTTTTTTTACCTATCC  83
hsa-miR-21      51         CAGTCATTTGGCTAGCTTATCAGACTGA    70         CCGTTTTTTTTTTTTCAACATCA  84
hsa-miR-140-3p  19         CAGTCATTTGGCTACCACAGGGTAGAAC    71         CCGTTTTTTTTTTTTCCGTGGTT  85
hsa-miR-30e     54         CAGTCATTTGGCTGTAAACATCCTTGAC    72         CCGTTTTTTTTTTTTCTTCCAGT  86
hsa-miR-30a*    22         CAGTCATTTGGCCTTTCAGTCGGATGTT    73         CCGTTTTTTTTTTTTGCTGCAAA  87
hsa-miR-23a     3          CAGTCATTTGGCATCACATTGCCAGGGA    74         CCGTTTTTTTTTTTTGGAAATCC  88
hsa-miR-193b    25         CAGTCATTTGGCAACTGGCCCTCAAAGT    75         CGTTTTTTTTTTTTAGCGGGAC   89
hsa-miR-345     47         CAGTCATTTGGCGCTGACTCCTAGTCCA    76         CGTTTTTTTTTTTTGAGCCCTG   90
hsa-miR-27b     45         CAGTCATTTGGGTTCACAGTGGCTAAGT    77         CGTTTTTTTTTTTTGCAGAACT   91
hsa-miR-27a     2          CAGTCATTTGGGTTCACAGTGGCTAAGT    78         CGTTTTTTTTTTTTGCGGAACT   92
The foregoing description of the specific embodiments so fully reveals the general nature of the invention that others can, by applying current knowledge, readily modify and/or adapt for various applications such specific embodiments without undue experimentation and without departing from the generic concept, and, therefore, such adaptations and modifications should and are intended to be comprehended within the meaning and range of equivalents of the disclosed embodiments. Although the invention has been described in conjunction with specific embodiments thereof, it is evident that many alternatives, modifications and variations will be apparent to those skilled in the art. Accordingly, it is intended to embrace all such alternatives, modifications and variations that fall within the spirit and broad scope of the appended claims.
It should be understood that the detailed description and specific examples, while indicating preferred embodiments of the invention, are given by way of illustration only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.

Claims

1. A method for determining a prognosis for renal cancer in a subject comprising:
(a) obtaining a biological sample from the subject;
(b) determining the expression level of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 1-60 and sequences at least about 80% identical thereto from said sample; and
(c) comparing said expression level to a threshold expression level,
wherein the expression level of any of SEQ ID NOS: 1-60 and sequences at least about 80% identical thereto compared to said threshold expression level is indicative of the prognosis of said subject.
2. The method of claim 1, wherein the nucleic acid sequence is selected from the group consisting of SEQ ID NOS: 1-3, 6-8, 13-23, 30-40, 45-50, 52-58 and sequences at least about 80% identical thereto and wherein an increased expression level of any of said nucleic acid sequence compared to said threshold expression level is indicative of good prognosis of said subject.
3. The method of claim 1, wherein the nucleic acid sequence is selected from the group consisting of SEQ ID NOS: 4, 5, 9-12, 24-29, 41-44, 51, 59-60 and sequences at least about 80% identical thereto and wherein an increased expression level of any of said nucleic acid sequence compared to said threshold expression level is indicative of poor prognosis of said subject.
4. The method of any of claims 1-3, wherein the subject is a human.
5. The method of any of claims 1-4, wherein said method is used to determine a course of treatment for said subject.
6. The method of any of claims 1-5, wherein said biological sample is selected from the group consisting of bodily fluid, a cell line and a tissue sample.
7. The method of claim 6, wherein said bodily fluid is blood.
8. The method of claim 6, wherein said tissue is a fresh, frozen, fixed, wax-embedded or formalin fixed paraffin-embedded (FFPE) tissue.
9. The method of claim 8, wherein said tissue is a renal cancer tissue.
10. The method of any of claims 1-9, wherein the expression levels are determined by a method selected from the group consisting of nucleic acid hybridization, nucleic acid amplification, and a combination thereof.
11. The method of claim 10, wherein the nucleic acid hybridization is performed using a solid-phase nucleic acid biochip array or in situ hybridization.
12. The method of claim 10, wherein the nucleic acid amplification method is real-time PCR.
13. The method of claim 12, wherein the real-time PCR method comprises forward and reverse primers.
14. The method of claim 13, wherein the reverse primer comprises SEQ ID NO: 61.
15. The method of claim 13, wherein the forward primer comprises a sequence selected from the group consisting of SEQ ID NOS: 62-78.
16. The method of claim 13, wherein the real-time PCR method further comprises a probe.
17. The method of claim 16, wherein the probe comprises a sequence complementary to SEQ ID NOS: 1-60, to a fragment thereof or to a sequence at least about 80% identical thereto.
18. The method of claim 16, wherein the probe comprises a sequence selected from the group consisting of SEQ ID NOS: 79-92.
19. A kit for determining a prognosis of a subject with renal cancer, said kit comprising a forward primer comprising a nucleic acid sequence that is complementary to a sequence selected from SEQ ID NO: 1-60, to a fragment thereof or to a sequence at least about 80% identical thereto.
20. The kit of claim 19, wherein said forward primer comprises a sequence selected from the group consisting of SEQ ID NO: 62-78.
21. The kit of claim 19, wherein the kit further comprises a reverse primer.
22. The kit of claim 21, wherein the reverse primer comprises SEQ ID NO: 61.
23. The kit of claim 19, wherein the kit further comprises a probe.
24. The kit of claim 23, wherein the probe comprises a sequence selected from the group consisting of SEQ ID NOS: 79-92.
PCT/IL2010/000806 2009-10-04 2010-10-04 Compositions and methods for prognosis of renal cancer WO2011039757A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US24845809P 2009-10-04 2009-10-04
US61/248,458 2009-10-04

Publications (2)

Publication Number Publication Date
WO2011039757A2 true WO2011039757A2 (en) 2011-04-07
WO2011039757A3 WO2011039757A3 (en) 2011-12-29

Family

ID=43826732

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IL2010/000806 WO2011039757A2 (en) 2009-10-04 2010-10-04 Compositions and methods for prognosis of renal cancer

Country Status (1)

Country Link
WO (1) WO2011039757A2 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009036236A1 (en) * 2007-09-14 2009-03-19 The Ohio State University Research Foundation Mirna expression in human peripheral blood microvesicles and uses thereof

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
GOTTARDO ET AL.: 'Micro-RNA profiling in kidney and bladder cancers.' UROL ONCOL vol. 25, no. 5, 2007, pages 387 - 392 *
JUAN ET AL.: 'Identification of a microRNA panel for clear-cell kidney cancer.' UROLOGY vol. 75, no. 4, 29 December 2009, pages 835 - 841 *
JUNG ET AL.: 'MicroRNA profiling of clear cell renal cell cancer identifies a robust signature to define renal malignancy.' J CELL MOL MED. vol. 13, no. 9B, September 2009, pages 3918 - 3928 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013026684A1 (en) * 2011-08-19 2013-02-28 Febit Holding Gmbh Complex sets of mirnas as non-invasive biomarkers for kidney cancer
EP3135774A3 (en) * 2011-08-19 2017-05-31 Hummingbird Diagnostics GmbH Complex sets of mirnas as non-invasive biomarkers for kidney cancer
EP2702998A1 (en) * 2012-08-31 2014-03-05 IMBA-Institut für Molekulare Biotechnologie GmbH Therapeutic and diagnostic miRNA regulator in kidney disease
WO2014033262A1 (en) * 2012-08-31 2014-03-06 Imba - Institut Für Molekulare Biotechnologie Gmbh Therapeutic and diagnostic mirna regulator in kidney disease
ES2606790A1 (en) * 2015-09-23 2017-03-27 Fundación Para La Investigación Biomédica Del Hospital 12 De Octubre Prognostic method to identify risk of recurrence in kidney cancer patients with clear cell renal cell carcinoma stages i and ii and kit (Machine-translation by Google Translate, not legally binding)
US11401556B2 (en) 2015-09-23 2022-08-02 Fundaciôn Para La Investigaciôn Biomedica Del Hospital 12 De Octubre Prognostic method for determining the risk of relapse in renal cancer patients with a clear cell type of renal carcinoma, stages I and II, and kit for same

Also Published As

Publication number Publication date
WO2011039757A3 (en) 2011-12-29

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 10820011

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase in:

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 10820011

Country of ref document: EP

Kind code of ref document: A2