WO2011024157A1 - Nucleic acid sequences related to cancer

Publication number
WO2011024157A1
Authority
WO
WIPO (PCT)
Prior art keywords
nucleic acid
sequence
expression profile
seq
subject
Prior art date
Application number
PCT/IL2010/000462
Other languages
French (fr)
Inventor
Einat Sitbon
Gila Lithwick Yanai
Eti Meiri
Nir Dromi
Asaf Levy
Original Assignee
Rosetta Genomics Ltd.
Priority date
Filing date
Publication date
Application filed by Rosetta Genomics Ltd.
Publication of WO2011024157A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/11 Antisense
    • C12N2310/113 Antisense targeting other non-coding nucleic acids, e.g. antagomirs
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/14 Type of nucleic acid interfering N.A.
    • C12N2310/141 MicroRNAs, miRNAs
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/112 Disease subtyping, staging or classification
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/136 Screening for pharmacological compounds
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/158 Expression markers
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/178 Oligonucleotides characterized by their use miRNA, siRNA or ncRNA

Definitions

  • The invention relates to microRNA molecules, as well as various nucleic acid molecules relating thereto or derived therefrom.
  • The invention also relates to methods and compositions that can be used for diagnosis of cancer.

BACKGROUND OF THE INVENTION

  • miRNAs are endogenous non-coding small RNAs that negatively regulate gene expression by interfering with the translation of coding messenger RNAs (mRNAs) in a sequence-specific manner, thereby playing a critical role in the control of gene expression during development and tissue homeostasis (Yi et al., Nat Genet 2006;38:356-362). Certain miRNAs have been shown to be deregulated in human cancer, and their specific over- or under-expression has been shown to correlate with particular tumor types (Calin and Croce, Nat Rev Cancer 2006;6:857-866), as well as to predict patient outcome (Yu et al., Cancer Cell 2008;13:48-57).
  • miRNA over-expression results in reduced expression of tumor suppressor genes, while loss of miRNA expression often leads to oncogene activation.
  • Deep sequencing is a method of high-throughput DNA sequencing using a novel, highly parallel sequencing-by-synthesis approach, which allows rapid sequencing of millions of bases, and even whole genomes.
  • The technique can be used to sequence any double-stranded DNA and can be used for de novo whole genome sequencing, re-sequencing of whole genomes and target DNA regions, metagenomics and RNA analysis. It is based on an emulsion-based method, in which short adaptors are ligated onto the ends of sequence fragments, which are then immobilized onto beads. The beads are then emulsified with the amplification reagents in a water-in-oil mixture, and are clonally amplified within the emulsion droplets. Sequencing-by-synthesis is then performed by pyrosequencing in wells on a fibreoptic slide.
  • Deep sequencing methods have been widely used in recent years. These high-throughput and highly sensitive sequencing methods include Roche Applied Sciences (454) GS, Illumina's Solexa 1G sequencer, and Applied Biosystem's SOLiD system. Deep sequencing can be used for the discovery of novel miRNA species and other small RNAs that are missed by traditional sequencing of small RNA libraries. Human microRNAs were previously identified using deep sequencing (Bar, M. et al. (2008) Stem Cells).
  • Deep sequencing may be used to identify miRNAs and their differential expression in tissue samples, and may thus aid in distinguishing between primary tumors and cancer metastasis. Being able to distinguish between primary tumors and cancer metastasis, as well as distinguishing between metastases of different origins, has practical importance for choice of therapy. Diagnosis of specific tumors is also of great importance when choosing appropriate treatment. miR analogs, as well as anti-sense sequences of miRs, were recently shown to be useful as therapeutic agents in several cancers. The miRNA content of solid human tumors has only been partially explored using these methods, and yet-unknown miRNAs and other small RNAs may be part of the tumor transcriptome. Thus, there exists a need to identify nucleic acid sequences which will aid in cancer diagnosis.
  • the present invention is based in part on deep sequencing analysis of miRNAs from tumor specimens of different types.
  • A computational approach was used to identify known miRNA sequences, miRNA sequence variants (isomiRs), and novel small RNA species in these tumors.
  • normal and tumor samples from various tissue types were hybridized to a miRNA-microarray containing the novel miRNAs and known miRNAs.
  • Some of the novel miRNAs are abundantly expressed in different types of tumors and others are expressed differently between tumor and non-tumor samples, between different tumor stages or between different types of tumors.
  • Using RT-PCR as a third platform, the expression of several novel small RNAs was confirmed in normal human serum. These new cancer miRNA candidates can potentially be used as diagnostic biomarkers or therapeutic targets in different types of cancer.
  • the present invention provides nucleic acid sequences related to cancer, and methods and compositions that can be used for diagnosis of cancer.
  • the present invention provides an isolated nucleic acid comprising a sequence selected from the group consisting of:
  • nucleic acid is 16-26 nucleotides in length.
  • the present invention provides an isolated nucleic acid comprising a sequence selected from the group consisting of:
  • nucleic acid is 16-26 nucleotides in length.
  • the present invention provides an isolated nucleic acid comprising a sequence selected from the group consisting of:
  • nucleic acid is 50-150 nucleotides in length.
  • the present invention provides an isolated nucleic acid comprising a sequence selected from the group consisting of: (a) SEQ ID NOS: 192, 232-242, 176-187, 189-191, 193-196, 198-231, 243-267, 269-326, 328-330, 332-340, 342, 343, 345-350, 352-361, 363-369, 371-387, 389-413, 415-432, 434-438, 440-449 and 496-514;
  • nucleic acid is 50-150 nucleotides in length.
  • the present invention provides an isolated nucleic acid comprising an endogenous human siRNA.
  • the endogenous human siRNA comprises a sequence selected from the group consisting of:
  • nucleic acid is 16-26 nucleotides in length.
  • the present invention provides an isolated nucleic acid comprising an endogenous human siRNA.
  • the endogenous human siRNA comprises a sequence selected from the group consisting of:
  • nucleic acid is 16-26 nucleotides in length.
  • the present invention provides an isolated nucleic acid comprising a sequence selected from the group consisting of:
  • the isolated nucleic acid of the invention is a modified oligonucleotide.
  • the invention provides a composition comprising an isolated nucleic acid of the invention.
  • the composition is suitable for diagnostic applications.
  • the composition is suitable for therapeutic applications.
  • the composition further comprises a pharmaceutically acceptable carrier.
  • the composition is a marker or modulator of cancer.
  • the invention provides a recombinant expression vector comprising an isolated nucleic acid of the invention.
  • the invention provides a probe comprising an isolated nucleic acid of the invention.
  • the invention provides a biochip comprising the probe of the invention.
  • the invention provides a host cell comprising an isolated nucleic acid of the invention.
  • the invention provides a method for diagnosing a cancer in a subject comprising:
  • the cancer is selected from the group
  • the invention provides a method of diagnosing an increased risk of colon cancer in a subject comprising:
  • a high expression level of said nucleic acid sequence is indicative of an increased risk of colon cancer in a subject.
  • the invention provides a method of diagnosing an increased risk of colon cancer in a subject comprising:
  • the invention provides a method of diagnosing colon cancer in a subject comprising:
  • the invention provides a method of diagnosing lung cancer in a subject comprising:
  • the invention provides a method of diagnosing bladder cancer in a subject comprising:
  • the invention provides a method of diagnosing bladder cancer in a subject comprising:
  • the invention provides a method of diagnosing liver cancer in a subject comprising:
  • the invention provides a method of diagnosing liver cancer in a subject comprising:
  • the invention provides a method of diagnosing an endometrial metastasis in a subject comprising:
  • the invention provides a method of diagnosing an endometrial metastasis in a subject comprising:
  • the invention provides a method of diagnosing kidney cancer in a subject comprising:
  • the invention provides a method of diagnosing kidney cancer in a subject comprising:
  • the invention provides a method of diagnosing breast cancer in a subject comprising:
  • the invention provides a method to distinguish between a primary lung tumor and a metastasis to the lung, said method comprising:
  • the origin of the metastasis to the lung is selected from the group consisting of endometrium, kidney, larynx, melanocyte and salivary gland.
  • the subject of the invention is a human.
  • the method of the invention is used to determine a course of treatment for said subject.
  • the biological sample obtained in the method of the invention is selected from the group consisting of bodily fluid, a cell line and a tissue sample.
  • the bodily fluid is selected from the group consisting of whole blood and serum.
  • the tissue is a fresh, frozen, fixed, wax-embedded or formalin fixed paraffin-embedded (FFPE) tissue.
  • the expression levels are determined by a method selected from the group consisting of nucleic acid hybridization, nucleic acid amplification, and a combination thereof.
  • the nucleic acid hybridization is performed using a solid-phase nucleic acid biochip array or in situ hybridization.
  • the nucleic acid amplification method is real-time PCR.
  • the real-time PCR method comprises forward and reverse primers.
  • the forward primer comprises a sequence selected from the group consisting of SEQ ID NOS: 466, 467, 469, a fragment thereof and a sequence at least about 80% identical thereto.
  • the real-time PCR method further comprises a probe.
  • the probe comprises a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS: 31, 4, 7, 11, 16, 17, 21, 22, 23, 26, 30, 33-35, 37, 39, 46, 47-49, 51, 52, 53, 56, 58, 60, 63, 64, 66, 68, 71, 72, 74, 76-78, 83, 86-88, 90, 96, 98, 100, 101, 106, 110-114, 116, 117, 119-121, 127, 129, 130, 132-136, 138, 141, 144, 145, 147, 148, 152, 153, 157, 159-162, 165, 167, 169-174, a fragment thereof and a sequence at least about 80% identical thereto.
  • the invention provides a kit for diagnosing a cancer in a subject, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS: 31, 4, 7, 11, 16, 17, 21, 22, 23, 26, 30, 33-35, 37, 39, 46, 47-49, 51, 52, 53, 56, 58, 60, 63, 64, 66, 68, 71, 72, 74, 76-78, 83, 86-88, 90, 96, 98, 100, 101, 106, 110-114, 116, 117, 119-121, 127, 129, 130, 132-136, 138, 141, 144, 145, 147, 148, 152, 153, 157, 159-162, 165, 167, 169-174, a fragment thereof and a sequence at least about 80% identical thereto.
  • the invention provides a kit for diagnosing an increased risk of colon cancer in a subject, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS: 21, 68, 92, 111 and 174, a fragment thereof and a sequence at least about 80% identical thereto.
  • the invention provides a kit for diagnosing colon cancer in a subject, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS:
  • the invention provides a kit for diagnosing lung cancer in a subject, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS:
  • the invention provides a kit for diagnosing bladder cancer in a subject, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ
  • the invention provides a kit for diagnosing liver cancer in a subject, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS:
  • the invention provides a kit for diagnosing a subject with endometrial metastasis, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS: 16, 22, 25, 29, 31, 37, 39, 53, 57, 64, 68, 72, 76, 77, 78, 84, 113, 119, 121, 127, 130, 132, 133, 136, 153, 161, 170 and 171, a fragment thereof and a sequence at least about 80% identical thereto.
  • the invention provides a kit for diagnosing kidney cancer in a subject, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ
  • the invention provides a kit for diagnosing breast cancer in a subject, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS:
  • the invention provides a kit for distinguishing between a primary lung tumor and a metastasis to the lung, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS: 31, 92, 1, 2, 6, 7, 10, 11, 13, 17, 19-24, 28-30, 33,
  • Figure 1 shows differential expression of miRs (in log2(fluorescence units)), comparing the median values of each miR in breast primary tumor (y-axis) with breast metastases into lymph nodes (x-axis). Median normalized fluorescence for each miRNA (black crosses) indicates expression levels as measured by microarray. Squares represent differentially expressed miRs. The parallel lines describe a fold change between groups of 1.5 in either direction.
  • Figure 2 shows differential expression of miRs (in log2(fluorescence units)), comparing the median values of each miR in colon tumors (y-axis) with the corresponding median for their adjacent tissues (x-axis). Median normalized fluorescence for each miRNA (black crosses) indicates expression levels as measured by microarray. Squares represent differentially expressed miRs. The parallel lines describe a fold change between groups of 1.5 in either direction.
  • Figure 3 shows differential expression of miRs (in log2(fluorescence units)), comparing the median values of each miR in lung tumors (y-axis) with the corresponding median for other tumors from the following tissues: bile duct, bladder, breast, colon, kidney, liver, lung, ovary, pancreas, and prostate (x-axis). Median normalized fluorescence for each miRNA indicates expression levels as measured by microarray. Squares represent differentially expressed miRs. The parallel lines describe a fold change between groups of 1.5 in either direction.
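The fold-change lines in Figures 1-3 can be illustrated with a minimal Python sketch. Because the signals are in log2 units, a 1.5-fold change in either direction corresponds to a difference of log2(1.5) between group medians. The function name, the miR labels and the sample values below are hypothetical, and the sketch applies only the fold-change criterion; any statistical test used to select the squares in the figures is not described here and is not reproduced.

```python
import math
from statistics import median

def differential_mirs(group_a, group_b, fold=1.5):
    """Flag miRs whose median log2 signals differ by at least log2(fold).

    group_a and group_b map a miR name to a list of log2 fluorescence
    values, one per sample. Returns {miR: median difference (a - b)}.
    """
    cutoff = math.log2(fold)  # a 1.5-fold change is about 0.585 log2 units
    flagged = {}
    for mir in group_a.keys() & group_b.keys():
        diff = median(group_a[mir]) - median(group_b[mir])
        if abs(diff) >= cutoff:  # beyond either parallel line
            flagged[mir] = diff
    return flagged

# Hypothetical values for illustration only (not data from the figures).
tumor = {"miR-X": [10.0, 10.4, 10.2], "miR-Y": [8.0, 8.1, 7.9]}
adjacent = {"miR-X": [9.0, 9.1, 8.9], "miR-Y": [8.0, 8.2, 7.8]}
result = differential_mirs(tumor, adjacent)
```

A positive difference means the first group has the higher median signal; a negative difference means the second group does.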
  • Figure 4 shows expression in RT-PCR of novel miRNAs and small RNAs in human serum. RNA was measured in sera of 19 normal humans and in a negative control not containing RNA. Shown is the median of expression signals (y-axis, in units of 42-Ct) for each miR in all tested samples. Black bars show expression in experimental samples and white bars show expression in negative controls.
  • the present invention extends the current knowledge of the tumor small RNA transcriptome and provides novel candidates for molecular biomarkers and drug targets.
  • RNA extracted from tumors in addition to careful computational analysis and followed by verification experiments can identify yet unknown sequences such as the new miRNAs, miRNA-offset RNAs (MORs), Y-RNA derived sequences and endogenous siRNAs presented in this analysis.
  • The identification of such tumor-specific small RNAs could lead to the development of new therapeutic targets, which may be utilized as a treatment more specific than the set of tools currently available.
  • the present invention provides methods and compositions for diagnosis of cancer and cancer metastasis. Other aspects of the invention will become apparent to the skilled artisan by the following description of the invention.
  • For numeric ranges recited herein, each intervening number with the same degree of precision is explicitly contemplated.
  • For example, for the range 6-9, the numbers 7 and 8 are contemplated in addition to 6 and 9, and for the range 6.0-7.0, the numbers 6.0, 6.1, 6.2, 6.3, 6.4, 6.5, 6.6, 6.7, 6.8, 6.9, and 7.0 are explicitly contemplated.
  • “Attached” or “immobilized”, as used herein to refer to a probe and a solid support, may mean that the binding between the probe and the solid support is sufficient to be stable under conditions of binding, washing, analysis, and removal.
  • The binding may be covalent or non-covalent. Covalent bonds may be formed directly between the probe and the solid support or may be formed by a cross linker or by inclusion of a specific reactive group on either the solid support or the probe or both molecules.
  • Non-covalent binding may be one or more of electrostatic, hydrophilic, and hydrophobic interactions.
  • Included in non-covalent binding is the covalent attachment of a molecule, such as streptavidin, to the support and the non-covalent binding of a biotinylated probe to the streptavidin. Immobilization may also involve a combination of covalent and non-covalent interactions.
  • “Biological sample”, as used herein, means a sample of biological tissue or fluid that comprises nucleic acids. Such samples include, but are not limited to, tissue or fluid isolated from subjects. Biological samples may also include sections of tissues such as biopsy and autopsy samples, FFPE samples, frozen sections taken for histological purposes, blood, plasma, serum, sputum, stool, tears, mucus, hair, and skin. Biological samples also include explants and primary and/or transformed cell cultures derived from animal or patient tissues.
  • Biological samples may also be blood, a blood fraction, urine, effusions, ascitic fluid, saliva, cerebrospinal fluid, cervical secretions, vaginal secretions, endometrial secretions, gastrointestinal secretions, bronchial secretions, sputum, cell line, tissue sample, cellular content of fine needle aspiration (FNA) or secretions from the breast.
  • A biological sample may be provided by removing a sample of cells from an animal, but can also be accomplished by using previously isolated cells (e.g., isolated by another person, at another time, and/or for another purpose), or by performing the methods described herein in vivo.
  • Archival tissues, such as those having treatment or outcome history, may also be used.
  • cancer is meant to include all types of cancerous growths or oncogenic processes, metastatic tissues or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness.
  • Cancers include but are not limited to solid tumors and leukemias, including: apudoma, choristoma, branchioma, malignant carcinoid syndrome, carcinoid heart disease, carcinoma (e.g., Walker, basal cell, basosquamous, Brown-Pearce, ductal, Ehrlich tumor, small cell lung, non-small cell lung (e.g., lung squamous cell carcinoma, lung adenocarcinoma and lung undifferentiated large cell carcinoma), oat cell, papillary, bronchiolar, bronchogenic, squamous cell, and transitional cell), histiocytic disorders, leukemia (e.g., B cell, mixed cell, null cell, T cell, T-cell chronic,
  • myxosarcoma, ovarian carcinoma, rhabdomyosarcoma, sarcoma (e.g., Ewing, experimental, Kaposi, and mast cell), neurofibromatosis, and cervical dysplasia, and other conditions in which cells have become immortalized or transformed.
  • cancer prognosis includes the forecast or prediction of any one or more of the following: duration of survival of a patient susceptible to or diagnosed with a cancer, duration of recurrence-free survival, duration of progression-free survival of a patient susceptible to or diagnosed with a cancer, response rate in a group of patients susceptible to or diagnosed with a cancer, duration of response in a patient or a group of patients susceptible to or diagnosed with a cancer.
  • prognostic for cancer means providing a forecast or prediction of the probable course or outcome of the cancer.
  • prognostic for cancer comprises providing the forecast or prediction of (prognostic for) any one or more of the following: duration of survival of a patient susceptible to or diagnosed with a cancer, duration of recurrence-free survival, duration of progression-free survival of a patient susceptible to or diagnosed with a cancer, response rate in a group of patients susceptible to or diagnosed with a cancer, and duration of response in a patient or a group of patients susceptible to or diagnosed with a cancer.
  • a drug used to treat a disease especially cancer.
  • the drugs typically target rapidly dividing cells, such as cancer cells.
  • “Complement” or “complementary” as used herein means Watson-Crick (e.g., A-T/U and C-G) or Hoogsteen base pairing between nucleotides or nucleotide analogs of nucleic acid molecules.
  • a full complement or fully complementary may mean 100% complementary base pairing between nucleotides or nucleotide analogs of nucleic acid molecules.
  • The complementary sequence has a reverse orientation (5'-3').
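As an illustration of full Watson-Crick complementarity with reverse orientation, the following Python sketch computes a reverse complement in the RNA alphabet and checks whether two strands are fully complementary. The function names and the example sequences are hypothetical; Hoogsteen pairing and nucleotide analogs are outside the scope of this sketch.

```python
# Watson-Crick partners, written in the RNA alphabet.
# An input T is treated like U, so it pairs with A.
_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G", "T": "A"}

def reverse_complement_rna(seq):
    """Return the fully complementary strand, read 5' to 3'."""
    return "".join(_PAIRS[base] for base in reversed(seq.upper()))

def is_fully_complementary(strand, candidate):
    """True if candidate pairs with every base of strand (100% complementary)."""
    normalized = candidate.upper().replace("T", "U")
    return reverse_complement_rna(strand) == normalized

# Hypothetical 8-nucleotide example, not a sequence from the listing.
probe = reverse_complement_rna("UGAGGUAG")
```

Here `probe` is the string a fully complementary RNA probe would have when written 5' to 3'; a DNA probe with the same pairing would pass `is_fully_complementary` as well because T is normalized to U.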
  • CT signals represent the first cycle of PCR where amplification crosses a threshold (cycle threshold) of fluorescence. Accordingly, low values of CT represent high abundance or expression levels of the microRNA.
  • CT remains inversed from the expression level.
  • The PCR CT signal may be normalized and then inverted such that low normalized-inverted CT represents low abundance or expression levels of the microRNA.
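A minimal Python sketch of the normalize-then-invert step follows. The normalization scheme (subtracting the mean CT of a set of reference signals) and the offset constant are assumptions made for illustration; the text above does not specify either, and the names and values are hypothetical.

```python
from statistics import mean

def normalize_and_invert(ct_values, reference_cts, offset=50.0):
    """Convert raw CT values into normalized-inverted signals.

    ct_values: {name: raw CT}. reference_cts: CT values of assumed
    reference signals measured in the same sample. offset is an
    arbitrary constant chosen so that results stay positive.
    """
    shift = mean(reference_cts)  # normalization (assumed scheme)
    # Inversion: subtracting from a constant makes a higher value
    # correspond to higher abundance.
    return {name: offset - (ct - shift) for name, ct in ct_values.items()}

# Hypothetical CT values for illustration only.
raw = {"miR-A": 24.0, "miR-B": 31.0}
scores = normalize_and_invert(raw, reference_cts=[20.0, 22.0])
```

In this example miR-A crosses the threshold earlier (lower raw CT) than miR-B, so it receives the higher normalized-inverted signal.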
  • Detection means detecting the presence of a component in a sample. Detection also means detecting the absence of a component. Detection also means measuring the level of a component, either quantitatively or qualitatively.
  • “Differential expression” may mean qualitative or quantitative differences in the temporal and/or cellular gene expression patterns within and among cells and tissue.
  • a differentially expressed gene can qualitatively have its expression altered, including an activation or inactivation, in, e.g., normal versus disease tissue. Genes may be turned on or turned off in a particular state relative to another state, thus permitting comparison of two or more states. A qualitatively regulated gene will exhibit an expression pattern within a state or cell type that may be detectable by standard techniques. Some genes will be expressed in one state or cell type, but not in both.
  • The difference in expression may be quantitative, e.g., in that expression is modulated, up-regulated, resulting in an increased amount of transcript, or down-regulated, resulting in a decreased amount of transcript.
  • The degree to which expression differs need only be large enough to quantify via standard characterization techniques such as expression arrays, quantitative reverse transcriptase PCR, northern analysis, and RNase protection.
  • “Expression profile”, as used herein, may mean a genomic expression profile, e.g., an expression profile of microRNAs. Profiles may be generated by any convenient means for determining a level of a nucleic acid sequence, e.g., quantitative hybridization of microRNA, labeled microRNA, amplified microRNA, cRNA, etc., quantitative PCR, ELISA for quantification, and the like, and allow the analysis of differential gene expression between two samples. A subject or patient tumor sample, e.g., cells or collections thereof, e.g., tissues, is assayed. Samples are collected by any convenient method, as known in the art.
  • Nucleic acid sequences of interest are nucleic acid sequences that are found to be predictive, including the nucleic acid sequences provided above, where the expression profile may include expression data for 5, 10, 20, 25, 50, 100 or more, including all of the listed nucleic acid sequences.
  • expression profile may also mean measuring the abundance of the nucleic acid sequences in the measured samples.
  • “Expression ratio” refers to relative expression levels of two or more nucleic acids as determined by detecting the relative expression levels of the corresponding nucleic acids in a biological sample.
  • “Hairpin”, as used herein, refers to an area where single-stranded DNA or RNA has folded back on itself and nucleotides from the two strands have base paired, so that the resulting structure appears as a hairpin structure.
  • the hairpin may comprise a first and a second nucleic acid sequence that are substantially complementary.
  • the first and second nucleic acid sequence may be from 37-50 nucleotides.
  • the first and second nucleic acid sequence may be separated by a third sequence of from 8-12 nucleotides.
  • The hairpin structure may have a free energy less than -25 Kcal/mole as calculated by the Vienna algorithm with default parameters, as described in Hofacker et al., Monatshefte f. Chemie 125: 167-188 (1994), the contents of which are incorporated herein.
  • the hairpin may comprise a terminal loop of 4-20, 8-12 or 10 nucleotides.
  • “Gene” may be a natural (e.g., genomic) or synthetic gene comprising transcriptional and/or translational regulatory sequences and/or a coding region and/or non-translated sequences (e.g., introns, 5'- and 3'-untranslated sequences).
  • the coding region of a gene may be a nucleotide sequence coding for an amino acid sequence or a functional RNA, such as tRNA, rRNA, catalytic RNA, siRNA, miRNA or antisense RNA.
  • A gene may also be an mRNA or cDNA corresponding to the coding regions (e.g., exons and miRNA) optionally comprising 5'- or 3'-untranslated sequences linked thereto.
  • A gene may also be an amplified nucleic acid molecule produced in vitro comprising all or a part of the coding region and/or 5'- or 3'-untranslated sequences linked thereto.
  • Identity as used herein in the context of two or more nucleic acids or polypeptide sequences, may mean that the sequences have a specified percentage of residues that are the same over a specified region. The percentage may be calculated by optimally aligning the two sequences, comparing the two sequences over the specified region, determining the number of positions at which the identical residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the specified region, and multiplying the result by 100 to yield the percentage of sequence identity.
  • the residues of the single sequence are included in the denominator but not the numerator of the calculation.
  • thymine (T) and uracil (U) may be considered equivalent.
  • Identity may be determined manually or by using a computer sequence algorithm such as BLAST or BLAST 2.0.
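By way of non-limiting illustration, the percentage calculation described above may be sketched as follows. The sketch assumes the two sequences have already been optimally aligned and padded with "-" to equal length; the function name `percent_identity` is hypothetical and the alignment step itself (e.g., by BLAST) is not shown.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned, equal-length sequences.

    '-' marks a gap. Gapped positions count in the denominator but never
    in the numerator. T and U are treated as equivalent.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")

    def normalize(seq: str) -> str:
        # Treat uracil and thymine as the same residue.
        return seq.upper().replace("U", "T")

    a, b = normalize(aligned_a), normalize(aligned_b)
    matched = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matched / len(a)
```

For example, `percent_identity("ACGU", "ACGT")` treats the final U and T as a match, and an overhang of two residues in `percent_identity("ACGT--", "ACGTAA")` lowers the score because those positions enter only the denominator.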
  • increased risk of cancer may mean that the probability that a subject will develop cancer in the future is higher than that of a control subject.
  • Inhibit may mean prevent, suppress, repress, reduce or eliminate.
  • Label may mean a composition detectable by spectroscopic, photochemical, biochemical, immunochemical, chemical, or other physical means.
  • useful labels include 32P, fluorescent dyes, electron-dense reagents, enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and other entities which can be made detectable.
  • a label may be incorporated into nucleic acids and proteins at any position.
  • Logistic regression is part of a category of statistical models called generalized linear models. Logistic regression allows one to predict a discrete outcome, such as group membership, from a set of variables that may be continuous, discrete, dichotomous, or a mix of any of these. The dependent or response variable is dichotomous, for example, one of two possible types of cancer. Logistic regression models the natural log of the odds ratio, i.e., the ratio of the probability of belonging to the first group (P) over the probability of belonging to the second group (1-P), as a linear combination of the different expression levels (in log-space) and of other explaining variables.
  • the logistic regression output can be used as a classifier by prescribing that a case or sample will be classified into the first type if P is greater than 0.5 or 50%.
  • the calculated probability P can be used as a variable in other contexts such as a 1D or 2D threshold classifier.
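By way of non-limiting illustration, the logistic regression classifier described above may be sketched as follows. The weights and intercept are hypothetical placeholders that would in practice be fitted to training data; the names `logistic_probability` and `classify` are likewise assumptions made for this sketch.

```python
import math

def logistic_probability(log_expression, weights, intercept):
    """Probability P of belonging to the first group.

    The log-odds ln(P / (1 - P)) is modeled as a linear combination of
    the (log-space) expression levels.
    """
    log_odds = intercept + sum(w * x for w, x in zip(weights, log_expression))
    return 1.0 / (1.0 + math.exp(-log_odds))

def classify(p, cutoff=0.5):
    """Assign the first type if P is greater than the cutoff, else the second."""
    return "first" if p > cutoff else "second"

# Hypothetical example: two expression levels and made-up coefficients.
p = logistic_probability([2.0, 1.0], weights=[1.0, -0.5], intercept=0.0)
label = classify(p)
```

The same value of P could instead be passed on as an input variable to a threshold classifier rather than being cut at 0.5.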
  • Metastasis means the process by which cancer spreads from the place at which it first arose as a primary tumor (origin) to other locations in the body.
  • the metastatic progression of a primary tumor reflects multiple stages, including dissociation from neighboring primary tumor cells, survival in the circulation, and growth in a secondary location.
  • the name of a specific metastasis refers to its origin.
  • miRNA or “miR”, as used herein, may mean a non-coding RNA between 18 and 25 nucleobases in length, which is the product of cleavage of a pre-miRNA by the enzyme Dicer. Examples of mature miRNAs are found in the miRNA database known as Sanger miRBase (release 10).
  • miRNA precursor may mean a transcript that originates from a genomic DNA and that comprises a non-coding, structured RNA comprising one or more miRNA sequences.
  • a miRNA precursor is a pre-miRNA.
  • a miRNA precursor is a pri-miRNA.
  • Mismatch means a nucleobase of a first nucleic acid that is not capable of pairing with a nucleobase at a corresponding position of a second nucleic acid.
  • Modified oligonucleotide as used herein means an oligonucleotide having one or more modifications relative to a naturally occurring terminus, sugar, nucleobase, and/or internucleoside linkage.
  • the modified oligonucleotide is a miRNA or siRNA comprising a modification (e.g. labeled).
  • the modified oligonucleotide is complementary to a miRNA or siRNA.
  • Nucleic acid or “oligonucleotide” or “polynucleotide”, as used herein, may mean at least two nucleotides covalently linked together.
  • the depiction of a single strand also defines the sequence of the complementary strand.
  • a nucleic acid also encompasses the complementary strand of a depicted single strand.
  • Many variants of a nucleic acid may be used for the same purpose as a given nucleic acid.
  • a nucleic acid also encompasses substantially identical nucleic acids and complements thereof.
  • a single strand provides a probe that may hybridize to a target sequence under stringent hybridization conditions.
  • nucleic acid also encompasses a probe that hybridizes under stringent hybridization conditions.
  • Nucleic acids may be single-stranded or double-stranded, or may contain portions of both double-stranded and single-stranded sequence.
  • the nucleic acid may be DNA, both genomic and cDNA, RNA, or a hybrid, where the nucleic acid may contain combinations of deoxyribo- and ribo-nucleotides, and combinations of bases including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine and isoguanine.
  • Nucleic acids may be obtained by chemical synthesis methods or by recombinant methods.
  • a nucleic acid will generally contain phosphodiester bonds, although nucleic acid analogs may be included that may have at least one different linkage, e.g., phosphoramidate, phosphorothioate, phosphorodithioate, or O-methylphosphoroamidite linkages and peptide nucleic acid backbones and linkages.
  • Other analog nucleic acids include those with positive backbones, non-ionic backbones, and non-ribose backbones, including those described in U.S. Pat. Nos. 5,235,033 and 5,034,506, which are incorporated by reference.
  • Nucleic acids containing one or more non-naturally occurring or modified nucleotides are also included within one definition of nucleic acids.
  • the modified nucleotide analog may be located, for example, at the 5'-end and/or the 3'-end of the nucleic acid molecule.
  • Representative examples of nucleotide analogs may be selected from sugar- or backbone-modified ribonucleotides.
  • nucleobase-modified ribonucleotides, i.e., ribonucleotides containing a non-naturally occurring nucleobase instead of a naturally occurring nucleobase, are also suitable, such as uridines or cytidines modified at the 5-position, e.g., 5-(2-amino)propyl uridine, 5-bromo uridine; adenosines and guanosines modified at the 8-position, e.g., 8-bromo guanosine; deaza nucleotides, e.g., 7-deaza-adenosine; and O- and N-alkylated nucleotides, e.g., N6-methyl adenosine.
  • the 2'-OH-group may be replaced by a group selected from H, OR, R, halo, SH, SR, NH2, NHR, NR2 or CN, wherein R is C1-C6 alkyl, alkenyl or alkynyl and halo is F, Cl, Br or I.
  • Modified nucleotides also include nucleotides conjugated with cholesterol through, e.g., a hydroxyprolinol linkage as described in Krutzfeldt et al, Nature 2005;438:685-689, Soutschek et al, Nature 2004;432:173-178, and U.S. Patent Publication No. 20050107325, which are incorporated herein by reference.
  • Modifications of the ribose-phosphate backbone may be done for a variety of reasons, e.g., to increase the stability and half-life of such molecules in physiological environments, to enhance diffusion across cell membranes, or as probes on a biochip.
  • the backbone modification may also enhance resistance to degradation, such as in the harsh endocytic environment of cells.
  • the backbone modification may also reduce nucleic acid clearance by hepatocytes, such as in the liver and kidney. Mixtures of naturally occurring nucleic acids and analogs may be made; alternatively, mixtures of different nucleic acid analogs, and mixtures of naturally occurring nucleic acids and analogs may be made.
  • Probe may mean an oligonucleotide capable of binding to a target nucleic acid of complementary sequence through one or more types of chemical bonds, usually through complementary base pairing, usually through hydrogen bond formation. Probes may bind target sequences lacking complete complementarity with the probe sequence, depending upon the stringency of the hybridization conditions. There may be any number of base pair mismatches which will interfere with hybridization between the target sequence and the single-stranded nucleic acids described herein. However, if the number of mutations is so great that no hybridization can occur under even the least stringent of hybridization conditions, the sequence is not a complementary target sequence.
  • a probe may be single-stranded or partially single- and partially double-stranded. The strandedness of the probe is dictated by the structure, composition, and properties of the target sequence. Probes may be directly labeled or indirectly labeled such as with biotin to which a streptavidin complex may later bind.
  • pseudogenes are defunct relatives of known genes that are no longer expressed in the cell. Although most pseudogenes have some gene-like features (such as promoters, CpG islands, and splice sites), they are nonetheless considered nonfunctional, due to their lack of protein-coding ability resulting from various genetic disablements (stop codons, frameshifts, or a lack of transcription) or their inability to encode RNA (such as with rRNA pseudogenes).
  • the term "reference expression profile” means a profile of values that statistically correlates to a particular outcome when compared to an assay result.
  • the reference profile values are determined from statistical analysis of studies that compare microRNA expression with known clinical outcomes.
  • the reference values may be a threshold score value or a cutoff score value.
  • a reference value will be a threshold above which one outcome is more probable and below which an alternative outcome is more probable.
  • “Sensitivity”, as used herein, may mean a statistical measure of how well a binary classification test correctly identifies a condition, for example, how frequently it correctly classifies a cancer into the correct type out of two possible types.
  • the sensitivity for class A is the proportion of cases that are determined to belong to class "A” by the test out of the cases that are in class "A”, as determined by some absolute or gold standard.
  • siRNA refers to small inhibitory RNA duplexes (generally 16-30 base pairs) that induce the RNA interference (RNAi) pathway. These molecules contain varying degrees of complementarity to their target mRNA in the antisense strand.
  • siRNA has unpaired overhanging bases on the 5' or 3' end of the sense strand and/or the antisense strand.
  • siRNA includes duplexes of two separate strands, as well as single strands that can form hairpin structures comprising a duplex region.
  • Specificity may mean a statistical measure of how well a binary classification test correctly identifies a condition, for example, how frequently it correctly classifies a cancer into the correct type out of two possible types.
  • the specificity for class A is the proportion of cases that are determined to belong to class "not A” by the test out of the cases that are in class "not A”, as determined by some absolute or gold standard.
  • Stem-loop sequence may mean an RNA having a hairpin structure and containing a mature miRNA sequence. Pre-miRNA sequences and stem-loop sequences may overlap. Examples of stem-loop sequences are found in the miRNA database known as Sanger miRBase (release 10).
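By way of non-limiting illustration, the sensitivity and specificity for a class "A" may be computed from gold-standard labels and test results as sketched below. The function name `sensitivity_specificity` and the example labels are hypothetical.

```python
def sensitivity_specificity(true_labels, predicted_labels, positive="A"):
    """Return (sensitivity, specificity) for the class named by `positive`.

    Sensitivity: cases called A by the test, out of cases truly in A.
    Specificity: cases called not-A by the test, out of cases truly not in A.
    """
    tp = fn = tn = fp = 0
    for truth, call in zip(true_labels, predicted_labels):
        if truth == positive:
            if call == positive:
                tp += 1
            else:
                fn += 1
        else:
            if call == positive:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in the gold standard")
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical example: five samples, gold standard versus test result.
gold = ["A", "A", "A", "B", "B"]
test = ["A", "A", "B", "B", "A"]
sens, spec = sensitivity_specificity(gold, test)
```

In this example two of the three true "A" cases are recovered and one of the two "not A" cases is correctly excluded.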
  • Stringent hybridization conditions may mean conditions under which a first nucleic acid sequence (e.g., probe) will hybridize to a second nucleic acid sequence (e.g., target).
  • Stringent conditions are sequence-dependent and will be different in different circumstances.
  • Stringent conditions may be selected to be about 5-10°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH.
  • the Tm may be the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50% of the probes complementary to the target hybridize to the target sequence at equilibrium (as the target sequences are present in excess, at Tm, 50% of the probes are occupied at equilibrium).
  • Stringent conditions may be those in which the salt concentration is less than about 1.0 M sodium ion, such as about 0.01-1.0 M sodium ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C for short probes (e.g., about 10-50 nucleotides) and at least about 60°C for long probes (e.g., greater than about 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. For selective or specific hybridization, a positive signal may be at least 2 to 10 times background hybridization.
  • Exemplary stringent hybridization conditions include the following: 50% formamide, 5x SSC, and 1% SDS, incubating at 42°C, or, 5x SSC, 1% SDS, incubating at 65°C, with wash in 0.2x SSC, and 0.1% SDS at 65°C.
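By way of non-limiting illustration, the temperature figures given above may be captured in two small helper functions. The sketch assumes that the Tm has already been measured or estimated by other means; the function names are hypothetical, and no Tm formula is implied.

```python
def stringent_window(tm_celsius):
    """Return the (low, high) range lying about 5-10 degrees C below the Tm."""
    return (tm_celsius - 10.0, tm_celsius - 5.0)

def minimum_temperature(probe_length):
    """Lower temperature bound by probe length, per the ranges stated above.

    About 10-50 nucleotides: at least about 30 degrees C.
    More than about 50 nucleotides: at least about 60 degrees C.
    """
    if probe_length < 10:
        # The text gives no figure for probes shorter than about 10 nucleotides.
        raise ValueError("no temperature bound stated for probes under 10 nt")
    return 30.0 if probe_length <= 50 else 60.0
```

For a probe with a Tm of 65°C, `stringent_window(65.0)` gives the range from 55°C to 60°C.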
  • Substantially complementary as used herein means that a first sequence is at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98% or 99% identical to the complement of a second sequence over a region of 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100 or more nucleotides, or that the two sequences hybridize under stringent hybridization conditions.
  • substantially identical means that a first and a second sequence are at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98% or 99% identical over a region of 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100 or more nucleotides or amino acids, or with respect to nucleic acids, if the first sequence is substantially complementary to the complement of the second sequence.
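By way of non-limiting illustration, the percentage test for substantial complementarity may be sketched as follows. The sketch assumes that both sequences are written 5' to 3', that "complement" is read as the reverse complement, and that the two regions are of equal length and compared without gaps; the alternative criterion of hybridization under stringent conditions is not modeled. All function names are hypothetical.

```python
_PAIR = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement in the DNA alphabet; U is treated as T."""
    return seq.upper().replace("U", "T").translate(_PAIR)[::-1]

def is_substantially_complementary(first, second, min_identity=60.0, min_region=8):
    """True if `first` is at least `min_identity` percent identical to the
    reverse complement of `second` over a region of at least `min_region`
    nucleotides (ungapped, equal-length comparison)."""
    if len(first) != len(second) or len(first) < min_region:
        return False
    target = reverse_complement(second)
    query = first.upper().replace("U", "T")
    matched = sum(1 for x, y in zip(query, target) if x == y)
    return 100.0 * matched / len(query) >= min_identity
```

The thresholds are parameters because the definition above lists a series of alternative percentages (60% to 99%) and region lengths (8 nucleotides or more).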
  • the term "subject” refers to a mammal, including both human and other mammals.
  • the methods of the present invention are preferably applied to human subjects.
  • threshold expression level refers to a criterion expression profile to which measured values are compared in order to determine the prognosis of a subject with cancer.
  • the reference expression profile may be based on the expression of the nucleic acids, or may be based on a combined metric score thereof.
  • tissue sample is tissue obtained from a tissue biopsy using methods well known to those of ordinary skill in the related medical arts.
  • the phrase "suspected of being cancerous", as used herein, means a cancer tissue sample believed by one of ordinary skill in the medical arts to contain cancerous cells. Methods for obtaining the sample from the biopsy include gross apportioning of a mass, microdissection, laser-based microdissection, or other art-known cell-separation methods.
  • “Treat” or “treating”, as used herein when referring to protection of a subject from a condition, may mean preventing, suppressing, repressing, or eliminating the condition.
  • Preventing the condition involves administering a composition described herein to a subject prior to onset of the condition.
  • Suppressing the condition involves administering the composition to a subject after induction of the condition but before its clinical appearance.
  • Repressing the condition involves administering the composition to a subject after clinical appearance of the condition such that the condition is reduced or prevented from worsening.
  • Elimination of the condition involves administering the composition to a subject after clinical appearance of the condition such that the subject no longer suffers from the condition.
  • Tumor refers to all neoplastic cell growth and proliferation, whether malignant or benign, and all pre-cancerous and cancerous cells and tissues.
  • Variant, as used herein to refer to a nucleic acid, may mean (i) a portion of a referenced nucleotide sequence; (ii) the complement of a referenced nucleotide sequence or portion thereof; (iii) a nucleic acid that is substantially identical to a referenced nucleic acid or the complement thereof; or (iv) a nucleic acid that hybridizes under stringent conditions to the referenced nucleic acid, complement thereof, or a sequence substantially identical thereto.
  • Vector may mean a nucleic acid sequence containing an origin of replication.
  • a vector may be a plasmid, bacteriophage, bacterial artificial chromosome or yeast artificial chromosome.
  • a vector may be a DNA or RNA vector.
  • a vector may be either a self-replicating extrachromosomal vector or a vector which integrates into a host genome.
  • Y RNAs are small non-coding RNA components of the Ro ribonucleoprotein particle (Ro RNP). These small RNAs are predicted to fold into a conserved stem formed by the 3' and 5' ends of the RNA and characterized by a single bulged cytosine. In some embodiments, Y RNAs are over-expressed in human tumours.
  • Y RNAs are required for cell proliferation.
  • 1D/2D threshold classifier may mean an algorithm for classifying a case or sample such as a cancer sample into one of two possible types such as two types of cancer or two types of prognosis (e.g., good and bad).
  • in a 1D threshold classifier, the decision is based on one variable and one predetermined threshold value; the sample is assigned to one class if the variable exceeds the threshold and to the other class if the variable is less than the threshold.
  • a 2D threshold classifier is an algorithm for classifying into one of two types based on the values of two variables. A score may be calculated as a function (usually a continuous function) of the two variables; the decision is then reached by comparing the score to the predetermined threshold, similar to the 1D threshold classifier.
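By way of non-limiting illustration, the 1D and 2D threshold classifiers may be sketched as follows. The class labels, the default score function (a simple sum) and the handling of a value exactly equal to the threshold are assumptions made for this sketch, since the text does not specify them.

```python
def classify_1d(value, threshold):
    """Assign one class if the variable exceeds the threshold, else the other.

    A value exactly equal to the threshold falls in "class_2" here; the
    text leaves that case open.
    """
    return "class_1" if value > threshold else "class_2"

def classify_2d(x, y, threshold, score=lambda a, b: a + b):
    """Combine two variables into one score, then apply the 1D rule.

    `score` may be any (usually continuous) function of the two variables;
    the sum used by default is only a placeholder.
    """
    return classify_1d(score(x, y), threshold)
```

For instance, `classify_2d(1.0, 2.0, threshold=2.5)` scores 3.0 and is assigned to the first class, while `classify_2d(1.0, 1.0, threshold=2.5)` is assigned to the second.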
  • a gene coding for a miRNA may be transcribed, leading to production of an miRNA precursor known as the pri-miRNA.
  • the pri-miRNA may be part of a polycistronic RNA comprising multiple pri-miRNAs.
  • the pri-miRNA may form a hairpin with a stem and loop.
  • the stem may comprise mismatched bases.
  • the hairpin structure of the pri-miRNA may be recognized by Drosha, which is an RNase III endonuclease. Drosha may recognize terminal loops in the pri-miRNA and cleave approximately two helical turns into the stem to produce a 30-200 nucleotide precursor known as the pre-miRNA. Drosha may cleave the pri-miRNA with a staggered cut typical of RNase III endonucleases yielding a pre-miRNA stem loop with a 5' phosphate and ~2 nucleotide 3' overhang.
  • Approximately one helical turn of stem (~10 nucleotides) extending beyond the Drosha cleavage site may be essential for efficient processing.
  • the pre-miRNA may then be actively transported from the nucleus to the cytoplasm by Ran-GTP and the export receptor Exportin-5.
  • the pre-miRNA may be recognized by Dicer, which is also an RNase III endonuclease. Dicer may recognize the double-stranded stem of the pre-miRNA. Dicer may also recognize the 5' phosphate and 3' overhang at the base of the stem loop. Dicer may cleave off the terminal loop two helical turns away from the base of the stem loop leaving an additional 5' phosphate and ~2 nucleotide 3' overhang. The resulting siRNA-like duplex, which may comprise mismatches, comprises the mature miRNA and a similar-sized fragment known as the miRNA*. The miRNA and miRNA* may be derived from opposing arms of the pri-miRNA and pre-miRNA. MiRNA* sequences may be found in libraries of cloned miRNAs but typically at lower frequency than the miRNAs.
  • RISC: RNA-induced silencing complex.
  • Various proteins can form the RISC, which can lead to variability in specificity for miRNA/miRNA* duplexes, binding site of the target gene, activity of miRNA (repress or activate), and which strand of the miRNA/miRNA* duplex is loaded into the RISC.
  • When the miRNA strand of the miRNA:miRNA* duplex is loaded into the RISC, the miRNA* may be removed and degraded.
  • the strand of the miRNA:miRNA* duplex that is loaded into the RISC may be the strand whose 5' end is less tightly paired. In cases where both ends of the miRNA:miRNA* have roughly equivalent 5' pairing, both miRNA and miRNA* may have gene silencing activity.
  • the RISC may identify target nucleic acids based on high levels of complementarity between the miRNA and the mRNA, especially by nucleotides 2-8 of the miRNA. Only one case has been reported in animals where the interaction between the miRNA and its target was along the entire length of the miRNA. This was shown for miR-196 and Hox B8 and it was further shown that miR-196 mediates the cleavage of the Hox B8 mRNA (Yekta et al, Science 2004; 304:594-596). Otherwise, such interactions are known only in plants (Bartel & Bartel, Plant Physiol 2003; 132:709-717).
  • the target sites in the mRNA may be in the 5' UTR, the 3' UTR or in the coding region.
  • multiple miRNAs may regulate the same mRNA target by recognizing the same or multiple sites.
  • the presence of multiple miRNA binding sites in most genetically identified targets may indicate that the cooperative action of multiple RISCs provides the most efficient translational inhibition.
  • miRNAs may direct the RISC to down-regulate gene expression by either of two mechanisms: mRNA cleavage or translational repression.
  • the miRNA may specify cleavage of the mRNA if the mRNA has a certain degree of complementarity to the miRNA. When a miRNA guides cleavage, the cut may be between the nucleotides pairing to residues 10 and 11 of the miRNA. Alternatively, the miRNA may repress translation if the mRNA does not have the requisite degree of complementarity to the miRNA. Translational repression may be more prevalent in animals since animals may have a lower degree of complementarity between the miRNA and binding site.
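By way of non-limiting illustration, candidate target sites defined by perfect complementarity to nucleotides 2-8 of a miRNA may be located in an mRNA as sketched below. The sketch assumes both sequences are written 5' to 3' and considers only exact Watson-Crick matches to that seven-nucleotide stretch; the function name `seed_sites` and the example sequences are illustrative only.

```python
_RNA_PAIR = str.maketrans("ACGU", "UGCA")

def seed_sites(mirna, mrna):
    """Return 0-based start positions in `mrna` of sites perfectly
    complementary to nucleotides 2-8 (1-based) of `mirna`."""
    mirna = mirna.upper().replace("T", "U")
    mrna = mrna.upper().replace("T", "U")
    if len(mirna) < 8:
        raise ValueError("miRNA must be at least 8 nucleotides long")
    seed = mirna[1:8]                        # nucleotides 2-8
    site = seed.translate(_RNA_PAIR)[::-1]   # reverse complement, read 5'->3'
    width = len(site)
    return [i for i in range(len(mrna) - width + 1) if mrna[i:i + width] == site]

# Illustrative sequences: the toy mRNA contains one matching site.
example_mirna = "UGAGGUAGUAGGUUGUAUAGUU"
example_mrna = "AAACUACCUCAAA"
positions = seed_sites(example_mirna, example_mrna)
```

Such a search yields only candidate sites; it does not model the additional pairing that may determine whether cleavage or translational repression results.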
  • nucleic acids are provided herein.
  • the nucleic acid may comprise the sequence of SEQ ID NOS: 1-514, or variants thereof.
  • the variant may be a complement of the referenced nucleotide sequence.
  • the variant may also be a nucleotide sequence that is substantially identical to the referenced nucleotide sequence or the complement thereof.
  • the variant may also be a nucleotide sequence which hybridizes under stringent conditions to the referenced nucleotide sequence, complements thereof, or nucleotide sequences substantially identical thereto.
  • the nucleic acid may have a length of from 10 to 250 nucleotides.
  • the nucleic acid may have a length of at least 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24,
  • the nucleic acid may be synthesized or expressed in a cell (in vitro or in vivo) using a synthetic gene described herein.
  • the nucleic acid may be synthesized as a single-strand molecule and hybridized to a substantially complementary nucleic acid to form a duplex.
  • the nucleic acid may be introduced to a cell, tissue or organ in a single- or double-stranded form or may be capable of being expressed by a synthetic gene using methods well known to those skilled in the art, including as described in U.S. Patent No..
  • a miRNA may be identified at a single genome locus. In another embodiment, a miRNA may be identified within multiple genomic loci.
  • Sequence variants are presented in Table 3 below, together with counts for both the most abundant sequence and sequence variants.
  • MiRNAs may, in one embodiment, form clusters in the genome.
  • a cluster is defined based on the criterion of a distance of not more than 5000 nucleotides between miRNAs within the human genome. Most miRNA genes within 50 kb of each other have highly correlated expression patterns, with the correlation dropping sharply beyond the 50-kb range. Relatively few miRNAs are found between 50 kb and 500 kb of each other, as described in Baskerville and Bartel (RNA 2005; 11:241-247). Thus, in one embodiment, clustered miRNAs are defined as falling within a range of 0.1 kb and 50 kb.
  • miRNAs located at this distance are co-transcribed. In another embodiment, miRNAs located at this distance are co-regulated. In yet another embodiment, miRNAs located at this distance are co-transcribed and co-regulated.
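By way of non-limiting illustration, miRNA loci may be grouped into clusters by genomic distance as sketched below. The maximum gap is left as a parameter because the passage above mentions both a 5000-nucleotide criterion and a 50 kb range; the 50,000 default, the locus names and the coordinates are hypothetical.

```python
def cluster_by_distance(loci, max_gap=50_000):
    """Group loci on the same chromosome whose neighbors lie within `max_gap`.

    `loci` is a list of (name, chromosome, position) tuples. Returns a list
    of clusters, each a list of names; singletons are omitted.
    """
    clusters = []
    for name, chrom, pos in sorted(loci, key=lambda locus: (locus[1], locus[2])):
        last = clusters[-1][-1] if clusters else None
        if last is not None and last[1] == chrom and pos - last[2] <= max_gap:
            clusters[-1].append((name, chrom, pos))   # extend current cluster
        else:
            clusters.append([(name, chrom, pos)])     # start a new cluster
    return [[n for n, _, _ in group] for group in clusters if len(group) > 1]

# Hypothetical loci: only "a" and "b" lie close enough to form a cluster.
example_loci = [
    ("a", "chr12", 1_000),
    ("b", "chr12", 1_500),
    ("c", "chr12", 900_000),
    ("d", "chr20", 1_200),
]
example_clusters = cluster_by_distance(example_loci)
```

Neighboring loci are chained, so a cluster may span more than `max_gap` in total as long as each consecutive pair satisfies the criterion.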
  • SEQ ID NOS: 53 and 162 appear in a cluster.
  • SEQ ID NOS: 70 and 110 appear in a cluster, separated by 500 nucleotides on chromosome 12.
  • SEQ ID NOS: 14 and 120 appear in a cluster.
  • SEQ ID NOS: 63, 106 and 58 appear in a cluster separated by 1000 nucleotides on chromosome 20.
  • SEQ ID NOS: 135 and 159 appear in a cluster.
  • the nucleic acid may further comprise one or more of the following: a peptide, a protein, a RNA-DNA hybrid, an antibody, an antibody fragment, a Fab fragment, and an aptamer.
  • the nucleic acid may also comprise a protamine-antibody fusion protein as described in Song et al. (Nature Biotechnology 2005;23:709-717) and Rossi (Nature
  • the protamine-fusion protein may comprise the abundant and highly basic cellular protein protamine.
  • the protamine may readily interact with the nucleic acid.
  • the protamine may comprise the entire 51-amino acid protamine peptide or a fragment thereof.
  • the protamine may be covalently attached to another protein, which may be a
  • the Fab may bind to a receptor expressed on a cell surface.
  • the nucleic acid may comprise a sequence of a pri-miRNA or a variant thereof.
  • the pri-miRNA sequence may comprise from 45-30,000, 50-25,000, 100-20,000, 1,000-1,500 or 80-100 nucleotides.
  • the sequence of the pri-miRNA may comprise a pre-miRNA, miRNA and miRNA*, as set forth herein, and variants thereof.
  • a sequence of the pri-miRNA may comprise the sequence of SEQ ID NOS: 1, 2, 4-7, 9-26, 28-35, 37-43, 46-72, 74-81, 83-93, 95-102, 104-108, 110-114, 116-157, 159-187, 189-196, 198-267, 269-326, 328-330, 332-340, 342, 343, 345-350, 352-361, 363-369, 371-387, 389-413, 415-432, 434-438, 440-449, 472-514 or variants thereof.
  • the pri-miRNA may form a hairpin structure.
  • the hairpin may comprise first and second nucleic acid sequence that are substantially complementary.
  • the first and second nucleic acid sequence may be from 37-50 nucleotides.
  • the first and second nucleic acid sequence may be separated by a third sequence of from 8-12 nucleotides.
  • the hairpin structure may have a free energy less than -25 Kcal/mole, as calculated by the Vienna algorithm, with default parameters, as described in Hofacker et al. (Monatshefte f. Chemie 1994; 125:167-188), the contents of which are incorporated herein.
  • the hairpin may comprise a terminal loop of 4-20, 8-12 or 10 nucleotides.
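By way of non-limiting illustration, the size and stability ranges listed above for the hairpin may be checked as sketched below. The sketch assumes the free energy has been computed separately (for example by the Vienna algorithm referred to above) and is supplied as a number; the function name `is_candidate_hairpin` is hypothetical, and the loop range used is the 8-12 nucleotide separation between the two arms.

```python
def is_candidate_hairpin(arm1_len, arm2_len, loop_len, free_energy_kcal_per_mol):
    """True if the hairpin meets the stated ranges.

    Arms: 37-50 nucleotides each. Separating sequence: 8-12 nucleotides.
    Free energy: less than -25 kcal/mole (computed elsewhere).
    """
    arms_ok = 37 <= arm1_len <= 50 and 37 <= arm2_len <= 50
    loop_ok = 8 <= loop_len <= 12
    stable = free_energy_kcal_per_mol < -25.0
    return arms_ok and loop_ok and stable
```

A hairpin with arms of 40 and 42 nucleotides, a 10-nucleotide separating sequence and a free energy of -30.5 kcal/mole would pass, whereas the same structure at -20 kcal/mole would not.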
  • the pri-miRNA may comprise at least 19% adenosine nucleotides, at least 16% cytosine nucleotides, at least
  • the nucleic acid may also comprise a sequence of a pre-miRNA or a variant thereof.
  • the pre-miRNA sequence may comprise from 45-200, 60-80 or 60-70 nucleotides.
  • the sequence of the pre-miRNA may comprise a miRNA and a miRNA*, as set forth herein.
  • the sequence of the pre-miRNA may also be that of a pri-miRNA excluding from 0-160 nucleotides from the 5' and 3' ends of the pri-miRNA.
  • a sequence of the pre-miRNA may comprise the sequence of SEQ ID NOS: 1, 2, 4-7, 9-26, 28-35, 37-43, 46-72, 74-81, 83-93, 95-102, 104-108, 110-114, 116-157, 159-187, 189-196, 198-267, 269-326, 328-330, 332-340, 342, 343, 345-350, 352-361, 363-369, 371-387, 389-413, 415-432, 434-438, 440-449, 472-514 or variants thereof.
  • the nucleic acid may also comprise a sequence of a miRNA (including miRNA*) or a variant thereof.
  • the miRNA sequence may comprise from 13-33, 18-24 or 21-23
  • the miRNA may also comprise a total of at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37
  • the sequence of the miRNA may be the first 13-33 nucleotides of the pre-miRNA.
  • the sequence of the miRNA may also be the last 13-33 nucleotides of the pre-miRNA.
  • the sequence of the miRNA may comprise the sequence of SEQ ID NOS: 1, 2, 4-7, 9-26, 28-35, 37-43, 46-72, 74-81, 83-93, 95-102, 104-108, 110-114, 116-157, 159-175 and 472-495 or variants thereof.
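By way of non-limiting illustration, candidate miRNA sequences taken from the first or last nucleotides of a pre-miRNA may be extracted as sketched below. The default length of 22 is an arbitrary choice within the 13-33 nucleotide range stated above; the function name `candidate_mirnas` and the toy sequence are hypothetical.

```python
def candidate_mirnas(pre_mirna, length=22):
    """Return (first `length` nt, last `length` nt) of a pre-miRNA.

    `length` must fall within the 13-33 nucleotide range and must not
    exceed the length of the pre-miRNA.
    """
    if not 13 <= length <= 33:
        raise ValueError("length must be between 13 and 33 nucleotides")
    if length > len(pre_mirna):
        raise ValueError("length exceeds the pre-miRNA length")
    return pre_mirna[:length], pre_mirna[-length:]

# Toy 70-nucleotide sequence used only to show which ends are returned.
toy_pre_mirna = "A" * 30 + "C" * 10 + "G" * 30
five_prime, three_prime = candidate_mirnas(toy_pre_mirna)
```

The two returned fragments correspond to the opposing ends of the precursor from which a miRNA and a miRNA* may be derived.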
  • Anti-miRNA
  • the nucleic acid may also comprise a sequence of an anti-miRNA that is capable of blocking the activity of a miRNA or miRNA*, such as by binding to the pri-miRNA, pre-miRNA, miRNA or miRNA* (e.g., antisense or RNA silencing), or by binding to the target binding site.
  • the anti-miRNA may comprise a total of 5-100 or 10-60 nucleotides.
  • the anti-miRNA may also comprise a total of at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 or 40 nucleotides.
  • the sequence of the anti-miRNA may comprise (a) at least 5 nucleotides that are substantially identical or complementary to the 5' of a miRNA and at least 5-12 nucleotides that are substantially complementary to the flanking regions of the target site from the 5' end of the miRNA, or (b) at least 5-12 nucleotides that are substantially identical or complementary to the 3' of a miRNA and at least 5 nucleotides that are substantially complementary to the flanking region of the target site from the 3' end of the miRNA.
  • a sequence of the anti-miRNA may comprise the complement of the sequence of SEQ ID NOS: 1, 2, 4-7, 9-26, 28-35, 37-43, 46-72, 74-81, 83-93, 95-102, 104-108, 110-114, 116-157, 159-175 and 472-495 or variants thereof.
  • the nucleic acid may also comprise a sequence of a double-stranded RNA (dsRNA) that is capable of suppressing specific transcripts in a sequence-dependent manner.
  • the dsRNA may be processed to provide small interfering RNAs (siRNAs).
  • the siRNA may comprise a total of 20-25 nucleotides. Endogenous siRNAs have been identified in nematodes, plants and mammalian cells, including mouse oocytes.
  • the sequence of the siRNA may comprise SEQ ID NOS: 450-454 or variants thereof. In one embodiment, these sequences may be found in the introns of two genes: ERBB4 and AKAP6.
  • ERBB4 is a member of the Tyr protein kinase family and the epidermal growth factor receptor subfamily. In one embodiment, mutations in ERBB4 may be associated with cancer.
  • AKAPs: A-kinase anchor proteins.
  • PKA: protein kinase A.
  • the encoded protein is expressed in brain and cardiac and skeletal muscle. In one embodiment, it is specifically localized to the sarcoplasmic reticulum and nuclear membrane. In another embodiment, it is involved in anchoring PKA to the nuclear membrane or sarcoplasmic reticulum.
  • the siRNA sequence may be found in pseudogenes derived from mitochondrial tRNA.
  • the siRNA may have a regulatory role in splicing of the genes in which it resides. In other embodiments, the siRNA may have a role in removing unspliced mRNAs. Detection of such siRNAs in a tissue sample may be indicative of cancerous processes within the cells.
  • a probe comprising a nucleic acid described herein is also provided. Probes may be used for screening and diagnostic methods. The probe may be attached or immobilized to a solid substrate, such as a biochip.
  • the probe may have a length of from 8 to 500, 10 to 100 or 20 to 60 nucleotides.
  • the probe may also have a length of at least 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 120, 140,
  • the probe may further comprise a linker sequence of from 10-60 nucleotides.
  • a biochip is also provided.
  • the biochip may comprise a solid substrate comprising an attached probe or plurality of probes described herein.
  • the probes may be capable of hybridizing to a target sequence under stringent hybridization conditions.
  • the probes may be attached at a spatially defined address on the substrate. More than one probe per target sequence may be used, with either overlapping probes or probes to different sections of a particular target sequence.
  • the probes may be capable of hybridizing to target sequences associated with a single disorder, as appreciated by those in the art.
  • the probes may either be synthesized first, with subsequent attachment to the biochip, or may be directly synthesized on the biochip.
  • the solid substrate may be a material that may be modified to contain discrete individual sites appropriate for the attachment or association of the probes, and is amenable to at least one detection method.
  • substrates include glass and modified or functionalized glass, plastics (including acrylics, polystyrene and copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene, polyurethanes, Teflon, etc.), polysaccharides, nylon or nitrocellulose, resins, silica or silica-based materials including silicon and modified silicon, carbon, metals, inorganic glasses and plastics.
  • the substrates may allow optical detection without appreciably fluorescing.
  • the substrate may be planar, although other configurations of substrates may be used as well. For example, probes may be placed on the inside surface of a tube, for flow-through sample analysis to minimize sample volume.
  • the substrate may be flexible, such as a flexible foam, including closed-cell foams made of particular plastics.
  • the biochip and the probe may be derivatized with chemical functional groups for subsequent attachment of the two.
  • the biochip may be derivatized with a chemical functional group including, but not limited to, amino groups, carboxyl groups, oxo groups or thiol groups.
  • the probes may be attached using functional groups on the probes either directly or indirectly using a linker.
  • the probes may be attached to the solid support by the 5' terminus, 3' terminus, or via an internal nucleotide.
  • the probe may also be attached to the solid support non-covalently.
  • biotinylated oligonucleotides can be made, which may bind to surfaces covalently coated with streptavidin, resulting in attachment.
  • probes may be synthesized on the surface using techniques such as photopolymerization and photolithography.
  • a method of diagnosis comprises detecting a differential expression level of a nucleic acid in a biological sample.
  • the sample may be derived from a subject. Diagnosis of a disease state in a patient may allow for prognosis and selection of therapeutic strategy. Further, the developmental stage of cells may be determined by determining temporally expressed cancer-associated nucleic acids.
  • In situ hybridization of labeled probes to tissue sections may be performed.
  • the skilled artisan can make a diagnosis, a prognosis, or a prediction based on the findings. It is further understood that the nucleic acids which indicate the diagnosis may differ from those which indicate the prognosis and molecular profiling of the condition of the cells may lead to distinctions between responsive or refractory conditions or may be predictive of outcomes.
6. miRNA expression analysis
  • miRNA expression patterns in cancer cells have been reported. Both increases and decreases in miRNA expression have been described in relation to cancer.
  • the total number of miRNAs in the human genome is estimated to range from approximately 800 to several thousand. In view of this high number of total miRNAs, identification of particular miRNAs linked to particular cancer types is necessary in order to identify miRNAs that could be targeted for cancer therapy, either through inhibition or augmentation of the miRNA.
  • the methods provided herein are useful for the treatment of cancer. These methods may result in one or more clinically desirable outcomes in a subject having cancer, such as reduction in tumor number and/or size, reduced metastatic progression, prolonged survival time, and/or increased progression-free survival time. Also provided herein are pharmaceutical agents, such as modified oligonucleotides, that may be used for the treatment of cancer.
  • the present invention also relates to a method of identifying miRNAs that are associated with disease or a pathological condition comprising contacting a biological sample with a probe or biochip of the invention and detecting the amount of hybridization.
  • PCR may be used to amplify nucleic acids in the sample, which may provide higher sensitivity.
  • the ability to identify miRNAs that are overexpressed or underexpressed in pathological cells compared to a control can provide high-resolution, high-sensitivity datasets which may be used in the areas of diagnostics, therapeutics, drug development, pharmacogenetics, biosensor development, and other related areas.
  • An expression profile generated by the current methods may be a "fingerprint" of the state of the sample with respect to a number of miRNAs. While two states may have any particular miRNA similarly expressed, the evaluation of a number of miRNAs simultaneously allows the generation of a gene expression profile that is characteristic of the state of the cell. That is, normal tissue may be distinguished from diseased tissue. By comparing expression profiles of tissue in known different disease states, information regarding which miRNAs are associated in each of these states may be obtained. Then, diagnosis may be performed or confirmed to determine whether a tissue sample has the expression profile of normal or disease tissue. This may provide for molecular diagnosis of related conditions.
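The profile comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used in this document: it assumes each profile is a mapping from miRNA name to a log-scale signal, and it assigns a sample to whichever reference state has the most similar profile by Pearson correlation. All miRNA names and values are invented for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("a profile with no variation cannot be correlated")
    return cov / (sx * sy)


def closest_state(sample, references):
    """Return the name of the reference profile most correlated with the sample."""
    mirnas = sorted(sample)  # fixed ordering so the vectors line up
    sample_vec = [sample[m] for m in mirnas]
    return max(
        references,
        key=lambda state: pearson(sample_vec, [references[state][m] for m in mirnas]),
    )


# Hypothetical log2 signals for four made-up miRNAs.
references = {
    "normal": {"miR-A": 8.0, "miR-B": 5.0, "miR-C": 7.0, "miR-D": 6.0},
    "disease": {"miR-A": 5.0, "miR-B": 9.0, "miR-C": 7.2, "miR-D": 4.0},
}
sample = {"miR-A": 5.4, "miR-B": 8.6, "miR-C": 7.0, "miR-D": 4.5}
print(closest_state(sample, references))
```

Note that no single miRNA decides the call here; miR-C is nearly identical in both reference states, which mirrors the point that two states may share similar expression of any particular miRNA while still differing as a whole profile.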
  • the present invention also relates to a method of determining the expression level of a cancer-associated miRNA comprising contacting a biological sample with a probe or biochip of the invention and measuring the amount of hybridization.
  • the expression level of a cancer-associated miRNA is informative in a number of ways. For example, a differential expression of a cancer-associated miRNA compared to a control may be used as a diagnostic that a patient suffers from cancer. Expression levels of a cancer-associated miRNA may also be used to monitor the treatment and cancer state of a patient. Furthermore, expression levels of a cancer-associated miRNA may allow the screening of drug candidates for altering a particular expression profile or suppressing an expression profile associated with cancer.
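One common way to express "differential expression compared to a control" is a log2 fold change of mean signals. The Python sketch below illustrates that calculation under stated assumptions: signals are positive, linear-scale hybridization intensities, and the two-fold cutoff (a log2 threshold of 1.0) is an arbitrary illustrative choice, not a value taken from this document.

```python
import math
from statistics import mean


def log2_fold_change(sample_signals, control_signals):
    """log2 of (mean sample signal / mean control signal)."""
    s, c = mean(sample_signals), mean(control_signals)
    if s <= 0 or c <= 0:
        raise ValueError("signals must have a positive mean")
    return math.log2(s / c)


def call_differential(sample_signals, control_signals, threshold=1.0):
    """Label a miRNA as over-, under-, or similarly expressed versus control.

    `threshold` is in log2 units; 1.0 corresponds to a two-fold change
    (an assumption made for this example).
    """
    lfc = log2_fold_change(sample_signals, control_signals)
    if lfc >= threshold:
        return "overexpressed"
    if lfc <= -threshold:
        return "underexpressed"
    return "unchanged"


# Hypothetical replicate intensities for one miRNA.
print(call_differential([400, 420, 380], [100, 95, 105]))
```

A real analysis would also account for replicate variability and multiple testing; this sketch shows only the direction-and-magnitude comparison.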
  • a target nucleic acid may be detected by contacting a sample comprising the target nucleic acid with a biochip comprising an attached probe sufficiently complementary to the target nucleic acid and detecting hybridization to the probe above control levels.
  • the target nucleic acid may also be detected by immobilizing the nucleic acid to be examined on a solid support such as nylon membranes and hybridizing a labelled probe with the sample.
  • the target nucleic acid may also be detected by immobilizing the labeled probe to the solid support and hybridizing a sample comprising a labeled target nucleic acid. Following washing to remove the non-specific hybridization, the label may be detected.
  • the target nucleic acid may also be detected in situ by contacting permeabilized cells or tissue samples with a labeled probe to allow hybridization with the target nucleic acid. Following washing to remove the non-specifically bound probe, the label may be detected.
  • These assays can be direct hybridization assays or can comprise sandwich assays, which include the use of multiple probes, as is generally outlined in U.S. Pat. Nos. 5,681,702; 5,597,909; 5,545,730; 5,594,117; 5,591,584; 5,571,670; 5,580,731; 5,571,670; 5,591,584; 5,624,802; 5,635,352; 5,594,118; 5,359,100; 5,124,246; and 5,681,697, each of which is hereby incorporated by reference.
  • a variety of hybridization conditions may be used, including high, moderate and low stringency conditions as outlined above.
  • the assays may be performed under stringency conditions which allow hybridization of the probe only to the target.
  • Stringency can be controlled by altering a step parameter that is a thermodynamic variable, including, but not limited to, temperature, formamide concentration, salt concentration, chaotropic salt concentration, pH, or organic solvent concentration.
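To show how these variables move stringency in a given direction, the sketch below uses a widely cited empirical approximation for the melting temperature of long DNA:DNA hybrids, Tm = 81.5 + 16.6·log10[Na+] + 0.41·(%GC) − 0.61·(% formamide) − 500/L. This equation is not taken from this document and is not accurate for short probes such as those directed at miRNAs; it is used here only to illustrate that lowering salt or adding formamide lowers Tm and therefore raises stringency at a fixed temperature. The function name and parameter values are assumptions.

```python
import math


def estimate_tm_celsius(gc_percent, length_nt, na_molar, formamide_percent=0.0):
    """Approximate Tm (deg C) of a long DNA:DNA hybrid.

    Textbook approximation only; not suitable for short oligonucleotides.
    """
    if not 0 <= gc_percent <= 100:
        raise ValueError("gc_percent must be between 0 and 100")
    if length_nt <= 0 or na_molar <= 0:
        raise ValueError("length_nt and na_molar must be positive")
    return (
        81.5
        + 16.6 * math.log10(na_molar)   # salt term: less salt, lower Tm
        + 0.41 * gc_percent             # GC-content term
        - 0.61 * formamide_percent      # formamide destabilizes the duplex
        - 500.0 / length_nt             # length correction
    )


# Compare conditions for a hypothetical 100-nt, 50% GC hybrid.
for na, formamide in [(1.0, 0.0), (0.1, 0.0), (1.0, 50.0)]:
    tm = estimate_tm_celsius(50, 100, na, formamide)
    print(f"[Na+]={na} M, formamide={formamide}% -> Tm ~ {tm:.1f} C")
```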
  • Hybridization reactions may be accomplished in a variety of ways. Components of the reaction may be added simultaneously, or sequentially, in different orders.
  • the reaction may include a variety of other reagents. These include salts, buffers, neutral proteins, e.g., albumin, detergents, etc. which may be used to facilitate optimal hybridization and detection, and/or reduce non-specific or background interactions. Reagents that otherwise improve the efficiency of the assay, such as protease inhibitors, nuclease inhibitors and anti-microbial agents may also be used as appropriate, depending on the sample preparation methods and purity of the target.
  • the present invention also relates to a method of diagnosis comprising detecting a differential expression level of a cancer-associated miRNA in a biological sample.
  • the sample may be derived from a patient. Diagnosis of cancer in a patient allows for prognosis and selection of therapeutic strategy. Further, the developmental stage of cells may be classified by determining temporally expressed miRNA molecules.
  • In situ hybridization of labeled probes to tissue arrays may be performed.
  • the skilled artisan can make a diagnosis, a prognosis, or a prediction based on the findings. It is further understood that the genes which indicate the diagnosis may differ from those which indicate the prognosis and molecular profiling of the condition of the cells may lead to distinctions between responsive or refractory conditions or may be predictive of outcomes.
  • the present invention also relates to a method of screening therapeutics comprising contacting a pathological cell capable of expressing a disease related miRNA with a candidate therapeutic and evaluating the effect of a drug candidate on the expression profile of the disease associated miRNA. Having identified the differentially expressed miRNAs, a variety of assays may be executed. Test compounds may be screened for the ability to modulate gene expression of the disease associated miRNA. Modulation includes both an increase and a decrease in gene expression.
  • the test compound or drug candidate may be any molecule, e.g., protein, oligopeptide, small organic molecule, polysaccharide, polynucleotide, etc., to be tested for the capacity to directly or indirectly alter the disease phenotype or the expression of the disease associated miRNA.
  • Drug candidates encompass numerous chemical classes, such as small organic molecules having a molecular weight of more than 100 and less than about 500, 1,000, 1,500, 2,000 or 2,500 daltons.
  • Candidate compounds may comprise functional groups necessary for structural interaction with proteins, particularly hydrogen bonding, and typically include at least an amine, carbonyl, hydroxyl or carboxyl group, preferably at least two of the functional chemical groups.
  • the candidate agents may comprise cyclical carbon or heterocyclic structures and/or aromatic or polyaromatic structures substituted with one or more of the above functional groups.
  • Candidate agents are also found among biomolecules including peptides, saccharides, fatty acids, steroids, purines, pyrimidines, derivatives, structural analogs or combinations thereof.
  • Combinatorial libraries of potential modulators may be screened for the ability to bind to the disease associated miRNA or to modulate the activity thereof.
  • the combinatorial library may be a collection of diverse chemical compounds generated by either chemical synthesis or biological synthesis by combining a number of chemical building blocks such as reagents. Preparation and screening of combinatorial chemical libraries is well known to those of skill in the art.
  • Such combinatorial chemical libraries include, but are not limited to, peptide libraries, encoded peptides, benzodiazepines, diversomers such as hydantoins, benzodiazepines and dipeptides, vinylogous polypeptides, analogous organic syntheses of small compound libraries, oligocarbamates, and/or peptidyl phosphonates, nucleic acid libraries, peptide nucleic acid libraries, antibody libraries, carbohydrate libraries, and small organic molecule libraries.
  • the present invention also relates to a method of using the nucleic acids of the invention to reduce expression of a target gene in a cell, tissue or organ.
  • Expression of the target gene may be reduced by expressing a nucleic acid of the invention that comprises a sequence substantially complementary to one or more binding sites of the target mRNA.
  • the nucleic acid may be a miRNA or a variant thereof.
  • the nucleic acid may also be pri-miRNA, pre-miRNA, or a variant thereof, which may be processed to yield a miRNA.
  • the expressed miRNA may hybridize to a substantially complementary binding site on the target mRNA, which may lead to activation of RISC-mediated gene silencing.
  • nucleic acids of the present invention may be used to inhibit expression of target genes using antisense methods well known in the art, as well as RNAi methods described in U.S. Patent Nos. 6,506,559 and 6,573,099, which are incorporated by reference.
  • the target of gene silencing may be a protein that causes the silencing of a second protein. By repressing expression of the target gene, expression of the second protein may be increased. Examples for efficient suppression of miRNA expression are the studies by Esau et al. (JBC 2004; 279:52361) and Cheng et al. (Nucleic Acids Res 2005;
  • the present invention also relates to a method of using the nucleic acids of the invention to increase expression of a target gene in a cell, tissue or organ.
  • Expression of the target gene may be increased by expressing a nucleic acid of the invention that comprises a sequence substantially complementary to a pri-miRNA, pre-miRNA, miRNA or a variant thereof.
  • the nucleic acid may be an anti-miRNA.
  • the anti-miRNA may hybridize with a pri-miRNA, pre-miRNA or miRNA, thereby reducing its gene repression activity.
  • Expression of the target gene may also be increased by expressing a nucleic acid of the invention that is substantially complementary to a portion of the binding site in the target gene, such that binding of the nucleic acid to the binding site may prevent miRNA binding.
  • the present invention also relates to a method of using the nucleic acids of the invention as modulators or targets of disease or disorders associated with developmental dysfunctions, such as cancer.
  • the claimed nucleic acid molecules may be used as a modulator of the expression of genes which are at least partially complementary to said nucleic acid.
  • miRNA molecules may act as targets for therapeutic screening procedures, e.g. inhibition or activation of miRNA molecules might modulate a cellular differentiation process, e.g. apoptosis.
  • miRNA molecules may be used as starting materials for the manufacture of sequence-modified miRNA molecules, in order to modify the target-specificity thereof, e.g. an oncogene, a multidrug-resistance gene or another therapeutic target gene. Further, miRNA molecules can be modified, in order that they are processed and then generated as double-stranded siRNAs which are again directed against therapeutically relevant targets. Furthermore, miRNA molecules may be used for tissue reprogramming procedures, e.g. a differentiated cell line might be transformed by expression of miRNA molecules into a different cell type or a stem cell.
  • composition may comprise a nucleic acid described herein and optionally a pharmaceutically acceptable carrier.
  • the composition may encompass modified oligonucleotides that are identical, substantially identical, substantially complementary or complementary to any nucleobase sequence version of the miRNAs described herein or a precursor thereof.
  • a nucleobase sequence of a modified oligonucleotide is fully identical or complementary to a miRNA nucleobase sequence listed herein, or a precursor thereof.
  • a modified oligonucleotide has a nucleobase sequence having one mismatch with respect to the nucleobase sequence of the mature miRNA, or a precursor thereof.
  • a modified oligonucleotide has a nucleobase sequence having two mismatches with respect to the nucleobase sequence of the miRNA, or a precursor thereof.
  • a modified oligonucleotide has a nucleobase sequence having no more than two mismatches with respect to the nucleobase sequence of the mature miRNA, or a precursor thereof.
  • the mismatched nucleobases are contiguous. In certain such embodiments, the mismatched nucleobases are not contiguous.
  • a modified oligonucleotide consists of a number of linked nucleosides that is equal to the length of the mature miRNA.
  • the number of linked nucleosides of a modified oligonucleotide is less than the length of the mature miRNA. In certain such embodiments, the number of linked nucleosides of a modified oligonucleotide is one less than the length of the mature miRNA. In certain such embodiments, a modified oligonucleotide has one less nucleoside at the 5' terminus. In certain such embodiments, a modified oligonucleotide has one less nucleoside at the 3' terminus. In certain such embodiments, a modified oligonucleotide has two fewer nucleosides at the 5' terminus.
  • a modified oligonucleotide has two fewer nucleosides at the 3' terminus.
  • a modified oligonucleotide having a number of linked nucleosides that is less than the length of the miRNA, wherein each nucleobase of a modified oligonucleotide is complementary to each nucleobase at a corresponding position in a miRNA, is considered to be a modified oligonucleotide having a nucleobase sequence that is fully complementary to a portion of a miRNA sequence.
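The mismatch and truncation definitions above can be checked mechanically. The following Python sketch is illustrative only: it assumes unmodified A/C/G/U letters stand in for the nucleobases of a modified oligonucleotide, and the helper names, the `offset` parameter, and the example sequence are hypothetical. `offset` is the number of nucleosides missing from the 5' terminus of the oligonucleotide relative to the full-length complement.

```python
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def fully_complementary(mirna: str) -> str:
    """Full-length oligonucleotide (5'->3') complementary to the miRNA (5'->3')."""
    return mirna.upper().replace("T", "U").translate(_COMPLEMENT)[::-1]


def count_mismatches(oligo: str, mirna: str, offset: int = 0) -> int:
    """Count positions where `oligo` differs from the complement of `mirna`.

    The oligo is compared against the window of the full-length complement
    that starts `offset` nucleosides in from its 5' terminus.
    """
    oligo = oligo.upper().replace("T", "U")
    target = fully_complementary(mirna)
    if offset < 0 or offset + len(oligo) > len(target):
        raise ValueError("oligo does not fit within the miRNA complement")
    window = target[offset:offset + len(oligo)]
    return sum(a != b for a, b in zip(oligo, window))


# Hypothetical 22-nt miRNA used only for illustration.
mirna = "AUGGCUACGUUAGCCGAUCAAG"
full = fully_complementary(mirna)

print(count_mismatches(full, mirna))                 # full-length, fully complementary
print(count_mismatches(full[1:], mirna, offset=1))   # one fewer nucleoside at the 5' terminus
print(count_mismatches(full[:-2], mirna))            # two fewer nucleosides at the 3' terminus
```

A truncated oligonucleotide that returns zero mismatches here corresponds to one that is fully complementary to a portion of the miRNA sequence, in the sense of the preceding item.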
  • a modified oligonucleotide consists of 15 to 30 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 19 to 24 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 21 to 24 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 15 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 16 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 17 linked nucleosides.
  • a modified oligonucleotide consists of 18 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 19 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 20 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 21 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 22 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 23 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 24 linked nucleosides.
  • a modified oligonucleotide consists of 25 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 26 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 27 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 28 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 29 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 30 linked nucleosides.
  • Modified oligonucleotides of the present invention may comprise one or more modifications to a nucleobase, sugar, and/or internucleoside linkage.
  • a modified nucleobase, sugar, and/or internucleoside linkage may be selected over an unmodified form because of desirable properties such as, for example, enhanced cellular uptake, enhanced affinity for other oligonucleotides or nucleic acid targets and increased stability in the presence of nucleases.
  • a modified oligonucleotide of the present invention comprises one or more modified nucleosides.
  • a modified nucleoside is a stabilizing nucleoside.
  • An example of a stabilizing nucleoside is a sugar-modified nucleoside.
  • a modified nucleoside is a sugar-modified nucleoside.
  • the sugar-modified nucleosides can further comprise a natural or modified heterocyclic base moiety and/or a natural or modified internucleoside linkage and may include further modifications independent from the sugar modification.
  • a sugar-modified nucleoside is a 2'-modified nucleoside, wherein the sugar ring is modified at the 2' carbon from natural ribose or 2'-deoxy-ribose. In certain embodiments, a 2'-O-methyl group is present in the sugar residue.
  • the modified oligonucleotides of the present invention can be generated according to any oligonucleotide synthesis method known in the art, including both enzymatic syntheses and solid-phase syntheses.
  • Equipment and reagents for executing solid-phase synthesis are commercially available from, for example, Applied Biosystems. Any other means for such synthesis may also be employed; the actual synthesis of the oligonucleotides is well within the capabilities of one skilled in the art and can be accomplished via established methodologies as detailed in, for example: Sambrook, J. and Russell, D. W. (2001), "Molecular Cloning: A Laboratory Manual"; Ausubel, R. M. et al, eds.
  • oligonucleotide comprising an RNA molecule can be also generated using an expression vector as is further described hereinbelow.
  • compositions may be used for therapeutic applications.
  • the pharmaceutical composition may be administered by known methods, including wherein a nucleic acid is introduced into a desired target cell in vitro or in vivo.
  • nucleic acid molecules can be administered to cells by a variety of methods known to those familiar to the art, including, but not restricted to, encapsulation in liposomes, by iontophoresis, or by incorporation into other vehicles, such as hydrogels, cyclodextrins, biodegradable nanocapsules, and bioadhesive microspheres.
  • the nucleic acid/vehicle combination is locally delivered by direct injection or by use of an infusion pump.
  • routes of delivery include, but are not limited to oral (tablet or pill form) and/or intrathecal delivery (Gold, Neuroscience, 76, 1153-1158, 1997).
  • Other approaches include the use of various transport and carrier systems, for example, through the use of conjugates and biodegradable polymers. More detailed descriptions of nucleic acid delivery and administration are provided for example in WO93/23569, WO99/05094, and WO99/04819.
  • the nucleic acids can be introduced into tissues or host cells by any number of routes, including viral infection, microinjection, or fusion of vesicles. Jet injection may also be used for intra-muscular administration, as described by Furth et al. (Anal Biochem 1992; 205:365-368).
  • the nucleic acids can be coated onto gold microparticles, and delivered intradermally by a particle bombardment device, or "gene gun", as described in the literature (see, for example, Tang et al., Nature 1992; 356:152-154), where gold microprojectiles are coated with the DNA, then bombarded into skin cells.
  • Administration of a pharmaceutical composition of the present invention to a subject having cancer results in one or more clinically desirable outcomes.
  • Such clinically desirable outcomes include reduction of tumor number or reduction of tumor size.
  • Additional clinically desirable outcomes include the extension of overall survival time of the subject, and/or extension of progression-free survival time of the subject.
  • administration of a pharmaceutical composition of the invention prevents an increase in tumor size and/or tumor number.
  • administration of a pharmaceutical composition of the invention prevents metastatic progression.
  • administration of a pharmaceutical composition of the invention slows or stops metastatic progression.
  • administration of a pharmaceutical composition of the invention prevents the recurrence of tumors.
  • administration of a pharmaceutical composition of the invention prevents recurrence of tumor metastasis.
  • a modified oligonucleotide may stop, slow or reduce the uncontrolled proliferation of cancer cells.
  • a modified oligonucleotide may induce apoptosis in cancer cells.
  • a modified oligonucleotide may reduce cancer cell survival.
  • a miRNA hybridizes to an mRNA to regulate expression of the mRNA and its protein product.
  • the hybridization of a miRNA to its mRNA target inhibits expression of the mRNA.
  • the inhibition of a miRNA may result in the increased expression of a miRNA nucleic acid target.
  • the inhibition of a miRNA results in the increase of a protein encoded by a miRNA nucleic acid target.
  • the present invention also relates to a pharmaceutical composition comprising the nucleic acids of the invention and optionally a pharmaceutically acceptable carrier.
  • compositions may be used for diagnostic or therapeutic applications.
  • administration of the pharmaceutical composition may be carried out by known methods, wherein a nucleic acid is introduced into a desired target cell in vitro or in vivo.
  • Gene transfer techniques include calcium phosphate, DEAE-dextran, electroporation, microinjection, viral methods and cationic liposomes.
  • Cancer treatments often comprise more than one therapy.
  • the present invention provides methods for treating cancer comprising administering to a subject in need thereof a compound comprising a modified oligonucleotide complementary to a miRNA, or a precursor thereof, and further comprising administering at least one additional therapy.
  • an additional therapy may also be designed to treat cancer.
  • An additional therapy may be a chemotherapeutic agent.
  • Suitable chemotherapeutic agents include 5-fluorouracil, gemcitabine, doxorubicin, mitomycin C, sorafenib, etoposide, carboplatin, epirubicin, irinotecan and oxaliplatin.
  • An additional suitable chemotherapeutic agent includes a modified oligonucleotide, other than a modified oligonucleotide of the present invention, that is used to treat cancer.
  • an additional therapy may be designed to treat a disease other than cancer.
  • an additional therapy is a treatment that includes interferons, for example, interferon alfa-2b, interferon alfa-2a, and interferon alfacon-1.
  • interferon alfa-2b pegylated and unpegylated
  • ribavirin RNA replication inhibitor
  • antisense agents e.g., ViroPharma's VP50406 series
  • therapeutic vaccines e.g., ViroPharma's VP50406 series
  • protease inhibitors e.g., helicase inhibitors
  • antibody therapy monoclonal and polyclonal.
  • an additional therapy may be a pharmaceutical agent that enhances the body's immune system, including low-dose cyclophosphamide, thymostimulin, vitamins and nutritional supplements (e.g., antioxidants, including vitamins A, C, E, beta-carotene, zinc, selenium, glutathione, coenzyme Q-10 and echinacea), and vaccines, e.g., the immunostimulating complex (ISCOM), which comprises a vaccine formulation that combines a multimeric presentation of antigen and an adjuvant.
  • the additional therapy is selected to treat or ameliorate a side effect of one or more pharmaceutical compositions of the present invention.
  • side effects include, without limitation, injection site reactions, liver function test abnormalities, renal function abnormalities, liver toxicity, renal toxicity, central nervous system abnormalities, and myopathies.
  • increased aminotransferase levels in serum may indicate liver toxicity or liver function abnormality.
  • increased bilirubin may indicate liver toxicity or liver function abnormality.
  • one or more pharmaceutical compositions of the present invention and one or more other pharmaceutical agents are administered at the same time. In certain embodiments, one or more pharmaceutical compositions of the present invention and one or more other pharmaceutical agents are administered at different times. In certain embodiments, one or more pharmaceutical compositions of the present invention and one or more other pharmaceutical agents are prepared together in a single formulation. In certain embodiments, one or more pharmaceutical compositions of the present invention and one or more other pharmaceutical agents are prepared separately.
  • compositions of the present invention can be formulated into pharmaceutical compositions by combination with appropriate, pharmaceutically acceptable carriers or diluents, and can be formulated into preparations in solid, semi-solid, liquid or gaseous forms, such as tablets, capsules, powders, granules, ointments, solutions, suppositories, injections, inhalants and aerosols.
  • administration of the agents can be achieved in various ways, including, but not limited to, oral, buccal, rectal, parenteral (e.g., intravenous, intramuscular, intramedullary, and subcutaneous), transmucosal, intestinal, enteral, topical, suppository, through inhalation, intraperitoneal, intradermal, transdermal, intratracheal, intrathecal, intraventricular, intranasal, intraocular and intratumoral.
  • An additional suitable administration route includes chemoembolization.
  • pharmaceutical intrathecals are administered to achieve local rather than systemic exposures.
  • pharmaceutical compositions may be injected directly in the area of desired effect (e.g., into a tumor).
  • a pharmaceutical composition of the present invention is administered in the form of a dosage unit (e.g., tablet, capsule, bolus, etc.).
  • such pharmaceutical compositions comprise a modified oligonucleotide in a dose selected from 25 mg, 30 mg, 35 mg, 40 mg, 45 mg, 50 mg, 55 mg, 60 mg, 65 mg, 70 mg, 75 mg, 80 mg, 85 mg, 90 mg, 95 mg, 100 mg, 105 mg, 110 mg, 115 mg, 120 mg, 125 mg, 130 mg, 135 mg, 140 mg, 145 mg, 150 mg, 155 mg, 160 mg, 165 mg, 170 mg, 175 mg, 180 mg, 185 mg, 190 mg, 195 mg, 200 mg, 205 mg, 210 mg, 215 mg, 220 mg, 225 mg, 230 mg, 235 mg, 240 mg, 245 mg, 250 mg, 255 mg, 260 mg, 265 mg, 270 mg, 275 mg, 280 mg, 285 mg,
  • a pharmaceutical composition of the present invention comprises a dose of modified oligonucleotide selected from 25 mg, 50 mg, 75 mg, 100 mg, 150 mg, 200 mg, 250 mg, 300 mg, 350 mg, 400 mg, 500 mg, 600 mg, 700 mg, and 800 mg.
  • a pharmaceutical agent is sterile lyophilized modified oligonucleotide that is reconstituted with a suitable diluent, e.g., sterile water for injection or sterile saline for injection.
  • the reconstituted product is administered as a subcutaneous injection or as an intravenous infusion after dilution into saline.
  • the lyophilized drug product consists of a modified oligonucleotide which has been prepared in water for injection, or in saline for injection, adjusted to pH 7.0-9.0 with acid or base during preparation, and then lyophilized.
  • the lyophilized modified oligonucleotide may be 25-800 mg of a modified oligonucleotide.
  • the lyophilized drug product may be packaged in a 2 mL Type I, clear glass vial (ammonium sulfate-treated), stoppered with a bromobutyl rubber closure and sealed with an aluminum FLIP-OFF® overseal.
  • compositions of the present invention may contain additional, compatible, pharmaceutically-active materials such as, for example, antipruritics, astringents, local anesthetics or anti-inflammatory agents, or may contain additional materials useful in physically formulating various dosage forms of the compositions of the present invention, such as dyes, flavoring agents, preservatives, antioxidants, opacifiers, thickening agents and stabilizers.
  • such materials when added, should not unduly interfere with the biological activities of the components of the compositions of the present invention.
  • the formulations can be sterilized and, if desired, mixed with auxiliary agents, e.g., lubricants, preservatives, stabilizers, wetting agents, emulsifiers, salts for influencing osmotic pressure, buffers, colorings, flavorings and/or aromatic substances and the like which do not deleteriously interact with the oligonucleotide(s) of the formulation.
  • compositions of the present invention comprise one or more modified oligonucleotides and one or more excipients.
  • excipients are selected from water, salt solutions, alcohol, polyethylene glycols, gelatin, lactose, amylase, magnesium stearate, talc, silicic acid, viscous paraffin, hydroxymethylcellulose and polyvinylpyrrolidone.
  • a pharmaceutical composition of the present invention is prepared using known techniques, including, but not limited to, mixing, dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or tabletting processes.
  • a pharmaceutical composition of the present invention is a liquid (e.g., a suspension, elixir and/or solution).
  • a liquid pharmaceutical composition is prepared using ingredients known in the art, including, but not limited to, water, glycols, oils, alcohols, flavoring agents, preservatives, and coloring agents.
  • a pharmaceutical composition of the present invention is a solid (e.g., a powder, tablet, and/or capsule).
  • a solid pharmaceutical composition comprising one or more oligonucleotides is prepared using ingredients known in the art, including, but not limited to, starches, sugars, diluents, granulating agents, lubricants, binders, and disintegrating agents.
  • a pharmaceutical composition of the present invention is formulated as a depot preparation.
  • Certain such depot preparations are typically longer acting than non-depot preparations. In certain embodiments, such preparations are administered by implantation (for example, subcutaneously or intramuscularly) or by intramuscular injection.
  • depot preparations are prepared using suitable polymeric or hydrophobic materials (for example, an emulsion in an acceptable oil) or ion exchange resins, or as sparingly soluble derivatives, for example, as a sparingly soluble salt.
  • a pharmaceutical composition of the present invention comprises a delivery system.
  • delivery systems include, but are not limited to, liposomes and emulsions.
  • Certain delivery systems are useful for preparing certain pharmaceutical compositions including those comprising hydrophobic compounds.
  • certain organic solvents such as dimethylsulfoxide are used.
  • a pharmaceutical composition of the present invention comprises one or more tissue-specific delivery molecules designed to deliver the one or more pharmaceutical agents of the present invention to specific tissues or cell types.
  • pharmaceutical compositions include liposomes coated with a tissue-specific antibody.
  • a pharmaceutical composition of the present invention comprises a co-solvent system.
  • co-solvent systems comprise, for example, benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and an aqueous phase.
  • co-solvent systems are used for hydrophobic compounds.
  • the VPD co-solvent system is a solution of absolute ethanol comprising 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant Polysorbate 80™ and 65% w/v polyethylene glycol 300.
  • co-solvent systems may be varied considerably without significantly altering their solubility and toxicity characteristics.
  • identity of co-solvent components may be varied: for example, other surfactants may be used instead of Polysorbate 80™; the fraction size of polyethylene glycol may be varied; other biocompatible polymers may replace polyethylene glycol, e.g., polyvinyl pyrrolidone; and other sugars or polysaccharides may substitute for dextrose.
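
The w/v percentages quoted for the VPD co-solvent system convert directly into masses per batch, since 1% w/v is 1 g per 100 mL. The following is a minimal Python sketch of that arithmetic only; the function name and the 50 mL batch size are illustrative assumptions and do not come from the source.

```python
# Percent w/v values for the VPD co-solvent system, as stated in the text.
VPD_PERCENT_WV = {
    "benzyl alcohol": 3.0,
    "Polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}

def grams_for_batch(percent_wv, batch_ml):
    """Convert % w/v (g per 100 mL) into grams needed for batch_ml of solution."""
    return {name: pct * batch_ml / 100.0 for name, pct in percent_wv.items()}

# Hypothetical 50 mL batch, made up to volume with absolute ethanol.
batch = grams_for_batch(VPD_PERCENT_WV, 50.0)
```

Varying a component, as the text allows, only changes the corresponding dictionary entry.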
  • a pharmaceutical composition of the present invention comprises a sustained-release system.
  • a sustained-release system is a semi-permeable matrix of solid hydrophobic polymers.
  • sustained-release systems may, depending on their chemical nature, release pharmaceutical agents over a period of hours, days, weeks or months.
  • a pharmaceutical composition of the present invention is prepared for oral administration. In certain of such embodiments, a pharmaceutical composition is formulated by combining one or more compounds comprising a modified oligonucleotide with one or more pharmaceutically acceptable carriers. Certain of such carriers enable pharmaceutical compositions to be formulated as tablets, pills, dragees, capsules, liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a subject.
  • compositions for oral use are obtained by mixing oligonucleotide and one or more solid excipient.
  • Suitable excipients include, but are not limited to, fillers, such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, gum tragacanth, methyl cellulose, hydroxypropylmethylcellulose, sodium carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP).
  • such a mixture is optionally ground and auxiliaries are optionally added.
  • compositions are formed to obtain tablets or dragee cores.
  • disintegrating agents (e.g., cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt thereof, such as sodium alginate) are added.
  • dragee cores are provided with coatings. In certain such embodiments, concentrated sugar solutions may be used, which may optionally contain gum arabic, talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer solutions, and suitable organic solvents or solvent mixtures.
  • Dyestuffs or pigments may be added to tablets or dragee coatings.
  • compositions for oral administration are push-fit capsules made of gelatin.
  • Certain of such push-fit capsules comprise one or more pharmaceutical agents of the present invention in admixture with one or more filler such as lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and, optionally, stabilizers.
  • pharmaceutical compositions for oral administration are soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or sorbitol.
  • one or more pharmaceutical agents of the present invention are dissolved or suspended in suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols.
  • stabilizers may be added.
  • compositions are prepared for buccal administration. Certain of such pharmaceutical compositions are tablets or lozenges formulated in conventional manner.
  • a pharmaceutical composition is prepared for administration by injection (e.g., intravenous, subcutaneous, intramuscular, etc.).
  • a pharmaceutical composition comprises a carrier and is formulated in aqueous solution, such as water or physiologically compatible buffers such as Hanks's solution, Ringer's solution, or physiological saline buffer.
  • other ingredients are included (e.g., ingredients that aid in solubility or serve as preservatives).
  • injectable suspensions are prepared using appropriate liquid carriers, suspending agents and the like.
  • Certain pharmaceutical compositions for injection are presented in unit dosage form, e.g., in ampoules or in multi-dose containers.
  • compositions for injection are suspensions, solutions or emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending, stabilizing and/or dispersing agents.
  • Certain solvents suitable for use in pharmaceutical compositions for injection include, but are not limited to, lipophilic solvents and fatty oils, such as sesame oil, synthetic fatty acid esters, such as ethyl oleate or triglycerides, and liposomes.
  • Aqueous injection suspensions may contain substances that increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or dextran.
  • such suspensions may also contain suitable stabilizers or agents that increase the solubility of the pharmaceutical agents to allow for the preparation of highly concentrated solutions.
  • a pharmaceutical composition is prepared for transmucosal administration.
  • penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art.
  • a pharmaceutical composition is prepared for administration by inhalation.
  • Certain of such pharmaceutical compositions for inhalation are prepared in the form of an aerosol spray in a pressurized pack or a nebulizer.
  • Certain of such pharmaceutical compositions comprise a propellant, e.g., dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or other suitable gas.
  • the dosage unit may be determined with a valve that delivers a metered amount.
  • capsules and cartridges for use in an inhaler or insufflator may be formulated.
  • Certain of such formulations comprise a powder mixture of a pharmaceutical agent of the invention and a suitable powder base such as lactose or starch.
  • a pharmaceutical composition is prepared for rectal administration, such as a suppository or retention enema.
  • Certain of such pharmaceutical compositions comprise known ingredients, such as cocoa butter and/or other glycerides.
  • a pharmaceutical composition is prepared for topical administration.
  • Certain of such pharmaceutical compositions comprise bland moisturizing bases, such as ointments or creams.
  • ointments or creams include, but are not limited to, petrolatum, petrolatum plus volatile silicones, and lanolin and water in oil emulsions.
  • suitable cream bases include, but are not limited to, cold cream and hydrophilic ointment.
  • a pharmaceutical composition of the present invention comprises a modified oligonucleotide in a therapeutically effective amount. In certain embodiments, the therapeutically effective amount is sufficient to prevent, alleviate or ameliorate symptoms of a disease or to prolong the survival of the subject being treated. Determination of a therapeutically effective amount is well within the capability of those skilled in the art.
  • one or more modified oligonucleotides of the present invention is formulated as a prodrug.
  • a prodrug upon in vivo administration, is chemically converted to the biologically, pharmaceutically or therapeutically more active form of a modified oligonucleotide.
  • prodrugs are useful because they are easier to administer than the corresponding active form.
  • a prodrug may be more bioavailable (e.g., through oral administration) than is the corresponding active form.
  • a prodrug may have improved solubility compared to the corresponding active form.
  • prodrugs are less water soluble than the corresponding active form.
  • a prodrug is an ester.
  • the ester is metabolically hydrolyzed to carboxylic acid upon administration.
  • the carboxylic acid containing compound is the corresponding active form.
  • a prodrug comprises a short peptide (polyaminoacid) bound to an acid group.
  • the peptide is cleaved upon administration to form the corresponding active form.
  • a prodrug is produced by modifying a pharmaceutically active compound such that the active compound will be regenerated upon in vivo administration.
  • the prodrug can be designed to alter the metabolic stability or the transport characteristics of a drug, to mask side effects or toxicity, to improve the flavor of a drug or to alter other characteristics or properties of a drug.
  • kits which may comprise a nucleic acid described herein together with any or all of the following: assay reagents, buffers, probes and/or primers, and sterile saline or another pharmaceutically acceptable emulsion and suspension base. In addition, the kits may include instructional materials containing directions (e.g., protocols) for the practice of the methods described herein.
  • the kit may be a kit for the amplification, detection, identification or quantification of a target nucleic acid sequence.
  • the kit may comprise a poly (T) primer, a forward primer, a reverse primer, and a probe.
  • Samples in which miRNA expression was identified include tumors (colon, bladder, breast, lung, liver, kidney, ovarian, prostate, esophagus, cervix, and pancreas), normal tissues (colon, bladder, breast, lung, liver, kidney, brain, endometrium, lymph nodes, and heart), metastases (breast, lung, kidney, endometrium, salivary gland, larynx, tongue, and melanocytes) and blood samples (blood cells, whole blood).
  • RNAs used for deep sequencing libraries are as follows: bladder pool, colon pool, breast pool and lung pool.
  • formalin-fixed, paraffin-embedded (FFPE)
  • RNA quantity and quality were checked by spectrophotometry (Nanodrop ND-1000). Pools of samples of the small RNA fraction within the total RNA were labeled and hybridized on arrays. After ensuring the presence and expression of more than 100 miRNAs per cancerous tissue pool, tissues were pooled together, resulting in a bladder+breast tumor pool and a colon+lung pool. Array expression revealed the presence of 157 miRNAs from bladder cancer FFPEs, 260 miRNAs from breast cancer FFPEs, 135 miRNAs from lung cancer FFPEs, and 239 miRNAs from colon cancer FFPEs.
  • RNA (75 µg) of seven duplicate different colon cancer FFPEs were pooled together with 75 µg of six duplicate different lung cancer FFPEs, while 75 µg total RNA of five duplicate different bladder cancer FFPEs were pooled together with 75 µg of five duplicate different breast cancer FFPEs.
  • the 3' and 5' cloning linkers (3' Linker: 5'-rAppCTGTAGGCACCATCAAT/3ddC/-3' (SEQ ID NO: 455); 5' Linker: 5'-TGGAATrUrCrUrCrGrGrGrCrArCrCrArArGrGrU-3' (SEQ ID NO: 456)) were ligated to purified small RNA species in preparation for cDNA synthesis and amplification.
  • Fwd tag 1 primer (454 fwdl -BLl mm)
  • Fwd tag2 primer (454 fwd2 -BLl mm)
  • Rev tag2 primer (454-Rev2-BL 1 )
  • The mass of RNAs in the miRNA size range of 18 nt to 26 nt is very small relative to total RNA, so removal of as much competing mass as possible is essential. Therefore, enrichment of small RNA was carried out by recovering the small RNA fraction, identified by internal size markers, from a slice of a 12% denaturing (7 M urea) polyacrylamide gel.
  • the synthetic RNA size markers run in the lane adjacent to the cancer samples were:
  • Recovery of RNA from the gel was carried out by GeBAflex-tube-midi column using an electric current of 300 V for 40 min until the nucleic acid exited from the gel slice, followed by applying reverse polarity of the current for 120 seconds.
  • RNA was precipitated by adding 8 µl of linear acrylamide, a 1/10 volume of 3 M NaOAc, pH 5.2, and three volumes of cold 100% EtOH, with vortexing after each addition. The isolated RNA was precipitated overnight at -20°C, centrifuged for 1 h at 4°C at 14000 rpm, followed by washing with 1 ml cold 85% EtOH and subsequent centrifugation for 5 min at
  • the small RNAs were ligated with a 3' and a 5' linker in two separate reactions.
  • 3' ligation was performed in which the 3' linker was ligated to the small RNAs using T4 RNA ligase in the absence of ATP in order to avoid circularization of the RNA fragments, as described in Lau et al., 2001.
  • the ligated product was purified by recovering the desired band, identified using size markers, from a slice of a 12% denaturing (7 M Urea) polyacrylamide gel.
  • Two synthetic RNAs 24 nt and 38 nt, described previously
  • two synthetic RNA transcripts 53 nt and 83 nt were run adjacent to the cancer samples. Purification and precipitation were carried out as described previously.
  • the 5' linker was ligated to the 3'-linkered small RNAs in the presence of 1.0 mM ATP, followed by recovery of the desired band from a slice of a 12% denaturing (7 M urea) polyacrylamide gel with the same size markers. Purification and precipitation were done as described before.
  • RNAs contained both RNA and DNA regions which were converted to DNA using reverse transcriptase with RT primer, according to MirCat protocol.
  • PCR amplification step was carried out using primers different from those provided by the MirCat kit since the primers provided cause strong self- and heterodimers.
  • PCR was carried out using PfuUltra high fidelity DNA polymerase (Stratagene #600380) and pairs of longer PCR primers (40-42-mers) containing sequences complementary to the linkers, tag sequences and sequences which were suitable for the 454 platform.
  • Tagl -flagged colon and lung library and Tag2-flagged breast and bladder library followed by 454 sequences that will convert the small RNA libraries made to ones that can be directly sequenced on the 454 platform.
  • the deep sequencing process yielded over 200,000 sequences from both libraries.
  • Adaptors were removed using a Perl script allowing internal polyN sequences within the adaptors and 1 mismatch. About 1000 sequences were removed since they were too short after adaptor removal (<10 bp). The sequences were mapped to the human genome (UCSC hg18 build) using BLAST, allowing a maximum of three bp mismatched to the genome and a maximum insertion/deletion (indel) of three bp. For each aligned sequence the highest scoring hit was retrieved. All sequences with position overlap were clustered together using a Perl script.
  • For example, if a sequence X was mapped to positions 1-20 within the plus strand of chromosome 1 and a sequence Y was mapped to positions 15-35 on the same chromosome and strand, then the two sequences were unified in the same genomic cluster of chromosome 1, plus strand, positions 1-35.
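
The overlap clustering described above can be sketched in Python as a standard interval merge per chromosome and strand. This is an illustrative sketch, not the Perl script used in the work; the function name, the tuple layout and the inclusive-coordinate convention are assumptions.

```python
from collections import defaultdict

def cluster_hits(hits):
    """Merge overlapping hits on the same chromosome and strand.

    hits: iterable of (chrom, strand, start, end) with inclusive coordinates.
    Returns a sorted list of merged (chrom, strand, start, end) clusters.
    """
    by_locus = defaultdict(list)
    for chrom, strand, start, end in hits:
        by_locus[(chrom, strand)].append((start, end))

    clusters = []
    for (chrom, strand), spans in sorted(by_locus.items()):
        spans.sort()
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end:      # positions overlap: extend the cluster
                cur_end = max(cur_end, end)
            else:                     # gap: close this cluster, start a new one
                clusters.append((chrom, strand, cur_start, cur_end))
                cur_start, cur_end = start, end
        clusters.append((chrom, strand, cur_start, cur_end))
    return clusters

# The X/Y case from the text: positions 1-20 and 15-35 on chromosome 1, plus strand.
example = cluster_hits([("chr1", "+", 1, 20), ("chr1", "+", 15, 35)])
```

Hits on the opposite strand or on another chromosome are never merged, which matches the "same chromosome and strand" condition in the example.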
  • the clusters of sequences represent segments of expressed genes. Each genomic cluster was assigned its most abundant sequence, and for candidate miRNAs this most abundant sequence was required to map precisely to the genome (allowing no mismatches or indels).
  • RNA genes, sno/miRNA, RefSeq genes, and RepeatMasker tables were downloaded from the UCSC table browser, and known miRNA precursors were downloaded from miRBase in order to mark whether the sequence is part of a noncoding gene, a snoRNA, a protein-coding gene exon, a genomic repeat, or a known miRNA precursor, respectively.
  • the sequences of the novel miRNA candidates were extended by several hundred bp within their chromosomes in order to predict possible miRNA precursors. An extended sequence was intended to predict the folding of a pri-miRNA that contains a hairpin-folded pre-miRNA.
  • the candidate pri-miRNAs were folded using the Vienna package (Hofacker, I.L. (2003) Nucleic Acids Res, 31, 3429-3431) or mfold (Zuker, M. (2003) Nucleic Acids Res, 31, 3406-3415) programs. All hairpin structures that had at least six base pairs, were at least 55 nucleotides long and had a loop not longer than 20 nucleotides were extracted from the minimum free energy fold of the predicted pri-miRNA (excluding overlapping hairpins). Each hairpin was assigned a Palgrade and conservation score.
  • Predicted miRNA precursors have either Palgrade>0 (meaning they have structural characteristics of known miRNAs) or an absolute value of conservation score>0.9 (conserved in mammals) (Bentwich et al. (2005) Nat Genet, 37, 766-770). These criteria have a sensitivity of 86% for known miRNA precursors from miRBase 13.0. In addition, only sequences with ten or fewer genomic copies, a length of 17-25 bp and a GC content in the known miRNA range (15-90%) were chosen as miRNA candidates.
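
The selection thresholds listed above can be written as a simple filter. The Python sketch below assumes that the Palgrade and conservation scores have already been computed elsewhere; the function and parameter names are hypothetical, and whether the thresholds are inclusive at the boundaries is an assumption where the text does not say.

```python
def gc_percent(seq):
    """GC content of a sequence, in percent."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)

def passes_precursor_criteria(palgrade, conservation):
    """Palgrade > 0, or absolute conservation score > 0.9 (thresholds from the text)."""
    return palgrade > 0 or abs(conservation) > 0.9

def is_mirna_candidate(seq, genomic_copies, palgrade, conservation):
    """Apply the precursor criteria plus the copy-number, length and GC filters."""
    return (
        passes_precursor_criteria(palgrade, conservation)
        and genomic_copies <= 10                 # ten or fewer genomic copies
        and 17 <= len(seq) <= 25                 # length of 17-25 nt
        and 15.0 <= gc_percent(seq) <= 90.0      # GC content within 15-90%
    )

# Illustrative 22-nt sequence with made-up scores.
ok = is_mirna_candidate("UGAGGUAGUAGGUUGUAUAGUU", genomic_copies=1,
                        palgrade=0.5, conservation=0.0)
```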
  • Custom microarrays (Biochips) were manufactured by Agilent Technologies by in situ synthesizing DNA oligonucleotide probes to 949 known microRNAs and 876 sequences printed in triplicate, and 8639 computationally predicted microRNAs printed in one copy. 44/49 of the novel miRNA and small RNAs were used in the microarray (five sequences were identified as novel miRNAs/small RNAs after the design of the microarray). Sequences from deep sequencing were characterized by:
  • Each probe comprised an antisense sequence of the relevant sequence, followed by a tail sequence (GCAATGCTAGCTATTGCTTGCTATTAAAAA) (SEQ ID NO: 465), trimmed so the final length of the probe would be 45 nucleotides.
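
The probe layout described above (antisense sequence, then tail, trimmed to 45 nucleotides) can be sketched in Python as follows. The tail is SEQ ID NO: 465 as given in the text. Two points are assumptions: that the antisense is written as DNA, and that trimming removes bases from the far end of the tail, since the text does not say which end is trimmed.

```python
TAIL = "GCAATGCTAGCTATTGCTTGCTATTAAAAA"  # SEQ ID NO: 465, from the text
PROBE_LENGTH = 45

_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "U": "A"}

def antisense_dna(seq):
    """Reverse complement of an RNA or DNA sequence, written as DNA."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))

def design_probe(target_seq):
    """Antisense of the target followed by the tail, cut to PROBE_LENGTH.

    Assumption: trimming removes bases from the end of the tail.
    """
    return (antisense_dna(target_seq) + TAIL)[:PROBE_LENGTH]

# Illustrative 22-nt target; 23 tail bases remain after trimming.
probe = design_probe("UGAGGUAGUAGGUUGUAUAGUU")
```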
  • Seventeen negative control probes were designed using the sense sequences of different microRNAs. Two groups of positive control probes were designed to hybridize to the array: (i) synthetic small RNA that were spiked to the RNA before labeling to verify the labeling efficiency, and (ii) probes for abundant small nuclear RNAs that were spotted on the array to verify RNA quality.
  • RNA-linker p- rCrU-Cy/dye (Eurogentec S. A.; Cy3 or Cy5)
  • Synthetic small RNA was spiked into the RNA before labeling to verify the labeling efficiency.
  • Slides were incubated with the labeled RNA for 12-16 h at 55°C and then washed according to Agilent GE washes for Agilent miRNA protocol.
  • Arrays were scanned using Agilent DNA Microarray Scanner Bundle (Agilent Technologies, Santa Clara, CA) at a resolution of 5 micrometer, dual pass at 100% and 10% PMT power green and red Dye channel.
  • Array images were analyzed using Agilent Feature Extraction software (version 9.5).
  • Array images were analyzed using the Feature Extraction software (FE) 9.5.1 (Agilent, Santa Clara, CA). Triplicate spots were combined to produce one signal for each probe by taking the logarithmic mean of reliable spots. All data were log-transformed (natural base) and the analysis was performed in log-space.
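
Combining replicate spots by their logarithmic mean amounts to averaging the natural-log signals of the spots judged reliable. The Python sketch below assumes the reliability flags are supplied by an upstream quality step, which the text does not specify; the function name is hypothetical.

```python
import math

def combine_spots(signals, reliable):
    """Mean of natural-log signals over spots flagged reliable.

    signals: raw (linear-scale) spot intensities for one probe.
    reliable: booleans of the same length (assumed to come from upstream QC).
    Returns the log-space signal, or None if no usable spot remains.
    """
    logs = [math.log(s) for s, ok in zip(signals, reliable) if ok and s > 0]
    if not logs:
        return None
    return sum(logs) / len(logs)

# Three replicate spots, the third flagged unreliable.
probe_signal = combine_spots([math.e, math.e ** 3, 100.0], [True, True, False])
```

Averaging in log-space is equivalent to taking the geometric mean of the raw intensities, which is less sensitive to a single bright spot than the arithmetic mean.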
  • P-values were calculated using a two-sided t-test on the log-transformed normalized fluorescence signal.
  • the fold-difference ratio of the median normalized fluorescence was calculated for each microRNA.
  • the signal of a sequence is defined as differential between sample "A" and sample "B" if the fold change between the signal in sample A and sample B is either larger than the 95th percentile of fold changes of all sequences expressed in both samples, or larger than 8.
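
The differential-signal rule above can be sketched in Python. The sketch takes linear-scale signals for sequences expressed in each sample. Two choices are assumptions not fixed by the text: the fold change is taken symmetrically (the larger of A/B and B/A), and the 95th percentile uses the nearest-rank method.

```python
import math

def fold_change(a, b):
    """Symmetric fold change (>= 1) between two positive linear-scale signals."""
    return max(a / b, b / a)

def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile (an assumed convention)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]

def differential_sequences(sample_a, sample_b, hard_cutoff=8.0, pct=95.0):
    """sample_a, sample_b: dicts mapping sequence id -> signal of expressed sequences."""
    shared = sorted(set(sample_a) & set(sample_b))
    folds = {s: fold_change(sample_a[s], sample_b[s]) for s in shared}
    if not folds:
        return []
    threshold = percentile_nearest_rank(folds.values(), pct)
    return [s for s in shared if folds[s] > threshold or folds[s] > hard_cutoff]

# Made-up signals: only s1, s2 and s3 are expressed in both samples.
a = {"s1": 100, "s2": 100, "s3": 1000, "s4": 50}
b = {"s1": 100, "s2": 50, "s3": 100, "s5": 10}
hits = differential_sequences(a, b)
```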
  • the tumor sample was compared to the median signals of all other tumors, to the normal sample from the same tissue type where available, the relevant tumor adjacent sample where available, and metastatic samples originating in the same tissue where available. Each metastatic sample was also compared to normal samples originating from the same site.
  • Sequences used in the reaction are microRNA-specific forward primers detailed in table 4 below, a universal TaqMan probe (complementary to the 3' end of the oligodT plus part of the tail, SEQ ID NO: 470), and the universal reverse primer (complementary to the consensus 3' sequence of the oligodT tail, SEQ ID NO: 471).
  • expression signals were calculated by the formula 42 - Ct (miR-X).
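
The signal formula above inverts the Ct scale so that a higher value means higher expression. A minimal Python sketch follows; converting a signal difference into a fold difference assumes roughly 100% amplification efficiency (one Ct unit per two-fold change), which the text does not state.

```python
def expression_signal(ct, offset=42.0):
    """Signal = 42 - Ct, so a lower Ct (earlier detection) gives a higher signal."""
    return offset - ct

def fold_difference(ct_a, ct_b):
    """Fold difference of A over B, assuming one Ct unit is about a two-fold change."""
    return 2.0 ** (expression_signal(ct_a) - expression_signal(ct_b))

signal = expression_signal(30.0)      # hypothetical Ct of 30
ratio = fold_difference(28.0, 30.0)   # hypothetical Ct values of 28 and 30
```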
  • Example 2 Identified miR expression in tumor tissues, normal tissues and blood and metastatic tissues, for candidate miRs
  • Table 5 presents expression data for miRNAs identified in tumor tissues.
  • Table 6 presents expression data for miRNAs identified in normal tissues.
  • Table 7 presents expression data for miRNAs identified in metastatic tissue originating from primary tumors at the indicated sites and normal blood tissues.
  • Example 3.1 Differential expression of miRNAs in tumor vs. normal tissue
  • Table 8 presents expression of a list of miRNAs identified in tumor and normal tissues, measured by log2(signal).
  • “Expression in the tumor” refers to median of expression in all tumor tissues tested.
  • Normal refers to median of expression in all normal tissues tested.
  • P-value refers to the significance of the change between all normal and all tumor tissues tested according to two-sided Student's t-test.
  • “Fold change” refers to the fold change between all normal and all tumor tissues tested.
  • Table 8 Differential expression of miRNAs in median of all tumors vs. median of all normal tissues
  • Example 3.2 Differential expression of miRNAs in tumor vs. adjacent and normal tissue
  • Table 9 presents data comparing expression of a list of miRNAs identified in tumor vs. adjacent and normal tissues, measured by log2(signal).
  • adjacent refers to an area of the same tissue and the same patient without tumor growth.
  • “Expression in normal tissue” refers to expression in non-tumor tissue of the same origin.
  • SEQ ID NOS: 79, 99, 122, 130, 153, 154 exhibit relatively low expression, when comparing expression in a breast tumor vs. adjacent breast tissue.
  • SEQ ID NOS: 11, 26, 71, 72, 77, 98, 134, 136, 153, 160, 170 and 171 exhibit relatively high expression
  • SEQ ID NOS: 12, 14, 32, 34, 54, 89, 123, 126, 128 and 140 exhibit relatively low expression, when comparing expression in liver tumor vs. normal liver tissue.
  • SEQ ID NOS: 14, 28-32, 42, 43, 52-55, 67, 75, 84, 85, 89, 97,105, 123-126, 128, 129, 137, 144 and 154 exhibit relatively low expression when comparing expression in liver tumor vs. adjacent liver tissue.
  • Example 3.4 Differential expression of miRNAs in a specific tumor vs. other tumors
  • Table 10 presents data comparing expression of a list of miRNAs identified in a specific tumor vs. their expression in other tumors, measured by log2(signal).
  • SEQ ID NOS: 5, 10, 21, 29, 43, 47, 51, 63, 68-70, 86, 110, 129, 143, 154, 165 and 166 exhibit high fold change when comparing expression in bile vs. other tumors.
  • SEQ ID NOS: 21, 68, 92, 111 and 174 exhibit high fold change when comparing expression in whole blood vs. other tumors.
  • mets. metastasis.
  • Example 3.5 Differential expression of miRNAs in a specific tumor vs. sites of metastases
  • Tables 11A-C present data comparing expression of a list of miRNAs identified in a specific tumor vs. their expression in other tumors, measured by log2(signal).
  • Table 11A details the differential expression of miRNAs in a specific primary tumor vs. metastasis originating from the same tissue or the metastasis vs.
  • Table 11B details miR expression in a tumor originating from a specific tissue, in metastases of same tissue to the lung, in normal lung tissue, in lung primary tumors (normalized to the metastases), and in median of metastasis of other origins
  • Table 11C details miR expression in a tumor originating from a specific tissue, in metastasis of same tissue to the lymph node, in normal lymph node tissue, in lymph node primary tumors, and in median of metastasis of other origins;
  • Specific tumor signal refers to the expression in a specific tumor tissue.
  • Metal signal refers to expression in the metastasis originating from the specific tissue.
  • Mets Signal refers to expression in median of metastasis of various origins.
  • Figure 1 shows differential expression of miRs, comparing the median values of each miR in breast primary tumor with breast metastases into lymph nodes.
  • Figure 2 shows differential expression of miRs, comparing the median values of each miR in colon tumors with the corresponding median for their adjacent tissues.
  • Figure 3 shows differential expression of miRs, comparing the median values of each miR in lung tumors with the corresponding median for other tumors from the following tissues: bile duct, bladder, breast, colon, kidney, liver, lung, ovary, pancreas, and prostate.
  • Table 11A miR expression in tumor originating from a specific tissue, in metastasis originating from same tissue, and in median of metastasis originating from other tissues
  • Table 11B miR expression in tumor originating from a specific tissue, in metastasis of same tissue to the lung, in normal lung tissue, in lung primary tumors, and in median of metastasis of other origins
  • Table 11C miR expression in tumor originating from a specific tissue, in metastasis of same tissue to the lymph node, in normal lymph node tissue and in median of metastasis of other origins
  • Example 3.6 Differential expression of miRNAs in blood vs. colon tumor or adjacent colon tissue
  • Table 12 presents data comparing expression of a list of miRNAs measured by log2(signal).
  • Colon tumor expression refers to expression in a colon tumor.
  • Fold change blood vs. colon tumor signal refers to the fold change of expression in a blood tissue vs. colon tumor.
  • Expression in adjacent colon refers to expression in colon tissue adjacent to the specific tumor. This is shown graphically in Figure 2.
  • Fold change blood vs. adjacent site refers to the fold change obtained upon comparison of the blood tissue vs. adjacent colon tissue.
  • SEQ ID NOS: 21, 68, 111 and 174 exhibit high fold change in expression in blood cells or whole blood or both when compared to expression in both colon tumor tissue and colon tissue adjacent to the tumor site.
  • SEQ ID NO: 92 exhibits low fold change in expression in both blood cells and whole blood when compared to expression in both colon tumor tissue and colon tissue adjacent to the tumor site.
  • Table 12 Differential expression of miRNAs in blood vs. colon tumor or adjacent colon tissue
  • Example 4 Deep sequencing of small RNAs from solid tumor samples
  • Example 4.1 Expression and sequence variability of known microRNAs
  • the sequencing process yielded 141,023 sequences from the bladder+breast tumor pool and 90,986 sequences from the colon+lung pool. After combining identical sequences, 27,968 unique sequences remained, 81% of which are 17-26 nt long, accounting for 93% of all redundant sequences. As described in Example 1, these sequences were further analyzed by aligning each sequence, using BLAST, to the human genome allowing a maximum of three nucleotides mismatched relative to the genome and a maximum insertion/deletion of three base pairs.
  • the small RNA libraries were found to be enriched with human miRNAs.
  • Known miRNAs occupied 61% (140,255 of 230,740) of the total small RNA reads.
  • Three hundred and eighty-seven out of 885 (44%) human miRNAs were sequenced in at least one read in the different tumor libraries.
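
The two percentages quoted above follow directly from the stated counts. A minimal Python check of the arithmetic, using only numbers given in the text (the variable names are illustrative):

```python
# Counts as stated in the text.
known_mirna_reads = 140_255
total_small_rna_reads = 230_740
mirnas_detected = 387
mirnas_in_reference = 885

# Percentages rounded to the nearest whole number, as reported.
read_fraction = round(100 * known_mirna_reads / total_small_rna_reads)
detected_fraction = round(100 * mirnas_detected / mirnas_in_reference)
```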
  • Most miRNAs were sequenced in several sequence variants that were previously referred to as isomiRs.
  • the different isomiRs were predominantly variable in the 3' end of the mature miRNA sequence, a region which is less precisely defined than the miRNA 5' end.
  • the most abundant isomiR in the cancer tissue survey was much more abundant (at least 20%) than the reference miRNA sequence from miRBase database. This suggests that the relative abundance of isomiRs may be inherently different between normal tissue and tumors.
  • several known miRNAs had an abundant isomiR with at least one mismatch to the human genome sequence, suggestive of the discovery of novel miRNA-related SNPs/cancer mutations or post-transcriptional modification of the miRNAs.
  • isomiRs were expressed in at least the same number of reads as the miRBase isomiR. Most of the sequence modifications (69%) occurred in the 3' end of the miRNA and involved either DNA base modification, 3' uridylation or 3' adenylation. 3' additions of G or C were completely absent. The high abundance and the specificity of the 3' terminal single nucleotide insertions suggest that these are regulated post-transcriptional modifications and not DNA-level changes (SNPs/mutations), which are expected to occur in a more random manner. Several sequence modifications that occur internally within the miRNA sequence were also noted.
  • isomiRs demonstrated primarily (77%) C->T or A->G nucleotide modifications, again suggesting involvement of post-transcriptional RNA editing by cytidine deaminase or ADAR enzymes, respectively, contrary to DNA level changes.
  • Example 4.2 Deep sequence identification of novel miRNAs and miRNA-like small
  • miRNA star sequences are ~22-nt RNA species nearly complementary to a known miRNA, which are located within the miRNA precursor and which may have an inhibitory activity.
  • MORs miRNA-offset RNAs
  • MORs sequenced in the human tumors are highly conserved, derived exclusively from the 5' stem of the miRNA precursor directly upstream to the 5' miRNA, and lowly expressed relative to the main miRNA product of the precursor.
  • the MORs identified here tend to be located in a region of lower dsRNA stability than the main miRNA-miRNA star pair of the miRNA precursor. Therefore, the miRNA precursor of a MOR may switch between different folded RNA structures, only part of which accommodates the MOR in a dsRNA region that would be processed by the canonical miRNA pathway. This may explain the relatively low expression of MORs in comparison to the main mature miRNAs of the precursors.
  • Example 4.2.2 miRNAs derived from novel miRNA precursors
  • An additional group of novel miRNAs and miRNA-like small RNAs expressed in the tested cancer tissues includes completely novel miRNAs from novel miRNA precursors. Only reads that were exactly mapped to the genome were used. Reads that were mapped to more than 10 loci were filtered out, since human miRNAs rarely map to more than a few genomic loci. Other reasons for which sequences were discarded include rare occurrence (i.e. very few reads), length exceeding normal miRNA length and %GC higher than the %GC of known miRNAs. After filtering out by these criteria, as well as filtering sequences located within already annotated sequences (known miRNAs, other small RNAs, transposons, coding exons), miRNA precursors were predicted by folding
  • miRNA-like small RNAs expressed in the tested cancer tissues, contained miRNA-like sequences derived from annotated small RNAs and genomic repeats.
  • miRNAs were previously described as having been derived from such genetic elements. Sequences whose length exceeded the conventional size of miRNAs (17-25 bp) were discarded. MiRNA precursors were predicted using RNAfold and mFold and the precursor score described above. Finally, only sequences with at least 10 reads were taken, in order to ensure that the identified novel miRNAs were likely to be consistent products of enzymatic excision and not rare degradation products.
  • MID-24078 (SEQ ID NO: 495) is derived from a local hairpin-fold of an Alu repeat.
  • MID-19434 (SEQ ID NO: 93)] are, interestingly, derived from Y RNAs.
  • Y RNAs are relatively unexplored noncoding RNA species that are implicated in chromosomal DNA replication (Krude, T., et al., J Cell Sci, 122, 2836-2845) and RNA quality control (Sim, et al., (2009) Mol Biol Cell, 20, 1555-1564).
  • Y RNAs have been shown to be over-expressed in solid tumors (Christov et al, Br J Cancer 2008;98(5):981-988), and thus may have potential for the diagnosis of cancer.
  • MID-19434 (SEQ ID NO: 93) is a 25-nt long RNA derived from a ~100 nucleotide-long hY3 RNA-like sequence. This sequence was highly expressed, with 200 sequenced reads, which is more abundant than over 300 known miRNAs sequenced in the analyzed tumor samples.
  • the predicted well-folded precursor of this miRNA (SEQ ID NOS: 235-242) is precisely aligned to the hY3 RNA (GenBank number NR_004392.1), suggesting that the Y RNA is processed, possibly by Dicer, to yield a 25 bp mature miRNA.
  • MID-19433 (SEQ ID NO: 92) is derived from hairpin-folded hY1 Y RNAs (GenBank number NR_004391.1).
  • Endogenous siRNAs are ~21-bp-long RNA species that are processed from a dsRNA by Dicer and assembled in the RNA induced silencing complex (RISC). These were recently described in mouse oocytes (Watanabe, T., et al. (2008) Nature, 453, 539-543), but have not yet been identified in the human transcriptome.
  • the identified candidate human endogenous siRNA is a ~20-nt dsRNA that could be derived from bidirectional transcription of the same locus.
  • MID-19433 SEQ ID NO: 92
  • MID-19434 SEQ ID NO: 93
  • MID-16489 SEQ ID NO: 31
  • MID-19433 SEQ ID NO: 92
  • MID-19434 SEQ ID NO: 93
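
The read-processing steps described in Example 4 above (combining identical sequences, then discarding candidates by length, read count and number of genomic loci) can be illustrated with a minimal Python sketch. It is not the applicants' pipeline: the function names, the `loci_per_seq` input and the `max_gc` cutoff are assumptions introduced for illustration. Only the 17-25 nt length window, the 10-read minimum and the 10-locus limit are taken from the text, which does not state a numeric %GC threshold.

```python
from collections import Counter


def collapse_reads(reads):
    """Combine identical reads into {sequence: read_count}."""
    return Counter(reads)


def gc_fraction(seq):
    """Fraction of G or C bases in a sequence."""
    return sum(base in "GC" for base in seq.upper()) / len(seq)


def filter_candidates(counts, loci_per_seq, min_len=17, max_len=25,
                      min_reads=10, max_loci=10, max_gc=0.7):
    """Keep sequences that pass the mapping, length, abundance and GC filters.

    loci_per_seq maps each sequence to its number of exact genomic hits
    (hypothetical input; 0 or missing means "not exactly mapped").
    max_gc is a placeholder value, not a figure from the source.
    """
    kept = {}
    for seq, n_reads in counts.items():
        loci = loci_per_seq.get(seq, 0)
        if loci == 0 or loci > max_loci:       # unmapped or multi-mapping
            continue
        if not min_len <= len(seq) <= max_len:  # outside miRNA-like length
            continue
        if n_reads < min_reads:                 # too rare to trust
            continue
        if gc_fraction(seq) > max_gc:           # GC content too high
            continue
        kept[seq] = n_reads
    return kept
```

Precursor prediction by RNA folding, which the text applies after these filters, is deliberately left out of the sketch.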

Abstract

Disclosed are microRNA molecules, as well as various nucleic acid molecules relating thereto or derived therefrom. Further disclosed are methods and compositions that can be used for diagnosis of cancer.

Description

NUCLEIC ACID SEQUENCES RELATED TO CANCER
CROSS REFERENCE TO RELATED APPLICATIONS
The present application claims priority under 35 U.S.C. § 119(e) to U.S. Provisional Applications No. 61/236,090 filed Aug. 23, 2009 and No. 61/330,920 filed May 4, 2010, which are herein incorporated by reference in their entirety.
FIELD OF THE INVENTION
The invention relates to microRNA molecules, as well as various nucleic acid molecules relating thereto or derived therefrom. The invention also relates to methods and compositions that can be used for diagnosis of cancer.
BACKGROUND OF THE INVENTION
microRNAs (miRNAs, miRs) are endogenous non-coding small RNAs that negatively regulate gene expression by interfering with the translation of coding messenger RNAs (mRNAs) in a sequence-specific manner, thereby playing a critical role in the control of gene expression during development and tissue homeostasis (Yi et al., Nat Genet 2006;38:356-362). Certain miRNAs have been shown to be deregulated in human cancer, and their specific over- or under-expression has been shown to correlate with particular tumor types (Calin and Croce, Nat Rev Cancer 2006;6:857-866), as well as predict patient outcome (Yu et al, Cancer Cell 2008;13:48-57). In some cases miRNA over-expression results in reduced expression of tumor suppressor genes, while loss of miRNA expression often leads to oncogene activation. Recent work has shown an essential role for miRNA deregulation in breast cancer metastasis, and in other tumor types. Differential expression of miRNAs may therefore indicate specific disease states.
Deep sequencing is a method of high-throughput DNA sequencing using a novel highly parallel sequencing-by-synthesis approach, which allows rapid sequencing of millions of bases, and even whole genomes. The technique can be used to sequence any double-stranded DNA and can be used for de novo whole genome sequencing, re-sequencing of whole genomes and target DNA regions, metagenomics and RNA analysis. It is based on an emulsion-based method, in which short adaptors are ligated onto the ends of sequence fragments, which are then immobilized onto beads. The beads are then emulsified with the amplification reagents in a water-in-oil mixture, and are clonally amplified within the emulsion droplets. Sequencing-by-synthesis is then performed by pyrosequencing in wells on a fibreoptic slide.
Deep sequencing methods have been widely used in recent years. These high throughput and highly sensitive sequencing methods include Roche Applied Sciences (454) GS, Illumina's Solexa 1G sequencer, and Applied Biosystem's SOLiD system. Deep sequencing can be used for the discovery of novel miRNA species and other small RNAs that are missed by traditional sequencing of small RNA libraries. Human microRNAs were previously identified using deep sequencing (Bar, M. et al. (2008) Stem Cells, 26, 2496-2505). However, the miRNA content of solid human tumors has only been partially explored using these methods and yet-unknown miRNAs and other small RNAs may be part of the tumor transcriptome.
Deep sequencing may be used to identify miRNAs and their differential expression in tissue samples, and may thus aid in distinguishing between primary tumors and cancer metastasis. Being able to distinguish between primary tumors and cancer metastasis, as well as distinguishing between metastases of different origins, has practical importance for choice of therapy. Diagnosis of specific tumors is also of great importance when choosing appropriate treatment. miR analogs, as well as anti-sense sequences of miRs, were recently shown to be useful as therapeutic agents in several cancers. The miRNA content of solid human tumors has only been partially explored using these methods and yet-unknown miRNAs and other small RNAs may be part of the tumor transcriptome. Thus, there exists a need to identify nucleic acid sequences which will aid in cancer diagnosis.
SUMMARY OF THE INVENTION
The present invention is based in part on deep sequencing analysis of miRNAs from tumor specimens of different types. A computational approach was used to identify known miRNA sequences, miRNA sequence variants (isomiRs), and novel small RNA species in these tumors. Subsequently, normal and tumor samples from various tissue types were hybridized to a miRNA-microarray containing the novel miRNAs and known miRNAs. Some of the novel miRNAs are abundantly expressed in different types of tumors and others are expressed differently between tumor and non-tumor samples, between different tumor stages or between different types of tumors. In addition, using RT-PCR as a third platform, the expression of several novel small RNAs was confirmed in normal human serum. These new cancer miRNA candidates can potentially be used as diagnostic biomarkers or therapeutic targets in different types of cancer.
The present invention provides nucleic acid sequences related to cancer, and methods and compositions that can be used for diagnosis of cancer.
In one embodiment, the present invention provides an isolated nucleic acid comprising a sequence selected from the group consisting of:
(a) SEQ ID NOS: 31, 92-93, 450-454, 1, 2, 4-7, 9-26, 28-30, 32-35, 37-43, 46-72, 74-81, 83-91, 95-102, 104-108, 110-114, 116-157, 159-175 and 472-495;
(b) a DNA encoding (a);
(c) the complementary sequence of any one of (a) and (b); and
(d) a sequence at least 80% identical to (a) - (c),
wherein said nucleic acid is 16-26 nucleotides in length.
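As a rough illustration of the sequence relationships recited in items (c) and (d) above, the following Python sketch computes a complementary RNA sequence and a simple percent identity. It rests on assumptions not made in the text: identity is counted position by position without gaps and divided by the longer length, the complement is returned as a reverse complement, and all function names are hypothetical. It is a sketch for orientation only and does not define how identity is determined for the purposes of this document.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def percent_identity(a, b):
    """Ungapped identity: matching positions divided by the longer length."""
    a, b = a.upper(), b.upper()
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / max(len(a), len(b))


def within_limits(candidate, reference, min_identity=80.0,
                  min_len=16, max_len=26):
    """True if the candidate has the stated length and identity to the reference."""
    if not min_len <= len(candidate) <= max_len:
        return False
    return percent_identity(candidate, reference) >= min_identity
```

In practice, identity between nucleic acids is usually computed from a gapped alignment, so this position-wise count would understate identity for sequences that differ by an insertion or deletion.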
In one embodiment, the present invention provides an isolated nucleic acid comprising a sequence selected from the group consisting of:
(a) SEQ ID NOS: 31, 92-93, 450-454, 1, 2, 4-7, 9-26, 28-30, 32-35, 37-43, 46-72, 74-81, 83-91, 95-102, 104-108, 110-114, 116-157, 159-175 and 472-495;
(b) a DNA encoding (a);
(c) the complementary sequence of any one of (a) and (b); and
(d) a sequence at least 90% identical to (a) - (c),
wherein said nucleic acid is 16-26 nucleotides in length.
In one embodiment, the present invention provides an isolated nucleic acid comprising a sequence selected from the group consisting of:
(a) SEQ ID NOS: 192, 232-242, 176-187, 189-191, 193-196, 198-231, 243-267, 269-326, 328-330, 332-340, 342, 343, 345-350, 352-361, 363-369, 371-387, 389-413, 415-432, 434-438, 440-449 and 496-514;
(b) a DNA encoding (a);
(c) the complementary sequence of any one of (a) and (b); and
(d) a sequence at least 80% identical to (a) - (c),
wherein said nucleic acid is 50-150 nucleotides in length.
In one embodiment, the present invention provides an isolated nucleic acid comprising a sequence selected from the group consisting of:
(a) SEQ ID NOS: 192, 232-242, 176-187, 189-191, 193-196, 198-231, 243-267, 269-326, 328-330, 332-340, 342, 343, 345-350, 352-361, 363-369, 371-387, 389-413, 415-432, 434-438, 440-449 and 496-514;
(b) a DNA encoding (a);
(c) the complementary sequence of any one of (a) and (b); and
(d) a sequence at least 90% identical to (a) - (c),
wherein said nucleic acid is 50-150 nucleotides in length.
In one embodiment, the present invention provides an isolated nucleic acid comprising an endogenous human siRNA. In one embodiment, the endogenous human siRNA comprises a sequence selected from the group consisting of:
(a) SEQ ID NOS: 450-454;
(b) a DNA encoding (a);
(c) the complementary sequence of any one of (a) and (b); and
(d) a sequence at least 80% identical to (a) - (c),
wherein said nucleic acid is 16-26 nucleotides in length.
In one embodiment, the present invention provides an isolated nucleic acid comprising an endogenous human siRNA. In one embodiment, the endogenous human siRNA comprises a sequence selected from the group consisting of:
(a) SEQ ID NOS: 450-454;
(b) a DNA encoding (a);
(c) the complementary sequence of any one of (a) and (b); and
(d) a sequence at least 90% identical to (a) - (c),
wherein said nucleic acid is 16-26 nucleotides in length.
In one embodiment, the present invention provides an isolated nucleic acid comprising a sequence selected from the group consisting of:
(a) SEQ ID NOS: 53 and 162;
(b) SEQ ID NOS: 70 and 110;
(c) SEQ ID NOS: 14 and 120;
(d) SEQ ID NOS: 63, 106 and 58; and
(e) SEQ ID NOS: 135 and 159.
In one embodiment, the isolated nucleic acid of the invention is a modified oligonucleotide. In one embodiment, the invention provides a composition comprising an isolated nucleic acid of the invention. According to one embodiment the composition is suitable for diagnostic applications. According to another embodiment the composition is suitable for therapeutic applications. According to some embodiments the composition further comprises a pharmaceutically acceptable carrier. According to one embodiment the composition is a marker or modulator of cancer.
In one embodiment, the invention provides a recombinant expression vector comprising an isolated nucleic acid of the invention. According to another embodiment, the invention provides a probe comprising an isolated nucleic acid of the invention. According to another embodiment, the invention provides a biochip comprising the probe of the invention. According to another embodiment, the invention provides a host cell comprising an isolated nucleic acid of the invention.
According to another embodiment, the invention provides a method for diagnosing a cancer in a subject comprising:
(a) obtaining a biological sample from said subject;
(b) determining an expression profile of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 31, 4, 7, 11, 16, 17, 21, 22, 23, 26, 30, 33-35, 37, 39, 46, 47-49, 51, 52, 53, 56, 58, 60, 63, 64, 66, 68, 71, 72, 74, 76-78, 83, 86-88, 90, 96, 98, 100, 101, 106, 110-114, 116, 117, 119-121, 127, 129, 130, 132-136, 138, 141, 144, 145, 147, 148, 152, 153, 157, 159-162, 165, 167, 169-174, a fragment thereof, and a sequence having at least about 80% identity thereto from said sample; and
(c) comparing said expression profile to a reference expression profile representing the expression levels of any of said nucleic acids in healthy controls,
wherein the comparison of said expression profile to said reference expression allows for diagnosis of said cancer.
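The comparison in step (c) can be pictured with a short Python sketch that expresses each sequence's signal as a log2 fold change against the reference profile and labels it relatively high or low. This is a sketch under stated assumptions, not the method's actual decision rule: the function names, the dictionary layout and the `threshold` of one log2 unit (a two-fold change) are hypothetical, and signals are assumed to be positive linear-scale values.

```python
import math


def log2_fold_change(sample_signal, reference_signal):
    """log2 of the sample-to-reference signal ratio (both must be > 0)."""
    return math.log2(sample_signal / reference_signal)


def compare_to_reference(sample_profile, reference_profile, threshold=1.0):
    """Label each sequence 'high', 'low' or 'unchanged' versus the reference.

    threshold is a hypothetical cutoff in log2 units; the source does not
    specify how large a difference counts as relatively high or low.
    """
    calls = {}
    for seq_id, signal in sample_profile.items():
        lfc = log2_fold_change(signal, reference_profile[seq_id])
        if lfc >= threshold:
            calls[seq_id] = "high"
        elif lfc <= -threshold:
            calls[seq_id] = "low"
        else:
            calls[seq_id] = "unchanged"
    return calls
```

A real analysis would also need normalization between samples and a statistical test across replicates, neither of which is shown here.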
According to some embodiments, the cancer is selected from the group consisting of colon, bladder, breast, lung, liver, kidney, ovarian, prostate, esophagus, cervix, and pancreatic cancer. According to some embodiments relatively high expression levels of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 31, 4, 7, 11, 17, 21, 22, 23, 26, 30, 33, 35, 37, 39, 46, 48, 49, 51, 52, 53, 56, 58, 60, 63, 64, 66, 68, 71, 72, 74, 76-78, 86-88, 90, 96, 98, 100, 101, 106, 110-114, 116, 117, 119-121, 127, 129, 130, 132-136, 138, 141, 144, 145, 147, 148, 152, 153, 157, 159-162, 165, 167 and 169-174, as compared to said reference expression profile, is indicative of cancer. According to another embodiment, relatively low expression levels of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 16, 34, 47 and 83, as compared to said reference expression profile, is indicative of cancer.
According to another embodiment, the invention provides a method of diagnosing an increased risk of colon cancer in a subject comprising:
(a) obtaining a biological sample from said subject;
(b) determining an expression profile of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 21, 68, 111 and 174, a fragment thereof, and a sequence having at least about 80% identity thereto from said sample; and
(c) comparing said expression profile to a reference expression profile representing the expression levels of any of said nucleic acids in healthy controls,
wherein a high expression level of said nucleic acid sequence is indicative of an increased risk of colon cancer in a subject.
According to another embodiment, the invention provides a method of diagnosing an increased risk of colon cancer in a subject comprising:
(a) obtaining a biological sample from said subject;
(b) determining an expression profile of SEQ ID NO: 92, a fragment thereof, and a sequence having at least about 80% identity thereto from said sample; and
(c) comparing said expression profile to a reference expression profile representing the expression level of said nucleic acid in healthy controls, wherein a low expression level of said nucleic acid sequence is indicative of an increased risk of colon cancer in a subject.
According to another embodiment, the invention provides a method of diagnosing colon cancer in a subject comprising:
(a) obtaining a biological sample from said subject;
(b) determining an expression profile of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 2, 15, 28, 29, 48, 58, 61, 70, 86, 97, 107, 110, 150, 156, 170 and 172, a fragment thereof, and a sequence having at least about 80% identity thereto from said sample; and
(c) comparing said expression profile to a reference expression profile representing the expression level of said nucleic acid in healthy controls, wherein relatively high expression levels of any of said nucleic acid sequences, as compared to said reference expression profile, is indicative of colon cancer.
According to another embodiment, the invention provides a method of diagnosing lung cancer in a subject comprising:
(a) obtaining a biological sample from said subject;
(b) determining an expression profile of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 1, 7, 19, 21, 22, 26, 30, 35, 37, 39, 43, 46, 51, 53, 58, 59, 63, 64, 68, 71, 72, 74, 78, 86, 87, 106, 110, 113, 116, 119, 125, 127, 132, 135, 141, 153, 159, 161, 164, 165 and 170-173, a fragment thereof, and a sequence having at least about 80% identity thereto from said sample; and
(c) comparing said expression profile to a reference expression profile representing the expression level of said nucleic acid in healthy controls, wherein relatively high expression levels of any of said nucleic acid sequences, as compared to said reference expression profile, is indicative of lung cancer.
According to another embodiment, the invention provides a method of diagnosing bladder cancer in a subject comprising:
(a) obtaining a biological sample from said subject;
(b) determining an expression profile of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 7, 26, 35, 37, 39, 53, 71, 72, 125, 127, 130, 132, 135, 159, 161, 165, 170 and 172, a fragment thereof, and a sequence having at least about 80% identity thereto from said sample; and
(c) comparing said expression profile to a reference expression profile representing the expression level of said nucleic acid in healthy controls, wherein relatively high expression levels of any of said nucleic acid sequences, as compared to said reference expression profile, is indicative of bladder cancer.
According to another embodiment, the invention provides a method of diagnosing bladder cancer in a subject comprising:
(a) obtaining a biological sample from said subject;
(b) determining an expression profile of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 34 and 83, a fragment thereof, and a sequence having at least about 80% identity thereto from said sample; and
(c) comparing said expression profile to a reference expression profile representing the expression level of said nucleic acid in healthy controls, wherein relatively low expression levels of any of said nucleic acid sequences, as compared to said reference expression profile, is indicative of bladder cancer.
According to another embodiment, the invention provides a method of diagnosing liver cancer in a subject comprising:
(a) obtaining a biological sample from said subject;
(b) determining an expression profile of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 11, 26, 71, 72, 77, 98, 134, 136, 153, 160, 170 and 171, a fragment thereof, and a sequence having at least about 80% identity thereto from said sample; and
(c) comparing said expression profile to a reference expression profile representing the expression level of said nucleic acid in healthy controls, wherein relatively high expression levels of any of said nucleic acid sequences, as compared to said reference expression profile, is indicative of liver cancer.
According to another embodiment, the invention provides a method of diagnosing liver cancer in a subject comprising:
(a) obtaining a biological sample from said subject;
(b) determining an expression profile of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 12, 14, 28-32, 34, 42, 43, 52-55, 67, 75, 84, 85, 89, 97, 105, 123-126, 128, 129, 137 and 140, a fragment thereof, and a sequence having at least about 80% identity thereto from said sample; and
(c) comparing said expression profile to a reference expression profile representing the expression level of said nucleic acid in healthy controls, wherein relatively low expression levels of any of said nucleic acid sequences, as compared to said reference expression profile, is indicative of liver cancer.
According to another embodiment, the invention provides a method of diagnosing an endometrial metastasis in a subject comprising:
(a) obtaining a biological sample from said subject;
(b) determining an expression profile of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 22, 25, 29, 31, 37, 39, 53, 64, 68, 72, 76, 77, 78, 84, 113, 119, 121, 127, 130, 132, 133, 136, 153, 161, 170 and 171, a fragment thereof, and a sequence having at least about 80% identity thereto from said sample; and
(c) comparing said expression profile to a reference expression profile representing the expression level of said nucleic acid in healthy controls, wherein relatively high expression levels of any of said nucleic acid sequences, as compared to said reference expression profile, is indicative of endometrial metastasis.
According to another embodiment, the invention provides a method of diagnosing an endometrial metastasis in a subject comprising:
(a) obtaining a biological sample from said subject;
(b) determining an expression profile of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 16 and 57, a fragment thereof, and a sequence having at least about 80% identity thereto from said sample; and
(c) comparing said expression profile to a reference expression profile representing the expression level of said nucleic acid in healthy controls, wherein relatively low expression levels of any of said nucleic acid sequences, as compared to said reference expression profile, is indicative of endometrial metastasis.
According to another embodiment, the invention provides a method of diagnosing kidney cancer in a subject comprising:
(a) obtaining a biological sample from said subject;
(b) determining an expression profile of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 26, 37, 39, 53, 71, 72, 98, 125, 127, 130, 135, 159 and 165, a fragment thereof, and a sequence having at least about 80% identity thereto from said sample; and
(c) comparing said expression profile to a reference expression profile representing the expression level of said nucleic acid in healthy controls, wherein relatively high expression levels of any of said nucleic acid sequences, as compared to said reference expression profile, is indicative of kidney cancer.
According to another embodiment, the invention provides a method of diagnosing kidney cancer in a subject comprising:
(a) obtaining a biological sample from said subject;
(b) determining an expression profile of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 16, 34, 83 and 140, a fragment thereof, and a sequence having at least about 80% identity thereto from said sample; and
(c) comparing said expression profile to a reference expression profile representing the expression level of said nucleic acid in healthy controls, wherein relatively low expression levels of any of said nucleic acid sequences, as compared to said reference expression profile, is indicative of kidney cancer.
According to another embodiment, the invention provides a method of diagnosing breast cancer in a subject comprising:
(a) obtaining a biological sample from said subject;
(b) determining an expression profile of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 79, 99, 122, 130, 153 and 154, a fragment thereof, and a sequence having at least about 80% identity thereto from said sample; and
(c) comparing said expression profile to a reference expression profile representing the expression level of said nucleic acid in healthy controls, wherein relatively low expression levels of any of said nucleic acid sequences, as compared to said reference expression profile, is indicative of breast cancer.
According to some embodiments, the invention provides a method to distinguish between a primary lung tumor and a metastasis to the lung, said method comprising:
(a) obtaining a biological sample from said subject;
(b) determining an expression profile of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 31, 92, 1, 2, 6, 7, 10, 11, 13, 17, 19-24, 28-30, 33, 40, 42, 43, 46, 49, 51, 56-59, 61, 63, 64, 68, 70, 71, 74, 75, 77, 78, 80, 81, 86-88, 90-91, 95, 96, 97, 100, 104, 106, 107, 110-113, 116, 117, 119, 122, 129, 133, 138, 139, 141, 142, 144, 145-152, 154, 162, 164, 165, 169, 171-173, 175, a fragment thereof, and a sequence having at least about 80% identity thereto from said sample;
(c) comparing said expression profile to a reference expression profile, wherein relatively high expression levels of any of said nucleic acid sequences, as compared to said reference expression profile, is indicative of a primary lung tumor.
According to some embodiments, the origin of the metastasis to the lung is selected from the group consisting of endometrium, kidney, larynx, melanocyte and salivary gland.
According to some embodiments, the subject of the invention is a human. According to some embodiments, the method of the invention is used to determine a course of treatment for said subject.
According to some embodiments, the biological sample obtained in the method of the invention is selected from the group consisting of bodily fluid, a cell line and a tissue sample. According to some embodiments, the bodily fluid is selected from the group consisting of whole blood and serum. According to some embodiments, the tissue is a fresh, frozen, fixed, wax-embedded or formalin fixed paraffin-embedded (FFPE) tissue.
According to some embodiments, the expression levels are determined by a method selected from the group consisting of nucleic acid hybridization, nucleic acid amplification, and a combination thereof. According to some embodiments, the nucleic acid hybridization is performed using a solid-phase nucleic acid biochip array or in situ hybridization. According to some embodiments, the nucleic acid amplification method is real-time PCR. According to some embodiments, the real-time PCR method comprises forward and reverse primers. According to some embodiments, the forward primer comprises a sequence selected from the group consisting of SEQ ID NOS: 466, 467, 469, a fragment thereof and a sequence at least about 80% identical thereto. According to some embodiments, the real-time PCR method further comprises a probe. According to some embodiments, the probe comprises a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS: 31, 4, 7, 11, 16, 17, 21, 22, 23, 26, 30, 33-35, 37, 39, 46, 47-49, 51, 52, 53, 56, 58, 60, 63, 64, 66, 68, 71, 72, 74, 76-78, 83, 86-88, 90, 96, 98, 100, 101, 106, 110-114, 116, 117, 119-121, 127, 129, 130, 132-136, 138, 141, 144, 145, 147, 148, 152, 153, 157, 159-162, 165, 167, 169-174, a fragment thereof and a sequence at least about 80% identical thereto.
According to another embodiment, the invention provides a kit for diagnosing a cancer in a subject, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS: 31, 4, 7, 11, 16, 17, 21, 22, 23, 26, 30, 33-35, 37, 39, 46, 47-49, 51, 52, 53, 56, 58, 60, 63, 64, 66, 68, 71, 72, 74, 76-78, 83, 86-88, 90, 96, 98, 100, 101, 106, 110-114, 116, 117, 119-121, 127, 129, 130, 132-136, 138, 141, 144, 145, 147, 148, 152, 153, 157, 159-162, 165, 167, 169-174, a fragment thereof and a sequence at least about 80% identical thereto.
According to another embodiment, the invention provides a kit for diagnosing an increased risk of colon cancer in a subject, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS: 21, 68, 92, 111 and 174, a fragment thereof and a sequence at least about 80% identical thereto.
According to another embodiment, the invention provides a kit for diagnosing colon cancer in a subject, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS: 2, 15, 28, 29, 48, 58, 61, 70, 86, 97, 107, 110, 150, 156, 170 and 172, a fragment thereof and a sequence at least about 80% identical thereto.
According to another embodiment, the invention provides a kit for diagnosing lung cancer in a subject, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS: 1, 7, 19, 21, 22, 26, 30, 35, 37, 39, 43, 46, 51, 53, 58, 59, 63, 64, 68, 71, 72, 74, 78, 86, 87, 106, 110, 113, 116, 119, 125, 127, 132, 135, 141, 153, 159, 161, 164, 165 and 170-173, a fragment thereof and a sequence at least about 80% identical thereto.
According to another embodiment, the invention provides a kit for diagnosing bladder cancer in a subject, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS: 7, 26, 34, 35, 37, 39, 53, 71, 72, 83, 125, 127, 130, 132, 135, 159, 161, 165, 170 and 172, a fragment thereof and a sequence at least about 80% identical thereto.
According to another embodiment, the invention provides a kit for diagnosing liver cancer in a subject, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS: 11, 12, 14, 26, 28-32, 34, 42, 43, 52-55, 67, 71, 72, 75, 77, 84, 85, 89, 97, 89, 105, 123-126, 128, 129, 134, 136, 137, 140, 153, 160, 170 and 171, a fragment thereof and a sequence at least about 80% identical thereto.
According to another embodiment, the invention provides a kit for diagnosing a subject with endometrial metastasis, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS: 16, 22, 25, 29, 31, 37, 39, 53, 57, 64, 68, 72, 76, 77, 78, 84, 113, 119, 121, 127, 130, 132, 133, 136, 153, 161, 170 and 171, a fragment thereof and a sequence at least about 80% identical thereto.
According to another embodiment, the invention provides a kit for diagnosing kidney cancer in a subject, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS: 16, 26, 34, 37, 39, 53, 71, 72, 83, 98, 125, 127, 130, 135, 140, 159 and 165, a fragment thereof and a sequence at least about 80% identical thereto.
According to another embodiment, the invention provides a kit for diagnosing breast cancer in a subject, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS: 79, 99, 122, 130, 153 and 154, a fragment thereof and a sequence at least about 80% identical thereto.
According to another embodiment, the invention provides a kit for distinguishing between a primary lung tumor and a metastasis to the lung, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS: 31, 92, 1, 2, 6, 7, 10, 11, 13, 17, 19-24, 28-30, 33, 40, 42, 43, 46, 49, 51, 56-59, 61, 63, 64, 68, 70, 71, 74, 75, 77, 78, 80, 81, 86-88, 90-91, 95, 96, 97, 100, 104, 106, 107, 110-113, 116, 117, 119, 122, 129, 133, 138, 139, 141, 142, 144, 145-152, 154, 162, 164, 165, 169, 171-173, 175, a fragment thereof and a sequence at least about 80% identical thereto.
These and other embodiments of the present invention will become apparent in conjunction with the figures, description and claims that follow.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 shows differential expression of miRs (in log2 (fluorescence units)), comparing the median values of each miR in breast primary tumor (y-axis) with breast metastases into lymph nodes (x-axis). Median normalized fluorescence for each miRNA (black crosses) indicates expression levels as measured by microarray. Squares represent differentially expressed miRs. The parallel lines describe a fold change between groups of 1.5 in either direction.
Figure 2 shows differential expression of miRs (in log2 (fluorescence units)), comparing the median values of each miR in colon tumors (y-axis) with the corresponding median for their adjacent tissues (x-axis). Median normalized fluorescence for each miRNA (black crosses) indicates expression levels as measured by microarray. Squares represent differentially expressed miRs. The parallel lines describe a fold change between groups of 1.5 in either direction.
Figure 3 shows differential expression of miRs (in log2 (fluorescence units)), comparing the median values of each miR in lung tumors (y-axis) with the corresponding median for other tumors from the following tissues: bile duct, bladder, breast, colon, kidney, liver, lung, ovary, pancreas, and prostate (x-axis). Median normalized fluorescence for each miRNA (black crosses) indicates expression levels as measured by microarray. Squares represent differentially expressed miRs. The parallel lines describe a fold change between groups of 1.5 in either direction.
Figure 4. Expression in RT-PCR of novel miRNAs and small RNAs, in human serum. RNA was measured in sera of 19 normal humans and in negative control not containing RNA. Shown is the median of expression signals (y-axis, in units of 42-Ct) for each miR in all tested samples. Black bars show expression in experimental samples, and white bars show expression in negative controls.
DETAILED DESCRIPTION
The present invention extends the current knowledge of the tumor small RNA transcriptome and provides novel candidates for molecular biomarkers and drug targets.
This work demonstrates that a sensitive method of next-generation sequencing on RNA extracted from tumors, combined with careful computational analysis and followed by verification experiments, can identify previously unknown sequences such as the new miRNAs, miRNA-offset RNAs (MORs), Y-RNA derived sequences and endogenous siRNAs presented in this analysis. The identification of such tumor-specific small RNAs could lead to the development of new therapeutic targets, which may be utilized as a treatment more specific than the set of tools currently available.
The present invention provides methods and compositions for diagnosis of cancer and cancer metastasis. Other aspects of the invention will become apparent to the skilled artisan by the following description of the invention.
Before the present compositions and methods are disclosed and described, it is to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting. It must be noted that, as used in the specification and the appended claims, the singular forms "a," "an" and "the" include plural referents unless the context clearly dictates otherwise.
For the recitation of numeric ranges herein, each intervening number there between with the same degree of precision is explicitly contemplated. For example, for the range of 6-9, the numbers 7 and 8 are contemplated in addition to 6 and 9, and for the range 6.0-7.0, the numbers 6.0, 6.1, 6.2, 6.3, 6.4, 6.5, 6.6, 6.7, 6.8, 6.9, and 7.0 are explicitly contemplated.
1. Definitions
Attached
"Attached" or "immobilized", as used herein to refer to a probe and a solid support, may mean that the binding between the probe and the solid support is sufficient to be stable under conditions of binding, washing, analysis, and removal. The binding may be covalent or non-covalent. Covalent bonds may be formed directly between the probe and the solid support or may be formed by a cross linker or by inclusion of a specific reactive group on either the solid support or the probe or both molecules. Non-covalent binding may be one or more of electrostatic, hydrophilic, and hydrophobic interactions. Included in non-covalent binding is the covalent attachment of a molecule, such as streptavidin, to the support and the non-covalent binding of a biotinylated probe to the streptavidin. Immobilization may also involve a combination of covalent and non-covalent interactions.
Biological sample
"Biological sample" as used herein means a sample of biological tissue or fluid that comprises nucleic acids. Such samples include, but are not limited to, tissue or fluid isolated from subjects. Biological samples may also include sections of tissues such as biopsy and autopsy samples, FFPE samples, frozen sections taken for histological purposes, blood, plasma, serum, sputum, stool, tears, mucus, hair, and skin. Biological samples also include explants and primary and/or transformed cell cultures derived from animal or patient tissues.
Biological samples may also be blood, a blood fraction, urine, effusions, ascitic fluid, saliva, cerebrospinal fluid, cervical secretions, vaginal secretions, endometrial secretions, gastrointestinal secretions, bronchial secretions, sputum, cell line, tissue sample, cellular content of fine needle aspiration (FNA) or secretions from the breast. A biological sample may be provided by removing a sample of cells from an animal, but can also be provided by using previously isolated cells (e.g., isolated by another person, at another time, and/or for another purpose), or by performing the methods described herein in vivo. Archival tissues, such as those having treatment or outcome history, may also be used.
Cancer
The term "cancer" is meant to include all types of cancerous growths or oncogenic processes, metastatic tissues or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness. Examples of cancers include but are not limited to solid tumors and leukemias, including: apudoma, choristoma, branchioma, malignant carcinoid syndrome, carcinoid heart disease, carcinoma (e.g., Walker, basal cell, basosquamous, Brown-Pearce, ductal, Ehrlich tumor, small cell lung, non-small cell lung (e.g., lung squamous cell carcinoma, lung adenocarcinoma and lung undifferentiated large cell carcinoma), oat cell, papillary, bronchiolar, bronchogenic, squamous cell, and transitional cell), histiocytic disorders, leukemia (e.g., B cell, mixed cell, null cell, T cell, T-cell chronic, HTLV-II-associated, lymphocytic acute, lymphocytic chronic, mast cell, and myeloid), histiocytosis malignant, Hodgkin disease, immunoproliferative small, non-Hodgkin lymphoma, plasmacytoma, reticuloendotheliosis, melanoma, chondroblastoma, chondroma, chondrosarcoma, fibroma, fibrosarcoma, giant cell tumors, histiocytoma, lipoma, liposarcoma, mesothelioma, myxoma, myxosarcoma, osteoma, osteosarcoma, Ewing sarcoma, synovioma, adenofibroma, adenolymphoma, carcinosarcoma, chordoma, craniopharyngioma, dysgerminoma, hamartoma, mesenchymoma, mesonephroma, myosarcoma, ameloblastoma, cementoma, odontoma, teratoma, thymoma, trophoblastic tumor, adenocarcinoma, adenoma, cholangioma, cholesteatoma, cylindroma, cystadenocarcinoma, cystadenoma, granulosa cell tumor, gynandroblastoma, hepatoma, hidradenoma, islet cell tumor, Leydig cell tumor, papilloma, Sertoli cell tumor, theca cell tumor, leiomyoma, leiomyosarcoma, myoblastoma, myosarcoma, rhabdomyoma, rhabdomyosarcoma, ependymoma, ganglioneuroma, glioma, medulloblastoma, meningioma, neurilemmoma, neuroblastoma, neuroepithelioma, neurofibroma, neuroma, paraganglioma, paraganglioma nonchromaffin, angiokeratoma, angiolymphoid hyperplasia with eosinophilia, angioma sclerosing, angiomatosis, glomangioma, hemangioendothelioma, hemangioma, hemangiopericytoma, hemangiosarcoma, lymphangioma, lymphangiomyoma, lymphangiosarcoma, pinealoma, carcinosarcoma, chondrosarcoma, cystosarcoma phyllodes, fibrosarcoma, hemangiosarcoma, leiomyosarcoma, leukosarcoma, liposarcoma, lymphangiosarcoma, myosarcoma, myxosarcoma, ovarian carcinoma, rhabdomyosarcoma, sarcoma (e.g., Ewing, experimental, Kaposi, and mast cell), neurofibromatosis, and cervical dysplasia, and other conditions in which cells have become immortalized or transformed.
Cancer prognosis
A forecast or prediction of the probable course or outcome of the cancer. As used herein, cancer prognosis includes the forecast or prediction of any one or more of the following: duration of survival of a patient susceptible to or diagnosed with a cancer, duration of recurrence-free survival, duration of progression-free survival of a patient susceptible to or diagnosed with a cancer, response rate in a group of patients susceptible to or diagnosed with a cancer, duration of response in a patient or a group of patients susceptible to or diagnosed with a cancer. As used herein, "prognostic for cancer" means providing a forecast or prediction of the probable course or outcome of the cancer. In some embodiments, "prognostic for cancer" comprises providing the forecast or prediction of (prognostic for) any one or more of the following: duration of survival of a patient susceptible to or diagnosed with a cancer, duration of recurrence-free survival, duration of progression-free survival of a patient susceptible to or diagnosed with a cancer, response rate in a group of patients susceptible to or diagnosed with a cancer, and duration of response in a patient or a group of patients susceptible to or diagnosed with a cancer.
Chemotherapeutic
A drug used to treat a disease, especially cancer. In relation to cancer the drugs typically target rapidly dividing cells, such as cancer cells.
Complement
"Complement" or "complementary" as used herein means Watson-Crick (e.g., A-T/U and C-G) or Hoogsteen base pairing between nucleotides or nucleotide analogs of nucleic acid molecules. A full complement or fully complementary may mean 100% complementary base pairing between nucleotides or nucleotide analogs of nucleic acid molecules. In some embodiments, the complementary sequence has a reverse orientation (5'-3').
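The Watson-Crick case of this definition can be illustrated with a minimal Python sketch. The function names are hypothetical and not part of the specification, and Hoogsteen pairing and nucleotide analogs are deliberately left out.

```python
# Watson-Crick pairs only; Hoogsteen pairing and nucleotide analogs
# are outside the scope of this sketch.
DNA_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}
RNA_PAIRS = {"A": "U", "U": "A", "C": "G", "G": "C"}


def reverse_complement(seq, rna=False):
    """Return the complement of seq written in the reverse orientation."""
    pairs = RNA_PAIRS if rna else DNA_PAIRS
    return "".join(pairs[base] for base in reversed(seq.upper()))


def is_fully_complementary(probe, target):
    """True when every probe base pairs with the target read antiparallel.

    T and U are treated as equivalent so that a DNA probe can be
    compared with an RNA target.
    """
    probe_dna = probe.upper().replace("U", "T")
    target_dna = target.upper().replace("U", "T")
    return probe_dna == reverse_complement(target_dna)
```

For example, `reverse_complement("ATGC")` gives `"GCAT"`, so a probe `"GCAT"` is fully complementary to the target `"ATGC"`.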
CT
CT signals represent the first cycle of PCR where amplification crosses a threshold (cycle threshold) of fluorescence. Accordingly, low values of CT represent high abundance or expression levels of the microRNA.
In some embodiments the PCR CT signal is normalized such that the normalized CT remains inversely related to the expression level. In other embodiments the PCR CT signal may be normalized and then inverted such that low normalized-inverted CT represents low abundance or expression levels of the microRNA.
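As a hedged illustration of the normalize-then-invert step, the Python sketch below shifts a raw CT by a reference signal and then subtracts it from a constant. The normalization scheme (a per-sample reference CT compared with a cohort mean) is an assumption made for illustration, as are the function names; the offset of 42 is borrowed from the "42-Ct" units quoted for Figure 4 and is likewise only an example value.

```python
def normalize_ct(raw_ct, sample_reference_ct, cohort_reference_ct):
    """Shift a raw CT by the sample's deviation from the cohort reference.

    Hypothetical scheme: a sample whose reference signal appears one
    cycle late has every CT moved one cycle earlier. Lower values
    still mean higher abundance after this step.
    """
    return raw_ct - (sample_reference_ct - cohort_reference_ct)


def invert_ct(normalized_ct, offset=42.0):
    """Invert a normalized CT so that higher values mean higher abundance."""
    return offset - normalized_ct
```

With these assumptions, a raw CT of 30 in a sample whose reference runs one cycle late (26 against a cohort value of 25) normalizes to 29 and inverts to 13.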
Detection
"Detection" means detecting the presence of a component in a sample. Detection also means detecting the absence of a component. Detection also means measuring the level of a component, either quantitatively or qualitatively.
Differential expression
"Differential expression" may mean qualitative or quantitative differences in the temporal and/or cellular gene expression patterns within and among cells and tissue.
Thus, a differentially expressed gene can qualitatively have its expression altered, including an activation or inactivation, in, e.g., normal versus disease tissue. Genes may be turned on or turned off in a particular state relative to another state, thus permitting comparison of two or more states. A qualitatively regulated gene will exhibit an expression pattern within a state or cell type that may be detectable by standard techniques. Some genes will be expressed in one state or cell type, but not in both.
Alternatively, the difference in expression may be quantitative, e.g., in that expression is modulated, up-regulated, resulting in an increased amount of transcript, or down-regulated, resulting in a decreased amount of transcript. The degree to which expression differs need only be large enough to quantify via standard characterization techniques such as expression arrays, quantitative reverse transcriptase PCR, northern analysis, and RNase protection.
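The quantitative case can be sketched in Python as follows. The fold change is computed as the larger signal divided by the smaller, as defined in this section, and the 1.5-fold cutoff mirrors the parallel lines drawn in Figures 1-3. The function names and the use of a fixed cutoff as the sole criterion are assumptions for illustration; statistical significance is a separate consideration.

```python
import math


def fold_change(signal_a, signal_b):
    """Larger signal divided by smaller signal (both must be positive)."""
    if signal_a <= 0 or signal_b <= 0:
        raise ValueError("signals must be positive")
    return max(signal_a, signal_b) / min(signal_a, signal_b)


def exceeds_fold_change(log2_a, log2_b, min_fold=1.5):
    """Check a fold-change cutoff directly on log2-scale signals.

    A fold change of min_fold in either direction corresponds to an
    absolute log2 difference of log2(min_fold).
    """
    return abs(log2_a - log2_b) >= math.log2(min_fold)
```

On a log2 scale a 1.5-fold difference is about 0.585 units, which is the vertical distance between the diagonal and each parallel line in such a plot.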
Expression profile
"Expression profile", as used herein, may mean a genomic expression profile, e.g., an expression profile of microRNAs. Profiles may be generated by any convenient means for determining a level of a nucleic acid sequence, e.g., quantitative hybridization of microRNA, labeled microRNA, amplified microRNA, cRNA, etc., quantitative PCR, ELISA for quantification, and the like, and allow the analysis of differential gene expression between two samples. A subject or patient tumor sample, e.g., cells or collections thereof, e.g., tissues, is assayed. Samples are collected by any convenient method, as known in the art. Nucleic acid sequences of interest are nucleic acid sequences that are found to be predictive, including the nucleic acid sequences provided above, where the expression profile may include expression data for 5, 10, 20, 25, 50, 100 or more, including all of the listed nucleic acid sequences. The term "expression profile" may also mean measuring the abundance of the nucleic acid sequences in the measured samples.
Expression ratio
"Expression ratio", as used herein, refers to relative expression levels of two or more nucleic acids as determined by detecting the relative expression levels of the corresponding nucleic acids in a biological sample.
FDR
When performing multiple statistical tests, for example, in comparing the signal between two groups in multiple data features, there is an increasingly high probability of obtaining false positive results, by random differences between the groups that can reach levels that would otherwise be considered statistically significant. In order to limit the proportion of such false discoveries, statistical significance is defined only for data features in which the differences reach a p-value (such as by a two-sided t-test) below a threshold, which is dependent on the number of tests performed and the distribution of p-values obtained in these tests. FDR or false discovery rate is the probability that one of the "significant" results was actually false.
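One widely used procedure with the properties described here is the Benjamini-Hochberg step-up method, sketched below in Python. The specification does not name the procedure that was used, so this choice, the function name and the default rate of 0.05 are assumptions for illustration only.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Flag the p-values that are significant at the given FDR.

    The threshold depends on the number of tests (m) and on the
    ranked distribution of the p-values: the k-th smallest p-value
    is compared with k * fdr / m, and every p-value up to the
    largest k that passes is declared significant.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_passing_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank * fdr / m:
            largest_passing_rank = rank
    significant = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= largest_passing_rank:
            significant[index] = True
    return significant
```

The result is returned in the same order as the input, so it can be used directly to mark data features (for example, individual miRs) as differentially expressed or not.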
Fold change
The larger signal value divided by the smaller signal value.
Hairpin
"Hairpin", as used herein, refers to an area where single-stranded DNA or RNA has folded back on itself and nucleotides from the two strands have base paired, so that the resulting structure appears as a hairpin structure. The hairpin may comprise a first and a second nucleic acid sequence that are substantially complementary. The first and second nucleic acid sequence may be from 37-50 nucleotides. The first and second nucleic acid sequence may be separated by a third sequence of from 8-12 nucleotides. The hairpin structure may have a free energy less than -25 Kcal/mole as calculated by the Vienna algorithm with default parameters, as described in Hofacker et al, Monatshefte f. Chemie 125: 167-188 (1994), the contents of which are incorporated herein. The hairpin may comprise a terminal loop of 4-20, 8-12 or 10 nucleotides.
Gene
"Gene", as used herein, may be a natural (e.g., genomic) or synthetic gene comprising transcriptional and/or translational regulatory sequences and/or a coding region and/or non-translated sequences (e.g., introns, 5'- and 3'-untranslated sequences). The coding region of a gene may be a nucleotide sequence coding for an amino acid sequence or a functional RNA, such as tRNA, rRNA, catalytic RNA, siRNA, miRNA or antisense RNA. A gene may also be an mRNA or cDNA corresponding to the coding regions (e.g., exons and miRNA) optionally comprising 5'- or 3'-untranslated sequences linked thereto. A gene may also be an amplified nucleic acid molecule produced in vitro comprising all or a part of the coding region and/or 5'- or 3'-untranslated sequences linked thereto.
Identity
"Identical" or "identity", as used herein in the context of two or more nucleic acids or polypeptide sequences, may mean that the sequences have a specified percentage of residues that are the same over a specified region. The percentage may be calculated by optimally aligning the two sequences, comparing the two sequences over the specified region, determining the number of positions at which the identical residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the specified region, and multiplying the result by 100 to yield the percentage of sequence identity. In cases where the two sequences are of different lengths or the alignment produces one or more staggered ends and the specified region of comparison includes only a single sequence, the residues of the single sequence are included in the denominator but not the numerator of the calculation. When comparing DNA and RNA, thymine (T) and uracil (U) may be considered equivalent. The identity calculation may be performed manually or by using a computer sequence algorithm such as BLAST or BLAST 2.0.
Increased risk of cancer
As used herein, "increased risk of cancer" may mean that the probability that a subject will develop cancer in the future is higher than that of a control subject.
Inhibit
"Inhibit", as used herein, may mean prevent, suppress, repress, reduce or eliminate.
Label
"Label", as used herein, may mean a composition detectable by spectroscopic, photochemical, biochemical, immunochemical, chemical, or other physical means. For example, useful labels include 32P, fluorescent dyes, electron-dense reagents, enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and other entities which can be made detectable. A label may be incorporated into nucleic acids and proteins at any position.
Logistic regression
Logistic regression is part of a category of statistical models called generalized linear models. Logistic regression allows one to predict a discrete outcome, such as group membership, from a set of variables that may be continuous, discrete, dichotomous, or a mix of any of these. The dependent or response variable is dichotomous, for example, one of two possible types of cancer. Logistic regression models the natural log of the odds ratio, i.e., the ratio of the probability of belonging to the first group (P) over the probability of belonging to the second group (1-P), as a linear combination of the different expression levels (in log-space) and of other explaining variables. The logistic regression output can be used as a classifier by prescribing that a case or sample will be classified into the first type if P is greater than 0.5 or 50%. Alternatively, the calculated probability P can be used as a variable in other contexts such as a 1D or 2D threshold classifier.
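A minimal Python sketch of the classifier described above follows. The weights and intercept would normally be fitted to training data; here they are hypothetical inputs, as are the function names and group labels.

```python
import math


def logistic_probability(log_expression, weights, intercept):
    """Probability P of belonging to the first group.

    ln(P / (1 - P)) is modeled as a linear combination of the
    log-space expression levels, so P is the logistic function of
    that linear combination.
    """
    log_odds = intercept + sum(w * x for w, x in zip(weights, log_expression))
    return 1.0 / (1.0 + math.exp(-log_odds))


def classify(probability, threshold=0.5):
    """Assign the first group only when P is greater than the threshold."""
    return "first" if probability > threshold else "second"
```

With weights of +1 and -1 on two markers, for instance, the log-odds is simply the difference between their log-space expression levels, so a sample with levels 10 and 8 has log-odds 2 and P of roughly 0.88.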
Metastasis
"Metastasis", as used herein, means the process by which cancer spreads from the place at which it first arose as a primary tumor (origin) to other locations in the body. The metastatic progression of a primary tumor reflects multiple stages, including dissociation from neighboring primary tumor cells, survival in the circulation, and growth in a secondary location. The name of a specific metastasis refers to its origin.
miRNA or miR
"miRNA" or "miR", as used herein, may mean a non-coding RNA between 18 and 25 nucleobases in length, which is the product of cleavage of a pre-miRNA by the enzyme Dicer. Examples of mature miRNAs are found in the miRNA database known as Sanger miRBase (release 10).
miRNA precursor
"miRNA precursor", as used herein, may mean a transcript that originates from a genomic DNA and that comprises a non-coding, structured RNA comprising one or more miRNA sequences. For example, in certain embodiments a miRNA precursor is a pre-miRNA. In certain embodiments, a miRNA precursor is a pri-miRNA.
Mismatch
"Mismatch" means a nucleobase of a first nucleic acid that is not capable of pairing with a nucleobase at a corresponding position of a second nucleic acid.
Modified oligonucleotide
"Modified oligonucleotide" as used herein means an oligonucleotide having one or more modifications relative to a naturally occurring terminus, sugar, nucleobase, and/or internucleoside linkage. According to one embodiment, the modified oligonucleotide is a miRNA or siRNA comprising a modification (e.g. labeled). According to another embodiment, the modified oligonucleotide is complementary to a miRNA or siRNA.
Nucleic acid
"Nucleic acid" or "oligonucleotide" or "polynucleotide", as used herein, may mean at least two nucleotides covalently linked together. The depiction of a single strand also defines the sequence of the complementary strand. Thus, a nucleic acid also encompasses the complementary strand of a depicted single strand. Many variants of a nucleic acid may be used for the same purpose as a given nucleic acid. Thus, a nucleic acid also encompasses substantially identical nucleic acids and complements thereof. A single strand provides a probe that may hybridize to a target sequence under stringent hybridization conditions. Thus, a nucleic acid also encompasses a probe that hybridizes under stringent hybridization conditions. Nucleic acids may be single-stranded or double-stranded, or may contain portions of both double-stranded and single-stranded sequence. The nucleic acid may be DNA, both genomic and cDNA, RNA, or a hybrid, where the nucleic acid may contain combinations of deoxyribo- and ribo-nucleotides, and combinations of bases including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine and isoguanine. Nucleic acids may be obtained by chemical synthesis methods or by recombinant methods.
A nucleic acid will generally contain phosphodiester bonds, although nucleic acid analogs may be included that may have at least one different linkage, e.g., phosphoramidate, phosphorothioate, phosphorodithioate, or O-methylphosphoroamidite linkages and peptide nucleic acid backbones and linkages. Other analog nucleic acids include those with positive backbones, non-ionic backbones, and non-ribose backbones, including those described in U.S. Pat. Nos. 5,235,033 and 5,034,506, which are incorporated by reference. Nucleic acids containing one or more non-naturally occurring or modified nucleotides are also included within one definition of nucleic acids. The modified nucleotide analog may be located, for example, at the 5'-end and/or the 3'-end of the nucleic acid molecule. Representative examples of nucleotide analogs may be selected from sugar- or backbone-modified ribonucleotides. It should be noted, however, that also nucleobase-modified ribonucleotides, i.e., ribonucleotides containing a non-naturally occurring nucleobase instead of a naturally occurring nucleobase such as uridines or cytidines modified at the 5-position, e.g., 5-(2-amino)propyl uridine, 5-bromo uridine; adenosines and guanosines modified at the 8-position, e.g., 8-bromo guanosine; deaza nucleotides, e.g., 7-deaza-adenosine; O- and N-alkylated nucleotides, e.g., N6-methyl adenosine are suitable. The 2'-OH-group may be replaced by a group selected from H, OR, R, halo, SH, SR, NH2, NHR, NR2 or CN, wherein R is C1-C6 alkyl, alkenyl or alkynyl and halo is F, Cl, Br or I. Modified nucleotides also include nucleotides conjugated with cholesterol through, e.g., a hydroxyprolinol linkage as described in Krutzfeldt et al, Nature 2005;438:685-689, Soutschek et al, Nature 2004;432:173-178, and U.S. Patent Publication No. 20050107325, which are incorporated herein by reference. Additional modified nucleotides and nucleic acids are described in U.S. Patent Publication No. 20050182005, which is incorporated herein by reference. Modifications of the ribose-phosphate backbone may be done for a variety of reasons, e.g., to increase the stability and half-life of such molecules in physiological environments, to enhance diffusion across cell membranes, or as probes on a biochip. The backbone modification may also enhance resistance to degradation, such as in the harsh endocytic environment of cells. The backbone modification may also reduce nucleic acid clearance by hepatocytes, such as in the liver and kidney. Mixtures of naturally occurring nucleic acids and analogs may be made; alternatively, mixtures of different nucleic acid analogs, and mixtures of naturally occurring nucleic acids and analogs may be made.
Probe
"Probe", as used herein, may mean an oligonucleotide capable of binding to a target nucleic acid of complementary sequence through one or more types of chemical bonds, usually through complementary base pairing, usually through hydrogen bond formation. Probes may bind target sequences lacking complete complementarity with the probe sequence, depending upon the stringency of the hybridization conditions. There may be any number of base pair mismatches which will interfere with hybridization between the target sequence and the single-stranded nucleic acids described herein. However, if the number of mutations is so great that no hybridization can occur under even the least stringent of hybridization conditions, the sequence is not a complementary target sequence. A probe may be single-stranded or partially single- and partially double-stranded. The strandedness of the probe is dictated by the structure, composition, and properties of the target sequence. Probes may be directly labeled or indirectly labeled such as with biotin to which a streptavidin complex may later bind.
Pseudogene
As used herein, pseudogenes are defunct relatives of known genes that are no longer expressed in the cell. Although most pseudogenes have some gene-like features (such as promoters, CpG islands, and splice sites), they are nonetheless considered nonfunctional, due to their lack of protein-coding ability resulting from various genetic disablements (stop codons, frameshifts, or a lack of transcription) or their inability to encode RNA (such as with rRNA pseudogenes).
Reference expression profile
As used herein, the term "reference expression profile" means a profile of values that statistically correlates to a particular outcome when compared to an assay result. In preferred embodiments the reference profile values are determined from statistical analysis of studies that compare microRNA expression with known clinical outcomes. The reference values may be a threshold score value or a cutoff score value. Typically, a reference value will be a threshold above which one outcome is more probable and below which an alternative outcome is more probable.
Sensitivity
"Sensitivity", as used herein, may mean a statistical measure of how well a binary classification test correctly identifies a condition, for example, how frequently it correctly classifies a cancer into the correct type out of two possible types. The sensitivity for class A is the proportion of cases that are determined to belong to class "A" by the test out of the cases that are in class "A", as determined by some absolute or gold standard.
siRNA
"siRNA", as used herein, refers to small inhibitory RNA duplexes (generally 16-30 base pairs) that induce the RNA interference (RNAi) pathway. These molecules contain varying degrees of complementarity to their target mRNA in the antisense strand.
Some, but not all, siRNA have unpaired overhanging bases on the 5' or 3' end of the sense strand and/or the antisense strand. The term "siRNA" includes duplexes of two separate strands, as well as single strands that can form hairpin structures comprising a duplex region.
Specificity
"Specificity", as used herein, may mean a statistical measure of how well a binary classification test correctly identifies a condition, for example, how frequently it correctly classifies a cancer into the correct type out of two possible types. The specificity for class A is the proportion of cases that are determined to belong to class "not A" by the test out of the cases that are in class "not A", as determined by some absolute or gold standard.
Stem-loop sequence
"Stem-loop sequence", as used herein, may mean an RNA having a hairpin structure and containing a mature miRNA sequence. Pre-miRNA sequences and stem-loop sequences may overlap. Examples of stem-loop sequences are found in the miRNA database known as Sanger miRBase (release 10).
Stringent hybridization conditions
"Stringent hybridization conditions", as used herein, may mean conditions under which a first nucleic acid sequence (e.g., probe) will hybridize to a second nucleic acid sequence (e.g., target), such as in a complex mixture of nucleic acids. Stringent conditions are sequence-dependent and will be different in different circumstances.
Stringent conditions may be selected to be about 5-10°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm may be the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50% of the probes complementary to the target hybridize to the target sequence at equilibrium (as the target sequences are present in excess, at Tm, 50% of the probes are occupied at equilibrium). Stringent conditions may be those in which the salt concentration is less than about 1.0 M sodium ion, such as about 0.01-1.0 M sodium ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C for short probes (e.g., about 10-50 nucleotides) and at least about 60°C for long probes (e.g., greater than about 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. For selective or specific hybridization, a positive signal may be at least 2 to 10 times background hybridization. Exemplary stringent hybridization conditions include the following: 50% formamide, 5x SSC, and 1% SDS, incubating at 42°C, or, 5x SSC, 1% SDS, incubating at 65°C, with wash in 0.2x SSC, and 0.1% SDS at 65°C.
Substantially complementary
"Substantially complementary" as used herein means that a first sequence is at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98% or 99% identical to the complement of a second sequence over a region of 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100 or more nucleotides, or that the two sequences hybridize under stringent hybridization conditions.
Substantially identical
"Substantially identical" as used herein means that a first and a second sequence are at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98% or 99% identical over a region of 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100 or more nucleotides or amino acids, or with respect to nucleic acids, if the first sequence is substantially complementary to the complement of the second sequence.
Subject
As used herein, the term "subject" refers to a mammal, including both human and other mammals. The methods of the present invention are preferably applied to human subjects.
Threshold expression level
As used herein, the phrase "threshold expression level" refers to a criterion expression profile to which measured values are compared in order to determine the prognosis of a subject with cancer. The reference expression profile may be based on the expression of the nucleic acids, or may be based on a combined metric score thereof.
Tissue sample
As used herein, a tissue sample is tissue obtained from a tissue biopsy using methods well known to those of ordinary skill in the related medical arts. The phrase "suspected of being cancerous", as used herein, means a cancer tissue sample believed by one of ordinary skill in the medical arts to contain cancerous cells. Methods for obtaining the sample from the biopsy include gross apportioning of a mass, microdissection, laser-based microdissection, or other art-known cell-separation methods.
Treat
"Treat" or "treating", as used herein when referring to protection of a subject from a condition, may mean preventing, suppressing, repressing, or eliminating the condition. Preventing the condition involves administering a composition described herein to a subject prior to onset of the condition. Suppressing the condition involves administering the composition to a subject after induction of the condition but before its clinical appearance. Repressing the condition involves administering the composition to a subject after clinical appearance of the condition such that the condition is reduced or prevented from worsening. Elimination of the condition involves administering the composition to a subject after clinical appearance of the condition such that the subject no longer suffers from the condition.
Tumor
"Tumor", as used herein, refers to all neoplastic cell growth and proliferation, whether malignant or benign, and all pre-cancerous and cancerous cells and tissues.
Variant
"Variant", as used herein to refer to a nucleic acid, may mean (i) a portion of a referenced nucleotide sequence; (ii) the complement of a referenced nucleotide sequence or portion thereof; (iii) a nucleic acid that is substantially identical to a referenced nucleic acid or the complement thereof; or (iv) a nucleic acid that hybridizes under
Figure imgf000028_0001
conditions to the referenced nucleic acid, complement thereof, or a sequences substantially identical thereto. Vector
"Vector", as used herein, may mean a nucleic acid sequence containing an origin of replication. A vector maybe a plasmid, bacteriophage, bacterial artificial chromosome or yeast artificial chromosome. A vector may be a DNA or RNA vector. A vector may be either a self-replicating extrachromosomal vector or a vector which integrates into a host genome.
Y-RNA
"Y RNAs", as used herein, are small non-coding RNA components of the Ro ribonucleoprotein particle (Ro RNP). These small RNAs are predicted to fold into a conserved stem formed by the 3' and 5' ends of the RNA and characterized by a single; bulged cytosine. In some embodiments, Y RNAs are over-expressed in human tumours.
In some embodiments, Y RNAs are required for cell proliferation.
1D/2D threshold classifier
"1D/2D threshold classifier", as used herein, may mean an algorithm for classifying a case or sample such as a cancer sample into one of two possible types such as two types of cancer or two types of prognosis (e.g., good and bad). For a ID threshold classifier, the decision is based on one variable and one predetermined threshold value; the sample is assigned to one class if the variable exceeds the threshold and to the other class if the variable is less than the threshold. A 2D threshold classifier is an algorithm for classifying into one of two types based on the values of two variables. A score may be calculated as a function (usually a continuous function) of the two variables; the decision is then reached by comparing the score to the predetermined threshold, similar to the ID threshold classifier.
2. MicroRNAs and their processing
A gene coding for a miRNA may be transcribed, leading to production of an miRNA precursor known as the pri-miRNA. The pri-miRNA may be part of a polycistronic RNA comprising multiple pri-miRNAs. The pri-miRNA may form a hairpin with a stem and loop. The stem may comprise mismatched bases.
The hairpin structure of the pri-miRNA may be recognized by Drosha, which is an RNase III endonuclease. Drosha may recognize terminal loops in the pri-miRNA and cleave approximately two helical turns into the stem to produce a 30-200 nucleotide precursor known as the pre-miRNA. Drosha may cleave the pri-miRNA with a staggered cut typical of RNase III endonucleases yielding a pre-miRNA stem loop with a 5' phosphate and ~2 nucleotide 3' overhang. Approximately one helical turn of stem (~10 nucleotides) extending beyond the Drosha cleavage site may be essential for efficient processing. The pre-miRNA may then be actively transported from the nucleus to the cytoplasm by Ran-GTP and the export receptor Exportin-5.
The pre-miRNA may be recognized by Dicer, which is also an RNase III endonuclease. Dicer may recognize the double-stranded stem of the pre-miRNA. Dicer may also recognize the 5' phosphate and 3' overhang at the base of the stem loop. Dicer may cleave off the terminal loop two helical turns away from the base of the stem loop leaving an additional 5' phosphate and ~2 nucleotide 3' overhang. The resulting siRNA-like duplex, which may comprise mismatches, comprises the mature miRNA and a similar-sized fragment known as the miRNA*. The miRNA and miRNA* may be derived from opposing arms of the pri-miRNA and pre-miRNA. MiRNA* sequences may be found in libraries of cloned miRNAs but typically at lower frequency than the miRNAs.
Although initially present as a double-stranded species with miRNA*, the miRNA may eventually become incorporated as a single-stranded RNA into a ribonucleoprotein complex known as the RNA-induced silencing complex (RISC). Various proteins can form the RISC, which can lead to variability in specificity for miRNA/miRNA* duplexes, binding site of the target gene, activity of miRNA (repress or activate), and which strand of the miRNA/miRNA* duplex is loaded into the RISC.
When the miRNA strand of the miRNA:miRNA* duplex is loaded into the RISC, the miRNA* may be removed and degraded. The strand of the miRNA:miRNA* duplex that is loaded into the RISC may be the strand whose 5' end is less tightly paired. In cases where both ends of the miRNA:miRNA* have roughly equivalent 5' pairing, both miRNA and miRNA* may have gene silencing activity.
The RISC may identify target nucleic acids based on high levels of complementarity between the miRNA and the mRNA, especially by nucleotides 2-8 of the miRNA. Only one case has been reported in animals where the interaction between the miRNA and its target was along the entire length of the miRNA. This was shown for miR-196 and Hox B8 and it was further shown that miR-196 mediates the cleavage of the Hox B8 mRNA (Yekta et al, Science 2004; 304:594-596). Otherwise, such interactions are known only in plants (Bartel & Bartel, Plant Physiol 2003; 132:709-717). A number of studies have looked at the base-pairing requirement between miRNA and its mRNA target for achieving efficient inhibition of translation (reviewed by Bartel, Cell 2004; 116:281-297). In mammalian cells, the first 8 nucleotides of the miRNA may be important (Doench & Sharp, Genes Dev 2004; 18:504-511). However, other parts of the microRNA may also participate in mRNA binding. Moreover, sufficient base pairing at the 3' can compensate for insufficient pairing at the 5' (Brennecke et al, PLoS Biol 2005; 3:e85). Computational studies, in which miRNA binding on whole genomes is analyzed, have suggested a specific role for bases 2-7 at the 5' of the miRNA in target binding, but the role of the first nucleotide, usually found to be "A", was also recognized (Lewis et al, Cell 2005; 120:15-20). Similarly, nucleotides 1-7 or 2-8 were used by Krek et al (Nat Genet 2005; 37:495-500) to identify and validate targets.
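The role of nucleotides 2-8 in target recognition can be illustrated with a simple seed-site search. This is a sketch under stated assumptions: it considers only perfect Watson-Crick pairing (no G:U wobble, no 3' compensatory pairing), and the function names and both sequences are illustrative rather than taken from the sequence listing.

```python
# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence, written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


def seed_sites(mirna, mrna, start=2, end=8):
    """Return 0-based mRNA positions that pair perfectly with the miRNA
    seed, i.e. miRNA nucleotides start..end (1-based, inclusive)."""
    seed = mirna[start - 1:end]
    site = reverse_complement(seed)
    positions = []
    idx = mrna.find(site)
    while idx != -1:
        positions.append(idx)
        idx = mrna.find(site, idx + 1)
    return positions


# Illustrative miRNA and a short mRNA fragment containing one seed site.
mirna = "UGAGGUAGUAGGUUGUAUAGUU"
mrna = "AAAACUACCUCAAAA"
print(seed_sites(mirna, mrna))
```

Passing `start=1, end=7` or `start=2, end=7` would search with the alternative seed definitions mentioned above.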
The target sites in the mRNA may be in the 5' UTR, the 3' UTR or in the coding region. Interestingly, multiple miRNAs may regulate the same mRNA target by recognizing the same or multiple sites. The presence of multiple miRNA binding sites in most genetically identified targets may indicate that the cooperative action of multiple RISCs provides the most efficient translational inhibition.
miRNAs may direct the RISC to down-regulate gene expression by either of two mechanisms: mRNA cleavage or translational repression. The miRNA may specify cleavage of the mRNA if the mRNA has a certain degree of complementarity to the miRNA. When a miRNA guides cleavage, the cut may be between the nucleotides pairing to residues 10 and 11 of the miRNA. Alternatively, the miRNA may repress translation if the mRNA does not have the requisite degree of complementarity to the miRNA. Translational repression may be more prevalent in animals since animals may have a lower degree of complementarity between the miRNA and binding site.
It should be noted that there may be variability in the 5' and 3' ends of any pair of miRNA and miRNA*. This variability may be due to variability in the enzymatic processing of Drosha and Dicer with respect to the site of cleavage. Variability at the 5' and 3' ends of miRNA and miRNA* may also be due to mismatches in the stem structures of the pri-miRNA and pre-miRNA. The mismatches of the stem strands may lead to a population of different hairpin structures. Variability in the stem structures may also lead to variability in the products of cleavage by Drosha and Dicer.
2.1 Nucleic Acids
Nucleic acids are provided herein. The nucleic acid may comprise the sequence of SEQ ID NOS: 1-514, or variants thereof. The variant may be a complement of the referenced nucleotide sequence. The variant may also be a nucleotide sequence that is substantially identical to the referenced nucleotide sequence or the complement thereof. The variant may also be a nucleotide sequence which hybridizes under stringent conditions to the referenced nucleotide sequence, complements thereof, or nucleotide sequences substantially identical thereto.
The nucleic acid may have a length of from 10 to 250 nucleotides. The nucleic acid may have a length of at least 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 125, 150, 175, 200 or 250 nucleotides. The nucleic acid may be synthesized or expressed in a cell (in vitro or in vivo) using a synthetic gene described herein. The nucleic acid may be synthesized as a single-strand molecule and hybridized to a substantially complementary nucleic acid to form a duplex. The nucleic acid may be introduced to a cell, tissue or organ in a single- or double-stranded form or may be capable of being expressed by a synthetic gene using methods well known to those skilled in the art, including as described in U.S. Patent No. 6,506,559, which is incorporated by reference.
Table 1: miR and miR hairpin sequence identity numbers
[Table 1 is provided as images imgf000033_0001 to imgf000038_0001 in the original publication.]
Predicted precursor positions, including genomic location, are presented in Table 2 below. In one embodiment, a miRNA may be identified at a single genome locus. In another embodiment, a miRNA may be identified within multiple genomic loci.
Table 2: Predicted precursor positions
[Table 2 is provided as images imgf000039_0001 to imgf000047_0001 in the original publication.]
Sequence variants are presented in Table 3 below, together with counts for both the most abundant sequence and sequence variants.
Table 3: Sequence variants of miR sequences
[Table 3 is provided as images imgf000049_0001 and imgf000050_0001 in the original publication.]
MiRNAs may, in one embodiment, form clusters in the genome. In one embodiment, a cluster is defined based on the criterion of a distance of not more than 5000 nucleotides between miRNAs within the human genome. Most miRNA genes within 50 kb of each other have highly correlated expression patterns, with the correlation dropping sharply beyond the 50-kb range. Relatively few miRNAs are found between 50 kb and 500 kb of each other, as described in Baskerville and Bartel (RNA 2005; 11:241-247). Thus, in one embodiment, clustered miRNAs are defined as falling within a distance of 0.1 kb to 50 kb of each other.
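The distance criterion can be sketched as a grouping of loci along each chromosome. This is an illustration only: the function name, the miRNA names and the coordinates are hypothetical, and chaining neighbours that lie within the maximum distance of the previous locus is an assumed reading of the criterion.

```python
def cluster_mirnas(loci, max_distance):
    """Group miRNA loci into clusters.

    loci: iterable of (chromosome, position, name) tuples.
    Two loci that are adjacent on the same chromosome fall in one
    cluster when they are at most max_distance nucleotides apart.
    Returns only clusters with more than one member.
    """
    clusters = []
    for chrom, pos, name in sorted(loci):
        last = clusters[-1] if clusters else None
        if last and last[-1][0] == chrom and pos - last[-1][1] <= max_distance:
            last.append((chrom, pos, name))      # extend the current cluster
        else:
            clusters.append([(chrom, pos, name)])  # start a new cluster
    return [[name for _, _, name in c] for c in clusters if len(c) > 1]


# Hypothetical loci on two chromosomes.
loci = [
    ("chr12", 1000, "miR-a"), ("chr12", 1500, "miR-b"),
    ("chr12", 900000, "miR-c"),
    ("chr20", 2000, "miR-d"), ("chr20", 3000, "miR-e"),
]
print(cluster_mirnas(loci, max_distance=5000))
```

Calling the function with `max_distance=50_000` would apply the 50-kb bound instead of the 5000-nucleotide criterion.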
In one embodiment, miRNAs located at this distance are co-transcribed. In another embodiment, miRNAs located at this distance are co-regulated. In yet another embodiment, miRNAs located at this distance are co-transcribed and co-regulated. In one embodiment, SEQ ID NOS: 53 and 162 appear in a cluster. In one embodiment, SEQ ID NOS: 70 and 110 appear in a cluster, separated by 500 nucleotides on chromosome 12. In one embodiment, SEQ ID NOS: 14 and 120 appear in a cluster. In one embodiment, SEQ ID NOS: 63, 106 and 58 appear in a cluster separated by 1000 nucleotides on chromosome 20. In one embodiment, SEQ ID NOS: 135 and 159 appear in a cluster.
a. Nucleic acid complex
The nucleic acid may further comprise one or more of the following: a peptide, a protein, an RNA-DNA hybrid, an antibody, an antibody fragment, a Fab fragment, and an aptamer. The nucleic acid may also comprise a protamine-antibody fusion protein as described in Song et al. (Nature Biotechnology 2005; 23:709-717) and Rossi (Nature Biotechnology 2005; 23:682-684), the contents of which are incorporated herein by reference. The protamine-fusion protein may comprise the abundant and highly basic cellular protein protamine. The protamine may readily interact with the nucleic acid. The protamine may comprise the entire 51-amino acid protamine peptide or a fragment thereof. The protamine may be covalently attached to another protein, which may be a Fab. The Fab may bind to a receptor expressed on a cell surface.
b. Pri-miRNA
The nucleic acid may comprise a sequence of a pri-miRNA or a variant thereof. The pri-miRNA sequence may comprise from 45-30,000, 50-25,000, 100-20,000, 1,000-1,500 or 80-100 nucleotides. The sequence of the pri-miRNA may comprise a pre-miRNA, miRNA and miRNA*, as set forth herein, and variants thereof. A sequence of the pri-miRNA may comprise the sequence of SEQ ID NOS: 1, 2, 4-7, 9-26, 28-35, 37-43, 46-72, 74-81, 83-93, 95-102, 104-108, 110-114, 116-157, 159-187, 189-196, 198-267, 269-326, 328-330, 332-340, 342, 343, 345-350, 352-361, 363-369, 371-387, 389-413, 415-432, 434-438, 440-449, 472-514 or variants thereof.
The pri-miRNA may form a hairpin structure. The hairpin may comprise first and second nucleic acid sequences that are substantially complementary. The first and second nucleic acid sequences may be from 37-50 nucleotides. The first and second nucleic acid sequences may be separated by a third sequence of from 8-12 nucleotides. The hairpin structure may have a free energy less than -25 kcal/mole, as calculated by the Vienna algorithm, with default parameters, as described in Hofacker et al. (Monatshefte f. Chemie 1994; 125:167-188), the contents of which are incorporated herein. The hairpin may comprise a terminal loop of 4-20, 8-12 or 10 nucleotides. The pri-miRNA may comprise at least 19% adenosine nucleotides, at least 16% cytosine nucleotides, at least 23% thymine nucleotides and at least 19% guanine nucleotides.
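The nucleotide composition criterion can be checked with a short sketch. The function names and example sequences are hypothetical, and counting uracil as thymine (so that RNA and DNA representations are handled alike) is an assumption. The free-energy criterion is not covered here, since it requires a folding program.

```python
from collections import Counter

# Minimum fractions stated above for adenosine, cytosine, thymine and guanine.
MIN_FRACTION = {"A": 0.19, "C": 0.16, "T": 0.23, "G": 0.19}


def base_fractions(sequence):
    """Fraction of each base; uracil is counted as thymine (assumption)."""
    seq = sequence.upper().replace("U", "T")
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    return {base: counts[base] / len(seq) for base in "ACGT"}


def meets_composition(sequence):
    """True if every base reaches its minimum fraction."""
    fractions = base_fractions(sequence)
    return all(fractions[b] >= m for b, m in MIN_FRACTION.items())


# Hypothetical sequences: one balanced, one adenosine-rich.
print(meets_composition("ACGU" * 5))
print(meets_composition("AAAAAAAAGC"))
```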
c. Pre-miRNA
The nucleic acid may also comprise a sequence of a pre-miRNA or a variant thereof. The pre-miRNA sequence may comprise from 45-200, 60-80 or 60-70 nucleotides. The sequence of the pre-miRNA may comprise a miRNA and a miRNA*, as set forth herein. The sequence of the pre-miRNA may also be that of a pri-miRNA excluding from 0-160 nucleotides from the 5' and 3' ends of the pri-miRNA. A sequence of the pre-miRNA may comprise the sequence of SEQ ID NOS: 1, 2, 4-7, 9-26, 28-35, 37-43, 46-72, 74-81, 83-93, 95-102, 104-108, 110-114, 116-157, 159-187, 189-196, 198-267, 269-326, 328-330, 332-340, 342, 343, 345-350, 352-361, 363-369, 371-387, 389-413, 415-432, 434-438, 440-449, 472-514 or variants thereof.
d. miRNA
The nucleic acid may also comprise a sequence of a miRNA (including miRNA*) or a variant thereof. The miRNA sequence may comprise from 13-33, 18-24 or 21-23 nucleotides. The miRNA may also comprise a total of at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 or 40 nucleotides. The sequence of the miRNA may be the first 13-33 nucleotides of the pre-miRNA. The sequence of the miRNA may also be the last 13-33 nucleotides of the pre-miRNA. The sequence of the miRNA may comprise the sequence of SEQ ID NOS: 1, 2, 4-7, 9-26, 28-35, 37-43, 46-72, 74-81, 83-93, 95-102, 104-108, 110-114, 116-157, 159-175 and 472-495 or variants thereof.
e. Anti-miRNA
The nucleic acid may also comprise a sequence of an anti-miRNA that is capable of blocking the activity of a miRNA or miRNA*, such as by binding to the pri-miRNA, pre-miRNA, miRNA or miRNA* (e.g., antisense or RNA silencing), or by binding to the target binding site. The anti-miRNA may comprise a total of 5-100 or 10-60 nucleotides. The anti-miRNA may also comprise a total of at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 or 40 nucleotides. The sequence of the anti-miRNA may comprise (a) at least 5 nucleotides that are substantially identical or complementary to the 5' of a miRNA and at least 5-12 nucleotides that are substantially complementary to the flanking regions of the target site from the 5' end of the miRNA, or (b) at least 5-12 nucleotides that are substantially identical or complementary to the 3' of a miRNA and at least 5 nucleotides that are substantially complementary to the flanking region of the target site from the 3' end of the miRNA. A sequence of the anti-miRNA may comprise the complement of the sequence of SEQ ID NOS: 1, 2, 4-7, 9-26, 28-35, 37-43, 46-72, 74-81, 83-93, 95-102, 104-108, 110-114, 116-157, 159-175 and 472-495 or variants thereof.
f. siRNA
The nucleic acid may also comprise a sequence of a double-stranded RNA (dsRNA) that is capable of suppressing specific transcripts in a sequence-dependent manner. The dsRNA may be processed to provide small interfering RNAs (siRNAs). The siRNA may comprise a total of 20-25 nucleotides. Endogenous siRNAs have been identified in nematodes, plants and mammalian cells, including mouse oocytes. The sequence of the siRNA may comprise SEQ ID NOS: 450-454 or variants thereof. In one embodiment, these sequences may be found in the introns of two genes: ERBB4 and AKAP6. ERBB4 is a member of the Tyr protein kinase family and the epidermal growth factor receptor subfamily. In one embodiment, mutations in ERBB4 may be associated with cancer. In one embodiment, AKAPs (A-kinase anchor proteins) bind to the regulatory subunit of protein kinase A (PKA). In one embodiment, the encoded protein is expressed in brain and cardiac and skeletal muscle. In one embodiment, it is specifically localized to the sarcoplasmic reticulum and nuclear membrane. In another embodiment, it is involved in anchoring PKA to the nuclear membrane or sarcoplasmic reticulum. In a further embodiment, the siRNA sequence may be found in pseudogenes derived from mitochondrial tRNA. In some embodiments, the siRNA may have a regulatory role in splicing of the genes in which it resides. In other embodiments, the siRNA may have a role in removing unspliced mRNAs. Detection of such siRNAs in a tissue sample may be indicative of cancerous processes within the cells.
3. Probes
A probe comprising a nucleic acid described herein is also provided. Probes may be used for screening and diagnostic methods. The probe may be attached or immobilized to a solid substrate, such as a biochip.
The probe may have a length of from 8 to 500, 10 to 100 or 20 to 60 nucleotides. The probe may also have a length of at least 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 120, 140, 160, 180, 200, 220, 240, 260, 280 or 300 nucleotides. The probe may further comprise a linker sequence of from 10-60 nucleotides.
4. Biochip
A biochip is also provided. The biochip may comprise a solid substrate comprising an attached probe or plurality of probes described herein. The probes may be capable of hybridizing to a target sequence under stringent hybridization conditions. The probes may be attached at a spatially defined address on the substrate. More than one probe per target sequence may be used, with either overlapping probes or probes to different sections of a particular target sequence. The probes may be capable of hybridizing to target sequences associated with a single disorder, as appreciated by those in the art. The probes may either be synthesized first, with subsequent attachment to the biochip, or may be directly synthesized on the biochip.
The solid substrate may be a material that may be modified to contain discrete individual sites appropriate for the attachment or association of the probes, and is amenable to at least one detection method. Representative examples of substrates include glass and modified or functionalized glass, plastics (including acrylics, polystyrene and copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene, polyurethanes, Teflon, etc.), polysaccharides, nylon or nitrocellulose, resins, silica or silica-based materials including silicon and modified silicon, carbon, metals, inorganic glasses and plastics. The substrates may allow optical detection without appreciably fluorescing.
The substrate may be planar, although other configurations of substrates may be used as well. For example, probes may be placed on the inside surface of a tube, for flow-through sample analysis to minimize sample volume. Similarly, the substrate may be flexible, such as a flexible foam, including closed-cell foams made of particular plastics.
The biochip and the probe may be derivatized with chemical functional groups for subsequent attachment of the two. For example, the biochip may be derivatized with a chemical functional group including, but not limited to, amino groups, carboxyl groups, oxo groups or thiol groups. Using these functional groups, the probes may be attached using functional groups on the probes either directly or indirectly using a linker. The probes may be attached to the solid support by the 5' terminus, 3' terminus, or via an internal nucleotide.
The probe may also be attached to the solid support non-covalently. For example, biotinylated oligonucleotides can be made, which may bind to surfaces covalently coated with streptavidin, resulting in attachment. Alternatively, probes may be synthesized on the surface using techniques such as photopolymerization and photolithography.
5. Diagnosis
A method of diagnosis is provided. The method comprises detecting a differential expression level of a nucleic acid in a biological sample. The sample may be derived from a subject. Diagnosis of a disease state in a patient may allow for prognosis and selection of therapeutic strategy. Further, the developmental stage of cells may be determined by determining temporarily expressed cancer-associated nucleic acids.
In situ hybridization of labeled probes to tissue sections may be performed. When comparing the fingerprints between an individual and a standard, the skilled artisan can make a diagnosis, a prognosis, or a prediction based on the findings. It is further understood that the nucleic acids which indicate the diagnosis may differ from those which indicate the prognosis and molecular profiling of the condition of the cells may lead to distinctions between responsive or refractory conditions or may be predictive of outcomes.
6. miRNA expression analysis
Certain changes in miRNA expression patterns in cancer cells relative to noncancerous cells have been reported. Both increases and decreases in miRNA expression have been described in relation to cancer. The total number of miRNAs in the human genome is estimated to range from approximately 800 to several thousand. In view of this high number of total miRNAs, identification of particular miRNAs linked to particular cancer types is necessary in order to identify miRNAs that could be targeted for cancer therapy, either through inhibition or augmentation of the miRNA.
Accordingly, there exists a need for the identification of miRNAs that can be inhibited for the treatment of cancer. Also needed are inhibitory agents useful for the treatment of cancer. Further, there exists a need for methods of treating cancer by administering to a subject in need thereof a pharmaceutical agent capable of inhibiting a miRNA identified as dysregulated in connection with cancer. As cancer is a disease caused by the uncontrolled proliferation of cells, as well as increased cell survival, desirable traits of pharmaceutical agents for the treatment of cancer include the ability to reduce cell proliferation, and/or induce apoptosis, which will in turn reduce tumor size, reduce tumor number, and/or prevent or slow the metastasis of cancer cells.
In certain embodiments, the methods provided herein are useful for the treatment of cancer. These methods may result in one or more clinically desirable outcomes in a subject having cancer, such as reduction in tumor number and/or size, reduced metastatic progression, prolonged survival time, and/or increased progression-free survival time. Also provided herein are pharmaceutical agents, such as modified oligonucleotides, that may be used for the treatment of cancer.
The present invention also relates to a method of identifying miRNAs that are associated with disease or a pathological condition comprising contacting a biological sample with a probe or biochip of the invention and detecting the amount of hybridization. PCR may be used to amplify nucleic acids in the sample, which may provide higher sensitivity.
The ability to identify miRNAs that are overexpressed or underexpressed in pathological cells compared to a control can provide high-resolution, high-sensitivity datasets which may be used in the areas of diagnostics, therapeutics, drug development, pharmacogenetics, biosensor development, and other related areas. An expression profile generated by the current methods may be a "fingerprint" of the state of the sample with respect to a number of miRNAs. While two states may have any particular miRNA similarly expressed, the evaluation of a number of miRNAs simultaneously allows the generation of a gene expression profile that is characteristic of the state of the cell. That is, normal tissue may be distinguished from diseased tissue. By comparing expression profiles of tissue in known different disease states, information regarding which miRNAs are associated in each of these states may be obtained. Then, diagnosis may be performed or confirmed to determine whether a tissue sample has the expression profile of normal or disease tissue. This may provide for molecular diagnosis of related conditions.
7. Determination of Expression Levels
The present invention also relates to a method of determining the expression level of a cancer-associated miRNA comprising contacting a biological sample with a probe or biochip of the invention and measuring the amount of hybridization. The expression level of a cancer-associated miRNA is informative in a number of ways. For example, a differential expression of a cancer-associated miRNA compared to a control may be used as a diagnostic that a patient suffers from cancer. Expression levels of a cancer-associated miRNA may also be used to monitor the treatment and cancer state of a patient. Furthermore, expression levels of a cancer-associated miRNA may allow the screening of drug candidates for altering a particular expression profile or suppressing an expression profile associated with cancer.
A target nucleic acid may be detected by contacting a sample comprising the target nucleic acid with a biochip comprising an attached probe sufficiently complementary to the target nucleic acid and detecting hybridization to the probe above control levels.
The target nucleic acid may also be detected by immobilizing the nucleic acid to be examined on a solid support such as nylon membranes and hybridizing a labeled probe with the sample. Similarly, the target nucleic acid may also be detected by immobilizing the labeled probe to the solid support and hybridizing a sample comprising a labeled target nucleic acid. Following washing to remove the non-specific hybridization, the label may be detected.
The target nucleic acid may also be detected in situ by contacting permeabilized cells or tissue samples with a labeled probe to allow hybridization with the target nucleic acid. Following washing to remove the non-specifically bound probe, the label may be detected.
These assays can be direct hybridization assays or can comprise sandwich assays, which include the use of multiple probes, as is generally outlined in U.S. Pat. Nos. 5,681,702; 5,597,909; 5,545,730; 5,594,117; 5,591,584; 5,571,670; 5,580,731; 5,571,670; 5,591,584; 5,624,802; 5,635,352; 5,594,118; 5,359,100; 5,124,246; and 5,681,697, each of which is hereby incorporated by reference. A variety of hybridization conditions may be used, including high, moderate and low stringency conditions as outlined above. The assays may be performed under stringency conditions which allow hybridization of the probe only to the target. Stringency can be controlled by altering a step parameter that is a thermodynamic variable, including, but not limited to, temperature, formamide concentration, salt concentration, chaotropic salt concentration, pH, or organic solvent concentration.
Hybridization reactions may be accomplished in a variety of ways. Components of the reaction may be added simultaneously, or sequentially, in different orders. In addition, the reaction may include a variety of other reagents. These include salts, buffers, neutral proteins, e.g., albumin, detergents, etc. which may be used to facilitate optimal hybridization and detection, and/or reduce non-specific or background interactions. Reagents that otherwise improve the efficiency of the assay, such as protease inhibitors, nuclease inhibitors and anti-microbial agents may also be used as appropriate, depending on the sample preparation methods and purity of the target.
The present invention also relates to a method of diagnosis comprising detecting a differential expression level of a cancer-associated miRNA in a biological sample. The sample may be derived from a patient. Diagnosis of cancer in a patient allows for prognosis and selection of therapeutic strategy. Further, the developmental stage of cells may be classified by determining temporarily expressed miRNA molecules.
In situ hybridization of labeled probes to tissue arrays may be performed. When comparing the fingerprints between an individual and a standard, the skilled artisan can make a diagnosis, a prognosis, or a prediction based on the findings. It is further understood that the genes which indicate the diagnosis may differ from those which indicate the prognosis and molecular profiling of the condition of the cells may lead to distinctions between responsive or refractory conditions or may be predictive of outcomes.
8. Drug Screening
The present invention also relates to a method of screening therapeutics comprising contacting a pathological cell capable of expressing a disease related miRNA with a candidate therapeutic and evaluating the effect of a drug candidate on the expression profile of the disease associated miRNA. Having identified the differentially expressed miRNAs, a variety of assays may be executed. Test compounds may be screened for the ability to modulate gene expression of the disease associated miRNA. Modulation includes both an increase and a decrease in gene expression. The test compound or drug candidate may be any molecule, e.g., protein, oligopeptide, small organic molecule, polysaccharide, polynucleotide, etc., to be tested for the capacity to directly or indirectly alter the disease phenotype or the expression of the disease associated miRNA. Drug candidates encompass numerous chemical classes, such as small organic molecules having a molecular weight of more than 100 and less than about 500, 1,000, 1,500, 2,000 or 2,500 daltons. Candidate compounds may comprise functional groups necessary for structural interaction with proteins, particularly hydrogen bonding, and typically include at least an amine, carbonyl, hydroxyl or carboxyl group, preferably at least two of the functional chemical groups. The candidate agents may comprise cyclical carbon or heterocyclic structures and/or aromatic or polyaromatic structures substituted with one or more of the above functional groups. Candidate agents are also found among biomolecules including peptides, saccharides, fatty acids, steroids, purines, pyrimidines, derivatives, structural analogs or combinations thereof.
Combinatorial libraries of potential modulators may be screened for the ability to bind to the disease associated miRNA or to modulate the activity thereof. The combinatorial library may be a collection of diverse chemical compounds generated by either chemical synthesis or biological synthesis by combining a number of chemical building blocks such as reagents. Preparation and screening of combinatorial chemical libraries is well known to those of skill in the art. Such combinatorial chemical libraries include, but are not limited to, peptide libraries, encoded peptides, benzodiazepines, diversomers such as hydantoins, benzodiazepines and dipeptide, vinylogous polypeptides, analogous organic syntheses of small compound libraries, oligocarbamates, and/or peptidyl phosphonates, nucleic acid libraries, peptide nucleic acid libraries, antibody libraries, carbohydrate libraries, and small organic molecule libraries.
9. Gene Silencing
The present invention also relates to a method of using the nucleic acids of the invention to reduce expression of a target gene in a cell, tissue or organ. Expression of the target gene may be reduced by expressing a nucleic acid of the invention that comprises a sequence substantially complementary to one or more binding sites of the target mRNA. The nucleic acid may be a miRNA or a variant thereof. The nucleic acid may also be pri-miRNA, pre-miRNA, or a variant thereof, which may be processed to yield a miRNA. The expressed miRNA may hybridize to a substantially complementary binding site on the target mRNA, which may lead to activation of RISC-mediated gene silencing. An example for a study employing over-expression of miRNA is Yekta et al. (Science 2004; 304:594-596), which is incorporated herein by reference. One of ordinary skill in the art will recognize that the nucleic acids of the present invention may be used to inhibit expression of target genes using antisense methods well known in the art, as well as RNAi methods described in U.S. Patent Nos. 6,506,559 and 6,573,099, which are incorporated by reference.
The target of gene silencing may be a protein that causes the silencing of a second protein. By repressing expression of the target gene, expression of the second protein may be increased. Examples for efficient suppression of miRNA expression are the studies by Esau et al. (JBC 2004; 275:52361) and Cheng et al. (Nucleic Acids Res 2005; 33:1290), which are incorporated herein by reference.
10. Gene Enhancement
The present invention also relates to a method of using the nucleic acids of the invention to increase expression of a target gene in a cell, tissue or organ. Expression of the target gene may be increased by expressing a nucleic acid of the invention that comprises a sequence substantially complementary to a pri-miRNA, pre-miRNA, miRNA or a variant thereof. The nucleic acid may be an anti-miRNA. The anti-miRNA may hybridize with a pri-miRNA, pre-miRNA or miRNA, thereby reducing its gene repression activity. Expression of the target gene may also be increased by expressing a nucleic acid of the invention that is substantially complementary to a portion of the binding site in the target gene, such that binding of the nucleic acid to the binding site may prevent miRNA binding.
11. Therapeutic
The present invention also relates to a method of using the nucleic acids of the invention as modulators or targets of disease or disorders associated with developmental dysfunctions, such as cancer. In general, the claimed nucleic acid molecules may be used as a modulator of the expression of genes which are at least partially complementary to said nucleic acid. Further, miRNA molecules may act as target for therapeutic screening procedures, e.g. inhibition or activation of miRNA molecules might modulate a cellular differentiation process, e.g. apoptosis.
Furthermore, existing miRNA molecules may be used as starting materials for the manufacture of sequence-modified miRNA molecules, in order to modify the target-specificity thereof, e.g. an oncogene, a multidrug-resistance gene or another therapeutic target gene. Further, miRNA molecules can be modified, in order that they are processed and then generated as double-stranded siRNAs which are again directed against therapeutically relevant targets. Furthermore, miRNA molecules may be used for tissue reprogramming procedures, e.g. a differentiated cell line might be transformed by expression of miRNA molecules into a different cell type or a stem cell.
12. Compositions
A composition is also provided. The composition may comprise a nucleic acid described herein and optionally a pharmaceutically acceptable carrier. The composition may encompass modified oligonucleotides that are identical, substantially identical, substantially complementary or complementary to any nucleobase sequence version of the miRNAs described herein or a precursor thereof.
In certain embodiments, a nucleobase sequence of a modified oligonucleotide is fully identical or complementary to a miRNA nucleobase sequence listed herein, or a precursor thereof. In certain embodiments, a modified oligonucleotide has a nucleobase sequence having one mismatch with respect to the nucleobase sequence of the mature miRNA, or a precursor thereof. In certain embodiments, a modified oligonucleotide has a nucleobase sequence having two mismatches with respect to the nucleobase sequence of the miRNA, or a precursor thereof. In certain such embodiments, a modified oligonucleotide has a nucleobase sequence having no more than two mismatches with respect to the nucleobase sequence of the mature miRNA, or a precursor thereof. In certain such embodiments, the mismatched nucleobases are contiguous. In certain such embodiments, the mismatched nucleobases are not contiguous.
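The mismatch embodiments described above can be illustrated with a short sketch. The Python below is an illustration only and is not part of the disclosed methods: the function names and the 22-nucleotide example sequence are assumptions chosen for demonstration, sequences are compared position by position at equal length, and chemical modifications and G:U wobble pairing are ignored.

```python
from __future__ import annotations

# Illustrative sketch only; all names and the example sequence are hypothetical.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna.upper()))


def mismatch_positions(oligo: str, reference: str) -> list[int]:
    """Return the 0-based positions at which two equal-length sequences differ."""
    if len(oligo) != len(reference):
        raise ValueError("sequences must have the same length")
    pairs = zip(oligo.upper(), reference.upper())
    return [i for i, (a, b) in enumerate(pairs) if a != b]


def mismatches_contiguous(positions: list[int]) -> bool:
    """True if the mismatched positions form a single uninterrupted run."""
    return all(b - a == 1 for a, b in zip(positions, positions[1:]))


mirna = "UAGCUUAUCAGACUGAUGUUGA"  # arbitrary 22-nt example sequence
fully_complementary = reverse_complement(mirna)

# Introduce two adjacent mismatches into the complementary oligonucleotide.
oligo = fully_complementary[:5] + "GG" + fully_complementary[7:]
positions = mismatch_positions(oligo, fully_complementary)

print(len(positions), mismatches_contiguous(positions))
```

In this sketch an oligonucleotide with `len(positions) <= 2` would correspond to the "no more than two mismatches" embodiment, and `mismatches_contiguous` distinguishes the contiguous case from the non-contiguous one.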
In certain embodiments, a modified oligonucleotide consists of a number of linked nucleosides that is equal to the length of the mature miRNA.
In certain embodiments, the number of linked nucleosides of a modified oligonucleotide is less than the length of the mature miRNA. In certain such embodiments, the number of linked nucleosides of a modified oligonucleotide is one less than the length of the mature miRNA. In certain such embodiments, a modified oligonucleotide has one less nucleoside at the 5' terminus. In certain such embodiments, a modified oligonucleotide has one less nucleoside at the 3' terminus. In certain such embodiments, a modified oligonucleotide has two fewer nucleosides at the 5' terminus. In certain such embodiments, a modified oligonucleotide has two fewer nucleosides at the 3' terminus. A modified oligonucleotide having a number of linked nucleosides that is less than the length of the miRNA, wherein each nucleobase of a modified oligonucleotide is complementary to each nucleobase at a corresponding position in a miRNA, is considered to be a modified oligonucleotide having a nucleobase sequence that is fully complementary to a portion of a miRNA sequence.
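The notion of a shortened oligonucleotide that is fully complementary to a portion of a miRNA can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the invention: the helper names and the example sequence are hypothetical, only canonical Watson-Crick pairing is considered, and the truncations are applied to the termini of the oligonucleotide itself.

```python
# Illustrative sketch only; all names and the example sequence are hypothetical.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna.upper()))


def complementary_to_portion(oligo: str, mirna: str) -> bool:
    """True if the oligo pairs perfectly with one contiguous stretch of the miRNA."""
    return len(oligo) <= len(mirna) and reverse_complement(oligo) in mirna.upper()


mirna = "UAGCUUAUCAGACUGAUGUUGA"  # arbitrary 22-nt example sequence
full_length = reverse_complement(mirna)

one_fewer_at_5_prime = full_length[1:]    # drops the oligo's 5'-terminal nucleoside
two_fewer_at_3_prime = full_length[:-2]   # drops the oligo's two 3'-terminal nucleosides
with_mismatches = full_length[:5] + "GG" + full_length[7:]

for name, oligo in [
    ("one fewer at 5' terminus", one_fewer_at_5_prime),
    ("two fewer at 3' terminus", two_fewer_at_3_prime),
    ("two internal mismatches", with_mismatches),
]:
    print(name, len(oligo), complementary_to_portion(oligo, mirna))
```

The two truncated oligonucleotides remain fully complementary to a portion of the example sequence, while the full-length oligonucleotide carrying internal mismatches does not.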
In certain embodiments, a modified oligonucleotide consists of 15 to 30 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 19 to 24 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 21 to 24 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 15 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 16 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 17 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 18 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 19 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 20 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 21 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 22 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 23 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 24 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 25 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 26 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 27 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 28 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 29 linked nucleosides. In certain embodiments, a modified oligonucleotide consists of 30 linked nucleosides.
Modified oligonucleotides of the present invention may comprise one or more modifications to a nucleobase, sugar, and/or internucleoside linkage. A modified nucleobase, sugar, and/or internucleoside linkage may be selected over an unmodified form because of desirable properties such as, for example, enhanced cellular uptake, enhanced affinity for other oligonucleotides or nucleic acid targets and increased stability in the presence of nucleases. In certain embodiments, a modified oligonucleotide of the present invention comprises one or more modified nucleosides. In certain such embodiments, a modified nucleoside is a stabilizing nucleoside. An example of a stabilizing nucleoside is a sugar-modified nucleoside.
In certain embodiments, a modified nucleoside is a sugar-modified nucleoside. In certain such embodiments, the sugar-modified nucleosides can further comprise a natural or modified heterocyclic base moiety and/or a natural or modified internucleoside linkage and may include further modifications independent from the sugar modification. In certain embodiments, a sugar-modified nucleoside is a 2'-modified nucleoside, wherein the sugar ring is modified at the 2' carbon from natural ribose or 2'-deoxy-ribose. In certain embodiments, a 2'-O-methyl group is present in the sugar residue.
The modified oligonucleotides of the present invention can be generated according to any oligonucleotide synthesis method known in the art, including both enzymatic syntheses and solid-phase syntheses. Equipment and reagents for executing solid-phase synthesis are commercially available from, for example, Applied Biosystems. Any other means for such synthesis may also be employed; the actual synthesis of the oligonucleotides is well within the capabilities of one skilled in the art and can be accomplished via established methodologies as detailed in, for example: Sambrook, J. and Russell, D. W. (2001), "Molecular Cloning: A Laboratory Manual"; Ausubel, R. M. et al, eds. (1994, 1989), "Current Protocols in Molecular Biology," Volumes I-III, John Wiley & Sons, Baltimore, Md.; Perbal, B. (1988), "A Practical Guide to Molecular Cloning," John Wiley & Sons, New York; and Gait, M. J., ed. (1984), "Oligonucleotide Synthesis"; utilizing solid-phase chemistry, e.g. cyanoethyl phosphoramidite followed by deprotection, desalting, and purification by, for example, an automated trityl-on method or HPLC. It will be appreciated that an oligonucleotide comprising an RNA molecule can be also generated using an expression vector as is further described hereinbelow.
The compositions may be used for therapeutic applications. The pharmaceutical composition may be administered by known methods, including wherein a nucleic acid is introduced into a desired target cell in vitro or in vivo.
Methods for the delivery of nucleic acid molecules are described in Akhtar et al. (Trends Cell Bio 1992; 2:139). WO 94/02595 describes general methods for delivery of RNA molecules. These protocols can be utilized for the delivery of virtually any nucleic acid molecule. Nucleic acid molecules can be administered to cells by a variety of methods known to those familiar to the art, including, but not restricted to, encapsulation in liposomes, by iontophoresis, or by incorporation into other vehicles, such as hydrogels, cyclodextrins, biodegradable nanocapsules, and bioadhesive microspheres. Alternatively, the nucleic acid/vehicle combination is locally delivered by direct injection or by use of an infusion pump. Other routes of delivery include, but are not limited to, oral (tablet or pill form) and/or intrathecal delivery (Gold, Neuroscience, 76, 1153-1158, 1997). Other approaches include the use of various transport and carrier systems, for example, through the use of conjugates and biodegradable polymers. More detailed descriptions of nucleic acid delivery and administration are provided for example in WO93/23569, WO99/05094, and WO99/04819.
The nucleic acids can be introduced into tissues or host cells by any number of routes, including viral infection, microinjection, or fusion of vesicles. Jet injection may also be used for intra-muscular administration, as described by Furth et al. (Anal Biochem 1992; 205:365-368). The nucleic acids can be coated onto gold microparticles, and delivered intradermally by a particle bombardment device, or "gene gun" as described in the literature (see, for example, Tang et al, Nature 1992;356:152-154), where gold microprojectiles are coated with the DNA, then bombarded into skin cells.
Administration of a pharmaceutical composition of the present invention to a subject having cancer results in one or more clinically desirable outcomes. Such clinically desirable outcomes include reduction of tumor number or reduction of tumor size. Additional clinically desirable outcomes include the extension of overall survival time of the subject, and/or extension of progression-free survival time of the subject. In certain embodiments, administration of a pharmaceutical composition of the invention prevents an increase in tumor size and/or tumor number. In certain embodiments, administration of a pharmaceutical composition of the invention prevents metastatic progression. In certain embodiments, administration of a pharmaceutical composition of the invention slows or stops metastatic progression. In certain embodiments, administration of a pharmaceutical composition of the invention prevents the recurrence of tumors. In certain embodiments, administration of a pharmaceutical composition of the invention prevents recurrence of tumor metastasis.
Administration of a pharmaceutical composition of the present invention to cancer cells may result in desirable phenotypic effects. In certain embodiments, a modified oligonucleotide may stop, slow or reduce the uncontrolled proliferation of cancer cells. In certain embodiments, a modified oligonucleotide may induce apoptosis in cancer cells. In certain embodiments, a modified oligonucleotide may reduce cancer cell survival.
A miRNA hybridizes to an mRNA to regulate expression of the mRNA and its protein product. Generally, the hybridization of a miRNA to its mRNA target inhibits expression of the mRNA. Thus, the inhibition of a miRNA may result in the increased expression of a miRNA nucleic acid target. In certain embodiments, the inhibition of a miRNA results in the increase of a protein encoded by a miRNA nucleic acid target.
The present invention also relates to a pharmaceutical composition comprising the nucleic acids of the invention and optionally a pharmaceutically acceptable carrier.
The compositions may be used for diagnostic or therapeutic applications. The administration of the pharmaceutical composition may be carried out by known methods, wherein a nucleic acid is introduced into a desired target cell in vitro or in vivo.
Commonly used gene transfer techniques include calcium phosphate, DEAE-dextran, electroporation, microinjection, viral methods and cationic liposomes.
Cancer treatments often comprise more than one therapy. As such, in certain embodiments the present invention provides methods for treating cancer comprising administering to a subject in need thereof a compound comprising a modified oligonucleotide complementary to a miRNA, or a precursor thereof, and further comprising administering at least one additional therapy.
In certain embodiments, an additional therapy may also be designed to treat cancer. An additional therapy may be a chemotherapeutic agent. Suitable chemotherapeutic agents include 5-fluorouracil, gemcitabine, doxorubicin, mitomycin C, sorafenib, etoposide, carboplatin, epirubicin, irinotecan and oxaliplatin. An additional suitable chemotherapeutic agent includes a modified oligonucleotide, other than a modified oligonucleotide of the present invention, that is used to treat cancer.
In certain embodiments, an additional therapy may be designed to treat a disease other than cancer.
In certain embodiments, an additional therapy is a treatment that includes interferons, for example, interferon alfa-2b, interferon alfa-2a, and interferon alfacon-1.
Less frequent interferon dosing can be achieved using pegylated interferon (interferon attached to a polyethylene glycol moiety which significantly improves its pharmacokinetic profile). Combination therapy with interferon alfa-2b (pegylated and unpegylated) and ribavirin has also been shown to be efficacious for some patient populations. Other agents currently being developed include RNA replication inhibitors (e.g., ViroPharma's VP50406 series), antisense agents, therapeutic vaccines, protease inhibitors, helicase inhibitors and antibody therapy (monoclonal and polyclonal).
In certain embodiments, an additional therapy may be a pharmaceutical agent that enhances the body's immune system, including low-dose cyclophosphamide, thymostimulin, vitamins and nutritional supplements (e.g., antioxidants, including vitamins A, C, E, beta-carotene, zinc, selenium, glutathione, coenzyme Q-10 and echinacea), and vaccines, e.g., the immunostimulating complex (ISCOM), which comprises a vaccine formulation that combines a multimeric presentation of antigen and an adjuvant.
In certain such embodiments, the additional therapy is selected to treat or ameliorate a side effect of one or more pharmaceutical compositions of the present invention. Such side effects include, without limitation, injection site reactions, liver function test abnormalities, renal function abnormalities, liver toxicity, renal toxicity, central nervous system abnormalities, and myopathies. For example, increased aminotransferase levels in serum may indicate liver toxicity or liver function abnormality. For example, increased bilirubin may indicate liver toxicity or liver function abnormality.
In certain embodiments, one or more pharmaceutical compositions of the present invention and one or more other pharmaceutical agents are administered at the same time. In certain embodiments, one or more pharmaceutical compositions of the present invention and one or more other pharmaceutical agents are administered at different times. In certain embodiments, one or more pharmaceutical compositions of the present invention and one or more other pharmaceutical agents are prepared together in a single formulation. In certain embodiments, one or more pharmaceutical compositions of the present invention and one or more other pharmaceutical agents are prepared separately.
The compositions of the present invention can be formulated into pharmaceutical compositions by combination with appropriate, pharmaceutically acceptable carriers or diluents, and can be formulated into preparations in solid, semi-solid, liquid or gaseous forms, such as tablets, capsules, powders, granules, ointments, solutions, suppositories, injections, inhalants and aerosols. As such, administration of the agents can be achieved in various ways, including, but not limited to, oral, buccal, rectal, parenteral, transmucosal, intestinal, enteral, topical, suppository, through inhalation, intraperitoneal, intradermal, transdermal, intratracheal, intrathecal, intraventricular, intranasal, intraocular and intratumoral (e.g., intravenous, intramuscular, intramedullary, and subcutaneous). An additional suitable administration route includes chemoembolization. In certain embodiments, pharmaceutical intrathecals are administered to achieve local rather than systemic exposures. For example, pharmaceutical compositions may be injected directly in the area of desired effect (e.g., into a tumor).
In certain embodiments, a pharmaceutical composition of the present invention is administered in the form of a dosage unit (e.g., tablet, capsule, bolus, etc.). In certain embodiments, such pharmaceutical compositions comprise a modified oligonucleotide in a dose selected from 25 mg, 30 mg, 35 mg, 40 mg, 45 mg, 50 mg, 55 mg, 60 mg, 65 mg, 70 mg, 75 mg, 80 mg, 85 mg, 90 mg, 95 mg, 100 mg, 105 mg, 110 mg, 115 mg, 120 mg, 125 mg, 130 mg, 135 mg, 140 mg, 145 mg, 150 mg, 155 mg, 160 mg, 165 mg, 170 mg, 175 mg, 180 mg, 185 mg, 190 mg, 195 mg, 200 mg, 205 mg, 210 mg, 215 mg, 220 mg, 225 mg, 230 mg, 235 mg, 240 mg, 245 mg, 250 mg, 255 mg, 260 mg, 265 mg, 270 mg, 275 mg, 280 mg, 285 mg, 290 mg, 295 mg, 300 mg, 305 mg, 310 mg, 315 mg, 320 mg, 325 mg, 330 mg, 335 mg, 340 mg, 345 mg, 350 mg, 355 mg, 360 mg, 365 mg, 370 mg, 375 mg, 380 mg, 385 mg, 390 mg, 395 mg, 400 mg, 405 mg, 410 mg, 415 mg, 420 mg, 425 mg, 430 mg, 435 mg, 440 mg, 445 mg, 450 mg, 455 mg, 460 mg, 465 mg, 470 mg, 475 mg, 480 mg, 485 mg, 490 mg, 495 mg, 500 mg, 505 mg, 510 mg, 515 mg, 520 mg, 525 mg, 530 mg, 535 mg, 540 mg, 545 mg, 550 mg, 555 mg, 560 mg, 565 mg, 570 mg, 575 mg, 580 mg, 585 mg, 590 mg, 595 mg, 600 mg, 605 mg, 610 mg, 615 mg, 620 mg, 625 mg, 630 mg, 635 mg, 640 mg, 645 mg, 650 mg, 655 mg, 660 mg, 665 mg, 670 mg, 675 mg, 680 mg, 685 mg, 690 mg, 695 mg, 700 mg, 705 mg, 710 mg, 715 mg, 720 mg, 725 mg, 730 mg, 735 mg, 740 mg, 745 mg, 750 mg, 755 mg, 760 mg, 765 mg, 770 mg, 775 mg, 780 mg, 785 mg, 790 mg, 795 mg, and 800 mg. In certain such embodiments, a pharmaceutical composition of the present invention comprises a dose of modified oligonucleotide selected from 25 mg, 50 mg, 75 mg, 100 mg, 150 mg, 200 mg, 250 mg, 300 mg, 350 mg, 400 mg, 500 mg, 600 mg, 700 mg, and 800 mg.
In certain embodiments, a pharmaceutical agent is sterile lyophilized modified oligonucleotide that is reconstituted with a suitable diluent, e.g., sterile water for injection or sterile saline for injection. The reconstituted product is administered as a subcutaneous injection or as an intravenous infusion after dilution into saline. The lyophilized drug product consists of a modified oligonucleotide which has been prepared in water for injection, or in saline for injection, adjusted to pH 7.0-9.0 with acid or base during preparation, and then lyophilized. The lyophilized modified oligonucleotide may be 25-800 mg of a modified oligonucleotide. It is understood that this encompasses 25, 50, 75, 100, 125, 150, 175, 200, 225, 250, 275, 300, 325, 350, 375, 425, 450, 475, 500, 525, 550, 575, 600, 625, 650, 675, 700, 725, 750, 775, and 800 mg of modified lyophilized oligonucleotide. The lyophilized drug product may be packaged in a 2 mL Type I, clear glass vial (ammonium sulfate-treated), stoppered with a bromobutyl rubber closure and sealed with an aluminum FLIP-OFF® overseal.
In certain embodiments, the compositions of the present invention may additionally contain other adjunct components conventionally found in pharmaceutical compositions, at their art-established usage levels. Thus, for example, the compositions may contain additional, compatible, pharmaceutically-active materials such as, for example, antipruritics, astringents, local anesthetics or anti-inflammatory agents, or may contain additional materials useful in physically formulating various dosage forms of the compositions of the present invention, such as dyes, flavoring agents, preservatives, antioxidants, opacifiers, thickening agents and stabilizers. However, such materials, when added, should not unduly interfere with the biological activities of the components of the compositions of the present invention. The formulations can be sterilized and, if desired, mixed with auxiliary agents, e.g., lubricants, preservatives, stabilizers, wetting agents, emulsifiers, salts for influencing osmotic pressure, buffers, colorings, flavorings and/or aromatic substances and the like which do not deleteriously interact with the oligonucleotide(s) of the formulation.
In certain embodiments, pharmaceutical compositions of the present invention comprise one or more modified oligonucleotides and one or more excipients. In certain such embodiments, excipients are selected from water, salt solutions, alcohol, polyethylene glycols, gelatin, lactose, amylase, magnesium stearate, talc, silicic acid, viscous paraffin, hydroxymethylcellulose and polyvinylpyrrolidone.
In certain embodiments, a pharmaceutical composition of the present invention is prepared using known techniques, including, but not limited to, mixing, dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or tabletting processes. In certain embodiments, a pharmaceutical composition of the present invention is a liquid (e.g., a suspension, elixir and/or solution). In certain of such embodiments, a liquid pharmaceutical composition is prepared using ingredients known in the art, including, but not limited to, water, glycols, oils, alcohols, flavoring agents, preservatives, and coloring agents.
In certain embodiments, a pharmaceutical composition of the present invention is a solid (e.g., a powder, tablet, and/or capsule). In certain of such embodiments, a solid pharmaceutical composition comprising one or more oligonucleotides is prepared using ingredients known in the art, including, but not limited to, starches, sugars, diluents, granulating agents, lubricants, binders, and disintegrating agents.
In certain embodiments, a pharmaceutical composition of the present invention is formulated as a depot preparation. Certain such depot preparations are typically longer acting than non-depot preparations. In certain embodiments, such preparations are administered by implantation (for example, subcutaneously or intramuscularly) or by intramuscular injection. In certain embodiments, depot preparations are prepared using suitable polymeric or hydrophobic materials (for example, an emulsion in an acceptable oil) or ion exchange resins, or as sparingly soluble derivatives, for example, as a sparingly soluble salt.
In certain embodiments, a pharmaceutical composition of the present invention comprises a delivery system. Examples of delivery systems include, but are not limited to, liposomes and emulsions. Certain delivery systems are useful for preparing certain pharmaceutical compositions including those comprising hydrophobic compounds. In certain embodiments, certain organic solvents such as dimethylsulfoxide are used.
In certain embodiments, a pharmaceutical composition of the present invention comprises one or more tissue-specific delivery molecules designed to deliver the one or more pharmaceutical agents of the present invention to specific tissues or cell types. For example, in certain embodiments, pharmaceutical compositions include liposomes coated with a tissue-specific antibody.
In certain embodiments, a pharmaceutical composition of the present invention comprises a co-solvent system. Certain of such co-solvent systems comprise, for example, benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and an aqueous phase. In certain embodiments, such co-solvent systems are used for hydrophobic compounds. A non-limiting example of such a co-solvent system is the VPD co-solvent system, which is a solution of absolute ethanol comprising 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant Polysorbate 80™ and 65% w/v polyethylene glycol 300. The proportions of such co-solvent systems may be varied considerably without significantly altering their solubility and toxicity characteristics. Furthermore, the identity of co-solvent components may be varied: for example, other surfactants may be used instead of Polysorbate 80™; the fraction size of polyethylene glycol may be varied; other biocompatible polymers may replace polyethylene glycol, e.g., polyvinyl pyrrolidone; and other sugars or polysaccharides may substitute for dextrose.
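As a worked illustration of the percentages quoted for the VPD co-solvent system, the sketch below converts % w/v figures into the mass of each component needed for a chosen batch volume. It assumes the usual convention that 1% w/v means 1 g of component per 100 mL of final solution; the function name, the dictionary name and the 10 mL batch size are hypothetical and serve only as an example.

```python
# Illustrative sketch only; names and the 10 mL batch volume are hypothetical.
# Assumes % w/v = grams of component per 100 mL of final solution.
VPD_PERCENT_WV = {
    "benzyl alcohol": 3.0,
    "Polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}


def grams_required(percent_wv: dict, volume_ml: float) -> dict:
    """Return the mass in grams of each component for the given final volume."""
    if volume_ml <= 0:
        raise ValueError("volume must be positive")
    return {name: pct * volume_ml / 100.0 for name, pct in percent_wv.items()}


masses = grams_required(VPD_PERCENT_WV, 10.0)
for name, grams in masses.items():
    print(f"{name}: {grams:.2f} g, made up to 10 mL with absolute ethanol")
```

For a 10 mL batch this corresponds to 0.3 g benzyl alcohol, 0.8 g Polysorbate 80 and 6.5 g polyethylene glycol 300, with the remaining volume made up with absolute ethanol.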
In certain embodiments, a pharmaceutical composition of the present invention comprises a sustained-release system. A non-limiting example of such a sustained-release system is a semi-permeable matrix of solid hydrophobic polymers. In certain embodiments, sustained-release systems may, depending on their chemical nature, release pharmaceutical agents over a period of hours, days, weeks or months.
In certain embodiments, a pharmaceutical composition of the present invention is prepared for oral administration. In certain of such embodiments, a pharmaceutical composition is formulated by combining one or more compounds comprising a modified oligonucleotide with one or more pharmaceutically acceptable carriers. Certain of such carriers enable pharmaceutical compositions to be formulated as tablets, pills, dragees, capsules, liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a subject. In certain embodiments, pharmaceutical compositions for oral use are obtained by mixing oligonucleotide and one or more solid excipients. Suitable excipients include, but are not limited to, fillers, such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, gum tragacanth, methyl cellulose, hydroxypropylmethylcellulose, sodium carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). In certain embodiments, such a mixture is optionally ground and auxiliaries are optionally added. In certain embodiments, pharmaceutical compositions are formed to obtain tablets or dragee cores. In certain embodiments, disintegrating agents (e.g., cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt thereof, such as sodium alginate) are added. In certain embodiments, dragee cores are provided with coatings. In certain such embodiments, concentrated sugar solutions may be used, which may optionally contain gum arabic, talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be added to tablets or dragee coatings. In certain embodiments, pharmaceutical compositions for oral administration are push-fit capsules made of gelatin. Certain of such push-fit capsules comprise one or more pharmaceutical agents of the present invention in admixture with one or more fillers such as lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and, optionally, stabilizers. In certain embodiments, pharmaceutical compositions for oral administration are soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or sorbitol.
In certain soft capsules, one or more pharmaceutical agents of the present invention are dissolved or suspended in suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition, stabilizers may be added.
In certain embodiments, pharmaceutical compositions are prepared for buccal administration. Certain of such pharmaceutical compositions are tablets or lozenges formulated in conventional manner.
In certain embodiments, a pharmaceutical composition is prepared for administration by injection (e.g., intravenous, subcutaneous, intramuscular, etc.). In certain of such embodiments, a pharmaceutical composition comprises a carrier and is formulated in aqueous solution, such as water or physiologically compatible buffers such as Hanks's solution, Ringer's solution, or physiological saline buffer. In certain embodiments, other ingredients are included (e.g., ingredients that aid in solubility or serve as preservatives). In certain embodiments, injectable suspensions are prepared using appropriate liquid carriers, suspending agents and the like. Certain pharmaceutical compositions for injection are presented in unit dosage form, e.g., in ampoules or in multi-dose containers. Certain pharmaceutical compositions for injection are suspensions, solutions or emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending, stabilizing and/or dispersing agents. Certain solvents suitable for use in pharmaceutical compositions for injection include, but are not limited to, lipophilic solvents and fatty oils, such as sesame oil, synthetic fatty acid esters, such as ethyl oleate or triglycerides, and liposomes. Aqueous injection suspensions may contain substances that increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or dextran. Optionally, such suspensions may also contain suitable stabilizers or agents that increase the solubility of the pharmaceutical agents to allow for the preparation of highly concentrated solutions.
In certain embodiments, a pharmaceutical composition is prepared for transmucosal administration. In certain of such embodiments penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art.
In certain embodiments, a pharmaceutical composition is prepared for administration by inhalation. Certain of such pharmaceutical compositions for inhalation are prepared in the form of an aerosol spray in a pressurized pack or a nebulizer. Certain of such pharmaceutical compositions comprise a propellant, e.g., dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or other suitable gas. In certain embodiments using a pressurized aerosol, the dosage unit may be determined with a valve that delivers a metered amount. In certain embodiments, capsules and cartridges for use in an inhaler or insufflator may be formulated. Certain of such formulations comprise a powder mixture of a pharmaceutical agent of the invention and a suitable powder base such as lactose or starch.
In certain embodiments, a pharmaceutical composition is prepared for rectal administration, such as a suppository or retention enema. Certain of such pharmaceutical compositions comprise known ingredients, such as cocoa butter and/or other glycerides.
In certain embodiments, a pharmaceutical composition is prepared for topical administration. Certain of such pharmaceutical compositions comprise bland moisturizing bases, such as ointments or creams. Exemplary suitable ointment bases include, but are not limited to, petrolatum, petrolatum plus volatile silicones, and lanolin and water in oil emulsions. Exemplary suitable cream bases include, but are not limited to, cold cream and hydrophilic ointment.
In certain embodiments, a pharmaceutical composition of the present invention comprises a modified oligonucleotide in a therapeutically effective amount. In certain embodiments, the therapeutically effective amount is sufficient to prevent, alleviate or ameliorate symptoms of a disease or to prolong the survival of the subject being treated. Determination of a therapeutically effective amount is well within the capability of those skilled in the art.
In certain embodiments, one or more modified oligonucleotides of the present invention is formulated as a prodrug. In certain embodiments, upon in vivo administration, a prodrug is chemically converted to the biologically, pharmaceutically or therapeutically more active form of a modified oligonucleotide. In certain embodiments, prodrugs are useful because they are easier to administer than the corresponding active form. For example, in certain instances, a prodrug may be more bioavailable (e.g., through oral administration) than is the corresponding active form. In certain instances, a prodrug may have improved solubility compared to the corresponding active form. In certain embodiments, prodrugs are less water soluble than the corresponding active form. In certain instances, such prodrugs possess superior transmittal across cell membranes, where water solubility is detrimental to mobility. In certain embodiments, a prodrug is an ester. In certain such embodiments, the ester is metabolically hydrolyzed to carboxylic acid upon administration. In certain instances the carboxylic acid containing compound is the corresponding active form. In certain embodiments, a prodrug comprises a short peptide (polyaminoacid) bound to an acid group. In certain of such embodiments, the peptide is cleaved upon administration to form the corresponding active form.
In certain embodiments, a prodrug is produced by modifying a pharmaceutically active compound such that the active compound will be regenerated upon in vivo administration. The prodrug can be designed to alter the metabolic stability or the transport characteristics of a drug, to mask side effects or toxicity, to improve the flavor of a drug or to alter other characteristics or properties of a drug. By virtue of knowledge of pharmacodynamic processes and drug metabolism in vivo, those of skill in this art, once a pharmaceutically active compound is known, can design prodrugs of the compound (see, e.g., Nogrady (1985) Medicinal Chemistry A Biochemical Approach, Oxford University Press, New York, pages 388-392).
13. Kits
The present invention also relates to a kit, which may comprise a nucleic acid described herein together with any or all of the following: assay reagents, buffers, probes and/or primers, and sterile saline or another pharmaceutically acceptable emulsion and suspension base. In addition, the kits may include instructional materials containing directions (e.g., protocols) for the practice of the methods described herein.
For example, the kit may be a kit for the amplification, detection, identification or quantification of a target nucleic acid sequence. The kit may comprise a poly (T) primer, a forward primer, a reverse primer, and a probe.
Having now generally described the invention, the same will be more readily understood through reference to the following examples, which are provided by way of illustration and are not intended to be limiting of the present invention.
EXAMPLES
Example 1: Materials and methods
Deep Sequencing
Samples in which miRNA expression was identified include tumors (colon, bladder, breast, lung, liver, kidney, ovarian, prostate, esophagus, cervix, and pancreas), normal tissues (colon, bladder, breast, lung, liver, kidney, brain, endometrium, lymph nodes, and heart), metastases (breast, lung, kidney, endometrium, salivary gland, larynx, tongue, and melanocytes) and blood samples (blood cells, whole blood). RNAs used for deep sequencing libraries are as follows: bladder pool, colon pool, breast pool and lung pool.
RNA Isolation and Enrichment Phase
Total RNA was extracted twice from each of 23 human formalin-fixed paraffin-embedded (FFPE) samples derived from cancerous tissue [breast (n=5), bladder (n=5), colon (n=7), and lung (n=6)] (producing a total of 46 samples). RNA was isolated from ten 10-μm-thick tissue sections using the extraction protocol developed at Rosetta Genomics. Briefly, the sample was incubated repeatedly in xylene at 57°C to remove excess paraffin, followed by washing in ethanol. Proteins were degraded by incubation in proteinase K solution at 45°C for a few hours. The RNA was extracted with acid phenol:chloroform followed by ethanol precipitation and DNase digestion. Total RNA quantity and quality were checked by spectrophotometry (NanoDrop ND-1000). Pools of samples of the small RNA fraction within the total RNA were labeled and hybridized on arrays. After ensuring the presence and expression of more than 100 miRNAs per cancerous tissue pool, tissues were pooled together, resulting in a bladder+breast tumor pool and a colon+lung pool. Array expression revealed the presence of 157 miRNAs from bladder cancer FFPEs, 260 miRNAs from breast cancer FFPEs, 135 miRNAs from lung cancer FFPEs, and 239 miRNAs from colon cancer FFPEs. Total RNA (75 μg) of seven duplicate different colon cancer FFPEs were pooled together with 75 μg of six duplicate different lung cancer FFPEs, while 75 μg total RNA of five duplicate different bladder cancer FFPEs were pooled together with 75 μg of five duplicate different breast cancer FFPEs.
Cloning Linker Attachment Phase
The 3' and 5' cloning linkers (3' Linker: 5'-rAppCTGTAGGCACCATCAAT/3ddC/-3' (SEQ ID NO: 455); 5' Linker: 5'-TGGAATrUrCrUrCrGrGrGrCrArCrCrArArGrGrU-3' (SEQ ID NO: 456)) were ligated to purified small RNA species in preparation for cDNA synthesis and amplification.
Amplification
Reverse transcription of the linkered RNA species was carried out, followed by PCR amplification. Primer sequences were as follows:
RT: 5'-GATTGATGGTGCCTACAG-3' (SEQ ID NO: 457) (Tm: 50.2°C)
Fwd tag1 primer: (454 fwd1-BL1 mm)
5' GCCTCCCTCGCGCCATCAGcagtTGTAATTCTCGGTCACCAA 3' (SEQ ID NO: 458)
Rev tag1 primer: (454-Rev1-BL1)
5' GCCTTGCCAGCCCGCTCAGcatgATTGACGGTGCCTACAG 3' (SEQ ID NO: 459)
Fwd tag2 primer: (454 fwd2-BL1 mm)
5' GCCTCCCTCaCGCCATCAGtagtTGTAATTCTCGGTCACCAA 3' (SEQ ID NO: 460)
Rev tag2 primer: (454-Rev2-BL1)
5' GCCTTGCCAGCCCGCTCAGtagtATTGACGGTGCCTACAG 3' (SEQ ID NO: 461)
Small RNA Enrichment
The libraries were built using the MirCat kit, with several modifications, as described below. The mass of RNAs in the miRNA size range of 18 nt to 26 nt is very small relative to total RNA, so removal of as much competing mass as possible is essential. Therefore, enrichment of small RNA was carried out by recovering the small RNA fraction, identified by internal size markers, from a slice of a 12% denaturing (7 M Urea) polyacrylamide gel.
The synthetic RNA size markers run in the lane adjacent to the cancer samples were:
15 nt (5' GCAAAGCACACGGCC 3') (SEQ ID NO: 462),
22 nt (5' UAUGUAUCGAAUUUAAGCUCAA 3') (SEQ ID NO: 463) and 38 nt (5' GCAAGGAUGACACGCAAAUUCGUGAAGCGUUCCAUAUU 3') (SEQ ID NO: 464).
Cleaning the desired RNA from the gel was carried out by GeBAflex-tube-midi column using an electric current of 300 V for 40 min until the nucleic acid exited from the gel slice, followed by applying reverse polarity of the current for 120 seconds. This step releases the nucleic acid from the membrane. Isolated RNA was precipitated by adding 8 μl of linear acrylamide, a 1/10 volume of NaOAc 3 M, pH 5.2, and three volumes of cold 100% EtOH, with vortexing after each addition. The isolated RNA was precipitated overnight at -20°C, centrifuged for 1 h at 4°C at 14000 rpm, followed by washing with 1 ml cold 85% EtOH and subsequent centrifugation for 5 min at 14000 rpm.
RNA linkering
Following recovery of the enriched small RNA fraction from the acrylamide gel slice, the small RNAs were ligated with a 3' and a 5' linker in two separate reactions.
First, 3' ligation was performed in which the 3' linker was ligated to the small RNAs using T4 RNA ligase in the absence of ATP in order to avoid circularization of the RNA fragments, as described in Lau et al., 2001. The ligated product was purified by recovering the desired band, identified using size markers, from a slice of a 12% denaturing (7 M Urea) polyacrylamide gel. Two synthetic RNAs (24 nt and 38 nt, described previously) and two synthetic RNA transcripts (53 nt and 83 nt) were run adjacent to the cancer samples. Purification and precipitation were carried out as described previously.
The 5' linker was then ligated to the 3' linkered small RNAs in the presence of 1.0 mM ATP, followed by recovering the desired band from a slice of a 12% denaturing (7 M Urea) polyacrylamide gel with the same size markers. Purification and precipitation were carried out as described previously.
Reverse Transcription
The 5' and 3' ligated RNAs contained both RNA and DNA regions which were converted to DNA using reverse transcriptase with RT primer, according to MirCat protocol.
PCR amplification
The PCR amplification step was carried out using primers different from those provided by the MirCat kit, since the primers provided form strong self- and heterodimers. PCR was carried out using PfuUltra high fidelity DNA polymerase (Stratagene #600380) and pairs of longer PCR primers (40-42-mers) containing sequences complementary to the linkers, tag sequences and sequences suitable for the 454 platform. The colon and lung library was flagged with Tag1 and the breast and bladder library with Tag2, together with 454 sequences that convert the small RNA libraries into ones that can be directly sequenced on the 454 platform.
Samples from five PCR reactions were pooled, extracted with phenol:chloroform, followed by recovery of the desired band from slices of an 8% native polyacrylamide gel. The resulting library was sent for sequencing on the 454 platform.
Deep sequencing data analysis
The deep sequencing process yielded over 200,000 sequences from both libraries. Adaptors were removed using a Perl script allowing internal polyN sequences within the adaptors and 1 mismatch. About 1000 sequences were removed since they were too short after adaptor removal (<10 bp). The sequences were mapped to the human genome (UCSC hg18 build) using BLAST, allowing a maximum of three mismatched bp relative to the genome and a maximum insertion/deletion (indel) of three bp. For each aligned sequence the highest scoring hit was retrieved. All sequences with position overlap were clustered together using a Perl script. For example, if sequence X was mapped to positions 1-20 within the plus strand of chromosome 1 and a sequence Y was mapped to positions 15-35 on the same chromosome and strand, then the two sequences were unified in the same genomic cluster of chromosome 1, plus strand, positions 1-35. The clusters of sequences represent segments of expressed genes. Each genomic cluster of sequences was assigned the most abundant sequence in the cluster, and for candidate miRNAs the most abundant sequence was required to map precisely to the genome (not allowing any mismatches/indels).
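By way of non-limiting illustration, the clustering of sequences with position overlap may be sketched as follows. The sketch is written in Python rather than Perl, and the function name and the (chromosome, strand, start, end) tuple representation with inclusive coordinates are assumptions made for the example, not the script actually used.

```python
from collections import defaultdict


def cluster_overlapping(hits):
    """Merge mapped reads that overlap on the same chromosome and strand.

    `hits` is an iterable of (chrom, strand, start, end) tuples with
    inclusive coordinates (an assumed representation).
    Returns a sorted list of (chrom, strand, start, end) clusters.
    """
    by_locus = defaultdict(list)
    for chrom, strand, start, end in hits:
        by_locus[(chrom, strand)].append((start, end))

    clusters = []
    for (chrom, strand), intervals in sorted(by_locus.items()):
        intervals.sort()
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end:
                # Position overlap: extend the current cluster.
                cur_end = max(cur_end, end)
            else:
                # No overlap: close the current cluster and open a new one.
                clusters.append((chrom, strand, cur_start, cur_end))
                cur_start, cur_end = start, end
        clusters.append((chrom, strand, cur_start, cur_end))
    return clusters
```

Applied to the example given above, sequence X (chromosome 1, plus strand, positions 1-20) and sequence Y (positions 15-35) are unified into a single cluster spanning positions 1-35, whereas reads on the opposite strand remain in separate clusters.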
The next step was to annotate known sequences. The following datasets were used for this task: the RNA genes, sno/miRNA, RefSeq genes, and RepeatMasker tables, downloaded from the UCSC table browser, and known miRNA precursors, downloaded from miRBase, in order to mark whether the sequence is part of a noncoding gene, a snoRNA, a protein-coding gene exon, a genomic repeat, or a known miRNA precursor, respectively. The sequences of the novel miRNA candidates were extended by several hundred bp within their chromosomes in order to predict possible miRNA precursors. An extended sequence was intended to predict the folding of a pri-miRNA that contains a hairpin-folded pre-miRNA. The candidate pri-miRNAs were folded using the Vienna package {Hofacker, I.L. (2003) Nucleic Acids Res, 31, 3429-3431} or mfold {Zuker, M. (2003) Nucleic Acids Res, 31, 3406-3415} programs. All hairpin structures that had at least six base pairs, were at least 55 nucleotides long and had a loop not longer than 20 nucleotides were extracted from the minimum free energy fold of the predicted pri-miRNA (excluding overlapping hairpins). Each hairpin was assigned a Palgrade and conservation score. Predicted miRNA precursors have either Palgrade>0 (meaning it has structural characteristics of known miRNA) or have absolute value of conservation score>0.9 (conserved in mammals) {Bentwich et al. (2005) Nat Genet, 37, 766-770}. These criteria have a sensitivity of 86% for known miRNA precursors from miRBase 13.0. In addition, only sequences with ten or fewer genomic copies, with a length of 17-25 bp and a GC content in known miRNA range (15-90%) were chosen as miRNA candidates.
Microarray design
Custom microarrays (Biochips) were manufactured by Agilent Technologies by in situ synthesizing DNA oligonucleotide probes to 949 known microRNAs and 876 sequences printed in triplicate, and 8639 computationally predicted microRNAs printed in one copy. 44/49 of the novel miRNA and small RNAs were used in the microarray (five sequences were identified as novel miRNAs/small RNAs after the design of the microarray). Sequences from deep sequencing were characterized by:
1. Mapping to the human genome.
2. Being part of a predicted hairpin (folded by Vienna/Mfold).
3. Not being part of an annotated sequence (known miRNA, small RNA, coding exon).
4. Having less than 10 genomic occurrences.
5. MiRNA-sized (17-25 bp).
6. 10% < GC content < 90%.
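By way of non-limiting illustration, the six criteria listed above may be combined into a single filter, sketched below. The function names are hypothetical, and the number of genomic occurrences, the hairpin membership and the annotation status are assumed to have been computed beforehand by the mapping, folding and annotation steps.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def is_candidate(seq, genomic_occurrences, in_predicted_hairpin, is_annotated):
    """Apply criteria 1-6 to one sequence obtained by deep sequencing."""
    return (
        genomic_occurrences >= 1          # 1. maps to the human genome
        and in_predicted_hairpin          # 2. part of a predicted hairpin
        and not is_annotated              # 3. not part of an annotated sequence
        and genomic_occurrences < 10      # 4. fewer than 10 genomic occurrences
        and 17 <= len(seq) <= 25          # 5. miRNA-sized
        and 10.0 < gc_percent(seq) < 90.0  # 6. GC content between 10% and 90%
    )
```

For instance, a 22-nt sequence with a single genomic occurrence that lies in a predicted hairpin and is not annotated passes the filter, whereas the same sequence with ten genomic occurrences does not.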
Each probe comprised an antisense sequence of the relevant sequence, followed by a tail sequence (GCAATGCTAGCTATTGCTTGCTATTAAAAA) (SEQ ID NO: 465), trimmed so the final length of the probe would be 45 nucleotides. Seventeen negative control probes were designed using the sense sequences of different microRNAs. Two groups of positive control probes were designed to hybridize to the array: (i) synthetic small RNA that were spiked into the RNA before labeling to verify the labeling efficiency, and (ii) probes for abundant small nuclear RNAs that were spotted on the array to verify RNA quality.
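By way of non-limiting illustration, the probe layout described above may be sketched as follows. The function names are hypothetical, and the sketch assumes that the antisense portion is the DNA reverse complement of the target and that trimming removes bases from the 3' end of the tail; the text does not state which end is trimmed.

```python
TAIL = "GCAATGCTAGCTATTGCTTGCTATTAAAAA"  # tail sequence, SEQ ID NO: 465
PROBE_LENGTH = 45

# Complement table producing DNA bases; U in an RNA target pairs with A.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def antisense_dna(seq):
    """Return the DNA reverse complement of an RNA or DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_probe(target):
    """Antisense of the target followed by the tail, cut to 45 nucleotides.

    Assumption: the excess is removed from the 3' end of the tail.
    """
    return (antisense_dna(target) + TAIL)[:PROBE_LENGTH]
```

Under these assumptions, a 22-nt target yields a probe consisting of its 22-nt antisense sequence followed by the first 23 nucleotides of the tail.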
Microarray hybridization
Thirty-eight samples were hybridized to these microarrays. The samples were divided into normal (n = 8), tumor (n = 15), tumor adjacent (n = 5) and metastasis indications (n = 8).
A total of 2-2.5 μg of total RNA was labeled by ligation of an RNA-linker, p-rCrU-Cy/dye (Eurogentec S.A.; Cy3 or Cy5), to the 3' end. Synthetic small RNA was spiked into the RNA before labeling to verify the labeling efficiency. Slides were incubated with the labeled RNA for 12-16 h at 55°C and then washed according to Agilent GE washes for Agilent miRNA protocol. Arrays were scanned using Agilent DNA Microarray Scanner Bundle (Agilent Technologies, Santa Clara, CA) at a resolution of 5 micrometers, dual pass at 100% and 10% PMT power, green and red dye channels. Array images were analyzed using Agilent Feature Extraction software (version 9.5).
Array signal calculation and normalization:
Array images were analyzed using the Feature Extraction software (FE) 9.5.1 (Agilent, Santa Clara, CA). Triplicate spots were combined to produce one signal for each probe by taking the logarithmic mean of reliable spots. All data were log-transformed (natural base) and the analysis was performed in log-space. A reference data vector for normalization R was calculated by taking the median expression level of a subset of all probes (all miRs in miRBase 10) across samples. For each sample data vector S, a 2nd degree polynomial F was found so as to provide the best fit between the sample data and the reference data, such that R=F(S). For each probe in the sample (element Si in the vector S), the normalized value (in log-space) Mi was calculated from the initial value Si by transforming it with the polynomial function F, so that Mi=F(Si). P-values were calculated using a two-sided t-test on the log-transformed normalized fluorescence signal. The fold-difference (ratio of the median normalized fluorescence) was calculated for each microRNA. The signal of a sequence is defined as differential between sample "A" and sample "B" if the fold change between the signal in sample A and sample B is either larger than the 95th percentile of fold changes of all sequences expressed in both samples, or larger than 8.
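By way of non-limiting illustration, the normalization described above may be sketched as follows. The sketch assumes that each sample is a list of log-space signals in a fixed probe order, already restricted to the reference probe subset; the function names are hypothetical, and the plain least-squares solver stands in for whatever fitting routine was actually used.

```python
from statistics import median


def reference_vector(samples):
    """Reference R: per-probe median across samples (same probe order)."""
    return [median(values) for values in zip(*samples)]


def fit_quadratic(s, r):
    """Least-squares (a, b, c) such that r ~ a + b*s + c*s**2.

    Requires at least three distinct values in s.
    """
    # Normal equations (X^T X) coef = X^T r for the design X = [1, s, s^2].
    p = [sum(x ** k for x in s) for k in range(5)]
    q = [sum(y * x ** k for x, y in zip(s, r)) for k in range(3)]
    m = [[p[i + j] for j in range(3)] + [q[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(3):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [u - factor * v for u, v in zip(m[row], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def normalize(sample, reference):
    """Return M with Mi = F(Si), where F is the fitted 2nd degree polynomial."""
    a, b, c = fit_quadratic(sample, reference)
    return [a + b * x + c * x * x for x in sample]
```

In this sketch the polynomial F is fitted once per sample against the reference vector R and then applied to every element Si of that sample, so that a sample that differs from the reference by a smooth quadratic distortion is mapped back onto the reference.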
For every tumor/tissue type, the tumor sample was compared to the median signals of all other tumors, to the normal sample from the same tissue type where available, the relevant tumor adjacent sample where available, and metastatic samples originating in the same tissue where available. Each metastatic sample was also compared to normal samples originating from the same site.
Sequences meeting the criteria of having differential expression (as defined above) in at least one comparison, GC content<75%, and differing from other previously identified miRs, were chosen.
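By way of non-limiting illustration, the definition of differential expression given above may be sketched as follows. The function names are hypothetical; fold changes are assumed to be supplied as ratios for the sequences expressed in both samples, and the linear-interpolation percentile is one common convention chosen for the example rather than the one necessarily used.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in 0..100) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def differential(fold_changes, fixed_cutoff=8.0, q=95.0):
    """Flag each sequence whose fold change between samples A and B is
    larger than the q-th percentile of all fold changes, or larger than
    the fixed cutoff.

    `fold_changes` maps a sequence name to its fold change.
    """
    threshold = percentile(list(fold_changes.values()), q)
    return {
        name: fc > threshold or fc > fixed_cutoff
        for name, fc in fold_changes.items()
    }
```

A sequence flagged in at least one comparison would then still have to satisfy the remaining criteria (GC content below 75% and difference from previously identified miRs) before being chosen.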
Expression detection by qRT-PCR
RNA was subjected to a polyadenylation reaction. RNA was incubated in the presence of poly(A) polymerase (PAP; NEB M0276), MnCl2, and ATP for 1 h at 37°C. Then, using an oligodT primer harboring a consensus sequence, reverse transcription was performed on total RNA using SuperScript II RT (Invitrogen). Next, the cDNA was amplified by real-time PCR.
Sequences used in the reaction are microRNA-specific forward primers detailed in Table 4 below, a universal TaqMan probe (complementary to the 3' end of the oligodT plus part of the tail, SEQ ID NO: 470), and the universal reverse primer (complementary to the consensus 3' sequence of the oligodT tail, SEQ ID NO: 471). For each miR, expression signals were calculated by the formula 42 - Ct(miR-X).
Table 4: microRNA-specific forward primers sequences used in the qRT-PCR reaction
Figure imgf000080_0001
Example 2: Identified miR expression in tumor tissues, normal tissues and blood and metastatic tissues, for candidate miRs
Table 5 presents expression data for miRNAs identified in tumor tissues.
Table 5: log2 expression per tumor tissue per miR for candidate miRs
Figure imgf000081_0001
Figure imgf000082_0001
Figure imgf000083_0001
Figure imgf000084_0001
Figure imgf000085_0001
Figure imgf000086_0001
Figure imgf000087_0001
Figure imgf000088_0001
Figure imgf000089_0001
Table 6 presents expression data for miRNAs identified in normal tissues.
Table 6: log2 expression per normal tissue per miR for all candidate miRs
Figure imgf000089_0002
Figure imgf000090_0001
Figure imgf000091_0001
Figure imgf000092_0001
Figure imgf000093_0001
Figure imgf000094_0001
Figure imgf000095_0001
Figure imgf000096_0001
Figure imgf000097_0001
Figure imgf000098_0001
Table 7 presents expression data for miRNAs identified in metastatic tissue originating from primary tumors at the indicated sites and normal blood tissues.
Table 7: log expression per metastatic tissue and blood per miR for all candidate miRs
Figure imgf000098_0002
Figure imgf000099_0001
Figure imgf000100_0001
Figure imgf000101_0001
Figure imgf000102_0001
Figure imgf000103_0001
Figure imgf000104_0001
Figure imgf000105_0001
Example 3: Differential expression of miRNAs
Example 3.1: Differential expression of miRNAs in tumor vs. normal tissue
Table 8 presents expression of a list of miRNAs identified in tumor and normal tissues, measured by log2(signal). "Expression in the tumor" refers to median of expression in all tumor tissues tested. "Normal" refers to median of expression in all normal tissues tested. "P-value" refers to the significance of the change between all normal and all tumor tissues tested according to two-sided Student's t-test. "Fold change" refers to the fold change between all normal and all tumor tissues tested.
Table 8: Differential expression of miRNAs in median of all tumors vs. median of all normal tissues
Figure imgf000106_0001
Figure imgf000107_0001
Figure imgf000108_0001
Example 3.2: Differential expression of miRNAs in tumor vs. adjacent and normal tissue
Table 9 presents data comparing expression of a list of miRNAs identified in tumor vs. adjacent and normal tissues, measured by log2(signal). The term "adjacent" refers to an area of the same tissue and the same patient without tumor growth.
"Expression in normal tissue" refers to expression in non-tumor tissue of the same origin.
SEQ ID NOS: 7, 26, 34, 35, 37, 39, 53, 71, 72, 125, 127, 130, 132, 135, 159, 161, 165, 170 and 172 exhibit relatively high expression, and SEQ ID NOS: 34 and 38 exhibit relatively low expression, when comparing expression in a bladder tumor vs. normal bladder tissue.
SEQ ID NOS: 79, 99, 122, 130, 153, 154 exhibit relatively low expression, when comparing expression in a breast tumor vs. adjacent breast tissue.
SEQ ID NOS: 2, 15, 28, 29, 48, 58, 61, 70, 86, 97, 107, 110, 150, 156, 170 and 172 exhibit relatively high expression when comparing expression in a colon tumor vs. adjacent colon tissue.
SEQ ID NOS: 22, 25, 29, 31, 37, 39, 53, 64, 68, 72, 76, 77, 78, 84, 113, 119, 121, 127, 130, 132, 133, 136, 153, 161, 170 and 171 exhibit relatively high expression, and SEQ ID NOS: 16 and 57 exhibit relatively low expression, when comparing expression in an endometrium met. (metastasis) vs. normal endometrium tissue.
SEQ ID NOS: 26, 37, 39, 53, 71, 72, 98, 125, 127, 130, 135, 159 and 165, exhibit relatively high expression, and SEQ ID NOS: 16, 34, 83 and 140 exhibit relatively low expression, when comparing expression in kidney tumor vs. normal kidney tissue.
SEQ ID NOS: 11, 26, 71, 72, 77, 98, 134, 136, 153, 160, 170 and 171, exhibit relatively high expression, and SEQ ID NOS: 12, 14, 32, 34, 54, 89, 123, 126, 128 and 140, exhibit relatively low expression, when comparing expression in liver tumor vs. normal liver tissue. SEQ ID NOS: 14, 28-32, 42, 43, 52-55, 67, 75, 84, 85, 89, 97, 105, 123-126, 128, 129, 137, 144 and 154 exhibit relatively low expression when comparing expression in liver tumor vs. adjacent liver tissue.
SEQ ID NOS: 1, 7, 19, 21, 22, 26, 30, 35, 37, 39, 43, 46, 53, 58, 59, 63, 64, 68, 71, 72, 74, 78, 86, 87, 106, 110, 113, 116, 119, 125, 127, 132, 135, 141, 153, 159, 161, 164, 165 and 170-173 exhibit relatively high expression when comparing expression in lung tumor vs. normal lung tissue. SEQ ID NOS: 30, 51, 63, 171 exhibit relatively high expression when comparing expression in lung tumor vs. adjacent lung tissue.
Table 9: Differential expression of miRNAs in tumor vs. adjacent and normal tissue from same origin
Figure imgf000110_0001
Figure imgf000111_0001
Figure imgf000112_0001
Figure imgf000113_0001
Figure imgf000114_0001
Figure imgf000115_0001
Figure imgf000116_0001
Figure imgf000117_0001
Example 3.4: Differential expression of miRNAs in a specific tumor vs. other tumors
Table 10 presents data comparing expression of a list of miRNAs identified in a specific tumor vs. their expression in other tumors, measured by log2(signal).
SEQ ID NOS: 2, 13, 15, 21, 24, 28, 29, 31, 41, 42, 50, 58, 61, 62, 65, 68, 70, 80, 86, 90, 97, 102, 104, 107, 108, 110, 111, 114, 129, 144-146, 151, 153 and 175 exhibit high fold change when comparing expression in a colon tumor vs. other tumors. SEQ ID NOS: 5, 10, 21, 29, 43, 47, 51, 63, 68-70, 86, 110, 129, 143, 154, 165 and 166 exhibit high fold change when comparing expression in bile vs. other tumors. SEQ ID NOS: 13, 19, 21, 63, 78, 106, 116, 154, 171 and 173 exhibit high fold change when comparing expression in lung tumor vs. other tumors. This is also shown graphically in Figure 3. SEQ ID NOS: 21, 28, 29, 68, 86, 92, 97, 102, 108, 111, 151, 156, 163 and 174 exhibit high fold change when comparing expression in a pancreatic tumor vs. other tumors. SEQ ID NOS: 21, 68, 92, 111 and 174 exhibit high fold change when comparing expression in whole blood vs. other tumors. SEQ ID NOS: 21, 68, 92, 111 and 174 exhibit high fold change when comparing expression in blood cells vs. other tumors.
mets. = metastasis.
Table 10: Differential expression of miRNAs in a specific tumor vs. other tumors
Figure imgf000118_0001
Figure imgf000119_0001
Figure imgf000120_0001
Figure imgf000121_0001
Figure imgf000122_0001
Figure imgf000123_0001
Example 3.5: Differential expression of miRNAs in a specific tumor vs. sites of metastases
Tables 11A-C present data comparing expression of a list of miRNAs identified in a specific tumor vs. their expression in other tumors, measured by log2(signal). Table 11A details the differential expression of miRNAs in a specific primary tumor vs. metastasis originating from the same tissue or the metastasis vs. the median of metastases originating from other tissues; Table 11B details miR expression in a tumor originating from a specific tissue, in metastases of same tissue to the lung, in normal lung tissue, in lung primary tumors (normalized to the metastases), and in median of metastasis of other origins; Table 11C details miR expression in a tumor originating from a specific tissue, in metastasis of same tissue to the lymph node, in normal lymph node tissue, in lymph node primary tumors, and in median of metastasis of other origins.
"Specific tumor signal" refers to the expression in a specific tumor tissue. "Mets signal" refers to expression in the metastasis originating from the specific tissue. "Other Mets Signal" refers to expression in median of metastasis of various origins. Figure 1 shows differential expression of miRs, comparing the median values of each miR in breast primary tumor with breast metastases into lymph nodes. Figure 2 shows differential expression of miRs, comparing the median values of each miR in colon tumors with the corresponding median for their adjacent tissues. Figure 3 shows differential expression of miRs, comparing the median values of each miR in lung tumors with the corresponding median for other tumors from the following tissues: bile duct, bladder, breast, colon, kidney, liver, lung, ovary, pancreas, and prostate.
Table 11A: miR expression in tumor originating from a specific tissue, in metastasis originating from same tissue, and in median of metastasis originating from other tissues
Figure imgf000124_0001
Table 11B: miR expression in tumor originating from a specific tissue, in metastasis of same tissue to the lung, in normal lung tissue, in lung primary tumors, and in median of metastasis of other origins
Figure imgf000125_0001
Figure imgf000126_0001
Figure imgf000127_0001
Figure imgf000128_0001
Figure imgf000129_0001
Figure imgf000130_0001
Figure imgf000131_0001
Figure imgf000132_0001
Figure imgf000133_0001
Figure imgf000134_0001
Figure imgf000135_0001
Figure imgf000136_0001
Figure imgf000137_0001
Table 11C: miR expression in tumor originating from a specific tissue, in metastasis of same tissue to the lymph node, in normal lymph node tissue and in median of metastasis of other origins
Figure imgf000137_0002
Figure imgf000138_0001
Figure imgf000139_0001
Figure imgf000140_0001
Example 3.6: Differential expression of miRNAs in blood vs. colon tumor or adjacent colon tissue
Table 12 presents data comparing expression of a list of miRNAs measured by log2(signal). "Colon tumor expression" refers to expression in a colon tumor. "Fold change blood vs. colon tumor signal" refers to the fold change of expression in a blood tissue vs. colon tumor. "Expression in adjacent colon" refers to expression in colon tissue adjacent to the specific tumor. This is shown graphically in Figure 2. "Fold change blood vs. adjacent site" refers to the fold change obtained upon comparison of the blood tissue vs. adjacent colon tissue.
Blood cells and whole blood samples were collected from normal individuals. miRNA expression was measured in both sample types and compared with miRNA expression in colon tumors and in normal colon tissue adjacent to a tumor site. SEQ ID NOS: 21, 68, 111 and 174 exhibit high fold change in expression in blood cells or whole blood or both when compared to expression in both colon tumor tissue and colon tissue adjacent to the tumor site. SEQ ID NO: 92 exhibits low fold change in expression in both blood cells and whole blood when compared to expression in both colon tumor tissue and colon tissue adjacent to the tumor site.
Table 12: Differential expression of miRNAs in blood vs. colon tumor or adjacent colon tissue
Figure imgf000141_0001
Figure imgf000142_0001
Example 4: Deep sequencing of small RNAs from solid tumor samples
Example 4.1: Expression and sequence variability of known microRNAs
The sequencing process yielded 141,023 sequences from the bladder+breast tumor pool and 90,986 sequences from the colon+lung pool. After combining identical sequences, 27,968 unique sequences remained, 81% of which are 17-26 nt long, accounting for 93% of all redundant sequences. As described in Example 1, these sequences were further analyzed by aligning each sequence, using BLAST, to the human genome allowing a maximum of three nucleotides mismatched relative to the genome and a maximum insertion/deletion of three base pairs. This yielded ~723,000 genomic loci of mapped sequences; 83% of the unique sequences were mapped to the human genome using these criteria and 59% of the unique sequences were mapped with a maximum of one nucleotide mismatch. Subsequently, ~565,000 clusters of sequences with position overlap were created.
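The step of combining identical sequences, and the two percentages reported for the 17-26 nt window (share of unique sequences and share of all redundant reads), can be sketched as follows. This is an illustrative sketch only: the function names and the toy reads are assumptions, not the software or data used in the study.

```python
from collections import Counter


def collapse_reads(reads):
    """Collapse identical reads into unique sequences with their read counts."""
    return Counter(reads)


def length_window_stats(counts, min_len=17, max_len=26):
    """Return (fraction of unique sequences, fraction of all reads) in the window."""
    in_window = [s for s in counts if min_len <= len(s) <= max_len]
    unique_frac = len(in_window) / len(counts)
    read_frac = sum(counts[s] for s in in_window) / sum(counts.values())
    return unique_frac, read_frac


# Toy reads, invented for illustration: two 22-nt sequences and one 4-nt fragment.
reads = (["UGAGGUAGUAGGUUGUAUAGUU"] * 3
         + ["ACGU"]
         + ["UAGCUUAUCAGACUGAUGUUGA"] * 2)

counts = collapse_reads(reads)
unique_frac, read_frac = length_window_stats(counts)
print(len(counts), unique_frac, read_frac)
```

Here two of the three unique sequences, and five of the six reads, fall in the 17-26 nt window, which mirrors how a minority of unique sequences outside the window can account for an even smaller share of total reads.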
When the mapped sequenced reads were compared to known miRNAs from the Sanger miRBase registry (release 10), according to genomic position overlap, inclusion of sequenced reads in mature miRNA sequences or inclusion of mature miRNA sequences in the sequenced reads, the small RNA libraries were found to be enriched with human miRNAs. Known miRNAs occupied 61% (140,255 of 230,740) of the total small RNA reads. Three hundred and eighty-seven out of 885 (44%) human miRNAs were sequenced in at least one read in the different tumor libraries. Most miRNAs were sequenced in several sequence variants that were previously referred to as isomiRs. The different isomiRs were predominantly variable in the 3' end of the mature miRNA sequence, a region which is less precisely defined than the miRNA 5' end. For numerous known miRNAs the most abundant isomiR in the cancer tissue survey was much more abundant (at least 20%) than the reference miRNA sequence from the miRBase database. This suggests that the relative abundance of isomiRs may be inherently different between normal tissue and tumors. Additionally, several known miRNAs had an abundant isomiR with at least one mismatch to the human genome sequence, suggestive of the discovery of novel miRNA-related SNPs/cancer mutations or post-transcriptional modification of the miRNAs. All of these isomiRs were expressed in at least the same number of reads as the miRBase isomiR. Most of the sequence modifications (69%) occurred in the 3' end of the miRNA and involved either DNA base modification, 3' uridylation or 3' adenylation. 3' additions of G or C were completely absent. The high abundance and the specificity of the 3' terminal single nucleotide insertions suggest that these are regulated post-transcriptional modifications and not DNA-level changes (SNPs/mutations), which are expected to occur in a more random manner. Several sequence modifications that occur internally within the miRNA sequence were also noted.
These isomiRs demonstrated primarily (77%) C->T or A->G nucleotide modifications, again suggesting involvement of post-transcriptional RNA editing by cytidine deaminase or ADAR enzymes, respectively, rather than DNA-level changes.
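The two kinds of isomiR variation discussed above (3' terminal additions and internal single-nucleotide changes such as C->T or A->G) can be distinguished by a simple comparison of a read against the reference mature sequence. The sketch below is a simplification under stated assumptions: `classify_isomir` is a hypothetical helper, the sequences are invented, DNA letters are used as in the text, and no genome lookup is performed, so it cannot tell a templated 3' extension from a non-templated one.

```python
def classify_isomir(reference: str, read: str) -> str:
    """Classify a read relative to a reference mature miRNA sequence."""
    if read == reference:
        return "reference"
    # Read equals the reference plus extra nucleotides at the 3' end.
    if read.startswith(reference):
        return "3' addition: " + read[len(reference):]
    # Read is a 3'-shortened version of the reference.
    if reference.startswith(read):
        return "3' trimmed"
    # Same length with exactly one mismatch: single-nucleotide substitution.
    if len(read) == len(reference):
        diffs = [(i + 1, r, q)
                 for i, (r, q) in enumerate(zip(reference, read)) if r != q]
        if len(diffs) == 1:
            pos, ref_nt, read_nt = diffs[0]
            return f"substitution {ref_nt}->{read_nt} at position {pos}"
    return "other"


# Invented reference and reads, for illustration only.
ref = "TGAGGTAGTAGGTTGTATAGTT"
print(classify_isomir(ref, ref + "A"))                   # 3' addition: A
print(classify_isomir(ref, "TGGGGTAGTAGGTTGTATAGTT"))    # substitution A->G at position 3
```

Tallying these labels over all reads assigned to a miRNA would give per-category proportions of the kind quoted in the text.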
Example 4.2: Deep sequence identification of novel miRNAs and miRNA-like small RNAs
Example 4.2.1: miRNAs derived from known miRNA precursors
One group of novel miRNAs and miRNA-like small RNAs expressed in the tested cancer tissues includes miRNAs derived from known miRNA precursors (SEQ ID NOS: 472-479), mostly miRNA star sequences of known human miRNAs. miRNA star sequences are ~22-nt RNA species nearly complementary to a known miRNA, which are located within the miRNA precursor and which may have an inhibitory activity.
In several cases the novel complementary mature miRNA was more abundant than the known miRNA, suggesting that the identified miRNA is the major active product of the miRNA precursor, at least in the tested tumor samples. In addition, several cases of miRNA-offset RNAs (MORs), a miRNA-like group that was recently characterized {Shi, W., et al. (2009) Nat Struct Mol Biol, 16, 183-189}, were identified. MORs are part of the miRNA precursor and are processed from a ~22 bp dsRNA region, directly upstream of the miRNA-miRNA star dsRNA region. All MORs sequenced in the human tumors are highly conserved, derived exclusively from the 5' stem of the miRNA precursor directly upstream of the 5' miRNA, and expressed at low levels relative to the main miRNA product of the precursor. The MORs identified here tend to be located in a region of lower dsRNA stability than the main miRNA-miRNA star pair of the miRNA precursor. Therefore, the miRNA precursor of a MOR may switch between different folded RNA structures, only part of which accommodates the MOR in a dsRNA region that would be processed by the canonical miRNA pathway. This may explain the relatively low expression of MORs in comparison to the main mature miRNAs of the precursors.
Example 4.2.2: miRNAs derived from novel miRNA precursors
An additional group of novel miRNAs and miRNA-like small RNAs expressed in the tested cancer tissues includes completely novel miRNAs from novel miRNA precursors. Only reads that were exactly mapped to the genome were used. Reads that were mapped to more than 10 loci were filtered out, since human miRNAs rarely map to more than a few genomic loci. Other reasons for which sequences were discarded include rare occurrence (i.e. very few reads), length exceeding normal miRNA length and %GC higher than the %GC of known miRNAs. After filtering out by these criteria, as well as filtering sequences located within already annotated sequences (known miRNAs, other small RNAs, transposons, coding exons), miRNA precursors were predicted by folding several hundred bp flanking the final miRNA candidates using RNAfold {Denman, R.B. (1993) Biotechniques, 15, 1090-1095}. In order to reduce the number of false positive predictions, only predicted miRNA precursors that were either evolutionarily conserved or had structural features of known miRNAs were kept. Such structural features include limited length of bulges and loops and low folding energy. A miRNA precursor score was computed by integrating these parameters {Bentwich, I., et al. (2005) Nat Genet, 37, 766-770}. This process resulted in the identification of small RNAs SEQ ID NOS: 480-494 and precursors SEQ ID NOS: 496-507, 509-514.
Example 4.2.3: miRNA-like sequences derived from annotated small RNAs and genomic repeats
Another group of novel miRNAs and miRNA-like small RNAs, expressed in the tested cancer tissues, contained miRNA-like sequences derived from annotated small RNAs and genomic repeats. Several miRNAs were previously described as having been derived from such genetic elements. Sequences whose length exceeded the conventional size of miRNAs (17-25 bp) were discarded. MiRNA precursors were predicted using RNAfold and mFold and the precursor score described above. Finally, only sequences with at least 10 reads were taken, in order to ensure that the identified novel miRNAs were likely to be consistent products of enzymatic excision and not rare degradation products. This strict criterion was used for this group only as these derive from known RNA species that are often highly expressed and their degradation products are expected to be found in the cell, therefore their re-annotation as miRNAs needs stronger evidence. This process revealed 3 miRNA-like sequences, SEQ ID NOS: 92, 93 and 495.
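The read-level filters described in Examples 4.2.2 and 4.2.3 (exact genomic mapping, at most 10 genomic loci, a minimum read count, a length window, and a %GC ceiling) can be sketched as a single predicate. This is an illustrative sketch, not the pipeline actually used: the `Candidate` fields and the function names are hypothetical, the 70% GC ceiling is an assumed placeholder because the text gives no numeric value, and the text applies the 10-read minimum only to the group derived from annotated small RNAs and repeats.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    sequence: str       # read sequence (DNA letters)
    read_count: int     # number of sequenced reads
    genomic_loci: int   # number of genomic loci the read maps to
    mismatches: int     # mismatches in the best genomic alignment


def gc_percent(seq: str) -> float:
    """Percentage of G and C nucleotides in the sequence."""
    return 100.0 * sum(nt in "GC" for nt in seq.upper()) / len(seq)


def passes_filters(c: Candidate, min_reads: int = 10, max_loci: int = 10,
                   min_len: int = 17, max_len: int = 25,
                   max_gc: float = 70.0) -> bool:
    """Apply the filters; max_gc is an assumed value, not from the text."""
    return (c.mismatches == 0
            and c.genomic_loci <= max_loci
            and c.read_count >= min_reads
            and min_len <= len(c.sequence) <= max_len
            and gc_percent(c.sequence) <= max_gc)


# Invented candidates, for illustration only.
candidates = [
    Candidate("TGAGGTAGTAGGTTGTATAGTT", 200, 1, 0),   # passes all filters
    Candidate("TGAGGTAGTAGGTTGTATAGTT", 3, 1, 0),     # too few reads
    Candidate("GCGCGCGGCCGCGGCGCGCC", 50, 1, 0),      # %GC too high
    Candidate("TGAGGTAGTAGGTTGTATAGTT", 50, 25, 0),   # maps to too many loci
]
kept = [c for c in candidates if passes_filters(c)]
print(len(kept))  # 1
```

Candidates surviving such a filter would then go on to precursor prediction and scoring, which this sketch does not attempt.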
One of the candidates in this group, MID-24078 (SEQ ID NO: 495), is derived from a local hairpin-fold of an Alu repeat. The other two [MID-19433 (SEQ ID NO: 92), MID-19434 (SEQ ID NO: 93)] are, interestingly, derived from Y RNAs. Y RNAs are relatively unexplored noncoding RNA species that are implicated in chromosomal DNA replication {Krude, T., et al, J Cell Sci, 122, 2836-2845} and RNA quality control {Sim, et al, (2009) Mol Biol Cell, 20, 1555-1564}. Y RNAs have been shown to be overexpressed in solid tumors (Christov et al, Br J Cancer 2008;98(5):981-988), and thus may have potential for the diagnosis of cancer.
MID-19434 (SEQ ID NO: 93) is a 25-nt long RNA derived from a ~100-nucleotide-long hY3 RNA-like sequence. This sequence was highly expressed, with 200 sequenced reads, which is more abundant than over 300 known miRNAs sequenced in the analyzed tumor samples. The predicted well-folded precursor of this miRNA (SEQ ID NOS: 235-242) is precisely aligned to the hY3 RNA (GenBank number NR_004392.1), suggesting that the Y RNA is processed, possibly by Dicer, to yield a 25-nt mature miRNA. MID-19433 (SEQ ID NO: 92) is derived from hairpin-folded hY1 Y RNAs (GenBank number NR_004391.1).
Example 4.2.4: human siRNA
Endogenous siRNAs are ~21-bp-long RNA species that are processed from a dsRNA by Dicer and assembled in the RNA induced silencing complex (RISC). These were recently described in mouse oocytes {Watanabe, T., et al. (2008) Nature, 453, 539-543}, but have not yet been identified in the human transcriptome. The identified candidate human endogenous siRNA is a ~20-nt dsRNA that could be derived from bidirectional transcription of the same locus. Six sequenced reads (SEQ ID NOs: 450-454) are transcribed in the same orientation as a mitochondrial tRNA as well as a tRNA-derived pseudogene in one of the chromosomes, which is the more likely source of the siRNA sequences. Their transcription starts in the transcription start site of the tRNA, suggesting that these sequences are processed from the tRNA transcripts. Two antisense reads create a dsRNA with a short 5' overhang, as opposed to common siRNAs, which are characterized by a 3' overhang. The sense and antisense reads are mapped to nine different genomic loci. Therefore, it is also possible that the complementary sequences were derived from independent single-stranded RNAs and not from a hybridized dsRNA. This is thought to be the first time that endogenous siRNAs have been identified in the human genome.
Example 5: RT-PCR validation
As indicated in Figure 4, several of the newly identified miRNAs: MID-19433 (SEQ ID NO: 92), MID-19434 (SEQ ID NO: 93) and MID-16489 (SEQ ID NO: 31) were found to be expressed in serum of healthy people, using a third platform, RT-PCR.
The two novel small RNAs that are most abundantly expressed in different tumors in all platforms (high throughput sequencing, microarray, and RT-PCR), MID-19433 (SEQ ID NO: 92) and MID-19434 (SEQ ID NO: 93), are derived from small cytoplasmic Y RNA. The potential importance of these two novel miRNA-like sequences in tumorigenesis is supported by a recent work reporting that the Y RNAs hY1 and hY3, which are the unprocessed Y RNAs of MID-19433 and MID-19434, respectively, are overexpressed in carcinomas of the bladder, cervix, colon, kidney, lung and prostate {Christov, C.P., et al, (2008) Br J Cancer, 98, 981-988}. Therefore, measuring the highly expressed small RNAs from these Y RNAs can be used for molecular diagnostics of these cancers. The fact that these were also detected in serum gives them potential for use in non-invasive assays.
The present invention is not to be limited in scope by the specific embodiments described herein. Indeed, various modifications of the invention in addition to those described herein will become apparent to those skilled in the art from the foregoing description. Such modifications are intended to fall within the scope of the appended claims.
All patents, applications, publications, test methods, literature, and other materials cited herein are hereby incorporated by reference in their entirety as if physically present in this specification.

Claims

CLAIMS:
1. An isolated nucleic acid comprising a sequence selected from the group consisting of:
(a) SEQ ID NOS: 31, 92-93, 450-454, 1, 2, 4-7, 9-26, 28-30, 32-35, 37-43, 46-72, 74-81, 83-91, 95-102, 104-108, 110-114, 116-157, 159-175 and 472-495;
(b) a DNA encoding (a);
(c) the complementary sequence of any one of (a) and (b); and
(d) a sequence at least 80% identical to (a) - (c),
wherein said nucleic acid is 16-26 nucleotides in length.
2. The nucleic acid of claim 1, wherein said sequence in part (d) has an identity of at least 90% to the sequence in parts (a) - (c).
3. An isolated nucleic acid comprising a sequence selected from the group consisting of:
(a) SEQ ID NOS: 192, 232-242, 176-187, 189-191, 193-196, 198-231, 243-267, 269-326, 328-330, 332-340, 342, 343, 345-350, 352-361, 363-369, 371-387, 389-413, 415-432, 434-438, 440-449 and 496-514;
(b) a DNA encoding (a);
(c) the complementary sequence of any one of (a) and (b); and
(d) a sequence at least 80% identical to (a) - (c),
wherein said nucleic acid is 50-150 nucleotides in length.
4. The nucleic acid of claim 3, wherein said sequence in part (d) has an identity of at least 90% to the sequence in parts (a) - (c).
5. An isolated nucleic acid comprising an endogenous human siRNA.
6. The isolated nucleic acid of claim 5, wherein said nucleic acid comprises a sequence selected from the group consisting of:
(a) SEQ ID NOS: 450-454;
(b) a DNA encoding (a);
(c) the complementary sequence of any one of (a) and (b); and
(d) a sequence at least 80% identical to (a) - (c),
wherein said nucleic acid is 16-26 nucleotides in length.
7. The nucleic acid of claim 6, wherein said sequence in part (d) has an identity of at least 90% to the sequence in parts (a) - (c).
8. An isolated nucleic acid of claim 1, wherein said nucleic acid comprises a sequence selected from the group consisting of:
(a) SEQ ID NOS: 53 and 162;
(b) SEQ ID NOS: 70 and 110;
(c) SEQ ID NOS: 14 and 120;
(d) SEQ ID NOS: 63, 106 and 58; and
(e) SEQ ID NOS: 135 and 159.
9. The isolated nucleic acid of any of claims 1, 3 and 6, wherein said nucleic acid is a modified oligonucleotide.
10. A composition comprising an isolated nucleic acid of any of claims 1, 3 and 6.
11. The composition of claim 10, wherein said composition is suitable for diagnostic applications.
12. The composition of claim 10, wherein said composition is suitable for therapeutic applications.
13. The composition of claim 12, further comprising a pharmaceutically acceptable carrier.
14. The composition of claim 10, as a marker or modulator of cancer.
15. A recombinant expression vector comprising an isolated nucleic acid of any of claims 1, 3 and 6.
16. A probe comprising an isolated nucleic acid of any of claims 1, 3 and 6.
17. A biochip comprising the probe of claim 16.
18. A host cell comprising an isolated nucleic acid of any of claims 1, 3 and 6.
19. A method for diagnosing a cancer in a subject comprising:
(a) obtaining a biological sample from said subject;
(b) determining an expression profile of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 31, 4, 7, 11, 16, 17, 21, 22, 23, 26, 30, 33-35, 37, 39, 46, 47-49, 51, 52, 53, 56, 58, 60, 63, 64, 66, 68, 71, 72, 74, 76-78, 83, 86-88, 90, 96, 98, 100, 101, 106, 110-114, 116, 117, 119-121, 127, 129, 130, 132-136, 138, 141, 144, 145, 147, 148, 152, 153, 157, 159-162, 165, 167, 169-174, a fragment thereof, and a sequence having at least about 80% identity thereto from said sample; and
(c) comparing said expression profile to a reference expression profile representing the expression levels of any of said nucleic acids in healthy controls,
wherein the comparison of said expression profile to said reference expression profile allows for diagnosis of said cancer.
20. The method of claim 19, wherein said cancer is selected from the group consisting of colon, bladder, breast, lung, liver, kidney, ovarian, prostate, esophagus, cervix, and pancreatic cancer.
21. The method of claim 19, wherein relatively high expression levels of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 31, 4, 7, 11, 17, 21, 22, 23, 26, 30, 33, 35, 37, 39, 46, 48, 49, 51, 52, 53, 56, 58, 60, 63, 64, 66, 68, 71, 72, 74, 76-78, 86-88, 90, 96, 98, 100, 101, 106, 110-114, 116, 117, 119-121, 127, 129, 130, 132-136, 138, 141, 144, 145, 147, 148, 152, 153, 157, 159-162, 165, 167 and 169-174, as compared to said reference expression profile, is indicative of cancer.
22. The method of claim 19, wherein relatively low expression levels of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 16, 34, 47 and 83, as compared to said reference expression profile, is indicative of cancer.
23. A method of diagnosing an increased risk of colon cancer in a subject comprising:
(a) obtaining a biological sample from said subject;
(b) determining an expression profile of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 21, 68, 111 and 174, a fragment thereof, and a sequence having at least about 80% identity thereto from said sample; and
(c) comparing said expression profile to a reference expression profile representing the expression levels of any of said nucleic acids in healthy controls,
wherein a high expression level of said nucleic acid sequence is indicative of an increased risk of colon cancer in a subject.
24. A method of diagnosing an increased risk of colon cancer in a subject comprising:
(a) obtaining a biological sample from said subject;
(b) determining an expression profile of SEQ ID NO: 92, a fragment thereof, and a sequence having at least about 80% identity thereto from said sample; and
(c) comparing said expression profile to a reference expression profile representing the expression level of said nucleic acid in healthy controls,
wherein a low expression level of said nucleic acid sequence is indicative of an increased risk of colon cancer in a subject.
25. A method of diagnosing colon cancer in a subject comprising:
(a) obtaining a biological sample from said subject;
(b) determining an expression profile of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 2, 15, 28, 29, 48, 58, 61, 70, 86, 97, 107, 110, 150, 156, 170 and 172, a fragment thereof, and a sequence having at least about 80% identity thereto from said sample; and
(c) comparing said expression profile to a reference expression profile representing the expression level of said nucleic acid in healthy controls,
wherein relatively high expression levels of any of said nucleic acid sequences, as compared to said reference expression profile, is indicative of colon cancer.
26. A method of diagnosing lung cancer in a subject comprising:
(a) obtaining a biological sample from said subject;
(b) determining an expression profile of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 1, 7, 19, 21, 22, 26, 30, 35, 37, 39, 43, 46, 51, 53, 58, 59, 63, 64, 68, 71, 72, 74, 78, 86, 87, 106, 110, 113, 116, 119, 125, 127, 132, 135, 141, 153, 159, 161, 164, 165 and 170-173, a fragment thereof, and a sequence having at least about 80% identity thereto from said sample; and
(c) comparing said expression profile to a reference expression profile representing the expression level of said nucleic acid in healthy controls,
wherein relatively high expression levels of any of said nucleic acid sequences, as compared to said reference expression profile, is indicative of lung cancer.
27. A method of diagnosing bladder cancer in a subject comprising:
(a) obtaining a biological sample from said subject;
(b) determining an expression profile of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 7, 26, 35, 37, 39, 53, 71, 72, 125, 127, 130, 132, 135, 159, 161, 165, 170 and 172, a fragment thereof, and a sequence having at least about 80% identity thereto from said sample; and
(c) comparing said expression profile to a reference expression profile representing the expression level of said nucleic acid in healthy controls,
wherein relatively high expression levels of any of said nucleic acid sequences, as compared to said reference expression profile, is indicative of bladder cancer.
28. A method of diagnosing bladder cancer in a subject comprising:
(a) obtaining a biological sample from said subject;
(b) determining an expression profile of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 34 and 83, a fragment thereof, and a sequence having at least about 80% identity thereto from said sample; and
(c) comparing said expression profile to a reference expression profile representing the expression level of said nucleic acid in healthy controls,
wherein relatively low expression levels of any of said nucleic acid sequences, as compared to said reference expression profile, is indicative of bladder cancer.
29. A method of diagnosing liver cancer in a subject comprising:
(a) obtaining a biological sample from said subject;
(b) determining an expression profile of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 11, 26, 71, 72, 77, 98, 134, 136, 153, 160, 170 and 171, a fragment thereof, and a sequence having at least about 80% identity thereto from said sample; and
(c) comparing said expression profile to a reference expression profile representing the expression level of said nucleic acid in healthy controls,
wherein relatively high expression levels of any of said nucleic acid sequences, as compared to said reference expression profile, is indicative of liver cancer.
30. A method of diagnosing liver cancer in a subject comprising:
(a) obtaining a biological sample from said subject;
(b) determining an expression profile of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 12, 14, 28-32, 34, 42, 43, 52-55, 67, 75, 84, 85, 89, 97, 105, 123-126, 128, 129, 137 and 140, a fragment thereof, and a sequence having at least about 80% identity thereto from said sample; and
(c) comparing said expression profile to a reference expression profile representing the expression level of said nucleic acid in healthy controls,
wherein relatively low expression levels of any of said nucleic acid sequences, as compared to said reference expression profile, is indicative of liver cancer.
31. A method of diagnosing an endometrial metastasis in a subject comprising:
(a) obtaining a biological sample from said subject;
(b) determining an expression profile of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 22, 25, 29, 31, 37, 39, 53, 64, 68, 72, 76, 77, 78, 84, 113, 119, 121, 127, 130, 132, 133, 136, 153, 161, 170 and 171, a fragment thereof, and a sequence having at least about 80% identity thereto from said sample; and
(c) comparing said expression profile to a reference expression profile representing the expression level of said nucleic acid in healthy controls,
wherein relatively high expression levels of any of said nucleic acid sequences, as compared to said reference expression profile, is indicative of endometrial metastasis.
32. A method of diagnosing an endometrial metastasis in a subject comprising:
(a) obtaining a biological sample from said subject;
(b) determining an expression profile of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 16 and 57, a fragment thereof, and a sequence having at least about 80% identity thereto from said sample; and
(c) comparing said expression profile to a reference expression profile representing the expression level of said nucleic acid in healthy controls,
wherein relatively low expression levels of any of said nucleic acid sequences, as compared to said reference expression profile, is indicative of endometrial metastasis.
33. A method of diagnosing kidney cancer in a subject comprising:
(a) obtaining a biological sample from said subject;
(b) determining an expression profile of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 26, 37, 39, 53, 71, 72, 98, 125, 127, 130, 135, 159 and 165, a fragment thereof, and a sequence having at least about 80% identity thereto from said sample; and
(c) comparing said expression profile to a reference expression profile representing the expression level of said nucleic acid in healthy controls,
wherein relatively high expression levels of any of said nucleic acid sequences, as compared to said reference expression profile, is indicative of kidney cancer.
34. A method of diagnosing kidney cancer in a subject comprising:
(a) obtaining a biological sample from said subject;
(b) determining an expression profile of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 16, 34, 83 and 140, a fragment thereof, and a sequence having at least about 80% identity thereto from said sample; and
(c) comparing said expression profile to a reference expression profile representing the expression level of said nucleic acid in healthy controls,
wherein relatively low expression levels of any of said nucleic acid sequences, as compared to said reference expression profile, is indicative of kidney cancer.
35. A method of diagnosing breast cancer in a subject comprising:
(a) obtaining a biological sample from said subject;
(b) determining an expression profile of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 79, 99, 122, 130, 153 and 154, a fragment thereof, and a sequence having at least about 80% identity thereto from said sample; and
(c) comparing said expression profile to a reference expression profile representing the expression level of said nucleic acid in healthy controls,
wherein relatively low expression levels of any of said nucleic acid sequences, as compared to said reference expression profile, is indicative of breast cancer.
36. A method to distinguish between a primary lung tumor and a metastasis to the lung, said method comprising:
(a) obtaining a biological sample from said subject;
(b) determining an expression profile of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 31, 92, 1, 2, 6, 7, 10, 11, 13, 17, 19-24, 28-30, 33, 40, 42, 43, 46, 49, 51, 56-59, 61, 63, 64, 68, 70, 71, 74, 75, 77, 78, 80, 81, 86-88, 90-91, 95, 96, 97, 100, 104, 106, 107, 110-113, 116, 117, 119, 122, 129, 133, 138, 139, 141, 142, 144, 145-152, 154, 162, 164, 165, 169, 171-173, 175, a fragment thereof, and a sequence having at least about 80% identity thereto from said sample;
(c) comparing said expression profile to a reference expression profile,
wherein relatively high expression levels of any of said nucleic acid sequences, as compared to said reference expression profile, is indicative of a primary lung tumor.
37. The method of claim 36, wherein the origin of said metastasis to the lung is selected from the group consisting of endometrium, kidney, larynx, melanocyte and salivary gland.
38. The method of claim 19, wherein said subject is a human.
39. The method of claim 19, wherein said method is used to determine a course of treatment for said subject.
40. The method of claim 19, wherein said biological sample is selected from the group consisting of bodily fluid, a cell line and a tissue sample.
41. The method of claim 40, wherein said bodily fluid is selected from the group consisting of whole blood and serum.
42. The method of claim 40, wherein said tissue is a fresh, frozen, fixed, wax-embedded or formalin fixed paraffin-embedded (FFPE) tissue.
43. The method of claim 19, wherein said expression levels are determined by a method selected from the group consisting of nucleic acid hybridization, nucleic acid amplification, and a combination thereof.
44. The method of claim 43, wherein said nucleic acid hybridization is performed using a solid-phase nucleic acid biochip array or in situ hybridization.
45. The method of claim 43, wherein said nucleic acid amplification method is real-time PCR.
46. The method of claim 45, wherein said real-time PCR method comprises forward and reverse primers.
47. The method of claim 45, wherein said forward primer comprises a sequence selected from the group consisting of SEQ ID NOS: 466, 467, 469, a fragment thereof and a sequence at least about 80% identical thereto.
48. The method of claim 45, wherein said real-time PCR method further comprises a probe.
49. The method of claim 48, wherein the probe comprises a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS: 31, 4, 7, 11, 16, 17, 21, 22, 23, 26, 30, 33-35, 37, 39, 46, 47-49, 51, 52, 53, 56, 58, 60, 63, 64, 66, 68, 71, 72, 74, 76-78, 83, 86-88, 90, 96, 98, 100, 101, 106, 110-114, 116, 117, 119-121, 127, 129, 130, 132-136, 138, 141, 144, 145, 147, 148, 152, 153, 157, 159-162, 165, 167, 169-174, a fragment thereof and a sequence at least about 80% identical thereto.
50. A kit for diagnosing a cancer in a subject, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS: 31, 4, 7, 11, 16, 17, 21, 22, 23, 26, 30, 33-35, 37, 39, 46, 47-49, 51, 52, 53, 56, 58, 60, 63, 64, 66, 68, 71, 72, 74, 76-78, 83, 86-88, 90, 96, 98, 100, 101, 106, 110-114, 116, 117, 119-121, 127, 129, 130, 132-136, 138, 141, 144, 145, 147, 148, 152, 153, 157, 159-162, 165, 167, 169-174, a fragment thereof and a sequence at least about 80% identical thereto.
51. A kit for diagnosing an increased risk of colon cancer in a subject, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS: 21, 68, 92, 111 and 174, a fragment thereof and a sequence at least about 80% identical thereto.
52. A kit for diagnosing colon cancer in a subject, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS: 2, 15, 28, 29, 48, 58, 61, 70, 86, 97, 107, 110, 150, 156, 170 and 172, a fragment thereof and a sequence at least about 80% identical thereto.
53. A kit for diagnosing lung cancer in a subject, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS: 1, 7, 19, 21, 22, 26, 30, 35, 37, 39, 43, 46, 51, 53, 58, 59, 63, 64, 68, 71, 72, 74, 78, 86, 87, 106, 110, 113, 116, 119, 125, 127, 132, 135, 141, 153, 159, 161, 164, 165 and 170-173, a fragment thereof and a sequence at least about 80% identical thereto.
54. A kit for diagnosing bladder cancer in a subject, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS: 7, 26, 34, 35, 37, 39, 53, 71, 72, 83, 125, 127, 130, 132, 135, 159, 161, 165, 170 and 172, a fragment thereof and a sequence at least about 80% identical thereto.
55. A kit for diagnosing liver cancer in a subject, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS: 11, 12, 14, 26, 28-32, 34, 42, 43, 52-55, 67, 71, 72, 75, 77, 84, 85, 89, 97, 89, 105, 123-126, 128, 129, 134, 136, 137, 140, 153, 160, 170 and 171, a fragment thereof and a sequence at least about 80% identical thereto.
56. A kit for diagnosing a subject with endometrial metastasis, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS: 16, 22, 25, 29, 31, 37, 39, 53, 57, 64, 68, 72, 76, 77, 78, 84, 113, 119, 121, 127, 130, 132, 133, 136, 153, 161, 170 and 171, a fragment thereof and a sequence at least about 80% identical thereto.
57. A kit for diagnosing kidney cancer in a subject, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS: 16, 26, 34, 37, 39, 53; 71, 72, 83, 98, 125, 127, 130, 135, 140, 159 and 165, a fragment thereof and a sequence at least about 80% identical thereto.
58. A kit for diagnosing breast cancer in a subject, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS: 79, 99, 122, 130, 153 and 154 a fragment thereof and a sequence at least about 80% identical thereto.
59. A kit for distinguishing between a primary lung tumor and a metastasis to the lung, said kit comprising a probe comprising a nucleic acid sequence that is complementary to a sequence selected from the group consisting of SEQ ID NOS: 31, 92, 1, 2, 6, 7, 10, 11, 13, 17, 19-24, 28-30, 33, 40, 42, 43, 46, 49, 51, 56-59, 61, 63, 64, 68, 70, 71, 74, 75, 77, 78, 80, 81, 86-88, 90-91, 95, 96, 97, 100, 104, 106, 107, 110-113, 116, 117, 119, 122, 129, 133, 138, 139, 141, 142, 144, 145-152, 154, 162, 164, 165, 169, 171-173, 175, a fragment thereof and a sequence at least about 80% identical thereto.
PCT/IL2010/000462 2009-08-23 2010-06-10 Nucleic acid sequences related to cancer WO2011024157A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US23609009P 2009-08-23 2009-08-23
US61/236,090 2009-08-23
US33092010P 2010-05-04 2010-05-04
US61/330,920 2010-05-04

Publications (1)

Publication Number Publication Date
WO2011024157A1 (en) 2011-03-03

Family

ID=43627327

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IL2010/000462 WO2011024157A1 (en) 2009-08-23 2010-06-10 Nucleic acid sequences related to cancer

Country Status (1)

Country Link
WO (1) WO2011024157A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120157344A1 (en) * 2008-08-06 2012-06-21 Tel Hashomer Medical Research Infrastructure And Services Ltd. Gene expression signature for classification of kidney tumors
WO2013107459A3 (en) * 2012-01-16 2013-09-19 Herlev Hospital Microrna for diagnosis of pancreatic cancer and/or prognosis of patients with pancreatic cancer by blood samples
WO2015165779A3 (en) * 2014-05-01 2016-01-28 Stichting Vu-Vumc Small ncrnas as biomarkers
EP3156505A1 (en) * 2011-08-19 2017-04-19 Hummingbird Diagnostics GmbH Complex sets of mirnas as non-invasive biomarkers for colon cancer
US11236337B2 (en) 2016-11-01 2022-02-01 The Research Foundation For The State University Of New York 5-halouracil-modified microRNAs and their use in the treatment of cancer
US11584932B2 (en) 2016-11-01 2023-02-21 The Research Foundation For The State University Of New York 5-halouracil-modified microRNAs and their use in the treatment of cancer

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020009738A1 (en) * 2000-04-03 2002-01-24 Houghton Raymond L. Methods, compositions and kits for the detection and monitoring of breast cancer
US20030130485A1 (en) * 2000-11-14 2003-07-10 Meyers Rachel E. Novel human genes and methods of use thereof
WO2007148235A2 (en) * 2006-05-04 2007-12-27 Rosetta Genomics Ltd Cancer-related nucleic acids

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DANOVI ET AL.: "Amplification of Mdmx (or Mdm4) Directly Contributes to Tumor Formation by Inhibiting p53 Tumor Suppressor Activity.", MOLECULAR AND CELLULAR BIOLOGY, vol. 24, 2004, pages 5835 - 5843 *
DATABASE GENBANK 13 January 2009 (2009-01-13), "Human DNA sequence from clone RP11-430C7 on chromosome 1 Contains the MDM4 gene for Mdm4 (transformed 3T3 cell double minute 4) p53 binding protein (mouse), a PEF protein with a long N-terminal hydrophobic domain (peflin) (PEF) pseudogene, two novel genes and the 3' end of the LRRN5 gene for leucine", retrieved from http://www.ncbi.nlm.nih.gov/nuccore/18491332 Database accession no. AL512306.16 *
RIEMENSCHNEIDER ET AL.: "Amplification and Overexpression of the MDM4 (MDMX) Gene from 1q32 in a Subset of Malignant Gliomas without TP53 Mutation or MDM2 Amplification.", CANCER RESEARCH., vol. 59, 1999, pages 6091 - 6096 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120157344A1 (en) * 2008-08-06 2012-06-21 Tel Hashomer Medical Research Infrastructure And Services Ltd. Gene expression signature for classification of kidney tumors
US9068232B2 (en) * 2008-08-06 2015-06-30 Rosetta Genomics Ltd. Gene expression signature for classification of kidney tumors
EP3156505A1 (en) * 2011-08-19 2017-04-19 Hummingbird Diagnostics GmbH Complex sets of mirnas as non-invasive biomarkers for colon cancer
WO2013107459A3 (en) * 2012-01-16 2013-09-19 Herlev Hospital Microrna for diagnosis of pancreatic cancer and/or prognosis of patients with pancreatic cancer by blood samples
WO2015165779A3 (en) * 2014-05-01 2016-01-28 Stichting Vu-Vumc Small ncrnas as biomarkers
JP2017514519A (en) * 2014-05-01 2017-06-08 スティヒティング フェーユーエムセー Small non-coding RNA as a biomarker
CN107075568A (en) * 2014-05-01 2017-08-18 阿姆斯特丹自由大学医学中心基金会 Small ncRNAs as biomarkers
US11236337B2 (en) 2016-11-01 2022-02-01 The Research Foundation For The State University Of New York 5-halouracil-modified microRNAs and their use in the treatment of cancer
US11584932B2 (en) 2016-11-01 2023-02-21 The Research Foundation For The State University Of New York 5-halouracil-modified microRNAs and their use in the treatment of cancer

Similar Documents

Publication Publication Date Title
EP1784501B1 (en) VIRAL AND VIRUS ASSOCIATED MicroRNAS AND USES THEREOF
US9988690B2 (en) Compositions and methods for prognosis of ovarian cancer
WO2007148235A2 (en) Cancer-related nucleic acids
WO2010018563A2 (en) Compositions and methods for the prognosis of lymphoma
EP2203569A2 (en) Diagnosis and prognosis of specific cancers by means of differential detection of micro-rnas / mirnas
EP2322665A1 (en) MicroRNAs and uses thereof
US9243296B2 (en) Compositions and methods for prognosis and treatment of prostate cancer
EP2691545B1 (en) Methods for lung cancer classification
WO2011024157A1 (en) Nucleic acid sequences related to cancer
US9834821B2 (en) Diagnosis and prognosis of various types of cancers
US20170015999A1 (en) Compositions and methods for treatment of ovarian cancer
WO2010004562A2 (en) Methods and compositions for detecting colorectal cancer
AU2008220449A1 (en) Methods for distinguishing between lung squamous carcinoma and other non small cell lung cancers
WO2011030334A1 (en) Compositions and methods for treatment, diagnosis and prognosis of mesothelioma
WO2006092738A2 (en) Micrornas and related nucleic acids
WO2010058393A2 (en) Compositions and methods for the prognosis of colon cancer
US8563252B2 (en) Methods for distinguishing between lung squamous carcinoma and other non small cell lung cancers
WO2010070637A2 (en) Method for distinguishing between adrenal tumors
WO2010016064A2 (en) Gene expression signature for classification of kidney tumors

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 10811358

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 10811358

Country of ref document: EP

Kind code of ref document: A1