CA2677043A1 - Method for distinguishing between lung squamous carcinoma and other non small cell lung cancers - Google Patents

Publication number
CA2677043A1
Authority
CA
Canada
Prior art keywords
nucleic acid
sequence
lung
mirna
seq
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA002677043A
Other languages
French (fr)
Inventor
Ranit Aharonov
Nitzan Rosenfeld
Shai Rosenwald
Hila Benjamin
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Rosetta Genomics Ltd
Original Assignee
Rosetta Genomics Ltd.
Ranit Aharonov
Nitzan Rosenfeld
Shai Rosenwald
Hila Benjamin
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Rosetta Genomics Ltd., Ranit Aharonov, Nitzan Rosenfeld, Shai Rosenwald, Hila Benjamin

Classifications

    • C12Q1/6841 In situ hybridisation
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes, for diseases caused by alterations of genetic material, for cancer
    • C12Q2600/112 Disease subtyping, staging or classification
    • C12Q2600/178 Oligonucleotides characterized by their use: miRNA, siRNA or ncRNA

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Wood Science & Technology (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Analytical Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Immunology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Microbiology (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Biotechnology (AREA)
  • Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Hospice & Palliative Care (AREA)
  • Oncology (AREA)

Abstract

The present invention provides nucleic acid sequences that are used for identification, classification and diagnosis of specific types of non-small-cell lung cancers (NSCLC). The nucleic acid sequences can also be used for prognostic evaluation of a subject based on the expression pattern of a biological sample.

Description

METHOD FOR DISTINGUISHING BETWEEN LUNG SQUAMOUS CARCINOMA
AND OTHER NON SMALL CELL LUNG CANCERS

FIELD OF THE INVENTION
The invention relates in general to microRNA molecules associated with specific types of lung cancers, as well as various nucleic acid molecules relating thereto or derived therefrom.
BACKGROUND OF THE INVENTION
In recent years, microRNAs (miRs) have emerged as an important novel class of regulatory RNA, which have a profound impact on a wide array of biological processes.
These small (typically 18-24 nucleotides long) non-coding RNA molecules can modulate protein expression patterns by promoting RNA degradation, inhibiting mRNA
translation, and also affecting gene transcription. miRs play pivotal roles in diverse processes such as development and differentiation, control of cell proliferation, stress response and metabolism. The expression of many miRs was found to be altered in numerous types of human cancer, and in some cases strong evidence has been put forward in support of the conjecture that such alterations may play a causative role in tumor progression. There are currently about 700 known human miRs, and their number probably exceeds 800.

Classification of cancer has typically relied on the grouping of tumors based on histology, cytogenetics, immunohistochemistry, and known biological behavior. The pathologic diagnosis used to classify the tumor, taken together with the stage of the cancer, is then used to predict prognosis and direct therapy. However, current methods of cancer classification and staging are not completely reliable.
Lung cancer is one of the most common cancers and has become a predominant cause of cancer-related death throughout the world. Scientists strive to explore biomarkers and their possible role in the diagnosis, treatment and prognosis of specific lung cancers.
Making the correct diagnosis, and specifically the distinction between lung squamous carcinoma and other Non Small Cell Lung Carcinoma (NSCLC) such as but not limited to lung adenocarcinoma, has practical importance for choice of therapy. Severe or fatal hemorrhage is a black box warning for lung squamous carcinoma patients undergoing bevacizumab (Avastin) therapy. To date there is no objective standardized test for differentiating squamous from non-squamous NSCLC.

The search for biomarkers for the early detection and accurate diagnosis of various NSCLC has met with little success. Much emphasis has been placed on the discovery and characterization of a unique tumor marker. However, no marker has been identified that has adequate sensitivity or specificity to be clinically useful, although a combination of multiple markers has been shown to increase diagnostic accuracy.
There is an unmet need for a reliable method for distinguishing between lung squamous cell carcinoma and other NSCLC.

SUMMARY OF THE INVENTION
The present invention provides specific nucleic acid sequences for use in the identification, classification and diagnosis of specific lung cancers. The nucleic acid sequences can also be used as prognostic markers for prognostic evaluation of a subject based on their expression pattern in a biological sample.
The invention further provides a method of classifying NSCLC, the method comprising:
obtaining a biological sample from a subject; measuring the relative abundance in said sample of a nucleic acid sequence selected from the group consisting of SEQ ID
NOS: 1-5, 13-30, a fragment thereof or a sequence having at least about 80% identity thereto; and comparing said obtained measurement to a reference number representing abundance of said nucleic acid; whereby the differential expression of said nucleic acid sequence allows the classification of said NSCLC.
According to some embodiments, said biological sample is selected from the group consisting of bodily fluid, a cell line and a tissue sample. According to some embodiments, said tissue is a fresh, frozen, fixed, wax-embedded or formalin fixed paraffin-embedded (FFPE) tissue. According to one embodiment, the tissue sample is a lung sample.
According to some embodiments, said NSCLC is selected from the group consisting of lung squamous cell carcinoma, lung adenocarcinoma and lung undifferentiated large cell carcinoma. According to some embodiments, said lung undifferentiated large cell carcinoma originates from lung squamous cell carcinoma or from adenocarcinoma.
The invention further provides a method for distinguishing between lung squamous cell carcinoma and other NSCLC, the method comprising: obtaining a biological sample from a subject; determining in said sample an expression level of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 1-5, a fragment thereof or a sequence having at least 80% identity thereto; whereby a relative abundance of SEQ ID NO: 1 indicates the presence of squamous cell carcinoma.
According to some embodiments, said other NSCLC is lung adenocarcinoma.
According to some embodiments, the method comprises determining the expression levels of at least two nucleic acid sequences. According to some embodiments, the method further comprises combining one or more expression ratios. According to some embodiments, the expression levels are determined by a method selected from the group consisting of nucleic acid hybridization, nucleic acid amplification, and a combination thereof.
According to some embodiments, the nucleic acid hybridization is performed using a solid-phase nucleic acid biochip array. According to certain embodiments, the nucleic acid hybridization is performed using in situ hybridization. According to other embodiments, the nucleic acid amplification method is real-time PCR (RT-PCR). According to one embodiment, said real-time PCR is quantitative real-time PCR (qRT-PCR).
According to some embodiments, the RT-PCR method comprises forward and reverse primers. According to other embodiments, the forward primer comprises a sequence selected from the group consisting of any one of SEQ ID NOS: 7-9. According to some embodiments, the real-time PCR method further comprises hybridization with a probe.
According to other embodiments, the probe comprises a sequence selected from the group consisting of any one of SEQ ID NOS: 10-12.
The invention further provides a method for distinguishing between lung adenocarcinoma and large cell carcinoma, the method comprising: obtaining a biological sample from a subject; determining in said sample an expression level of one or more nucleic acid sequences selected from the group consisting of SEQ ID NOS: 13-30, a fragment thereof or a sequence having at least 80% identity thereto; whereby a relative abundance of said nucleic acid indicates the presence of large cell carcinoma.
The invention further provides a kit for NSCLC classification, said kit comprises a probe comprising a nucleic acid sequence selected from the group consisting of any one of SEQ
ID NOS: 10-12 and sequences having at least about 80% identity thereto.
According to other embodiments, the kit further comprises a forward primer comprising a sequence selected from the group consisting of any one of SEQ ID NOS: 7-9. According to some embodiments, the kit further comprises instructions for the use of one or more expression ratios in the diagnosis of a specific NSCLC. According to some embodiments, said kit comprises reagents for performing in situ hybridization analysis.
These and other embodiments of the present invention will become apparent in conjunction with the figures, description and claims that follow.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a graph showing the normalized expression level of hsa-miR-205 (SEQ ID NO: 1), based on biochip array, in lung samples originating from adenocarcinoma (circles) or squamous cell carcinoma (triangles). The samples are sorted according to the expression level of hsa-miR-205. X-axis is the sorted samples and y-axis is the normalized expression level. T-test p-value: 9.4735e-007.
Figure 2 is a graph showing the average normalized signal and standard error (STD/sqrt(n)) of hsa-miR-205 in two lung sample sets: adenocarcinoma and squamous cell carcinoma.
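The standard error named in this caption, STD/sqrt(n), can be illustrated with a minimal Python sketch. The signal values below are hypothetical and are not data from the figures. The use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not say which estimator was used.

```python
import math
import statistics


def standard_error(values):
    """Return STD/sqrt(n) for a list of normalized signals.

    Assumption: STD is the sample standard deviation (n - 1 denominator).
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are required")
    return statistics.stdev(values) / math.sqrt(n)


# Hypothetical normalized signals for one sample set (not taken from the patent).
signals = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean_signal = statistics.mean(signals)
sem = standard_error(signals)
print(f"mean={mean_signal:.3f} standard error={sem:.3f}")
```

Each bar in a plot of this kind would then show the mean signal for one sample set with the standard error as its error bar.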
Figure 3 is a table showing the sensitivity and specificity of miR-205 in lung samples originating from squamous cell carcinoma vs. adenocarcinoma. The sensitivity of the squamous cell carcinoma detection is 100% (9/9) and the specificity is 84.2% (16/19).
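The percentages quoted for Figure 3 follow directly from the stated counts. As a minimal Python sketch: the function names are illustrative, and the split into true/false positives and negatives is inferred from the fractions 9/9 and 16/19 given in the caption.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of actual positives that the test calls positive."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of actual negatives that the test calls negative."""
    return true_negatives / (true_negatives + false_positives)


# Counts from the Figure 3 caption: 9 of 9 squamous samples detected,
# 16 of 19 adenocarcinoma samples correctly called non-squamous.
sens = sensitivity(true_positives=9, false_negatives=0)
spec = specificity(true_negatives=16, false_positives=3)
print(f"sensitivity={sens:.1%} specificity={spec:.1%}")
```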
Figure 4 is a graph showing the full separation between samples originating from lung squamous cell carcinoma (asterisks) and samples originating from other NSCLC (ellipses) using qRT-PCR expression levels of hsa-miR-205 (SEQ ID NO: 1), normalized by qRT-PCR expression levels of hsa-miR-21 (SEQ ID NO: 2), U6 (SEQ ID NO: 3) and a threshold of a final score as described in Example 3. Full black line represents the threshold. Dashed black lines indicate low confidence area border.
Figure 5 is a photograph showing in situ hybridization detection of hsa-miR-205.
Microphotographs show parallel sections of lung squamous cell carcinoma hybridized to an hsa-miR-205-specific probe (A) and a control (scrambled) probe (B).
Figure 6 is a graph showing the normalized expression level of hsa-miR-513 (SEQ ID NO: 13) in lung samples originating from adenocarcinoma (circles) or large cell carcinoma (triangles). The samples are sorted according to hsa-miR-513 expression level.
X-axis is the sorted samples and y-axis is the normalized expression level. T-test p-value: 6.1444e-005.
Figure 7 is a graph showing the average normalized signal and standard error (STD/sqrt(n)) of hsa-miR-513 in the two lung sample sets: adenocarcinoma and large cell carcinoma.
Figure 8 is a table showing the signal of hsa-miR-513 in lung samples originating from adenocarcinoma and large cell carcinoma. The signal below threshold is adenocarcinoma.
The sensitivity of the adenocarcinoma detection is 94.7% (18/19) and the specificity of the signal is 85.7% (6/7).
DETAILED DESCRIPTION OF THE INVENTION
The invention is based on the discovery that specific nucleic acid sequences (SEQ ID
NOS: 1-5, 13-30) can be used for the identification, classification and diagnosis of specific lung cancers.
The present invention provides a sensitive, specific and accurate method which may be used to distinguish between lung squamous cell carcinoma and other NSCLC.
The methods of the present invention have high sensitivity and specificity.
The ability to distinguish between lung squamous cell carcinoma and other NSCLC such as lung adenocarcinoma or lung large cell carcinoma facilitates providing the patient with the best and most suitable treatment.

Definitions
Before the present compositions and methods are disclosed and described, it is to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting. It must be noted that, as used in the specification and the appended claims, the singular forms "a," "an" and "the" include plural referents unless the context clearly dictates otherwise.
For the recitation of numeric ranges herein, each intervening number therebetween with the same degree of precision is explicitly contemplated. For example, for the range of 6-9, the numbers 7 and 8 are contemplated in addition to 6 and 9, and for the range 6.0-7.0, the numbers 6.0, 6.1, 6.2, 6.3, 6.4, 6.5, 6.6, 6.7, 6.8, 6.9 and 7.0 are explicitly contemplated.

aberrant proliferation
As used herein, the term "aberrant proliferation" means cell proliferation that deviates from the normal, proper, or expected course. For example, aberrant cell proliferation may include inappropriate proliferation of cells whose DNA or other cellular components have become damaged or defective. Aberrant cell proliferation may include cell proliferation whose characteristics are associated with an indication caused by, mediated by, or resulting in inappropriately high levels of cell division, inappropriately low levels of apoptosis, or both. Such indications may be characterized, for example, by single or multiple local abnormal proliferations of cells, groups of cells, or tissue(s), whether cancerous or non-cancerous, benign or malignant.
about
As used herein, the term "about" refers to +/-10%.

antisense
The term "antisense," as used herein, refers to nucleotide sequences which are complementary to a specific DNA or RNA sequence. The term "antisense strand" is used in reference to a nucleic acid strand that is complementary to the "sense" strand. Antisense molecules may be produced by any method, including synthesis by ligating the gene(s) of interest in a reverse orientation to a viral promoter which permits the synthesis of a complementary strand. Once introduced into a cell, this transcribed strand combines with natural sequences produced by the cell to form duplexes. These duplexes then block either the further transcription or translation. In this manner, mutant phenotypes may be generated.
attached
"Attached" or "immobilized" as used herein refer to a probe and a solid support and may mean that the binding between the probe and the solid support is sufficient to be stable under conditions of binding, washing, analysis, and removal. The binding may be covalent or non-covalent. Covalent bonds may be formed directly between the probe and the solid support or may be formed by a cross linker or by inclusion of a specific reactive group on either the solid support or the probe, or both. Non-covalent binding may be one or more of electrostatic, hydrophilic, and hydrophobic interactions. Included in non-covalent binding is the covalent attachment of a molecule, such as streptavidin, to the support and the non-covalent binding of a biotinylated probe to the streptavidin. Immobilization may also involve a combination of covalent and non-covalent interactions.
biological sample
"Biological sample" as used herein means a sample of biological tissue or fluid that comprises nucleic acids. Such samples include, but are not limited to, tissue or fluid isolated from subjects. Biological samples may also include sections of tissues such as biopsy and autopsy samples, FFPE samples, frozen sections taken for histological purposes, blood, plasma, serum, sputum, stool, tears, mucus, hair, and skin. Biological samples also include explants and primary and/or transformed cell cultures derived from animal or patient tissues.
Biological samples may also be blood, a blood fraction, urine, effusions, ascitic fluid, saliva, cerebrospinal fluid, cervical secretions, vaginal secretions, endometrial secretions, gastrointestinal secretions, bronchial secretions, sputum, cell line, tissue sample, or secretions from the breast. A biological sample may be provided by removing a sample of cells from an animal, but can also be accomplished by using previously isolated cells (e.g., isolated by another person, at another time, and/or for another purpose), or by performing the methods described herein in vivo. Archival tissues, such as those having treatment or outcome history, may also be used.
cancer
The term "cancer" is meant to include all types of cancerous growths or oncogenic processes, metastatic tissues or malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness. Examples of cancers include but are not limited to solid tumors and leukemias, including: apudoma, choristoma, branchioma, malignant carcinoid syndrome, carcinoid heart disease, carcinoma (e.g., Walker, basal cell, basosquamous, Brown-Pearce, ductal, Ehrlich tumor, small cell lung, non-small cell lung (e.g., lung squamous cell carcinoma, lung adenocarcinoma and lung undifferentiated large cell carcinoma), oat cell, papillary, bronchiolar, bronchogenic, squamous cell, and transitional cell), histiocytic disorders, leukemia (e.g., B cell, mixed cell, null cell, T cell, T-cell chronic, HTLV-II-associated, lymphocytic acute, lymphocytic chronic, mast cell, and myeloid), histiocytosis malignant, Hodgkin disease, immunoproliferative small, non-Hodgkin lymphoma, plasmacytoma, reticuloendotheliosis, melanoma, chondroblastoma, chondroma, chondrosarcoma, fibroma, fibrosarcoma, giant cell tumors, histiocytoma, lipoma, liposarcoma, mesothelioma, myxoma, myxosarcoma, osteoma, osteosarcoma, Ewing sarcoma, synovioma, adenofibroma, adenolymphoma, carcinosarcoma, chordoma, craniopharyngioma, dysgerminoma, hamartoma, mesenchymoma, mesonephroma, myosarcoma, ameloblastoma, cementoma, odontoma, teratoma, thymoma, trophoblastic tumor, adenocarcinoma, adenoma, cholangioma, cholesteatoma, cylindroma, cystadenocarcinoma, cystadenoma, granulosa cell tumor, gynandroblastoma, hepatoma, hidradenoma, islet cell tumor, Leydig cell tumor, papilloma, Sertoli cell tumor, theca cell tumor, leiomyoma, leiomyosarcoma, myoblastoma, myosarcoma, rhabdomyoma, rhabdomyosarcoma, ependymoma, ganglioneuroma, glioma, medulloblastoma, meningioma, neurilemmoma, neuroblastoma, neuroepithelioma, neurofibroma, neuroma, paraganglioma, paraganglioma nonchromaffin, angiokeratoma, angiolymphoid hyperplasia with eosinophilia, angioma sclerosing, angiomatosis, glomangioma, hemangioendothelioma, hemangioma, hemangiopericytoma, hemangiosarcoma, lymphangioma, lymphangiomyoma, lymphangiosarcoma, pinealoma, carcinosarcoma, chondrosarcoma, cystosarcoma, phyllodes, fibrosarcoma, hemangiosarcoma, leiomyosarcoma, leukosarcoma, liposarcoma, lymphangiosarcoma, myosarcoma, myxosarcoma, ovarian carcinoma, rhabdomyosarcoma, sarcoma (e.g., Ewing, experimental, Kaposi, and mast cell), neurofibromatosis, and cervical dysplasia, and other conditions in which cells have become immortalized or transformed.
classification
"Classification" as used herein refers to a procedure and/or algorithm in which individual items are placed into groups or classes based on quantitative information on one or more characteristics inherent in the items (referred to as traits, variables, characters, features, etc.) and based on a statistical model and/or a training set of previously labeled items. According to one embodiment, classification means determination of the type of lung cancer.
complement
"Complement" or "complementary" as used herein means Watson-Crick (e.g., A-T/U and C-G) or Hoogsteen base pairing between nucleotides or nucleotide analogs of nucleic acid molecules. A full complement or fully complementary may mean 100% complementary base pairing between nucleotides or nucleotide analogs of nucleic acid molecules.
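As an illustration of the Watson-Crick part of this definition, the following Python sketch computes the reverse complement of a short sequence and checks two strands for full (100%) complementarity. It is a simplified sketch: Hoogsteen pairing and nucleotide analogs are not modeled, T and U are treated as equivalent, and the example sequence is arbitrary rather than one of the SEQ ID NOS.

```python
# Watson-Crick partners, written as RNA; T is accepted as an input synonym for U.
WATSON_CRICK = {"A": "U", "U": "A", "T": "A", "C": "G", "G": "C"}


def reverse_complement_rna(seq):
    """Return the reverse complement of a DNA or RNA sequence, written as RNA."""
    return "".join(WATSON_CRICK[base] for base in reversed(seq.upper()))


def is_fully_complementary(strand_a, strand_b):
    """True if strand_b pairs with every base of strand_a (antiparallel)."""
    return reverse_complement_rna(strand_a) == strand_b.upper().replace("T", "U")


# Arbitrary illustrative sequence (hypothetical, not from the sequence listing).
example = "AUGGCUA"
print(reverse_complement_rna(example))
```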
Ct
"Ct" as used herein refers to Cycle Threshold of qRT-PCR, which is the fractional cycle number at which the fluorescence crosses the threshold.
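A fractional cycle number can be obtained, for example, by interpolating between the two cycles that bracket the threshold. The Python sketch below uses simple linear interpolation on hypothetical fluorescence readings. This is an assumption made for illustration; the text does not specify how the instrument or its software derives Ct.

```python
def cycle_threshold(fluorescence, threshold):
    """Return the fractional cycle at which fluorescence first crosses threshold.

    fluorescence[i] is the reading at cycle i + 1. Linear interpolation between
    the two bracketing cycles is an illustrative assumption. Returns None if
    the readings never cross the threshold from below.
    """
    for i in range(1, len(fluorescence)):
        prev, curr = fluorescence[i - 1], fluorescence[i]
        if prev < threshold <= curr:
            fraction = (threshold - prev) / (curr - prev)
            return i + fraction  # prev was measured at cycle i
    return None


# Hypothetical readings for cycles 1..5 (not data from the patent).
readings = [0.1, 0.2, 0.4, 0.8, 1.6]
ct = cycle_threshold(readings, threshold=0.6)
print(ct)
```

With these readings the threshold of 0.6 lies halfway between the values at cycles 3 and 4, so the sketch returns a Ct of approximately 3.5.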
detection
"Detection" means detecting the presence of a component in a sample. Detection also means detecting the absence of a component. Detection also means measuring the level of a component, either quantitatively or qualitatively.
differential expression
"Differential expression" means qualitative or quantitative differences in the temporal and/or cellular gene expression patterns within and among cells and tissue.
Thus, a differentially expressed gene may qualitatively have its expression altered, including an activation or inactivation, in, e.g., normal versus disease tissue. Genes may be turned on or turned off in a particular state, relative to another state thus permitting comparison of two or more states. A qualitatively regulated gene may exhibit an expression pattern within a state or cell type which may be detectable by standard techniques. Some genes may be expressed in one state or cell type, but not in both. Alternatively, the difference in expression may be quantitative, e.g., in that expression is modulated, either up-regulated, resulting in an increased amount of transcript, or down-regulated, resulting in a decreased amount of transcript. The degree to which expression differs need only be large enough to quantify via standard characterization techniques such as expression arrays, quantitative reverse transcriptase PCR, northern analysis, real-time PCR, in situ hybridization and RNase protection.
expression ratio
"Expression ratio" as used herein refers to relative expression levels of two or more nucleic acids as determined by detecting the relative expression levels of the corresponding nucleic acids in a biological sample.
fragment
"Fragment" is used herein to indicate a non-full length part of a nucleic acid or polypeptide. Thus, a fragment is itself also a nucleic acid or polypeptide, respectively.

gene
"Gene" as used herein may be a natural (e.g., genomic) or synthetic gene comprising transcriptional and/or translational regulatory sequences and/or a coding region and/or non-translated sequences (e.g., introns, 5'- and 3'-untranslated sequences). The coding region of a gene may be a nucleotide sequence coding for an amino acid sequence or a functional RNA, such as tRNA, rRNA, catalytic RNA, siRNA, miRNA or antisense RNA. A gene may also be an mRNA or cDNA corresponding to the coding regions (e.g., exons and miRNA) optionally comprising 5'- or 3'-untranslated sequences linked thereto.
A gene may also be an amplified nucleic acid molecule produced in vitro comprising all or a part of the coding region and/or 5'- or 3'-untranslated sequences linked thereto.
Groove binder/minor groove binder (MGB)
"Groove binder" and/or "minor groove binder" may be used interchangeably and refer to small molecules that fit into the minor groove of double-stranded DNA, typically in a sequence-specific manner. Minor groove binders may be long, flat molecules that can adopt a crescent-like shape and thus fit snugly into the minor groove of a double helix, often displacing water. Minor groove binding molecules may typically comprise several aromatic rings connected by bonds with torsional freedom such as furan, benzene, or pyrrole rings.
Minor groove binders may be antibiotics such as netropsin, distamycin, berenil, pentamidine and other aromatic diamidines, Hoechst 33258, SN 6999, aureolic anti-tumor drugs such as chromomycin and mithramycin, CC-1065, dihydrocyclopyrroloindole tripeptide (DPI3), 1,2-dihydro-(3H)-pyrrolo[3,2-e]indole-7-carboxylate (CDPI3), and related compounds and analogues, including those described in Nucleic Acids in Chemistry and Biology, 2d ed., Blackburn and Gait, eds., Oxford University Press, 1996; and PCT Published Application No. WO 03/078450, the contents of which are incorporated herein by reference.
A minor groove binder may be a component of a primer, a probe, a hybridization tag complement, or combinations thereof. Minor groove binders may increase the Tm of the primer or probe to which they are attached, allowing such primers or probes to effectively hybridize at higher temperatures.
host cell
"Host cell" as used herein may be a naturally occurring cell or a transformed cell that may contain a vector and may support replication of the vector. Host cells may be cultured cells, explants, cells in vivo, and the like. Host cells may be prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, insect, amphibian, or mammalian cells, such as CHO and HeLa.
identity
"Identical" or "identity" as used herein in the context of two or more nucleic acids or polypeptide sequences mean that the sequences have a specified percentage of residues that are the same over a specified region. The percentage may be calculated by optimally aligning the two sequences, comparing the two sequences over the specified region, determining the number of positions at which the identical residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the specified region, and multiplying the result by 100 to yield the percentage of sequence identity. In cases where the two sequences are of different lengths or the alignment produces one or more staggered ends and the specified region of comparison includes only a single sequence, the residues of the single sequence are included in the denominator but not the numerator of the calculation. When comparing DNA and RNA, thymine (T) and uracil (U) may be considered equivalent. Identity may be performed manually or by using a computer sequence algorithm such as BLAST or BLAST
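The percentage calculation described in this definition can be sketched in Python for two sequences that have already been aligned. The sketch makes several assumptions: the optimal alignment is supplied by the caller (for example, from an alignment tool), the "-" character marks positions where only one sequence is present, and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over a pre-aligned region of equal length.

    Matched positions are divided by the total number of positions in the
    region and multiplied by 100. Positions where only one sequence is present
    (marked with `gap`) count in the denominator but not the numerator.
    T and U are treated as equivalent.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("the compared region must not be empty")
    a = aligned_a.upper().replace("T", "U")
    b = aligned_b.upper().replace("T", "U")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != gap)
    return 100.0 * matches / len(a)


# Hypothetical DNA/RNA pair with a staggered end of two residues.
identity = percent_identity("ACGTACGT", "ACGUAC--")
print(identity)
```

In this example six of the eight positions match, giving 75% identity; the two unpaired residues at the staggered end enter only the denominator.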

in situ detection
"In situ detection" as used herein means the detection of expression or expression levels in the original site, here meaning in a tissue sample such as a biopsy.
label
"Label" as used herein means a composition detectable by spectroscopic, photochemical, biochemical, immunochemical, chemical, or other physical means. For example, useful labels include 32P, fluorescent dyes, electron-dense reagents, enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and other entities which can be made detectable. A label may be incorporated into nucleic acids and proteins at any position.

nucleic acid
"Nucleic acid" or "oligonucleotide" or "polynucleotide" as used herein mean at least two nucleotides covalently linked together. The depiction of a single strand also defines the sequence of the complementary strand. Thus, a nucleic acid also encompasses the complementary strand of a depicted single strand. Many variants of a nucleic acid may be used for the same purpose as a given nucleic acid. Thus, a nucleic acid also encompasses substantially identical nucleic acids and complements thereof. A single strand provides a probe that may hybridize to a target sequence under stringent hybridization conditions.
Thus, a nucleic acid also encompasses a probe that hybridizes under stringent hybridization conditions.
Nucleic acids may be single stranded or double stranded, or may contain portions of both double stranded and single stranded sequence. The nucleic acid may be DNA, both genomic and cDNA, RNA, or a hybrid, where the nucleic acid may contain combinations of deoxyribo- and ribo-nucleotides, and combinations of bases including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine and isoguanine.
Nucleic acids may be obtained by chemical synthesis methods or by recombinant methods.
A nucleic acid will generally contain phosphodiester bonds, although nucleic acid analogs may be included that may have at least one different linkage, e.g., phosphoramidate, phosphorothioate, phosphorodithioate, or O-methylphosphoroamidite linkages and peptide nucleic acid backbones and linkages. Other analog nucleic acids include those with positive backbones, non-ionic backbones, and non-ribose backbones, including those described in U.S. Pat. Nos. 5,235,033 and 5,034,506, which are incorporated by reference.
Nucleic acids containing one or more non-naturally occurring or modified nucleotides are also included within one definition of nucleic acids. The modified nucleotide analog may be located for example at the 5'-end and/or the 3'-end of the nucleic acid molecule.
Representative examples of nucleotide analogs may be selected from sugar- or backbone-modified ribonucleotides. It should be noted, however, that nucleobase-modified ribonucleotides are also suitable, i.e. ribonucleotides containing a non-naturally occurring nucleobase instead of a naturally occurring nucleobase, such as uridines or cytidines modified at the 5-position, e.g. 5-(2-amino)propyl uridine, 5-bromo uridine; adenosines and guanosines modified at the 8-position, e.g. 8-bromo guanosine; deaza nucleotides, e.g. 7-deaza-adenosine; O- and N-alkylated nucleotides, e.g. N6-methyl adenosine. The 2'-OH-group may be replaced by a group selected from H, OR, R, halo, SH, SR, NH2, NHR, NR2 or CN, wherein R is C1-C6 alkyl, alkenyl or alkynyl and halo is F, Cl, Br or I. Modified nucleotides also include nucleotides conjugated with cholesterol through, e.g., a hydroxyprolinol linkage as described in Krutzfeldt et al., Nature 438:685-689 (2005) and Soutschek et al., Nature 432:173-178 (2004), which are incorporated herein by reference. Modifications of the ribose-phosphate backbone may be done for a variety of reasons, e.g., to increase the stability and half-life of such molecules in physiological environments, to enhance diffusion across cell membranes, or as probes on a biochip. The backbone modification may also enhance resistance to degradation, such as in the harsh endocytic environment of cells. The backbone modification may also reduce nucleic acid clearance by hepatocytes, such as in the liver. Mixtures of naturally occurring nucleic acids and analogs may be made;
alternatively, mixtures of different nucleic acid analogs, and mixtures of naturally occurring nucleic acids and analogs may be made.
probe "Probe" as used herein means an oligonucleotide capable of binding to a target nucleic acid of complementary sequence through one or more types of chemical bonds, usually through complementary base pairing, usually through hydrogen bond formation.
Probes may bind target sequences lacking complete complementarity with the probe sequence depending upon the stringency of the hybridization conditions. There may be any number of base pair mismatches which will interfere with hybridization between the target sequence and the single stranded nucleic acids described herein. However, if the number of mutations is so great that no hybridization can occur under even the least stringent of hybridization conditions, the sequence is not a complementary target sequence. A probe may be single stranded or partially single and partially double stranded. The strandedness of the probe is dictated by the structure, composition, and properties of the target sequence.
Probes may be directly labeled or indirectly labeled such as with biotin to which a streptavidin complex may later bind.
promoter "Promoter" as used herein means a synthetic or naturally-derived molecule which is capable of conferring, activating or enhancing expression of a nucleic acid in a cell. A promoter may comprise one or more specific transcriptional regulatory sequences to further enhance expression and/or to alter the spatial expression and/or temporal expression of same. A promoter may also comprise distal enhancer or repressor elements, which can be located as much as several thousand base pairs from the start site of transcription. A promoter may be derived from sources including viral, bacterial, fungal, plants, insects, and animals. A promoter may regulate the expression of a gene component constitutively, or differentially with respect to the cell, tissue or organ in which expression occurs, or with respect to the developmental stage at which expression occurs, or in response to external stimuli such as physiological stresses, pathogens, metal ions, or inducing agents.
Representative examples of promoters include the bacteriophage T7 promoter, bacteriophage T3 promoter, SP6 promoter, lac operator-promoter, tac promoter, SV40 late promoter, SV40 early promoter, RSV-LTR promoter and CMV IE promoter.
selectable marker "Selectable marker" as used herein means any gene which confers a phenotype on a host cell in which it is expressed to facilitate the identification and/or selection of cells which are transfected or transformed with a genetic construct. Representative examples of selectable markers include the ampicillin-resistance gene (Ampr), tetracycline-resistance gene (Tcr), bacterial kanamycin-resistance gene (Kanr), zeocin resistance gene, the AURI-C gene which confers resistance to the antibiotic aureobasidin A, phosphinothricin-resistance gene, neomycin phosphotransferase gene (nptII), hygromycin-resistance gene, beta-glucuronidase (GUS) gene, chloramphenicol acetyltransferase (CAT) gene, green fluorescent protein (GFP)-encoding gene and luciferase gene.
stringent hybridization conditions "Stringent hybridization conditions" as used herein mean conditions under which a first nucleic acid sequence (e.g., probe) will hybridize to a second nucleic acid sequence (e.g., target), such as in a complex mixture of nucleic acids. Stringent conditions are sequence-dependent and will be different in different circumstances. Stringent conditions may be selected to be about 5-10 °C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm may be the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50% of the probes complementary to the target hybridize to the target sequence at equilibrium (as the target sequences are present in excess, at Tm, 50% of the probes are occupied at equilibrium).

Stringent conditions may be those in which the salt concentration is less than about 1.0 M sodium ion, such as about 0.01-1.0 M sodium ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30 °C for short probes (e.g., about 10-50 nucleotides) and at least about 60 °C for long probes (e.g., greater than about nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. For selective or specific hybridization, a positive signal may be at least 2 to 10 times background hybridization. Exemplary stringent hybridization conditions include the following: 50% formamide, 5x SSC, and 1% SDS, incubating at 42 °C, or, 5x SSC, 1% SDS, incubating at 65 °C, with wash in 0.2x SSC, and 0.1% SDS at 65 °C.
substantially complementary "Substantially complementary" as used herein means that a first sequence is at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98% or 99% identical to the complement of a second sequence over a region of 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100 or more nucleotides, or that the two sequences hybridize under stringent hybridization conditions.
substantially identical "Substantially identical" as used herein means that a first and a second sequence are at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98% or 99% identical over a region of 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100 or more nucleotides or amino acids, or with respect to nucleic acids, if the first sequence is substantially complementary to the complement of the second sequence.
subject As used herein, the term "subject" refers to a mammal, including both humans and other mammals. The methods of the present invention are preferably applied to human subjects.
target nucleic acid "Target nucleic acid" as used herein means a nucleic acid or variant thereof that may be bound by another nucleic acid. A target nucleic acid may be a DNA sequence.
The target nucleic acid may be RNA. The target nucleic acid may comprise an mRNA, tRNA, shRNA, siRNA or Piwi-interacting RNA, or a pri-miRNA, pre-miRNA, miRNA, or anti-miRNA.
The target nucleic acid may comprise a target miRNA binding site or a variant thereof.
One or more probes may bind the target nucleic acid. The target binding site may comprise 5-100 or 10-60 nucleotides. The target binding site may comprise a total of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30-40, 40-50, 50-60, 61, 62 or 63 nucleotides. The target site sequence may comprise at least 5 nucleotides of the sequence of a target miRNA binding site disclosed in U.S. Patent Application Nos. 11/384,049, 11/418,870 or 11/429,720, the contents of which are incorporated herein.
tissue sample As used herein, a tissue sample is tissue obtained from a tissue biopsy using methods well known to those of ordinary skill in the related medical arts. The phrase "suspected of being cancerous" as used herein means a cancer tissue sample believed by one of ordinary skill in the medical arts to contain cancerous cells. Methods for obtaining the sample from the biopsy include gross apportioning of a mass, microdissection, laser-based microdissection, or other art-known cell-separation methods.
variant "Variant" as used herein referring to a nucleic acid means (i) a portion of a referenced nucleotide sequence; (ii) the complement of a referenced nucleotide sequence or portion thereof; (iii) a nucleic acid that is substantially identical to a referenced nucleic acid or the complement thereof; or (iv) a nucleic acid that hybridizes under stringent conditions to the referenced nucleic acid, complement thereof, or a sequence substantially identical thereto.
vector "Vector" as used herein means a nucleic acid sequence containing an origin of replication.
A vector may be a plasmid, bacteriophage, bacterial artificial chromosome or yeast artificial chromosome. A vector may be a DNA or RNA vector. A vector may be either a self-replicating extrachromosomal vector or a vector which integrates into a host genome.
wild type As used herein, the term "wild type" sequence refers to a coding, a non-coding or an interface sequence which is an allelic form of sequence that performs the natural or normal function for that sequence. Wild type sequences include multiple allelic forms of a cognate sequence, for example, multiple alleles of a wild type sequence may encode silent or conservative changes to the protein sequence that a coding sequence encodes.

The present invention employs miRNA for the identification, classification and diagnosis of specific lung cancers.
MicroRNA processing
A gene coding for a microRNA (miRNA) may be transcribed leading to production of an miRNA precursor known as the pri-miRNA. The pri-miRNA may be part of a polycistronic RNA comprising multiple pri-miRNAs. The pri-miRNA may form a hairpin structure with a stem and loop. The stem may comprise mismatched bases.
The hairpin structure of the pri-miRNA may be recognized by Drosha, which is an RNase III endonuclease. Drosha may recognize terminal loops in the pri-miRNA and cleave approximately two helical turns into the stem to produce a 60-70 nucleotide precursor known as the pre-miRNA. Drosha may cleave the pri-miRNA with a staggered cut typical of RNase III endonucleases yielding a pre-miRNA stem loop with a 5' phosphate and ~2 nucleotide 3' overhang. Approximately one helical turn of the stem (~10 nucleotides) extending beyond the Drosha cleavage site may be essential for efficient processing. The pre-miRNA may then be actively transported from the nucleus to the cytoplasm by Ran-GTP and the export receptor Exportin-5.
The pre-miRNA may be recognized by Dicer, which is also an RNase III endonuclease. Dicer may recognize the double-stranded stem of the pre-miRNA. Dicer may also recognize the 5' phosphate and 3' overhang at the base of the stem loop. Dicer may cleave off the terminal loop two helical turns away from the base of the stem loop leaving an additional 5' phosphate and ~2 nucleotide 3' overhang. The resulting siRNA-like duplex, which may comprise mismatches, comprises the mature miRNA and a similar-sized fragment known as the miRNA*. The miRNA and miRNA* may be derived from opposing arms of the pri-miRNA and pre-miRNA. MiRNA* sequences may be found in libraries of cloned miRNAs but typically at lower frequency than the miRNAs.
Although initially present as a double-stranded species with miRNA*, the miRNA may eventually become incorporated as a single-stranded RNA into a ribonucleoprotein complex known as the RNA-induced silencing complex (RISC). Various proteins can form the RISC, which can lead to variability in specificity for miRNA/miRNA* duplexes, binding site of the target gene, activity of miRNA (repression or activation), and which strand of the miRNA/miRNA* duplex is loaded into the RISC.
When the miRNA strand of the miRNA:miRNA* duplex is loaded into the RISC, the miRNA* may be removed and degraded. The strand of the miRNA:miRNA* duplex that is loaded into the RISC may be the strand whose 5' end is less tightly paired. In cases where both ends of the miRNA:miRNA* have roughly equivalent 5' pairing, both miRNA and miRNA* may have gene silencing activity.
The RISC may identify target nucleic acids based on high levels of complementarity between the miRNA and the mRNA, especially by nucleotides 2-7 of the miRNA.
Only one case has been reported in animals where the interaction between the miRNA and its target was along the entire length of the miRNA. This was shown for mir-196 and Hox B8 and it was further shown that mir-196 mediates the cleavage of the Hox B8 mRNA (Yekta et al 2004, Science 304-594). Otherwise, such interactions are known only in plants (Bartel & Bartel 2003, Plant Physiol 132-709).
A number of studies have examined the base-pairing requirement between miRNA and its mRNA target for achieving efficient inhibition of translation (reviewed by Bartel 2004, Cell 116-281). In mammalian cells, the first 8 nucleotides of the miRNA may be important (Doench & Sharp 2004 Genes Dev 2004-504). However, other parts of the microRNA may also participate in mRNA binding. Moreover, sufficient base pairing at the 3' can compensate for insufficient pairing at the 5' (Brennecke et al, 2005 PLoS 3-e85).
Computational studies analyzing miRNA binding on whole genomes have suggested a specific role for bases 2-7 at the 5' of the miRNA in target binding, but the role of the first nucleotide, found usually to be "A", was also recognized (Lewis et al 2005 Cell 120-15).
Similarly, nucleotides 1-7 or 2-8 were used to identify and validate targets by Krek et al (2005, Nat Genet 37-495).
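The role of nucleotides 2-7 (or 2-8) of the miRNA in target recognition may be illustrated by a simple scan for candidate sites. The following is a minimal sketch only: it assumes perfect Watson-Crick pairing across the chosen window, ignores G:U wobble pairs and compensatory 3' pairing, and uses the hypothetical function names seed_site and find_seed_sites. It is not the method of any of the cited studies.

```python
# Watson-Crick complement for an RNA alphabet.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna: str, start: int = 2, end: int = 7) -> str:
    """mRNA motif pairing with miRNA nucleotides start..end (1-based)."""
    seed = mirna.upper().replace("T", "U")[start - 1:end]
    # Pairing is antiparallel, so the site is the reverse complement.
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_sites(mirna: str, mrna: str, start: int = 2, end: int = 7) -> list:
    """0-based positions in mrna where a perfect seed match begins."""
    site = seed_site(mirna, start, end)
    target = mrna.upper().replace("T", "U")
    hits = []
    pos = target.find(site)
    while pos != -1:
        hits.append(pos)
        # Step by one so that overlapping matches are also reported.
        pos = target.find(site, pos + 1)
    return hits
```

Passing start=2 and end=8 switches the scan from the 2-7 window to the 2-8 window, which is stricter and therefore returns a subset of candidate regions.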
The target sites in the mRNA may be in the 5' UTR, the 3' UTR or in the coding region.
Interestingly, multiple miRNAs may regulate the same mRNA target by recognizing the same or multiple sites. The presence of multiple miRNA binding sites in most genetically identified targets may indicate that the cooperative action of multiple RISCs provides the most efficient translational inhibition.
miRNAs may direct the RISC to downregulate gene expression by either of two mechanisms: mRNA cleavage or translational repression. The miRNA may specify cleavage of the mRNA if the mRNA has a certain degree of complementarity to the miRNA. When a miRNA guides cleavage, the cut may be between the nucleotides pairing to residues 10 and 11 of the miRNA. Alternatively, the miRNA may repress translation if the mRNA does not have the requisite degree of complementarity to the miRNA.
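The position of a miRNA-guided cut can be worked out from the antiparallel pairing geometry. The sketch below is illustrative only: it assumes a target site that is fully complementary to the whole miRNA, and the function name cleavage_index is hypothetical.

```python
from typing import Optional

# Watson-Crick complement for an RNA alphabet.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def cleavage_index(mirna: str, mrna: str) -> Optional[int]:
    """Index i such that the cut separates mrna[:i] from mrna[i:].

    Returns None when no fully complementary site is found.
    """
    guide = mirna.upper().replace("T", "U")
    target = mrna.upper().replace("T", "U")
    if len(guide) < 11:
        return None
    site = guide.translate(COMPLEMENT)[::-1]  # fully complementary site
    start = target.find(site)
    if start == -1:
        return None
    # miRNA residue k (1-based, counted from the 5' end) pairs with target
    # index start + len(guide) - k. Residues 11 and 10 therefore pair with
    # adjacent target nucleotides, and the cut falls between them.
    return start + len(guide) - 10
```

With a 22-nucleotide miRNA whose site starts at index 3 of the mRNA, the function returns 15: the nucleotide at index 14 pairs with miRNA residue 11 and the nucleotide at index 15 pairs with residue 10.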
Translational repression may be more prevalent in animals since animals may have a lower degree of complementarity between the miRNA and the binding site.
It should be noted that there may be variability in the 5' and 3' ends of any pair of miRNA and miRNA*. This variability may be due to variability in the enzymatic processing of Drosha and Dicer with respect to the site of cleavage.
Variability at the 5' and 3' ends of miRNA and miRNA* may also be due to mismatches in the stem structures of the pri-miRNA and pre-miRNA. The mismatches of the stem strands may lead to a population of different hairpin structures. Variability in the stem structures may also lead to variability in the products of cleavage by Drosha and Dicer.
Nucleic Acids
Nucleic acids are provided herein. The nucleic acids comprise the sequence of SEQ ID NOS: 1-30 or variants thereof. The variant may be a complement of the referenced nucleotide sequence. The variant may also be a nucleotide sequence that is substantially identical to the referenced nucleotide sequence or the complement thereof.
The variant may also be a nucleotide sequence which hybridizes under stringent conditions to the referenced nucleotide sequence, complements thereof, or nucleotide sequences substantially identical thereto.
The nucleic acid may have a length of from 10 to 250 nucleotides. The nucleic acid may have a length of at least 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 125, 150, 175, 200 or 250 nucleotides. The nucleic acid may be synthesized or expressed in a cell (in vitro or in vivo) using a synthetic gene described herein. The nucleic acid may be synthesized as a single strand molecule and hybridized to a substantially complementary nucleic acid to form a duplex. The nucleic acid may be introduced to a cell, tissue or organ in a single- or double-stranded form or capable of being expressed by a synthetic gene using methods well known to those skilled in the art, including as described in U.S. Patent No. 6,506,559 which is incorporated by reference.
Nucleic acid complexes
The nucleic acid may further comprise one or more of the following: a peptide, a protein, an RNA-DNA hybrid, an antibody, an antibody fragment, a Fab fragment, and an aptamer.
Pri-miRNA
The nucleic acid may comprise a sequence of a pri-miRNA or a variant thereof. The pri-miRNA sequence may comprise from 45-30,000, 50-25,000, 100-20,000, 1,000-1,500 or 80-100 nucleotides. The sequence of the pri-miRNA may comprise a pre-miRNA, miRNA and miRNA*, as set forth herein, and variants thereof. The sequence of the pri-miRNA may comprise the sequence of SEQ ID NOS: 1-2, 4-5, 13-30 or variants thereof.
The pri-miRNA may form a hairpin structure. The hairpin may comprise a first and a second nucleic acid sequence that are substantially complementary. The first and second nucleic acid sequence may be from 37-50 nucleotides. The first and second nucleic acid sequence may be separated by a third sequence of from 8-12 nucleotides. The hairpin structure may have a free energy of less than -25 kcal/mole, as calculated by the Vienna algorithm, with default parameters as described in Hofacker et al., Monatshefte f. Chemie 125: 167-188 (1994), the contents of which are incorporated herein. The hairpin may comprise a terminal loop of 4-20, 8-12 or 10 nucleotides. The pri-miRNA may comprise at least 19% adenosine nucleotides, at least 16% cytosine nucleotides, at least 23% thymine nucleotides and at least 19% guanine nucleotides.
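The nucleotide composition stated for the pri-miRNA lends itself to a simple check. The following minimal sketch tests only the four composition thresholds given above; it does not fold the sequence or compute a free energy, and the names MIN_FRACTION, base_fractions and meets_composition are hypothetical.

```python
from collections import Counter

# Minimum fraction of each nucleotide, taken from the percentages above.
MIN_FRACTION = {"A": 0.19, "C": 0.16, "T": 0.23, "G": 0.19}


def base_fractions(seq: str) -> dict:
    """Fraction of A, C, G and T in a non-empty sequence (U counted as T)."""
    dna = seq.upper().replace("U", "T")
    counts = Counter(dna)
    return {base: counts[base] / len(dna) for base in "ACGT"}


def meets_composition(seq: str) -> bool:
    """True if every nucleotide reaches its minimum fraction."""
    if not seq:
        return False
    fractions = base_fractions(seq)
    return all(fractions[base] >= minimum
               for base, minimum in MIN_FRACTION.items())
```

Uracil is counted as thymine so that the same thresholds can be applied to a sequence written in either the RNA or the DNA alphabet.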
Pre-miRNA
The nucleic acid may also comprise a sequence of a pre-miRNA or a variant thereof. The pre-miRNA sequence may comprise from 45-90, 60-80 or 60-70 nucleotides. The sequence of the pre-miRNA may comprise a miRNA and a miRNA* as set forth herein. The sequence of the pre-miRNA may also be that of a pri-miRNA excluding from 0-160 nucleotides from the 5' and 3' ends of the pri-miRNA. The sequence of the pre-miRNA may comprise the sequence of SEQ ID NOS: 1-2, 4-5, 13-30 or variants thereof.
miRNA
The nucleic acid may also comprise a sequence of a miRNA (including miRNA*) or a variant thereof. The miRNA sequence may comprise from 13-33, 18-24 or 21-23 nucleotides. The miRNA may also comprise a total of at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 or 40 nucleotides. The sequence of the miRNA may be the first 13-33 nucleotides of the pre-miRNA. The sequence of the miRNA may also be the last 13-33 nucleotides of the pre-miRNA. The sequence of the miRNA may comprise the sequence of SEQ ID NOS: 2, 13-21 or variants thereof.
Anti-miRNA
The nucleic acid may also comprise a sequence of an anti-miRNA capable of blocking the activity of a miRNA or miRNA*, such as by binding to the pri-miRNA, pre-miRNA, miRNA or miRNA* (e.g. antisense or RNA silencing), or by binding to the target binding site. The anti-miRNA may comprise a total of 5-100 or 10-60 nucleotides. The anti-miRNA may also comprise a total of at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 or 40 nucleotides. The sequence of the anti-miRNA may comprise (a) at least 5 nucleotides that are substantially identical or complementary to the 5' of a miRNA and at least nucleotides that are substantially complementary to the flanking regions of the target site from the 5' end of the miRNA, or (b) at least 5-12 nucleotides that are substantially identical or complementary to the 3' of a miRNA and at least 5 nucleotides that are substantially complementary to the flanking region of the target site from the 3' end of the miRNA. The sequence of the anti-miRNA may comprise the complement of SEQ ID NOS: 1-2, 4-5, 13-or variants thereof.
Binding Site of Target
The nucleic acid may also comprise a sequence of a target microRNA binding site or a variant thereof. The target site sequence may comprise a total of 5-100 or 10-nucleotides. The target site sequence may also comprise a total of at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62 or 63 nucleotides. The target site sequence may comprise at least 5 nucleotides of the sequence of SEQ ID NOS: 1-2, 4-5 or 13-30.
Synthetic Gene
A synthetic gene is also provided comprising a nucleic acid described herein operably linked to a transcriptional and/or translational regulatory sequence. The synthetic gene may be capable of modifying the expression of a target gene with a binding site for a nucleic acid described herein. Expression of the target gene may be modified in a cell, tissue or organ.
The synthetic gene may be synthesized or derived from naturally-occurring genes by standard recombinant techniques. The synthetic gene may also comprise terminators at the 3'-end of the transcriptional unit of the synthetic gene sequence. The synthetic gene may also comprise a selectable marker.
Vector
A vector is also provided comprising a synthetic gene described herein. The vector may be an expression vector. An expression vector may comprise additional elements. For example, the expression vector may have two replication systems allowing it to be maintained in two organisms, e.g., in one host cell for expression and in a second host cell (e.g., bacteria) for cloning and amplification. For integrating expression vectors, the expression vector may contain at least one sequence homologous to the host cell genome, and preferably two homologous sequences which flank the expression construct.
The integrating vector may be directed to a specific locus in the host cell by selecting the appropriate homologous sequence for inclusion in the vector. The vector may also comprise a selectable marker gene to allow the selection of transformed host cells.
Host Cell
A host cell is also provided comprising a vector, synthetic gene or nucleic acid described herein. The cell may be a bacterial, fungal, plant, insect or animal cell. For example, the host cell line may be DG44 and DUXB11 (Chinese Hamster Ovary lines, DHFR minus), HELA (human cervical carcinoma), CVI (monkey kidney line), COS (a derivative of CVI with SV40 T antigen), R1610 (Chinese hamster fibroblast), BALBC/3T3 (mouse fibroblast), HAK (hamster kidney line), SP2/O (mouse myeloma), P3x63-Ag3.653 (mouse myeloma), BFA-1c1BPT (bovine endothelial cells), RAJI (human lymphocyte) and 293 (human kidney). Host cell lines may be available from commercial services, the American Type Culture Collection or from published literature.

Probes
A probe is provided herein. A probe may comprise a nucleic acid. The probe may have a length of from 8 to 500, 10 to 100 or 20 to 60 nucleotides. The probe may also have a length of at least 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 120, 140, 160, 180, 200, 220, 240, 260, 280 or 300 nucleotides. The probe may comprise a nucleic acid of 18-25 nucleotides.
A probe may be capable of binding to a target nucleic acid of complementary sequence through one or more types of chemical bonds, usually through complementary base pairing, usually through hydrogen bond formation. Probes may bind target sequences lacking complete complementarity with the probe sequence depending upon the stringency of the hybridization conditions. A probe may be single stranded or partially single and partially double stranded. The strandedness of the probe is dictated by the structure, composition, and properties of the target sequence. Probes may be directly labeled or indirectly labeled.
Test Probe
The probe may be a test probe. The test probe may comprise a nucleic acid sequence that is complementary to a miRNA, a miRNA*, a pre-miRNA, or a pri-miRNA. The sequence of the test probe may be selected from SEQ ID NOS: 10-12.
Linker Sequences
The probe may further comprise a linker. The linker may be 10-60 nucleotides in length.
The linker may be 20-27 nucleotides in length. The linker may be of sufficient length to allow the probe to be a total length of 45-60 nucleotides. The linker may not be capable of forming a stable secondary structure, or may not be capable of folding on itself, or may not be capable of folding on a non-linker portion of a nucleic acid contained in the probe. The sequence of the linker may not appear in the genome of the animal from which the probe non-linker nucleic acid is derived.
Reverse Transcription
Target sequences of a cDNA may be generated by reverse transcription of the target RNA.
Methods for generating cDNA may comprise reverse transcribing polyadenylated RNA or, alternatively, RNA with a ligated adaptor sequence.
Reverse Transcription using Adaptor Sequence Ligated to RNA
The RNA may be ligated to an adaptor sequence prior to reverse transcription.
A ligation reaction may be performed by T4 RNA ligase to ligate an adaptor sequence at the 3' end of the RNA. A reverse transcription (RT) reaction may then be performed using a primer comprising a sequence that is complementary to the 3' end of the adaptor sequence.

Reverse Transcription using Polyadenylated Sequence Ligated to RNA
Polyadenylated RNA may be used in a reverse transcription (RT) reaction using a poly(T) primer comprising a 5' adaptor sequence. The poly(T) sequence may comprise 8, 9, 10, 11, 12, 13, or 14 consecutive thymines. The reverse transcription primer may comprise SEQ ID NO: 6.
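The layout of such a poly(T) primer, a 5' adaptor followed by a run of thymines, may be sketched as follows. This is an illustration only: SEQ ID NO: 6 is not reproduced in this section, so the adaptor used in the usage note is a made-up placeholder, and the function name poly_t_primer is hypothetical.

```python
def poly_t_primer(adaptor_5prime: str, t_count: int = 12) -> str:
    """Return a primer, written 5' to 3', as adaptor followed by poly(T)."""
    if not 8 <= t_count <= 14:
        # The text above recites 8 to 14 consecutive thymines.
        raise ValueError("poly(T) stretch must contain 8-14 thymines")
    if not adaptor_5prime:
        raise ValueError("a 5' adaptor sequence is required")
    return adaptor_5prime.upper() + "T" * t_count
```

For example, poly_t_primer("GATTACAGATTACA", 12) returns the placeholder adaptor followed by twelve thymines.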
RT-PCR of RNA
The reverse transcript of the RNA may be amplified by real time PCR, using a specific forward primer comprising at least 15 nucleic acids complementary to the target nucleic acid and a 5' tail sequence; a reverse primer that is complementary to the 3' end of the adaptor sequence; and a probe comprising at least 8 nucleic acids complementary to the target nucleic acid. The probe may be partially complementary to the 5' end of the adaptor sequence.
PCR of Target Nucleic Acids
Methods of amplifying target nucleic acids are described herein. The amplification may be by a method comprising PCR. The first cycles of the PCR reaction may have an annealing temperature of 56 °C, 57 °C, 58 °C, 59 °C, or 60 °C. The first cycles may comprise 1-10 cycles. The remaining cycles of the PCR reaction may be at 60 °C. The remaining cycles may comprise 2-40 cycles. The annealing temperature may cause the PCR to be more sensitive.
The PCR may generate longer products that can serve as higher stringency PCR templates.
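The two-phase annealing scheme described above may be written out as a per-cycle list of annealing temperatures. The following minimal sketch validates the ranges recited in the text; the function name annealing_schedule and the default values are assumptions chosen for illustration and do not represent a recommended protocol.

```python
def annealing_schedule(first_temp_c: int = 58,
                       first_cycles: int = 5,
                       remaining_cycles: int = 35) -> list:
    """Annealing temperature (in degrees C) for each PCR cycle, in order."""
    if first_temp_c not in (56, 57, 58, 59, 60):
        raise ValueError("first-phase annealing temperature must be 56-60 C")
    if not 1 <= first_cycles <= 10:
        raise ValueError("first phase must comprise 1-10 cycles")
    if not 2 <= remaining_cycles <= 40:
        raise ValueError("remaining phase must comprise 2-40 cycles")
    # The first cycles use the chosen temperature; the rest run at 60 C.
    return [first_temp_c] * first_cycles + [60] * remaining_cycles
```

For example, annealing_schedule(56, 3, 4) yields three cycles at 56 followed by four cycles at 60.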
Forward Primer
The PCR reaction may comprise a forward primer. The forward primer may comprise 15, 16, 17, 18, 19, 20, or 21 nucleotides identical to the target nucleic acid.
The 3' end of the forward primer may be sensitive to differences in sequence between a target nucleic acid and a sibling nucleic acid.
The forward primer may also comprise a 5' overhanging tail. The 5' tail may increase the melting temperature of the forward primer. The sequence of the 5' tail may comprise a sequence that is non-identical to the genome of the animal from which the target nucleic acid is isolated. The sequence of the 5' tail may also be synthetic. The 5' tail may comprise 8, 9, 10, 11, 12, 13, 14, 15, or 16 nucleotides. The forward primer may comprise SEQ ID NOS: 7-9.
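The structure of such a forward primer, a 5' tail followed by a stretch identical to the target, may be sketched as follows. This is a minimal illustration under stated assumptions: the identical stretch is taken from the 5' end of the target and written in the DNA alphabet, the tail in the usage note is a made-up placeholder rather than any of SEQ ID NOS: 7-9, and the function name forward_primer is hypothetical.

```python
def forward_primer(target: str, tail_5prime: str, match_len: int = 18) -> str:
    """Return a primer, written 5' to 3', as tail plus target-identical part."""
    if not 15 <= match_len <= 21:
        raise ValueError("target-identical part must be 15-21 nucleotides")
    if not 8 <= len(tail_5prime) <= 16:
        raise ValueError("5' tail must be 8-16 nucleotides")
    dna = target.upper().replace("U", "T")  # primers are written as DNA
    if len(dna) < match_len:
        raise ValueError("target is shorter than the requested match length")
    # Assumption: the identical stretch starts at the 5' end of the target,
    # which places the primer's 3' end inside the target sequence.
    return tail_5prime.upper() + dna[:match_len]
```

For example, forward_primer("UACGGAUCCAUGCAAGUCCGAU", "GATTACAGATTA", 18) returns the 12-nucleotide placeholder tail followed by the first 18 nucleotides of the target written as DNA.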
Reverse Primer
The PCR reaction may comprise a reverse primer. The reverse primer may be complementary to a target nucleic acid. The reverse primer may also comprise a sequence complementary to an adaptor sequence. The sequence complementary to an adaptor sequence may comprise 12-24 nucleotides.
Biochip
A biochip is also provided. The biochip may comprise a solid substrate comprising an attached probe or plurality of probes described herein. The probes may be capable of hybridizing to a target sequence under stringent hybridization conditions. The probes may be attached at spatially defined locations on the substrate. More than one probe per target sequence may be used, with either overlapping probes or probes to different sections of a particular target sequence. The probes may be capable of hybridizing to target sequences associated with a single disorder appreciated by those in the art. The probes may either be synthesized first, with subsequent attachment to the biochip, or may be directly synthesized on the biochip.
The solid substrate may be a material that may be modified to contain discrete individual sites appropriate for the attachment or association of the probes and is amenable to at least one detection method. Representative examples of substrate materials include glass and modified or functionalized glass, plastics (including acrylics, polystyrene and copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene, polyurethanes, Teflon, etc.), polysaccharides, nylon or nitrocellulose, resins, silica or silica-based materials including silicon and modified silicon, carbon, metals, inorganic glasses and plastics. The substrates may allow optical detection without appreciably fluorescing.
The substrate may be planar, although other configurations of substrates may be used as well. For example, probes may be placed on the inside surface of a tube, for flow-through sample analysis to minimize sample volume. Similarly, the substrate may be flexible, such as flexible foam, including closed cell foams made of particular plastics.
The substrate of the biochip and the probe may be derivatized with chemical functional groups for subsequent attachment of the two. For example, the biochip may be derivatized with a chemical functional group including, but not limited to, amino groups, carboxyl groups, oxo groups or thiol groups. Using these functional groups, the probes may be attached using functional groups on the probes either directly or indirectly using a linker.
The probes may be attached to the solid support by either the 5' terminus, 3' terminus, or via an internal nucleotide.
The probe may also be attached to the solid support non-covalently. For example, biotinylated oligonucleotides can be made, which may bind to surfaces covalently coated with streptavidin, resulting in attachment. Alternatively, probes may be synthesized on the surface using techniques such as photopolymerization and photolithography.
Diagnostics
A method of diagnosis is also provided. The method comprises detecting a differential expression level of lung cancer-associated nucleic acids in a biological sample. The sample may be derived from a patient. Diagnosis of a cancer state, and its histological type, in a patient may allow for prognosis and selection of therapeutic strategy.
Further, the developmental stage of cells may be classified by determining temporarily expressed cancer-associated nucleic acids.
In situ hybridization of labeled probes to tissue arrays may be performed.
When comparing the fingerprints between an individual and a standard, the skilled artisan can make a diagnosis, a prognosis, or a prediction based on the findings. It is further understood that the genes which indicate the diagnosis may differ from those which indicate the prognosis, and molecular profiling of the condition of the cells may lead to distinctions between responsive or refractory conditions or may be predictive of outcomes.
Kits
A kit is also provided and may comprise a nucleic acid described herein together with any or all of the following: assay reagents, buffers, probes and/or primers, and sterile saline or another pharmaceutically acceptable emulsion and suspension base. In addition, the kits may include instructional materials containing directions (e.g., protocols) for the practice of the methods described herein.
For example, the kit may be used for the amplification, detection, identification or quantification of a target nucleic acid sequence. The kit may comprise a poly(T) primer, a forward primer, a reverse primer, and a probe.
Any of the compositions described herein may be comprised in a kit. In a non-limiting example, reagents for isolating miRNA, labeling miRNA, and/or evaluating a miRNA population using an array are included in a kit. The kit may further include reagents for creating or synthesizing miRNA probes. The kits will thus comprise, in suitable container means, an enzyme for labeling the miRNA by incorporating labeled nucleotide or unlabeled nucleotides that are subsequently labeled. It may also include one or more buffers, such as reaction buffer, labeling buffer, washing buffer, or a hybridization buffer, compounds for preparing the miRNA probes, components for in situ hybridization and components for isolating miRNA. Other kits of the invention may include components for making a nucleic acid array comprising miRNA, and thus, may include, for example, a solid support.

The following examples are presented in order to more fully illustrate some embodiments of the invention. They should in no way be construed, however, as limiting the broad scope of the invention.

EXAMPLES
Example 1
Experimental Procedures
1. miRdicator™ array platform
Custom microarrays were produced by printing DNA oligonucleotide probes to 688 miRs (miRNA) [Sanger database, version 9.1 (miRBase: microRNA sequences, targets and gene nomenclature. Griffiths-Jones S, Grocock RJ, van Dongen S, Bateman A, Enright AJ. NAR, 2006, 34, Database Issue, D140-D144) and additional Rosetta Genomics validated and predicted miRs]. Each probe carries a linker of up to 22 nucleotides (nt) at the 3' end of the miRNA's complement sequence, in addition to an amine group used to couple the probes to coated glass slides. 20 µM of each probe were dissolved in 2X SSC + 0.0035% SDS and spotted in triplicate on Schott Nexterion® Slide E coated microarray slides using a Genomic Solutions BioRobotics MicroGrid II according to the MicroGrid manufacturer's directions. 64 negative control probes were designed using the sense sequences of different miRNAs. Two groups of positive control probes were designed to hybridize to the miRdicator™ array: (1) synthetic small RNA spikes were added to the RNA before labeling to verify the labeling efficiency, and (2) probes for abundant small RNAs [e.g. small nuclear RNAs (U43, U49, U24, Z30, U6, U48, U44), 5.8s and 5s ribosomal RNA] were spotted on the array to verify RNA quality. The slides were blocked in a solution containing 50 mM ethanolamine, 1 M Tris (pH 9.0) and 0.1% SDS for 20 min at 50 °C, then thoroughly rinsed with water and spun dry.
2. Cy-dye labeling of microRNA for miRdicator™ array
15 µg of total RNA was labeled by ligation of an RNA-linker p-rCrU-Cy-dye (Thomson et al., 2004, Nat Methods 1, 47-53) (Dharmacon) to the 3'-end with Cy3 or Cy5. The labeling reaction contained total RNA, spikes (20-0.1 fmoles), 500 ng RNA-linker-dye, 15% DMSO, 1x ligase buffer and 20 units of T4 RNA ligase (NEB) and proceeded at 4 °C for 1 hr followed by 1 hr at 37 °C. The labeled RNA was mixed with 3x hybridization buffer (Ambion), heated to 95 °C for 3 min and then added on top of the miRdicator™ array. Slides were hybridized for 12-16 hr, followed by two washes with 1xSSC and 0.2% SDS and a final wash with 0.1xSSC.

The array was scanned using an Agilent Microarray Scanner Bundle G2565BA (resolution of 10 µm at 100% power). The data was analyzed using SpotReader software.
3. RNA extraction
RNA was extracted from frozen or formalin fixed paraffin-embedded (FFPE) tissues originating from lung adenocarcinoma, lung squamous cell carcinoma and lung large cell carcinoma.
Total RNA from frozen tissues was extracted with the mirVana miRNA isolation kit (Ambion) according to the manufacturer's instructions.
Total RNA from formalin fixed, paraffin-embedded (FFPE) tissues was extracted according to the following protocol:
1 ml xylene (Biolab) was added to 1-2 mg tissue, incubated at 57 °C for 5 min and centrifuged for 2 min at 10,000g. The supernatant was removed and 1 ml ethanol (100%) (Biolab) was added. Following centrifugation for 10 min at 10,000g, the supernatant was discarded and the washing procedure was repeated. Following air drying for 10-15 min, 500 µl Buffer B (NaCl 10 mM, Tris pH 7.6 500 mM, EDTA 20 mM, SDS 1%) and 5 µl proteinase K (50 mg/ml) (Sigma) were added. Following incubation at 45 °C for 16 h, inactivation of the proteinase K at 100 °C for 7 min was performed. Following extraction with acid phenol chloroform (1:1) (Sigma) and centrifugation for 10 min at maximum speed at 4 °C, the upper phase was transferred to a new tube with the addition of 3 volumes of 100% ethanol, 0.1 volume of NaOAc (BioLab) and 8 µl glycogen (Ambion) and left overnight at -20 °C.
Following centrifugation at maximum speed for 40 min at 4 °C, washing with 1 ml ethanol (85%), and drying, the RNA was re-suspended in 45 µl DDW.
The RNA concentration was tested and DNase Turbo (Ambion) was added accordingly (1 µl DNase/10 µg RNA). Following incubation for 30 min at room temperature and extraction with acid phenol chloroform, the RNA was re-suspended in 45 µl DDW.
The RNA concentration was tested again and DNase Turbo (Ambion) was added accordingly (1 µl DNase/10 µg RNA). Following incubation for 30 min at room temperature and extraction with acid phenol chloroform, the RNA was re-suspended in 20 µl DDW.
4. RNA polyadenylation and annealing of Poly(T) adapter
A mixture was prepared according to the following:
Component | Vol/sample
PNK buffer (NEB) | 1 µl
25 mM MnCl2 (Sigma) | 1 µl
10 mM ATP (Promega) | 2 µl
Poly A polymerase (Takara) | 1 µl
Total Vol | 5 µl
5 µl of the mixture were added to 5 µl of the appropriate RNA sample (1 µg) (or to the ultra pure water of the No RNA control). The reaction was incubated for 1 hour at 37 °C.
Poly(T) adapter (GCGAGCACAGAATTAATACGACTCACTATCGGTTTTTTTTTTTTVN - SEQ ID NO: 6) mixture was prepared according to the following:
Component | Vol/sample
0.5 µg/µl Poly(T) adapter (IDT) | 1 µl
Ultra pure water | 2 µl
Total Vol | 3 µl
3 µl from the Poly(T) adapter mixture and 5 µl from the poly-adenylated RNA or negative control were transferred to PCR tubes. The annealing process was performed by the following annealing program:
STEP 1: 85 °C for 2 min
STEP 2: 70 °C to 25 °C - decrease of 1 °C in each cycle for 20 sec.
5. Reverse Transcription
Reverse Transcription mixture was prepared according to the following:
Component | Vol/sample
5x RT buffer (Invitrogen) | 4 µl
Trehalose D 1.7 M (Calbiochem, Sigma) | 3 µl
10 mM dNTPs mix (Promega) | 1 µl
DTT 0.1 M (Invitrogen) | 2 µl
Total Vol | 10 µl

1.5 µl Recombinant RNasin (Promega) and 1 µl SuperScript II RT (Invitrogen) were added to the above mixture. 12.5 µl of the mix were added to each PCR tube containing the annealed PolyA RNA and to the No RNA control.
The tubes were inserted into a PCR instrument (MJ Research Inc.) and the following program was performed:
STEP 1: 37 °C for 5 min
STEP 2: 45 °C for 5 min
STEP 3: Repeat steps 1-2, 5 times
STEP 4: End the program at 4 °C
The cDNA microtubes were stored at -20 °C.
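The per-sample volumes in the reverse transcription recipe can be scaled to a batch of reactions with a few lines of arithmetic. The following is a minimal sketch only: the function name `scale_master_mix` and the optional pipetting overage are illustrative assumptions and are not part of the described protocol; the per-sample volumes are those listed in section 5.

```python
# Per-sample reverse transcription recipe in microlitres, as listed in section 5.
RT_RECIPE_UL = {
    "5x RT buffer": 4.0,
    "Trehalose D 1.7 M": 3.0,
    "10 mM dNTPs mix": 1.0,
    "DTT 0.1 M": 2.0,
    "Recombinant RNasin": 1.5,
    "SuperScript II RT": 1.0,
}


def scale_master_mix(recipe_ul, n_reactions, overage=0.0):
    """Scale a per-reaction recipe to n_reactions.

    `overage` is a hypothetical fractional excess (e.g. 0.1 for 10%) to cover
    pipetting losses; the source text does not specify one.
    """
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in recipe_ul.items()}


# Volume dispensed per tube: 10 ul of mix plus 1.5 ul RNasin and 1 ul RT enzyme.
per_reaction_total = sum(RT_RECIPE_UL.values())

# Example: master mix for 20 reactions with no overage.
mix = scale_master_mix(RT_RECIPE_UL, n_reactions=20)
for name, vol in mix.items():
    print(f"{name}: {vol} ul")
```

The per-reaction total computed this way is 12.5 µl, which agrees with the volume added to each PCR tube in the text.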
6. Real time PCR using MGB probe
Each cDNA sample was evaluated in triplicate for the following three RNAs: hsa-miR-21 (SEQ ID NO: 2), hsa-miR-205 (SEQ ID NO: 1) and U6 (SEQ ID NO: 3).
A primer-probe mix was prepared. In each tube, 10 µM Fwd primer was mixed with the same volume of 5 µM of the corresponding MGB probe (ABI) specific for the same RNA. The sequences of the Fwd primers and MGB probes are indicated in Table 1.
Table 1: Sequences of primers and probes
Name | Fwd (forward miR-specific) primer | SEQ ID NO | TaqMan MGB probe | SEQ ID NO
miR- | CAGTCATTTGGGTCCTTCAT | | CGTTTTTTTTTTTTCAG |
miR- | CAGTCATTTGGGTAGCTTAT | 8 | CCGTTTTTTTTTTTTCA | 11
The cDNA was diluted to a final concentration of 0.5 ng/µl. PCR mixture was prepared according to the following:
Component | Vol per well
2X TaqMan Universal PCR (ABI) | 10 µl
RT-rev-primer-Race 10 µM (IDT) | 1 µl
Ultra pure water | 6 µl
Total Vol | 17 µl
68 µl (for No RNA control and for No cDNA control) or 170 µl of the PCR mix were dispensed into the appropriately labeled microtubes. 10 µl cDNA (0.5 ng/µl) were added into the appropriately labeled microtubes containing the mix. The PCR plates were prepared by dispensing 18 µl from the mix into each well. 2 µl of primer-probe mixture were added into each well using a multi-channel pipette. The plates were loaded in a Real Time-PCR instrument (Applied Biosystems) and the following program was performed:

Stage 1, Reps=1
STEP 1: Hold @ 95.0 for 10:00 (MM:SS), Ramp Rate = 100
Stage 2, Reps=40
STEP 1: Hold @ 95.0 for 0:15 (MM:SS), Ramp Rate = 100
STEP 2: Hold @ 60.0 for 1:00 (MM:SS), Ramp Rate = 100
Standard 7500 Mode
Sample Volume (µL): 20.0
Data Collection: Stage 2, Step 2
7. miRdicator™ array data normalization
The initial data set consisted of signals measured for multiple probes for every sample. For the analysis, signals were used only for probes that were designed to measure the expression levels of known or validated human microRNAs.
Triplicate spots were combined into one signal by taking the logarithmic mean of the reliable spots. All data was log-transformed and the analysis was performed in log-space. A reference data vector for normalization, $R$, was calculated by taking the mean expression level for each probe in two representative samples, one from each tumor type, for example: lung squamous cell carcinoma and lung adenocarcinoma.
For each sample $k$ with data vector $S^k$, a 2nd degree polynomial $F^k$ was found so as to provide the best fit between the sample data and the reference data, such that $R \approx F^k(S^k)$. Remote data points ("outliers") were not used for fitting the polynomials $F$.
For each probe in the sample (element $S_i^k$ in the vector $S^k$), the normalized value (in log-space) $M_i^k$ is calculated from the initial value $S_i^k$ by transforming it with the polynomial function $F^k$, so that $M_i^k = F^k(S_i^k)$. Data is translated back to linear-space by taking the exponent.
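The normalization step can be illustrated with a short sketch: fit a second-degree polynomial that maps a sample's log signals onto the reference vector, then apply it to every probe. This is an illustration under stated assumptions, not the procedure actually used. In particular, the source does not say how outliers were identified, so the two-pass rule here (drop points whose residual exceeds `outlier_sd` standard deviations of the first fit, then refit) is hypothetical, as are the function names. Base-2 logarithms are also assumed, based on the log2(300) threshold in the next section.

```python
def fit_quadratic(x, y):
    """Least-squares fit of y ~ a*u^2 + b*u + c with u = x - mean(x).

    Returns a callable predictor. Needs at least three distinct x values.
    """
    center = sum(x) / len(x)
    u = [xi - center for xi in x]
    # Power sums for the 3x3 normal equations in the unknowns (a, b, c).
    s = [sum(ui ** k for ui in u) for k in range(5)]
    t = [sum(yi * ui ** k for ui, yi in zip(u, y)) for k in range(3)]
    m = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [mr - f * mc for mr, mc in zip(m[r], m[col])]
    a, b, c = (m[i][3] / m[i][i] for i in range(3))
    return lambda v: a * (v - center) ** 2 + b * (v - center) + c


def normalize_sample(sample_log, reference_log, outlier_sd=2.0):
    """Map a sample's log signals onto the reference scale (log space)."""
    first = fit_quadratic(sample_log, reference_log)
    residuals = [r - first(v) for v, r in zip(sample_log, reference_log)]
    sd = (sum(e * e for e in residuals) / len(residuals)) ** 0.5
    # Assumed outlier rule: keep points within outlier_sd * sd of the first fit.
    kept = [(v, r) for v, r, e in zip(sample_log, reference_log, residuals)
            if sd == 0 or abs(e) <= outlier_sd * sd]
    final = fit_quadratic([v for v, _ in kept], [r for _, r in kept])
    return [final(v) for v in sample_log]


# Synthetic example: the reference is an exact quadratic function of the sample.
sample = [6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0]
reference = [0.01 * v * v + 0.9 * v + 0.5 for v in sample]
normalized = normalize_sample(sample, reference)
linear = [2 ** m for m in normalized]  # back to linear space (base 2 assumed)
```

Centering the signals before building the normal equations keeps the 3x3 system well conditioned; a library routine such as a polynomial least-squares fit would serve equally well.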

8. Statistical analysis
The purpose of this statistical analysis was to find probes whose normalized signal levels differ significantly between the two compared sample sets. Probes that had normalized signal levels below log2(300) in the two sample sets were not analyzed. For each probe, two groups of normalized signals obtained for two sample sets were compared. The p-value was calculated for each probe, using the statistical un-paired two-sided t-test method. The p-value is the probability for obtaining, by chance, the measured signals or a more extreme difference between the groups, had the two groups of signals come from distributions with equal mean values. microRNAs whose probes had the lowest and most significant t-test p-values were selected. A p-value lower than 0.05 means that the probability that the two groups come from distributions with the same mean is lower than 0.05 or 5%, under the assumption of normal (Gaussian) signal distributions. The two groups of signals are likely to result from distributions with different means, and the relevant microRNA is likely to be differentially expressed between the two sets of samples.
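The probe selection described above can be sketched as follows. Several choices here are assumptions rather than details from the source: the pooled-variance (equal variance) form of the unpaired t-test, the reading of the log2(300) filter as "every signal in both sets is below the threshold", and all function names and example values. The p-value is obtained by numerically integrating the Student t density, so no statistics library is needed.

```python
import math

FLOOR = math.log2(300)  # signal threshold from the text, in log2 space


def student_t_pvalue(t_stat, df, steps=2000):
    """Two-sided p-value for Student's t, via Simpson's rule on the density."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t_stat)
    h = upper / steps  # steps must be even for Simpson's rule
    area = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    central = area * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central)


def unpaired_t_test(group_a, group_b):
    """Pooled-variance two-sample t-test (assumed variant); returns (t, p)."""
    na, nb = len(group_a), len(group_b)
    mean_a, mean_b = sum(group_a) / na, sum(group_b) / nb
    ss_a = sum((x - mean_a) ** 2 for x in group_a)
    ss_b = sum((x - mean_b) ** 2 for x in group_b)
    df = na + nb - 2
    pooled_var = (ss_a + ss_b) / df
    if pooled_var == 0:
        raise ValueError("both groups have zero variance")
    t_stat = (mean_a - mean_b) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t_stat, student_t_pvalue(t_stat, df)


def differential_probes(probes, alpha=0.05):
    """probes maps a probe name to (signals_set_1, signals_set_2) in log2 space."""
    hits = []
    for name, (a, b) in probes.items():
        if max(a) < FLOOR and max(b) < FLOOR:
            continue  # below the signal threshold in both sets: not analyzed
        _, p = unpaired_t_test(a, b)
        if p < alpha:
            hits.append((p, name))
    return [name for _, name in sorted(hits)]


# Hypothetical signals, for illustration only.
example = {
    "probe_high": ([7.4, 7.6, 7.5, 7.3], [12.0, 12.1, 11.9, 12.2]),
    "probe_flat": ([9.0, 9.2, 8.8, 9.1], [9.1, 8.9, 9.0, 9.2]),
    "probe_low": ([5.0, 5.1, 4.9], [7.0, 7.1, 6.9]),
}
selected = differential_probes(example)
```

With many probes tested at once, a fixed 0.05 cut-off admits false positives; the source ranks probes by p-value and reports the lowest ones, which the sorted return value mirrors.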
9. In situ hybridization detection of hsa-mir-205
Standard paraffin sections of lung squamous cell carcinoma were mounted on Superfrost plus histological slides (Menzel-Glazer). Before the hybridization, slides with sections were kept at 60 °C for 2 hrs.
All incubations at pre- and posthybridization steps were performed at room temperature unless stated otherwise. All solutions were prepared using ultrapure water purified by an EASYpure II system (Barnstead) equipped with an ultrafilter.
A. Prehybridization treatment
Sections were deparaffinized by three consecutive incubations in xylene (5 min each) and rehydrated through the following series of ethanols: 100% - 3 changes, 2 min each; 95% and 70% - 2 min each. Then slides were washed for 5 min in ultrapure water, put into 0.01 M citrate buffer (pH 6.0), heated in a water bath until boiling and kept at boiling temperature for 10 min. Then slides were left in the buffer to cool down for 1 hr at room temperature.
Slides were incubated in proteinase K solution (20 µg/ml in 1 mM EDTA/10 mM Tris-HCl pH 7.5) for 10 min at 37 °C and immediately fixed in freshly prepared 10% formalin solution in phosphate buffered saline (PBS) for 20 min. Formalin fixation was followed by 5 min incubation in 0.2% glycine in PBS and three 2 min washes in ultrapure water.
Then slides were acetylated by shaking in a 1.1% (v/v) solution of triethanolamine to which 0.25% (v/v) of acetic anhydride was added simultaneously with the slides. After 5 min a new portion of acetic anhydride was added and acetylation proceeded for another 5 min.
Acetylation was followed by three 2 min washings in ultrapure water and then slides were rapidly dehydrated through graded ethanols (70%, 95%, 100% - 2 min each) and air-dried.
B. Hybridization
Hybridization solution was prepared by diluting digoxigenin-labeled LNA-enhanced probe complementary to hsa-miR-205 (Exiqon product# 18099-01) to 25 nM in hybridization buffer, and ~50 µl of this solution were applied to air-dried sections. For the negative control, parallel sections were incubated with control hybridization solution prepared by dilution of digoxigenin-labeled scramble-miR LNA probe (Exiqon product# 99001-01).

After application of hybridization solution, sections were covered with pieces of polyethylene film cut to the size of sections and incubated overnight at 50 °C.
Composition of hybridization buffer:
Dextran sulfate | 10%
SSC | x4
Deionized formamide | 50%
Denhardt's Solution | x1
Salmon sperm DNA | 0.5 mg/ml
Yeast tRNA | 0.25 mg/ml
C. Posthybridization washing and immunodetection
After hybridization, slides were transferred into 5xSSC preheated to 50 °C and incubated for 30 min. During this incubation covers floated off the slides. Then slides were washed for another 30 min in 2xSSC at 50 °C.
Then slides were briefly washed in Tris buffered saline with Tween-20 (TBST - 0.15 M NaCl, 0.05 M Tris-HCl pH 7.5, 0.1% Tween-20) and incubated for 1 hr in blocking solution (10% bovine serum albumin in TBST).
For the detection of bound digoxigenin, sections were incubated for 2 hrs with sheep anti-digoxigenin antibodies Fab conjugated to alkaline phosphatase (Roche Cat# 11093274910) diluted 1:250 in blocking solution, followed by 5 washings in TBST. Then slides were briefly washed in alkaline phosphatase buffer (APB - 0.1 M Tris-HCl pH 9.5, 0.05 M NaCl, 0.025 M MgCl2) and incubated for 5 hrs in staining solution - APB containing 4.5 µl/ml of 5-bromo-4-chloro-3-indolyl-phosphate (BCIP - stock solution by Roche - Cat# 11383221001) and 3.5 µl/ml of 4-Nitro blue tetrazolium chloride (NBT - stock solution by Roche - Cat# 11383213001).
Finally, sections were washed in distilled water and coverslipped using Immu-Mount (Thermo Scientific Cat# 9990402).

Example 2
Specific microRNAs are able to distinguish between lung adenocarcinoma and lung squamous cell carcinoma
The statistical analysis of the miRdicator™ array results of lung adenocarcinoma vs. lung squamous cell carcinoma is presented in Table 2. The results exhibited a significant difference in the expression pattern of hsa-miR-205 (SEQ ID NO: 1).

The normalized expression levels of hsa-miR-205 were found to increase in lung squamous cell carcinoma in comparison to lung adenocarcinoma, as measured by miRdicator™ array (Figures 1-3).
The sensitivity of the squamous cell carcinoma detection by hsa-miR-205 is 100 % (9/9) and the specificity of the signal is 84.2% (16/19).
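As a check on the arithmetic, the reported figures follow directly from the counts: 9 of 9 squamous samples were detected, and 16 of 19 adenocarcinoma samples were correctly classified as non-squamous, which implies 3 false positives. A minimal sketch (the function name is illustrative):

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) as fractions between 0 and 1."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Counts taken from the text: 9/9 squamous detected, 16/19 adenocarcinoma negative.
sens, spec = sensitivity_specificity(true_pos=9, false_neg=0, true_neg=16, false_pos=3)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
# prints "sensitivity = 100.0%, specificity = 84.2%"
```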

Table 2:

miR name | HID | MID | Mean adeno-carcinoma (log) | Mean squamous (log) | Number of samples, adeno-carcinoma | Number of samples, squamous | p-value
hsa-miR-205 | 4 | 1 | 7.49 | 12.04 | 19 | 9 | 9.47E-07

miR name: is the miRBase registry name (release 9.1).
HID: is the SEQ ID NO of the microRNA hairpin precursor (Pre-microRNA).
MID: is the SEQ ID NO of the mature microRNA.
Mean adeno-carcinoma (log): is the mean of the logarithms (log) of chip signal of lung adenocarcinoma samples.
Mean squamous (log): is the mean of the logarithms (log) of chip signal of lung squamous samples.
Number of samples, adeno-carcinoma: is the number of lung adenocarcinoma samples.
Number of samples, squamous: is the number of lung squamous samples.
p-value: is the result of the un-paired two-sided t-test between samples.

Example 3
Establishment of qRT-PCR assay for distinguishing between lung squamous cell carcinoma and other NSCLC
RNA was extracted from 20 lung samples of paraffin-embedded (FFPE) tissues originating from lung squamous cell carcinoma and other Non Small Cell Lung Carcinoma (NSCLC) as described in Example 1 (3). The expression levels of hsa-miR-205 (SEQ ID NO: 1), hsa-miR-21 (SEQ ID NO: 2) and U6 (SEQ ID NO: 3) were detected by qRT-PCR assay as described in Example 1 (4-6). The weighted Ct of 3 repeats of the 3 probes was calculated. The Ct of negative control wells was undetermined.
The data was interpreted according to the following criteria:

U6 should have a weighted Ct of between 20 and 32. If not, the experiment failed.
The weighted Ct of the 3 repeats was calculated according to the following:
If all repeats were within a difference of 1 Ct, meaning that the difference between the minimal and maximal Cts was less than 1, then their average was calculated:

$Ct_{max} - Ct_{min} < 1 \Rightarrow \text{weighted } Ct = (Ct_{max} + Ct_{median} + Ct_{min})/3$

The average of the outlier Cts was calculated if they had a difference of 1 Ct or less from the middle Ct value:

$Ct_{max} - Ct_{median} \le 1 \ \&\ Ct_{median} - Ct_{min} \le 1 \Rightarrow \text{weighted } Ct = (Ct_{max} + Ct_{median} + Ct_{min})/3$

Using the weighted calculated Ct, the assay final score was determined by subtracting the average of the Cts of U6 and hsa-mir-21 from the Ct of hsa-mir-205:

$\text{Assay final score} = \text{weighted } Ct_{\text{miR-205}} - \text{average}(\text{weighted } Ct_{\text{miR-21}},\ \text{weighted } Ct_{\text{U6}})$

If the Ct of hsa-mir-205 was undetermined and the weighted Ct of U6 was within the legitimate range, then the assay result is "Non-Squamous" with "High" confidence level:

$Ct_{\text{miR-205}} = \text{ND} \ \&\ 20 \le Ct_{\text{U6}} \le 32 \Rightarrow \text{Assay result} = \text{Non-Squamous with high confidence}$

Otherwise: The result analysis is based on the assay final score calculation as described in Table 3:

Table 3:
Assay final score | Assay result | Confidence
> 4 | Non-Squamous | High
< 1 | Squamous cell carcinoma | High
≥ 2.5 and < 4 | Non-Squamous | Low
> 1 and < 2.5 | Squamous cell carcinoma | Low

Figure 4 demonstrates the full separation between samples originating from lung squamous cell carcinoma (asterisks) and samples originating from other NSCLC, including lung adenocarcinoma and lung undifferentiated large cell carcinoma (ellipses), using qRT-PCR expression levels of hsa-miR-205 (SEQ ID NO: 1), normalized by qRT-PCR expression levels of hsa-miR-21 (SEQ ID NO: 2) and U6 (SEQ ID NO: 3), and a threshold of a final score as described above. The full black line represents the threshold. The dashed black lines indicate the low confidence area border.
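The scoring rules of Example 3 can be written as a small decision function. This is a sketch under stated assumptions, not the assay software: the function names are illustrative, and the source leaves three cases open that are filled here by assumption and marked in the comments (replicates spread more widely than 1 Ct from the median, scores falling exactly on a boundary of Table 3, and which side of the 2.5 threshold is inclusive).

```python
def weighted_ct(cts):
    """Mean of three replicate Cts when both outer values lie within 1 Ct of
    the median; otherwise None (assumption: the source does not say how wider
    spreads are handled)."""
    low, median, high = sorted(cts)
    if high - median <= 1 and median - low <= 1:
        return (low + median + high) / 3
    return None


def classify(ct_205, ct_21, ct_u6):
    """Each argument is a list of three replicate Cts.

    Pass ct_205=None when hsa-miR-205 was undetermined.
    Returns (assay_result, confidence).
    """
    u6 = weighted_ct(ct_u6)
    if u6 is None or not 20 <= u6 <= 32:
        return ("Failed", None)  # U6 outside the legitimate range
    if ct_205 is None:
        return ("Non-Squamous", "High")
    w205, w21 = weighted_ct(ct_205), weighted_ct(ct_21)
    if w205 is None or w21 is None:
        return ("Failed", None)  # assumption: unusable replicates fail the run
    score = w205 - (w21 + u6) / 2
    # Threshold at 2.5; treating exactly 2.5 as Non-Squamous is an assumption.
    result = "Non-Squamous" if score >= 2.5 else "Squamous cell carcinoma"
    # Scores exactly at 1 or 4 are treated as low confidence (assumption).
    confidence = "High" if score > 4 or score < 1 else "Low"
    return (result, confidence)


# Hypothetical replicate Cts, for illustration only.
ref_21 = [22.0, 22.1, 21.9]
ref_u6 = [26.0, 26.2, 25.8]
print(classify([24.0, 24.2, 24.1], ref_21, ref_u6))
print(classify([27.0, 27.1, 26.9], ref_21, ref_u6))
print(classify(None, ref_21, ref_u6))
```

Note that the first averaging rule in the text (maximum minus minimum below 1 Ct) is a special case of the second (each outer value within 1 Ct of the median), so a single check covers both.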
Example 4
In situ hybridization detection of hsa-mir-205
Sections of lung squamous cell carcinoma were hybridized to hsa-miR-205 specific probe and control (scramble) probe (see Example 1).
As shown in Figure 5, staining of varying intensity of cells of squamous cell carcinoma was observed (Fig. 5A), while no staining was detected in sections hybridized to control (scramble) probe (Fig. 5B).
Example 5
Specific microRNAs are able to distinguish between lung adenocarcinoma and lung large cell carcinoma
The statistical analysis of the miRdicator™ array results of lung adenocarcinoma vs. lung large cell carcinoma is presented in Table 4. The results exhibited a significant difference in the expression pattern of several miRs, most prominent among them being hsa-miR-513 (SEQ ID NO: 13).
The normalized expression levels of hsa-miR-513 were found to increase in lung large cell carcinoma in comparison to lung adenocarcinoma, as measured by miRdicator™ array (Figures 6-8).
The sensitivity of the adenocarcinoma detection by hsa-miR-513 is 94.7% (18/19) and the specificity of the signal is 85.7% (6/7).
Table 4:

miR name | HID | MID | Mean adeno-carcinoma (log) | Mean large cell (log) | Number of samples, adeno-carcinoma | Number of samples, large cell | p-value
hsa-miR-513 | 22 | 13 | 8.27 | 10.31 | 19 | 7 | 6.14E-05
hsa-miR-183 | 23 | 14 | 8.21 | 10.47 | 19 | 7 | 1.71E-04
hsa-miR-189 | 24 | 15 | 7.03 | 8.5 | 19 | 7 | 4.08E-04
hsa-miR-103 | 25 | 16 | 10.39 | 8.59 | 19 | 7 | 4.55E-04
hsa-miR-525* | 26 | 17 | 6.5 | 8.45 | 19 | 7 | 4.72E-04
hsa-miR-492 | 27 | 18 | 7.99 | 9.57 | 19 | 7 | 5.67E-04
hsa-miR-140 | 28 | 19 | 8.02 | 9.49 | 19 | 7 | 9.18E-04
hsa-miR-202* | 29 | 20 | 7.24 | 8.7 | 19 | 7 | 1.09E-03
hsa-miR-449 | 30 | 21 | 9.37 | 11.55 | 19 | 7 | 1.90E-03

miR name: is the miRBase registry name (release 9.1).
HID: is the SEQ ID NO of the microRNA hairpin precursor (Pre-microRNA).
MID: is the SEQ ID NO of the mature microRNA.
Mean adeno-carcinoma (log): is the mean of the logarithms (log) of chip signal of Lung adenocarcinoma cells.
Mean large cell (log): is the mean of the logarithms (log) of chip signal of Lung Large cells.
Number of samples, adeno-carcinoma cells: is the number of samples of Lung adenocarcinoma cells.
Number of samples, large cells: is the number of samples of Lung Large cells.
p-value: is the result of unmatched t-test between samples.

The foregoing description of the specific embodiments will so fully reveal the general nature of the invention that others can, by applying current knowledge, readily modify and/or adapt for various applications such specific embodiments without undue experimentation and without departing from the generic concept, and, therefore, such adaptations and modifications should and are intended to be comprehended within the meaning and range of equivalents of the disclosed embodiments. Although the invention has been described in conjunction with specific embodiments thereof, it is evident that many alternatives, modifications and variations will be apparent to those skilled in the art.
Accordingly, it is intended to embrace all such alternatives, modifications and variations that fall within the spirit and broad scope of the appended claims.
It should be understood that the detailed description and specific examples, while indicating preferred embodiments of the invention, are given by way of illustration only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.

Claims (21)

1. A method of classifying Non Small Cell Lung Carcinoma (NSCLC), the method comprising: obtaining a biological sample from a subject;
measuring the relative abundance in said sample of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 1-5, 13-30, a fragment thereof or sequence having at least about 80% identity thereto;
and comparing said obtained measurement to a reference number representing abundance of a nucleic acid; whereby the differential expression of said nucleic acid sequence allows the classification of said NSCLC.
2. The method of claim 1, wherein said biological sample is selected from the group consisting of bodily fluid, a cell line and a tissue sample.
3. The method of claim 2, wherein said tissue is a fresh, frozen, fixed, wax-embedded or formalin fixed paraffin-embedded (FFPE) tissue.
4. The method of claim 2, wherein said tissue sample is a lung sample.
5. The method of claim 1, wherein said NSCLC is selected from the group consisting of lung squamous cell carcinoma, lung adenocarcinoma and lung undifferentiated large cell carcinoma.
6. The method of claim 5, wherein said lung undifferentiated large cell carcinoma is originated from lung squamous cell carcinoma or from lung adenocarcinoma.
7. A method for distinguishing between lung squamous cell carcinoma and other NSCLC, the method comprising: obtaining a biological sample from a subject; determining in said sample an expression level of a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 1- 5, a fragment thereof or a sequence having at least 80% identity thereto;
whereby a relative abundance of SEQ ID NO: 1 indicates the presence of squamous cell carcinoma.
8. The method of claim 7, wherein said other NSCLC is lung adenocarcinoma.
9. The method of claim 7, wherein the method comprises determining the expression levels of at least two nucleic acid sequences.
10. The method of claim 7, wherein the method further comprises combining one or more expression ratios of said nucleic acid sequences.
11. The method of claim 7, wherein the expression levels are determined by a method selected from the group consisting of nucleic acid hybridization, nucleic acid amplification, and a combination thereof.
12. The method of claim 11, wherein the nucleic acid hybridization is performed using a solid-phase nucleic acid biochip array or in situ hybridization.
13. The method of claim 11, wherein the nucleic acid amplification method is real-time PCR.
14. The method of claim 13, wherein the real-time PCR method comprises forward and reverse primers.
15. The method of claim 14, wherein the forward primer comprises a sequence selected from the group consisting of any one of SEQ ID NOS: 7-9.
16. The method of claim 14, wherein the real-time PCR method further comprises a probe.
17. The method of claim 16, wherein the probe comprises a sequence selected from the group consisting of any one of SEQ ID NOS: 10-12.
18. A method for distinguishing between lung adenocarcinoma and large cell carcinoma, the method comprising: obtaining a biological sample from a subject; determining in said sample an expression level of one or more nucleic acid sequences selected from the group consisting of SEQ ID NOS: 13-30, a fragment thereof or a sequence having at least 80% identity thereto; whereby a relative abundance of said nucleic acid indicates the presence of large cell carcinoma.
19. A kit for NSCLC classification, said kit comprising a probe comprising a nucleic acid sequence selected from the group consisting of SEQ ID NOS: 10-12 and sequences at least about 80% identical thereto.
20. The kit of claim 19, wherein the kit further comprises a forward primer comprising a sequence selected from the group consisting of any one of SEQ ID NOS: 7-9.
21. The kit of claim 19, wherein said kit comprises reagents for performing in situ hybridization analysis.
CA002677043A 2007-03-01 2008-02-28 Method for distinguishing between lung squamous carcinoma and other non small cell lung cancers Abandoned CA2677043A1 (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US90417107P 2007-03-01 2007-03-01
US60/904,171 2007-03-01
US2134608P 2008-01-16 2008-01-16
US61/021,346 2008-01-16
PCT/IL2008/000261 WO2008104985A2 (en) 2007-03-01 2008-02-28 Methods for distingushing between lung squamous carcinoma and other non smallcell lung cancers

Publications (1)

Publication Number Publication Date
CA2677043A1 true CA2677043A1 (en) 2008-09-04

Family

ID=39616499

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002677043A Abandoned CA2677043A1 (en) 2007-03-01 2008-02-28 Method for distinguishing between lung squamous carcinoma and other non small cell lung cancers

Country Status (5)

Country Link
EP (1) EP2126116A2 (en)
JP (1) JP2010519899A (en)
AU (1) AU2008220449A1 (en)
CA (1) CA2677043A1 (en)
WO (1) WO2008104985A2 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010070637A2 (en) * 2008-12-15 2010-06-24 Rosetta Genomics Ltd. Method for distinguishing between adrenal tumors
CN101475984A (en) * 2008-12-15 2009-07-08 江苏命码生物科技有限公司 Non-small cell lung cancer detection marker, detection method thereof, related biochip and reagent kit
WO2010103522A1 (en) * 2009-03-10 2010-09-16 Rosetta Genomics Ltd. Method for detection of nucleic acid sequences
WO2010120853A2 (en) * 2009-04-16 2010-10-21 Padma Arunachalam Methods and compositions to detect and differentiate small rnas in rna maturation pathway
JP2014509522A (en) * 2011-03-28 2014-04-21 ロゼッタ ゲノミクス リミテッド Method for classifying lung cancer
EP3468605A4 (en) 2016-06-08 2020-01-08 President and Fellows of Harvard College Engineered viral vector reduces induction of inflammatory and immune responses
WO2019094548A1 (en) 2017-11-08 2019-05-16 President And Fellows Of Harvard College Compositions and methods for inhibiting viral vector-induced inflammatory responses

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2554818A1 (en) * 2004-02-09 2005-08-25 Thomas Jefferson University Diagnosis and treatment of cancers with microrna located in or near cancer-associated chromosomal features
WO2007148235A2 (en) * 2006-05-04 2007-12-27 Rosetta Genomics Ltd Cancer-related nucleic acids

Also Published As

Publication number Publication date
WO2008104985A2 (en) 2008-09-04
AU2008220449A1 (en) 2008-09-04
WO2008104985A3 (en) 2008-11-06
EP2126116A2 (en) 2009-12-02
JP2010519899A (en) 2010-06-10

Legal Events

Date Code Title Description
FZDE Discontinued