WO2006035273A2 - Novel nucleotide and amino acid sequences, and assays and methods of use thereof for diagnosis - Google Patents

Novel nucleotide and amino acid sequences, and assays and methods of use thereof for diagnosis

Publication number
WO2006035273A2
Authority
WO
WIPO (PCT)
Prior art keywords
segment
transcript
found
libraries
cluster
Application number
PCT/IB2005/002438
Other languages
English (en)
Other versions
WO2006035273A3 (fr
Inventor
Michal Ayalon-Soffer
Sarah Pollock
Ronen Shemesh
Rotem Sorek
Levine Zurit
Zipi Shaqed
Amir Toporik
Gad S. Cojocaru
Dvir Dahary
Guy Kol
Pinchas Akiva
Amit Novik
Sergey Nemzer
Alexander Diber
Maxim Shklar
Osnat Sella-Tavor
Lily Bazak
Arial Farkash
Yossi Cohen
Original Assignee
Compugen Usa, Inc.
Application filed by Compugen Usa, Inc.
Priority to CA002554718A (CA2554718A1)
Priority to AU2005288710A (AU2005288710A1)
Priority to EP05805030A (EP1716256A2)
Priority claimed from US11/043,788 (US20060014166A1)
Publication of WO2006035273A2
Publication of WO2006035273A3

Classifications

    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07H: SUGARS; DERIVATIVES THEREOF; NUCLEOSIDES; NUCLEOTIDES; NUCLEIC ACIDS
    • C07H21/00: Compounds containing two or more mononucleotide units having separate phosphate or polyphosphate groups linked by saccharide radicals of nucleoside groups, e.g. nucleic acids
    • C07H21/04: Compounds containing two or more mononucleotide units having separate phosphate or polyphosphate groups linked by saccharide radicals of nucleoside groups, e.g. nucleic acids with deoxyribosyl as saccharide radical
    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K14/00: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/46: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
    • C07K14/47: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material

Definitions

  • the present invention is related to novel nucleotide sequences that are useful as diagnostic markers, and assays and methods of use thereof.
  • NAT Nucleic Acid Testing
  • the sample could be a body fluid, a tissue sample, a body secretion or any other sample obtained from a patient which could contain the targeted nucleic acids.
  • NAT diagnosis has been used for the diagnosis of infectious diseases.
  • NAT diagnosis has expanded to noninfectious diseases, for example, for the diagnosis of prostate cancer based on DD3 (PCA3).
  • DD3 PCA3
  • PCA3 is a highly prostate cancer-specific gene. Quantitative measurement of the DD3 (PCA3) transcript in urine sediments obtained after prostatic massage has shown great diagnostic value for prostate cancer. DD3 (PCA3) is a non-coding transcript; therefore, diagnosis at the protein level is not possible.
  • More NAT markers for cancers other than prostate cancer are currently being pursued. NAT diagnostic markers have at least four advantages over protein-based diagnostic modalities:
  • 1. The test analyte can be amplified (e.g., with PCR).
  • 2. The detection method is sequence specific rather than epitope specific.
  • 3. They allow diagnosis even if a differentially expressed transcript is non-coding (as in the case of DD3 (PCA3)).
  • 4. NAT analytes are sometimes found in body secretions and/or body fluids and therefore could replace the need for a tissue biopsy when a serum marker is not available.
  • NAT markers suffer from a few disadvantages, including:
  • 1. The analyte itself is quite an unstable molecule (certainly when compared with a protein).
  • 2. The analyte itself is by nature not physiologically secreted; therefore, it is not always easily found in samples.
  • the present invention overcomes deficiencies of the background art by providing novel variants that are suitable for use with NAT and/or nucleic acid hybridization methods and assays, which may optionally be used as diagnostic markers.
  • methods and assays that are suitable for detecting a nucleic acid sequence (oligonucleotide) are referred to herein as "oligonucleotide detection technologies", including but not limited to NAT and hybridization technologies.
  • the markers of the present invention may optionally be used with any such oligonucleotide detection technology.
  • the markers are useful for detecting variant-detectable diseases (marker-detectable diseases), wherein these diseases and/or pathological states and/or conditions are described in greater detail below with regard to the different clusters (genes).
  • these variants are useful as diagnostic markers for variant-detectable diseases.
  • markers are specifically released to the bloodstream under disease conditions according to one of the above differential variant marker conditions.
  • the present invention therefore also relates to diagnostic assays for disease detection optionally and preferably in a sample taken from a subject (patient), which is more preferably some type of blood sample or body secretion sample.
  • the assays are optionally NAT (nucleic acid amplification technology)-based assays, such as PCR (or variations thereof, such as real-time PCR).
  • the assays may also optionally encompass nucleic acid hybridization assays.
  • the assays may optionally be qualitative or quantitative.
  • the present invention also relates to kits based upon such diagnostic methods or assays.
  • the sample taken from the subject can be selected from one or more of blood, serum, plasma, blood cells, urine, sputum, saliva, stool, spinal fluid, lymph fluid, the external sections of the skin, respiratory, intestinal, and genitourinary tracts, tears, milk, neuronal tissue, pleural fluid, peritoneal fluid, cyst fluid, including ovarian cyst fluid, and any human organ and tissue.
  • this invention provides an isolated nucleic acid molecule encoding for a splice variant according to the present invention, having a nucleotide sequence as set forth in any one of the sequences listed herein, or a sequence complementary thereto.
  • this invention provides an isolated nucleic acid molecule, having a nucleotide sequence as set forth in any one of the sequences listed herein, or a sequence complementary thereto.
  • this invention provides an oligonucleotide of at least about 12 nucleotides, specifically hybridizable with the nucleic acid molecules of this invention.
  • this invention provides vectors, cells, liposomes and compositions comprising the isolated nucleic acids of this invention.
  • this invention provides a method for detecting a splice variant nucleic acid sequence in a biological sample, comprising: hybridizing the isolated nucleic acid molecules or oligonucleotide fragments of at least about 12 nucleotides thereof to a nucleic acid material of a biological sample and detecting a hybridization complex; wherein the presence of a hybridization complex correlates with the presence of a splice variant nucleic acid sequence in the biological sample.
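The hybridization-based detection described above can be illustrated in silico. The following is a minimal sketch, assuming perfect Watson-Crick complementarity and ignoring temperature, salt concentration, and mismatch tolerance, all of which matter in a real assay. The function names and the exact-match criterion are hypothetical and are not taken from the patent; only the minimum probe length of about 12 nucleotides comes from the text.

```python
# Hypothetical in-silico sketch: exact-match search stands in for hybridization.
# Real hybridization depends on stringency conditions that are not modeled here.

# Complement table for DNA; U is read as T so RNA input is tolerated.
COMPLEMENT = str.maketrans("ACGTU", "TGCAA")

# "at least about 12 nucleotides" in the text; treated here as a hard minimum.
MIN_PROBE_LENGTH = 12


def reverse_complement(seq: str) -> str:
    """Return the DNA reverse complement of seq."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def probe_binding_sites(probe: str, target: str) -> list:
    """Return 0-based start positions in target that are perfectly
    complementary to probe (overlapping sites included)."""
    if len(probe) < MIN_PROBE_LENGTH:
        raise ValueError(f"probe shorter than {MIN_PROBE_LENGTH} nucleotides")
    site = reverse_complement(probe)
    target = target.upper().replace("U", "T")
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)
    return positions
```

A non-empty result plays the role of the "hybridization complex" in the method above: it indicates that the sample sequence contains a region the probe could pair with.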
  • the splice variant nucleic acid sequences described herein are non-limiting examples of markers for diagnosing the below described disease condition(s).
  • Each splice variant nucleic acid sequence marker of the present invention can be used alone or in combination, for various uses, including but not limited to, prognosis, prediction, screening, early diagnosis, determination of progression, therapy selection and treatment monitoring of one of the above-described diseases.
  • any marker according to the present invention may optionally be used alone or in combination.
  • Such a combination may optionally comprise a plurality of markers described herein, optionally including any subcombination of markers, and/or a combination featuring at least one other marker, for example a known marker.
  • such a combination may optionally and preferably be used as described above with regard to determining a ratio between a quantitative or semi-quantitative measurement of any marker described herein to any other marker described herein, and/or any other known marker, and/or any other marker.
  • the known marker comprises the "known protein" as described in greater detail below with regard to each cluster or gene.
  • any method may be used to detect the presence (for example in the blood) and/or differential expression of this marker; optionally, a NAT-based technology is used. Therefore, optionally and preferably, any nucleic acid molecule capable of selectively hybridizing to a nucleic acid of a splice variant marker as previously defined is also encompassed within the present invention.
  • a splice variant nucleic acid sequence or a fragment thereof may be featured as a biomarker for detecting a variant-detectable disease, such that a biomarker may optionally comprise any of the above.
  • the present invention optionally and preferably encompasses any amino acid sequence or fragment thereof encoded by a nucleic acid sequence as described herein.
  • the present invention also optionally and preferably encompasses any nucleic acid sequence or fragment thereof, or amino acid sequence or fragment thereof, corresponding to a splice variant nucleic acid sequence of the present invention as described above, optionally for any application.
  • a variant according to the present invention may be a marker for one or more of the diseases and/or pathologies as described above. Information is given in the text with regard to SNPs (single nucleotide polymorphisms).
  • T -> C means that the SNP results in a change at the position given in the table from T to C.
  • M -> Q, for example, means that the SNP has caused a change in the corresponding amino acid sequence, from methionine (M) to glutamine (Q). If, in place of a letter at the right hand side for the nucleotide sequence SNP, there is a space, it indicates that a frameshift has occurred. A frameshift may also be indicated with a hyphen (-). A stop codon is indicated with an asterisk at the right hand side (*).
  • a comment may be found in parentheses after the above description of the SNP itself.
  • This comment may include an FTId, which is an identifier to a SwissProt entry that was created with the indicated SNP.
  • An FTId is a unique and stable feature identifier, which allows links to be constructed directly from position-specific annotation in the feature table to specialized protein-related databases.
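The SNP notation described above (a left-hand symbol, an arrow, and a right-hand symbol that may be a letter, a blank, a hyphen, or an asterisk) can be read mechanically. The sketch below is one possible reading of those rules; the class name, field names, and the `kind` labels are hypothetical, and the optional parenthesized comment (for example an FTId) is assumed to have been stripped beforehand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnpChange:
    original: str      # symbol on the left of the arrow, e.g. "T" or "M"
    replacement: str   # symbol on the right; "" when the right side is blank
    kind: str          # "substitution", "frameshift", or "stop" (labels assumed)


def parse_snp(notation: str) -> SnpChange:
    """Parse a string such as 'T -> C' according to the rules in the text."""
    # Tolerate the extraction artifact "- >" in place of "->".
    left, arrow, right = notation.replace("- >", "->").partition("->")
    if not arrow:
        raise ValueError(f"no arrow found in {notation!r}")
    original = left.strip()
    replacement = right.strip()
    if not original:
        raise ValueError(f"missing original symbol in {notation!r}")
    if replacement in ("", "-"):
        kind = "frameshift"   # blank or hyphen on the right-hand side
    elif replacement == "*":
        kind = "stop"         # asterisk marks a stop codon
    else:
        kind = "substitution"
    return SnpChange(original, replacement, kind)
```

The parser does not distinguish nucleotide-level from amino-acid-level entries; that context comes from the table in which the notation appears.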
  • Library-based statistics refer to statistics over an entire library, while EST clone statistics refer to expression only for ESTs from a particular tissue or cancer.
  • The following list of abbreviations for tissues was used in the TAA histograms.
  • TAA Tumor Associated Antigen
  • TAA histograms represent the cancerous tissue expression pattern as predicted by the biomarkers selection engine, as described in detail in Examples 1-5 below: "BONE" for "bone";
  • nucleic acid sequences of the present invention refer to portions of nucleic acid sequences that were shown to have one or more properties as described below. They are also the building blocks that were used to construct complete nucleic acid sequences as described in greater detail below. Unless defined otherwise, all technical and scientific terms used herein have the meaning commonly understood by a person skilled in the art to which this invention belongs. The following references provide one of skill with a general definition of many of the terms used in this invention: Singleton et al., Dictionary of Microbiology and Molecular Biology (2nd ed.
  • disease includes any type of pathology and/or damage, including both chronic and acute damage, as well as a progress from acute to chronic damage.
  • marker in the context of the present invention refers to a nucleic acid fragment, which is differentially present in a sample taken from patients having one of the above-described diseases or conditions, as compared to a comparable sample taken from subjects who do not have one of the above-described diseases or conditions.
  • a nucleic acid fragment may optionally be differentially present between the two samples if the amount of the nucleic acid fragment in one sample is significantly different from the amount of the nucleic acid fragment in the other sample, for example as measured by hybridization and/or NAT-based assays. It should be noted that if the marker is detectable in one sample and not detectable in the other, then such a marker can be considered to be differentially present.
  • a relatively low amount of up-regulation may serve as the marker, as described above.
  • diagnostic means identifying the presence or nature of a pathologic condition. Diagnostic methods differ in their sensitivity and specificity.
  • the "sensitivity" of a diagnostic assay is the percentage of diseased individuals who test positive (percent of "true positives"). Diseased individuals not detected by the assay are "false negatives." Subjects who are not diseased and who test negative in the assay are termed "true negatives."
  • the "specificity" of a diagnostic assay is 1 minus the false positive rate, where the "false positive" rate is defined as the proportion of those without the disease who test positive. While a particular diagnostic method may not provide a definitive diagnosis of a condition, it suffices if the method provides a positive indication that aids in diagnosis.
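The two definitions above translate directly into arithmetic on the four outcome counts of an assay. The following is a small sketch; the function and parameter names are illustrative only, and both values are returned as fractions between 0 and 1 rather than percentages.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of diseased individuals who test positive."""
    diseased = true_positives + false_negatives
    if diseased == 0:
        raise ValueError("no diseased individuals in the sample")
    return true_positives / diseased


def specificity(true_negatives: int, false_positives: int) -> float:
    """1 minus the false positive rate, where the false positive rate is the
    proportion of non-diseased individuals who test positive."""
    healthy = true_negatives + false_positives
    if healthy == 0:
        raise ValueError("no non-diseased individuals in the sample")
    false_positive_rate = false_positives / healthy
    return 1.0 - false_positive_rate
```

For example, an assay that detects 90 of 100 diseased subjects and flags 20 of 100 healthy subjects has a sensitivity of about 0.9 and a specificity of about 0.8.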
  • diagnosis refers to classifying a disease or a symptom, determining a severity of the disease, monitoring disease progression, forecasting an outcome of a disease and/or prospects of recovery.
  • detecting may also optionally encompass any of the above.
  • Diagnosis of a disease according to the present invention can be effected by determining a level of a polynucleotide of the present invention in a biological sample obtained from the subject, wherein the level determined can be correlated with predisposition to, or presence or absence of the disease.
  • level refers to expression levels of RNA or to DNA copy number of a marker of the present invention.
  • a biological sample refers to a sample of tissue or fluid isolated from a subject, including but not limited to, for example, plasma, serum, spinal fluid, lymph fluid, the external sections of the skin, respiratory, intestinal, and genitourinary tracts, tears, saliva, sputum, milk, whole blood or any blood fraction, blood cells, tumors, neuronal tissue, organs or any other types of tissue, any sample obtained by lavage (for example of the bronchial system), and also samples of in vivo cell culture constituents.
  • tissue or fluid collection methods can be utilized to collect the biological sample from the subject in order to determine the level of DNA, RNA and/or polypeptide of the variant of interest in the subject.
  • Examples include, but are not limited to, fine needle biopsy, needle biopsy, core needle biopsy and surgical biopsy (e.g., brain biopsy), and lavage. Regardless of the procedure employed, once a biopsy/sample is obtained the level of the variant can be determined and a diagnosis can thus be made.
  • Determining the level of the same variant in normal tissues of the same origin is preferably effected along-side to detect an elevated expression and/or amplification, and/or a decreased expression, of the variant as opposed to the normal tissues.
  • a "test amount" of a marker refers to an amount of a marker present in a sample being tested.
  • a test amount can be either in absolute amount (e.g., microgram/ml) or a relative amount (e.g., relative intensity of signals).
  • a “diagnostic amount” of a marker refers to an amount of a marker in a subject's sample that is consistent with a diagnosis of a variant-detectable disease.
  • a diagnostic amount can be either in absolute amount (e.g., microgram/ml) or a relative amount (e.g., relative intensity of signals).
  • a "control amount" of a marker can be any amount or a range of amounts to be compared against a test amount of a marker.
  • a control amount of a marker can be the amount of a marker in a patient with variant-detectable disease or a person without variant-detectable disease.
  • a control amount can be either in absolute amount (e.g., microgram/ml) or a relative amount (e.g., relative intensity of signals).
  • Substrate refers to a solid phase onto which an adsorbent can be provided (e.g., by attachment, deposition, etc.)
  • Adsorbent refers to any material capable of adsorbing a marker.
  • the term “adsorbent” is used herein to refer both to a single material ("monoplex adsorbent") (e.g., a compound or functional group) to which the marker is exposed, and to a plurality of different materials (“multiplex adsorbent”) to which the marker is exposed.
  • the adsorbent materials in a multiplex adsorbent are referred to as “adsorbent species.”
  • an addressable location on a probe substrate can comprise a multiplex adsorbent characterized by many different adsorbent species (e.g., anion exchange materials, metal chelators, or antibodies), having different binding characteristics.
  • Substrate material itself can also contribute to adsorbing a marker and may be considered part of an “adsorbent.”
  • Adsorption or “retention” refers to the detectable binding between an adsorbent and a marker either before or after washing with an eluant (selectivity threshold modifier) or a washing solution.
  • Eluant or “washing solution” refers to an agent that can be used to mediate adsorption of a marker to an adsorbent. Eluants and washing solutions can be used to wash and remove unbound materials from the probe substrate surface.
  • Detect refers to identifying the presence, absence or amount of the object to be detected.
  • Detectable moiety or a “label” refers to a composition detectable by spectroscopic, photochemical, biochemical, immunochemical, or chemical means.
  • useful labels include ³²P, ³⁵S, fluorescent dyes, electron-dense reagents, enzymes (e.g., as commonly used in an ELISA), biotin-streptavidin, digoxigenin, or nucleic acid molecules with a sequence complementary to a target.
  • the detectable moiety often generates a measurable signal, such as a radioactive, chromogenic, or fluorescent signal, that can be used to quantify the amount of bound detectable moiety in a sample.
  • the detectable moiety can be incorporated in or attached to a primer or probe either covalently, or through ionic, van der Waals or hydrogen bonds, e.g., incorporation of radioactive nucleotides, or biotinylated nucleotides that are recognized by streptavidin.
  • the detectable moiety may be directly or indirectly detectable. Indirect detection can involve the binding of a second directly or indirectly detectable moiety to the detectable moiety.
  • the detectable moiety can be a nucleotide sequence, which is the binding partner for a complementary sequence, to which it can specifically hybridize.
  • the binding partner may itself be directly detectable, for example, the partner may be itself labeled with a fluorescent molecule.
  • the binding partner also may be indirectly detectable, for example, a nucleic acid having a complementary nucleotide sequence can be a part of a branched DNA molecule that is in turn detectable through hybridization with other labeled nucleic acid molecules (see, e.g., P. D. Fahrlander and A. Klausner, Bio/Technology 6:1165 (1988)). Quantitation of the signal is achieved by, e.g., scintillation counting, densitometry, or flow cytometry.
  • a “nucleic acid fragment” or an “oligonucleotide” or a “polynucleotide” are used herein interchangeably to refer to a polymer of nucleic acids.
  • a polynucleotide sequence of the present invention refers to a single- or double-stranded nucleic acid sequence which is isolated and provided in the form of an RNA sequence, a complementary polynucleotide sequence (cDNA), a genomic polynucleotide sequence and/or a composite polynucleotide sequence (e.g., a combination of the above).
  • complementary polynucleotide sequence refers to a sequence, which results from reverse transcription of messenger RNA using a reverse transcriptase or any other RNA dependent DNA polymerase. Such a sequence can be subsequently amplified in vivo or in vitro using a DNA dependent DNA polymerase.
  • genomic polynucleotide sequence refers to a sequence derived (isolated) from a chromosome and thus it represents a contiguous portion of a chromosome.
  • composite polynucleotide sequence refers to a sequence, which is composed of genomic and cDNA sequences.
  • a composite sequence can include some exonal sequences required to encode the polypeptide of the present invention, as well as some intronic sequences interposing therebetween.
  • the intronic sequences can be of any source, including of other genes, and typically will include conserved splicing signal sequences. Such intronic sequences may further include cis acting expression regulatory elements.
  • the present invention encompasses the nucleic acid sequences described hereinabove; fragments thereof; sequences hybridizable therewith; sequences homologous thereto [e.g., at least 50 %, at least 55 %, at least 60 %, at least 65 %, at least 70 %, at least 75 %, at least 80 %, at least 85 %, at least 95 % or more, say 100 %, identical to the nucleic acid sequences set forth below]; sequences encoding similar polypeptides with different codon usage; and altered sequences characterized by mutations, such as deletion, insertion or substitution of one or more nucleotides, either naturally occurring or artificially induced, either randomly or in a targeted fashion.
  • the present invention also encompasses homologous nucleic acid sequences (i.e., which form a part of a polynucleotide sequence of the present invention) which include sequence regions unique to the polynucleotides of the present invention.
  • the present invention also encompasses novel polypeptides or portions thereof, which are encoded by the isolated polynucleotide and respective nucleic acid fragments thereof described hereinabove.
  • the present invention also encompasses polypeptides encoded by the polynucleotide sequences of the present invention.
  • the present invention also encompasses homologues of these polypeptides; such homologues can be at least 50 %, at least 55 %, at least 60 %, at least 65 %, at least 70 %, at least 75 %, at least 80 %, at least 85 %, at least 95 % or more, say 100 %, homologous to the amino acid sequences set forth below, as can be determined using BlastP software of the National Center of Biotechnology Information (NCBI) using default parameters, optionally and preferably including the following: filtering on (this option filters repetitive or low-complexity sequences from the query using the SEG (protein) program), scoring matrix is BLOSUM62 for proteins, word size is 3, E value is 10, gap costs are 11, 1 (initialization and extension), and number of alignments shown is 50.
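To make the percentage tiers above concrete, the sketch below computes a simple percent identity over two sequences that have already been aligned (gaps written as "-") and reports the highest listed tier that is met. This is not BlastP and does not reproduce its scoring, filtering, or statistics; the function names, the gap convention, and the choice of alignment columns as the denominator are all assumptions made for illustration.

```python
from typing import Optional

# Tiers listed in the text, in percent.
THRESHOLDS = (50, 55, 60, 65, 70, 75, 80, 85, 95)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns with identical residues.

    Both inputs must be pre-aligned and of equal length; '-' marks a gap.
    Columns where both sequences have a gap are ignored (assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def highest_threshold_met(identity: float) -> Optional[int]:
    """Return the largest listed tier not exceeding identity, or None."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None
```

In practice the alignment itself would come from a tool such as BlastP run with the parameters listed above; the sketch only covers the final comparison step.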
  • NCBI National Center of Biotechnology Information
  • the present invention also encompasses fragments of the above described polypeptides and polypeptides having mutations, such as deletions, insertions or substitutions of one or more amino acids, either naturally occurring or artificially induced, either randomly or in a targeted fashion.
  • Oligonucleotides designed for carrying out the methods of the present invention for any of the sequences provided herein can be generated according to any oligonucleotide synthesis method known in the art such as enzymatic synthesis or solid phase synthesis.
  • Equipment and reagents for executing solid-phase synthesis are commercially available from, for example, Applied Biosystems. Any other means for such synthesis may also be employed; the actual synthesis of the oligonucleotides is well within the capabilities of one skilled in the art.
  • Oligonucleotides used according to this aspect of the present invention are those having a length selected from a range of about 10 to about 200 bases, preferably about 15 to about 150 bases, more preferably about 20 to about 100 bases, most preferably about 20 to about 50 bases.
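The nested length ranges above can be expressed as a lookup from narrowest to widest. The sketch below treats the "about" bounds as exact and inclusive, which is a simplifying assumption; the tier labels and the function name are illustrative and do not appear in the source.

```python
# Ranges from the text, ordered from narrowest to widest.
# Labels are hypothetical; bounds are treated as exact, inclusive integers.
LENGTH_TIERS = (
    ("most preferred", 20, 50),
    ("more preferred", 20, 100),
    ("preferred", 15, 150),
    ("acceptable", 10, 200),
)


def length_tier(oligo: str) -> str:
    """Return the narrowest tier whose range contains the oligo length."""
    n = len(oligo)
    for name, low, high in LENGTH_TIERS:
        if low <= n <= high:
            return name
    return "outside range"
```

Because the tiers are checked in order, a 25-base oligonucleotide is reported as "most preferred" even though it also falls inside every wider range.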
  • the oligonucleotides of the present invention may comprise heterocyclic nucleosides consisting of purine and pyrimidine bases, bonded in a 3' to 5' phosphodiester linkage.
  • oligonucleotides may also be those modified at one or more of the backbone, internucleoside linkages or bases, as is broadly described hereinunder. Such modifications can often facilitate oligonucleotide uptake and resistance to intracellular conditions.
  • oligonucleotides useful according to this aspect of the present invention include oligonucleotides containing modified backbones or non-natural internucleoside linkages. Oligonucleotides having modified backbones include those that retain a phosphorus atom in the backbone, as disclosed in U.S. Pat.
  • Preferred modified oligonucleotide backbones include, for example, phosphorothioates, chiral phosphorothioates, phosphorodithioates, phosphotriesters, aminoalkyl phosphotriesters, methyl and other alkyl phosphonates including 3'-alkylene phosphonates and chiral phosphonates, phosphinates, phosphoramidates including 3'-amino phosphoramidate and aminoalkylphosphoramidates, thionophosphoramidates, thionoalkylphosphonates, thionoalkylphosphotriesters, and boranophosphates having normal 3'-5' linkages, 2'-5' linked analogs of these, and those having inverted polarity wherein the adjacent pairs of nucleoside units are linked 3'-5' to 5'-3' or 2'-5' to 5'-2'.
  • Various salts, mixed salts and free acid forms can also be used.
  • modified oligonucleotide backbones that do not include a phosphorus atom therein have backbones that are formed by short chain alkyl or cycloalkyl internucleoside linkages, mixed heteroatom and alkyl or cycloalkyl internucleoside linkages, or one or more short chain heteroatomic or heterocyclic internucleoside linkages.
  • morpholino linkages (formed in part from the sugar portion of a nucleoside);
  • siloxane backbones;
  • sulfide, sulfoxide and sulfone backbones;
  • formacetyl and thioformacetyl backbones;
  • methylene formacetyl and thioformacetyl backbones;
  • alkene containing backbones;
  • sulfamate backbones;
  • sulfonate and sulfonamide backbones;
  • amide backbones; and others having mixed N, O, S and CH2 component parts, as disclosed in U.S. Pat. Nos.
  • other oligonucleotides which can be used according to the present invention are, for example, those in which both the sugar and the internucleoside linkage, i.e., the backbone, of the nucleotide units are replaced with novel groups.
  • the base units are maintained for complementation with the appropriate polynucleotide target.
  • An example for such an oligonucleotide mimetic includes but is not limited to peptide nucleic acid (PNA).
  • PNA oligonucleotide refers to an oligonucleotide where the sugar-backbone is replaced with an amide containing backbone, in particular an aminoethylglycine backbone.
  • Oligonucleotides of the present invention may also include base modifications or substitutions.
  • “unmodified” or “natural” bases include the purine bases adenine (A) and guanine (G), and the pyrimidine bases thymine (T), cytosine (C) and uracil (U).
  • Modified bases include but are not limited to other synthetic and natural bases such as 5-methylcytosine (5-me-C), 5-hydroxymethyl cytosine, xanthine, hypoxanthine, 2-aminoadenine, 6-methyl and other alkyl derivatives of adenine and guanine, 2-propyl and other alkyl derivatives of adenine and guanine, 2-thiouracil, 2-thiothymine and 2-thiocytosine, 5-halouracil and cytosine, 5-propynyl uracil and cytosine, 6-azo uracil, cytosine and thymine, 5-uracil (pseudouracil), 4-thiouracil, 8-halo, 8-amino, 8-thiol, 8-thioalkyl, 8-hydroxyl and other 8-substituted adenines and guanines, 5-halo particularly 5-bromo, 5-trifluoromethyl and other 5-substituted uracils and cyto
  • Such modified bases include 5-substituted pyrimidines, 6-azapyrimidines and N-2, N-6 and O-6 substituted purines, including 2-aminopropyladenine, 5-propynyluracil and 5-propynylcytosine.
  • 5-methylcytosine substitutions have been shown to increase nucleic acid duplex stability by 0.6-1.2 °C [Sanghvi YS et al. (1993) Antisense Research and Applications, CRC Press, Boca Raton 276-278] and are optional but preferred base substitutions, even more particularly when combined with 2'-O-methoxyethyl sugar modifications.
  • a further modification of the oligonucleotides of the invention involves chemically linking to the oligonucleotide one or more moieties or conjugates, which enhance the activity, cellular distribution or cellular uptake of the oligonucleotide.
  • Such moieties include but are not limited to lipid moieties such as a cholesterol moiety, cholic acid, a thioether, e.g., hexyl-S-tritylthiol, a thiocholesterol, an aliphatic chain, e.g., dodecandiol or undecyl residues, a phospholipid, e.g., di-hexadecyl-rac-glycerol or triethylammonium 1,2-di-O-hexadecyl-rac-glycero-3-H-phosphonate, a polyamine or a polyethylene glycol chain, or adamantane acetic acid, a palmity
  • the present invention provides novel variants, which may optionally be used as diagnostic markers.
  • variants are useful as diagnostic markers for variant-detectable diseases.
  • Differential variant markers are collectively described as "variant disease markers”.
  • Detection of a nucleic acid of interest in a biological sample may optionally be effected by hybridization-based assays using an oligonucleotide probe (non-limiting examples of probes according to the present invention are described in greater detail below).
  • Hybridization-based assays which allow the detection of a variant of interest (i.e., DNA or RNA) in a biological sample rely on the use of an oligonucleotide which can be 10, 15, 20, or 30 to 100 nucleotides long, preferably from 10 to 50, more preferably from 40 to 50 nucleotides long.
  • Hybridization of short nucleic acids (below 200 bp in length, e.g.
  • hybridization duplexes are separated from unhybridized nucleic acids and the labels bound to the duplexes are then detected.
  • labels refer to radioactive, fluorescent, biological or enzymatic tags or labels of standard use in the art.
  • a label can be conjugated to either the oligonucleotide probes or the nucleic acids derived from the biological sample.
  • oligonucleotides of the present invention can be labeled subsequent to synthesis, by incorporating biotinylated dNTPs or rNTP, or some similar means (e.g., photo-cross-linking a psoralen derivative of biotin to RNAs), followed by addition of labeled streptavidin (e.g., phycoerythrin-conjugated streptavidin) or the equivalent.
  • fluorescein, lissamine, phycoerythrin, rhodamine (Perkin Elmer Cetus), Cy2, Cy3, Cy3.5, Cy5, Cy5.5, Cy7, FluorX (Amersham) and others [e.g., Kricka et al. (1992), Academic Press San Diego, Calif] can be attached to the oligonucleotides.
  • RNA detection Traditional hybridization assays include PCR, RT-PCR, Real-time PCR, RNase protection, in-situ hybridization, primer extension, Southern blots (DNA detection), dot or slot blots (DNA, RNA), and Northern blots (RNA detection) (NAT type assays are described in greater detail below). More recently, PNAs have been described (Nielsen et al. 1999, Current Opin. Biotechnol. 10:71-75). Other detection methods include kits containing probes on a dipstick setup and the like. Although the present invention is not specifically dependent on the use of a label for the detection of a particular nucleic acid sequence, such a label might be beneficial, by increasing the sensitivity of the detection.
  • Probes can be labeled according to numerous well known methods (Sambrook et al., 1989, supra).
  • Non-limiting examples of radioactive labels include 3H, 14C, 32P, and 35S.
  • Non- limiting examples of detectable markers include ligands, fluorophores, chemiluminescent agents, enzymes, and antibodies.
  • Other detectable markers for use with probes which can enable an increase in sensitivity of the method of the invention, include biotin and radio-nucleotides. It will become evident to the person of ordinary skill that the choice of a particular label dictates the manner in which it is bound to the probe.
  • radioactive nucleotides can be incorporated into probes of the invention by several methods.
  • Non-limiting examples thereof include kinasing the 5' ends of the probes using gamma ATP and polynucleotide kinase, using the Klenow fragment of Pol I of E. coli in the presence of radioactive dNTP (i.e. uniformly labeled DNA probe using random oligonucleotide primers in low-melt gels), using the SP6/T7 system to transcribe a DNA segment in the presence of one or more radioactive NTP, and the like.
  • wash steps may be employed to wash away excess target DNA or probe as well as unbound conjugate.
  • oligonucleotide primers and probes are suitable for detecting the hybrids using the labels present on the oligonucleotide primers and probes. It will be appreciated that a variety of controls may be usefully employed to improve accuracy of hybridization assays. For instance, samples may be hybridized to an irrelevant probe and treated with RNAse A prior to hybridization, to assess false hybridization.
  • Probes of the invention can be utilized with naturally occurring sugar-phosphate backbones as well as modified backbones including phosphorothioates, dithionates, alkyl phosphonates and α-nucleotides and the like. Modified sugar-phosphate backbones are generally taught by Miller, 1988, Ann. Reports Med. Chem. 23:295 and Moran et al., 1987, Nucleic Acids Res., 14:5019. Probes of the invention can be constructed of either ribonucleic acid (RNA) or deoxyribonucleic acid (DNA), and preferably of DNA.
  • Detection of a nucleic acid of interest in a biological sample may also optionally be effected by NAT-based assays, which involve nucleic acid amplification technology, such as PCR for example (or variations thereof such as real-time PCR for example).
  • Amplification of a selected, or target, nucleic acid sequence may be carried out by a number of suitable methods. See generally Kwoh et al., 1990, Am. Biotechnol. Lab. 8:14. Numerous amplification techniques have been described and can be readily adapted to suit particular needs of a person of ordinary skill. Non-limiting examples of amplification techniques include polymerase chain reaction (PCR), ligase chain reaction (LCR), strand displacement amplification (SDA), transcription-based amplification, the Qβ replicase system and NASBA (Kwoh et al., 1989, Proc. Natl. Acad. Sci. USA 86, 1173-1177; Lizardi et al., 1988,
  • PCR Polymerase chain reaction
  • a nucleic acid sample e.g., in the presence of a heat stable DNA polymerase
  • An extension product of each primer which is synthesized is complementary to each of the two nucleic acid strands, with the primers sufficiently complementary to each strand of the specific sequence to hybridize therewith.
  • the extension product synthesized from each primer can also serve as a template for further synthesis of extension products using the same primers.
  • the sample is analyzed to assess whether the sequence or sequences to be detected are present. Detection of the amplified sequence may be carried out by visualization following EtBr staining of the DNA after gel electrophoresis, or using a detectable label in accordance with known techniques, and the like.
  • a "primer” defines an oligonucleotide which is capable of annealing to a target sequence, thereby creating a double stranded region which can serve as an initiation point for DNA synthesis under suitable conditions.
  • Ligase chain reaction (LCR) is carried out in accordance with known techniques (Weiss,
  • SDA Strand displacement amplification
  • amplification pair refers herein to a pair of oligonucleotides (oligos) of the present invention, which are selected to be used together in amplifying a selected nucleic acid sequence by one of a number of types of amplification processes, preferably a polymerase chain reaction.
  • amplification processes include ligase chain reaction, strand displacement amplification, or nucleic acid sequence-based amplification, as explained in greater detail below.
  • the oligos are designed to bind to a complementary sequence under selected conditions.
  • a nucleic acid sample from a patient is amplified under conditions which favor the amplification of the most abundant differentially expressed nucleic acid.
  • RT-PCR is carried out on an mRNA sample from a patient under conditions which favor the amplification of the most abundant mRNA.
  • the amplification of the differentially expressed nucleic acids is carried out simultaneously.
  • the nucleic acid i.e. DNA or RNA
  • the nucleic acid for practicing the present invention may be obtained according to well known methods.
  • Oligonucleotide primers of the present invention may be of any suitable length, depending on the particular assay format and the particular needs and targeted genomes employed. In general, the oligonucleotide primers are at least 12 nucleotides in length, preferably between 15 and 24 nucleotides, and they may be adapted to be especially suited to a chosen nucleic acid amplification system.
  • the oligonucleotide primers can be designed by taking into consideration the melting point of hybridization thereof with its targeted sequence (see below and in Sambrook et al., 1989, Molecular Cloning -A Laboratory Manual, 2nd Edition, CSH Laboratories; Ausubel et al., 1989, in Current Protocols in Molecular Biology, John Wiley & Sons Inc., N.Y.).
  • Oligonucleotides according to the present invention may optionally be used as molecular probes as described herein.
  • probes are useful for hybridization assays, and also for NAT assays (as primers, for example).
  • the present invention encompasses nucleic acid sequences described hereinabove; fragments thereof, sequences hybridizable therewith, sequences homologous thereto, sequences encoding similar polypeptides with different codon usage, altered sequences characterized by mutations, such as deletion, insertion or substitution of one or more nucleotides, either naturally occurring or artificially induced, either randomly or in a targeted fashion.
  • detection of a nucleic acid of interest in a biological sample is effected by hybridization-based assays using an oligonucleotide probe.
  • oligonucleotide refers to a single stranded or double stranded oligomer or polymer of ribonucleic acid (RNA) or deoxyribonucleic acid (DNA) or mimetics thereof. This term includes oligonucleotides composed of naturally-occurring bases, sugars and covalent internucleoside linkages (e.g., backbone) as well as oligonucleotides having non-naturally- occurring portions which function similarly to respective naturally-occurring portions.
  • an oligonucleotide probe which can be utilized by the present invention is a single stranded polynucleotide which includes a sequence complementary to the unique sequence region of any variant according to the present invention, including but not limited to a nucleotide sequence coding for an amino sequence of a bridge, tail, head and/or insertion according to the present invention, and/or the equivalent portions of any nucleotide sequence given herein (including but not limited to a nucleotide sequence of a node, segment or amplicon described herein).
  • an oligonucleotide probe of the present invention can be designed to hybridize with a nucleic acid sequence encompassed by any of the above nucleic acid sequences, particularly the portions specified above, including but not limited to a nucleotide sequence coding for an amino sequence of a bridge, tail, head and/or insertion according to the present invention, and/or the equivalent portions of any nucleotide sequence given herein (including but not limited to a nucleotide sequence of a node, segment or amplicon described herein).
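A probe that is complementary to a chosen region of a target can be derived by taking the reverse complement of that region. The following is a minimal sketch, assuming a DNA target written 5' to 3' with only the letters A, C, G and T; the function names, the example sequence and the coordinates are hypothetical and are not taken from the sequences of this document.

```python
# Hypothetical helper names; the target sequence below is invented for illustration.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3' in, 5'->3' out)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def design_probe(target: str, start: int, length: int) -> str:
    """Return a single-stranded probe complementary to target[start:start+length]."""
    region = target[start:start + length]
    if len(region) < length:
        raise ValueError("requested region extends past the end of the target")
    return reverse_complement(region)


# Example: a 10-nucleotide probe against positions 3..12 of an invented target.
example_target = "ATGGCCTTAGCGTACGTTAGC"
example_probe = design_probe(example_target, 3, 10)
```

In practice the region would be chosen from a unique part of the variant (for example a bridge, tail, head or insertion), and the probe length would follow the ranges given in the text.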
  • Oligonucleotides designed according to the teachings of the present invention can be generated according to any oligonucleotide synthesis method known in the art such as enzymatic synthesis or solid phase synthesis.
  • Equipment and reagents for executing solid-phase synthesis are commercially available from, for example, Applied Biosystems. Any other means for such synthesis may also be employed; the actual synthesis of the oligonucleotides is well within the capabilities of one skilled in the art and can be accomplished via established methodologies as detailed in, for example, "Molecular Cloning: A laboratory Manual” Sambrook et al., (1989); “Current Protocols in Molecular Biology” Volumes I-III Ausubel, R. M., ed.
  • the oligonucleotide of the present invention is of at least 17, at least 18, at least 19, at least 20, at least 22, at least 25, at least 30 or at least 40, bases specifically hybridizable with the biomarkers of the present invention.
  • the oligonucleotides of the present invention may comprise heterocyclic nucleosides consisting of purine and pyrimidine bases, bonded in a 3' to 5' phosphodiester linkage.
  • Preferably used oligonucleotides are those modified at one or more of the backbone, internucleoside linkages or bases, as is broadly described hereinunder.
  • oligonucleotides useful according to this aspect of the present invention include oligonucleotides containing modified backbones or non- natural internucleoside linkages. Oligonucleotides having modified backbones include those that retain a phosphorus atom in the backbone, as disclosed in U.S. Pat.
  • Preferred modified oligonucleotide backbones include, for example, phosphorothioates, chiral phosphorothioates, phosphorodithioates, phosphotriesters, aminoalkyl phosphotriesters, methyl and other alkyl phosphonates including 3'-alkylene phosphonates and chiral phosphonates, phosphinates, phosphoramidates including 3'-amino phosphoramidate and aminoalkylphosphoramidates, thionophosphoramidates, thionoalkylphosphonates, thionoalkylphosphotriesters, and boranophosphates having normal 3'-5' linkages, 2'-5' linked analogs of these, and those having inverted polarity wherein the adjacent pairs of nucleoside units are linked 3'-5' to 5'-3' or 2'-5' to 5'-2'.
  • Various salts, mixed salts and free acid forms can also be used.
  • modified oligonucleotide backbones that do not include a phosphorus atom therein have backbones that are formed by short chain alkyl or cycloalkyl internucleoside linkages, mixed heteroatom and alkyl or cycloalkyl internucleoside linkages, or one or more short chain heteroatomic or heterocyclic internucleoside linkages.
  • morpholino linkages formed in part from the sugar portion of a nucleoside
  • siloxane backbones sulfide, sulfoxide and sulfone backbones
  • formacetyl and thioformacetyl backbones methylene formacetyl and thioformacetyl backbones
  • alkene containing backbones sulfamate backbones
  • sulfonate and sulfonamide backbones amide backbones
  • others having mixed N, O, S and CH2 component parts, as disclosed in U.S. Pat. Nos.
  • oligonucleotides which can be used according to the present invention, are those modified in both sugar and the internucleoside linkage, i.e., the backbone, of the nucleotide units are replaced with novel groups.
  • the base units are maintained for complementation with the appropriate polynucleotide target.
  • An example for such an oligonucleotide mimetic includes peptide nucleic acid (PNA).
  • PNA peptide nucleic acid
  • a PNA oligonucleotide refers to an oligonucleotide where the sugar-backbone is replaced with an amide containing backbone, in particular an aminoethylglycine backbone.
  • the bases are retained and are bound directly or indirectly to aza nitrogen atoms of the amide portion of the backbone.
  • Oligonucleotides of the present invention may also include base modifications or substitutions.
  • "unmodified” or “natural” bases include the purine bases adenine (A) and guanine (G), and the pyrimidine bases thymine (T), cytosine (C) and uracil (U).
  • Modified bases include but are not limited to other synthetic and natural bases such as 5-methylcytosine (5-me-C), 5-hydroxymethyl cytosine, xanthine, hypoxanthine, 2-aminoadenine, 6-methyl and other alkyl derivatives of adenine and guanine, 2-propyl and other alkyl derivatives of adenine and guanine, 2-thiouracil, 2-thiothymine and 2-thiocytosine, 5-halouracil and cytosine, 5-propynyl uracil and cytosine, 6-azo uracil, cytosine and thymine, 5-uracil (pseudouracil), 4-thiouracil, 8-halo, 8-amino, 8-thiol, 8-thioalkyl, 8-hydroxyl and other 8-substituted adenines and guanines, 5-halo particularly 5-bromo, 5-trifluoromethyl and other 5-substituted uracils
  • 5-substituted pyrimidines include 5-substituted pyrimidines, 6-azapyrimidines and N-2, N-6 and O-6 substituted purines, including 2-aminopropyladenine, 5-propynyluracil and 5-propynylcytosine.
  • 5-methylcytosine substitutions have been shown to increase nucleic acid duplex stability by 0.6-1.2 °C [Sanghvi YS et al. (1993) Antisense Research and Applications, CRC Press, Boca Raton 276-278] and are presently preferred base substitutions, even more particularly when combined with 2'-O-methoxyethyl sugar modifications.
  • oligonucleotides of the present invention may include further modifications which increase bioavailability, therapeutic efficacy and reduce cytotoxicity. Such modifications are described in Younes (2002) Current Pharmaceutical Design 8:1451-1466.
  • the isolated polynucleotides of the present invention can optionally be detected (and optionally quantified) by using hybridization assays.
  • the isolated polynucleotides of the present invention are preferably hybridizable with any of the above described nucleic acid sequences under moderate to stringent hybridization conditions.
  • Moderate to stringent hybridization conditions are characterized as follows: stringent hybridization is effected using a hybridization solution containing 10 % dextran sulfate, 1 M NaCl, 1 % SDS and 5 x 10^6 cpm 32P-labeled probe, at 65 °C, with a final wash solution of 0.2 x SSC and 0.1 % SDS and final wash at 65 °C, whereas moderate hybridization is effected using a hybridization solution containing 10 % dextran sulfate, 1 M NaCl, 1 % SDS and 5 x 10^6 cpm 32P-labeled probe, at 65 °C, with a final wash solution of 1 x SSC and 0.1 % SDS and final wash at 50 °C.
  • Hybridization based assays which allow the detection of the biomarkers of the present invention (i.e., DNA or RNA) in a biological sample rely on the use of oligonucleotides which can be 10, 15, 20, or 30 to 100 nucleotides long, preferably from 10 to 50, and more preferably from 40 to 50 nucleotides.
  • Hybridization of short nucleic acids can be effected using the following exemplary hybridization protocols, which can be modified according to the desired stringency: (i) hybridization solution of 6 x SSC and 1 % SDS or 3 M TMACl, 0.01 M sodium phosphate (pH 6.8), 1 mM EDTA (pH 7.6), 0.5 % SDS, 100 µg/ml denatured salmon sperm DNA and 0.1 % nonfat dried milk, hybridization temperature of 1-1.5 °C below the Tm, final wash solution of 3 M TMACl, 0.01 M sodium phosphate (pH 6.8), 1 mM EDTA (pH 7.6), 0.5 % SDS at 1-1.5 °C below the Tm; (ii) hybridization solution of 6 x SSC and 0.1 % SDS or 3 M TMACl, 0.01 M sodium phosphate (pH 6.8), 1
  • hybridization duplexes are separated from unhybridized nucleic acids and the labels bound to the duplexes are then detected.
  • labels refer to radioactive, fluorescent, biological or enzymatic tags or labels of standard use in the art.
  • a label can be conjugated to either the oligonucleotide probes or the nucleic acids derived from the biological sample (target).
  • oligonucleotides of the present invention can be labeled subsequent to synthesis, by incorporating biotinylated dNTPs or rNTP, or some similar means (e.g., photo-cross-linking a psoralen derivative of biotin to RNAs), followed by addition of labeled streptavidin (e.g., phycoerythrin-conjugated streptavidin) or the equivalent.
  • fluorescein, lissamine, phycoerythrin, rhodamine (Perkin Elmer Cetus), Cy2, Cy3, Cy3.5, Cy5, Cy5.5, Cy7, FluorX (Amersham) and others [e.g., Kricka et al. (1992), Academic Press San Diego, Calif] can be attached to the oligonucleotides.
  • RNA detection Traditional hybridization assays include PCR, RT-PCR, Real-time PCR, RNase protection, in-situ hybridization, primer extension, Southern blots (DNA detection), dot or slot blots (DNA, RNA), and Northern blots (RNA detection) (NAT type assays are described in greater detail below). More recently, PNAs have been described (Nielsen et al. 1999, Current Opin. Biotechnol. 10:71-75). Other detection methods include kits containing probes on a dipstick setup and the like.
  • Probes can be labeled according to numerous well known methods (Sambrook et al., 1989, supra).
  • Non- limiting examples of radioactive labels include 3H, 14C, 32P, and 35S.
  • Non- limiting examples of detectable markers include ligands, fluorophores, chemiluminescent agents, enzymes, and antibodies.
  • Other detectable markers for use with probes, which can enable an increase in sensitivity of the method of the invention include biotin and radio-nucleotides. It will become evident to the person of ordinary skill that the choice of a particular label dictates the manner in which it is bound to the probe.
  • radioactive nucleotides can be incorporated into probes of the invention by several methods.
  • Non-limiting examples thereof include kinasing the 5' ends of the probes using gamma ATP and polynucleotide kinase, using the Klenow fragment of Pol I of E. coli in the presence of radioactive dNTP (i.e. uniformly labeled DNA probe using random oligonucleotide primers in low-melt gels), using the SP6/T7 system to transcribe a DNA segment in the presence of one or more radioactive NTP, and the like.
  • wash steps may be employed to wash away excess target DNA or probe as well as unbound conjugate.
  • standard heterogeneous assay formats are suitable for detecting the hybrids using the labels present on the oligonucleotide primers and probes.
  • samples may be hybridized to an irrelevant probe and treated with RNAse A prior to hybridization, to assess false hybridization.
  • Probes of the invention can be utilized with naturally occurring sugar-phosphate backbones as well as modified backbones including phosphorothioates, dithionates, alkyl phosphonates and α-nucleotides and the like. Modified sugar-phosphate backbones are generally taught by Miller, 1988, Ann. Reports Med. Chem. 23:295 and Moran et al., 1987, Nucleic Acids Res., 14:5019. Probes of the invention can be constructed of either ribonucleic acid (RNA) or deoxyribonucleic acid (DNA), and preferably of DNA.
  • Detection (and optionally quantification) of a nucleic acid of interest in a biological sample may also optionally be effected by NAT-based assays, which involve nucleic acid amplification technology, such as PCR for example (or variations thereof such as real-time PCR for example).
  • Amplification of a selected, or target, nucleic acid sequence may be carried out by a number of suitable methods. See generally Kwoh et al., 1990, Am. Biotechnol. Lab. 8:14. Numerous amplification techniques have been described and can be readily adapted to suit particular needs of a person of ordinary skill.
  • Non-limiting examples of amplification techniques include polymerase chain reaction (PCR), ligase chain reaction (LCR), strand displacement amplification (SDA), transcription-based amplification, the Qβ replicase system and NASBA (Kwoh et al., 1989, Proc. Natl. Acad. Sci. USA 86, 1173-1177; Lizardi et al., 1988, BioTechnology 6:1197-1202; Malek et al., 1994, Methods Mol. Biol., 28:253-260; and Sambrook et al., 1989, supra).
  • Polymerase chain reaction PCR is carried out in accordance with known techniques, as described for example, in U.S. Pat. Nos.
  • PCR involves a treatment of a nucleic acid sample (e.g., in the presence of a heat stable DNA polymerase) under hybridizing conditions, with one oligonucleotide primer for each strand of the specific sequence to be detected.
  • An extension product of each primer which is synthesized is complementary to each of the two nucleic acid strands, with the primers sufficiently complementary to each strand of the specific sequence to hybridize therewith.
  • the extension product synthesized from each primer can also serve as a template for further synthesis of extension products using the same primers.
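Because each extension product can serve as a template in the next round, the number of target copies grows geometrically with the cycle number. The following is a minimal sketch of that arithmetic, assuming an idealized model in which a fixed fraction of templates (the `efficiency` parameter, a hypothetical name) is copied in every cycle; real reactions plateau and are not described by this formula at high cycle numbers.

```python
def copies_after_cycles(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized PCR yield: N = N0 * (1 + efficiency) ** cycles.

    efficiency = 1.0 means perfect doubling in every cycle (an assumption,
    not a property stated in the text).
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# Ten starting copies after 30 perfect cycles give 10 * 2**30 copies.
ideal_yield = copies_after_cycles(10, 30)
```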
  • a "primer” defines an oligonucleotide which is capable of annealing to a target sequence, thereby creating a double stranded region which can serve as an initiation point for DNA synthesis under suitable conditions.
  • Ligase chain reaction is carried out in accordance with known techniques (Weiss, 1991, Science 254:1292). Adaptation of the protocol to meet the desired needs can be carried out by a person of ordinary skill. Strand displacement amplification (SDA) is also carried out in accordance with known techniques or adaptations thereof to meet the particular needs (Walker et al., 1992, Proc. Natl. Acad. Sci. USA 89:392-396; and ibid., 1992, Nucleic Acids Res. 20:1691-1696).
  • SDA Strand displacement amplification
  • amplification pair refers herein to a pair of oligonucleotides (oligos) of the present invention, which are selected to be used together in amplifying a selected nucleic acid sequence by one of a number of types of amplification processes, preferably a polymerase chain reaction.
  • amplification processes include ligase chain reaction, strand displacement amplification, or nucleic acid sequence-based amplification, as explained in greater detail below.
  • the oligos are designed to bind to a complementary sequence under selected conditions.
  • a nucleic acid sample from a patient is amplified under conditions which favor the amplification of the most abundant differentially expressed nucleic acid.
  • RT-PCR is carried out on an mRNA sample from a patient under conditions which favor the amplification of the most abundant mRNA.
  • the amplification of the differentially expressed nucleic acids is carried out simultaneously.
  • the nucleic acid i.e. DNA or RNA
  • the nucleic acid may be obtained according to well known methods.
  • Oligonucleotide primers of the present invention may be of any suitable length, depending on the particular assay format and the particular needs and targeted genomes employed. In general, the oligonucleotide primers are at least 12 nucleotides in length, preferably between 15 and 24 nucleotides, and they may be adapted to be especially suited to a chosen nucleic acid amplification system.
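The length guidance for primers (at least 12, preferably between 15 and 24) can be expressed as a simple check. This is only an illustrative sketch; the function name and the three category labels are hypothetical, and treating both ends of the preferred range as inclusive is an assumption.

```python
def classify_primer_length(primer: str) -> str:
    """Classify a primer by length: below 12 is too short, 15-24 is preferred."""
    n = len(primer)
    if n < 12:
        return "too short"
    if 15 <= n <= 24:
        return "preferred"
    return "acceptable"


# A 20-nucleotide primer falls in the preferred range.
label = classify_primer_length("ATGGCCTTAGCGTACGTTAG")
```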
  • the oligonucleotide primers can be designed by taking into consideration the melting point of hybridization thereof with its targeted sequence (see below and in Sambrook et al., 1989, Molecular Cloning -A Laboratory Manual, 2nd Edition, CSH Laboratories; Ausubel et al., 1989, in Current Protocols in Molecular Biology, John Wiley & Sons Inc., N.Y.).
  • antisense oligonucleotides may be employed to quantify expression of a splice isoform of interest. Such detection is effected at the pre-mRNA level. Essentially the ability to quantitate transcription from a splice site of interest can be effected based on splice site accessibility. Oligonucleotides may compete with splicing factors for the splice site sequences. Thus, low activity of the antisense oligonucleotide is indicative of splicing activity [see Sazani and Kole (2003), supra].
  • PCR-based methods may be used to identify the presence of mRNA of the markers of the present invention.
  • a pair of oligonucleotides is used, which is specifically hybridizable with the polynucleotide sequences described hereinabove in an opposite orientation so as to direct exponential amplification of a portion thereof (including the hereinabove described sequence alteration) in a nucleic acid amplification reaction.
  • oligonucleotide pairs of primers specifically hybridizable with nucleic acid sequences according to the present invention are described in greater detail with regard to the Examples below.
  • the polymerase chain reaction and other nucleic acid amplification reactions are well known in the art (various non- limiting examples of these reactions are described in greater detail below).
  • the pair of oligonucleotides according to this aspect of the present invention are preferably selected to have compatible melting temperatures (Tm), e.g., melting temperatures which differ by less than 7 °C, preferably less than 5 °C, more preferably less than 4 °C, most preferably less than 3 °C, ideally between 3 °C and 0 °C.
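The Tm compatibility criterion for a primer pair can be illustrated with a short calculation. The sketch below assumes the Wallace rule, Tm = 2(A+T) + 4(G+C) in °C, which is a common rough estimate for short oligonucleotides and is not stated in this document; the function names, tier labels and example primers are hypothetical.

```python
def wallace_tm(primer: str) -> float:
    """Rough melting temperature in °C by the Wallace rule (an assumed estimator)."""
    s = primer.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2.0 * at + 4.0 * gc


def tm_compatibility(forward: str, reverse: str) -> str:
    """Rank a primer pair by the Tm difference tiers of 3, 4, 5 and 7 °C."""
    diff = abs(wallace_tm(forward) - wallace_tm(reverse))
    tiers = [(3.0, "most preferred"), (4.0, "more preferred"),
             (5.0, "preferred"), (7.0, "acceptable")]
    for limit, label in tiers:
        if diff < limit:
            return label
    return "incompatible"


# Invented 18-mers with estimated Tm of 52 °C and 56 °C (difference 4 °C).
pair_rank = tm_compatibility("ATGCATGCATGCATGCAT", "GCGCATATGCGCATATGC")
```

A nearest-neighbor thermodynamic model would give more realistic values; the tier logic stays the same whichever estimator is used.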
  • Hybridization to oligonucleotide arrays may also be used to determine expression of the biomarkers of the present invention (hybridization itself is described above). Such screening has been undertaken in the BRCA1 gene and in the protease gene of HIV-1 virus [see Hacia et al., (1996) Nat Genet 1996;14(4):441-447; Shoemaker et al., (1996) Nat Genet 1996;14(4):450-456; Kozal et al., (1996) Nat Med 1996;2(7):753-759]. Optionally and preferably, such hybridization is combined with amplification as described herein.
  • the nucleic acid sample which includes the candidate region to be analyzed is preferably isolated, amplified and labeled with a reporter group.
  • This reporter group can be a fluorescent group such as phycoerythrin.
  • the labeled nucleic acid is then incubated with the probes immobilized on the chip using a fluidics station.
  • For example, Manz et al. (1993) Adv in Chromatogr 1993; 33:1-66 describe the fabrication of fluidics devices and particularly microcapillary devices, in silicon and glass substrates.
  • the chip is inserted into a scanner and patterns of hybridization are detected.
  • the hybridization data is collected, as a signal emitted from the reporter groups already incorporated into the nucleic acid, which is now bound to the probes attached to the chip. Since the sequence and position of each probe immobilized on the chip is known, the identity of the nucleic acid hybridized to a given probe can be determined.
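Since each probe's position on the chip is known, identifying the hybridized nucleic acids amounts to a lookup from position to probe identity for every position whose signal exceeds a cutoff. The following is a minimal sketch under stated assumptions: the layout dictionary, probe names, signal values and threshold are all invented for illustration, and a real scanner pipeline would also handle background correction and normalization.

```python
from typing import Dict, List, Tuple

Position = Tuple[int, int]  # (row, column) on the chip


def identify_hybridized(layout: Dict[Position, str],
                        signals: Dict[Position, float],
                        threshold: float) -> List[str]:
    """Return the sorted names of probes whose position shows signal >= threshold."""
    return sorted(
        layout[pos]
        for pos, intensity in signals.items()
        if pos in layout and intensity >= threshold
    )


# Hypothetical 2 x 2 chip: probe names and intensities are illustrative only.
chip_layout = {(0, 0): "probe_A", (0, 1): "probe_B", (1, 0): "probe_C", (1, 1): "probe_D"}
scan = {(0, 0): 1520.0, (0, 1): 35.0, (1, 0): 980.0, (1, 1): 12.0}
detected = identify_hybridized(chip_layout, scan, threshold=500.0)
```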
  • determining the presence and/or level of any specific nucleic or amino acid in a biological sample obtained from, for example, a patient is effected by any one of a variety of methods including, but not limited to, a signal amplification method, a direct detection method and detection of at least one sequence change.
  • the signal amplification methods may amplify, for example, a DNA molecule or an RNA molecule.
  • Signal amplification methods which might be used as part of the present invention include, but are not limited to PCR, LCR (LAR), Self-Sustained Synthetic Reaction (3SR/NASBA) or a Q-Beta (Qβ) Replicase reaction.
  • PCR Polymerase Chain Reaction
  • The polymerase chain reaction (PCR), as described in U.S. Pat. Nos. 4,683,195 and 4,683,202 to Mullis and Mullis et al., is a method of increasing the concentration of a segment of target sequence in a mixture of genomic DNA without cloning or purification.
  • This technology provides one approach to the problems of low target sequence concentration.
  • PCR can be used to directly increase the concentration of the target to an easily detectable level.
  • This process for amplifying the target sequence involves the introduction of a molar excess of two oligonucleotide primers, which are complementary to their respective strands of the double-stranded target sequence, to the DNA mixture containing the desired target sequence. The mixture is denatured and then allowed to hybridize.
  • the primers are extended with polymerase so as to form complementary strands. The steps of denaturation, hybridization (annealing), and polymerase extension (elongation) can be repeated as often as needed, in order to obtain relatively high concentrations of a segment of the desired target sequence.
  • the length of the segment of the desired target sequence is determined by the relative positions of the primers with respect to each other, and, therefore, this length is a controllable parameter.
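The dependence of product length on primer placement can be made concrete. The sketch below assumes a single-stranded representation of the template (top strand, 5' to 3'), exact primer matches, and one binding site per primer; the function names and the example sequences are hypothetical.

```python
from typing import Optional

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def amplicon_length(template: str, forward: str, reverse: str) -> Optional[int]:
    """Length of the segment bounded by the two primers, or None if either is absent.

    The forward primer matches the top strand directly; the reverse primer
    anneals to the top strand, so its reverse complement is searched for
    downstream of the forward primer site.
    """
    start = template.find(forward)
    if start == -1:
        return None
    site = template.find(reverse_complement(reverse), start + len(forward))
    if site == -1:
        return None
    return site + len(reverse) - start


# Invented template: 4 nt flank, 10 nt forward site, 10 nt spacer, 10 nt reverse site, 4 nt flank.
template = "TTTT" + "ATGGCCTTAG" + "CCCCCCCCCC" + "GATTACAGGA" + "TTTT"
length = amplicon_length(template, "ATGGCCTTAG", "TCCTGTAATC")
```

Moving either primer site changes the result directly, which is the sense in which the segment length is a controllable parameter.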
  • Ligase Chain Reaction (LCR or LAR): The ligase chain reaction [LCR; sometimes referred to as “Ligase Amplification Reaction” (LAR)] described by Barany, Proc. Natl. Acad. Sci., 88:189 (1991); Barany, PCR Methods and Applic., 1:5 (1991); and Wu and Wallace, Genomics 4:560 (1989) has developed into a well-recognized alternative method of amplifying nucleic acids.
  • LCR has also been used in combination with PCR to achieve enhanced detection of single-base changes; see for example Segev, PCT Publication No. WO9001069 A1 (1990).
  • because the four oligonucleotides used in this assay can pair to form two short ligatable fragments, there is the potential for the generation of target-independent background signal.
  • the use of LCR for mutant screening is limited to the examination of specific nucleic acid positions.
  • The self-sustained sequence replication reaction (3SR) (Guatelli et al., Proc. Natl. Acad. Sci., 87:1874-1878, 1990, with an erratum at Proc. Natl. Acad. Sci., 87:7797, 1990) is a transcription-based in vitro amplification system (Kwok et al., Proc. Natl. Acad. Sci., 86:1173-1177, 1989) that can exponentially amplify RNA sequences at a uniform temperature. The amplified RNA can then be utilized for mutation detection (Fahy et al., PCR Meth.
  • an oligonucleotide primer is used to add a phage RNA polymerase promoter to the 5' end of the sequence of interest.
  • a cocktail of enzymes and substrates that includes a second primer, reverse transcriptase, RNase H, RNA polymerase and ribo- and deoxyribonucleoside triphosphates, the target sequence undergoes repeated rounds of transcription, cDNA synthesis and second-strand synthesis to amplify the area of interest.
  • the use of 3SR to detect mutations is kinetically limited to screening small segments of DNA (e.g., 200-300 base pairs).
  • Q-Beta (Qβ) Replicase: In this method, a probe which recognizes the sequence of interest is attached to the replicatable RNA template for Qβ replicase.
  • a previously identified major problem with false positives resulting from the replication of unhybridized probes has been addressed through use of a sequence- specific ligation step.
  • available thermostable DNA ligases are not effective on this RNA substrate, so the ligation must be performed by T4 DNA ligase at low temperatures (37 degrees C). This prevents the use of high temperature as a means of achieving specificity as in the LCR. The ligation event can be used to detect a mutation at the junction site, but not elsewhere.
  • a successful diagnostic method must be very specific.
  • a straightforward method of controlling the specificity of nucleic acid hybridization is by controlling the temperature of the reaction. While the 3SR/NASBA and Qβ systems are all able to generate a large quantity of signal, one or more of the enzymes involved in each cannot be used at high temperature (i.e., > 55 degrees C). Therefore the reaction temperatures cannot be raised to prevent non-specific hybridization of the probes. If probes are shortened in order to make them melt more easily at low temperatures, the likelihood of having more than one perfect match in a complex genome increases. For these reasons, PCR and LCR currently dominate the research field in detection technologies.
  • the basis of the amplification procedure in the PCR and LCR is the fact that the products of one cycle become usable templates in all subsequent cycles, consequently doubling the population with each cycle.
  • PCR running at 85 % efficiency will yield only 21 % as much final product, compared to a reaction running at 100 % efficiency.
  • a reaction that is reduced to 50 % mean efficiency will yield less than 1 % of the possible product.
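  • The efficiency figures above follow from compounding the per-cycle yield. The sketch below assumes a simple model in which each cycle multiplies the product by (1 + efficiency), and it assumes 20 cycles; the cycle count is not stated in the passage above and is chosen here because it reproduces the quoted percentages. The function name is hypothetical.

```python
def relative_yield(efficiency: float, cycles: int) -> float:
    """Product of a reaction at the given per-cycle efficiency, expressed as a
    fraction of the product of a perfectly efficient (doubling) reaction."""
    # Each cycle multiplies the product by (1 + efficiency); the ideal factor is 2.
    return ((1.0 + efficiency) / 2.0) ** cycles


CYCLES = 20  # assumed cycle count, not given in the passage above
for eff in (1.00, 0.85, 0.50):
    print(f"efficiency {eff:.0%}: {relative_yield(eff, CYCLES):.2%} of ideal yield")
```

  • Under these assumptions, 85 % efficiency gives roughly 21 % of the ideal product and 50 % efficiency gives well under 1 %, which is why small per-cycle losses matter so much in an exponential process.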
  • PCR has yet to penetrate the clinical market in a significant way.
  • LCR must also be optimized to use different oligonucleotide sequences for each target sequence.
  • both methods require expensive equipment, capable of precise temperature cycling.
  • nucleic acid detection technologies, such as in studies of allelic variation, involve not only detection of a specific sequence in a complex background, but also the discrimination between sequences with few, or single, nucleotide differences.
  • One method of the detection of allele-specific variants by PCR is based upon the fact that it is difficult for Taq polymerase to synthesize a DNA strand when there is a mismatch between the template strand and the 3' end of the primer.
  • An allele-specific variant may be detected by the use of a primer that is perfectly matched with only one of the possible alleles; the mismatch to the other allele acts to prevent the extension of the primer, thereby preventing the amplification of that sequence.
  • This method has a substantial limitation in that the base composition of the mismatch influences the ability to prevent extension across the mismatch, and certain mismatches do not prevent extension or have only a minimal effect (Kwok et al., Nucl. Acids Res., 18:999, 1990).
  • a similar 3'-mismatch strategy is used with greater effect to prevent ligation in the LCR (Barany, PCR Meth. Applic., 1:5, 1991).
  • Any mismatch effectively blocks the action of the thermostable ligase, but LCR still has the drawback of target-independent background ligation products initiating the amplification. Moreover, the combination of PCR with subsequent LCR to identify the nucleotides at individual positions is also a clearly cumbersome proposition for the clinical laboratory.
  • the direct detection method may be, for example, a cycling probe reaction (CPR) or a branched DNA analysis.
  • CPR cycling probe reaction
  • Cycling probe reaction: The cycling probe reaction (CPR) (Duck et al., BioTech., 9:142, 1990) uses a long chimeric oligonucleotide in which a central portion is made of RNA while the two termini are made of DNA. Hybridization of the probe to a target DNA and exposure to a thermostable RNase H causes the RNA portion to be digested. This destabilizes the remaining DNA portions of the duplex, releasing the remainder of the probe from the target DNA and allowing another probe molecule to repeat the process. The signal, in the form of cleaved probe molecules, accumulates at a linear rate.
  • Branched DNA: Branched DNA (bDNA), described by Urdea et al., Gene 61:253-264
  • the detection of at least one sequence change may be accomplished by, for example, restriction fragment length polymorphism (RFLP) analysis, allele specific oligonucleotide (ASO) analysis, Denaturing/Temperature Gradient Gel Electrophoresis (DGGE/TGGE), Single-Strand Conformation Polymorphism (SSCP) analysis or Dideoxy fingerprinting (ddF).
  • RFLP analysis restriction fragment length polymorphism
  • ASO allele specific oligonucleotide
  • DGGE/TGGE Denaturing/Temperature Gradient Gel Electrophoresis
  • SSCP Single-Strand Conformation Polymorphism
  • ddF Dideoxy fingerprinting
  • nucleic acid segments for mutations.
  • One option is to determine the entire gene sequence of each test sample (e.g., a bacterial isolate). For sequences under approximately 600 nucleotides, this may be accomplished using amplified material (e.g., PCR reaction products). This avoids the time and expense associated with cloning the segment of interest. However, specialized equipment and highly trained personnel are required, and the method is too labor-intensive and expensive to be practical and effective in the clinical setting.
  • a given segment of nucleic acid may be characterized on several other levels. At the lowest resolution, the size of the molecule can be determined by electrophoresis by comparison to a known standard run on the same gel.
  • a more detailed picture of the molecule may be achieved by cleavage with combinations of restriction enzymes prior to electrophoresis, to allow construction of an ordered map.
  • the presence of specific sequences within the fragment can be detected by hybridization of a labeled probe, or the precise nucleotide sequence can be determined by partial chemical degradation or by primer extension in the presence of chain-terminating nucleotide analogs.
  • Restriction fragment length polymorphism: For detection of single-base differences between like sequences, the requirements of the analysis are often at the highest level of resolution. For cases in which the position of the nucleotide in question is known in advance, several methods have been developed for examining single base changes without direct sequencing. For example, if a mutation of interest happens to fall within a restriction recognition sequence, a change in the pattern of digestion can be used as a diagnostic tool (e.g., restriction fragment length polymorphism [RFLP] analysis).
  • RFLP restriction fragment length polymorphism
  • MCC Mismatch Chemical Cleavage
  • RFLP analysis suffers from low sensitivity and requires a large amount of sample.
  • When RFLP analysis is used for the detection of point mutations, it is, by its nature, limited to the detection of only those single base changes which fall within a restriction sequence of a known restriction endonuclease.
  • the majority of the available enzymes have 4 to 6 base-pair recognition sequences, and cleave too frequently for many large-scale DNA manipulations (Eckstein and Lilley (eds.), Nucleic Acids and Molecular Biology, vol. 2, Springer-Verlag, Heidelberg, 1988). Thus, it is applicable only in a small fraction of cases, as most mutations do not fall within such sites.
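  • The principle of RFLP analysis, and the reason it only detects mutations that fall within a recognition site, can be sketched in silico. The example below uses the EcoRI recognition sequence GAATTC, which is cut after the first G; the function name and the two toy sequences are hypothetical and serve only as an illustration.

```python
def digest(seq: str, site: str, cut_offset: int) -> list:
    """Return the fragment lengths produced by cutting a linear sequence at
    every occurrence of a recognition site (top strand only)."""
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)  # cut position inside the site
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


ECORI_SITE, ECORI_OFFSET = "GAATTC", 1  # EcoRI cuts G^AATTC

wild_type = "A" * 10 + "GAATTC" + "C" * 12
mutant = "A" * 10 + "GAGTTC" + "C" * 12      # point mutation destroys the site
silent = "A" * 10 + "GAATTC" + "C" * 11 + "T"  # mutation outside the site

print(digest(wild_type, ECORI_SITE, ECORI_OFFSET))
print(digest(mutant, ECORI_SITE, ECORI_OFFSET))
print(digest(silent, ECORI_SITE, ECORI_OFFSET))
```

  • The mutation inside the site changes the fragment pattern from two fragments to one, whereas the mutation outside the site leaves the pattern unchanged and would go undetected by this approach.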
  • Allele-specific oligonucleotides (ASOs) can be designed to hybridize in proximity to the mutated nucleotide, such that a primer extension or ligation event can be used as the indicator of a match or a mismatch.
  • Hybridization with radioactively labeled allele-specific oligonucleotides also has been applied to the detection of specific point mutations (Conner et al., Proc. Natl. Acad. Sci., 80:278-282, 1983). The method is based on the differences in the melting temperature of short DNA fragments differing by a single nucleotide.
  • the precise location of the suspected mutation must be known in advance of the test. That is to say, they are inapplicable when one needs to detect the presence of a mutation within a gene or sequence of interest.
  • DGGE/TGGE Denaturing/Temperature Gradient Gel Electrophoresis
  • the fragments to be analyzed are “clamped” at one end by a long stretch of GC base pairs (30-80) to allow complete denaturation of the sequence of interest without complete dissociation of the strands.
  • the attachment of a GC “clamp” to the DNA fragments increases the fraction of mutations that can be recognized by DGGE (Abrams et al., Genomics 7:463-475, 1990). Attaching a GC clamp to one primer is critical to ensure that the amplified sequence has a low dissociation temperature (Sheffield et al., Proc. Natl. Acad. Sci., 86:232-236, 1989; and Lerman and Silverstein, Meth. Enzymol., 155:482-501, 1987).
  • CDGE (constant denaturant gel electrophoresis) requires that gels be performed under different denaturant conditions in order to reach high efficiency for the detection of mutations.
  • A technique analogous to DGGE is termed temperature gradient gel electrophoresis (TGGE).
  • TGGE uses a thermal gradient rather than a chemical denaturant gradient (Scholz et al., Hum.
  • TGGE requires the use of specialized equipment which can generate a temperature gradient perpendicularly oriented relative to the electrical field. TGGE can detect mutations in relatively small fragments of DNA; therefore, scanning of large gene segments requires the use of multiple PCR products prior to running the gel.
  • SSCP Single-Strand Conformation Polymorphism
  • the SSCP process involves denaturing a DNA segment (e.g., a PCR product) that is labeled on both strands, followed by slow electrophoretic separation on a non-denaturing polyacrylamide gel, so that intramolecular interactions can form and not be disturbed during the run.
  • This technique is extremely sensitive to variations in gel composition and temperature. A serious limitation of this method is the relative difficulty encountered in comparing data generated in different laboratories, under apparently similar conditions.
  • Dideoxy fingerprinting (ddF): Dideoxy fingerprinting (ddF) is another technique developed to scan genes for the presence of mutations (Liu and Sommer, PCR Methods Applic., 4:97, 1994). The ddF technique combines components of Sanger dideoxy sequencing with SSCP.
  • a dideoxy sequencing reaction is performed using one dideoxy terminator and then the reaction products are electrophoresed on nondenaturing polyacrylamide gels to detect alterations in mobility of the termination segments as in SSCP analysis.
  • ddF is an improvement over SSCP in terms of increased sensitivity.
  • ddF requires the use of expensive dideoxynucleotides and this technique is still limited to the analysis of fragments of the size suitable for SSCP (i.e., fragments of 200-300 bases for optimal detection of mutations).
  • all of these methods are limited as to the size of the nucleic acid fragment that can be analyzed.
  • sequences of greater than 600 base pairs require cloning, with the consequent delays and expense of either deletion sub-cloning or primer walking, in order to cover the entire fragment.
  • SSCP and DGGE have even more severe size limitations. Because of reduced sensitivity to sequence changes, these methods are not considered suitable for larger fragments.
  • While SSCP is reportedly able to detect 90 % of single-base substitutions within a 200 base-pair fragment, the detection drops to less than 50 % for 400 base-pair fragments.
  • the sensitivity of DGGE decreases as the length of the fragment reaches 500 base-pairs.
  • the ddF technique as a combination of direct sequencing and SSCP, is also limited by the relatively small size of the DNA that can be screened.
  • the step of searching for the mutation or mutations in any of the genes listed above, such as, for example, the reduced folate carrier (RFC) gene, in tumor cells or in cells derived from a cancer patient is effected by a single strand conformational polymorphism (SSCP) technique, such as cDNA-SSCP or genomic DNA-SSCP.
  • SSCP single strand conformational polymorphism
  • nucleic acid sequencing polymerase chain reaction
  • ligase chain reaction self-sustained synthetic reaction
  • Qβ-Replicase cycling probe reaction
  • branched DNA restriction fragment length polymorphism analysis
  • mismatch chemical cleavage heteroduplex analysis
  • allele-specific oligonucleotides denaturing gradient gel electrophoresis, constant denaturant gel electrophoresis, temperature gradient gel electrophoresis and dideoxy fingerprinting.
  • This Section relates to Examples of sequences according to the present invention, including illustrative methods of selection thereof.
  • Biological source: examples of frequently used biological sources for construction of EST libraries include cancer cell lines; normal tissues; cancer tissues; fetal tissues; and others such as normal cell lines and pools of normal cell lines, cancer cell lines and combinations thereof. A specific description of abbreviations used below with regard to these tissues/cell lines, etc., is given above.
  • Protocol of library construction: various methods are known in the art for library construction, including normalized library construction; non-normalized library construction; subtracted libraries; ORESTES and others. It will be appreciated that at times the protocol of library construction is not indicated.
  • Clusters having at least five sequences including at least two sequences from the tissue of interest are analyzed.
  • Clones no. score: Generally, when the number of ESTs is much higher in the cancer libraries relative to the normal libraries, it might indicate actual over-expression.
  • Clones number score The total weighted number of EST clones from cancer libraries was compared to the EST clones from normal libraries. To avoid cases where one library contributes to the majority of the score, the contribution of the library that gives most clones for a given cluster was limited to 2 clones. The score was computed as
  • Clones number score significance - Fisher exact test was used to check if EST clones from cancer libraries are significantly over-represented in the cluster as compared to the total number of EST clones from cancer and normal libraries.
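  • A one-sided Fisher exact test of this kind can be sketched from the hypergeometric distribution. The 2x2 layout used below (cancer versus normal clones, inside versus outside the cluster) and the function name are assumptions made for illustration; the text does not specify the exact table or any weighting applied before the test.

```python
from math import comb


def fisher_one_sided(cluster_cancer: int, cluster_normal: int,
                     other_cancer: int, other_normal: int) -> float:
    """Probability of seeing at least `cluster_cancer` cancer clones in the
    cluster if cancer and normal clones were distributed independently of
    cluster membership (one-sided Fisher exact test)."""
    in_cluster = cluster_cancer + cluster_normal
    all_cancer = cluster_cancer + other_cancer
    total = in_cluster + other_cancer + other_normal
    denominator = comb(total, in_cluster)
    upper = min(in_cluster, all_cancer)
    # Sum the hypergeometric probabilities of tables at least as extreme.
    tail = sum(comb(all_cancer, k) * comb(total - all_cancer, in_cluster - k)
               for k in range(cluster_cancer, upper + 1))
    return tail / denominator


# Hypothetical counts: 12 of 15 clones in the cluster come from cancer
# libraries, against 300 cancer and 700 normal clones elsewhere.
print(fisher_one_sided(12, 3, 300, 700))
```

  • A small p-value would suggest that cancer-derived clones are over-represented in the cluster relative to the background; the threshold to apply is not given in the text.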
  • tissue libraries/sequences were compared to the total number of libraries/sequences in cluster. Similar statistical tools to those described above were employed to identify tissue specific genes. Tissue abbreviations are the same as for cancerous tissues, but are indicated with the header “normal tissue”.
  • Each cluster includes at least 2 libraries from the tissue T. At least 3 clones (weighted, as described above) from tissue T in the cluster; and
  • Clones from the tissue T are at least 40 % of all the clones participating in the tested cluster.
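  • The three thresholds listed above can be combined into a simple filter, sketched below. The function and parameter names are hypothetical, the clone counts are assumed to be already weighted as described earlier, and any further criteria that may apply in the full method are not modeled.

```python
def passes_tissue_criteria(libraries_from_t: int,
                           weighted_clones_from_t: float,
                           weighted_clones_total: float) -> bool:
    """Check the listed thresholds for tissue T: at least 2 libraries,
    at least 3 (weighted) clones, and at least 40 % of all cluster clones."""
    if weighted_clones_total <= 0:
        return False
    fraction_from_t = weighted_clones_from_t / weighted_clones_total
    return (libraries_from_t >= 2
            and weighted_clones_from_t >= 3
            and fraction_from_t >= 0.40)


# Hypothetical cluster: 3 libraries and 6 of 12 weighted clones from tissue T.
print(passes_tissue_criteria(3, 6, 12))
```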
  • a Region is defined as a group of adjacent exons that always appear or do not appear together in each splice variant.
  • a “segment” (sometimes referred also as “seg” or “node”) is defined as the shortest contiguous transcribed region without known splicing inside.
  • EST was defined as unreliable if: (i) Unspliced; (ii) Not covered by RNA; (iii) Not covered by spliced ESTs; and (iv) Alignment to the genome ends in proximity of long poly-A stretch or starts in proximity of long poly-T stretch.
  • Each unique sequence region divides the set of transcripts into 2 groups:
  • the set of EST clones of every cluster is divided into 3 groups:
  • S1 is significantly enriched by cancer EST clones compared to S2;
  • S1 is significantly enriched by cancer EST clones compared to cluster background (S1+S2+S3). Identification of unique sequence regions and division of the group of transcripts accordingly is illustrated in Figure 2. Each of these unique sequence regions corresponds to a segment, also termed herein a “node”.
  • Region 1: common to all transcripts, thus it is not considered; Region 2: specific to Transcript 1: T_1 unique regions (2+6) against T_2+3 unique regions (3+4); Region 3: specific to Transcripts 2+3: T_2+3 unique regions (3+4) against T_1 unique regions (2+6); Region 4: specific to Transcript 3: T_3 unique regions (4) against T_1+2 unique regions (2+5+6); Region 5: specific to Transcripts 1+2: T_1+2 unique regions (2+5+6) against T_3 unique regions (4); Region 6: specific to Transcript 1: same as region 2.
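  • The region-by-region comparison listed above can be reproduced with set operations. In the sketch below, the mapping of transcripts to regions is inferred from the listed comparisons rather than taken from Figure 2, which is not reproduced here, so it should be read as an assumption; the function name is hypothetical.

```python
# Assumed transcript composition, inferred from the comparisons listed above.
TRANSCRIPTS = {
    "T1": {1, 2, 5, 6},
    "T2": {1, 3, 5},
    "T3": {1, 3, 4},
}


def compare_groups(transcripts: dict, region: int):
    """Split transcripts by presence of `region` and return the regions unique
    to each group, or None when the region is common to all transcripts."""
    with_region = {name for name, regs in transcripts.items() if region in regs}
    without_region = set(transcripts) - with_region
    if not with_region or not without_region:
        return None  # the region does not divide the transcripts
    in_group = set().union(*(transcripts[name] for name in with_region))
    in_rest = set().union(*(transcripts[name] for name in without_region))
    return sorted(in_group - in_rest), sorted(in_rest - in_group)


for region in range(1, 7):
    print(region, compare_groups(TRANSCRIPTS, region))
```

  • With this assumed composition, region 2 yields unique regions 2 and 6 against 3 and 4, and region 4 yields region 4 against 2, 5 and 6, matching the list above.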
  • Cluster Z45766 features 17 transcript(s) and 37 segment(s) of interest, the names for which are given in Tables 1 and 2, respectively; the sequences themselves are given at the end of the application.
  • the selected protein variants are given in Table 3.
  • Protein G2 and S phase expressed protein 1 are variants of the known protein G2 and S phase expressed protein 1 (SwissProt accession identifier GTSE1_HUMAN; known also according to the synonym B99 homolog), referred to herein as the previously known protein.
  • Protein G2 and S phase expressed protein 1 is known or believed to have the following function(s): May be involved in p53-induced cell cycle arrest in G2/M phase by interfering with microtubule rearrangements that are required to enter mitosis. Overexpression delays G2/M phase progression.
  • the sequence for protein G2 and S phase expressed protein 1 is given at the end of the application, as "G2 and S phase expressed protein 1 amino acid sequence".
  • Known polymorphisms for this sequence are as shown in Table 4.
  • Protein G2 and S phase expressed protein 1 localization is believed to be Cytoplasmic. Associated with microtubules.
  • the following GO Annotation(s) apply to the previously known protein.
  • the following annotation(s) were found: G2 phase of mitotic cell cycle; DNA damage response, induction of cell arrest by p53; microtubule-based process, which are annotation(s) related to Biological Process; and cytoplasmic microtubule, which are annotation(s) related to Cellular Component.
  • the GO assignment relies on information from one or more of the SwissProt/TremBl Protein knowledgebase, available from <http://www.expasy.ch/sprot/>; or Locuslink, available from <http://www.ncbi.nlm.nih.gov/projects/LocusLink/>.
  • Cluster Z45766 can be used as a diagnostic marker according to overexpression of transcripts of this cluster in cancer. Expression of such transcripts in normal tissues is also given according to the previously described methods.
  • the term "number" in the left hand column of the table and the numbers on the y-axis of Figure 3 below refer to weighted expression of ESTs in each category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to the expression of all ESTs in that category, according to parts per million).
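  • The parts-per-million normalization described above amounts to a simple ratio, sketched below. The function and parameter names are hypothetical, and the inputs are assumed to be already weighted EST counts; the weighting itself is not modeled.

```python
def parts_per_million(cluster_est_count: float, category_est_count: float) -> float:
    """Express the ESTs of one cluster as parts per million of all ESTs in
    the same category (for example, one tissue type)."""
    if category_est_count <= 0:
        raise ValueError("category_est_count must be positive")
    return 1_000_000 * cluster_est_count / category_est_count


# Hypothetical counts: 3 cluster ESTs among 200,000 ESTs in the category.
print(parts_per_million(3, 200_000))
```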
  • cluster Z45766 features 37 segment(s), which were listed in Table 2 above and for which the sequence(s) are given at the end of the application. These segment(s) are portions of nucleic acid sequence(s) which are described herein separately because they are of particular interest. A description of each segment according to the present invention is now provided.
  • Segment cluster Z45766_node_4 is supported by 1 library. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T28. Table 7 below describes the starting and ending position of this segment on each transcript.
  • transcript(s) that are related to the following protein(s): Z45766_P18.
  • Segment cluster Z45766_node_8 is supported by 39 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T0, Z45766_T1, Z45766_T3, Z45766_T7, Z45766_T9, Z45766_T10, Z45766_T11, Z45766_T12, Z45766_T15, Z45766_T18, Z45766_T21, Z45766_T22 and Z45766_T25. Table 8 below describes the starting and ending position of this segment on each transcript.
  • This segment can be found in both coding and non-coding regions of transcript(s) as follows.
  • the segment can be found in a non-coding region of transcript(s) that are related to the following protein(s): Z45766_P2.
  • This segment can also be found in the following protein(s): Z45766_P19, Z45766_P4, Z45766_P5, Z45766_P6, Z45766_P7, Z45766_P9, Z45766_P12, Z45766_P8, Z45766_P14 and Z45766_P16, since it is in the coding region for the corresponding transcript.
  • Segment cluster Z45766_node_9 according to the present invention is supported by 44 libraries.
  • Segment cluster Z45766_node_12 is supported by 27 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T0, Z45766_T1, Z45766_T3, Z45766_T7, Z45766_T9, Z45766_T10, Z45766_T11, Z45766_T12, Z45766_T15, Z45766_T18, Z45766_T21 and Z45766_T22. Table 10 below describes the starting and ending position of this segment on each transcript. Table 10 - Segment location on transcripts
  • This segment can be found in the following protein(s): Z45766_P19, Z45766_P2, Z45766_P4, Z45766_P5, Z45766_P6, Z45766_P7, Z45766_P9, Z45766_P12, Z45766_P8 and Z45766_P14.
  • Segment cluster Z45766_node_16 is supported by 33 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T0, Z45766_T1, Z45766_T3, Z45766_T7, Z45766_T9, Z45766_T10, Z45766_T11, Z45766_T15, Z45766_T18, Z45766_T21, Z45766_T22 and Z45766_T28. Table 11 below describes the starting and ending position of this segment on each transcript.
  • This segment can be found in both coding and non-coding regions of transcript(s) as follows.
  • the segment can be found in a non-coding region of transcript(s) that are related to the following protein(s): Z45766_P18.
  • This segment can also be found in the following protein(s): Z45766_P19, Z45766_P2, Z45766_P4, Z45766_P5, Z45766_P6, Z45766_P9, Z45766_P12, Z45766_P8 and Z45766_P14, since it is in the coding region for the corresponding transcript.
  • Segment cluster Z45766_node_17 is supported by 1 library. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T28. Table 12 below describes the starting and ending position of this segment on each transcript.
  • Segment cluster Z45766_node_19 according to the present invention is supported by 36 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T0, Z45766_T1, Z45766_T3, Z45766_T7, Z45766_T9, Z45766_T10, Z45766_T11, Z45766_T12, Z45766_T15, Z45766_T18, Z45766_T21 and Z45766_T22. Table 13 below describes the starting and ending position of this segment on each transcript.
  • Segment cluster Z45766_node_22 is supported by 29 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T0, Z45766_T1, Z45766_T3, Z45766_T7, Z45766_T9, Z45766_T10, Z45766_T11, Z45766_T12, Z45766_T18, Z45766_T21 and Z45766_T22. Table 14 below describes the starting and ending position of this segment on each transcript.
  • This segment can be found in both coding and non-coding regions of transcript(s) as follows.
  • the segment can be found in a non-coding region of transcript(s) that are related to the following protein(s): Z45766_P7.
  • This segment can also be found in the following protein(s): Z45766_P19, Z45766_P2, Z45766_P4, Z45766_P5, Z45766_P6, Z45766_P12, Z45766_P8 and Z45766_P14, since it is in the coding region for the corresponding transcript.
  • Segment cluster Z45766_node_24 is supported by 5 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T21 and Z45766_T22. Table 15 below describes the starting and ending position of this segment on each transcript.
  • This segment can be found in both coding and non-coding regions of transcript(s) as follows.
  • the segment can be found in a non-coding region of transcript(s) that are related to the following protein(s): Z45766_P8.
  • This segment can also be found in the following protein(s): Z45766_P14, since it is in the coding region for the corresponding transcript.
  • Segment cluster Z45766_node_28 is supported by 2 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T16. Table 16 below describes the starting and ending position of this segment on each transcript.
  • Segment cluster Z45766_node_30 is supported by 3 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T17 and Z45766_T27. Table 17 below describes the starting and ending position of this segment on each transcript.
  • Segment cluster Z45766_node_33 is supported by 34 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T0, Z45766_T1, Z45766_T3, Z45766_T7, Z45766_T9, Z45766_T10, Z45766_T11, Z45766_T12, Z45766_T16, Z45766_T17, Z45766_T18 and Z45766_T27. Table 18 below describes the starting and ending position of this segment on each transcript.
  • Microarray (chip) data is also available for this segment as follows. As described above with regard to the cluster itself, various oligonucleotides were tested for being differentially expressed in various disease conditions, particularly cancer. The following oligonucleotides were found to hit this segment, shown in Table 19. Table 19 - Oligonucleotides related to this segment
  • This segment can be found in both coding and non-coding regions of transcript(s) as follows.
  • the segment can be found in a non-coding region of transcript(s) that are related to the following protein(s): Z45766_P5 and Z45766_P7.
  • This segment can also be found in the following protein(s): Z45766_P19, Z45766_P2, Z45766_P4, Z45766_P6, Z45766_P10, Z45766_P11, Z45766_P12 and Z45766_P17, since it is in the coding region for the corresponding transcript.
  • Segment cluster Z45766_node_34 is supported by 4 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T27. Table 20 below describes the starting and ending position of this segment on each transcript.
  • Segment cluster Z45766_node_37 is supported by 43 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T0, Z45766_T1, Z45766_T3, Z45766_T7,
  • This segment can be found in both coding and non-coding regions of transcript(s) as follows.
  • the segment can be found in a non-coding region of transcript(s) that are related to the following protein(s): Z45766_P5 and Z45766_P7.
  • This segment can also be found in the following protein(s): Z45766_P19, Z45766_P2, Z45766_P6, Z45766_P10, Z45766_P11 and
  • Segment cluster Z45766_node_39 according to the present invention is supported by 5 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T18. Table 22 below describes the starting and ending position of this segment on each transcript.
  • Segment cluster Z45766_node_42 is supported by 36 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T0, Z45766_T1, Z45766_T3, Z45766_T7, Z45766_T9, Z45766_T10, Z45766_T12, Z45766_T16 and Z45766_T17. Table 23 below describes the starting and ending position of this segment on each transcript.
  • This segment can be found in both coding and non-coding regions of transcript(s) as follows.
  • the segment can be found in a non-coding region of transcript(s) that are related to the following protein(s): Z45766_P4, Z45766_P5 and Z45766_P7.
  • This segment can also be found in the following protein(s): Z45766_P19, Z45766_P2, Z45766_P10 and Z45766_P11, since it is in the coding region for the corresponding transcript.
  • Segment cluster Z45766_node_44 is supported by 29 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T0, Z45766_T1, Z45766_T3, Z45766_T7, Z45766_T9, Z45766_T10, Z45766_T12, Z45766_T16 and Z45766_T17. Table 24 below describes the starting and ending position of this segment on each transcript.
  • This segment can be found in both coding and non-coding regions of transcript(s) as follows.
  • the segment can be found in a non-coding region of transcript(s) that are related to the following protein(s): Z45766_P4, Z45766_P5 and Z45766_P7.
  • This segment can also be found in the following protein(s): Z45766_P19, Z45766_P2, Z45766_P10 and Z45766_P11, since it is in the coding region for the corresponding transcript.
  • Segment cluster Z45766_node_45 is supported by 26 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T0, Z45766_T1, Z45766_T3, Z45766_T7, Z45766_T9, Z45766_T10, Z45766_T11, Z45766_T12, Z45766_T16 and Z45766_T17. Table 25 below describes the starting and ending position of this segment on each transcript.
  • This segment can be found in both coding and non-coding regions of transcript(s) as follows.
  • the segment can be found in a non-coding region of transcript(s) that are related to the following protein(s): Z45766_P19, Z45766_P2, Z45766_P4, Z45766_P5, Z45766_P7, Z45766_P10 and Z45766_P11.
  • This segment can also be found in the following protein(s): Z45766_P6, since it is in the coding region for the corresponding transcript.
  • Segment cluster Z45766_node_46 is supported by 34 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T0, Z45766_T1, Z45766_T3, Z45766_T7, Z45766_T9, Z45766_T10, Z45766_T11, Z45766_T12, Z45766_T16 and Z45766_T17. Table 26 below describes the starting and ending position of this segment on each transcript.
  • transcript(s) that are related to the following protein(s): Z45766_P19, Z45766_P2, Z45766_P4, Z45766_P5, Z45766_P6, Z45766_P7, Z45766_P10 and Z45766_P11.
  • Segment cluster Z45766_node_47 is supported by 56 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T0, Z45766_T1, Z45766_T3, Z45766_T7, Z45766_T9, Z45766_T10, Z45766_T11, Z45766_T12, Z45766_T15, Z45766_T16 and Z45766_T17. Table 27 below describes the starting and ending position of this segment on each transcript.
  • This segment can be found in both coding and non-coding regions of transcript(s) as follows.
  • The segment can be found in a non-coding region of transcript(s) that are related to the following protein(s): Z45766_P19, Z45766_P2, Z45766_P4, Z45766_P5, Z45766_P6, Z45766_P7, Z45766_P10 and Z45766_P11.
  • This segment can also be found in the following protein(s): Z45766_P9, since it is in the coding region for the corresponding transcript.
  • Segment cluster Z45766_node_51 is supported by 21 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T0, Z45766_T1, Z45766_T3, Z45766_T7, Z45766_T9, Z45766_T10, Z45766_T11, Z45766_T12, Z45766_T15, Z45766_T16, Z45766_T17 and Z45766_T25. Table 28 below describes the starting and ending position of this segment on each transcript.
  • Microarray (chip) data is also available for this segment as follows. As described above with regard to the cluster itself, various oligonucleotides were tested for being differentially expressed in various disease conditions, particularly cancer. The following oligonucleotides were found to hit this segment, shown in Table 29.
  • This segment can be found in both coding and non-coding regions of transcript(s) as follows.
  • The segment can be found in a non-coding region of transcript(s) that are related to the following protein(s): Z45766_P19, Z45766_P2, Z45766_P4, Z45766_P5, Z45766_P6, Z45766_P7, Z45766_P9, Z45766_P10 and Z45766_P11.
  • This segment can also be found in the following protein(s): Z45766_P16, since it is in the coding region for the corresponding transcript.
  • Segment cluster Z45766_node_53 is supported by 22 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T0, Z45766_T1, Z45766_T3, Z45766_T7, Z45766_T9, Z45766_T10, Z45766_T11, Z45766_T12, Z45766_T15, Z45766_T16, Z45766_T17 and Z45766_T25. Table 30 below describes the starting and ending position of this segment on each transcript.
  • This segment can be found in a non-coding region of transcript(s) that are related to the following protein(s): Z45766_P19, Z45766_P2, Z45766_P4, Z45766_P5, Z45766_P6, Z45766_P7, Z45766_P9, Z45766_P10, Z45766_P11 and Z45766_P16.
  • Segment cluster Z45766_node_55 is supported by 18 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T0, Z45766_T1, Z45766_T3, Z45766_T7, Z45766_T9, Z45766_T10, Z45766_T11, Z45766_T12, Z45766_T15, Z45766_T16, Z45766_T17 and Z45766_T25. Table 31 below describes the starting and ending position of this segment on each transcript.
  • This segment can be found in a non-coding region of transcript(s) that are related to the following protein(s): Z45766_P19, Z45766_P2, Z45766_P4, Z45766_P5, Z45766_P6, Z45766_P7, Z45766_P9, Z45766_P10, Z45766_P11 and Z45766_P16.
  • Short segments related to the above cluster are also provided. These segments are up to about 120 bp in length, and so are included in a separate description.
  • Segment cluster Z45766_node_0 according to the present invention is supported by 4 libraries. The number of libraries was determined as previously described.
  • This segment can be found in the following transcript(s): Z45766_T0, Z45766_T1, Z45766_T3, Z45766_T7, Z45766_T9, Z45766_T10, Z45766_T11, Z45766_T12, Z45766_T15, Z45766_T18, Z45766_T21, Z45766_T22 and Z45766_T25.
  • Table 32 below describes the starting and ending position of this segment on each transcript.
  • This segment can be found in a non-coding region of transcript(s) that are related to the following protein(s): Z45766_P19, Z45766_P2, Z45766_P4, Z45766_P5, Z45766_P6, Z45766_P7, Z45766_P9, Z45766_P12, Z45766_P8, Z45766_P14 and Z45766_P16.
  • Segment cluster Z45766_node_2 is supported by 25 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T0, Z45766_T1, Z45766_T3, Z45766_T7, Z45766_T9, Z45766_T10, Z45766_T11, Z45766_T12, Z45766_T15, Z45766_T18, Z45766_T21, Z45766_T22 and Z45766_T25. Table 33 below describes the starting and ending position of this segment on each transcript.
  • This segment can be found in both coding and non-coding regions of transcript(s) as follows.
  • The segment can be found in a non-coding region of transcript(s) that are related to the following protein(s): Z45766_P2.
  • This segment can also be found in the following protein(s): Z45766_P19, Z45766_P4, Z45766_P5, Z45766_P6, Z45766_P7, Z45766_P9, Z45766_P12, Z45766_P8, Z45766_P14 and Z45766_P16, since it is in the coding region for the corresponding transcript.
  • Segment cluster Z45766_node_6 is supported by 29 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T0, Z45766_T1, Z45766_T3, Z45766_T9, Z45766_T10, Z45766_T11, Z45766_T12, Z45766_T15, Z45766_T18, Z45766_T21, Z45766_T22 and Z45766_T25. Table 34 below describes the starting and ending position of this segment on each transcript.
  • Segment cluster Z45766_node_15 is supported by 1 library. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T28. Table 35 below describes the starting and ending position of this segment on each transcript.
  • This segment can be found in a non-coding region of transcript(s) that are related to the following protein(s): Z45766_P18.
  • Segment cluster Z45766_node_20 is supported by 36 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T0, Z45766_T1, Z45766_T3, Z45766_T7, Z45766_T9, Z45766_T11, Z45766_T12, Z45766_T15, Z45766_T18, Z45766_T21 and Z45766_T22. Table 36 below describes the starting and ending position of this segment on each transcript.
  • This segment can be found in the following protein(s): Z45766_P19, Z45766_P2, Z45766_P4, Z45766_P6, Z45766_P7, Z45766_P9, Z45766_P12, Z45766_P8 and Z45766_P14.
  • Segment cluster Z45766_node_21 can be found in the following transcript(s): Z45766_T0, Z45766_T1, Z45766_T3, Z45766_T7, Z45766_T9, Z45766_T11, Z45766_T12, Z45766_T18, Z45766_T21 and Z45766_T22. Table 37 below describes the starting and ending position of this segment on each transcript.
  • This segment can be found in the following protein(s): Z45766_P19, Z45766_P2, Z45766_P4, Z45766_P6, Z45766_P7, Z45766_P12, Z45766_P8 and Z45766_P14.
  • Segment cluster Z45766_node_23 according to the present invention is supported by 3 libraries. The number of libraries was determined as previously described. This segment can be W
  • Segment cluster Z45766_node_25 can be found in the following transcript(s): Z45766_T21 and Z45766_T22. Table 39 below describes the starting and ending position of this segment on each transcript. Table 39 - Segment location on transcripts
  • This segment can be found in a non-coding region of transcript(s) that are related to the following protein(s): Z45766_P8 and Z45766_P14.
  • Segment cluster Z45766_node_26 is supported by 2 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T21 and Z45766_T22. Table 40 below describes the starting and ending position of this segment on each transcript.
  • This segment can be found in a non-coding region of transcript(s) that are related to the following protein(s): Z45766_P8 and Z45766_P14.
  • Segment cluster Z45766_node_31 according to the present invention is supported by 28 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T0, Z45766_T1, Z45766_T3, Z45766_T7, Z45766_T9, Z45766_T10, Z45766_T11, Z45766_T12, Z45766_T16, Z45766_T17, Z45766_T18 and Z45766_T27. Table 41 below describes the starting and ending position of this segment on each transcript.
  • This segment can be found in both coding and non-coding regions of transcript(s) as follows.
  • The segment can be found in a non-coding region of transcript(s) that are related to the following protein(s): Z45766_P5 and Z45766_P7.
  • This segment can also be found in the following protein(s): Z45766_P19, Z45766_P2, Z45766_P4, Z45766_P6, Z45766_P10, Z45766_P11, Z45766_P12 and Z45766_P17, since it is in the coding region for the corresponding transcript.
  • Segment cluster Z45766_node_38 is supported by 37 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T0, Z45766_T1, Z45766_T3, Z45766_T7, Z45766_T10, Z45766_T11, Z45766_T12, Z45766_T16, Z45766_T17 and Z45766_T18. Table
  • This segment can be found in both coding and non-coding regions of transcript(s) as follows.
  • The segment can be found in a non-coding region of transcript(s) that are related to the following protein(s): Z45766_P5 and Z45766_P7.
  • This segment can also be found in the following protein(s): Z45766_P19, Z45766_P2, Z45766_P6, Z45766_P10, Z45766_P11 and
  • Segment cluster Z45766_node_41 is supported by 32 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T0, Z45766_T1, Z45766_T3, Z45766_T7, Z45766_T9, Z45766_T10, Z45766_T12, Z45766_T16 and Z45766_T17. Table 43 below describes the starting and ending position of this segment on each transcript.
  • This segment can be found in both coding and non-coding regions of transcript(s) as follows.
  • The segment can be found in a non-coding region of transcript(s) that are related to the following protein(s): Z45766_P5 and Z45766_P7.
  • This segment can also be found in the following protein(s): Z45766_P19, Z45766_P2, Z45766_P4, Z45766_P10 and Z45766_P11, since it is in the coding region for the corresponding transcript.
  • Segment cluster Z45766_node_50 is supported by 18 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): Z45766_T0, Z45766_T1, Z45766_T7, Z45766_T9, Z45766_T10, Z45766_T11, Z45766_T12, Z45766_T15, Z45766_T16, Z45766_T17 and Z45766_T25. Table 44 below describes the starting and ending position of this segment on each transcript.
  • This segment can be found in both coding and non-coding regions of transcript(s) as follows.
  • The segment can be found in a non-coding region of transcript(s) that are related to the following protein(s): Z45766_P19, Z45766_P2, Z45766_P4, Z45766_P5, Z45766_P6, Z45766_P7, Z45766_P9, Z45766_P10 and Z45766_P11.
  • This segment can also be found in the following protein(s): Z45766_P16, since it is in the coding region for the corresponding transcript.
  • Segment cluster Z45766_node_52 according to the present invention is supported by 21 libraries. The number of libraries was determined as previously described.
  • This segment can be found in the following transcript(s): Z45766_T0, Z45766_T1, Z45766_T3, Z45766_T7, Z45766_T9, Z45766_T10, Z45766_T11, Z45766_T12, Z45766_T15, Z45766_T16, Z45766_T17 and Z45766_T25. Table 45 below describes the starting and ending position of this segment on each transcript.
  • This segment can be found in a non-coding region of transcript(s) that are related to the following protein(s): Z45766_P19, Z45766_P2, Z45766_P4, Z45766_P5, Z45766_P6, Z45766_P7, Z45766_P9, Z45766_P10, Z45766_P11 and Z45766_P16.
  • Cluster AA436634 features 1 transcript(s) and 1 segment(s) of interest, the names for which are given in Tables 46 and 47, respectively; the sequences themselves are given at the end of the application.
  • The heart-selective diagnostic marker prediction engine provided the following results with regard to cluster AA436634. Predictions were made for selective expression of transcripts of this contig in heart tissue, according to the previously described methods.
  • The numbers on the y-axis of Figure 4 below refer to weighted expression of ESTs in each category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to the expression of all ESTs in that category, according to parts per million).
  • This cluster was found to be selectively expressed in heart for the following reasons: the ratio of the expression of the cluster in heart-specific ESTs to its overall expression in non-heart ESTs was found to be 39.1; the ratio of the expression of the cluster in heart-specific ESTs to its overall expression in muscle-specific ESTs was found to be 74; and Fisher exact test P-values, computed for both library and weighted clone counts to check that the counts are statistically significant, were found to be 1.10E-05.
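
The parts-per-million measure, the expression ratios and the Fisher exact test mentioned above can be sketched in outline as follows. This is a minimal sketch under stated assumptions, not the prediction engine itself: the function names, the one-sided form of the test and the layout of the 2x2 table are hypothetical choices made for illustration, and the underlying EST counts are not given in this section.

```python
from math import comb


def parts_per_million(cluster_ests: int, category_ests: int) -> float:
    """ESTs of one cluster per million ESTs of a tissue category."""
    if category_ests <= 0:
        raise ValueError("category_ests must be positive")
    return 1e6 * cluster_ests / category_ests


def expression_ratio(ppm_target: float, ppm_background: float) -> float:
    """Ratio of expression in the target tissue to a background tissue set."""
    if ppm_background <= 0:
        raise ValueError("ppm_background must be positive")
    return ppm_target / ppm_background


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher exact P-value for the 2x2 table [[a, b], [c, d]].

    Assumed (hypothetical) layout:
      a = target-tissue counts in the cluster, b = target-tissue counts elsewhere,
      c = other-tissue counts in the cluster,  d = other-tissue counts elsewhere.
    Returns the probability of seeing at least `a` in the top-left cell
    when all row and column totals are held fixed.
    """
    if min(a, b, c, d) < 0:
        raise ValueError("counts must be non-negative")
    row1, col1, total = a + b, a + c, a + b + c + d
    denominator = comb(total, col1)
    upper = min(row1, col1)
    numerator = sum(
        comb(row1, k) * comb(total - row1, col1 - k) for k in range(a, upper + 1)
    )
    return numerator / denominator
```

With counts in hand, the ratio for a cluster would be obtained by passing two `parts_per_million` values to `expression_ratio`, and the P-value by passing the four cell counts to `fisher_exact_greater`, once for library counts and once for weighted clone counts.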
  • Cluster AA436634 features 1 segment(s), which were listed in Table 47 above and for which the sequence(s) are given at the end of the application. These segment(s) are portions of nucleic acid sequence(s) which are described herein separately because they are of particular interest. A description of each segment according to the present invention is now provided.
  • Segment cluster AA436634_node_0 is supported by 18 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): AA436634_T0. Table 49 below describes the starting and ending position of this segment on each transcript.
  • Cluster AA604379 features 4 transcript(s) and 22 segment(s) of interest, the names for which are given in Tables 50 and 51, respectively; the sequences themselves are given at the end of the application.
  • The selected protein variants are given in Table 52.
  • Cluster AA604379 can be used as a diagnostic marker according to overexpression of transcripts of this cluster in cancer. Expression of such transcripts in normal tissues is also given according to the previously described methods.
  • The term "number" in the left hand column of the table and the numbers on the y-axis of Figure 5 refer to weighted expression of ESTs in each category, as "parts per million" (ratio of the expression of ESTs for a particular cluster to the expression of all ESTs in that category, according to parts per million).
  • This cluster is overexpressed (at least at a minimum level) in the following pathological conditions: brain malignant tumors, epithelial malignant tumors, a mixture of malignant tumors from different
  • Cluster AA604379 features 22 segment(s), which were listed in Table 51 above and for which the sequence(s) are given at the end of the application. These segment(s) are portions of nucleic acid sequence(s) which are described herein separately because they are of particular interest. A description of each segment according to the present invention is now provided.
  • Segment cluster AA604379_node_2 is supported by 63 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): AA604379_T4, AA604379_T5, AA604379_T6 and AA604379_T10. Table 55 below describes the starting and ending position of this segment on each transcript.
  • This segment can be found in both coding and non-coding regions of transcript(s) as follows.
  • The segment can be found in a non-coding region of transcript(s) that are related to the following protein(s): AA604379_P4.
  • This segment can also be found in the following protein(s): AA604379_P1 and AA604379_P3, since it is in the coding region for the corresponding transcript.
  • Segment cluster AA604379_node_14 is supported by 55 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): AA604379_T4, AA604379_T5, AA604379_T6 and AA604379_T10. Table 56 below describes the starting and ending position of this segment on each transcript.
  • This segment can be found in both coding and non-coding regions of transcript(s) as follows.
  • The segment can be found in a non-coding region of transcript(s) that are related to the following protein(s): AA604379_P4.
  • This segment can also be found in the following protein(s): AA604379_P1 and AA604379_P3, since it is in the coding region for the corresponding transcript.
  • Segment cluster AA604379_node_19 is supported by 6 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): AA604379_T5 and AA604379_T10. Table 57 below describes the starting and ending position of this segment on each transcript.
  • This segment can be found in both coding and non-coding regions of transcript(s) as follows.
  • The segment can be found in a non-coding region of transcript(s) that are related to the following protein(s): AA604379_P3.
  • This segment can also be found in the following protein(s): AA604379_P4, since it is in the coding region for the corresponding transcript.
  • Segment cluster AA604379_node_21 according to the present invention is supported by 10 libraries. The number of libraries was determined as previously described.
  • This segment can be found in the following transcript(s): AA604379_T4, AA604379_T5, AA604379_T6 and AA604379_T10. Table 58 below describes the starting and ending position of this segment on each transcript.
  • This segment can be found in both coding and non-coding regions of transcript(s) as follows.
  • The segment can be found in a non-coding region of transcript(s) that are related to the following protein(s): AA604379_P1 and AA604379_P3.
  • This segment can also be found in the following protein(s): AA604379_P4, since it is in the coding region for the corresponding transcript.
  • Segment cluster AA604379_node_22 is supported by 38 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): AA604379_T4, AA604379_T5, AA604379_T6 and AA604379_T10. Table 59 below describes the starting and ending position of this segment on each transcript.
  • This segment can be found in a non-coding region of transcript(s) that are related to the following protein(s): AA604379_P1, AA604379_P3 and AA604379_P4.
  • Segment cluster AA604379_node_25 according to the present invention is supported by
  • This segment can be found in a non-coding region of transcript(s) that are related to the following protein(s): AA604379_P1, AA604379_P3 and AA604379_P4.
  • Segment cluster AA604379_node_27 is supported by 41 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): AA604379_T4, AA604379_T5, AA604379_T6 and AA604379_T10. Table 61 below describes the starting and ending position of this segment on each transcript.
  • This segment can be found in a non-coding region of transcript(s) that are related to the following protein(s): AA604379_P1, AA604379_P3 and AA604379_P4.
  • Short segments related to the above cluster are also provided. These segments are up to about 120 bp in length, and so are included in a separate description.
  • Segment cluster AA604379_node_0 is supported by 48 libraries. The number of libraries was determined as previously described. This segment can be found in the following transcript(s): AA604379_T4, AA604379_T5, AA604379_T6 and AA604379_T10. Table 62 below describes the starting and ending position of this segment on each transcript.
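
Each segment entry in this listing follows the same sentence pattern, so the entries can be turned into records mechanically. The following is a minimal sketch, assuming the wording used in the entries above; `SegmentRecord`, `SEGMENT_RE` and `parse_segment` are hypothetical names introduced for illustration and are not part of the application.

```python
import re
from dataclasses import dataclass


@dataclass
class SegmentRecord:
    node: str               # e.g. "AA604379_node_0"
    libraries: int          # number of supporting libraries
    transcripts: list[str]  # transcripts in which the segment is found


# Matches "Segment cluster <node> [according to the present invention]
# is supported by <n> libraries. ... following transcript(s): <list>."
SEGMENT_RE = re.compile(
    r"Segment cluster (?P<node>\w+_node_\d+)"
    r"(?: according to the present invention)?"
    r" is supported by (?P<libs>\d+) librar(?:y|ies)\."
    r".*?found in the following transcript\(s\): (?P<tx>[^.]+)\."
)


def parse_segment(text: str) -> SegmentRecord:
    """Extract node name, library count and transcript list from one entry."""
    match = SEGMENT_RE.search(text)
    if match is None:
        raise ValueError("text does not follow the expected entry pattern")
    # Transcript names are separated by commas, with "and" before the last one.
    names = re.split(r",\s*|\s+and\s+", match["tx"].strip())
    return SegmentRecord(
        node=match["node"],
        libraries=int(match["libs"]),
        transcripts=[name for name in names if name],
    )
```

Entries that omit the library count, or that are cut off before the transcript list, do not match the pattern and raise `ValueError` rather than yielding a partial record.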

Landscapes

  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Biochemistry (AREA)
  • Molecular Biology (AREA)
  • Genetics & Genomics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biotechnology (AREA)
  • Engineering & Computer Science (AREA)
  • Toxicology (AREA)
  • Zoology (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Biophysics (AREA)
  • Medicinal Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Peptides Or Proteins (AREA)

Abstract

The present invention relates to novel nucleic acid sequences of splice variants. The novel splice variants and their nucleic acid sequences according to the present invention may optionally be used for the diagnosis of a variant-detectable disease as described in the specification.
PCT/IB2005/002438 2004-01-27 2005-01-27 Novel nucleotide and amino acid sequences, and assays and methods of use thereof for diagnosis WO2006035273A2 (fr)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CA002554718A CA2554718A1 (fr) 2004-01-27 2005-01-27 Novel nucleotide and amino acid sequences, and assays and methods of use thereof for diagnosis
AU2005288710A AU2005288710A1 (en) 2004-01-27 2005-01-27 Novel nucleotide and amino acid sequences, and assays and methods of use thereof for diagnosis
EP05805030A EP1716256A2 (fr) 2004-01-27 2005-01-27 Novel nucleotide and amino acid sequences, and assays and methods of use thereof for diagnosis

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US53912904P 2004-01-27 2004-01-27
US60/539,129 2004-01-27
US62866604P 2004-11-18 2004-11-18
US60/628,666 2004-11-18
US11/043,788 US20060014166A1 (en) 2004-01-27 2005-01-27 Novel nucleotide and amino acid sequences, and assays and methods of use thereof for diagnosis of endometriosis
US---- 2020-03-21

Publications (2)

Publication Number Publication Date
WO2006035273A2 true WO2006035273A2 (fr) 2006-04-06
WO2006035273A3 WO2006035273A3 (fr) 2009-04-16

Family

ID=36119256

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2005/002438 WO2006035273A2 (fr) 2004-01-27 2005-01-27 Novel nucleotide and amino acid sequences, and assays and methods of use thereof for diagnosis

Country Status (4)

Country Link
EP (1) EP1716256A2 (fr)
AU (1) AU2005288710A1 (fr)
CA (1) CA2554718A1 (fr)
WO (1) WO2006035273A2 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9920123B2 (en) 2008-12-09 2018-03-20 Genentech, Inc. Anti-PD-L1 antibodies, compositions and articles of manufacture

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
CHEUNG V.G. ET AL.: 'Natural variation in human gene expression assess in lymphoblastoid cells' NATURE GENETICS vol. 33, March 2003, pages 422 - 425, XP003002707 *
DATABASE GENBANK [Online] 02 May 2003 'Homo sapiens chromosome 1 clone RP11-34I24', XP008113368 Retrieved from EBI Database accession no. AC093150 *
ENARD W. ET AL.: 'Intra- and interspecific variation in primate gene expression patters.' SCIENCE vol. 296, 12 April 2002, pages 340 - 343, XP003025611 *
XIANG C.C. ET AL.: 'Amine-modified random primers to label probes for DNA microarrays' NATURE BIOTECHNOLOGY vol. 20, July 2002, pages 738 - 742, XP001182019 *

Also Published As

Publication number Publication date
CA2554718A1 (fr) 2006-04-06
AU2005288710A2 (en) 2006-04-06
EP1716256A2 (fr) 2006-11-02
AU2005288710A1 (en) 2006-04-06
WO2006035273A3 (fr) 2009-04-16

Similar Documents

Publication Publication Date Title
US7842459B2 (en) Nucleotide and amino acid sequences, and assays and methods of use thereof for diagnosis
US20060046257A1 (en) Novel nucleotide and amino acid sequences, and assays and methods of use thereof for diagnosis of lung cancer
US20030059875A1 (en) Nucleic acids, proteins, and antibodies
US20060051774A1 (en) Novel nucleotide and amino acid sequences, and assays and methods of use thereof for diagnosis of prostate cancer
EP1931703A2 (fr) Novel nucleotides and amino acid sequences, and assays and methods of use thereof for diagnosis
WO2005072049A2 (fr) Differential expression of markers in endometriosis
WO2014197453A1 (fr) Recurrent mutations affecting epigenetic regulators, RHOA and FYN kinases, in peripheral T-cell lymphomas
US20060263786A1 (en) Novel nucleotide and amino acid sequences, and assays and methods of use thereof for diagnosis of colon cancer
WO2010061393A1 (fr) Amino acid and nucleotide sequences of HE4 variants and methods of use thereof
EP1716256A2 (fr) Novel nucleotide and amino acid sequences, and assays and methods of use thereof for diagnosis
AU2699901A (en) Biallelic markers derived from genomic regions carrying genes involved in central nervous system disorders
US7528243B2 (en) Nucleotide and amino acid sequences, and assays and methods of use thereof for diagnosis of breast cancer
WO2005116850A9 (fr) Differential expression of markers in ovarian cancer
EP1774046A2 (fr) Novel nucleotide and amino acid sequences, and assays and methods of use thereof for diagnosis of lung cancer
WO2007060671A2 (fr) Novel nucleotide and amino acid sequences, and assays and methods of use thereof for diagnosis
US20090075257A1 (en) Novel nucleic acid sequences and methods of use thereof for diagnosis
AU2005207882A1 (en) Novel nucleotide and amino acid sequences, and assays and methods of use thereof for diagnosis of breast cancer
EP1749025A2 (fr) Novel nucleotide and amino acid sequences, and assays and methods of use thereof for diagnosis of colon cancer
US8981070B2 (en) Conjugate between a thiophilic solid phase and an oligonucleotide comprising a thiooxonucleotide
US20060148741A1 (en) Metastasis suppressor gene on human chromosome 8 and its use in the diagnosis, prognosis and treatment of cancer
WO2005107364A9 (fr) Novel nucleotide and amino acid sequences, and assays and methods of use thereof for diagnosis
Sobieszczańska et al. Genetic Variability in Selected ZnT8 SNPs in the Opolskie Voivodeship (Poland)-Relationship with Type 2 Diabetes and its Complications and Accompanying Diseases
WO2021133771A1 (fr) Adenylate cyclase 7 (ADCY7) variants and uses thereof
KR20240058125A (ko) Treatment of liver disease using a CAMP Responsive Element Binding Protein 3 Like 3 (CREB3L3) inhibitor
EP1735468A2 (fr) Novel nucleotide and amino acid sequences, and assays and methods of use thereof for diagnosis of prostate cancer

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

WWE Wipo information: entry into national phase

Ref document number: 2005288710

Country of ref document: AU

Ref document number: 2554718

Country of ref document: CA

NENP Non-entry into the national phase in:

Ref country code: DE

WWW Wipo information: withdrawn in national office

Ref document number: DE

WWE Wipo information: entry into national phase

Ref document number: 2005805030

Country of ref document: EP

ENP Entry into the national phase in:

Ref document number: 2005288710

Country of ref document: AU

Date of ref document: 20050127

Kind code of ref document: A

WWP Wipo information: published in national office

Ref document number: 2005288710

Country of ref document: AU

121 Ep: the epo has been informed by wipo that ep was designated in this application
WWP Wipo information: published in national office

Ref document number: 2005805030

Country of ref document: EP