WO2002093165A1 - Materials and methods for detecting novel mRNA splice variants - Google Patents

Materials and methods for detecting novel mRNA splice variants

Info

Publication number
WO2002093165A1
WO2002093165A1 PCT/US2002/015649 US0215649W WO02093165A1 WO 2002093165 A1 WO2002093165 A1 WO 2002093165A1 US 0215649 W US0215649 W US 0215649W WO 02093165 A1 WO02093165 A1 WO 02093165A1
Authority
WO
WIPO (PCT)
Prior art keywords
gene
oligonucleotides
splice
exon
oligonucleotide
Prior art date
Application number
PCT/US2002/015649
Other languages
English (en)
Inventor
Douglas Dolginow
Lawrence Mertz
Original Assignee
Gene Logic, Inc.
Priority date
Filing date
Publication date
Application filed by Gene Logic, Inc.
Priority to US10/471,731 (US20040115686A1)
Publication of WO2002093165A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6813 Hybridisation assays
    • C12Q1/6827 Hybridisation assays for detection of mutation or polymorphism
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/30 Data warehousing; Computing architectures
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics

Definitions

  • the invention relates generally to the field of molecular biology and gene expression.
  • the invention relates to materials and methods to detect alternative splice variants of mRNA.
  • mRNA: messenger RNA
  • The set of related mRNAs derived from a given gene by alternative splicing is called the transcriptome.
  • splice variant profile means the set of mRNAs expressed along with their expression levels. Alternative splice variants have been associated with various disease states.
  • alternate splicing of the T-cell receptor zeta chain mRNA has been associated with lupus erythematosus (Nambiar, et al., Arthritis Rheum 44(6): 1336-1350, 2001)
  • alternate splice variants of the vascular endothelial growth factor have been associated with osteoarthritis (Pufe, et al., Arthritis Rheum 44(5): 1082-1088, 2001)
  • alternate splice variants of the presenilin-2 gene have been associated with some types of Alzheimer's disease (Sato, et al., J. Neurochem.
  • microarrays used in the prior art have contained probes selected based on the predicted or known exon sequence or by using the entire genome sequence and overlapping the sequences of the probes. While either of these methods will permit the detection of an exon expressed in an mRNA sample, neither provides information concerning the arrangement of multiple exons that may be present in any given mRNA molecule. Thus, there exists a need in the art for improved microarrays and methods for detecting the presence of specific splice junctions and exons in mRNA.
  • the present invention includes, in part, materials and methods for detecting alternatively spliced mRNA.
  • the present invention provides a solid support comprising a plurality of oligonucleotides, wherein each oligonucleotide has a sequence that specifically hybridizes to a splice junction sequence in a target mRNA.
  • the plurality of oligonucleotides may comprise at least n-1
  • the solid supports of the invention may also comprise one or more oligonucleotides that specifically hybridize to an exon of the gene of interest.
  • the invention includes a solid support comprising at least two oligonucleotides, wherein a first oligonucleotide specifically hybridizes to a splice junction in a first mRNA transcribed from a first gene of interest and a second oligonucleotide that specifically hybridizes to a splice junction in a second mRNA transcribed from a second gene of interest.
  • the genes may be the same or different. In some embodiments, the different genes may originate on different chromosomes. In some embodiments, the genes may be the result of a translocation event. In some embodiments, the first mRNA and the second mRNA have at least one exon in common.
  • the solid supports according to the invention may further comprise a third and a fourth oligonucleotide, wherein the third oligonucleotide specifically hybridizes to an exon of the first gene and the fourth oligonucleotide specifically hybridizes to an exon of the second gene.
  • the present invention provides a solid support comprising oligonucleotides, wherein the oligonucleotides comprise at least one oligonucleotide that specifically hybridizes to each possible splice junction in a mRNA transcribed from a first gene of interest.
  • the solid supports may optionally comprise additional oligonucleotides, preferably the additional oligonucleotides comprise at least one oligonucleotide that specifically hybridizes to each possible splice junction in a mRNA transcribed from a second gene of interest.
  • the invention includes a method of detecting alternative spliced mRNA by contacting a solid support of the invention with a solution comprising nucleic acids representative of mRNA in a cell and detecting an alternatively spliced mRNA.
  • the nucleic acids may be ribonucleic acids and/or deoxyribonucleic acids.
  • the invention includes a method of detecting a pathological condition in a patient, wherein the pathological condition is characterized by alternative splice variants of one or more genes, by contacting a sample from the patient with a solid support according to the invention and detecting a level of expression of an alternative splice variant in the sample, wherein the expression level of the alternative splice variant is indicative of a pathological condition.
  • the invention provides a computer system that includes a database containing information identifying an expression level for one or more alternative splice variants of one or more mRNAs and a user interface to view the information.
  • the computer system of the invention may optionally include a database that contains information identifying an expression level for an alternative splice variant in normal and/or disease tissue.
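As a rough illustration of the kind of database described above, the following sketch stores expression levels for splice variants in normal and disease tissue and retrieves them for viewing. It is only a sketch under stated assumptions: the table name `splice_variant_expression`, its columns, the gene name `GENE1`, and all expression values are hypothetical and do not come from the source.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical splice variant expression levels."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE splice_variant_expression (
               gene TEXT NOT NULL,
               variant TEXT NOT NULL,
               tissue_state TEXT NOT NULL,      -- e.g. 'normal' or 'disease'
               expression_level REAL NOT NULL
           )"""
    )
    rows = [  # illustrative values only
        ("GENE1", "A-B-C", "normal", 80.0),
        ("GENE1", "A-B-C", "disease", 50.0),
        ("GENE1", "A-C", "normal", 20.0),
        ("GENE1", "A-C", "disease", 50.0),
    ]
    conn.executemany(
        "INSERT INTO splice_variant_expression VALUES (?, ?, ?, ?)", rows
    )
    return conn


def levels_for_variant(conn, gene, variant):
    """Return {tissue_state: expression_level} for one splice variant of one gene."""
    cur = conn.execute(
        "SELECT tissue_state, expression_level FROM splice_variant_expression "
        "WHERE gene = ? AND variant = ?",
        (gene, variant),
    )
    return dict(cur.fetchall())


conn = build_demo_db()
print(levels_for_variant(conn, "GENE1", "A-C"))
```

A user interface of the kind mentioned above could sit on top of a query function like `levels_for_variant`; the storage engine and schema are a design choice, not something the source specifies.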
  • the present invention provides a method of identifying an agent that modulates a pathological condition by contacting a sample with the agent, determining a splice variant expression profile for at least one gene, comparing the splice variant profile to a splice variant profile obtained from a sample not treated with the agent, and determining a change in the splice variant profile, wherein a change in the splice variant profile is indicative of an agent that modulates the condition.
  • the present invention also provides agents identified by this method.
  • the agents may be optionally formulated for pharmaceutical use, for example, an effective amount of an agent to modulate a pathological condition may be combined with one or more pharmaceutically acceptable buffers, excipients, diluents and the like.
  • RNA splicing occurs in the nucleus and is directed by small nuclear ribonucleoproteins (snRNPs). These snRNPs are believed to recognize specific RNA sequences that are present at exon-intron boundaries that act as nucleation points to direct the splicing reaction. These conserved boundary sequences are known as 5' splice (donor) and 3' splice (acceptor) sites. In higher eukaryotes, the consensus sequences for splicing exon-intron sequences are as follows (the second line represents other possibilities at certain nucleotide positions): 5' CAG GUAAGU A UUUUUUUUUUNUAG G....
  • transcripts may result from variations in the snRNP-directed splicing of an RNA molecule transcribed from a gene unit that contains multiple exons. For example, if the genomic organization of a gene is as follows:
  • Alternative spliced transcripts containing one or more of the exons, i.e., transcripts containing exon 1, exons 1 and 2, exons 1, 2 and 3, etc.
  • the present invention provides materials and methods for the detection and analysis of these alternative spliced transcripts also referred to herein as splice variants.
  • oligonucleotide sequences that are complementary to one or more of the nucleic acids (DNA, mRNA, cDNA, rRNA, etc.) described herein, such as a sequence comprising a splice junction site, refers to oligonucleotides that are capable of hybridizing under stringent conditions to at least part of the nucleotide sequence of said nucleic acids. Such hybridizable oligonucleotides will typically exhibit at least about 75% sequence identity at the nucleotide level to said nucleic acids, preferably about 80% or 85% sequence identity, or more preferably about 90% or 95% or more sequence identity.
  • binding(s) substantially refers to complementary hybridization between a probe nucleic acid and a target nucleic acid and embraces minor mismatches that can be accommodated by reducing the stringency of the hybridization media to achieve the desired detection of the target polynucleotide sequence.
  • background refers to hybridization signals resulting from non-specific binding, or other interactions, between the labeled target nucleic acids and components of the oligonucleotide array (e.g., the oligonucleotide probes, control probes, the array substrate, etc.). Background signals may also be produced by intrinsic fluorescence of the array components themselves. A single background signal can be calculated for the entire array, or a different background signal may be calculated for each target nucleic acid. In a preferred embodiment, background is calculated as the average hybridization signal intensity for the lowest 5% to 10% of the probes in the array, or, where a different background signal is calculated for each target gene, for the lowest 5% to 10% of the probes for each gene.
  • background may be calculated as the average hybridization signal intensity produced by hybridization to probes that are not complementary to any sequence found in the sample (e.g., probes directed to nucleic acids of the opposite sense or to genes not found in the sample such as bacterial genes where the sample is mammalian nucleic acids). Background can also be calculated as the average signal intensity produced by regions of the array that lack any probes at all.
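The background calculations described above can be sketched as follows. This is a minimal illustration, assuming probe intensities are already available as a plain list of numbers; the function names and the example intensities are hypothetical.

```python
from statistics import mean


def background_lowest_fraction(intensities, fraction=0.05):
    """Average hybridization signal of the lowest `fraction` of probes.

    The text describes using the lowest 5% to 10% of probes, so `fraction`
    would normally be between 0.05 and 0.10. At least one probe is always used.
    """
    if not intensities:
        raise ValueError("no probe intensities supplied")
    ranked = sorted(intensities)
    count = max(1, round(len(ranked) * fraction))
    return mean(ranked[:count])


def background_from_controls(control_intensities):
    """Average signal of probes not complementary to any sequence in the sample."""
    return mean(control_intensities)


# Hypothetical intensities for a 20-probe array.
signals = [120, 15, 980, 22, 18, 450, 30, 2750, 12, 640,
           25, 19, 88, 310, 14, 1600, 27, 21, 75, 16]
print(background_lowest_fraction(signals, fraction=0.10))
```

To obtain a per-gene background instead of an array-wide one, the same function can be applied to the subset of intensities belonging to each gene's probes.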
  • the terms “gene” and “gene unit” refer to a segment of DNA comprising both coding sequences (exons) and non-coding sequences (introns) that occupies a particular chromosomal locus and contains all the information for the coding of at least one mRNA product (unless intergenic exon splicing occurs).
  • Said mRNA products may comprise differential arrangements of the exons of the gene, resulting in the encoding of differential polypeptide or protein products that are splice variants of one another.
  • the terms “gene” and “gene unit” also include the term “allele,” which, as used herein, encompasses naturally or artificially occurring alternative forms of a gene occupying a particular chromosomal locus.
  • hybridizing specifically to or “specifically hybridize” refers to the binding, duplexing or hybridizing of a molecule substantially to or only to a particular nucleotide sequence or sequences under stringent conditions when that sequence is present in a complex mixture (e.g., total cellular DNA or RNA).
  • Assays and methods of the invention may utilize available formats to simultaneously screen at least about 100, preferably about 1000, more preferably about 10,000 and most preferably about 1,000,000 or more different nucleic acid hybridizations.
  • mismatch control or mismatch probe refer to a probe whose sequence is deliberately selected not to be perfectly complementary to a particular target sequence.
  • For each mismatch (MM) control in a high-density array there typically exists a corresponding perfect match (PM) probe that is perfectly complementary to the same particular target sequence.
  • the mismatch may comprise one or more bases.
  • while the mismatch(es) may be located anywhere in the mismatch probe, terminal mismatches are less desirable, as a terminal mismatch is less likely to prevent hybridization of the target sequence.
  • the mismatch is located at or near the center of the probe such that the mismatch is most likely to destabilize the duplex with the target sequence under the test hybridization conditions.
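A mismatch control with a single central mismatch could be derived from a perfect match probe roughly as follows. This is a sketch, not the source's procedure: the text only requires that the mismatch sit at or near the center, so substituting the complementary base at that position is an assumption, and the 25-mer shown is a hypothetical sequence.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def make_mismatch_probe(perfect_match):
    """Return a copy of the PM probe with its central base replaced.

    Assumption: the replacement base is the complement of the original base,
    which guarantees that the central position differs from the PM probe.
    """
    probe = perfect_match.upper()
    centre = len(probe) // 2
    return probe[:centre] + COMPLEMENT[probe[centre]] + probe[centre + 1:]


pm = "ACGTTGCAAGGCTAGCTAGGATCCA"  # hypothetical 25-mer perfect match probe
mm = make_mismatch_probe(pm)
print(pm)
print(mm)
```

Placing the substitution at `len(probe) // 2` follows the rationale given above that a central mismatch is most likely to destabilize the duplex, whereas a terminal one is less likely to prevent hybridization.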
  • perfect match probe refers to a probe that has a sequence that is perfectly complementary to a particular target sequence.
  • the test probe is typically perfectly complementary to a portion (subsequence) of the target sequence.
  • the perfect match (PM) probe can be a "test probe”, a "normalization control” probe, an expression level control probe and the like.
  • a perfect match control or perfect match probe is, however, distinguished from a “mismatch control” or “mismatch probe.”
  • the term “predicted” refers to any nucleic acid sequence being investigated, studied, probed or tested for being adjacent to the splice junction site of two exons at either the 5' or 3' side.
  • a "probe” is defined as a nucleic acid, preferably an oligonucleotide, capable of binding to a target nucleic acid of complementary sequence through one or more types of chemical bonds, usually through complementary base pairing, usually through hydrogen bond formation.
  • a probe may include natural (i.e., A, G, U, C or T) or modified bases (7-deazaguanosine, inosine, etc.).
  • the bases in probes may be joined by a linkage other than a phosphodiester bond, so long as it does not interfere with hybridization.
  • probes may be peptide nucleic acids in which the constituent bases are joined by peptide bonds rather than phosphodiester linkages.
  • splice variants within a gene or allele refers to related transcripts which are products of alternative splicing between exons of said gene or allele resulting in the selective omission or inclusion of exons. Splice variants also include the products of the fusion of exons from at least two different genes or alleles. Said different genes may originate on the same chromosome and be adjacent or in close proximity to one another.
  • Said different genes may alternatively originate on the same or different chromosomes and their exons may be brought into proximity with one another by, for example, at least one of a translocation, crossover, chiasma, deletion, insertion, substitution, inversion, intrachromosomal rearrangement, intrachange, or recombination event.
  • stringent conditions refers to conditions under which a probe will hybridize to its target subsequence, but with only insubstantial hybridization to other sequences or to other sequences such that the difference may be identified. Stringent conditions are sequence-dependent and will be different in different circumstances. Longer sequences hybridize specifically at higher temperatures. Generally, stringent conditions are selected to be about 5°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH.
  • stringent conditions will be those in which the salt concentration is at least about 0.01 to 1.0 M sodium ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C for short probes (e.g., 10 to 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide.
  • stringent conditions include those that (1) employ low ionic strength and high temperature for washing, for example, 0.015 M NaCl/0.0015 M sodium citrate/0.1% SDS at 50°C, or (2) employ during hybridization a denaturing agent such as formamide, for example, 50% (vol/vol) formamide with 0.1% bovine serum albumin/0.1% Ficoll/0.1% polyvinylpyrrolidone/50 mM sodium phosphate buffer at pH 6.5 with 750 mM NaCl, 75 mM sodium citrate at 42°C.
  • Another example is hybridization in 50% formamide, 5× SSC (0.75 M NaCl, 0.075 M sodium citrate), 50 mM sodium phosphate (pH 6.8), 0.1% sodium pyrophosphate, 5× Denhardt's solution, sonicated salmon sperm DNA (50 µg/ml), 0.1% SDS, and 10% dextran sulfate at 42°C, with washes at 42°C in 0.2× SSC and 0.1% SDS.
  • a skilled artisan can readily determine and vary the stringency conditions appropriately to obtain a clear and detectable hybridization signal.
  • the "percentage of sequence identity” or “sequence identity” is determined by comparing two optimally aligned sequences or subsequences over a comparison window or span, wherein the portion of the polynucleotide sequence in the comparison window may optionally comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences.
  • the percentage is calculated by determining the number of positions at which the identical subunit (e.g., nucleic acid base or amino acid residue) occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity.
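The percentage calculation just described can be written out as a short sketch. It assumes the two sequences have already been optimally aligned (gaps written as `-`) and that the comparison window is the full alignment; the function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of window positions holding the identical subunit in both sequences.

    matched positions / total positions in the comparison window * 100
    A position containing a gap in either sequence never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


# 8 of 10 aligned positions match (one substitution, one gap).
print(percent_identity("ACGTACGTAC", "ACGTTCG-AC"))
```

Producing the optimal alignment itself is left to a dedicated alignment program; this sketch only covers the counting step.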
  • Percentage sequence identity when calculated using the programs GAP or BESTFIT (see below) is calculated using default gap weights.
  • Homology or identity may be determined by BLAST (Basic Local Alignment Search Tool) analysis using the algorithm employed by the programs blastp, blastn, blastx, tblastn and tblastx (Karlin et al., (1990) Proc. Natl. Acad. Sci. USA 87, 2264-2268 and Altschul, (1993) J. Mol. Evol. 36, 290-300, fully incorporated by reference) which are tailored for sequence similarity searching.
  • the approach used by the BLAST program is to first consider similar segments between a query sequence and a database sequence, then to evaluate the statistical significance of all matches that are identified and finally to summarize only those matches which satisfy a preselected threshold of significance.
  • arrays usually either array specific short DNA sequences (oligos) that are complementary to a portion of an exon in a gene, or they utilize larger cDNAs, to which many differently spliced transcripts can hybridize, thus making any conclusions concerning splice variants very tenuous.
  • One embodiment of the present invention is a microarray and method that can more accurately detect and quantify the expression of alternatively spliced mRNAs by arraying both sequences specific to and preferably within each exon, as well as DNA sequences that are specific to the predicted exon-exon junctions or splice junctions. For example, if a genomic unit is described as having exons A, B and C, DNA sequences specific to one or more of the exons may be arrayed and, in addition, sequences pertaining to one or more of the exon junctions AB, BC and AC are also arrayed. This approach may be expanded for more complex genes containing more exons as shown in Table 1.
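The A, B, C example above can be enumerated programmatically. The sketch below lists exon probes plus every junction that joins an upstream exon to a downstream exon while preserving genomic order; the function name is hypothetical, and treating each exon as a single-letter label is a simplification for illustration.

```python
from itertools import combinations


def forward_junctions(exons):
    """Junctions joining each exon to any later exon, preserving genomic order."""
    return [five_prime + three_prime
            for five_prime, three_prime in combinations(exons, 2)]


exons = ["A", "B", "C"]
print("exon probes:    ", exons)
print("junction probes:", forward_junctions(exons))  # AB, AC, BC
```

For a gene with n exons this yields n(n-1)/2 order-preserving junctions, which is how the approach scales to genes with more exons.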
  • exon shuffling may be detected using sequences designed to hybridize with one or more of the exon-exon junctions that result from the shuffled exons.
  • For example, if a gene of interest contained exons A, B, C, and D, the sequences to be arrayed might include one or more sequences designed to detect one or more of the following exon-exon junctions: AB, AC, AD, BC, BD, CD, BA, CA, DA, CB, DB, and DC (a total of 12).
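When exon shuffling is considered, junction order matters, so the junction set is every ordered pair of distinct exons. A minimal sketch (the function name is hypothetical):

```python
from itertools import permutations


def all_ordered_junctions(exons):
    """Every ordered pair of distinct exons, covering shuffled arrangements."""
    return [five_prime + three_prime
            for five_prime, three_prime in permutations(exons, 2)]


junctions = all_ordered_junctions(["A", "B", "C", "D"])
print(len(junctions), junctions)
```

For n exons this gives n(n-1) junctions, which reproduces the total of 12 for the four-exon example.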
  • a combinatorial approach may be used to detect all possible exon-exon junctions resulting from alternative splicing and/or crossover events involving more than one gene unit.
  • This embodiment of the invention will be particularly useful to detect transcriptomes resulting from gene shuffling or chromosomal cross over events in the human genome that are often involved in disease.
  • the sequences to be arrayed might include one or more sequences designed to detect one or more of the following exon-exon junctions: AB, AC, BC, BA, CA, CB (a total of 6 specific for the possible junctions of the first gene), XY, XZ, YZ, YX, ZX, and ZY (a total of 6 specific for the possible junctions of the second gene) and AX, AY, AZ, BX, BY, BZ, CX, CY, CZ, XA, YA, ZA, XB, YB, ZB, XC, YC, and ZC (a total of 18 for possible junctions involving both genes).
  • N = number of exons in gene ABC
  • P = number of exons in gene XYZ.
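The two-gene junction counts can be checked by enumeration. The sketch below is an illustration only: the closed-form counts N(N-1), P(P-1) and 2NP in the comments are derived from this enumeration and are not quoted from the source, whose own expression in N and P is not reproduced here. The function name is hypothetical.

```python
from itertools import permutations, product


def two_gene_junctions(gene1_exons, gene2_exons):
    """Return (within gene 1, within gene 2, between genes) junction lists."""
    within1 = [a + b for a, b in permutations(gene1_exons, 2)]   # N(N-1)
    within2 = [a + b for a, b in permutations(gene2_exons, 2)]   # P(P-1)
    between = [a + x for a, x in product(gene1_exons, gene2_exons)]
    between += [x + a for a, x in product(gene1_exons, gene2_exons)]  # 2NP in total
    return within1, within2, between


w1, w2, btw = two_gene_junctions(["A", "B", "C"], ["X", "Y", "Z"])
print(len(w1), len(w2), len(btw), len(w1) + len(w2) + len(btw))
```

With N = P = 3 the counts are 6, 6 and 18, matching the totals listed above; the overall total equals (N+P)(N+P-1), i.e. the number of ordered pairs drawn from the pooled exons.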
  • the splice variant is detected using at least one oligonucleotide species comprising at least all or a fraction of the exon predicted to be 5' of the splice site and at least about a fraction of or all of the exon predicted to be 3' of the splice site.
  • the oligonucleotides include at least part of the exons that are 5' to the exon that is predicted to be immediately 5' of the splice of interest.
  • the oligonucleotides include at least part of the exons that are 3' to the exon that is predicted to be immediately 3' of the splice of interest.
  • said oligonucleotide comprises about the 3' 1/2, 1/4, or 1/10 of the exon predicted to be 5' of the splice. In particular embodiments, said oligonucleotide comprises about the 5' 1/2, 1/4, or 1/10 of the exon predicted to be 3' of the splice.
  • said oligonucleotide comprises at least about the 3'-terminal 50 nucleotides of the exon predicted to be 5' of the splice. In another particular embodiment, said oligonucleotide comprises at least about the 3'-terminal 30 nucleotides of the exon predicted to be 5' of the splice. In still another particular embodiment, said oligonucleotide comprises at least about the 3'-terminal 25 nucleotides of the exon predicted to be 5' of the splice. In yet another particular embodiment, said oligonucleotide comprises at least about the 3'-terminal 20 nucleotides of the exon predicted to be 5' of the splice.
  • said oligonucleotide comprises at least about the 3'-terminal 15 nucleotides of the exon predicted to be 5' of the splice. In a preferred embodiment, said oligonucleotide comprises at least about the 3'-terminal 12 nucleotides of the exon predicted to be 5' of the splice. In another preferred embodiment, said oligonucleotide comprises at least about the 3'-terminal 10 nucleotides of the exon predicted to be 5' of the splice. In still another preferred embodiment, said oligonucleotide comprises at least about the 3'-terminal 5 nucleotides of the exon predicted to be 5' of the splice.
  • said oligonucleotide comprises at least about the 5'-terminal 5 nucleotides of the exon predicted to be 3' of the splice. In another particular embodiment, said oligonucleotide comprises at least about the 5'-terminal 10 nucleotides of the exon predicted to be 3' of the splice. In still another particular embodiment, said oligonucleotide comprises at least about the 5'-terminal 12 nucleotides of the exon predicted to be 3' of the splice. In yet another particular embodiment, said oligonucleotide comprises at least about the 5'-terminal 15 nucleotides of the exon predicted to be 3' of the splice.
  • said oligonucleotide comprises at least about the 5'-terminal 20 nucleotides of the exon predicted to be 3' of the splice. In yet still another particular embodiment, said oligonucleotide comprises at least about the 5'-terminal 25 nucleotides of the exon predicted to be 3' of the splice. In even still another particular embodiment, said oligonucleotide comprises at least about the 5'-terminal 30 nucleotides of the exon predicted to be 3' of the splice. In another particular embodiment, said oligonucleotide comprises at least about the 5'-terminal 50 nucleotides of the exon predicted to be 3' of the splice.
  • said oligonucleotide may further comprise a deletion of about the 3'-terminal 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 nucleotides of the exon predicted to be 5' of the splice, with the proviso that at least 1 nucleotide (5' to the deletion) of said 5' exon remains, and/or a deletion of about the 5'-terminal 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 nucleotides of the exon predicted to be 3' of the splice, with the proviso that at least 1 nucleotide (3' to the deletion) of said 3' exon remains.
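A junction-spanning sequence of the kind described in the preceding embodiments can be assembled from the two exon sequences as sketched below. The function names, default lengths, and exon sequences are hypothetical, and sequences are written in the DNA alphabet. Whether the arrayed oligonucleotide is the junction sequence itself or its reverse complement depends on the strand of the labeled sample, which the sketch leaves to the caller.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def junction_sequence(exon5, exon3, n5=12, n3=12, del5=0, del3=0):
    """Join the 3'-terminal n5 nt of the 5' exon to the 5'-terminal n3 nt of the 3' exon.

    del5 / del3 optionally delete nucleotides immediately flanking the junction;
    at least 1 nucleotide of each exon must remain, as the text requires.
    """
    if n5 < 1 or n3 < 1 or del5 < 0 or del3 < 0:
        raise ValueError("lengths must be >= 1 and deletions >= 0")
    left = exon5[-n5:]
    right = exon3[:n3]
    if del5 >= len(left) or del3 >= len(right):
        raise ValueError("at least 1 nucleotide of each exon must remain")
    return left[:len(left) - del5] + right[del3:]


exon_a = "ATGGCCAAGTTCGGACTGAA"  # hypothetical 5' exon
exon_b = "GGTCCATTGACCGTAAGCTT"  # hypothetical 3' exon
target = junction_sequence(exon_a, exon_b, n5=5, n3=5)
print(target, reverse_complement(target))
```

Calling `junction_sequence(exon_a, exon_b, n5=5, n3=5, del5=2)` gives the variant in which the two 3'-terminal nucleotides of the 5' exon are deleted.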
  • oligonucleotides having sequences that specifically hybridize to sequences surrounding the predicted exon-exon splice junction may be arrayed.
  • oligonucleotides may be selected such that the sequence of each oligonucleotide overlaps (i.e., has sequence in common with) other oligonucleotides that are arrayed. This is referred to as tiling of the oligonucleotides (see, for example, U.S. Patent No. 5,837,832).
  • This embodiment may be useful in identifying exon-exon splice junctions that are difficult to predict accurately based upon currently available prediction algorithms.
  • the use of tiled oligonucleotides permits the creation of a clustered set that will help capture the splice regions.
  • predictions of splice regions may be made from the genomic DNA using currently available splice junction prediction algorithms. After the predicted sites are identified, a clustered set of oligonucleotides spanning a region around the predicted site may be arrayed.
  • a set of oligonucleotides of length L may be synthesized such that each contains a sequence complementary to a portion of the target sequence.
  • the first oligonucleotide in the set may have a sequence complementary to the target sequence starting at nucleotide X of the target sequence, while the next oligonucleotide in the series may have a sequence complementary to the target sequence starting at nucleotide X + N of the target sequence.
  • N can be any number from 1 to L and is preferably in the range of about 1 to about 15 and most preferably is in the range of about 1 to 5.
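The tiling scheme with length L, start X and step N can be sketched as follows. The function name is hypothetical, offsets are 0-based here for convenience, and the function returns the target windows; the arrayed oligonucleotides would be complementary to these windows.

```python
def tile_windows(target, length, start=0, step=1):
    """Return (offset, window) pairs of `length` nt starting at start, start+step, ...

    `length` corresponds to L, `start` to X and `step` to N in the text,
    with 1 <= N <= L so that consecutive windows overlap or abut.
    """
    if length < 1 or not 1 <= step <= length:
        raise ValueError("require length >= 1 and 1 <= step <= length")
    last_start = len(target) - length
    return [(i, target[i:i + length]) for i in range(start, last_start + 1, step)]


for offset, window in tile_windows("ACGTACGTAC", length=4, start=0, step=3):
    print(offset, window)
```

A small step such as 1 to 5 gives dense overlap around a predicted splice site, at the cost of more probes per site.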
  • the region selected for the clustered set may be from about 1 kb 5' of the predicted splice site to about 1 kb 3' of the predicted splice site in the genomic DNA sequence of the gene.
  • the cluster set may begin with an oligonucleotide comprising at least about 50 nucleotides of the 3' end of the exon predicted to be 5' of the splice site. In another embodiment, the cluster set may begin with an oligonucleotide comprising at least about 30 nucleotides of said 3' end. In still another embodiment, the cluster set may begin with an oligonucleotide comprising at least about 25 nucleotides of said 3' end. In yet another embodiment, the cluster set may begin with an oligonucleotide comprising at least about 20 nucleotides of said 3' end.
  • the cluster set may begin with an oligonucleotide comprising at least about 15 nucleotides of said 3' end. In a preferred embodiment, the cluster set may begin with an oligonucleotide comprising at least about 12 nucleotides of said 3' end. In another preferred embodiment, the cluster set may begin with an oligonucleotide comprising at least about 10 nucleotides of said 3' end. In still another preferred embodiment, the cluster set may begin with an oligonucleotide comprising at least about 5 nucleotides of said 3' end.
  • the cluster set may also include oligonucleotides that extend at least about 5 nucleotides into the 5' end of the exon predicted to be 3' of the splice site. In another embodiment, said oligonucleotides extend at least about 10 nucleotides into the 5' end of the exon predicted to be 3' of the splice site. In still another embodiment, said oligonucleotides extend at least about 12 nucleotides into the 5' end of the exon predicted to be 3' of the splice site. In yet another embodiment, said oligonucleotides extend at least about 15 nucleotides into the 5' end of the exon predicted to be 3' of the splice site.
  • said oligonucleotides extend at least about 20 nucleotides into the 5' end of the exon predicted to be 3' of the splice site. In another embodiment, said oligonucleotides extend at least about 25 nucleotides into the 5' end of the exon predicted to be 3' of the splice site. In another embodiment, said oligonucleotides extend at least about 30 nucleotides into the 5' end of the exon predicted to be 3' of the splice site. In another embodiment, said oligonucleotides extend at least about 50 nucleotides into the 5' end of the exon predicted to be 3' of the splice site.
  • Said oligonucleotides of said cluster set may all be of the same length or of different lengths, may begin with the same nucleotide of the exon 5' of the splice or may begin with different nucleotides of said 5' exon, and may end with the same nucleotide of the exon 3' of the splice or may end with different nucleotides of said 3' exon.
  • said oligonucleotide may further comprise a deletion of about the 3'-terminal 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 nucleotides of the exon predicted to be 5' of the splice, with the proviso that at least 1 nucleotide (5' to the deletion) of said 5' exon remains, and/or a deletion of about the 5'-terminal 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 nucleotides of the exon predicted to be 3' of the splice, with the proviso that at least 1 nucleotide (3' to the deletion) of said 3' exon remains.
  • the microarrays of the present invention may incorporate one or more oligonucleotides having sequences that are not predicted to be complementary to a splice junction sequence.
  • one or more oligonucleotides might be arrayed that have sequences predicted to be complementary to a sequence in a particular exon.
  • Oligonucleotides of this type may be designed so as to be complementary to a sequence that is entirely within the target exon, i.e., does not extend into any splice junction sequences.
  • oligonucleotides of this type may contain sequences predicted to be complementary to all or a portion of the splice junction as well as to a portion or all of the exon.
  • oligonucleotides may be arrayed that contain sequences predicted to be complementary to a sequence present in an intron. Oligonucleotides of this type may be designed to be complementary to a sequence entirely contained within the intron. Alternatively, oligonucleotides of this type may be complementary to all or a portion of the intron and to all or a portion of a predicted splice junction sequence.
  • oligonucleotides may be arrayed that contain a sequence designed to be complementary to all or a portion of an exon, all or a portion of a splice junction and all or a portion of an intron. Such oligonucleotides may span a genomic sequence that includes a predicted splice site as well as all or a portion of the exon and the intron that surround the splice site.
  • oligonucleotides specific for exon sequences may be arrayed along with oligonucleotides specific for splice junction sequences. Arrays of this type will provide detailed information concerning the composition of the various alternative spliced mRNAs that may be generated in a particular transcriptome.
  • an array of the present invention may comprise oligonucleotides such as those described above (complementary to a splice junction, exon, intron and/or combinations thereof) that are designed to be complementary to an individual gene.
  • a single array may contain oligonucleotides for a number of individual genes.
  • arrays may be designed to detect shuffled exons as described above.
  • Such arrays may include oligonucleotides designed to be complementary to exons, introns and/or splice junctions from two or more different genes.
  • the present invention provides materials and methods to identify those genes that express multiple splice variants and to identify which of the theoretically possible splice variants are actually expressed in any given tissue.
  • One of skill in the art can select one or more of the genes identified as having splice variants and use the information and methods provided herein to interrogate or test a particular sample. For a particular interrogation of two conditions or tissue sources, it is desirable to select those genes that display a difference in the presence and/or amount of splice variants produced between the two conditions or sources. These differences may be in the amount of a particular splice variant in one sample versus another or in the distribution of splice variants in one sample versus another.
  • Splice variants also include the products of the fusion of exons from at least two different genes.
  • Said different genes may originate on the same chromosome and be adjacent or in close proximity to one another.
  • Said different genes may alternatively originate on the same or different chromosomes and their exons may be brought into proximity with one another by, for example, at least one of a translocation, crossover, chiasma, deletion, insertion, substitution, inversion, intrachromosomal rearrangement, intrachange, or recombination event.
  • One example of the use of the materials and methods of the present invention to predict disease states is in the diagnosis of those diseases described in the background section.
  • Other disease states include, but are not limited to, a number of carcinomas, sarcomas, leukemias, lymphomas, pancreatitis and polycystic kidney disease.
  • a tissue sample or other sample from a patient may be assayed by any of the methods described herein or otherwise known to those skilled in the art, and the presence and/or level of expression of one or more splice variants of one or more genes of interest may be compared to that of normal cells and/or cells derived from a disease tissue sample in order to determine whether a given sample contains disease tissue. Comparison of the data may be done with the aid of a computer and databases as described herein.
  • the presence of a particular splice variant and/or level of expression of one or more splice variants may also be used as markers for the monitoring of disease progression; for instance, the amount of the splice variant of CD44 associated with tumor progression may be determined.
  • a tissue sample or other sample from a patient may be assayed by any of the methods known to those of skill in the art, and the presence and/or amount of one or more splice variants of one or more genes may be determined in the sample and may be compared to those found in normal tissue, tissue from a diseased individual or both. Comparison of the data may be done by a researcher or diagnostician or may be done with the aid of a computer and databases as described herein.
  • Potential agents can be screened to determine if application of the agent alters the splice variant profile of one or more genes. This may be useful, for example, in determining whether a particular drug is effective in treating a particular patient with a disease, for example a tumor. In the case where the potential agent affects the splice variant profile such that the profile returns to normal or is altered to be more like normal, the agent is indicated in the treatment of the disease. Similarly, an agent that induces the expression of a splice variant profile that is similar to that expressed in a disease state may be contraindicated.
  • a gene identified as having one or more alternative splice variants may be used as the basis of an assay to evaluate the effects of a candidate drug or agent on a cell, for example on a diseased cell.
  • a coding sequence which is the product of alternative splice variants of at least two different genes may be used as the basis of an assay to evaluate the effects of a candidate drug or agent on a cell, for example on a diseased cell.
  • Said different genes include genes which originate on the same chromosome or on different chromosomes.
  • a candidate drug or agent can be screened for the ability to modulate the production of one or more alternatively spliced mRNA molecules or the proteins translated from them.
  • Assays to monitor the expression of one or more splice variants may utilize any available means of monitoring for changes in the expression level of the nucleic acids of the invention.
  • an agent is said to modulate the expression of a nucleic acid of the invention if it is capable of up- or down-regulating expression of the nucleic acid in a cell.
  • Agents that are assayed in the above methods can be randomly selected or rationally selected or designed.
  • an agent is said to be randomly selected when the agent is chosen randomly without considering the specific sequences involved in the association of a protein of the invention alone or with its associated substrates, binding partners, etc.
  • An example of randomly selected agents is the use of a chemical library or a peptide combinatorial library, or a growth broth of an organism.
  • an agent is said to be rationally selected or designed when the agent is chosen on a nonrandom basis which takes into account the sequence of the target site and/or its conformation in connection with the agent's action.
  • Agents can be rationally selected or rationally designed by utilizing the peptide sequences that make up these sites.
  • a rationally selected peptide agent can be a peptide whose amino acid sequence is identical to or a derivative of any functional consensus site.
  • the agents of the present invention can be, as examples, peptides, small molecules, vitamin derivatives, as well as carbohydrates, lipids, oligonucleotides and covalent and non-covalent combinations thereof.
  • Dominant negative proteins, DNA encoding these proteins, antibodies to these proteins, peptide fragments of these proteins or mimics of these proteins may be introduced into cells to affect function.
  • "Mimic" as used herein refers to the modification of a region or several regions of a peptide molecule to provide a structure chemically different from the parent peptide but topographically and functionally similar to the parent peptide (see Grant, (1995) in Molecular Biology and Biotechnology Meyers (editor) VCH Publishers). A skilled artisan can readily recognize that there is no limit as to the structural nature of the agents of the present invention.
  • agents that up- or down-regulate or modulate the production of one or more splice variants, thereby altering the splice variant profile, may be used to modulate biological and pathologic processes associated with one or more of the splice variants affected.
  • a subject can be any mammal, so long as the mammal is in need of modulation of a pathological or biological process mediated by a protein of the invention.
  • the term "mammal" is defined as an individual belonging to the class Mammalia. The invention is particularly useful in the treatment of human subjects.
  • Pathological processes refer to a category of biological processes that produce a deleterious effect.
  • expression of a particular splice variant may be associated with a disease or other pathological condition.
  • an agent is said to modulate a pathological process when the agent reduces the degree or severity of the process. For instance, tumor progression may be prevented or slowed by the administration of agents which up- or down-regulate or modulate in some way the production of splice variants of CD44.
  • agents of the present invention can be provided alone, or in combination with other agents that modulate a particular pathological process.
  • an agent of the present invention can be administered in combination with other known drugs.
  • two agents are said to be administered in combination when the two agents are administered simultaneously or are administered independently in a fashion such that the agents will act at the same time.
  • the agents of the present invention can be administered via parenteral, subcutaneous, intravenous, intramuscular, intraperitoneal, transdermal, or buccal routes. Alternatively, or concurrently, administration may be by the oral route.
  • the dosage administered will be dependent upon the age, health, and weight of the recipient, kind of concurrent treatment, if any, frequency of treatment, and the nature of the effect desired.
  • the present invention further provides compositions containing one or more agents that modulate the splice variant profile of one or more genes. While individual needs vary, determination of optimal ranges of effective amounts of each component is within the skill of the art.
  • Typical dosages comprise 0.1 to 100 μg/kg body wt.
  • the preferred dosages comprise 0.1 to 10 μg/kg body wt.
  • the most preferred dosages comprise 0.1 to 1 μg/kg body wt.
  • compositions of the present invention may contain suitable pharmaceutically acceptable carriers comprising excipients and auxiliaries that facilitate processing of the active compounds into preparations which can be used pharmaceutically for delivery to the site of action.
  • suitable formulations for parenteral administration include aqueous solutions of the active compounds in water-soluble form, for example, water-soluble salts.
  • suspensions of the active compounds as appropriate oily injection suspensions may be administered.
  • Suitable lipophilic solvents or vehicles include fatty oils, for example, sesame oil, or synthetic fatty acid esters, for example, ethyl oleate or triglycerides.
  • Aqueous injection suspensions may contain substances that increase the viscosity of the suspension including, for example, sodium carboxymethyl cellulose, sorbitol, and/or dextran.
  • the suspension may also contain stabilizers. Liposomes can also be used to encapsulate the agent for delivery into the cell.
  • the pharmaceutical formulation for systemic administration according to the invention may be formulated for enteral, parenteral or topical administration. Indeed, all three types of formulations may be used simultaneously to achieve systemic administration of the active ingredient.
  • Suitable formulations for oral administration include hard or soft gelatin capsules, pills, tablets, including coated tablets, elixirs, suspensions, syrups or inhalations and controlled release forms thereof.
  • the compounds of this invention may be used alone or in combination, or in combination with other therapeutic or diagnostic agents.
  • the compounds of this invention may be coadministered along with other compounds typically prescribed for these conditions according to generally accepted medical practice.
  • the compounds of this invention can be utilized in vivo, ordinarily in mammals, such as humans, sheep, horses, cattle, pigs, dogs, cats, rats and mice, or in vitro.
  • the materials and methods of the present invention may be used to diagnose disease states and/or their progression.
  • One means of diagnosing diseases using the materials and methods of the present invention involves obtaining disease tissue from living subjects. Such tissue samples may be obtained by any conventional means, for example, by biopsy. When possible, urine, blood or peripheral lymphocyte samples may be used as the tissue sample in the assay.
  • the materials and methods of the present invention may be used to determine the splice variant profile of one or more genes in forensic/pathology specimens.
  • nucleic acid assays may be carried out by any means of conducting a transcriptional profiling analysis.
  • forensic methods of the invention may target the proteins of the invention, particularly proteins produced from an alternative splice variant.
  • Methods of the invention may involve treatment of tissues with collagenases or other proteases to make the tissue amenable to cell lysis (Semenov DE et al, (1987) Biull Eksp Biol Med 104:113-116). Further, it is possible to obtain biopsy samples from different regions of a target tissue for analysis.
  • Assays to detect nucleic acid or protein molecules of the invention may be in any available format.
  • Typical assays for nucleic acid molecules include hybridization or PCR based formats.
  • Typical assays for the detection of proteins, polypeptides or peptides of the invention include the use of antibody probes in any available format such as in situ binding assays, etc. See Harlow & Lane, (1988) Antibodies - A Laboratory Manual, Cold Spring Harbor Laboratory Press. In preferred embodiments, assays are carried out with appropriate controls.
  • genes identified as undergoing alternative splicing may be used in a variety of nucleic acid detection assays to detect or quantify the expression level of one or more splice variants in a given sample.
  • For example, traditional Northern blotting, nuclease protection, RT-PCR and differential display methods may be used for detecting splice variant expression levels.
  • the protein products of the alternative splice variants identified using the materials and methods of the present invention can also be assayed to determine the amount of expression.
  • Methods for assaying for a protein include Western blot, immunoprecipitation, and radioimmunoassay. It is preferred, however, that the mRNA be assayed as an indication of expression.
  • Methods for assaying for mRNA include northern blots, slot blots, dot blots, and hybridization to an ordered array of oligonucleotides. Any method for specifically and quantitatively measuring a specific protein or mRNA or DNA product can be used. However, methods and assays of the invention are most efficiently designed with array or chip hybridization-based methods for detecting the splice variant profile of a large number of genes.
  • any hybridization assay format may be used to detect these variants in a sample of interest.
  • Such formats include solution-based and solid support-based assay formats.
  • a preferred solid support is a high-density array also known as a DNA chip or a gene chip.
  • gene chips containing probes to at least one predicted splice junction may be used to directly monitor or detect changes in splice variant profile in a treated or exposed cell as described herein.
  • Additional assay formats may be used to monitor the ability of the agent to modulate the expression of a splice variant. For instance, as described above, mRNA expression may be monitored directly by hybridization of probes to the nucleic acids of the invention. Cell lines are exposed to an agent to be tested under appropriate conditions and time, and total RNA or mRNA is isolated by standard procedures such as those disclosed in Sambrook et al, (1989) Molecular Cloning - A Laboratory Manual, Cold Spring Harbor Laboratory Press. In some embodiments, it may be desirable to amplify one or more of the RNA molecules isolated prior to application of the RNA to the gene chip.
  • the RNA may be reverse transcribed and amplified in the form of DNA or may be reverse transcribed into DNA and the DNA used as a template for transcription to generate recombinant RNA. Any method that results in the production of a sufficient quantity of nucleic acid to be hybridized effectively to the gene chip may be used.
  • cell lines that contain reporter gene fusions between the alternative splice variants (and, optionally, their 3' and/or 5' regulatory regions) and any assayable fusion partner may be prepared.
  • Numerous assayable fusion partners are known and readily available including the firefly luciferase gene and the gene encoding chloramphenicol acetyltransferase (Alam et al, (1990) Anal. Biochem. 188, 245-254).
  • Cell lines containing the reporter gene fusions are then exposed to the agent to be tested under appropriate conditions and time. Differential expression of the reporter gene between samples exposed to the agent and control samples identifies agents that modulate the expression of the nucleic acid.
  • cells or cell lines are first identified which express one or more of the splice variants of the invention physiologically.
  • Cells and/or cell lines so identified would preferably comprise the necessary cellular machinery to ensure that the transcriptional and/or translational apparatus of the cells would faithfully mimic the response of normal or diseased tissue to an exogenous agent.
  • Such machinery would likely include appropriate surface transduction mechanisms and/or cytosolic factors.
  • the cells and/or cell lines may then be contacted with an agent and the expression of one or more of the splice variants of interest may then be assayed.
  • the splice variants may be assayed at the mRNA level and/or at the protein level.
  • such cells or cell lines may be transduced or transfected with an expression vehicle (e.g., a plasmid or viral vector) containing an expression construct comprising an operable 5'-promoter containing end of a gene having a splice variant of interest identified using the materials and methods of the invention fused to one or more nucleic acid sequences encoding one or more antigenic fragments.
  • the construct may comprise all or a portion of the coding sequence of one or more exons of the splice variant of interest that may be positioned 5'- or 3'- to a sequence encoding an antigenic fragment.
  • the coding sequence of one or more of the exons of the splice variant may be translated or un-translated after transcription of the gene fusion.
  • At least one antigenic fragment may be translated.
  • the antigenic fragments are selected so that the fragments are under the transcriptional control of the promoter of the splice variant of interest and are expressed in a fashion substantially similar to the expression pattern of the gene of interest.
  • the antigenic fragments may be expressed as polypeptides whose molecular weight can be distinguished from the naturally occurring polypeptides.
  • gene products of the invention may further comprise an immunologically distinct tag. Such a process is well known in the art (see Sambrook et al, (1989) Molecular Cloning - A Laboratory Manual, Cold Spring Harbor Laboratory Press).
  • an agent may comprise a pharmaceutically acceptable excipient and is contacted with cells held in an aqueous physiological buffer such as phosphate buffered saline (PBS) at physiological pH, Eagle's balanced salt solution (BSS) at physiological pH, PBS or BSS comprising serum, or conditioned media comprising PBS or BSS and serum, incubated at 37°C.
  • the conditions may be modulated as deemed necessary by one of skill in the art.
  • the cells may be disrupted and the polypeptides of the lysate may be fractionated such that a polypeptide fraction is pooled and contacted with an antibody to be further processed by immunological assay (e.g., ELISA, immunoprecipitation or Western blot).
  • the pool of proteins isolated from the "agent-contacted" sample will be compared with a control sample where only the excipient is contacted with the cells, and an increase or decrease in the immunologically generated signal from the "agent-contacted" sample compared to the control will be used to distinguish the effectiveness of the agent.
  • Another embodiment of the present invention provides methods for identifying agents that modulate the levels, concentration or at least one activity of a protein(s) encoded by a splice variant of interest identified using the materials and methods of the present invention. Such methods or assays may utilize any means of monitoring or detecting the desired activity.
  • the relative amounts of a protein translated from a splice variant of the invention produced in a cell population that has been exposed to the agent to be tested may be compared to the amount produced in an un-exposed control cell population.
  • probes such as specific antibodies are used to monitor the differential expression of the protein in the different cell populations.
  • Cell lines or populations are exposed to the agent to be tested under appropriate conditions and time.
  • Cellular lysates may be prepared from the exposed cell line or population and a control, unexposed cell line or population. The cellular lysates are then analyzed with the probe, such as a specific antibody.
  • Probe design
  • Probes based on the sequences of splice variants to be detected may be prepared by any commonly available method. Oligonucleotide probes for assaying a tissue or cell sample are preferably of sufficient length to specifically hybridize only to appropriate, complementary transcripts. Typically the oligonucleotide probes will be at least about 10, 12, 14, 16, 18, 20 or 25 nucleotides in length. In some cases longer probes of at least about 30, 40, 50, 60, 70, 80, 90 or 100 or more nucleotides will be desirable.
  • the high-density array will typically include a number of probes that specifically hybridize to one or more splice junctions of interest.
  • the arrays may further comprise other sequences specific for various parts of the gene of interest, for example, intron or exon specific sequences. See WO 99/32660 for methods of producing probes for a given gene or genes.
  • the array will include one or more control probes.
  • Test probes may be oligonucleotides that range from about 5 to about 500, preferably about 10 to about 100 nucleotides, more preferably from about 40 to about 80 nucleotides and most preferably from about 50 to about 70 nucleotides in length. In other particularly preferred embodiments, the probes are about 60 nucleotides in length. In another preferred embodiment, test probes are double or single strand DNA sequences. DNA sequences may be isolated or cloned from natural sources or amplified from natural sources using natural nucleic acid as templates. These probes have sequences complementary to particular subsequences of the splice variant that they are designed to detect. Thus, the test probes are capable of specifically hybridizing to the target nucleic acid they are to detect.
  • the high-density array can contain a number of control probes.
  • the control probes fall into three categories referred to herein as (1) normalization controls; (2) expression level controls; and (3) mismatch controls.
  • Normalization controls are oligonucleotide or other nucleic acid probes that are complementary to labeled reference oligonucleotides or other nucleic acid sequences that are added to the nucleic acid sample.
  • the signals obtained from the normalization controls after hybridization provide a control for variations in hybridization conditions, label intensity, "reading" efficiency and other factors that may cause the signal of a perfect hybridization to vary between arrays.
  • signals (e.g., fluorescence intensity) read from all other probes in the array are divided by the signal (e.g., fluorescence intensity) from the control probes thereby normalizing the measurements.
  • any probe may serve as a normalization control.
  • Preferred normalization probes are selected to reflect the average length of the other probes present in the array; however, they can be selected to cover a range of lengths.
  • the normalization control(s) can also be selected to reflect the (average) base composition of the other probes in the array; however, in a preferred embodiment, only one or a few probes are used and they are selected such that they hybridize well (i.e., no secondary structure) and do not match any target-specific probes.
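The normalization step described above (dividing every other probe signal by the signal from the control probes) can be sketched as follows. This is an illustration only: averaging several control probes into one divisor is an assumption, and the name `normalize` and the probe identifiers are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def normalize(signals: Dict[str, float], control_ids: List[str]) -> Dict[str, float]:
    """Divide each non-control probe signal by the mean control signal.

    signals     : raw intensity (e.g., fluorescence) per probe identifier.
    control_ids : identifiers of the normalization control probes.
    """
    control = mean(signals[c] for c in control_ids)  # assumption: simple mean
    if control <= 0:
        raise ValueError("control signal must be positive")
    return {pid: value / control
            for pid, value in signals.items()
            if pid not in control_ids}
```

Because each array is divided by its own control signal, the resulting values can be compared between arrays hybridized under slightly different conditions.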
  • Expression level controls are probes that hybridize specifically with constitutively expressed genes in the biological sample. Virtually any constitutively expressed gene provides a suitable target for expression level controls. Typical expression level control probes have sequences complementary to subsequences of constitutively expressed housekeeping genes including, but not limited to, the β-actin gene, the transferrin receptor gene, the GAPDH gene, and the like.
  • Mismatch controls may also be provided for the probes to the target splice variants, for expression level controls or for normalization controls.
  • Mismatch controls are oligonucleotide probes or other nucleic acid probes identical to their corresponding test or control probes except for the presence of one or more mismatched bases.
  • a mismatched base is a base selected so that it is not complementary to the corresponding base in the target sequence to which the probe would otherwise specifically hybridize.
  • One or more mismatches are selected such that under appropriate hybridization conditions (e.g., stringent conditions) the test or control probe would be expected to hybridize with its target sequence, but the mismatch probe would not hybridize (or would hybridize to a significantly lesser extent).
  • Preferred mismatch probes contain a central mismatch.
  • a corresponding mismatch probe may have the identical sequence except for a single base mismatch (e.g., substituting a G, a C or a T for an A) at any of positions 6 through 14 (the central mismatch).
  • Mismatch probes thus provide a control for non-specific binding or cross hybridization to a nucleic acid in the sample other than the target to which the probe is directed. Mismatch probes also indicate whether hybridization is specific or not. For example, if the target is present the perfect match probes should be consistently brighter than the mismatch probes. In addition, if all central mismatches are present, the mismatch probes can be used to detect a mutation. The difference in intensity between the perfect match and the mismatch probe (I(PM) - I(MM)) provides a good measure of the concentration of the hybridized material.
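A minimal sketch of a central mismatch control and of the perfect-match minus mismatch difference follows. The substitution table (each base replaced by its complement) is only one of the substitutions the text allows and is an assumption, as are the names `central_mismatch` and `pm_minus_mm`.

```python
from typing import Optional

# Assumed substitution: each base is swapped for its complement, so the
# mismatch probe can no longer pair with the target at that position.
SWAP = {"A": "T", "T": "A", "G": "C", "C": "G"}


def central_mismatch(probe: str, position: Optional[int] = None) -> str:
    """Return a copy of `probe` with one substituted base.

    position is 1-based; by default the middle base is changed.
    """
    pos = position if position is not None else (len(probe) + 1) // 2
    if not 1 <= pos <= len(probe):
        raise ValueError("position is outside the probe")
    i = pos - 1
    return probe[:i] + SWAP[probe[i]] + probe[i + 1:]


def pm_minus_mm(i_pm: float, i_mm: float) -> float:
    """Intensity difference between perfect match and mismatch probes."""
    return i_pm - i_mm
```

A positive difference that is consistent across probe pairs suggests specific hybridization; a difference near zero suggests cross hybridization.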
  • nucleic acid samples used in the methods and assays of the invention may be prepared by any available method or process. Methods of isolating total mRNA are also well known to those of skill in the art. For example, methods of isolation and purification of nucleic acids are described in detail in
  • Such samples include RNA samples, but also include cDNA synthesized from a mRNA sample isolated from a cell or tissue of interest. Such samples also include DNA amplified from the cDNA, and RNA transcribed from the amplified
  • Biological samples may be of any biological tissue or fluid or cells from any organism as well as cells raised in vitro, such as cell lines and tissue culture cells. Frequently the sample will be a "clinical sample" which is a sample derived from a patient. Typical clinical samples include, but are not limited to, sputum, blood, blood cells (e.g., white cells), tissue or fine needle biopsy samples, urine, peritoneal fluid, and pleural fluid, or cells therefrom.
  • Biological samples may also include sections of tissues, such as frozen sections or formalin fixed sections taken for histological purposes.
  • Solid supports containing oligonucleotide probes for use in the present invention can be any solid or semisolid support material known to those skilled in the art. Suitable examples include, but are not limited to, membranes, filters, tissue culture dishes, polyvinyl chloride dishes, beads, test strips, silicon or glass based chips and the like. Suitable glass wafers and hybridization methods are widely available, for example, those disclosed by Beattie (WO 95/11755). Any solid surface to which oligonucleotides can be bound, either directly or indirectly, either covalently or non-covalently, can be used. In some embodiments, it may be desirable to attach some oligonucleotides covalently and others non-covalently to the same solid support.
  • a preferred solid support is a high-density array or DNA chip. These contain a particular oligonucleotide probe in a predetermined location on the array. Each predetermined location may contain more than one molecule of the probe, but each molecule within the predetermined location has an identical sequence. Such predetermined locations are termed features. There may be, for example, from 2, 10, 100, 1000 to 10,000, 100,000 or 400,000 of such features on a single solid support. The solid support or the area within which the probes are attached may be on the order of a square centimeter. Oligonucleotide probe arrays for expression monitoring can be made and used according to any technique known in the art (see for example, Lockhart et al, Nat. Biotechnol.
  • Such probe arrays may contain at least two or more oligonucleotides that are complementary to or hybridize to all or a portion of a predicted splice junction.
  • Such arrays may also contain oligonucleotides that are complementary or hybridize to at least 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 50, 70 or more predicted splice junction sequences.
  • Oligonucleotide arrays are particularly useful for creating splice variant expression profiles comparing disease tissue to adjacent normal tissue.
  • oligonucleotide arrays of the invention will enable the determination of the expression levels of numerous splice variants simultaneously. From this mass of expression data, differentially expressed splice variants may be identified using Fold Change and Gene Signature Differential analysis.
  • Gene Signature Differential analysis is a method designed to detect mRNAs (i.e., splice variants) present in one sample set and absent in another. mRNAs with differential expression in disease tissue versus normal tissue may be better diagnostic and therapeutic targets than those that do not change in expression.
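The two comparisons named above can be illustrated with a simple sketch. It is not the specific Fold Change or Gene Signature Differential procedure referred to in the text, whose details are not given here; the ratio of means, the detection `threshold`, and the function names are all assumptions made for illustration.

```python
from statistics import mean
from typing import Sequence


def fold_change(disease: Sequence[float], normal: Sequence[float]) -> float:
    """Ratio of mean signal in disease samples to mean signal in normal samples."""
    baseline = mean(normal)
    if baseline <= 0:
        raise ValueError("normal baseline must be positive")
    return mean(disease) / baseline


def present_only_in(set_a: Sequence[float], set_b: Sequence[float],
                    threshold: float) -> bool:
    """True when a splice variant is detected in every sample of set_a
    (signal at or above threshold) and in no sample of set_b."""
    return (all(x >= threshold for x in set_a)
            and all(x < threshold for x in set_b))
```

Applied per splice junction probe, the first function ranks variants by how strongly they change, while the second flags variants that appear in one sample set and not in the other.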
  • oligonucleotide analogue arrays can be synthesized on a solid substrate by a variety of methods, including, but not limited to, light-directed chemical coupling and mechanically directed coupling (see Pirrung et al, (1992) U.S. Patent No. 5,143,854; Fodor et al, (1998) U.S. Patent No. 5,800,992; Chee et al, (1998) U.S. Patent No. 5,837,832).
  • a glass surface is derivatized with a silane reagent containing a functional group, e.g., a hydroxyl or amine group blocked by a photolabile protecting group.
  • Photolysis through a photolithographic mask is used selectively to expose functional groups that are then ready to react with incoming 5' photoprotected nucleoside phosphoramidites.
  • the phosphoramidites react only with those sites which are illuminated (and thus exposed by removal of the photolabile blocking group).
  • the phosphoramidites only add to those areas selectively exposed from the preceding step. These steps are repeated until the desired array of sequences has been synthesized on the solid surface. Combinatorial synthesis of different oligonucleotide analogues at different locations on the array is determined by the pattern of illumination during synthesis and the order of addition of coupling reagents.
  • High-density nucleic acid arrays can also be fabricated by depositing premade or natural nucleic acids in predetermined positions. Synthesized or natural nucleic acids are deposited on specific locations of a substrate by light directed targeting and oligonucleotide directed targeting. Another embodiment uses a dispenser that moves from region to region to deposit nucleic acids in specific spots.
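
The light-directed synthesis described above, in which the pattern of illumination and the order of coupling reagents determine which sequence grows at each location, can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function names `plan_masks` and `synthesize`, the site coordinates, and the fixed A, C, G, T reagent order are hypothetical and do not come from the source, and each string is simply read in order of base addition without modelling strand chemistry.

```python
from typing import Dict, List, Set, Tuple

Site = Tuple[int, int]  # (row, column) of a feature on the array (hypothetical layout)


def plan_masks(targets: Dict[Site, str], base_order: str = "ACGT") -> List[Tuple[str, Set[Site]]]:
    """Plan (base, illuminated sites) steps that build every target sequence.

    In each step only the illuminated sites are deprotected, so only they
    receive the base that is coupled next.
    """
    for seq in targets.values():
        if any(base not in base_order for base in seq):
            raise ValueError(f"sequence {seq!r} contains a base outside {base_order!r}")

    progress = {site: 0 for site in targets}  # number of bases already added per site
    steps: List[Tuple[str, Set[Site]]] = []
    while any(progress[site] < len(seq) for site, seq in targets.items()):
        for base in base_order:
            # Illuminate exactly the sites whose next required base is `base`.
            mask = {
                site
                for site, seq in targets.items()
                if progress[site] < len(seq) and seq[progress[site]] == base
            }
            if mask:
                steps.append((base, mask))
                for site in mask:
                    progress[site] += 1
    return steps


def synthesize(steps: List[Tuple[str, Set[Site]]], sites) -> Dict[Site, str]:
    """Replay the planned steps and return the sequence grown at each site."""
    grown = {site: "" for site in sites}
    for base, mask in steps:
        for site in mask:
            grown[site] += base
    return grown


if __name__ == "__main__":
    demo_targets = {(0, 0): "ACG", (0, 1): "AGG", (1, 0): "TTA"}
    for base, mask in plan_masks(demo_targets):
        print(base, sorted(mask))
```

Replaying the plan with `synthesize` reproduces the target sequences, which shows how different oligonucleotides arise at different locations from a shared series of coupling steps.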
  • Hybridization
  • Nucleic acid hybridization simply involves contacting a probe and target nucleic acid under conditions where the probe and its complementary target can form stable hybrid duplexes through complementary base pairing (see Lockhart et al., (1999) WO 99/32660). The nucleic acids that do not form hybrid duplexes are then washed away, leaving the hybridized nucleic acids to be detected, typically through detection of an attached detectable label. It is generally recognized that nucleic acids are denatured by increasing the temperature or decreasing the salt concentration of the buffer containing the nucleic acids.
  • hybrid duplexes e.g., DNA-DNA, RNA-RNA or RNA-DNA
  • hybridization conditions may be selected to provide any degree of stringency.
  • hybridization is performed at low stringency, in this case in 6x SSPE-T at 37°C (0.005% Triton X-100), to ensure hybridization, and subsequent washes are then performed at higher stringency (e.g., 1x SSPE-T at 37°C) to eliminate mismatched hybrid duplexes. Successive washes may be performed at increasingly higher stringency (e.g., down to as low as 0.25x SSPE-T at 37°C to 50°C) until a desired level of hybridization specificity is obtained. Stringency can also be increased by addition of agents such as formamide. Hybridization specificity may be evaluated by comparison of hybridization to the test probes with hybridization to the various controls that can be present (e.g., expression level control, normalization control, mismatch controls, etc.).
  • the wash is performed at the highest stringency that produces consistent results and that provides a signal intensity greater than approximately 10% of the background intensity.
  • the hybridized array may be washed at successively higher stringency solutions and read between each wash. Analysis of the data sets thus produced will reveal a wash stringency above which the hybridization pattern is not appreciably altered and which provides adequate signal for the particular oligonucleotide probes of interest.
  • the hybridized nucleic acids are typically detected by detecting one or more labels attached to the sample nucleic acids.
  • the labels may be incorporated by any of a number of means well known to those of skill in the art (see Lockhart et al., (1999) WO 99/32660).
  • the present invention includes relational databases containing sequence information, for instance for one or more splice variants, as well as expression level information in various normal and/or disease tissue samples.
  • Databases may also contain information associated with a given sequence or tissue sample such as descriptive information about the gene associated with the sequence information, or descriptive information concerning the clinical status of the tissue sample, or the patient from which the sample was derived.
  • the database may be designed to include different parts, for instance a sequences database and an expression level database. Methods for the configuration and construction of such databases are widely available; for instance, see Akerblom et al., (1999) U.S. Patent 5,953,727, which is specifically incorporated herein by reference in its entirety.
  • the databases of the invention may be linked to an outside or external database.
  • the external database is GenBank and the associated databases maintained by the National Center for Biotechnology Information (NCBI).
  • Any appropriate computer platform may be used to perform the necessary comparisons between sequence information, expression level information and any other information in the database or provided as an input.
  • a large number of computer workstations are available from a variety of manufacturers, such as those available from Silicon Graphics.
  • Client-server environments, database servers and networks are also widely available and appropriate platforms for the databases of the invention.
  • the databases of the invention may be used to produce, among other things, electronic Northerns to allow the user to determine the cell type or tissue in which one or more given splice variants are expressed and to allow determination of the abundance or expression level of one or more given splice variants in a particular tissue or cell.
  • the databases of the invention may also be used to present information identifying the expression level in a tissue or cell of a set of splice variants for a gene, i.e., a transcriptome. Such presentation may comprise comparing the expression level of at least one splice variant in the tissue to the level of expression of the splice variant in the database.
  • Such methods may be used to predict the physiological state of a given tissue by comparing the level of expression of one or more splice variants from one or more genes from a sample to the expression levels found in normal tissue and/or disease tissue. Such methods may also be used in the drug or agent screening assays as described herein.
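
As a rough illustration of the database passages above, the following sketch builds a two-part relational database (a sequences table and an expression level table), runs an electronic-Northern style lookup, and compares disease to normal expression as a fold change. Everything specific here is an assumption made for illustration: the table and column names, the helper functions, the variant identifiers, and the numeric values are hypothetical and are not data from the source. Python's standard `sqlite3` module stands in for whatever database system would actually be used.

```python
import sqlite3
from typing import List, Optional, Tuple


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a sequences part and an expression part."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE splice_variant (
            variant_id        TEXT PRIMARY KEY,
            gene              TEXT NOT NULL,
            junction_sequence TEXT NOT NULL
        );
        CREATE TABLE expression_level (
            variant_id TEXT NOT NULL REFERENCES splice_variant(variant_id),
            tissue     TEXT NOT NULL,
            status     TEXT NOT NULL CHECK (status IN ('normal', 'disease')),
            level      REAL NOT NULL
        );
        """
    )
    # Made-up identifiers, sequences and expression values, for illustration only.
    conn.executemany(
        "INSERT INTO splice_variant VALUES (?, ?, ?)",
        [
            ("GENE1_e2-e3", "GENE1", "ACGTACGTAC"),
            ("GENE1_e2-e4", "GENE1", "ACGTATTGCA"),
        ],
    )
    conn.executemany(
        "INSERT INTO expression_level VALUES (?, ?, ?, ?)",
        [
            ("GENE1_e2-e3", "colon", "normal", 100.0),
            ("GENE1_e2-e3", "colon", "normal", 140.0),
            ("GENE1_e2-e3", "colon", "disease", 480.0),
            ("GENE1_e2-e4", "colon", "normal", 50.0),
            ("GENE1_e2-e4", "colon", "disease", 50.0),
        ],
    )
    return conn


def electronic_northern(conn: sqlite3.Connection, variant_id: str) -> List[Tuple[str, str, float]]:
    """Return (tissue, status, mean level) rows showing where a variant is expressed."""
    return conn.execute(
        "SELECT tissue, status, AVG(level) FROM expression_level "
        "WHERE variant_id = ? GROUP BY tissue, status ORDER BY tissue, status",
        (variant_id,),
    ).fetchall()


def fold_change(conn: sqlite3.Connection, variant_id: str, tissue: str) -> Optional[float]:
    """Mean disease level divided by mean normal level, or None if not computable."""
    means = dict(
        conn.execute(
            "SELECT status, AVG(level) FROM expression_level "
            "WHERE variant_id = ? AND tissue = ? GROUP BY status",
            (variant_id, tissue),
        ).fetchall()
    )
    if "normal" not in means or "disease" not in means or means["normal"] == 0:
        return None
    return means["disease"] / means["normal"]


if __name__ == "__main__":
    db = build_demo_db()
    print(electronic_northern(db, "GENE1_e2-e3"))
    print(fold_change(db, "GENE1_e2-e3", "colon"))
```

With the made-up values above, the first variant shows a fold change of 4.0 in disease versus normal colon and the second shows 1.0, so only the first would be flagged as differentially expressed under a simple fold-change cutoff.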
  • Example 1 Tissue Sample Acquisition and Analysis
  • samples from normal and/or disease tissue may be used.
  • the samples may be treated using standard techniques. Briefly, frozen tissue may be ground to powder, total RNA extracted using Trizol (Life Technologies), and mRNA isolated using the Oligotex mRNA Midi kit (Qiagen). If necessary, the mRNA may be concentrated using an ethanol precipitation step. Double stranded cDNA may be created using the Superscript Choice system (Gibco-BRL).
  • cRNA may be synthesized according to standard procedures. To biotin-label the cRNA, nucleotides Bio-11-CTP and Bio-16-UTP (Enzo Diagnostics) may be added to the reaction. The cRNA may then be fragmented (5x fragmentation buffer: 200 mM Tris-acetate (pH 8.1), 500 mM KOAc, 150 mM MgOAc) for thirty-five minutes at 94°C.
  • Fragmented cRNA may be hybridized to the DNA chips of the present invention under suitable conditions. Such conditions include twenty-four hours at 60 rpm in a 45°C hybridization oven.
  • the chips may be washed and stained with Streptavidin Phycoerythrin (SAPE) (Molecular Probes) in fluidics stations.
  • SAPE solution may be added twice with an anti-streptavidin biotinylated antibody (Vector Laboratories) staining step in between.
  • Hybridization to the probe arrays may be detected by fluorometric scanning (Hewlett Packard Gene Array Scanner). Following hybridization and scanning, the microarray images may be analyzed for quality control, looking for major chip defects or abnormalities in hybridization signal. After all chips pass QC, the data may be analyzed using any available software or data mining tools.
  • Each DNA chip of the present invention may contain a plurality of oligonucleotide probe pairs per sequence to be detected, for example, splice junction, exon and in some instances, intron sequence.
  • These probe pairs may include perfectly matched sets and mismatched sets, both of which are necessary for the calculation of the average difference.
  • the average difference is a measure of the intensity difference for each probe pair, calculated by subtracting the intensity of the mismatch from the intensity of the perfect match. This takes into consideration variability in hybridization among probe pairs and other hybridization artifacts that could affect the fluorescence intensities.
  • the presence or absence of the various sequences will be used to determine the presence or absence of particular splice variants in a particular sample.
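
The average difference calculation described above (perfect match intensity minus mismatch intensity, averaged over the probe pairs for a sequence) and its use in calling splice variants present or absent can be sketched as follows. This is a minimal sketch under stated assumptions: the source does not give a detection threshold or a rule for combining sequences into a variant call, so the `threshold` parameter, the rule that every feature of a variant must be detected, the feature names, and all intensity values are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Sequence, Tuple

ProbePair = Tuple[float, float]  # (perfect match intensity, mismatch intensity)


def average_difference(probe_pairs: Sequence[ProbePair]) -> float:
    """Mean of (perfect match - mismatch) over all probe pairs for one sequence."""
    if not probe_pairs:
        raise ValueError("at least one probe pair is required")
    return mean(pm - mm for pm, mm in probe_pairs)


def call_variants(
    intensities: Dict[str, Sequence[ProbePair]],
    variants: Dict[str, List[str]],
    threshold: float,
) -> Dict[str, bool]:
    """Call a variant present when every feature it requires exceeds the threshold.

    `threshold` is a hypothetical cutoff on the average difference; in practice
    it would need to be chosen empirically.
    """
    return {
        name: all(average_difference(intensities[feature]) > threshold for feature in features)
        for name, features in variants.items()
    }


# Made-up intensities for two splice junction features (illustration only).
DEMO_INTENSITIES = {
    "exon2-exon3 junction": [(900.0, 200.0), (850.0, 250.0), (1000.0, 300.0)],
    "exon2-exon4 junction": [(210.0, 200.0), (190.0, 220.0), (240.0, 230.0)],
}
# Hypothetical mapping from variants to the features that identify them.
DEMO_VARIANTS = {
    "variant_A": ["exon2-exon3 junction"],
    "variant_B": ["exon2-exon4 junction"],
}

if __name__ == "__main__":
    for feature, pairs in DEMO_INTENSITIES.items():
        print(feature, round(average_difference(pairs), 2))
    print(call_variants(DEMO_INTENSITIES, DEMO_VARIANTS, threshold=100.0))
```

Subtracting the mismatch signal pair by pair before averaging is what lets the measure discount nonspecific hybridization, as the passage above notes.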

Landscapes

  • Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biotechnology (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Biology (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Zoology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Wood Science & Technology (AREA)
  • Bioethics (AREA)
  • Molecular Biology (AREA)
  • Microbiology (AREA)
  • Immunology (AREA)
  • Analytical Chemistry (AREA)
  • Biochemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention relates to materials and methods for detecting and analyzing novel mRNA splice variants and, in certain embodiments, to solid supports to which are attached oligonucleotides having sequences complementary to predicted splice junction sequences. A splice variant profile can be prepared for a sample and compared with a corresponding profile from a normal or diseased tissue sample.
PCT/US2002/015649 2001-05-17 2002-05-17 Materials and methods for detecting novel mRNA splice variants WO2002093165A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US10/471,731 US20040115686A1 (en) 2002-05-17 2002-05-17 Materials and methods to detect alternative splicing of mrna

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US29159801P 2001-05-17 2001-05-17
US60/291,598 2001-05-17

Publications (1)

Publication Number Publication Date
WO2002093165A1 (fr) 2002-11-21

Family

ID=23120966

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2002/015649 WO2002093165A1 (fr) 2001-05-17 2002-05-17 Materials and methods for detecting novel mRNA splice variants

Country Status (1)

Country Link
WO (1) WO2002093165A1 (fr)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6268147B1 (en) * 1998-11-02 2001-07-31 Kenneth Loren Beattie Nucleic acid analysis using sequence-targeted tandem hybridization
US6358691B1 (en) * 1998-03-03 2002-03-19 Third Wave Technologies, Inc. Target-dependent reactions using structure-bridging oligonucleotides
US6403309B1 (en) * 1999-03-19 2002-06-11 Valigen (Us), Inc. Methods for detection of nucleic acid polymorphisms using peptide-labeled oligonucleotides and antibody arrays

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG US UZ VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 10471731

Country of ref document: US

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP