WO2001048245A2 - Nucleic acids containing single nucleotide polymorphisms and methods of use thereof - Google Patents

Nucleic acids containing single nucleotide polymorphisms and methods of use thereof

Info

Publication number
WO2001048245A2
Authority
WO
WIPO (PCT)
Prior art keywords
sequence
polymorphic
nucleotide
nucleic acid
complement
Prior art date
Application number
PCT/US2000/035346
Other languages
English (en)
Other versions
WO2001048245A3 (fr
Inventor
Richard A. Shimkets
Martin Leach
Original Assignee
Curagen Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Curagen Corporation
Priority to EP00990358A (EP1282726A2)
Priority to AU27394/01A (AU2739401A)
Priority to CA002395786A (CA2395786A1)
Publication of WO2001048245A2
Publication of WO2001048245A3


Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/156: Polymorphic or mutational markers

Definitions

  • the invention relates generally to nucleic acids and polypeptides and in particular to the identification of human single nucleotide polymorphisms based on at least one gene product that was not previously described.
  • Sequence polymorphism-based analysis of nucleic acid is generally based on alterations in nucleic acid sequences between related individuals. This analysis has been widely used in a variety of genetic, diagnostic, and forensic applications. For example, polymorphism analyses are used in identity and paternity analysis, and in genetic mapping studies.
  • RFLPs: restriction fragment length polymorphisms
  • STR sequences typically include tandem repeats of 2, 3, or 4 nucleotide sequences that are present in a nucleic acid from one individual but absent from a second, related individual at the corresponding genomic location.
  • SNPs: single nucleotide polymorphisms
  • cSNPs: single nucleotide polymorphisms in transcribed sequences
  • SNPs can arise in several ways.
  • a single nucleotide polymorphism may arise due to a substitution of one nucleotide for another at the polymorphic site.
  • Substitutions can be transitions or transversions.
  • a transition is the replacement of one purine nucleotide by another purine nucleotide, or one pyrimidine by another pyrimidine.
  • a transversion is the replacement of a purine by a pyrimidine, or the converse.
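The transition/transversion distinction defined above can be sketched in a few lines of Python. This is only an illustration: the function name `classify_substitution` is hypothetical and does not appear in the patent.

```python
# Purines and pyrimidines; U is included so RNA bases are accepted too.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError(f"unknown base in substitution {ref}>{alt}")
    if ref == alt:
        raise ValueError("reference and variant base are identical")
    # Same chemical class on both sides -> transition; otherwise transversion.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"
```

For example, `classify_substitution("A", "G")` is a purine-to-purine change and is reported as a transition, while `classify_substitution("A", "C")` is reported as a transversion.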
  • Single nucleotide polymorphisms can also arise from a deletion of a nucleotide or an insertion of a nucleotide relative to a reference allele.
  • the polymorphic site is a site at which one allele bears a gap with respect to a single nucleotide in another allele.
  • Some SNPs occur within or near genes.
  • One such class includes SNPs falling within regions of genes encoding a polypeptide product. These SNPs may result in an alteration of the amino acid sequence of the polypeptide product and give rise to the expression of a defective or other variant protein.
  • Such variant products can, in some cases, result in a pathological condition, e.g., genetic disease.
  • Examples of genetic diseases caused by a polymorphism within a coding sequence include sickle cell anemia and cystic fibrosis.
  • Other SNPs do not result in alteration of the polypeptide product.
  • SNPs can also occur in noncoding regions of genes.
  • SNPs tend to occur with great frequency and are spaced uniformly throughout the genome.
  • the frequency and uniformity of SNPs mean that there is a greater probability that such a polymorphism will be found in close proximity to a genetic locus of interest.
  • the invention is based in part on the discovery of single nucleotide polymorphisms (SNPs) in regions of human DNA.
  • the invention provides nucleic acid sequences comprising nucleic acid segments of both publicly known and novel genes, including the polymorphic site.
  • the segments can be DNA or RNA, and can be single- or double-stranded.
  • Preferred segments include a biallelic polymorphic site.
  • the invention further provides allele-specific oligonucleotides that hybridize to a segment of a fragment shown in Table 1, column 4, or its complement. These oligonucleotides can be probes or primers. Also provided are isolated nucleic acids comprising a sequence shown in Table 1, column 4, in which the polymorphic site within the sequence is occupied by a base other than the reference bases shown in Table 1, columns 5 and 6.
  • the invention further provides a method of analyzing a nucleic acid from an individual. The method determines which base is present at any one of the polymorphic sites shown in Table 1.
  • a set of bases occupying a set of polymorphic sites shown in Table 1 is determined. This type of analysis can be performed on a number of individuals, who are tested for the presence of a disease phenotype.
  • the invention provides an isolated polynucleotide which includes one or more of the SNPs described herein.
  • the polynucleotide can be, e.g., a nucleotide sequence which includes one or more of the polymorphic sequences shown in Table 1 and which includes a polymorphic sequence, or a fragment of the polymorphic sequence, as long as it includes the polymorphic site.
  • the polynucleotide may alternatively contain a nucleotide sequence which includes a sequence complementary to one or more of these sequences, or a fragment of the complementary nucleotide sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence.
  • the polynucleotide can be, e.g., DNA or RNA, and can be between about 10 and about 100 nucleotides, e.g., 10-90, 10-75, 10-51, 10-40, or 10-30 nucleotides in length.
  • the polymorphic site in the polymorphic sequence includes a nucleotide other than the nucleotide listed in Table 1, column 5 for the polymorphic sequence, e.g., the polymorphic site includes the nucleotide listed in Table 1, column 6 for the polymorphic sequence.
  • the complement of the polymorphic site includes a nucleotide other than the complement of the nucleotide listed in Table 1, column 5 for the complement of the polymorphic sequence, e.g., the complement of the nucleotide listed in Table 1, column 6 for the polymorphic sequence.
  • the polymorphic sequence is associated with a polypeptide related to one of the protein families disclosed herein.
  • the nucleic acid may be associated with a polypeptide related to angiopoietin, 4-hydroxybutyrate dehydrogenase, or any of the other proteins identified in Table 1, column 10.
  • the invention provides an isolated allele-specific oligonucleotide that hybridizes to a first polynucleotide containing a polymorphic site.
  • the first polynucleotide can be, e.g., a nucleotide sequence comprising one or more polymorphic sequences recited in Table 1, provided that the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for the polymorphic sequence.
  • the first polynucleotide can be a nucleotide sequence that is a fragment of the polymorphic sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence, or a complementary nucleotide sequence which includes a sequence complementary to one or more polymorphic sequences in Table 1, provided that the complementary nucleotide sequence includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5.
  • the first polynucleotide may in addition include a nucleotide sequence that is a fragment of the complementary sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence.
  • the oligonucleotide does not hybridize under stringent conditions to a second polynucleotide.
  • the second polynucleotide can be, e.g., (a) a nucleotide sequence comprising one or more polymorphic sequences in Table 1, wherein the polymorphic sequence includes the nucleotide listed in Table 1, column 5 for the polymorphic sequence; (b) a nucleotide sequence that is a fragment of any of the polymorphic sequences; (c) a complementary nucleotide sequence including a sequence complementary to one or more polymorphic sequences disclosed herein in Table 1; and (d) a nucleotide sequence that is a fragment of the complementary sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence.
  • the oligonucleotide can be, e.g., between about 10 and about 100 bases in length. In some embodiments, the oligonucleotide is between about 10 and 75 bases, 10 and 51 bases, 10 and about 40 bases, or about 15 and 30 bases in length.
  • the invention also provides a method of detecting a polymorphic site in a nucleic acid.
  • the method includes contacting the nucleic acid with an oligonucleotide that hybridizes to a polymorphic sequence shown in Table 1, or its complement, provided that the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for the polymorphic sequence, or the complement includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5.
  • the method also includes determining whether the nucleic acid and the oligonucleotide hybridize.
  • Hybridization of the oligonucleotide to the nucleic acid sequence indicates the presence of the polymorphic site in the nucleic acid.
  • the oligonucleotide does not hybridize to the polymorphic sequence when the polymorphic sequence includes the nucleotide recited in Table 1, column 5 for the polymorphic sequence, or when the complement of the polymorphic sequence includes the complement of the nucleotide recited in Table 1, column 5 for the polymorphic sequence.
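The allele-specific detection described above can be illustrated with a minimal Python sketch. It assumes DNA only and treats perfect complementarity as a crude stand-in for hybridization under stringent conditions; real hybridization also depends on temperature, salt concentration, and probe length. The function names and any sequence passed to them are hypothetical and are not taken from Table 1.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def make_allele_specific_probe(sequence: str, site: int,
                               variant_base: str, flank: int = 7) -> str:
    """Build a probe complementary to the variant allele.

    `site` is the 0-based position of the polymorphic site; the probe
    spans `flank` bases on each side of it (15 bases by default, which
    falls within the 10-100 base range given in the text).
    """
    if site - flank < 0 or site + flank >= len(sequence):
        raise ValueError("not enough flanking sequence around the site")
    variant = sequence[:site] + variant_base + sequence[site + 1:]
    window = variant[site - flank: site + flank + 1]
    return reverse_complement(window)


def hybridizes(probe: str, target: str) -> bool:
    """Crude proxy (assumption): the probe 'hybridizes' only if the target
    contains a perfectly complementary stretch."""
    return reverse_complement(probe) in target.upper()
```

Under this simplification, a probe built for the variant base matches a target carrying that base and fails to match the reference allele, which mirrors the behaviour the method relies on.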
  • the oligonucleotide can be, e.g., between about 10 and about 100 bases in length. In some embodiments, the oligonucleotide is between about 10 and 75 bases, 10 and 51 bases, 10 and about 40 bases, or about 15 and 30 bases in length.
  • the polymorphic sequence identified by the oligonucleotide is associated with a nucleic acid encoding a polypeptide related to one of the protein families disclosed herein.
  • the nucleic acid may be associated with a polypeptide related to angiopoietin, 4-hydroxybutyrate dehydrogenase, or any of the other proteins identified in Table 1, column 10.
  • the invention provides a method of determining the relatedness of a first and second nucleic acid.
  • the method includes providing a first nucleic acid and a second nucleic acid and contacting the first nucleic acid and the second nucleic acid with an oligonucleotide that hybridizes to a polymorphic sequence disclosed in Table 1, or its complement, provided that the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for the polymorphic sequence, or the complement includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5.
  • the method also includes determining whether the first nucleic acid and the second nucleic acid hybridize to the oligonucleotide, and comparing hybridization of the first and second nucleic acids to the oligonucleotide. Hybridization of both the first and second nucleic acids to the oligonucleotide indicates that the first and second subjects are related.
  • the oligonucleotide does not hybridize to the polymorphic sequence when the polymorphic sequence includes the nucleotide recited in Table 1, column 5 for the polymorphic sequence, or when the complement of the polymorphic sequence includes the complement of the nucleotide recited in Table 1, column 5 for the polymorphic sequence.
  • the oligonucleotide can be, e.g., between about 10 and about 100 bases in length. In some embodiments, the oligonucleotide is between about 10 and 75 bases, 10 and 51 bases, 10 and about 40 bases, or about 15 and 30 bases in length.
  • the method can be used in a variety of applications.
  • the first nucleic acid may be isolated from physical evidence gathered at a crime scene, and the second nucleic acid may be obtained from a person suspected of having committed the crime. Matching the two nucleic acids using the method can establish whether the physical evidence originated from the person.
  • the first sample may be from a human male suspected of being the father of a child and the second sample may be from a child. Establishing a match using the described method can establish whether the male is the father of the child.
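The comparison step of the relatedness method can be sketched as follows. This is a simplified illustration, not the patent's procedure: it assumes each sample has already been reduced to a mapping from a site identifier to a boolean hybridization result for the variant-specific oligonucleotide, and it reports plain concordance. The site identifiers and the function name `concordance` are hypothetical, and practical identity or paternity testing would also weigh allele frequencies.

```python
from typing import Dict


def concordance(calls_a: Dict[str, bool], calls_b: Dict[str, bool]) -> float:
    """Fraction of shared polymorphic sites at which two samples give the
    same hybridization result (True = hybridized to the variant probe)."""
    shared = calls_a.keys() & calls_b.keys()
    if not shared:
        raise ValueError("the two samples share no tested sites")
    matches = sum(calls_a[site] == calls_b[site] for site in shared)
    return matches / len(shared)
```

Sites tested in only one of the two samples are ignored, so the result depends only on the sites both samples have in common.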
  • the method includes determining if a sequence polymorphism is present in a subject, such as a human.
  • the method includes providing a nucleic acid from the subject and contacting the nucleic acid with an oligonucleotide that hybridizes to a polymorphic sequence disclosed in Table 1, or its complement, provided that the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for the polymorphic sequence, or the complement includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5.
  • Hybridization between the nucleic acid and the oligonucleotide is then determined. Hybridization of the oligonucleotide to the nucleic acid sequence indicates the presence of the polymorphism in said subject.
  • the invention provides an isolated polypeptide comprising a polymorphic site at one or more amino acid residues, and wherein the protein is encoded by a polynucleotide including one of the polymorphic sequences in Table 1, or their complement, provided that the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for the polymorphic sequence, or the complement includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5.
  • polypeptide can be, e.g., related to one of the protein families disclosed herein.
  • polypeptide can be related to angiopoietin, 4-hydroxybutyrate dehydrogenase, ATP-dependent RNA helicase, MHC Class I histocompatibility antigen, or phosphoglycerate kinase.
  • the polypeptide is translated in the same open reading frame as is a wild type protein whose amino acid sequence is identical to the amino acid sequence of the polymorphic protein except at the site of the polymorphism.
  • the polypeptide is encoded by the polymorphic sequence, or its complement, in which the polymorphic sequence includes the nucleotide listed in Table 1, column 6 for the polymorphic sequence, or the complement includes the complement of the nucleotide listed in Table 1, column 6.
  • the invention also provides an antibody that binds specifically to a polypeptide encoded by a polynucleotide comprising a nucleotide sequence encoded by a polynucleotide including one or more of the polymorphic sequences in Table 1, or its complement.
  • the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for the polymorphic sequence, or the complement includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 5.
  • the antibody binds specifically to a polypeptide encoded by a polymorphic sequence which includes the nucleotide listed in Table 1, column 6 for the polymorphic sequence.
  • the antibody does not bind specifically to a polypeptide encoded by a polymorphic sequence which includes the nucleotide listed in Table 1, column 5 for the polymorphic sequence.
  • the invention further provides a method of detecting the presence of a polypeptide having one or more amino acid residue polymorphisms in a subject.
  • the method includes providing a protein sample from the subject and contacting the sample with the above-described antibody under conditions that allow for the formation of antibody-antigen complexes. The antibody-antigen complexes are then detected. The presence of the complexes indicates the presence of the polypeptide.
  • the invention also provides a method of treating a subject suffering from, at risk for, or suspected of suffering from a pathology ascribed to the presence of a sequence polymorphism in a subject, e.g., a human, non-human primate, cat, dog, rat, mouse, cow, pig, goat, or rabbit.
  • the method includes providing a subject suffering from a pathology associated with aberrant expression of a first nucleic acid comprising a polymorphic sequence shown in Table 1, or its complement, and treating the subject by administering to the subject an effective dose of a therapeutic agent.
  • Aberrant expression can include qualitative alterations in expression of a gene, e.g., expression of a gene encoding a polypeptide having an altered amino acid sequence with respect to its wild-type counterpart.
  • Qualitatively different polypeptides can include shorter, longer, or altered polypeptides relative to the amino acid sequence of the wild-type polypeptide.
  • Aberrant expression can also include quantitative alterations in expression of a gene. Examples of quantitative alterations in gene expression include lower or higher levels of expression of the gene relative to its wild-type counterpart, or alterations in the temporal or tissue-specific expression pattern of a gene.
  • aberrant expression may also include a combination of qualitative and quantitative alterations in gene expression.
  • the therapeutic agent can include, e.g., a second nucleic acid comprising the polymorphic sequence, provided that the second nucleic acid comprises the nucleotide present in the wild type allele.
  • the second nucleic acid sequence comprises a polymorphic sequence which includes the nucleotide listed in Table 1, column 5 for the polymorphic sequence.
  • the therapeutic agent can be a polypeptide encoded by a polynucleotide comprising a polymorphic sequence shown in Table 1, or by a polynucleotide comprising a nucleotide sequence that is complementary to any one of the polymorphic sequences, provided that the polymorphic sequence includes the nucleotide listed in Table 1, column 6 for the polymorphic sequence.
  • the therapeutic agent may further include an antibody as herein described, or an oligonucleotide comprising a polymorphic sequence shown in Table 1, or a polynucleotide comprising a nucleotide sequence that is complementary to any one of the polymorphic sequences, provided that the polymorphic sequence includes the nucleotide listed in Table 1, column 6 for the polymorphic sequence.
  • the invention provides an oligonucleotide array comprising one or more oligonucleotides hybridizing to a first polynucleotide at a polymorphic site encompassed therein.
  • the first polynucleotide can be, e.g., a nucleotide sequence comprising one or more polymorphic sequences shown in Table 1; a nucleotide sequence that is a fragment of any of the nucleotide sequences, provided that the fragment includes a polymorphic site in the polymorphic sequence; a complementary nucleotide sequence comprising a sequence complementary to one or more of the polymorphic sequences; or a nucleotide sequence that is a fragment of the complementary sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence.
  • the array comprises 10; 100; 1,000; 10,000; 100,000 or more oligonucleotides.
  • the invention also provides a kit comprising one or more of the herein-described nucleic acids.
  • the kit can include, e.g., a polynucleotide which includes one or more of the SNPs described herein.
  • the polynucleotide can be, e.g., a nucleotide sequence which includes one or more of the polymorphic sequences shown in Table 1, and which includes a polymorphic sequence, or a fragment of the polymorphic sequence, as long as it includes the polymorphic site.
  • the polynucleotide may alternatively contain a nucleotide sequence which includes a sequence complementary to one or more of the sequences, or a fragment of the complementary nucleotide sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence.
  • the kit can include an isolated allele-specific oligonucleotide that hybridizes to a first polynucleotide containing a polymorphic site.
  • the first polynucleotide can be, e.g., a nucleotide sequence comprising one or more polymorphic sequences shown in Table 1, provided that the polymorphic sequence includes a nucleotide other than the nucleotide recited in Table 1, column 5 for the polymorphic sequence.
  • the first polynucleotide can be a nucleotide sequence that is a fragment of the polymorphic sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence, or a complementary nucleotide sequence which includes a sequence complementary to one or more polymorphic sequences shown in Table 1, provided that the complementary nucleotide sequence includes a nucleotide other than the complement of the nucleotide recited in Table 1, column 6.
  • the first polynucleotide may in addition include a nucleotide sequence that is a fragment of the complementary sequence, provided that the fragment includes a polymorphic site in the polymorphic sequence.
  • FIG. 1 illustrates an example of the way in which SNP sites were identified in the present invention.
  • the invention provides human SNPs in sequences which are transcribed, i.e., are cSNPs.
  • Many SNPs have been identified in genes related to polypeptides of known function.
  • SNPs associated with various polypeptides can be used together.
  • SNPs can be grouped according to whether they are derived from a nucleic acid encoding a polypeptide related to particular protein family or involved in a particular function.
  • SNPs can be grouped according to the functions played by their gene products. Such functions include structural proteins and proteins associated with metabolic pathways such as fatty acid metabolism, glycolysis, intermediary metabolism, calcium metabolism, proteases, and amino acid metabolism.
  • the present invention provides a large number of human cSNPs based on at least one gene product that has not been previously identified.
  • the cSNPs involve nucleic acid sequences that are assembled from at least one known sequence.
  • the present invention describes 651 distinct polymorphic sites, which are summarized in Table 1.
  • Raw traces underlying sequence data were drawn from public databases and from the proprietary database of the Assignee of the present invention. The sequences were obtained by calling the bases from these traces, and included assigning "Phred" quality scores for each called base.
  • for each allelic set at the polynucleotide level, four or more nucleotide sequences were identified having at least partial overlap with one another.
  • these four or more sequences could be clustered and assembled to make a consensus contig that included an ORF.
  • the assembled contigs defined associated sets of two, or possibly more than two, alleles defined by a SNP at a particular polymorphic site.
  • the nucleotide change from the consensus sequence had to occur in at least two individual sequences, and had to have a "Phred" score of 23 or higher at the site of the presumed SNP.
  • no more than 50% mismatching with the consensus sequence was allowed.
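The read-support criteria stated above (four or more overlapping sequences, a change from the consensus in at least two individual sequences, and a "Phred" score of 23 or higher at the site) can be sketched as a filter over one alignment column. This is an illustrative sketch, not the Assignee's software: the names `call_snp` and `phred_to_error_prob` are hypothetical, the consensus is assumed to be the most frequent base in the column, and the 50% mismatch limit is left out because this excerpt does not say how it was measured. The conversion uses the standard Phred definition, Q = -10 log10(P), so a score of 23 corresponds to an estimated error probability of roughly 0.5%.

```python
from collections import Counter
from typing import List, Optional, Tuple

MIN_READS = 4           # four or more overlapping sequences
MIN_VARIANT_READS = 2   # change must occur in at least two sequences
MIN_PHRED = 23          # minimum quality at the presumed SNP site


def phred_to_error_prob(q: float) -> float:
    """Standard Phred definition: Q = -10 * log10(P_error)."""
    return 10 ** (-q / 10)


def call_snp(column: List[Tuple[str, int]]) -> Optional[Tuple[str, str]]:
    """Evaluate one alignment column of (base, phred) pairs, one per read.

    Returns (consensus_base, variant_base) if the column passes the
    criteria, otherwise None.
    """
    if len(column) < MIN_READS:
        return None
    counts = Counter(base for base, _ in column)
    consensus = counts.most_common(1)[0][0]  # assumption: majority base
    # Count only non-consensus bases that meet the quality threshold.
    support = Counter(base for base, q in column
                      if base != consensus and q >= MIN_PHRED)
    if not support:
        return None
    variant, n = support.most_common(1)[0]
    return (consensus, variant) if n >= MIN_VARIANT_READS else None
```

A column in which two reads carry a high-quality non-consensus base is reported as a candidate SNP, while a column where one of those two reads falls below the quality threshold is not.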
  • the SNP alleles occur in polynucleotides found in public databases.
  • allelic sets were identified in which one allele defines a known polypeptide sequence that includes the polymorphic site and another polypeptide allele is not previously known. Then, various associations of alleles are possible. For example, it is possible that an allelic pair is defined in a noncoding region of the contig containing an ORF. In such cases the inventors believe that the invention resides in the recognition of the allelic pair; this association has not heretofore been made.
  • sets of allelic contigs may exist in which the polymorphic site is within an ORF, but does not result in an amino acid change among the allelic polypeptides.
  • the polymorphic site resides within an ORF and results in an amino acid change, or a frameshift, among the alleles of the allelic set.
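The three coding outcomes described above (no amino acid change, an amino acid change, or a frameshift) can be sketched with the standard genetic code. This is a minimal illustration under stated assumptions: the ORF is given in frame starting at position 0, a single-nucleotide deletion is passed as `"-"`, and the function name `classify_coding_snp` is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify_coding_snp(orf: str, pos: int, alt: str) -> str:
    """Classify a variant at 0-based position `pos` of an in-frame ORF.

    `alt` is the variant base, or "-" for a single-nucleotide deletion
    (a convention assumed here for illustration).
    """
    orf = orf.upper()
    if alt == "-":
        return "frameshift"
    start = pos - pos % 3                      # first base of the codon
    ref_codon = orf[start:start + 3]
    offset = pos % 3
    alt_codon = ref_codon[:offset] + alt.upper() + ref_codon[offset + 1:]
    if CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]:
        return "silent"
    return "amino acid change"
```

For instance, in the ORF `ATGGAGGAG`, changing position 4 to T turns the codon GAG (glutamate) into GTG (valine), an amino acid change, whereas changing position 5 to A gives GAA, which still encodes glutamate.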
  • at least one of the alleles at the polypeptide level is a known protein.
  • At least one of the remaining allele or alleles in the set, carrying a variant amino acid at the polymorphic site, is a novel polypeptide not heretofore known.
  • the invention resides at least in the recognition of the polymorphic allele as being a variant of the known reference polypeptide.
  • Table 1 provides information concerning the allelic sequences. One of the sequences may be termed a reference polymorphic sequence, and the corresponding second sequence includes the variant SNP at the polymorphic site. Since the reference polypeptide sequence is already known, the Sequence Listing accompanying this application provides only the sequence of the polymorphic allele, while its SEQ ID NO is provided in the Table. A reference to the SEQ ID NO that corresponds to the translated amino acid sequence is also given.
  • the Table includes thirteen columns that provide descriptive information for each cSNP, each of which occupies one row in the Table. The column headings, and a description of each, are given below.
  • SNPs disclosed in Table 1 were detected by aligning large numbers of sequences from genetically diverse sources of publicly available mRNA libraries (Clontech). Software designed specifically to look for multiple examples of variant bases differing from a consensus sequence was created and deployed. A criterion of a minimum of 2 occurrences of a sequence differing from the consensus in high quality sequence reads was used to identify an SNP.
  • the SNPs described herein may be useful in diagnostic kits, for DNA arrays on chips and for other uses that involve hybridization of the SNP.
  • Specific SNPs may have utility where a disease has already been associated with that gene. Examples of possible disease correlations between the claimed SNPs with members of the genes of each classification are listed below:
  • Amylase is responsible for endohydrolysis of 1,4-alpha-glucosidic linkages in oligosaccharides and polysaccharides. Variations in amylase gene may be indicative of delayed maturation and of various amylase producing neoplasms and carcinomas.
  • the serum amyloid A (SAA) proteins comprise a family of vertebrate proteins that associate predominantly with high density lipoproteins (HDL). The synthesis of certain members of the family is greatly increased in inflammation. Prolonged elevation of plasma SAA levels results in a pathological condition, called amyloidosis, which affects the liver, kidney and spleen and which is characterized by the highly insoluble accumulation of SAA in these tissues.
  • Amyloid selectively inhibits insulin-stimulated glucose utilization and glycogen deposition in muscle, while not affecting adipocyte glucose metabolism.
  • angiogenesis is also an essential step in tumor growth in order for the tumor to get the blood supply it needs to expand. Variation in these genes may be predictive of any form of heart disease, numerous blood clotting disorders, stroke, hypertension and predisposition to tumor formation and metastasis. In particular, these variants may be predictive of the response to various antihypertensive drugs and chemotherapeutic and anti-tumor agents.
  • apoptosis: active cell suicide
  • apoptosis is induced by events such as growth factor withdrawal and toxins. It is controlled by regulators, which have either an inhibitory effect on programmed cell death (anti-apoptotic) or block the protective effect of inhibitors (pro-apoptotic).
  • Many viruses have found a way of countering defensive apoptosis by encoding their own anti-apoptosis genes preventing their target-cells from dying too soon. Variants of apoptosis related genes may be useful in formulation of antiaging drugs.
  • Granulocyte/macrophage colony-stimulating factors are cytokines that act in hematopoiesis by controlling the production, differentiation, and function of 2 related white cell populations of the blood, the granulocytes and the monocytes-macrophages.
  • Complement proteins are immune associated cytotoxic agents, acting in a chain reaction to exterminate target cells that were opsonized (primed) with antibodies, by forming a membrane attack complex (MAC). The mechanism of killing is by opening pores in the target cell membrane.
  • Variations in complement genes or their inhibitors are associated with many autoimmune disorders. Modified serum levels of complement products cause edemas of various tissues, lupus (SLE), vasculitis, glomerulonephritis, renal failure, hemolytic anemia, thrombocytopenia, and arthritis. They interfere with mechanisms of ADCC (antibody dependent cell cytotoxicity), severely impair immune competence and reduce phagocytic ability.
  • Variants of complement genes may also be indicative of type I diabetes mellitus, meningitis, neurological disorders such as nemaline myopathy and neonatal hypotonia, muscular disorders such as congenital myopathy, and other diseases.
  • the respiratory chain is a key biochemical pathway which is essential to all aerobic cells.
  • cytochromes involved in the chain. These are heme bound proteins which serve as electron carriers. Modifications in these genes may be predictive of ataxia areflexia, dementia and myopathic and neuropathic changes in muscles. They may also be associated with various types of solid tumors.
  • Kinesins are tubulin molecular motors that function to transport organelles within cells and to move chromosomes along microtubules during cell division. Modifications of these genes may be indicative of neurological disorders such as Pick disease of the brain and tuberous sclerosis.
  • Cytokines, Interferon, Interleukin
  • Cytokines such as erythropoietin are cell-specific in their growth stimulation; erythropoietin is useful for the stimulation of the proliferation of erythroblasts.
  • Variants in cytokines may be predictive for a wide variety of diseases, including cancer predisposition.
  • G-protein coupled receptors (also called R7G) are an extensive group of receptors for hormones, neurotransmitters, odorants and light, which transduce extracellular signals by interaction with guanine nucleotide-binding (G) proteins. Alterations in genes coding for G-coupled proteins may be involved in and indicative of a vast number of physiological conditions. These include blood pressure regulation, renal dysfunctions, male infertility, dopamine-associated cognitive, emotional, and endocrine functions, hypercalcemia, chondrodysplasia and osteoporosis, pseudohypoparathyroidism, growth retardation and dwarfism.
  • Eukaryotic thiol proteases are a family of proteolytic enzymes which contain an active site cysteine. Catalysis proceeds through a thioester intermediate and is facilitated by a nearby histidine side chain; an asparagine completes the essential catalytic triad. Variants of thioester associated genes may be predictive of neuronal disorders and mental illnesses such as Ceroid Lipofuscinosis, Neuronal 1, Infantile, Santavuori disease and more.
  • PIR (PIR DATABASE release 56, 29-OCT-1998); polymerase: polymerase; potassium_channel: potassium channel protein; prostaglandin: prostaglandin; protease: protease; proteaseinhib: protease inhibitor; reductase: reductase; ribosomalprot: ribosomal associated protein
  • Table 1 includes thirteen columns that provide descriptive information for each cSNP, each of which occupies one row in the Table. The column headings, and an explanation for each, are given below.
  • The first column of the table lists the names assigned to the fragments in which the polymorphisms occur.
  • The fragments are all human genomic fragments.
  • The sequence of one allelic form of each of the fragments (arbitrarily referred to as the prototypical or reference form) has been previously published. These sequences are listed at http://www-genome.wi.mit.edu/ (all STS's (sequence tag sites)); http://shgc.stanford.edu (Stanford STS's); and http://www.tigr.org/ (TIGR STS's).
  • The web sites also list primers for amplification of the fragments, and the genomic location of the fragments. Some fragments are expressed sequence tags, and some are random genomic fragments.
  • The second column lists the position in the fragment in which a polymorphic site has been found. Positions are numbered consecutively, with the first base of the fragment sequence, as listed in one of the above databases, being assigned the number one.
  • The third column lists the base occupying the polymorphic site in the sequence in the database. This base is arbitrarily designated the reference or prototypical form, but it is not necessarily the most frequently occurring form.
  • The fourth column in the table lists the alternative base(s) at the polymorphic site.
  • The fifth column of the table lists a 5' (upstream or forward) primer that hybridizes with the 5' end of the DNA sequence to be amplified.
  • The sixth column of the table lists a 3' (downstream or reverse) primer that hybridizes with the complement of the 3' end of the sequence to be amplified.
  • The seventh column of the table lists a number of bases of sequence on either side of the polymorphic site in each fragment.
  • The indicated sequences can be either DNA or RNA. In the latter, the T's shown in the table are replaced by U's.
  • The base occupying the polymorphic site is indicated in IUPAC-IUB ambiguity code.
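  • The column conventions above (1-based positions, a reference base and an alternative base, flanking sequence, ambiguity code at the site, and T replaced by U for RNA) can be sketched as follows. This is only an illustration: the function names, the flank length, and the example fragment are hypothetical and do not come from the Table. The two-base codes are the standard IUPAC nucleotide ambiguity codes.

```python
# Standard IUPAC ambiguity codes for sites with exactly two possible bases.
IUPAC_TWO_BASE = {
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
}


def ambiguity_code(reference_base, alternative_base):
    """Return the IUPAC code that covers both alleles of a diallelic site."""
    pair = frozenset((reference_base.upper(), alternative_base.upper()))
    return IUPAC_TWO_BASE[pair]


def flanked_site(fragment, position, alternative_base, flank=5):
    """Show the site as an ambiguity code with `flank` bases on either side.

    `position` is 1-based: the first base of the fragment is number one.
    """
    index = position - 1
    code = ambiguity_code(fragment[index], alternative_base)
    left = fragment[max(0, index - flank):index]
    right = fragment[index + 1:index + 1 + flank]
    return left + code + right


def as_rna(sequence):
    """Replace every T with U, as described for RNA sequences."""
    return sequence.replace("T", "U")
```

  • For a hypothetical fragment `ACGTACGTAC` with an A/G polymorphism at position 5, `flanked_site("ACGTACGTAC", 5, "G", flank=2)` gives `GTRCG`.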
  • "SEQ ID" provides the cross-references to the two nucleotide SEQ ID NOS: for the cognate pair, which are numbered consecutively, and, as explained below, amino acid SEQ ID NOS: as well, in the Sequence Listing of the application.
  • Each sequence entry in the Sequence Listing also includes a cross-reference to the CuraGen sequence ID, under the label "Accession number".
  • The first pair of SEQ ID NOS: given in the first column of each row of the Table is the SEQ ID NO: identifying the nucleic acid sequence for the polymorphism. If a polymorphism carries an entry for the amino acid portion of the row, a third SEQ ID NO: appears in parentheses in the column "Amino acid before" (see below) for the reference amino acid sequence, and a fourth SEQ ID NO: appears in parentheses in the column "Amino acid after" (see below) for the polymorphic amino acid sequence.
  • These SEQ ID NOS: refer to amino acid sequences giving the cognate reference and polymorphic amino acid sequences that are the translation of the nucleotide polymorphism. If a polymorphism carries no entry for the protein portion of the row, only one pair of SEQ ID NOS: is provided, in the first column.
  • "CuraGen sequence ID" provides CuraGen Corporation's accession number.
  • "Base pos. of SNP" gives the numerical position of the nucleotide in the nucleic acid at which the cSNP is found, as identified in this invention.
  • "Polymorphic sequence" provides a 51-base sequence with the polymorphic site at the 26th base in the sequence, as well as 25 bases from the reference sequence on the 5' side and the 3' side of the polymorphic site. The designation at the polymorphic site is enclosed in square brackets, and provides first, the reference nucleotide; second, a "slash (/)"; and third, the polymorphic nucleotide. In certain cases the polymorphism is an insertion or a deletion. In that case, the position that is "unfilled" (i.e., the reference or the polymorphic position) is indicated by the word "gap".
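  • A minimal sketch of how an entry in the "Polymorphic sequence" column could be read is given below. It assumes the entry is a plain string with 25 flanking bases on each side and a bracketed `[reference/polymorphic]` designation, with the literal word `gap` marking an unfilled position. The function name and the exact spelling of the entry are assumptions made for illustration.

```python
import re

# Matches the bracketed designation, e.g. "[C/T]" or "[gap/T]".
SITE_PATTERN = re.compile(r"\[([A-Za-z]+)/([A-Za-z]+)\]")


def parse_polymorphic_sequence(entry):
    """Split an entry into its reference and polymorphic allele sequences."""
    match = SITE_PATTERN.search(entry)
    if match is None:
        raise ValueError("no [reference/polymorphic] designation found")
    five_prime = entry[:match.start()]
    three_prime = entry[match.end():]
    reference, polymorphic = match.group(1), match.group(2)
    # The word "gap" marks the side of an insertion or deletion with no base.
    ref_bases = "" if reference.lower() == "gap" else reference
    alt_bases = "" if polymorphic.lower() == "gap" else polymorphic
    return {
        "site_position": len(five_prime) + 1,  # 1-based, 26 for a full entry
        "reference_allele": five_prime + ref_bases + three_prime,
        "polymorphic_allele": five_prime + alt_bases + three_prime,
        "is_gap": "gap" in (reference.lower(), polymorphic.lower()),
    }
```

  • With 25 bases on each side, a substitution entry yields two 51-base alleles that differ only at position 26, while a gap entry yields alleles whose lengths differ by the inserted or deleted base(s).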
  • "Base before" provides the nucleotide present in the reference sequence at the position at which the polymorphism is found.
  • "Base after" provides the altered nucleotide at the position of the polymorphism.
  • "Amino acid before" provides the amino acid in the reference protein, if the polymorphism occurs in a coding region.
  • This column also includes the SEQ ID NO: in parentheses for the translated reference amino acid sequence if the polymorphism occurs in a coding region.
  • "Amino acid after" provides the amino acid in the polymorphic protein, if the polymorphism occurs in a coding region.
  • This column also includes the SEQ ID NO: in parentheses for the translated polymorphic amino acid sequence if the polymorphism occurs in a coding region.
  • "Type of change" provides information on the nature of the polymorphism.
  • "SILENT-NONCODING" is used if the polymorphism occurs in a noncoding region of a nucleic acid.
  • "SILENT-CODING" is used if the polymorphism occurs in a coding region of a nucleic acid and results in no change of amino acid in the translated polymorphic protein.
  • "CONSERVATIVE" is used if the polymorphism occurs in a coding region of a nucleic acid and provides a change in which the altered amino acid falls in the same class as the reference amino acid.
  • The classes are: 1) Aliphatic: Gly, Ala, Val, Leu, Ile; 2) Aromatic: Phe, Tyr, Trp; 3) Sulfur-containing: Cys, Met; 4) Aliphatic OH: Ser, Thr; 5) Basic: Lys, Arg, His; 6) Acidic: Asp, Glu, Asn, Gln; 7) Pro falls in none of the other classes; and 8) End defines a termination codon.
  • "NONCONSERVATIVE" is used if the polymorphism occurs in a coding region of a nucleic acid and provides a change in which the altered amino acid falls in a different class than the reference amino acid.
  • "FRAMESHIFT" relates to an insertion or a deletion. If the frameshift occurs in a coding region, the Table provides the translation of the frameshifted codons 3' to the polymorphic site.
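  • The "Type of change" labels can be expressed as a small decision procedure. The sketch below uses the eight classes listed above with standard three-letter amino acid codes. The function name, the argument names, and the order in which the checks are applied (gap first, then coding status) are assumptions made for illustration, not a description of how the Table was actually generated.

```python
# Amino acid classes as listed in the text; Pro and End form their own classes.
CLASS_MEMBERS = {
    "aliphatic": ("Gly", "Ala", "Val", "Leu", "Ile"),
    "aromatic": ("Phe", "Tyr", "Trp"),
    "sulfur-containing": ("Cys", "Met"),
    "aliphatic OH": ("Ser", "Thr"),
    "basic": ("Lys", "Arg", "His"),
    "acidic": ("Asp", "Glu", "Asn", "Gln"),
    "proline": ("Pro",),
    "end": ("End",),
}
AMINO_ACID_CLASS = {
    residue: name for name, members in CLASS_MEMBERS.items() for residue in members
}


def type_of_change(in_coding_region, amino_acid_before=None,
                   amino_acid_after=None, is_gap=False):
    """Return the 'Type of change' label for one polymorphism."""
    if is_gap:
        # Insertions and deletions are reported as frameshifts.
        return "FRAMESHIFT"
    if not in_coding_region:
        return "SILENT-NONCODING"
    if amino_acid_before == amino_acid_after:
        return "SILENT-CODING"
    if AMINO_ACID_CLASS[amino_acid_before] == AMINO_ACID_CLASS[amino_acid_after]:
        return "CONSERVATIVE"
    return "NONCONSERVATIVE"
```

  • Under these assumptions a Leu to Ile change is CONSERVATIVE (both aliphatic), whereas Asp to Lys, or any change to or from Pro, is NONCONSERVATIVE.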
  • "Protein classification of CuraGen gene" provides a generic class into which the protein is classified. Multiple classes of proteins were identified as listed above in the discussion of Table 1.
  • "Similarity (pvalue) following a BLASTX analysis" provides the pvalue, a statistical measure from the BLASTX analysis that the polymorphic sequence is similar to, and therefore an allele of, the reference, or wild-type, sequence.
  • A cutoff of pvalue > 1 x 10^-50 is used to establish that the reference-polymorphic cognate pairs are novel.
  • A pvalue < 1 x 10^-50 defines proteins considered to be already known.
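  • The novelty cutoff can be written as a one-line test. This is a sketch only: the function name and the return labels are hypothetical, and treating a pvalue exactly equal to the cutoff as "known" is an assumption, since the text does not address that boundary case.

```python
NOVELTY_CUTOFF = 1e-50  # the 1 x 10^-50 threshold stated in the text


def novelty_status(pvalue, cutoff=NOVELTY_CUTOFF):
    """Label a BLASTX pvalue as 'novel' (above the cutoff) or 'known'."""
    # Assumption: a pvalue exactly at the cutoff is grouped with 'known'.
    return "novel" if pvalue > cutoff else "known"
```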
  • "Map location" provides any information available at the time of filing related to localization of a gene on a chromosome.
  • The polymorphisms are arranged in Table 1 in the following order:
  • SEQ ID NOs: 1-422 are nucleotide sequences for SNPs that are silent.
  • SEQ ID NOs: 423-480 are nucleotide sequences for SNPs that lead to conservative amino acid changes.
  • SEQ ID NOs: 481-619 are nucleotide sequences for SNPs that lead to nonconservative amino acid changes.
  • SEQ ID NOs: 620-651 are nucleotide sequences for SNPs that involve a gap.
  • The allelic cSNP introduces an additional nucleotide (an insertion) or deletes a nucleotide (a deletion).
  • An SNP that involves a gap generates a frame shift.
  • SEQ ID NOs: 652-709 are the amino acid sequences centered at the polymorphic amino acid residue for the protein products provided by SNPs that lead to conservative amino acid changes. 7 or 8 amino acids on either side of the polymorphic site are shown. The order in which these sequences appear mirrors the order of presentation of the cognate nucleotide sequences, and is set forth in the Table.
  • SEQ ID NOs: 710-848 are the amino acid sequences centered at the polymorphic amino acid residue for the protein products provided by SNPs that lead to nonconservative amino acid changes. 7 or 8 amino acids on either side of the polymorphic site are shown. The order in which these sequences appear mirrors the order of presentation of the cognate nucleotide sequences, and is set forth in the Table.
  • SEQ ID NOs: 849-880 are the amino acid sequences centered at the polymorphic amino acid residue for the protein products provided by SNPs that lead to frameshift-induced amino acid changes. 7 or 8 amino acids on either side of the polymorphic site are shown. The order in which these sequences appear mirrors the order of presentation of the cognate nucleotide sequences, and is set forth in the Table.
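  • The ordering of the Sequence Listing described above amounts to a range lookup. The sketch below encodes only the ranges stated in the text. The function name and the short labels are hypothetical.

```python
# (first SEQ ID NO, last SEQ ID NO, molecule type, kind of change)
SEQ_ID_RANGES = [
    (1, 422, "nucleotide", "silent"),
    (423, 480, "nucleotide", "conservative"),
    (481, 619, "nucleotide", "nonconservative"),
    (620, 651, "nucleotide", "gap"),
    (652, 709, "amino acid", "conservative"),
    (710, 848, "amino acid", "nonconservative"),
    (849, 880, "amino acid", "gap"),
]


def describe_seq_id(seq_id):
    """Return (molecule type, kind of change) for a SEQ ID NO."""
    for first, last, molecule, change in SEQ_ID_RANGES:
        if first <= seq_id <= last:
            return molecule, change
    raise ValueError(f"SEQ ID NO: {seq_id} is outside the ranges 1-880")
```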
  • Also provided are compositions which include, or are capable of detecting, nucleic acid sequences having these polymorphisms, as well as methods of using nucleic acids.
  • Polymorphic alleles of the invention may be detected at either the DNA, the RNA, or the protein level using a variety of techniques that are well known in the art. Strategies for identification and detection are described in, e.g., EP 730,663, EP 717,113, and PCT US97/02102.
  • The present methods usually employ pre-characterized polymorphisms. That is, the genotyping location and nature of polymorphic forms present at a site have already been determined. The availability of this information allows sets of probes to be designed for specific identification of the known polymorphic forms. Many of the methods described below require amplification of DNA from target samples. This can be accomplished by, e.g., PCR.
  • "Recombinant protein" refers to a peptide or protein produced using non-native cells that do not have an endogenous copy of DNA able to express the protein.
  • A recombinantly produced protein relates to the gene product of a polymorphic allele, i.e., a "polymorphic protein" containing an altered amino acid at the site of translation of the nucleotide polymorphism.
  • The cells produce the protein because they have been genetically altered by the introduction of the appropriate nucleic acid sequence.
  • The recombinant protein will not be found in association with proteins and other subcellular components normally associated with the cells producing the protein.
  • "Protein" and "polypeptide" are used interchangeably herein.
  • "Substantially purified" or "isolated", when referring to a nucleic acid, peptide or protein, means that the chemical composition is in a milieu containing fewer, or preferably, essentially none, of other cellular components with which it is naturally associated.
  • "Isolated" or "substantially pure" refers to nucleic acid preparations that lack at least one protein or nucleic acid normally associated with the nucleic acid in a host cell. It is preferably in a homogeneous state, although it can be either dry or in aqueous solution. Purity and homogeneity are typically determined using analytical chemistry techniques such as gel electrophoresis or high performance liquid chromatography.
  • A substantially purified or isolated nucleic acid or protein will comprise more than 80% of all macromolecular species present in the preparation.
  • Preferably, the nucleic acid or protein is purified to represent greater than 90% of all macromolecular species present. More preferably the nucleic acid or protein is purified to greater than 95%, and most preferably the nucleic acid or protein is purified to essential homogeneity, wherein other macromolecular species are not detected by conventional analytical procedures.
  • The genomic DNA used for the diagnosis may be obtained from any nucleated cells of the body, such as those present in peripheral blood, urine, saliva, buccal samples, surgical specimens, and autopsy specimens.
  • The DNA may be used directly or may be amplified enzymatically in vitro through use of PCR (Saiki et al. Science 239:487-491 (1988)) or other in vitro amplification methods such as the ligase chain reaction (LCR) (Wu and Wallace Genomics 4:560-569 (1989)), strand displacement amplification (SDA) (Walker et al. Proc. Natl. Acad. Sci. U.S.A. 89:392-396 (1992)), or self-sustained sequence replication (3SR) (Fahy et al. PCR Methods Appl. 1:25-33 (1992)), prior to mutation analysis.
  • A "nucleic acid" is a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form, including known analogs of natural nucleotides unless otherwise indicated.
  • "Nucleic acids" refers to either DNA or RNA.
  • "Nucleic acid sequence" or "polynucleotide sequence" refers to a single-stranded sequence of deoxyribonucleotide or ribonucleotide bases read from the 5' end to the 3' end.
  • The direction of 5' to 3' addition of nascent RNA transcripts is referred to as the transcription direction; sequence regions on the DNA strand having the same sequence as the RNA and which are beyond the 5' end of the RNA transcript in the 5' direction are referred to as "upstream sequences"; sequence regions on the DNA strand having the same sequence as the RNA and which are beyond the 3' end of the RNA transcript in the 3' direction are referred to as "downstream sequences".
  • Detection of polymorphisms in specific DNA sequences can be accomplished by a variety of methods including, but not limited to, restriction-fragment-length-polymorphism detection based on allele-specific restriction-endonuclease cleavage (Kan and Dozy Lancet ii:910-912 (1978)), hybridization with allele-specific oligonucleotide probes (Wallace et al.
  • denaturing-gradient gel electrophoresis (DGGE)
  • single-strand-conformation-polymorphism detection (Orita et al. Genomics 5:874-879 (1983))
  • RNAase cleavage at mismatched base-pairs (Myers et al. Science 230:1242 (1985))
  • chemical (Cotton et al. Proc. Natl. Acad. Sci. U.S.A. 85:4397-4401 (1988))
  • enzymatic Youil et al. Proc.
  • Specific hybridization refers to the binding, or duplexing, of a nucleic acid molecule only to a second particular nucleotide sequence to which the nucleic acid is complementary, under suitably stringent conditions when that sequence is present in a complex mixture (e.g., total cellular DNA or RNA).
  • Stringent conditions are conditions under which a probe will hybridize to its target subsequence, but to no other sequences. Stringent conditions are sequence-dependent and are different in different circumstances. Longer sequences hybridize specifically at higher temperatures than shorter ones.
  • Stringent conditions are selected such that the temperature is about 5°C lower than the thermal melting point (Tm) for the specific sequence to which hybridization is intended to occur at a defined ionic strength and pH.
  • Tm is the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50% of the target sequence hybridizes to the complementary probe at equilibrium.
  • Stringent conditions include a salt concentration of at least about 0.01 to about 1.0 M Na ion concentration (or other salts), at pH 7.0 to 8.3.
  • The temperature is at least about 30°C for short probes (e.g., 10 to 50 nucleotides).
  • Stringent conditions can also be achieved with the addition of destabilizing agents such as formamide. For example, conditions of 5X SSPE (750 mM NaCl, 50 mM NaPhosphate, 5 mM EDTA, pH 7.4) and a temperature of 25-30°C are suitable for allele-specific probe hybridizations.
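  • The rule of hybridizing about 5°C below Tm can be illustrated numerically. The text does not give a formula for Tm, so the sketch below substitutes the Wallace rule (2°C per A or T, 4°C per G or C), a common rough estimate for short oligonucleotides. That rule, the function names, and the example probe are assumptions introduced here, and in practice conditions are usually determined empirically.

```python
def wallace_tm(probe):
    """Rough Tm estimate in degrees C for a short DNA probe (Wallace rule).

    Assumption: this rule of thumb stands in for a measured Tm; it ignores
    salt, pH, and probe concentration, which the text says Tm depends on.
    """
    probe = probe.upper()
    at_count = probe.count("A") + probe.count("T")
    gc_count = probe.count("G") + probe.count("C")
    return 2 * at_count + 4 * gc_count


def stringent_temperature(probe, offset=5):
    """Hybridization temperature about `offset` degrees C below the Tm."""
    return wallace_tm(probe) - offset
```

  • For the hypothetical 15-mer `ACGTACGTACGTACG` the estimate is 46°C, giving a hybridization temperature of about 41°C. A longer probe of similar composition gives a higher value, consistent with longer sequences hybridizing specifically at higher temperatures.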
  • "Complementary" or "target" nucleic acid sequences refer to those nucleic acid sequences which selectively hybridize to a nucleic acid probe. Proper annealing conditions depend, for example, upon a probe's length, base composition, and the number of mismatches and their position on the probe, and must often be determined empirically. For discussions of nucleic acid probe design and annealing conditions, see, for example, Sambrook et al., or Current Protocols in Molecular Biology, F. Ausubel et al., ed., Greene Publishing and Wiley-Interscience, New York (1987).
  • A perfectly matched probe has a sequence perfectly complementary to a particular target sequence.
  • The test probe is typically perfectly complementary to a portion of the target sequence.
  • A "polymorphic" marker or site is the locus at which a sequence difference occurs with respect to a reference sequence.
  • Polymorphic markers include restriction fragment length polymorphisms, variable number of tandem repeats (VNTR's), hypervariable regions, minisatellites, dinucleotide repeats, trinucleotide repeats, tetranucleotide repeats, simple sequence repeats, and insertion elements such as Alu.
  • The reference allelic form may be, for example, the most abundant form in a population, or the first allelic form to be identified, and other allelic forms are designated as alternative, variant or polymorphic alleles.
  • The allelic form occurring most frequently in a selected population is sometimes referred to as the "wild type" form, and herein may also be referred to as the "reference" form.
  • Diploid organisms may be homozygous or heterozygous for allelic forms.
  • A diallelic polymorphism has two distinguishable forms (i.e., base sequences), and a triallelic polymorphism has three such forms.
  • An "oligonucleotide" is a single-stranded nucleic acid ranging in length from 2 to about 60 bases. Oligonucleotides are often synthetic but can also be produced from naturally occurring polynucleotides.
  • A probe is an oligonucleotide capable of binding to a target nucleic acid of complementary sequence through one or more types of chemical bonds, usually through complementary base pairing via hydrogen bond formation. Oligonucleotide probes are often between 5 and 60 bases, and, in specific embodiments, may be between 10-40, or 15-30 bases long.
  • An oligonucleotide probe may include natural (i.e., A, G, C, or T) or modified bases (7-deazaguanosine, inosine, etc.).
  • The bases in an oligonucleotide probe may be joined by a linkage other than a phosphodiester bond, such as a phosphoramidite linkage or a phosphorothioate linkage, or they may be peptide nucleic acids in which the constituent bases are joined by peptide bonds rather than by phosphodiester bonds, so long as it does not interfere with hybridization. Examples of an oligonucleotide are shown in Table 1.
  • Oligonucleotides can be all of a nucleic acid segment as represented in column 4 of Table 1; a nucleic acid sequence which comprises a nucleic acid segment represented in column 4 of Table 1 and additional nucleic acids (present at either or both ends of a nucleic acid segment of column 4); or a portion (fragment) of a nucleic acid segment represented in column 4 of the table which includes a polymorphic site.
  • Preferred polymorphic sites of the invention include segments of DNA or their complements, which include any one of the polymorphic sites shown in the Table.
  • The segments can be between 5 and 250 bases, and, in specific embodiments are between 5-10, 5-20, 10-20, 10-50, 20-50 or 10-100 bases.
  • The polymorphic site can occur within any position of the segment.
  • The segments can be from any of the allelic forms of the DNA shown in the Table.
  • "Primer" refers to a single-stranded oligonucleotide which acts as a point of initiation of template-directed DNA synthesis under appropriate conditions (e.g., in the presence of four different nucleoside triphosphates and a polymerization agent, such as DNA polymerase, RNA polymerase or reverse transcriptase) in an appropriate buffer and at a suitable temperature.
  • The appropriate length of a primer depends on the intended use of the primer, but typically ranges from 15 to 30 nucleotides. Short primer molecules generally require cooler temperatures to form sufficiently stable hybrid complexes with the template.
  • A primer need not be perfectly complementary to the exact sequence of the template, but should be sufficiently complementary to hybridize with it.
  • "Primer site" refers to the sequence of the target DNA to which a primer hybridizes.
  • "Primer pair" refers to a set of primers including a 5' (upstream) primer that hybridizes with the 5' end of the DNA sequence to be amplified and a 3' (downstream) primer that hybridizes with the complement of the 3' end of the sequence to be amplified.
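  • A primer pair for a given target can be sketched as below. The sketch follows the common convention in which the target is written 5' to 3' as a single strand, the upstream primer matches its 5' end, and the downstream primer is the reverse complement of its 3' end. That strand convention, the function names, and the default length of 20 (chosen from the typical 15 to 30 nucleotide range) are assumptions for illustration. Real primer design also weighs melting temperature and specificity, which are not modeled here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence, read 5' to 3'."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def primer_pair(target, length=20):
    """Return (upstream primer, downstream primer) for a 5'->3' target."""
    target = target.upper()
    if len(target) < length:
        raise ValueError("target is shorter than the requested primer length")
    upstream = target[:length]
    downstream = reverse_complement(target[-length:])
    return upstream, downstream
```

  • For the hypothetical target `ATGCCCGGGTTTAAA` and a primer length of 5, this returns `("ATGCC", "TTTAA")`.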
  • DNA fragments can be prepared, for example, by digesting plasmid DNA, or by use of PCR.
  • Oligonucleotides for use as primers or probes are chemically synthesized by methods known in the field of the chemical synthesis of polynucleotides, including by way of non-limiting example the phosphoramidite method described by Beaucage and Carruthers, Tetrahedron Lett 22:1859-1862 (1981) and the triester method provided by Matteucci, et al., J. Am. Chem. Soc., 103:3185 (1981), both incorporated herein by reference.
  • Purification of oligonucleotides may be carried out by either native acrylamide gel electrophoresis or by anion-exchange HPLC as described in Pearson, J.D. and Regnier, F.E., J. Chrom., 255:137-149 (1983).
  • A double stranded fragment may then be obtained, if desired, by annealing appropriate complementary single strands together under suitable conditions or by synthesizing the complementary strand using a DNA polymerase with an appropriate primer sequence.
  • Where a specific sequence for a nucleic acid probe is given, it is understood that the complementary strand is also identified and included. The complementary strand will work equally well in situations where the target is a double-stranded nucleic acid.
  • The sequence of the synthetic oligonucleotide or of any nucleic acid fragment can be obtained using either the dideoxy chain termination method or the Maxam-Gilbert method (see Sambrook et al. Molecular Cloning - a Laboratory Manual (2nd Ed.), Vols. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, (1989), which is incorporated herein by reference, and which is hereinafter referred to as "Sambrook et al."; Zyskind et al., (1988) Recombinant DNA Laboratory Manual, (Acad. Press, New York)). Oligonucleotides useful in diagnostic assays are typically at least 8 consecutive nucleotides in length, and may range upwards of 18 nucleotides in length to greater than 100 or more consecutive nucleotides.
  • The invention also pertains to antisense nucleic acid molecules that are hybridizable to or complementary to the nucleic acid molecule comprising the SNP-containing nucleotide sequences of the invention, or fragments, analogs or derivatives thereof.
  • An "antisense" nucleic acid comprises a nucleotide sequence that is complementary to a "sense" nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence.
  • Antisense nucleic acid molecules are provided that comprise a sequence complementary to at least about 10, about 25, about 50, or about 60 nucleotides or an entire SNP coding strand, or to only a portion thereof.
  • An antisense nucleic acid molecule may be antisense to a "coding region" of the coding strand of a polymorphic nucleotide sequence of the invention.
  • The term "coding region" refers to the region of the nucleotide sequence comprising codons which are translated into amino acids.
  • Alternatively, the antisense nucleic acid molecule is antisense to a "noncoding region" of the coding strand of a nucleotide sequence of the invention.
  • The term "noncoding region" refers to 5' and 3' sequences which flank the coding region and are not translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions).
  • Antisense nucleic acids of the invention can be designed according to the rules of Watson and Crick or Hoogsteen base pairing.
  • The antisense nucleic acid molecule can generally be complementary to the entire coding region of an mRNA, but more preferably as embodied herein, it is an oligonucleotide that is antisense to only a portion of the coding or noncoding region of the mRNA.
  • An antisense oligonucleotide can range in length between about 5 and about 60 nucleotides, preferably between about 10 and about 45 nucleotides, more preferably between about 15 and 40 nucleotides, and still more preferably between about 15 and 30 in length.
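  • The Watson and Crick case of antisense design can be sketched as follows: a DNA oligonucleotide that is the reverse complement of a chosen stretch of the mRNA. The function name, the 0-based `start` argument, and the strict enforcement of the 5 to 60 nucleotide range are assumptions for illustration. Hoogsteen pairing and modified nucleotides are not modeled.

```python
# Maps each mRNA base to the DNA base that pairs with it.
RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_oligonucleotide(mrna, start, length):
    """Return a DNA oligo (5'->3') antisense to mrna[start:start + length].

    `start` is a 0-based offset into the mRNA, which is read 5' to 3'.
    """
    if not 5 <= length <= 60:
        raise ValueError("length should be between about 5 and about 60")
    target = mrna.upper().replace("T", "U")[start:start + length]
    if len(target) != length:
        raise ValueError("requested region extends past the end of the mRNA")
    # Complement each base, then reverse so the oligo reads 5' to 3'.
    return target.translate(RNA_TO_DNA_COMPLEMENT)[::-1]
```

  • For the hypothetical mRNA stretch `AUGGCCAAG`, the full-length antisense oligonucleotide is `CTTGGCCAT`.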
  • An antisense nucleic acid of the invention can be constructed using chemical synthesis or enzymatic ligation reactions using procedures known in the art.
  • An antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used.
  • Examples of modified nucleotides that can be used to generate the antisense nucleic acid include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5'
  • The antisense nucleic acid can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).
  • The antisense nucleic acid molecules of the invention are typically administered to a subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a polymorphic protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation.
  • The hybridization can be by conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in the major groove of the double helix.
  • An example of a route of administration of antisense nucleic acid molecules of the invention includes direct injection at a tissue site.
  • Antisense nucleic acid molecules can be modified to target selected cells and then administered systemically.
  • Antisense molecules can be modified such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface receptors or antigens.
  • The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.
  • The antisense nucleic acid molecule of the invention may be an α-anomeric nucleic acid molecule.
  • An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641).
  • The antisense nucleic acid molecule can also comprise a 2'-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett 215: 327-330).
  • A "reference sequence" is a defined sequence used as a basis for a sequence comparison; a reference sequence may be a subset of a larger sequence, for example, a segment of a full-length cDNA or gene sequence given in a sequence listing, or may comprise a complete cDNA or gene sequence.
  • Optimal alignment of sequences for aligning a comparison window may, for example, be conducted by the local homology algorithm of Smith and Waterman Adv. Appl.
  • The phrase "nucleic acid sequence encoding" refers to a nucleic acid which directs the expression of a specific protein, peptide or amino acid sequence.
  • The nucleic acid sequences include both the DNA strand sequence that is transcribed into RNA and the RNA sequence that is translated into protein, peptide or amino acid sequence.
  • The nucleic acid sequences include both the full length nucleic acid sequences disclosed herein as well as non-full length sequences derived from the full length protein. It is further understood that the sequence includes the degenerate codons of the native sequence or sequences which may be introduced to provide codon preference in a specific host cell. Consequently, the principles of probe selection and array design can readily be extended to analyze more complex polymorphisms (see EP 730,663). For example, to characterize a triallelic SNP polymorphism, three groups of probes can be designed, tiled on the three polymorphic forms as described above.
  • Genomic DNA is typically amplified before analysis. Amplification is usually effected by PCR using primers flanking a suitable fragment, e.g., of 50-500 nucleotides, containing the locus of the polymorphism to be analyzed. Target is usually labeled in the course of amplification.
  • The amplification product can be RNA or DNA, single stranded or double stranded. If double stranded, the amplification product is typically denatured before application to an array.
  • It may be desirable to remove RNA from the sample before applying it to the array. This can be accomplished by digestion with DNase-free RNase.
DETECTION OF POLYMORPHISMS IN A NUCLEIC ACID SAMPLE
  • The SNPs disclosed herein can be used to determine which forms of a characterized polymorphism are present in individuals under analysis.
  • Allele-specific probes can be designed that hybridize to a segment of target DNA from one individual but do not hybridize to the corresponding segment from another individual due to the presence of different polymorphic forms in the respective segments from the two individuals.
  • Hybridization conditions should be sufficiently stringent that there is a significant difference in hybridization intensity between alleles, and preferably an essentially binary response, whereby a probe hybridizes to only one of the alleles.
  • Some probes are designed to hybridize to a segment of target DNA such that the polymorphic site aligns with a central position (e.g., in a 15-mer at the 7 position; in a 16-mer, at either the 7, 8 or 9 position) of the probe. This design of probe achieves good discrimination in hybridization between different allelic forms.
  • Allele-specific probes are often used in pairs, one member of a pair showing a perfect match to a reference form of a target sequence and the other member showing a perfect match to a variant form. Several pairs of probes can then be immobilized on the same support for simultaneous analysis of multiple polymorphisms within the same target sequence.
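The probe-pair design described above can be sketched as follows. This is a minimal illustration only: the function names, the 0-based `snp_index` argument, and the default of a 15-mer with the polymorphic site at position 7 (1-based, following the example in the text) are assumptions made for the sketch, and real probe design would also consider melting temperature and secondary structure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def allele_specific_probes(target, snp_index, alleles, length=15, site_position=7):
    """Build one probe per allele, each perfectly matched to one allelic form.

    target        -- target strand, 5'->3' (hypothetical input)
    snp_index     -- 0-based index of the polymorphic site in target
    alleles       -- bases to cover, e.g. ("G", "A") for reference and variant
    site_position -- 1-based position of the site within the covered segment
    """
    start = snp_index - (site_position - 1)
    end = start + length
    if start < 0 or end > len(target):
        raise ValueError("not enough flanking sequence for a probe of this length")
    probes = {}
    for allele in alleles:
        # Substitute the allele at the polymorphic site, keep the flanks unchanged.
        segment = target[start:snp_index] + allele + target[snp_index + 1:end]
        # The probe hybridizes to the target, so it is the reverse complement.
        probes[allele] = reverse_complement(segment)
    return probes
```

Calling `allele_specific_probes(target, snp_index, ("G", "A"))` returns a reference/variant pair that differs at exactly one base, which is what allows the pair to discriminate between the two allelic forms under stringent hybridization.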
  • The polymorphisms can also be identified by hybridization to nucleic acid arrays, some examples of which are described in published PCT application WO 95/11995.
  • WO 95/11995 also describes subarrays that are optimized for detection of a variant form of a precharacterized polymorphism.
  • Such a subarray contains probes designed to be complementary to a second reference sequence, which is an allelic variant of the first reference sequence.
  • The second group of probes is designed by the same principles, except that the probes exhibit complementarity to the second reference sequence.
  • A second group can be particularly useful for analyzing short subsequences of the primary reference sequence in which multiple mutations are expected to occur within a short distance commensurate with the length of the probes (e.g., two or more mutations within 9 to 21 bases).
  • An allele-specific primer hybridizes to a site on target DNA overlapping a polymorphism and only primes amplification of an allelic form to which the primer exhibits perfect complementarity. See Gibbs, Nucleic Acid Res. 17, 2427-2448 (1989). This primer is used in conjunction with a second primer which hybridizes at a distal site. Amplification proceeds from the two primers, resulting in a detectable product which indicates that the particular allelic form is present.
  • A control is usually performed with a second pair of primers, one of which shows a single base mismatch at the polymorphic site and the other of which exhibits perfect complementarity to a distal site.
  • The single-base mismatch prevents amplification and no detectable product is formed.
  • The method works best when the mismatch is included in the 3'-most position of the oligonucleotide aligned with the polymorphism because this position is most destabilizing to elongation from the primer (see, e.g., WO 93/22456).
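The primer design described above, with the 3'-most base aligned to the polymorphism, can be sketched as follows. The function names and the 20-base default length are hypothetical, and the amplification check is a deliberately idealized model (a 3' mismatch is treated as blocking extension completely), not a prediction of real PCR behavior.

```python
def allele_specific_primer(sense_strand, snp_index, allele, length=20):
    """Return a primer whose 3'-most base sits on the polymorphic site.

    sense_strand -- strand containing the polymorphism, 5'->3' (hypothetical input)
    snp_index    -- 0-based index of the polymorphic site
    allele       -- base the primer should match at its 3' end
    """
    start = snp_index - length + 1
    if start < 0:
        raise ValueError("not enough upstream sequence for a primer of this length")
    # Upstream flank followed by the allele-specific 3' terminal base.
    return sense_strand[start:snp_index] + allele

def is_expected_to_amplify(primer, sample_allele):
    """Idealized model: extension occurs only if the 3' base matches the sample."""
    return primer[-1] == sample_allele
```

In this simplified picture, a primer built for one allele yields product only from samples carrying that allele, while the mismatched control primer yields none.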
  • Amplification products generated using the polymerase chain reaction can be analyzed by the use of denaturing gradient gel electrophoresis. Different alleles can be identified based on the different sequence-dependent melting properties and electrophoretic migration of DNA in solution. See Erlich, ed., PCR Technology, Principles and Applications for DNA Amplification (W.H. Freeman and Co., New York, 1992), Chapter 7.
  • Alleles of target sequences can be differentiated using single-strand conformation polymorphism analysis, which identifies base differences by alteration in electrophoretic migration of single stranded PCR products, as described in Orita et al., Proc. Nat. Acad. Sci. 86, 2766-2770 (1989).
  • Amplified PCR products can be generated and heated or otherwise denatured, to form single stranded amplification products.
  • Single-stranded nucleic acids may refold or form secondary structures which are partially dependent on the base sequence.
  • The different electrophoretic mobilities of single-stranded amplification products can be related to base-sequence differences between alleles of target sequences.
  • The genotype of an individual with respect to a pathology suspected of being caused by a genetic polymorphism may be assessed by association analysis.
  • Phenotypic traits suitable for association analysis include diseases that have known but hitherto unmapped genetic components (e.g., agammaglobulinemia, diabetes insipidus, Lesch-Nyhan syndrome, muscular dystrophy, Wiskott-Aldrich syndrome, Fabry's disease, familial hypercholesterolemia, polycystic kidney disease, hereditary spherocytosis, von Willebrand's disease, tuberous sclerosis, hereditary hemorrhagic telangiectasia, familial colonic polyposis, Ehlers-Danlos syndrome, osteogenesis imperfecta, and acute intermittent porphyria).
  • Phenotypic traits also include symptoms of, or susceptibility to, multifactorial diseases of which a component is or may be genetic, such as autoimmune diseases, inflammation, cancer, diseases of the nervous system, and infection by pathogenic microorganisms.
  • Autoimmune diseases include rheumatoid arthritis, multiple sclerosis, diabetes (insulin-dependent and non-insulin-dependent), systemic lupus erythematosus and Graves disease.
  • Cancers include cancers of the bladder, brain, breast, colon, esophagus, kidney, oral cavity, ovary, pancreas, prostate, skin, stomach, liver, lung, and uterus, as well as leukemia.
  • Phenotypic traits also include characteristics such as longevity, appearance (e.g., baldness, obesity), strength, speed, endurance, fertility, and susceptibility or receptivity to particular drugs or therapeutic treatments.
  • p(ID) is the probability that two random individuals have the same polymorphic or allelic form at a given polymorphic site.
  • diallelic loci four genotypes are possible: AA,
  • The probability of identity p(ID) for a 3-allele system where the alleles have the frequencies in the population of x, y and z, respectively, is equal to the sum of the squares of the genotype frequencies: p(ID) = x^4 + (2xy)^2 + (2yz)^2 + (2xz)^2 + z^4 + y^4.
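The definition above (sum of the squares of the genotype frequencies) can be checked numerically with a small sketch. It assumes Hardy-Weinberg proportions, so that homozygote frequencies are the squares of the allele frequencies and heterozygote frequencies are twice the product of the two allele frequencies; the function name is hypothetical.

```python
from itertools import combinations

def probability_of_identity(allele_freqs):
    """p(ID) at one locus: sum of squared genotype frequencies.

    Assumes Hardy-Weinberg genotype proportions (an assumption of this sketch).
    allele_freqs -- population frequencies of the alleles, summing to 1
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Homozygous genotypes: frequency f^2 for each allele.
    homozygotes = [f * f for f in allele_freqs]
    # Heterozygous genotypes: frequency 2*f1*f2 for each unordered allele pair.
    heterozygotes = [2 * f1 * f2 for f1, f2 in combinations(allele_freqs, 2)]
    return sum(g * g for g in homozygotes + heterozygotes)
```

For a diallelic locus with equal allele frequencies, `probability_of_identity([0.5, 0.5])` gives 0.375; adding alleles or evening out their frequencies lowers p(ID), which is why more polymorphic sites are more informative for identification.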
  • The object of paternity testing is usually to determine whether a male is the father of a child. In most cases, the mother of the child is known and thus, the mother's contribution to the child's genotype can be traced. Paternity testing investigates whether the part of the child's genotype not attributable to the mother is consistent with that of the putative father. Paternity testing can be performed by analyzing sets of polymorphisms in the putative father and the child. If the set of polymorphisms in the child attributable to the father does not match the putative father, it can be concluded, barring experimental error, that the putative father is not the real father. If the set of polymorphisms in the child attributable to the father does match the set of polymorphisms of the putative father, a statistical calculation can be performed to determine the probability of coincidental match.
  • The cumulative probability of exclusion of a random male is very high. This probability can be taken into account in assessing the liability of a putative father whose polymorphic marker set matches the child's polymorphic marker set attributable to his/her father.
  • For example, allele A1 at polymorphism A might correlate with heart disease.
  • As a further example, allele B1 at polymorphism B might correlate with increased milk production of a farm animal.
  • Such correlations can be exploited in several ways.
  • Detection of the polymorphic form set in a human or animal patient may justify immediate administration of treatment, or at least the institution of regular monitoring of the patient.
  • Detection of a polymorphic form correlated with serious disease in a couple contemplating a family may also be valuable to the couple in their reproductive decisions.
  • The female partner might elect to undergo in vitro fertilization to avoid the possibility of transmitting such a polymorphism from her husband to her offspring.
  • Linkage is analyzed by calculation of LOD (log of the odds) values.
  • A lod value is the logarithm of the relative likelihood of obtaining observed segregation data for a marker and a genetic locus when the two are located at a recombination fraction θ, versus the situation in which the two are not linked, and thus segregating independently (Thompson & Thompson, Genetics in Medicine (5th ed, W.B. Saunders Company, Philadelphia, 1991); Strachan, "Mapping the human genome" in The Human Genome (BIOS Scientific Publishers Ltd, Oxford), Chapter 4).
  • Positive lod score values suggest that the two loci are linked, whereas negative values suggest that linkage is less likely (at that value of θ) than the possibility that the two loci are unlinked.
  • A combined lod score of +3 or greater is considered definitive evidence that two loci are linked.
  • A negative lod score of -2 or less is taken as definitive evidence against linkage of the two loci being compared.
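The lod calculation and the thresholds above can be sketched for the simplest case. The sketch assumes phase-known, fully informative meioses, so that the likelihood at recombination fraction θ is θ^R (1-θ)^(N-R) for R recombinants among N meioses, compared against θ = 0.5 (no linkage); the function names are hypothetical, and real pedigree data generally require dedicated linkage software.

```python
import math

def lod_score(recombinants, meioses, theta):
    """log10 likelihood ratio of linkage at theta versus no linkage (theta = 0.5).

    Assumes phase-known, fully informative meioses (an assumption of this sketch).
    """
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must be between 0 and meioses")
    non_recombinants = meioses - recombinants
    log_linked = (recombinants * math.log10(theta)
                  + non_recombinants * math.log10(1 - theta))
    log_unlinked = meioses * math.log10(0.5)
    return log_linked - log_unlinked

def interpret_lod(lod):
    """Apply the conventional thresholds quoted in the text."""
    if lod >= 3:
        return "linked"
    if lod <= -2:
        return "linkage excluded"
    return "inconclusive"
```

In practice the score is evaluated over a range of θ values and summed across families; the combined score at the best-supported θ is what is compared against the +3 and -2 thresholds.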
  • Negative linkage data are useful in excluding a chromosome or a segment thereof from consideration. The search focuses on the remaining non-excluded chromosomal locations.
  • The invention further provides transgenic nonhuman animals capable of expressing an exogenous variant gene and/or having one or both alleles of an endogenous variant gene inactivated.
  • Expression of an exogenous variant gene is usually achieved by operably linking the gene to a promoter and optionally an enhancer, and microinjecting the construct into a zygote. See Hogan et al., "Manipulating the Mouse Embryo, A Laboratory Manual," Cold Spring Harbor Laboratory (1989).
  • Inactivation of endogenous variant genes can be achieved by forming a transgene in which a cloned variant gene is inactivated by insertion of a positive selection marker.
  • The transgene is then introduced into an embryonic stem cell, where it undergoes homologous recombination with an endogenous variant gene. Mice and other rodents are preferred animals. Such animals provide useful drug screening systems.
  • The invention further provides methods for assessing the pharmacogenomic susceptibility of a subject harboring a single nucleotide polymorphism to a particular pharmaceutical compound, or to a class of such compounds.
  • Genetic polymorphisms in drug-metabolizing enzymes, drug transporters, receptors for pharmaceutical agents, and other drug targets have been correlated with individual differences in the efficacy and toxicity of the pharmaceutical agent administered to a subject.
  • Pharmacogenomic characterization of a subject's susceptibility to a drug enhances the ability to tailor a dosing regimen to the particular genetic constitution of the subject, thereby enhancing and optimizing the therapeutic effectiveness of the therapy.
  • A subject suspected of suffering from a pathology ascribable to a polymorphic protein that arises from a cSNP is to be diagnosed using any of a variety of diagnostic methods capable of identifying the presence of the cSNP in the nucleic acid, or of the cognate polymorphic protein, in a suitable clinical sample taken from the subject.
  • The subject is treated with a pharmaceutical composition that includes a nucleic acid that harbors the correcting wild-type gene, or a fragment containing a correcting sequence of the wild-type gene.
  • Non-limiting examples of ways in which such a nucleic acid may be administered include incorporating the wild-type gene in a viral vector, such as an adenovirus or adeno-associated virus, and administration of a naked DNA in a pharmaceutical composition that promotes intracellular uptake of the administered nucleic acid.
  • Once the nucleic acid that includes the gene coding for the wild-type allele of the polymorphism is incorporated within a cell of the subject, it will initiate de novo biosynthesis of the wild-type gene product. If the nucleic acid is further incorporated into the genome of the subject, the treatment will have long-term effects, providing de novo synthesis of the wild-type protein for a prolonged duration. The synthesis of the wild-type protein in the cells of the subject will contribute to a therapeutic enhancement of the
  • A genetic defect leading to an inborn pathology may be overcome, as the chimeric oligonucleotides induce incorporation of the wild-type sequence into the subject's genome.
  • The wild-type gene product is expressed, and the replacement is propagated, thereby engendering a permanent repair.
  • The invention further provides kits comprising at least one allele-specific oligonucleotide as described above.
  • The kits contain one or more pairs of allele-specific oligonucleotides hybridizing to different forms of a polymorphism.
  • The allele-specific oligonucleotides are provided immobilized to a substrate.
  • the same substrate can comprise allele-specific oligonucleotide probes for detecting at least
  • nucleic acids comprising a SNP of the invention.
  • DNA is isolated from a genomic or cDNA library using labeled oligonucleotide probes having sequences complementary to the sequences disclosed herein.
  • Probes can be used directly in hybridization assays.
  • Probes can be designed for use in amplification techniques such as PCR.
  • Once DNA encoding a sequence comprising a cSNP is isolated and cloned, one can express the encoded polymorphic proteins in a variety of recombinantly engineered cells. It is expected that those of skill in the art are knowledgeable in the numerous expression systems available for expression of DNA encoding a sequence of interest. No attempt to describe in detail the various methods known for the expression of proteins in prokaryotes or eukaryotes is made here.
  • The expression of natural or synthetic nucleic acids encoding a sequence of interest will typically be achieved by operably linking the DNA or cDNA to a promoter (which is either constitutive or inducible), followed by incorporation into an expression vector.
  • The vectors can be suitable for replication and integration in either prokaryotes or eukaryotes.
  • Typical expression vectors contain initiation sequences, transcription and translation terminators, and promoters useful for regulation of the expression of a polynucleotide sequence of interest.
  • Suitable expression plasmids contain, at the minimum, a strong promoter to direct transcription, a ribosome binding site for translational initiation, and a transcription/translation terminator.
  • Examples of regulatory regions suitable for this purpose in E. coli are the promoter and operator region of the E. coli tryptophan biosynthetic pathway as described by Yanofsky, C., J. Bacteriol. 158:1018-1024 (1984) and the leftward promoter of phage lambda (P_L) as described by Herskowitz, I. and Hagen, P., Ann. Rev. Genet. 14:399-445 (1980).
  • The inclusion of selection markers in DNA vectors transformed in E. coli is also useful. Examples of such markers include genes specifying resistance to ampicillin, tetracycline, or chloramphenicol. See Sambrook et al. for details concerning selection markers for use in E. coli.
  • The expressed protein may first be denatured and then renatured. This can be accomplished by solubilizing the bacterially produced proteins in a chaotropic agent such as guanidine HCl and reducing all the cysteine residues with a reducing agent such as beta-mercaptoethanol. The protein is then renatured, either by slow dialysis or by gel filtration. See U.S. Patent No. 4,511,503. Detection of the expressed antigen is achieved by methods known in the art, such as radioimmunoassay, Western blotting techniques or immunoprecipitation. Purification from E. coli can be achieved following procedures such as those described in U.S. Patent No. 4,511,503.
  • Any of a variety of eukaryotic expression systems, such as yeast, insect cell lines, bird, fish, and mammalian cells, may also be used to express a polymorphic protein of the invention.
  • A nucleotide sequence harboring a cSNP may be expressed in these eukaryotic systems. Synthesis of heterologous proteins in yeast is well known. Methods in Yeast Genetics, Sherman, F., et al., Cold Spring Harbor Laboratory (1982), is a well-recognized work describing the various methods available to produce the protein in yeast.
  • Suitable vectors usually have expression control sequences, such as promoters, including 3-phosphoglycerate kinase or other glycolytic enzymes, and an origin of replication, termination sequences and the like as desired.
  • Suitable vectors are described in the literature (Botstein, et al., Gene 8:17-24 (1979); Broach, et al., Gene 8:121-133 (1979)).
  • In one procedure, yeast cells are first converted into protoplasts using zymolyase, lyticase or glusulase, followed by addition of DNA and polyethylene glycol (PEG).
  • The PEG-treated protoplasts are then regenerated in a 3% agar medium under selective conditions. Details of this procedure are given in the papers by J.D. Beggs, Nature (London) 275:104-109 (1978); and Hinnen, A., et al., Proc. Natl. Acad. Sci. USA, 75:1929-1933 (1978).
  • The second procedure does not involve removal of the cell wall.
  • Vectors for expressing the proteins of the invention in insect cells are usually derived from baculovirus.
  • Insect cell lines include mosquito larvae, silkworm, armyworm, moth and Drosophila cell lines such as a Schneider cell line (see Schneider, J. Embryol. Exp. Morphol., 27:353-365 (1987)).
  • The vector, e.g., a plasmid, which is used to transform the host cell, preferably contains DNA sequences to initiate transcription and sequences to control the translation of the protein. These sequences are referred to as expression control sequences.
  • Polyadenylation or transcription terminator sequences from known mammalian genes need to be incorporated into the vector.
  • An example of a terminator sequence is the polyadenylation sequence from the bovine growth hormone gene. Sequences for accurate splicing of the transcript may also be included.
  • An example of a splicing sequence is the VP1 intron from
  • The transformed cells are cultured by means well known in the art (Biochemical Methods in Cell Culture and Virology, Kuchler, R.J., Dowden, Hutchinson and Ross, Inc., (1977)).
  • The expressed polypeptides are isolated from cells grown as suspensions or as monolayers. The latter are recovered by well known mechanical, chemical or enzymatic means.
  • The phrase "operably linked" refers to linkage of a promoter upstream from a DNA sequence such that the promoter mediates transcription of the DNA sequence.
  • "Operably linked" also means that the isolated polynucleotide of the invention and an expression control sequence are situated within a vector or cell in such a way that the gene encoding the protein is expressed by a host cell which has been transformed (transfected) with the ligated polynucleotide/expression sequence.
  • The term "vector" refers to viral expression systems and autonomous self-replicating circular DNA (plasmids), and includes both expression and nonexpression plasmids.
  • Mammalian host cells include, for example, monkey COS cells, Chinese Hamster Ovary (CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3 cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK, HL-60, U937, HaK or Jurkat cells.
  • The protein may also be produced by operably linking the isolated polynucleotide of the invention to suitable control sequences in one or more insect expression vectors, and employing an insect expression system.
  • Materials and methods for baculovirus/insect cell expression systems are commercially available in kit form from, e.g., Invitrogen, San Diego, California, U.S.A. (the MaxBac® kit), and such methods are well known in the art, as described in Summers and Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by reference.
  • An insect cell capable of expressing a polynucleotide of the present invention is "transformed."
  • The protein of the invention may be prepared by culturing transformed host cells under culture conditions suitable to express the recombinant protein.
  • The polymorphic protein of the invention may also be expressed as a product of transgenic animals, e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized by somatic or germ cells containing a nucleotide sequence encoding the protein.
  • The protein may also be produced by known conventional chemical synthesis. Methods for constructing the proteins of the present invention by synthetic means are known to those skilled in the art.
  • The protein may be expressed as a fusion protein, such as those of maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX). Kits for expression and purification of such fusion proteins are commercially available from New England BioLabs (Beverly, MA), Pharmacia (Piscataway, NJ) and Invitrogen, respectively.
  • The protein can also be tagged with an epitope and subsequently purified by using a specific antibody directed to such epitope.
  • One such epitope ("Flag") is commercially available from Kodak (New Haven, CT).
  • Reverse-phase high performance liquid chromatography (RP-HPLC) steps employing hydrophobic RP-HPLC media (e.g., silica gel having pendant methyl or other aliphatic groups) can be employed to further purify the protein.
  • Some or all of the foregoing purification steps, in various combinations, can also be employed to provide a substantially homogeneous isolated recombinant protein.
  • The protein thus purified is substantially free of other mammalian proteins and is defined in accordance with the present invention as an "isolated protein."
  • The term "antibody" refers to immunoglobulin molecules and immunologically active portions of immunoglobulin molecules, i.e., molecules that contain an antigen binding site that specifically binds (immunoreacts with) an antigen, such as a polymorphic protein.
  • Such antibodies include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab and F(ab')2 fragments, and an Fab expression library.
  • Antibodies to human polymorphic proteins are disclosed.
  • The specified antibodies bind to a particular protein and do not bind in a significant amount to other proteins present in the sample.
  • Specific binding to an antibody under such conditions may require an antibody that is selected for its specificity for a particular protein.
  • A variety of immunoassay formats may be used to select antibodies specifically immunoreactive with a particular protein.
  • Solid-phase ELISA immunoassays are routinely used to select monoclonal antibodies specifically immunoreactive with a protein. See Harlow and Lane (1988) Antibodies, a Laboratory Manual, Cold Spring Harbor Publications, New York, for a description of immunoassay formats and conditions that can be used to determine specific immunoreactivity.
  • Antibodies that immunospecifically bind to polymorphic gene products but not to the corresponding prototypical or "wild-type" gene products are also provided.
  • Antibodies can be made by injecting mice or other animals with the variant gene product or synthetic peptide.
  • Monoclonal antibodies are screened as described, for example, in Harlow & Lane, Antibodies, A Laboratory Manual, Cold Spring Harbor Press, New York (1988); Goding, Monoclonal antibodies, Principles and Practice (2d ed.) Academic Press, New York (1986). Monoclonal antibodies are tested for specific immunoreactivity with a variant gene product and lack of immunoreactivity to the corresponding prototypical gene product.
  • An isolated polymorphic protein, or a portion or fragment thereof, can be used as an immunogen to generate antibodies that bind the polymorphic protein using standard techniques for polyclonal and monoclonal antibody preparation.
  • The full-length polymorphic protein can be used or, alternatively, the invention provides antigenic peptide fragments of polymorphic proteins for use as immunogens.
  • The antigenic peptide of a polymorphic protein of the invention comprises at least 8 amino acid residues of the amino acid sequence encompassing the polymorphic amino acid and encompasses an epitope of the polymorphic protein such that an antibody raised against the peptide forms a specific immune complex with the polymorphic protein.
  • The antigenic peptide comprises at least 10 amino acid residues, more preferably at least 15 amino acid residues, even more preferably at least 20 amino acid residues, and most preferably at least 30 amino acid residues.
  • Preferred epitopes encompassed by the antigenic peptide are regions of the polymorphic protein that are located on the surface of the protein, e.g., hydrophilic regions.
  • For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit, goat, mouse or other mammal) may be immunized by injection with the polymorphic protein.
  • An appropriate immunogenic preparation can contain, for example, recombinantly expressed polymorphic protein or a chemically synthesized polymorphic polypeptide. The preparation can further include an adjuvant.
  • Such techniques include, but are not limited to, the hybridoma technique (see Kohler & Milstein, 1975 Nature 256: 495-497); the trioma technique; the human B-cell hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: MONOCLONAL ANTIBODIES AND CANCER THERAPY, Alan R. Liss, Inc., pp. 77-96).
  • Human monoclonal antibodies may be utilized in the practice of the present invention and may be produced by using human hybridomas (see Cote, et al, 1983. Proc
  • Antibody fragments that contain the idiotypes to a polymorphic protein may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2 fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the treatment of the antibody molecule with papain and a reducing agent; and (iv) Fv fragments.
  • Methodologies for the screening of antibodies that possess the desired specificity include, but are not limited to, enzyme-linked immunosorbent assay (ELISA) and other immunologically-mediated techniques known within the art.
  • Anti-polymorphic protein antibodies may be used in methods known within the art relating to the detection, quantitation and/or cellular or tissue localization of a polymorphic protein (e.g., for use in measuring levels of the polymorphic protein within appropriate physiological samples, for use in diagnostic methods, for use in imaging the protein, and the like).
  • Antibodies for polymorphic proteins, or derivatives, fragments, analogs or homologs thereof, that contain the antibody-derived CDR are utilized as pharmacologically-active compounds in therapeutic applications intended to treat a pathology in a subject that arises from the presence of the cSNP allele in the subject.
  • An anti-polymorphic protein antibody (e.g., monoclonal antibody) can be used to isolate polymorphic proteins by a variety of immunochemical techniques, such as immunoaffinity chromatography or immunoprecipitation.
  • An anti-polymorphic protein antibody can facilitate the purification of natural polymorphic protein from cells and of recombinantly produced polymorphic proteins expressed in host cells.
  • An anti-polymorphic protein antibody can be used to detect polymorphic protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the polymorphic protein.
  • Anti-polymorphic antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance.
  • Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials.
  • Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase;
  • examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin;
  • examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin;
  • an example of a luminescent material is luminol;
  • examples of bioluminescent materials include luciferase, luciferin, and aequorin, and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

Abstract

The invention relates to nucleic acids containing single nucleotide polymorphisms identified for transcribed human sequences, as well as methods of using said nucleic acids.
PCT/US2000/035346 1999-12-27 2000-12-27 Nucleic acids containing single nucleotide polymorphisms and methods of use thereof WO2001048245A2 (fr)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP00990358A EP1282726A2 (fr) 1999-12-27 2000-12-27 Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
AU27394/01A AU2739401A (en) 1999-12-27 2000-12-27 Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
CA002395786A CA2395786A1 (fr) 1999-12-27 2000-12-27 Nucleic acids containing single nucleotide polymorphisms and methods of use thereof

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US47268899A 1999-12-27 1999-12-27
US09/472,688 1999-12-27

Publications (2)

Publication Number Publication Date
WO2001048245A2 true WO2001048245A2 (fr) 2001-07-05
WO2001048245A3 WO2001048245A3 (fr) 2002-11-28

Family

ID=23876541

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2000/035346 WO2001048245A2 (fr) 1999-12-27 2000-12-27 Nucleic acids containing single nucleotide polymorphisms and methods of use thereof

Country Status (4)

Country Link
EP (1) EP1282726A2 (fr)
AU (1) AU2739401A (fr)
CA (1) CA2395786A1 (fr)
WO (1) WO2001048245A2 (fr)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002070745A2 (fr) * 2001-03-02 2002-09-12 Renovo Limited Genetic screening
EP1461461A2 (fr) * 2001-12-10 2004-09-29 Isis Pharmaceuticals, Inc. Antisense modulation of CD81 expression
WO2006053955A2 (fr) * 2004-11-19 2006-05-26 Oy Jurilab Ltd Method and kit for detecting a risk of essential arterial hypertension
US7115374B2 (en) 2002-10-16 2006-10-03 Gen-Probe Incorporated Compositions and methods for detecting West Nile virus
US7304129B2 (en) * 2000-06-16 2007-12-04 Imperial Innovations Limited Peptides that stimulate cell survival and axon regeneration
US7871984B2 (en) * 2003-04-23 2011-01-18 Yukio Sato Methylated CpG polynucleotide
US7927840B2 (en) 2006-09-11 2011-04-19 Gen Probe Incorporated Method for detecting West Nile Virus nucleic acids in the 3′ non-coding region

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20200082618A (ko) * 2018-12-31 2020-07-08 주식회사 폴루스 Ramp tag for insulin overexpression and method for producing insulin using the same

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0785280A2 (fr) * 1995-11-29 1997-07-23 Affymetrix, Inc. (a California Corporation) Polymorphism detection
WO1998018967A1 (fr) * 1996-10-28 1998-05-07 Affymetrix, Inc. Polymorphisms in the glucose-6-phosphate dehydrogenase locus
WO1998020165A2 (fr) * 1996-11-06 1998-05-14 Whitehead Institute For Biomedical Research Biallelic markers
WO1998021316A1 (fr) * 1996-11-15 1998-05-22 The New York Blood Center, Inc. Method for producing monoclonal antibodies using polymorphic transgenic animals
WO1998030717A2 (fr) * 1996-12-02 1998-07-16 Biocem S.A. Plant sequences comprising a polymorphic site and use thereof
WO1998038846A2 (fr) * 1997-03-07 1998-09-11 Affymetrix, Inc. Genetic compositions and methods

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
DATABASE EMBL [Online] European Molecular Biology Laboratory; Accession number AC: AQ781771, August 1999 (1999-08) XP002175438 -& MAHAIRAS G G ET AL.: "Sequence-tagged connectors: A sequence approach to mapping and scanning the human genome" PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES USA, vol. 96, August 1999 (1999-08), pages 9739-9744, XP002175436 *
DATABASE EMBL [Online] European Molecular Biology Laboratory; Accession number AC: J05451, 1990 XP002175439 -& MAEDA M ET AL.: "Human gastric (H+ + K+)-ATPase gene. Similarity to (Na+ + K+)-ATPase genes in exon/intron organization but difference in control region" THE JOURNAL OF BIOLOGICAL CHEMISTRY, vol. 265, no. 16, 1990, pages 9027-9032, XP002175437 *
FAN J ET AL: "Genetic mapping: Finding and analyzing single-nucleotide polymorphisms with high-density DNA arrays" AMERICAN JOURNAL OF HUMAN GENETICS, UNIVERSITY OF CHICAGO PRESS, CHICAGO, US, vol. 61, no. 4, SUPPL, 1 October 1997 (1997-10-01), page 1601 XP002089397 ISSN: 0002-9297 *
WANG D G ET AL: "Large-scale identification, mapping, and genotyping of single-nucleotide polymorphisms in the human genome" SCIENCE, AMERICAN ASSOCIATION FOR THE ADVANCEMENT OF SCIENCE, US, vol. 280, 1998, pages 1077-1082, XP002089398 ISSN: 0036-8075 *
XIONG M AND JIN L: "Biallelic markers in genetics studies of human diseases: Their power, accuracy, and density in population-based linkage analysis" AMERICAN JOURNAL OF HUMAN GENETICS, UNIVERSITY OF CHICAGO PRESS, CHICAGO, US, vol. 61, no. 4, SUPPL, 1997, page 1759 XP002119235 ISSN: 0002-9297 *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7304129B2 (en) * 2000-06-16 2007-12-04 Imperial Innovations Limited Peptides that stimulate cell survival and axon regeneration
WO2002070745A2 (fr) * 2001-03-02 2002-09-12 Renovo Limited Genetic screening
WO2002070745A3 (fr) * 2001-03-02 2003-05-22 Renovo Ltd Genetic screening
US7332305B2 (en) 2001-03-02 2008-02-19 Renovo Limited Genetic polymorphism in the Zf9 gene linked to inappropriate scarring or fibrosis
EP1461461A2 (fr) * 2001-12-10 2004-09-29 Isis Pharmaceuticals, Inc. Antisense modulation of CD81 expression
EP1461461A4 (fr) * 2001-12-10 2005-03-16 Isis Pharmaceuticals Inc Antisense modulation of CD81 expression
US7115374B2 (en) 2002-10-16 2006-10-03 Gen-Probe Incorporated Compositions and methods for detecting West Nile virus
US7732169B2 (en) 2002-10-16 2010-06-08 Gen-Probe Incorporated Method for detecting West Nile virus nucleic acids in the 5′ non-coding/capsid region
US8759003B2 (en) 2002-10-16 2014-06-24 Gen-Probe Incorporated Detection of West Nile virus nucleic acids in the viral 3′ non-coding region
US9580762B2 (en) 2002-10-16 2017-02-28 Gen-Probe Incorporated Detection of West Nile virus nucleic acids in the viral 3′ non-coding region
US10781495B2 (en) 2002-10-16 2020-09-22 Gen-Probe Incorporated Detection of West Nile virus nucleic acids in the viral 3′ non-coding region
US7871984B2 (en) * 2003-04-23 2011-01-18 Yukio Sato Methylated CpG polynucleotide
WO2006053955A3 (fr) * 2004-11-19 2006-08-31 Jurilab Ltd Oy Method and kit for detecting a risk of essential arterial hypertension
WO2006053955A2 (fr) * 2004-11-19 2006-05-26 Oy Jurilab Ltd Method and kit for detecting a risk of essential arterial hypertension
US7927840B2 (en) 2006-09-11 2011-04-19 Gen Probe Incorporated Method for detecting West Nile Virus nucleic acids in the 3′ non-coding region

Also Published As

Publication number Publication date
WO2001048245A3 (fr) 2002-11-28
AU2739401A (en) 2001-07-09
EP1282726A2 (fr) 2003-02-12
CA2395786A1 (fr) 2001-07-05

Similar Documents

Publication Publication Date Title
WO2001047944A2 (fr) Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
EP1131467A2 (fr) Nucleic acids containing single nucleotide polymorphisms and uses of these nucleic acids
EP0812922A2 (fr) Polymorphisms in human mitochondrial nucleic acid
WO2001048245A2 (fr) Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
WO2001040521A2 (fr) Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
US20020155446A1 (en) Very low density lipoprotein receptor polymorphisms and uses therefor
US20040235041A1 (en) Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
EP1250456A2 (fr) Nucleic acids containing single nucleotide polymorphisms and associated method of use
AU1915700A (en) Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
AU2004203849B2 (en) Nucleic Acids Containing Single Nucleotide Polymorphisms and Methods of Use Thereof
US20030008301A1 (en) Association between schizophrenia and a two-marker haplotype near PILB gene
US20040235026A1 (en) Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
WO2001047942A2 (fr) Nucleic acids containing single nucleotide polymorphisms and methods of using these acids
US20030232365A1 (en) BDNF polymorphisms and association with bipolar disorder
US20030224413A1 (en) Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
US20030009016A1 (en) Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
EP1244688A1 (fr) Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
US7339049B1 (en) Polymorphisms in human mitochondrial DNA
WO2001090161A2 (fr) Nucleic acids containing single nucleotide polymorphisms and methods of use thereof
CA2294572A1 (fr) Genetic compositions and related methods
US6913885B2 (en) Association of dopamine beta-hydroxylase polymorphisms with bipolar disorder
WO2003087309A2 (fr) BDNF polymorphisms and association with bipolar disorder

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101)
WWE WIPO information: entry into national phase

Ref document number: 2395786

Country of ref document: CA

WWE WIPO information: entry into national phase

Ref document number: 2000990358

Country of ref document: EP

WWE WIPO information: entry into national phase

Ref document number: 27394/01

Country of ref document: AU

AK Designated states

Kind code of ref document: A3

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWP WIPO information: published in national office

Ref document number: 2000990358

Country of ref document: EP

NENP Non-entry into the national phase in:

Ref country code: JP

WWW WIPO information: withdrawn in national office

Ref document number: 2000990358

Country of ref document: EP