WO1998044123A2 - O-LINKED GlcNAc TRANSFERASE (OGT): CLONING, MOLECULAR EXPRESSION, AND METHODS OF USE

Publication number
WO1998044123A2
Authority
WO
WIPO (PCT)
Application number
PCT/US1998/006101
Other versions
WO1998044123A3 (en)
Inventor
John A. Hanover
William Lubas
Original Assignee
The Government Of The United States Of America Represented By The Secretary Of The Department Of Health And Human Services
Application filed by The Government Of The United States Of America Represented By The Secretary Of The Department Of Health And Human Services
Priority to AU69425/98A (AU6942598A)
Publication of WO1998044123A2
Publication of WO1998044123A3

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10: Transferases (2.)
    • C12N9/1048: Glycosyltransferases (2.4)
    • C12N9/1051: Hexosyltransferases (2.4.1)

Definitions

  • O-LINKED GlcNAc TRANSFERASE (OGT): CLONING, MOLECULAR EXPRESSION, AND METHODS OF USE
  • This invention relates to a post-translational modification of a protein involving the addition of N-acetylglucosamine in O-glycosidic linkage to serine or threonine residues of cytoplasmic and nuclear proteins. It is believed that such modification plays a significant role in regulating the activity of proteins involved in transcriptional and translational processes.
  • This invention provides an enzyme catalyzing the formation of these derivatives, uridine diphospho-N-acetylglucosamine:polypeptide β-N-acetylglucosaminyl transferase (O-GlcNAc transferase, OGT), and a nucleic acid encoding the enzyme.
  • O-linked GlcNAc modifies a large number of polypeptides in multimeric structures including RNA polymerase II transcription complexes and p67/eIF-2- initiation factor in the translation machinery.
  • UDP-GlcNAc: uridine diphospho-N-acetylglucosamine
  • The enzyme has a much lower Km with respect to this substrate than is usually observed for glycosyltransferases.
  • O-linked GlcNAc addition is analogous to protein phosphorylation.
  • the enzyme has been shown to recognize a large number of phosphoproteins, some of which play a direct role in signal transduction.
  • phosphorylation and glycosylation seem to be mutually exclusive.
  • The glycosylated enzyme is necessary for assembly of the preinitiation complex; subsequent deglycosylation and phosphorylation are necessary for transition to the elongation complex (Kelly et al. (1993) J Biol Chem 268, 10416-10424).
  • other substrates such as neurofilaments (Dong et al. (1996) J Biol Chem 271, 20845-20852) or the nuclear pore proteins nup62, nup97, and nup200 (Macaulay et al.
  • The addition of GlcNAc to proteins in the cytoplasm and nucleus is also highly regulated. Since phosphorylation and glycosylation both target similar serine or threonine residues, it is possible that the two processes compete directly for sites, or that each alters the substrate specificity of nearby sites through steric or electrostatic effects.
  • The hexosamine biosynthetic pathway is responsible for the synthesis of cytoplasmic UDP-GlcNAc utilized by OGT. Normally 2-3% of incoming glucose fluxes through this pathway (Marshall et al. (1991) J Biol Chem 266, 4706-4712). Increased glucose flux through the hexosamine biosynthetic pathway, caused by hyperglycemia, has been shown to mediate insulin resistance (Marshall et al. (1991); Rossetti et al. (1995) J Clin Invest 96, 132-140; Daniels et al. (1993) Mol Endocrinol 7, 1041-1048; Crook et al.
  • The hexosamine biosynthetic pathway, by controlling intracellular UDP-GlcNAc concentrations, may be acting in peripheral tissues as a glucose sensor, which is reflected in substrate-driven O-linked GlcNAc modification of intracellular proteins by OGT. Glucosamine administration has been shown to impair insulin secretion from the pancreas in response to glucose both in vitro and in vivo (Balkan et al. (1994) Diabetes 43, 1173-1179).
  • O-linked GlcNAc modifies many phosphoproteins which are components of multimeric complexes.
  • the sites modified by O-linked GlcNAc often resemble phosphorylation sites, leading to the suggestion that the modifications may compete for substrate in these polypeptides (Hart et al. (1995)).
  • the sites modified by OGT resemble those of the glycogen synthase kinases (GSK, such as GSK-3 or casein kinase II) and microtubule associated protein (MAP) kinase very closely.
  • insulin activates the MAP kinase cascade, inhibiting GSK-3 inhibition of glycogen synthase, the rate limiting enzyme in glycogen synthesis (Chou et al. (1995) Proc Natl Acad Sci U S A 92, 4417-4421; Woodgett (1991) Trends Biochem Sci 16, 177-181).
  • GSK-3 also modifies the oncogene c-jun and negatively regulates its transactivating potential in vivo.
  • Another oncogene, c-myc, is both modified by O-linked GlcNAc and phosphorylated by GSK-3 in a domain required for transcriptional activation (Woodgett (1991); Stambolic et al. (1994) Biochem J 303 (Pt 3), 701-704; Plyte et al. (1992) Biochim Biophys Acta 1114, 147-162).
  • Glucose-responsive elements from several mammalian genes have been identified and include myc-like response elements (Towle (1995) J Biol Chem 270, 23235-23238).
  • O-linked GlcNAc addition by OGT and phosphorylation by kinases such as GSK-3 may have as a common denominator their involvement in transcriptional regulation of glucose metabolism.
  • Additional oncogenes which may serve as substrates for OGT include c-fos, c-jun, v-erb A, and the tumor suppressor Rb.
  • The level of GlcNAc in a cell, and its role as a substrate for OGT, suggest that O-linked GlcNAc has a role in modulating or regulating the activity of oncogenes, or inhibiting their functions, in tumorigenesis and in tumor suppression.
  • Experimental insulin-dependent diabetes may be induced in animals by administering streptozotocin. It is known that this agent destroys the β cells in the islets of Langerhans, where insulin secretion occurs. It is possible that streptozotocin actually affects OGT, whether by interfering with its synthesis, or by inducing inhibition of its activity, or it may inhibit the activity of the hexosaminidase.
  • OGT activity is implicated in the pathogenesis of Alzheimer's disease.
  • Two proteins involved in this disease, tau and amyloid-β protein, are both glycosylated by OGT.
  • Griffith et al. ((1995) Biochem. Biophys. Res. Commun. 213, 424-431) have shown that O-GlcNAc glycosylation is upregulated in the brains of patients with Alzheimer's disease.
  • Although OGT has been purified from several different sources (Lubas et al. (1995), Haltiwanger et al. (1992)), it has not been molecularly cloned. There is therefore a need to clone a gene for OGT, especially a gene originating in humans. There further is a need for expressing the gene to produce a protein having O-GlcNAc transferase activity. The need also exists for employing an O-GlcNAc transferase protein in studies designed to identify inhibitors of the O-GlcNAc transferase activity. Such inhibitors would have strong potential as therapeutic compounds in the treatment of diabetes mellitus, and potentially in the treatment of tumor-derived disease and Alzheimer's disease as well.
  • This invention provides an isolated DNA molecule that includes a sequence encoding a protein exhibiting uridine diphospho-N-acetylglucosamine:polypeptide β-N-acetylglucosaminyl transferase (O-linked GlcNAc transferase, OGT) activity.
  • The invention also provides a nucleic acid vector including the isolated DNA encoding OGT; the vector may also include a regulatory nucleotide sequence operably positioned with respect to the DNA sequence encoding an O-linked GlcNAc transferase such that, when the vector is introduced into a suitable host cell and the regulatory sequence is triggered, the protein is expressed.
  • The nucleic acid has the sequence of human OGT provided in SEQ ID NO:1, or of OGT from Caenorhabditis elegans, provided in SEQ ID NO:3.
  • the present invention additionally provides an isolated protein exhibiting O-linked GlcNAc transferase activity.
  • The protein has the amino acid sequence of an O-linked GlcNAc transferase and is a human O-linked GlcNAc transferase, and in a further important aspect has the amino acid sequence given by SEQ ID NO:2.
  • the protein has the amino acid sequence of a C. elegans O-linked GlcNAc transferase, and in an additional significant aspect the protein has the amino acid sequence given by SEQ ID NO:4.
  • the invention additionally provides host cells containing a vector including the DNA encoding OGT and which express a protein with OGT activity.
  • the host cells may also harbor cellular components responsive to the regulatory nucleotide sequence contained in a vector that is operably linked to the OGT coding sequence.
  • the regulatory sequence is such that the protein encoded by the vector is expressed in the host when the cells are cultured under suitable conditions that trigger the regulatory sequence.
  • the host cells are capable of expressing the DNA encoding OGT.
  • the host cells express the human O-linked GlcNAc transferase protein whose amino acid sequence is given by SEQ ID NO:2, or the O-linked GlcNAc transferase protein from C. elegans having the amino acid sequence given by SEQ ID NO:4.
  • the present invention provides a method of expressing a protein exhibiting O-linked GlcNAc transferase activity and having the amino acid sequence of an O-linked GlcNAc transferase.
  • the method includes the step of culturing host cells harboring a vector that includes a DNA encoding OGT under conditions that promote growth of the cells.
  • The vector may also include a control element operably linked to the OGT coding sequence that, under suitable conditions of growth, induces expression of the O-linked GlcNAc transferase gene, thereby expressing the protein.
  • the cells also may contain cellular components that interact with the control element under suitable conditions, resulting in expression of the DNA sequence.
  • the host cells contain a vector that includes the DNA sequence encoding either the human O-linked GlcNAc transferase protein whose sequence is given by SEQ ID NO:2, or the O-linked GlcNAc transferase protein from C. elegans having the amino acid sequence given by SEQ ID NO:4.
  • a method of identifying an inhibitor of O-linked GlcNAc transferase comprises the steps of
  • the protein having O-linked GlcNAc transferase activity in the first and second tests is a human O-linked GlcNAc transferase and has the amino acid sequence given by SEQ ID NO:2, or is a C. elegans O-linked GlcNAc transferase whose amino acid sequence is given by SEQ ID NO:4.
  • the glycosylation target protein is immobilized on a surface.
  • a plurality of samples is assessed simultaneously for the observation of inhibition by candidate inhibitors.
  • Further embodiments of the invention relate to methods of assessing the predisposition toward type II diabetes in a patient, methods for assessing predisposition toward Alzheimer's disease in a patient, and methods for assessing the metastatic potential of a tumor. These methods involve obtaining a clinical sample having OGT activity, assaying for the level of OGT activity present in the sample, and comparing the level obtained with levels found in samples from healthy individuals and from patients known to be diseased. Using these comparisons, evaluation of disease and non-disease states can be made.
  • FIG. 1 Sodium Dodecyl Sulfate-Polyacrylamide Gel Electrophoresis of Purified Rabbit O-GlcNAc Transferase. Analysis of the purified 110 kDa OGT from rabbit blood by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (lane 2), compared to the standard proteins myosin (200 kDa), β-galactosidase (116.3 kDa), phosphorylase b (97.4 kDa), bovine serum albumin (66.3 kDa), glutamic dehydrogenase (55.4 kDa), lactate dehydrogenase (36.5 kDa) and carbonic anhydrase (31 kDa) in lane 1.
  • FIG. 2 Comparison of Rabbit Tryptic Peptides with the Caenorhabditis elegans Gene Encoding OGT, and with Human Expressed Sequence Tags and cDNA Sequences Encoding OGT.
  • Panel A A schematic of the 7.3 kb C. elegans gene K04G7.3 is shown with predicted intron and exon junctions. The third and fourth exons are marked with hatched boxes to show that they were not part of the isolated full length C. elegans clone ZAP-CeOGT (Genbank accession number U77412). The location of the sequence predicted by the C. elegans expressed sequence tag clone ykl3c2 is shown.
  • The partial amino acid sequences of the two tryptic peptides isolated from the rabbit OGT, XVTLDPNFLDAYINLGNVLK and XXXSQLT(C)LG(C)LELIAK, are compared to peptide sequences translated from clone Lv4F and two human expressed sequence tags (Genbank accession numbers R75943 and R76782).
  • FIG. 3 Protein Sequence Comparison of C. elegans and Human O-GlcNAc Transferase Deduced from the Isolated cDNAs.
  • Panel A Sequence alignment of C. elegans and human O-GlcNAc transferases. Identical amino acid matches are boxed and shaded. Similar amino acids are shaded only.
  • the peptide sequences (indicated "Peptide seq" in the Figure) corresponding to the partial amino acid sequence of the tryptic peptides XVTLDPNFLDAYINLGNVLK and XXXSQLT(C)LG(C)LELIAK are underlined.
  • the cysteine residues are tentative assignments because they could not be distinguished from glutamine residues which comigrate in the amino acid profile.
  • Panel B Schematic diagram showing the relative sizes of the C. elegans and human OGT, as well as the location of tetratricopeptide repeat (TPR) sequence repeats and putative nuclear localization signal (NLS).
  • FIG. 4 Immunodetection of O-GlcNAc Transferase in Transgenic C. elegans Lines.
  • Panel A Immunoblot of phosphate buffered saline extracts from transgenic C. elegans embryos which were either uninduced or induced to overexpress OGT by heat shock. The primary antiserum used was a guinea pig anti-OGT prepared against recombinant OGT as described in Example 6. Under the conditions employed, only the OGT produced by overexpressing lines was detected. In other experiments, the wild type enzyme was also detected, but at greatly reduced levels.
  • Panel B (upper panel): Localization of OGT in wild type C. elegans embryos.
  • Indirect immunofluorescence was performed using antiserum raised against recombinant OGT.
  • A fluorescein isothiocyanate (FITC)-labelled goat anti-guinea pig antibody was used for detection. Localization of the nuclei in these embryos was carried out using bis-benzamide and UV epifluorescence optics. OGT was found both within the nucleus and in a perinuclear location.
  • Panel B (left lower panel): Overexpression of recombinant OGT after 2-3 hours of heat shock in 3-fold stage C. elegans embryos measured by indirect immunofluorescence using antisera raised against recombinant OGT (Anti-OGT).
  • a FITC-labelled goat anti-guinea pig antibody was used for detection.
  • Panel B (right lower panel): Nuclear localization using propidium iodide (PI) to stain the same embryo shown in panel B, lower left. Arrow heads are used to point to corresponding nuclei in lower panels.
  • FIG. 5 Elevated O-GlcNAc Transferase Activity Induced by Overexpression of Human O-GlcNAc Transferase cDNA in Transfected HeLa Cells.
  • HeLa cells were plated at 100,000 cells per well and transfected using either lipofection (0.1 µg of DNA) or electroporation (0.05 µg of DNA).
  • Cells were transfected with plasmid containing the human OGT clone pECE-Lv4F or with the control plasmid pECE alone, harvested at 24 hours and assayed for O-linked GlcNAc transferase activity as described above.
  • The data are expressed in terms of the fold enrichment observed in OGT specific activity relative to untransfected HeLa cells.
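The fold-enrichment figure described above can be illustrated with a short calculation. This is a minimal sketch, assuming specific activity is defined as activity per mg of total protein; the function names and all numeric values are hypothetical and are not taken from the patent.

```python
def specific_activity(activity_units: float, protein_mg: float) -> float:
    """Enzyme activity normalised to the amount of protein assayed (assumed definition)."""
    if protein_mg <= 0:
        raise ValueError("protein_mg must be positive")
    return activity_units / protein_mg


def fold_enrichment(sample_sa: float, reference_sa: float) -> float:
    """Ratio of a sample's specific activity to that of the reference cells."""
    if reference_sa <= 0:
        raise ValueError("reference specific activity must be positive")
    return sample_sa / reference_sa


if __name__ == "__main__":
    # Hypothetical readings: (activity units, mg protein) for each extract.
    untransfected = specific_activity(1200.0, 0.5)
    transfected = specific_activity(6000.0, 0.5)
    print(f"fold enrichment: {fold_enrichment(transfected, untransfected):.1f}")
```

Normalising to the untransfected cells, rather than to the vector-only control, follows the wording of the figure legend; the vector-only sample can be passed through the same function for comparison.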
  • Genomic DNA (3 mg/lane) was digested with EcoRI, separated by electrophoresis on 0.7% agarose gel and transferred onto nylon membranes. The blot was probed with radiolabelled full length human liver clone Lv4F and exposed to Kodak Bio-Max MR film at -70°C for 3 days. The location of 1 kb ladder standards is shown to the left.
  • FIG. 7 Northern Blot Analysis of Human Tissues. Poly A RNA (2 mg) from a variety of adult human tissues was probed with radiolabelled full length human liver clone Lv4F (GlcNAc-T Probe Lv4F; top panel) and exposed to Kodak Bio-Max MR film at -70°C for 3 days. The blot was stripped according to the manufacturer's protocol and rescreened with a human β-actin gene probe (β-Actin Probe; bottom panel). The location of standards (kb) is shown to the left.
  • This invention provides the first molecular characterization of a protein having O-GlcNAc transferase activity.
  • the OGT was purified using recombinant rat nuclear pore protein, nup62, as substrate.
  • the enzyme isolated from rabbit blood has an apparent molecular weight of 110 kDa. It was subjected to trypsin digestion, high pressure liquid chromatography separation of the tryptic peptides, and microsequencing.
  • The partially sequenced enzyme was found to be nearly identical to a protein encoded in an open reading frame in the C. elegans gene, K04G7.3, on chromosome III (Figure 2A).
  • A "vector" relates to a nucleic acid which functions to incorporate a particular nucleic acid segment, such as a sequence encoding a particular gene, into a cell. In most cases, the cell does not naturally contain the gene, so that the particular gene being incorporated is a heterologous gene.
  • a vector may include additional functional elements that direct and regulate transcription of the inserted gene or fragment.
  • the regulatory sequence is operably positioned with respect to the protein-encoding sequence such that, when the vector is introduced into a suitable host cell and the regulatory sequence is triggered, the protein is expressed.
  • Regulatory sequences may include, by way of non-limiting example, a promoter, regions upstream or downstream of the promoter such as enhancers that may regulate the transcriptional activity of the promoter, and an origin of replication.
  • a vector may additionally include appropriate restriction sites, antibiotic resistance or other markers for selection of vector containing cells, RNA splice junctions, a transcription termination region, and so forth.
  • A "host cell" is a prokaryotic or eukaryotic cell harboring a nucleic acid vector coding for one or more gene products.
  • a host cell harbors a foreign or heterologous substance, the vector, which is not naturally or endogenously found in it as a component.
  • a suitable host cell is one which has the capability for the biosynthesis of the gene products as a consequence of the introduction of the vector.
  • the host cell may contain components responsive to the regulatory nucleotide sequence of the vector, such that the protein encoded by the vector is expressed in the host when the cells are cultured under suitable conditions that trigger the regulatory sequence.
  • When the host cell is cultured in vitro, it may be a prokaryote, a single-celled eukaryote, or a mammalian cell.
  • Promoters for prokaryotic hosts include, by way of non-limiting example, the lac, trp, or beta-lactamase promoters, the promoter system from phage lambda, or other phage promoters such as T4 or T7.
  • Promoters for mammalian cells include, by way of non-limiting example, expression control sequences, such as an origin of replication, an enhancer, and necessary information processing sites, such as ribosome binding sites, RNA splice sites, polyadenylation sites, and transcriptional terminator sequences.
  • Expression control sequences are promoters derived from immunoglobulin genes, SV40, adenovirus, bovine papilloma virus, and so forth.
  • O-GlcNAc transferase may be isolated from mammalian sources by obtaining samples of tissue in which the enzyme activity is prevalent, and conducting purification procedures which rely on an assay of OGT activity in order to determine the presence of the protein and the preservation of activity.
  • Tissues that may be used to isolate the enzyme include formed elements of blood, and various organs such as liver, kidney, pancreas, lung, the central nervous system, and so forth. Tissues may also be derived from tumors. Procedures that may be employed in the purification include fractional precipitation, chromatography, centrifugation, and the like; such procedures are well known to workers of skill in protein chemistry and enzymology and are set forth, for example, in Deutscher, M.P.
  • An assay for OGT activity is based on assessing the ability of the enzyme to glycosylate a protein substrate. A detailed description of an assay is set forth in Lubas et al. (1995) which is incorporated herein by reference.
  • this assay involves binding purified nup62 to nitrocellulose membranes, incubating with radiolabeled UDP-GlcNAc and a protein sample whose OGT activity is to be determined, and assessing the amount of the radiolabel incorporated into the immobilized nup62.
  • Mammalian OGT, in general, may be employed to identify a gene expressing human OGT from the human genome, such as from human ESTs or from an appropriate human cDNA library.
  • the mammalian OGT obtained upon completion of the purification procedure described above is subjected to amino acid sequencing procedures. Sequencing is well known to skilled artisans in protein chemistry, and is described, for example, in Allen, G, Sequencing of Proteins and Peptides: Laboratory Techniques in Biochemistry and Molecular Biology, 2nd Ed., Elsevier Science Publ., Amsterdam, 1989. Using procedures such as these, a partial or complete amino acid sequence of the mammalian OGT may be obtained.
  • The amino acid sequence so obtained is then compared with amino acid sequences derived from either coding sequences of human ESTs or cDNAs from libraries whose sequences are known. Once the amino acid sequence is matched to a human nucleotide sequence, the corresponding human DNA sequence encoding the matched amino acid sequence is available.
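The comparison of a partial peptide sequence against translated nucleotide sequences can be sketched as a six-frame translation followed by a pattern search. This is an illustrative sketch only, not the procedure used by the inventors: it assumes the standard genetic code, treats each unassigned residue "X" in the peptide as matching any amino acid, and uses a short made-up DNA string in place of a real EST.

```python
import re

# Standard genetic code, built from the conventional TCAG ordering of bases.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate complete codons; codons with ambiguous bases become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


def six_frames(dna: str) -> dict:
    """Translate all three reading frames on both strands."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def find_peptide(peptide: str, dna: str) -> list:
    """Return (frame, position) pairs where the peptide matches a translation."""
    pattern = re.compile(
        "".join("." if aa == "X" else re.escape(aa) for aa in peptide.upper())
    )
    hits = []
    for frame, protein in six_frames(dna).items():
        for match in pattern.finditer(protein):
            hits.append((frame, match.start()))
    return hits


if __name__ == "__main__":
    # Hypothetical sequence: two filler bases, then codons for M-V-T-L-D-P-N-F.
    toy_est = "GG" + "ATGGTTACTCTTGATCCTAATTTT"
    # First eight residues of the rabbit tryptic peptide XVTLDPNFLDAYINLGNVLK.
    print(find_peptide("XVTLDPNF", toy_est))
```

Searching both strands matters because an EST may have been sequenced from either end of the cDNA insert. A real search would also tolerate mismatches between species, which exact pattern matching does not.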
  • the human nucleotide sequence then serves as a basis for preparing selective or unique oligonucleotide probes that may be used to isolate the entire gene from samples of human genomic DNA or human cDNA libraries.
  • the human gene sequence may be used to prepare an oligonucleotide primer pair for use in the polymerase chain reaction to identify and amplify the gene as found in a DNA sample from a human source.
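As a rough illustration of how a primer pair relates to a known target sequence, the sketch below takes the forward primer from the 5' end of the target and the reverse primer as the reverse complement of its 3' end. The primer length, the example sequence, and the use of the simple 2(A+T) + 4(G+C) rule of thumb for melting temperature are assumptions made for illustration; practical primer design weighs many further criteria.

```python
def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def primer_pair(target: str, length: int = 20) -> tuple:
    """Return (forward, reverse) primers flanking the whole target, both 5'->3'."""
    target = target.upper()
    if len(target) < 2 * length:
        raise ValueError("target is too short for non-overlapping primers")
    forward = target[:length]
    reverse = reverse_complement(target[-length:])
    return forward, reverse


def rough_tm_celsius(primer: str) -> int:
    """Rule-of-thumb melting temperature for short oligonucleotides."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    # Hypothetical target; a real design would start from the cloned OGT sequence.
    target = "ATGGTTACTCTTGATCCTAATTTT"
    fwd, rev = primer_pair(target, length=6)
    print(fwd, rough_tm_celsius(fwd))
    print(rev, rough_tm_celsius(rev))
```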
  • the identified, or amplified, human DNA that includes the gene for OGT may then be incorporated into a plasmid vector, viral vector, or similar vector, for expression in a suitable host cell.
  • Identification of the OGT gene in other organisms, whether mammalian, non-mammalian vertebrate, or non-vertebrate, and expression of the encoded protein, may be done in a fashion similar to that outlined above.
  • O-linked GlcNAc transferase is the enzyme involved in the monoglycosidic modification of several proteins whose activity may be modulated in different physiological or pathological states.
  • These proteins include the oncogenes c-jun, c-myc, c-fos, v-erb A, and the tumor suppressor Rb; glucose-responsive elements that include myc-like response elements; and certain nuclear pore proteins including nup62.
  • O-glycosylated sites derivatized by N-acetylglucosamine also are implicated in modulation of insulin secretion.
  • OGT activity may also be involved in the pathogenesis of Alzheimer's disease.
  • Tau and amyloid-β protein are both glycosylated by OGT.
  • Therapeutic approaches to modulating the glycosylation levels of the proteins involved in these various diseases provide the potential for ameliorating the pathologies ascribable to this process.
  • Identifying inhibitors of O-linked GlcNAc transferase offers the prospect of obtaining substances having potential as therapeutic agents in these pathologies.
  • Enzymological assays that screen candidate substances in order to identify inhibitors of the enzyme afford a first step toward the development of such therapies. Any assay that is easy to implement and permits an assessment of the glycosylating activity of OGT suffices to accomplish this objective.
  • An example of such an assay is one that includes the steps of (i) providing a sample comprising a protein which is a glycosylation target of OGT activity, (ii) contacting the sample with a solution comprising a substance that is a candidate for being an inhibitor, and further comprising UDP-GlcNAc and a protein having O-linked GlcNAc transferase activity, to generate a first test, (iii) determining the O-linked GlcNAc transferase activity in the first test, and (iv) evaluating whether, by comparing the activity determined in the first test with the activity determined in a second test in which the solution lacks the candidate substance, the O-linked GlcNAc transferase activity in the first test has been inhibited. The observation of inhibition then identifies the substance as an inhibitor of O-linked GlcNAc transferase.
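The comparison in step (iv) can be expressed as a percent-inhibition calculation. The sketch below is illustrative: the 50% cutoff for calling a substance an inhibitor is an assumed value, since the text does not specify one, and the activity readings are hypothetical.

```python
def percent_inhibition(activity_with_candidate: float,
                       activity_without_candidate: float) -> float:
    """Percent loss of activity in the first test relative to the second test."""
    if activity_without_candidate <= 0:
        raise ValueError("control activity must be positive")
    return 100.0 * (1.0 - activity_with_candidate / activity_without_candidate)


def is_inhibitor(activity_with_candidate: float,
                 activity_without_candidate: float,
                 threshold_pct: float = 50.0) -> bool:
    """Flag a candidate whose inhibition meets an assumed cutoff."""
    inhibition = percent_inhibition(activity_with_candidate, activity_without_candidate)
    return inhibition >= threshold_pct


if __name__ == "__main__":
    # Hypothetical readings, e.g. radiolabel incorporated into the target protein.
    first_test = 250.0    # solution containing the candidate substance
    second_test = 1000.0  # same solution without the candidate
    print(percent_inhibition(first_test, second_test))
    print(is_inhibitor(first_test, second_test))
```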
  • Assays that incorporate solid phase components in order to isolate the detected analyte offer particular advantages in implementation.
  • a significant assay of the invention immobilizes the glycosylation target on a localized region of a surface.
  • the target may be a natural substrate for the enzyme, such as nup62, nup97, or nup200 (Macaulay et al., 1995; Lubas et al., 1995), or it may be a synthetic peptide substrate (Lubas et al., 1995).
  • a solution containing the candidate inhibitor substance in varying concentrations, active OGT, and the glycosylating substrate, UDP-GlcNAc, is brought into contact with the localized region of the surface for a time sufficient for, and under conditions of buffering and temperature that favor, the OGT-catalyzed incorporation of GlcNAc into the immobilized protein or peptide substrate.
  • a solution which lacks the candidate serves as a positive control for the absence of inhibition.
  • the amount of GlcNAc transferred is determined.
  • Techniques for determining the extent of glycosylation include labeling the GlcNAc moiety of the UDP-GlcNAc substrate, such as with a radioactive label, and evaluating any radioactivity incorporated.
  • An alternative technique may be an immunoassay using an antibody raised specifically against the GlcNAc-derivatized protein or peptide.
  • the antibody serves as an indicator of the amount of GlcNAc transferred in the experiment, and the amount of antibody bound to the localized region of the surface is determined. Additional, functionally equivalent assays may also be devised for the purposes of the invention.
  • these solid phase assays are readily adaptable to high throughput, multiple sample, repetitive assays.
  • Repetitive assays are readily implemented, for example, using a multiwell microtiter plate or similar device. Additional, functionally equivalent formats for conducting repetitive assays, such as micro-binding arrays, are also contemplated within the scope of the assay method of the invention.
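A multiwell screen of this kind might be analysed along the following lines. This is a sketch under stated assumptions: the `Well` record, the convention that control wells carry no candidate, the micromolar concentration unit, and the 50% cutoff are all hypothetical choices made for illustration.

```python
from statistics import mean
from typing import NamedTuple, Optional


class Well(NamedTuple):
    well_id: str
    candidate: Optional[str]   # None marks a control well without candidate
    concentration_um: float    # candidate concentration (assumed unit: micromolar)
    signal: float              # e.g. label incorporated into the immobilized target


def screen_plate(wells, threshold_pct: float = 50.0) -> dict:
    """Map each active candidate to the lowest concentration reaching the cutoff."""
    controls = [w.signal for w in wells if w.candidate is None]
    if not controls:
        raise ValueError("plate needs at least one control well")
    baseline = mean(controls)
    if baseline <= 0:
        raise ValueError("control signal must be positive")

    # Collect (concentration, percent inhibition) pairs per candidate.
    series = {}
    for w in wells:
        if w.candidate is None:
            continue
        inhibition = 100.0 * (1.0 - w.signal / baseline)
        series.setdefault(w.candidate, []).append((w.concentration_um, inhibition))

    hits = {}
    for name, points in series.items():
        active = [conc for conc, inh in sorted(points) if inh >= threshold_pct]
        if active:
            hits[name] = active[0]
    return hits


if __name__ == "__main__":
    plate = [
        Well("A1", None, 0.0, 1000.0),
        Well("A2", None, 0.0, 1000.0),
        Well("B1", "cmpd-1", 1.0, 900.0),
        Well("B2", "cmpd-1", 10.0, 400.0),
        Well("B3", "cmpd-1", 100.0, 100.0),
        Well("C1", "cmpd-2", 1.0, 980.0),
        Well("C2", "cmpd-2", 10.0, 950.0),
        Well("C3", "cmpd-2", 100.0, 900.0),
    ]
    print(screen_plate(plate))
```

Averaging several control wells per plate gives a steadier baseline than a single well; the candidate names and signal values here are placeholders.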
  • the Northern analysis for the distribution of the OGT gene among human tissues described in the Examples distinguishes four distinct OGT transcripts at 9.3, 7.9, 6.3, and 4.4 kb.
  • the signal in the pancreas is over 12 fold higher than seen in the lung and kidney. There also appears to be a tissue-specific distribution of these different bands.
  • the largest signals at 9.3 and 7.9 kb are most abundant in the pancreas and placenta while the 6.3 kb transcript is the major signal seen in the other tissues. It is not known at this time if the multiple transcripts represent the transcription of different genes or alternative splicing and processing of the same gene.
  • the large size of the mRNA transcripts compared to the isolated clones and open reading frame of the gene presumably corresponds to extensive 5' and 3' untranslated sequences. This has been observed for a number of glycosyltransferases (Homa et al. (1993) J Biol Chem 268, 12609-12616). The role of these large regions of untranslated mRNA is not known but it may be important in regulation of these genes.
  • the human clones identified here also show variation in the polyadenylation signal, which could partially explain the different size of the messages.
  • the hexosamine biosynthetic pathway is responsible for the synthesis of cytoplasmic UDP-GlcNAc utilized by OGT.
  • The hexosamine biosynthetic pathway, by controlling intracellular UDP-GlcNAc concentrations, may be acting in peripheral tissues as a glucose sensor, which is reflected in substrate-driven O-linked GlcNAc modification of intracellular proteins by OGT. Glucosamine administration has been shown to impair insulin secretion from the pancreas in response to glucose both in vitro and in vivo (Balkan et al. (1994) Diabetes 43, 1173-1179).
  • High serum glucose levels seen in patients with type II diabetes have been shown, via shunting of a portion of the excess glucose into the hexosamine biosynthetic pathway, to result in increased UDP-GlcNAc and increased glycosylation of cellular proteins by this enzyme.
  • the level of expression of OGT activity may thus be a predictor for assessing which patients with glucose intolerance are more likely to progress to overt diabetes.
  • Red blood cells are a good source of the enzyme and so quantifying OGT glycosylation in human blood may be used to screen whether a patient is at increased risk to develop diabetes.
  • O-linked GlcNAc modifies many phosphoproteins which are components of multimeric complexes.
  • the sites modified by O-linked GlcNAc often resemble phosphorylation sites, leading to the suggestion that the modifications may compete for substrate in these polypeptides (Hart et al., 1995).
  • the sites modified by OGT resemble those of the glycogen synthase kinases (GSK-3, casein kinase II) and MAP kinase very closely.
  • Insulin activates the MAP kinase cascade, inhibiting GSK-3 inhibition of glycogen synthase, the rate limiting enzyme in glycogen synthesis (Chou et al. (1995) Proc Natl Acad Sci U S A 92, 4417-4421; Woodgett (1991) Trends Biochem Sci 16, 177-181).
  • GSK-3 also modifies the oncogene c-jun and negatively regulates its transactivating potential in vivo.
  • Another oncogene, c-myc, is both modified by O-linked GlcNAc and phosphorylated by GSK-3 in a domain required for transcriptional activation (Woodgett, 1991; Stambolic et al. (1994) Biochem J 303 (Pt 3), 701-704; Plyte et al. (1992) Biochim Biophys Acta 1114, 147-162).
  • Glucose-responsive elements from several mammalian genes have been identified and include myc-like response elements (Towle (1995) J Biol Chem 270, 23235-23238). Therefore, O-linked GlcNAc addition and phosphorylation by kinases such as GSK-3 may have as a common denominator their involvement in transcriptional regulation of glucose metabolism. Furthermore, since it has been shown that the levels of certain oncogenes in tumors can be useful markers for grading tumors, screening tumor cells for OGT activity may be a useful means of determining the aggressiveness or metastatic potential of these cells. OGT inhibitors may also be used as therapeutic agents in conditions in which the proteins discussed above are implicated.
  • insulin secretion is modulated, in an aberrant homeostatic response, by glycosylation mediated by OGT.
  • the activity of proteins encoded by tumor suppressor genes, as well as tumor necrotic activities appear to be candidates for therapeutic modulation by inhibitors of OGT.
  • inhibitors of these effects may act as therapeutic agents.
  • glycosylation of proteins such as tau and amyloid-β protein may be favorably modulated by therapeutic agents to be identified in the screening assays of the invention.
  • the expression of OGT in Alzheimer's disease may be elevated over normal levels
  • OGT was purified from rabbit blood using a modification of previously described methods (Lubas et al., 1995; Haltiwanger et al., 1992). Fresh rabbit blood (4L), treated with EDTA, was pelleted in a GS3 rotor at 2,000Xg for 5 min. The red blood cells were washed 3 times with an isotonic salt solution (140 mM NaCl, 5 mM KCl, 1.5 mM magnesium acetate) and collected after centrifugation at
  • hypotonic lysis was performed using an equal volume of ice cold water containing the following protease inhibitors (Boehringer Mannheim, Indianapolis, IN), 1 mM phenylmethylsulfonylfluoride, 10 μg/ml chymostatin, 10 μg/ml pepstatin, 10 μg/ml leupeptin, 0.1% aprotinin, and 2 mM EDTA.
  • the lysate was pelleted at 10,000Xg for 40 min in a GSA rotor.
  • the soluble fraction was made 30% saturated ammonium sulfate by adding a stock of 100% saturated ammonium sulfate equilibrated at 4°C slowly over 1 hour and stirring the solution an additional 2 hours at 4°C.
  • the precipitate was collected after centrifugation at 10,000Xg for 40 min in a GSA rotor and resuspended in 15-20 mL of 50 mM Tris-HCl, pH 7.4, 2 mM MgCl 2 using a Dounce homogenizer.
  • the insoluble material was removed by centrifugation at 20,000Xg for 20 min in a SS34 rotor.
  • the soluble fraction from the 30% ammonium sulfate precipitation was loaded onto a 15 mL Phenyl-Sepharose column (Pharmacia Biotech, Piscataway, NJ), washed with 100 mL of 10 mM Tris-HCl, pH 7.5, 100 mM ammonium sulfate and eluted with 40 mL of 10 mM Tris-HCl, pH 7.5, 60% ethylene glycol.
  • All chromatography buffers also contained the following protease inhibitors: 0.1% aprotinin, 10 μg/mL leupeptin, 10 μg/mL pepstatin, 0.1 mM phenylmethylsulfonylfluoride; all procedures were performed at 4°C.
  • the active fractions (15-20 mL) were pooled, passed through a 0.45 μm Millex-HA filter (Millipore Corp., Bedford, MA) and loaded onto a Mono Q HR 10/10 anion exchange column (Pharmacia Biotech, Piscataway, NJ) equilibrated with 50 mM Tris-HCl, pH 7.5, 12.5 mM MgCl2, 20% glycerol, 2 mM EDTA using a Pharmacia FPLC system. The column was washed with 30 mL of the equilibration buffer and then eluted with a linear gradient from 0 to 300 mM NaCl in 50 mL of equilibration buffer at a flow rate of 1 mL/min.
  • the active fractions (8-10 mL) were pooled and concentrated to a final volume of 0.3 mL using a Centricon 30 microconcentrator (Amicon, Beverly, MA) and loaded in 0.15 mL aliquots onto a Superose 6 FPLC column (Pharmacia Biotech, Piscataway, NJ) equilibrated with 50 mM Tris-HCl , pH 7.5, 12.5 mM MgCl 2 , 20% glycerol, 2 mM EDTA, 100 mM NaCl. The column was run at a flow rate of 0.15 mL/min and 0.6 mL fractions were collected.
  • Protein was calculated using the BCA reagent (Pierce Chemicals, Rockford, IL) using bovine serum albumin as a standard.
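The 30% ammonium sulfate cut and the linear NaCl gradient described above both reduce to simple arithmetic. The following is a minimal sketch, not part of the original procedure: it assumes that volumes are additive on mixing, that the starting sample contains no ammonium sulfate, and that the gradient is ideally linear. The function names are hypothetical.

```python
def stock_volume_for_saturation(sample_ml: float, target_fraction: float) -> float:
    """Volume (mL) of 100%-saturated stock to add to a 0%-saturated sample.

    Solves target = stock / (sample + stock), assuming additive volumes.
    """
    if sample_ml < 0:
        raise ValueError("sample_ml must be non-negative")
    if not 0.0 <= target_fraction < 1.0:
        raise ValueError("target_fraction must be in [0, 1)")
    return sample_ml * target_fraction / (1.0 - target_fraction)


def gradient_concentration_mm(eluted_ml: float,
                              start_mm: float = 0.0,
                              end_mm: float = 300.0,
                              gradient_ml: float = 50.0) -> float:
    """NaCl concentration (mM) after `eluted_ml` of an ideal linear gradient."""
    if not 0.0 <= eluted_ml <= gradient_ml:
        raise ValueError("eluted_ml must lie within the gradient volume")
    return start_mm + (end_mm - start_mm) * eluted_ml / gradient_ml


if __name__ == "__main__":
    # Illustrative sample volume only; the source does not state the lysate volume.
    print(stock_volume_for_saturation(700.0, 0.30))
    print(gradient_concentration_mm(25.0))
```

Under these assumptions, 700 mL of soluble fraction would need 300 mL of saturated stock to reach 30% saturation, and the midpoint of the 50 mL gradient corresponds to 150 mM NaCl.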
  • O-GlcNAc transferase activity was measured using recombinant nup62 bound to nitrocellulose membranes as previously described (Lubas et al.,1995) or in a modification of the method using recombinant nup62 bound to ScintiStrip polystyrene scintillation strips (Wallac Oy, Turku, Finland).
  • Purified O-GlcNAc transferase was subject to sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE).
  • the purified enzyme, whose SDS-PAGE image is shown in Figure 1, contains two polypeptides, an intense band at 110 kDa and a considerably weaker band at 78 kDa. Recovery of the 78 kDa band was variable between preparations, thus preventing isolation of sufficient amounts for further analysis. Proteolytic fingerprinting of both polypeptides suggested that they are related.
  • the 78 kDa band may be a proteolytic product of the larger 110 kDa band or the product of a second translation start site.
  • the apparent molecular weight of the slower-migrating band, 110 kDa, is consistent with the translation product of the human coding sequence, shown in SEQ ID NO:2, which has 920 amino acid residues, and with the translation product of the C. elegans coding sequence shown in SEQ ID NO:4, which has 1151 residues. For these reasons it is believed that the 110 kDa band represents the complete OGT protein of the invention.
  • the 110 kDa band obtained in Example 1 was cut out of the gel and sent to the William M. Keck Foundation at Yale University for in-gel trypsin digestion, high pressure liquid chromatography purification of the resulting tryptic peptides, and amino acid microsequencing. Two peptides were initially identified, a 20-residue peptide, XVSLDPNFLDAYINLGNVLK, and a 17-residue peptide, XXXSQLT(C)LG(C)LELIAK.
  • the 20-mer was a perfect match to a sequence contained within the expressed sequence tag, cDNA clone ykl3c2 (gb-CelK013C2F) and in a previously uncharacterized gene K04G7.3 (Genbank accession number U21320) identified as part of the C. elegans genome sequencing project ( Figure 2A). Both peptide sequences ended in basic amino acids consistent with the generation of these fragments by trypsin digestion. Figure 2A shows the structure of the gene and localizes the tryptic peptides to the 8th and 14th exons in the C. elegans gene. Two human expressed sequence tags, Genbank accession numbers R75943 and R76782, showing greater than 60% identity to the C. elegans gene K04G7.3, were also identified and found to match the 17-mer rabbit OGT tryptic peptide perfectly (Figure 2B).
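The observation that both peptides end in basic residues, and the stated peptide lengths, can be checked mechanically. The sketch below is illustrative only; it assumes that parentheses mark tentative residue calls and that X marks an unidentified residue, and the helper names are hypothetical.

```python
import re

# Peptide sequences as reported for the 110 kDa band.
PEPTIDES = ["XVSLDPNFLDAYINLGNVLK", "XXXSQLT(C)LG(C)LELIAK"]


def residues(peptide: str) -> str:
    """Strip the parentheses assumed to mark tentative residue calls."""
    return re.sub(r"[()]", "", peptide)


def has_tryptic_c_terminus(peptide: str) -> bool:
    """Trypsin cleaves after Lys (K) or Arg (R), so an internal tryptic
    peptide is expected to end in one of those residues."""
    return residues(peptide)[-1] in "KR"


if __name__ == "__main__":
    for p in PEPTIDES:
        print(p, len(residues(p)), has_tryptic_c_terminus(p))
```

Both peptides pass the check, with lengths of 20 and 17 residues, matching the description above.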
  • the OGT cDNA was isolated using a combination of phage library screening and polymerase chain reaction.
  • ATCGAAAATCCTGGCCTCTT (SEQ ID NO:6) were made to amplify a 195 base pair fragment from the cDNA clone ykl3c2 (Figure 2A). After PCR amplification, this fragment was gel purified and used to probe a lambda ZAP (Stratagene, Cambridge, UK) C. elegans cDNA library (10^10 units/mL) (Barstead et al., 1989). 140,000 clones were screened; only 1 positive plaque was identified. The identified insert (3.1 kb) was subcloned into pGem and Pet 32 (Novagen, Madison, WI) using EcoR I. This insert was sequenced and localized to the C-terminal 70% of the open reading frame of C. elegans K04G7.3. Using the known sequence for the open reading frame for the C. elegans K04G7.3 gene, primers were constructed to amplify the 5' end using high fidelity Takara Biomedicals (Gennevarris,
  • the consensus polyadenylation signal AATAAA occurs at positions 4065-4070.
  • the peptide sequences predicted to correspond to the partial amino acid sequence of the tryptic peptides XVTLDPNFLDAYINLGNVLK and XXXSQLT(C)LG(C)LELIAK occur at positions 321-340 and 1060-1076, respectively.
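Sequence features such as the AATAAA polyadenylation signal are reported here as 1-based, inclusive position ranges. As a minimal sketch of how such coordinates can be derived, the hypothetical helper below scans a nucleotide string for a motif; the toy sequence is invented for illustration and is not taken from SEQ ID NO:1 or SEQ ID NO:3.

```python
def find_motif(sequence: str, motif: str = "AATAAA") -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) positions of each motif occurrence.

    Spaces are ignored so that sequences printed in blocks of ten can be
    pasted in directly. Overlapping occurrences are reported.
    """
    seq = sequence.upper().replace(" ", "")
    motif = motif.upper()
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append((start + 1, start + len(motif)))
        start = seq.find(motif, start + 1)
    return hits


if __name__ == "__main__":
    toy_sequence = "GGCTAATAAA GCTTTTCAAT AAACC"  # hypothetical example
    print(find_motif(toy_sequence))
```

A six-base motif always spans six positions under this convention, which is the same width as a reported range such as 4065-4070.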
  • Example 4 Cloning of the Human O-GlcNAc Transferase
  • the human O-GlcNAc transferase was isolated using primers constructed from the sequence of the human expressed sequence tag, Genbank accession number R75943. These primers were used to screen SuperscriptTM (Gibco BRL Life Technologies, Gaithersburg MD) human brain and liver cDNA libraries using the GenetrapperTM (Gibco BRL Life Technologies, Gaithersburg MD) cDNA positive selection system.
  • Hybrids between the biotinylated oligonucleotide and the cDNA libraries were captured on streptavidin-coated paramagnetic beads and retrieved using a magnet.
  • the captured ssDNA was separated from the biotinylated primer, re-paired to double stranded DNA using the second oligonucleotide primer and transformed into ElectroMAX DH10B cells (Gibco BRL Life Technologies, Gaithersburg MD). There were a total of 48 liver and 53 brain clones identified on the initial screen. These clones were then rescreened by hybridization with the full length human placenta expressed sequence tag, Genbank accession number R75943; 40 of 48 liver and 42 of 53 brain clones were found to be positive.
  • the insert size was estimated by restriction digestion with Sal I and Not I. All liver clones longer than 2.5 kb and brain clones longer than 3 kb were screened by in vitro translation. The largest in vitro translation product identified was a protein of about 100 kDa formed by 6 different liver and 2 brain clones (data not shown). DNA sequencing showed that they were all overlapping clones of the same gene with variable 5' and 3' untranslated regions.
  • the consensus polyadenylation signal AATAAA occurs at positions 3027-3032.
  • the oligonucleotide primers GCGTTTTCCAGCAGTAGGAG and ACATTCTGAAGCGTGTTCCC used to screen the human cDNA libraries using the Genetrapper positive select system are found at positions 2471-2493 and 2514-2533, respectively.
  • the translated protein sequence predicted by the open reading frame beginning at position 265 is given in SEQ ID NO:2.
  • the peptide sequences predicted to correspond to the partial amino acid sequence of the tryptic peptides XVTLDPNFLDAYINLGNVLK and XXXSQLT(C)LG(C)LELIAK occur at positions 91-110 and 829-845, respectively.
  • the human cDNA open reading frame encodes a shorter protein (103 kDa) containing only the last 9 tetratricopeptide repeat (TPR) sequences found in C. elegans ( Figure 3B). While this is consistent with the observed size of the in vitro translation product, it is likely that post translational modification of the enzyme occurs since the human OGT translated in reticulocyte lysate was slightly larger than the product seen from wheat germ extract (data not shown). This behavior has been previously observed for proteins modified by O-linked GlcNAc (Starr et al., 1990).
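The relationship between residue count and apparent size can be sanity-checked with a rough estimate. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not taken from the source; it ignores actual amino acid composition and any post-translational modification, so it gives only an order-of-magnitude check.

```python
# Rule-of-thumb average mass per amino acid residue (an assumption, not from the source).
AVERAGE_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues: int) -> float:
    """Rough polypeptide mass in kDa from the residue count alone."""
    if n_residues < 0:
        raise ValueError("n_residues must be non-negative")
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    # Residue counts as stated for SEQ ID NO:2 and SEQ ID NO:4.
    for name, n in [("human OGT", 920), ("C. elegans OGT", 1151)]:
        print(f"{name}: {n} residues, roughly {approx_mass_kda(n):.1f} kDa")
```

For the 920-residue human sequence this gives roughly 101 kDa, close to the 103 kDa stated above; the 1151-residue C. elegans sequence comes out near 127 kDa.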
  • OGT was expressed in E. coli.
  • In vitro translation was performed using the TNT-T7 coupled wheat germ extract system (Promega Ltd., Southampton, UK) using the manufacturer's instructions.
  • the full length C. elegans cDNA ZAP-CeOGT was cloned into Pet32a and transfected into E. coli BL21(DE3) cells (Novagen, Madison, WI) for expression.
  • Cells were grown in Luria-Bertani medium containing 50 μg/mL carbenicillin at 37°C and 220 rpm until the OD600 was about 0.6.
  • Cells were induced with 1 mM isopropylthiogalactopyranoside for 90 min at 37°C and harvested by centrifugation at 3000 rpm for 5 min at 4°C in a Beckman GS-6R centrifuge. After resuspension in 1/10 volume of 50 mM Tris-HCl, pH 8, 2 mM EDTA, 100 μg/mL lysozyme, 0.1% Triton X-100, cells were incubated at 30°C for 15 min, placed in an ice bath and sonicated twice for 10 seconds to shear the DNA.
  • the O-GlcNAc transferase was pelleted at 12,000Xg for 10 min at 4°C, solubilized with His-Tag (Novagen Inc., Madison, WI) binding buffer in 6 M urea (5 mM imidazole, 50 mM NaCl, 20 mM Tris-HCl, pH 7.9).
  • the solubilized protein was loaded onto a 2.5 mL His-Tag column (Novagen Inc., Madison, WI), washed with 8 mL of binding buffer and eluted with 8 mL of elution buffer in 6 M urea (60 mM imidazole, 50 mM NaCl, 20 mM Tris-HCl, pH 7.9). Column fractions were analyzed by sodium dodecyl sulfate-polyacrylamide gel electrophoresis. Full length OGT was gel purified and used to generate polyclonal antibodies in guinea pigs. The enzyme made in E. coli has biological activity and can be used in OGT assays. This was accomplished by incubating the recombinant bacteria described in the preceding paragraph overnight at room temperature without inducing by the addition of isopropylthiogalactopyranoside.
  • Transgenic C. elegans strains were generated by microinjection using the pRF4 plasmid as a marker to identify transformed animals (Mello et al. (1995)).
  • Test plasmid constructs were injected in combination with pRF4 DNA at 50 ng/μL each.
  • Overexpression was achieved by transformation of N2 animals with derivatives of the heat shock promoter vectors (pPD49.78 and pPD49.83) (Mello et al. (1995)) in which the full length C. elegans OGT cDNA (Nco I-Sac I partial digest, 4.25 kb) was cloned into the Nco I and Sac I restriction sites of the vector.
  • Transgenic animals were heat shocked by treatment at 33°C for 2-4 hours to induce production of fusion proteins driven by heat shock promoters.
  • Overexpressed OGT was detected by immunoblotting using an anti-OGT guinea pig antibody raised against the recombinant protein made in E. coli (see Example 5).
  • the C. elegans OGT was readily detected by immunoblotting using the guinea pig antisera ( Figure 4A).
  • Immunofluorescence of O-GlcNAc Transferase in C. elegans embryos was carried out by fixing the embryos with formaldehyde on glass slides (Krause et al. (1990)) and visualizing by indirect immunofluorescence using a FITC-labelled goat anti-guinea pig antibody, and guinea pig antibody raised against recombinant C. elegans OGT.
  • the immunofluorescence was detected using a Biorad 1024 Confocal Microscope equipped with a 60X objective. Wild-type C. elegans embryos showed a punctate perinuclear and nuclear pattern ( Figure 4B, top panel).
  • Human OGT clone Lv4F was introduced into HeLa cells by lipid-mediated transfection or by electroporation.
  • For lipid-mediated transfection, 10^5 cells were plated per well in 6-well plates in Dulbecco's minimal essential medium (DMEM)/10% fetal bovine serum (FBS) 14-18 hours prior to transfection. The transfection was carried out in OptiMEM (Life Technologies, Inc., Paisley, Scotland).
  • the plasmid pECE-OGT/Lv4F (0.1 mg) was mixed with 4 mL of Lipofectin reagent (Life Technologies, Inc., Paisley, Scotland) and applied to the cells according to the manufacturer's recommendations. Control cells were transfected with plasmid bearing no insert.
  • Electroporation of HeLa cells was performed in OptiMEM in cell suspensions (5 x 10^6/mL) containing 0.5 mg/mL of pECE-OGT/Lv4F or pECE. Cells were shocked at 4°C, with capacitance set at 1180 mF, and voltage at 200 V using a BRL electroporator (Gibco BRL Life).
  • Assays were performed in 50 mM Tris-HCl, pH 7.4, 12.5 mM MgCl2 and 1 μCi UDP-GlcNAc-[ ⁇ ] GlcNAc in a final volume of 40 mL for 90 min at 37°C and 220 rpm.
  • the full length human cDNA (clone Lv4F) was cloned into the pECE vector downstream of the SV40 promoter. HeLa cell cultures were transiently transfected with vector alone or with the vector containing the clone Lv4F open reading frame. Cells were harvested at 24 hours. The transfected cells did not survive well during prolonged incubations, i.e., more than about 72 hours, suggesting the gene may be toxic to the cells. Toxicity has also been observed in experiments where the gene was overexpressed in transgenic C. elegans. Up to a three-fold increase in enzyme activity relative to background activity was observed using two different transfection procedures (Figure 5).
  • Example 8 Isolation of cloned human OGT from transfected HeLa cells
  • the full length human cDNA (clone Lv4F) is to be cloned into the pECE vector downstream of the SV40 promoter.
  • HeLa cell cultures are to be transiently transfected with the vector containing the clone Lv4F open reading frame.
  • Cells are to be harvested at about 24 hours after transfection.
  • the cells are to be disrupted and human OGT is to be purified from the cytosolic supernatant following procedures similar to those described in Example 1 and in Lubas et al. (1995) for the isolation of rabbit OGT.
  • the blot was prehybridized in 1% bovine serum albumin, 0.5 M NaPO4, pH 7, 1 mM EDTA, 7% sodium dodecyl sulfate, 100 μg/mL denatured salmon testis DNA, at 55°C for 1 hour and then hybridized overnight at 55°C with the gel purified, radiolabelled 3 kb Not I-Sal I fragment from human liver clone Lv4F.
  • the blot was washed two times for 15 min with 0.5% bovine serum albumin, 5% sodium dodecyl sulfate, 40 mM NaPO4, pH 7, 1 mM EDTA at 55°C, then two times for 15 min with 1% bovine serum albumin, 40 mM NaPO4, pH 7, 1 mM EDTA at 55°C and once with 0.2X SSPE (30 mM NaCl, 2 mM NaPO4, pH 7.4, 0.2 mM EDTA) at 55°C for 15 min. It was exposed to Kodak Bio-max MR film for 1-7 days at -70°C.
  • Example 10 Assessing predisposition toward type II diabetes.
  • a sample of blood from a patient suspected of having hyperglycemia that may evolve into type II diabetes is to be provided.
  • the red blood cells are to be isolated by centrifugation, lysed, and the supernatant fraction is to be concentrated by a method chosen from (a) retention by an ultrafilter and (b) precipitation by concentrated ammonium sulfate, as set forth in Example 1.
  • the resulting sample containing OGT activity is to be resuspended to a fixed volume and the OGT activity is to be assayed using procedures set forth in Example 1.
  • Correlative samples are to be assayed periodically from healthy human subjects, known human diabetics, and human patients suffering from other pathologies not related to type II diabetes.
  • the levels of OGT activity found in the correlative samples are to be used to establish a range of limits for normal and type II diabetic levels of OGT in human subjects. Patients are to be evaluated as being predisposed to type II diabetes if the level of OGT activity in the samples falls within the range established for patients known to have type II diabetes.
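The evaluation step in Example 10 amounts to building reference ranges from correlative samples and testing whether a patient's value falls inside the diabetic range. The sketch below is one simple reading of that step, not a method specified by the source: it assumes the range is the observed minimum to maximum, and all names and activity values (in arbitrary units) are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ReferenceRange:
    """Closed interval of OGT activity observed in one correlative group."""
    low: float
    high: float

    @classmethod
    def from_samples(cls, values: Iterable[float]) -> "ReferenceRange":
        # Assumption: the range is simply the observed min-max of the group.
        values = list(values)
        if not values:
            raise ValueError("need at least one correlative sample")
        return cls(min(values), max(values))

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


def is_predisposed(patient_activity: float, diabetic_range: ReferenceRange) -> bool:
    """True if the patient's activity falls within the known-diabetic range."""
    return diabetic_range.contains(patient_activity)


if __name__ == "__main__":
    # Hypothetical activities, arbitrary units; not measured data.
    diabetic_samples = [2.1, 2.6, 1.9, 2.4]
    diabetic_range = ReferenceRange.from_samples(diabetic_samples)
    print(diabetic_range, is_predisposed(2.2, diabetic_range))
```

In practice the limits would more likely be set statistically from a larger cohort; the min-max rule is used here only to keep the illustration short.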
  • Example 11 Assessing predisposition toward Alzheimer's disease.
  • a sample from the central nervous system of a patient suspected of having Alzheimer's disease or of being at increased risk of developing the disease is to be provided.
  • the sample may, for example, be drawn from the cerebrospinal fluid or from cellular material obtained from the brain.
  • Cellular material, if used, is to be lysed, and OGT activity in the supernatant portion or in an ultrafilter retentate of the sample is to be concentrated.
  • the OGT activity is to be resuspended to a fixed volume and the OGT activity is to be assayed using procedures set forth in Example 1.
  • Correlative samples are to be assayed periodically from healthy human subjects, and from known human Alzheimer's disease patients whether living or at autopsy.
  • the levels of OGT activity found in the correlative samples are to be used to establish a range of limits for normal and pathological levels, related to Alzheimer's disease, of OGT in human subjects.
  • a patient is to be evaluated as being predisposed to or at increased risk of developing Alzheimer's disease if the level of OGT activity in the sample from the patient falls within the range established for patients known to have Alzheimer's disease.
  • Example 12 Assessing metastatic potential of a tumor.
  • a sample from a tumor present in a patient is to be provided.
  • the sample may be obtained, for example, by surgical biopsy.
  • Cellular material is to be homogenized, and OGT activity in the supernatant portion or in an ultrafilter retentate of the sample is to be concentrated.
  • the OGT activity is to be resuspended to a fixed volume and the OGT activity is to be assayed using procedures set forth in Example 1.
  • the substrate is to be a purified oncogene protein, such as a recombinant form of an oncogene protein. These proteins may be chosen from among myc, p53, Rb, and v-erb, and similar known oncogene proteins.
  • Correlative samples are to be assayed periodically from healthy human subjects or from fresh autopsy samples, and from tumors derived from known human cancer patients.
  • the levels of OGT activity found in the correlative samples are to be used to establish a range of limits for normal and pathological levels of OGT in human subjects.
  • Pathological levels are those found in the samples of known tumors.
  • a tumor in a patient being tested is to be evaluated as having high metastatic potential if the level of OGT activity in the sample from the patient falls within the range established for patients known to have metastatic tumors.
  • Lys Asp Ser Gly Asn Ile Pro Glu Ala Ile Ala Ser Tyr Arg Thr Ala
  • Cys Leu Gln Ile Val Cys Asp Trp Thr Asp Tyr Asp Glu Arg Met Lys 350 355 360 AAG TTG GTC AGT ATT GTG GCT GAC CAG TTA GAG AAG AAT AGG TTG CCT 1395 Lys Leu Val Ser Ile Val Ala Asp Gln Leu Glu Lys Asn Arg Leu Pro 365 370 375
  • AAAGACTGCA CAGGAGAATT ACCCCTAAAA AAAAAAAAAA AAAAGGGCGG CCGC 3083
  • MOLECULE TYPE protein
  • FRAGMENT TYPE internal
  • ORIGINAL SOURCE
  • ORGANISM Caenorhabditis elegans
  • ix FEATURE:
  • GCT ATT CGA ACG CAA CTC GAA AAT CAA GCG GCA CAG CAG TTA GCA GTC 240
  • GGT GAT TTG GAG CAA
  • GAT GCT GGA AAT ATG GCA GAA
  • GCT ATT CAA
  • CTC 1680 Asp Ala Gly Asn Met Ala Glu Ala Ile Gln Ser Tyr Ser Thr Ala Leu 545 550 555 560
  • MOLECULE TYPE protein
  • FRAGMENT TYPE internal
  • ORIGINAL SOURCE
  • ORGANISM Caenorhabditis elegans

Abstract

This invention relates to an isolated DNA comprising a sequence that encodes a uridine diphospho-N-acetylglucosamine:polypeptide β-N-acetylglucosaminyl transferase (O-linked GlcNAc transferase, OGT), as well as vectors comprising the DNA suitable for the expression of OGT, and host cells harboring the vector that express OGT. The invention further relates to an isolated protein exhibiting O-linked GlcNAc transferase activity and having the amino acid sequence of an O-linked GlcNAc transferase. The invention additionally relates to a method of expressing a protein exhibiting O-linked GlcNAc transferase activity and having the amino acid sequence of an O-linked GlcNAc transferase. Furthermore the invention relates to a method of identifying an inhibitor of O-linked GlcNAc transferase. Additionally the invention relates to methods of assessing the status or predisposition of a patient to pathologies associated with OGT activity.

Description

O-LINKED GlcNAc TRANSFERASE (OGT): CLONING, MOLECULAR EXPRESSION, AND METHODS OF USE
Field of the Invention
This invention relates to a post-translational modification of a protein involving the addition of N-acetylglucosamine in O-glycosidic linkage to serine or threonine residues of cytoplasmic and nuclear proteins. It is believed that such modification plays a significant role in regulating the activity of proteins involved in transcriptional and translational processes. In particular, this invention provides an enzyme catalyzing the formation of these derivatives, uridine diphospho-N-acetylglucosamine:polypeptide β-N-acetylglucosaminyl transferase (O-GlcNAc transferase, OGT), and a nucleic acid encoding the enzyme.
Background of the Invention
Over the last ten years a novel post-translational modification involving the addition of a single N-acetylglucosamine (GlcNAc) in O-glycosidic linkage to serine or threonine residues on cytoplasmic and nuclear proteins has been identified (Torres et al. (1984) J Biol Chem 259, 3308-3317; Hanover et al. (1987) J Biol Chem 262, 9887-9894). This form of glycosylation was found to modify a group of nuclear pore proteins (Hanover et al.; Holt et al. (1987) J Cell Biol 104, 1157-1164; Starr et al. (1990) J Biol Chem 265, 6868-6873). In addition to nuclear pore proteins, O-linked GlcNAc modifies a large number of polypeptides in multimeric structures including RNA polymerase II transcription complexes and p67/eIF-2- initiation factor in the translation machinery.
Although the addition of O-linked GlcNAc to proteins is formally a glycosyltransferase reaction, it is quite distinct from other glycosylation processes. The reaction occurs in the cytosol and nucleus, unlike other protein glycosylation reactions which are restricted to the endomembrane system of the cell (Hanover et al.; Holt et al.). Reflecting the fact that the enzyme catalyzing this reaction, uridine diphospho-N-acetylglucosamine:polypeptide β-N-acetylglucosaminyl transferase (O-GlcNAc transferase, OGT), must function in the cytoplasm where levels of the substrate, uridine diphospho-N-acetylglucosamine (UDP-GlcNAc), are lower, the enzyme has a much lower Km with respect to this substrate than is usually observed for glycosyltransferases. In many ways O-linked GlcNAc addition is analogous to protein phosphorylation. The enzyme has been shown to recognize a large number of phosphoproteins, some of which play a direct role in signal transduction. In the case of RNA-polymerase II, phosphorylation and glycosylation seem to be mutually exclusive. Thus, whereas the glycosylated enzyme is necessary for assembly of the preinitiation complex, subsequent deglycosylation and phosphorylation are necessary for transition to the elongation complex (Kelly et al. (1993) J Biol Chem 268, 10416-10424). In the case of other substrates, such as neurofilaments (Dong et al. (1996) J Biol Chem 271, 20845-20852) or the nuclear pore proteins nup62, nup97, and nup200 (Macaulay et al. (1995) J Biol Chem 270, 254-262), it appears that phosphorylation and glycosylation can both occur on the same molecule. The role of protein phosphorylation as a regulatory mechanism for signal transduction in eukaryotic cells was originally identified in studies over 40 years ago on glycogen phosphorylase, an enzyme involved in carbohydrate metabolism (Fisher et al. (1955) J Biol Chem 216, 121-132). It is likely that addition of O-linked GlcNAc to proteins in the cytoplasm and nucleus is also highly regulated. Since both phosphorylation and glycosylation compete for similar serine or threonine residues, it is possible that the two processes could be directly competing for sites, or they may alter the substrate specificity of nearby sites by steric or electrostatic effects.
No strict consensus sequence for O-linked GlcNAc addition has so far been identified, although most glycosylation sites occur near proline or valine residues and typically in stretches rich in serine or threonine residues. A subset of glycosylation sites is located near acidic amino acid residues (Hart et al. (1995) Adv Exp Med Biol 376, 115-123; Lubas et al. (1995) Biochemistry 34, 1686-1694). These glycosylation sites are similar to phosphorylation sites for several protein kinases (Kennelly et al. (1991) J Biol Chem 266, 15555-15558). It has previously been shown that OGT has a much higher reactivity with the recombinant nucleopore protein, nup62, than with any synthetic peptide examined, suggesting that the enzyme may recognize other parts of the protein substrate, and not just a specific consensus sequence (Lubas et al. (1995)).
The hexosamine biosynthetic pathway is responsible for the synthesis of cytoplasmic UDP-GlcNAc utilized by OGT. Normally 2-3% of incoming glucose fluxes through this pathway (Marshall et al. (1991) J Biol Chem 266, 4706-4712). Increased glucose flux through the hexosamine biosynthetic pathway, caused by hyperglycemia, has been shown to mediate insulin resistance (Marshall et al. (1991); Rossetti et al. (1995) J Clin Invest 96, 132-140; Daniels et al. (1993) Mol Endocrinol 7, 1041-1048; Crook et al. (1995) Diabetes 44, 314-320; Hebert et al. (1996) J Clin Invest 98, 930-936; Sayeski et al. (1996) J Biol Chem 271, 15237-15243). The hexosamine biosynthetic pathway, by controlling intracellular UDP-GlcNAc concentrations, may be acting in peripheral tissues as a glucose sensor which is reflected in substrate-driven O-linked GlcNAc modification of intracellular proteins by OGT. Glucosamine administration has been shown to impair insulin secretion from the pancreas in response to glucose both in vitro and in vivo (Balkan et al. (1994) Diabetes 43, 1173-1179). It is shown in the present Examples that OGT is highly abundant in the pancreas, further suggesting a possible role in modulating insulin secretion and in glucose homeostasis. An additional observation suggesting that the level of GlcNAc substitution of substrate proteins may exert regulatory effects on their activities is the existence of a specific hexosaminidase that cleaves O-linked GlcNAc from proteins.
O-linked GlcNAc modifies many phosphoproteins which are components of multimeric complexes. The sites modified by O-linked GlcNAc often resemble phosphorylation sites, leading to the suggestion that the modifications may compete for substrate in these polypeptides (Hart et al. (1995)). In general, the sites modified by OGT resemble those of the glycogen synthase kinases (GSK, such as GSK-3 or casein kinase II) and microtubule associated protein (MAP) kinase very closely. Interestingly, insulin activates the MAP kinase cascade, inhibiting GSK-3 inhibition of glycogen synthase, the rate limiting enzyme in glycogen synthesis (Chou et al. (1995) Proc Natl Acad Sci U S A 92, 4417-4421; Woodgett (1991) Trends Biochem Sci 16, 177-181).
GSK-3 also modifies the oncogene c-jun and negatively regulates its transactivating potential in vivo. Another oncogene, c-myc, is modified by both O-linked GlcNAc and phosphorylated by GSK-3 in a domain required for transcriptional activation (Woodgett (1991); Stambolic et al. (1994) Biochem J 303 (Pt 3), 701-704; Plyte et al. (1992) Biochim Biophys Acta 1114, 147-162). Glucose-responsive elements from several mammalian genes have been identified and include myc-like response elements (Towle (1995) J Biol Chem 270, 23235-23238). Therefore, O-linked GlcNAc addition by OGT and phosphorylation by kinases such as GSK-3 may have as a common denominator their involvement in transcriptional regulation of glucose metabolism. Additional oncogenes which may serve as substrates for OGT include c-fos, c-jun, v-erb A, and the tumor suppressor Rb. As a result of this property, the level of GlcNAc in a cell, and its role as a substrate for OGT, suggest that O-linked GlcNAc has a role in modulating or regulating the activity of oncogenes, or inhibiting their functions, in tumorigenesis and in tumor suppression. It would be beneficial in the pharmacotherapy of various tumors to be able to inhibit the activity of OGT, thereby lowering the extent to which O-linked glycosylation by GlcNAc occurs. It would be useful in addition to screen various tumors for their OGT activity in order to evaluate their metastatic potential.
Experimental insulin-dependent diabetes may be induced in animals by administering streptozotocin. It is known that this agent destroys the β cells in the islets of Langerhans, where insulin secretion occurs. It is possible that streptozotocin actually affects OGT, whether by interfering with its synthesis, or by inducing inhibition of its activity, or it may inhibit the activity of the hexosaminidase.
It is furthermore possible that OGT activity is implicated in the pathogenesis of Alzheimer's disease. Two proteins involved in this disease, tau and amyloid-β protein, are both glycosylated by OGT. Griffith et al. ((1995) Biochem. Biophys. Res. Commun. 213, 424-431) have shown that O-GlcNAc glycosylation is upregulated in the brains of patients with Alzheimer's disease.
Although OGT has been purified from several different sources (Lubas et al. (1995), Haltiwanger et al. (1992)), it has not been molecularly cloned. There is therefore a need to clone a gene for OGT, especially a gene originating in humans. There further is a need for expressing the gene to produce a protein having O-GlcNAc transferase activity. The need also exists for employing an O-GlcNAc transferase protein in studies designed to identify inhibitors of the O-GlcNAc transferase activity. Such inhibitors would have strong potential as therapeutic compounds in the treatment of diabetes mellitus, and potentially in the treatment of tumor-derived disease and Alzheimer's disease as well.
In addition, there are needs for evaluating a patient's predisposition to pathological conditions related to dysfunction of OGT-mediated glycosylation. Evaluation of the level of OGT activity, and of the level of glycosylation by GlcNAc in proteins may be useful in diagnosing such disease states.
Summary of the Invention
This invention provides an isolated DNA molecule that includes a sequence encoding a protein exhibiting uridine diphospho-N-acetylglucosamine:polypeptide β-N-acetylglucosaminyl transferase (O-linked GlcNAc transferase, OGT) activity.
It also provides a nucleic acid vector including the isolated DNA encoding OGT; the vector may also include a regulatory nucleotide sequence operably positioned with respect to the DNA sequence encoding an O-linked GlcNAc transferase such that, when the vector is introduced into a suitable host cell and the regulatory sequence is triggered, the protein is expressed. In preferred embodiments of the DNA molecules of the invention, the nucleic acid has the sequence of human OGT provided in SEQ ID NO:1, or of OGT from Caenorhabditis elegans, provided in SEQ ID NO:3.
The present invention additionally provides an isolated protein exhibiting O-linked GlcNAc transferase activity. In significant aspects of the invention, the protein has the amino acid sequence of an O-linked GlcNAc transferase and is a human O-linked GlcNAc transferase, and in a further important aspect has the amino acid sequence given by SEQ ID NO:2. In other important aspects, the protein has the amino acid sequence of a C. elegans O-linked GlcNAc transferase, and in an additional significant aspect the protein has the amino acid sequence given by SEQ ID NO:4.
The invention additionally provides host cells containing a vector including the DNA encoding OGT and which express a protein with OGT activity. The host cells may also harbor cellular components responsive to the regulatory nucleotide sequence contained in a vector that is operably linked to the OGT coding sequence. The regulatory sequence is such that the protein encoded by the vector is expressed in the host when the cells are cultured under suitable conditions that trigger the regulatory sequence. The host cells are capable of expressing the DNA encoding OGT. In important embodiments of the invention, the host cells express the human O-linked GlcNAc transferase protein whose amino acid sequence is given by SEQ ID NO:2, or the O-linked GlcNAc transferase protein from C. elegans having the amino acid sequence given by SEQ ID NO:4.
Additionally the present invention provides a method of expressing a protein exhibiting O-linked GlcNAc transferase activity and having the amino acid sequence of an O-linked GlcNAc transferase. The method includes the step of culturing host cells harboring a vector that includes a DNA encoding OGT under conditions that promote growth of the cells. The vector may also include a control element operably linked to the OGT coding sequence that under suitable conditions of growth cause the control elements thereof to induce expression of the O-linked GlcNAc transferase gene, thereby expressing the protein. The cells also may contain cellular components that interact with the control element under suitable conditions, resulting in expression of the DNA sequence. In important embodiments of the invention, the host cells contain a vector that includes the DNA sequence encoding either the human O-linked GlcNAc transferase protein whose sequence is given by SEQ ID NO:2, or the O-linked GlcNAc transferase protein from C. elegans having the amino acid sequence given by SEQ ID NO:4.
In a further embodiment of the invention, a method of identifying an inhibitor of O-linked GlcNAc transferase is provided; this method comprises the steps of
(i) providing a sample comprising a glycosylation protein target of O-linked GlcNAc transferase activity;
(ii) contacting a first portion of the sample with a first solution comprising a substance that is a candidate for being an inhibitor, UDP-GlcNAc, and a protein having O-linked GlcNAc transferase activity to generate a first test;
(iii) contacting a second portion of the sample with a second solution to generate a second test, wherein the second solution is the same as the first solution except that it does not contain the candidate;
(iv) determining and comparing the O-linked GlcNAc transferase activity in the first and second tests to identify whether the O-linked GlcNAc transferase activity has been inhibited by the candidate.
In significant aspects of the method, the protein having O-linked GlcNAc transferase activity in the first and second tests is a human O-linked GlcNAc transferase and has the amino acid sequence given by SEQ ID NO:2, or is a C. elegans O-linked GlcNAc transferase whose amino acid sequence is given by SEQ ID NO:4. In a preferred embodiment of the method, the glycosylation target protein is immobilized on a surface. In still additional preferred embodiments of the method of identifying an inhibitor of O-linked GlcNAc transferase, a plurality of samples is assessed simultaneously for the observation of inhibition by candidate inhibitors.
Further embodiments of the invention relate to methods of assessing the predisposition toward type II diabetes in a patient, methods for assessing predisposition toward Alzheimer's disease in a patient, and methods for assessing the metastatic potential of a tumor. These methods involve obtaining a clinical sample having OGT activity, assaying for the level of OGT activity present in the sample, and comparing the level obtained with levels found in samples from healthy individuals and from patients known to be diseased. Using these comparisons, evaluation of disease and non-disease states can be made.
Description of the Figures
Figure 1. Sodium Dodecyl Sulfate-Polyacrylamide Gel Electrophoresis of Purified Rabbit O-GlcNAc Transferase. Analysis of the purified 110 kDa OGT from rabbit blood by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (lane 2), compared to the standard proteins myosin (200 kDa), β-galactosidase (116.3 kDa), phosphorylase b (97.4 kDa), bovine serum albumin (66.3 kDa), glutamic dehydrogenase (55.4 kDa), lactate dehydrogenase (36.5 kDa) and carbonic anhydrase (31 kDa) in lane 1.
Figure 2. Comparison of Rabbit Tryptic Peptides with the Caenorhabditis elegans Gene Encoding OGT, and with Human Expressed Sequence Tags and cDNA Sequences Encoding OGT. Panel A: A schematic of the 7.3 kb C. elegans gene K04G7.3 is shown with predicted intron and exon junctions. The third and fourth exons are distinguished with hatched boxes to show they were not part of the isolated full length C. elegans clone ZAP-CeOGT (Genbank accession number U77412). The location of the sequence predicted by the C. elegans expressed sequence tag clone ykl3c2 is shown. The partial amino acid sequences of the two tryptic peptides isolated from the rabbit OGT, XVTLDPNFLDAYINLGNVLK and XXXSQLT(C)LG(C)LELIAK, are compared to the sequence predicted from the K04G7.3 open reading frame in exons 8 and 14. The cysteine residues are tentative assignments because they could not be distinguished from glutamine residues which comigrate in the amino acid profile. Panel B: A schematic of the 3.1 kb full length sequence of the human liver OGT gene in clone Lv4F (Genbank accession number U77413) is shown. The partial amino acid sequences of the two tryptic peptides isolated from the rabbit OGT, XVTLDPNFLDAYINLGNVLK and XXXSQLT(C)LG(C)LELIAK, are compared to peptide sequences translated from clone Lv4F and two human expressed sequence tags (Genbank accession numbers R75943 and R76782).
Figure 3. Protein Sequence Comparison of C. elegans and Human O-GlcNAc Transferase Deduced from the Isolated cDNAs. Panel A: Sequence alignment of C. elegans and human O-GlcNAc transferases. Identical amino acid matches are boxed and shaded. Similar amino acids are shaded only. The peptide sequences (indicated "Peptide seq" in the Figure) corresponding to the partial amino acid sequence of the tryptic peptides XVTLDPNFLDAYINLGNVLK and XXXSQLT(C)LG(C)LELIAK are underlined. The cysteine residues are tentative assignments because they could not be distinguished from glutamine residues which comigrate in the amino acid profile. The putative nuclear localization signal ("NLS" in the Figure) is underlined. Sequence data are derived from the cDNAs (Genbank accession numbers U77412 and U77413) reported here. Panel B: Schematic diagram showing the relative sizes of the C. elegans and human OGT, as well as the location of tetratricopeptide repeat (TPR) sequence repeats and putative nuclear localization signal (NLS).
Figure 4. Immunodetection of O-GlcNAc Transferase in Transgenic C. elegans Lines. Panel A: Immunoblot of phosphate buffered saline extracts from transgenic C. elegans embryos which were either uninduced or induced to overexpress OGT by heat shock. The primary antiserum used was a guinea pig anti-OGT prepared against recombinant OGT as described in Example 6. Under the conditions employed, only the OGT produced by overexpressing lines was detected. In other experiments, the wild type enzyme was also detected, but at greatly reduced levels. Panel B (upper panel): Localization of OGT in wild type C. elegans embryos. Indirect immunofluorescence was performed using antiserum raised against recombinant OGT. A fluorescein isothiocyanate (FITC)-labelled goat anti-guinea pig antibody was used for detection. Localization of the nuclei in these embryos was carried out using bis-benzamide and UV epifluorescence optics. OGT was found both within the nucleus and in a perinuclear location. Panel B (left lower panel): Overexpression of recombinant OGT after 2-3 hours of heat shock in 3-fold stage C. elegans embryos measured by indirect immunofluorescence using antisera raised against recombinant OGT (Anti-OGT). A FITC-labelled goat anti-guinea pig antibody was used for detection. Large arrowheads point out the position of nuclear OGT. Smaller arrowheads point out the position of perinuclear aggregates. Panel B (right lower panel): Nuclear localization using propidium iodide (PI) to stain the same embryo shown in panel B, lower left. Arrowheads are used to point to corresponding nuclei in lower panels.
Figure 5. Elevated O-GlcNAc Transferase Activity Induced by Overexpression of Human O-GlcNAc Transferase cDNA in Transfected HeLa Cells. HeLa cells were plated at 100,000 cells per well and transfected using either lipofection (0.1 μg of DNA) or electroporation (0.05 μg of DNA). Cells were transfected with plasmid containing the human OGT clone pECE-Lv4F or with the control plasmid pECE alone, harvested at 24 hours and assayed for O-linked GlcNAc transferase activity as described above. The data are expressed in terms of the fold enrichment observed in OGT specific activity relative to untransfected HeLa cells. The specific activity of HeLa cells was approximately 900 dpm/μg protein.
Figure 6. Hybridization of Human Liver O-GlcNAc Transferase to Genomic DNA from Various Species. Genomic DNA (3 mg/lane) was digested with EcoRI, separated by electrophoresis on 0.7% agarose gel and transferred onto nylon membranes. The blot was probed with radiolabelled full length human liver clone Lv4F and exposed to Kodak Bio-Max MR film at -70°C for 3 days. The location of 1 kb ladder standards is shown to the left.
Figure 7. Northern Blot Analysis of Human Tissues. Poly A RNA (2 mg) from a variety of adult human tissues was probed with radiolabelled full length human liver clone Lv4F (GlcNAc-T Probe Lv4F; top panel) and exposed to Kodak Bio-Max MR film at -70°C for 3 days. The blot was stripped according to the manufacturer's protocol and rescreened with a human β-actin gene probe (β-Actin Probe; bottom panel). The location of standards (kb) is shown to the left.
Detailed Description
This invention provides the first molecular characterization of a protein having O-GlcNAc transferase activity. The OGT was purified using recombinant rat nuclear pore protein, nup62, as substrate. The enzyme isolated from rabbit blood has an apparent molecular weight of 110 kDa. It was subjected to trypsin digestion, high pressure liquid chromatography separation of the tryptic peptides, and microsequencing. The partially sequenced enzyme was found to be nearly identical to a protein encoded in an open reading frame in the C. elegans gene, K04G7.3, on chromosome III (Figure 2A). Using this sequence information, a full length cDNA clone was isolated that corresponds to the nematode gene (SEQ ID NO:3). A human expressed sequence tag (EST) was used to make primers to the C-terminal part of the gene in order to isolate the human cDNA (Figure 2B). A search of GenBank using these genes identified homologous EST sequences from Schistosoma mansoni and rice. O-linked GlcNAc has been previously reported in schistosome glycoproteins (Nyame et al. (1987) J Biol Chem 262, 7990-7995) and in plants (Heese-Peck et al. (1995) Plant Cell 7, 1459-1471).
As used herein, a "vector" relates to a nucleic acid which functions to incorporate a particular nucleic acid segment, such as a sequence encoding a particular gene, into a cell. In most cases, the cell does not naturally contain the gene, so that the particular gene being incorporated is a heterologous gene. A vector may include additional functional elements that direct and regulate transcription of the inserted gene or fragment. The regulatory sequence is operably positioned with respect to the protein-encoding sequence such that, when the vector is introduced into a suitable host cell and the regulatory sequence is triggered, the protein is expressed. Regulatory sequences may include, by way of non-limiting example, a promoter, regions upstream or downstream of the promoter such as enhancers that may regulate the transcriptional activity of the promoter, and an origin of replication. A vector may additionally include appropriate restriction sites, antibiotic resistance or other markers for selection of vector containing cells, RNA splice junctions, a transcription termination region, and so forth.
As used herein, a "host cell" is a prokaryotic or eukaryotic cell harboring a nucleic acid vector coding for one or more gene products. Thus a host cell harbors a foreign or heterologous substance, the vector, which is not naturally or endogenously found in it as a component. A suitable host cell is one which has the capability for the biosynthesis of the gene products as a consequence of the introduction of the vector. The host cell may contain components responsive to the regulatory nucleotide sequence of the vector, such that the protein encoded by the vector is expressed in the host when the cells are cultured under suitable conditions that trigger the regulatory sequence. When the host cell is cultured in vitro, it may be a prokaryote, a single-celled eukaryote, or a mammalian cell.
Promoters for prokaryotic hosts include, by way of non-limiting example, the lac, trp, or beta-lactamase promoters, the promoter system from phage lambda, or other phage promoters such as T4 or T7. Promoters for mammalian cells include, by way of non-limiting example, expression control sequences, such as an origin of replication, an enhancer, and necessary information processing sites, such as ribosome binding sites, RNA splice sites, polyadenylation sites, and transcriptional terminator sequences. Expression control sequences are promoters derived from immunoglobulin genes, SV40, adenovirus, bovine papilloma virus, and so forth.
According to the instant invention, O-GlcNAc transferase may be isolated from mammalian sources by obtaining samples of tissue in which the enzyme activity is prevalent, and conducting purification procedures which rely on an assay of OGT activity in order to determine the presence of the protein and the preservation of activity. Tissues that may be used to isolate the enzyme include formed elements of blood, and various organs such as liver, kidney, pancreas, lung, the central nervous system, and so forth. Tissues may also be derived from tumors. Procedures that may be employed in the purification include fractional precipitation, chromatography, centrifugation, and the like; such procedures are well known to workers of skill in protein chemistry and enzymology and are set forth, for example, in Deutscher, M.P. (ed.), Guide to Protein Purification: Methods in Enzymology, Vol. 182, Academic Press, San Diego, CA, 1990; and Scopes, R.K., Protein Purification: Principles and Practice, 3rd Ed., Springer-Verlag, New York, NY, 1993, incorporated herein by reference.
An assay for OGT activity is based on assessing the ability of the enzyme to glycosylate a protein substrate. A detailed description of an assay is set forth in Lubas et al. (1995) which is incorporated herein by reference. Generally, this assay involves binding purified nup62 to nitrocellulose membranes, incubating with radiolabeled UDP-GlcNAc and a protein sample whose OGT activity is to be determined, and assessing the amount of the radiolabel incorporated into the immobilized nup62.
Mammalian OGT, in general, may be employed to identify a gene expressing human OGT from the human genome, such as from human EST's or from an appropriate human cDNA library. In order to accomplish this, the mammalian OGT obtained upon completion of the purification procedure described above is subjected to amino acid sequencing procedures. Sequencing is well known to skilled artisans in protein chemistry, and is described, for example, in Allen, G, Sequencing of Proteins and Peptides: Laboratory Techniques in Biochemistry and Molecular Biology, 2nd Ed., Elsevier Science Publ., Amsterdam, 1989. Using procedures such as these, a partial or complete amino acid sequence of the mammalian OGT may be obtained. It is then compared with amino acid sequences derived from either coding sequences of human EST's or cDNAs from libraries whose sequences are known. Once the amino acid sequence is matched to a human nucleotide sequence, the corresponding human DNA sequence encoding the matched amino acid sequence is available.
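By way of non-limiting illustration, the comparison of a partial peptide sequence with translated nucleotide sequences may be sketched in Python as follows. The sketch translates a nucleotide sequence in its three forward reading frames and searches each translation for the peptide, treating X as an unassigned residue and ignoring the parentheses that mark tentative assignments. The function names and the wildcard treatment of X are assumptions introduced for this illustration; they do not describe the database search actually performed.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate one forward reading frame; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(frame, len(dna) - 2, 3)
    )


def peptide_to_regex(peptide):
    """Build a pattern in which 'X' matches any residue (illustrative assumption)."""
    cleaned = peptide.replace("(", "").replace(")", "")
    return re.compile("".join("[A-Z]" if aa == "X" else aa for aa in cleaned))


def find_peptide(dna, peptide):
    """Return (frame, residue offset) for each forward frame containing the peptide."""
    pattern = peptide_to_regex(peptide)
    hits = []
    for frame in range(3):
        match = pattern.search(translate(dna, frame))
        if match:
            hits.append((frame, match.start()))
    return hits
```

A search of this kind reports only exact matches outside the X positions; a practical comparison would normally also examine the reverse strand and tolerate mismatches, neither of which is attempted here.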
The human nucleotide sequence then serves as a basis for preparing selective or unique oligonucleotide probes that may be used to isolate the entire gene from samples of human genomic DNA or human cDNA libraries. Alternatively, the human gene sequence may be used to prepare an oligonucleotide primer pair for use in the polymerase chain reaction to identify and amplify the gene as found in a DNA sample from a human source. The identified, or amplified, human DNA that includes the gene for OGT may then be incorporated into a plasmid vector, viral vector, or similar vector, for expression in a suitable host cell. Methods and techniques required to obtain the human gene, clone it in a vector, and express it are well known to skilled artisans in molecular biology and recombinant nucleic acid engineering. They are described in extensive experimental detail in Ausubel et al., Current Protocols in Molecular Biology, John Wiley and Sons, New York 1987 (updated quarterly); and Sambrook, Fritsch and Maniatis, Molecular Cloning: A Laboratory Manual 2nd Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1989, which are incorporated herein by reference. Recombinant OGT protein expressed in this way is then available for purification, characterization and/or enzymological assay. The inventors have shown that the enzyme made in Escherichia coli has biological activity and can be used in such assays.
Identification of the OGT gene in other organisms, whether mammalian, non-mammalian vertebrate, or non-vertebrate, and expression of the encoded protein, may be done in a fashion similar to that outlined above.
O-linked GlcNAc transferase is the enzyme involved in the monoglycosidic modification of several proteins whose activity may be modulated in different physiological or pathological states. Examples of such proteins include the oncogenes c-jun, c-myc, c-fos, v-erb A, and the tumor suppressor Rb; glucose-responsive elements that include myc-like response elements; and certain nuclear pore proteins including nup62. O-glycosylated sites derivatized by N-acetylglucosamine also are implicated in modulation of insulin secretion. OGT activity may also be involved in the pathogenesis of Alzheimer's disease. Two proteins implicated in this disease, tau and amyloid-β protein, are both glycosylated by OGT. Therapeutic approaches to modulating the glycosylation levels of the proteins involved in these various diseases provide the potential for ameliorating the pathologies ascribable to this process.
Identifying inhibitors of O-linked GlcNAc transferase offers the prospect of obtaining substances having potential as therapeutic agents in these pathologies. Enzymological assays that screen candidate substances in order to identify inhibitors of the enzyme afford a first step toward the development of such therapies. Any assay that is easy to implement and permits an assessment of the glycosylating activity of OGT suffices to accomplish this objective. An example of such an assay is one that includes the steps of (i) providing a sample comprising a protein which is a glycosylation target of OGT activity, (ii) contacting the sample with a solution comprising a substance that is a candidate for being an inhibitor, and further comprising UDP-GlcNAc and a protein having O-linked GlcNAc transferase activity, to generate a first test, (iii) determining the O-linked GlcNAc transferase activity in the first test, and (iv) evaluating, by comparing the activity determined in the first test with the activity determined in a second test in which the solution lacks the candidate substance, whether the O-linked GlcNAc transferase activity in the first test has been inhibited. The observation of inhibition then identifies the substance as an inhibitor of O-linked GlcNAc transferase.
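By way of non-limiting illustration, the comparison of the first test (with candidate) and the second test (without candidate) may be expressed as a percent inhibition, sketched below in Python. The background subtraction, the 50% cut-off, and the function names are assumptions made for this illustration only; the assay described above does not prescribe any particular threshold.

```python
def percent_inhibition(test_signal, control_signal, background=0.0):
    """Percent loss of activity in the candidate-containing test
    relative to the candidate-free control, after background subtraction."""
    net_control = control_signal - background
    if net_control <= 0:
        raise ValueError("control signal must exceed background")
    net_test = test_signal - background
    return 100.0 * (1.0 - net_test / net_control)


def is_inhibitor(test_signal, control_signal, background=0.0, threshold=50.0):
    """Flag a candidate when inhibition meets a chosen (hypothetical) threshold."""
    return percent_inhibition(test_signal, control_signal, background) >= threshold
```

The signals may be in any consistent unit, for example incorporated radioactivity in dpm, provided the first and second tests are measured in the same way.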
Assays that incorporate solid phase components in order to isolate the detected analyte offer particular advantages in implementation. Thus a significant assay of the invention immobilizes the glycosylation target on a localized region of a surface. The target may be a natural substrate for the enzyme, such as nup62, nup97, or nup200 (Macaulay et al., 1995; Lubas et al., 1995), or it may be a synthetic peptide substrate (Lubas et al., 1995). A solution containing the candidate inhibitor substance in varying concentrations, active OGT, and the glycosylating substrate, UDP-GlcNAc, is brought into contact with the localized region of the surface for a time sufficient for, and under conditions of buffering and temperature that favor, the OGT-catalyzed incorporation of GlcNAc into the immobilized protein or peptide substrate. A solution which lacks the candidate serves as a positive control for the absence of inhibition. In the final stage of the assay, the amount of GlcNAc transferred is determined. Techniques for determining the extent of glycosylation include labeling the GlcNAc moiety of the UDP-GlcNAc substrate, such as with a radioactive label, and evaluating any radioactivity incorporated. An alternative technique may be an immunoassay using an antibody raised specifically against the GlcNAc-derivatized protein or peptide. In this approach, the antibody serves as an indicator of the amount of GlcNAc transferred in the experiment, and the amount of antibody bound to the localized region of the surface is determined. Additional, functionally equivalent assays may also be devised for the purposes of the invention.
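Where the candidate is tested at varying concentrations, the results may be summarized, by way of non-limiting illustration, as the concentration at which activity falls to half that of the candidate-free control. The Python sketch below estimates that value by linear interpolation on a logarithmic concentration axis. The half-maximal summary, the interpolation method, and the function name are assumptions introduced here; they are not taken from the assay description above.

```python
import math


def ic50_by_interpolation(concentrations, activities):
    """Estimate the concentration giving 50% of control activity.

    concentrations: positive values in any consistent unit.
    activities: activity at each concentration, as percent of the
                candidate-free control.
    Returns None when the data never cross 50%.
    """
    points = sorted(zip(concentrations, activities))
    for (c_lo, a_lo), (c_hi, a_hi) in zip(points, points[1:]):
        if a_lo >= 50.0 >= a_hi and a_lo != a_hi:
            # Fraction of the way from c_lo to c_hi, on a log10 scale.
            frac = (a_lo - 50.0) / (a_lo - a_hi)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None
```

Interpolation between two bracketing points is the simplest choice; fitting a full dose-response curve would make better use of replicate data but requires additional assumptions about curve shape.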
For convenience and ease of assessing the inhibitory potential of a large number of candidate substances, these solid phase assays are readily adaptable to high throughput, multiple sample, repetitive assays. Repetitive assays are readily implemented, for example, using a multiwell microtiter plate or similar device. Additional, functionally equivalent formats for conducting repetitive assays, such as micro-binding arrays, are also contemplated within the scope of the assay method of the invention.
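By way of non-limiting illustration, readings from a multiwell plate may be reduced to a list of candidate inhibitors as sketched below in Python. The well labels, the use of the mean of the candidate-free wells as the reference, and the 50% threshold are hypothetical choices made for this sketch.

```python
def call_hits(plate, control_wells, threshold=50.0):
    """Return {well: percent inhibition} for wells meeting the threshold.

    plate: mapping of well label to measured signal.
    control_wells: labels of wells that received no candidate substance.
    """
    control_wells = set(control_wells)
    if not control_wells:
        raise ValueError("at least one control well is required")
    control_mean = sum(plate[w] for w in control_wells) / len(control_wells)
    if control_mean <= 0:
        raise ValueError("control wells show no activity")
    hits = {}
    for well, signal in plate.items():
        if well in control_wells:
            continue
        inhibition = 100.0 * (1.0 - signal / control_mean)
        if inhibition >= threshold:
            hits[well] = inhibition
    return hits
```

Each non-control well is compared against the same plate's own controls, so that plate-to-plate variation in enzyme activity does not by itself produce or mask a hit.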
The Northern analysis for the distribution of the OGT gene among human tissues described in the Examples distinguishes four distinct OGT transcripts at 9.3, 7.9, 6.3, and 4.4 kb. The signal in the pancreas is over 12-fold higher than that seen in the lung and kidney. There also appears to be a tissue-specific distribution of these different bands. The largest signals at 9.3 and 7.9 kb are most abundant in the pancreas and placenta while the 6.3 kb transcript is the major signal seen in the other tissues. It is not known at this time if the multiple transcripts represent the transcription of different genes or alternative splicing and processing of the same gene. The large size of the mRNA transcripts compared to the isolated clones and open reading frame of the gene presumably corresponds to extensive 5' and 3' untranslated sequences. This has been observed for a number of glycosyltransferases (Homa et al. (1993) J Biol Chem 268, 12609-12616). The role of these large regions of untranslated mRNA is not known but it may be important in regulation of these genes. The human clones identified here also show variation in the polyadenylation signal, which could partially explain the different size of the messages.
The hexosamine biosynthetic pathway is responsible for the synthesis of cytoplasmic UDP-GlcNAc utilized by OGT. Normally 2-3% of incoming glucose fluxes through this pathway (Marshall et al. (1991) J Biol Chem 266, 4706-4712). Increased glucose flux through the hexosamine biosynthetic pathway, caused by hyperglycemia, has been shown to mediate insulin resistance (Marshall et al., 1991; Rossetti et al. (1995) J Clin Invest 96, 132-140; Daniels et al. (1993) Mol Endocrinol 7, 1041-1048; Crook et al. (1995) Diabetes 44, 314-320; Hebert et al. (1996) J Clin Invest 98, 930-936; and Sayeski et al. (1996) J Biol Chem 271, 15237-15243). The hexosamine biosynthetic pathway, by controlling intracellular UDP-GlcNAc concentrations, may be acting in peripheral tissues as a glucose sensor which is reflected in substrate driven O-linked GlcNAc modification of intracellular proteins by OGT. Glucosamine administration has been shown to impair insulin secretion from the pancreas in response to glucose both in vitro and in vivo (Balkan et al. (1994) Diabetes 43, 1173-1179). That OGT is highly abundant in the pancreas (see Example 8) further suggests a possible role for this enzyme in insulin secretion and glucose homeostasis. This may be due to the ability of OGT to catalyze O-glycosylation of nuclear pore proteins by GlcNAc. The resistance to insulin in non-insulin dependent diabetes mellitus could be treated by reducing the transcriptional effects of increased flux of glucose through the hexosamine pathway. The effects of this pathway may also be felt in insulin-dependent diabetes mellitus.
High serum glucose levels seen in patients with type II diabetes have been shown, via shunting of a portion of the excess glucose into the hexosamine biosynthetic pathway, to result in increased UDP-GlcNAc and increased glycosylation of cellular proteins by OGT. The level of expression of OGT activity may thus be a predictor for assessing which patients with glucose intolerance are more likely to progress to overt diabetes. Red blood cells are a good source of the enzyme, and so quantifying OGT glycosylation in human blood may be used to screen whether a patient is at increased risk of developing diabetes.
O-linked GlcNAc modifies many phosphoproteins which are components of multimeric complexes. The sites modified by O-linked GlcNAc often resemble phosphorylation sites, leading to the suggestion that the modifications may compete for substrate in these polypeptides (Hart et al., 1995). In general, the sites modified by OGT resemble those of the glycogen synthase kinases (GSK-3, casein kinase II) and MAP kinase very closely. Interestingly, insulin activates the MAP kinase cascade, inhibiting GSK-3 inhibition of glycogen synthase, the rate limiting enzyme in glycogen synthesis (Chou et al. (1995) Proc Natl Acad Sci U S A 92, 4417-4421; Woodgett, J.R. (1991) Trends Biochem Sci 16, 177-181). GSK-3 also modifies the oncogene c-jun and negatively regulates its transactivating potential in vivo. Another oncogene, c-myc, is both modified by O-linked GlcNAc and phosphorylated by GSK-3 in a domain required for transcriptional activation (Woodgett, 1991; Stambolic et al. (1994) Biochem J 303 (Pt 3), 701-704; Plyte et al. (1992) Biochim Biophys Acta 1114, 147-162). Glucose-responsive elements from several mammalian genes have been identified and include myc-like response elements (Towle (1995) J Biol Chem 270, 23235-23238). Therefore, O-linked GlcNAc addition and phosphorylation by kinases such as GSK-3 may have as a common denominator their involvement in transcriptional regulation of glucose metabolism. Furthermore, since it has been shown that the levels of certain oncogenes in tumors can be useful markers for grading tumors, screening tumor cells for OGT activity may be a useful means of determining the aggressiveness or metastatic potential of these cells.
OGT inhibitors may also be used as therapeutic agents in conditions in which the proteins discussed above are implicated. In particular, it is believed that insulin secretion is modulated, in an aberrant homeostatic response, by glycosylation mediated by OGT. Likewise the activity of proteins encoded by tumor suppressor genes, as well as tumor necrotic activities, appears to be a candidate for therapeutic modulation by inhibitors of OGT. Thus, inhibitors of these effects may act as therapeutic agents.
Additionally, glycosylation of proteins such as tau and amyloid-β protein, which may be involved in Alzheimer's disease, may be favorably modulated by therapeutic agents to be identified in the screening assays of the invention. The expression of OGT in Alzheimer's disease may be elevated over normal levels (Griffith and Schmitz, 1995). Screening OGT activity in these patients may help to predict which patients are at increased risk of more rapid progression of the disease or which therapeutic agents may help to limit progression of the disease. Once such therapeutic agents have been positively identified, they may be employed in treatments of diabetes mellitus, Alzheimer's disease, or various malignant tumors.
Example 1. Isolation and Purification of Mammalian O-GlcNAc Transferase
OGT was purified from rabbit blood using a modification of previously described methods (Lubas et al., 1995; Haltiwanger et al., 1992). Fresh rabbit blood (4 L), treated with EDTA, was pelleted in a GS3 rotor at 2,000 × g for 5 min. The red blood cells were washed 3 times with an isotonic salt solution (140 mM NaCl, 5 mM KCl, 1.5 mM magnesium acetate) and collected after centrifugation at 2,000 × g for 5 min for the first two washes and 5,000 × g for 10 min after the final wash. Hypotonic lysis was performed using an equal volume of ice cold water containing the following protease inhibitors (Boehringer Mannheim, Indianapolis, IN): 1 mM phenylmethylsulfonylfluoride, 10 μg/mL chymostatin, 10 μg/mL pepstatin, 10 μg/mL leupeptin, 0.1% aprotinin, and 2 mM EDTA. The lysate was pelleted at 10,000 × g for 40 min in a GSA rotor. The soluble fraction was made 30% saturated in ammonium sulfate by adding a stock of 100% saturated ammonium sulfate equilibrated at 4°C slowly over 1 hour and stirring the solution an additional 2 hours at 4°C. The precipitate was collected after centrifugation at 10,000 × g for 40 min in a GSA rotor and resuspended in 15-20 mL of 50 mM Tris-HCl, pH 7.4, 2 mM MgCl2 using a Dounce homogenizer. The insoluble material was removed by centrifugation at 20,000 × g for 20 min in a SS34 rotor. The soluble fraction from the 30% ammonium sulfate precipitation was loaded onto a 15 mL Phenyl-Sepharose column (Pharmacia Biotech, Piscataway, NJ), washed with 100 mL of 10 mM Tris-HCl, pH 7.5, 100 mM ammonium sulfate and eluted with 40 mL of 10 mM Tris-HCl, pH 7.5, 60% ethylene glycol. All chromatography buffers also contained the following protease inhibitors: 0.1% aprotinin, 10 μg/mL leupeptin, 10 μg/mL pepstatin, 0.1 mM phenylmethylsulfonylfluoride; all procedures were performed at 4°C. The active fractions (15-20 mL) were pooled, passed through a 0.45 μm Millex-HA filter (Millipore Corp., Bedford, MA) and loaded onto a Mono Q HR 10/10 anion exchange column (Pharmacia Biotech, Piscataway, NJ) equilibrated with 50 mM Tris-HCl, pH 7.5, 12.5 mM MgCl2, 20% glycerol, 2 mM EDTA using a Pharmacia FPLC system. The column was washed with 30 mL of the equilibration buffer and then eluted with a linear gradient from 0 to 300 mM NaCl in 50 mL of equilibration buffer at a flow rate of 1 mL/min.
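By way of non-limiting illustration, two quantities in the foregoing procedure reduce to simple arithmetic, sketched below in Python: the volume of saturated ammonium sulfate stock needed to reach a target saturation, and the nominal NaCl concentration at a given point in the linear gradient. The sketch assumes that volumes are additive on mixing and ignores column and tubing dead volume; the function names are hypothetical.

```python
def saturated_stock_volume(sample_volume_ml, target_fraction, initial_fraction=0.0):
    """Volume of 100% saturated stock to add to reach the target saturation.

    Fractions are expressed from 0 to 1 (0.30 for 30% saturation).
    Assumes additive volumes: V_stock = V * (S2 - S1) / (1 - S2).
    """
    if not 0.0 <= initial_fraction < target_fraction < 1.0:
        raise ValueError("require 0 <= initial < target < 1")
    return sample_volume_ml * (target_fraction - initial_fraction) / (1.0 - target_fraction)


def gradient_concentration(volume_ml, start_mm=0.0, end_mm=300.0, gradient_volume_ml=50.0):
    """Nominal NaCl concentration (mM) after volume_ml of a linear gradient."""
    if not 0.0 <= volume_ml <= gradient_volume_ml:
        raise ValueError("volume lies outside the gradient")
    return start_mm + (end_mm - start_mm) * volume_ml / gradient_volume_ml
```

Under these assumptions, reaching 30% saturation requires roughly 0.43 volumes of saturated stock per volume of sample, and at 1 mL/min the gradient is nominally at 150 mM NaCl after 25 min.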
The active fractions (8-10 mL) were pooled, concentrated to a final volume of 0.3 mL using a Centricon 30 microconcentrator (Amicon, Beverly, MA), and loaded in 0.15 mL aliquots onto a Superose 6 FPLC column (Pharmacia Biotech, Piscataway, NJ) equilibrated with 50 mM Tris-HCl, pH 7.5, 12.5 mM MgCl2, 20% glycerol, 2 mM EDTA, 100 mM NaCl. The column was run at a flow rate of 0.15 mL/min and 0.6 mL fractions were collected. Protein concentration was determined with the BCA reagent (Pierce Chemicals, Rockford, IL) using bovine serum albumin as a standard. O-GlcNAc transferase activity was measured using recombinant nup62 bound to nitrocellulose membranes as previously described (Lubas et al., 1995) or in a modification of the method using recombinant nup62 bound to ScintiStrip polystyrene scintillation strips (Wallac Oy, Turku, Finland). A typical preparation gives a 30,000-fold purification with a 1-2% yield.
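The fold purification and yield figures follow from the total protein and total activity recovered at each step. The sketch below shows that bookkeeping in Python. The step names and numbers are hypothetical placeholders chosen only to land in the range quoted above; they are not measurements from this example.

```python
def purification_table(steps):
    """Summarize a purification.

    steps: list of (name, total_protein_mg, total_activity_units), with the
    starting material first. Returns one dict per step.
    """
    _, protein0, activity0 = steps[0]
    specific0 = activity0 / protein0
    table = []
    for name, protein, activity in steps:
        specific = activity / protein  # activity units per mg protein
        table.append({
            "step": name,
            "specific_activity": specific,
            "fold": specific / specific0,              # enrichment over start
            "yield_pct": 100.0 * activity / activity0,  # activity recovered
        })
    return table


# Hypothetical values (assumptions, not data from the source).
example_steps = [
    ("hemolysate supernatant", 300000.0, 3000.0),
    ("Superose 6 pool", 0.15, 45.0),
]
example_table = purification_table(example_steps)
for row in example_table:
    print(row["step"], round(row["fold"]), round(row["yield_pct"], 2))
```

Fold purification is the ratio of specific activities, while yield is the ratio of total activities, which is why a very large enrichment can coexist with a recovery of only a few percent.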
Purified O-GlcNAc transferase was subjected to sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE). The purified enzyme, whose SDS-PAGE image is shown in Figure 1, contains two polypeptides: an intense band at 110 kDa and a considerably weaker band at 78 kDa. Recovery of the 78 kDa band was variable between preparations, thus preventing isolation of sufficient amounts for further analysis. Proteolytic fingerprinting of both polypeptides suggested that they are related. The 78 kDa band may be a proteolytic product of the larger 110 kDa band or the product of a second translation start site. The apparent molecular weight of the slower-migrating band, 110 kDa, is consistent with the translation product of the human coding sequence, shown in
SEQ ID NO:2, which has 920 amino acid residues and with the translation product of the C. elegans coding sequence shown in SEQ ID NO:4, which has 1151 residues. For these reasons it is believed that the 110 kDa band represents the complete OGT protein of the invention.
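As a rough cross-check on the sizes discussed above, a polypeptide mass can be estimated from its residue count. The sketch assumes an average residue mass of about 110 Da, a common rule of thumb that is not taken from this document; actual masses depend on amino acid composition and on any post-translational modification.

```python
AVG_RESIDUE_MASS_DA = 110.0  # assumed rule-of-thumb average, not from the source


def approx_mass_kda(n_residues, avg_residue_mass_da=AVG_RESIDUE_MASS_DA):
    """Estimate polypeptide mass in kDa from the number of residues."""
    return n_residues * avg_residue_mass_da / 1000.0


human_kda = approx_mass_kda(920)      # SEQ ID NO:2
celegans_kda = approx_mass_kda(1151)  # SEQ ID NO:4
print(round(human_kda, 1), round(celegans_kda, 1))
```

Under this assumption the 920-residue human product comes out near 101 kDa and the 1151-residue C. elegans product near 127 kDa, so the agreement with the 110 kDa band is approximate rather than exact.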
Example 2. Sequencing of purified rabbit OGT
The 110 kDa band obtained in Example 1 was cut out of the gel and sent to the William M. Keck Foundation at Yale University for in-gel trypsin digestion, high pressure liquid chromatography purification of the resulting tryptic peptides, and amino acid microsequencing. Two peptides were initially identified: a 20-residue peptide, XVSLDPNFLDAYINLGNVLK, and a 17-residue peptide, XXXSQLT(C)LG(C)LELIAK. The 20-mer was a perfect match to a sequence contained within the expressed sequence tag cDNA clone yk13c2 (gb-CelK013C2F) and in a previously uncharacterized gene, K04G7.3 (Genbank accession number U21320), identified as part of the C. elegans genome sequencing project (Figure 2A). Both peptide sequences ended in basic amino acids, consistent with the generation of these fragments by trypsin digestion. Figure 2A shows the structure of the gene and localizes the tryptic peptides to the 8th and 14th exons in the C. elegans gene. Two human expressed sequence tags, Genbank accession numbers R75943 and R76782, showing greater than 60% identity to the C. elegans gene K04G7.3, were also identified and found to match the 17-mer rabbit OGT tryptic peptide perfectly (Figure 2B).
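Matching a microsequenced peptide against a database sequence has to tolerate unidentified residues (X) and tentative calls shown in parentheses. The following is a minimal sketch of such a comparison, not the search actually used here. The protein fragment is residues 826-849 of SEQ ID NO:2 transcribed to one-letter code; the function names are illustrative.

```python
import re


def normalize(peptide):
    """Remove the parentheses that flag tentative residue calls, e.g. '(C)'."""
    return peptide.replace("(", "").replace(")", "")


def find_peptide(peptide, protein, first_residue=1):
    """Locate a microsequenced peptide in a protein (one-letter code).

    'X' stands for an unidentified residue and matches any residue.
    Returns 1-based (start, end) in the numbering of the full protein,
    or None if there is no match.
    """
    pattern = re.compile(normalize(peptide).replace("X", "."))
    match = pattern.search(protein)
    if match is None:
        return None
    return (match.start() + first_residue, match.end() + first_residue - 1)


def has_tryptic_end(peptide):
    """Trypsin cleaves after Lys (K) or Arg (R), so tryptic peptides end in K or R."""
    return normalize(peptide)[-1] in "KR"


# Residues 826-849 of SEQ ID NO:2, transcribed to one-letter code.
HUMAN_826_849 = "ASRVAASQLTCLGCLELIAKNRQE"
PEPTIDE_17 = "XXXSQLT(C)LG(C)LELIAK"
PEPTIDE_20 = "XVSLDPNFLDAYINLGNVLK"

hit = find_peptide(PEPTIDE_17, HUMAN_826_849, first_residue=826)
print(hit, has_tryptic_end(PEPTIDE_17), has_tryptic_end(PEPTIDE_20))
```

The residue immediately before the match in this fragment is an arginine, which is also what trypsin cleavage would be expected to leave upstream of the peptide.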
Example 3. Cloning of the cDNA encoding C. elegans O-GlcNAc Transferase
The OGT cDNA was isolated using a combination of phage library screening and polymerase chain reaction. Polymerase chain reaction (PCR) primers GTTTGTTACTTGAAAGCAATCG (SEQ ID NO:5) and
ATCGAAAATCCTGGCCTCTT (SEQ ID NO:6) were made to amplify a 195 base pair fragment from the cDNA clone yk13c2 (Figure 2A). After PCR amplification, this fragment was gel purified and used to probe a lambda ZAP (Stratagene, Cambridge, UK) C. elegans cDNA library (10^10 units/mL) (Barstead et al., 1989). Of 140,000 clones screened, only 1 positive plaque was identified. The identified insert (3.1 kb) was subcloned into pGEM and pET-32 (Novagen, Madison, WI) using EcoR I. This insert was sequenced and localized to the C-terminal 70% of the open reading frame of C. elegans K04G7.3. Using the known sequence for the open reading frame for the C. elegans K04G7.3 gene, primers were constructed to amplify the 5' end using high fidelity Takara Biomedicals (Gennevilliers,
France) Ex Taq DNA polymerase. The PCR fragment was cloned into the Hind III site in the original clone isolated from the lambda ZAP library, yielding a cDNA clone designated ZAP-CeOGT (Genbank accession number U77412). This clone was sequenced (SEQ ID NO:3) and found to be nearly identical to the published C. elegans K04G7.3 gene sequence, except that it was lacking the third and fourth exons predicted by the program Genefinder (Favello et al., 1995). The exclusion of these two exons (see Figure 2A), corresponding to base pairs 204-333 in the previously published sequence, does not affect the reading frame of the remaining sequence. The consensus polyadenylation signal AATAAA occurs at positions 4065-4070. The amino acid sequence of the protein encoded by SEQ ID NO:3, beginning with the initiation codon at position 1, is shown in SEQ ID NO:4. The peptide sequences predicted to correspond to the partial amino acid sequence of the tryptic peptides XVTLDPNFLDAYINLGNVLK and XXXSQLT(C)LG(C)LELIAK occur at positions 321-340 and 1060-1076, respectively. There are 11 single base changes between this C. elegans cDNA clone and the previously published sequence for gene K04G7.3. Sequence analysis of the gene shows that it has 13 tandem tetratricopeptide repeats contained in exons 6-10, followed by a putative nuclear localization sequence near the end of exon 10 (Figure 3B).
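Basic properties of the two PCR primers (SEQ ID NO:5 and SEQ ID NO:6) can be tabulated with a few lines of Python. The melting temperature uses the Wallace rule, a rough estimate for short oligonucleotides that is an assumption of this sketch; the source does not state how the primers were designed or what annealing conditions were used.

```python
def base_counts(primer):
    """Count each base in a primer given as a DNA string."""
    primer = primer.upper()
    return {base: primer.count(base) for base in "ACGT"}


def gc_fraction(primer):
    """Fraction of G and C bases in the primer."""
    counts = base_counts(primer)
    return (counts["G"] + counts["C"]) / len(primer)


def wallace_tm(primer):
    """Wallace rule: Tm ~ 2*(A+T) + 4*(G+C) in degrees C (rough guide only)."""
    counts = base_counts(primer)
    return 2 * (counts["A"] + counts["T"]) + 4 * (counts["G"] + counts["C"])


PRIMERS = {
    "SEQ ID NO:5": "GTTTGTTACTTGAAAGCAATCG",
    "SEQ ID NO:6": "ATCGAAAATCCTGGCCTCTT",
}

for name, seq in PRIMERS.items():
    print(name, len(seq), round(gc_fraction(seq), 2), wallace_tm(seq))
```

More accurate estimates would use nearest-neighbor thermodynamics and the actual salt and primer concentrations, none of which are given here.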
Example 4. Cloning of the Human O-GlcNAc Transferase
The human O-GlcNAc transferase was isolated using primers constructed from the sequence of a human expressed sequence tag, Genbank accession number R75943 (Figure 2B). Two oligonucleotide primers, GCGTTTTCCAGCAGTAGGAG (SEQ ID NO:7) and ACATTCTGAAGCGTGTTCCC (SEQ ID NO:8), were constructed and used to screen Superscript™ human brain and liver cDNA libraries (Gibco BRL Life Technologies, Gaithersburg MD) using the Genetrapper™ cDNA positive selection system (Gibco BRL Life Technologies, Gaithersburg MD). The first primer was biotinylated at the 3' end with biotin-14-dCTP using terminal deoxynucleotidyl transferase. The biotinylated primer was used to screen single-stranded human liver or brain cDNA libraries. Hybrids between the biotinylated oligonucleotide and the cDNA libraries were captured on streptavidin-coated paramagnetic beads and retrieved using a magnet. The captured ssDNA was separated from the biotinylated primer, repaired to double-stranded DNA using the second oligonucleotide primer, and transformed into ElectroMAX DH10B cells (Gibco BRL Life Technologies, Gaithersburg MD). A total of 48 liver and 53 brain clones were identified on the initial screen. These clones were then rescreened by hybridization with the full length human placenta expressed sequence tag, Genbank accession number R75943; 40 of 48 liver and 42 of 53 brain clones were found to be positive. The insert size was estimated by restriction digestion with Sal I and Not I. All liver clones longer than 2.5 kb and brain clones longer than 3 kb were screened by in vitro translation.
The largest in vitro translation product identified was a protein of about 100 kDa produced by 6 different liver and 2 brain clones (data not shown). DNA sequencing showed that they were all overlapping clones of the same gene with variable 5' and 3' untranslated regions. The full length clone from liver, designated Lv4F, was fully sequenced (SEQ ID NO:1) and is provided in Genbank accession number U77413. The consensus polyadenylation signal AATAAA occurs at positions 3027-3032. The oligonucleotide primers GCGTTTTCCAGCAGTAGGAG and ACATTCTGAAGCGTGTTCCC used to screen the human cDNA libraries using the Genetrapper positive selection system are found at positions 2514-2533 and 2474-2493, respectively. The translated protein sequence predicted by the open reading frame beginning at position 265 is given in SEQ ID NO:2. The peptide sequences predicted to correspond to the partial amino acid sequence of the tryptic peptides XVTLDPNFLDAYINLGNVLK and XXXSQLT(C)LG(C)LELIAK occur at positions 91-110 and 829-845, respectively.
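Feature coordinates such as these can be re-derived from the sequence listing. The sketch below checks that the coding region annotated for SEQ ID NO:1 (265...3024) is a whole number of codons and locates the polyadenylation signal in the 3' end of the sequence. The fragment is positions 3025-3083 copied from the listing; the helper names are illustrative.

```python
def orf_codons(start, end):
    """Number of codons in a coding region given 1-based inclusive coordinates."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("coding region is not a whole number of codons")
    return length // 3


def locate(motif, fragment, fragment_start):
    """Return 1-based (first, last) coordinates of motif, or None if absent.

    fragment_start is the coordinate of the first base of the fragment
    in the full-length sequence.
    """
    index = fragment.find(motif)
    if index < 0:
        return None
    first = fragment_start + index
    return (first, first + len(motif) - 1)


# 3' end of SEQ ID NO:1 (positions 3025-3083), copied from the listing.
TAIL_3025 = (
    "TAAAT" "AAAGACTGCA" "CAGGAGAATT" "ACCCCTAAAA"
    "AAAAAAAAAA" "AAAAGGGCGG" "CCGC"
)

codons = orf_codons(265, 3024)
polya = locate("AATAAA", TAIL_3025, 3025)
print(codons, polya)
```

The codon count agrees with the 920-residue length given for SEQ ID NO:2, and the signal position agrees with the coordinates stated in the text.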
From 101 initial clones, 8 full length clones were identified. Six of these clones were obtained from liver and 2 from brain libraries; all had overlapping 5' and 3' untranslated sequences. When they were expressed by translating in vitro, each of these full length clones produced a polypeptide of approximately 100 kDa and variable amounts of a smaller 70 kDa species. The 70 kDa species, resulting from an alternative translation start, may be related to the 78 kDa band that has been observed with purified OGT preparations (Figure 1). Clone Lv4F (Genbank accession number U77413) was found to encode an open reading frame having 68% identity with the C. elegans K04G7.3 gene product over the C-terminal 872 amino acids (Figure 3A). The human cDNA open reading frame encodes a shorter protein (103 kDa) containing only the last 9 tetratricopeptide repeat (TPR) sequences found in C. elegans (Figure 3B). While this is consistent with the observed size of the in vitro translation product, it is likely that post-translational modification of the enzyme occurs, since the human OGT translated in reticulocyte lysate was slightly larger than the product seen from wheat germ extract (data not shown). This behavior has been previously observed for proteins modified by O-linked GlcNAc (Starr et al., 1990).
Example 5. In vitro Translation and Expression in E. coli
To examine the properties of the protein, OGT was expressed in E. coli. In vitro translation was performed using the TNT-T7 coupled wheat germ extract system (Promega Ltd., Southampton, UK) according to the manufacturer's instructions. The full length C. elegans cDNA (ZAP-CeOGT) was cloned into pET-32a and transformed into E. coli BL21(DE3) cells (Novagen, Madison, WI) for expression. Cells were grown in Luria-Bertani medium containing 50 μg/mL carbenicillin at 37°C and 220 rpm until the OD600 was about 0.6. Cells were induced with 1 mM isopropylthiogalactopyranoside for 90 min at 37°C and harvested by centrifugation at 3000 rpm for 5 min at 4°C in a Beckman GS-6R centrifuge. After resuspension in 1/10 volume of 50 mM Tris-HCl, pH 8, 2 mM EDTA, 100 μg/mL lysozyme, 0.1% Triton X-100, cells were incubated at 30°C for 15 min, placed in an ice bath, and sonicated twice for 10 seconds to shear the DNA. The O-GlcNAc transferase was pelleted at 12,000Xg for 10 min at 4°C and solubilized with His-Tag (Novagen Inc., Madison, WI) binding buffer in 6 M urea (5 mM imidazole, 50 mM NaCl, 20 mM Tris-HCl, pH 7.9). After centrifugation at 12,000Xg for 10 min at 4°C, the solubilized protein was loaded onto a 2.5 mL His-Tag column (Novagen Inc., Madison, WI), washed with 8 mL of binding buffer, and eluted with 8 mL of elution buffer in 6 M urea (60 mM imidazole, 50 mM NaCl, 20 mM Tris-HCl, pH 7.9). Column fractions were analyzed by sodium dodecyl sulfate-polyacrylamide gel electrophoresis. Full length OGT was gel purified and used to generate polyclonal antibodies in guinea pigs.
The enzyme can also be made in E. coli in a form that has biological activity and can be used in OGT assays. This was accomplished by incubating the recombinant bacteria described in the preceding paragraph overnight at room temperature without induction by isopropylthiogalactopyranoside.
Example 6. Transgenic C. elegans Lines Overexpressing O-GlcNAc Transferase
To examine the localization of the OGT in C. elegans, several transgenic lines were produced which overexpress enzyme under control of heat shock promoters. Transgenic C. elegans strains were generated by microinjection using the pRF4 plasmid as a marker to identify transformed animals (Mello et al. (1995)). Test plasmid constructs were injected in combination with pRF4 DNA at 50 ng/μL each. Overexpression was achieved by transformation of N2 animals with derivatives of the heat shock promoter vectors (pPD49.78 and pPD49.83) (Mello et al. (1995)) in which the full length C. elegans OGT cDNA (Nco I-Sac I partial digest, 4.25 kb) was cloned into the Nco I and Sac I restriction sites of the vector. Transgenic animals were heat shocked by treatment at 33°C for 2-4 hours to induce production of fusion proteins driven by heat shock promoters. Overexpressed OGT was detected by immunoblotting using an anti-OGT guinea pig antibody raised against the recombinant protein made in E. coli (see Example 5). When induced to overexpress by heat shock, the C. elegans OGT was readily detected by immunoblotting using the guinea pig antisera (Figure 4A).
Immunofluorescence of O-GlcNAc Transferase in C. elegans embryos was carried out by fixing the embryos with formaldehyde on glass slides (Krause et al. (1990)) and visualizing by indirect immunofluorescence using a FITC-labelled goat anti-guinea pig antibody, and guinea pig antibody raised against recombinant C. elegans OGT. The immunofluorescence was detected using a Biorad 1024 Confocal Microscope equipped with a 60X objective. Wild-type C. elegans embryos showed a punctate perinuclear and nuclear pattern (Figure 4B, top panel). In embryos overexpressing the enzyme, (Figure 4B, lower panels), OGT was found within the nucleus in the gut suggesting that the nuclear localization sequence in C. elegans OGT is functional. In other regions of the embryos, the overexpressed OGT exhibited a distinct perinuclear localization (small arrows). This was particularly striking in the neurons. Similar localization was observed in all of the lines produced although the tissue distribution was somewhat dependent upon the heat shock promoter used as has been previously reported (Mello et al., 1995). Thus, the enzyme was found in both the nucleus and the cytoplasm, depending on the tissue overexpressing the cDNA.
Example 7. Expression of Human O-GlcNAc Transferase in Transiently Transfected HeLa Cells
Human OGT clone Lv4F was introduced into HeLa cells by lipid-mediated transfection or by electroporation. For the lipid-mediated method, 10^5 cells were plated per well in 6-well plates in Dulbecco's minimal essential medium (DMEM)/10% fetal bovine serum (FBS) 14-18 hours prior to transfection. The transfection was carried out in OptiMEM (Life Technologies, Inc., Paisley, Scotland). The plasmid pECE-OGT/Lv4F (0.1 mg) was mixed with 4 μL of Lipofectin reagent (Life Technologies, Inc., Paisley, Scotland) and applied to the cells according to the manufacturer's recommendations. Control cells were transfected with plasmid bearing no insert. Electroporation of HeLa cells was performed in OptiMEM in cell suspensions (5 x 10^6/mL) containing 0.5 mg/mL of pECE-OGT/Lv4F or pECE. Cells were shocked at 4°C, with capacitance set at 1180 μF and voltage at 200 V, using a BRL electroporator (Gibco BRL Life
Technologies, Gaithersburg MD). Following 1-2 min on ice after the shock, cells were diluted and plated in DMEM/10% FBS. Transfection efficiencies for both transfection methods were estimated using a plasmid which encodes green fluorescent protein (pGreenLantern; Life Technologies, Inc., Paisley, Scotland) and were typically about 10-20%. Using either method, cells were harvested 24 h after transfection, lysed by sonication, and centrifuged. The supernatant fraction was assayed for OGT enzyme activity using ScintiStrip wells (Wallac Oy, Turku, Finland) precoated with nup62. Assays were performed in 50 mM Tris-HCl, pH 7.4, 12.5 mM MgCl2 and 1 μCi UDP-[3H]GlcNAc in a final volume of 40 μL for 90 min at 37°C and 220 rpm.
The full length human cDNA (clone Lv4F) was cloned into the pECE vector downstream of the SV40 promoter. HeLa cell cultures were transiently transfected with vector alone or with the vector containing the clone Lv4F open reading frame. Cells were harvested at 24 hours. The transfected cells did not survive well during prolonged incubations, i.e., more than about 72 hours, suggesting the gene may be toxic to the cells. Toxicity has also been observed in experiments where the gene was overexpressed in transgenic C. elegans. Up to a three-fold increase in enzyme activity relative to background activity was observed using two different transfection procedures (Figure 5).
Example 8. Isolation of Cloned Human OGT from Transfected HeLa Cells
The full length human cDNA (clone Lv4F) is to be cloned into the pECE vector downstream of the SV40 promoter. HeLa cell cultures are to be transiently transfected with the vector containing the clone Lv4F open reading frame. Cells are to be harvested at about 24 hours after transfection. The cells are to be disrupted and human OGT is to be purified from the cytosolic supernatant following procedures similar to those described in Example 1 and in Lubas et al. (1995) for the isolation of rabbit OGT.
Example 9. Conservation and Tissue Distribution of O-GlcNAc Transferase
Samples of human, rabbit, rat, and mouse genomic DNA (CLONTECH Laboratories, Palo Alto, CA) were digested overnight with EcoR I, chromatographed (3 μg/lane) on a 0.7% agarose gel, and transferred to a nylon membrane (Gene Screen Plus, DuPont) by capillary action. The blot was prehybridized in 1% bovine serum albumin, 0.5 M NaPO4, pH 7, 1 mM EDTA, 7% sodium dodecyl sulfate, 100 μg/mL denatured salmon testis DNA, at 55°C for 1 hour and then hybridized overnight at 55°C with the gel purified, radiolabelled 3 kb Not I-Sal I fragment from human liver clone Lv4F. The blot was washed two times for 15 min with 0.5% bovine serum albumin, 5% sodium dodecyl sulfate, 40 mM NaPO4, pH 7, 1 mM EDTA at 55°C, then two times for 15 min with 1% bovine serum albumin, 40 mM NaPO4, pH 7, 1 mM EDTA at 55°C and once with 0.2X SSPE (30 mM NaCl, 2 mM NaPO4, pH 7.4, 0.2 mM EDTA) at 55°C for 15 min. It was exposed to Kodak Bio-max MR film for 1-7 days at -70°C. A human multiple tissue Northern blot (CLONTECH Laboratories, Palo Alto, CA) was prehybridized as described previously for the Southern blot except that all incubations were performed at 65°C. Hybridization of the human liver OGT cDNA (clone Lv4F) to genomic
DNA digested with EcoR I from several different species is shown in Figure 6. The Southern analysis identifies a single large fragment in human, while several smaller fragments are observed in rabbit, rat, and mouse genomic DNAs. The high degree of conservation observed is not surprising since C. elegans and human OGT cDNAs were found to be similar. Several additional sequences in the database searches were found to be related to the C. elegans and human sequences, including sequences from schistosomes and rice.
To examine the relative abundance of the human OGT mRNA in various adult human tissues, a Northern blot analysis was performed (Figure 7). The human clone Lv4F probe identifies four distinct bands at 9.3, 7.9, 6.3, and 4.4 kb which are present in different amounts in various human tissues. The pancreas, where the two largest species (9.3 and 7.9 kb) were most abundant, shows the highest level of expression. Skeletal muscle and heart exhibited a relative enrichment of the 6.3 kb species, while all transcripts were at low relative abundance in the kidney and lung. As a control, a Northern blot analysis of the same blot with β-actin cDNA confirmed that similar levels of mRNA were loaded in each lane.
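Transcript sizes on a Northern blot are usually read off a standard curve in which the logarithm of marker size is approximately linear in migration distance. The sketch below fits that line by least squares. The marker sizes and distances are hypothetical, since the source does not report the ladder or the migration distances, and the log-linear model is an assumption of this sketch.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_size_kb(distance_mm, markers):
    """Estimate a band size from its migration distance.

    markers: list of (size_kb, distance_mm) for the RNA size standards.
    Assumes log10(size) is linear in distance over the marker range.
    """
    distances = [d for _, d in markers]
    log_sizes = [math.log10(s) for s, _ in markers]
    slope, intercept = fit_line(distances, log_sizes)
    return 10 ** (slope * distance_mm + intercept)


# Hypothetical ladder (assumed values for illustration only).
ladder = [(9.5, 20.0), (7.5, 24.0), (4.4, 33.0), (2.4, 43.0), (1.35, 53.0)]
print(round(estimate_size_kb(27.0, ladder), 1))
```

Estimates are only reliable for bands that migrate between the largest and smallest markers, where the log-linear approximation holds reasonably well.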
Example 10. Assessing predisposition toward type II diabetes.
A sample of blood from a patient suspected of having hyperglycemia that may evolve into type II diabetes is to be provided. The red blood cells are to be isolated by centrifugation, lysed, and the supernatant fraction is to be concentrated by a method chosen from (a) retention by an ultrafilter and (b) precipitation by concentrated ammonium sulfate, as set forth in Example 1. The resulting sample containing OGT activity is to be resuspended to a fixed volume and the OGT activity is to be assayed using procedures set forth in Example 1. Correlative samples are to be assayed periodically from healthy human subjects, known human diabetics, and human patients suffering from other pathologies not related to type II diabetes. The levels of OGT activity found in the correlative samples are to be used to establish a range of limits for normal and type II diabetic levels of OGT in human subjects. Patients are to be evaluated as being predisposed to type II diabetes if the level of OGT activity in the samples falls within the range established for patients known to have type II diabetes.
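The comparison against correlative samples can be sketched as follows. The source does not say how the ranges are to be computed, so the mean plus or minus two standard deviations used here is an assumption, as are the group names and all activity values, which are invented for illustration.

```python
from statistics import mean, stdev


def reference_range(values, k=2.0):
    """Range taken as mean +/- k sample standard deviations (an assumed convention)."""
    center, spread = mean(values), stdev(values)
    return (center - k * spread, center + k * spread)


def classify(activity, ranges):
    """Return the names of every reference group whose range contains activity."""
    return [name for name, (low, high) in ranges.items() if low <= activity <= high]


# Hypothetical OGT activities in arbitrary units (not data from the source).
correlative = {
    "healthy": [10.0, 11.0, 9.0, 10.0, 12.0, 8.0],
    "type II diabetic": [20.0, 22.0, 19.0, 21.0, 23.0, 18.0],
}
ranges = {group: reference_range(vals) for group, vals in correlative.items()}

print(classify(21.0, ranges))
```

A value that falls in neither range, or in both when the ranges overlap, would need a separate decision rule, which the source leaves unspecified.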
Example 11. Assessing predisposition toward Alzheimer's disease.
A sample from the central nervous system of a patient suspected of having Alzheimer's disease or of being at increased risk of developing the disease is to be provided. The sample may, for example, be drawn from the cerebrospinal fluid or from cellular material obtained from the brain. Cellular material, if used, is to be lysed, and OGT activity in the supernatant portion or in an ultrafilter retentate of the sample is to be concentrated. The OGT activity is to be resuspended to a fixed volume and the OGT activity is to be assayed using procedures set forth in Example 1. Correlative samples are to be assayed periodically from healthy human subjects, and from known human Alzheimer's disease patients whether living or at autopsy. The levels of OGT activity found in the correlative samples are to be used to establish a range of limits for normal and pathological levels, related to Alzheimer's disease, of OGT in human subjects. A patient is to be evaluated as being predisposed to or at increased risk of developing Alzheimer's disease if the level of OGT activity in the sample from the patient falls within the range established for patients known to have Alzheimer's disease.
Example 12. Assessing metastatic potential of a tumor.
A sample from a tumor present in a patient is to be provided. The sample may be obtained, for example, by surgical biopsy. Cellular material is to be homogenized, and OGT activity in the supernatant portion or in an ultrafilter retentate of the sample is to be concentrated. The OGT activity is to be resuspended to a fixed volume and the OGT activity is to be assayed using procedures set forth in Example 1. The substrate is to be a purified oncogene protein, such as a recombinant form of an oncogene protein. These proteins may be chosen from among myc, p53, Rb, and v-erb, and similar known oncogene proteins.
Correlative samples are to be assayed periodically from healthy human subjects or from fresh autopsy samples, and from tumors derived from known human cancer patients. The levels of OGT activity found in the correlative samples are to be used to establish a range of limits for normal and pathological levels of OGT in human subjects. Pathological levels are those found in the samples of known tumors. A tumor in a patient being tested is to be evaluated as having high metastatic potential if the level of OGT activity in the sample from the patient falls within the range established for patients known to have metastatic tumors.
SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT: Hanover, John A. Lubas, William
(ii) TITLE OF INVENTION: O-Linked GlcNAc Transferase
(OGT) : Cloning, Molecular Expression, and Methods of Use
(iii) NUMBER OF SEQUENCES: 8
(iv) CORRESPONDENCE ADDRESS :
(A) ADDRESSEE: Fitch, Even, Tabin & Flannery
(B) STREET: 135 South LaSalle Street, Suite 900
(C) CITY: Chicago
(D) STATE: IL
(E) COUNTRY: USA
(F) ZIP: 60603-4277
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Diskette
(B) COMPUTER: IBM Compatible
(C) OPERATING SYSTEM: Windows
(D) SOFTWARE: FastSEQ for Windows Version 2.0
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Kaba, Richard A
(B) REGISTRATION NUMBER: 30,562
(C) REFERENCE/DOCKET NUMBER: 6299/63320
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 312-372-7842
(B) TELEFAX: 312-372-7848
(C) TELEX:
(2) INFORMATION FOR SEQ ID NO : 1 :
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3083 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 265...3024
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 1 :
CCGGAAACAG TGGGGGTAGG AAAACTCGGC CTCAAGTTGC GCCCTCTAGG TAGCACTTGA 60
AAACATGACA AGGGCCCGTA GTTGTTTGGA TAAGAGAACT CCAGCATAGA GCCTTATAGC 120
AACTGACTTC CCAGTTAAGT CCCAGTGTAA GGGTTGGTCT TTGGTTGGCA GAACTGAACA 180
TGGTGGTTTG CACTTGGGTT CTGGTGGCGC AGGCGCAGGA GCAGCCAGCT GTGGCAGCGC 240
ATTAGTTTTG GCGCAAGCGA GCCT ATG CTG CAG GGT CAC TTT TGG CTG GTC 291
Met Leu Gln Gly His Phe Trp Leu Val
1 5
AGA GAA GGA ATA ATG ATA TCA CCT TCT TCC CCC CCT CCC CCC AAT CTT 339
Arg Glu Gly Ile Met Ile Ser Pro Ser Ser Pro Pro Pro Pro Asn Leu
10 15 20 25
TTT TTT TTC CCT TTA CAA ATT TTC CCC TTT CCC TTT ACC TCC TTT CCC 387
Phe Phe Phe Pro Leu Gln Ile Phe Pro Phe Pro Phe Thr Ser Phe Pro
30 35 40
TCC CAT CTT CTT TCA TTA ACC CCT CCT AAG GCA TGT TAT TTG AAA GCA 435
Ser His Leu Leu Ser Leu Thr Pro Pro Lys Ala Cys Tyr Leu Lys Ala
45 50 55
ATT GAG ACG CAA CCG AAC TTT GCA GTA GCT TGG AGT AAT CTT GGC TGT 483
Ile Glu Thr Gln Pro Asn Phe Ala Val Ala Trp Ser Asn Leu Gly Cys
60 65 70
GTT TTC AAT GCA CAA GGG GAA ATT TGG CTT GCA ATT CAT CAC TTT GAA 531
Val Phe Asn Ala Gln Gly Glu Ile Trp Leu Ala Ile His His Phe Glu
75 80 85
AAG GCT GTC ACC CTT GAC CCA AAC TTT CTG GAT GCT TAT ATC AAT TTA 579
Lys Ala Val Thr Leu Asp Pro Asn Phe Leu Asp Ala Tyr Ile Asn Leu
90 95 100 105
GGA AAT GTC TTG AAA GAG GCA CGC ATT TTT GAC AGA GCT GTG GCA GCT 627
Gly Asn Val Leu Lys Glu Ala Arg Ile Phe Asp Arg Ala Val Ala Ala
110 115 120
TAT CTT CGT GCC CTA AGT TTG AGT CCA AAT CAC GCA GTG GTG CAC GGC 675
Tyr Leu Arg Ala Leu Ser Leu Ser Pro Asn His Ala Val Val His Gly
125 130 135
AAC CTG GCT TGT GTA TAC TAT GAG CAA GGC CTG ATA GAT CTG GCA ATA 723
Asn Leu Ala Cys Val Tyr Tyr Glu Gln Gly Leu Ile Asp Leu Ala Ile
140 145 150
GAC ACC TAC AGG CGG GCT ATC GAA CTA CAA CCA CAT TTC CCT GAT GCT 771
Asp Thr Tyr Arg Arg Ala Ile Glu Leu Gln Pro His Phe Pro Asp Ala
155 160 165
TAC TGC AAC CTA GCC AAT GCT CTC AAA GAG AAG GGC AGT GTT GCT GAA 819
Tyr Cys Asn Leu Ala Asn Ala Leu Lys Glu Lys Gly Ser Val Ala Glu
170 175 180 185
GCA GAA GAT TGT TAT AAT ACA GCT CTC CGT CTG TGT CCC ACC CAT GCA 867
Ala Glu Asp Cys Tyr Asn Thr Ala Leu Arg Leu Cys Pro Thr His Ala
190 195 200
GAC TCT CTG AAT AAC CTA GCC AAT ATC AAA CGA GAA CAG GGA AAC ATT 915
Asp Ser Leu Asn Asn Leu Ala Asn Ile Lys Arg Glu Gln Gly Asn Ile
205 210 215
GAA GAG GCA GTT CGC TTG TAT CGT AAA GCA TTA GAA GTC TTC CCA GAG 963
Glu Glu Ala Val Arg Leu Tyr Arg Lys Ala Leu Glu Val Phe Pro Glu
220 225 230
TTT GCT GCT GCC CAT TCA AAT TTA GCA AGT GTA CTG CAG CAG CAG GGA 1011
Phe Ala Ala Ala His Ser Asn Leu Ala Ser Val Leu Gln Gln Gln Gly
235 240 245
AAA CTG CAG GAA GCT CTG ATG CAT TAT AAG GAG GCT ATT CGA ATC AGT 1059
Lys Leu Gln Glu Ala Leu Met His Tyr Lys Glu Ala Ile Arg Ile Ser
250 255 260 265
CCT ACC TTT GCT GAT GCC TAC TCT AAT ATG GGA AAC ACT CTA AAG GAG 1107
Pro Thr Phe Ala Asp Ala Tyr Ser Asn Met Gly Asn Thr Leu Lys Glu
270 275 280
ATG CAG GAT GTT CAG GGA GCC TTG CAG TGT TAT ACG CGT GCC ATC CAA 1155
Met Gln Asp Val Gln Gly Ala Leu Gln Cys Tyr Thr Arg Ala Ile Gln
285 290 295
ATT AAT CCT GCA TTT GCA GAT GCA CAT AGC AAT CTG GCT TCC ATT CAT 1203
Ile Asn Pro Ala Phe Ala Asp Ala His Ser Asn Leu Ala Ser Ile His
300 305 310
AAG GAT TCA GGG AAT ATT CCA GAA GCC ATA GCT TCT TAC CGC ACG GCT 1251
Lys Asp Ser Gly Asn Ile Pro Glu Ala Ile Ala Ser Tyr Arg Thr Ala
315 320 325
CTG AAA CTT AAG CCT GAT TTT CCT GAT GCT TAT TGT AAC TTG GCT CAT 1299
Leu Lys Leu Lys Pro Asp Phe Pro Asp Ala Tyr Cys Asn Leu Ala His
330 335 340 345
TGC CTG CAG ATT GTC TGT GAT TGG ACA GAC TAT GAT GAG CGA ATG AAG 1347
Cys Leu Gln Ile Val Cys Asp Trp Thr Asp Tyr Asp Glu Arg Met Lys
350 355 360
AAG TTG GTC AGT ATT GTG GCT GAC CAG TTA GAG AAG AAT AGG TTG CCT 1395
Lys Leu Val Ser Ile Val Ala Asp Gln Leu Glu Lys Asn Arg Leu Pro
365 370 375
TCT GTG CAT CCT CAT CAT AGT ATG CTA TAT CCT CTT TCT CAT GGC TTC 1443
Ser Val His Pro His His Ser Met Leu Tyr Pro Leu Ser His Gly Phe
380 385 390
AGG AAG GCT ATT GCT GAG AGG CAC GGC AAC CTG TGC TTA GAT AAG ATT 1491
Arg Lys Ala Ile Ala Glu Arg His Gly Asn Leu Cys Leu Asp Lys Ile
395 400 405
AAT GTT CTT CAT AAA CCA CCA TAT GAA CAT CCA AAA GAC TTG AAG CTC 1539
Asn Val Leu His Lys Pro Pro Tyr Glu His Pro Lys Asp Leu Lys Leu
410 415 420 425
AGT GAT GGT CGG CTG CGT GTA GGA TAT GTG AGT TCC GAC TTT GGG AAT 1587
Ser Asp Gly Arg Leu Arg Val Gly Tyr Val Ser Ser Asp Phe Gly Asn
430 435 440
CAT CCT ACT TCT CAC CTT ATG CAG TCT ATT CCA GGC ATG CAC AAT CCT 1635
His Pro Thr Ser His Leu Met Gln Ser Ile Pro Gly Met His Asn Pro
445 450 455
GAT AAA TTT GAG GTG TTC TGT TAT GCC CTG AGC CCA GAC GAT GGC ACA 1683
Asp Lys Phe Glu Val Phe Cys Tyr Ala Leu Ser Pro Asp Asp Gly Thr
460 465 470
AAC TTC CGA GTG AAG GTG ATG GCA GAA GCC AAT CAT TTC ATT GAT CTT 1731
Asn Phe Arg Val Lys Val Met Ala Glu Ala Asn His Phe Ile Asp Leu
475 480 485
TCT CAG ATT CCA TGC AAT GGA AAA GCA GCT GAT CGC ATC CAT CAG GAT 1779
Ser Gln Ile Pro Cys Asn Gly Lys Ala Ala Asp Arg Ile His Gln Asp
490 495 500 505
GGA ATT CAT ATC CTT GTA AAT ATG AAT GGC TAT ACT AAG GGC GCT CGA 1827
Gly Ile His Ile Leu Val Asn Met Asn Gly Tyr Thr Lys Gly Ala Arg
510 515 520
AAT GAG CTT TTT GCT CTC AGG CCA GCT CCT ATT CAG GCA ATG TGG CTG 1875
Asn Glu Leu Phe Ala Leu Arg Pro Ala Pro Ile Gln Ala Met Trp Leu
525 530 535
GGA TAC CCT GGG ACG AGT GGT GCG CTT TTC ATG GAT TAT ATT ATC ACT 1923
Gly Tyr Pro Gly Thr Ser Gly Ala Leu Phe Met Asp Tyr Ile Ile Thr
540 545 550
GAT CAG GAA ACT TCG CCA GCT GAA GTT GCT GAG CAG TAT TCC GAG AAA 1971
Asp Gln Glu Thr Ser Pro Ala Glu Val Ala Glu Gln Tyr Ser Glu Lys
555 560 565
TTG GCT TAT ATG CCC CAC ACT TTT TTT ATT GGT GAT CAT GCT AAT ATG 2019
Leu Ala Tyr Met Pro His Thr Phe Phe Ile Gly Asp His Ala Asn Met
570 575 580 585
TTC CCT CAC CTG AAG AAA AAA GCA GTC ATC GAT TTT AAG TCC AAT GGG 2067
Phe Pro His Leu Lys Lys Lys Ala Val Ile Asp Phe Lys Ser Asn Gly
590 595 600
CAC ATT TAT GAC AAT CGG ATA GTT CTG AAT GGC ATC GAC CTC AAA GCA 2115 His He Tyr Asp Asn Arg He Val Leu Asn Gly He Asp Leu Lys Ala 605 610 615
TTT CTT GAT AGT CTA CCA GAT GTG AAA ATT GTC AAG ATG AAG TGT CCT 2163 Phe Leu Asp Ser Leu Pro Asp Val Lys He Val Lys Met Lys Cys Pro 620 625 630
GAT GGA GGA GAC AAT GCA GAT AGC AGT AAC ACA GCT CTT AAT ATG CCT 2211 Asp Gly Gly Asp Asn Ala Asp Ser Ser Asn Thr Ala Leu Asn Met Pro 635 640 645
GTT ATT CCT ATG AAT ACT ATT GCA GAA GCA GTT ATT GAA ATG ATT AAC 2259 Val Ile Pro Met Asn Thr Ile Ala Glu Ala Val Ile Glu Met Ile Asn 650 655 660 665
CGA GGA CAG ATT CAA ATA ACA ATT AAT GGA TTC AGT ATT AGC AAT GGA 2307 Arg Gly Gln Ile Gln Ile Thr Ile Asn Gly Phe Ser Ile Ser Asn Gly 670 675 680
CTG GCA ACT ACT CAG ATC AAC AAT AAG GCT GCA ACT GGA GAG GAG GTT 2355 Leu Ala Thr Thr Gln Ile Asn Asn Lys Ala Ala Thr Gly Glu Glu Val 685 690 695
CCC CGT ACC ATT ATT GTA ACC ACC CGT TCT CAG TAC GGG TTA CCA GAA 2403 Pro Arg Thr Ile Ile Val Thr Thr Arg Ser Gln Tyr Gly Leu Pro Glu 700 705 710
GAT GCC ATC GTA TAC TGT AAC TTT AAT CAG TTG TAT AAA ATT GAC CCT 2451 Asp Ala Ile Val Tyr Cys Asn Phe Asn Gln Leu Tyr Lys Ile Asp Pro 715 720 725
TCT ACT TTG CAG ATG TGG GCA AAC ATT CTG AAG CGT GTT CCC AAT AGT 2499 Ser Thr Leu Gln Met Trp Ala Asn Ile Leu Lys Arg Val Pro Asn Ser 730 735 740 745
GTA CTC TGG CTG TTG CGT TTT CCA GCA GTA GGA GAA CCT AAT ATT CAA 2547 Val Leu Trp Leu Leu Arg Phe Pro Ala Val Gly Glu Pro Asn Ile Gln 750 755 760
CAG TAT GCA CAA AAC ATG GGC CTG CCC CAG AAC CGT ATC ATT TTT TCA 2595 Gln Tyr Ala Gln Asn Met Gly Leu Pro Gln Asn Arg Ile Ile Phe Ser 765 770 775
CCT GTT GCT CCT AAA GAG GAA CAC GTC AGG AGA GGC CAG CTG GCT GAT 2643 Pro Val Ala Pro Lys Glu Glu His Val Arg Arg Gly Gln Leu Ala Asp 780 785 790
GTC TGC TTG GAC ACT CCA CTC TGT AAT GGG CAC ACC ACA GGG ATG GAT 2691 Val Cys Leu Asp Thr Pro Leu Cys Asn Gly His Thr Thr Gly Met Asp 795 800 805
GTC CTC TGG GCA GGG ACC CCC ATG GTG ACT ATG CCA GGA GAG ACT CTT 2739 Val Leu Trp Ala Gly Thr Pro Met Val Thr Met Pro Gly Glu Thr Leu 810 815 820 825
GCT TCT CGA GTT GCA GCA TCC CAG CTC ACT TGC TTA GGT TGT CTT GAG 2787 Ala Ser Arg Val Ala Ala Ser Gln Leu Thr Cys Leu Gly Cys Leu Glu 830 835 840
CTT ATT GCT AAA AAC AGA CAA GAA TAT GAA GAC ATA GCT GTG AAG CTG 2835 Leu Ile Ala Lys Asn Arg Gln Glu Tyr Glu Asp Ile Ala Val Lys Leu 845 850 855
GGA ACT GAT CTA GAA TAC CTG AAG AAA GTT CGT GGC AAA GTC TGG AAG 2883 Gly Thr Asp Leu Glu Tyr Leu Lys Lys Val Arg Gly Lys Val Trp Lys 860 865 870
CAA AGA ATA TCT AGC CCT CTG TTC AAC ACC AAA CAA TAC ACA ATG GAA 2931 Gln Arg Ile Ser Ser Pro Leu Phe Asn Thr Lys Gln Tyr Thr Met Glu 875 880 885
CTA GAG CGG CTC TAT CTA CAG ATG TGG GAG CAT TAT GCA GCT GGC AAC 2979 Leu Glu Arg Leu Tyr Leu Gln Met Trp Glu His Tyr Ala Ala Gly Asn 890 895 900 905
AAA CCT GAC CAC ATG ATT AAG CCT GTT GAA GTC ACT GAG TCA GCA TAAAT 3029 Lys Pro Asp His Met Ile Lys Pro Val Glu Val Thr Glu Ser Ala 910 915 920
AAAGACTGCA CAGGAGAATT ACCCCTAAAA AAAAAAAAAA AAAAGGGCGG CCGC 3083
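The listing above pairs each codon with a three-letter residue code, so a transcribed stretch can be cross-checked by translating it. The following is only a minimal sketch of such a check: it assumes the standard genetic code, and the names `CODON_TABLE`, `THREE_LETTER` and `translate` are illustrative and not part of the disclosure. The input is the final coding row of SEQ ID NO:1 shown above.

```python
from itertools import product

# Standard genetic code (an assumption of this sketch), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue codes, as used in the sequence listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

def translate(dna):
    """Translate a coding stretch (spaces allowed) into three-letter codes.

    Translation stops at the first stop codon; codons containing
    ambiguity symbols are reported as "Xaa".
    """
    dna = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":
            break
        residues.append(THREE_LETTER.get(aa, "Xaa"))
    return residues

# Last coding row of SEQ ID NO:1, ending in the TAA stop codon.
tail = "AAA CCT GAC CAC ATG ATT AAG CCT GTT GAA GTC ACT GAG TCA GCA TAA"
print(" ".join(translate(tail)))
```

Run on the row shown, this prints the residues Lys Pro Asp His Met Ile Lys Pro Val Glu Val Thr Glu Ser Ala, which are residues 906 to 920 of the listing.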
(2) INFORMATION FOR SEQ ID NO: 2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 920 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
Met Leu Gln Gly His Phe Trp Leu Val Arg Glu Gly Ile Met Ile Ser
1 5 10 15
Pro Ser Ser Pro Pro Pro Pro Asn Leu Phe Phe Phe Pro Leu Gln Ile
20 25 30
Phe Pro Phe Pro Phe Thr Ser Phe Pro Ser His Leu Leu Ser Leu Thr
35 40 45
Pro Pro Lys Ala Cys Tyr Leu Lys Ala Ile Glu Thr Gln Pro Asn Phe
50 55 60
Ala Val Ala Trp Ser Asn Leu Gly Cys Val Phe Asn Ala Gln Gly Glu 65 70 75 80
Ile Trp Leu Ala Ile His His Phe Glu Lys Ala Val Thr Leu Asp Pro 85 90 95
Asn Phe Leu Asp Ala Tyr Ile Asn Leu Gly Asn Val Leu Lys Glu Ala
100 105 110
Arg Ile Phe Asp Arg Ala Val Ala Ala Tyr Leu Arg Ala Leu Ser Leu
115 120 125
Ser Pro Asn His Ala Val Val His Gly Asn Leu Ala Cys Val Tyr Tyr
130 135 140
Glu Gln Gly Leu Ile Asp Leu Ala Ile Asp Thr Tyr Arg Arg Ala Ile 145 150 155 160
Glu Leu Gln Pro His Phe Pro Asp Ala Tyr Cys Asn Leu Ala Asn Ala
165 170 175
Leu Lys Glu Lys Gly Ser Val Ala Glu Ala Glu Asp Cys Tyr Asn Thr
180 185 190
Ala Leu Arg Leu Cys Pro Thr His Ala Asp Ser Leu Asn Asn Leu Ala
195 200 205
Asn Ile Lys Arg Glu Gln Gly Asn Ile Glu Glu Ala Val Arg Leu Tyr
210 215 220
Arg Lys Ala Leu Glu Val Phe Pro Glu Phe Ala Ala Ala His Ser Asn 225 230 235 240
Leu Ala Ser Val Leu Gln Gln Gln Gly Lys Leu Gln Glu Ala Leu Met
245 250 255
His Tyr Lys Glu Ala Ile Arg Ile Ser Pro Thr Phe Ala Asp Ala Tyr
260 265 270
Ser Asn Met Gly Asn Thr Leu Lys Glu Met Gln Asp Val Gln Gly Ala
275 280 285
Leu Gln Cys Tyr Thr Arg Ala Ile Gln Ile Asn Pro Ala Phe Ala Asp
290 295 300
Ala His Ser Asn Leu Ala Ser Ile His Lys Asp Ser Gly Asn Ile Pro 305 310 315 320
Glu Ala Ile Ala Ser Tyr Arg Thr Ala Leu Lys Leu Lys Pro Asp Phe
325 330 335
Pro Asp Ala Tyr Cys Asn Leu Ala His Cys Leu Gln Ile Val Cys Asp
340 345 350
Trp Thr Asp Tyr Asp Glu Arg Met Lys Lys Leu Val Ser Ile Val Ala
355 360 365
Asp Gln Leu Glu Lys Asn Arg Leu Pro Ser Val His Pro His His Ser
370 375 380
Met Leu Tyr Pro Leu Ser His Gly Phe Arg Lys Ala Ile Ala Glu Arg 385 390 395 400
His Gly Asn Leu Cys Leu Asp Lys Ile Asn Val Leu His Lys Pro Pro
405 410 415
Tyr Glu His Pro Lys Asp Leu Lys Leu Ser Asp Gly Arg Leu Arg Val
420 425 430
Gly Tyr Val Ser Ser Asp Phe Gly Asn His Pro Thr Ser His Leu Met
435 440 445
Gln Ser Ile Pro Gly Met His Asn Pro Asp Lys Phe Glu Val Phe Cys
450 455 460
Tyr Ala Leu Ser Pro Asp Asp Gly Thr Asn Phe Arg Val Lys Val Met 465 470 475 480
Ala Glu Ala Asn His Phe Ile Asp Leu Ser Gln Ile Pro Cys Asn Gly
485 490 495
Lys Ala Ala Asp Arg Ile His Gln Asp Gly Ile His Ile Leu Val Asn
500 505 510
Met Asn Gly Tyr Thr Lys Gly Ala Arg Asn Glu Leu Phe Ala Leu Arg 515 520 525
Pro Ala Pro Ile Gln Ala Met Trp Leu Gly Tyr Pro Gly Thr Ser Gly
530 535 540
Ala Leu Phe Met Asp Tyr Ile Ile Thr Asp Gln Glu Thr Ser Pro Ala 545 550 555 560
Glu Val Ala Glu Gln Tyr Ser Glu Lys Leu Ala Tyr Met Pro His Thr
565 570 575
Phe Phe Ile Gly Asp His Ala Asn Met Phe Pro His Leu Lys Lys Lys
580 585 590
Ala Val Ile Asp Phe Lys Ser Asn Gly His Ile Tyr Asp Asn Arg Ile
595 600 605
Val Leu Asn Gly Ile Asp Leu Lys Ala Phe Leu Asp Ser Leu Pro Asp
610 615 620
Val Lys Ile Val Lys Met Lys Cys Pro Asp Gly Gly Asp Asn Ala Asp 625 630 635 640
Ser Ser Asn Thr Ala Leu Asn Met Pro Val Ile Pro Met Asn Thr Ile
645 650 655
Ala Glu Ala Val Ile Glu Met Ile Asn Arg Gly Gln Ile Gln Ile Thr
660 665 670
Ile Asn Gly Phe Ser Ile Ser Asn Gly Leu Ala Thr Thr Gln Ile Asn
675 680 685
Asn Lys Ala Ala Thr Gly Glu Glu Val Pro Arg Thr Ile Ile Val Thr
690 695 700
Thr Arg Ser Gln Tyr Gly Leu Pro Glu Asp Ala Ile Val Tyr Cys Asn 705 710 715 720
Phe Asn Gln Leu Tyr Lys Ile Asp Pro Ser Thr Leu Gln Met Trp Ala
725 730 735
Asn Ile Leu Lys Arg Val Pro Asn Ser Val Leu Trp Leu Leu Arg Phe
740 745 750
Pro Ala Val Gly Glu Pro Asn Ile Gln Gln Tyr Ala Gln Asn Met Gly
755 760 765
Leu Pro Gln Asn Arg Ile Ile Phe Ser Pro Val Ala Pro Lys Glu Glu
770 775 780
His Val Arg Arg Gly Gln Leu Ala Asp Val Cys Leu Asp Thr Pro Leu 785 790 795 800
Cys Asn Gly His Thr Thr Gly Met Asp Val Leu Trp Ala Gly Thr Pro
805 810 815
Met Val Thr Met Pro Gly Glu Thr Leu Ala Ser Arg Val Ala Ala Ser
820 825 830
Gln Leu Thr Cys Leu Gly Cys Leu Glu Leu Ile Ala Lys Asn Arg Gln
835 840 845
Glu Tyr Glu Asp Ile Ala Val Lys Leu Gly Thr Asp Leu Glu Tyr Leu
850 855 860
Lys Lys Val Arg Gly Lys Val Trp Lys Gln Arg Ile Ser Ser Pro Leu 865 870 875 880
Phe Asn Thr Lys Gln Tyr Thr Met Glu Leu Glu Arg Leu Tyr Leu Gln
885 890 895
Met Trp Glu His Tyr Ala Ala Gly Asn Lys Pro Asp His Met Ile Lys
900 905 910
Pro Val Glu Val Thr Glu Ser Ala 915 920
(2) INFORMATION FOR SEQ ID NO: 3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4097 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Caenorhabditis elegans
(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 1...3453
(D) OTHER INFORMATION:
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
ATG GAG AAG CCC AAT TAC TTT CAG TCG TAT AAT AAG GTG ATA GGA GCA 48
Met Glu Lys Pro Asn Tyr Phe Gln Ser Tyr Asn Lys Val Ile Gly Ala
1 5 10 15
ACC GGA GAG CAA TTG GCT CCC GGA GCA GTA CCT CCA CAT CCT GTA CTT 96
Thr Gly Glu Gln Leu Ala Pro Gly Ala Val Pro Pro His Pro Val Leu 20 25 30
GCA CCA TCA ATT GCT CCT GGA GGA GTT GCT GGA GTA TCG GCA GCA AAC 144
Ala Pro Ser Ile Ala Pro Gly Gly Val Ala Gly Val Ser Ala Ala Asn 35 40 45
ATG GCC AAC ATT ATG CAG ACT CCT GGT TTC GCG AAT CTC GTT CAG CAG 192
Met Ala Asn Ile Met Gln Thr Pro Gly Phe Ala Asn Leu Val Gln Gln 50 55 60
GCT ATT CGA ACG CAA CTC GAA AAT CAA GCG GCA CAG CAG TTA GCA GTC 240
Ala Ile Arg Thr Gln Leu Glu Asn Gln Ala Ala Gln Gln Leu Ala Val 65 70 75 80
AAC CAG CAA TTT CAA TTA AAT GGA GCA ACT GCT GTA CAA CAA CAA CTT 288
Asn Gln Gln Phe Gln Leu Asn Gly Ala Thr Ala Val Gln Gln Gln Leu
85 90 95
CTT TTG ACA CCT CAA CAG TCA CTT GCT CAA CCA ATT GCT CTT GCA CCA 336
Leu Leu Thr Pro Gln Gln Ser Leu Ala Gln Pro Ile Ala Leu Ala Pro 100 105 110
CAA CCT ACT GTC GTT TTA AAT GGA GTC AGT GAA ACA TTA AAR AAA GTC 384
Gln Pro Thr Val Val Leu Asn Gly Val Ser Glu Thr Leu Xaa Lys Val 115 120 125
GCT GAA TTG GCT CAT CGG CAG TTT CAA TCA GGA AAT TAT GTG GAA GCA 432
Ala Glu Leu Ala His Arg Gln Phe Gln Ser Gly Asn Tyr Val Glu Ala 130 135 140
GAG AAA TAT TGC AAT TTG GTT TTT CAA AGT GAC CCA AAT AAC TTG CCA 480
Glu Lys Tyr Cys Asn Leu Val Phe Gln Ser Asp Pro Asn Asn Leu Pro 145 150 155 160
ACG CTA TTG CTC CTC TCA GCC ATC AAT TTC CAA ACT AAA AAT TTG GAA 528 Thr Leu Leu Leu Leu Ser Ala Ile Asn Phe Gln Thr Lys Asn Leu Glu 165 170 175
AAA TCA ATG CAA TAT TCA ATG TTA GCC ATC AAA GTC AAT AAT CAG TGT 576 Lys Ser Met Gln Tyr Ser Met Leu Ala Ile Lys Val Asn Asn Gln Cys 180 185 190
GCA GAA GCC TAC AGT AAC CTT GGA AAT TAC TAT AAA GAG AAA GGA CAA 624 Ala Glu Ala Tyr Ser Asn Leu Gly Asn Tyr Tyr Lys Glu Lys Gly Gln 195 200 205
CTA CAG GAT GCA CTT GAA AAC TAC AAA CTA GCT GTT AAA CTC AAG CCA 672 Leu Gln Asp Ala Leu Glu Asn Tyr Lys Leu Ala Val Lys Leu Lys Pro 210 215 220
GAA TTC ATT GAT GCT TAT ATC AAT TTA GCC GCA GCT TTG GTG TCT GGT 720 Glu Phe Ile Asp Ala Tyr Ile Asn Leu Ala Ala Ala Leu Val Ser Gly 225 230 235 240
GGT GAT TTG GAG CAA GCT GTC ACA GCA TAT TTC AAT GCT TTA CAA ATT 768 Gly Asp Leu Glu Gln Ala Val Thr Ala Tyr Phe Asn Ala Leu Gln Ile 245 250 255
AAT CCT GAT TTG TAT TGT GTC AGA AGT GAT CTT GGA AAC TTA CTA AAA 816 Asn Pro Asp Leu Tyr Cys Val Arg Ser Asp Leu Gly Asn Leu Leu Lys 260 265 270
GCA ATG GGA AGA CTT GAA GAA GCG AAG GTC TGT TAC TTG AAA GCA ATC 864 Ala Met Gly Arg Leu Glu Glu Ala Lys Val Cys Tyr Leu Lys Ala Ile 275 280 285
GAA ACT CAA CCA CAG TTC GCT GTC GCA TGG TCC AAT CTT GGA TGT GTA 912 Glu Thr Gln Pro Gln Phe Ala Val Ala Trp Ser Asn Leu Gly Cys Val 290 295 300
TTC AAT AGT CAA GGA GAA ATT TGG TTG GCA ATT CAT CAT TTC GAG AAA 960 Phe Asn Ser Gln Gly Glu Ile Trp Leu Ala Ile His His Phe Glu Lys 305 310 315 320
GCT GTC ACT TTG GAT CCA AAC TTC CTC GAC GCT TAT ATA AAT CTT GGA 1008 Ala Val Thr Leu Asp Pro Asn Phe Leu Asp Ala Tyr Ile Asn Leu Gly 325 330 335
AAT GTT CTG AAA GAG GCC AGG ATT TTC GAT AGA GCG GTT TCA GCT TAT 1056 Asn Val Leu Lys Glu Ala Arg Ile Phe Asp Arg Ala Val Ser Ala Tyr 340 345 350
CTC CGA GCC TTG AAT CTG TCT GGT AAT CAT GCA GTT GTT CAT GGG AAT 1104 Leu Arg Ala Leu Asn Leu Ser Gly Asn His Ala Val Val His Gly Asn 355 360 365
TTG GCA TGT GTG TAC TAC GAA CAG GGA CTT ATT GAT TTG GCC ATT GAC 1152 Leu Ala Cys Val Tyr Tyr Glu Gln Gly Leu Ile Asp Leu Ala Ile Asp 370 375 380
ACT TAC AAA AAG GCT ATC GAC TTA CAA CCA CAC TTC CCT GAT GCC TAC 1200 Thr Tyr Lys Lys Ala Ile Asp Leu Gln Pro His Phe Pro Asp Ala Tyr 385 390 395 400
TGT AAT CTT GCA AAT GCA TTA AAA GAA AAA GGA AGT GTT GTA GAA GCT 1248 Cys Asn Leu Ala Asn Ala Leu Lys Glu Lys Gly Ser Val Val Glu Ala 405 410 415
GAG CAG ATG TAC ATG AAA GCT TTG GAG CTC TGC CCA ACC CAC GCG GAT 1296 Glu Gln Met Tyr Met Lys Ala Leu Glu Leu Cys Pro Thr His Ala Asp 420 425 430
TCC CAA AAC AAT CTT GCC AAT ATC AAA AGA GAA CAA GGA AAG ATC GAA 1344 Ser Gln Asn Asn Leu Ala Asn Ile Lys Arg Glu Gln Gly Lys Ile Glu 435 440 445
GAT GCC ACT CGA CTA TAT TTG AAA GCT CTC GAA ATC TAT CCG GAA TTT 1392 Asp Ala Thr Arg Leu Tyr Leu Lys Ala Leu Glu Ile Tyr Pro Glu Phe 450 455 460
GCT GCA GCT CAT TCG AAT CTT GCA TCC ATA CTA CAA CAA CAA GGA AAA 1440 Ala Ala Ala His Ser Asn Leu Ala Ser Ile Leu Gln Gln Gln Gly Lys 465 470 475 480
CTA AAC GAC GCC ATA CTC CAC TAC AAA GAA GCT ATT CGA ATT GCT CCG 1488 Leu Asn Asp Ala Ile Leu His Tyr Lys Glu Ala Ile Arg Ile Ala Pro 485 490 495
ACG TTT GCT GAT GCC TAT TCA AAT ATG GGA AAT ACT TTG AAA GAA ATG 1536 Thr Phe Ala Asp Ala Tyr Ser Asn Met Gly Asn Thr Leu Lys Glu Met 500 505 510
GGT GAC AGC TCG GCA GCA ATC GCA TGT TAC AAT CGT GCT ATT CAA ATT 1584 Gly Asp Ser Ser Ala Ala Ile Ala Cys Tyr Asn Arg Ala Ile Gln Ile 515 520 525
AAT CCA GCA TTT GCC GAT GCT CAT TCG AAT CTT GCC AGC ATA CAC AAG 1632 Asn Pro Ala Phe Ala Asp Ala His Ser Asn Leu Ala Ser Ile His Lys 530 535 540
GAT GCT GGA AAT ATG GCA GAA GCT ATT CAA AGT TAT TCG ACA GCT CTC 1680 Asp Ala Gly Asn Met Ala Glu Ala Ile Gln Ser Tyr Ser Thr Ala Leu 545 550 555 560
AAG CTG AAA CCA GAT TTC CCG GAT GCA TAC TGT AAT CTT GCA CAC TGT 1728 Lys Leu Lys Pro Asp Phe Pro Asp Ala Tyr Cys Asn Leu Ala His Cys 565 570 575
CAT CAA ATA ATT TGT GAT TGG AAT GAT TAT GAT AAA CGA GTA CGG AAA 1776 His Gln Ile Ile Cys Asp Trp Asn Asp Tyr Asp Lys Arg Val Arg Lys 580 585 590
TTG GTA CAA ATT GTG GAA GAT CAG CTT TGC AAG AAA CGT CTT CCA TCG 1824 Leu Val Gln Ile Val Glu Asp Gln Leu Cys Lys Lys Arg Leu Pro Ser 595 600 605
GTT CAT CCA CAT CAC AGT ATG CTT TAT CCA CTT TCA CAT GCG GCT CGG 1872 Val His Pro His His Ser Met Leu Tyr Pro Leu Ser His Ala Ala Arg 610 615 620
ATT GCA ATT GCT GCA AAG CAT GCA TCG TTG TGC TTC GAT AAG GTT CAT 1920 Ile Ala Ile Ala Ala Lys His Ala Ser Leu Cys Phe Asp Lys Val His 625 630 635 640
GTT CAA ATG CTT GGA AAG ACA CCA CTC ATC CAC GCT GAT CGA TTC AGT 1968 Val Gln Met Leu Gly Lys Thr Pro Leu Ile His Ala Asp Arg Phe Ser 645 650 655
GTT CAA AAT GGA CAG CGC CTC CGG ATT GGC TAC GTC TCC TCT GAT TTC 2016 Val Gln Asn Gly Gln Arg Leu Arg Ile Gly Tyr Val Ser Ser Asp Phe 660 665 670
GGA AAT CAT CCA ACA TCT CAT CTA ATG CAA TCA ATT CCT GGA ATG CAT 2064 Gly Asn His Pro Thr Ser His Leu Met Gln Ser Ile Pro Gly Met His 675 680 685
GAT AGG AGT CGA GTA GAA GTA TTC TGT TAT GCA CTT TCT GTG AAT GAT 2112 Asp Arg Ser Arg Val Glu Val Phe Cys Tyr Ala Leu Ser Val Asn Asp 690 695 700
GGA ACC AAT TTC CGA TCG AAA CTT ATG AAT GAA TCC GAA CAT TTT GTG 2160 Gly Thr Asn Phe Arg Ser Lys Leu Met Asn Glu Ser Glu His Phe Val 705 710 715 720
GAT CTT TCG CAG ATT CCA TGC AAT GGA AAA GCT GCT GAG AAA ATC GCC 2208 Asp Leu Ser Gln Ile Pro Cys Asn Gly Lys Ala Ala Glu Lys Ile Ala 725 730 735
CAA GAT GGA ATC CAC ATT CTC ATT AAC ATG AAT GGA TAT ACA AAA GGA 2256 Gln Asp Gly Ile His Ile Leu Ile Asn Met Asn Gly Tyr Thr Lys Gly 740 745 750
GCA AGA AAT GAG ATT TTT GCA CTC CGA CCT GCT CCG ATT CAA GTC ATG 2304 Ala Arg Asn Glu Ile Phe Ala Leu Arg Pro Ala Pro Ile Gln Val Met 755 760 765
TGG CTT GGA TAT CCA TCA ACA TCT GGA GCT ACA TTC ATG GAT TAT ATC 2352 Trp Leu Gly Tyr Pro Ser Thr Ser Gly Ala Thr Phe Met Asp Tyr Ile 770 775 780
ATC ACT GAT GCT GTC ACA TCA CCT CTT CGG CTT GCA AAT GCA TTT ACA 2400 Ile Thr Asp Ala Val Thr Ser Pro Leu Arg Leu Ala Asn Ala Phe Thr 785 790 795 800
GAG AAG CTC GCA TAT ATG CCA CAT ACA TTC TTC ATT GGA GAT CAC GCT 2448 Glu Lys Leu Ala Tyr Met Pro His Thr Phe Phe Ile Gly Asp His Ala 805 810 815
CAA ATG TTG AGA CAT TTG ACT GAT AAG GTT GTT GTA AAG GAT AAG GAA 2496 Gln Met Leu Arg His Leu Thr Asp Lys Val Val Val Lys Asp Lys Glu 820 825 830
ACA ACA GAA AGA GAT TCA TGT CTT ATC ATG AAT ACA GCG AAT ATG GAT 2544 Thr Thr Glu Arg Asp Ser Cys Leu Ile Met Asn Thr Ala Asn Met Asp 835 840 845
CCG ATT CTT GCA AAA TCT GAA ATC AAA GAA CAA GTT CTG GAT ACA GAA 2592 Pro Ile Leu Ala Lys Ser Glu Ile Lys Glu Gln Val Leu Asp Thr Glu 850 855 860
GTA GTA AGT GGA CCC AAC AAG GAA CTT GTC CGC GCA GAA ATG GTT TTA 2640 Val Val Ser Gly Pro Asn Lys Glu Leu Val Arg Ala Glu Met Val Leu 865 870 875 880
CCA GTC CTT GAA GTT CCA ACT GAG CCA ATC AAG CAA ATG ATC ATG ACA 2688 Pro Val Leu Glu Val Pro Thr Glu Pro Ile Lys Gln Met Ile Met Thr 885 890 895
GGA CAA ATG ACA ATG AAT GTA ATG GAA GAT ATG AAT GTT CAG AAT GGT 2736 Gly Gln Met Thr Met Asn Val Met Glu Asp Met Asn Val Gln Asn Gly 900 905 910
CTC GGA CAG TCT CAA ATG CAT CAC AAA GCA GCC ACT GGC GAA GAA ATT 2784 Leu Gly Gln Ser Gln Met His His Lys Ala Ala Thr Gly Glu Glu Ile 915 920 925
CCA AAC TCT GTT CTT CTA ACT TCC CGT GCT CAA TAT CAA CTT CCT GAT 2832 Pro Asn Ser Val Leu Leu Thr Ser Arg Ala Gln Tyr Gln Leu Pro Asp 930 935 940
GAT GCT ATT GTG TTT TGC AAT TTC AAT CAG CTT TAC AAA ATT GAT CCA 2880 Asp Ala Ile Val Phe Cys Asn Phe Asn Gln Leu Tyr Lys Ile Asp Pro 945 950 955 960
TCG ACT CTC GAC ATG TGG ATC AAA ATT CTC GAG AAT GTT CCG AAA TCA 2928 Ser Thr Leu Asp Met Trp Ile Lys Ile Leu Glu Asn Val Pro Lys Ser 965 970 975
ATT CTT TGG CTT CTG AGA TTC CCG TAT CAA GGA GAA GAA CAT ATT CGA 2976 Ile Leu Trp Leu Leu Arg Phe Pro Tyr Gln Gly Glu Glu His Ile Arg 980 985 990
AAG TAT TGT GTA GAG AGA GGA TTA GAT CCA TCA AGA ATT GTG TTC AGC 3024 Lys Tyr Cys Val Glu Arg Gly Leu Asp Pro Ser Arg Ile Val Phe Ser 995 1000 1005
AAT GTT GCC GCT AAA GAA GAG CAT GTT CGT CGA GGA CAA CTG GCT GAT 3072 Asn Val Ala Ala Lys Glu Glu His Val Arg Arg Gly Gln Leu Ala Asp 1010 1015 1020
GTT TGT CTA GAT ACT CCA TTA TGT AAT GGA CAT ACA ACT GGT ATG GAT 3120 Val Cys Leu Asp Thr Pro Leu Cys Asn Gly His Thr Thr Gly Met Asp 1025 1030 1035 1040
ATC CTG TGG ACA GGA ACA CCA ATG GTC ACA ATG CCT TTG GAA TCA TTG 3168 Ile Leu Trp Thr Gly Thr Pro Met Val Thr Met Pro Leu Glu Ser Leu 1045 1050 1055
GCT TCT CGT GTG GCA ACC TCC CAG CTT TAC GCT CTC GGA GTT CCA GAA 3216 Ala Ser Arg Val Ala Thr Ser Gln Leu Tyr Ala Leu Gly Val Pro Glu 1060 1065 1070
TTA GTT GCT AAA ACA AGG CAG GAA TAT GTT TCA ATT GCA GTT CGA CTT 3264 Leu Val Ala Lys Thr Arg Gln Glu Tyr Val Ser Ile Ala Val Arg Leu 1075 1080 1085
GGA ACT GAC GCT GAT CAC TTG GCG AAC ATG CGT GCA AAA GTA TGG ATG 3312 Gly Thr Asp Ala Asp His Leu Ala Asn Met Arg Ala Lys Val Trp Met 1090 1095 1100
GCT CGT ACC TCA TCA ACA TTA TTC GAT GTG AAA CAG TAT TGT CAT GAT 3360 Ala Arg Thr Ser Ser Thr Leu Phe Asp Val Lys Gln Tyr Cys His Asp 1105 1110 1115 1120
ATG GAA GAT CTT TTG GGA CAA ATG TGG AAA CGA TAT GAA AGT GGA ATG 3408 Met Glu Asp Leu Leu Gly Gln Met Trp Lys Arg Tyr Glu Ser Gly Met 1125 1130 1135
CCC ATT GAT CAT ATC ACT AAT AAT ACG GAA ACG CCA CAC GGC TTG TGAAT 3458 Pro Ile Asp His Ile Thr Asn Asn Thr Glu Thr Pro His Gly Leu 1140 1145 1150
AGATTTTCGA AGGATTTTTA AAAAATTGTA TATTTTGTAT AATTTTCACA AATGATTCAA 3518
TGCATTTCAT TTCTCTTCAA GATTTTCCAT ATTGGAAACC AAATGATTTT TTTGCTCATT 3578
TCTTTCTTTC TTTGCGTATT ACTCCTCAGT GCTCTTTCTA CAATATCTCG CTCCAAATCC 3638
CCTTCTCCAC CGTTCCTCAT AATCTCCATT GGGTAAATTC TTTGTGATTT CATTTTGTAC 3698
GAAGAATTTT CAATTTTCCC ATCCATCGTC CTATAGACGT TCTTATCAAA GACATAAGAT 3758
CCGTGGTTCT TGTTTTGTGG ATCTCTTCAT CACTCTAATC ACTCTTAAGA ATTTTATTCC 3818
TCCCTCCGTA GTTTTTCTTT TCTAATTTAT TTTTGTTCTA TAAACTGAAG AAAGAAAAAC 3878
GTGCCATTCA TTTTCCTTTC TCGTGTTTAA TTTTTAGTAG AATTTTAAGT TTTAAGATCC 3938
CCTTTGTTAA ACCATATTGG TTTCCCTTTC CCGTTTACTC TTTAGTTTTC TTGAAATATT 3998
CGGTTTTGTA TGCAACTTGA TCACTTACTC AATTCCTTCT TTTTCTTGAT TTTGCTTTCC 4058
CGTTTTAATA AATTTCTAAA AACATAAAAA AAAAAACGG 4097
(2) INFORMATION FOR SEQ ID NO: 4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1151 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(v) FRAGMENT TYPE: internal
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Caenorhabditis elegans
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
Met Glu Lys Pro Asn Tyr Phe Gln Ser Tyr Asn Lys Val Ile Gly Ala
1 5 10 15
Thr Gly Glu Gln Leu Ala Pro Gly Ala Val Pro Pro His Pro Val Leu
20 25 30
Ala Pro Ser Ile Ala Pro Gly Gly Val Ala Gly Val Ser Ala Ala Asn
35 40 45
Met Ala Asn Ile Met Gln Thr Pro Gly Phe Ala Asn Leu Val Gln Gln
50 55 60
Ala Ile Arg Thr Gln Leu Glu Asn Gln Ala Ala Gln Gln Leu Ala Val 65 70 75 80
Asn Gln Gln Phe Gln Leu Asn Gly Ala Thr Ala Val Gln Gln Gln Leu
85 90 95
Leu Leu Thr Pro Gln Gln Ser Leu Ala Gln Pro Ile Ala Leu Ala Pro
100 105 110
Gln Pro Thr Val Val Leu Asn Gly Val Ser Glu Thr Leu Xaa Lys Val
115 120 125
Ala Glu Leu Ala His Arg Gln Phe Gln Ser Gly Asn Tyr Val Glu Ala
130 135 140
Glu Lys Tyr Cys Asn Leu Val Phe Gln Ser Asp Pro Asn Asn Leu Pro 145 150 155 160
Thr Leu Leu Leu Leu Ser Ala Ile Asn Phe Gln Thr Lys Asn Leu Glu
165 170 175
Lys Ser Met Gln Tyr Ser Met Leu Ala Ile Lys Val Asn Asn Gln Cys
180 185 190
Ala Glu Ala Tyr Ser Asn Leu Gly Asn Tyr Tyr Lys Glu Lys Gly Gln
195 200 205
Leu Gln Asp Ala Leu Glu Asn Tyr Lys Leu Ala Val Lys Leu Lys Pro
210 215 220
Glu Phe Ile Asp Ala Tyr Ile Asn Leu Ala Ala Ala Leu Val Ser Gly 225 230 235 240
Gly Asp Leu Glu Gln Ala Val Thr Ala Tyr Phe Asn Ala Leu Gln Ile
245 250 255
Asn Pro Asp Leu Tyr Cys Val Arg Ser Asp Leu Gly Asn Leu Leu Lys
260 265 270
Ala Met Gly Arg Leu Glu Glu Ala Lys Val Cys Tyr Leu Lys Ala Ile
275 280 285
Glu Thr Gln Pro Gln Phe Ala Val Ala Trp Ser Asn Leu Gly Cys Val
290 295 300
Phe Asn Ser Gln Gly Glu Ile Trp Leu Ala Ile His His Phe Glu Lys 305 310 315 320
Ala Val Thr Leu Asp Pro Asn Phe Leu Asp Ala Tyr Ile Asn Leu Gly
325 330 335
Asn Val Leu Lys Glu Ala Arg Ile Phe Asp Arg Ala Val Ser Ala Tyr
340 345 350
Leu Arg Ala Leu Asn Leu Ser Gly Asn His Ala Val Val His Gly Asn
355 360 365
Leu Ala Cys Val Tyr Tyr Glu Gln Gly Leu Ile Asp Leu Ala Ile Asp
370 375 380
Thr Tyr Lys Lys Ala Ile Asp Leu Gln Pro His Phe Pro Asp Ala Tyr 385 390 395 400
Cys Asn Leu Ala Asn Ala Leu Lys Glu Lys Gly Ser Val Val Glu Ala
405 410 415
Glu Gln Met Tyr Met Lys Ala Leu Glu Leu Cys Pro Thr His Ala Asp
420 425 430
Ser Gln Asn Asn Leu Ala Asn Ile Lys Arg Glu Gln Gly Lys Ile Glu
435 440 445
Asp Ala Thr Arg Leu Tyr Leu Lys Ala Leu Glu Ile Tyr Pro Glu Phe
450 455 460
Ala Ala Ala His Ser Asn Leu Ala Ser Ile Leu Gln Gln Gln Gly Lys 465 470 475 480
Leu Asn Asp Ala Ile Leu His Tyr Lys Glu Ala Ile Arg Ile Ala Pro
485 490 495
Thr Phe Ala Asp Ala Tyr Ser Asn Met Gly Asn Thr Leu Lys Glu Met
500 505 510
Gly Asp Ser Ser Ala Ala Ile Ala Cys Tyr Asn Arg Ala Ile Gln Ile
515 520 525
Asn Pro Ala Phe Ala Asp Ala His Ser Asn Leu Ala Ser Ile His Lys
530 535 540
Asp Ala Gly Asn Met Ala Glu Ala Ile Gln Ser Tyr Ser Thr Ala Leu 545 550 555 560
Lys Leu Lys Pro Asp Phe Pro Asp Ala Tyr Cys Asn Leu Ala His Cys
565 570 575
His Gln Ile Ile Cys Asp Trp Asn Asp Tyr Asp Lys Arg Val Arg Lys
580 585 590
Leu Val Gln Ile Val Glu Asp Gln Leu Cys Lys Lys Arg Leu Pro Ser
595 600 605
Val His Pro His His Ser Met Leu Tyr Pro Leu Ser His Ala Ala Arg
610 615 620
Ile Ala Ile Ala Ala Lys His Ala Ser Leu Cys Phe Asp Lys Val His 625 630 635 640
Val Gln Met Leu Gly Lys Thr Pro Leu Ile His Ala Asp Arg Phe Ser
645 650 655
Val Gln Asn Gly Gln Arg Leu Arg Ile Gly Tyr Val Ser Ser Asp Phe
660 665 670
Gly Asn His Pro Thr Ser His Leu Met Gln Ser Ile Pro Gly Met His
675 680 685
Asp Arg Ser Arg Val Glu Val Phe Cys Tyr Ala Leu Ser Val Asn Asp
690 695 700
Gly Thr Asn Phe Arg Ser Lys Leu Met Asn Glu Ser Glu His Phe Val 705 710 715 720
Asp Leu Ser Gln Ile Pro Cys Asn Gly Lys Ala Ala Glu Lys Ile Ala
725 730 735
Gln Asp Gly Ile His Ile Leu Ile Asn Met Asn Gly Tyr Thr Lys Gly
740 745 750
Ala Arg Asn Glu Ile Phe Ala Leu Arg Pro Ala Pro Ile Gln Val Met
755 760 765
Trp Leu Gly Tyr Pro Ser Thr Ser Gly Ala Thr Phe Met Asp Tyr Ile
770 775 780
Ile Thr Asp Ala Val Thr Ser Pro Leu Arg Leu Ala Asn Ala Phe Thr 785 790 795 800
Glu Lys Leu Ala Tyr Met Pro His Thr Phe Phe Ile Gly Asp His Ala
805 810 815
Gln Met Leu Arg His Leu Thr Asp Lys Val Val Val Lys Asp Lys Glu
820 825 830
Thr Thr Glu Arg Asp Ser Cys Leu Ile Met Asn Thr Ala Asn Met Asp
835 840 845
Pro Ile Leu Ala Lys Ser Glu Ile Lys Glu Gln Val Leu Asp Thr Glu
850 855 860
Val Val Ser Gly Pro Asn Lys Glu Leu Val Arg Ala Glu Met Val Leu 865 870 875 880
Pro Val Leu Glu Val Pro Thr Glu Pro Ile Lys Gln Met Ile Met Thr
885 890 895
Gly Gln Met Thr Met Asn Val Met Glu Asp Met Asn Val Gln Asn Gly
900 905 910
Leu Gly Gln Ser Gln Met His His Lys Ala Ala Thr Gly Glu Glu Ile 915 920 925
Pro Asn Ser Val Leu Leu Thr Ser Arg Ala Gln Tyr Gln Leu Pro Asp
930 935 940
Asp Ala Ile Val Phe Cys Asn Phe Asn Gln Leu Tyr Lys Ile Asp Pro 945 950 955 960
Ser Thr Leu Asp Met Trp Ile Lys Ile Leu Glu Asn Val Pro Lys Ser
965 970 975
Ile Leu Trp Leu Leu Arg Phe Pro Tyr Gln Gly Glu Glu His Ile Arg
980 985 990
Lys Tyr Cys Val Glu Arg Gly Leu Asp Pro Ser Arg Ile Val Phe Ser
995 1000 1005
Asn Val Ala Ala Lys Glu Glu His Val Arg Arg Gly Gln Leu Ala Asp
1010 1015 1020
Val Cys Leu Asp Thr Pro Leu Cys Asn Gly His Thr Thr Gly Met Asp 1025 1030 1035 1040
Ile Leu Trp Thr Gly Thr Pro Met Val Thr Met Pro Leu Glu Ser Leu
1045 1050 1055
Ala Ser Arg Val Ala Thr Ser Gln Leu Tyr Ala Leu Gly Val Pro Glu
1060 1065 1070
Leu Val Ala Lys Thr Arg Gln Glu Tyr Val Ser Ile Ala Val Arg Leu
1075 1080 1085
Gly Thr Asp Ala Asp His Leu Ala Asn Met Arg Ala Lys Val Trp Met
1090 1095 1100
Ala Arg Thr Ser Ser Thr Leu Phe Asp Val Lys Gln Tyr Cys His Asp 1105 1110 1115 1120
Met Glu Asp Leu Leu Gly Gln Met Trp Lys Arg Tyr Glu Ser Gly Met
1125 1130 1135
Pro Ile Asp His Ile Thr Asn Asn Thr Glu Thr Pro His Gly Leu 1140 1145 1150
(2) INFORMATION FOR SEQ ID NO: 5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
GTTTGTTACT TGAAAGCAAT CG 22
(2) INFORMATION FOR SEQ ID NO: 6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
ATCGAAAATC CTGGCCTCTT 20
(2) INFORMATION FOR SEQ ID NO: 7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
GCGTTTTCCA GCAGTAGGAG 20
(2) INFORMATION FOR SEQ ID NO: 8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Other
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
ACATTCTGAA GCGTGTTCCC 20
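The short oligonucleotides of SEQ ID NOS: 5 to 8 can be related to the longer sequences of the listing by searching for them on either strand. The sketch below is one hedged way to do that; the helper names `reverse_complement` and `locate` are illustrative, and the two template fragments are single rows copied from SEQ ID NO:3 and SEQ ID NO:1 rather than the full sequences. This portion of the document does not state how the oligonucleotides were used, so nothing here should be read as the authors' procedure.

```python
# Complement table for unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def locate(template, oligo):
    """Find an oligonucleotide in a template.

    Returns ("+", index) for a match on the given strand, ("-", index)
    for a match of the reverse complement, or None if there is no match.
    The index is 0-based and counted on the given strand.
    """
    template = "".join(template.split()).upper()
    oligo = "".join(oligo.split()).upper()
    pos = template.find(oligo)
    if pos != -1:
        return "+", pos
    pos = template.find(reverse_complement(oligo))
    if pos != -1:
        return "-", pos
    return None

# Row of SEQ ID NO:3 ending at nucleotide 1056.
FRAG_SEQ3 = "AAT GTT CTG AAA GAG GCC AGG ATT TTC GAT AGA GCG GTT TCA GCT TAT"
# Row of SEQ ID NO:1 ending at nucleotide 2547.
FRAG_SEQ1 = "GTA CTC TGG CTG TTG CGT TTT CCA GCA GTA GGA GAA CCT AAT ATT CAA"

SEQ_ID_6 = "ATCGAAAATC CTGGCCTCTT"
SEQ_ID_7 = "GCGTTTTCCA GCAGTAGGAG"

print(locate(FRAG_SEQ3, SEQ_ID_6))
print(locate(FRAG_SEQ1, SEQ_ID_7))
```

With these fragments, SEQ ID NO:6 matches the SEQ ID NO:3 row as a reverse complement and SEQ ID NO:7 matches the SEQ ID NO:1 row on the given strand.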

Claims

1. An isolated DNA comprising a sequence that encodes a protein exhibiting uridine diphospho-N-acetylglucosamine:polypeptide β-N-acetylglucosaminyl transferase (O-linked GlcNAc transferase) activity.
2. A nucleic acid vector comprising the isolated DNA of claim 1 such that, when the vector is introduced into a suitable host cell, the protein is expressed.
3. A nucleic acid vector comprising the isolated DNA of claim 1 wherein the vector further comprises a regulatory nucleotide sequence operably positioned with respect to the DNA sequence encoding an O-linked GlcNAc transferase such that, when the vector is introduced into a suitable host cell and the regulatory sequence is triggered, the protein is expressed.
4. The isolated DNA of claim 1 wherein said DNA has the nucleic acid sequence of SEQ ID NO:1.
5. An isolated protein exhibiting O-linked GlcNAc transferase activity.
6. The isolated protein of claim 5 having the amino acid sequence of a human O-linked GlcNAc transferase.
7. The isolated protein of claim 5 whose amino acid sequence is given by SEQ ID NO:2.
8. A host cell containing the vector described in claim 2 and that expresses a protein exhibiting O-linked GlcNAc transferase activity.
9. A host cell containing the vector described in claim 3 and harboring cellular components responsive to the regulatory nucleotide sequence of the vector, such that the protein having O-linked GlcNAc transferase activity encoded by the vector is expressed in the host when the cell is cultured under suitable conditions that trigger the regulatory sequence.
10. The host cell described in claim 8 that expresses the protein whose amino acid sequence is given by SEQ ID NO:2.
11. The isolated DNA of claim 1 wherein said DNA has the nucleic acid sequence of SEQ ID NO:3.
12. The isolated protein of claim 5 having the amino acid sequence of an O-linked GlcNAc transferase from Caenorhabditis elegans.
13. The isolated protein of claim 5 whose amino acid sequence is given by SEQ ID NO:4.
14. The host cell described in claim 8 that expresses the protein having the amino acid sequence given by SEQ ID NO:4.
15. A method of expressing a protein exhibiting O-linked GlcNAc transferase activity and having the amino acid sequence of an O-linked GlcNAc transferase comprising the step of culturing a host cell containing a nucleic acid vector comprising a DNA sequence that encodes a protein exhibiting O-linked GlcNAc transferase activity and having the amino acid sequence of an O-linked GlcNAc transferase, the vector being further such that, when it is introduced into the host cell and the cell is cultured under suitable conditions, the protein is expressed.
16. A method of expressing a protein exhibiting O-linked GlcNAc transferase activity and having the amino acid sequence of an O-linked GlcNAc transferase, comprising the step of culturing a host cell containing a nucleic acid vector comprising a DNA sequence that encodes a protein exhibiting O-linked GlcNAc transferase activity and having the amino acid sequence of an O-linked GlcNAc transferase, wherein the vector further comprises a regulatory nucleotide sequence operably positioned with respect to the sequence encoding the protein, wherein the host cell further harbors cellular components responsive to the regulatory nucleotide sequence of the vector, and wherein the cells are cultured under suitable conditions that promote growth of the cells and that trigger the regulatory nucleotide sequence of the vector such that the protein is expressed.
17. The method described in claim 15 wherein the protein expressed by the host cell has the amino acid sequence given by SEQ ID NO:2.
18. The method described in claim 15 wherein the protein expressed by the host cell has the amino acid sequence given by SEQ ID NO:4.
19. A method of identifying an inhibitor of O-linked GlcNAc transferase, said method comprising the steps of
(i) providing a sample comprising a glycosylation protein target of O-linked GlcNAc transferase activity;
(ii) contacting a first portion of the sample with a first solution comprising a substance that is a candidate for being an inhibitor, UDP-GlcNAc, and a protein having O-linked GlcNAc transferase activity to generate a first test;
(iii) contacting a second portion of the sample with a second solution to generate a second test, wherein the second solution is the same as the first solution except that it does not contain the candidate;
(iv) determining and comparing the O-linked GlcNAc transferase activity in the first and second tests to identify whether the O-linked GlcNAc transferase activity has been inhibited by the candidate.
20. The method described in claim 19 wherein the protein having O-linked GlcNAc transferase activity has the amino acid sequence of a human O-linked GlcNAc transferase.
21. The method described in claim 19 wherein the protein having O-linked GlcNAc transferase activity has the amino acid sequence of an O-linked GlcNAc transferase from Caenorhabditis elegans.
22. The method described in claim 19 wherein the glycosylation target protein is immobilized on a surface.
23. A method of identifying an inhibitor of O-linked GlcNAc transferase in a high-throughput assay comprising the steps of
(i) providing a plurality of first portions of a sample comprising a protein which is a glycosylation target of O-linked GlcNAc transferase activity,
(ii) contacting the first portions with first solutions comprising a substance that is a candidate for being an inhibitor, UDP-GlcNAc, and a protein having O-linked GlcNAc transferase activity to generate a plurality of first tests, wherein the candidate in the first tests may be the same or different;
(iii) contacting a second portion of the sample with a second solution to generate a second test, wherein the second solution is the same as the first solutions except that it does not contain a candidate;
(iv) determining the O-linked GlcNAc transferase activity in the first and second tests; and
(v) comparing the activity determined in each first test with the activity determined in the second test, in order to evaluate whether the O-linked GlcNAc transferase activity in each first test has been inhibited; wherein the observation of inhibition in any first test identifies the candidate added thereto in step (ii) as an inhibitor of O-linked GlcNAc transferase.
24. The method described in claim 23 wherein the protein has the amino acid sequence of a human O-linked GlcNAc transferase.
25. The method described in claim 23 wherein the protein has the amino acid sequence of an O-linked GlcNAc transferase from Caenorhabditis elegans.
26. The method described in claim 23 wherein the glycosylation target is immobilized on a surface.
27. A method for assessing predisposition toward type II diabetes in a patient who is suspected of having hyperglycemia that may evolve into type II diabetes, said method comprising the steps of
(a) obtaining a sample of blood from the patient who is suspected of having hyperglycemia that may evolve into type II diabetes;
(b) assaying the specific activity of the O-linked GlcNAc transferase contained in red blood cells obtained from the blood sample; and
(c) comparing the O-linked GlcNAc transferase activity found in the sample from the patient with the levels of O-linked GlcNAc transferase activity found in correlative samples from both normal human subjects and human subjects known to have type II diabetes, wherein the patient is evaluated as being predisposed to type II diabetes if the level of O-linked GlcNAc transferase activity in the sample from the patient falls within a range established for patients known to have type II diabetes.
28. A method for assessing predisposition toward Alzheimer's disease in a patient, comprising the steps of
(a) obtaining a sample from the central nervous system of the patient;
(b) assaying the specific activity of the O-linked GlcNAc transferase contained in an extract obtained from the sample; and
(c) comparing the O-linked GlcNAc transferase activity found in the sample from the patient with the levels of O-linked GlcNAc transferase activity found in correlative samples from normal human subjects and human subjects known to have Alzheimer's disease, wherein the patient is evaluated as being predisposed to Alzheimer's disease if the level of O-linked GlcNAc transferase activity in the sample from the patient falls within a range established for patients known to have Alzheimer's disease.
29. A method for assessing the metastatic potential of a tumor, said method comprising the steps of (a) obtaining a sample of a tumor from a patient harboring the tumor;
(b) assaying the specific activity of the O-linked GlcNAc transferase contained in an extract obtained from the sample; and
(c) comparing the O-linked GlcNAc transferase activity found in the sample from the patient with the levels of O-linked GlcNAc transferase activity found in correlative samples from normal human subjects and from tumor samples of known metastatic activity from human subjects, wherein the tumor is evaluated as having metastatic potential if the level of O-linked GlcNAc transferase activity in the sample from the tumor falls within a range established for tumors known to have high metastatic activity.
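The comparison steps recited in claims 23 and 27 to 29 can be illustrated with a minimal sketch. It is not taken from the application: the function names, the `min_inhibition` cutoff, the candidate labels, and all numeric activity values are hypothetical, and the claims themselves specify no numeric threshold, units, or reference ranges. The first function compares each candidate-containing test against the candidate-free control test; the second reports which established reference range, if any, contains a measured specific activity.

```python
from typing import Dict, List, Tuple


def find_inhibitors(
    first_tests: Dict[str, float],
    control_activity: float,
    min_inhibition: float = 0.5,  # hypothetical cutoff; the claims give none
) -> Dict[str, float]:
    """Return candidates whose activity is reduced relative to the control.

    first_tests maps a candidate label to the transferase activity measured
    in the test containing that candidate. control_activity is the activity
    measured in the test that contains no candidate.
    """
    if control_activity <= 0:
        raise ValueError("control activity must be positive")
    hits: Dict[str, float] = {}
    for candidate, activity in first_tests.items():
        # Fractional inhibition relative to the candidate-free control.
        inhibition = 1.0 - activity / control_activity
        if inhibition >= min_inhibition:
            hits[candidate] = inhibition
    return hits


def classify_by_range(
    sample_activity: float,
    reference_ranges: Dict[str, Tuple[float, float]],
) -> List[str]:
    """Return the label of every reference range containing the sample.

    reference_ranges maps a group label (for example "normal") to an
    inclusive (low, high) range of specific activity established from
    correlative samples. An empty list means no range matched.
    """
    return [
        label
        for label, (low, high) in reference_ranges.items()
        if low <= sample_activity <= high
    ]


if __name__ == "__main__":
    # Illustrative numbers only, in arbitrary activity units.
    print(find_inhibitors({"cand_A": 20.0, "cand_B": 95.0}, control_activity=100.0))
    print(classify_by_range(12.0, {"normal": (1.0, 5.0), "known disease": (10.0, 20.0)}))
```

Returning every matching range, rather than a single label, keeps overlapping or missing reference ranges visible instead of silently forcing a call.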
PCT/US1998/006101 1997-03-31 1998-03-27 O-LINKED GlcNAc TRANSFERASE (OGT): CLONING, MOLECULAR EXPRESSION, AND METHODS OF USE WO1998044123A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU69425/98A AU6942598A (en) 1997-03-31 1998-03-27 O-linked glcnac transferase (ogt): cloning, molecular expression, and methods of use

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US4227097P 1997-03-31 1997-03-31
US60/042,270 1997-03-31

Publications (2)

Publication Number Publication Date
WO1998044123A2 true WO1998044123A2 (en) 1998-10-08
WO1998044123A3 WO1998044123A3 (en) 1998-12-03

Family

ID=21920962

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1998/006101 WO1998044123A2 (en) 1997-03-31 1998-03-27 O-LINKED GlcNAc TRANSFERASE (OGT): CLONING, MOLECULAR EXPRESSION, AND METHODS OF USE

Country Status (2)

Country Link
AU (1) AU6942598A (en)
WO (1) WO1998044123A2 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0334962A1 (en) * 1987-09-07 1989-10-04 Oriental Yeast Co., Ltd. Method for diagnosis of hepatic cancer or hepatocirrhosis

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
DATABASE EMEST9 E.M.B.L. Databases Accession Number: AA187859, 17 May 1996 HILLIER L ET AL: "Homo sapiens cDNA clone 625918: 5' similar to WP:K04G7.3" XP002078645 *
GRIFFITH L S AND SCHMITZ B: "O-linked N-acetylglucosamine is upregulated in Alzheimer brains" BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS., vol. 213, no. 2, 15 August 1995, pages 424-431, XP002078642 *
HALTIWANGER R ET AL : "Glycosylation of nuclear and cytoplasmic proteins" JOURNAL OF BIOLOGICAL CHEMISTRY., vol. 267, no. 13, 5 May 1992, pages 9005-9013, XP002078639 cited in the application *
KREPPEL L ET AL: "Dynamic glycosylation of nuclear and cytosolic proteins" JOURNAL OF BIOLOGICAL CHEMISTRY., vol. 272, no. 14, 4 April 1997, pages 9308-9315, XP002078644 *
LUBAS W A AND HANOVER J: "Cloning and expression of an O-linked UDP-GlcNac transferase from C.elegans" FASEB JOURNAL., vol. 10, no. 6, page A1106 XP002078640 *
LUBAS W ET AL: "Analysis of nuclear pore protein p62 glycosylation" BIOCHEMISTRY, vol. 34, 1995, pages 1686-1694, XP002078641 cited in the application *
LUBAS W ET AL: "O-linked GlcNac transferase is a conserved nucleocytoplasmic protein containing tetratricopeptide repeats" JOURNAL OF BIOLOGICAL CHEMISTRY., vol. 272, no. 14, 4 April 1997, pages 9316-9324, XP002078643 MD US *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002081669A1 (en) * 2001-04-03 2002-10-17 Sankyo Company, Limited Assay methods for O-GlcNAc transferase activity
WO2009086952A2 (en) * 2008-01-07 2009-07-16 Projech Science To Technology, S.L. Compositions for the treatment of degenerative articular diseases
WO2009086952A3 (en) * 2008-01-07 2010-04-01 Projech Science To Technology, S.L. Compositions for the treatment of degenerative articular diseases
WO2014164805A1 (en) * 2013-03-11 2014-10-09 University Of North Carolina At Chapel Hill Compositions and methods for targeting o-linked n-acetylglucosamine transferase and promoting wound healing
CN105408480A (en) * 2013-03-11 2016-03-16 北卡罗来纳大学教堂山分校 Compositions and methods for targeting O-linked N-acetylglucosamine transferase and promoting wound healing
EP2970978A4 (en) * 2013-03-11 2016-11-02 Univ North Carolina Compositions and methods for targeting o-linked n-acetylglucosamine transferase and promoting wound healing
CN113943718A (en) * 2021-10-11 2022-01-18 北京大学 Glycosyltransferase and application thereof in marking, imaging and detecting Tn antigen
CN113943718B (en) * 2021-10-11 2023-10-24 北京大学 Glycosyltransferase and application thereof in marking, imaging and detection of Tn antigen

Also Published As

Publication number Publication date
WO1998044123A3 (en) 1998-12-03
AU6942598A (en) 1998-10-22

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH GM GW HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW SD SZ UG ZW AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
AK Designated states

Kind code of ref document: A3

Designated state(s): AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GE GH GM GW HU ID IL IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZW

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH GM KE LS MW SD SZ UG ZW AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

NENP Non-entry into the national phase in:

Ref country code: JP

Ref document number: 1998541849

Format of ref document f/p: F

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase in:

Ref country code: CA