WO2013067523A1 - A prokaryote-based cell-free system for the synthesis of glycoproteins

A prokaryote-based cell-free system for the synthesis of glycoproteins

Info

Publication number
WO2013067523A1
WO2013067523A1 PCT/US2012/063590 US2012063590W WO2013067523A1 WO 2013067523 A1 WO2013067523 A1 WO 2013067523A1 US 2012063590 W US2012063590 W US 2012063590W WO 2013067523 A1 WO2013067523 A1 WO 2013067523A1
Authority
WO
WIPO (PCT)
Prior art keywords
leu
ala
ser
phe
val
Prior art date
Application number
PCT/US2012/063590
Other languages
French (fr)
Inventor
Matthew Delisa
Original Assignee
Cornell University
Application filed by Cornell University
Priority to US14/356,258 (US11193154B2)
Priority to CN201280066129.1A (CN104080921A)
Publication of WO2013067523A1
Priority to IN4076CHN2014 (IN2014CN04076A)
Priority to HK15103270.8A (HK1202896A1)
Priority to US17/543,614 (US20220340947A1)

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P: FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P21/00: Preparation of peptides or proteins
    • C12P21/005: Glycopeptides, glycoproteins
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10: Transferases (2.)
    • C12N9/1048: Glycosyltransferases (2.4)
    • C12N9/1081: Glycosyltransferases (2.4) transferring other glycosyl groups (2.4.99)
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y: ENZYMES
    • C12Y204/00: Glycosyltransferases (2.4)
    • C12Y204/99: Glycosyltransferases (2.4) transferring other glycosyl groups (2.4.99)

Definitions

  • the present invention relates to cell-free systems, kits, and methods for producing a glycosylated protein or peptide.
  • CFEs: cell-free extracts
  • reconstituted protein synthesis from purified components
  • E. coli lack glycosylation machinery.
  • rabbit reticulocyte and wheat germ CFE systems cannot perform this post-translational modification because they lack microsomes (Tarui et al., "A Novel Cell-Free Translation/Glycosylation System Prepared From Insect Cells," J. Biosci. Bioeng. 90:508-514 (2000)).
  • This can be overcome by supplementing eukaryotic CFEs with microsomal vesicles (e.g., canine pancreas microsomes) (Lingappa et al., "Coupled Cell-Free Synthesis, Segregation, and Core Glycosylation of a Secretory Protein," Proc. Natl. Acad. Sci. U.S.A.
  • a first aspect of the present invention is directed to a cell-free system for producing a glycosylated protein.
  • This system comprises an isolated oligosaccharyltransferase capable of transferring a glycan from a lipid carrier molecule to a glycoprotein target; one or more isolated glycans, wherein each glycan is linked to a lipid carrier molecule; and a glycoprotein target comprising one or more glycan acceptor amino acid residues, or a nucleic acid molecule encoding said glycoprotein target.
  • kits comprising an isolated oligosaccharyltransferase capable of transferring a glycan from a lipid carrier molecule to a glycoprotein target, and one or more isolated glycans, wherein each glycan is linked to a lipid carrier molecule.
  • Another aspect of the present invention relates to a method for producing a glycosylated protein in a cell-free system.
  • This method involves providing an isolated oligosaccharyltransferase capable of transferring a glycan from a lipid carrier molecule to a glycoprotein target, providing one or more isolated glycans, wherein each glycan is linked to a lipid carrier molecule, and providing a glycoprotein target comprising one or more glycan acceptor amino acid residues.
  • This method further involves combining the oligosaccharyltransferase, one or more isolated glycans, and glycoprotein target to form a cell-free glycosylation reaction mixture, and subjecting the cell-free glycosylation reaction mixture to conditions effective for the oligosaccharyltransferase to transfer the glycan from the lipid carrier molecule to the one or more glycan acceptor residues of the glycoprotein target to produce a glycosylated protein.
  • glycoCFE protein glycosylation locus
  • This gene cluster encodes an N-linked glycosylation system that is functionally similar to that of eukaryotes and archaea, involving an oligosaccharyltransferase that catalyzes the en bloc transfer of preassembled oligosaccharides from lipid carriers onto asparagine residues in a conserved motif [N-X-S/T in eukaryotes and D/E-X1-N-X2-S/T (SEQ ID NO: 1) in bacteria (Kowarik et al., "Definition of the Bacterial N-Glycosylation Site Consensus Sequence," EMBO J.
  • C. jejuni glycosylation machinery is ideally suited for use in a cell-free translation/glycosylation system for the following reasons.
  • E. coli transformed with the entire pgl gene cluster can perform N-linked protein glycosylation (Wacker et al., "N-Linked Glycosylation in Campylobacter jejuni and its Functional Transfer Into E. coli," Science 298:1790-1793 (2002), which is hereby incorporated by reference in its entirety), thereby providing a convenient host for producing the necessary components in a pure and active form. Since E.
  • C. jejuni OST
  • PglB CjPglB
  • PglB X-ray Structure of a Bacterial Oligosaccharyltransferase
  • CjPglB can transfer sugars post-translationally to locally flexible structures in folded proteins (Kowarik et al., "N-Linked Glycosylation of Folded Proteins by the Bacterial Oligosaccharyltransferase," Science 314:1148-1150 (2006), which is hereby incorporated by reference in its entirety), indicating that protein glycosylation can be achieved without supplementing a functional membrane system (e.g., microsomes).
  • Figures 1A-1B depict aspects of bacterial and eukaryotic N-linked glycosylation.
  • Figure 1A shows the 17-kb pgl locus of C. jejuni encoding the N-linked glycosylation machinery that has been fully reconstituted in E. coli.
  • Figure 1B shows a comparison of N-linked glycosylation in prokaryotes (left) and eukaryotes (right).
  • several glycosyltransferases synthesize the glycan by sequential addition of nucleotide-activated sugars on a lipid carrier on the cytoplasmic face of the inner membrane.
  • a flippase transfers the lipid-linked glycans (also referred to as lipid-linked oligosaccharides or LLOs) across the membrane where the oligosaccharyltransferase catalyzes the transfer to Asn residues of periplasmic or endoplasmic reticulum substrate proteins.
  • PglB is a single-subunit, integral membrane protein that is homologous to the catalytic subunit of the eukaryotic OST STT3 (note that PglB and STT3 complex are not drawn to scale).
  • PglB requires an extended motif that includes an Asp or Glu residue in the -2 position (D/E-X1-N-X2-S/T (SEQ ID NO: 1), where X1 and X2 can be any amino acid except Pro).
  • PglB can transfer sugars post-translationally to locally flexible structures in folded proteins.
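The acceptor-site motifs described for Figure 1B lend themselves to a simple pattern search. The following is a minimal sketch, not part of the disclosure: the function name `find_sequons`, the pattern constants, and the example peptide are hypothetical, and the exclusion of Pro at the X position of the eukaryotic N-X-S/T motif is the commonly cited rule rather than something stated in this excerpt. Positions are reported 1-based for the acceptor Asn.

```python
import re

# Lookahead patterns so that overlapping motifs are all reported.
# Bacterial (PglB) motif: D/E-X1-N-X2-S/T, with X1 and X2 any residue except Pro.
BACTERIAL = re.compile(r"(?=([DE][^P]N[^P][ST]))")
# Shorter eukaryotic motif: N-X-S/T (X assumed to be any residue except Pro).
EUKARYOTIC = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein, pattern, asn_offset):
    """Return (asn_position, motif) pairs for every motif occurrence.

    asn_offset is the 0-based index of the acceptor Asn inside the motif:
    2 for the bacterial motif, 0 for the eukaryotic one.
    """
    seq = protein.upper()
    return [(m.start() + asn_offset + 1, m.group(1)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    peptide = "AAADQNATGGGNGSKK"  # hypothetical test peptide
    print(find_sequons(peptide, BACTERIAL, 2))
    print(find_sequons(peptide, EUKARYOTIC, 0))
```

In this made-up peptide, only the first Asn sits in the extended bacterial motif, while both Asn residues satisfy the shorter N-X-S/T motif, which illustrates why PglB is described as requiring the additional Asp or Glu at the -2 position.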
  • Figures 2A-2B show the purification of bacterial OST.
  • CjPglB was expressed in E. coli C43(DE3) cells and purified to near homogeneity. Elution fractions (as indicated) from gel filtration columns were examined by SDS-PAGE, and the Coomassie Blue-stained gel images ( Figure 2B) are shown together with the elution profiles ( Figure 2A).
  • MW molecular weight standard.
  • Figures 3A-3C show reconstituted glycosylation with defined components.
  • the in vitro glycosylation assay was carried out using purified OST, extracted LLOs and purified acceptor proteins produced in E. coli.
  • the immunoblots of Figure 3A show the detection of acceptor protein AcrA and scFv13-R4-GT (both anti-His) or glycans (anti-glycan). Reactions included 3 μg wild-type CjPglB, 5 (+) or 10 (++) μL of LLOs and 5 μg of acceptor protein.
  • Figure 3B is the same assay as described in Figure 3A but with purified PglB from Campylobacter lari (ClPglB).
  • Figure 3C shows immunoblots detecting AcrA following in vitro glycosylation using 3-month-old freeze thawed components.
  • Figures 4A-4B demonstrate the cell-free translation/glycosylation of AcrA.
  • Figure 4A is an immunoblot detecting different AcrA constructs (anti-AcrA) produced by in vitro translation using either E. coli CFEs or purified translation components (PURE). AcrA concentration was estimated by comparing band intensities to that of purified AcrA loaded in lane 1.
  • Figure 4B is an immunoblot detecting ΔssAcrA expression (anti-AcrA) and glycosylation (anti-glycan).
  • ΔssAcrA was produced by cell-free translation/glycosylation using either the CFE or the PURE systems that were primed with pET24(AcrA-cyt). Controls included the omission of different components (-) or LLOs from SCM6 cells with empty pACYC (+/-).
  • Figures 5A-5B depict the cell-free translation/glycosylation of scFv13-R4-GT.
  • Figure 5A is an immunoblot detecting different scFv13-R4-GT constructs (anti-FLAG) produced by in vitro translation using either E. coli cell-free extracts (CFE) or purified translation components (PURE). Estimates of the scFv13-R4-GT concentration were determined by comparison of band intensities to that of the purified scFv13-R4-GT sample loaded in lane 1.
  • Figure 5B is an immunoblot detecting scFv13-R4-GT expression (anti-FLAG) and glycosylation (anti-glycan).
  • the scFv13-R4-GT protein was produced by cell-free translation/glycosylation using either the CFE or PURE systems that were primed with pET24-ssDsbAscFv13-R4-GT. Controls included omission of different components (-).
  • Figures 6A-6C show an amino acid sequence alignment of various Campylobacter PglB proteins that are suitable for use in the systems, kits, and methods of the present invention.
  • the PglB amino acid sequences are derived from C. jejuni (SEQ ID NO: 2), C. lari (SEQ ID NO:4), C. coli (SEQ ID NO: 6), and C. upsaliensis (SEQ ID NO: 8).
  • An (*) indicates positions which have a single, fully conserved residue; (:) indicates conservation between groups of strongly similar properties; and (.) indicates conservation between groups of weakly similar properties.
  • a PglB consensus sequence based on the alignment of Campylobacter PglB sequences is presented as SEQ ID NO: 10.
  • Residues that are not fully conserved between the four Campylobacter sequences are depicted as X, where X can be any amino acid residue. Alternatively, X is selected from an amino acid residue at the corresponding position in one of the four depicted Campylobacter sequences.
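The consensus rule described for the alignment figures (a fully conserved column keeps its residue, any other column becomes X, and X may alternatively be restricted to residues actually observed in that column) can be sketched as follows. This is an illustrative sketch only: the function names and the four toy sequences are hypothetical, the sequences are assumed to be pre-aligned to equal length with "-" for gaps, and treating gapped columns as X is an assumption made here rather than a rule stated in the text.

```python
def consensus(aligned):
    """Build a consensus string: conserved residue, or X where the column varies."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    out = []
    for column in zip(*aligned):
        residues = set(column)
        # Fully conserved, ungapped column keeps its residue; otherwise X.
        if len(residues) == 1 and "-" not in residues:
            out.append(column[0])
        else:
            out.append("X")
    return "".join(out)


def allowed_at(aligned, index):
    """Residues observed at a 0-based alignment column (the narrower reading of X)."""
    return {s[index] for s in aligned} - {"-"}


if __name__ == "__main__":
    toy = ["MKLA", "MRLA", "MKLG", "MKLA"]  # hypothetical aligned fragments
    print(consensus(toy))
    print(sorted(allowed_at(toy, 1)))
```

The second helper reflects the alternative definition, under which X at a given position is limited to the residues found at that position in one of the aligned sequences.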
  • Figures 7A-7E show an amino acid sequence alignment of various Pyrococcus OST STT3 subunit proteins that are suitable for use in the systems, kits, and methods of the present invention.
  • the OST amino acid sequences are derived from P. furiosus (SEQ ID NO: 11), Pyrococcus sp. ST04 (SEQ ID NO: 13), Pyrococcus sp. (strain NA2) (SEQ ID NO: 14), P. horikoshii (SEQ ID NO: 15), P. abyssi (SEQ ID NO: 16), and P. yayanosii (SEQ ID NO: 17).
  • An (*) indicates positions which have a single, fully conserved residue; (:) indicates conservation between groups of strongly similar properties; and (.) indicates conservation between groups of weakly similar properties.
  • A STT3 consensus sequence based on the alignment of Pyrococcus STT3 sequences is presented as SEQ ID NO: 18. Residues that are not fully conserved between the six Pyrococcus sequences are depicted as X, where X can be any amino acid residue. Alternatively, X is selected from an amino acid residue at the corresponding position in one of the six depicted Pyrococcus sequences.
  • Figures 8A-8D show an amino acid sequence alignment of various Leishmania OST STT3 subunit related proteins that are suitable for use in the systems, kits, and methods of the present invention.
  • the OST amino acid sequences are derived from L. major (SEQ ID NO: 19), L. donovani (SEQ ID NO: 21), L. infantum (SEQ ID NO: 22), L. mexicana (SEQ ID NO: 23), and L. braziliensis (SEQ ID NO: 24).
  • An (*) indicates positions which have a single, fully conserved residue; (:) indicates conservation between groups of strongly similar properties; and (.) indicates conservation between groups of weakly similar properties.
  • a STT3 consensus sequence based on the alignment of Leishmania STT3 sequences is presented as SEQ ID NO: 25. Residues that are not fully conserved between the five Leishmania sequences are depicted as X, where X can be any amino acid residue. Alternatively, X is selected from an amino acid residue at the corresponding position in one of the five depicted Leishmania sequences.
  • Figures 9A-9J contain a listing of eukaryotic STT3 oligosaccharyltransferases that are suitable for use in the methods, systems, and kits of the present invention.
  • the oligosaccharyltransferases are identified by UniProtKB Entry number (col. 1), which provides the amino acid sequence of the protein, UniProtKB Entry name (col. 2), protein name (col. 3), gene name (col. 4), organism (col. 5), and European Molecular Biology Laboratory (EMBL) database accession number (col. 6), which provides the encoding nucleotide sequence of the protein.
  • a first aspect of the present invention is directed to a cell-free system for producing a glycosylated protein.
  • This system comprises an isolated oligosaccharyltransferase capable of transferring a glycan from a lipid carrier molecule to a glycoprotein target; one or more isolated glycans, wherein each glycan is linked to a lipid carrier molecule; and a glycoprotein target comprising one or more glycan acceptor amino acid residues, or a nucleic acid molecule encoding said glycoprotein target.
  • oligosaccharyltransferase refers generally to a glycosylation enzyme or subunit of a glycosylation enzyme complex that is capable of transferring a glycan, i.e., an oligosaccharide or polysaccharide, from a donor substrate to a particular acceptor substrate.
  • the donor substrate is typically a lipid carrier molecule linked to the glycan, and the acceptor substrate is typically a particular amino acid residue of a target glycoprotein.
  • Suitable OSTs include those enzymes that transfer a glycan to an asparagine residue, i.e., an OST involved in N-linked glycosylation, and those enzymes that transfer a glycan or activated sugar moiety to a hydroxyl oxygen atom of an amino acid residue, i.e., an OST involved in O-linked glycosylation.
  • An isolated OST of the present invention can be a single-subunit enzyme, a multi-subunit enzyme complex, or a single subunit derived from a multi-subunit enzyme complex. While a number of exemplary OST enzymes are described below, one of skill in the art readily appreciates that any oligosaccharyltransferase enzyme known in the art is suitable for use in the present invention.
  • The OST can be a prokaryotic OST.
  • PglB, a single-subunit, integral membrane OST protein derived from Campylobacter jejuni, is suitable for use in the present invention.
  • PglB attaches a heptasaccharide to an asparagine residue of a glycoprotein target (Kowarik et al., "Definition of the Bacterial N-glycosylation Site Consensus Sequence," EMBO J. 25:1957-66 (2006), which is hereby incorporated by reference in its entirety).
  • the amino acid sequence of C. jejuni PglB (UniProtKB Accession No. Q9S4V7) is shown below as SEQ ID NO: 2:
  • the nucleic acid sequence encoding the amino acid sequence of SEQ ID NO: 2 is provided below as SEQ ID NO: 3 (EMBL Nucleotide Sequence Database No. AAD51383):
  • The amino acid and nucleotide sequences of SEQ ID NOs: 2 and 3, respectively, are representative C. jejuni PglB protein and nucleic acid sequences. It is appreciated by one of skill in the art that there are at least 70 subspecies of C. jejuni having a PglB protein that may vary in sequence identity from the amino acid sequence of SEQ ID NO: 2, but retain the same function. Accordingly, homologous PglB protein sequences from other subspecies and strains of C. jejuni that are characterized by an amino acid sequence identity of at least about 70 percent, more preferably at least about 75 percent or 80 percent, most preferably at least about 85 percent or 90 percent or 95 percent as compared to the C. jejuni amino acid sequence of SEQ ID NO: 2 are also suitable for use in the present invention.
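The identity thresholds recited above (about 70, 75, 80, 85, 90, and 95 percent) can be illustrated with a small calculation. The sketch below is an assumption-laden illustration, not the method of the disclosure: the text does not specify how sequence identity is computed, so this version counts identical residues over columns where neither sequence has a gap, takes already-aligned sequences as input, and treats the "about" thresholds as hard cutoffs. The function names and toy sequences are hypothetical.

```python
THRESHOLDS = (95, 90, 85, 80, 75, 70)  # percent identity tiers recited in the text


def percent_identity(a, b):
    """Percent identity over ungapped columns of two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def highest_tier(identity):
    """Return the highest recited threshold that the identity meets, or None."""
    for tier in THRESHOLDS:
        if identity >= tier:
            return tier
    return None


if __name__ == "__main__":
    pid = percent_identity("MKLAV-", "MKLGVA")  # hypothetical aligned fragments
    print(pid, highest_tier(pid))
```

Other conventions (for example, dividing by the length of the shorter sequence or by the full alignment length) give different values, so any real comparison should state which denominator is used.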
  • the amino acid sequences of related C. jejuni PglB proteins and nucleotide sequences encoding the same are known and readily available to one of skill in the art.
  • OSTs from other species of Campylobacter that share sequence identity to C. jejuni PglB and/or are capable of transferring an oligosaccharide moiety to a target glycoprotein are also suitable for use in this and all aspects of the present invention.
  • PglB from Campylobacter lari (ClPglB), which shares only 56% sequence identity to the amino acid sequence of C. jejuni PglB (Schwarz et al., "Relaxed Acceptor Site Specificity of Bacterial Oligosaccharyltransferase in Vivo"), is capable of transferring a glycan to an acceptor amino acid residue (i.e., asparagine) of a target glycoprotein in the cell-free glycosylation system of the present invention.
  • the amino acid sequence of C. lari PglB (UniProtKB Accession No. B9KDD4) is shown below as SEQ ID NO: 4:
  • Arg Asp Met Ile Ala Gly Phe His Gln Pro Asn Asp Leu Ser Tyr Phe 65 70 75 80
  • Phe Ser Phe Glu Ser Ile Ile Leu Tyr Met Ser Ala Phe Phe Ala Ser
  • Lys Glu Glu Lys Ile Asn Phe Tyr Met Ile Trp Ala Leu Ile Phe Ile
  • Val Tyr Ile Tyr Met Pro Tyr Arg Met Leu Arg Ile Met Pro Val Val
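The excerpted rows above use the three-letter residue code with position markers under every fifth residue. For use with alignment or motif-search tools, such a row is usually converted to the one-letter code. The following is a minimal sketch under stated assumptions: the function name `to_one_letter` is hypothetical, only the 20 standard residues are mapped, purely numeric tokens are treated as position markers and skipped, and any unrecognized token raises an error rather than being guessed.

```python
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(listing_row):
    """Convert a three-letter sequence-listing row to a one-letter string."""
    residues = []
    for token in listing_row.split():
        if token.isdigit():
            continue  # position marker, not a residue
        # KeyError on unknown tokens is deliberate: do not guess at residues.
        residues.append(THREE_TO_ONE[token.capitalize()])
    return "".join(residues)


if __name__ == "__main__":
    row = "Arg Asp Met Ile Ala Gly Phe His Gln Pro Asn Asp Leu Ser Tyr Phe 65 70 75 80"
    print(to_one_letter(row))
```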
  • Amino acid sequences sharing at least about 70 percent, more preferably at least about 75 percent or 80 percent, most preferably at least about 85 percent or 90 percent or 95 percent as compared to the C. lari amino acid sequence of SEQ ID NO: 4 are also suitable for use in the present invention.
  • the nucleic acid sequence encoding the amino acid sequence of SEQ ID NO: 4 is provided below as SEQ ID NO: 5 (EMBL Nucleotide Sequence Database No.
  • Another N-linked OST from Campylobacter that is suitable for use in this and all aspects of the present invention is PglB from C. coli.
  • the amino acid sequence of PglB from C. coli (UniProtKB Accession No. H7WI6), which is 81% identical to that of C. jejuni, is provided below as SEQ ID NO: 6:
  • Met Ile Ile Ser Asn Asp Gly Tyr Ala Phe Ala Glu Gly Ala Arg Asp 50 55 60
  • Ala Ile Ile Leu Ala Ser Ile Thr Leu Ser Asn Ile Ala Trp Phe Tyr 225 230 235 240
  • Lys Phe Tyr Ile Phe Arg Ser Asp Glu Ser Ala Asn Leu Ala Gln Gly 290 295 300
  • Ser Leu Ser Lys Pro Asp Phe Lys Ile Asn Thr Pro Lys Thr Arg Asp 545 550 555 560
  • Val Tyr Ile Tyr Met Pro Ala Arg Met Ser Leu Ile Phe Ser Thr Val
  • Amino acid sequences sharing at least about 70 percent, more preferably at least about 75 percent or 80 percent, most preferably at least about 85 percent or 90 percent or 95 percent as compared to the C. coli amino acid sequence of SEQ ID NO: 6 are also suitable for use in the present invention.
  • the nucleic acid sequence encoding the amino acid sequence of SEQ ID NO: 6 is provided below as SEQ ID NO: 7 (EMBL Nucleotide Sequence Database No.
  • Another Campylobacter OST that is suitable for use in this and all aspects of the present invention is PglB from C. upsaliensis.
  • the amino acid sequence of PglB from C. upsaliensis (UniProtKB Accession No. E6LAJ2), which is 57% identical to that of C. jejuni, is provided below as SEQ ID NO: 8:
  • Asp Tyr Ile Val Ala Trp Trp Asp Tyr Gly Tyr Pro Ile Arg Tyr Tyr 530 535 540
  • Amino acid sequences sharing at least about 70 percent, more preferably at least about 75 percent or 80 percent, most preferably at least about 85 percent or 90 percent or 95 percent as compared to the C. upsaliensis amino acid sequence of SEQ ID NO: 8 are also suitable for use in the present invention.
  • the nucleic acid sequence encoding the amino acid sequence of SEQ ID NO: 8 is provided below as SEQ ID NO: 9 (EMBL Nucleotide Sequence Database No.
  • The Campylobacter PglB sequences are aligned in Figures 6A-6C, and a PglB consensus sequence based on this alignment is presented as SEQ ID NO: 10 of Figure 6.
  • Residues that are not fully conserved between the four Campylobacter sequences are depicted as X, where X can be any amino acid residue. Alternatively, X is selected from an amino acid residue at the corresponding position in one of the four depicted Campylobacter sequences.
  • the OST is an archaeal oligosaccharyltransferase.
  • the OST STT3 subunit from Pyrococcus furiosus which is capable of transferring a glycan to an asparagine residue of a target glycoprotein is suitable for use in this and all aspects of the present invention.
  • the amino acid sequence of P. furiosus (UniProtKB Accession No. Q8U4D2) is provided below as SEQ ID NO: 11:
  • Trp Leu Arg Glu Asn Thr Pro Glu Tyr Ser Thr Ala Thr Ser Trp Trp
  • Leu Lys Leu Tyr Ile Ser Ala Phe Gly Arg Asp Ile Glu Asn Ala Thr
  • Gln Lys Gly Pro Ile Gly Val Leu Leu Asp Ala Pro Lys Val Asn Gly
  • Glu Ile Arg Ser Pro Thr Asn Ile Leu Arg Glu Gly Glu Ser Gly Glu
  • Amino acid sequences sharing at least about 70 percent, more preferably at least about 75 percent or 80 percent, most preferably at least about 85 percent or 90 percent or 95 percent as compared to the P. furiosus amino acid sequence of SEQ ID NO: 11 are also suitable for use in the present invention.
  • the nucleic acid sequence encoding the amino acid sequence of SEQ ID NO: 11 is provided below as SEQ ID NO: 12 (EMBL Nucleotide Sequence Database No. AAL80280):
  • OSTs from other Pyrococcus species or strains that share sequence identity to P. furiosus OST STT3 subunit related protein and/or are capable of transferring a glycan moiety to a target glycoprotein are also suitable for use in the present invention.
  • homologous OSTs derived from Pyrococcus sp. ST04 (SEQ ID NO: 13; UniProtKB No. I3RCF1), Pyrococcus sp. (strain NA2) (SEQ ID NO: 14; UniProtKB No. F4HM23), P. horikoshii (SEQ ID NO: 15; UniProtKB No. O74088), P. abyssi (SEQ ID NO: 16; UniProtKB No.
  • P. furiosus OST each share greater than 70% sequence identity with the amino acid sequence of P. furiosus OST (see alignment of Figure 7), and are suitable for use in this and all aspects of the present invention.
  • the nucleotide sequences encoding the aforementioned Pyrococcus OSTs are known and readily available in the art.
  • a STT3 consensus sequence based on the alignment of Pyrococcus STT3 sequences is presented as SEQ ID NO: 18 in Figure 7. Residues that are not fully conserved between the six Pyrococcus sequences are depicted as X, where X can be any amino acid residue. Alternatively, X is selected from an amino acid residue at the corresponding position in one of the six depicted Pyrococcus sequences.
  • the OST is a eukaryotic oligosaccharyltransferase.
  • the OST STT3 subunit from Leishmania major, which is capable of transferring a glycan to an asparagine residue of a target glycoprotein, is suitable for use in this and all aspects of the present invention.
  • the amino acid sequence of L. major (UniProtKB Accession No. Q9U5N8) is provided below as SEQ ID NO: 19.
  • Ser Leu Arg Thr Arg Ser Ser Trp Pro Ile Gly Val Leu Thr Gly Val 225 230 235 240
  • Ala Tyr Gly Tyr Met Ala Ala Ala Trp Gly Gly Tyr Ile Phe Val Leu
  • Glu His Ile Ala Thr Ile Gly Lys Met Leu Thr Ser Pro Val Ala Glu 625 630 635 640
  • Amino acid sequences sharing at least about 70 percent, more preferably at least about 75 percent or 80 percent, most preferably at least about 85 percent or 90 percent or 95 percent as compared to the L. major amino acid sequence of SEQ ID NO: 19 are also suitable for use in the present invention.
  • the nucleic acid sequence encoding the amino acid sequence of SEQ ID NO: 19 (L. major STT3) is provided below as SEQ ID NO: 20 (EMBL Nucleotide Sequence Database No. CAB61569):
  • OSTs from other Leishmania species or strains that share sequence identity to L. major OST STT3 subunit related protein and/or are capable of transferring a glycan moiety to a target glycoprotein are also suitable for use in the present invention.
  • homologous OSTs derived from L. donovani (SEQ ID NO: 21; UniProtKB No. E9BRZ2), L. infantum (SEQ ID NO: 22; UniProtKB No. A4IB10), L. mexicana (SEQ ID NO: 23; UniProtKB No. E9B5Z4), and L. braziliensis (SEQ ID NO: 24; UniProtKB No. A4HMD6), which each share greater than 70% sequence identity with the amino acid sequence of L. major OST, are also suitable for use in this and all aspects of the present invention.
  • a STT3 consensus sequence based on the alignment of Leishmania STT3 sequences is presented as SEQ ID NO: 25 in Figure 8. Residues that are not fully conserved between the five Leishmania sequences are depicted as X, where X can be any amino acid residue. Alternatively, X is selected from an amino acid residue at the corresponding position in one of the five depicted Leishmania sequences.
  • the eukaryotic oligosaccharyltransferase is STT3 from Saccharomyces cerevisiae.
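  • the amino acid sequence of S. cerevisiae (UniProtKB Accession No. P39007) is provided below as SEQ ID NO: 26.
  • Amino acid sequences sharing at least about 70 percent, more preferably at least about 75 percent or 80 percent, most preferably at least about 85 percent or 90 percent or 95 percent as compared to the S. cerevisiae amino acid sequence of SEQ ID NO: 26 are also suitable for use in the present invention.
  • the nucleic acid sequence encoding the amino acid sequence of SEQ ID NO: 26 (S. cerevisiae STT3) is provided below as SEQ ID NO: 27 (EMBL Nucleotide Sequence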
  • the eukaryotic oligosaccharyltransferase is STT3 from Schizosaccharomyces pombe.
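  • the amino acid sequence of S. pombe (UniProtKB Accession No. O94335) is provided below as SEQ ID NO: 28.
  • Ala Asp Arg Ile Thr Leu Val Asp Asn Asn Thr Trp Asn Asn Thr His 565 570 575
  • Ile Ala Thr Val Gly Lys Ala Met Ser Ser Pro Glu Glu Lys Ala Tyr
  • Amino acid sequences sharing at least about 70 percent, more preferably at least about 75 percent or 80 percent, most preferably at least about 85 percent or 90 percent or 95 percent as compared to the S. pombe amino acid sequence of SEQ ID NO: 28 are also suitable for use in the present invention.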
  • the nucleic acid sequence encoding the amino acid sequence of SEQ ID NO: 28 (S. pombe STT3) is provided below as SEQ ID NO: 29 (EMBL Nucleotide Sequence Database No. BAA76479).
  • the eukaryotic oligosaccharyltransferase is STT3 from Dictyostelium discoideum.
  • the amino acid sequence of D. discoideum (UniProtKB Accession No. Q54NM9) is provided below as SEQ ID NO: 30.
  • Amino acid sequences sharing at least about 70 percent, more preferably at least about 75 percent or 80 percent, most preferably at least about 85 percent or 90 percent or 95 percent as compared to the D. discoideum amino acid sequence of SEQ ID NO: 30 are also suitable for use in the present invention.
  • the nucleic acid sequence encoding the amino acid sequence of SEQ ID NO: 30 (D. discoideum STT3) is provided below as SEQ ID NO: 31 (EMBL Nucleotide Sequence Database No. EAL64892).
  • This table identifies each oligosaccharyltransferase by its UniProtKB entry number, which provides the amino acid sequence of the protein, and the EMBL database accession number, which provides the encoding nucleotide sequence.
  • the UniProtKB and EMBL accession numbers, along with the corresponding amino acid and nucleotide sequence information for each oligosaccharyltransferase listed in Figure 9, are hereby incorporated by reference in their entirety.
  • The oligosaccharyltransferase is an O-linked oligosaccharyltransferase.
  • An exemplary O-linked OST is PilO from Pseudomonas aeruginosa. PilO is responsible for the en bloc transfer of an oligosaccharide from a lipid-linked donor to an oxygen atom of serine and threonine residues (Faridmoayer et al., "Functional Characterization of Bacterial Oligosaccharyltransferases Involved in O-Linked Protein Glycosylation," J. Bacteriol. 189(22): 8088-8098 (2007), which is hereby incorporated by reference in its entirety).
  • the amino acid sequence of P. aeruginosa (UniProtKB Accession No. Q51353) is provided below as SEQ ID NO: 32
  • Amino acid sequences sharing at least about 70 percent, more preferably at least about 75 percent or 80 percent, most preferably at least about 85 percent or 90 percent or 95 percent as compared to the P. aeruginosa amino acid sequence of SEQ ID NO: 32 are also suitable for use in the present invention.
  • the nucleic acid sequence encoding the amino acid sequence of SEQ ID NO: 32 is provided below as SEQ ID NO: 33 (EMBL Nucleotide Sequence Database No. AAA87404).
  • Another O-linked OST suitable for use in all aspects of the present invention is PglL from Neisseria meningitidis (Faridmoayer et al., "Functional Characterization of Bacterial Oligosaccharyltransferases Involved in O-Linked Protein Glycosylation," J. Bacteriol. 189(22): 8088-8098 (2007), which is hereby incorporated by reference in its entirety).
  • the amino acid sequence of N. meningitidis PglL is provided below as SEQ ID NO: 34:
  • Lys Leu Phe Asp Val Lys Ile Pro Ala Ile Ser Phe Leu Leu Phe Ala 65 70 75 80
  • Amino acid sequences sharing at least about 70 percent, more preferably at least about 75 percent or 80 percent, most preferably at least about 85 percent or 90 percent or 95 percent as compared to the N. meningitidis amino acid sequence of SEQ ID NO: 34 are also suitable for use in the present invention.
  • the nucleic acid sequence encoding the amino acid sequence of SEQ ID NO: 34 (N. meningitidis PglL) is provided below as SEQ ID NO: 35 (EMBL Nucleotide Sequence Database No. AEK98518).
  • an "isolated" oligosaccharyltransferase refers to an oligosaccharyltransferase that is substantially pure or substantially separated from other cellular components that naturally accompany the native protein in its natural host cell.
  • the isolated oligosaccharyltransferase of the present invention is at least about 80% pure, usually at least about 90% pure, and preferably at least about 95% pure. Purity can be assessed using any method known in the art, e.g., polyacrylamide gel electrophoresis, HPLC, etc.
  • the isolated oligosaccharyltransferase can be obtained from the organism from which it is derived directly, or it can be
  • the use of recombinant expression systems to produce and isolate a protein of interest involves inserting a nucleic acid molecule encoding the amino acid sequence of the desired protein into an expression system to which the molecule is heterologous (i.e., not normally present).
  • One or more desired nucleic acid molecules encoding one or more proteins may be inserted into the vector.
  • the multiple nucleic acid molecules may encode the same or different enzymes.
  • the heterologous nucleic acid molecule is inserted into the expression system or vector in proper sense (5'→3') orientation relative to the promoter and any other 5' regulatory molecules, and in the correct reading frame.
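The orientation and reading-frame requirement can be checked computationally before cloning. The sketch below is illustrative only and rests on several assumptions not found in the text: the insert is taken to be a bare coding sequence that begins with ATG, uses the standard stop codons (TAA, TAG, TGA), and contains only A, C, G, and T. All function names and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def is_intact_orf(cds):
    """True if cds starts with ATG, is a whole number of codons, ends with a
    stop codon, and has no premature in-frame stop."""
    cds = cds.upper()
    if len(cds) % 3 != 0 or not cds.startswith("ATG"):
        return False
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    return codons[-1] in STOP_CODONS and not any(c in STOP_CODONS for c in codons[:-1])


def insert_orientation(insert):
    """Classify an insert, read 5' to 3' from the promoter side."""
    if is_intact_orf(insert):
        return "sense"
    if is_intact_orf(reverse_complement(insert)):
        return "antisense"
    return "no intact ORF"


if __name__ == "__main__":
    print(insert_orientation("ATGGCTAAATAA"))  # hypothetical 4-codon insert
```

A check like this only covers the insert itself; whether the open reading frame is also in frame with any vector-encoded leader or tag depends on the junction sequence, which is outside this sketch.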
  • The preparation of the nucleic acid constructs can be carried out using standard cloning procedures well known in the art as described by Joseph Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL (Cold Spring Harbor 1989) and U.S. Patent No. 4,237,224 to Cohen and Boyer, which are hereby incorporated by reference in their entirety. These recombinant plasmids are then introduced by means of transformation and replicated in a suitable host cell.
  • a variety of genetic signals and processing events that control many levels of gene expression can be incorporated into the nucleic acid construct to maximize enzyme production.
  • mRNA messenger RNA
  • any one of a number of suitable promoters may be used. For instance, when cloning in E.
  • promoters such as the T7 phage promoter, lac promoter, trp promoter, recA promoter, ribosomal RNA promoter, the PR and PL promoters of coliphage lambda and others, including, but not limited to, lacUV5, ompF, bla, lpp, and the like, may be used to direct high levels of transcription of adjacent DNA segments. Additionally, a hybrid trp-lacUV5 (tac) promoter or other E. coli promoters produced by recombinant DNA or other synthetic DNA techniques may be used to provide for transcription of the inserted gene.
  • Common promoters suitable for directing expression in mammalian cells include, without limitation, SV40, MMTV, metallothionein-1, adenovirus E1a, CMV, immediate early, immunoglobulin heavy chain promoter and enhancer, and RSV-LTR.
  • nucleic acid constructs there are other specific initiation signals required for efficient gene transcription and translation in prokaryotic cells that can be included in the nucleic acid construct to maximize peptide production, e.g., the Shine-Dalgarno ribosome binding site.
  • suitable transcription and/or translation elements including constitutive, inducible, and repressible promoters, as well as minimal 5' promoter elements, enhancers or leader sequences may be used.
  • a nucleic acid molecule encoding an oligosaccharyltransferase or other protein component of the present invention e.g., glycoprotein target, enzymes involved in glycan production
  • a promoter molecule of choice including, without limitation, enhancers, and leader sequences
  • a suitable 3' regulatory region to allow transcription in the host and any additional desired components, such as reporter or marker genes, are cloned into the vector of choice using standard cloning procedures in the art, such as described in Joseph Sambrook et al., MOLECULAR CLONING: A
  • nucleic acid molecule encoding the protein or proteins has been cloned into an expression vector, it is ready to be incorporated into a host.
  • Recombinant molecules can be introduced into cells, without limitation, via transfection (if the host is a eukaryote), transduction, conjugation, mobilization, electroporation, lipofection, protoplast fusion, calcium chloride transformation, transfection using bacteriophage, or particle bombardment, using standard cloning procedures known in the art, as described by JOSEPH SAMBROOK et al., MOLECULAR CLONING: A LABORATORY MANUAL (Cold Spring Harbor 1989), which is hereby incorporated by reference in its entirety.
  • Suitable host cells for recombinant protein production include both prokaryotic and eukaryotic cells.
  • Suitable prokaryotic host cells include, without limitation, E. coli and other Enterobacteriaceae, Escherichia sp., Campylobacter sp., Wolinella sp., Desulfovibrio sp., Vibrio sp., Pseudomonas sp.,
  • Bacillus sp., Listeria sp., Staphylococcus sp., Streptococcus sp., Peptostreptococcus sp., Megasphaera sp., Pectinatus sp., Selenomonas sp., Zymophilus sp., Actinomyces sp., Arthrobacter sp., Frankia sp., Micromonospora sp., Nocardia sp., Propionibacterium sp., Streptomyces sp., Lactobacillus sp., Lactococcus sp., Leuconostoc sp., Pediococcus sp.,
  • Bordetella sp., Brucella sp., Capnocytophaga sp., Cardiobacterium sp., Eikenella sp., Francisella sp., Haemophilus sp., Kingella sp., Pasteurella sp., Flavobacterium sp., Xanthomonas sp., Burkholderia sp., Aeromonas sp., Plesiomonas sp., Legionella sp.
  • alpha-proteobacteria such as Wolbachia sp., cyanobacteria, spirochaetes, green sulfur and green non-sulfur bacteria, Gram-negative cocci, fastidious Gram-negative bacilli, Enterobacteriaceae (glucose-fermenting Gram-negative bacilli), Gram-negative bacilli that are non-glucose fermenters, and Gram-negative bacilli that are glucose-fermenting and oxidase-positive.
  • eukaryotic cells such as mammalian, insect, and yeast systems are also suitable host cells for transfection/transformation of the expression vector for recombinant protein production.
  • Mammalian cell lines available in the art for expression of a heterologous protein or polypeptide include Chinese hamster ovary cells, HeLa cells, baby hamster kidney cells, COS cells and many others.
  • Purified proteins may be obtained from the host cell by several methods readily known in the art, including ion exchange chromatography, hydrophobic interaction chromatography, affinity chromatography, gel filtration, and reverse phase chromatography.
  • the peptide is preferably produced in purified form (preferably at least about 70 to about 75% pure, or about 80% to 85% pure, more preferably at least about 90% or 95% pure) by conventional techniques.
  • the protein can be isolated and purified by centrifugation (to separate cellular components from supernatant containing the secreted protein) followed by sequential ammonium sulfate precipitation of the supernatant.
  • the fraction containing the protein can be subjected to gel filtration in an appropriately sized dextran or polyacrylamide column to separate the protein from other cellular components and proteins. If necessary, the protein fraction may be further purified by HPLC.
  • the oligosaccharyltransferase catalyzes the transfer of a glycan from a lipid donor to an acceptor protein, peptide, or polypeptide.
  • the lipid donor or carrier molecule is a prokaryotic lipid donor, i.e., it is made in a prokaryote or native to the prokaryote.
  • examples of prokaryotic lipid donors include an undecaprenyl-phosphate and an undecaprenyl phosphate-linked bacillosamine (Weerapana et al., "Investigating Bacterial N-Linked Glycosylation: Synthesis and Glycosyl Acceptor Activity of the Undecaprenyl Pyrophosphate-Linked Bacillosamine," J. Am. Chem. Soc. 127:13766-67 (2005), which is hereby incorporated by reference in its entirety).
  • the lipid donor is a eukaryotic lipid donor, i.e., it is made in a eukaryotic cell or native to the eukaryotic cell.
  • the glycan comprises an oligosaccharide or polysaccharide that is linked to a lipid donor molecule.
  • the composition of the glycan component varies in number and type of monosaccharide units that make up the oligosaccharide or polysaccharide chain.
  • the monosaccharide components of a glycan include, but are not limited to, one or more of glucose (Glc), galactose (Gal), mannose (Man), fucose (Fuc), N-acetylgalactosamine (GalNAc), N-acetylglucosamine (GlcNAc), glucuronic acid, xylose, sialic acid (e.g., N-acetyl-neuraminic acid (NeuAc)), 6-deoxy-talose, and rhamnose monosaccharides.
  • the glycan can be a prokaryotic, archaea, or eukaryotic glycan.
  • the glycan may comprise a completely unnatural glycan composition.
  • the glycan is a prokaryotic glycan that is produced by one or more prokaryotic glycosyltransferases.
  • the prokaryotic glycan is produced using a combination of prokaryotic and eukaryotic glycosyltransferases, but has a
  • the prokaryotic glycan is synthetically produced (Seeberger et al., Chemical and Enzymatic Synthesis of Glycans and Glycoconjugates, in ESSENTIALS OF GLYCOBIOLOGY (A. Varki et al. eds., 2009), which is hereby incorporated by reference in its entirety).
  • An exemplary prokaryotic glycan is a glycan produced by the glycosyltransferases of the C. jejuni, C. coli, C. lari, or C. upsaliensis Pgl gene clusters or a modified C. jejuni, C. coli, C. lari, or C. upsaliensis Pgl gene cluster.
  • Genes of the Pgl cluster include wlaA, galE, wlaB, pglH, pglI, pglJ, pglB, pglA, pglC, pglD, wlaJ, pglE, pglF, and pglG (Szymanski and Wren, "Protein Glycosylation in Bacterial Mucosal Pathogens," Nature Microbiol. 3:225-237 (2005), which is hereby incorporated by reference in its entirety).
  • a prokaryotic glycan typically comprises the diacetamido-trideoxy-sugar, bacillosamine (Bac; 2,4-diacetamido-2,4,6-trideoxyglucose).
  • a suitable prokaryotic glycan of this and all aspects of the present invention is a heptasaccharide comprising glucose, N-acetylgalactosamine, and bacillosamine, i.e., GlcGalNAc5Bac.
  • the glycan of this and all aspects of the present invention can be recombinantly produced. For example, a modified or unmodified C. jejuni pgl gene cluster encoding the enzymes that carry out the biosynthesis of the GlcGalNAc5Bac heptasaccharide and other glycan structures can be isolated and transferred to a suitable host cell for production of a lipid-linked glycan (see also Wacker et al., "N-Linked Glycosylation in Campylobacter jejuni and its Functional Transfer into E. coli," Science 298(5599):1790-93 (2002), which is hereby incorporated by reference in its entirety).
  • Pgl gene clusters from other Campylobacter species, e.g., C. coli, C. lari, and C. upsaliensis, are also suitable for recombinant production of glycans for use in all aspects of the present invention (Szymanski and Wren, "Protein Glycosylation in Bacterial Mucosal Pathogens," Nature Microbiol. 3:225-237 (2005), which is hereby incorporated by reference in its entirety). Additionally, similar Pgl-like glycosylation gene loci have been identified in Wolinella succinogenes, Desulfovibrio desulfuricans, and D. vulgaris that are also suitable for recombinant production of glycans for the present invention (Baar et al., "Complete Genome Sequence and Analysis of Wolinella succinogenes," Proc. Natl. Acad.
  • the Pgl gene cluster may be modified to enhance lipid-linked glycan production, accumulation, and isolation in the host cell. For example, inactivation of the oligosaccharyltransferase component of the gene cluster (e.g., the pglB gene in the pgl gene cluster) is desirable to prevent transfer of the lipid-linked glycan to a glycoprotein target of the host cell. Additionally, in some embodiments of the present invention, it may be desirable to attenuate, disrupt, or delete competing glycan biosynthesis reactions of the host cell.
  • inactivation of host cell glycosyltransferase enzymes or other enzymes involved in the transfer or ligation of a glycan to acceptor moieties of the host cell may also be desirable.
  • deletion of the WaaL enzyme which transfers glycans from the undecaprenyl lipid carrier onto lipid A which in turn shuttles the oligosaccharides to the outer leaflet of the outer membrane, will ensure that the recombinantly produced lipid-linked glycans accumulate in the inner membrane.
  • E. coli host cell glycosylation-related enzymes that may be deleted, disrupted, or modified include, without limitation, wecA, wbbL, glcT, glf, gafT, wzx, wzy, and enzymes of the O16 antigen biosynthesis pathway.
  • the glycan is a eukaryotic glycan, i.e., a glycan produced by one or more eukaryotic
  • a eukaryotic glycan is produced by only eukaryotic glycosyltransferases.
  • the eukaryotic glycan is produced using a combination of both eukaryotic and prokaryotic glycosyltransferase enzymes, but mimics eukaryotic glycan structure.
  • the eukaryotic glycan is synthetically produced (Seeberger et al., Chemical and Enzymatic Synthesis of Glycans and Glycoconjugates, in ESSENTIALS OF GLYCOBIOLOGY (A. Varki et al. eds., 2009), which is hereby incorporated by reference in its entirety).
  • the eukaryotic glycan comprises a GlcNAc2 core.
  • the GlcNAc2 core may further comprise at least one mannose residue.
  • Suitable eukaryotic glycan structures may comprise, but are not limited to, Man1GlcNAc2, Man2GlcNAc2, and Man3GlcNAc2.
  • the eukaryotic lipid-linked glycan can be recombinantly produced by introducing one or more eukaryotic glycosyltransferase enzymes in a suitable host cell.
  • a eukaryotic glycosyltransferase as used herein refers to an enzyme that catalyzes the transfer of a sugar residue from a donor substrate, e.g., from an activated nucleotide sugar, to an acceptor substrate, e.g., a growing lipid-linked oligosaccharide chain.
  • Suitable glycosyltransferase enzymes that can be utilized in host cells to facilitate the recombinant production of a eukaryotic lipid-linked glycan of the system include, without limitation, galactosyltransferases (e.g., β1,4-galactosyltransferase, β1,3-galactosyltransferase), fucosyltransferases, glucosyltransferases, N-acetylgalactosaminyltransferases (e.g., GalNAcT, GalNAc-T1, GalNAc-T2, GalNAc-T3), N-acetylglucosaminyltransferases (e.g., β-1,2-N-acetylglucosaminyltransferase I (GnTI-), GnT-II, GnT-III, GnT-IV, GnT-V, GnT
  • glycosyltransferase enzymes have been extensively studied in a variety of eukaryotic systems. Accordingly, the nucleic acid and amino acid sequences of these enzymes are known and readily available to one of skill in the art. Additionally, many of these enzymes are commercially available (e.g., Sigma-Aldrich, St. Louis, MO).
  • Suitable host cells for the production of a prokaryotic or eukaryotic lipid-linked glycan include both prokaryotic host cells and eukaryotic cells.
  • An exemplary list of suitable host cells is provided supra.
  • the nucleotide sequences of the eukaryotic glycosyltransferases can be codon optimized to overcome limitations associated with the codon usage bias between E. coli (and other bacteria) and higher organisms, such as yeast and mammalian cells. Codon usage bias refers to differences among organisms in the frequency of occurrence of codons in protein-coding DNA sequences (genes).
  • a codon is a series of three nucleotides (triplets) that encodes a specific amino acid residue in a polypeptide chain. Codon optimization can be achieved by making specific transversion nucleotide changes, i.e. a purine to pyrimidine or pyrimidine to purine nucleotide change, or transition nucleotide change, i.e. a purine to purine or pyrimidine to pyrimidine nucleotide change.
  • glycoprotein target includes any peptide, polypeptide, or protein that comprises one or more glycan acceptor amino acid residues.
  • glycan acceptor residues comprise an asparagine (N or Asn) to form an N-linked glycoprotein, or hydroxyl oxygen on the side chain of hydroxylysine, hydroxyproline, serine, threonine, or tyrosine to form an O-linked glycoprotein.
  • glycoprotein targets exist including, without limitation, structural molecules (e.g., collagens), lubricant and protective agents (e.g., mucins), transport proteins (e.g., transferrin), immunological proteins (immunoglobulins, histocompatibility antigens), hormones, enzymes, cell attachment recognition sites, receptors, protein folding chaperones, developmentally regulated proteins, and proteins involved in hemostasis and thrombosis.
  • Therapeutic proteins, such as antibodies are important glycoprotein targets of the system of the present invention.
  • the one or more oligosaccharide acceptor residues of the glycoprotein target may be an asparagine (N or Asn) residue.
  • the asparagine residue is positioned within a glycosylation consensus sequence comprising N-X-S/T (eukaryotic consensus sequence) or D/E-X1-N-X2-S/T (SEQ ID NO: 1) (prokaryotic consensus sequence), where D is aspartic acid, E is glutamic acid, X, X1 and X2 are any amino acid other than proline, N is asparagine, S is serine, and T is threonine.
  • the glycoprotein target according to this and all aspects of the present invention can be a purified protein, peptide, or polypeptide comprising the requisite glycan acceptor residues.
  • the glycoprotein target can be in the form of an isolated nucleic acid molecule encoding the glycoprotein target.
  • the system further includes reagents suitable for synthesizing the glycoprotein target from said nucleic acid molecule, i.e., translation reagents.
  • RNA molecules typically consist of extracts from rabbit reticulocytes, wheat germ, and E. coli.
  • the extracts contain all the macromolecule components necessary for translation of an exogenous RNA molecule, including, for example, ribosomes, tRNAs, aminoacyl-tRNA synthetases, initiation, elongation, and termination factors.
  • the other required components of the system include amino acids, energy sources (e.g., ATP, GTP), energy regenerating systems (creatine phosphate and creatine phosphokinase for eukaryote systems, and phosphoenol pyruvate and pyruvate kinase for prokaryote systems), and other cofactors (e.g., Mg2+, K+, etc.).
  • kits comprising an isolated oligosaccharyltransferase capable of transferring a glycan from a lipid carrier molecule to a glycoprotein target, and one or more isolated glycans, wherein each glycan is linked to a lipid carrier molecule.
  • the isolated oligosaccharyltransferase of the kit may be a purified protein or may be in the form of a nucleic acid encoding the oligosaccharyltransferase.
  • the nucleic acid molecule can be a DNA or RNA molecule, and it can be linearized (naked) or circularized (housed in an expression vector). Exemplary prokaryotic, archaea, and eukaryotic oligosaccharyltransferases are described supra.
  • the one or more glycans are linked to a lipid carrier molecule (e.g., an undecaprenol-pyrophosphate, an undecaprenyl pyrophosphate-linked bacillosamine, or a dolichylpyrophosphate).
  • the glycan may comprise a prokaryotic, archaea, eukaryotic, or completely unnatural synthetic glycan as also described supra.
  • Suitable prokaryotic core glycan structures comprise a heptasaccharide containing glucose, N-acetylgalactosamine, and optionally bacillosamine (e.g., GlcGalNAc5Bac).
  • Suitable eukaryotic glycan core structures comprise N-acetylglucosamine and mannose (e.g., Man1GlcNAc2, Man2GlcNAc2, and Man3GlcNAc2).
  • the one or more isolated glycans linked to a lipid carrier molecule of the kit are in an assembled and purified form.
  • the kit of the present invention comprises one or more nucleic acid molecules encoding one or more eukaryotic and/or prokaryotic glycosyltransferase enzymes, and host cells (eukaryotic or prokaryotic) that contain a polyisoprenyl pyrophosphate glycan carrier and are capable of expressing the one or more nucleic acid molecules.
  • the kit may further contain instructions for recombinantly producing and isolating the lipid-linked glycan in the host cells prior to use with the other kit components.
  • the kit of the present invention may further include in vitro or cell-free transcription and/or translation reagents for synthesizing the oligosaccharyltransferase and/or a glycoprotein, peptide or polypeptide of choice.
  • Another aspect of the present invention relates to a method for producing a glycosylated protein in a cell-free system.
  • This method involves providing an isolated oligosaccharyltransferase capable of transferring a glycan from a lipid carrier molecule to a glycoprotein target, providing one or more isolated glycans, wherein each glycan is linked to a lipid carrier molecule, and providing a glycoprotein target comprising one or more glycan acceptor amino acid residues.
  • This method further involves combining the oligosaccharyltransferase, one or more isolated glycans, and glycoprotein target to form a cell-free glycosylation reaction mixture, and subjecting the cell-free glycosylation reaction mixture to conditions effective for the oligosaccharyltransferase to transfer the glycan from the lipid carrier molecule to the one or more glycan acceptor residues of the glycoprotein target to produce a glycosylated protein.
  • oligosaccharyltransferase, isolated glycans linked to a lipid carrier molecule, and glycoprotein target are described in detail supra.
  • glycoprotein target translation may be coupled with glycosylation by providing reagents suitable for synthesizing a glycoprotein target from a nucleic acid molecule.
  • the nucleic acid molecule encoding the glycoprotein target, the translation reagents, oligosaccharyltransferase, and isolated glycans are all combined to form a translation-glycosylation reaction mixture.
  • the glycoprotein target is then synthesized from the target nucleic acid molecule prior to or concurrent with the glycosylation reaction.
  • C43(DE3) (Lucigen, Middleton, WI) was freshly transformed with plasmid pSN18 (Kowarik et al., "N-Linked Glycosylation of Folded Proteins by the Bacterial Oligosaccharyltransferase," Science 314:1148-1150 (2006), which is hereby incorporated by reference in its entirety), a modified pBAD expression plasmid encoding C. jejuni pglB with a C-terminal decahistidine affinity tag.
  • Cells were grown in 1.5 L of Terrific Broth supplemented with 100 μg/mL of ampicillin at 37°C.
  • Membranes containing PglB were resuspended in 25 mM Tris-HCl, pH 8.0, 250 mM NaCl, 10% glycerol (v/v) and 1 % DDM (w/v) (DDM, Anatrace, Affymetrix, Inc., Santa Clara, CA) and incubated for 2 h. The insoluble fraction was removed by ultracentrifugation at 100,000 x g for 1 h. All subsequent buffers contained DDM as the detergent.
  • the solubilized membranes were supplemented with 10 mM imidazole, loaded onto a Ni-NTA superflow affinity column (Qiagen, Valencia, CA) and washed with 60 mM imidazole before PglB was eluted with 200 mM imidazole.
  • the purified protein was then injected onto a Superdex 200 gel filtration column using AKTA-FPLC (GE Healthcare, Waukesha, WI). Eluate fractions were subjected to sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) and stained with Coomassie blue to identify the fractions containing PglB (Figure 2).
  • the protein was desalted with a PD-10 desalting column (GE Healthcare) into 20 mM Tris, pH 7.5, 100 mM NaCl, 5% glycerol (w/v) and 0.05% DDM (w/v) and concentrated to 5-10 mg/mL in an Amicon Centricon with a molecular mass cutoff of 100 kDa.
  • C43(DE3) cells carrying plasmid pSN18.1, which encodes an inactive copy of pglB subcloned from pACYCpglmut (see below), were used.
  • ClPglB was purified from BL21-Gold(DE3) cells (Stratagene, La Jolla, CA) carrying plasmid pSF2 as described elsewhere (Lizak et al., "X-ray Structure of a Bacterial Oligosaccharyltransferase," Nature 474:350-355 (2011), which is hereby incorporated by reference in its entirety).
  • the glycerol content in PglB samples was increased to 10% (w/v).
  • Periplasmic extracts were prepared as described previously (Schwarz et al., "Relaxed Acceptor Site Specificity of Bacterial Oligosaccharyltransferase in Vivo," Glycobiology 21:45-54 (2011), which is hereby incorporated by reference in its entirety), supplemented with imidazole to reach a final concentration of 10 mM, sterile filtered (0.22 μm), and purified by nickel affinity chromatography using a Ni-NTA superflow affinity column (Qiagen, Valencia, CA).
  • the sample was dried under nitrogen gas at 37°C, dissolved in 10 mM HEPES (4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid), pH 7.5, 1 mM MnCl2 and 0.1% DDM (w/v) and stored at -20°C. An identical procedure was followed to extract lipids from SCM6 cells carrying empty pACYC.
  • a 50 μL reaction was prepared using the S30 T7 High-Yield Expression System (Promega, Fitchburg, WI) or PURExpress (New England Biolabs, Ipswich, MA) according to the manufacturer's instructions.
  • a total of 1 μg of the following plasmids were added to each reaction: pET24b (Novagen, Madison, WI); pET24-AcrA encoding full-length C.
  • DDM was chosen for in vitro translation/glycosylation because it was previously observed to be well tolerated in an E. coli-derived CFE system (Klammt et al., "Evaluation of Detergents for the Soluble Expression of Alpha-Helical and Beta-Barrel-Type Integral Membrane Proteins by a Preparative Scale Individual Cell-Free Expression System," FEBS J. 272:6024-6038 (2005), which is hereby incorporated by reference in its entirety).
  • GlcGalNAc5Bac heptasaccharide where Bac is bacillosamine
  • UndPP membrane-anchored undecaprenylpyrophosphate
  • a modified version of this cluster that carried an inactivated pglB gene (Wacker et al., "N-Linked Glycosylation in Campylobacter jejuni and its Functional Transfer Into E. coli," Science 298:1790-1793 (2002), which is hereby incorporated by reference in its entirety) was transferred to E. coli SCM6 cells and used to prepare LLOs. SCM6 cells were chosen for several reasons.
  • these cells lack the WaaL enzyme that naturally transfers oligosaccharides (e.g., O-antigens, glycans) from the lipid carrier undecaprenyl onto lipid A, which in turn shuttles the oligosaccharides to the outer leaflet of the outer membrane (Feldman et al., "Engineering N-Linked Protein Glycosylation With Diverse O Antigen Lipopolysaccharide Structures in Escherichia coli," Proc. Natl. Acad. Sci. U.S.A. 102:3016-3021 (2005), which is hereby incorporated by reference in its entirety).
  • the desired lipid-linked glycans accumulate in the inner membrane.
  • scFv13-R4-GT glycoengineered single-chain variable fragment
  • GT C-terminal glycosylation tag
  • Example 4 Cell-Free Translation and Glycosylation of Target Glycoproteins
  • the glycoCFE and glycoPURE translation/glycosylation systems were constructed by combining the purified glycosylation components (minus the acceptor protein) with one of the cell-free translation systems.
  • the plasmid pET24(AcrA-cyt) that encodes AcrA without an N-terminal signal peptide was chosen to evaluate these systems because it gave rise to significant amounts of target protein in both translation systems with no detectable degradation.
  • a major advantage of the open prokaryote-based translation/glycosylation systems developed here is that the supply of purified glycosylation components as well as their substrates and cofactors (Lizak et al., "X-ray Structure of a Bacterial Oligosaccharyltransferase," Nature 474:350-355 (2011), which is hereby incorporated by reference in its entirety) can be provided at precise ratios. Likewise, the concentration of inhibitory substances such as proteases and glycosidases that catalyze the hydrolysis of glycosidic linkages can be reduced or eliminated entirely.
  • the in vitro systems permit the introduction of components that may be incompatible with in vivo systems such as certain LLOs that cannot be produced or flipped in vivo.
  • This level of controllability is unavailable in any previous translation/glycosylation system and is significant for several reasons.
  • it helps to avoid glycoprotein heterogeneity, which is particularly bothersome in fundamental studies to assess the contribution of specific glycan structures or in pharmaceutical glycoprotein production.
  • the glycoCFE and glycoPURE systems should allow the examination of factors that interact with or stimulate the
  • the bacterial OST can glycosylate locally flexible structures in folded proteins (Kowarik et al., "N-Linked Glycosylation of Folded Proteins by the Bacterial Oligosaccharyltransferase," Science 314:1148-1150 (2006), which is hereby incorporated by reference in its entirety) and also structured domains of some proteins, these systems should help to decipher the influence of protein structure on glycosylation efficiency. Also, since bacterial and eukaryotic glycosylation mechanisms display significant similarities, these bacterial systems could provide a simplified model framework for understanding the more complex eukaryotic process. Third, it allows for further customization of the system by reconstituting additional or alternative steps (both natural and unnatural) in the glycosylation pathway. For instance, the sequential activities of the
  • glycosyltransferases in the pgl pathway have been reconstituted in vitro (Glover et al., "In Vitro Assembly of the Undecaprenylpyrophosphate-Linked Heptasaccharide for Prokaryotic N-Linked Glycosylation," Proc. Nat'l. Acad. Sci. U.S.A. 102:14255-14259 (2005), which is hereby incorporated by reference in its entirety) and could easily be integrated with the translation/glycosylation reactions into a single integrated platform. While glycoengineered E.
  • the glycoCFE and glycoPURE systems should permit synthesis of hybrid natural/unnatural or even completely artificial glycans.
  • the addition of synthetic sugar-nucleotide donor substrates and/or mutant glycosyltransferases and OSTs having new specificities will enable the construction of a glycosylation system founded on a noncanonical glycan code.
  • the glycoCFE and glycoPURE systems are useful additions to the cell-free translation and glycobiology toolkits alike.
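
The glycosylation consensus sequences described in the list above can be checked against a candidate glycoprotein target computationally. The following is a minimal Python sketch, assuming the motifs are read as N-X-S/T (eukaryotic) and D/E-X1-N-X2-S/T (prokaryotic), with X, X1, and X2 being any residue other than proline. The function name `find_sequons` and the example peptide are hypothetical and are not taken from the patent.

```python
import re

# One-letter codes for the 19 standard amino acids other than proline (assumption:
# X, X1 and X2 in the consensus sequences mean "any residue except P").
NOT_PRO = "[ACDEFGHIKLMNQRSTVWY]"

# Each entry: (compiled pattern, offset of the acceptor Asn inside the motif).
# Lookahead groups are used so that overlapping motifs are all reported.
SEQUONS = {
    "eukaryotic": (re.compile(rf"(?=(N{NOT_PRO}[ST]))"), 0),
    "prokaryotic": (re.compile(rf"(?=([DE]{NOT_PRO}N{NOT_PRO}[ST]))"), 2),
}


def find_sequons(protein: str, kind: str) -> list[tuple[int, str]]:
    """Return (1-based position of the acceptor Asn, matched motif) pairs."""
    pattern, asn_offset = SEQUONS[kind]
    sequence = protein.upper()
    return [
        (match.start() + asn_offset + 1, match.group(1))
        for match in pattern.finditer(sequence)
    ]


if __name__ == "__main__":
    peptide = "MKDQNATGGNPSNVT"  # hypothetical test peptide, not from the patent
    print(find_sequons(peptide, "eukaryotic"))
    print(find_sequons(peptide, "prokaryotic"))
```

In this sketch the N-P-S stretch of the example peptide is skipped because proline occupies the X position, and the prokaryotic search reports only the site preceded by an acidic residue two positions upstream of the asparagine.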

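The distinction drawn in the list above between transition and transversion changes during codon optimization can be illustrated with a short sketch. It assumes DNA codons written in the A/C/G/T alphabet; the function names and the example codon pairs are illustrative only and do not describe any particular optimization procedure used in the patent.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-base change as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    for base in (ref, alt):
        if base not in PURINES | PYRIMIDINES:
            raise ValueError(f"not a DNA base: {base!r}")
    if ref == alt:
        return "identical"
    # Transition: purine <-> purine or pyrimidine <-> pyrimidine.
    # Transversion: purine <-> pyrimidine.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def classify_codon_changes(original: str, optimized: str) -> list[tuple[int, str]]:
    """List (1-based position in codon, change type) for each base that differs."""
    if len(original) != 3 or len(optimized) != 3:
        raise ValueError("codons must be exactly three nucleotides long")
    return [
        (position, classify_substitution(a, b))
        for position, (a, b) in enumerate(zip(original, optimized), start=1)
        if a.upper() != b.upper()
    ]


if __name__ == "__main__":
    # CTA and CTG both encode leucine; AGG and CGT both encode arginine.
    print(classify_codon_changes("CTA", "CTG"))
    print(classify_codon_changes("AGG", "CGT"))
```

Both example pairs are synonymous codon swaps, so the encoded amino acid is unchanged while the nucleotide change is classified by type.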
Abstract

The present invention is directed to a cell-free system for producing a glycosylated protein. This system comprises an isolated oligosaccharyltransferase capable of transferring a glycan from a lipid carrier molecule to a glycoprotein target, one or more isolated glycans, where each glycan is linked to a lipid carrier molecule, and a glycoprotein target comprising one or more glycan acceptor amino acid residues or a nucleic acid molecule encoding said glycoprotein target. The present invention further relates to kits and methods for producing a glycosylated protein in this cell-free system.

Description

A PROKARYOTE-BASED CELL-FREE SYSTEM FOR THE
SYNTHESIS OF GLYCOPROTEINS
[0001] This application claims the benefit of U.S. Provisional Patent Application Serial No. 61/555,854, filed November 4, 2011, which is hereby incorporated by reference in its entirety.
FIELD OF THE INVENTION
[0002] The present invention relates to cell-free systems, kits, and methods for producing a glycosylated protein or peptide.
BACKGROUND OF THE INVENTION
[0003] Cell-free protein-synthesizing systems are emerging as an attractive alternative to conventional expression systems that rely on living cells (Katzen et al., "The Past, Present and Future of Cell-Free Protein Synthesis," Trends Biotechnol. 23:150-156 (2005)). This is because cell-free protein synthesis reactions have, over the past decade, reached the point where they: (i) can be completed in less than a day; (ii) use reagents whose costs have fallen; (iii) fold complex proteins by routinely forming disulfide bonds; and (iv) can be scaled to 100 L. Two main approaches have been used for in vitro transcription/translation: one is based on cell-free extracts (CFEs), often derived from Escherichia coli, rabbit reticulocytes or wheat germ, and the second is based on reconstituted protein synthesis from purified components (Shimizu et al., "Cell-Free Translation Reconstituted With Purified Components," Nat. Biotechnol. 19:751-755 (2001)). Because of their ability to co-activate multiple biochemical networks in a single integrated platform (Jewett et al., "An Integrated Cell-Free Metabolic Platform for Protein Production and Synthetic Biology," Mol. Syst. Biol. 4:220 (2008)), cell-free systems are increasingly used in many important biotechnology and synthetic biology applications (Ryabova et al., "Functional Antibody Production Using Cell-Free Translation: Effects of Protein Disulfide Isomerase and Chaperones," Nat. Biotechnol. 15:79-84 (1997); Noireaux et al., "Principles of Cell-Free Genetic Circuit Assembly," Proc. Nat'l. Acad. Sci. U.S.A. 100:12672-12677 (2003); Yang et al., "Rapid Expression of Vaccine Proteins for B-Cell Lymphoma in a Cell-Free System," Biotechnol. Bioeng. 89:503-511 (2005)).
[0004] The ability to accurately and efficiently glycosylate proteins in a cell-free system would have advantages for many areas of basic and applied research, especially given the importance of N-linked glycosylation in protein folding, quality control, sorting, degradation, secretion and activity (Helenius & Aebi, "Roles of N-Linked Glycans in the Endoplasmic Reticulum," Annu. Rev. Biochem. 73:1019-1049 (2004)). Unfortunately, the best characterized and most widely used cell-free translation systems based on E. coli are incapable of making glycoproteins because E. coli lacks glycosylation machinery. Likewise, rabbit reticulocyte and wheat germ CFE systems cannot perform this post-translational modification because they lack microsomes (Tarui et al., "A Novel Cell-Free Translation/Glycosylation System Prepared From Insect Cells," J. Biosci. Bioeng. 90:508-514 (2000)). This can be overcome by supplementing eukaryotic CFEs with microsomal vesicles (e.g., canine pancreas microsomes) (Lingappa et al., "Coupled Cell-Free Synthesis, Segregation, and Core Glycosylation of a Secretory Protein," Proc. Nat'l. Acad. Sci. U.S.A. 75:2338-2342 (1978); Rothblatt & Meyer, "Secretion in Yeast: Reconstitution of the Translocation and Glycosylation of Alpha-Factor and Invertase in a Homologous Cell-Free System," Cell 44:619-628 (1986)), but the resulting systems do not always faithfully process the target protein due to poor compatibility between some CFEs and microsomal vesicles (Rothblatt & Meyer, "Secretion in Yeast: Reconstitution of the Translocation and Glycosylation of Alpha-Factor and Invertase in a Homologous Cell-Free System," Cell 44:619-628 (1986); Moreno et al., "An mRNA-Dependent in Vitro Translation System from Trypanosoma brucei," Mol. Biochem. Parasitol. 46:265-274 (1991)). An alternative strategy for creating a cell-free translation system that can execute N-linked glycosylation is to prepare CFEs from specialized cells such as hybridomas (Mikami et al., "A Hybridoma-Based in Vitro Translation System That Efficiently Synthesizes Glycoproteins," J. Biotechnol. 127:65-78 (2006)), trypanosomes (Moreno et al., "An mRNA-Dependent in Vitro Translation System from Trypanosoma brucei," Mol. Biochem. Parasitol. 46:265-274 (1991)), insect cells (Tarui et al., "A Novel Cell-Free Translation/Glycosylation System Prepared From Insect Cells," J. Biosci. Bioeng. 90:508-514 (2000)) or mammalian cells (Shibutani et al., "Preparation of a Cell-Free Translation System From PC12 Cell," Neurochem. Res. 21:801-807 (1996)). However, these systems are technically difficult to prepare and typically result in inefficient glycosylation and low product yields. Moreover, in all the above systems, the glycosylation process is effectively a "black-box" and thus difficult to control.
[0005] The present invention is directed at overcoming these and other deficiencies in the art.
SUMMARY OF THE INVENTION
[0006] A first aspect of the present invention is directed to a cell-free system for producing a glycosylated protein. This system comprises an isolated oligosaccharyltransferase (OST) capable of transferring a glycan from a lipid carrier molecule to a glycoprotein target; one or more isolated glycans, wherein each glycan is linked to a lipid carrier molecule; and a glycoprotein target comprising one or more glycan acceptor amino acid residues, or a nucleic acid molecule encoding said glycoprotein target.
[0007] Another aspect of the present invention is directed to a kit comprising an isolated oligosaccharyltransferase capable of transferring a glycan from a lipid carrier molecule to a glycoprotein target, and one or more isolated glycans, wherein each glycan is linked to a lipid carrier molecule.
[0008] Another aspect of the present invention relates to a method for producing a glycosylated protein in a cell-free system. This method involves providing an isolated oligosaccharyltransferase capable of transferring a glycan from a lipid carrier molecule to a glycoprotein target, providing one or more isolated glycans, wherein each glycan is linked to a lipid carrier molecule, and providing a glycoprotein target comprising one or more glycan acceptor amino acid residues. This method further involves combining the oligosaccharyltransferase, one or more isolated glycans, and glycoprotein target to form a cell-free glycosylation reaction mixture, and subjecting the cell-free glycosylation reaction mixture to conditions effective for the oligosaccharyltransferase to transfer the glycan from the lipid carrier molecule to the one or more glycan acceptor residues of the glycoprotein target to produce a glycosylated protein.
[0009] To address the failure of other cell-free systems to accurately and efficiently glycosylate proteins, two novel cell-free translation/glycosylation systems, termed "glycoCFE" and "glycoPURE", were created as described herein. These systems combine existing in vitro translation systems with a reconstituted N-linked glycosylation pathway. Purified glycosylation components were derived from the protein glycosylation locus (pgl) present in the genome of the Gram-negative bacterium Campylobacter jejuni (Figure 1A). This gene cluster encodes an N-linked glycosylation system that is functionally similar to that of eukaryotes and archaea, involving an oligosaccharyltransferase that catalyzes the en bloc transfer of preassembled oligosaccharides from lipid carriers onto asparagine residues in a conserved motif [N-X1-S/T in eukaryotes and D/E-X1-N-X2-S/T (SEQ ID NO: 1) in bacteria (Kowarik et al., "Definition of the Bacterial N-Glycosylation Site Consensus Sequence," EMBO J. 25:1957-1966 (2006), which is hereby incorporated by reference in its entirety), where X1 and X2 are any residues except proline] within polypeptides (Figure 1B). C. jejuni glycosylation machinery is ideally suited for use in a cell-free translation/glycosylation system for the following reasons. First, E. coli transformed with the entire pgl gene cluster can perform N-linked protein glycosylation (Wacker et al., "N-Linked Glycosylation in Campylobacter jejuni and its Functional Transfer Into E. coli," Science 298:1790-1793 (2002), which is hereby incorporated by reference in its entirety), thereby providing a convenient host for producing the necessary components in a pure and active form. Since E. coli lacks native glycosylation machinery, the potential for contamination from background N- or O-linked systems is eliminated. Second, C. jejuni OST, named PglB (CjPglB), is a single-subunit enzyme that is active when solubilized in detergent (Lizak et al., "X-ray Structure of a Bacterial Oligosaccharyltransferase," Nature 474:350-355 (2011), which is hereby incorporated by reference in its entirety), and does not require any accessory components for its activity. Third, CjPglB can transfer sugars post-translationally to locally flexible structures in folded proteins (Kowarik et al., "N-Linked Glycosylation of Folded Proteins by the Bacterial Oligosaccharyltransferase," Science 314:1148-1150 (2006), which is hereby incorporated by reference in its entirety), indicating that protein glycosylation can be achieved without supplementing a functional membrane system (e.g., microsomes).
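As an illustration of the two acceptor motifs described above, the following Python sketch scans a one-letter amino acid sequence for candidate N-glycosylation sites. The function name `asn_positions`, the regular expressions, and the example peptide are illustrative assumptions and are not part of the disclosure. A motif match marks only a candidate site, since actual glycosylation also depends on factors such as the local flexibility of the folded protein.

```python
import re

# Bacterial (PglB) motif: D/E-X1-N-X2-S/T, where X1 and X2 are not proline.
# The lookahead lets overlapping candidate sites be reported as well.
BACTERIAL = re.compile(r"(?=([DE][^P]N[^P][ST]))")
# Eukaryotic motif: N-X-S/T, where X is not proline.
EUKARYOTIC = re.compile(r"(?=(N[^P][ST]))")


def asn_positions(seq: str, bacterial: bool = True) -> list[int]:
    """Return 1-based positions of the acceptor Asn in each motif match."""
    # In the bacterial motif the Asn is the third residue of the match;
    # in the eukaryotic motif it is the first.
    pattern, asn_offset = (BACTERIAL, 2) if bacterial else (EUKARYOTIC, 0)
    return [m.start() + asn_offset + 1 for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    peptide = "AADQNATGGNPSNGT"  # hypothetical test peptide
    print("bacterial sites:", asn_positions(peptide, bacterial=True))
    print("eukaryotic sites:", asn_positions(peptide, bacterial=False))
```

In this hypothetical peptide, only the Asn preceded by an acidic residue at the -2 position satisfies the extended bacterial motif, whereas the shorter eukaryotic motif also accepts a second Asn; the N-P-S stretch is rejected by both because of the proline.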
BRIEF DESCRIPTION OF THE DRAWINGS
[0010] Figures 1A-1B depict aspects of bacterial and eukaryotic N-linked glycosylation. Figure 1A shows the 17-kb pgl locus of C. jejuni encoding the N-linked glycosylation machinery that has been fully reconstituted in E. coli. Figure 1B shows a comparison of N-linked glycosylation in prokaryotes (left) and eukaryotes (right). In both systems, several glycosyltransferases synthesize the glycan by sequential addition of nucleotide-activated sugars on a lipid carrier on the cytoplasmic face of the inner membrane. Once assembled, a flippase transfers the lipid-linked glycans (also referred to as lipid-linked oligosaccharides or LLOs) across the membrane where the oligosaccharyltransferase catalyzes the transfer to Asn residues of periplasmic or endoplasmic reticulum substrate proteins. PglB is a single-subunit, integral membrane protein that is homologous to the catalytic subunit of the eukaryotic OST STT3 (note that PglB and STT3 complex are not drawn to scale). Whereas eukaryotes and archaea use an N-X-S/T acceptor sequence (where X is any amino acid but Pro), PglB requires an extended motif that includes an Asp or Glu residue in the -2 position (D/E-X1-N-X2-S/T (SEQ ID NO: 1), where X1 and X2 can be any amino acid except Pro). PglB can transfer sugars post-translationally to locally flexible structures in folded proteins.
[0011] Figures 2A-2B show the purification of bacterial OST. CjPglB was expressed in E. coli C43(DE3) cells and purified to near homogeneity. Elution fractions (as indicated) from gel filtration columns were examined by SDS-PAGE, and the Coomassie Blue-stained gel images (Figure 2B) are shown together with the elution profiles (Figure 2A). MW, molecular weight standard.
[0012] Figures 3A-3C show reconstituted glycosylation with defined components. In Figure 3A, the in vitro glycosylation assay was carried out using purified OST, extracted LLOs and purified acceptor proteins produced in E. coli. The immunoblots of Figure 3A show the detection of acceptor protein AcrA and scFv13-R4-GT (both anti-His) or glycans (anti-glycan). Reactions included 3 μg wild-type CjPglB, 5 (+) or 10 (++) μL of LLOs and 5 μg of acceptor protein. Controls included the omission of different components (-), inactivated PglB (mut) and LLOs from SCM6 cells with empty pACYC (+/-). Glycosylation yields a mobility shift from the unmodified (g0) to the glycosylated forms (g1 and g2). Figure 3B is the same assay as described in Figure 3A but with purified PglB from Campylobacter lari (ClPglB). Figure 3C shows immunoblots detecting AcrA following in vitro glycosylation using 3-month-old freeze-thawed components.
[0013] Figures 4A-4B demonstrate the cell-free translation/glycosylation of AcrA. Figure 4A is an immunoblot detecting different AcrA constructs (anti-AcrA) produced by in vitro translation using either E. coli CFEs or purified translation components (PURE). AcrA concentration was estimated by comparing band intensities to that of purified AcrA loaded in lane 1. Figure 4B is an immunoblot detecting ΔssAcrA expression (anti-AcrA) and glycosylation (anti-glycan). ΔssAcrA was produced by cell-free translation/glycosylation using either the CFE or the PURE systems that were primed with pET24(AcrA-cyt). Controls included the omission of different components (-) or LLOs from SCM6 cells with empty pACYC (+/-).
[0014] Figures 5A-5B depict the cell-free translation/glycosylation of scFv13-R4-GT. Figure 5A is an immunoblot detecting different scFv13-R4-GT constructs (anti-FLAG) produced by in vitro translation using either E. coli cell-free extracts (CFE) or purified translation components (PURE). Estimates of the scFv13-R4-GT concentration were determined by comparison of band intensities to that of the purified scFv13-R4-GT sample loaded in lane 1. Figure 5B is an immunoblot detecting scFv13-R4-GT expression (anti-FLAG) and glycosylation (anti-glycan). The scFv13-R4-GT protein was produced by cell-free translation/glycosylation using either the CFE or PURE systems that were primed with pET24-ssDsbAscFv13-R4-GT. Controls included omission of different components (-).
[0015] Figures 6A-6C show an amino acid sequence alignment of various Campylobacter PglB proteins that are suitable for use in the systems, kits, and methods of the present invention. The PglB amino acid sequences are derived from C. jejuni (SEQ ID NO: 2), C. lari (SEQ ID NO: 4), C. coli (SEQ ID NO: 6), and C. upsaliensis (SEQ ID NO: 8). An (*) indicates positions which have a single, fully conserved residue; (:) indicates conservation between groups of strongly similar properties; and (.) indicates conservation between groups of weakly similar properties. A PglB consensus sequence based on the alignment of Campylobacter PglB sequences is presented as SEQ ID NO: 10. Residues that are not fully conserved between the four Campylobacter sequences are depicted as X, where X can be any amino acid residue. Alternatively, X is selected from an amino acid residue at the corresponding position in one of the four depicted Campylobacter sequences.
[0016] Figures 7A-7E show an amino acid sequence alignment of various Pyrococcus OST STT3 subunit proteins that are suitable for use in the systems, kits, and methods of the present invention. The OST amino acid sequences are derived from P. furiosus (SEQ ID NO: 11), Pyrococcus sp. ST04 (SEQ ID NO: 13), Pyrococcus sp. (strain NA2) (SEQ ID NO: 14), P. horikoshii (SEQ ID NO: 15), P. abyssi (SEQ ID NO: 16), and P. yayanosii (SEQ ID NO: 17). An (*) indicates positions which have a single, fully conserved residue; (:) indicates conservation between groups of strongly similar properties; and (.) indicates conservation between groups of weakly similar properties. A STT3 consensus sequence based on the alignment of Pyrococcus STT3 sequences is presented as SEQ ID NO: 18. Residues that are not fully conserved between the six Pyrococcus sequences are depicted as X, where X can be any amino acid residue. Alternatively, X is selected from an amino acid residue at the corresponding position in one of the six depicted Pyrococcus sequences.
[0017] Figures 8A-8D show an amino acid sequence alignment of various Leishmania OST STT3 subunit related proteins that are suitable for use in the systems, kits, and methods of the present invention. The OST amino acid sequences are derived from L. major (SEQ ID NO: 19), L. donovani (SEQ ID NO: 21), L. infantum (SEQ ID NO: 22), L. mexicana (SEQ ID NO: 23), and L. braziliensis (SEQ ID NO: 24). An (*) indicates positions which have a single, fully conserved residue; (:) indicates conservation between groups of strongly similar properties; and (.) indicates conservation between groups of weakly similar properties. A STT3 consensus sequence based on the alignment of Leishmania STT3 sequences is presented as SEQ ID NO: 25. Residues that are not fully conserved between the five Leishmania sequences are depicted as X, where X can be any amino acid residue. Alternatively, X is selected from an amino acid residue at the corresponding position in one of the five depicted Leishmania sequences.
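The consensus convention used for Figures 6-8 (a residue is kept where it is fully conserved across the aligned sequences and written as X otherwise) can be sketched in Python as follows. The function name `consensus`, the treatment of gap columns as X, and the toy alignment are assumptions made for illustration only; the actual consensus sequences are those presented in the sequence listing.

```python
def consensus(aligned: list[str]) -> str:
    """Build a consensus from pre-aligned, equal-length one-letter sequences.

    A column contributes its residue only if every sequence carries the same
    residue there; any other column (including columns with gaps) becomes X.
    """
    if not aligned:
        raise ValueError("at least one sequence is required")
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    out = []
    for column in zip(*(s.upper() for s in aligned)):
        fully_conserved = len(set(column)) == 1 and column[0] != "-"
        out.append(column[0] if fully_conserved else "X")
    return "".join(out)


if __name__ == "__main__":
    # Hypothetical four-residue alignment, not taken from the figures.
    print(consensus(["MKLQ", "MRLQ", "MKLE"]))
```

Treating gap-containing columns as X is one possible design choice; an alignment tool may instead report such columns separately.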
[0018] Figures 9A-9J contain a listing of eukaryotic STT3 oligosaccharyltransferases that are suitable for use in the methods, systems, and kits of the present invention. The oligosaccharyltransferases are identified by UniProtKB Entry number (col. 1), which provides the amino acid sequence of the protein, UniProtKB Entry name (col. 2), protein name (col. 3), gene name (col. 4), organism (col. 5) and European Molecular Biology Laboratory (EMBL) database accession number (col. 6), which provides the encoding nucleotide sequence of the protein.
DETAILED DESCRIPTION OF THE INVENTION
[0019] A first aspect of the present invention is directed to a cell-free system for producing a glycosylated protein. This system comprises an isolated oligosaccharyltransferase capable of transferring a glycan from a lipid carrier molecule to a glycoprotein target; one or more isolated glycans, wherein each glycan is linked to a lipid carrier molecule; and a glycoprotein target comprising one or more glycan acceptor amino acid residues, or a nucleic acid molecule encoding said glycoprotein target.
[0020] In accordance with this and all aspects of the present invention, "oligosaccharyltransferase" ("OST") refers generally to a glycosylation enzyme or subunit of a glycosylation enzyme complex that is capable of transferring a glycan, i.e., an oligosaccharide or polysaccharide, from a donor substrate to a particular acceptor substrate. The donor substrate is typically a lipid carrier molecule linked to the glycan, and the acceptor substrate is typically a particular amino acid residue of a target glycoprotein. Suitable OSTs include those enzymes that transfer a glycan to an asparagine residue, i.e., an OST involved in N-linked glycosylation, and those enzymes that transfer a glycan or activated sugar moiety to a hydroxyl oxygen atom of an amino acid residue, i.e., an OST involved in O-linked glycosylation. An isolated OST of the present invention can be a single-subunit enzyme, a multi-subunit enzyme complex, or a single subunit derived from a multi-subunit enzyme complex. While a number of exemplary OST enzymes are described below, one of skill in the art readily appreciates that any oligosaccharyltransferase enzyme known in the art is suitable for use in the present invention.
[0021] In accordance with this and all aspects of the present invention, the OST can be a prokaryotic OST. By way of example only, PglB, a single, integral membrane OST protein derived from Campylobacter jejuni, is suitable for use in the present invention. PglB attaches a heptasaccharide to an asparagine residue of a glycoprotein target (Kowarik et al., "Definition of the Bacterial N-glycosylation Site Consensus Sequence," Embo J. 25:1957-66 (2006), which is hereby incorporated by reference in its entirety). The amino acid sequence of C. jejuni PglB (UniProtKB Accession No. Q9S4V7) is shown below as SEQ ID NO: 2:
Ile Ile Ser Asn Asp Gly Tyr Ala Phe Ala Glu Gly Ala Arg Asp Met
1 5 10 15
Ile Ala Gly Phe His Gln Pro Asn Asp Leu Ser Tyr Tyr Gly Ser Ser
20 25 30
Leu Ser Thr Leu Thr Tyr Trp Leu Tyr Lys Ile Thr Pro Phe Ser Phe
35 40 45
Glu Ser Ile Ile Leu Tyr Met Ser Thr Phe Leu Ser Ser Leu Val Val
50 55 60
Ile Pro Ile Ile Leu Leu Ala Asn Glu Tyr Lys Arg Pro Leu Met Gly
65 70 75 80
Phe Val Ala Ala Leu Leu Ala Ser Ile Ala Asn Ser Tyr Tyr Asn Arg
85 90 95
Thr Met Ser Gly Tyr Tyr Asp Thr Asp Met Leu Val Ile Val Leu Pro
100 105 110
Met Phe Ile Leu Phe Phe Met Val Arg Met Ile Leu Lys Lys Asp Phe
115 120 125
Phe Ser Leu Ile Ala Leu Pro Leu Phe Ile Gly Ile Tyr Leu Trp Trp
130 135 140
Tyr Pro Ser Ser Tyr Thr Leu Asn Val Ala Leu Ile Gly Leu Phe Leu
145 150 155 160
Ile Tyr Thr Leu Ile Phe His Arg Lys Glu Lys Ile Phe Tyr Ile Ala
165 170 175
Val Ile Leu Ser Ser Leu Thr Leu Ser Asn Ile Ala Trp Phe Tyr Gln
180 185 190
Ser Thr Ile Ile Val Ile Leu Phe Ala Leu Phe Ala Leu Glu Gln Lys
195 200 205
Arg Leu Asn Phe Val Ile Ile Gly Ile Leu Ala Ser Val Thr Leu Ile
210 215 220
Phe Leu Ile Leu Ser Gly Gly Val Asp Pro Ile Leu Tyr Gln Leu Lys
225 230 235 240
Phe Tyr Ile Phe Arg Ser Asp Glu Ser Ala Asn Leu Thr Gln Gly Phe
245 250 255
Met Tyr Phe Asn Val Asn Gln Thr Ile Gln Glu Val Glu Asn Val Asp
260 265 270
Leu Ser Glu Phe Met Arg Arg Ile Ser Gly Ser Glu Ile Val Phe Leu
275 280 285
Phe Ser Leu Phe Gly Phe Val Trp Leu Leu Arg Lys His Lys Ser Met
290 295 300
Ile Met Ala Leu Pro Ile Leu Val Leu Gly Phe Leu Ala Leu Lys Gly
305 310 315 320
Gly Leu Arg Phe Thr Ile Tyr Ser Val Pro Val Met Ala Leu Gly Phe
325 330 335
Gly Phe Leu Leu Ser Glu Phe Lys Ala Ile Leu Val Lys Lys Tyr
340 345 350
Gln Leu Thr Ser Asn Val Cys Ile Val Phe Ala Thr Ile Leu Thr Leu
355 360 365
Ala Pro Val Phe Ile His Ile Tyr Asn Tyr Lys Ala Pro Thr Val Phe
370 375 380
Ser Gln Asn Glu Ala Ser Leu Leu Asn Gln Leu Lys Asn Ile Ala Asn
385 390 395 400
Arg Glu Asp Tyr Val Val Thr Trp Trp Asp Tyr Gly Tyr Pro Val Arg
405 410 415
Tyr Tyr Ser Asp Val Lys Thr Leu Val Asp Gly Gly Lys His Leu Gly
420 425 430
Lys Asp Asn Phe Phe Pro Ser Phe Ala Leu Ser Lys Asp Glu Gln Ala
435 440 445
Ala Ala Asn Met Ala Arg Leu Ser Val Glu Tyr Thr Glu Lys Ser Phe
450 455 460
Tyr Ala Pro Gln Asn Asp Ile Leu Lys Thr Asp Ile Leu Gln Ala Met
465 470 475 480
Met Lys Asp Tyr Asn Gln Ser Asn Val Asp Leu Phe Leu Ala Ser Leu
485 490 495
Ser Lys Pro Asp Phe Lys Ile Asp Thr Pro Lys Thr Arg Asp Ile Tyr
500 505 510
Leu Tyr Met Pro Ala Arg Met Ser Leu Ile Phe Ser Thr Val Ala Ser
515 520 525
Phe Ser Phe Ile Asn Leu Asp Thr Gly Val Leu Asp Lys Pro Phe Thr
530 535 540
Phe Ser Thr Ala Tyr Pro Leu Asp Val Lys Asn Gly Glu Ile Tyr Leu
545 550 555 560
Ser Asn Gly Val Val Leu Ser Asp Asp Phe Arg Ser Phe Lys Ile Gly
565 570 575
Asp Asn Val Val Ser Val Asn Ser Ile Val Glu Ile Asn Ser Ile Lys
580 585 590
Gln Gly Glu Tyr Lys Ile Thr Pro Ile Asp Asp Lys Ala Gln Phe Tyr
595 600 605
Ile Phe Tyr Leu Lys Asp Ser Ala Ile Pro Tyr Ala Gln Phe Ile Leu
610 615 620
Met Asp Lys Thr Met Phe Asn Ser Ala Tyr Val Gln Met Phe Phe Leu
625 630 635 640
Gly Asn Tyr Asp Lys Asn Leu Phe Asp Leu Val Ile Asn Ser Arg Asp
645 650 655
Ala Lys Val Phe Lys Leu Lys Ile
660
[0022] The nucleic acid sequence encoding the amino acid sequence of SEQ ID NO: 2 is provided below as SEQ ID NO: 3 (EMBL Nucleotide Sequence Database No. AAD51383):
atcatttcaa acgatggtta tgcttttgct gagggtgcaa gagatatgat agcaggtttt 60 catcagccta atgatttgag ttattatgga tcttctttat ctacgcttac ttattggctt 120 tataaaatca cacctttttc tttcgaaagt attattttat atatgagtac ttttttatct 180 tctttggtgg tgattcctat tattttacta gctaatgaat acaaacgtcc tttaatgggc 240 tttgtagctg ctcttttagc aagtatagca aacagttatt ataatcgcac tatgagtggg 300 tattatgata cggatatgct ggtaattgtt ttacctatgt ttattttatt ttttatggta 360 agaatgattt taaaaaaaga ctttttttca ttgattgcct taccgttatt tataggaatt 420 tatctttggt ggtatccttc aagctatact ttaaatgtag ctttaattgg acttttttta 480 atttatacac ttatttttca tagaaaagaa aagatttttt atatagctgt gattttgtct 540 tctcttactc tttcaaatat agcatggttt tatcaaagta ctattatagt aatacttttt 600 gctttatttg ctttagagca aaaacgctta aattttgtaa ttataggaat tttagctagt 660 gtaactttga tatttttgat tttaagtgga ggggttgatc ctatacttta tcagcttaaa 720 ttttatattt ttagaagtga tgaaagtgcg aatttaacgc agggttttat gtattttaat 780 gtcaatcaaa ccatacaaga agttgaaaat gtagatctta gcgaatttat gcgaagaatt 840 agtggtagtg aaattgtttt tttgttttct ttgtttggtt ttgtatggct tttgagaaaa 900 cataaaagta tgattatggc tttacctata ttggtgcttg ggtttttagc cttaaaaggg 960 gggcttagat ttaccattta ttctgtacct gtaatggcct taggatttgg ttttttattg 1020 agcgagttta aggctatatt ggttaaaaaa tatagccaat taacttcaaa tgtttgtatt 1080 gtttttgcaa ctattttgac tttagctcca gtatttatcc atatttacaa ctataaagca 1140 ccaacagttt tttctcaaaa tgaagcatca ttattaaatc aattaaaaaa tatagccaat 1200 agagaagatt atgtggtaac ttggtgggat tatggttatc ctgtgcgtta ttatagtgat 1260 gtgaaaactt tagtagatgg tggaaagcat ttaggtaagg ataatttttt cccttctttt 1320 gctttaagca aagatgaaca agctgcagct aatatggcaa gacttagtgt agaatataca 1380 gaaaaaagct tttatgctcc gcaaaatgat attttaaaaa cagacatttt acaagccatg 1440 atgaaagatt ataatcaaag caatgtggat ttgtttctag cttcattatc aaaacctgat 1500 tttaaaatcg atacaccaaa aactcgtgat atttatcttt atatgcccgc tagaatgtct 1560 ttgatttttt ctacggtggc tagtttttct tttattaatt tagatacagg agttttggat 1620 aaacctttta cctttagcac agcttatcca cttgatgtta aaaatggaga aatttatctt 1680 agcaacggag tggttttaag 
cgatgatttt agaagtttta aaataggtga taatgtggtt 1740 tctgtaaata gtatcgtaga gattaattct attaaacaag gtgaatacaa aatcactcca 1800 attgatgata aggctcagtt ttatattttt tatttaaagg atagtgctat tccttacgca 1860 caatttattt taatggataa aaccatgttt aatagtgctt atgtgcaaat gtttttttta 1920 ggaaattatg ataagaattt atttgacttg gtgattaatt ctagagatgc taaggttttt 1980 aaacttaaaa tttaa 1995
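The correspondence between a nucleotide listing such as SEQ ID NO: 3 and its encoded amino acid sequence can be checked by translating the listing codon by codon. The Python sketch below is a minimal illustration that assumes the standard genetic code and a reading frame starting at the first nucleotide; the names `CODON_TABLE` and `translate` are hypothetical. Spaces and the running position numbers in the listing are ignored.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (TTT, TTC, TTA, TTG, TCT, ...); "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(listing: str) -> str:
    """Translate a nucleotide listing into one-letter amino acids.

    Everything except a/c/g/t (spaces, line breaks, position numbers) is
    removed first; translation stops at the first stop codon.
    """
    dna = re.sub(r"[^acgtACGT]", "", listing).upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    # First 30 nucleotides of the listing above, including its spacing.
    print(translate("atcatttcaa acgatggtta tgcttttgct"))
```

Applied to the opening of the listing, the sketch yields Ile-Ile-Ser-Asn-Asp-Gly-Tyr-Ala-Phe-Ala in one-letter form, which matches the start of SEQ ID NO: 2.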
[0023] The amino acid and nucleotide sequences of SEQ ID NOs: 2 and 3, respectively, are representative C. jejuni PglB protein and nucleic acid sequences. It is appreciated by one of skill in the art that there are at least 70 subspecies of C. jejuni having a PglB protein that may vary in sequence identity from the amino acid sequence of SEQ ID NO: 2, but retain the same function. Accordingly, homologous PglB protein sequences from other subspecies and strains of C. jejuni that are characterized by an amino acid sequence identity of at least about 70 percent, more preferably at least about 75 percent or 80 percent, most preferably at least about 85 percent or 90 percent or 95 percent as compared to the C. jejuni amino acid sequence of SEQ ID NO: 2 are also suitable for use in the present invention. The amino acid sequences of related C. jejuni PglB proteins and nucleotide sequences encoding the same are known and readily available to one of skill in the art.
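The identity thresholds recited above can be illustrated with a short Python sketch. Definitions of percent identity differ between alignment tools; the version below is one assumed convention, in which two sequences are already aligned (gaps written as "-") and identity is the number of matching residue columns divided by the number of alignment columns. The names `percent_identity`, `THRESHOLDS` and `highest_threshold_met` are hypothetical, and the fixed cut-offs only approximate the "at least about" language of the text.

```python
from typing import Optional

# Thresholds recited in the text, in percent.
THRESHOLDS = (70, 75, 80, 85, 90, 95)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore columns in which both sequences carry a gap.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def highest_threshold_met(identity: float) -> Optional[int]:
    """Return the highest recited threshold reached, or None if below 70."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


if __name__ == "__main__":
    # Hypothetical aligned fragments, not taken from the sequence listing.
    pid = percent_identity("MKLQ-N", "MRLQAN")
    print(round(pid, 1), highest_threshold_met(pid))
```

A different denominator (for example, the length of the shorter sequence) would give slightly different values, so the convention should be stated whenever identities are compared.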
[0024] OSTs from other species of Campylobacter that share sequence identity with C. jejuni PglB and/or are capable of transferring an oligosaccharide moiety to a target glycoprotein are also suitable for use in this and all aspects of the present invention. For example, as demonstrated herein, PglB from Campylobacter lari (ClPglB), which shares only 56% sequence identity with the amino acid sequence of C. jejuni PglB (Schwarz et al., "Relaxed Acceptor Site Specificity of Bacterial Oligosaccharyltransferase in Vivo," Glycobiology 21:45-54 (2011), which is hereby incorporated by reference in its entirety), is capable of transferring a glycan to an acceptor amino acid residue (i.e., asparagine) of a target glycoprotein in the cell-free glycosylation system of the present invention. The amino acid sequence of C. lari PglB (UniProtKB Accession No. B9KDD4) is shown below as SEQ ID NO: 4:
Met Lys Leu Gln Gln Asn Phe Thr Asp Asn Asn Ser Ile Lys Tyr Thr
1 5 10 15
Cys Ile Leu Ile Leu Ile Ala Phe Ala Phe Ser Val Leu Cys Arg Leu
20 25 30
Tyr Trp Val Ala Trp Ala Ser Glu Phe Tyr Glu Phe Phe Phe Asn Asp
35 40 45
Gln Leu Met Ile Thr Thr Asn Asp Gly Tyr Ala Phe Ala Glu Gly Ala
50 55 60
Arg Asp Met Ile Ala Gly Phe His Gln Pro Asn Asp Leu Ser Tyr Phe
65 70 75 80
Gly Ser Ser Leu Ser Thr Leu Thr Tyr Trp Leu Tyr Ser Ile Leu Pro
85 90 95
Phe Ser Phe Glu Ser Ile Ile Leu Tyr Met Ser Ala Phe Phe Ala Ser
100 105 110
Leu Ile Val Val Pro Ile Ile Leu Ile Ala Arg Glu Tyr Lys Leu Thr
115 120 125
Thr Tyr Gly Phe Ile Ala Ala Leu Leu Gly Ser Ile Ala Asn Ser Tyr
130 135 140
Tyr Asn Arg Thr Met Ser Gly Tyr Tyr Asp Thr Asp Met Leu Val Leu
145 150 155 160
Val Leu Pro Met Leu Ile Leu Leu Thr Phe Ile Arg Leu Thr Ile Asn
165 170 175
Lys Asp Ile Phe Thr Leu Leu Leu Ser Pro Val Phe Ile Met Ile Tyr
180 185 190
Leu Trp Trp Tyr Pro Ser Ser Tyr Ser Leu Asn Phe Ala Met Ile Gly
195 200 205
Leu Phe Gly Leu Tyr Thr Leu Val Phe His Arg Lys Glu Lys Ile Phe
210 215 220
Tyr Leu Thr Ile Ala Leu Met Ile Ile Ala Leu Ser Met Leu Ala Trp
225 230 235 240
Gln Tyr Lys Leu Ala Leu Ile Val Leu Leu Phe Ala Ile Phe Ala Phe
245 250 255
Lys Glu Glu Lys Ile Asn Phe Tyr Met Ile Trp Ala Leu Ile Phe Ile
260 265 270
Ser Ile Leu Ile Leu His Leu Ser Gly Gly Leu Asp Pro Val Leu Tyr
275 280 285
Gln Leu Lys Phe Tyr Val Phe Lys Ala Ser Asp Val Gln Asn Leu Lys
290 295 300
Asp Ala Ala Phe Met Tyr Phe Asn Val Asn Glu Thr Ile Met Glu Val
305 310 315 320
Asn Thr Ile Asp Pro Glu Val Phe Met Gln Arg Ile Ser Ser Ser Val
325 330 335
Leu Val Phe Ile Leu Ser Phe Ile Gly Phe Ile Leu Leu Cys Lys Asp
340 345 350
His Lys Ser Met Leu Leu Ala Leu Pro Met Leu Ala Leu Gly Phe Met
355 360 365
Ala Leu Arg Ala Gly Leu Arg Phe Thr Ile Tyr Ala Val Pro Val Met
370 375 380
Ala Leu Gly Phe Gly Tyr Phe Leu Tyr Ala Phe Phe Asn Phe Leu Glu
385 390 395 400
Lys Lys Gln Ile Lys Leu Ser Leu Arg Asn Lys Asn Ile Leu Leu Ile
405 410 415
Leu Ile Ala Phe Phe Ser Ile Ser Pro Ala Leu Met His Ile Tyr Tyr
420 425 430
Tyr Lys Ser Ser Thr Val Phe Thr Ser Tyr Glu Ala Ser Ile Leu Asn
435 440 445
Asp Leu Lys Asn Lys Ala Gln Arg Glu Asp Tyr Val Val Ala Trp Trp
450 455 460
Asp Tyr Gly Tyr Pro Ile Arg Tyr Tyr Ser Asp Val Lys Thr Leu Ile
465 470 475 480
Asp Gly Gly Lys His Leu Gly Lys Asp Asn Phe Phe Ser Ser Phe Val
485 490 495
Leu Ser Lys Glu Gln Ile Pro Ala Ala Asn Met Ala Arg Leu Ser Val
500 505 510
Glu Tyr Thr Glu Lys Ser Phe Lys Glu Asn Tyr Pro Asp Val Leu Lys
515 520 525
Ala Met Val Lys Asp Tyr Asn Lys Thr Ser Ala Lys Asp Phe Leu Glu
530 535 540
Ser Leu Asn Asp Lys Asp Phe Lys Phe Asp Thr Asn Lys Thr Arg Asp
545 550 555 560
Val Tyr Ile Tyr Met Pro Tyr Arg Met Leu Arg Ile Met Pro Val Val
565 570 575
Ala Gln Phe Ala Asn Thr Asn Pro Asp Asn Gly Glu Gln Glu Lys Ser
580 585 590
Leu Phe Phe Ser Gln Ala Asn Ala Ile Ala Gln Asp Lys Thr Thr Gly
595 600 605
Ser Val Met Leu Asp Asn Gly Val Glu Ile Ile Asn Asp Phe Arg Ala
610 615 620
Leu Lys Val Glu Gly Ala Ser Ile Pro Leu Lys Ala Phe Val Asp Ile
625 630 635 640
Glu Ser Ile Thr Asn Gly Lys Phe Tyr Tyr Asn Glu Ile Asp Ser Lys
645 650 655
Ala Gln Ile Leu Leu Phe Leu Arg Glu Tyr Lys Phe Val
665 670
Leu Asp Glu Ser Leu Tyr Asn Ser Ser Tyr Ile Gln Met Phe Leu Leu
675 680 685
Asn Gln Tyr Asp Gln Asp Leu Phe Glu Gln Ile Thr Asn Asp Thr Arg
690 695 700
Ala Lys Ile Tyr Arg Leu Lys
705 710
[0025] Amino acid sequences sharing at least about 70 percent, more preferably at least about 75 percent or 80 percent, most preferably at least about 85 percent or 90 percent or 95 percent as compared to the C. lari amino acid sequence of SEQ ID NO: 4 are also suitable for use in the present invention. The nucleic acid sequence encoding the amino acid sequence of SEQ ID NO: 4 is provided below as SEQ ID NO: 5 (EMBL Nucleotide Sequence Database No. ACM64573.1 ): atgaaactac aacaaaattt cacggataat aattctataa aatatacctg tattttaatc 60 cttatagcct ttgcttttag tgttttgtgt agattatact gggtagcttg ggcaagtgag 120 ttttatgagt ttttctttaa tgatcaactc atgattacta ctaatgatgg ctatgctttt 180 gcagaaggtg caagagatat gatagcaggt tttcatcaac ctaatgactt atcttatttt 240 ggaagctcac tttctacttt gacttattgg ctttatagta ttttgccttt tagctttgaa 300 agtattattt tatatatgag tgcttttttt gcttctttga ttgttgtgcc tattatatta 360 atcgcaagag agtataaact cactacctat ggctttatag cagctttact tggaagcatt 420 gcaaatagtt attataaccg cactatgagt gggtattacg atacagatat gctagtgtta 480 gttttaccaa tgcttatttt gcttaccttt atacgcttaa ctattaataa agacattttc 540 accctacttt taagtccggt ttttatcatg atttatttgt ggtggtatcc atcaagttat 600 tctttaaatt ttgctatgat aggacttttt ggactttata ctttagtatt tcatagaaaa 660 gaaaagattt tttatctaac tattgctttg atgatcatag ctttaagtat gctagcatgg 720 caatataagc ttgctttgat tgtattatta tttgctattt ttgcttttaa agaagaaaaa 780 atcaattttt atatgatttg ggctttgatt tttattagca ttttgatatt gcatttaagt 840 ggcggcttag atcctgtttt ataccaactt aaattttatg tatttaaagc ttctgatgtg 900 caaaatttaa aagatgctgc ctttatgtat tttaatgtca atgaaaccat tatggaagta 960 aatactatcg atcctgaagt atttatgcaa agaattagct ctagtgtttt agtatttatc 1020 ctttctttta taggttttat cttactttgc aaagatcaca aaagcatgct tttggctcta 1080 cctatgcttg cactaggttt tatggcttta agagctggac ttagatttac catttatgca 1140 gttcctgtga tggctttggg ttttgggtat tttttatatg cattttttaa ttttttagaa 1200 aaaaaacaaa tcaaacttag cctaagaaat aaaaatatct tacttatact cattgcattt 1260 tttagtataa gccctgcttt gatgcatatt tattattata aatcctctac 
tgtttttact 1320 tcttatgaag ctagtatttt aaatgattta aaaaataaag ctcaaagaga agattatgtt 1380 gttgcttggt gggattatgg ttatccaata cgctattata gcgatgtaaa aaccttaatc 1440 gatggtggaa aacacctagg aaaagataat tttttctcat cttttgtctt aagcaaagaa 1500 caaattccag cagccaatat ggcaagactt agcgtagaat acactgaaaa atctttcaaa 1560 gaaaactatc ctgatgtttt aaaagctatg gttaaagatt ataataaaac aagtgctaaa 1620 gattttttag aaagtttaaa tgataaagat tttaaatttg ataccaataa aactagagat 1680 gtatacattt atatgcctta tagaatgttg cgtatcatgc ctgtggtggc acaatttgca 1740 aatacaaatc ctgataatgg agagcaagaa aaaagtttat ttttctccca agctaatgcc 1800 atagctcaag ataaaaccac aggttctgtt atgcttgata atggagtaga aattattaat 1860 gattttagag ccttaaaagt agaaggtgca agcatacctt taaaagcttt tgtggatata 1920 gaatccatta ctaatggcaa attttattac aatgaaattg attcaaaagc tcaaatttat 1980 ttgctctttt taagagaata taaaagcttt gtgattttag atgaaagtct ttataatagt 2040 tcttatatac aaatgttttt gttaaatcaa tacgatcaag atttatttga acaaattact 2100 aatgatacaa gagcaaaaat ttataggcta aaaagatga 2139
[0026] Another N-linked OST from Campylobacter that is suitable for use in this and all aspects of the present invention is PglB from C. coli. The amino acid sequence of PglB from C. coli (UniProtKB Accession No. H7WI6), which is 81% identical to that of C. jejuni, is provided below as SEQ ID NO: 6:
Met Leu Lys Lys Glu Tyr Phe Lys Asn Pro Thr Phe Ile Leu Leu Ala
1 5 10 15
Phe Ile Ile Leu Ala Tyr Val Phe Ser Val Leu Cys Arg Phe Tyr Trp
20 25 30
Ile Phe Trp Ala Ser Glu Phe Asn Glu Tyr Phe Phe Asn Asn Glu Leu
35 40 45
Met Ile Ile Ser Asn Asp Gly Tyr Ala Phe Ala Glu Gly Ala Arg Asp
50 55 60
Met Ile Ala Gly Phe His Gln Pro Asn Asp Leu Ser Tyr Tyr Gly Ser
65 70 75 80
Ser Leu Ser Thr Leu Thr Tyr Trp Phe Tyr Lys Ile Thr Pro Phe Ser
85 90 95
Leu Glu Ser Ile Phe Ile Tyr Ile Ser Thr Phe Leu Ser Ser Leu Val
100 105 110
Val Ile Pro Leu Ile Leu Ile Ala Asn Glu Tyr Lys Arg Pro Leu Met
115 120 125
Gly Phe Val Ala Ala Leu Leu Ala Ser Ile Ala Asn Ser Tyr Tyr Asn
130 135 140
Arg Thr Met Ser Gly Tyr Tyr Asp Thr Asp Met Leu Val Ile Val Leu
145 150 155 160
Ala Met Met Ile Val Phe Phe Met Ile Arg Leu Ile Leu Lys Lys Asp
165 170 175
Leu Leu Ser Leu Ile Thr Leu Pro Leu Phe Val Gly Ile Tyr Leu Trp
180 185 190
Trp Tyr Pro Ser Ser Tyr Thr Leu Asn Val Ala Leu Leu Gly Leu Phe
195 200 205
Phe Ile Tyr Thr Leu Val Phe His Ile Lys Glu Lys Thr Leu Tyr Met
210 215 220
Ala Ile Ile Leu Ala Ser Ile Thr Leu Ser Asn Ile Ala Trp Phe Tyr
225 230 235 240
Gln Ser Ala Ile Ile Val Ile Leu Phe Ser Leu Phe Val Leu Gln Asn
245 250 255
Lys Arg Phe Ser Phe Ala Leu Leu Gly Ile Leu Gly Leu Ala Thr Leu
260 265 270
Val Phe Leu lie Leu Ser Gly Gly lie Asp Pro lie Leu Tyr Gin Leu
275 280 285
Lys Phe Tyr lie Phe Arg Ser Asp Glu Ser Ala Asn Leu Ala Gin Gly 290 295 300
Phe Met Tyr Phe Asn Val Asn Gin Thr lie Gin Glu Val Glu Ser lie 305 310 315 320
Asp Leu Ser lie Phe Met Gin Arg lie Ser Gly Ser Glu Leu Val Phe
325 330 335 Phe Val Ser Leu lie Gly Phe lie Phe Leu Val Arg Lys His Lys Ser
340 345 350
Met lie Leu Ala Leu Pro Met Leu Ala Leu Gly Phe Leu Ala Leu Lys
355 360 365
Ser Gly Leu Arg Phe Thr lie Tyr Ala Val Pro Val Leu Ala Leu Gly 370 375 380
Phe Gly Phe Leu Met Ser Leu Leu Gin Glu Arg Lys Gin Lys Asn Asn 385 390 395 400 Asn Thr Tyr Trp Trp Ala Asn lie Gly Val Phe lie Phe Thr Phe Leu 405 410 415
Ser Leu lie Pro Met Phe Tyr His lie Asn Asn Tyr Lys Ala Pro Thr
420 425 430
Val Phe Ser Gln Asn Glu Ala Thr Lys Leu Asp Glu Leu Lys Lys Ile
435 440 445
Ala Gln Arg Glu Asp Tyr Val Val Thr Trp Trp Asp Tyr Gly Tyr Pro
450 455 460
Ile Arg Tyr Tyr Ser Asp Val Lys Thr Leu Ala Asp Gly Gly Lys His
465 470 475 480
Leu Gly Lys Asp Asn Phe Phe Pro Ser Phe Val Leu Ser Lys Asp Gln
485 490 495
Val Ala Ala Ala Asn Met Ala Arg Leu Ser Val Glu Tyr Thr Glu Lys
500 505 510
Ser Phe Tyr Ala Pro Leu Asn Asp Ile Leu Lys Asn Asp Leu Leu Gln
515 520 525
Ala Met Met Lys Asp Tyr Asn Gln Asn Asn Val Asp Leu Phe Leu Ala
530 535 540
Ser Leu Ser Lys Pro Asp Phe Lys Ile Asn Thr Pro Lys Thr Arg Asp
545 550 555 560
Val Tyr Ile Tyr Met Pro Ala Arg Met Ser Leu Ile Phe Ser Thr Val
565 570 575
Ala Ser Phe Ser Phe Val Asp Leu Glu Thr Gly Glu Ile Asn Lys Pro
580 585 590
Phe Thr Phe Ser Ala Ala Tyr Pro Leu Asp Val Lys Asn Gly Glu Ile
595 600 605
Tyr Leu Ser Asn Gly Ile Ala Leu Ser Asp Asp Phe Arg Ser Phe Lys
610 615 620
Ile Asn Asn Ser Thr Ile Ser Val Asn Ser Ile Ile Glu Ile Asn Ser
625 630 635 640
Ile Lys Gln Gly Glu Tyr Lys Ile Thr Pro Ile Asp Asp Met Ala Gln
645 650 655
Phe Tyr Ile Phe Tyr Leu Lys Asp Ser Thr Ile Pro Tyr Ala Gln Phe
660 665 670
Ile Leu Met Asp Lys Thr Met Phe Asn Ser Ala Tyr Val Gln Met Phe
675 680 685
Phe Leu Gly Asn Tyr Asp Lys Asn Leu Tyr Asp Leu Val Ile Asn Ala
690 695 700
Arg Asp Ala Lys Val Phe Lys Leu Lys Ile
705 710 [0027] Amino acid sequences sharing at least about 70 percent, more preferably at least about 75 percent or 80 percent, most preferably at least about 85 percent or 90 percent or 95 percent as compared to the C. coli amino acid sequence SEQ ID NO: 6 are also suitable for use in the present invention. The nucleic acid sequence encoding the amino acid sequence of SEQ ID NO: 6 is provided below as SEQ ID NO: 7 (EMBL Nucleotide Sequence Database No. EIB 14175): atgttaaaaa aagaatactt taaaaaccca acttttattt tattggcttt tataatttta 60 gcgtatgtct ttagtgtttt atgtaggttt tattggattt tttgggcaag tgagtttaat 120 gaatattttt tcaataacga gcttatgatt atctcaaatg atggatatgc ttttgcagag 180 ggtgcaagag atatgatagc gggttttcat caacctaatg atttgagtta ttatggttct 240 tcgctttcaa cgctcacata ttggttttat aaaataactc ctttttcttt agaaagcatt 300 tttatatata tcagtacttt tttatcttct ttggtggtta tacctttgat tttgattgct 360 aatgaataca aacgcccttt aatggggttt gttgcagcat tgctagccag tatagctaat 420 agctattata atcgcacgat gagcggatat tatgatactg atatgcttgt tatagttctt 480 gcaatgatga tagttttctt tatgataagg ctgattttga aaaaagattt attatcttta 540 ataacactgc ctttgtttgt aggaatttat ctttggtggt atccatcaag ctatacttta 600 aatgttgctt tactaggact tttctttatt tataccttgg tttttcatat aaaagaaaaa 660 acgctttata tggctattat cctagcttct atcacacttt caaatatagc ttggttttat 720 caaagcgcca tcattgtcat actttttagt ctttttgttt tgcaaaataa gcgttttagc 780 tttgctttgc ttggaatttt aggtttggca actttggtat ttttgatact aagcggtgga 840 attgatccta tactctatca acttaaattt tatattttta gaagtgatga gagtgcaaat 900 ttggctcaag gttttatgta ttttaatgta aatcaaacca tacaagaggt agaaagtata 960 gatttaagta tttttatgca aaggattagc ggaagcgagc ttgtattttt tgtatcttta 1020 atcggcttta ttttccttgt tagaaaacat aaaagtatga ttttggcttt gccgatgtta 1080 gctttaggat ttttagcact taagagtgga cttcgtttta ctatttatgc agtacctgtt 1140 ttagcacttg gatttggttt tttaatgagt cttttgcaag aaagaaaaca aaaaaacaat 1200 aatacctatt ggtgggccaa tataggcgtt tttattttta cttttttaag tttaattcct 1260 atgttctatc atatcaacaa ttataaagca ccaactgttt tttctcaaaa 
tgaggctacg 1320 aaattagatg agcttaaaaa aattgcacaa agagaagatt atgtagtaac ttggtgggat 1380 tatggatatc ctattaggta ttacagcgat gttaaaactt tggctgatgg gggtaagcat 1440 ttaggcaagg ataatttttt cccatctttt gttctaagta aagatcaagt ggctgctgca 1500 aatatggcaa gacttagtgt agaatacaca gaaaaaagtt tttacgcccc tttaaatgat 1560 attttaaaaa atgatctttt acaagccatg atgaaagatt ataatcaaaa taatgtggat 1620 ttgtttttag cttcgctttc caagcctgat tttaaaatca atacgccaaa aacacgcgat 1680 gtgtatatct atatgccagc tagaatgtct ttgatttttt caactgtggc tagtttttct 1740 tttgtggatt tggagacagg tgagataaat aaacctttta cttttagtgc agcttatcca 1800 cttgatgtta aaaatggaga aatttatctt agcaatggta ttgcattaag tgatgatttt 1860 agaagtttta aaataaataa tagtactata tccgtaaata gtatcataga gattaattct 1920 atcaaacaag gtgaatataa aatcactcct attgatgata tggctcaatt ttatattttt 1980 tatcttaaag atagcaccat accttatgct cagtttattt taatggataa aactatgttt 2040 aatagtgctt atgtgcaaat gtttttcctt ggaaattatg ataaaaattt gtatgattta 2100 gtgattaatg ctagagatgc aaaagttttt aaactcaaaa tttaa 2145
[0028] Another Campylobacter OST that is suitable for use in this and all aspects of the present invention is PglB from C. upsaliensis. The amino acid sequence of PglB from C. upsaliensis (UniProtKB Accession No. E6LAJ2), which is 57% identical to that of C. jejuni, is provided below as SEQ ID NO: 8:
Met Lys Asn Glu Ala Val Lys Asn Ala Asn Leu Arg Leu Val Phe Phe
1 5 10 15
Ile Leu Leu Ala Phe Gly Phe Ser Val Leu Cys Arg Phe Tyr Trp Ile
20 25 30
Tyr Trp Ala Ser Asp Phe Asn Glu Tyr Phe Phe Asn Asn Gln Leu Met
35 40 45
Ile Ser Ser Asn Asp Gly Tyr Thr Phe Ala Glu Gly Ala Arg Asp Lys
50 55 60
Ile Ala Gly Phe His Gln Glu Asn Asp Leu Ser Phe Ile Asn Ser Ser
65 70 75 80
Leu Ser Ile Leu Thr Tyr Val Leu Tyr Lys Ile Thr Pro Phe Ser Phe
85 90 95
Glu Ser Ile Ile Leu Tyr Met Ser Val Phe Phe Ser Ser Leu Ile Val
100 105 110
Val Pro Leu Ile Leu Ile Ala Asn Glu Leu Lys Arg Pro Leu Met Gly
115 120 125
Leu Phe Ala Ala Phe Leu Ala Ser Ile Ala Lys Ser Tyr Tyr Asn Arg
130 135 140
Thr Met Ala Gly Tyr Tyr Asp Thr Asp Met Leu Ala Ile Val Leu Pro
145 150 155 160
Met Phe Ile Leu Tyr Phe Phe Ile Arg Leu Ile Leu Arg Lys Asp Asp
165 170 175
Phe Ser Leu Leu Ala Leu Pro Phe Phe Met Gly Leu Tyr Leu Trp Trp
180 185 190
Tyr Pro Ser Ser Tyr Thr Leu Asn Val Ala Phe Ile Ala Leu Phe Thr
195 200 205
Leu Tyr Val Leu Ile Tyr His Arg Lys Glu Arg Ser Phe Tyr Met Ala
210 215 220
Ala Leu Leu Cys Ala Ile Thr Leu Ser Asn Ile Ala Trp Phe Tyr Gln
225 230 235 240
Ser Ala Ile Ile Val Leu Leu Phe Ala Leu Phe Met Leu Lys Asn Ser
245 250 255
Phe Phe Asn Phe Lys Phe Ile Ala Leu Leu Ala Leu Gly Val Leu Val
260 265 270
Phe Leu Ala Leu Ser Gly Gly Ile Asp Pro Ile Leu Tyr Gln Leu Lys
275 280 285
Phe Tyr Leu Leu Arg Ser Asp Glu Ser Ala Ser Leu Ala Arg Gly Phe
290 295 300
Ala Tyr Phe Asn Val Asn Leu Thr Ile Gln Glu Val Glu Ser Ile Asp
305 310 315 320
Leu Ser Thr Phe Met Gln Arg Ile Ser Gly Ser Glu Leu Val Phe Leu
325 330 335
Leu Ser Leu Phe Gly Phe Leu Trp Leu Leu Lys Lys His Lys Val Met
340 345 350
Leu Leu Thr Leu Pro Met Leu Leu Leu Gly Phe Leu Ala Leu Arg Gly
355 360 365
Gly Leu Arg Phe Thr Ile Tyr Ala Val Pro Ile Met Ala Leu Gly Phe
370 375 380
Gly Phe Leu Ser Val Gln Ile Leu Ser Leu Ile Gln Lys Met Arg Pro
385 390 395 400
Leu Lys Glu Thr Arg Lys Leu Arg Ile Phe Phe Tyr Gly Ile Phe Pro
405 410 415
Leu Phe Val Leu Val Leu Gly Ala Tyr Phe Tyr Phe Ser Gln Ser Ala
420 425 430
Ile Tyr Glu Ser Met Gly Val Glu Phe Gln Lys Asn Phe Val Ser Phe
435 440 445
Phe Val Glu Asp Thr Leu Leu Phe Ser Leu Leu Ile Leu Ala Ile Phe
450 455 460
Thr Pro Leu Ile Phe Glu Leu Leu Trp Arg Lys Lys Asp Ile Arg Phe
465 470 475 480
Val Cys Ser Phe Tyr Ile Val Gly Val Leu Leu Phe Ser Leu Trp Ala
485 490 495
Asn Leu Ser His Ile Tyr Asn Tyr Arg Ala His Thr Val Phe Ser Tyr
500 505 510
Asn Glu Ala Ser Ile Leu Asp Asn Leu Lys Ala Asn Val Ser Arg Glu
515 520 525
Asp Tyr Ile Val Ala Trp Trp Asp Tyr Gly Tyr Pro Ile Arg Tyr Tyr
530 535 540
Ser Asp Val Lys Thr Leu Ala Asp Gly Gly Lys His Leu Gly Lys Asp
545 550 555 560
Asn Phe Phe Pro Ser Phe Val Leu Ser Gln Asn Pro Arg Ala Ala Ala
565 570 575
Asn Met Ala Arg Leu Ser Val Glu Tyr Thr Glu Lys Gly Phe Lys Thr
580 585 590
Pro Tyr Asn Asp Leu Leu Glu Ala Met Met Lys Asp Tyr Asn Tyr Ser
595 600 605
Asn Val Asn Leu Phe Leu Ala Ala Leu Ser Lys Glu Asp Phe Thr Leu
610 615 620
Gln Thr Pro Lys Thr Arg Asp Ile Tyr Ile Tyr Met Pro Ser Arg Met
625 630 635 640
Ala Ala Ile Phe Gly Thr Val Ala Ser Phe Ser Tyr Met Ser Leu Glu
645 650 655
Thr Gly Glu Leu Glu Asn Pro Phe Val Tyr Ser Val Ala Tyr Tyr Leu
660 665 670
Gly Asn Glu Asp Gly Lys Leu Val Leu Ser Asn Asn Met Leu Leu His
675 680 685
Ser Asp Phe Arg Ser Phe Asp Leu Asn Gly Lys Asn Tyr Ala Ile Asn
690 695 700
Ser Leu Val Glu Phe Thr Ser Val Gln Gln Lys Tyr Tyr Ser Val Val
705 710 715 720
Glu Ile Asp Lys Asn Ala Lys Tyr Tyr Leu Phe His Ile Lys Asp Ala
725 730 735
Asn Ile Pro Asn Val Gln Phe Ile Leu Met Asp Lys Ala Met Tyr Glu
740 745 750
Ser Ala Phe Val Gln Met Phe Phe Phe Gly Lys Tyr Asp Glu Ser Leu
755 760 765
Tyr Glu Leu Ile Val Asp Ser Lys Glu Ala Lys Val Tyr Lys Leu Lys
770 775 780
Leu
785
[0029] Amino acid sequences sharing at least about 70 percent, more preferably at least about 75 percent or 80 percent, most preferably at least about 85 percent or 90 percent or 95 percent as compared to the C. upsaliensis amino acid sequence of SEQ ID NO: 8 are also suitable for use in the present invention. The nucleic acid sequence encoding the amino acid sequence of SEQ ID NO: 8 is provided below as SEQ ID NO: 9 (EMBL Nucleotide Sequence Database No. EFU71695): atgaaaaacg aggctgtgaa aaatgcgaat ttgaggctag tattttttat cttactagct tttggtttta gtgttttatg tcgcttttat tggatttatt gggcgagtga ttttaacgaa tattttttta ataatcagct tatgataagc tcaaatgacg gctacacttt tgcagagggt gctagagata agatagcggg ctttcatcag gaaaatgatt Laagctttat taattcctct ctttctattt tgacttatgt gctttataaa atcacgcctt ttagttttga aagcattatt ttatatatga gtgtattttt ttcttcactt atagttgtgc cgcttatttt aattgcaaat gagcttaaac gccctttaat gggacttttt gcggcatttt tagcaagtat tgcaaaaagc tattataacc gcactatggc aggatattat gatacagata tgttagccat tgtgcttcct atgtttattt tatatttttt catcaggctt attttaagaa aagatgattt ttctttactt gccttgccgt tttttatggg actttatctt tggtggtatc catcaagcta tactctaaat gtcgctttta tcgcactttt taccctttat gttttgattt atcatagaaa agaaaggtct ttttatatgg cagcactttt gtgtgccatt accctttcaa atattgcttg gttttatcaa agtgctatta ttgttttact ttttgctctt tttatgctta aaaattcgtt ttttaatttt aaatttatcg cacttttagc cttaggagtt ttagtttttt tggctttaag tggggggata gaccccatac tttatcagct taaattttat cttttaagaa gtgatgaaag tgcaagttta gcgcgtggtt ttgcgtattt taatgtaaat ttaaccatac aagaggttga aagtatcgat ttaagcactt ttatgcaaag aattagcgga agtgagcttg tgtttttact ttctcttttt ggctttttat ggcttttaaa aaagcataag gtgatgcttt taaccctacc tatgcttttg ctcggttttt tagcacttag aggtgggctt agatttacta tttatgctgt gcctattatg gcgcttggct ttggcttttt aagcgttcaa attttaagct taatccaaaa aatgcgtccc ttaaaagaaa ctcgaaaatt aagaatattt ttttatggaa tctttccgct ttttgtgctt gttttggggg cttattttta ttttagtcaa agtgctattt atgagagtat gggagtggaa tttcaaaaga actttgtgag cttttttgta gaagatactt tgcttttttc tttgctgatt ttggctattt 
ttacgccttt aatttttgag cttttgtgga gaaaaaagga cattcgtttt gtgtgtagct tttatattgt gggggttttg cttttttctt tatgggcaaa tttaagtcat atttataatt atagagcaca caccgttttt agctacaatg aagcgagtat tttggataat cttaaagcta atgtttctag ggaagattat attgtggctt ggtgggatta tggctatcct attcgttatt atagcgatgt gaaaacctta gctgatgggg gtaagcattt gggtaaggat aattttttcc cttcttttgt tttaagtcaa aatccacgcg cagcggcaaa tatggcaaga cttagcgtag aatacacaga aaaaggcttt aaaacgcctt ataatgatct tttagaagcg 1800 atgatgaagg attataatta tagcaatgta aatttatttt tagcggcact ttctaaggag 1860 gattttactc ttcaaacgcc caaaactaga gatatttaca tctatatgcc ttctcgtatg 1920 gcggcgattt ttggcacggt ggcaagtttt tcttatatga gcttagaaac gggtgagctt 1980 gaaaatcctt ttgtttatag tgtggcgtat tatttgggaa atgaggacgg caaactcgtc 2040 ttaagtaata atatgctcct tcatagcgac tttagaagct ttgaccttaa tggcaagaat 2100 tatgctatta attctttggt tgaatttact tcggtgcagc aaaaatatta tagtgttgtg 2160 gagattgata aaaatgctaa atattatctc tttcacatca aagacgctaa tatccctaat 2220 gtgcaattta tcctaatgga taaggcgatg tatgagagtg ctttcgtgca aatgtttttc 2280 tttggtaagt atgatgagag tttgtatgaa ttaattgtag atagtaaaga agcaaaggtg 2340 tataaattaa aattatga 2358
[0030] An alignment of the Campylobacter PglB sequences is provided in Figures 6A-6C, and a PglB consensus sequence based on this alignment is presented as SEQ ID NO: 10 of Figure 6. Residues that are not fully conserved between the four Campylobacter sequences are depicted as X, where X can be any amino acid residue. Alternatively, X is selected from one of the four depicted amino acid residues at the corresponding position in the depicted Campylobacter sequences.
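The consensus rule described above (a conserved residue is kept, any other position is written as X) can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used to generate SEQ ID NO: 10. The function names `consensus` and `allowed_residues` are hypothetical, the treatment of gap columns as X is an assumption, and the two inputs are short ungapped excerpts of SEQ ID NO: 6 and SEQ ID NO: 8 lined up on a shared motif rather than the four-sequence alignment of Figures 6A-6C.

```python
def consensus(aligned, placeholder="X", gap="-"):
    """Return a consensus string for equal-length aligned sequences.

    A column that holds one identical residue in every sequence keeps that
    residue; any other column (including one with a gap) becomes `placeholder`.
    """
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    out = []
    for column in zip(*aligned):
        residues = set(column)
        if len(residues) == 1 and gap not in residues:
            out.append(column[0])
        else:
            out.append(placeholder)
    return "".join(out)


def allowed_residues(aligned, gap="-"):
    """List, per column, the residues seen at that position (the
    'alternatively, X is selected from the depicted residues' reading)."""
    return [sorted(set(column) - {gap}) for column in zip(*aligned)]


# One-letter excerpts taken from the listings above (assumed alignment):
C_COLI_EXCERPT = "MIISNDGYAFAEGARD"         # SEQ ID NO: 6, residues 49-64
C_UPSALIENSIS_EXCERPT = "MISSNDGYTFAEGARD"  # SEQ ID NO: 8, residues 48-63

excerpts = [C_COLI_EXCERPT, C_UPSALIENSIS_EXCERPT]
print(consensus(excerpts))            # MIXSNDGYXFAEGARD
print(allowed_residues(excerpts)[2])  # ['I', 'S']
```

With only two input sequences the sketch marks fewer or more positions as X than a four-sequence alignment would, so the output is not a fragment of SEQ ID NO: 10.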
[0031] In another embodiment of the present invention, the OST is an archaeal oligosaccharyltransferase. For example, the OST STT3 subunit from Pyrococcus furiosus, which is capable of transferring a glycan to an asparagine residue of a target glycoprotein, is suitable for use in this and all aspects of the present invention. The amino acid sequence of the P. furiosus OST STT3 subunit (UniProtKB Accession No. Q8U4D2) is provided below as SEQ ID NO: 11:
Met Val Lys Thr Gln Ile Lys Glu Lys Lys Lys Asp Glu Lys Val Thr
1 5 10 15
Ile Pro Leu Pro Gly Lys Ile Lys Thr Val Leu Ala Phe Leu Val Val
20 25 30
Leu Ala Phe Ala Ala Tyr Gly Phe Tyr Ile Arg His Leu Thr Ala Gly
35 40 45
Lys Tyr Phe Ser Asp Pro Asp Thr Phe Tyr His Phe Glu Ile Tyr Lys
50 55 60
Leu Val Leu Lys Glu Gly Leu Pro Arg Tyr Tyr Pro Met Ala Asp Ala
65 70 75 80
Pro Phe Gly Ser Leu Ile Gly Glu Pro Leu Gly Leu Tyr Ile Leu Pro
85 90 95
Ala Ile Phe Tyr Lys Ile Ile Ser Ile Phe Gly Tyr Asn Glu Leu Glu
100 105 110
Ala Phe Leu Leu Trp Pro Pro Phe Val Gly Phe Leu Ser Val Ile Gly
115 120 125
Val Tyr Leu Leu Gly Arg Lys Val Leu Asn Glu Trp Ala Gly Met Trp
130 135 140
Gly Ala Ile Ile Leu Ser Val Leu Thr Ala Asn Phe Ser Arg Thr Phe
145 150 155 160
Ser Gly Asn Ala Arg Gly Asp Gly Pro Phe Met Met Leu Phe Thr Phe
165 170 175
Ser Ala Val Leu Met Leu Tyr Tyr Leu Thr Glu Glu Asn Lys Asn Lys
180 185 190
Lys Ile Ile Trp Gly Thr Leu Phe Val Leu Leu Ala Gly Ile Ser Thr
195 200 205
Ala Ala Trp Asn Gly Ser Pro Phe Gly Leu Met Val Leu Leu Gly Phe
210 215 220
Ala Ser Phe Gln Thr Ile Ile Leu Phe Ile Phe Gly Lys Ile Asn Glu
225 230 235 240
Leu Arg Glu Phe Ile Lys Glu Tyr Tyr Pro Ala Tyr Leu Gly Ile Leu
245 250 255
Ala Ile Ser Tyr Leu Leu Thr Ile Pro Gly Ile Gly Lys Ile Gly Gly
260 265 270
Phe Val Arg Phe Ala Phe Glu Val Phe Leu Gly Leu Val Phe Leu Ala
275 280 285
Ile Val Met Leu Tyr Gly Gly Lys Tyr Leu Asn Tyr Ser Asp Lys Lys
290 295 300
His Arg Phe Ala Val Val Ala Val Ile Val Ile Ala Gly Phe Ala Gly
305 310 315 320
Ala Tyr Ile Tyr Val Gly Pro Lys Leu Phe Thr Leu Met Gly Gly Ala
325 330 335
Tyr Gln Ser Thr Gln Val Tyr Glu Thr Val Gln Glu Leu Ala Lys Thr
340 345 350
Asp Trp Gly Asp Val Lys Val Tyr Tyr Gly Val Glu Lys Pro Asn Gly
355 360 365
Ile Val Phe Phe Leu Gly Leu Val Gly Ala Met Ile Val Thr Ala Arg
370 375 380
Tyr Leu Tyr Lys Leu Phe Lys Asp Gly Arg Arg Pro His Glu Glu Leu
385 390 395 400
Phe Ala Ile Thr Phe Tyr Val Met Ser Ile Tyr Leu Leu Trp Thr Ala
405 410 415
Ala Arg Phe Leu Phe Leu Ala Ser Tyr Ala Ile Ala Leu Met Ser Gly
420 425 430
Val Phe Ala Gly Tyr Val Leu Glu Thr Val Glu Lys Met Lys Glu Ser
435 440 445
Ile Pro Ile Lys Ala Ala Leu Gly Gly Val Ile Ala Ile Met Leu Leu
450 455 460
Leu Ile Pro Leu Thr His Gly Pro Leu Leu Ala Gln Ser Ala Lys Ser
465 470 475 480
Met Arg Thr Thr Glu Ile Glu Thr Ser Gly Trp Glu Asp Ala Leu Lys
485 490 495
Trp Leu Arg Glu Asn Thr Pro Glu Tyr Ser Thr Ala Thr Ser Trp Trp
500 505 510
Asp Tyr Gly Tyr Trp Ile Glu Ser Ser Leu Leu Gly Gln Arg Arg Ala
515 520 525
Ser Ala Asp Gly Gly His Ala Arg Asp Arg Asp His Ile Leu Ala Leu
530 535 540
Phe Leu Ala Arg Asp Gly Asn Ile Ser Glu Val Asp Phe Glu Ser Trp
545 550 555 560
Glu Leu Asn Tyr Phe Leu Val Tyr Leu Asn Asp Trp Ala Lys Phe Asn
565 570 575
Ala Ile Ser Tyr Leu Gly Gly Ala Ile Thr Arg Arg Glu Tyr Asn Gly
580 585 590
Asp Glu Ser Gly Arg Gly Ala Val Thr Thr Leu Leu Pro Leu Pro Arg
595 600 605
Tyr Gly Glu Lys Tyr Val Asn Leu Tyr Ala Lys Val Ile Val Asp Val
610 615 620
Ser Asn Ser Ser Val Lys Val Thr Val Gly Asp Arg Glu Cys Asp Pro
625 630 635 640
Leu Met Val Thr Phe Thr Pro Ser Gly Lys Thr Ile Lys Gly Thr Gly
645 650 655
Thr Cys Ser Asp Gly Asn Ala Phe Pro Tyr Val Leu His Leu Thr Pro
660 665 670
Thr Ile Gly Val Leu Ala Tyr Tyr Lys Val Ala Thr Ala Asn Phe Ile
675 680 685
Lys Leu Ala Phe Gly Val Pro Ala Ser Thr Ile Pro Gly Phe Ser Asp
690 695 700
Lys Leu Phe Ser Asn Phe Glu Pro Val Tyr Glu Ser Gly Asn Val Ile
705 710 715 720
Val Tyr Arg Phe Thr Pro Phe Gly Ile Tyr Lys Ile Glu Glu Asn Ile
725 730 735
Asn Gly Thr Trp Lys Gln Val Tyr Asn Leu Thr Pro Gly Lys His Glu
740 745 750
Leu Lys Leu Tyr Ile Ser Ala Phe Gly Arg Asp Ile Glu Asn Ala Thr
755 760 765
Leu Tyr Ile Tyr Ala Ile Asn Asn Glu Lys Ile Ile Glu Lys Ile Lys
770 775 780
Ile Ala Glu Ile Ser His Met Asp Tyr Leu Asn Glu Tyr Pro Ile Ala
785 790 795 800
Val Asn Val Thr Leu Pro Asn Ala Thr Ser Tyr Arg Phe Val Leu Val
805 810 815
Gln Lys Gly Pro Ile Gly Val Leu Leu Asp Ala Pro Lys Val Asn Gly
820 825 830
Glu Ile Arg Ser Pro Thr Asn Ile Leu Arg Glu Gly Glu Ser Gly Glu
835 840 845
Ile Glu Leu Lys Val Gly Val Asp Lys Asp Tyr Thr Ala Asp Leu Tyr
850 855 860
Leu Arg Ala Thr Phe Ile Tyr Leu Val Arg Lys Ser Gly Lys Asp Asn
865 870 875 880
Glu Asp Tyr Asp Ala Ala Phe Glu Pro Gln Met Asp Val Phe Phe Ile
885 890 895
Thr Lys Ile Gly Glu Asn Ile Gln Leu Lys Glu Gly Glu Asn Thr Val
900 905 910
Lys Val Arg Ala Glu Leu Pro Glu Gly Val Ile Ser Ser Tyr Lys Asp
915 920 925
Glu Leu Gln Arg Lys Tyr Gly Asp Lys Leu Ile Ile Arg Gly Ile Arg
930 935 940
Val Glu Pro Val Phe Ile Ala Glu Lys Glu Tyr Leu Met Leu Glu Val
945 950 955 960
Ser Ala Ser Ala Pro His His
965
[0032] Amino acid sequences sharing at least about 70 percent, more preferably at least about 75 percent or 80 percent, most preferably at least about 85 percent or 90 percent or 95 percent sequence identity as compared to the P. furiosus amino acid sequence of SEQ ID NO: 11 are also suitable for use in the present invention. The nucleic acid sequence encoding the amino acid sequence of SEQ ID NO: 11 is provided below as SEQ ID NO: 12 (EMBL Nucleotide Sequence Database No. AAL80280):
atggtgaaaa cccaaataaa ggagaaaaag aaagatgaaa aagttactat tccacttcct 60 gggaagataa aaactgtttt ggccttccta gtcgttttgg catttgccgc atatggattt 120 tacattagac atttaacagc cggaaagtat ttctcagatc cagatacctt ctaccatttc 180 gaaatttata agctagtcct caaagagggc cttcctaggt attacccaat ggcagatgct 240 ccatttggaa gtctcatagg agaacctctt ggactataca tccttccagc aatattctac 300 aaaataatct caatatttgg gtacaatgag ctagaggcat ttcttctttg gcccccattc 360 gtaggatttc tcagtgttat aggtgtttac ttactcggaa gaaaagttct gaacgaatgg 420 gcagggatgt ggggtgctat aattctctca gtcctcacgg caaacttttc aagaacattc 480 tcaggcaacg caagaggcga cggcccattc atgatgttgt ttacgttttc agcagtccta 540 atgctctatt atctaaccga ggaaaataaa aacaagaaaa taatctgggg aacactgttt 600 gtactcttgg caggaatatc aactgcagca tggaacggtt caccatttgg actaatggtt 660 ctccttggat tcgcatcgtt ccagacaata atcctcttta tttttggaaa gatcaatgag 720 cttagagaat tcataaagga atactaccca gcatacctgg gaattttagc tataagctac 780 cttctaacga tcccaggaat tggaaaaata ggaggatttg taagatttgc atttgaggtt 840 ttcttagggt tagttttctt agccatcgtc atgctctatg gaggaaaata cttgaactat 900 tctgacaaga agcacaggtt cgcagtggtt gcagttatag ttattgcggg gttcgcagga 960 gcttatattt acgttggtcc aaaactcttc actctaatgg gtggagctta tcagtcaacg 1020 caagtttatg aaacagtaca ggagctcgca aaaactgatt ggggagatgt aaaagtctat 1080 tatggagtag aaaagccaaa cggaatagtc ttcttccttg gattagttgg agcaatgatt 1140 gttacagcta ggtacctcta caaattattt aaagatggaa ggcgcccaca cgaagagtta 1200 tttgcaataa ctttctatgt aatgtcaatt tacctcctct ggacagctgc tagattccta 1260 ttcctagcga gttatgcgat agcattgatg tcaggtgtct ttgcaggata cgtcctagag 1320 actgtagaaa agatgaaaga gagtatacca ataaaagcag cactaggagg agtaattgct 1380 attatgcttc ttctaatacc cttaactcat ggcccactct tagctcaaag cgctaaaagt 1440 atgagaacaa ccgagatcga gactagtgga tgggaagatg cgctcaaatg gctcagagaa 1500 aacactccag aatattcgac cgcaacctct tggtgggact atggatattg gatagagtca 1560 agcctcctag gacagagaag ggccagtgct gatggtggac atgcaagaga tagagatcat 1620 atcttagccc tatttctagc cagagacggt aacattagtg aagtagactt tgagagttgg 1680 gagcttaact acttcctagt 
ttaccttaat gattgggcaa agttcaatgc aatcagctat 1740 ctaggcgggg ctataacgag gagagaatac aatggagatg aaagtggaag aggagccgta 1800 actacgctac ttcctctccc aaggtatgga gagaaatacg tcaacctcta tgccaaagtt 1860 atagttgatg tttcaaactc gagcgtaaag gttactgtag gagacagaga gtgtgatcca 1920 ctaatggtta cgtttactcc aagtggaaag acgataaaag gaactggaac ctgtagtgat 1980 ggcaacgcct tcccatatgt tttacactta actccaacaa ttggagtact tgcatactac 2040 aaagtagcaa ctgcaaactt cattaagtta gccttcggtg ttccagcttc aacaattcca 2100 ggattctctg ataagctatt ctcaaacttt gagccagtgt atgagtcagg aaacgtaata 2160 gtatatcgct tcacaccatt tggaatatac aaaattgagg aaaacattaa cggaacttgg 2220 aagcaagttt ataacctaac tcctggaaaa cacgagctca aactgtacat ttcagcattc 2280 ggaagagaca tcgaaaatgc aacgctgtac atttacgcca taaacaacga gaagatcata 2340 gagaaaatta agattgccga gatatcccac atggactatc taaatgaata cccgatagca 2400 gtgaacgtaa ccctaccaaa tgctacaagc tacaggtttg tactagttca aaaaggccca 2460 ataggtgttc ttctagatgc accaaaagtc aatggtgaga taagaagtcc aaccaacata 2520 ctaagggaag gagaaagtgg agaaatagag cttaaagttg gggttgataa agactacact 2580 gccgatctat acttaagggc tacgttcata tatttagtca gaaaaagtgg aaaggataac 2640 gaagattatg acgcagcgtt tgagccccaa atggatgttt tctttatcac aaagatcgga 2700 gaaaacattc aacttaaaga aggagagaat acagtaaagg ttagggcgga gcttccagaa 2760 ggagttatat ctagctacaa agatgaacta cagagaaaat acggagacaa gttgataatc 2820 agaggaataa gagtagagcc agtgttcata gcagaaaaag agtacctaat gctcgaggtc 2880 agtgcatcgg ctcctcatca ctaa 2904
[0033] OSTs from other Pyrococcus species or strains that share sequence identity with the P. furiosus OST STT3 subunit-related protein and/or are capable of transferring a glycan moiety to a target glycoprotein are also suitable for use in the present invention. For example, homologous OSTs derived from Pyrococcus sp. ST04 (SEQ ID NO: 13; UniProtKB No. I3RCF1), Pyrococcus sp. (strain NA2) (SEQ ID NO: 14; UniProtKB No. F4HM23), P. horikoshii (SEQ ID NO: 15; UniProtKB No. O74088), P. abyssi (SEQ ID NO: 16; UniProtKB No. Q9V250), and P. yayanosii (SEQ ID NO: 17; UniProtKB No. F8AIG3) each share greater than 70% sequence identity with the amino acid sequence of P. furiosus OST (see alignment of Figure 7), and are suitable for use in this and all aspects of the present invention. The nucleotide sequences encoding the aforementioned Pyrococcus OSTs are known and readily available in the art. An STT3 consensus sequence based on the alignment of Pyrococcus STT3 sequences is presented as SEQ ID NO: 18 in Figure 7. Residues that are not fully conserved between the six Pyrococcus sequences are depicted as X, where X can be any amino acid residue. Alternatively, X is selected from an amino acid residue at the corresponding position in one of the six depicted Pyrococcus sequences.
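The identity thresholds recited above (for example, greater than 70% sequence identity) presuppose a way of scoring two aligned sequences. The following Python sketch shows one common convention, assumed here for illustration only: identical columns divided by all aligned columns that are not gap-versus-gap. The text does not state which convention or alignment tool was used, and the names `percent_identity` and `meets_threshold` are hypothetical. Because the full-length Pyrococcus homolog sequences are not reproduced in the text, the example uses two short ungapped excerpts from the Campylobacter listings (SEQ ID NO: 6 and SEQ ID NO: 8).

```python
def percent_identity(a, b, gap="-"):
    """Percent identity between two pre-aligned, equal-length sequences.

    Assumed convention: columns where both sequences have a gap are ignored;
    every other column counts toward the denominator.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == gap and y == gap:
            continue  # uninformative column
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_threshold(a, b, threshold=70.0):
    """True when the aligned pair reaches the given percent identity."""
    return percent_identity(a, b) >= threshold


# One-letter excerpts around the conserved WWDYGY motif:
C_COLI_MOTIF = "REDYVVTWWDYGYPIRYYSDVKTLADGGKH"         # SEQ ID NO: 6, 451-480
C_UPSALIENSIS_MOTIF = "REDYIVAWWDYGYPIRYYSDVKTLADGGKH"  # SEQ ID NO: 8, 527-556

print(round(percent_identity(C_COLI_MOTIF, C_UPSALIENSIS_MOTIF), 1))  # 93.3
print(meets_threshold(C_COLI_MOTIF, C_UPSALIENSIS_MOTIF))             # True
```

The excerpt covers a highly conserved stretch, so its score is higher than a full-length comparison would give; the percentages stated in the text refer to the complete sequences and figures, not to this fragment.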
[0034] In another embodiment of the present invention, the OST is a eukaryotic oligosaccharyltransferase. For example, the OST STT3 subunit from Leishmania major, which is capable of transferring a glycan to an asparagine residue of a target glycoprotein, is suitable for use in this and all aspects of the present invention. The amino acid sequence of L. major STT3 (UniProtKB Accession No. Q9U5N8) is provided below as SEQ ID NO: 19:
Met Ala Ala Ala Ser Asn Val Asn Ala Pro Glu Ser Asn Val Met Thr
1 5 10 15
Thr Arg Ser Ala Val Ala Pro Pro Ser Thr Ala Ala Pro Lys Glu Ala
20 25 30
Ser Ser Glu Thr Leu Leu Ile Gly Leu Tyr Lys Met Pro Ser Gln Thr
35 40 45
Arg Ser Leu Ile Tyr Ser Ser Cys Phe Ala Val Ala Met Ala Ile Ala
50 55 60
Leu Pro Ile Ala Tyr Asp Met Arg Val Arg Ser Ile Gly Val Tyr Gly
65 70 75 80
Tyr Leu Phe His Ser Ser Asp Pro Trp Phe Asn Tyr Arg Ala Ala Glu
85 90 95
Tyr Met Ser Thr His Gly Trp Ser Ala Phe Phe Ser Trp Phe Asp Tyr
100 105 110
Met Ser Trp Tyr Pro Leu Gly Arg Pro Val Gly Ser Thr Thr Tyr Pro
115 120 125
Gly Leu Gln Leu Thr Ala Val Ala Ile His Arg Ala Leu Ala Ala Ala
130 135 140
Gly Met Pro Met Ser Leu Asn Asn Val Cys Val Leu Met Pro Ala Trp
145 150 155 160
Phe Ser Leu Val Ser Ser Ala Met Ala Ala Leu Leu Ala His Glu Met
165 170 175
Ser Gly Asn Met Ala Val Ala Ser Ile Ser Ser Ile Leu Phe Ser Val
180 185 190
Val Pro Ala His Leu Met Arg Ser Met Ala Gly Glu Phe Asp Asn Glu
195 200 205
Cys Ile Ala Val Ala Ala Met Leu Leu Thr Phe Tyr Cys Trp Val Arg
210 215 220
Ser Leu Arg Thr Arg Ser Ser Trp Pro Ile Gly Val Leu Thr Gly Val
225 230 235 240
Ala Tyr Gly Tyr Met Ala Ala Ala Trp Gly Gly Tyr Ile Phe Val Leu
245 250 255
Asn Met Val Ala Met His Ala Gly Ile Ser Ser Met Val Asp Trp Ala
260 265 270
Arg Asn Thr Tyr Asn Pro Ser Leu Leu Arg Ala Tyr Thr Leu Phe Tyr
275 280 285
Val Val Gly Thr Ala Ile Ala Val Cys Val Pro Pro Val Gly Met Ser
290 295 300
Pro Phe Lys Ser Leu Glu Gln Leu Gly Ala Leu Leu Val Leu Val Phe
305 310 315 320
Ile Phe Gly Gln Ser Val Cys Glu Ala Gln Arg Arg Arg Leu Gly Ile
325 330 335
Ala Arg Leu Ser Lys Glu Gly Val Ala Leu Leu Ile Arg Ile Asp Ala
340 345 350
Ala Phe Phe Val Gly Ile Val Ala Val Ala Thr Ile Ala Pro Ala Gly
355 360 365
Phe Phe Lys Pro Leu Ser Leu Gln Ala Asn Ala Ile Ile Thr Gly Val
370 375 380
Ser Arg Thr Gly Asn Thr Leu Val Asp Ile Leu Leu Ala Gln Asp Ala
385 390 395 400
Ser Asn Leu Leu Met Val Trp Gln Leu Phe Leu Phe Pro Phe Leu Gly
405 410 415
Trp Val Ala Gly Met Ser Ala Phe Leu Arg Glu Leu Ile Arg Asn Tyr
420 425 430
Thr Tyr Ala Lys Ser Phe Ile Leu Met Tyr Gly Val Val Gly Met Tyr
435 440 445
Phe Ala Ser Gln Ser Val Arg Met Met Val Met Met Ala Pro Val Ala
450 455 460
Cys Ile Phe Thr Ala Leu Leu Phe Arg Trp Ala Leu Asp Tyr Leu Leu
465 470 475 480
Gly Ser Leu Phe Trp Ala Glu Met Pro Pro Ser Phe Asp Thr Asp Ala
485 490 495
Gln Arg Gly Arg Gln Gln Gln Thr Ala Glu Glu Ser Glu Ala Glu Thr
500 505 510
Lys Arg Lys Glu Glu Glu Tyr Asn Thr Met Gln Val Lys Lys Met Ser
515 520 525
Val Arg Met Leu Pro Phe Met Leu Leu Leu Leu Leu Phe Arg Leu Ser
530 535 540
Gly Phe Ile Glu Asp Val Ala Ala Ile Ser Arg Lys Met Glu Ala Pro
545 550 555 560
Gly Ile Val Phe Pro Ser Glu Gln Val Gln Gly Val Ser Glu Lys Lys
565 570 575
Val Asp Asp Tyr Tyr Ala Gly Tyr Leu Tyr Leu Arg Asp Ser Thr Pro
580 585 590
Glu Asp Ala Arg Val Leu Ala Trp Trp Asp Tyr Gly Tyr Gln Ile Thr
595 600 605
Gly Ile Gly Asn Arg Thr Ser Leu Ala Asp Gly Asn Thr Trp Asn His
610 615 620
Glu His Ile Ala Thr Ile Gly Lys Met Leu Thr Ser Pro Val Ala Glu
625 630 635 640
Ala His Ser Leu Val Arg His Met Ala Asp Tyr Val Leu Ile Ser Ala
645 650 655
Gly Asp Thr Tyr Phe Ser Asp Leu Asn Arg Ser Pro Met Met Ala Arg
660 665 670
Ile Gly Asn Ser Val Tyr His Asp Ile Cys Pro Asp Asp Pro Leu Cys
675 680 685
Ser Gln Phe Val Leu Gln Lys Arg Pro Lys Ala Ala Ala Ala Lys Arg
690 695 700
Ser Arg His Val Ser Val Asp Ala Leu Glu Glu Asp Asp Thr Ala Glu
705 710 715 720
His Met Val Tyr Glu Pro Ser Ser Leu Ile Ala Lys Ser Leu Ile Tyr
725 730 735
His Leu His Ser Thr Gly Val Val Thr Gly Val Thr Leu Asn Glu Thr
740 745 750
Leu Phe Gln His Val Phe Thr Ser Pro Gln Gly Leu Met Arg Ile Phe
755 760 765
Lys Val Met Asn Val Ser Thr Glu Ser Lys Lys Trp Val Ala Asp Ser
770 775 780
Ala Asn Arg Val Cys His Pro Pro Gly Ser Trp Ile Cys Pro Gly Gln
785 790 795 800
Tyr Pro Pro Ala Lys Glu Ile Gln Glu Met Leu Ala His Gln His Thr
805 810 815
Asn Phe Lys Asp Leu Leu Asp Pro Arg Thr Thr Trp Ser Gly Ser Arg
820 825 830
Arg
[0035] Amino acid sequences sharing at least about 70 percent, more preferably at least about 75 percent or 80 percent, most preferably at least about 85 percent or 90 percent or 95 percent sequence identity as compared to the L. major amino acid sequence of SEQ ID NO: 19 are also suitable for use in the present invention. The nucleic acid sequence encoding the amino acid sequence of SEQ ID NO: 19 (L. major STT3) is provided below as SEQ ID NO: 20 (EMBL Nucleotide Sequence Database No. CAB61569):
atggcggcag cgtcaaacgt gaatgccccc gaaagcaacg tgatgacaac gagaagtgcc 60 gttgcaccac cgtcgacggc tgcacccaaa gaggcttcaa gtgaaacgct gctcattggc 120 ctatacaaga tgccctcgca aactcgtagc ctcatctact cctcctgctt tgcggtggcc atggccattg ccctccctat cgcgtacgac atgcgtgtcc gctccatcgg cgtgtacggg tacctcttcc acagcagtga cccgtggttc aactaccgcg ctgccgagta catgtccacg cacggctggt ccgccttctt cagctggttc gactacatga gctggtaccc gctgggccgc cccgtcggct ccaccacgta cccgggcctg cagctcactg ccgtcgccat tcaccgcgca ctggcggctg ccggcatgcc gatgtctctc aacaacgtgt gcgtgctgat gccagcgtgg ttttcacttg tctcttcagc gatggcggca ctgctggcgc atgagatgag cggcaatatg gcggtagcca gcatctcgtc tatcttattc agtgtggttc cagcccacct gatgcggtcc atggcgggtg agttcgacaa cgagtgtatc gccgtcgcag ccatgctcct caccttctac tgctgggtgc gctcgctgcg cacgcggtcc tcgtggccca tcggtgtcct caccggtgtc gcctacggct acatggcggc ggcgtggggc ggctacattt tcgtgctcaa catggttgcc atgcatgccg gcatatcatc gatggtggac tgggcccgca acacgtacaa cccgtcgctg ctgcgtgcat acacgctgtt ctacgtcgtg ggcaccgcca tcgccgtgtg cgtgccgcca gtggggatgt cgcccttcaa gtcgctggag cagctgggtg cgctgctggt gcttgtcttc attttcggtc agtctgtgtg tgaggcccag cgcagacgat tgggaatcgc gcgcctttca aaggagggcg tggcgctgct catccgcatc gacgcagcct tcttcgtcgg tatcgttgcc gtggccacca ttgccccggc tggattcttc aagccgctct ccctgcaagc gaacgcgata atcactggcg tatctcgtac cggaaacaca ctcgtagaca ttctgcttgc gcaagacgcg tccaacctac tcatggtgtg gcagcttttt ctctttccct tcttaggttg ggtggcgggc atgagcgcct tccttagaga gttgatccgg aactacacct acgcgaagag tttcatcctg atgtacggcg tggtcggtat gtacttcgcc agccagtctg tccgaatgat ggtgatgatg gcccccgtgg cgtgcatctt tactgccctc ttgttccgct gggcactgga ctacctcctc gggtctttgt tttgggctga gatgccacct tcctttgaca ccgacgcaca gcgtgggcgg cagcaacaga ccgccgagga gtcggaggca gagaccaagc gtaaggagga agagtacaac accatgcagg tcaagaagat gtcggtgcgc atgttgccct tcatgctgtt gctcttactg tttcgtcttt cggggttcat cgaagatgtg gcggcgatat cgcgcaagat ggaggcgccg ggtatagttt ttcccagtga acaggtgcaa ggcgtgtcgg agaaaaaggt cgacgactac tatgcggggt acctgtatct gcgcgacagc acgccagagg acgcgcgcgt tttggcctgg tgggactacg 
gctaccagat cacaggcatc ggcaaccgca cctcgctggc cgatggcaac acctggaacc acgagcacat cgccacgatc ggcaagatgc tgacgtcgcc cgtggcggag gcgcactcgc tggtgcgcca catggccgac tatgttctga tttctgctgg agacacatat ttttccgacc tgaatcgctc accgatgatg gcgcgcatcg gcaacagcgt gtaccacgac atctgccccg acgacccact ttgtagtcag ttcgtgttgc agaaaagacc gaaagctgct gcagcgaagc gcagtcggca cgtcagcgtt gacgcactag aggaggatga cactgcagag 2160 catatggtat acgagccgtc atcactcata gccaagtcgc tcatatatca cctgcactcc 2220 acaggggtgg tgacgggggt cacgctgaat gagacgctct tccagcacgt cttcacctca 2280 ccgcagggtc tcatgcgcat cttcaaggtc atgaacgtga gcacggagag caaaaagtgg 2340 gttgctgact cggcaaaccg cgtgtgccac ccgcctgggt cgtggatctg ccccgggcag 2400 tacccgccgg cgaaggagat ccaggagatg ctggcacacc aacacaccaa cttcaaggac 2460 cttcttgatc ccagaacgac ttggagcggg agcaggcgct ga 2502
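The relationship between a nucleic acid listing and the amino acid sequence it encodes can be checked by translating the coding sequence with the standard genetic code. The Python sketch below is offered only as an illustration of that check, not as part of the disclosed method. The helper names `translate` and `three_to_one` are hypothetical, and the inputs are the first 60 nucleotides of SEQ ID NO: 20 and the first 20 residues of SEQ ID NO: 19 as printed above.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def translate(listing):
    """Translate a nucleotide listing, ignoring spaces and position numbers.

    Translation stops at the first stop codon; codons containing characters
    outside A/C/G/T are reported as 'X'.
    """
    seq = "".join(ch for ch in listing if ch.isalpha()).upper()
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        residue = CODON_TABLE.get(seq[i:i + 3], "X")
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def three_to_one(listing):
    """Convert a three-letter residue listing to one-letter codes,
    skipping the position numbers interleaved in the listing."""
    return "".join(THREE_TO_ONE[tok] for tok in listing.split() if tok.isalpha())


# First 60 nt of SEQ ID NO: 20 and first 20 residues of SEQ ID NO: 19.
NT_START = ("atggcggcag cgtcaaacgt gaatgccccc "
            "gaaagcaacg tgatgacaac gagaagtgcc 60")
AA_START = ("Met Ala Ala Ala Ser Asn Val Asn Ala Pro Glu Ser Asn Val Met Thr "
            "1 5 10 15 Thr Arg Ser Ala")

print(translate(NT_START))                            # MAAASNVNAPESNVMTTRSA
print(translate(NT_START) == three_to_one(AA_START))  # True
```

Running the same comparison over a complete listing is a quick way to spot residues or bases lost in transcription, since any dropped character shifts or breaks the match from that point on.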
[0036] OSTs from other Leishmania species or strains that share sequence identity with the L. major OST STT3 subunit related protein and/or are capable of transferring a glycan moiety to a target glycoprotein are also suitable for use in the present invention. For example, homologous OSTs derived from L. donovani (SEQ ID NO: 21; UniProtKB No. E9BRZ2), L. infantum (SEQ ID NO: 22; UniProtKB No. A4IB10), L. mexicana (SEQ ID NO: 23; UniProtKB No. E9B5Z4), and L. braziliensis (SEQ ID NO: 24; UniProtKB No. A4HMD6), which each share greater than 70% sequence identity with the amino acid sequence of L. major OST (see alignment of Figure 8), are also suitable for use in this and all aspects of the present invention. An STT3 consensus sequence based on the alignment of Leishmania STT3 sequences is presented as SEQ ID NO: 25 in Figure 8. Residues that are not fully conserved between the five Leishmania sequences are depicted as X, where X can be any amino acid residue. Alternatively, X is selected from an amino acid residue at the corresponding position in one of the five depicted Leishmania sequences.
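The way a consensus of this kind is derived from an alignment can be illustrated with a minimal Python sketch. It assumes the sequences are already aligned to equal length and in one-letter code; the function names and the toy alignment are hypothetical and are not taken from Figure 8.

```python
def consensus(aligned):
    """Return a consensus string for equal-length aligned sequences.

    A column keeps its residue only if every sequence agrees;
    any other column (including one containing a gap) becomes 'X'.
    """
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("aligned sequences must have equal length")
    out = []
    for column in zip(*aligned):
        fully_conserved = len(set(column)) == 1 and column[0] != "-"
        out.append(column[0] if fully_conserved else "X")
    return "".join(out)


def allowed_residues(aligned, position):
    """Residues observed at a 0-based column: the narrower reading of X."""
    return sorted({s[position] for s in aligned if s[position] != "-"})


# Toy alignment (hypothetical; NOT the Leishmania sequences of Figure 8)
toy = ["MAVLK", "MAILK", "MAVLR"]
print(consensus(toy))            # MAXLX
print(allowed_residues(toy, 2))  # ['I', 'V']
```

The second function corresponds to the alternative reading, in which X is limited to a residue found at that position in one of the aligned sequences.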
[0037] In another embodiment of the present invention, the eukaryotic oligosaccharyltransferase is STT3 from Saccharomyces cerevisiae. The amino acid sequence of S. cerevisiae (UniProtKB Accession No. P39007) is provided below as SEQ ID NO: 26.
Met Gly Ser Asp Arg Ser Cys Val Leu Ser Val Phe Gln Thr Ile Leu
1 5 10 15
Lys Leu Val Ile Phe Val Ala Ile Phe Gly Ala Ala Ile Ser Ser Arg
20 25 30
Leu Phe Ala Val Ile Lys Phe Glu Ser Ile Ile His Glu Phe Asp Pro
35 40 45
Trp Phe Asn Tyr Arg Ala Thr Lys Tyr Leu Val Asn Asn Ser Phe Tyr
50 55 60
Lys Phe Leu Asn Trp Phe Asp Asp Arg Thr Trp Tyr Pro Leu Gly Arg
65 70 75 80
Val Thr Gly Gly Thr Leu Tyr Pro Gly Leu Met Thr Thr Ser Ala Phe
85 90 95
Ile Trp His Ala Leu Arg Asn Trp Leu Gly Leu Pro Ile Asp Ile Arg
100 105 110
Asn Val Cys Val Leu Phe Ala Pro Leu Phe Ser Gly Val Thr Ala Trp
115 120 125
Ala Thr Tyr Glu Phe Thr Lys Glu Ile Lys Asp Ala Ser Ala Gly Leu
130 135 140
Leu Ala Ala Gly Phe Ile Ala Ile Val Pro Gly Tyr Ile Ser Arg Ser
145 150 155 160
Val Ala Gly Ser Tyr Asp Asn Glu Ala Ile Ala Ile Thr Leu Leu Met
165 170 175
Val Thr Phe Met Phe Trp Ile Lys Ala Gln Lys Thr Gly Ser Ile Met
180 185 190
His Ala Thr Cys Ala Ala Leu Phe Tyr Phe Tyr Met Val Ser Ala Trp
195 200 205
Gly Gly Tyr Val Phe Ile Thr Asn Leu Ile Pro Leu His Val Phe Leu
210 215 220
Leu Ile Leu Met Gly Arg Tyr Ser Ser Lys Leu Tyr Ser Ala Tyr Thr
225 230 235 240
Thr Trp Tyr Ala Ile Gly Thr Val Ala Ser Met Gln Ile Pro Phe Val
245 250 255
Gly Phe Leu Pro Ile Arg Ser Asn Asp His Met Ala Ala Leu Gly Val
260 265 270
Phe Gly Leu Ile Gln Ile Val Ala Phe Gly Asp Phe Val Lys Gly Gln
275 280 285
Ile Ser Thr Ala Lys Phe Lys Val Ile Met Met Val Ser Leu Phe Leu
290 295 300
Ile Leu Val Leu Gly Val Val Gly Leu Ser Ala Leu Thr Tyr Met Gly
305 310 315 320
Leu Ile Ala Pro Trp Thr Gly Arg Phe Tyr Ser Leu Trp Asp Thr Asn
325 330 335
Tyr Ala Lys Ile His Ile Pro Ile Ile Ala Ser Val Ser Glu His Gln
340 345 350
Pro Val Ser Trp Pro Ala Phe Phe Phe Asp Thr His Phe Leu Ile Trp
355 360 365
Leu Phe Pro Ala Gly Val Phe Leu Leu Phe Leu Asp Leu Lys Asp Glu
370 375 380
His Val Phe Val Ile Ala Tyr Ser Val Leu Cys Ser Tyr Phe Ala Gly
385 390 395 400
Val Met Val Arg Leu Met Leu Thr Leu Thr Pro Val Ile Cys Val Ser
405 410 415
Ala Ala Val Ala Leu Ser Lys Ile Phe Asp Ile Tyr Leu Asp Phe Lys
420 425 430
Thr Ser Asp Arg Lys Tyr Ala Ile Lys Pro Ala Ala Leu Leu Ala Lys
435 440 445
Leu Ile Val Ser Gly Ser Phe Ile Phe Tyr Leu Tyr Leu Phe Val Phe
450 455 460
His Ser Thr Trp Val Thr Arg Thr Ala Tyr Ser Ser Pro Ser Val Val
465 470 475 480
Leu Pro Ser Gln Thr Pro Asp Gly Lys Leu Ala Leu Ile Asp Asp Phe
485 490 495
Arg Glu Ala Tyr Tyr Trp Leu Arg Met Asn Ser Asp Glu Asp Ser Lys
500 505 510
Val Ala Ala Trp Trp Asp Tyr Gly Tyr Gln Ile Gly Gly Met Ala Asp
515 520 525
Arg Thr Thr Leu Val Asp Asn Asn Thr Trp Asn Asn Thr His Ile Ala
530 535 540
Ile Val Gly Lys Ala Met Ala Ser Pro Glu Glu Lys Ser Tyr Glu Ile
545 550 555 560
Leu Lys Glu His Asp Val Asp Tyr Val Leu Val Ile Phe Gly Gly Leu
565 570 575
Ile Gly Phe Gly Gly Asp Asp Ile Asn Lys Phe Leu Trp Met Ile Arg
580 585 590
Ile Ser Glu Gly Ile Trp Pro Glu Glu Ile Lys Glu Arg Tyr Phe Tyr
595 600 605
Thr Ala Glu Gly Glu Tyr Arg Val Asp Ala Arg Ala Ser Glu Thr Met
610 615 620
Arg Asn Ser Leu Leu Tyr Lys Met Ser Tyr Lys Asp Phe Pro Gln Leu
625 630 635 640
Phe Asn Gly Gly Gln Ala Thr Asp Arg Val Arg Gln Gln Met Ile Thr
645 650 655
Pro Leu Asp Val Pro Pro Leu Asp Tyr Phe Asp Glu Val Phe Thr Ser
660 665 670
Glu Asn Trp Met Val Arg Ile Tyr Gln Leu Lys Lys Asp Asp Ala Gln
675 680 685
Gly Arg Thr Leu Arg Asp Val Gly Glu Leu Thr Arg Ser Ser Thr Lys
690 695 700
Thr Arg Arg Ser Ile Lys Arg Pro Glu Leu Gly Leu Arg Val
705 710 715
[0038] Amino acid sequences sharing at least about 70 percent, more preferably at least about 75 percent or 80 percent, most preferably at least about 85 percent or 90 percent or 95 percent sequence identity as compared to the S. cerevisiae amino acid sequence of SEQ ID NO: 26 are also suitable for use in the present invention. The nucleic acid sequence encoding the amino acid sequence of SEQ ID NO: 26 (S. cerevisiae STT3) is provided below as SEQ ID NO: 27 (EMBL Nucleotide Sequence Database No. BAA06079).
atgggatccg accggtcgtg tgttttgtct gtgtttcaga ccatcctcaa gctcgtcatc 60 ttcgtggcga tttttggggc tgccatatca tcacgtttgt ttgcagtcat caaatttgag 120 tctattatcc atgaattcga cccctggttc aattataggg ctaccaaata tctcgtcaac 180 aattcgtttt acaagttttt gaactggttt gacgaccgta cctggtaccc cctcggaagg 240 gttactggag ggactttata tcctggtttg atgacgacta gtgcgttcat ctggcacgcc 300 ctgcgcaact ggttgggctt gcccattgac atcagaaacg tttgtgtgct atttgcgcca 360 ctattttctg gggtcaccgc ctgggcgact tacgaattta cgaaagagat taaagatgcc 420 agcgctgggc ttttggctgc tggttttata gccattgtcc ccggttatat atctagatca 480 gtggcggggt cctacgataa tgaggccatt gccattacac tattaatggt cactttcatg 540 ttttggatta aggcccaaaa gactggctct atcatgcacg caacgtgtgc agctttattc 600 tacttctaca tggtgtcggc ttggggtgga tacgtgttca tcaccaactt gatcccactc 660 catgtctttt tgctgatttt gatgggcaga tattcgtcca aactgtattc tgcctacacc 720 acttggtacg ctattggaac tgttgcatcc atgcagatcc catttgtcgg tttcctacct 780 atcaggtcta acgaccacat ggccgcattg ggtgttttcg gtttgattca gattgtcgcc 840 ttcggtgact tcgtgaaggg ccaaatcagc acagctaagt ttaaagtcat catgatggtt 900 tctctgtttt tgatcttggt ccttggtgtg gtcggacttt ctgccttgac ctatatgggg 960 ttgattgccc cttggactgg tagattttat tcgttatggg ataccaacta cgcaaagatc 1020 cacattccta tcattgcctc cgtttccgaa catcaacccg tttcgtggcc cgctttcttc 1080 tttgataccc actttttgat ctggctattc cccgccggtg tattcctact attcctcgac 1140 ttgaaagacg agcacgtttt tgtcatcgct tactccgttc tgtgttcgta ctttgccggt 1200 gttatggtta gattgatgtt gactttgaca ccagtcatct gtgtgtccgc cgccgtcgca 1260 ttgtccaaga tatttgacat ctacctggat ttcaagacaa gtgaccgcaa atacgccatc 1320 aaacctgcgg cactactggc caaattgatt gtttccggat cattcatctt ttatttgtat 1380 cttttcgtct tccattctac ttgggtaaca agaactgcat actcttctcc ttctgttgtt 1440 ttgccatcac aaaccccaga tggtaaattg gcgttgatcg acgacttcag ggaagcgtac 1500 tattggttaa gaatgaactc tgatgaggac agtaaggttg cagcgtggtg ggattacggt 1560 taccaaattg gtggcatggc agacagaacc actttagtcg ataacaacac gtggaacaat 1620 actcacatcg ccatcgttgg taaagccatg gcttcccctg aagagaaatc ttacgaaatt 1680 ctaaaagagc atgatgtcga 
ttatgtcttg gtcatctttg gtggtctaat tgggtttggt 1740 ggtgatgaca tcaacaaatt cttgtggatg atcagaatta gcgagggaat ctggccagaa 1800 gagataaaag agcgttattt ctataccgca gagggagaat acagagtaga tgcaagggct 1860 tctgagacca tgaggaactc gctactttac aagatgtcct acaaagattt cccacaatta 1920 ttcaatggtg gccaagccac tgacagagtg cgtcaacaaa tgatcacacc attagacgtc 1980 ccaccattag actacttcga cgaagttttt acttccgaaa actggatggt tagaatatat 2040 caattgaaga aggatgatgc ccaaggtaga actttgaggg acgttggtga gttaaccagg 2100 tcttctacga aaaccagaag gtccataaag agacctgaat taggcttgag agtctaa 2157
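The percent-identity thresholds recited for SEQ ID NO: 26 can be illustrated with a short Python sketch. It is a simplified illustration only: it assumes the two sequences have already been aligned to equal length (with "-" for gaps) and it counts identity over gap-free columns, which is one of several conventions in use. The function names and the variant sequence are hypothetical.

```python
def percent_identity(a, b):
    """Percent identity over gap-free columns of two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    # Keep only columns where both sequences have a residue.
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no aligned residue pairs")
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def meets_threshold(a, b, threshold=70.0):
    """True if the identity is at least the given percentage."""
    return percent_identity(a, b) >= threshold


# First ten residues of SEQ ID NO: 26 in one-letter code, compared with a
# hypothetical variant carrying two substitutions (illustration only).
reference = "MGSDRSCVLS"
variant = "MGSDKSCVIS"
print(percent_identity(reference, variant))       # 80.0
print(meets_threshold(reference, variant, 70.0))  # True
```

In practice the alignment itself would be produced by a dedicated alignment tool; the sketch covers only the scoring step.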
[0039] In another embodiment of the present invention, the eukaryotic oligosaccharyltransferase is STT3 from Schizosaccharomyces pombe. The amino acid sequence of S. pombe (UniProtKB Accession No. O94335) is provided below as SEQ ID NO: 28.
Met Ala Asn Ser Ala Thr Ile Thr Ser Lys Lys Gly Val Lys Ser His
1 5 10 15
Gln Lys Asp Trp Lys Ile Pro Leu Lys Val Leu Ile Leu Ile Cys Ile
20 25 30
Ala Val Ala Ser Val Ser Ser Arg Leu Phe Ser Val Ile Arg Tyr Glu
35 40 45
Ser Ile Ile His Glu Phe Asp Pro Trp Phe Asn Phe Arg Ala Ser Lys
50 55 60
Ile Leu Val Glu Gln Gly Phe Tyr Asn Phe Leu Asn Trp Phe Asp Glu
65 70 75 80
Arg Ser Trp Tyr Pro Leu Gly Arg Val Ala Gly Gly Thr Leu Tyr Pro
85 90 95
Gly Leu Met Val Thr Ser Gly Ile Ile Phe Lys Val Leu His Leu Leu
100 105 110
Arg Ile Asn Val Asn Ile Arg Asp Val Cys Val Leu Leu Ala Pro Ala
115 120 125
Phe Ser Gly Ile Thr Ala Ile Ala Thr Tyr Tyr Leu Ala Arg Glu Leu
130 135 140
Lys Ser Asp Ala Cys Gly Leu Leu Ala Ala Ala Phe Met Gly Ile Ala
145 150 155 160
Pro Gly Tyr Thr Ser Arg Ser Val Ala Gly Ser Tyr Asp Asn Glu Ala
165 170 175
Ile Ala Ile Thr Leu Leu Met Ser Thr Phe Ala Leu Trp Ile Lys Ala
180 185 190
Val Lys Ser Gly Ser Ser Phe Trp Gly Ala Cys Thr Gly Leu Leu Tyr
195 200 205
Phe Tyr Met Val Thr Ala Trp Gly Gly Tyr Val Phe Ile Thr Asn Met
210 215 220
Ile Pro Leu His Val Phe Val Leu Leu Leu Met Gly Arg Tyr Thr Ser
225 230 235 240
Lys Leu Tyr Ile Ala Tyr Thr Thr Tyr Tyr Val Ile Gly Thr Leu Ala
245 250 255
Ser Met Gln Val Pro Phe Val Gly Phe Gln Pro Val Ser Thr Ser Glu
260 265 270
His Met Ser Ala Leu Gly Val Phe Gly Leu Leu Gln Leu Phe Ala Phe
275 280 285
Tyr Asn Tyr Val Lys Gly Leu Val Ser Ser Lys Gln Phe Gln Ile Leu
290 295 300
Ile Arg Phe Ala Leu Val Cys Leu Val Gly Leu Ala Thr Val Val Leu
305 310 315 320
Phe Ala Leu Ser Ser Thr Gly Val Ile Ala Pro Trp Thr Gly Arg Phe
325 330 335
Tyr Ser Leu Trp Asp Thr Asn Tyr Ala Lys Ile His Ile Pro Ile Ile
340 345 350
Ala Ser Val Ser Glu His Gln Pro Pro Thr Trp Ser Ser Leu Phe Phe
355 360 365
Asp Leu Gln Phe Leu Ile Trp Leu Leu Pro Val Gly Val Tyr Leu Cys
370 375 380
Phe Lys Glu Leu Arg Asn Glu His Val Phe Ile Ile Ile Tyr Pro Val
385 390 395 400
Leu Gly Thr Tyr Phe Cys Gly Val Met Val Arg Leu Val Leu Thr Leu
405 410 415
Thr Pro Cys Val Cys Ile Ala Ala Ala Val Ala Ile Ser Thr Leu Leu
420 425 430
Asp Thr Tyr Met Gly Pro Glu Val Glu Glu Asp Lys Val Ser Glu Glu
435 440 445
Ala Ala Ser Ala Lys Ser Lys Asn Lys Lys Gly Ile Ser Ser Ile Leu
450 455 460
Ser Phe Phe Thr Ser Gly Ser Lys Asn Ile Gly Ile Tyr Ser Leu Leu
465 470 475 480
Ser Arg Val Leu Val Ile Ser Ser Thr Ala Tyr Phe Leu Ile Met Phe
485 490 495
Val Tyr His Ser Ser Trp Val Thr Ser Asn Ala Tyr Ser Ser Pro Thr
500 505 510
Val Val Leu Ser Thr Val Leu Asn Asp Gly Ser Leu Met Tyr Ile Asp
515 520 525
Asp Phe Arg Glu Ala Tyr Asp Trp Leu Arg Arg Asn Thr Pro Tyr Asp
530 535 540
Thr Lys Val Met Ser Trp Trp Asp Tyr Gly Tyr Gln Ile Ala Gly Met
545 550 555 560
Ala Asp Arg Ile Thr Leu Val Asp Asn Asn Thr Trp Asn Asn Thr His
565 570 575
Ile Ala Thr Val Gly Lys Ala Met Ser Ser Pro Glu Glu Lys Ala Tyr
580 585 590
Pro Ile Leu Arg Lys His Asp Val Asp Tyr Ile Leu Ile Ile Tyr Gly
595 600 605
Gly Thr Leu Gly Tyr Ser Ser Asp Asp Met Asn Lys Phe Leu Trp Met
610 615 620
Ile Arg Ile Ser Gln Gly Leu Trp Pro Asp Glu Ile Val Glu Arg Asn
625 630 635 640
Phe Phe Thr Pro Asn Gly Glu Tyr Arg Thr Asp Asp Ala Ala Thr Pro
645 650 655
Thr Met Arg Glu Ser Leu Leu Tyr Lys Met Ser Tyr His Gly Ala Trp
660 665 670
Lys Leu Phe Pro Pro Asn Gln Gly Tyr Asp Arg Ala Arg Asn Gln Lys
675 680 685
Leu Pro Ser Lys Asp Pro Gln Leu Phe Thr Ile Glu Glu Ala Phe Thr
690 695 700
Thr Val His His Leu Val Arg Leu Tyr Lys Val Lys Lys Pro Asp Thr
705 710 715 720
Leu Gly Arg Asp Leu Lys Gln Val Thr Leu Phe Glu Glu Gly Lys Arg
725 730 735
Lys Lys Ser Ala Val Leu Gln Lys Leu Thr Lys Phe Leu
740 745
[0040] Amino acid sequences sharing at least about 70 percent, more preferably at least about 75 percent or 80 percent, most preferably at least about 85 percent or 90 percent or 95 percent sequence identity as compared to the S. pombe amino acid sequence of SEQ ID NO: 28 are also suitable for use in the present invention. The nucleic acid sequence encoding the amino acid sequence of SEQ ID NO: 28 (S. pombe STT3) is provided below as SEQ ID NO: 29 (EMBL Nucleotide Sequence Database No. BAA76479).
atggctaatt ctgctacaat tacgagtaaa aaaggcgtga agtctcatca gaaggactgg 60 aaaattccac ttaaagtgct cattcttata tgtattgctg tggcttctgt ctcttcgagg 120 cttttttctg tcattcgtta cgagtccatt attcatgaat ttgatccttg gttcaatttc 180 cgagcttcca aaatattggt ggaacaaggt ttttataact ttttaaattg gtttgatgaa 240 agaagttggt acccgttggg tcgtgtagcg ggtggtactt tgtacccagg acttatggtc 300 acgtctggta ttattttcaa agttttacat cttttaagaa ttaacgtgaa catccgtgat 360 gtatgtgttt tacttgcccc tgctttctct ggaatcactg cgattgctac ctattatctg 420 gctagagaat tgaaaagtga tgcatgtggc cttttagctg ccgcatttat gggtattgct 480 cctggataca cctcccgttc cgtcgctggt tcttacgata atgaagcaat tgctattacc 540 cttttgatgt caacgtttgc tttgtggatc aaggcagtga agtctggctc ctctttctgg 600 ggtgcctgca caggattgct ctacttctat atggtaactg cgtggggtgg ttatgtattc 660 atcacaaaca tgataccttt acacgtattt gttcttctac ttatgggtcg ctatactagc 720 aaattataca ttgcttacac aacatactat gttattggaa cgctggcttc tatgcaagtt 780 ccgtttgttg gtttccaacc cgtgtcgact agtgagcata tgtccgcttt aggagtgttt 840 ggcctgttac agctttttgc attctacaat tatgttaaag gtctagtttc atccaagcaa 900 ttccaaatac ttattcgttt tgccttggtt tgcttagtgg gtctagcaac agtcgtcctt 960 tttgctttat cttcaacagg tgttatcgct ccttggacag gacgtttcta ttctctttgg 1020 gatacaaact acgccaagat tcatattcct atcattgctt cggtatcaga acatcagcct 1080 cctacttgga gttcgttgtt ctttgatctt caatttttga tttggttatt gccagttggt 1140 gtttacttgt gtttcaagga acttcgtaat gaacatgtct ttattattat atatcctgtc 1200 ttaggaacat atttttgtgg tgtgatggtt cgtttggttt taaccttaac tccttgtgtt 1260 tgcatagctg ctgctgtagc aatttccact cttttagaca catatatggg tcctgaagtt 1320 gaagaggaca aagtgagcga agaagccgct tcagccaaat ctaagaacaa gaaaggtatt 1380 tcctctattc ttagtttctt cacttctggc tcaaaaaata ttggaattta cagtttgctt 1440 tccagagtat tagtcatttc ctctaccgca tatttcctaa taatgtttgt ttatcattcc 1500 agttgggtga cttctaatgc ttactcttcc cctaccgtgg ttttgtctac cgtgttaaac 1560 gatggtagtt taatgtatat tgatgacttc cgtgaagctt atgactggct tcgtagaaac 1620 actccttatg acacaaaggt tatgagttgg tgggattatg gttaccaaat tgctggtatg 1680 gctgatcgta ttactttagt 
cgacaacaat acgtggaaca acacacatat tgccacagtt 1740 ggaaaagcca tgtcttcacc tgaagaaaaa gcttacccta tcctccgtaa acacgatgtt 1800 gattatattc ttattatata tggtggtact cttggataca gcagcgacga catgaacaag 1860 ttcctttgga tgatccgaat ttctcaggga ttatggcccg atgaaatagt agagcgtaac 1920 ttttttactc ctaatggaga atatcgaact gacgatgcgg ctactcccac tatgcgtgag 1980 tctttattat ataagatgtc atatcacggt gcttggaaac ttttccctcc caatcaagga 2040 tatgaccgtg ctcgcaatca aaaactacca tcgaaagatc ctcaactatt tactatcgaa 2100 gaagcattca ctaccgttca tcatttagtt cgtttgtata aggttaagaa accggataca 2160 cttggacgcg atttgaaaca agtgacatta tttgaagaag gcaaaagaaa gaagtccgcc 2220 gtcctgcaaa aactaacgaa attcctttga 2250
[0041] In another embodiment of the present invention, the eukaryotic oligosaccharyltransferase is STT3 from Dictyostelium discoideum. The amino acid sequence of D. discoideum (UniProtKB Accession No. Q54NM9) is provided below as SEQ ID NO: 30.
Met Lys Arg Ser Glu Lys Ser Ser Thr Ser Val Val Ser Asn Asn Lys
1 5 10 15
Gln Gln Asp Val Asn Ile Ile Ser Ser Asn Glu Val Gly Val Lys Glu
20 25 30
Glu Asn Lys Gly His Gln Glu Phe Leu Leu Lys Val Leu Ile Leu Ser
35 40 45
Val Ile Tyr Val Leu Ala Phe Ser Thr Arg Leu Phe Ser Val Leu Arg
50 55 60
Tyr Glu Ser Val Ile His Glu Phe Asp Pro Tyr Phe Asn Tyr Arg Ser
65 70 75 80
Thr Ile Tyr Leu Val Gln Glu Gly Phe Tyr Asn Phe Leu Asn Trp Phe
85 90 95
Asp Glu Arg Ala Trp Tyr Pro Leu Gly Arg Ile Val Gly Gly Thr Ile
100 105 110
Tyr Pro Gly Leu Met Ala Thr Ala Ser Leu Val His Trp Ser Leu Asn
115 120 125
Ser Leu Asn Ile Thr Val Asn Ile Arg Asn Val Cys Val Leu Leu Ser
130 135 140
Pro Trp Phe Ala Ser Asn Thr Ala Met Val Thr Tyr Lys Phe Ala Lys
145 150 155 160
Glu Val Lys Asp Thr Gln Thr Gly Leu Val Ala Ala Ala Met Ile Ala
165 170 175
Ile Val Pro Gly Tyr Ile Ser Arg Ser Val Ala Gly Ser Phe Asp Asn
180 185 190
Glu Gly Ile Ala Ile Phe Ala Leu Ile Phe Thr Tyr Tyr Cys Trp Ile
195 200 205
Lys Ser Val Asn Thr Gly Ser Leu Met Trp Ala Ala Ile Cys Ser Leu
210 215 220
Ala Tyr Phe Tyr Met Ala Ser Ala Trp Gly Gly Tyr Val Phe Ile Ile
225 230 235 240
Asn Leu Ile Pro Leu His Ala Phe Phe Leu Leu Leu Thr Gly Arg Tyr
245 250 255
Ser His Arg Leu Tyr Ile Ala Tyr Ser Thr Met Phe Val Ile Gly Thr
260 265 270
Ile Leu Ser Met Gln Ile Thr Phe Ile Ser Phe Gln Pro Val Gln Ser
275 280 285
Ser Glu His Leu Ala Ala Ile Gly Ile Phe Gly Leu Leu Gln Leu Tyr
290 295 300
Ala Gly Leu Ser Trp Val Lys Ser His Leu Thr Asn Glu Ala Phe Lys
305 310 315 320
Lys Leu Gln Arg Leu Thr Val Leu Phe Val Leu Ser Cys Ala Ala Ala
325 330 335
Val Leu Val Val Gly Thr Leu Thr Gly Tyr Ile Ser Pro Phe Asn Gly
340 345 350
Arg Phe Tyr Ser Leu Leu Asp Pro Thr Tyr Ala Arg Asp His Ile Pro
355 360 365
Ile Ile Ala Ser Val Ser Glu His Gln Pro Thr Thr Trp Ala Ser Tyr
370 375 380
Phe Phe Asp Leu His Ile Leu Val Phe Leu Phe Pro Ala Gly Leu Tyr
385 390 395 400
Phe Cys Phe Gln Lys Leu Thr Asp Ala Asn Ile Phe Leu Ile Leu Tyr
405 410 415
Gly Val Thr Ser Ile Tyr Phe Ser Gly Val Met Val Arg Leu Met Leu
420 425 430
Val Leu Ala Pro Val Ala Cys Ile Leu Ala Ala Val Ala Val Ser Ala
435 440 445
Thr Leu Thr Thr Tyr Met Lys Lys Leu Lys Ala Pro Ser Ser Pro Ser
450 455 460
Asp Ala Asn Asn Ser Lys Glu Ser Gly Gly Val Met Val Ala Val Leu
465 470 475 480
Thr Val Leu Leu Ile Leu Tyr Ala Phe His Cys Thr Trp Val Thr Ser
485 490 495
Glu Ala Tyr Ser Ser Pro Ser Ile Val Leu Ser Ala Lys Gln Asn Asp
500 505 510
Gly Ser Arg Val Ile Phe Asp Asp Phe Arg Glu Ala Tyr Arg Trp Ile
515 520 525
Gly Gln Asn Thr Ala Asp Asp Ala Arg Ile Met Ser Trp Trp Asp Tyr
530 535 540
Gly Tyr Gln Leu Ser Ala Met Ala Asn Arg Thr Val Leu Val Asp Asn
545 550 555 560
Asn Thr Trp Asn Asn Ser His Ile Ala Gln Val Gly Lys Ala Phe Ala
565 570 575
Ser Thr Glu Glu Asp Ala Tyr Ile Gln Met Lys Ala Leu Asp Val Asp
580 585 590
Tyr Val Leu Val Ile Phe Gly Gly Leu Thr Gly Tyr Ser Ser Asp Asp
595 600 605
Ile Asn Lys Phe Leu Trp Met Val Arg Ile Gly Gly Ser Cys Asp Pro
610 615 620
Asn Ile Lys Glu Gln Asp Tyr Leu Thr Asn Gly Gln Tyr Arg Ile Asp
625 630 635 640
Lys Gly Ala Ser Pro Thr Met Leu Asn Ser Leu Met Tyr Lys Leu Ser
645 650 655
Tyr Tyr Arg Phe Ser Glu Val His Thr Asp Tyr Gln Arg Pro Thr Gly
660 665 670
Phe Asp Arg Val Arg Asn Val Glu Ile Gly Asn Lys Asn Phe Asp Leu
675 680 685
Thr Tyr Leu Glu Glu Ala Phe Thr Ser Val His Trp Leu Val Arg Val
690 695 700
Tyr Lys Val Lys Asp Phe Asp Asn Arg Ala
705 710
[0042] Amino acid sequences sharing at least about 70 percent, more preferably at least about 75 percent or 80 percent, most preferably at least about 85 percent or 90 percent or 95 percent sequence identity as compared to the D. discoideum amino acid sequence of SEQ ID NO: 30 are also suitable for use in the present invention. The nucleic acid sequence encoding the amino acid sequence of SEQ ID NO: 30 (D. discoideum STT3) is provided below as SEQ ID NO: 31 (EMBL Nucleotide Sequence Database No. EAL64892).
atgaaaagat cagaaaaatc aagtacatct gttgttagta ataacaaaca acaagatgta 60 aatatcatca gttcaaatga agttggtgtt aaagaagaaa ataaaggaca tcaagaattc 120 ttattaaaag ttttaattct atcagtcatt tatgttttag cattttcaac tcgtttattc 180 tcagtattac gttatgaaag tgttattcat gaatttgatc catattttaa ttatagatca 240 acaatatatc ttgttcaaga aggtttttat aattttttaa attggtttga tgaaagagca 300 tggtatccat taggacgtat tgtaggtggt acaatttacc caggtttaat ggcaacagca 360 agtttagttc attggtcatt gaattcattg aatattacag ttaatattag aaatgtatgt 420 gtattgttat caccatggtt tgcatcaaat acagcaatgg taacctataa atttgccaaa gaagttaagg atacacaaac tggtttggtt gcagcagcca tgattgcaat tgttccaggt tatatttcac gttcagtagc aggttcattc gataatgaag gtattgcaat ctttgcattg attttcacat attattgttg gattaagtca gtaaacacag gctcattgat gtgggctgcc atctgttcat tggcctactt ttatatggca agtgcctggg gtggttatgt attcatcatt aatttaatcc cattgcatgc ctttttcttg cttttgacag gccgttattc acatcgtctc tacatagcct acagcacaat gtttgtcatt ggtacaatcc tctctatgca aattacattc attagtttcc aaccagttca atcatctgaa catttggctg ccattggtat ctttggtctc ctccaattgt acgctggttt gtcatgggta aagagtcacc tcaccaatga agccttcaag aaacttcaac gtttgacagt gttattcgtt ttatcttgtg ctgctgccgt acttgtcgtt ggtacattaa ctggttacat ctcaccattc aatggtcgtt tctattcatt gttggatcca acctatgctc gtgaccacat tccaatcatt gcatcagtat cagagcatca accaaccact tgggcatcat actttttcga tctccatatc ttggtattcc ttttcccagc cggtttatac ttttgtttcc aaaaattaac cgatgctaat attttcctca ttctctacgg tgtcacctcc atttatttct ctggtgtaat ggtacgtctt atgttggttt tagcaccagt tgcatgtatt ttagccgccg ttgccgtcag tgcaaccctc accacctata tgaagaagtt aaaggctcca tcatcaccaa gtgatgctaa taat tccaaa gagagtggtg gtgttatggt tgcagtctta actgttcttt taattctcta cgctttccat tgtacttggg tcactagtga agcctactca tctccatcca ttgtactctc tgccaaacaa aacgatggta gtcgtgtgat tttcgatgat ttccgtgaag cctaccgttg gattggtcaa aatactgccg acgacgctcg tattatgtct tggtgggatt atggttatca attatctgca atggccaatc gtaccgtatt ggttgataat aacacttgga acaatagtca tatcgctcaa gttggtaaag catttgcatc cactgaagaa gatgcttaca tacaaatgaa agcattggat gtcgattatg ttttagttat 
ttttggtggt ttaactggtt acagttctga tgatatcaat aaattccttt ggatggttag aattggtggt agttgtgatc caaatattaa agaacaagat tatctcacca atggtcaata tagaatagat aaaggtgcct caccaacaat gttaaattct ctcatgtaca aacttagtta ctatcgtttc tctgaagttc acactgacta tcaaagacca acaggtttcg atcgtgtaag aaatgttgaa attggtaata aaaatttcga tttaacttat ttagaagaag ctttcacatc tgttcattgg ttagttagag tttataaagt taaagatttt gataatagag cttaa
[0043] Other eukaryotic oligosaccharyltransferases that can be utilized in this and all aspects of the present invention are listed in the table of Figures 9A-9G. This table identifies each oligosaccharyltransferase by its UniProtKB entry number, which provides the amino acid sequence of the protein, and the EMBL database accession number, which provides the encoding nucleotide sequence. The UniProtKB and EMBL accession numbers, along with the corresponding amino acid and nucleotide sequence information for each oligosaccharyltransferase listed in Figure 9, are hereby incorporated by reference in their entirety.
[0044] In another embodiment of the present invention, the oligosaccharyltransferase is an O-linked oligosaccharyltransferase. An exemplary O-linked OST is PilO from Pseudomonas aeruginosa. PilO is responsible for the en bloc transfer of an oligosaccharide from a lipid-linked donor to an oxygen atom of serine and threonine residues (Faridmoayer et al., "Functional Characterization of Bacterial Oligosaccharyltransferases Involved in O-Linked Protein Glycosylation," J. Bacteriol. 189(22): 8088-8098 (2007), which is hereby incorporated by reference in its entirety). The amino acid sequence of P. aeruginosa (UniProtKB Accession No. Q51353) is provided below as SEQ ID NO: 32:
Met Ser Leu Ala Ser Ser Leu Glu Ser Leu Arg Lys Ile Asp Ile Asn
1 5 10 15
Asp Leu Asp Leu Asn Asn Ile Gly Ser Trp Pro Ala Ala Val Lys Val
20 25 30
Ile Val Cys Val Leu Leu Thr Ala Ala Val Leu Ala Leu Gly Tyr Asn
35 40 45
Phe His Leu Ser Asp Met Gln Ala Gln Leu Glu Gln Gln Ala Ala Glu
50 55 60
Glu Glu Thr Leu Lys Gln Gln Phe Ser Thr Lys Ala Phe Gln Ala Ala
65 70 75 80
Asn Leu Glu Ala Tyr Lys Ala Gln Met Lys Glu Met Glu Glu Ser Phe
85 90 95
Gly Ala Leu Leu Arg Gln Leu Pro Ser Asp Thr Glu Val Pro Gly Leu
100 105 110
Leu Glu Asp Ile Thr Arg Thr Gly Leu Gly Ser Gly Leu Glu Phe Glu
115 120 125
Glu Ile Lys Leu Leu Pro Glu Val Ala Gln Gln Phe Tyr Ile Glu Leu
130 135 140
Pro Ile Gln Ile Ser Val Val Gly Gly Tyr His Asp Leu Ala Thr Val
145 150 155 160
Ser Gly Val Ser Ser Leu Pro Arg Ile Val Thr Leu His Asp Phe Glu
165 170 175
Ile Lys Pro Val Ala Pro Gly Ser Thr Ser Lys Leu Arg Met Ser Ile
180 185 190
Leu Ala Lys Thr Tyr Arg Tyr Asn Asp Lys Gly Leu Lys Lys
195 200 205
[0045] Amino acid sequences sharing at least about 70 percent, more preferably at least about 75 percent or 80 percent, most preferably at least about 85 percent or 90 percent or 95 percent sequence identity as compared to the P. aeruginosa amino acid sequence of SEQ ID NO: 32 are also suitable for use in the present invention. The nucleic acid sequence encoding the amino acid sequence of SEQ ID NO: 32 (P. aeruginosa PilO) is provided below as SEQ ID NO: 33 (EMBL Nucleotide Sequence Database No. AAA87404).
atgagtctgg ccagttccct ggaaagtctg cgcaagatcg atatcaacga tctcgacctg 60 aacaacatcg gttcctggcc ggcggcggtc aaggtcatcg tctgcgtgct gctgaccgcg 120 gcggtcctgg cgctgggcta caacttccat ctgagtgaca tgcaggctca gctcgaacag 180 caggccgcgg aagaggagac gctcaagcag cagttctcca ccaaggcctt ccaggccgcg 240 aacctggaag cctacaaggc acagatgaag gagatggaag agtcctttgg cgccttgctg 300 cggcagttgc ccagcgacac cgaggtaccc gggctgctcg aggacatcac tcgtaccggc 360 ctgggcagcg gcctggagtt cgaggaaatc aagctgcttc ccgaggttgc ccagcagttc 420 tacatcgagc tgccgatcca gatcagcgtg gtcggcggct accacgactt ggcgaccttc 480 gtcagcggcg tgtccagcct gccgcggatc gtcaccctgc atgacttcga gatcaagccg 540 gtcgcgcccg gcagcacgtc caagctgcgc atgagcatcc tggccaagac ctatcgctac 600 aacgacaagg ggctgaagaa atga 624
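The correspondence between a nucleotide listing and the amino acid sequence it encodes can be checked with a short Python sketch. It assumes the standard genetic code and the listing layout used above (blocks of ten lowercase bases with running position numbers); the helper names are hypothetical. The example translates only the first 60 bases of SEQ ID NO: 33.

```python
import re
from itertools import product

# Standard genetic code, built in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def clean_listing(text):
    """Drop spaces, position numbers and anything else that is not A/C/G/T."""
    return re.sub(r"[^acgtACGT]", "", text).upper()


def translate(dna, to_stop=True):
    """Translate frame 1 of a DNA string into one-letter amino acids."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*" and to_stop:
            break
        protein.append(aa)
    return "".join(protein)


# First 60 bases of SEQ ID NO: 33 as printed in the listing above.
first_line = ("atgagtctgg ccagttccct ggaaagtctg "
              "cgcaagatcg atatcaacga tctcgacctg 60")
print(translate(clean_listing(first_line)))  # MSLASSLESLRKIDINDLDL
```

The output corresponds to residues 1 to 20 of SEQ ID NO: 32 (Met Ser Leu Ala Ser Ser Leu Glu Ser Leu Arg Lys Ile Asp Ile Asn Asp Leu Asp Leu).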
[0046] Another exemplary O-linked OST suitable for use in all aspects of the present invention is PglL from Neisseria meningitidis (Faridmoayer et al., "Functional Characterization of Bacterial Oligosaccharyltransferases Involved in O-Linked Protein Glycosylation," J. Bacteriol. 189(22): 8088-8098 (2007), which is hereby incorporated by reference in its entirety). The amino acid sequence of N. meningitidis (UniProtKB Accession No. G1FG65) is provided below as SEQ ID NO: 34:
Met Pro Ala Glu Thr Thr Val Ser Gly Ala His Pro Ala Ala Lys Leu
1 5 10 15
Pro Ile Tyr Ile Leu Pro Cys Phe Leu Trp Ile Gly Ile Val Pro Phe
20 25 30
Thr Phe Ala Leu Lys Leu Lys Pro Ser Pro Asp Phe Tyr His Asp Ala
35 40 45
Ala Ala Ala Ala Gly Leu Ile Val Leu Leu Phe Leu Thr Ala Gly Lys
50 55 60
Lys Leu Phe Asp Val Lys Ile Pro Ala Ile Ser Phe Leu Leu Phe Ala
65 70 75 80
Met Ala Ala Phe Trp Tyr Leu Gln Ala Arg Leu Met Asn Leu Ile Tyr
85 90 95
Pro Gly Met Asn Asp Ile Val Ser Trp Ile Phe Ile Leu Leu Ala Val
100 105 110
Ser Ala Trp Ala Cys Arg Ser Leu Val Ala His Phe Gly Gln Glu Arg
115 120 125
Ile Val Thr Leu Phe Ala Trp Ser Leu Leu Ile Gly Ser Leu Leu Gln
130 135 140
Ser Cys Ile Val Val Ile Gln Phe Ala Gly Trp Glu Asp Thr Pro Leu
145 150 155 160
Phe Gln Asn Ile Ile Val Tyr Ser Gly Gln Gly Val Ile Gly His Ile
165 170 175
Gly Gln Arg Asn Asn Leu Gly His Tyr Leu Met Trp Gly Ile Leu Ala
180 185 190
Ala Ala Tyr Leu Asn Gly Gln Arg Lys Ile Pro Ala Ala Leu Gly Val
195 200 205
Ile Cys Leu Ile Met Gln Thr Ala Val Leu Gly Leu Val Asn Ser Arg
210 215 220
Thr Ile Leu Thr Tyr Ile Ala Ala Ile Ala Leu Ile Leu Pro Phe Trp
225 230 235 240
Tyr Phe Arg Ser Asp Lys Ser Asn Arg Arg Thr Met Leu Gly Ile Ala
245 250 255
Ala Ala Val Phe Leu Thr Ala Leu Phe Gln Phe Ser Met Asn Thr Ile
260 265 270
Leu Glu Thr Phe Thr Gly Ile Arg Tyr Glu Thr Ala Val Glu Arg Val
275 280 285
Ala Asn Gly Gly Phe Thr Asp Leu Pro Arg Gln Ile Glu Trp Asn Lys
290 295 300
Ala Leu Ala Ala Phe Gln Ser Ala Pro Ile Phe Gly His Gly Trp Asn
305 310 315 320
Ser Phe Ala Gln Gln Thr Phe Leu Ile Asn Ala Glu Gln His Asn Ile
325 330 335
Tyr Asp Asn Leu Leu Ser Asn Leu Phe Thr His Ser His Asn Ile Val
340 345 350
Leu Gln Leu Leu Ala Glu Met Gly Ile Ser Gly Thr Leu Leu Val Ala
355 360 365
Ala Thr Leu Leu Thr Gly Ile Ala Gly Leu Leu Lys Arg Pro Leu Thr
370 375 380
Pro Ala Ser Leu Phe Leu Ile Cys Thr Leu Ala Val Ser Met Cys His
385 390 395 400
Ser Met Leu Glu Tyr Pro Leu Trp Tyr Val Tyr Phe Leu Ile Pro Phe
405 410 415
Gly Leu Met Leu Phe Leu Ser Pro Ala Glu Ala Ser Asp Gly Ile Ala
420 425 430
Phe Lys Lys Ala Ala Asn Leu Gly Ile Leu Thr Ala Ser Ala Ala Ile
435 440 445
Phe Ala Gly Leu Leu His Leu Asp Trp Thr Tyr Thr Arg Leu Val Asn
450 455 460
Ala Phe Ser Pro Ala Thr Asp Asp Ser Ala Lys Thr Leu Asn Arg Lys
465 470 475 480
Ile Asn Glu Leu Arg Tyr Ile Ser Ala Asn Ser Pro Met Leu Ser Phe
485 490 495
Tyr Ala Asp Phe Ser Leu Val Asn Phe Ala Leu Pro Glu Tyr Pro Glu
500 505 510
Thr Gln Thr Trp Ala Glu Glu Ala Thr Leu Lys Ser Leu Lys Tyr Arg
515 520 525
Pro His Ser Ala Thr Tyr Arg Ile Ala Leu Tyr Leu Met Arg Gln Gly
530 535 540
Lys Val Ala Glu Ala Lys Gln Trp Met Arg Ala Thr Gln Ser Tyr Tyr
545 550 555 560
Pro Tyr Leu Met Pro Arg Tyr Ala Asp Glu Ile Arg Lys Leu Pro Val
565 570 575
Trp Ala Pro Leu Leu Pro Glu Leu Leu Lys Asp Cys Lys Ala Phe Ala
580 585 590
Ala Ala Pro Gly His Pro Glu Ala Lys Pro Cys Lys
595 600
[0047] Amino acid sequences sharing at least about 70 percent, more preferably at least about 75 percent or 80 percent, most preferably at least about 85 percent or 90 percent or 95 percent sequence identity as compared to the N. meningitidis amino acid sequence of SEQ ID NO: 34 are also suitable for use in the present invention. The nucleic acid sequence encoding the amino acid sequence of SEQ ID NO: 34 (N. meningitidis PglL) is provided below as SEQ ID NO: 35 (EMBL Nucleotide Sequence Database No. AEK98518).
atgcccgctg aaacgaccgt atccggcgcg caccccgccg ccaaactgcc gatttacatc 60 ctgccctgct tcctttggat aggcatcgtc ccctttacct tcgcgctcaa actgaaaccg 120 tcgcccgact tttaccacga tgccgccgcc gcagccggcc tgattgtcct gttgttcctc 180 acggcaggaa aaaaactgtt tgatgtcaaa atccccgcca tcagcttcct tctgtttgca 240 atggcggcgt tttggtatct tcaggcacgc ctgatgaacc tgatttaccc cggtatgaac 300 gacatcgtct cttggatttt catcttgctc gccgtcagcg cgtgggcctg ccggagcttg 360 gtcgcacact tcggacaaga acgcatcgtg accctgtttg cctggtcgct gcttatcggc 420 tccctgcttc aatcctgcat cgtcgtcatc cagtttgccg gctgggaaga cacccctctg 480 tttcaaaaca tcatcgttta cagcgggcaa ggcgtaatcg gacacatcgg gcagcgcaac 540 aacctcggac actacctcat gtggggcata ctcgccgccg cctacctcaa cggacaacga 600 aaaatccccg ccgccctcgg cgtaatctgc ctgattatgc agaccgccgt tttaggtttg 660 gtcaactcgc gcaccatctt gacctacata gccgccatcg ccctcatcct tcccttctgg 720 tatttccgtt cggacaaatc caacaggcgg acgatgctcg gcatagccgc agccgtattc 780 cttaccgcgc tgttccaatt ttccatgaac accattctgg aaacctttac tggcatccgc 840 tacgaaactg ccgtcgaacg cgtcgccaac ggcggtttca cagacttgcc gcgccaaatc 900 gaatggaata aagcccttgc cgccttccag tccgccccga tattcgggca cggctggaac 960 agttttgccc aacaaacctt cctcatcaat gccgaacagc acaacatata cgacaacctc 1020 ctcagcaact tgttcaccca ttcccacaac atcgtcctcc aactccttgc agagatggga 1080 atcagcggca cgcttctggt tgccgcaacc ctgctgacgg gcattgccgg gctgcttaaa 1140 cgccccctga cccccgcatc gcttttccta atctgcacgc ttgccgtcag tatgtgccac 1200 agtatgctcg aatatccttt gtggtatgtc tatttcctca tccctttcgg actgatgctc 1260 ttcctgtccc ccgcagaggc ttcagacggc atcgccttca aaaaagccgc caatctcggc 1320 atactgaccg cctccgccgc catattcgca ggattgctgc acttggactg gacatacacc 1380 cggctggtta acgccttttc ccccgccact gacgacagtg ccaaaaccct caaccggaaa 1440 atcaacgagt tgcgctatat ttccgcaaac agtccgatgc tgtcctttta tgccgacttc 1500 tccctcgtaa acttcgccct gccggaatac cccgaaaccc agacttgggc ggaagaagca 1560 accctcaaat cactaaaata ccgcccccac tccgccacct accgcatcgc cctctacctg 1620 atgcggcaag gcaaagttgc agaagcaaaa caatggatgc gggcgacaca gtcctattac 1680 ccctacctga tgccccgata 
cgccgacgaa atccgcaaac tgcccgtatg ggcgccgctg 1740 ctacccgaac tgctcaaaga ctgcaaagcc ttcgccgccg cgcccggtca tccggaagca 1800 aaaccctgca aatga 1815
[0048] As used herein, an "isolated" oligosaccharyltransferase refers to an oligosaccharyltransferase that is substantially pure or substantially separated from other cellular components that naturally accompany the native protein in its natural host cell. Typically, the isolated oligosaccharyltransferase of the present invention is at least about 80% pure, usually at least about 90% pure, and preferably at least about 95% pure. Purity can be assessed using any method known in the art, e.g., polyacrylamide gel electrophoresis, HPLC, etc. The isolated oligosaccharyltransferase can be obtained directly from the organism from which it is derived, or it can be recombinantly produced and purified from a host cell as described in the Examples herein or using techniques readily known in the art as described below.
[0049] Generally, the use of recombinant expression systems to produce and isolate a protein of interest involves inserting a nucleic acid molecule encoding the amino acid sequence of the desired protein into an expression system to which the molecule is heterologous (i.e., not normally present). One or more desired nucleic acid molecules encoding one or more proteins may be inserted into the vector. When multiple nucleic acid molecules are inserted, the multiple nucleic acid molecules may encode the same or different enzymes. The heterologous nucleic acid molecule is inserted into the expression system or vector in proper sense (5'→3') orientation relative to the promoter and any other 5' regulatory molecules, and in the correct reading frame.
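The requirement for sense orientation and correct reading frame can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions: the insert is treated as a bare coding sequence that starts with ATG and ends with a stop codon, and promoters, ribosome binding sites and vector context are ignored. The function names and the 15-base example are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the opposite strand, read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def is_complete_orf(dna):
    """True if dna is ATG ... stop, in frame, with no internal stop codon."""
    if len(dna) % 3 != 0 or not dna.startswith("ATG"):
        return False
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    return codons[-1] in STOPS and not any(c in STOPS for c in codons[:-1])


def insert_orientation(dna):
    """Classify an insert as read from the promoter side (5' to 3')."""
    if is_complete_orf(dna):
        return "sense"
    if is_complete_orf(reverse_complement(dna)):
        return "antisense"
    return "no complete ORF"


# Hypothetical 15-base insert: ATG AGT CTG GCC TGA
orf = "ATGAGTCTGGCCTGA"
print(insert_orientation(orf))                      # sense
print(insert_orientation(reverse_complement(orf)))  # antisense
print(insert_orientation(orf[1:]))                  # no complete ORF
```

The last call shows how the loss of a single base shifts the reading frame so that the coding sequence is no longer recognized.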
[0050] The preparation of the nucleic acid constructs can be carried out using standard cloning procedures well known in the art as described by Joseph Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL (Cold Spring Harbor 1989) and U.S. Patent No. 4,237,224 to Cohen and Boyer, which are hereby incorporated by reference in their entirety. These recombinant plasmids are then introduced by means of transformation and replicated in a suitable host cell.
[0051] A variety of genetic signals and processing events that control many levels of gene expression (e.g., DNA transcription and messenger RNA ("mRNA") translation) can be incorporated into the nucleic acid construct to maximize enzyme production. For the purposes of expressing a cloned nucleic acid sequence encoding one or more desired enzymes, it is advantageous to use strong promoters to obtain a high level of transcription. Depending upon the host system utilized, any one of a number of suitable promoters may be used. For instance, when cloning in E. coli, its bacteriophages, or plasmids, promoters such as the T7 phage promoter, lac promoter, trp promoter, recA promoter, ribosomal RNA promoter, the PR and PL promoters of coliphage lambda and others, including, but not limited to, lacUV5, ompF, bla, lpp, and the like, may be used to direct high levels of transcription of adjacent DNA segments. Additionally, a hybrid trp-lacUV5 (tac) promoter or other E. coli promoters produced by recombinant DNA or other synthetic DNA techniques may be used to provide for transcription of the inserted gene. Common promoters suitable for directing expression in mammalian cells include, without limitation, SV40, MMTV, metallothionein-1, adenovirus E1a, CMV, immediate early, immunoglobulin heavy chain promoter and enhancer, and RSV-LTR.
[0052] There are other specific initiation signals required for efficient gene transcription and translation in prokaryotic cells that can be included in the nucleic acid construct to maximize peptide production, e.g., the Shine-Dalgarno ribosome binding site. Depending on the vector system and host utilized, any number of suitable transcription and/or translation elements, including constitutive, inducible, and repressible promoters, as well as minimal 5' promoter elements, enhancers or leader sequences may be used. For a review on maximizing gene expression see Roberts and Lauer, "Maximizing Gene Expression on a Plasmid Using Recombination In Vitro," Methods in Enzymology 68:473-82 (1979), which is hereby incorporated by reference in its entirety.
[0053] A nucleic acid molecule encoding an oligosaccharyltransferase or other protein component of the present invention (e.g., glycoprotein target, enzymes involved in glycan production), a promoter molecule of choice, including, without limitation, enhancers, and leader sequences, a suitable 3' regulatory region to allow transcription in the host, and any additional desired components, such as reporter or marker genes, are cloned into the vector of choice using standard cloning procedures in the art, such as described in Joseph Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL (Cold Spring Harbor 1989); Frederick M. Ausubel, SHORT PROTOCOLS IN MOLECULAR BIOLOGY (Wiley 1999), and U.S. Patent No. 4,237,224 to Cohen and Boyer, which are hereby incorporated by reference in their entirety.
[0054] Once the nucleic acid molecule encoding the protein or proteins has been cloned into an expression vector, it is ready to be incorporated into a host. Recombinant molecules can be introduced into cells, without limitation, via transfection (if the host is a eukaryote), transduction, conjugation, mobilization, electroporation, lipofection, protoplast fusion, calcium chloride transformation, transfection using bacteriophage, or particle bombardment, using standard cloning procedures known in the art, as described by Joseph Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL (Cold Spring Harbor 1989), which is hereby incorporated by reference in its entirety.
[0055] Suitable host cells for recombinant protein production include both prokaryotic and eukaryotic cells. Suitable prokaryotic host cells include, without limitation, E. coli and other Enterobacteriaceae, Escherichia sp., Campylobacter sp., Wolinella sp., Desulfovibrio sp., Vibrio sp., Pseudomonas sp., Bacillus sp., Listeria sp., Staphylococcus sp., Streptococcus sp., Peptostreptococcus sp., Megasphaera sp., Pectinatus sp., Selenomonas sp., Zymophilus sp., Actinomyces sp., Arthrobacter sp., Frankia sp., Micromonospora sp., Nocardia sp., Propionibacterium sp., Streptomyces sp., Lactobacillus sp., Lactococcus sp., Leuconostoc sp., Pediococcus sp., Acetobacterium sp., Eubacterium sp., Heliobacterium sp., Heliospirillum sp., Sporomusa sp., Spiroplasma sp., Ureaplasma sp., Erysipelothrix sp., Corynebacterium sp., Enterococcus sp., Clostridium sp., Mycoplasma sp., Mycobacterium sp., Actinobacteria sp., Salmonella sp., Shigella sp., Moraxella sp., Helicobacter sp., Stenotrophomonas sp., Micrococcus sp., Neisseria sp., Bdellovibrio sp., Klebsiella sp., Proteus mirabilis, Enterobacter cloacae, Serratia sp., Citrobacter sp., Proteus sp., Yersinia sp., Acinetobacter sp., Actinobacillus sp., Bordetella sp., Brucella sp., Capnocytophaga sp., Cardiobacterium sp., Eikenella sp., Francisella sp., Haemophilus sp., Kingella sp., Pasteurella sp., Flavobacterium sp., Xanthomonas sp., Burkholderia sp., Aeromonas sp., Plesiomonas sp., Legionella sp., and alpha-proteobacteria such as Wolbachia sp., cyanobacteria, spirochaetes, green sulfur and green non-sulfur bacteria, Gram-negative cocci, Gram-negative bacilli which are fastidious, Enterobacteriaceae (glucose-fermenting Gram-negative bacilli), Gram-negative bacilli that are non-glucose fermenters, and Gram-negative bacilli that are glucose-fermenting and oxidase-positive. In addition to bacterial cells, eukaryotic cells such as mammalian, insect, and yeast systems are also suitable host cells for transfection/transformation of the expression vector for recombinant protein production. Mammalian cell lines available in the art for expression of a heterologous protein or polypeptide include Chinese hamster ovary cells, HeLa cells, baby hamster kidney cells, COS cells and many others.
[0056] Purified proteins may be obtained from the host cell by several methods readily known in the art, including ion exchange chromatography, hydrophobic interaction chromatography, affinity chromatography, gel filtration, and reverse phase chromatography. The peptide is preferably produced in purified form (preferably at least about 70 to about 75% pure, or about 80% to 85% pure, more preferably at least about 90% or 95% pure) by conventional techniques. Depending on whether the recombinant host cell is made to secrete the protein into growth medium (see U.S. Patent No. 6,596,509 to Bauer et al., which is hereby incorporated by reference in its entirety), the protein can be isolated and purified by centrifugation (to separate cellular components from supernatant containing the secreted protein) followed by sequential ammonium sulfate precipitation of the supernatant. The fraction containing the protein can be subjected to gel filtration in an appropriately sized dextran or polyacrylamide column to separate the protein from other cellular components and proteins. If necessary, the protein fraction may be further purified by HPLC.
[0057] The oligosaccharyltransferase catalyzes the transfer of a glycan from a lipid donor to an acceptor protein, peptide, or polypeptide. In one embodiment of the present invention, the lipid donor or carrier molecule is a prokaryotic lipid donor, i.e., it is made in a prokaryote or native to the prokaryote. Examples of prokaryotic lipid donors include an undecaprenyl-phosphate and an undecaprenyl phosphate-linked bacillosamine (Weerapana et al., "Investigating Bacterial N-Linked Glycosylation: Synthesis and Glycosyl Acceptor Activity of the Undecaprenyl Pyrophosphate-Linked Bacillosamine," J. Am. Chem. Soc. 127:13766-67 (2005), which is hereby incorporated by reference in its entirety). In another embodiment of the present invention, the lipid donor is a eukaryotic lipid donor, i.e., it is made in a eukaryotic cell or native to the eukaryotic cell. An exemplary eukaryotic lipid donor is dolichylpyrophosphate.
[0058] In accordance with this and all aspects of the present invention, the glycan comprises an oligosaccharide or polysaccharide that is linked to a lipid donor molecule. The composition of the glycan component varies in number and type of monosaccharide units that make up the oligosaccharide or polysaccharide chain. The monosaccharide components of a glycan include, but are not limited to, one or more of glucose (Glc), galactose (Gal), mannose (Man), fucose (Fuc), N-acetylgalactosamine (GalNAc), N-acetylglucosamine (GlcNAc), glucuronic acid, xylose, sialic acid (e.g., N-acetyl-neuraminic acid (NeuAc)), 6-deoxy-talose, and rhamnose monosaccharides.
[0059] In accordance with this and all aspects of the present invention, the glycan can be a prokaryotic, archaea, or eukaryotic glycan. Alternatively, the glycan may comprise a completely unnatural glycan composition.
[0060] In one embodiment of the present invention, the glycan is a prokaryotic glycan that is produced by one or more prokaryotic glycosyltransferases. In another embodiment of the present invention, the prokaryotic glycan is produced using a combination of prokaryotic and eukaryotic glycosyltransferases, but has a monosaccharide composition that mimics a prokaryotic glycan structure. In another embodiment of the present invention, the prokaryotic glycan is synthetically produced (Seeberger et al., Chemical and Enzymatic Synthesis of Glycans and Glycoconjugates, in ESSENTIALS OF GLYCOBIOLOGY (A. Varki et al. eds., 2009), which is hereby incorporated by reference in its entirety).
[0061] An exemplary prokaryotic glycan is a glycan produced by the glycosyltransferases of the C. jejuni, C. coli, C. lari, or C. upsaliensis Pgl gene clusters or a modified C. jejuni, C. coli, C. lari, or C. upsaliensis Pgl gene cluster. Genes of the Pgl cluster include wlaA, galE, wlaB, pglH, pglI, pglJ, pglB, pglA, pglC, pglD, wlaJ, pglE, pglF, and pglG (Szymanski and Wren, "Protein Glycosylation in Bacterial Mucosal Pathogens," Nature Microbiol. 3:225-237 (2005), which is hereby incorporated by reference in its entirety). A prokaryotic glycan typically comprises the diacetamido-trideoxy-sugar, bacillosamine (Bac; 2,4-diacetamido-2,4,6-trideoxyglucose). A suitable prokaryotic glycan of this and all aspects of the present invention is a heptasaccharide comprising glucose, N-acetylgalactosamine, and bacillosamine, i.e., GlcGalNAc5Bac.
[0062] As described in the Examples herein, the glycan of this and all aspects of the present invention can be recombinantly produced. For example, a modified or unmodified C. jejuni pgl gene cluster encoding the enzymes that carry out the biosynthesis of the GlcGalNAc5Bac heptasaccharide and other glycan structures can be isolated and transferred to a suitable host cell for production of a lipid-linked glycan (see also Wacker et al., "N-Linked Glycosylation in Campylobacter jejuni and its Functional Transfer into E. coli," Science 298(5599):1790-93 (2002), which is hereby incorporated by reference in its entirety). Pgl gene clusters from other Campylobacter species, e.g., C. coli, C. lari, and C. upsaliensis, are also suitable for recombinant production of glycans for use in all aspects of the present invention (Szymanski and Wren, "Protein Glycosylation in Bacterial Mucosal Pathogens," Nature Microbiol. 3:225-237 (2005), which is hereby incorporated by reference in its entirety). Additionally, similar Pgl-like glycosylation gene loci have been identified in Wolinella succinogenes, Desulfovibrio desulfuricans, and D. vulgaris that are also suitable for recombinant production of glycans for the present invention (Baar et al., "Complete Genome Sequence and Analysis of Wolinella succinogenes," Proc. Natl. Acad. Sci. USA 100:11690-11695 (2003) and Szymanski and Wren, "Protein Glycosylation in Bacterial Mucosal Pathogens," Nature Microbiol. 3:225-237 (2005), which are hereby incorporated by reference in their entirety).
[0063] The Pgl gene cluster may be modified to enhance lipid-linked glycan production, accumulation, and isolation in the host cell. For example, inactivation of the oligosaccharyltransferase component of the gene cluster (e.g., the pglB gene in the pgl gene cluster) is desirable to prevent transfer of the lipid-linked glycan to a glycoprotein target of the host cell. Additionally, in some embodiments of the present invention, it may be desirable to attenuate, disrupt, or delete competing glycan biosynthesis reactions of the host cell. In particular, inactivation of host cell glycosyltransferase enzymes (N-linked or O-linked reaction enzymes) or other enzymes involved in the transfer or ligation of a glycan to acceptor moieties of the host cell may also be desirable. For instance, when E. coli is utilized as the host cell, deletion of the WaaL enzyme, which transfers glycans from the undecaprenyl lipid carrier onto lipid A, which in turn shuttles the oligosaccharides to the outer leaflet of the outer membrane, will ensure that the recombinantly produced lipid-linked glycans accumulate in the inner membrane. Other E. coli host cell glycosylation-related enzymes that may be deleted, disrupted, or modified include, without limitation, wecA, wbbL, glcT, glf, gafT, wzx, wzy, and enzymes of the O16 antigen biosynthesis pathway.
[0064] In another embodiment of the present invention, the glycan is a eukaryotic glycan, i.e., a glycan produced by one or more eukaryotic glycosyltransferases. In one embodiment of the present invention, a eukaryotic glycan is produced by only eukaryotic glycosyltransferases. In another embodiment of the present invention, the eukaryotic glycan is produced using a combination of both eukaryotic and prokaryotic glycosyltransferase enzymes, but mimics eukaryotic glycan structure. In another embodiment of the present invention, the eukaryotic glycan is synthetically produced (Seeberger et al., Chemical and Enzymatic Synthesis of Glycans and Glycoconjugates, in ESSENTIALS OF GLYCOBIOLOGY (A. Varki et al. eds., 2009), which is hereby incorporated by reference in its entirety).
[0065] In one embodiment, the eukaryotic glycan comprises a GlcNAc2 core. The GlcNAc2 core may further comprise at least one mannose residue. Suitable eukaryotic glycan structures may comprise, but are not limited to, Man1GlcNAc2, Man2GlcNAc2, and Man3GlcNAc2.
[0066] As described above, the eukaryotic lipid-linked glycan can be recombinantly produced by introducing one or more eukaryotic glycosyltransferase enzymes in a suitable host cell. A eukaryotic glycosyltransferase as used herein refers to an enzyme that catalyzes the transfer of a sugar residue from a donor substrate, e.g., from an activated nucleotide sugar, to an acceptor substrate, e.g., a growing lipid-linked oligosaccharide chain. Suitable glycosyltransferase enzymes that can be utilized in host cells to facilitate the recombinant production of a eukaryotic lipid-linked glycan of the system include, without limitation, galactosyltransferases (e.g., β1,4-galactosyltransferase, β1,3-galactosyltransferase), fucosyltransferases, glucosyltransferases, N-acetylgalactosaminyltransferases (e.g., GalNAcT, GalNAc-T1, GalNAc-T2, GalNAc-T3), N-acetylglucosaminyltransferases (e.g., β-1,2-N-acetylglucosaminyltransferase I (GnT-I), GnT-II, GnT-III, GnT-IV, GnT-V, GnT-VI, and GnT-IVH), glucuronyltransferases, sialyltransferases (e.g., α(2,3)sialyltransferase, α-N-acetylgalactosaminide α-2,6-sialyltransferase I, Galβ1,3GalNAc α2,3-sialyltransferase, β-galactoside-α-2,6-sialyltransferase, and α2,8-sialyltransferase), mannosyltransferases (e.g., α-1,6-mannosyltransferase, α-1,3-mannosyltransferase, β-1,4-mannosyltransferase), glucuronic acid transferases, galacturonic acid transferases, and the like. The aforementioned glycosyltransferase enzymes have been extensively studied in a variety of eukaryotic systems. Accordingly, the nucleic acid and amino acid sequences of these enzymes are known and readily available to one of skill in the art. Additionally, many of these enzymes are commercially available (e.g., Sigma-Aldrich, St. Louis, MO).
[0067] Suitable host cells for the production of a prokaryotic or eukaryotic lipid-linked glycan include both prokaryotic host cells and eukaryotic cells. An exemplary list of suitable host cells is provided supra. When utilizing eukaryotic glycosyltransferases in prokaryotic host cells, the nucleotide sequences of the eukaryotic glycosyltransferases can be codon optimized to overcome limitations associated with the codon usage bias between E. coli (and other bacteria) and higher organisms, such as yeast and mammalian cells. Codon usage bias refers to differences among organisms in the frequency of occurrence of codons in protein- coding DNA sequences (genes). A codon is a series of three nucleotides (triplets) that encodes a specific amino acid residue in a polypeptide chain. Codon optimization can be achieved by making specific transversion nucleotide changes, i.e. a purine to pyrimidine or pyrimidine to purine nucleotide change, or transition nucleotide change, i.e. a purine to purine or pyrimidine to pyrimidine nucleotide change.
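The transition and transversion definitions in paragraph [0067] can be illustrated with a short Python sketch that classifies each nucleotide change between a codon and a synonymous replacement. The function names and the example codon pairs (CTA to CTG for leucine, AGA to CGT for arginine) are illustrative assumptions and are not taken from the present disclosure; which codons are preferred depends on the host and should be checked against a codon usage table for that host.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-nucleotide DNA change as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    for base in (ref, alt):
        if base not in PURINES | PYRIMIDINES:
            raise ValueError(f"not a DNA base: {base!r}")
    if ref == alt:
        return "none"
    # Purine<->purine or pyrimidine<->pyrimidine is a transition;
    # a change between the two classes is a transversion.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def codon_changes(original: str, optimized: str) -> list[tuple[int, str]]:
    """List (position, change type) for each position that differs between two codons."""
    if len(original) != 3 or len(optimized) != 3:
        raise ValueError("codons must be exactly three nucleotides long")
    return [
        (i, classify_substitution(a, b))
        for i, (a, b) in enumerate(zip(original, optimized))
        if a.upper() != b.upper()
    ]


# Hypothetical synonymous swaps (illustration only).
print(codon_changes("CTA", "CTG"))  # leucine codon pair
print(codon_changes("AGA", "CGT"))  # arginine codon pair
```

Positions are zero-based within the codon. A full codon optimization routine would also need a host-specific codon usage table, which is deliberately left out of this sketch.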
[0068] In accordance with this and all aspects of the present invention, a "glycoprotein target" includes any peptide, polypeptide, or protein that comprises one or more glycan acceptor amino acid residues. Typically, glycan acceptor residues comprise an asparagine (N or Asn) to form an N-linked glycoprotein, or a hydroxyl oxygen on the side chain of hydroxylysine, hydroxyproline, serine, threonine, or tyrosine to form an O-linked glycoprotein. A wide variety of glycoprotein targets exist including, without limitation, structural molecules (e.g., collagens), lubricant and protective agents (e.g., mucins), transport proteins (e.g., transferrin), immunological proteins (immunoglobulins, histocompatibility antigens), hormones, enzymes, cell attachment recognition sites, receptors, protein folding chaperones, developmentally regulated proteins, and proteins involved in hemostasis and thrombosis. Therapeutic proteins, such as antibodies, are important glycoprotein targets of the system of the present invention.
[0069] According to this and all aspects of the present invention, the one or more oligosaccharide acceptor residues of the glycoprotein target may be an asparagine (N or Asn) residue. The asparagine residue is positioned within a glycosylation consensus sequence comprising N-X1-S/T (eukaryotic consensus sequence) or D/E-X1-N-X2-S/T (SEQ ID NO: 1) (prokaryotic consensus sequence) where D is aspartic acid, X1 and X2 are any amino acid other than proline, N is asparagine, and T is threonine.
[0070] The glycoprotein target according to this and all aspects of the present invention can be a purified protein, peptide, or polypeptide comprising the requisite glycan acceptor residues. Alternatively, the glycoprotein target can be in the form of an isolated nucleic acid molecule encoding the glycoprotein target. In accordance with this embodiment of the present invention, the system further includes reagents suitable for synthesizing the glycoprotein target from said nucleic acid molecule, i.e., translation reagents.
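When screening a candidate glycoprotein target for the requisite acceptor residues, the consensus sequences of paragraph [0069] (N-X-S/T, and D/E-X1-N-X2-S/T with no proline at the X positions) can be expressed as regular expressions. The Python sketch below is one possible way to do so; the function name, the dictionary layout, and the test sequence are hypothetical and serve only as an illustration. Note that a sequence match indicates a candidate site only and does not establish that the site is actually glycosylated.

```python
import re

# Lookahead groups let overlapping motifs be reported individually.
# Each entry: (compiled pattern, zero-based offset of Asn within the motif).
MOTIFS = {
    "eukaryotic": (re.compile(r"(?=(N[^P][ST]))"), 0),
    "prokaryotic": (re.compile(r"(?=([DE][^P]N[^P][ST]))"), 2),
}


def find_acceptor_asparagines(sequence: str, kind: str = "prokaryotic"):
    """Return (1-based Asn position, matched motif) for each consensus site.

    Simplification: [^P] accepts any character except P, so the input is
    assumed to contain only one-letter amino acid codes.
    """
    pattern, asn_offset = MOTIFS[kind]
    return [
        (match.start() + asn_offset + 1, match.group(1))
        for match in pattern.finditer(sequence.upper())
    ]


# Hypothetical test peptide (not a sequence from this disclosure).
peptide = "MKDQNATAANGSP"
print(find_acceptor_asparagines(peptide, "prokaryotic"))
print(find_acceptor_asparagines(peptide, "eukaryotic"))
```

For this peptide the prokaryotic pattern reports only the asparagine at position 5, whereas the shorter eukaryotic pattern also reports the asparagine at position 10, which reflects the additional D/E requirement at the -2 position of the prokaryotic consensus sequence.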
[0071] Reagents for synthesizing proteins from nucleic acid molecules in vitro (i.e., in a cell-free environment) are well known in the art. These reagents or systems typically consist of extracts from rabbit reticulocytes, wheat germ, and E. coli. The extracts contain all the macromolecule components necessary for translation of an exogenous RNA molecule, including, for example, ribosomes, tRNAs, aminoacyl-tRNA synthetases, initiation, elongation, and termination factors. The other required components of the system include amino acids, energy sources (e.g., ATP, GTP), energy regenerating systems (creatine phosphate and creatine phosphokinase for eukaryote systems, and phosphoenol pyruvate and pyruvate kinase for prokaryote systems), and other cofactors (e.g., Mg2+, K+, etc.). If the nucleic acid molecule encoding the glycoprotein target is a DNA molecule, the cell-free translation reaction is coupled or linked to an initial transcription reaction that utilizes an RNA polymerase.
[0072] Another aspect of the present invention is directed to a kit comprising an isolated oligosaccharyltransferase capable of transferring a glycan from a lipid carrier molecule to a glycoprotein target, and one or more isolated glycans, wherein each glycan is linked to a lipid carrier molecule.
[0073] In accordance with this aspect of the present invention, the isolated oligosaccharyltransferase of the kit may be a purified protein or may be in the form of a nucleic acid encoding the oligosaccharyltransferase. The nucleic acid molecule can be a DNA or RNA molecule, and it can be linearized (naked) or circularized (housed in an expression vector). Exemplary prokaryotic, archaea, and eukaryotic oligosaccharyltransferases are described supra.
[0074] As described supra, the one or more glycans are linked to a lipid carrier molecule (e.g., an undecaprenol-pyrophosphate, an undecaprenyl pyrophosphate-linked bacillosamine, or a dolichylpyrophosphate). The glycan may comprise a prokaryotic, archaea, eukaryotic, or completely unnatural synthetic glycan as also described supra. Suitable prokaryotic core glycan structures comprise a heptasaccharide containing glucose, N-acetylgalactosamine, and optionally bacillosamine (e.g., GlcGalNAc5Bac). Suitable eukaryotic glycan core structures comprise N-acetylglucosamine and mannose (e.g., Man1GlcNAc2, Man2GlcNAc2, and Man3GlcNAc2).
[0075] In one embodiment of this aspect of the present invention, the one or more isolated glycans linked to a lipid carrier molecule of the kit are in an assembled and purified form. Alternatively, the kit of the present invention comprises one or more nucleic acid molecules encoding one or more eukaryotic and/or prokaryotic glycosyltransferase enzymes, and host cells (eukaryotic or prokaryotic) that contain a polyisoprenyl pyrophosphate glycan carrier and are capable of expressing the one or more nucleic acid molecules. In accordance with this embodiment, the kit may further contain instructions for recombinantly producing and isolating the lipid-linked glycan in the host cells prior to use with the other kit components.
[0076] The kit of the present invention may further include in vitro or cell-free transcription and/or translation reagents for synthesizing the oligosaccharyltransferase and/or a glycoprotein, peptide or polypeptide of choice.
[0077] Another aspect of the present invention relates to a method for producing a glycosylated protein in a cell-free system. This method involves providing an isolated oligosaccharyltransferase capable of transferring a glycan from a lipid carrier molecule to a glycoprotein target, providing one or more isolated glycans, wherein each glycan is linked to a lipid carrier molecule, and providing a glycoprotein target comprising one or more glycan acceptor amino acid residues. This method further involves combining the oligosaccharyltransferase, one or more isolated glycans, and glycoprotein target to form a cell-free glycosylation reaction mixture, and subjecting the cell-free glycosylation reaction mixture to conditions effective for the oligosaccharyltransferase to transfer the glycan from the lipid carrier molecule to the one or more glycan acceptor residues of the glycoprotein target to produce a glycosylated protein.
[0078] The components of the method of the present invention, i.e., the oligosaccharyltransferase, isolated glycans linked to a lipid carrier molecule, and glycoprotein target are described in detail supra.
[0079] The method of the present invention may comprise one or more additional steps. For example, glycoprotein target translation may be coupled with glycosylation by providing reagents suitable for synthesizing a glycoprotein target from a nucleic acid molecule. In this embodiment of the present invention, the nucleic acid molecule encoding the glycoprotein target, the translation reagents, the oligosaccharyltransferase, and the isolated glycans are all combined to form a translation-glycosylation reaction mixture. The glycoprotein target is then synthesized from the target nucleic acid molecule prior to or concurrent with the glycosylation reaction.
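As a rough illustration of assembling a cell-free glycosylation reaction mixture, the Python sketch below computes pipetting volumes for a 50 μL reaction using the component amounts reported in the Materials and Methods of the Examples (3 μg of purified PglB, 5 μL of extracted LLOs, and 5 μg of purified acceptor protein in 10 mM HEPES, 1 mM MnCl2, and 0.1% DDM). All stock concentrations are hypothetical placeholders, and the sketch ignores the buffer, MnCl2, and DDM already carried in with the protein and LLO stocks, so the resulting volumes are approximate.

```python
def volume_for_mass(mass_ug: float, stock_mg_per_ml: float) -> float:
    """Microliters of stock that deliver mass_ug (1 mg/mL equals 1 ug/uL)."""
    if stock_mg_per_ml <= 0:
        raise ValueError("stock concentration must be positive")
    return mass_ug / stock_mg_per_ml


def volume_for_dilution(final_conc: float, stock_conc: float, total_ul: float) -> float:
    """C1 * V1 = C2 * V2; both concentrations must share the same unit."""
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and at least as concentrated as the target")
    return final_conc * total_ul / stock_conc


def plan_reaction(
    total_ul: float = 50.0,
    pglb_stock_mg_ml: float = 5.0,      # assumed stock concentration
    acceptor_stock_mg_ml: float = 1.0,  # assumed stock concentration
    llo_ul: float = 5.0,
    hepes_stock_mm: float = 1000.0,     # assumed 1 M stock
    mncl2_stock_mm: float = 100.0,      # assumed stock
    ddm_stock_pct: float = 10.0,        # assumed 10% (w/v) stock
) -> dict:
    """Return a component -> volume (uL) plan, topped up with water."""
    plan = {
        "PglB": volume_for_mass(3.0, pglb_stock_mg_ml),
        "acceptor protein": volume_for_mass(5.0, acceptor_stock_mg_ml),
        "LLO extract": llo_ul,
        "HEPES pH 7.5": volume_for_dilution(10.0, hepes_stock_mm, total_ul),
        "MnCl2": volume_for_dilution(1.0, mncl2_stock_mm, total_ul),
        "DDM": volume_for_dilution(0.1, ddm_stock_pct, total_ul),
    }
    used = sum(plan.values())
    if used > total_ul:
        raise ValueError("components exceed the total reaction volume")
    plan["water"] = total_ul - used
    return plan


for component, microliters in plan_reaction().items():
    print(f"{component}: {microliters:.2f} uL")
```

Keeping the stock concentrations as keyword arguments makes it straightforward to substitute measured values for the placeholders used here.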
EXAMPLES
Materials and Methods for Examples 1-4
[0080] Protein purification. For the purification of CjPglB, E. coli strain C43(DE3) (Lucigen, Middleton, WI) was freshly transformed with plasmid pSN18 (Kowarik et al., "N-Linked Glycosylation of Folded Proteins by the Bacterial Oligosaccharyltransferase," Science 314:1148-1150 (2006), which is hereby incorporated by reference in its entirety), a modified pBAD expression plasmid encoding C. jejuni pglB with a C-terminal decahistidine affinity tag. Cells were grown in 1.5 L of Terrific Broth supplemented with 100 μg/mL of ampicillin at 37°C. When the optical density (A600) of the culture reached ~1.0, cells were induced by the addition of 0.02% arabinose (w/v) for 4.5 h at 30°C. All following steps were performed at 4°C unless specified otherwise. Cells were harvested by centrifugation, resuspended in 25 mM Tris, pH 8.0, and 250 mM NaCl and lysed by three passages through a French press (SLM-Aminco; 10,000 PSI, SLM Instruments, Inc., Urbana, IL). Following the removal of cell debris by centrifugation, the membrane fraction was isolated by ultracentrifugation at 100,000 x g for 1 h. Membranes containing PglB were resuspended in 25 mM Tris-HCl, pH 8.0, 250 mM NaCl, 10% glycerol (v/v) and 1% DDM (w/v) (DDM, Anatrace, Affymetrix, Inc., Santa Clara, CA) and incubated for 2 h. The insoluble fraction was removed by ultracentrifugation at 100,000 x g for 1 h. All subsequent buffers contained DDM as the detergent. The solubilized membranes were supplemented with 10 mM imidazole, loaded onto a Ni-NTA superflow affinity column (Qiagen, Valencia, CA) and washed with 60 mM imidazole before PglB was eluted with 200 mM imidazole. The purified protein was then injected onto a Superdex 200 gel filtration column using AKTA-FPLC (GE Healthcare, Waukesha, WI). Eluate fractions were subjected to sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) and stained with Coomassie blue to identify the fractions containing PglB (Figure 2). The protein was desalted with a PD10 desalting column (GE Healthcare) into 20 mM Tris, pH 7.5, 100 mM NaCl, 5% glycerol (w/v) and 0.05% DDM (w/v) and concentrated to 5-10 mg/mL in an Amicon centricon with a molecular mass cutoff of 100 kDa. Expression and purification of the inactive CjPglB mutant was performed identically except that C43(DE3) cells carrying plasmid pSN18.1, which encodes an inactive copy of pglB subcloned from pACYCpglmut (see below), were used. ClPglB was purified from BL21-Gold(DE3) cells (Stratagene, La Jolla, CA) carrying plasmid pSF2 as described elsewhere (Lizak et al., "X-ray Structure of a Bacterial Oligosaccharyltransferase," Nature 474:350-355 (2011), which is hereby incorporated by reference in its entirety). For long-term storage at -20°C, the glycerol content in PglB samples was increased to 10% (w/v). Purification of AcrA and scFv13-R4-GT was from periplasmic fractions isolated from BL21(DE3) cells carrying plasmid pET24(AcrA-per) (Nita-Lazar et al., "The N-X-S/T Consensus Sequence is Required but not Sufficient for Bacterial N-Linked Protein Glycosylation," Glycobiology 15:361-367 (2005), which is hereby incorporated by reference in its entirety) or pET24-ssDsbA-scFv13-R4-GT (see below). Periplasmic extracts were prepared as described previously (Schwarz et al., "Relaxed Acceptor Site Specificity of Bacterial Oligosaccharyltransferase in Vivo," Glycobiology 21:45-54 (2011), which is hereby incorporated by reference in its entirety), supplemented with imidazole to reach a final concentration of 10 mM, sterile filtered (0.22 μm), and purified by nickel affinity chromatography using a Ni-NTA superflow affinity column (Qiagen, Valencia, CA).
[0081] Isolation of Lipid-linked Glycans. Escherichia coli SCM6 cells transformed with pACYCpglmut (Wacker et al., "N-Linked Glycosylation in Campylobacter jejuni and its Functional Transfer Into E. coli," Science 298:1790-1793 (2002), which is hereby incorporated by reference in its entirety), which codes for the biosynthesis of the C. jejuni LLO and an inactivated C. jejuni pglB gene (W458A and D459A), were grown in 1 L of Luria-Bertani supplemented with 25 μg/mL of chloramphenicol at 37°C. When the A600 reached ~1.0, cells were harvested by centrifugation and the pellet was lyophilized to dryness for 20 h at -80°C and 0.04 mbar. All subsequent steps were performed using glass tubes and glass pipettes. Homogenized pellets were extracted in 25 mL of 10:20:3 CHCl3:MeOH:H2O followed by centrifugation at 3000 x g for 30 min. The supernatants were evaporated using a rotary evaporator (Büchi, Flawil, Sankt Gallen, Switzerland), after which the resulting pellet was resuspended in 1 mL of 10:20:3 CHCl3:MeOH:H2O and sonicated until homogeneous. The sample was dried under nitrogen gas at 37°C, dissolved in 10 mM HEPES (4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid), pH 7.5, 1 mM MnCl2 and 0.1% DDM (w/v) and stored at -20°C. An identical procedure was followed to extract lipids from SCM6 cells carrying empty pACYC.
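For convenience, the component volumes of the 10:20:3 chloroform:methanol:water extraction solvent can be worked out by simple proportion. The Python sketch below assumes that the ratio is by volume and that the volumes are additive, neither of which is stated explicitly above; the function name is hypothetical.

```python
def split_by_ratio(total_volume: float, parts: tuple) -> list:
    """Divide total_volume among components in proportion to parts."""
    if total_volume <= 0 or any(p <= 0 for p in parts):
        raise ValueError("volume and ratio parts must be positive")
    whole = sum(parts)
    return [total_volume * p / whole for p in parts]


# 25 mL of 10:20:3 CHCl3:MeOH:H2O, as used for the pellet extraction.
names = ("CHCl3", "MeOH", "H2O")
for name, volume_ml in zip(names, split_by_ratio(25.0, (10, 20, 3))):
    print(f"{name}: {volume_ml:.2f} mL")
```

Under these assumptions, 25 mL of solvent corresponds to roughly 7.6 mL of chloroform, 15.2 mL of methanol, and 2.3 mL of water.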
[0082] Cell-Free Translation and Glycosylation. For in vitro glycosylation of purified acceptor proteins, a 50 μL solution containing 3 μg of purified PglB, 5-10 μL of extracted LLOs and 5 μg of purified AcrA or scFv13-R4-GT in 10 mM HEPES, pH 7.5, 1 mM MnCl2 and 0.1% DDM (w/v) was incubated for 12 h at 30°C. For in vitro translation of AcrA and scFv13-R4-GT in the absence of glycosylation, a 50 μL reaction was prepared using the S30 T7 High-Yield Expression System (Promega, Fitchburg, WI) or PURExpress (New England Biolabs, Ipswich, MA) according to the manufacturer's instructions. A total of 1 μg of the following plasmids were added to each reaction: pET24b (Novagen, Madison, WI); pET24-AcrA encoding full-length C. jejuni AcrA with a C-terminal hexahistidine tag (Nita-Lazar et al., "The N-X-S/T Consensus Sequence is Required but not Sufficient for Bacterial N-Linked Protein Glycosylation," Glycobiology 15:361-367 (2005), which is hereby incorporated by reference in its entirety); pET24(AcrA-per) encoding a version of AcrA with an N-terminal PelB signal peptide in place of its native export signal (Nita-Lazar et al., "The N-X-S/T Consensus Sequence is Required but not Sufficient for Bacterial N-Linked Protein Glycosylation," Glycobiology 15:361-367 (2005), which is hereby incorporated by reference in its entirety); pET24(AcrA-cyt) encoding a version of AcrA without an N-terminal export signal (ΔssAcrA) (Nita-Lazar et al., "The N-X-S/T Consensus Sequence is Required but not Sufficient for Bacterial N-Linked Protein Glycosylation," Glycobiology 15:361-367 (2005), which is hereby incorporated by reference in its entirety), and pET24-ssDsbA-scFv13-R4-GT encoding the expression-optimized scFv13-R4 intrabody gene (Martineau et al., "Expression of an Antibody Fragment at High Levels in the Bacterial Cytoplasm," J. Mol. Biol. 280:117-127 (1998), which is hereby incorporated by reference in its entirety) with an N-terminal signal peptide from E. coli DsbA for secretion and a C-terminal GT (Fisher et al., "Production of Secretory and Extracellular N-Linked Glycoproteins in Escherichia coli," Appl. Environ. Microbiol. 77:871-881 (2011), which is hereby incorporated by reference in its entirety) followed by a FLAG and a hexahistidine epitope tag. For in vitro translation/glycosylation reactions, 50 μL of translation reactions was supplemented with 3 μg purified PglB, 5 μL extracted LLOs, 1 μg purified plasmid DNA, 1 mM MnCl2 and 0.1% DDM (w/v) and incubated for 12 h at 30°C. DDM was chosen for in vitro translation/glycosylation because it was previously observed to be well tolerated in an E. coli-derived CFE system (Klammt et al., "Evaluation of Detergents for the Soluble Expression of Alpha-Helical and Beta-Barrel-Type Integral Membrane Proteins by a Preparative Scale Individual Cell-Free Expression System," Febs J. 272:6024-6038 (2005), which is hereby incorporated by reference in its entirety).
[0083] Western blot analysis. Expression and glycosylation of AcrA and scFv13-R4-GT were analyzed by immunoblot following SDS-PAGE. Immunodetection was performed with monoclonal anti-His antibody (Qiagen, Valencia, CA), monoclonal anti-FLAG antibody (Abcam, Cambridge, MA), polyclonal anti-AcrA serum (Wacker et al., "N-Linked Glycosylation in Campylobacter jejuni and its Functional Transfer Into E. coli," Science 298:1790-1793 (2002), which is hereby incorporated by reference in its entirety) and polyclonal anti-glycan serum hR6. All in vitro translation samples were treated with RNase A (Roche Diagnostics GmbH, Mannheim, Germany) prior to SDS-PAGE to reduce the irregularity of gel electrophoresis due to excess RNA. All experiments were performed at least in triplicate, and representative samples are shown.
Example 1 - Preparation of N-linked Glycosylation Components
[0084] To begin, functional reconstitution of bacterial N-linked glycosylation in vitro was attempted. Minimally, this required three components: an OST, a lipid-linked oligosaccharide (LLO) (i.e., a lipid-linked glycan) and an acceptor protein carrying the D/E-X1-N-X2-S/T motif. For the OST, CjPglB was expressed in the membrane fraction of E. coli cells, solubilized with 1% n-dodecyl-β-D-maltopyranoside (DDM) and purified to near homogeneity by nickel affinity chromatography followed by gel filtration (Figure 2B). Separately, E. coli cells carrying the C. jejuni pgl locus were used for producing the oligosaccharide donor. This gene cluster encodes enzymes that carry out the biosynthesis of a GlcGalNAc5Bac heptasaccharide (where Bac is bacillosamine) and its transfer from membrane-anchored undecaprenylpyrophosphate (UndPP) to asparagine residues. Here, a modified version of this cluster that carried an inactivated pglB gene (Wacker et al., "N-Linked Glycosylation in Campylobacter jejuni and its Functional Transfer Into E. coli," Science 298:1790-1793 (2002), which is hereby incorporated by reference in its entirety) was transferred to E. coli SCM6 cells and used to prepare LLOs. SCM6 cells were chosen for several reasons. First, these cells lack the WaaL enzyme that naturally transfers oligosaccharides (e.g., O-antigens, glycans) from the lipid carrier undecaprenyl onto lipid A, which in turn shuttles the oligosaccharides to the outer leaflet of the outer membrane (Feldman et al., "Engineering N-Linked Protein Glycosylation With Diverse O Antigen Lipopolysaccharide Structures in Escherichia coli," Proc. Natl. Acad. Sci. U.S.A. 102:3016-3021 (2005), which is hereby incorporated by reference in its entirety). Thus, in the absence of WaaL, the desired lipid-linked glycans accumulate in the inner membrane. Second, the lipopolysaccharide and enterobacterial common antigen initiating GlcNAc transferase, WecA, is removed. Thus, this strain should only produce LLOs with GlcGalNAc5Bac at the reducing end. In support of this notion, previous mass spectrometry analysis of LLOs extracted from an E. coli strain similar to the one used here (i.e., ΔwaaL ΔwecA) revealed that only LLOs containing GlcGalNAc5Bac heptasaccharide were detected (Reid et al., "Affinity-Capture Tandem Mass Spectrometric Characterization of Polyprenyl-Linked Oligosaccharides: Tool to Study Protein N-Glycosylation Pathways," Anal. Chem. 80:5468-5475 (2008), which is hereby incorporated by reference in its entirety). For the oligosaccharide acceptor, the model glycoprotein AcrA from C. jejuni (Nita-Lazar et al., "The N-X-S/T Consensus Sequence is Required but not Sufficient for Bacterial N-Linked Protein Glycosylation," Glycobiology 15:361-367 (2005), which is hereby incorporated by reference in its entirety) was purified from the periplasm. AcrA presents two consensus D/E-X1-N-X2-S/T sites that are glycosylated by CjPglB (Kowarik et al., "Definition of the Bacterial N-Glycosylation Site Consensus Sequence," EMBO J. 25:1957-1966 (2006), which is hereby incorporated by reference in its entirety). Alternatively, a glycoengineered single-chain variable fragment (scFv) called scFv13-R4-GT, which carried a C-terminal glycosylation tag (GT) consisting of four consecutive DQNAT motifs separated from one another by consecutive glycine residues (Fisher et al., "Production of Secretory and Extracellular N-Linked Glycoproteins in Escherichia coli," Appl. Environ. Microbiol. 77:871-881 (2011), which is hereby incorporated by reference in its entirety), was similarly purified.
Example 2 - Functional Reconstitution In vitro of the C. jejuni Protein Glycosylation Pathway
[0085] To evaluate the reconstituted glycosylation pathway, CjPglB OST was combined with LLOs extracted from E. coli cells and purified AcrA. This reaction resulted in efficient glycosylation of both AcrA sites as evidenced by the mobility shift of nearly all of the AcrA from the unmodified (g0) to the fully glycosylated (g2) form (Figure 3A). This activity was dependent on PglB and LLOs. Doubling the LLO concentration resulted in the appearance of the g0 and g1 forms of AcrA, in addition to g2, suggesting slightly less efficient glycosylation. Importantly, glycosylation activity was lost when lipid extracts from cells lacking the pgl cluster or an inactive CjPglB mutant were used (Figure 3A). These results were corroborated by detecting glycosylated AcrA with serum specific for the C. jejuni N-glycan (Figure 3A). Nearly identical results were observed when the glycoengineered scFv13-R4-GT protein was used as the oligosaccharide acceptor (Figure 3A). It should be noted that g2, g3 and g4 were the predominant glycoforms detected here, with barely detectable levels of g1. To demonstrate that other OSTs could be used in this system, in vitro glycosylation of AcrA was also performed using Campylobacter lari PglB (ClPglB), which is 56% identical to that of C. jejuni (Schwarz et al., "Relaxed Acceptor Site Specificity of Bacterial Oligosaccharyltransferase in Vivo," Glycobiology 21:45-54 (2011), which is hereby incorporated by reference in its entirety). This resulted in nearly equal amounts of the g0, g1 and g2 forms of AcrA under the conditions tested (Figure 3B). To be useful for translation/glycosylation reactions, the purified glycosylation components must tolerate long-term storage and freeze-thaw cycles. To test this, the components were stored separately at -20°C for 3 months. No changes were made to the storage buffers except that the final concentration of glycerol in the PglB samples was increased to 10%. Each of the components was thawed and refrozen 5-10 times during this period, after which an in vitro reaction with ClPglB was performed. This reaction yielded glycosylation of AcrA that appeared to be only slightly less efficient than the glycosylation observed with freshly purified components (compare Figures 3B and 3C).
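The glycoform labels used above (g0 through g4) count how many acceptor sites carry a glycan, so the maximum label for a given protein equals its number of D/E-X1-N-X2-S/T sequons, where X1 and X2 are any amino acid other than proline. The following is an illustrative sketch of how such sequons might be located in a protein sequence. The function name is hypothetical, and the example tag sequence is an assumption: the description states only that four DQNAT motifs are separated by consecutive glycine residues, so the two-glycine spacers below are illustrative.

```python
import re

# D/E-X1-N-X2-S/T, with X1 and X2 being any residue except proline.
# A lookahead is used so that overlapping sequons are all reported.
SEQUON = re.compile(r"(?=([DE][^P]N[^P][ST]))")


def find_sequons(protein_seq: str) -> list[tuple[int, str]]:
    """Return (1-based position of the acceptor Asn, matched 5-mer) pairs."""
    seq = protein_seq.upper()
    # m.start() is the 0-based index of D/E; the Asn sits two residues later.
    return [(m.start() + 3, m.group(1)) for m in SEQUON.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical glycosylation tag: four DQNAT motifs with assumed
    # two-glycine spacers (the spacer length is not given in the text).
    example_tag = "DQNATGGDQNATGGDQNATGGDQNAT"
    for position, motif in find_sequons(example_tag):
        print(position, motif)
```

Under these assumptions the example tag contains four sequons, which is consistent with g4 being the fully glycosylated form of a protein bearing such a tag.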
Example 3 - Cell-Free Translation of Protein Targets
[0086] To determine whether existing cell-free translation systems could synthesize protein targets of interest, both an E. coli CFE-based protein synthesis system and the PURE (protein synthesis using recombinant elements) system that uses purified translation components and T7 RNA polymerase (Shimizu et al., "Cell-Free Translation Reconstituted With Purified Components," Nat. Biotechnol. 19:751-755 (2001), which is hereby incorporated by reference in its entirety) were evaluated. This involved priming the CFE and PURE systems with three different AcrA DNA sequences cloned in a T7 promoter-driven pET vector. Using the CFE system, ~150-250 μg/mL of each AcrA variant was produced as a full-length polypeptide in 1 h (Figure 4A). AcrA carrying its native signal peptide accumulated to the highest level but also experienced the greatest amount of degradation. In contrast, AcrA carrying a PelB signal peptide in place of the native signal and AcrA lacking a signal peptide each accumulated to a slightly lower concentration but experienced no visible degradation. The PURE system similarly produced all three AcrA variants as full-length polypeptides albeit at a slightly lower level (~100 μg/mL/h of each) than the CFE-based system (Figure 4A). Both systems were also able to generate appreciable amounts of scFv13-R4-GT (Figure 5A). It should be noted that this scFv was previously optimized for expression under nonoxidizing conditions (i.e., in the absence of disulfide bonds) (Martineau et al., "Expression of an Antibody Fragment at High Levels in the Bacterial Cytoplasm," J. Mol. Biol. 280:117-127 (1998), which is hereby incorporated by reference in its entirety) and thus did not require special transcription/translation conditions.
Example 4 - Cell-Free Translation and Glycosylation of Target Glycoproteins
[0087] Encouraged by these results, the glycoCFE and glycoPURE translation/glycosylation systems were constructed by combining the purified glycosylation components (minus the acceptor protein) with one of the cell-free translation systems. The plasmid pET24(AcrA-cyt) that encodes AcrA without an N-terminal signal peptide was chosen to evaluate these systems because it gave rise to significant amounts of target protein in both translation systems with no detectable degradation. When either the CFE or the PURE system was primed with this plasmid along with CjPglB and LLOs, AcrA was produced primarily as the doubly glycosylated g2 glycoform with lesser amounts of g1 and virtually no detectable unmodified AcrA (Figure 4B). It was estimated that ~100-150 μg of glycosylated AcrA was produced in a 1 mL reaction volume after 12 h. Likewise, scFv13-R4-GT was efficiently produced by both the glycoCFE and glycoPURE systems, with ~50% of the protein in the fully glycosylated g4 form and ~50% in the g3 form (Figure 5B). Both systems produced ~50-100 μg/mL of glycosylated scFv13-R4-GT in 12 h. Thus, the glycoCFE and glycoPURE systems contain all the components essential for efficiently translating N-linked glycoproteins.
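A glycoform distribution such as the one reported above can be summarized as a mean acceptor-site occupancy, i.e., the fraction of all available sites that carry a glycan. The following sketch is illustrative only. It assumes that band fractions estimated from an immunoblot are a fair proxy for the relative abundance of each glycoform, which the description does not establish, and the function name is hypothetical.

```python
def mean_site_occupancy(glycoform_fractions: dict[int, float],
                        n_sites: int) -> float:
    """Mean fraction of acceptor sites occupied.

    glycoform_fractions maps the number of attached glycans (0 for g0,
    1 for g1, ...) to the relative abundance of that glycoform. The
    abundances need not sum to 1; they are normalized here.
    """
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    if any(k < 0 or k > n_sites for k in glycoform_fractions):
        raise ValueError("glycan count outside 0..n_sites")
    total = sum(glycoform_fractions.values())
    if total <= 0:
        raise ValueError("fractions must sum to a positive value")
    occupied = sum(k * f for k, f in glycoform_fractions.items())
    return occupied / (total * n_sites)


if __name__ == "__main__":
    # scFv13-R4-GT as reported: roughly half g4 and half g3, four sites.
    print(mean_site_occupancy({4: 0.5, 3: 0.5}, n_sites=4))
```

For the approximate 50% g4 and 50% g3 split reported for scFv13-R4-GT, this works out to (4 × 0.5 + 3 × 0.5) / 4 = 0.875, or about 88% of the four tag sites occupied.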
Discussion of Examples 1-4
[0088] A major advantage of the open prokaryote-based translation/glycosylation systems developed here is that the supply of purified glycosylation components as well as their substrates and cofactors (Lizak et al., "X-ray Structure of a Bacterial Oligosaccharyltransferase," Nature 474:350-355 (2011), which is hereby incorporated by reference in its entirety) can be provided at precise ratios. Likewise, the concentration of inhibitory substances such as proteases and glycosidases that catalyze the hydrolysis of glycosidic linkages can be reduced or eliminated entirely. Additionally, the in vitro systems permit the introduction of components that may be incompatible with in vivo systems such as certain LLOs that cannot be produced or flipped in vivo. This level of controllability is unavailable in any previous translation/glycosylation system and is significant for several reasons. First, it helps to avoid glycoprotein heterogeneity, which is particularly bothersome in fundamental studies to assess the contribution of specific glycan structures or in pharmaceutical glycoprotein production. Along these lines, the glycoCFE and glycoPURE systems should allow the examination of factors that interact with or stimulate the glycosylation machinery and promote increased acceptor site occupancy. While the glycosylation efficiency observed here with CjPglB exceeded the level typically observed in vivo (Kowarik et al., "N-Linked Glycosylation of Folded Proteins by the Bacterial Oligosaccharyltransferase," Science 314:1148-1150 (2006); Kowarik et al., "Definition of the Bacterial N-Glycosylation Site Consensus Sequence," EMBO J. 25:1957-1966 (2006); Fisher et al., "Production of Secretory and Extracellular N-Linked Glycoproteins in Escherichia coli," Appl. Environ. Microbiol. 77:871-881 (2011), which are hereby incorporated by reference in their entirety), it should be pointed out that further study of the reaction conditions should lead to increases in productivity and glycosylation efficiency. Second, it facilitates the integration/co-activation of multiple complex metabolic systems and pathways in vitro including transcription, translation, protein folding and glycosylation. Therefore, the glycoCFE and glycoPURE systems should provide a unique opportunity for studying the interplay of these important mechanisms under conditions where system complexity is reduced and structural barriers are removed. For instance, since the bacterial OST can glycosylate locally flexible structures in folded proteins (Kowarik et al., "N-Linked Glycosylation of Folded Proteins by the Bacterial Oligosaccharyltransferase," Science 314:1148-1150 (2006), which is hereby incorporated by reference in its entirety) and also structured domains of some proteins, these systems should help to decipher the influence of protein structure on glycosylation efficiency. Also, since bacterial and eukaryotic glycosylation mechanisms display significant similarities, these bacterial systems could provide a simplified model framework for understanding the more complex eukaryotic process.
Third, it allows for further customization of the system by reconstituting additional or alternative steps (both natural and unnatural) in the glycosylation pathway. For instance, the sequential activities of the glycosyltransferases in the pgl pathway have been reconstituted in vitro (Glover et al., "In Vitro Assembly of the Undecaprenylpyrophosphate-Linked Heptasaccharide for Prokaryotic N-Linked Glycosylation," Proc. Nat'l. Acad. Sci. U.S.A. 102:14255-14259 (2005), which is hereby incorporated by reference in its entirety) and could easily be integrated with the translation/glycosylation reactions into a single integrated platform. While glycoengineered E. coli have the potential to provide a wide array of UndPP-linked glycans (Feldman et al., "Engineering N-Linked Protein Glycosylation With Diverse O Antigen Lipopolysaccharide Structures in Escherichia coli," Proc. Nat'l. Acad. Sci. U.S.A. 102:3016-3021 (2005); Yavuz et al., "Glycomimicry: Display of Fucosylation on the Lipo-Oligosaccharide of Recombinant Escherichia coli K12," Glycoconj. J. 28:39-47 (2011), which are hereby incorporated by reference in their entirety), the ability to extend beyond bacterial glycans can be achieved by supplementation with specific glycosyltransferases and the requisite activated sugars. This approach can be used for making eukaryotic glycan mimetics (Schwarz et al., "A Combined Method for Producing Homogeneous Glycoproteins With Eukaryotic N-Glycosylation," Nat. Chem. Biol. 6:264-266 (2010), which is hereby incorporated by reference in its entirety) and will allow finer control over the diversity of glycoforms that can be used for modifying target proteins in vitro. Since CjPglB has relaxed specificity toward the glycan structure (Feldman et al., "Engineering N-Linked Protein Glycosylation With Diverse O Antigen Lipopolysaccharide Structures in Escherichia coli," Proc. Nat'l. Acad. Sci. U.S.A. 102:3016-3021 (2005), which is hereby incorporated by reference in its entirety), all of these UndPP-linked glycans are likely to be suitable substrates. Even if CjPglB should prove insufficient, the demonstration here that two different OSTs could be used interchangeably suggests that virtually any single-subunit OST including those from other bacteria, archaea and even some eukaryotes (Nasab et al., "All in One: Leishmania Major STT3 Proteins Substitute for the Whole Oligosaccharyltransferase Complex in Saccharomyces cerevisiae," Mol. Biol. Cell 19:3758-3768 (2008), which is hereby incorporated by reference in its entirety) could be used in these systems. In support of this notion, the Leishmania major and Pyrococcus furiosus single-subunit OSTs can be functionally expressed in E. coli membranes (Igura & Kohda, "Selective Control of Oligosaccharide Transfer Efficiency for the N-Glycosylation Sequon by a Point Mutation in Oligosaccharyltransferase," J. Biol. Chem. 286:13255-13260 (2011), which is hereby incorporated by reference in its entirety). Finally, because one is not limited to natural glycans, the glycoCFE and glycoPURE systems should permit synthesis of hybrid natural/unnatural or even completely artificial glycans. For example, the addition of synthetic sugar-nucleotide donor substrates and/or mutant glycosyltransferases and OSTs having new specificities will enable the construction of a glycosylation system founded on a noncanonical glycan code. For all of these reasons, the glycoCFE and glycoPURE systems are useful additions to the cell-free translation and glycobiology toolkits alike.
[0089] Although preferred embodiments have been depicted and described in detail herein, it will be apparent to those skilled in the relevant art that various modifications, additions, substitutions, and the like can be made without departing from the spirit of the invention and these are therefore considered to be within the scope of the invention as defined in the claims which follow.

Claims

WHAT IS CLAIMED IS:
1. A cell-free system for producing a glycosylated protein comprising:
an isolated oligosaccharyltransferase capable of transferring a glycan from a lipid carrier molecule to a glycoprotein target;
one or more isolated glycans, wherein each glycan is linked to a lipid carrier molecule; and
a glycoprotein target comprising one or more glycan acceptor amino acid residue, or a nucleic acid molecule encoding said glycoprotein target.
2. The system of claim 1, wherein the oligosaccharyltransferase is a prokaryotic oligosaccharyltransferase.
3. The system of claim 2, wherein the prokaryotic oligosaccharyltransferase is derived from Campylobacter.
4. The system of claim 1, wherein the oligosaccharyltransferase is an archaea oligosaccharyltransferase.
5. The system of claim 1, wherein the oligosaccharyltransferase is a eukaryotic oligosaccharyltransferase.
6. The system of claim 1, wherein the lipid carrier molecule comprises undecaprenyl-phosphate.
7. The system of claim 1, wherein the one or more isolated glycans comprise a prokaryotic glycan.
8. The system of claim 1, wherein the prokaryotic glycan comprises GlcGalNAc5Bac.
9. The system of claim 1, wherein the one or more isolated glycans comprise a eukaryotic glycan.
10. The system of claim 9, wherein the eukaryotic glycan comprises GlcNAc2.
11. The system of claim 10, wherein the eukaryotic glycan further comprises at least one mannose residue.
12. The system of claim 9, wherein the eukaryotic glycan comprises a composition selected from Man1GlcNAc2, Man2GlcNAc2, and Man3GlcNAc2.
13. The system of claim 1, wherein the one or more glycan acceptor amino acid residues of the glycoprotein target is an asparagine residue.
14. The system of claim 13, wherein glycoprotein target further comprising an N-X1-S/T or a D/E-X1-N-X2-S/T (SEQ ID NO: 1) glycan acceptor amino acid sequence motif wherein D is aspartic acid, X1 and X2 are any amino acid other than proline, N is asparagine, and T is threonine.
15. The system of claim 1 further comprising:
reagents suitable for synthesizing the glycoprotein target from said nucleic acid molecule.
16. The system of claim 1, wherein the glycoprotein target comprises an antibody.
17. A kit comprising:
an isolated oligosaccharyltransferase capable of transferring a glycan from a lipid carrier molecule to a glycoprotein target and
one or more isolated glycans, wherein each glycan is linked to a lipid carrier molecule.
18. The kit of claim 17 further comprising:
reagents suitable for synthesizing a glycoprotein target from a nucleic acid molecule encoding said glycoprotein target.
19. A method for producing a glycosylated protein in a cell-free system comprising:
providing an isolated oligosaccharyltransferase capable of transferring a glycan from a lipid carrier molecule to a glycoprotein target;
providing one or more isolated glycans, wherein each glycan is linked to a lipid carrier molecule;
providing a glycoprotein target comprising one or more glycan acceptor amino acid residues;
combining the oligosaccharyltransferase, one or more isolated glycans and glycoprotein target to form a cell-free glycosylation reaction mixture; and
subjecting the cell-free glycosylation reaction mixture to conditions effective for the oligosaccharyltransferase to transfer the glycan from the lipid carrier molecule to the one or more glycan acceptor residues of the glycoprotein target to produce a glycosylated protein.
20. The method of claim 19, wherein the oligosaccharyltransferase is a prokaryotic oligosaccharyltransferase.
21. The method of claim 20, wherein the prokaryotic oligosaccharyltransferase is derived from Campylobacter.
22. The method of claim 19, wherein the oligosaccharyltransferase is an archaea oligosaccharyltransferase.
23. The method of claim 19, wherein the oligosaccharyltransferase is a eukaryotic oligosaccharyltransferase.
24. The method of claim 19, wherein the lipid carrier molecule comprises undecaprenyl phosphate.
25. The method of claim 19, wherein the one or more isolated glycans comprise a prokaryotic glycan.
26. The method of claim 25, wherein the one or more prokaryotic glycans comprise GlcGalNAc5Bac.
27. The method of claim 19, wherein one or more isolated glycans comprise a eukaryotic glycan.
28. The method of claim 27, wherein the one or more eukaryotic glycans comprise GlcNAc2.
29. The method of claim 28, wherein the one or more eukaryotic glycans further comprise at least one mannose residue.
30. The method of claim 28, wherein the one or more eukaryotic glycans comprise a composition selected from Man1GlcNAc2, Man2GlcNAc2, and Man3GlcNAc2.
31. The method of claim 19, wherein said providing a glycoprotein target comprises providing a nucleic acid molecule encoding the glycoprotein, said method further comprising:
providing reagents suitable for synthesizing a glycoprotein target from said nucleic acid molecule and
blending the reagents with the glycosylation reaction under conditions effective to synthesize the glycoprotein target from the nucleic acid molecule prior to, or concurrent with, said subjecting.
32. The method of claim 19, wherein the one or more glycan acceptor amino acid residues of the glycoprotein target is an asparagine residue.
33. The method of claim 32, wherein the glycoprotein target further comprising an N-X1-S/T or a D/E-X1-N-X2-S/T (SEQ ID NO: 1) glycan acceptor amino acid sequence motif wherein D is aspartic acid, X1 and X2 are any amino acid other than proline, N is asparagine, and T is threonine.
34. The method of claim 19, wherein the protein comprises an antibody.
PCT/US2012/063590 2011-11-04 2012-11-05 A prokaryote-based cell-free system for the synthesis of glycoproteins WO2013067523A1 (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
US14/356,258 US11193154B2 (en) 2011-11-04 2012-11-05 Prokaryote-based cell-free system for the synthesis of glycoproteins
CN201280066129.1A CN104080921A (en) 2011-11-04 2012-11-05 Prokaryote-based cell-free system for synthesis of glycoproteins
IN4076CHN2014 IN2014CN04076A (en) 2011-11-04 2014-05-30
HK15103270.8A HK1202896A1 (en) 2011-11-04 2015-03-31 A prokaryote-based cell-free system for the synthesis of glycoproteins
US17/543,614 US20220340947A1 (en) 2011-11-04 2021-12-06 Prokaryote-based cell-free system for the synthesis of glycoproteins

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201161555854P 2011-11-04 2011-11-04
US61/555,854 2011-11-04

Related Child Applications (2)

Application Number Title Priority Date Filing Date
US14/356,258 A-371-Of-International US11193154B2 (en) 2011-11-04 2012-11-05 Prokaryote-based cell-free system for the synthesis of glycoproteins
US17/543,614 Continuation US20220340947A1 (en) 2011-11-04 2021-12-06 Prokaryote-based cell-free system for the synthesis of glycoproteins

Publications (1)

Publication Number Publication Date
WO2013067523A1 true WO2013067523A1 (en) 2013-05-10

Family

ID=48192910

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2012/063590 WO2013067523A1 (en) 2011-11-04 2012-11-05 A prokaryote-based cell-free system for the synthesis of glycoproteins

Country Status (5)

Country Link
US (2) US11193154B2 (en)
CN (2) CN112980907A (en)
HK (1) HK1202896A1 (en)
IN (1) IN2014CN04076A (en)
WO (1) WO2013067523A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015085209A1 (en) 2013-12-06 2015-06-11 President And Fellows Of Harvard College Paper-based synthetic gene networks
WO2016107818A1 (en) * 2014-12-30 2016-07-07 Glycovaxyn Ag Compositions and methods for protein glycosylation
WO2017117539A1 (en) * 2015-12-30 2017-07-06 Northwestern University Cell-free glycoprotein synthesis (cfgps) in prokaryotic cell lysates enriched with components for glycosylation
US11497804B2 (en) 2015-02-26 2022-11-15 Vaxnewmo Llc Acinetobacter O-oligosaccharyltransferases and uses thereof
US11932670B2 (en) 2018-06-16 2024-03-19 Vaxnewmo Llc Glycosylated ComP pilin variants, methods of making and uses thereof

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2971030B8 (en) * 2013-03-14 2018-12-05 Glycobia, Inc Oligosaccharide compositions, glycoproteins and methods to produce the same in prokaryotes
EP3167072A4 (en) * 2014-07-08 2018-04-18 Technion Research & Development Foundation Ltd. Methods and kits for cell-free transcription and translation
CN106478773B (en) * 2015-08-25 2021-09-14 三生国健药业(上海)股份有限公司 Novel artificially synthesized signal peptide
US10829795B2 (en) * 2016-07-14 2020-11-10 Northwestern University Method for rapid in vitro synthesis of glycoproteins via recombinant production of N-glycosylated proteins in prokaryotic cell lysates
US11898187B2 (en) 2017-08-15 2024-02-13 Northwestern University Protein glycosylation sites by rapid expression and characterization of N-glycosyltransferases
US11530432B2 (en) 2018-03-19 2022-12-20 Northwestern University Compositions and methods for rapid in vitro synthesis of bioconjugate vaccines in vitro via production and N-glycosylation of protein carriers in detoxified prokaryotic cell lysates
WO2019204346A1 (en) 2018-04-16 2019-10-24 Northwestern University METHODS FOR CO-ACTIVATING IN VITRO NON-STANDARD AMINO ACID (nsAA) INCORPORATION AND GLYCOSYLATION IN CRUDE CELL LYSATES
CN115181752A (en) * 2022-07-12 2022-10-14 大连大学 Method for improving modified protein efficiency and protein expression quantity by sugar chain plasmid optimization

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020123101A1 (en) * 2000-12-28 2002-09-05 Akio Inoue Process for producing peptides by using in vitro transcription/translation system
US20090074798A1 (en) * 2002-03-07 2009-03-19 Eth Zurich System and method for the production of recombinant glycosylated proteins in a prokaryotic host
US20100286067A1 (en) * 2008-01-08 2010-11-11 Biogenerix Ag Glycoconjugation of polypeptides using oligosaccharyltransferases
US20110039729A1 (en) * 2008-01-03 2011-02-17 Cornell Research Foundation, Inc. Glycosylated protein expression in prokaryotes

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4590249B2 (en) * 2004-11-17 2010-12-01 独立行政法人理化学研究所 Cell-free protein synthesis system for glycoprotein synthesis

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CHEN ET AL.: "From Peptide to Protein: Comparative Analysis of the Substrate Specificity of N- Linked Glycosylation in C. jejuni.", BIOCHEMISTRY, vol. 46, no. 18, 8 May 2007 (2007-05-08), pages 5579 - 5585, XP055158776, DOI: doi:10.1021/bi602633n *
FISHER ET AL.: "Production of secretory and extracellular N-linked glycoproteins in Escherichia coli.", APPL. ENVIRON. MICROBIOL., vol. 77, no. 3, February 2011 (2011-02-01), pages 871 - 881, XP055142701, DOI: doi:10.1128/AEM.01901-10 *
MALTA ET AL.: "Comparative structural biology of eubacterial and archaeal oligosaccharyltransferases.", J. BIOL. CHEM, vol. 285, no. 7, 12 February 2010 (2010-02-12), pages 4941 - 4950, XP055038785, DOI: doi:10.1074/jbc.M109.081752 *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015085209A1 (en) 2013-12-06 2015-06-11 President And Fellows Of Harvard College Paper-based synthetic gene networks
WO2016107818A1 (en) * 2014-12-30 2016-07-07 Glycovaxyn Ag Compositions and methods for protein glycosylation
WO2016107819A1 (en) * 2014-12-30 2016-07-07 Glycovaxyn Ag Compositions and methods for protein glycosylation
BE1022998B1 (en) * 2014-12-30 2016-10-28 Glycovaxyn Ag COMPOSITIONS AND METHODS FOR GLYCOSYLATION OF PROTEINS
US20180002679A1 (en) * 2014-12-30 2018-01-04 Glaxosmithkline Biologicals, Sa Compositions and methods for protein glycosylation
US10150952B2 (en) * 2014-12-30 2018-12-11 Glaxosmithkline Biologicals S.A. Compositions and methods for protein glycosylation
US11015177B2 (en) 2014-12-30 2021-05-25 Glaxosmithkline Biologicals Sa Compositions and methods for protein glycosylation
EP4043561A1 (en) * 2014-12-30 2022-08-17 GlaxoSmithKline Biologicals SA Compositions and methods for protein glycosylation
US11497804B2 (en) 2015-02-26 2022-11-15 Vaxnewmo Llc Acinetobacter O-oligosaccharyltransferases and uses thereof
WO2017117539A1 (en) * 2015-12-30 2017-07-06 Northwestern University Cell-free glycoprotein synthesis (cfgps) in prokaryotic cell lysates enriched with components for glycosylation
US11453901B2 (en) 2015-12-30 2022-09-27 Northwestern University Cell-free glycoprotein synthesis (CFGpS) in prokaryotic cell lysates enriched with components for glycosylation
US11932670B2 (en) 2018-06-16 2024-03-19 Vaxnewmo Llc Glycosylated ComP pilin variants, methods of making and uses thereof

Also Published As

Publication number Publication date
IN2014CN04076A (en) 2015-10-23
US11193154B2 (en) 2021-12-07
US20220340947A1 (en) 2022-10-27
CN104080921A (en) 2014-10-01
US20140255987A1 (en) 2014-09-11
HK1202896A1 (en) 2015-10-09
CN112980907A (en) 2021-06-18

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 12845364

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 14356258

Country of ref document: US

122 Ep: pct application non-entry in european phase

Ref document number: 12845364

Country of ref document: EP

Kind code of ref document: A1