WO2010114615A2 - A facile system for encoding unnatural amino acids in mammalian cells - Google Patents

A facile system for encoding unnatural amino acids in mammalian cells

Publication number
WO2010114615A2
WO2010114615A2 PCT/US2010/000992 US2010000992W WO2010114615A2 WO 2010114615 A2 WO2010114615 A2 WO 2010114615A2 US 2010000992 W US2010000992 W US 2010000992W WO 2010114615 A2 WO2010114615 A2 WO 2010114615A2
Authority
WO
WIPO (PCT)
Prior art keywords
trna
amino acid
amino acids
substituted
synthetase
Prior art date
Application number
PCT/US2010/000992
Other languages
French (fr)
Other versions
WO2010114615A3 (en)
Inventor
Peng R. Chen
Daniel Groff
Jiantao Guo
Bernhard H. Geierstanger
Peter G. Schultz
Original Assignee
The Scripps Research Institute
Application filed by The Scripps Research Institute
Publication of WO2010114615A2
Publication of WO2010114615A3

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P: FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P21/00: Preparation of peptides or proteins
    • C12P21/02: Preparation of peptides or proteins having a known sequence of two or more amino acids, e.g. glutathione
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/10: Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034: Isolating an individual clone by screening libraries
    • C12N15/1068: Template (nucleic acid) mediated chemical library synthesis, e.g. chemical and enzymatical DNA-templated organic molecule synthesis, libraries prepared by non ribosomal polypeptide synthesis [NRPS], DNA/RNA-polymerase mediated polypeptide synthesis

Definitions

  • the inventions are in the field of translation biochemistry.
  • the inventions are directed to aminoacyl-tRNA synthetase/tRNA orthogonal pairs that function to charge unnatural amino acids in both eubacterial and eukaryotic cells.
  • Additional amino acids can be added to the genetic codes of both prokaryotic and eukaryotic organisms. This can be accomplished by means of an orthogonal tRNA (O-tRNA) and aminoacyl-tRNA synthetase (RS) pair that incorporates the unnatural amino acid in response to a nonsense or four base codon in the gene of interest.
  • Directed evolution of the specificity of the aminoacyl-tRNA synthetase in either bacteria or yeast has been used to genetically encode approximately 50 unnatural amino acids with novel physical, chemical or biological properties in these organisms.
  • the present inventions include methods and compositions for incorporation of unnatural amino acids by translation optionally in both eubacteria and in eukaryotes.
  • the invention includes translation system components that can function orthogonally in, and can be shuttled between, eubacteria and eukaryotes.
  • Methods include, e.g., mutating an aminoacyl tRNA synthetase (RS) from, e.g., Methanosarcinae, Desulfitobacterium or other Archaea, at identified positions, selecting mutants with structures functioning to accommodate an unnatural amino acid of interest as substrate, shuttling the RSs to a eukaryotic translation system where they function orthogonally with a cognate tRNA, and translating a nucleic acid sequence to provide a polypeptide incorporating the unnatural amino acid.
  • compositions include translation system components, such as an aminoacyl tRNA synthetase (RS) and a cognate tRNA, wherein the synthetase is orthogonal in an enterobacteria and is also orthogonal in a eukaryotic cell.
  • the cognate tRNA recognizes a selector codon and the synthetase is capable of specifically aminoacylating the tRNA with an unnatural amino acid when both the synthetase and the tRNA are expressed in either the enterobacteria or the eukaryotic cell.
  • the synthetase is derived from an Archaea or bacteria synthetase and the cognate tRNA is derived from an Archaea or bacteria tRNA.
  • the synthetase can be derived from a Methanosarcinae RS, a Methanosarcina mazei pyrrolysyl-tRNA synthetase (MmPylRS) sequence, a Methanosarcina barkeri pyrrolysyl-tRNA synthetase (MbPylRS) sequence, a Desulfitobacterium hafniense pyrrolysyl-tRNA synthetase (DhPylRS - see, Biochem Biophys Res Commun.
  • the cognate tRNA can be, e.g., a pyrrolysyl-tRNA with an anticodon loop that recognizes a selector codon.
  • the selector codon can be an amber codon or, e.g., another appropriate stop codon or 4 or more base codon.
  • the RS is derived by appropriate functional mutations at amino acid positions corresponding to positions 305, 306, 309, 348, 384 or 419 of the MmPylRS.
  • the synthetase sequence can include an isoleucine or methionine at a position corresponding to position 306 of the MmPylRS sequence, an alanine at a position corresponding to position 309, an alanine at a position corresponding to position 348, or a phenylalanine at a position corresponding to position 384 to provide an RS charging with a caged lysine or similar epsilon-substituted lysine analog.
  • Exemplary unnatural amino acids that can be incorporated, e.g., in bacteria or eukaryotes using methods and compositions of the invention include, e.g.: an epsilon-substituted lysine, a photocaged lysine, a photocaged lysine analog, an ortho acyl-substituted phenylalanine, a meta acyl-substituted phenylalanine, a para acyl-substituted phenylalanine, ortho azido-substituted phenylalanine, a meta azido-substituted phenylalanine, a para azido-substituted phenylalanine, an ortho borono-substituted phenylalanine, a meta borono-substituted phenylalanine, a para borono-substituted phenylalanine, a para boron
  • the unnatural amino acid optionally is other than Boc-lysine, acetyllysine or Nε-benzyloxycarbonyl-L-lysine.
  • the unnatural amino acids are other than the 20 canonical natural amino acids, seleno-cysteine or pyrrolysine.
  • the unnatural amino acid is optionally o-nitrobenzyl-oxycarbonyl-Nε-L-lysine (ONBK).
  • the RS includes an amino acid sequence at least 90% identical to SEQ ID NO: 4 (NBK-1), and has an Ala amino acid at a position corresponding to Leu309 of wild type MmPylRS sequence SEQ ID NO: 2, an Ala amino acid at a position corresponding to Cys348 of SEQ ID NO: 2, and a Tyr amino acid at a position corresponding to Phe384 of SEQ ID NO: 2.
  • the aminoacyl tRNA synthetase comprises a polypeptide sequence comprising at least 90% identity to a Methanosarcina mazei pyrrolysyl-tRNA synthetase (MmPylRS) sequence, and the polypeptide sequence comprises methionine at a position corresponding to position 306 of the MmPylRS sequence, an isoleucine at a position corresponding to position 306 of the MmPylRS sequence, an alanine at a position corresponding to position 309 of the MmPylRS sequence, an alanine at a position corresponding to position 348, or a phenylalanine at a position corresponding to position 384.
  • the RS is at least 95% or more identical to SEQ ID NO: 4 or SEQ ID NO: 6.
  • the methods include producing polypeptides in a eukaryotic cell by producing an orthogonal aminoacyl-tRNA synthetase (O-RS) library in one or more bacterial cells, selecting the synthetase library for an orthogonal member that specifically aminoacylates an orthogonal tRNA (O-tRNA) in the bacterial cells with an unnatural amino acid to provide an unnatural amino acid-specific synthetase that is orthogonal in the bacterial cells.
  • the unnatural amino acid-specific synthetase can be shuttled into the eukaryotic cell, such as a mammalian cell or an insect cell, to charge a cognate tRNA and to function orthogonally in the eukaryotic cell.
  • the synthetase and/or O-tRNA is derived from corresponding Archaea translation components.
  • the unnatural amino acid can be any appropriate unnatural amino acid.
  • the unnatural amino acid can be an ortho acyl-substituted phenylalanine, a meta acyl-substituted phenylalanine, a para acyl-substituted phenylalanine, ortho azido-substituted phenylalanine, a meta azido-substituted phenylalanine, a para azido-substituted phenylalanine, an ortho borono-substituted phenylalanine, a meta borono-substituted phenylalanine, a para borono-substituted phenylalanine, a para benzoyl-substituted phenylalanine
  • the unnatural amino acid is other than the canonical 20 natural amino acids, seleno-cysteine, pyrrolysine, Boc-lysine, acetyllysine or Nε-benzyloxycarbonyl-L-lysine.
  • the amino acid is not itself a peptide and is not unnatural due to linkage of a chemical moiety to the side chain after the amino acid has previously been incorporated into a polypeptide.
  • the unnatural amino acid typically has a side chain with dimensions that fit into a modified binding pocket of an aminoacyl-tRNA synthetase.
  • a typical natural amino acid side chain can be considered to have a length ranging from about zero (glycine) to about 10 angstroms.
  • Side chains of many unnatural amino acids of interest range in size from about 2 angstroms to about 25 angstroms, or more; from 3 angstroms to 20 angstroms, from 5 angstroms to 15 angstroms, or about 12 angstroms.
  • Unnatural amino acids with side chains having lengths greater than 50 angstroms (or about 30 carbon-carbon bond equivalents) are typically less desirable.
  • the methods include providing the orthogonal synthetase by mutating a Methanosarcina nucleic acid encoding a pyrrolysyl-tRNA synthetase (MPylRS) polypeptide.
  • useful synthetases can be provided by mutation of the nucleic acid (e.g., MmPylRSwt SEQ ID NO: 1) at positions corresponding to amino acid positions Leu309, Cys348 and Tyr384 of SEQ ID NO: 2; wherein SEQ ID NO: 2 is a wild type Methanosarcina mazei polypeptide sequence.
  • the methods further include mutation of the MPylRS nucleic acid at positions encoding amino acids at positions corresponding to Tyr306 of SEQ ID NO: 2.
  • bacteria can be transformed with the mutated nucleic acids along with nucleic acids encoding cognate tRNAs preferentially aminoacylated by the MPylRS, thereby providing an O-RS library of mutated RSs paired with the cognate tRNA. Clones from the library can be positively selected for members encoding a mutant MPylRS that charges the Pyl-tRNA with an unnatural amino acid of choice.
  • the methods can include growing a eukaryotic cell comprising: the unnatural amino acid, a nucleic acid that encodes a protein and comprises at least one selector codon recognized by the Pyl-tRNA, a selected mutant MPylRS and the Pyl-tRNA, so that the protein is translated from the nucleic acid in the eukaryotic cell to incorporate the unnatural amino acid at the specified position.
  • Methods include shuttling translation system components from eubacterial cells to eukaryotic translation systems (typically cells) where they function to orthogonally incorporate unnatural amino acids into polypeptides of interest.
  • the shuttling can comprise transforming a eukaryotic cell with a nucleic acid (e.g., NBK-1 RS SEQ ID NO: 3 or NBK-2 RS SEQ ID NO: 5) encoding an O-RS comprising an amino acid sequence at least 90% identical to SEQ ID NO: 4 (NBK-1 RS) or to SEQ ID NO: 6 (NBK-2 RS), wherein the O-RS further comprises an Ala amino acid in a position of the O-RS corresponding to Leu309 of SEQ ID NO: 2, an Ala amino acid residue in a position of the O-RS corresponding to Cys348 of SEQ ID NO: 2, and a Tyr amino acid residue in a position of the O-RS corresponding to Tyr384 of SEQ ID
  • the method can also include transforming the eukaryotic cell with a nucleic acid encoding a nucleic acid sequence encoding a pyrrolysyl-tRNA (Pyl-tRNA) preferentially aminoacylated by the O-RS.
  • the cell can be provided with a nucleic acid that encodes the polypeptide of interest including at least one selector codon recognized by the Pyl-tRNA so that the unnatural amino acid will be incorporated at the position designated by the selector codon.
  • the methods include preparing the unnatural amino acid by photocaging a residue of interest, e.g., a lysine or by substituting a chemical group on the residue.
  • when the unnatural amino acid is a caged lysine analog or other unnatural amino acid, it can be charged onto the Pyl-tRNA and incorporated into the polypeptide of interest.
  • the polypeptide can be illuminated with light to remove the cage group from the lysine or other unnatural amino acid.
  • the present inventions include polypeptide libraries comprising Methanosarcina mazei pyrrolysyl-tRNA synthetase (MmPylRS) sequences that collectively comprise mutations at positions corresponding to positions 305, 306, 309, 348, 384 and 419. In many embodiments, less than 20%, e.g., less than 10%, less than 5% or less of amino acids, other than those at positions 305, 306, 309, 348, 384, and 419 are mutated. In many embodiments, the library synthetases comprise one or more mutations selected from the group consisting of: Y306M, Y306I, L309A, C348A and Y384F.
  • the polypeptide can be present within a cell, such as, e.g., a eubacterial or eukaryotic cell.
  • nucleic acid comprising a sequence encoding an aminoacyl tRNA synthetase comprising an isoleucine or methionine at a position corresponding to position 306 of the wild type Methanosarcina mazei PylRS sequence, an alanine at a position corresponding to position 309, an alanine at a position corresponding to position 348, or a phenylalanine at a position corresponding to position 384.
  • the nucleic acid is typically incorporated into a vector, such as an expression vector, or a shuttle vector.
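  • As a hedged illustration of the substitution shorthand used above (Y306M, Y306I, L309A, C348A, Y384F), the following Python sketch applies such substitutions to a small map of binding pocket residues. The wild type residues at positions 306, 309, 348 and 384 are taken from the text (Tyr306, Leu309, Cys348, Tyr384 of SEQ ID NO: 2); the names `WILD_TYPE_POCKET` and `apply_substitutions` are hypothetical helpers for illustration and are not part of the disclosure.

```python
import re

# Wild type residues named in the text (positions refer to SEQ ID NO: 2).
WILD_TYPE_POCKET = {306: "Y", 309: "L", 348: "C", 384: "Y"}

# Shorthand such as "L309A": wild type residue, position, replacement residue.
_SUBSTITUTION = re.compile(r"([A-Z])(\d+)([A-Z])")


def apply_substitutions(pocket, substitutions):
    """Return a copy of `pocket` with each substitution applied.

    Raises ValueError if a substitution is malformed or if its stated
    wild type residue does not match the residue recorded in `pocket`.
    """
    result = dict(pocket)
    for sub in substitutions:
        match = _SUBSTITUTION.fullmatch(sub)
        if match is None:
            raise ValueError(f"malformed substitution: {sub!r}")
        wild_type, position, replacement = (
            match.group(1), int(match.group(2)), match.group(3))
        if pocket.get(position) != wild_type:
            raise ValueError(
                f"{sub}: expected {wild_type} at position {position}, "
                f"found {pocket.get(position)}")
        result[position] = replacement
    return result


# Illustrative combination drawn from the listed substitutions.
variant = apply_substitutions(WILD_TYPE_POCKET, ["L309A", "C348A", "Y384F"])
print(variant)
```

  • The check against the recorded wild type residue is a design choice that catches numbering mistakes early, which matters when positions are stated as "corresponding to" positions of a reference sequence.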
  • Figure 1 shows a ribbon diagram of an exemplary Methanosarcina pyrrolysyl-tRNA synthetase protein structure.
  • Figure 2 shows a schematic diagram of an exemplary translation system that can be shuttled between an enterobacteria and a mammalian cell and function in each.
  • Figure 3 shows structures of pyrrolysine and certain analogs of pyrrolysine.
  • Figure 4 shows incorporation of a pyrrolysine analog by native MmPylRS in
  • FIG. 2a shows northern blot analysis of tRNA charging in E. coli.
  • the uncharged tRNA^Pyl_CUA band and the charged tRNA^Pyl_CUA band are indicated by arrows.
  • tRNA^Pyl_CUA is only charged in the presence of both PylRS and Cyc.
  • Figure 2b western blot analysis shows protein expression in mammalian cells.
  • the full length mutant His-RBP4 is only expressed when CHO cells harboring both MmPylRS and tRNA^Pyl_CUA plasmids were grown with 5 mM Cyc.
  • Figure 2c shows library design for directed evolution of MmPylRS.
  • Pyl is colored magenta and residues in close contact with the terminal ring of Pyl are colored green.
  • Figure 5 presents SDS-PAGE and mass spectroscopy confirming the preparation of efficient polypeptide translation incorporating a caged lysine. The results demonstrate evolution of a MmPylRS-tRNA^Pyl_CUA pair that encodes ONBK in E. coli.
  • Figure 5a shows a plate assay of NBK-1 and NBK-2 surviving up to 120 μg ml⁻¹ Cm challenges when supplemented with 1 mM ONBK.
  • Figure 5b shows genetic incorporation of ONBK into GFP protein in E. coli, analyzed by SDS-PAGE. The expressed full length GFP proteins were purified by Ni²⁺-NTA chromatography and stained with Coomassie blue.
  • Figure 5c shows ESI-MS analysis of purified GFP149ONBK protein produced by NBK-1-tRNA^Pyl_CUA.
  • the major peak (mass: 27,915 Da) corresponds to the full length GFP149ONBK; the minor peak (mass: 27,782 Da) corresponds to the same protein with the N-terminal Met posttranslationally cleaved (GFP149ONBK-M).
  • FIG. 6 shows shuttling of E. coli mutated RS functioning orthogonally in a mammalian system. Shuttling the evolved synthetase into mammalian cells.
  • EGFP37TAG protein is expressed using a NBK-1-tRNA^Pyl_CUA pair in HEK293 cells in the presence of 1 mM ONBK. The top pictures show the fluorescence images of cells and the bottom pictures show cells illuminated with visible light.
  • Figure 6b shows ESI-MS analysis of purified EGFP37ONBK protein from CHO cells. Inset shows the deconvoluted spectrum of EGFP37ONBK.
  • Figure 6c shows ESI-MS analysis of EGFP37ONBK after photolysis. EGFP37ONBK protein at a final concentration of 100 μM was irradiated (365 nm) for 20 min.
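  • Construct names such as EGFP37TAG suggest a reporter gene in which the codon for a numbered residue has been replaced by an amber (TAG) selector codon. The Python sketch below shows one way such a replacement could be expressed. It is only an illustration under that assumed reading of the naming: the function `introduce_selector_codon` and the short coding sequence `toy_cds` are made up for the example and do not come from the sequence listing.

```python
def introduce_selector_codon(cds, residue, selector="TAG"):
    """Replace the codon encoding the 1-based `residue` with `selector`.

    `cds` is an in-frame DNA coding sequence whose first codon encodes
    residue 1. The returned sequence has the same length as the input.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if len(selector) != 3:
        raise ValueError("this sketch handles three-base selector codons only")
    n_codons = len(cds) // 3
    if not 1 <= residue <= n_codons:
        raise ValueError(f"residue must be between 1 and {n_codons}")
    start = 3 * (residue - 1)
    return cds[:start] + selector + cds[start + 3:]


# Hypothetical five-codon sequence (Met-Ala-Lys-Gly-stop), not a real gene.
toy_cds = "ATGGCTAAAGGTTAA"
amber_mutant = introduce_selector_codon(toy_cds, 3)
print(amber_mutant)
```

  • Four-base selector codons, which the text also contemplates, would change the reading frame downstream and are deliberately left out of this sketch.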
  • the present inventions are directed to, e.g., compositions and methods using orthogonal aminoacyl-tRNA synthetase/orthogonal tRNA (O-RS/O-tRNA) pairs derived from certain Archaea RS/tRNA pairs that normally charge pyrrolysine.
  • an RS/tRNA pair from Methanosarcina sp. is mutated to prepare a library of orthogonal pairs in bacteria incorporating an unnatural amino acid, then selected pairs are shuttled into eukaryotic cells.
  • an E. coli-mammalian shuttle system has been developed to genetically encode unnatural amino acids in mammalian cells using aminoacyl-tRNA synthetases (RSs) evolved in E. coli.
  • a pyrrolysyl-tRNA synthetase (PylRS) mutant was evolved in E. coli that selectively aminoacylates a cognate nonsense suppressor tRNA with a photocaged lysine derivative.
  • a wide variety of unnatural amino acids can, similarly, be incorporated using similarly constructed mutants. Transfer of such orthogonal tRNA-RS pairs into eukaryotic (e.g., mammalian) cells makes possible the selective incorporation of unnatural amino acids into proteins in such eukaryotic cells.
  • the present invention includes compositions and methods for shuttling unnatural amino acid incorporation functionality between enterobacterial species and eukaryotic translation systems.
  • the compositions include translation systems and translation system components designed with structures that function to incorporate unnatural amino acids in eukaryotes or prokaryotes, as desired.
  • the methods include techniques of providing a library of, e.g., mutated Archaea aminoacyl-tRNA synthetases in enterobacteria, screening the library to select for synthetases that charge a cognate tRNA with an unnatural amino acid of choice, shuttling the selected synthetase to a eukaryotic cell, and incorporating the unnatural amino acid into a polypeptide using the synthetase in the eukaryotic cell.
  • compositions of the present invention include, e.g., translation systems and translation system components comprising orthogonal aminoacyl-tRNA synthetases (O-RSs), orthogonal tRNAs (O-tRNAs) derived from Archaea, unnatural amino acids, and/or nucleic acids encoding polypeptides of interest.
  • the compositions can include libraries comprising synthetases that function orthogonally in both eubacteria and eukaryotes, cells comprising the translation system components, and/or vectors for expression of the translation system components.
  • the compositions include components having structures, such as, e.g., RS binding pockets and structural scaffolding, and tRNA selector codons and A arms and other structural features that function to incorporate desired unnatural amino acids into intended positions of polypeptides of interest.
  • Methods of the invention include selected and/or random mutation of an Archaea (e.g., Methanosarcina) RS, evaluation of charging activity in a eubacteria (e.g., E. coli), and shuttling the mutated RS and a cognate tRNA into a eukaryotic cell (e.g., a mammalian, insect or plant cell line).
  • the Archaea RS is a pyrrolysyl-tRNA synthetase and the mutations are directed to modification of amino acid residues at specific positions lining the binding pocket.
  • evaluation of charging activity includes positive and/or negative selection techniques in the eubacteria to enrich for and identify those mutant RSs with the highest desired charging activity.
  • Production of polypeptides including certain unnatural amino acids is desirable, e.g., to provide research tools and medicines reflecting the unique translation and post translation processing available in eukaryotic cells.
  • Desired characteristics of the orthogonal pair include tRNA that decode or recognize only a specific codon, e.g., a selector codon, e.g., an amber stop codon, that is not decoded by any endogenous tRNA, and aminoacyl-tRNA synthetases that preferentially aminoacylate, or "charge", its cognate tRNA with a specific unnatural amino acid.
  • the O-tRNA is also not typically aminoacylated, or is very poorly aminoacylated, i.e., "charged," by endogenous synthetases. For example, in an E.
  • an orthogonal pair will include an aminoacyl-tRNA synthetase that does not cross-react with any of the endogenous tRNAs, e.g., of which there are 40 endogenous in E. coli, and an orthogonal tRNA that is not aminoacylated by any of the endogenous synthetases, e.g., of which there are 21 in E. coli.
  • the term "cognate" refers to components that function together, or have some aspect of specificity for each other, e.g., an orthogonal tRNA and an orthogonal aminoacyl-tRNA synthetase.
  • the present invention includes orthogonal components that function in both bacterial and eukaryotic cells, as well as vectors for shuttling nucleic acids that encode the orthogonal components between such cells.
  • Translation system components that act orthogonally in both eubacteria and eukaryotic cells can be derived from certain Archaea translation system components, as discussed herein.
  • tRNAs from eubacteria and eukaryotes are generally not charged by native Archaea RSs.
  • some native Archaea RS/tRNA pairs can function in eubacterial translation systems, e.g., incorporating pyrrolysine as a natural orthogonal suppressor (see, e.g., Blight, et al., Direct Charging of tRNA (CUA) with Pyrrolysine In Vitro and In Vivo, Nature 431: 333-335, 2004).
  • Such natural translation system components provide a platform for engineering orthogonal unnatural amino acid specific RS/tRNA pair suppressors with modified structures that function to suppress a selector codon by incorporation of a selected unnatural amino acid into a polypeptide of interest.
  • the present orthogonal systems include, e.g., O-RS/O-tRNA pairs derived from Methanosarcina and/or Desulfitobacterium RS/tRNA pairs.
  • certain amino acid residues in the amino acid side chain binding pocket of the RS can be substituted to accommodate the characteristics of the desired amino acid side chain.
  • amino acids in the Archaea pyrrolysyl-tRNA synthetase (PylRS) binding pocket corresponding to positions 305, 306, 309, 348, 384 and/or 419 can be substituted with amino acids with size, polarity, hydrogen bonding groups and/or hydrophobic groups that configure the binding pocket to provide space and interactions promoting binding of a particular desired unnatural amino acid.
  • the modified RS/tRNA pairs can be tested in eubacteria (e.g., E.
  • Orthogonal translation systems generally comprise cells, e.g., prokaryotic cells such as E.
  • an orthogonal tRNA (O-tRNA), an orthogonal aminoacyl tRNA synthetase (O-RS), and an unnatural amino acid, e.g., a non-canonical amino acid, where the O-RS aminoacylates the O-tRNA with the unnatural amino acid.
  • An orthogonal pair of the invention can include an O-tRNA, e.g., a suppressor tRNA, a frameshift tRNA, or the like, and a cognate O-RS.
  • the orthogonal systems of the invention, which typically include O-tRNA/O-RS pairs, can comprise a cell or a cell-free environment.
  • the invention also provides novel individual components, for example, several novel orthogonal aminoacyl-tRNA synthetase polypeptides, e.g., those in the sequence listing herein, and the polynucleotides that encode these polypeptides, e.g., as shown in the sequence listing.
  • when an orthogonal pair recognizes a selector codon and loads an amino acid in response to the selector codon, the orthogonal pair is said to "suppress" the selector codon. That is, a selector codon that is not recognized by the translation system's, e.g., the E. coli, yeast, mammalian, etc. cell's, endogenous machinery is not ordinarily charged, which results in blocking production of a polypeptide that would otherwise be translated from the nucleic acid.
  • an O-tRNA of the invention recognizes a selector codon and includes at least about, e.g., a 45%, a 50%, a 60%, a 75%, a 80%, or a 90% or more suppression efficiency in the presence of a cognate synthetase in response to a selector codon as compared to the suppression efficiency of an O-tRNA comprising or encoded by a polynucleotide sequence as set forth in the sequence listing herein.
  • the O-tRNAs of the invention can recognize a selector codon and suppress in either or both a eubacteria or a eukaryote cell with a suppression efficiency of at least about, e.g., a 45%, a 50%, a 60%, a 75%, a 80%, or a 90% or more.
  • the suppression efficiency of the O-RS and the O-tRNA together is about, e.g., 5-fold, 10-fold, 15-fold, 20-fold, or 25-fold or more greater than the suppression efficiency of the O-tRNA lacking the O-RS.
  • the suppression efficiency of the O-RS and the O-tRNA together is at least about, e.g., 35%, 40%, 45%, 50%, 60%, 75%, 80%, or 90% or more of the suppression efficiency of an orthogonal synthetase pair as set forth in the sequence listings herein.
  • the O-RS/O-tRNA pair has a suppression efficiency in each of a eubacteria (e.g., an enterobacteria) and in a eukaryotic translation system (e.g., an animal cell) of at least about, e.g., 35%, 40%, 45%, 50%, 60%, 75%, 80%, or 90% or more.
  • the translation system e.g., an enterobacteria, yeast, insect, mammalian cell, human cell or in vitro system, uses the O-tRNA/O-RS pair to incorporate the unnatural amino acid into a growing polypeptide chain, e.g., via a nucleic acid that comprises a polynucleotide that encodes a polypeptide of interest, where the polynucleotide comprises a selector codon that is recognized by the O-tRNA.
  • the cell can include one or more additional O-tRNA/O-RS pairs, where the additional O-tRNA is loaded by the additional O-RS with a different unnatural amino acid.
  • one of the O-tRNAs can recognize a four base codon and the other O-tRNA can recognize a stop codon.
  • multiple different stop codons, multiple different four base codons, multiple different rare codons and/or multiple different non-coding codons can be used in the same coding nucleic acid.
  • the translation system can further include an additional different O-tRNA/O-RS pair and a second different unnatural amino acid, where this additional O-tRNA recognizes a second selector codon and this additional O-RS preferentially aminoacylates the O-tRNA with the second unnatural amino acid.
  • a cell that includes an O-tRNA/O-RS pair, where the O-tRNA recognizes, e.g., an amber selector codon can further comprise a second orthogonal pair, where the second O-tRNA recognizes a different selector codon, e.g., an opal codon, an ochre codon, a four-base codon, a rare codon, a non-coding codon, or the like.
  • the different orthogonal pairs are derived from different sources, which can facilitate recognition of different selector codons.
  • translation systems can comprise an in vitro translation system, a cell, such as an E. coli or other bacterial cell, yeast, plant cell, mammalian or other eukaryotic cell, that includes an orthogonal tRNA (O-tRNA), an orthogonal aminoacyl-tRNA synthetase (O-RS), an unnatural amino acid and a nucleic acid that comprises a polynucleotide that encodes a polypeptide of interest, where the polynucleotide comprises the selector codon that is recognized by the O-tRNA.
  • orthogonal translation systems, e.g., translation systems comprising an O-RS, an O-tRNA and an unnatural amino acid, can utilize cultured cells to produce proteins having unnatural amino acids.
  • an orthogonal translation system of the invention need not require an intact, viable cell.
  • an orthogonal translation system can utilize a cell-free system in the presence of a cell extract.
  • the use of cell-free in vitro transcription/translation systems for protein production is a well established technique. Adaptation of these in vitro systems to produce proteins having unnatural amino acids using orthogonal translation system components described herein is well within the scope of the invention.
  • the O-tRNA and/or the O-RS can be naturally occurring or can be, e.g., derived by mutation of a naturally occurring tRNA and/or RS, e.g., by generating libraries of tRNAs and/or libraries of RSs, from any of a variety of organisms and/or by using any of a variety of available mutation strategies.
  • one strategy for producing an orthogonal tRNA/ aminoacyl-tRNA synthetase pair involves importing a tRNA/synthetase pair that is heterologous to the system in which the pair will function from a source, or multiple sources, other than the translation system in which the tRNA/synthetase pair will be used.
  • O-RS/O-tRNA pairs from Archaea can be imported to eubacterial or eukaryotic systems to function orthogonally, in native form or with selected mutations, to incorporate desired unnatural amino acids.
  • the properties of the heterologous synthetase candidate include, e.g., that it does not charge any host cell tRNA, and the properties of the heterologous tRNA candidate include, e.g., that it is not aminoacylated by any host cell synthetase.
  • the heterologous tRNA is orthogonal to all host cell synthetases. Strategies to generate orthogonal pairs can involve generating mutant libraries from which to screen and/or select an O-tRNA or O-RS with the desired functional structures. Importation and mutant library screening strategies can also be combined.
  • Synthetase libraries can include two or more different mutant nucleic acids encoding different RSs.
  • the RSs can be derived from Archaea, such as, e.g., Methanosarcina and/or Desulfitobacterium species, e.g., using pylRS and its cognate tRNA as a platform to develop unnatural amino acid-specific orthogonal pairs.
  • the mutations substitute amino acids in positions lining the amino acid side chain binding pocket of the RS.
  • although the RSs can predictably include amino acid substitutions outside the binding pocket that retain general structures (e.g., secondary and tertiary structure form and function), key mutations for customizing amino acid specificity are typically made in the binding pocket residues.
  • RS libraries can be provided to receive a wide variety of unnatural amino acids as substrate.
  • the Methanosarcina mazei pyrrolysyl-tRNA synthetase (MmPylRS) can be, e.g., selectively and/or randomly mutated at key amino acid positions to provide any desired specificity.
  • mutations can be directed to MmPylRS (NA SEQ ID NO: 1; polypeptide SEQ ID NO: 2) amino acid positions 305, 306, 309, 348, 384 and/or 419 to accommodate and favorably interact with an unnatural amino acid of given structure.
  • similar functional libraries can be derived from homologous RS sequences, e.g., with mutations directed to positions corresponding to MmPylRS amino acid positions 305, 306, 309, 348, 384 and/or 419.
  • libraries with members functioning to charge a given unnatural amino acid can be designed with appropriate mutations to the pyrrolysyl-tRNA synthetase of Methanosarcina mazei or Desulfitobacterium species at positions corresponding to MmPylRS positions 305, 306, 309, 346, 348, 384, 417 and/or 419.
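To give a sense of scale for such libraries, the following minimal sketch counts the distinct protein variants that would result if every listed pocket position were fully randomized. Full randomization to all 20 canonical amino acids is an assumption made here for illustration; the text only says the positions can be selectively and/or randomly mutated, and the function and constant names are hypothetical.

```python
# Pocket positions named in the text (MmPylRS numbering).
CORE_POSITIONS = (305, 306, 309, 348, 384, 419)
EXTENDED_POSITIONS = (305, 306, 309, 346, 348, 384, 417, 419)


def library_size(positions, residues_per_position=20):
    """Count protein variants if each position is independently randomized.

    Assumes (for illustration only) that every position may take any of
    `residues_per_position` amino acids.
    """
    return residues_per_position ** len(positions)


core_size = library_size(CORE_POSITIONS)          # 20 ** 6
extended_size = library_size(EXTENDED_POSITIONS)  # 20 ** 8
print(core_size, extended_size)
```

In practice a library would usually randomize only a subset of positions, or restrict each position to a chosen residue set, which can be modeled by lowering `residues_per_position`.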
  • Orthogonal tRNA (O-tRNA)
  • An orthogonal tRNA (O-tRNA) of the invention desirably mediates incorporation of an unnatural amino acid into a protein that is encoded by a polynucleotide that comprises a selector codon that is recognized by the O-tRNA, e.g., in vivo or in vitro.
  • an O-tRNA of the invention has at least about, e.g., 45%, 50%, 60%, 75%, 80%, or 90% or more suppression efficiency in the presence of a cognate synthetase in response to a selector codon, as compared to an O-tRNA comprising or encoded by a polynucleotide sequence as set forth in the O-tRNA sequences in the sequence listing herein.
  • the tRNA will typically display this selectivity in both bacterial and eukaryotic cells.
  • O-tRNAs of the invention are set forth in the sequence listing herein. The disclosure herein also provides guidance for the design of additional equivalent O-tRNA species.
  • for an RNA molecule, such as an O-RS mRNA or an O-tRNA molecule, thymine (T) is replaced with uracil (U) relative to a given sequence (or vice versa for a coding DNA), or complement thereof. Additional routine modifications to the bases can also be present.
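The T/U relationship between a DNA sequence and the corresponding RNA can be sketched as a simple string substitution. This is only an illustration of the base replacement described above; the function names are hypothetical, and base modifications are not modeled.

```python
def dna_to_rna(sequence):
    """Return the RNA form of a DNA sequence (T replaced with U)."""
    return sequence.upper().replace("T", "U")


def rna_to_dna(sequence):
    """Return the coding-DNA form of an RNA sequence (U replaced with T)."""
    return sequence.upper().replace("U", "T")


print(dna_to_rna("ATGTAGGCT"))
print(rna_to_dna("AUGUAGGCU"))
```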
  • the O-tRNA can have 80% sequence identity, 90% identity, 95% identity, 98% identity, or more to an orthogonal tRNA, such as Mmpyl-tRNA of SEQ ID NO: 7.
  • the invention also encompasses conservative variations of O-tRNAs corresponding to particular O-tRNAs herein.
  • conservative variations of O-tRNA include those molecules that function like the particular O-tRNAs, e.g., as in the sequence listing herein, and that maintain the tRNA L-shaped structure by virtue of appropriate self-complementarity, but that do not have a sequence identical to that, e.g., in the sequence listing, and desirably, are other than wild type tRNA molecules.
  • a composition comprising an O-tRNA can further include an orthogonal aminoacyl-tRNA synthetase (O-RS), where the O-RS preferentially aminoacylates the O-tRNA with an unnatural amino acid.
  • a composition including an O-tRNA can further include a translation system, e.g., in vitro or in vivo.
  • a nucleic acid that comprises a polynucleotide that encodes a polypeptide of interest, where the polynucleotide comprises a selector codon that is recognized by the O-tRNA, or a combination of one or more of these can also be present in the cell.
  • Orthogonal aminoacyl-tRNA synthetase (O-RS)
  • the present orthogonal synthetases can be derived from any Archaea synthetases, particularly pyrrolysyl-tRNA synthetases, by selectively engineering or randomly mutating RS binding pocket amino acids corresponding to those identified herein.
  • Orthogonal synthetases of the invention typically include an amino acid binding pocket configured to accept the side chain of a desired unnatural amino acid as a substrate.
  • the RS can be any RS having significant homology to Methanosarcina pylRSs, particularly in the region of the binding pocket, and mutated to provide structures that function to accept an intended unnatural amino acid as a substrate.
  • Significant homology can be found according to methods known in the art and discussed herein.
  • alternate functional synthetases can be provided by mutating homologous RSs to have mutations similar to those identified or suggested herein.
  • RSs homologous to the presently identified or suggested RSs can be mutated to include similar mutations in binding pocket amino acids in order to accept the same or similar unnatural amino acids as substrates.
  • the homologous RSs can have 99% sequence identity or more, more than about 98% identity, 95% identity, 90% identity, 80% identity, 50% identity, or more.
  • the alternate RSs for similar mutation of the binding pocket have a relatively high percent identity in the region of the amino acid binding pocket.
  • the percent identity in a homologous RS region can be at least 75%, at least 90%, at least 95%, at least 98%, at least 99%, or more. This percent identity in homologous regions is particularly desirable in regions corresponding to positions between amino acid 305 and 419 of the MmpylRS (SEQ ID NO: 2).
  • Homology of proteins and/or protein sequences can be the result of derivation from a common ancestral protein or protein sequence. Homology can be inferred, e.g., from structural and functional characteristics and from the percent identity of a putative homologous protein or homologous region of a protein. That is, homology is generally inferred from sequence similarity between two or more nucleic acids or proteins (or sequences thereof). The precise percentage of similarity between sequences that is useful in establishing homology varies with the nucleic acid and protein at issue, but as little as 25% sequence similarity is routinely used to establish homology.
  • Higher levels of sequence similarity, e.g., 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or 99%, or more, can also be used to establish homology.
  • Methods for determining sequence similarity percentages (e.g., BLASTP and BLASTN using default parameters) are described herein and are generally available.
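As a rough illustration of what a percent identity figure means, the sketch below computes identity over two sequences that have already been aligned (gaps written as "-"). It is not BLASTP or BLASTN: it performs no alignment, and it counts only columns where neither sequence has a gap, which is one of several conventions and is an assumption of this example. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of identical residues over gap-free columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy fragments; one mismatch in five compared columns.
print(percent_identity("LYLAC", "LYMAC"))
```

A region-restricted figure, such as identity over the columns corresponding to positions 305 through 419, can be obtained by slicing both aligned strings before calling the function.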
  • the O-RS of the invention preferentially aminoacylates an O-tRNA with an unnatural amino acid, e.g., an epsilon substituted lysine, a photocaged lysine, an ortho, meta and/or para-substituted phenylalanine or tyrosine, alkynyl aryl amino acids, aliphatic amino acids, alpha hydroxy acid substituted amino acids, beta diketo containing amino acids, alkoxyamine containing amino acids, borono-substituted amino acids and/or the like, in vitro or in vivo.
  • the O-RS of the invention can be provided to the translation system, e.g., a bacterial or eukaryotic cell, by a polypeptide that includes an O-RS and/or by a polynucleotide that encodes an O-RS or a portion thereof.
  • an example O-RS comprises an amino acid sequence as set forth in the sequence listing, or a conservative variation thereof.
  • an O-RS, or a portion thereof is encoded by a polynucleotide sequence that encodes an amino acid comprising sequence in the sequence listing or examples herein, or a complementary polynucleotide sequence thereof.
  • the orthogonal translational components (O-tRNA and O-RS) of the invention can be derived from any Archaea organism, or a combination of organisms, for use in a host translation system from any eubacterial or eukaryotic species, with the caveat that the O-tRNA/O-RS components and the host system work in an orthogonal manner. It is not a requirement that the O-tRNA and the O-RS from an orthogonal pair be derived from the same organism.
  • the orthogonal components are derived from archaebacterial genes for use in a eubacterial host system and/or eukaryotic host system.
  • the orthogonal O-tRNA can be derived from an archaebacterium, such as Methanosarcina mazei, Methanosarcina acetovorans, Methanosarcina barkeri, Methanosarcina frisia, Methanosarcina thermophila, Methanosarcina vacuolata, Desulfitobacterium hafniense, Methanococcus jannaschii, Methanobacterium thermoautotrophicum, Halobacterium such as Haloferax volcanii and Halobacterium species NRC-1, Archaeoglobus fulgidus, Pyrococcus furiosus, Pyrococcus horikoshii, Aeropyrum pernix, Methanococcus maripaludis, Methanopyrus kandleri, Methanosarcina mazei (Mm), Pyrobaculum aerophilum, Pyrococcus abyssi, Sulfolobus solfataricus
  • the orthogonal O-RS can be derived from an organism or combination of organisms, e.g., an archaebacterium, such as Methanosarcina mazei, Methanosarcina acetovorans, Methanosarcina barkeri, Methanosarcina frisia, Methanosarcina thermophila, Desulfitobacterium hafniense, Methanosarcina vacuolata, Methanococcus jannaschii, Methanobacterium thermoautotrophicum, Halobacterium such as Haloferax volcanii and Halobacterium species NRC-1, Archaeoglobus fulgidus, Pyrococcus furiosus, Pyrococcus horikoshii, Aeropyrum pernix, Methanococcus maripaludis, Methanopyrus kandleri, Pyrobaculum aerophilum, Pyrococcus abyssi, Sulfolobus solfataricus, Sulfolobus tokodaii, Thermoplasma acidophilum, Thermoplasma volcanium, or the like.
  • the O-tRNA is a native Methanosarcina tRNA or is derived from a Methanosarcina tRNA.
  • the O-RS is a native Methanosarcina RS or is derived from a Methanosarcina RS. PylRS/tRNA pairs from these organisms represent particularly desirable platforms for development of unnatural amino acid-specific orthogonal pairs.
  • the individual components of an O-tRNA/O-RS pair can be derived from the same organism or from different organisms.
  • the O-tRNA/O-RS pair is from the same organism.
  • the O-tRNA and the O-RS of the O-tRNA/O-RS pair are from different organisms.
  • the O-tRNA/O-RS pair can be derived from a natural Archaeal pair, or derived from Archaeal tRNA and RS that were previously not functionally paired, e.g., a tRNA from Methanosarcina mazei and an RS from Methanosarcina acetovorans.
  • the O-tRNA, O-RS or O-tRNA/O-RS pair can be selected or screened in vivo or in vitro and/or used in a cell, e.g., a eubacterial cell or enterobacterial cell to screen RS/tRNA pair activity or to produce a polypeptide with an unnatural amino acid.
  • the eubacterial cell used is not limited and can be, for example, Escherichia coli, Thermus thermophilus, Bacillus subtilis, Bacillus stearothermophilus, or the like.
  • Compositions of eubacterial cells comprising translational components of the invention are also a feature of the invention.
  • the O-tRNA, O-RS or O-tRNA/O-RS pair functions in a eukaryotic translation system to incorporate an unnatural amino acid of interest.
  • the O-tRNA, O-RS or O-tRNA/O-RS pair functions to incorporate the unnatural amino acid both in a eubacterial translation system and in a eukaryotic translation system.
  • Selector codons of the invention expand the genetic codon framework of protein biosynthetic machinery.
  • a selector codon includes, e.g., a unique three base codon, a nonsense codon, such as a stop codon, e.g., an amber codon (UAG), an ochre codon (UAA), or an opal codon (UGA), an unnatural codon, at least a four base codon, a rare codon, or the like.
  • a number of selector codons can be introduced into a desired gene, e.g., one or more, two or more, more than three, etc.
  • Conventional site-directed mutagenesis can be used to introduce the selector codon at the site of interest in a polynucleotide encoding a polypeptide of interest. See, e.g., Sayers, J. R., et al. (1988) "5', 3' Exonuclease in phosphorothioate-based oligonucleotide-directed mutagenesis." Nucl Acid Res 16: 791-802.
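At the sequence level, introducing a selector codon amounts to replacing the codon at the chosen residue with, e.g., the amber codon. The sketch below shows that in-frame replacement on a coding DNA string; it models only the resulting sequence, not the mutagenesis procedure itself. The function name and the toy coding sequence are hypothetical.

```python
AMBER = "TAG"  # DNA form of the amber selector codon (UAG in mRNA)


def introduce_selector_codon(coding_dna, residue_number, selector=AMBER):
    """Replace the codon for `residue_number` (1-based) with a selector codon."""
    sequence = coding_dna.upper()
    if len(sequence) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if len(selector) != 3:
        raise ValueError("this sketch handles three-base selector codons only")
    start = (residue_number - 1) * 3
    if residue_number < 1 or start + 3 > len(sequence):
        raise ValueError("residue number is outside the coding sequence")
    return sequence[:start] + selector + sequence[start + 3:]


# Toy open reading frame: Met-Lys-Gly-stop; residue 2 becomes the amber site.
print(introduce_selector_codon("ATGAAAGGCTAA", 2))
```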
  • By using different selector codons, multiple orthogonal tRNA/synthetase pairs can be used that allow the simultaneous site-specific incorporation of multiple same or different unnatural amino acids, e.g., including at least one unnatural amino acid, using these different selector codons.
  • Unnatural amino acids can also be encoded with rare codons.
  • the rare arginine codon, AGG, has proven to be efficient for insertion of Ala by a synthetic tRNA acylated with alanine. See, e.g., Ma, C. et al., (1993) "In vitro protein engineering using synthetic tRNA Ala with different anticodons.” Biochemistry 32: 7939-7945.
  • the synthetic tRNA competes with the naturally occurring tRNAArg, which exists as a minor species (fewer tRNA molecules than for other Arg tRNAs and associated with a far lower occurrence (rarity) of the corresponding codon) in Escherichia coli.
  • some organisms do not use all triplet codons.
  • An unassigned codon AGA in Micrococcus luteus has been utilized for insertion of amino acids in an in vitro transcription/translation extract. See, e.g., Kowal and Oliver, (1997) "Exploiting unassigned codons in Micrococcus luteus for tRNA-based amino acid mutagenesis.” Nucl Acid Res 25: 4685-4689.
  • a rare codon can be considered a codon used in a cell or translation system less than 5 percent of the time to encode a particular amino acid compared to the total of other codons encoding the amino acid in the system or cell.
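The 5 percent criterion can be applied directly to codon usage counts for one amino acid. The sketch below does this for the six arginine codons; the counts are invented placeholder numbers chosen only to exercise the calculation and are not measured usage data for any organism. Function names are hypothetical.

```python
def codon_fractions(counts):
    """Fraction of an amino acid's occurrences encoded by each synonymous codon."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no codon observations supplied")
    return {codon: n / total for codon, n in counts.items()}


def rare_codons(counts, threshold=0.05):
    """Codons used for this amino acid less than `threshold` of the time."""
    fractions = codon_fractions(counts)
    return sorted(codon for codon, f in fractions.items() if f < threshold)


# Hypothetical counts for the six arginine codons (illustrative only).
arg_counts = {"CGU": 400, "CGC": 380, "CGA": 70, "CGG": 100, "AGA": 30, "AGG": 20}
print(rare_codons(arg_counts))
```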
  • Selector codons can also comprise extended codons, e.g., four or more base codons, such as, four, five, six or more base codons.
  • four base codons include, e.g., AGGA, CUAG, UAGA, CCCU, and the like.
  • five base codons include, e.g., AGGAC, CCCCU, CCCUC, CUAGA, CUACU, UAGGC and the like.
  • Methods of the invention include using extended codons based on frameshift suppression.
  • Four or more base codons can insert, e.g., one or multiple unnatural amino acids, into the same protein.
  • the anticodon loops can decode, e.g., at least a four-base codon, at least a five-base codon, or at least a six-base codon or more. Since there are 256 possible four-base codons, multiple unnatural amino acids can be encoded in the same cell using a four or more base codon.
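The count of 256 follows from four bases at each of four positions (4^4). A short enumeration, given here only as an illustrative check, confirms the figure and shows that the example extended codons listed above are members of the enumerated sets. The function name is hypothetical.

```python
from itertools import product

BASES = "ACGU"


def extended_codons(length):
    """Enumerate every possible codon of the given length over A, C, G, U."""
    return ["".join(bases) for bases in product(BASES, repeat=length)]


four_base = extended_codons(4)
five_base = extended_codons(5)
print(len(four_base), len(five_base))
print(all(c in four_base for c in ("AGGA", "CUAG", "UAGA", "CCCU")))
```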
  • a selector codon can also include one of the natural three base codons, where the endogenous system does not use (or rarely uses) the natural base codon. For example, this includes a system that is lacking a tRNA that recognizes the natural three base codon, and/or a system where the three base codon is a rare codon.
  • Selector codons optionally include unnatural base pairs.
  • Descriptions of unnatural base pairs which can be adapted for methods and compositions include, e.g., Hirao, et al., (2002) “An unnatural base pair for incorporating amino acid analogues into protein.” Nature Biotechnology 20: 177-182. See also Wu, et al, (2002) “Enzymatic Phosphorylation of Unnatural Nucleosides.” J Am Chem Soc 124: 14626-14630.
  • pyrrolysyl-tRNA synthetases have a propensity to charge their cognate tRNA orthogonally in both eukaryotic and eubacterial translation systems. They are also relatively non-specific (promiscuous) in their selectivity between unnatural amino acids of similar structure (see, e.g., Mukai, et al., Biochem. Biophys. Res. Comm. 371: 818-822, 2008).
  • Methanosarcina PylRSs can be modified for use in charging a wide variety of unnatural amino acids, e.g., in both eukaryotic and prokaryotic translation systems.
  • modifications to amino acids at certain positions can be provided, e.g., by directed protein engineering and/or by screening of random mutations at the positions, to functionally accommodate various unnatural amino acid structures as aminoacylation substrates.
  • the methods include selection of a desired unnatural amino acid, identification of expected favorable and unfavorable interactions with the Methanosarcina pylRS binding pocket amino acids, directed or random mutation of identified pocket amino acids, and screening of mutated pylRSs to identify those with highest charging activity with the desired unnatural amino acid.
  • the methods can include shuttling mutated pylRSs to a eukaryotic translation system for incorporation of the unnatural amino acid into, e.g., a eukaryotically processed polypeptide.
  • the unnatural amino acid can be any, including, e.g., an epsilon-substituted lysine, a photocaged lysine, an ortho acyl-substituted phenylalanine, a meta acyl-substituted phenylalanine, a para acyl-substituted phenylalanine, an ortho azido-substituted phenylalanine, a meta azido-substituted phenylalanine, a para azido-substituted phenylalanine, an ortho borono-substituted phenylalanine, a meta borono-substituted phenylalanine, an ortho borono-
  • the unnatural amino acids are other than Boc-lysine, acetyllysine or Nε-benzyloxycarbonyl-L-lysine.
  • Methanosarcina synthetases can be modified to accommodate these unnatural amino acids in the binding pocket and charge a cognate tRNA, e.g., for ultimate incorporation into a polypeptide.
  • Pyrrolysine is essentially an analog of lysine substituted at the epsilon nitrogen with a ketopyrrole group.
  • the main selective interaction between the Methanosarcina pylRS and amino acids for charging is at the binding pocket amino acids at positions corresponding to, e.g., amino acid positions Leu305, Tyr306, Leu309, Cys348, Tyr384 and Gly419 of the Methanosarcina mazei pylRS (SEQ ID NO: 2).
  • the amino acids are arranged along the pocket in the order Gly419, Cys348, Leu309, Leu305, Tyr306 and Tyr384 (see, Figure 4).
  • Tyr306, Leu309 and Leu305 are positioned at the far end of the pocket, projecting to specifically interact hydrophobically and by hydrogen bonding with the end of the pyrrolysine side chain, while the other amino acids line the sides of the pocket.
  • the pocket can be modified by intelligent substitution of the identified amino acids in consideration of, e.g., steric interactions, hydrophobic interactions and hydrogen bonding interactions that would promote functional interactions with unnatural amino acids of choice. For example, where the desired unnatural amino acid would extend deeper into the binding pocket with a hydrophobic group than does pyrrolysine, shorter hydrophobic amino acids can be selected for positions 306 and 309, e.g., to avoid steric hindrance while maintaining hydrophobic interactions useful in specifically binding the amino acid in the pocket.
  • larger amino acids can be provided, e.g., at positions 305, 306 and/or 309; typically with at least one of these amino acids including a hydrogen bonding group.
  • O-nitrobenzyl-oxycarbonyl-N-L-lysine (ONBK) is longer than pyrrolysine and includes a larger terminal aryl group.
  • the PylRS can accommodate ONBK when the end-pocket Tyr306 residue is substituted with a hydrophobic, less sterically hindering Ile or Met. Further, ONBK is better accommodated when steric hindrance is reduced with a shorter Ala at positions 309 and 348.
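The ONBK-accommodating substitutions described above can be written in conventional point-mutation notation (e.g., Y306M, L309A, C348A) and applied to a sequence programmatically. The sketch below does so on a toy placeholder sequence that carries only the wild-type pocket residues named in the text (Leu305, Tyr306, Leu309, Cys348, Tyr384, Gly419); it is not SEQ ID NO: 2, and the filler residue, helper names and choice of Met at 306 (the text allows Ile or Met) are assumptions for illustration.

```python
import re

# Wild-type pocket residues named in the text (MmPylRS numbering).
WILD_TYPE_POCKET = {305: "L", 306: "Y", 309: "L", 348: "C", 384: "Y", 419: "G"}


def toy_sequence(length=419, filler="S"):
    """Placeholder sequence: filler everywhere except the named pocket residues."""
    residues = [filler] * length
    for position, amino_acid in WILD_TYPE_POCKET.items():
        residues[position - 1] = amino_acid
    return "".join(residues)


def apply_mutations(sequence, mutations):
    """Apply point mutations such as 'Y306M', checking the wild-type residue."""
    residues = list(sequence)
    for mutation in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
        if match is None:
            raise ValueError(f"unrecognized mutation: {mutation}")
        old, position, new = match.group(1), int(match.group(2)), match.group(3)
        if position < 1 or position > len(residues):
            raise ValueError(f"position out of range: {mutation}")
        if residues[position - 1] != old:
            raise ValueError(f"wild-type residue mismatch: {mutation}")
        residues[position - 1] = new
    return "".join(residues)


onbk_variant = apply_mutations(toy_sequence(), ["Y306M", "L309A", "C348A"])
print(onbk_variant[304:309])  # residues 305 through 309 of the variant
```

Checking the wild-type residue before substituting guards against numbering errors when the same notation is transferred to a homologous synthetase.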
  • residues of the synthetase structure outside the identified binding pocket residues are optionally mutated in a conservative fashion. That is, amino acids making up alpha helices, beta sheets, hairpins and other structural features, e.g., in the polypeptide secondary structures, can be maintained, even with substantial conservative changes to the amino acid sequence. For example, amino acids that generally support certain identified secondary structures can be substituted for other amino acids that support the identified structure while maintaining the orientation of amino acids in the binding pocket and retaining functional activity of the RS. With careful selection of conservative variations in non-binding pocket structures, useful charging function can be maintained with more than 50% substitutions with amino acids that conserve the alpha helix and beta sheet structures of the RS, as shown in Figure 1. The RSs can retain functional activity and selectivity with intelligent conservative amino acid substitutions from about 1% to 50% or more, about 2% to 40%, about 4% to 30%, about 5% to 25%, about 10% to 20% or about 15% of the total amino acids in the RS structure.
  • the unnatural amino acids are substituted Tyr and Phe analogs
  • the amino acid residues 348 and/or 384 at the opening of the binding pocket can be relatively short hydrophobic residues (e.g., Ala, Pro or Val) to allow space and to hydrophobically interact with the aryl residue of the amino acid.
  • the substituted Tyr and Phe analogs include a short hydrophilic group
  • amino acids at positions 309, 306 and/or 305 can be selected to extend relatively far to interact with the hydrophilic group (using, e.g., Lys, Arg, His, Glu and/or Gln).
  • amino acids at positions 309, 306 and/or 305 can extend relatively far to interact with the hydrophobic group (using, e.g., Met, Trp, Ile and/or Leu).
  • the unnatural amino acid has a relatively long side chain (e.g., 5, 6, 7, 8 or more carbon bond equivalents) with one or more hydrophilic groups
  • positions 306, 305 and/or 309 can have an amino acid with a relatively short hydrophilic side chain (e.g., Ser, Asp, Thr, and/or Cys).
  • positions 305, 306 and/or 309 can have an amino acid with a relatively short hydrophobic side chain (e.g., Ala, Val, Pro and/or Ile) to accommodate the longer unnatural amino acid in the binding pocket.
  • RS amino acids at positions 348, 384 and/or 419 can be substituted, as appropriate.
  • the amino acids at RS positions 348 and 384 can include, e.g., Ala, Val, Ile, Leu and/or Pro; it would logically be advisable to retain Gly419 as Gly.
  • the amino acids at positions 348 and 384 can include, e.g., Asp, Glu, Ser, Thr, Asn, Gln and/or Cys.
  • the amino acids at positions 348, 384 and 419 can include, e.g., Met, Trp, Phe and/or Tyr.
  • the amino acids at positions 348, 384 and 419 can include, e.g., Arg, Lys, Gln, Glu and/or Asn.
  • Methanosarcina PylRS or other Archaea
  • a selection of alternate mutants can be generated and screened by positive and/or negative selection, as described herein, to enrich for mutant pylRSs with the most enhanced activity and/or specificity with the unnatural amino acid.
  • it is typically unnecessary to select for mutant RSs that preferentially aminoacylate one unnatural amino acid over another unnatural amino acid, because such cross-over can typically be avoided by controlling what unnatural amino acid is made available in the translation system.
  • selection schemes to select one unnatural amino acid over another can be employed, e.g., where multiple orthogonal elements are used to incorporate more than one unnatural amino acid site-specifically into a polypeptide of interest.
  • the methods of incorporating the unnatural amino acids typically include mutation or substitution of a nucleic acid encoding a Methanosarcina pylRS to express putative appropriate amino acids on translation.
  • two or more optional mutant Methanosarcina pylRSs are associated in a library.
  • libraries of mutant pylRSs are screened to identify library members having the most desired characteristics of activity and selectivity. The screening is typically more convenient to carry out in bacteria than eukaryotic cells.
  • RS ribonucleic acid
  • a eukaryotic translation system e.g., a mammalian cell.
  • shuttling is accomplished by excising the nucleic acid encoding the RS from, e.g., a bacterial expression plasmid, functionally ligating it into a eukaryotic expression plasmid and transforming a compatible eukaryotic cell.
  • Shuttling generally involves transfer of encoding nucleic acids from one host cell type to another.
  • translation system components e.g., O-RSs and O-tRNAs
  • a first system such as a eubacterial system
  • a second system such as a eukaryotic system.
  • the nucleic acid encoding the translation system component typically includes a transcription promoter recognized by transcription components of the host system.
  • Shuttling typically requires the nucleic acid encoding the component to be physically transferred from one host to the other.
  • the encoding nucleic acid present in the first host system is recombined into a new expression vector before transfer into the new host system.
  • the encoding nucleic acid can be transferred directly to the second host, e.g., in a shuttle vector.
  • An expression vector is usually a plasmid that is used to introduce and express a specific gene into a target cell. Once the expression vector is inside the cell, the protein that is encoded by the gene is produced by the cellular transcription and translation machinery.
  • the plasmid is typically engineered to contain an active transcription promoter that facilitates production of mRNA complementary to the gene.
  • Many expression vectors are designed to function properly only in a particular suitable cell type (e.g., a bacterium, a plant cell, an animal cell, a yeast, etc.).
  • a selected O-RS is present in a bacterial cell encoded in a plasmid having promoters specific to the bacterial cell.
  • the promoters for bacterial expression can include bacteriophage promoters, native bacterial gene promoters or engineered promoters.
  • Plasmids for expression of orthogonal translation system components in eubacteria can include, e.g.: the lambda PL promoter, the tac promoter/operator (Ptac), the E. coli arabinose operon promoter (Pbad), the E. coli glutamine promoter (glnS), a mutant glnS promoter (glnS'), and/or the like.
  • expression can be regulated by eukaryotic promoters, regulators, enhancers, and the like.
  • Plasmids for expression of orthogonal components can include a TATA sequence, an upstream activator sequence (UAS), initiator sequences (INR), downstream promoter elements (DPE), and/or the like.
  • promoters such as the tetracycline-inducible CMV promoter (CMV promoter with TetO sites), the EF1-α promoter, and the β-actin promoter
  • a commonly used eukaryotic expression plasmid includes the constitutive CMV promoter.
  • Methods to select orthogonal components in bacterial cells can include transformation of the bacteria with plasmids encoding the components so that the components can be readily cloned, screened and identified in the bacterial environment. Once the desired components are identified, the encoding plasmids can be harvested by conventional methods. The nucleic acid sequence encoding the desired component can be cut from the bacterial expression plasmid, e.g., using specific endonucleases, and purified (e.g., by chromatography or electrophoresis).
  • the purified nucleic acid encoding the component can then be ligated into an expression vector adapted for expression in a desired eukaryotic host cell.
  • the eukaryotic expression vector containing the nucleic acid encoding the component can be used to transform the eukaryotic host cell for expression of the component in the new host cell.
  • the orthogonal component has been shuttled from the bacterial host to the eukaryotic host.
  • a shuttle vector is a vector (usually a plasmid) constructed so that it can propagate in two different host species. Therefore, DNA inserted into a shuttle vector can be tested or manipulated in two different cell types.
  • the main advantage of these vectors is that they can be manipulated in bacteria and then used in a system that is more difficult or slower to use (e.g., yeast or mammalian cells) without intervening DNA recombination steps.
  • Shuttle vectors include plasmids that can propagate in eukaryotes and prokaryotes (e.g., both Saccharomyces cerevisiae and E. coli). For example, certain adenovirus shuttle vectors can function to express a polypeptide in both E. coli and mammals.
  • Yeast shuttle vectors can be useful in the present methods.
  • Yeast shuttle vectors typically have components that allow for replication and selection in both E. coli cells and yeast cells.
  • the E. coli component of a yeast shuttle vector includes an origin of replication and a selectable marker, e.g., an antibiotic resistance gene such as beta-lactamase.
  • the yeast component of a yeast shuttle vector includes an autonomously replicating sequence (ARS), a yeast centromere (CEN), and a yeast selectable marker (e.g., URA3, a gene that encodes an enzyme for uracil synthesis, Lodish et al. 2007).
  • Unnatural amino acids charged and incorporated in translation systems of the invention can be photocaged amino acids.
  • “Caging” groups of amino acids can inhibit or conceal (e.g., by disrupting bonds which would usually stabilize interactions with target molecules, by changing the hydrophobicity or ionic character of a particular side chain, or by steric hindrance, etc.) biological activity in a molecule, e.g., a peptide comprising such amino acid. See, e.g., Adams, et al., Annu. Rev. Physiol., 1993, 55:755-784.
  • a photocaged amino acid can be created by protecting its α-amino group with compounds such as BOC (butyloxycarbonyl), and protecting the α-carboxyl group with compounds such as a t-butyl ester.
  • Such protection can be followed by reaction of the amino acid side chain with a photolabile caging group such as 2-nitrobenzyl, in a reactive form such as 2-nitrobenzylchloroformate, α-carboxyl 2-nitrobenzyl bromide methyl ester, or 2-nitrobenzyl diazoethane.
  • After the photolabile cage group is added, the protecting groups can be removed via standard procedures. See, e.g., USPN 5,998,580.
  • lysine residues can be caged using 2-nitrobenzylchloroformate to derivatize the ε-lysine amino group, thus eliminating the positive charge.
  • lysine can be caged by introducing a negative charge into a peptide (which has such lysine) by use of an α-carboxy 2-nitrobenzyloxycarbonyl caging group.
  • phosphoserine and phosphothreonine can be caged by treatment of the phosphoamino acid or the phosphopeptide with 1-(2-nitrophenyl)diazoethane. See, e.g., Walker et al., Meth Enzymol. 172:288-301, 1989.
  • Other amino acids are also readily amenable to standard caging chemistry, for example serine, threonine, histidine, glutamine, asparagine, aspartic acid and glutamic acid. See, e.g., Wilcox et al., J. Org. Chem. 55: 1585-1589, 1990. Again, it will be appreciated that recitation of particular photoregulated amino acids (and/or those capable of being converted to photoregulated forms) should not necessarily be taken as limiting.
  • photoregulating and/or photocaging groups include, but are not limited to: nitroindolines; N-acyl-7-nitroindolines; phenacyls; hydroxyphenacyl; brominated 7-hydroxycoumarin-4-ylmethyls (e.g., Bhc); benzoin esters; dimethoxybenzoin; meta-phenols; 2-nitrobenzyl; 1-(4,5-dimethoxy-2-nitrophenyl)ethyl (DMNPE); 4,5-dimethoxy-2-nitrobenzyl (DMNB); alpha-carboxy-2-nitrobenzyl (CNB); 1-(2-nitrophenyl)ethyl (NPE); 5-carboxymethoxy-2-nitrobenzyl (CMNB); (5-carboxymethoxy-2-nitrobenzyl)oxy) carbon
  • a photocaging group can optionally comprise a first binding moiety, which can bind to a second binding moiety.
  • a commercially available caged phosphoramidite [1-N-(4,4'-Dimethoxytrityl)-5-(6-biotinamidocaproamidomethyl)-1-(2-nitrophenyl)-ethyl]-2-cyanoethyl-(N,N-diisopropyl)-phosphoramidite (PC Biotin Phosphoramidite, from Glen Research Corp., www.glenres.com) comprises a photolabile group and a biotin (the first binding moiety).
  • a second binding moiety e.g., streptavidin or avidin
  • a caged component comprises two or more caging groups each comprising a first binding moiety, and the second binding moiety can bind two or more first binding moieties simultaneously.
  • the caged component can comprise at least two biotinylated caging groups; binding of streptavidin to multiple biotin moieties on multiple caged component molecules links the caged components into a large network. Cleavage of the photolabile group attaching the biotin to the component results in dissociation of the network.
  • caged polypeptides including e.g. peptide substrates and proteins such as antibodies or transcription factors
  • a caging compound or by incorporating a caged amino acid during synthesis of a polypeptide.
  • See, e.g., USPN 5,998,580 to Fay et al. (December 7, 1999) entitled "Photosensitive caged macromolecules"; Kossel et al. (2001) PNAS 98:14702-14707; Trends Plant Sci (1999) 4:330-334; PNAS (1998) 95:1568-1573; J. Am. Chem. Soc.
  • a photolabile polypeptide linker (e.g., for connecting a protein transduction domain and a sensor, or the like) can, for example, comprise a photolabile amino acid such as that described in USPN 5,998,580.
  • Irradiation with light can, e.g., release a side chain residue of an amino acid that is important for activity of the peptide comprising such amino acid.
  • uncaged amino acids can cleave the peptide backbone of the peptide comprising the amino acid and can thus, e.g., open a cyclic peptide to a linear peptide with different biological properties, etc.
  • Activation of a caged peptide can be done through destruction of a photosensitive caging group on a photoregulated amino acid by any standard method known to those skilled in the art.
  • a photosensitive amino acid can be uncaged or activated by exposure to a suitable conventional light source, such as lasers (e.g., emitting in the UV range or infrared range).
  • Those of skill in the art will be aware of and familiar with a number of additional lasers of appropriate wavelengths and energies as well as appropriate application protocols (e.g., exposure duration, etc.) that are applicable to use with photoregulated amino acids such as those utilized herein.
  • Release of photoregulated caged amino acids allows control of the peptides that comprise such amino acids
  • compositions and methods herein can be utilized in a number of aspects.
  • photocaged amino acids e.g., in peptides
  • the methods, structures, and compositions of the invention are applicable to incorporation/use of photocaged natural amino acids (e.g., ones with photocaging moieties attached/associated with them, thus rendering them "unnatural" amino acids). See, e.g., application PCT/US2005/034002 - Adding Photoregulated Amino Acids to the Genetic Code.
  • the invention provides for polynucleotide sequences encoding, e.g., O-tRNAs and O-RSs, and polypeptide amino acid sequences, e.g., O-RSs, and, e.g., compositions, systems and methods comprising said polynucleotide or polypeptide sequences.
  • examples of said sequences, e.g., O-tRNA and O-RS amino acid and nucleotide sequences are disclosed herein (see the sequence listing).
  • the invention is not limited to those sequences disclosed herein, e.g., in the Examples and sequence listing.
  • the invention also provides many related sequences with the functions described herein, e.g., polynucleo
  • the term "conservative variant," in the context of a translation component, refers to a translation component, e.g., a conservative variant O-tRNA or a conservative variant O-RS, that performs functionally similarly to the base component it resembles, e.g., an O-tRNA or O-RS, while having variations in sequence as compared to a reference O-tRNA or O-RS.
  • For example, an O-RS and a conservative variant of that O-RS will both aminoacylate a cognate O-tRNA with the same unnatural amino acid.
  • the O-RS and the conservative variant O-RS do not have the same amino acid sequences.
  • the conservative variant can have, e.g., one variation, two variations, three variations, four variations, or five or more variations in sequence, as long as the conservative variant is still complementary to, e.g., functions with, the cognate corresponding O-tRNA or O-RS.
  • amino acids outside the active enzymatic site physical structures e.g., that retain the orientation of amino acids in the active site
  • amino acids known in the art for cooperating in stabilization of the physical structures e.g., those of skill know that different amino-acid sequences have different propensities for forming α-helical structure.
  • Methionine, alanine, leucine, uncharged glutamate, and lysine all have especially high helix-forming propensities, whereas proline, glycine and negatively charged aspartate have poor helix-forming propensities.
  • Proline tends to break or kink helices because it cannot donate an amide hydrogen bond (having no amide hydrogen), and because its side chain interferes sterically; its ring structure also restricts its backbone φ dihedral angle to the vicinity of -70°, which is less common in α-helices.
  • proline is often seen as the first residue of a helix, presumably due to its structural rigidity.
  • glycine also tends to disrupt helices because its high conformational flexibility makes it entropically expensive to adopt the relatively constrained α-helical structure. For example, exchanging an alanine for a leucine in an alpha helix segment of an enzyme structure, or vice versa, would be a conservative variation, and one of skill would expect continued enzymatic activity. See the general structure of the Methanosarcina mazei PylRS in Figure 1. Regarding beta sheet structures found in active proteins, large aromatic residues (Tyr, Phe and Trp) and β-branched amino acids (Thr, Val, Ile) are favored to be found in β strands in the middle of β sheets.
  • a conservative variant O-RS comprises one or more conservative amino acid substitutions compared to the O-RS from which it was derived.
  • a conservative variant O-RS comprises one or more conservative amino acid substitutions compared to the O-RS from which it was derived, and furthermore, retains O-RS biological activity; for example, a conservative variant O-RS that retains at least 10% of the biological activity of the parent O-RS molecule from which it was derived, or alternatively, at least 20%, at least 30%, or at least 40%.
  • the conservative variant O-RS retains at least 50% of the biological activity of the parent O-RS molecule from which it was derived.
  • the conservative amino acid substitutions of a conservative variant O-RS can occur in any domain of the O-RS, including the amino acid binding pocket.
  • amino acid residue is substituted for another amino acid residue having similar chemical properties (e.g., aromatic side chains or positively charged side chains), and therefore does not substantially change the functional properties of the polypeptide molecule.
  • silent substitutions, i.e., substitutions in a nucleic acid sequence which do not result in an alteration in an encoded polypeptide, are an implied feature of every nucleic acid sequence that encodes an amino acid sequence.
  • conservative amino acid substitutions where one or a limited number of amino acids in an amino acid sequence are substituted with different amino acids with highly similar properties, are also readily identified as being highly similar to a disclosed construct. Such conservative variations of each disclosed sequence are a feature of the present invention.
  • the invention can include O-tRNAs and O-RS that are
  • derived from refers to a component that is isolated from or made using a specified molecule or organism, or information from the specified molecule or organism.
  • a polypeptide that is derived from a second polypeptide can include an amino acid sequence that is identical or substantially similar to the amino acid sequence of the second polypeptide.
  • the derived species can be obtained by, for example, naturally occurring mutagenesis, artificial directed mutagenesis or artificial random mutagenesis.
  • the mutagenesis used to derive polypeptides can be intentionally directed or intentionally random, or a mixture of each.
  • the mutagenesis of a polypeptide to create a different polypeptide derived from the first can be a random event, e.g., caused by polymerase infidelity, and the identification of the derived polypeptide can be made by appropriate screening methods, e.g., as discussed herein.
  • Mutagenesis of a polypeptide typically entails manipulation of the polynucleotide that encodes the polypeptide.
  • Comparative hybridization can also be used to identify nucleic acids of the invention, including conservative variations of nucleic acids of the invention.
  • target nucleic acids which hybridize to a nucleic acid represented in the sequence listing herein, under high, ultra-high and ultra-ultra high stringency conditions, are an aspect of the invention where the nucleic acids encode mutations corresponding to: a Met or Ile residue at a position corresponding to position 306, an Ala at position 309, an Ala at position 348, a Phe at position 384, or a combination thereof, with amino acid position numbering corresponding to amino acid position numbering of the wild-type pyrrolysyl-tRNA synthetase.
  • examples of such nucleic acids include those with one or a few silent or conservative nucleic acid substitutions as compared to a given nucleic acid sequence of the sequence listing, e.g., which encode: a Met or Ile residue at a position corresponding to position 306, an Ala at position 309, an Ala at position 348, a Phe at position 384, or a combination thereof, wherein amino acid position numbering corresponds to amino acid position numbering of the wild-type pyrrolysyl-tRNA synthetase.
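The residue criteria listed above can be checked mechanically. The following is a minimal Python sketch, assuming the encoded protein sequence has already been aligned so that its 1-based positions match wild-type pyrrolysyl-tRNA synthetase numbering. The function names and the toy poly-glycine sequence are hypothetical and used only for illustration; they are not taken from the sequence listing.

```python
# Residues listed in the text, keyed by 1-based position in wild-type
# PylRS numbering. Function names and the toy sequence are hypothetical.
LISTED_RESIDUES = {306: {"M", "I"}, 309: {"A"}, 348: {"A"}, 384: {"F"}}


def listed_mutations_present(protein_seq):
    """Return {position: bool} telling whether each listed residue is present.

    Assumes protein_seq is a one-letter-code string whose indexing already
    corresponds to wild-type PylRS numbering (an assumption of this sketch).
    """
    result = {}
    for position, allowed in LISTED_RESIDUES.items():
        index = position - 1  # convert 1-based position to 0-based index
        result[position] = index < len(protein_seq) and protein_seq[index] in allowed
    return result


def has_any_listed_mutation(protein_seq):
    """True if at least one listed residue is present ("or a combination thereof")."""
    return any(listed_mutations_present(protein_seq).values())


# Toy example: a 400-residue poly-glycine carrying Met306 and Phe384.
toy = list("G" * 400)
toy[305] = "M"
toy[383] = "F"
toy_seq = "".join(toy)
```

Here `listed_mutations_present(toy_seq)` reports positions 306 and 384 as present and 309 and 348 as absent.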
  • a test nucleic acid is said to specifically hybridize to a probe nucleic acid when it hybridizes at least 50% as well to the probe as to the perfectly matched complementary target, i.e., with a signal to noise ratio at least half as high as hybridization of the probe to the target under conditions in which the perfectly matched probe binds to the perfectly matched complementary target with a signal to noise ratio that is at least about 5x-10x as high as that observed for hybridization to any of the unmatched target nucleic acids.
  • Nucleic acids "hybridize" when they associate, typically in solution. Nucleic acids hybridize due to a variety of well characterized physico-chemical forces, such as hydrogen bonding, solvent exclusion, base stacking and the like.
  • An example of stringent hybridization conditions for hybridization of complementary nucleic acids which have more than 100 complementary residues on a filter in a Southern or northern blot is 50% formamide with 1 mg of heparin at 42°C, with the hybridization being carried out overnight.
  • An example of stringent wash conditions is a 0.2x SSC wash at 65°C for 15 minutes (see Sambrook, supra, for a description of SSC buffer). Often the high stringency wash is preceded by a low stringency wash to remove background probe signal.
  • An example low stringency wash is 2x SSC at 40°C for 15 minutes. In general, a signal to noise ratio of 5x (or higher) than that observed for an unrelated probe in the particular hybridization assay indicates detection of a specific hybridization.
  • Stringent hybridization wash conditions in the context of nucleic acid hybridization experiments such as Southern and northern hybridizations are sequence dependent, and are different under different environmental parameters. An extensive guide to the hybridization of nucleic acids is found in Tijssen (1993), supra, and in Hames and Higgins, 1 and 2. Stringent hybridization and wash conditions can easily be determined empirically for any test nucleic acid. For example, in determining stringent hybridization and wash conditions, the stringency of the hybridization and wash conditions is gradually increased (e.g., by increasing temperature, decreasing salt concentration, increasing detergent concentration and/or increasing the concentration of organic solvents such as formamide in the hybridization or wash), until a selected set of criteria are met.
  • the hybridization and wash conditions are gradually increased until a probe binds to a perfectly matched complementary target with a signal to noise ratio that is at least 5x as high as that observed for hybridization of the probe to an unmatched target.
  • “Very stringent” conditions are selected to be equal to the thermal melting point (Tm) for a particular probe.
  • Tm is the temperature (under defined ionic strength and pH) at which 50% of the test sequence hybridizes to a perfectly matched probe.
  • “highly stringent” hybridization and wash conditions are selected to be about 5°C lower than the Tm for the specific sequence at a defined ionic strength and pH.
  • Ultra high-stringency hybridization and wash conditions are those in which the stringency of hybridization and wash conditions are increased until the signal to noise ratio for binding of the probe to the perfectly matched complementary target nucleic acid is at least 10x as high as that observed for hybridization to any of the unmatched target nucleic acids.
  • a target nucleic acid which hybridizes to a probe under such conditions, with a signal to noise ratio of at least ½ that of the perfectly matched complementary target nucleic acid is said to bind to the probe under ultra-high stringency conditions.
  • even higher levels of stringency can be determined by gradually increasing the hybridization and/or wash conditions of the relevant hybridization assay. For example, those in which the stringency of hybridization and wash conditions are increased until the signal to noise ratio for binding of the probe to the perfectly matched complementary target nucleic acid is at least 10x, 20x, 50x, 100x, or 500x or more as high as that observed for hybridization to any of the unmatched target nucleic acids.
  • a target nucleic acid which hybridizes to a probe under such conditions, with a signal to noise ratio of at least ½ that of the perfectly matched complementary target nucleic acid is said to bind to the probe under ultra-ultra-high stringency conditions.
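The signal-to-noise criteria above reduce to two comparisons. First, the conditions qualify when the perfectly matched target reaches the required fold over unmatched targets. Second, the test nucleic acid binds when its signal-to-noise ratio is at least half that of the perfectly matched target. The Python sketch below illustrates this under the assumption that both ratios have already been measured against the same unmatched background; the function name and parameters are hypothetical.

```python
def hybridizes_under_stringency(test_sn, matched_sn, required_fold=10.0):
    """Apply the two signal-to-noise tests described in the text.

    test_sn       -- S/N of the test nucleic acid with the probe
    matched_sn    -- S/N of the perfectly matched target with the probe
    required_fold -- fold over unmatched targets that defines the
                     stringency level (e.g., 5 for stringent, 10 for
                     ultra-high; higher values for ultra-ultra-high)
    All names here are illustrative assumptions, not terms from the source.
    """
    # Conditions are stringent enough only if the perfect match clears
    # the required fold over the unmatched background.
    conditions_met = matched_sn >= required_fold
    # The test nucleic acid must hybridize at least half as well as the
    # perfectly matched complementary target.
    binds = test_sn >= 0.5 * matched_sn
    return conditions_met and binds
```

For example, with a perfectly matched ratio of 12 and a test ratio of 6, the call with `required_fold=10.0` returns `True`; lowering the test ratio to 5 returns `False`.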
  • a variety of protein methods are known and can be used to isolate, detect, manipulate or otherwise handle a protein produced according to the invention e.g., from recombinant cultures of cells expressing the recombinant unnatural amino acid-containing proteins of the invention.
  • a variety of protein isolation and detection methods are well known in the art, including, e.g., those set forth in R. Scopes, Protein Purification, Springer-Verlag, N.Y. (1982); Deutscher, Methods in Enzymology Vol. 182: Guide to Protein Purification, Academic Press, Inc. N.Y. (1990); Sandana (1997) Bioseparation of Proteins, Academic Press, Inc.; Bollag, et al.
  • Kits are also a feature of the invention.
  • such kits can comprise components for using the composition herein, such as: a container to hold the kit components, instructional materials for practicing any method herein with the kit, or for producing a protein comprising one or more unnatural amino acids, a nucleic acid comprising a polynucleotide sequence encoding an O-tRNA, a nucleic acid comprising a polynucleotide encoding an O-RS, an O-RS, an unnatural amino acid, reagents for the post-translational modification of the unnatural amino acid (e.g., reagents for any one or more of the reactions described herein), a suitable strain of prokaryotic, e.g., bacterial (e.g., E.
  • a target protein comprising, e.g., one or more of an epsilon-substituted lysine, a caged lysine, an O-nitrobenzyl-oxycarbonyl-N-L-lysine (ONBK), an ortho-, meta- or para-substituted phenylalanine or tyrosine, alkynyl aryl amino acids, aliphatic amino acids, alpha hydroxy acid substituted amino acids, beta diketo containing amino acids and/or alkoxyamine containing amino acids.
  • kits can contain a solid phase matrix for scarless purification, reagents for the covalent coupling of a polypeptide comprising an unnatural amino acid to the matrix, reagents for the oxidation or reduction of a redox unnatural amino acid and/or light sources for photolysis of caged amino acids in a polypeptide, e.g., to produce a natural amino acid.
  • compositions of the invention can be used to incorporate unnatural amino acids into any polypeptide of interest.
  • Polypeptides modified to include unnatural amino acids incorporated by the present methods are considered an aspect of the invention.
  • modified polypeptides can find use, e.g., in research and medicine.
  • the modified proteins of the invention comprising unnatural amino acids are, e.g., at least 60%, at least 70%, at least 75%, at least 80%, at least 90%, at least 95%, or at least 99% or more identical to any available protein (e.g., a therapeutic protein, a diagnostic protein, an industrial enzyme, or portion thereof, and the like), and they comprise one or more unnatural amino acid.
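The percent-identity thresholds above presuppose a way of scoring two sequences. The source does not say how identity is computed, so the Python sketch below makes its own assumptions. It takes two sequences that have already been aligned to equal length (gaps written as "-") and counts identical, non-gap columns over the alignment length. The function name and the example sequences, in which "X" stands in for an unnatural amino acid, are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over a pre-computed pairwise alignment.

    Assumption of this sketch: identity = identical non-gap columns
    divided by the alignment length. Other conventions exist (e.g.,
    dividing by the shorter sequence length).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical 10-residue example with one position replaced by "X".
reference = "MKTAYIAKQR"
modified = "MKTAYIAXQR"
identity = percent_identity(reference, modified)
```

With one substituted position out of ten, `identity` is 90.0, which would meet the "at least 90%" level but not the "at least 95%" level listed above.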
  • therapeutic, diagnostic, and other proteins that can be modified to comprise one or more photoregulated amino acids (e.g., o-nitrobenzyl cysteine and azobenzyl-Phe), O-Me-L-tyrosine, or α-aminocaprylic acid include, but are not limited to, those found in International Application Number PCT/US2004/011786, filed April 16, 2004, entitled "Expanding the Eukaryotic Genetic Code;" and WO 2002/085923, entitled "IN VIVO INCORPORATION OF UNNATURAL AMINO ACIDS."
  • therapeutic, diagnostic, and other proteins that can be modified to comprise one or more homoglutamines include, but are not limited to, e.g., Alpha-1 antitrypsin, Angiostatin, Antihemolytic factor, antibodies, Apolipoprotein, Apoprotein, Atrial natriuretic factor, Atrial natriuretic polypeptide, Atrial peptides, C-X-C chemokines (e
  • Somatomedin, Somatostatin, Somatotropin, Streptokinase, Superantigens, i.e., Staphylococcal enterotoxins (SEA, SEB, SEC1, SEC2, SEC3, SED, SEE), Superoxide dismutase (SOD), Toxic shock syndrome toxin (TSST-1), Thymosin alpha 1, Tissue plasminogen activator, Tumor necrosis factor beta (TNF beta), Tumor necrosis factor receptor (TNFR), Tumor necrosis factor-alpha (TNF alpha), Vascular Endothelial Growth Factor (VEGF), Urokinase and many others.
  • one type of biomolecule can "encode" another.
  • the term “encode” refers to any process whereby the information in a polymeric macromolecule or sequence string is used to direct the production of a second molecule or sequence string that is different from the first molecule or sequence string.
  • the term can be used broadly, and can have a variety of applications.
  • the term “encode” describes the process of semi-conservative DNA replication, where one strand of a double-stranded DNA molecule is used as a template to encode a newly synthesized complementary sister strand by a DNA-dependent DNA polymerase.
  • the term "encode" refers to any process whereby the information in one molecule is used to direct the production of a second molecule that has a different chemical nature from the first molecule.
  • a DNA molecule can encode an RNA molecule, e.g., by the process of transcription catalyzed by a DNA-dependent RNA polymerase enzyme.
  • an RNA molecule can encode a polypeptide, as in the process of translation.
  • the term “encode” also extends to the triplet codon that encodes an amino acid or selector codons that encode a particular natural or unnatural amino acid.
  • an RNA molecule can encode a DNA molecule, e.g., by the process of reverse transcription incorporating an RNA-dependent DNA polymerase.
  • a DNA molecule can encode a polypeptide, where it is understood that "encode" as used in that case incorporates both the processes of transcription and translation.
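The chain of "encoding" steps described above (DNA to RNA by transcription, RNA to polypeptide by translation) can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions: the input is the coding (sense) strand, so transcription is modeled as replacing T with U, and the codon table contains only the handful of standard codons needed for the example. The function names and the example sequence are hypothetical.

```python
# Deliberately tiny subset of the standard genetic code; "*" marks a stop.
CODON_TABLE = {
    "AUG": "M", "AAA": "K", "GGC": "G", "UUU": "F",
    "UAA": "*", "UAG": "*", "UGA": "*",
}


def transcribe(coding_dna):
    """DNA 'encodes' RNA: model transcription of the coding strand as T -> U."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """RNA 'encodes' a polypeptide: read triplets until a stop codon."""
    residues = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return "".join(residues)


def express(coding_dna):
    """DNA 'encodes' a polypeptide via transcription followed by translation."""
    return translate(transcribe(coding_dna))


peptide = express("ATGAAAGGCTTTTAA")
```

Here the DNA `ATGAAAGGCTTTTAA` is transcribed to `AUGAAAGGCUUUUAA` and translated to the peptide `MKGF`.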
  • orthogonal refers to functional molecules, e.g., an orthogonal tRNA (O-tRNA) and/or an orthogonal aminoacyl-tRNA synthetase (O-RS), that function poorly or not at all with endogenous components of a cell, when compared to a corresponding molecule (tRNA or RS) that is endogenous to the cell or translation system.
  • Orthogonal components are usefully provided as cognate components that function well with each other, e.g., an O-RS can be provided that efficiently aminoacylates a cognate O-tRNA in a cell, even though the O-tRNA functions poorly or not at all as a substrate for the endogenous RS of the cell, and the O-RS functions poorly or not at all with endogenous tRNAs of the cell.
  • Various comparative efficiencies of the orthogonal and endogenous components can be evaluated.
  • an O-tRNA will typically display poor or nonexistent activity as a substrate, under typical physiological conditions, with endogenous RSs, e.g., the O-tRNA is less than 10% as efficient as a substrate as endogenous tRNAs for any endogenous RS, and will typically be less than 5%, and usually less than 1% as efficient a substrate.
  • the tRNA can be highly efficient as a substrate for the O-RS, e.g., at least 50%, and often 75%, 95%, or even 100% or more as efficient as an aminoacylation substrate as any endogenous tRNA is for its endogenous RS.
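The two efficiency thresholds above can be combined into a single check. The Python sketch below assumes both efficiencies are expressed as fractions of the efficiency of an endogenous tRNA with its own endogenous RS (1.0 means "as efficient"). The function name, the parameter names, and the choice to use the loosest stated thresholds as defaults are assumptions of this sketch.

```python
def is_orthogonal_trna(eff_with_endogenous_rs, eff_with_o_rs,
                       max_endogenous=0.10, min_cognate=0.50):
    """Check a tRNA against the thresholds stated in the text.

    eff_with_endogenous_rs -- relative efficiency as a substrate for the
                              most active endogenous RS ("less than 10%")
    eff_with_o_rs          -- relative efficiency as a substrate for the
                              cognate O-RS ("at least 50%")
    Both values are fractions relative to an endogenous tRNA/RS pair;
    this normalization is an assumption of the sketch.
    """
    poor_endogenous_substrate = eff_with_endogenous_rs < max_endogenous
    good_cognate_substrate = eff_with_o_rs >= min_cognate
    return poor_endogenous_substrate and good_cognate_substrate
```

Tighter criteria from the text (e.g., less than 1% with endogenous RSs, 95% with the O-RS) can be applied by passing `max_endogenous=0.01` and `min_cognate=0.95`.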
  • Orthogonal aminoacyl-tRNA synthetase: As used herein, an orthogonal aminoacyl-tRNA synthetase (O-RS) is an enzyme that preferentially aminoacylates an O-tRNA with an amino acid in a translation system of interest.
  • An O-RS "selectively recognizes" an unnatural amino acid when it charges a cognate tRNA with the amino acid more efficiently than with any natural amino acid.
  • the present invention includes O-RSs, e.g., derived from Methanosarcina species, that function to orthogonally charge an unnatural amino acid, optionally in a eubacterial translation system (e.g., an enterobacteria cell) or in a eukaryotic translation system (e.g., a mammalian cell), as desired. That is, e.g., the RS can be shuttled between systems (e.g., encoded as a nucleic acid sequence in a plasmid) and can function orthogonally in each system.
  • Orthogonal tRNA: As used herein, an orthogonal tRNA (O-tRNA) is a tRNA that is orthogonal to a translation system of interest.
  • the O-tRNA can exist charged with, e.g., an unnatural amino acid, or can exist in an uncharged state. It is also to be understood that an O-tRNA is optionally charged (aminoacylated) by a cognate orthogonal aminoacyl-tRNA synthetase with an unnatural amino acid. It will be appreciated that the O-tRNA of the invention can be advantageously used to insert the unnatural amino acids into a growing polypeptide, during translation, in response to a selector codon.
  • O-tRNAs of the invention can function orthogonally in more than one translation system, e.g., in both a eubacterial system (e.g., E. coli) and in a eukaryotic system (e.g., in a mammalian, insect or plant cell line).
  • an O-RS "preferentially aminoacylates" a cognate O-tRNA when the O-RS charges the O-tRNA with an amino acid (e.g., an unnatural amino acid) more efficiently than it charges any endogenous tRNA in an expression system (e.g., a system into which it has been shuttled). That is, when the O-tRNA and any given endogenous tRNA are present in a translation system in approximately equal molar ratios, the O-RS will charge the O-tRNA more frequently than it will charge the endogenous tRNA.
  • the relative ratio of O-tRNA charged by the O-RS to endogenous tRNA charged by the O-RS is high, preferably resulting in the O-RS charging the O-tRNA exclusively, or nearly exclusively, when the O-tRNA and endogenous tRNA are present in equal molar concentrations in the translation system.
  • the relative ratio between O-tRNA and endogenous tRNA that is charged by the O-RS, when the O-tRNA and endogenous tRNA are present at equal molar concentrations, is greater than 1:1, preferably at least about 2:1, more preferably 5:1, still more preferably 10:1, yet more preferably 20:1, still more preferably 50:1, yet more preferably 75:1, still more preferably 95:1, 98:1, 99:1, 100:1, 500:1, 1,000:1, 5,000:1 or higher.
  • charging of an endogenous tRNA by an O-RS is not detectable, e.g., by suppression assays.
  • the O-RS "preferentially aminoacylates an O-tRNA with a lysine analog" when (a) the O-RS preferentially aminoacylates the O-tRNA compared to an endogenous tRNA, and (b) where that aminoacylation is specific for the lysine analog (e.g., epsilon-substituted) amino acid, as compared to aminoacylation of the O-tRNA by the O-RS with any natural amino acid.
  • the O-RS will load the O-tRNA with ONBK more frequently than with any natural amino acid.
  • the relative ratio of O-tRNA charged with ONBK to O-tRNA charged with the natural amino acid is high. More preferably, O-RS charges the O-tRNA exclusively, or nearly exclusively, with ONBK or other relevant unnatural amino acid.
  • the relative ratio between charging of the O-tRNA with the unnatural amino acid and charging of the O-tRNA with a natural amino acid, when both the natural and unnatural amino acid are present in the translation system in equal molar concentrations, is greater than 1:1, preferably at least about 2:1, more preferably 5:1, still more preferably 10:1, yet more preferably 20:1, still more preferably 50:1, yet more preferably 75:1, still more preferably 95:1, 98:1, 99:1, 100:1, 500:1, 1,000:1, 5,000:1 or higher.
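Both preference definitions above (O-tRNA versus endogenous tRNA, and unnatural versus natural amino acid) are expressed as a ratio compared against the same ladder of tiers. The Python sketch below computes such a ratio from two counts of charging events and reports the highest tier reached. The function names and the idea of using raw event counts are assumptions of this sketch; the source does not specify how charging is quantified.

```python
import math

# Ratio tiers recited in the text (2:1 up to 5,000:1).
TIERS = (2, 5, 10, 20, 50, 75, 95, 98, 99, 100, 500, 1000, 5000)


def charging_ratio(preferred_events, competing_events):
    """Ratio of preferred to competing charging events at equal molar input."""
    if preferred_events < 0 or competing_events < 0:
        raise ValueError("event counts must be non-negative")
    if preferred_events == 0 and competing_events == 0:
        raise ValueError("no charging events observed")
    if competing_events == 0:
        return math.inf  # exclusive charging of the preferred substrate
    return preferred_events / competing_events


def is_preferential(ratio):
    """The minimal criterion in the text: a ratio greater than 1:1."""
    return ratio > 1


def highest_tier_met(ratio):
    """Largest recited tier that the ratio meets, or None if below 2:1."""
    met = [tier for tier in TIERS if ratio >= tier]
    return max(met) if met else None
```

For instance, 980 preferred events against 20 competing events gives a ratio of 49.0, which is preferential and meets the 20:1 tier but not the 50:1 tier.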
  • shuttle refers to transfer of a nucleic acid encoding a translation system component (e.g., an RS and/or tRNA) from one cell to another cell.
  • the source and target cells are not from the same species, and typically include a eubacterial cell and a eukaryotic cell.
  • Selector codon refers to codons recognized by the O-tRNA in the translation process and not recognized by an endogenous tRNA.
  • the O-tRNA anticodon loop recognizes the selector codon on the mRNA and incorporates the amino acid with which it is charged, e.g., an unnatural amino acid, at this site in the polypeptide.
  • Selector codons can include, e.g., nonsense codons, such as, stop codons, e.g., amber, ochre, and opal codons; four or more base codons; rare codons; noncoding codons; and codons derived from natural or unnatural base pairs and/or the like.
  • Suppression activity refers, in general, to the ability of a tRNA, e.g., a suppressor tRNA, to allow translational read-through of a codon, e.g., a selector codon that is an amber codon or a 4-or-more base codon, that would otherwise result in the termination of translation or mistranslation, e.g., frame-shifting.
  • Suppression activity of a suppressor tRNA can be expressed as a percentage of translational read-through activity observed compared to a second suppressor tRNA, or as compared to a control system, e.g., a control system lacking an O-RS.
  • Suppressor tRNA is a tRNA that alters the reading of a messenger RNA (mRNA) in a given translation system, typically by allowing the incorporation of an amino acid in response to a stop codon or 4 or more base codon (i.e., "read-through") during the translation of a polypeptide.
  • a selector codon of the invention is a suppressor codon, e.g., a stop codon, e.g., an amber, ochre or opal codon, a four base codon, a rare codon, etc.
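The definitions of selector codon, suppressor tRNA and suppression activity above can be illustrated with a toy translation model in Python. In this sketch the amber codon (UAG) is the selector codon. If a charged suppressor O-tRNA is supplied, UAG is read through and the amino acid carried by the O-tRNA is inserted; otherwise translation terminates there. The codon table is a small subset of the standard code, and the function names, residue labels and example mRNA are hypothetical.

```python
# Minimal subset of the standard genetic code; "STOP" marks termination.
CODON_TABLE = {
    "AUG": "Met", "AAA": "Lys", "GGC": "Gly",
    "UAA": "STOP", "UGA": "STOP", "UAG": "STOP",
}
AMBER = "UAG"  # selector codon used in this sketch


def translate_with_suppression(mrna, suppressor_cargo=None):
    """Translate mRNA, reading through UAG when a charged O-tRNA is present.

    suppressor_cargo -- label of the amino acid carried by the suppressor
                        O-tRNA (e.g., "ONBK"), or None if no charged
                        suppressor tRNA is available.
    """
    residues = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == AMBER and suppressor_cargo is not None:
            residues.append(suppressor_cargo)  # read-through of the selector codon
            continue
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "STOP":
            break
        residues.append(amino_acid)
    return residues


def suppression_activity_percent(readthrough_signal, reference_signal):
    """Read-through expressed as a percentage of a reference (control) signal."""
    if reference_signal <= 0:
        raise ValueError("reference signal must be positive")
    return 100.0 * readthrough_signal / reference_signal


mrna = "AUGAAAUAGGGCUAA"
truncated = translate_with_suppression(mrna)
full_length = translate_with_suppression(mrna, suppressor_cargo="ONBK")
```

Without a suppressor, `truncated` is `["Met", "Lys"]`; with an ONBK-charged suppressor, `full_length` is `["Met", "Lys", "ONBK", "Gly"]`, with translation still ending at the UAA stop codon.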
  • a therapeutic protein is a protein that can be administered to a patient to treat a disease or disorder.
  • Translation system refers to the components that incorporate an amino acid into a growing polypeptide chain (protein).
  • Components of a translation system can include, e.g., ribosomes, tRNAs, synthetases, mRNA and the like.
  • the O-tRNA and/or the O-RSs of the invention can be added to or be part of an in vitro or in vivo translation system, e.g., in a non-eukaryotic cell, e.g., a bacterium, such as E. coli, or in a eukaryotic cell, e.g., a yeast cell, a mammalian cell, a plant cell, an algae cell, a fungus cell, an insect cell, and/or the like.
  • Unnatural amino acid refers to any amino acid, modified amino acid, and/or amino acid analogue, that is not one of the 20 common naturally occurring amino acids. Further, herein neither selenocysteine nor pyrrolysine is considered an unnatural amino acid. For example, the unnatural amino acid O-nitrobenzyl-oxycarbonyl-N-L-lysine (ONBK - see Figure 3A) finds use with the invention.
  • Cognate refers to components that function together, e.g., an orthogonal tRNA and an orthogonal aminoacyl-tRNA synthetase.
  • the components can also be referred to as being complementary.
  • derived from refers to a component that is isolated from or made using a specified molecule or organism, or information from the specified molecule or organism.
  • a first nucleic acid or peptide sequence is derived from a second sequence, e.g., when the second sequence is changed by addition, deletion or substitution at sequence positions to create the first sequence.
  • Eukaryote refers to organisms belonging to the phylogenetic domain Eucarya such as animals (e.g., mammals, insects, reptiles, birds, etc.), ciliates, plants (e.g., monocots, dicots, algae, etc.), fungi, yeasts, flagellates, microsporidia, protists, etc.
  • Non-eukaryote refers to non-eukaryotic organisms.
  • a non-eukaryotic organism can belong to the Eubacteria (e.g., Escherichia coli, Thermus thermophilus, Bacillus stearothermophilus, etc.) phylogenetic domain, or the Archaea (e.g., Methanococcus jannaschii (Mj), Methanosarcina mazei (Mm), Methanobacterium thermoautotrophicum (Mt), Methanococcus maripaludis, Methanopyrus kandleri, Halobacterium such as Haloferax volcanii and Halobacterium species NRC-1, Archaeoglobus fulgidus (Af), Pyrococcus furiosus (Pf), Pyrococcus horikoshii (Ph), Pyrobaculum aerophilum, Pyrococc
  • An E. coli-mammalian shuttle system has been developed to genetically encode unnatural amino acids in mammalian cells using aminoacyl-tRNA synthetases (RSs) evolved in E. coli.
  • a pyrrolysyl-tRNA synthetase (PylRS) mutant was evolved in E. coli that selectively aminoacylates a cognate nonsense suppressor tRNA with a photocaged lysine derivative. Transfer of this orthogonal tRNA-RS pair into mammalian cells made possible the selective incorporation of this unnatural amino acid into proteins.
  • tRNAPyl(CUA), which naturally incorporates pyrrolysine (Pyl) (Figure 3a) in response to the amber nonsense codon in the archaeon Methanosarcina mazei.
  • tRNAPyl(CUA) is not recognized by endogenous RSs in E. coli and mammalian cells as a result of its unique structural features. See, C. Polycarpo, et al., Proc. Natl. Acad. Sci. USA 2004, 101, 12450; and, K. Nozawa, et al., Nature 2008, advance online publication.
  • Chin and coworkers used a mutant Methanosarcina barkeri PylRS (MbPylRS), a close homologue of MmPylRS, to incorporate acetyl lysine in E. coli, demonstrating that the specificity of the PylRS can be altered by directed evolution methods.
  • E. coli co-transformed with either NBK-1 or NBK-2 and CAT112TAG exhibited a significant difference in growth on Cm in the presence and absence of 1 mM ONBK (Figure 3a), suggesting that these evolved MmPylRS-tRNAPyl(CUA) pairs are selective for ONBK relative to endogenous host amino acids.
  • NBK-1 exhibited enhanced amber suppression relative to NBK-2 and thus the NBK-1-tRNAPyl(CUA) pair was used for further studies.
  • a vector pSup-NBK-1 was constructed to encode the NBK-1-tRNAPyl(CUA) pair in which a single copy of the tRNA
  • a plasmid containing the wild type MmPylRS-tRNAPyl(CUA) was employed for expression of GFP149TAG in the presence of 1 mM Cyc (Figure 4b) and the protein yield was less than 1 mg L⁻¹.
  • Electrospray ionization mass spectrometry (ESI-MS) of purified GFP protein with ONBK at position 149 revealed two peaks (27,915 Da and 27,782 Da) corresponding to GFP protein containing the intact ONBK residue with and without the N-terminal Met (Figure 5c).
  • This result confirms the high specificity of the NBK-1 mutant aminoacyl-tRNA synthetase for ONBK relative to endogenous amino acids, and for tRNAPyl(CUA) relative to endogenous tRNAs.
  • The NBK-1-tRNAPyl(CUA) pair from E. coli was shuttled into mammalian cells.
  • a vector pCMV-NBK-1 was constructed containing the NBK-1 gene under control of a non-regulated CMV promoter, and a single tRNAPyl(CUA) gene under control of a human U6 promoter. Amber suppression was monitored using an enhanced GFP (EGFP) with an amber mutation at the permissive residue 37 (EGFP37TAG).
  • the plasmid pCMV-NBK-1 was co-transfected with a plasmid encoding EGFP37TAG into HEK293 cells using an optimized transfection condition.
  • Lys(ONB)-OH (ONBK) 3 was dissolved in 1 ml 1,4-dioxane. 10 ml 4 N HCl in dioxane was added, and after 2 h the dioxane and HCl were removed in vacuo. The straw-colored precipitate was triturated 3 times with 10 ml ethyl ether to yield 0.74 g (97%).
  • His-tagged proteins produced from E. coli and mammalian cell cultures were purified with Ni-NTA columns (Qiagen) following the instructions provided.
  • cell lysate was dialyzed against and equilibrated with PBS buffer before loading onto the Ni-NTA column.
  • Columns were washed with 10 bed volumes of wash buffer (50 mM NaH2PO4, pH 8, 300 mM NaCl, and 25 mM imidazole). Proteins were eluted with 50 mM NaH2PO4, pH 8, containing 250 mM imidazole.
  • the MmPylRS active site library was constructed by overlap extension polymerase chain reaction (PCR) using synthetic degenerate oligonucleotide primers to introduce mutations.
  • the Methanosarcina mazei PylRS gene was codon optimized for E. coli and synthesized by DNA2.0. This gene served as the template to perform standard PCR reactions.
  • MmPylRS_N-term_F (5'-GTG TAC ACA TAT GGA TAA AAA GCC TCT GA-3') and MmPylRS_L305Y306L309/NNK_R (5'-GGC AGG GCA CGG TCC AGT TTA CGM NNA TAG TTM NNM NNG TTC GG-3');
  • MmPylRS_L309_F (5'-AAA CTG GAC CGT GCC CTG CC-3') and MmPylRS_C348/NNK_R (5'-TTT CAC GCG TGC AAC CGC TAC CCA TCT GMN NGA AGT TC-3');
  • MmPylRS_C348_F (5'-TAG CGG TTG CAC GCG TGA AA-3') and MmPylRS_Y384/NNK_R (5'-TGC ATA ACA TCC A
  • Overlap extension PCR was employed to assemble these PCR fragments and multiple rounds of PCR were conducted with the combination of primers listed above.
  • the intact MmPylRS gene was generated by this strategy and the desired mutation sites were substituted by NNK, so that all 20 common amino acids were encoded.
  • tRNAPyl(CUA) was inserted into pRep and pNEG vectors to construct pRep-tRNAPyl(CUA) for positive selection and pNEG-tRNAPyl(CUA) for negative selection (see, L. Wang, J. Xie, P. G. Schultz, Annu. Rev. Biophys. Biomol. Struct. 2006, 35, 225).
  • the pBK-PylRS plasmids encoding the MmPylRS active site library were transformed into E. coli DH10B competent cells harboring pRep-tRNAPyl(CUA) to yield a library greater than 1 × 10^9 cfu, ensuring complete coverage.
  • the cells were allowed to recover for 2 h at 37°C before being plated on LB agar plates containing 50 μg ml⁻¹ Kan, 100 μg ml⁻¹ ampicillin (Amp) and 0.2% arabinose.
  • the plates were incubated for 12 h at 37°C at which point the cells were pooled and the pBK-PylRS plasmids were extracted.
  • Five alternating rounds of positive and negative selection finally yielded MmPylRS variants that can survive the selection by acylating the cognate tRNAPyl(CUA)
  • the newly extracted pBK-MmPylRS plasmids were transformed into DH10B competent cells containing pRep-tRNAPyl(CUA) and their ability to survive upon Cm challenge was tested with increasing concentrations of Cm in the presence and absence of 1 mM ONBK.
  • the mammalian expression vector pCMV-MmPylRS was constructed based on the pSWAN-pMpaRS plasmid developed previously. See, W. S. Liu, A. Brock, S. Chen, S. B. Chen, P. G. Schultz, Nat. Methods 2007, 4, 239. PIPE cloning (see, H. E. Klock, E. J. Koesema, M. W. Knuth, S. A. Lesley, Proteins: Struct., Funct., Bioinf. 2008, 71, 982) was used for inserting the desired genes into the vector.
  • The tRNAPyl(CUA) gene was inserted into pCMV-MmPylRS after a human U6 promoter, and the MmPylRS gene was inserted after a CMV promoter.
  • Both CHO cells and HEK293 cells were used for transfection and protein expression. CHO cells were grown in a medium containing F-12, 10% FBS, 1% Pen-Strep, and 2 mM L-glutamine at 37°C in a humidified atmosphere of 5% CO2.
  • HEK293F cells were grown in a medium containing Gibco D-MEM medium, 10% FBS, 1% Pen-Strep, and 2 mM L-glutamine at 37°C in a humidified atmosphere of 5% CO2.
  • media were exchanged to either fresh F12 media or F12 media containing 1 mM unnatural amino acid, and the cells were then transfected with pCMV-MmPylRS and pWAN-GFP37TAG using Fugene 6 (Roche; 8 μl Fugene 6 + 0.8 μg of pCMV-MmPylRS + 1.2 μg of pWAN-GFP37TAG for 2 ml cell culture in Costar 6-well cell-culture clusters; 54 μl Fugene 6 + 3 μg of pCMV-MmPylRS + 9 μg of pWAN-GFP37TAG for 12 ml cell culture in 75 cm2 tissue culture flasks).
  • Cells were grown in a medium
  • RNA samples isolated from E. coli cells were separated by acid-urea gel electrophoresis and electroblotted onto a Hybond N+ membrane in 0.5× TBE running buffer at 30 V constant for 1 h using the Xcell II Blot Module (Invitrogen).
  • the Chemiluminescent Nucleic Acid Detection Module (Pierce) was used with a 72-base oligonucleotide complementary to tRNAPyl(CUA) as the probe.
  • For western blot analysis, cells were detached and lysed in RIPA buffer (Upstate) with protease inhibitor cocktail (Roche).
  • the supernatant of cell lysate was fractionated by SDS-PAGE and transferred to 0.45 μm nitrocellulose membrane (Invitrogen).
  • the proteins on the membrane were probed with anti-His-HRP followed by detection of the luminescence with the ECL western blotting substrate (Pierce).
  • Tris buffer solution: 40 mM Tris, pH 8.0, 100 mM NaCl and 1 mM DTT. Protein samples with a final concentration of 100 μM were irradiated with a high-pressure mercury lamp (500 W, Spectra Physics) equipped with a 310 nm long-pass optical filter.
  • SEQ ID NO 1 MmPyIRS WT nucleic acid sequence: atggataaaaagcctctgaacactctgatttctgcgaccggtctgtggatgtcccgcaccggcaccatccacaaaatcaacaccat gaagttagccgttccaaaatctacattgaaatggcttgcggcgatcacctggttgtcaacaactcccgttcttctcgtaccgctcgcgc actgcgccaccacaaatatcgcaaaacctgcaaacgttgccgtgttagcgatgaagatctgaacaaattcctgaccaaagctaacga ggatcagacctccgtaaaaggtagtaaagctcctgaccaaagctaacgaggatcag
  • SEQ ID NO 2 MmPyIRS WT polypeptide sequence:
  • SEQ ID NO 3 NBK-I nucleic acid sequence: atggataaaaaagcctctgaacactctgatttctgcgaccggtctgtggatgtcccgcaccggcaccatccacaaaatcaacaccat gaagttagccgttccaaaatctacattgaaatggcttgcggcgatcacctggttgtcaacaactcccgttcttctcgtaccgctcgcgc actgcgccaccacaaatatcgcaaaacctgcaaacgttgccgtgttagcgatgaagatctgaacaaattcctgaccaaagctaacga ggatcagacctccgtaaaagtgtgtgtgtgaccaaagctaacga ggatcagacctc
  • SEQ ID NO 4 NBK-I polypeptide sequence:
  • SEQ ID NO 5 NBK-2 nucleic acid sequence: atggataaaaaagcctctgaacactctgatttctgcgaccggtctgtggatgtcccgcaccggcaccatccacaaaatcaacaccat gaagttagccgttccaaaatctacattgaaatggcttgcggcgatcacctggttgtcaacaactcccgttcttctcgtaccgctcgcgc actgcgccaccacaaatatcgcaaaacctgcaaacgttgccgtgttagcgatgaagatctgaacaaattcctgaccaaagctaacga ggatcagacctccgtaaaaggtagtaaagctcctgaccaaagctaacgaaggatcagacct
  • SEQ ID NO 6 NBK-2 polypeptide sequence:
  • SEQ ID NO: 7 Mmpyl-tRNA nucleic acid sequence:

Abstract

This invention provides translation system components functional in both eubacterial and eukaryotic environments. The translation system components, such as aminoacyl-tRNA synthetases and tRNAs derived from Methanosarcina species, are capable of charging unnatural amino acids, and can be shuttled from enterobacteria to mammalian cells.

Description

A FACILE SYSTEM FOR ENCODING UNNATURAL AMINO ACIDS IN
MAMMALIAN CELLS
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority to and benefit of a prior U.S. Provisional Application number 61/211,811, A Facile System for Encoding Unnatural Amino Acids in Mammalian Cells, by Peng R. Chen, et al., filed April 3, 2009. The full disclosure of the prior application is incorporated herein by reference.
STATEMENT AS TO RIGHTS TO INVENTIONS MADE UNDER FEDERALLY SPONSORED RESEARCH AND DEVELOPMENT
[0002] This invention was made with government support under grant R01 GM62159 awarded by the National Institutes of Health; and under grant DE-FG03-00ER46051 awarded by the Department of Energy. The government has certain rights in the invention.
FIELD OF THE INVENTION
[0003] The inventions are in the field of translation biochemistry. In particular, the inventions are directed to aminoacyl-tRNA synthetase/tRNA orthogonal pairs that function to charge unnatural amino acids in both eubacterial and eukaryotic cells.
BACKGROUND OF THE INVENTION
[0004] Additional amino acids, beyond the canonical twenty, can be added to the genetic codes of both prokaryotic and eukaryotic organisms. This can be accomplished by means of an orthogonal tRNA (O-tRNA) and aminoacyl-tRNA synthetase (RS) pair that incorporates the unnatural amino acid in response to a nonsense or four-base codon in the gene of interest. Directed evolution of the specificity of the aminoacyl-tRNA synthetase in either bacteria or yeast has been used to genetically encode approximately 50 unnatural amino acids with novel physical, chemical or biological properties in these organisms. One can also use an RS evolved in S. cerevisiae in conjunction with an amber suppressor tRNA from B. stearothermophilus (which is expressed at high levels) to incorporate unnatural amino acids in mammalian cells. See, W. S. Liu, et al., Nat. Methods 2007, 4, 239. However, it has not been previously possible to easily export the large number of aminoacyl-tRNA synthetases evolved in E. coli to mammalian cells because the M. jannaschii-derived aminoacyl-tRNA synthetases typically used in E. coli are not orthogonal in mammalian cells.
[0005] In view of the above, a need exists for methods and systems for efficient selection of orthogonal translation system components for incorporation of unnatural amino acids into eukaryotic processed proteins. Benefits could also be realized for certain specific applications if such proteins included photoactivation capabilities. The present invention provides these and other features that will be apparent upon review of the following.
SUMMARY OF THE INVENTION
[0006] The present inventions include methods and compositions for incorporation of unnatural amino acids by translation optionally in both eubacteria and in eukaryotes. The invention includes translation system components that can function orthogonally in, and can be shuttled between, eubacteria and eukaryotes. Methods include, e.g., mutating an aminoacyl tRNA synthetase (RS) from, e.g., Methanosarcinae, Desulfitobacterium or other Archaea, at identified positions, selecting mutants with structures functioning to accommodate an unnatural amino acid of interest as substrate, shuttling the RSs to a eukaryotic translation system where they function orthogonally with a cognate tRNA, and translating a nucleic acid sequence to provide a polypeptide incorporating the unnatural amino acid.
[0007] Compositions include translation system components, such as an aminoacyl tRNA synthetase (RS) and a cognate tRNA, wherein the synthetase is orthogonal in an enterobacterium and is also orthogonal in a eukaryotic cell. The cognate tRNA recognizes a selector codon and the synthetase is capable of specifically aminoacylating the tRNA with an unnatural amino acid when both the synthetase and the tRNA are expressed in either the enterobacterium or the eukaryotic cell. In certain embodiments, the synthetase is derived from an Archaea or bacteria synthetase and the cognate tRNA is derived from an Archaea or bacteria tRNA. For example, the synthetase can be derived from a Methanosarcinae RS, a Methanosarcina mazei pyrrolysyl-tRNA synthetase (MmPylRS) sequence, a Methanosarcina barkeri pyrrolysyl-tRNA synthetase (MbPylRS) sequence, a Desulfitobacterium hafniense pyrrolysyl-tRNA synthetase (DhPylRS; see, Biochem Biophys Res Commun. 2008 Sep 26; 374(3): 470-4. Epub 2008 Jul 24), and/or the like. The cognate tRNA can be, e.g., a pyrrolysyl-tRNA with an anticodon loop that recognizes a selector codon. The selector codon can be an amber codon or, e.g., another appropriate stop codon or a codon of 4 or more bases.
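As a small illustration of how an anticodon pairs with a selector codon, the following Python sketch computes an anticodon as the reverse complement of an mRNA codon. The function name is hypothetical, and the sketch assumes strict Watson-Crick pairing (no wobble); it is an illustration only, not part of the disclosed compositions.

```python
# Watson-Crick complements for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def anticodon_for(codon: str) -> str:
    """Return the anticodon (written 5'->3') pairing with an mRNA codon (5'->3')."""
    # Pairing is antiparallel, so the codon is reversed before complementing.
    return "".join(COMPLEMENT[base] for base in reversed(codon.upper()))

amber_anticodon = anticodon_for("UAG")  # UAG is the amber stop codon
print(amber_anticodon)
```

For the amber codon UAG this yields CUA, the anticodon carried by an amber suppressor tRNA.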
[0008] In exemplary embodiments, the RS is derived by appropriate functional mutations at amino acid positions corresponding to positions 305, 306, 309, 348, 384 or 419 of the MmPylRS. For example, the synthetase sequence can include an isoleucine or methionine at a position corresponding to position 306 of the MmPylRS sequence, an alanine at a position corresponding to position 309, an alanine at a position corresponding to position 348, or a phenylalanine at a position corresponding to position 384 to provide an RS charging with a caged lysine or similar epsilon-substituted lysine analog.
[0009] Exemplary unnatural amino acids that can be incorporated, e.g., in bacteria or eukaryotes using methods and compositions of the invention include, e.g.: an epsilon-substituted lysine, a photocaged lysine, a photocaged lysine analog, an ortho acyl-substituted phenylalanine, a meta acyl-substituted phenylalanine, a para acyl-substituted phenylalanine, an ortho azido-substituted phenylalanine, a meta azido-substituted phenylalanine, a para azido-substituted phenylalanine, an ortho borono-substituted phenylalanine, a meta borono-substituted phenylalanine, a para borono-substituted phenylalanine, a para benzoyl-substituted phenylalanine, an ortho azido-substituted phenylalanine, a meta azido-substituted phenylalanine, an ortho nitro-substituted phenylalanine, a meta nitro-substituted phenylalanine, a para nitro-substituted phenylalanine, an ortho nitro-substituted tyrosine, a meta nitro-substituted tyrosine, a para nitro-substituted tyrosine; alkynyl aryl amino acids, aliphatic amino acids, alpha hydroxy acid substituted amino acids, beta diketo containing amino acids, alkoxyamine containing amino acids, and/or the like. The unnatural amino acid optionally is other than Boc-lysine, acetyllysine or Nε-benzyloxycarbonyl-L-lysine. The unnatural amino acids are other than the 20 canonical natural amino acids, seleno-cysteine or pyrrolysine. The unnatural amino acid is optionally O-nitrobenzyl-oxycarbonyl-Nε-L-lysine (ONBK).
[0010] In an embodiment, the RS includes an amino acid sequence at least 90% identical to SEQ ID NO: 4 (NBK-1), and has an Ala amino acid at a position corresponding to Leu309 of wild type MmPylRS sequence SEQ ID NO: 2, an Ala amino acid at a position corresponding to Cys348 of SEQ ID NO: 2, and a Phe amino acid at a position corresponding to Tyr384 of SEQ ID NO: 2. In another embodiment, the aminoacyl tRNA synthetase comprises a polypeptide sequence comprising at least 90% identity to a Methanosarcina mazei pyrrolysyl-tRNA synthetase (MmPylRS) sequence, and the polypeptide sequence comprises methionine at a position corresponding to position 306 of the MmPylRS sequence, an isoleucine at a position corresponding to position 306 of the MmPylRS sequence, an alanine at a position corresponding to position 309 of the MmPylRS sequence, an alanine at a position corresponding to position 348, or a phenylalanine at a position corresponding to position 384. In some embodiments, the RS is at least 95% or more identical to SEQ ID NO: 4 or SEQ ID NO: 6.
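A criterion of the form "at least 90% identical and carrying specified residues at specified positions" can be checked mechanically once two sequences are aligned. The Python sketch below is a minimal illustration under stated assumptions: the sequences are already aligned, identity is counted over ungapped columns only (other definitions of percent identity exist), and the 20-residue sequences and function names are hypothetical toys, not SEQ ID NO: 2 or SEQ ID NO: 4.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)

def has_residues(sequence: str, required: dict) -> bool:
    """Check that an ungapped sequence has the required residues at 1-based positions."""
    return all(
        0 < pos <= len(sequence) and sequence[pos - 1] == aa
        for pos, aa in required.items()
    )

# Toy sequences (hypothetical); the candidate differs at positions 9 and 19.
reference = "MDKKPLNTLISATGLWMSRT"
candidate = "MDKKPLNTAISATGLWMSAT"

identity = percent_identity(reference, candidate)
qualifies = identity >= 90.0 and has_residues(candidate, {9: "A", 19: "A"})
print(identity, qualifies)
```

In practice the positions would be mapped through an alignment to the wild type sequence, since "a position corresponding to" a reference residue need not share its index in a variant with insertions or deletions.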
[0011] Methods of incorporating unnatural amino acids using orthogonal aminoacyl-tRNA synthetases derived from Archaea, particularly Methanosarcinae, are included. Such synthetases function in eubacteria and eukaryotes and are often capable of optionally charging a cognate tRNA with two or more alternate similar unnatural amino acids. The methods include producing polypeptides in a eukaryotic cell by producing an orthogonal aminoacyl-tRNA synthetase (O-RS) library in one or more bacterial cells, and selecting the synthetase library for an orthogonal member that specifically aminoacylates an orthogonal tRNA (O-tRNA) in the bacterial cells with an unnatural amino acid to provide an unnatural amino acid-specific synthetase that is orthogonal in the bacterial cells. The unnatural amino acid-specific synthetase can be shuttled into the eukaryotic cell, such as a mammalian cell or an insect cell, to charge a cognate tRNA and to function orthogonally in the eukaryotic cell. In an aspect of the invention, the synthetase and/or O-tRNA is derived from corresponding Archaea translation components.
[0012] In the methods and compositions, the unnatural amino acid can be any appropriate unnatural amino acid.
For example, the unnatural amino acid can be an ortho acyl-substituted phenylalanine, a meta acyl-substituted phenylalanine, a para acyl-substituted phenylalanine, an ortho azido-substituted phenylalanine, a meta azido-substituted phenylalanine, a para azido-substituted phenylalanine, an ortho borono-substituted phenylalanine, a meta borono-substituted phenylalanine, a para borono-substituted phenylalanine, a para benzoyl-substituted phenylalanine, an ortho azido-substituted phenylalanine, a meta azido-substituted phenylalanine, an ortho nitro-substituted phenylalanine, a meta nitro-substituted phenylalanine, a para nitro-substituted phenylalanine, an ortho nitro-substituted tyrosine, a meta nitro-substituted tyrosine, a para nitro-substituted tyrosine; alkynyl aryl amino acids, aliphatic amino acids, alpha hydroxy acid substituted amino acids, beta diketo containing amino acids, alkoxyamine containing amino acids and/or the like. In many embodiments, the unnatural amino acid is other than the canonical 20 natural amino acids, seleno-cysteine, pyrrolysine, Boc-lysine, acetyllysine or Nε-benzyloxycarbonyl-L-lysine. In typical embodiments, the amino acid is not itself a peptide and is not unnatural due to linkage of a chemical moiety to the side chain after the amino acid has previously been incorporated into a polypeptide. The unnatural amino acid typically has a side chain with dimensions that fit into a modified binding pocket of an aminoacyl-tRNA synthetase. For example, whereas a typical natural amino acid side chain can be considered to have a length ranging from about zero (glycine) to about 10 angstroms, side chains of many unnatural amino acids of interest range in size from about 2 angstroms to about 25 angstroms, or more; from 3 angstroms to 20 angstroms, from 5 angstroms to 15 angstroms, or about 12 angstroms.
Unnatural amino acids with side chains having lengths greater than 50 angstroms (or about 30 carbon-carbon bond equivalents) are typically less desirable.
[0013] In one aspect, the methods include providing the orthogonal synthetase by mutating a Methanosarcina nucleic acid encoding a pyrrolysyl-tRNA synthetase (MPylRS) polypeptide. For example, useful synthetases can be provided by mutation of the nucleic acid (e.g., MmPylRSwt SEQ ID NO: 1) at positions corresponding to amino acid positions Leu309, Cys348 and Tyr384 of SEQ ID NO: 2; wherein SEQ ID NO: 2 is a wild type Methanosarcina mazei polypeptide (MmPylRS) sequence. The methods can further include mutation of the MPylRS nucleic acid at positions encoding amino acids at positions corresponding to Tyr306 of SEQ ID NO: 2. The mutated nucleic acids can be used to transform bacteria, along with nucleic acids encoding cognate tRNAs preferentially aminoacylated by the MPylRS, thereby providing an O-RS library of mutated RSs paired with the cognate tRNA. Clones from the library can be positively selected for members encoding a mutant MPylRS that charges the Pyl-tRNA with an unnatural amino acid of choice. The methods can include growing a eukaryotic cell comprising: the unnatural amino acid, a nucleic acid that encodes a protein and comprises at least one selector codon recognized by the Pyl-tRNA, a selected mutant MPylRS and the Pyl-tRNA, so that the protein is translated from the nucleic acid in the eukaryotic cell to incorporate the unnatural amino acid at the specified position.
[0014] Methods include shuttling translation system components from eubacterial cells to eukaryotic translation systems (typically cells) where they function to orthogonally incorporate unnatural amino acids into polypeptides of interest. For example, the shuttling can comprise transforming a eukaryotic cell with a nucleic acid (e.g., NBK-1 RS SEQ ID NO: 3 or NBK-2 RS SEQ ID NO: 5) encoding an O-RS comprising an amino acid sequence at least 90% identical to SEQ ID NO: 4 (NBK-1 RS) or to SEQ ID NO: 6 (NBK-2 RS), wherein the O-RS further comprises an Ala amino acid in a position of the O-RS corresponding to Leu309 of SEQ ID NO: 2, an Ala amino acid residue in a position of the O-RS corresponding to Cys348 of SEQ ID NO: 2, and a Phe amino acid residue in a position of the O-RS corresponding to Tyr384 of SEQ ID NO: 2; wherein SEQ ID NO: 2 is the wild type MmPylRS sequence. The method can also include transforming the eukaryotic cell with a nucleic acid encoding a pyrrolysyl-tRNA (Pyl-tRNA) preferentially aminoacylated by the O-RS. The cell can be provided with a nucleic acid that encodes the polypeptide of interest including at least one selector codon recognized by the Pyl-tRNA so that the unnatural amino acid will be incorporated at the position designated by the selector codon. The methods include preparing the unnatural amino acid by photocaging a residue of interest, e.g., a lysine, or by substituting a chemical group on the residue. If the unnatural amino acid is a caged lysine analog or other unnatural amino acid, it can be charged onto the Pyl-tRNA and incorporated into the polypeptide of interest. Optionally, the polypeptide can be illuminated with light to remove the cage group from the lysine or other unnatural amino acid.
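To make the role of the selector codon concrete, the following Python sketch places an amber codon (TAG) at a chosen residue of a coding sequence and translates it with and without amber suppression. It is a toy model under stated assumptions: the 21-nucleotide coding sequence and the function names are hypothetical, the unnatural amino acid is written as "X", and suppression is treated as all-or-nothing rather than as a competition with release factors.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code; '*' marks a stop codon.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def introduce_amber(cds: str, residue: int) -> str:
    """Replace the codon of a 1-based residue number with the amber codon TAG."""
    start = 3 * (residue - 1)
    if residue < 1 or start + 3 > len(cds):
        raise ValueError("residue outside coding sequence")
    return cds[:start] + "TAG" + cds[start + 3:]

def translate(cds: str, suppress_amber: bool) -> str:
    """Translate a coding sequence; with suppression, TAG is read as 'X'."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3]
        aa = CODON_TABLE[codon]
        if codon == "TAG" and suppress_amber:
            aa = "X"  # position filled by the unnatural amino acid
        if aa == "*":
            break     # unsuppressed stop codon ends translation
        protein.append(aa)
    return "".join(protein)

toy_cds = "ATGGATAAAAAGCCTCTGTAA"  # hypothetical: M D K K P L, then a TAA stop
mutant = introduce_amber(toy_cds, 4)
truncated = translate(mutant, suppress_amber=False)
full_length = translate(mutant, suppress_amber=True)
print(truncated, full_length)
```

Without suppression the toy product stops before residue 4; with suppression the full-length product carries X at that position, which mirrors why full-length protein is expected only when the orthogonal pair and the unnatural amino acid are both present.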
[0015] The present inventions include polypeptide libraries comprising Methanosarcina mazei pyrrolysyl-tRNA synthetase (MmPylRS) sequences that collectively comprise mutations at positions corresponding to positions 305, 306, 309, 348, 384 and 419. In many embodiments, less than 20%, e.g., less than 10%, less than 5% or less of amino acids, other than those at positions 305, 306, 309, 348, 384, and 419 are mutated. In many embodiments, the library synthetases comprise one or more mutations selected from the group consisting of: Y306M, Y306I, L309A, C348A and Y384F.
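The theoretical size of such a library can be estimated with a short calculation. The Python sketch below assumes, for illustration only, that all six listed positions are fully randomized at once with the NNK degenerate codon (N = any base, K = G or T) that this document mentions for library construction; whether every position was randomized simultaneously is an assumption here, and the variable names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code; '*' marks a stop codon.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Enumerate NNK codons: any base, any base, then G or T.
nnk_codons = ["".join(c) for c in product(BASES, BASES, "GT")]
encoded_amino_acids = {CODON_TABLE[c] for c in nnk_codons} - {"*"}
nnk_stops = sorted(c for c in nnk_codons if CODON_TABLE[c] == "*")

positions = [305, 306, 309, 348, 384, 419]
codon_combinations = len(nnk_codons) ** len(positions)
protein_variants = len(encoded_amino_acids) ** len(positions)

print(len(nnk_codons), len(encoded_amino_acids), nnk_stops)
print(codon_combinations, protein_variants)
```

Under these assumptions NNK gives 32 codons covering all 20 amino acids with TAG as the only stop codon, so six positions correspond to 32^6 = 1,073,741,824 codon combinations and 20^6 = 64,000,000 distinct protein variants. Figures of this kind are what a transformant count is usually compared against when judging library coverage.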
[0016] Polypeptides comprising one or more O-nitrobenzyl-oxycarbonyl-lysine (ONBK) residues are one aspect of the present invention. The polypeptide can be present within a cell, such as, e.g., a eubacterial or eukaryotic cell.
[0017] Another aspect of the invention is a nucleic acid comprising a sequence encoding an aminoacyl tRNA synthetase comprising an isoleucine or methionine at a position corresponding to position 306 of the wild type Methanosarcina mazei PylRS sequence, an alanine at a position corresponding to position 309, an alanine at a position corresponding to position 348, or a phenylalanine at a position corresponding to position 384. The nucleic acid is typically incorporated into a vector, such as an expression vector, or a shuttle vector.
BRIEF DESCRIPTION OF THE DRAWINGS
[0018] Figure 1 shows a ribbon diagram of an exemplary Methanosarcina pyrrolysyl-tRNA synthetase protein structure.
[0019] Figure 2 shows a schematic diagram of an exemplary translation system that can be shuttled between an enterobacterium and a mammalian cell and function in each.
[0020] Figure 3 shows structures of pyrrolysine and certain analogs of pyrrolysine.
[0021] Figure 4 shows incorporation of a pyrrolysine analog by native MmPylRS in E. coli. Figure 4a shows northern blot analysis of tRNA charging in E. coli. The uncharged tRNAPyl(CUA) band and the charged tRNAPyl(CUA) band are indicated by arrows. tRNAPyl(CUA) is only charged in the presence of both PylRS and Cyc. In Figure 4b, western blot analysis shows protein expression in mammalian cells. The full length mutant His-RBP4 is only expressed when CHO cells harboring both MmPylRS and tRNAPyl(CUA) plasmids were grown with 5 mM Cyc. Figure 4c shows library design for directed evolution of MmPylRS. Pyl is colored magenta and residues in close contact with the terminal ring of Pyl are colored green.
[0022] Figure 5 presents SDS-PAGE and mass spectroscopy confirming efficient polypeptide translation incorporating a caged lysine. The results demonstrate evolution of a MmPylRS-tRNAPyl(CUA) pair that encodes ONBK in E. coli. Figure 5a shows a plate assay of NBK-1 and NBK-2 surviving up to 120 μg ml⁻¹ Cm challenges when supplemented with 1 mM ONBK. In Figure 5b, genetic incorporation of ONBK into GFP protein in E. coli is analyzed by SDS-PAGE. The expressed full length GFP proteins were purified by Ni2+-NTA chromatography and stained with coomassie blue. Figure 5c shows ESI-MS analysis of purified GFP149ONBK protein produced by NBK-1-tRNAPyl(CUA). The major peak (mass: 27,915 Da) corresponds to the full length GFP149ONBK; the minor peak (mass: 27,782 Da) corresponds to the same protein with the N-terminal Met posttranslationally cleaved (GFP149ONBK-M).
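The assignment of the minor peak to loss of the N-terminal methionine can be sanity-checked with simple arithmetic. The Python sketch below compares the difference between the two masses reported for Figure 5c with the average mass of a methionine residue; the residue mass is computed from standard average masses of free methionine and water, and the variable names are hypothetical.

```python
# Deconvoluted masses reported for the two GFP149ONBK peaks (Da).
full_length_mass = 27915.0
minus_met_mass = 27782.0

# Average mass of a methionine residue: free Met (149.21 Da) minus water (18.02 Da).
MET_RESIDUE_MASS = 149.21 - 18.02

observed_shift = full_length_mass - minus_met_mass
deviation = observed_shift - MET_RESIDUE_MASS
print(f"observed shift: {observed_shift:.0f} Da")
print(f"expected Met residue: {MET_RESIDUE_MASS:.2f} Da")
print(f"deviation: {deviation:+.2f} Da")
```

The observed difference of 133 Da lies within about 2 Da of the methionine residue mass, which is broadly consistent with the stated assignment at the precision of the reported masses.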
[0023] Figure 6 shows shuttling of the synthetase evolved in E. coli into mammalian cells, where it functions orthogonally. In Figure 6a, EGFP37TAG protein is expressed using a NBK-1-tRNAPyl(CUA) pair in HEK293 cells in the presence of 1 mM ONBK. The top pictures show the fluorescence images of cells and the bottom pictures show cells illuminated with visible light. Figure 6b shows ESI-MS analysis of purified EGFP37ONBK protein from CHO cells. Inset shows the deconvoluted spectrum of EGFP37ONBK. Figure 6c shows ESI-MS analysis of EGFP37ONBK after photolysis. EGFP37ONBK protein at a final concentration of 100 μM was irradiated (365 nm) for 20 min.
DETAILED DESCRIPTION
[0024] The present inventions are directed to, e.g., compositions and methods using orthogonal aminoacyl-tRNA synthetase/orthogonal tRNA (O-RS/O-tRNA) pairs derived from certain Archaea RS/tRNA pairs that normally charge pyrrolysine. In an example embodiment, an RS/tRNA pair from Methanosarcina sp. is mutated to prepare a library of orthogonal pairs in bacteria incorporating an unnatural amino acid, then selected pairs are shuttled into eukaryotic cells. For example, an E. coli-mammalian shuttle system, as shown in Figure 2, has been developed to genetically encode unnatural amino acids in mammalian cells using aminoacyl-tRNA synthetases (RSs) evolved in E. coli. A pyrrolysyl-tRNA synthetase (PylRS) mutant was evolved in E. coli that selectively aminoacylates a cognate nonsense suppressor tRNA with a photocaged lysine derivative. A wide variety of unnatural amino acids can, similarly, be incorporated using similarly constructed mutants. Transfer of such orthogonal tRNA-RS pairs into eukaryotic (e.g., mammalian) cells makes possible the selective incorporation of unnatural amino acids into proteins in such eukaryotic cells.
[0025] The present invention includes compositions and methods for shuttling unnatural amino acid incorporation functionality between enterobacterial species and eukaryotic translation systems. The compositions include translation systems and translation system components designed with structures that function to incorporate unnatural amino acids in eukaryotes or prokaryotes, as desired. The methods include techniques of providing a library of, e.g., mutated Archaea aminoacyl-tRNA synthetases in enterobacteria, screening the library to select for synthetases that charge a cognate tRNA with an unnatural amino acid of choice, shuttling the selected synthetase to a eukaryotic cell, and incorporating the unnatural amino acid into a polypeptide using the synthetase in the eukaryotic cell.
COMPOSITIONS AND METHODS FOR ORTHOGONAL TRANSLATION OF UNNATURAL AMINO ACIDS IN EITHER PROKARYOTES OR EUKARYOTES.
[0026] The compositions of the present invention include, e.g., translation systems and translation system components comprising orthogonal aminoacyl-tRNA synthetases (O- RSs), orthogonal tRNAs (O-tRNAs) derived from Archaea, unnatural amino acids, and/or nucleic acids encoding polypeptides of interest. The compositions can include libraries comprising synthetases that function orthogonally in both eubacteria and eukaryotes, cells comprising the translation system components, and/or vectors for expression of the translation system components. The compositions include components having structures, such as, e.g., RS binding pockets and structural scaffolding, and tRNA selector codons and A arms and other structural features that function to incorporate desired unnatural amino acids into intended positions of polypeptides of interest.
[0027] Methods of the invention include selected and/or random mutation of an
Archaea (e.g., Methanosarcina) RS to accommodate an unnatural amino acid of interest, evaluation of the RS charging activity for the amino acid in a eubacteria (e.g., E. coli), and shuttling the mutated RS and a cognate tRNA into a eukaryotic cell (e.g., a mammalian, insect or plant cell line) for incorporation of the unnatural amino acid into a specific position in a polypeptide of interest. In example embodiments, the Archaea RS is a pyrrolysyl-tRNA synthetase and the mutations are directed to modification of amino acid residues at specific positions lining the binding pocket. In certain embodiments, evaluation of charging activity includes positive and/or negative selection techniques in the eubacteria to enrich for and identify those mutant RSs with the highest desired charging activity. Production of polypeptides including certain unnatural amino acids is desirable, e.g., to provide research tools and medicines reflecting the unique translation and post translation processing available in eukaryotic cells.
[0028] In general, in order to add additional unnatural amino acids to the genetic code, new orthogonal pairs comprising an aminoacyl-tRNA synthetase and a suitable tRNA are needed that can function efficiently in the host translational machinery, but that are "orthogonal" to the translation system at issue, meaning that they function independently of the synthetases and tRNAs endogenous to the translation system. Methods of the invention generally employ the compositions of the invention to efficiently incorporate unnatural amino acids in translation systems of choice. Desired characteristics of the orthogonal pair include a tRNA that decodes or recognizes only a specific codon, e.g., a selector codon, e.g., an amber stop codon, that is not decoded by any endogenous tRNA, and an aminoacyl-tRNA synthetase that preferentially aminoacylates, or "charges", its cognate tRNA with a specific unnatural amino acid. The O-tRNA is also not typically aminoacylated, or is very poorly aminoacylated, i.e., "charged," by endogenous synthetases. For example, in an E. coli host system, an orthogonal pair will include an aminoacyl-tRNA synthetase that does not cross-react with any of the endogenous tRNAs, e.g., of which there are 40 in E. coli, and an orthogonal tRNA that is not aminoacylated by any of the endogenous synthetases, e.g., of which there are 21 in E. coli. The term "cognate" refers to components that function together, or have some aspect of specificity for each other, e.g., an orthogonal tRNA and an orthogonal aminoacyl-tRNA synthetase. The present invention includes orthogonal components that function in both bacterial and eukaryotic cells, as well as vectors for shuttling nucleic acids that encode the orthogonal components between such cells.
[0029] The general principles for the production of orthogonal translation systems that are suitable for making proteins that comprise one or more desired unnatural amino acid are known in the art, as are the general methods for producing orthogonal translation systems. For example, see International Publication Numbers WO 2002/086075, entitled "METHODS AND COMPOSITION FOR THE PRODUCTION OF ORTHOGONAL tRNA-AMINOACYL tRNA SYNTHETASE PAIRS;" WO 2002/085923, entitled "IN VIVO INCORPORATION OF UNNATURAL AMINO ACIDS;" WO 2004/094593, entitled "EXPANDING THE EUKARYOTIC GENETIC CODE;" WO 2005/019415, filed July 7, 2004; WO 2005/007870, filed July 7, 2004; WO 2005/007624, filed July 7, 2004; WO 2006/110182, filed October 27, 2005, entitled "ORTHOGONAL TRANSLATION COMPONENTS FOR THE IN VIVO INCORPORATION OF UNNATURAL AMINO ACIDS" and WO 2007/103490, filed March 7, 2007, entitled "SYSTEMS FOR THE EXPRESSION OF ORTHOGONAL TRANSLATION COMPONENTS IN EUBACTERIAL HOST CELLS." Each of these applications is incorporated herein by reference in its entirety. For discussion of orthogonal translation systems that incorporate unnatural amino acids, and methods for their production and use, see also, Wang and Schultz, (2005) "Expanding the Genetic Code." Angewandte Chemie Int Ed 44: 34-66; Xie and Schultz, (2005) "An Expanding Genetic Code." Methods 36: 227-238; Xie and Schultz, (2005) "Adding Amino Acids to the Genetic Repertoire." Curr Opinion in Chemical Biology 9: 548-554; and Wang, et al., (2006) "Expanding the Genetic Code." Annu Rev Biophys Biomol Struct 35: 225-249; Deiters, et al., (2005) "In vivo incorporation of an alkyne into proteins in Escherichia coli." Bioorganic & Medicinal Chemistry Letters 15:1521-1524; Chin, et al., (2002) "Addition of p-Azido-L-phenylalanine to the Genetic Code of Escherichia coli." J Am Chem Soc 124: 9026-9027; and International Publication No. WO2006/034332, filed on September 20, 2005, the contents of each of which are incorporated by reference in their entirety. Additional details are found in United States Patents No. 7,045,337; No. 7,083,970; No. 7,238,510; No. 7,129,333; No. 7,262,040; No. 7,183,082; No. 7,199,222; and No. 7,217,809.
Orthogonal Translation Systems
[0030] Translation system components that act orthogonally in both eubacteria and eukaryotic cells can be derived from certain Archaea translation system components, as discussed herein. tRNAs from eubacteria and eukaryotes are generally not charged by native Archaea RSs. However, some native Archaea RS/tRNA pairs can function in eubacterial translation systems, e.g., incorporating pyrrolysine as a natural orthogonal suppressor (see, e.g., Blight, et al., Direct Charging of tRNA (CUA) with Pyrrolysine In Vitro and In Vivo, Nature 431: 333-335, 2004). Such natural translation system components provide a platform for engineering orthogonal unnatural amino acid specific RS/tRNA pair suppressors with modified structures that function to suppress a selector codon by incorporation of a selected unnatural amino acid into a polypeptide of interest.
[0031] The present orthogonal systems include, e.g., O-RS/O-tRNA pairs derived from Methanosarcina and/or Desulfitobacterium RS/tRNA pairs. Depending on the unnatural amino acid desired for charging, certain amino acid residues in the amino acid side chain binding pocket of the RS can be substituted to accommodate the characteristics of the desired amino acid side chain. For example, amino acids in the Archaea pyrrolysyl-tRNA synthetase (PylRS) binding pocket corresponding to positions 305, 306, 309, 348, 384 and/or 419 can be substituted with amino acids with size, polarity, hydrogen bonding groups and/or hydrophobic groups that configure the binding pocket to provide space and interactions promoting binding of a particular desired unnatural amino acid. The modified RS/tRNA pairs can be tested in eubacteria (e.g., E. coli) for orthogonality and unnatural amino acid specificity, and their encoding nucleic acids shuttled into eukaryotic translation systems (e.g., mammalian cells) for incorporation of the unnatural amino acid into a polypeptide encoded by a nucleic acid with a selector codon corresponding to the anticodon of the tRNA.
[0032] Orthogonal translation systems generally comprise cells, e.g., prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, plant, insect, or mammalian cells that include an orthogonal tRNA (O-tRNA), an orthogonal aminoacyl tRNA synthetase (O-RS), and an unnatural amino acid, e.g., a non-canonical amino acid, where the O-RS aminoacylates the O-tRNA with the unnatural amino acid. An orthogonal pair of the invention can include an O-tRNA, e.g., a suppressor tRNA, a frameshift tRNA, or the like, and a cognate O-RS. The orthogonal systems of the invention, which typically include O-tRNA/O-RS pairs, can comprise a cell or a cell-free environment. In addition to multi-component systems, the invention also provides novel individual components, for example, several novel orthogonal aminoacyl-tRNA synthetase polypeptides, e.g., those in the sequence listing herein, and the polynucleotides that encode these polypeptides, e.g., as shown in the sequence listing.
[0033] In general, when an orthogonal pair recognizes a selector codon and loads an amino acid in response to the selector codon, the orthogonal pair is said to "suppress" the selector codon. That is, a selector codon that is not recognized by the translation system's, e.g., the E. coli, yeast, mammalian, etc. cell's, endogenous machinery is not ordinarily charged, which results in blocking production of a polypeptide that would otherwise be translated from the nucleic acid. In an orthogonal pair system, the O-RS aminoacylates the O-tRNA with a specific unnatural amino acid, e.g., an epsilon amine caged lysine. The charged O-tRNA recognizes the selector codon and suppresses the translational block caused by the selector codon.
[0034] In some aspects, an O-tRNA of the invention recognizes a selector codon and includes at least about, e.g., a 45%, a 50%, a 60%, a 75%, a 80%, or a 90% or more suppression efficiency in the presence of a cognate synthetase in response to a selector codon as compared to the suppression efficiency of an O-tRNA comprising or encoded by a polynucleotide sequence as set forth in the sequence listing herein. The O-tRNAs of the invention can recognize a selector codon and suppress in either or both a eubacteria or a eukaryote cell with a suppression efficiency of at least about, e.g., a 45%, a 50%, a 60%, a 75%, a 80%, or a 90% or more.
[0035] In some embodiments, the suppression efficiency of the O-RS and the O-tRNA together is about, e.g., 5-fold, 10-fold, 15-fold, 20-fold, or 25-fold or more greater than the suppression efficiency of the O-tRNA lacking the O-RS. In an aspect, the suppression efficiency of the O-RS and the O-tRNA together is at least about, e.g., 35%, 40%, 45%, 50%, 60%, 75%, 80%, or 90% or more of the suppression efficiency of an orthogonal synthetase pair as set forth in the sequence listings herein. In some embodiments, the O-RS/O-tRNA pair has a suppression efficiency in each of a eubacteria (e.g., an enterobacteria) and in a eukaryotic translation system (e.g., an animal cell) of at least about, e.g., 35%, 40%, 45%, 50%, 60%, 75%, 80%, or 90% or more.
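The two comparisons described above (fold increase over the O-tRNA alone, and efficiency as a percentage of a reference pair) reduce to simple ratios. The following is a minimal sketch only; the function names are hypothetical, and it assumes that suppression efficiency has already been measured as a positive number (e.g., a reporter signal in arbitrary units), which this passage does not itself specify.

```python
def fold_over_background(with_rs: float, without_rs: float) -> float:
    """Fold increase in suppression when the O-RS is present.

    Both arguments are assumed to be measurements in the same arbitrary units.
    """
    if without_rs <= 0:
        raise ValueError("background suppression must be positive")
    return with_rs / without_rs


def percent_of_reference(candidate: float, reference: float) -> float:
    """Candidate suppression efficiency as a percentage of a reference pair."""
    if reference <= 0:
        raise ValueError("reference efficiency must be positive")
    return 100.0 * candidate / reference


def meets_threshold(value: float, threshold: float) -> bool:
    """True when a value reaches an 'at least about' threshold."""
    return value >= threshold


# Illustrative numbers only (not data from the specification).
fold = fold_over_background(with_rs=20.0, without_rs=2.0)
relative = percent_of_reference(candidate=45.0, reference=100.0)
print(fold, meets_threshold(fold, 5.0))
print(relative, meets_threshold(relative, 45.0))
```

Under these assumptions, a pair would satisfy a "5-fold" or "45%" embodiment when the corresponding check returns True.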
[0036] The translation system, e.g., an enterobacteria, yeast, insect, mammalian cell, human cell or in vitro system, uses the O-tRNA/O-RS pair to incorporate the unnatural amino acid into a growing polypeptide chain, e.g., via a nucleic acid that comprises a polynucleotide that encodes a polypeptide of interest, where the polynucleotide comprises a selector codon that is recognized by the O-tRNA. In certain preferred aspects, the cell can include one or more additional O-tRNA/O-RS pairs, where the additional O-tRNA is loaded by the additional O-RS with a different unnatural amino acid. For example, one of the O-tRNAs can recognize a four base codon and the other O-tRNA can recognize a stop codon. Alternately, multiple different stop codons, multiple different four base codons, multiple different rare codons and/or multiple different non-coding codons can be used in the same coding nucleic acid. For further details regarding available O-RS/O-tRNA cognate pairs and their use, see, e.g., the references noted above.
[0037] As noted, in some embodiments, there exist multiple O-tRNA/O-RS pairs in a translation system, which allow incorporation of more than one unnatural amino acid into a polypeptide. For example, the translation system can further include an additional different O-tRNA/O-RS pair and a second different unnatural amino acid, where this additional O-tRNA recognizes a second selector codon and this additional O-RS preferentially aminoacylates the O-tRNA with the second unnatural amino acid. For example, a cell that includes an O-tRNA/O-RS pair, where the O-tRNA recognizes, e.g., an amber selector codon, can further comprise a second orthogonal pair, where the second O-tRNA recognizes a different selector codon, e.g., an opal codon, an ochre codon, a four-base codon, a rare codon, a non-coding codon, or the like. Desirably, the different orthogonal pairs are derived from different sources, which can facilitate recognition of different selector codons.
[0038] In certain embodiments, translation systems can comprise an in vitro translation system, a cell, such as an E. coli or other bacterial cell, yeast, plant cell, mammalian or other eukaryotic cell, that includes an orthogonal tRNA (O-tRNA), an orthogonal aminoacyl-tRNA synthetase (O-RS), an unnatural amino acid and a nucleic acid that comprises a polynucleotide that encodes a polypeptide of interest, where the polynucleotide comprises the selector codon that is recognized by the O-tRNA. Although orthogonal translation systems, e.g., translation systems comprising an O-RS, an O-tRNA and an unnatural amino acid, can utilize cultured cells to produce proteins having unnatural amino acids, it is not intended that an orthogonal translation system of the invention require an intact, viable cell. For example, an orthogonal translation system can utilize a cell-free system in the presence of a cell extract. Indeed, the use of cell free in vitro transcription/translation systems for protein production is a well established technique. Adaptation of these in vitro systems to produce proteins having unnatural amino acids using orthogonal translation system components described herein is well within the scope of the invention.
[0039] The O-tRNA and/or the O-RS can be naturally occurring or can be, e.g., derived by mutation of a naturally occurring tRNA and/or RS, e.g., by generating libraries of tRNAs and/or libraries of RSs, from any of a variety of organisms and/or by using any of a variety of available mutation strategies. For example, one strategy for producing an orthogonal tRNA/aminoacyl-tRNA synthetase pair involves importing a tRNA/synthetase pair that is heterologous to the system in which the pair will function from a source, or multiple sources, other than the translation system in which the tRNA/synthetase pair will be used. For example, O-RS/O-tRNA pairs from Archaea can be imported to eubacterial or eukaryotic systems to function orthogonally, in native form or with selected mutations, to incorporate desired unnatural amino acids. The properties of the heterologous synthetase candidate include, e.g., that it does not charge any host cell tRNA, and the properties of the heterologous tRNA candidate include, e.g., that it is not aminoacylated by any host cell synthetase. In addition, the heterologous tRNA is orthogonal to all host cell synthetases. Strategies to generate orthogonal pairs can involve generating mutant libraries from which to screen and/or select an O-tRNA or O-RS with the desired functional structures. Importation and mutant library screening strategies can also be combined.
[0040] Synthetase libraries can include two or more different mutant nucleic acids encoding different RSs. It is preferred that the RSs be derived from Archaea, such as, e.g., Methanosarcina and/or Desulfitobacterium species, e.g., using pylRS and its cognate tRNA as a platform to develop unnatural amino acid-specific orthogonal pairs. It is preferred that the mutations substitute amino acids in positions lining the amino acid side chain binding pocket of the RS. Although the RSs can predictably include amino acid substitutions outside the binding pocket that retain general structures (e.g., secondary and tertiary structure form and function), key mutations for customizing amino acid specificity are typically made in the binding pocket residues.
[0041] RS libraries can be provided to receive a wide variety of unnatural amino acids as substrate. For example, the Methanosarcina mazei pyrrolysyl-tRNA synthetase (MmPylRS) can be, e.g., selectively and/or randomly mutated at key amino acid positions to provide any desired specificity. For example, mutations can be directed to MmPylRS (NA SEQ ID NO: 1; polypeptide SEQ ID NO: 2) amino acid positions 305, 306, 309, 348, 384 and/or 419 to accommodate and favorably interact with an unnatural amino acid of given structure. Further, similar functional libraries can be derived from homologous RS sequences, e.g., with mutations directed to positions corresponding to MmPylRS amino acid positions 305, 306, 309, 348, 384 and/or 419. For example, libraries with members that function to charge a given unnatural amino acid can be designed with appropriate mutations to the pyrrolysyl-tRNA synthetase of Methanosarcina mazei or Desulfitobacterium species at positions corresponding to MmPylRS positions 305, 306, 309, 346, 348, 384, 417 and/or 419.
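To give a sense of scale for libraries randomized at the binding pocket positions named above, the following sketch counts the theoretical number of variants. It is illustrative only: the specification does not state a randomization scheme, so full randomization to all 20 canonical amino acids and the use of NNK degenerate codons (N = A/C/G/T, K = G/T, 32 codons per position) are assumptions made here.

```python
# Binding pocket positions taken from the text (MmPylRS numbering).
POCKET_POSITIONS = (305, 306, 309, 348, 384, 419)

CANONICAL_AMINO_ACIDS = 20   # assumed: each position sampled over all 20
NNK_CODONS = 4 * 4 * 2       # assumed scheme: N, N, K -> 32 codons


def protein_library_size(n_positions: int) -> int:
    """Distinct protein variants when each position can be any canonical amino acid."""
    return CANONICAL_AMINO_ACIDS ** n_positions


def nnk_dna_library_size(n_positions: int) -> int:
    """Distinct DNA sequences when each position is encoded by an NNK codon."""
    return NNK_CODONS ** n_positions


n = len(POCKET_POSITIONS)
print(f"{n} positions: {protein_library_size(n):,} protein variants")
print(f"{n} positions: {nnk_dna_library_size(n):,} NNK DNA sequences")
```

Counts of this size are one reason selection in a eubacterial host, as described elsewhere in this document, can be preferable to screening variants one at a time.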
Orthogonal tRNA (O-tRNA)
[0042] An orthogonal tRNA (O-tRNA) of the invention desirably mediates incorporation of an unnatural amino acid into a protein that is encoded by a polynucleotide that comprises a selector codon that is recognized by the O-tRNA, e.g., in vivo or in vitro. In certain embodiments, an O-tRNA of the invention includes at least about, e.g., a 45%, a 50%, a 60%, a 75%, a 80%, or a 90% or more suppression efficiency in the presence of a cognate synthetase in response to a selector codon as compared to an O-tRNA comprising or encoded by a polynucleotide sequence as set forth in the O-tRNA sequences in the sequence listing herein. The tRNA will typically display this selectivity in both bacterial and eukaryotic cells.
[0043] Examples of O-tRNAs of the invention are set forth in the sequence listing herein. The disclosure herein also provides guidance for the design of additional equivalent O-tRNA species. In an RNA molecule, such as an O-RS mRNA, or O-tRNA molecule, Thymine (T) is replaced with Uracil (U) relative to a given sequence (or vice versa for a coding DNA), or complement thereof. Additional routine modifications to the bases can also be present. In embodiments, the O-tRNA can have 80% sequence identity, 90% identity, 95% identity, 98% identity, or more to an orthogonal tRNA, such as Mmpyl-tRNA of SEQ ID No: 7.
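The T-to-U conversion and the percent identity thresholds mentioned above can be illustrated with a short sketch. This is a simplified illustration, not the alignment method of the specification: it assumes the two sequences are already aligned and of equal length with no gaps, and the example sequences are arbitrary strings rather than entries from the sequence listing.

```python
def dna_to_rna(seq: str) -> str:
    """Write a DNA-form sequence in RNA form by replacing T with U."""
    return seq.upper().replace("T", "U")


def rna_to_dna(seq: str) -> str:
    """Reverse conversion, e.g., for writing a coding DNA."""
    return seq.upper().replace("U", "T")


def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


# Arbitrary toy sequences for illustration only.
print(dna_to_rna("GGAAACCTGATCATG"))
print(percent_identity("AAAAA", "AAAAU"))
```

For real sequences, an alignment tool such as the BLAST programs named elsewhere in this document would be used to handle gaps and length differences.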
[0044] The invention also encompasses conservative variations of O-tRNAs corresponding to particular O-tRNAs herein. For example, conservative variations of O-tRNA include those molecules that function like the particular O-tRNAs, e.g., as in the sequence listing herein and that maintain the tRNA L-shaped structure by virtue of appropriate self-complementarity, but that do not have a sequence identical to that, e.g., in the sequence listing, and desirably, are other than wild type tRNA molecules.
[0045] The composition comprising an O-tRNA can further include an orthogonal aminoacyl-tRNA synthetase (O-RS), where the O-RS preferentially aminoacylates the O-tRNA with an unnatural amino acid. In certain embodiments, a composition including an O-tRNA can further include a translation system, e.g., in vitro or in vivo. A nucleic acid that comprises a polynucleotide that encodes a polypeptide of interest, where the polynucleotide comprises a selector codon that is recognized by the O-tRNA, or a combination of one or more of these can also be present in the cell.
[0046] General methods for producing a recombinant orthogonal tRNA and screening its efficiency with respect to incorporating an unnatural amino acid into a polypeptide in response to a selector codon can be found, e.g., in International Application Publications WO 2002/086075, entitled "METHODS AND COMPOSITIONS FOR THE PRODUCTION OF ORTHOGONAL tRNA AMINOACYL-tRNA SYNTHETASE PAIRS;" WO 2004/094593, entitled "EXPANDING THE EUKARYOTIC GENETIC CODE;" and WO 2005/019415, filed July 7, 2004. See also Forster, et al., (2003) "Programming peptidomimetic syntheses by translating genetic codes designed de novo." Proc Natl Acad Sci U S A 100: 6353-6357; and Feng, et al., (2003) "Expanding tRNA recognition of a tRNA synthetase by a single amino acid change." Proc Natl Acad Sci U S A 100: 5676-5681. Additional details are found in United States Patents No. 7,045,337; No. 7,083,970; No. 7,238,510; No. 7,129,333; No. 7,262,040; No. 7,183,082; No. 7,199,222; and No. 7,217,809.
Orthogonal aminoacyl-tRNA synthetase (O-RS)
[0047] The present orthogonal synthetases can be derived from any Archaea synthetases, particularly pyrrolysyl-tRNA synthetases, by selectively engineering or randomly mutating RS binding pocket amino acids corresponding to those identified herein. Orthogonal synthetases of the invention typically include an amino acid binding pocket configured to accept the side chain of a desired unnatural amino acid as a substrate.
[0048] In some embodiments, the RS can be any RS having significant homology to Methanosarcina pylRSs, particularly in the region of the binding pocket, and mutated to provide structures that function to accept an intended unnatural amino acid as a substrate. Significant homology can be found according to methods known in the art and discussed herein. For example, alternate functional synthetases can be provided by mutating homologous RSs to have mutations similar to those identified or suggested herein. For example, RSs homologous to the presently identified or suggested RSs can be mutated to include similar mutations in binding pocket amino acids in order to accept the same or similar unnatural amino acids as substrates. The homologous RSs can have 99% sequence identity or more, more than about 98% identity, 95% identity, 90% identity, 80% identity, 50% identity, or more. In particular, it is desirable that the alternate RSs for similar mutation of the binding pocket have a relatively high percent identity in the region of the amino acid binding pocket. For example, it is desirable that the percent identity in a homologous RS region be at least 75%, at least 90%, at least 95%, at least 98%, at least 99%, or more. This percent identity in homologous regions is particularly desirable in regions corresponding to positions between amino acid 305 and 419 of the MmpylRS (SEQ ID NO: 2).
[0049] Homology of proteins and/or protein sequences can be the result of derivation from a common ancestral protein or protein sequence. Homology can be inferred, e.g., from structural and functional characteristics and from the percent identity of a putative homologous protein or homologous region of a protein. That is, homology is generally inferred from sequence similarity between two or more nucleic acids or proteins (or sequences thereof). The precise percentage of similarity between sequences that is useful in establishing homology varies with the nucleic acid and protein at issue, but as little as 25% sequence similarity is routinely used to establish homology. Higher levels of sequence similarity, e.g., 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or 99%, or more, can also be used to establish homology. Methods for determining sequence similarity percentages (e.g., BLASTP and BLASTN using default parameters) are described herein and are generally available.
[0050] The O-RS of the invention preferentially aminoacylates an O-tRNA with an unnatural amino acid, e.g., an epsilon substituted lysine, a photocaged lysine, an ortho, meta and/or para-substituted phenylalanine or tyrosine, alkynyl aryl amino acids, aliphatic amino acids, alpha hydroxy acid substituted amino acids, beta diketo containing amino acids, alkoxyamine containing amino acids, borono-substituted amino acids and/or the like, in vitro or in vivo. The O-RS of the invention can be provided to the translation system, e.g., a bacterial or eukaryotic cell, by a polypeptide that includes an O-RS and/or by a polynucleotide that encodes an O-RS or a portion thereof. For example, an example O-RS comprises an amino acid sequence as set forth in the sequence listing, or a conservative variation thereof. In another example, an O-RS, or a portion thereof, is encoded by a polynucleotide sequence that encodes an amino acid comprising sequence in the sequence listing or examples herein, or a complementary polynucleotide sequence thereof.
[0051] General details for producing an O-RS, assaying its aminoacylation efficiency, and/or altering its substrate specificity can be found in International Publication Number WO 2002/086075, entitled "METHODS AND COMPOSITIONS FOR THE PRODUCTION OF ORTHOGONAL tRNA AMINOACYL-tRNA SYNTHETASE PAIRS;" and WO 2004/094593, entitled "EXPANDING THE EUKARYOTIC GENETIC CODE." See also, Wang and Schultz "Expanding the Genetic Code," Angewandte Chemie Int Ed 44: 34-66 (2005); and Hoben and Söll (1985) Methods Enzymol 113: 55-59, the contents of which are incorporated by reference in their entirety. Additional details are found in United States Patents No. 7,045,337; No. 7,083,970; No. 7,238,510; No. 7,129,333; No. 7,262,040; No. 7,183,082; No. 7,199,222; and No. 7,217,809.
Source and Host Organisms
[0052] The orthogonal translational components (O-tRNA and O-RS) of the invention can be derived from any Archaea organism, or a combination of organisms, for use in a host translation system from any eubacterial or eukaryotic species, with the caveat that the O-tRNA/O-RS components and the host system work in an orthogonal manner. It is not a requirement that the O-tRNA and the O-RS from an orthogonal pair be derived from the same organism. In particular embodiments, the orthogonal components are derived from archaebacterial genes for use in a eubacterial host system and/or eukaryotic host system.
[0053] For example, the orthogonal O-tRNA can be derived from an archaebacterium, such as Methanosarcina mazei, Methanosarcina acetovorans, Methanosarcina barkeri, Methanosarcina frisia, Methanosarcina thermophila, Methanosarcina vacuolata, Desulfitobacterium hafniense, Methanococcus jannaschii, Methanobacterium thermoautotrophicum, Halobacterium such as Haloferax volcanii and Halobacterium species NRC-1, Archaeoglobus fulgidus, Pyrococcus furiosus, Pyrococcus horikoshii, Aeropyrum pernix, Methanococcus maripaludis, Methanopyrus kandleri, Methanosarcina mazei (Mm), Pyrobaculum aerophilum, Pyrococcus abyssi, Sulfolobus solfataricus (Ss), Sulfolobus tokodaii, Thermoplasma acidophilum, Thermoplasma volcanium, or the like. In a similar fashion, the orthogonal O-RS can be derived from an organism or combination of organisms, e.g., an archaebacterium, such as Methanosarcina mazei, Methanosarcina acetovorans, Methanosarcina barkeri, Methanosarcina frisia, Methanosarcina thermophila, Desulfitobacterium hafniense, Methanosarcina vacuolata, Methanococcus jannaschii, Methanobacterium thermoautotrophicum, Halobacterium such as Haloferax volcanii and Halobacterium species NRC-1, Archaeoglobus fulgidus, Pyrococcus furiosus, Pyrococcus horikoshii, Aeropyrum pernix, Methanococcus maripaludis, Methanopyrus kandleri, Pyrobaculum aerophilum, Pyrococcus abyssi, Sulfolobus solfataricus, Sulfolobus tokodaii, Thermoplasma acidophilum, Thermoplasma volcanium, or the like. In a preferred embodiment, the O-tRNA is a native Methanosarcina tRNA or is derived from a Methanosarcina tRNA. In preferred embodiments, the O-RS is a native Methanosarcina RS or is derived from a Methanosarcina RS. PylRS/tRNA pairs from these organisms represent particularly desirable platforms for development of unnatural amino acid-specific orthogonal pairs.
[0054] The individual components of an O-tRNA/O-RS pair can be derived from the same organism or from different organisms. In one embodiment, the O-tRNA/O-RS pair is from the same organism. Alternatively, the O-tRNA and the O-RS of the O-tRNA/O-RS pair are from different organisms. For example, the O-tRNA/O-RS pair can be derived from a natural Archaeal pair, or derived from Archaeal tRNA and RS that were previously not functionally paired, e.g., a tRNA from Methanosarcina mazei and an RS from Methanosarcina acetovorans.
[0055] The O-tRNA, O-RS or O-tRNA/O-RS pair can be selected or screened in vivo or in vitro and/or used in a cell, e.g., a eubacterial cell or enterobacterial cell to screen RS/tRNA pair activity or to produce a polypeptide with an unnatural amino acid. The eubacterial cell used is not limited and can be, for example, Escherichia coli, Thermus thermophilus, Bacillus subtilis, Bacillus stearothermophilus, or the like. Compositions of eubacterial cells comprising translational components of the invention are also a feature of the invention. In preferred embodiments, the O-tRNA, O-RS or O-tRNA/O-RS pair functions in a eukaryotic translation system to incorporate an unnatural amino acid of interest. In more preferred embodiments, the O-tRNA, O-RS or O-tRNA/O-RS pair functions to incorporate the unnatural amino acid both in a eubacterial translation system and in a eukaryotic translation system.
[0056] See also, International Application Publication Number WO 2004/094593, entitled "EXPANDING THE EUKARYOTIC GENETIC CODE," filed April 16, 2004, for screening O-tRNA and/or O-RS in one species for use in another species. Additional details are found in Wang and Schultz, (2005) "Expanding the Genetic Code." Angewandte Chemie Int Ed 44: 34-66; Xie and Schultz, (2005) "An Expanding Genetic Code." Methods 36: 227-238; Xie and Schultz, (2005) "Adding Amino Acids to the Genetic Repertoire." Curr Opinion in Chemical Biology 9: 548-554; and Wang, et al., (2006) "Expanding the Genetic Code." Annu Rev Biophys Biomol Struct 35: 225-249, and United States Patents No. 7,045,337; No. 7,083,970; No. 7,238,510; No. 7,129,333; No. 7,262,040; No. 7,183,082; No. 7,199,222; and No. 7,217,809.
Selector Codons
[0057] Selector codons of the invention expand the genetic codon framework of protein biosynthetic machinery. For example, a selector codon includes, e.g., a unique three base codon, a nonsense codon, such as a stop codon, e.g., an amber codon (UAG), an ochre codon (UAA), or an opal codon (UGA), an unnatural codon, at least a four base codon, a rare codon, or the like. A number of selector codons can be introduced into a desired gene, e.g., one or more, two or more, more than three, etc. Conventional site-directed mutagenesis can be used to introduce the selector codon at the site of interest in a polynucleotide encoding a polypeptide of interest. See, e.g., Sayers, J. R., et al. (1988) "5', 3' Exonuclease in phosphorothioate-based oligonucleotide-directed mutagenesis." Nucl Acid Res 16: 791-802. By using different selector codons, multiple orthogonal tRNA/synthetase pairs can be used that allow the simultaneous site-specific incorporation of multiple same or different unnatural amino acids, e.g., including at least one unnatural amino acid, using these different selector codons.
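At the sequence level, introducing a three-base selector codon at a site of interest amounts to replacing one codon of the coding sequence. The sketch below shows that substitution in silico for an amber codon (TAG in DNA form); it is a hypothetical helper for designing a mutant sequence, not a description of the laboratory mutagenesis procedure, and the example coding sequence is an arbitrary toy string.

```python
def introduce_selector_codon(cds: str, residue: int, selector: str = "TAG") -> str:
    """Replace the codon for a 1-based residue position with a three-base selector codon.

    `cds` is assumed to be an in-frame DNA coding sequence starting at codon 1.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if len(selector) != 3:
        raise ValueError("this sketch handles three-base selector codons only")
    n_codons = len(cds) // 3
    if not 1 <= residue <= n_codons:
        raise ValueError("residue position is outside the coding sequence")
    start = 3 * (residue - 1)
    return cds[:start] + selector.upper() + cds[start + 3:]


# Toy sequence: ATG GCT AAA TAA; put an amber codon at residue 3.
print(introduce_selector_codon("ATGGCTAAATAA", 3))
```

Extended (four or more base) selector codons would change the reading frame downstream and are deliberately outside the scope of this helper.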
[0058] Unnatural amino acids can also be encoded with rare codons. For example, when the arginine concentration in an in vitro protein synthesis reaction is reduced, the rare arginine codon, AGG, has proven to be efficient for insertion of Ala by a synthetic tRNA acylated with alanine. See, e.g., Ma, C. et al., (1993) "In vitro protein engineering using synthetic tRNAAla with different anticodons." Biochemistry 32: 7939-7945. In this case, the synthetic tRNA competes with the naturally occurring tRNAArg, which exists as a minor species (fewer tRNA molecules than for other Arg tRNAs and associated with a far lower occurrence (rarity) of the corresponding codon) in Escherichia coli. In addition, some organisms do not use all triplet codons. An unassigned codon AGA in Micrococcus luteus has been utilized for insertion of amino acids in an in vitro transcription/translation extract. See, e.g., Kowal and Oliver, (1997) "Exploiting unassigned codons in Micrococcus luteus for tRNA-based amino acid mutagenesis." Nucl Acid Res 25: 4685-4689. Components of the invention can be generated to use these rare codons in vivo. A rare codon can be considered a codon used in a cell or translation system less than 5 percent of the time to encode a particular amino acid compared to the total of other codons encoding the amino acid in the system or cell.
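The 5 percent criterion above can be applied to a codon usage table as follows. This is a sketch under stated assumptions: the fraction is computed against the total of all synonymous codons for the amino acid (one reading of the definition above), and the counts are invented toy numbers for illustration, not measured codon usage for any organism.

```python
from typing import Dict, List


def rare_codons(counts: Dict[str, int], threshold: float = 0.05) -> List[str]:
    """Return synonymous codons used for less than `threshold` of one amino acid's codons.

    `counts` maps each synonymous codon of a single amino acid to its observed count.
    """
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("codon counts must sum to a positive number")
    return sorted(codon for codon, n in counts.items() if n / total < threshold)


# Hypothetical counts for the six arginine codons (illustration only).
toy_arg_counts = {"CGU": 380, "CGC": 400, "CGA": 60, "CGG": 100, "AGA": 40, "AGG": 20}
print(rare_codons(toy_arg_counts))
```

With these toy counts the codons falling under the 5 percent cutoff are AGA and AGG, while CGA at 6 percent does not qualify.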
[0059] Selector codons can also comprise extended codons, e.g., four or more base codons, such as, four, five, six or more base codons. Examples of four base codons include, e.g., AGGA, CUAG, UAGA, CCCU, and the like. Examples of five base codons include, e.g., AGGAC, CCCCU, CCCUC, CUAGA, CUACU, UAGGC and the like. Methods of the invention include using extended codons based on frameshift suppression. Four or more base codons can insert, e.g., one or multiple unnatural amino acids, into the same protein. In other embodiments, the anticodon loops can decode, e.g., at least a four-base codon, at least a five-base codon, or at least a six-base codon or more. Since there are 256 possible four-base codons, multiple unnatural amino acids can be encoded in the same cell using a four or more base codon. See also, Anderson, et al., (2002) "Exploring the Limits of Codon and Anticodon Size." Chemistry and Biology 9: 237-244; Magliery, et al., (2001) "Expanding the Genetic Code: Selection of Efficient Suppressors of Four-base Codons and Identification of "Shifty" Four-base Codons with a Library Approach in Escherichia coli." J Mol Biol 307: 755-769; Ma, C., et al., (1993) "In vitro protein engineering using synthetic tRNAAla with different anticodons." Biochemistry 32:7939; Hohsaka, et al., (1999) "Efficient Incorporation of Non-natural Amino Acids with Large Aromatic Groups into Streptavidin in In Vitro Protein Synthesizing Systems." J Am Chem Soc 121: 34-40; and Moore, et al., (2000) "Quadruplet Codons: Implications for Code Expansion and the Specification of Translation Step Size." J Mol Biol 298: 195-209. Four base codons have been used as selector codons in a variety of orthogonal systems. See, e.g., WO 2005/019415; WO 2005/007870 and WO 2005/007624. See also, Wang and Schultz, (2005) "Expanding the Genetic Code." Angewandte Chemie Int Ed 44: 34-66, the content of which is incorporated by reference in its entirety.
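The figure of 256 four-base codons follows from four possible bases at each of four positions (4^4). A brief sketch that enumerates the extended codons and confirms the count is given below; the function name is hypothetical and the enumeration is purely combinatorial, with no claim about which codons are efficiently suppressed.

```python
from itertools import product

BASES = "ACGU"


def extended_codons(length: int) -> list:
    """Enumerate every RNA codon of the given length over the four bases."""
    return ["".join(bases) for bases in product(BASES, repeat=length)]


four_base = extended_codons(4)
print(len(four_base))                      # 4**4 = 256
print(all(c in four_base for c in ("AGGA", "CUAG", "UAGA", "CCCU")))
print(len(extended_codons(5)))             # 4**5 = 1024
```

The four-base examples named in the text (AGGA, CUAG, UAGA, CCCU) are all members of this enumerated set.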
[0060] For a given system, a selector codon can also include one of the natural three base codons, where the endogenous system does not use (or rarely uses) the natural base codon. For example, this includes a system that is lacking a tRNA that recognizes the natural three base codon, and/or a system where the three base codon is a rare codon.
[0061] Selector codons optionally include unnatural base pairs. Descriptions of unnatural base pairs which can be adapted for methods and compositions include, e.g., Hirao, et al., (2002) "An unnatural base pair for incorporating amino acid analogues into protein." Nature Biotechnology 20: 177-182. See also Wu, et al, (2002) "Enzymatic Phosphorylation of Unnatural Nucleosides." J Am Chem Soc 124: 14626-14630.
Methods of Unnatural Amino Acid Incorporation Using PylRS Mutants
[0062] Many pyrrolysyl-tRNA synthetases (PylRSs) have a propensity to charge their cognate tRNA orthogonally in both eukaryotic and eubacterial translation systems. They are also relatively non-specific (promiscuous) in their selectivity between unnatural amino acids of similar structure (see, e.g., Mukai, et al., Biochem. Biophys. Res. Comm. 371: 818-822, 2008). In one aspect of the invention, Methanosarcina PylRSs can be modified for use in charging a wide variety of unnatural amino acids, e.g., in both eukaryotic and prokaryotic translation systems.
[0063] Based on the available crystal structures of the Methanosarcina barkeri PylRS and the Methanosarcina mazei PylRS (see, e.g., Figures 1 and 4), modifications to amino acids at certain positions can be provided, e.g., by directed protein engineering and/or by screening of random mutations at the positions, to functionally accommodate various unnatural amino acid structures as aminoacylation substrates. The methods include selection of a desired unnatural amino acid, identification of expected favorable and unfavorable interactions with the Methanosarcina pylRS binding pocket amino acids, directed or random mutation of identified pocket amino acids, and screening of mutated pylRSs to identify those with highest charging activity with the desired unnatural amino acid. The methods can include shuttling mutated pylRSs to a eukaryotic translation system for incorporation of the unnatural amino acid into, e.g., a eukaryotically processed polypeptide.
[0064] In an aspect of the methods, the unnatural amino acid can be any, including, e.g., an epsilon-substituted lysine, a photocaged lysine, an ortho acyl-substituted phenylalanine, a meta acyl-substituted phenylalanine, a para acyl-substituted phenylalanine, an ortho azido-substituted phenylalanine, a meta azido-substituted phenylalanine, a para azido-substituted phenylalanine, an ortho borono-substituted phenylalanine, a meta borono-substituted phenylalanine, a para borono-substituted phenylalanine, a para benzoyl-substituted phenylalanine, an ortho azido-substituted phenylalanine, a meta azido-substituted phenylalanine, an ortho nitro-substituted phenylalanine, a meta nitro-substituted phenylalanine, a para nitro-substituted phenylalanine, an ortho nitro-substituted tyrosine, a meta nitro-substituted tyrosine, a para nitro-substituted tyrosine; alkynyl aryl amino acids, aliphatic amino acids, alpha hydroxy acid substituted amino acids, beta diketo containing amino acids and alkoxyamine containing amino acids. In some embodiments, the unnatural amino acids are other than Boc-lysine, acetyllysine or Nε-benzyloxycarbonyl-L-lysine. Methanosarcina synthetases can be modified to accommodate these unnatural amino acids in the binding pocket and charge a cognate tRNA, e.g., for ultimate incorporation into a polypeptide.
[0065] Pyrrolysine is essentially an analog of lysine substituted at the epsilon nitrogen with a ketopyrrole group. The main selective interaction between the Methanosarcina pylRS and amino acids for charging is at the binding pocket amino acids at positions corresponding to, e.g., amino acid positions Leu305, Tyr306, Leu309, Cys348, Tyr384 and Gly419 of the Methanosarcina mazei pylRS (SEQ ID NO: 2). The amino acids are arranged along the pocket in the order Gly419, Cys348, Leu309, Leu305, Tyr306 and Tyr384 (see, Figure 4). Tyr306, Leu309 and Leu305 are positioned at the far end of the pocket, projecting to specifically interact hydrophobically and by hydrogen bonding with the end of the pyrrolysine side chain, while the other amino acids line the sides of the pocket. The pocket can be modified by intelligent substitution of the identified amino acids in consideration of, e.g., steric interactions, hydrophobic interactions and hydrogen bonding interactions that would promote functional interactions with unnatural amino acids of choice. For example, where the desired unnatural amino acid would extend deeper into the binding pocket with a hydrophobic group than does pyrrolysine, shorter hydrophobic amino acids can be selected for positions 306 and 309, e.g., to avoid steric hindrance while maintaining hydrophobic interactions useful in specifically binding the amino acid in the pocket. Assuming the desired unnatural amino acid side chain is shorter than pyrrolysine with a hydrophilic (e.g., carboxyl or hydroxyl) terminus, larger amino acids can be provided, e.g., at positions 305, 306 and/or 309; typically with at least one of these amino acids including a hydrogen bonding group. For example, O-nitrobenzyl-oxycarbonyl-N-L-lysine (ONBK) is longer than pyrrolysine and includes a larger terminal aryl group. The PylRS can accommodate ONBK when the end-pocket Tyr306 residue is substituted with hydrophobic, less sterically hindering, Ile or Met. Further, ONBK is better accommodated when steric hindrance is reduced with a shorter Ala at positions 309 and 348.
[0066] It is an aspect of the invention that residues of the synthetase structure outside the identified binding pocket are optionally mutated in a conservative fashion. That is, amino acids making up alpha helices, beta sheets, hairpins and other structural features, e.g., in the polypeptide secondary structures, can be maintained, even with substantial conservative changes to the amino acid sequence. For example, amino acids that generally support certain identified secondary structures can be substituted for other amino acids that support the identified structure while maintaining the orientation of amino acids in the binding pocket and retaining functional activity of the RS. With careful selection of conservative variations in non-binding pocket structures, useful charging function can be maintained with more than 50% substitutions with amino acids that conserve the alpha helix and beta sheet structures of the RS, as shown in Figure 1. The RSs can retain functional activity and selectivity with intelligent conservative amino acid substitutions from about 1% to 50% or more, about 2% to 40%, about 4% to 30%, about 5% to 25%, about 10% to 20% or about 15% of the total amino acids in the RS structure.
[0067] In embodiments where the unnatural amino acids are substituted Tyr and Phe analogs, it is typically preferred that, e.g., the amino acid residues 348 and/or 384 at the opening of the binding pocket be relatively short hydrophobic residues (e.g., Ala, Pro or Val) to allow space and to hydrophobically interact with the aryl residue of the amino acid. Where the substituted Tyr and Phe analogs include a short hydrophilic group, amino acids at positions 309, 306 and/or 305 can be selected to extend relatively far to interact with the hydrophilic group (using, e.g., Lys, Arg, His, Glu and/or Gln). Alternately, in the case where the Tyr and Phe substitution includes a short hydrophobic group, amino acids at positions 309, 306 and/or 305 can extend relatively far to interact with the hydrophobic group (using, e.g., Met, Trp, Ile and/or Leu). Where the unnatural amino acid has a relatively long side chain (e.g., 5, 6, 7, 8 or more carbon bond equivalents) with one or more hydrophilic groups, positions 306, 305 and/or 309 can have an amino acid with a relatively short hydrophilic side chain (e.g., Ser, Asp, Thr, and/or Cys). Where the unnatural amino acid has a relatively long hydrophobic side chain, positions 305, 306 and/or 309 can have an amino acid with a relatively short hydrophobic side chain (e.g., Ala, Val, Pro and/or Ile) to accommodate the longer unnatural amino acid in the binding pocket. In a like manner, depending on the bulk and relative polarity of chemical groups near the base of the unnatural amino acid side chain, RS amino acids at positions 348, 384 and/or 419 can be substituted, as appropriate. For example, where the unnatural amino acid is relatively bulky and hydrophobic at the side chain base or mid portion, the amino acids at RS positions 348 and 384 can include, e.g., Ala, Val, Ile, Leu and/or Pro; it would logically be advisable to retain Gly419 as Gly. Where the unnatural amino acid is relatively bulky and hydrophilic at the side chain base or mid portion, the amino acids at positions 348 and 384 can include, e.g., Asp, Glu, Ser, Thr, Asn, Gln and/or Cys. Where the unnatural amino acid is not bulky but is hydrophobic at the side chain base or mid portion, the amino acids at positions 348, 384 and 419 can include, e.g., Met, Trp, Phe and/or Tyr. Where the unnatural amino acid is not bulky and is hydrophilic at the side chain base or mid portion, the amino acids at positions 348, 384 and 419 can include, e.g., Arg, Lys, Gln, Glu and/or Asn.
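The end-pocket guidance in paragraph [0067] for positions 305, 306 and/or 309 can be summarized as a lookup keyed by two properties of the unnatural amino acid side chain. The following Python sketch is only an illustrative restatement of that guidance: the key names ("short", "long", "hydrophilic", "hydrophobic") and the function name are hypothetical, residue names are normalized to standard three-letter codes, and the result is a list of candidates for screening rather than a prediction of activity.

```python
# Candidate residues for end-pocket positions 305, 306 and/or 309, keyed by
# (reach of the unnatural side chain into the pocket, polarity of its end group).
# A short side chain calls for longer pocket residues that extend to meet it;
# a long side chain calls for shorter pocket residues that make room for it.
END_POCKET_CANDIDATES = {
    ("short", "hydrophilic"): ["Lys", "Arg", "His", "Glu", "Gln"],
    ("short", "hydrophobic"): ["Met", "Trp", "Ile", "Leu"],
    ("long", "hydrophilic"): ["Ser", "Asp", "Thr", "Cys"],
    ("long", "hydrophobic"): ["Ala", "Val", "Pro", "Ile"],
}

def suggest_end_pocket_residues(reach, polarity):
    """Return candidate residues for positions 305/306/309 for screening."""
    try:
        return list(END_POCKET_CANDIDATES[(reach, polarity)])
    except KeyError:
        raise ValueError(f"unsupported combination: {reach!r}, {polarity!r}") from None

# A long, hydrophobic side chain suggests short hydrophobic pocket residues.
print(suggest_end_pocket_residues("long", "hydrophobic"))
```

A comparable table could be written for positions 348, 384 and 419, keyed by bulk and polarity near the base of the side chain.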
[0068] Depending on the choice of unnatural amino acid intended for charging, the Methanosarcina PylRS, or other Archaea, can be adapted, e.g., as discussed above, by site-directed mutation and/or by random mutation at the identified amino acid positions of the RS. A selection of alternate mutants can be generated and screened by positive and/or negative selection, as described herein, to enrich for mutant pylRSs with the most enhanced activity and/or specificity with the unnatural amino acid. In many cases it is not necessary to select for mutant RSs that preferentially aminoacylate one unnatural amino acid over another unnatural amino acid because such cross-over can typically be avoided by controlling what unnatural amino acid is made available in the translation system. However, selection schemes to select one unnatural amino acid over another can be employed, e.g., where multiple orthogonal elements are used to incorporate more than one unnatural amino acid site-specifically into a polypeptide of interest.
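As a rough indication of the scale of a random-mutation library, the number of distinct protein variants grows exponentially with the number of positions randomized. The following Python sketch assumes, purely for illustration, that each of the six identified pocket positions is allowed to take any of the 20 natural amino acids; the names used are hypothetical, and an actual library could randomize fewer positions or restrict the residues permitted at each one.

```python
NATURAL_AMINO_ACIDS = 20

# Binding pocket positions identified for the Methanosarcina mazei pylRS.
POCKET_POSITIONS = (305, 306, 309, 348, 384, 419)

def library_size(positions, choices_per_position=NATURAL_AMINO_ACIDS):
    """Number of distinct protein variants if each position varies independently."""
    return choices_per_position ** len(positions)

print(library_size(POCKET_POSITIONS))      # 64000000, i.e. 20**6
print(library_size(POCKET_POSITIONS[:3]))  # 8000, i.e. 20**3
```

This arithmetic is one reason positive and/or negative selection, rather than one-by-one testing, is used to enrich for active mutants.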
[0069] The methods of incorporating the unnatural amino acids typically include mutation or substitution of a nucleic acid encoding a Methanosarcina pylRS to express putative appropriate amino acids on translation. In many embodiments, two or more optional mutant Methanosarcina pylRSs are associated in a library. Often libraries of mutant pylRSs are screened to identify library members having the most desired characteristics of activity and selectivity. The screening is typically more convenient to carry out in bacteria than eukaryotic cells. Once promising RSs are identified, it is often desirable to shuttle the encoding nucleic acid from the screening or testing system (e.g., a bacterial cell) to a eukaryotic translation system (e.g., a mammalian cell). To shuttle an RS (and/or tRNA) from a first cell to a second cell, one can simply transform the second cell with nucleic acids from the first cell. This can be readily accomplished if the encoding nucleic acid of interest is in an expression system compatible with both cells. Optionally, shuttling is accomplished by excising the nucleic acid encoding the RS from, e.g., a bacterial expression plasmid, functionally ligating it into a eukaryotic expression plasmid and transforming a compatible eukaryotic cell.
SHUTTLING TRANSLATION SYSTEM COMPONENTS
[0070] Shuttling generally involves transfer of encoding nucleic acids from one host cell type to another. In the present invention, translation system components (e.g., O-RSs and O-tRNAs) functioning in a first system, such as a eubacterial system, can be transferred to function in a second system, such as a eukaryotic system. The nucleic acid encoding the translation system component typically includes a transcription promoter recognized by transcription components of the host system. Shuttling typically requires the nucleic acid encoding the component to be physically transferred from one host to the other. In some embodiments, the encoding nucleic acid present in the first host system is recombined into a new expression vector before transfer into the new host system. In other embodiments, the encoding nucleic acid can be transferred directly to the second host, e.g., in a shuttle vector.
[0071] An expression vector is usually a plasmid that is used to introduce a specific gene into a target cell and express it there. Once the expression vector is inside the cell, the protein that is encoded by the gene is produced by the cellular transcription and translation machinery. The plasmid is typically engineered to contain an active transcription promoter that facilitates production of mRNA complementary to the gene. Many expression vectors are designed to function properly only in a particular suitable cell type (e.g., a bacterium, a plant cell, an animal cell, a yeast, etc.).
[0072] In some embodiments, a selected O-RS is present in a bacterial cell, encoded, e.g., in a plasmid having promoters specific to the bacterial cell. For example, the promoters for bacterial expression can include bacteriophage promoters, native bacterial gene promoters or engineered promoters. Plasmids for expression of orthogonal translation system components in eubacteria can include, e.g.: the lambda PL promoter, the tac promoter/operator (Ptac), the E. coli arabinose operon promoter (Pbad), E. coli glutamine promoter (glnS), a mutant glnS promoter (glnSø), and/or the like.
[0073] In eukaryotic host systems, expression can be regulated by eukaryotic promoters, regulators, enhancers, and the like. Plasmids for expression of orthogonal components can include a TATA sequence, upstream activator sequence (UAS), initiator sequences (INR), downstream promoter elements (DPE), and/or the like. Examples of promoters include the tetracycline inducible CMV promoter (CMV promoter with TetO sites), the EF1-α promoter, and the β-actin promoter. A commonly used eukaryotic expression plasmid includes the constitutive CMV promoter.
[0074] Methods to select orthogonal components (e.g., O-RS/O-tRNA pairs) in bacterial cells can include transformation of the bacteria with plasmids encoding the components so that the components can be readily cloned, screened and identified in the bacterial environment. Once the desired components are identified, the encoding plasmids can be harvested by conventional methods. The nucleic acid sequence encoding the desired component can be cut from the bacterial expression plasmid, e.g., using specific endonucleases, and purified (e.g., by chromatography or electrophoresis). The purified nucleic acid encoding the component can then be ligated into an expression vector adapted for expression in a desired eukaryotic host cell. Finally, the eukaryotic expression vector containing the nucleic acid encoding the component can be used to transform the eukaryotic host cell for expression of the component in the new host cell. The orthogonal component has thereby been shuttled from the bacterial host to the eukaryotic host.
[0075] The process of shuttling an orthogonal translation system component from one cell type to another can be simplified by use of a shuttle vector. A shuttle vector is a vector (usually a plasmid) constructed so that it can propagate in two different host species. Therefore, DNA inserted into a shuttle vector can be tested or manipulated in two different cell types. The main advantage of these vectors is that they can be manipulated in bacteria and then used in a system that is more difficult or slower to use (e.g., yeast or mammalian cells) without intervening DNA recombination steps. Shuttle vectors include plasmids that can propagate in eukaryotes and prokaryotes (e.g., both Saccharomyces cerevisiae and E. coli). For example, certain adenovirus shuttle vectors can function to express a polypeptide in both E. coli and mammals.
[0076] Yeast shuttle vectors can be useful in the present methods. Yeast shuttle vectors typically have components that allow for replication and selection in both E. coli cells and yeast cells. The E. coli component of a yeast shuttle vector includes an origin of replication and a selectable marker, e.g., antibiotic resistance, beta lactamase. The yeast component of a yeast shuttle vector includes an autonomously replicating sequence (ARS), a yeast centromere (CEN), and a yeast selectable marker (e.g., URA3, a gene that encodes an enzyme for uracil synthesis, Lodish et al. 2007).
[0077] Other vectors suitable for replication and integration in prokaryotes, eukaryotes, or preferably both are known in the art. For example, see, Giliman & Smith, Gene 8:81 (1979); Roberts, et al., Nature, 328:731 (1987); Schneider, B., et al., Protein Expr. Purif. 6435:10 (1995); Berger, Sambrook, Ausubel. A catalogue of Bacteria and Bacteriophages useful for cloning is provided, e.g., by the ATCC, e.g., The ATCC Catalogue of Bacteria and Bacteriophage (1992) Gherna et al. (eds) published by the ATCC. Additional basic procedures for sequencing, cloning and other aspects of molecular biology and underlying theoretical considerations are also found in Watson et al. (1992) Recombinant DNA, Second Edition Scientific American Books, NY.
PHOTOCAGED AMINO ACIDS
[0078] Unnatural amino acids charged and incorporated in translation systems of the invention can be photocaged amino acids. "Caging" groups of amino acids can inhibit or conceal (e.g., by disrupting bonds which would usually stabilize interactions with target molecules, by changing the hydrophobicity or ionic character of a particular side chain, or by steric hindrance, etc.) biological activity in a molecule, e.g., a peptide comprising such amino acid. See, e.g., Adams, et al., Annu. Rev. Physiol., 1993, 55:755-784. Thus, for example, if a photocaged amino acid is incorporated within a peptide having biological activity, illumination can alter the amino acid, thereby changing the biological activity of the peptide. See U.S. application 11/233,508 - Adding Photoregulated Amino Acids to the Genetic Code.
[0079] A number of methods are optionally applicable to create a photocaged amino acid. Thus, for example, a photocaged amino acid can be created by protecting its α-amino group with compounds such as BOC (butyloxycarbonyl), and protecting the α-carboxyl group with compounds such as a t-butyl ester. Such protection can be followed by reaction of the amino acid side chain with a photolabile caging group such as 2-nitrobenzyl, in a reactive form such as 2-nitrobenzylchloroformate, α-carboxyl 2-nitrobenzyl bromide methyl ester, or 2-nitrobenzyl diazoethane. After the photolabile cage group is added, the protecting groups can be removed via standard procedures. See, e.g., USPN 5,998,580.
[0080] As another example, lysine residues can be caged using 2-nitrobenzylchloroformate to derivatize the ε-lysine amino group, thus eliminating the positive charge. Alternatively, lysine can be caged by introducing a negative charge into a peptide (which has such lysine) by use of an α-carboxy 2-nitrobenzyloxycarbonyl caging group. Additionally, phosphoserine and phosphothreonine can be caged by treatment of the phosphoamino acid or the phosphopeptide with 1-(2-nitrophenyl)diazoethane. See, e.g., Walker et al., Meth Enzymol. 172:288-301, 1989. A number of other amino acids are also easily amenable to standard caging chemistry, for example serine, threonine, histidine, glutamine, asparagine, aspartic acid and glutamic acid. See, e.g., Wilcox et al., J. Org. Chem. 55: 1585-1589, 1990. Again, it will be appreciated that recitation of particular photoregulated amino acids (and/or those capable of being converted to photoregulated forms) should not necessarily be taken as limiting.
[0081] A large number of caging groups, and a number of reactive compounds used to covalently attach such groups to other molecules such as amino acids, are well known in the art. Examples of photoregulating and/or photocaging groups include, but are not limited to: nitroindolines; N-acyl-7-nitroindolines; phenacyls; hydroxyphenacyl; brominated 7-hydroxycoumarin-4-ylmethyls (e.g., Bhc); benzoin esters; dimethoxybenzoin; meta-phenols; 2-nitrobenzyl; 1-(4,5-dimethoxy-2-nitrophenyl)ethyl (DMNPE); 4,5-dimethoxy-2-nitrobenzyl (DMNB); alpha-carboxy-2-nitrobenzyl (CNB); 1-(2-nitrophenyl)ethyl (NPE); 5-carboxymethoxy-2-nitrobenzyl (CMNB); ((5-carboxymethoxy-2-nitrobenzyl)oxy)carbonyl; ((4,5-dimethoxy-2-nitrobenzyl)oxy)carbonyl; desoxybenzoinyl; and the like. See, e.g., USPN 5,635,608 to Haugland and Gee (June 3, 1997) entitled "α-carboxy caged compounds"; Neuro 19, 465 (1997); J Physiol 508.3, 801 (1998); Proc Natl Acad Sci USA 1988 Sep, 85(17):6571-5; J Biol Chem 1997 Feb 14, 272(7):4172-8; Neuron 20, 619-624, 1998; Nature Genetics, vol. 28:2001:317-325; Nature, vol. 392,1998:936-941; Pan, P., and Bayley, H. "Caged cysteine and thiophosphoryl peptides" FEBS Letters 405:81-85 (1997); Pettit et al. (1997) "Chemical two-photon uncaging: a novel approach to mapping glutamate receptors" Neuron 19:465-471; Furuta et al. (1999) "Brominated 7-hydroxycoumarin-4-ylmethyls: novel photolabile protecting groups with biologically useful cross-sections for two photon photolysis" Proc. Natl. Acad. Sci. 96(4): 1193-1200; Zou et al. "Catalytic subunit of protein kinase A caged at the activating phosphothreonine" J. Amer. Chem. Soc. (2002) 124:8220-8229; Zou et al. "Caged Thiophosphotyrosine Peptides" Angew. Chem. Int. Ed. (2001) 40:3049-3051; BioProbes Handbook, 2002 from Molecular Probes, Inc.; and Handbook of Fluorescent Probes and Research Products, Ninth Edition or Web Edition, from Molecular Probes, Inc, as well as the references herein. Many compounds, kits, etc. for use in caging various molecules are commercially available, e.g., from Molecular Probes, Inc. (www.molecularprobes.com). Additional references are found in, e.g., Merrifield, Science 232:341 (1986) and Corrie, J. E. T. and Trentham, D. R. (1993) In: Biological Applications of Photochemical Switches, ed., Morrison, H., John Wiley and Sons, Inc. New York, pp. 243-305. Examples of suitable photosensitive caging groups include, but are not limited to, 2-nitrobenzyl, benzoin esters, N-acyl-7-nitroindolines, meta-phenols, and phenacyls.
[0082] In some embodiments, a photocaging group can optionally comprise a first binding moiety, which can bind to a second binding moiety. For example, a commercially available caged phosphoramidite [1-N-(4,4'-Dimethoxytrityl)-5-(6-biotinamidocaproamidomethyl)-1-(2-nitrophenyl)-ethyl]-2-cyanoethyl-(N,N-diisopropyl)-phosphoramidite (PC Biotin Phosphoramidite, from Glen Research Corp., www.glenres.com) comprises a photolabile group and a biotin (the first binding moiety). A second binding moiety, e.g., streptavidin or avidin, can thus be bound to the caging group, increasing its bulkiness and its effectiveness at caging. In certain embodiments, a caged component comprises two or more caging groups each comprising a first binding moiety, and the second binding moiety can bind two or more first binding moieties simultaneously. For example, the caged component can comprise at least two biotinylated caging groups; binding of streptavidin to multiple biotin moieties on multiple caged component molecules links the caged components into a large network. Cleavage of the photolabile group attaching the biotin to the component results in dissociation of the network.
[0083] "Traditional" methods of creating caged polypeptides (including, e.g., peptide substrates and proteins such as antibodies or transcription factors) include, e.g., reacting a polypeptide with a caging compound or incorporating a caged amino acid during synthesis of a polypeptide. See, e.g., USPN 5,998,580 to Fay et al. (December 7, 1999) entitled "Photosensitive caged macromolecules"; Kossel et al. (2001) PNAS 98: 14702-14707; Trends Plant Sci (1999) 4:330-334; PNAS (1998) 95:1568-1573; J. Am. Chem. Soc. (2002) 124:8220-8229; Pharmacology & Therapeutics (2001) 91:85-92; and Angew. Chem. Int. Ed. Engl. (2001) 40:3049-3051. A photolabile polypeptide linker (e.g., for connecting a protein transduction domain and a sensor, or the like) can, for example, comprise a photolabile amino acid such as that described in USPN 5,998,580.
[0084] Irradiation with light can, e.g., release a side chain residue of an amino acid that is important for activity of the peptide comprising such amino acid. Additionally, in some embodiments, uncaged amino acids can cleave the peptide backbone of the peptide comprising the amino acid and can thus, e.g., open a cyclic peptide to a linear peptide with different biological properties, etc.
[0085] Activation of a caged peptide can be done through destruction of a photosensitive caging group on a photoregulated amino acid by any standard method known to those skilled in the art. For example, a photosensitive amino acid can be uncaged or activated by exposure to a suitable conventional light source, such as lasers (e.g., emitting in the UV range or infrared range). Those of skill in the art will be aware of and familiar with a number of additional lasers of appropriate wavelengths and energies as well as appropriate application protocols (e.g., exposure duration, etc.) that are applicable to use with photoregulated amino acids such as those utilized herein. Release of photoregulated caged amino acids allows control of the peptides that comprise such amino acids. Such control can be both in terms of location and in terms of time. For example, focused laser exposure can uncage amino acids in one location, while not uncaging amino acids in other locations.
[0086] The compositions and methods herein can be utilized in a number of aspects. For example, photocaged amino acids (e.g., in peptides) can deliver therapeutic compositions to discrete locations of a body since the release or activation/deactivation/etc. of the photocaged amino acid can be localized through targeted light exposure, etc. It will also be appreciated that the methods, structures, and compositions of the invention are applicable to incorporation/use of photocaged natural amino acids (e.g., ones with photocaging moieties attached/associated with them, thus rendering them "unnatural" amino acids). See, e.g., application PCT/US2005/034002 - Adding Photoregulated Amino Acids to the Genetic Code.
NUCLEIC ACID AND POLYPEPTIDE SEQUENCES AND VARIANTS
[0087] As described herein, the invention provides for polynucleotide sequences encoding, e.g., O-tRNAs and O-RSs, and polypeptide amino acid sequences, e.g., O-RSs, and, e.g., compositions, systems and methods comprising said polynucleotide or polypeptide sequences. Examples of said sequences, e.g., O-tRNA and O-RS amino acid and nucleotide sequences are disclosed herein (see the sequence listing). However, one of skill in the art will appreciate that the invention is not limited to those sequences disclosed herein, e.g., in the Examples and sequence listing. One of skill will appreciate that the invention also provides many related sequences with the functions described herein, e.g., polynucleotides and polypeptides encoding conservative variants of an O-RS disclosed herein.
[0088] As used herein, the term "conservative variant," in the context of a translation component, refers to a translation component, e.g., a conservative variant O-tRNA or a conservative variant O-RS, that performs functionally like the base component it resembles, e.g., an O-tRNA or O-RS, while having variations in the sequence as compared to a reference O-tRNA or O-RS. For example, an O-RS and a conservative variant of that O-RS will both aminoacylate a cognate O-tRNA with the same unnatural amino acid. In this example, the O-RS and the conservative variant O-RS do not have the same amino acid sequences. The conservative variant can have, e.g., one variation, two variations, three variations, four variations, or five or more variations in sequence, as long as the conservative variant is still complementary to, e.g., functions with, the cognate corresponding O-tRNA or O-RS.
[0089] Regarding enzymatically active proteins, one of skill in the art knows that in many cases, one amino acid can be exchanged for another with similar properties with retention of significant enzymatic activity. One of skill knows amino acids outside the active enzymatic site physical structures (e.g., that retain the orientation of amino acids in the active site) can be exchanged with other amino acids known in the art for cooperating in stabilization of the physical structures. For example, those of skill know that different amino acid sequences have different propensities for forming α-helical structure. Methionine, alanine, leucine, uncharged glutamate, and lysine ("MALEK" in the amino acid 1-letter codes) all have especially high helix-forming propensities, whereas proline, glycine and negatively charged aspartate have poor helix-forming propensities. Proline tends to break or kink helices because it cannot donate an amide hydrogen bond (having no amide hydrogen), and because its side chain interferes sterically; its ring structure also restricts its backbone φ dihedral angle to the vicinity of -70°, which is less common in α-helices. However, proline is often seen as the first residue of a helix, presumably due to its structural rigidity. At the other extreme, glycine also tends to disrupt helices because its high conformational flexibility makes it entropically expensive to adopt the relatively constrained α-helical structure. It would be a conservative variation, and one of skill would expect continued enzymatic activity, e.g., to exchange an alanine for a leucine in an alpha helix segment of an enzyme structure, or vice versa. See the general structure of the Methanosarcina mazei pylRS in Figure 1. Regarding beta sheet structures found in active proteins, large aromatic residues (Tyr, Phe and Trp) and β-branched amino acids (Thr, Val, Ile) are favored to be found in β strands in the middle of β sheets. Interestingly, different types of residues (such as Pro) are likely to be found in the edge strands in β sheets, presumably to avoid the "edge-to-edge" association between proteins that might lead to aggregation. One of skill in the protein engineering arts would expect a high proportion of success in retaining activity, with a minimum of experimentation, in conservative variant modifications of enzymes, such as synthetases, particularly outside of enzyme active sites. Experimentation would be further minimized given additional structural information, such as X-ray crystallography data from the enzyme or from an enzyme of homologous structure and function.
[0090] In some embodiments, a conservative variant O-RS comprises one or more conservative amino acid substitutions compared to the O-RS from which it was derived. In some embodiments, a conservative variant O-RS comprises one or more conservative amino acid substitutions compared to the O-RS from which it was derived, and furthermore, retains O-RS biological activity; for example, a conservative variant O-RS that retains at least 10% of the biological activity of the parent O-RS molecule from which it was derived, or alternatively, at least 20%, at least 30%, or at least 40%. In some preferred embodiments, the conservative variant O-RS retains at least 50% of the biological activity of the parent O-RS molecule from which it was derived. The conservative amino acid substitutions of a conservative variant O-RS can occur in any domain of the O-RS, including the amino acid binding pocket.
[0091] Conservative substitution tables providing functionally similar amino acids are well known in the art, where one amino acid residue is substituted for another amino acid residue having similar chemical properties (e.g., aromatic side chains or positively charged side chains), and therefore does not substantially change the functional properties of the polypeptide molecule. The following sets forth example groups that contain natural amino acids of like chemical properties, where a substitution within a group is a "conservative substitution".
TABLE A Conservative Amino Acid Substitutions
Figure imgf000038_0001
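As a minimal sketch of how a conservative substitution table can be applied, the following Python snippet tests whether two residues fall in the same group. Table A is not reproduced in this text, so the grouping below is an assumption: a commonly used classification by side-chain chemistry, not the table itself. The names `CONSERVATIVE_GROUPS`, `group_of` and `is_conservative` are hypothetical.

```python
# Assumed grouping by side-chain chemistry (illustrative only; Table A is
# not reproduced in this text, so these groups are NOT copied from it).
CONSERVATIVE_GROUPS = {
    "nonpolar_aliphatic": set("GAVLIP"),
    "polar_uncharged": set("STCMNQ"),
    "aromatic": set("FYW"),
    "positively_charged": set("KRH"),
    "negatively_charged": set("DE"),
}


def group_of(residue: str) -> str:
    """Return the name of the assumed group containing a 1-letter residue."""
    for name, members in CONSERVATIVE_GROUPS.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"not one of the 20 common amino acids: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True if the two residues differ but belong to the same assumed group."""
    if original.upper() == replacement.upper():
        return False  # identical residues are not a substitution
    return group_of(original) == group_of(replacement)


if __name__ == "__main__":
    print(is_conservative("A", "L"))  # alanine to leucine
    print(is_conservative("A", "D"))  # alanine to aspartate
```

Under this assumed grouping, the alanine-to-leucine exchange discussed in paragraph [0089] counts as conservative, while an alanine-to-aspartate exchange does not.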
[0092] Owing to the degeneracy of the genetic code, "silent substitutions", i.e., substitutions in a nucleic acid sequence which do not result in an alteration in an encoded polypeptide, are an implied feature of every nucleic acid sequence that encodes an amino acid sequence. Similarly, "conservative amino acid substitutions," where one or a limited number of amino acids in an amino acid sequence are substituted with different amino acids with highly similar properties, are also readily identified as being highly similar to a disclosed construct. Such conservative variations of each disclosed sequence are a feature of the present invention.
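The notion of a silent substitution can be illustrated with a short Python sketch. It assumes the standard genetic code and DNA coding-strand input; the function names `translate` and `is_silent` are hypothetical and are not part of the specification.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding-strand DNA sequence codon by codon ('*' = stop)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_silent(seq_a: str, seq_b: str) -> bool:
    """True if two different, equal-length sequences encode the same polypeptide."""
    if len(seq_a) != len(seq_b) or seq_a.upper() == seq_b.upper():
        return False
    return translate(seq_a) == translate(seq_b)


if __name__ == "__main__":
    print(translate("GCTGCC"))      # both codons encode alanine
    print(is_silent("GCT", "GCC"))  # third-position change, same amino acid
    print(is_silent("GCT", "GAT"))  # alanine to aspartate, not silent
```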
[0093] In one aspect, the invention can include O-tRNAs and O-RS that are "derived from" a parental molecule. As used herein, the term "derived from" refers to a component that is isolated from or made using a specified molecule or organism, or information from the specified molecule or organism. For example, a polypeptide that is derived from a second polypeptide can include an amino acid sequence that is identical or substantially similar to the amino acid sequence of the second polypeptide. In the case of polypeptides, the derived species can be obtained by, for example, naturally occurring mutagenesis, artificial directed mutagenesis or artificial random mutagenesis. The mutagenesis used to derive polypeptides can be intentionally directed or intentionally random, or a mixture of each. The mutagenesis of a polypeptide to create a different polypeptide derived from the first can be a random event, e.g., caused by polymerase infidelity, and the identification of the derived polypeptide can be made by appropriate screening methods, e.g., as discussed herein. Mutagenesis of a polypeptide typically entails manipulation of the polynucleotide that encodes the polypeptide.
Nucleic Acid Hybridization
[0094] Comparative hybridization can also be used to identify nucleic acids of the invention, including conservative variations of nucleic acids of the invention. In addition, target nucleic acids which hybridize to a nucleic acid represented in the sequence listing herein, under high, ultra-high and ultra-ultra high stringency conditions, are an aspect of the invention where the nucleic acids encode mutations corresponding to: a Met or Ile residue at a position corresponding to position 306, an Ala at position 309, an Ala at position 348, a Phe at position 384, or a combination thereof, with amino acid position numbering corresponding to amino acid position numbering of the wild-type pyrrolysyl-tRNA synthetase.
[0095] Examples of such nucleic acids include those with one or a few silent or conservative nucleic acid substitutions as compared to a given nucleic acid sequence of the sequence listing, e.g., which encode, e.g.: a Met or Ile residue at a position corresponding to position 306, an Ala at position 309, an Ala at position 348, a Phe at position 384, or a combination thereof, wherein amino acid position numbering corresponds to amino acid position numbering of the wild-type pyrrolysyl-tRNA synthetase.
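The residue criteria recited in paragraphs [0094] and [0095] can be checked mechanically. The sketch below is illustrative only: it assumes the input protein sequence is already numbered like the wild-type pyrrolysyl-tRNA synthetase (no alignment step is shown), and the names `RECITED_RESIDUES` and `matching_positions` are hypothetical. The test sequence is a synthetic placeholder, not a sequence from the sequence listing.

```python
# Positions (1-based, wild-type PylRS numbering) and the residues recited
# for them in paragraphs [0094]-[0095].
RECITED_RESIDUES = {
    306: {"M", "I"},  # Met or Ile
    309: {"A"},       # Ala
    348: {"A"},       # Ala
    384: {"F"},       # Phe
}


def matching_positions(protein_seq: str) -> list[int]:
    """Return the recited positions whose residue matches, in ascending order.

    Assumes protein_seq uses the same numbering as the wild-type synthetase.
    """
    seq = protein_seq.upper()
    hits = []
    for position, allowed in sorted(RECITED_RESIDUES.items()):
        if position <= len(seq) and seq[position - 1] in allowed:
            hits.append(position)
    return hits


if __name__ == "__main__":
    # Synthetic placeholder sequence: all glycine except two recited residues.
    residues = ["G"] * 400
    residues[306 - 1] = "M"
    residues[384 - 1] = "F"
    print(matching_positions("".join(residues)))
```

Because the paragraphs allow "a combination thereof", the function reports every matching position instead of requiring all four.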
[0096] A test nucleic acid is said to specifically hybridize to a probe nucleic acid when it hybridizes at least 50% as well to the probe as to the perfectly matched complementary target, i.e., with a signal to noise ratio at least half as high as hybridization of the probe to the target under conditions in which the perfectly matched probe binds to the perfectly matched complementary target with a signal to noise ratio that is at least about 5x-10x as high as that observed for hybridization to any of the unmatched target nucleic acids.
[0097] Nucleic acids "hybridize" when they associate, typically in solution. Nucleic acids hybridize due to a variety of well characterized physico-chemical forces, such as hydrogen bonding, solvent exclusion, base stacking and the like. An extensive guide to the hybridization of nucleic acids is found in Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology - Hybridization with Nucleic Acid Probes part I chapter 2, "Overview of principles of hybridization and the strategy of nucleic acid probe assays," (Elsevier, New York), as well as in Current Protocols in Molecular Biology, Ausubel, et al., eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., (supplemented through 2004) ("Ausubel"). Hames and Higgins (1995) Gene Probes 1, IRL Press at Oxford University Press, Oxford, England (Hames and Higgins 1) and Hames and Higgins (1995) Gene Probes 2, IRL Press at Oxford University Press, Oxford, England (Hames and Higgins 2) provide details on the synthesis, labeling, detection and quantification of DNA and RNA, including oligonucleotides.
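The specific-hybridization criterion of paragraph [0096] reduces to two numeric comparisons, sketched below in Python. This is a hedged illustration: the function name, its parameters and the choice of 5x as the lower bound of the "about 5x-10x" range are assumptions, and the signal to noise values would come from an actual hybridization assay.

```python
def specifically_hybridizes(test_sn: float, matched_sn: float,
                            min_matched_sn: float = 5.0) -> bool:
    """Apply the paragraph [0096] criterion.

    test_sn        -- signal to noise ratio of the test nucleic acid with the probe
    matched_sn     -- signal to noise ratio of the perfectly matched target,
                      measured against unmatched targets under the same conditions
    min_matched_sn -- assumed lower bound (5x) that makes the conditions valid
    """
    if matched_sn < min_matched_sn:
        raise ValueError(
            "conditions are not stringent enough: the perfectly matched target "
            f"gives {matched_sn}x, below the required {min_matched_sn}x"
        )
    # The test nucleic acid must hybridize at least 50% as well as the perfect match.
    return test_sn >= 0.5 * matched_sn


if __name__ == "__main__":
    print(specifically_hybridizes(test_sn=4.0, matched_sn=8.0))
    print(specifically_hybridizes(test_sn=3.0, matched_sn=8.0))
```

Raising an error when the matched target falls below the lower bound keeps an invalid assay from being read as a negative result.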
[0098] An example of stringent hybridization conditions for hybridization of complementary nucleic acids which have more than 100 complementary residues on a filter in a Southern or northern blot is 50% formalin with 1 mg of heparin at 42°C, with the hybridization being carried out overnight. An example of stringent wash conditions is a 0.2x SSC wash at 65°C for 15 minutes (see, Sambrook, supra for a description of SSC buffer). Often the high stringency wash is preceded by a low stringency wash to remove background probe signal. An example low stringency wash is 2x SSC at 40°C for 15 minutes. In general, a signal to noise ratio of 5x (or higher) than that observed for an unrelated probe in the particular hybridization assay indicates detection of a specific hybridization.
[0099] "Stringent hybridization wash conditions" in the context of nucleic acid hybridization experiments such as Southern and northern hybridizations are sequence dependent, and are different under different environmental parameters. An extensive guide to the hybridization of nucleic acids is found in Tijssen (1993), supra, and in Hames and Higgins, 1 and 2. Stringent hybridization and wash conditions can easily be determined empirically for any test nucleic acid. For example, in determining stringent hybridization and wash conditions, the hybridization and wash conditions are gradually increased (e.g., by increasing temperature, decreasing salt concentration, increasing detergent concentration and/or increasing the concentration of organic solvents such as formalin in the hybridization or wash), until a selected set of criteria are met. For example, in highly stringent hybridization and wash conditions, the hybridization and wash conditions are gradually increased until a probe binds to a perfectly matched complementary target with a signal to noise ratio that is at least 5x as high as that observed for hybridization of the probe to an unmatched target.
[0100] "Very stringent" conditions are selected to be equal to the thermal melting point (Tm) for a particular probe. The Tm is the temperature (under defined ionic strength and pH) at which 50% of the test sequence hybridizes to a perfectly matched probe. For the purposes of the present invention, generally, "highly stringent" hybridization and wash conditions are selected to be about 5° C lower than the Tm for the specific sequence at a defined ionic strength and pH.
[0101] "Ultra high-stringency" hybridization and wash conditions are those in which the stringency of hybridization and wash conditions are increased until the signal to noise ratio for binding of the probe to the perfectly matched complementary target nucleic acid is at least 10x as high as that observed for hybridization to any of the unmatched target nucleic acids. A target nucleic acid which hybridizes to a probe under such conditions, with a signal to noise ratio of at least ½ that of the perfectly matched complementary target nucleic acid is said to bind to the probe under ultra-high stringency conditions.
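The stringency levels defined so far differ only in the signal to noise threshold that the perfectly matched target must reach relative to unmatched targets: at least 5x for highly stringent conditions and at least 10x for ultra-high stringency. A minimal Python sketch of that classification follows; the function name, tier labels and example signal values are hypothetical.

```python
def stringency_tier(matched_signal: float, unmatched_signal: float) -> str:
    """Classify hybridization and wash conditions by the matched/unmatched ratio.

    Thresholds follow the text: at least 5x is highly stringent,
    at least 10x is ultra-high stringency.
    """
    if unmatched_signal <= 0:
        raise ValueError("unmatched signal must be positive")
    ratio = matched_signal / unmatched_signal
    if ratio >= 10:
        return "ultra-high"
    if ratio >= 5:
        return "high"
    return "below high stringency"


if __name__ == "__main__":
    # Hypothetical signal intensities in arbitrary units.
    for matched, unmatched in [(30, 10), (50, 10), (120, 10)]:
        print(matched, unmatched, stringency_tier(matched, unmatched))
```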
[0102] Similarly, even higher levels of stringency can be determined by gradually increasing the hybridization and/or wash conditions of the relevant hybridization assay. For example, those in which the stringency of hybridization and wash conditions are increased until the signal to noise ratio for binding of the probe to the perfectly matched complementary target nucleic acid is at least 10x, 20x, 50x, 100x, or 500x or more as high as that observed for hybridization to any of the unmatched target nucleic acids. A target nucleic acid which hybridizes to a probe under such conditions, with a signal to noise ratio of at least ½ that of the perfectly matched complementary target nucleic acid is said to bind to the probe under ultra-ultra-high stringency conditions.
ADDITIONAL DETAILS REGARDING TECHNIQUES
[0103] Additional useful references for producing RS and tRNA mutations, as well as a variety of recombinant and in vitro nucleic acid manipulation methods (including cloning, expression, PCR, and the like) include Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology volume 152 Academic Press, Inc., San Diego, CA (Berger); Kaufman, et al. (2003) Handbook of Molecular and Cellular Methods in Biology and Medicine Second Edition Ceske (ed) CRC Press (Kaufman); and The Nucleic Acid Protocols Handbook Ralph Rapley (ed) (2000) Cold Spring Harbor, Humana Press Inc (Rapley); Chen, et al. (ed) PCR Cloning Protocols, Second Edition (Methods in Molecular Biology, volume 192) Humana Press; and in Viljoen, et al. (2005) Molecular Diagnostic PCR Handbook Springer, ISBN 1402034032.
[0104] A variety of protein methods are known and can be used to isolate, detect, manipulate or otherwise handle a protein produced according to the invention, e.g., from recombinant cultures of cells expressing the recombinant unnatural amino acid-containing proteins of the invention. A variety of protein isolation and detection methods are well known in the art, including, e.g., those set forth in R. Scopes, Protein Purification, Springer-Verlag, N.Y. (1982); Deutscher, Methods in Enzymology Vol. 182: Guide to Protein Purification, Academic Press, Inc. N.Y. (1990); Sandana (1997) Bioseparation of Proteins, Academic Press, Inc.; Bollag, et al. (1996) Protein Methods, 2nd Edition Wiley-Liss, NY; Walker (1996) The Protein Protocols Handbook Humana Press, NJ; Harris and Angal (1990) Protein Purification Applications: A Practical Approach IRL Press at Oxford, Oxford, England; Harris and Angal Protein Purification Methods: A Practical Approach IRL Press at Oxford, Oxford, England; Scopes (1993) Protein Purification: Principles and Practice 3rd Edition Springer Verlag, NY; Janson and Ryden (1998) Protein Purification: Principles, High Resolution Methods and Applications, Second Edition Wiley-VCH, NY; and Walker (1998) Protein Protocols on CD-ROM Humana Press, NJ; and the references cited therein. Additional details regarding protein purification and detection methods can be found in Satinder Ahuja ed., Handbook of Bioseparations, Academic Press (2000). These available methods are optionally used in conjunction with the novel protein purification methods herein, e.g., scarless protein purification methods.
KITS
[0105] Kits are also a feature of the invention. For example, such kits can comprise components for using the composition herein, such as: a container to hold the kit components, instructional materials for practicing any method herein with the kit, or for producing a protein comprising one or more unnatural amino acids, a nucleic acid comprising a polynucleotide sequence encoding an O-tRNA, a nucleic acid comprising a polynucleotide encoding an O-RS, an O-RS, an unnatural amino acid, reagents for the post-translational modification of the unnatural amino acid (e.g., reagents for any one or more of the reactions described herein), a suitable strain of prokaryotic, e.g., bacterial (e.g., E. coli) or eukaryotic (e.g., yeast or mammalian) host cells for expression of the O-tRNA/O-RS and production of a target protein comprising, e.g., one or more of an epsilon-substituted lysine, a caged lysine, an O-nitrobenzyl-oxycarbonyl-N-L-lysine (ONBK), an ortho, meta or para-substituted phenylalanine or tyrosine, alkynyl aryl amino acids, aliphatic amino acids, alpha hydroxy acid substituted amino acids, beta diketo containing amino acids and/or alkoxyamine containing amino acids.
[0106] Alternately or additionally, the kits can contain a solid phase matrix for scarless purification, reagents for the covalent coupling of a polypeptide comprising an unnatural amino acid to the matrix, reagents for the oxidation or reduction of a redox unnatural amino acid and/or light sources for photolysis of caged amino acids in a polypeptide, e.g., to produce a natural amino acid.
PROTEINS OF INTEREST
[0107] The methods and compositions of the invention can be used to incorporate unnatural amino acids into any polypeptide of interest. Polypeptides modified to include unnatural amino acids incorporated by the present methods are considered an aspect of the invention. Such modified polypeptides can find use, e.g., in research and medicine.
[0108] Typically, the modified proteins of the invention, comprising unnatural amino acids are, e.g., at least 60%, at least 70%, at least 75%, at least 80%, at least 90%, at least 95%, or at least 99% or more identical to any available protein (e.g., a therapeutic protein, a diagnostic protein, an industrial enzyme, or portion thereof, and the like), and they comprise one or more unnatural amino acids. Examples of therapeutic, diagnostic, and other proteins that can be modified to comprise one or more photoregulated amino acids (e.g., such as o-nitrobenzyl cysteine and azobenzyl-Phe), O-Me-L-tyrosine, or α-aminocaprylic acid can be found in, but are not limited to, those in International Application Number PCT/US2004/011786, filed April 16, 2004, entitled "Expanding the Eukaryotic Genetic Code;" and, WO 2002/085923, entitled "IN VIVO INCORPORATION OF UNNATURAL AMINO ACIDS." Examples of therapeutic, diagnostic, and other proteins that can be modified to comprise one or more homoglutamines include, but are not limited to, e.g., Alpha-1 antitrypsin, Angiostatin, Antihemolytic factor, antibodies, Apolipoprotein, Apoprotein, Atrial natriuretic factor, Atrial natriuretic polypeptide, Atrial peptides, C-X-C chemokines (e.g., T39765, NAP-2, ENA-78, Gro-a, Gro-b, Gro-c, IP-10, GCP-2, NAP-4, SDF-1, PF4, MIG), Calcitonin, CC chemokines (e.g., Monocyte chemoattractant protein-1, Monocyte chemoattractant protein-2, Monocyte chemoattractant protein-3, Monocyte inflammatory protein-1 alpha, Monocyte inflammatory protein-1 beta, RANTES, I309, R83915, R91733, HCC1, T58847, D31065, T64262), CD40 ligand, C-kit Ligand, Collagen, Colony stimulating factor (CSF), Complement factor 5a, Complement inhibitor, Complement receptor 1, cytokines (e.g., epithelial Neutrophil Activating Peptide-78, GROα/MGSA, GROβ, GROγ, MIP-1α, MIP-1δ, MCP-1), Epidermal Growth Factor (EGF), Erythropoietin ("EPO"), Exfoliating toxins A and B, Factor IX, Factor VII, Factor VIII, Factor X, Fibroblast Growth Factor (FGF), Fibrinogen, Fibronectin, G-CSF, GM-CSF, Glucocerebrosidase, Gonadotropin, growth factors, Hedgehog proteins (e.g., Sonic, Indian, Desert), Hemoglobin, Hepatocyte Growth Factor (HGF), Hirudin, Human serum albumin, Insulin, Insulin-like Growth Factor (IGF), interferons (e.g., IFN-α, IFN-β, IFN-γ), interleukins (e.g., IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, etc.), Keratinocyte Growth Factor (KGF), Lactoferrin, leukemia inhibitory factor, Luciferase, Neurturin, Neutrophil inhibitory factor (NIF), oncostatin M, Osteogenic protein, Parathyroid hormone, PD-ECSF, PDGF, peptide hormones (e.g., Human Growth Hormone), Pleiotropin, Protein A, Protein G, Pyrogenic exotoxins A, B, and C, Relaxin, Renin, SCF, Soluble complement receptor I, Soluble I-CAM 1, Soluble interleukin receptors (IL-1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 12, 13, 14, 15), Soluble TNF receptor, Somatomedin, Somatostatin, Somatotropin, Streptokinase, Superantigens, i.e., Staphylococcal enterotoxins (SEA, SEB, SEC1, SEC2, SEC3, SED, SEE), Superoxide dismutase (SOD), Toxic shock syndrome toxin (TSST-1), Thymosin alpha 1, Tissue plasminogen activator, Tumor necrosis factor beta (TNF beta), Tumor necrosis factor receptor (TNFR), Tumor necrosis factor-alpha (TNF alpha), Vascular Endothelial Growth Factor (VEGF), Urokinase and many others.
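The percent-identity thresholds in paragraph [0108] can be made concrete with a small Python sketch. It is a simplification under stated assumptions: the two sequences are already aligned to equal length (gaps written as "-"), identity is counted over aligned columns, and the symbol "X" standing for an unnatural amino acid is a hypothetical convention. The function names are likewise hypothetical, and the example peptides are placeholders.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # a column of two gaps carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_identity(aligned_a: str, aligned_b: str, threshold: float = 60.0) -> bool:
    """True if identity is at least the threshold (60% is the lowest value recited)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    reference = "MKTAYIAKQR"  # placeholder peptide, not a real protein
    modified = "MKTAYIAKQX"   # 'X' marks an unnatural amino acid (assumed symbol)
    print(percent_identity(reference, modified))
    print(meets_identity(reference, modified, threshold=90.0))
```

A production workflow would first compute the alignment with a dedicated tool; only the counting step is shown here.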
DEFINITIONS
[0109] Unless otherwise defined herein or below in the remainder of the specification, all technical and scientific terms used herein have meanings commonly understood by those of ordinary skill in the art to which the present invention belongs.
[0110] Before describing the present invention in detail, it is to be understood that this invention is not limited to particular devices or biological systems, which can, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting. As used in this specification and the appended claims, the singular forms "a", "an" and "the" include plural referents unless the content clearly dictates otherwise. Thus, for example, reference to "an amino acid" can include a combination of two or more of the amino acid; reference to a "component" can include multiple copies of the component, and the like.
[0111] Although many methods and materials similar, modified, or equivalent to those described herein can be used in the practice of the present invention without undue experimentation, the preferred materials and methods are described herein. In describing and claiming the present invention, the following terminology will be used in accordance with the definitions set out below.
[0112] As used herein, one type of biomolecule can "encode" another. As used herein, the term "encode" refers to any process whereby the information in a polymeric macromolecule or sequence string is used to direct the production of a second molecule or sequence string that is different from the first molecule or sequence string. As used herein, the term can be used broadly, and can have a variety of applications. In some aspects, the term "encode" describes the process of semi-conservative DNA replication, where one strand of a double-stranded DNA molecule is used as a template to encode a newly synthesized complementary sister strand by a DNA-dependent DNA polymerase. In another aspect, the term "encode" refers to any process whereby the information in one molecule is used to direct the production of a second molecule that has a different chemical nature from the first molecule. For example, a DNA molecule can encode an RNA molecule, e.g., by the process of transcription catalyzed by a DNA-dependent RNA polymerase enzyme. Also, an RNA molecule can encode a polypeptide, as in the process of translation. When used to describe the process of translation, the term "encode" also extends to the triplet codon that encodes an amino acid or selector codons that encode a particular natural or unnatural amino acid. In some aspects, an RNA molecule can encode a DNA molecule, e.g., by the process of reverse transcription incorporating an RNA-dependent DNA polymerase. In another aspect, a DNA molecule can encode a polypeptide, where it is understood that "encode" as used in that case incorporates both the processes of transcription and translation.
[0113] As used herein, the term "orthogonal" refers to functional molecules, e.g., an orthogonal tRNA (O-tRNA) and/or an orthogonal aminoacyl-tRNA synthetase (O-RS), that function poorly or not at all with endogenous components of a cell, when compared to a corresponding molecule (tRNA or RS) that is endogenous to the cell or translation system. Orthogonal components are usefully provided as cognate components that function well with each other, e.g., an O-RS can be provided that efficiently aminoacylates a cognate O-tRNA in a cell, even though the O-tRNA functions poorly or not at all as a substrate for the endogenous RS of the cell, and the O-RS functions poorly or not at all with endogenous tRNAs of the cell. Various comparative efficiencies of the orthogonal and endogenous components can be evaluated. For example, an O-tRNA will typically display poor or nonexistent activity as a substrate, under typical physiological conditions, with endogenous RSs, e.g., the O-tRNA is less than 10% as efficient as a substrate as endogenous tRNAs for any endogenous RS, and will typically be less than 5%, and usually less than 1% as efficient a substrate. At the same time, the tRNA can be highly efficient as a substrate for the O-RS, e.g., at least 50%, and often 75%, 95%, or even 100% or more as efficient as an aminoacylation substrate as any endogenous tRNA is for its endogenous RS.
[0114] Orthogonal aminoacyl-tRNA synthetase: As used herein, an orthogonal aminoacyl-tRNA synthetase (O-RS) is an enzyme that preferentially aminoacylates an O-tRNA with an amino acid in a translation system of interest. An O-RS "selectively recognizes" an unnatural amino acid when it charges a cognate tRNA with the amino acid more efficiently than with any natural amino acid. The present invention includes O-RSs, e.g., derived from Methanosarcina species, that function to orthogonally charge an unnatural amino acid, optionally in a eubacterial translation system (e.g., an enterobacteria cell) or in a eukaryotic translation system (e.g., a mammalian cell), as desired. That is, e.g., the RS can be shuttled between systems (e.g., encoded as a nucleic acid sequence in a plasmid) and can function orthogonally in each system.
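The comparative efficiencies given in paragraph [0113] can be expressed as a simple two-part check. In this hedged Python sketch, efficiencies are fractions relative to an endogenous tRNA with its own endogenous RS; the function name and parameters are hypothetical, the default thresholds are the broadest values stated in the text (less than 10% with endogenous RSs, at least 50% with the O-RS), and the example values are invented for illustration.

```python
def is_orthogonal_trna(eff_with_endogenous_rs: float,
                       eff_with_o_rs: float,
                       max_endogenous: float = 0.10,
                       min_cognate: float = 0.50) -> bool:
    """Check an O-tRNA against the efficiency ranges of paragraph [0113].

    eff_with_endogenous_rs -- highest relative efficiency with any endogenous RS
    eff_with_o_rs          -- relative efficiency with the cognate O-RS
    Both are fractions (1.0 means as efficient as an endogenous tRNA/RS pair).
    """
    poor_with_host = eff_with_endogenous_rs < max_endogenous
    good_with_cognate = eff_with_o_rs >= min_cognate
    return poor_with_host and good_with_cognate


if __name__ == "__main__":
    # Hypothetical measurements, not data from the specification.
    print(is_orthogonal_trna(0.01, 0.75))
    print(is_orthogonal_trna(0.20, 0.90))
    # Stricter variant using the "usually less than 1%" figure from the text.
    print(is_orthogonal_trna(0.03, 0.75, max_endogenous=0.01))
```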
[0115] Orthogonal tRNA: As used herein, an orthogonal tRNA (O-tRNA) is a tRNA that is orthogonal to a translation system of interest. The O-tRNA can exist charged with, e.g., an unnatural amino acid, or can exist in an uncharged state. It is also to be understood that an O-tRNA is optionally charged (aminoacylated) by a cognate orthogonal aminoacyl-tRNA synthetase with an unnatural amino acid. It will be appreciated that the O-tRNA of the invention can be advantageously used to insert the unnatural amino acids into a growing polypeptide, during translation, in response to a selector codon. It will also be appreciated that the O-tRNAs of the invention can function orthogonally in more than one translation system, e.g., in both a eubacterial system (e.g., E. coli) and in a eukaryotic system (e.g., in a mammalian, insect or plant cell line).
[0116] Preferentially aminoacylates: As used herein in reference to orthogonal translation systems, an O-RS "preferentially aminoacylates" a cognate O-tRNA when the O-RS charges the O-tRNA with an amino acid (e.g., an unnatural amino acid) more efficiently than it charges any endogenous tRNA in an expression system (e.g., a system into which it has been shuttled). That is, when the O-tRNA and any given endogenous tRNA are present in a translation system in approximately equal molar ratios, the O-RS will charge the O-tRNA more frequently than it will charge the endogenous tRNA. Preferably, the relative ratio of O-tRNA charged by the O-RS to endogenous tRNA charged by the O-RS is high, preferably resulting in the O-RS charging the O-tRNA exclusively, or nearly exclusively, when the O-tRNA and endogenous tRNA are present in equal molar concentrations in the translation system. The relative ratio between O-tRNA and endogenous tRNA that is charged by the O-RS, when the O-tRNA and O-RS are present at equal molar concentrations, is greater than 1:1, preferably at least about 2:1, more preferably 5:1, still more preferably 10:1, yet more preferably 20:1, still more preferably 50:1, yet more preferably 75:1, still more preferably 95:1, 98:1, 99:1, 100:1, 500:1, 1,000:1, 5,000:1 or higher. Typically, charging of an endogenous tRNA by an O-RS is not detectable, e.g., by suppression assays. The O-RS "preferentially aminoacylates an O-tRNA with a lysine analog" when (a) the O-RS preferentially aminoacylates the O-tRNA compared to an endogenous tRNA, and (b) where that aminoacylation is specific for the lysine analog (e.g., epsilon-substituted) amino acid, as compared to aminoacylation of the O-tRNA by the O-RS with any natural amino acid. For example, when an unnatural amino acid (e.g., O-nitrobenzyl-oxycarbonyl-N-L-lysine (ONBK)) and natural amino acids are present in equal molar amounts in a translation system comprising a relevant O-RS of the sequence listing herein and a relevant O-tRNA of the sequence listing herein, the O-RS will load the O-tRNA with ONBK more frequently than with any natural amino acid. Preferably, the relative ratio of O-tRNA charged with ONBK to O-tRNA charged with the natural amino acid is high. More preferably, the O-RS charges the O-tRNA exclusively, or nearly exclusively, with ONBK or other relevant unnatural amino acid. The relative ratio between charging of the O-tRNA with the unnatural amino acid and charging of the O-tRNA with a natural amino acid, when both the natural and unnatural amino acid are present in the translation system in equal molar concentrations, is greater than 1:1, preferably at least about 2:1, more preferably 5:1, still more preferably 10:1, yet more preferably 20:1, still more preferably 50:1, yet more preferably 75:1, still more preferably 95:1, 98:1, 99:1, 100:1, 500:1, 1,000:1, 5,000:1 or higher.
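The charging ratios recited in paragraph [0116] amount to a single division followed by a threshold comparison. The following Python sketch illustrates this under assumptions: the charged amounts are measured with both substrates at equal molar concentrations, and the function names and example numbers are hypothetical.

```python
def charging_ratio(preferred_charged: float, other_charged: float) -> float:
    """Ratio of preferred charging (e.g., O-tRNA, or the unnatural amino acid)
    to the competing charging (e.g., endogenous tRNA, or a natural amino acid).
    """
    if preferred_charged < 0 or other_charged < 0:
        raise ValueError("charged amounts cannot be negative")
    if other_charged == 0:
        if preferred_charged == 0:
            raise ValueError("no charging detected for either substrate")
        return float("inf")  # competing charging not detectable
    return preferred_charged / other_charged


def preferentially_aminoacylates(preferred_charged: float, other_charged: float,
                                 min_ratio: float = 1.0) -> bool:
    """True if the ratio exceeds min_ratio (greater than 1:1 by default)."""
    return charging_ratio(preferred_charged, other_charged) > min_ratio


if __name__ == "__main__":
    # Hypothetical amounts in arbitrary units.
    print(charging_ratio(95, 5))
    print(preferentially_aminoacylates(95, 5, min_ratio=10.0))
    print(preferentially_aminoacylates(1, 1))
```

Returning infinity when the competing charging is undetectable mirrors the statement that charging of an endogenous tRNA by an O-RS is typically not detectable.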
[0117] Shuttle: As used herein, the term "shuttle" refers to transfer of a nucleic acid encoding a translation system component (e.g., an RS and/or tRNA) from one cell to another cell. In one aspect, the source and target cells are not from the same species, and typically include a eubacterial cell and a eukaryotic cell.
[0118] Selector codon: The term "selector codon" refers to codons recognized by the O-tRNA in the translation process and not recognized by an endogenous tRNA. The O-tRNA anticodon loop recognizes the selector codon on the mRNA and incorporates the amino acid with which it is charged, e.g., an unnatural amino acid, at this site in the polypeptide. Selector codons can include, e.g., nonsense codons, such as, stop codons, e.g., amber, ochre, and opal codons; four or more base codons; rare codons; noncoding codons; and codons derived from natural or unnatural base pairs and/or the like.
[0119] Suppression activity: As used herein, the term "suppression activity" refers, in general, to the ability of a tRNA, e.g., a suppressor tRNA, to allow translational read-through of a codon, e.g., a selector codon that is an amber codon or a 4-or-more base codon, that would otherwise result in the termination of translation or mistranslation, e.g., frame-shifting. Suppression activity of a suppressor tRNA can be expressed as a percentage of translational read-through activity observed compared to a second suppressor tRNA, or as compared to a control system, e.g., a control system lacking an O-RS.
[0120] Suppressor tRNA: A suppressor tRNA is a tRNA that alters the reading of a messenger RNA (mRNA) in a given translation system, typically by allowing the incorporation of an amino acid in response to a stop codon or 4 or more base codon (i.e., "read-through") during the translation of a polypeptide. In some aspects, a selector codon of the invention is a suppressor codon, e.g., a stop codon, e.g., an amber, ocher or opal codon, a four base codon, a rare codon, etc.
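The effect of a suppressor tRNA on an amber selector codon can be sketched as a toy translation routine. This is a simplified illustration under assumptions: the standard genetic code, DNA coding-strand input, complete suppression when the charged suppressor tRNA is present, and "X" as a hypothetical symbol for the unnatural amino acid. The function name and the example sequence are placeholders.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
AMBER = "TAG"  # amber stop codon, used here as the selector codon


def translate(coding_dna: str, suppress_amber: bool = False,
              unnatural_symbol: str = "X") -> str:
    """Translate until the first stop codon that is not suppressed.

    With suppress_amber=True, a charged amber suppressor tRNA is assumed to be
    present, so TAG is read through and decoded as the unnatural amino acid.
    """
    seq = coding_dna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon == AMBER and suppress_amber:
            protein.append(unnatural_symbol)  # read-through at the selector codon
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":
            break  # ordinary termination
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    gene = "ATGGCTTAGAAATAA"  # Met-Ala-amber-Lys-ochre, a placeholder sequence
    print(translate(gene))                       # truncated product
    print(translate(gene, suppress_amber=True))  # full-length product
```

Without suppression the amber codon terminates translation; with suppression the polypeptide continues to the next stop codon, which is the read-through behavior described in paragraphs [0119] and [0120].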
[0121] A therapeutic protein is a protein that can be administered to a patient to treat a disease or disorder.
[0122] Translation system: The term "translation system" refers to the components that incorporate an amino acid into a growing polypeptide chain (protein). Components of a translation system can include, e.g., ribosomes, tRNAs, synthetases, mRNA and the like. The O-tRNA and/or the O-RSs of the invention can be added to or be part of an in vitro or in vivo translation system, e.g., in a non-eukaryotic cell, e.g., a bacterium, such as E. coli, or in a eukaryotic cell, e.g., a yeast cell, a mammalian cell, a plant cell, an algae cell, a fungus cell, an insect cell, and/or the like.
[0123] Unnatural amino acid: As used herein, the term "unnatural amino acid" refers to any amino acid, modified amino acid, and/or amino acid analogue, that is not one of the 20 common naturally occurring amino acids. Further, herein neither selenocysteine nor pyrrolysine is considered an unnatural amino acid. For example, the unnatural amino acid O-nitrobenzyl-oxycarbonyl-N-L-lysine (ONBK - see Figure 3A) finds use with the invention.
[0124] Cognate: The term "cognate" refers to components that function together, e.g., an orthogonal tRNA and an orthogonal aminoacyl-tRNA synthetase. The components can also be referred to as being complementary.
[0125] Derived from: As used herein, the term "derived from" refers to a component that is isolated from or made using a specified molecule or organism, or information from the specified molecule or organism. A first nucleic acid or peptide sequence is derived from a second sequence, e.g., when the second sequence is changed by addition, deletion or substitution at sequence positions to create the first sequence.
[0126] Eukaryote: As used herein, the term "eukaryote" refers to organisms belonging to the phylogenetic domain Eucarya such as animals (e.g., mammals, insects, reptiles, birds, etc.), ciliates, plants (e.g., monocots, dicots, algae, etc.), fungi, yeasts, flagellates, microsporidia, protists, etc.
[0127] Non-eukaryote: As used herein, the term "non-eukaryote" refers to non-eukaryotic organisms. For example, a non-eukaryotic organism can belong to the Eubacteria (e.g., Escherichia coli, Thermus thermophilus, Bacillus stearothermophilus, etc.) phylogenetic domain, or the Archaea (e.g., Methanococcus jannaschii (Mj), Methanosarcina mazei (Mm), Methanobacterium thermoautotrophicum (Mt), Methanococcus maripaludis, Methanopyrus kandleri, Halobacterium such as Haloferax volcanii and Halobacterium species NRC-1, Archaeoglobus fulgidus (Af), Pyrococcus furiosus (Pf), Pyrococcus horikoshii (Ph), Pyrobaculum aerophilum, Pyrococcus abyssi, Sulfolobus solfataricus (Ss), Sulfolobus tokodaii, Aeropyrum pernix (Ap), Thermoplasma acidophilum, Thermoplasma volcanium, etc.) phylogenetic domain.
EXAMPLES
[0128] The following examples are offered to illustrate, but not to limit the claimed invention.
[0129] It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims.
Eubacteria-Eukaryote Shuttle
[0130] An E. coli-mammalian shuttle system has been developed to genetically encode unnatural amino acids in mammalian cells using aminoacyl-tRNA synthetases (RSs) evolved in E. coli. A pyrrolysyl-tRNA synthetase (PylRS) mutant was evolved in E. coli that selectively aminoacylates a cognate nonsense suppressor tRNA with a photocaged lysine derivative. Transfer of this orthogonal tRNA-RS pair into mammalian cells made possible the selective incorporation of this unnatural amino acid into proteins.
[0131] It has not previously been possible to export the large number of aminoacyl-tRNA synthetases evolved in E. coli to mammalian cells because the M. jannaschii-derived aminoacyl-tRNA synthetases typically used in E. coli are not orthogonal in mammalian cells.
[0132] To overcome this limitation, we turned to a pyrrolysyl-tRNA synthetase (PylRS) and its cognate tRNA^Pyl_CUA, which naturally incorporates pyrrolysine (Pyl) (Figure 3a) in response to the amber nonsense codon in the archaeon Methanosarcina mazei. Previous work has shown that tRNA^Pyl_CUA is not recognized by endogenous RSs in E. coli and mammalian cells as a result of its unique structural features. See, C. Polycarpo, et al., Proc. Natl. Acad. Sci. USA 2004, 101, 12450; and, K. Nozawa, et al., Nature 2008, advance online publication. Moreover, the Yokoyama group has recently taken advantage of the known promiscuity of the natural Methanosarcina mazei PylRS (MmPylRS) to incorporate Pyl analogues into proteins in mammalian cells. See, C. R. Polycarpo, et al., Febs Letters 2006, 580, 6695. In addition, Chin and coworkers used a mutant Methanosarcina barkeri PylRS (MbPylRS), a close homologue of MmPylRS, to incorporate acetyl lysine in E. coli, demonstrating that the specificity of the PylRS can be altered by directed evolution methods. See, T. Mukai, et al., Biochem. Biophys. Res. Commun. 2008, 371, 818; and, Neumann, S. Y., et al., Nat. Chem. Biol. 2008, 4, 232. Here we evolved new PylRS specificities in E. coli, a host in which large libraries of mutant aminoacyl-tRNA synthetases can be generated and selected, and subsequently shuttled the evolved RSs directly into mammalian cells. We have demonstrated here the utility of such an E. coli-mammalian "shuttle" system by genetically encoding a photocaged lysine in both bacterial and mammalian cells.
[0133] First we confirmed the orthogonality of the M. mazei pyrrolysyl-tRNA synthetase (MmPylRS; SEQ ID NO: 2)-tRNA^Pyl_CUA (SEQ ID NO: 7) pair in both E. coli and mammalian cells, which is the key requirement for establishing a robust system for shuttling tRNA/RS pairs between these two hosts. Northern blot analysis detected aminoacylated tRNAs only when E. coli cells harbored plasmids encoding both tRNA^Pyl_CUA and MmPylRS and were supplemented with 5 mM of the Pyl analogue Nε-cyclopentyloxycarbonyl-L-lysine (Cyc) (Figure 3a). Aminoacylation of tRNA^Pyl_CUA does not occur in the absence of Cyc or of the plasmid encoding MmPylRS, indicating that tRNA^Pyl_CUA is not a substrate for endogenous RSs in E. coli and that MmPylRS does not recognize endogenous amino acids in E. coli. Western blot analysis of samples from CHO cells shows that a C-terminal His-tagged retinol binding protein 4 (RBP4) with an amber mutation at Phe 36 (RBP4/Phe36TAG) was only expressed in the presence of MmPylRS, tRNA^Pyl_CUA and 5 mM Cyc (Figure 4b). Again, these results verify that tRNA^Pyl_CUA is not a substrate for endogenous RSs and that MmPylRS does not recognize endogenous amino acids in mammalian cells. These data confirm that MmPylRS-tRNA^Pyl_CUA works as a functional amber suppressor pair in both E. coli and mammalian cells with the substrate Cyc, which is consistent with previous results obtained for MbPylRS and MmPylRS, respectively.
[0134] We next created a library of MmPylRS active-site mutants in order to alter the amino acid specificity of this enzyme. On the basis of the crystal structure of MmPylRS bound to Pyl, five residues (Leu305, Tyr306, Leu309, Cys348 and Tyr384) surrounding the methyl pyrroline ring of Pyl were randomized to expand the Pyl recognition pocket (Figure 4c). Overlap extension polymerase chain reaction was performed with synthetic oligonucleotide primers in which the randomized residues were encoded as NNK (N = A, C, T or G; K = T or G) to generate a library with a diversity of 3 x 10^7, the quality of which was validated by sequencing.
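The library size quoted above follows from the NNK scheme. The sketch below is illustrative only and is not part of the original disclosure: it enumerates the NNK codons with the standard genetic code and counts the DNA-level and protein-level diversity for five randomized positions. The function and variable names are hypothetical.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def nnk_codons():
    """Return every codon matching NNK (N = A/C/G/T, K = G/T)."""
    return ["".join(p) for p in product("ACGT", "ACGT", "GT")]


def library_summary(n_sites):
    """Summarize an NNK library with n_sites randomized codons (hypothetical helper)."""
    codons = nnk_codons()
    amino_acids = {CODON_TABLE[c] for c in codons} - {"*"}
    stops = [c for c in codons if CODON_TABLE[c] == "*"]
    return {
        "codons_per_site": len(codons),
        "amino_acids_per_site": len(amino_acids),
        "stop_codons": stops,
        "dna_diversity": len(codons) ** n_sites,
        "protein_diversity": len(amino_acids) ** n_sites,
    }


summary = library_summary(5)
print(summary)
```

Under these assumptions, each NNK position offers 32 codons that cover all 20 common amino acids plus the amber stop codon, so five positions give 32^5 (about 3.4 x 10^7) DNA sequences.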
[0135] We then evolved a mutant MmPylRS-tRNA^Pyl_CUA pair specific for the N6-photocaged lysine analogue, o-nitrobenzyl-oxycarbonyl-Nε-L-lysine (ONBK, Figure 3b, Scheme 1) in E. coli. Photocaging, in which a molecule is derivatized with a photoremovable inactivating group, is widely used as a non-invasive tool for spatial and temporal control of a variety of complex cellular processes. We have previously genetically encoded photocaged Ser, Cys and Tyr residues (see, A. Deiters, et al., Angew. Chem. Int. Edn. Engl. 2006, 45, 2728; E. A. Lemke, et al., Nat. Chem. Biol. 2007, 3, 769; and N. Wu, et al., J. Am. Chem. Soc. 2004, 126, 14306). A photocaged lysine would, for example, allow photoactivation of ubiquitination, methylation and acetylation in mammalian cells, and as a result could be used to activate protein degradation or modulate transcription. In order to identify MmPylRS mutants that can selectively aminoacylate tRNA^Pyl_CUA with ONBK, a series of positive and negative selections was performed as previously described. See, L. Wang, et al., Science 2001, 292, 498; and, J. M. Xie, et al., Angew. Chem. Int. Edn. Engl. 2007, 46, 9239. In brief, the positive selection was based on resistance to chloramphenicol (Cm), which is conferred by the suppression of an amber mutation at a permissive site (Asp112) in the type I chloramphenicol acetyltransferase gene (CAT112TAG) in the presence of the unnatural amino acid and the RS mutant. The negative selection used the toxic barnase gene with amber mutations at permissive sites (Gln2TAG, Asp44TAG and Gly65TAG) and was carried out in the absence of unnatural amino acid.
Single MmPylRS mutant clones that passed through the selection (three positive and two negative rounds) and survived on Cm only in the presence of ONBK were obtained: 60% of sequenced clones converged (see Table B, below) on a unique sequence (referred to as NBK-1; nucleic acid SEQ ID NO: 3; polypeptide SEQ ID NO: 4) with the mutations Y306M, L309A, C348A and Y384F, while the other 40% converged on a second, related sequence (referred to as NBK-2; nucleic acid SEQ ID NO: 5; polypeptide SEQ ID NO: 6) with the mutations Y306I, L309A, C348A and Y384F. E. coli co-transformed with either NBK-1 or NBK-2 and CAT112TAG exhibited a significant difference in growth on Cm in the presence and absence of 1 mM ONBK (Figure 3a), suggesting that these evolved MmPylRS-tRNA^Pyl_CUA pairs are selective for ONBK relative to endogenous host amino acids. NBK-1 exhibited enhanced amber suppression relative to NBK-2 and thus the NBK-1-tRNA^Pyl_CUA pair was used for further studies.
[0136] Table B - Clones converging on sequences encoding NBK-1 and NBK-2
Seq:      L305(CTG)  Y306(TAC)  L309(CTG)  C348(TGC)  Y384(TAC)  WT
clone-1   TTG        ATG        GCT        GCT        TTT        NBK-1
clone-2   TTG        ATG        GCT        GCG        TTT        NBK-1
clone-3   TTG        ATG        GCT        GCT        TTT        NBK-1
clone-4   TTG        ATG        GCG        GCG        TTT        NBK-1
clone-5   CTG        ATG        GCG        GCT        TTT        NBK-1
clone-6   CTG        ATG        GCG        GCG        TTT        NBK-1
clone-7   CTG        ATG        GCG        GCT        TTT        NBK-1
clone-8   CTT        ATG        GCG        GCG        TTT        NBK-1
clone-9   CTT        ATG        GCG        GCT        TTT        NBK-1
clone-10  CTT        ATG        GCG        GCG        TTT        NBK-1
clone-11  CTT        ATG        GCG        GCT        TTT        NBK-1
clone-12  CTT        ATG        GCG        GCG        TTT        NBK-1
clone-13  TTG        ATT        GCT        GCG        TTT        NBK-2
clone-14  TTG        ATT        GCT        GCT        TTT        NBK-2
clone-15  TTG        ATT        GCG        GCG        TTT        NBK-2
clone-16  CTG        ATT        GCG        GCT        TTT        NBK-2
clone-17  CTT        ATT        GCG        GCG        TTT        NBK-2
clone-18  CTT        ATT        GCG        GCT        TTT        NBK-2
clone-19  CTT        ATT        GCG        GCG        TTT        NBK-2
clone-20  CTT        ATT        GCG        GCG        TTT        NBK-2
Encoded residues:
NBK-1     L          M          A          A          F
NBK-2     L          I          A          A          F
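The convergence reported in Table B can be checked by translating the listed codons. The following is a minimal sketch, not part of the original disclosure, that translates four representative clones transcribed from the table and lists their substitutions relative to the wild-type residues. The helper names are hypothetical.

```python
# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Randomized positions, named by wild-type residue and position number.
POSITIONS = ("L305", "Y306", "L309", "C348", "Y384")

# Representative rows transcribed from Table B.
CLONES = {
    "clone-1": "TTG ATG GCT GCT TTT",
    "clone-8": "CTT ATG GCG GCG TTT",
    "clone-13": "TTG ATT GCT GCG TTT",
    "clone-20": "CTT ATT GCG GCG TTT",
}


def translate(codons):
    """Translate a space-separated codon string into one-letter residues."""
    return "".join(CODON_TABLE[c] for c in codons.split())


def mutations(codons):
    """List substitutions relative to the wild-type residue at each position."""
    return [
        f"{pos}{aa}"
        for pos, aa in zip(POSITIONS, translate(codons))
        if aa != pos[0]
    ]


for name, codons in CLONES.items():
    print(name, translate(codons), mutations(codons))
```

Synonymous codons (TTG, CTG and CTT for Leu; GCT and GCG for Ala) collapse to the same residue, which is why clones with different DNA sequences are grouped under NBK-1 or NBK-2.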
[0137] To determine the efficiency and fidelity of ONBK incorporation into proteins in E. coli, an amber mutation (TAG) was introduced for Asp149 in a C-terminal His-tagged variant of GFP (GFP149TAG). A vector pSup-NBK-1 was constructed to encode the NBK-1-tRNA^Pyl_CUA pair in which a single copy of the tRNA^Pyl_CUA gene is expressed under control of the proK promoter and terminator, and the NBK-1 gene is expressed under control of a mutant glnS (glnS') promoter. This plasmid was co-transformed into BL21-DE3 E. coli cells with a plasmid carrying the GFP149TAG gene (pBAD-GFP149TAG). Protein expression was carried out in LB medium supplemented with or without 1 mM ONBK, followed by purification with Ni2+-NTA chromatography. SDS-PAGE analysis and subsequent Coomassie staining showed that full-length protein was only produced in the presence of ONBK (Figure 4b). Expression for 8 h at 30°C with NBK-1 and tRNA^Pyl_CUA yielded approximately 10 mg L^-1 protein in medium containing 1 mM ONBK. As a control, a plasmid containing the wild-type MmPylRS-tRNA^Pyl_CUA was employed for expression of GFP149TAG in the presence of 1 mM Cyc (Figure 4b) and the protein yield was less than 1 mg L^-1.
[0138] Electrospray ionization mass spectrometry (ESI-MS) of purified GFP protein with ONBK at position 149 revealed two peaks (27,915 Da and 27,782 Da) corresponding to GFP protein containing the intact ONBK residue with and without the N-terminal Met (Figure 5c). This result confirms the high specificity of the NBK-1 mutant aminoacyl-tRNA synthetase for ONBK relative to endogenous amino acids, and for tRNA^Pyl_CUA relative to endogenous tRNAs. At longer induction times, we also observed intact protein mass peaks for lysine at position 149. We suspect that the ONBK photocaging group is partially removed by degradative enzymes in E. coli (vide infra).
[0139] Next, the evolved NBK-1-tRNA^Pyl_CUA pair from E. coli was shuttled into mammalian cells. A vector pCMV-NBK-1 was constructed containing the NBK-1 gene under control of a non-regulated CMV promoter, and a single tRNA^Pyl_CUA gene under control of a human U6 promoter. Amber suppression was monitored using an enhanced GFP (EGFP) with an amber mutation at the permissive residue 37 (EGFP37TAG). The plasmid pCMV-NBK-1 was co-transfected with a plasmid encoding EGFP37TAG into HEK293 cells using an optimized transfection condition. After induction, the cells were allowed to grow in the presence and absence of 1 mM ONBK for 36 h before being visualized under a fluorescence microscope (Figure 6a). Full-length EGFP was detected only in cells supplemented with 1 mM ONBK; no EGFP was observed otherwise.
[0140] The incorporation of ONBK in mammalian cells in response to an amber codon was further confirmed by mass spectrometry. After purification by Ni2+-NTA chromatography, 35 μg EGFP protein was isolated from 4 x 10^7 CHO cells and analyzed by ESI-MS (Figure 6b). Only one peak was observed, corresponding to the full-length protein containing the intact ONBK residue (EGFP37ONBK), indicating that no loss of the photocaging group occurred in mammalian cells (presumably due to the absence of degradative enzymes). In addition, this result shows that the mutant MmPylRS does not load endogenous tRNAs with ONBK to give heterogeneous protein product. To verify the presence of the intact photocaged Lys, purified EGFP37ONBK was irradiated with 365 nm light for 20 min. ESI-MS analysis of this protein sample revealed one peak with a change in mass corresponding to the loss of one o-nitrobenzyl-oxycarbonyl group (Figure 6c), indicating that EGFP37ONBK was cleanly converted to EGFP37K with near-visible light.
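The size of the mass change expected on photolysis can be estimated from the elemental composition of the caging group. The sketch below is an illustration rather than the authors' analysis: it assumes that uncaging converts the Nε-carbamate back to a free amine, a net loss of C8H5NO4, and it uses rounded average atomic masses. The example input mass is arbitrary.

```python
# Rounded average atomic masses in Da (assumed values for this sketch).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

# Net composition lost when Lys-NH-C(=O)O-CH2-C6H4-NO2 becomes Lys-NH2.
LOST_ON_UNCAGING = {"C": 8, "H": 5, "N": 1, "O": 4}


def composition_mass(composition):
    """Average mass of an elemental composition given as {element: count}."""
    return sum(ATOMIC_MASS[el] * n for el, n in composition.items())


def expected_mass_after_photolysis(caged_mass_da, n_caged_residues=1):
    """Predict the intact protein mass after removal of each caging group."""
    return caged_mass_da - n_caged_residues * composition_mass(LOST_ON_UNCAGING)


shift = composition_mass(LOST_ON_UNCAGING)
print(f"expected shift per caged lysine: {shift:.2f} Da")

# Arbitrary illustrative input, not a value taken from the figures.
print(f"{expected_mass_after_photolysis(30000.0):.2f}")
```

Under these assumptions a protein carrying one ONBK residue should lose roughly 179 Da on complete uncaging.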
[0141] In summary, we have developed a straightforward strategy for the expansion of the mammalian amino acid repertoire with the PylRS-tRNA^Pyl_CUA pair from archaea. We demonstrated the utility of this approach by genetically encoding a photocaged lysine, which is likely to be a useful probe of protein function in bacterial and mammalian cells. Moreover, the x-ray crystal structure of the PylRS active site suggests that this "shuttle" system can also be used for the directed evolution of additional RSs specific for other unnatural amino acids for use in both prokaryotic and eukaryotic organisms.
Shuttle Method Details
[0142] Synthesis of o-nitrobenzyl-oxycarbonyl-Nε-L-lysine (ONBK).
[0143] (1) Synthesis of tert-butoxycarbonyl-Nε-o-nitrobenzyloxycarbonyl-L-lysine (Boc-Lys(ONB)-OH, 2): 1.16 g o-nitrobenzyl chloroformate (5.43 mmol) dissolved in 2 ml acetone was added to 50 ml of a 10 mM sodium bicarbonate buffer, pH 8. 1.47 g Boc-L-Lys-OH 1 was added to the solution, which was then stirred in the dark. After 4 days, the reaction was acidified with 1 N HCl and then extracted three times with 50 ml CH2Cl2. The combined organic fractions were dried over Na2SO4, and then concentrated on a rotary evaporator. The crude product 2 was purified on a silica column. The column was washed with three column volumes of 50% hexanes in ethyl acetate, and then the product was eluted in two column volumes of ethyl acetate with 2% acetic acid. 1.92 g of the product was recovered as a straw-colored oil (83% yield). See, V. K. Rusiecki, S. A. Warne, Bioorg. Med. Chem. Lett. 1993, 3, 707.
[0144] 1H NMR (400 MHz, CDCl3): δ = 1.37 (s, 9H), 1.53-1.46 (m, 2H), 1.65 (m, 1H), 1.78 (m, 1H), 3.14 (d, J = 5.72 Hz, 2H), 4.04 (s, 1H), 4.23 (s, 1H), 5.32 (s, 1H), 5.44 (d, 2H), 7.41 (t, J = 7.58 Hz, 2H), 7.54 (d, J = 7.27 Hz, 1H), 7.58 (d, J = 7.37 Hz, 1H), 8.02 (d, J = 8.10 Hz, 1H). Mass calculated for C19H26N3O8, (M-H)- 424.4 Da; found 424.1 Da by liquid chromatography-mass spectrometry operating in negative ionization mode.
[0145] (2) Synthesis of o-nitrobenzyloxycarbonyl-Nε-L-lysine (ONBK, 3): 1 g Boc-Lys(ONB)-OH 2 was dissolved in 1 ml 1,4-dioxane. 10 ml 4 N HCl in dioxane was added, and after 2 h the dioxane and HCl were removed in vacuo. The straw-colored precipitate was triturated 3 times with 10 ml ethyl ether to yield 0.74 g of ONBK 3 (97%).
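The reported yields and the calculated (M-H)- mass can be reproduced from the molecular formulas. The sketch below is a back-of-the-envelope check, not part of the original disclosure. It assumes that the chloroformate (5.43 mmol as stated) is the limiting reagent in step 1, that the step 2 yield is calculated for ONBK as the free amino acid (C14H19N3O6) rather than a hydrochloride salt, and that rounded average atomic masses are adequate.

```python
import re

# Rounded average atomic masses in Da (assumed values for this sketch).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def formula_mass(formula):
    """Average molecular mass for a simple formula such as 'C19H27N3O8'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[element] * (int(count) if count else 1)
    return total


def percent_yield(product_g, product_mw, limiting_mmol):
    """Isolated yield relative to the limiting reagent, in percent."""
    product_mmol = product_g / product_mw * 1000.0
    return 100.0 * product_mmol / limiting_mmol


BOC_LYS_ONB = formula_mass("C19H27N3O8")  # compound 2, neutral molecule
ONBK = formula_mass("C14H19N3O6")         # compound 3, assumed free amino acid

# Step 1: 1.92 g of compound 2 from 5.43 mmol chloroformate.
step1 = percent_yield(1.92, BOC_LYS_ONB, 5.43)

# Step 2: 0.74 g of compound 3 from 1 g of compound 2.
step2 = percent_yield(0.74, ONBK, 1.0 / BOC_LYS_ONB * 1000.0)

# Deprotonated ion of compound 2, approximated by removing one hydrogen.
m_minus_h = BOC_LYS_ONB - ATOMIC_MASS["H"]

print(f"step 1: {step1:.0f}%  step 2: {step2:.0f}%  (M-H)-: {m_minus_h:.1f} Da")
```

With these assumptions the two steps come out at about 83% and 97%, and the (M-H)- mass at about 424.4 Da, in line with the values given in the text.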
[0146] 1H NMR (400 MHz, DMSO-d6): δ = 1.28-1.33 (m, 1H), 1.41 (~s, 3H), 1.69-1.73 (m, 2H), 2.99 (d, J = 5.8 Hz, 2H), 3.34 (s, 2H), 5.35 (s, 2H), 7.46 (t, J = 5.6 Hz, 1H), 7.58-7.65 (m, 2H), 7.8 (td, J = 7.7, 1.3 Hz, 1H), 8.11 (dd, J = 8.1, 1.0 Hz, 1H). High Resolution MS found 326.1352 Da.
[0147] Protein Purification.
[0148] His-tagged proteins produced from E. coli and mammalian cell cultures were purified with Ni-NTA columns (Qiagen) following the instructions provided. In brief, cell lysate was dialyzed against and equilibrated with PBS buffer before loading onto the Ni-NTA column. Columns were washed with 10 bed volumes of wash buffer (50 mM NaH2PO4, pH 8, 300 mM NaCl, and 25 mM imidazole). Proteins were eluted with 50 mM NaH2PO4, pH 8, containing 250 mM imidazole.
[0149] Library construction.
[0150] The MmPylRS active site library was constructed by overlap extension polymerase chain reaction (PCR) using synthetic degenerate oligonucleotide primers to introduce mutations. The Methanosarcina mazei PylRS gene was codon optimized for E. coli and synthesized by DNA2.0. This gene served as the template for standard PCR reactions. The gene was amplified as four fragments using four pairs of primers in which the codons for the intended mutations were replaced by NNK (N = A, C, G or T; K = G or T). The following pairs of primers were used: (i) MmPylRS_N-term_F (5'-GTG TAC ACA TAT GGA TAA AAA GCC TCT GA-3') and MmPylRS_L305Y306L309/NNK_R (5'-GGC AGG GCA CGG TCC AGT TTA CGM NNA TAG TTM NNM NNG TTC GG-3'); (ii) MmPylRS_L309_F (5'-AAA CTG GAC CGT GCC CTG CC-3') and MmPylRS_C348/NNK_R (5'-TTT CAC GCG TGC AAC CGC TAC CCA TCT GMN NGA AGT TC-3'); (iii) MmPylRS_C348_F (5'-TAG CGG TTG CAC GCG TGA AA-3') and MmPylRS_Y384/NNK_R (5'-TGC ATA ACA TCC AGC GTA TCG CCM NNC ACC ATA CAG CTG TC-3'); (iv) MmPylRS_Y384_F (5'-GAT ACG CTG GAT GTT ATG CA-3') and MmPylRS_C-term_R (5'-GTA GGC ACT GCA GTT ACA GGT TAG TAG AAA T-3'). Overlap extension PCR was employed to assemble these PCR fragments, and multiple rounds of PCR were conducted with the combinations of primers listed above. The intact MmPylRS gene was generated by this strategy with the desired mutation sites substituted by NNK, so that all 20 common amino acids were encoded.
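The reverse primers above carry the degenerate positions as MNN (M = A or C), which is the reverse complement of NNK on the coding strand. The following sketch, which is illustrative and not part of the original disclosure, reverse-complements the first reverse primer using IUPAC ambiguity codes to show where the NNK codons fall in the sense orientation. The function name is hypothetical.

```python
# IUPAC nucleotide complements, including the degenerate codes used here.
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "M": "K", "K": "M",  # M = A/C, K = G/T
    "R": "Y", "Y": "R",  # R = A/G, Y = C/T
    "S": "S", "W": "W",  # S = C/G, W = A/T
    "N": "N",
}


def reverse_complement(seq):
    """Reverse-complement a DNA sequence; spaces are ignored."""
    cleaned = seq.replace(" ", "").upper()
    return "".join(COMPLEMENT[base] for base in reversed(cleaned))


# MmPylRS_L305Y306L309/NNK_R, as listed in the text.
PRIMER_R = "GGC AGG GCA CGG TCC AGT TTA CGM NNA TAG TTM NNM NNG TTC GG"

sense = reverse_complement(PRIMER_R)
print(sense)
```

Read in the sense direction, the primer region is CCGAAC NNK NNK AAC TAT NNK CGT..., so the three NNK codons replace the codons for Leu305, Tyr306 and Leu309 while the codons for residues 307 and 308 (AAC TAT) are left unchanged.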
[0151] The final products were amplified and digested with NdeI and PstI. The selection vector pBK-MJYRS (see, L. Wang, A. Brock, B. Herberich, P. G. Schultz, Science 2001, 292, 498) was digested with the same two restriction enzymes, and both digested products were purified with a gel-extraction kit (Qiagen). DNA ligation was conducted at 16°C for 16 h followed by yeast-tRNA-enhanced ethanol precipitation. Finally, the precipitated ligation products were transformed into electrocompetent DH10B cells. The theoretical diversity for this five-codon randomized library is 3.4 x 10^7; the final transformation yielded ~1.0 x 10^8 mutants, as estimated by counting CFU on agar plates spread with serially diluted cultures of the transformed bacteria, which provided 3x coverage of the theoretical diversity. The quality of the library was validated by sequencing 20 individual clones, which revealed that there was no codon bias in the randomized region.
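The relationship between transformant count and library coverage can be made concrete with a simple sampling model. The sketch below is an illustration under stated assumptions, not the authors' calculation: it treats every DNA variant as equally likely and uses the Poisson approximation for the expected fraction of variants sampled at least once. The function names are hypothetical.

```python
import math


def theoretical_diversity(n_codons, codons_per_site=32):
    """Number of distinct DNA sequences for n_codons NNK positions."""
    return codons_per_site ** n_codons


def expected_completeness(transformants, diversity):
    """Expected fraction of variants present at least once (Poisson model)."""
    return 1.0 - math.exp(-transformants / diversity)


def transformants_needed(completeness, diversity):
    """Transformants required to reach a target expected completeness."""
    return -diversity * math.log(1.0 - completeness)


diversity = theoretical_diversity(5)
coverage = 1.0e8 / diversity
completeness = expected_completeness(1.0e8, diversity)

print(f"diversity:    {diversity:.2e}")
print(f"coverage:     {coverage:.1f}x")
print(f"completeness: {completeness:.1%}")
print(f"for 99%:      {transformants_needed(0.99, diversity):.2e} transformants")
```

Under this model, threefold oversampling is expected to capture roughly 95% of the possible sequences, which is why larger transformant numbers are preferred when near-complete coverage is required.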
[0152] Selection procedure for evolving mutant pyrrolysyl-tRNA synthetases.
[0153] tRNA^Pyl_CUA was inserted into pRep and pNEG vectors to construct pRep-tRNA^Pyl_CUA for positive selection and pNEG-tRNA^Pyl_CUA for negative selection (see, L. Wang, J. Xie, P. G. Schultz, Annu. Rev. Biophys. Biomol. Struct. 2006, 35, 225). For the positive selection, the pBK-PylRS plasmids encoding the MmPylRS active site library were transformed into E. coli DH10B competent cells harboring pRep-tRNA^Pyl_CUA to yield a library greater than 1 x 10^9 cfu, ensuring complete coverage. Cells were plated on LB agar plates containing 25 μg ml^-1 tetracycline (Tet), 50 μg ml^-1 kanamycin (Kan), 68 μg ml^-1 chloramphenicol (Cm) and 1 mM ONBK. After incubation at 37°C for 48 h, colonies on the plates were pooled, total plasmids were isolated and pBK-PylRS plasmids were separated by agarose gel electrophoresis and recovered using a gel-extraction kit. The extracted pBK-PylRS plasmids from the positive selection were transformed into DH10B harboring pNEG-tRNA^Pyl_CUA for the negative selection. After electroporation, the cells were allowed to recover for 2 h at 37°C before being plated on LB agar plates containing 50 μg ml^-1 Kan, 100 μg ml^-1 ampicillin (Amp) and 0.2% arabinose. The plates were incubated for 12 h at 37°C, at which point the cells were pooled and the pBK-PylRS plasmids were extracted. Five alternating rounds of positive and negative selection finally yielded MmPylRS variants that survive the selection by acylating the cognate tRNA^Pyl_CUA with ONBK but not with any endogenous amino acid. Single colonies after the third positive round of selection were picked and the plasmids were isolated for sequencing. To further validate the selection results, the newly extracted pBK-MmPylRS plasmids were transformed into DH10B competent cells containing pRep-tRNA^Pyl_CUA and their ability to survive upon Cm challenge was tested with increasing concentrations of Cm in the presence and absence of 1 mM ONBK.
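The logic of the alternating selection can be summarized as a toy model. The sketch below is a deliberately simplified illustration, not a simulation of the actual experiment: each synthetase variant is reduced to two hypothetical boolean properties, and survival in each round is treated as all-or-nothing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Toy description of a synthetase variant (hypothetical model)."""
    name: str
    charges_onbk: bool      # aminoacylates the tRNA with ONBK
    charges_natural: bool   # aminoacylates the tRNA with an endogenous amino acid


def survives_positive(variant):
    # ONBK is present; amber suppression in the CAT gene confers Cm resistance.
    return variant.charges_onbk or variant.charges_natural


def survives_negative(variant):
    # ONBK is absent; amber suppression in the barnase gene is lethal.
    return not variant.charges_natural


def run_selection(library, rounds="PNPNP"):
    """Apply alternating positive (P) and negative (N) rounds to a library."""
    pool = list(library)
    for kind in rounds:
        keep = survives_positive if kind == "P" else survives_negative
        pool = [v for v in pool if keep(v)]
    return pool


LIBRARY = [
    Variant("inactive", False, False),
    Variant("natural-only", False, True),
    Variant("promiscuous", True, True),
    Variant("ONBK-specific", True, False),
]

survivors = run_selection(LIBRARY)
print([v.name for v in survivors])
```

In this idealized picture the positive rounds remove inactive variants and the negative rounds remove any variant that charges the tRNA with an endogenous amino acid, so only ONBK-specific variants remain.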
[0154] Vectors, transfection and protein expression in mammalian cells.
[0155] The mammalian expression vector pCMV-MmPylRS was constructed based on the pSWAN-pMpaRS plasmid developed previously. See, W. S. Liu, A. Brock, S. Chen, S. B. Chen, P. G. Schultz, Nat. Methods 2007, 4, 239. PIPE cloning (see, H. E. Klock, E. J. Koesema, M. W. Knuth, S. A. Lesley, Proteins: Struct., Funct., Bioinf. 2008, 71, 982) was used for inserting the desired genes into the vector. Instead of three copies of BstRNA^Tyr_CUA, only one copy of the tRNA^Pyl_CUA gene was inserted into pCMV-MmPylRS after a human U6 promoter, and the MmPylRS gene was inserted after a CMV promoter. Both CHO cells and HEK293 cells were used for transfection and protein expression. CHO cells were grown in a medium containing F-12, 10% FBS, 1% Pen-Strep, and 2 mM L-glutamine at 37°C in a humidified atmosphere of 5% CO2. HEK293F cells were grown in a medium containing Gibco D-MEM medium, 10% FBS, 1% Pen-Strep, and 2 mM L-glutamine at 37°C in a humidified atmosphere of 5% CO2. When cells reached 80-90% confluency, media were exchanged to either fresh F12 media or F12 media containing 1 mM unnatural amino acid, and the cells were then transfected with pCMV-MmPylRS and pWAN-GFP37TAG using Fugene 6 (Roche; 8 μl Fugene 6 + 0.8 μg of pCMV-MmPylRS + 1.2 μg of pWAN-GFP37TAG for 2 ml cell culture in Costar 6-well cell-culture clusters; 54 μl Fugene 6 + 3 μg of pCMV-MmPylRS + 9 μg of pWAN-GFP37TAG for 12 ml cell culture in 75 cm^2 tissue culture flasks). Cells were grown for an additional 24 to 36 h before being detached and lysed in RIPA buffer (Upstate) with protease inhibitor cocktail (Roche).
[0156] For Northern blot analysis, RNA samples isolated from E. coli cells were separated by acid-urea gel electrophoresis and electroblotted onto a Hybond N+ membrane in 0.5 x TBE running buffer at 30 V constant for 1 h using the Xcell II Blot Module (Invitrogen). The Chemiluminescent Nucleic Acid Detection Module (Pierce) was used with a 72-base oligonucleotide complementary to tRNA^Pyl_CUA as the probe. For Western blot analysis, cells were detached and lysed in RIPA buffer (Upstate) with protease inhibitor cocktail (Roche). The supernatant of the cell lysate was fractionated by SDS-PAGE and transferred to a 0.45 μm nitrocellulose membrane (Invitrogen). The proteins on the membrane were probed with anti-His-HRP followed by detection of the luminescence with the ECL western blotting substrate (Pierce).
[0157] To acquire intact protein mass spectra, the purified proteins were dialyzed against Tris buffer (20 mM, pH 7.3) and concentrated to ~0.1 mg ml^-1. Intact protein mass spectra were acquired on an automated LC/MS system (Agilent). The dialyzed protein sample (0.1 mg ml^-1) was loaded onto a C-8 (Agilent) column for desalting with 0.1% TFA in water and eluted with 80% acetonitrile/0.1% TFA into the ESI source of the mass spectrometer.
[0158] Photolysis of all purified proteins containing ONBK residues was carried out in Tris-buffer solution (40 mM Tris, pH 8.0, 100 mM NaCl and 1 mM DTT). Protein samples with a final concentration of 100 μM were irradiated with a high-pressure mercury lamp (500 W, Spectra Physics) equipped with a 310 nm long-pass optical filter.
[0159] Further details can be found in Chen and Groff et al. (2009) "A Facile System for Encoding Unnatural Amino Acids in Mammalian Cells" Angewandte Chemie International Edition; 48 (22): 4052-4055.
[0160] Translation system component sequences.
[0161] SEQ ID NO 1: MmPyIRS WT nucleic acid sequence: atggataaaaagcctctgaacactctgatttctgcgaccggtctgtggatgtcccgcaccggcaccatccacaaaatcaaacaccat gaagttagccgttccaaaatctacattgaaatggcttgcggcgatcacctggttgtcaacaactcccgttcttctcgtaccgctcgcgc actgcgccaccacaaatatcgcaaaacctgcaaacgttgccgtgttagcgatgaagatctgaacaaattcctgaccaaagctaacga ggatcagacctccgtaaaagtgaaggtagtaagcgctccgacccgtactaaaaaggctatgccaaaaagcgtggcccgtgccccg aaacctctggaaaacaccgaggcggctcaggctcaaccatccggttctaaattttctccggcgatcccagtgtccacccaagaatct gtttccgtaccagcaagcgtgtctaccagcattagcagcatttctaccggtgctaccgcttctgcgctggtaaaaggtaacactaaccc gattactagcatgtctgcaccggtacaggcaagcgccccagctctgactaaatcccagacggaccgtctggaggtgctgctgaacc caaaggatgaaatctctctgaacagcggcaagcctttccgtgagctggaaagcgagctgctgtctcgtcgtaaaaaggatctgcaac agatctacgctgaggaacgcgagaactatctgggtaagctggagcgcgaaattactcgcttcttcgtggatcgcggtttcctggagat caaatctccgattctgattccgctggaatacattgaacgtatgggcatcgataatgataccgaactgtctaaacagatcttccgtgtgga taaaaacttctgtctgcgtccgatgctggccccgaacctgtacaactatctgcgtaaactggaccgtgccctgccggacccgatcaaa attttcgagatcggtccttgctaccgtaaagagtccgacggtaaagagcacctggaagaattcaccatgctgaacttctgccagatgg gtagcggttgcacgcgtgaaaacctggaatccattatcaccgacttcctgaatcacctgggtatcgatttcaaaattgttggtgacagct gtatggtgtacggcgatacgctggatgttatgcacggcgatctggagctgtcttccgcagtagtgggcccaatcccgctggatcgtg agtggggtatcgacaaaccttggatcggtgcgggttttggtctggagcgtctgctgaaagtaaaacacgacttcaagaacatcaaac gtgctgcacgttccgagtcctattacaatggtatttctactaacctgtaa
[0162] SEQ ID NO 2: MmPylRS WT polypeptide sequence:
MDKKPLNTLISATGLWMSRTGTIHKIKHHEVSRSKIYIEMACGDHLVVNNSRSSRTARALRHHKYRKTCKRCRVSDEDLNKFLTKANEDQTSVKVKVVSAPTRTKKAMPKSVARAPKPLENTEAAQAQPSGSKFSPAIPVSTQESVSVPASVSTSISSISTGATASALVKGNTNPITSMSAPVQASAPALTKSQTDRLEVLLNPKDEISLNSGKPFRELESELLSRRKKDLQQIYAEERENYLGKLEREITRFFVDRGFLEIKSPILIPLEYIERMGIDNDTELSKQIFRVDKNFCLRPMLAPNLYNYLRKLDRALPDPIKIFEIGPCYRKESDGKEHLEEFTMLNFCQMGSGCTRENLESIITDFLNHLGIDFKIVGDSCMVYGDTLDVMHGDLELSSAVVGPIPLDREWGIDKPWIGAGFGLERLLKVKHDFKNIKRAARSESYYNGISTNL*
[0163] SEQ ID NO 3: NBK-I nucleic acid sequence: atggataaaaagcctctgaacactctgatttctgcgaccggtctgtggatgtcccgcaccggcaccatccacaaaatcaaacaccat gaagttagccgttccaaaatctacattgaaatggcttgcggcgatcacctggttgtcaacaactcccgttcttctcgtaccgctcgcgc actgcgccaccacaaatatcgcaaaacctgcaaacgttgccgtgttagcgatgaagatctgaacaaattcctgaccaaagctaacga ggatcagacctccgtaaaagtgaaggtagtaagcgctccgacccgtactaaaaaggctatgccaaaaagcgtggcccgtgccccg aaacctctggaaaacaccgaggcggctcaggctcaaccatccggttctaaattttctccggcgatcccagtgtccacccaagaatct gtttccgtaccagcaagcgtgtctaccagcattagcagcatttctaccggtgctaccgcttctgcgctggtaaaaggtaacactaaccc gattactagcatgtctgcaccggtacaggcaagcgccccagctctgactaaatcccagacggaccgtctggaggtgctgctgaacc caaaggatgaaatctctctgaacagcggcaagcctttccgtgagctggaaagcgagctgctgtctcgtcgtaaaaaggatctgcaac agatctacgctgaggaacgcgagaactatctgggtaagctggagcgcgaaattactcgcttcttcgtggatcgcggtttcctggagat caaatctccgattctgattccgctggaatacattgaacgtatgggcatcgataatgataccgaactgtctaaacagatcttccgtgtgga taaaaacttctgtctgcgtccgatgctggccccgaacctgatgaactatgcgcgtaaactggaccgtgccctgccggacccgatcaa aattttcgagatcggtccttgctaccgtaaagagtccgacggtaaagagcacctggaagaattcaccatgctgaacttcgcgcagatg ggtagcggttgcacgcgtgaaaacctggaatccattatcaccgacttcctgaatcacctgggtatcgatttcaaaattgttggtgacag ctgtatggtgtttggcgatacgctggatgttatgcacggcgatctggagctgtcttccgcagtagtgggcccaatcccgctggatcgt gagtggggtatcgacaaaccttggatcggtgcgggttttggtctggagcgtctgctgaaagtaaaacacgacttcaagaacatcaaa cgtgctgcacgttccgagtcctattacaatggtatttctactaacctgtaa
[0164] SEQ ID NO 4: NBK-1 polypeptide sequence:
MDKKPLNTLISATGLWMSRTGTIHKIKHHEVSRSKIYIEMACGDHLVVNNSRSSRTARALRHHKYRKTCKRCRVSDEDLNKFLTKANEDQTSVKVKVVSAPTRTKKAMPKSVARAPKPLENTEAAQAQPSGSKFSPAIPVSTQESVSVPASVSTSISSISTGATASALVKGNTNPITSMSAPVQASAPALTKSQTDRLEVLLNPKDEISLNSGKPFRELESELLSRRKKDLQQIYAEERENYLGKLEREITRFFVDRGFLEIKSPILIPLEYIERMGIDNDTELSKQIFRVDKNFCLRPMLAPNLMNYARKLDRALPDPIKIFEIGPCYRKESDGKEHLEEFTMLNFAQMGSGCTRENLESIITDFLNHLGIDFKIVGDSCMVFGDTLDVMHGDLELSSAVVGPIPLDREWGIDKPWIGAGFGLERLLKVKHDFKNIKRAARSESYYNGISTNL*
[0165] SEQ ID NO 5: NBK-2 nucleic acid sequence: atggataaaaagcctctgaacactctgatttctgcgaccggtctgtggatgtcccgcaccggcaccatccacaaaatcaaacaccat gaagttagccgttccaaaatctacattgaaatggcttgcggcgatcacctggttgtcaacaactcccgttcttctcgtaccgctcgcgc actgcgccaccacaaatatcgcaaaacctgcaaacgttgccgtgttagcgatgaagatctgaacaaattcctgaccaaagctaacga ggatcagacctccgtaaaagtgaaggtagtaagcgctccgacccgtactaaaaaggctatgccaaaaagcgtggcccgtgccccg aaacctctggaaaacaccgaggcggctcaggctcaaccatccggttctaaattttctccggcgatcccagtgtccacccaagaatct gtttccgtaccagcaagcgtgtctaccagcattagcagcatttctaccggtgctaccgcttctgcgctggtaaaaggtaacactaaccc gattactagcatgtctgcaccggtacaggcaagcgccccagctctgactaaatcccagacggaccgtctggaggtgctgctgaacc caaaggatgaaatctctctgaacagcggcaagcctttccgtgagctggaaagcgagctgctgtctcgtcgtaaaaaggatctgcaac agatctacgctgaggaacgcgagaactatctgggtaagctggagcgcgaaattactcgcttcttcgtggatcgcggtttcctggagat caaatctccgattctgattccgctggaatacattgaacgtatgggcatcgataatgataccgaactgtctaaacagatcttccgtgtgga taaaaacttctgtctgcgtccgatgctggccccgaacctgattaactatgcgcgtaaactggaccgtgccctgccggacccgatcaa aattttcgagatcggtccttgctaccgtaaagagtccgacggtaaagagcacctggaagaattcaccatgctgaacttcgcgcagatg ggtagcggttgcacgcgtgaaaacctggaatccattatcaccgacttcctgaatcacctgggtatcgatttcaaaattgttggtgacag ctgtatggtgtttggcgatacgctggatgttatgcacggcgatctggagctgtcttccgcagtagtgggcccaatcccgctggatcgt gagtggggtatcgacaaaccttggatcggtgcgggttttggtctggagcgtctgctgaaagtaaaacacgacttcaagaacatcaaa cgtgctgcacgttccgagtcctattacaatggtatttctactaacctgtaa
[0166] SEQ ID NO 6: NBK-2 polypeptide sequence:
MDKKPLNTLISATGLWMSRTGTIHKIKHHEVSRSKIYIEMACGDHLVVNNSRSSRTARALRHHKYRKTCKRCRVSDEDLNKFLTKANEDQTSVKVKVVSAPTRTKKAMPKSVARAPKPLENTEAAQAQPSGSKFSPAIPVSTQESVSVPASVSTSISSISTGATASALVKGNTNPITSMSAPVQASAPALTKSQTDRLEVLLNPKDEISLNSGKPFRELESELLSRRKKDLQQIYAEERENYLGKLEREITRFFVDRGFLEIKSPILIPLEYIERMGIDNDTELSKQIFRVDKNFCLRPMLAPNLINYARKLDRALPDPIKIFEIGPCYRKESDGKEHLEEFTMLNFAQMGSGCTRENLESIITDFLNHLGIDFKIVGDSCMVFGDTLDVMHGDLELSSAVVGPIPLDREWGIDKPWIGAGFGLERLLKVKHDFKNIKRAARSESYYNGISTNL*
[0167] SEQ ID NO: 7: Mmpyl-tRNA nucleic acid sequence:
5'-GAAACCTGATCATGTAGATCGAATGGACTCTAAATCCGTTCAGCCGGGTTAGATTCCCGGGGTTTCCGCCA-3'
[0168] While the foregoing invention has been described in some detail for purposes of clarity and understanding, it will be clear to one skilled in the art from a reading of this disclosure that various changes in form and detail can be made without departing from the true scope of the invention. For example, many of the techniques and apparatus described above can be used in various combinations.
[0169] All publications, patents, patent applications, and/or other documents cited in this application are incorporated by reference in their entirety for all purposes to the same extent as if each individual publication, patent, patent application, and/or other document were individually indicated to be incorporated by reference for all purposes.

Claims

WHAT IS CLAIMED IS:
1. A composition comprising an aminoacyl tRNA synthetase (RS) and a cognate tRNA, wherein the synthetase is orthogonal in an enterobacteria and is also orthogonal in a eukaryotic cell, and wherein the cognate tRNA recognizes a selector codon, wherein the synthetase is capable of specifically aminoacylating the tRNA with an unnatural amino acid when both the synthetase and the tRNA are expressed in either the enterobacteria or the eukaryotic cell.
2. The composition of claim 1, wherein the synthetase is derived from an Archaea or bacteria synthetase and the cognate tRNA is derived from an Archaea or bacteria tRNA.
3. The composition of claim 1, wherein the synthetase is derived from an RS selected from the group consisting of: a Methanosarcinae RS, Methanosarcina mazei pyrrolysyl-tRNA synthetase (MmPylRS) sequence, a Methanosarcina barkeri pyrrolysyl-tRNA synthetase (MbPylRS) sequence, and a Desulfitobacterium hafniense pyrrolysyl-tRNA synthetase (DhPylRS).
4. The composition of claim 3, wherein the MmPylRS (SEQ ID NO: 2) is mutated at one or more of positions 305, 306, 309, 348, 384, or 419.
5. The composition of claim 3, wherein the synthetase sequence comprises an isoleucine or methionine at a position corresponding to position 306 of the MmPylRS sequence, an alanine at a position corresponding to position 309, an alanine at a position corresponding to position 348, or a phenylalanine at a position corresponding to position 384.
6. The composition of claim 1, wherein the cognate tRNA is a pyrrolysyl-tRNA with an anticodon loop that recognizes a selector codon.
7. The composition of claim 1, wherein the unnatural amino acid is other than the canonical 20 natural amino acids, seleno-cysteine, pyrrolysine, Boc-lysine, acetyllysine or Nε-benzyloxycarbonyl-L-lysine.
8. The composition of claim 1, wherein the unnatural amino acid is selected from the group consisting of: an epsilon-substituted lysine, a photocaged lysine, a photocaged lysine analog, an ortho acyl-substituted phenylalanine, a meta acyl-substituted phenylalanine, a para acyl-substituted phenylalanine, ortho azido-substituted phenylalanine, a meta azido-substituted phenylalanine, a para azido-substituted phenylalanine, an ortho borono-substituted phenylalanine, a meta borono-substituted phenylalanine, a para borono-substituted phenylalanine, a para benzoyl-substituted phenylalanine, an ortho azido-substituted phenylalanine, a meta azido-substituted phenylalanine, an ortho nitro-substituted phenylalanine, a meta nitro-substituted phenylalanine, para nitro-substituted phenylalanine, an ortho nitro-substituted tyrosine, a meta nitro-substituted tyrosine, para nitro-substituted tyrosine; alkynyl aryl amino acids, aliphatic amino acids, alpha hydroxy acid substituted amino acids, beta diketo containing amino acids, and alkoxyamine containing amino acids.
9. The composition of claim 8, wherein the unnatural amino acid is o-nitrobenzyl-oxycarbonyl-Nε-L-lysine (ONBK).
10. The composition of claim 1, wherein the RS preferentially aminoacylates an O-tRNA with an epsilon-substituted lysine analog.
11. The composition of claim 1, wherein the aminoacyl tRNA synthetase comprises: a polypeptide sequence comprising at least 90% identity to a Methanosarcina maize pyrrolysyl-tRNA synthetase (MmPyIRS) sequence (SEQ ID NO: 2); wherein the polypeptide sequence comprises methionine at a position corresponding to position 306 of the MmPyIRS sequence, an isoleucine at a position corresponding to position 306 of the MmPyIRS sequence, an alanine at a position corresponding to position 309 of the MmPyIRS sequence, an alanine at a position corresponding to position 348, or a phenylalanine at a position corresponding to position 384.
12. The composition of claim 1, wherein the RS comprises an amino acid sequence at least 90% identical to SEQ ID NO: 4 (NBK-I), and wherein the RS comprises an Ala amino acid at a position corresponding to Leu309 of SEQ ID NO: 2, an Ala amino acid at a position corresponding to Cys348 of SEQ ID NO: 2, and a Phe amino acid at a position corresponding to Tyr384 of SEQ ID NO: 2; wherein SEQ ID NO: 2 is a wild type MmPylRS sequence.
13. A method of producing an unnatural amino acid-specific synthetase that is orthogonal in a eukaryotic cell, the method comprising: providing an orthogonal aminoacyl-tRNA synthetase (O-RS) library in one or more bacterial cells; selecting the synthetase library for an orthogonal member that specifically aminoacylates an orthogonal tRNA (O-tRNA) in the bacterial cells with an unnatural amino acid, thereby providing an unnatural amino acid-specific synthetase that is orthogonal in the bacterial cells; and, shuttling the unnatural amino acid-specific synthetase into the eukaryotic cell, wherein the unnatural amino acid-specific synthetase is orthogonal in the eukaryotic cell.
14. The method of claim 13, further comprising deriving the synthetase from an Archaea synthetase and deriving the O-tRNA from an Archaea tRNA.
15. The method of claim 13, wherein the eukaryotic cell is a mammalian cell or an insect cell.
16. The method of claim 13, wherein the unnatural amino acid is selected from the group consisting of: an ortho acyl-substituted phenylalanine, a meta acyl-substituted phenylalanine, a para acyl-substituted phenylalanine, an ortho azido-substituted phenylalanine, a meta azido-substituted phenylalanine, a para azido-substituted phenylalanine, an ortho borono-substituted phenylalanine, a meta borono-substituted phenylalanine, a para borono-substituted phenylalanine, a para benzoyl-substituted phenylalanine, an ortho azido-substituted phenylalanine, a meta azido-substituted phenylalanine, an ortho nitro-substituted phenylalanine, a meta nitro-substituted phenylalanine, a para nitro-substituted phenylalanine, an ortho nitro-substituted tyrosine, a meta nitro-substituted tyrosine, a para nitro-substituted tyrosine; alkynyl aryl amino acids, aliphatic amino acids, alpha hydroxy acid substituted amino acids, beta diketo containing amino acids and alkoxyamine containing amino acids.
17. The method of claim 13, wherein the unnatural amino acid is other than the canonical 20 natural amino acids, seleno-cysteine, pyrrolysine, Boc-lysine, acetyllysine or Nε-benzyloxycarbonyl-L-lysine.
18. The method of claim 13, wherein said providing a library comprises mutating a Methanosarcina nucleic acid encoding a pyrrolysyl-tRNA synthetase (MPylRS) polypeptide at positions encoding amino acids at positions corresponding to one or more of Tyr306, Leu309, Cys348, Tyr384 and Gly419 of SEQ ID NO: 2; wherein SEQ ID NO: 2 is a wild type Methanosarcina mazei polypeptide (MmPylRS) sequence, and transforming bacteria with the mutated nucleic acid and with a nucleic acid sequence encoding a pyrrolysyl-tRNA (Pyl-tRNA), which tRNA is preferentially aminoacylated by the MPylRS, thus providing the O-RS library of mutated MPylRSs paired with the Pyl-tRNA; wherein said selecting comprises positively selecting the library for clones encoding a mutant MPylRS that charges the Pyl-tRNA with the unnatural amino acid; the method comprising growing, in an appropriate medium, the eukaryotic cell, where the cell comprises a nucleic acid that encodes a protein and comprises at least one selector codon recognized by the Pyl-tRNA; and providing the unnatural active amino acid, a selected mutant MPylRS and the Pyl-tRNA in the cell; whereby the protein is translated in the eukaryotic cell to incorporate the unnatural amino acid at the specified position.
19. The method of claim 13, further comprising: providing a caged lysine analog in the eukaryotic cell, whereby the Pyl-tRNA is charged with the caged lysine analog by the O-RS followed by incorporation of the caged lysine analog into the polypeptide; and, illuminating the polypeptide with light to remove a cage group from the lysine.
20. The method of claim 19, further comprising preparing the unnatural amino acid by photocaging a lysine or by substituting a chemical group on a lysine.
21. A polypeptide library comprising Methanosarcina mazei pyrrolysyl-tRNA synthetase (MmPylRS) sequences that collectively comprise mutations at positions corresponding to one or more of the SEQ ID NO: 2 positions 305, 306, 309, 348, 384 and 419.
22. The library of claim 21, wherein less than 5% of amino acids, other than those at positions 305, 306, 309, 348, 384, and 419 are mutated.
23. The library of claim 21, wherein the synthetases comprise one or more mutations selected from the group consisting of: Y306M, Y306I, L309A, C348A and Y384F.
24. A polypeptide comprising one or more O-nitrobenzyl-oxycarbonyl-lysine (ONBK) residues.
25. A cell comprising the polypeptide of claim 24.
26. The cell of claim 24, wherein the cell is a eubacterial or eukaryotic cell.
27. A nucleic acid comprising a sequence encoding an aminoacyl tRNA synthetase comprising an isoleucine or methionine at a position corresponding to position 306 of the wild type Methanosarcina mazei PylRS sequence (SEQ ID NO: 2), an alanine at a position corresponding to position 309, an alanine at a position corresponding to position 348, or a phenylalanine at a position corresponding to position 384.
28. A vector comprising the nucleic acid of claim 27.
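The point-mutation labels of claim 23 (for example Y306M), the identity thresholds of claims 11 and 12, and the "less than 5%" limit of claim 22 can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names are hypothetical, the seven-residue toy sequence is not SEQ ID NO: 2, and "position corresponding to" is simplified to the same index in two equal-length, ungapped sequences, whereas a real comparison would first require a sequence alignment.

```python
import re

# Point-mutation label such as "Y306M": wild-type residue, 1-based position, new residue.
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label):
    """Split a label like 'Y306M' into ('Y', 306, 'M')."""
    match = MUTATION_RE.match(label)
    if match is None:
        raise ValueError(f"not a point-mutation label: {label!r}")
    wild_type_residue, position, new_residue = match.groups()
    return wild_type_residue, int(position), new_residue


def apply_mutations(wild_type, labels):
    """Return a copy of wild_type with each labelled substitution applied."""
    residues = list(wild_type)
    for label in labels:
        expected, position, new_residue = parse_mutation(label)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        if residues[position - 1] != expected:
            raise ValueError(f"{label}: wild type has {residues[position - 1]} there")
        residues[position - 1] = new_residue
    return "".join(residues)


def percent_identity(reference, variant):
    """Percent of identical residues; assumes equal-length, ungapped sequences."""
    if not reference or len(reference) != len(variant):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(reference, variant))
    return 100.0 * matches / len(reference)


def fraction_mutated_outside(reference, variant, library_positions):
    """Fraction of residues changed at positions NOT in library_positions (1-based)."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be of equal length")
    outside = [i for i in range(1, len(reference) + 1) if i not in library_positions]
    if not outside:
        return 0.0
    changed = sum(reference[i - 1] != variant[i - 1] for i in outside)
    return changed / len(outside)


# Hypothetical toy sequence, NOT the MmPylRS sequence of SEQ ID NO: 2.
TOY_WILD_TYPE = "MKYLCYG"
toy_variant = apply_mutations(TOY_WILD_TYPE, ["Y3M", "L4A"])
```

With a real wild-type sequence in place of the toy string, a call such as `apply_mutations(sequence, ["L309A", "C348A", "Y384F"])` would build the triple substitution recited in claim 12, and `percent_identity` or `fraction_mutated_outside` could then be compared against the 90% and 5% thresholds, subject to the alignment caveat above.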
PCT/US2010/000992 2009-04-03 2010-04-02 A facile system for encoding unnatural amino acids in mammalian cells WO2010114615A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US21181109P 2009-04-03 2009-04-03
US61/211,811 2009-04-03

Publications (2)

Publication Number Publication Date
WO2010114615A2 (en) 2010-10-07
WO2010114615A3 (en) 2011-02-24

Family

ID=42828904

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2010/000992 WO2010114615A2 (en) 2009-04-03 2010-04-02 A facile system for encoding unnatural amino acids in mammalian cells

Country Status (1)

Country Link
WO (1) WO2010114615A2 (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012085279A2 (en) * 2010-12-23 2012-06-28 Universiteit Gent Method for cross-linking peptides
DE102010056289A1 (en) 2010-12-24 2012-06-28 Geneart Ag Process for the preparation of reading frame correct fragment libraries
JP2013521269A (en) * 2010-03-05 2013-06-10 メディカル リサーチ カウンシル Genetically encoded light control
WO2014044872A1 (en) 2012-09-24 2014-03-27 Allozyne, Inc Cell lines
US9163271B2 (en) 2001-04-19 2015-10-20 The Scripps Research Instiute Methods and compositions for the production of orthogonal tRNA-aminoacyl tRNA synthetase pairs
US9580721B2 (en) 2003-04-17 2017-02-28 The Scripps Reserach Institute Expanding the eukaryotic genetic code
WO2023031445A3 (en) * 2021-09-06 2023-04-13 Veraxa Biotech Gmbh Novel aminoacyl-trna synthetase variants for genetic code expansion in eukaryotes

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008073184A2 (en) * 2006-10-18 2008-06-19 The Scripps Research Institute Genetic incorporation of unnatural amino acids into proteins in mammalian cells

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
DEITERS A. ET AL.: 'A genetically encoded photocaged tyrosine' ANGEWANDTE CHEMIE (INTERNATIONAL EDITION IN ENGLISH) vol. 45, no. 17, 21 April 2006, pages 2728 - 2731 *
MUKAI, T. ET AL.: 'Adding L-lysine derivatives to the genetic code of mammalian cells with engineered pyrrolysyl-tRNA synthetases' BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS vol. 371, no. 4, 08 May 2008, pages 812 - 822 *
TATSU, Y. ET AL.: 'Synthesis of caged peptides using caged lysine: Application to the synthesis of caged AIP, a highly specific inhibitor of calmodulin-dependent protein kinase II' BIOORGANIC vol. 9, no. 8, 19 April 1999, pages 1093 - 1696 *
YANAGISAWA, T. ET AL.: 'Multistep engineering of pyrrolysyl-tRNA synthetase to genetically encode Nepsilon-(a-Azidobenzyloxycarbonyl)lysine for site-specific protein modification' CHEMISTRY AND BIOLOGY vol. 15, no. 11, 24 November 2008, pages 1187 - 1197 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9163271B2 (en) 2001-04-19 2015-10-20 The Scripps Research Instiute Methods and compositions for the production of orthogonal tRNA-aminoacyl tRNA synthetase pairs
US9580721B2 (en) 2003-04-17 2017-02-28 The Scripps Reserach Institute Expanding the eukaryotic genetic code
JP2013521269A (en) * 2010-03-05 2013-06-10 メディカル リサーチ カウンシル Genetically encoded light control
WO2012085279A2 (en) * 2010-12-23 2012-06-28 Universiteit Gent Method for cross-linking peptides
WO2012085279A3 (en) * 2010-12-23 2012-09-27 Universiteit Gent Method for cross-linking peptides
US9708363B2 (en) 2010-12-23 2017-07-18 Universiteit Gent Method for cross-linking peptides
DE102010056289A1 (en) 2010-12-24 2012-06-28 Geneart Ag Process for the preparation of reading frame correct fragment libraries
WO2012084923A1 (en) 2010-12-24 2012-06-28 Geneart Ag Method for producing reading-frame-corrected fragment libraries
WO2014044872A1 (en) 2012-09-24 2014-03-27 Allozyne, Inc Cell lines
WO2023031445A3 (en) * 2021-09-06 2023-04-13 Veraxa Biotech Gmbh Novel aminoacyl-trna synthetase variants for genetic code expansion in eukaryotes

Also Published As

Publication number Publication date
WO2010114615A3 (en) 2011-02-24

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 10759155; Country of ref document: EP; Kind code of ref document: A2)
NENP Non-entry into the national phase in: (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 10759155; Country of ref document: EP; Kind code of ref document: A2)