EP3269809B1

EP3269809B1 - MODIFIED AMINOACYL-tRNA SYNTHETASE AND USE THEREOF

Info

Publication number: EP3269809B1
Application number: EP16764874.0A
Authority: EP
Inventors: Atsushi Ohta; Yusuke Yamagishi; Atsushi Matsuo
Original assignee: Chugai Pharmaceutical Co Ltd
Current assignee: Chugai Pharmaceutical Co Ltd
Priority date: 2015-03-13
Filing date: 2016-03-11
Publication date: 2022-07-27
Anticipated expiration: 2036-03-11
Also published as: EP3269809A1; US20180127761A1; JP2023116605A; HUE059925T2; DK3269809T3; EP3269809A4; US20210087572A1; JP7292442B2; JPWO2016148044A1; WO2016148044A1; EP4166664A1; JP2022061999A; JP7020910B2; US10815489B2

Description

Technical Field

Disclosed are aminoacyl-tRNA synthetases (ARSs) that aminoacylate tRNA with the corresponding N-methyl amino acid more efficiently than natural aminoacyl-tRNA synthetases, and use thereof. More specifically, the disclosed aminoacyl-tRNA synthetases have modified amino acid sequences, which are able to aminoacylate a tRNA with one of six N-methyl-substituted amino acids corresponding to the tRNA, namely, N-methyl-phenylalanine, N-methyl-valine, N-methyl-serine, N-methyl-threonine, N-methyl-tryptophan, and N-methyl-leucine, more efficiently than natural ARSs, and uses thereof. The aminoacyl-tRNA synthetases with modified amino acid sequences can be used to produce peptides that selectively and regioselectively contain N-methyl amino acids with high efficiency.

Background Art

Generally in organisms on earth, information stored in DNA (= an information-storing substance) defines, via RNA (= an information-transmitting substance), the structures of proteins (= functional substances) and functions resulting from the structures. Polypeptides and proteins are composed of 20 types of amino acids. The information stored in DNA, which is composed of four kinds of nucleotides, is transcribed into RNA and then translated into amino acids, which make polypeptides and proteins.
During translation, tRNA plays the role of an adaptor that matches a stretch of three nucleotides to one amino acid, and aminoacyl-tRNA synthetase (ARS) is involved in the attachment of tRNA to its amino acid.
ARSs are enzymes that specifically attach an amino acid to its corresponding tRNA. There are 20 types of ARSs, each corresponding to each of the 20 types of naturally-occurring amino acids with a few exceptions, in every biological species. Of the 20 kinds of proteinaceous amino acids, ARS precisely acylates tRNA having an anticodon corresponding to an arbitrary codon, using the specific amino acid assigned to the codon. In other words, a tRNA synthetase corresponding to a certain amino acid can distinguish a tRNA corresponding to that amino acid from tRNAs corresponding to other amino acids, and does not attach other amino acids.
In translation of an mRNA into a polypeptide chain, a tRNA bound to its corresponding amino acid (aminoacyl-tRNA) pairs with the appropriate codon on the mRNA, starting from the initiation codon, and forms hydrogen bonds with the mRNA on the ribosome. This is followed by peptidyl transfer to the amino acid on the adjacent aminoacyl-tRNA bound at the next codon. The first tRNA, having released its amino acid in the transfer, is liberated from the mRNA and can be reattached to its corresponding amino acid in a reaction catalyzed by ARS. Translation is terminated upon reaching the mRNA's stop codon and upon arrival of a protein called the termination factor, which releases the polypeptide chain from the ribosome. Proteins within the living body are produced via such processes. The proteins then exert important physiological functions in the living body.
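The adaptor role of tRNA described above can be illustrated with a minimal Python sketch. It uses only a four-entry subset of the standard genetic code; the function name `translate` and the example mRNA strings are hypothetical and chosen purely for illustration, not taken from the disclosure.

```python
# Illustrative subset of the standard genetic code (codon -> amino acid).
# None marks a stop codon. This table is deliberately incomplete.
CODON_TABLE = {
    "AUG": "Met",  # initiation codon
    "UUU": "Phe",
    "GUU": "Val",
    "UAA": None,   # stop codon
}


def translate(mrna: str) -> list[str]:
    """Read an mRNA three nucleotides at a time, starting at the first AUG."""
    start = mrna.find("AUG")
    if start < 0:
        return []  # no initiation codon, nothing is translated
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon not in CODON_TABLE:
            raise ValueError(f"codon {codon} is not in the illustrative table")
        residue = CODON_TABLE[codon]
        if residue is None:  # stop codon: the chain is released
            break
        peptide.append(residue)
    return peptide
```

In this picture, each table entry stands for one correctly charged tRNA; an ARS that attached the wrong amino acid would correspond to a wrong value in the table.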
Meanwhile, in the field of pharmaceuticals, whose bioactivities are likewise exerted in living bodies, compounds having a molecular weight from 500 to 2000, called "middle molecules", are anticipated to make new pharmaceuticals possible in areas of drug discovery where it was considered difficult to create drugs with conventional low molecular weight compounds having a molecular weight less than 500. A representative example is cyclosporine A, a naturally-derived middle molecule drug, which is a peptide consisting of 11 residues that is produced by microorganisms, inhibits the intracellular target cyclophilin, and can be orally administered.
Cyclosporine A is characterized by including non-natural "N-methyl amino acids" as components. Triggered by this, multiple studies have recently reported that the incorporation of N-methyl amino acids into peptides can increase drug-likeness of the peptides, and this is being further applied to drug discovery (Non Patent Literature 1, 2, 3). Particularly, it has come to be known that the incorporation of N-methyl amino acids leads to a decrease in hydrogen bond-donating hydrogens, acquisition of protease resistance, and fixed conformation, and contributes to membrane permeability and metabolic stability (Non Patent Literature 1, 4, Patent Literature 1).
This paves the way to the conception of drug discovery methods that select pharmaceutical candidate substances from a library of diverse peptides containing multiple N-methyl amino acids. In terms of diversity and ease of screening, much anticipated are mRNA display libraries of N-methyl amino acid-containing peptides, and the like, which use a cell-free translation system (Non Patent Literature 9, 10, Patent Literature 1). First, a display library is constructed, which is a collection of molecules forming one-for-one pairs between an enormous variety of RNA or DNA molecules (genotype) or the like and peptides encoded by these molecules (phenotype). The display library is then allowed to bind to a target protein or the like, followed by washing to remove non-bound molecules and selecting a binding molecule that is present in the library only in extremely small amounts. The selected RNA (DNA) molecule can then be sequenced to easily obtain the sequence information of the bound peptide. Particularly, an mRNA display library and a ribosome display library utilizing a cell-free translation system can be used to easily analyze 10^12 to 10^14 diverse types of molecules (Non Patent Literature 11). Recently, a method that can prepare peptides containing non-proteinaceous amino acids using a reconstituted cell-free translation system has also been developed, enabling combination with display techniques to construct an N-methyl amino acid-containing peptide display library (Non Patent Literature 10).
Some methods for preparing N-methyl amino acid-containing peptides by translation of mRNAs are known so far. These methods are performed by separately preparing beforehand "N-methyl aminoacyl-tRNAs" and adding these to a translation system.
First, in the pdCpA method developed by Hecht et al. (Non Patent Literature 5), an N-methyl aminoacyl-tRNA is prepared beforehand by using T4 RNA ligase to ligate pdCpA (5'-phospho-2'-deoxyribocytidylriboadenosine) acylated with a chemically-synthesized non-natural N-methyl amino acid to a tRNA lacking the 3'-terminal CA, which is obtained by transcription. This method has been used to introduce amino acids including N-methylalanine and N-methylphenylalanine (Non Patent Literature 6). However, when the present inventors attempted to introduce multiple N-methyl amino acids by preparing several complexes between non-natural N-methyl amino acids and tRNAs using the pdCpA method and adding the complexes to a cell-free translation system, translational efficiency decreased. Particularly, N-methylvaline could not be introduced (unpublished data).
As a different method, Suga et al. reported a method for aminoacylating tRNAs with N-methyl amino acids activated in advance by esterification using an artificial RNA catalyst (Flexizyme) (Patent Literature 2), and successfully introduced multiple types of N-methyl amino acids by translation. This method can be applied to various side chain structures, but the aminoacylation efficiency of amino acids with nonaromatic side chains is in many cases 40 to 60%, which is by no means high, and particularly, as with the pdCpA method, N-methylvaline introduction has not been confirmed (Non Patent Literature 3).
Moreover, in the method of Szostak et al., aminoacyl-tRNAs are obtained by using tRNAs extracted from Escherichia coli and wild-type ARSs, and then preparing N-methyl aminoacyl-tRNAs via a chemical N-methylation reaction consisting of three steps. However, side chains that are efficiently translated in this method are limited to only the three side chains valine, leucine, and threonine. This method also requires cumbersome operations and further results in contamination of trace amounts of natural amino acids (the starting material) resulting from incomplete progression of the N-methylation reaction. Such difficulties in controlling the reaction directly affect the purity of products (Non Patent Literature 2).
Moreover, since these techniques add N-methyl aminoacyl-tRNAs prepared outside a translation system to a translation reaction solution, N-methyl aminoacyl-tRNAs are not regenerated in the translation system and thus are only consumed in the translation process. This requires addition of large amounts of N-methyl aminoacyl-tRNAs, but the addition of large amounts of tRNAs itself contributes to the reduction of peptide yield (Non Patent Literature 7). Further, instability of aminoacyl-tRNAs in the translation solution becomes problematic. Aminoacyl-tRNAs have been shown to be hydrolyzed under physiological conditions at pH 7.5 due to the presence of ester bonds between amino acids and tRNAs (Non Patent Literature 12). The half-life of aminoacyl-tRNAs depends on the amino acid side chain, the shortest being 30 minutes. Hydrolysis of an aminoacyl-tRNA is suppressed when it forms a complex with elongation factor Tu (EF-Tu), but aminoacyl-tRNAs in excess of the concentration of EF-Tu present in the translation system are hydrolyzed. That is to say, as the translation reaction proceeds, deacylation of aminoacyl-tRNAs added at the start of translation proceeds, and at the end, aminoacyl-tRNAs having N-methyl amino acids are exhausted. In fact, Szostak et al. perceived this depletion as a problem and added N-methyl aminoacyl-tRNAs twice, at the start of translation and during the reaction, when synthesizing a polypeptide containing multiple N-methyl amino acids (Non Patent Literature 2).
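The depletion described above can be made concrete with a short Python sketch. It assumes simple first-order hydrolysis, which is an assumption made here for illustration and is not stated in the cited literature, and it ignores both protection by EF-Tu and consumption by the ribosome. The default half-life of 30 minutes is the shortest value mentioned above; the function name is hypothetical.

```python
def fraction_remaining(minutes: float, half_life_min: float = 30.0) -> float:
    """Fraction of pre-charged aminoacyl-tRNA still acylated after `minutes`.

    Assumes first-order decay (an illustrative assumption) with no
    regeneration, as is the case when aminoacyl-tRNA is prepared outside
    the translation system and no matching ARS activity is present.
    """
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    if half_life_min <= 0:
        raise ValueError("half_life_min must be positive")
    return 0.5 ** (minutes / half_life_min)
```

Under these assumptions, a species with a 30-minute half-life would be half deacylated after 30 minutes and three-quarters deacylated after 60 minutes, which is consistent with the need for a second addition during the reaction.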
The above-mentioned problems regarding introduction of N-methyl amino acids can be solved if there are ARSs for N-methyl amino acids having functions similar to natural ARSs for natural amino acids, but ARSs have an ability to precisely recognize their substrates and thus have limitations. As an exception, it was reported that natural HisRS and PheRS could be used to introduce N-methylhistidine and N-methylphenylalanine into peptides in a cell-free translation system (Non Patent Literature 8, 13). Also, Hartman et al. confirmed that natural ARSs were used to aminoacylate tRNAs with the six N-methyl amino acids N-methylvaline, N-methylleucine, N-methyl aspartic acid, N-methylhistidine, N-methyllysine, and N-methyltryptophan (Non Patent Literature 14). However, even though a subsequent report using a cell-free translation system and natural ARSs that analyzed translational synthesis of peptides containing N-methylhistidine and N-methyl aspartic acid by mass spectrum reported a certain level of yield, in the case of peptides containing N-methylvaline, N-methylleucine, N-methyllysine, and N-methyltryptophan, it was shown that the efficiency of translational synthesis (ribosomal synthesis) is very low (Non Patent Literature 8). These reports reveal that aminoacylation with N-methyl amino acids and translational introduction of N-methyl amino acids into peptides using natural ARSs has been confirmed substantially in only the three cases of N-methylphenylalanine, N-methylhistidine, and N-methyl aspartic acid.
There is some prior art in which natural ARSs have been modified to give them the function to catalyze the attachment of non-natural amino acids to tRNAs (Patent Literature 3, 4, 5). Even though these are modified ARSs that catalyze the attachment of non-natural amino acids to tRNAs, and substrates of these modified ARSs are amino acids mainly having side chain derivatives of phenylalanine and tyrosine, modified ARSs having N-methyl amino acids as substrates, and modified ARSs that can introduce multiple N-methyl amino acid residues into peptides have not been known.

Citation List

Patent Literature

Patent Literature 1: WO2013/100132
Patent Literature 2: WO2007/066627
Patent Literature 3: WO2003/014354
Patent Literature 4: WO2007/103307
Patent Literature 5: WO2002/085923

Non Patent Literature

Non Patent Literature 1: R. S. Lokey et al., Nat. Chem. Biol. 2011, 7(11), 810-817.
Non Patent Literature 2: J. W. Szostak et al., J. Am. Chem. Soc. 2008, 130, 6131-6136.
Non Patent Literature 3: T. Kawakami et al., Chemistry & Biology, 2008, Vol. 15, 32-42.
Non Patent Literature 4: H. Kessler et al., J. Am. Chem. Soc. 2012, 134, 12125-12133
Non Patent Literature 5: S.M. Hecht et al., J. Biol. Chem. 253 (1978) 4517-4520.
Non Patent Literature 6: Z. Tan et al., J. Am. Chem. Soc. 2004, 126, 12752-12753.
Non Patent Literature 7: A. O. Subtelny et al., Angew Chem Int Ed 2011 50 3164.
Non Patent Literature 8: M. C. T. Hartman et al., PLoS one, 2007, 10, e972.
Non Patent Literature 9: S. W. Millward et al., J. Am. Chem. Soc., 2005, 127, 14142-14143
Non Patent Literature 10: Y. Yamagishi et al., Chem. Biol., 18, 1562-1570, 2011
Non Patent Literature 11: H. R. Hoogenboom, Nature Biotechnol. 23, 1105-1116, 2005
Non Patent Literature 12: J. R. Peacock et al., RNA, 20, 758-64, 2014
Non Patent Literature 13: T. Kawakami, ACS Chem Biol., 8, 1205-1214, 2013
Non Patent Literature 14: M. C. T. Hartman et al., Proc Natl Acad Sci USA., 103, 4356-4361, 2006

Summary

[Problems to be Solved]

An objective of the present disclosure is to provide modified ARSs that have been modified to increase reactivity, and which use N-methyl amino acids as substrates. More specifically, an objective of the present disclosure is to provide a novel modified ARS that catalyzes the acylation reaction in which tRNAs attach non-natural N-methyl amino acids, particularly N-methyl-phenylalanine, N-methyl-valine, N-methyl-serine, N-methyl-threonine, N-methyltryptophan, and N-methylleucine, without using large amounts of tRNAs in order to efficiently produce peptides containing multiple N-methyl amino acids, and uses thereof.

[Means for Solving the Problems]

The invention relates to the embodiments as defined in the claims. In particular the invention relates to a modified valyl-tRNA synthetase (ValRS) which incorporates an N-methyl valine more efficiently than the ValRS having the amino acid sequence SEQ ID NO:24, wherein said modified ValRS is selected from the following (a) and (b):

(a) a ValRS modified at a position(s) corresponding to asparagine at position 43 and/or threonine at position 45 and/or threonine at position 279 of ValRS from Escherichia coli having the amino acid sequence SEQ ID NO:24, and
(b) a ValRS having (i) glycine or alanine at a position corresponding to asparagine at position 43 and/or (ii) serine at a position corresponding to threonine at position 45 and/or (iii) glycine or alanine at a position corresponding to threonine at position 279 of ValRS from Escherichia coli having the amino acid sequence SEQ ID NO:24.
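As an aid to reading item (b), the following Python sketch enumerates the substitution combinations it names, reading "and/or" as "at least one of the three positions is substituted". The mutation labels such as `N43G` follow the common wild-type/position/replacement shorthand, and the names `ALLOWED` and `enumerate_variants` are hypothetical; nothing here defines or limits the scope of the claims.

```python
from itertools import product

# Substitutions named in item (b); positions refer to E. coli ValRS
# (SEQ ID NO:24). Each entry: position -> (wild-type residue, replacements).
ALLOWED = {
    43: ("N", ("G", "A")),
    45: ("T", ("S",)),
    279: ("T", ("G", "A")),
}


def enumerate_variants() -> list[tuple[str, ...]]:
    """List every combination carrying at least one named substitution."""
    per_position = []
    for pos, (wild_type, replacements) in sorted(ALLOWED.items()):
        # None means the position is left as wild type.
        options = [None] + [f"{wild_type}{pos}{r}" for r in replacements]
        per_position.append(options)
    variants = []
    for combo in product(*per_position):
        mutations = tuple(m for m in combo if m is not None)
        if mutations:  # skip the all-wild-type combination
            variants.append(mutations)
    return variants
```

With three, two, and three options per position (including "unchanged"), this gives 3 x 2 x 3 - 1 = 17 combinations under the stated reading.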

The invention also relates to a polynucleotide encoding the modified ValRS of the invention.
The invention further relates to a vector comprising a polynucleotide encoding the modified ValRS of the invention.
In addition, the invention relates to a host cell comprising a polynucleotide encoding the modified ValRS of the invention or a vector comprising a polynucleotide encoding the modified ValRS of the invention.
The invention additionally relates to a method for producing the modified ValRS of the invention comprising the step of culturing the host cell as described directly above.
The invention further relates to a method for producing a tRNA acylated with an N-methylvaline, comprising the step of contacting the N-methylvaline with a tRNA in the presence of the modified ValRS of the invention.
The invention still further relates to a method for producing a polypeptide comprising an N-methylvaline, comprising the step of performing translation in the presence of the modified ValRS of the invention and the N-methylvaline.
In order to obtain ARSs having increased reactivity with N-methyl amino acids, the present inventors obtained multiple ARS genes that employ different amino acids as substrates and introduced mutations into the genes to construct mutated ARS genes encoding ARSs with altered amino acid sequences. These modified ARSs were expressed and collected, and were incubated with tRNAs in the presence of unmodified amino acids or N-methyl amino acids to assess the aminoacylation reaction. As a result of estimating the conformation formed in the N-methyl amino acid-ARS interaction, and after much trial and error, the present inventors successfully obtained modified ARSs, including multiple ARSs such as phenylalanyl-tRNA synthetase (PheRS), seryl-tRNA synthetase (SerRS), valyl-tRNA synthetase (ValRS), threonyl-tRNA synthetase (ThrRS), leucyl-tRNA synthetase (LeuRS), and tryptophanyl-tRNA synthetase (TrpRS), with increased activity in the aminoacylation reaction with N-methyl amino acids, compared to wild-type ARSs.
For example, 0.1 µM wild-type PheRS hardly incorporated N-methylphenylalanine into ribosomally-synthesized peptides even when 1 mM N-methylphenylalanine was added. On the other hand, 0.1 µM modified PheRS clearly incorporated N-methylphenylalanine into peptides even when 0.25 mM N-methylphenylalanine was added (Example 1). The amounts of phenylalanine and N-methylphenylalanine translationally introduced (ribosomally introduced) into peptides were measured by mass spectroscopy using MALDI-TOF MS. The ratio of the peak values (the peak intensity of the peptide containing N-methylphenylalanine / the peak intensity of the peptide containing phenylalanine) at the time of 0.25 mM N-methylphenylalanine addition was 0.8 when using wild-type PheRS, whereas it dramatically increased to 12.4 when using the modified PheRS α subunit (Example 1). When sequences containing two consecutive and three consecutive phenylalanines were translated, peptides containing two consecutive and three consecutive N-methylphenylalanines were confirmed to be synthesized respectively, and the efficiency was significantly higher than that using the pdCpA method. Also, ValRS was used to perform translational synthesis in the presence of 5 mM N-methylvaline, and the peptide products were analyzed with mass spectroscopy. A peptide incorporated with unmodified valine was detected as the main product when using wild-type ValRS, whereas a peptide incorporated with N-methylvaline was observed as the main product when using modified ValRS (Example 2). Furthermore, the selectivity to N-methylvaline was successfully increased by introducing mutations into the editing domain of ValRS and decreasing aminoacylation activity with unmodified Val (Example 7). When sequences containing two consecutive and three consecutive valines were translated, peptides containing two consecutive and three consecutive N-methylvalines were confirmed to be synthesized respectively (Example 2).
For SerRS, an N-methylserine-incorporated translation product was detected as the main product when modified SerRS was used under conditions in which an unmodified serine-incorporated translation product is detected as the main product when using wild-type SerRS (Example 3). Likewise for ThrRS, an N-methyl Thr-incorporated translation product was observed when modified ThrRS was used under conditions in which an unmodified Thr-incorporated translation product is detected when using wild-type ThrRS; moreover, it was demonstrated that translation products incorporated with unmodified Thr were hardly observed and the peptide with N-methyl Thr introduced was synthesized with higher purity compared to that achieved with wild-type ThrRS (Example 4). For TrpRS, a translation product incorporating unmodified Trp was detected as a main product when using wild-type TrpRS, whereas the translation product with N-methyl Trp incorporated was observed as a main product when using the modified TrpRS (Example 5). For LeuRS, translation products containing N-methyl Leu were not observed when using wild-type LeuRS, whereas it was revealed that translation products with N-methyl Leu incorporated were produced as much as translation products containing unmodified Leu when modified LeuRSs were used (Example 6). Thus, the modified ARSs as disclosed herein can be used to introduce N-methyl amino acids into peptides more efficiently than wild-type ARS.
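The peak-ratio comparison reported for PheRS can be restated as a small Python calculation. The two ratios, 0.8 and 12.4, are the values given above; the raw intensities used in the usage note and the function names are hypothetical, introduced only to show how such a ratio would be formed.

```python
def peak_ratio(mephe_intensity: float, phe_intensity: float) -> float:
    """Peak intensity of the MePhe peptide divided by that of the Phe peptide."""
    if phe_intensity <= 0:
        raise ValueError("phe_intensity must be positive")
    return mephe_intensity / phe_intensity


def fold_change(wild_type_ratio: float, modified_ratio: float) -> float:
    """How many times larger the modified-enzyme ratio is than the wild-type ratio."""
    if wild_type_ratio <= 0:
        raise ValueError("wild_type_ratio must be positive")
    return modified_ratio / wild_type_ratio


# Ratios reported at 0.25 mM N-methylphenylalanine (Example 1).
WILD_TYPE_RATIO = 0.8
MODIFIED_RATIO = 12.4
```

For instance, hypothetical intensities of 124 and 10 would give a ratio of 12.4, and `fold_change(WILD_TYPE_RATIO, MODIFIED_RATIO)` works out to about 15.5. Peak intensities in MALDI-TOF MS are only semi-quantitative, so such a figure is best read as an indication of the trend rather than as a precise measurement.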
The present disclosure provides ARSs having reactivity with N-methyl amino acids. Specifically, the present disclosure provides an aminoacyl-tRNA synthetase (ARS) that has an altered amino acid sequence and is able to incorporate any N-methyl amino acid, particularly the six N-methyl-substituted amino acids of N-methyl-phenylalanine, N-methyl-valine, N-methyl-serine, N-methyl-threonine, N-methyltryptophan, and N-methylleucine, more efficiently than natural ARSs, and use thereof. The ARS with an altered amino acid sequence as described herein can be used to produce peptides selectively and regioselectively containing any N-methyl amino acid from among these N-methyl amino acids with high efficiency.
Further disclosed is a method for producing polypeptides containing non-natural amino acids using a modified ARS as described herein. More specifically, disclosed is a method for producing polypeptides containing N-methylphenylalanine, N-methylvaline, N-methylserine, N-methylthreonine, N-methyltryptophan, and N-methylleucine using ARSs for phenylalanine, valine, serine, threonine, tryptophan, and leucine, respectively, with altered amino acid sequences.

[Effects of the Disclosure]

The modified ARSs as disclosed herein can be used to efficiently attach N-methylphenylalanine, N-methylvaline, N-methylthreonine, N-methylserine, N-methyltryptophan, and N-methylleucine to tRNAs corresponding to natural phenylalanine, valine, threonine, serine, tryptophan, and leucine, respectively, without complicated reactions.
Methods using modified ARSs as disclosed herein require no stoichiometric amount of tRNA, can synthesize peptides into which multiple N-methyl amino acids are introduced with a high translational efficiency, and are useful for generating a highly diverse peptide library.
In order to investigate the effect on translational efficiency provided by the ARS's characteristic feature of "providing a continuous supply of aminoacyl-tRNA during the translation reaction", the present inventors compared the introduction efficiencies of N-methylphenylalanine in two methods: a method using the modified PheRS05 (SEQ ID NO: 2) obtained by the disclosed application and the pdCpA method in which aminoacyl-tRNA is not expected to be regenerated during the translation reaction. As a result, the translational efficiency of the method using the modified ARS was higher, and particularly, when two consecutive and three consecutive N-methylphenylalanines were introduced, the target peptide was synthesized approximately 4 to 8 times more (unpublished data). Thus, the present disclosure enables more efficient production of N-methyl amino acid-containing polypeptides, which were conventionally hard to produce.

Brief Description of Drawings

[Figure 1] Figure 1 shows evaluation of aminoacylation activities of modified ARSs. The bands of the synthetic peptide acylated with N-methylphenylalanine using PheRS04 and PheRS05 were detected more strongly than when using wild-type PheRS (lanes 8, 9 vs. 12, 13).
[Figure 2] Figure 2 shows the confirmation by electrophoresis of peptides ribosomally synthesized using mutant PheRS. The bands of the N-methylphenylalanine-containing peptide synthesized using the modified PheRS were detected more strongly than those using wild-type PheRS (lanes 2, 3 vs. 5, 6).
[Figure 3-1] Figure 3-1 shows detection by mass spectroscopy of peptides ribosomally synthesized using the modified PheRS. Mass spectrum of the peptide ribosomally synthesized using (a) 0.1 µM wt PheRS and 0.25 mM Phe, (b) 0.1 µM wt PheRS and 0.25 mM MePhe, or (c) 0.1 µM wt PheRS and 1 mM MePhe.
[Figure 3-2] Figure 3-2 shows detection by mass spectroscopy of peptides ribosomally synthesized using the modified PheRS. Mass spectrum of the peptide ribosomally synthesized using (d) 0.1 µM PheRS05 and 0.25 mM Phe, (e) 0.1 µM PheRS05 and 0.25 mM MePhe, or (f) 0.1 µM PheRS05 and 1 mM MePhe.
[Figure 4] Figure 4 shows aminoacylation reaction performed using modified ValRSs. tRNA acylated with N-methylvaline using the mutant 13 (ValRS13) was observed more than tRNA acylated using wild-type ValRS (lane 10).
[Figure 5] Figure 5 shows the results of mass spectrometry analysis of peptides translated using the modified ValRSs. Mass spectrum of the peptides translated using (a) wild-type ValRS, (b) ValRS13, or (c) ValRS04. The peptide containing MeVal was observed as a main product when ValRS13 was used.
[Figure 6] Figure 6 shows aminoacylation reaction performed using ValRS13-11. tRNA acylated with N-methylvaline using the mutant 13-11 (ValRS13-11) was observed more than tRNA acylated using wild-type ValRS or the mutant 13 (ValRS13).
[Figure 7] Figure 7 shows comparison of activities between ValRS13 and ValRS13-11. Mass spectra of the N-methyl peptide-containing peptides translated using ValRS13 ((a), (c), (e)) and ValRS13-11 ((b), (d), (f)). It can be seen that the N-methylvaline-containing target peptide translated using ValRS13-11 had higher purity.
[Figure 8] Figure 8 shows aminoacylation reaction performed using modified SerRSs. tRNA acylated with N-methylserine was observed when mutants 03, 35, and 37 were used (lanes 3, 25, 27).
[Figure 9] Figure 9 shows the results of mass spectrometry analysis of peptides ribosomally synthesized using the modified SerRSs. Mass spectrum of the N-methylserine-containing peptide translated using (a) wild-type SerRS, (b) SerRS03, (c) SerRS05, (d) SerRS35, or (e) SerRS37. The MeSer-containing target peptide synthesized using each modified SerRS had higher purity compared to that synthesized using wild-type SerRS.
[Figure 10] Figure 10 shows the results of mass spectrometry analysis of peptides ribosomally synthesized using modified ThrRSs. Mass spectrum of the N-methylthreonine-containing peptide translated using (a) wild-type ThrRS, (b) ThrRS03, or (c) ThrRS14. The MeThr-containing target peptide synthesized using each modified ThrRS had higher purity compared to that synthesized using wild-type ThrRS.
[Figure 11] Figure 11 shows the results of mass spectrometry analysis of peptides ribosomally synthesized using modified TrpRSs. Mass spectrum of the peptide translated using (a) wild-type TrpRS, (b) TrpRS04, (c) TrpRS05, or (d) TrpRS18. The MeW-containing peptide was observed as a main product when TrpRS04, 05, and 18 were used.
[Figure 12] Figure 12 shows the results of mass spectrometry analysis of peptides ribosomally synthesized using the modified LeuRS. Mass spectrum of the peptide translated using (a) wild-type LeuRS or (b) LeuRS02. The MeL-containing peptide was observed as a main product when LeuRS02 was used.
[Figure 13] Figure 13 shows aminoacylation reaction with MeVal using the modified ValRSs having mutation in the editing domain. The activities of three mutants, 13-11, 66, and 67, were not much different (lanes 17, 18, 19).
[Figure 14] Figure 14 shows aminoacylation reaction with Val using the modified ValRSs having mutation in the editing domain. The activities of the mutants 66 and 67 were attenuated compared to the activity of the mutant 13-11 (lane 17 vs 18, 19).

Mode for Carrying Out the Disclosure

Disclosed are mutant enzymes with altered enzyme-substrate specificity for aminoacyl-tRNA synthetase. Also disclosed is the preparation of mutant aminoacyl-tRNA synthetases that can efficiently and selectively produce polypeptides containing N-methyl amino acids in large amounts, by altering the amino acid sequence of natural aminoacyl-tRNA synthetase.
Disclosed are polypeptides including ARSs that can react with N-methyl amino acids. More specifically, polypeptides are disclosed comprising modified ARSs that can react with N-methyl amino acids more efficiently than the original, natural ARSs. Further disclosed are polypeptides comprising modified ARSs that can incorporate N-methyl amino acids more efficiently than the original, natural ARSs. The polypeptides as described herein are polypeptides that have aminoacyl-tRNA synthetase activity and are modified to enhance the aminoacylation reaction with N-methyl amino acids. As used herein, the phrase "incorporate N-methyl amino acids" refers to, for example, aminoacylation of tRNAs with N-methyl amino acids corresponding to the tRNAs and may be attachment of the N-methyl amino acids to the tRNAs or incorporation of N-methyl amino acids into proteins synthesized in a translation reaction using the aminoacyl-tRNA produced in the acylation reaction. The phrase "a polypeptide has aminoacyl-tRNA synthetase activity" includes not only the case in which the polypeptide exhibits aminoacyl-tRNA synthetase activity by itself, but also the case in which the polypeptide exhibits aminoacyl-tRNA synthetase activity together with other factors. For example, when an aminoacyl-tRNA synthetase is composed of multiple subunits, the polypeptide as described herein may be one subunit or may exhibit aminoacyl-tRNA synthetase activity as a complex with other subunits. In such a case, when an aminoacyl-tRNA synthetase complex is formed between a modified polypeptide as described herein and other wild-type subunits, the aminoacyl-tRNA synthetase complex can enhance aminoacylation reaction with N-methyl amino acids more than an aminoacyl-tRNA synthetase complex consisting of wild-type subunits.
That is to say, polypeptides with aminoacyl-tRNA synthetase activity modified to enhance aminoacylation reaction with N-methyl amino acids include a polypeptide that is one modified subunit in an aminoacyl-tRNA synthetase complex consisting of multiple subunits and is modified to enhance aminoacylation reaction with N-methyl amino acids by the aminoacyl-tRNA synthetase complex.
A polypeptide comprising a modified ARS refers to a polypeptide comprising the polypeptide chain of the modified ARS, and specifically the polypeptide is a polypeptide comprising the amino acid sequence of the modified ARS. The original, natural ARS refers to a natural ARS from which modified ARS is derived, may be for example a wild-type ARS, and includes a naturally-occurring polymorphism. The phrase "can react with N-methyl amino acids" means that modified ARSs can perform an enzymatic reaction with N-methyl amino acids as substrates. The reaction may be, for example, an acylation reaction of tRNAs with N-methyl amino acids, and specifically a reaction that catalyzes a coupling reaction between an N-methyl amino acid and a tRNA. For example, depending on an amino acid used as a substrate by an ARS, the reaction is performed in the presence of the appropriate N-methyl amino acid and the appropriate tRNA, and the coupling between the N-methyl amino acid and the tRNA may be detected. Alternatively, a reaction with N-methyl amino acids may be incorporation of N-methyl amino acids into polypeptides in translation. Production of N-methyl amino acid-tRNA by ARSs can be detected by, for example, performing translation in the presence of the modified ARSs and N-methyl amino acids and detecting incorporation of N-methyl amino acids into the polypeptides produced by translation. The reactivity with N-methyl amino acids is considered to be higher when N-methyl amino acids are frequently incorporated into polypeptides.
The phrase "a modified ARS can react more efficiently than the original, natural ARS" may mean that the modified ARS reacts more efficiently than the original, natural ARS at least under a certain condition, or that a reaction or reaction product that cannot be observed when using the original ARS can be observed when using the modified ARS. For example, a modified ARS is considered to react with an N-methyl amino acid more efficiently than the original, natural ARS when the modified ARS produces polypeptides containing the N-methyl amino acid more than the original, natural ARS. A modified ARS is considered to react with its corresponding N-methyl amino acid more efficiently than the original, natural ARS when, for example, production of polypeptides containing the N-methyl amino acid that cannot be observed when using the original, natural ARS can be observed when using the modified ARS. For example, a modified ARS is considered to react with an N-methyl amino acid more efficiently than the original, natural ARS when, for example, production of polypeptides containing 2, 3, or more consecutive N-methyl amino acids cannot be observed when using the original, natural ARS, but can be observed when using the modified ARS.
The phrase "can react more efficiently than the original, natural ARS" may mean that a reaction product of interest is obtained, at least under a certain condition, with higher purity than the purity achieved with the original, natural ARS. For example, a modified ARS is considered to react with its corresponding N-methyl amino acid more efficiently than the original, natural ARS when the modified ARS produces more polypeptides containing its corresponding N-methyl amino acid than the original, natural ARS does, relative to natural amino acids derived as a result of contamination. Alternatively, a modified ARS is considered to react with N-methyl amino acids more efficiently than the original ARS when it is confirmed that reaction efficiency of the modified ARS for the N-methyl amino acid remains unchanged and reaction efficiency of the modified ARS for natural amino acids decreases. For example, a modified ARS is considered to react with its corresponding N-methyl amino acid more efficiently than the original, natural ARS when reactivity of the modified ARS to the N-methyl amino acid is relatively higher than reactivity of the modified ARS to natural amino acids.
N-methyl amino acids are not particularly limited, but are appropriately selected based on ARSs. For example, when the modified ARS is valine ARS (ValRS), the N-methyl amino acid is N-methylvaline; when the modified ARS is threonine ARS (ThrRS), the N-methyl amino acid is N-methylthreonine; when the modified ARS is serine ARS (SerRS), the N-methyl amino acid is N-methylserine; when the modified ARS is phenylalanine ARS α subunit (PheRS), the N-methyl amino acid is N-methylphenylalanine; when the modified ARS is tryptophan ARS (TrpRS), the N-methyl amino acid is N-methyltryptophan; and when the modified ARS is leucine ARS (LeuRS), the N-methyl amino acid is N-methylleucine.
For example in ValRS, modified sites of ARS are preferably the position(s) corresponding to asparagine at position 43 and/or threonine at position 45 and/or threonine at position 279 of ValRS from E. coli. Modified sites of ARS are preferably a combination of any two positions selected from the positions corresponding to asparagine at position 43, threonine at position 45, and threonine at position 279 (e.g., a combination of position 43 and position 45, position 43 and position 279, or position 45 and position 279), and more preferably a combination of positions corresponding to asparagine at position 43, threonine at position 45, and threonine at position 279. SerRS can be modified at the position(s) corresponding to glutamic acid at position 239 and/or threonine at position 237 of SerRS from E. coli, and more preferably, can be modified at a combination of the positions corresponding to glutamic acid at position 239 and threonine at position 237. PheRS α subunit is preferably modified at the position corresponding to glutamine at position 169 of PheRS from E. coli. ThrRS can be modified at the position(s) corresponding to methionine at position 332 and/or histidine at position 511 of ThrRS from E. coli. TrpRS can be modified at the position(s) corresponding to methionine at position 132 and/or glutamine at position 150 and/or histidine at position 153 of TrpRS from E. coli, preferably can be modified at a combination of any two positions selected from positions corresponding to methionine at position 132, glutamine at position 150, and histidine at position 153 (e.g., a combination of position 132 and position 150, position 132 and position 153, or position 150 and position 153), and more preferably can be modified at a combination of the positions corresponding to methionine at position 132, glutamine at position 150, and histidine at position 153. LeuRS can be modified at the position corresponding to tyrosine at position 43 of LeuRS from E. coli. 
It should be noted that these modified ARSs may be further modified at other positions. The position numbers in each ARS are indicated taking the position number of the starting methionine in each ARS from E. coli as 1. Specifically, the position numbers in each ARS are indicated taking the position number of the first methionine in the sequences of P07118 (SEQ ID NO: 24) for ValRS, P08312 (SEQ ID NO: 28) for PheRS α subunit, P0A8M3 (SEQ ID NO: 29) for ThrRS, P0A8L1 (SEQ ID NO: 26) for SerRS, P00954 (SEQ ID NO: 188) for TrpRS, and P07813 (SEQ ID NO: 189) for LeuRS (UniProt (http://www.uniprot.org/)) as 1. In a certain ARS, a position corresponding to a certain amino acid in ARS from E. coli refers to the amino acid located in the site corresponding to the amino acid in ARS from E. coli and can be identified based on structural similarity between the certain ARS and ARS from E. coli. For example, the corresponding amino acid can be identified as the amino acid aligned at the position of the amino acid in ARS from E. coli when the amino acid sequences of an ARS of interest and ARS from E. coli are aligned. As used herein, a position corresponding to a certain amino acid in ARS from E. coli is preferably the position sterically corresponding to the certain amino acid in ARS from E. coli. The sterically corresponding position refers to the position of an amino acid corresponding to the position of the certain amino acid in ARS from E. coli in the conformation of ARS.
Those skilled in the art can easily identify the sterically corresponding position by aligning ARS from E. coli with all known ARSs from other biological species for example using Multiple Sequence Alignment with default parameters in ClustalW ver2.1 (http://clustalw.ddbj.nig.ac.jp). Particularly, ARSs of interest are not limited to those from prokaryotes. Generally, sequences of ARSs in eukaryotes comprise various functional domains in addition to catalytic domain, and the sequence identity between ARSs from eukaryotes and ARSs from prokaryotes is not always high. In contrast, sequences of catalytic sites, including amino acid recognition site, and editing domain are highly conserved, and the sterically corresponding position in ARSs from eukaryotes can be easily identified using publicly-available alignment techniques.
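As a minimal sketch of the alignment-based lookup described above, the following Python function maps a residue number in a reference ARS to the residue aligned with it in an ARS of interest, given the two gapped rows of an alignment produced by any alignment tool. The function name map_position and the toy sequences are hypothetical illustrations and are not sequences from this disclosure; the sketch also covers only sequence alignment, not the steric correspondence in the folded enzyme.

```python
from typing import Optional


def map_position(aligned_ref: str, aligned_target: str, ref_pos: int) -> Optional[int]:
    """Map a 1-based residue number in the reference to the target.

    Both arguments are rows of the same alignment, with '-' marking gaps.
    Returns the 1-based residue number in the target, or None when the
    reference residue is aligned to a gap in the target.
    """
    if len(aligned_ref) != len(aligned_target):
        raise ValueError("alignment rows must have equal length")

    ref_count = 0     # residues seen so far in the reference row
    target_count = 0  # residues seen so far in the target row
    for ref_char, target_char in zip(aligned_ref, aligned_target):
        if ref_char != "-":
            ref_count += 1
        if target_char != "-":
            target_count += 1
        if ref_char != "-" and ref_count == ref_pos:
            return target_count if target_char != "-" else None

    raise ValueError("ref_pos lies beyond the end of the reference sequence")


# Hypothetical toy alignment: the target has a two-residue insertion.
TOY_REF = "MK--PPPNVTGS"
TOY_TARGET = "MKAAPPPNITGS"
```

With this toy alignment, residue 6 of the reference (N) corresponds to residue 8 of the target, because the insertion shifts the numbering by two.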
For example, the sites corresponding to positions 43 and 45 of E. coli ValRS may be respectively amino acid sites of "N/Y/T" and "T/S" in PPP(N/Y/T)X(T/S)G motif (SEQ ID NO: 180; "N/Y/T" is preferably N; X is any amino acid, preferably V, I, or P, and more preferably V; "T/S" is preferably T) present in ValRS from other organisms. More preferably, the sites corresponding to positions 43 and 45 of E. coli ValRS may be amino acids at N and T respectively in PPPNXTG motif (SEQ ID NO: 181; X is any amino acid, preferably V, I, or P, and more preferably V) present in ValRS from other organisms. For example, the position corresponding to asparagine at position 43 of E. coli ValRS is asparagine at position 345 in human (UniProt P26640) and asparagine at position 191 in Saccharomyces cerevisiae (UniProt P07806).
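The motif-based identification above can be sketched with a regular expression. The pattern below encodes the PPP(N/Y/T)X(T/S)G motif and reports the 1-based positions of the two residues of interest. The function name find_motif_sites and the toy sequence are hypothetical and serve only as an illustration.

```python
import re
from typing import List, Tuple

# PPP(N/Y/T)X(T/S)G: group 1 is the residue corresponding to position 43 of
# E. coli ValRS, group 2 the residue corresponding to position 45.
VALRS_MOTIF = re.compile(r"PPP([NYT]).([TS])G")


def find_motif_sites(seq: str) -> List[Tuple[int, str, int, str]]:
    """Return (pos_a, residue_a, pos_b, residue_b) for each motif match.

    Positions are 1-based residue numbers within seq. Matches are
    non-overlapping, as returned by re.finditer.
    """
    sites = []
    for match in VALRS_MOTIF.finditer(seq):
        sites.append(
            (match.start(1) + 1, match.group(1), match.start(2) + 1, match.group(2))
        )
    return sites


# Hypothetical toy sequence, not a real ValRS.
TOY_SEQ = "MSAAPPPNVTGKLH"
```

For the toy sequence, the motif residues are N at position 8 and T at position 10. In a real ValRS, a motif hit would still need to be confirmed against the alignment, since a short pattern can in principle occur elsewhere in a long sequence.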
Modifications of ARSs include preferably at least one substitution with an amino acid that causes a decrease of 10 or more in molecular weight. Such modifications include, for example, a modification of an amino acid selected from the group consisting of amino acids other than Thr (T), such as Gln (Q), Asn (N), Glu (E), Met (M), Tyr (Y), and His (H), into Ala (A) or Gly (G) (preferably into Gly), for example, a modification of Thr (T) into Ser (S), and for example, a modification of Met (M) into Val (V).
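The molecular weight criterion above can be checked with simple arithmetic. The sketch below uses commonly tabulated average molecular weights of the free amino acids; these values and the function names are assumptions introduced for illustration and are not taken from this disclosure.

```python
# Average molecular weights (g/mol) of free amino acids, as commonly tabulated.
# These reference values are an assumption of this sketch.
MOLECULAR_WEIGHT = {
    "G": 75.07, "A": 89.09, "S": 105.09, "V": 117.15, "T": 119.12,
    "N": 132.12, "Q": 146.15, "E": 147.13, "M": 149.21, "H": 155.16,
    "Y": 181.19,
}


def mw_decrease(original: str, replacement: str) -> float:
    """Return how much the molecular weight drops on substitution."""
    return MOLECULAR_WEIGHT[original] - MOLECULAR_WEIGHT[replacement]


def meets_threshold(original: str, replacement: str, threshold: float = 10.0) -> bool:
    """True when the substitution lowers the molecular weight by >= threshold."""
    return mw_decrease(original, replacement) >= threshold
```

Under these values, Thr to Ser lowers the molecular weight by about 14 and Met to Val by about 32, so both satisfy the criterion, whereas Thr to Val (a decrease of about 2) would not.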
Amino acids to be modified may be appropriately selected. For example in ValRS, the position corresponding to asparagine at position 43 of ValRS from E. coli is preferably modified into glycine or alanine, the position corresponding to threonine at position 45 of ValRS from E. coli is preferably modified into serine, and the position corresponding to threonine at position 279 of ValRS from E. coli is preferably modified into glycine or alanine. In SerRS, the position corresponding to glutamic acid at position 239 of SerRS from E. coli is preferably modified into glycine or alanine, and the position corresponding to threonine at position 237 of SerRS from E. coli is preferably modified into serine. In PheRS, the position corresponding to glutamine at position 169 of PheRS α subunit from E. coli is preferably modified into glycine or alanine. In ThrRS, the position corresponding to methionine at position 332 of ThrRS from E. coli is preferably modified into glycine, and the position corresponding to histidine at position 511 of ThrRS from E. coli is preferably modified into glycine. In TrpRS, the position corresponding to methionine at position 132 of TrpRS from E. coli is preferably modified into valine or alanine, the position corresponding to glutamine at position 150 of TrpRS from E. coli is preferably modified into alanine, and the position corresponding to histidine at position 153 of TrpRS from E. coli is preferably modified into alanine. In LeuRS, the position corresponding to tyrosine at position 43 of LeuRS from E. coli is preferably modified into glycine.
Specifically, the following polypeptides are disclosed:

(a) a polypeptide comprising the amino acid sequence of any of SEQ ID NOs: 3-5, 182, and 183 (ValRS04, ValRS13, ValRS13-11, ValRS66, and ValRS67); and
(b) a polypeptide that has reactivity with N-methyl Val, has at least 90% identity to the amino acid sequence of any of SEQ ID NOs: 3-5, 182, and 183, and comprises at least one amino acid of the following (i) to (iii):
(i) Gly or Ala at the amino acid position corresponding to position 43 of SEQ ID NOs: 3-5, 182, and 183;
(ii) Ser at the amino acid position corresponding to position 45 of SEQ ID NOs: 3-5, 182, and 183; and
(iii) Gly or Ala at the amino acid position corresponding to position 279 of SEQ ID NOs: 3-5, 182, and 183.

Furthermore, the above-mentioned reactivity is preferably higher than the reactivity of a polypeptide having (i) Asn at the amino acid position corresponding to position 43 and/or (ii) Thr at the amino acid position corresponding to position 45 and/or (iii) Thr at the amino acid position corresponding to position 279. For example, the ValRS as described herein preferably has a reactivity with N-methyl Val higher than the reactivity of a ValRS having Asn at the amino acid position corresponding to position 43, Thr at the amino acid position corresponding to position 45, and Thr at the amino acid position corresponding to position 279 of the ValRS.
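The residue condition in (b) for ValRS can be expressed as a simple check. This is a minimal sketch that assumes the candidate sequence uses the same numbering as the reference, so that no alignment is needed; the names VALRS_VARIANT_RESIDUES and has_variant_residue are hypothetical. It covers only the residue condition, not the identity or reactivity requirements.

```python
from typing import Dict

# Position (1-based) -> residues named in items (i) to (iii) for ValRS.
VALRS_VARIANT_RESIDUES: Dict[int, str] = {43: "GA", 45: "S", 279: "GA"}


def has_variant_residue(seq: str, allowed: Dict[int, str] = VALRS_VARIANT_RESIDUES) -> bool:
    """True when at least one listed position carries a listed residue.

    Assumes seq is numbered identically to the reference sequence.
    Positions beyond the end of seq are treated as not satisfied.
    """
    return any(
        pos <= len(seq) and seq[pos - 1] in residues
        for pos, residues in allowed.items()
    )


def make_toy_wild_type(length: int = 300) -> str:
    """Build a hypothetical placeholder with Asn43, Thr45 and Thr279."""
    residues = ["L"] * length
    residues[42] = "N"
    residues[44] = "T"
    residues[278] = "T"
    return "".join(residues)
```

A placeholder with Asn, Thr, and Thr at the three positions fails the check, while changing any one of them to a listed residue (for example Ser at position 45) passes it.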
Furthermore, the following polypeptides are disclosed:

(a) a polypeptide comprising the amino acid sequence of any of SEQ ID NOs: 6-9 (SerRS03, SerRS05, SerRS35, and SerRS37); and
(b) a polypeptide that has reactivity with N-methyl Ser, has at least 90% identity to the amino acid sequence of any of SEQ ID NOs: 6-9, and comprises at least one amino acid of the following (i) and (ii):
(i) Ser at the amino acid position corresponding to position 237 of SEQ ID NOs: 6-9; and
(ii) Gly or Ala at the amino acid position corresponding to position 239 of SEQ ID NOs: 6-9.

Furthermore, the above-mentioned reactivity is preferably higher than the reactivity of a polypeptide having (i) Thr at the amino acid position corresponding to position 237 and/or (ii) Glu at the amino acid position corresponding to position 239. For example, the SerRS as described herein preferably has a reactivity with N-methyl Ser higher than the reactivity of a SerRS having Thr at the amino acid position corresponding to position 237 of the SerRS and Glu at the amino acid position corresponding to position 239 of the SerRS.
In addition, the following polypeptides are disclosed:

(a) a polypeptide comprising the amino acid sequence of any of SEQ ID NOs: 1-2 (PheRS05 and PheRS04); and
(b) a polypeptide that has reactivity with N-methyl Phe, has at least 90% identity to the amino acid sequence of any of SEQ ID NOs: 1-2, and comprises an amino acid sequence in which the amino acid at the position corresponding to position 169 of any of SEQ ID NOs: 1-2 is Gly or Ala. Furthermore, the above-mentioned reactivity is preferably higher than the reactivity of a polypeptide in which the amino acid at the above-mentioned position is Gln. The polypeptides as described above represent ARS α subunit, and therefore the polypeptides can form a complex with β subunit to result in a functional ARS. β subunit is not particularly limited, but may be, for example, a desired wild-type subunit. As an example, β subunit that can be used is one comprising the amino acid sequence of NCBI Reference Sequence WP_000672380 (e.g., WP_000672380.1) (wherein the base sequence represents 1897337-1899721 of GenBank CP009685 (e.g., CP009685.1)).

In addition, the following polypeptides are disclosed:

(a) a polypeptide comprising the amino acid sequence of any of SEQ ID NOs: 10-11 (ThrRS03 and ThrRS14); and
(b) a polypeptide that has reactivity with N-methyl Thr, has at least 90% identity to the amino acid sequence of any of SEQ ID NOs: 10-11, and comprises at least one amino acid of the following (i) and (ii):
(i) Gly at the amino acid position corresponding to position 332 of SEQ ID NOs: 10-11; and
(ii) Gly at the amino acid position corresponding to position 511 of SEQ ID NOs: 10-11.

Furthermore, the above-mentioned reactivity is preferably higher than the reactivity of a polypeptide having (i) Met at the amino acid position corresponding to position 332 and/or (ii) His at the amino acid position corresponding to position 511. For example, the ThrRS as described herein preferably has a reactivity with N-methyl Thr higher than the reactivity of a ThrRS having Met at the amino acid position corresponding to position 332 of the ThrRS and His at the amino acid position corresponding to position 511 of the ThrRS.
Furthermore, the following polypeptides are disclosed:

(a) a polypeptide comprising the amino acid sequence of any of SEQ ID NOs: 184-186 (TrpRS04, TrpRS05, and TrpRS18); and
(b) a polypeptide that has reactivity with N-methyl Trp, has at least 90% identity to the amino acid sequence of any of SEQ ID NOs: 184-186, and comprises at least one amino acid according to any of the following (i) to (iii):
(i) Val or Ala at the amino acid position corresponding to position 132 of SEQ ID NOs: 184-186;
(ii) Ala at the amino acid position corresponding to position 150 of SEQ ID NOs: 184-186; and
(iii) Ala at the amino acid position corresponding to position 153 of SEQ ID NOs: 184-186.

Furthermore, the above-mentioned reactivity is preferably higher than the reactivity of a polypeptide having (i) Met at the amino acid position corresponding to position 132 and/or (ii) Gln at the amino acid position corresponding to position 150 and/or (iii) His at the amino acid position corresponding to position 153. For example, the TrpRS as described herein preferably has a reactivity with N-methyl Trp higher than the reactivity of a TrpRS having Met at the amino acid position corresponding to position 132 of the TrpRS, Gln at the amino acid position corresponding to position 150 of the TrpRS, and His at the amino acid position corresponding to position 153 of the TrpRS.
Additionally, the following polypeptides are disclosed:

(a) a polypeptide comprising the amino acid sequence of SEQ ID NO: 187 (LeuRS02); and
(b) a polypeptide that has reactivity with N-methyl Leu, has at least 90% identity to the amino acid sequence of SEQ ID NO: 187, and comprises an amino acid sequence in which the amino acid at the position corresponding to position 43 of SEQ ID NO: 187 is Gly.

Furthermore, the above-mentioned reactivity is preferably higher than the reactivity of a polypeptide having Tyr at the amino acid position corresponding to position 43.
The identity of amino acid sequence is preferably 93% or more, more preferably 95% or more, more preferably 97% or more, 98% or more, or 99% or more.
The N-methyl aminoacyl-tRNA synthetases as described herein are characterized by the ability to efficiently acylate tRNAs with N-methyl amino acids, which are non-natural amino acids known to enhance drug-likeness of peptides. The N-methyl aminoacyl-tRNA synthetases as described herein may be derived from any organism including bacteria such as E. coli, yeast, animals, or plants, but, because of their versatility, preferable are those that have high sequence conservation with aminoacyl-tRNA synthetases exemplified in the Examples herein (SEQ ID NOs: 1-11, 182-187), and thus have mutation site(s) easily identified in other biological species. For example, polypeptides comprising the modified ARSs as described herein may be derived from ARSs from eukaryotes or prokaryotes. Eukaryotes include protists (including protozoa and unicellular green algae), fungi (including ascomycetes and basidiomycetes), plants (including bryophytes, pteridophytes, and seed plants (gymnosperms and angiosperms)), and animals (including invertebrates and vertebrates). Prokaryotes include archaebacteria (including thermophiles and methane bacteria) and eubacteria (including cyanobacteria and E. coli). Polypeptides comprising the modified ARSs as described herein may be derived from mammals (such as human, mouse, rat, guinea pig, rabbit, sheep, monkey, goat, donkey, cattle, horse, and pig). Polypeptides comprising the modified ARSs as described herein may be derived from, for example, E. coli or yeast, and preferably derived from prokaryotes, for example bacteria. The polypeptides as described herein are derived from, for example, bacteria of the family Enterobacteriaceae, such as E. coli, including, but not limited to, the genera Escherichia (including E. coli, E. albertii, E. fergusonii), Shigella (including S. dysenteriae, S. flexneri, S. boydii, S. sonnei), Salmonella (including S. enterica, S. bongori), Citrobacter (including C. rodentium, C. koseri, C. farmeri, C. youngae), Kluyvera (including K. ascorbata), Trabulsiella (including T. guamensis), Klebsiella, and the like.
N-methyl aminoacyl-tRNA synthetases from E. coli were used in the Examples herein as one example, and therefore the modified positions indicated herein are positions in E. coli. When N-methyl aminoacyl-tRNA synthetases from other organisms are used, positions to be modified are amino acids at positions corresponding to those in the amino acid sequence of N-methyl aminoacyl-tRNA synthetases from E. coli based on sequence homology. ARSs are generally very highly conserved because they are enzymes playing an essential part in the translation mechanism that exists in all organisms. Accordingly, a desired ARS can be modified based on the method described herein to obtain a modified ARS with an increased ability to incorporate an N-methyl amino acid.
Amino acids to be newly introduced into N-methyl aminoacyl-tRNA synthetases are selected based on hydrophilicity, hydrogen bonds, and side chain size of amino acids in consideration of the distance to and interaction with N-methyl group. For example, when avoiding the steric repulsion between the amino acid at the position and an N-methyl group, the distance to the N-methyl amino group can be adjusted by, for example, substituting a high molecular weight amino acid with a low molecular weight amino acid. Specifically, the distance can be adjusted by reducing the molecular weight, for example, by substituting threonine (Thr) with serine (Ser).
For example, when an amino acid to be modified in an N-methyl aminoacyl-tRNA synthetase is asparagine, the low molecular weight amino acids can include, but are not limited to, for example, serine, valine, glycine, aspartic acid, and alanine. When an amino acid to be modified is glutamic acid, the low molecular weight amino acids can include, but are not limited to, for example, alanine, valine, serine, and aspartic acid. Moreover, for example, when an amino acid to be modified is Thr, the low molecular weight amino acids include preferably, for example, Ser. When an amino acid to be modified is an amino acid other than the amino acids as described above, the low molecular weight amino acids include preferably, for example, glycine (Gly) and alanine (Ala), and more preferably glycine (Gly). To give one specific example, Thr can be substituted with Ser; Gln, Glu, and Asn can be substituted with Gly or Ala; and Met, His, Gln, and the like can be substituted with Gly, but the substitutions are not limited thereto.
To give a more specific example, modification of valine aminoacyl-tRNA synthetase (ValRS) includes, for example, modification of amino acid(s) at position 43 and/or position 45 and/or position 279 of SEQ ID NO: 24 (natural ValRS), or amino acids at positions corresponding to these positions. Amino acids selected for substitution are not limited, but as mentioned above, for example, Thr can be substituted with Ser, Ala, or Gly, and amino acids (e.g., Asn) other than Thr can be substituted with Gly or Ala. For example, preferred are substitution of the amino acid at position 43 of SEQ ID NO: 24 or an amino acid at a position corresponding to position 43 of SEQ ID NO: 24 with Gly or Ala, and/or substitution of the amino acid at position 45 of SEQ ID NO: 24 or an amino acid at the position corresponding to position 45 of SEQ ID NO: 24 with Ser, and/or substitution of the amino acid at position 279 or an amino acid at a position corresponding to position 279 with Gly or Ala. These substitutions may be any one of the substitutions, a combination of any of these substitutions (e.g., substitutions at position 43 and position 45, substitutions at position 43 and position 279, or substitutions at position 45 and position 279), or all of the substitutions. Other substitutions may be further combined. As a more specific illustration, N43 and/or T45 and/or T279 of SEQ ID NO: 24, or amino acids at the positions corresponding to these, are preferably substituted, and preferably substituted to N43G and/or T45S and/or T279A.
Modification of serine aminoacyl-tRNA synthetase (SerRS) includes, for example, modification of amino acid(s) at position 237 and/or position 239 of SEQ ID NO: 26 (natural SerRS), or amino acids at the positions corresponding to these positions. Amino acids selected for substitution are not limited, but as mentioned above, for example, Thr can be substituted with Ser, and amino acids (e.g., Glu) other than Thr can be substituted with Gly or Ala. For example, preferred are substitution of the amino acid at position 237 of SEQ ID NO: 26 or an amino acid at the position corresponding to position 237 of SEQ ID NO: 26 with Ser, and/or substitution of the amino acid at position 239 of SEQ ID NO: 26 or an amino acid at the position corresponding to position 239 of SEQ ID NO: 26 with Gly or Ala. These substitutions may be any one or both of the substitutions. Other substitutions may be further combined. As a more specific illustration, T237 and/or E239 of SEQ ID NO: 26, or amino acids at positions corresponding to these, are preferably substituted, and preferably substituted to T237S and/or E239G (or E239A).
Modification of phenylalanine aminoacyl-tRNA synthetase α subunit (PheRS α) includes, for example, modification of amino acid at position 169 of SEQ ID NO: 28 (natural PheRS α subunit), or an amino acid at a position corresponding to position 169 of SEQ ID NO: 28. Amino acids selected for substitution are not limited, but as mentioned above, for example, Thr can be substituted with Ser, and amino acids other than Thr can be substituted with glycine (Gly) or alanine (Ala) (more preferably Gly). For example, preferred is substitution of the amino acid at position 169 of SEQ ID NO: 28 or an amino acid at the position corresponding to position 169 of SEQ ID NO: 28 with Gly. Other substitutions may be further combined. As a more specific illustration, Q169 of SEQ ID NO: 28, or an amino acid at the position corresponding to Q169 of SEQ ID NO: 28, is preferably substituted, and preferably substituted to Q169G (or Q169A).
Modification of threonine aminoacyl-tRNA synthetase (ThrRS) includes, for example, modification of amino acid(s) at position 332 and/or position 511 of SEQ ID NO: 29 (natural ThrRS), or amino acids at positions corresponding to these positions. Amino acids selected for substitution are not limited, but as mentioned above, for example, Thr can be substituted with Ser, and amino acids (e.g., Met and His) other than Thr can be substituted with Gly. For example, preferred is/are substitution of the amino acid at position 332 of SEQ ID NO: 29 or an amino acid at a position corresponding to position 332 of SEQ ID NO: 29 with Gly and/or substitution of the amino acid at position 511 of SEQ ID NO: 29 or an amino acid at a position corresponding to position 511 of SEQ ID NO: 29 with Gly. These substitutions may be any one or both of the substitutions. Other substitutions may be further combined. As a more specific illustration, M332 and/or H511 of SEQ ID NO: 29, or amino acids at positions corresponding to these, are preferably substituted, and preferably substituted to M332G and/or H511G.
Modification of tryptophan aminoacyl-tRNA synthetase (TrpRS) includes, for example, modification of amino acid(s) at position 132 and/or position 150 and/or position 153 of SEQ ID NO: 188 (natural TrpRS), or amino acids at positions corresponding to these positions. Amino acids selected for substitution are not limited, but as mentioned above, for example, Met can be substituted with Val or Ala, and amino acids (e.g., Gln) other than Met can be substituted with Ala. For example, preferred is/are substitution of the amino acid at position 132 of SEQ ID NO: 188 or an amino acid at a position corresponding to position 132 of SEQ ID NO: 188 with Val or Ala, and/or substitution of the amino acid at position 150 of SEQ ID NO: 188 or an amino acid at a position corresponding to position 150 of SEQ ID NO: 188 with Ala, and/or substitution of the amino acid at position 153 or an amino acid at a position corresponding to position 153 with Ala. These substitutions may be any one of the substitutions, a combination of any of these substitutions (e.g., substitutions at position 132 and position 150, substitutions at position 132 and position 153, or substitutions at position 150 and position 153), or all of the substitutions. Other substitutions may be further combined. As a more specific illustration, M132 and/or Q150 and/or H153 of SEQ ID NO: 188, or amino acids at positions corresponding to these, are preferably substituted, and preferably substituted to M132V and/or Q150A and/or H153A.
Modification of leucine aminoacyl-tRNA synthetase (LeuRS) includes, for example, modification of the amino acid at position 43 of SEQ ID NO: 189 (natural LeuRS), or an amino acid at a position corresponding to position 43 of SEQ ID NO: 189. Amino acids selected for substitution are not limited, but as mentioned above, for example, Tyr can be substituted with Gly. For example, preferred is substitution of the amino acid at position 43 of SEQ ID NO: 189 or an amino acid at a position corresponding to position 43 of SEQ ID NO: 189 with Gly. Other substitutions may be further combined. As a more specific illustration, Y43 of SEQ ID NO: 189, or an amino acid at a position corresponding to Y43 of SEQ ID NO: 189, is preferably substituted, and preferably substituted to Y43G.
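The substitution notation used above (for example N43G or T45S) can be applied to a sequence programmatically. The following is a minimal sketch; the function name apply_mutations and the toy sequence in the usage note are hypothetical, and the function assumes that positions are numbered from the starting methionine as 1.

```python
import re
from typing import Iterable

# Matches labels such as "N43G": wild-type residue, 1-based position, new residue.
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_mutations(seq: str, mutations: Iterable[str]) -> str:
    """Return seq with each point substitution applied.

    Raises ValueError when a label is malformed, when the position is out of
    range, or when the residue found in seq differs from the wild-type residue
    named in the label (a guard against numbering mistakes).
    """
    residues = list(seq)
    for label in mutations:
        match = _MUTATION.match(label)
        if match is None:
            raise ValueError(f"malformed mutation label: {label!r}")
        wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"{label}: expected {wild_type} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = new
    return "".join(residues)
```

For a toy sequence "MKTNA", applying T3S and N4G gives "MKSGA". The wild-type check is a design choice of this sketch: it makes an off-by-one numbering error fail loudly rather than silently mutate the wrong residue.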
A method for producing a mutant N-methyl aminoacyl-tRNA synthetase, which is modified by substituting an amino acid at a specific position with another amino acid, in accordance with the present disclosure, can be performed using any known genetic engineering technique. For example, DNA fragments having base sequences encoding amino acid sequences comprising amino acids at positions of interest are amplified using primers having base sequences substituted with base sequences encoding amino acid sequences comprising modified amino acids, resulting in base sequences encoding amino acid sequences comprising modified amino acids. The amplified DNA fragments are ligated together to obtain a full-length DNA encoding the mutant aminoacyl-tRNA synthetase. This full-length DNA can be expressed using a host cell such as E. coli to easily produce the mutant N-methyl aminoacyl-tRNA synthetase. Primers used in the method are 20 to 70 bases in length, and preferably about 20 to 50 bases in length. The primers have 1 to 3 base mismatches with the original unmodified base sequence, and therefore relatively long primers, for example, primers of 20 bases or more in length are preferably used.
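The 1 to 3 base mismatches mentioned above arise from changing a single codon. As a rough sketch, the code below picks, for a given wild-type codon and target amino acid, the codon of the standard genetic code that needs the fewest base changes. The function names are hypothetical, only the codons for Gly, Ala, Ser, and Val are tabulated, and the wild-type codons in the usage note are illustrative examples rather than codons read from the E. coli genes.

```python
from typing import Dict, List, Tuple

# Standard genetic code (DNA alphabet), restricted to the replacement residues
# discussed in this description.
CODONS: Dict[str, List[str]] = {
    "G": ["GGT", "GGC", "GGA", "GGG"],
    "A": ["GCT", "GCC", "GCA", "GCG"],
    "S": ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"],
    "V": ["GTT", "GTC", "GTA", "GTG"],
}


def count_mismatches(codon_a: str, codon_b: str) -> int:
    """Number of positions at which two codons differ."""
    return sum(base_a != base_b for base_a, base_b in zip(codon_a, codon_b))


def closest_codon(wild_type_codon: str, target_residue: str) -> Tuple[str, int]:
    """Return (codon, mismatches) for the target residue with fewest changes.

    Ties are broken by the order of the codon table above.
    """
    candidates = (
        (codon, count_mismatches(wild_type_codon, codon))
        for codon in CODONS[target_residue]
    )
    return min(candidates, key=lambda pair: pair[1])
```

For example, if Asn were encoded by AAC, the closest Gly codon would be GGC with two mismatches, and if Met (ATG) is changed to Val, GTG needs a single mismatch. This sketch ignores codon usage in the expression host, which may also influence codon choice in practice.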
A method for producing a mutant N-methyl aminoacyl-tRNA synthetase, which is modified by substituting an amino acid at a specific position with another amino acid, according to the present disclosure, is not limited to the method as described above, and various genetic engineering techniques, such as known point mutation techniques and gene synthesis techniques, and methods for introducing modified fragments using restriction enzymes, can be utilized. Expression hosts are not limited to E. coli, and animal cells and cell-free translation systems may also be used.
Modified ARSs as described herein include a polypeptide comprising an amino acid sequence set forth in any of SEQ ID NOs: 1-11 and 182-187 (PheRS05, PheRS04, ValRS04, ValRS13, ValRS13-11, SerRS03, SerRS05, SerRS35, SerRS37, ThrRS03, ThrRS14, ValRS66, ValRS67, TrpRS04, TrpRS05, TrpRS18, and LeuRS02) and polypeptides functionally equivalent to the polypeptide. "Functionally equivalent polypeptides" are ARSs with a high structural identity to a polypeptide comprising an amino acid sequence set forth in any of SEQ ID NOs: 1-11 and 182-187 and have reactivity with N-methyl amino acids. More specifically, "functionally equivalent polypeptides" are polypeptides that have amino acids modified according to the above description in ARSs with a high structural identity to a polypeptide comprising an amino acid sequence set forth in any of SEQ ID NOs: 1-11 and 182-187, thereby having increased reactivity with N-methyl amino acids compared to unmodified ARSs. Increased reactivity with N-methyl amino acids may be, for example, increased substrate specificity to N-methyl amino acids (e.g., increased value of reactivity with N-methyl amino acids / reactivity with unmodified amino acids).
Such polypeptides include, for example, a polypeptide in which one or more amino acids (preferably 1 to 20 amino acids, for example, 1 to 10 amino acids, 1 to 7 amino acids, 1 to 5 amino acids, 1 to 3 amino acids, 1 to 2 amino acids, or 1 amino acid) are substituted, deleted, inserted, and/or added in an amino acid sequence set forth in any of SEQ ID NOs: 1-11 and 182-187. Such polypeptides may also be, for example, a polypeptide in which one to several amino acids are substituted, deleted, inserted, and/or added in an amino acid sequence set forth in any of SEQ ID NOs: 1-11 and 182-187.
A polypeptide functionally equivalent to a modified ARS comprising an amino acid sequence set forth in any of SEQ ID NOs: 1-11 and 182-187 typically has a high identity to an amino acid sequence set forth in any of SEQ ID NOs: 1-11 and 182-187. A polynucleotide encoding a functionally equivalent polypeptide also typically has a high identity to a base sequence (e.g., SEQ ID NOs: 12-22 and 190-195) encoding an amino acid sequence set forth in any of SEQ ID NOs: 1-11 and 182-187. High identity (sequence identity) specifically refers to 70% or more, preferably 80% or more, 85% or more, 90% or more, 95% or more, 97% or more, 98% or more, or 99% or more identity.
The identity of amino acid sequences or base sequences can be determined using the BLAST algorithm by Karlin and Altschul (Proc. Natl. Acad. Sci. USA (1993) 90: 5873-7). Programs called BLASTN and BLASTX have been developed based on the algorithm (Altschul et al., J. Mol. Biol. (1990) 215: 403-10). When a base sequence is analyzed using BLAST-based BLASTN, parameters are set as, for example, score = 100 and wordlength = 12. When an amino acid sequence is analyzed using BLAST-based BLASTX, parameters are set as, for example, score = 50 and wordlength = 3. When BLAST and Gapped BLAST programs are used, default parameters of each program are used. Specific procedures of these analysis methods are known (see the information at the website of the Basic Local Alignment Search Tool (BLAST) at the National Center for Biotechnology Information (NCBI)).
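As a rough illustration of how an identity figure can be compared against the thresholds listed above, the following Python sketch computes percent identity for two sequences that have already been aligned (gaps written as "-"). It does not reimplement BLAST. The function names, the gap convention, and the choice of the number of aligned columns as the denominator are assumptions made for this example only.

```python
# Illustrative sketch only: percent identity over a pre-computed pairwise alignment.
# This is not BLAST; helper names and the denominator convention are assumptions.

IDENTITY_THRESHOLDS = (70, 80, 85, 90, 95, 97, 98, 99)  # percentages listed in the text


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return (identical columns / aligned columns) * 100 for two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def highest_threshold_met(identity: float):
    """Return the largest listed threshold that `identity` reaches, or None."""
    met = [t for t in IDENTITY_THRESHOLDS if identity >= t]
    return max(met) if met else None
```

For example, two aligned 10-residue sequences differing at one position give 90% identity, which reaches the "90% or more" level but not the "95% or more" level.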
The described polypeptides include any polypeptide of the following (a) to (d), wherein the amino acid corresponding to position 43 of SEQ ID NO: 24 (ValRS) is other than Asn, and/or the amino acid corresponding to position 45 of SEQ ID NO: 24 (ValRS) is other than Thr and/or the amino acid corresponding to position 279 of SEQ ID NO: 24 (ValRS) is other than Thr (i.e., a polypeptide in which the amino acid at at least one of these 3 positions is other than the indicated respective amino acid) and wherein the polypeptide has increased reactivity to N-methyl Val compared to a polypeptide in which the amino acids corresponding to position 43, position 45, and position 279 are Asn, Thr, and Thr, respectively:

(a) a polypeptide comprising an amino acid sequence having a high identity to an amino acid sequence set forth in any of SEQ ID NOs: 3-5, 182, and 183 (ValRS04, ValRS13, ValRS13-11, ValRS66, and ValRS67);
(b) a polypeptide comprising an amino acid sequence in which one or more amino acids are substituted, deleted, inserted, and/or added in an amino acid sequence set forth in any one of SEQ ID NOs: 3-5, 182, and 183;
(c) a polypeptide encoded by a base sequence with high identity to a base sequence set forth in any one of SEQ ID NOs: 14-16, 190, and 191 (DNA encoding ValRS04, ValRS13, ValRS13-11, ValRS66, and ValRS67); or
(d) a polypeptide encoded by a DNA fragment hybridizing with a strand complementary to a base sequence set forth in any one of SEQ ID NOs: 14-16, 190, and 191 under stringent conditions.

The above-mentioned polypeptides preferably have (i) Gly or Ala at the amino acid position corresponding to position 43 and/or (ii) Ser at the amino acid position corresponding to position 45 and/or (iii) Gly or Ala at the amino acid position corresponding to position 279. Such polypeptides include natural polypeptides and artificially-modified polypeptides, and preferably include polypeptides in which (i) the amino acid at the position corresponding to position 43 is substituted with Gly or Ala and/or (ii) the amino acid at the position corresponding to position 45 is substituted with Ser and/or (iii) the amino acid at the position corresponding to position 279 is substituted with Gly or Ala.
The described polypeptides include any polypeptide of the following (a) to (d), wherein the amino acid corresponding to position 237 of SEQ ID NO: 26 (SerRS) is other than Thr and/or the amino acid corresponding to position 239 of SEQ ID NO: 26 (SerRS) is other than Glu (i.e., a polypeptide in which the amino acid at at least one of these 2 positions is other than the indicated respective amino acid) and wherein the polypeptide has increased reactivity with N-methyl Ser compared to a polypeptide in which the amino acids corresponding to position 237 and position 239 are Thr and Glu respectively:

(a) a polypeptide comprising an amino acid sequence having a high identity to an amino acid sequence set forth in any one of SEQ ID NOs: 6-9 (SerRS03, SerRS05, SerRS35, and SerRS37);
(b) a polypeptide comprising an amino acid sequence in which one or more amino acids are substituted, deleted, inserted, and/or added in an amino acid sequence set forth in any one of SEQ ID NOs: 6-9;
(c) a polypeptide encoded by a base sequence having a high identity to a base sequence set forth in any one of SEQ ID NOs: 17-20 (DNA encoding SerRS03, SerRS05, SerRS35, and SerRS37); or
(d) a polypeptide encoded by a DNA fragment hybridizing with a strand complementary to a base sequence set forth in any one of SEQ ID NOs: 17-20 under stringent conditions.

The above-mentioned polypeptides preferably have (i) Ser at the amino acid position corresponding to position 237 and/or (ii) Gly or Ala at the amino acid position corresponding to position 239. Such polypeptides include natural polypeptides and artificially-modified polypeptides, and preferably include polypeptides in which (i) the amino acid at the position corresponding to position 237 is substituted with Ser and/or (ii) the amino acid at the position corresponding to position 239 is substituted with Gly or Ala.
The described polypeptides include any polypeptide of the following (a) to (d), wherein the polypeptide has any amino acid other than Gln at the position corresponding to position 169 of SEQ ID NO: 28 (PheRS) and has increased reactivity with N-methyl Phe compared to a polypeptide in which the amino acid corresponding to position 169 of SEQ ID NO: 28 (PheRS) is Gln:

(a) a polypeptide comprising an amino acid sequence having a high identity to an amino acid sequence set forth in any one of SEQ ID NOs: 1-2 (PheRS05 and PheRS04);
(b) a polypeptide comprising an amino acid sequence in which one or more amino acids are substituted, deleted, inserted, and/or added in an amino acid sequence set forth in any one of SEQ ID NOs: 1-2;
(c) a polypeptide encoded by a base sequence having a high identity to a base sequence set forth in any one of SEQ ID NOs: 12-13 (DNA encoding PheRS05 and PheRS04); or
(d) a polypeptide encoded by a DNA fragment hybridizing with a strand complementary to a base sequence set forth in any one of SEQ ID NOs: 12-13 under stringent conditions.

The above-mentioned polypeptides preferably have Gly or Ala at the amino acid position corresponding to position 169. Such polypeptides include natural polypeptides and artificially-modified polypeptides, and preferably include polypeptides in which the amino acid at the position corresponding to position 169 is substituted with Gly or Ala.
The described polypeptides include any polypeptide of the following (a) to (d), wherein the amino acid corresponding to position 332 of SEQ ID NO: 29 (ThrRS) is other than Met and/or the amino acid corresponding to position 511 of SEQ ID NO: 29 (ThrRS) is other than His (i.e., a polypeptide in which the amino acid at at least one of these 2 positions is other than the indicated respective amino acid) and wherein the polypeptide has increased reactivity with N-methyl Thr compared to a polypeptide in which the amino acids corresponding to position 332 and position 511 are Met and His, respectively:

(a) a polypeptide comprising an amino acid sequence having a high identity to an amino acid sequence set forth in any one of SEQ ID NOs: 10-11 (ThrRS03 and ThrRS14);
(b) a polypeptide comprising an amino acid sequence in which one or more amino acids are substituted, deleted, inserted, and/or added in an amino acid sequence set forth in any one of SEQ ID NOs: 10-11;
(c) a polypeptide encoded by a base sequence having a high identity to a base sequence set forth in any one of SEQ ID NOs: 21-22 (DNA encoding ThrRS03 and ThrRS14); or
(d) a polypeptide encoded by a DNA fragment hybridizing with a strand complementary to a base sequence set forth in any one of SEQ ID NOs: 21-22 under stringent conditions.

The above-mentioned polypeptides preferably have (i) Gly at the amino acid position corresponding to position 332 and/or (ii) Gly at the amino acid position corresponding to position 511. Such polypeptides include natural polypeptides and artificially-modified polypeptides, and preferably include polypeptides in which (i) the amino acid at the position corresponding to position 332 is substituted with Gly and/or (ii) the amino acid at the position corresponding to position 511 is substituted with Gly.
The described polypeptides include a polypeptide that is any polypeptide of the following (a) to (d), wherein the amino acid corresponding to position 132 of SEQ ID NO: 188 (TrpRS) is other than Met and/or the amino acid corresponding to position 150 of SEQ ID NO: 188 (TrpRS) is other than Gln and/or the amino acid corresponding to position 153 of SEQ ID NO: 188 (TrpRS) is other than His (i.e., a polypeptide in which the amino acid at at least one of these 3 positions is other than the indicated respective amino acid) and wherein the polypeptide has increased reactivity with N-methyl Trp compared to a polypeptide in which the amino acids corresponding to position 132, position 150, and position 153 are Met, Gln, and His, respectively:

(a) a polypeptide comprising an amino acid sequence having a high identity to an amino acid sequence set forth in any one of SEQ ID NOs: 184-186 (TrpRS04, TrpRS05, and TrpRS18);
(b) a polypeptide comprising an amino acid sequence in which one or more amino acids are substituted, deleted, inserted, and/or added in an amino acid sequence set forth in any one of SEQ ID NOs: 184-186;
(c) a polypeptide encoded by a base sequence having a high identity to a base sequence set forth in any one of SEQ ID NOs: 192-194 (DNA encoding TrpRS04, TrpRS05, and TrpRS18); or
(d) a polypeptide encoded by a DNA fragment hybridizing with a strand complementary to a base sequence set forth in any one of SEQ ID NOs: 192-194 under stringent conditions.

The above-mentioned polypeptides preferably have (i) Val or Ala at the amino acid position corresponding to position 132 and/or (ii) Ala at the amino acid position corresponding to position 150 and/or (iii) Ala at the amino acid position corresponding to position 153. Such polypeptides include natural polypeptides and artificially-modified polypeptides, and preferably include polypeptides in which (i) the amino acid at the position corresponding to position 132 is substituted with Val or Ala and/or (ii) the amino acid at the position corresponding to position 150 is substituted with Ala and/or (iii) the amino acid at the position corresponding to position 153 is substituted with Ala.
The described polypeptides include any polypeptide of the following (a) to (d), wherein the polypeptide has any amino acid other than Tyr at the position corresponding to position 43 of SEQ ID NO: 189 (LeuRS) and has increased reactivity with N-methyl Leu compared to a polypeptide in which the amino acid corresponding to position 43 is Tyr:

(a) a polypeptide comprising an amino acid sequence having a high identity to the amino acid sequence set forth in SEQ ID NO: 187 (LeuRS02);
(b) a polypeptide comprising an amino acid sequence in which one or more amino acids are substituted, deleted, inserted, and/or added in the amino acid sequence set forth in SEQ ID NO: 187;
(c) a polypeptide encoded by a base sequence having a high identity to the base sequence set forth in SEQ ID NO: 195 (DNA encoding LeuRS02); or
(d) a polypeptide encoded by a DNA fragment hybridizing with a strand complementary to the base sequence set forth in SEQ ID NO: 195 under stringent conditions.

The above-mentioned polypeptides preferably have Gly at the amino acid position corresponding to position 43. Such polypeptides include natural polypeptides and artificially-modified polypeptides, and preferably include polypeptides in which the amino acid at the position corresponding to position 43 is substituted with Gly.
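The positions and residues named above for the six ARSs can be collected into a small lookup table. The Python sketch below is only an illustrative summary: the dictionary layout and function names are hypothetical, one-letter amino acid codes are used for brevity, and it assumes that the residues of the polypeptide under examination have already been mapped onto the positions of the cited reference sequences (for example by sequence alignment, which is not shown).

```python
# Illustrative summary of the positions named in the text above.
# Position numbers refer to the reference sequences cited there (SEQ ID NOs: 24,
# 26, 28, 29, 188, and 189). Mapping residues of another ARS onto these positions
# is assumed to have been done beforehand.

MODIFICATION_SITES = {
    # enzyme: {position: (residue in the comparison polypeptide, preferred residues)}
    "ValRS": {43: ("N", {"G", "A"}), 45: ("T", {"S"}), 279: ("T", {"G", "A"})},
    "SerRS": {237: ("T", {"S"}), 239: ("E", {"G", "A"})},
    "PheRS": {169: ("Q", {"G", "A"})},
    "ThrRS": {332: ("M", {"G"}), 511: ("H", {"G"})},
    "TrpRS": {132: ("M", {"V", "A"}), 150: ("Q", {"A"}), 153: ("H", {"A"})},
    "LeuRS": {43: ("Y", {"G"})},
}


def classify_sites(enzyme: str, residues: dict) -> dict:
    """Label each listed position as 'unmodified', 'preferred', or 'other'."""
    result = {}
    for position, (reference, preferred) in MODIFICATION_SITES[enzyme].items():
        residue = residues[position]
        if residue == reference:
            result[position] = "unmodified"
        elif residue in preferred:
            result[position] = "preferred"
        else:
            result[position] = "other"
    return result


def meets_sequence_condition(enzyme: str, residues: dict) -> bool:
    """True if at least one listed position differs from the comparison residue."""
    labels = classify_sites(enzyme, residues)
    return any(label != "unmodified" for label in labels.values())
```

This check covers only the sequence condition. Whether a given polypeptide actually has increased reactivity with the N-methyl amino acid is an experimental question that the sketch does not address.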
In the case of polypeptides obtained by modifying natural ARSs, polypeptides according to the present disclosure are those with increased reactivity to N-methyl amino acids compared to unmodified ARSs. When the polypeptides are natural ARSs or artificially produced polypeptides, the polypeptides in accordance with the present disclosure are those with reactivity to N-methyl amino acids, namely those having an activity to acylate tRNAs with N-methyl amino acids. When the reactivities of a modified ARS and a natural ARS are compared and the ARSs are made up of multiple subunits, the same subunit(s) is/are used in both ARSs for the subunit(s) other than that/those being compared. These subunits may be natural (wild-type) or modified as long as they are the same in both ARSs.
High identity (or high sequence identity) refers to, as mentioned above, for example, 70% or more, preferably 80% or more, 85% or more, 90% or more, 95% or more, 97% or more, 98% or more, or 99% or more identity. The number of amino acids substituted, deleted, inserted, and/or added may be one or several, for example, 1 to 20, preferably 1 to 15, more preferably 1 to 10, more preferably 1 to 8, more preferably 1 to 7, more preferably 1 to 6, more preferably 1 to 5, more preferably 1 to 4, more preferably 1 to 3, more preferably 1 to 2, and more preferably 1. Stringent hybridization conditions refer to, for example, conditions of about "1 X SSC, 0.1% SDS, at 37°C", more strictly conditions of about "0.5 X SSC, 0.1% SDS, at 42°C", further more strictly conditions of about "0.2 X SSC, 0.1% SDS, at 65°C", and yet more strictly conditions of about "0.1 X SSC, 0.1% SDS, at 65°C". It is noted that the conditions of SSC, SDS, and temperature as described above are only examples of combinations. Those skilled in the art can achieve stringency similar to those described above by appropriately combining the above-mentioned or other factors that determine hybridization stringency (such as probe concentration, probe length, and duration of hybridization).
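The four example conditions above can be written down as data and compared. The following Python sketch is a simplification: the class and function names are hypothetical, and the comparison considers only salt concentration and temperature, whereas the text notes that other factors (probe concentration, probe length, duration of hybridization) also affect stringency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridizationCondition:
    ssc: float          # SSC concentration as a multiple ("x SSC")
    sds_percent: float  # SDS concentration in percent
    temp_c: float       # temperature in degrees Celsius


# The four example conditions given in the text, from least to most strict.
EXAMPLE_CONDITIONS = (
    HybridizationCondition(1.0, 0.1, 37),
    HybridizationCondition(0.5, 0.1, 42),
    HybridizationCondition(0.2, 0.1, 65),
    HybridizationCondition(0.1, 0.1, 65),
)


def at_least_as_stringent(a: HybridizationCondition,
                          b: HybridizationCondition) -> bool:
    """Simplified rule: lower or equal salt AND higher or equal temperature.

    This is an assumption for illustration; it defines only a partial order,
    so two conditions may be incomparable under it.
    """
    return a.ssc <= b.ssc and a.temp_c >= b.temp_c
```

Under this simplified rule, each example condition in the list is at least as stringent as the one before it.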
Also disclosed are polynucleotides encoding the polypeptides as described herein. The polynucleotides comprise any polynucleotide as long as it comprises a sequence encoding a polypeptide as described above. The polynucleotides as described herein also comprise any one of genomic DNA, cDNA, and DNAs artificially produced based on the genomic DNA or cDNA. Genomic DNA comprises exons and introns. That is to say, genomic DNA may or may not comprise introns, and may or may not comprise untranslated regions (5' UTR and/or 3' UTR), transcription control elements, and the like. cDNA is a nucleic acid sequence derived from a portion of intronic sequence and may comprise a nucleic acid sequence encoding an amino acid sequence.
The polynucleotides also comprise degenerate polynucleotides comprising any codon that encodes the same amino acid. The described polynucleotides may also be a polynucleotide derived from desired organisms.
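To illustrate what degeneracy means in practice, the Python sketch below lists the codons of the standard genetic code that encode a given amino acid and counts how many distinct DNA sequences encode a short peptide. The function names are hypothetical, and the use of the standard genetic code (rather than an organism-specific variant) is an assumption of the example.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code; codons are ordered TTT, TTC, TTA, TTG, TCT, ..., GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(amino_acid: str) -> list:
    """Return all codons encoding the one-letter amino acid ('*' for stop)."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]


def count_encodings(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the given peptide."""
    total = 1
    for aa in peptide:
        total *= len(synonymous_codons(aa))
    return total
```

For instance, glycine is encoded by GGT, GGC, GGA, and GGG, so a Gly substitution at a modified site can be specified by any of four codons.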
The described polynucleotides may be obtained using any method. For example, a complementary DNA (cDNA) prepared from mRNA, a DNA prepared from genomic DNA, a DNA obtained by chemical synthesis, a DNA obtained by amplifying RNA or DNA as a template using PCR, and a DNA constructed by appropriately combining these techniques are all included. The described polynucleotides can be produced by cloning the genomic DNA or RNA encoding the polypeptide as described herein according to conventional methods and introducing mutations into the cloned genomic DNA or RNA.
For example, in a method for cloning cDNA from mRNA encoding a polypeptide as described herein, first, the mRNA encoding the polypeptide as described herein is prepared according to conventional methods from any tissue or cell where the polypeptide as described herein is expressed and produced. For example, total RNA prepared using a method such as the guanidine thiocyanate method, hot-phenol method, or AGPC method can be subjected to affinity chromatography on oligo (dT) cellulose, poly U-Sepharose, or the like.
The obtained mRNA is then used as a template to synthesize a cDNA strand by using a known method (Mol. Cell. Biol., Vol. 2, p. 161, 1982; Mol. Cell. Biol., Vol. 3, p. 280, 1983; Gene, Vol. 25, p. 263, 1983), for example, by using reverse transcriptase. The cDNA strand is converted into a double-stranded cDNA and incorporated into a plasmid vector, phage vector, cosmid vector, or the like. The vector is used to transform E. coli or to perform in vitro packaging followed by transfection of E. coli to generate a cDNA library.
The cDNA library can be screened using a polynucleotide as described herein (e.g., SEQ ID NOs: 12-22, 190-195) or a portion thereof as a probe to obtain a gene of interest. Alternatively, the cDNA library can be directly amplified by PCR using a polynucleotide as described herein (e.g., SEQ ID NOs: 12-22, 190-195) or a portion thereof as a primer. The sites and lengths of probes and primers may be appropriately determined.
Further disclosed are vectors (recombinant vectors) comprising polynucleotides encoding the polypeptides as described above. The vectors are not particularly limited as long as they can replicate and can be maintained or can self-proliferate in any host prokaryotic and/or eukaryotic cell. The vectors include plasmid vectors, phage vectors, viral vectors, and the like.
Examples of vectors for cloning include, for example, pUC19, λgt10, λgt11, and the like. Furthermore, the vectors preferably have a promoter that can express the herein described polypeptide when cells that can express the polynucleotide in host cells are isolated.
The recombinant vectors as described herein can be prepared by simply ligating a polynucleotide encoding a herein disclosed polypeptide into a vector for recombination (a plasmid DNA and bacteriophage DNA) available in the art according to conventional methods.
Examples of recombinant vectors that can be used include, for example, plasmids from E. coli (such as pBR322, pBR325, pUC12, pUC13, and pUC19), plasmids from yeast (such as pSH19 and pSH15), and plasmids from Bacillus subtilis (such as pUB110, pTP5, and pC194).
Examples of phages include bacteriophage such as λ phage, and further animal and insect viruses such as retrovirus, vaccinia virus, nucleopolyhedrovirus, and lentivirus (pVL1393, from Invitrogen).
Expression vectors are useful for expressing polynucleotides encoding the herein described polypeptides and production of the herein described polypeptides. The expression vectors are not particularly limited as long as they have functions to express polynucleotides encoding the polypeptides as described herein and produce the polypeptides in any host prokaryotic and/or eukaryotic cell.
For example, expression vectors include pMAL C2, pEF-BOS (Nucleic Acids Research, Vol. 18, 1990, p. 5322, and the like), or pME18S ("Idenshi Kougaku Handbook (Genetic Engineering Handbook)", supplementary volume of Jikken Igaku (Experimental Medicine), 1992, and the like).
Also disclosed are fusions of polypeptides as described herein with another protein or other proteins. A fusion polypeptide according to the present disclosure is a fusion polypeptide between a polypeptide having reactivity with N-methyl amino acids as described herein and another polypeptide. The fusion polypeptide itself may have no reactivity with N-methyl amino acids as long as it comprises a polypeptide chain having reactivity with N-methyl amino acids as described herein. When prepared as a fusion protein with, for example, Glutathione S-transferase (GST), the fusion polypeptide as described herein can be prepared by subcloning a cDNA encoding the polypeptide as described herein into, for example, plasmid pGEX4T1 (from Pharmacia), transforming E. coli DH5α or the like with the plasmid, and culturing the transformant.
Alternatively, a fusion polypeptide according to the present disclosure can be prepared as a fusion with HA (influenza agglutinin), immunoglobulin constant region, β-galactosidase, maltose-binding protein (MBP), or the like. Moreover, a fusion polypeptide can be prepared as a fusion with, for example, any known peptide such as FLAG (Hopp, T. P. et al., BioTechnology (1988) 6, 1204-1210), a tag consisting of several (e.g., six) histidine (His) residues (such as 6 x His, 10 x His), influenza agglutinin (HA), a human c-myc fragment, a VSV-GP fragment, a p18HIV fragment, T7-tag, HSV-tag, E-tag, a SV40T antigen fragment, lck tag, an α-tubulin fragment, B-tag, a Protein C fragment, Stag, StrepTag, and HaloTag.
The vectors as described herein preferably comprise at least a promoter-operator region, the initiation codon, a polynucleotide encoding the herein described polypeptide, a termination codon, a terminator region, and a replicable unit when bacteria, particularly E. coli, are used as host cells.
The expression vectors preferably comprise at least a promoter, the initiation codon, a polynucleotide encoding a polypeptide as described herein, and a termination codon when yeast, animal cells, or insect cells are used as hosts.
The vectors may also comprise DNA encoding a signal peptide, an enhancer sequence, 5' and 3' untranslated regions of the gene encoding a polypeptide as described herein, a splicing junction, a polyadenylation site, a selection marker region, a replicable unit, or the like.
The vectors may also comprise, if desired, a marker gene (such as a gene amplification gene, a drug-resistant gene) that allows selection of hosts in which gene amplification and transformation have been achieved.
Examples of marker genes include, for example, dihydrofolate reductase (DHFR) gene, thymidine kinase gene, neomycin resistant gene, glutamic acid synthetase gene, adenosine deaminase gene, ornithine decarboxylase gene, hygromycin B phosphotransferase gene, aspartate transcarbamylase gene, and the like.
The promoter-operator region for expressing a polypeptide in bacteria can include a promoter, an operator, and a Shine-Dalgarno (SD) sequence (such as AAGG).
An example of a promoter-operator region includes one comprising, for example, Trp promoter, lac promoter, recA promoter, λPL promoter, lpp promoter, tac promoter, or the like when the host is a bacterium of the genus Escherichia, for example.
Promoters for expressing a polypeptide in yeast include PHO5 promoter, PGK promoter, GAP promoter, and ADH promoter.
The promoters include SL01 promoter, SP02 promoter, penP promoter, and the like when the host is a bacterium of the genus Bacillus.
The promoters also include promoters derived from SV40, retroviral promoters, heat shock promoters, and the like when the hosts are eukaryotic cells such as mammalian cells. The promoters are preferably those derived from SV40 and retroviral promoters. However, the promoters are not particularly limited to those described above. Also, enhancers can be effectively used for expression.
An example of a suitable initiation codon includes the methionine codon (ATG). Examples of termination codons include common termination codons (e.g., TAG, TGA, and TAA). Terminator regions that can be used include natural or synthetic terminators generally used.
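The initiation and termination codons mentioned above suggest some basic sanity checks on a coding sequence before it is placed in an expression vector. The Python sketch below is a minimal illustration; the function name and the particular set of checks are assumptions of this example, not a procedure taken from the disclosure.

```python
START_CODON = "ATG"
STOP_CODONS = {"TAG", "TGA", "TAA"}


def check_coding_sequence(dna: str) -> list:
    """Return a list of problems found; an empty list means all checks passed."""
    seq = dna.upper()
    problems = []
    if len(seq) % 3 != 0:
        problems.append("length is not a multiple of 3")
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    if not codons or codons[0] != START_CODON:
        problems.append("does not begin with ATG")
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("does not end with a termination codon")
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("contains an internal termination codon")
    return problems
```

A sequence such as ATG GGT TAA passes all four checks, whereas a sequence with a stop codon in the middle of the reading frame is flagged.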
The replicable unit refers to DNA having an ability to replicate the full DNA sequence of the replicable unit in host cells, and includes natural plasmids, artificially modified plasmids (DNA fragments prepared from natural plasmids), synthetic plasmids, and the like. Suitable plasmids include plasmid pBR322 or an artificially modified pBR322 (a DNA fragment obtained by digesting pBR322 with a suitable restriction enzyme) for E. coli; yeast 2µ plasmid or yeast chromosomal DNA for yeast; and plasmid pRSVneo (ATCC 37198), plasmid pSV2dhfr (ATCC 37145), plasmid pdBPV-MMTneo (ATCC 37224), plasmid pSV2neo (ATCC 37149), and the like for mammalian cells.
Enhancer sequence, polyadenylation site, and splicing junction that can be used are those commonly used by those skilled in the art, such as for example those derived from SV40.
The expression vectors can be prepared by ligating at least a promoter as described above, an initiation codon, a polynucleotide encoding a polypeptide, a termination codon, and a terminator region to a suitable replicable unit continuously and circularly. Suitable DNA fragments (e.g., a linker, other restriction enzyme cleavage sites, and the like) can also be used in conventional techniques such as digestion with restriction enzymes and ligation with T4 DNA ligase, if desired.
Also disclosed are recombinant cells transformed with the vectors as described above, and the recombinant cells can be prepared by introducing the expression vectors as described above into host cells.
Host cells used in accordance with the disclosure are not particularly limited as long as they are compatible with the expression vectors described above and can be transformed. Examples of host cells include various cells including natural cells and artificially-established recombinant cells commonly used in the technical field, for example, bacteria (Escherichia bacteria, Bacillus bacteria), yeast (such as Saccharomyces, Pichia), animal cells, insect cells, and the like.
The host cells are preferably E. coli or animal cells, and examples include E. coli (such as DH5α, TB1, HB101), mouse-derived cells (such as COP, L, C127, Sp2/0, NS-1, or NIH3T3), rat-derived cells (PC12, PC12h), hamster-derived cells (such as BHK and CHO), monkey-derived cells (such as COS1, COS3, COS7, CV1, and Vero), human-derived cells (such as HeLa, diploid fibroblast-derived cells, myeloma cells, and HepG2), and the like.
Introduction of expression vectors into host cells (transformation (transfection)) can be performed according to conventional methods ([for E. coli, Bacillus subtilis, and the like]: Proc. Natl. Acad. Sci. USA., Vol. 69, p. 2110, 1972; Mol. Gen. Genet., Vol. 168, p. 111, 1979; J. Mol. Biol., Vol. 56, p. 209, 1971; [for Saccharomyces cerevisiae]: Proc. Natl. Acad. Sci. USA., Vol. 75, p. 1927, 1978; J. Bacteriol., Vol. 153, p. 163, 1983; [for animal cells]: Virology, Vol. 52, p. 456, 1973; [for insect cells]: Mol. Cell. Biol., Vol. 3, p. 2156-2165, 1983).
The polypeptides can be produced by culturing recombinant transformed cells including expression vectors prepared as described above (hereinafter used to mean inclusion of an inclusion body) in a nutrient medium according to any conventional method.
The polypeptides can be produced, for example, by culturing recombinant cells as described above, particularly animal cells, and allowing the recombinant cells to secrete the polypeptides into the culture supernatant.
The obtained culture is subjected to filtration, centrifugation, or any other similar technique to obtain a culture filtrate (supernatant). The polypeptides are purified and isolated from the culture filtrate according to any conventional method commonly used to purify and isolate natural or synthetic proteins.
Isolation and purification methods include, for example, methods utilizing solubility such as salt precipitation and solvent precipitation, dialysis, ultrafiltration, gel filtration, methods utilizing the difference in molecular weight such as sodium dodecyl sulphate-polyacrylamide gel electrophoresis, methods utilizing electric charge such as ion exchange chromatography and hydroxylapatite chromatography, methods utilizing specific affinity such as affinity chromatography, methods utilizing the difference in hydrophobicity such as reversed-phase high-performance liquid chromatography, methods utilizing the difference in isoelectric point such as isoelectric focusing, and the like.
Meanwhile, when a polypeptide as described herein is present in periplasm or cytoplasm of cultured recombinant cells (e.g., E. coli), the culture is subjected to any conventional method such as filtration or centrifugation to collect bacterial pellets or cells. The collected bacterial pellets or cells are suspended in a suitable buffer, and cell wall and/or cytoplasmic membrane is/are disrupted by any method such as, for example, sonication, lysozyme, and freeze-thawing. Any method such as centrifugation or filtration is then performed to obtain a membrane fraction containing the protein as described herein. The membrane fraction is solubilized with any surfactant such as Triton™ X-100 to give a crude solution. Then, the crude solution can be isolated and purified using any conventional method as described previously.
Also disclosed is a polynucleotide (cDNA or genomic DNA) encoding a polypeptide as described above or an oligonucleotide hybridizing with a strand complementary to the polynucleotide. For example, the oligonucleotide is a base sequence at least comprising modified sites of ARSs as described herein or an oligonucleotide hybridizing with a strand complementary to the base sequence. An oligonucleotide in accordance with the disclosure is also a partial fragment of polynucleotides encoding a modified ARS as described herein, and is preferably a fragment comprising bases in modified sites of the modified ARSs or a strand complementary to the fragment.
For example, for a polynucleotide encoding a polypeptide as described herein having reactivity with N-methyl Val, the oligonucleotide may be an oligonucleotide comprising bases encoding the codon(s) for an amino acid corresponding to position 43, and/or an amino acid corresponding to position 45, and/or an amino acid corresponding to position 279 of the amino acid sequences set forth in SEQ ID NOs: 3-5, 182, and 183 (ValRS04, ValRS13, ValRS13-11, ValRS66, and ValRS67), or an oligonucleotide consisting of a sequence complementary to the oligonucleotide. In this case, an amino acid corresponding to position 43 is preferably Gly or Ala (more preferably Gly), and an amino acid corresponding to position 45 is preferably Ser, and an amino acid corresponding to position 279 is preferably Gly or Ala (more preferably Ala).
For example, for a polynucleotide encoding a polypeptide as described herein having reactivity with N-methyl Ser, the oligonucleotide may be an oligonucleotide comprising bases encoding the codon(s) for an amino acid corresponding to position 237 and/or an amino acid corresponding to position 239 of the amino acid sequences set forth in SEQ ID NOs: 6-9 (SerRS03, SerRS05, SerRS35, and SerRS37), or an oligonucleotide consisting of a sequence complementary to the oligonucleotide. In this case, an amino acid corresponding to position 237 is preferably Ser, and an amino acid corresponding to position 239 is preferably Gly or Ala (more preferably Gly).
For example, for a polynucleotide encoding a polypeptide as described herein having reactivity with N-methyl Phe, the oligonucleotide may be an oligonucleotide comprising bases encoding the codon for an amino acid corresponding to position 169 of the amino acid sequences set forth in SEQ ID NOs: 1-2 (PheRS05 and PheRS04) or an oligonucleotide consisting of a sequence complementary to the oligonucleotide. In this case, an amino acid corresponding to position 169 is preferably Gly or Ala.
For example, for a polynucleotide encoding a polypeptide as described herein having reactivity with N-methyl Thr, the oligonucleotide may be an oligonucleotide comprising bases encoding the codon(s) for an amino acid corresponding to position 332 and/or an amino acid corresponding to position 511 of the amino acid sequences set forth in SEQ ID NOs: 10-11 (ThrRS03 and ThrRS14), or an oligonucleotide consisting of a sequence complementary to the oligonucleotide. In this case, an amino acid corresponding to position 332 is preferably Gly, and an amino acid corresponding to position 511 is preferably Gly.
For example, for a polynucleotide encoding a polypeptide as described herein having reactivity with N-methyl Trp, the oligonucleotide may be an oligonucleotide comprising bases encoding the codon(s) for an amino acid corresponding to position 132, and/or an amino acid corresponding to position 150, and/or an amino acid corresponding to position 153 of the amino acid sequences set forth in SEQ ID NOs: 184-186 (TrpRS04, TrpRS05, and TrpRS18), or an oligonucleotide consisting of a sequence complementary to the oligonucleotide. In this case, an amino acid corresponding to position 132 is preferably Val or Ala (more preferably Val), an amino acid corresponding to position 150 is preferably Ala, and an amino acid corresponding to position 153 is preferably Ala.
For example, for a polynucleotide encoding a polypeptide as described herein having reactivity with N-methyl Leu, the oligonucleotide may be an oligonucleotide comprising bases encoding the codon for an amino acid corresponding to position 43 of the amino acid sequence set forth in SEQ ID NO: 187 (LeuRS02) or an oligonucleotide consisting of a sequence complementary to the oligonucleotide. In this case, an amino acid corresponding to position 43 is preferably Gly.
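The positions and preferred residues listed above can be collected in a small lookup table. The following is only an illustrative sketch: the dictionary layout, the substrate labels, and the function name are assumptions, and the third TrpRS position is taken to be 153, matching the positions listed for TrpRS. Where the text gives a "more preferably" residue it is placed first; for PheRS position 169 no ranking is stated, so the order there carries no meaning.

```python
# Positions and preferred residues as listed in the paragraphs above.
# Layout and names are illustrative assumptions, not part of the disclosure.
PREFERRED_RESIDUES = {
    "N-methyl Val": {43: ["Gly", "Ala"], 45: ["Ser"], 279: ["Ala", "Gly"]},
    "N-methyl Ser": {237: ["Ser"], 239: ["Gly", "Ala"]},
    "N-methyl Phe": {169: ["Gly", "Ala"]},  # no ranking stated in the text
    "N-methyl Thr": {332: ["Gly"], 511: ["Gly"]},
    "N-methyl Trp": {132: ["Val", "Ala"], 150: ["Ala"], 153: ["Ala"]},
    "N-methyl Leu": {43: ["Gly"]},
}


def first_listed_residue(substrate: str, position: int) -> str:
    """Return the first-listed preferred residue for a substrate and position."""
    try:
        return PREFERRED_RESIDUES[substrate][position][0]
    except KeyError:
        raise KeyError(f"no preferred residue recorded for {substrate} at {position}")


val_279 = first_listed_residue("N-methyl Val", 279)
ser_239 = first_listed_residue("N-methyl Ser", 239)
```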
Length of a partial fragment of polynucleotides encoding the polypeptides as described herein having reactivity with N-methyl amino acids is not particularly limited, but is for example, at least 15 consecutive bases, preferably 16 or more consecutive bases, more preferably 17 or more consecutive bases, more preferably 18 or more consecutive bases, more preferably 20 or more consecutive bases, more preferably 25 or more consecutive bases, more preferably 28 or more consecutive bases, more preferably 30 or more consecutive bases, more preferably 32 or more consecutive bases, more preferably 35 or more consecutive bases, more preferably 40 or more consecutive bases, and more preferably 50 or more consecutive bases.
In addition to the partial fragment of polynucleotides encoding ARSs as described above, oligonucleotides may also further comprise (an) oligonucleotide(s) consisting of other sequences at its/their both or either end(s) (5' and/or 3' end). An oligonucleotide in accordance with the disclosure is, for example, 500 bases or less in length, more preferably 300 bases or less in length, more preferably 200 bases or less in length, more preferably 100 bases or less in length, more preferably 70 bases or less in length, more preferably 60 bases or less in length, and more preferably 50 bases or less in length.
The disclosed oligonucleotides are useful for producing nucleic acids encoding the polypeptides as described herein (e.g., useful for introducing mutations), and also useful for detecting nucleic acids encoding the polypeptides as described herein. For example, an oligonucleotide as described herein can also be used as a probe in DNA hybridization or RNA hybridization operations. An example of a DNA for use as a probe is a partial base sequence of 20 or more consecutive bases hybridizing with a herein described polynucleotide, preferably a partial base sequence of 30 or more consecutive bases, more preferably a partial base sequence of 40 or 50 or more consecutive bases, more preferably a partial base sequence of 100 or more consecutive bases, more preferably a partial base sequence of 200 or more consecutive bases, and particularly preferably a partial base sequence of 300 or more consecutive bases.
The herein described polypeptides, polynucleotides, and oligonucleotides can be included in compositions in combination with carriers or vehicles, appropriately. The compositions can be produced using any method known to those skilled in the art. The polypeptides, polynucleotides, and oligonucleotides as described herein can be appropriately combined with, for example, pharmacologically acceptable carriers or vehicles, specifically, sterile water, physiological saline, vegetable oil, emulsifying agents, suspending agents, surfactants, stabilizers, flavoring agents, excipients, vehicles, preservatives, bonding agents, or the like. These can be mixed together to be formulated in unit dosage form required by generally-accepted pharmaceutical practices.
Further disclosed are cells transformed with a polynucleotide (including DNA and RNA) encoding a mutant N-methyl aminoacyl-tRNA synthetase as described above. Such cells may be prokaryotic cells or eukaryotic cells.
When a herein described mutant N-methyl aminoacyl-tRNA synthetase expressed in cells is used directly for protein synthesis in those cells, cells suited to the intended use can be selected. Any known method can be adopted for the transformation.
Further disclosed is a method for producing polypeptide containing N-methyl amino acids using the ARS with an altered amino acid sequence as described above.
The ARSs with altered amino acid sequences as described herein are used not only in cells but also in vitro (in a cell-free system).
In either case, unlike prior art using chemical synthesis such as the pdCpA method, tRNAs can be acylated repeatedly during a translation reaction, and therefore N-methyl aminoacyl-tRNAs can be supplied continuously, and addition of a large amount of tRNAs, which may inhibit translation, can be avoided.
Accordingly, disclosed is a method for producing non-natural amino acids efficiently, selectively, in particular regioselectively, and in large amounts.

A modified ARS as described herein can be produced using any known genetic recombination technique as mentioned above, and generally can be prepared as follows: first, site-directed mutation is introduced into a specific position or a specific amino acid by PCR using a plasmid comprising the natural ARS gene as a template with appropriate primers. The template plasmid is digested with a restriction enzyme, and then E. coli or the like is transformed with the digested plasmid. The plasmid into which the desired mutation has been introduced is cloned.
If a second amino acid mutation is introduced, subsequently the introduction of site-directed mutation using the plasmid having the mutation introduced into the specific position as a template with appropriate primers is repeated as with the description above, which results in the construction of a plasmid DNA encoding the polypeptide in which the two amino acids are substituted. A similar procedure can be followed when introducing further amino acid mutations.
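The stepwise introduction of mutations described above can be sketched at the sequence level as successive codon replacements in an open reading frame. This is only a toy illustration of the bookkeeping, not the PCR procedure itself; the function name, the short ORF, and the chosen codons are hypothetical.

```python
def substitute_codon(orf: str, aa_position: int, new_codon: str) -> str:
    """Return the ORF with the codon at a 1-based amino acid position replaced."""
    if len(new_codon) != 3:
        raise ValueError("a codon must be exactly three bases")
    start = (aa_position - 1) * 3
    if start < 0 or start + 3 > len(orf):
        raise ValueError("amino acid position lies outside the ORF")
    return orf[:start] + new_codon + orf[start + 3:]


# Hypothetical toy ORF: Met-Ala-Thr-Val-stop.
toy_orf = "ATGGCTACCGTTTAA"

# First round: one substitution (Ala2 -> Gly).
single_mutant = substitute_codon(toy_orf, 2, "GGT")

# Second round: the single mutant serves as the template (Val4 -> Ala).
double_mutant = substitute_codon(single_mutant, 4, "GCT")
```

Using the output of the first round as the input of the second mirrors the use of the already mutated plasmid as the template for the next mutation.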
E. coli BL21 strain or the like is co-transformed with the constructed DNA and the plasmid pREP4 encoding lac repressor (LacI) or the like, and the obtained transformed strain is isolated and cultured followed by expression induction with IPTG. The obtained strain is then disrupted, and the supernatant is passed through an affinity column for His-tag to purify the mutant ARS.
Alternatively, the following method can be used to prepare a mutant ARS. Specifically, a sequence of a certain mutant ARS is genetically synthesized and inserted into an expression vector, and proteins are allowed to be expressed. Affinity columns for different purification tags are used to purify the protein of interest.
Methods for producing a mutant N-methyl aminoacyl-tRNA synthetase as described herein, which is modified by substituting an amino acid at a specific position with another amino acid, are not limited to the methods as described above, and various genetic engineering techniques including known point mutation techniques and gene synthesis techniques, methods for introducing modified fragments using restriction enzymes, and the like may be used. Also, expression is not limited to expression in E. coli, and animal cells and cell-free translation systems may also be used. Purification methods are also not limited to methods in an affinity column for polyhistidine, and various peptide-tags and purification columns may be used.

Substrate specificity of the obtained modified products can be confirmed by any of the following three assay methods, for example.
In the first method, a translation reaction is performed in a cell-free translation system reconstituted with the modified ARS in place of the wild-type ARS and with N-methyl amino acids in place of the natural amino acids, so that aminoacylation with the N-methyl amino acids takes place in the translation system; mass spectrometry is then used to confirm that the N-methyl amino acids corresponding to the codons on the mRNA have been introduced into the peptide.
In the second method, peptides to be produced are labeled in a translation experiment using amino acids labeled with radioactive isotopes or fluorescent molecules, and the peptides are separated and visualized by electrophoresis or an analytical column to estimate yield of the peptides. In this method, translational introduction efficiency of corresponding N-methyl amino acids provided by using the modified ARS is increased compared to that provided by using wild-type ARS, and therefore the radioactivity or fluorescence, which is observed by electrophoresis or chromatogram, of the peptide produced using the modified ARS is observed more strongly than that of the peptide produced using wild-type ARS.
In the third method, three molecules, a modified ARS, a tRNA and an N-methyl amino acid corresponding to the modified ARS are reacted together in vitro, and the resulting N-methyl aminoacyl-tRNA is separated from unreacted tRNA by electrophoresis to quantify efficiency of the acylation reaction. The amount of acylated tRNA detected is larger with the modified ARS than with wild-type ARS.
As mentioned above, the modified ARSs as described herein have increased reactivity with N-methyl amino acids compared to unmodified ARSs. The increased reactivity with N-methyl amino acids may be an increase in reaction rate, an increase in the amount of reaction products acylated with N-methyl amino acids, or an increase in substrate specificity to N-methyl amino acids. The increased reactivity with N-methyl amino acids may also be qualitative or quantitative. For example, the reactivity with N-methyl amino acids is considered to have increased in a reaction performed under the same conditions except for the use of an unmodified ARS or a modified ARS when the reactivity (such as reaction rate or the amount of reaction products) of the modified ARS to the substrate N-methyl amino acid is significantly increased compared to the reactivity of the unmodified ARS to N-methyl amino acids. Even if the reaction rate or reaction product amount in a reaction using the modified ARS and an N-methyl amino acid as substrate does not significantly increase compared to that in a reaction using the unmodified ARS, the reactivity of the modified ARS to N-methyl amino acids is considered to have increased when substrate specificity of the modified ARS to the N-methyl amino acid is increased relative to that in a reaction using unmodified amino acids. Preferably, the modified ARSs as described herein increase reaction rate or reaction product amount in a reaction using N-methyl amino acids as substrates compared to unmodified ARSs.
For example, the modified ARSs as described herein significantly increase the amount of tRNA aminoacylated with N-methyl amino acids or the amount of peptides incorporating N-methyl amino acids, as measured under the same conditions except for using either unmodified ARS or modified ARS, preferably by at least 10%, preferably by 20%, preferably by 1.3 times or more, preferably by 1.5 times or more, preferably by 2 times or more, preferably by 3 times or more, further preferably by 5 times or more, further preferably by 10 times or more, further preferably by 20 times or more, further preferably by 30 times or more, further preferably by 50 times or more, and further preferably by 100 times or more. Alternatively, the modified ARSs as described herein significantly increase the amount ratio of "products of the reaction with N-methyl amino acids / products of the reaction with unmodified amino acids", as measured under the same conditions except for using either unmodified ARS or modified ARS, preferably by at least 10%, preferably by 20%, preferably by 1.3 times or more, preferably by 1.5 times or more, preferably by 2 times or more, preferably by 3 times or more, further preferably by 5 times or more, further preferably by 10 times or more, further preferably by 20 times or more, further preferably by 30 times or more, further preferably by 50 times or more, further preferably by 100 times or more compared to unmodified ARSs. 
Alternatively, when a nucleic acid encoding a polypeptide containing consecutive N-methyl amino acids is allowed to be translated, the modified ARSs as described herein significantly increase the production amount of the polypeptide containing consecutive N-methyl amino acids, as measured under the same conditions except for using either unmodified ARS or modified ARS, preferably by at least 10%, preferably by 20%, preferably by 1.3 times or more, preferably by 1.5 times or more, preferably by 2 times or more, preferably by 3 times or more compared to unmodified ARSs. The consecutive N-methyl amino acids may be, for example, two and/or three consecutive N-methyl amino acids.
The reactivity with N-methyl amino acids can be determined according to a confirmation method as described above. Specifically, for example, ribosomally-produced peptides are electrophoresed, and band intensity of peptides that have incorporated N-methyl amino acids and peptides that have not incorporated N-methyl amino acids can be measured qualitatively or quantitatively to determine the reactivity. Alternatively, mass spectral peaks can be measured, and peaks of peptides that have incorporated N-methyl amino acids and peaks of peptides that have not incorporated N-methyl amino acids can be measured to determine the reactivity.
For example, substrate specificity of the modified ARSs as described herein is significantly increased, preferably by at least 10%, preferably by 20%, preferably by 1.3 times or more, preferably by 1.5 times or more, preferably by 2 times or more, preferably by 3 times or more, further preferably by 5 times or more, further preferably by 10 times or more, and further preferably by 20 times or more compared to unmodified ARSs when peptides are synthesized in the presence of unmodified amino acids or N-methyl amino acids and the amount ratio of "products of the reaction with N-methyl amino acids / products of the reaction with unmodified amino acids" is measured.
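The comparison of the amount ratio between a modified and an unmodified ARS amounts to simple arithmetic, which may be sketched as follows. The function names and the band intensities are hypothetical and serve only to illustrate the calculation; they are not measured values.

```python
def product_ratio(n_methyl_product: float, unmodified_product: float) -> float:
    """Ratio of N-methyl amino acid products to unmodified amino acid products."""
    if unmodified_product <= 0:
        raise ValueError("amount of unmodified product must be positive")
    return n_methyl_product / unmodified_product


def fold_increase(modified_ars_ratio: float, unmodified_ars_ratio: float) -> float:
    """Fold change of the product ratio for the modified ARS versus the unmodified ARS."""
    if unmodified_ars_ratio <= 0:
        raise ValueError("ratio for the unmodified ARS must be positive")
    return modified_ars_ratio / unmodified_ars_ratio


# Hypothetical band intensities (arbitrary units), same conditions for both enzymes.
wild_type_ratio = product_ratio(n_methyl_product=20.0, unmodified_product=80.0)
modified_ratio = product_ratio(n_methyl_product=60.0, unmodified_product=40.0)

fold = fold_increase(modified_ratio, wild_type_ratio)

# "By at least 10%" corresponds to a fold change of 1.1 or more.
meets_minimum = fold >= 1.1
```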
The reaction conditions may be appropriately determined as long as the conditions used for unmodified ARSs and for modified ARSs are the same. The substrate concentration of unmodified amino acids and N-methyl amino acids at the reaction may be appropriately adjusted, and may be any concentration condition as long as reactivity with N-methyl amino acids is increased. Preferably, unmodified amino acids may not be added (may be only endogenous amino acids originally contained in a cell-free translation system), or may be appropriately adjusted in the range from 0.1 µM to 1 mM, for example, 0.1 µM to 500 µM, 0.1 µM to 250 µM, 0.1 µM to 100 µM, or 0.1 µM to 50 µM for reactions. N-methyl amino acids may be appropriately adjusted in the range from, for example, 50 µM to 10 mM, for example, 100 µM to 5 mM, 200 µM to 2 mM, or 500 µM to 1 mM for reactions.
Modified ARSs obtained by the preparation method as described above can be used to produce peptides and peptide-mRNA fusions that have incorporated N-methyl amino acids in a site-specific manner.
Moreover, ARSs used herein are highly conserved among biological species. Therefore, it is clear that the herein described method can be generally applied to modification of N-methyl amino acid-tRNA synthetases from other biological species.
The modified ARS as described herein can be used to produce peptides having a particular amino acid substituted with its non-natural, N-methylated amino acid, in a prokaryotic translation. Such peptides can be produced using ARSs that are derived from other organisms and modified in a similar way as disclosed herein.

Use of the modified ARSs as described herein makes it possible to employ N-methyl amino acids as substrates in acylation of tRNAs. The modified ARSs as described herein can be used to acylate tRNAs with N-methyl amino acids and to ribosomally produce N-methyl amino acid-containing peptides.
In order to produce N-methyl aminoacyl-tRNAs using the modified ARSs as described herein, the modified ARSs, N-methyl amino acids corresponding to the modified ARSs, and tRNAs may be reacted in vitro as in chemical synthesis such as the pdCpA method. The products, N-methyl aminoacyl-tRNAs may be isolated using any known nucleic acid purification technique such as ethanol precipitation, and the isolated product may be added to a translation system. This leads to production of polypeptides or polypeptide-mRNA fusions that have introduced N-methyl amino acids at the intended positions.
Unlike chemical synthesis which requires reactions performed in a reaction solution comprising only a particular tRNA and the N-methyl amino acid corresponding to the tRNA as substrates, three molecules, a modified ARS, N-methyl amino acid, and tRNA are used to precisely and efficiently produce the intended N-methyl aminoacyl-tRNA even in a mixture of translation reaction solution comprising other tRNAs and other amino acids because the modified ARSs as described herein have high substrate specificity for tRNAs and amino acids. Accordingly, the isolation and purification steps as described above are not essential. A reaction solution in which tRNAs have been acylated with N-methyl amino acids can be directly used in a translation reaction, or a translation reaction can be performed at the same time as an acylation reaction of tRNAs with N-methyl amino acids. The modified ARSs as described herein are very convenient because the reactions using the modified ARSs require no chemical synthesis of substrates such as pdCpA amino acids and activated amino acids and can be performed using commercially available reagents.
The most important characteristic is that no stoichiometric consideration is required for the tRNA of interest when performing peptide translation (peptide expression) using a modified ARS as described herein. In chemical synthesis in which aminoacyl-tRNAs are supplied, N-methyl aminoacyl-tRNAs are consumed during the reaction, decreasing the reaction efficiency. However, if ARSs are used, translational efficiency is good even in peptide synthesis in which multiple N-methyl amino acids are introduced. Aminoacyl-tRNAs are constantly deacylated in hydrolytic reactions or transpeptidation reactions in translation systems, but the modified ARSs as described herein recognize released tRNAs and newly acylate the tRNAs with N-α amino acids, which results in a constant supply of N-α aminoacyl-tRNAs in the translation system. Therefore, the amount of tRNA required is less than the amount of produced polypeptide. Addition of large amounts of tRNA itself contributes to reduction of peptide yield. Therefore, the modified ARSs as described herein are useful for producing N-α amino acid-containing peptides at a high translational efficiency and producing highly diverse peptide libraries.
Moreover, because the modified ARSs as described herein have increased reactivity with N-methyl amino acids compared to natural ARSs, the modified ARSs can acylate tRNAs in the presence of N-methyl amino acids at a concentration lower than a concentration of N-methyl amino acids used for natural ARSs. In other words, the absolute amount of N-methyl amino acids required for peptide expression using the modified ARSs is not as much as that of N-methyl amino acids required for peptide expression using natural ARSs.
In addition, the modified ARSs as described herein can be used to translate N-methyl amino acid-containing peptides at a concentration lower than a concentration of natural ARSs used for translating N-methyl amino acid-containing peptides. In other words, the absolute amount of the modified ARS as described herein required for peptide expression is significantly less than in translation of N-methyl amino acid-containing peptides using natural ARSs.
These characteristics are important for improving orthogonality of the aminoacylation reaction. Namely, when translating N-methyl amino acid-containing peptides, the higher the concentration of the substrate N-methyl amino acid, or of the ARS corresponding to it, must be relative to the other natural amino acids, or to the ARSs corresponding to them, the more readily non-specific reactions occur. Such non-specific reactions include the following: an ARS that should normally use an N-methyl amino acid as a substrate may use other natural amino acids in the acylation reaction; or an N-methyl amino acid that is present in excess may be used in acylation reactions by ARSs corresponding to other natural amino acids. As a result, the translated peptides obtained are a mixture of N-methyl amino acid-containing peptides and peptides containing the corresponding natural amino acid, which may cause issues for peptide libraries. However, it is expected that this issue can be minimized by using the modified ARSs as described herein.
Accordingly, the characteristics of the modified ARSs disclosed herein can be described as follows:
modified ARSs that catalyze acylation of tRNAs and comprise

(a) a tRNA binding site;
(b) a binding site to an N-methyl amino acid substrate; and
(c) a catalytically active site having activity to catalyze a reaction in which an acyl group is transferred from the N-methyl amino acid substrate to 3' end of a tRNA.

Moreover, the modified ARSs may comprise, in addition to (a), (b), and (c) as described above, (d) an editing site where a tRNA acylated with any undesired amino acid is hydrolyzed.

The modified ARSs as described herein can be used to synthesize tRNAs acylated with desired N-methyl amino acid substrates.
A method for producing acylated tRNAs using the modified ARSs as described herein comprises the following steps:

(a) providing one or more modified ARSs as described herein;
(b) providing tRNAs;
(c) providing N-methyl amino acids; and
(d) contacting the modified ARSs with the tRNAs and N-methyl amino acids to acylate the tRNAs.

In addition to the steps as described above, the method may further comprise the step of (e) collecting the reaction product comprising the acylated tRNAs. The acylated tRNAs require no purification in the collecting step, and the reaction mixture can be collected and used directly. By not separating or purifying the produced aminoacyl-tRNAs from the ARSs, deacylation can be prevented.
This method uses N-methyl amino acids as substrates. Particularly preferable N-methyl amino acids include N-methylphenylalanine, N-methylvaline, N-methylthreonine, N-methyltryptophan, N-methylleucine, and/or N-methylserine. The N-methyl amino acid used as a substrate is appropriately selected according to the substrate of the ARS (the amino acid corresponding to the ARS).
In a method for producing an acylated tRNA using a modified ARS as described herein, a tRNA that can be used is a tRNA corresponding to an ARS of a corresponding natural amino acid. The term "corresponding natural amino acid" refers to an amino acid that is not N-methylated relative to an N-methyl amino acid. For example, for N-methylphenylalanine, the corresponding amino acid is phenylalanine, and the corresponding tRNA is a tRNA recognizing the codon UUU or UUC (having the anticodon corresponding to the codon). For N-methylvaline, the corresponding amino acid is valine, and the corresponding tRNA is a tRNA recognizing the codon GUU, GUC, GUA, or GUG (having the anticodon corresponding to the codon). For N-methylserine, the corresponding amino acid is serine, and the corresponding tRNA is a tRNA recognizing the codon UCU, UCC, UCA, UCG, AGU, or AGC (having the anticodon corresponding to the codon). For N-methylthreonine, the corresponding amino acid is threonine, and the corresponding tRNA is a tRNA recognizing the codon ACU, ACC, ACA, or ACG (having the anticodon corresponding to the codon). For N-methyltryptophan, the corresponding amino acid is tryptophan, and the corresponding tRNA is a tRNA recognizing the codon UGG (having the anticodon corresponding to the codon). For N-methylleucine, the corresponding amino acid is leucine, and the corresponding tRNA is a tRNA recognizing the codon UUA, UUG, CUU, CUC, CUA, or CUG (having the anticodon corresponding to the codon). Similarly, for other N-methyl amino acids, tRNAs recognizing codons corresponding to the corresponding amino acid (having the anticodons corresponding to the codons) can be used.
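The codon assignments listed above can be written as a lookup table and used to find the positions in an mRNA that would be decoded by the corresponding tRNAs. The following is a minimal sketch; the short residue labels (MePhe and so on), the function name, and the example mRNA are assumptions introduced for illustration.

```python
# Codons recognized by the tRNAs corresponding to each N-methyl amino acid,
# as listed in the text. The short labels are illustrative abbreviations.
N_METHYL_CODONS = {
    "MePhe": ["UUU", "UUC"],
    "MeVal": ["GUU", "GUC", "GUA", "GUG"],
    "MeSer": ["UCU", "UCC", "UCA", "UCG", "AGU", "AGC"],
    "MeThr": ["ACU", "ACC", "ACA", "ACG"],
    "MeTrp": ["UGG"],
    "MeLeu": ["UUA", "UUG", "CUU", "CUC", "CUA", "CUG"],
}

CODON_TO_RESIDUE = {
    codon: residue
    for residue, codons in N_METHYL_CODONS.items()
    for codon in codons
}


def candidate_sites(mrna: str):
    """List (1-based codon index, codon, residue) for codons found in the table."""
    usable = len(mrna) - len(mrna) % 3  # ignore an incomplete trailing codon
    codons = [mrna[i:i + 3] for i in range(0, usable, 3)]
    return [
        (index, codon, CODON_TO_RESIDUE[codon])
        for index, codon in enumerate(codons, start=1)
        if codon in CODON_TO_RESIDUE
    ]


# Hypothetical mRNA: AUG UUU GGC UGG UAA
sites = candidate_sites("AUGUUUGGCUGGUAA")
```

In an actual translation reaction only the codons of those amino acids whose ARS has been exchanged would be decoded as N-methyl amino acids; the function simply reports every codon present in the table.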
When a tRNA is acylated by a modified ARS in a solution, a pellet obtained by ethanol precipitation of the reaction solution may be dissolved in a suitable buffer (such as 1 mM potassium acetate, pH 5) and added to a translation system. Typical reaction conditions include, for example, a reaction performed at 37°C for 5 minutes to 1 hour in 0.1 M reaction buffer (pH 7.5) containing a final concentration of 0.5 µM to 40 µM of a tRNA, 0.1 µM to 10 µM of a modified ARS as described herein, 0.1 mM to 10 mM of an N-methyl amino acid, 0.1 mM to 10 mM ATP, and 0.1 mM to 10 mM MgCl₂.
Furthermore, for an aminoacylation reaction, a tRNA can be refolded by, for example, heating a solution of 1 to 50 µM tRNA, 10 to 200 (e.g., 50 to 200) mM HEPES-K (pH 7.0 to 8.0 (e.g., 7.6)), and 1 to 100 (e.g., 10) mM KCl at 95°C for 2 minutes and then leaving the solution at room temperature for 5 minutes or more. This tRNA solution can be added to an acylation buffer (a final concentration of 25 to 100 (e.g., 50) mM HEPES-K [pH 7.0 to 8.0 (e.g., 7.6)], 1 to 10 (e.g., 2) mM ATP, 10 to 100 (e.g., 100) mM potassium acetate, 1 to 20 (e.g., 10) mM magnesium acetate, 0.1 to 10 (e.g., 1) mM DTT, 0.1 mg/mL Bovine Serum Albumin) to a final concentration of 1 to 40 (e.g., 10) µM, mixed with a modified ARS (a final concentration of 0.1 to 10 (e.g., 0.5) µM) and an N-methyl amino acid (a final concentration of 0.1 to 10 (e.g., 1) mM), and incubated at 37°C for 5 to 60 (e.g., 10) minutes.
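As an illustration of how the example final concentrations above translate into pipetting volumes, the following sketch applies the dilution relation C1·V1 = C2·V2 to each component. All stock concentrations, the 20 µL reaction volume, and the names used are hypothetical assumptions; the final concentrations are the "e.g." values given in the text. For simplicity the sketch ignores the HEPES-K and KCl carried over with the refolded tRNA solution.

```python
def mix_volumes(total_ul, components):
    """Compute the volume of each stock (µL) for the given final concentrations.

    components maps a name to (stock concentration, final concentration),
    both expressed in the same unit for that component.
    """
    volumes = {
        name: total_ul * final / stock
        for name, (stock, final) in components.items()
    }
    water = total_ul - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks are too dilute for the requested total volume")
    volumes["water"] = water
    return volumes


# (stock, final): finals follow the example values in the text,
# stocks are hypothetical.
ACYLATION_MIX = {
    "HEPES-K pH 7.6 (mM)": (1000.0, 50.0),
    "ATP (mM)": (100.0, 2.0),
    "potassium acetate (mM)": (1000.0, 100.0),
    "magnesium acetate (mM)": (100.0, 10.0),
    "DTT (mM)": (100.0, 1.0),
    "BSA (mg/mL)": (10.0, 0.1),
    "tRNA (µM)": (100.0, 10.0),
    "modified ARS (µM)": (10.0, 0.5),
    "N-methyl amino acid (mM)": (10.0, 1.0),
}

volumes = mix_volumes(20.0, ACYLATION_MIX)
```

The remaining volume is assigned to water, so the volumes always sum to the requested total.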
Thus, the acylation reaction using a modified ARS as described herein requires no activated amino acid substrates that would need to be synthesized. The acylation reaction can be performed with commercially available N-methyl amino acids and therefore is convenient. The modified ARSs as described herein can be combined with their substrates to form a kit product for obtaining acylated tRNAs. The kit may at least comprise (a) one or more modified ARSs as described herein, (b) N-methyl amino acid(s), and (c) tRNA(s), and may further comprise a reaction buffer, a reaction vessel, instructions for use, and the like. In the kit, each of (b) N-methyl amino acid(s) and (c) tRNA(s) acts as a substrate for (a) modified ARSs. In other words, the tRNAs recognize codons corresponding to natural amino acids equivalent to the N-methyl amino acids (the tRNAs have anticodons corresponding to the codons).

N-methyl amino acid-bound tRNAs can be used to produce polypeptides with N-methyl amino acids introduced into desired sites. The method comprises translating a nucleic acid encoding a polypeptide of interest in the presence of a modified ARS as described herein.
More specifically, for example, a method for producing an N-methyl amino acid-containing polypeptide using a modified ARS(s) as described herein comprises (a) providing the modified ARS(s) as described herein, (b) establishing a cell-free translation system reconstituted with the modified ARS(s) as described herein rather than wild-type ARS(s), (c) providing an mRNA having at a desired site(s) codon(s) corresponding to the anticodon(s) of a tRNA(s) that is the substrate(s) of the modified ARS(s), and (d) adding the mRNA to the cell-free translation system to produce a polypeptide with an N-methyl amino acid(s) introduced into a desired site(s). Matters particularly relevant to the production of polypeptide in (d) will be described below.
N-methyl amino acids preferably used in acylation of tRNAs with N-methyl amino acids using the modified ARSs as described herein include N-methylalanine, N-methylleucine, N-methyltryptophan, N-methylphenylalanine, N-methylvaline, N-methylthreonine, and/or N-methylserine, and particularly preferably N-methylphenylalanine, N-methylvaline, N-methylthreonine, N-methyltryptophan, N-methylleucine, and/or N-methylserine.
Specific methods for polypeptide synthesis may be essentially performed according to known methods, for example, performed as described in WO2013100132, and various modifications can be made. Generally, the methods can be performed according to the following description.
A suitable translation system that can be used is a cell-free translation system, typified by PURESYSTEM (Registered trademark) (BioComber, Japan), reconstituted with translation factors. In such a cell-free translation system, components of the translation system can be controlled flexibly. For example, phenylalanine and the ARS corresponding to phenylalanine are removed from the translation system and instead, N-methylphenylalanine and the modified phenylalanine-ARS as described herein can be added. This achieves introduction of N-methylphenylalanine into a codon, such as UUU and UUC, encoding phenylalanine in a site-specific manner.
In a cell-free translation system, ribonucleosides preferably used are ATP and GTP at 0.1 mM-10 mM. A buffer preferably used is HEPES-KOH at 5 mM-500 mM and pH 6.5-8.5. Examples of other buffers include, but are not limited to, Tris-HCl, phosphate, and the like. Salts that can be used are acetates such as potassium acetate and ammonium acetate, and glutamates such as potassium glutamate, which are preferably used at 10 mM-1000 mM. A magnesium component preferably used is magnesium acetate at 2 mM-200 mM. Examples of other magnesium components include, but are not limited to, magnesium chloride and the like. Components in an energy-regenerating system preferably used are creatine kinase at 0.4 µg/mL-40 µg/mL and creatine phosphate at 2 mM-200 mM. Other energy-regenerating systems, typified by pyruvate kinase and phosphoenol pyruvate, may also be used. A nucleoside converting enzyme preferably used is myokinase at 0.1 unit/mL-10 unit/mL or nucleoside diphosphate kinase at 0.2 µg/mL-20 µg/mL. A diphosphatase preferably used is inorganic pyrophosphatase at 0.2 unit/mL-20 unit/mL. A polyamine preferably used is spermidine at 0.2 mM-20 mM. Examples of other polyamines include, but are not limited to, spermine and the like. A reducing agent preferably used is dithiothreitol at 0.1 mM-10 mM. Examples of other reducing agents include, but are not limited to, β-mercaptoethanol and the like. A tRNA preferably used is, for example, E. coli MRE600 (RNase-negative)-derived tRNA (Roche) at 0.5 mg/mL-50 mg/mL. Other tRNAs from E. coli may also be used. A formyl donor and an enzyme preferably used to synthesize formylmethionine used in a translational initiation reaction are 10-HCO-H4 folate at 0.1 mM-10 mM and methionyl-tRNA transformylase at 0.05 µM-5 µM. A translation initiation factor preferably used is IF1 at 0.5 µM-50 µM, IF2 at 0.1 µM-50 µM, or IF3 at 0.1 µM-50 µM. 
A translation elongation factor preferably used is EF-G at 0.1 µM-50 µM, EF-Tu at 1 µM-200 µM, or EF-Ts at 1 µM-200 µM. A translation termination factor preferably used is RF2, RF3, or RRF at 0.1 µM-10 µM. Ribosome is preferably used at 1 µM-100 µM. There are 20 types of aminoacyl-tRNA synthetases, but only the enzymes corresponding to amino acids included in a peptide to be synthesized may be added. For example, ArgRS, AspRS, LysRS, MetRS, and TyrRS are all preferably used at 0.01 µM-1 µM. Amino acids used as substrates for peptide synthesis are the 20 natural amino acids, which compose proteins, and derivatives thereof. Only amino acids included in a peptide to be synthesized are preferably used at 0.25 mM-10 mM. An mRNA as a template for peptide synthesis is preferably used at 0.1 µM-10 µM. When an mRNA is transcribed from a template DNA in a cell-free translation system, commercially available enzymes such as T7 RNA polymerase, T3 RNA polymerase, and SP6 RNA polymerase can be used, and the enzymes suitable for a promoter sequence in the template DNA may be appropriately selected and are preferably used at 1 µg/mL-100 µg/mL. In this case, nucleosides CTP and UTP, which are substrates, are preferably used at 0.1 mM-10 mM. A solution containing a mixture of these components can be left, for example, at 37°C for 1 hour to achieve translational synthesis of peptides. Temperature and reaction time are not limited to those.
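The preferred concentration ranges given above lend themselves to a simple consistency check of a planned reaction. The sketch below records a handful of those ranges and flags any component of a recipe that falls outside them; the selection of components, the function name, and the example recipe are hypothetical and chosen only for illustration.

```python
# (low, high, unit) copied from the preferred ranges in the text.
# Only a subset of components is recorded here.
PREFERRED_RANGES = {
    "ATP": (0.1, 10.0, "mM"),
    "GTP": (0.1, 10.0, "mM"),
    "HEPES-KOH": (5.0, 500.0, "mM"),
    "magnesium acetate": (2.0, 200.0, "mM"),
    "creatine phosphate": (2.0, 200.0, "mM"),
    "spermidine": (0.2, 20.0, "mM"),
    "dithiothreitol": (0.1, 10.0, "mM"),
    "ribosome": (1.0, 100.0, "µM"),
    "mRNA": (0.1, 10.0, "µM"),
}


def out_of_range(recipe):
    """Return a message for each component outside its recorded range."""
    problems = []
    for name, value in recipe.items():
        if name not in PREFERRED_RANGES:
            problems.append(f"{name}: no range recorded")
            continue
        low, high, unit = PREFERRED_RANGES[name]
        if not low <= value <= high:
            problems.append(f"{name}: {value} {unit} outside {low}-{high} {unit}")
    return problems


# Hypothetical recipe with one deliberately low magnesium concentration.
example_recipe = {
    "ATP": 2.0,
    "GTP": 2.0,
    "HEPES-KOH": 50.0,
    "magnesium acetate": 1.0,
    "ribosome": 1.2,
}
issues = out_of_range(example_recipe)
```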
The modified ARSs as described herein can also be used in combination with other techniques for introducing non-natural amino acids, such as the pdCpA method or the Flexizyme method. For example, polypeptides containing both N-methylglycine and N-methylphenylalanine can be synthesized when the aminoacyl-tRNA to which N-methylglycine is attached by the pdCpA method is added to a translation system at the same time as addition of N-methylphenylalanine and the modified phenylalanine ARS.
Alternatively, polypeptides containing N-methyl amino acids can be expressed in cells by inserting the modified ARSs as described herein into an expression vector or genome, expressing the modified ARSs in cells, and using N-methyl amino acids added to a medium as substrates.
tRNAs that can be used include a tRNA for a natural amino acid corresponding to an N-methyl amino acid. The tRNAs may be tRNAs that contain modified bases and are purified from a living body, or tRNAs that do not contain modified bases and are produced using an in vitro transcription reaction. Mutant tRNAs, which have a mutation in a portion of tRNA other than the portion recognized by an ARS, can also be used as substrates.

Examples

Example 1: N-methylphenylalanine-accepting ARS

Using a plasmid (pQE-32(2) 2_wtPheRS) comprising the ORF sequence of wild-type PheRS α subunit gene of E. coli (SEQ ID NOs: 27, 28) as starting material, mutant PheRS plasmids (having His-tag (6 x His) at the N-terminus) listed in Table 1 were constructed by introducing site-directed mutations using PCR. Specifically, 2 µL of 10 ng/µL template, 10 µL of 2 x KOD Fx buffer (TOYOBO, KFX-101), 0.6 µL of 10 µM forward primer, 0.6 µL of 10 µM reverse primer, 4 µL of 2 mM dNTP, 0.4 µL of KOD FX (TOYOBO, KFX-101), and 2.4 µL of H₂O were mixed together. Thereafter, the resulting reaction solution was heated at 94°C for 2 minutes and then subjected to 10 cycles, each consisting of heating at 98°C for 10 seconds and heating at 68°C for 7 minutes, to amplify the mutant gene. The combinations of the template plasmid, forward primer, and reverse primer used are listed in Table 2. Each sequence of primers "F.F02" through "F.F05" corresponds to SEQ ID NOs: 30-33 in ascending order, and each sequence of primers "R.F02" through "R.F05" corresponds to SEQ ID NOs: 34-37 in ascending order. 0.5 µL of 10 U/µL DpnI was then added to the PCR reaction solution and further incubated at 37°C for 1.5 hours to digest the template DNA, and the resulting mutant DNA was purified. E. coli XL-1 Blue strain (STRATAGENE, 200236) was then co-transformed with the resulting mutant gene DNA and pREP4 (Invitrogen, V004-50) encoding the lacI gene. The transformants were seeded onto agar containing ampicillin and kanamycin.
The plasmids of interest were purified from the resulting clones. The mutations were confirmed to be introduced into the plasmids.
[Table 2]

Name	Template	5' primer	3' primer
PheRS02	PheRS01 (wt)	F.F02	R.F02
PheRS03	PheRS01 (wt)	F.F03	R.F03
PheRS04	PheRS01 (wt)	F.F04	R.F04
PheRS05	PheRS01 (wt)	F.F05	R.F05
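As an arithmetic check on the mutagenesis PCR mix described above, the component volumes can be summed and the final concentrations derived by simple dilution. This is a minimal sketch; the dictionary layout and function names are illustrative assumptions, and the polymerase stock concentration is left unspecified because the text does not state it.

```python
# (volume in uL, stock concentration, unit) for each component of the mix.
# None marks components whose stock concentration is not given in the text.
PCR_MIX = {
    "template": (2.0, 10.0, "ng/uL"),
    "KOD FX buffer": (10.0, 2.0, "x"),
    "forward primer": (0.6, 10.0, "uM"),
    "reverse primer": (0.6, 10.0, "uM"),
    "dNTP": (4.0, 2.0, "mM"),
    "KOD FX": (0.4, None, None),
    "H2O": (2.4, None, None),
}


def total_volume(mix):
    """Total reaction volume in uL."""
    return sum(volume for volume, _, _ in mix.values())


def final_concentrations(mix):
    """Final concentration of each component with a known stock (C1*V1 / Vtotal)."""
    total = total_volume(mix)
    return {
        name: (volume * stock / total, unit)
        for name, (volume, stock, unit) in mix.items()
        if stock is not None
    }


if __name__ == "__main__":
    print(f"total volume: {total_volume(PCR_MIX):.1f} uL")
    for name, (conc, unit) in final_concentrations(PCR_MIX).items():
        print(f"{name}: {conc:.2f} {unit}")
```

Under these figures the mix totals 20 µL, with the buffer at 1 x, each primer at 0.3 µM, dNTP at 0.4 mM, and template at 1 ng/µL.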

Next, a plasmid comprising a resulting mutant gene and the gene encoding PheRS β subunit was introduced into E. coli, and a heterodimer comprising the mutant protein was expressed. First, E. coli BL21 strain transformed with the mutant plasmid and pREP4 (Invitrogen, V004-50) was cultured at 37°C in 4 mL of LB medium containing kanamycin, ampicillin, and 0.5% glucose. Subsequently, when the OD value at 600 nm reached 0.4 to 0.8, IPTG was added to a final concentration of 0.5 mM. After further culturing at 37°C for 4 hours, the bacterial pellets were collected using a centrifuge.

Next, the resulting bacterial pellets were disrupted, and the mutant protein of interest was purified from the supernatant. Specifically, the bacterial pellets as described above were suspended in 600 µL of CHAPS solution (0.5% CHAPS (DOJINDO: 349-04722), 50% TBS (TaKaRa, T903)) and mixed with 6 µL of 30 U/µl rLysozyme (Novagen, 71110-3), and 2 µL of 2.5 U/µL benzonase nuclease (Novagen, 70746-3) followed by incubation at room temperature for 30 minutes. Imidazole was then added to a final concentration of 15 mM, and an insoluble fraction was separated by centrifugation. Then, the mutant protein was purified from the resulting supernatant using QIAGEN Ni-NTA spin column kit (Qiagen, 31314) according to the product manual. Finally, excess imidazole was removed using a desalting column, PD miniTrap G-25 (GE Healthcare, 28-9180-07) according to the product manual.

The mutant protein confirmed to have activity was prepared in large scale. Specifically, E. coli BL21 strain transformed with a plasmid comprising the mutant α subunit gene and wild-type β subunit gene and pREP4 (Invitrogen, V004-50) was cultured at 37°C in 3 L of LB medium containing kanamycin, ampicillin, and 0.5% glucose. Then, when the OD value at 600 nm reached 0.4, IPTG was added to a final concentration of 0.5 mM. After further culturing at 37°C for 4 hours, the bacterial pellets were collected with a centrifuge. The bacterial pellets as described above were suspended in 1 L of CHAPS solution (0.5% CHAPS (DOJINDO: 349-04722) and 50% TBS (TaKaRa, T903)), mixed with 10 µL of 30 KU/µl rLysozyme (Novagen, 71110-3), and stirred at room temperature for 10 minutes. Next, 2 mL of 1 M MgCl₂ and 320 µL of Benzonase Nuclease (Novagen, 70746-3) were added and stirred at room temperature for 20 minutes. Imidazole was then added to a final concentration of 20 mM, and an insoluble fraction was separated by centrifugation. Subsequently, the mutant protein was purified from the resulting supernatant using a column filled with 15 mL of Ni Sepharose High Performance (GE Healthcare) and AKTA10S (GE Healthcare) with an imidazole concentration gradient (initial concentration 20 mM, final concentration 500 mM). Finally, dialysis was performed three times (2 hours x 2, overnight x 1) using a dialysis cassette (MWCO 10,000, Slide-A-Lyzer G2 Dialysis Cassettes 70mL, Thermo Scientific Pierce) and 3 L of stock solution (50 mM Hepes-KOH, 100 mM KCl, 10 mM MgCl₂, 1 mM DTT, pH 7.6) to obtain the mutant protein.

[Synthesis of E. coli tRNAPhe by in vitro transcription reaction]

E. coli tRNA (R-tRNAPhe (SEQ ID NO: 39)) was synthesized from a template DNA (D-tRNAPhe (SEQ ID NO: 38)) by in vitro transcription reaction using RiboMAX Large Scale RNA production System T7 (Promega, P1300) in the presence of 7.5 mM GMP, and purified using RNeasy Mini kit (Qiagen).

D-tRNAPhe (SEQ ID NO: 38)
tRNAPhe DNA sequence:
R-tRNAPhe (SEQ ID NO: 39)
tRNAPhe RNA sequence:

For the aminoacylation reaction, a solution containing 40 µM transcribed tRNAPhe, 10 mM HEPES-K (pH 7.6), and 10 mM KCl was heated at 95°C for 2 minutes and then left at room temperature for 5 minutes or more to refold the tRNA. This tRNA solution was added to a final concentration of 10 µM to an acylation buffer (in final concentrations of 50 mM HEPES-K [pH 7.6], 2 mM ATP, 100 mM potassium acetate, 10 mM magnesium acetate, 1 mM DTT, 2 mM spermidine, 0.1 mg/mL Bovine Serum Albumin), mixed with wild-type or mutant PheRS (final concentration 0.5 µM) and phenylalanine (final concentration 0.25 mM, Watanabe Chemical Industries, Ltd., G00029) or N-methylphenylalanine (final concentration 1 mM, Watanabe Chemical Industries, Ltd., J00040), and incubated at 37°C for 10 minutes. Four volumes of a loading buffer (90 mM sodium acetate [pH 5.2], 10 mM EDTA, 95%(w/w) formamide, 0.001%(w/v) xylene cyanol) were added to the reaction solution, and the mixture was analyzed by acidic PAGE (12% (w/v) polyacrylamide gel, pH 5.2) containing 6 M urea; aminoacylation activity was detected by separating unreacted tRNA from aminoacylated tRNA. RNA was stained with SYBR Gold (Life Technologies) and detected with LAS4000 (GE Healthcare).
The activity for the aminoacylation reaction was assessed, and mutants 04 and 05 had increased activity for aminoacylation with N-methyl-phenylalanine compared to the wild type (Figure 1). The base sequence of mutant 04 is set forth in SEQ ID NO: 12, and the amino acid sequence of mutant 04 is set forth in SEQ ID NO: 1. The base sequence of mutant 05 is set forth in SEQ ID NO: 13, and the amino acid sequence of mutant 05 is set forth in SEQ ID NO: 2.

[Synthesis of template DNA-F by in vitro transcription reaction]

A template mRNA for translation (R-F (SEQ ID NO: 41)) was synthesized from a template DNA (D-F (SEQ ID NO: 40)) by in vitro transcription reaction using RiboMAX Large Scale RNA production System T7 (Promega, P1300) and purified using RNeasy Mini kit (Qiagen).

D-F (SEQ ID NO: 40)
DNA sequence:
R-F (SEQ ID NO: 41)
RNA sequence:

<Cell-free translation system>

In order to confirm translational introduction of N-methylphenylalanine, a desired polypeptide containing N-methylphenylalanine was ribosomally synthesized by adding an N-methyl amino acid and PheRS to a cell-free translation system. The translation system used was PURE system, a reconstituted cell-free protein synthesis system from E. coli. Specifically, wild-type or mutant PheRS and phenylalanine or N-methylphenylalanine were added to a solution containing a basic cell-free translation solution (1 mM GTP, 1 mM ATP, 20 mM creatine phosphate, 50 mM HEPES-KOH pH 7.6, 100 mM potassium acetate, 9 mM magnesium acetate, 2 mM spermidine, 1 mM dithiothreitol, 0.5 mg/ml E. coli MRE600 (RNase negative)-derived tRNA (Roche), 4 µg/ml creatine kinase, 3 µg/ml myokinase, 2 unit/ml inorganic pyrophosphatase, 1.1 µg/ml nucleoside diphosphate kinase, 0.6 µM methionyl-tRNA transformylase, 0.26 µM EF-G, 0.24 µM RF2, 0.17 µM RF3, 0.5 µM RRF, 2.7 µM IF1, 0.4 µM IF2, 1.5 µM IF3, 40 µM EF-Tu, 44 µM EF-Ts, 1.2 µM ribosome, 0.03 µM ArgRS, 0.13 µM AspRS, 0.11 µM LysRS, 0.03 µM MetRS, 0.02 µM TyrRS (wherein the proteins prepared by the inventors were essentially prepared as His-tagged proteins)), and 1 µM template mRNA, each 250 µM arginine, aspartic acid, lysine, methionine, and tyrosine, and left at 37°C for 1 hour to ribosomally synthesize the peptide.

A peptide expression experiment was performed by using aspartic acid labeled with a radioisotope for detecting peptides into which N-methylphenylalanine is ribosomally introduced. Specifically, a solution containing 1 µM template mRNA (R-F (SEQ ID NO: 41)), arginine, lysine, methionine, and tyrosine (each final concentration 250 µM), and ¹⁴C-aspartic acid (final concentration 37 µM, Moravek Biochemicals, MC139) added to the cell-free translation system as described above was prepared. Wild-type PheRS or the mutant PheRS 05 (final concentration 0.1 µM) and phenylalanine (final concentration 250 µM) or N-methylphenylalanine (final concentration 1 mM or 250 µM) were added to the solution and incubated at 37°C for 60 minutes. An equal volume of 2× sample buffer (TEFCO, cat No. 06-323) was added to the resulting translation reaction solution and heated at 95°C for 3 minutes followed by electrophoresis (16% Peptide-PAGE mini, TEFCO, TB-162). After electrophoresis, the gel was dried using Clear Dry Quick Dry Starter KIT (TEFCO, 03-278), exposed to an imaging plate (GE Healthcare, 28-9564-75) for about 16 hours, detected using a Bioanalyzer System (Typhoon FLA 7000, GE Healthcare), and analyzed with ImageQuantTL (GE Healthcare).
Almost no bands of ribosomally synthesized peptides were observed in the presence of 0.1 µM wild-type PheRS when 0.25 mM or 1 mM N-methylphenylalanine was added (Figure 2). Meanwhile, peptide bands were observed in the presence of 0.1 µM mutant PheRS 05 even when only 0.25 mM N-methylphenylalanine was added. These results confirm that the mutant PheRS 05 had increased aminoacylation activity with N-methylphenylalanine and that peptides containing N-methylphenylalanine were ribosomally synthesized at a high yield.

Mass spectroscopy was performed using MALDI-TOF MS for detecting peptides into which N-methylphenylalanine is ribosomally introduced. Specifically, a solution containing 1 µM template mRNA (R-F (SEQ ID NO: 41)) and amino acids, arginine, lysine, methionine, tyrosine, and aspartic acid (each final concentration 250 µM) added to the cell-free translation system as described above was prepared. PheRS (final concentration 0.1 µM) and phenylalanine (final concentration 250 µM) or N-methylphenylalanine (final concentration 1 mM or 250 µM) were added to the solution and incubated at 37°C for 60 minutes. The resulting translation reaction products were purified with SPE C-TIP (Nikkyo Technos Co., Ltd) and analyzed with MALDI-TOF MS. α-Cyano-4-hydroxycinnamic acid was used as a matrix for translation products.
When a translation experiment was performed as a control experiment in the presence of 0.1 µM wild-type PheRS (SEQ ID NO: 28) or mutant PheRS 05 (SEQ ID NO: 2) with 250 µM phenylalanine added, the peak of the peptide into which phenylalanine was introduced (Calculated [M+H]+ = 1631.7) was detected in both experiments using wild-type PheRS and the mutant PheRS 05 (Figure 3(a), (d)).
Translational synthesis was then performed using wild-type PheRS with 0.25 mM N-methylphenylalanine added, and peaks of both the peptide containing N-methylphenylalanine (Calculated [M+H]+ = 1645.7) and the peptide containing phenylalanine were observed (Figure 3(b)). The phenylalanine-containing peptide is thought to arise because wild-type PheRS recognizes phenylalanine present as a contaminant in the N-methylphenylalanine, aminoacylates the tRNA with it, and translational synthesis then proceeds. Furthermore, the peak intensity of the peptide containing N-methylphenylalanine increased under the condition of 1 mM N-methylphenylalanine (Figure 3(c)). The ratio of peak intensities at each amino acid concentration (ratio of peak intensity = the peak intensity of the peptide containing N-methylphenylalanine / the peak intensity of the peptide containing phenylalanine) was calculated to be 0.8 at 0.25 mM and 2.6 at 1 mM.
Next, translational synthesis was performed using the mutant PheRS 05 with 0.25 mM or 1 mM N-methylphenylalanine added. The peak of the peptide containing N-methylphenylalanine was strongly detected, and the ratio of its peak intensity to that of the peptide containing phenylalanine was 12.4 at 0.25 mM and 16.0 at 1 mM (Figure 3(e), (f)). These results confirmed that the mutant PheRS 05 had increased aminoacylation activity with N-methylphenylalanine compared to wild-type PheRS, and consequently, translational synthesis of the peptide containing N-methylphenylalanine was promoted.
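The peak-intensity ratios reported above can be compared directly between the two enzymes. The following is an illustrative sketch only: the fold-change figure is a derived quantity that does not appear in the text, and the conversion of a ratio into a fraction assumes that MALDI-TOF peak intensity is proportional to peptide abundance, which is an assumption made here and not a claim of the source.

```python
# Reported ratio = intensity(MePhe peptide) / intensity(Phe peptide),
# keyed by N-methylphenylalanine concentration in mM.
RATIOS = {
    "wild-type PheRS": {0.25: 0.8, 1.0: 2.6},
    "PheRS05": {0.25: 12.4, 1.0: 16.0},
}


def fold_change(ratios, conc_mm):
    """Ratio obtained with PheRS05 divided by the ratio obtained with wild type."""
    return ratios["PheRS05"][conc_mm] / ratios["wild-type PheRS"][conc_mm]


def methylated_fraction(ratio):
    """Fraction of MePhe peptide, ASSUMING intensity is proportional to abundance."""
    return ratio / (1.0 + ratio)


if __name__ == "__main__":
    for conc in (0.25, 1.0):
        print(
            f"{conc} mM: fold change {fold_change(RATIOS, conc):.1f}, "
            f"MePhe fraction with PheRS05 {methylated_fraction(RATIOS['PheRS05'][conc]):.2f}"
        )
```

On these figures the advantage of the mutant is largest at the lower substrate concentration, which is consistent with the statement that bands were observed with PheRS 05 even at 0.25 mM.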

Peptide sequence P-F1 (SEQ ID NO: 42) formylMetArgPheArgAspTyrLysAspAspAspAspLys
Peptide sequence P-MeF1 (SEQ ID NO: 42) formylMetArg[MePhe]ArgAspTyrLysAspAspAspAspLys
MALDI-TOF MS: Calc. m/z: [M+H]+ = 1631.7 (the peptide corresponding to the sequence P-F1)
Calc. m/z: [M+H]+ = 1645.7 (the peptide corresponding to the sequence P-MeF1)
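The calculated m/z values can be reproduced from the peptide sequences. The sketch below uses standard monoisotopic residue masses, which are general reference values and not taken from this document; the function name `mh_plus` and the bracket notation for N-methylated residues are illustrative choices.

```python
# Standard monoisotopic residue masses in Da (reference values, not from the source).
RESIDUE = {
    "Met": 131.04049,
    "Arg": 156.10111,
    "Phe": 147.06841,
    "Val": 99.06841,
    "Asp": 115.02694,
    "Tyr": 163.06333,
    "Lys": 128.09496,
}
WATER = 18.01056   # H2O added for the free termini
FORMYL = 27.99491  # CO added by N-terminal formylation
METHYL = 14.01565  # CH2 added by N-methylation
PROTON = 1.00728   # charge carrier for [M+H]+


def mh_plus(residues):
    """Monoisotopic [M+H]+ of an N-formylated peptide.

    N-methylated residues are written in brackets, e.g. "[MePhe]".
    """
    mass = WATER + FORMYL + PROTON
    for res in residues:
        if res.startswith("[Me") and res.endswith("]"):
            mass += RESIDUE[res[3:-1]] + METHYL
        else:
            mass += RESIDUE[res]
    return mass


P_F1 = ["Met", "Arg", "Phe", "Arg", "Asp", "Tyr", "Lys",
        "Asp", "Asp", "Asp", "Asp", "Lys"]
P_MEF1 = ["Met", "Arg", "[MePhe]", "Arg", "Asp", "Tyr", "Lys",
          "Asp", "Asp", "Asp", "Asp", "Lys"]

if __name__ == "__main__":
    print(f"P-F1:   {mh_plus(P_F1):.1f}")
    print(f"P-MeF1: {mh_plus(P_MEF1):.1f}")
```

The two peptides differ by one CH2 unit (about 14.0 Da), which is why the phenylalanine and N-methylphenylalanine products are resolved as separate peaks.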

The incorporation efficiency of N-methylphenylalanine using an ARS as described herein was compared with that of the pdCpA method. Sequences containing one, two consecutive, or three consecutive phenylalanines were translated in a cell-free translation system, and the synthesized peptide bands were detected by electrophoresis. The amount of peptide synthesized using a mutant PheRS as described herein (PheRS05 (SEQ ID NO: 2)) was compared with the amount synthesized using the pdCpA method. In all translations, more peptide was detected with the mutant PheRS than with the pdCpA method, confirming that the translational efficiency obtained with the mutant PheRS was higher than that obtained with the pdCpA method. In particular, for sequences containing two consecutive and three consecutive N-methylphenylalanines, the amount of peptide produced using the mutant PheRS (PheRS05) was 4 to 8 times higher than that produced using the pdCpA method.

Example 2: N-methylvaline-accepting ARS

The mutant ValRS plasmids (having His-tag (6 × His) at the N-terminus) listed in Table 3 were constructed by introducing site-directed mutations using PCR into the plasmid (PQE-32(2) 2_wtVALRS) comprising the ORF sequence of wild-type ValRS gene of E. coli (SEQ ID NOs: 23, 24), which was used as starting material. Specifically, 2 µL of 10 ng/µL template, 10 µL of 2 × KOD Fx buffer (TOYOBO, KFX-101), 0.6 µL of 10 µM forward primer, 0.6 µL of 10 µM reverse primer, 4 µL of 2 mM dNTP, 0.4 µL of KOD FX (TOYOBO, KFX-101), and 2.4 µL of H₂O were mixed together. Then, the resulting reaction solution was heated at 94°C for 2 minutes and then subjected to 10 cycles, each consisting of heating at 98°C for 10 seconds and heating at 68°C for 7 minutes, to amplify the mutant gene. The combinations of the template plasmid, forward primer, and reverse primer used are listed in Table 4. Each sequence of primers "F.V2" through "F.V19", "F.V46" through "F.V48", and "F.V13-01" through "F.V13-16" corresponds to SEQ ID NOs: 43-79 in ascending order. Each sequence of primers "R.V2" through "R.V19", "R.V46" through "R.V48", and "R.V13-01" through "R.V13-16" corresponds to SEQ ID NOs: 80-116 in ascending order. 0.5 µL of 10 U/µL DpnI was then added to the PCR reaction solution and further incubated at 37°C for 1.5 hours to digest the template DNA, and the resulting mutant DNA was purified. E. coli XL-1 Blue strain (STRATAGENE, 200236) was then co-transformed with the resulting mutant gene DNA and pREP4 (Invitrogen, V004-50) encoding the lacI gene. The transformants were seeded onto agar containing ampicillin and kanamycin. The plasmids of interest were purified from the resulting clones. The mutations were confirmed to be introduced into the plasmids.

For plasmid construction requiring multistep mutation introduction, the procedure as described above was repeated to obtain plasmids into which mutations of interest were introduced. The combinations of the primers and template for such plasmid construction are listed in Table 4.

[Table 4]

	forward primer	reverse primer	Template
ValRS001
ValRS002	F. V2	R. V2	wild type
ValRS003	F. V3	R. V3	wild type
ValRS004	F. V4	R. V4	wild type
ValRS005	F. V5	R. V5	wild type
ValRS006	F. V6	R. V6	wild type
ValRS007	F. V7	R. V7	wild type
ValRS008	F. V8	R. V8	wild type
ValRS009	F. V9	R. V9	wild type
ValRS010	F. V10	R. V10	wild type
ValRS011	F. V11	R. V11	wild type
ValRS012	F. V12	R. V12	wild type
ValRS013	F. V13	R. V13	wild type
ValRS014	F. V14	R. V14	wild type
ValRS015	F. V15	R. V15	wild type
ValRS016	F. V16	R. V16	wild type
ValRS017	F. V17	R. V17	wild type
ValRS018	F. V18	R. V18	wild type
ValRS019	F. V19	R. V19	wild type
ValRS020	F. V2	R. V2	ValRS019
ValRS021	F. V3	R. V3	ValRS019
ValRS022	F. V4	R. V4	ValRS019
ValRS023	F. V5	R. V5	ValRS019
ValRS024	F. V6	R. V6	ValRS019
ValRS025	F. V7	R. V7	ValRS019
ValRS026	F. V8	R. V8	ValRS019
ValRS027	F. V9	R. V9	ValRS019
ValRS028	F. V10	R. V10	ValRS019
ValRS029	F. V11	R. V11	ValRS019
ValRS030	F. V12	R. V12	ValRS019
ValRS031	F. V13	R. V13	ValRS019
ValRS032	F. V14	R. V14	ValRS019
ValRS033	F. V15	R. V15	ValRS019
ValRS034	F. V16	R. V16	ValRS019
ValRS035	F. V17	R. V17	ValRS019
ValRS036	F. V18	R. V18	ValRS019
ValRS037	F. V16	R. V16	ValRS046
ValRS038	F. V17	R. V17	ValRS046
ValRS039	F. V18	R. V18	ValRS046
ValRS040	F. V16	R. V16	ValRS047
ValRS041	F. V17	R. V17	ValRS047
ValRS042	F. V18	R. V18	ValRS047
ValRS043	F. V16	R. V16	ValRS048
ValRS044	F. V17	R. V17	ValRS048
ValRS045	F. V18	R. V18	ValRS048
ValRS046	F. V46	R. V46	wt
ValRS047	F. V47	R. V47	wt
ValRS048	F. V48	R. V48	wt
ValRS13-01	F. V13-01	R. V13-01	ValRS13
ValRS13-02	F. V13-02	R. V13-02	ValRS13
ValRS13-03	F. V13-03	R. V13-03	ValRS13
ValRS13-04	F. V13-04	R. V13-04	ValRS13
ValRS13-05	F. V13-05	R. V13-05	ValRS13
ValRS13-06	F. V13-06	R. V13-06	ValRS13
ValRS13-07	F. V13-07	R. V13-07	ValRS13
ValRS13-08	F. V13-08	R. V13-08	ValRS13
ValRS13-09	F. V13-09	R. V13-09	ValRS13
ValRS13-10	F. V13-10	R. V13-10	ValRS13
ValRS13-11	F. V13-11	R. V13-11	ValRS13
ValRS13-12	F. V13-12	R. V13-12	ValRS13
ValRS13-13	F. V13-13	R. V13-13	ValRS13
ValRS13-14	F. V13-14	R. V13-14	ValRS13
ValRS13-15	F. V13-15	R. V13-15	ValRS13
ValRS13-16	F. V13-16	R. V13-16	ValRS13
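The template column of Table 4 defines which plasmid each mutant is built from, so the number of mutagenesis rounds needed for a given construct can be read off by following templates back to the wild type. The following is a minimal sketch over a few rows of the table; the function names are hypothetical, and the table's "wt" and "wild type" labels are treated as the same starting plasmid, which is an assumption.

```python
WILD_TYPE = "wild type"

# A few rows of Table 4: mutant name -> template it was built from.
TEMPLATE_OF = {
    "ValRS019": WILD_TYPE,
    "ValRS020": "ValRS019",
    "ValRS046": WILD_TYPE,  # listed as "wt" in the table
    "ValRS037": "ValRS046",
}


def lineage(name, template_of=TEMPLATE_OF):
    """Return the chain of plasmids from the wild type up to `name`."""
    chain = [name]
    while chain[-1] != WILD_TYPE:
        parent = template_of[chain[-1]]
        if parent in chain:
            raise ValueError(f"circular template reference at {parent}")
        chain.append(parent)
    return list(reversed(chain))


def mutagenesis_rounds(name, template_of=TEMPLATE_OF):
    """Number of site-directed mutagenesis rounds needed to reach `name`."""
    return len(lineage(name, template_of)) - 1


if __name__ == "__main__":
    for mutant in ("ValRS019", "ValRS020", "ValRS037"):
        print(mutant, lineage(mutant), mutagenesis_rounds(mutant))
```

For example, ValRS037 is built on ValRS046, which is itself built on the wild type, so it requires two rounds of the PCR, DpnI digestion, and transformation procedure.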

Next, a resulting mutant plasmid was introduced into E. coli, and the mutant protein was expressed. First, E. coli BL21 strain transformed with the mutant plasmid and pREP4 (Invitrogen, V004-50) was cultured at 37°C in 4 mL of LB medium containing kanamycin and ampicillin. Then, when the OD value at 600 nm reached 0.4 to 0.8, IPTG was added to a final concentration of 0.5 mM. After further culturing at 37°C for 4 hours, the bacterial pellets were collected using a centrifuge.

Next, the resulting bacterial pellets were disrupted, and the mutant protein of interest was purified from the supernatant. Specifically, the bacterial pellets as described above were suspended in 600 µL of CHAPS solution (0.5% CHAPS (DOJINDO: 349-04722), 50% TBS (TaKaRa, T903)) and mixed with 6 µL of 30 U/µl rLysozyme (Novagen, 71110-3) followed by incubation at room temperature for 10 minutes. The reaction was further mixed with 2 µL of 2.5 U/µL benzonase nuclease (Novagen, 70746-3) followed by incubation at room temperature for 20 minutes, and an insoluble fraction was separated by centrifugation. The mutant protein was then purified from the resulting supernatant using QIAGEN Ni-NTA spin column kit (Qiagen, 31314) according to the product manual. Finally, excess imidazole was removed using a desalting column, PD miniTrap G-25 (GE Healthcare, 28-9180-07) according to the product manual.

[Synthesis of E. coli tRNAVal by in vitro transcription reaction]

E. coli tRNA (R-tRNAVal2A (SEQ ID NO: 118)) was synthesized from a template DNA (D-tRNAVal2A (SEQ ID NO: 117)) by in vitro transcription reaction using RiboMAX Large Scale RNA production System T7 (Promega, P1300) in the presence of 7.5 mM GMP, and purified using RNeasy Mini kit (Qiagen).

D-tRNAVal2A (SEQ ID NO: 117)
tRNAVal2A DNA sequence:
R-tRNAVal2A (SEQ ID NO: 118)
tRNAVal2A RNA sequence:

For the aminoacylation reaction, a solution containing 40 µM transcribed tRNA, 10 mM HEPES-K (pH 7.6), and 10 mM KCl was heated at 95°C for 2 minutes and then left at room temperature for 5 minutes or more to refold the tRNA. This tRNA solution was added to a final concentration of 10 µM to an acylation buffer (in final concentrations of 50 mM HEPES-K [pH 7.6], 2 mM ATP, 100 mM potassium acetate, 10 mM magnesium acetate, 1 mM DTT, 2 mM spermidine, 0.1 mg/mL Bovine Serum Albumin), mixed with wild-type or mutant ValRS (final concentration 0.2 µM-1 µM) and N-methylvaline (final concentration 5 mM), and incubated at 37°C for 10 minutes. Four volumes of a loading buffer (90 mM sodium acetate [pH 5.2], 10 mM EDTA, 95%(w/w) formamide, 0.001%(w/v) xylene cyanol) were added to the reaction solution, and the mixture was analyzed by acidic PAGE containing 6 M urea; aminoacylation activity was detected by separating unreacted tRNA from aminoacyl-tRNA. RNA was stained with SYBR Gold (Life Technologies) and detected with LAS4000 (GE Healthcare) (Figure 4).
As a result, tRNA acylated with N-methylvaline was observed when mutant 13 was used. It was demonstrated that mutant 13 had increased activity for aminoacylation with N-methylvaline compared to wild-type ValRS (Figure 4, lane 2 vs 10).

Template mRNAs for translation (R-V, R-V2, and R-V3 (SEQ ID NOs: 120, 122, and 124, respectively)) were synthesized from template DNAs (D-V, D-V2, and D-V3 (SEQ ID NOs: 119, 121, and 123, respectively)) by in vitro transcription reaction using RiboMAX Large Scale RNA production System T7 (Promega, P1300) and purified using RNeasy Mini kit (Qiagen).

D-V (SEQ ID NO: 119)
DNA sequence:
R-V (SEQ ID NO: 120)
RNA sequence:
D-V2 (SEQ ID NO: 121)
DNA sequence:
R-V2 (SEQ ID NO: 122)
RNA sequence:
D-V3 (SEQ ID NO: 123)
DNA sequence:
R-V3 (SEQ ID NO: 124)
RNA sequence:

<Cell-free translation system>

In order to confirm translational introduction of N-methylvaline, a desired polypeptide containing N-methylvaline was ribosomally synthesized by adding an N-methylvaline and a mutant ValRS to a cell-free translation system. The translation system used was PURE system, a reconstituted cell-free protein synthesis system from E. coli. Specifically, wild-type or mutant ValRS and N-methylvaline were added to a solution containing a basic cell-free translation solution (1 mM GTP, 1 mM ATP, 20 mM creatine phosphate, 50 mM HEPES-KOH pH 7.6, 100 mM potassium acetate, 9 mM magnesium acetate, 2 mM spermidine, 1 mM dithiothreitol, 1.5 mg/ml E. coli MRE600 (RNase negative)-derived tRNA (Roche), 0.1 mM 10-HCO-H4 folate, 4 µg/ml creatine kinase, 3 µg/ml myokinase, 2 unit/ml inorganic pyrophosphatase, 1.1 µg/ml nucleoside diphosphate kinase, 0.6 µM methionyl-tRNA transformylase, 0.26 µM EF-G, 0.24 µM RF2, 0.17 µM RF3, 0.5 µM RRF, 2.7 µM IF1, 0.4 µM IF2, 1.5 µM IF3, 40 µM EF-Tu, 84 µM EF-Ts, 1.2 µM ribosome, 0.03 µM ArgRS, 0.13 µM AspRS, 0.11 µM LysRS, 0.03 µM MetRS, 0.02 µM TyrRS (wherein the proteins prepared by the inventors were essentially prepared as His-tagged proteins)), and 1 µM template mRNA, each 250 µM arginine, aspartic acid, lysine, methionine, and tyrosine, and left at 37°C for 1 hour to ribosomally synthesize the peptide.

Mass spectroscopy was performed using MALDI-TOF MS for detecting peptides into which N-methylvaline is ribosomally introduced. Specifically, a solution containing 1 µM template mRNA (R-V (SEQ ID NO: 120)) and arginine, lysine, methionine, tyrosine, and aspartic acid (each final concentration 250 µM) added to the cell-free translation system as described above was prepared. ValRS (final concentration 0.1 µM-1 µM) and N-methylvaline (final concentration 5 mM) were added to the solution and incubated at 37°C for 60 minutes. The resulting translation reaction products were purified with SPE C-TIP (Nikkyo Technos Co., Ltd) and analyzed with MALDI-TOF MS. Translation products were identified with MALDI-TOF MS spectrometry using α-cyano-4-hydroxycinnamic acid as a matrix.
As a result of the translation using wild-type ValRS (SEQ ID NO: 24), a peak corresponding to the peptide sequence P-V1, into which valine present as a contaminant in the cell-free translation system was introduced, was observed as the main product (Figure 5(a), Peak V1, m/z: [M+H]+ = 1583.6). Similar experiments were performed using different mutant ValRSs. When ValRS04 (SEQ ID NO: 3) or ValRS13 (SEQ ID NO: 4) was used, a peak corresponding to the peptide sequence P-MeV1, into which N-methylvaline was introduced, was observed as the main product (Figure 5(b), Peak MeV1, m/z: [M+H]+ = 1597.5; Figure 5(c), Peak MeV2, m/z: [M+H]+ = 1597.5). This demonstrated, also from the viewpoint of the translation reaction, that ValRS04 and ValRS13 have increased activity toward N-methylvaline compared to wild-type ValRS. Although peaks corresponding to the peptide P-V1, which contains contaminating valine from the translation system, were also observed (Figure 5(b) Peak V2, (c) Peak V3), the intensity of this peak was weaker with ValRS13 than with ValRS04. This suggested that ValRS13 has the higher activity toward N-methylvaline (Figure 5(b), (c)).

Peptide sequence P-V1 (SEQ ID NO: 125)
formylMetArgValArgAspTyrLysAspAspAspAspLys
Peptide sequence P-MeV1 (SEQ ID NO: 125)
formylMetArg[MeVal]ArgAspTyrLysAspAspAspAspLys
MALDI-TOF MS: Calc. m/z: [M+H]+ = 1583.7 (the peptide corresponding to the sequence P-V1)
Calc. m/z: [M+H]+ = 1597.7 (the peptide corresponding to the sequence P-MeV1)
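Observed peaks are assigned by comparing them with the calculated values. The sketch below matches the observed m/z values reported for Figure 5 against the two calculated masses; the 0.5 Da tolerance and the function name `assign_peak` are assumptions chosen for illustration, since the text does not state a matching tolerance.

```python
# Calculated [M+H]+ values from the text.
CALCULATED = {
    "P-V1": 1583.7,
    "P-MeV1": 1597.7,
}

TOLERANCE_DA = 0.5  # assumed matching window, not stated in the source


def assign_peak(observed_mz, calculated=CALCULATED, tolerance=TOLERANCE_DA):
    """Return the closest species within the tolerance, or None if none match."""
    name, calc = min(calculated.items(), key=lambda kv: abs(kv[1] - observed_mz))
    return name if abs(calc - observed_mz) <= tolerance else None


if __name__ == "__main__":
    # Observed values reported for Figure 5(a) and Figure 5(b)/(c).
    for mz in (1583.6, 1597.5):
        print(mz, "->", assign_peak(mz))
```

Because the two species differ by about 14 Da, any tolerance well below that spacing gives the same assignment.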

To further improve activity toward N-methylvaline, additional mutations were introduced into ValRS13 described above. Plasmids carrying the mutations of interest were prepared as described above and expressed in E. coli, and the mutant ValRSs were purified (Table 3). Screening by aminoacylation reaction and translational synthesis showed that ValRS13-11 (SEQ ID NO: 5) exhibited higher activity toward N-methylvaline than ValRS13.
When aminoacylation reactions with N-methylvaline using wild-type ValRS, ValRS13, and ValRS13-11 were investigated, it was observed that the amount of aminoacyl-tRNA synthesized using ValRS13-11 was more than that synthesized using ValRS13 (Figure 6, lane 8 vs lane 9 and lane 12 vs lane 13). Particularly, the amounts of aminoacyl-tRNA synthesized using ValRS13 and ValRS13-11 with N-methylvaline at the low concentration of 1.25 mM were greatly different, demonstrating that ValRS13-11 had a high activity.
Next, mass spectroscopy was performed using MALDI-TOF MS for confirming translational synthesis of peptides having N-methylvaline using ValRS13 and ValRS13-11. Specifically, a solution containing 1 µM template mRNA (R-V (SEQ ID NO: 120)) and arginine, lysine, methionine, tyrosine, and aspartic acid (each final concentration 250 µM) added to the cell-free translation system as described above was prepared. ValRS (final concentration 4 µM) and N-methylvaline (final concentration 5 mM) were added to the solution and incubated at 37°C for 60 minutes. Template mRNAs encoding peptide sequences containing two consecutive or three consecutive N-methylvalines (R-V2 (SEQ ID NO: 122), R-V3 (SEQ ID NO: 124)) were used to perform similar experiments. The resulting translation reaction products were purified with SPE C-TIP (Nikkyo Technos Co., Ltd) and analyzed with MALDI-TOF MS.
The intended synthesis of the peptide sequence P-MeV1 containing N-methylvaline using ValRS13 and ValRS13-11 was confirmed when the template mRNA containing one N-methylvaline (R-V (SEQ ID NO: 120)) was used (Figure 7(a) Peak MeV1, (b) Peak MeV2). A peak corresponding to the peptide sequence P-V1, into which valine present as a contaminant in the cell-free translation system was introduced, was only weakly observed (Figure 7(a) Peak V1, (b) Peak V2).
Next, it was confirmed that the peptide sequence P-MeV2 containing two N-methylvaline residues was synthesized as main product by both the mutant ValRSs when a template mRNA containing two consecutive N-methylvalines (R-V2 (SEQ ID NO: 122)) was used (Figure 7(c) Peak MeV3, (d) Peak MeV5). Meanwhile, the peptide sequence P-MeV4 containing one N-methylvaline residue and one valine residue (Figure 7(c) Peak MeV4, (d) Peak MeV6) was observed, but the peak intensity was suppressed when using ValRS13-11 compared to ValRS13.
Finally, template mRNA containing three consecutive N-methylvalines (R-V3 (SEQ ID NO: 124)) was used to perform translation experiments. When ValRS13 was added, the synthesis of the target peptide sequence P-MeV3 containing three N-methylvaline residues was observed (Figure 7(e), Peak MeV7), but the peptide sequence P-MeV5 comprising two N-methylvaline residues and one valine residue was produced as main product (Figure 7(e), Peak MeV8). The peptide sequence P-MeV6 comprising one N-methylvaline residue and two valine residues was also observed (Figure 7(e), Peak MeV9). Meanwhile, when ValRS13-11 was added, it was observed that the target peptide sequence P-MeV3 containing three N-methylvaline residues was ribosomally synthesized as main product (Figure 7(f), Peak MeV10). The peptide sequences P-MeV5 and P-MeV6 containing valine were also observed, but the peak intensity of these peptide sequences was suppressed compared to that obtained when using ValRS13 (Figure 7(f), Peak MeV11 and MeV12). These results indicate that ValRS13-11 has increased aminoacylation activity to N-methylvaline compared to ValRS13, leading to the increase of the ribosomally synthesized amount of the target peptide containing N-methylvaline.

Peptide sequence P-MeV2 (SEQ ID NO: 126)
formylMetArg[MeVal][MeVal]ArgAspTyrLysAspAspAspAspLys
Peptide sequence P-MeV4 (SEQ ID NO: 126)
- formylMetArg[MeVal]ValArgAspTyrLysAspAspAspAspLys or
- formylMetArgVal[MeVal]ArgAspTyrLysAspAspAspAspLys
Peptide sequence P-MeV3 (SEQ ID NO: 127)
formylMetArg[MeVal][MeVal][MeVal]ArgAspTyrLysAspAspAspAspLys
Peptide sequence P-MeV5 (SEQ ID NO: 127)
- formylMetArg[MeVal][MeVal]ValArgAspTyrLysAspAspAspAspLys or
- formylMetArg[MeVal]Val[MeVal]ArgAspTyrLysAspAspAspAspLys or
- formylMetArgVal[MeVal][MeVal]ArgAspTyrLysAspAspAspAspLys
Peptide sequence P-MeV6 (SEQ ID NO: 127)
- formylMetArg[MeVal]ValValArgAspTyrLysAspAspAspAspLys or
- formylMetArgVal[MeVal]ValArgAspTyrLysAspAspAspAspLys or
- formylMetArgValVal[MeVal]ArgAspTyrLysAspAspAspAspLys

MALDI-TOF MS:

Calc. m/z: [H+M]+ = 1710.8 (the peptide corresponding to the sequence P-MeV2)
Calc. m/z: [H+M]+ = 1696.8 (the peptide corresponding to the sequence P-MeV4)
Calc. m/z: [H+M]+ = 1823.9 (the peptide corresponding to the sequence P-MeV3)
Calc. m/z: [H+M]+ = 1809.9 (the peptide corresponding to the sequence P-MeV5)
Calc. m/z: [H+M]+ = 1795.9 (the peptide corresponding to the sequence P-MeV6)
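
The calculated values listed above can be reproduced from the peptide sequences. The following is a minimal sketch and not the inventors' own calculation: it assumes standard monoisotopic residue masses, treats each N-methyl residue as the parent residue plus CH2, counts the N-terminal formyl group as a net addition of CO, and adds one proton for [H+M]+. The function name `calc_mh` and the parsing of the bracket notation are illustrative choices.

```python
import re

# Standard monoisotopic residue masses (Da) for the residues used here.
RESIDUE = {
    "Met": 131.04049, "Arg": 156.10111, "Val": 99.06841,
    "Asp": 115.02694, "Tyr": 163.06333, "Lys": 128.09496,
}
CH2 = 14.01565     # N-methylation: H replaced by CH3
H2O = 18.01056     # free N- and C-termini of the chain
CO = 27.99491      # N-formylation: H replaced by CHO
PROTON = 1.00728   # charge carrier for [H+M]+

# Either a bracketed N-methyl residue such as [MeVal] or a plain three-letter code.
TOKEN = re.compile(r"\[Me([A-Z][a-z]{2})\]|([A-Z][a-z]{2})")


def calc_mh(sequence: str) -> float:
    """Return the calculated [H+M]+ for a sequence written as in the listing."""
    formylated = sequence.startswith("formyl")
    body = sequence[len("formyl"):] if formylated else sequence
    mass = H2O + PROTON + (CO if formylated else 0.0)
    pos = 0
    for match in TOKEN.finditer(body):
        if match.start() != pos:
            raise ValueError(f"unparsed text at position {pos}")
        methylated, plain = match.groups()
        mass += RESIDUE[methylated or plain] + (CH2 if methylated else 0.0)
        pos = match.end()
    if pos != len(body):
        raise ValueError(f"unparsed text at position {pos}")
    return mass


# Positional isomers share the same mass, so one isomer per peptide is enough.
TAIL = "ArgAspTyrLysAspAspAspAspLys"
PEPTIDES = {
    "P-MeV2": "formylMetArg[MeVal][MeVal]" + TAIL,
    "P-MeV4": "formylMetArg[MeVal]Val" + TAIL,
    "P-MeV3": "formylMetArg[MeVal][MeVal][MeVal]" + TAIL,
    "P-MeV5": "formylMetArg[MeVal][MeVal]Val" + TAIL,
    "P-MeV6": "formylMetArg[MeVal]ValVal" + TAIL,
}

for name, seq in PEPTIDES.items():
    print(name, round(calc_mh(seq), 1))
```

Under these assumptions, each replacement of N-methylvaline by valine lowers the calculated value by one CH2 unit (about 14.0), which is the spacing between the listed peaks.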

Example 3: Development of N-methylserine-accepting ARS

The mutant SerRS plasmids (having His-tag (6 × His) at the N-terminus) listed in Table 5 were constructed by introducing site-directed mutations using PCR into a plasmid comprising the ORF sequence (SEQ ID NOs: 25, 26) of wild-type SerRS gene of E. coli (PQE-32(2) 2_wtSERRS), which was used as starting material. Specifically, 2.5 µL of 10 ng/µL template, 12.5 µL of 2 × KOD Fx buffer (TOYOBO, KFX-101), 0.75 µL of 10 µM forward primer, 0.75 µL of 10 µM reverse primer, 5 µL of 2 mM dNTP, 0.5 µL of KOD FX (TOYOBO, KFX-101), and 3 µL of H₂O were mixed together. Next, the resulting reaction solution was heated at 94°C for 2 minutes and then subjected to 10 cycles, each consisting of heating at 98°C for 10 seconds and heating at 68°C for 7 minutes, to amplify the mutant gene. The combinations of the template plasmid, forward primer, and reverse primer used are listed in Table 6. Each sequence of primers "F.S2" through "F.S8", "F.S15" through "F.S23", and "F.S33" through "F.S38" corresponds to SEQ ID NOs: 128-149 in ascending order. Each sequence of primers "R.S2" through "R.S8", "R.S15" through "R.S23", and "R.S33" through "R.S38" corresponds to SEQ ID NOs: 150-171 in ascending order. 0.5 µL of 10 U/µL DpnI was then added to the PCR reaction solution and further incubated at 37°C for 1.5 hours to digest the template DNA, and the resulting mutant DNA was purified. E. coli XL-1 Blue strain (STRATAGENE, 200236) was then co-transformed with the resulting mutant gene DNA and pREP4 (Invitrogen, V004-50) encoding the lacI gene. The transformants were seeded onto agar containing ampicillin and kanamycin. The plasmids of interest were purified from the resulting clones. The mutations were confirmed to be introduced into the plasmids.
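
As a quick consistency check on the PCR mixture described above, the component volumes can be summed and the final concentrations derived from the stock concentrations. This is only an illustrative sketch; the dictionary keys and the helper `final_conc` are hypothetical names, and the calculation assumes simple volume additivity.

```python
# Component volumes in µL, as listed in the text.
volumes_ul = {
    "template (10 ng/µL)": 2.5,
    "2x KOD FX buffer": 12.5,
    "forward primer (10 µM)": 0.75,
    "reverse primer (10 µM)": 0.75,
    "dNTP (2 mM)": 5.0,
    "KOD FX polymerase": 0.5,
    "H2O": 3.0,
}
total_ul = sum(volumes_ul.values())


def final_conc(stock_conc, volume_ul, total=total_ul):
    """Final concentration after dilution into the full reaction volume."""
    return stock_conc * volume_ul / total


primer_um = final_conc(10, 0.75)       # each primer, µM
dntp_mm = final_conc(2, 5.0)           # dNTP, mM
template_ng_ul = final_conc(10, 2.5)   # template, ng/µL
buffer_x = final_conc(2, 12.5)         # buffer strength, x-fold

print(f"total volume: {total_ul} µL")
print(f"primers: {primer_um} µM each, dNTP: {dntp_mm} mM")
print(f"template: {template_ng_ul} ng/µL, buffer: {buffer_x}x")
```

The volumes add up to a 25 µL reaction in which the 2× buffer ends at 1× strength.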

For plasmid construction requiring multistep mutation introduction, the procedure as described above was repeated to obtain plasmids into which mutations of interest were introduced. The combinations of the primers and template for such plasmid construction are listed in Table 6.

[Table 6]

Mutant	Forward primer	Reverse primer	Template
SerRS001
SerRS002	F.S2	R.S2	wt
SerRS003	F.S3	R.S3	wt
SerRS004	F.S4	R.S4	wt
SerRS005	F.S5	R.S5	wt
SerRS006	F.S6	R.S6	wt
SerRS007	F.S7	R.S7	wt
SerRS008	F.S8	R.S8	wt
SerRS009	F.S2	R.S2	SerRS008
SerRS010	F.S3	R.S3	SerRS008
SerRS011	F.S4	R.S4	SerRS008
SerRS012	F.S5	R.S5	SerRS008
SerRS013	F.S6	R.S6	SerRS008
SerRS014	F.S7	R.S7	SerRS008
SerRS015	F.S15	R.S15	wt
SerRS016	F.S16	R.S16	wt
SerRS017	F.S17	R.S17	wt
SerRS018	F.S18	R.S18	wt
SerRS019	F.S19	R.S19	wt
SerRS020	F.S20	R.S20	wt
SerRS021	F.S21	R.S21	wt
SerRS022	F.S22	R.S22	wt
SerRS023	F.S23	R.S23	wt
SerRS024	F.S15	R.S15	SerRS008
SerRS025	F.S16	R.S16	SerRS008
SerRS026	F.S17	R.S17	SerRS008
SerRS027	F.S18	R.S18	SerRS008
SerRS028	F.S19	R.S19	SerRS008
SerRS029	F.S20	R.S20	SerRS008
SerRS030	F.S21	R.S21	SerRS008
SerRS031	F.S22	R.S22	SerRS008
SerRS032	F.S23	R.S23	SerRS008
SerRS033	F.S33	R.S33	wt
SerRS034	F.S34	R.S34	wt
SerRS035	F.S35	R.S35	wt
SerRS036	F.S36	R.S36	wt
SerRS037	F.S37	R.S37	wt
SerRS038	F.S38	R.S38	wt
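
The primer numbering and the multistep constructions in Table 6 can be made explicit with a short sketch. It assumes, as the text states, that the forward primers map to SEQ ID NOs: 128-149 and the reverse primers to SEQ ID NOs: 150-171 in ascending order of primer number, and it writes primer names without a space (F.S2). Only a few rows of Table 6 are copied in for illustration; `seq_id_map` and `construction_steps` are hypothetical helper names.

```python
from itertools import chain

# Primer numbers named in the text: S2-S8, S15-S23, S33-S38.
PRIMER_NUMBERS = list(chain(range(2, 9), range(15, 24), range(33, 39)))


def seq_id_map(prefix, first_seq_id):
    """Assign consecutive SEQ ID NOs to primers in ascending primer number."""
    return {f"{prefix}{n}": first_seq_id + i for i, n in enumerate(PRIMER_NUMBERS)}


SEQ_ID = {**seq_id_map("F.S", 128), **seq_id_map("R.S", 150)}

# Selected rows of Table 6: mutant -> (forward primer, reverse primer, template).
TABLE6 = {
    "SerRS003": ("F.S3", "R.S3", "wt"),
    "SerRS008": ("F.S8", "R.S8", "wt"),
    "SerRS010": ("F.S3", "R.S3", "SerRS008"),
    "SerRS031": ("F.S22", "R.S22", "SerRS008"),
}


def construction_steps(mutant):
    """Return the primer pairs applied, in order, starting from the wild-type plasmid."""
    steps = []
    while mutant != "wt":
        forward, reverse, template = TABLE6[mutant]
        steps.append((forward, reverse))
        mutant = template
    return steps[::-1]


for step, (fwd, rev) in enumerate(construction_steps("SerRS010"), start=1):
    print(f"step {step}: {fwd} (SEQ ID NO: {SEQ_ID[fwd]}), {rev} (SEQ ID NO: {SEQ_ID[rev]})")
```

Read this way, a row whose template is SerRS008 describes a two-step construction: the SerRS008 mutation is introduced into the wild-type plasmid first, and the second primer pair is then applied to that product.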

Next, the resulting mutant plasmid was introduced into E. coli to express the mutant protein. First, E. coli BL21 strain transformed with the mutant plasmid and pREP4 (Invitrogen, V004-50) was cultured at 37°C in 4 mL of LB medium containing kanamycin and ampicillin. Then, when the OD value at 600 nm reached 0.4 to 0.8, IPTG was added to a final concentration of 0.5 mM. After further culturing at 37°C for 4 hours, the bacterial pellets were collected with a centrifuge.

Next, the resulting bacterial pellets were disrupted, and the mutant protein of interest was purified from the supernatant. Specifically, the bacterial pellets as described above were suspended in 600 µL of CHAPS solution (0.5% CHAPS (DOJINDO: 349-04722), 50% TBS (TaKaRa, T903)) and mixed with 6 µL of 30 U/µl rLysozyme (Novagen, 71110-3) followed by incubation at room temperature for 10 minutes. The reaction was further mixed with 2 µL of 2.5 U/µL benzonase nuclease (Novagen, 70746-3) followed by incubation at room temperature for 20 minutes, and an insoluble fraction was separated by centrifugation. Then, the mutant protein was purified from the resulting supernatant using QIAGEN Ni-NTA spin column kit (Qiagen, 31314) according to the product manual. Finally, excess imidazole was removed using a desalting column, PD miniTrap G-25 (GE Healthcare, 28-9180-07) according to the product manual.

[Synthesis of E. coli tRNASer by in vitro transcription reaction]

E. coli tRNA (R-tRNASer3 (SEQ ID NO: 173)) was synthesized from a template DNA (D-tRNASer3 (SEQ ID NO: 172)) by in vitro transcription reaction using RiboMAX Large Scale RNA production System T7 (Promega, P1300) in the presence of 7.5 mM GMP, and purified using RNeasy Mini kit (Qiagen).

D-tRNASer3 (SEQ ID NO: 172)
tRNASer3 DNA sequence:
R-tRNASer3 (SEQ ID NO: 173)
tRNASer3 RNA sequence:

For the aminoacylation reaction, a solution containing 40 µM transcribed tRNA, 10 mM HEPES-K (pH 7.6), and 10 mM KCl was heated at 95°C for 2 minutes and then left at room temperature for 5 minutes or more to refold the tRNA. This tRNA solution was added to a final concentration of 10 µM to an acylation buffer (final concentrations: 50 mM HEPES-K [pH 7.6], 2 mM ATP, 100 mM potassium acetate, 10 mM magnesium acetate, 1 mM DTT, 2 mM spermidine, 0.1 mg/mL bovine serum albumin), mixed with wild-type or mutant SerRS (final concentration 0.1 µM-2 µM) and N-methylserine (final concentration 1 mM), and incubated at 37°C for 10 minutes. Four volumes of a loading buffer (90 mM sodium acetate [pH 5.2], 10 mM EDTA, 95% (w/w) formamide, 0.001% (w/v) xylene cyanol) were added to the reaction solution, and the mixture was analyzed by acidic PAGE containing 6 M urea; aminoacylation activity was detected by separating unreacted tRNA from aminoacyl-tRNA. RNA was stained with SYBR Gold (Life Technologies) and detected with LAS4000 (GE Healthcare) (Figure 8).
As a result, tRNA acylated with N-methylserine was observed when the mutants 03 (SEQ ID NO: 6), 35 (SEQ ID NO: 8), and 37 (SEQ ID NO: 9) were used. This suggests that these mutants have increased activity for aminoacylation with N-methylserine compared to wild-type SerRS (Figure 8, lanes 3, 25, 27).
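
The dilution steps in the aminoacylation protocol follow from C1·V1 = C2·V2. The sketch below is illustrative only: the reaction volume of 20 µL is a hypothetical value not given in the text, and `stock_volume` is an assumed helper name.

```python
def stock_volume(stock_conc, final_conc, final_volume):
    """Solve C1*V1 = C2*V2 for V1 (both concentrations in the same unit)."""
    if final_conc > stock_conc:
        raise ValueError("final concentration cannot exceed stock concentration")
    return final_conc * final_volume / stock_conc


REACTION_UL = 20.0  # hypothetical reaction volume; not stated in the text

# Refolded tRNA: 40 µM stock diluted to 10 µM in the acylation reaction.
trna_ul = stock_volume(40.0, 10.0, REACTION_UL)

# Four volumes of loading buffer are added to one volume of reaction.
loading_buffer_ul = 4 * REACTION_UL
sample_dilution = (REACTION_UL + loading_buffer_ul) / REACTION_UL

print(f"refolded tRNA: {trna_ul} µL per {REACTION_UL} µL reaction")
print(f"loading buffer: {loading_buffer_ul} µL ({sample_dilution:.0f}-fold sample dilution)")
```

Whatever reaction volume is chosen, the refolded tRNA stock makes up one quarter of the reaction, and the sample is diluted five-fold before loading.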

Template mRNA (R-S (SEQ ID NO: 175)) was synthesized from a template DNA (D-S (SEQ ID NO: 174)) by in vitro transcription reaction using RiboMAX Large Scale RNA production System T7 (Promega, P1300) and purified using RNeasy Mini kit (Qiagen).

D-S (CT21) (SEQ ID NO: 174)
DNA sequence:
R-S (SEQ ID NO: 175)
RNA sequence:

<Cell-free translation system>

In order to confirm translational introduction of N-methylserine, a desired polypeptide containing N-methylserine was ribosomally synthesized by adding N-methylserine and SerRS to a cell-free translation system. The translation system used was the PURE system, a reconstituted cell-free protein synthesis system from E. coli. Specifically, wild-type or mutant SerRS and N-methylserine were added to a solution containing a basic cell-free translation solution (1 mM GTP, 1 mM ATP, 20 mM creatine phosphate, 50 mM HEPES-KOH pH 7.6, 100 mM potassium acetate, 9 mM magnesium acetate, 2 mM spermidine, 1 mM dithiothreitol, 1.5 mg/ml E. coli MRE600 (RNase negative)-derived tRNA (Roche), 0.1 mM 10-HCO-H4 folate, 4 µg/ml creatine kinase, 3 µg/ml myokinase, 2 unit/ml inorganic pyrophosphatase, 1.1 µg/ml nucleoside diphosphate kinase, 0.6 µM methionyl-tRNA transformylase, 0.26 µM EF-G, 0.24 µM RF2, 0.17 µM RF3, 0.5 µM RRF, 2.7 µM IF1, 0.4 µM IF2, 1.5 µM IF3, 40 µM EF-Tu, 84 µM EF-Ts, 1.2 µM ribosome, 0.03 µM ArgRS, 0.13 µM AspRS, 0.11 µM LysRS, 0.03 µM MetRS, 0.02 µM TyrRS (wherein the proteins prepared by the inventors were essentially prepared as His-tagged proteins)), 1 µM template mRNA, and 250 µM each of arginine, aspartic acid, lysine, methionine, and tyrosine, and the mixture was left at 37°C for 1 hour to ribosomally synthesize the peptide.
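
The design of this translation mixture can be checked against the target peptide: only the ARSs and amino acids needed for the other residues are supplied, which leaves a single position to be filled by the added SerRS. The sketch below is an illustration under that reading; the residue list is taken from the peptide sequence P-CT21Ser given later in this example, and `uncovered_positions` is a hypothetical helper name.

```python
# Residues of P-CT21Ser (formyl group on Met omitted), in order.
PEPTIDE = ["Met", "Arg", "Ser", "Arg", "Asp", "Tyr",
           "Lys", "Asp", "Asp", "Asp", "Asp", "Lys"]

# ARSs and amino acids listed for the basic translation solution.
SUPPLIED_ARS = {"Arg", "Asp", "Lys", "Met", "Tyr"}
SUPPLIED_AMINO_ACIDS = {"Arg", "Asp", "Lys", "Met", "Tyr"}


def uncovered_positions(peptide, ars, amino_acids):
    """Return (position, residue) pairs lacking a supplied ARS/amino acid pair."""
    covered = ars & amino_acids
    return [(i + 1, res) for i, res in enumerate(peptide) if res not in covered]


print(uncovered_positions(PEPTIDE, SUPPLIED_ARS, SUPPLIED_AMINO_ACIDS))
```

Position 3 is the only one not served by the basic solution, so the residue incorporated there depends on the added SerRS and on whether it charges the tRNA with N-methylserine or with any serine present.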

Mass spectroscopy was performed using MALDI-TOF MS for detecting peptides into which N-methylserine is ribosomally introduced. Specifically, a solution containing 1 µM template mRNA (R-S (SEQ ID NO: 175)) and arginine, lysine, methionine, tyrosine, and aspartic acid (each final concentration 250 µM) added to the cell-free translation system as described above was prepared. SerRS (final concentration 0.1-2 µM) and N-methylserine (final concentration 5 mM) were added to the solution and incubated at 37°C for 60 minutes. The resulting translation reaction products were purified with SPE C-TIP (Nikkyo Technos Co., Ltd) and analyzed with MALDI-TOF MS. Translation products were identified with MALDI-TOF MS spectrometry using α-cyano-4-hydroxycinnamic acid as a matrix.
When translation was performed using wild-type SerRS (SEQ ID NO: 25), a peak corresponding to the target peptide P-CT21MeSer containing N-methylserine (Figure 9(a), Peak MeS1, m/z: [H+M]+ = 1585.6) was observed, but the main product was the Ser-containing peptide P-CT21Ser (Figure 9(a), Peak S1, m/z: [H+M]+ = 1571.6), which probably derived from trace amounts of Ser contaminating the translation system. On the other hand, when similar experiments were performed using the modified SerRSs with mutations introduced (SerRS03 (SEQ ID NO: 6) and SerRS05 (SEQ ID NO: 7)), the peak corresponding to P-CT21Ser (Figure 9(b) Peak S2, (c) Peak S3) was again observed, but only as a side product, and the main product was P-CT21MeSer (Figure 9(b) Peak MeS2, (c) Peak MeS3). In particular, when SerRS35 and SerRS37 were used, the peak corresponding to P-CT21Ser was not observed, and P-CT21MeSer was synthesized with high purity (Figure 9(d) Peak MeS4, (e) Peak MeS5). These modified SerRSs were thus demonstrated to have increased activity toward MeSer compared to wild-type SerRS.

Peptide sequence P-CT21Ser (SEQ ID NO: 176)
formylMetArgSerArgAspTyrLysAspAspAspAspLys
Peptide sequence P-CT21MeSer (SEQ ID NO: 176)
formylMetArg[MeSer]ArgAspTyrLysAspAspAspAspLys

MALDI-TOF MS:

Calc. m/z: [H+M]+ = 1571.7 (the peptide corresponding to the sequence P-CT21Ser)
Calc. m/z: [H+M]+ = 1585.7 (the peptide corresponding to the sequence P-CT21MeSer)
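
Assigning an observed peak to a peptide amounts to comparing it with the calculated values within a mass tolerance. The sketch below uses the observed values reported for Figure 9(a) (1585.6 and 1571.6); the tolerance of 0.3 is an assumption made for illustration and is not stated in the text, and `assign_peak` is a hypothetical helper name.

```python
CALCULATED = {"P-CT21Ser": 1571.7, "P-CT21MeSer": 1585.7}
TOLERANCE = 0.3  # assumed matching window in m/z units; not given in the text


def assign_peak(observed_mz, calculated=CALCULATED, tolerance=TOLERANCE):
    """Return the peptide whose calculated m/z is nearest, or None if outside tolerance."""
    nearest = min(calculated, key=lambda name: abs(calculated[name] - observed_mz))
    if abs(calculated[nearest] - observed_mz) <= tolerance:
        return nearest
    return None


for observed in (1585.6, 1571.6):
    print(observed, "->", assign_peak(observed))
```

The two peptides differ by one CH2 unit (about 14.0), far more than the assumed tolerance, so the assignment is unambiguous.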

Example 4: Development of N-methylthreonine-accepting ARS

Expression vectors having a polyhistidine sequence at the N-terminus and comprising mutations listed in Table 7 were constructed. Subsequently, an expression strain was transformed with the vectors, and the mutant proteins of interest were purified with a nickel column from supernatants obtained by disrupting cells.

A template mRNA (R-T (SEQ ID NO: 178)) was synthesized from a template DNA (D-T (SEQ ID NO: 177)) by in vitro transcription reaction using RiboMAX Large Scale RNA production System T7 (Promega, P1300) and purified using RNeasy Mini kit (Qiagen).

D-T (3lib15#09) (SEQ ID NO: 177)
DNA sequence:
R-T (SEQ ID NO: 178)
RNA sequence:

<Cell-free translation system>

In order to confirm translational introduction of N-methylthreonine, a desired polypeptide containing N-methylthreonine was ribosomally synthesized by adding N-methylthreonine and a mutant ThrRS to a cell-free translation system. The translation system used was the PURE system, a reconstituted cell-free protein synthesis system from E. coli. Specifically, wild-type or mutant ThrRS and N-methylthreonine were added to a solution containing a basic cell-free translation solution (1 mM GTP, 1 mM ATP, 20 mM creatine phosphate, 50 mM HEPES-KOH pH 7.6, 100 mM potassium acetate, 9 mM magnesium acetate, 2 mM spermidine, 1 mM dithiothreitol, 1.5 mg/ml E. coli MRE600 (RNase negative)-derived tRNA (Roche), 0.1 mM 10-HCO-H4 folate, 4 µg/ml creatine kinase, 3 µg/ml myokinase, 2 unit/ml inorganic pyrophosphatase, 1.1 µg/ml nucleoside diphosphate kinase, 0.6 µM methionyl-tRNA transformylase, 0.26 µM EF-G, 0.24 µM RF2, 0.17 µM RF3, 0.5 µM RRF, 2.7 µM IF1, 0.4 µM IF2, 1.5 µM IF3, 40 µM EF-Tu, 93 µM EF-Ts, 1.2 µM ribosome, 2.73 µM AlaRS, 0.13 µM AspRS, 0.09 µM GlyRS, 0.11 µM LysRS, 0.03 µM MetRS, 0.68 µM PheRS, 0.16 µM ProRS, 0.25 µM SerRS (wherein the proteins prepared by the inventors were essentially prepared as His-tagged proteins)), 1 µM template mRNA, and 250 µM each of glycine, proline, alanine, phenylalanine, lysine, methionine, and serine, and the mixture was left at 37°C for 1 hour to ribosomally synthesize the peptide.

Mass spectroscopy was performed using MALDI-TOF MS for detecting peptides into which N-methylthreonine is ribosomally introduced. Specifically, a solution containing 1 µM template mRNA (R-T (SEQ ID NO: 178)) and glycine, proline, alanine, phenylalanine, lysine, methionine, and serine (each final concentration 250 µM) added to the cell-free translation system as described above was prepared. ThrRS (final concentration 2 µM) and N-methylthreonine (final concentration 5 mM) were added to the solution and incubated at 37°C for 60 minutes. The resulting translation reaction products were purified with SPE C-TIP (Nikkyo Technos Co., Ltd) and analyzed with MALDI-TOF MS. Translation products were identified with MALDI-TOF MS spectrometry using α-cyano-4-hydroxycinnamic acid as a matrix.
When translation was performed using wild-type ThrRS (SEQ ID NO: 29), a peak corresponding to the peptide of interest P-3lib15MeThr containing N-methylthreonine (Figure 10(a), Peak MeT1) and a peak corresponding to a potassium salt of the peptide (Figure 10(a), Peak MeT2) were observed as main products. At the same time, a peak corresponding to the peptide P-3lib15Thr, derived from trace amounts of Thr contaminating the translation system (Figure 10(a), Peak T1), and a peak corresponding to a potassium salt of that peptide (Figure 10(a), Peak T2) were also observed, revealing that the purity of the translation products was adversely affected (Figure 10(a)). On the other hand, when experiments were performed using the modified ThrRS 03 (SEQ ID NO: 10) and the modified ThrRS 14 (SEQ ID NO: 11) with mutations introduced, the peaks derived from P-3lib15MeThr (Peak MeT3-6, Figure 10) were similarly observed, and the peak corresponding to the peptide sequence P-3lib15Thr was only slightly observed (Figure 10(b), (c)). In other words, it was demonstrated that the peptide with MeThr introduced was synthesized with higher purity using the modified ThrRSs than using wild-type ThrRS, suggesting that the modified ThrRS 03 and 14 are ARSs that can introduce MeThr into peptides more efficiently than wild-type ThrRS.

Peptide sequence P-3lib15Thr (SEQ ID NO: 179)
formylMetLysAlaGlyProGlyPheMetThrLysSerGlySerGlySer
Peptide sequence P-3lib15MeThr (SEQ ID NO: 179)
formylMetLysAlaGlyProGlyPheMet[MeThr]LysSerGlySerGlySer

MALDI-TOF MS:

Calc. m/z: [H+M]+ = 1470.7, [K+M]+ = 1508.8 (the peptide corresponding to the sequence P-3lib15Thr)
Calc. m/z: [H+M]+ = 1484.7, [K+M]+ = 1522.8 (the peptide corresponding to the sequence P-3lib15MeThr)
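
The text does not state how the potassium adduct values were calculated. For P-3lib15MeThr, the listed [K+M]+ lies about 38.1 above the listed [H+M]+, which is consistent with replacing the hydrogen by potassium using standard atomic weights. The sketch below shows that arithmetic as an assumption, not as the inventors' method; `potassium_adduct` is a hypothetical helper name.

```python
# Standard atomic weights (assumed here; the source does not name the values used).
K_ATOMIC_WEIGHT = 39.0983
H_ATOMIC_WEIGHT = 1.00794
K_MINUS_H = K_ATOMIC_WEIGHT - H_ATOMIC_WEIGHT


def potassium_adduct(mh_plus):
    """Estimate [K+M]+ from [H+M]+ by swapping the hydrogen for potassium."""
    return mh_plus + K_MINUS_H


print(round(potassium_adduct(1484.7), 1))  # P-3lib15MeThr
```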

Example 5: Development of N-methyltryptophan-accepting ARS

Expression vectors having a polyhistidine sequence at the N-terminus and containing mutations listed in Table 8 were constructed. Then, an expression strain was transformed with the vectors, and the mutant proteins of interest were purified with a nickel column from supernatants obtained by disrupting cells.

A template mRNA (R-W (SEQ ID NO: 197)) was synthesized from a template DNA (D-W (SEQ ID NO: 196)) by in vitro transcription reaction using RiboMAX Large Scale RNA production System T7 (Promega, P1300) and purified using RNeasy Mini kit (Qiagen).

D-W (CT29) (SEQ ID NO: 196)
DNA sequence:
R-W (SEQ ID NO: 197)
RNA sequence:

<Cell-free translation system>

In order to confirm translational introduction of N-methyltryptophan, a desired polypeptide containing N-methyltryptophan was ribosomally synthesized by adding N-methyltryptophan and a mutant TrpRS to a cell-free translation system. The translation system used was the PURE system, a reconstituted cell-free protein synthesis system from E. coli. Specifically, wild-type or mutant TrpRS and N-methyltryptophan were added to a solution containing a basic cell-free translation solution (1 mM GTP, 1 mM ATP, 20 mM creatine phosphate, 50 mM HEPES-KOH pH 7.6, 100 mM potassium acetate, 9 mM magnesium acetate, 2 mM spermidine, 1 mM dithiothreitol, 1.5 mg/ml E. coli MRE600 (RNase negative)-derived tRNA (Roche), 0.1 mM 10-HCO-H4 folate, 4 µg/ml creatine kinase, 3 µg/ml myokinase, 2 unit/ml inorganic pyrophosphatase, 1.1 µg/ml nucleoside diphosphate kinase, 0.6 µM methionyl-tRNA transformylase, 0.26 µM EF-G, 0.24 µM RF2, 0.17 µM RF3, 0.5 µM RRF, 2.7 µM IF1, 0.4 µM IF2, 1.5 µM IF3, 40 µM EF-Tu, 84 µM EF-Ts, 1.2 µM ribosome, 0.03 µM ArgRS, 0.13 µM AspRS, 0.11 µM LysRS, 0.03 µM MetRS, 0.02 µM TyrRS (wherein the proteins prepared by the inventors were essentially prepared as His-tagged proteins)), 1 µM template mRNA, and 250 µM each of arginine, aspartic acid, lysine, methionine, and tyrosine, and the mixture was left at 37°C for 1 hour to ribosomally synthesize the peptide.

Mass spectroscopy was performed using MALDI-TOF MS for detecting peptides into which N-methyltryptophan is ribosomally introduced. Specifically, a solution containing 1 µM template mRNA (R-W (SEQ ID NO: 197)) and arginine, lysine, methionine, tyrosine, and aspartic acid (each final concentration 250 µM) added to the cell-free translation system as described above was prepared. TrpRS (final concentration 5 µM) and N-methyltryptophan (final concentration 5 mM) were added to the solution and incubated at 37°C for 60 minutes. The resulting translation reaction products were purified with SPE C-TIP (Nikkyo Technos Co., Ltd) and analyzed with MALDI-TOF MS. Translation products were identified with MALDI-TOF MS spectrometry using α-cyano-4-hydroxycinnamic acid as a matrix.
When translation was performed using wild-type TrpRS (SEQ ID NO: 188), a peak corresponding to the target peptide P-CT29MeTrp containing N-methyltryptophan (Figure 11(a), Peak MeW1) was observed, but the main product was the peptide P-CT29Trp (Figure 11(a), Peak W1), derived from trace amounts of Trp contaminating the translation system. On the other hand, when experiments were performed using the modified TrpRS 04 (SEQ ID NO: 184), the modified TrpRS 05 (SEQ ID NO: 185), and the modified TrpRS 18 (SEQ ID NO: 186) with mutations introduced, peaks derived from P-CT29MeTrp (Peak MeW2-4, Figure 11(b)-(d)) were observed as the main product. In other words, it was demonstrated that the MeTrp-containing peptide was synthesized at higher purity using the modified TrpRSs than using wild-type TrpRS, suggesting that these modified TrpRSs are ARSs that can introduce MeTrp into peptides more efficiently than wild-type TrpRS.

Peptide sequence P-CT29Trp (SEQ ID NO: 198)
formylMetArgTrpArgAspTyrLysAspAspAspAspLys
Peptide sequence P-CT29MeTrp (SEQ ID NO: 199)
formylMetArg[MeTrp]ArgAspTyrLysAspAspAspAspLys

MALDI-TOF MS:

Calc. m/z: [H+M]+ = 1670.7 (the peptide corresponding to the sequence P-CT29Trp)
Calc. m/z: [H+M]+ = 1684.7 (the peptide corresponding to the sequence P-CT29MeTrp)

Example 6: Development of N-methylleucine-accepting ARS

Expression vectors having a polyhistidine sequence at the N-terminus and containing mutations listed in Table 9 were constructed. Subsequently, an expression strain was transformed with the vectors, and the mutant proteins of interest were purified with a nickel column from supernatants obtained by disrupting cells.

A template mRNA (R-L (SEQ ID NO: 201)) was synthesized from a template DNA (D-L (SEQ ID NO: 200)) by in vitro transcription reaction using RiboMAX Large Scale RNA production System T7 (Promega, P1300) and purified using RNeasy Mini kit (Qiagen).

D-L (CT23) (SEQ ID NO: 200)
DNA sequence:
R-L (SEQ ID NO: 201)
RNA sequence:

<Cell-free translation system>

In order to confirm translational introduction of N-methylleucine, a desired polypeptide containing N-methylleucine was ribosomally synthesized by adding N-methylleucine and a mutant LeuRS to a cell-free translation system. The translation system used was the PURE system, a reconstituted cell-free protein synthesis system from E. coli. Specifically, wild-type or mutant LeuRS and N-methylleucine were added to a solution containing a basic cell-free translation solution (1 mM GTP, 1 mM ATP, 20 mM creatine phosphate, 50 mM HEPES-KOH pH 7.6, 100 mM potassium acetate, 9 mM magnesium acetate, 2 mM spermidine, 1 mM dithiothreitol, 1.5 mg/ml E. coli MRE600 (RNase negative)-derived tRNA (Roche), 0.1 mM 10-HCO-H4 folate, 4 µg/ml creatine kinase, 3 µg/ml myokinase, 2 unit/ml inorganic pyrophosphatase, 1.1 µg/ml nucleoside diphosphate kinase, 0.6 µM methionyl-tRNA transformylase, 0.26 µM EF-G, 0.24 µM RF2, 0.17 µM RF3, 0.5 µM RRF, 2.7 µM IF1, 0.4 µM IF2, 1.5 µM IF3, 40 µM EF-Tu, 84 µM EF-Ts, 1.2 µM ribosome, 0.03 µM ArgRS, 0.13 µM AspRS, 0.11 µM LysRS, 0.03 µM MetRS, 0.02 µM TyrRS (wherein the proteins prepared by the inventors were essentially prepared as His-tagged proteins)), 1 µM template mRNA, and 250 µM each of arginine, aspartic acid, lysine, methionine, and tyrosine, and the mixture was left at 37°C for 1 hour to ribosomally synthesize the peptide.

Mass spectroscopy was performed using MALDI-TOF MS for detecting peptides into which N-methylleucine is ribosomally introduced. Specifically, a solution containing 1 µM template mRNA (R-L (SEQ ID NO: 201)) and arginine, lysine, methionine, tyrosine, and aspartic acid (each final concentration 250 µM) added to the cell-free translation system as described above was prepared. LeuRS (final concentration 0.4-2 µM) and N-methylleucine (final concentration 5 mM) were added to the solution and incubated at 37°C for 60 minutes. The resulting translation reaction products were purified with SPE C-TIP (Nikkyo Technos Co., Ltd) and analyzed with MALDI-TOF MS. Translation products were identified with MALDI-TOF MS spectrometry using α-cyano-4-hydroxycinnamic acid as a matrix.
When translation was performed using wild-type LeuRS (SEQ ID NO: 189), no peak corresponding to the target peptide P-CT23MeLeu containing N-methylleucine was observed, and the peptide P-CT23Leu, derived from trace amounts of Leu contaminating the translation system (Figure 12(a), Peak L1), was observed as almost the only product. On the other hand, when experiments were performed using the modified LeuRS 02 (SEQ ID NO: 187) with mutations introduced, the peak derived from P-CT23MeLeu (Peak MeL1, Figure 12(b)) was observed, and its intensity was comparable to that of the peak of the Leu-containing peptide P-CT23Leu (Peak L2, Figure 12(b)). In other words, it was demonstrated that more MeLeu-containing peptide was synthesized using the modified LeuRS 02 than using wild-type LeuRS, suggesting that this modified LeuRS is an ARS that can introduce MeLeu into peptides more efficiently than wild-type LeuRS.

Peptide sequence P-CT23Leu (SEQ ID NO: 202)
formylMetArgLeuArgAspTyrLysAspAspAspAspLys
Peptide sequence P-CT23MeLeu (SEQ ID NO: 203)
formylMetArg[MeLeu]ArgAspTyrLysAspAspAspAspLys

MALDI-TOF MS:

Calc. m/z: [H+M]+ = 1597.7 (the peptide corresponding to the sequence P-CT23Leu)
Calc. m/z: [H+M]+ = 1611.7 (the peptide corresponding to the sequence P-CT23MeLeu)

Example 7: Development of modified ValRSs having increased selectivity to N-methylvaline achieved by enhancing valine-hydrolyzing ability in the editing domain

Expression vectors were constructed for modified ValRSs that have a polyhistidine sequence at the N-terminus and contain mutations (N44G, T45S) in the catalytic domain and mutation T279A(G) in the editing domain, mutations that increase activity toward N-methylvaline (Table 10). Then, an expression strain was transformed with the vectors, and the mutant proteins of interest were purified using a nickel column from supernatants obtained by disrupting cells.

[Synthesis of E. coli tRNAVal by in vitro transcription reaction]

E. coli tRNA (R-tRNAVal1 (SEQ ID NO: 205)) was synthesized from a template DNA (D-tRNAVal1 (SEQ ID NO: 204)) by in vitro transcription reaction using RiboMAX Large Scale RNA production System T7 (Promega, P1300) in the presence of 7.5 mM GMP, and purified using RNeasy Mini kit (Qiagen).

D-tRNAVal1 (SEQ ID NO: 204)
tRNAVal1 DNA sequence:
R-tRNAVal1 (SEQ ID NO: 205)
tRNAVal1 RNA sequence:

For the aminoacylation reaction, a solution containing 50 µM transcribed tRNA, 10 mM HEPES-K (pH 7.6), and 10 mM KCl was heated at 95°C for 2 minutes and then left at room temperature for 5 minutes or more to refold the tRNA. This tRNA solution was added to a final concentration of 10 µM to an acylation buffer (final concentrations: 50 mM HEPES-K [pH 7.6], 2 mM ATP, 100 mM potassium acetate, 10 mM magnesium acetate, 1 mM DTT, 2 mM spermidine, 0.1 mg/mL bovine serum albumin), mixed with a mutant ValRS (final concentration 2 µM) and N-methylvaline (final concentration 0.08 mM-5 mM) or valine (final concentration 0.031 mM-0.25 mM), and incubated at 37°C for 10 minutes. Four volumes of a loading buffer (90 mM sodium acetate [pH 5.2], 10 mM EDTA, 95% (w/w) formamide, 0.1% (w/v) xylene cyanol) were added to the reaction solution, and the mixture was analyzed by acidic PAGE containing 6 M urea; aminoacylation activity was detected by separating unreacted tRNA from aminoacyl-tRNA. RNA was stained with SYBR Gold (Life Technologies) and detected with LAS4000 (GE Healthcare).
The results showed that, with N-methylvaline as substrate, the acylation abilities of mutants 13-11, 66 (SEQ ID NO: 182), and 67 (SEQ ID NO: 183) did not differ greatly at any N-methylvaline concentration (Figure 13, for example, lanes 17-19). On the other hand, with valine as substrate, the acylation abilities of mutants 66 and 67 were reduced compared to that of mutant 13-11 (Figure 14, for example, lanes 17-19). This demonstrated that mutants 66 and 67, which have mutations newly introduced into the editing domain, are modified ValRSs that have reduced aminoacylation activity toward valine and therefore have increased selectivity for N-methylvaline.
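
The text gives only the endpoints of the substrate concentration ranges used in this titration (0.08 mM-5 mM N-methylvaline, 0.031 mM-0.25 mM valine). Those endpoints happen to be consistent with two-fold serial dilutions, and the sketch below generates such series purely as an assumption; the intermediate concentrations and the number of points are not stated in the source, and `twofold_series` is a hypothetical helper name.

```python
def twofold_series(top_mm, points):
    """Concentrations (mM) obtained by repeatedly halving the top concentration."""
    return [top_mm / 2 ** i for i in range(points)]


# Assumed series; only the first and last values correspond to figures in the text.
n_methylvaline_mm = twofold_series(5.0, 7)
valine_mm = twofold_series(0.25, 4)

print("N-methylvaline (mM):", n_methylvaline_mm)
print("valine (mM):", valine_mm)
```

Under this assumption the lowest points are 0.078125 mM (listed as 0.08 mM) and 0.03125 mM (listed as 0.031 mM).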

ABSTRACT

The present invention provides modified valyl-tRNA synthetases (ValRSs) having increased reactivity with N-methyl valine compared to natural ValRSs. The present invention enables a more efficient production of polypeptides containing N-methyl valine.

SEQUENCE LISTING

<110> CHUGAI SEIYAKU KABUSHIKI KAISHA
<120> Modified aminoacyl tRNA synthetases and their use
<130> AA2504 EP S3
<140> EP 16 76 4874.0
<141> 2016-03-11
<150> PCT/JP2016/057707
<151> 2015-03-13
<160> 205
<170> PatentIn version 3.5
<210> 1
<211> 327
<212> PRT
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 1
<210> 2
<211> 327
<212> PRT
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 2
<210> 3
<211> 951
<212> PRT
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 3
<210> 4
<211> 951
<212> PRT
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 4
<210> 5
<211> 951
<212> PRT
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 5
<210> 6
<211> 430
<212> PRT
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 6
<210> 7
<211> 430
<212> PRT
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 7
<210> 8
<211> 430
<212> PRT
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 8
<210> 9
<211> 430
<212> PRT
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 9
<210> 10
<211> 642
<212> PRT
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 10
<210> 11
<211> 642
<212> PRT
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 11
<210> 12
<211> 981
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 12
<210> 13
<211> 981
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 13
<210> 14
<211> 2853
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 14
<210> 15
<211> 2853
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 15
<210> 16
<211> 2853
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 16
<210> 17
<211> 1290
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 17
<210> 18
<211> 1290
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 18
<210> 19
<211> 1290
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 19
<210> 20
<211> 1290
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 20
<210> 21
<211> 1926
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 21
<210> 22
<211> 1926
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 22
<210> 23
<211> 2853
<212> DNA
<213> E. coli
<400> 23
<210> 24
<211> 951
<212> PRT
<213> E. coli
<400> 24
<210> 25
<211> 1290
<212> DNA
<213> E. coli
<400> 25
<210> 26
<211> 430
<212> PRT
<213> E. coli
<400> 26
<210> 27
<211> 981
<212> DNA
<213> E. coli
<400> 27
<210> 28
<211> 327
<212> PRT
<213> E. coli
<400> 28
<210> 29
<211> 642
<212> PRT
<213> E. coli
<400> 29

<210> 30
<211> 41
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 30
gcctgctgcg tacccagacc gcgggcgtac agatccgcac c 41
<210> 31
<211> 41
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 31
gcctgctgcg tacccagacc ggcggcgtac agatccgcac c 41
<210> 32
<211> 42
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 32
ctacccgcct gctgcgtacc gcgacctctg gcgtacagat cc 42
<210> 33
<211> 42
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 33
ctacccgcct gctgcgtacc ggcacctctg gcgtacagat cc 42
<210> 34
<211> 41
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 34
ggtgcggatc tgtacgcccg cggtctgggt acgcagcagg c 41
<210> 35
<211> 41
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 35
ggtgcggatc tgtacgccgc cggtctgggt acgcagcagg c 41
<210> 36
<211> 42
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 36
ggatctgtac gccagaggtc gcggtacgca gcaggcgggt ag 42
<210> 37
<211> 42
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 37
ggatctgtac gccagaggtg ccggtacgca gcaggcgggt ag 42
<210> 38
<211> 97
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 38
<210> 39
<211> 76
<212> RNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 39
<210> 40
<211> 92
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 40
<210> 41
<211> 71
<212> RNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 41
<210> 42
<211> 12
<212> PRT
<213> Artificial
<220>
<223> artificially synthesized sequence
<220>
<221> MISC_FEATURE
<222> (3)..(3)
<223> Xaa is phenylalanine or methyl phenylalanine
<400> 42
<210> 43
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 43
gtttctgcat catgatcccg gctccgaacg tcaccggcag tttgcatat 49
<210> 44
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 44
gtttctgcat catgatcccg ggtccgaacg tcaccggcag tttgcatat 49
<210> 45
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 45
gtttctgcat catgatcccg ccgccggctg tcaccggcag tttgcatat 49
<210> 46
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 46
gtttctgcat catgatcccg gctccggctg tcaccggcag tttgcatat 49
<210> 47
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 47
gtttctgcat catgatcccg ggtccggctg tcaccggcag tttgcatat 49
<210> 48
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 48
gtttctgcat catgatcccg ccgccgagtg tcaccggcag tttgcatat 49
<210> 49
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 49
gtttctgcat catgatcccg gctccgagtg tcaccggcag tttgcatat 49
<210> 50
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 50
gtttctgcat catgatcccg ggtccgagtg tcaccggcag tttgcatat 49
<210> 51
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 51
gtttctgcat catgatcccg ccgccggtag tcaccggcag tttgcatat 49
<210> 52
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 52
gtttctgcat catgatcccg gctccggtag tcaccggcag tttgcatat 49
<210> 53
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 53
gtttctgcat catgatcccg ggtccggtag tcaccggcag tttgcatat 49
<210> 54
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 54
gtttctgcat catgatcccg ccgccgggtg tcaccggcag tttgcatat 49
<210> 55
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 55
gtttctgcat catgatcccg gctccgggtg tcaccggcag tttgcatat 49
<210> 56
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 56
gtttctgcat catgatcccg ggtccgggtg tcaccggcag tttgcatat 49
<210> 57
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 57
gtttctgcat catgatcccg ccgccggatg tcaccggcag tttgcatat 49
<210> 58
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 58
gtttctgcat catgatcccg gctccggatg tcaccggcag tttgcatat 49
<210> 59
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 59
gtttctgcat catgatcccg ggtccggatg tcaccggcag tttgcatat 49
<210> 60
<211> 33
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 60
tggcaggtcg gtactgctca cgccgggatc gct 33
<210> 61
<211> 33
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 61
tggcaggtcg gtactagtca cgccgggatc gct 33
<210> 62
<211> 33
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 62
tggcaggtcg gtactgtaca cgccgggatc gct 33
<210> 63
<211> 33
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 63
tggcaggtcg gtactggtca cgccgggatc gct 33
<210> 64
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 64
gtactgacca cgccgggatc tatacccaga tggtcgttga gcg 43
<210> 65
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 65
gtactgacca cgccgggatc tggacccaga tggtcgttga gcg 43
<210> 66
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 66
gtactgacca cgccgggatc tcaacccaga tggtcgttga gcg 43
<210> 67
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 67
gtactgacca cgccgggatc atgacccaga tggtcgttga gcg 43
<210> 68
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 68
gtactgacca cgccgggatc aaaacccaga tggtcgttga gcg 43
<210> 69
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 69
gtactgacca cgccgggatc aacacccaga tggtcgttga gcg 43
<210> 70
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 70
gtactgacca cgccgggatc gtgacccaga tggtcgttga gcg 43
<210> 71
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 71
gtactgacca cgccgggatc ctgacccaga tggtcgttga gcg 43
<210> 72
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 72
gatcccgccg ccgggtgtct atggcagttt gcatatgggt cac 43
<210> 73
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 73
gatcccgccg ccgggtgtct ggggcagttt gcatatgggt cac 43
<210> 74
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 74
gatcccgccg ccgggtgtct caggcagttt gcatatgggt cac 43
<210> 75
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 75
gatcccgccg ccgggtgtca tgggcagttt gcatatgggt cac 43
<210> 76
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 76
gatcccgccg ccgggtgtca aaggcagttt gcatatgggt cac 43
<210> 77
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 77
gatcccgccg ccgggtgtca acggcagttt gcatatgggt cac 43
<210> 78
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 78
gatcccgccg ccgggtgtcg tgggcagttt gcatatgggt cac 43
<210> 79
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 79
gatcccgccg ccgggtgtcc tgggcagttt gcatatgggt cac 43
<210> 80
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 80
atatgcaaac tgccggtgac gttcggagcc gggatcatga tgcagaaac 49
<210> 81
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 81
atatgcaaac tgccggtgac gttcggaccc gggatcatga tgcagaaac 49
<210> 82
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 82
atatgcaaac tgccggtgac agccggcggc gggatcatga tgcagaaac 49
<210> 83
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 83
atatgcaaac tgccggtgac agccggagcc gggatcatga tgcagaaac 49
<210> 84
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 84
atatgcaaac tgccggtgac agccggaccc gggatcatga tgcagaaac 49
<210> 85
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 85
atatgcaaac tgccggtgac actcggcggc gggatcatga tgcagaaac 49
<210> 86
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 86
atatgcaaac tgccggtgac actcggagcc gggatcatga tgcagaaac 49
<210> 87
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 87
atatgcaaac tgccggtgac actcggaccc gggatcatga tgcagaaac 49
<210> 88
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 88
atatgcaaac tgccggtgac taccggcggc gggatcatga tgcagaaac 49
<210> 89
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 89
atatgcaaac tgccggtgac taccggagcc gggatcatga tgcagaaac 49
<210> 90
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 90
atatgcaaac tgccggtgac taccggaccc gggatcatga tgcagaaac 49
<210> 91
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 91
atatgcaaac tgccggtgac acccggcggc gggatcatga tgcagaaac 49
<210> 92
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 92
atatgcaaac tgccggtgac acccggagcc gggatcatga tgcagaaac 49
<210> 93
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 93
atatgcaaac tgccggtgac acccggaccc gggatcatga tgcagaaac 49
<210> 94
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 94
atatgcaaac tgccggtgac atccggcggc gggatcatga tgcagaaac 49
<210> 95
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 95
atatgcaaac tgccggtgac atccggagcc gggatcatga tgcagaaac 49
<210> 96
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 96
atatgcaaac tgccggtgac atccggaccc gggatcatga tgcagaaac 49
<210> 97
<211> 33
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 97
agcgatcccg gcgtgagcag taccgacctg cca 33
<210> 98
<211> 33
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 98
agcgatcccg gcgtgactag taccgacctg cca 33
<210> 99
<211> 33
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 99
agcgatcccg gcgtgtacag taccgacctg cca 33
<210> 100
<211> 33
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 100
agcgatcccg gcgtgaccag taccgacctg cca 33
<210> 101
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 101
cgctcaacga ccatctgggt atagatcccg gcgtggtcag tac 43
<210> 102
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 102
cgctcaacga ccatctgggt ccagatcccg gcgtggtcag tac 43
<210> 103
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 103
cgctcaacga ccatctgggt tgagatcccg gcgtggtcag tac 43
<210> 104
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 104
cgctcaacga ccatctgggt catgatcccg gcgtggtcag tac 43
<210> 105
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 105
cgctcaacga ccatctgggt tttgatcccg gcgtggtcag tac 43
<210> 106
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 106
cgctcaacga ccatctgggt gttgatcccg gcgtggtcag tac 43
<210> 107
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 107
cgctcaacga ccatctgggt cacgatcccg gcgtggtcag tac 43
<210> 108
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 108
cgctcaacga ccatctgggt caggatcccg gcgtggtcag tac 43
<210> 109
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 109
gtgacccata tgcaaactgc catagacacc cggcggcggg atc 43
<210> 110
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 110
gtgacccata tgcaaactgc cccagacacc cggcggcggg atc 43
<210> 111
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 111
gtgacccata tgcaaactgc ctgagacacc cggcggcggg atc 43
<210> 112
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 112
gtgacccata tgcaaactgc ccatgacacc cggcggcggg atc 43
<210> 113
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 113
gtgacccata tgcaaactgc ctttgacacc cggcggcggg atc 43
<210> 114
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 114
gtgacccata tgcaaactgc cgttgacacc cggcggcggg atc 43
<210> 115
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 115
gtgacccata tgcaaactgc ccacgacacc cggcggcggg atc 43
<210> 116
<211> 43
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 116
gtgacccata tgcaaactgc ccaggacacc cggcggcggg atc 43
<210> 117
<211> 98
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 117
<210> 118
<211> 77
<212> RNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 118
<210> 119
<211> 92
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 119
<210> 120
<211> 71
<212> RNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 120
<210> 121
<211> 95
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 121
<210> 122
<211> 74
<212> RNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 122
<210> 123
<211> 98
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 123
<210> 124
<211> 77
<212> RNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 124
<210> 125
<211> 12
<212> PRT
<213> Artificial
<220>
<223> artificially synthesized sequence
<220>
<221> MISC_FEATURE
<222> (3)..(3)
<223> Xaa is valine or methyl valine
<400> 125
<210> 126
<211> 13
<212> PRT
<213> Artificial
<220>
<223> artificially synthesized sequence
<220>
<221> MISC_FEATURE
<222> (3)..(4)
<223> Xaa is valine or methyl valine
<400> 126
<210> 127
<211> 14
<212> PRT
<213> Artificial
<220>
<223> artificially synthesized sequence
<220>
<221> MISC_FEATURE
<222> (3)..(5)
<223> Xaa is valine or methyl valine
<400> 127
<210> 128
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 128
gtaactatgc gctgatccca gctgcagaag ttccgctgac taacctggt 49
<210> 129
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 129
gtaactatgc gctgatccca agtgcagaag ttccgctgac taacctggt 49
<210> 130
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 130
gtaactatgc gctgatccca ggtgcagaag ttccgctgac taacctggt 49
<210> 131
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 131
gtaactatgc gctgatccca acggcagctg ttccgctgac taacctggt 49
<210> 132
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 132
gtaactatgc gctgatccca acggcagtag ttccgctgac taacctggt 49
<210> 133
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 133
gtaactatgc gctgatccca acggcaagtg ttccgctgac taacctggt 49
<210> 134
<211> 33
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 134
ctggttcata ccctggctgg ttctggtctg gct 33
<210> 135
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 135
gtaactatgc gctgatccca gctgcagctg ttccgctgac taacctggt 49
<210> 136
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 136
gtaactatgc gctgatccca gctgcagtag ttccgctgac taacctggt 49
<210> 137
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 137
gtaactatgc gctgatccca gctgcaagtg ttccgctgac taacctggt 49
<210> 138
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 138
gtaactatgc gctgatccca agtgcagctg ttccgctgac taacctggt 49
<210> 139
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 139
gtaactatgc gctgatccca agtgcagtag ttccgctgac taacctggt 49
<210> 140
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 140
gtaactatgc gctgatccca agtgcaagtg ttccgctgac taacctggt 49
<210> 141
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 141
gtaactatgc gctgatccca ggtgcagctg ttccgctgac taacctggt 49
<210> 142
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 142
gtaactatgc gctgatccca ggtgcagtag ttccgctgac taacctggt 49
<210> 143
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 143
gtaactatgc gctgatccca ggtgcaagtg ttccgctgac taacctggt 49
<210> 144
<211> 33
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 144
cagttcgaca aagttgctat ggtgcagatc gtg 33
<210> 145
<211> 33
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 145
atgcaccagt tcgacgctgt tgaaatggtg cag 33
<210> 146
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 146
gtaactatgc gctgatccca acggcaggtg ttccgctgac taacctggt 49
<210> 147
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 147
gtaactatgc gctgatccca acggcagacg ttccgctgac taacctggt 49
<210> 148
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 148
gtaactatgc gctgatccca agtgcaggtg ttccgctgac taacctggt 49
<210> 149
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 149
gtaactatgc gctgatccca agtgcagacg ttccgctgac taacctggt 49
<210> 150
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 150
accaggttag tcagcggaac ttctgcagct gggatcagcg catagttac 49
<210> 151
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 151
accaggttag tcagcggaac ttctgcactt gggatcagcg catagttac 49
<210> 152
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 152
accaggttag tcagcggaac ttctgcacct gggatcagcg catagttac 49
<210> 153
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 153
accaggttag tcagcggaac agctgccgtt gggatcagcg catagttac 49
<210> 154
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 154
accaggttag tcagcggaac tactgccgtt gggatcagcg catagttac 49
<210> 155
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 155
accaggttag tcagcggaac acttgccgtt gggatcagcg catagttac 49
<210> 156
<211> 33
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 156
agccagacca gaaccagcca gggtatgaac cag 33
<210> 157
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 157
accaggttag tcagcggaac agctgcagct gggatcagcg catagttac 49
<210> 158
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 158
accaggttag tcagcggaac tactgcagct gggatcagcg catagttac 49
<210> 159
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 159
accaggttag tcagcggaac acttgcagct gggatcagcg catagttac 49
<210> 160
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 160
accaggttag tcagcggaac agctgcactt gggatcagcg catagttac 49
<210> 161
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 161
accaggttag tcagcggaac tactgcactt gggatcagcg catagttac 49
<210> 162
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 162
accaggttag tcagcggaac acttgcactt gggatcagcg catagttac 49
<210> 163
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 163
accaggttag tcagcggaac agctgcacct gggatcagcg catagttac 49
<210> 164
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 164
accaggttag tcagcggaac tactgcacct gggatcagcg catagttac 49
<210> 165
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 165
accaggttag tcagcggaac acttgcacct gggatcagcg catagttac 49
<210> 166
<211> 33
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 166
cacgatctgc accatagcaa ctttgtcgaa ctg 33
<210> 167
<211> 33
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 167
ctgcaccatt tcaacagcgt cgaactggtg cat 33
<210> 168
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 168
accaggttag tcagcggaac acctgccgtt gggatcagcg catagttac 49
<210> 169
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 169
accaggttag tcagcggaac gtctgccgtt gggatcagcg catagttac 49
<210> 170
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 170
accaggttag tcagcggaac acctgcactt gggatcagcg catagttac 49
<210> 171
<211> 49
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 171
accaggttag tcagcggaac gtctgcactt gggatcagcg catagttac 49
<210> 172
<211> 114
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 172
<210> 173
<211> 93
<212> RNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 173
<210> 174
<211> 92
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 174
<210> 175
<211> 71
<212> RNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 175
<210> 176
<211> 12
<212> PRT
<213> Artificial
<220>
<223> artificially synthesized sequence
<220>
<221> MISC_FEATURE
<222> (3)..(3)
<223> Xaa is serine or methyl serine
<400> 176
<210> 177
<211> 103
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 177
<210> 178
<211> 82
<212> RNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 178
<210> 179
<211> 15
<212> PRT
<213> Artificial
<220>
<223> artificially synthesized sequence
<220>
<221> MISC_FEATURE
<222> (9)..(9)
<223> Xaa is threonine or methyl threonine
<400> 179
<210> 180
<211> 7
<212> PRT
<213> Artificial
<220>
<223> artificially synthesized sequence
<220>
<221> MISC_FEATURE
<222> (4)..(4)
<223> Xaa is Asn (N), Tyr (Y), or Thr (T).
<220>
<221> MISC_FEATURE
<222> (5)..(5)
<223> Xaa is any amino acid.
<220>
<221> MISC_FEATURE
<222> (6)..(6)
<223> Xaa is Thr (T) or Ser (S).
<400> 180
<210> 181
<211> 7
<212> PRT
<213> Artificial
<220>
<223> artificially synthesized sequence
<220>
<221> MISC_FEATURE
<222> (5)..(5)
<223> Xaa is any amino acid.
<400> 181
<210> 182
<211> 951
<212> PRT
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 182
<210> 183
<211> 951
<212> PRT
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 183
<210> 184
<211> 334
<212> PRT
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 184
<210> 185
<211> 334
<212> PRT
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 185
<210> 186
<211> 334
<212> PRT
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 186
<210> 187
<211> 860
<212> PRT
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 187
<210> 188
<211> 334
<212> PRT
<213> E. coli
<400> 188
<210> 189
<211> 860
<212> PRT
<213> E. coli
<400> 189
<210> 190
<211> 2853
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 190
<210> 191
<211> 2853
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 191
<210> 192
<211> 1002
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 192
<210> 193
<211> 1002
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 193
<210> 194
<211> 1002
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 194
<210> 195
<211> 2580
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 195
<210> 196
<211> 92
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 196
<210> 197
<211> 71
<212> RNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 197
<210> 198
<211> 12
<212> PRT
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 198
<210> 199
<211> 12
<212> PRT
<213> Artificial
<220>
<223> artificially synthesized sequence
<220>
<221> misc_feature
<222> (3)..(3)
<223> Xaa is methyl tryptophan
<400> 199
<210> 200
<211> 92
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 200
<210> 201
<211> 71
<212> RNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 201
<210> 202
<211> 12
<212> PRT
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 202
<210> 203
<211> 12
<212> PRT
<213> Artificial
<220>
<223> artificially synthesized sequence
<220>
<221> misc_feature
<222> (3)..(3)
<223> Xaa is methyl leucine
<400> 203
<210> 204
<211> 97
<212> DNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 204
<210> 205
<211> 76
<212> RNA
<213> Artificial
<220>
<223> artificially synthesized sequence
<400> 205
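
Several of the oligonucleotides listed above appear to come in complementary pairs. For example, SEQ ID NO: 34 reads as the reverse complement of SEQ ID NO: 30, which would be consistent with a forward/reverse primer pair for site-directed mutagenesis. That reading is inferred from the listed sequences only and is not stated in this listing. The following minimal Python sketch checks the relationship; the helper names `normalize` and `reverse_complement` are hypothetical and chosen for illustration.

```python
# Complement table for lowercase DNA, matching the case used in the listing.
COMPLEMENT = str.maketrans("acgt", "tgca")


def normalize(seq: str) -> str:
    """Keep only the bases a, c, g, t (drops the 10-base spacing and any digits)."""
    return "".join(ch for ch in seq.lower() if ch in "acgt")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence as a lowercase string."""
    return normalize(seq).translate(COMPLEMENT)[::-1]


# Sequences copied from the listing entries <400> 30 and <400> 34.
SEQ_30 = "gcctgctgcg tacccagacc gcgggcgtac agatccgcac c"
SEQ_34 = "ggtgcggatc tgtacgcccg cggtctgggt acgcagcagg c"

# True when SEQ ID NO: 34 is the exact reverse complement of SEQ ID NO: 30.
is_pair = reverse_complement(SEQ_30) == normalize(SEQ_34)
print(is_pair)
```

The same check can be applied to other candidate pairs in the listing by substituting the corresponding sequence strings.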

Claims

1. A modified valyl-tRNA synthetase (ValRS) which incorporates an N-methyl valine more efficiently than the ValRS having the amino acid sequence SEQ ID NO:24, wherein said modified ValRS is selected from the following (a) and (b):
(a) a ValRS modified at a position(s) corresponding to asparagine at position 43 and/or threonine at position 45 and/or threonine at position 279 of ValRS from Escherichia coli having the amino acid sequence SEQ ID NO:24, and
(b) a ValRS having (i) glycine or alanine at a position corresponding to asparagine at position 43 and/or (ii) serine at a position corresponding to threonine at position 45 and/or (iii) glycine or alanine at a position corresponding to threonine at position 279 of ValRS from Escherichia coli having the amino acid sequence SEQ ID NO:24,
wherein the modified ValRS of (a) and (b) comprises an amino acid sequence having at least 90% identity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 3 to 5 and 182 and 183.
2. The modified ValRS of claim 1, wherein the modification comprises at least one amino acid substitution that causes a decrease of 10 or more in molecular weight.
3. The modified ValRS according to claims 1 or 2, wherein the ValRS comprises amino acids selected from the group consisting of SEQ ID NOs: 3 to 5 and 182 and 183.
4. A polynucleotide encoding the modified ValRS according to any one of claims 1 to 3.
5. A vector comprising the polynucleotide according to claim 4.
6. A host cell comprising the polynucleotide according to claim 4 or the vector according to claim 5.
7. A method for producing the modified ValRS according to any one of claims 1 to 3, comprising the step of culturing the host cell according to claim 6.
8. A method for producing a tRNA acylated with an N-methylvaline, comprising the step of contacting the N-methylvaline with a tRNA in the presence of the modified ValRS according to any one of claims 1 to 3.
9. A method for producing a polypeptide comprising an N-methylvaline, comprising the step of performing translation in the presence of the modified ValRS according to any one of claims 1 to 3 and the N-methylvaline.
10. The method according to claim 9, wherein the step of performing translation is carried out in a cell-free translation system.
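
As an illustration of the molecular-weight criterion recited in the claims (a substitution causing a decrease of 10 or more in molecular weight), the following minimal Python sketch computes the decrease for each substitution named in item (b) of the first claim. It assumes that the decrease is taken as the difference between the average molecular weights of the free amino acids; the claims as reproduced here do not specify the calculation, and the weights below are standard textbook values rather than figures taken from this document. The names `MOLECULAR_WEIGHT`, `SUBSTITUTIONS`, and `weight_decrease` are hypothetical.

```python
# Average molecular weights (g/mol) of the free amino acids.
# Standard textbook values; an assumption, not data from the patent.
MOLECULAR_WEIGHT = {
    "Gly": 75.07,
    "Ala": 89.09,
    "Ser": 105.09,
    "Thr": 119.12,
    "Asn": 132.12,
}

# Substitutions named in item (b): (position in SEQ ID NO:24, original, replacement).
SUBSTITUTIONS = [
    (43, "Asn", "Gly"),
    (43, "Asn", "Ala"),
    (45, "Thr", "Ser"),
    (279, "Thr", "Gly"),
    (279, "Thr", "Ala"),
]


def weight_decrease(original: str, replacement: str) -> float:
    """Molecular-weight decrease caused by replacing `original` with `replacement`."""
    return round(MOLECULAR_WEIGHT[original] - MOLECULAR_WEIGHT[replacement], 2)


# Map a label such as "Asn43Gly" to its molecular-weight decrease.
decreases = {
    f"{orig}{pos}{repl}": weight_decrease(orig, repl)
    for pos, orig, repl in SUBSTITUTIONS
}

for label, delta in decreases.items():
    # The last column reports whether the decrease is 10 or more.
    print(label, delta, delta >= 10)
```

Under this assumption, every substitution listed in item (b) replaces a residue with a lighter one by more than 10, the smallest change being threonine to serine (loss of one methylene group, about 14).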