WO2019005973A1 - Synthetase variants for incorporation of biphenylalanine into a peptide - Google Patents
Synthetase variants for incorporation of biphenylalanine into a peptide
- Publication number
- WO2019005973A1 (PCT/US2018/039764)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- variant
- amino acid
- trna
- amino
- parental
Links
- JCZLABDVDPYLRZ-AWEZNQCLSA-N Biphenylalanine Chemical compound C1=CC(C[C@H](N)C(O)=O)=CC=C1C1=CC=CC=C1 JCZLABDVDPYLRZ-AWEZNQCLSA-N 0.000 title claims abstract description 107
- 238000010348 incorporation Methods 0.000 title claims abstract description 34
- 102000003960 Ligases Human genes 0.000 title claims description 35
- 108090000364 Ligases Proteins 0.000 title claims description 35
- 229920001949 Transfer RNA Polymers 0.000 claims abstract description 164
- 102000008745 EC 6.1.1.- Human genes 0.000 claims abstract description 124
- 108030004302 EC 6.1.1.- Proteins 0.000 claims abstract description 124
- 102000004169 proteins and genes Human genes 0.000 claims abstract description 72
- 108090000623 proteins and genes Proteins 0.000 claims abstract description 72
- 235000001014 amino acid Nutrition 0.000 claims description 247
- 150000001413 amino acids Chemical class 0.000 claims description 195
- 108020004566 Transfer RNA Proteins 0.000 claims description 155
- 210000004027 cells Anatomy 0.000 claims description 151
- 238000006467 substitution reaction Methods 0.000 claims description 77
- 229920001184 polypeptide Polymers 0.000 claims description 68
- 235000018102 proteins Nutrition 0.000 claims description 63
- 125000003275 alpha amino acid group Chemical group 0.000 claims description 60
- 230000014509 gene expression Effects 0.000 claims description 43
- 102000033147 ERVK-25 Human genes 0.000 claims description 40
- 108091005771 Peptidases Proteins 0.000 claims description 40
- 125000006239 protecting group Chemical group 0.000 claims description 40
- 239000004365 Protease Substances 0.000 claims description 39
- 150000007523 nucleic acids Chemical class 0.000 claims description 34
- 239000002773 nucleotide Substances 0.000 claims description 29
- 125000003729 nucleotide group Chemical group 0.000 claims description 29
- 230000001809 detectable Effects 0.000 claims description 26
- 102000035425 adaptor proteins Human genes 0.000 claims description 23
- 108091005736 adaptor proteins Proteins 0.000 claims description 23
- 241000894006 Bacteria Species 0.000 claims description 22
- 229920001850 Nucleic acid sequence Polymers 0.000 claims description 22
- 102000004190 Enzymes Human genes 0.000 claims description 21
- 108090000790 Enzymes Proteins 0.000 claims description 21
- 108090000848 Ubiquitin Proteins 0.000 claims description 21
- 102400000757 Ubiquitin Human genes 0.000 claims description 21
- 108020004707 nucleic acids Proteins 0.000 claims description 21
- 210000003527 eukaryotic cell Anatomy 0.000 claims description 19
- 230000000875 corresponding Effects 0.000 claims description 17
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 claims description 15
- 230000035772 mutation Effects 0.000 claims description 14
- 229920000023 polynucleotide Polymers 0.000 claims description 12
- 239000002157 polynucleotide Substances 0.000 claims description 12
- 210000001236 prokaryotic cell Anatomy 0.000 claims description 12
- 125000000266 alpha-aminoacyl group Chemical group 0.000 claims description 10
- 102200055973 PCSK9 C67A Human genes 0.000 claims description 7
- 102220005321 rs33918131 Human genes 0.000 claims description 7
- 102200089536 PON1 L55M Human genes 0.000 claims description 6
- 102220340881 rs1554949196 Human genes 0.000 claims description 6
- 102220005289 rs33948615 Human genes 0.000 claims description 6
- 102220018893 rs80358527 Human genes 0.000 claims description 6
- 229960005190 Phenylalanine Drugs 0.000 claims description 5
- 238000004519 manufacturing process Methods 0.000 claims description 5
- 102220065998 rs201123766 Human genes 0.000 claims description 5
- 102220351450 c.23G>A Human genes 0.000 claims description 4
- 235000008729 phenylalanine Nutrition 0.000 claims description 4
- ZUOUZKKEUPVFJK-UHFFFAOYSA-N diphenyl Chemical group C1=CC=CC=C1C1=CC=CC=C1 ZUOUZKKEUPVFJK-UHFFFAOYSA-N 0.000 claims description 2
- 102220218371 rs761240106 Human genes 0.000 claims 3
- 102220425943 CLEC3B K76R Human genes 0.000 claims 2
- KISFEBPWFCGRGN-UHFFFAOYSA-M sodium;2-(2,4-dichlorophenoxy)ethyl sulfate Chemical compound [Na+].[O-]S(=O)(=O)OCCOC1=CC=C(Cl)C=C1Cl KISFEBPWFCGRGN-UHFFFAOYSA-M 0.000 claims 1
- 229940019746 Antifibrinolytic amino acids Drugs 0.000 description 44
- 241000588724 Escherichia coli Species 0.000 description 41
- 230000002068 genetic Effects 0.000 description 34
- 239000005090 green fluorescent protein Substances 0.000 description 22
- 230000000368 destabilizing Effects 0.000 description 20
- 230000015556 catabolic process Effects 0.000 description 18
- 230000004059 degradation Effects 0.000 description 18
- 238000006731 degradation reaction Methods 0.000 description 18
- 108010043121 Green Fluorescent Proteins Proteins 0.000 description 17
- 102000004144 Green Fluorescent Proteins Human genes 0.000 description 17
- 230000001939 inductive effect Effects 0.000 description 17
- 101710019508 AARS1 Proteins 0.000 description 16
- 102100006286 AARS1 Human genes 0.000 description 16
- 229940023064 Escherichia coli Drugs 0.000 description 16
- 239000000758 substrate Substances 0.000 description 15
- 238000004166 bioassay Methods 0.000 description 14
- 239000002609 media Substances 0.000 description 14
- 108020004705 Codon Proteins 0.000 description 13
- 229940088598 Enzyme Drugs 0.000 description 13
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 13
- 235000014680 Saccharomyces cerevisiae Nutrition 0.000 description 13
- conditions Substances 0.000 description 13
- 230000000694 effects Effects 0.000 description 13
- RAXXELZNTBOGNW-UHFFFAOYSA-N Imidazole Chemical compound C1=CNC=N1 RAXXELZNTBOGNW-UHFFFAOYSA-N 0.000 description 12
- 229920003013 deoxyribonucleic acid Polymers 0.000 description 12
- 101710028284 YARS Proteins 0.000 description 11
- 125000000089 arabinosyl group Chemical class C1([C@@H](O)[C@H](O)[C@H](O)CO1)* 0.000 description 11
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 10
- 150000002500 ions Chemical class 0.000 description 10
- 230000037361 pathway Effects 0.000 description 10
- 230000001105 regulatory Effects 0.000 description 10
- 229960005091 Chloramphenicol Drugs 0.000 description 9
- WIIZWVCIJKGZOK-RKDXNWHRSA-N Chloramphenicol Chemical compound ClC(Cl)C(=O)N[C@H](CO)[C@H](O)C1=CC=C([N+]([O-])=O)C=C1 WIIZWVCIJKGZOK-RKDXNWHRSA-N 0.000 description 9
- 102100006066 EEF1A1 Human genes 0.000 description 9
- 108010049977 Peptide Elongation Factor Tu Proteins 0.000 description 9
- 238000007792 addition Methods 0.000 description 9
- 238000000338 in vitro Methods 0.000 description 9
- 238000004949 mass spectrometry Methods 0.000 description 9
- 238000000034 method Methods 0.000 description 9
- 101700082234 pylS Proteins 0.000 description 9
- 238000006243 chemical reaction Methods 0.000 description 8
- 239000000499 gel Substances 0.000 description 8
- FAPWRFPIFSIZLT-UHFFFAOYSA-M sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 8
- 101710019459 tyrS1 Proteins 0.000 description 8
- XSQUKJJJFZCRTK-UHFFFAOYSA-N urea Chemical compound NC(N)=O XSQUKJJJFZCRTK-UHFFFAOYSA-N 0.000 description 8
- QIVBCDIJIAJPQS-VIFPVBQESA-N L-tryptophane Chemical compound C1=CC=C2C(C[C@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-VIFPVBQESA-N 0.000 description 7
- 235000003534 Saccharomyces carlsbergensis Nutrition 0.000 description 7
- 229940081969 Saccharomyces cerevisiae Drugs 0.000 description 7
- 102100003329 YARS2 Human genes 0.000 description 7
- 230000001580 bacterial Effects 0.000 description 7
- 230000015572 biosynthetic process Effects 0.000 description 7
- 230000002708 enhancing Effects 0.000 description 7
- 238000002474 experimental method Methods 0.000 description 7
- 241000894007 species Species 0.000 description 7
- 229940110715 ENZYMES FOR TREATMENT OF WOUNDS AND ULCERS Drugs 0.000 description 6
- 102200133015 KCNE2 T8A Human genes 0.000 description 6
- 108020004999 Messenger RNA Proteins 0.000 description 6
- 108010076818 TEV protease Proteins 0.000 description 6
- OFVLGDICTFRJMM-WESIUVDSSA-N Tetracycline Chemical compound C1=CC=C2[C@](O)(C)[C@H]3C[C@H]4[C@H](N(C)C)C(O)=C(C(N)=O)C(=O)[C@@]4(O)C(O)=C3C(=O)C2=C1O OFVLGDICTFRJMM-WESIUVDSSA-N 0.000 description 6
- 229960004799 Tryptophan Drugs 0.000 description 6
- 230000001419 dependent Effects 0.000 description 6
- 239000003623 enhancer Substances 0.000 description 6
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 6
- 238000010355 genome engineering Methods 0.000 description 6
- 229940020899 hematological Enzymes Drugs 0.000 description 6
- 230000003993 interaction Effects 0.000 description 6
- 229920002106 messenger RNA Polymers 0.000 description 6
- 102000004196 processed proteins & peptides Human genes 0.000 description 6
- 108090000765 processed proteins & peptides Proteins 0.000 description 6
- 230000035897 transcription Effects 0.000 description 6
- VHJLVAABSRFDPM-UHFFFAOYSA-N 1,4-dimercaptobutane-2,3-diol Chemical compound SCC(O)C(O)CS VHJLVAABSRFDPM-UHFFFAOYSA-N 0.000 description 5
- 241000206602 Eukaryota Species 0.000 description 5
- SBUJHOSQTJFQJX-NOAMYHISSA-N Kanamycin Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 description 5
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 5
- 241001465754 Metazoa Species 0.000 description 5
- 238000003776 cleavage reaction Methods 0.000 description 5
- 238000010367 cloning Methods 0.000 description 5
- 108060001658 clpS Proteins 0.000 description 5
- 230000000593 degrading Effects 0.000 description 5
- 125000000524 functional group Chemical group 0.000 description 5
- 229960000318 kanamycin Drugs 0.000 description 5
- 239000000463 material Substances 0.000 description 5
- 238000002703 mutagenesis Methods 0.000 description 5
- 231100000350 mutagenesis Toxicity 0.000 description 5
- 239000011780 sodium chloride Substances 0.000 description 5
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 5
- 229960001230 Asparagine Drugs 0.000 description 4
- 229940009098 Aspartate Drugs 0.000 description 4
- 241000222120 Candida <Saccharomycetales> Species 0.000 description 4
- 108010035563 Chloramphenicol O-Acetyltransferase Proteins 0.000 description 4
- 241000701022 Cytomegalovirus Species 0.000 description 4
- CKLJMWTZIZZHCS-UHFFFAOYSA-N DL-aspartic acid Chemical compound OC(=O)C(N)CC(O)=O CKLJMWTZIZZHCS-UHFFFAOYSA-N 0.000 description 4
- 241000196324 Embryophyta Species 0.000 description 4
- 241000194033 Enterococcus Species 0.000 description 4
- 229960002743 Glutamine Drugs 0.000 description 4
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 4
- 239000004472 Lysine Substances 0.000 description 4
- TWRXJAOTZQYOKJ-UHFFFAOYSA-L MgCl2 Chemical compound [Mg+2].[Cl-].[Cl-] TWRXJAOTZQYOKJ-UHFFFAOYSA-L 0.000 description 4
- 241000186359 Mycobacterium Species 0.000 description 4
- 102200155471 PTPN11 D61V Human genes 0.000 description 4
- 241001135223 Prevotella melaninogenica Species 0.000 description 4
- 102220396158 RYR1 K76R Human genes 0.000 description 4
- 241000235070 Saccharomyces Species 0.000 description 4
- 108091022064 Tyrosine-tRNA ligases Proteins 0.000 description 4
- 230000004913 activation Effects 0.000 description 4
- 238000004458 analytical method Methods 0.000 description 4
- 235000009582 asparagine Nutrition 0.000 description 4
- 230000003115 biocidal Effects 0.000 description 4
- 230000037348 biosynthesis Effects 0.000 description 4
- 239000004202 carbamide Substances 0.000 description 4
- 108010041758 cleavase Proteins 0.000 description 4
- 238000001943 fluorescence-activated cell sorting Methods 0.000 description 4
- 108020001507 fusion proteins Proteins 0.000 description 4
- 102000037240 fusion proteins Human genes 0.000 description 4
- 235000003869 genetically modified organisms (GMOs) Nutrition 0.000 description 4
- 235000004554 glutamine Nutrition 0.000 description 4
- 210000004962 mammalian cells Anatomy 0.000 description 4
- 238000005259 measurement Methods 0.000 description 4
- 244000005700 microbiome Species 0.000 description 4
- 239000011347 resin Substances 0.000 description 4
- 229920005989 resin Polymers 0.000 description 4
- 239000002689 soil Substances 0.000 description 4
- 239000000243 solution Substances 0.000 description 4
- 230000001629 suppression Effects 0.000 description 4
- 230000002103 transcriptional Effects 0.000 description 4
- 230000001131 transforming Effects 0.000 description 4
- 101710004187 At1g28350 Proteins 0.000 description 3
- 101710004185 At2g33840 Proteins 0.000 description 3
- 241000606108 Bartonella quintana Species 0.000 description 3
- 241000722885 Brettanomyces Species 0.000 description 3
- 241001647378 Chlamydia psittaci Species 0.000 description 3
- 241000606153 Chlamydia trachomatis Species 0.000 description 3
- 210000000349 Chromosomes Anatomy 0.000 description 3
- 241000193403 Clostridium Species 0.000 description 3
- 229960000310 ISOLEUCINE Drugs 0.000 description 3
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 3
- XUJNEKJLAYXESH-REOHCLBHSA-N L-cysteine Chemical compound SC[C@H](N)C(O)=O XUJNEKJLAYXESH-REOHCLBHSA-N 0.000 description 3
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical compound OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 description 3
- ZDXPYRJPNDTMRX-VKHMYHEASA-N L-glutamine Chemical compound OC(=O)[C@@H](N)CCC(N)=O ZDXPYRJPNDTMRX-VKHMYHEASA-N 0.000 description 3
- AGPKZVBTJJNPAG-WHFBIAKZSA-N L-isoleucine Chemical compound CC[C@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-WHFBIAKZSA-N 0.000 description 3
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 3
- ONIBWKKTOPOVIA-BYPYZUCNSA-N L-proline Chemical compound OC(=O)[C@@H]1CCCN1 ONIBWKKTOPOVIA-BYPYZUCNSA-N 0.000 description 3
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 3
- 241000204031 Mycoplasma Species 0.000 description 3
- 229920002649 Nonsense suppressor Polymers 0.000 description 3
- 229920000272 Oligonucleotide Polymers 0.000 description 3
- 241000714474 Rous sarcoma virus Species 0.000 description 3
- 241000194017 Streptococcus Species 0.000 description 3
- 244000057717 Streptococcus lactis Species 0.000 description 3
- 235000014897 Streptococcus lactis Nutrition 0.000 description 3
- 241000223230 Trichosporon Species 0.000 description 3
- 102000002501 Tryptophan-tRNA Ligase Human genes 0.000 description 3
- 108091000047 Tryptophan-tRNA Ligase Proteins 0.000 description 3
- 102000006275 Ubiquitin-Protein Ligases Human genes 0.000 description 3
- 108010083111 Ubiquitin-Protein Ligases Proteins 0.000 description 3
- 241000607598 Vibrio Species 0.000 description 3
- 241000700605 Viruses Species 0.000 description 3
- 101710019287 YARS1 Proteins 0.000 description 3
- 101710019284 YARS2 Proteins 0.000 description 3
- 235000004279 alanine Nutrition 0.000 description 3
- UIIMBOGNXHQVGW-UHFFFAOYSA-M buffer Substances [Na+].OC([O-])=O UIIMBOGNXHQVGW-UHFFFAOYSA-M 0.000 description 3
- 230000002759 chromosomal Effects 0.000 description 3
- 101700020699 clpP1 Proteins 0.000 description 3
- 235000018417 cysteine Nutrition 0.000 description 3
- 238000010790 dilution Methods 0.000 description 3
- 238000009510 drug design Methods 0.000 description 3
- 238000011156 evaluation Methods 0.000 description 3
- BDAGIHXWWSANSR-UHFFFAOYSA-N formic acid Chemical compound OC=O BDAGIHXWWSANSR-UHFFFAOYSA-N 0.000 description 3
- 238000006062 fragmentation reaction Methods 0.000 description 3
- 230000002538 fungal Effects 0.000 description 3
- 230000004927 fusion Effects 0.000 description 3
- PEDCQBHIVMGVHV-UHFFFAOYSA-N glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 3
- BPHPUYQFMNQIOC-NXRLNHOXSA-N isopropyl β-D-thiogalactopyranoside Chemical compound CC(C)S[C@@H]1O[C@H](CO)[C@H](O)[C@H](O)[C@H]1O BPHPUYQFMNQIOC-NXRLNHOXSA-N 0.000 description 3
- 239000000203 mixture Substances 0.000 description 3
- 230000004048 modification Effects 0.000 description 3
- 238000006011 modification reaction Methods 0.000 description 3
- 230000000051 modifying Effects 0.000 description 3
- RZVAJINKPMORJF-UHFFFAOYSA-N p-acetaminophenol Chemical compound CC(=O)NC1=CC=C(O)C=C1 RZVAJINKPMORJF-UHFFFAOYSA-N 0.000 description 3
- 239000008188 pellet Substances 0.000 description 3
- 108060002324 polC Proteins 0.000 description 3
- 238000003259 recombinant expression Methods 0.000 description 3
- 102220067346 rs138630815 Human genes 0.000 description 3
- HEMHJVSKTPXQMS-UHFFFAOYSA-M sodium hydroxide Chemical compound [OH-].[Na+] HEMHJVSKTPXQMS-UHFFFAOYSA-M 0.000 description 3
- 239000006228 supernatant Substances 0.000 description 3
- 238000003786 synthesis reaction Methods 0.000 description 3
- 230000002194 synthesizing Effects 0.000 description 3
- 239000004474 valine Substances 0.000 description 3
- 229960004295 valine Drugs 0.000 description 3
- 230000003612 virological Effects 0.000 description 3
- RWLSBXBFZHDHHX-VIFPVBQESA-N (2S)-2-(naphthalen-2-ylamino)propanoic acid Chemical compound C1=CC=CC2=CC(N[C@@H](C)C(O)=O)=CC=C21 RWLSBXBFZHDHHX-VIFPVBQESA-N 0.000 description 2
- PEMUHKUIQHFMTH-QMMMGPOBSA-N (2S)-2-amino-3-(4-bromophenyl)propanoic acid Chemical compound OC(=O)[C@@H](N)CC1=CC=C(Br)C=C1 PEMUHKUIQHFMTH-QMMMGPOBSA-N 0.000 description 2
- NEMHIKRLROONTL-QMMMGPOBSA-N (2S)-2-azaniumyl-3-(4-azidophenyl)propanoate Chemical compound OC(=O)[C@@H](N)CC1=CC=C(N=[N+]=[N-])C=C1 NEMHIKRLROONTL-QMMMGPOBSA-N 0.000 description 2
- GTVVZTAFGPQSPC-QMMMGPOBSA-N (2S)-2-azaniumyl-3-(4-nitrophenyl)propanoate Chemical compound OC(=O)[C@@H](N)CC1=CC=C([N+]([O-])=O)C=C1 GTVVZTAFGPQSPC-QMMMGPOBSA-N 0.000 description 2
- SNZIFNXFAFKRKT-NSHDSACASA-N (2S)-2-azaniumyl-3-[4-[(2-methylpropan-2-yl)oxy]phenyl]propanoate Chemical compound CC(C)(C)OC1=CC=C(C[C@H]([NH3+])C([O-])=O)C=C1 SNZIFNXFAFKRKT-NSHDSACASA-N 0.000 description 2
- ZXSBHXZKWRIEIA-JTQLQIEISA-N (2S)-3-(4-acetylphenyl)-2-azaniumylpropanoate Chemical compound CC(=O)C1=CC=C(C[C@H](N)C(O)=O)C=C1 ZXSBHXZKWRIEIA-JTQLQIEISA-N 0.000 description 2
- DGVVWUTYPXICAM-UHFFFAOYSA-N 2-mercaptoethanol Chemical compound OCCS DGVVWUTYPXICAM-UHFFFAOYSA-N 0.000 description 2
- UQTZMGFTRHFAAM-ZETCQYMHSA-N 3-Iodotyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C(I)=C1 UQTZMGFTRHFAAM-ZETCQYMHSA-N 0.000 description 2
- OFQBYHLLIJGMNP-UHFFFAOYSA-N 3-ethoxy-2-hydroxybenzaldehyde Chemical compound CCOC1=CC=CC(C=O)=C1O OFQBYHLLIJGMNP-UHFFFAOYSA-N 0.000 description 2
- WTDRDQBEARUVNC-LURJTMIESA-N 3-hydroxy-L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C(O)=C1 WTDRDQBEARUVNC-LURJTMIESA-N 0.000 description 2
- 241000432074 Adeno-associated virus Species 0.000 description 2
- ZKHQWZAMYRWXGA-KQYNXXCUSA-N Adenosine triphosphate Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)[C@@H](O)[C@H]1O ZKHQWZAMYRWXGA-KQYNXXCUSA-N 0.000 description 2
- 241000589155 Agrobacterium tumefaciens Species 0.000 description 2
- 229920002287 Amplicon Polymers 0.000 description 2
- 239000004475 Arginine Substances 0.000 description 2
- 241000193830 Bacillus <bacterium> Species 0.000 description 2
- 241000194107 Bacillus megaterium Species 0.000 description 2
- 235000014469 Bacillus subtilis Nutrition 0.000 description 2
- 229940075615 Bacillus subtilis Drugs 0.000 description 2
- 240000008371 Bacillus subtilis Species 0.000 description 2
- 241000606125 Bacteroides Species 0.000 description 2
- 241000606660 Bartonella Species 0.000 description 2
- 241000221198 Basidiomycota Species 0.000 description 2
- 241000588832 Bordetella pertussis Species 0.000 description 2
- 229940041514 Candida albicans extract Drugs 0.000 description 2
- 241001647372 Chlamydia pneumoniae Species 0.000 description 2
- 229940038705 Chlamydia trachomatis Drugs 0.000 description 2
- 241000193163 Clostridioides difficile Species 0.000 description 2
- 241000186227 Corynebacterium diphtheriae Species 0.000 description 2
- OUYCCCASQSFEME-MRVPVSSYSA-N D-tyrosine Chemical compound OC(=O)[C@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-MRVPVSSYSA-N 0.000 description 2
- 241001468179 Enterococcus avium Species 0.000 description 2
- 241000194032 Enterococcus faecalis Species 0.000 description 2
- 241000588722 Escherichia Species 0.000 description 2
- 241000605986 Fusobacterium nucleatum Species 0.000 description 2
- 102100011343 GLB1 Human genes 0.000 description 2
- 241000207201 Gardnerella vaginalis Species 0.000 description 2
- 229940049906 Glutamate Drugs 0.000 description 2
- 108010070675 Glutathione Transferase family Proteins 0.000 description 2
- JKMHFZQWWAIEOD-UHFFFAOYSA-N HEPES Chemical compound OCC[NH+]1CCN(CCS([O-])(=O)=O)CC1 JKMHFZQWWAIEOD-UHFFFAOYSA-N 0.000 description 2
- 239000007995 HEPES buffer Substances 0.000 description 2
- 102100020039 HPGDS Human genes 0.000 description 2
- 241000590002 Helicobacter pylori Species 0.000 description 2
- 241000238631 Hexapoda Species 0.000 description 2
- 108010001336 Horseradish Peroxidase Proteins 0.000 description 2
- PZNQZSRPDOEBMS-QMMMGPOBSA-N Iodo-Phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(I)C=C1 PZNQZSRPDOEBMS-QMMMGPOBSA-N 0.000 description 2
- 241000235649 Kluyveromyces Species 0.000 description 2
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 2
- 150000008575 L-amino acids Chemical class 0.000 description 2
- ODKSFYDXXFIFQN-BYPYZUCNSA-P L-argininium(2+) Chemical compound NC(=[NH2+])NCCC[C@H]([NH3+])C(O)=O ODKSFYDXXFIFQN-BYPYZUCNSA-P 0.000 description 2
- HNDVDQJCIGZPNO-YFKPBYRVSA-N L-histidine Chemical compound OC(=O)[C@@H](N)CC1=CN=CN1 HNDVDQJCIGZPNO-YFKPBYRVSA-N 0.000 description 2
- KDXKERNSBIXSRK-YFKPBYRVSA-N L-lysine Chemical compound NCCCC[C@H](N)C(O)=O KDXKERNSBIXSRK-YFKPBYRVSA-N 0.000 description 2
- 241000186660 Lactobacillus Species 0.000 description 2
- 229940039696 Lactobacillus Drugs 0.000 description 2
- 229940039695 Lactobacillus acidophilus Drugs 0.000 description 2
- 240000001046 Lactobacillus acidophilus Species 0.000 description 2
- 235000013956 Lactobacillus acidophilus Nutrition 0.000 description 2
- 101710029649 MDV043 Proteins 0.000 description 2
- 241000205276 Methanosarcina Species 0.000 description 2
- 125000000729 N-terminal amino-acid group Chemical group 0.000 description 2
- 210000000282 Nails Anatomy 0.000 description 2
- 241000588652 Neisseria gonorrhoeae Species 0.000 description 2
- 102000004245 Proteasome Endopeptidase Complex Human genes 0.000 description 2
- 108090000708 Proteasome Endopeptidase Complex Proteins 0.000 description 2
- 108010026552 Proteome Proteins 0.000 description 2
- 229940055023 Pseudomonas aeruginosa Drugs 0.000 description 2
- 241000589517 Pseudomonas aeruginosa Species 0.000 description 2
- ZFOMKMMPBOQKMC-KXUCPTDWSA-N Pyrrolysine Chemical compound C[C@@H]1CC=N[C@H]1C(=O)NCCCC[C@H]([NH3+])C([O-])=O ZFOMKMMPBOQKMC-KXUCPTDWSA-N 0.000 description 2
- 101700054624 RF1 Proteins 0.000 description 2
- 210000003705 Ribosomes Anatomy 0.000 description 2
- 229940081973 S-Adenosylmethionine Drugs 0.000 description 2
- MEFKEPWMEQBLKI-AIRLBKTGSA-O S-adenosyl-L-methionine zwitterion Chemical compound O[C@@H]1[C@H](O)[C@@H](C[S+](CC[C@H]([NH3+])C([O-])=O)C)O[C@H]1N1C2=NC=NC(N)=C2N=C1 MEFKEPWMEQBLKI-AIRLBKTGSA-O 0.000 description 2
- 101700060380 SINA Proteins 0.000 description 2
- 101710043164 Segment-4 Proteins 0.000 description 2
- DBMJMQXJHONAFJ-UHFFFAOYSA-M Sodium laurylsulphate Chemical compound [Na+].CCCCCCCCCCCCOS([O-])(=O)=O DBMJMQXJHONAFJ-UHFFFAOYSA-M 0.000 description 2
- ATHGHQPFGPMSJY-UHFFFAOYSA-N Spermidine Chemical compound NCCCCNCCCN ATHGHQPFGPMSJY-UHFFFAOYSA-N 0.000 description 2
- 241000589884 Treponema pallidum Species 0.000 description 2
- 101700060715 USP1 Proteins 0.000 description 2
- 101700038759 VP1 Proteins 0.000 description 2
- 241000607626 Vibrio cholerae Species 0.000 description 2
- 241000607265 Vibrio vulnificus Species 0.000 description 2
- 238000009825 accumulation Methods 0.000 description 2
- 229960001570 ademetionine Drugs 0.000 description 2
- 229920002892 amber Polymers 0.000 description 2
- 230000003321 amplification Effects 0.000 description 2
- 239000003242 anti bacterial agent Substances 0.000 description 2
- 235000009697 arginine Nutrition 0.000 description 2
- 244000052616 bacterial pathogens Species 0.000 description 2
- 108010003152 bacteriophage T7 RNA polymerase Proteins 0.000 description 2
- 108010005774 beta-Galactosidase Proteins 0.000 description 2
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 2
- 108091005941 blue fluorescent protein Proteins 0.000 description 2
- 230000021615 conjugation Effects 0.000 description 2
- 239000000356 contaminant Substances 0.000 description 2
- 238000010192 crystallographic characterization Methods 0.000 description 2
- 108091005944 cyan fluorescent protein Proteins 0.000 description 2
- 230000009089 cytolysis Effects 0.000 description 2
- 230000003247 decreasing Effects 0.000 description 2
- 230000002950 deficient Effects 0.000 description 2
- 230000003831 deregulation Effects 0.000 description 2
- 230000001687 destabilization Effects 0.000 description 2
- 238000001514 detection method Methods 0.000 description 2
- KCXVZYZYPLLWCC-UHFFFAOYSA-N edta Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 2
- 125000001495 ethyl group Chemical group [H]C([H])([H])C([H])([H])* 0.000 description 2
- 230000005284 excitation Effects 0.000 description 2
- 238000000684 flow cytometry Methods 0.000 description 2
- 238000010230 functional analysis Methods 0.000 description 2
- 230000005017 genetic modification Effects 0.000 description 2
- 235000013617 genetically modified food Nutrition 0.000 description 2
- 230000001890 gluconeogenic Effects 0.000 description 2
- DHMQDGOQFOQNFH-UHFFFAOYSA-N glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 2
- 239000001963 growth media Substances 0.000 description 2
- 101700005460 hemA Proteins 0.000 description 2
- 239000000185 hemagglutinin Substances 0.000 description 2
- 230000002209 hydrophobic Effects 0.000 description 2
- 239000002054 inoculum Substances 0.000 description 2
- 230000002934 lysing Effects 0.000 description 2
- 229910001629 magnesium chloride Inorganic materials 0.000 description 2
- 239000003550 marker Substances 0.000 description 2
- 239000006151 minimal media Substances 0.000 description 2
- 238000003199 nucleic acid amplification method Methods 0.000 description 2
- 230000002018 overexpression Effects 0.000 description 2
- 230000001323 posttranslational Effects 0.000 description 2
- 238000001556 precipitation Methods 0.000 description 2
- OZAIFHULBGXAKX-UHFFFAOYSA-N precursor Substances N#CC(C)(C)N=NC(C)(C)C#N OZAIFHULBGXAKX-UHFFFAOYSA-N 0.000 description 2
- 239000000047 product Substances 0.000 description 2
- 230000001915 proofreading Effects 0.000 description 2
- 230000009504 protein deubiquitination Effects 0.000 description 2
- 230000017854 proteolysis Effects 0.000 description 2
- 238000000746 purification Methods 0.000 description 2
- 239000011541 reaction mixture Substances 0.000 description 2
- 230000008929 regeneration Effects 0.000 description 2
- 238000011069 regeneration method Methods 0.000 description 2
- 238000011160 research Methods 0.000 description 2
- 239000011734 sodium Substances 0.000 description 2
- VMHLLURERBWHNL-UHFFFAOYSA-M sodium acetate Chemical compound [Na+].CC([O-])=O VMHLLURERBWHNL-UHFFFAOYSA-M 0.000 description 2
- 239000001632 sodium acetate Substances 0.000 description 2
- 235000017281 sodium acetate Nutrition 0.000 description 2
- 238000000527 sonication Methods 0.000 description 2
- 238000001228 spectrum Methods 0.000 description 2
- 230000002269 spontaneous Effects 0.000 description 2
- 230000000087 stabilizing Effects 0.000 description 2
- 239000000126 substance Substances 0.000 description 2
- 108091005939 superfolder green fluorescent protein Proteins 0.000 description 2
- 210000001519 tissues Anatomy 0.000 description 2
- 239000011573 trace mineral Substances 0.000 description 2
- 235000013619 trace mineral Nutrition 0.000 description 2
- 238000010361 transduction Methods 0.000 description 2
- 230000026683 transduction Effects 0.000 description 2
- 238000001890 transfection Methods 0.000 description 2
- 239000012137 tryptone Substances 0.000 description 2
- 125000005454 tryptophanyl group Chemical group 0.000 description 2
- 241000701161 unidentified adenovirus Species 0.000 description 2
- 241001430294 unidentified retrovirus Species 0.000 description 2
- 239000012138 yeast extract Substances 0.000 description 2
- 108091005946 yellow fluorescent protein Proteins 0.000 description 2
- QDYOZKNUWCMVJZ-UWTATZPHSA-N (1R)-1-amino-2-sulfanylethanesulfinic acid Chemical compound SC[C@H](N)S(O)=O QDYOZKNUWCMVJZ-UWTATZPHSA-N 0.000 description 1
- CRTOKRWMAPBEKF-AWEZNQCLSA-N (2S)-2-(benzylamino)-3-(4-hydroxy-2-nitrophenyl)propanoic acid Chemical compound C([C@@H](C(=O)O)NCC=1C=CC=CC=1)C1=CC=C(O)C=C1[N+]([O-])=O CRTOKRWMAPBEKF-AWEZNQCLSA-N 0.000 description 1
- ONLQWUTVKUBXQR-QMMMGPOBSA-N (2S)-2-[(4,5-dimethoxy-2-nitrophenyl)methylamino]-3-hydroxypropanoic acid Chemical compound COC1=CC(CN[C@@H](CO)C(O)=O)=C([N+]([O-])=O)C=C1OC ONLQWUTVKUBXQR-QMMMGPOBSA-N 0.000 description 1
- BAAVRTJSLCSMNM-CMOCDZPBSA-N (2S)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-amino-3-(4-hydroxyphenyl)propanoyl]amino]-4-carboxybutanoyl]amino]-3-(4-hydroxyphenyl)propanoyl]amino]pentanedioic acid Chemical compound C([C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CCC(O)=O)C(O)=O)C1=CC=C(O)C=C1 BAAVRTJSLCSMNM-CMOCDZPBSA-N 0.000 description 1
- BABTYIKKTLTNRX-QMMMGPOBSA-N (2S)-2-amino-3-(3-iodophenyl)propanoic acid Chemical compound OC(=O)[C@@H](N)CC1=CC=CC(I)=C1 BABTYIKKTLTNRX-QMMMGPOBSA-N 0.000 description 1
- NFIVJOSXJDORSP-QMMMGPOBSA-N (2S)-2-amino-3-(4-boronophenyl)propanoic acid Chemical compound OC(=O)[C@@H](N)CC1=CC=C(B(O)O)C=C1 NFIVJOSXJDORSP-QMMMGPOBSA-N 0.000 description 1
- NIGWMJHCCYYCSF-QMMMGPOBSA-N (2S)-2-amino-3-(4-chlorophenyl)propanoic acid Chemical compound OC(=O)[C@@H](N)CC1=CC=C(Cl)C=C1 NIGWMJHCCYYCSF-QMMMGPOBSA-N 0.000 description 1
- KWIPUXXIFQQMKN-VIFPVBQESA-N (2S)-2-amino-3-(4-cyanophenyl)propanoic acid Chemical compound OC(=O)[C@@H](N)CC1=CC=C(C#N)C=C1 KWIPUXXIFQQMKN-VIFPVBQESA-N 0.000 description 1
- JSXMFBNJRFXRCX-NSHDSACASA-N (2S)-2-amino-3-(4-prop-2-ynoxyphenyl)propanoic acid Chemical compound OC(=O)[C@@H](N)CC1=CC=C(OCC#C)C=C1 JSXMFBNJRFXRCX-NSHDSACASA-N 0.000 description 1
- CYHRSNOITZHLJN-NSHDSACASA-N (2S)-2-azaniumyl-3-(4-propan-2-ylphenyl)propanoate Chemical compound CC(C)C1=CC=C(C[C@H](N)C(O)=O)C=C1 CYHRSNOITZHLJN-NSHDSACASA-N 0.000 description 1
- KDZOASGQNOPSCU-WDSKDSINSA-N (2S)-2-{1-[(4S)-4-amino-4-carboxybutyl]carbamimidamido}butanedioic acid Chemical compound OC(=O)[C@@H](N)CCC\N=C(/N)N[C@H](C(O)=O)CC(O)=O KDZOASGQNOPSCU-WDSKDSINSA-N 0.000 description 1
- IBCKYXVMEMSMQM-JTQLQIEISA-N (2S)-3-(3-acetylphenyl)-2-aminopropanoic acid Chemical compound CC(=O)C1=CC=CC(C[C@H](N)C(O)=O)=C1 IBCKYXVMEMSMQM-JTQLQIEISA-N 0.000 description 1
- HFDZHKBVRYIMOG-QMMMGPOBSA-N (2S)-3-(4-hydroxyphenyl)-2-(sulfoamino)propanoic acid Chemical compound OS(=O)(=O)N[C@H](C(=O)O)CC1=CC=C(O)C=C1 HFDZHKBVRYIMOG-QMMMGPOBSA-N 0.000 description 1
- QHGDJQUCSGUYMF-QMMMGPOBSA-N (2S)-3-hydroxy-2-[(2-nitrophenyl)methylamino]propanoic acid Chemical compound OC[C@@H](C(O)=O)NCC1=CC=CC=C1[N+]([O-])=O QHGDJQUCSGUYMF-QMMMGPOBSA-N 0.000 description 1
- DQUHYEDEGRNAFO-QMMMGPOBSA-N (2S)-6-amino-2-[(2-methylpropan-2-yl)oxycarbonylamino]hexanoic acid Chemical compound CC(C)(C)OC(=O)N[C@H](C(O)=O)CCCCN DQUHYEDEGRNAFO-QMMMGPOBSA-N 0.000 description 1
- YZJSUQQZGCHHNQ-BYPYZUCNSA-N (2S)-6-amino-2-azaniumyl-6-oxohexanoate Chemical compound OC(=O)[C@@H](N)CCCC(N)=O YZJSUQQZGCHHNQ-BYPYZUCNSA-N 0.000 description 1
- 229920000160 (ribonucleotides)n+m Polymers 0.000 description 1
- PCDQPRRSZKQHHS-XVFCMESISA-N ({[({[(2R,3S,4R,5R)-5-(4-amino-2-oxo-1,2-dihydropyrimidin-1-yl)-3,4-dihydroxyoxolan-2-yl]methoxy}(hydroxy)phosphoryl)oxy](hydroxy)phosphoryl}oxy)phosphonic acid Chemical compound O=C1N=C(N)C=CN1[C@H]1[C@H](O)[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 PCDQPRRSZKQHHS-XVFCMESISA-N 0.000 description 1
- PAJPWUMXBYXFCZ-UHFFFAOYSA-N 1-Aminocyclopropane-1-carboxylic acid Chemical compound OC(=O)C1(N)CC1 PAJPWUMXBYXFCZ-UHFFFAOYSA-N 0.000 description 1
- BCOSEZGCLGPUSL-UHFFFAOYSA-N 2,3,3-trichloroprop-2-enoyl chloride Chemical compound ClC(Cl)=C(Cl)C(Cl)=O BCOSEZGCLGPUSL-UHFFFAOYSA-N 0.000 description 1
- JTTIOYHBNXDJOD-UHFFFAOYSA-N 2,4,6-triaminopyrimidine Chemical compound NC1=CC(N)=NC(N)=N1 JTTIOYHBNXDJOD-UHFFFAOYSA-N 0.000 description 1
- OMGHIGVFLOPEHJ-UHFFFAOYSA-N 2,5-dihydro-1H-pyrrol-1-ium-2-carboxylate Chemical compound OC(=O)C1NCC=C1 OMGHIGVFLOPEHJ-UHFFFAOYSA-N 0.000 description 1
- VYCNOBNEBXGHKT-UHFFFAOYSA-N 2-(2-methylhydrazinyl)acetic acid Chemical compound CNNCC(O)=O VYCNOBNEBXGHKT-UHFFFAOYSA-N 0.000 description 1
- CLTMYNWFSDZKKI-UHFFFAOYSA-N 2-(aminomethyl)benzoic acid Chemical compound NCC1=CC=CC=C1C(O)=O CLTMYNWFSDZKKI-UHFFFAOYSA-N 0.000 description 1
- CQJAWZCYNRBZDL-UHFFFAOYSA-N 2-(methylazaniumyl)butanoate Chemical compound CCC(NC)C(O)=O CQJAWZCYNRBZDL-UHFFFAOYSA-N 0.000 description 1
- JCIYZTBXUJCAMW-UHFFFAOYSA-N 2-[[5-(dimethylamino)naphthalen-1-yl]sulfonylamino]propanoic acid Chemical compound C1=CC=C2C(S(=O)(=O)NC(C)C(O)=O)=CC=CC2=C1N(C)C JCIYZTBXUJCAMW-UHFFFAOYSA-N 0.000 description 1
- LENZDBCJOHFCAS-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol Chemical compound OCC(N)(CO)CO LENZDBCJOHFCAS-UHFFFAOYSA-N 0.000 description 1
- JINGUCXQUOKWKH-UHFFFAOYSA-N 2-aminodecanoic acid Chemical compound CCCCCCCCC(N)C(O)=O JINGUCXQUOKWKH-UHFFFAOYSA-N 0.000 description 1
- AKVBCGQVQXPRLD-UHFFFAOYSA-N 2-aminooctanoic acid Chemical compound CCCCCCC(N)C(O)=O AKVBCGQVQXPRLD-UHFFFAOYSA-N 0.000 description 1
- SDZGVFSSLGTJAJ-UHFFFAOYSA-N 2-azaniumyl-3-(2-nitrophenyl)propanoate Chemical compound OC(=O)C(N)CC1=CC=CC=C1[N+]([O-])=O SDZGVFSSLGTJAJ-UHFFFAOYSA-N 0.000 description 1
- JPZXHKDZASGCLU-UHFFFAOYSA-N 2-azaniumyl-3-naphthalen-2-ylpropanoate Chemical compound C1=CC=CC2=CC(CC(N)C(O)=O)=CC=C21 JPZXHKDZASGCLU-UHFFFAOYSA-N 0.000 description 1
- JVPFOKXICYJJSC-UHFFFAOYSA-N 2-azaniumylnonanoate Chemical compound CCCCCCCC(N)C(O)=O JVPFOKXICYJJSC-UHFFFAOYSA-N 0.000 description 1
- POGSZHUEECCEAP-UHFFFAOYSA-N 3-(3-amino-4-hydroxyphenyl)-2-azaniumylpropanoate Chemical compound OC(=O)C(N)CC1=CC=C(O)C(N)=C1 POGSZHUEECCEAP-UHFFFAOYSA-N 0.000 description 1
- JVGVDSSUAVXRDY-UHFFFAOYSA-N 3-(4-hydroxyphenyl)lactic acid Chemical compound OC(=O)C(O)CC1=CC=C(O)C=C1 JVGVDSSUAVXRDY-UHFFFAOYSA-N 0.000 description 1
- GSWYUZQBLVUEPH-UHFFFAOYSA-N 3-(azaniumylmethyl)benzoate Chemical compound NCC1=CC=CC(C(O)=O)=C1 GSWYUZQBLVUEPH-UHFFFAOYSA-N 0.000 description 1
- KHABBYNLBYZCKP-UHFFFAOYSA-N 4-aminopiperidin-1-ium-4-carboxylate Chemical compound OC(=O)C1(N)CCNCC1 KHABBYNLBYZCKP-UHFFFAOYSA-N 0.000 description 1
- XWHHYOYVRVGJJY-QMMMGPOBSA-N 4-fluorophenyl-L-alanine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(F)C=C1 XWHHYOYVRVGJJY-QMMMGPOBSA-N 0.000 description 1
- XWHHYOYVRVGJJY-UHFFFAOYSA-N 4-fluorophenylalanine Chemical compound OC(=O)C(N)CC1=CC=C(F)C=C1 XWHHYOYVRVGJJY-UHFFFAOYSA-N 0.000 description 1
- 229940000681 5-Hydroxytryptophan Drugs 0.000 description 1
- HFKRAQJDTVSWNX-UHFFFAOYSA-N 5-amino-2-benzylpentanoic acid Chemical compound NCCCC(C(O)=O)CC1=CC=CC=C1 HFKRAQJDTVSWNX-UHFFFAOYSA-N 0.000 description 1
- LDCYZAJDBXYCGN-VIFPVBQESA-N 5-hydroxy-L-tryptophan zwitterion Chemical compound C1=C(O)C=C2C(C[C@H](N)C(O)=O)=CNC2=C1 LDCYZAJDBXYCGN-VIFPVBQESA-N 0.000 description 1
- YSMODUONRAFBET-UHFFFAOYSA-N 5-hydroxylysine Chemical class NCC(O)CCC(N)C(O)=O YSMODUONRAFBET-UHFFFAOYSA-N 0.000 description 1
- 229940000687 6-Aminocaproic Acid Drugs 0.000 description 1
- 102100017923 ACOT12 Human genes 0.000 description 1
- 101710008266 ACOT12 Proteins 0.000 description 1
- 101700033661 ACTB Proteins 0.000 description 1
- 102100011550 ACTB Human genes 0.000 description 1
- 101710032514 ACTI Proteins 0.000 description 1
- GYDJEQRTZSCIOI-UHFFFAOYSA-N AMCHA Chemical compound NCC1CCC(C(O)=O)CC1 GYDJEQRTZSCIOI-UHFFFAOYSA-N 0.000 description 1
- 241000589291 Acinetobacter Species 0.000 description 1
- 241000186041 Actinomyces israelii Species 0.000 description 1
- 241000589158 Agrobacterium Species 0.000 description 1
- 241000429837 Alternaria caespitosa Species 0.000 description 1
- 102000005922 Amidases Human genes 0.000 description 1
- 108020003076 Amidases Proteins 0.000 description 1
- 108020005206 Amino Acyl Transfer RNA Proteins 0.000 description 1
- SLXKOJJOQWFEFD-UHFFFAOYSA-N Aminocaproic acid Chemical compound NCCCCCC(O)=O SLXKOJJOQWFEFD-UHFFFAOYSA-N 0.000 description 1
- QCTBMLYLENLHLA-UHFFFAOYSA-N Aminomethylbenzoic acid Chemical compound NCC1=CC=C(C(O)=O)C=C1 QCTBMLYLENLHLA-UHFFFAOYSA-N 0.000 description 1
- AVKUERGKIZMTKX-NJBDSQKTSA-N Ampicillin Chemical compound C1([C@@H](N)C(=O)N[C@H]2[C@H]3SC([C@@H](N3C2=O)C(O)=O)(C)C)=CC=CC=C1 AVKUERGKIZMTKX-NJBDSQKTSA-N 0.000 description 1
- 241000605281 Anaplasma phagocytophilum Species 0.000 description 1
- 229940064005 Antibiotic throat preparations Drugs 0.000 description 1
- 229940083879 Antibiotics FOR TREATMENT OF HEMORRHOIDS AND ANAL FISSURES FOR TOPICAL USE Drugs 0.000 description 1
- 229940042052 Antibiotics for systemic use Drugs 0.000 description 1
- 108020005098 Anticodon Proteins 0.000 description 1
- 229940042786 Antitubercular Antibiotics Drugs 0.000 description 1
- 241001346367 Apiotrichum mycotoxinivorans Species 0.000 description 1
- 241000203069 Archaea Species 0.000 description 1
- 241000235349 Ascomycota Species 0.000 description 1
- 229960005261 Aspartic Acid Drugs 0.000 description 1
- 241000894009 Azorhizobium caulinodans Species 0.000 description 1
- 241000589149 Azotobacter vinelandii Species 0.000 description 1
- 229940065181 Bacillus anthracis Drugs 0.000 description 1
- 241000193738 Bacillus anthracis Species 0.000 description 1
- 229940075612 Bacillus cereus Drugs 0.000 description 1
- 241000193755 Bacillus cereus Species 0.000 description 1
- 241000194108 Bacillus licheniformis Species 0.000 description 1
- 241000194106 Bacillus mycoides Species 0.000 description 1
- 241000606124 Bacteroides fragilis Species 0.000 description 1
- 241001518086 Bartonella henselae Species 0.000 description 1
- 229940092523 Bartonella quintana Drugs 0.000 description 1
- 229940002008 Bifidobacterium bifidum Drugs 0.000 description 1
- 241000186016 Bifidobacterium bifidum Species 0.000 description 1
- OWMVSZAMULFTJU-UHFFFAOYSA-N Bis-tris methane Chemical compound OCCN(CCO)C(CO)(CO)CO OWMVSZAMULFTJU-UHFFFAOYSA-N 0.000 description 1
- 241000760366 Blastocladiomycota Species 0.000 description 1
- 210000004369 Blood Anatomy 0.000 description 1
- 210000000988 Bone and Bones Anatomy 0.000 description 1
- 241000588807 Bordetella Species 0.000 description 1
- 241000588779 Bordetella bronchiseptica Species 0.000 description 1
- 229940052491 Bordetella pertussis Drugs 0.000 description 1
- 229940097269 Borrelia burgdorferi Drugs 0.000 description 1
- 241000589969 Borreliella burgdorferi Species 0.000 description 1
- 238000009010 Bradford assay Methods 0.000 description 1
- 241001522017 Brettanomyces anomalus Species 0.000 description 1
- 241000193764 Brevibacillus brevis Species 0.000 description 1
- 229940056450 Brucella abortus Drugs 0.000 description 1
- 241000589567 Brucella abortus Species 0.000 description 1
- 229940038698 Brucella melitensis Drugs 0.000 description 1
- 241001148106 Brucella melitensis Species 0.000 description 1
- 241001148111 Brucella suis Species 0.000 description 1
- 241001453380 Burkholderia Species 0.000 description 1
- 241000589513 Burkholderia cepacia Species 0.000 description 1
- 229940074375 Burkholderia mallei Drugs 0.000 description 1
- 241000722910 Burkholderia mallei Species 0.000 description 1
- 101700023299 CATC Proteins 0.000 description 1
- 101700051323 CLPP Proteins 0.000 description 1
- 101700000979 CLPP2 Proteins 0.000 description 1
- 241000589876 Campylobacter Species 0.000 description 1
- 241000589877 Campylobacter coli Species 0.000 description 1
- 241000589874 Campylobacter fetus Species 0.000 description 1
- 241000589875 Campylobacter jejuni Species 0.000 description 1
- 229940015062 Campylobacter jejuni Drugs 0.000 description 1
- 229940095731 Candida albicans Drugs 0.000 description 1
- 241000222122 Candida albicans Species 0.000 description 1
- 241000192452 Candida blankii Species 0.000 description 1
- 244000206911 Candida holmii Species 0.000 description 1
- 235000002965 Candida holmii Nutrition 0.000 description 1
- 229940055022 Candida parapsilosis Drugs 0.000 description 1
- 241000222173 Candida parapsilosis Species 0.000 description 1
- 241000222178 Candida tropicalis Species 0.000 description 1
- 241000222157 Candida viswanathii Species 0.000 description 1
- FPPNZSSZRUTDAP-UWFZAAFLSA-N Carbenicillin Chemical compound N([C@H]1[C@H]2SC([C@@H](N2C1=O)C(O)=O)(C)C)C(=O)C(C(O)=O)C1=CC=CC=C1 FPPNZSSZRUTDAP-UWFZAAFLSA-N 0.000 description 1
- 229960003669 Carbenicillin Drugs 0.000 description 1
- 102000004091 Caspase 8 Human genes 0.000 description 1
- 108090000538 Caspase 8 Proteins 0.000 description 1
- 241000606161 Chlamydia Species 0.000 description 1
- 206010008631 Cholera Diseases 0.000 description 1
- 241001508813 Clavispora lusitaniae Species 0.000 description 1
- 241000193155 Clostridium botulinum Species 0.000 description 1
- 229940038704 Clostridium perfringens Drugs 0.000 description 1
- 241000193468 Clostridium perfringens Species 0.000 description 1
- 241000193449 Clostridium tetani Species 0.000 description 1
- 230000036883 Clp Effects 0.000 description 1
- 229920001405 Coding region Polymers 0.000 description 1
- 241000186216 Corynebacterium Species 0.000 description 1
- 241000186226 Corynebacterium glutamicum Species 0.000 description 1
- 229940118765 Coxiella burnetii Drugs 0.000 description 1
- 241000606678 Coxiella burnetii Species 0.000 description 1
- 241001527609 Cryptococcus Species 0.000 description 1
- 241001522864 Cryptococcus gattii VGI Species 0.000 description 1
- 241000223233 Cutaneotrichosporon cutaneum Species 0.000 description 1
- NILQLFBWTXNUOE-UHFFFAOYSA-N Cycloleucine Chemical compound OC(=O)C1(N)CCCC1 NILQLFBWTXNUOE-UHFFFAOYSA-N 0.000 description 1
- 150000008574 D-amino acids Chemical class 0.000 description 1
- 239000011665 D-biotin Substances 0.000 description 1
- 235000000638 D-biotin Nutrition 0.000 description 1
- COLNVLDHVKWLRT-MRVPVSSYSA-N D-phenylalanine Chemical compound OC(=O)[C@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-MRVPVSSYSA-N 0.000 description 1
- MTCFGRXMJLQNBG-UWTATZPHSA-N D-serine Chemical compound OC[C@@H](N)C(O)=O MTCFGRXMJLQNBG-UWTATZPHSA-N 0.000 description 1
- 108090000626 DNA-directed RNA polymerases Proteins 0.000 description 1
- 102000004163 DNA-directed RNA polymerases Human genes 0.000 description 1
- 241000668709 Dipterocarpus costatus Species 0.000 description 1
- 108010050301 EC 2.7.7.56 Proteins 0.000 description 1
- 101710028349 ECU11_0530 Proteins 0.000 description 1
- 241000605310 Ehrlichia chaffeensis Species 0.000 description 1
- 229940119563 Enterobacter cloacae Drugs 0.000 description 1
- 241000588697 Enterobacter cloacae Species 0.000 description 1
- 241000520130 Enterococcus durans Species 0.000 description 1
- 229940032049 Enterococcus faecalis Drugs 0.000 description 1
- 241000194031 Enterococcus faecium Species 0.000 description 1
- 241000194030 Enterococcus gallinarum Species 0.000 description 1
- 241001465321 Eremothecium Species 0.000 description 1
- 241001465328 Eremothecium gossypii Species 0.000 description 1
- 241001646716 Escherichia coli K-12 Species 0.000 description 1
- 241000660147 Escherichia coli str. K-12 substr. MG1655 Species 0.000 description 1
- 230000036826 Excretion Effects 0.000 description 1
- 229920000665 Exon Polymers 0.000 description 1
- 241000282326 Felis catus Species 0.000 description 1
- 241000589601 Francisella Species 0.000 description 1
- 241001621835 Frateuria aurantia Species 0.000 description 1
- 102100004985 GUSB Human genes 0.000 description 1
- 241000193385 Geobacillus stearothermophilus Species 0.000 description 1
- 241000159512 Geotrichum Species 0.000 description 1
- 244000168141 Geotrichum candidum Species 0.000 description 1
- 235000017388 Geotrichum candidum Nutrition 0.000 description 1
- 108010060309 Glucuronidase Proteins 0.000 description 1
- 229960002989 Glutamic Acid Drugs 0.000 description 1
- 239000004471 Glycine Substances 0.000 description 1
- 229940093922 Gynecological Antibiotics Drugs 0.000 description 1
- 101710006403 HSPA5 Proteins 0.000 description 1
- 241000606790 Haemophilus Species 0.000 description 1
- 241000606768 Haemophilus influenzae Species 0.000 description 1
- 229940047650 Haemophilus influenzae Drugs 0.000 description 1
- 241000606766 Haemophilus parainfluenzae Species 0.000 description 1
- 230000036499 Half live Effects 0.000 description 1
- 229940037467 Helicobacter pylori Drugs 0.000 description 1
- 208000006572 Human Influenza Diseases 0.000 description 1
- 206010020460 Human T-cell lymphotropic virus type I infection Diseases 0.000 description 1
- 241000714260 Human T-lymphotropic virus 1 Species 0.000 description 1
- 206010020429 Human ehrlichiosis Diseases 0.000 description 1
- 229960002591 Hydroxyproline Drugs 0.000 description 1
- 206010022000 Influenza Diseases 0.000 description 1
- 229920002459 Intron Polymers 0.000 description 1
- 239000007836 KH2PO4 Substances 0.000 description 1
- 102000011782 Keratins Human genes 0.000 description 1
- 108010076876 Keratins Proteins 0.000 description 1
- 241001534216 Klebsiella granulomatis Species 0.000 description 1
- 229940045505 Klebsiella pneumoniae Drugs 0.000 description 1
- 241000588747 Klebsiella pneumoniae Species 0.000 description 1
- 235000014663 Kluyveromyces fragilis Nutrition 0.000 description 1
- 241001138401 Kluyveromyces lactis Species 0.000 description 1
- 229940031154 Kluyveromyces marxianus Drugs 0.000 description 1
- 241000235058 Komagataella pastoris Species 0.000 description 1
- OGNSCSPNOLGXSM-VKHMYHEASA-N L-2,4-diaminobutyric acid Chemical compound NCC[C@H](N)C(O)=O OGNSCSPNOLGXSM-VKHMYHEASA-N 0.000 description 1
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 1
- QUOGESRFPZDMMT-YFKPBYRVSA-N L-homoarginine Chemical compound OC(=O)[C@@H](N)CCCCNC(N)=N QUOGESRFPZDMMT-YFKPBYRVSA-N 0.000 description 1
- FFFHZYDWPBMWHY-VKHMYHEASA-N L-homocysteine Chemical compound OC(=O)[C@@H](N)CCS FFFHZYDWPBMWHY-VKHMYHEASA-N 0.000 description 1
- UKAUYVFTDYCKQA-VKHMYHEASA-N L-homoserine zwitterion Chemical compound OC(=O)[C@@H](N)CCO UKAUYVFTDYCKQA-VKHMYHEASA-N 0.000 description 1
- AHLPHDHHMVZTML-BYPYZUCNSA-N L-ornithine Chemical compound NCCC[C@H](N)C(O)=O AHLPHDHHMVZTML-BYPYZUCNSA-N 0.000 description 1
- AYFVYJQAPQTCCC-GBXIJSLDSA-N L-threonine Chemical compound C[C@@H](O)[C@H](N)C(O)=O AYFVYJQAPQTCCC-GBXIJSLDSA-N 0.000 description 1
- XUIIKFGFIJCVMT-LBPRGKRZSA-N L-thyroxine zwitterion Chemical compound IC1=CC(C[C@H]([NH3+])C([O-])=O)=CC(I)=C1OC1=CC(I)=C(O)C(I)=C1 XUIIKFGFIJCVMT-LBPRGKRZSA-N 0.000 description 1
- 101710019436 LARS1 Proteins 0.000 description 1
- 101710019425 LARS2 Proteins 0.000 description 1
- 102100010495 LARS2 Human genes 0.000 description 1
- 241000858110 Lachancea Species 0.000 description 1
- 241000235087 Lachancea kluyveri Species 0.000 description 1
- 244000199885 Lactobacillus bulgaricus Species 0.000 description 1
- 229940004208 Lactobacillus bulgaricus Drugs 0.000 description 1
- 235000013960 Lactobacillus bulgaricus Nutrition 0.000 description 1
- 229940017800 Lactobacillus casei Drugs 0.000 description 1
- 240000004403 Lactobacillus casei Species 0.000 description 1
- 235000013958 Lactobacillus casei Nutrition 0.000 description 1
- 229940115932 Legionella pneumophila Drugs 0.000 description 1
- 241000589242 Legionella pneumophila Species 0.000 description 1
- 241000713666 Lentivirus Species 0.000 description 1
- 241000221479 Leucosporidium Species 0.000 description 1
- 229940115931 Listeria monocytogenes Drugs 0.000 description 1
- 241000186779 Listeria monocytogenes Species 0.000 description 1
- 210000004185 Liver Anatomy 0.000 description 1
- 239000005089 Luciferase Substances 0.000 description 1
- 108060001084 Luciferase family Proteins 0.000 description 1
- 239000006137 Luria-Bertani broth Substances 0.000 description 1
- 210000004698 Lymphocytes Anatomy 0.000 description 1
- 241001134775 Lysinibacillus fusiformis Species 0.000 description 1
- 101710039852 METAP1 Proteins 0.000 description 1
- 101710012506 METAP2 Proteins 0.000 description 1
- 241000202974 Methanobacterium Species 0.000 description 1
- 241000203407 Methanocaldococcus jannaschii Species 0.000 description 1
- 241000203353 Methanococcus Species 0.000 description 1
- 241000235048 Meyerozyma guilliermondii Species 0.000 description 1
- 241001467578 Microbacterium Species 0.000 description 1
- 241000191938 Micrococcus luteus Species 0.000 description 1
- 241000243190 Microsporidia Species 0.000 description 1
- GNSKLFRGEWLPPA-UHFFFAOYSA-M Monopotassium phosphate Chemical compound [K+].OP(O)([O-])=O GNSKLFRGEWLPPA-UHFFFAOYSA-M 0.000 description 1
- 241000588655 Moraxella catarrhalis Species 0.000 description 1
- 241001149965 Mrakia frigida Species 0.000 description 1
- 210000003205 Muscles Anatomy 0.000 description 1
- 241000186367 Mycobacterium avium Species 0.000 description 1
- 229940114179 Mycobacterium bovis Drugs 0.000 description 1
- 241000186366 Mycobacterium bovis Species 0.000 description 1
- 241000186362 Mycobacterium leprae Species 0.000 description 1
- 241000187480 Mycobacterium smegmatis Species 0.000 description 1
- 229940010383 Mycobacterium tuberculosis Drugs 0.000 description 1
- 241000187479 Mycobacterium tuberculosis Species 0.000 description 1
- 241000204051 Mycoplasma genitalium Species 0.000 description 1
- 241000204048 Mycoplasma hominis Species 0.000 description 1
- 241001135743 Mycoplasma penetrans Species 0.000 description 1
- 229940013390 Mycoplasma pneumoniae Drugs 0.000 description 1
- 241000202934 Mycoplasma pneumoniae Species 0.000 description 1
- CYZKJBZEIFWZSR-LURJTMIESA-N N(α)-methyl-L-histidine zwitterion Chemical compound CN[C@H](C(O)=O)CC1=CNC=N1 CYZKJBZEIFWZSR-LURJTMIESA-N 0.000 description 1
- 125000001429 N-terminal alpha-amino-acid group Chemical group 0.000 description 1
- 101700064338 NFCP Proteins 0.000 description 1
- 101700080605 NUC1 Proteins 0.000 description 1
- 241000588653 Neisseria Species 0.000 description 1
- 229940052778 Neisseria meningitidis Drugs 0.000 description 1
- 241000588650 Neisseria meningitidis Species 0.000 description 1
- 241000760367 Neocallimastigomycetes Species 0.000 description 1
- FBTSQILOGYXGMD-LURJTMIESA-N Nitrotyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C([N+]([O-])=O)=C1 FBTSQILOGYXGMD-LURJTMIESA-N 0.000 description 1
- FXTLFZWJXBBXGX-QMMMGPOBSA-N OC(=O)[C@H](C[SeH])NC1=CC=CC=C1 Chemical compound OC(=O)[C@H](C[SeH])NC1=CC=CC=C1 FXTLFZWJXBBXGX-QMMMGPOBSA-N 0.000 description 1
- 241001112159 Ogataea Species 0.000 description 1
- 241001099341 Ogataea polymorpha Species 0.000 description 1
- 229960003104 Ornithine Drugs 0.000 description 1
- 241000283973 Oryctolagus cuniculus Species 0.000 description 1
- 206010063834 Oversensing Diseases 0.000 description 1
- 102000004316 Oxidoreductases Human genes 0.000 description 1
- 108090000854 Oxidoreductases Proteins 0.000 description 1
- TVIDEEHSOPHZBR-AWEZNQCLSA-N PARA-(BENZOYL)-PHENYLALANINE Chemical compound C1=CC(C[C@H](N)C(O)=O)=CC=C1C(=O)C1=CC=CC=C1 TVIDEEHSOPHZBR-AWEZNQCLSA-N 0.000 description 1
- 102220393277 PIGBOS1 M40A Human genes 0.000 description 1
- 210000000496 Pancreas Anatomy 0.000 description 1
- 241000606860 Pasteurella Species 0.000 description 1
- 241000191992 Peptostreptococcus Species 0.000 description 1
- 102000030951 Phosphotransferases Human genes 0.000 description 1
- 108091000081 Phosphotransferases Proteins 0.000 description 1
- 241000235648 Pichia Species 0.000 description 1
- 241000235645 Pichia kudriavzevii Species 0.000 description 1
- 229920002873 Polyethylenimine Polymers 0.000 description 1
- 241000192138 Prochlorococcus Species 0.000 description 1
- 241000589516 Pseudomonas Species 0.000 description 1
- 206010037833 Rales Diseases 0.000 description 1
- 108020004511 Recombinant DNA Proteins 0.000 description 1
- 241000316848 Rhodococcus <scale insect> Species 0.000 description 1
- 241000223252 Rhodotorula Species 0.000 description 1
- 241000223254 Rhodotorula mucilaginosa Species 0.000 description 1
- 241000606701 Rickettsia Species 0.000 description 1
- 229940046939 Rickettsia prowazekii Drugs 0.000 description 1
- 241000606697 Rickettsia prowazekii Species 0.000 description 1
- 229940075118 Rickettsia rickettsii Drugs 0.000 description 1
- 241000606695 Rickettsia rickettsii Species 0.000 description 1
- 241000203719 Rothia dentocariosa Species 0.000 description 1
- 102100017879 S100A9 Human genes 0.000 description 1
- 101710023383 S100A9 Proteins 0.000 description 1
- 108060002241 SLC1A5 Proteins 0.000 description 1
- 102100012046 SLC1A5 Human genes 0.000 description 1
- 102000015329 SUMO-1 Protein Human genes 0.000 description 1
- 108010050000 SUMO-1 Protein Proteins 0.000 description 1
- 235000018370 Saccharomyces delbrueckii Nutrition 0.000 description 1
- 244000253911 Saccharomyces fragilis Species 0.000 description 1
- 235000018368 Saccharomyces fragilis Nutrition 0.000 description 1
- 241001123227 Saccharomyces pastorianus Species 0.000 description 1
- 241000607142 Salmonella Species 0.000 description 1
- 241001354013 Salmonella enterica subsp. enterica serovar Enteritidis Species 0.000 description 1
- 241000293871 Salmonella enterica subsp. enterica serovar Typhi Species 0.000 description 1
- 241000293869 Salmonella enterica subsp. enterica serovar Typhimurium Species 0.000 description 1
- 206010039447 Salmonellosis Diseases 0.000 description 1
- 241000235060 Scheffersomyces stipitis Species 0.000 description 1
- 241000235346 Schizosaccharomyces Species 0.000 description 1
- 241000235347 Schizosaccharomyces pombe Species 0.000 description 1
- 229940098362 Serratia marcescens Drugs 0.000 description 1
- 241000607715 Serratia marcescens Species 0.000 description 1
- 229940007046 Shigella dysenteriae Drugs 0.000 description 1
- 241000607764 Shigella dysenteriae Species 0.000 description 1
- 210000003491 Skin Anatomy 0.000 description 1
- 241000228389 Sporidiobolus Species 0.000 description 1
- 241000297588 Sporobolomyces koalae Species 0.000 description 1
- 241000191940 Staphylococcus Species 0.000 description 1
- 229940076185 Staphylococcus aureus Drugs 0.000 description 1
- 241000191967 Staphylococcus aureus Species 0.000 description 1
- 229940037645 Staphylococcus epidermidis Drugs 0.000 description 1
- 241000191963 Staphylococcus epidermidis Species 0.000 description 1
- 241000122971 Stenotrophomonas Species 0.000 description 1
- 241000193985 Streptococcus agalactiae Species 0.000 description 1
- 241000194043 Streptococcus criceti Species 0.000 description 1
- 241000194049 Streptococcus equinus Species 0.000 description 1
- 241000194050 Streptococcus ferus Species 0.000 description 1
- 241001134658 Streptococcus mitis Species 0.000 description 1
- 229940031008 Streptococcus mutans Drugs 0.000 description 1
- 241000194019 Streptococcus mutans Species 0.000 description 1
- 241000194025 Streptococcus oralis Species 0.000 description 1
- 229940031000 Streptococcus pneumoniae Drugs 0.000 description 1
- 241000193998 Streptococcus pneumoniae Species 0.000 description 1
- 229940076156 Streptococcus pyogenes Drugs 0.000 description 1
- 241000193996 Streptococcus pyogenes Species 0.000 description 1
- 241000194052 Streptococcus ratti Species 0.000 description 1
- 241000194023 Streptococcus sanguinis Species 0.000 description 1
- 241000193987 Streptococcus sobrinus Species 0.000 description 1
- 241001312524 Streptococcus viridans Species 0.000 description 1
- 241000187432 Streptomyces coelicolor Species 0.000 description 1
- 241000192707 Synechococcus Species 0.000 description 1
- 102100020295 TRNT1 Human genes 0.000 description 1
- 229940094937 Thioredoxin Drugs 0.000 description 1
- 239000004473 Threonine Substances 0.000 description 1
- 241000723792 Tobacco etch virus Species 0.000 description 1
- 229940024982 Topical Antifungal Antibiotics Drugs 0.000 description 1
- 241000006364 Torula Species 0.000 description 1
- 241000235006 Torulaspora Species 0.000 description 1
- 240000004702 Torulaspora delbrueckii Species 0.000 description 1
- 235000014681 Torulaspora delbrueckii Nutrition 0.000 description 1
- 102000004357 Transferases Human genes 0.000 description 1
- 108090000992 Transferases Proteins 0.000 description 1
- 241000589886 Treponema Species 0.000 description 1
- 241000589892 Treponema denticola Species 0.000 description 1
- FPKOPBFLPLFWAD-UHFFFAOYSA-N Trinitrotoluene Chemical compound CC1=CC=C([N+]([O-])=O)C([N+]([O-])=O)=C1[N+]([O-])=O FPKOPBFLPLFWAD-UHFFFAOYSA-N 0.000 description 1
- GSEJCLTVZPLZKY-UHFFFAOYSA-N Tris Chemical compound OCCN(CCO)CCO GSEJCLTVZPLZKY-UHFFFAOYSA-N 0.000 description 1
- 239000007983 Tris buffer Substances 0.000 description 1
- 108090000631 Trypsin Proteins 0.000 description 1
- 102100015855 UBP1 Human genes 0.000 description 1
- 101700006884 UBP1 Proteins 0.000 description 1
- PGAVKCOVUIYSFO-XVFCMESISA-N Uridine triphosphate Chemical compound O[C@@H]1[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O[C@H]1N1C(=O)NC(=O)C=C1 PGAVKCOVUIYSFO-XVFCMESISA-N 0.000 description 1
- 229940118696 Vibrio cholerae Drugs 0.000 description 1
- 101710019445 WARS1 Proteins 0.000 description 1
- 102100014638 WARS2 Human genes 0.000 description 1
- 101710019457 WARS2 Proteins 0.000 description 1
- 229920002533 WHP Posttranscriptional Response Element Polymers 0.000 description 1
- 101700020009 WIP2 Proteins 0.000 description 1
- 241000604961 Wolbachia Species 0.000 description 1
- 241000235013 Yarrowia Species 0.000 description 1
- 241000607734 Yersinia <bacteria> Species 0.000 description 1
- 229940098232 Yersinia enterocolitica Drugs 0.000 description 1
- 241000607447 Yersinia enterocolitica Species 0.000 description 1
- 229940118695 Yersinia pestis Drugs 0.000 description 1
- 241000607479 Yersinia pestis Species 0.000 description 1
- 241000607477 Yersinia pseudotuberculosis Species 0.000 description 1
- 241000758405 Zoopagomycotina Species 0.000 description 1
- 241000235033 Zygosaccharomyces rouxii Species 0.000 description 1
- 241000222126 [Candida] glabrata Species 0.000 description 1
- 241000192286 [Candida] stellata Species 0.000 description 1
- 241000606834 [Haemophilus] ducreyi Species 0.000 description 1
- HRPVXLWXLXDGHG-UHFFFAOYSA-N acrylamide Chemical compound NC(=O)C=C HRPVXLWXLXDGHG-UHFFFAOYSA-N 0.000 description 1
- 230000002411 adverse Effects 0.000 description 1
- 238000001261 affinity purification Methods 0.000 description 1
- 238000000246 agarose gel electrophoresis Methods 0.000 description 1
- 230000004075 alteration Effects 0.000 description 1
- 125000000539 amino acid group Chemical group 0.000 description 1
- 125000003277 amino group Chemical group 0.000 description 1
- 229960002684 aminocaproic acid Drugs 0.000 description 1
- 229960000723 ampicillin Drugs 0.000 description 1
- 210000004102 animal cell Anatomy 0.000 description 1
- 238000000137 annealing Methods 0.000 description 1
- 230000003466 anticipated Effects 0.000 description 1
- 102000004965 antibodies Human genes 0.000 description 1
- 108090001123 antibodies Proteins 0.000 description 1
- 150000003934 aromatic aldehydes Chemical class 0.000 description 1
- 125000000613 asparagine group Chemical group N[C@@H](CC(N)=O)C(=O)* 0.000 description 1
- 235000003704 aspartic acid Nutrition 0.000 description 1
- 239000010426 asphalt Substances 0.000 description 1
- 230000000712 assembly Effects 0.000 description 1
- 239000003124 biologic agent Substances 0.000 description 1
- 230000033228 biological regulation Effects 0.000 description 1
- 229960002685 biotin Drugs 0.000 description 1
- 101710023486 bipA Proteins 0.000 description 1
- 239000008280 blood Substances 0.000 description 1
- 125000001314 canonical amino-acid group Chemical group 0.000 description 1
- 125000003178 carboxy group Chemical group [H]OC(*)=O 0.000 description 1
- 239000006143 cell culture media Substances 0.000 description 1
- 230000022131 cell cycle Effects 0.000 description 1
- 230000036978 cell physiology Effects 0.000 description 1
- 230000001413 cellular Effects 0.000 description 1
- 239000001913 cellulose Substances 0.000 description 1
- 229920002678 cellulose Polymers 0.000 description 1
- 238000005119 centrifugation Methods 0.000 description 1
- 239000003153 chemical reaction reagent Substances 0.000 description 1
- 230000002860 competitive Effects 0.000 description 1
- 230000000295 complement Effects 0.000 description 1
- 150000001875 compounds Chemical class 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 238000001816 cooling Methods 0.000 description 1
- 230000001186 cumulative Effects 0.000 description 1
- 239000008367 deionised water Substances 0.000 description 1
- 230000003111 delayed Effects 0.000 description 1
- 230000029087 digestion Effects 0.000 description 1
- 238000006471 dimerization reaction Methods 0.000 description 1
- 229910000397 disodium phosphate Inorganic materials 0.000 description 1
- 238000000132 electrospray ionisation Methods 0.000 description 1
- 238000010828 elution Methods 0.000 description 1
- 239000012149 elution buffer Substances 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 230000002255 enzymatic Effects 0.000 description 1
- 238000005886 esterification reaction Methods 0.000 description 1
- 230000029142 excretion Effects 0.000 description 1
- 238000001914 filtration Methods 0.000 description 1
- 238000002866 fluorescence resonance energy transfer Methods 0.000 description 1
- 108010021843 fluorescent protein 583 Proteins 0.000 description 1
- 235000019253 formic acid Nutrition 0.000 description 1
- 239000005350 fused silica glass Substances 0.000 description 1
- 238000007429 general method Methods 0.000 description 1
- 238000010353 genetic engineering Methods 0.000 description 1
- 235000013922 glutamic acid Nutrition 0.000 description 1
- 239000004220 glutamic acid Substances 0.000 description 1
- 125000000404 glutamine group Chemical group N[C@@H](CCC(N)=O)C(=O)* 0.000 description 1
- 238000003384 imaging method Methods 0.000 description 1
- 239000012535 impurity Substances 0.000 description 1
- 230000002779 inactivation Effects 0.000 description 1
- 230000017730 intein-mediated protein splicing Effects 0.000 description 1
- 229940079866 intestinal antibiotics Drugs 0.000 description 1
- 238000005040 ion trap Methods 0.000 description 1
- 238000002955 isolation Methods 0.000 description 1
- 125000001449 isopropyl group Chemical group [H]C([H])([H])C([H])(*)C([H])([H])[H] 0.000 description 1
- 125000000468 ketone group Chemical group 0.000 description 1
- 238000002372 labelling Methods 0.000 description 1
- 101710028327 leuS Proteins 0.000 description 1
- 238000001638 lipofection Methods 0.000 description 1
- 238000001294 liquid chromatography-tandem mass spectrometry Methods 0.000 description 1
- 238000011068 load Methods 0.000 description 1
- 239000006166 lysate Substances 0.000 description 1
- 229910052943 magnesium sulfate Inorganic materials 0.000 description 1
- 230000001404 mediated Effects 0.000 description 1
- 230000011987 methylation Effects 0.000 description 1
- 238000007069 methylation reaction Methods 0.000 description 1
- CSNNHWWHGAXBCP-UHFFFAOYSA-L mgso4 Chemical compound [Mg+2].[O-][S+2]([O-])([O-])[O-] CSNNHWWHGAXBCP-UHFFFAOYSA-L 0.000 description 1
- 238000000520 microinjection Methods 0.000 description 1
- 239000003607 modifier Substances 0.000 description 1
- 238000010369 molecular cloning Methods 0.000 description 1
- 229910000402 monopotassium phosphate Inorganic materials 0.000 description 1
- 235000019796 monopotassium phosphate Nutrition 0.000 description 1
- 239000002105 nanoparticle Substances 0.000 description 1
- 239000005445 natural product Substances 0.000 description 1
- 229930014626 natural products Natural products 0.000 description 1
- 210000002569 neurons Anatomy 0.000 description 1
- 230000003000 nontoxic Effects 0.000 description 1
- 231100000252 nontoxic Toxicity 0.000 description 1
- 238000010606 normalization Methods 0.000 description 1
- 101700006494 nucA Proteins 0.000 description 1
- 239000011022 opal Substances 0.000 description 1
- 229940005935 ophthalmologic Antibiotics Drugs 0.000 description 1
- 238000005457 optimization Methods 0.000 description 1
- 210000000056 organs Anatomy 0.000 description 1
- 230000003647 oxidation Effects 0.000 description 1
- 238000007254 oxidation reaction Methods 0.000 description 1
- 238000004806 packaging method and process Methods 0.000 description 1
- CMUHFUGDYMFHEI-QMMMGPOBSA-N para-Amino-phe Chemical compound OC(=O)[C@@H](N)CC1=CC=C(N)C=C1 CMUHFUGDYMFHEI-QMMMGPOBSA-N 0.000 description 1
- 238000007747 plating Methods 0.000 description 1
- 230000001402 polyadenylating Effects 0.000 description 1
- 229920001601 polyetherimide Polymers 0.000 description 1
- 238000003752 polymerase chain reaction Methods 0.000 description 1
- 230000004481 post-translational protein modification Effects 0.000 description 1
- 230000000529 probiotic Effects 0.000 description 1
- 239000006041 probiotic Substances 0.000 description 1
- 235000018291 probiotics Nutrition 0.000 description 1
- 230000035755 proliferation Effects 0.000 description 1
- 108020001580 protein domains Proteins 0.000 description 1
- 238000001742 protein purification Methods 0.000 description 1
- 238000011002 quantification Methods 0.000 description 1
- 238000002708 random mutagenesis Methods 0.000 description 1
- 238000005215 recombination Methods 0.000 description 1
- 230000002829 reduced Effects 0.000 description 1
- 238000005067 remediation Methods 0.000 description 1
- 238000007634 remodeling Methods 0.000 description 1
- 238000009877 rendering Methods 0.000 description 1
- 230000001177 retroviral Effects 0.000 description 1
- 229920002973 ribosomal RNA Polymers 0.000 description 1
- 102220274904 rs975951740 Human genes 0.000 description 1
- 102220329886 rs975951740 Human genes 0.000 description 1
- 150000003839 salts Chemical class 0.000 description 1
- 238000007480 sanger sequencing Methods 0.000 description 1
- 238000000926 separation method Methods 0.000 description 1
- VYPSYNLAJGMNEJ-UHFFFAOYSA-N silicium dioxide Chemical compound O=[Si]=O VYPSYNLAJGMNEJ-UHFFFAOYSA-N 0.000 description 1
- 238000003530 single readout Methods 0.000 description 1
- 238000002741 site-directed mutagenesis Methods 0.000 description 1
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 1
- 229940063673 spermidine Drugs 0.000 description 1
- 230000003068 static Effects 0.000 description 1
- 230000001502 supplementation Effects 0.000 description 1
- 238000003239 susceptibility assay Methods 0.000 description 1
- 238000004885 tandem mass spectrometry Methods 0.000 description 1
- 230000029305 taxis Effects 0.000 description 1
- 230000002123 temporal effect Effects 0.000 description 1
- 125000000999 tert-butyl group Chemical group [H]C([H])([H])C(*)(C([H])([H])[H])C([H])([H])[H] 0.000 description 1
- NPDBDJFLKKQMCM-UHFFFAOYSA-N tert-butylglycine Chemical compound CC(C)(C)C(N)C(O)=O NPDBDJFLKKQMCM-UHFFFAOYSA-N 0.000 description 1
- 238000004809 thin layer chromatography Methods 0.000 description 1
- 102000002933 thioredoxin family Human genes 0.000 description 1
- 108060008226 thioredoxin family Proteins 0.000 description 1
- 230000002588 toxic Effects 0.000 description 1
- 231100000331 toxic Toxicity 0.000 description 1
- PMMYEEVYMWASQN-DMTCNVIQSA-N trans-L-hydroxyproline Chemical class O[C@H]1CN[C@H](C(O)=O)C1 PMMYEEVYMWASQN-DMTCNVIQSA-N 0.000 description 1
- 230000005030 transcription termination Effects 0.000 description 1
- 238000002054 transplantation Methods 0.000 description 1
- 150000003668 tyrosines Chemical class 0.000 description 1
- 125000001493 tyrosinyl group Chemical group [H]OC1=C([H])C([H])=C(C([H])=C1[H])C([H])([H])C([H])(N([H])[H])C(*)=O 0.000 description 1
- 108010032276 tyrosyl-glutamyl-tyrosyl-glutamic acid Proteins 0.000 description 1
- 238000004704 ultra performance liquid chromatography Methods 0.000 description 1
- 229950010342 uridine triphosphate Drugs 0.000 description 1
- 238000005406 washing Methods 0.000 description 1
- 239000003643 water by type Substances 0.000 description 1
- ZGUNAGUHMKGQNY-UHFFFAOYSA-N α-phenylglycine Chemical compound OC(=O)C(N)C1=CC=CC=C1 ZGUNAGUHMKGQNY-UHFFFAOYSA-N 0.000 description 1
- JPZXHKDZASGCLU-LBPRGKRZSA-N β-(2-Naphthyl)-Alanine Chemical compound C1=CC=CC2=CC(C[C@H](N)C(O)=O)=CC=C21 JPZXHKDZASGCLU-LBPRGKRZSA-N 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/10—Transferases (2.)
- C12N9/1025—Acyltransferases (2.3)
- C12N9/104—Aminoacyltransferases (2.3.2)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y601/00—Ligases forming carbon-oxygen bonds (6.1)
- C12Y601/01—Ligases forming aminoacyl-tRNA and related compounds (6.1.1)
- C12Y601/0102—Phenylalanine-tRNA ligase (6.1.1.20)
Abstract
This disclosure provides variants of the biphenylalanine (BipA) orthogonal translation system used for incorporation of BipA into proteins. Specifically, engineered BipA aminoacyl-tRNA synthetase (BipARS) variants and tRNA variants that improve selectivity towards BipA are described. Furthermore, this disclosure provides methods used to generate these variants.
Description
Synthetase Variants for Incorporation of Biphenylalanine into a Peptide
RELATED APPLICATION DATA
This application claims priority to U.S. Provisional Application No. 62/527,115, filed on June 30, 2017, which is hereby incorporated herein by reference in its entirety for all purposes.
STATEMENT OF GOVERNMENT INTERESTS
This invention was made with government support under DE-FG02-02ER63445 awarded by Department of Energy. The government has certain rights in the invention.
SEQUENCE LISTING
The instant application contains a Sequence Listing which has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. Said ASCII copy, created on June 26, 2018, is named 010498_01 110_WO_SL.txt and is 46,661 bytes in size.
FIELD
The present invention relates in general to synthetase and transfer RNA variants for incorporation of biphenylalanine into a polypeptide and methods of making same.
BACKGROUND
Advancements to genetic code expansion require accurate, selective, and high-throughput determination of non-standard amino acid (NSAA) incorporation into proteins. The fidelity of translation relies on the selectivity of amino acyl transfer RNA (tRNA) synthetases (AARSs), which catalyze esterification of tRNAs to their corresponding amino acids (See, M. Ibba, D. Söll, Aminoacyl-tRNAs: setting the limits of the genetic code. Genes Dev. 18, 731-8 (2004)). Orthogonal AARS/tRNA pairs, together known as OTSs, enable site-specific NSAA incorporation into proteins, most often by suppressing amber (UAG) stop codons in targeted sequences (See, C. Noren, S. Anthony-Cahill, M. Griffith, P. Schultz, A general method for site-specific incorporation of unnatural amino acids into proteins. Science (80-. ). 244 (1989) (available at world wide website science.sciencemag.org/content/244/4901/182); L. Wang, A. Brock, B. Herberich, P. G. Schultz, Expanding the genetic code of Escherichia coli. Science (80-. ). 292, 498-500 (2001)). Four primary site-specific OTS families have been developed for NSAA incorporation: Methanococcus jannaschii tyrosyl-tRNA synthetase (MjTyrRS)/tRNA^Tyr_CUA; various Methanosarcina pyrrolysyl-tRNA synthetase (PylRS)/tRNA^Pyl_CUA; Escherichia coli tyrosyl-tRNA synthetase (EcTyrRS)/tRNA^Tyr_CUA; and E. coli leucyl-tRNA synthetase (See, J. W. Chin, Expanding and Reprogramming the Genetic Code of Cells and Animals. Annu. Rev. Biochem. 83, 379-408 (2014); A. Dumas, L. Lercher, C. D. Spicer, B. G. Davis, Designing logical codon reassignment - Expanding the chemistry in biology. Chem. Sci. 6, 50-69 (2015)). Another commonly used OTS is the Saccharomyces cerevisiae tryptophanyl-tRNA synthetase (ScTrpRS)/tRNA^Trp_CUA pair (See, R. A. Hughes, A. D. Ellington, Rational design of an orthogonal tryptophanyl nonsense suppressor tRNA. Nucleic Acids Res. 38, 6813-6830 (2010); A. Chatterjee, H. Xiao, P.-Y. Yang, G. Soundararajan, P. G. Schultz, Genetic Codon Expansion A Tryptophanyl-tRNA Synthetase/tRNA Pair for Unnatural Amino Acid Mutagenesis in E. coli**, doi: 10.1002/anie.201301094; J. W. Ellefson et al., Directed evolution of genetic parts and circuits by compartmentalized partnered replication. Nat. Biotechnol. 32, 97-101 (2014)).
However, engineered OTS promiscuity for standard amino acids (SAAs) and for undesired NSAAs is a major barrier to expansion of the genetic code. The low fidelity of several OTSs is documented, revealing that even after multiple rounds of negative selection they misacylate tRNA with SAAs that their ancestral variants acted upon, such as tyrosine (Y) and tryptophan (W) (See, K. Oki, K. Sakamoto, T. Kobayashi, H. M. Sasaki, S. Yokoyama, Transplantation of a tyrosine editing domain into a tyrosyl-tRNA synthetase variant enhances its specificity for a tyrosine analog. Proc. Natl. Acad. Sci. U. S. A. 105, 13298-303 (2008); T. S. Young, I. Ahmad, J. A. Yin, P. G. Schultz, An Enhanced System for Unnatural Amino Acid Mutagenesis in E. coli. J. Mol. Biol. 395, 361-374 (2010); A. K. Antonczak et al., Importance of single molecular determinants in the fidelity of expanded genetic codes. Proc. Natl. Acad. Sci. U. S. A. 108, 1320-5 (2011); S. Nehring, N. Budisa, B. Wiltschi, M. Oliveberg, N. Budisa, Performance Analysis of Orthogonal Pairs Designed for an Expanded Eukaryotic Genetic Code. PLoS One. 7, e31992 (2012); J. W. Monk et al., Rapid and Inexpensive Evaluation of Nonstandard Amino Acid Incorporation in Escherichia coli. ACS Synth. Biol. (2016), doi: 10.1021/acssynbio.6b00192). The problem of OTS cross-talk with SAAs is exemplified in the case of biocontainment, which was previously demonstrated based on the NSAA biphenylalanine (BipA) and its corresponding OTS (See, J. Xie, W. Liu, P. G. Schultz, A Genetically Encoded Bidentate, Metal-Binding Amino Acid. Angew. Chemie. 119, 9399-9402 (2007)). Protease mutations were found in sequenced escapees that emerged in the absence of BipA, suggesting that redesigned enzymes intended to be destabilized by SAA misincorporation may transiently remain functional prior to degradation (See, D. J. Mandell et al., Biocontainment of genetically modified organisms by synthetic protein design. Nature. 518, 55-60 (2015)). Furthermore, genomic integration of the BipA OTS, which likely decreased misincorporation, reduced escape frequency. Given that OTS evolution efforts have not selected against activity upon undesired NSAAs, greater promiscuity is expected in the presence of multiple NSAAs. OTS promiscuity is of particular concern when using members of the TyrRS/TrpRS/PylRS families together, given demonstrated overlap of substrate ranges (See, C. Fan, J. M. L. Ho, N. Chirathivat, D. Söll, Y.-S. Wang, Exploring the Substrate Range of Wild-Type Aminoacyl-tRNA Synthetases. ChemBioChem. 15, 1805-1809 (2014); L.-T. Guo et al., Polyspecific pyrrolysyl-tRNA synthetases from directed evolution. Proc. Natl. Acad. Sci. U. S. A. 111, 16724-16729 (2014); Y.-S. Wang et al., The de novo engineering of pyrrolysyl-tRNA synthetase for genetic incorporation of L-phenylalanine and its derivatives. Mol. Biosyst. 7, 714-717 (2011)). Together, these concerns converge as progress is made towards constructing a 57-codon E. coli strain anticipated to exhibit multi-virus resistance, to require biocontainment, and to serve as a platform for producing proteins containing multiple different NSAAs (See, N. Ostrov et al., Design, synthesis, and testing toward a 57-codon genome. Science (80-. ). 353, 819-822 (2016)). Many other applications utilizing NSAAs, such as protein double labelling, FRET, and antibody conjugation, also require high-fidelity incorporation to avoid heterogeneous protein production.
There is a continuing need in the art to develop BipA OTS variants with improved efficiency and accuracy for incorporating desired NSAAs into peptides.
SUMMARY
The present disclosure provides a method of screening for an amino acyl tRNA synthetase variant having preferential selectivity for a desired non-standard amino acid (NSAA) over its standard amino acid (SAA) counterpart or an undesired non-standard amino acid for incorporation into a target polypeptide in a cell. According to one aspect, the method includes providing to the cell an amino acyl tRNA synthetase variant and its cognate transfer RNA corresponding to the desired NSAA, wherein the cell is genetically engineered to express the target polypeptide including an amino acid target location for incorporation of the desired NSAA by the amino acyl tRNA synthetase variant and the transfer RNA, and wherein the cell expresses the target polynucleotide and either a desired NSAA, an SAA or an undesired NSAA is incorporated at the amino acid target location depending on the preferential selectivity of the amino acyl tRNA synthetase variant and the transfer RNA for the corresponding desired NSAA, wherein a removable protecting group is attached to the target polypeptide adjacent to the amino acid target location, such that when the removable protecting group is removed, an N-end amino acid is exposed at the amino acid target location, and wherein a detectable moiety is attached to the C-end of the target polypeptide, wherein the cell expresses an enzyme that cleaves the removable protecting group to generate an N-end amino acid, and wherein the cell further expresses an adaptor protein for a protease, wherein the protease degrades the target polypeptide when the N-end amino acid is an SAA or an undesired NSAA; detecting the detectable moiety as a measure of the amount of target polypeptide including the desired NSAA within the cell; and repeatedly testing an amino acyl tRNA synthetase variant for improved production of the target polypeptide including the desired NSAA.
In one embodiment, the removable protecting group is ubiquitin that is cleavable by Ubp1. In another embodiment, the detectable moiety is a fluorescent moiety or a reporter protein. In certain embodiments, the cell expresses the enzyme for cleaving the removable protecting group constitutively or inducibly. In other embodiments, the adaptor protein and the protease are a ClpS-ClpAP protease system, wherein the ClpS-ClpAP protease system degrades the target polypeptide when the N-end amino acid is an SAA or an undesired NSAA to thereby enrich the target polypeptide including the desired NSAA within the cell. In some embodiments, the adaptor protein comprises a ClpS protein, its natural homolog, or a ClpS_V65I, ClpS_V43I or ClpS_L32F mutant. In some embodiments, the cell is a prokaryotic cell or a eukaryotic cell. In one embodiment, the cell is a bacterium. In another embodiment, the cell is a genetically modified E. coli. In one embodiment, the desired NSAA is biphenylalanine (BipA). In certain embodiments, the amino acyl tRNA synthetase variant is a biphenylalanine amino acyl tRNA synthetase (BipARS) variant. In one embodiment, the amino acyl tRNA synthetase variant is generated by introducing mutations throughout the wild type amino acyl tRNA synthetase gene. In an exemplary embodiment, error-prone PCR is used to introduce mutations throughout the wild type amino acyl tRNA synthetase gene. In one embodiment, the amino acyl tRNA synthetase variant is provided to the cell by a nucleic acid encoding the amino acyl tRNA synthetase variant. In another embodiment, the transfer RNA is provided to the cell by a nucleic acid encoding the transfer RNA.
According to one aspect, the present disclosure provides an amino acyl tRNA synthetase variant comprising variant 1 to variant 11. According to another aspect, the present disclosure provides a nucleic acid encoding the amino acyl tRNA synthetase variants 1 to 11.
According to one aspect, the present disclosure provides a transfer RNA variant comprising variant 4 tRNA, variant 9 tRNA, and variant 10 tRNA. According to another aspect, the present disclosure provides a nucleic acid encoding the transfer RNA variants of variant 4 tRNA, variant 9 tRNA, and variant 10 tRNA.
According to another aspect, the present disclosure provides a biphenylalanine amino acyl tRNA synthetase variant wherein the variant comprises one or more amino acid substitutions to a parental biphenylalanine amino acyl tRNA synthetase having the sequence of
NIDEFEMIKRNTSEIISEEELREVLKKDEKSAHIGFEPSGKIHLGHYL-QIKKMIDLQNAG FDIIMLADLHAYLNQKGELDEIRKIGDYNKK EAMGLKAKYVYGSEWMLDKDYT
LNVYRLALKTTLKRARR
EQRKIHMLARELLPKK CfflNP VLTGLDGEGKMS S SKGNFIAVDD SPEEIRAKD KA YCPAGVVEGOTIMEIAKWLEYPLTK
KNAVAEELIKILEPIRKRL (SEQ ID NO: 1). In some embodiments, the variant includes one or more amino acid substitutions selected from the group consisting of N157K and I255F, R257G, R181C and E259V, I153V and A214T, P37A, K76R, I49F, A130V and A233V, L55M and G158S, D61V and H70Q, and N117D, D200Y, G210S, E237V and D286Y to the parental biphenylalanine amino acyl tRNA synthetase, or an amino acid sequence having at least 90% sequence identity thereof. In an exemplary embodiment, the variant includes amino acid substitutions D61V and H70Q to the parental biphenylalanine amino acyl tRNA synthetase, or an amino acid sequence having at least 90% sequence identity thereof. In one embodiment, an isolated polynucleotide encoding the synthetase variants described herein is provided. In another embodiment, a host cell comprising an expression vector is provided. In one embodiment, the expression vector comprises the polynucleotide encoding the synthetase variants described herein.
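The "at least 90% sequence identity" threshold can be illustrated with a minimal sketch. It assumes the two sequences are the same length and differ only by substitutions, so no alignment is needed; real identity calculations generally rely on a gapped alignment, which this sketch does not perform. The function name and the 20-residue toy sequences are hypothetical and are not taken from SEQ ID NO: 1.

```python
def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences.

    Assumption: the sequences differ only by substitutions, so residues
    can be compared position by position without an alignment.
    """
    if len(a) != len(b):
        raise ValueError("ungapped comparison requires equal-length sequences")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Hypothetical toy sequences (not from the disclosure): 20 residues,
# with the last two positions substituted in the variant.
toy_parent = "ACDEFGHIKLMNPQRSTVWY"
toy_variant = "ACDEFGHIKLMNPQRSTVAA"

identity = percent_identity(toy_parent, toy_variant)
meets_threshold = identity >= 90.0
```

Under this simplification, a variant carrying k substitutions in a parent of length n has identity 100 x (n - k) / n, so a short toy sequence reaches the 90% boundary after only two substitutions, whereas a full-length synthetase tolerates proportionally more.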
According to one aspect, the present disclosure provides a transfer RNA (tRNA) variant wherein the variant comprises one or more nucleotide substitutions to a parental tRNA having the sequence of ccggcggtagttcagcagggcagaacggcggactctaaatccgcatggcaggggttcaaatcccctccgccggacca (SEQ ID NO: 2). In some embodiments, the tRNA variant includes a nucleotide substitution selected from the group consisting of A22G, C67A, C26T, C29A, G51T and G23A to the parental tRNA, or a nucleotide sequence having at least 90% sequence identity thereof. In one embodiment, an isolated polynucleotide encoding each of the tRNA variants described herein is provided. In another embodiment, a host cell comprising an expression vector is provided. In one embodiment, the expression vector comprises the polynucleotide encoding each of the tRNA variants described herein.
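The substitution notation used here (reference base, 1-based position, replacement base) can be checked directly against SEQ ID NO: 2. The following is a minimal sketch, assuming positions are counted from the first nucleotide of the sequence as printed; the function and variable names are illustrative and do not appear in the disclosure.

```python
import re

# Parental tRNA sequence as printed in SEQ ID NO: 2 (77 nucleotides).
PARENTAL_TRNA = (
    "ccggcggtagttcagcagggcagaacggcggactctaaat"
    "ccgcatggcaggggttcaaatcccctccgccggacca"
)

# Substitutions listed in the text.
TRNA_SUBSTITUTIONS = ("A22G", "C67A", "C26T", "C29A", "G51T", "G23A")


def apply_substitution(seq: str, mutation: str) -> str:
    """Apply a substitution such as 'A22G' (1-based position) to seq.

    Raises ValueError if the notation is malformed, the position is out
    of range, or the reference base does not match the sequence.
    """
    m = re.fullmatch(r"([A-Za-z])(\d+)([A-Za-z])", mutation)
    if m is None:
        raise ValueError(f"unrecognized mutation notation: {mutation!r}")
    ref, pos, alt = m.group(1).lower(), int(m.group(2)), m.group(3).lower()
    if not 1 <= pos <= len(seq):
        raise ValueError(f"position {pos} is outside the sequence")
    if seq[pos - 1].lower() != ref:
        raise ValueError(
            f"{mutation}: expected {ref!r} at position {pos}, "
            f"found {seq[pos - 1]!r}"
        )
    return seq[: pos - 1] + alt + seq[pos:]


# Build each single-substitution variant from the parental sequence.
trna_variants = {
    mut: apply_substitution(PARENTAL_TRNA, mut) for mut in TRNA_SUBSTITUTIONS
}
```

With this numbering, the reference base of each of the six listed substitutions agrees with the printed sequence, which is a quick consistency check on both the sequence and the substitution list.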
According to another aspect, the present disclosure provides a biphenylalanine amino acyl tRNA synthetase and tRNA pair wherein the pair is selected from the group consisting of i) a biphenylalanine amino acyl tRNA synthetase variant comprising amino acid substitutions N157K and I255F to the parental biphenylalanine amino acyl tRNA synthetase and the parental tRNA; ii) a biphenylalanine amino acyl tRNA synthetase variant comprising an amino acid substitution R257G to the parental biphenylalanine amino acyl tRNA synthetase and the parental tRNA; iii) a biphenylalanine amino acyl tRNA synthetase variant comprising amino acid substitutions R181C and E259V to the parental biphenylalanine amino acyl tRNA synthetase and a tRNA variant comprising a nucleotide substitution A22G to the parental tRNA; iv) a biphenylalanine amino acyl tRNA synthetase variant comprising amino acid substitutions I153V and A214T to the parental biphenylalanine amino acyl tRNA synthetase and a tRNA variant comprising a nucleotide substitution C67A to the parental tRNA; v) a biphenylalanine amino acyl tRNA synthetase variant comprising an amino acid substitution P37A to the parental biphenylalanine amino acyl tRNA synthetase and the parental tRNA; vi) a biphenylalanine amino acyl tRNA synthetase variant comprising an amino acid substitution K76R to the parental biphenylalanine amino acyl tRNA synthetase and the parental tRNA; vii) the parental biphenylalanine amino acyl tRNA synthetase and a tRNA variant comprising a nucleotide substitution A22G to the parental tRNA; viii) a biphenylalanine amino acyl tRNA synthetase variant comprising amino acid substitutions I49F, A130V and A233V to the parental biphenylalanine amino acyl tRNA synthetase and a tRNA variant comprising a nucleotide substitution C26T to the parental tRNA; ix) a biphenylalanine amino acyl tRNA synthetase variant comprising amino acid substitutions L55M and G158S to the parental biphenylalanine amino acyl tRNA synthetase and a tRNA variant comprising a nucleotide substitution C29A to the parental tRNA; x) a biphenylalanine amino acyl tRNA synthetase variant comprising amino acid substitutions D61V and H70Q to the parental biphenylalanine amino acyl tRNA synthetase and a tRNA variant comprising a nucleotide substitution G51T to the parental tRNA; and xi) a biphenylalanine amino acyl tRNA synthetase variant comprising amino acid substitutions N117D, D200Y, G210S, E237V and D286Y to the parental biphenylalanine amino acyl tRNA synthetase and a tRNA variant comprising a nucleotide substitution G23A to the parental tRNA.
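The eleven synthetase/tRNA pairs enumerated above can be held in a small lookup table, which makes it easier to see which pairs share a tRNA substitution or retain the parental tRNA. This is only an illustrative sketch: the type and function names are hypothetical, an empty tuple stands for the parental sequence, the fourth pair is labeled "iv" on the assumption that the printed second "vi" is a typographical slip, and the substitutions in pairs i and xi are read as N157K and N117D.

```python
from typing import List, NamedTuple, Tuple


class OTSPair(NamedTuple):
    label: str                        # roman numeral used in the text
    synthetase_subs: Tuple[str, ...]  # empty tuple = parental synthetase
    trna_subs: Tuple[str, ...]        # empty tuple = parental tRNA


PAIRS: Tuple[OTSPair, ...] = (
    OTSPair("i", ("N157K", "I255F"), ()),
    OTSPair("ii", ("R257G",), ()),
    OTSPair("iii", ("R181C", "E259V"), ("A22G",)),
    OTSPair("iv", ("I153V", "A214T"), ("C67A",)),  # label assumed, see lead-in
    OTSPair("v", ("P37A",), ()),
    OTSPair("vi", ("K76R",), ()),
    OTSPair("vii", (), ("A22G",)),
    OTSPair("viii", ("I49F", "A130V", "A233V"), ("C26T",)),
    OTSPair("ix", ("L55M", "G158S"), ("C29A",)),
    OTSPair("x", ("D61V", "H70Q"), ("G51T",)),
    OTSPair("xi", ("N117D", "D200Y", "G210S", "E237V", "D286Y"), ("G23A",)),
)


def pairs_with_parental_trna() -> List[str]:
    """Labels of pairs that keep the parental tRNA."""
    return [p.label for p in PAIRS if not p.trna_subs]


def pairs_with_trna_sub(mutation: str) -> List[str]:
    """Labels of pairs whose tRNA carries the given substitution."""
    return [p.label for p in PAIRS if mutation in p.trna_subs]
```

For example, `pairs_with_trna_sub("A22G")` returns the two pairs (iii and vii) that use the A22G tRNA, one with a variant synthetase and one with the parental synthetase.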
BRIEF DESCRIPTION OF THE DRAWINGS
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee. The foregoing and other features and advantages of the present invention will be more fully understood from the following detailed description of illustrative embodiments taken in conjunction with the accompanying drawings in which:
Figs. 1A-1G illustrate the use of post-translational proofreading (PTP) for selective BipA OTS evolution. Fig. 1A shows FACS evolution scheme with EP-PCR AARS libraries transformed into hosts with PTP (using ClpS_V65I) genomically integrated before 3 sorting rounds. Fig. 1B shows evaluation of most enriched evolved BipARS variants on a panel of NSAAs ([BipA] = 100 μM, [rest] = 1 mM, which are their standard concentrations). Fig. 1C shows in vitro amino acid substrate specificity profile of BipA OTS variants. Fig. 1D shows escape frequencies over time for adk.d6 strains transformed with constructs indicated in legend. Navy circles represent previously published data. Gray circles for adk.d6 represent repeat of previously published data. Green and yellow circles are the most relevant constructs to compare for this study. KA: Kanamycin+Arabinose. SCA: SDS+Chloramphenicol+Arabinose. Error bars in D-F represent SEM, N=3. Fig. 1E shows escape frequencies over time for tyrS.d8 strains. Lines represent assay detection limit in cases where no colonies were observed. Fig. 1F shows escape frequencies over time for adk.d6/tyrS.d8 strains. Fig. 1G shows doubling time for biocontained strains with WT or Variant 10 OTS. Error bars = SD, N=3.
Fig. 2 shows FACS data from BipARS EP-PCR library exposed to negative screens of differing stringency.
Figs. 3A-3D show confirmation of BipA incorporation by mass spectrometry (MS). Fig. 3A shows SDS-PAGE gel of Ni-NTA purified Ub-X-GFP reporter proteins. Fig. 3B shows MS trace indicating incorporation of tyrosine in position X in peptide GGXLFVQELASK (SEQ ID NO: 3) (positions 75-86 of Ub-X-GFP) using WT BipA OTS and no addition of BipA. Fig. 3C shows MS trace indicating incorporation of BipA in position X of the same peptide using WT BipA OTS in the presence of BipA. Fig. 3D shows MS trace indicating incorporation of BipA in position X of the same peptide using BipA 10 OTS in the presence of BipA.
Figs. 4A-4B show sample images of plates depicting biocontainment escape frequency estimation. Fig. 4A shows total CFU estimation on permissive media. Fig. 4B shows escapee estimation on non-permissive media.
Figs. 5A-5D show spontaneous tRNA mutations observed in sorted variants and effect on selectivity. Fig. 5A shows positions of observed tRNA mutations on the predicted MjTyrRS tRNAopt structure. Note that the position of the BipA OTS Variant 10 tRNA is the most influential for interaction with elongation factor Tu (EF-Tu). Figure 5A discloses SEQ ID NO: 153. Fig. 5B shows FL/OD measurements after cloning each combination of BipARS and tRNA variant. Each of the 3 variant tRNAs confers selectivity against standard amino acids (represented by the "No NSAA" case) regardless of the BipARS pairing. Variant 10 BipARS with Variant 10 tRNA is the most selective for BipA compared to the other NSAAs shown above. Fig. 5C shows in vitro amino acid substrate specificity of Variant 9 BipARS with WT tRNA or Variant 9 tRNA. Fig. 5D shows in vitro amino acid substrate specificity of Variant 10 BipARS with WT tRNA or Variant 10 tRNA.
Fig. 6 shows a single UAG suppression sensitivity assay with and without PTP (using ClpS_V65I, which does not degrade pAcF or pAzF), revealing that AARSs evolved using a strategy geared towards multi-UAG suppression (See, M. Amiram et al., Evolution of translation machinery in recoded bacteria enables multi-site incorporation of nonstandard amino acids. Nat. Biotech. 33, 1272-1279 (2015)) display very low fidelity for single UAG sites. Progenitor: pAcFRS_D286R. Evolved strains: pAzFRS.1.t1 and pAcFRS.2.t1.
DETAILED DESCRIPTION
The present disclosure provides a method of screening for an amino acyl tRNA synthetase variant having preferential selectivity for a desired non-standard amino acid (NSAA) over its standard amino acid (SAA) counterpart or an undesired non-standard amino acid for incorporation into a target polypeptide in a cell. As used herein, the terms "polypeptide" and "protein" include compounds that include amino acids joined together by peptide bonds between the carboxyl and amino groups of adjacent amino acid residues. Exemplary cells include prokaryotic cells and eukaryotic cells. Exemplary prokaryotic cells include bacteria, such as E. coli, such as genetically modified E. coli. In exemplary embodiments, the cell is genetically modified to express the target polypeptide including an amino acid target location for incorporation of a desired non-standard amino acid substitution by an engineered amino-acyl tRNA synthetase variant and transfer RNA pair corresponding to the non-standard amino acid. A removable protecting group is attached to the target polypeptide adjacent to the amino acid target location, such that when the removable protecting group is removed, an N-end amino acid is exposed at the amino acid target location. According to one aspect, the removable protecting group is orthogonal within the cell in which it is being used.
According to certain aspects, the cell includes a protease system for degrading the target polypeptide when the N-end amino acid is a standard amino acid. According to certain aspects, the cell includes a protease system for degrading the target polypeptide when the N-end amino acid is an undesired NSAA. According to certain aspects, the protease system includes an adapter protein and a corresponding protease. The adapter protein coordinates with the protease for degrading the target polypeptide when the N-end amino acid is a standard amino acid. According to one aspect, the protease system is endogenous. According to one aspect, the protease and adaptor can be expressed constitutively. According to one aspect, the protease system is exogenous. According to one aspect, the protease system is under influence of a promoter. According to one aspect, the adapter protein of the protease system is under influence of an inducible promoter. According to one aspect, the adapter protein is upregulated.
According to one aspect, overexpression of adaptor to produce adaptor levels in excess of that found normally within a cell improves degradation of polypeptides having an undesired amino acid at the amino acid target location. According to one aspect, an adaptor protein is provided that facilitates N-end rule classification of an NSAA (See, D. B. F. Johnson et al., RF1 knockout allows ribosomal incorporation of unnatural amino acids at multiple sites. Nat. Chem. Biol. 7, 779-786 (2011); P. O'Donoghue et al., Near-cognate suppression of amber, opal and quadruplet codons competes with aminoacyl-tRNAPyl for genetic code expansion. FEBS Lett. 586, 3931-3937 (2012); A. Bachmair, D. Finley, A. Varshavsky, In vivo half-life of a protein is a function of its amino-terminal residue. Science (80-. ). 234, 179-186 (1986); J. W. Tobias, T. E. Shrader, G. Rocap, A. Varshavsky, The N-end rule in bacteria. Science (80-. ). 254, 1374-1377 (1991); T. Tasaki, S. M. Sriram, K. S. Park, Y. T. Kwon, The N-End Rule Pathway. Annu. Rev. Biochem. 81, 261-289 (2012)).
Because the N-end rule pathway of protein degradation is conserved across prokaryotes and eukaryotes, methods described herein are useful in prokaryotes and eukaryotes. The removable protecting group should be orthogonal in the cell within which it is being used. Ubiquitin is a suitable protecting group in prokaryotic cells because it is orthogonal, but it is not a suitable protecting group in eukaryotic cells because it is not orthogonal. In eukaryotic cells, ubiquitin is N-terminally added to proteins, often to initiate the process of protein degradation in the proteasome. In addition, the adaptor proteins in eukaryotic cells are homologs of ClpS known as ubiquitin E3 ligases. According to the present disclosure, a ubiquitin E3 ligase domain is altered in order to change the N-end rule classification of an NSAA.
According to one aspect, the removable protecting group is removed to generate an N-end amino acid, and the protease degrades the target polypeptide when the N-end amino acid is a standard amino acid or an undesired NSAA. In this manner, the target polypeptide including a desired non-standard amino acid substitution, i.e. which is resistant to degradation, is enriched within the cell. According to one aspect, embodiments of the disclosure are directed to methods that allow selective degradation of proteins having a standard amino acid or undesired NSAA instead of a desired nonstandard amino acid at their N-termini in a cell. The methods can be used for producing proteins with desired nonstandard amino acids at their N-termini with no detectable impurities.
According to one aspect, a method of identifying the presence of a target polypeptide including a desired non-standard amino acid, i.e. one which is resistant to degradation, is provided. According to this aspect, the target polypeptide includes a detectable moiety attached to the C-end of the target polypeptide. In this manner, if the target polypeptide (and detectable moiety) that is made by the cell is not subject to degradation as described above, then the detectable moiety is detected as a measure of the amount of target polypeptide generated by the cell. Accordingly, a method is provided where a detectable moiety is present at the C-end of the target polypeptide, the removable protecting group is removed to generate an N-end amino acid, the protease (whether accompanied by an adapter protein or not depending upon the protease system being used) degrades the target polypeptide when the N-end amino acid is a standard amino acid or an undesired NSAA, for example, to thereby enrich the target polypeptide including a desired non-standard amino acid substitution, and the detectable moiety is detected as a measure of the amount of the target polypeptide including a desired non-standard amino acid substitution.
According to one aspect, a method is provided for screening for amino acyl tRNA synthetase variants that are more selective for incorporating non-standard amino acids versus standard amino acids at a selected site in a protein. Since all or substantially all of the proteins bearing a standard amino acid or an undesired NSAA at their N-terminus are degraded, leaving only proteins with a desired nonstandard amino acid at their N-terminus, no or substantially no background signal due to standard amino acid or undesired NSAA incorporation results from the method. Synthetases can be evolved and their variants screened in a high-throughput fashion for their function of producing a protein incorporating a nonstandard amino acid, such as a desired NSAA. In this manner, those synthetases with improved function can be identified and modified further to improve efficiency and selectivity.
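The screening logic above can be illustrated with a toy model. This is only a sketch under idealized assumptions: every polypeptide whose exposed N-end residue is not the desired NSAA is treated as fully degraded, and the residue labels (including "pAcF" as an undesired NSAA) and function names are hypothetical and do not come from the disclosure.

```python
def surviving_reporters(n_end_residues, desired_nsaa="BipA"):
    """Return the polypeptides that escape degradation in this toy model.

    Each entry is the residue exposed at the N-end after the protecting
    group is removed. Anything other than the desired NSAA is treated as
    degraded, which is the idealized case (no background signal).
    """
    return [r for r in n_end_residues if r == desired_nsaa]


def reporter_signal(n_end_residues, desired_nsaa="BipA"):
    """Fraction of synthesized polypeptides that keep their C-end reporter."""
    if not n_end_residues:
        return 0.0
    survivors = surviving_reporters(n_end_residues, desired_nsaa)
    return len(survivors) / len(n_end_residues)


# Hypothetical output of one synthetase variant: a mix of desired NSAA,
# standard amino acids (Y, F) and an undesired NSAA (pAcF).
population = ["BipA", "Y", "BipA", "F", "BipA", "pAcF"]
print(reporter_signal(population))  # 0.5
```

In this simplified picture, a more selective synthetase variant yields a larger surviving fraction, so the reporter signal can rank variants against one another.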
I. Recoding Cells That Incorporate a Desired NSAA in a Target Polypeptide
In general, a cell can be genetically modified to include a nucleic acid sequence which encodes the target polypeptide that incorporates one or more non-standard amino acids within its amino acid sequence. The cell can be genomically recoded ("a genomically recoded organism") to the extent that one or more codons have been reassigned to encode a nonstandard amino acid. For each different non-standard amino acid, an amino-acyl tRNA synthetase/tRNA pair is engineered and the cell is capable of using the amino-acyl tRNA synthetase/tRNA pair to add the corresponding non-standard amino acid (when present in the cell) to a growing peptide sequence. Materials, conditions, and reagents for genetically modifying a cell to make a target protein having one or more amino acid sequences are described in the following references, each of which is hereby incorporated by reference in its entirety.
Approaches to genomically recode organisms include multiplex automatable genome engineering (MAGE) (for example, as described in Wang, Harris H., et al. "Programming cells by multiplex genome engineering and accelerated evolution." Nature 460.7257 (2009): 894-898, hereby incorporated by reference in its entirety) and hierarchical conjugative assembly genome engineering (CAGE) (for example, as described in Isaacs, Farren J., et al. "Precise manipulation of chromosomes in vivo enables genome-wide codon replacement." Science 333.6040 (2011): 348-353, hereby incorporated by reference in its entirety). In addition, portions of recoded genomes can be synthesized and subsequently assembled, as described recently in an effort to construct a 57-codon organism (for example, as described in Ostrov, Nili, et al. "Design, synthesis, and testing toward a 57-codon genome." Science 353.6301 (2016): 819-822, hereby incorporated by reference in its entirety). The modification of an organism, whether recoded or not recoded, in order to express a polypeptide containing a site-specific non-standard amino acid has been described extensively in the literature (for example, as described in Wang, Lei, et al. "Expanding the genetic code of Escherichia coli." Science 292.5516 (2001): 498-500; Chin, Jason W., et al. "An expanded eukaryotic genetic code." Science 301.5635 (2003): 964-967; Wang, Lei, and Peter G. Schultz. "Expanding the genetic code." Angewandte Chemie International Edition 44.1 (2005): 34-66; Liu, Chang C., and Peter G. Schultz. "Adding new chemistries to the genetic code." Annual Review of Biochemistry 79 (2010): 413-444; Chin, Jason W. "Expanding and reprogramming the genetic code of cells and animals." Annual Review of Biochemistry 83 (2014): 379-408, each of which is hereby incorporated by reference in its entirety). In brief, foreign nucleic acid sequences containing a gene encoding an orthogonal amino-acyl tRNA synthetase and an associated tRNA are introduced into an organism, typically in an expression vector. In addition, a desired non-standard amino acid is added to the cell culture medium. A nucleic acid sequence corresponding to a target protein is modified so that a free codon, such as the UAG codon, is formed at the target site of the gene encoding the target protein. In the presence of these four components - aminoacyl tRNA synthetase protein, tRNA, NSAA, and target protein mRNA - the target protein containing the NSAA is made.
Basic to the present disclosure is the use of an amino-acyl tRNA synthetase/tRNA pair cognate to a nonstandard amino acid. Exemplary amino-acyl tRNA synthetase/tRNA pairs cognate to a nonstandard amino acid are known to those of skill in the art or may be designed for particular non-standard amino acids, as is known in the art or as described in Wang, Lei, and Peter G. Schultz. "Expanding the genetic code." Angewandte Chemie International Edition 44.1 (2005): 34-66; Liu, Chang C., and Peter G. Schultz. "Adding new chemistries to the genetic code." Annual Review of Biochemistry 79 (2010): 413-444; and Chin, Jason W. "Expanding and reprogramming the genetic code of cells and animals." Annual Review of Biochemistry 83 (2014): 379-408, each of which is hereby incorporated by reference in its entirety.
According to one aspect, the amino-acyl tRNA synthetase/tRNA pair cognate to a nonstandard amino acid is orthogonal to the cellular components of the cell in which it is used. The orthogonality (and therefore the suitability) of exogenous amino-acyl tRNA synthetase/tRNA pairs is dependent on the type of host organism. Four main orthogonal aminoacyl-tRNA synthetase/tRNA pairs have been developed for genetic code expansion: the Methanococcus jannaschii tyrosyl-tRNA synthetase (MjTyrRS)/tRNACUA pair, the Escherichia coli tyrosyl-tRNA synthetase (EcTyrRS)/tRNACUA pair, the E. coli leucyl-tRNA synthetase (EcLeuRS)/tRNACUA pair, and pyrrolysyl-tRNA synthetase (PylRS)/tRNACUA pairs from certain Methanosarcina. The MjTyrRS/tRNACUA pair is orthogonal in E. coli but not in eukaryotic cells. The EcTyrRS/tRNACUA pair and the EcLeuRS/tRNACUA pair are orthogonal in eukaryotic cells but not in E. coli, whereas the PylRS/tRNACUA pair is orthogonal in bacteria, eukaryotic cells, and animals (see Chin, Jason W. "Expanding and reprogramming the genetic code of cells and animals." Annual Review of Biochemistry 83 (2014): 379-408, hereby incorporated by reference in its entirety). To maintain orthogonality, the exogenous amino acyl tRNA synthetase should not recognize any native amino acids or native tRNA. To maintain orthogonality, the tRNA should not be recognized by any native amino-acyl tRNA synthetases. To maintain orthogonality, the non-standard amino acid should not be recognized by any native amino acyl tRNA synthetases. "Orthogonal" pairs meet one or more of the above conditions. It is to be understood that "orthogonal" pairs may lead to some mischarging, such as insubstantial mischarging, for example, of orthogonal tRNA with native amino acids so long as sufficient efficiency of charging to the designed NSAA occurs.
Exemplary families of synthetases for bacteria in addition to those described above and incorporated by reference include the PylRS/tRNACUA pair and the Saccharomyces cerevisiae tryptophanyl-tRNA synthetase (ScWRS)/tRNACUA pair. These exemplary synthetase families have natural analogs (lysine and tryptophan) that are N-end destabilizing amino acids. The following references describe useful synthetase families and their associated NSAAs: Blight, Sherry K., et al. "Direct charging of tRNACUA with pyrrolysine in vitro and in vivo." Nature 431.7006 (2004): 333-335; Namy, Olivier, et al. "Adding pyrrolysine to the Escherichia coli genetic code." FEBS Letters 581.27 (2007): 5282-5288; Hughes, Randall A., and Andrew D. Ellington. "Rational design of an orthogonal tryptophanyl nonsense suppressor tRNA." Nucleic Acids Research 38.19 (2010): 6813-6830; Ellefson, Jared W., et al. "Directed evolution of genetic parts and circuits by compartmentalized partnered replication." Nature Biotechnology 32.1 (2014): 97-101; and Chatterjee, Abhishek, et al. "A Tryptophanyl-tRNA Synthetase/tRNA Pair for Unnatural Amino Acid Mutagenesis in E. coli." Angewandte Chemie International Edition 52.19 (2013): 5106-5109, each of which is hereby incorporated by reference in its entirety. As is known in the art, the synthetase catalyzes a reaction that attaches the nonstandard amino acid to the correct tRNA. The amino-acyl tRNA then migrates to the ribosome. The ribosome adds the nonstandard amino acid where the tRNA anticodon corresponds to the reverse complement of the codon on the mRNA of the target protein to be translated.
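The codon-anticodon relationship described above can be sketched in a few lines. This is a minimal illustration that assumes strict Watson-Crick pairing and ignores wobble pairing and modified bases; the function name is hypothetical.

```python
# Watson-Crick complements for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon):
    """Return the anticodon (written 5'->3') that pairs with an mRNA codon (5'->3').

    The anticodon is the reverse complement of the codon.
    """
    return "".join(COMPLEMENT[base] for base in reversed(codon.upper()))


# The amber stop codon UAG is read by a suppressor tRNA with a CUA anticodon,
# which is why the pairs above are written as tRNACUA.
print(anticodon_for("UAG"))  # CUA
```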
II. Orthogonal Translation Systems and Variants Thereof
According to one aspect, a method is provided for screening for amino acyl tRNA synthetase variants and their cognate transfer RNA variants having improved selectivity for incorporating a desired non-standard amino acid versus a standard amino acid or an undesired non-standard amino acid at a selected site in a protein or a polypeptide. According to one aspect, the screening is based on using prokaryotic or eukaryotic cells containing a ClpS-ClpAP protease system. In certain exemplary embodiments, the protease system includes the adaptor protein ClpS or homologs or mutants thereof, such as ClpS_V65I, ClpS_V43I or ClpS_L32F. In certain embodiments, adaptor protein ClpS variants including ClpS_V65I, ClpS_V43I or ClpS_L32F are used since they exhibit improved selectivity for certain amino acids, such as between standard amino acids and non-standard amino acids or between a desired NSAA and an undesired NSAA.
In exemplary embodiments, biphenylalanine (BipA) aminoacyl-tRNA synthetase (BipARS) variants are generated by making one or more amino acid substitutions of a parental biphenylalanine amino acyl tRNA synthetase having the amino acid sequence of MDEFEMI RNT SEESEEELRE VLKKDEK S AHIGFEP SGKIHLGHYLQIKKMIDLQNAG FDEIFILADLHAYLNQKGELDEIRKIGDYNKK EAMGLKAKYVYGSEWMLDKDYT LNVYRLAIJ TTLKRARRSMEU^
EQRKIHMLARELLPKKVVCIHNPVLTGLDGEGKMSSS GNFIAVDDSPEEERAKI KA
YCI^AGVVEGNPIMEIAKYFLEYPL'Il M5E FGGDLT\rNSYEELESLFKNKELHl¾lDL KNAVAEELIKILEPIRKRL (SEQ ID NO: 1). In this manner, synthetases can be evolved and their variants screened in a high-throughput fashion for their function of producing a protein or polypeptide incorporating a biphenylalanine at a desired position in the protein or polypeptide. In this manner, those synthetases with improved function can be identified and modified further to improve efficiency and selectivity. In some embodiments, the synthetase variant includes at least one, two, three, four, five, six, seven, eight, nine or ten amino acid substitutions of the parental synthetase. In some embodiments, the synthetase variant includes from about ten to about twenty, or from about twenty to about fifty amino acid substitutions of the parental synthetase. In certain embodiments, the synthetase variant includes one or more amino acid substitutions selected from the group consisting of N157K and I255F, R257G, R181C and E259V, I153V and A214T, P37A, K76R, I49F, A130V and A233V, L55M and G158S, D61V and H70Q and N117D, D200Y, G210S, E237V and D286Y to the parental biphenylalanine amino acyl tRNA synthetase, or an amino acid sequence having at least 50%, at least 60%, at least 70%, at least 80%, e.g., at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity thereto. In an exemplary embodiment, the variant includes amino acid substitutions D61V and H70Q to the parental biphenylalanine amino acyl tRNA synthetase, or an amino acid sequence having at least 50%, at least 60%, at least 70%, at least 80%, e.g., at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity thereto.
According to one aspect, the present disclosure provides a transfer RNA (tRNA) variant wherein the variant comprises one or more nucleotide substitutions to a parental tRNA having the sequence of ccggcggtagttcagcagggcagaacggcggactctaaatccgcatggcaggggttcaaatcccctccgccggacca (SEQ ID NO: 2). In some embodiments, the tRNA variant includes at least one, two, three, four, five, six, seven, eight, nine or ten nucleotide substitutions of the parental tRNA. In some embodiments, the tRNA variant includes from about ten to about twenty, or from about twenty to about fifty nucleotide substitutions of the parental tRNA. In certain embodiments, the tRNA variant includes a nucleotide substitution selected from the group consisting of A22G, C67A, C26T, C29A, G51T and G23A to the parental tRNA, or a nucleotide sequence having at least 50%, at least 60%, at least 70%, at least 80%, e.g., at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity thereto.
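The substitution labels and the sequence identity thresholds above can be made concrete with a short sketch. It assumes that positions are numbered from 1 starting at the first printed nucleotide of SEQ ID NO: 2; under that assumption each of the six listed labels matches the parental base at the named position. The function names are hypothetical, and the identity calculation is a simple ungapped comparison of equal-length sequences rather than an alignment-based measure.

```python
import re

# Parental tRNA as printed for SEQ ID NO: 2 (DNA alphabet, 77 nt).
PARENTAL_TRNA = (
    "ccggcggtagttcagcagggcagaacggcggactctaaat"
    "ccgcatggcaggggttcaaatcccctccgccggacca"
)


def apply_substitution(seq, label):
    """Apply a label such as 'A22G' (old base, 1-based position, new base)."""
    match = re.fullmatch(r"([A-Za-z])(\d+)([A-Za-z])", label)
    if match is None:
        raise ValueError(f"unrecognized substitution label: {label}")
    old, pos, new = match.group(1).lower(), int(match.group(2)), match.group(3).lower()
    if not 1 <= pos <= len(seq):
        raise ValueError(f"position {pos} is outside the sequence")
    if seq[pos - 1].lower() != old:
        raise ValueError(f"expected {old} at position {pos}, found {seq[pos - 1]}")
    return seq[: pos - 1] + new + seq[pos:]


def percent_identity(a, b):
    """Ungapped percent identity between two sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


variant = apply_substitution(PARENTAL_TRNA, "A22G")
print(round(percent_identity(PARENTAL_TRNA, variant), 1))  # 98.7
```

A single substitution in the 77-nucleotide parental sequence leaves 76 of 77 positions unchanged, so such a variant sits well inside even the strictest identity thresholds recited above short of 100%.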
According to another aspect, the present disclosure provides a biphenylalanine amino acyl tRNA synthetase and tRNA pair. In certain embodiments, the pair includes either or both of a biphenylalanine amino acyl tRNA synthetase variant and a tRNA variant. In some embodiments, the pair includes i) a biphenylalanine amino acyl tRNA synthetase variant comprising amino acid substitutions N157K and I255F to the parental biphenylalanine amino acyl tRNA synthetase and the parental tRNA; ii) a biphenylalanine amino acyl tRNA synthetase variant comprising an amino acid substitution R257G to the parental biphenylalanine amino acyl tRNA synthetase and the parental tRNA; iii) a biphenylalanine amino acyl tRNA synthetase variant comprising amino acid substitutions R181C and E259V to the parental biphenylalanine amino acyl tRNA synthetase and a tRNA variant comprising a nucleotide substitution A22G to the parental tRNA; iv) a biphenylalanine amino acyl tRNA synthetase variant comprising amino acid substitutions I153V and A214T to the parental biphenylalanine amino acyl tRNA synthetase and a tRNA variant comprising a nucleotide substitution C67A to the parental tRNA; v) a biphenylalanine amino acyl tRNA synthetase variant comprising an amino acid substitution P37A to the parental biphenylalanine amino acyl tRNA synthetase and the parental tRNA; vi) a biphenylalanine amino acyl tRNA synthetase variant comprising an amino acid substitution K76R to the parental biphenylalanine amino acyl tRNA synthetase and the parental tRNA; vii) the parental biphenylalanine amino acyl tRNA synthetase and a tRNA variant comprising a nucleotide substitution A22G to the parental tRNA; viii) a biphenylalanine amino acyl tRNA synthetase variant comprising amino acid substitutions I49F, A130V and A233V to the parental biphenylalanine amino acyl tRNA synthetase and a tRNA variant comprising a nucleotide substitution C26T to the parental tRNA; ix) a biphenylalanine amino acyl tRNA synthetase variant comprising amino acid substitutions L55M and G158S to the parental biphenylalanine amino acyl tRNA synthetase and a tRNA variant comprising a nucleotide substitution C29A to the parental tRNA; x) a biphenylalanine amino acyl tRNA synthetase variant comprising amino acid substitutions D61V and H70Q to the parental biphenylalanine amino acyl tRNA synthetase and a tRNA variant comprising a nucleotide substitution G51T to the parental tRNA; or xi) a biphenylalanine amino acyl tRNA synthetase variant comprising amino acid substitutions N117D, D200Y, G210S, E237V and D286Y to the parental biphenylalanine amino acyl tRNA synthetase and a tRNA variant comprising a nucleotide substitution G23A to the parental tRNA.
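For reference, the eleven synthetase/tRNA pairs enumerated above can be collected into a single lookup table. The layout below is only one possible representation: an empty tuple stands for the unmodified parental component, and the names `PAIRS` and `describe` are hypothetical.

```python
# pair id -> (synthetase substitutions, tRNA substitutions);
# an empty tuple means the parental synthetase or parental tRNA is used.
PAIRS = {
    "i": (("N157K", "I255F"), ()),
    "ii": (("R257G",), ()),
    "iii": (("R181C", "E259V"), ("A22G",)),
    "iv": (("I153V", "A214T"), ("C67A",)),
    "v": (("P37A",), ()),
    "vi": (("K76R",), ()),
    "vii": ((), ("A22G",)),
    "viii": (("I49F", "A130V", "A233V"), ("C26T",)),
    "ix": (("L55M", "G158S"), ("C29A",)),
    "x": (("D61V", "H70Q"), ("G51T",)),
    "xi": (("N117D", "D200Y", "G210S", "E237V", "D286Y"), ("G23A",)),
}


def describe(pair_id):
    """Summarize one pair as 'synthetase + tRNA' in plain text."""
    synthetase, trna = PAIRS[pair_id]
    s = "synthetase variant " + "/".join(synthetase) if synthetase else "parental synthetase"
    t = "tRNA variant " + "/".join(trna) if trna else "parental tRNA"
    return f"{s} + {t}"


print(describe("x"))    # synthetase variant D61V/H70Q + tRNA variant G51T
print(describe("vii"))  # parental synthetase + tRNA variant A22G
```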
III. Removable Protecting Groups
According to one aspect, the target polypeptide includes a removable protecting group adjacent to the amino acid target location such that when the removable protecting group is removed, the amino acid target location is an N-end amino acid. Exemplary removable protecting groups are known to those of skill in the art and can be readily identified in the literature based on the present disclosure. According to one aspect, the removable protecting group is a peptide sequence produced by the cell when making the target polypeptide. According to one aspect, the removable protecting group is a peptide sequence produced by the cell when making the target polypeptide, such that the removable protecting group and the target polypeptide are a fusion. According to this aspect, the cell is genetically modified to include a foreign nucleic acid sequence encoding the target polypeptide including a non-standard amino acid substitution at an amino acid target location and a removable protecting group attached to the target polypeptide adjacent to the amino acid target location. According to one aspect, the removable protecting group is foreign to the cell, i.e. it is not endogenous to the cell. In this manner, the removable protecting group is orthogonal to endogenous enzymes or other conditions within the cell.
An exemplary removable protecting group includes a cleavable protecting group, such as an enzyme cleavable protecting group. According to one aspect, the cell produces an enzyme that cleaves the removable protecting group to generate an N-end amino acid. An exemplary removable protecting group is a protein that is cleavable by a corresponding enzyme. According to one aspect, a removable protecting group is foreign to the cell and is not endogenous. According to one aspect, the enzyme that cleaves the removable protecting group is foreign to the cell and is not endogenous. According to one aspect, an exemplary removable protecting group for prokaryotic cells is ubiquitin that is cleavable by Ubp1. According to another aspect, an exemplary removable protecting group for eukaryotic cells is the sequence MENLYFQ/* (SEQ ID NO: 4), where "*" is the target position for the NSAA (known in the field as the P1' position), where "/" represents the cut site, and where "ENLYFQ/*" (SEQ ID NO: 5) is the sequence that is cleavable by certain variants of TEV protease. Ordinarily, TEV protease cleavage efficiency is influenced by the choice of the amino acid at the P1' position. However, mutants of TEV protease have been engineered which have increased or altered substrate tolerance at the P1' position (see Renicke, Christian, Roberta Spadaccini, and Christof Taxis. "A Tobacco Etch Virus Protease with Increased Substrate Tolerance at the P1' position." PLoS ONE 8.6 (2013): e67915, hereby incorporated by reference in its entirety). The use of TEV protease in vivo in mammalian cells has been demonstrated and is described in Oberst, Andrew, et al. "Inducible dimerization and inducible cleavage reveal a requirement for both processes in caspase-8 activation." Journal of Biological Chemistry 285.22 (2010): 16632-16642, hereby incorporated by reference in its entirety.
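The effect of cleaving at the ENLYFQ/* site can be sketched as a string operation. This is only an illustration: "X" is a placeholder for the NSAA at the target position, the downstream residues "GSGS" are hypothetical, and the function name is an assumption. Real cleavage efficiency depends on the protease variant and the residue at the position after the cut site, which this sketch does not model.

```python
# Recognition sequence preceding the cut site in "ENLYFQ/*".
TEV_SITE = "ENLYFQ"


def expose_n_end(polypeptide):
    """Cut after the first ENLYFQ motif and return the downstream product.

    If the motif is absent, the polypeptide is returned unchanged.
    """
    index = polypeptide.find(TEV_SITE)
    if index == -1:
        return polypeptide
    return polypeptide[index + len(TEV_SITE):]


# Protecting sequence MENLYFQ, then the target position X, then
# hypothetical downstream residues of the target polypeptide.
tagged = "MENLYFQ" + "X" + "GSGS"
product = expose_n_end(tagged)
print(product)     # XGSGS
print(product[0])  # X
```

After the cut, the residue at the target position is the first residue of the product, which is what makes it visible to the N-end rule machinery.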
One of skill will readily understand based on the present disclosure that the methods described herein are useful in prokaryotic cells and eukaryotic cells.
According to the present disclosure, the N-end target residue is exposed using materials and methods that are or will become apparent to one of skill based on the present disclosure.
An exemplary removable protecting protein domain includes a self-splicing domain, such as an intein, or other cleavable domains such as small ubiquitin modifiers (SUMO proteins). An exemplary removable protecting group may be a protein cleavage sequence along with its cognate partner, such as the TEV cleavage site and TEV protease. In general, any of the strategies used to remove N-terminal affinity tags in protein purification can serve as alternative ways to expose the N-end target residue. An exemplary system to expose the N-end target residue includes a class of enzymes known as methionine aminopeptidases which can remove the first N-terminal residue, such as when the second residue is the amino acid target location which is the desired site of addition of an NSAA. According to one aspect, the amino acid target location may be the N-terminal location or it may be any location between the N-terminal location and the C-terminal location. Accordingly, methods are provided for removing a protecting group and/or all amino acids up to the amino acid target location, thereby rendering the amino acid target location the N-terminal amino acid.
IV. Detectable Moiety
According to one aspect, the target polypeptide includes a detectable moiety attached to the C-end of the target polypeptide. Exemplary detectable moieties are known to those of skill in the art and can be readily identified in the literature based on the present disclosure. According to one aspect, the detectable moiety is a peptide sequence produced by the cell when making the target polypeptide. According to one aspect, the detectable moiety is a peptide sequence produced by the cell when making the target polypeptide, such that the detectable moiety and the target polypeptide are a fusion. According to this aspect, the cell is genetically modified to include a foreign nucleic acid sequence encoding the target polypeptide including a non-standard amino acid substitution at an amino acid target location and a detectable moiety attached to the target polypeptide, for example, at the C-end of the target polypeptide.
According to one aspect, the detectable moiety is foreign to the cell, i.e. it is not endogenous to the cell.
An exemplary detectable moiety is a fluorescent moiety, such as GFP, that can be detected by fluorimetry, for example. An exemplary detectable moiety is a reporter protein. An exemplary detectable moiety includes a protein that confers antibiotic resistance which can be detected in the presence of an antibiotic. An exemplary detectable moiety includes an enzyme that performs a function (such as beta-galactosidase) that can lead to easy colorimetric output.
Aspects of the methods described herein may make use of epitope tags and reporter gene sequences as detectable moieties. Non-limiting examples of epitope tags include histidine (His) tags, V5 tags, FLAG tags, influenza hemagglutinin (HA) tags, Myc tags, VSV-G tags, and thioredoxin (Trx) tags. Examples of reporter genes include, but are not limited to, glutathione-S-transferase (GST), horseradish peroxidase (HRP), chloramphenicol acetyltransferase (CAT), beta-galactosidase, beta-glucuronidase, luciferase, green fluorescent protein (GFP), HcRed, DsRed, cyan fluorescent protein (CFP), yellow fluorescent protein (YFP), and autofluorescent proteins including blue fluorescent protein (BFP).
V. Genetic Modifications
Aspects of the present disclosure include the genetic modification of a cell to include foreign genetic material which can then be expressed by the cell. The cell may be modified to include any other genetic material or elements useful in the expression of a nucleic acid sequence. Foreign genetic elements may be introduced or provided to a cell using methods known to those of skill in the art. For example, the cell may be genetically modified to include a foreign nucleic acid sequence encoding the target polypeptide including a non-standard amino acid substitution at an amino acid target location, a removable protecting group attached to the target polypeptide adjacent to the amino acid target location and a detectable moiety attached to the C-end of the target polypeptide. The nonstandard amino acid may be encoded by a corresponding nonsense or sense codon. The cell may be genomically recoded to recognize an engineered amino-acyl tRNA synthetase corresponding or cognate to a nonstandard amino acid. The cell may be genetically modified to include a foreign nucleic acid sequence encoding an amino-acyl tRNA synthetase and/or a transfer RNA corresponding or cognate to the nonstandard amino acid and wherein the nonstandard amino acid is provided to the cell and the cell expresses the synthetase and the transfer RNA to include the nonstandard amino acid at the amino acid target location. The cell is genetically modified to include a foreign nucleic acid sequence encoding an enzyme for cleaving the removable protecting group under influence of an inducible promoter. The cell is genetically modified to include an inducible promoter influencing the production of an enzyme system for removal of the removable protecting group. The enzyme system or component thereof may be under influence of the inducible promoter. For example, the adapter which helps associate the cleavage enzyme with the removable protecting group may be under influence of an inducible promoter.
In general, nucleic acids may be introduced into a cell using any method known to those skilled in the art for such introduction. Such methods include transfection, transduction, viral transduction, microinjection, lipofection, nucleofection, nanoparticle bombardment, transformation, conjugation and the like. One of skill in the art will readily understand and adapt such methods using readily identifiable literature sources.
Aspects of the methods described herein may make use of vectors. The term "vector" includes a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. Vectors used to deliver the nucleic acids to cells as described herein include vectors known to those of skill in the art and used for such purposes. Certain exemplary vectors may be plasmids, lentiviruses or adeno-associated viruses known to those of skill in the art. Vectors include, but are not limited to, nucleic acid molecules that are single-stranded, double-stranded, or partially double-stranded; nucleic acid molecules that comprise one or more free ends, no free ends (e.g. circular); nucleic acid molecules that comprise DNA, RNA, or both; and other varieties of polynucleotides known in the art. One type of vector is a "plasmid," which refers to a circular double stranded DNA loop into which additional DNA segments can be inserted, such as by standard molecular cloning techniques. Another type of vector is a viral vector, wherein virally-derived DNA or RNA sequences are present in the vector for packaging into a virus (e.g. retroviruses, lentiviruses, replication defective retroviruses, adenoviruses, replication defective adenoviruses, and adeno-associated viruses). Viral vectors also include polynucleotides carried by a virus for transfection into a host cell. Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g. bacterial vectors having a bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome. Moreover, certain vectors are capable of directing the expression of genes to which they are operatively linked. Such vectors are referred to herein as "expression vectors." Common expression vectors of utility in recombinant DNA techniques are often in the form of plasmids. Recombinant expression vectors can comprise a nucleic acid of the invention in a form suitable for expression of the nucleic acid in a host cell, which means that the recombinant expression vectors include one or more regulatory elements, which may be selected on the basis of the host cells to be used for expression, that are operatively linked to the nucleic acid sequence to be expressed.
Within a recombinant expression vector, "operably linked" is intended to mean that the nucleotide sequence of interest is linked to the regulatory element(s) in a manner that allows for expression of the nucleotide sequence (e.g. in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell).
Aspects of the methods described herein may make use of regulatory elements. The term "regulatory element" is intended to include promoters, enhancers, internal ribosomal entry sites (IRES), and other expression control elements (e.g. transcription termination signals, such as polyadenylation signals and poly-U sequences). Such regulatory elements are described, for example, in Goeddel, GENE EXPRESSION TECHNOLOGY: METHODS IN ENZYMOLOGY 185, Academic Press, San Diego, Calif. (1990). Regulatory elements include those that direct constitutive expression of a nucleotide sequence in many types of host cell and those that direct expression of the nucleotide sequence only in certain host cells (e.g., tissue-specific regulatory sequences). Regulatory elements useful in eukaryotic cells include a tissue-specific promoter that may direct expression primarily in a desired tissue of interest, such as muscle, neuron, bone, skin, blood, specific organs (e.g. liver, pancreas), or particular cell types (e.g. lymphocytes). Regulatory elements may also direct expression in a temporal-dependent manner, such as in a cell-cycle dependent or developmental stage-dependent manner, which may or may not also be tissue or cell-type specific. In some embodiments, a vector may comprise one or more pol III promoters (e.g. 1, 2, 3, 4, 5, or more pol III promoters), one or more pol II promoters (e.g. 1, 2, 3, 4, 5, or more pol II promoters), one or more pol I promoters (e.g. 1, 2, 3, 4, 5, or more pol I promoters), or combinations thereof. Examples of pol III promoters include, but are not limited to, U6 and H1 promoters.
Examples of pol II promoters include, but are not limited to, the retroviral Rous sarcoma virus (RSV) LTR promoter (optionally with the RSV enhancer), the cytomegalovirus (CMV) promoter (optionally with the CMV enhancer) [see, e.g., Boshart et al., Cell, 41:521-530 (1985)], the SV40 promoter, the dihydrofolate reductase promoter, the β-actin promoter, the phosphoglycerol kinase (PGK) promoter, and the EF1α promoter and Pol II promoters described herein. Also encompassed by the term "regulatory element" are enhancer elements, such as WPRE; CMV enhancers; the R-U5' segment in LTR of HTLV-I (Mol. Cell. Biol., Vol. 8(1), p. 466-472, 1988); SV40 enhancer; and the intron sequence between exons 2 and 3 of rabbit β-globin (Proc. Natl. Acad. Sci. USA., Vol. 78(3), p. 1527-31, 1981). It will be appreciated by those skilled in the art that the design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression desired, etc. A vector can be introduced into host cells to thereby produce transcripts, proteins, or peptides, including fusion proteins or peptides, encoded by nucleic acids as described herein (e.g., clustered regularly interspersed short palindromic repeats (CRISPR) transcripts, proteins, enzymes, mutant forms thereof, fusion proteins thereof, etc.). Common prokaryotic promoters include IPTG (isopropyl β-D-1-thiogalactopyranoside) inducible, anhydrotetracycline inducible, or arabinose inducible promoters. Such promoters express genes only in the presence of IPTG, anhydrotetracycline, or arabinose in the medium. An exemplary promoter for use in bacteria such as E. coli to express aminoacyl tRNA synthetase is an arabinose inducible promoter. An exemplary promoter for use in bacteria such as E. coli to express a reporter protein is an anhydrotetracycline inducible promoter.
Aspects of the methods described herein may make use of terminator sequences. A terminator sequence includes a section of nucleic acid sequence that marks the end of a gene or operon in genomic DNA during transcription. This sequence mediates transcriptional termination by providing signals in the newly synthesized mRNA that trigger processes which release the mRNA from the transcriptional complex. These processes include the direct interaction of the mRNA secondary structure with the complex and/or the indirect activities of recruited termination factors. Release of the transcriptional complex frees RNA polymerase and related transcriptional machinery to begin transcription of new mRNAs. Terminator sequences include those known in the art and identified and described herein.
VI. Adapter Protein Protease Systems
According to one aspect, the cell includes a protease system for degrading the target polypeptide when the N-end amino acid is a standard amino acid. The protease system may be endogenous or exogenous. The cell may include an adapter or discriminator protein that coordinates with a protease for degrading the target polypeptide when the N-end amino acid is a standard amino acid. The adapter protein may be under influence of an inducible promoter. According to one aspect, the adapter protein is ClpS or a variant or mutant thereof. According to one aspect, adapter proteins may have different levels of selectivity for certain amino acids. According to certain aspects, adapter proteins, such as ClpS, may be altered to improve selectivity, such as between standard amino acids and non-standard amino acids or between a desired NSAA and an undesired NSAA. According to one aspect, the protease system is a ClpS-ClpAP protease system.
According to one aspect, protease systems include ClpS or homologs or mutants thereof, such as ClpS_V65T, ClpS_V43I, or ClpS_L32F. The N-end rule is mediated by homologs of ClpS/ClpAP in bacteria. In eukaryotes, the N-end rule involves more distant homologs of ClpS (UBR1, ubiquitin E3 ligases) and degradation by the proteasome. Accordingly, the present disclosure contemplates use of many of the bacterial ClpS homologs to perform similar functions with slightly different amino acid recognition specificity. The present disclosure also contemplates use of eukaryotic protease systems, such as UBR1 and related variants, to mediate N-end rule recognition with different amino acid recognition specificity in eukaryotes.
VII. Cells
According to certain aspects, cells according to the present disclosure include prokaryotic cells and eukaryotic cells. Exemplary prokaryotic cells include bacteria. Microorganisms which may serve as host cells and which may be genetically modified to produce recombinant microorganisms as described herein may include one or more members of the genera Clostridium, Escherichia, Rhodococcus, Pseudomonas, Bacillus, Lactobacillus, Saccharomyces, and Enterococcus. Particularly suitable microorganisms include bacteria and archaea. Exemplary microorganisms include Escherichia coli, Bacillus subtilis, and Saccharomyces cerevisiae. Exemplary eukaryotic cells include animal cells, such as human cells, plant cells, fungal cells and the like.
In addition to E. coli, other useful bacteria include but are not limited to Bacillus subtilis, Bacillus megaterium, Bifidobacterium bifidum, Caulobacter crescentus, Clostridium difficile, Chlamydia trachomatis, Corynebacterium glutamicum, Lactobacillus acidophilus, Lactococcus lactis, Mycoplasma genitalium, Neisseria gonorrhoeae, Prochlorococcus marinus, Pseudomonas aeruginosa, Pseudomonas putida, Treponema pallidum, Streptomyces coelicolor, Synechococcus elongatus, Vibrio natriegens, and Zymomonas mobilis.
Exemplary genus and species of bacteria cells include Acetobacter aurantius, Acinetobacter bitumen, Actinomyces israelii, Agrobacterium radiobacter, Agrobacterium tumefaciens, Anaplasma, Anaplasma phagocytophilum, Azorhizobium caulinodans, Azotobacter vinelandii, viridans streptococci, Bacillus anthracis, Bacillus brevis, Bacillus cereus, Bacillus fusiformis, Bacillus licheniformis, Bacillus megaterium, Bacillus mycoides, Bacillus stearothermophilus, Bacillus subtilis, Bacteroides, Bacteroides fragilis, Bacteroides gingivalis, Bacteroides melaninogenicus (also referred to as Prevotella melaninogenica), Bartonella, Bartonella henselae, Bartonella quintana, Bordetella, Bordetella bronchiseptica, Bordetella pertussis, Borrelia burgdorferi, Brucella abortus, Brucella melitensis, Brucella suis, Burkholderia, Burkholderia mallei, Burkholderia pseudomallei, Burkholderia cepacia, Calymmatobacterium granulomatis, Campylobacter, Campylobacter coli, Campylobacter fetus, Campylobacter jejuni, Campylobacter pylori, Chlamydia, Chlamydia trachomatis, Chlamydophila, Chlamydophila pneumoniae (also known as Chlamydia pneumoniae), Chlamydophila psittaci (also known as Chlamydia psittaci), Clostridium, Clostridium botulinum, Clostridium difficile, Clostridium perfringens (also known as Clostridium welchii), Clostridium tetani, Corynebacterium, Corynebacterium diphtheriae, Corynebacterium fusiforme, Coxiella burnetii, Ehrlichia chaffeensis, Enterobacter cloacae, Enterococcus, Enterococcus avium, Enterococcus durans, Enterococcus faecalis, Enterococcus faecium, Enterococcus gallinarum, Enterococcus maloratus, Escherichia coli, Francisella tularensis, Fusobacterium nucleatum, Gardnerella vaginalis, Haemophilus, Haemophilus ducreyi, Haemophilus influenzae, Haemophilus parainfluenzae, Haemophilus pertussis, Haemophilus vaginalis, Helicobacter pylori, Klebsiella pneumoniae, Lactobacillus, Lactobacillus acidophilus, Lactobacillus bulgaricus, Lactobacillus casei, Lactococcus lactis, Legionella pneumophila, Listeria monocytogenes, Methanobacterium extroquens, Microbacterium multiforme, Micrococcus luteus, Moraxella catarrhalis, Mycobacterium, Mycobacterium avium, Mycobacterium bovis, Mycobacterium diphtheriae, Mycobacterium intracellulare, Mycobacterium leprae, Mycobacterium lepraemurium, Mycobacterium phlei, Mycobacterium smegmatis, Mycobacterium tuberculosis, Mycoplasma, Mycoplasma fermentans, Mycoplasma genitalium, Mycoplasma hominis, Mycoplasma penetrans, Mycoplasma pneumoniae, Neisseria, Neisseria gonorrhoeae, Neisseria meningitidis, Pasteurella, Pasteurella multocida, Pasteurella tularensis, Peptostreptococcus, Porphyromonas gingivalis, Prevotella melaninogenica (also known as Bacteroides melaninogenicus), Pseudomonas aeruginosa, Rhizobium radiobacter, Rickettsia, Rickettsia prowazekii, Rickettsia psittaci, Rickettsia quintana, Rickettsia rickettsii, Rickettsia trachomae, Rochalimaea, Rochalimaea henselae, Rochalimaea quintana, Rothia dentocariosa, Salmonella, Salmonella enteritidis, Salmonella typhi, Salmonella typhimurium, Serratia marcescens, Shigella dysenteriae, Staphylococcus, Staphylococcus aureus, Staphylococcus epidermidis, Stenotrophomonas maltophilia, Streptococcus, Streptococcus agalactiae, Streptococcus avium, Streptococcus bovis, Streptococcus cricetus, Streptococcus faecium, Streptococcus faecalis, Streptococcus ferus, Streptococcus gallinarum, Streptococcus lactis, Streptococcus mitior, Streptococcus mitis, Streptococcus mutans, Streptococcus oralis, Streptococcus pneumoniae, Streptococcus pyogenes, Streptococcus rattus, Streptococcus salivarius, Streptococcus sanguis, Streptococcus sobrinus, Treponema, Treponema pallidum, Treponema denticola, Vibrio, Vibrio cholerae, Vibrio comma, Vibrio parahaemolyticus, Vibrio vulnificus, Wolbachia, Yersinia, Yersinia enterocolitica, Yersinia pestis, and Yersinia pseudotuberculosis, and other genus and species known to those of skill in the art.
Exemplary genus and species of yeast cells include Saccharomyces, Saccharomyces cerevisiae, Torula, Saccharomyces boulardii, Schizosaccharomyces, Schizosaccharomyces pombe, Candida, Candida glabrata, Candida tropicalis, Yarrowia, Candida parapsilosis, Candida krusei, Saccharomyces pastorianus, Brettanomyces, Brettanomyces bruxellensis, Pichia, Pichia guilliermondii, Cryptococcus, Cryptococcus gattii, Torulaspora, Torulaspora delbrueckii, Zygosaccharomyces, Zygosaccharomyces bailii, Candida lusitaniae, Candida stellata, Geotrichum, Geotrichum candidum, Pichia pastoris, Kluyveromyces, Kluyveromyces marxianus, Candida dubliniensis, Kluyveromyces, Kluyveromyces lactis, Trichosporon, Trichosporon uvarum, Eremothecium, Eremothecium gossypii, Pichia stipitis, Candida milleri, Ogataea, Ogataea polymorpha, Candida oleophila, Zygosaccharomyces rouxii, Candida albicans, Leucosporidium, Leucosporidium frigidum, Candida viswanathii, Candida blankii, Saccharomyces telluris, Saccharomyces florentinus, Sporidiobolus, Sporidiobolus salmonicolor, Dekkera, Dekkera anomala, Lachancea, Lachancea kluyveri, Trichosporon, Trichosporon mycotoxinivorans, Rhodotorula, Rhodotorula rubra, Saccharomyces exiguus, Sporobolomyces koalae, and Trichosporon cutaneum, and other genus and species known to those of skill in the art.
Exemplary genus and species of fungal cells include Sac fungi, Basidiomycota, Zygomycota, Chytridiomycota, Basidiomycetes, Hyphomycetes, Glomeromycota, Microsporidia, Blastocladiomycota, and Neocallimastigomycota, and other genus and species known to those of skill in the art.
Exemplary eukaryotic cells include mammalian cells, plant cells, yeast cells and fungal cells.
VIII. Standard Amino Acid
As used herein, the term "SAA" (standard amino acid) includes one of the L-amino acids that typically occur naturally in proteins on Earth and includes alanine, arginine, asparagine, aspartic acid, cysteine, glutamic acid, glutamine, glycine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, serine, threonine, tyrosine, tryptophan, proline and valine. The standard amino acids that are naturally N-end destabilizing in most bacteria include tyrosine, phenylalanine, tryptophan, leucine, lysine, and arginine. According to one aspect, the amino acid at the amino acid target location is an NSAA that is stabilizing. When the natural analog of the NSAA is destabilizing and is present at the amino acid target location, degradation of the polypeptide occurs. Standard amino acids that are not naturally destabilizing via the N-end rule using natural ClpS can be destabilizing when the ClpS is engineered to recognize such standard amino acids.
The N-end rule in bacteria may also be engineered to recognize isoleucine, valine, aspartate, glutamate, asparagine, and glutamine as destabilizing using methods known to those of skill in the art, which is useful when the desired NSAA is an analog of these amino acids. For example, isoleucine and valine can be converted into N-end destabilizing residues by introducing a ClpS variant (M40A) that recognizes these amino acids as N-terminal destabilizing residues (see Roman-Hernandez G, Grant RA, Sauer RT, & Baker TA (2009) Molecular basis of substrate selection by the N-end rule adaptor protein ClpS. Proceedings of the National Academy of Sciences 106(22):8888-8893 hereby incorporated by reference in its entirety). Aspartate and glutamate may be converted into N-end destabilizing residues by introducing a bacterial aminoacyl-transferase from Vibrio vulnificus (Bpt) that is a homolog of eukaryotic transferases and N-terminally appends a leucine (L) to peptides containing N-terminally exposed aspartate or glutamate (see Graciet E, et al. (2006) Aminoacyl-transferases and the N-end rule pathway of prokaryotic/eukaryotic specificity in a human pathogen. Proceedings of the National Academy of Sciences of the United States of America 103(9):3078-3083 hereby incorporated by reference in its entirety). The ability of Bpt to catalyze this reaction has been demonstrated in E. coli and shows that components of the N-end rule, which includes many more conditionally destabilizing residues in eukaryotes, can be transferred across kingdoms. Asparagine and glutamine can be converted into N-end destabilizing residues by using an N-terminal amidase from S. cerevisiae (NTA1), which converts N-terminal asparagine into aspartate or N-terminal glutamine into glutamate, respectively (see Tasaki T, Sriram SM, Park KS, & Kwon YT (2012) The N-End Rule Pathway. Annual Review of Biochemistry 81(1):261-289 hereby incorporated by reference in its entirety). Indeed, in many eukaryotic cells these amino acids and more are naturally conditionally N-end destabilizing. One of skill will understand that an N-end rule destabilizing pathway may be provided for all 20 standard amino acids as a basis for a system where a desired amino acid from among the 20 standard amino acids is N-end destabilizing in at least one context (see Chen, Shun-Jia, et al. "An N-end rule pathway that recognizes proline and destroys gluconeogenic enzymes." Science 355.6323 (2017): eaal3655 hereby incorporated by reference in its entirety). One of skill in the art can identify the eukaryotic proteins required for conferring expanded N-end destabilization and transfer them to prokaryotes as needed. Similarly, in eukaryotic cells one can constitutively express components required for conferring expanded N-end destabilization such that degradation of proteins containing N-end standard amino acids no longer remains conditional. One of skill will recognize that some amino acids rendered destabilizing may have adverse consequences for cell physiology.
For example, most native proteins begin with methionine and if methionine is made N-end destabilizing then most proteins would degrade. Aspects of converting an N-end stabilizing amino acid to an N-end destabilizing amino acid can be tested in a particular organism.
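The routes described in this section can be summarized as a small lookup, shown here as a minimal Python sketch. The function name, the one-letter residue keys, and the dictionary layout are illustrative assumptions and are not part of the disclosure; only the residue-to-component pairings are taken from the text above.

```python
# Illustrative summary of the N-end destabilization routes named in the text.
# All identifiers here are hypothetical; only the pairings come from the text.

# Residues described as naturally N-end destabilizing in most bacteria.
NATIVE_DESTABILIZING = frozenset("YFWLKR")

# Residues that the text says can be made destabilizing, and by which component.
ENGINEERED_ROUTES = {
    "I": "ClpS variant M40A",
    "V": "ClpS variant M40A",
    "D": "Bpt aminoacyl-transferase (Vibrio vulnificus), appends N-terminal Leu",
    "E": "Bpt aminoacyl-transferase (Vibrio vulnificus), appends N-terminal Leu",
    "N": "NTA1 N-terminal amidase (S. cerevisiae), converts Asn to Asp",
    "Q": "NTA1 N-terminal amidase (S. cerevisiae), converts Gln to Glu",
}


def destabilizing_route(residue: str):
    """Return the route named in the text for a one-letter residue, else None."""
    if residue in NATIVE_DESTABILIZING:
        return "native ClpS recognition"
    return ENGINEERED_ROUTES.get(residue)


if __name__ == "__main__":
    for aa in "LVDQM":
        print(aa, "->", destabilizing_route(aa))
```

Residues with no route listed in this section (for example methionine, which the text cautions against) return `None` in this sketch.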
IX. Non-Standard Amino Acid
As used herein, the term "NSAA" refers to an unmodified amino acid that is not one of the 20 naturally occurring standard L-amino acids. NSAAs also include synthetic amino acids which have been designed to include a non-standard functional group not present in the standard amino acids, as well as naturally occurring amino acids bearing functional groups not present in the set of standard amino acids. Accordingly, a non-standard amino acid may include the structure of a standard amino acid together with a non-standard functional group. A non-standard amino acid may include the basic amino acid portion of a standard amino acid and include a non-standard functional group.
NSAAs also refer to natural amino acids that are not used by all organisms (e.g., L-pyrrolysine (B. Hao et al., A new UAG-encoded residue in the structure of a methanogen methyltransferase. Science. 296:1462) and L-selenocysteine (S. Osawa et al., Recent evidence for evolution of the genetic code. Microbiol Mol. Biol. Rev. 56:229)). NSAAs are also known in the art as unnatural amino acids (UAAs) and non-canonical amino acids (NCAAs).
NSAAs include, but are not limited to, p-Acetylphenylalanine, m-Acetylphenylalanine, O-allyltyrosine, Phenylselenocysteine, p-Propargyloxyphenylalanine, p-Azidophenylalanine, p-Boronophenylalanine, O-methyltyrosine, p-Aminophenylalanine, p-Cyanophenylalanine, m-Cyanophenylalanine, p-Fluorophenylalanine, p-Iodophenylalanine, p-Bromophenylalanine, p-Nitrophenylalanine, L-DOPA, 3-Aminotyrosine, 3-Iodotyrosine, p-Isopropylphenylalanine, 3-(2-Naphthyl)alanine, biphenylalanine, homoglutamine, D-tyrosine, p-Hydroxyphenyllactic acid, 2-Aminocaprylic acid, bipyridylalanine, HQ-alanine, p-Benzoylphenylalanine, o-Nitrobenzylcysteine, o-Nitrobenzylserine, 4,5-Dimethoxy-2-Nitrobenzylserine, o-Nitrobenzyllysine, o-Nitrobenzyltyrosine, 2-Nitrophenylalanine, dansylalanine, p-Carboxymethylphenylalanine, 3-Nitrotyrosine, sulfotyrosine, acetyllysine, methylhistidine, 2-Aminononanoic acid, 2-Aminodecanoic acid, pyrrolysine, Cbz-lysine, Boc-lysine, allyloxycarbonyllysine, arginosuccinic acid, citrulline, cysteine sulfinic acid, 3,4-dihydroxyphenylalanine, homocysteine, homoserine, ornithine, 3-monoiodotyrosine, 3,5-diiodotyrosine, 3,5,5'-triiodothyronine, and 3,3',5,5'-tetraiodothyronine. Modified or unusual amino acids include D-amino acids, hydroxylysine, 4-hydroxyproline, N-Cbz-protected amino acids, 2,4-diaminobutyric acid, homoarginine, norleucine, N-methylaminobutyric acid, naphthylalanine, phenylglycine, -phenylproline, tert-leucine, -aminocyclohexylalanine, N-methyl-norleucine, 3,4-dehydroproline, N,N-dimethylaminoglycine, N-methylaminoglycine, 4-aminopiperidine-4-carboxylic acid, 6-aminocaproic acid, trans-4-(aminomethyl)-cyclohexanecarboxylic acid, 2-, 3-, and 4-(aminomethyl)-benzoic acid, 1-aminocyclopentanecarboxylic acid, 1-aminocyclopropanecarboxylic acid, and 2-benzyl-5-aminopentanoic acid, and the like. NSAAs also include amino acids that are functionalized, e.g., alkyne-functionalized, azide-functionalized, ketone-functionalized, aminooxy-functionalized and the like. For reviews of NSAAs and lists of NSAAs suitable for use in certain embodiments of the subject invention, see Liu and Schultz (2010) Ann. Rev. Biochem. 79:413, and Kim et al. (2013) Curr. Opin. Chem. Biol. 17:412, each of which is incorporated herein by reference in its entirety for all purposes.
In certain aspects, an NSAA of the subject invention has a corresponding aminoacyl tRNA synthetase (aaRS)/tRNA pair. In certain aspects, the aminoacyl tRNA synthetase/tRNA pair is orthogonal to those in a genetically modified organism such as, e.g., a prokaryotic cell, a bacterium (e.g., E. coli), a eukaryotic cell, a yeast, a plant cell, an insect cell, a mammalian cell, a virus, etc. In certain aspects, an NSAA of the subject invention is non-toxic when expressed in a genetically modified organism such as, e.g., a prokaryotic cell, a bacterium (e.g., E. coli), a eukaryotic cell, a yeast, a plant cell, an insect cell, a mammalian cell, a virus, etc. In certain aspects, an NSAA of the subject invention is not or does not resemble a natural product present in a cell or organism. In certain aspects, an NSAA of the subject invention is hydrophobic, hydrophilic, polar, positively charged, or negatively charged. In other aspects, an NSAA of the subject invention is commercially available (such as, e.g., L-4,4'-biphenylalanine (bipA) and L-2-Naphthylalanine (napA)) or synthesized according to published protocols.
EXAMPLE I
Post-translational proofreading (PTP) for selective BipA OTS evolution
A cell is genetically modified for the screening. The cell is provided with a nucleic acid sequence encoding a ubiquitin fused to the N-terminus of the protein wherein the N-terminus of the protein is an amino acid target location intended to have a nonstandard amino acid. The nonstandard amino acid may be encoded by a nonsense or sense codon. The cell is provided with a ubiquitin cleavase. The cell may include an endogenous protease system, such as a ClpS-ClpAP system. The cell is provided with a non-standard amino acid. The cell expresses the fusion protein having either a standard or a non-standard amino acid incorporated at the amino acid target location. The ubiquitin cleavase cleaves the ubiquitin to produce a protein having either the standard or non-standard intervening amino acid at its N-terminus. If a standard amino acid is present at the N-terminus, the ClpS recognizes the standard amino acid at the N-terminus and targets the protein having the standard amino acid at its N-terminus to ClpP for degradation. If a nonstandard amino acid is present at the N-terminus, the ClpS does not recognize the nonstandard amino acid and the protein is not targeted for degradation. A residue is destabilizing if it is recognized by the ClpS adaptor protein, which is the discriminator of the N-end rule in E. coli such as is described in Erbse A, et al. (2006) ClpS is an essential component of the N-end rule pathway in Escherichia coli. Nature 439(7077):753-756; Wang KH, Oakes ESC, Sauer RT, & Baker TA (2008) Tuning the Strength of a Bacterial N-end Rule Degradation Signal. Journal of Biological Chemistry 283(36):24600-24607; Schmidt R, Zahn R, Bukau B, & Mogk A (2009) ClpS is the recognition component for Escherichia coli substrates of the N-end rule degradation pathway. Molecular Microbiology 72(2):506-517; Roman-Hernandez G, Grant RA, Sauer RT, & Baker TA (2009) Molecular basis of substrate selection by the N-end rule adaptor protein ClpS. Proceedings of the National Academy of Sciences 106(22):8888-8893; Schuenemann VJ, et al. (2009) Structural basis of N-end rule substrate recognition in Escherichia coli by the ClpAP adaptor protein ClpS. EMBO reports 10(5):508-514; Roman-Hernandez G, Hou Jennifer Y, Grant Robert A, Sauer Robert T, & Baker Tania A (2011) The ClpS Adaptor Mediates Staged Delivery of N-End Rule Substrates to the AAA+ ClpAP Protease. Molecular Cell 43(2):217-228; and Hou JY, Sauer RT, & Baker TA (2008) Distinct structural elements of the adaptor ClpS are required for regulating degradation by ClpAP. Nat Struct Mol Biol 15(3):288-294, each of which is hereby incorporated by reference in its entirety.
The disclosure provides a method of screening for an amino acyl tRNA synthetase variant that preferentially selects a non-standard amino acid against its standard amino acid counterpart or an undesired non-standard amino acid for incorporation into a polypeptide in a cell. In one embodiment, the cell is provided with an amino acyl tRNA synthetase variant. In another embodiment, the cell is provided with a nucleic acid sequence encoding a ubiquitin fused to the N-terminus of the polypeptide wherein the N-terminus of the polypeptide is an amino acid target location intended to have a nonstandard amino acid, and wherein GFP is fused to the C-end of the polypeptide (Ub-UAG-sfGFP). The nonstandard amino acid may be encoded by a nonsense or sense codon. The cell is provided with a ubiquitin cleavase, such as Ubp1. The cell may include an endogenous protease system, such as a ClpS-ClpAP system. In certain embodiments, the Ub-UAG-sfGFP construct is integrated into the cell's genome (C321.ΔClpS.Ub-UAG-sfGFP). In an exemplary embodiment, the UBP1-clpSV631 expression cassette is integrated into C321.ΔClpS.Ub-UAG-sfGFP (resulting in strain C321.Nend). The cell is provided with a non-standard amino acid. The cell expresses the fusion protein having either a standard or a non-standard amino acid incorporated at the amino acid target location.
The ubiquitin cleavase cleaves the ubiquitin to produce a protein having either the standard or non-standard intervening amino acid at its N-terminus. If a standard amino acid is present at the N-terminus, the ClpS recognizes the standard amino acid at the N-terminus and targets the protein having the standard amino acid at its N-terminus to ClpP for degradation, including the GFP portion. If a nonstandard amino acid is present at the N-terminus, the ClpS does not recognize the nonstandard amino acid and the protein is not targeted for degradation. The GFP is detected and is indicative of the presence of a synthetase variant that preferentially selects the non-standard amino acid against its standard amino acid counterpart for incorporation into the protein.
According to another aspect, the strength of the signal detected from the GFP is indicative of the amount of protein produced that included the nonstandard amino acid. In this manner, methods are provided for screening and evolving an amino acyl tRNA synthetase variant that preferentially selects a non-standard amino acid against its standard amino acid counterpart for incorporation into a protein in a cell.
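The readout logic described above can be illustrated with a minimal Python sketch. It is a deliberately simplified, all-or-nothing model offered under stated assumptions: the function names, the use of one-letter codes for standard residues, the label "BipA" for the NSAA, and the example counts are all hypothetical, and real degradation is not binary.

```python
from typing import Mapping

# Residues listed in the disclosure as naturally N-end destabilizing in most
# bacteria (one-letter codes: Tyr, Phe, Trp, Leu, Lys, Arg).
RECOGNIZED_BY_CLPS = frozenset("YFWLKR")


def reporter_survives(n_terminal: str, recognized=RECOGNIZED_BY_CLPS) -> bool:
    """Simplified rule: the reporter is degraded only if ClpS recognizes the
    residue exposed at the N-terminus after ubiquitin cleavage."""
    return n_terminal not in recognized


def gfp_retained_fraction(counts: Mapping[str, int],
                          recognized=RECOGNIZED_BY_CLPS) -> float:
    """Fraction of reporter molecules that escape degradation, used here as a
    stand-in for relative GFP signal."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reporter molecules counted")
    kept = sum(n for residue, n in counts.items()
               if reporter_survives(residue, recognized))
    return kept / total


if __name__ == "__main__":
    # Hypothetical example: a synthetase that charges BipA 90% of the time
    # and misincorporates tyrosine 10% of the time.
    print(gfp_retained_fraction({"BipA": 90, "Y": 10}))
```

In this sketch a more selective synthetase variant yields a larger retained fraction, which mirrors the statement that GFP signal strength indicates how much protein carried the nonstandard amino acid.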
The ability of PTP to discriminate incorporation of intended NSAA from related SAAs is especially useful for high-throughput screening of OTS libraries. To demonstrate this for proof-of-concept, the UBP1-clpS^1 expression cassette was genomically integrated into C321.ΔClpS.Ub-UAG-sfGFP (resulting in strain C321.Nend). This strain was then used to improve the selectivity of the "wild-type" (WT) BipA OTS. Previous efforts to engineer MjTyrRS variants like BipARS focused on site-directed mutagenesis on positions near the amino acid binding pocket (See, L. Wang, A. Brock, B. Herberich, P. G. Schultz, Expanding the genetic code of Escherichia coli. Science 292, 498-500 (2001); T. S. Young, I. Ahmad, J. A. Yin, P. G. Schultz, An Enhanced System for Unnatural Amino Acid Mutagenesis in E. coli. J. Mol. Biol. 395, 361-374 (2010)). To generate a novel BipARS library, error-prone PCR was used to introduce 2-4 mutations throughout the bipARS gene. After assembly, these libraries were transformed into C321.Nend and screened with three rounds of FACS sorting: (i) positive sort for GFP+ cells in BipA+; (ii) negative sort for GFP- cells in BipA-; (iii) final positive sort for GFP+ cells in BipA+ (Fig. 1A). To decrease promiscuity against other NSAAs, negative screening stringency was altered by varying addition of undesired NSAAs (as many as pAcF, pAzF, tBtylY, NapA, and pBnzylF), which changed the profile of isolated variants (Fig. 2). Upon characterizing the 11 most enriched variants after miniprep and transformation into C321.Ub-UAG-sfGFP (no PTP), it was observed that variants isolated from lower stringency negative sorts exhibited greater activity on BipA and lower activity on SAAs compared to the WT OTS, as well as varying degrees of activity on undesired NSAAs (Fig. 1B and Table 1, Variants 1-6). Supplementation with undesired NSAAs enriched for mutants with even greater selectivity against SAAs and undesired NSAAs (Variants 4, 9-11) but also gave rise to cheaters (Variant 8), suggesting that these conditions may be nearly too harsh. One mutant, Variant 10, exhibited high activity on BipA and no observable activity on any other NSAAs except tBtylY, which contains the inert tert-Butyl protecting group (Fig. 1B). SDS-PAGE of reporter protein resulting from the Variant 10 OTS after expression and affinity purification showed no observable BipA- protein production (Fig. 3A). Furthermore, mass spectrometry confirmed site-specific BipA+ BipA incorporation (Figs. 3B-3D).
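The three-round sorting scheme (positive, negative, positive) can be sketched as successive filters over a library. The following Python sketch is illustrative only: the `Variant` fields, the single fixed gate, the normalized GFP values, and the variant names are hypothetical, and a deterministic threshold stands in for a real FACS gate applied to noisy single-cell fluorescence.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Variant:
    name: str
    gfp_with_bipa: float     # hypothetical normalized GFP in BipA+ medium
    gfp_without_bipa: float  # hypothetical normalized GFP in BipA- medium
                             # (optionally supplemented with undesired NSAAs)


def positive_sort(pool: List[Variant], gate: float) -> List[Variant]:
    """Keep GFP+ cells when BipA is supplied."""
    return [v for v in pool if v.gfp_with_bipa >= gate]


def negative_sort(pool: List[Variant], gate: float) -> List[Variant]:
    """Keep GFP- cells when BipA is withheld."""
    return [v for v in pool if v.gfp_without_bipa < gate]


def three_round_screen(library: List[Variant], gate: float = 0.5) -> List[Variant]:
    pool = positive_sort(library, gate)   # round (i)
    pool = negative_sort(pool, gate)      # round (ii)
    # Round (iii) is a no-op in this deterministic model; in a real sort it
    # re-enriches true positives from a noisy population.
    return positive_sort(pool, gate)


if __name__ == "__main__":
    library = [
        Variant("selective", 0.9, 0.1),
        Variant("promiscuous", 0.9, 0.8),
        Variant("inactive", 0.1, 0.05),
    ]
    print([v.name for v in three_round_screen(library)])
```

Raising negative-sort stringency, for example by adding undesired NSAAs to the BipA- medium, corresponds in this model to measuring `gfp_without_bipa` under harsher conditions, so fewer variants pass round (ii).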
During characterization of BipA OTS variants we discovered spontaneous tRNA mutations present in our most apparently selective variants, such as 4, 9, and 10. Mutations were present at C29A for Variant 4, C67A for Variant 9, and G51U for Variant 10 (Fig. 5A). Notably, when these tRNA mutations were reverted, each corresponding BipA OTS became more promiscuous (Fig. 5B), suggesting that observed tRNA mutations increase selectivity. The G51 position (G50 in E. coli nomenclature) mutated in tRNA Variant 10 is the most significant base pair in determining acylated tRNA binding affinity to Elongation Factor Tu (EF-Tu), which influences incorporation selectivity downstream of the AARS (See, F. J. LaRiviere, A. D. Wolfson, O. C. Uhlenbeck, Uniform Binding of Aminoacyl-tRNAs to Elongation Factor Tu by Thermodynamic Compensation. Science 294 (2001); J. M. Schrader, S. J. Chapman, O. C. Uhlenbeck, Understanding the Sequence Specificity of tRNA Binding to Elongation Factor Tu using tRNA Mutagenesis. J. Mol. Biol. 386, 1255-1264 (2009)). Large hydrophobic NSAAs may have stronger interactions than Y with EF-Tu (See, T. Dale, L. E. Sanderson, O. C. Uhlenbeck, The Affinity of Elongation Factor Tu for an Aminoacyl-tRNA Is Modulated by the Esterified Amino Acid (2004), doi:10.1021/BI036290O), and the C-U mismatch at this position in the tRNA may weaken the EF-Tu interaction. Improved selectivity of OTS Variant 10 may therefore result from a weaker tRNA-EF-Tu interaction that compensates for a stronger amino acid interaction.
To more rigorously assess OTS selectivity, we purified AARS and tRNA for the WT, Variant 9, and Variant 10 OTSs. The observed in vitro substrate specificity as determined by tRNA aminoacylation is in excellent agreement with our in vivo assays (Fig. 1C, Figs. 5C-D).
To demonstrate the utility of a more selective OTS, we substituted the WT BipA OTS construct previously used in three biocontained strains that exhibit observable escape frequencies with new plasmids containing either WT or Variant 10 OTS. These three biocontained strains harbor a computationally predicted redesign to the following essential genes to make their stability dependent on biphenylalanine: adk (adk.d6), tyrS (tyrS.d8), or adk and tyrS (adk.d6/tyrS.d8) (See, D. J. Mandell et al., Biocontainment of genetically modified organisms by synthetic protein design. Nature. 518, 55-60 (2015)). In all three strains, we observed lower escape frequencies on non-permissive media at all three measured days (Figs. 1D-F, Figs. 4A-B). The difference in escape frequency was most apparent for the adk.d6/tyrS.d8 strain, which exhibited a 7-day escape frequency of 7.36 × 10⁻⁹, which is by more than two orders of magnitude the lowest ever observed for a C321.ΔA-derived strain containing only two UAG codons in essential genes and no other biocontainment-related genomic alterations outside of the two essential genes. Furthermore, the fitness of all three strains improved with the Variant 10 OTS, with doubling time decreasing by a factor of as much as 0.54 (Fig. 1G). Finally, Variant 10 also delayed onset of growth of adk.d6/tyrS.d8 on non-cognate NSAAs (Table 2). We expect these benefits to carry over to all strains which employ Variant 10 over WT OTS.
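For readers unfamiliar with how an escape frequency and an "orders of magnitude" comparison are computed, the following Python sketch shows the arithmetic. It assumes the common convention that escape frequency is the number of colonies arising on non-permissive medium divided by the number of viable cells plated; the disclosure does not state its formula here, and the colony counts and the reference frequency below are hypothetical numbers chosen only for illustration.

```python
import math


def escape_frequency(escapee_colonies: float, viable_cells_plated: float) -> float:
    """Assumed convention: escapees per viable cell plated on non-permissive medium."""
    if viable_cells_plated <= 0:
        raise ValueError("viable_cells_plated must be positive")
    return escapee_colonies / viable_cells_plated


def orders_of_magnitude_lower(reference: float, improved: float) -> float:
    """log10 fold-change between a reference frequency and a lower one."""
    if reference <= 0 or improved <= 0:
        raise ValueError("frequencies must be positive")
    return math.log10(reference / improved)


if __name__ == "__main__":
    # Hypothetical plate counts: 3 colonies from 4e8 plated cells.
    print(escape_frequency(3, 4e8))

    # 7.36e-9 is the 7-day value reported in the text; the reference of
    # 1e-6 is a hypothetical comparator, not a value from the disclosure.
    print(orders_of_magnitude_lower(1e-6, 7.36e-9))
```

A result above 2 from `orders_of_magnitude_lower` corresponds to a difference of more than two orders of magnitude between the two frequencies being compared.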
In addition to providing a new paradigm for OTS evaluation and evolution, PTP can be transformative for applications in which amino acid positions are in competitive states, such as screening of natural synthetases for NSAA acceptance, sense codon reassignment, and post-translational modifications. PTP may also find use in translational regulation and as an orthogonal biocontainment strategy. Given that all 20 SAAs are known to be N-end destabilizing under certain conditions (See, S.-J. Chen, X. Wu, B. Wadas, J.-H. Oh, A. Varshavsky, An N-end rule pathway that recognizes proline and destroys gluconeogenic enzymes. Science 355 (2017)), conditionally expressed components could be transferred across organisms to dramatically alter the set of N-end destabilizing SAAs for a particular application.
Evolution methods for improving selectivity
Improving selectivity or activity need not be conflicting goals, but different reporters and screening schemes may be better suited for one or the other. For example, a recent evolution method (See, J. W. Tobias, T. E. Shrader, G. Rocap, A. Varshavsky, The N-end rule in bacteria. Science 254, 1374-1377 (1991)) used a reporter containing many UAG sites and a genome-integrated OTS system. The resulting higher ratio of UAG sites to OTS expression compared to this study can produce OTSs capable of expressing protein containing as many as 30 UAGs. However, we tested this evolved OTS (pAcFRS.2.t1) on our genome-integrated Ub-X-GFP reporter and observed remarkably high promiscuity (Fig. 6). In fact, the promiscuity increased substantially from the parental construct, which offers insight into the potential tradeoff between selectivity/activity and the importance of methods capable of achieving either aim. Except for Variant 8, the variants described in this disclosure exhibit greater than parental selectivity for BipA compared to other structurally similar NSAAs or SAAs. Thus, we envision that these BipARS/tRNA variants will be useful for applications where selectivity is important. Biocontainment is an exceptionally relevant use case given that promiscuous activity on amino acid substrates besides BipA can lead to growth in contexts that are intended to be "non-permissive" (i.e., environments where BipA is not present). Examples where biocontainment is important include safe expression of toxic biological agents, safeguards for accidental release of multi-virus-resistant organisms, controlled environmental remediation, and engineered probiotics to prevent undesired proliferation in the gut or in the environment upon excretion. In addition, BipARS/tRNA variants with greater selectivity can be used for tight translational control of protein expression, and they can also be more effectively used in conjunction with other NSAAs for applications that would benefit from use of multiple NSAAs simultaneously.
EXAMPLE II
Materials and Methods
Strains and strain engineering
E. coli strain C321.ΔA (CP006698.1), which was previously engineered to be devoid of UAG codons and RF1, was the starting strain used for this study (See, K. H. Wang, R. T. Sauer, T. A. Baker, ClpS modulates but is not essential for bacterial N-end rule degradation. Genes Dev. 21, 403-8 (2007); K. H. Wang, E. S. C. Oakes, R. T. Sauer, T. A. Baker, Tuning the Strength of a Bacterial N-end Rule Degradation Signal. J. Biol. Chem. 283, 24600-24607 (2008); M. J. Lajoie et al., Genomically Recoded Organisms Expand Biological Functions. Science 342, 357-360 (2013); M. Amiram et al., Evolution of translation machinery in recoded bacteria enables multi-site incorporation of nonstandard amino acids. Nat. Biotech. 33, 1272-1279 (2015); S. Million-Weaver, D. L. Alexander, J. M. Allen, M. Camps, in Methods in molecular biology (Clifton, N.J.) (2012; http://www.ncbi.nlm.nih.gov/pubmed/22144351), vol. 834, pp. 33-48; J. W. Tobias, A. Varshavsky, Cloning and functional analysis of the ubiquitin-specific protease gene UBP1 of Saccharomyces cerevisiae. J. Biol. Chem. 266, 12021-8 (1991); A. Wojtowicz et al., Expression of yeast deubiquitination enzyme UBP1 analogues in E. coli. Microb. Cell Fact. 4, 1-12 (2005); G. Roman-Hernandez, J. Y. Hou, R. A. Grant, R. T. Sauer, T. A. Baker, The ClpS Adaptor Mediates Staged Delivery of N-End Rule Substrates to the AAA+ ClpAP Protease. Mol. Cell 43, 217-228 (2011)). The TET promoter and Ub-UAG-sfGFP expression cassette was genomically integrated using λ Red recombineering (See, K. A. Datsenko, B. L. Wanner, One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products. Proc. Natl. Acad. Sci. U. S. A. 97, 6640-6645 (2000); D. Yu et al., An efficient recombination system for chromosome engineering in Escherichia coli. Proc. Natl. Acad. Sci. U. S. A. 97, 5978-83 (2000)) and tolC negative selection using Colicin E1 (See, J. A. DeVito, Recombineering with tolC as a Selectable/Counter-selectable Marker: remodeling the rRNA Operons of Escherichia coli. Nucleic Acids Res. 36, e4 (2008); C. J. Gregg et al., Rational optimization of tolC as a powerful dual selectable marker for genome engineering. Nucleic Acids Res. 42, 4779-4790 (2014)). This resulted in strain C321.Ub-UAG-sfGFP. Please see Table 3 for sequences of key constructs such as the reporter construct. Multiplex automatable genome engineering (MAGE) (See, H. H. Wang et al., Programming cells by multiplex genome engineering and accelerated evolution. Nature 460, 894-898 (2009)) was used to inactivate the endogenous mutS and clpS genes when needed and to add or remove UAG codons in the integrated reporter.
For MAGE, saturated overnight cultures were diluted 100-fold into 3 mL LBL containing appropriate antibiotics and grown at 34 °C until mid-log. The integrated Lambda Red cassette in C321.ΔA-derived strains was induced in a shaking water bath (42 °C, 300 rpm, 15 minutes), followed by cooling culture tubes on ice for at least two minutes. These cells were made electrocompetent at 4 °C by pelleting 1 mL of culture (16,000 rcf, 20 seconds) and washing twice with 1 mL ice-cold deionized water (dH2O). Electrocompetent pellets were resuspended in 50 μL of dH2O containing the desired DNA. For MAGE oligonucleotides, 5 μM of each oligonucleotide was used. Please see Table 4 for a list of all oligonucleotides used in this study. For integration of dsDNA cassettes, 50 ng was used. Allele-specific colony PCR (ASC-PCR) was used to identify desired colonies resulting from MAGE as previously described (See, F. J. Isaacs et al., Precise manipulation of chromosomes in vivo enables genome-wide codon replacement. Science 333, 348-353 (2011)). Colony PCR was performed using Kapa 2G Fast HotStart ReadyMix according to manufacturer protocols, and Sanger sequencing was performed by Genewiz to verify strain engineering. The strains C321.Ub-UAG-sfGFP, C321.Ub-UAG-sfGFP_UAG151, and C321.ΔClpS.Ub-UAG-sfGFP are available from Addgene. Ub-X-GFP reporters containing codons encoding SAAs in place of UAG were generated from Ub-UAG-GFP by PCR and Gibson assembly, and they were subsequently cloned into the pOSIP-TT vector for Clonetegration (one-step cloning and chromosomal integration) into NEB5α strains (See, F. St-Pierre et al., One-step cloning and chromosomal integration of DNA. ACS Synth. Biol. 2, 537-541 (2013)). The UBP1/clpS_V65I operon was also placed under weak constitutive expression and integrated into C321.ΔClpS.Ub-UAG-sfGFP using Clonetegration. This strain (C321.Nend) was used as the host for FACS experiments.
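The allele-specific screening step can be illustrated with a short sketch. It is illustrative only: it treats a primer as extendable when it matches the template exactly through its 3' terminal base, which simplifies real polymerase behavior. The two forward primers are asPCR-*-F and asPCR-S-F from Table 4, and the UAG template fragment is taken from the reporter sequence in Table 3. The serine (TCG) template is an assumption, namely the same fragment carrying the single A-to-C change encoded by the MAGE_*toS oligonucleotide. The function name `primer_extends` is hypothetical.

```python
# Forward primers from Table 4; they differ only in the 3' terminal base.
AS_PCR_STOP_F = "CGTCTGCGTGGAGGATA"  # asPCR-*-F, matches the UAG (TAG) allele
AS_PCR_SER_F = "CGTCTGCGTGGAGGATC"   # asPCR-S-F, matches the Ser (TCG) allele

# Reporter fragment around the first UAG codon (from Table 3).
UAG_TEMPLATE = "GGTTCTGCGTCTGCGTGGAGGATAGTTGTTTGTGCAGGAGCTT"
# Assumed post-MAGE allele: TAG converted to TCG (not listed verbatim in the source).
SER_TEMPLATE = "GGTTCTGCGTCTGCGTGGAGGATCGTTGTTTGTGCAGGAGCTT"


def primer_extends(primer: str, template: str) -> bool:
    """Return True if the primer matches the template through its 3' base.

    The primer body (all bases except the last) is located in the template,
    then the template base at the primer's 3' position is compared.
    """
    primer, template = primer.upper(), template.upper()
    body, last_base = primer[:-1], primer[-1]
    site = template.find(body)
    if site == -1:
        return False  # primer body does not anneal anywhere
    three_prime_pos = site + len(body)
    return three_prime_pos < len(template) and template[three_prime_pos] == last_base


if __name__ == "__main__":
    for name, primer in (("asPCR-*-F", AS_PCR_STOP_F), ("asPCR-S-F", AS_PCR_SER_F)):
        for allele, template in (("UAG", UAG_TEMPLATE), ("Ser", SER_TEMPLATE)):
            print(name, allele, primer_extends(primer, template))
```

In this simplified model each primer gives a product only on its own allele, which is the property a two-reaction colony screen relies on.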
Table 1 . Sequences of evolved BipA OTS variants.
Parental ("WT") sequences shown below
The amino acid sequence of the WT BipARS:
MDEFEMIKR TSEnSEEELREVLKKDEKSAHIGFEPSGKIHLGHYLQIKKMIDLQNAG
FDIIfflLADLHAYLNQKGELDEIRXIGDYN K EAilGLKAKY\rYGSEWMLDKDYT LNVYRLALKTTLKRARRSMELIAREDENPKVAEVIYPIMQVNGIHYKGVDVAVGGM EQRKfflMLAREIJ.PKKVVT.fflNPW GLDGEG MSSSKGNFIA\T)DSPEEIRA IK A YCPAGV ΈG PIMEIA YFLEYPLTIKRPEKFGGDLTVNSYΈELESLFKNKELHPMDL
K'NAVAEELIKILEPIRKRL (SEQ ID NO: 1)
The nucleotide sequence encoding the parental BipARS:
alggacgaatlcgaaatgatcaaacg aaca ae^
ctacctgcagaicaaaaaaaigatcgaectgcagaacgcgggtticgacatcai^
acaaaaaagttttcgaagcgatgggtctgaaagcgaaatacgtttacggtra
atggaactgategcgcgtgaagacgaaaacccgaaagttgcggaagttatctac^
igctggcgcgtgaactgctgccgaaaaaagttgUtgcatccacaa^
gtgcgaaaatcaaaaaagcgiactgcccggcgggtgttgttgaaggtaa∞
ttaactcttacgaagaactggaaicicigttcaaaaacaaagaactgcacccgaiggacctgaaaaacgcggttgcggaagaacigatcaaaatcctggaacc (SEQ ID
NO: 6)
The nucleotide sequence of the parental tRNA_CUA^opt:
ccggcggJagttcagcagggcagaacggcggactcta;¾a1ccgcatggcaggggttcaaaicccctocgccggacca (SEQ ID NO : 2)
Table 2. Growth of biocontained adk.d6/tyrS.d8 strain on 100 μΜ non-cognate NSAAs as
DNO: Did not observe within a 48 hour incubation period
Table 3. Sequences of key constructs
Construct Name Sequence
Ubiquitin-*-LFVQEL-sfGFP-His6x ("LFVQEL" and "His6x" disclosed as SEQ ID NOS 7 and 8, respectively) ATGCAGATTTTTGTGAAGACTTTAACAGGTAAGACGATTACCCT
GGAGGTGGAGTCCTCGGACACCATCGATAATGTAAAATCAAAA
ATCCAAGATAAGGAAGGAATCCCTCCAGACCAGCAACGTCTGA
TTTTCGCAGGTAAACAACTGGAGGATGGTCGCACGCTTTCGGAC
TACAACATCCAGAAAGAATCTACCCTTCATTTGGTTCTGCGTCTG
CGTGGAGGATAGTTGTTTGTGCAGGAGCTTgcatccaagggcgaggagctct
ttactggcgtagtaccaattctcgtagagctcgatggcgatgtaaatggccataagttttccgtacgcggcga
gggcgagggcgatgcaactaacggcaagctcactctcaagtttatttgtactactggcaagctcccagtac
catggccaactctcgtaactactctgacctatggcgtacaatgtttttcccgctatccagatcacatgaagcaa
catgatttttttaagtccgcaatgccagagggctatgtacaagagcgcactattagctttaaggatgatggca
cctataagactcgcgcagaggtaaagtttgagggcgatactctcgtaaatcgcattgagctcaagggcattg
attttaaggaggatggcaatattctcggccataagctggagtataatttcaattcccataatgtatacattaccg
cagataagcaaaagaatggcattaaggcgaattttaagattcgccataatgtggaggatggctccgtacaa
ctcgcagatcattatcaacaaaatactccaattggcgatggcccagtactcctcccagataatcattatctctc
cactcaatccgtgctctccaaagatccaaatgagaagcgcgatcacatggtactcctggagtttgtaactgc
agcaggcattactcatggcatggatgagctctataagctcgagcaccaccaccaccaccactaa (SEQ ID NO: 9)
ClpS2_At gBlock ATGTCTGATAGTCCTGTTGACTTAAAACCCAAGCCTAAAGTCAA
GCCCAAATTAGAACGCCCAAAACTTTACAAAGTCATGTTATTGA ATGATGATTATACACCACGCGAATTTGTGACGGTAGTCCTTAAA GCGGTGTTTCGTATGTCAGAGGACACTGGTCGCCGTGTAATGAT GACAGCACATCGTTTTGGTTCGGCGGTGGTGGTCGTTTGTGAAC
GTGACATTGCAGAGACGAAAGCCAAGGAGGCGACCGACTTGGG GAAGGAAGCAGGTTTTCCTTTGATGTTCACGACTGAGCCCGAGG AGTAA (SEQ ID NO: 10)
pAzFRS.1.t1 gBlock GTTATGcactacGATggtgttgacgttTACgttggtggtatggaacagcgtaaaatccacatgct
ggcgcgtgaactgctgccgaaaaaagttgtttgcatccacaacccggttctgaccggtctggacggtgaag
gtaaaatgtcttcttctaaaggtaacttcatcgcggttgacgactctccggaagaaatccgtgcgaaaatcaa
aaaagcgtactgcccggcgggtgttgttgaaggtaacccgatcatggaaatcgcgaaatacttcctggaat
acccgctgaccatcaaaGGT (SEQ ID NO: 11)
ScUBP1trunc, or UBP1 ATGGGGAGTGGGTCTTTCATTGCTGGGCTTGTCAACGATGGTAA
TACGTGTTTTATGAACTCGGTTCTTCAGTCCCTTGCTAGTAGCCG
TGAACTTATGGAGTTTTTGGATAATAATGTAATCCGTACATATG
AAGAAATTGAACAGAACGAGCACAATGAGGAAGGTAATGGCCA
AGAGAGCGCACAAGATGAGGCAACTCAC AAAAAAAACACTCGC
AAGGGAGGTAAGGTCTATGGGAAGCATAAAAAGAAATTAAACC
GCAAATCTTCTAGCAAGGAAGACGAAGAAAAGTCGCAAGAACC
AGACATTACGTTTTCGGTGGCGTTGCGTGATCTGCTGAGCGCAT
TAAATGCTAAGTATTATCGCGACAAACCCTACTTTAAGACTAAC
TCTTTATTAAAAGCGATGAGCAAGTCCCCGCGCAAAAATATCTT
GCTTGGGTACGATCAAGAAGACGCTCAGGAATTTTTTCAAAACA
TTCTTGCGGAGTTAGAATCTAATGTCAAGTCGTTAAACACAGAA
AAGCTTGATACTACACCX JTAGCCAAGTCCGAACTTCCAGACGA
TGCTCTGGTTGGCCAATTAAACCTTGGTGAGGTAGGCACCGTGT
ACATTCCCACAGAACAAATTGACCCCAATTCGATTTTACATGAC
AAATCGATTCAAAACTTTACCCCCTTTAAACTGATGACCCCGTT
GGATGGGATCACGGCTGAGCGCATCGGCTGCCTGCAATGCGGA
GAGAACGGGGGAATTCGCTACAGTGTTTTCAGCGGATTAAGTTT
GAACCTGCCGAATGAAAATATTGGAAGCACTCTTAAACTGTCCC
AGTTACTGTCCGATTGGTCGAAACCCGAGATTATCGAGGGTGTT
GAATGCAACCGTTGCGCTTTAACAGCTGCGCACTCACACTTGTT
TGGCCAATTAAAGGAGTTTGAGAAGAAACCTGAAGGCTCGATTC
CCGAAAAACTTATTAATGCCGTAAAGGACCGCGTGCACCAGATC
GAAGAGGTCTTGGCAAAGCCGGTTATCGACGATGAAGATTATA
AAAAATTGCATACTGCGAATATGGTCCGCAAGTGTTCAAAAAGT
AAACAAATTCTTATCTCTCGTCCACCACCTTTGTTGTCTATTCAT
ATCAACCGCTCTGTTTTCGACCCGCGCACCTACATGATTCGCAA
GAACAACTCCAAGGTTTTGTTCAAGTCACGCTTGAACCTGGCAC
CCTGGTGCTGTGATATCAACGAAATCAATCTTGACGCACGCCTT
CCGATGTCGAAGAAGGAAAAAGCAGCTCAACAAGATTCTTCTG
AAGACGAGAACATTGGCGGAGAGTACTATACTAAATTGCATGA
ACGTTTTGAGCAGGAGTTTGAAGATTCTGAAGAAGAGAAGGAA
TACGATGATGCAGAGGGTAATTATGCATCGCATTATAACCATAC
CAAGGACATCTCCAACTACGATCCATTGAATGGAGAAGTCGACG
GTGTGACTTCCGATGATGAGGATGAATACATTGAAGAGACAGA
CGCGTTGGGGAATACCATCAAAAAACGTATTATTGAACACTCCG
ACGTGGAGAACGAAAACGTGAAGGATAATGAAGAACTTCAGGA
GATCGATAACGTTAGCTTGGATGAGCCAAAAATTAATGTCGAGG
ACCAGCTTGAAACGAGTTCTGATGAGGAAGACGTTATTCCTGCT
CCACCCATCAACTACGCTCGCAGCTTTAGTACGGTCCCAGCGAC
CCCTTTAACTTACTCTTTGCGCAGCGTCATCGTGCACTATGGGAC
TCACAACTACGGACATTATATTGCATTTCGCAAGTATCGTGGAT
GTTGGTGGCGCATCTCCGATGAGACGGTCTATGTGGTAGATGAG
GCCGAAGTACTGTCAACACCGGGGGTATTTATGCTTTTCTACGA
GTATGATTTCGACGAGGAGACCGGAAAAATGAAAGACGACTTA
GAAGCTATCCAGAGCAATAATGAGGAAGATGACGAGAAAGAAC
AGGAACAGAAGGGTGTCCAGGAGCCAAAAGAATCCCAGGAGCA
AGGCGAAGGCGAAGAACAAGAAGAAGGGCAAGAGCAAATGAA
ATTTGAGCGTACGGAGGATCATCGCGACATTTCAGGGAAGGATG
TGAATTAA (SEQ ID NO: 12)
Table 4. Oligonucleotides used
Oligo Name Sequence SEQ ID NO
pZE21-seq-F CCATTATTATCATGACATTAACC 13
pZE21-seq-R GGATTTGTCCTACTCAGGAG 14
AARS-seq-F CTTTTTATCGCAACTCTC 15
Ubiquitin+N- TTAAAGAGGAGAAATTAACTATGCAGATTTTTGTGAA 16 degron-F GACT
Ubiquitin+N- AGCTCCTCGCCCTTGGATGCAAGCTCCTGCACAAACAA 17 degron-R GT
pEVOLbbone_Ubp1-F CAGGGAAGGATGTGAATTAATAAGTCGACCATCATCATCA 18
pEVOLbbone_ AT :sAAAGACCCACTX .CCCAT AGAlX:TAATT .CT ::ClXrT 19
Ubpl-R TAGC
Ubpl -Pl-F TAACAGGAGGAATTAGATCTATGGGGAGTGGGTCTTT 20
CAT
Ubp1-P1-R TCAAGCGTGACTTGAACAAAACCTTGGAGTTGTTCTTGCG 21
Upbl -P2-F CGCAAGAACAACTCCAAGGTTTTGTTCAAGTCACGCTT 22
GA
Upbl -P2-R TGATGATGATGGTCGACTTATTAATTCACATCCTTCCC 23
TGA
pUbi-*-Ndeg- TGCGTCTGCGTGGAGGATAGTTGTTTGTGCAGGAGCTT 24
GFP-F GC
pUbi-*-Ndeg- AAGCTCCTGCACAAACAACTATCCTCCACGCAGACGC 25
GFP-R
Ubpl int-seq~F GCTTGGGTACGATCAAGAAG 26
Ubpl int-seq- CCTTGGTATGGTTATAATGCG 27 R
pZE21bbone4 CAGGGAAGGATGTGAATTAAAAGCTTGATGGGGGATC 28 Ubpl -F CCA
pZE21bbone4 ATGAAAGACCCACTCCCCATGGTACCTTTCTCCTCTTT 29 Ubpl -R AATGAAT
Ubpl-ins-F TTAAAGAGGAGAAAGGTACCATGGGGAGTGGGTCTTT 30
CAT
Ubpl-ins-R TGGGATCCCCCATCAAGCTTTTAATTCACATCCTTCCC 31
TGA
UbiGFPins-F TAAAGAGGAGAAAGGTACCATGCAGATTTTTGTGAAG 32
ACTTTAAC
UbiGFPins-R TGGGATCCCCCATCAAGCTTTTAGTGGTGGTGGTGGTG 33
GT
pZEbbone4Ubi ACCACCACCACCACCACTAAAAGCTTGATGGGGGATC 34
GFP-F CCA
pZEbbone4Ubi GTCTTCACAAAAATCTGCATGGTACCTTTCTCCTCTTTA 35
GFP-R ATGAAT
reporter to ge TTACGGGCTAATTACAGGCAGAAATGCGTGATGTGTG 36 nome-F CCACACTTGTTGATCCCTATCAGTGATAGAGATTGAC reporter to ge CCAGCGGGCTAACTTTCCTCGCCGGAAGAGTGGTTAA 37 nome-R CAAAATAGTAACGTCACCGACAAACAACAGATAAAAC
SIR-seq-F CCAAAGTGAGTTGAGTATAAC 38
SIR-seq-R TTTCTCCTTATTATCAATGC 39
r2g-extend-F GCCGCAGCAAGCCAAAGTGAGTTGAGTATAACGCAAA 40 TTTGCTACTGGTCCGATGGGTGCAATGGTCTGAATTAC GGGCTAATTACAGGC
r2g-extend- AACGCAATCGCAACCGCTAAACCACTGGCCATGTGCA 41
CGAGTTTCATTCATTTCTCCTTATTATCAATGCACCAGC
GGGCTAACTTTC
MAGE_*toS t*a*aagagctcctcgcccttggatgcAAGCTCCTGCACAAACAACgA 42
TCCTCCACGCAGACGCAGAACCAAATGAAGGGTAGAT
TCTTTCT
asPCR-S-F CGTCTGCGTGGAGGATC 43 asPCR-*-F CGTCTGCGTGGAGGATA 44 pZE- TTCTGACCCATCGTAATTAAaagcttgatgggggatccca 45
Ubplbbone4Cl
pP-F
pZE- tGGTATATCTCCTTTTATTATTAATTCACATCCTTCCCTG 46
Ubplbbone4Cl AAAT
pP-R
clpPins-F GTGAATTAATAATAAAAGGAGATATACCatgTCATACA 47
GCGGCGA
clpPins-R tgggatcccccatcaagcttTTAATTACGATGGGTCAGAATCG 48 pEVOLtR A- ctgccaacttactgatttagtgtatgatggtgtttttgagg 49 pl-F
pEVOLtRNA- gccgcttagttagccgtgcaaacttatatcgtatggggctg 50 pl-R
agccccatacgatataagtttgcacggctaactaagcggc 51 p2-F
ctcaaaaacaccatcatacactaaatcagtaagttggcagcatca 52 p2-R
pZE- TGTGTACGCTAGAAAAAGCCTAAaagcttgatgggggatc 53
Ubplbbone4Cl
pS-F
pZE- GTTCGTTTTACCcatGGTATATCTCCTTTTATTATTAATT 54
Ubplbbone4Cl CACAT
pS-R
ClpSins-F ATAATAAAAGGAGATATACCatgGGTAAAACGAACGAC 55
TG
ClpSins-R gatcccccatcaagcttTTAGGCTTTTTCTAGCGTACACA 56
AARSlibraryin tactgtttctccatacccgtttttttgggctaacaggaggaattagatct 57 s-F
pEVOLbbone4 agatctaattcctcctgttagcc 58 lib-R
mutS null mut A*C*CCCATGAGTGCAATAGAAAATTTCGACGCCCATA 59
-2* CGCCCATGATGCAGCAGTGATAGTCGCTGAAAGCCCA
GCATCCCGAGATCCTGC
mutS_null_rev A*C*CCCATGAGTGCAATAGAAAATTTCGACGCCCATA 60 ert-2* CGCCCATGATGCAGCAGTATCTCAGGCTGAAAGCCCA
GCATCCCGAGATCCTGC
mutS- CCATGATGCAGCAGTATCTCAG 61
2_ascPCR_wt-
F
mutS- CCATGATGCAGCAGTGATAGTC 62
2_ascPCR_mut
-F
mutS- AGGTTGTCCTGACGCTCCTG 63
2 ascPCR-R
ASPCR- GTATAATTTCAATTCCCATAATGTATAG 64 151UAG-F
ASPCR- GTATAATTTCAATTCCCATAATGTATAC 65 151UAC-F
ASPCR-151-R ctcgagcttatagagctcatc 66
RemovelS l UA c*t*taaaattcgccttaatgccattcttttgcttatctgcggtaatgtatacattatgggaattg 67 G- aaattatactccagcttatggccgag
MAGE correct
ed
ClpS.inact- C*T*TTTTCTTCCGCCAGTTGATCAAAGTCCAGCCAGTC 68
MAGE GTTCtaTTatCaCATTGTCAGTTATCATCTTCGGTTACGGT
TATCGGCAGAAC
ASPCR- CCGATAACCGTAACCGAAGATGATAACTGACAATGG 69 ClpS_WT-F
ASPCR- CCGATAACCGTAACCGAAGATGATAACTGACAATGT 70 ClpS.inact-F
ASPCR-ClpS- CGTACTTGTTCACCATCGCCACTTTGGT 71 R
pZE-U- CGACTGAGCCCGAGGAGTAAaagcttgatgggggatccca 72 bbone4ClpS2_
At-F
pZE-U- TCAACAGGACTATCAGACATGGTATATCTCCTTTTATT J bbone4ClpS2_ ATTAATTCACATCC
At-R
ClpS2_At-ins- ATAATAAAAGGAGATATACCATGTCTGATAGTCCTGTT 74
F GACTT
ClpS2_At-ins- tgggatcccccatcaagcttTTACTCCTCGGGCTCAGTCG 75
R
CipS Μ40Α-Ι· ATGATGATTACACTCCGGCGGAGTTTGTTATTGACGTG 76
T
ClpS_M40A-R CGTCAATAACAAACTCCGCCGGAGTGTAATCATCATTG 77
AC
pOSIPbbone-F taacctaaactgacaggcat 78 pOSIPbbone-R ttccgatccccaattcct 79 pEVOL-araC- GGATCATTTTGCGCTTCAG 80 seq-1
pEVOL-araC- GAATATAACCTTTCATTCCC 81 seq-2
PylRSmiddle- GTGTTTCGACTAGCATTTC 82 seq
PylRSend-seq GGTCAAACATGATTTCAAAAAC 83
pEVOLCmR- caacagtactgcgatgag 84 seq-R
upstreamClpS- GCAAATAAGCTCTTGTCAGC 85
C ipS I 32- CATCTATGTATAAAGTGATANTCGTCAATGATGATTAC 86 X ! { -!· ACTCCG
ClpS_32-R TATCACTTTATACATAGATG 87
ClpS-V43- ATTACACTCCGATGGAGTTTNTTATTGACGTGTTACAA 88
NTT-F AAATTC
ClpS_43-R AAACTCCATCGGAGTGTAAT 89
ClpS_V65- CAACGCAATTGATGCTCGCTNTTCACTACCAGGGGAA 90
NTT-F GG
ClpS_65-R AGCGAGCATCAATTGCGTTG 91
ClpS_L99- CGAGGGAGAATGAGCATCCANTCCTGTGTACGCTAGA 92 NTC-F AAAAGC
ClpS__99-R TGGATGCTCATTCTCCCTCG 93
Alt ClpS- gcggatttgtcctactcag 94 R_forL99
AARS- gctaacaggaggaattagatct 95 inducible-only-
F
AARS- ttgataatctaacaaggattatggg 96 inducible-only-
R
pEVOLbbone- cccataatccttgttagattatcaaaggcattttgctattaaggg 97 Ind-only-F
pEVOL-bbone- agatctaattcctcctgttagc 98 ind-only-R
protosens- TAACTCGAGGCTGTTTTGG 99 bbone-F
protosens- CATATGTATATCTCCTTGTGCATC 100 bbone-R
Ubpl ClpS4prot GATGCACAAGGAGATATACATATGGGGAGTGGGTCTT 101 osens-F TCAT
Ubpl ClpS4prot CCAAAACAGCCTCGAGTTAGGCTTTTTCTAGCGTACA 102 osens-R
acccgatcatgcaggttaacGTTATGcactacGATggtgt 103 ins-F
tcaccaccgaatttttccggACCtttgatggtcagcg 104 ins-R
bbone4pAzFR ccggaaaaattcggtggtga 105 S. l .tl-F
bbone4pAzFR gttaacctgcatgatcgggt 106 S. l .tl-R
pZEbbone4tetR acgctctcctgagtaggac 107 -F
pZEbbone4tetR tcaccgacaaacaacagataaaac 108 -R
TetR-ins-F tatctgttgtttgtcggtgaacgtctcattttcgccagat 109
TetR-ins-R gtcctactcaggagagcgtagtgtcaactttatggctagc 110 pDULE-ABK- cgacctgaatggaagcc 111 bbone-F
pDULE-ABK- catacacggtgcctgac 112 bbone-R
CmRins4pDUL aacgcagtcaggcaccgtgtatggagaaaaaaatcactggatatac 113 E-F
CmR4pDULE- gccggcttccattcaggtcgaaaaaattacgccccgc 114 R
pC FRS-65- CAAAATGCTGGATTTGATATAATTATA NKTTG NKG 115 67-70-NNK-F ATTTANNKGCCTATTTAAACCAGAAAGGAGAG
pCNFRS-65-R TATAATTATATCAAATCCAGCATTTTGTAAATC 116 pCNFRS-108- GGCAAAATATGTTTATGGAAGTGAANNKNNKCTTGAT 1 17 109-114-N K- AAGGATN KACACTGAATGTCTATAGATTGGC
F
pCNFRS-108- TTCACTTCCATAAACATATTTTGCC 118 R
pC FRS-155~ GAAGTTATCTATCCAATAATG NKGTTN KGGTGCTC 119 157-161 -NNK- ATNNKCTTGGCGTTGATGTTGCAG
CATTATTGGATAGATAACTTCAGCAAC 120
R
library INS-seq- CGCATCAGGCAATTTAGC 121 R
BipARS PI 44 cgcgcgtgaagacgaaaaccagaaagttgcggaagttatctac 122 Q-F
BipARS PI 44 ggttttcgtcttcacgcg 123 Q-R
BipARS XI 7 tacccgatcatgcaggttaaaggtatccactacaaaggtgttg 124
K-F
BipARS XI 57 ttaacctgcatgatcgggta 125
K-R
BipARS_R181 gtaaaatccacatgctggcgtgtgaactgctgccgaaa 126 C-F
BipARS_R181 cgccagcatgtggatttta 127 C-R
BipARS I255F tcctggaatacccgctgaccttcaaacgtccggaaaaattc 128
BipARS !255i ggtcagcgggtattccag 129 -R
BipARS_E259 gctgaccatcaaacgtccggtaaaattcggtggtgacctg 130 V-F
BipARS_E259 ccggacgtttgatggtc 131 V-R
BipARS P284 tcaaaaacaaagaactgcactcgatgcgtctgaaaaacg 132 S-F
BipA S P284 gtgcagttctttgtttttgaac 133 S-R
pEVOLbbone4 ctgcagtttcaaacgctaaattg 134 iibv2-F
AARSlibraryin taggcctgataagcgtagcgcatcaggcaatttagcgtttgaaactgcag 135 sv2-R
Bi pA R S G257 aatacccgctgaccatcaaacgtccggaaaaattcggtg 136 R-F
Bi pA R S G257 accaccgaatttttccggacgtttgatggtcagcgggtat 137 R-R
BipARS- gcgaaatacgtttacggttc 138 100AA-F
BipARS- gaaccgtaaacgtatttcgc 139 100AA-R
BipARS- ggacggtgaaggtaaaatgtc 140 200AA-F
BipARS- gacattttaccttcaccgtcc 141 200AA-R
pZErepbbone4 cggcgccagggttgtttttcacgctctcctgagtaggaca 142 pylT-F
pZErepbbone4 ttccattcaggtcgaaaaaaagtgtcaactttatggctagc 143 pylT-R
pylTpDULE-F ttttttcgacctgaatggaagc 144 pylTpDULE-R gaaaaacaaccctggcgc 145 pZEbbone4pyl cggcgccagggttgtttttcacgctctcctgagtaggaca 142
Tonly-F
pZEbbone4pyl ttccattcaggtcgaaaaaactcgaggtgaagacgaaagg 146 Tonly-R
ClpS-Lib-F ACATTTCAGGGAAGGATGTGAATTAATAATAAAAGGA 147
GATATACC
ClpS-Lib-R gcgtaccatgggatcccccatcaagcttTTA 148 pZEbbone4Clp TAAaagcttgatgggggatc 149 Slib-F
pZEbbone4Clp GGTATATCTCCTTTTATTATTAATTCACATCC 150 Slib-R
ClpS-Lib-Seq GGATCATCGCGACATTTC 151
Plasmids and plasmid construction
Two copies of orthogonal MjTyrRS-derived AARSs and tRNA_CUA^opt were kindly provided in pEVOL plasmids by Dr. Peter Schultz (Scripps Institute) (See, M. Ibba, D. Söll, Aminoacyl-tRNAs: setting the limits of the genetic code. Genes Dev. 18, 731-8 (2004)). AARSs used in this study were the following: BipARS (See, J. Xie, W. Liu, P. G. Schultz, A Genetically Encoded Bidentate, Metal-Binding Amino Acid. Angew. Chemie 119, 9399-9402 (2007)), BipyARS (See, J. Xie, W. Liu, P. G. Schultz, A Genetically Encoded Bidentate, Metal-Binding Amino Acid. Angew. Chemie 119, 9399-9402 (2007)), pAcFRS (See, L. Wang, Z. Zhang, A. Brock, P. G. Schultz, Addition of the keto functional group to the genetic code of Escherichia coli. Proc. Natl. Acad. Sci. U. S. A. 100, 56-61 (2003)), pAzFRS (See, J. W. Chin et al., Addition of p-Azido-L-phenylalanine to the Genetic Code of Escherichia coli. J. Am. Chem. Soc. 124, 9026-9027 (2002)), and NapARS (See, L. Wang, A. Brock, P. G. Schultz, Adding L-3-(2-Naphthyl)alanine to the Genetic Code of E. coli (2002), doi: 10.1021/JA012307J). The pEVOL plasmids were maintained using chloramphenicol. Original plasmids harboring two AARS copies were used for synthetase promiscuity comparison experiments (Figures 2 and 3A-D). For generation and characterization of synthetase variants, plasmids harboring only one AARS copy under inducible expression were constructed using Gibson assembly (See, D. G. Gibson et al., Enzymatic assembly of DNA molecules up to several hundred kilobases. Nat. Methods 6, 343-345 (2009)). The ScWRS-R3-13 AARS was synthesized as codon-optimized for expression in E. coli and cloned into the pEVOL plasmid along with its associated tRNA (See, R. A. Hughes, A. D. Ellington, Rational design of an orthogonal tryptophanyl nonsense suppressor tRNA. Nucleic Acids Res. 38, 6813-6830 (2010); J. W. Ellefson et al., Directed evolution of genetic parts and circuits by compartmentalized partnered replication. Nat. Biotechnol. 32, 97-101 (2014)). In all cases, tRNA is constitutively expressed and AARS expression is either arabinose inducible or constitutive.
An N-terminally truncated form of the UBP1 gene from Saccharomyces cerevisiae (See, J. W. Tobias, A. Varshavsky, Cloning and functional analysis of the ubiquitin-specific protease gene UBP1 of Saccharomyces cerevisiae. J. Biol. Chem. 266, 12021-8 (1991); A. Wojtowicz et al., Expression of yeast deubiquitination enzyme UBP1 analogues in E. coli. Microb. Cell Fact. 4, 1-12 (2005)) (ScUBP1trunc or simply UBP1) was synthesized as codon-optimized for expression in E. coli and cloned into the pZE21 vector (Kanamycin resistance, ColE1 origin, TET promoter) (Expressys). The E. coli genes clpS and clpP were PCR amplified from E. coli MG1655 and cloned into artificial operons downstream of the UBP1 gene in the pZE21 vector using Gibson assembly. Artificial operons were created by inserting the following RBS sequence between the UBP1 and clp genes: TAATAAAAGGAGATATACC (SEQ ID NO: 152). This RBS was originally designed using the RBS calculator (See, H. M. Salis, E. A. Mirsky, C. A. Voigt, Automated design of synthetic ribosome binding sites to control protein expression. Nat. Biotech. 27, 946-950 (2009)) and previously validated in the context of another artificial operon (See, A. M. Kunjapur, Y. Tarasova, K. L. J. Prather, Synthesis and Accumulation of Aromatic Aldehydes in an Engineered Strain of Escherichia coli. J. Am. Chem. Soc. 136, 11644-11654 (2014)). Rational engineering of ClpS variants was performed by dividing the clpS gene into two amplicons, where the second amplicon contained a degenerate NTC or NTT sequence in the oligo corresponding to each codon of interest. The four initial positions of interest in the clpS gene correspond to amino acids 32, 43, 65, and 99. In each case, Gibson assembly was used to ligate both amplicons and the backbone plasmid. The pZE/UBP1/ClpS and pZE/UBP1/ClpS_V65I plasmids are available from Addgene.
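To make the degenerate-codon design concrete, the following sketch expands a degenerate codon into its member codons and lists the residues they encode under the standard genetic code. Under that code, NTT and NTC each encode only Phe, Leu, Ile, and Val. The NNK scheme is included for comparison because it appears in the names of several Table 4 library oligonucleotides. The function names are hypothetical, and only the IUPAC symbols needed here (N and K) are defined.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# IUPAC ambiguity codes used in this sketch (N = any base, K = G or T).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "K": "GT"}


def expand_codon(degenerate: str) -> list[str]:
    """List every concrete codon described by a degenerate three-base codon."""
    return ["".join(c) for c in product(*(IUPAC[b] for b in degenerate.upper()))]


def encoded_residues(degenerate: str) -> list[str]:
    """Sorted one-letter residues ('*' = stop) encoded by a degenerate codon."""
    return sorted({CODON_TABLE[c] for c in expand_codon(degenerate)})


if __name__ == "__main__":
    for codon in ("NTT", "NTC", "NNK"):
        print(codon, len(expand_codon(codon)), "".join(encoded_residues(codon)))
```

Restricting the second base to T keeps each library position to four hydrophobic residues, so a four-position scan stays small compared with an NNK library at the same positions.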
Three reporter constructs were initially cloned into pZE21 vectors before use as templates for PCR amplification and genomic integration. The first of these consists of a Ubiquitin-*-LFVQEL-sfGFP-His6x fusion ("LFVQEL" and "His6x" disclosed as SEQ ID NOS 7 and 8, respectively) ("Ub-UAG-sfGFP") downstream of the TET promoter. The second has an additional UAG codon internal to the sfGFP at position Y151* ("Ub-UAG-sfGFP_151UAG"). The third has an ATG codon (encoding methionine) in place of the first UAG ("Ub-M-sfGFP_151UAG").
Culture Conditions
Cultures for general culturing used herein were grown in LB-Lennox medium (LBL: 10 g/L bacto tryptone, 5 g/L sodium chloride, 5 g/L yeast extract). Cultures for experiments in Figs. 3A-3D were grown in 2X YT medium (2XYT: 16 g/L bacto tryptone, 10 g/L bacto yeast extract, 5 g/L sodium chloride) given improved observed final culture densities compared to LBL upon expression of ClpS variants. Unless otherwise indicated, all cultures were grown in biological triplicate in 96-well deep-well plates in 300 μL culture volumes at 34 °C and 400 rpm.
Minimal Media SAA Spiking Experiments
Minimal media adapted C321.ΔA strains harboring either (i) pZE21/Ub-M-sfGFP_151UAG only, (ii) pZE21/Ub-M-sfGFP_151UAG and pEVOL/MjtRNA_CUA^opt, (iii) pZE21/Ub-M-sfGFP_151UAG and pEVOL/bipARS_WT-tRNA_WT, or (iv) pZE21/Ub-M-sfGFP_151UAG and pEVOL/bipARS_10-tRNA_10 were inoculated from frozen stocks in at least experimental duplicates. A 1X M9 salt medium containing 6.78 g/L Na2HPO4·7H2O, 3 g/L KH2PO4, 1 g/L NH4Cl, and 0.5 g/L NaCl, supplemented with 2 mM MgSO4, 0.1 mM CaCl2, 1% glycerol, trace elements, 0.25 μg/L D-biotin, and carbenicillin was used as the culture medium. The trace element solution (100X) used contained 5 g/L EDTA, 0.83 g/L FeCl3·6H2O, 84 mg/L ZnCl2, 10 mg/L CoCl2·6H2O, 13 mg/L CuCl2·2H2O, 1.6 mg/L MnCl2·2H2O and 10 mg/L H3BO3 dissolved in water (See, A. M. Kunjapur, J. C. Hyun, K. L. J. Prather, Deregulation of S-adenosylmethionine biosynthesis and regeneration improves methylation in the E. coli de novo vanillin biosynthesis pathway. Microb. Cell Fact. 15, 1 (2016)). Inocula were grown to confluence overnight in deep 96-well plates in medium supplemented with 0.2% arabinose and chloramphenicol and/or kanamycin. Experimental cultures were inoculated at 1:7 dilution in the same media supplemented with each of the 20 standard amino acids or BipA to 1 mM or 100 μM, respectively. Cultures were incubated at 34 °C to an OD600 of 0.5-0.8 in a shaking plate incubator at 1050 rpm (~4-5 h). GFP expression was induced by addition of anhydrotetracycline, and cells were incubated at 34 °C for an additional 16-20 h before measurement. All assays were performed in 96-well plate format. Cells were centrifuged at 5,000g for 5 min, washed with 1X PBS, and resuspended in 1X PBS after a second spin. GFP fluorescence was measured on a Biotek spectrophotometric plate reader using excitation and emission wavelengths of 485 and 525 nm. Fluorescence was then normalized by the OD600 reading to obtain FL/OD. Average normalized FL/OD values from 3 independent experiments were plotted.
NSAA Incorporation Assays
Strains harboring integrated GFP reporters and AARS/tRNA plasmids were inoculated from frozen stocks of biological triplicate and grown to confluence overnight in deep well plates. Experimental cultures were inoculated at 1:100 dilution in either LBL or 2XYT media supplemented with chloramphenicol, arabinose, and the appropriate NSAA. Cultures were incubated at 34 °C to an OD600 of 0.5-0.8 in a shaking plate incubator at 400 rpm (~4-5 h). GFP expression was induced by addition of anhydrotetracycline, and cells were incubated at 34 °C for an additional 16-20 h.
All assays were performed in 96-well plate format. Cells were centrifuged at 5,000g for 3 min, washed with PBS, and resuspended in PBS after a second spin. GFP fluorescence was measured on a Biotek spectrophotometric plate reader using excitation and emission wavelengths of 485 and 525 nm. Fluorescence signals were corrected for autofluorescence as a linear function of OD600 using the parent C321.ΔA strain that does not contain a reporter. Fluorescence was then normalized by the OD600 reading to obtain FL/OD.
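The autofluorescence correction and FL/OD normalization described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: the parent-strain readings are made-up numbers, the ordinary least-squares fit is one reasonable reading of "a linear function of OD600", and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def corrected_fl_per_od(fluorescence, od600, slope, intercept):
    """Subtract OD-dependent autofluorescence, then normalize by OD600."""
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    background = slope * od600 + intercept
    return (fluorescence - background) / od600


# Hypothetical readings from the reporter-free parent strain (OD600, fluorescence).
PARENT_OD = [0.2, 0.4, 0.8]
PARENT_FL = [40.0, 60.0, 100.0]
SLOPE, INTERCEPT = fit_line(PARENT_OD, PARENT_FL)

if __name__ == "__main__":
    # Hypothetical reporter-strain well: raw fluorescence 1070 at OD600 0.5.
    print(corrected_fl_per_od(1070.0, 0.5, SLOPE, INTERCEPT))
```

Fitting the background against OD600 rather than subtracting a single blank value accounts for wells that reach different final densities.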
Chemicals
NSAAs used in this study were purchased from PepTech Corporation, Sigma Aldrich, Santa Cruz Biotechnology, and Toronto Research Chemicals. The following NSAAs were purchased: L-4,4-Biphenylalanine (BipA), L-4-Benzoylphenylalanine (pBenzoylF), O-tert-Butyl-L-tyrosine (tButylY), L-2-Naphthylalanine (NapA), L-4-Acetylphenylalanine (pAcF), L-4-Iodophenylalanine (pIF), L-4-Bromophenylalanine (pBromoF), L-4-Chlorophenylalanine (pChloroF), L-4-Fluorophenylalanine (pFluoroF), L-4-Azidophenylalanine (pAzF), L-4-Nitrophenylalanine, L-4-Cyanophenylalanine, L-3-Iodophenylalanine, L-phenylalanine, L-tyrosine, L-tryptophan, D-phenylalanine, D-tyrosine, and 5-Hydroxytryptophan. Solutions of NSAAs (50 or 100 mM) were made in 10-50 mM NaOH.
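As a worked example of how such stocks translate into culture supplements, the sketch below applies C1V1 = C2V2. The pairing of stock and final concentrations is an assumption made for illustration: the 1 mM and 100 μM final concentrations and the 300 μL culture volume are taken from the culture and spiking descriptions elsewhere in this section, and the source does not state which stock was used for which compound. The function name is hypothetical, and the small volume added by the stock itself is ignored.

```python
def stock_volume_ul(stock_mM: float, final_mM: float, culture_volume_ul: float) -> float:
    """Volume of stock (in microliters) needed to reach final_mM in the culture.

    Uses C1 * V1 = C2 * V2 and ignores the volume contributed by the stock.
    """
    if stock_mM <= 0 or final_mM < 0 or culture_volume_ul <= 0:
        raise ValueError("concentrations and volume must be positive")
    if final_mM > stock_mM:
        raise ValueError("final concentration cannot exceed stock concentration")
    return final_mM * culture_volume_ul / stock_mM


if __name__ == "__main__":
    # Assumed pairing: 100 mM stock diluted to 1 mM in a 300 uL culture.
    print(stock_volume_ul(100.0, 1.0, 300.0))
    # Assumed pairing: 50 mM stock diluted to 100 uM (0.1 mM) in a 300 uL culture.
    print(stock_volume_ul(50.0, 0.1, 300.0))
```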
Library Generation
Error-prone PCR (EP-PCR) is the method of choice for introducing random mutations into a defined segment of DNA that is too long to be chemically synthesized as a degenerate sequence. EP-PCR was performed using the GeneMorph II Random Mutagenesis Kit (Stratagene Catalog #200550), following manufacturer instructions to obtain an average of approximately 2-4 DNA mutations per library member. To generate libraries of MjTyrRS-derived AARSs, roughly 175 ng of PCR template was used in each 25 μL of PCR mix containing primers that have roughly 40 base pairs of homology flanking the AARS coding region. The reaction mixture was subjected to 30 cycles with Tm of 63 °C and extension time of 1 min. Four separate 25 μL EP-PCR reactions were performed per AARS and then pooled. Plasmid backbone PCRs were performed using KOD Xtreme Hot Start Polymerase (Millipore Catalog #71795). Both PCR products were isolated by 1.5% agarose gel electrophoresis and Gibson assembled in 8 parallel 20 μL volumes per library. Assemblies were pooled, washed by ethanol precipitation, and resuspended in 50 μL of dH2O, which was drop dialyzed (EMD Millipore, Billerica, MA) and electroporated into E. cloni supreme cells (Lucigen, Middleton, WI). Libraries were expanded in culture and miniprepped (Qiagen, Valencia, CA) to roughly 100 ng/μL aliquots. 1 μg of library was drop dialyzed and electroporated into C321.ΔA.Nendint for subsequent FACS experiments. Colony counts on appropriate antibiotic-containing plates within one doubling time after transformation revealed library sizes of roughly 1 x 10° for AARS libraries in E. cloni hosts and J x 107 in C321.ΔA.Nendint hosts.
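To give a feel for what an average of 2-4 mutations per library member implies, the sketch below models the per-member mutation count as a Poisson random variable. The Poisson model is an assumption introduced here for illustration and is not stated in the source; it simply treats mutations as rare, independent events. The function names are hypothetical.

```python
from math import exp, factorial


def poisson_pmf(k: int, mean_mutations: float) -> float:
    """Probability that a library member carries exactly k mutations."""
    return mean_mutations ** k * exp(-mean_mutations) / factorial(k)


def fraction_with_at_least(k_min: int, mean_mutations: float) -> float:
    """Probability that a library member carries k_min or more mutations."""
    return 1.0 - sum(poisson_pmf(k, mean_mutations) for k in range(k_min))


if __name__ == "__main__":
    for mean_mutations in (2.0, 4.0):
        unmutated = poisson_pmf(0, mean_mutations)
        mutated = fraction_with_at_least(1, mean_mutations)
        print(f"mean={mean_mutations}: unmutated={unmutated:.3f}, mutated={mutated:.3f}")
```

Under this model, roughly 14% of members would carry no mutation at a mean of 2 and roughly 2% at a mean of 4, which is one reason the negative and positive sorting rounds must tolerate a background of parental sequences.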
Flow Cytometry and Cell Sorting
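The three-round positive, negative, positive sorting scheme described in this section can be summarized as percentile gates on per-cell fluorescence. The sketch below is a simplified illustration: real sorting gates are drawn on an instrument rather than computed from a list, and the function and constant names are hypothetical. Only the gate fractions (top 0.5%, lowest 10%, top 0.05%) and the presence or absence of the NSAA in each round are taken from the text.

```python
def gate(values, fraction, keep="top"):
    """Return the brightest ('top') or dimmest ('bottom') fraction of events."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if keep not in ("top", "bottom"):
        raise ValueError("keep must be 'top' or 'bottom'")
    n_keep = max(1, round(len(values) * fraction))
    ranked = sorted(values, reverse=(keep == "top"))
    return ranked[:n_keep]


# (gate direction, fraction kept, NSAA present in the culture medium)
SORT_ROUNDS = [
    ("top", 0.005, True),      # round 1: positive sort with the NSAA
    ("bottom", 0.10, False),   # round 2: negative sort without the NSAA
    ("top", 0.0005, True),     # round 3: positive sort with the NSAA
]

if __name__ == "__main__":
    # Hypothetical fluorescence values for 1,000 events.
    events = list(range(1000))
    for direction, fraction, nsaa_present in SORT_ROUNDS:
        kept = gate(events, fraction, keep=direction)
        print(direction, fraction, nsaa_present, len(kept))
```

Alternating the gate direction with NSAA availability selects for variants that are bright only when the intended NSAA is supplied.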
AARS libraries were subjected to three rounds of fluorescence-activated sorting in a Beckman Coulter MoFlo Astrios. Prior to each round, the usual NSAA incorporation assay procedure was followed such that cells would express GFP reporter in proportion to the activity of the AARS library member. One notable deviation from that procedure was the use of a higher and variable inoculum volume to screen the full library at each stage. Cells displaying the top 0.5% of fluorescence activation (50k cells) were collected after Round 1, expanded overnight, and used to inoculate experimental cultures for the next round. Because the next round was a negative screening round, the desired NSAA was not added into culture medium. The rest of the NSAA incorporation assay procedure was followed in order to eliminate cells that remained fluorescent due to promiscuous AARS activity on standard amino acids. In the second sort, cells displaying the lowest 10% of visible fluorescence (500k cells) were collected. Cells passing the second round were expanded overnight and used to inoculate the third and final round of sorting. The experimental cultures for the third round were treated as in the first round and were sorted for the upper 0.05% of fluorescence activation (1k cells). The final cells collected were expanded overnight and plated for sequencing and downstream testing. Libraries were frozen at each stage before and after sorting. FlowJo X software was used to analyze the flow cytometry data. Constructs of interest were grown overnight, miniprepped, and transformed into C321.ΔA.Ubiq-UAG-sfGFP for further analysis in plate reader assays.
Reporter Purification
Strains harboring integrated GFP reporters and AARS/tRNA plasmids were inoculated from frozen stocks and grown to confluence overnight in 5 mL 2XYT containing chloramphenicol. Saturated cultures were used to inoculate 500 mL experimental cultures of 2XYT supplemented with chloramphenicol, arabinose, and appropriate NSAAs. Cultures were incubated at 34 °C to an OD600 of 0.5-0.8 in a shaking incubator at 250 rpm. GFP expression was induced by addition of anhydrotetracycline, and cells were incubated at 34 °C for an additional 24 h before measurement. Cells were centrifuged in a Sorvall RC 5C Plus at 10,000 g for 20 minutes. Pellets were frozen at -20 °C before lysis and purification. Lysis of resuspended pellets was performed under denaturing conditions in 10 mL of 7 M urea, 0.1 M NaH2PO4, 0.01 M Tris-Cl, pH 8.0 buffer with 450 units of Benzonase (Novagen, cat. no. 70664-3) using 15 minutes of sonication on ice with a QSonica Q125 sonicator. Lysate was distributed into microcentrifuge tubes and centrifuged for 20 minutes at 20,000 g at room temperature, and then the protein-containing supernatant was removed. 2 mL of supernatant with 7.5 μM imidazole was added to 250 μL Ni-NTA resin (Qiagen, cat. no. 30210) and equilibrated at 4 °C overnight. Columns were washed seven times with 1 mL of 8 M urea, 0.1 M NaH2PO4, 0.01 M Tris-Cl. Washes 1 and 2 were adjusted to pH 6.3 and contained no imidazole. Washes 3-7 were adjusted to pH 6.1 and contained imidazole at concentrations of 10 mM, 25 mM, 40 mM, 60 mM and 80 mM, respectively. Protein was eluted with two 150 μL elutions using elution buffer (8 M urea, 0.1 M NaH2PO4, 0.01 M Tris-Cl, pH 4.5, 300 mM imidazole). Gels demonstrated that wash 5 eluted the protein, and for several samples the wash 5 fraction was concentrated ~20X using Amicon Ultra 0.5 mL 10K spin concentrators. Protein gels were loaded with 30 μL wash or elution volumes along with 10 μL Nu-PAGE loading dye in Nu-PAGE 10% Bis-Tris Gels (ThermoFisher, cat. no. NP0301). Protein gels were run at 180 V for 1 h, washed 3x with DI water, and stained with Coomassie (Invitrogen, cat. no. LC6060) for one hour. Gels were destained overnight in water on a shaker at room temperature, and images were taken with a BioRad ChemiDoc MP imaging system.
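The wash series is easier to check when written out as a table. The short Python sketch below only restates the pH and imidazole values given in the text; the function and variable names are illustrative and are not part of the original protocol. It makes explicit that wash 5, the fraction reported to carry the protein, corresponds to 40 mM imidazole.

```python
# Wash schedule restated from the protocol text; names are illustrative.
IMIDAZOLE_MM = [10, 25, 40, 60, 80]  # washes 3-7, in the order given


def wash_schedule():
    """Return a list of (wash number, pH, imidazole in mM) tuples."""
    schedule = [(n, 6.3, 0) for n in (1, 2)]  # washes 1-2: pH 6.3, no imidazole
    for offset, conc in enumerate(IMIDAZOLE_MM):
        schedule.append((3 + offset, 6.1, conc))  # washes 3-7: pH 6.1
    return schedule


for number, ph, imidazole in wash_schedule():
    print(f"wash {number}: pH {ph}, {imidazole} mM imidazole")
```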
Mass spectrometry
Samples were submitted for single LC-MS/MS experiments that were performed on an LTQ Orbitrap Elite (Thermo Fisher) equipped with a Waters (Milford, MA) NanoAcquity UPLC pump. Trypsin-digested peptides were separated on a 100 μm inner diameter microcapillary trapping column packed first with approximately 5 cm of C18 Reprosil resin (5 μm, 100 Å, Dr. Maisch GmbH, Germany), followed by an analytical column of ~20 cm of Reprosil resin (1.8 μm, 200 Å, Dr. Maisch GmbH, Germany). Separation was achieved by applying a gradient from 5-27% ACN in 0.1% formic acid over 90 min at 200 nL/min. Electrospray ionization was enabled by applying a voltage of 2.0 kV using a home-made electrode junction at the end of the microcapillary column, and the sample was sprayed from fused silica pico tips (New Objective, MA). The LTQ Orbitrap Elite was operated in data-dependent mode for the mass spectrometry methods. The mass spectrometry survey scan was performed in the Orbitrap in the range of 395-1,800 m/z at a resolution of 6 × 10^4, followed by the selection of the twenty most intense ions (TOP20) for CID-MS2 fragmentation in the ion trap using a precursor isolation width window of 2 m/z, an AGC setting of 10,000, and a maximum ion accumulation of 200 ms. Singly charged ion species were not subjected to CID fragmentation. Normalized collision energy was set to 35 V with an activation time of 10 ms; AGC was set to 50,000 and the maximum ion time was 200 ms. Ions in a 10 ppm m/z window around ions selected for MS2 were excluded from further selection for fragmentation for 60 s.
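Because the exclusion window is stated in ppm, its absolute width in m/z depends on where in the 395-1,800 m/z survey range an ion falls. The following Python sketch shows the conversion. The function names are illustrative, and treating the tolerance as symmetric (plus or minus the stated ppm) is an assumption, since the text does not say whether 10 ppm is a half-width or a full width.

```python
def ppm_to_mz(mz, ppm):
    """Absolute m/z width corresponding to a relative tolerance in ppm."""
    return mz * ppm / 1e6


def within_ppm(observed_mz, reference_mz, ppm):
    """True if observed_mz lies within +/- ppm of reference_mz (assumed symmetric)."""
    return abs(observed_mz - reference_mz) <= ppm_to_mz(reference_mz, ppm)


# Width of a 10 ppm tolerance at both ends of the survey scan range.
for mz in (395.0, 1800.0):
    print(f"m/z {mz}: 10 ppm = {ppm_to_mz(mz, 10):.5f} m/z")
```

The same conversion applies to the 20 ppm precursor tolerance used in the database search.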
Mass Spectrometry Analysis
Raw data were submitted for analysis in Proteome Discoverer 2.1.0.81 (Thermo Scientific) software. Assignment of MS/MS spectra was performed using the Sequest HT algorithm by searching the data against a user-provided protein sequence database as well as all entries from the E. coli Uniprot database and other known contaminants such as human keratins and common lab contaminants. Sequest HT searches were performed using a 20 ppm precursor ion tolerance and requiring each peptide's N- and C-termini to adhere to trypsin protease specificity while allowing up to two missed cleavages. Cysteine carbamidomethyl (+57.021 Da) was set as a static modification, while methionine oxidation (+15.99492 Da) was set as a variable modification. An MS2 spectra assignment false discovery rate (FDR) of 1% at the protein level was achieved by applying the target-decoy database search. Filtering was performed using Percolator (64-bit version, reference 6). For quantification, a 0.02 m/z window centered on the theoretical m/z value of each of the six reporter ions was used, and the intensity of the signal closest to the theoretical m/z value was recorded. Reporter ion intensities were exported from the Proteome Discoverer 2.1 search engine result file as Excel tables. All fold changes were analyzed after normalization between samples based on total unique peptide ion signal.
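The target-decoy idea behind the 1% FDR filter can be illustrated with a few lines of Python. This is a simplified sketch and not the Percolator procedure used in the analysis: it assumes each hit carries a single score (higher is better) and a flag saying whether it matched the decoy database, and it estimates FDR at a cutoff as decoys divided by targets. The function name and input format are hypothetical.

```python
def fdr_threshold(hits, max_fdr=0.01):
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    hits: iterable of (score, is_decoy) pairs, higher score = better match.
    FDR at a cutoff is estimated as (#decoys) / (#targets) among hits
    scoring at or above the cutoff. Returns None if no cutoff qualifies.
    Tied scores are not treated specially in this sketch.
    """
    ranked = sorted(hits, key=lambda h: h[0], reverse=True)
    targets = decoys = 0
    best = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= max_fdr:
            best = score  # everything at or above this score passes
    return best


# Hypothetical scores: three target hits and one decoy hit.
example = [(10.0, False), (9.0, False), (8.0, True), (7.0, False)]
print(fdr_threshold(example, max_fdr=0.01))
```

Accepting only hits at or above the returned cutoff keeps the estimated fraction of false assignments at or below the chosen limit.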
In vitro Aminoacylation Assays
Wild-type BipARS, BipARS9, and BipARS10 DNA templates were amplified from the pEVOL.BipARS plasmid and cloned into pET20b using Gibson assembly (New England Biolabs) with primers pET20.F2 and pET20.R for linearization of pET20b and BipRS.F and BipRS.R2 for amplification of BipARS. The BipARS.pET20b plasmids were transformed into BL21(DE3) cells. A 25-mL overnight culture was used to inoculate 500 mL of fresh LB media containing ampicillin. Cells were grown at 37 °C to an OD600 of approximately 0.6, and protein overexpression was induced with 1 mM IPTG for 4 h. Cells were harvested by centrifugation at 4 °C for 20 minutes at 6000 rpm. Cells were lysed using 50 mM Tris (pH 7.5), 300 mM NaCl, 3 mM 2-mercaptoethanol and 5 mM imidazole followed by sonication. Lysed cells were centrifuged at 18000 x g for 1 h at 4 °C. The supernatant was run through TALON resin and BipARS was eluted using an imidazole concentration gradient. The proteins were stored in 50 mM HEPES (pH 7.3), 50 mM KCl, and 1 mM dithiothreitol (DTT). Protein concentration was calculated using the Bradford assay (BioRad).
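The harvest step is specified in rpm while the clarification step is specified in x g, and the two can only be compared once a rotor radius is known. The Python sketch below uses the standard relation RCF = 1.118 × 10^-5 × r × rpm^2 (r in cm). The rotor radius is not stated in the text, so the 10 cm value is purely a placeholder assumption, and the function names are illustrative.

```python
from math import sqrt

RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rpm_to_rcf(rpm, radius_cm):
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rcf_to_rpm(rcf, radius_cm):
    """Rotor speed in rpm needed to reach a given relative centrifugal force."""
    return sqrt(rcf / (RCF_CONSTANT * radius_cm))


ASSUMED_RADIUS_CM = 10.0  # hypothetical rotor radius, not given in the text
print(round(rpm_to_rcf(6000, ASSUMED_RADIUS_CM)))   # harvest step, in x g
print(round(rcf_to_rpm(18000, ASSUMED_RADIUS_CM)))  # clarification step, in rpm
```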
The tRNA genes were cloned into pUC18 using Gibson Assembly. pUC18 was linearized using primers pUCbip.F and pUCbip.R. The tRNA gene fragment was prepared by annealing 2 μM of primers tBip F and tBip R for WT tRNA, tBip9 F and tBip9 R for tRNA variant 9, and tBip10 F and tBip10 R for tRNA variant 10. tRNAs were obtained by in vitro transcription using T7 RNA polymerase. ~100 μg of the resulting plasmid was digested with BstNI overnight at 55 °C, and the digestion reaction was used to start in vitro transcription by adding transcription buffer (40 mM Tris-HCl, pH 8, 6 mM MgCl2, 1 mM spermidine, 0.01% Triton, 0.005 mg/mL BSA, and 5 mM dithiothreitol), 4 mM NTPs (ATP, GTP, UTP, and CTP), 20 mM MgCl2, 5 mM DTT, 2 units/mg of pyrophosphatase (Roche), and 0.75 mg/mL T7 RNA polymerase. The reaction was incubated for 6-7 h at 37 °C. The tRNA was purified using an 8 M urea/12% acrylamide gel and extracted from the gel using a solution containing 0.5 M sodium acetate and 1 mM EDTA (pH 8) overnight at 30 °C, followed by ethanol precipitation.
For aminoacylation reactions, tRNAs were radiolabeled at the 3′-end using CCA-adding enzyme as previously described (See A. M. Kunjapur, J. C. Hyun, K. L. J. Prather, Deregulation of S-adenosylmethionine biosynthesis and regeneration improves methylation in the E. coli de novo vanillin biosynthesis pathway. Microb. Cell Fact. 15, 1 (2016)). Reactions were carried out with 5 μM tRNA (with a trace amount of 32P-labeled tRNA), 2.5 mM amino acid, and 5 μM BipARS in buffer containing 50 mM HEPES (pH 7.3), 4 mM ATP, 20 mM MgCl2, 0.1 mg/mL BSA, and 1 mM DTT. Reactions were incubated for 30 minutes at 37 °C. 2 μL of reaction mixture was quenched in 5 μL of 0.1 U/μL P1 nuclease (Sigma) in 200 mM sodium acetate (pH 5) right after enzyme addition and after 30 min. The quenched time points were incubated at room temperature for 1 h. 1 μL of the solution was run on PEI cellulose thin layer chromatography sheets. The fraction of aminoacylated tRNA was determined as described previously (See M. Ibba, D. Söll, Aminoacyl-tRNAs: setting the limits of the genetic code. Genes Dev. 18, 731-8 (2004)). All assays were repeated three times. Figures were generated using Prism 7 (GraphPad Software).
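The quantification step is given only by reference. One common way to read out a P1 nuclease/TLC assay of 3′-labeled tRNA is to compare the aminoacyl-AMP spot (from charged tRNA) with the free AMP spot (from uncharged tRNA). The Python sketch below assumes that approach. The spot intensities are hypothetical, subtracting the time-zero fraction is an optional convention rather than something the text specifies, and the function names are illustrative.

```python
from statistics import mean, stdev


def fraction_aminoacylated(aa_amp_signal, amp_signal):
    """Fraction of tRNA charged, from aminoacyl-AMP and AMP spot intensities."""
    total = aa_amp_signal + amp_signal
    if total <= 0:
        raise ValueError("spot intensities must sum to a positive value")
    return aa_amp_signal / total


def net_fraction(spots_30min, spots_0min):
    """Charged fraction at 30 min minus the fraction at time zero."""
    return fraction_aminoacylated(*spots_30min) - fraction_aminoacylated(*spots_0min)


# Hypothetical (aa-AMP, AMP) intensities for three repeats of one assay.
repeats = [((450, 550), (50, 950)), ((480, 520), (40, 960)), ((430, 570), (60, 940))]
values = [net_fraction(t30, t0) for t30, t0 in repeats]
print(f"charged fraction: {mean(values):.3f} +/- {stdev(values):.3f} (SD, n=3)")
```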
Biocontainment escape frequency assays
Escape assays were performed essentially as previously described (See K. H. Wang, G. Román-Hernández, R. A. Grant, R. T. Sauer, T. A. Baker, The Molecular Basis of N-End Rule Recognition. Mol. Cell. 32, 406-414 (2008)). All strains were grown in permissive conditions and harvested in late exponential phase. Cells were washed twice in LB and resuspended in LB. Viable CFU were calculated from the mean and standard error of the mean (SEM) of three technical replicates of tenfold serial dilutions on permissive media. Three technical replicates were plated on non-permissive media and monitored for 7 days. Synthetic auxotrophs were plated on two different non-permissive media conditions: SCA - LB with SDS, chloramphenicol, and arabinose - for previously published strains; and KA - LB with kanamycin and arabinose - for strains generated in this study. The latter strains were isolated by transformation with pEVOL vectors harboring kanamycin resistance markers instead of chloramphenicol resistance markers. Passaging and replica plating were used to ensure that isolated strains had lost chloramphenicol resistance and thus the original OTS construct used in the previous study. If synthetic auxotrophs exhibited escape frequencies above the detection limit (lawns) on non-permissive media at days 2, 5, or 7, escape frequencies for those days were calculated from additional platings at lower density. The SEM across technical replicates of the cumulative escape frequency was calculated as previously indicated.
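The arithmetic behind an escape frequency can be sketched as follows, assuming the usual definition of escapee colonies on non-permissive media divided by viable CFU plated. All counts, dilution factors, and plated volumes below are hypothetical, the function names are illustrative, and the error propagation used in the cited procedure is not reproduced here.

```python
from math import sqrt
from statistics import mean, stdev


def sem(values):
    """Standard error of the mean for a list of replicate values."""
    return stdev(values) / sqrt(len(values))


def cfu_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Viable titer from one countable plate of a serial dilution."""
    return colonies * dilution_factor / plated_volume_ml


def escape_frequency(escapee_colonies, viable_cfu_plated):
    """Escapees per viable cell plated on non-permissive media."""
    return escapee_colonies / viable_cfu_plated


# Hypothetical counts for three technical replicates (illustrative only).
permissive_counts = [52, 47, 50]   # colonies at a 1e6-fold dilution
titers = [cfu_per_ml(c, 1e6, 0.1) for c in permissive_counts]
mean_titer = mean(titers)          # CFU per mL
plated_cfu = mean_titer * 1.0      # assume 1 mL plated on non-permissive media
escapees = [3, 1, 2]               # escapee colonies per replicate plate
freqs = [escape_frequency(e, plated_cfu) for e in escapees]
print(f"titer: {mean_titer:.2e} +/- {sem(titers):.1e} CFU/mL")
print(f"escape frequency: {mean(freqs):.1e} +/- {sem(freqs):.1e}")
```

When no escapees appear, the frequency can only be reported as below one divided by the number of viable cells plated.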
Biocontained strain doubling time measurement
Doubling times for biocontained strains were measured in triplicate by plate reader as indicated earlier for growth assays. Doubling time assays for biocontained strains in the presence of only non-cognate NSAAs were performed as follows: cells grown to mid-log in permissive media were washed twice in LB and diluted to OD ~0.1 before 300-fold dilution into three 150 μL volumes of LB+NSAA for each NSAA. These cultures were incubated in the Eon plate reader at conditions described earlier.
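A doubling time is typically obtained from plate reader data by fitting a straight line to ln(OD) against time over the exponential phase and dividing ln 2 by the slope. The Python sketch below assumes that approach; the fitting procedure actually used is described elsewhere in the document and may differ. The readings are hypothetical and the function name is illustrative.

```python
from math import log


def doubling_time(times, ods):
    """Doubling time as ln(2) / slope of a least-squares fit of ln(OD) vs time.

    times and ods should cover only the exponential phase; the result is in
    the same time unit as the input.
    """
    ys = [log(od) for od in ods]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, ys))
    slope = sxy / sxx  # specific growth rate per time unit
    if slope <= 0:
        raise ValueError("culture is not growing over the selected window")
    return log(2) / slope


# Hypothetical readings (minutes, OD600) from one well.
minutes = [0, 30, 60, 90, 120]
od600 = [0.010, 0.019, 0.041, 0.079, 0.162]
print(f"doubling time: {doubling_time(minutes, od600):.1f} min")
```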
OTHER EMBODIMENTS
Other embodiments will be evident to those of skill in the art. It should be understood that the foregoing description is provided for clarity only and is merely exemplary. The spirit and scope of the present invention are not limited to the above examples, but are encompassed by the following claims. All publications and patent applications cited above are incorporated by reference in their entirety for all purposes to the same extent as if each individual publication or patent application were specifically and individually indicated to be so incorporated by reference.
Claims
1. A biphenylalanine amino acyl tRNA synthetase variant wherein the variant comprises one or more amino acid substitutions to a parental biphenylalanine amino acyl tRNA synthetase having the sequence of
MDEFEMIKRNTSEIISEEELREVL KDEKSAHIGFEPSGKIHLGHYLQIKKJvlIDLQNAG FDIIfflLADLHAYLNQKGELDEIR IGDYNKK EAIvlGLKAKYVYGSEWMLDKDYT LNVYRXALXTTUCR
EQRKIHMLARELLPKKVVC1HNPVLTGLDGEG MSSSKGNFIAVDDSPEEIRA IK A
YCPAGV ΈG PIMEIA YFLEYPLTIKRPEKFGGDLTVNSYΈELESLFKNKELHPMDL KNAVAEELD ILEPIRKRL (SEQ ID NO: 1).
2. The variant of claim 1 wherein the variant comprises one or more amino acid substitutions selected from the group consisting of N157K and I255F; R257G; R181C and E259V; I153V and A214T; P37A; K76R; I49F, A130V and A233V; L55M and G158S; D61V and H70Q; and N117D, D200Y, G210S, E237V and D286Y to the parental biphenylalanine amino acyl tRNA synthetase, or an amino acid sequence having at least 90% sequence identity thereof.
3. The variant of claim 1 wherein the variant comprises amino acid substitutions D61V and H70Q to the parental biphenylalanine amino acyl tRNA synthetase, or an amino acid sequence having at least 90% sequence identity thereof.
4. An isolated polynucleotide encoding the variant of claim 1.
5. A host cell comprising an expression vector, wherein the expression vector comprises the polynucleotide of claim 4.
6. A transfer RNA (tRNA) variant wherein the variant comprises one or more nucleotide substitutions to a parental tRNA having the sequence of
ccggcggtagttcagcagggcagaacggcggactctaaatccgcatggcaggggttcaaatcccctccgccggacca (SEQ ID NO: 2).
7. The tRNA variant of claim 6 wherein the tRNA variant comprises a nucleotide substitution selected from the group consisting of A22G, C67A, C26T, C29A, G51T and G23A to the parental tRNA, or a nucleotide sequence having at least 90% sequence identity thereof.
8. An isolated polynucleotide encoding the variant of claim 6.
9. A host cell comprising an expression vector, wherein the expression vector comprises the polynucleotide of claim 8.
10. A biphenylalanine amino acyl tRNA synthetase and tRNA pair wherein the pair is selected from the group consisting of
i) a biphenylalanine amino acyl tRNA synthetase variant comprising amino acid substitutions N157K and I255F to the parental biphenylalanine amino acyl tRNA synthetase of claim 1 and the parental tRNA of claim 6;
ii) a biphenylalanine amino acyl tRNA synthetase variant comprising an amino acid substitution R257G to the parental biphenylalanine amino acyl tRNA synthetase of claim 1 and the parental tRNA of claim 6;
iii) a biphenylalanine amino acyl tRNA synthetase variant comprising amino acid substitutions R181C and E259V to the parental biphenylalanine amino acyl tRNA synthetase of claim 1 and a tRNA variant comprising a nucleotide substitution A22G to the parental tRNA of claim 6;
iv) a biphenylalanine amino acyl tRNA synthetase variant comprising amino acid substitutions I153V and A214T to the parental biphenylalanine amino acyl tRNA synthetase of claim 1 and a tRNA variant comprising a nucleotide substitution C67A to the parental tRNA of claim 6;
v) a biphenylalanine amino acyl tRNA synthetase variant comprising an amino acid substitution P37A to the parental biphenylalanine amino acyl tRNA synthetase of claim 1 and the parental tRNA of claim 6;
vi) a biphenylalanine amino acyl tRNA synthetase variant comprising an amino acid substitution K76R to the parental biphenylalanine amino acyl tRNA synthetase of claim 1 and the parental tRNA of claim 6;
vii) the parental biphenylalanine amino acyl tRNA synthetase of claim 1 and a tRNA variant comprising a nucleotide substitution A22G to the parental tRNA of claim 6;
viii) a biphenylalanine amino acyl tRNA synthetase variant comprising amino acid substitutions I49F, A130V and A233V to the parental biphenylalanine amino acyl tRNA synthetase of claim 1 and a tRNA variant comprising a nucleotide substitution C26T to the parental tRNA of claim 6;
ix) a biphenylalanine amino acyl tRNA synthetase variant comprising amino acid substitutions L55M and G158S to the parental biphenylalanine amino acyl tRNA synthetase of claim 1 and a tRNA variant comprising a nucleotide substitution C29A to the parental tRNA of claim 6;
x) a biphenylalanine amino acyl tRNA synthetase variant comprising amino acid substitutions D61V and H70Q to the parental biphenylalanine amino acyl tRNA synthetase of claim 1 and a tRNA variant comprising a nucleotide substitution G51T to the parental tRNA of claim 6; and
xi) a biphenylalanine amino acyl tRNA synthetase variant comprising amino acid substitutions N117D, D200Y, G210S, E237V and D286Y to the parental biphenylalanine amino acyl tRNA synthetase of claim 1 and a tRNA variant comprising a nucleotide substitution G23A to the parental tRNA of claim 6.
11. A method of screening for an amino acyl tRNA synthetase variant having preferential selectivity for a desired non-standard amino acid (NSAA) over its standard amino acid (SAA) counterpart or an undesired non-standard amino acid for incorporation into a target polypeptide in a cell comprising
providing to the cell an amino acyl tRNA synthetase variant and its cognate transfer RNA corresponding to the desired NSAA, wherein the cell is genetically engineered to express the target polypeptide including an amino acid target location for incorporation of the desired NSAA by the amino acyl tRNA synthetase variant and the transfer RNA, and wherein the cell expresses the target polynucleotide and either a desired NSAA, an SAA or an undesired NSAA is incorporated at the amino acid target location depending on the preferential selectivity of the amino acyl tRNA synthetase variant and the transfer RNA for the corresponding desired NSAA,
wherein a removable protecting group is attached to the target polypeptide adjacent to the amino acid target location, such that when the removable protecting group is removed, an N-end amino acid is exposed at the amino acid target location, and wherein a detectable moiety is attached to the C-end of the target polypeptide,
wherein the cell expresses an enzyme that cleaves the removable protecting group to generate an N-end amino acid, and wherein the cell further expresses an adaptor protein for a protease, wherein the protease degrades the target polypeptide when the N-end amino acid is an SAA or an undesired NSAA,
detecting the detectable moiety as a measure of the amount of target polypeptide including the desired NSAA within the cell, and
repeatedly testing an amino acyl tRNA synthetase variant for improved production of the target polypeptide including the desired NSAA.
12. The method of claim 11 wherein the removable protecting group is ubiquitin that is cleavable by Ubp1.
13. The method of claim 11 wherein the detectable moiety is a fluorescent moiety or a reporter protein.
14. The method of claim 11 wherein the cell expresses the enzyme for cleaving the removable protecting group constitutively or inducibly.
15. The method of claim 11 wherein the adaptor protein and the protease is a ClpS-ClpAP protease system wherein the ClpS-ClpAP protease system degrades the target polypeptide when the N-end amino acid is an SAA or an undesired NSAA to thereby enrich the target polypeptide including the desired NSAA within the cell.
16. The method of claim 11 wherein the adaptor protein comprises a ClpS protein, its natural homolog, ClpS_V65I, ClpS_43I or ClpS_L32F mutants.
17. The method of claim 11 wherein the cell is a prokaryotic cell or a eukaryotic cell.
18. The method of claim 11 wherein the cell is a bacterium.
19. The method of claim 11 wherein the cell is a genetically modified E. coli.
20. The method of claim 11 wherein the desired NSAA is biphenylalanine (BipA).
21. The method of claim 11 wherein the amino acyl tRNA synthetase variant is a biphenylalanine amino acyl tRNA synthetase (BipARS) variant.
22. The method of claim 11 wherein the amino acyl tRNA synthetase variant is generated by introducing mutations throughout a parental amino acyl tRNA synthetase gene.
23. The method of claim 11 wherein error-prone PCR is used to introduce mutations throughout the wild type amino acyl tRNA synthetase gene.
24. The method of claim 11 wherein the amino acyl tRNA synthetase variant is provided to the cell by a nucleic acid encoding the amino acyl tRNA synthetase variant.
25. The method of claim 11 wherein the transfer RNA is provided to the cell by a nucleic acid encoding the transfer RNA.
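The substitution labels used in the claims (for example A22G or D61V) name the parental residue, its 1-based position, and the replacement. The Python sketch below shows how such labels can be checked against and applied to a parental sequence, using the parental tRNA of SEQ ID NO: 2 as the worked case. The function names are illustrative. The identity calculation is a simple position-by-position comparison of equal-length sequences, which is an assumption made for brevity; sequence identity is normally computed from an alignment.

```python
import re

# Parental tRNA sequence copied from SEQ ID NO: 2 (claim 6).
PARENT_TRNA = "ccggcggtagttcagcagggcagaacggcggactctaaatccgcatggcaggggttcaaatcccctccgccggacca"

_LABEL = re.compile(r"^([A-Za-z])(\d+)([A-Za-z])$")


def parse_substitution(label):
    """Split a label such as 'A22G' into (old residue, 1-based position, new residue)."""
    match = _LABEL.match(label)
    if not match:
        raise ValueError(f"not a substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitutions(parent, labels):
    """Return parent with each substitution applied, verifying the parental residue."""
    seq = list(parent)
    for label in labels:
        old, pos, new = parse_substitution(label)
        if not 1 <= pos <= len(seq):
            raise ValueError(f"{label}: position outside the sequence")
        current = seq[pos - 1]
        if current.upper() != old.upper():
            raise ValueError(f"{label}: parent has {current!r} at position {pos}")
        # Preserve the letter case used by the parental sequence.
        seq[pos - 1] = new.lower() if current.islower() else new.upper()
    return "".join(seq)


def percent_identity(a, b):
    """Position-wise identity of two equal-length sequences (no alignment)."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length in this sketch")
    matches = sum(x.upper() == y.upper() for x, y in zip(a, b))
    return 100.0 * matches / len(a)


variant = apply_substitutions(PARENT_TRNA, ["A22G"])
print(variant)
print(f"{percent_identity(PARENT_TRNA, variant):.1f}% identical to the parent")
```

Each of the six nucleotide substitutions listed in claim 7 matches the parental base at the stated position in SEQ ID NO: 2, so the residue check passes for all of them.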
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201762527115P | 2017-06-30 | 2017-06-30 | |
US62/527,115 | 2017-06-30 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2019005973A1 (en) | 2019-01-03 |
Family
ID=64742229
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2018/039764 WO2019005973A1 (en) | 2017-06-30 | 2018-06-27 | Synthetase variants for incorporation of biphenylalanine into a peptide |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2019005973A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023044431A3 (en) * | 2021-09-17 | 2023-04-27 | Absci Corporation | Composition of transfer rnas and use in production of proteins containing non-standard amino acids |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130244245A1 (en) * | 2004-10-27 | 2013-09-19 | The Scripps Research Institute | Orthogonal Translation Components for the in Vivo Incorporation of Unnatural Amino Acids |
US20160230176A1 (en) * | 2013-09-27 | 2016-08-11 | President And Fellows Of Harvard College | Recombinant Cells and Organisms Having Persistent Nonstandard Amino Acid Dependence and Methods of Making Them |
US20160355802A1 (en) * | 2014-02-06 | 2016-12-08 | Yale University | Compositions and Methods Of Use Thereof For Making Polypeptides With Many Instances Of Nonstandard Amino Acids |
2018-06-27: WO PCT/US2018/039764, patent WO2019005973A1 (en), active Application Filing
Non-Patent Citations (3)
Title |
---|
LU ET AL.: "Co-expression for intracellular processing in microbial protein production", BIOTECHNOLOGY LETTERS, vol. 36, no. 3, March 2014 (2014-03-01), pages 427 - 441, XP055555575, Retrieved from the Internet <URL:https://www.researchgate.net/profile/Juan_Aon/publication/257839380_Co-expression_for_intracellular_processing_in_microbial_protein_production/links/57c43c6b08aee465796c1042/Co-expression-for-intracellular-processing-in-microbial-protein-production.pdf> [retrieved on 20181024] * |
LUO ET AL.: "Genetically encoding phosphotyrosine and its nonhydrolyzable analog in bacteria", NATURE CHEMICAL BIOLOGY, vol. 13, no. 8, June 2017 (2017-06-01), pages 845 - 849, XP055555583, [retrieved on 20181024] * |
SCHUENEMANN ET AL.: "Structural basis of N-end rule substrate recognition in Escherichia coli by the ClpAP adaptor protein ClpS", EMBO REPORTS, vol. 10, no. 5, May 2009 (2009-05-01), pages 508 - 514, XP055555580, [retrieved on 20181024] * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 18823078 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 18823078 Country of ref document: EP Kind code of ref document: A1 |