EP3087088B1 - Methods and compositions for ribosomal synthesis of macrocyclic peptides - Google Patents
Methods and compositions for ribosomal synthesis of macrocyclic peptides
- Publication number
- EP3087088B1 EP14874141.6A
- Authority
- EP
- European Patent Office
- Prior art keywords
- seq
- phenylalanine
- polypeptide
- group
- amino acid
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 108090000765 processed proteins & peptides Proteins 0.000 title claims description 623
- 102000004196 processed proteins & peptides Human genes 0.000 title claims description 463
- 238000000034 method Methods 0.000 title claims description 174
- 230000015572 biosynthetic process Effects 0.000 title description 48
- 210000003705 ribosome Anatomy 0.000 title description 28
- 238000003786 synthesis reaction Methods 0.000 title description 27
- 239000000203 mixture Substances 0.000 title description 22
- 229920001184 polypeptide Polymers 0.000 claims description 347
- 239000002243 precursor Substances 0.000 claims description 178
- 235000001014 amino acid Nutrition 0.000 claims description 166
- 150000001413 amino acids Chemical class 0.000 claims description 160
- 230000017730 intein-mediated protein splicing Effects 0.000 claims description 157
- 108090000623 proteins and genes Proteins 0.000 claims description 114
- 102000004169 proteins and genes Human genes 0.000 claims description 95
- 230000014509 gene expression Effects 0.000 claims description 92
- 102000052866 Amino Acyl-tRNA Synthetases Human genes 0.000 claims description 86
- 108700028939 Amino Acyl-tRNA Synthetases Proteins 0.000 claims description 86
- 235000018102 proteins Nutrition 0.000 claims description 86
- 210000004027 cell Anatomy 0.000 claims description 83
- 125000005842 heteroatom Chemical group 0.000 claims description 70
- 210000004899 c-terminal region Anatomy 0.000 claims description 67
- 125000001931 aliphatic group Chemical group 0.000 claims description 57
- 125000003118 aryl group Chemical group 0.000 claims description 54
- 235000018417 cysteine Nutrition 0.000 claims description 52
- 229910052794 bromium Inorganic materials 0.000 claims description 51
- 229910052801 chlorine Inorganic materials 0.000 claims description 51
- 229910052731 fluorine Inorganic materials 0.000 claims description 51
- 108020004705 Codon Proteins 0.000 claims description 45
- XUJNEKJLAYXESH-UHFFFAOYSA-N cysteine Natural products SCC(N)C(O)=O XUJNEKJLAYXESH-UHFFFAOYSA-N 0.000 claims description 45
- 125000000524 functional group Chemical group 0.000 claims description 45
- 241000588724 Escherichia coli Species 0.000 claims description 43
- 102000039446 nucleic acids Human genes 0.000 claims description 41
- 108020004707 nucleic acids Proteins 0.000 claims description 41
- 150000007523 nucleic acids Chemical class 0.000 claims description 41
- 108091033319 polynucleotide Proteins 0.000 claims description 34
- 102000040430 polynucleotide Human genes 0.000 claims description 34
- 239000002157 polynucleotide Substances 0.000 claims description 34
- 125000001314 canonical amino-acid group Chemical group 0.000 claims description 30
- 108020005038 Terminator Codon Proteins 0.000 claims description 29
- 230000027455 binding Effects 0.000 claims description 22
- 102000004190 Enzymes Human genes 0.000 claims description 20
- 108090000790 Enzymes Proteins 0.000 claims description 20
- 238000006664 bond formation reaction Methods 0.000 claims description 20
- 125000005647 linker group Chemical group 0.000 claims description 19
- 125000003107 substituted aryl group Chemical group 0.000 claims description 18
- 125000003396 thiol group Chemical group [H]S* 0.000 claims description 17
- 240000004808 Saccharomyces cerevisiae Species 0.000 claims description 15
- 125000004104 aryloxy group Chemical group 0.000 claims description 14
- 125000003545 alkoxy group Chemical group 0.000 claims description 13
- 229910052740 iodine Inorganic materials 0.000 claims description 11
- FSSPGSAQUIYDCN-UHFFFAOYSA-N 1,3-Propane sultone Chemical class O=S1(=O)CCCO1 FSSPGSAQUIYDCN-UHFFFAOYSA-N 0.000 claims description 10
- HGGRBJAGJSWUIF-UHFFFAOYSA-N 4,4-difluorooxathiolane 2,2-dioxide Chemical compound FC1(F)COS(=O)(=O)C1 HGGRBJAGJSWUIF-UHFFFAOYSA-N 0.000 claims description 10
- QZDCJIPGANFBBF-UHFFFAOYSA-N 4-fluorooxathiolane 2,2-dioxide Chemical class FC1COS(=O)(=O)C1 QZDCJIPGANFBBF-UHFFFAOYSA-N 0.000 claims description 10
- 239000004472 Lysine Substances 0.000 claims description 10
- 150000001541 aziridines Chemical class 0.000 claims description 10
- OSYXBYWFALZNNG-JTQLQIEISA-N (2S)-2-amino-3-[4-(2-bromoethoxy)phenyl]propanoic acid Chemical compound OC(=O)[C@@H](N)CC1=CC=C(OCCBr)C=C1 OSYXBYWFALZNNG-JTQLQIEISA-N 0.000 claims description 9
- 230000004927 fusion Effects 0.000 claims description 9
- VNJXOJOGPXQFEG-MHPPCMCBSA-N (2S)-2-amino-3-[4-(1-bromoethyl)phenyl]propanoic acid Chemical compound BrC(C)C1=CC=C(C[C@H](N)C(=O)O)C=C1 VNJXOJOGPXQFEG-MHPPCMCBSA-N 0.000 claims description 8
- 125000001433 C-terminal amino-acid group Chemical group 0.000 claims description 7
- 241000282414 Homo sapiens Species 0.000 claims description 7
- 125000001429 N-terminal alpha-amino-acid group Chemical group 0.000 claims description 7
- 210000003527 eukaryotic cell Anatomy 0.000 claims description 7
- 108091006047 fluorescent proteins Proteins 0.000 claims description 7
- 102000034287 fluorescent proteins Human genes 0.000 claims description 7
- JECYNCQXXKQDJN-UHFFFAOYSA-N 2-(2-methylhexan-2-yloxymethyl)oxirane Chemical class CCCCC(C)(C)OCC1CO1 JECYNCQXXKQDJN-UHFFFAOYSA-N 0.000 claims description 6
- 241000238631 Hexapoda Species 0.000 claims description 6
- 108020004566 Transfer RNA Proteins 0.000 claims description 6
- 210000004900 c-terminal fragment Anatomy 0.000 claims description 6
- 210000004898 n-terminal fragment Anatomy 0.000 claims description 6
- 210000001236 prokaryotic cell Anatomy 0.000 claims description 6
- CZASZGLJJLORBV-MHPPCMCBSA-N (2S)-2-amino-3-[3-(1-bromoethyl)phenyl]propanoic acid Chemical compound BrC(C)C=1C=C(C[C@H](N)C(=O)O)C=CC1 CZASZGLJJLORBV-MHPPCMCBSA-N 0.000 claims description 5
- JAMZRLQCMMBEBP-JTQLQIEISA-N (2S)-2-amino-3-[3-(2-bromoethoxy)phenyl]propanoic acid Chemical compound BrCCOC=1C=C(C[C@H](N)C(=O)O)C=CC1 JAMZRLQCMMBEBP-JTQLQIEISA-N 0.000 claims description 5
- BBMMJGYOUUIZPZ-JTQLQIEISA-N (2S)-2-amino-3-[3-(2-chloroethoxy)phenyl]propanoic acid Chemical compound ClCCOC=1C=C(C[C@H](N)C(=O)O)C=CC1 BBMMJGYOUUIZPZ-JTQLQIEISA-N 0.000 claims description 5
- XKJFVPMRDJHUPP-VIFPVBQESA-N (2S)-2-amino-3-[3-(2-fluoroacetyl)phenyl]propanoic acid Chemical compound FCC(=O)C=1C=C(C[C@H](N)C(=O)O)C=CC1 XKJFVPMRDJHUPP-VIFPVBQESA-N 0.000 claims description 5
- PEBWLPAXBBOMHR-JTQLQIEISA-N (2S)-2-amino-3-[3-(aziridin-1-yl)phenyl]propanoic acid Chemical compound N1(CC1)C=1C=C(C[C@H](N)C(=O)O)C=CC1 PEBWLPAXBBOMHR-JTQLQIEISA-N 0.000 claims description 5
- IDUKFTVTJPUQSZ-JTQLQIEISA-N (2S)-2-amino-3-[3-(prop-2-enoylamino)phenyl]propanoic acid Chemical compound C(C=C)(=O)NC=1C=C(C[C@H](N)C(=O)O)C=CC1 IDUKFTVTJPUQSZ-JTQLQIEISA-N 0.000 claims description 5
- YFWGXSHKXGUFSZ-VIFPVBQESA-N (2S)-2-amino-3-[3-[(2-fluoroacetyl)amino]phenyl]propanoic acid Chemical compound FCC(=O)NC=1C=C(C[C@H](N)C(=O)O)C=CC1 YFWGXSHKXGUFSZ-VIFPVBQESA-N 0.000 claims description 5
- OBXWZIFDQSTBRN-VIFPVBQESA-N (2S)-2-amino-3-[4-(2-fluoroacetyl)phenyl]propanoic acid Chemical compound OC(=O)[C@@H](N)CC1=CC=C(C(=O)CF)C=C1 OBXWZIFDQSTBRN-VIFPVBQESA-N 0.000 claims description 5
- BWXFVCOZBOKRJI-JTQLQIEISA-N (2S)-2-amino-3-[4-(aziridin-1-yl)phenyl]propanoic acid Chemical compound N1(CC1)C1=CC=C(C[C@H](N)C(=O)O)C=C1 BWXFVCOZBOKRJI-JTQLQIEISA-N 0.000 claims description 5
- ZUAJCXDEDFXJER-JTQLQIEISA-N (2S)-2-amino-3-[4-(prop-2-enoylamino)phenyl]propanoic acid Chemical compound OC(=O)[C@@H](N)CC1=CC=C(NC(=O)C=C)C=C1 ZUAJCXDEDFXJER-JTQLQIEISA-N 0.000 claims description 5
- HJTGGJIQDHYXPE-VIFPVBQESA-N (2S)-2-amino-3-[4-[(2-fluoroacetyl)amino]phenyl]propanoic acid Chemical compound FCC(=O)NC1=CC=C(C[C@H](N)C(=O)O)C=C1 HJTGGJIQDHYXPE-VIFPVBQESA-N 0.000 claims description 5
- LARNBHANOBFOGY-VIFPVBQESA-N (2s)-2-amino-3-[3-[(2-chloroacetyl)amino]phenyl]propanoic acid Chemical compound OC(=O)[C@@H](N)CC1=CC=CC(NC(=O)CCl)=C1 LARNBHANOBFOGY-VIFPVBQESA-N 0.000 claims description 5
- QHKDZPVDNPPXMI-JTQLQIEISA-N (2s)-2-amino-3-[4-(2-chloroethoxy)phenyl]propanoic acid Chemical compound OC(=O)[C@@H](N)CC1=CC=C(OCCCl)C=C1 QHKDZPVDNPPXMI-JTQLQIEISA-N 0.000 claims description 5
- YTXBQNBQCNPJDY-VIFPVBQESA-N (2s)-2-amino-3-[4-[(2-chloroacetyl)amino]phenyl]propanoic acid Chemical compound OC(=O)[C@@H](N)CC1=CC=C(NC(=O)CCl)C=C1 YTXBQNBQCNPJDY-VIFPVBQESA-N 0.000 claims description 5
- 125000000449 nitro group Chemical group [O-][N+](*)=O 0.000 claims description 5
- 230000004568 DNA-binding Effects 0.000 claims description 4
- 150000002924 oxiranes Chemical class 0.000 claims description 4
- PDJZTZQUVUYQRM-NCJLJLRUSA-N (2s)-6-amino-2-[[(e)-but-2-enoyl]amino]hexanoic acid Chemical compound C\C=C\C(=O)N[C@H](C(O)=O)CCCCN PDJZTZQUVUYQRM-NCJLJLRUSA-N 0.000 claims 2
- 210000004602 germ cell Anatomy 0.000 claims 1
- 210000001161 mammalian embryo Anatomy 0.000 claims 1
- QTBSBXVTEAMEQO-UHFFFAOYSA-N acetic acid Substances CC(O)=O QTBSBXVTEAMEQO-UHFFFAOYSA-N 0.000 description 158
- 229940024606 amino acid Drugs 0.000 description 156
- XUJNEKJLAYXESH-REOHCLBHSA-N L-Cysteine Chemical compound SC[C@H](N)C(O)=O XUJNEKJLAYXESH-REOHCLBHSA-N 0.000 description 61
- 108010054814 DNA Gyrase Proteins 0.000 description 43
- 125000000151 cysteine group Chemical class N[C@@H](CS)C(=O)* 0.000 description 42
- 239000000460 chlorine Substances 0.000 description 36
- 239000000047 product Substances 0.000 description 36
- 238000006243 chemical reaction Methods 0.000 description 35
- 108020004414 DNA Proteins 0.000 description 33
- 125000000539 amino acid group Chemical group 0.000 description 32
- 125000002619 bicyclic group Chemical group 0.000 description 31
- 238000007363 ring formation reaction Methods 0.000 description 31
- OKKJLVBELUTLKV-MZCSYVLQSA-N Deuterated methanol Chemical compound [2H]OC([2H])([2H])[2H] OKKJLVBELUTLKV-MZCSYVLQSA-N 0.000 description 28
- MTCFGRXMJLQNBG-REOHCLBHSA-N (2S)-2-Amino-3-hydroxypropanoic acid Chemical compound OC[C@H](N)C(O)=O 0.000 description 27
- 108010013829 alpha subunit DNA polymerase III Proteins 0.000 description 25
- 238000004895 liquid chromatography mass spectrometry Methods 0.000 description 25
- 239000013612 plasmid Substances 0.000 description 25
- 230000000875 corresponding effect Effects 0.000 description 23
- 238000010348 incorporation Methods 0.000 description 23
- 125000001360 methionine group Chemical group N[C@@H](CCSC)C(=O)* 0.000 description 23
- 239000002904 solvent Substances 0.000 description 23
- 230000001629 suppression Effects 0.000 description 23
- XEKOWRVHYACXOJ-UHFFFAOYSA-N Ethyl acetate Chemical compound CCOC(C)=O XEKOWRVHYACXOJ-UHFFFAOYSA-N 0.000 description 22
- 101001105683 Homo sapiens Pre-mRNA-processing-splicing factor 8 Proteins 0.000 description 21
- 102100021231 Pre-mRNA-processing-splicing factor 8 Human genes 0.000 description 21
- 125000000217 alkyl group Chemical group 0.000 description 20
- 125000004432 carbon atom Chemical group C* 0.000 description 20
- 125000003367 polycyclic group Chemical group 0.000 description 20
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Chemical compound O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 19
- 108091060545 Nonsense suppressor Proteins 0.000 description 18
- 108010090804 Streptavidin Proteins 0.000 description 18
- 101100388071 Thermococcus sp. (strain GE8) pol gene Proteins 0.000 description 18
- 238000002955 isolation Methods 0.000 description 18
- 125000003342 alkenyl group Chemical group 0.000 description 17
- 239000013598 vector Substances 0.000 description 17
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 16
- 230000002829 reductive effect Effects 0.000 description 16
- 101150093191 RIR1 gene Proteins 0.000 description 15
- 102000001218 Rec A Recombinases Human genes 0.000 description 15
- 108010055016 Rec A Recombinases Proteins 0.000 description 15
- 101100302210 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) RNR1 gene Proteins 0.000 description 15
- 238000004458 analytical method Methods 0.000 description 15
- 125000003729 nucleotide group Chemical group 0.000 description 15
- 239000000126 substance Substances 0.000 description 15
- IJGRMHOSHXDMSA-UHFFFAOYSA-N Atomic nitrogen Chemical compound N#N IJGRMHOSHXDMSA-UHFFFAOYSA-N 0.000 description 14
- 108010069514 Cyclic Peptides Proteins 0.000 description 14
- 102000001189 Cyclic Peptides Human genes 0.000 description 14
- 235000014680 Saccharomyces cerevisiae Nutrition 0.000 description 14
- -1 antibodies Proteins 0.000 description 14
- 125000004093 cyano group Chemical group *C#N 0.000 description 14
- 230000003993 interaction Effects 0.000 description 14
- 230000035772 mutation Effects 0.000 description 14
- 239000002773 nucleotide Substances 0.000 description 14
- 238000001228 spectrum Methods 0.000 description 14
- 241001524679 Escherichia virus M13 Species 0.000 description 13
- 125000003275 alpha amino acid group Chemical group 0.000 description 13
- 230000001580 bacterial effect Effects 0.000 description 13
- 238000005710 macrocyclization reaction Methods 0.000 description 13
- 238000002703 mutagenesis Methods 0.000 description 13
- 231100000350 mutagenesis Toxicity 0.000 description 13
- 238000004885 tandem mass spectrometry Methods 0.000 description 13
- 101710146427 Probable tyrosine-tRNA ligase, cytoplasmic Proteins 0.000 description 12
- HEMHJVSKTPXQMS-UHFFFAOYSA-M Sodium hydroxide Chemical compound [OH-].[Na+] HEMHJVSKTPXQMS-UHFFFAOYSA-M 0.000 description 12
- 101710107268 Tyrosine-tRNA ligase, mitochondrial Proteins 0.000 description 12
- 230000037433 frameshift Effects 0.000 description 12
- 238000007429 general method Methods 0.000 description 12
- 238000001727 in vivo Methods 0.000 description 12
- 230000001404 mediated effect Effects 0.000 description 12
- 150000003568 thioethers Chemical class 0.000 description 12
- 238000005160 1H NMR spectroscopy Methods 0.000 description 11
- 125000000415 L-cysteinyl group Chemical group O=C([*])[C@@](N([H])[H])([H])C([H])([H])S[H] 0.000 description 11
- 125000000304 alkynyl group Chemical group 0.000 description 11
- 125000000613 asparagine group Chemical group N[C@@H](CC(N)=O)C(=O)* 0.000 description 11
- 235000019439 ethyl acetate Nutrition 0.000 description 11
- 238000000338 in vitro Methods 0.000 description 11
- 238000002360 preparation method Methods 0.000 description 11
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 10
- 241000203407 Methanocaldococcus jannaschii Species 0.000 description 10
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 10
- 241000192581 Synechocystis sp. Species 0.000 description 10
- 238000004519 manufacturing process Methods 0.000 description 10
- 230000007017 scission Effects 0.000 description 10
- 125000003607 serino group Chemical group [H]N([H])[C@]([H])(C(=O)[*])C(O[H])([H])[H] 0.000 description 10
- 150000003573 thiols Chemical class 0.000 description 10
- 229910004373 HOAc Inorganic materials 0.000 description 9
- JCXJVPUVTGWSNB-UHFFFAOYSA-N Nitrogen dioxide Chemical compound O=[N]=O JCXJVPUVTGWSNB-UHFFFAOYSA-N 0.000 description 9
- 108091028043 Nucleic acid sequence Proteins 0.000 description 9
- 102000018378 Tyrosine-tRNA ligase Human genes 0.000 description 9
- 238000003556 assay Methods 0.000 description 9
- 239000011324 bead Substances 0.000 description 9
- 230000008901 benefit Effects 0.000 description 9
- 238000003776 cleavage reaction Methods 0.000 description 9
- 150000001875 compounds Chemical class 0.000 description 9
- 125000004122 cyclic group Chemical group 0.000 description 9
- 239000013604 expression vector Substances 0.000 description 9
- RAXXELZNTBOGNW-UHFFFAOYSA-N imidazole Natural products C1=CNC=N1 RAXXELZNTBOGNW-UHFFFAOYSA-N 0.000 description 9
- 230000004048 modification Effects 0.000 description 9
- 238000012986 modification Methods 0.000 description 9
- 230000008569 process Effects 0.000 description 9
- 238000011144 upstream manufacturing Methods 0.000 description 9
- 241000205274 Methanosarcina mazei Species 0.000 description 8
- NINIDFKCEFEMDL-UHFFFAOYSA-N Sulfur Chemical compound [S] NINIDFKCEFEMDL-UHFFFAOYSA-N 0.000 description 8
- UENWRTRMUIOCKN-UHFFFAOYSA-N benzyl thiol Chemical compound SCC1=CC=CC=C1 UENWRTRMUIOCKN-UHFFFAOYSA-N 0.000 description 8
- 239000002245 particle Substances 0.000 description 8
- 239000007787 solid Substances 0.000 description 8
- 238000006467 substitution reaction Methods 0.000 description 8
- PXQLVRUNWNTZOS-UHFFFAOYSA-N sulfanyl Chemical compound [SH] PXQLVRUNWNTZOS-UHFFFAOYSA-N 0.000 description 8
- DCXYFEDJOCDNAF-UHFFFAOYSA-N Asparagine Natural products OC(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-N 0.000 description 7
- 229920002101 Chitin Polymers 0.000 description 7
- 101710123256 Pyrrolysine-tRNA ligase Proteins 0.000 description 7
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 7
- 239000005864 Sulphur Substances 0.000 description 7
- 235000009582 asparagine Nutrition 0.000 description 7
- 229960001230 asparagine Drugs 0.000 description 7
- QVGXLLKOCUKJST-UHFFFAOYSA-N atomic oxygen Chemical compound [O] QVGXLLKOCUKJST-UHFFFAOYSA-N 0.000 description 7
- 229910052799 carbon Inorganic materials 0.000 description 7
- 239000013592 cell lysate Substances 0.000 description 7
- 230000008859 change Effects 0.000 description 7
- 239000012043 crude product Substances 0.000 description 7
- 125000001072 heteroaryl group Chemical group 0.000 description 7
- 239000006166 lysate Substances 0.000 description 7
- 125000002950 monocyclic group Chemical group 0.000 description 7
- 229910052757 nitrogen Inorganic materials 0.000 description 7
- 229910052760 oxygen Inorganic materials 0.000 description 7
- 239000001301 oxygen Substances 0.000 description 7
- 239000011541 reaction mixture Substances 0.000 description 7
- 239000000243 solution Substances 0.000 description 7
- 238000013519 translation Methods 0.000 description 7
- VGCGRMAEOPVCDN-PKPIPKONSA-N (2S)-2,6-diamino-7-(2-bromoethoxy)-7-oxoheptanoic acid Chemical compound BrCCOC(=O)C(CCC[C@H](N)C(=O)O)N VGCGRMAEOPVCDN-PKPIPKONSA-N 0.000 description 6
- IHLXOBVKEDXKFW-PKPIPKONSA-N (2S)-2,6-diamino-7-(2-chloroethoxy)-7-oxoheptanoic acid Chemical compound ClCCOC(=O)C(CCC[C@H](N)C(=O)O)N IHLXOBVKEDXKFW-PKPIPKONSA-N 0.000 description 6
- UJAXSMBQBPJDHB-NSHDSACASA-N (2s)-2,6-diamino-2-[(2-methylpropan-2-yl)oxycarbonyl]hexanoic acid Chemical compound CC(C)(C)OC(=O)[C@@](N)(C(O)=O)CCCCN UJAXSMBQBPJDHB-NSHDSACASA-N 0.000 description 6
- WEVYAHXRMPXWCK-UHFFFAOYSA-N Acetonitrile Chemical compound CC#N WEVYAHXRMPXWCK-UHFFFAOYSA-N 0.000 description 6
- RTGFDFIYXFCRGG-MQWKRIRWSA-N C(C=C=C)(=O)C(CCC[C@H](N)C(=O)O)N Chemical compound C(C=C=C)(=O)C(CCC[C@H](N)C(=O)O)N RTGFDFIYXFCRGG-MQWKRIRWSA-N 0.000 description 6
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 6
- OKKJLVBELUTLKV-UHFFFAOYSA-N Methanol Chemical compound OC OKKJLVBELUTLKV-UHFFFAOYSA-N 0.000 description 6
- ZMANZCXQSJIPKH-UHFFFAOYSA-N Triethylamine Chemical compound CCN(CC)CC ZMANZCXQSJIPKH-UHFFFAOYSA-N 0.000 description 6
- 238000001042 affinity chromatography Methods 0.000 description 6
- 238000013459 approach Methods 0.000 description 6
- 125000004429 atom Chemical group 0.000 description 6
- 230000001588 bifunctional effect Effects 0.000 description 6
- 239000003153 chemical reaction reagent Substances 0.000 description 6
- 238000003818 flash chromatography Methods 0.000 description 6
- RWSXRVCMGQZWBV-WDSKDSINSA-N glutathione Chemical compound OC(=O)[C@@H](N)CCC(=O)N[C@@H](CS)C(=O)NCC(O)=O RWSXRVCMGQZWBV-WDSKDSINSA-N 0.000 description 6
- 125000000623 heterocyclic group Chemical group 0.000 description 6
- 230000001965 increasing effect Effects 0.000 description 6
- VLKZOEOYAKHREP-UHFFFAOYSA-N n-Hexane Chemical compound CCCCCC VLKZOEOYAKHREP-UHFFFAOYSA-N 0.000 description 6
- 230000009257 reactivity Effects 0.000 description 6
- 238000012216 screening Methods 0.000 description 6
- 230000003612 virological effect Effects 0.000 description 6
- 101710132601 Capsid protein Proteins 0.000 description 5
- 101710094648 Coat protein Proteins 0.000 description 5
- 102100021181 Golgi phosphoprotein 3 Human genes 0.000 description 5
- 101710125418 Major capsid protein Proteins 0.000 description 5
- 101710141454 Nucleoprotein Proteins 0.000 description 5
- 101710083689 Probable capsid protein Proteins 0.000 description 5
- PMZURENOXWZQFD-UHFFFAOYSA-L Sodium Sulfate Chemical compound [Na+].[Na+].[O-]S([O-])(=O)=O PMZURENOXWZQFD-UHFFFAOYSA-L 0.000 description 5
- 230000009471 action Effects 0.000 description 5
- 239000000872 buffer Substances 0.000 description 5
- 230000003197 catalytic effect Effects 0.000 description 5
- 230000000694 effects Effects 0.000 description 5
- RTZKZFJDLAIYFH-UHFFFAOYSA-N ether Substances CCOCC RTZKZFJDLAIYFH-UHFFFAOYSA-N 0.000 description 5
- 238000013467 fragmentation Methods 0.000 description 5
- 238000006062 fragmentation reaction Methods 0.000 description 5
- 230000003834 intracellular effect Effects 0.000 description 5
- 238000003402 intramolecular cyclocondensation reaction Methods 0.000 description 5
- 150000002678 macrocyclic compounds Chemical class 0.000 description 5
- 210000004962 mammalian cell Anatomy 0.000 description 5
- 230000007246 mechanism Effects 0.000 description 5
- 230000000269 nucleophilic effect Effects 0.000 description 5
- 239000003921 oil Substances 0.000 description 5
- 238000002823 phage display Methods 0.000 description 5
- 239000000843 powder Substances 0.000 description 5
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 description 5
- 238000000746 purification Methods 0.000 description 5
- 239000011780 sodium chloride Substances 0.000 description 5
- 229910052938 sodium sulfate Inorganic materials 0.000 description 5
- 235000011152 sodium sulphate Nutrition 0.000 description 5
- 125000001424 substituent group Chemical group 0.000 description 5
- 238000001644 13C nuclear magnetic resonance spectroscopy Methods 0.000 description 4
- 102000014914 Carrier Proteins Human genes 0.000 description 4
- 108010078791 Carrier Proteins Proteins 0.000 description 4
- 125000000998 L-alanino group Chemical group [H]N([*])[C@](C([H])([H])[H])([H])C(=O)O[H] 0.000 description 4
- 241000187479 Mycobacterium tuberculosis Species 0.000 description 4
- 108010067902 Peptide Library Proteins 0.000 description 4
- 241001148023 Pyrococcus abyssi Species 0.000 description 4
- 241000192117 Trichodesmium erythraeum Species 0.000 description 4
- 230000000975 bioactive effect Effects 0.000 description 4
- 239000013626 chemical specie Substances 0.000 description 4
- 125000000753 cycloalkyl group Chemical group 0.000 description 4
- 239000012467 final product Substances 0.000 description 4
- 230000007062 hydrolysis Effects 0.000 description 4
- 238000006460 hydrolysis reaction Methods 0.000 description 4
- 230000001939 inductive effect Effects 0.000 description 4
- 239000003446 ligand Substances 0.000 description 4
- 238000001840 matrix-assisted laser desorption--ionisation time-of-flight mass spectrometry Methods 0.000 description 4
- BDAGIHXWWSANSR-UHFFFAOYSA-N methanoic acid Natural products OC=O BDAGIHXWWSANSR-UHFFFAOYSA-N 0.000 description 4
- BWHMMNNQKKPAPP-UHFFFAOYSA-L potassium carbonate Chemical compound [K+].[K+].[O-]C([O-])=O BWHMMNNQKKPAPP-UHFFFAOYSA-L 0.000 description 4
- 230000010076 replication Effects 0.000 description 4
- 239000000523 sample Substances 0.000 description 4
- 230000004083 survival effect Effects 0.000 description 4
- 238000013518 transcription Methods 0.000 description 4
- 230000035897 transcription Effects 0.000 description 4
- 210000005253 yeast cell Anatomy 0.000 description 4
- HOFXDHDXBBWUQQ-GDVGLLTNSA-N (2S)-2,6-diamino-8-chloro-7-oxooctanoic acid Chemical compound N[C@@H](CCCC(N)C(=O)CCl)C(O)=O HOFXDHDXBBWUQQ-GDVGLLTNSA-N 0.000 description 3
- ABZDIWZCRLFTSC-GDVGLLTNSA-N (2S)-2,6-diamino-8-fluoro-7-oxooctanoic acid Chemical compound FCC(=O)C(CCC[C@H](N)C(=O)O)N ABZDIWZCRLFTSC-GDVGLLTNSA-N 0.000 description 3
- FLCFWDYHAQRFNI-MLWJPKLSSA-N (2s)-2,6-diamino-7-oxonon-8-enoic acid Chemical compound OC(=O)[C@@H](N)CCCC(N)C(=O)C=C FLCFWDYHAQRFNI-MLWJPKLSSA-N 0.000 description 3
- 125000006686 (C1-C24) alkyl group Chemical group 0.000 description 3
- LKQDUTMGIFNKRU-OFOFWLPESA-N (E,2S)-2,6-diamino-7-oxodec-8-enoic acid Chemical compound C(\C=C\C)(=O)C(CCC[C@H](N)C(=O)O)N LKQDUTMGIFNKRU-OFOFWLPESA-N 0.000 description 3
- 102100037399 Alanine-tRNA ligase, cytoplasmic Human genes 0.000 description 3
- 102000053602 DNA Human genes 0.000 description 3
- 101710177611 DNA polymerase II large subunit Proteins 0.000 description 3
- 101710184669 DNA polymerase II small subunit Proteins 0.000 description 3
- 102100021389 DNA replication licensing factor MCM4 Human genes 0.000 description 3
- 102000004533 Endonucleases Human genes 0.000 description 3
- 108010042407 Endonucleases Proteins 0.000 description 3
- 108010024636 Glutathione Proteins 0.000 description 3
- 101000879354 Homo sapiens Alanine-tRNA ligase, cytoplasmic Proteins 0.000 description 3
- 101000615280 Homo sapiens DNA replication licensing factor MCM4 Proteins 0.000 description 3
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 3
- 101800002094 Mxe GyrA intein Proteins 0.000 description 3
- 241000192673 Nostoc sp. Species 0.000 description 3
- 241000144615 Thermococcus aggregans Species 0.000 description 3
- 239000007983 Tris buffer Substances 0.000 description 3
- 102100025336 Tyrosine-tRNA ligase, mitochondrial Human genes 0.000 description 3
- 125000002252 acyl group Chemical group 0.000 description 3
- 235000004279 alanine Nutrition 0.000 description 3
- 230000029936 alkylation Effects 0.000 description 3
- 238000005804 alkylation reaction Methods 0.000 description 3
- AVKUERGKIZMTKX-NJBDSQKTSA-N ampicillin Chemical compound C1([C@@H](N)C(=O)N[C@H]2[C@H]3SC([C@@H](N3C2=O)C(O)=O)(C)C)=CC=CC=C1 AVKUERGKIZMTKX-NJBDSQKTSA-N 0.000 description 3
- 229960000723 ampicillin Drugs 0.000 description 3
- 230000001851 biosynthetic effect Effects 0.000 description 3
- 239000006227 byproduct Substances 0.000 description 3
- 210000000170 cell membrane Anatomy 0.000 description 3
- 238000010367 cloning Methods 0.000 description 3
- 230000007423 decrease Effects 0.000 description 3
- 230000001419 dependent effect Effects 0.000 description 3
- 238000011161 development Methods 0.000 description 3
- 239000003814 drug Substances 0.000 description 3
- 238000002474 experimental method Methods 0.000 description 3
- 229960003180 glutathione Drugs 0.000 description 3
- 150000002430 hydrocarbons Chemical group 0.000 description 3
- 229910052739 hydrogen Inorganic materials 0.000 description 3
- 239000001257 hydrogen Substances 0.000 description 3
- 230000001976 improved effect Effects 0.000 description 3
- 238000011534 incubation Methods 0.000 description 3
- 230000005764 inhibitory process Effects 0.000 description 3
- 238000002824 mRNA display Methods 0.000 description 3
- 239000000463 material Substances 0.000 description 3
- 239000012528 membrane Substances 0.000 description 3
- 238000010534 nucleophilic substitution reaction Methods 0.000 description 3
- 239000012044 organic layer Substances 0.000 description 3
- 239000012074 organic phase Substances 0.000 description 3
- 108700010839 phage proteins Proteins 0.000 description 3
- 125000001476 phosphono group Chemical group [H]OP(*)(=O)O[H] 0.000 description 3
- 229920002704 polyhistidine Polymers 0.000 description 3
- 230000001323 posttranslational effect Effects 0.000 description 3
- 230000002028 premature Effects 0.000 description 3
- 238000012545 processing Methods 0.000 description 3
- 238000002818 protein evolution Methods 0.000 description 3
- 230000016434 protein splicing Effects 0.000 description 3
- RXWNCPJZOCPEPQ-NVWDDTSBSA-N puromycin Chemical compound C1=CC(OC)=CC=C1C[C@H](N)C(=O)N[C@H]1[C@@H](O)[C@H](N2C3=NC=NC(=C3N=C2)N(C)C)O[C@@H]1CO RXWNCPJZOCPEPQ-NVWDDTSBSA-N 0.000 description 3
- 230000006798 recombination Effects 0.000 description 3
- 238000005215 recombination Methods 0.000 description 3
- 230000004044 response Effects 0.000 description 3
- 229920006395 saturated elastomer Polymers 0.000 description 3
- 238000002741 site-directed mutagenesis Methods 0.000 description 3
- 238000000527 sonication Methods 0.000 description 3
- 150000007970 thio esters Chemical class 0.000 description 3
- RMVRSNDYEFQCLF-UHFFFAOYSA-N thiophenol Chemical compound SC1=CC=CC=C1 RMVRSNDYEFQCLF-UHFFFAOYSA-N 0.000 description 3
- LENZDBCJOHFCAS-UHFFFAOYSA-N tris Chemical compound OCC(N)(CO)CO LENZDBCJOHFCAS-UHFFFAOYSA-N 0.000 description 3
- DCTNEVSFAOHNIM-ZDUSSCGKSA-N (2S)-3-[4-(2-bromoethoxy)phenyl]-2-[(2-methylpropan-2-yl)oxycarbonylamino]propanoic acid Chemical compound C(=O)(OC(C)(C)C)N[C@@H](CC1=CC=C(C=C1)OCCBr)C(=O)O DCTNEVSFAOHNIM-ZDUSSCGKSA-N 0.000 description 2
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 2
- IZDJJYRXECMSLX-UHFFFAOYSA-N 2-chloro-5-methylpyridine-3-carbaldehyde Chemical compound CC1=CN=C(Cl)C(C=O)=C1 IZDJJYRXECMSLX-UHFFFAOYSA-N 0.000 description 2
- OSWFIVFLDKOXQC-UHFFFAOYSA-N 4-(3-methoxyphenyl)aniline Chemical compound COC1=CC=CC(C=2C=CC(N)=CC=2)=C1 OSWFIVFLDKOXQC-UHFFFAOYSA-N 0.000 description 2
- HBAQYPYDRFILMT-UHFFFAOYSA-N 8-[3-(1-cyclopropylpyrazol-4-yl)-1H-pyrazolo[4,3-d]pyrimidin-5-yl]-3-methyl-3,8-diazabicyclo[3.2.1]octan-2-one Chemical class C1(CC1)N1N=CC(=C1)C1=NNC2=C1N=C(N=C2)N1C2C(N(CC1CC2)C)=O HBAQYPYDRFILMT-UHFFFAOYSA-N 0.000 description 2
- 241000192531 Anabaena sp. Species 0.000 description 2
- 241000193752 Bacillus circulans Species 0.000 description 2
- 241000894006 Bacteria Species 0.000 description 2
- 239000004215 Carbon black (E152) Substances 0.000 description 2
- 241000228124 Desulfitobacterium hafniense Species 0.000 description 2
- 241000196324 Embryophyta Species 0.000 description 2
- 241000701867 Enterobacteria phage T7 Species 0.000 description 2
- 108010043121 Green Fluorescent Proteins Proteins 0.000 description 2
- 102000004144 Green Fluorescent Proteins Human genes 0.000 description 2
- 108010001336 Horseradish Peroxidase Proteins 0.000 description 2
- SRBFZHDQGSBBOR-HWQSCIPKSA-N L-arabinopyranose Chemical compound O[C@H]1COC(O)[C@H](O)[C@H]1O SRBFZHDQGSBBOR-HWQSCIPKSA-N 0.000 description 2
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 2
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 2
- 241000205276 Methanosarcina Species 0.000 description 2
- 241000205275 Methanosarcina barkeri Species 0.000 description 2
- 101001132142 Methanosarcina barkeri Pyrrolysine-tRNA ligase Proteins 0.000 description 2
- 241000187486 Mycobacterium flavescens Species 0.000 description 2
- 241000186362 Mycobacterium leprae Species 0.000 description 2
- 241000329495 Nanoarchaeum equitans Kin4-M Species 0.000 description 2
- 241000894763 Nostoc punctiforme PCC 73102 Species 0.000 description 2
- 102000003992 Peroxidases Human genes 0.000 description 2
- OAICVXFJPJFONN-UHFFFAOYSA-N Phosphorus Chemical compound [P] OAICVXFJPJFONN-UHFFFAOYSA-N 0.000 description 2
- 101710193132 Pre-hexon-linking protein VIII Proteins 0.000 description 2
- 241000530613 Pseudanabaena limnetica Species 0.000 description 2
- 241000205156 Pyrococcus furiosus Species 0.000 description 2
- XUIMIQQOPSSXEZ-UHFFFAOYSA-N Silicon Chemical compound [Si] XUIMIQQOPSSXEZ-UHFFFAOYSA-N 0.000 description 2
- 241000205188 Thermococcus Species 0.000 description 2
- 241000204103 Thermococcus fumicolans Species 0.000 description 2
- 241000204074 Thermococcus hydrothermalis Species 0.000 description 2
- 241001235254 Thermococcus kodakarensis Species 0.000 description 2
- 241000205180 Thermococcus litoralis Species 0.000 description 2
- 241001453191 Thermosynechococcus vulcanus Species 0.000 description 2
- HEDRZPFGACZZDS-MICDWDOJSA-N Trichloro(2H)methane Chemical compound [2H]C(Cl)(Cl)Cl HEDRZPFGACZZDS-MICDWDOJSA-N 0.000 description 2
- 108010028230 Trp-Ser-His-Pro-Gln-Phe-Glu-Lys Proteins 0.000 description 2
- 239000002253 acid Substances 0.000 description 2
- 230000003213 activating effect Effects 0.000 description 2
- 125000002008 alkyl bromide group Chemical group 0.000 description 2
- 239000003242 anti bacterial agent Substances 0.000 description 2
- 239000008346 aqueous phase Substances 0.000 description 2
- SRBFZHDQGSBBOR-UHFFFAOYSA-N beta-D-Pyranose-Lyxose Natural products OC1COC(O)C(O)C1O SRBFZHDQGSBBOR-UHFFFAOYSA-N 0.000 description 2
- 239000012620 biological material Substances 0.000 description 2
- GDTBXPJZTBHREO-UHFFFAOYSA-N bromine Chemical compound BrBr GDTBXPJZTBHREO-UHFFFAOYSA-N 0.000 description 2
- OTJZCIYGRUNXTP-UHFFFAOYSA-N but-3-yn-1-ol Chemical compound OCCC#C OTJZCIYGRUNXTP-UHFFFAOYSA-N 0.000 description 2
- 125000002915 carbonyl group Chemical group [*:2]C([*:1])=O 0.000 description 2
- 125000003178 carboxy group Chemical group [H]OC(*)=O 0.000 description 2
- 238000012219 cassette mutagenesis Methods 0.000 description 2
- 230000001413 cellular effect Effects 0.000 description 2
- 238000005119 centrifugation Methods 0.000 description 2
- 238000012512 characterization method Methods 0.000 description 2
- 125000003636 chemical group Chemical group 0.000 description 2
- 229960005091 chloramphenicol Drugs 0.000 description 2
- WIIZWVCIJKGZOK-RKDXNWHRSA-N chloramphenicol Chemical compound ClC(Cl)C(=O)N[C@H](CO)[C@H](O)C1=CC=C([N+]([O-])=O)C=C1 WIIZWVCIJKGZOK-RKDXNWHRSA-N 0.000 description 2
- 239000013599 cloning vector Substances 0.000 description 2
- 230000000295 complement effect Effects 0.000 description 2
- 238000010276 construction Methods 0.000 description 2
- 238000004132 cross linking Methods 0.000 description 2
- 239000003431 cross linking reagent Substances 0.000 description 2
- 229940079593 drug Drugs 0.000 description 2
- 238000007876 drug discovery Methods 0.000 description 2
- 238000005516 engineering process Methods 0.000 description 2
- 230000002255 enzymatic effect Effects 0.000 description 2
- 125000001033 ether group Chemical group 0.000 description 2
- 125000001495 ethyl group Chemical group [H]C([H])([H])C([H])([H])* 0.000 description 2
- 230000002349 favourable effect Effects 0.000 description 2
- 235000019253 formic acid Nutrition 0.000 description 2
- 230000002538 fungal effect Effects 0.000 description 2
- 230000002068 genetic effect Effects 0.000 description 2
- 239000005090 green fluorescent protein Substances 0.000 description 2
- 239000001963 growth medium Substances 0.000 description 2
- 125000005843 halogen group Chemical group 0.000 description 2
- 125000004404 heteroalkyl group Chemical group 0.000 description 2
- 238000013537 high throughput screening Methods 0.000 description 2
- 238000002744 homologous recombination Methods 0.000 description 2
- 230000006801 homologous recombination Effects 0.000 description 2
- 210000005260 human cell Anatomy 0.000 description 2
- 229930195733 hydrocarbon Natural products 0.000 description 2
- 125000004435 hydrogen atom Chemical group [H]* 0.000 description 2
- 230000002209 hydrophobic effect Effects 0.000 description 2
- 125000002887 hydroxy group Chemical group [H]O* 0.000 description 2
- 230000003100 immobilizing effect Effects 0.000 description 2
- 230000000977 initiatory effect Effects 0.000 description 2
- 238000003780 insertion Methods 0.000 description 2
- 230000037431 insertion Effects 0.000 description 2
- BPHPUYQFMNQIOC-NXRLNHOXSA-N isopropyl beta-D-thiogalactopyranoside Chemical compound CC(C)S[C@@H]1O[C@H](CO)[C@H](O)[C@H](O)[C@H]1O BPHPUYQFMNQIOC-NXRLNHOXSA-N 0.000 description 2
- 150000002611 lead compounds Chemical class 0.000 description 2
- 238000001819 mass spectrum Methods 0.000 description 2
- 239000002207 metabolite Substances 0.000 description 2
- 229930182817 methionine Natural products 0.000 description 2
- 125000002496 methyl group Chemical group [H]C([H])([H])* 0.000 description 2
- 125000004433 nitrogen atom Chemical group N* 0.000 description 2
- 125000000018 nitroso group Chemical group N(=O)* 0.000 description 2
- 230000020477 pH reduction Effects 0.000 description 2
- 108040007629 peroxidase activity proteins Proteins 0.000 description 2
- 230000000144 pharmacologic effect Effects 0.000 description 2
- 239000008363 phosphate buffer Substances 0.000 description 2
- 229910052698 phosphorus Inorganic materials 0.000 description 2
- 239000011574 phosphorus Substances 0.000 description 2
- 238000003752 polymerase chain reaction Methods 0.000 description 2
- 229910000027 potassium carbonate Inorganic materials 0.000 description 2
- 230000017854 proteolysis Effects 0.000 description 2
- 239000012264 purified product Substances 0.000 description 2
- 108040001032 pyrrolysyl-tRNA synthetase activity proteins Proteins 0.000 description 2
- 102000005962 receptors Human genes 0.000 description 2
- 230000001105 regulatory effect Effects 0.000 description 2
- 238000002702 ribosome display Methods 0.000 description 2
- 229910052710 silicon Inorganic materials 0.000 description 2
- 239000010703 silicon Substances 0.000 description 2
- 238000000935 solvent evaporation Methods 0.000 description 2
- 230000002269 spontaneous effect Effects 0.000 description 2
- 238000003756 stirring Methods 0.000 description 2
- 108010018381 streptavidin-binding peptide Proteins 0.000 description 2
- 125000000446 sulfanediyl group Chemical group *S* 0.000 description 2
- 125000000020 sulfo group Chemical group O=S(=O)([*])O[H] 0.000 description 2
- 238000010189 synthetic method Methods 0.000 description 2
- DYHSDKLCOJIUFX-UHFFFAOYSA-N tert-butoxycarbonyl anhydride Chemical compound CC(C)(C)OC(=O)OC(=O)OC(C)(C)C DYHSDKLCOJIUFX-UHFFFAOYSA-N 0.000 description 2
- 238000012360 testing method Methods 0.000 description 2
- 125000000858 thiocyanato group Chemical group *SC#N 0.000 description 2
- 238000012546 transfer Methods 0.000 description 2
- 241000701161 unidentified adenovirus Species 0.000 description 2
- 241001515965 unidentified phage Species 0.000 description 2
- MYPYJXKWCTUITO-UHFFFAOYSA-N vancomycin Natural products O1C(C(=C2)Cl)=CC=C2C(O)C(C(NC(C2=CC(O)=CC(O)=C2C=2C(O)=CC=C3C=2)C(O)=O)=O)NC(=O)C3NC(=O)C2NC(=O)C(CC(N)=O)NC(=O)C(NC(=O)C(CC(C)C)NC)C(O)C(C=C3Cl)=CC=C3OC3=CC2=CC1=C3OC1OC(CO)C(O)C(O)C1OC1CC(C)(N)C(O)C(C)O1 MYPYJXKWCTUITO-UHFFFAOYSA-N 0.000 description 2
- 238000010626 work up procedure Methods 0.000 description 2
- IZRUWBDEUDSRKB-JAMMHHFISA-N (2S)-2,6-diamino-7-(1,3-dibromopropan-2-yloxy)-7-oxoheptanoic acid Chemical compound BrCC(CBr)OC(=O)C(CCC[C@H](N)C(=O)O)N IZRUWBDEUDSRKB-JAMMHHFISA-N 0.000 description 1
- QEPFGWBYJVHJAH-JAMMHHFISA-N (2S)-2,6-diamino-7-(1,3-dichloropropan-2-yloxy)-7-oxoheptanoic acid Chemical compound ClCC(CCl)OC(=O)C(CCC[C@H](N)C(=O)O)N QEPFGWBYJVHJAH-JAMMHHFISA-N 0.000 description 1
- LTEKKDUNDUKWNW-WTIBDHCWSA-N (2S)-2,6-diamino-7-(2,3-dibromopropoxy)-7-oxoheptanoic acid Chemical compound BrC(COC(=O)C(CCC[C@H](N)C(=O)O)N)CBr LTEKKDUNDUKWNW-WTIBDHCWSA-N 0.000 description 1
- RPSGMMIZNFRZEJ-WTIBDHCWSA-N (2S)-2,6-diamino-7-(2,3-dichloropropoxy)-7-oxoheptanoic acid Chemical compound ClC(COC(=O)C(CCC[C@H](N)C(=O)O)N)CCl RPSGMMIZNFRZEJ-WTIBDHCWSA-N 0.000 description 1
- UILSMNBVJDLMCU-AWEZNQCLSA-N (2S)-2-amino-2-[(4-hydroxyphenyl)methyl]-3-[(2-methylpropan-2-yl)oxy]-3-oxopropanoic acid Chemical compound C(=O)(OC(C)(C)C)[C@](N)(CC1=CC=C(C=C1)O)C(=O)O UILSMNBVJDLMCU-AWEZNQCLSA-N 0.000 description 1
- CVEBFGUZRGXBRE-TYAJBCFHSA-N (2S)-2-amino-3-[3,5-bis(1-bromoethyl)phenyl]propanoic acid Chemical compound BrC(C)C=1C=C(C[C@H](N)C(=O)O)C=C(C1)C(C)Br CVEBFGUZRGXBRE-TYAJBCFHSA-N 0.000 description 1
- VCLVBMPJCXKVPZ-LBPRGKRZSA-N (2S)-2-amino-3-[3,5-bis(2-bromoethoxy)phenyl]propanoic acid Chemical compound BrCCOC=1C=C(C[C@H](N)C(=O)O)C=C(C1)OCCBr VCLVBMPJCXKVPZ-LBPRGKRZSA-N 0.000 description 1
- IWTJKTAQHIFDBX-LBPRGKRZSA-N (2S)-2-amino-3-[3,5-bis(2-chloroethoxy)phenyl]propanoic acid Chemical compound ClCCOC=1C=C(C[C@H](N)C(=O)O)C=C(C=1)OCCCl IWTJKTAQHIFDBX-LBPRGKRZSA-N 0.000 description 1
- ROBHKIKZLPORDX-JTQLQIEISA-N (2S)-2-amino-3-[3,5-bis(2-fluoroacetyl)phenyl]propanoic acid Chemical compound FCC(=O)C=1C=C(C[C@H](N)C(=O)O)C=C(C=1)C(CF)=O ROBHKIKZLPORDX-JTQLQIEISA-N 0.000 description 1
- YGJVAJWUXFWUMK-LBPRGKRZSA-N (2S)-2-amino-3-[3,5-bis(aziridin-1-yl)phenyl]propanoic acid Chemical compound N1(CC1)C=1C=C(C[C@H](N)C(=O)O)C=C(C=1)N1CC1 YGJVAJWUXFWUMK-LBPRGKRZSA-N 0.000 description 1
- VHHDYSYCXASWJL-LBPRGKRZSA-N (2S)-2-amino-3-[3,5-bis(prop-2-enoylamino)phenyl]propanoic acid Chemical compound C(C=C)(=O)NC=1C=C(C[C@H](N)C(=O)O)C=C(C=1)NC(C=C)=O VHHDYSYCXASWJL-LBPRGKRZSA-N 0.000 description 1
- BSLSSABEWDQPJQ-JTQLQIEISA-N (2S)-2-amino-3-[3,5-bis[(2-fluoroacetyl)amino]phenyl]propanoic acid Chemical compound FCC(=O)NC=1C=C(C[C@H](N)C(=O)O)C=C(C=1)NC(CF)=O BSLSSABEWDQPJQ-JTQLQIEISA-N 0.000 description 1
- WHTIIYCCZSSFBI-UMJHXOGRSA-N (2S)-2-amino-3-[3-(2,3-dibromopropoxy)phenyl]propanoic acid Chemical compound BrC(COC=1C=C(C[C@H](N)C(=O)O)C=CC1)CBr WHTIIYCCZSSFBI-UMJHXOGRSA-N 0.000 description 1
- BVWLQTIRTYJUEF-UMJHXOGRSA-N (2S)-2-amino-3-[3-(2,3-dichloropropoxy)phenyl]propanoic acid Chemical compound ClC(COC=1C=C(C[C@H](N)C(=O)O)C=CC1)CCl BVWLQTIRTYJUEF-UMJHXOGRSA-N 0.000 description 1
- UXUNCSQNOPYBNA-NSHDSACASA-N (2S)-2-amino-3-[4-(1,3-dibromopropan-2-yloxy)phenyl]propanoic acid Chemical compound BrCC(CBr)OC1=CC=C(C[C@H](N)C(=O)O)C=C1 UXUNCSQNOPYBNA-NSHDSACASA-N 0.000 description 1
- ZTURQERJYGMTAN-NSHDSACASA-N (2S)-2-amino-3-[4-(1,3-dichloropropan-2-yloxy)phenyl]propanoic acid Chemical compound ClCC(CCl)OC1=CC=C(C[C@H](N)C(=O)O)C=C1 ZTURQERJYGMTAN-NSHDSACASA-N 0.000 description 1
- WKTQBSLCOLIVAS-UMJHXOGRSA-N (2S)-2-amino-3-[4-(2,3-dibromopropoxy)phenyl]propanoic acid Chemical compound BrC(COC1=CC=C(C[C@H](N)C(=O)O)C=C1)CBr WKTQBSLCOLIVAS-UMJHXOGRSA-N 0.000 description 1
- OFNRLSUTQATUMT-UMJHXOGRSA-N (2S)-2-amino-3-[4-(2,3-dichloropropoxy)phenyl]propanoic acid Chemical compound ClC(COC1=CC=C(C[C@H](N)C(=O)O)C=C1)CCl OFNRLSUTQATUMT-UMJHXOGRSA-N 0.000 description 1
- SZNVLMZNAXUOOQ-JTQLQIEISA-N (2S)-6-(2-bromoethoxycarbonylamino)-2-[(2-methylpropan-2-yl)oxycarbonylamino]hexanoic acid Chemical compound C(=O)(OC(C)(C)C)N[C@@H](CCCCNC(=O)OCCBr)C(=O)O SZNVLMZNAXUOOQ-JTQLQIEISA-N 0.000 description 1
- JSXMFBNJRFXRCX-NSHDSACASA-N (2s)-2-amino-3-(4-prop-2-ynoxyphenyl)propanoic acid Chemical group OC(=O)[C@@H](N)CC1=CC=C(OCC#C)C=C1 JSXMFBNJRFXRCX-NSHDSACASA-N 0.000 description 1
- YVBRFBNRDRVUAF-ZDUSSCGKSA-N (2s)-3-(4-acetylphenyl)-2-[(2-methylpropan-2-yl)oxycarbonylamino]propanoic acid Chemical compound CC(=O)C1=CC=C(C[C@H](NC(=O)OC(C)(C)C)C(O)=O)C=C1 YVBRFBNRDRVUAF-ZDUSSCGKSA-N 0.000 description 1
- ZXSBHXZKWRIEIA-JTQLQIEISA-N (2s)-3-(4-acetylphenyl)-2-azaniumylpropanoate Chemical compound CC(=O)C1=CC=C(C[C@H](N)C(O)=O)C=C1 ZXSBHXZKWRIEIA-JTQLQIEISA-N 0.000 description 1
- CNBUSIJNWNXLQQ-NSHDSACASA-N (2s)-3-(4-hydroxyphenyl)-2-[(2-methylpropan-2-yl)oxycarbonylamino]propanoic acid Chemical compound CC(C)(C)OC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 CNBUSIJNWNXLQQ-NSHDSACASA-N 0.000 description 1
- APQIUTYORBAGEZ-UHFFFAOYSA-N 1,1-dibromoethane Chemical compound CC(Br)Br APQIUTYORBAGEZ-UHFFFAOYSA-N 0.000 description 1
- FHCLGDLYRUPKAM-UHFFFAOYSA-N 1,2,3-tribromopropane Chemical compound BrCC(Br)CBr FHCLGDLYRUPKAM-UHFFFAOYSA-N 0.000 description 1
- 150000007545 14-membered macrocycles Chemical class 0.000 description 1
- WIIZWVCIJKGZOK-IUCAKERBSA-N 2,2-dichloro-n-[(1s,2s)-1,3-dihydroxy-1-(4-nitrophenyl)propan-2-yl]acetamide Chemical compound ClC(Cl)C(=O)N[C@@H](CO)[C@@H](O)C1=CC=C([N+]([O-])=O)C=C1 WIIZWVCIJKGZOK-IUCAKERBSA-N 0.000 description 1
- OXBLVCZKDOZZOJ-UHFFFAOYSA-N 2,3-Dihydrothiophene Chemical compound C1CC=CS1 OXBLVCZKDOZZOJ-UHFFFAOYSA-N 0.000 description 1
- ZLCKDYMZBZNBMJ-UHFFFAOYSA-N 2-bromoethyl carbonochloridate Chemical compound ClC(=O)OCCBr ZLCKDYMZBZNBMJ-UHFFFAOYSA-N 0.000 description 1
- ABFPKTQEQNICFT-UHFFFAOYSA-M 2-chloro-1-methylpyridin-1-ium;iodide Chemical compound [I-].C[N+]1=CC=CC=C1Cl ABFPKTQEQNICFT-UHFFFAOYSA-M 0.000 description 1
- SVDDJQGVOFZBNX-UHFFFAOYSA-N 2-chloroethyl carbonochloridate Chemical compound ClCCOC(Cl)=O SVDDJQGVOFZBNX-UHFFFAOYSA-N 0.000 description 1
- ZCYVEMRRCGMTRW-UHFFFAOYSA-N 7553-56-2 Chemical group [I] ZCYVEMRRCGMTRW-UHFFFAOYSA-N 0.000 description 1
- 102000002260 Alkaline Phosphatase Human genes 0.000 description 1
- 108020004774 Alkaline Phosphatase Proteins 0.000 description 1
- 101000651036 Arabidopsis thaliana Galactolipid galactosyltransferase SFR2, chloroplastic Proteins 0.000 description 1
- 108010065272 Aspartate-tRNA ligase Proteins 0.000 description 1
- 102100028820 Aspartate-tRNA ligase, cytoplasmic Human genes 0.000 description 1
- 241001225321 Aspergillus fumigatus Species 0.000 description 1
- 241000351920 Aspergillus nidulans Species 0.000 description 1
- 101000901118 Bacillus safensis Pumilarin Proteins 0.000 description 1
- 241000283690 Bos taurus Species 0.000 description 1
- 102000000584 Calmodulin Human genes 0.000 description 1
- 108010041952 Calmodulin Proteins 0.000 description 1
- 241000222178 Candida tropicalis Species 0.000 description 1
- 241000620137 Carboxydothermus hydrogenoformans Species 0.000 description 1
- 101710199022 Chitinase A1 Proteins 0.000 description 1
- 241000195598 Chlamydomonas moewusii Species 0.000 description 1
- KZBUYRJDOAKODT-UHFFFAOYSA-N Chlorine Chemical compound ClCl KZBUYRJDOAKODT-UHFFFAOYSA-N 0.000 description 1
- 201000007336 Cryptococcosis Diseases 0.000 description 1
- 241000221204 Cryptococcus neoformans Species 0.000 description 1
- 229930105110 Cyclosporin A Natural products 0.000 description 1
- PMATZTZNYRCHOR-CGLBZJNRSA-N Cyclosporin A Chemical compound CC[C@@H]1NC(=O)[C@H]([C@H](O)[C@H](C)C\C=C\C)N(C)C(=O)[C@H](C(C)C)N(C)C(=O)[C@H](CC(C)C)N(C)C(=O)[C@H](CC(C)C)N(C)C(=O)[C@@H](C)NC(=O)[C@H](C)NC(=O)[C@H](CC(C)C)N(C)C(=O)[C@H](C(C)C)NC(=O)[C@H](CC(C)C)N(C)C(=O)CN(C)C1=O PMATZTZNYRCHOR-CGLBZJNRSA-N 0.000 description 1
- 108010036949 Cyclosporine Proteins 0.000 description 1
- 239000003155 DNA primer Substances 0.000 description 1
- 238000001712 DNA sequencing Methods 0.000 description 1
- 102000052510 DNA-Binding Proteins Human genes 0.000 description 1
- 101710096438 DNA-binding protein Proteins 0.000 description 1
- 241000702421 Dependoparvovirus Species 0.000 description 1
- YZCKVEUIGOORGS-OUBTZVSYSA-N Deuterium Chemical group [2H] YZCKVEUIGOORGS-OUBTZVSYSA-N 0.000 description 1
- OALVLUFFPXEHFO-UHFFFAOYSA-N Diazonamide A Natural products O1C=2C34C(O)OC5=C3C=CC=C5C(C3=5)=CC=CC=5NC(Cl)=C3C(=C(N=3)Cl)OC=3C=2N=C1C(C(C)C)NC(=O)C(NC(=O)C(N)C(C)C)CC1=CC=C(O)C4=C1 OALVLUFFPXEHFO-UHFFFAOYSA-N 0.000 description 1
- XEUXVSVRYSKWPZ-UHFFFAOYSA-N Dolastatin 3 Natural products C1NC(=O)C(N=2)=CSC=2C(CCC(N)=O)NC(=O)C(C(C)C)NC(=O)C(CC(C)C)NC(=O)C2CCCN2C(=O)C2=CSC1=N2 XEUXVSVRYSKWPZ-UHFFFAOYSA-N 0.000 description 1
- 101710082707 Endochitinase A1 Proteins 0.000 description 1
- 241000701959 Escherichia virus Lambda Species 0.000 description 1
- 241000702192 Escherichia virus P2 Species 0.000 description 1
- VGGSQFUCUMXWEO-UHFFFAOYSA-N Ethene Chemical group C=C VGGSQFUCUMXWEO-UHFFFAOYSA-N 0.000 description 1
- 239000005977 Ethylene Chemical group 0.000 description 1
- 241000206602 Eukaryota Species 0.000 description 1
- XZWYTXMRWQJBGX-VXBMVYAYSA-N FLAG peptide Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(O)=O)CC1=CC=C(O)C=C1 XZWYTXMRWQJBGX-VXBMVYAYSA-N 0.000 description 1
- PXGOKWXKJXAPGV-UHFFFAOYSA-N Fluorine Chemical compound FF PXGOKWXKJXAPGV-UHFFFAOYSA-N 0.000 description 1
- 241000700662 Fowlpox virus Species 0.000 description 1
- 241000193385 Geobacillus stearothermophilus Species 0.000 description 1
- 102100024977 Glutamine-tRNA ligase Human genes 0.000 description 1
- 108010070675 Glutathione transferase Proteins 0.000 description 1
- 241000204942 Halobacterium sp. Species 0.000 description 1
- 102100029100 Hematopoietic prostaglandin D synthase Human genes 0.000 description 1
- 108010093488 His-His-His-His-His-His Proteins 0.000 description 1
- 241000228404 Histoplasma capsulatum Species 0.000 description 1
- UFHFLCQGNIYNRP-UHFFFAOYSA-N Hydrogen Chemical compound [H][H] UFHFLCQGNIYNRP-UHFFFAOYSA-N 0.000 description 1
- 241000748655 Invertebrate iridescent virus 6 Species 0.000 description 1
- 241000235058 Komagataella pastoris Species 0.000 description 1
- 108060001084 Luciferase Proteins 0.000 description 1
- 239000005089 Luciferase Substances 0.000 description 1
- 239000006137 Luria-Bertani broth Substances 0.000 description 1
- XOMHYSFBTSQDMS-UHFFFAOYSA-N Luzopeptin E2 Natural products OC1=CC2=CC(OC)=CC=C2N=C1C(=O)NC(C(N1N=CCCC1C(=O)NCC(=O)N(C)CC(=O)N(C)C(C(=O)OC1)C(C)(C)O)=O)COC(=O)C(C(C)(C)O)N(C)C(=O)CN(C)C(=O)CNC(=O)C2CCC=NN2C(=O)C1NC(=O)C1=NC2=CC=C(OC)C=C2C=C1O XOMHYSFBTSQDMS-UHFFFAOYSA-N 0.000 description 1
- PEEHTFAAVSWFBL-UHFFFAOYSA-N Maleimide Chemical compound O=C1NC(=O)C=C1 PEEHTFAAVSWFBL-UHFFFAOYSA-N 0.000 description 1
- 101710175625 Maltose/maltodextrin-binding periplasmic protein Proteins 0.000 description 1
- 108010052285 Membrane Proteins Proteins 0.000 description 1
- 102000018697 Membrane Proteins Human genes 0.000 description 1
- XOGTZOOQQBDUSI-UHFFFAOYSA-M Mesna Chemical compound [Na+].[O-]S(=O)(=O)CCS XOGTZOOQQBDUSI-UHFFFAOYSA-M 0.000 description 1
- 241001302042 Methanothermobacter thermautotrophicus Species 0.000 description 1
- 238000006845 Michael addition reaction Methods 0.000 description 1
- 108010021466 Mutant Proteins Proteins 0.000 description 1
- 102000008300 Mutant Proteins Human genes 0.000 description 1
- 101710135898 Myc proto-oncogene protein Proteins 0.000 description 1
- 102100038895 Myc proto-oncogene protein Human genes 0.000 description 1
- 241000187484 Mycobacterium gordonae Species 0.000 description 1
- 241000186363 Mycobacterium kansasii Species 0.000 description 1
- 241000187493 Mycobacterium malmoense Species 0.000 description 1
- 241000187494 Mycobacterium xenopi Species 0.000 description 1
- 101100506065 Mycobacterium xenopi gyrA gene Proteins 0.000 description 1
- GRYLNZFGIOXLOG-UHFFFAOYSA-N Nitric acid Chemical compound O[N+]([O-])=O GRYLNZFGIOXLOG-UHFFFAOYSA-N 0.000 description 1
- 102100023050 Nuclear factor NF-kappa-B p105 subunit Human genes 0.000 description 1
- 108091034117 Oligonucleotide Proteins 0.000 description 1
- 108700028353 OmpC Proteins 0.000 description 1
- 229910020667 PBr3 Inorganic materials 0.000 description 1
- 238000012408 PCR amplification Methods 0.000 description 1
- 241000228150 Penicillium chrysogenum Species 0.000 description 1
- 241001123663 Penicillium expansum Species 0.000 description 1
- 241001149509 Penicillium vulpinum Species 0.000 description 1
- 108091000080 Phosphotransferase Proteins 0.000 description 1
- 101800001494 Protease 2A Proteins 0.000 description 1
- 101800001066 Protein 2A Proteins 0.000 description 1
- 241001467519 Pyrococcus sp. Species 0.000 description 1
- 101710090029 Replication-associated protein A Proteins 0.000 description 1
- 241001148570 Rhodothermus marinus Species 0.000 description 1
- 101100545228 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) ZDS1 gene Proteins 0.000 description 1
- BUGBHKTXTAQXES-UHFFFAOYSA-N Selenium Chemical compound [Se] BUGBHKTXTAQXES-UHFFFAOYSA-N 0.000 description 1
- 101710123496 Spindolin Proteins 0.000 description 1
- 241000187747 Streptomyces Species 0.000 description 1
- PZBFGYYEXUXCOF-UHFFFAOYSA-N TCEP Chemical compound OC(=O)CCP(CCC(O)=O)CCC(O)=O PZBFGYYEXUXCOF-UHFFFAOYSA-N 0.000 description 1
- 101500013794 Thermococcus aggregans Tag pol-1 intein Proteins 0.000 description 1
- 241001127160 Thermococcus marinus Species 0.000 description 1
- 241000482676 Thermococcus thioreducens Species 0.000 description 1
- 241000204673 Thermoplasma acidophilum Species 0.000 description 1
- 101710150448 Transcriptional regulator Myc Proteins 0.000 description 1
- YZCKVEUIGOORGS-NJFSPNSNSA-N Tritium Chemical group [3H] YZCKVEUIGOORGS-NJFSPNSNSA-N 0.000 description 1
- 206010046865 Vaccinia virus infection Diseases 0.000 description 1
- 238000005411 Van der Waals force Methods 0.000 description 1
- 108010059993 Vancomycin Proteins 0.000 description 1
- 108020005202 Viral DNA Proteins 0.000 description 1
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 1
- 238000009825 accumulation Methods 0.000 description 1
- 101150063416 add gene Proteins 0.000 description 1
- 238000007259 addition reaction Methods 0.000 description 1
- 125000003282 alkyl amino group Chemical group 0.000 description 1
- 125000001930 alkyl chloride group Chemical group 0.000 description 1
- 150000001356 alkyl thiols Chemical class 0.000 description 1
- HSFWRNGVRCDJHI-UHFFFAOYSA-N alpha-acetylene Natural products C#C HSFWRNGVRCDJHI-UHFFFAOYSA-N 0.000 description 1
- 230000004075 alteration Effects 0.000 description 1
- 210000004102 animal cell Anatomy 0.000 description 1
- 230000036436 anti-hiv Effects 0.000 description 1
- 230000000845 anti-microbial effect Effects 0.000 description 1
- 230000000259 anti-tumor effect Effects 0.000 description 1
- 229940088710 antibiotic agent Drugs 0.000 description 1
- 239000012062 aqueous buffer Substances 0.000 description 1
- 125000006615 aromatic heterocyclic group Chemical group 0.000 description 1
- 125000002102 aryl alkyloxo group Chemical group 0.000 description 1
- 125000001769 aryl amino group Chemical group 0.000 description 1
- 150000001504 aryl thiols Chemical class 0.000 description 1
- 229940091771 aspergillus fumigatus Drugs 0.000 description 1
- OHDRQQURAXLVGJ-HLVWOLMTSA-N azane;(2e)-3-ethyl-2-[(e)-(3-ethyl-6-sulfo-1,3-benzothiazol-2-ylidene)hydrazinylidene]-1,3-benzothiazole-6-sulfonic acid Chemical compound [NH4+].[NH4+].S/1C2=CC(S([O-])(=O)=O)=CC=C2N(CC)C\1=N/N=C1/SC2=CC(S([O-])(=O)=O)=CC=C2N1CC OHDRQQURAXLVGJ-HLVWOLMTSA-N 0.000 description 1
- 238000002819 bacterial display Methods 0.000 description 1
- 125000000499 benzofuranyl group Chemical group O1C(=CC2=C1C=CC=C2)* 0.000 description 1
- 125000004541 benzoxazolyl group Chemical group O1C(=NC2=C1C=CC=C2)* 0.000 description 1
- 125000001743 benzylic group Chemical group 0.000 description 1
- 239000011230 binding agent Substances 0.000 description 1
- 238000004166 bioassay Methods 0.000 description 1
- 230000003115 biocidal effect Effects 0.000 description 1
- 230000004071 biological effect Effects 0.000 description 1
- 238000010170 biological method Methods 0.000 description 1
- 238000010504 bond cleavage reaction Methods 0.000 description 1
- 125000004369 butenyl group Chemical group C(=CCC)* 0.000 description 1
- 125000000484 butyl group Chemical group [H]C([*])([H])C([H])([H])C([H])([H])C([H])([H])[H] 0.000 description 1
- 125000000480 butynyl group Chemical group [*]C#CC([H])([H])C([H])([H])[H] 0.000 description 1
- 150000001732 carboxylic acid derivatives Chemical group 0.000 description 1
- 239000003054 catalyst Substances 0.000 description 1
- 238000004113 cell culture Methods 0.000 description 1
- 230000003915 cell function Effects 0.000 description 1
- 210000004671 cell-free system Anatomy 0.000 description 1
- 230000033077 cellular process Effects 0.000 description 1
- 210000003850 cellular structure Anatomy 0.000 description 1
- 239000001913 cellulose Substances 0.000 description 1
- 229920002678 cellulose Polymers 0.000 description 1
- 238000007385 chemical modification Methods 0.000 description 1
- 230000002759 chromosomal effect Effects 0.000 description 1
- 229960001265 ciclosporin Drugs 0.000 description 1
- 230000004186 co-expression Effects 0.000 description 1
- 238000007398 colorimetric assay Methods 0.000 description 1
- 238000012875 competitive assay Methods 0.000 description 1
- 238000003271 compound fluorescence assay Methods 0.000 description 1
- 239000000470 constituent Substances 0.000 description 1
- 230000001276 controlling effect Effects 0.000 description 1
- 230000002596 correlated effect Effects 0.000 description 1
- 239000013078 crystal Substances 0.000 description 1
- 125000001651 cyanato group Chemical group [*]OC#N 0.000 description 1
- 125000000392 cycloalkenyl group Chemical group 0.000 description 1
- 125000001995 cyclobutyl group Chemical group [H]C1([H])C([H])([H])C([H])(*)C1([H])[H] 0.000 description 1
- 125000001511 cyclopentyl group Chemical group [H]C1([H])C([H])([H])C([H])([H])C([H])(*)C1([H])[H] 0.000 description 1
- 125000001559 cyclopropyl group Chemical group [H]C1([H])C([H])([H])C1([H])* 0.000 description 1
- 125000003493 decenyl group Chemical group [H]C([*])=C([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])[H] 0.000 description 1
- 125000002704 decyl group Chemical group [H]C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])* 0.000 description 1
- 230000037430 deletion Effects 0.000 description 1
- 238000012217 deletion Methods 0.000 description 1
- 238000010511 deprotection reaction Methods 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 238000001514 detection method Methods 0.000 description 1
- 229910052805 deuterium Inorganic materials 0.000 description 1
- YKBUODYYSZSEIY-PLSHLZFXSA-N diazonamide a Chemical compound N([C@H]([C@]12C=3O4)O5)C6=C2C=CC=C6C(C2=6)=CC=CC=6NC(Cl)=C2C(=C(N=2)Cl)OC=2C=3N=C4[C@H](C(C)C)NC(=O)[C@@H](NC(=O)[C@@H](O)C(C)C)CC2=CC=C5C1=C2 YKBUODYYSZSEIY-PLSHLZFXSA-N 0.000 description 1
- 235000014113 dietary fatty acids Nutrition 0.000 description 1
- 239000000539 dimer Substances 0.000 description 1
- 201000010099 disease Diseases 0.000 description 1
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 1
- VHJLVAABSRFDPM-QWWZWVQMSA-N dithiothreitol Chemical compound SC[C@@H](O)[C@H](O)CS VHJLVAABSRFDPM-QWWZWVQMSA-N 0.000 description 1
- ATCVYMAKQRUVDS-OSAZLGQLSA-N dolastatin 3 Chemical compound C1NC(=O)C(N=2)=CSC=2[C@H](CCC(N)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]2CCCN2C(=O)[C@H](C(C)C)NC(=O)C2=CSC1=N2 ATCVYMAKQRUVDS-OSAZLGQLSA-N 0.000 description 1
- 108010027025 dolastatin 3 Proteins 0.000 description 1
- 231100000673 dose–response relationship Toxicity 0.000 description 1
- 238000009510 drug design Methods 0.000 description 1
- 230000009881 electrostatic interaction Effects 0.000 description 1
- 230000009144 enzymatic modification Effects 0.000 description 1
- 238000006911 enzymatic reaction Methods 0.000 description 1
- 230000032050 esterification Effects 0.000 description 1
- 238000005886 esterification reaction Methods 0.000 description 1
- 150000002148 esters Chemical group 0.000 description 1
- 125000002534 ethynyl group Chemical group [H]C#C* 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 229930195729 fatty acid Natural products 0.000 description 1
- 239000000194 fatty acid Substances 0.000 description 1
- 150000004665 fatty acids Chemical class 0.000 description 1
- 239000011737 fluorine Substances 0.000 description 1
- 125000002485 formyl group Chemical group [H]C(*)=O 0.000 description 1
- 239000012634 fragment Substances 0.000 description 1
- 230000006870 function Effects 0.000 description 1
- 125000002541 furyl group Chemical group 0.000 description 1
- 108010051239 glutaminyl-tRNA synthetase Proteins 0.000 description 1
- 125000001475 halogen functional group Chemical group 0.000 description 1
- 238000004128 high performance liquid chromatography Methods 0.000 description 1
- 125000002883 imidazolyl group Chemical group 0.000 description 1
- 229960003444 immunosuppressant agent Drugs 0.000 description 1
- 230000001861 immunosuppressant effect Effects 0.000 description 1
- 239000003018 immunosuppressive agent Substances 0.000 description 1
- 125000003406 indolizinyl group Chemical group C=1(C=CN2C=CC=CC12)* 0.000 description 1
- 125000001041 indolyl group Chemical group 0.000 description 1
- 239000003112 inhibitor Substances 0.000 description 1
- 230000002401 inhibitory effect Effects 0.000 description 1
- 230000010354 integration Effects 0.000 description 1
- PGLTVOMIXTUURA-UHFFFAOYSA-N iodoacetamide Chemical compound NC(=O)CI PGLTVOMIXTUURA-UHFFFAOYSA-N 0.000 description 1
- 238000005040 ion trap Methods 0.000 description 1
- 125000000959 isobutyl group Chemical group [H]C([H])([H])C([H])(C([H])([H])[H])C([H])([H])* 0.000 description 1
- 125000000904 isoindolyl group Chemical group C=1(NC=C2C=CC=CC12)* 0.000 description 1
- 125000000555 isopropenyl group Chemical group [H]\C([H])=C(\*)C([H])([H])[H] 0.000 description 1
- 125000001449 isopropyl group Chemical group [H]C([H])([H])C([H])(*)C([H])([H])[H] 0.000 description 1
- 125000002183 isoquinolinyl group Chemical group C1(=NC=CC2=CC=CC=C12)* 0.000 description 1
- 230000000155 isotopic effect Effects 0.000 description 1
- 239000010410 layer Substances 0.000 description 1
- 230000014759 maintenance of location Effects 0.000 description 1
- 239000003550 marker Substances 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 239000002609 medium Substances 0.000 description 1
- 108020004999 messenger RNA Proteins 0.000 description 1
- 229910021645 metal ion Inorganic materials 0.000 description 1
- RIFHJAODNHLCBH-UHFFFAOYSA-N methanethione Chemical group S=[CH] RIFHJAODNHLCBH-UHFFFAOYSA-N 0.000 description 1
- 108010005942 methionylglycine Proteins 0.000 description 1
- 125000000325 methylidene group Chemical group [H]C([H])=* 0.000 description 1
- 238000010369 molecular cloning Methods 0.000 description 1
- 239000002062 molecular scaffold Substances 0.000 description 1
- 125000002757 morpholinyl group Chemical group 0.000 description 1
- 125000004370 n-butenyl group Chemical group [H]\C([H])=C(/[H])C([H])([H])C([H])([H])* 0.000 description 1
- 125000004108 n-butyl group Chemical group [H]C([H])([H])C([H])([H])C([H])([H])C([H])([H])* 0.000 description 1
- 125000004123 n-propyl group Chemical group [H]C([H])([H])C([H])([H])C([H])([H])* 0.000 description 1
- 239000013642 negative control Substances 0.000 description 1
- 230000007935 neutral effect Effects 0.000 description 1
- 239000002547 new drug Substances 0.000 description 1
- 229910017604 nitric acid Inorganic materials 0.000 description 1
- 239000012038 nucleophile Substances 0.000 description 1
- 125000004365 octenyl group Chemical group C(=CCCCCCC)* 0.000 description 1
- 125000002347 octyl group Chemical group [H]C([*])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])[H] 0.000 description 1
- 125000002971 oxazolyl group Chemical group 0.000 description 1
- 230000003647 oxidation Effects 0.000 description 1
- 238000007254 oxidation reaction Methods 0.000 description 1
- 230000036961 partial effect Effects 0.000 description 1
- 239000008188 pellet Substances 0.000 description 1
- 125000002255 pentenyl group Chemical group C(=CCCC)* 0.000 description 1
- 125000001147 pentyl group Chemical group C(CCCC)* 0.000 description 1
- 125000005981 pentynyl group Chemical group 0.000 description 1
- 230000035699 permeability Effects 0.000 description 1
- 239000002831 pharmacologic agent Substances 0.000 description 1
- IPNPIHIZVLFAFP-UHFFFAOYSA-N phosphorus tribromide Chemical compound BrP(Br)Br IPNPIHIZVLFAFP-UHFFFAOYSA-N 0.000 description 1
- 102000020233 phosphotransferase Human genes 0.000 description 1
- 230000004962 physiological condition Effects 0.000 description 1
- 125000004193 piperazinyl group Chemical group 0.000 description 1
- 125000003386 piperidinyl group Chemical group 0.000 description 1
- 239000013600 plasmid vector Substances 0.000 description 1
- 229920000724 poly(L-arginine) polymer Polymers 0.000 description 1
- 108010011110 polyarginine Proteins 0.000 description 1
- 230000004481 post-translational protein modification Effects 0.000 description 1
- 239000008057 potassium phosphate buffer Substances 0.000 description 1
- 239000013615 primer Substances 0.000 description 1
- 125000004368 propenyl group Chemical group C(=CC)* 0.000 description 1
- 125000001436 propyl group Chemical group [H]C([*])([H])C([H])([H])C([H])([H])[H] 0.000 description 1
- 125000002568 propynyl group Chemical group [*]C#CC([H])([H])[H] 0.000 description 1
- 102000021127 protein binding proteins Human genes 0.000 description 1
- 108091011138 protein binding proteins Proteins 0.000 description 1
- 230000004850 protein–protein interaction Effects 0.000 description 1
- 208000009305 pseudorabies Diseases 0.000 description 1
- 125000000561 purinyl group Chemical group N1=C(N=C2N=CNC2=C1)* 0.000 description 1
- 229950010131 puromycin Drugs 0.000 description 1
- 125000004309 pyranyl group Chemical group O1C(C=CC=C1)* 0.000 description 1
- 125000003373 pyrazinyl group Chemical group 0.000 description 1
- 125000005494 pyridonyl group Chemical group 0.000 description 1
- 125000004076 pyridyl group Chemical group 0.000 description 1
- 125000000714 pyrimidinyl group Chemical group 0.000 description 1
- 125000000719 pyrrolidinyl group Chemical group 0.000 description 1
- 125000000168 pyrrolyl group Chemical group 0.000 description 1
- 238000011002 quantification Methods 0.000 description 1
- 125000002943 quinolinyl group Chemical group N1=C(C=CC2=CC=CC=C12)* 0.000 description 1
- 150000003254 radicals Chemical class 0.000 description 1
- 238000002708 random mutagenesis Methods 0.000 description 1
- 230000008707 rearrangement Effects 0.000 description 1
- 238000003259 recombinant expression Methods 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 239000012508 resin bead Substances 0.000 description 1
- 230000002441 reversible effect Effects 0.000 description 1
- 238000012552 review Methods 0.000 description 1
- 238000006798 ring closing metathesis reaction Methods 0.000 description 1
- 229930195734 saturated hydrocarbon Natural products 0.000 description 1
- 239000006152 selective media Substances 0.000 description 1
- 229910052711 selenium Inorganic materials 0.000 description 1
- 239000011669 selenium Substances 0.000 description 1
- 238000012163 sequencing technique Methods 0.000 description 1
- 238000007086 side reaction Methods 0.000 description 1
- 239000012279 sodium borohydride Substances 0.000 description 1
- 229910000033 sodium borohydride Inorganic materials 0.000 description 1
- KIEOKOFEPABQKJ-UHFFFAOYSA-N sodium dichromate Chemical compound [Na+].[Na+].[O-][Cr](=O)(=O)O[Cr]([O-])(=O)=O KIEOKOFEPABQKJ-UHFFFAOYSA-N 0.000 description 1
- JQWHASGSAFIOCM-UHFFFAOYSA-M sodium periodate Chemical compound [Na+].[O-]I(=O)(=O)=O JQWHASGSAFIOCM-UHFFFAOYSA-M 0.000 description 1
- 238000010532 solid phase synthesis reaction Methods 0.000 description 1
- 125000006850 spacer group Chemical group 0.000 description 1
- 241000894007 species Species 0.000 description 1
- 239000007858 starting material Substances 0.000 description 1
- 125000002653 sulfanylmethyl group Chemical group [H]SC([H])([H])[*] 0.000 description 1
- 239000003774 sulfhydryl reagent Substances 0.000 description 1
- 229910052717 sulfur Inorganic materials 0.000 description 1
- 239000011593 sulfur Substances 0.000 description 1
- 230000008685 targeting Effects 0.000 description 1
- 125000000999 tert-butyl group Chemical group [H]C([H])([H])C(*)(C([H])([H])[H])C([H])([H])[H] 0.000 description 1
- 125000003718 tetrahydrofuranyl group Chemical group 0.000 description 1
- 125000003039 tetrahydroisoquinolinyl group Chemical group C1(NCCC2=CC=CC=C12)* 0.000 description 1
- 229940124597 therapeutic agent Drugs 0.000 description 1
- 125000000335 thiazolyl group Chemical group 0.000 description 1
- 125000001544 thienyl group Chemical group 0.000 description 1
- 125000002813 thiocarbonyl group Chemical group *C(*)=S 0.000 description 1
- 125000004568 thiomorpholinyl group Chemical group 0.000 description 1
- 125000000341 threoninyl group Chemical group [H]OC([H])(C([H])([H])[H])C([H])(N([H])[H])C(*)=O 0.000 description 1
- 125000003944 tolyl group Chemical group 0.000 description 1
- 238000007056 transamidation reaction Methods 0.000 description 1
- 125000001425 triazolyl group Chemical group 0.000 description 1
- 229910052722 tritium Chemical group 0.000 description 1
- 150000003667 tyrosine derivatives Chemical class 0.000 description 1
- 241000701447 unidentified baculovirus Species 0.000 description 1
- 241001430294 unidentified retrovirus Species 0.000 description 1
- 208000007089 vaccinia Diseases 0.000 description 1
- MYPYJXKWCTUITO-LYRMYLQWSA-N vancomycin Chemical compound O([C@@H]1[C@@H](O)[C@H](O)[C@@H](CO)O[C@H]1OC1=C2C=C3C=C1OC1=CC=C(C=C1Cl)[C@@H](O)[C@H](C(N[C@@H](CC(N)=O)C(=O)N[C@H]3C(=O)N[C@H]1C(=O)N[C@H](C(N[C@@H](C3=CC(O)=CC(O)=C3C=3C(O)=CC=C1C=3)C(O)=O)=O)[C@H](O)C1=CC=C(C(=C1)Cl)O2)=O)NC(=O)[C@@H](CC(C)C)NC)[C@H]1C[C@](C)(N)[C@H](O)[C@H](C)O1 MYPYJXKWCTUITO-LYRMYLQWSA-N 0.000 description 1
- 229960003165 vancomycin Drugs 0.000 description 1
- 125000000391 vinyl group Chemical group [H]C([*])=C([H])[H] 0.000 description 1
- 238000005406 washing Methods 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K7/00—Peptides having 5 to 20 amino acids in a fully defined sequence; Derivatives thereof
- C07K7/50—Cyclic peptides containing at least one abnormal peptide link
- C07K7/54—Cyclic peptides containing at least one abnormal peptide link with at least one abnormal peptide link in the ring
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P21/00—Preparation of peptides or proteins
- C12P21/02—Preparation of peptides or proteins having a known sequence of two or more amino acids, e.g. glutathione
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K7/00—Peptides having 5 to 20 amino acids in a fully defined sequence; Derivatives thereof
- C07K7/04—Linear peptides containing only normal peptide links
-
- C—CHEMISTRY; METALLURGY
- C40—COMBINATORIAL TECHNOLOGY
- C40B—COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
- C40B50/00—Methods of creating libraries, e.g. combinatorial synthesis
- C40B50/06—Biochemical methods, e.g. using enzymes or whole viable microorganisms
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K1/00—General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length
- C07K1/107—General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length by chemical modification of precursor peptides
- C07K1/113—General methods for the preparation of peptides, i.e. processes for the organic chemical preparation of peptides or proteins of any length by chemical modification of precursor peptides without change of the primary structure
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/90—Fusion polypeptide containing a motif for post-translational modification
- C07K2319/92—Fusion polypeptide containing a motif for post-translational modification containing an intein ("protein splicing")domain
Definitions
- the present invention relates to methods and compositions for generating macrocyclic peptides from genetically encoded, ribosomally produced polypeptide precursors.
- the invention also relates to a recombinant host cell comprising an artificial nucleic acid encoding for a polypeptide.
- Peptide molecules represent valuable tools for investigating biological systems, studying the binding and activity properties of biomolecules (e.g., enzymes, cell receptors, antibodies, kinases), exploring the etiopathological causes of diseases, and validating pharmacological targets. Peptides are also attractive ligands for targeting protein-protein interactions and modulating the function of biological molecules such as enzymes and nucleic acids.
- the synthesis of combinatorial libraries of small peptides followed by screening of these chemical libraries in biological assays can enable the identification of compounds that exhibit a variety of biological and pharmacological properties. Bioactive peptides identified in this manner can constitute valuable lead compounds or facilitate the development of lead compounds towards the discovery of new drugs.
- While many peptides exhibit interesting biological activity, linear peptides do not generally represent suitable pharmacological agents, as they are generally only poorly absorbed, do not cross biological membranes readily, and are prone to proteolytic degradation. In addition, linear peptides fail to bind proteins that recognize discontinuous epitopes.
- molecular constraints that restrict the conformational freedom of the molecule backbone can be used to overcome these limitations. In many cases, conformationally constrained peptides exhibit enhanced enzymatic stability (Fairlie, Tyndall et al. 2000; Wang, Liao et al. 2005), membrane permeability (Walensky, Kung et al. 2004; Rezai, Bock et al.
- bioactive and therapeutically relevant peptides isolated from natural sources occur indeed in cyclized form or contain intramolecular bridges that reduce the conformational flexibility of these molecules (e.g., immunosuppressant cyclosporin A, antitumor dolastatin 3 and diazonamide A, anti-HIV luzopeptin E2, and the antimicrobial vancomycin).
- Because macrocyclic peptides constitute promising molecular scaffolds for the development of bioactive compounds and therapeutic agents (Katsara, Tselios et al. 2006; Driggers, Hale et al. 2008; Obrecht, Robinson et al. 2009; Marsault and Peterson 2011), methods for generating macrocyclic peptides and combinatorial libraries thereof are of high synthetic value and practical utility, in particular in the context of drug discovery.
- Although cyclic peptides can be prepared synthetically via a variety of known methods (White and Yudin 2011), the possibility to generate macrocyclic peptides starting from genetically encoded polypeptide precursors offers several advantages (Frost, Smith et al. 2013; Smith, Frost et al. 2013).
- ribosomally produced peptides have also been constrained through the use of cysteine- or amine-reactive cross-linking agents (Millward, Takahashi et al. 2005; Seebeck and Szostak 2006; Heinis, Rutherford et al. 2009; Schlippe, Hartman et al. 2012).
- a drawback of these methods is the risk of producing multiple undesired products via reaction of the cross-linking agents with multiple sites within the randomized peptide sequence or the carrier protein in a display system.
- these methods do not allow for the formation of macrocyclic peptides inside the polypeptide-producing cell host.
- Efficient and versatile methods for generating macrocyclic peptides from ribosomally produced polypeptides would thus be highly desirable in the art.
- the methods and compositions described herein provide a solution to this need, enabling the ribosomal synthesis of cyclic peptides in vitro (i.e., in a cell-free system) and in vivo (i.e., inside a cell or on a surface of a cell) and in various 'configurations', namely in the form of macrocyclic peptides, lariat-shaped peptides, or cyclic peptides fused to an N-terminus or C-terminus of a protein of interest, such as a carrier protein of a display system.
- a method for making a macrocyclic peptide comprising:
- Z is an amino acid of structure (IV) and Y is a linker group selected from the group consisting of C1-C24 alkyl, C1-C24 substituted alkyl, C1-C24 substituted heteroatom-containing alkyl, C1-C24 substituted heteroatom-containing alkyl, C2-C24 alkenyl, C2-C24 substituted alkenyl, C2-C24 substituted heteroatom-containing alkenyl, C2-C24 substituted heteroatom-containing alkenyl, C5-C24 aryl, C5-C24 substituted aryl, C5-C24 substituted heteroatom-containing aryl, C5-C24 substituted heteroatom-containing aryl, C1-C24 alkoxy, and C5-C24 aryloxy groups.
- Y is a linker group selected from the group consisting of -CH2-C6H4-, -CH2-C6H4-O-, -CH2-C6H4-NH-, -(CH2)4-(CH2)4NH-, -(CH2)4NHC(O)-, and -(CH2)4NHC(O)O-.
- the amino acid Z is selected from the group consisting of 4-(2-bromoethoxy)-phenylalanine, 3-(2-bromoethoxy)-phenylalanine, 4-(2-chloroethoxy)-phenylalanine, 3-(2-chloroethoxy)-phenylalanine, 4-(1-bromoethyl)-phenylalanine, 3-(1-bromoethyl)-phenylalanine, 4-(aziridin-1-yl)-phenylalanine, 3-(aziridin-1-yl)-phenylalanine, 4-acrylamido-phenylalanine, 3-acrylamido-phenylalanine, 4-(2-fluoro-acetamido)-phenylalanine, 3-(2-fluoro-acetamido)-phenylalanine, 4-(2-chloro-acetamido)-phenylalanine, 3-(2-chloro-acetamido)-pheny
- Y is a linker group selected from the group consisting of -CH2-C6H4-, -CH2-C6H4-O-, -CH2-C6H4-NH-, -CH2-C6H4-OCH2-, -(CH2)4NH-, -(CH2)4NHC(O)-, -(CH2)4NHC(O)O-, -(CH2)4NHC(O)OCH2-,
- the codon encoding for Z is an amber stop codon TAG, an ochre stop codon TAA, an opal stop codon TGA, or a four base codon.
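As a minimal illustration of the codon choices listed above, the following Python sketch scans a DNA open reading frame for in-frame amber (TAG), ochre (TAA), and opal (TGA) codons, that is, the positions at which a reassigned stop codon could direct incorporation of Z. The function name, the example sequence, and the zero-based codon indexing are assumptions made for illustration only and do not come from the patent; four-base codons are not handled.

```python
# Illustrative sketch only: function name and example ORF are hypothetical.
STOP_CODONS = {"TAG": "amber", "TAA": "ochre", "TGA": "opal"}


def find_stop_codons(orf: str) -> list[tuple[int, str]]:
    """Return (codon_index, codon_name) for each in-frame stop codon.

    The reading frame is assumed to start at the first base of `orf`;
    trailing bases that do not complete a codon are ignored.
    """
    orf = orf.upper()
    hits = []
    usable = len(orf) - len(orf) % 3  # drop an incomplete final codon
    for start in range(0, usable, 3):
        codon = orf[start:start + 3]
        if codon in STOP_CODONS:
            hits.append((start // 3, STOP_CODONS[codon]))
    return hits


# ATG GGT TAG GCT TGC TAA: an amber codon (candidate site for Z) upstream
# of a cysteine codon (TGC), followed by an ochre codon.
example_orf = "ATGGGTTAGGCTTGCTAA"
hits = find_stop_codons(example_orf)
```

In this example the amber codon sits at codon index 2 and the ochre codon at index 5. Which of these a given system reads as Z rather than as termination depends on the suppressor tRNA and aminoacyl-tRNA synthetase pair in use, which the sketch does not model.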
- the expression system comprises:
- the expression system comprises:
- the N-terminal tail polypeptide, (AA) m , or the C-terminal tail polypeptide, (AA) p , or both, of the precursor polypeptides of formula (I) or (II) comprise(s):
- the polypeptide comprised within the N-terminal tail polypeptide, (AA) m , or the C-terminal tail polypeptide, (AA) p , or both, of the precursor polypeptides of formula (I) and (II) is a polypeptide selected from the group of polypeptides consisting of SEQ ID NOs 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, and 158.
- the intein polypeptide comprised within the N-terminal tail polypeptide, (AA) m , or the C-terminal tail polypeptide, (AA) p , or both, of the precursor polypeptides of formula (I) or (II) is selected from the group consisting of a naturally occurring intein, an engineered variant of a naturally occurring intein, a fusion of the N-terminal and C-terminal fragments of a naturally occurring split intein and a fusion of the N-terminal and C-terminal fragments of an engineered split intein.
- the intein is selected from the group consisting of Mxe GyrA (SEQ ID NO:1), eDnaB (SEQ ID NO:2), Hsp -NRC1 CDC21 (SEQ ID NO:3), Ceu ClpP (SEQ ID NO:4), Tag Pol-1 (SEQ ID NO:5), Tfu Pol-1 (SEQ ID NO:6), Tko Pol-1 (SEQ ID NO:7), Psp -GBD Pol (SEQ ID NO:8), Tag Pol-2 (SEQ ID NO:9), Thy Pol-1 (SEQ ID NO:10), Tko Pol-2 (SEQ ID NO:11), Tli Pol-1 (SEQ ID NO:12), Tma Pol (SEQ ID NO:13), Tsp -GE8 Pol-1 (SEQ ID NO:14), Tthi Pol (SEQ ID NO:15), Tag Pol-3 (SEQ ID NO:16), Tfu Pol-2 (SEQ ID NO:1), Mxe
- the intein is a fusion product of a split intein selected from the group consisting of Ssp DnaE (SEQ ID NO:61- SEQ ID NO:62), Neq Pol (SEQ ID NO:63- SEQ ID NO:64), Asp DnaE (SEQ ID NO:65- SEQ ID NO:66), Npu- PCC73102 DnaE (SEQ ID NO:67- SEQ ID NO:68), Nsp -PCC7120 DnaE (SEQ ID NO:69- SEQ ID NO:70), Oli DnaE (SEQ ID NO:71-SEQ ID NO:72), Ssp -PCC7002 DnaE (SEQ ID NO:73-SEQ ID NO:74), Tvu DnaE (SEQ ID NO:75- SEQ ID NO:76),
- the split intein C-domain is selected from the group consisting of Ssp DnaE-c (SEQ ID NO:62), Neq Pol-c (SEQ ID NO:64), Asp DnaE-c (SEQ ID NO:66), Npu -PCC73102 DnaE-c (SEQ ID NO:68), Nsp -PCC7120 DnaE-c (SEQ ID NO:70), Oli DnaE-c (SEQ ID NO:72), Ssp -PCC7002 DnaE-c (SEQ ID NO:74), Tvu DnaE-c (SEQ ID NO:76), and engineered variant(s) thereof; and the split intein N-domain is selected from the group consisting of Ssp DnaE-n (SEQ ID NO:61), Neq Pol-n (SEQ ID NO:63), Asp DnaE-n (SEQ ID NO:65), Npu -PCC73102 DnaE
- the expression system is selected from the group consisting of a prokaryotic cell, a eukaryotic cell, and a cell-free expression system.
- the prokaryotic cell is Escherichia coli.
- the eukaryotic cell is a yeast, a mammalian, an insect or a plant cell.
- any of polypeptides (AA) n , (AA) o , (AA) m , or (AA) p , is fully or partially genetically randomized so that a plurality of macrocyclic peptides is obtained upon a thioether bond-forming reaction between the cysteine (Cys) residue and the side-chain functional group FGi in Z.
- the method comprises fully or partially randomizing any of polypeptides (AA) n , (AA) m , or (AA) p , wherein, upon a thioether bond-forming reaction between the cysteine (Cys) residue and the side-chain functional group FGi in Z, a plurality of macrocyclic peptides is produced.
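To make the scale of such randomization concrete, the sketch below counts and enumerates the peptide sequences obtained when selected positions of a precursor are fully randomized. It is an illustration under stated assumptions, not the patent's procedure: the one-letter template notation (`Z` for the unnatural amino acid, `C` for the cysteine, `X` for a randomized position) is hypothetical, only the 20 canonical amino acids are considered at randomized positions, and the count is made at the amino-acid level, whereas genetic randomization acts on the encoding nucleic acid.

```python
from itertools import product

# One-letter codes of the 20 canonical amino acids.
CANONICAL = "ACDEFGHIKLMNPQRSTVWY"


def library_size(template: str) -> int:
    """Number of distinct sequences encoded by a template.

    Each 'X' in the template is a fully randomized position; every other
    character (including the hypothetical 'Z' marker) is fixed.
    """
    return len(CANONICAL) ** template.count("X")


def expand(template: str):
    """Yield every sequence obtained by filling each 'X' position."""
    slots = [CANONICAL if ch == "X" else ch for ch in template]
    for combo in product(*slots):
        yield "".join(combo)


# A formula (I)-like arrangement: Z upstream of Cys, three randomized
# positions in between.
size = library_size("MZXXXC")       # 20 ** 3
small = list(expand("ZXC"))         # 20 sequences, 'ZAC' ... 'ZYC'
```

Because the size grows as 20 to the power of the number of randomized positions, full enumeration is only practical for a handful of positions; larger templates are better handled by counting or sampling.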
- a recombinant host cell comprising an artificial nucleic acid encoding a polypeptide of structure: (AA) m -Z-(AA) n -Cys-(AA) p (I) or (AA) m -Cys-(AA) n -Z-(AA) p (II) wherein:
- the amino acid Z is selected from the group consisting of 4-(2-bromoethoxy)-phenylalanine, 3-(2-bromoethoxy)-phenylalanine, 4-(2-chloroethoxy)-phenylalanine, 3-(2-chloroethoxy)-phenylalanine, 4-(1-bromoethyl)-phenylalanine, 3-(1-bromoethyl)-phenylalanine, 4-(aziridin-1-yl)-phenylalanine, 3-(aziridin-1-yl)-phenylalanine, 4-acrylamido-phenylalanine, 3-acrylamido-phenylalanine, 4-(2-fluoro-acetamido)-phenylalanine, 3-(2-fluoro-acetamido)-phenylalanine, 4-(2-chloro-acetamido)-phenylalanine, 3-(2-chloro-acetamido)-pheny
- the polypeptide comprised within the N-terminal tail polypeptide, (AA) m , or the C-terminal tail polypeptide, (AA) p , or both, of the precursor polypeptides of formula (I) and (II) is a polypeptide selected from the group of polypeptides consisting of SEQ ID NOs 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, and 158.
- the cell comprises a macrocyclic peptide produced by a thioether bond-forming reaction between the cysteine (Cys) residue and the FGi functional group in the amino acid Z.
- the N-terminal tail polypeptide, (AA) m , or the C-terminal tail polypeptide, (AA) p , or both, in the precursor polypeptides of formula (I) or formula (II) comprise(s) an intein selected from the group consisting of a naturally occurring intein, an engineered variant of a naturally occurring intein, a fusion of the N-terminal and C-terminal fragments of a naturally occurring split intein and a fusion of the N-terminal and C-terminal fragments of an engineered split intein.
- the intein is selected from the group consisting of Mxe GyrA (SEQ ID NO:1), eDnaB (SEQ ID NO:2), Hsp -NRC1 CDC21 (SEQ ID NO:3), Ceu ClpP (SEQ ID NO:4), Tag Pol-1 (SEQ ID NO:5), Tfu Pol-1 (SEQ ID NO:6), Tko Pol-1 (SEQ ID NO:7), Psp -GBD Pol (SEQ ID NO:8), Tag Pol-2 (SEQ ID NO:9), Thy Pol-1 (SEQ ID NO:10), Tko Pol-2 (SEQ ID NO:11), Tli Pol-1 (SEQ ID NO:12), Tma Pol (SEQ ID NO:13), Tsp -GE8 Pol-1 (SEQ ID NO:14), Tthi Pol (SEQ ID NO:15), Tag Pol-3 (SEQ ID NO:16), Tfu Pol-2 (SEQ ID NO:1), Mxe
- the intein is a fusion product of a split intein selected from the group consisting of Ssp DnaE (SEQ ID NO:61- SEQ ID NO:62), Neq Pol (SEQ ID NO:63- SEQ ID NO:64), Asp DnaE (SEQ ID NO:65- SEQ ID NO:66), Npu -PCC73102 DnaE (SEQ ID NO:67- SEQ ID NO:68), Nsp -PCC7120 DnaE (SEQ ID NO:69- SEQ ID NO:70), Oli DnaE (SEQ ID NO:71-SEQ ID NO:72), Ssp -PCC7002 DnaE (SEQ ID NO:73- SEQ ID NO:74), Tvu DnaE (SEQ ID NO:75- SEQ ID NO:76),
- the cell comprises a macrocyclic peptide produced by a thioether bond-forming reaction between the cysteine (Cys) residue and the FGi functional group in the amino acid Z, and an intein-catalyzed N-terminal splicing, C-terminal splicing, or self-splicing reaction.
- the N-terminal tail polypeptide, (AA) m comprises the C-domain of a naturally occurring split intein, or of an engineered variant thereof, and the C-terminal tail polypeptide, (AA) p , comprises the N-domain of said split intein.
- the split intein C-domain is selected from the group consisting of Ssp DnaE-c (SEQ ID NO:62), Neq Pol-c (SEQ ID NO:64), Asp DnaE-c (SEQ ID NO:66), Npu -PCC73102 DnaE-c (SEQ ID NO:68), Nsp -PCC7120 DnaE-c (SEQ ID NO:70), Oli DnaE-c (SEQ ID NO:72), Ssp -PCC7002 DnaE-c (SEQ ID NO:74), Tvu DnaE-c (SEQ ID NO:76), and engineered variant(s) thereof; and the split intein N-domain is selected from the group consisting of Ssp DnaE-n (SEQ ID NO:61), Neq Pol-n (SEQ ID NO:63), Asp DnaE-n (SEQ ID NO:65), Npu -PCC73102 DnaE
- the cell comprises a polycyclic peptide produced by a thioether bond-forming reaction between the cysteine (Cys) residue and the FGi functional group in the amino acid Z, and a split intein-catalyzed trans-splicing reaction.
- "aliphatic" or "aliphatic group" as used herein means a straight or branched C1-15 hydrocarbon chain that is completely saturated or that contains at least one unit of unsaturation, or a monocyclic C3-8 hydrocarbon, or bicyclic C8-12 hydrocarbon that is completely saturated or that contains at least one unit of unsaturation, but which is not aromatic (also referred to herein as "cycloalkyl").
- suitable aliphatic groups include, but are not limited to, linear or branched alkyl, alkenyl, alkynyl groups or hybrids thereof such as (cycloalkyl)alkyl, (cycloalkenyl)alkyl, or (cycloalkynyl)alkyl.
- alkyl, alkenyl, or alkynyl group may be linear, branched, or cyclic and may contain up to 15, up to 8, or up to 5 carbon atoms.
- Alkyl groups include, but are not limited to, methyl, ethyl, propyl, cyclopropyl, butyl, cyclobutyl, pentyl, and cyclopentyl groups.
- Alkenyl groups include, but are not limited to, propenyl, butenyl, and pentenyl groups.
- Alkynyl groups include, but are not limited to, propynyl, butynyl, and pentynyl groups.
- aryl and aryl group refers to an aromatic substituent containing a single aromatic or multiple aromatic rings that are fused together, directly linked, or indirectly linked (such as linked through a methylene or an ethylene moiety).
- An aryl group may contain from 5 to 24 carbon atoms, 5 to 18 carbon atoms, or 5 to 14 carbon atoms.
- heteroatom means nitrogen, oxygen, or sulphur, and includes, but is not limited to, any oxidized forms of nitrogen and sulfur, and the quaternized form of any basic nitrogen. Heteroatom further includes, but is not limited to, Se, Si, or P.
- heteroaryl refers to an aryl group in which at least one carbon atom is replaced with a heteroatom.
- a heteroaryl group is a 5- to 18-membered, a 5- to 14-membered, or a 5- to 10-membered aromatic ring system containing at least one heteroatom selected from the group consisting of oxygen, sulphur, and nitrogen atoms.
- Heteroaryl groups include, but are not limited to, pyridyl, pyrrolyl, furyl, thienyl, indolyl, isoindolyl, indolizinyl, imidazolyl, pyridonyl, pyrimidyl, pyrazinyl, oxazolyl, thiazolyl, purinyl, quinolinyl, isoquinolinyl, benzofuranyl, and benzoxazolyl groups.
- a heterocyclic group may be any monocyclic or polycyclic ring system which contains at least one heteroatom and may be unsaturated or partially or fully saturated.
- heterocyclic thus includes, but is not limited to, heteroaryl groups as defined above as well as non-aromatic heterocyclic groups.
- a heterocyclic group is a 3- to 18-membered, a 3- to 14-membered, or a 3- to 10-membered, ring system containing at least one heteroatom selected from the group consisting of oxygen, sulphur, and nitrogen atoms.
- Heterocyclic groups include, but are not limited to, the specific heteroaryl groups listed above as well as pyranyl, piperidinyl, pyrrolidinyl, dioxanyl, piperazinyl, morpholinyl, thiomorpholinyl, morpholinosulfonyl, tetrahydroisoquinolinyl, and tetrahydrofuranyl groups.
- a halogen atom may be a fluorine, chlorine, bromine, or iodine atom.
- substituents include, but are not limited to, halogen atoms, hydroxyl (-OH), sulfhydryl (-SH), substituted sulfhydryl, carbonyl (-CO-), carboxy (-COOH), amino (-NH2), nitro (-NO2), sulfo (-SO2-OH), cyano (-C≡N), thiocyanato (-S-C≡N), phosphono (-P(O)OH2), alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, aryl, heteroaryl, heterocyclic, alkylthiol, alkyloxy, alkylamino, arylthiol, aryloxy, or arylamino groups.
- when "optionally substituted" modifies a series of groups separated by commas (e.g., "optionally substituted A, B, or C"; or "A, B, or C optionally substituted with"), it is intended that each of the groups (e.g., A, B, or C) is optionally substituted.
- heteroatom-containing aliphatic refers to an aliphatic moiety where at least one carbon atom is replaced with a heteroatom, e.g., oxygen, nitrogen, sulphur, selenium, phosphorus, or silicon, and typically oxygen, nitrogen, or sulphur.
- alkyl and alkyl group refer to a linear, branched, or cyclic saturated hydrocarbon typically containing 1 to 24 carbon atoms, or 1 to 12 carbon atoms, such as methyl, ethyl, n-propyl, isopropyl, n-butyl, isobutyl, t-butyl, octyl, decyl and the like.
- heteroatom-containing alkyl refers to an alkyl moiety where at least one carbon atom is replaced with a heteroatom, e.g., oxygen, nitrogen, sulphur, phosphorus, or silicon, and typically oxygen, nitrogen, or sulphur.
- alkenyl and alkenyl group refer to a linear, branched, or cyclic hydrocarbon group of 2 to 24 carbon atoms, or of 2 to 12 carbon atoms, containing at least one double bond, such as ethenyl, n-propenyl, isopropenyl, n-butenyl, isobutenyl, octenyl, decenyl, and the like.
- heteroatom-containing alkenyl refers to an alkenyl moiety where at least one carbon atom is replaced with a heteroatom.
- alkynyl and alkynyl group refer to a linear, branched, or cyclic hydrocarbon group of 2 to 24 carbon atoms, or of 2 to 12 carbon atoms, containing at least one triple bond, such as ethynyl, n-propynyl, and the like.
- heteroatom-containing alkynyl refers to an alkynyl moiety where at least one carbon atom is replaced with a heteroatom.
- heteroatom-containing aryl refers to an aryl moiety where at least one carbon atom is replaced with a heteroatom.
- alkoxy and alkoxy group refer to an aliphatic group or a heteroatom-containing aliphatic group bound through a single, terminal ether linkage.
- aryl alkoxy groups contain 1 to 24 carbon atoms, or contain 1 to 14 carbon atoms.
- aryloxy and aryloxy group refer to an aryl group or a heteroatom-containing aryl group bound through a single, terminal ether linkage. In various embodiments, aryloxy groups contain 5 to 24 carbon atoms, or contain 5 to 14 carbon atoms.
- substituent refers to a contiguous group of atoms.
- substituents include, but are not limited to: alkoxy, aryloxy, alkyl, heteroatom-containing alkyl, alkenyl, heteroatom-containing alkenyl, alkynyl, heteroatom-containing alkynyl, aryl, heteroatom-containing aryl, alkoxy, heteroatom-containing alkoxy, aryloxy, heteroatom-containing aryloxy, halo, hydroxyl (-OH), sulfhydryl (-SH), substituted sulfhydryl, carbonyl (-CO-), thiocarbonyl (-CS-), carboxy (-COOH), amino (-NH2), substituted amino, nitro (-NO2), nitroso (-NO), sulfo (-SO2-OH), cyano (-C≡N), cyanato (-O-C≡N), thiocyana
- contact indicates that the chemical units are at a distance that allows short range non-covalent interactions (such as Van der Waals forces, hydrogen bonding, hydrophobic interactions, electrostatic interactions, dipole-dipole interactions) to dominate the interaction of the chemical units.
- biological molecules such as those present in a bacterial, yeast or mammalian cell.
- the biological molecules can be, e.g., proteins, nucleic acids, fatty acids, or cellular metabolites.
- mutant or “variant” as used herein with reference to a molecule such as polynucleotide or polypeptide, indicates that such molecule has been mutated from the molecule as it exists in nature.
- mutate indicates any modification of a nucleic acid and/or polypeptide which results in an altered nucleic acid or polypeptide. Mutations include, but are not limited to, any process or mechanism resulting in a mutant protein, enzyme, polynucleotide, or gene. A mutation can occur in a polynucleotide or gene sequence, by point mutations, deletions, or insertions of single or multiple nucleotide residues.
- a mutation in a polynucleotide includes, but is not limited to, mutations arising within a protein-encoding region of a gene as well as mutations in regions outside of a protein-encoding sequence, such as, but not limited to, regulatory or promoter sequences.
- a mutation in a coding polynucleotide such as a gene can be "silent", i.e., not reflected in an amino acid alteration upon expression, leading to a "sequence-conservative" variant of the gene.
- a mutation in a polypeptide includes, but is not limited to, mutation in the polypeptide sequence and mutation resulting in a modified amino acid.
- Non-limiting examples of a modified amino acid include, but are not limited to, a glycosylated amino acid, a sulfated amino acid, a prenylated (e.g., farnesylated, geranylgeranylated) amino acid, an acetylated amino acid, an acylated amino acid, a PEGylated amino acid, a biotinylated amino acid, a carboxylated amino acid, a phosphorylated amino acid, and the like.
- engineer refers to any manipulation of a molecule that results in a detectable change in the molecule, wherein the manipulation includes, but is not limited to, inserting a polynucleotide and/or polypeptide heterologous to the cell and mutating a polynucleotide and/or polypeptide native to the cell.
- nucleic acid molecule refers to any chain of at least two nucleotides bonded in sequence.
- a nucleic acid molecule can be a DNA or a RNA.
- peptide refers to any chain of at least two amino acids bonded in sequence, regardless of length or post-translational modification.
- peptide-containing molecule refers to a molecule that contains at least two amino acids.
- non-natural and “unnatural” as used herein means being directly or indirectly made or caused to be made through human action.
- a “non-natural amino acid” is an amino acid that has been produced through human manipulation and does not occur in nature.
- non-canonical amino acid is equivalent in meaning to the terms “non-natural amino acid” or "unnatural amino acid”.
- cyclic and “macrocyclic” as used herein means having constituent atoms forming a ring.
- a “macrocyclic peptide” is a peptide molecule that contains at least one ring formed by atoms comprised in the molecule.
- the term “macrocyclic peptide” comprises peptides that contain at least two rings separated from each other via a polypeptide sequence (also referred to herein as “polycyclic peptides”) and peptides that contain at least two rings fused to each other (also referred to herein as “polycyclic peptides").
- polycyclic peptides also comprises peptides that contain two rings fused to each other (referred to herein also as "bicyclic peptides").
- cyclization or “macrocyclization” as used herein refer to a process or reaction whereby a cyclic molecule is formed or is made to be formed.
- peptidic backbone refers to a sequence of atoms corresponding to the main backbone of a natural protein.
- precursor polypeptide or “polypeptide precursor” as used herein refers to a polypeptide that is capable of undergoing macrocyclization according to the methods disclosed herein.
- ribosomal polypeptide refers to a polypeptide that is produced by action of a ribosome, and specifically, by the ribosomal translation of a messenger RNA encoding for such polypeptide.
- the ribosome can be a naturally occurring ribosome, e.g., a ribosome derived from an archaeal, prokaryotic or eukaryotic organism, or an engineered (i.e., non-naturally occurring, artificial or synthetic) variant of a naturally occurring ribosome.
- intein and "intein domain” as used herein refer to a naturally occurring or artificially constructed polypeptide sequence embedded within a precursor protein that can catalyze a splicing reaction during post-translational processing of the protein.
- NEB Intein Registry http://www.neb.com/neb/inteins.html
- split intein refers to an intein that has at least two separate components not fused to one another.
- splicing refers to the process involving the cleavage of the main backbone of an intein-containing polypeptide by virtue of a reaction or process catalyzed by an intein or portions of an intein.
- N-terminal splicing refers to the cleavage of a polypeptide chain fused to the N-terminus of an intein, such reaction typically involving the scission of the thioester (or ester) bond formed via intein-catalyzed N→S (or N→O) acyl transfer, by action of a nucleophilic functional group or a chemical species containing a nucleophilic functional group.
- C-terminal splicing refers to the cleavage of a polypeptide chain fused to the C-terminus of an intein.
- Self-splicing refers to the process involving the cleavage of an intein from a polypeptide, within which the intein is embedded.
- Trans-splicing refers to a self-splicing process involving split inteins.
- affinity tag refers to a polypeptide that is able to bind reversibly or irreversibly to an organic molecule, a metal ion, a protein, or a nucleic acid molecule.
- vector and “vector construct” as used herein refer to a vehicle by which a DNA or RNA sequence (e.g., a foreign gene) can be introduced into a host cell, so as to transform the host and promote expression (e.g., transcription and translation) of the introduced sequence.
- a common type of vector is a "plasmid”, which generally is a self-contained molecule of double-stranded DNA that can readily accept additional (foreign) DNA and which can be readily introduced into a suitable host cell.
- vectors including plasmid and fungal vectors, have been described for replication and/or expression in a variety of eukaryotic and prokaryotic hosts.
- Non-limiting examples include pKK plasmids (Clontech), pUC plasmids, pET plasmids (Novagen, Inc., Madison, Wis.), pRSET or pREP plasmids (Invitrogen, San Diego, Calif.), or pMAL plasmids (New England Biolabs, Beverly, Mass.), and many appropriate host cells, using methods disclosed or cited herein or otherwise known to those skilled in the relevant art.
- the terms “express” and “expression” refer to allowing or causing the information in a gene or DNA sequence to become manifest, for example producing a protein by activating the cellular functions involved in transcription and translation of a corresponding gene or DNA sequence.
- a DNA sequence is expressed in or by a cell to form an "expression product" such as a protein.
- the expression product itself e.g., the resulting protein, may also be said to be “expressed” by the cell.
- a polynucleotide or polypeptide is expressed recombinantly, for example, when it is expressed or produced in a foreign host cell under the control of a foreign or native promoter, or in a native host cell under the control of a foreign promoter.
- fused means being connected through at least one covalent bond.
- bound means being connected through non-covalent interactions. Examples of non-covalent interactions are van der Waals, hydrogen bond, electrostatic, and hydrophobic interactions.
- a "DNA-binding peptide” refers to a peptide capable of connecting to a DNA molecule via non-covalent interactions.
- tethered as used herein means being connected through non-covalent interactions or through covalent bonds.
- a polypeptide tethered to a solid support refers to a polypeptide that is connected to a solid support (e.g., surface, resin bead) either via non-covalent interactions or through covalent bonds.
- Methods and compositions are provided for making artificial macrocyclic peptides from genetically encoded, ribosomally produced artificial polypeptides. These methods are based on the use of artificial precursor polypeptides comprising (a) a non-canonical amino acid residue carrying a thiol-reactive functional group (referred to as FG 1 ); and (b) a cysteine residue that is positioned either upstream or downstream of the non-canonical amino acid in the polypeptide sequence.
- Methods and compositions are also provided for making macrocyclic peptides from genetically encoded, ribosomally produced, intein-fused polypeptides. These methods are based on the use of artificial precursor polypeptides comprising (a) a non-canonical amino acid residue with a thiol-reactive functional group (referred to as FG 1 ); (b) a cysteine residue positioned upstream or downstream of the non-canonical amino acid within the polypeptide sequence; and (c) an intein protein positioned upstream or downstream of the non-canonical amino acid or of the cysteine residue within the polypeptide sequence.
- Methods and compositions are also provided for making artificial macrocyclic peptides from genetically encoded, ribosomally produced, split intein-fused polypeptides. These methods are based on the use of artificial precursor polypeptides comprising (a) a non-canonical amino acid residue with a thiol-reactive functional group (referred to as FG 1 ); (b) a cysteine residue positioned upstream or downstream of the non-canonical amino acid within the polypeptide sequence; and (c) a split intein domain positioned upstream or downstream of the non-canonical amino acid or the cysteine residue within the polypeptide sequence.
- Methods and compositions are also provided for making artificial macrocyclic peptides from genetically encoded, ribosomally produced, split intein-fused polypeptides. These methods are based on the use of artificial precursor polypeptides comprising (a) a non-canonical amino acid residue with two thiol-reactive functional groups (referred to as FG 1 and FG 2 ); and (b) two cysteine residues positioned upstream and downstream of the non-canonical amino acid within the polypeptide sequence.
- a method for making an artificial macrocyclic peptide comprising:
- (AA) m is a N-terminal sequence comprising at least one amino acid, where AA corresponds to a generic amino acid residue and m corresponds to the number of amino acid residues composing such sequence.
- (AA) m is also referred to as "N-terminal tail”.
- (AA) p is a C-terminal sequence that has 0 or at least one amino acid, where AA corresponds to a generic amino acid residue and p corresponds to the number of amino acid residues composing such sequence.
- (AA) p is also referred to as "C-terminal tail”.
- (AA) n (and (AA) o , when present) is a peptide sequence of variable length (also referred to as "target peptide sequence"), where AA corresponds to a generic amino acid residue and n corresponds to the number of amino acid residues composing such peptide sequence.
- Cys is a cysteine amino acid residue.
- Z is an amino acid that carries a side-chain functional group FG 1 , which can react with the side-chain sulfhydryl group (-SH) of the cysteine residue to form a stable thioether bond.
- the ability of an artificial polypeptide of formula (I) or (II) (also referred to herein as "precursor polypeptide") to produce a macrocyclic peptide is conferred by the ability of the nucleophilic sulfhydryl group carried by the cysteine residue to react intramolecularly with the electrophilic functional group FG 1 carried by the amino acid Z, thereby forming a covalent, inter-side-chain thioether bond.
- this reaction proceeds via a thiol-mediated nucleophilic substitution reaction, a thiol-mediated Michael-type addition reaction, or a radical thiol-ene or thiol-yne reaction.
- although the electrophilic functional group FG 1 in the precursor polypeptide could in principle react intermolecularly with free cysteine or other thiol-containing molecules contained in the expression system (e.g., glutathione), it was discovered by the inventors that appropriate functional groups FG 1 can be found so that the desired intramolecular thioether-bond forming reaction occurs exclusively or preferentially over the undesired intermolecular side-reactions.
- a first advantage of the methods described herein is that they provide a highly versatile approach for the preparation of structurally diverse artificial macrocyclic peptides. Indeed, they offer multiple opportunities toward the structural and functional diversification of these compounds, e.g., through variation of the length and composition of the target peptide sequence ((AA) n ), variation of the structure of the amino acid Z, variation of the position of the amino acid Z relative to the cysteine residue (e.g., precursor polypeptide (I) versus (II)), variation of the length and composition of the N-terminal tail ((AA) m ), and variation of the length and composition of the C-terminal tail ((AA) p ).
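The building blocks defined above (N-terminal tail (AA) m , amino acid Z, target peptide sequence (AA) n , Cys, and C-terminal tail (AA) p ) and the ways they can be varied can be illustrated with a minimal Python sketch. It is not part of the disclosed methods: the function name `build_precursor`, the use of the letter `Z` as a placeholder for the non-canonical amino acid, and the example sequences are all assumptions made for illustration. The sketch does not assert which arrangement corresponds to formula (I) and which to formula (II); it only places Z upstream or downstream of the cysteine.

```python
def build_precursor(n_tail: str, target: str, c_tail: str = "",
                    z_upstream: bool = True) -> str:
    """Assemble a one-letter precursor sequence (hypothetical helper).

    n_tail     -- N-terminal tail (AA)m; at least one residue
    target     -- target peptide sequence (AA)n between Z and Cys
    c_tail     -- C-terminal tail (AA)p; may be empty (p = 0)
    z_upstream -- True places Z before Cys, False places Cys before Z
    'Z' is a placeholder symbol for the non-canonical amino acid.
    """
    if len(n_tail) < 1:
        raise ValueError("the N-terminal tail must contain at least one amino acid")
    first, second = ("Z", "C") if z_upstream else ("C", "Z")
    return n_tail + first + target + second + c_tail


if __name__ == "__main__":
    # Illustrative sequences only; they are not taken from the patent.
    print(build_precursor("MG", "TGSK", "A", z_upstream=True))   # Z upstream of Cys
    print(build_precursor("MG", "TGSK", "A", z_upstream=False))  # Z downstream of Cys
```

Varying `target`, `n_tail`, `c_tail`, and `z_upstream` corresponds to the diversification opportunities listed above; the structure of Z itself is outside the scope of this string-level sketch.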
- a second advantage of the methods disclosed herein is that they produce peptide molecules whose conformational flexibility is restrained by virtue of at least one intramolecular thioether linkage.
- this feature can confer these molecules with advantageous properties such as, for example, enhanced binding affinity, increased stability against proteolysis, and/or more favorable membrane-crossing properties, as compared to linear peptides or peptides lacking the intramolecular thioether linkage.
- the thioether linkage is redox and chemically stable in biological milieu, including the intracellular environment.
- a third advantage of the methods disclosed herein is that they allow for the preparation of macrocyclic peptides from genetically encoded, ribosomally produced polypeptides. Accordingly, these macrocyclic peptides can be produced as fused to a genetically encoded affinity tag, DNA-binding protein/peptide, protein-binding protein/peptide, fluorescent protein, or enzyme, which can be achieved via the introduction of one or more of these elements within the N-terminal tail and/or within the C-terminal tail of the precursor polypeptide.
- these tags/proteins/enzymes can be useful to facilitate the purification and/or immobilization of the macrocyclic peptides for functional screening as demonstrated in Examples 4, 5 and 8.
- phage-displayed macrocyclic peptide libraries can be then 'panned' against a target biomolecule of interest according to procedures well known in the art (Lane and Stephen 1993; Giebel, Cass et al. 1995; Sidhu, Lowman et al. 2000) in order to identify macrocyclic peptide binders or inhibitors of such biomolecule.
- a fourth advantage of the methods described herein is that they also enable the production of macrocyclic peptides inside a cell-based expression host such as a bacterial, yeast, insect, or mammalian cell. Intracellular production of the macrocyclic peptide can then be coupled to an (intra)cellular reporter system, phenotypic screen, or selection system, in order to identify a macrocyclic peptide capable of inhibiting or activating a certain cellular process, biomolecule, or enzymatic reaction linked to the reporter output, phenotype, or cell survival, respectively.
- a fifth advantage of the methods disclosed herein is that the production of the macrocyclic peptides can be carried out under physiological conditions (e.g., in aqueous buffer, neutral pH, physiological temperature) and in complex biological media (e.g., inside a cell, in cell lysate) and in the presence of biological molecules (proteins, nucleic acids, cell metabolites) and biological material.
- the methods described herein can be useful to greatly accelerate and facilitate the discovery of bioactive peptide-based compounds as potential drug molecules and chemical probes or the identification of lead structures for the development of new chemical probes and drugs.
- Z is an amino acid of structure (IV) wherein Y is a linker group selected from the group consisting of C 1 -C 24 alkyl, C 1 -C 24 substituted alkyl, C 1 -C 24 substituted heteroatom-containing alkyl, C 1 -C 24 substituted heteroatom-containing alkyl, C 2 -C 24 alkenyl, C 2 -C 24 substituted alkenyl, C 2 -C 24 substituted heteroatom-containing alkenyl, C 2 -C 24 substituted heteroatom-containing alkenyl, C 5 -C 24 aryl, C 5 -C 24 substituted aryl, C 5 -C 24 substituted heteroatom-containing aryl, C 5 -C 24 substituted heteroatom-containing aryl, C 1 -C 24 alkoxy, C 5 -C 24 aryloxy groups.
- Z is an amino acid of structure (IV) wherein Y is a linker group selected from -CH 2 -C 6 H 4 -, -CH 2 -C 6 H 4 -O-, -CH 2 -C 6 H 4 -NH-,-(CH 2 ) 4 -, -(CH 2 ) 4 NH-, -(CH 2 ) 4 NHC(O)-, and -(CH 2 ) 4 NHC(O)O-.
- the amino acid Z is selected from the group consisting of 4-(2-bromoethoxy)-phenylalanine, 3-(2-bromoethoxy)-phenylalanine, 4-(2-chloroethoxy)-phenylalanine, 3-(2-chloroethoxy)-phenylalanine, 4-(1-bromoethyl)-phenylalanine, 3-(1-bromoethyl)-phenylalanine, 4-(aziridin-1-yl)-phenylalanine, 3-(aziridin-1-yl)-phenylalanine, 4-acrylamido-phenylalanine, 3-acrylamido-phenylalanine, 4-(2-fluoro-acetamido)-phenylalanine, 3-(2-fluoro-acetamido)-phenylalanine, 4-(2-chloro-acetamido)-phenylalanine, 3-(2-chloro-acetamido)-pheny
- Z2 is an amino acid of structure (VI) wherein Y 2 is a linker group selected from the group consisting of C 1 -C 24 alkyl, C 1 -C 24 substituted alkyl, C 1 -C 24 substituted heteroatom-containing alkyl, C 1 -C 24 substituted heteroatom-containing alkyl, C 2 -C 24 alkenyl, C 2 -C 24 substituted alkenyl, C 2 -C 24 substituted heteroatom-containing alkenyl, C 2 -C 24 substituted heteroatom-containing alkenyl, C 5 -C 24 aryl, C 5 -C 24 substituted aryl, C 5 -C 24 substituted heteroatom-containing aryl, C 5 -C 24 substituted heteroatom-containing aryl, C 1 -C 24 alkoxy, C 5 -C 24 aryloxy groups.
- Z2 is an amino acid of structure (VI) wherein Y 2 is a linker group selected from the group consisting of -CH 2 -C 6 H 4 -, -CH 2 -C 6 H 4 -O-,-CH 2 -C 6 H 4 -NH-,-CH 2 -C 6 H 4 -OCH 2 -, -(CH 2 ) 4 NH-, -(CH 2 ) 4 NHC(O)-,-(CH 2 ) 4 NHC(O)O-, -(CH 2 ) 4 NHC(O)OCH 2 -,
- the amino acid Z2 is selected from the group consisting of 3,5- bis (2-bromoethoxy)-phenylalanine, 3,5- bis (2-chloroethoxy)-phenylalanine, 3,5- bis (1-bromoethyl)-phenylalanine, 3,5- bis (aziridin-1-yl)-phenylalanine, 3,5- bis -acrylamido-phenylalanine, 3,5- bis (2-fluoro-acetamido)-phenylalanine, 3,5- bis (2-fluoro-acetyl)-phenylalanine, 4-((1,3-dibromopropan-2-yl)oxy)-phenylalanine, 4-((1,3-dichloropropan-2-yl)oxy)-phenylalanine, N ε -(((1,3-dibromopropan-2-yl)oxy)carbonyl)-lysine, N ε - (((1,3-d)
- Artificial nucleic acid molecules for use according to the methods provided herein include, but are not limited to, those that encode for a polypeptide of general formula (I), (II), or (V) as defined above.
- the codon encoding for the amino acid Z (or Z2) in these polypeptides can be one of the 61 sense codons of the standard genetic code, a stop codon (TAG, TAA, TGA), or a four-base frameshift codon (e.g., TAGA, AGGT, CGGG, GGGT, CTCT).
- the codon encoding for the amino acid Z (or Z2) within the nucleotide sequence encoding for the precursor polypeptide of formula (I), (II) or (V) is an amber stop codon (TAG), an ochre stop codon (TAA), an opal stop codon (TGA), or a four-base frameshift codon (see Example 2).
- the codon encoding for Z (or Z2) in the nucleotide sequence encoding for these precursor polypeptides is the amber stop codon, TAG, or the 4-base codon, TAGA.
- the non-canonical amino acid Z (or Z2) can be introduced into the precursor polypeptide through direct incorporation during ribosomal synthesis of the precursor polypeptide, or generated post-translationally through enzymatic or chemical modification of the precursor polypeptide, or by a combination of these procedures.
- the amino acid Z (or Z2) is introduced into the precursor polypeptide during ribosomal synthesis of the precursor polypeptide via either stop codon suppression or four-base frameshift codon suppression.
- the amino acid Z (or Z2) is introduced into the precursor polypeptide during ribosomal synthesis of the precursor polypeptide via amber (TAG) stop codon suppression or via 4-base TAGA codon suppression.
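How reassigning the amber stop codon (TAG) or the four-base codon TAGA to the amino acid Z changes the translated product can be illustrated with a short Python sketch. This is a simplified model for illustration, not the disclosed method: suppression is treated as fully efficient, `Z` is a placeholder letter for the non-canonical amino acid, and the function name `translate` and the example DNA sequences are assumptions.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str, suppress_amber: bool = False,
              suppress_taga: bool = False) -> str:
    """Translate a coding sequence, optionally decoding TAG or TAGA as 'Z'."""
    dna = dna.upper()
    peptide = []
    i = 0
    while i + 3 <= len(dna):
        # Four-base codon suppression: TAGA is read as one codon for Z.
        if suppress_taga and dna[i:i + 4] == "TAGA":
            peptide.append("Z")
            i += 4
            continue
        codon = dna[i:i + 3]
        # Amber suppression: TAG is read as a sense codon for Z.
        if suppress_amber and codon == "TAG":
            peptide.append("Z")
            i += 3
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":  # unsuppressed stop codon terminates translation
            break
        peptide.append(residue)
        i += 3
    return "".join(peptide)


if __name__ == "__main__":
    amber_gene = "ATGGGTTAGACTTGTGCATAA"   # illustrative sequence with an in-frame TAG
    taga_gene = "ATGGGTTAGAACTTGTGCATAA"   # illustrative sequence with an in-frame TAGA
    print(translate(amber_gene))                       # truncated product
    print(translate(amber_gene, suppress_amber=True))  # full-length, contains Z
    print(translate(taga_gene, suppress_taga=True))    # full-length, contains Z
```

Without suppression the ribosome terminates at TAG and only a truncated product is made, which is why full-length expression can serve as a readout for incorporation of Z.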
- tRNA/aminoacyl-tRNA synthetase (AARS) pairs used for this purpose include, but are not limited to, engineered variants of Methanococcus jannaschii AARS/tRNA pairs (e.g., TyrRS/tRNA Tyr ), of Saccharomyces cerevisiae AARS/tRNA pairs (e.g., AspRS/tRNA Asp , GlnRS/tRNA Gln ,TyrRS/tRNA Tyr , and PheRS/tRNA Phe ), of Escherichia coli AARS/tRNA pairs (e.g., TyrRS/tRNA Tyr , LeuRS/tRNA Leu ), of Methanosarcina mazei AARS/tRNA pairs (PylRS/tRNA Pyl ), and of Methanosarcina mazei AARS/tRNA pairs (PylRS/tRNA Pyl ) (Wang, Xie et al.
- a non-canonical amino acid can be incorporated into a polypeptide by exploiting the promiscuity of wild-type aminoacyl-tRNA synthetase enzymes using a cell-free protein expression system, in which one or more natural amino acids are replaced with structural analogs (Josephson, Hartman et al. 2005; Hartman, Josephson et al. 2007). Any of these methods can be used to introduce an unnatural amino acid of the type (III), (IV), (VI) or (VII) into the precursor polypeptide for the purpose of generating macrocyclic peptides according to the methods disclosed herein.
- the non-canonical amino acid Z is incorporated into the precursor polypeptide via stop codon or four-base codon suppression methods using an engineered AARS/tRNA pair derived from Methanococcus jannaschii tyrosyl-tRNA synthetase ( Mj TyrRS) and its cognate tRNA ( Mj tRNA Tyr ), an engineered AARS/tRNA pair derived from Methanosarcina mazei pyrrolysyl-tRNA synthetase ( Mm PylRS) and its cognate tRNA (tRNA Pyl ), an engineered AARS/tRNA pair derived from Methanosarcina mazei pyrrolysyl-tRNA synthetase ( Mm PylRS) and its cognate tRNA (tRNA Pyl ), or an engineered AARS/tRNA pair derived from Escherichia coli tyrosyl-tRNA synthetase ( Ec TyrRS)
- aminoacyl-tRNA synthetase enzymes can be described in reference to the amino acid sequence of a naturally occurring aminoacyl-tRNA synthetase or another engineered aminoacyl-tRNA synthetase.
- the amino acid residue position is determined in the aminoacyl-tRNA synthetase enzyme beginning from the first amino acid after the initial methionine (M) residue (i.e., the first amino acid after the initial methionine M represents residue position 1).
- the initiating methionine residue may be removed by biological processing machinery such as in a host cell or in vitro translation system, to generate a mature protein lacking the initiating methionine residue.
- the amino acid residue position at which a particular amino acid or amino acid change is present is sometimes described herein as "Xn", or "position n", where n refers to the residue position.
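The numbering convention above (position 1 is the first amino acid after the initial methionine) maps directly onto string indices, as the following Python sketch shows. The helper names `residue_at` and `substitute` and the toy sequence are hypothetical and used only for illustration; they are not sequences or tools from the patent.

```python
def residue_at(seq_with_met: str, n: int) -> str:
    """Return the residue at position n ("Xn"), where position 1 is the
    first amino acid after the initiating methionine."""
    if not seq_with_met.startswith("M"):
        raise ValueError("sequence is expected to start with the initial methionine")
    if not 1 <= n < len(seq_with_met):
        raise ValueError("position out of range")
    # The methionine sits at index 0, so position n is simply index n.
    return seq_with_met[n]


def substitute(seq_with_met: str, n: int, new_residue: str) -> str:
    """Return a copy of the sequence with position n replaced by new_residue."""
    residue_at(seq_with_met, n)  # reuse the validation above
    return seq_with_met[:n] + new_residue + seq_with_met[n + 1:]


if __name__ == "__main__":
    toy = "MDEFKL"                   # toy sequence, not a real synthetase
    print(residue_at(toy, 1))        # first residue after Met
    print(substitute(toy, 3, "A"))   # change at position X3
```

For a mature protein from which the initiating methionine has been removed, the same position n would correspond to index n - 1 of the mature sequence.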
- the stop codon/frameshift codon suppression system used for incorporating the amino acid Z (or Z2) into the precursor polypeptide comprises an engineered variant of Methanococcus jannaschii tRNA Tyr as encoded by a nucleotide of sequence SEQ ID NO: 101, 102, 103, or 104; and an engineered variant of Methanococcus jannaschii tyrosyl-tRNA synthetase (SEQ ID NO: 77), said variant comprising an amino acid change at at least one of the following amino acid positions of SEQ ID NO: 77: X32, X63, X65, X70, X107, X108, X109, X155, X158, X159, X160, X161, X162, X163, X164, X167, and X286.
- the stop codon/frameshift codon suppression system used for incorporating the amino acid Z (or Z2) into the precursor polypeptide consists of a Methanococcus jannaschii tRNA Tyr variant selected from the group of tRNA molecules encoded by the nucleotide sequence of SEQ ID NOs: 101, 102, 103, and 104; and a Methanococcus jannaschii tyrosyl-tRNA synthetase variant selected from the group of polypeptides of SEQ ID NOs: 77, 81, 82, 83, 84, 85, 86, 87, 88, 89, and 90.
- the stop codon/frameshift codon suppression system used for incorporating the amino acid Z (or Z2) into the precursor polypeptide comprises an engineered variant of Methanosarcina species tRNA Pyl or Desulfitobacterium hafniense tRNA Pyl as encoded by a nucleotide of sequence SEQ ID NO: 105, 106, 107, 108, 109, 110, 111, or 112; and an engineered variant of Methanosarcina mazei pyrrolysyl-tRNA synthetase (SEQ ID NO: 78), said variant comprising an amino acid change at at least one of the following amino acid positions of SEQ ID NO: 78: X302, X305, X306, X309, X346, X348, X364, X384, X401, X405, and X417.
- the stop codon/frameshift codon suppression system used for incorporating the amino acid Z (or Z2) into the precursor polypeptide comprises an engineered variant of Methanosarcina species tRNA Pyl or Desulfitobacterium hafniense tRNA Pyl as encoded by a nucleotide of sequence SEQ ID NO: 105, 106, 107, 108, 109, 110, 111, or 112; and an engineered variant of Methanosarcina barkeri pyrrolysyl-tRNA synthetase (SEQ ID NO: 79), said variant comprising an amino acid change at at least one of the following amino acid positions of SEQ ID NO: 79: X76, X266, X270, X271, X273, X274, X313, X315, and X349.
- the stop codon/frameshift codon suppression system used for incorporating the amino acid Z (or Z2) into the precursor polypeptide consists of a tRNA Pyl variant selected from the group of tRNA molecules encoded by the nucleotide sequence of SEQ ID NO: 105, 106, 107, 108, 109, 110, 111, and 112; and a pyrrolysyl-tRNA synthetase variant selected from the group of polypeptides of SEQ ID NOs: 78, 79, 91, 92, 93, 94, 95, and 96.
- the stop codon/frameshift codon suppression system used for incorporating the amino acid Z (or Z2) into the precursor polypeptide comprises an engineered variant of Escherichia coli tRNA Tyr or Bacillus stearothermophilus tRNA Tyr as encoded by a nucleotide of sequence SEQ ID NO: 113, 114, 115, 116, 117, 118, 119, or 120; and an engineered variant of Escherichia coli tyrosyl-tRNA synthetase (SEQ ID NO: 80), said variant comprising an amino acid change at at least one of the following amino acid positions of SEQ ID NO: 80: X37, X182, X183, X186, and X265.
- the stop codon/frameshift codon suppression system used for incorporating the amino acid Z (or Z2) into the precursor polypeptide consists of a tRNA Tyr variant selected from the group of tRNA molecules encoded by the nucleotide sequence of SEQ ID NO: 113, 114, 115, 116, 117, 118, 119, and 120; and an E. coli tyrosyl-tRNA synthetase variant selected from the group of polypeptides of SEQ ID NOs: 80, 97, 98, 99, and 100.
- the aminoacyl-tRNA synthetase used for incorporating the amino acid Z (or Z2) into the precursor polypeptide can additionally have at least one amino acid residue difference at positions not specified by an X above as compared to the sequence SEQ ID NO: 77, 78, 79, or 80.
- the differences can be 1-2, 1-5, 1-10, 1-20, 1-30, 1-40, 1-50, 1-75, 1-100, 1-150, or 1-200 amino acid residue differences at other positions not defined by X above.
- the suppressor tRNA molecule used for incorporating the amino acid Z (or Z2) into the precursor polypeptide can additionally have at least one nucleotide difference as compared to the sequence encoded by the gene of SEQ ID NO: 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, or 120.
- the differences can be 1-2, 1-5, 1-10, 1-20, 1-30, 1-40, 1-50, or 1-60 nucleotide differences as compared to the sequences encoded by these genes.
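Counting residue differences relative to a reference sequence, while setting aside the positions already defined by an X, can be sketched in Python as follows. This is an illustrative helper under stated assumptions: the name `differences_outside` is hypothetical, both sequences are assumed to be given without the initial methionine (so the first character is position 1) and to have equal length (substitutions only, no insertions or deletions), and the toy sequences are not taken from the sequence listing.

```python
def differences_outside(reference: str, variant: str, x_positions: set) -> list:
    """Return the 1-based positions at which variant differs from reference,
    ignoring positions listed in x_positions (the 'Xn' positions)."""
    if len(reference) != len(variant):
        raise ValueError("this sketch only compares sequences of equal length")
    return [
        pos
        for pos, (ref_res, var_res) in enumerate(zip(reference, variant), start=1)
        if ref_res != var_res and pos not in x_positions
    ]


if __name__ == "__main__":
    reference = "ACDEFG"   # toy reference sequence
    variant = "ACNEFK"     # toy variant: changes at positions 3 and 6
    extra = differences_outside(reference, variant, x_positions={3})
    print(extra, len(extra))  # positions of additional differences and their count
```

The same comparison applies to nucleotide sequences when counting differences between a suppressor tRNA gene and its reference.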
- the engineered variant of Methanococcus jannaschii tyrosyl-tRNA synthetase comprises at least one of the features selected from the group consisting of: X32 is Tyr, Leu, Ala, Gly, Thr, His, Glu, Val, or Gln; X65 is Leu, His, Tyr, Val, Ser, Thr, Gly, or Glu; X67 is Ala or Gly; X70 is His, Ala, Cys, or Ser; X107 is Glu, Pro, Asn, or Thr; X108 is Phe, Trp, Ala, Ser, Arg, Gly, Tyr, His, Trp, or Glu; X109 is Gln, Met, Asp, Lys, Glu, Pro, His, Gly, Met, or Leu; X155 is Gln, Glu, or Gly; X158 is Asp, Gly, Glu, Ala, Pro,
- the engineered variant of Methanosarcina mazei pyrrolysyl-tRNA synthetase comprises at least one of the features selected from the group consisting of: X302 is Ala or Thr; X305 is Leu or Met; X306 is Tyr, Ala, Met, Ile, Leu, Thr, or Gly; X309 is Leu, Ala, Pro, Ser, or Arg; X346 is Asn, Ala, Ser, or Val; X348 is Cys, Ala, Thr, Leu, Lys, Met, or Trp; X364 is Thr or Lys; X384 is Tyr or Phe; X405 is Ile or Arg; X401 is Val or Leu; X417 is Trp, Thr, or Leu.
- the engineered variant of Methanosarcina barkeri pyrrolysyl-tRNA synthetase comprises at least one of the features selected from the group consisting of: X76 is Asp or Gly; X266 is Leu, Val, or Met; X270 is Leu or Ile; X271 is Tyr, Phe, Leu, Met, or Ala; X274 is Leu, Ala, Met, or Gly; X313 is Cys, Phe, Ala, Val, or Ile; X315 is Met or Phe; X349 is Tyr, Phe, or Trp.
- the engineered variant of Escherichia coli tyrosyl-tRNA synthetase comprises at least one of the features selected from the group consisting of: X37 is Tyr, Ile, Gly, Val, Leu, Thr, or Ser; X182 is Asp, Gly, Ser, or Thr; X183 is Phe, Met, Tyr, or Ala; X186 is Leu, Ala, Met, or Val; X265 is Asp or Arg.
- An aspect of the methods disclosed herein is the identification and selection of a suitable aminoacyl-tRNA synthetase for incorporating an amino acid Z (or Z2) as defined above, into the artificial precursor polypeptide.
- Various methods are known in the art to evaluate and quantify the relative efficiency of a given wild-type or engineered aminoacyl-tRNA synthetase to incorporate a non-canonical amino acid into a protein (Young, Young et al. 2011). Any of these methods can be used to guide the identification and choice of a suitable aminoacyl-tRNA synthetase for incorporating a desired amino acid Z (or Z2) into the precursor polypeptide.
- such efficiency can be measured via a fluorescence assay based on the expression of a reporter fluorescent protein (e.g., green fluorescent protein), whose encoding gene has been modified to contain a codon to be suppressed (e.g., amber stop codon).
- expression of the reporter fluorescent protein is then induced in a suitable expression system (e.g., an E. coli or yeast cell) in the presence of the aminoacyl-tRNA synthetase to be tested, a cognate suppressor tRNA (e.g., amber stop codon suppressor tRNA), and the desired non-canonical amino acid.
- the relative amount of the expressed (i.e., ribosomally produced) fluorescent protein is linked to the relative efficiency of the aminoacyl-tRNA synthetase to charge the cognate suppressor tRNA with the non-canonical amino acid, which can thus be quantified via fluorimetric means.
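One way to turn the fluorescence readout described above into a single number is sketched below in Python. The specific formula is an assumption made for illustration and is not prescribed by the source: fluorescence is normalized to culture density (OD600), the signal obtained without the non-canonical amino acid is subtracted as background, and the result is expressed relative to a reporter lacking the suppressed codon. The function name `suppression_efficiency` and all numeric values are invented example inputs, not experimental data.

```python
def suppression_efficiency(fluor_plus: float, od_plus: float,
                           fluor_minus: float, od_minus: float,
                           fluor_wt: float, od_wt: float) -> float:
    """Relative suppression efficiency (hypothetical metric).

    fluor_plus / od_plus   -- reporter with the codon to be suppressed, amino acid added
    fluor_minus / od_minus -- same culture conditions without the amino acid (background)
    fluor_wt / od_wt       -- reporter without the suppressed codon (reference)
    """
    for od in (od_plus, od_minus, od_wt):
        if od <= 0:
            raise ValueError("culture density must be positive")
    signal = fluor_plus / od_plus          # density-normalized signal
    background = fluor_minus / od_minus    # incorporation of natural amino acids, leakiness
    reference = fluor_wt / od_wt           # full-expression reference
    return (signal - background) / reference


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    eff = suppression_efficiency(5000.0, 1.0, 500.0, 1.0, 10000.0, 1.0)
    print(f"relative suppression efficiency: {eff:.2f}")
```

A high signal in the absence of the non-canonical amino acid would indicate that the synthetase also charges the suppressor tRNA with a natural amino acid, which is why the background term is tracked separately in this sketch.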
- a demonstration of how this procedure can be applied for selecting an aminoacyl-tRNA synthetase / suppressor tRNA pair for incorporating a desired amino acid Z (or Z2) into the precursor polypeptide is provided in Example 3.
- the ability of a given aminoacyl-tRNA synthetase / suppressor tRNA pair to incorporate a target non-canonical amino acid into a protein can be improved by means of rational design or directed evolution.
- while the fluorescence-based method described above can be used to screen several hundreds of engineered aminoacyl-tRNA synthetase variants and/or suppressor tRNA variants for this purpose, higher throughput procedures are also known in the art, which are, for example, based on selection systems (Wang, Xie et al. 2006; Wu and Schultz 2009; Liu and Schultz 2010; Fekner and Chan 2011).
- One such system involves introducing a library of mutated aminoacyl-tRNA synthetases and/or of mutated suppressor tRNAs into a suitable cell-based expression host (e.g., E. coli or yeast cells), whose survival under a suitable selective medium or growth conditions is dependent upon the functionality of the aminoacyl-tRNA synthetase / suppressor tRNA pair.
- This can be achieved, for example, by introducing a stop codon or four-base codon that is to be suppressed, into a gene encoding for a protein or enzyme essential for survival of the cell, such as a protein or enzyme conferring resistance to an antibiotic.
- the ability of the aminoacyl-tRNA synthetase / suppressor tRNA pair to incorporate the desired non-canonical amino acid into the selection marker protein is linked to the survival of the host, thereby enabling the rapid isolation of suitable aminoacyl-tRNA synthetase / suppressor tRNA pair(s) for the incorporation of a particular non-canonical amino acid from very large engineered libraries.
- the selectivity of these aminoacyl-tRNA synthetase / suppressor tRNA pairs toward the desired non-canonical amino acid over the twenty natural amino acids can be further improved by iterative rounds of positive and negative selection as described in (Wang, Xie et al.
- Engineered aminoacyl-tRNA synthetase / tRNA pairs for the incorporation of the amino acid Z (or Z2) into the precursor polypeptide can be prepared via mutagenesis of the polynucleotide encoding for the aminoacyl-tRNA synthetase enzymes of SEQ ID NOs: 77, 78, 79, 80, or an engineered variant thereof; and via mutagenesis of the tRNA-encoding polynucleotides of SEQ ID NOs: 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, or an engineered variant thereof.
- mutagenesis methods include, but are not limited to, site-directed mutagenesis, site-saturation mutagenesis, random mutagenesis, cassette-mutagenesis, DNA shuffling, homologous recombination, non-homologous recombination, site-directed recombination, and the like.
- Detailed description of art-known mutagenesis methods can be found, among other sources, in U.S. Pat. No. 5,605,793 ; U.S. Pat. No. 5,830,721 ; U.S. Pat. No.
- the engineered aminoacyl-tRNA synthetases and cognate suppressor tRNA obtained from mutagenesis of SEQ ID NO:77 to 80, and from mutagenesis of SEQ ID NO:101 to 120 can be screened for identifying aminoacyl-tRNA synthetase / suppressor tRNA pairs being able, or having improved ability as compared to the corresponding wild-type enzyme/tRNA molecule, to incorporate the amino acid Z (or Z2) into the precursor polypeptide.
- the engineered aminoacyl-tRNA synthetase used in the method comprises an amino acid sequence that is at least 80%, 85%, 90%, 95%, 99% or more identical to the sequence SEQ ID NOs: 77, 78, 79, or 80.
- the engineered suppressor tRNA used in the method is encoded by a polynucleotide comprising a nucleotide sequence that is at least 80%, 85%, 90%, 95%, 99% or more identical to the sequence SEQ ID NOs: 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, or 120.
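- The identity thresholds above (80%, 85%, 90%, 95%, 99%) presuppose a way of computing percent identity between two sequences. The sketch below shows one common convention, assuming the two sequences have already been aligned and that a gap opposite a residue counts as a mismatch. The text does not specify the convention, so the function names and this choice of denominator are illustrative assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the aligned columns of two pre-aligned sequences.

    Assumption: columns where both rows are gaps are ignored, and a gap
    opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # a gap-only column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True if the two aligned sequences are at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

- The same function applies to aligned nucleotide sequences, such as the tRNA-encoding polynucleotides, because it only compares characters column by column.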
- the target peptide sequence, (AA) n , in the precursor polypeptide of formula (I), (II) and (V) and the second target peptide sequence, (AA) o , in the precursor polypeptide of formula (V), can be a polypeptide comprising 1 to 1,000 amino acid residues.
- (AA) n (and (AA) o ) consists of a polypeptide comprising 1 to 50 amino acid residues and, in other embodiments, (AA) n (and (AA) o ) consists of a polypeptide comprising 1 to 20 amino acid residues.
- the N-terminal tail, (AA) m , in the precursor polypeptide of formula (I), (II), and (V) can be a polypeptide comprising 1 to 10,000 amino acid residues.
- (AA) m consists of a polypeptide comprising 1 to 1,000 amino acid residues and, in other embodiments, (AA) m consists of a polypeptide comprising 1 to 600 amino acid residues.
- the C-terminal tail, (AA) p , in the precursor polypeptide of formula (I), (II), and (V) may not be present, and when present, it can be a polypeptide comprising 1 to 10,000 amino acid residues.
- (AA) p consists, in some embodiments, of a polypeptide comprising 1 to 1,000 amino acid residues and, in other embodiments, (AA) p consists of a polypeptide comprising 1 to 600 amino acid residues.
- the N-terminal tail, (AA) m , the C-terminal tail, (AA) p , or both, in the precursor polypeptides of formula (I), (II), and (V) can comprise a polypeptide affinity tag, a DNA-binding polypeptide, a protein-binding polypeptide, an enzyme, a fluorescent protein, an intein protein, or a combination of these polypeptides.
- affinity tags can be useful for isolating, purifying, and/or immobilizing onto a solid support the macrocyclic peptides generated according to the methods disclosed herein.
- the N-terminal tail, C-terminal tail, or both, of the precursor polypeptides comprise at least one polypeptide affinity tag selected from the group consisting of a polyarginine tag (e.g., RRRRR) (SEQ ID NO:121), a polyhistidine tag (e.g., HHHHHH) (SEQ ID NO:122), an Avi-Tag (SGLNDIFEAQKIEWHELEL) (SEQ ID NO:123), a FLAG tag (DYKDDDDK) (SEQ ID NO:124), a Strep-tag II (WSHPQFEK) (SEQ ID NO:125), a c-myc tag (EQKLISEEDL) (SEQ ID NO:126), an S tag (KETAAAKFERQHMDS) (SEQ ID NO:127), a calmodulin-binding peptide (KRRWKKNFIAVSAANRFKKISSSGAL) (SEQ ID NO:128), a streptavidin-bind
- the N-terminal tail, (AA) m , the C-terminal tail, (AA) p , or both, in the precursor polypeptides of formula (I), (II), and (V) can comprise a reporter protein or enzyme. This approach will result in the formation of macrocyclic peptides fused to a reporter protein or enzyme, which can be useful to facilitate the functional screening of said macrocyclic peptides.
- the N-terminal tail, (AA) m and/or the C-terminal tail, (AA) p , in the precursor polypeptides of formula (I), (II), and (V) comprise at least one polypeptide selected from the group consisting of green fluorescent protein (SEQ ID NO:134), luciferase (SEQ ID NO:135), alkaline phosphatase (SEQ ID NO:136), and engineered variants thereof.
- the N-terminal tail, (AA) m , the C-terminal tail, (AA) p , or both, in the precursor polypeptides of formula (I), (II), or (V) can comprise a protein or enzyme that is part of a display system such as, for example, a phage display (e.g., M13, T7, or lambda phage display), a yeast display, a bacterial display, a DNA display, a plasmid display, a CIS display, a ribosome display, or a mRNA display system.
- this approach can be useful for generating large libraries of macrocyclic peptides which are physically linked to, or compartmentalized with the polynucleotide sequence that encodes for the corresponding precursor polypeptides.
- this approach can be useful toward isolating functional macrocyclic peptides that are able to bind, inhibit or activate a certain target biomolecule (e.g., protein, enzyme, DNA or RNA molecule) or target biomolecular interaction.
- the N-terminal tail, (AA) m comprises a polypeptide selected from the group consisting of M13 phage coat protein pVI (SEQ ID NO:137), T7 phage protein 10A (SEQ ID NO:138), T7 phage protein 10B (SEQ ID NO:139), E. coli NlpA (SEQ ID NO:140), E. coli OmpC (SEQ ID NO:141), E. coli FadL (SEQ ID NO:142), E. coli Lpp-OmpA (SEQ ID NO:143), E. coli PgsA (SEQ ID NO:144), E. coli EaeA (SEQ ID NO:145), S. cerevisiae Aga2p (SEQ ID NO:146), S. cerevisiae Flo1p (SEQ ID NO:147), human NF-κB p50 protein (SEQ ID NO:148), M13 phage coat protein pIII leader sequence (SEQ ID NO:149), M13 phage coat protein pVIII leader sequence (SEQ ID NO:150), M13 phage protein pVI (SEQ ID NO:151), Snap-tag (SEQ ID NO:152), and Clip-Tag (SEQ ID NO:153).
- the C-terminal tail, (AA) p comprises a polypeptide selected from the group consisting of M13 phage coat protein pIII (SEQ ID NO:154), M13 phage coat protein pVIII (SEQ ID NO:155), RepA protein (SEQ ID NO:156), S. cerevisiae Aga1p (SEQ ID NO:157), Snap-tag (SEQ ID NO:152), Clip-Tag (SEQ ID NO:153), P2A protein (SEQ ID NO:158), and engineered variants thereof.
- the C-terminal tail, (AA) p comprises a molecule selected from the group consisting of puromycin, a puromycin analog, a puromycin-DNA conjugate, and a puromycin-RNA conjugate.
- the N-terminal tail, (AA) m , the C-terminal tail, (AA) p , or both, in the precursor polypeptides of formula (I), (II), or (V) can comprise an intein protein.
- Inteins are polypeptides that are found as in-frame insertions in various natural proteins and can undergo a self-catalyzed intramolecular rearrangement leading to self-excision (self-splicing) of the intein and ligation of the flanking polypeptides together.
- The mechanism of intein splicing is well known (Xu and Perler 1996; Paulus 2000) and it involves the formation of a (thio)ester bond at the junction between the intein and the polypeptide fused to the N-terminus of the intein (commonly referred to as "N-extein") by action of a catalytic cysteine or serine residue at the first position of the intein sequence.
- This reversible N(backbone)→S(side-chain) or N(backbone)→O(side-chain) acyl transfer is followed by a trans(thio)esterification step whereby the N-extein acyl unit is transferred to the side-chain thiol/hydroxyl group of a cysteine, serine, or threonine residue at the first position of the polypeptide fused to the C-terminus of the intein ("C-extein").
- the last step of the intein self-splicing process involves cleavage of the peptide bond connecting the intein to the C-extein via an intramolecular transamidation reaction by action of a conserved catalytic asparagine residue at the C-terminal position of the intein sequence (Paulus 2000).
- intein protein capable of only C-terminal splicing (i.e., cleavage of the peptide bond between the intein and C-extein), which can occur spontaneously or be promoted via a change in pH or temperature, depending on the nature of the intein and of the N-terminal amino acid(s) in the C-extein sequence.
- certain intein proteins occur as split inteins, having an N-domain and C-domain.
- Upon association of the N-domain with the C-domain, split inteins acquire the ability to self-splice according to a mechanism analogous to that of single-polypeptide intein proteins (Mootz 2009). As for the latter, the N-terminal cysteine or serine residue and C-terminal asparagine residue can be mutated, resulting in altered splicing behavior as described above (Perler 2005; Xu and Evans 2005; Mootz 2009; Elleuche and Poggeler 2010).
- introduction of a natural or engineered intein protein within the N-terminal tail, (AA) m , or C-terminal tail, (AA) p , of the precursor polypeptide of formula (I), (II), or (V) results in the formation of a macrocyclic peptide that is fused to either the C-terminus or the N-terminus, respectively, of such natural or engineered intein.
- This aspect enables one to control and modulate the release of the macrocyclic peptide from the intein-fused polypeptide based on the self-splicing and altered splicing behavior of natural and engineered intein proteins as summarized above.
- This aspect can be useful to facilitate the isolation and characterization of the macrocyclic peptide from a complex mixture such as, for example, the lysate of a cell expressing the precursor polypeptide or a cell-free translation system.
- This aspect can also be useful to facilitate the accumulation, and if desired, control the formation of a target macrocyclic peptide, prepared according to the methods described herein, inside a cell-based expression host.
- this capability can facilitate the functional screening of in vivo (i.e., in-cell) produced macrocyclic peptide libraries, prepared according to the methods disclosed herein, using an intracellular reporter system or a selection system as described above.
- Nucleotide sequences encoding for intein proteins that can be used can be derived from naturally occurring inteins and engineered variants thereof. A rather comprehensive list of such inteins is provided by the Intein Registry (http://www.neb.com/neb/inteins.html). Inteins that can be used include, but are not limited to, any of the naturally occurring inteins from organisms belonging to the Eucarya, Eubacteria, and Archaea.
- inteins of the GyrA group e.g., Mxe GyrA, Mfl GyrA, Mgo GyrA, Mkas GyrA, Mle-TN GyrA, Mma GyrA
- DnaB group e.g., Ssp DnaB, Mtu -CDC1551 DnaB, Mtu-H37Rv DnaB, Rma DnaB
- RecA group e.g., Mtu-H37Rv RecA, Mtu-So93 RecA
- RIR1 group e.g., Mth RIR1, Chy RIR1, Pfu RIR1-2, Ter RIR1-2, Pab RIR1-3
- Vma group e.g., Sce Vma, Ctr Vma
- intein Mxe GyrA SEQ ID NO:1
- the engineered 'mini' Ssp DnaB ('eDnaB', SEQ ID NO:2)
- Intein proteins suitable in the methods described herein include, but are not limited to, engineered variants of natural inteins (or genetic fusion of split inteins), which have been modified by mutagenesis in order, for example, to prevent or minimize splicing at the N-terminal or C-terminal end of the intein.
- modifications include, but are not limited to, mutation of the conserved cysteine or serine residue at the N-terminus of the intein (e.g., via substitution to an alanine) with the purpose, for example, of preventing cleavage at the N-terminus of the intein.
- modifications include, but are not limited to, mutation of the conserved asparagine residue at the C-terminus of the intein (e.g., via substitution to an alanine) with the purpose, for example, of preventing cleavage at the C-terminus of the intein. Examples of these modifications are provided in Example 2.
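- The two alanine substitutions described above can be expressed as point mutations on an intein sequence written in one-letter code. The sketch below is illustrative only: the helper names are hypothetical, residue numbering is assumed to start at the first intein residue, and any sequence passed in (including the toy sequence in the usage note) is a placeholder rather than one of the sequences of the Sequence Listing.

```python
import re

# Standard point-mutation notation, e.g. "N198A": wild-type residue, position, replacement.
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_point_mutation(sequence: str, mutation: str) -> str:
    """Apply a mutation such as 'C1A' to a one-letter protein sequence.

    Positions are 1-based. The wild-type residue is checked so that a wrong
    numbering scheme fails loudly instead of silently mutating another residue.
    """
    match = _MUTATION.match(mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation: {mutation!r}")
    wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    index = position - 1
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


def block_n_terminal_cleavage(intein: str) -> str:
    """Replace the first-position cysteine or serine of the intein with alanine."""
    if not intein or intein[0] not in "CS":
        raise ValueError("intein must start with cysteine or serine")
    return apply_point_mutation(intein, f"{intein[0]}1A")


def block_c_terminal_cleavage(intein: str) -> str:
    """Replace the last-position asparagine of the intein with alanine."""
    if not intein or intein[-1] != "N":
        raise ValueError("intein must end with asparagine")
    return apply_point_mutation(intein, f"N{len(intein)}A")
```

- For example, with the toy placeholder `"CITGDALVALN"`, `block_n_terminal_cleavage` returns `"AITGDALVALN"` and `block_c_terminal_cleavage` returns `"CITGDALVALA"`.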
- Intein variants useful for the methods disclosed herein also include, but are not limited to, engineered inteins whose internal endonuclease domain, which is not essential for the splicing mechanism, is removed.
- a variant of Ssp DnaB ('eDnaB', SEQ ID NO:2) lacking the internal endonuclease domain is used for the preparation of the precursor polypeptides.
- Inteins to be comprised in the precursor polypeptide can also be engineered with the purpose, for example, of altering the splicing properties of the intein in order to increase or reduce the splicing efficiency or in order to make the intein-catalyzed splicing process dependent upon variation of certain parameters such as pH or temperature.
- the N-terminal tail, (AA) m , the C-terminal tail, (AA) p , or both, in the precursor polypeptides of formula (I), (II), and (V) comprise an intein protein, or an engineered variant thereof.
- the N-terminal tail, (AA) m , the C-terminal tail, (AA) p , or both, in the precursor polypeptides of formula (I), (II), and (V) comprise an intein protein selected from the group consisting of Mxe GyrA (SEQ ID NO:1), eDnaB (SEQ ID NO:2), Hsp -NRC1 CDC21 (SEQ ID NO:3), Ceu ClpP (SEQ ID NO:4), Tag Pol-1 (SEQ ID NO:5), Tfu Pol-1 (SEQ ID NO:6), Tko Pol-1 (SEQ ID NO:7), Psp -GBD Pol (SEQ ID NO:8), Tag Pol-2 (SEQ ID NO:9), Thy Pol-1 (SEQ ID NO:10), Tko Pol-2 (SEQ ID NO:11), Tli Pol-1 (SEQ ID NO:12), Tma Pol (SEQ ID NO:13), Tsp
- the N-terminal tail, (AA) m , the C-terminal tail, (AA) p , or both, in the precursor polypeptides of formula (I), (II), and (V) comprise the N-domain, C-domain, or both the N-domain and C-domain of a split intein, or an engineered variant thereof.
- the N-terminal tail, (AA) m , the C-terminal tail, (AA) p , or both, in the precursor polypeptides of formula (I), (II), and (V) comprise the N-domain, C-domain, or both the N-domain and C-domain of a split intein selected from the group consisting of Ssp DnaE (SEQ ID NO:61- SEQ ID NO:62), Neq Pol (SEQ ID NO:63- SEQ ID NO:64), Asp DnaE (SEQ ID NO:65- SEQ ID NO:66), Npu -PCC73102 DnaE (SEQ ID NO:67- SEQ ID NO:68), Nsp- PCC7120 DnaE (SEQ ID NO:69- SEQ ID NO:70), Oli DnaE (SEQ ID NO:71-SEQ ID NO:72), Ssp -PCC7002 DnaE (SEQ ID NO:73- SEQ ID NO:74), T
- the N-terminal tail, (AA) m , in the precursor polypeptides of formula (I), (II), and (V) comprises the C-domain of a split intein and the C-terminal tail, (AA) p , of said precursor polypeptides comprises the corresponding N-domain of the split intein.
- the N-terminal tail, (AA) m , in the precursor polypeptides of formula (I), (II), and (V) comprises the C-domain of a split intein selected from the group consisting of Ssp DnaE-c (SEQ ID NO:62), Neq Pol-c (SEQ ID NO:64), Asp DnaE-c (SEQ ID NO:66), Npu- PCC73102 DnaE-c (SEQ ID NO:68), Nsp -PCC7120 DnaE-c (SEQ ID NO:70), Oli DnaE-c (SEQ ID NO:72), Ssp-PCC7002 DnaE-c (SEQ ID NO:74), Tvu DnaE-c (SEQ ID NO:76), and engineered variants thereof; and the C-terminal tail, (AA) p , comprises the corresponding N-domain of the split intein selected from the group consisting of Ssp DnaE-n (SEQ ID NO
- polynucleotide molecules are provided encoding for precursor polypeptides of formula (I), (II), and (V) as defined above, the latter, (V), not being according to the invention but present for illustration purposes only.
- Polynucleotide molecules are provided encoding for the aminoacyl-tRNA synthetases and cognate tRNA molecules for the ribosomal incorporation of the amino acid Z into the precursor polypeptides of formula (I) and (II) and for the ribosomal incorporation of the amino acid Z2 into the precursor polypeptides of formula (V).
- Polynucleotide molecules are provided encoding for polypeptide sequences that can be introduced within the N-terminal tail ((AA) m ) or C-terminal tail ((AA) p ) of the precursor polypeptides of formula (I), (II) and (V), such as peptide and protein affinity tags, reporter proteins and enzymes, carrier proteins of a display system, and intein proteins, as described above. Since the correspondence of all the possible three-base codons to the various amino acids is known, providing the amino acid sequence of the polypeptide provides also a description of all the polynucleotide molecules encoding for such polypeptide.
- the codons are selected to fit the host cell in which the polypeptide is being expressed.
- codons used in bacteria can be used to express the polypeptide in a bacterial host.
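- The two points above, that an amino acid sequence implies every polynucleotide encoding it and that codons can be chosen to suit the host, can be sketched as follows. The degeneracy values are those of the standard genetic code. The single codon per amino acid in `ILLUSTRATIVE_BACTERIAL_CODONS` is a placeholder assumption for the example (each codon does encode the stated residue, but the table is not a measured codon-usage table for any particular host).

```python
from math import prod

# Number of synonymous codons per amino acid in the standard genetic code.
DEGENERACY = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2, "K": 2, "N": 2, "Q": 2, "Y": 2,
    "M": 1, "W": 1,
}

# One valid codon per amino acid; the choice is an illustrative assumption.
ILLUSTRATIVE_BACTERIAL_CODONS = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def count_encodings(peptide: str) -> int:
    """Number of distinct DNA sequences that encode `peptide` (stop codon excluded)."""
    return prod(DEGENERACY[residue] for residue in peptide)


def back_translate(peptide: str, codon_table: dict[str, str]) -> str:
    """Return one coding sequence for `peptide` using a single codon per residue."""
    return "".join(codon_table[residue] for residue in peptide)
```

- As a usage example, the FLAG tag DYKDDDDK consists only of residues with two codons each, so `count_encodings("DYKDDDDK")` gives 2^8 = 256 possible coding sequences, of which `back_translate` picks one.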
- the polynucleotides may be linked to one or more regulatory sequences controlling the expression of the polypeptide-encoding gene to form a recombinant polynucleotide capable of expressing the polypeptide.
- oligonucleotide primers having a predetermined or randomized sequence can be prepared chemically by solid phase synthesis using commercially available equipment and reagents. Polynucleotide molecules can then be synthesized and amplified using a polymerase chain reaction, digested via endonucleases, ligated together, and cloned into a vector according to standard molecular biology protocols known in the art (e.g., Sambrook et al. Molecular Cloning: A Laboratory Manual (Third Edition), Cold Spring Harbor Press, 2001). These methods, in combination with the mutagenesis methods mentioned above, can be used to generate polynucleotide molecules that encode for the aforementioned polypeptides as well as suitable vectors for the expression of these polypeptides in a host expression system.
- the precursor polypeptides can be produced by introducing said polynucleotides into an expression vector, by introducing the resulting vectors into an expression host, and by inducing the expression of the encoded precursor polypeptides in the presence of the amino acid Z (or Z2) and, whenever necessary, also in the presence of a suitable stop codon or frameshift codon suppression system for mediating the incorporation of the amino acid Z (or Z2) into the precursor polypeptides.
- Nucleic acid molecules can be incorporated into any one of a variety of expression vectors suitable for expressing a polypeptide.
- Suitable vectors include, but are not limited to, chromosomal, nonchromosomal, artificial and synthetic DNA sequences, e.g., derivatives of SV40; bacterial plasmids; phage DNA; baculovirus; yeast plasmids; vectors derived from combinations of plasmids and phage DNA; and viral DNA such as vaccinia, adenovirus, fowl pox virus, pseudorabies, adeno-associated viruses, retroviruses and many others.
- Any vector that transduces genetic material into a cell, and, if replication is desired, which is replicable and viable in the relevant host can be used.
- a large number of expression vectors and expression hosts are known in the art, and many of these are commercially available.
- a person skilled in the art will be able to select suitable expression vectors for a particular application, e.g., the type of expression host (e.g., in vitro systems, prokaryotic cells such as bacterial cells, and eukaryotic cells such as yeast, insect, or mammalian cells) and the expression conditions selected.
- Expression hosts that may be used for the preparation of the precursor polypeptides and macrocyclic peptides include, but are not limited to, any systems that support the transcription, translation, and/or replication of a nucleic acid.
- the expression host system is a cell.
- Host cells for use in expressing the polypeptides encoded by the expression vector of this disclosure are well known in the art and include, but are not limited to, bacterial cells (e.g., Escherichia coli, Streptomyces ); fungal cells such as yeast cells (e.g., Saccharomyces cerevisiae, Pichia pastoris ); insect cells; plant cells; and animal cells, such as mammalian cells and human cells.
- These systems also include, but are not limited to, lysates of prokaryotic cells (e.g., bacterial cells) and lysates of eukaryotic cells (e.g., yeast, insect, or mammalian cells). These systems also include, but are not limited to, in vitro transcription/translation systems, many of which are commercially available.
- the choice of the expression vector and host system depends on the type of application intended for the methods disclosed herein and a person skilled in the art will be able to select a suitable expression host based on known features and application of the different expression hosts.
- When it is desired to evaluate the interaction between the macrocyclic peptide(s) generated via the methods disclosed herein and a bacterial, yeast, or human cell component, a bacterial, yeast, or human expression host, respectively, can be used.
- the expression host system is a cell.
- the formation of the macrocyclic peptides from the biosynthetic polypeptides as defined above is carried out within the cell-based expression host that produces the precursor polypeptides, so that the macrocyclic peptides are produced within this cell-based expression host.
- This method comprises providing a nucleic acid encoding for the precursor polypeptide, introducing the nucleic acid into the cell-based expression host, inducing the expression of the precursor polypeptide, and allowing for the precursor polypeptide to undergo intramolecular cyclization via a bond-forming reaction between the side-chain sulfhydryl group of the cysteine and the FG1 group of the amino acid Z (or between the cysteines and the FG1 and FG2 groups of the amino acid Z2), thereby producing the macrocyclic peptide inside the cell-based expression host.
- the formation of the macrocyclic peptides from the biosynthetic polypeptides as defined above is carried out on the surface of a cell or on a viral particle, so that the macrocyclic peptides are produced as tethered to a cell or a viral particle, respectively.
- This method comprises providing a nucleic acid encoding for the precursor polypeptide, wherein the N- or C-terminal tail comprises a polypeptide component of the cell membrane (e.g., S.
- the nucleic acid into the expression host, inducing the expression of the precursor polypeptide, allowing for the precursor polypeptide to be integrated into the cell membrane or viral particle, and allowing for the precursor polypeptide to undergo intramolecular cyclization via a bond-forming reaction between the side-chain sulfhydryl group of the cysteine and the FG1 group of the amino acid Z (or between the cysteines and the FG1 and FG2 groups of the amino acid Z2), thereby producing the macrocyclic peptide as tethered to the membrane of the cell or to the viral particle.
- the formation of the macrocyclic peptides from the biosynthetic polypeptides as defined above is carried out within a cell-free expression system, so that the macrocyclic peptides are produced within this cell-free expression system.
- This method comprises providing a nucleic acid encoding for the precursor polypeptide, introducing the nucleic acid into the cell-free expression host, inducing the expression of the precursor polypeptide, and allowing for the precursor polypeptide to undergo intramolecular cyclization via a bond-forming reaction between the side-chain sulfhydryl group of the cysteine and the FG1 group of the amino acid Z (or between the cysteines and the FG1 and FG2 groups of the amino acid Z2), thereby producing the macrocyclic peptide within the cell-free expression host.
- a method is also provided for making a library of macrocyclic peptides via cyclization of a plurality of precursor polypeptides of formula (I) or (II) that contain a heterogeneous peptide target sequence (AA) n , or a heterogeneous N-terminal tail (AA) m , or a heterogeneous C-terminal tail (AA) p , or a combination of these.
- This method comprises: (a) constructing a plurality of nucleic acid molecules encoding for a plurality of precursor polypeptides, said precursor polypeptides having a heterogeneous peptide target sequence (AA) n , or a heterogeneous N-terminal tail (AA) m , or a heterogeneous C-terminal tail (AA) p , or a combination of these; (b) introducing each of the plurality of said nucleic acid molecules into an expression vector, and introducing the resulting vectors into an expression host; (c) expressing the plurality of precursor polypeptides; (d) allowing for the precursor polypeptides to undergo intramolecular cyclization via a bond-forming reaction between the side-chain sulfhydryl group of the cysteine and the FG1 group of the amino acid Z, thereby producing a plurality of macrocyclic peptides.
- This method comprises: (a) constructing a plurality of nucleic acid molecules encoding for a plurality of precursor polypeptides, said precursor polypeptides having a heterogeneous peptide target sequence (AA) n , or a heterogeneous second peptide target sequence (AA) o , or a heterogeneous N-terminal tail (AA) m , or a heterogeneous C-terminal tail (AA) p , or a combination of these; (b) introducing each of the plurality of said nucleic acid molecules into an expression vector, and introducing the resulting vectors into an expression host; (c) expressing the plurality of precursor polypeptides; (d) allowing for the precursor polypeptides to undergo intramolecular cyclization via a bond-forming reaction between the side-chain sulfhydryl group of the cysteines and the FG1 and FG2 groups of the amino acid Z2, thereby producing a plurality of macrocyclic peptides.
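- Step (a) of the library methods above amounts to enumerating variants of a heterogeneous region. The sketch below shows how the theoretical library size follows from the residues allowed at each randomized position of the target sequence. The template string, the `*` placeholder standing for the amino acid Z, and the per-position residue choices are hypothetical and chosen only for illustration.

```python
from itertools import product
from math import prod
from typing import Iterator, Sequence


def library_size(position_choices: Sequence[str]) -> int:
    """Theoretical number of distinct target sequences in the library."""
    return prod(len(choices) for choices in position_choices)


def enumerate_targets(position_choices: Sequence[str]) -> Iterator[str]:
    """Yield every target sequence; position i draws from position_choices[i]."""
    for combination in product(*position_choices):
        yield "".join(combination)


def build_precursors(template: str, position_choices: Sequence[str]) -> list[str]:
    """Insert each target sequence into a precursor template.

    `template` must contain the literal placeholder '{target}'. Everything else
    in it (tails, cysteine, the '*' standing for Z) is copied unchanged.
    """
    if "{target}" not in template:
        raise ValueError("template must contain '{target}'")
    return [template.format(target=t) for t in enumerate_targets(position_choices)]
```

- For instance, `build_precursors("MG*{target}C", ["AG", "HP", "Q"])` yields the four sequences `MG*AHQC`, `MG*APQC`, `MG*GHQC`, and `MG*GPQC`. Allowing all twenty natural amino acids at n positions gives 20^n members, so full enumeration is practical only for short randomized regions.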
- each of the plurality of macrocyclic peptides prepared as described above is tethered to a cell component, to a cell membrane component, to a bacteriophage, to a viral particle, or to a DNA molecule, via a polypeptide comprised within the N-terminal tail or within the C-terminal tail of said macrocyclic peptide molecule.
- nucleotide molecules that encode for precursor polypeptides of formula (I), (II), or (V) containing a heterogeneous peptide target sequence (AA) n , a heterogeneous second peptide target sequence (AA) o , a heterogeneous N-terminal tail (AA) m , a heterogeneous C-terminal tail (AA) p , or a combination of these.
- the compounds provided herein may contain one or more chiral centers. Accordingly, the compounds are intended to include, but not be limited to, racemic mixtures, diastereomers, enantiomers, and mixtures enriched in at least one stereoisomer or a plurality of stereoisomers.
- When a group of substituents is disclosed herein, all the individual members of that group and all subgroups, including any isomers, enantiomers, and diastereomers, are intended to be included in the disclosure.
- all isotopic forms of the compounds disclosed herein are intended to be included in the disclosure. For example, it is understood that any one or more hydrogens in a molecule disclosed herein can be replaced with deuterium or tritium.
- This example demonstrates the preparation of various cysteine-reactive unnatural amino acids, i.e., various Z and Z2 amino acids, which can be used for preparation of macrocyclic peptide molecules according to the general methods illustrated in FIGS. 1A-B , 2A-B , 3A-B , 4A-B , and 37A-B .
- the unnatural amino acid 4-(2-bromoethoxy)-phenylalanine (1, p-2beF) was prepared according to the synthetic route provided in Scheme 1 of FIG. 5.
- the unnatural amino acid Nε-((2-bromoethoxy)carbonyl)-lysine (2, 2-becK) was prepared according to the synthetic route provided in Scheme 2 of FIG. 5.
- the unnatural amino acid 4-(1-bromoethyl)-phenylalanine (3, p-1beF) was prepared according to the synthetic route provided in Scheme 3 of FIG. 5.
- the unnatural amino acid Nε-((2-chloroethoxy)carbonyl)-lysine (4, 2-cecK) was prepared according to the synthetic route provided in Scheme 4 of FIG. 6.
- the unnatural amino acid Nε-(buta-2,3-dienoyl)-lysine (5, bdnK) was prepared according to the synthetic route provided in Scheme 5 of FIG. 6.
- the bifunctional unnatural amino acid O-(2,3-dibromoethyl)-tyrosine (6, OdbpY) was prepared according to the synthetic route provided in Scheme 6 of FIG. 6.
- cysteine-reactive amino acids of general formula (III), (IV), (VI), or (VII) can be prepared in an analogous manner either through modification of naturally occurring amino acids (e.g., p-2beF, 2-becK, 2-cecK, bdnK, OdbpY) or via synthesis ex novo (e.g., p-1beF).
- Nε-((2-bromoethoxy)carbonyl)-lysine (2becK) (2): To a solution of Nα-tert-butoxycarbonyl-lysine (1 g, 4.06 mmol) and NaOH (162.4 mg, 4.06 mmol, 1 eq) dissolved in 20 mL of water, 2-bromoethylchloroformate (0.435 mL, 4.06 mmol, 1 eq) and, separately, an additional equivalent of NaOH were added simultaneously dropwise over 30 min. The reaction mixture was stirred at room temperature for 18 h. Upon acidification with HOAc, the aqueous phase was extracted with EtOAc (3 x 80 mL).
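- The quantities quoted in this procedure can be cross-checked with a short stoichiometry calculation. The sketch below assumes that the starting material is mono-Boc-protected lysine with molecular formula C11H22N2O4 (the formula is the same whichever nitrogen carries the Boc group) and uses standard atomic masses; the helper names are hypothetical.

```python
# Standard atomic masses in g/mol, rounded to three decimals.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "Na": 22.990}

# Assumed molecular formulas for the two weighed reagents.
BOC_LYSINE = {"C": 11, "H": 22, "N": 2, "O": 4}   # mono-Boc lysine, C11H22N2O4
SODIUM_HYDROXIDE = {"Na": 1, "O": 1, "H": 1}      # NaOH


def molar_mass(formula: dict[str, int]) -> float:
    """Molar mass in g/mol from an element-count dictionary."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def millimoles(mass_mg: float, formula: dict[str, int]) -> float:
    """Convert a mass in milligrams to millimoles (mg divided by g/mol gives mmol)."""
    return mass_mg / molar_mass(formula)


def equivalents(reagent_mmol: float, limiting_mmol: float) -> float:
    """Molar equivalents of a reagent relative to the limiting reagent."""
    return reagent_mmol / limiting_mmol


lysine_mmol = millimoles(1000.0, BOC_LYSINE)      # 1 g of starting material
naoh_mmol = millimoles(162.4, SODIUM_HYDROXIDE)   # 162.4 mg of NaOH
print(f"Boc-lysine: {lysine_mmol:.2f} mmol")
print(f"NaOH: {naoh_mmol:.2f} mmol ({equivalents(naoh_mmol, lysine_mmol):.2f} eq)")
```

- Both values round to 4.06 mmol, in agreement with the amounts and the 1 eq figure stated in the procedure.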
- reaction was stirred for 1 h at room temperature followed by dropwise addition of Nα-tert-butoxycarbonyl-lysine (1.4 g, 5.72 mmol, 1.1 eq) and triethylamine (1.2 mL, 7.8 mmol, 1.5 eq).
- the reaction was monitored by TLC and upon completion (4-5 h) extracted with water.
- The organic layer was evaporated and the crude product was purified using flash column chromatography with 10:9:1 hexane:EtOAc:HOAc as solvent system. Fractions containing the desired product were pooled together and the solvent was removed under reduced pressure, giving the desired product in 55% yield.
- reaction mixture was stirred at room temperature for 8 h, after which the reaction mixture was filtered, diluted with 60 mL of water, acidified with acetic acid to pH 4 and extracted with 2 x 100 mL of EtOAc. Organic layers were combined and dried over sodium sulfate. The solvent was removed under reduced pressure, yielding a yellow oil as crude product, which was purified by flash column chromatography using 10:9:1 hexane:EtOAc:HOAc as solvent system. Fractions of interest were combined and the solvent was removed under reduced pressure, yielding an off-white powder as product (g, %).
- a series of plasmid-based vectors was prepared that encode precursor polypeptides in different formats (Table 1) according to the macrocyclization methods schematically described in FIGS. 1A-B, 2A-B, 3A-B, 4A-B and 37A-B.
- a first series of constructs (Entries 1-9 and 13-15, Table 1) were prepared for the expression of precursor polypeptides of general formula (I), in which (i) the N-terminal tail, (AA) m , consists of a Met-Gly dipeptide; (ii) the target peptide sequence, (AA) n , consists of 1- to 12-amino acid long polypeptides, some of which were designed to include a streptavidin-binding HPQ motif (Katz 1995; Naumann, Savinov et al.
- a second series of constructs (Entries 10-12, Table 1) were prepared for the expression of precursor polypeptides of general formula (II), in which (i) the N-terminal tail, (AA) m , consists of a short (2 to 6 amino acid-long) polypeptide; (ii) the target peptide sequence, (AA) n , consists of a 3 to 7-amino acid long polypeptide; and (iii) the C-terminal tail, (AA) p , consists of the N198A variant of Mxe GyrA intein (SEQ ID NO:1) followed by a polyhistidine tag.
- an amber stop codon was used to enable the introduction of the desired, cysteine-reactive unnatural amino acid Z, downstream of the peptide target sequence via amber stop codon suppression.
- a third series of constructs (Entries 16-20, Table 1) were prepared for the expression of precursor polypeptides of general formula (I), in which (i) the N-terminal tail, (AA) m , contains the C-domain of Synechocystis sp. DnaE split intein (SEQ ID NO:62); (ii) the C-terminal tail, (AA) p , contains the N-domain of Synechocystis sp. DnaE split intein (SEQ ID NO:61); and (iii) a streptavidin-binding HPQ motif (Naumann, Savinov et al.
- CBD corresponds to the Chitin Binding Domain (CBD) of Bacillus circulans chitinase A1 (SEQ ID NO:130)
- DnaE N and DnaEc correspond to the N-domain and C-domain, respectively, of Synechocystis sp. DnaE split intein (SEQ ID NOS: 61 and 62).
- the reactive amino acid residues involved in peptide macrocyclization i.e., Cys and Z residues; Cys and Z2 residues
- the plasmid vector pET22b(+) (Novagen) was used as cloning vector to prepare the plasmids for the expression of the precursor polypeptides of Entries 1-15 and 21-22 in Table 1. Briefly, synthetic oligonucleotides (Integrated DNA Technologies) were used for the PCR amplification of a gene encoding the N-terminal peptide and peptide target sequence fused to GyrA N198A intein, using a previously described GyrA-containing vector (pBP_MG6) (Smith, Vitali et al. 2011) as template. The resulting PCR product (ca. 0.6 kbp) was digested with Nde I and Xho I and cloned into pET22b(+) to provide the plasmids for the expression of the precursor polypeptides of Entries 1-15 and 21-22 in Table 1.
- the cloning process placed the polypeptide-encoding gene under the control of an IPTG-inducible T7 promoter and introduced a poly-histidine tag at the C-terminus of the intein.
- Plasmids for the expression of the polypeptide constructs of Entries 16 through 20 of Table 1 were prepared in a similar manner but using pBAD plasmid (Life Technologies) as the cloning and expression vector.
- the genes encoding for DnaE N and DnaEc were amplified from Addgene plasmids pSFBAD09 and pJJDuet30. The sequences of the plasmid constructs were confirmed by DNA sequencing.
- This example illustrates how a suitable tRNA/aminoacyl-tRNA synthetase pair can be identified for the purpose of incorporating a desired cysteine-reactive, unnatural amino acid into a precursor polypeptide of general formula (I), (II), or (V) according to the methods disclosed herein.
- this example describes the identification of tRNA/aminoacyl-tRNA synthetase pairs for the incorporation of the unnatural amino acid 4-(2-bromoethoxy)-phenylalanine (p-2beF), N ⁇ -((2-bromoethoxy)carbonyl)-lysine (2becK), 4-(1-bromoethyl)-phenylalanine (p-1beF), N ⁇ -((2-chloroethoxy)carbonyl)-lysine (2cecK), N ⁇ -(buta-2,3-dienoyl)-lysine (bdnK), and O-(2,3-dibromoethyl)-tyrosine (OdbpY), which were synthesized as described in Example 1.
- a high-throughput fluorescence-based screen was applied to identify viable tRNA/aminoacyl-tRNA synthetase (AARS) pairs for the ribosomal incorporation of the unnatural amino acid p-2beF, 2becK, p-1beF, 2cecK, bdnK, or OdbpY, in response to an amber stop codon.
- E. coli cells are co-transformed with two plasmids with compatible origins of replication and selection markers; one plasmid directs the expression of the tRNA/AARS pair to be tested, whereas the second plasmid contains a gene encoding a variant of Yellow Fluorescent Protein (YFP), in which an amber stop codon (TAG) is introduced at the second position of the polypeptide sequence following the initial Met residue (called YFP(TAG)).
- the ability of the tRNA/AARS pair to suppress the amber stop codon with the unnatural amino acid of interest can be thus determined and quantified based on the relative expression of YFP as determined by fluorescence.
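One way to quantify the readout of such a screen is sketched below. This is a minimal illustration, not the procedure of the source: normalizing fluorescence to culture density (OD600), subtracting a control culture grown without the unnatural amino acid, and expressing the result relative to a wild-type YFP control are all assumptions, and the numbers are hypothetical.

```python
# Hypothetical sketch of how YFP(TAG) reporter data could be scored.
# The normalization scheme and all values are assumptions for illustration.

def normalized_fluorescence(fluorescence: float, od600: float) -> float:
    """Fluorescence per unit of culture density."""
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    return fluorescence / od600


def relative_suppression(with_uaa: float, without_uaa: float,
                         wild_type: float) -> float:
    """UAA-dependent YFP signal as a fraction of a wild-type YFP control."""
    return (with_uaa - without_uaa) / wild_type


# Hypothetical plate-reader values: (raw fluorescence, OD600)
plus_uaa = normalized_fluorescence(5000.0, 1.0)
minus_uaa = normalized_fluorescence(500.0, 1.0)
wt_yfp = normalized_fluorescence(10000.0, 1.0)

efficiency = relative_suppression(plus_uaa, minus_uaa, wt_yfp)
print(f"relative amber suppression: {efficiency:.2f}")
```

A pair that gives a high signal with the amino acid and a low signal without it would be considered both active and selective under this scoring.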
- mazei pyrrolysyl-tRNA synthetase (SEQ ID NO:78) in combination with their cognate amber stop codon suppressor tRNA (i.e., Mj tRNA CUA Tyr (SEQ ID NO:101) for Mj AARSs and Mm / Mb tRNA CUA Pyl (SEQ ID NO:105) for the Mm and Mb AARSs) were tested for their ability to incorporate the target amino acids p-2beF, 2becK, p-1beF, 2cecK, bdnK, or OdbpY into the reporter YFP(TAG) protein.
- this panel of AARS enzymes included the known engineered AARSs Mj -pAcF-RS (SEQ ID NO:81), Mj -pAmF-RS (SEQ ID NO:87), Mb -CrtK-RS (SEQ ID NO:93), and Mm -pXF-RS (SEQ ID NO:91) (Young, Young et al. 2011) as well as the newly engineered Mj -OpgY2-RS (SEQ ID NO:85).
- the latter, which is derived from Mj -OpgY-RS (SEQ ID NO:84) (Deiters and Schultz 2005), carries an Ala32Gly mutation that was designed to facilitate the recognition of O-substituted tyrosine derivatives such as p-2beF and OdbpY based on the available crystal structure of the parent enzyme Mj -TyrRS (SEQ ID NO:77) (Kobayashi, Nureki et al. 2003). As illustrated by the representative data in FIGS. 7A-B, the AARS/tRNA pair consisting of Mj -OpgY2-RS/ Mj tRNA CUA Tyr was found to enable the efficient incorporation of p-2beF ( FIG.
- the Mj -pAcF-RS/ Mj tRNA CUA Tyr pair can enable efficient amber stop codon suppression with p-1beF; the Mb -CrtK-RS/ Mm / Mb tRNA CUA Pyl pair can enable efficient amber stop codon suppression with 2cecK or bdnK; and the Mj -OpgY2-RS/ Mj tRNA CUA Tyr pair can enable efficient amber stop codon suppression with OdbpY.
- Competent BL21(DE3) E. coli were cotransformed with a pEVOL-based plasmid (Smith, Vitali et al. 2011) for the expression of the desired AARS/tRNA pair and a pET22-YFP(TAG) plasmid for the expression of the reporter YFP protein.
- This example demonstrates the formation and isolation of macrocyclic peptides produced via the cyclization of ribosomally derived precursor polypeptides of general formula (I) and containing the cysteine-reactive unnatural amino acid p-2beF.
- this example demonstrates certain embodiments as schematically described in FIGS. 1A and 2A .
- these precursor polypeptides were expected to undergo cyclization via a nucleophilic substitution reaction between the cysteine side-chain thiol group and the p-2beF side-chain bromoalkyl group flanking the target peptide sequence after ribosomal synthesis of the precursor polypeptides in the expression system (E. coli) ( FIG. 8 ).
- To establish the occurrence and efficiency of the cyclization, these proteins were isolated by Ni-affinity chromatography exploiting the C-terminal poly-histidine tag present in these constructs (Table 1). In all the aforementioned constructs, a Thr residue was placed at the site preceding the GyrA intein ("I-1 site").
- the proteins were reacted with benzyl mercaptan in order to release the desired macrocyclic peptide (in the form of C-terminal benzyl thioester or the corresponding C-terminal carboxylic acid after thioester hydrolysis) from the GyrA intein via thiol-induced splicing of the intein.
- the reaction mixtures were then analysed by LC-MS to detect and quantify the amount of the desired thioether-linked macrocyclic product as well as that of any uncyclized linear byproduct, as judged based on the peak areas in the corresponding extracted-ion chromatograms ( FIGS. 10-15 ).
- Uncyclized byproducts would appear as unmodified linear peptides or as linear adducts where the bromoalkyl group in p-2beF has undergone nucleophilic substitution with the benzyl mercaptan reagent during the in vitro reaction or with glutathione in vivo.
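The percent cyclization described above can be estimated from integrated peak areas. The following is a minimal sketch, assuming that all species ionize with comparable efficiency so that extracted-ion peak areas are proportional to abundance; the peak areas and species labels are hypothetical.

```python
# Illustrative estimate of cyclization efficiency from extracted-ion
# chromatogram peak areas. Equal ionization efficiency is an assumption,
# and the areas below are made-up values.

def percent_cyclization(cyclic_area: float,
                        byproduct_areas: dict[str, float]) -> float:
    """Share of the cyclic product among all peptide species, in percent."""
    total = cyclic_area + sum(byproduct_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return 100.0 * cyclic_area / total


# Hypothetical peak areas (arbitrary units)
byproducts = {
    "unmodified linear peptide": 0.5,
    "benzyl mercaptan adduct": 0.3,
    "glutathione adduct": 0.2,
}
result = percent_cyclization(9.0, byproducts)
print(f"cyclization: {result:.1f}%")
```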
- GyrA intein contains a Cys at its N-terminal end which is crucial for mediating protein splicing in the context of the application of the present methods for producing peptide macrocycles inside the cells (see Example 5). Since this residue is partially buried within the active site (Klabunde, Sharma et al. 1998), we did not expect it to readily react with the p-2beF side chain. Notably, quantitative splicing of the GyrA moiety upon treatment of all the aforementioned constructs with benzyl mercaptan indicated that no reaction occurred between p-2beF and the catalytic Cys at the intein I+1 site (see representative results in FIGS. 17a-d).
- Macrocycles were analyzed using a Thermo Scientific HyPurity C4 column (particle size 5 µm, 100 x 2.1 mm i.d.) and a linear gradient of 5% to 95% ACN (with 0.1% formic acid) in water (with 0.1% formic acid) over 9 min.
- MALDI-TOF spectra were acquired on the Bruker Autoflex III mass spectrometer.
- Example 5 In vivo production of macrocyclic peptides from p-2beF-containing precursor polypeptides of general formula (I).
- This example further demonstrates the formation and isolation of macrocyclic peptides produced via the cyclization of ribosomally derived precursor polypeptides of general formula (I) and containing the cysteine-reactive unnatural amino acid p-2beF.
- this example provides a demonstration of the functionality of the methods described herein for the production of macrocyclic peptide within living bacterial cells.
- these precursor polypeptides were expected to result in the formation of macrocyclic peptides inside the living cell expression host ( E. coli ) via the intramolecular, thioether bond-forming reaction between the cysteine and p-2beF residue, followed by release of the cyclic peptide via spontaneous N-terminal splicing of the intein moiety.
- These constructs were also designed to contain a streptavidin-binding motif (HPQ) within the sequence of the resulting macrocyclic peptides (Table 1) in order to allow for the isolation of these peptides via streptavidin-affinity capturing directly from bacterial lysates. Accordingly, E. coli cells expressing these precursor polypeptides were grown overnight and lysed by sonication. The cell lysates were then passed over streptavidin-coated beads, from which streptavidin-bound material was eluted. LC-MS analysis of the eluates revealed the occurrence of the expected peptide macrocycle in each case, as illustrated by the LC-MS chromatograms and MS/MS spectra in FIGS. 25-27.
- Example 6 Preparation and isolation of macrocyclic peptides generated via cysteine cross-linking with different electrophilic amino acids.
- This example further demonstrates the formation and isolation of macrocyclic peptides produced via the cyclization of ribosomally derived precursor polypeptides of general formula (I).
- this example demonstrates how different cysteine-reactive unnatural amino acids of general structure (III) can be used for the purpose of generating macrocyclic peptides starting from ribosomally produced polypeptide precursors according to the methods described herein.
- orthogonal AARS/tRNA pairs could be readily identified to achieve the specific incorporation of the unnatural amino acids 2becK, 2cecK, p-1beF, or bdnK into a precursor polypeptide of choice.
- Each of these amino acids contains an electrophilic side-chain functionality (i.e., alkylbromide group in 2becK and p-1beF; alkylchloride group in 2cecK; allenamide group in bdnK), which was expected to react chemoselectively with a neighboring cysteine residue within the precursor polypeptide sequence according to the general methods provided herein.
- Detection and quantification of the cyclic product was carried out by LC-MS and MS/MS analysis as described in Example 4. These analyses revealed the occurrence of the desired macrocyclic peptide product in each case, as shown by the representative LC-MS extracted-ion chromatograms and MS/MS spectra in FIGS. 18-22.
- 2becK- and 2cecK-mediated peptide macrocyclization was found to occur very efficiently (>80%) when the cysteine residue is located within a six-residue distance from the electrophilic amino acid (i.e., with constructs 12mer-Z1C through 12mer-Z1C). Beyond this spacing distance, the % cyclization decreases significantly ( ⁇ 20%).
- p-1beF and bdnK were synthesized (Example 1) and tested here for their ability to induce peptide macrocyclization upon reaction with a proximal cysteine in the precursor polypeptide.
- p-1beF contains a benzylic, secondary alkyl bromide group, thus enabling the formation of more compact peptide ring structures as compared to those generated using p-2beF-mediated cysteine alkylation.
- bdnK was designed to contain an allenamide group, which is known to react chemoselectively with cysteine via a Michael addition reaction (Abbas, Xing et al. 2014).
- Example 7 Preparation and isolation of macrocyclic peptides from precursor polypeptides of general formula (II).
- This example demonstrates the formation and isolation of macrocyclic peptides produced via the cyclization of ribosomally derived precursor polypeptides of general formula (II). As such, this example demonstrates certain embodiments as schematically described in FIGS. 1B and 2B .
- the reactive Cys is located upstream of the unnatural amino acid, and specifically at position Z-4, Z-6 and Z-8.
- Analysis of the p-2beF-containing proteins according to the procedure described above (Example 4) revealed the occurrence of the desired cyclic peptide as the largely predominant product (95-99%) for all of the constructs tested ( FIG. 9A , FIGS. 34-35 ).
- efficient inter-side-chain cyclization (80-95%) was observed when the cysteine and unnatural amino acid are three (Z-4) and five (Z-6) residues apart, while a lower cyclization percentage was noted at the larger spacing distance (Z-8) (FIG. 9B).
- Example 8 In vivo production and isolation of bicyclic peptides.
- This example demonstrates certain embodiments as schematically described in FIG. 4A .
- this example demonstrates how bicyclic peptides can be generated from precursor polypeptides of general formula (I) via the combination of a split intein-mediated trans-splicing reaction and inter-side-chain cyclization reaction mediated by a cysteine and a cysteine-reactive unnatural amino acid according to the methods described herein. While split intein-mediated trans-splicing has proven useful for the generation and isolation of head-to-tail cyclic peptides in a variety of contexts (Scott, Abel-Santos et al.
- coli by means of an intramolecular, thioether bond-forming reaction between the cysteine and p-2beF residues and a DnaE-catalyzed trans-splicing reaction leading to ring closure (i.e., N-to-C-end cyclization) of the peptide sequence comprised between the C- and N-domain of the split intein.
- a streptavidin-binding motif HPQ was included within the sequence targeted for macrocyclization (Table 1). Accordingly, using an analogous procedure as that described in Example 5, lysates of E. coli cells expressing the aforementioned precursor polypeptides were passed over streptavidin-coated beads, from which streptavidin-bound material was eluted.
- the desired bicyclic peptide was isolated as the largely predominant product in each case (70-95%), as determined by LC-MS ( FIGS. 28-32 ).
- the bicyclic structure of these compounds was further evidenced by the corresponding MS/MS fragmentation spectra ( FIGS. 28-32 ).
- a chitin-binding domain was included at the C-terminus of the Int N domain in each construct (Table 1).
- LC-MS analysis of the protein fraction eluted from chitin beads showed that the split intein-mediated cyclization had occurred quantitatively or nearly quantitatively (>85%) for all the constructs tested (see representative MS spectra in FIGS. 33a-d).
- bicyclic structures across target sequences of varying length and composition supports the functionality and broad scope of the present methodology for the ribosomal synthesis of bicyclic peptides through the integration of split intein-mediated peptide circularization with inter-side-chain thioether bridge formation.
- the increased conformational rigidity imposed by the inter-side-chain thioether bridge is expected to improve the functional and/or stability properties of these bicyclic peptides as compared to the head-to-tail cyclized peptide counterpart.
- streptavidin-binding affinity of the bicyclic peptides obtained via cyclization of the cStrep3(S)-Z3C(p-2beF) and cStrep3(C)-Z8C(p-2beF) constructs was measured through an in-solution inhibition assay and compared with that of a 'monocyclic' counterpart ( cyclo [S(OpgY)TNCHPQFANA] (SEQ ID NO:189) where OpgY is O-propargyl-tyrosine).
- a streptavidin-binding surface is first created by immobilizing the bicyclic peptide obtained from the cStrep3(C)-Z8C(p-2beF) construct on maleimide-coated microtiter plates. Then, a fixed amount of streptavidin-horseradish peroxidase conjugate is added to the plate in the presence of varying amounts of the bicyclic or cyclic peptide. After washing, the amount of bound streptavidin is determined based on the residual peroxidase activity using a standard (ABTS) colorimetric assay.
- the IC50 value for the head-to-tail monocyclic peptide cyclo [S(OpgY)TNCHPQFANA] (SEQ ID NO:189) was determined to be 1.9 µM, while the thioether-constrained bicyclic peptides from the cStrep3(S)-Z3C(p-2beF) and cStrep3(C)-Z8C(p-2beF) constructs exhibited IC50 values of 3.7 and 0.77 µM, respectively (FIG. 39B).
- the >2-fold increase in streptavidin binding affinity exhibited by the latter as compared to the monocyclic counterpart exemplifies the inherent advantage provided by the presence of the additional intramolecular thioether linkage.
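The fold change quoted above follows directly from the reported values. The short calculation below uses only the three reported figures (1.9, 3.7 and 0.77 µM); treating the ratio of these values as a measure of relative affinity is the usual reading of such an inhibition assay and is assumed here.

```python
# Fold change in apparent affinity relative to the monocyclic reference,
# computed from the reported inhibition values (µM).
ic50_um = {
    "monocyclic": 1.9,
    "cStrep3(S)-Z3C(p-2beF)": 3.7,
    "cStrep3(C)-Z8C(p-2beF)": 0.77,
}


def fold_improvement(reference: float, test: float) -> float:
    """Values above 1 mean the test peptide inhibits at a lower concentration."""
    return reference / test


fold_changes = {
    name: fold_improvement(ic50_um["monocyclic"], value)
    for name, value in ic50_um.items()
    if name != "monocyclic"
}
for name, fold in fold_changes.items():
    print(f"{name}: {fold:.2f}-fold")
```

The Z8C bicyclic peptide comes out at roughly 2.5-fold better than the monocyclic reference, while the Z3C peptide is about two-fold weaker.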
- This example demonstrates the feasibility of generating polycyclic peptides using the methods provided herein.
- it demonstrates the formation and isolation of polycyclic peptides obtained via the post-translational cyclization of precursor polypeptides containing multiple Z/Cys pairs.
- It also demonstrates the formation and isolation of polycyclic peptides produced via the cyclization of ribosomally derived precursor polypeptides of general formula (V).
- this example demonstrates certain embodiments as schematically described in FIGS. 37A-B .
- this example illustrates the specific case in which two copies of the same cysteine-reactive amino acid are incorporated into the precursor polypeptide
- a person skilled in the art would immediately recognize that this approach can be readily extended to the use of two different cysteine-reactive amino acids, such as those described in FIGS. 5 and 6 .
- the ribosomal incorporation of two different cysteine-reactive unnatural amino acids into the precursor polypeptide can be achieved using methods known in the art, i.e., via suppression of two different stop codons (Wan, Huang et al. 2010) or via suppression of a stop codon and a four-base codon (Chatterjee, Sun et al. 2013; Sachdeva, Wang et al. 2014).
- results from structure-reactivity studies such as those described in FIGS. 9A-B can guide the design of appropriate precursor polypeptides for the formation of a polycyclic peptide with the desired pattern of thioether linkages (i.e., through the judicious choice of spacing distances between the different Z and Cys residues).
Description
- The present invention relates to methods and compositions for generating macrocyclic peptides from genetically encoded, ribosomally produced polypeptide precursors. The invention also relates to a recombinant host cell comprising an artificial nucleic acid encoding for a polypeptide.
- Peptide molecules represent valuable tools for investigating biological systems, studying the binding and activity properties of biomolecules (e.g., enzymes, cell receptors, antibodies, kinases), exploring the etiopathological causes of diseases, and validating pharmacological targets. Peptides are also attractive ligands for targeting protein-protein interactions and modulating the function of biological molecules such as enzymes and nucleic acids. The synthesis of combinatorial libraries of small peptides followed by screening of these chemical libraries in biological assays can enable the identification of compounds that exhibit a variety of biological and pharmacological properties. Bioactive peptides identified in this manner can constitute valuable lead compounds or facilitate the development of lead compounds towards the discovery of new drugs.
- While many peptides exhibit interesting biological activity, linear peptides do not generally represent suitable pharmacological agents as they are generally only poorly absorbed, do not cross biological membranes readily, and are prone to proteolytic degradation. In addition, linear peptides fail to bind proteins that recognize discontinuous epitopes. Molecular constraints that restrict the conformational freedom of the molecule backbone can be used to overcome these limitations. In many cases, conformationally constrained peptides exhibit enhanced enzymatic stability (Fairlie, Tyndall et al. 2000; Wang, Liao et al. 2005), membrane permeability (Walensky, Kung et al. 2004; Rezai, Bock et al. 2006; Rezai, Yu et al. 2006), and protein binding affinity (Tang, Yuan et al. 1999; Dias, Fasan et al. 2006) and selectivity (Henchey, Porter et al. 2010), compared to their linear counterparts. Constraints that lock in the active conformation of a peptide molecule can result in increased affinity due to the reduced conformational entropy loss upon binding to the receptor. Many bioactive and therapeutically relevant peptides isolated from natural sources indeed occur in cyclized form or contain intramolecular bridges that reduce the conformational flexibility of these molecules (e.g., immunosuppressant cyclosporin A, antitumor dolastatin 3 and diazonamide A, anti-HIV luzopeptin E2, and the antimicrobial vancomycin). Since macrocyclic peptides constitute promising molecular scaffolds for the development of bioactive compounds and therapeutic agents (Katsara, Tselios et al. 2006; Driggers, Hale et al. 2008; Obrecht, Robinson et al. 2009; Marsault and Peterson 2011), methods for generating macrocyclic peptides and combinatorial libraries thereof are of high synthetic value and practical utility, in particular in the context of drug discovery.
- While cyclic peptides can be prepared synthetically via a variety of known methods (White and Yudin 2011), the possibility to generate macrocyclic peptides starting from genetically encoded polypeptide precursors offers several advantages (Frost, Smith et al. 2013; Smith, Frost et al. 2013). Among these are: (a) the high combinatorial potential inherent to the ribosomal synthesis of genetically encoded polypeptides, which can enable the production of very large collections of peptide sequences (10⁸-10¹⁰ members or higher) in a cost- and time-effective manner; (b) the possibility to link these peptide libraries to powerful, high-throughput screening platforms such as phage display, mRNA display, or yeast display, in order to identify peptide ligands with the desired property (e.g., high binding affinity toward a target protein); (c) the ease with which these chemical libraries can be deconvoluted in order to identify the library members of interest (i.e., via sequencing of the peptide-encoding DNA or RNA sequence).
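To put library sizes on the order of 10⁸ to 10¹⁰ members in perspective, the theoretical sequence diversity of a fully randomized peptide can be computed as 20 raised to the number of randomized positions. The sketch below assumes that each randomized position can hold any of the 20 canonical amino acids; this is an illustrative simplification, not a statement about any particular library design.

```python
# Theoretical diversity of a randomized peptide library, assuming each
# randomized position can encode any of the 20 canonical amino acids.
CANONICAL_AMINO_ACIDS = 20


def library_size(n_randomized: int) -> int:
    """Number of distinct sequences with n fully randomized positions."""
    return CANONICAL_AMINO_ACIDS ** n_randomized


def min_positions_for(target_size: int) -> int:
    """Smallest number of randomized positions whose diversity reaches target."""
    n = 0
    while library_size(n) < target_size:
        n += 1
    return n


for n in range(5, 9):
    print(f"{n} randomized positions: {library_size(n):.2e} sequences")
```

Under this assumption, seven randomized positions already exceed 10⁸ sequences and eight exceed 10¹⁰.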
- Various methods have been developed for producing biological libraries of conformationally constrained peptides (Frost, Smith et al. 2013; Smith, Frost et al. 2013). For example, libraries of disulfide-constrained cyclic peptides have been prepared using phage display and fusing randomized polypeptide sequences flanked by two cysteines to a phage particle as described, e.g., in U.S. Pat. No. 7,235,626. Disulfide bridges are however potentially reactive and this chemical linkage is unstable under reducing conditions or in a reductive environment such as the intracellular milieu. Alternatively, ribosomally produced peptides have also been constrained through the use of cysteine- or amine-reactive cross-linking agents (Millward, Takahashi et al. 2005; Seebeck and Szostak 2006; Heinis, Rutherford et al. 2009; Schlippe, Hartman et al. 2012). A drawback of these methods is the risk of producing multiple undesired products via reaction of the cross-linking agents with multiple sites within the randomized peptide sequence or the carrier protein in a display system. In addition, these methods do not allow for the formation of macrocyclic peptides inside the polypeptide-producing cell host. Other methods have been described that are useful for preparing head-to-tail cyclic peptides by using natural (i.e., naturally occurring) or engineered (i.e., non-naturally occurring, artificial or synthetic) split inteins, as described in U.S. Pat. No. 7,354,756, U.S. Pat. No. 7,252,952 and U.S. Pat. No. 7,105,341. An advantage of these strategies is the possibility to couple the intracellular formation of cyclic peptide libraries with a cell-based reporter or selection system, which can facilitate the identification of functional peptide ligands (Horswill, Savinov et al. 2004; Cheng, Naumann et al. 2007; Naumann, Tavassoli et al. 2008; Young, Young et al. 2011). However, the peptide cyclization efficiency was found to be highly dependent on the peptide sequence (Scott, Abel-Santos et al. 2001). In addition, only head-to-tail cyclic peptides can be obtained through these strategies, which limits the extent of structural diversity of the ligand libraries generated through these methods. Finally, methods have also been reported for generating cyclic peptides through the enzymatic modification of linear peptide precursors (Hamamoto, Sisido et al. 2011; Touati, Angelini et al. 2011). However, the need for exogenous reagents and/or enzyme catalysts for mediating peptide cyclization and, in some cases, moderate cyclization efficiency limit the scope and utility of these approaches toward the generation and screening of cyclic peptide libraries.
- Efficient and versatile methods for generating macrocyclic peptides from ribosomally produced polypeptides would thus be highly desirable in the art. The methods and compositions described herein provide a solution to this need, enabling the ribosomal synthesis of cyclic peptides in vitro (i.e., in a cell-free system) and in vivo (i.e., inside a cell or on a surface of a cell) and in various 'configurations', namely in the form of macrocyclic peptides, lariat-shaped peptides, or as cyclic peptides fused to an N-terminus or C-terminus of a protein of interest, such as a carrier protein of a display system.
- Citation or identification of any reference in Section 2, or in any other section of this application, shall not be considered an admission that such reference is available as prior art to the present invention.
- A method is provided for making a macrocyclic peptide, the method comprising:
- a. providing an artificial nucleic acid molecule encoding for a polypeptide of structure:
(AA)m-Z-(AA)n-Cys-(AA)p (I)
or
(AA)m-Cys-(AA)n-Z-(AA)p (II)
wherein:
- i. (AA)m is an N-terminal amino acid or peptide sequence,
- ii. Z is a non-canonical amino acid carrying a side-chain functional group FGi, FGi being a functional group selected from the group consisting of -(CH2)nX, where X is F, Cl, Br, or I and n is an integer number from 1 to 10; -C(O)CH2X, where X is F, Cl, Br, or I; -CH(R')X, where X is F, Cl, Br, or I; -C(O)CH(R')X, where X is F, Cl, Br, or I; -OCH2CH2X, where X is F, Cl, Br, or I; -C(O)CH=C=C(R')(R"); -SO2C(R')=C(R')(R"); -C(O)C(R')=C(R')(R"); -C(R')=C(R')C(O)OR'; -C(R')=C(R')C(O)N(R')(R"); -C(R')=C(R')-CN; -C(R')=C(R')-NO2; -C≡C-C(O)OR'; -C≡C-C(O)N(R')(R"); unsubstituted or substituted oxirane; unsubstituted or substituted aziridine; 1,2-oxathiolane 2,2-dioxide; 4-fluoro-1,2-oxathiolane 2,2-dioxide; and 4,4-difluoro-1,2-oxathiolane 2,2-dioxide, where each R' and R" is independently H, an aliphatic, a substituted aliphatic, an aryl, or a substituted aryl group,
- iii. (AA)n is a target peptide sequence,
- iv. (AA)p is a C-terminal amino acid or peptide sequence;
- b. introducing the nucleic acid molecule into an expression system and expressing the nucleic acid molecule in the expression system, thereby producing the polypeptide; and
- c. allowing the functional group FGi to react with the side-chain sulfhydryl group (-SH) of the cysteine (Cys) residue(s), thereby producing the macrocyclic peptide.
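The two precursor formats defined above differ only in the order of the Z and Cys residues around the target sequence. The sketch below shows one way to classify a sequence and measure the length of the target segment (AA)n. It is illustrative only: the letter "Z" is used as a placeholder for the non-canonical amino acid (in standard one-letter code Z denotes Glx, so this is a local convention), the example sequences are hypothetical, and when a cysteine occurs downstream of Z the function reports formula (I) using the nearest one.

```python
# Illustrative classification of a precursor sequence as formula (I)
# [(AA)m-Z-(AA)n-Cys-(AA)p] or formula (II) [(AA)m-Cys-(AA)n-Z-(AA)p].
# "Z" is a placeholder for the non-canonical amino acid (local convention).

def describe_precursor(sequence: str) -> tuple[str, int]:
    """Return the formula label and the length n of the target segment."""
    z_pos = sequence.index("Z")  # raises ValueError if no Z is present
    downstream_cys = sequence.find("C", z_pos + 1)
    if downstream_cys != -1:
        return "I", downstream_cys - z_pos - 1
    upstream_cys = sequence.rfind("C", 0, z_pos)
    if upstream_cys != -1:
        return "II", z_pos - upstream_cys - 1
    raise ValueError("no cysteine available to react with Z")


# Hypothetical example sequences, not taken from Table 1
print(describe_precursor("MGZSGHPQFCT"))  # Z upstream of Cys
print(describe_precursor("MGCAAAZT"))     # Cys upstream of Z
```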
- In one embodiment of the method, Z is an amino acid of structure:
oxathiolane 2,2-dioxide; 4-fluoro-1,2-oxathiolane 2,2-dioxide; and 4,4-difluoro-1,2-oxathiolane 2,2-dioxide; where each R' and R" is independently H, an aliphatic, a substituted aliphatic, an aryl, or a substituted aryl group;
wherein Y is a linker group selected from the group consisting of aliphatic, aryl, substituted aliphatic, substituted aryl, heteroatom-containing aliphatic, heteroatom-containing aryl, substituted heteroatom-containing aliphatic, substituted heteroatom-containing aryl, alkoxy, and aryloxy groups.
- In another embodiment of the method, Z is an amino acid of structure (IV) and Y is a linker group selected from the group consisting of C1-C24 alkyl, C1-C24 substituted alkyl, C1-C24 substituted heteroatom-containing alkyl, C1-C24 substituted heteroatom-containing alkyl, C2-C24 alkenyl, C2-C24 substituted alkenyl, C2-C24 substituted heteroatom-containing alkenyl, C2-C24 substituted heteroatom-containing alkenyl, C5-C24 aryl, C5-C24 substituted aryl, C5-C24 substituted heteroatom-containing aryl, C5-C24 substituted heteroatom-containing aryl, C1-C24 alkoxy, and C5-C24 aryloxy groups.
- In another embodiment of the method, Y is a linker group selected from the group consisting of -CH2-C6H4-, -CH2-C6H4-O-, -CH2-C6H4-NH-, -(CH2)4-, -(CH2)4NH-, -(CH2)4NHC(O)-, and -(CH2)4NHC(O)O-.
- In another embodiment of the method, the amino acid Z is selected from the group consisting of 4-(2-bromoethoxy)-phenylalanine, 3-(2-bromoethoxy)-phenylalanine, 4-(2-chloroethoxy)-phenylalanine, 3-(2-chloroethoxy)-phenylalanine, 4-(1-bromoethyl)-phenylalanine, 3-(1-bromoethyl)-phenylalanine, 4-(aziridin-1-yl)-phenylalanine, 3-(aziridin-1-yl)-phenylalanine, 4-acrylamido-phenylalanine, 3-acrylamido-phenylalanine, 4-(2-fluoro-acetamido)-phenylalanine, 3-(2-fluoro-acetamido)-phenylalanine, 4-(2-chloro-acetamido)-phenylalanine, 3-(2-chloro-acetamido)-phenylalanine, 3-(2-fluoro-acetyl)-phenylalanine, 4-(2-fluoro-acetyl)-phenylalanine, Nε -((2-bromoethoxy)carbonyl)-lysine, Nε -((2-chloroethoxy)carbonyl)-lysine, Nε -(buta-2,3-dienoyl)-lysine, Nε -acryl-lysine, Nε -crotonyl-lysine, Nε -(2-fluoro-acetyl)-lysine, and Nε -(2-chloro-acetyl)-lysine.
- In another embodiment of the method, the codon encoding Z is an amber stop codon TAG, an ochre stop codon TAA, an opal stop codon TGA, or a four-base codon.
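By way of illustration only, the reassignment of a stop codon to the amino acid Z can be sketched in Python as follows. The function name `translate_with_z`, the toy gene, and the use of the letter "Z" as a one-letter symbol are assumptions made for this sketch and are not taken from the disclosure; four-base codons are not handled.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def translate_with_z(dna, z_codon="TAG"):
    """Translate dna in frame 0, reading z_codon as 'Z' instead of a stop."""
    peptide = []
    for pos in range(0, len(dna) - 2, 3):
        codon = dna[pos:pos + 3].upper()
        if codon == z_codon:
            peptide.append("Z")      # suppressed stop codon -> amino acid Z
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":           # any other stop codon ends translation
            break
        peptide.append(residue)
    return "".join(peptide)


# Hypothetical toy gene: Met-Gly-Z-Thr-Gly-Ser-Cys-Gly, then an ochre stop.
toy_gene = "ATGGGTTAGACCGGTAGCTGCGGTTAA"
precursor = translate_with_z(toy_gene)
```

In this sketch the amber codon is read as Z while the ochre codon still terminates translation; passing `z_codon="TGA"` or `z_codon="TAA"` would model the opal or ochre alternatives.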
- In another embodiment of the method, the expression system comprises:
- an aminoacyl-tRNA synthetase polypeptide or an engineered variant thereof that is at least 90% identical to SEQ ID NO:77, 78, 79, or 80; and
- a transfer RNA molecule encoded by a polynucleotide that is at least 90% identical to SEQ ID NO:101, 105, 109, 113, or 117.
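As a non-limiting sketch, a percent-identity threshold of the kind recited above could be checked as follows. The disclosure does not state here how identity is computed; this example assumes the two sequences have already been aligned to equal length and counts a gap opposite a residue as a mismatch. The function name and the two short sequences are hypothetical and do not correspond to any SEQ ID NO.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the aligned columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue                 # skip columns that are gaps in both
        compared += 1
        if x == y:
            matches += 1             # identical residues at this column
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / compared


# Hypothetical 10-residue fragments differing at one position.
reference = "MKTAYIAKQR"
variant = "MKTAYLAKQR"
identity = percent_identity(reference, variant)
meets_threshold = identity >= 90.0
```

A full comparison would first align the candidate against the reference sequence with a dedicated alignment tool; only the scoring step is shown.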
- In another embodiment of the method,
- (a) the engineered variant of the aminoacyl-tRNA synthetase polypeptide of SEQ ID NO:77 comprises an amino acid substitution at a position selected from the group consisting of position: X32, X63, X65, X70, X107, X108, X109, X155, X158, X159, X160, X161, X162, X163, X164, X167, and X286 of SEQ ID NO:77,
- (b) the engineered variant of the aminoacyl-tRNA synthetase polypeptide of SEQ ID NO:78 comprises an amino acid substitution at a position selected from the group consisting of position: X302, X305, X306, X309, X346, X348, X364, X384, X401, X405, and X417 of SEQ ID NO:78,
- (c) the engineered variant of the aminoacyl-tRNA synthetase polypeptide of SEQ ID NO:79 comprises an amino acid substitution at a position selected from the group consisting of position: X76, X266, X270, X271, X273, X274, X313, X315, and X349 of SEQ ID NO:79, or
- (d) the engineered variant of the aminoacyl-tRNA synthetase polypeptide of SEQ ID NO:80 comprises an amino acid substitution at a position selected from the group consisting of position: X37, X182, X183, X186, and X265 of SEQ ID NO:80.
- In another embodiment of the method,
- (a) the engineered variant of the aminoacyl-tRNA synthetase polypeptide of SEQ ID NO:77 comprises at least one of the features selected from the group consisting of: X32 is Tyr, Leu, Ala, Gly, Thr, His, Glu, Val, or Gln; X65 is Leu, His, Tyr, Val, Ser, Thr, Gly, or Glu; X67 is Ala or Gly; X70 is His, Ala, Cys, or Ser; X107 is Glu, Pro, Asn, or Thr; X108 is Phe, Trp, Ala, Ser, Arg, Gly, Tyr, His, Trp, or Glu; X109 is Gln, Met, Asp, Lys, Glu, Pro, His, Gly, Met, or Leu; X155 is Gln, Glu, or Gly; X158 is Asp, Gly, Glu, Ala, Pro, Thr, Ser, or Val; X159 is Ile, Cys, Pro, Leu, Ser, Trp, His, or Ala; X160 is His or Gln; X161 is Tyr or Gly; X162 is Leu, Arg, Ala, Gln, Gly, Lys, Ser, Glu, Tyr, or His; X163 is Gly or Asp; X164 is Val or Ala; X167 is Ala or Val; and X286 is Asp or Arg;
- (b) the engineered variant of the aminoacyl-tRNA synthetase polypeptide of SEQ ID NO:78 comprises at least one of the features selected from the group consisting of: X302 is Ala or Thr; X305 is Leu or Met; X306 is Tyr, Ala, Met, Ile, Leu, Thr, or Gly; X309 is Leu, Ala, Pro, Ser, or Arg; X346 is Asn, Ala, Ser, or Val; X348 is Cys, Ala, Thr, Leu, Lys, Met, or Trp; X364 is Thr or Lys; X384 is Tyr or Phe; X405 is Ile or Arg; X401 is Val or Leu; and X417 is Trp, Thr, or Leu;
- (c) the engineered variant of the aminoacyl-tRNA synthetase polypeptide of SEQ ID NO:79 comprises at least one of the features selected from the group consisting of: X76 is Asp or Gly; X266 is Leu, Val, or Met; X270 is Leu or Ile; X271 is Tyr, Phe, Leu, Met, or Ala; X274 is Leu, Ala, Met, or Gly; X313 is Cys, Phe, Ala, Val, or Ile; X315 is Met or Phe; and X349 is Tyr, Phe, or Trp; or
- (d) the engineered variant of the aminoacyl-tRNA synthetase polypeptide of SEQ ID NO:80 comprises at least one of the features selected from the group consisting of: X37 is Tyr, Ile, Gly, Val, Leu, Thr, or Ser; X182 is Asp, Gly, Ser, or Thr; X183 is Phe, Met, Tyr, or Ala; X186 is Leu, Ala, Met, or Val; and X265 is Asp or Arg.
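For illustration, the position-specific features of item (d) can be held in a lookup table and a candidate variant checked against it. The table below copies the residues listed for SEQ ID NO:80 in item (d); the function name and the example variant are hypothetical. The check tests membership only: it does not verify that a residue differs from the parent sequence, which would require the actual sequence of SEQ ID NO:80.

```python
# Residues listed in item (d) for each position of SEQ ID NO:80.
ALLOWED_SEQ80 = {
    37: {"Tyr", "Ile", "Gly", "Val", "Leu", "Thr", "Ser"},
    182: {"Asp", "Gly", "Ser", "Thr"},
    183: {"Phe", "Met", "Tyr", "Ala"},
    186: {"Leu", "Ala", "Met", "Val"},
    265: {"Asp", "Arg"},
}


def listed_features(variant_residues, allowed=ALLOWED_SEQ80):
    """Return the sorted positions whose residue is among the listed options."""
    return sorted(pos for pos, res in variant_residues.items()
                  if res in allowed.get(pos, set()))


# Hypothetical variant described as {position: three-letter residue}.
hypothetical_variant = {37: "Gly", 182: "Ser", 183: "Trp", 265: "Arg"}
hits = listed_features(hypothetical_variant)
has_at_least_one = len(hits) >= 1   # "at least one of the features"
```

Analogous tables could be written for items (a) to (c) from the residues listed there.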
- In another embodiment of the method, the expression system comprises:
- an aminoacyl-tRNA synthetase selected from the group consisting of SEQ ID NOs. 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99 and 100; and
- a transfer RNA molecule encoded by a polynucleotide selected from the group consisting of SEQ ID NO:101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, and 120.
- In another embodiment of the method, the N-terminal tail polypeptide, (AA)m, or the C-terminal tail polypeptide, (AA)p, or both, of the precursor polypeptides of formula (I) or (II) comprise(s):
- a polypeptide affinity tag, a DNA-binding polypeptide, a protein-binding polypeptide, an enzyme, a fluorescent protein, an intein protein, or
- a combination thereof.
- In another embodiment of the method, the polypeptide comprised within the N-terminal tail polypeptide, (AA)m, or the C-terminal tail polypeptide, (AA)p, or both, of the precursor polypeptides of formula (I) and (II) is a polypeptide selected from the group of polypeptides consisting of SEQ ID NOs
- In another embodiment of the method, the intein polypeptide comprised within the N-terminal tail polypeptide, (AA)m, or the C-terminal tail polypeptide, (AA)p, or both, of the precursor polypeptides of formula (I) or (II) is selected from the group consisting of a naturally occurring intein, an engineered variant of a naturally occurring intein, a fusion of the N-terminal and C-terminal fragments of a naturally occurring split intein and a fusion of the N-terminal and C-terminal fragments of an engineered split intein.
- In another embodiment of the method, the intein is selected from the group consisting of Mxe GyrA (SEQ ID NO:1), eDnaB (SEQ ID NO:2), Hsp-NRC1 CDC21 (SEQ ID NO:3), Ceu ClpP (SEQ ID NO:4), Tag Pol-1 (SEQ ID NO:5), Tfu Pol-1 (SEQ ID NO:6), Tko Pol-1 (SEQ ID NO:7), Psp-GBD Pol (SEQ ID NO:8), Tag Pol-2 (SEQ ID NO:9), Thy Pol-1 (SEQ ID NO:10), Tko Pol-2 (SEQ ID NO:11), Tli Pol-1 (SEQ ID NO:12), Tma Pol (SEQ ID NO:13), Tsp-GE8 Pol-1 (SEQ ID NO:14), Tthi Pol (SEQ ID NO:15), Tag Pol-3 (SEQ ID NO:16), Tfu Pol-2 (SEQ ID NO:17), Thy Pol-2 (SEQ ID NO:18), Tli Pol-2 (SEQ ID NO:19), Tsp-GE8 Pol-2 (SEQ ID NO:20), Pab Pol-II (SEQ ID NO:21), Mtu-CDC1551 DnaB (SEQ ID NO:22), Mtu-H37Rv DnaB (SEQ ID NO:23), Rma DnaB (SEQ ID NO:24), Ter DnaE-1 (SEQ ID NO:25), Ssp GyrB (SEQ ID NO:26), Mfl GyrA (SEQ ID NO:27), Mgo GyrA (SEQ ID NO:28), Mkas GyrA (SEQ ID NO:29), Mle-TN GyrA (SEQ ID NO:30), Mma GyrA (SEQ ID NO:31), Ssp DnaX (SEQ ID NO:32), Pab Lon (SEQ ID NO:33), Mja PEP (SEQ ID NO:34), Afu-FRR0163 PRP8 (SEQ ID NO:35), Ani-FGSCA4 PRP8 (SEQ ID NO:36), Cne-A PRP8 (SEQ ID NO:37), Hca PRP8 (SEQ ID NO:38), Pch PRP8 (SEQ ID NO:39), Pex PRP8 (SEQ ID NO:40), Pvu PRP8 (SEQ ID NO:41), Mtu-H37Rv RecA (SEQ ID NO:42), Mtu-So93 RecA (SEQ ID NO:43), Mfl RecA (SEQ ID NO:44), Mle-TN RecA (SEQ ID NO:45), Nsp-PCC7120 RIR1 (SEQ ID NO:46), Ter RIR1-1 (SEQ ID NO:47), Pab RIR1-1 (SEQ ID NO:48), Pfu RIR1-1 (SEQ ID NO:49), Chy RIR1 (SEQ ID NO:50), Mth RIR1 (SEQ ID NO:51), Pab RIR1-3 (SEQ ID NO:52), Pfu RIR1-2 (SEQ ID NO:53), Ter RIR1-2 (SEQ ID NO:54), Ter RIR1-4 (SEQ ID NO:55), CIV RIR1 (SEQ ID NO:56), Ctr VMA (SEQ ID NO:57), Sce VMA (SEQ ID NO:58), Tac-ATCC25905 VMA (SEQ ID NO:59), Ssp DnaB (SEQ ID NO:60),
- engineered variant(s) thereof, and
- engineered variant(s) thereof wherein the N-terminal cysteine or serine residue of the engineered variant is mutated to any natural (or naturally occurring) amino acid residue other than cysteine or serine, or wherein the C-terminal asparagine residue of the engineered variant is mutated to any natural (or naturally occurring) amino acid residue other than asparagine.
- In another embodiment of the method, the intein is a fusion product of a split intein selected from the group consisting of Ssp DnaE (SEQ ID NO:61- SEQ ID NO:62), Neq Pol (SEQ ID NO:63- SEQ ID NO:64), Asp DnaE (SEQ ID NO:65- SEQ ID NO:66), Npu-PCC73102 DnaE (SEQ ID NO:67- SEQ ID NO:68), Nsp-PCC7120 DnaE (SEQ ID NO:69- SEQ ID NO:70), Oli DnaE (SEQ ID NO:71-SEQ ID NO:72), Ssp-PCC7002 DnaE (SEQ ID NO:73-SEQ ID NO:74), Tvu DnaE (SEQ ID NO:75- SEQ ID NO:76),
- engineered variant(s) thereof, and
- engineered variant(s) thereof wherein the N-terminal cysteine or serine residue of the split intein N-domain of the engineered variant is mutated to any of the natural (or naturally occurring) amino acid residues other than cysteine or serine, or wherein the C-terminal asparagine residue of the split intein C-domain of the engineered variant is mutated to any of the natural (or naturally occurring) amino acid residues other than asparagine.
- In another embodiment of the method,
- the N-terminal tail polypeptide, (AA)m, of the precursor polypeptide of formula (I) or (II) comprises the C-domain of a split intein, and
- the C-terminal tail polypeptide, (AA)p, comprises the corresponding N-domain of the split intein.
- In another embodiment of the method, the split intein C-domain is selected from the group consisting of Ssp DnaE-c (SEQ ID NO:62), Neq Pol-c (SEQ ID NO:64), Asp DnaE-c (SEQ ID NO:66), Npu-PCC73102 DnaE-c (SEQ ID NO:68), Nsp-PCC7120 DnaE-c (SEQ ID NO:70), Oli DnaE-c (SEQ ID NO:72), Ssp-PCC7002 DnaE-c (SEQ ID NO:74), Tvu DnaE-c (SEQ ID NO:76), and engineered variant(s) thereof; and the split intein N-domain is selected from the group consisting of Ssp DnaE-n (SEQ ID NO:61), Neq Pol-n (SEQ ID NO:63), Asp DnaE-n (SEQ ID NO:65), Npu-PCC73102 DnaE-n (SEQ ID NO:67), Nsp-PCC7120 DnaE-n (SEQ ID NO:69), Oli DnaE-n (SEQ ID NO:71), Ssp-PCC7002 DnaE-n (SEQ ID NO:73), Tvu DnaE-n (SEQ ID NO:75), and engineered variant(s) thereof.
- In another embodiment of the method, the expression system is selected from the group consisting of a prokaryotic cell, a eukaryotic cell, and a cell-free expression system.
- In another embodiment of the method, the prokaryotic cell is Escherichia coli.
- In another embodiment of the method, the eukaryotic cell is a yeast, a mammalian, an insect or a plant cell.
- In another embodiment of the method, any of polypeptides (AA)n, (AA)o, (AA)m, or (AA)p, is fully or partially genetically randomized so that a plurality of macrocyclic peptides is obtained upon a thioether bond-forming reaction between the cysteine (Cys) residue and the side-chain functional group FGi in Z.
- In another embodiment of the method, the method comprises fully or partially randomizing any of polypeptides (AA)n, (AA)m, or (AA)p, wherein, upon a thioether bond-forming reaction between the cysteine (Cys) residue and the side-chain functional group FGi in Z, a plurality of macrocyclic peptides is produced.
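As a minimal sketch of genetic randomization, the following Python example builds library members for a formula (I)-style gene in which the target sequence (AA)n is randomized. The use of NNK degenerate codons, the short tail sequences, and all names below are assumptions chosen for illustration; the disclosure does not specify a randomization scheme at this point.

```python
import itertools
import random

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

# NNK degenerate codon: N is any base, K is G or T (32 codons in total).
NNK_CODONS = ["".join(c) for c in itertools.product("ACGT", "ACGT", "GT")]


def random_library_member(n_random, rng, n_tail="ATG", c_tail="GGTTAA"):
    """Gene layout: N-tail, amber codon for Z, n NNK codons, Cys codon, C-tail."""
    target = "".join(rng.choice(NNK_CODONS) for _ in range(n_random))
    return n_tail + "TAG" + target + "TGC" + c_tail


# Residues reachable through NNK ("*" marks the amber codon TAG).
encoded = {CODON_TABLE[c] for c in NNK_CODONS}
# Number of distinct DNA sequences for five randomized positions.
dna_diversity = len(NNK_CODONS) ** 5

rng = random.Random(0)               # fixed seed so the sketch is repeatable
member = random_library_member(5, rng)
```

Note that the NNK set contains TAG, so in an amber-suppressing expression system a randomized position could itself encode Z; a different degenerate scheme could be chosen if that is undesired.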
- Artificial, engineered and recombinant nucleic acid molecules and peptide sequences (or amino acid sequences) for use in this method are also provided.
- A recombinant host cell is provided comprising an artificial nucleic acid encoding a polypeptide of structure:
(AA)m-Z-(AA)n-Cys-(AA)p (I)
or
(AA)m-Cys-(AA)n-Z-(AA)p (II)
wherein:
- i. (AA)m is an N-terminal amino acid or peptide sequence,
- ii. Z is an amino acid of structure:
- wherein Y is a linker group selected from the group consisting of aliphatic, aryl, substituted aliphatic, substituted aryl, heteroatom-containing aliphatic, heteroatom-containing aryl, substituted heteroatom-containing aliphatic, substituted heteroatom-containing aryl, alkoxy, and aryloxy groups, and each R' and R" is independently H, an aliphatic, a substituted aliphatic, an aryl, or a substituted aryl group; and
- wherein Y2, Y3, L are linker groups selected from the group consisting of aliphatic, aryl, substituted aliphatic, substituted aryl, heteroatom-containing aliphatic, heteroatom-containing aryl, substituted heteroatom-containing aliphatic, substituted heteroatom-containing aryl, alkoxy, and aryloxy groups,
- iv. (AA)n is a target peptide sequence,
- v. (AA)o is a second target peptide sequence,
- vi. (AA)p is a C-terminal amino acid or peptide sequence.
- In one embodiment of the cell, the amino acid Z is selected from the group consisting of 4-(2-bromoethoxy)-phenylalanine, 3-(2-bromoethoxy)-phenylalanine, 4-(2-chloroethoxy)-phenylalanine, 3-(2-chloroethoxy)-phenylalanine, 4-(1-bromoethyl)-phenylalanine, 3-(1-bromoethyl)-phenylalanine, 4-(aziridin-1-yl)-phenylalanine, 3-(aziridin-1-yl)-phenylalanine, 4-acrylamido-phenylalanine, 3-acrylamido-phenylalanine, 4-(2-fluoro-acetamido)-phenylalanine, 3-(2-fluoro-acetamido)-phenylalanine, 4-(2-chloro-acetamido)-phenylalanine, 3-(2-chloro-acetamido)-phenylalanine, 3-(2-fluoro-acetyl)-phenylalanine, 4-(2-fluoro-acetyl)-phenylalanine, Nε -((2-bromoethoxy)carbonyl)-lysine, Nε -((2-chloroethoxy)carbonyl)-lysine, Nε -(buta-2,3-dienoyl)-lysine, Nε -acryl-lysine, Nε -crotonyl-lysine, Nε -(2-fluoro-acetyl)-lysine, and Nε -(2-chloro-acetyl)-lysine.
- In another embodiment of the cell, the polypeptide comprised within the N-terminal tail polypeptide, (AA)m, or the C-terminal tail polypeptide, (AA)p, or both, of the precursor polypeptides of formula (I) and (II) is a polypeptide selected from the group of polypeptides consisting of SEQ ID NOs
- In another embodiment of the cell, the cell comprises a macrocyclic peptide produced by a thioether bond-forming reaction between the cysteine (Cys) residue and the FGi functional group in the amino acid Z.
- In another embodiment of the cell, the N-terminal tail polypeptide, (AA)m, or the C-terminal tail polypeptide, (AA)p, or both, in the precursor polypeptides of formula (I) or formula (II) comprise(s) an intein selected from the group consisting of a naturally occurring intein, an engineered variant of a naturally occurring intein, a fusion of the N-terminal and C-terminal fragments of a naturally occurring split intein and a fusion of the N-terminal and C-terminal fragments of an engineered split intein.
- In another embodiment of the cell, the intein is selected from the group consisting of Mxe GyrA (SEQ ID NO:1), eDnaB (SEQ ID NO:2), Hsp-NRC1 CDC21 (SEQ ID NO:3), Ceu ClpP (SEQ ID NO:4), Tag Pol-1 (SEQ ID NO:5), Tfu Pol-1 (SEQ ID NO:6), Tko Pol-1 (SEQ ID NO:7), Psp-GBD Pol (SEQ ID NO:8), Tag Pol-2 (SEQ ID NO:9), Thy Pol-1 (SEQ ID NO:10), Tko Pol-2 (SEQ ID NO:11), Tli Pol-1 (SEQ ID NO:12), Tma Pol (SEQ ID NO:13), Tsp-GE8 Pol-1 (SEQ ID NO:14), Tthi Pol (SEQ ID NO:15), Tag Pol-3 (SEQ ID NO:16), Tfu Pol-2 (SEQ ID NO:17), Thy Pol-2 (SEQ ID NO:18), Tli Pol-2 (SEQ ID NO:19), Tsp-GE8 Pol-2 (SEQ ID NO:20), Pab Pol-II (SEQ ID NO:21), Mtu-CDC1551 DnaB (SEQ ID NO:22), Mtu-H37Rv DnaB (SEQ ID NO:23), Rma DnaB (SEQ ID NO:24), Ter DnaE-1 (SEQ ID NO:25), Ssp GyrB (SEQ ID NO:26), Mfl GyrA (SEQ ID NO:27), Mgo GyrA (SEQ ID NO:28), Mkas GyrA (SEQ ID NO:29), Mle-TN GyrA (SEQ ID NO:30), Mma GyrA (SEQ ID NO:31), Ssp DnaX (SEQ ID NO:32), Pab Lon (SEQ ID NO:33), Mja PEP (SEQ ID NO:34), Afu-FRR0163 PRP8 (SEQ ID NO:35), Ani-FGSCA4 PRP8 (SEQ ID NO:36), Cne-A PRP8 (SEQ ID NO:37), Hca PRP8 (SEQ ID NO:38), Pch PRP8 (SEQ ID NO:39), Pex PRP8 (SEQ ID NO:40), Pvu PRP8 (SEQ ID NO:41), Mtu-H37Rv RecA (SEQ ID NO:42), Mtu-So93 RecA (SEQ ID NO:43), Mfl RecA (SEQ ID NO:44), Mle-TN RecA (SEQ ID NO:45), Nsp-PCC7120 RIR1 (SEQ ID NO:46), Ter RIR1-1 (SEQ ID NO:47), Pab RIR1-1 (SEQ ID NO:48), Pfu RIR1-1 (SEQ ID NO:49), Chy RIR1 (SEQ ID NO:50), Mth RIR1 (SEQ ID NO:51), Pab RIR1-3 (SEQ ID NO:52), Pfu RIR1-2 (SEQ ID NO:53), Ter RIR1-2 (SEQ ID NO:54), Ter RIR1-4 (SEQ ID NO:55), CIV RIR1 (SEQ ID NO:56), Ctr VMA (SEQ ID NO:57), Sce VMA (SEQ ID NO:58), Tac-ATCC25905 VMA (SEQ ID NO:59), Ssp DnaB (SEQ ID NO:60),
- engineered variant(s) thereof, and
- engineered variant(s) thereof wherein the N-terminal cysteine or serine residue of the engineered variant is mutated to any natural (or naturally occurring) amino acid residue other than cysteine or serine, or wherein the C-terminal asparagine residue of the engineered variant is mutated to any natural (or naturally occurring) amino acid residue other than asparagine.
- In another embodiment of the cell, the intein is a fusion product of a split intein selected from the group consisting of Ssp DnaE (SEQ ID NO:61- SEQ ID NO:62), Neq Pol (SEQ ID NO:63- SEQ ID NO:64), Asp DnaE (SEQ ID NO:65- SEQ ID NO:66), Npu-PCC73102 DnaE (SEQ ID NO:67- SEQ ID NO:68), Nsp-PCC7120 DnaE (SEQ ID NO:69- SEQ ID NO:70), Oli DnaE (SEQ ID NO:71-SEQ ID NO:72), Ssp-PCC7002 DnaE (SEQ ID NO:73- SEQ ID NO:74), Tvu DnaE (SEQ ID NO:75- SEQ ID NO:76),
- engineered variant(s) thereof, and
- engineered variant(s) thereof, wherein the N-terminal cysteine or serine residue of the split intein N-domain of the engineered variant is mutated to any natural (or naturally occurring) amino acid residue other than cysteine or serine, or wherein the C-terminal asparagine residue of the split intein C-domain of the engineered variant is mutated to any natural (or naturally occurring) amino acid residue other than asparagine.
- In another embodiment of the cell, the cell comprises a macrocyclic peptide produced by a thioether bond-forming reaction between the cysteine (Cys) residue and the FGi functional group in the amino acid Z, and an intein-catalyzed N-terminal splicing, C-terminal splicing, or self-splicing reaction.
- In another embodiment of the cell, the N-terminal tail polypeptide, (AA)m, comprises the C-domain of a naturally occurring split intein, or of an engineered variant thereof, and the C-terminal tail polypeptide, (AA)p, comprises the N-domain of said split intein.
- In another embodiment of the cell, the split intein C-domain is selected from the group consisting of Ssp DnaE-c (SEQ ID NO:62), Neq Pol-c (SEQ ID NO:64), Asp DnaE-c (SEQ ID NO:66), Npu-PCC73102 DnaE-c (SEQ ID NO:68), Nsp-PCC7120 DnaE-c (SEQ ID NO:70), Oli DnaE-c (SEQ ID NO:72), Ssp-PCC7002 DnaE-c (SEQ ID NO:74), Tvu DnaE-c (SEQ ID NO:76), and engineered variant(s) thereof; and the split intein N-domain is selected from the group consisting of Ssp DnaE-n (SEQ ID NO:61), Neq Pol-n (SEQ ID NO:63), Asp DnaE-n (SEQ ID NO:65), Npu-PCC73102 DnaE-n (SEQ ID NO:67), Nsp-PCC7120 DnaE-n (SEQ ID NO:69), Oli DnaE-n (SEQ ID NO:71), Ssp-PCC7002 DnaE-n (SEQ ID NO:73), Tvu DnaE-n (SEQ ID NO:75), and engineered variant(s) thereof.
- In another embodiment of the cell, the cell comprises a polycyclic peptide produced by a thioether bond-forming reaction between the cysteine (Cys) residue and the FGi functional group in the amino acid Z, and a split intein-catalyzed trans-splicing reaction.
- Embodiments are described herein with reference to the accompanying drawings, in which similar reference characters denote similar elements throughout the several views. It is to be understood that in some instances, various aspects of the embodiments may be shown exaggerated or enlarged to facilitate an understanding of the invention.
FIGS. 1A-B. Schematic representation of two general methods for making macrocyclic peptides from ribosomally produced precursor polypeptides of general formula (I) (panel A) or general formula (II) (panel B). W corresponds to the linker group resulting from the bond-forming reaction between the functional group FGi and the cysteine residue.
FIGS. 2A-B. Schematic representation of a variation of the general methods of FIGS. 1A-B, wherein an intein protein is comprised within the C-terminal tail of a precursor polypeptide of general formula (I) (panel A) or of general formula (II) (panel B). W corresponds to the linker group resulting from the bond-forming reaction between the functional group FGi and the cysteine residue.
FIGS. 3A-B. Schematic representation of another variation of the general methods of FIGS. 1A-B, wherein an intein protein is comprised within the N-terminal tail of a precursor polypeptide of general formula (I) (panel A) or of general formula (II) (panel B). W corresponds to the linker group resulting from the bond-forming reaction between the functional group FGi and the cysteine residue.
FIGS. 4A-B. Schematic representation of another variation of the general methods of FIGS. 1A-B, wherein the C- and N-domains of a split intein are comprised within the N-terminal tail and C-terminal tail, respectively, of a precursor polypeptide of general formula (I) (panel A) or of general formula (II) (panel B). W corresponds to the linker group resulting from the bond-forming reaction between the functional group FGi and the cysteine residue.
FIG. 5. Synthetic routes for the synthesis of the cysteine-reactive unnatural amino acids p-2beF, 2becK, and p-1beF.
FIG. 6. Synthetic routes for the synthesis of the cysteine-reactive unnatural amino acids 2cecK, bdnK, and OdbpY.
FIGS. 7A-B. Fluorescence-based assay for screening of AARS/tRNA pairs. The graphs indicate the relative efficiency of incorporation of the unnatural amino acid p-2beF (A) and 2becK (B) into the reporter protein YFP(TAG) by different amber stop codon suppressor AARS/tRNA pairs.
FIG. 8. Strategy for ribosomal synthesis of thioether-bridged macrocyclic peptides via p-2beF-mediated cyclization. The linear precursor polypeptide comprises an N-terminal tail (N-term), the unnatural amino acid p-2beF, a variable target sequence containing the reactive cysteine (black circle) and GyrA intein. Depending on the nature of the 'I-1' residue, the macrocyclic peptide can be released in vitro via thiol-induced intein splicing (path A) or directly in vivo (path B).
FIGS. 9A-B. Dependence of macrocyclization efficiency on relative position of the Cys residue with respect to the unnatural amino acid 'Z'. (A) Percentage of cyclization for the different p-2beF-containing constructs as determined by LCMS after in vitro splicing of the GyrA intein. (B) Percentage of cyclization for the different 2becK- and 2cecK-containing constructs as determined by LCMS after in vitro splicing of the GyrA intein. In each case, proteins were isolated after expression in E. coli for 12 hours at 27°C (see Examples for details).
FIGS. 10-15. Representative examples of macrocyclic peptides produced from p-2beF-containing precursor polypeptides according to the methods disclosed herein. Each figure describes the sequence of the precursor polypeptide, the chemical structure of the macrocyclic peptide product, and the MS/MS spectrum and LC-MS extracted-ion chromatogram (inset) of the macrocyclic peptide.
FIG. 16. Representative MS/MS spectrum corresponding to the macrocyclic peptide obtained from construct 12mer-Z6C(2-beF). The assignment of the a and b fragments is indicated.
FIGS. 17a-d. Deconvoluted LC-MS mass spectra of proteins isolated after benzyl mercaptan-induced splicing of purified construct (a) 12mer-Z1C, (b) 12mer-Z4C, (c) 10mer-C6Z, and (d) 10mer-C8Z.
FIGS. 18-24. Representative examples of macrocyclic peptides produced from 2becK-, 2cecK-, p-1beF-, and bdnK-containing precursor polypeptides according to the methods disclosed herein. Each figure describes the sequence of the precursor polypeptide, the chemical structure of the macrocyclic peptide product, and the MS/MS spectrum and LC-MS extracted-ion chromatogram (inset) of the macrocyclic peptide.
FIGS. 25-27. Macrocyclic peptides isolated via streptavidin-affinity chromatography from bacterial lysate. Each figure describes the sequence of the precursor polypeptide, the chemical structure of the macrocyclic peptide product, and the MS/MS spectrum and LC-MS extracted-ion chromatogram (inset) of the macrocyclic peptide.
FIGS. 28-32. Bicyclic peptides isolated via streptavidin-affinity chromatography from bacterial lysate. Each figure describes the sequence of the precursor polypeptide, the chemical structure of the bicyclic peptide product, and the MS/MS spectrum and LC-MS extracted-ion chromatogram (inset) of the bicyclic peptide.
FIGS. 33a-d. Deconvoluted LC-MS mass spectra of proteins isolated from the cell lysate using Ni-NTA beads: (a) Strep1-Z5C(p-2beF) construct, (b) Strep2-Z7C(p-2beF) construct; and using chitin beads: (c) cStrep3(C)-Z3C(p-2beF) construct, (d) cStrep3(S)-Z3C(p-2beF) construct.
FIGS. 34-35. Representative examples of macrocyclic peptides produced from p-2beF-containing precursor polypeptides of general formula (II). Each figure describes the sequence of the precursor polypeptide, the chemical structure of the macrocyclic peptide product, and the MS/MS spectrum and LC-MS extracted-ion chromatogram (inset) of the macrocyclic peptide.
FIG. 36. Representative example of a polycyclic peptide produced from a precursor polypeptide containing two Cys/Z pairs, where Z is p-2beF. The figure describes the sequence of the precursor polypeptide, the chemical structure of the polycyclic peptide product, and the MS/MS spectrum and LC-MS extracted-ion chromatogram (inset) of the macrocyclic peptide.
FIGS. 37A-B. Not according to the invention and for illustration purposes only, schematic representation of the general methods for making polycyclic peptides from ribosomally produced precursor polypeptides of general formula (V) containing a bifunctional cysteine-reactive amino acid (Z2) of general formula (VI) (panel A) or (VII) (panel B). W1 and W2 correspond to the linker groups resulting from the bond-forming reaction between the cysteine residues and functional group FGi and FG2, respectively.
FIG. 38. Representative example of a polycyclic peptide produced from a precursor polypeptide containing two cysteines and a bifunctional cysteine-reactive amino acid (OdbpY). The figure describes the sequence of the precursor polypeptide, the chemical structure of the polycyclic peptide product, and the MS/MS spectrum and LC-MS extracted-ion chromatogram (inset) of the macrocyclic peptide.
FIGS. 39A-B. Competitive binding assay for measuring streptavidin binding affinity of HPQ-containing cyclic and bicyclic peptides. (A) Schematic illustration of the in-solution inhibition assay. IC50 values are obtained from the dose-dependent decrease in horseradish peroxidase (HRP) activity at increasing concentration of the cyclic or bicyclic streptavidin-binding peptide. (B) Inhibition curve.
- For clarity of disclosure, and not by way of limitation, the detailed description is divided into the subsections set forth below.
- Unless defined otherwise herein, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the disclosure pertains.
- The singular forms "a," "an," and "the" used herein include plural referents unless the content clearly dictates otherwise.
- The term "plurality" refers to two or more referents unless the content clearly dictates otherwise. The term "at least one" refers to one or more referents.
- The term "functional group" as used herein refers to a contiguous group of atoms that, together, may undergo a chemical reaction under certain reaction conditions. Examples of functional groups are, among many others, -OH, -NH2, -SH, -(C=O)-, -N3, -C≡CH.
- The term "aliphatic" or "aliphatic group" as used herein means a straight or branched C1-15 hydrocarbon chain that is completely saturated or that contains at least one unit of unsaturation, or a monocyclic C3-8 hydrocarbon, or bicyclic C8-12 hydrocarbon that is completely saturated or that contains at least one unit of unsaturation, but which is not aromatic (also referred to herein as "cycloalkyl"). For example, suitable aliphatic groups include, but are not limited to, linear or branched alkyl, alkenyl, alkynyl groups or hybrids thereof such as (cycloalkyl)alkyl, (cycloalkenyl)alkyl, or (cycloalkynyl)alkyl. The alkyl, alkenyl, or alkynyl group may be linear, branched, or cyclic and may contain up to 15, up to 8, or up to 5 carbon atoms. Alkyl groups include, but are not limited to, methyl, ethyl, propyl, cyclopropyl, butyl, cyclobutyl, pentyl, and cyclopentyl groups. Alkenyl groups include, but are not limited to, propenyl, butenyl, and pentenyl groups. Alkynyl groups include, but are not limited to, propynyl, butynyl, and pentynyl groups.
- The terms "aryl" and "aryl group" as used herein refer to an aromatic substituent containing a single aromatic ring or multiple aromatic rings that are fused together, directly linked, or indirectly linked (such as linked through a methylene or an ethylene moiety). An aryl group may contain from 5 to 24 carbon atoms, 5 to 18 carbon atoms, or 5 to 14 carbon atoms.
- The term "heteroatom" means nitrogen, oxygen, or sulphur, and includes, but is not limited to, any oxidized forms of nitrogen and sulphur, and the quaternized form of any basic nitrogen. Heteroatom further includes, but is not limited to, Se, Si, or P.
- The term "heteroaryl" as used herein refers to an aryl group in which at least one carbon atom is replaced with a heteroatom. In various embodiments, a heteroaryl group is a 5- to 18-membered, a 5- to 14-membered, or a 5- to 10-membered aromatic ring system containing at least one heteroatom selected from the group consisting of oxygen, sulphur, and nitrogen atoms. Heteroaryl groups include, but are not limited to, pyridyl, pyrrolyl, furyl, thienyl, indolyl, isoindolyl, indolizinyl, imidazolyl, pyridonyl, pyrimidyl, pyrazinyl, oxazolyl, thiazolyl, purinyl, quinolinyl, isoquinolinyl, benzofuranyl, and benzoxazolyl groups.
- A heterocyclic group may be any monocyclic or polycyclic ring system which contains at least one heteroatom and may be unsaturated or partially or fully saturated. The term "heterocyclic" thus includes, but is not limited to, heteroaryl groups as defined above as well as non-aromatic heterocyclic groups. In various embodiments, a heterocyclic group is a 3- to 18-membered, a 3- to 14-membered, or a 3- to 10-membered ring system containing at least one heteroatom selected from the group consisting of oxygen, sulphur, and nitrogen atoms. Heterocyclic groups include, but are not limited to, the specific heteroaryl groups listed above as well as pyranyl, piperidinyl, pyrrolidinyl, dioxanyl, piperazinyl, morpholinyl, thiomorpholinyl, morpholinosulfonyl, tetrahydroisoquinolinyl, and tetrahydrofuranyl groups.
- A halogen atom may be a fluorine, chlorine, bromine, or iodine atom.
- By "optionally substituted", it is intended that in any of the chemical groups listed above (e.g., alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, aryl, heteroaryl, heterocyclic, triazolyl groups), at least one of the hydrogen atoms is optionally replaced with an atom or chemical group other than hydrogen. Specific examples of such substituents include, but are not limited to, halogen atoms, hydroxyl (-OH), sulfhydryl (-SH), substituted sulfhydryl, carbonyl (-CO-), carboxy (-COOH), amino (-NH2), nitro (-NO2), sulfo (-SO2-OH), cyano (-C≡N), thiocyanato (-S-C≡N), phosphono (-P(O)(OH)2), alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, aryl, heteroaryl, heterocyclic, alkylthiol, alkyloxy, alkylamino, arylthiol, aryloxy, or arylamino groups. Where "optionally substituted" modifies a series of groups separated by commas (e.g., "optionally substituted A, B, or C"; or "A, B, or C optionally substituted with"), it is intended that each of the groups (e.g., A, B, or C) is optionally substituted.
- The term "heteroatom-containing aliphatic" as used herein refers to an aliphatic moiety where at least one carbon atom is replaced with a heteroatom, e.g., oxygen, nitrogen, sulphur, selenium, phosphorus, or silicon, and typically oxygen, nitrogen, or sulphur.
- The terms "alkyl" and "alkyl group" as used herein refer to a linear, branched, or cyclic saturated hydrocarbon typically containing 1 to 24 carbon atoms, or 1 to 12 carbon atoms, such as methyl, ethyl, n-propyl, isopropyl, n-butyl, isobutyl, t-butyl, octyl, decyl and the like.
- The term "heteroatom-containing alkyl" as used herein refers to an alkyl moiety where at least one carbon atom is replaced with a heteroatom, e.g., oxygen, nitrogen, sulphur, phosphorus, or silicon, and typically oxygen, nitrogen, or sulphur.
- The terms "alkenyl" and "alkenyl group" as used herein refer to a linear, branched, or cyclic hydrocarbon group of 2 to 24 carbon atoms, or of 2 to 12 carbon atoms, containing at least one double bond, such as ethenyl, n-propenyl, isopropenyl, n-butenyl, isobutenyl, octenyl, decenyl, and the like.
- The term "heteroatom-containing alkenyl" as used herein refers to an alkenyl moiety where at least one carbon atom is replaced with a heteroatom.
- The terms "alkynyl" and "alkynyl group" as used herein refer to a linear, branched, or cyclic hydrocarbon group of 2 to 24 carbon atoms, or of 2 to 12 carbon atoms, containing at least one triple bond, such as ethynyl, n-propynyl, and the like.
- The term "heteroatom-containing alkynyl" as used herein refers to an alkynyl moiety where at least one carbon atom is replaced with a heteroatom.
- The term "heteroatom-containing aryl" as used herein refers to an aryl moiety where at least one carbon atom is replaced with a heteroatom.
- The terms "alkoxy" and "alkoxy group" as used herein refer to an aliphatic group or a heteroatom-containing aliphatic group bound through a single, terminal ether linkage. In various embodiments, alkoxy groups contain 1 to 24 carbon atoms, or contain 1 to 14 carbon atoms.
- The terms "aryloxy" and "aryloxy group" as used herein refer to an aryl group or a heteroatom-containing aryl group bound through a single, terminal ether linkage. In various embodiments, aryloxy groups contain 5 to 24 carbon atoms, or contain 5 to 14 carbon atoms.
- The term "substituent" refers to a contiguous group of atoms. Examples of "substituents" include, but are not limited to: alkoxy, aryloxy, alkyl, heteroatom-containing alkyl, alkenyl, heteroatom-containing alkenyl, alkynyl, heteroatom-containing alkynyl, aryl, heteroatom-containing aryl, alkoxy, heteroatom-containing alkoxy, aryloxy, heteroatom-containing aryloxy, halo, hydroxyl (-OH), sulfhydryl (-SH), substituted sulfhydryl, carbonyl (-CO-), thiocarbonyl, (-CS-), carboxy (-COOH), amino (-NH2), substituted amino, nitro (-NO2), nitroso (-NO), sulfo (-SO2-OH), cyano (-C≡N), cyanato (-O-C≡N), thiocyanato (-S-C≡N), formyl (-CO-H), thioformyl (-CS-H), phosphono (-P(O)OH2), substituted phosphono, and phospho (-PO2).
- The term "contact" as used herein with reference to interactions of chemical units indicates that the chemical units are at a distance that allows short range non-covalent interactions (such as Van der Waals forces, hydrogen bonding, hydrophobic interactions, electrostatic interactions, dipole-dipole interactions) to dominate the interaction of the chemical units. For example, when a protein is 'contacted' with a chemical species, the protein is allowed to interact with the chemical species so that a reaction between the protein and the chemical species can occur.
- The term "bioorthogonal" as used herein with reference to a reaction, reagent, or functional group, indicates that such reaction, reagent, or functional group does not exhibit significant or detectable reactivity towards biological molecules such as those present in a bacterial, yeast or mammalian cell. The biological molecules can be, e.g., proteins, nucleic acids, fatty acids, or cellular metabolites.
- In general, the term "mutant" or "variant" as used herein with reference to a molecule such as polynucleotide or polypeptide, indicates that such molecule has been mutated from the molecule as it exists in nature. In particular, the term "mutate" and "mutation" as used herein indicates any modification of a nucleic acid and/or polypeptide which results in an altered nucleic acid or polypeptide. Mutations include, but are not limited to, any process or mechanism resulting in a mutant protein, enzyme, polynucleotide, or gene. A mutation can occur in a polynucleotide or gene sequence, by point mutations, deletions, or insertions of single or multiple nucleotide residues. A mutation in a polynucleotide includes, but is not limited to, mutations arising within a protein-encoding region of a gene as well as mutations in regions outside of a protein-encoding sequence, such as, but not limited to, regulatory or promoter sequences. A mutation in a coding polynucleotide such as a gene can be "silent", i.e., not reflected in an amino acid alteration upon expression, leading to a "sequence-conservative" variant of the gene. A mutation in a polypeptide includes, but is not limited to, mutation in the polypeptide sequence and mutation resulting in a modified amino acid. Non-limiting examples of a modified amino acid include, but are not limited to, a glycosylated amino acid, a sulfated amino acid, a prenylated (e.g., farnesylated, geranylgeranylated) amino acid, an acetylated amino acid, an acylated amino acid, a PEGylated amino acid, a biotinylated amino acid, a carboxylated amino acid, a phosphorylated amino acid, and the like.
- The term "engineer" refers to any manipulation of a molecule that results in a detectable change in the molecule, wherein the manipulation includes, but is not limited to, inserting a polynucleotide and/or polypeptide heterologous to the cell and mutating a polynucleotide and/or polypeptide native to the cell.
- The term "nucleic acid molecule" as used herein refers to any chain of at least two nucleotides bonded in sequence. For example, a nucleic acid molecule can be a DNA or a RNA.
- The terms "peptide", "polypeptide", and "protein" as used herein refer to any chain of at least two amino acids bonded in sequence, regardless of length or post-translational modification.
- The term "peptide-containing molecule" as used herein refers to a molecule that contains at least two amino acids.
- The terms "non-natural" and "unnatural" as used herein mean being directly or indirectly made or caused to be made through human action. Thus, a "non-natural amino acid" is an amino acid that has been produced through human manipulation and does not occur in nature. The term "non-canonical amino acid" is equivalent in meaning to the terms "non-natural amino acid" or "unnatural amino acid".
- The terms "cyclic" and "macrocyclic" as used herein mean having constituent atoms forming a ring. Thus, a "macrocyclic peptide" is a peptide molecule that contains at least one ring formed by atoms comprised in the molecule. As such, the term "macrocyclic peptide" comprises peptides that contain at least two rings separated from each other via a polypeptide sequence (also referred to herein as "polycyclic peptides") and peptides that contain at least two rings fused to each other (also referred to herein as "polycyclic peptides"). The term "macrocyclic peptide" also comprises peptides that contain two rings fused to each other (referred to herein also as "bicyclic peptides").
- The terms "cyclization" or "macrocyclization" as used herein refer to a process or reaction whereby a cyclic molecule is formed or is made to be formed.
- The term "peptidic backbone" as used herein refers to a sequence of atoms corresponding to the main backbone of a natural protein.
- The term "precursor polypeptide" or "polypeptide precursor" as used herein refers to a polypeptide that is capable of undergoing macrocyclization according to the methods disclosed herein.
- The terms "ribosomal polypeptide", "ribosomally produced polypeptide" and "ribosomally derived polypeptide" as used herein refer to a polypeptide that is produced by action of a ribosome, and specifically, by the ribosomal translation of a messenger RNA encoding for such polypeptide. The ribosome can be a naturally occurring ribosome, e.g., a ribosome derived from an archaeal, prokaryotic or eukaryotic organism, or an engineered (i.e., non-naturally occurring, artificial or synthetic) variant of a naturally occurring ribosome.
- The terms "intein" and "intein domain" as used herein refer to a naturally occurring or artificially constructed polypeptide sequence embedded within a precursor protein that can catalyze a splicing reaction during post-translational processing of the protein. The NEB Intein Registry (http://www.neb.com/neb/inteins.html) provides a list of known inteins.
- The term "split intein" as used herein refers to an intein that has at least two separate components not fused to one another.
- The term "splicing" as used herein refers to the process involving the cleavage of the main backbone of an intein-containing polypeptide by virtue of a reaction or process catalyzed by an intein or portions of an intein. "N-terminal splicing" refers to the cleavage of a polypeptide chain fused to the N-terminus of an intein, such reaction typically involving the scission of the thioester (or ester) bond formed via intein-catalyzed N→S (or N→O) acyl transfer, by action of a nucleophilic functional group or a chemical species containing a nucleophilic functional group. "C-terminal splicing" refers to the cleavage of a polypeptide chain fused to the C-terminus of an intein. "Self-splicing" as used herein refers to the process involving the cleavage of an intein from a polypeptide, within which the intein is embedded. "Trans-splicing" as used herein refers to a self-splicing process involving split inteins.
- The term "affinity tag" as used herein refers to a polypeptide that is able to bind reversibly or irreversibly to an organic molecule, a metal ion, a protein, or a nucleic acid molecule.
- The terms "vector" and "vector construct" as used herein refer to a vehicle by which a DNA or RNA sequence (e.g., a foreign gene) can be introduced into a host cell, so as to transform the host and promote expression (e.g., transcription and translation) of the introduced sequence. A common type of vector is a "plasmid", which generally is a self-contained molecule of double-stranded DNA that can readily accept additional (foreign) DNA and which can be readily introduced into a suitable host cell. A large number of vectors, including plasmid and fungal vectors, have been described for replication and/or expression in a variety of eukaryotic and prokaryotic hosts. Non-limiting examples include pKK plasmids (Clontech), pUC plasmids, pET plasmids (Novagen, Inc., Madison, Wis.), pRSET or pREP plasmids (Invitrogen, San Diego, Calif.), or pMAL plasmids (New England Biolabs, Beverly, Mass.), and many appropriate host cells, using methods disclosed or cited herein or otherwise known to those skilled in the relevant art. The terms "express" and "expression" refer to allowing or causing the information in a gene or DNA sequence to become manifest, for example producing a protein by activating the cellular functions involved in transcription and translation of a corresponding gene or DNA sequence. A DNA sequence is expressed in or by a cell to form an "expression product" such as a protein. The expression product itself, e.g., the resulting protein, may also be said to be "expressed" by the cell. A polynucleotide or polypeptide is expressed recombinantly, for example, when it is expressed or produced in a foreign host cell under the control of a foreign or native promoter, or in a native host cell under the control of a foreign promoter.
- The term "fused" as used herein means being connected through at least one covalent bond. The term "bound" as used herein means being connected through non-covalent interactions. Examples of non-covalent interactions are van der Waals, hydrogen bond, electrostatic, and hydrophobic interactions. Thus, a "DNA-binding peptide" refers to a peptide capable of connecting to a DNA molecule via non-covalent interactions. The term "tethered" as used herein means being connected through non-covalent interactions or through covalent bonds. Thus, a "polypeptide tethered to a solid support" refers to a polypeptide that is connected to a solid support (e.g., surface, resin bead) either via non-covalent interactions or through covalent bonds.
- Methods and compositions are provided for making artificial macrocyclic peptides from genetically encoded, ribosomally produced artificial polypeptides. These methods are based on the use of artificial precursor polypeptides comprising (a) a non-canonical amino acid residue carrying a thiol-reactive functional group (referred to as FG1); and (b) a cysteine residue that is positioned either upstream or downstream of the non-canonical amino acid in the polypeptide sequence. These methods are based on the ability of the FG1-bearing amino acid and cysteine residue to react with each other after ribosomal synthesis of the polypeptide, so that a macrocyclic peptide carrying a side-chain-to-side-chain covalent (thioether) linkage is formed. Schematic representations of these embodiments are provided in FIGS. 1A-B.
- Methods and compositions are also provided for making macrocyclic peptides from genetically encoded, ribosomally produced, intein-fused polypeptides. These methods are based on the use of artificial precursor polypeptides comprising (a) a non-canonical amino acid residue with a thiol-reactive functional group (referred to as FG1); (b) a cysteine residue positioned upstream or downstream of the non-canonical amino acid within the polypeptide sequence; and (c) an intein protein positioned upstream or downstream of the non-canonical amino acid or of the cysteine residue within the polypeptide sequence. These methods exploit the ability of this non-canonical amino acid and cysteine residue to react with each other after ribosomal synthesis of the precursor polypeptide, so that a macrocyclic peptide carrying a side-chain-to-side-chain covalent (thioether) linkage is formed. These methods also exploit the ability of the intein to undergo N-terminal splicing, C-terminal splicing, or self-splicing, so that the macrocyclic peptide is released upon intein splicing. Schematic representations of these embodiments are provided in FIGS. 2A-B and 3A-B.
- Methods and compositions are also provided for making artificial macrocyclic peptides from genetically encoded, ribosomally produced, split intein-fused polypeptides. These methods are based on the use of artificial precursor polypeptides comprising (a) a non-canonical amino acid residue with a thiol-reactive functional group (referred to as FG1); (b) a cysteine residue positioned upstream or downstream of the non-canonical amino acid within the polypeptide sequence; and (c) a split intein domain positioned upstream or downstream of the non-canonical amino acid or the cysteine residue within the polypeptide sequence. These methods exploit the ability of this non-canonical amino acid and cysteine residue to react with each other after ribosomal synthesis of the precursor polypeptide, so that a macrocyclic peptide carrying a side-chain-to-side-chain covalent (thioether) linkage is formed. These methods also exploit the ability of the split intein to undergo trans-splicing, so that the bicyclic peptide is released upon split intein trans-splicing. Schematic representations of these embodiments are provided in FIGS. 4A-B.
- Methods and compositions are also provided for making artificial macrocyclic peptides from genetically encoded, ribosomally produced, split intein-fused polypeptides. These methods are based on the use of artificial precursor polypeptides comprising (a) a non-canonical amino acid residue with two thiol-reactive functional groups (referred to as FG1 and FG2); (b) two cysteine residues positioned upstream and downstream of the non-canonical amino acid within the polypeptide sequence. These methods are based on the ability of the FG1/FG2-bearing amino acid to react with the two cysteine residues after ribosomal synthesis of the polypeptide, so that a bicyclic peptide carrying two side-chain-to-side-chain covalent (thioether) linkages is formed. Schematic representations of these embodiments are provided in FIGS. 37A-B.
- Artificial, engineered and recombinant nucleic acid molecules and peptide sequences (or amino acid sequences) for use in these methods are also provided.
- In some embodiments, a method is provided for making an artificial macrocyclic peptide, the method comprising:
- a. providing a nucleic acid molecule encoding for a polypeptide of structure:
(AA)m-Z-(AA)n-Cys-(AA)p (I)
or
(AA)m-Cys-(AA)n-Z-(AA)p (II)
wherein:
- i. (AA)m is an N-terminal amino acid or peptide sequence,
- ii. Z is a non-canonical amino acid carrying a side-chain functional group FG1, this FG1 being a functional group selected from the group consisting of -(CH2)nX, where X is F, Cl, Br, or I and n is an integer number from 1 to 10; -C(O)CH2X, where X is F, Cl, Br, or I; -CH(R')X, where X is F, Cl, Br, or I; -C(O)CH(R')X, where X is F, Cl, Br, or I; -OCH2CH2X, where X is F, Cl, Br, or I; -C(O)CH=C=C(R')(R"), -SO2C(R')=C(R')(R"), -C(O)C(R')=C(R')(R"), -C(R')=C(R')C(O)OR', -C(R')=C(R')C(O)N(R')(R"), -C(R')=C(R')-CN, -C(R')=C(R')-NO2, -C≡C-C(O)OR', -C≡C-C(O)N(R')(R"), unsubstituted or substituted oxirane, unsubstituted or substituted aziridine, 1,2-oxathiolane 2,2-dioxide, 4-fluoro-1,2-oxathiolane 2,2-dioxide, and 4,4-difluoro-1,2-oxathiolane 2,2-dioxide, where each R' and R" is independently H, an aliphatic, a substituted aliphatic, an aryl, or a substituted aryl group,
- iii. (AA)n is a target peptide sequence,
- iv. (AA)p is a C-terminal amino acid or peptide sequence;
- b. introducing the nucleic acid molecule into an expression system and expressing the nucleic acid molecule in the expression system, thereby producing the polypeptide; and
- c. allowing the functional group FG1 to react with the cysteine (Cys) side-chain sulfhydryl group (-SH), thereby producing the macrocyclic peptide.
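- The layouts of formulas (I) and (II) can be illustrated with a minimal Python sketch. It is not part of the disclosed method: the one-letter placeholder "Z" for the non-canonical amino acid, the function name `classify_precursor`, and the example sequences are all hypothetical, and the sketch assumes the simplest case of exactly one Z and exactly one Cys per polypeptide.

```python
from typing import NamedTuple, Optional


class PrecursorLayout(NamedTuple):
    formula: str  # "I" if Z is upstream of Cys, "II" if Cys is upstream of Z
    m: int        # length of the N-terminal tail (AA)m
    n: int        # length of the target peptide sequence (AA)n
    p: int        # length of the C-terminal tail (AA)p


def classify_precursor(seq: str) -> Optional[PrecursorLayout]:
    """Map a one-letter sequence onto formula (I) or (II).

    Assumption for illustration only: "Z" marks the non-canonical amino
    acid and the sequence holds exactly one "Z" and exactly one "C".
    Returns None when the sequence does not fit this simplified layout.
    """
    if seq.count("Z") != 1 or seq.count("C") != 1:
        return None
    z_pos, cys_pos = seq.index("Z"), seq.index("C")
    first, second = min(z_pos, cys_pos), max(z_pos, cys_pos)
    if first == 0:
        # (AA)m is described as comprising at least one amino acid.
        return None
    return PrecursorLayout(
        formula="I" if z_pos < cys_pos else "II",
        m=first,
        n=second - first - 1,
        p=len(seq) - second - 1,
    )
```

For the hypothetical sequence "MGZTGSKLC" the sketch reports formula (I) with m = 2, n = 5 and p = 0; for "MCAAAZGG" it reports formula (II) with m = 1, n = 3 and p = 2.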
- In other embodiments, which are not according to the invention and are presented for illustration purposes only, a method is provided for making an artificial macrocyclic peptide, the method comprising:
- a. providing a nucleic acid molecule encoding for a polypeptide of structure:
(AA)m-Cys-(AA)n-Z2-(AA)o-Cys-(AA)p (V)
wherein:
- i. (AA)m is an N-terminal amino acid or peptide sequence,
- ii. Z2 is a non-canonical amino acid carrying two side-chain functional groups FG1 and FG2, these FG1 and FG2 being functional groups independently selected from the group consisting of -(CH2)nX, where X is F, Cl, Br, or I and n is an integer number from 1 to 10; -C(O)CH2X, where X is F, Cl, Br, or I; -CH(R')X, where X is F, Cl, Br, or I; -C(O)CH(R')X, where X is F, Cl, Br, or I; -OCH2CH2X, where X is F, Cl, Br, or I; -C(O)CH=C=C(R')(R"), -SO2C(R')=C(R')(R"), -C(O)C(R')=C(R')(R"), -C(R')=C(R')C(O)OR', -C(R')=C(R')C(O)N(R')(R"), -C(R')=C(R')-CN, -C(R')=C(R')-NO2, -C≡C-C(O)OR', -C≡C-C(O)N(R')(R"), unsubstituted or substituted oxirane, unsubstituted or substituted aziridine, 1,2-oxathiolane 2,2-dioxide, 4-fluoro-1,2-oxathiolane 2,2-dioxide, and 4,4-difluoro-1,2-oxathiolane 2,2-dioxide, where each R' and R" is independently H, an aliphatic, a substituted aliphatic, an aryl, or a substituted aryl group,
- iii. (AA)n is a target peptide sequence,
- iv. (AA)o is a second target peptide sequence,
- v. (AA)p is a C-terminal amino acid or peptide sequence;
- b. introducing the nucleic acid molecule into an expression system and expressing the nucleic acid molecule in the expression system, thereby producing the polypeptide; and
- c. allowing the functional groups FG1 and FG2 to react with the side-chain sulfhydryl group (-SH) of the cysteines (Cys), thereby producing the macrocyclic peptide.
- According to the method, (AA)m is an N-terminal sequence comprising at least one amino acid, where AA corresponds to a generic amino acid residue and m corresponds to the number of amino acid residues composing such sequence. (AA)m is also referred to as "N-terminal tail". (AA)p is a C-terminal sequence that has 0 or at least one amino acid, where AA corresponds to a generic amino acid residue and p corresponds to the number of amino acid residues composing such sequence. (AA)p is also referred to as "C-terminal tail". (AA)n (and (AA)o, when present) is a peptide sequence of variable length (also referred to as "target peptide sequence"), where AA corresponds to a generic amino acid residue and n corresponds to the number of amino acid residues composing such peptide sequence. Cys is a cysteine amino acid residue. Z is an amino acid that carries a side-chain functional group FG1, which can react with the side-chain sulfhydryl group (-SH) of the cysteine residue to form a stable thioether bond.
- As disclosed herein, the ability of an artificial polypeptide of formula (I) or (II) (also referred to herein as "precursor polypeptide") to produce a macrocyclic peptide is conferred by the ability of the nucleophilic sulfhydryl group carried by the cysteine residue to react intramolecularly with the electrophilic functional group FG1 carried by the amino acid Z, thereby forming a covalent, inter-side-chain thioether bond. Depending on the nature of FG1, this reaction proceeds via a thiol-mediated nucleophilic substitution reaction, a thiol-mediated Michael-type addition reaction, or a radical thiol-ene or thiol-yne reaction. Whereas the electrophilic functional group FG1 in the precursor polypeptide could in principle react intermolecularly with free cysteine or other thiol-containing molecules contained in the expression system (e.g., glutathione), it was discovered by the inventors that appropriate functional groups FG1 can be found so that the desired intramolecular thioether-bond forming reaction occurs exclusively or preferentially over the undesired intermolecular side-reactions. This result can be achieved because of the spatial proximity between the nucleophilic cysteine residue and the electrophilic Z amino acid, resulting in an increased effective concentration of the reacting species (i.e., -SH and FG1 groups, respectively) in the intramolecular settings as compared to the intermolecular settings, which in turn favors the intramolecular peptide cyclization reaction over undesired intermolecular reactions. Similar considerations can be made in the context of certain embodiments, wherein a precursor polypeptide of formula (V) along with a bifunctional cysteine-reactive amino acid capable of forming thioether bonds with two cysteine residues within the polypeptide (residue Z2) is used.
- A first advantage of the methods described herein is that they provide a highly versatile approach for the preparation of structurally diverse artificial macrocyclic peptides. Indeed, they offer multiple opportunities toward the structural and functional diversification of these compounds, e.g., through variation of the length and composition of the target peptide sequence ((AA)n), variation of the structure of the amino acid Z, variation of the position of the amino acid Z relative to the cysteine residue (e.g., precursor polypeptide (I) versus (II)), variation of the length and composition of the N-terminal tail ((AA)m), and variation of the length and composition of the C-terminal tail ((AA)p). Further structural diversification can be achieved by combining multiple Z/Cys pairs within the same precursor polypeptide or by using bifunctional cysteine-reactive amino acids (Z2) in order to obtain polycyclic and bicyclic peptides. Accordingly, and because of the genetically encoded and ribosomal nature of the precursor polypeptides, the methods and compositions described herein can be used to produce vast libraries of structurally and functionally diverse macrocyclic peptides, which can be screened to identify compounds that can modulate, inhibit or promote interactions between biomolecules (e.g., enzymes, proteins, nucleic acids) for a variety of applications, including drug discovery.
- A second advantage of the methods disclosed herein is that they produce peptide molecules whose conformational flexibility is restrained by virtue of at least one intramolecular thioether linkage. As illustrated in Example 8, this feature can confer these molecules with advantageous properties such as, for example, enhanced binding affinity, increased stability against proteolysis, and/or more favorable membrane-crossing properties, as compared to linear peptides or peptides lacking the intramolecular thioether linkage. In addition, the thioether linkage is redox and chemically stable in biological milieu, including the intracellular environment.
- A third advantage of the methods disclosed herein is that they allow for the preparation of macrocyclic peptides from genetically encoded, ribosomally produced polypeptides. Accordingly, these macrocyclic peptides can be produced as fused to a genetically encoded affinity tag, DNA-binding protein/peptide, protein-binding protein/peptide, fluorescent protein, or enzyme, which can be achieved via the introduction of one or more of these elements within the N-terminal tail and/or within the C-terminal tail of the precursor polypeptide. On one hand, these tags/proteins/enzymes can be useful to facilitate the purification and/or immobilization of the macrocyclic peptides for functional screening as demonstrated in Examples 4, 5 and 8. On the other hand, very large libraries of macrocyclic peptides can be rapidly and cost-effectively produced utilizing precursor polypeptides in which the target peptide sequence ((AA)n), N-terminal tail ((AA)m), and/or C-terminal tail ((AA)p), is partially or fully randomized genetically. These features of the method can allow one to produce macrocyclic peptides as fused to a carrier protein of a display system such as phage display, mRNA display, ribosome display, yeast display, and the like. So, for example, the methods described herein allow one to generate combinatorial libraries of macrocyclic peptides that are fused to the pIII protein of M13 bacteriophage. These phage-displayed macrocyclic peptide libraries can then be 'panned' against a target biomolecule of interest according to procedures well known in the art (Lane and Stephen 1993; Giebel, Cass et al. 1995; Sidhu, Lowman et al. 2000) in order to identify macrocyclic peptide binders or inhibitors of such biomolecule.
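- The scale of such randomized libraries can be illustrated with a short Python sketch. It is a hedged illustration only: the function names, the placeholder letter "Z" for the non-canonical amino acid, and the choice to randomize only the target sequence of a formula (I) precursor are assumptions made here, not details taken from the Examples.

```python
from itertools import product
from typing import Iterator


def theoretical_diversity(randomized_positions: int, alphabet_size: int = 20) -> int:
    """Number of distinct sequences when each randomized position can
    independently take any of `alphabet_size` amino acids."""
    if randomized_positions < 0:
        raise ValueError("randomized_positions must be non-negative")
    return alphabet_size ** randomized_positions


def enumerate_library(n_tail: str, target_len: int, c_tail: str,
                      alphabet: str) -> Iterator[str]:
    """Yield formula (I) precursors (AA)m-Z-(AA)n-Cys-(AA)p in which the
    target sequence (AA)n is fully randomized over `alphabet`.

    "Z" is a hypothetical placeholder for the non-canonical amino acid.
    Leaving "C" out of `alphabet` keeps a single Cys per precursor.
    """
    for combo in product(alphabet, repeat=target_len):
        yield n_tail + "Z" + "".join(combo) + "C" + c_tail
```

Under these assumptions, fully randomizing a six-residue target sequence over the 20 canonical amino acids gives `theoretical_diversity(6)`, i.e. 64,000,000 possible sequences, which is why explicit enumeration is only practical for very short targets or reduced alphabets.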
- A fourth advantage of the methods described herein is that they also enable the production of macrocyclic peptides inside a cell-based expression host such as a bacterial, yeast, insect, or mammalian cell. Intracellular production of the macrocyclic peptide can then be coupled to an (intra)cellular reporter system, phenotypic screen, or selection system, in order to identify a macrocyclic peptide capable of inhibiting or activating a certain cellular process, biomolecule, or enzymatic reaction linked to the reporter output, phenotype, or cell survival, respectively.
- A fifth advantage of the methods disclosed herein is that the production of the macrocyclic peptides can be carried out under physiological conditions (e.g., in aqueous buffer, neutral pH, physiological temperature) and in complex biological media (e.g., inside a cell, in cell lysate) and in the presence of biological molecules (proteins, nucleic acids, cell metabolites) and biological material. One implication of this is that the production of macrocyclic peptides according to the methods disclosed herein can be coupled to one of the several techniques known in the art for the display and high-throughput screening of biological peptide libraries.
- Because of the aforementioned advantageous features, the methods described herein can be useful to greatly accelerate and facilitate the discovery of bioactive peptide-based compounds as potential drug molecules and chemical probes or the identification of lead structures for the development of new chemical probes and drugs.
- In some embodiments, Z is an amino acid of structure:
wherein Y is a linker group selected from the group consisting of aliphatic, aryl, substituted aliphatic, substituted aryl, heteroatom-containing aliphatic, heteroatom-containing aryl, substituted heteroatom-containing aliphatic, substituted heteroatom-containing aryl, alkoxy, and aryloxy groups.
- In some embodiments, Z is an amino acid of structure (IV) wherein Y is a linker group selected from the group consisting of C1-C24 alkyl, C1-C24 substituted alkyl, C1-C24 substituted heteroatom-containing alkyl, C1-C24 substituted heteroatom-containing alkyl, C2-C24 alkenyl, C2-C24 substituted alkenyl, C2-C24 substituted heteroatom-containing alkenyl, C2-C24 substituted heteroatom-containing alkenyl, C5-C24 aryl, C5-C24 substituted aryl, C5-C24 substituted heteroatom-containing aryl, C5-C24 substituted heteroatom-containing aryl, C1-C24 alkoxy, C5-C24 aryloxy groups.
- In some embodiments, Z is an amino acid of structure (IV) wherein Y is a linker group selected from -CH2-C6H4-, -CH2-C6H4-O-, -CH2-C6H4-NH-, -(CH2)4-, -(CH2)4NH-, -(CH2)4NHC(O)-, and -(CH2)4NHC(O)O-.
- In specific embodiments, the amino acid Z is selected from the group consisting of 4-(2-bromoethoxy)-phenylalanine, 3-(2-bromoethoxy)-phenylalanine, 4-(2-chloroethoxy)-phenylalanine, 3-(2-chloroethoxy)-phenylalanine, 4-(1-bromoethyl)-phenylalanine, 3-(1-bromoethyl)-phenylalanine, 4-(aziridin-1-yl)-phenylalanine, 3-(aziridin-1-yl)-phenylalanine, 4-acrylamido-phenylalanine, 3-acrylamido-phenylalanine, 4-(2-fluoro-acetamido)-phenylalanine, 3-(2-fluoro-acetamido)-phenylalanine, 4-(2-chloro-acetamido)-phenylalanine, 3-(2-chloro-acetamido)-phenylalanine, 3-(2-fluoro-acetyl)-phenylalanine, 4-(2-fluoro-acetyl)-phenylalanine, Nε-((2-bromoethoxy)carbonyl)-lysine, Nε-((2-chloroethoxy)carbonyl)-lysine, Nε-(buta-2,3-dienoyl)-lysine, Nε-acryl-lysine, Nε-crotonyl-lysine, Nε-(2-fluoro-acetyl)-lysine, and Nε-(2-chloro-acetyl)-lysine.
- In some embodiments, Z2 is an amino acid of structure:
wherein Y2, Y3, and L are linker groups selected from the group consisting of aliphatic, aryl, substituted aliphatic, substituted aryl, heteroatom-containing aliphatic, heteroatom-containing aryl, substituted heteroatom-containing aliphatic, substituted heteroatom-containing aryl, alkoxy, aryloxy groups.
- In some embodiments, Z2 is an amino acid of structure (VI) wherein Y2 is a linker group selected from the group consisting of C1-C24 alkyl, C1-C24 substituted alkyl, C1-C24 substituted heteroatom-containing alkyl, C1-C24 substituted heteroatom-containing alkyl, C2-C24 alkenyl, C2-C24 substituted alkenyl, C2-C24 substituted heteroatom-containing alkenyl, C2-C24 substituted heteroatom-containing alkenyl, C5-C24 aryl, C5-C24 substituted aryl, C5-C24 substituted heteroatom-containing aryl, C5-C24 substituted heteroatom-containing aryl, C1-C24 alkoxy, C5-C24 aryloxy groups.
- In specific embodiments, the amino acid Z2 is selected from the group consisting of 3,5-bis(2-bromoethoxy)-phenylalanine, 3,5-bis(2-chloroethoxy)-phenylalanine, 3,5-bis(1-bromoethyl)-phenylalanine, 3,5-bis(aziridin-1-yl)-phenylalanine, 3,5-bis-acrylamido-phenylalanine, 3,5-bis(2-fluoro-acetamido)-phenylalanine, 3,5-bis(2-fluoro-acetyl)-phenylalanine, 4-((1,3-dibromopropan-2-yl)oxy)-phenylalanine, 4-((1,3-dichloropropan-2-yl)oxy)-phenylalanine, Nε-(((1,3-dibromopropan-2-yl)oxy)carbonyl)-lysine, Nε-(((1,3-dichloropropan-2-yl)oxy)carbonyl)-lysine, 4-(2,3-dibromopropoxy)-phenylalanine, 3-(2,3-dibromopropoxy)-phenylalanine, 4-(2,3-dichloropropoxy)-phenylalanine, 3-(2,3-dichloropropoxy)-phenylalanine, Nε-((2,3-dibromopropoxy)carbonyl)-lysine, and Nε-((2,3-dichloropropoxy)carbonyl)-lysine.
- Artificial nucleic acid molecules for use according to the methods provided herein include, but are not limited to, those that encode for a polypeptide of general formula (I), (II), or (V) as defined above. The codon encoding for the amino acid Z (or Z2) in these polypeptides can be one of the 61 sense codons of the standard genetic code, a stop codon (TAG, TAA, TGA), or a four-base frameshift codon (e.g., TAGA, AGGT, CGGG, GGGT, CTCT). In some embodiments, the codon encoding for the amino acid Z (or Z2) within the nucleotide sequence encoding for the precursor polypeptide of formula (I), (II) or (V) is an amber stop codon (TAG), an ochre stop codon (TAA), an opal stop codon (TGA), or a four-base frameshift codon (see Example 2). In other embodiments, the codon encoding for Z (or Z2) in the nucleotide sequence encoding for these precursor polypeptides is the amber stop codon, TAG, or the 4-base codon, TAGA.
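- How a reassigned amber codon maps onto the encoded polypeptide can be sketched in Python as follows. This is an illustrative model only: it reads an in-frame TAG as the placeholder letter "Z", treats the other two stop codons as terminators, ignores competition with release factors, and does not handle four-base codons such as TAGA. The function name and the example coding sequence are hypothetical.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering
# of first, second and third codon positions.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate_with_suppression(cds: str, suppressed_codon: str = "TAG",
                               ncaa_letter: str = "Z") -> str:
    """Translate a DNA coding sequence, reading `suppressed_codon` as the
    non-canonical amino acid (placeholder letter `ncaa_letter`).

    Translation stops at the first stop codon that is not suppressed.
    Trailing bases that do not form a full codon are ignored.
    """
    cds = cds.upper()
    protein = []
    for start in range(0, len(cds) - 2, 3):
        codon = cds[start:start + 3]
        if codon == suppressed_codon:
            protein.append(ncaa_letter)
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)
```

With the hypothetical coding sequence "ATGGGTTAGAAATGCTAA", the sketch returns "MGZKC": the in-frame TAG is read as Z and the final TAA terminates translation.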
- The non-canonical amino acid Z (or Z2) can be introduced into the precursor polypeptide through direct incorporation during ribosomal synthesis of the precursor polypeptide, or generated post-translationally through enzymatic or chemical modification of the precursor polypeptide, or by a combination of these procedures. In some embodiments, the amino acid Z (or Z2) is introduced into the precursor polypeptide during ribosomal synthesis of the precursor polypeptide via either stop codon suppression or four-base frameshift codon suppression. In other embodiments, the amino acid Z (or Z2) is introduced into the precursor polypeptide during ribosomal synthesis of the precursor polypeptide via amber (TAG) stop codon suppression or via 4-base TAGA codon suppression.
- Several methods are known in the art for introducing a non-canonical amino acid into a recombinant or in vitro translated artificial polypeptide, any of which can be applied for preparing artificial precursor polypeptides suitable for the methods disclosed herein. These art-known methods include, but are not limited to, methods for suppression of a stop codon or of a four-base frameshift codon with a non-canonical amino acid using engineered (i.e., non-naturally occurring, artificial or synthetic) tRNA/aminoacyl-tRNA synthetase (AARS) pairs (Wang, Xie et al. 2006; Wu and Schultz 2009; Liu and Schultz 2010; Fekner and Chan 2011; Lang and Chin 2014). Examples of tRNA/aminoacyl-tRNA synthetase (AARS) pairs used for this purpose include, but are not limited to, engineered variants of Methanococcus jannaschii AARS/tRNA pairs (e.g., TyrRS/tRNATyr), of Saccharomyces cerevisiae AARS/tRNA pairs (e.g., AspRS/tRNAAsp, GlnRS/tRNAGln, TyrRS/tRNATyr, and PheRS/tRNAPhe), of Escherichia coli AARS/tRNA pairs (e.g., TyrRS/tRNATyr, LeuRS/tRNALeu), and of Methanosarcina mazei AARS/tRNA pairs (PylRS/tRNAPyl) (Wang, Xie et al. 2006; Wu and Schultz 2009; Liu and Schultz 2010; Fekner and Chan 2011; Lang and Chin 2014). Alternatively, natural or engineered four-base codon suppressor tRNAs and their cognate aminoacyl-tRNA synthetases can be used for the same purpose (Anderson, Wu et al. 2004; Rodriguez, Lester et al. 2006; Neumann, Slusarczyk et al. 2010; Neumann, Wang et al. 2010). Alternatively, a non-canonical amino acid can be incorporated into a polypeptide using chemically (Dedkova, Fahmi et al. 2003) or enzymatically (Bessho, Hodgson et al. 2002; Hartman, Josephson et al. 2006) aminoacylated tRNA molecules and using a cell-free protein expression system in the presence of the aminoacylated tRNA molecules (Kourouklis, Murakami et al. 2005; Murakami, Ohta et al. 2006).
Alternatively, a non-canonical amino acid can be incorporated into a polypeptide by exploiting the promiscuity of wild-type aminoacyl-tRNA synthetase enzymes using a cell-free protein expression system, in which one or more natural amino acids are replaced with structural analogs (Josephson, Hartman et al. 2005; Hartman, Josephson et al. 2007). Any of these methods can be used to introduce an unnatural amino acid of the type (III), (IV), (VI) or (VII) into the precursor polypeptide for the purpose of generating macrocyclic peptides according to the methods disclosed herein.
- In some embodiments, the non-canonical amino acid Z (or Z2) is incorporated into the precursor polypeptide via stop codon or four-base codon suppression methods using an engineered AARS/tRNA pair derived from Methanococcus jannaschii tyrosyl-tRNA synthetase (MjTyrRS) and its cognate tRNA (MjtRNATyr), an engineered AARS/tRNA pair derived from Methanosarcina mazei pyrrolysyl-tRNA synthetase (MmPylRS) and its cognate tRNA (tRNAPyl), or an engineered AARS/tRNA pair derived from Escherichia coli tyrosyl-tRNA synthetase (EcTyrRS) and its cognate tRNA (EctRNATyr).
- In the characterization of the aminoacyl-tRNA synthetase enzymes disclosed herein, these enzymes can be described in reference to the amino acid sequence of a naturally occurring aminoacyl-tRNA synthetase or another engineered aminoacyl-tRNA synthetase. As such, amino acid residue positions in the aminoacyl-tRNA synthetase enzyme are numbered beginning from the first amino acid after the initial methionine (M) residue (i.e., the first amino acid after the initial methionine M represents residue position 1). It will be understood that the initiating methionine residue may be removed by biological processing machinery, such as in a host cell or in vitro translation system, to generate a mature protein lacking the initiating methionine residue. The amino acid residue position at which a particular amino acid or amino acid change is present is sometimes described herein as "Xn", or "position n", where n refers to the residue position.
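- For illustration, the numbering convention described above (position 1 is the first residue after the initiating methionine) can be expressed as a short Python sketch. The helper names and the toy sequence are hypothetical; the sketch assumes the input sequence still carries the initiating methionine, so that position Xn coincides with the zero-based string index n.

```python
# Illustrative sketch only; helper names and sequences are assumptions.

def residue_at(seq_with_met: str, n: int) -> str:
    """Return the residue at position Xn.

    X1 is the first residue after the initiating methionine, so for a
    sequence that still includes that methionine Xn is string index n.
    """
    if not seq_with_met.startswith("M"):
        raise ValueError("expected a sequence beginning with the initiating methionine")
    if not 1 <= n <= len(seq_with_met) - 1:
        raise ValueError(f"position X{n} is outside the sequence")
    return seq_with_met[n]


def apply_change(seq_with_met: str, n: int, new_residue: str) -> str:
    """Return a copy of the sequence with position Xn replaced by new_residue."""
    residue_at(seq_with_met, n)  # validates the position
    if len(new_residue) != 1:
        raise ValueError("new_residue must be a one-letter amino acid code")
    return seq_with_met[:n] + new_residue + seq_with_met[n + 1:]


if __name__ == "__main__":
    toy = "MDEFKS"  # hypothetical six-residue sequence, not a real synthetase
    print(residue_at(toy, 1))         # first residue after Met
    print(apply_change(toy, 2, "A"))  # change at position X2
```

For a mature protein from which the initiating methionine has been removed, the same position Xn would instead correspond to string index n - 1.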
- In some embodiments, the stop codon/frameshift codon suppression system used for incorporating the amino acid Z (or Z2) into the precursor polypeptide comprises an engineered variant of Methanococcus jannaschii tRNATyr as encoded by a nucleotide of sequence SEQ ID NO: 101, 102, 103, or 104; and an engineered variant of Methanococcus jannaschii tyrosyl-tRNA synthetase (SEQ ID NO: 77), said variant comprising an amino acid change at one or more of the following amino acid positions of SEQ ID NO: 77: X32, X63, X65, X70, X107, X108, X109, X155, X158, X159, X160, X161, X162, X163, X164, X167, and X286.
- In other embodiments, the stop codon/frameshift codon suppression system used for incorporating the amino acid Z (or Z2) into the precursor polypeptide consists of a Methanococcus jannaschii tRNATyr variant selected from the group of tRNA molecules encoded by the nucleotide sequence of SEQ ID NOs: 101, 102, 103, and 104; and a Methanococcus jannaschii tyrosyl-tRNA synthetase variant selected from the group of polypeptides of SEQ ID NOs: 77, 81, 82, 83, 84, 85, 86, 87, 88, 89, and 90.
- In some embodiments, the stop codon/frameshift codon suppression system used for incorporating the amino acid Z (or Z2) into the precursor polypeptide comprises an engineered variant of Methanosarcina species tRNAPyl or Desulfitobacterium hafniense tRNAPyl as encoded by a nucleotide of sequence SEQ ID NO: 105, 106, 107, 108, 109, 110, 111, or 112; and an engineered variant of Methanosarcina mazei pyrrolysyl-tRNA synthetase (SEQ ID NO: 78), said variant comprising an amino acid change at one or more of the following amino acid positions of SEQ ID NO: 78: X302, X305, X306, X309, X346, X348, X364, X384, X401, X405, and X417.
- In some embodiments, the stop codon/frameshift codon suppression system used for incorporating the amino acid Z (or Z2) into the precursor polypeptide comprises an engineered variant of Methanosarcina species tRNAPyl or Desulfitobacterium hafniense tRNAPyl as encoded by a nucleotide of sequence SEQ ID NO: 105, 106, 107, 108, 109, 110, 111, or 112; and an engineered variant of Methanosarcina barkeri pyrrolysyl-tRNA synthetase (SEQ ID NO: 79), said variant comprising an amino acid change at one or more of the following amino acid positions of SEQ ID NO: 79: X76, X266, X270, X271, X273, X274, X313, X315, and X349.
- In other embodiments, the stop codon/frameshift codon suppression system used for incorporating the amino acid Z (or Z2) into the precursor polypeptide consists of a tRNAPyl variant selected from the group of tRNA molecules encoded by the nucleotide sequence of SEQ ID NOs: 105, 106, 107, 108, 109, 110, 111, and 112; and a pyrrolysyl-tRNA synthetase variant selected from the group of polypeptides of SEQ ID NOs: 78, 79, 91, 92, 93, 94, 95, and 96.
- In some embodiments, the stop codon/frameshift codon suppression system used for incorporating the amino acid Z (or Z2) into the precursor polypeptide comprises an engineered variant of Escherichia coli tRNATyr or Bacillus stearothermophilus tRNATyr as encoded by a nucleotide of sequence SEQ ID NO: 113, 114, 115, 116, 117, 118, 119, or 120; and an engineered variant of Escherichia coli tyrosyl-tRNA synthetase (SEQ ID NO: 80), said variant comprising an amino acid change at one or more of the following amino acid positions of SEQ ID NO: 80: X37, X182, X183, X186, and X265.
- In other embodiments, the stop codon/frameshift codon suppression system used for incorporating the amino acid Z (or Z2) into the precursor polypeptide consists of a tRNATyr variant selected from the group of tRNA molecules encoded by the nucleotide sequence of SEQ ID NOs: 113, 114, 115, 116, 117, 118, 119, and 120; and an E. coli tyrosyl-tRNA synthetase variant selected from the group of polypeptides of SEQ ID NOs: 80, 97, 98, 99, and 100.
- In some embodiments, the aminoacyl-tRNA synthetase used for incorporating the amino acid Z (or Z2) into the precursor polypeptide can additionally have at least one amino acid residue difference at positions not specified by an X above as compared to the sequence of SEQ ID NO: 77, 78, 79, or 80. In some embodiments, the differences can be 1-2, 1-5, 1-10, 1-20, 1-30, 1-40, 1-50, 1-75, 1-100, 1-150, or 1-200 amino acid residue differences at other positions not defined by X above.
- In some embodiments, the suppressor tRNA molecule used for incorporating the amino acid Z (or Z2) into the precursor polypeptide can additionally have at least one nucleotide difference as compared to the sequence encoded by the gene of SEQ ID NO: 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, or 120. In some embodiments, the differences can be 1-2, 1-5, 1-10, 1-20, 1-30, 1-40, 1-50, or 1-60 nucleotide differences as compared to the sequences encoded by these genes.
- In another embodiment of the method, the engineered variant of Methanococcus jannaschii tyrosyl-tRNA synthetase (SEQ ID NO: 77) comprises at least one of the features selected from the group consisting of: X32 is Tyr, Leu, Ala, Gly, Thr, His, Glu, Val, or Gln; X65 is Leu, His, Tyr, Val, Ser, Thr, Gly, or Glu; X67 is Ala or Gly; X70 is His, Ala, Cys, or Ser; X107 is Glu, Pro, Asn, or Thr; X108 is Phe, Trp, Ala, Ser, Arg, Gly, Tyr, His, or Glu; X109 is Gln, Met, Asp, Lys, Glu, Pro, His, Gly, or Leu; X155 is Gln, Glu, or Gly; X158 is Asp, Gly, Glu, Ala, Pro, Thr, Ser, or Val; X159 is Ile, Cys, Pro, Leu, Ser, Trp, His, or Ala; X160 is His or Gln; X161 is Tyr or Gly; X162 is Leu, Arg, Ala, Gln, Gly, Lys, Ser, Glu, Tyr, or His; X163 is Gly or Asp; X164 is Val or Ala; X167 is Ala or Val; and X286 is Asp or Arg.
- In another embodiment of the method, the engineered variant of Methanosarcina mazei pyrrolysyl-tRNA synthetase (SEQ ID NO: 78) comprises at least one of the features selected from the group consisting of: X302 is Ala or Thr; X305 is Leu or Met; X306 is Tyr, Ala, Met, Ile, Leu, Thr, or Gly; X309 is Leu, Ala, Pro, Ser, or Arg; X346 is Asn, Ala, Ser, or Val; X348 is Cys, Ala, Thr, Leu, Lys, Met, or Trp; X364 is Thr or Lys; X384 is Tyr or Phe; X401 is Val or Leu; X405 is Ile or Arg; and X417 is Trp, Thr, or Leu.
- In another embodiment of the method, the engineered variant of Methanosarcina barkeri pyrrolysyl-tRNA synthetase (SEQ ID NO: 79) comprises at least one of the features selected from the group consisting of: X76 is Asp or Gly; X266 is Leu, Val, or Met; X270 is Leu or Ile; X271 is Tyr, Phe, Leu, Met, or Ala; X274 is Leu, Ala, Met, or Gly; X313 is Cys, Phe, Ala, Val, or Ile; X315 is Met or Phe; and X349 is Tyr, Phe, or Trp.
- In another embodiment of the method, the engineered variant of Escherichia coli tyrosyl-tRNA synthetase (SEQ ID NO: 80) comprises at least one of the features selected from the group consisting of: X37 is Tyr, Ile, Gly, Val, Leu, Thr, or Ser; X182 is Asp, Gly, Ser, or Thr; X183 is Phe, Met, Tyr, or Ala; X186 is Leu, Ala, Met, or Val; and X265 is Asp or Arg.
- An aspect of the methods disclosed herein is the identification and selection of a suitable aminoacyl-tRNA synthetase for incorporating an amino acid Z (or Z2) as defined above, into the artificial precursor polypeptide. Various methods are known in the art to evaluate and quantify the relative efficiency of a given wild-type or engineered aminoacyl-tRNA synthetase to incorporate a non-canonical amino acid into a protein (Young, Young et al. 2011). Any of these methods can be used to guide the identification and choice of a suitable aminoacyl-tRNA synthetase for incorporating a desired amino acid Z (or Z2) into the precursor polypeptide. For example, such efficiency can be measured via a fluorescence assay based on the expression of a reporter fluorescent protein (e.g., green fluorescent protein), whose encoding gene has been modified to contain a codon to be suppressed (e.g., amber stop codon). Expression of the reporter fluorescent protein is then induced in a suitable expression system (e.g., an E. coli or yeast cell) in the presence of the aminoacyl-tRNA synthetase to be tested, a cognate suppressor tRNA (e.g., amber stop codon suppressor tRNA), and the desired non-canonical amino acid. Under these conditions, the relative amount of the expressed (i.e., ribosomally produced) fluorescent protein is linked to the relative efficiency of the aminoacyl-tRNA synthetase to charge the cognate suppressor tRNA with the non-canonical amino acid, which can thus be quantified via fluorimetric means. A demonstration of how this procedure can be applied for selecting an aminoacyl-tRNA synthetase / suppressor tRNA pair for incorporating a desired amino acid Z (or Z2) into the precursor polypeptide is provided in Example 3.
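- As a non-limiting illustration of the fluorescence assay described above, the following Python sketch converts raw reporter readings into a relative incorporation efficiency and ranks candidate synthetases. The normalization scheme is an assumption made for the sake of the example (fluorescence divided by culture density, background measured without the non-canonical amino acid subtracted, and the result expressed relative to a reporter lacking the suppressed codon); the class, function, and synthetase names are hypothetical.

```python
# Illustrative sketch only; the normalization scheme is an assumption.
from dataclasses import dataclass


@dataclass
class Reading:
    fluorescence: float  # reporter fluorescence, arbitrary units
    od600: float         # culture optical density at 600 nm

    def normalized(self) -> float:
        """Fluorescence per unit of culture density."""
        if self.od600 <= 0:
            raise ValueError("od600 must be positive")
        return self.fluorescence / self.od600


def suppression_efficiency(plus_ncaa: Reading, minus_ncaa: Reading,
                           reference_reporter: Reading) -> float:
    """Background-corrected signal relative to a reporter without the codon.

    plus_ncaa / minus_ncaa: reporter carrying the codon to be suppressed,
    expressed with and without the non-canonical amino acid.
    reference_reporter: the same reporter without the suppressed codon.
    """
    reference = reference_reporter.normalized()
    if reference <= 0:
        raise ValueError("reference reporter signal must be positive")
    signal = plus_ncaa.normalized() - minus_ncaa.normalized()
    return max(signal, 0.0) / reference


def rank_synthetases(efficiencies: dict[str, float]) -> list[str]:
    """Order candidate synthetases from most to least efficient."""
    return sorted(efficiencies, key=efficiencies.get, reverse=True)


if __name__ == "__main__":
    reference = Reading(9000.0, 1.0)
    scores = {
        "variant_A": suppression_efficiency(Reading(5000.0, 1.0), Reading(500.0, 1.0), reference),
        "variant_B": suppression_efficiency(Reading(2000.0, 1.0), Reading(400.0, 1.0), reference),
    }
    print(rank_synthetases(scores))
```

A large signal in the absence of the non-canonical amino acid would indicate that the synthetase also charges the suppressor tRNA with a natural amino acid, which is why the background reading is subtracted in this sketch.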
- If necessary, the ability of a given aminoacyl-tRNA synthetase / suppressor tRNA pair to incorporate a target non-canonical amino acid into a protein can be improved by means of rational design or directed evolution. While the fluorescence-based method described above can be used to screen several hundreds of engineered aminoacyl-tRNA synthetase variants and/or suppressor tRNA variants for this purpose, higher throughput procedures are also known in the art, which are, for example, based on selection systems (Wang, Xie et al. 2006; Wu and Schultz 2009; Liu and Schultz 2010; Fekner and Chan 2011). One such system involves introducing a library of mutated aminoacyl-tRNA synthetases and/or of mutated suppressor tRNAs into a suitable cell-based expression host (e.g., E. coli or yeast cells), whose survival under a suitable selective medium or growth conditions is dependent upon the functionality of the aminoacyl-tRNA synthetase / suppressor tRNA pair. This can be achieved, for example, by introducing a stop codon or four-base codon that is to be suppressed into a gene encoding for a protein or enzyme essential for survival of the cell, such as a protein or enzyme conferring resistance to an antibiotic. In this case, the ability of the aminoacyl-tRNA synthetase / suppressor tRNA pair to incorporate the desired non-canonical amino acid into the selection marker protein is linked to the survival of the host, thereby enabling the rapid isolation of suitable aminoacyl-tRNA synthetase / suppressor tRNA pair(s) for the incorporation of a particular non-canonical amino acid from very large engineered libraries. The selectivity of these aminoacyl-tRNA synthetase / suppressor tRNA pairs toward the desired non-canonical amino acid over the twenty natural amino acids can be further improved by iterative rounds of positive and negative selection, as described previously (Wang, Xie et al. 2006; Wu and Schultz 2009; Liu and Schultz 2010; Fekner and Chan 2011).
Procedures such as those described above can be thus applied to generate and isolate an engineered aminoacyl-tRNA synthetase / suppressor tRNA pair suitable for incorporation of the amino acid Z as defined above, into the precursor polypeptide.
- Engineered aminoacyl-tRNA synthetase / tRNA pairs for the incorporation of the amino acid Z (or Z2) into the precursor polypeptide can be prepared via mutagenesis of the polynucleotide encoding for the aminoacyl-tRNA synthetase enzymes of SEQ ID NOs: 77, 78, 79, 80, or an engineered variant thereof; and via mutagenesis of the tRNA-encoding polynucleotides of SEQ ID NOs: 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, or an engineered variant thereof. Many mutagenesis methods are known in the art and these include, but are not limited to, site-directed mutagenesis, site-saturation mutagenesis, random mutagenesis, cassette-mutagenesis, DNA shuffling, homologous recombination, non-homologous recombination, site-directed recombination, and the like. Detailed description of art-known mutagenesis methods can be found, among other sources, in
U.S. Pat. No. 5,605,793; U.S. Pat. No. 5,830,721; U.S. Pat. No. 5,834,252; WO 95/22625; WO 96/33207; WO 97/20078; WO 97/35966; WO 98/27230; WO 98/42832; WO 99/29902; WO 98/41653; WO 98/41622; WO 98/42727; WO 00/18906; WO 00/04190; WO 00/42561; WO 00/42560; WO 01/23401; and WO 01/64864.
- As described above, the engineered aminoacyl-tRNA synthetases and cognate suppressor tRNAs obtained from mutagenesis of SEQ ID NOs: 77 to 80, and from mutagenesis of SEQ ID NOs: 101 to 120, can be screened to identify aminoacyl-tRNA synthetase / suppressor tRNA pairs that are able, or have improved ability as compared to the corresponding wild-type enzyme/tRNA molecule, to incorporate the amino acid Z (or Z2) into the precursor polypeptide.
- In some embodiments, the engineered aminoacyl-tRNA synthetase used in the method comprises an amino acid sequence that is at least 80%, 85%, 90%, 95%, 99% or more identical to the sequence of SEQ ID NO: 77, 78, 79, or 80.
- In some embodiments, the engineered suppressor tRNA used in the method is encoded by a polynucleotide comprising a nucleotide sequence that is at least 80%, 85%, 90%, 95%, 99% or more identical to the sequence of SEQ ID NO: 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, or 120.
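- Purely as an illustration of how a percent-identity threshold of this kind might be evaluated, the following Python sketch computes identity between two sequences that have already been aligned to equal length (gaps written as "-"). The present description does not prescribe a particular alignment method or identity formula; the choice made here, namely identical columns divided by all aligned columns including gapped ones, is an assumption, and the function names and toy sequences are hypothetical.

```python
# Illustrative sketch only; the identity formula is an assumption.

def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Columns in which both sequences carry a gap are ignored; a column with a
    gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment contains no comparable columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True when the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy aligned sequences, not taken from the sequence listing.
    print(percent_identity("MDEFKS", "MDAFKS"))
    print(meets_threshold("ACGU-", "ACGUA", threshold=80.0))
```

In practice the alignment itself would be produced by an established alignment tool, and the resulting value depends on the alignment parameters and on the identity definition chosen.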
- The target peptide sequence, (AA)n, in the precursor polypeptide of formula (I), (II) and (V) and the second target peptide sequence, (AA)o, in the precursor polypeptide of formula (V), can be a polypeptide comprising 1 to 1,000 amino acid residues. In some embodiments, (AA)n (and (AA)o) consists of a polypeptide comprising 1 to 50 amino acid residues and, in other embodiments, (AA)n (and (AA)o) consists of a polypeptide comprising 1 to 20 amino acid residues.
- The N-terminal tail, (AA)m, in the precursor polypeptide of formula (I), (II), and (V) can be a polypeptide comprising 1 to 10,000 amino acid residues. In some embodiments, (AA)m consists of a polypeptide comprising 1 to 1,000 amino acid residues and, in other embodiments, (AA)m consists of a polypeptide comprising 1 to 600 amino acid residues.
- The C-terminal tail, (AA)p, in the precursor polypeptide of formula (I), (II), and (V) may not be present, and when present, it can be a polypeptide comprising 1 to 10,000 amino acid residues. When present, (AA)p consists, in some embodiments, of a polypeptide comprising 1 to 1,000 amino acid residues and, in other embodiments, (AA)p consists of a polypeptide comprising 1 to 600 amino acid residues.
- The N-terminal tail, (AA)m, the C-terminal tail, (AA)p, or both, in the precursor polypeptides of formula (I), (II), and (V) can comprise a polypeptide affinity tag, a DNA-binding polypeptide, a protein-binding polypeptide, an enzyme, a fluorescent protein, an intein protein, or a combination of these polypeptides.
- Introduction of a polypeptide affinity tag within the N-terminal tail and/or C-terminal tail of the precursor polypeptide results in macrocyclic peptides fused to such polypeptide affinity tag. Such affinity tags can be useful for isolating, purifying, and/or immobilizing onto a solid support the macrocyclic peptides generated according to the methods disclosed herein. Accordingly, in some embodiments, the N-terminal tail, C-terminal tail, or both, of the precursor polypeptides comprise at least one polypeptide affinity tag selected from the group consisting of a polyarginine tag (e.g., RRRRR) (SEQ ID NO:121), a polyhistidine tag (e.g., HHHHHH) (SEQ ID NO:122), an Avi-Tag (SGLNDIFEAQKIEWHELEL) (SEQ ID NO:123), a FLAG tag (DYKDDDDK) (SEQ ID NO:124), a Strep-tag II (WSHPQFEK) (SEQ ID NO:125), a c-myc tag (EQKLISEEDL) (SEQ ID NO:126), an S tag (KETAAAKFERQHMDS) (SEQ ID NO:127), a calmodulin-binding peptide (KRRWKKNFIAVSAANRFKKISSSGAL) (SEQ ID NO:128), a streptavidin-binding peptide (MDEKTTGWRGGHVVEGLAGELEQLRARLEHHPQGQREP) (SEQ ID NO:129), a chitin-binding domain (SEQ ID NO:130), a glutathione S-transferase (GST; SEQ ID NO:131), a maltose-binding protein (MBP; SEQ ID NO:132), streptavidin (SEQ ID NO:133), and engineered variants thereof. These aspects are illustrated in Example 2.
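- By way of example only, the presence of one of the short affinity tags listed above within an N-terminal or C-terminal tail can be checked with a simple string search, as in the following Python sketch. The tag sequences are those recited above; the dictionary and function names and the example tail sequences are hypothetical, and only exact matches are detected (engineered tag variants would not be found).

```python
# Illustrative sketch only; names and example tails are assumptions.

# Short affinity tags recited above (one-letter amino acid codes).
AFFINITY_TAGS = {
    "polyarginine": "RRRRR",
    "polyhistidine": "HHHHHH",
    "FLAG": "DYKDDDDK",
    "Strep-tag II": "WSHPQFEK",
    "c-myc": "EQKLISEEDL",
    "S tag": "KETAAAKFERQHMDS",
}


def find_affinity_tags(tail: str) -> dict[str, int]:
    """Map each tag found in the tail to the 0-based index of its first match."""
    tail = tail.upper()
    return {name: tail.find(seq) for name, seq in AFFINITY_TAGS.items() if seq in tail}


if __name__ == "__main__":
    n_terminal_tail = "MGSDYKDDDDKGSG"  # hypothetical tail carrying a FLAG tag
    c_terminal_tail = "GSHHHHHH"        # hypothetical tail carrying a polyhistidine tag
    print(find_affinity_tags(n_terminal_tail))
    print(find_affinity_tags(c_terminal_tail))
```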
- The N-terminal tail, (AA)m, the C-terminal tail, (AA)p, or both, in the precursor polypeptides of formula (I), (II), and (V) can comprise a reporter protein or enzyme. This approach will result in the formation of macrocyclic peptides fused to a reporter protein or enzyme, which can be useful to facilitate the functional screening of said macrocyclic peptides. Accordingly, in some embodiments, the N-terminal tail, (AA)m and/or the C-terminal tail, (AA)p, in the precursor polypeptides of formula (I), (II), and (V) comprise at least one polypeptide selected from the group consisting of green fluorescent protein (SEQ ID NO:134), luciferase (SEQ ID NO:135), alkaline phosphatase (SEQ ID NO:136), and engineered variants thereof.
- The N-terminal tail, (AA)m, the C-terminal tail, (AA)p, or both, in the precursor polypeptides of formula (I), (II), or (V) can comprise a protein or enzyme that is part of a display system such as, for example, a phage display (e.g., M13, T7, or lambda phage display), a yeast display, a bacterial display, a DNA display, a plasmid display, a CIS display, a ribosome display, or a mRNA display system. As mentioned above, this approach can be useful for generating large libraries of macrocyclic peptides which are physically linked to, or compartmentalized with the polynucleotide sequence that encodes for the corresponding precursor polypeptides. In turn, this approach can be useful toward isolating functional macrocyclic peptides that are able to bind, inhibit or activate a certain target biomolecule (e.g., protein, enzyme, DNA or RNA molecule) or target biomolecular interaction.
- Accordingly, in some embodiments, the N-terminal tail, (AA)m, comprises a polypeptide selected from the group consisting of M13 phage coat protein pVI (SEQ ID NO: 137), T7 phage protein 10A (SEQ ID NO: 138), T7 phage protein 10B (SEQ ID NO: 139), E. coli NlpA (SEQ ID NO: 140), E. coli OmpC (SEQ ID NO: 141), E. coli FadL (SEQ ID NO: 142), E. coli Lpp-OmpA (SEQ ID NO:143), E. coli PgsA (SEQ ID NO: 144), E. coli EaeA (SEQ ID NO:145), S. cerevisiae Aga2p (SEQ ID NO: 146), S. cerevisiae Flo1p (SEQ ID NO: 147), human NF-κB p50 protein (SEQ ID NO:148), M13 phage coat protein pIII leader sequence (SEQ ID NO: 149), M13 phage coat protein pVIII leader sequence (SEQ ID NO: 150), M13 phage protein pVI (SEQ ID NO:151), Snap-tag (SEQ ID NO:152), Clip-Tag (SEQ ID NO:153), and engineered variants thereof.
- In other embodiments, the C-terminal tail, (AA)p, comprises a polypeptide selected from the group consisting of M13 phage coat protein pIII (SEQ ID NO:154), M13 phage coat protein pVIII (SEQ ID NO:155), RepA protein (SEQ ID NO: 156), S. cerevisiae Aga1p (SEQ ID NO:157), Snap-tag (SEQ ID NO:152), Clip-Tag (SEQ ID NO:153), P2A protein (SEQ ID NO: 158), and engineered variants thereof.
- In other embodiments, the C-terminal tail, (AA)p, comprises a molecule selected from the group consisting of puromycin, puromycin analog, a puromycin-DNA conjugate, and a puromycin-RNA conjugate.
- The N-terminal tail, (AA)m, the C-terminal tail, (AA)p, or both, in the precursor polypeptides of formula (I), (II), or (V) can comprise an intein protein. Inteins are polypeptides that are found as in-frame insertions in various natural proteins and can undergo a self-catalyzed intramolecular rearrangement leading to self-excision (self-splicing) of the intein and ligation of the flanking polypeptides together. The mechanism of intein splicing is well known (Xu and Perler 1996; Paulus 2000) and it involves the formation of a (thio)ester bond at the junction between the intein and the polypeptide fused to the N-terminus of the intein (commonly referred to as "N-extein") by action of a catalytic cysteine or serine residue at the first position of the intein sequence. This reversible N(backbone)→S(side-chain) or N(backbone)→O(side-chain) acyl transfer is followed by a trans(thio)esterification step whereby the N-extein acyl unit is transferred to the side-chain thiol/hydroxyl group of a cysteine, serine, or threonine residue at the first position of the polypeptide fused to the C-terminus of the intein ("C-extein"). The last step of the intein self-splicing process involves cleavage of the peptide bond connecting the intein to the C-extein via an intramolecular transamidation reaction by action of a conserved catalytic asparagine residue at the C-terminal position of the intein sequence (Paulus 2000).
- Knowledge of the splicing mechanism of inteins has enabled the preparation of engineered inteins with altered splicing behavior (Perler 2005; Xu and Evans 2005; Elleuche and Poggeler 2010). For example, it is known that removal of the conserved asparagine residue at the C-terminus of the intein sequence can result in an engineered intein protein capable of only N-terminal splicing (i.e., cleavage of the peptide bond between the N-extein and the intein), which can occur spontaneously (i.e., via hydrolysis of the N-terminal (thio)ester bond) or upon incubation with a thiol reagent (e.g., thiophenol, benzylmercaptan, dithiothreitol, sodium 2-sulfanylethanesulfonate), depending on the nature of the intein and of the C-terminal amino acid(s) in the N-extein sequence. Similarly, removal of the conserved cysteine or serine residue at the N-terminus of the intein sequence can result in an engineered intein protein capable of only C-terminal splicing (i.e., cleavage of the peptide bond between the intein and C-extein), which can occur spontaneously or be promoted via a change in pH or temperature, depending on the nature of the intein and of the N-terminal amino acid(s) in the C-extein sequence. Furthermore, certain intein proteins occur as split inteins, having an N-domain and a C-domain. Upon association of the N-domain with the C-domain, split inteins acquire the ability to self-splice according to a mechanism analogous to that of single-polypeptide intein proteins (Mootz 2009). As for the latter, the N-terminal cysteine or serine residue and C-terminal asparagine residue can be mutated, resulting in altered splicing behavior as described above (Perler 2005; Xu and Evans 2005; Mootz 2009; Elleuche and Poggeler 2010).
- According to the methods described herein, introduction of a natural or engineered intein protein within the N-terminal tail, (AA)m, or C-terminal tail, (AA)p, of the precursor polypeptide of formula (I), (II), or (V) results in the formation of a macrocyclic peptide that is fused to either the C-terminus or the N-terminus, respectively, of such natural or engineered intein. This aspect enables one to control and modulate the release of the macrocyclic peptide from the intein-fused polypeptide based on the self-splicing and altered splicing behavior of natural and engineered intein proteins as summarized above. This aspect can be useful to facilitate the isolation and characterization of the macrocyclic peptide from a complex mixture such as, for example, the lysate of a cell expressing the precursor polypeptide or a cell-free translation system. This aspect can also be useful to facilitate the accumulation and, if desired, control the formation of a target macrocyclic peptide, prepared according to the methods described herein, inside a cell-based expression host. In turn, this capability can facilitate the functional screening of in vivo (i.e., in-cell) produced macrocyclic peptide libraries, prepared according to the methods disclosed herein, using an intracellular reporter system or a selection system as described above. These aspects are illustrated by Examples 4-8.
- Nucleotide sequences encoding for intein proteins that can be used can be derived from naturally occurring inteins and engineered variants thereof. A rather comprehensive list of such inteins is provided by the Intein Registry (http://www.neb.com/neb/inteins.html). Inteins that can be used include, but are not limited to, any of the naturally occurring inteins from organisms belonging to the Eucarya, Eubacteria, and Archaea. Among these, for example, inteins of the GyrA group (e.g., Mxe GyrA, Mfl GyrA, Mgo GyrA, Mkas GyrA, Mle-TN GyrA, Mma GyrA), DnaB group (e.g., Ssp DnaB, Mtu-CDC1551 DnaB, Mtu-H37Rv DnaB, Rma DnaB), RecA group (e.g., Mtu-H37Rv RecA, Mtu-So93 RecA), RIR1 group (e.g., Mth RIR1, Chy RIR1, Pfu RIR1-2, Ter RIR1-2, Pab RIR1-3), and Vma group (e.g., Sce Vma, Ctr Vma) can be used, as can intein Mxe GyrA (SEQ ID NO:1) and the engineered 'mini' Ssp DnaB ('eDnaB', SEQ ID NO:2).
- Intein proteins suitable for the methods described herein include, but are not limited to, engineered variants of natural inteins (or genetic fusions of split inteins), which have been modified by mutagenesis in order, for example, to prevent or minimize splicing at the N-terminal or C-terminal end of the intein. Examples of these modifications include, but are not limited to, mutation of the conserved cysteine or serine residue at the N-terminus of the intein (e.g., via substitution to an alanine) with the purpose, for example, of preventing cleavage at the N-terminus of the intein. Further examples of these modifications include, but are not limited to, mutation of the conserved asparagine residue at the C-terminus of the intein (e.g., via substitution to an alanine) with the purpose, for example, of preventing cleavage at the C-terminus of the intein. Examples of these modifications are provided in Example 2. Intein variants useful for the methods disclosed herein also include, but are not limited to, engineered inteins whose internal endonuclease domain, which is not essential for the splicing mechanism, is removed. For example, a variant of Ssp DnaB ('eDnaB', SEQ ID NO:2) lacking the internal endonuclease domain is used for the preparation of the precursor polypeptides. Inteins to be comprised in the precursor polypeptide can also be engineered with the purpose, for example, of altering the splicing properties of the intein in order to increase or reduce the splicing efficiency or in order to make the intein-catalyzed splicing process dependent upon variation of certain parameters such as pH or temperature.
- Accordingly, in some embodiments, the N-terminal tail, (AA)m, the C-terminal tail, (AA)p, or both, in the precursor polypeptides of formula (I), (II), and (V) comprise an intein protein, or an engineered variant thereof. In some embodiments, the N-terminal tail, (AA)m, the C-terminal tail, (AA)p, or both, in the precursor polypeptides of formula (I), (II), and (V) comprise an intein protein selected from the group consisting of Mxe GyrA (SEQ ID NO:1), eDnaB (SEQ ID NO:2), Hsp-NRC1 CDC21 (SEQ ID NO:3), Ceu ClpP (SEQ ID NO:4), Tag Pol-1 (SEQ ID NO:5), Tfu Pol-1 (SEQ ID NO:6), Tko Pol-1 (SEQ ID NO:7), Psp-GBD Pol (SEQ ID NO:8), Tag Pol-2 (SEQ ID NO:9), Thy Pol-1 (SEQ ID NO:10), Tko Pol-2 (SEQ ID NO:11), Tli Pol-1 (SEQ ID NO:12), Tma Pol (SEQ ID NO:13), Tsp-GE8 Pol-1 (SEQ ID NO:14), Tthi Pol (SEQ ID NO:15), Tag Pol-3 (SEQ ID NO:16), Tfu Pol-2 (SEQ ID NO:17), Thy Pol-2 (SEQ ID NO:18), Tli Pol-2 (SEQ ID NO:19), Tsp-GE8 Pol-2 (SEQ ID NO:20), Pab Pol-II (SEQ ID NO:21), Mtu-CDC1551 DnaB (SEQ ID NO:22), Mtu-H37Rv DnaB (SEQ ID NO:23), Rma DnaB (SEQ ID NO:24), Ter DnaE-1 (SEQ ID NO:25), Ssp GyrB (SEQ ID NO:26), Mfl GyrA (SEQ ID NO:27), Mgo GyrA (SEQ ID NO:28), Mkas GyrA (SEQ ID NO:29), Mle-TN GyrA (SEQ ID NO:30), Mma GyrA (SEQ ID NO:31), Ssp DnaX (SEQ ID NO:32), Pab Lon (SEQ ID NO:33), Mja PEP (SEQ ID NO:34), Afu-FRR0163 PRP8 (SEQ ID NO:35), Ani-FGSCA4 PRP8 (SEQ ID NO:36), Cne-A PRP8 (SEQ ID NO:37), Hca PRP8 (SEQ ID NO:38), Pch PRP8 (SEQ ID NO:39), Pex PRP8 (SEQ ID NO:40), Pvu PRP8 (SEQ ID NO:41), Mtu-H37Rv RecA (SEQ ID NO:42), Mtu-So93 RecA (SEQ ID NO:43), Mfl RecA (SEQ ID NO:44), Mle-TN RecA (SEQ ID NO:45), Nsp-PCC7120 RIR1 (SEQ ID NO:46), Ter RIR1-1 (SEQ ID NO:47), Pab RIR1-1 (SEQ ID NO:48), Pfu RIR1-1 (SEQ ID NO:49), Chy RIR1 (SEQ ID NO:50), Mth RIR1 (SEQ ID NO:51), Pab RIR1-3 (SEQ ID NO:52), Pfu RIR1-2 (SEQ ID NO:53), Ter RIR1-2 (SEQ ID NO:54), Ter RIR1-4 (SEQ ID NO:55), CIV RIR1 (SEQ ID NO:56), Ctr VMA (SEQ ID NO:57), Sce VMA (SEQ ID NO:58), Tac-ATCC25905 VMA (SEQ ID 
NO:59), Ssp DnaB (SEQ ID NO:60), engineered variants thereof, and engineered variants thereof wherein the N-terminal cysteine or serine residue of the engineered variant is mutated to any of the natural amino acid residues other than cysteine or serine, or wherein the C-terminal asparagine residue of the engineered variant is mutated to any of the natural amino acid residues other than asparagine.
- In some embodiments, the N-terminal tail, (AA)m, the C-terminal tail, (AA)p, or both, in the precursor polypeptides of formula (I), (II), and (V) comprise the N-domain, C-domain, or both the N-domain and C-domain of a split intein, or an engineered variant thereof. In some embodiments, the N-terminal tail, (AA)m, the C-terminal tail, (AA)p, or both, in the precursor polypeptides of formula (I), (II), and (V) comprise the N-domain, C-domain, or both the N-domain and C-domain of a split intein selected from the group consisting of Ssp DnaE (SEQ ID NO:61- SEQ ID NO:62), Neq Pol (SEQ ID NO:63- SEQ ID NO:64), Asp DnaE (SEQ ID NO:65- SEQ ID NO:66), Npu-PCC73102 DnaE (SEQ ID NO:67- SEQ ID NO:68), Nsp-PCC7120 DnaE (SEQ ID NO:69- SEQ ID NO:70), Oli DnaE (SEQ ID NO:71-SEQ ID NO:72), Ssp-PCC7002 DnaE (SEQ ID NO:73- SEQ ID NO:74), Tvu DnaE (SEQ ID NO:75- SEQ ID NO:76), engineered variants thereof, and engineered variants wherein the N-terminal cysteine or serine residue of the split intein N-domain of the engineered variant is mutated to any of the natural amino acid residues other than cysteine or serine, or wherein the C-terminal asparagine residue of the split intein C-domain of the engineered variant is mutated to any of the natural amino acid residues other than asparagine.
- In some embodiments, the N-terminal tail, (AA)m, in the precursor polypeptides of formula (I), (II), and (V) comprises the C-domain of a split intein and the C-terminal tail, (AA)p, of said precursor polypeptides comprises the corresponding N-domain of the split intein. In some embodiments, the N-terminal tail, (AA)m, in the precursor polypeptides of formula (I), (II), and (V) comprises the C-domain of a split intein selected from the group consisting of Ssp DnaE-c (SEQ ID NO:62), Neq Pol-c (SEQ ID NO:64), Asp DnaE-c (SEQ ID NO:66), Npu-PCC73102 DnaE-c (SEQ ID NO:68), Nsp-PCC7120 DnaE-c (SEQ ID NO:70), Oli DnaE-c (SEQ ID NO:72), Ssp-PCC7002 DnaE-c (SEQ ID NO:74), Tvu DnaE-c (SEQ ID NO:76), and engineered variants thereof; and the C-terminal tail, (AA)p, comprises the corresponding N-domain of the split intein selected from the group consisting of Ssp DnaE-n (SEQ ID NO:61), Neq Pol-n (SEQ ID NO:63), Asp DnaE-n (SEQ ID NO:65), Npu-PCC73102 DnaE-n (SEQ ID NO:67), Nsp-PCC7120 DnaE-n (SEQ ID NO:69), Oli DnaE-n (SEQ ID NO:71), Ssp-PCC7002 DnaE-n (SEQ ID NO:73), Tvu DnaE-n (SEQ ID NO:75), and engineered variants thereof.
- In another aspect, polynucleotide molecules are provided encoding for precursor polypeptides of formula (I), (II), and (V) as defined above, the latter, (V), not being according to the invention but present for illustration purposes only. Polynucleotide molecules are provided for encoding for the aminoacyl-tRNA synthetases and cognate tRNA molecules for the ribosomal incorporation of the amino acid Z into the precursor polypeptides of formula (I) and (II) and for the ribosomal incorporation of the amino acid Z2 into the precursor polypeptides of formula (V). Polynucleotide molecules are provided encoding for polypeptide sequences that can be introduced within the N-terminal tail ((AA)m) or C-terminal tail ((AA)p) of the precursor polypeptides of formula (I), (II) and (V), such as peptide and protein affinity tags, reporter proteins and enzymes, carrier proteins of a display system, and intein proteins, as described above. Since the correspondence of all the possible three-base codons to the various amino acids is known, providing the amino acid sequence of the polypeptide provides also a description of all the polynucleotide molecules encoding for such polypeptide. Thus, a person skilled in the art will be able, given a certain polypeptide sequence, to generate any number of different polynucleotides encoding for the same polypeptide. In some embodiments, the codons are selected to fit the host cell in which the polypeptide is being expressed. For example, codons used in bacteria can be used to express the polypeptide in a bacterial host. The polynucleotides may be linked to one or more regulatory sequences controlling the expression of the polypeptide-encoding gene to form a recombinant polynucleotide capable of expressing the polypeptide.
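The relationship between a polypeptide sequence and its encoding polynucleotides can be sketched in a few lines of Python. The sketch below is illustrative only and rests on assumptions not stated in the disclosure: the codon preference table is a hypothetical single-codon-per-residue choice loosely modeled on codons common in E. coli, the letter Z is used as a placeholder for the unnatural amino acid and is encoded by the amber stop codon TAG, and the example peptide is the MG-(Z)-HPQFCGD segment used in Example 2.

```python
# Minimal sketch: back-translate a peptide into one possible DNA sequence and
# count how many synonymous DNA sequences exist under the standard genetic code.
# ASSUMPTION: PREFERRED_CODON is a hypothetical host preference table, not one
# taken from the disclosure. "Z" marks the unnatural amino acid (amber codon).

PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}

# Number of codons per amino acid in the standard genetic code.
DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}

AMBER = "TAG"  # stop codon reassigned to Z by the suppressor tRNA/synthetase pair


def back_translate(peptide: str, z_symbol: str = "Z") -> str:
    """Return one DNA sequence encoding the peptide, with Z written as TAG."""
    codons = []
    for residue in peptide:
        codons.append(AMBER if residue == z_symbol else PREFERRED_CODON[residue])
    return "".join(codons)


def count_encodings(peptide: str, z_symbol: str = "Z") -> int:
    """Count the distinct DNA sequences encoding the peptide (Z fixed to TAG)."""
    total = 1
    for residue in peptide:
        total *= 1 if residue == z_symbol else DEGENERACY[residue]
    return total


if __name__ == "__main__":
    peptide = "MGZHPQFCGD"
    print(back_translate(peptide))
    print(count_encodings(peptide))
```

Even for this ten-residue segment the number of synonymous encodings runs into the thousands, which is why codon choice can be adapted to the expression host without changing the encoded polypeptide.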
- Numerous methods for making nucleic acids encoding for polypeptides having a predetermined or randomized sequence are known to those skilled in the art. For example, oligonucleotide primers having a predetermined or randomized sequence can be prepared chemically by solid phase synthesis using commercially available equipment and reagents. Polynucleotide molecules can then be synthesized and amplified using a polymerase chain reaction, digested via endonucleases, ligated together, and cloned into a vector according to standard molecular biology protocols known in the art (e.g., Sambrook et al. Molecular Cloning: A Laboratory Manual (Third Edition), Cold Spring Harbor Press, 2001). These methods, in combination with the mutagenesis methods mentioned above, can be used to generate polynucleotide molecules that encode for the aforementioned polypeptides as well as suitable vectors for the expression of these polypeptides in a host expression system.
- The precursor polypeptides can be produced by introducing said polynucleotides into an expression vector, by introducing the resulting vectors into an expression host, and by inducing the expression of the encoded precursor polypeptides in the presence of the amino acid Z (or Z2) and, whenever necessary, also in the presence of a suitable stop codon or frameshift codon suppression system for mediating the incorporation of the amino acid Z (or Z2) into the precursor polypeptides.
- Nucleic acid molecules can be incorporated into any one of a variety of expression vectors suitable for expressing a polypeptide. Suitable vectors include, but are not limited to, chromosomal, nonchromosomal, artificial and synthetic DNA sequences, e.g., derivatives of SV40; bacterial plasmids; phage DNA; baculovirus; yeast plasmids; vectors derived from combinations of plasmids and phage DNA, viral DNA such as vaccinia, adenovirus, fowl pox virus, pseudorabies, adenovirus, adeno-associated viruses, retroviruses and many others. Any vector that transduces genetic material into a cell, and, if replication is desired, which is replicable and viable in the relevant host can be used. A large number of expression vectors and expression hosts are known in the art, and many of these are commercially available. A person skilled in the art will be able to select suitable expression vectors for a particular application, e.g., the type of expression host (e.g., in vitro systems, prokaryotic cells such as bacterial cells, and eukaryotic cells such as yeast, insect, or mammalian cells) and the expression conditions selected.
- Expression hosts that may be used for the preparation of the precursor polypeptides and macrocyclic peptides include, but are not limited to, any systems that support the transcription, translation, and/or replication of a nucleic acid. In some embodiments, the expression host system is a cell. Host cells for use in expressing the polypeptides encoded by the expression vector of this disclosure are well known in the art and include, but are not limited to, bacterial cells (e.g., Escherichia coli, Streptomyces); fungal cells such as yeast cells (e.g., Saccharomyces cerevisiae, Pichia pastoris); insect cells; plant cells; and animal cells, such as mammalian cells and human cells. These systems also include, but are not limited to, lysates of prokaryotic cells (e.g., bacterial cells) and lysates of eukaryotic cells (e.g., yeast, insect, or mammalian cells). These systems also include, but are not limited to, in vitro transcription/translation systems, many of which are commercially available. The choice of the expression vector and host system depends on the type of application intended for the methods disclosed herein and a person skilled in the art will be able to select a suitable expression host based on known features and application of the different expression hosts. As an example, when it is desired to evaluate the interaction of the macrocyclic peptide(s) generated via the methods disclosed herein with a bacterial, yeast, or a human cell component, a bacterial, yeast, or a human expression host, respectively, can be used.
- In some embodiments, the formation of the macrocyclic peptides from the biosynthetic polypeptides as defined above is carried out within the cell-based expression host that produces the precursor polypeptides, so that the macrocyclic peptides are produced within this cell-based expression host. This method comprises providing a nucleic acid encoding for the precursor polypeptide, introducing the nucleic acid into the cell-based expression host, inducing the expression of the precursor polypeptide, allowing for the precursor polypeptide to undergo intramolecular cyclization via a bond-forming reaction between the side-chain sulfhydryl group of the cysteine and the FG1 group of the amino acid Z (or between the cysteines and the FG1 and FG2 groups of the amino acid Z2), thereby producing the macrocyclic peptide inside the cell-based expression host. These aspects are illustrated in Examples 4 through 8.
- In some embodiments, the formation of the macrocyclic peptides from the biosynthetic polypeptides as defined above is carried out on the surface of a cell or on a viral particle, so that the macrocyclic peptides are produced as tethered to a cell or a viral particle, respectively. This method comprises providing a nucleic acid encoding for the precursor polypeptide, wherein the N- or C-terminal tail comprises a polypeptide component of the cell membrane (e.g., S. cerevisiae membrane protein Aga2p) or of the viral particle (e.g., M13 phage pIII protein), introducing the nucleic acid into the expression host, inducing the expression of the precursor polypeptide, allowing for the precursor polypeptide to be integrated into the cell membrane or viral particle, and allowing for the precursor polypeptide to undergo intramolecular cyclization via a bond-forming reaction between the side-chain sulfhydryl group of the cysteine and the FG1 group of the amino acid Z (or between the cysteines and the FG1 and FG2 groups of the amino acid Z2), thereby producing the macrocyclic peptide as tethered to the membrane of the cell or to the viral particle.
- In some embodiments, the formation of the macrocyclic peptides from the biosynthetic polypeptides as defined above is carried out within a cell-free expression system, so that the macrocyclic peptides are produced within this cell-free expression system. This method comprises providing a nucleic acid encoding for the precursor polypeptide, introducing the nucleic acid into the cell-free expression host, inducing the expression of the precursor polypeptide, allowing for the precursor polypeptide to undergo intramolecular cyclization via a bond-forming reaction between the side-chain sulfhydryl group of the cysteine and the FG1 group of the amino acid Z (or between the cysteines and the FG1 and FG2 groups of the amino acid Z2), thereby producing the macrocyclic peptide within the cell-free expression host.
- A method is also provided for making a library of macrocyclic peptides via cyclization of a plurality of precursor polypeptides of formula (I) or (II) that contain a heterogeneous peptide target sequence (AA)n, or a heterogeneous N-terminal tail (AA)m, or a heterogeneous C-terminal tail (AA)p, or a combination of these. This method comprises: (a) constructing a plurality of nucleic acid molecules encoding for a plurality of precursor polypeptides, said precursor polypeptides having a heterogeneous peptide target sequence (AA)n, or a heterogeneous N-terminal tail (AA)m, or a heterogeneous C-terminal tail (AA)p, or a combination of these; (b) introducing each of the plurality of said nucleic acid molecules into an expression vector, and introducing the resulting vectors into an expression host; (c) expressing the plurality of precursor polypeptides; (d) allowing for the precursor polypeptides to undergo intramolecular cyclization via a bond-forming reaction between the side-chain sulfhydryl group of the cysteine and the FG1 group of the amino acid Z, thereby producing a plurality of macrocyclic peptides.
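Step (a) of the library method can be illustrated with a short Python sketch that enumerates the nucleic acid molecules encoding a small randomized target sequence. Everything specific in it is an assumption made for illustration: the disclosure does not prescribe a randomization scheme, so the widely used NNK degenerate codon is swapped in here, and the flanking sequences (Met-Gly-amber upstream, a single cysteine codon downstream) are a hypothetical formula (I)-type layout.

```python
from itertools import product

# ASSUMPTION: NNK randomization (N = A/C/G/T, K = G/T) is an illustrative
# choice; the source does not name a specific degenerate codon scheme.
NNK_CODONS = [a + b + c for a in "ACGT" for b in "ACGT" for c in "GT"]


def library_size(n_random: int) -> int:
    """Theoretical number of DNA variants for n_random NNK positions."""
    return len(NNK_CODONS) ** n_random


def enumerate_library(prefix: str, n_random: int, suffix: str):
    """Yield every DNA variant: fixed prefix + randomized codons + fixed suffix."""
    for combo in product(NNK_CODONS, repeat=n_random):
        yield prefix + "".join(combo) + suffix


if __name__ == "__main__":
    # Hypothetical layout: Met-Gly-(amber codon for Z)-[X][X]-Cys
    prefix = "ATGGGCTAG"
    suffix = "TGC"
    print(library_size(2))
    for sequence in list(enumerate_library(prefix, 2, suffix))[:3]:
        print(sequence)
    # Caveat: the NNK set itself contains TAG, so a fraction of library members
    # would carry an additional amber codon inside the randomized region.
```

Exhaustive enumeration is only practical for a few randomized positions; for longer target sequences the size function alone gives the theoretical diversity to compare against the number of transformants obtained.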
- Not according to the invention and for illustration purposes only, a method is presented for making a library of macrocyclic peptides via cyclization of a plurality of precursor polypeptides of formula (V) that contain a heterogeneous peptide target sequence (AA)n, or a heterogeneous second peptide target sequence (AA)o, or a heterogeneous N-terminal tail (AA)m, or a heterogeneous C-terminal tail (AA)p, or a combination of these. This method comprises: (a) constructing a plurality of nucleic acid molecules encoding for a plurality of precursor polypeptides, said precursor polypeptides having a heterogeneous peptide target sequence (AA)n, or a heterogeneous second peptide target sequence (AA)o, or a heterogeneous N-terminal tail (AA)m, or a heterogeneous C-terminal tail (AA)p, or a combination of these; (b) introducing each of the plurality of said nucleic acid molecules into an expression vector, and introducing the resulting vectors into an expression host; (c) expressing the plurality of precursor polypeptides; (d) allowing for the precursor polypeptides to undergo intramolecular cyclization via a bond-forming reaction between the side-chain sulfhydryl groups of the cysteines and the FG1 and FG2 groups of the amino acid Z2, thereby producing a plurality of macrocyclic peptides.
- In specific embodiments, each of the plurality of macrocyclic peptides prepared as described above is tethered to a cell component, to a cell membrane component, to a bacteriophage, to a viral particle, or to a DNA molecule, via a polypeptide comprised within the N-terminal tail or within the C-terminal tail of said macrocyclic peptide molecule.
- Several methods of making polynucleotides encoding for heterogeneous peptide sequences are known in the art. These include, among many others, methods for site-directed mutagenesis (Botstein, D.; Shortle, D. Science (New York, N.Y.), 1985, 229, 1193; Smith, M. Annual review of genetics, 1985, 19, 423; Dale, S. J.; Felix, I. R. Methods in molecular biology (Clifton, N.J.), 1996, 57, 55; Ling, M. M.; Robinson, B. H. Analytical biochemistry, 1997, 254, 157), oligonucleotide-directed mutagenesis (Zoller, M. J. Current opinion in biotechnology, 1992, 3, 348; Zoller, M. J.; Smith, M. Methods Enzymol, 1983, 100, 468; Zoller, M. J.; Smith, M. Methods Enzymol, 1987, 154, 329), mutagenesis by total gene synthesis and cassette mutagenesis (Nambiar, K. P.; Stackhouse, J.; Stauffer, D. M.; Kennedy, W. P.; Eldredge, J. K.; Benner, S. A. Science (New York, N.Y.), 1984, 223, 1299; Grundstrom, T.; Zenke, W. M.; Wintzerith, M.; Matthes, H. W.; Staub, A.; Chambon, P. Nucleic acids research, 1985, 13, 3305; Wells, J. A.; Vasser, M.; Powers, D. B. Gene, 1985, 34, 315), and the like. Additional methods are described in the following U.S. patents, PCT publications, and EPO publications:
U.S. Pat. No. 5,605,793 "Methods for In vitro Recombination"; U.S. Pat. No. 5,830,721 "DNA Mutagenesis by Random Fragmentation and Reassembly"; WO 95/22625; WO 96/33207; EP 752008; WO 98/27230; WO 00/00632; WO 98/42832; WO 99/29902.
- The compounds provided herein may contain one or more chiral centers. Accordingly, the compounds are intended to include, but not be limited to, racemic mixtures, diastereomers, enantiomers, and mixtures enriched in at least one stereoisomer or a plurality of stereoisomers. When a group of substituents is disclosed herein, all the individual members of that group and all subgroups, including any isomers, enantiomers, and diastereomers are intended to be included in the disclosure. Additionally, all isotopic forms of the compounds disclosed herein are intended to be included in the disclosure. For example, it is understood that any one or more hydrogens in a molecule disclosed herein can be replaced with deuterium or tritium.
- The terms and expressions that are employed herein are used as terms of description and not of limitation, and there is no intention in the use of such terms and expressions of excluding any equivalents of the features shown and described and portions thereof, but it is recognized that various modifications are possible within the scope of the subject matter claimed herein. Thus, it should be understood that although various embodiments and optional features have been disclosed herein, modification and variation of the concepts herein disclosed may be resorted to by those skilled in the art, and that such modifications and variations are considered to be encompassed by the appended claims.
- Unless otherwise indicated, the disclosure is not limited to specific molecular structures, substituents, synthetic methods, reaction conditions, or the like, as such may vary. It is to be understood that the embodiments are not limited to particular compositions or biological systems, which can, of course, vary.
- A skilled artisan will appreciate that starting materials, biological materials, reagents, synthetic methods, purification methods, analytical methods, assay methods, and biological methods other than those specifically exemplified can be employed in the practice of the methods and compositions disclosed herein. All art-known functional equivalents of any such materials and methods are intended to be included in the methods and compositions disclosed herein.
- The following examples are offered by way of illustration and not by way of limitation.
- This example demonstrates the preparation of various cysteine-reactive unnatural amino acids, i.e., various Z and Z2 amino acids, which can be used for preparation of macrocyclic peptide molecules according to the general methods illustrated in FIGS. 1A-B, 2A-B, 3A-B, 4A-B, and 37A-B.
- The unnatural amino acid 4-(2-bromoethoxy)-phenylalanine (1, p-2beF) was prepared according to the synthetic route provided in
Scheme 1 of FIG. 5. The unnatural amino acid Nε-((2-bromoethoxy)carbonyl)-lysine (2, 2-becK) was prepared according to the synthetic route provided in Scheme 2 of FIG. 5. The unnatural amino acid 4-(1-bromoethyl)-phenylalanine (3, p-1beF) was prepared according to the synthetic route provided in Scheme 3 of FIG. 5. The unnatural amino acid Nε-((2-chloroethoxy)carbonyl)-lysine (4, 2-cecK) was prepared according to the synthetic route provided in Scheme 4 of FIG. 6. The unnatural amino acid Nε-(buta-2,3-dienoyl)-lysine (5, bdnK) was prepared according to the synthetic route provided in Scheme 5 of FIG. 6. The bifunctional unnatural amino acid O-(2,3-dibromopropyl)-tyrosine (6, OdbpY) was prepared according to the synthetic route provided in Scheme 6 of FIG. 6. A person skilled in the art would readily recognize that many other cysteine-reactive amino acids of general formula (III), (IV), (VI), or (VII) can be prepared in an analogous manner either through modification of naturally occurring amino acids (e.g., p-2beF, 2-becK, 2-cecK, bdnK, OdbpY) or via synthesis ex novo (e.g., p-1beF).
- Synthesis of 4-(2-bromoethoxy)-phenylalanine (p-2beF) (1). To a reaction flask containing N-tert-butoxycarbonyl-tyrosine (2 g, 7.1 mmol) and potassium carbonate (2.94 g, 21.3 mmol) in dry DMF (20 mL), dibromoethane (1.83 mL, 21.3 mmol) was added dropwise over 20 min. The reaction mixture was stirred at room temperature for 18 h, after which the reaction mixture was filtered, diluted with 60 mL of water, acidified with acetic acid to
pH 4 and extracted with 2 x 100 mL of EtOAc. Organic layers were combined and dried over sodium sulfate. The solvent was removed under reduced pressure, yielding a yellow oil as crude product, which was purified by flash column chromatography using 10:9:1 hexane:EtOAc:HOAc as solvent system. Fractions of interest were combined and solvent removed under reduced pressure, yielding N-Boc-4-(2-bromoethoxy)-phenylalanine as an off-white powder (2.3 g, 84%). 1H NMR (400 MHz, CD3OD) δ 1.39 (s, 9H), 2.8-3.05 (m, 2H), 3.3 (t, 2H), 3.51 (t, 2H), 4.37 (t, 2H), 6.69 (d, 2H), 7.02 (d, 2H); 13C NMR (125 MHz, CD3OD) δ 28.73, 29.49, 37.92, 56.82, 65.77, 80.69, 116.27, 128.84, 131.32, 157.39, 157.77, 173.414. MS (ESI) calculated for C16H22BrNO5 [M]+: m/z 387.07, found 387.17. Purified N-Boc-4-(2-bromoethoxy)-phenylalanine was treated with 20 mL of 30% TFA/DCM to remove the N-terminal protection. Upon completed reaction (determined by TLC), the solvent was removed under reduced pressure and the crude residue was dissolved 2 x in 10 mL of HOAc followed by solvent evaporation, yielding the final product 1 as an off-white solid in quantitative yield (1.7 g). 1H NMR (400 MHz, CD3OD) δ 3.05-3.25 (m, 2H), 3.58 (t, 2H), 4.28 (t, 1H), 4.51 (t, 2H), 6.77 (d, 2H), 7.09 (d, 2H); 13C NMR (125 MHz, CD3OD) δ 29.1, 36.9, 55.35, 66.92, 116.92, 125.54, 131.59, 158.41, 169.93. MS (ESI) calculated for C11H14BrNO3 [M+H]+: m/z 288.02, found 288.51.
- Synthesis of Nε-((2-bromoethoxy)carbonyl)-lysine (2-becK) (2). To a solution of Nα-tert-butoxycarbonyl-lysine (1 g, 4.06 mmol) and NaOH (162.4 mg, 4.06 mmol, 1 eq) dissolved in 20 mL of water, 2-bromoethylchloroformate (0.435 mL, 4.06 mmol, 1 eq) and, separately, an additional equivalent of NaOH were added simultaneously dropwise over 30 min. The reaction mixture was stirred at room temperature for 18 h. Upon acidification with HOAc, the aqueous phase was extracted with EtOAc (3 x 80 mL).
The combined organic phases were dried over sodium sulfate and solvent was removed under reduced pressure, yielding a yellow oil as crude product, which was purified by flash column chromatography using 10:9:1 hexane:EtOAc:HOAc as solvent system. Fractions of interest were combined and solvent removed under reduced pressure, yielding N-Boc-Nε-((2-bromoethoxy)carbonyl)-lysine as an off-white powder (1.1 g, 68%). 1H NMR (400 MHz, CD3OD) δ 1.43 (s, 9H), 1.5 (m, 2H), 1.65 (m, 2H), 1.79 (m, 2H), 3.09 (t, 2H), 3.54 (t, 2H), 4.05 (t, 1H), 4.29 (t, 2H); 13C NMR (125 MHz, CD3OD) δ 24.09, 28.78, 30.39, 30.47, 32.434, 41.44, 54.82, 65.51, 80.51, 158.15, 158.44, 176.24; MS (ESI) calculated for C14H25BrN2O6 [M+H]+: m/z 397.1, found 397.47. Purified N-Boc-Nε-((2-bromoethoxy)carbonyl)-lysine was treated with 20 mL of 30% TFA/DCM to remove the N-terminal protection. Upon completed reaction (determined by TLC), the solvent was removed under reduced pressure and the crude residue was dissolved 2 x in 10 mL of acetic acid followed by solvent evaporation, yielding the final product 2 as an off-white solid in quantitative yield (0.82 g). 1H NMR (400 MHz, CD3OD) δ 1.45 (m, 2H), 1.64 (m, 2H), 1.76 (m, 2H), 2.95 (t, 2H), 3.6 (t, 2H), 3.85 (t, 1H), 4.22 (t, 2H); 13C NMR (100 MHz, CD3OD) δ 20.74, 23.16, 30.36, 31.16, 41.21, 53.86, 65.54, 158.52, 175.21; MS (ESI) calculated for C9H17BrN2O4 [M+H]+: m/z 297.04, found 297.7.
- Synthesis of 4-(1-bromoethyl)-phenylalanine (p-1beF) (3). A solution of 4-acetylphenylalanine (0.5 g, 2.415 mmol), prepared as reported previously (Frost, Vitali et al. 2013), in methanol was placed in an ice bath, followed by addition of triethylamine (0.51 mL, 3.63 mmol, 1.5 eq) and dropwise addition of di-tert-butyl dicarbonate (0.665 mL, 2.9 mmol, 1.2 eq) over 30 min. The reaction was left at room temperature for an additional 3 h, after which the solvent was removed in vacuo. The residue was redissolved in EtOAc and extracted with acidified water (pH 4). The organic phase was dried over sodium sulfate, solvent was removed under reduced pressure, and the crude yellow oil was purified using flash column chromatography with 10:9:1 hexane:EtOAc:HOAc as solvent system. Fractions of interest were combined, yielding N-Boc-4-acetylphenylalanine as a yellow powder (0.665 g, 90%), which was dissolved in MeOH, placed in an ice bath and treated with NaBH4 (0.164 g, 4.34 mmol, 2 eq) for 3 h. Following aqueous workup, the crude product was dissolved in DCM, placed in an ice bath and PBr3 (1 M solution in DCM) was added in portions (5.2 mL, 5.2 mmol, 2.4 eq) over 2 h. The reaction was warmed to reach room temperature and left stirring overnight. After workup, the aqueous layer was lyophilized and used as crude product 3 (0.382 g, 65%). 1H NMR (400 MHz, CD3OD) δ 1.99 (d, 3H), 2.8-3.2 (m, 2H), 4.31 (t, 1H), 4.78 (q, 1H), 7.18 (d, 2H), 7.27 (d, 2H); MS (ESI) calculated for C11H14BrNO2 [M+H]+: m/z 272.03, found 272.53.
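The calculated [M+H]+ values quoted above for products 1 (C11H14BrNO3) and 3 (C11H14BrNO2) can be reproduced from the molecular formulas with a short Python sketch. It is a minimal illustration under stated assumptions: the monoisotopic masses are standard reference values for the lightest common isotope of each element (79Br for bromine), and the helper names are hypothetical. Only C, H, N, O, and Br are supported.

```python
import re

# Monoisotopic masses in Da (12C, 1H, 14N, 16O, 79Br).
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "Br": 78.9183376,
}
PROTON = 1.00727647  # mass of H+ added on protonation


def parse_formula(formula: str) -> dict:
    """Split a simple formula such as 'C11H14BrNO3' into element counts."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def mz_protonated(formula: str) -> float:
    """Monoisotopic m/z of the singly protonated species [M+H]+."""
    counts = parse_formula(formula)
    neutral = sum(MONOISOTOPIC[element] * n for element, n in counts.items())
    return neutral + PROTON


if __name__ == "__main__":
    for name, formula in [("p-2beF (1)", "C11H14BrNO3"), ("p-1beF (3)", "C11H14BrNO2")]:
        print(name, formula, round(mz_protonated(formula), 2))
```

The same function can be used as a quick consistency check between any molecular formula and its reported calculated mass before comparing with the observed value.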
- Synthesis of Nε-((2-chloroethoxy)carbonyl)-lysine (2-cecK) (4). To a solution of Nα-tert-butoxycarbonyl-lysine (1 g, 4.06 mmol) and NaOH (162.4 mg, 4.06 mmol, 1 eq) dissolved in 20 mL of water, 2-chloroethylchloroformate (0.419 mL, 4.06 mmol, 1 eq) and, separately, an additional equivalent of NaOH were added simultaneously dropwise over 30 min. The reaction mixture was stirred at room temperature for 10-12 h. Upon acidification with HOAc, the aqueous phase was extracted with EtOAc (3 x 80 mL). The combined organic phases were dried over sodium sulfate and solvent was removed under reduced pressure, yielding a yellow oil as crude product, which was purified by flash column chromatography using 10:9:1 hexane:EtOAc:HOAc as solvent system. Fractions of interest were combined and solvent removed under reduced pressure, yielding an off-white powder as product (1.04 g, 75%). Purified product was treated with 20 mL of 30% TFA/DCM to remove the N-terminal Boc-protection. Upon completed reaction (determined by TLC), the solvent was removed under reduced pressure, yielding the final product 4 as an off-white solid in quantitative yield (0.75 g). 1H NMR (400 MHz, CD3OD) δ 1.45 (m, 2H), 1.64 (m, 2H), 1.76 (m, 2H), 2.95 (t, 2H), 3.6 (t, 2H), 3.85 (t, 1H), 4.22 (t, 2H).
- Synthesis of Nε-(buta-2,3-dienoyl)-lysine (bdnK) (5). 3-Butynoic acid was prepared by oxidation of 3-butyn-1-ol. About 20 mL of water was added to a 150 mL single neck RBF followed by 65% HNO3 (45 µL, 0.66 mmol, 0.05 eq), Na2Cr2O7 (40 mg, 0.132 mmol, 0.01 eq) and NaIO4 (6.22 g, 29 mmol, 2.2 eq) and stirred vigorously on an ice bath. After 15
min, 1 mL of 3-butyn-1-ol (1 eq, 13.2 mmol) dissolved in 5 mL of cold water was added dropwise over 30 min. The reaction was left stirring overnight followed by product extraction with diethyl ether. Solvent was evaporated to yield off-white/yellow solid (g, %). 1H NMR (400 MHz, CDCl3) δ 3.35 (d, 2H), 2.22 (t, 1H). 3-Butynoic acid (0.436 g, 5.2 mmol, 1 eq) was dissolved in dry DCM and 1.5 eq of 2-chloro-1-methylpyridinium iodide was added (2.2 g). The reaction was stirred for 1 h at room temperature followed by dropwise addition of Nα-tert-butoxycarbonyl-lysine (1.4 g, 5.72 mmol, 1.1 eq) and triethylamine (1.2 mL, 7.8 mmol, 1.5 eq). The reaction was monitored by TLC and upon completion (4-5 h) extracted with water. The organic layer was evaporated and the crude product was purified using flash column chromatography with 10:9:1 hexane:EtOAc:HOAc as solvent system. Fractions containing the desired product were pooled together and the solvent was removed under reduced pressure, giving the desired product in 55% yield. 1H NMR (400 MHz, CD3OD) δ 1.4 (s, 9H), 1.5 (m, 2H), 1.62 (m, 2H), 1.81 (m, 2H), 3.13 (t, 2H), 4.51 (m, 3H), 5.8 (m, 1H). The final Boc-deprotection was achieved using 20 mL of 30% TFA/DCM for 30 min followed by solvent removal, resulting in product 5 ( g). 1H NMR (400 MHz, CD3OD) δ 1.48 (m, 2H), 1.63 (m, 2H), 1.82 (m, 2H), 3.12 (t, 2H), 4.21 (t, 1H), 4.51 (d, 2H), 5.8 (m, 1H).
- Synthesis of O-(2,3-dibromopropyl)-tyrosine (OdbpY) (6). To a reaction flask containing Nα-tert-butoxycarbonyl-tyrosine (2 g, 7.1 mmol) and potassium carbonate (2.94 g, 21.3 mmol, 3 eq) in dry DMF (20 mL), 1,2,3-tribromopropane (0.915 mL, 7.82 mmol, 1.1 eq) was added dropwise over 20 min. The reaction mixture was stirred at room temperature for 8 h, after which the reaction mixture was filtered, diluted with 60 mL of water, acidified with acetic acid to
pH 4 and extracted with 2 x 100 mL of EtOAc. Organic layers were combined and dried over sodium sulfate. The solvent was removed under reduced pressure, yielding a yellow oil as crude product, which was purified by flash column chromatography using 10:9:1 hexane:EtOAc:HOAc as solvent system. Fractions of interest were combined and solvent removed under reduced pressure, yielding off-white powder as product (g, %). 1H NMR (400 MHz, CD3OD) δ 1.41 (s, 9H), 2.81-3.07 (m, 2H), 3.6-3.81 (m, 2H), 4.21-4.43 (m, 3H), 4.61-4.72 (m, 1H), 6.71 (d, 2H), 7.04 (d, 2H). Purified product was treated with 20 mL of 30% TFA/DCM to remove the N-terminal protection. Upon completed reaction (determined by TLC), the solvent was removed under reduced pressure, yielding the final product 6 as an off-white solid in quantitative yield (g). 1H NMR (400 MHz, CD3OD) δ 2.81-3.07 (m, 2H), 3.6-3.81 (m, 2H), 4.12 (t, 1H), 4.21-4.43 (m, 2H), 4.61-4.72 (m, 1H), 6.71 (d, 2H), 7.04 (d, 2H).
- This example demonstrates procedures for the construction of polynucleotide molecules for the expression of precursor polypeptides of the type (I), (II), or (V) according to the methods described herein.
- To illustrate the various embodiments, a series of plasmid-based vectors were prepared that encode for precursor polypeptides in different formats (Table 1) according to the macrocyclization methods schematically described in FIGS. 1A-B, 2A-B, 3A-B, 4A-B and 37A-B. Specifically, a first series of constructs (Entries 1-9 and 13-15, Table 1) were prepared for the expression of precursor polypeptides of general formula (I), in which (i) the N-terminal tail, (AA)m, consists of a Met-Gly dipeptide; (ii) the target peptide sequence, (AA)n, consists of 1- to 12-amino acid long polypeptides, some of which were designed to include a streptavidin-binding HPQ motif (Katz 1995; Naumann, Savinov et al. 2005) (Entries 13-15, Table 1); and (iii) the C-terminal tail, (AA)p, consists of a short (1 to 8 amino acid-long) polypeptide sequence C-terminally fused to Mxe GyrA intein (SEQ ID NO:1). In these constructs, an amber stop codon was used to enable the introduction of the desired, cysteine-reactive unnatural amino acid Z, upstream of the peptide target sequence via amber stop codon suppression. Moreover, the C-terminal asparagine of Mxe GyrA intein was mutated to an alanine (N198A) to prevent C-terminal splicing and allow for the introduction of a polyhistidine affinity tag at the C-terminus of the polypeptide construct. These constructs were designed to demonstrate the general methods described in FIGS. 1A and 2A.
- A second series of constructs (Entries 10-12, Table 1) were prepared for the expression of precursor polypeptides of general formula (II), in which (i) the N-terminal tail, (AA)m, consists of a short (2 to 6 amino acid-long) polypeptide; (ii) the target peptide sequence, (AA)n, consists of a 3 to 7-amino acid long polypeptide; and (iii) the C-terminal tail, (AA)p, consists of the N198A variant of Mxe GyrA intein (SEQ ID NO:1) followed by a polyhistidine tag. In these constructs, an amber stop codon was used to enable the introduction of the desired, cysteine-reactive unnatural amino acid Z, downstream of the peptide target sequence via amber stop codon suppression. These constructs were designed to probe the functionality of the general methods described in
FIGS. 1B and 2B.
- A third series of constructs (Entries 16-20, Table 1) were prepared for the expression of precursor polypeptides of general formula (I), in which (i) the N-terminal tail, (AA)m, contains the C-domain of Synechocystis sp. DnaE split intein (SEQ ID NO:62); (ii) the C-terminal tail, (AA)p, contains the N-domain of Synechocystis sp. DnaE split intein (SEQ ID NO:61); and (iii) a streptavidin-binding HPQ motif (Naumann, Savinov et al. 2005) is included within (Entries 18-20, Table 1) or downstream of the target peptide sequence (AA)n (Entries 16-17, Table 1). In these constructs, an amber stop codon was used to enable the introduction of the desired, cysteine-reactive unnatural amino acid Z, upstream of the peptide target sequence. Furthermore, these constructs contain a CBD (cellulose binding domain) affinity tag fused to the C-terminal end of the split intein N-domain. These constructs were designed to probe the functionality of the general methods described in
FIGS. 4A-B . - An additional construct (Entry 21, Table 1) was prepared for the expression of a precursor polypeptide which carries two Cys/Z pairs comprising two different target peptide sequences (HPQF (SEQ ID NO:185) and NTSK (SEQ ID NO:186)) and being separated from each other by an intervening polypeptide sequence (ENLYFQS (SEQ ID NO:187)). This construct is instrumental for demonstrating the possibility to generate polycyclic peptides using the methods disclosed herein.
- Finally, a construct (Entry 22, Table 1) was prepared for the expression of a precursor polypeptide which carries a bifunctional cysteine-reactive amino acid (Z2) and two cysteine residues. This construct is instrumental for demonstrating the possibility to generate polycyclic peptides according to the general methods described in
FIGS. 37A-B.

Table 1. Precursor polypeptide constructs.(a)

Entry | Construct Name | Peptide Sequence
1 | 12mer-Z1C | MG-(Z)-CGSKLAEYGT-(GyrAN198A)LEHHHHHH (SEQ ID NO:159)
2 | 12mer-Z2C | MG-(Z)-TCSKLAEYGT-(GyrAN198A)LEHHHHHH (SEQ ID NO:160)
3 | 12mer-Z3C | MG-(Z)-TGCKLAEYGT-(GyrAN198A)LEHHHHHH (SEQ ID NO:161)
4 | 12mer-Z4C | MG-(Z)-TGSCLAEYGT-(GyrAN198A)LEHHHHHH (SEQ ID NO:162)
5 | 12mer-Z5C | MG-(Z)-TGSKCAEYGT-(GyrAN198A)LEHHHHHH (SEQ ID NO:163)
6 | 12mer-Z6C | MG-(Z)-TGSKLCEYGT-(GyrAN198A)LEHHHHHH (SEQ ID NO:164)
7 | 12mer-Z8C | MG-(Z)-TGSKLAECGT-(GyrAN198A)LEHHHHHH (SEQ ID NO:165)
8 | 14mer-Z10C | MG-(Z)-TGSKYLNAECGT-(GyrAN198A)LEHHHHHH (SEQ ID NO:166)
9 | 16mer-Z12C | MG-(Z)-TGSHKYLRNAECGT-(GyrAN198A)LEHHHHHH (SEQ ID NO:167)
10 | 10mer-C4Z | MGSEAGCNIA-(Z)-(GyrAN198A)LEHHHHHH (SEQ ID NO:168; SEQ ID NO:169)
11 | 10mer-C6Z | MGSECGTNIA-(Z)-(GyrAN198A)LEHHHHHH (SEQ ID NO:170; SEQ ID NO:169)
12 | 10mer-C8Z | MGCEAGTNIA-(Z)-(GyrAN198A)LEHHHHHH (SEQ ID NO:171; SEQ ID NO:169)
13 | Strep1-Z5C | MG-(Z)-HPQFCGD-(GyrAN198A)LEHHHHHH (SEQ ID NO:172)
14 | Strep2-Z7C | MG-(Z)-HPQGPPCGD-(GyrAN198A)LEHHHHHH (SEQ ID NO:173)
15 | Strep3-Z11C | MG-(Z)-FTNVHPQFANCD-(GyrAN198A)LEHHHHHH (SEQ ID NO:174)
16 | cStrep3(C)-Z3C | (DnaEC)-C-(Z)-TNCHPQFANA-(DnaEN)-(CBD) (SEQ ID NO:175; SEQ ID NO:176)
17 | cStrep3(S)-Z3C | (DnaEC)-S-(Z)-TNCHPQFANA-(DnaEN)-(CBD) (SEQ ID NO:177; SEQ ID NO:178)
18 | cStrep3(C)-Z8C | (DnaEC)-C-(Z)-TNVHPQFCNA-(DnaEN)-(CBD) (SEQ ID NO:175; SEQ ID NO:179)
19 | cStrep4(S)-Z8C | (DnaEC)-S-(Z)-TNVHPQFCNAKGDA-(DnaEN)-(CBD) (SEQ ID NO:177; SEQ ID NO:180)
20 | cStrep5(S)-Z8C | (DnaEC)-S-(Z)-TNVHPQFCNAKGDTQA-(DnaEN)-(CBD) (SEQ ID NO:177; SEQ ID NO:181)
21 | Strep6_Z4C7C4Z | MG-(Z)-HPQFCENLYFQSCNTSK-(Z)-(GyrAN198A)LEHHHHHH (SEQ ID NO:182; SEQ ID NO:169)
22 | Strep7_C5Z4C | MGCAYDSG-(Z2)-HPQFCGT-(GyrAN198A)LEHHHHHH (SEQ ID NO:183; SEQ ID NO:184)

(a) GyrAN198A corresponds to the N198A variant of Mycobacterium xenopi GyrA (SEQ ID NO:1), CBD corresponds to the Chitin Binding Domain (CBD) of Bacillus circulans chitinase A1 (SEQ ID NO:130), DnaEN and DnaEC correspond to the N-domain and C-domain, respectively, of Synechocystis sp. DnaE split intein (SEQ ID NOS: 61 and 62). The reactive amino acid residues involved in peptide macrocyclization (i.e., Cys and Z residues; Cys and Z2 residues) are highlighted in bold.
- Cloning and plasmid construction. The plasmid vector pET22b(+) (Novagen) was used as cloning vector to prepare the plasmids for the expression of the precursor polypeptides of Entries 1-15 and 21-22 in Table 1. Briefly, synthetic oligonucleotides (Integrated DNA Technologies) were used for the PCR amplification of a gene encoding the N-terminal peptide and peptide target sequence fused to GyrAN198A intein, using a previously described GyrA-containing vector (pBP_MG6) (Smith, Vitali et al. 2011) as template. The resulting PCR product (ca. 0.6 kbp) was digested with Nde I and Xho I and cloned into pET22b(+) to provide the plasmids for the expression of the precursor polypeptides of Entries 1-15 and 21-22 in Table 1. The cloning process placed the polypeptide-encoding gene under the control of an IPTG-inducible T7 promoter and introduced a poly-histidine tag at the C-terminus of the intein. Plasmids for the expression of the polypeptide constructs of Entries 16 through 20 of Table 1 were prepared in a similar manner but using pBAD plasmid (Life Technologies) as the cloning and expression vector. The genes encoding DnaEN and DnaEC were amplified from Addgene plasmids pSFBAD09 and pJJDuet30. The sequences of the plasmid constructs were confirmed by DNA sequencing.
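The construct names in Table 1 appear to encode the position of the cysteine relative to the unnatural amino acid Z (for example, Z8C for a Cys at Z+8 and C4Z for a Cys at Z-4). The following is a minimal Python sketch of that bookkeeping. It assumes, purely for illustration, that the unnatural amino acid is written with the single-letter placeholder "Z" and that the intein, tags, and linkers are omitted, so the intein's own N-terminal cysteine is not counted. The function name `cys_offsets` is hypothetical and not part of the disclosure.

```python
def cys_offsets(sequence: str, z_symbol: str = "Z") -> list[int]:
    """Return the offset of every Cys relative to every Z placeholder.

    A positive offset n means the Cys sits at position Z+n (downstream);
    a negative offset means the Cys is upstream of Z.
    """
    z_positions = [i for i, aa in enumerate(sequence) if aa == z_symbol]
    c_positions = [i for i, aa in enumerate(sequence) if aa == "C"]
    return [c - z for z in z_positions for c in c_positions]


# Sequences transcribed from Table 1 with the intein and His tag left out.
# "Z" is an illustrative placeholder for the unnatural amino acid.
constructs = {
    "12mer-Z8C": "MGZTGSKLAECGT",
    "14mer-Z10C": "MGZTGSKYLNAECGT",
    "10mer-C4Z": "MGSEAGCNIAZ",
    "Strep1-Z5C": "MGZHPQFCGD",
}

for name, seq in constructs.items():
    print(name, cys_offsets(seq))
```

For constructs with more than one Z or Cys (Entries 21 and 22), the function returns every pairwise offset, so deciding which pairs are expected to react would still require the design rationale given in the text.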
- This example illustrates how a suitable tRNA/aminoacyl-tRNA synthetase pair can be identified for the purpose of incorporating a desired cysteine-reactive, unnatural amino acid into a precursor polypeptide of general formula (I), (II), or (V) according to the methods disclosed herein. In particular, this example describes the identification of tRNA/aminoacyl-tRNA synthetase pairs for the incorporation of the unnatural amino acids 4-(2-bromoethoxy)-phenylalanine (p-2beF), Nε-((2-bromoethoxy)carbonyl)-lysine (2becK), 4-(1-bromoethyl)-phenylalanine (p-1beF), Nε-((2-chloroethoxy)carbonyl)-lysine (2cecK), Nε-(buta-2,3-dienoyl)-lysine (bdnK), and O-(2,3-dibromopropyl)-tyrosine (OdbpY), which were synthesized as described in Example 1.
- A high-throughput fluorescence-based screen was applied to identify viable tRNA/aminoacyl-tRNA synthetase (AARS) pairs for the ribosomal incorporation of the unnatural amino acid p-2beF, 2becK, p-1beF, 2cecK, bdnK, or OdbpY in response to an amber stop codon. In this assay, E. coli cells are co-transformed with two plasmids with compatible origins of replication and selection markers; one plasmid directs the expression of the tRNA/AARS pair to be tested, whereas the second plasmid contains a gene encoding a variant of Yellow Fluorescent Protein (YFP) in which an amber stop codon (TAG) is introduced at the second position of the polypeptide sequence following the initial Met residue (called YFP(TAG)). The ability of the tRNA/AARS pair to suppress the amber stop codon with the unnatural amino acid of interest can thus be determined and quantified based on the relative expression of YFP as determined by fluorescence. Using this assay, a panel of engineered aminoacyl-tRNA synthetase (AARS) variants derived from M. jannaschii tyrosyl-tRNA synthetase (SEQ ID NO:77), M. barkeri pyrrolysyl-tRNA synthetase (SEQ ID NO:79), or M. mazei pyrrolysyl-tRNA synthetase (SEQ ID NO:78), in combination with their cognate amber stop codon suppressor tRNA (i.e., MjtRNACUA Tyr (SEQ ID NO:101) for Mj AARSs and Mm/MbtRNACUA Pyl (SEQ ID NO:105) for the Mm and Mb AARSs), were tested for their ability to incorporate the target amino acids p-2beF, 2becK, p-1beF, 2cecK, bdnK, or OdbpY into the reporter YFP(TAG) protein. In a representative experiment, this panel of AARS enzymes included the known engineered AARSs Mj-pAcF-RS (SEQ ID NO:81), Mj-pAmF-RS (SEQ ID NO:87), Mb-CrtK-RS (SEQ ID NO:93), and Mm-pXF-RS (SEQ ID NO:91) (Young, Young et al. 2011) as well as the newly engineered Mj-OpgY2-RS (SEQ ID NO:85). The latter, which is derived from Mj-OpgY-RS (SEQ ID NO:84) (Deiters and Schultz 2005), carries an Ala32Gly mutation that was designed to facilitate the recognition of O-substituted tyrosine derivatives such as p-2beF and OdbpY based on the available crystal structure of the parent enzyme Mj-TyrRS (SEQ ID NO:77) (Kobayashi, Nureki et al. 2003). As illustrated by the representative data in
FIGS. 7A-B, the AARS/tRNA pair consisting of Mj-OpgY2-RS/MjtRNACUA Tyr was found to enable the efficient incorporation of p-2beF (FIG. 7A), whereas the AARS/tRNA pair consisting of Mb-CrtK-RS/Mm/MbtRNACUA Pyl was found to enable the efficient incorporation of 2becK into the reporter YFP(TAG) protein (FIG. 7B). Control experiments with no unnatural amino acid added to the culture medium show no or negligible expression of the reporter YFP protein, evidencing the discriminating selectivity of these AARS/tRNA pairs for the desired unnatural amino acid over the pool of natural amino acids (this property is referred to here as "orthogonal reactivity" or simply "orthogonality" of the AARS/tRNA pair).
- Using an analogous procedure, it was established that the Mj-pAcF-RS/MjtRNACUA Tyr pair can enable efficient amber stop codon suppression with p-1beF; the Mb-CrtK-RS/Mm/MbtRNACUA Pyl pair can enable efficient amber stop codon suppression with 2cecK or bdnK; and the Mj-OpgY2-RS/MjtRNACUA Tyr pair can enable efficient amber stop codon suppression with OdbpY. These results provide an exemplary demonstration of viable procedures that can be used to identify suitable AARS/tRNA pairs for the ribosomal incorporation of a cysteine-reactive unnatural amino acid into a polypeptide for the purpose of producing macrocyclic peptides according to the methods disclosed herein and as illustrated in the following Examples.
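The comparison between wells grown with and without the unnatural amino acid can be expressed as a fold-over-background ratio. The sketch below is one possible way to do this in Python; the function name `suppression_fold`, the optional buffer-blank subtraction, and all fluorescence readings are illustrative assumptions and are not data or procedures from this disclosure.

```python
from statistics import mean


def suppression_fold(with_uaa: list[float],
                     without_uaa: list[float],
                     blank: float = 0.0) -> float:
    """Ratio of mean YFP fluorescence with vs. without unnatural amino acid.

    `blank` is an optional buffer-only reading subtracted from both means
    (an assumption of this sketch, not a step stated in the text).
    A large ratio suggests that amber suppression depends on the unnatural
    amino acid, i.e., that the AARS/tRNA pair behaves orthogonally.
    """
    signal = mean(with_uaa) - blank
    background = mean(without_uaa) - blank
    if background <= 0:
        raise ValueError("background must be positive after blank subtraction")
    return signal / background


# Hypothetical plate-reader values (arbitrary fluorescence units).
test_wells = [1000.0, 1200.0]
control_wells = [50.0, 60.0]
print(suppression_fold(test_wells, control_wells, blank=5.0))
```

A ratio alone does not establish incorporation fidelity; the text relies on mass spectrometry of the purified proteins for that.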
- YFP expression assay. Competent BL21(DE3) E. coli cells were cotransformed with a pEVOL-based plasmid (Smith, Vitali et al. 2011) for the expression of the desired AARS/tRNA pair and a pET22-YFP(TAG) plasmid for the expression of the reporter YFP protein. After overnight growth at 37°C in LB medium supplemented with chloramphenicol (25 µg/mL) and ampicillin (50 µg/mL), cell cultures were used to inoculate 96-well plates containing 0.9 mL of minimal (M9) media (25 µg/mL chloramphenicol, 50 µg/mL ampicillin, 1% glycerol) per well. At OD600 = 0.6, protein expression was induced with 0.05% L-arabinose and 1 mM IPTG. Test wells were supplemented with the desired unnatural amino acid (e.g., 4-(2-bromoethoxy)-phenylalanine (p-2beF)) at 2 to 5 mM, whereas no amino acid was added to the negative control wells. Cultures were grown overnight at 27°C and then diluted 1:100 with phosphate buffer (50 mM KPi (pH 7.5), 150 mM NaCl) into microtiter plates. Fluorescence intensity was measured using a Tecan Infinite 1000 multi-well plate reader (λexc: 514 nm; λem: 527 nm).
- This example demonstrates the formation and isolation of macrocyclic peptides produced via the cyclization of ribosomally derived precursor polypeptides of general formula (I) and containing the cysteine-reactive unnatural amino acid p-2beF. In particular, this example demonstrates certain embodiments as schematically described in FIGS. 1A and 2A.
- For these experiments, the precursor polypeptides corresponding to
Entries 1 through 9 in Table 1 were expressed in BL21(DE3) E. coli cells containing a second, pEVOL-based plasmid for the co-expression of Mj-OpgY2-RS and MjtRNACUA Tyr. As described in Example 3, this AARS/tRNA pair was established to allow for the efficient ribosomal incorporation of p-2beF into a polypeptide in response to an amber stop codon. According to our strategy (FIGS. 1A-B), these precursor polypeptides were expected to undergo cyclization via a nucleophilic substitution reaction between the cysteine side-chain thiol group and the p-2beF side-chain bromoalkyl group flanking the target peptide sequence after ribosomal synthesis of the precursor polypeptides in the expression system (E. coli) (FIG. 8). To establish the occurrence and efficiency of the cyclization, these proteins were isolated by Ni-affinity chromatography, exploiting the C-terminal poly-histidine tag present in these constructs (Table 1). In all the aforementioned constructs, a Thr residue was placed at the site preceding the GyrA intein ("I-1 site"). This substitution minimizes premature hydrolysis of GyrA-fusion proteins during expression in E. coli (Frost, Vitali et al. 2013), thereby facilitating analysis of the target peptide sequences after chemically induced splicing of the intein from the purified proteins in vitro (FIG. 8, path A). This procedure would also permit the isolation of any product resulting from the unselective reaction of p-2beF with other nucleophiles in vivo (e.g., glutathione). Accordingly, after purification, the proteins were reacted with benzyl mercaptan in order to release the desired macrocyclic peptide (in the form of the C-terminal benzyl thioester or the corresponding C-terminal carboxylic acid after thioester hydrolysis) from the GyrA intein via thiol-induced splicing of the intein. The reaction mixtures were then analyzed by LC-MS to detect and quantify the amount of the desired thioether-linked macrocyclic product as well as that of any uncyclized linear byproduct, as judged from the peak areas in the corresponding extracted-ion chromatograms (FIGS. 10-15). Uncyclized byproducts would appear as unmodified linear peptides or as linear adducts in which the bromoalkyl group in p-2beF has undergone nucleophilic substitution with the benzyl mercaptan reagent during the in vitro reaction or with glutathione in vivo.
- As summarized in
FIG. 9A, these studies revealed that peptide macrocyclization had occurred with very high efficiency (80-95%) across the constructs with Cys and p-2beF being separated by two to eight residues (i.e., Cys at Z+2 to Z+8). Increasing this distance (i.e., with Cys at Z+10 and Z+12, Entries 8-9 in Table 1) resulted in a decrease of the cyclic product (50-20%, FIG. 9A). Interestingly, cyclization could also be achieved when the Cys was located immediately adjacent to the unnatural amino acid (Entry 1, Table 1), albeit to a lesser extent (5%) as compared to the other constructs. This result can be rationalized based on the comparatively less favorable 14-membered macrocycle formed when the p-2beF/Cys pair is in an i/i+1 relationship. For each construct tested, the identity of the macrocyclic product could be further confirmed by analysis of the corresponding MS/MS fragmentation spectrum as illustrated in FIG. 16.
- GyrA intein contains a Cys at its N-terminal end which is crucial for mediating protein splicing in the context of the application of the present methods for producing peptide macrocycles inside the cells (see Example 5). Since this residue is partially buried within the active site (Klabunde, Sharma et al. 1998), we did not expect it to react readily with the p-2beF side chain. Notably, quantitative splicing of the GyrA moiety upon treatment of all the aforementioned constructs with benzyl mercaptan indicated that no reaction occurred between p-2beF and the catalytic Cys at the intein I+1 site (see representative results in
FIGS. 17a-d). Furthermore, no adducts or dimers were observed for any of the constructs described above, including those undergoing only partial cyclization (i.e., Entries 8-9, FIG. 9A). Altogether, these results further highlight the high chemo- and regioselectivity of the macrocyclization reaction.
- Protein expression and purification. The protein constructs were expressed using BL21(DE3) E. coli cells co-transformed with a pET22-based vector for the expression of the precursor polypeptide and a pEVOL-based vector for the expression of the Mj-OpgY2-RS/MjtRNACUA Tyr pair. Cultures of these cells were grown overnight in LB media (50 mg/L ampicillin; 25 mg/L chloramphenicol) and used to inoculate 0.2 L of minimal (M9) media containing the same concentration of antibiotics, 1% glycerol, and 1 mM p-2beF. At OD600 = 0.6, IPTG (1 mM) and L-arabinose (0.05%) were added to the culture media to induce protein expression. Cultures were grown for 14 h at 27 °C and then harvested by centrifugation. Cell pellets were resuspended in 50 mM Tris, 300 mM NaCl, 20 mM imidazole buffer (pH 7.5) and cells were lysed by sonication. The cell lysate was loaded on a Ni-NTA affinity column and proteins were eluted with 50 mM Tris, 150 mM NaCl, 300 mM imidazole buffer (pH 7.5). Fractions were combined and concentrated, followed by buffer exchange with potassium phosphate buffer (50 mM, 150 mM NaCl, pH 7.5). The identity of the isolated proteins was confirmed using MALDI-TOF MS and LC-MS.
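The cyclization efficiencies quoted in this example are judged from peak areas in extracted-ion chromatograms. A minimal Python sketch of that calculation is given below. It assumes that all species ionize comparably, so that area fractions approximate molar fractions; that assumption, the function name `percent_cyclization`, the species labels, and the peak areas are all illustrative and not taken from this disclosure.

```python
def percent_cyclization(peak_areas: dict[str, float],
                        cyclic_key: str = "cyclic") -> float:
    """Percentage of total extracted-ion peak area due to the cyclic product.

    `peak_areas` maps a species label to its integrated peak area. All
    labels other than `cyclic_key` are treated as uncyclized byproducts
    (e.g., linear peptide, benzyl mercaptan adduct, glutathione adduct).
    """
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return 100.0 * peak_areas[cyclic_key] / total


# Hypothetical integrated peak areas (arbitrary units).
areas = {
    "cyclic": 9.0e6,
    "linear": 0.5e6,
    "benzyl_mercaptan_adduct": 0.3e6,
    "glutathione_adduct": 0.2e6,
}
print(percent_cyclization(areas))
```

If the cyclic and linear species differ markedly in ionization efficiency, a calibration with authentic standards would be needed before the percentages could be read as yields.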
- Intein splicing and MS analysis. Aliquots of the purified proteins (200 µM) were incubated with 15 mM benzyl mercaptan and 20 mM TCEP in 50 mM phosphate buffer (pH 8). The identity of the target macrocycles was confirmed using MALDI-TOF MS and LC-MS analysis. LC-MS analyses were performed on a Thermo Scientific LTQ Velos ESI/ion-trap mass spectrometer coupled to an Accela U-HPLC. Macrocycles were analyzed using a Thermo Scientific HyPurity C4 column (particle size 5 µm, 100 x 2.1 mm i.d.) and a linear gradient of 5% to 95% ACN (with 0.1% formic acid) in water (with 0.1% formic acid) over 9 min. MALDI-TOF spectra were acquired on a Bruker Autoflex III mass spectrometer.
- This example further demonstrates the formation and isolation of macrocyclic peptides produced via the cyclization of ribosomally derived precursor polypeptides of general formula (I) and containing the cysteine-reactive unnatural amino acid p-2beF. In particular, this example provides a demonstration of the functionality of the methods described herein for the production of macrocyclic peptides within living bacterial cells.
- For these studies, the constructs corresponding to Entries 13 through 15 of Table 1 were utilized. The corresponding precursor polypeptides were expressed in BL21(DE3) E. coli cells in the presence of the Mj-OpgY2-RS/MjtRNACUA Tyr pair to achieve the site-selective incorporation of the unnatural amino acid p-2beF into these proteins via amber stop codon suppression. These constructs were designed to contain an Asp residue in the position preceding the GyrA intein moiety in order to favor premature N-terminal splicing of this intein during expression (FIG. 8). We previously established that certain amino acid substitutions at the I-1 site, and in particular Asp and Lys, can strongly promote premature splicing of GyrA intein during recombinant expression (Frost, Vitali et al. 2013). This effect is likely due to the ability of these residues to favor hydrolysis of the intein-catalyzed thioester linkage through their nucleophilic side-chain groups. This reactivity is leveraged here for mediating the spontaneous release of the macrocyclic peptide from the precursor protein after ribosomal expression as outlined in FIG. 8 (path B). Thus, according to our strategy (FIGS. 1A and 2A), these precursor polypeptides were expected to result in the formation of macrocyclic peptides inside the living cell expression host (E. coli) via the intramolecular, thioether bond-forming reaction between the cysteine and p-2beF residue, followed by release of the cyclic peptide via spontaneous N-terminal splicing of the intein moiety. These constructs were also designed to contain a streptavidin-binding motif (HPQ) within the sequence of the resulting macrocyclic peptides (Table 1) in order to allow for the isolation of these peptides via streptavidin-affinity capture directly from bacterial lysates. Accordingly, E. coli cells expressing these precursor polypeptides were grown overnight and lysed by sonication. The cell lysates were then passed over streptavidin-coated beads, from which streptavidin-bound material was eluted. LC-MS analysis of the eluates revealed the occurrence of the expected peptide macrocycle in each case, as illustrated by the LC-MS chromatograms and MS/MS spectra in FIGS. 25-27. Since the uncyclized peptide could also be captured through this procedure, these analyses also showed that the desired macrocyclic product was formed with high efficiency in each case (i.e., >95% for Strep1-Z5C(p-2beF); 70% for Strep2-Z7C(p-2beF); 85% for Strep3-Z11C(p-2beF)).
Furthermore, the precursor polypeptides were found to have undergone complete splicing in vivo (FIGS. 33a-d). Since p-2beF-mediated alkylation of the intein catalytic cysteine would prevent protein splicing, the latter results further highlighted the high degree of chemo- and regioselectivity of the macrocyclization reaction. Furthermore, the cyclization yield observed with these sequences correlated very well with the reactivity trend measured across the other p-2beF-containing constructs (FIG. 9A), suggesting that this parameter is rather predictable on the sole basis of the Cys/p-2beF distance and in spite of the difference in the composition of the target peptide sequence.
- Altogether, these results further demonstrate the versatility of the methods described herein for enabling the ribosomal synthesis of macrocyclic peptides of varying length and composition. In addition, they demonstrate the possibility of applying these methods to enable the production of macrocyclic peptides in vivo, i.e., inside a living cell. Finally, they demonstrate that these in vivo produced macrocyclic peptides can be functional, that is, capable of specifically binding to a target biomolecule (i.e., streptavidin).
- Isolation and analysis of HPQ-containing macrocyclic peptides. Protein expression was performed as described above (Example 5). After centrifugation, cells were resuspended in 50 mM Tris, 300 mM NaCl and 20 mM imidazole (pH 7.5) and lysed by sonication. Cell lysates were incubated with streptavidin-coated beads for 1 hour under gentle shaking on ice. Beads were washed two times with the same buffer followed by incubation with acetonitrile:H2O (70:30 v/v) for one minute to release any streptavidin-bound peptides. Eluates were lyophilized and the identity of the peptides evaluated using MALDI-TOF MS and LC-MS as described above (Example 5).
- This example further demonstrates the formation and isolation of macrocyclic peptides produced via the cyclization of ribosomally derived precursor polypeptides of general formula (I). In particular, this example demonstrates how different cysteine-reactive unnatural amino acids of general structure (III) can be used for the purpose of generating macrocyclic peptides starting from ribosomally produced polypeptide precursors according to the methods described herein.
- As described in Example 3, orthogonal AARS/tRNA pairs could be readily identified to achieve the specific incorporation of the unnatural amino acids 2becK, 2cecK, p-1beF, or bdnK into a precursor polypeptide of choice. Each of these amino acids contains an electrophilic side-chain functionality (i.e., alkylbromide group in 2becK and p-1beF; alkylchloride group in 2cecK; allenamide group in bdnK), which was expected to react chemoselectively with a neighboring cysteine residue within the precursor polypeptide sequence according to the general methods provided herein. To test the ability of 2becK and 2cecK to mediate peptide macrocyclization, the constructs corresponding to Entries 1 through 9 of Table 1 were expressed in E. coli as described above (Example 5) using the appropriate AARS/tRNA pairs (Example 3) for the incorporation of either 2becK or 2cecK as the cysteine-reactive residue (Z residue, Table 1). To establish the occurrence of the desired macrocyclization reaction, these proteins were purified by Ni-affinity chromatography and then reacted with benzyl mercaptan to splice the GyrA intein and release the macrocyclic peptide. Detection and quantification of the cyclic product was carried out by LC-MS and MS/MS analysis as described in Example 4. These analyses revealed the occurrence of the desired macrocyclic peptide product in each case, as shown by the representative LC-MS extracted-ion chromatograms and MS/MS spectra in FIGS. 18-22. As summarized in FIG. 9B, 2becK- and 2cecK-mediated peptide macrocyclization was found to occur very efficiently (>80%) when the cysteine residue is located within a six-residue distance from the electrophilic amino acid (i.e., with constructs 12mer-Z1C through 12mer-Z6C). Beyond this spacing distance, the % cyclization decreases significantly (<20%). Interestingly, the reactivity of 2becK and 2cecK as cysteine cross-linking residues nicely complements that of p-2beF, as is evident from a comparison of the % cyclization data in FIGS. 9A and 9B. For example, whereas the 12mer-Z1C construct undergoes efficient cyclization in the presence of 2becK (or 2cecK) but not p-2beF as the cysteine-reactive residue, the opposite holds true in the context of the large macrocycles formed from the constructs 14mer-Z10C and 16mer-Z12C. Thus, these results show how different cysteine-reactive amino acids can be appropriately chosen and applied to obtain macrocyclic peptides of varying ring size according to the methods provided herein.
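The dependence of ring size on the Cys/Z spacing can be estimated with simple arithmetic. The sketch below starts from the 14-membered ring stated in Example 4 for a p-2beF/Cys pair in an i/i+1 relationship and adds three backbone atoms (N, Cα, carbonyl C) for each additional intervening residue. The function name `ring_size`, the default of 14 (which applies to p-2beF only), and the assumption that the same count holds when Cys lies upstream of Z are illustrative choices made here; ring sizes for 2becK, 2cecK, p-1beF, or bdnK would require their own i/i+1 value, which is not given in the text.

```python
BACKBONE_ATOMS_PER_RESIDUE = 3  # N, C-alpha, and carbonyl C of each residue


def ring_size(spacing: int, adjacent_ring_size: int = 14) -> int:
    """Estimate the number of ring atoms in the thioether macrocycle.

    spacing: position of Cys relative to Z (e.g., 8 for Z+8, -4 for Z-4).
    adjacent_ring_size: ring size for the i/i+1 case; 14 is the value the
        text gives for p-2beF. Other amino acids need a different value.
    The sign of `spacing` is ignored, assuming (for this sketch) that the
    atom count does not depend on which residue comes first.
    """
    if spacing == 0:
        raise ValueError("Cys and Z cannot occupy the same position")
    extra_residues = abs(spacing) - 1
    return adjacent_ring_size + BACKBONE_ATOMS_PER_RESIDUE * extra_residues


for n in (1, 2, 8, 10, 12):
    print(f"Cys at Z+{n}: {ring_size(n)}-membered ring")
```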
- To further investigate the generality of the methods presented herein, two additional amino acids, p-1beF and bdnK, were synthesized (Example 1) and tested here for their ability to induce peptide macrocyclization upon reaction with a proximal cysteine in the precursor polypeptide. p-1beF contains a benzylic, secondary alkyl bromide group, thus enabling the formation of more compact peptide ring structures as compared to those generated using p-2beF-mediated cysteine alkylation. On the other hand, bdnK was designed to contain an allenamide group, which is known to react chemoselectively with cysteine via a Michael addition reaction (Abbas, Xing et al. 2014). Using the appropriate AARS/tRNA pair as determined in Example 3, p-1beF was incorporated into the construct 12mer-Z4C (Entry 4, Table 1) to give 12mer-Z4C(p-1beF), whereas bdnK was incorporated into the construct 12mer-Z6C (Entry 6, Table 1) to give 12mer-Z6C(bdnK). After expression in E. coli and purification via Ni-affinity chromatography, these proteins were reacted with benzyl mercaptan to splice the GyrA intein and release the macrocyclic peptide. The desired macrocyclic peptide product could be observed in each case (FIGS. 23 and 24). Altogether, the results included in this example illustrate how a variety of structurally diverse cysteine-reactive amino acids can be designed and applied in the context of the general peptide cyclization methods described in this application.
- This example demonstrates the formation and isolation of macrocyclic peptides produced via the cyclization of ribosomally derived precursor polypeptides of general formula (II). As such, this example demonstrates certain embodiments as schematically described in FIGS. 1B and 2B.
- For these studies, the constructs corresponding to
Entries 10 through 12 of Table 1 were used. Three different cysteine-reactive amino acids, p-2beF, 2becK, and 2cecK, were tested as the Z residue in these constructs. The corresponding p-2beF-, 2becK-, or 2cecK-containing precursor polypeptides were expressed in BL21(DE3) E. coli cells using the appropriate AARS/tRNA pair as determined in Example 3 (Mj-OpgY2-RS/MjtRNACUA Tyr pair for the p-2beF-containing proteins and Mb-CrtK-RS/Mm/MbtRNACUA Pyl for the 2becK- and 2cecK-containing proteins). In these constructs, the reactive Cys is located upstream of the unnatural amino acid, and specifically at position Z-4, Z-6 and Z-8. Analysis of the p-2beF-containing proteins according to the procedure described above (Example 4) revealed the occurrence of the desired cyclic peptide as the largely predominant product (95-99%) for all of the constructs tested (FIG. 9A, FIGS. 34-35). For the 2becK- and 2cecK-containing proteins, efficient inter-side-chain cyclization (80-95%) was observed when the cysteine and unnatural amino acid are three (Z-4) and five (Z-6) residues apart, while a lower % of cyclization was noted at the larger spacing distance (Z-8) (FIG. 9B). These data clearly demonstrated that the thioether bond-forming reactivity of the cysteine-reactive amino acids is preserved when the order of the Cys and Z residues is reversed, thus enabling structural variation of the resulting macrocyclic peptide products. Furthermore, quantitative thiol-induced splicing of the GyrA intein from the aforementioned proteins indicated that no reaction had occurred between the side chain of the unnatural amino acid and the catalytic I+1 cysteine residue of the intein (FIGS. 17a-d).
- This example demonstrates certain embodiments as schematically described in
FIG. 4A. In particular, this example demonstrates how bicyclic peptides can be generated from precursor polypeptides of general formula (I) via the combination of a split intein-mediated trans-splicing reaction and an inter-side-chain cyclization reaction mediated by a cysteine and a cysteine-reactive unnatural amino acid according to the methods described herein. While split intein-mediated trans-splicing has proven useful for the generation and isolation of head-to-tail cyclic peptides in a variety of contexts (Scott, Abel-Santos et al. 1999; Tavassoli and Benkovic 2005; Tavassoli and Benkovic 2007; Tavassoli, Lu et al. 2008; Young, Young et al. 2011) (see also U.S. Pat. No. 7,354,756, U.S. Pat. No. 7,252,952, and U.S. Pat. No. 7,105,341), there are no reports of the application of this technique (called SICLOPPS) to obtain bicyclic peptides of the general structure described in FIGS. 4A-B. This example demonstrates the possibility of applying the general methods disclosed herein, and specifically the embodiments outlined in FIGS. 4A-B, to enable the efficient production of bicyclic peptides inside a living cell. In addition, the advantage conferred by the bicyclic structure, and thus by the inter-side-chain thioether linkage, toward improving the functional (i.e., protein-binding) properties of the macrocyclic peptide is demonstrated.
- For these studies, the constructs corresponding to Entries 16 through 20 of Table 1 were utilized. The corresponding precursor polypeptides were expressed in BL21(DE3) E. coli cells in the presence of the Mj-OpgY2-RS/MjtRNACUA Tyr pair for incorporation of the unnatural amino acid p-2beF into these proteins via amber stop codon suppression, as described above (Example 5). These constructs were designed to comprise the C-domain and N-domain of split intein DnaE within the N-terminal tail and the C-terminal tail, respectively, of the precursor polypeptide. According to our strategy (
FIG. 4A), these precursor polypeptides were expected to result in the formation of bicyclic peptides in E. coli by means of an intramolecular, thioether bond-forming reaction between the cysteine and p-2beF residues and a DnaE-catalyzed trans-splicing reaction leading to ring closure (i.e., N-to-C-end cyclization) of the peptide sequence comprised between the C- and N-domain of the split intein. To facilitate the identification and isolation of these bicyclic peptides, a streptavidin-binding motif (HPQ) was included within the sequence targeted for macrocyclization (Table 1). Accordingly, using a procedure analogous to that described in Example 5, lysates of E. coli cells expressing the aforementioned precursor polypeptides were passed over streptavidin-coated beads, from which streptavidin-bound material was eluted.
- Notably, the desired bicyclic peptide was isolated as the largely predominant product in each case (70-95%), as determined by LC-MS (
FIGS. 28-32 ). The bicyclic structure of these compounds was further evidenced by the corresponding MS/MS fragmentation spectra (FIGS. 28-32 ). Treatment of the bicyclic peptide obtained with the thiol-alkylating iodoacetamide resulted in a 57 Da increase in molecular mass and shift of the peptide retention time for the bicyclic product of the cStrep3(C)-Z3C(p-2beF) precursor protein but not for that of cStrep3(S)-Z3C(p-2beF), which is consistent with the presence of a free thiol in the former (from IntC+1 cysteine) but not in the latter. To allow measurement of the extent of post-translational self-processing of these precursor polypeptides in vivo, a chitin-binding domain was included at the C-terminus of the IntN domain in each construct (Table 1). LC-MS analysis of the protein fraction eluted from chitin beads showed that the split intein-mediated cyclization has occurred nearly quantitatively or nearly quantitatively (>85%) for all the constructs tested (see representative MS spectra inFIGS. 33a-d ). Overall, the successful generation of bicyclic structures across target sequences of varying length and composition supports the functionality and broad scope of the present methodology for the ribosomal synthesis of bicyclic peptides through the integration of split intein-mediated peptide circularization with inter-side-chain thioether bridge formation. - The increased conformational rigidity imposed by the intra-side-chain thioether bridge is expected to improve the functional and/or stability properties of these bicyclic peptides as compared to the head-to-tail cyclized peptide counterpart. To investigate this aspect, the streptavidin-binding affinity of the bicyclic peptides obtained via cyclization of the cStrep3(S)-Z3C(p-2beF) and cStrep3(C)-Z8C(p-2beF) constructs was measured through an in-solution inhibition assay and compared with that of a 'monocyclic' counterpart (cyclo[S(OpgY)TNCHPQFANA] (SEQ ID NO:189) where OpgY is O-propargyl-tyrosine). 
In this assay (FIG. 39A), a streptavidin-binding surface is first created by immobilizing the bicyclic peptide obtained from the cStrep3(C)-Z8C(p-2beF) construct on maleimide-coated microtiter plates. Then, a fixed amount of streptavidin-horseradish peroxidase conjugate is added to the plate in the presence of varying amounts of the bicyclic or monocyclic peptide. After washing, the amount of bound streptavidin is determined based on the residual peroxidase activity using a standard (ABTS) colorimetric assay. Using this assay, the IC50 value for the head-to-tail monocyclic peptide cyclo[S(OpgY)TNCHPQFANA] (SEQ ID NO:189) was determined to be 1.9 µM, while the thioether-constrained bicyclic peptides from the cStrep3(S)-Z3C(p-2beF) and cStrep3(C)-Z8C(p-2beF) constructs exhibited IC50 values of 3.7 and 0.77 µM, respectively (FIG. 39B). The >2-fold increase in streptavidin-binding affinity exhibited by the latter as compared to the monocyclic counterpart exemplifies the inherent advantage provided by the presence of the additional intramolecular thioether linkage.
- Preparation and isolation of bicyclic macrocycles. Protein expression of constructs 16-20 was performed as described in the previous Examples, with the difference that cells were incubated for an additional 3 hours at 37°C after overnight growth. Cells were harvested and lysed, and the cell lysate was treated as described above to isolate and analyze the streptavidin-bound peptides by LC-MS. To analyze the amount of protein splicing that occurred in vivo, the same cell lysate samples were incubated with chitin beads for 1 h on ice. Beads were washed two times with buffer, followed by incubation with acetonitrile:H2O (70:30 v/v) for one minute to release any chitin-bound protein. Eluates were analyzed by LC-MS.
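- The fold change in inhibitory potency reported for the streptavidin inhibition assay can be checked with a short calculation. The sketch below is illustrative only: it uses the three IC50 values stated in the text, and it assumes that the ratio of IC50 values is an acceptable proxy for the relative binding affinity. The dictionary keys, the `REFERENCE` constant, and the helper name `fold_improvement` are hypothetical and are not part of the original disclosure.

```python
# IC50 values (in µM) as reported in the text for the streptavidin inhibition assay.
ic50_um = {
    "monocyclic cyclo[S(OpgY)TNCHPQFANA]": 1.9,
    "bicyclic from cStrep3(S)-Z3C(p-2beF)": 3.7,
    "bicyclic from cStrep3(C)-Z8C(p-2beF)": 0.77,
}

# Hypothetical label for the comparator peptide (the head-to-tail monocycle).
REFERENCE = "monocyclic cyclo[S(OpgY)TNCHPQFANA]"


def fold_improvement(reference_ic50: float, test_ic50: float) -> float:
    """Return how many times lower the test IC50 is than the reference IC50.

    Values above 1 mean the test peptide is the stronger inhibitor.
    Assumption: the IC50 ratio is used as a stand-in for relative affinity.
    """
    if reference_ic50 <= 0 or test_ic50 <= 0:
        raise ValueError("IC50 values must be positive")
    return reference_ic50 / test_ic50


# Compare every bicyclic peptide against the monocyclic reference.
fold_changes = {
    name: fold_improvement(ic50_um[REFERENCE], value)
    for name, value in ic50_um.items()
    if name != REFERENCE
}

for name, fold in fold_changes.items():
    print(f"{name}: {fold:.2f}-fold relative to the monocyclic peptide")
```

Under these assumptions, the ratio for the cStrep3(C)-Z8C(p-2beF)-derived peptide exceeds 2, in line with the ">2-fold" statement, while the ratio for the cStrep3(S)-Z3C(p-2beF)-derived peptide falls below 1.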
- This example demonstrates the feasibility of generating polycyclic peptides using the methods provided herein. In particular, it demonstrates the formation and isolation of polycyclic peptides obtained via the post-translational cyclization of precursor polypeptides containing multiple Z/Cys pairs. It also demonstrates the formation and isolation of polycyclic peptides produced via the cyclization of ribosomally derived precursor polypeptides of general formula (V). Specifically, this example demonstrates certain embodiments as schematically described in FIGS. 37A-B.
- For these studies, the constructs corresponding to Entries 21 and 22 of Table 1 were utilized. In Strep6_Z4C7C4Z, a Z/Cys pair encompassing a four-amino acid target peptide sequence (HPQF (SEQ ID NO:185)) is followed by a second Cys/Z pair encompassing a different target peptide sequence (NTSK) after a spacer sequence (ENLYFQS). To demonstrate the possibility of obtaining polycyclic peptides in this manner, the corresponding precursor polypeptide was expressed in BL21(DE3) E. coli cells in the presence of the Mj-pOgY2-RS/MjtRNACUA Tyr pair to achieve the site-selective incorporation of the unnatural amino acid p-2beF at the positions of the two Z residues. Although two possible bicyclic products could be generated via p-2beF-mediated cysteine alkylation, the structure-reactivity studies described in FIG. 9A would predict that each p-2beF would react preferentially or exclusively with its most proximal cysteine residue (i.e., p-2beF3 with Cys8 and p-2beF21 with Cys16, Table 1). Indeed, LC-MS analysis of the small molecular weight products obtained after thiol-induced splicing of purified Strep6_Z4C7C4Z(p-2beF) revealed the occurrence of the expected p-2beF3-Cys8/p-2beF21-Cys16 linked product (FIG. 36) as the only bicyclic product. A small amount of the monocyclic p-2beF3-Cys8-linked peptide was also observed. Overall, these studies demonstrate the possibility of generating precursor polypeptides with multiple Z/Cys pairs in order to obtain macrocyclic peptides featuring a polycyclic structure. Whereas this example illustrates the specific case in which two copies of the same cysteine-reactive amino acid are incorporated into the precursor polypeptide, a person skilled in the art would immediately recognize that this approach can be readily extended to the use of two different cysteine-reactive amino acids, such as those described in FIGS. 5 and 6. The ribosomal incorporation of two different cysteine-reactive unnatural amino acids into the precursor polypeptide can be achieved using methods known in the art, i.e., via suppression of two different stop codons (Wan, Huang et al. 2010) or via suppression of a stop codon and a four-base codon (Chatterjee, Sun et al. 2013; Sachdeva, Wang et al. 2014). As shown above, results from structure-reactivity studies such as those described in FIGS. 9A-B can guide the design of appropriate precursor polypeptides for the formation of a polycyclic peptide with the desired pattern of thioether linkages (i.e., through the judicious choice of spacing distances between the different Z and Cys residues).
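- The proximity argument used above to predict the thioether connectivity can be illustrated with a minimal sketch. It assumes, as a deliberate simplification, that each Z residue reacts with the closest still-unpaired cysteine, measured only by sequence distance; it does not model the orientation (Z/Cys versus Cys/Z) or the spacing preferences established by the structure-reactivity studies. The function name `pair_with_nearest_cys` and the greedy assignment order are hypothetical choices made for illustration, and the residue positions are those quoted in the text for Strep6_Z4C7C4Z.

```python
def pair_with_nearest_cys(z_positions, cys_positions):
    """Assign each Z residue to its closest unused cysteine.

    Assumption: sequence distance alone decides the pairing, and Z residues
    are processed in sequence order (a greedy simplification).
    Returns a list of (z_position, cys_position) tuples.
    """
    available = set(cys_positions)
    pairs = []
    for z in sorted(z_positions):
        if not available:
            break  # more Z residues than cysteines: leave the rest unpaired
        # Pick the cysteine with the smallest distance; ties go to the lower position.
        nearest = min(available, key=lambda c: (abs(c - z), c))
        pairs.append((z, nearest))
        available.remove(nearest)
    return pairs


# Positions quoted in the text: p-2beF at 3 and 21, Cys at 8 and 16.
pairs = pair_with_nearest_cys([3, 21], [8, 16])
for z, c in pairs:
    print(f"p-2beF{z} -> Cys{c}")
```

With the positions given in the text, this simplified rule pairs p-2beF3 with Cys8 and p-2beF21 with Cys16, which matches the connectivity of the bicyclic product reported for FIG. 36.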
- The successful formation of cyclic peptides via the ribosomal incorporation of cysteine-reactive amino acids into precursor polypeptides, as illustrated by the previous Examples, suggested that macrocyclic peptides with a polycyclic architecture could also be obtained through the use of amino acids containing more than one cysteine-reactive functional group in their side chain, i.e., using amino acids with the general formula (VI) or (VII). To illustrate this aspect, one such amino acid, ObdpY, was designed and synthesized according to Scheme 6 of FIG. 6. A suitable, orthogonal AARS/tRNA pair for the ribosomal incorporation of ObdpY in response to an amber stop codon was then identified as described in Example 3. Using ObdpY and the Mj-pOgY2-RS/MjtRNACUA Tyr pair, the precursor polypeptide corresponding to Entry 22 of Table 1 was expressed in E. coli and purified by Ni-affinity chromatography. In this protein (called Strep7_C5Z4C(ObdpY)), two cysteine residues flank the ObdpY residue, encompassing two different target peptide sequences (i.e., AYDSG (SEQ ID NO:188) and HPQF (SEQ ID NO:185)). Analysis of the small molecular weight product obtained after thiol-induced splicing of the GyrA intein revealed the occurrence of the desired bicyclic peptide product (FIG. 38). A small amount of the monocyclic peptide resulting from reaction of the ObdpY side chain with only one of the cysteine residues was also observed. Altogether, these studies demonstrate the feasibility of certain embodiments as schematically illustrated in FIGS. 37A-B. As noted above, structure-reactivity studies such as those presented in FIGS. 9A-B can guide the judicious choice of suitable Z2 residues of general formula (VI) or (VII) and of target sequence lengths in order to obtain a polycyclic peptide carrying a desired pattern of thioether linkages.
- Abbas, A., B. G. Xing, et al. (2014). Angewandte Chemie-International Edition 53(29): 7491-7494.
- Anderson, J. C., N. Wu, et al. (2004). Proc Natl Acad Sci U S A 101(20): 7566-7571.
- Bessho, Y., D. R. Hodgson, et al. (2002). Nat Biotechnol 20(7): 723-728.
- Chatterjee, A., S. B. Sun, et al. (2013). Biochemistry 52(10): 1828-1837.
- Cheng, L., T. A. Naumann, et al. (2007). Protein Sci. 16(8): 1535-1542.
- Dedkova, L. M., N. E. Fahmi, et al. (2003). Journal of the American Chemical Society 125(22): 6616-6617.
- Deiters, A. and P. G. Schultz (2005). Bioorg Med Chem Lett 15(5): 1521-1524.
- Dias, R. L. A., R. Fasan, et al. (2006). J. Am. Chem. Soc. 128(8): 2726-2732.
- Driggers, E. M., S. P. Hale, et al. (2008). Nat Rev Drug Discov 7(7): 608-624.
- Elleuche, S. and S. Poggeler (2010). Appl Microbiol Biotechnol 87(2): 479-489.
- Fairlie, D. P., J. D. A. Tyndall, et al. (2000). J. Med. Chem. 43(7): 1271-1281.
- Fekner, T. and M. K. Chan (2011). Current Opinion in Chemical Biology 15(3): 387-391.
- Frost, J. R., J. M. Smith, et al. (2013). Curr Opin Struct Biol 23(4): 571-580.
- Frost, J. R., F. Vitali, et al. (2013). Chembiochem 14(1): 147-160.
- Giebel, L. B., R. T. Cass, et al. (1995). Biochemistry 34(47): 15430-15435.
- Hamamoto, T., M. Sisido, et al. (2011). Chem Commun (Camb) 47(32): 9116-9118.
- Hartman, M. C., K. Josephson, et al. (2007). PLoS One 2(10): e972.
- Hartman, M. C., K. Josephson, et al. (2006). Proc Natl Acad Sci U S A 103(12): 4356-4361.
- Heinis, C., T. Rutherford, et al. (2009). Nat Chem Biol 5(7): 502-507.
- Henchey, L. K., J. R. Porter, et al. (2010). Chembiochem 11(15): 2104-2107.
- Horswill, A. R., S. N. Savinov, et al. (2004). Proc Natl Acad Sci U S A 101(44): 15591-15596.
- Josephson, K., M. C. Hartman, et al. (2005). J Am Chem Soc 127(33): 11727-11735.
- Katsara, M., T. Tselios, et al. (2006). Curr Med Chem 13(19): 2221-2232.
- Katz, B. A. (1995). Biochemistry 34(47): 15421-15429.
- Klabunde, T., S. Sharma, et al. (1998). Nat. Struct. Biol. 5(1): 31-36.
- Kobayashi, T., O. Nureki, et al. (2003). Nat. Struct. Biol. 10(6): 425-432.
- Kourouklis, D., H. Murakami, et al. (2005). Methods 36(3): 239-244.
- Lane, D. P. and C. W. Stephen (1993). Curr. Opin. Immunol. 5: 268-271.
- Lang, K. and J. W. Chin (2014). Chem. Rev. 114(9): 4764-4806.
- Liu, C. C. and P. G. Schultz (2010). Annu. Rev. Biochem. 79: 413-444.
- Marsault, E. and M. L. Peterson (2011). Journal of Medicinal Chemistry 54(7): 1961-2004.
- Millward, S. W., T. T. Takahashi, et al. (2005). J Am Chem Soc 127(41): 14142-14143.
- Mootz, H. D. (2009). Chembiochem 10(16): 2579-2589.
- Murakami, H., A. Ohta, et al. (2006). Nat Methods 3(5): 357-359.
- Naumann, T. A., S. N. Savinov, et al. (2005). Biotechnol Bioeng 92(7): 820-830.
- Naumann, T. A., A. Tavassoli, et al. (2008). Chembiochem 9(2): 194-197.
- Neumann, H., A. L. Slusarczyk, et al. (2010). J Am Chem Soc 132(7): 2142-2144.
- Neumann, H., K. Wang, et al. (2010). Nature 464(7287): 441-444.
- Obrecht, D., J. A. Robinson, et al. (2009). Current Medicinal Chemistry 16(1): 42-65.
- Paulus, H. (2000). Annual Review of Biochemistry 69: 447-496.
- Perler, F. B. (2005). IUBMB Life 57(7): 469-476.
- Rezai, T., J. E. Bock, et al. (2006). Journal of the American Chemical Society 128(43): 14073-14080.
- Rezai, T., B. Yu, et al. (2006). Journal of the American Chemical Society 128(8): 2510-2511.
- Rodriguez, E. A., H. A. Lester, et al. (2006). Proc Natl Acad Sci U S A 103(23): 8650-8655.
- Sachdeva, A., K. Wang, et al. (2014). Journal of the American Chemical Society 136(22): 7785-7788.
- Schlippe, Y. V., M. C. Hartman, et al. (2012). J Am Chem Soc 134(25): 10469-10477.
- Scott, C. P., E. Abel-Santos, et al. (2001). Chem Biol 8(8): 801-815.
- Scott, C. P., E. Abel-Santos, et al. (1999). Proc Natl Acad Sci U S A 96(24): 13638-13643.
- Seebeck, F. P. and J. W. Szostak (2006). J Am Chem Soc 128(22): 7150-7151.
- Sidhu, S. S., H. B. Lowman, et al. (2000). Methods Enzymol. 328: 333-363.
- Smith, J. M., J. R. Frost, et al. (2013). J Org Chem 78(8): 3525-3531.
- Smith, J. M., F. Vitali, et al. (2011). Angew Chem Int Ed 50(22): 5075-5080.
- Tang, Y. Q., J. Yuan, et al. (1999). Science 286(5439): 498-502.
- Tavassoli, A. and S. J. Benkovic (2005). Angew Chem Int Ed Engl 44(18): 2760-2763.
- Tavassoli, A. and S. J. Benkovic (2007). Nat. Protoc. 2(5): 1126-1133.
- Tavassoli, A., Q. Lu, et al. (2008). ACS Chem Biol 3(12): 757-764.
- Touati, J., A. Angelini, et al. (2011). Chembiochem 12(1): 38-42.
- Walensky, L. D., A. L. Kung, et al. (2004). Science 305(5689): 1466-1470.
- Wan, W., Y. Huang, et al. (2010). Angew Chem Int Ed. 49(18): 3211-3214.
- Wang, D., W. Liao, et al. (2005). Angew Chem Int Ed Engl 44(40): 6525-6529.
- Wang, L., J. Xie, et al. (2006). Annu Rev Biophys Biomol Struct 35: 225-249.
- White, C. J. and A. K. Yudin (2011). Nat Chem 3(7): 509-524.
- Wu, X. and P. G. Schultz (2009). J Am Chem Soc 131(35): 12497-12515.
- Xu, M. Q. and T. C. Evans, Jr. (2005). Curr Opin Biotechnol 16(4): 440-446.
- Xu, M. Q. and F. B. Perler (1996). Embo Journal 15(19): 5146-5153.
- Young, D. D., T. S. Young, et al. (2011). Biochemistry 50(11): 1894-1900.
- Young, T. S., D. D. Young, et al. (2011). Proc Natl Acad Sci U S A 108(27): 11052-11056.
- It will be appreciated that variants of the above-disclosed and other features and functions, or alternatives thereof, may be combined into many other different systems or applications.
- Where exemplary embodiments are described with reference to a certain number of elements it will be understood that the exemplary embodiments can be practiced utilizing either less than or more than the certain number of elements.
- <110> University of Rochester Fasan, Rudi
- <120> METHODS AND COMPOSITIONS FOR RIBOSOMAL SYNTHESIS OF MACROCYCLIC PEPTIDES
- <130> 1625_019PCT
- <150>
US 61/920181
<151> 2013-12-23 - <160> 189
- <170> PatentIn version 3.5
- <210> 1
<211> 198
<212> PRT
<213> Mycobacterium xenopi - <400> 1
- <210> 2
<211> 154
<212> PRT
<213> Artificial Sequence - <220>
<223> synthetic or artificial sequence - <400> 2
- <210> 3
<211> 182
<212> PRT
<213> Halobacterium sp. NRC1 - <400> 3
- <210> 4
<211> 456
<212> PRT
<213> Chlamydomonas eugametos - <400> 4
- <210> 5
<211> 360
<212> PRT
<213> Thermococcus aggregans - <400> 5
- <210> 6
<211> 360
<212> PRT
<213> Thermococcus fumicolans - <400> 6
- <210> 7
<211> 360
<212> PRT
<213> Thermococcus kodakaraensis - <400> 7
- <210> 8
<211> 537
<212> PRT
<213> Pyrococcus sp. GBD - <400> 8
- <210> 9
<211> 538
<212> PRT
<213> Thermococcus aggregans - <220>
<221> misc_feature
<222> (264)..(264)
<223> Xaa can be any naturally occurring amino acid - <220>
<221> misc_feature
<222> (269)..(269)
<223> Xaa can be any naturally occurring amino acid - <400> 9
- <210> 10
<211> 537
<212> PRT
<213> Thermococcus hydrothermalis - <400> 10
- <210> 11
<211> 536
<212> PRT
<213> Thermococcus kodakaraensis - <400> 11
- <210> 12
<211> 538
<212> PRT
<213> Thermococcus litoralis - <400> 12
- <210> 13
<211> 537
<212> PRT
<213> Thermococcus marinus - <400> 13
- <210> 14
<211> 535
<212> PRT
<213> Thermococcus species GE8 - <400> 14
- <210> 15
<211> 536
<212> PRT
<213> Thermococcus thioreducens - <400> 15
- <210> 16
<211> 157
<212> PRT
<213> Thermococcus aggregans - <400> 16
- <210> 17
<211> 389
<212> PRT
<213> Thermococcus fumicolans - <400> 17
- <210> 18
<211> 389
<212> PRT
<213> Thermococcus hydrothermalis - <400> 18
- <210> 19
<211> 390
<212> PRT
<213> Thermococcus litoralis - <400> 19
- <210> 20
<211> 389
<212> PRT
<213> Thermococcus species GE8 - <400> 20
- <210> 21
<211> 184
<212> PRT
<213> Pyrococcus abyssi - <400> 21
- <210> 22
<211> 416
<212> PRT
<213> Mycobacterium tuberculosis - <400> 22
- <210> 23
<211> 416
<212> PRT
<213> Mycobacterium tuberculosis - <400> 23
- <210> 24
<211> 428
<212> PRT
<213> Rhodothermus marinus - <400> 24
- <210> 25
<211> 1365
<212> PRT
<213> Trichodesmium erythraeum - <400> 25
- <210> 26
<211> 435
<212> PRT
<213> Synechocystis sp. PCC6803 - <400> 26
- <210> 27
<211> 421
<212> PRT
<213> Mycobacterium flavescens - <400> 27
- <210> 28
<211> 420
<212> PRT
<213> Mycobacterium gordonae - <400> 28
- <210> 29
<211> 420
<212> PRT
<213> Mycobacterium kansasii - <400> 29
- <210> 30
<211> 420
<212> PRT
<213> Mycobacterium leprae - <400> 30
- <210> 31
<211> 420
<212> PRT
<213> Mycobacterium malmoense - <400> 31
- <210> 32
<211> 430
<212> PRT
<213> Synechocystis sp. PCC6803 - <400> 32
- <210> 33
<211> 333
<212> PRT
<213> Pyrococcus abyssi - <400> 33
- <210> 34
<211> 412
<212> PRT
<213> Methanococcus jannaschii - <400> 34
- <210> 35
<211> 819
<212> PRT
<213> Aspergillus fumigatus - <400> 35
- <210> 36
<211> 605
<212> PRT
<213> Aspergillus nidulans FGSC A - <400> 36
- <210> 37
<211> 171
<212> PRT
<213> Cryptococcus neoformans - <400> 37
- <210> 38
<211> 534
<212> PRT
<213> Histoplasma capsulatum - <400> 38
- <210> 39
<211> 157
<212> PRT
<213> Penicillium chrysogenum - <400> 39
- <210> 40
<211> 162
<212> PRT
<213> Penicillium expansum - <400> 40
- <210> 41
<211> 161
<212> PRT
<213> Penicillium vulpinum - <400> 41
- <210> 42
<211> 440
<212> PRT
<213> Mycobacterium tuberculosis - <400> 42
- <210> 43
<211> 440
<212> PRT
<213> Mycobacterium tuberculosis - <400> 43
- <210> 44
<211> 364
<212> PRT
<213> Mycobacterium flavescens - <400> 44
- <210> 45
<211> 365
<212> PRT
<213> Mycobacterium leprae - <400> 45
- <210> 46
<211> 407
<212> PRT
<213> Nostoc sp. PCC7120 - <400> 46
- <210> 47
<211> 394
<212> PRT
<213> Trichodesmium erythraeum - <400> 47
- <210> 48
<211> 399
<212> PRT
<213> Pyrococcus abyssi - <400> 48
- <210> 49
<211> 454
<212> PRT
<213> Pyrococcus furiosus - <400> 49
- <210> 50
<211> 345
<212> PRT
<213> Carboxydothermus hydrogenoformans - <400> 50
- <210> 51
<211> 134
<212> PRT
<213> Methanothermobacter thermautotrophicus - <400> 51
- <210> 52
<211> 382
<212> PRT
<213> Pyrococcus abyssi - <400> 52
- <210> 53
<211> 382
<212> PRT
<213> Pyrococcus furiosus - <400> 53
- <210> 54
<211> 373
<212> PRT
<213> Trichodesmium erythraeum - <400> 54
- <210> 55
<211> 381
<212> PRT
<213> Trichodesmium erythraeum - <400> 55
- <210> 56
<211> 339
<212> PRT
<213> Chilo iridescent virus - <400> 56
- <210> 57
<211> 471
<212> PRT
<213> Candida tropicalis - <400> 57
- <210> 58
<211> 454
<212> PRT
<213> Saccharomyces cerevisiae - <400> 58
- <210> 59
<211> 173
<212> PRT
<213> Thermoplasma acidophilum - <400> 59
- <210> 60
<211> 429
<212> PRT
<213> Synechocystis sp. PCC6803 - <400> 60
- <210> 61
<211> 123
<212> PRT
<213> Synechocystis sp. PCC6803 - <400> 61
- <210> 62
<211> 36
<212> PRT
<213> Synechocystis sp. PCC6803 - <400> 62
- <210> 63
<211> 98
<212> PRT
<213> Nanoarchaeum equitans Kin4-M - <400> 63
- <210> 64
<211> 30
<212> PRT
<213> Nanoarchaeum equitans Kin4-M - <400> 64
- <210> 65
<211> 102
<212> PRT
<213> Anabaena sp. PCC7120 - <400> 65
- <210> 66
<211> 36
<212> PRT
<213> Anabaena sp. PCC7120 - <400> 66
- <210> 67
<211> 102
<212> PRT
<213> Nostoc punctiforme PCC73102 - <400> 67
- <210> 68
<211> 36
<212> PRT
<213> Nostoc punctiforme PCC73102 - <400> 68
- <210> 69
<211> 102
<212> PRT
<213> Nostoc sp. PCC7120 - <400> 69
- <210> 70
<211> 36
<212> PRT
<213> Nostoc sp. PCC7120 - <400> 70
- <210> 71
<211> 112
<212> PRT
<213> Oscillatoria limnetica - <400> 71
- <210> 72
<211> 36
<212> PRT
<213> Oscillatoria limnetica - <400> 72
- <210> 73
<211> 107
<212> PRT
<213> Synechocystis sp. PCC 7002 - <400> 73
- <210> 74
<211> 36
<212> PRT
<213> Synechocystis sp. PCC 7002 - <400> 74
- <210> 75
<211> 117
<212> PRT
<213> Thermosynechococcus vulcanus - <400> 75
- <210> 76
<211> 35
<212> PRT
<213> Thermosynechococcus vulcanus - <400> 76
- <210> 77
<211> 306
<212> PRT
<213> Methanococcus jannaschii - <400> 77
- <210> 78
<211> 454
<212> PRT
<213> Methanosarcina mazeii - <400> 78
- <210> 79
<211> 419
<212> PRT
<213> Methanosarcina barkeri - <400> 79
- <210> 80
<211> 424
<212> PRT
<213> Escherichia coli - <400> 80
- <210> 81
<211> 306
<212> PRT
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 81
- <210> 82
<211> 306
<212> PRT
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 82
- <210> 83
<211> 306
<212> PRT
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 83
- <210> 84
<211> 306
<212> PRT
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 84
- <210> 85
<211> 306
<212> PRT
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 85
- <210> 86
<211> 306
<212> PRT
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 86
- <210> 87
<211> 306
<212> PRT
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 87
- <210> 88
<211> 306
<212> PRT
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 88
- <210> 89
<211> 306
<212> PRT
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 89
- <210> 90
<211> 306
<212> PRT
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 90
- <210> 91
<211> 454
<212> PRT
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 91
- <210> 92
<211> 454
<212> PRT
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 92
- <210> 93
<211> 419
<212> PRT
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 93
- <210> 94
<211> 419
<212> PRT
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 94
- <210> 95
<211> 419
<212> PRT
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 95
- <210> 96
<211> 419
<212> PRT
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 96
- <210> 97
<211> 424
<212> PRT
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 97
- <210> 98
<211> 424
<212> PRT
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 98
- <210> 99
<211> 424
<212> PRT
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 99
- <210> 100
<211> 424
<212> PRT
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 100
- <210> 101
<211> 77
<212> DNA
<213> Artificial Sequence - <220>
<223> Engineered sequence. - <400> 101
- <210> 102
<211> 77
<212> DNA
<213> Artificial Sequence - <220>
<223> Engineered sequence. - <400> 102
- <210> 103
<211> 77
<212> DNA
<213> Artificial Sequence - <220>
<223> Engineered sequence. - <400> 103
- <210> 104
<211> 78
<212> DNA
<213> Artificial Sequence - <220>
<223> Engineered sequence. - <400> 104
- <210> 105
<211> 72
<212> DNA
<213> Artificial Sequence - <220>
<223> Engineered sequence. - <400> 105
- <210> 106
<211> 72
<212> DNA
<213> Artificial Sequence - <220>
<223> Engineered sequence. - <400> 106
- <210> 107
<211> 72
<212> DNA
<213> Artificial Sequence - <220>
<223> Engineered sequence. - <400> 107
- <210> 108
<211> 73
<212> DNA
<213> Artificial Sequence - <220>
<223> Engineered sequence. - <400> 108
- <210> 109
<211> 72
<212> DNA
<213> Artificial Sequence - <220>
<223> Engineered sequence. - <400> 109
- <210> 110
<211> 72
<212> DNA
<213> Artificial Sequence - <220>
<223> Engineered sequence. - <400> 110
- <210> 111
<211> 72
<212> DNA
<213> Artificial Sequence - <220>
<223> Engineered sequence. - <400> 111
- <210> 112
<211> 73
<212> DNA
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 112
- <210> 113
<211> 85
<212> DNA
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 113
- <210> 114
<211> 85
<212> DNA
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 114
- <210> 115
<211> 85
<212> DNA
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 115
- <210> 116
<211> 86
<212> DNA
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 116
- <210> 117
<211> 85
<212> DNA
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 117
- <210> 118
<211> 85
<212> DNA
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 118
- <210> 119
<211> 85
<212> DNA
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 119
- <210> 120
<211> 86
<212> DNA
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 120
- <210> 121
<211> 5
<212> PRT
<213> Artificial Sequence - <220>
<223> synthetic or artificial sequence - <400> 121
- <210> 122
<211> 6
<212> PRT
<213> Artificial Sequence - <220>
<223> synthetic or artificial sequence - <400> 122
- <210> 123
<211> 19
<212> PRT
<213> Artificial Sequence - <220>
<223> synthetic or artificial sequence - <400> 123
- <210> 124
<211> 8
<212> PRT
<213> Artificial Sequence - <220>
<223> synthetic or artificial sequence - <400> 124
- <210> 125
<211> 8
<212> PRT
<213> Artificial Sequence - <220>
<223> synthetic or artificial sequence - <400> 125
- <210> 126
<211> 10
<212> PRT
<213> Artificial Sequence - <220>
<223> synthetic or artificial sequence - <400> 126
- <210> 127
<211> 15
<212> PRT
<213> Artificial Sequence - <220>
<223> synthetic or artificial sequence - <400> 127
- <210> 128
<211> 26
<212> PRT
<213> Artificial Sequence - <220>
<223> synthetic or artificial sequence - <400> 128
- <210> 129
<211> 38
<212> PRT
<213> Artificial Sequence - <220>
<223> synthetic or artificial sequence - <400> 129
- <210> 130
<211> 71
<212> PRT
<213> Bacillus circulans - <400> 130
- <210> 131
<211> 220
<212> PRT
<213> Artificial Sequence - <220>
<223> synthetic or artificial sequence - <400> 131
- <210> 132
<211> 368
<212> PRT
<213> Artificial Sequence - <220>
<223> synthetic or artificial sequence - <400> 132
- <210> 133
<211> 128
<212> PRT
<213> Artificial Sequence - <220>
<223> synthetic or artificial sequence - <400> 133
- <210> 134
<211> 239
<212> PRT
<213> Artificial Sequence - <220>
<223> synthetic or artificial sequence - <400> 134
- <210> 135
<211> 550
<212> PRT
<213> Artificial Sequence - <220>
<223> synthetic or artificial sequence - <400> 135
- <210> 136
<211> 524
<212> PRT
<213> Bos taurus - <400> 136
- <210> 137
<211> 112
<212> PRT
<213> phage M13 - <400> 137
- <210> 138
<211> 345
<212> PRT
<213> phage T7 - <400> 138
- <210> 139
<211> 398
<212> PRT
<213> phage T7 - <400> 139
- <210> 140
<211> 29
<212> PRT
<213> Escherichia coli - <400> 140
- <210> 141
<211> 190
<212> PRT
<213> Escherichia coli - <400> 141
- <210> 142
<211> 448
<212> PRT
<213> Escherichia coli - <400> 142
- <210> 143
<211> 122
<212> PRT
<213> Escherichia coli - <400> 143
- <210> 144
<211> 182
<212> PRT
<213> Escherichia coli - <400> 144
- <210> 145
<211> 948
<212> PRT
<213> Escherichia coli - <400> 145
- <210> 146
<211> 87
<212> PRT
<213> Saccharomyces cerevisiae - <400> 146
- <210> 147
<211> 1537
<212> PRT
<213> Saccharomyces cerevisiae - <400> 147
- <210> 148
<211> 335
<212> PRT
<213> Homo sapiens - <400> 148
- <210> 149
<211> 18
<212> PRT
<213> phage M13 - <400> 149
- <210> 150
<211> 23
<212> PRT
<213> phage M13 - <400> 150
- <210> 151
<211> 112
<212> PRT
<213> phage M13 - <400> 151
- <210> 152
<211> 207
<212> PRT
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 152
- <210> 153
<211> 207
<212> PRT
<213> Artificial Sequence - <220>
<223> Engineered sequence - <400> 153
- <210> 154
<211> 407
<212> PRT
<213> phage M13 - <400> 154
- <210> 155
<211> 50
<212> PRT
<213> phage M13 - <400> 155
- <210> 156
<211> 285
<212> PRT
<213> Escherichia coli - <400> 156
- <210> 157
<211> 725
<212> PRT
<213> Saccharomyces cerevisiae - <400> 157
- <210> 158
<211> 761
<212> PRT
<213> phage P2 - <400> 158
- <210> 159
<211> 216
<212> PRT
<213> Artificial Sequence - <220>
<223> Synthetic or artificial sequence - <400> 159
- <210> 160
<211> 216
<212> PRT
<213> Artificial Sequence - <220>
<223> Synthetic or artificial sequence - <400> 160
- <210> 161
<211> 216
<212> PRT
<213> Artificial Sequence - <220>
<223> Synthetic or artificial sequence - <400> 161
- <210> 162
<211> 216
<212> PRT
<213> Artificial Sequence - <220>
<223> Synthetic or artificial sequence - <400> 162
- <210> 163
<211> 216
<212> PRT
<213> Artificial Sequence - <220>
<223> Synthetic or artificial sequence - <400> 163
- <210> 164
<211> 216
<212> PRT
<213> Artificial Sequence - <220>
<223> Synthetic or artificial sequence - <400> 164
- <210> 165
<211> 216
<212> PRT
<213> Artificial Sequence - <220>
<223> Synthetic or artificial sequence - <400> 165
- <210> 166
<211> 218
<212> PRT
<213> Artificial Sequence - <220>
<223> Synthetic or artificial sequence - <400> 166
- <210> 167
<211> 220
<212> PRT
<213> Artificial Sequence - <220>
<223> Synthetic or artificial sequence - <400> 167
- <210> 168
<211> 10
<212> PRT
<213> Artificial Sequence - <220>
<223> Synthetic or artificial sequence - <400> 168
- <210> 169
<211> 206
<212> PRT
<213> Artificial Sequence - <220>
<223> Synthetic or artificial sequence - <400> 169
- <210> 170
<211> 10
<212> PRT
<213> Artificial Sequence - <220>
<223> Synthetic or artificial sequence - <400> 170
- <210> 171
<211> 10
<212> PRT
<213> Artificial Sequence - <220>
<223> Synthetic or artificial sequence - <400> 171
- <210> 172
<211> 213
<212> PRT
<213> Artificial Sequence - <220>
<223> Synthetic or artificial sequence - <400> 172
- <210> 173
<211> 215
<212> PRT
<213> Artificial Sequence - <220>
<223> Synthetic or artificial sequence - <400> 173
- <210> 174
<211> 218
<212> PRT
<213> Artificial Sequence - <220>
<223> Synthetic or artificial sequence - <400> 174
- <210> 175
<211> 37
<212> PRT
<213> Artificial Sequence - <220>
<223> Synthetic or artificial sequence - <400> 175
- <210> 176
<211> 186
<212> PRT
<213> Artificial Sequence - <220>
<223> Synthetic or artificial sequence - <400> 176
- <210> 177
<211> 37
<212> PRT
<213> Artificial Sequence - <220>
<223> Synthetic or artificial sequence - <400> 177
- <210> 178
<211> 186
<212> PRT
<213> Artificial Sequence - <220>
<223> Synthetic or artificial sequence - <400> 178
- <210> 179
<211> 186
<212> PRT
<213> Artificial Sequence - <220>
<223> Synthetic or artificial sequence - <400> 179
- <210> 180
<211> 190
<212> PRT
<213> Artificial Sequence - <220>
<223> Synthetic or artificial sequence - <400> 180
- <210> 181
<211> 192
<212> PRT
<213> Artificial Sequence - <220>
<223> Synthetic or artificial sequence - <400> 181
- <210> 182
<211> 17
<212> PRT
<213> Artificial Sequence - <220>
<223> Synthetic or artificial sequence - <400> 182
- <210> 183
<211> 8
<212> PRT
<213> Artificial Sequence - <220>
<223> Synthetic or artificial sequence - <400> 183
- <210> 184
<211> 213
<212> PRT
<213> Artificial Sequence - <220>
<223> Synthetic or artificial sequence - <400> 184
- <210> 185
<211> 4
<212> PRT
<213> Artificial Sequence - <220>
<223> Synthetic or artificial sequence - <400> 185
- <210> 186
<211> 4
<212> PRT
<213> Artificial Sequence - <220>
<223> Synthetic or artificial sequence - <400> 186
- <210> 187
<211> 7
<212> PRT
<213> Artificial Sequence - <220>
<223> Synthetic or artificial sequence - <400> 187
- <210> 188
<211> 5
<212> PRT
<213> Artificial Sequence - <220>
<223> Synthetic or artificial sequence - <400> 188
- <210> 189
<211> 10
<212> PRT
<213> Artificial Sequence - <220>
<223> synthetic or artificial sequence - <400> 189
Claims (11)
- A method for making a macrocyclic peptide, the method comprising: (a) providing an artificial nucleic acid molecule encoding for a polypeptide of structure: (AA)m-Z-(AA)n-Cys-(AA)p (I)
or
(AA)m-Cys-(AA)n-Z-(AA)p (II)
wherein: (i) (AA)m is an N-terminal amino acid or peptide sequence, (ii) Z is a non-canonical amino acid carrying a side-chain functional group FG1, FG1 being a functional group selected from the group consisting of -(CH2)nX, where X is F, Cl, Br, or I and n is an integer number from 1 to 10; -C(O)CH2X, where X is F, Cl, Br, or I; -CH(R')X, where X is F, Cl, Br, or I; -C(O)CH(R')X, where X is F, Cl, Br, or I; -OCH2CH2X, where X is F, Cl, Br, or I; -C(O)CH=C=C(R')(R"); -SO2C(R')=C(R')(R"); -C(O)C(R')=C(R')(R"); -C(R')=C(R')C(O)OR'; -C(R')=C(R')C(O)N(R')(R"); -C(R')=C(R')-CN; -C(R')=C(R')-NO2; -C≡C-C(O)OR'; -C≡C-C(O)N(R')(R"); unsubstituted or substituted oxirane; unsubstituted or substituted aziridine; 1,2-oxathiolane 2,2-dioxide; 4-fluoro-1,2-oxathiolane 2,2-dioxide; and 4,4-difluoro-1,2-oxathiolane 2,2-dioxide, where each R' and R" is independently H, an aliphatic, a substituted aliphatic, an aryl, or a substituted aryl group. (iii) (AA)n is a target peptide sequence, (iv) (AA)p is a C-terminal amino acid or peptide sequence; (b) introducing the nucleic acid molecule into an expression system, wherein the expression system is selected from the group consisting of a prokaryotic cell or a eukaryotic cell, wherein the eukaryotic cell is not a human germ cell or a human embryo; and wherein either the expression system comprises: an aminoacyl-tRNA synthetase polypeptide or an engineered variant thereof that is at least 90% identical to SEQ ID NO:77, 78, 79, or 80; and a transfer RNA molecule encoded by a polynucleotide that is at least 90% identical to SEQ ID NO:101, 105, 109, 113, or 117, or the expression system comprises: an aminoacyl-tRNA synthetase selected from the group consisting of SEQ ID NOs. 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99 and 100; and a transfer RNA molecule encoded by a polynucleotide selected from the group consisting of SEQ ID NO:101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, and 120; and expressing the nucleic acid molecule in the expression system, thereby producing the polypeptide; and (c) allowing the functional group FG1 to react with the side-chain sulfhydryl group (-SH) of the cysteine (Cys) residue(s), thereby producing the macrocyclic peptide.
- The method of claim 1 wherein Z is an amino acid of structure: C(R')=C(R')C(O)N(R')(R"); -C(R')=C(R')-CN; -C(R')=C(R')-NO2; -C≡C-C(O)OR'; -C≡C-C(O)N(R')(R"); unsubstituted or substituted oxirane, unsubstituted or substituted aziridine; 1,2-oxathiolane 2,2-dioxide; 4-fluoro-1,2-oxathiolane 2,2-dioxide; and 4,4-difluoro-1,2-oxathiolane 2,2-dioxide; where each R' and R" is independently H, an aliphatic, a substituted aliphatic, an aryl, or a substituted aryl group; wherein Y is a linker group selected from the group consisting of aliphatic, aryl, substituted aliphatic, substituted aryl, heteroatom-containing aliphatic, heteroatom-containing aryl, substituted heteroatom-containing aliphatic, substituted heteroatom-containing aryl, alkoxy, and aryloxy groups.
- The method of claim 1 wherein the amino acid Z is selected from the group consisting of 4-(2-bromoethoxy)-phenylalanine, 3-(2-bromoethoxy)-phenylalanine, 4-(2-chloroethoxy)-phenylalanine, 3-(2-chloroethoxy)-phenylalanine, 4-(1-bromoethyl)-phenylalanine, 3-(1-bromoethyl)-phenylalanine, 4-(aziridin-1-yl)-phenylalanine, 3-(aziridin-1-yl)-phenylalanine, 4-acrylamido-phenylalanine, 3-acrylamido-phenylalanine, 4-(2-fluoro-acetamido)-phenylalanine, 3-(2-fluoro-acetamido)-phenylalanine, 4-(2-chloro-acetamido)-phenylalanine, 3-(2-chloro-acetamido)-phenylalanine, 3-(2-fluoro-acetyl)-phenylalanine, 4-(2-fluoro-acetyl)-phenylalanine, Nε-((2-bromoethoxy)carbonyl)-lysine, Nε-((2-chloroethoxy)carbonyl)-lysine, Nε-(buta-2,3-dienoyl)-lysine, Nε-acryl-lysine, Nε-crotonyl-lysine, Nε-(2-fluoro-acetyl)-lysine, and Nε-(2-chloro-acetyl)-lysine.
- The method of claim 1 wherein the codon encoding for Z is an amber stop codon TAG, an ochre stop codon TAA, an opal stop codon TGA, or a four base codon.
- The method of claim 1, wherein the N-terminal tail polypeptide, (AA)m, or the C-terminal tail polypeptide, (AA)p, or both, of the precursor polypeptides of formula (I) or (II) comprise(s):
a polypeptide affinity tag, a DNA-binding polypeptide, a protein-binding polypeptide, an enzyme, a fluorescent protein, an intein protein, or a combination thereof;
or wherein the N-terminal tail polypeptide, (AA)m, of the precursor polypeptide of formula (I) or (II) comprises the C-domain of a split intein, and the C-terminal tail polypeptide, (AA)p, comprises the corresponding N-domain of the split intein;
or wherein any of polypeptides (AA)n, (AA)m, or (AA)p, is fully or partially genetically randomized so that a plurality of macrocyclic peptides is obtained upon a thioether bond-forming reaction between the cysteine (Cys) residue and the side-chain functional group FG1 in Z.
- The method of claim 1 wherein the prokaryotic cell is Escherichia coli and wherein the eukaryotic cell is a yeast, a mammalian, an insect or a plant cell.
- The method of claim 1 comprising:
fully or partially randomizing any of polypeptides (AA)n, (AA)m, or (AA)p, wherein, upon a thioether bond-forming reaction between the cysteine (Cys) residue and the side-chain functional group FG1 in Z, a plurality of macrocyclic peptides is produced.
- A recombinant host cell comprising an artificial nucleic acid encoding for a polypeptide of structure:
(AA)m-Z-(AA)n-Cys-(AA)p (I)
or
(AA)m-Cys-(AA)n-Z-(AA)p (II)
wherein:
(i) (AA)m is an N-terminal amino acid or peptide sequence,
(ii) Z is an amino acid of structure: -(CH2)nX, where X is F, Cl, Br, or I and n is an integer number from 1 to 10; -C(O)CH2X, where X is F, Cl, Br, or I; -CH(R')X, where X is F, Cl, Br, or I; -C(O)CH(R')X, where X is F, Cl, Br, or I; -OCH2CH2X, where X is F, Cl, Br, or I; -C(O)CH=C=C(R')(R"); -SO2C(R')=C(R')(R"); -C(O)C(R')=C(R')(R"); -C(R')=C(R')C(O)OR'; -C(R')=C(R')C(O)N(R')(R"); -C(R')=C(R')-CN; -C(R')=C(R')-NO2; -C≡C-C(O)OR'; -C≡C-C(O)N(R')(R"); unsubstituted or substituted oxirane; unsubstituted or substituted aziridine; 1,2-oxathiolane 2,2-dioxide; 4-fluoro-1,2-oxathiolane 2,2-dioxide; and 4,4-difluoro-1,2-oxathiolane 2,2-dioxide; where each R' and R" is independently H, an aliphatic, a substituted aliphatic, an aryl, or a substituted aryl group; wherein Y is a linker group selected from the group consisting of aliphatic, aryl, substituted aliphatic, substituted aryl, heteroatom-containing aliphatic, heteroatom-containing aryl, substituted heteroatom-containing aliphatic, substituted heteroatom-containing aryl, alkoxy, and aryloxy groups,
(iii) (AA)n is a target peptide sequence,
(iv) (AA)p is a C-terminal amino acid or peptide sequence;
and comprising an expression system comprising either:
an aminoacyl-tRNA synthetase polypeptide or an engineered variant thereof that is at least 90% identical to SEQ ID NO:77, 78, 79, or 80; and a transfer RNA molecule encoded by a polynucleotide that is at least 90% identical to SEQ ID NO:101, 105, 109, 113, or 117, or
an aminoacyl-tRNA synthetase selected from the group consisting of SEQ ID NOs. 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99 and 100; and a transfer RNA molecule encoded by a polynucleotide selected from the group consisting of SEQ ID NO:101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, and 120.
- The cell of claim 8, wherein the amino acid Z is selected from the group consisting of 4-(2-bromoethoxy)-phenylalanine, 3-(2-bromoethoxy)-phenylalanine, 4-(2-chloroethoxy)-phenylalanine, 3-(2-chloroethoxy)-phenylalanine, 4-(1-bromoethyl)-phenylalanine, 3-(1-bromoethyl)-phenylalanine, 4-(aziridin-1-yl)-phenylalanine, 3-(aziridin-1-yl)-phenylalanine, 4-acrylamido-phenylalanine, 3-acrylamido-phenylalanine, 4-(2-fluoro-acetamido)-phenylalanine, 3-(2-fluoro-acetamido)-phenylalanine, 4-(2-chloro-acetamido)-phenylalanine, 3-(2-chloro-acetamido)-phenylalanine, 3-(2-fluoro-acetyl)-phenylalanine, 4-(2-fluoro-acetyl)-phenylalanine, Nε-((2-bromoethoxy)carbonyl)-lysine, Nε-((2-chloroethoxy)carbonyl)-lysine, Nε-(buta-2,3-dienoyl)-lysine, Nε-acryl-lysine, Nε-crotonyl-lysine, Nε-(2-fluoro-acetyl)-lysine, and Nε-(2-chloro-acetyl)-lysine.
- The cell of claim 8, wherein the polypeptide comprised within the N-terminal tail polypeptide, (AA)m, or the C-terminal tail polypeptide, (AA)p, or both, of the precursor polypeptides of formula (I) and (II), is a polypeptide selected from the group of polypeptides consisting of SEQ ID NOs 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, and 158.
- The cell of claim 8, wherein the N-terminal tail polypeptide, (AA)m, or the C-terminal tail polypeptide, (AA)p, or both, in the precursor polypeptides of formula (I) or formula (II) comprise(s) an intein selected from the group consisting of a naturally occurring intein, an engineered variant of a naturally occurring intein, a fusion of the N-terminal and C-terminal fragments of a naturally occurring split intein and a fusion of the N-terminal and C-terminal fragments of an engineered split intein;
or wherein the N-terminal tail polypeptide, (AA)m, comprises the C-domain of a naturally occurring split intein, or of an engineered variant thereof, and the C-terminal tail polypeptide, (AA)p, comprises the N-domain of said split intein.
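The precursor architectures of formula (I) and formula (II), with Z encoded by an amber stop codon TAG, can be illustrated with a short Python sketch. It is not part of the claims. The function names `translate_with_amber_suppression` and `classify_precursor` and the example sequence are hypothetical, and pairing Z with the nearest cysteine is a simplifying assumption made only for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_with_amber_suppression(dna):
    """Translate a coding sequence, reading the amber codon TAG as 'Z'.

    Assumption: only TAG is suppressed; ochre (TAA) and opal (TGA)
    still terminate translation in this sketch.
    """
    dna = dna.upper()
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if codon == "TAG":
            peptide.append("Z")  # non-canonical amino acid
            continue
        aa = CODON_TABLE[codon]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


def classify_precursor(peptide):
    """Return (formula, n) for the first Z and its nearest Cys, or None.

    Formula "I":  (AA)m-Z-(AA)n-Cys-(AA)p  (Cys downstream of Z)
    Formula "II": (AA)m-Cys-(AA)n-Z-(AA)p  (Cys upstream of Z)
    n is the number of residues between Z and Cys.
    """
    z = peptide.find("Z")
    if z < 0:
        return None
    downstream = peptide.find("C", z + 1)
    if downstream >= 0:
        return ("I", downstream - z - 1)
    upstream = peptide.rfind("C", 0, z)
    if upstream >= 0:
        return ("II", z - upstream - 1)
    return None


if __name__ == "__main__":
    gene = "ATGGGTTAGGCTTCTACCTGCGGTTAA"  # hypothetical example sequence
    precursor = translate_with_amber_suppression(gene)
    print(precursor, classify_precursor(precursor))
```

In this sketch a downstream cysteine is preferred over an upstream one, so a sequence carrying cysteines on both sides of Z is reported as formula (I).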
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP22174588.8A EP4122945A1 (en) | 2013-12-23 | 2014-12-23 | Methods and compositions for ribosomal synthesis of macrocyclic peptides |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201361920181P | 2013-12-23 | 2013-12-23 | |
PCT/US2014/072016 WO2015100277A2 (en) | 2013-12-23 | 2014-12-23 | Methods and compositions for ribosomal synthesis of macrocyclic peptides |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP22174588.8A Division EP4122945A1 (en) | 2013-12-23 | 2014-12-23 | Methods and compositions for ribosomal synthesis of macrocyclic peptides |
Publications (3)
Publication Number | Publication Date |
---|---|
EP3087088A2 (en) | 2016-11-02 |
EP3087088A4 (en) | 2017-08-02 |
EP3087088B1 (en) | 2022-05-25 |
Family
ID=53479781
Family Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP22174588.8A Pending EP4122945A1 (en) | 2013-12-23 | 2014-12-23 | Methods and compositions for ribosomal synthesis of macrocyclic peptides |
EP14874141.6A Active EP3087088B1 (en) | 2013-12-23 | 2014-12-23 | Methods and compositions for ribosomal synthesis of macrocyclic peptides |
Family Applications Before (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP22174588.8A Pending EP4122945A1 (en) | 2013-12-23 | 2014-12-23 | Methods and compositions for ribosomal synthesis of macrocyclic peptides |
Country Status (3)
Country | Link |
---|---|
US (1) | US10544191B2 (en) |
EP (2) | EP4122945A1 (en) |
WO (1) | WO2015100277A2 (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106929482A (en) * | 2015-12-31 | 2017-07-07 | 北京大学 | Influenza virus, its live vaccine of rite-directed mutagenesis and its preparation method and application |
CN107177593B (en) | 2016-03-10 | 2020-10-23 | 北京大学 | Read-through of truncated proteins in early stop codon diseases using optimized gene codon expansion systems |
US20230139680A1 (en) * | 2019-05-20 | 2023-05-04 | The Texas A&M University System | A genetically encoded, phage-displayed cyclic peptide library and methods of making the same |
US20200385724A1 (en) * | 2019-06-07 | 2020-12-10 | Massachusetts Institute Of Technology | FRAMESHIFT SUPPRESSOR tRNA COMPOSITIONS AND METHODS OF USE |
WO2023173084A1 (en) * | 2022-03-11 | 2023-09-14 | University Of Rochester | Cyclopeptibodies and uses thereof |
Family Cites Families (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5089396A (en) * | 1985-10-03 | 1992-02-18 | Genentech, Inc. | Nucleic acid encoding β chain prodomains of inhibin and method for synthesizing polypeptides using such nucleic acid |
US5837458A (en) | 1994-02-17 | 1998-11-17 | Maxygen, Inc. | Methods and compositions for cellular and metabolic engineering |
US5834252A (en) | 1995-04-18 | 1998-11-10 | Glaxo Group Limited | End-complementary polymerase reaction |
US6335160B1 (en) | 1995-02-17 | 2002-01-01 | Maxygen, Inc. | Methods and compositions for polypeptide engineering |
US6117679A (en) | 1994-02-17 | 2000-09-12 | Maxygen, Inc. | Methods for generating polynucleotides having desired characteristics by iterative selection and recombination |
US5605793A (en) | 1994-02-17 | 1997-02-25 | Affymax Technologies N.V. | Methods for in vitro recombination |
US6326204B1 (en) | 1997-01-17 | 2001-12-04 | Maxygen, Inc. | Evolution of whole cells and organisms by recursive sequence recombination |
JP4263248B2 (en) | 1997-03-18 | 2009-05-13 | ノボザイムス アクティーゼルスカブ | Library creation method by DNA shuffling |
KR20000076363A (en) | 1997-03-18 | 2000-12-26 | 한센 핀 베네드, 안네 제헤르, 웨이콥 마리안느 | An in vitro method for construction of a dna library |
US5948653A (en) | 1997-03-21 | 1999-09-07 | Pati; Sushma | Sequence alterations using homologous recombination |
US6153410A (en) | 1997-03-25 | 2000-11-28 | California Institute Of Technology | Recombination of polynucleotide sequences using random or defined primers |
DK1036198T3 (en) | 1997-12-08 | 2013-01-02 | California Inst Of Techn | Method for Preparation of Polynucleotide and Polypeptide Sequences |
CA2331926C (en) | 1998-06-29 | 2013-09-10 | Phylos, Inc. | Methods for generating highly diverse libraries |
CA2331335A1 (en) | 1998-09-29 | 2000-04-06 | Maxygen, Inc. | Shuffling of codon altered genes |
DK1141250T3 (en) | 1998-12-18 | 2006-07-10 | Penn State Res Found | Intein mediates cyclization of peptides |
US6436675B1 (en) | 1999-09-28 | 2002-08-20 | Maxygen, Inc. | Use of codon-varied oligonucleotide synthesis for synthetic shuffling |
DE60044223D1 (en) | 1999-01-19 | 2010-06-02 | Maxygen Inc | BY OLIGONUCLEOTIDE-MEDIATED NUCLEIC ACID RECOMBINATION |
EP1187914B1 (en) | 1999-06-14 | 2005-05-25 | Genentech, Inc. | A structured peptide scaffold for displaying turn libraries on phage |
WO2001064864A2 (en) | 2000-02-28 | 2001-09-07 | Maxygen, Inc. | Single-stranded nucleic acid template-mediated recombination and nucleic acid fragment isolation |
EP1263777A2 (en) | 2000-03-06 | 2002-12-11 | Rigel Pharmaceuticals, Inc. | In vivo production of cyclic peptides |
US7252952B2 (en) | 2000-03-06 | 2007-08-07 | Rigel Pharmaceuticals, Inc. | In vivo production of cyclic peptides for inhibiting protein—protein interaction |
GB0022458D0 (en) * | 2000-09-13 | 2000-11-01 | Medical Res Council | Directed evolution method |
US20030219739A1 (en) | 2001-01-30 | 2003-11-27 | Glass David J. | Novel nucleic acid and polypeptide molecules |
US20060166319A1 (en) * | 2004-08-13 | 2006-07-27 | Chan Michael K | Charging tRNA with pyrrolysine |
WO2006045116A2 (en) * | 2004-10-20 | 2006-04-27 | The Scripps Research Institute | In vivo site-specific incorporation of n-acetyl-galactosamine amino acids in eubacteria |
EP3012265B1 (en) * | 2007-03-26 | 2017-06-28 | The University of Tokyo | Process for synthesizing cyclic peptide compound |
US20090004105A1 (en) | 2007-06-27 | 2009-01-01 | Zhen Cheng | Molecular imaging of matrix metalloproteinase expression using labeled chlorotoxin |
CN103328648B (en) | 2010-12-03 | 2015-11-25 | 国立大学法人东京大学 | There is the peptide of stable secondary structure and peptide storehouse and their manufacture method |
US8986953B2 (en) * | 2011-01-20 | 2015-03-24 | University Of Rochester | Macrocyclic compounds with a hybrid peptidic/non-peptidic backbone and methods for their preparation |
2014
- 2014-12-23: EP application EP22174588.8A, published as EP4122945A1, status Pending
- 2014-12-23: EP application EP14874141.6A, published as EP3087088B1, status Active
- 2014-12-23: US application US15/107,387, published as US10544191B2, status Active
- 2014-12-23: WO application PCT/US2014/072016, published as WO2015100277A2, status Application Filing
Also Published As
Publication number | Publication date |
---|---|
WO2015100277A2 (en) | 2015-07-02 |
US10544191B2 (en) | 2020-01-28 |
US20160355552A1 (en) | 2016-12-08 |
EP3087088A4 (en) | 2017-08-02 |
EP3087088A2 (en) | 2016-11-02 |
WO2015100277A3 (en) | 2015-07-30 |
EP4122945A1 (en) | 2023-01-25 |
WO2015100277A9 (en) | 2015-10-01 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
EP3087088B1 (en) | Methods and compositions for ribosomal synthesis of macrocyclic peptides | |
US8986953B2 (en) | Macrocyclic compounds with a hybrid peptidic/non-peptidic backbone and methods for their preparation | |
Pan et al. | Target identification of natural products and bioactive compounds using affinity-based probes | |
CN111684112B (en) | System for assembly and modification of non-ribosomal peptide synthetases | |
CN110475785B (en) | Peptide compound, method for producing same, composition for screening, and method for selecting peptide compound | |
EP2761006A1 (en) | Split inteins and uses thereof | |
WO2014039715A1 (en) | Methods and compositions for site-specific labeling of peptides and proteins | |
EP2766389B1 (en) | Gene cluster for biosynthesis of griselimycin and methylgriselimycin | |
JP2017035101A (en) | Modified peptide display | |
Porterfield et al. | Engineered biosynthesis of alkyne-tagged polyketides by type I PKSs | |
US20200299675A1 (en) | Methods and Compositions for Display of Macrocyclic Peptides | |
EP2027313A2 (en) | Compositions and methods comprising the use of cell surface displayed homing endonucleases | |
Valentine et al. | Genetically encoded cyclic peptide libraries: From hit to lead and beyond | |
JP7461652B2 (en) | Compound library and method for producing the compound library | |
Bionda et al. | Ribosomal synthesis of thioether-bridged bicyclic peptides | |
Matabaro et al. | Enzyme-mediated backbone N-methylation in ribosomally encoded peptides | |
Smith et al. | Synthesis of macrocyclic organo-peptide hybrids from ribosomal polypeptide precursors via CuAAC-/hydrazide-mediated cyclization | |
Li et al. | Characterization of N-methyltransferase for catalyzing the terminus of leucinostatins in Purpureocillium lilacinum | |
Philpott | Bionanotechnology platforms for biocatalysis | |
WO2023173084A9 (en) | Cyclopeptibodies and uses thereof | |
Jaitzig | Reconstituted nonribosomal biosynthesis of the peptide antibiotic valinomycin in Escherichia coli as recombinant whole-cell biocatalyst | |
CA3236034A1 (en) | Method for producing compounds, method for producing compound library, compound library, and screening method | |
Sarkar | Developing an Enzymatic Toolbox to Generate Drug-Like Peptides | |
WO2022235942A1 (en) | Sequence-defined polymers with one or more azides, methods of making, and methods use thereof | |
Truong | Expanding Protein Sequence Space through Incorporation of Non-Canonical Amino Acids |
Legal Events
Code | Title | Description |
---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
17P | Request for examination filed | Effective date: 20160722 |
AK | Designated contracting states | Kind code of ref document: A2 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
AX | Request for extension of the european patent | Extension state: BA ME |
DAX | Request for extension of the european patent (deleted) | |
A4 | Supplementary search report drawn up and despatched | Effective date: 20170630 |
RIC1 | Information provided on ipc code assigned before grant | Ipc: C07K 7/64 20060101AFI20170626BHEP |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: EXAMINATION IS IN PROGRESS |
17Q | First examination report despatched | Effective date: 20180515 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: EXAMINATION IS IN PROGRESS |
GRAP | Despatch of communication of intention to grant a patent | Free format text: ORIGINAL CODE: EPIDOSNIGR1 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: GRANT OF PATENT IS INTENDED |
INTG | Intention to grant announced | Effective date: 20211207 |
GRAS | Grant fee paid | Free format text: ORIGINAL CODE: EPIDOSNIGR3 |
GRAA | (expected) grant | Free format text: ORIGINAL CODE: 0009210 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE PATENT HAS BEEN GRANTED |
AK | Designated contracting states | Kind code of ref document: B1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
REG | Reference to a national code | Ref country code: GB Ref legal event code: FG4D |
REG | Reference to a national code | Ref country code: CH Ref legal event code: EP |
REG | Reference to a national code | Ref country code: AT Ref legal event code: REF Ref document number: 1494225 Country of ref document: AT Kind code of ref document: T Effective date: 20220615 Ref country code: DE Ref legal event code: R096 Ref document number: 602014083853 Country of ref document: DE |
REG | Reference to a national code | Ref country code: IE Ref legal event code: FG4D |
REG | Reference to a national code | Ref country code: LT Ref legal event code: MG9D |
REG | Reference to a national code | Ref country code: NL Ref legal event code: MP Effective date: 20220525 |
REG | Reference to a national code | Ref country code: AT Ref legal event code: MK05 Ref document number: 1494225 Country of ref document: AT Kind code of ref document: T Effective date: 20220525 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: SE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220525 Ref country code: PT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220926 Ref country code: NO Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220825 Ref country code: NL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220525 Ref country code: LT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220525 Ref country code: HR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220525 Ref country code: GR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220826 Ref country code: FI Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220525 Ref country code: ES Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220525 Ref country code: BG Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220825 Ref country code: AT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220525 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: RS Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220525 Ref country code: PL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220525 Ref country code: LV Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220525 Ref country code: IS Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220925 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: SM Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220525 Ref country code: SK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220525 Ref country code: RO Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220525 Ref country code: EE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220525 Ref country code: DK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220525 Ref country code: CZ Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220525 |
REG | Reference to a national code | Ref country code: DE Ref legal event code: R097 Ref document number: 602014083853 Country of ref document: DE |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: AL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220525 |
PLBE | No opposition filed within time limit | Free format text: ORIGINAL CODE: 0009261 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
26N | No opposition filed | Effective date: 20230228 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: SI Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220525 |
P01 | Opt-out of the competence of the unified patent court (upc) registered | Effective date: 20230516 |
REG | Reference to a national code | Ref country code: CH Ref legal event code: PL |
REG | Reference to a national code | Ref country code: BE Ref legal event code: MM Effective date: 20221231 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: LU Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20221223 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: LI Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20221231 Ref country code: IE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20221223 Ref country code: CH Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20221231 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: FR Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20221231 Ref country code: BE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20221231 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: GB Payment date: 20231227 Year of fee payment: 10 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: IT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220525 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: HU Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; INVALID AB INITIO Effective date: 20141223 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: CY Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20220525 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: DE Payment date: 20231229 Year of fee payment: 10 |