CN113801233A - Preparation method of somaglutide - Google Patents
Preparation method of somaglutide
- Publication number
- CN113801233A (application CN202010530625.4A)
- Authority
- CN
- China
- Prior art keywords
- somaglutide
- fmoc
- boc
- seq
- precursor
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 238000002360 preparation method Methods 0.000 title abstract description 15
- 108020001507 fusion proteins Proteins 0.000 claims abstract description 65
- 102000037865 fusion proteins Human genes 0.000 claims abstract description 65
- 108090000765 processed proteins & peptides Proteins 0.000 claims abstract description 60
- 238000000034 method Methods 0.000 claims abstract description 57
- 125000003088 (fluoren-9-ylmethoxy)carbonyl group Chemical group 0.000 claims abstract description 42
- 108010043121 Green Fluorescent Proteins Proteins 0.000 claims abstract description 17
- 102000004144 Green Fluorescent Proteins Human genes 0.000 claims abstract description 17
- 239000005090 green fluorescent protein Substances 0.000 claims abstract description 17
- 230000012846 protein folding Effects 0.000 claims abstract description 11
- 239000002243 precursor Substances 0.000 claims description 83
- RTZKZFJDLAIYFH-UHFFFAOYSA-N ether Substances CCOCC RTZKZFJDLAIYFH-UHFFFAOYSA-N 0.000 claims description 47
- 238000006243 chemical reaction Methods 0.000 claims description 26
- 125000003275 alpha amino acid group Chemical group 0.000 claims description 22
- 150000001875 compounds Chemical class 0.000 claims description 21
- 239000012634 fragment Substances 0.000 claims description 19
- 150000001413 amino acids Chemical class 0.000 claims description 18
- 239000003208 petroleum Substances 0.000 claims description 17
- 125000002496 methyl group Chemical group [H]C([H])([H])* 0.000 claims description 13
- 102000004190 Enzymes Human genes 0.000 claims description 11
- 108090000790 Enzymes Proteins 0.000 claims description 11
- 239000007787 solid Substances 0.000 claims description 11
- 238000005520 cutting process Methods 0.000 claims description 9
- 238000001976 enzyme digestion Methods 0.000 claims description 8
- 239000000203 mixture Substances 0.000 claims description 8
- 241000894006 Bacteria Species 0.000 claims description 7
- 238000003756 stirring Methods 0.000 claims description 7
- 108010013369 Enteropeptidase Proteins 0.000 claims description 6
- 239000003960 organic solvent Substances 0.000 claims description 6
- 108010076504 Protein Sorting Signals Proteins 0.000 claims description 5
- 238000003776 cleavage reaction Methods 0.000 claims description 5
- 230000007017 scission Effects 0.000 claims description 5
- 102100029727 Enteropeptidase Human genes 0.000 claims description 4
- 125000001429 N-terminal alpha-amino-acid group Chemical group 0.000 claims description 3
- 230000002255 enzymatic effect Effects 0.000 claims description 3
- 238000002156 mixing Methods 0.000 claims description 3
- 238000009472 formulation Methods 0.000 claims description 2
- KQKPFRSPSRPDEB-UHFFFAOYSA-N sumatriptan Chemical compound CNS(=O)(=O)CC1=CC=C2NC=C(CCN(C)C)C2=C1 KQKPFRSPSRPDEB-UHFFFAOYSA-N 0.000 claims description 2
- 229960003708 sumatriptan Drugs 0.000 claims description 2
- 230000004048 modification Effects 0.000 abstract description 9
- 238000012986 modification Methods 0.000 abstract description 9
- 229920001184 polypeptide Polymers 0.000 description 30
- 102000004196 processed proteins & peptides Human genes 0.000 description 30
- 210000004027 cell Anatomy 0.000 description 29
- ZMXDDKWLCZADIW-UHFFFAOYSA-N N,N-Dimethylformamide Chemical compound CN(C)C=O ZMXDDKWLCZADIW-UHFFFAOYSA-N 0.000 description 27
- 239000000243 solution Substances 0.000 description 24
- 239000013612 plasmid Substances 0.000 description 20
- 108091033319 polynucleotide Proteins 0.000 description 20
- 102000040430 polynucleotide Human genes 0.000 description 20
- 239000002157 polynucleotide Substances 0.000 description 20
- 108090000623 proteins and genes Proteins 0.000 description 20
- YMWUJEATGCHHMB-UHFFFAOYSA-N Dichloromethane Chemical compound ClCCl YMWUJEATGCHHMB-UHFFFAOYSA-N 0.000 description 18
- JGFZNNIVVJXRND-UHFFFAOYSA-N N,N-Diisopropylethylamine (DIPEA) Chemical compound CCN(C(C)C)C(C)C JGFZNNIVVJXRND-UHFFFAOYSA-N 0.000 description 18
- DTQVDTLACAAQTR-UHFFFAOYSA-N Trifluoroacetic acid Chemical compound OC(=O)C(F)(F)F DTQVDTLACAAQTR-UHFFFAOYSA-N 0.000 description 18
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 14
- 210000003000 inclusion body Anatomy 0.000 description 14
- 239000011259 mixed solution Substances 0.000 description 14
- 239000013598 vector Substances 0.000 description 14
- 239000013604 expression vector Substances 0.000 description 13
- 102000004169 proteins and genes Human genes 0.000 description 13
- 230000014509 gene expression Effects 0.000 description 12
- 108020004414 DNA Proteins 0.000 description 10
- 241000588724 Escherichia coli Species 0.000 description 10
- 230000015572 biosynthetic process Effects 0.000 description 10
- 229940125782 compound 2 Drugs 0.000 description 10
- 239000000047 product Substances 0.000 description 10
- 238000003786 synthesis reaction Methods 0.000 description 10
- 238000005406 washing Methods 0.000 description 10
- -1 Fmoc modified histidine Chemical class 0.000 description 9
- BZLVMXJERCGZMT-UHFFFAOYSA-N Methyl tert-butyl ether Chemical compound COC(C)(C)C BZLVMXJERCGZMT-UHFFFAOYSA-N 0.000 description 9
- 229940125898 compound 5 Drugs 0.000 description 9
- 229920001223 polyethylene glycol Polymers 0.000 description 9
- 125000006239 protecting group Chemical group 0.000 description 9
- 238000000746 purification Methods 0.000 description 9
- 108091028043 Nucleic acid sequence Proteins 0.000 description 8
- NQRYJNQNLNOLGT-UHFFFAOYSA-N Piperidine Chemical compound C1CCNCC1 NQRYJNQNLNOLGT-UHFFFAOYSA-N 0.000 description 8
- 239000012046 mixed solvent Substances 0.000 description 8
- 230000001376 precipitating effect Effects 0.000 description 8
- 238000006467 substitution reaction Methods 0.000 description 8
- 238000007796 conventional method Methods 0.000 description 7
- WEVYAHXRMPXWCK-UHFFFAOYSA-N Acetonitrile Chemical compound CC#N WEVYAHXRMPXWCK-UHFFFAOYSA-N 0.000 description 6
- HEMHJVSKTPXQMS-UHFFFAOYSA-M Sodium hydroxide Chemical compound [OH-].[Na+] HEMHJVSKTPXQMS-UHFFFAOYSA-M 0.000 description 6
- 229940126214 compound 3 Drugs 0.000 description 6
- 239000002244 precipitate Substances 0.000 description 6
- 238000004153 renaturation Methods 0.000 description 6
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysin Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 5
- 239000004472 Lysine Substances 0.000 description 5
- 101800001442 Peptide pr Proteins 0.000 description 5
- 238000012217 deletion Methods 0.000 description 5
- 230000037430 deletion Effects 0.000 description 5
- 206010012601 diabetes mellitus Diseases 0.000 description 5
- 239000003623 enhancer Substances 0.000 description 5
- 210000003527 eukaryotic cell Anatomy 0.000 description 5
- 238000000855 fermentation Methods 0.000 description 5
- 230000004151 fermentation Effects 0.000 description 5
- 238000004128 high performance liquid chromatography Methods 0.000 description 5
- 230000001965 increasing effect Effects 0.000 description 5
- 238000004366 reverse phase liquid chromatography Methods 0.000 description 5
- ZGYICYBLPGRURT-UHFFFAOYSA-N tri(propan-2-yl)silicon Chemical compound CC(C)[Si](C(C)C)C(C)C ZGYICYBLPGRURT-UHFFFAOYSA-N 0.000 description 5
- IJGRMHOSHXDMSA-UHFFFAOYSA-N Atomic nitrogen Chemical compound N#N IJGRMHOSHXDMSA-UHFFFAOYSA-N 0.000 description 4
- ZDNNDIJTUHQCAM-MXAVVETBSA-N Ile-Ser-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N ZDNNDIJTUHQCAM-MXAVVETBSA-N 0.000 description 4
- FBNPMTNBFFAMMH-UHFFFAOYSA-N Leu-Val-Arg Natural products CC(C)CC(N)C(=O)NC(C(C)C)C(=O)NC(C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-UHFFFAOYSA-N 0.000 description 4
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 4
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 4
- KLOZTPOXVVRVAQ-DZKIICNBSA-N Tyr-Val-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 KLOZTPOXVVRVAQ-DZKIICNBSA-N 0.000 description 4
- 125000000539 amino acid group Chemical group 0.000 description 4
- 210000005056 cell body Anatomy 0.000 description 4
- 239000002299 complementary DNA Substances 0.000 description 4
- 229940125904 compound 1 Drugs 0.000 description 4
- 238000009833 condensation Methods 0.000 description 4
- 230000005494 condensation Effects 0.000 description 4
- 238000010276 construction Methods 0.000 description 4
- 230000000694 effects Effects 0.000 description 4
- 239000012535 impurity Substances 0.000 description 4
- 238000003780 insertion Methods 0.000 description 4
- 230000037431 insertion Effects 0.000 description 4
- 239000002609 medium Substances 0.000 description 4
- 230000008569 process Effects 0.000 description 4
- 238000013518 transcription Methods 0.000 description 4
- 230000035897 transcription Effects 0.000 description 4
- 108700004896 tripeptide FEG Proteins 0.000 description 4
- 108010073969 valyllysine Proteins 0.000 description 4
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 3
- OMMDTNGURYRDAC-NRPADANISA-N Ala-Glu-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O OMMDTNGURYRDAC-NRPADANISA-N 0.000 description 3
- SRZLHYPAOXBBSB-HJGDQZAQSA-N Glu-Arg-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SRZLHYPAOXBBSB-HJGDQZAQSA-N 0.000 description 3
- HPJLZFTUUJKWAJ-JHEQGTHGSA-N Glu-Gly-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O HPJLZFTUUJKWAJ-JHEQGTHGSA-N 0.000 description 3
- WNRZUESNGGDCJX-JYJNAYRXSA-N Glu-Leu-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O WNRZUESNGGDCJX-JYJNAYRXSA-N 0.000 description 3
- WZPIKDWQVRTATP-SYWGBEHUSA-N Ile-Ala-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](C)NC(=O)[C@@H](N)[C@@H](C)CC)C(O)=O)=CNC2=C1 WZPIKDWQVRTATP-SYWGBEHUSA-N 0.000 description 3
- YSDQQAXHVYUZIW-QCIJIYAXSA-N Liraglutide Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCNC(=O)CC[C@H](NC(=O)CCCCCCCCCCCCCCC)C(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O)NC(=O)[C@H](CO)NC(=O)[C@H](CO)NC(=O)[C@@H](NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](NC(=O)[C@H](CC=1C=CC=CC=1)NC(=O)[C@@H](NC(=O)CNC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC=1NC=NC=1)[C@@H](C)O)[C@@H](C)O)C(C)C)C1=CC=C(O)C=C1 YSDQQAXHVYUZIW-QCIJIYAXSA-N 0.000 description 3
- DUTMKEAPLLUGNO-JYJNAYRXSA-N Lys-Glu-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O DUTMKEAPLLUGNO-JYJNAYRXSA-N 0.000 description 3
- LCMWVZLBCUVDAZ-IUCAKERBSA-N Lys-Gly-Glu Chemical compound [NH3+]CCCC[C@H]([NH3+])C(=O)NCC(=O)N[C@H](C([O-])=O)CCC([O-])=O LCMWVZLBCUVDAZ-IUCAKERBSA-N 0.000 description 3
- OIQSIMFSVLLWBX-VOAKCMCISA-N Lys-Leu-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OIQSIMFSVLLWBX-VOAKCMCISA-N 0.000 description 3
- LBSWWNKMVPAXOI-GUBZILKMSA-N Met-Val-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O LBSWWNKMVPAXOI-GUBZILKMSA-N 0.000 description 3
- 238000012408 PCR amplification Methods 0.000 description 3
- GNRMAQSIROFNMI-IXOXFDKPSA-N Phe-Thr-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O GNRMAQSIROFNMI-IXOXFDKPSA-N 0.000 description 3
- 108020004511 Recombinant DNA Proteins 0.000 description 3
- DLSWIYLPEUIQAV-UHFFFAOYSA-N Semaglutide Chemical compound CCC(C)C(NC(=O)C(Cc1ccccc1)NC(=O)C(CCC(O)=O)NC(=O)C(CCCCNC(=O)COCCOCCNC(=O)COCCOCCNC(=O)CCC(NC(=O)CCCCCCCCCCCCCCCCC(O)=O)C(O)=O)NC(=O)C(C)NC(=O)C(C)NC(=O)C(CCC(N)=O)NC(=O)CNC(=O)C(CCC(O)=O)NC(=O)C(CC(C)C)NC(=O)C(Cc1ccc(O)cc1)NC(=O)C(CO)NC(=O)C(CO)NC(=O)C(NC(=O)C(CC(O)=O)NC(=O)C(CO)NC(=O)C(NC(=O)C(Cc1ccccc1)NC(=O)C(NC(=O)CNC(=O)C(CCC(O)=O)NC(=O)C(C)(C)NC(=O)C(N)Cc1cnc[nH]1)C(C)O)C(C)O)C(C)C)C(=O)NC(C)C(=O)NC(Cc1c[nH]c2ccccc12)C(=O)NC(CC(C)C)C(=O)NC(C(C)C)C(=O)NC(CCCNC(N)=N)C(=O)NCC(=O)NC(CCCNC(N)=N)C(=O)NCC(O)=O DLSWIYLPEUIQAV-UHFFFAOYSA-N 0.000 description 3
- JKGGPMOUIAAJAA-YEPSODPASA-N Thr-Gly-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O JKGGPMOUIAAJAA-YEPSODPASA-N 0.000 description 3
- 241000700605 Viruses Species 0.000 description 3
- 238000010521 absorption reaction Methods 0.000 description 3
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 3
- 230000001580 bacterial effect Effects 0.000 description 3
- 235000010633 broth Nutrition 0.000 description 3
- 239000007853 buffer solution Substances 0.000 description 3
- 238000012258 culturing Methods 0.000 description 3
- 238000001514 detection method Methods 0.000 description 3
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 3
- 230000013595 glycosylation Effects 0.000 description 3
- 238000006206 glycosylation reaction Methods 0.000 description 3
- 238000009396 hybridization Methods 0.000 description 3
- 238000001727 in vivo Methods 0.000 description 3
- 229930027917 kanamycin Natural products 0.000 description 3
- 229960000318 kanamycin Drugs 0.000 description 3
- SBUJHOSQTJFQJX-NOAMYHISSA-N kanamycin Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 description 3
- 229930182823 kanamycin A Natural products 0.000 description 3
- 239000007788 liquid Substances 0.000 description 3
- 210000004962 mammalian cell Anatomy 0.000 description 3
- 238000004519 manufacturing process Methods 0.000 description 3
- GCYXWQUSHADNBF-AAEALURTSA-N preproglucagon 78-108 Chemical compound C([C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@H](CCC(N)=O)NC(=O)CNC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H](CO)NC(=O)[C@H](CO)NC(=O)[C@@H](NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](NC(=O)[C@H](CC=1C=CC=CC=1)NC(=O)[C@@H](NC(=O)CNC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC=1N=CNC=1)[C@@H](C)O)[C@@H](C)O)C(C)C)C1=CC=CC=C1 GCYXWQUSHADNBF-AAEALURTSA-N 0.000 description 3
- 239000002994 raw material Substances 0.000 description 3
- 238000003259 recombinant expression Methods 0.000 description 3
- 238000010188 recombinant method Methods 0.000 description 3
- 239000011347 resin Substances 0.000 description 3
- 229920005989 resin Polymers 0.000 description 3
- 238000010532 solid phase synthesis reaction Methods 0.000 description 3
- 239000000126 substance Substances 0.000 description 3
- 108010060175 trypsinogen activation peptide Proteins 0.000 description 3
- 108010080629 tryptophan-leucine Proteins 0.000 description 3
- HZAXFHJVJLSVMW-UHFFFAOYSA-N 2-Aminoethan-1-ol Chemical compound NCCO HZAXFHJVJLSVMW-UHFFFAOYSA-N 0.000 description 2
- MFMDKJIPHSWSBM-GUBZILKMSA-N Ala-Lys-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O MFMDKJIPHSWSBM-GUBZILKMSA-N 0.000 description 2
- 108010088751 Albumins Proteins 0.000 description 2
- 102000009027 Albumins Human genes 0.000 description 2
- QGZKDVFQNNGYKY-UHFFFAOYSA-N Ammonia Chemical compound N QGZKDVFQNNGYKY-UHFFFAOYSA-N 0.000 description 2
- VHUUQVKOLVNVRT-UHFFFAOYSA-N Ammonium hydroxide Chemical compound [NH4+].[OH-] VHUUQVKOLVNVRT-UHFFFAOYSA-N 0.000 description 2
- HQIZDMIGUJOSNI-IUCAKERBSA-N Arg-Gly-Arg Chemical compound N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O HQIZDMIGUJOSNI-IUCAKERBSA-N 0.000 description 2
- JDDYEZGPYBBPBN-JRQIVUDYSA-N Asp-Thr-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JDDYEZGPYBBPBN-JRQIVUDYSA-N 0.000 description 2
- QOJJMJKTMKNFEF-ZKWXMUAHSA-N Asp-Val-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC(O)=O QOJJMJKTMKNFEF-ZKWXMUAHSA-N 0.000 description 2
- OKTJSMMVPCPJKN-UHFFFAOYSA-N Carbon Chemical compound [C] OKTJSMMVPCPJKN-UHFFFAOYSA-N 0.000 description 2
- 241000196324 Embryophyta Species 0.000 description 2
- ZHNUHDYFZUAESO-UHFFFAOYSA-N Formamide Chemical compound NC=O ZHNUHDYFZUAESO-UHFFFAOYSA-N 0.000 description 2
- YJIUYQKQBBQYHZ-ACZMJKKPSA-N Gln-Ala-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O YJIUYQKQBBQYHZ-ACZMJKKPSA-N 0.000 description 2
- NSNUZSPSADIMJQ-WDSKDSINSA-N Gln-Gly-Asp Chemical compound NC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O NSNUZSPSADIMJQ-WDSKDSINSA-N 0.000 description 2
- 101800004266 Glucagon-like peptide 1(7-37) Proteins 0.000 description 2
- RJIVPOXLQFJRTG-LURJTMIESA-N Gly-Arg-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N RJIVPOXLQFJRTG-LURJTMIESA-N 0.000 description 2
- CQZDZKRHFWJXDF-WDSKDSINSA-N Gly-Gln-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCC(N)=O)NC(=O)CN CQZDZKRHFWJXDF-WDSKDSINSA-N 0.000 description 2
- LLWQVJNHMYBLLK-CDMKHQONSA-N Gly-Thr-Phe Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O LLWQVJNHMYBLLK-CDMKHQONSA-N 0.000 description 2
- 102000017011 Glycated Hemoglobin A Human genes 0.000 description 2
- 108010014663 Glycated Hemoglobin A Proteins 0.000 description 2
- MDBYBTWRMOAJAY-NHCYSSNCSA-N His-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CN=CN1)N MDBYBTWRMOAJAY-NHCYSSNCSA-N 0.000 description 2
- 208000013016 Hypoglycemia Diseases 0.000 description 2
- VQUCKIAECLVLAD-SVSWQMSJSA-N Ile-Cys-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N VQUCKIAECLVLAD-SVSWQMSJSA-N 0.000 description 2
- QNTJIDXQHWUBKC-BZSNNMDCSA-N Leu-Lys-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QNTJIDXQHWUBKC-BZSNNMDCSA-N 0.000 description 2
- RDFIVFHPOSOXMW-ACRUOGEOSA-N Leu-Tyr-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O RDFIVFHPOSOXMW-ACRUOGEOSA-N 0.000 description 2
- FBNPMTNBFFAMMH-AVGNSLFASA-N Leu-Val-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-AVGNSLFASA-N 0.000 description 2
- 108010019598 Liraglutide Proteins 0.000 description 2
- KWUKZRFFKPLUPE-HJGDQZAQSA-N Lys-Asp-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KWUKZRFFKPLUPE-HJGDQZAQSA-N 0.000 description 2
- DTUZCYRNEJDKSR-NHCYSSNCSA-N Lys-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN DTUZCYRNEJDKSR-NHCYSSNCSA-N 0.000 description 2
- ZJSZPXISKMDJKQ-JYJNAYRXSA-N Lys-Phe-Glu Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCC(O)=O)C(O)=O)CC1=CC=CC=C1 ZJSZPXISKMDJKQ-JYJNAYRXSA-N 0.000 description 2
- TVHCDSBMFQYPNA-RHYQMDGZSA-N Lys-Thr-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O TVHCDSBMFQYPNA-RHYQMDGZSA-N 0.000 description 2
- TWRXJAOTZQYOKJ-UHFFFAOYSA-L Magnesium chloride Chemical compound [Mg+2].[Cl-].[Cl-] TWRXJAOTZQYOKJ-UHFFFAOYSA-L 0.000 description 2
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 2
- 239000001888 Peptone Substances 0.000 description 2
- 108010080698 Peptones Proteins 0.000 description 2
- FIRWJEJVFFGXSH-RYUDHWBXSA-N Phe-Glu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 FIRWJEJVFFGXSH-RYUDHWBXSA-N 0.000 description 2
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 2
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 2
- 108020005091 Replication Origin Proteins 0.000 description 2
- PLQWGQUNUPMNOD-KKUMJFAQSA-N Ser-Tyr-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O PLQWGQUNUPMNOD-KKUMJFAQSA-N 0.000 description 2
- CDBYLPFSWZWCQE-UHFFFAOYSA-L Sodium Carbonate Chemical compound [Na+].[Na+].[O-]C([O-])=O CDBYLPFSWZWCQE-UHFFFAOYSA-L 0.000 description 2
- UIIMBOGNXHQVGW-UHFFFAOYSA-M Sodium bicarbonate Chemical compound [Na+].OC([O-])=O UIIMBOGNXHQVGW-UHFFFAOYSA-M 0.000 description 2
- IVDFVBVIVLJJHR-LKXGYXEUSA-N Thr-Ser-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O IVDFVBVIVLJJHR-LKXGYXEUSA-N 0.000 description 2
- WMBFONUKQXGLMU-WDSOQIARSA-N Trp-Leu-Val Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N WMBFONUKQXGLMU-WDSOQIARSA-N 0.000 description 2
- KSCVLGXNQXKUAR-JYJNAYRXSA-N Tyr-Leu-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O KSCVLGXNQXKUAR-JYJNAYRXSA-N 0.000 description 2
- SINRIKQYQJRGDQ-MEYUZBJRSA-N Tyr-Lys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 SINRIKQYQJRGDQ-MEYUZBJRSA-N 0.000 description 2
- COYSIHFOCOMGCF-UHFFFAOYSA-N Val-Arg-Gly Natural products CC(C)C(N)C(=O)NC(C(=O)NCC(O)=O)CCCN=C(N)N COYSIHFOCOMGCF-UHFFFAOYSA-N 0.000 description 2
- PZTZYZUTCPZWJH-FXQIFTODSA-N Val-Ser-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)O)N PZTZYZUTCPZWJH-FXQIFTODSA-N 0.000 description 2
- 239000002253 acid Substances 0.000 description 2
- 235000011114 ammonium hydroxide Nutrition 0.000 description 2
- 239000003472 antidiabetic agent Substances 0.000 description 2
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 2
- 230000009286 beneficial effect Effects 0.000 description 2
- 239000008280 blood Substances 0.000 description 2
- 210000004369 blood Anatomy 0.000 description 2
- 229940041514 candida albicans extract Drugs 0.000 description 2
- 229910052799 carbon Inorganic materials 0.000 description 2
- 239000003153 chemical reaction reagent Substances 0.000 description 2
- 230000001684 chronic effect Effects 0.000 description 2
- 238000004140 cleaning Methods 0.000 description 2
- 230000008878 coupling Effects 0.000 description 2
- 238000010168 coupling process Methods 0.000 description 2
- 238000005859 coupling reaction Methods 0.000 description 2
- 238000005336 cracking Methods 0.000 description 2
- PAFZNILMFXTMIY-UHFFFAOYSA-N cyclohexylamine Chemical compound NC1CCCCC1 PAFZNILMFXTMIY-UHFFFAOYSA-N 0.000 description 2
- 239000003814 drug Substances 0.000 description 2
- 238000001035 drying Methods 0.000 description 2
- 238000001962 electrophoresis Methods 0.000 description 2
- 238000004520 electroporation Methods 0.000 description 2
- 238000010828 elution Methods 0.000 description 2
- 230000029142 excretion Effects 0.000 description 2
- 238000001914 filtration Methods 0.000 description 2
- 239000012467 final product Substances 0.000 description 2
- IRXSLJNXXZKURP-UHFFFAOYSA-N fluorenylmethyloxycarbonyl chloride Chemical compound C1=CC=C2C(COC(=O)Cl)C3=CC=CC=C3C2=C1 IRXSLJNXXZKURP-UHFFFAOYSA-N 0.000 description 2
- 239000001963 growth medium Substances 0.000 description 2
- 230000007062 hydrolysis Effects 0.000 description 2
- 238000006460 hydrolysis reaction Methods 0.000 description 2
- 230000002218 hypoglycaemic effect Effects 0.000 description 2
- 238000000338 in vitro Methods 0.000 description 2
- 230000006698 induction Effects 0.000 description 2
- 229960002701 liraglutide Drugs 0.000 description 2
- 108010003700 lysyl aspartic acid Proteins 0.000 description 2
- 239000003550 marker Substances 0.000 description 2
- 239000012528 membrane Substances 0.000 description 2
- 229910052757 nitrogen Inorganic materials 0.000 description 2
- 239000002773 nucleotide Substances 0.000 description 2
- 125000003729 nucleotide group Chemical group 0.000 description 2
- 230000010355 oscillation Effects 0.000 description 2
- 235000019319 peptone Nutrition 0.000 description 2
- 239000012071 phase Substances 0.000 description 2
- 238000012643 polycondensation polymerization Methods 0.000 description 2
- 239000000843 powder Substances 0.000 description 2
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 description 2
- 238000012545 processing Methods 0.000 description 2
- 210000001236 prokaryotic cell Anatomy 0.000 description 2
- 230000006340 racemization Effects 0.000 description 2
- 239000011541 reaction mixture Substances 0.000 description 2
- 230000006798 recombination Effects 0.000 description 2
- 238000005215 recombination Methods 0.000 description 2
- 108091008146 restriction endonucleases Proteins 0.000 description 2
- 238000012216 screening Methods 0.000 description 2
- 108010060325 semaglutide Proteins 0.000 description 2
- 229950011186 semaglutide Drugs 0.000 description 2
- 238000000926 separation method Methods 0.000 description 2
- 238000007086 side reaction Methods 0.000 description 2
- 239000011780 sodium chloride Substances 0.000 description 2
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 2
- 230000009466 transformation Effects 0.000 description 2
- 238000013519 translation Methods 0.000 description 2
- 108010051110 tyrosyl-lysine Proteins 0.000 description 2
- 241000701161 unidentified adenovirus Species 0.000 description 2
- 241001430294 unidentified retrovirus Species 0.000 description 2
- 108010009962 valyltyrosine Proteins 0.000 description 2
- 210000005253 yeast cell Anatomy 0.000 description 2
- 239000012138 yeast extract Substances 0.000 description 2
- AUXMWYRZQPIXCC-KNIFDHDWSA-N (2s)-2-amino-4-methylpentanoic acid;(2s)-2-aminopropanoic acid Chemical compound C[C@H](N)C(O)=O.CC(C)C[C@H](N)C(O)=O AUXMWYRZQPIXCC-KNIFDHDWSA-N 0.000 description 1
- DQUHYEDEGRNAFO-QMMMGPOBSA-N (2s)-6-amino-2-[(2-methylpropan-2-yl)oxycarbonylamino]hexanoic acid Chemical compound CC(C)(C)OC(=O)N[C@H](C(O)=O)CCCCN DQUHYEDEGRNAFO-QMMMGPOBSA-N 0.000 description 1
- RYHBNJHYFVUHQT-UHFFFAOYSA-N 1,4-Dioxane Chemical compound C1COCCO1 RYHBNJHYFVUHQT-UHFFFAOYSA-N 0.000 description 1
- OZRFYUJEXYKQDV-UHFFFAOYSA-N 2-[[2-[[2-[(2-amino-3-carboxypropanoyl)amino]-3-carboxypropanoyl]amino]-3-carboxypropanoyl]amino]butanedioic acid Chemical compound OC(=O)CC(N)C(=O)NC(CC(O)=O)C(=O)NC(CC(O)=O)C(=O)NC(CC(O)=O)C(O)=O OZRFYUJEXYKQDV-UHFFFAOYSA-N 0.000 description 1
- 229920001817 Agar Polymers 0.000 description 1
- 229920000936 Agarose Polymers 0.000 description 1
- WQVFQXXBNHHPLX-ZKWXMUAHSA-N Ala-Ala-His Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O WQVFQXXBNHHPLX-ZKWXMUAHSA-N 0.000 description 1
- YYSWCHMLFJLLBJ-ZLUOBGJFSA-N Ala-Ala-Ser Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O YYSWCHMLFJLLBJ-ZLUOBGJFSA-N 0.000 description 1
- SHYYAQLDNVHPFT-DLOVCJGASA-N Ala-Asn-Phe Chemical compound C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 SHYYAQLDNVHPFT-DLOVCJGASA-N 0.000 description 1
- LZRNYBIJOSKKRJ-XVYDVKMFSA-N Ala-Asp-His Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N LZRNYBIJOSKKRJ-XVYDVKMFSA-N 0.000 description 1
- LSLIRHLIUDVNBN-CIUDSAMLSA-N Ala-Asp-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN LSLIRHLIUDVNBN-CIUDSAMLSA-N 0.000 description 1
- OKIKVSXTXVVFDV-MMWGEVLESA-N Ala-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C)N OKIKVSXTXVVFDV-MMWGEVLESA-N 0.000 description 1
- BVLPIIBTWIYOML-ZKWXMUAHSA-N Ala-Val-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O BVLPIIBTWIYOML-ZKWXMUAHSA-N 0.000 description 1
- VKKYFICVTYKFIO-CIUDSAMLSA-N Arg-Ala-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCN=C(N)N VKKYFICVTYKFIO-CIUDSAMLSA-N 0.000 description 1
- PTVGLOCPAVYPFG-CIUDSAMLSA-N Arg-Gln-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O PTVGLOCPAVYPFG-CIUDSAMLSA-N 0.000 description 1
- AGVNTAUPLWIQEN-ZPFDUUQYSA-N Arg-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N AGVNTAUPLWIQEN-ZPFDUUQYSA-N 0.000 description 1
- JPAWCMXVNZPJLO-IHRRRGAJSA-N Arg-Ser-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JPAWCMXVNZPJLO-IHRRRGAJSA-N 0.000 description 1
- YNSUUAOAFCVINY-OSUNSFLBSA-N Arg-Thr-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O YNSUUAOAFCVINY-OSUNSFLBSA-N 0.000 description 1
- 239000004475 Arginine Substances 0.000 description 1
- MEFGKQUUYZOLHM-GMOBBJLQSA-N Asn-Arg-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MEFGKQUUYZOLHM-GMOBBJLQSA-N 0.000 description 1
- PTNFNTOBUDWHNZ-GUBZILKMSA-N Asn-Arg-Met Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(O)=O PTNFNTOBUDWHNZ-GUBZILKMSA-N 0.000 description 1
- LJUOLNXOWSWGKF-ACZMJKKPSA-N Asn-Asn-Glu Chemical compound C(CC(=O)O)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)N)N LJUOLNXOWSWGKF-ACZMJKKPSA-N 0.000 description 1
- KHCNTVRVAYCPQE-CIUDSAMLSA-N Asn-Lys-Asn Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O KHCNTVRVAYCPQE-CIUDSAMLSA-N 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/575—Hormones
- C07K14/605—Glucagons
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P3/00—Drugs for disorders of the metabolism
- A61P3/08—Drugs for disorders of the metabolism for glucose homeostasis
- A61P3/10—Drugs for disorders of the metabolism for glucose homeostasis for hyperglycaemia, e.g. antidiabetics
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P21/00—Preparation of peptides or proteins
- C12P21/06—Preparation of peptides or proteins produced by the hydrolysis of a peptide bond, e.g. hydrolysate products
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K38/00—Medicinal preparations containing peptides
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/01—Fusion polypeptide containing a localisation/targetting motif
- C07K2319/02—Fusion polypeptide containing a localisation/targetting motif containing a signal sequence
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/50—Fusion polypeptide containing protease site
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/60—Fusion polypeptide containing spectroscopic/fluorescent detection, e.g. green fluorescent protein [GFP]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P20/00—Technologies relating to chemical industry
- Y02P20/50—Improvements relating to the production of bulk chemicals
- Y02P20/55—Design of synthesis routes, e.g. reducing the use of auxiliary or protecting groups
Abstract
The invention provides a somaglutide derivative and a preparation method thereof. Specifically, the method uses a somaglutide fusion protein containing a green fluorescent protein folding unit to prepare somaglutide: the Boc-modified somaglutide backbone is Fmoc-modified, and the somaglutide side chain is added by orthogonal protection. The invention also provides the Fmoc- and Boc-modified somaglutide backbone involved in the preparation process and a fusion protein comprising the somaglutide backbone.
Description
Technical Field
The invention relates to the field of biomedicine, and in particular to a preparation method of somaglutide.
Background
Diabetes is a major disease threatening human health worldwide. In China, the prevalence of diabetes is rising rapidly as lifestyles change and the population ages. The acute and chronic complications of diabetes, especially the chronic ones, affect multiple organs and carry high rates of disability and mortality. They seriously harm the physical and psychological health of patients and place a heavy burden on individuals, families and society.
Somaglutide is a hypoglycemic drug developed by Novo Nordisk. It significantly lowers the level of glycosylated hemoglobin (HbA1c) in patients with type 2 diabetes and reduces body weight, while greatly reducing the risk of hypoglycemia. Somaglutide is obtained by modifying GLP-1 (7-37). Compared with liraglutide, somaglutide has a longer fatty chain, which increases hydrophobicity; however, it is also modified with short-chain PEG, which greatly enhances hydrophilicity. After PEG modification, the molecule binds tightly to albumin, which masks the DPP-4 hydrolysis site, and renal excretion is reduced, so that the biological half-life is prolonged and a long-circulating effect is achieved.
The CAS number of somaglutide is 910463-68-2 and its English name is Semaglutide. Its sequence is as follows: H-His1-Aib2-Glu3-Gly4-Thr5-Phe6-Thr7-Ser8-Asp9-Val10-Ser11-Ser12-Tyr13-Leu14-Glu15-Gly16-Gln17-Ala18-Ala19-Lys20(PEG2-PEG2-gamma-Glu-octadecanedioic acid)-Glu21-Phe22-Ile23-Ala24-Trp25-Leu26-Val27-Arg28-Gly29-Arg30-Gly31-OH.
Patent application CN201611095162 uses a fragment condensation method to synthesize fully protected somaglutide, and crude somaglutide peptide is obtained after cleavage. Because the method relies on fragment condensation, the raw materials are not readily available and the cost is high. In addition, the side chain is coupled by first extending the main chain to Thr at position 5 and then removing the Alloc protecting group from the side chain of Lys at position 20. This approach tends to cause condensation polymerization of the fragment 2 resin during synthesis, greatly reduces the efficiency of coupling the amino acids after Lys20 and of coupling fragment 1, readily generates racemization impurities, and is not suitable for industrial production.
Patent application CN201511027176 obtains fully protected somaglutide resin by solid-phase synthesis; crude somaglutide peptide is obtained after cleavage, and refined somaglutide is obtained after purification. The method first assembles the main chain, then removes the Alloc protecting group from the Lys side chain and couples the side chain. This approach tends to cause condensation polymerization of the resin during synthesis, greatly reduces coupling efficiency, and readily generates racemization impurities, in particular through racemization of the last amino acid, His, which greatly reduces the product yield and increases production cost.
Therefore, those skilled in the art are working to develop new methods for producing somaglutide.
Disclosure of Invention
The invention aims to provide a preparation method of somaglutide.
In a first aspect of the invention, there is provided a somaglutide precursor fusion protein having, from N-terminus to C-terminus, the structure of formula I:
A-FP-TEV-EK-G (I)
wherein,
"-" represents a peptide bond;
A is absent or a leader peptide sequence;
FP is a green fluorescent protein folding unit;
TEV is a first enzyme cleavage site, preferably a TEV protease cleavage site (sequence ENLYFQG, SEQ ID NO: 8);
EK is a second enzyme cleavage site, preferably an enterokinase cleavage site (sequence DDDDDDK, SEQ ID NO: 9);
G is a somaglutide precursor or a fragment thereof;
wherein the green fluorescent protein folding unit comprises 2-6 β-sheet units selected from the group consisting of:
In another preferred embodiment, the green fluorescent protein folding unit is u2-u3, u4-u5, u1-u2-u3, u3-u4-u5 or u4-u5-u6.
In another preferred embodiment, G is a Boc-modified somaglutide precursor; the somaglutide precursor lacks 2-5 amino acids at the N-terminus relative to the somaglutide backbone, and the lysine contained in the somaglutide precursor is Boc-modified.
In another preferred embodiment, the epsilon amino group of the Boc-modified lysine is modified with a tert-butoxycarbonyl group.
In another preferred embodiment, the amino acid sequence of the backbone of the somaglutide is shown in SEQ ID NO. 3.
In another preferred embodiment, the somaglutide precursor comprises:
a first somaglutide precursor that is Boc-modified at position 18, the amino acid sequence of which is shown in SEQ ID NO. 1;
or a second somaglutide precursor that is Boc-modified at position 17, the amino acid sequence of which is shown in SEQ ID NO. 2.
SEQ ID NO:1:EGTFTSDVSSYLEGQAAKEFIAWLVRGRG
SEQ ID NO:2:GTFTSDVSSYLEGQAAKEFIAWLVRGRG (K is Boc modified lysine)
In the present application, the complete somaglutide sequence (H-(Aib)-EGTFTSDVSSYLEGQAAKEFIAWLVRGRG, SEQ ID NO:3) is defined as the somaglutide backbone, and somaglutide lacking N-terminal amino acids is defined as the somaglutide precursor. In the Fmoc-modified somaglutide backbone, the N-terminal H is Fmoc-modified; in the Boc-modified somaglutide backbone, the lysine at position 20 is Nε-(tert-butoxycarbonyl)-lysine.
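The definitions above can be checked with a minimal Python sketch, which confirms that SEQ ID NO: 1 and SEQ ID NO: 2 are N-terminal truncations of the backbone (SEQ ID NO: 3) and locates the lysine in each. The placeholder letter "X" for Aib and the function names are assumptions made only for this illustration.

```python
# 'X' is a placeholder for Aib (2-aminoisobutyric acid); this one-letter
# encoding is an assumption for illustration, not part of the sequence listing.
BACKBONE = "HXEGTFTSDVSSYLEGQAAKEFIAWLVRGRG"    # SEQ ID NO: 3
PRECURSOR_1 = "EGTFTSDVSSYLEGQAAKEFIAWLVRGRG"   # SEQ ID NO: 1
PRECURSOR_2 = "GTFTSDVSSYLEGQAAKEFIAWLVRGRG"    # SEQ ID NO: 2


def lysine_position(seq: str) -> int:
    """Return the 1-based position of the single lysine (K) in seq."""
    if seq.count("K") != 1:
        raise ValueError("expected exactly one lysine")
    return seq.index("K") + 1


def n_terminal_deletion(backbone: str, precursor: str) -> int:
    """Return how many N-terminal residues the precursor lacks."""
    if not backbone.endswith(precursor):
        raise ValueError("precursor is not an N-terminal truncation")
    return len(backbone) - len(precursor)


for name, seq in [("backbone", BACKBONE),
                  ("precursor 1", PRECURSOR_1),
                  ("precursor 2", PRECURSOR_2)]:
    print(name, "length", len(seq), "lysine at position", lysine_position(seq))
```

Under these assumptions the lysine falls at position 20 of the backbone and at positions 18 and 17 of the two precursors, which matches the numbering used in the text.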
In another preferred embodiment, the green fluorescent protein folding unit is u3-u4-u 5.
In another preferred embodiment, the amino acid sequence of the leader peptide is as shown in SEQ ID NO 7.
In another preferred embodiment, position 17 or 18 of the somaglutide precursor is Nε-(tert-butoxycarbonyl)-lysine.
In a second aspect of the present invention, there is provided an Fmoc and Boc modified somaglutide backbone, wherein position 20 of the somaglutide backbone is a protected lysine, the protected lysine being Nε-(tert-butoxycarbonyl)-lysine, and the N-terminus of the somaglutide backbone is an Fmoc-modified histidine.
In another preferred embodiment, Fmoc is fluorenylmethyloxycarbonyl.
In another preferred embodiment, the amino acid sequence of the backbone of the somaglutide is shown in SEQ ID NO. 3.
In a third aspect of the present invention, there is provided a Boc-modified somaglutide precursor, comprising:
a first somaglutide precursor that is Boc-modified at position 18, the amino acid sequence of which is shown in SEQ ID NO. 1;
or a second somaglutide precursor that is Boc-modified at position 17, the amino acid sequence of which is shown in SEQ ID NO. 2.
In a fourth aspect of the present invention, an Fmoc-modified somaglutide backbone is provided, wherein the N-terminus of the somaglutide backbone is Fmoc-modified histidine, and the amino acid sequence of the somaglutide backbone is shown in SEQ ID No. 3.
In a fifth aspect of the invention, there is provided a method of preparing somaglutide, the method comprising the steps of:
(A) preparing the somaglutide precursor fusion protein by fermentation with a recombinant bacterium, and
(B) preparing somaglutide from the somaglutide precursor fusion protein,
wherein the somaglutide precursor fusion protein is as described in the first aspect of the invention.
In another preferred embodiment, the step (B) further includes the steps of:
(i) subjecting the somaglutide precursor fusion protein to enzyme digestion to obtain a Boc-modified somaglutide precursor;
(ii) attaching an Fmoc complex to the N-terminus of the Boc-modified somaglutide precursor, thereby preparing an Fmoc and Boc modified somaglutide backbone,
wherein the Fmoc complex comprises X amino acids from the N-terminus of the somaglutide backbone, and the N-terminal amino acid of the Fmoc complex is Fmoc-modified;
(iii) subjecting the Fmoc and Boc modified somaglutide backbone to Boc removal, and reacting it with the somaglutide side chain to prepare Fmoc-modified somaglutide;
(iv) subjecting the Fmoc-modified somaglutide to Fmoc removal to obtain de-Fmoc somaglutide; and
(v) subjecting the de-Fmoc somaglutide to side-chain tBu removal to prepare somaglutide.
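The order of protection and deprotection in steps (ii) to (v) can be traced with a small Python sketch. It is only a bookkeeping illustration: the group labels, the `STEPS` table and `run_route` are hypothetical names, and the sketch models which protecting groups are present, not the chemistry itself.

```python
# Each step: (label, groups added, groups removed). Labels are illustrative.
STEPS = [
    ("ii: attach Fmoc complex", {"N-terminal Fmoc"}, set()),
    ("iii-a: remove Boc", set(), {"Lys Boc"}),
    ("iii-b: couple side chain", {"side-chain tBu"}, set()),
    ("iv: remove Fmoc", set(), {"N-terminal Fmoc"}),
    ("v: remove tBu", set(), {"side-chain tBu"}),
]


def run_route(start=frozenset({"Lys Boc"})):
    """Return (label, protecting groups present) after each step."""
    groups = set(start)  # state after step (i): Boc-modified precursor
    trace = []
    for label, added, removed in STEPS:
        missing = removed - groups
        if missing:
            raise ValueError(f"{label}: cannot remove absent group(s) {missing}")
        groups = (groups - removed) | added
        trace.append((label, frozenset(groups)))
    return trace


for label, groups in run_route():
    print(f"{label:26s} -> {sorted(groups)}")
```

After Boc removal only the N-terminal Fmoc remains in this model, so the lysine ε-amino group is the only amine left unprotected when the side chain is coupled, which is the point of combining Fmoc with Boc here.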
In another preferred example, in step (i), the enzyme digestion treatment is performed using enterokinase.
In another preferred embodiment, the Boc-modified somaglutide precursor comprises:
a first somaglutide precursor that is Boc-modified at position 18, the amino acid sequence of which is shown in SEQ ID NO. 1;
or a second somaglutide precursor that is Boc-modified at position 17, the amino acid sequence of which is shown in SEQ ID NO. 2.
In another preferred embodiment, the Fmoc complex is Fmoc-H-Aib or Fmoc-H-Aib-E.
In another preferred embodiment, in step (i) and step (ii), the value of X is the same.
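The relationship between the Fmoc complex and the precursor can be sketched as follows: if the precursor lacks X N-terminal residues, the complex supplies exactly those X residues. In this minimal sketch "X" in the sequence string is an assumed placeholder letter for Aib, and `split_for_ligation` is a hypothetical helper.

```python
# 'X' inside the string is a placeholder for Aib (assumption for this sketch).
BACKBONE = "HXEGTFTSDVSSYLEGQAAKEFIAWLVRGRG"  # SEQ ID NO: 3


def split_for_ligation(backbone: str, x: int):
    """Split the backbone into the x N-terminal residues carried by the
    Fmoc complex and the remaining precursor."""
    if not 0 < x < len(backbone):
        raise ValueError("x must lie strictly inside the backbone")
    return backbone[:x], backbone[x:]


for x in (2, 3):
    complex_part, precursor = split_for_ligation(BACKBONE, x)
    label = "Fmoc-" + "-".join(complex_part)
    print(f"X={x}: {label} + precursor of {len(precursor)} residues "
          f"(Lys at position {precursor.index('K') + 1})")
```

With X = 2 the complex corresponds to Fmoc-H-Aib and the remainder to SEQ ID NO: 1; with X = 3 it corresponds to Fmoc-H-Aib-E and SEQ ID NO: 2.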
In another preferred embodiment, the Fmoc and Boc modified somaglutide backbone is as described in the second aspect of the invention.
In another preferred embodiment, the reaction of step (ii) is as follows:
In another preferred embodiment, the side chain of the somaglutide is as follows:
In another preferred embodiment, in step (ii), the Fmoc complex, DIPEA (N,N-diisopropylethylamine) and DMF (N,N-dimethylformamide) are added, thereby attaching the Fmoc complex to the N-terminus of the Boc-modified somaglutide precursor.
In another preferred embodiment, the molar ratio of Fmoc complex, DIPEA and Boc-modified somaglutide precursor added is (1.0-3.0): (10-14): (0.8-1.2), preferably (2-2.8): (11-13): (0.8-1.2).
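As a worked example of the molar ratio, the following Python sketch scales a chosen ratio to a given amount of precursor and checks it against the stated ranges. The specific ratio 2.4 : 12 : 1 and the 0.5 mmol batch size are hypothetical values picked for illustration; only the ranges come from the text.

```python
# Ranges from the text: Fmoc complex : DIPEA : Boc-modified precursor.
BROAD = {"fmoc_complex": (1.0, 3.0), "dipea": (10.0, 14.0), "precursor": (0.8, 1.2)}
PREFERRED = {"fmoc_complex": (2.0, 2.8), "dipea": (11.0, 13.0), "precursor": (0.8, 1.2)}


def in_range(parts: dict, ranges: dict) -> bool:
    """True if every ratio part lies inside its (low, high) range."""
    return all(low <= parts[name] <= high for name, (low, high) in ranges.items())


def reagent_amounts(precursor_mmol: float, parts: dict) -> dict:
    """Scale the ratio parts so that the precursor equals precursor_mmol."""
    scale = precursor_mmol / parts["precursor"]
    return {name: value * scale for name, value in parts.items()}


parts = {"fmoc_complex": 2.4, "dipea": 12.0, "precursor": 1.0}  # hypothetical choice
print("within broad range:", in_range(parts, BROAD))
print("within preferred range:", in_range(parts, PREFERRED))
for name, mmol in reagent_amounts(0.5, parts).items():
    print(f"{name}: {mmol:.2f} mmol")
```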
In another preferred embodiment, between the step (ii) and the step (iii), a step of purifying the prepared Fmoc and Boc modified somaglutide backbone is further included.
In another preferred embodiment, the purification treatment comprises adding an organic solvent to the reaction solution to obtain a solid product; more preferably, the organic solvent is a methyl tert-butyl ether/petroleum ether mixture.
In another preferred embodiment, in step (iii), the method further comprises the steps of:
(a) adding a TFA solution and stirring at low temperature to remove Boc, giving a de-Boc product;
(b) adding an organic solvent to the reaction solution of step (a) to obtain the solid de-Boc product, preferably the organic solvent is a methyl tert-butyl ether/petroleum ether mixture; and
(c) mixing the de-Boc product with the somaglutide side chain to prepare the Fmoc-modified somaglutide.
In another preferred embodiment, in step (c), the solid Boc-removed product is mixed with the side chain of somaglutide in DMF and reacted at room temperature.
In another preferred embodiment, in step (c), the reaction system further comprises DIPEA.
In another preferred embodiment, in step (iv), a piperidine-containing DMF solution is added to remove Fmoc, thereby preparing de-Fmoc somaglutide.
In another preferred embodiment, in step (v), a mixed solution of TFA, TIS and DCM is added to remove the side-chain tBu protecting groups, thereby obtaining the somaglutide.
In another preferred embodiment, step (v) includes a step of purifying the produced somaglutide.
In another preferred embodiment, said Boc-modified somaglutide precursor is prepared using genetic recombination techniques.
In another preferred embodiment, in step (A), inclusion bodies of the somaglutide precursor fusion protein are separated from the fermentation broth of the recombinant bacterium, and the somaglutide precursor fusion protein is obtained after the inclusion bodies are subjected to renaturation and enzyme digestion.
In another preferred embodiment, before and after step (i), a purification step, preferably reverse-phase chromatography, is further included.
In another preferred embodiment, the recombinant bacterium comprises or integrates an expression cassette for expressing the somaglutide precursor fusion protein.
In another preferred embodiment, the method comprises the following steps:
In another preferred embodiment, the method comprises the steps of:
(i) providing the somaglutide precursor fusion protein of the first aspect of the invention and subjecting it to enzyme digestion to obtain compound 1;
(ii) attaching compound 1 to the Fmoc-H-Aib complex to produce compound 2;
(iii) subjecting compound 2 to Boc removal and reacting it with the somaglutide side chain to obtain compound 4;
(iv) subjecting compound 4 to Fmoc removal to obtain compound 5; and
(v) subjecting compound 5 to side-chain tBu removal to prepare somaglutide, shown as compound 6.
In another preferred embodiment, the method comprises the following steps:
In another preferred embodiment, the method comprises the steps of:
(i) providing the somaglutide precursor fusion protein of the first aspect of the invention and subjecting it to enzyme digestion to obtain compound 7;
(ii) attaching compound 7 to the Fmoc-H-Aib-E complex to produce compound 2;
(iii) subjecting compound 2 to Boc removal and reacting it with the somaglutide side chain to obtain compound 4;
(iv) subjecting compound 4 to Fmoc removal to obtain compound 5; and
(v) subjecting compound 5 to side-chain tBu removal to prepare somaglutide, shown as compound 6.
In a sixth aspect of the invention, there is provided an isolated polynucleotide encoding a somaglutide precursor fusion protein according to the first aspect of the invention, an Fmoc and Boc modified somaglutide backbone according to the second aspect of the invention, a Boc-modified somaglutide precursor according to the third aspect of the invention or an Fmoc-modified somaglutide backbone according to the fourth aspect of the invention.
In a seventh aspect of the invention, there is provided a vector comprising a polynucleotide according to the sixth aspect of the invention.
In another preferred embodiment, the vector is selected from the group consisting of: DNA, RNA, plasmids, lentiviral vectors, adenoviral vectors, retroviral vectors, transposons, or combinations thereof.
In an eighth aspect of the present invention, there is provided a host cell comprising the vector of the seventh aspect of the present invention or having the polynucleotide of the sixth aspect of the present invention integrated exogenously into the chromosome.
In another preferred embodiment, the host cell is Escherichia coli, Bacillus subtilis, a yeast cell, an insect cell, a mammalian cell, or a combination thereof.
In a ninth aspect of the invention, there is provided a formulation comprising a somaglutide precursor fusion protein according to the first aspect of the invention, an Fmoc and Boc modified somaglutide backbone according to the second aspect of the invention, a Boc modified somaglutide precursor according to the third aspect of the invention or an Fmoc modified somaglutide backbone according to the fourth aspect of the invention.
In a tenth aspect of the invention there is provided a preparation of somaglutide prepared using the method of the fifth aspect of the invention.
Drawings
FIG. 1 shows a map of plasmid pBAD-FP-TEV-EK-GLP-1 (18).
FIG. 2 shows a map of plasmid pBAD-FP-TEV-EK-GLP-1 (17).
FIG. 3 shows a map of the plasmid pEvol-pylRs-pylT.
FIG. 4 shows an SDS-PAGE electrophoresis of the Boc-somaglutide precursor fusion protein after renaturation of inclusion bodies.
FIG. 5 shows HPLC detection profile of Boc-somaglutide precursor.
Detailed Description
After extensive and intensive research, the inventors have found a somaglutide derivative and a preparation method thereof. Specifically, the method of the invention prepares somaglutide from a somaglutide fusion protein containing a green fluorescent protein folding unit, performs Fmoc modification on the Boc-modified somaglutide backbone, and attaches the somaglutide side chain under orthogonal protection. The invention also provides the Fmoc and Boc modified somaglutide backbone involved in the preparation process and a fusion protein comprising the somaglutide backbone. The method of the invention does not require expensive solid-phase synthesis instruments, shortens the production cycle, simplifies the production process, and improves the purity and yield of the product.
Somaglutide
Somaglutide (English name: Semaglutide; CAS number: 910463-68-2) was developed by Novo Nordisk. It is a human glucagon-like peptide-1 (GLP-1) analogue with the sequence: H-His1-Aib2-Glu3-Gly4-Thr5-Phe6-Thr7-Ser8-Asp9-Val10-Ser11-Ser12-Tyr13-Leu14-Glu15-Gly16-Gln17-Ala18-Ala19-Lys20(PEG2-PEG2-γ-Glu-octadecanedioic acid)-Glu21-Phe22-Ile23-Ala24-Trp25-Leu26-Val27-Arg28-Gly29-Arg30-Gly31-OH. Its sequence homology with native human GLP-1 reaches 97 percent.
Somaglutide is a hypoglycemic drug developed by Novo Nordisk. It significantly lowers the level of glycosylated hemoglobin (HbA1c) in patients with type 2 diabetes, reduces body weight, and greatly reduces the risk of hypoglycemia. Semaglutide is obtained by modifying GLP-1 (7-37). Compared with liraglutide, Semaglutide has a longer fatty chain and therefore greater hydrophobicity, but it is also modified with short-chain PEG, which greatly enhances its hydrophilicity. After PEG modification, it binds tightly to albumin, which masks the DPP-4 enzymatic hydrolysis site, and renal excretion is reduced, so the biological half-life is prolonged and a long-circulating effect is achieved. It can significantly reduce fasting or postprandial blood glucose in patients with type 2 diabetes, thereby regulating blood glucose levels in vivo, and can also reduce body weight and the risk of death in patients with cardiovascular disease.
Fusion proteins
The present invention constructs a somaglutide precursor fusion protein using a green fluorescent protein folding unit, as described in the first aspect of the invention.
The green fluorescent protein folding unit FP comprised in the fusion protein of the invention comprises 2 to 6, preferably 2 to 3, β-sheet units selected from the group consisting of:
β-sheet unit | Amino acid sequence |
u1 | VPILVELDGDVNG(SEQ ID NO:11) |
u2 | HKFSVRGEGEGDAT(SEQ ID NO:12) |
u3 | KLTLKFICTT(SEQ ID NO:13) |
u4 | YVQERTISFKD(SEQ ID NO:14) |
u5 | TYKTRAEVKFEGD(SEQ ID NO:15) |
u6 | TLVNRIELKGIDF(SEQ ID NO:16) |
u7 | HNVYITADKQ(SEQ ID NO:17) |
u8 | GIKANFKIRHNVED(SEQ ID NO:18) |
u9 | VQLADHYQQNTPIG(SEQ ID NO:19) |
u10 | HYLSTQSVLSKD(SEQ ID NO:20) |
u11 | HMVLLEFVTAAGI(SEQ ID NO:21) |
In another preferred embodiment, the green fluorescent protein folding unit FP can be selected from: u8, u9, u2-u3, u4-u5, u8-u9, u1-u2-u3, u2-u3-u4, u3-u4-u5, u5-u6-u7, u8-u9-u10, ..., u1-I-u5, u2-I-u4, u3-I-u8, u5-I-u6, or u10-I-u11.
In another preferred embodiment, the green fluorescent protein folding unit is u3-u4-u5 or u4-u5-u 6.
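Formula I can be illustrated by assembling a fusion protein sequence from its parts. The sketch below is a minimal example under stated assumptions: units joined by "-" are concatenated directly, the leader peptide A is absent, Boc-lysine is written as a plain "K", and release of G by cleavage immediately after the EK site is assumed from the enterokinase digestion step. The function names are hypothetical.

```python
# Subset of the beta-sheet unit table (SEQ ID NO: 13 to 16).
UNITS = {
    "u3": "KLTLKFICTT",
    "u4": "YVQERTISFKD",
    "u5": "TYKTRAEVKFEGD",
    "u6": "TLVNRIELKGIDF",
}
TEV_SITE = "ENLYFQG"   # SEQ ID NO: 8
EK_SITE = "DDDDDDK"    # SEQ ID NO: 9, as given in the text
PRECURSOR_1 = "EGTFTSDVSSYLEGQAAKEFIAWLVRGRG"  # SEQ ID NO: 1 (Boc-Lys shown as K)


def folding_unit(spec: str) -> str:
    """Concatenate beta-sheet units named in a spec such as 'u3-u4-u5'."""
    return "".join(UNITS[name] for name in spec.split("-"))


def fusion_protein(fp_spec: str, g: str, leader: str = "") -> str:
    """Assemble A-FP-TEV-EK-G; an empty leader models 'A is absent'."""
    return leader + folding_unit(fp_spec) + TEV_SITE + EK_SITE + g


def release_after_ek(fusion: str) -> str:
    """Return the part C-terminal to the EK site (assumed cleavage point)."""
    return fusion[fusion.index(EK_SITE) + len(EK_SITE):]


fusion = fusion_protein("u3-u4-u5", PRECURSOR_1)
print(fusion)
print("length:", len(fusion))
print("released:", release_after_ek(fusion))
```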
The term "fusion protein" as used herein also includes variants having the above-described activities. These variants include (but are not limited to): deletion, insertion and/or substitution of 1 to 3 (usually 1 to 2, more preferably 1) amino acids, and addition or deletion of one or several (usually up to 3, preferably up to 2, more preferably up to 1) amino acids at the C-terminal and/or N-terminal. For example, in the art, substitutions with amino acids of close or similar properties will not generally alter the function of the protein. Also, for example, the addition or deletion of one or several amino acids at the C-terminus and/or N-terminus does not generally alter the structure and function of the protein. In addition, the term also includes monomeric and multimeric forms of the polypeptides of the invention. The term also includes linear as well as non-linear polypeptides (e.g., cyclic peptides).
The invention also includes active fragments, derivatives and analogs of the above fusion proteins. As used herein, the terms "fragment," "derivative," and "analog" refer to a polypeptide that substantially retains the function or activity of a fusion protein of the invention. The polypeptide fragment, derivative or analogue of the present invention may be (i) a polypeptide in which one or more conserved or non-conserved amino acid residues (preferably conserved amino acid residues) are substituted, or (ii) a polypeptide having a substituent group in one or more amino acid residues, or (iii) a polypeptide in which a polypeptide is fused with another compound (such as a compound for increasing the half-life of the polypeptide, e.g., polyethylene glycol), or (iv) a polypeptide in which an additional amino acid sequence is fused with the polypeptide sequence (a fusion protein in which a tag sequence such as a leader sequence, a secretory sequence or 6His is fused). Such fragments, derivatives and analogs are within the purview of those skilled in the art in view of the teachings herein.
A preferred class of active derivatives refers to polypeptides formed by the replacement of up to 3, preferably up to 2, more preferably up to 1 amino acid with an amino acid of close or similar properties compared to the amino acid sequence of the present invention. These conservative variants are preferably produced by amino acid substitutions according to Table A.
TABLE A
Initial residue(s) | Representative substitutions | Preferred substitutions |
Ala(A) | Val;Leu;Ile | Val |
Arg(R) | Lys;Gln;Asn | Lys |
Asn(N) | Gln;His;Lys;Arg | Gln |
Asp(D) | Glu | Glu |
Cys(C) | Ser | Ser |
Gln(Q) | Asn | Asn |
Glu(E) | Asp | Asp |
Gly(G) | Pro;Ala | Ala |
His(H) | Asn;Gln;Lys;Arg | Arg |
Ile(I) | Leu;Val;Met;Ala;Phe | Leu |
Leu(L) | Ile;Val;Met;Ala;Phe | Ile |
Lys(K) | Arg;Gln;Asn | Arg |
Met(M) | Leu;Phe;Ile | Leu |
Phe(F) | Leu;Val;Ile;Ala;Tyr | Leu |
Pro(P) | Ala | Ala |
Ser(S) | Thr | Thr |
Thr(T) | Ser | Ser |
Trp(W) | Tyr;Phe | Tyr |
Tyr(Y) | Trp;Phe;Thr;Ser | Phe |
Val(V) | Ile;Leu;Met;Phe;Ala | Leu |
The invention also provides analogs of the fusion proteins of the invention. These analogs may differ from the polypeptides of the invention by amino acid sequence differences, by modifications that do not affect the sequence, or by both. Analogs also include analogs having residues other than the natural L-amino acids (e.g., D-amino acids), as well as analogs having non-naturally occurring or synthetic amino acids (e.g., beta, gamma-amino acids). It is to be understood that the polypeptides of the present invention are not limited to the representative polypeptides exemplified above.
In addition, modifications may be made to the fusion proteins of the invention. Modified (generally without altering primary structure) forms include: chemically derivatized forms of the polypeptide, such as acetylation or carboxylation, in vivo or in vitro. Modifications also include glycosylation, such as those resulting from glycosylation modifications in the synthesis and processing of the polypeptide or in further processing steps. Such modification may be accomplished by exposing the polypeptide to an enzyme that performs glycosylation, such as a mammalian glycosylase or deglycosylase. Modified forms also include sequences having phosphorylated amino acid residues (e.g., phosphotyrosine, phosphoserine, phosphothreonine). Also included are polypeptides modified to increase their resistance to proteolysis or to optimize solubility.
The term "polynucleotide encoding a fusion protein of the present invention" may include a polynucleotide encoding a fusion protein of the present invention, and may also include polynucleotides that additionally include coding and/or non-coding sequences.
The invention also relates to variants of the above polynucleotides which encode fragments, analogs and derivatives of the polypeptides or fusion proteins having the same amino acid sequence as the present invention. These nucleotide variants include substitution variants, deletion variants and insertion variants. As is known in the art, an allelic variant is an alternative form of a polynucleotide, which may be a substitution, deletion, or insertion of one or more nucleotides, without substantially altering the function of the fusion protein encoded thereby.
The present invention also relates to polynucleotides which hybridize to the sequences described above and which have at least 50%, preferably at least 70%, and more preferably at least 80% identity between the two sequences. The present invention particularly relates to polynucleotides hybridizable under stringent conditions with the polynucleotides of the present invention. In the present invention, "stringent conditions" mean: (1) hybridization and elution at lower ionic strength and higher temperature, such as 0.2× SSC, 0.1% SDS, 60 °C; or (2) addition of a denaturant during hybridization, such as 50% (v/v) formamide, 0.1% calf serum/0.1% Ficoll, 42 °C; or (3) hybridization occurring only when the identity between the two sequences is at least 90%, preferably at least 95%.
The fusion proteins and polynucleotides of the invention are preferably provided in isolated form, and more preferably, purified to homogeneity.
The full-length sequence of the polynucleotide of the present invention can be obtained by PCR amplification, recombination, or artificial synthesis. For PCR amplification, primers can be designed based on the nucleotide sequences disclosed herein, particularly open reading frame sequences, and the sequences can be amplified using commercially available cDNA libraries or cDNA libraries prepared by conventional methods known to those skilled in the art as templates. When the sequence is long, two or more PCR amplifications are often required, and then the amplified fragments are spliced together in the correct order.
Once the sequence of interest has been obtained, it can be obtained in large quantities by recombinant methods. This is usually done by cloning it into a vector, transferring it into a cell, and isolating the relevant sequence from the propagated host cell by conventional methods.
In addition, the sequence can be synthesized by artificial synthesis, especially when the fragment length is short. Generally, fragments with long sequences are obtained by first synthesizing a plurality of small fragments and then ligating them.
At present, DNA sequences encoding the proteins of the present invention (or fragments or derivatives thereof) have been obtained completely by chemical synthesis. The DNA sequence may then be introduced into various existing DNA molecules (or vectors, for example) and cells known in the art.
Methods for amplifying DNA/RNA using PCR techniques are preferably used to obtain the polynucleotides of the invention. In particular, when it is difficult to obtain a full-length cDNA from a library, it is preferable to use the RACE method (rapid amplification of cDNA ends), and primers used for PCR can be appropriately selected based on the sequence information of the present invention disclosed herein and synthesized by a conventional method. The amplified DNA/RNA fragments can be isolated and purified by conventional methods, such as by gel electrophoresis.
Expression vector
The invention also relates to vectors comprising the polynucleotides of the invention, as well as genetically engineered host cells transformed with the vectors of the invention or the coding sequences of the fusion proteins of the invention, and methods for producing the polypeptides of the invention by recombinant techniques.
The polynucleotide sequences of the present invention may be used to express or produce recombinant fusion proteins by conventional recombinant DNA techniques. Generally, the following steps are performed:
(1) transforming or transducing a suitable host cell with a polynucleotide (or variant) of the invention encoding a fusion protein of the invention, or with a recombinant expression vector comprising the polynucleotide;
(2) culturing the host cell in a suitable medium;
(3) isolating and purifying the protein from the culture medium or the cells.
In the present invention, the polynucleotide sequence encoding the fusion protein may be inserted into a recombinant expression vector. The term "recombinant expression vector" refers to a bacterial plasmid, bacteriophage, yeast plasmid, plant cell virus, mammalian cell virus such as adenovirus, retrovirus, or other vectors well known in the art. Any plasmid or vector may be used as long as it can replicate and is stable in the host. An important feature of expression vectors is that they generally contain an origin of replication, a promoter, a marker gene and translation control elements.
Methods well known to those skilled in the art can be used to construct expression vectors containing the DNA sequences encoding the fusion proteins of the present invention and appropriate transcription/translation control signals. These methods include in vitro recombinant DNA techniques, DNA synthesis techniques, in vivo recombinant techniques, and the like. The DNA sequence may be operably linked to a suitable promoter in an expression vector to direct mRNA synthesis. Representative examples of such promoters are: lac or trp promoter of E.coli; a lambda phage PL promoter; eukaryotic promoters include CMV immediate early promoter, HSV thymidine kinase promoter, early and late SV40 promoter, LTRs of retrovirus, and other known promoters capable of controlling gene expression in prokaryotic or eukaryotic cells or viruses. The expression vector also includes a ribosome binding site for translation initiation and a transcription terminator.
Furthermore, the expression vector preferably comprises one or more selectable marker genes to provide phenotypic traits for selection of transformed host cells, such as dihydrofolate reductase, neomycin resistance and Green Fluorescent Protein (GFP) for eukaryotic cell culture, or tetracycline or ampicillin resistance for E.coli.
Vectors comprising the appropriate DNA sequences described above, together with appropriate promoter or control sequences, may be used to transform appropriate host cells to enable expression of the protein.
The host cell may be a prokaryotic cell, such as a bacterial cell; or a lower eukaryotic cell, such as a yeast cell; or a higher eukaryotic cell, such as a mammalian cell. Representative examples are: bacterial cells such as Escherichia coli, Streptomyces and Salmonella typhimurium; fungal cells such as yeast; and plant cells (e.g., ginseng cells).
When the polynucleotide of the present invention is expressed in higher eukaryotic cells, transcription will be enhanced if an enhancer sequence is inserted into the vector. Enhancers are cis-acting elements of DNA, usually about 10 to 300 base pairs, that act on a promoter to increase transcription of a gene. Examples include the SV40 enhancer of 100 to 270 base pairs on the late side of the replication origin, the polyoma enhancer on the late side of the replication origin, and adenovirus enhancers.
It will be clear to one of ordinary skill in the art how to select appropriate vectors, promoters, enhancers and host cells.
Transformation of a host cell with recombinant DNA can be carried out using conventional techniques well known to those skilled in the art. When the host is a prokaryote such as E. coli, competent cells capable of DNA uptake can be harvested after the exponential growth phase and treated by the CaCl2 method, the steps of which are well known in the art. Another method is to use MgCl2. If desired, transformation can also be carried out by electroporation. When the host is a eukaryote, the following DNA transfection methods may be used: calcium phosphate coprecipitation, conventional mechanical methods such as microinjection, electroporation, liposome encapsulation, etc.
The obtained transformant can be cultured by a conventional method to express the polypeptide encoded by the gene of the present invention. The medium used in the culture may be selected from various conventional media depending on the host cell used. The culturing is performed under conditions suitable for growth of the host cell. After the host cells have been grown to an appropriate cell density, the selected promoter is induced by suitable means (e.g., temperature shift or chemical induction) and the cells are cultured for an additional period of time.
The recombinant polypeptide in the above method may be expressed intracellularly or on the cell membrane, or secreted extracellularly. If necessary, the recombinant protein can be isolated and purified by various separation methods using its physical, chemical and other properties. These methods are well known to those skilled in the art. Examples of such methods include, but are not limited to: conventional renaturation treatment, treatment with a protein precipitant (such as salt precipitation), centrifugation, cell lysis by osmosis, sonication, ultracentrifugation, molecular sieve chromatography (gel filtration), adsorption chromatography, ion exchange chromatography, High Performance Liquid Chromatography (HPLC), and other various liquid chromatography techniques, and combinations thereof.
Construction of a somaglutide expression vector
The target gene fragment FP-TEV-EK-GLP1 (18 or 17) was synthesized with recognition sites for the restriction enzymes Nco I and Xho I at its two ends. The sequence was codon-optimized to allow high-level expression of functional protein in Escherichia coli. The expression vector "pBAD/His A (KanaR)" and the plasmid containing the "FP-TEV-EK-GLP1 (18 or 17)" target gene were then digested with the restriction enzymes Nco I and Xho I, the digestion products were separated by agarose gel electrophoresis and recovered with an agarose gel DNA recovery kit, and the two DNA fragments were finally ligated with T4 DNA ligase. The ligation product was chemically transformed into E. coli Top10 cells, and the transformed cells were cultured overnight on LB agar medium (10 g/L yeast peptone, 5 g/L yeast extract powder, 10 g/L NaCl, 1.5% agar) containing 50 μg/mL kanamycin. Three viable colonies were picked and cultured overnight in 5 mL of liquid LB medium (10 g/L yeast peptone, 5 g/L yeast extract powder, 10 g/L NaCl) containing 50 μg/mL kanamycin, and plasmids were extracted with a plasmid miniprep kit. The extracted plasmids were then sequenced with the sequencing oligonucleotide primer 5'-ATGCCATAGCATTTTTATCC-3' (SEQ ID NO:15) to confirm correct insertion. The resulting plasmid was designated "pBAD-FP-TEV-EK-GLP1 (18 or 17)".
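When a synthetic fragment is flanked by Nco I and Xho I sites for directional cloning, the coding sequence itself should not contain either site. The Python sketch below checks this for a candidate sequence using the standard recognition sequences CCATGG (Nco I) and CTCGAG (Xho I). The short ORF is a made-up placeholder, not the codon-optimized sequence of the invention, and the helper names are hypothetical.

```python
NCO_I = "CCATGG"  # Nco I recognition sequence
XHO_I = "CTCGAG"  # Xho I recognition sequence


def count_sites(dna: str, site: str) -> int:
    """Count occurrences of a recognition site, including overlapping ones."""
    return sum(1 for i in range(len(dna) - len(site) + 1)
               if dna[i:i + len(site)] == site)


def add_cloning_sites(orf: str) -> str:
    """Flank an ORF with Nco I (5') and Xho I (3') sites after checking
    that the ORF contains neither site internally."""
    orf = orf.upper()
    for name, site in (("Nco I", NCO_I), ("Xho I", XHO_I)):
        if count_sites(orf, site):
            raise ValueError(f"ORF contains an internal {name} site")
    construct = NCO_I + orf + XHO_I
    # Each site must occur exactly once in the final fragment.
    assert count_sites(construct, NCO_I) == 1
    assert count_sites(construct, XHO_I) == 1
    return construct


placeholder_orf = "ATGAAACTGACCCTGAAATTTATTTGCACCACCTAA"  # hypothetical example
print(add_cloning_sites(placeholder_orf))
```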
Fmoc modification
Polypeptides are used increasingly in biomedicine, and amino acids are the basic raw materials of polypeptide synthesis. All amino acids contain an alpha-amino group and a carboxyl group, and some also contain reactive side-chain groups such as hydroxyl, amino, guanidyl and heterocyclic groups. The amino group and the reactive side-chain groups therefore need to be protected during the peptide coupling reaction, and the protecting groups are removed after the polypeptide has been synthesized; otherwise, amino acid misconnection and numerous side reactions can occur.
Fmoc is a base-sensitive protecting group. It can be removed with concentrated aqueous ammonia, with dioxane-methanol-4N NaOH (30:9:1), or with a 50% dichloromethane solution of reagents such as piperidine, ethanolamine, cyclohexylamine, 1,4-dioxane, and pyrrolidone.
Fmoc-protecting groups are generally introduced by Fmoc-Cl or Fmoc-OSu under weakly basic conditions such as sodium carbonate or sodium bicarbonate. Fmoc-OSu allows easier control of reaction conditions and fewer side reactions than Fmoc-Cl.
Fmoc absorbs strongly in the ultraviolet, with absorption maxima at 267 nm (ε 18950), 290 nm (ε 5280) and 301 nm (ε 6200), so it can be detected by ultraviolet absorption, which is a considerable convenience for automated, instrument-based polypeptide synthesis. The Fmoc method is also compatible with a wide range of solvents and reagents, has high mechanical stability, and can be used with a variety of carriers and activation modes. The Fmoc protecting group is therefore the one most commonly used in polypeptide synthesis today.
Fmoc-OSu (N-(9-fluorenylmethoxycarbonyloxy)succinimide)
Side chain of somaglutide
tBuO-Ste-Glu(AEEA-AEEA-OSu)-OtBu is the side chain of somaglutide.
In the preparation method, a somaglutide precursor carrying Boc-protected lysine at position 17 or 18 is first obtained by gene recombination technology, and the somaglutide side chain tBuO-Ste-Glu(AEEA-AEEA-OSu)-OtBu is then attached to give somaglutide.
Preparation of somaglutide
The invention provides two synthetic routes to somaglutide, as follows. The Boc-somaglutide precursor (compound 1) is modified with an Fmoc compound to give compound 2. Boc deprotection of compound 2 gives compound 3, which is reacted with the activated somaglutide side chain tBuO-Ste-Glu(AEEA-AEEA-OSu)-OtBu to give compound 4. Fmoc removal then gives compound 5, and removal of the tBu protecting groups from the side chain finally gives somaglutide (compound 6).
Specifically, the present invention provides a method for preparing somaglutide, comprising the steps of:
(i) providing a Boc-modified somaglutide precursor;
(ii) modifying the Boc-modified somaglutide precursor with an Fmoc compound to prepare an Fmoc- and Boc-modified somaglutide backbone;
(iii) removing Boc from the Fmoc- and Boc-modified somaglutide backbone and reacting the backbone with the somaglutide side chain to prepare Fmoc-modified somaglutide; and
(iv) removing Fmoc and the side-chain tBu groups from the Fmoc-modified somaglutide to prepare somaglutide.
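Steps (i) to (iv) rely on Boc and tBu being acid-labile while Fmoc is base-labile. The following minimal Python sketch tracks which protecting groups are present after each step. The function names and the set-based model are illustrative assumptions, not part of the patent; the reagent assignments (acid for Boc and tBu, base for Fmoc) follow the examples described later in the text.

```python
# Illustrative model only: protecting groups are tracked as a set of labels.
ACID_LABILE = {"Boc", "tBu"}   # removed by acid (TFA) treatment
BASE_LABILE = {"Fmoc"}         # removed by base (piperidine) treatment


def treat(groups, labile):
    """Return the protecting groups that survive a deprotection step."""
    return frozenset(g for g in groups if g not in labile)


def run_route():
    """Walk through steps (i)-(iv) and record the groups present after each."""
    states = []
    groups = frozenset({"Boc"})              # (i) Boc-modified precursor
    states.append(("i", groups))
    groups = groups | {"Fmoc"}               # (ii) Fmoc compound attached at the N-terminus
    states.append(("ii", groups))
    groups = treat(groups, ACID_LABILE)      # (iii) acid removes Boc, Fmoc stays
    groups = groups | {"tBu"}                # ... then the tBu-protected side chain is coupled
    states.append(("iii", groups))
    groups = treat(groups, BASE_LABILE)      # (iv) base removes Fmoc
    groups = treat(groups, ACID_LABILE)      # ... and acid removes tBu
    states.append(("iv", groups))
    return states


for step, groups in run_route():
    print(step, sorted(groups))
```

The point of the model is the ordering: because Fmoc survives the acid step, the N-terminal amine stays blocked while the side chain is coupled to the freed lysine amine.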
The main advantages of the invention include:
(1) the invention produces the Boc-modified somaglutide precursor directly by biosynthesis, without dilution, ultrafiltration buffer exchange or similar methods to remove excess inorganic salts from the fermentation supernatant. In the method of the invention, the Boc-somaglutide precursor is separated on a chromatographic column; the single-step yield exceeds 70%, which is 3 times higher than that of the conventional method, and the yield of the Boc-somaglutide precursor is about 800-1000 mg/L. The method also removes most pigments, shortens the original multi-step process, and reduces process time and equipment investment;
(2) because the lysine at position 20 is Boc-protected, the invention can synthesize somaglutide directly through a reaction orthogonal to Fmoc protection;
(3) the somaglutide synthesized by the method of the invention contains no N-terminal fatty-acid-acylated impurities, which facilitates downstream purification and reduces cost;
(4) compared with solid-phase synthesis, the method of the invention produces no racemized impurity polypeptides, does not require large amounts of modified amino acids or organic reagents, causes little environmental pollution, and costs less;
(5) the somaglutide backbone makes up a high proportion of the fusion protein of the invention (increased fusion ratio); the green fluorescent protein portion of the fusion protein contains arginine and lysine and can be digested into small fragments by protease, and these fragments differ greatly in molecular weight from the target protein and are easy to separate.
The invention will be further illustrated with reference to the following specific examples. It should be understood that these examples are for illustrative purposes only and are not intended to limit the scope of the present invention. The experimental procedures, in which specific conditions are not noted in the following examples, are generally carried out under conventional conditions or conditions recommended by the manufacturers. Unless otherwise indicated, percentages and parts are by weight.
Example 1 Construction of a somaglutide expression strain
Construction of the somaglutide expression plasmid is described in the examples in patent application No. 201910210102.9. The DNA fragment of fusion protein FP1-TEV-EK-GLP-1(18) or FP2-TEV-EK-GLP-1(17) was cloned into the NcoI-XhoI site downstream of the araBAD promoter of the expression vector plasmid pBAD/His A (available from NTCC, kanamycin resistance) to give plasmid pBAD-FP1-TEV-EK-GLP-1(18) or pBAD-FP2-TEV-EK-GLP-1 (17). The plasmid maps are shown in FIGS. 1 and 2.
Fusion protein 1 and fusion protein 2 were constructed on the basis of the somaglutide precursors with 2-5 amino acids deleted from the N-terminus, shown in SEQ ID NO: 1 and SEQ ID NO: 2.
The amino acid sequence of the fusion protein 1 is shown as SEQ ID NO: 4:
MVSKGEELFTGVKLTLKFICTTYVQERTISFKDTYKTRAEVKFEGDENLYFQGDDDDKEGTFTSDVSSYLEGQAAKEFIAWLVRGRG
The amino acid sequence of the fusion protein 2 is shown as SEQ ID NO: 5:
MVSKGEELFTGVYVQERTISFKDTYKTRAEVKFEGDTLVNRIELKGIDFENLYFQGDDDDKGTFTSDVSSYLEGQAAKEFIAWLVRGRG
Wherein the leader peptide sequence is MVSKGEELFTGV (SEQ ID NO:7)
The sequence of the green fluorescent protein folding unit (FP) is
FP1:KLTLKFICTTYVQERTISFKDTYKTRAEVKFEGD(SEQ ID NO:6,U3-U4-U5)
FP2:YVQERTISFKDTYKTRAEVKFEGDTLVNRIELKGIDF(SEQ ID NO:10,U4-U5-U6)
The TEV enzyme cleavage site is ENLYFQG (SEQ ID NO: 8);
the enzyme cutting site of the enterokinase is DDDDK (SEQ ID NO:9)
The somaglutide precursor with 2-5 amino acids deleted from the N-terminal is shown in SEQ ID NO. 1 or SEQ ID NO. 2.
SEQ ID NO:1:EGTFTSDVSSYLEGQAAKEFIAWLVRGRG
SEQ ID NO:2:GTFTSDVSSYLEGQAAKEFIAWLVRGRG (K is Boc modified lysine)
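The modular layout leader-FP-TEV-EK-precursor can be checked by simple string concatenation. The Python sketch below rebuilds both fusion proteins from the component sequences quoted above and compares them with SEQ ID NO: 4 and SEQ ID NO: 5; the helper names (`assemble`, `lysine_position`, `precursor_fraction`) are hypothetical and introduced only for illustration.

```python
# Component sequences as quoted in the text above.
LEADER = "MVSKGEELFTGV"                          # SEQ ID NO:7
FP1 = "KLTLKFICTTYVQERTISFKDTYKTRAEVKFEGD"       # SEQ ID NO:6 (U3-U4-U5)
FP2 = "YVQERTISFKDTYKTRAEVKFEGDTLVNRIELKGIDF"    # SEQ ID NO:10 (U4-U5-U6)
TEV_SITE = "ENLYFQG"                             # SEQ ID NO:8
EK_SITE = "DDDDK"                                # SEQ ID NO:9
PRECURSOR_1 = "EGTFTSDVSSYLEGQAAKEFIAWLVRGRG"    # SEQ ID NO:1
PRECURSOR_2 = "GTFTSDVSSYLEGQAAKEFIAWLVRGRG"     # SEQ ID NO:2

# Full fusion protein sequences as quoted in the text above.
FUSION_1 = ("MVSKGEELFTGVKLTLKFICTTYVQERTISFKDTYKTRAEVKFEGDENLYFQG"
            "DDDDKEGTFTSDVSSYLEGQAAKEFIAWLVRGRG")                    # SEQ ID NO:4
FUSION_2 = ("MVSKGEELFTGVYVQERTISFKDTYKTRAEVKFEGDTLVNRIELKGIDFENLYFQG"
            "DDDDKGTFTSDVSSYLEGQAAKEFIAWLVRGRG")                     # SEQ ID NO:5


def assemble(fp, precursor):
    """Concatenate the modules in the order leader-FP-TEV-EK-precursor."""
    return LEADER + fp + TEV_SITE + EK_SITE + precursor


def lysine_position(precursor):
    """1-based position of the first lysine (the Boc-modified residue)."""
    return precursor.index("K") + 1


def precursor_fraction(fusion, precursor):
    """Fraction of the fusion protein residues that belong to the precursor."""
    return len(precursor) / len(fusion)


assert assemble(FP1, PRECURSOR_1) == FUSION_1
assert assemble(FP2, PRECURSOR_2) == FUSION_2
print(lysine_position(PRECURSOR_1), lysine_position(PRECURSOR_2))
print(round(precursor_fraction(FUSION_1, PRECURSOR_1), 3),
      round(precursor_fraction(FUSION_2, PRECURSOR_2), 3))
```

The lysine positions explain the "(18)" and "(17)" suffixes in the construct names, and the residue fraction gives a rough measure of how much of each fusion protein is target peptide.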
The DNA sequence of pylRs was then cloned into the SpeI-SalI site downstream of the araBAD promoter of the expression vector plasmid pEvol-pBpF (available from NTCC, chloramphenicol resistance), and the DNA sequence of the tRNA (pylTcua) of lysyl-tRNA synthetase was inserted by PCR downstream of the proK promoter. This plasmid was designated pEvol-pylRs-pylT. The plasmid map is shown in FIG. 3.
The constructed plasmids pBAD-FP1-TEV-EK-GLP-1(18) and pEvol-pylRs-pylT were co-transformed into the Escherichia coli TOP10 strain, and a recombinant strain expressing the somaglutide precursor fusion protein FP1-TEV-EK-GLP-1(18) was obtained by screening.
The constructed plasmids pBAD-FP2-TEV-EK-GLP-1(17) and pEvol-pylRs-pylT were co-transformed into the Escherichia coli TOP10 strain, and a recombinant strain expressing the somaglutide precursor fusion protein FP2-TEV-EK-GLP-1(17) was obtained by screening.
Example 2 Expression of Boc-somaglutide precursor
The seed cultures of the two recombinant Escherichia coli strains were each inoculated into fermentation medium at an inoculum of 5% and cultured in batch mode at 37 °C and pH 7.0. When the pH rose to 7.05, the carbon and nitrogen sources were fed separately according to the constant-pH method. After feeding, 7.5 M ammonia water was added automatically to hold the pH at 7.0-7.2. After 4-6 h of culture, L-arabinose was added for induction, and induction was continued for 14 ± 2 h. Two fermentation broths containing the somaglutide precursor fusion protein were obtained.
Example 3 Preparation of Boc-somaglutide precursor inclusion bodies
The two fermentation broths obtained in Example 2 were centrifuged, and the wet cells were mixed with cell-disruption buffer at a ratio of 1:1 and suspended for 3 h. The cells were then disrupted with a high-pressure homogenizer, and the inclusion bodies were collected by centrifugation, washed with buffer, and weighed. The inclusion body yields of the two fusion proteins were 39-43 g/L and 41-45 g/L, respectively.
The results of SDS-PAGE are shown in FIG. 4.
Example 4 Denaturation and enzymatic cleavage of Boc-somaglutide precursor inclusion bodies
8 mol/L urea dissolution buffer was added to the inclusion bodies obtained in Example 3 at a weight/volume ratio of 1:15, and the mixture was stirred at room temperature until dissolved. The protein concentration was measured by the Bradford method, the total protein concentration of the inclusion body solution was adjusted to about 20 mg/mL, and the pH was adjusted to 9.0 ± 1.0 with NaOH. The inclusion body solution was added dropwise to renaturation buffer to dilute it 5- to 10-fold; the fusion protein renaturation solution was held at pH 9.0-10.0 and 4-8 °C and renatured for 10-20 h.
The results showed that the proportions of fusion protein 1 and fusion protein 2 after solubilization were about 30% and 33%, respectively.
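The dilution step in this example fixes the concentrations in the renaturation solution. The short Python sketch below works through the arithmetic for the 5- to 10-fold dilution and for the 1:15 weight/volume dissolution ratio. It assumes, purely for illustration, that the renaturation buffer itself contains no urea and no protein (the text does not give its composition) and that 1:15 means 15 mL of buffer per gram of inclusion bodies; the function names are hypothetical.

```python
def diluted(value, fold):
    """Concentration after an n-fold dilution into a buffer lacking the solute."""
    return value / fold


def dissolution_buffer_ml(inclusion_body_g, ratio=15.0):
    """Buffer volume for a 1:ratio weight/volume dissolution (assumed g : mL)."""
    return inclusion_body_g * ratio


TOTAL_PROTEIN_MG_PER_ML = 20.0   # target after dissolution, from the text
UREA_MOL_PER_L = 8.0             # dissolution buffer, from the text

for fold in (5, 10):
    protein = diluted(TOTAL_PROTEIN_MG_PER_ML, fold)
    urea = diluted(UREA_MOL_PER_L, fold)
    print(f"{fold}-fold: {protein:.1f} mg/mL protein, {urea:.1f} mol/L urea")

print(dissolution_buffer_ml(40.0), "mL buffer for 40 g of inclusion bodies")
```

Under these assumptions the renaturation solution holds roughly 2-4 mg/mL total protein, which is the quantity that matters when scaling the renaturation vessel.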
Example 5 Primary purification of Boc-somaglutide precursor fusion protein
The fusion protein renaturation solution obtained in Example 4 was filtered through a 0.45 μm membrane to remove undissolved material, and the fusion protein was then preliminarily purified on an anion exchange column on the basis of differences in protein isoelectric point.
The results showed that after anion exchange chromatography the purity of both Boc-somaglutide precursor fusion protein 1 and fusion protein 2 exceeded 65%, the loading capacity was about 18 mg/mL, and the yield exceeded 80%.
Example 6 enzymatic cleavage of Boc-somaglutide precursor fusion protein
The Boc-somaglutide precursor fusion protein preliminarily purified in Example 5 was desalted, the pH was adjusted to 7.5-8.5 and the temperature held at 18-25 °C, and enterokinase was added for 8-24 h of digestion to give the Boc-somaglutide precursor. Boc-somaglutide precursor 1 and precursor 2 were obtained at about 0.9 g/L and 1.2 g/L, respectively, with a digestion efficiency of at least 95%.
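Enterokinase cuts immediately after its DDDDK recognition sequence, which is why the precursor sits directly downstream of the EK site in the fusion protein. The Python sketch below is a simplified in-silico digestion that cuts after every exact DDDDK match; real enzyme specificity is more nuanced, and the function name is a hypothetical helper rather than an established tool.

```python
EK_SITE = "DDDDK"  # enterokinase recognition sequence (SEQ ID NO:9)

# Fusion protein 1 (SEQ ID NO:4) and its expected product (SEQ ID NO:1).
FUSION_1 = ("MVSKGEELFTGVKLTLKFICTTYVQERTISFKDTYKTRAEVKFEGDENLYFQG"
            "DDDDKEGTFTSDVSSYLEGQAAKEFIAWLVRGRG")
PRECURSOR_1 = "EGTFTSDVSSYLEGQAAKEFIAWLVRGRG"


def enterokinase_cleave(sequence, site=EK_SITE):
    """Split a sequence after each exact occurrence of the recognition site."""
    fragments = []
    start = 0
    pos = sequence.find(site, start)
    while pos != -1:
        cut = pos + len(site)            # cleavage is C-terminal to the site
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, start)
    fragments.append(sequence[start:])   # remainder after the last cut
    return fragments


fragments = enterokinase_cleave(FUSION_1)
assert fragments[-1] == PRECURSOR_1
print([len(f) for f in fragments])
```

The last fragment is the released precursor; the first fragment is the leader-FP-TEV-EK tag, which is removed in the later chromatography step.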
Example 7 Reverse phase chromatography of Boc-somaglutide precursor
The Boc-somaglutide precursor was purified by reverse phase chromatography, which exploits the difference in hydrophobicity between the polypeptide and proteins, to remove most of the foreign protein.
Dilute hydrochloric acid was added to the digestion solutions of Boc-somaglutide precursor 1 and precursor 2 obtained in Example 6 to adjust the pH to 2.0-3.0. The solutions were clarified by filtration through a 0.45 μm membrane, and an appropriate amount of acetonitrile was added before separation and purification by reverse phase chromatography.
An aqueous solution containing trifluoroacetic acid was used as mobile phase A, and an acetonitrile solution containing trifluoroacetic acid as mobile phase B. The Boc-somaglutide precursor was bound to the packing at a loading of no more than 10 mg/mL, eluted with a gradient, and collected. The results showed that Boc-somaglutide precursor 1 and Boc-somaglutide precursor 2 collected by reverse phase chromatography had a purity of at least 90% and a yield above 80%. The HPLC profile of the purified Boc-somaglutide precursor is shown in FIG. 5.
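The step figures reported in Examples 5 to 7 can be combined into a rough lower bound on recovery across the three downstream steps. The Python sketch below multiplies the reported minimum values; treating the enterokinase digestion efficiency as a step yield, and treating the steps as independent and multiplicative, are assumptions made here for illustration and are not stated in the text.

```python
import math

# Reported lower bounds, taken from Examples 5, 6 and 7.
STEP_LOWER_BOUNDS = {
    "anion exchange yield (Example 5)": 0.80,
    "enterokinase digestion efficiency (Example 6)": 0.95,
    "reverse phase yield (Example 7)": 0.80,
}


def cumulative_fraction(step_fractions):
    """Overall fraction recovered if each step acts independently."""
    return math.prod(step_fractions)


overall = cumulative_fraction(STEP_LOWER_BOUNDS.values())
print(f"Estimated lower bound on cumulative recovery: {overall:.1%}")
```

Because each input is itself a lower bound, the product is a conservative estimate rather than a measured overall yield.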
Example 8 Preparation of somaglutide using Boc-somaglutide precursor 1 (Fmoc-H-Aib, route 1)
Boc-somaglutide precursor 1 (compound 1) obtained in Example 7 was taken (a 30 mg feed is used as the example), Fmoc-H-Aib, DIPEA and DMF were added in the molar ratios given in Table 1, and the mixture was reacted for 8-12 h to prepare the Fmoc- and Boc-protected somaglutide backbone. A mixed solution of methyl tert-butyl ether and petroleum ether was added to the reaction solution, and the precipitate was collected by centrifugation and washed 2-3 times with methyl tert-butyl ether to give the Fmoc-protected compound 2: Fmoc-GLP-1(Lys20Boc).
TABLE 1 Molar ratio of the feeds
Reagent | Boc-somaglutide precursor | Fmoc-H-Aib | DIPEA | DMF |
---|---|---|---|---|
Equivalents or volume | 1.0 eq | 2.5 eq | 12 eq | 1 V |
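The equivalents in Table 1 can be turned into molar amounts for the 30 mg feed used as the example. The Python sketch below estimates the molecular weight of Boc-somaglutide precursor 1 from standard average residue masses and then scales the equivalents. The residue mass table, the Boc mass increment (C5H8O2, about 100.12 Da) and the function names are assumptions supplied here for illustration; the patent does not state a molecular weight, so the numbers are estimates only.

```python
# Standard average residue masses in Da (assumed reference values).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.015          # one water per peptide chain (free termini)
BOC_INCREMENT = 100.117  # C5H8O2 added when Boc replaces one amine hydrogen

PRECURSOR_1 = "EGTFTSDVSSYLEGQAAKEFIAWLVRGRG"   # SEQ ID NO:1
TABLE_1_EQUIVALENTS = {"Boc-somaglutide precursor": 1.0,
                       "Fmoc-H-Aib": 2.5,
                       "DIPEA": 12.0}


def peptide_mass(sequence, n_boc=0):
    """Estimated average molecular weight (g/mol) of a linear peptide."""
    residues = sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence)
    return residues + WATER + n_boc * BOC_INCREMENT


def feed_micromoles(precursor_mg, precursor_mw, equivalents):
    """Micromoles of each reagent for a given precursor mass."""
    base_umol = precursor_mg / precursor_mw * 1000.0  # mg/(g/mol) = mmol
    return {name: eq * base_umol for name, eq in equivalents.items()}


mw = peptide_mass(PRECURSOR_1, n_boc=1)
feed = feed_micromoles(30.0, mw, TABLE_1_EQUIVALENTS)
print(f"Estimated MW: {mw:.1f} g/mol")
for name, umol in feed.items():
    print(f"{name}: {umol:.1f} µmol")
```

DMF is given in Table 1 as a volume (1 V) rather than in equivalents, so it is left out of the molar calculation.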
TFA solution was added to the crude compound 2, and the mixture was stirred at low temperature for 0.5-2.0 h. A 15- to 20-fold volume of a mixed solution of methyl tert-butyl ether and petroleum ether was added to the reaction solution, and the precipitate was collected by centrifugation and washed 2-3 times with the mixed solution to give the Boc-free solid compound 3: Fmoc-GLP-1(Lys20NH2).
DMF and 12 eq of DIPEA were added to the Boc-free compound 3, and the mixture was stirred gently at room temperature for 5 min. 2.5 eq of tBuO-Ste-Glu(AEEA-AEEA-OSu)-OtBu dissolved in DMF was added to the mixture, and the reaction mixture was shaken gently at room temperature for 2-3 h. A 15- to 20-fold volume (relative to the reaction system) of a mixed solution of methyl tert-butyl ether and petroleum ether was added, and the precipitate was collected by centrifugation, washed 2-3 times with the mixed solution, and dried under vacuum to give compound 4: Fmoc-GLP-1-(tBuO-Ste-Glu(AEEA-AEEA)-OtBu)(20).
A DMF solution containing 20% piperidine was added to compound 4, and the mixture was reacted at room temperature for 0.5-2.0 h. A mixed solvent of methyl tert-butyl ether and petroleum ether was then added to the reaction system, the precipitate was collected by centrifugation, and the solid was washed 3-5 times with the mixed solvent to give the Fmoc-free compound 5: GLP-1-(tBuO-Ste-Glu(AEEA-AEEA)-OtBu)(20).
A mixed solution of TFA (trifluoroacetic acid), TIS (triisopropylsilane) and DCM (dichloromethane) was added to compound 5, and the mixture was shaken at room temperature for 2-4 h to remove the side-chain tBu protecting groups. A 10- to 20-fold volume of a mixed solvent of methyl tert-butyl ether and petroleum ether was added to the reaction system, the precipitate was collected by centrifugation, and the solid was washed 3 times with the mixed solvent to give the final product. After HPLC purification, the somaglutide obtained had a purity above 98%.
Example 9 Preparation of somaglutide using Boc-somaglutide precursor 2 (Fmoc-H-Aib-E, route 2)
Boc-somaglutide precursor 2 (compound 7) obtained in Example 7 was taken (a 30 mg feed is used as the example), Fmoc-H-Aib-E, DIPEA and DMF were added in the molar ratios given in Table 2, and the mixture was reacted for 8-12 h to prepare the Fmoc- and Boc-protected somaglutide backbone. A mixed solution of methyl tert-butyl ether and petroleum ether was added to the reaction solution, and the precipitate was collected by centrifugation and washed 2-3 times with methyl tert-butyl ether to give the Fmoc-protected compound 2: Fmoc-GLP-1(Lys20Boc).
TABLE 2 molar ratio of feeds
TFA solution was added to the crude compound 2, and the mixture was stirred at low temperature for 0.5-2.0 h. A 15- to 20-fold volume (relative to the reaction system) of a mixed solution of methyl tert-butyl ether and petroleum ether was added to the reaction solution, and the precipitate was collected by centrifugation and washed 2-3 times with the mixed solution to give the Boc-free solid compound 3: Fmoc-GLP-1(Lys20NH2).
DMF and 12 eq of DIPEA were added to the Boc-free compound 3, and the mixture was stirred gently at room temperature for 5 min. 2.5 eq of tBuO-Ste-Glu(AEEA-AEEA-OSu)-OtBu dissolved in DMF was added to the mixture, and the reaction mixture was shaken gently at room temperature for 2-3 h. A 15- to 20-fold volume (relative to the reaction system) of a mixed solution of methyl tert-butyl ether and petroleum ether was added, and the precipitate was collected by centrifugation, washed 2-3 times with the mixed solution, and dried under vacuum to give compound 4: Fmoc-GLP-1-(tBuO-Ste-Glu(AEEA-AEEA)-OtBu)(20).
A DMF solution containing 20% piperidine was added to compound 4, and the mixture was reacted at room temperature for 0.5-2.0 h. A mixed solvent of methyl tert-butyl ether and petroleum ether was then added to the reaction system, the precipitate was collected by centrifugation, and the solid was washed 3-5 times with the mixed solvent to give the Fmoc-free compound 5: GLP-1-(tBuO-Ste-Glu(AEEA-AEEA)-OtBu)(20).
A mixed solution of TFA (trifluoroacetic acid), TIS (triisopropylsilane) and DCM (dichloromethane) was added to compound 5, and the mixture was shaken at room temperature for 2-4 h to remove the side-chain tBu protecting groups. A 10- to 20-fold volume of a mixed solvent of methyl tert-butyl ether and petroleum ether was added to the reaction system, the precipitate was collected by centrifugation, and the solid was washed 3 times with the mixed solvent to give the final product. After HPLC purification, the somaglutide obtained had a purity above 98%.
Comparative example
The fusion protein expression strain was constructed and expressed in a manner similar to Examples 1 to 3, except that the amino acid sequence of the fusion protein expressed was as shown in SEQ ID NO: 22:
MKKLLFAIPLVVPFYSHSTMELEICSWYHMGIRSFLEQKLISEEDLNSAVDDDDDKEGTFTSDVSSYLEGQAAKEFIAWLVRGRG (SEQ ID NO:22)
The above fusion protein contains the gIII signal peptide. The results showed an inclusion body yield of 30 g (wet weight).
The results show that the expression level of the fusion protein of the invention is significantly higher than that of a fusion protein with the conventional structure.
All documents referred to herein are incorporated by reference into this application as if each were individually incorporated by reference. Furthermore, it should be understood that various changes and modifications of the present invention can be made by those skilled in the art after reading the above teachings of the present invention, and these equivalents also fall within the scope of the present invention as defined by the appended claims.
Sequence listing
<110> Ningbo spread Biotechnology Ltd
<120> preparation method of somaglutide
<130> P2020-0743
<160> 22
<170> SIPOSequenceListing 1.0
<210> 1
<211> 29
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 1
Glu Gly Thr Phe Thr Ser Asp Val Ser Ser Tyr Leu Glu Gly Gln Ala
1 5 10 15
Ala Lys Glu Phe Ile Ala Trp Leu Val Arg Gly Arg Gly
20 25
<210> 2
<211> 28
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 2
Gly Thr Phe Thr Ser Asp Val Ser Ser Tyr Leu Glu Gly Gln Ala Ala
1 5 10 15
Lys Glu Phe Ile Ala Trp Leu Val Arg Gly Arg Gly
20 25
<210> 3
<211> 31
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 3
His Xaa Glu Gly Thr Phe Thr Ser Asp Val Ser Ser Tyr Leu Glu Gly
1 5 10 15
Gln Ala Ala Lys Glu Phe Ile Ala Trp Leu Val Arg Gly Arg Gly
20 25 30
<210> 4
<211> 87
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 4
Met Val Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Lys Leu Thr Leu
1 5 10 15
Lys Phe Ile Cys Thr Thr Tyr Val Gln Glu Arg Thr Ile Ser Phe Lys
20 25 30
Asp Thr Tyr Lys Thr Arg Ala Glu Val Lys Phe Glu Gly Asp Glu Asn
35 40 45
Leu Tyr Phe Gln Gly Asp Asp Asp Asp Lys Glu Gly Thr Phe Thr Ser
50 55 60
Asp Val Ser Ser Tyr Leu Glu Gly Gln Ala Ala Lys Glu Phe Ile Ala
65 70 75 80
Trp Leu Val Arg Gly Arg Gly
85
<210> 5
<211> 89
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 5
Met Val Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Tyr Val Gln Glu
1 5 10 15
Arg Thr Ile Ser Phe Lys Asp Thr Tyr Lys Thr Arg Ala Glu Val Lys
20 25 30
Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile Asp
35 40 45
Phe Glu Asn Leu Tyr Phe Gln Gly Asp Asp Asp Asp Lys Gly Thr Phe
50 55 60
Thr Ser Asp Val Ser Ser Tyr Leu Glu Gly Gln Ala Ala Lys Glu Phe
65 70 75 80
Ile Ala Trp Leu Val Arg Gly Arg Gly
85
<210> 6
<211> 34
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 6
Lys Leu Thr Leu Lys Phe Ile Cys Thr Thr Tyr Val Gln Glu Arg Thr
1 5 10 15
Ile Ser Phe Lys Asp Thr Tyr Lys Thr Arg Ala Glu Val Lys Phe Glu
20 25 30
Gly Asp
<210> 7
<211> 12
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 7
Met Val Ser Lys Gly Glu Glu Leu Phe Thr Gly Val
1 5 10
<210> 8
<211> 7
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 8
Glu Asn Leu Tyr Phe Gln Gly
1 5
<210> 9
<211> 5
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 9
Asp Asp Asp Asp Lys
1 5
<210> 10
<211> 37
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 10
Tyr Val Gln Glu Arg Thr Ile Ser Phe Lys Asp Thr Tyr Lys Thr Arg
1 5 10 15
Ala Glu Val Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu
20 25 30
Lys Gly Ile Asp Phe
35
<210> 11
<211> 13
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 11
Val Pro Ile Leu Val Glu Leu Asp Gly Asp Val Asn Gly
1 5 10
<210> 12
<211> 14
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 12
His Lys Phe Ser Val Arg Gly Glu Gly Glu Gly Asp Ala Thr
1 5 10
<210> 13
<211> 10
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 13
Lys Leu Thr Leu Lys Phe Ile Cys Thr Thr
1 5 10
<210> 14
<211> 11
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 14
Tyr Val Gln Glu Arg Thr Ile Ser Phe Lys Asp
1 5 10
<210> 15
<211> 13
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 15
Thr Tyr Lys Thr Arg Ala Glu Val Lys Phe Glu Gly Asp
1 5 10
<210> 16
<211> 13
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 16
Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile Asp Phe
1 5 10
<210> 17
<211> 10
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 17
His Asn Val Tyr Ile Thr Ala Asp Lys Gln
1 5 10
<210> 18
<211> 14
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 18
Gly Ile Lys Ala Asn Phe Lys Ile Arg His Asn Val Glu Asp
1 5 10
<210> 19
<211> 14
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 19
Val Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly
1 5 10
<210> 20
<211> 12
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 20
His Tyr Leu Ser Thr Gln Ser Val Leu Ser Lys Asp
1 5 10
<210> 21
<211> 13
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 21
His Met Val Leu Leu Glu Phe Val Thr Ala Ala Gly Ile
1 5 10
<210> 22
<211> 85
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 22
Met Lys Lys Leu Leu Phe Ala Ile Pro Leu Val Val Pro Phe Tyr Ser
1 5 10 15
His Ser Thr Met Glu Leu Glu Ile Cys Ser Trp Tyr His Met Gly Ile
20 25 30
Arg Ser Phe Leu Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu Asn Ser
35 40 45
Ala Val Asp Asp Asp Asp Asp Lys Glu Gly Thr Phe Thr Ser Asp Val
50 55 60
Ser Ser Tyr Leu Glu Gly Gln Ala Ala Lys Glu Phe Ile Ala Trp Leu
65 70 75 80
Val Arg Gly Arg Gly
85
Claims (10)
1. A method of preparing somaglutide, comprising the steps of:
(A) fermenting recombinant bacteria to prepare a somaglutide precursor fusion protein; and
(B) preparing somaglutide from the somaglutide precursor fusion protein,
wherein the somaglutide precursor fusion protein has the structure shown in formula I from the N-terminus to the C-terminus:
A-FP-TEV-EK-G (I)
in formula (I),
"-" represents a peptide bond;
A is absent or a leader peptide sequence;
FP is a green fluorescent protein folding unit;
TEV is a first enzyme cleavage site, preferably a TEV protease cleavage site (sequence ENLYFQG, SEQ ID NO: 8);
EK is a second enzyme cleavage site, preferably an enterokinase cleavage site (sequence DDDDK, SEQ ID NO: 9);
G is a somaglutide precursor or a fragment thereof;
wherein the green fluorescent protein folding unit comprises 2-6 β-sheet units selected from the group consisting of:
2. The method of claim 1, wherein said step (B) further comprises the steps of:
(i) subjecting the somaglutide precursor fusion protein to enzymatic cleavage to obtain a Boc-modified somaglutide precursor;
(ii) attaching an Fmoc complex to the N-terminus of the Boc-modified somaglutide precursor to prepare an Fmoc- and Boc-modified somaglutide backbone,
wherein the Fmoc complex comprises X amino acids from the N-terminus of the somaglutide backbone, and the N-terminal amino acid of the Fmoc complex is Fmoc-modified;
(iii) removing Boc from the Fmoc- and Boc-modified somaglutide backbone and reacting the backbone with a somaglutide side chain to prepare Fmoc-modified somaglutide;
(iv) removing Fmoc from the Fmoc-modified somaglutide to obtain Fmoc-free somaglutide; and
(v) removing the side-chain tBu groups from the Fmoc-free somaglutide to prepare somaglutide.
3. The method of claim 2, wherein in step (i), the enzymatic cleavage is performed using enterokinase.
4. The method of claim 2, wherein the Boc-modified somaglutide precursor comprises:
a first somaglutide precursor that is Boc-modified at position 18 and has the amino acid sequence shown in SEQ ID NO: 1;
or a second somaglutide precursor that is Boc-modified at position 17 and has the amino acid sequence shown in SEQ ID NO: 2.
5. The method of claim 2, wherein said Fmoc complex is Fmoc-H-Aib or Fmoc-H-Aib-E.
8. The method of claim 2, wherein step (iii) further comprises the steps of:
(a) adding TFA solution and stirring at low temperature to remove Boc and obtain a Boc-free product;
(b) adding an organic solvent to the reaction solution of step (a) to precipitate the Boc-free product as a solid, the organic solvent preferably being a mixture of methyl tert-butyl ether and petroleum ether; and
(c) mixing the Boc-free product with the somaglutide side chain to prepare the Fmoc-modified somaglutide.
9. The method of claim 1, wherein the green fluorescent protein folding unit is u2-u3, u4-u5, u1-u2-u3, u3-u4-u5, or u4-u5-u6.
10. A somaglutide formulation prepared using the method of claim 1.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010530625.4A CN113801233B (en) | 2020-06-11 | 2020-06-11 | Preparation method of somalupeptide |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113801233A true CN113801233A (en) | 2021-12-17 |
CN113801233B CN113801233B (en) | 2024-03-01 |
Family
ID=78891990
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010530625.4A Active CN113801233B (en) | 2020-06-11 | 2020-06-11 | Preparation method of somalupeptide |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113801233B (en) |
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104619726A (en) * | 2012-03-23 | 2015-05-13 | 苏州鲲鹏生物技术有限公司 | Fusion proteins of superfolder green fluorescent protein and use thereof |
WO2018032843A1 (en) * | 2016-08-19 | 2018-02-22 | 深圳市健元医药科技有限公司 | Method for synthesizing semaglutide |
CN109311961A (en) * | 2016-08-19 | 2019-02-05 | 深圳市健元医药科技有限公司 | A kind of synthetic method of Suo Malu peptide |
WO2019120639A1 (en) * | 2017-12-21 | 2019-06-27 | Bachem Holding Ag | Solid phase synthesis of acylated peptides |
CN110294800A (en) * | 2018-03-22 | 2019-10-01 | 齐鲁制药有限公司 | A kind of preparation method of Suo Malu peptide |
CN110498849A (en) * | 2019-09-16 | 2019-11-26 | 南京迪维奥医药科技有限公司 | A kind of main peptide chain of Suo Malu peptide and preparation method thereof |
CN111153983A (en) * | 2020-03-11 | 2020-05-15 | 江南大学 | Semisynthesis preparation method of somaglutide |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116970062A (en) * | 2022-04-29 | 2023-10-31 | 南京知和医药科技有限公司 | Ultra-long acting GLP-1 polypeptide derivative and preparation method and application thereof |
CN116970062B (en) * | 2022-04-29 | 2024-04-09 | 南京知和医药科技有限公司 | Ultra-long acting GLP-1 polypeptide derivative and preparation method and application thereof |
CN115322250A (en) * | 2022-06-16 | 2022-11-11 | 南京汉欣医药科技有限公司 | Synthesis method of semaglutide |
Also Published As
Publication number | Publication date |
---|---|
CN113801233B (en) | 2024-03-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||