CN112111504A - Method for screening enzyme digestion adaptive fusion protein and IGF-I preparation method - Google Patents
- Publication number
- CN112111504A (application CN202010952359.4A)
- Authority
- CN
- China
- Prior art keywords
- fusion protein
- gly
- leu
- protease
- seq
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 108020001507 fusion proteins Proteins 0.000 title claims abstract description 136
- 102000037865 fusion proteins Human genes 0.000 title claims abstract description 135
- 238000001976 enzyme digestion Methods 0.000 title claims abstract description 33
- 238000000034 method Methods 0.000 title claims abstract description 31
- 238000012216 screening Methods 0.000 title claims abstract description 21
- 238000002360 preparation method Methods 0.000 title claims abstract description 14
- 230000003044 adaptive effect Effects 0.000 title claims abstract description 8
- 108090000723 Insulin-Like Growth Factor I Proteins 0.000 title abstract description 39
- 102000004218 Insulin-Like Growth Factor I Human genes 0.000 title abstract description 35
- 108090000623 proteins and genes Proteins 0.000 claims abstract description 66
- 108091005804 Peptidases Proteins 0.000 claims abstract description 50
- 239000004365 Protease Substances 0.000 claims abstract description 46
- 230000014509 gene expression Effects 0.000 claims abstract description 43
- 102100037486 Reverse transcriptase/ribonuclease H Human genes 0.000 claims abstract description 40
- 108090000190 Thrombin Proteins 0.000 claims abstract description 38
- 229960004072 thrombin Drugs 0.000 claims abstract description 35
- 238000003032 molecular docking Methods 0.000 claims abstract description 34
- 102000004169 proteins and genes Human genes 0.000 claims abstract description 33
- 239000013604 expression vector Substances 0.000 claims abstract description 20
- 230000000694 effects Effects 0.000 claims abstract description 19
- 238000012795 verification Methods 0.000 claims abstract description 8
- 230000029087 digestion Effects 0.000 claims abstract description 6
- 108090000790 Enzymes Proteins 0.000 claims description 65
- 102000004190 Enzymes Human genes 0.000 claims description 62
- 229940088598 enzyme Drugs 0.000 claims description 62
- 108010013369 Enteropeptidase Proteins 0.000 claims description 37
- 102100029727 Enteropeptidase Human genes 0.000 claims description 33
- 238000005520 cutting process Methods 0.000 claims description 33
- 241000588724 Escherichia coli Species 0.000 claims description 24
- 150000001413 amino acids Chemical group 0.000 claims description 18
- 101000599951 Homo sapiens Insulin-like growth factor I Proteins 0.000 claims description 15
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 claims description 12
- 102000035195 Peptidases Human genes 0.000 claims description 10
- 108020004705 Codon Proteins 0.000 claims description 9
- 239000013598 vector Substances 0.000 claims description 9
- 239000004472 Lysine Substances 0.000 claims description 4
- ZDXPYRJPNDTMRX-UHFFFAOYSA-N glutamine Natural products OC(=O)C(N)CCC(N)=O ZDXPYRJPNDTMRX-UHFFFAOYSA-N 0.000 claims description 4
- 238000005457 optimization Methods 0.000 claims description 3
- 239000004475 Arginine Substances 0.000 claims description 2
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 claims description 2
- 125000003275 alpha amino acid group Chemical group 0.000 abstract description 23
- 241000894006 Bacteria Species 0.000 description 36
- 238000003776 cleavage reaction Methods 0.000 description 20
- 230000007017 scission Effects 0.000 description 20
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 18
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 16
- 239000001963 growth medium Substances 0.000 description 12
- 239000007788 liquid Substances 0.000 description 12
- 239000006228 supernatant Substances 0.000 description 12
- 108060008226 thioredoxin Proteins 0.000 description 11
- 238000000746 purification Methods 0.000 description 10
- 238000012258 culturing Methods 0.000 description 9
- QKNYBSVHEMOAJP-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol;hydron;chloride Chemical compound Cl.OCC(N)(CO)CO QKNYBSVHEMOAJP-UHFFFAOYSA-N 0.000 description 8
- 238000002474 experimental method Methods 0.000 description 8
- 230000004927 fusion Effects 0.000 description 8
- 239000011780 sodium chloride Substances 0.000 description 8
- 238000001262 western blot Methods 0.000 description 8
- 239000000047 product Substances 0.000 description 7
- 239000012474 protein marker Substances 0.000 description 7
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 6
- 102100036407 Thioredoxin Human genes 0.000 description 6
- 239000001110 calcium chloride Substances 0.000 description 6
- 229910001628 calcium chloride Inorganic materials 0.000 description 6
- 239000002299 complementary DNA Substances 0.000 description 6
- RAXXELZNTBOGNW-UHFFFAOYSA-N imidazole Natural products C1=CNC=N1 RAXXELZNTBOGNW-UHFFFAOYSA-N 0.000 description 6
- 230000006698 induction Effects 0.000 description 6
- BPHPUYQFMNQIOC-NXRLNHOXSA-N isopropyl beta-D-thiogalactopyranoside Chemical compound CC(C)S[C@@H]1O[C@H](CO)[C@H](O)[C@H](O)[C@H]1O BPHPUYQFMNQIOC-NXRLNHOXSA-N 0.000 description 6
- 238000003259 recombinant expression Methods 0.000 description 6
- 108091008146 restriction endonucleases Proteins 0.000 description 6
- 230000001131 transforming effect Effects 0.000 description 6
- 102000002933 Thioredoxin Human genes 0.000 description 5
- 108010092854 aspartyllysine Proteins 0.000 description 5
- 108010016616 cysteinylglycine Proteins 0.000 description 5
- 108010078144 glutaminyl-glycine Proteins 0.000 description 5
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 5
- 238000012360 testing method Methods 0.000 description 5
- 229940094937 thioredoxin Drugs 0.000 description 5
- SOBIAADAMRHGKH-CIUDSAMLSA-N Ala-Leu-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O SOBIAADAMRHGKH-CIUDSAMLSA-N 0.000 description 4
- IXCHOHLPHNGFTJ-YUMQZZPRSA-N Ser-Gly-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CO)N IXCHOHLPHNGFTJ-YUMQZZPRSA-N 0.000 description 4
- 229960000723 ampicillin Drugs 0.000 description 4
- AVKUERGKIZMTKX-NJBDSQKTSA-N ampicillin Chemical compound C1([C@@H](N)C(=O)N[C@H]2[C@H]3SC([C@@H](N3C2=O)C(O)=O)(C)C)=CC=CC=C1 AVKUERGKIZMTKX-NJBDSQKTSA-N 0.000 description 4
- 108010038633 aspartylglutamate Proteins 0.000 description 4
- 238000010367 cloning Methods 0.000 description 4
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 4
- 108010000434 glycyl-alanyl-leucine Proteins 0.000 description 4
- 108090000765 processed proteins & peptides Proteins 0.000 description 4
- 150000003839 salts Chemical class 0.000 description 4
- 238000003041 virtual screening Methods 0.000 description 4
- CWFMWBHMIMNZLN-NAKRPEOUSA-N (2s)-1-[(2s)-2-[[(2s,3s)-2-amino-3-methylpentanoyl]amino]propanoyl]pyrrolidine-2-carboxylic acid Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(O)=O CWFMWBHMIMNZLN-NAKRPEOUSA-N 0.000 description 3
- JBVSSSZFNTXJDX-YTLHQDLWSA-N Ala-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@H](C)N JBVSSSZFNTXJDX-YTLHQDLWSA-N 0.000 description 3
- NXSFUECZFORGOG-CIUDSAMLSA-N Ala-Asn-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O NXSFUECZFORGOG-CIUDSAMLSA-N 0.000 description 3
- KIUYPHAMDKDICO-WHFBIAKZSA-N Ala-Asp-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O KIUYPHAMDKDICO-WHFBIAKZSA-N 0.000 description 3
- NBTGEURICRTMGL-WHFBIAKZSA-N Ala-Gly-Ser Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O NBTGEURICRTMGL-WHFBIAKZSA-N 0.000 description 3
- TZDNWXDLYFIFPT-BJDJZHNGSA-N Ala-Ile-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O TZDNWXDLYFIFPT-BJDJZHNGSA-N 0.000 description 3
- NINQYGGNRIBFSC-CIUDSAMLSA-N Ala-Lys-Ser Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CO)C(O)=O NINQYGGNRIBFSC-CIUDSAMLSA-N 0.000 description 3
- FEGOCLZUJUFCHP-CIUDSAMLSA-N Ala-Pro-Gln Chemical compound [H]N[C@@H](C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O FEGOCLZUJUFCHP-CIUDSAMLSA-N 0.000 description 3
- BTRULDJUUVGRNE-DCAQKATOSA-N Ala-Pro-Lys Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(O)=O BTRULDJUUVGRNE-DCAQKATOSA-N 0.000 description 3
- RFXXUWGNVRJTNQ-QXEWZRGKSA-N Arg-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCCN=C(N)N)N RFXXUWGNVRJTNQ-QXEWZRGKSA-N 0.000 description 3
- OTZMRMHZCMZOJZ-SRVKXCTJSA-N Arg-Leu-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O OTZMRMHZCMZOJZ-SRVKXCTJSA-N 0.000 description 3
- OPEPUCYIGFEGSW-WDSKDSINSA-N Asn-Gly-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O OPEPUCYIGFEGSW-WDSKDSINSA-N 0.000 description 3
- QXHVOUSPVAWEMX-ZLUOBGJFSA-N Asp-Asp-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O QXHVOUSPVAWEMX-ZLUOBGJFSA-N 0.000 description 3
- CVLIHKBUPSFRQP-WHFBIAKZSA-N Cys-Gly-Ala Chemical compound [H]N[C@@H](CS)C(=O)NCC(=O)N[C@@H](C)C(O)=O CVLIHKBUPSFRQP-WHFBIAKZSA-N 0.000 description 3
- SKSJPIBFNFPTJB-NKWVEPMBSA-N Cys-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CS)N)C(=O)O SKSJPIBFNFPTJB-NKWVEPMBSA-N 0.000 description 3
- LBSKYJOZIIOZIO-DCAQKATOSA-N Cys-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CS)N LBSKYJOZIIOZIO-DCAQKATOSA-N 0.000 description 3
- 102000012410 DNA Ligases Human genes 0.000 description 3
- 108010061982 DNA Ligases Proteins 0.000 description 3
- ZWABFSSWTSAMQN-KBIXCLLPSA-N Glu-Ile-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O ZWABFSSWTSAMQN-KBIXCLLPSA-N 0.000 description 3
- QGAJQIGFFIQJJK-IHRRRGAJSA-N Glu-Tyr-Gln Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O QGAJQIGFFIQJJK-IHRRRGAJSA-N 0.000 description 3
- IWAXHBCACVWNHT-BQBZGAKWSA-N Gly-Asp-Arg Chemical compound NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IWAXHBCACVWNHT-BQBZGAKWSA-N 0.000 description 3
- MHXKHKWHPNETGG-QWRGUYRKSA-N Gly-Lys-Leu Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O MHXKHKWHPNETGG-QWRGUYRKSA-N 0.000 description 3
- QAMMIGULQSIRCD-IRXDYDNUSA-N Gly-Phe-Tyr Chemical compound C([C@H](NC(=O)C[NH3+])C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C([O-])=O)C1=CC=CC=C1 QAMMIGULQSIRCD-IRXDYDNUSA-N 0.000 description 3
- LVWIJITYHRZHBO-IXOXFDKPSA-N His-Leu-Thr Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LVWIJITYHRZHBO-IXOXFDKPSA-N 0.000 description 3
- UDLAWRKOVFDKFL-PEFMBERDSA-N Ile-Asp-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N UDLAWRKOVFDKFL-PEFMBERDSA-N 0.000 description 3
- OUUCIIJSBIBCHB-ZPFDUUQYSA-N Ile-Leu-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O OUUCIIJSBIBCHB-ZPFDUUQYSA-N 0.000 description 3
- HVHRPWQEQHIQJF-AVGNSLFASA-N Leu-Lys-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O HVHRPWQEQHIQJF-AVGNSLFASA-N 0.000 description 3
- PJWOOBTYQNNRBF-BZSNNMDCSA-N Leu-Phe-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)O)N PJWOOBTYQNNRBF-BZSNNMDCSA-N 0.000 description 3
- XNKDCYABMBBEKN-IUCAKERBSA-N Lys-Gly-Gln Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O XNKDCYABMBBEKN-IUCAKERBSA-N 0.000 description 3
- JYXBNQOKPRQNQS-YTFOTSKYSA-N Lys-Ile-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JYXBNQOKPRQNQS-YTFOTSKYSA-N 0.000 description 3
- NJNRBRKHOWSGMN-SRVKXCTJSA-N Lys-Leu-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O NJNRBRKHOWSGMN-SRVKXCTJSA-N 0.000 description 3
- VKCPHIOZDWUFSW-ONGXEEELSA-N Lys-Val-Gly Chemical compound OC(=O)CNC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN VKCPHIOZDWUFSW-ONGXEEELSA-N 0.000 description 3
- RDLSEGZJMYGFNS-FXQIFTODSA-N Met-Ser-Asp Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O RDLSEGZJMYGFNS-FXQIFTODSA-N 0.000 description 3
- YJNDFEWPGLNLNH-IHRRRGAJSA-N Met-Tyr-Cys Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CS)C(O)=O)CC1=CC=C(O)C=C1 YJNDFEWPGLNLNH-IHRRRGAJSA-N 0.000 description 3
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 3
- SWZKMTDPQXLQRD-XVSYOHENSA-N Phe-Asp-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SWZKMTDPQXLQRD-XVSYOHENSA-N 0.000 description 3
- YKUGPVXSDOOANW-KKUMJFAQSA-N Phe-Leu-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O YKUGPVXSDOOANW-KKUMJFAQSA-N 0.000 description 3
- AFXCXDQNRXTSBD-FJXKBIBVSA-N Pro-Gly-Thr Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O AFXCXDQNRXTSBD-FJXKBIBVSA-N 0.000 description 3
- FDMCIBSQRKFSTJ-RHYQMDGZSA-N Pro-Thr-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O FDMCIBSQRKFSTJ-RHYQMDGZSA-N 0.000 description 3
- NLQUOHDCLSFABG-GUBZILKMSA-N Ser-Arg-Arg Chemical compound NC(N)=NCCC[C@H](NC(=O)[C@H](CO)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O NLQUOHDCLSFABG-GUBZILKMSA-N 0.000 description 3
- 241001052560 Thallis Species 0.000 description 3
- OGOYMQWIWHGTGH-KZVJFYERSA-N Thr-Val-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O OGOYMQWIWHGTGH-KZVJFYERSA-N 0.000 description 3
- MQVGIFJSFFVGFW-XEGUGMAKSA-N Trp-Ala-Glu Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O MQVGIFJSFFVGFW-XEGUGMAKSA-N 0.000 description 3
- FNWGDMZVYBVAGJ-XEGUGMAKSA-N Tyr-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC1=CC=C(C=C1)O)N FNWGDMZVYBVAGJ-XEGUGMAKSA-N 0.000 description 3
- HHSILIQTHXABKM-YDHLFZDLSA-N Val-Asp-Phe Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](Cc1ccccc1)C(O)=O HHSILIQTHXABKM-YDHLFZDLSA-N 0.000 description 3
- AEMPCGRFEZTWIF-IHRRRGAJSA-N Val-Leu-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O AEMPCGRFEZTWIF-IHRRRGAJSA-N 0.000 description 3
- 230000003213 activating effect Effects 0.000 description 3
- 108010024078 alanyl-glycyl-serine Proteins 0.000 description 3
- 239000007853 buffer solution Substances 0.000 description 3
- 210000004027 cell Anatomy 0.000 description 3
- 230000000052 comparative effect Effects 0.000 description 3
- 150000001875 compounds Chemical group 0.000 description 3
- 238000010276 construction Methods 0.000 description 3
- 238000011161 development Methods 0.000 description 3
- 108010079547 glutamylmethionine Proteins 0.000 description 3
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 3
- 229930027917 kanamycin Natural products 0.000 description 3
- 229960000318 kanamycin Drugs 0.000 description 3
- SBUJHOSQTJFQJX-NOAMYHISSA-N kanamycin Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 description 3
- 229930182823 kanamycin A Natural products 0.000 description 3
- 108010090333 leucyl-lysyl-proline Proteins 0.000 description 3
- 239000000463 material Substances 0.000 description 3
- 239000013641 positive control Substances 0.000 description 3
- 238000012163 sequencing technique Methods 0.000 description 3
- 108010007375 seryl-seryl-seryl-arginine Proteins 0.000 description 3
- 108010061238 threonyl-glycine Proteins 0.000 description 3
- 238000005303 weighing Methods 0.000 description 3
- SUMYEVXWCAYLLJ-GUBZILKMSA-N Ala-Leu-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O SUMYEVXWCAYLLJ-GUBZILKMSA-N 0.000 description 2
- ADSGHMXEAZJJNF-DCAQKATOSA-N Ala-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](C)N ADSGHMXEAZJJNF-DCAQKATOSA-N 0.000 description 2
- GIQCDTKOIPUDSG-GARJFASQSA-N Asn-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)N)N)C(=O)O GIQCDTKOIPUDSG-GARJFASQSA-N 0.000 description 2
- PBVLJOIPOGUQQP-CIUDSAMLSA-N Asp-Ala-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O PBVLJOIPOGUQQP-CIUDSAMLSA-N 0.000 description 2
- GJBUAAAIZSRCDC-GVXVVHGQSA-N Glu-Leu-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O GJBUAAAIZSRCDC-GVXVVHGQSA-N 0.000 description 2
- YQAQQKPWFOBSMU-WDCWCFNPSA-N Glu-Thr-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O YQAQQKPWFOBSMU-WDCWCFNPSA-N 0.000 description 2
- MFVQGXGQRIXBPK-WDSKDSINSA-N Gly-Ala-Glu Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O MFVQGXGQRIXBPK-WDSKDSINSA-N 0.000 description 2
- JJGBXTYGTKWGAT-YUMQZZPRSA-N Gly-Pro-Glu Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O JJGBXTYGTKWGAT-YUMQZZPRSA-N 0.000 description 2
- WCORRBXVISTKQL-WHFBIAKZSA-N Gly-Ser-Ser Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WCORRBXVISTKQL-WHFBIAKZSA-N 0.000 description 2
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 2
- BZKDJRSZWLPJNI-SRVKXCTJSA-N His-His-Ser Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O BZKDJRSZWLPJNI-SRVKXCTJSA-N 0.000 description 2
- LNJLOZYNZFGJMM-DEQVHRJGSA-N Ile-His-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N2CCC[C@@H]2C(=O)O)N LNJLOZYNZFGJMM-DEQVHRJGSA-N 0.000 description 2
- 108010065920 Insulin Lispro Proteins 0.000 description 2
- PMGDADKJMCOXHX-UHFFFAOYSA-N L-Arginyl-L-glutamin-acetat Natural products NC(=N)NCCCC(N)C(=O)NC(CCC(N)=O)C(O)=O PMGDADKJMCOXHX-UHFFFAOYSA-N 0.000 description 2
- YKNBJXOJTURHCU-DCAQKATOSA-N Leu-Asp-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YKNBJXOJTURHCU-DCAQKATOSA-N 0.000 description 2
- KPYAOIVPJKPIOU-KKUMJFAQSA-N Leu-Lys-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O KPYAOIVPJKPIOU-KKUMJFAQSA-N 0.000 description 2
- RTIRBWJPYJYTLO-MELADBBJSA-N Leu-Lys-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@@H]1C(=O)O)N RTIRBWJPYJYTLO-MELADBBJSA-N 0.000 description 2
- CGHXMODRYJISSK-NHCYSSNCSA-N Leu-Val-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(O)=O CGHXMODRYJISSK-NHCYSSNCSA-N 0.000 description 2
- LJADEBULDNKJNK-IHRRRGAJSA-N Lys-Leu-Val Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O LJADEBULDNKJNK-IHRRRGAJSA-N 0.000 description 2
- RXWPLVRJQNWXRQ-IHRRRGAJSA-N Met-His-His Chemical compound C([C@H](NC(=O)[C@@H](N)CCSC)C(=O)N[C@@H](CC=1N=CNC=1)C(O)=O)C1=CNC=N1 RXWPLVRJQNWXRQ-IHRRRGAJSA-N 0.000 description 2
- KAHUBGWSIQNZQQ-KKUMJFAQSA-N Phe-Asn-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 KAHUBGWSIQNZQQ-KKUMJFAQSA-N 0.000 description 2
- DXWNFNOPBYAFRM-IHRRRGAJSA-N Phe-Val-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N DXWNFNOPBYAFRM-IHRRRGAJSA-N 0.000 description 2
- DCHQYSOGURGJST-FJXKBIBVSA-N Pro-Thr-Gly Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O DCHQYSOGURGJST-FJXKBIBVSA-N 0.000 description 2
- 229920002684 Sepharose Polymers 0.000 description 2
- KNCJWSPMTFFJII-ZLUOBGJFSA-N Ser-Cys-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(O)=O)C(O)=O KNCJWSPMTFFJII-ZLUOBGJFSA-N 0.000 description 2
- GZFAWAQTEYDKII-YUMQZZPRSA-N Ser-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CO GZFAWAQTEYDKII-YUMQZZPRSA-N 0.000 description 2
- 239000004098 Tetracycline Substances 0.000 description 2
- IMULJHHGAUZZFE-MBLNEYKQSA-N Thr-Gly-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O IMULJHHGAUZZFE-MBLNEYKQSA-N 0.000 description 2
- JQAWYCUUFIMTHE-WLTAIBSBSA-N Thr-Gly-Tyr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JQAWYCUUFIMTHE-WLTAIBSBSA-N 0.000 description 2
- ODXKUIGEPAGKKV-KATARQTJSA-N Thr-Leu-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)O)N)O ODXKUIGEPAGKKV-KATARQTJSA-N 0.000 description 2
- VLOYGOZDPGYWFO-LAEOZQHASA-N Val-Asp-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O VLOYGOZDPGYWFO-LAEOZQHASA-N 0.000 description 2
- 238000001042 affinity chromatography Methods 0.000 description 2
- 108010005233 alanylglutamic acid Proteins 0.000 description 2
- 125000000539 amino acid group Chemical group 0.000 description 2
- 238000004458 analytical method Methods 0.000 description 2
- 108010008355 arginyl-glutamine Proteins 0.000 description 2
- 230000009286 beneficial effect Effects 0.000 description 2
- 238000011033 desalting Methods 0.000 description 2
- 238000001514 detection method Methods 0.000 description 2
- 238000000855 fermentation Methods 0.000 description 2
- 230000004151 fermentation Effects 0.000 description 2
- 108010051307 glycyl-glycyl-proline Proteins 0.000 description 2
- 108010081551 glycylphenylalanine Proteins 0.000 description 2
- 108010028295 histidylhistidine Proteins 0.000 description 2
- 230000001900 immune effect Effects 0.000 description 2
- 210000003000 inclusion body Anatomy 0.000 description 2
- 108010078274 isoleucylvaline Proteins 0.000 description 2
- 210000001503 joint Anatomy 0.000 description 2
- 108010057821 leucylproline Proteins 0.000 description 2
- 239000003446 ligand Substances 0.000 description 2
- 108010031719 prolyl-serine Proteins 0.000 description 2
- 230000003381 solubilizing effect Effects 0.000 description 2
- 239000000243 solution Substances 0.000 description 2
- 238000000527 sonication Methods 0.000 description 2
- UCSJYZPVAKXKNQ-HZYVHMACSA-N streptomycin Chemical compound CN[C@H]1[C@H](O)[C@@H](O)[C@H](CO)O[C@H]1O[C@@H]1[C@](C=O)(O)[C@H](C)O[C@H]1O[C@@H]1[C@@H](NC(N)=N)[C@H](O)[C@@H](NC(N)=N)[C@H](O)[C@H]1O UCSJYZPVAKXKNQ-HZYVHMACSA-N 0.000 description 2
- 229960002180 tetracycline Drugs 0.000 description 2
- 229930101283 tetracycline Natural products 0.000 description 2
- 235000019364 tetracycline Nutrition 0.000 description 2
- 150000003522 tetracyclines Chemical class 0.000 description 2
- IGXNPQWXIRIGBF-KEOOTSPTSA-N (2s)-2-[[(2s)-2-[[(2s)-2-[[(2s)-2-[[(2s)-2-amino-3-(1h-imidazol-5-yl)propanoyl]amino]-3-(1h-imidazol-5-yl)propanoyl]amino]-3-(1h-imidazol-5-yl)propanoyl]amino]-3-(1h-imidazol-5-yl)propanoyl]amino]-3-(1h-imidazol-5-yl)propanoic acid Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1NC=NC=1)C(=O)N[C@@H](CC=1NC=NC=1)C(=O)N[C@@H](CC=1NC=NC=1)C(=O)N[C@@H](CC=1NC=NC=1)C(O)=O)C1=CN=CN1 IGXNPQWXIRIGBF-KEOOTSPTSA-N 0.000 description 1
- CVOFKRWYWCSDMA-UHFFFAOYSA-N 2-chloro-n-(2,6-diethylphenyl)-n-(methoxymethyl)acetamide;2,6-dinitro-n,n-dipropyl-4-(trifluoromethyl)aniline Chemical compound CCC1=CC=CC(CC)=C1N(COC)C(=O)CCl.CCCN(CCC)C1=C([N+]([O-])=O)C=C(C(F)(F)F)C=C1[N+]([O-])=O CVOFKRWYWCSDMA-UHFFFAOYSA-N 0.000 description 1
- 108010091324 3C proteases Proteins 0.000 description 1
- WQVFQXXBNHHPLX-ZKWXMUAHSA-N Ala-Ala-His Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O WQVFQXXBNHHPLX-ZKWXMUAHSA-N 0.000 description 1
- FJVAQLJNTSUQPY-CIUDSAMLSA-N Ala-Ala-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN FJVAQLJNTSUQPY-CIUDSAMLSA-N 0.000 description 1
- WQVYAWIMAWTGMW-ZLUOBGJFSA-N Ala-Asp-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CS)C(=O)O)N WQVYAWIMAWTGMW-ZLUOBGJFSA-N 0.000 description 1
- IKKVASZHTMKJIR-ZKWXMUAHSA-N Ala-Asp-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O IKKVASZHTMKJIR-ZKWXMUAHSA-N 0.000 description 1
- OMMDTNGURYRDAC-NRPADANISA-N Ala-Glu-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O OMMDTNGURYRDAC-NRPADANISA-N 0.000 description 1
- BEMGNWZECGIJOI-WDSKDSINSA-N Ala-Gly-Glu Chemical compound [H]N[C@@H](C)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O BEMGNWZECGIJOI-WDSKDSINSA-N 0.000 description 1
- NIZKGBJVCMRDKO-KWQFWETISA-N Ala-Gly-Tyr Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NIZKGBJVCMRDKO-KWQFWETISA-N 0.000 description 1
- CCDFBRZVTDDJNM-GUBZILKMSA-N Ala-Leu-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O CCDFBRZVTDDJNM-GUBZILKMSA-N 0.000 description 1
- OYJCVIGKMXUVKB-GARJFASQSA-N Ala-Leu-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N OYJCVIGKMXUVKB-GARJFASQSA-N 0.000 description 1
- KLALXKYLOMZDQT-ZLUOBGJFSA-N Ala-Ser-Asn Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC(N)=O KLALXKYLOMZDQT-ZLUOBGJFSA-N 0.000 description 1
- DBKNLHKEVPZVQC-LPEHRKFASA-N Arg-Ala-Pro Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@@H]1C(O)=O DBKNLHKEVPZVQC-LPEHRKFASA-N 0.000 description 1
- UXJCMQFPDWCHKX-DCAQKATOSA-N Arg-Arg-Glu Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(O)=O)C(O)=O UXJCMQFPDWCHKX-DCAQKATOSA-N 0.000 description 1
- HJVGMOYJDDXLMI-AVGNSLFASA-N Arg-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@@H](N)CCCNC(N)=N HJVGMOYJDDXLMI-AVGNSLFASA-N 0.000 description 1
- KMSHNDWHPWXPEC-BQBZGAKWSA-N Arg-Asp-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O KMSHNDWHPWXPEC-BQBZGAKWSA-N 0.000 description 1
- JSHVMZANPXCDTL-GMOBBJLQSA-N Arg-Asp-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JSHVMZANPXCDTL-GMOBBJLQSA-N 0.000 description 1
- HAVKMRGWNXMCDR-STQMWFEESA-N Arg-Gly-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O HAVKMRGWNXMCDR-STQMWFEESA-N 0.000 description 1
- ZATRYQNPUHGXCU-DTWKUNHWSA-N Arg-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CCCN=C(N)N)N)C(=O)O ZATRYQNPUHGXCU-DTWKUNHWSA-N 0.000 description 1
- WVNFNPGXYADPPO-BQBZGAKWSA-N Arg-Gly-Ser Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O WVNFNPGXYADPPO-BQBZGAKWSA-N 0.000 description 1
- UAOSDDXCTBIPCA-QXEWZRGKSA-N Arg-Ile-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UAOSDDXCTBIPCA-QXEWZRGKSA-N 0.000 description 1
- LLUGJARLJCGLAR-CYDGBPFRSA-N Arg-Ile-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N LLUGJARLJCGLAR-CYDGBPFRSA-N 0.000 description 1
- YBZMTKUDWXZLIX-UWVGGRQHSA-N Arg-Leu-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O YBZMTKUDWXZLIX-UWVGGRQHSA-N 0.000 description 1
- PRLPSDIHSRITSF-UNQGMJICSA-N Arg-Phe-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PRLPSDIHSRITSF-UNQGMJICSA-N 0.000 description 1
- AIFHRTPABBBHKU-RCWTZXSCSA-N Arg-Thr-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O AIFHRTPABBBHKU-RCWTZXSCSA-N 0.000 description 1
- BWMMKQPATDUYKB-IHRRRGAJSA-N Arg-Tyr-Asn Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(O)=O)CC1=CC=C(O)C=C1 BWMMKQPATDUYKB-IHRRRGAJSA-N 0.000 description 1
- FTMRPIVPSDVGCC-GUBZILKMSA-N Arg-Val-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N FTMRPIVPSDVGCC-GUBZILKMSA-N 0.000 description 1
- WOZDCBHUGJVJPL-AVGNSLFASA-N Arg-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N WOZDCBHUGJVJPL-AVGNSLFASA-N 0.000 description 1
- SUMJNGAMIQSNGX-TUAOUCFPSA-N Arg-Val-Pro Chemical compound CC(C)[C@H](NC(=O)[C@@H](N)CCCNC(N)=N)C(=O)N1CCC[C@@H]1C(O)=O SUMJNGAMIQSNGX-TUAOUCFPSA-N 0.000 description 1
- QLSRIZIDQXDQHK-RCWTZXSCSA-N Arg-Val-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QLSRIZIDQXDQHK-RCWTZXSCSA-N 0.000 description 1
- PTNFNTOBUDWHNZ-GUBZILKMSA-N Asn-Arg-Met Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(O)=O PTNFNTOBUDWHNZ-GUBZILKMSA-N 0.000 description 1
- JJGRJMKUOYXZRA-LPEHRKFASA-N Asn-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(=O)N)N)C(=O)O JJGRJMKUOYXZRA-LPEHRKFASA-N 0.000 description 1
- CQMQJWRCRQSBAF-BPUTZDHNSA-N Asn-Arg-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(=O)N)N CQMQJWRCRQSBAF-BPUTZDHNSA-N 0.000 description 1
- HAJWYALLJIATCX-FXQIFTODSA-N Asn-Asn-Arg Chemical compound C(C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)N)N)CN=C(N)N HAJWYALLJIATCX-FXQIFTODSA-N 0.000 description 1
- KSBHCUSPLWRVEK-ZLUOBGJFSA-N Asn-Asn-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O KSBHCUSPLWRVEK-ZLUOBGJFSA-N 0.000 description 1
- NVGWESORMHFISY-SRVKXCTJSA-N Asn-Asn-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O NVGWESORMHFISY-SRVKXCTJSA-N 0.000 description 1
- NLCDVZJDEXIDDL-BIIVOSGPSA-N Asn-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)N)N)C(=O)O NLCDVZJDEXIDDL-BIIVOSGPSA-N 0.000 description 1
- OLGCWMNDJTWQAG-GUBZILKMSA-N Asn-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC(N)=O OLGCWMNDJTWQAG-GUBZILKMSA-N 0.000 description 1
- DDPXDCKYWDGZAL-BQBZGAKWSA-N Asn-Gly-Arg Chemical compound NC(=O)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N DDPXDCKYWDGZAL-BQBZGAKWSA-N 0.000 description 1
- DJIMLSXHXKWADV-CIUDSAMLSA-N Asn-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(N)=O DJIMLSXHXKWADV-CIUDSAMLSA-N 0.000 description 1
- XEGZSHSPQNDNRH-JRQIVUDYSA-N Asn-Tyr-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XEGZSHSPQNDNRH-JRQIVUDYSA-N 0.000 description 1
- GBAWQWASNGUNQF-ZLUOBGJFSA-N Asp-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(=O)O)N GBAWQWASNGUNQF-ZLUOBGJFSA-N 0.000 description 1
- XEDQMTWEYFBOIK-ACZMJKKPSA-N Asp-Ala-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O XEDQMTWEYFBOIK-ACZMJKKPSA-N 0.000 description 1
- AXXCUABIFZPKPM-BQBZGAKWSA-N Asp-Arg-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O AXXCUABIFZPKPM-BQBZGAKWSA-N 0.000 description 1
- PXLNPFOJZQMXAT-BYULHYEWSA-N Asp-Asp-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(O)=O PXLNPFOJZQMXAT-BYULHYEWSA-N 0.000 description 1
- KTTCQQNRRLCIBC-GHCJXIJMSA-N Asp-Ile-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O KTTCQQNRRLCIBC-GHCJXIJMSA-N 0.000 description 1
- DWOGMPWRQQWPPF-GUBZILKMSA-N Asp-Leu-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O DWOGMPWRQQWPPF-GUBZILKMSA-N 0.000 description 1
- UJGRZQYSNYTCAX-SRVKXCTJSA-N Asp-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O UJGRZQYSNYTCAX-SRVKXCTJSA-N 0.000 description 1
- RXBGWGRSWXOBGK-KKUMJFAQSA-N Asp-Lys-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O RXBGWGRSWXOBGK-KKUMJFAQSA-N 0.000 description 1
- PLNJUJGNLDSFOP-UWJYBYFXSA-N Asp-Tyr-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O PLNJUJGNLDSFOP-UWJYBYFXSA-N 0.000 description 1
- OTKUAVXGMREHRX-CFMVVWHZSA-N Asp-Tyr-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(O)=O)CC1=CC=C(O)C=C1 OTKUAVXGMREHRX-CFMVVWHZSA-N 0.000 description 1
- QPDUWAUSSWGJSB-NGZCFLSTSA-N Asp-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)O)N QPDUWAUSSWGJSB-NGZCFLSTSA-N 0.000 description 1
- 101001012262 Bos taurus Enteropeptidase Proteins 0.000 description 1
- UXVMQQNJUSDDNG-UHFFFAOYSA-L Calcium chloride Chemical compound [Cl-].[Cl-].[Ca+2] UXVMQQNJUSDDNG-UHFFFAOYSA-L 0.000 description 1
- 108700010070 Codon Usage Proteins 0.000 description 1
- PKNIZMPLMSKROD-BIIVOSGPSA-N Cys-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CS)N PKNIZMPLMSKROD-BIIVOSGPSA-N 0.000 description 1
- YFXFOZPXVFPBDH-VZFHVOOUSA-N Cys-Ala-Thr Chemical compound C[C@@H](O)[C@H](NC(=O)[C@H](C)NC(=O)[C@@H](N)CS)C(O)=O YFXFOZPXVFPBDH-VZFHVOOUSA-N 0.000 description 1
- XABFFGOGKOORCG-CIUDSAMLSA-N Cys-Asp-Leu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O XABFFGOGKOORCG-CIUDSAMLSA-N 0.000 description 1
- QJUDRFBUWAGUSG-SRVKXCTJSA-N Cys-Cys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CS)N QJUDRFBUWAGUSG-SRVKXCTJSA-N 0.000 description 1
- BPHKULHWEIUDOB-FXQIFTODSA-N Cys-Gln-Gln Chemical compound SC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O BPHKULHWEIUDOB-FXQIFTODSA-N 0.000 description 1
- KEBJBKIASQVRJS-WDSKDSINSA-N Cys-Gln-Gly Chemical compound C(CC(=O)N)[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CS)N KEBJBKIASQVRJS-WDSKDSINSA-N 0.000 description 1
- DZSICRGTVPDCRN-YUMQZZPRSA-N Cys-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CS)N DZSICRGTVPDCRN-YUMQZZPRSA-N 0.000 description 1
- HBHMVBGGHDMPBF-GARJFASQSA-N Cys-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CS)N HBHMVBGGHDMPBF-GARJFASQSA-N 0.000 description 1
- OZHXXYOHPLLLMI-CIUDSAMLSA-N Cys-Lys-Ala Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O OZHXXYOHPLLLMI-CIUDSAMLSA-N 0.000 description 1
- UDDITVWSXPEAIQ-IHRRRGAJSA-N Cys-Phe-Arg Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O UDDITVWSXPEAIQ-IHRRRGAJSA-N 0.000 description 1
- WVWRADGCZPIJJR-IHRRRGAJSA-N Cys-Val-Tyr Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CS)N WVWRADGCZPIJJR-IHRRRGAJSA-N 0.000 description 1
- 206010013883 Dwarfism Diseases 0.000 description 1
- -1 Factor Xa Proteins 0.000 description 1
- ALUBSZXSNSPDQV-WDSKDSINSA-N Gln-Cys-Gly Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CS)C(=O)NCC(O)=O ALUBSZXSNSPDQV-WDSKDSINSA-N 0.000 description 1
- UFNSPPFJOHNXRE-AUTRQRHGSA-N Gln-Gln-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O UFNSPPFJOHNXRE-AUTRQRHGSA-N 0.000 description 1
- VGTDBGYFVWOQTI-RYUDHWBXSA-N Gln-Gly-Phe Chemical compound NC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 VGTDBGYFVWOQTI-RYUDHWBXSA-N 0.000 description 1
- FFVXLVGUJBCKRX-UKJIMTQDSA-N Gln-Ile-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](CCC(=O)N)N FFVXLVGUJBCKRX-UKJIMTQDSA-N 0.000 description 1
- LURQDGKYBFWWJA-MNXVOIDGSA-N Gln-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCC(=O)N)N LURQDGKYBFWWJA-MNXVOIDGSA-N 0.000 description 1
- BJPPYOMRAVLXBY-YUMQZZPRSA-N Gln-Met-Gly Chemical compound CSCC[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)N)N BJPPYOMRAVLXBY-YUMQZZPRSA-N 0.000 description 1
- LVRKAFPPFJRIOF-GARJFASQSA-N Gln-Met-Pro Chemical compound CSCC[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)N)N LVRKAFPPFJRIOF-GARJFASQSA-N 0.000 description 1
- QFXNFFZTMFHPST-DZKIICNBSA-N Gln-Phe-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CCC(=O)N)N QFXNFFZTMFHPST-DZKIICNBSA-N 0.000 description 1
- FQCILXROGNOZON-YUMQZZPRSA-N Gln-Pro-Gly Chemical compound NC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O FQCILXROGNOZON-YUMQZZPRSA-N 0.000 description 1
- OREPWMPAUWIIAM-ZPFDUUQYSA-N Gln-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CCC(=O)N)N OREPWMPAUWIIAM-ZPFDUUQYSA-N 0.000 description 1
- ZGHMRONFHDVXEF-AVGNSLFASA-N Gln-Ser-Phe Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ZGHMRONFHDVXEF-AVGNSLFASA-N 0.000 description 1
- DUGYCMAIAKAQPB-GLLZPBPUSA-N Gln-Thr-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O DUGYCMAIAKAQPB-GLLZPBPUSA-N 0.000 description 1
- ININBLZFFVOQIO-JHEQGTHGSA-N Gln-Thr-Gly Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)N)N)O ININBLZFFVOQIO-JHEQGTHGSA-N 0.000 description 1
- UTKUTMJSWKKHEM-WDSKDSINSA-N Glu-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O UTKUTMJSWKKHEM-WDSKDSINSA-N 0.000 description 1
- VPKBCVUDBNINAH-GARJFASQSA-N Glu-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCC(=O)O)N)C(=O)O VPKBCVUDBNINAH-GARJFASQSA-N 0.000 description 1
- MLCPTRRNICEKIS-FXQIFTODSA-N Glu-Asn-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MLCPTRRNICEKIS-FXQIFTODSA-N 0.000 description 1
- MXPBQDFWIMBACQ-ACZMJKKPSA-N Glu-Cys-Cys Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@@H](CS)C(O)=O MXPBQDFWIMBACQ-ACZMJKKPSA-N 0.000 description 1
- MTAOBYXRYJZRGQ-WDSKDSINSA-N Glu-Gly-Asp Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O MTAOBYXRYJZRGQ-WDSKDSINSA-N 0.000 description 1
- WRNAXCVRSBBKGS-BQBZGAKWSA-N Glu-Gly-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(O)=O WRNAXCVRSBBKGS-BQBZGAKWSA-N 0.000 description 1
- MWMJCGBSIORNCD-AVGNSLFASA-N Glu-Leu-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O MWMJCGBSIORNCD-AVGNSLFASA-N 0.000 description 1
- NJCALAAIGREHDR-WDCWCFNPSA-N Glu-Leu-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NJCALAAIGREHDR-WDCWCFNPSA-N 0.000 description 1
- UJMNFCAHLYKWOZ-DCAQKATOSA-N Glu-Lys-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O UJMNFCAHLYKWOZ-DCAQKATOSA-N 0.000 description 1
- MFNUFCFRAZPJFW-JYJNAYRXSA-N Glu-Lys-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 MFNUFCFRAZPJFW-JYJNAYRXSA-N 0.000 description 1
- ZTVGZOIBLRPQNR-KKUMJFAQSA-N Glu-Met-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ZTVGZOIBLRPQNR-KKUMJFAQSA-N 0.000 description 1
- HZISRJBYZAODRV-XQXXSGGOSA-N Glu-Thr-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O HZISRJBYZAODRV-XQXXSGGOSA-N 0.000 description 1
- TWYSSILQABLLME-HJGDQZAQSA-N Glu-Thr-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O TWYSSILQABLLME-HJGDQZAQSA-N 0.000 description 1
- JVZLZVJTIXVIHK-SXNHZJKMSA-N Glu-Trp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@H](CCC(=O)O)N JVZLZVJTIXVIHK-SXNHZJKMSA-N 0.000 description 1
- VSVZIEVNUYDAFR-YUMQZZPRSA-N Gly-Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN VSVZIEVNUYDAFR-YUMQZZPRSA-N 0.000 description 1
- LERGJIVJIIODPZ-ZANVPECISA-N Gly-Ala-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)CN)C)C(O)=O)=CNC2=C1 LERGJIVJIIODPZ-ZANVPECISA-N 0.000 description 1
- XUDLUKYPXQDCRX-BQBZGAKWSA-N Gly-Arg-Asn Chemical compound [H]NCC(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O XUDLUKYPXQDCRX-BQBZGAKWSA-N 0.000 description 1
- GZBZACMXFIPIDX-WHFBIAKZSA-N Gly-Cys-Asp Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)CN)C(=O)O GZBZACMXFIPIDX-WHFBIAKZSA-N 0.000 description 1
- LHRXAHLCRMQBGJ-RYUDHWBXSA-N Gly-Glu-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)CN LHRXAHLCRMQBGJ-RYUDHWBXSA-N 0.000 description 1
- DENRBIYENOKSEX-PEXQALLHSA-N Gly-Ile-His Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 DENRBIYENOKSEX-PEXQALLHSA-N 0.000 description 1
- NSTUFLGQJCOCDL-UWVGGRQHSA-N Gly-Leu-Arg Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N NSTUFLGQJCOCDL-UWVGGRQHSA-N 0.000 description 1
- LRQXRHGQEVWGPV-NHCYSSNCSA-N Gly-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)CN LRQXRHGQEVWGPV-NHCYSSNCSA-N 0.000 description 1
- CLNSYANKYVMZNM-UWVGGRQHSA-N Gly-Lys-Arg Chemical compound NCCCC[C@H](NC(=O)CN)C(=O)N[C@H](C(O)=O)CCCN=C(N)N CLNSYANKYVMZNM-UWVGGRQHSA-N 0.000 description 1
- PTIIBFKSLCYQBO-NHCYSSNCSA-N Gly-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)CN PTIIBFKSLCYQBO-NHCYSSNCSA-N 0.000 description 1
- ZWRDOVYMQAAISL-UWVGGRQHSA-N Gly-Met-Lys Chemical compound CSCC[C@H](NC(=O)CN)C(=O)N[C@H](C(O)=O)CCCCN ZWRDOVYMQAAISL-UWVGGRQHSA-N 0.000 description 1
- SCJJPCQUJYPHRZ-BQBZGAKWSA-N Gly-Pro-Asn Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O SCJJPCQUJYPHRZ-BQBZGAKWSA-N 0.000 description 1
- HFPVRZWORNJRRC-UWVGGRQHSA-N Gly-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN HFPVRZWORNJRRC-UWVGGRQHSA-N 0.000 description 1
- BMWFDYIYBAFROD-WPRPVWTQSA-N Gly-Pro-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN BMWFDYIYBAFROD-WPRPVWTQSA-N 0.000 description 1
- IALQAMYQJBZNSK-WHFBIAKZSA-N Gly-Ser-Asn Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O IALQAMYQJBZNSK-WHFBIAKZSA-N 0.000 description 1
- OHUKZZYSJBKFRR-WHFBIAKZSA-N Gly-Ser-Asp Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O OHUKZZYSJBKFRR-WHFBIAKZSA-N 0.000 description 1
- LCRDMSSAKLTKBU-ZDLURKLDSA-N Gly-Ser-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)CN LCRDMSSAKLTKBU-ZDLURKLDSA-N 0.000 description 1
- PNUFMLXHOLFRLD-KBPBESRZSA-N Gly-Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 PNUFMLXHOLFRLD-KBPBESRZSA-N 0.000 description 1
- BAYQNCWLXIDLHX-ONGXEEELSA-N Gly-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)CN BAYQNCWLXIDLHX-ONGXEEELSA-N 0.000 description 1
- AFMOTCMSEBITOE-YEPSODPASA-N Gly-Val-Thr Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O AFMOTCMSEBITOE-YEPSODPASA-N 0.000 description 1
- IZVICCORZOSGPT-JSGCOSHPSA-N Gly-Val-Tyr Chemical compound [H]NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O IZVICCORZOSGPT-JSGCOSHPSA-N 0.000 description 1
- RVKIPWVMZANZLI-UHFFFAOYSA-N H-Lys-Trp-OH Natural products C1=CC=C2C(CC(NC(=O)C(N)CCCCN)C(O)=O)=CNC2=C1 RVKIPWVMZANZLI-UHFFFAOYSA-N 0.000 description 1
- DCRODRAURLJOFY-XPUUQOCRSA-N His-Ala-Gly Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)NCC(O)=O DCRODRAURLJOFY-XPUUQOCRSA-N 0.000 description 1
- XDIVYNSPYBLSME-DCAQKATOSA-N His-Met-Asp Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N XDIVYNSPYBLSME-DCAQKATOSA-N 0.000 description 1
- 241000430519 Human rhinovirus sp. Species 0.000 description 1
- AQCUAZTZSPQJFF-ZKWXMUAHSA-N Ile-Ala-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O AQCUAZTZSPQJFF-ZKWXMUAHSA-N 0.000 description 1
- WUEIUSDAECDLQO-NAKRPEOUSA-N Ile-Ala-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CCSC)C(=O)O)N WUEIUSDAECDLQO-NAKRPEOUSA-N 0.000 description 1
- ATXGFMOBVKSOMK-PEDHHIEDSA-N Ile-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N ATXGFMOBVKSOMK-PEDHHIEDSA-N 0.000 description 1
- CWJQMCPYXNVMBS-STECZYCISA-N Ile-Arg-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N CWJQMCPYXNVMBS-STECZYCISA-N 0.000 description 1
- XENGULNPUDGALZ-ZPFDUUQYSA-N Ile-Asn-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(C)C)C(=O)O)N XENGULNPUDGALZ-ZPFDUUQYSA-N 0.000 description 1
- UKTUOMWSJPXODT-GUDRVLHUSA-N Ile-Asn-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N UKTUOMWSJPXODT-GUDRVLHUSA-N 0.000 description 1
- DURWCDDDAWVPOP-JBDRJPRFSA-N Ile-Cys-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CO)C(=O)O)N DURWCDDDAWVPOP-JBDRJPRFSA-N 0.000 description 1
- OVPYIUNCVSOVNF-ZPFDUUQYSA-N Ile-Gln-Pro Natural products CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N1CCC[C@H]1C(O)=O OVPYIUNCVSOVNF-ZPFDUUQYSA-N 0.000 description 1
- MTFVYKQRLXYAQN-LAEOZQHASA-N Ile-Glu-Gly Chemical compound [H]N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O MTFVYKQRLXYAQN-LAEOZQHASA-N 0.000 description 1
- LPXHYGGZJOCAFR-MNXVOIDGSA-N Ile-Glu-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N LPXHYGGZJOCAFR-MNXVOIDGSA-N 0.000 description 1
- ZNOBVZFCHNHKHA-KBIXCLLPSA-N Ile-Ser-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ZNOBVZFCHNHKHA-KBIXCLLPSA-N 0.000 description 1
- GNXGAVNTVNOCLL-SIUGBPQLSA-N Ile-Tyr-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N GNXGAVNTVNOCLL-SIUGBPQLSA-N 0.000 description 1
- BCISUQVFDGYZBO-QSFUFRPTSA-N Ile-Val-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(O)=O BCISUQVFDGYZBO-QSFUFRPTSA-N 0.000 description 1
- JZBVBOKASHNXAD-NAKRPEOUSA-N Ile-Val-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(=O)O)N JZBVBOKASHNXAD-NAKRPEOUSA-N 0.000 description 1
- 208000031773 Insulin resistance syndrome Diseases 0.000 description 1
- IBMVEYRWAWIOTN-UHFFFAOYSA-N L-Leucyl-L-Arginyl-L-Proline Natural products CC(C)CC(N)C(=O)NC(CCCN=C(N)N)C(=O)N1CCCC1C(O)=O IBMVEYRWAWIOTN-UHFFFAOYSA-N 0.000 description 1
- TYYLDKGBCJGJGW-UHFFFAOYSA-N L-tryptophan-L-tyrosine Natural products C=1NC2=CC=CC=C2C=1CC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 TYYLDKGBCJGJGW-UHFFFAOYSA-N 0.000 description 1
- ZTLGVASZOIKNIX-DCAQKATOSA-N Leu-Gln-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ZTLGVASZOIKNIX-DCAQKATOSA-N 0.000 description 1
- QDSKNVXKLPQNOJ-GVXVVHGQSA-N Leu-Gln-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O QDSKNVXKLPQNOJ-GVXVVHGQSA-N 0.000 description 1
- HVJVUYQWFYMGJS-GVXVVHGQSA-N Leu-Glu-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O HVJVUYQWFYMGJS-GVXVVHGQSA-N 0.000 description 1
- HYMLKESRWLZDBR-WEDXCCLWSA-N Leu-Gly-Thr Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O HYMLKESRWLZDBR-WEDXCCLWSA-N 0.000 description 1
- MPSBSKHOWJQHBS-IHRRRGAJSA-N Leu-His-Met Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CCSC)C(=O)O)N MPSBSKHOWJQHBS-IHRRRGAJSA-N 0.000 description 1
- KOSWSHVQIVTVQF-ZPFDUUQYSA-N Leu-Ile-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O KOSWSHVQIVTVQF-ZPFDUUQYSA-N 0.000 description 1
- DSFYPIUSAMSERP-IHRRRGAJSA-N Leu-Leu-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DSFYPIUSAMSERP-IHRRRGAJSA-N 0.000 description 1
- LXKNSJLSGPNHSK-KKUMJFAQSA-N Leu-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N LXKNSJLSGPNHSK-KKUMJFAQSA-N 0.000 description 1
- RXGLHDWAZQECBI-SRVKXCTJSA-N Leu-Leu-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O RXGLHDWAZQECBI-SRVKXCTJSA-N 0.000 description 1
- JLWZLIQRYCTYBD-IHRRRGAJSA-N Leu-Lys-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JLWZLIQRYCTYBD-IHRRRGAJSA-N 0.000 description 1
- KQFZKDITNUEVFJ-JYJNAYRXSA-N Leu-Phe-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(C)C)CC1=CC=CC=C1 KQFZKDITNUEVFJ-JYJNAYRXSA-N 0.000 description 1
- ILDSIMPXNFWKLH-KATARQTJSA-N Leu-Thr-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O ILDSIMPXNFWKLH-KATARQTJSA-N 0.000 description 1
- AXVIGSRGTMNSJU-YESZJQIVSA-N Leu-Tyr-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N2CCC[C@@H]2C(=O)O)N AXVIGSRGTMNSJU-YESZJQIVSA-N 0.000 description 1
- VKVDRTGWLVZJOM-DCAQKATOSA-N Leu-Val-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O VKVDRTGWLVZJOM-DCAQKATOSA-N 0.000 description 1
- DEFGUIIUYAUEDU-ZPFDUUQYSA-N Lys-Asn-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DEFGUIIUYAUEDU-ZPFDUUQYSA-N 0.000 description 1
- YVSHZSUKQHNDHD-KKUMJFAQSA-N Lys-Asn-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N YVSHZSUKQHNDHD-KKUMJFAQSA-N 0.000 description 1
- ZXEUFAVXODIPHC-GUBZILKMSA-N Lys-Glu-Asn Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ZXEUFAVXODIPHC-GUBZILKMSA-N 0.000 description 1
- DCRWPTBMWMGADO-AVGNSLFASA-N Lys-Glu-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O DCRWPTBMWMGADO-AVGNSLFASA-N 0.000 description 1
- RFQATBGBLDAKGI-VHSXEESVSA-N Lys-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CCCCN)N)C(=O)O RFQATBGBLDAKGI-VHSXEESVSA-N 0.000 description 1
- PGLGNCVOWIORQE-SRVKXCTJSA-N Lys-His-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O PGLGNCVOWIORQE-SRVKXCTJSA-N 0.000 description 1
- PRSBSVAVOQOAMI-BJDJZHNGSA-N Lys-Ile-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCCCN PRSBSVAVOQOAMI-BJDJZHNGSA-N 0.000 description 1
- XREQQOATSMMAJP-MGHWNKPDSA-N Lys-Ile-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O XREQQOATSMMAJP-MGHWNKPDSA-N 0.000 description 1
- WAIHHELKYSFIQN-XUXIUFHCSA-N Lys-Ile-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O WAIHHELKYSFIQN-XUXIUFHCSA-N 0.000 description 1
- AIRZWUMAHCDDHR-KKUMJFAQSA-N Lys-Leu-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O AIRZWUMAHCDDHR-KKUMJFAQSA-N 0.000 description 1
- PYFNONMJYNJENN-AVGNSLFASA-N Lys-Lys-Gln Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N PYFNONMJYNJENN-AVGNSLFASA-N 0.000 description 1
- BOJYMMBYBNOOGG-DCAQKATOSA-N Lys-Pro-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O BOJYMMBYBNOOGG-DCAQKATOSA-N 0.000 description 1
- HKXSZKJMDBHOTG-CIUDSAMLSA-N Lys-Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CCCCN HKXSZKJMDBHOTG-CIUDSAMLSA-N 0.000 description 1
- YUTZYVTZDVZBJJ-IHPCNDPISA-N Lys-Trp-Lys Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O)=CNC2=C1 YUTZYVTZDVZBJJ-IHPCNDPISA-N 0.000 description 1
- UGCIQUYEJIEHKX-GVXVVHGQSA-N Lys-Val-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O UGCIQUYEJIEHKX-GVXVVHGQSA-N 0.000 description 1
- NCFZHKMKRCYQBJ-CIUDSAMLSA-N Met-Cys-Gln Chemical compound CSCC[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N NCFZHKMKRCYQBJ-CIUDSAMLSA-N 0.000 description 1
- OGAZPKJHHZPYFK-GARJFASQSA-N Met-Glu-Pro Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N OGAZPKJHHZPYFK-GARJFASQSA-N 0.000 description 1
- TZHFJXDKXGZHEN-IHRRRGAJSA-N Met-His-Leu Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(O)=O TZHFJXDKXGZHEN-IHRRRGAJSA-N 0.000 description 1
- UROWNMBTQGGTHB-DCAQKATOSA-N Met-Leu-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O UROWNMBTQGGTHB-DCAQKATOSA-N 0.000 description 1
- CGUYGMFQZCYJSG-DCAQKATOSA-N Met-Lys-Ser Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O CGUYGMFQZCYJSG-DCAQKATOSA-N 0.000 description 1
- WTHGNAAQXISJHP-AVGNSLFASA-N Met-Lys-Val Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O WTHGNAAQXISJHP-AVGNSLFASA-N 0.000 description 1
- GFDBWMDLBKCLQH-IHRRRGAJSA-N Met-Phe-Cys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CS)C(=O)O)N GFDBWMDLBKCLQH-IHRRRGAJSA-N 0.000 description 1
- MFDDVIJCQYOOES-GUBZILKMSA-N Met-Val-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCSC)N MFDDVIJCQYOOES-GUBZILKMSA-N 0.000 description 1
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 1
- WYBVBIHNJWOLCJ-UHFFFAOYSA-N N-L-arginyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCCN=C(N)N WYBVBIHNJWOLCJ-UHFFFAOYSA-N 0.000 description 1
- AJHCSUXXECOXOY-UHFFFAOYSA-N N-glycyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)CN)C(O)=O)=CNC2=C1 AJHCSUXXECOXOY-UHFFFAOYSA-N 0.000 description 1
- 108010079364 N-glycylalanine Proteins 0.000 description 1
- 108010087066 N2-tryptophyllysine Proteins 0.000 description 1
- 208000012902 Nervous system disease Diseases 0.000 description 1
- YYRCPTVAPLQRNC-ULQDDVLXSA-N Phe-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@@H](N)CC1=CC=CC=C1 YYRCPTVAPLQRNC-ULQDDVLXSA-N 0.000 description 1
- JEGFCFLCRSJCMA-IHRRRGAJSA-N Phe-Arg-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CO)C(=O)O)N JEGFCFLCRSJCMA-IHRRRGAJSA-N 0.000 description 1
- ZENDEDYRYVHBEG-SRVKXCTJSA-N Phe-Asp-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 ZENDEDYRYVHBEG-SRVKXCTJSA-N 0.000 description 1
- HOYQLNNGMHXZDW-KKUMJFAQSA-N Phe-Glu-Arg Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O HOYQLNNGMHXZDW-KKUMJFAQSA-N 0.000 description 1
- JWQWPTLEOFNCGX-AVGNSLFASA-N Phe-Glu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 JWQWPTLEOFNCGX-AVGNSLFASA-N 0.000 description 1
- HGNGAMWHGGANAU-WHOFXGATSA-N Phe-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 HGNGAMWHGGANAU-WHOFXGATSA-N 0.000 description 1
- AUJWXNGCAQWLEI-KBPBESRZSA-N Phe-Lys-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O AUJWXNGCAQWLEI-KBPBESRZSA-N 0.000 description 1
- FRMKIPSIZSFTTE-HJOGWXRNSA-N Phe-Tyr-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O FRMKIPSIZSFTTE-HJOGWXRNSA-N 0.000 description 1
- XKHCJJPNXFBADI-DCAQKATOSA-N Pro-Asp-Lys Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O XKHCJJPNXFBADI-DCAQKATOSA-N 0.000 description 1
- WGAQWMRJUFQXMF-ZPFDUUQYSA-N Pro-Gln-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WGAQWMRJUFQXMF-ZPFDUUQYSA-N 0.000 description 1
- KIPIKSXPPLABPN-CIUDSAMLSA-N Pro-Glu-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1 KIPIKSXPPLABPN-CIUDSAMLSA-N 0.000 description 1
- DMKWYMWNEKIPFC-IUCAKERBSA-N Pro-Gly-Arg Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O DMKWYMWNEKIPFC-IUCAKERBSA-N 0.000 description 1
- VYWNORHENYEQDW-YUMQZZPRSA-N Pro-Gly-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 VYWNORHENYEQDW-YUMQZZPRSA-N 0.000 description 1
- DRKAXLDECUGLFE-ULQDDVLXSA-N Pro-Leu-Phe Chemical compound CC(C)C[C@H](NC(=O)[C@@H]1CCCN1)C(=O)N[C@@H](Cc1ccccc1)C(O)=O DRKAXLDECUGLFE-ULQDDVLXSA-N 0.000 description 1
- SUENWIFTSTWUKD-AVGNSLFASA-N Pro-Leu-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O SUENWIFTSTWUKD-AVGNSLFASA-N 0.000 description 1
- XYAFCOJKICBRDU-JYJNAYRXSA-N Pro-Phe-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O XYAFCOJKICBRDU-JYJNAYRXSA-N 0.000 description 1
- PRKWBYCXBBSLSK-GUBZILKMSA-N Pro-Ser-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O PRKWBYCXBBSLSK-GUBZILKMSA-N 0.000 description 1
- AIOWVDNPESPXRB-YTWAJWBKSA-N Pro-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2)O AIOWVDNPESPXRB-YTWAJWBKSA-N 0.000 description 1
- BNUKRHFCHHLIGR-JYJNAYRXSA-N Pro-Trp-Asp Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)N[C@@H](CC(=O)O)C(=O)O BNUKRHFCHHLIGR-JYJNAYRXSA-N 0.000 description 1
- GNFHQWNCSSPOBT-ULQDDVLXSA-N Pro-Trp-Gln Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)N[C@@H](CCC(=O)N)C(=O)O GNFHQWNCSSPOBT-ULQDDVLXSA-N 0.000 description 1
- HOJUNFDJDAPVBI-BZSNNMDCSA-N Pro-Trp-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@@H]3CCCN3 HOJUNFDJDAPVBI-BZSNNMDCSA-N 0.000 description 1
- QHSSUIHLAIWXEE-IHRRRGAJSA-N Pro-Tyr-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O QHSSUIHLAIWXEE-IHRRRGAJSA-N 0.000 description 1
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 1
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 1
- MMGJPDWSIOAGTH-ACZMJKKPSA-N Ser-Ala-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O MMGJPDWSIOAGTH-ACZMJKKPSA-N 0.000 description 1
- QEDMOZUJTGEIBF-FXQIFTODSA-N Ser-Arg-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O QEDMOZUJTGEIBF-FXQIFTODSA-N 0.000 description 1
- YUSRGTQIPCJNHQ-CIUDSAMLSA-N Ser-Arg-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O YUSRGTQIPCJNHQ-CIUDSAMLSA-N 0.000 description 1
- HEQPKICPPDOSIN-SRVKXCTJSA-N Ser-Asp-Tyr Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 HEQPKICPPDOSIN-SRVKXCTJSA-N 0.000 description 1
- YMTLKLXDFCSCNX-BYPYZUCNSA-N Ser-Gly-Gly Chemical compound OC[C@H](N)C(=O)NCC(=O)NCC(O)=O YMTLKLXDFCSCNX-BYPYZUCNSA-N 0.000 description 1
- IXZHZUGGKLRHJD-DCAQKATOSA-N Ser-Leu-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O IXZHZUGGKLRHJD-DCAQKATOSA-N 0.000 description 1
- XKFJENWJGHMDLI-QWRGUYRKSA-N Ser-Phe-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)NCC(O)=O XKFJENWJGHMDLI-QWRGUYRKSA-N 0.000 description 1
- PJIQEIFXZPCWOJ-FXQIFTODSA-N Ser-Pro-Asp Chemical compound [H]N[C@@H](CO)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O PJIQEIFXZPCWOJ-FXQIFTODSA-N 0.000 description 1
- WNDUPCKKKGSKIQ-CIUDSAMLSA-N Ser-Pro-Gln Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O WNDUPCKKKGSKIQ-CIUDSAMLSA-N 0.000 description 1
- SRSPTFBENMJHMR-WHFBIAKZSA-N Ser-Ser-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SRSPTFBENMJHMR-WHFBIAKZSA-N 0.000 description 1
- RXUOAOOZIWABBW-XGEHTFHBSA-N Ser-Thr-Arg Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N RXUOAOOZIWABBW-XGEHTFHBSA-N 0.000 description 1
- 108020005038 Terminator Codon Proteins 0.000 description 1
- IGROJMCBGRFRGI-YTLHQDLWSA-N Thr-Ala-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O IGROJMCBGRFRGI-YTLHQDLWSA-N 0.000 description 1
- PQLXHSACXPGWPD-GSSVUCPTSA-N Thr-Asn-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PQLXHSACXPGWPD-GSSVUCPTSA-N 0.000 description 1
- MFEBUIFJVPNZLO-OLHMAJIHSA-N Thr-Asp-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O MFEBUIFJVPNZLO-OLHMAJIHSA-N 0.000 description 1
- VUVCRYXYUUPGSB-GLLZPBPUSA-N Thr-Gln-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N)O VUVCRYXYUUPGSB-GLLZPBPUSA-N 0.000 description 1
- CQNFRKAKGDSJFR-NUMRIWBASA-N Thr-Glu-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N)O CQNFRKAKGDSJFR-NUMRIWBASA-N 0.000 description 1
- OQCXTUQTKQFDCX-HTUGSXCWSA-N Thr-Glu-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N)O OQCXTUQTKQFDCX-HTUGSXCWSA-N 0.000 description 1
- QQWNRERCGGZOKG-WEDXCCLWSA-N Thr-Gly-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O QQWNRERCGGZOKG-WEDXCCLWSA-N 0.000 description 1
- SIMKLINEDYOTKL-MBLNEYKQSA-N Thr-His-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](C)C(=O)O)N)O SIMKLINEDYOTKL-MBLNEYKQSA-N 0.000 description 1
- GXUWHVZYDAHFSV-FLBSBUHZSA-N Thr-Ile-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GXUWHVZYDAHFSV-FLBSBUHZSA-N 0.000 description 1
- KZSYAEWQMJEGRZ-RHYQMDGZSA-N Thr-Leu-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O KZSYAEWQMJEGRZ-RHYQMDGZSA-N 0.000 description 1
- JWQNAFHCXKVZKZ-UVOCVTCTSA-N Thr-Lys-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JWQNAFHCXKVZKZ-UVOCVTCTSA-N 0.000 description 1
- UJQVSMNQMQHVRY-KZVJFYERSA-N Thr-Met-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O UJQVSMNQMQHVRY-KZVJFYERSA-N 0.000 description 1
- BIBYEFRASCNLAA-CDMKHQONSA-N Thr-Phe-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=CC=C1 BIBYEFRASCNLAA-CDMKHQONSA-N 0.000 description 1
- GVMXJJAJLIEASL-ZJDVBMNYSA-N Thr-Pro-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O GVMXJJAJLIEASL-ZJDVBMNYSA-N 0.000 description 1
- SGAOHNPSEPVAFP-ZDLURKLDSA-N Thr-Ser-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SGAOHNPSEPVAFP-ZDLURKLDSA-N 0.000 description 1
- IQPWNQRRAJHOKV-KATARQTJSA-N Thr-Ser-Lys Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN IQPWNQRRAJHOKV-KATARQTJSA-N 0.000 description 1
- IEZVHOULSUULHD-XGEHTFHBSA-N Thr-Ser-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O IEZVHOULSUULHD-XGEHTFHBSA-N 0.000 description 1
- ZEJBJDHSQPOVJV-UAXMHLISSA-N Thr-Trp-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZEJBJDHSQPOVJV-UAXMHLISSA-N 0.000 description 1
- FYBFTPLPAXZBOY-KKHAAJSZSA-N Thr-Val-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O FYBFTPLPAXZBOY-KKHAAJSZSA-N 0.000 description 1
- FNOQJVHFVLVMOS-AAEUAGOBSA-N Trp-Gly-Asn Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)N)C(=O)O)N FNOQJVHFVLVMOS-AAEUAGOBSA-N 0.000 description 1
- DZIKVMCFXIIETR-JSGCOSHPSA-N Trp-Gly-Glu Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O DZIKVMCFXIIETR-JSGCOSHPSA-N 0.000 description 1
- OCCYDHCUKXRPSJ-SXNHZJKMSA-N Trp-Ile-Gln Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(N)=O)C(O)=O OCCYDHCUKXRPSJ-SXNHZJKMSA-N 0.000 description 1
- CCZXBOFIBYQLEV-IHPCNDPISA-N Trp-Leu-Leu Chemical compound CC(C)C[C@H](NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)Cc1c[nH]c2ccccc12)C(O)=O CCZXBOFIBYQLEV-IHPCNDPISA-N 0.000 description 1
- MTEQZJFSEMXXRK-CFMVVWHZSA-N Tyr-Asn-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CC=C(C=C1)O)N MTEQZJFSEMXXRK-CFMVVWHZSA-N 0.000 description 1
- AYPAIRCDLARHLM-KKUMJFAQSA-N Tyr-Asn-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O AYPAIRCDLARHLM-KKUMJFAQSA-N 0.000 description 1
- CWQZAUYFWRLITN-AVGNSLFASA-N Tyr-Gln-Cys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CS)C(=O)O)N)O CWQZAUYFWRLITN-AVGNSLFASA-N 0.000 description 1
- XQYHLZNPOTXRMQ-KKUMJFAQSA-N Tyr-Glu-Arg Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O XQYHLZNPOTXRMQ-KKUMJFAQSA-N 0.000 description 1
- NOOMDULIORCDNF-IRXDYDNUSA-N Tyr-Gly-Phe Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O NOOMDULIORCDNF-IRXDYDNUSA-N 0.000 description 1
- AZGZDDNKFFUDEH-QWRGUYRKSA-N Tyr-Gly-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=C(O)C=C1 AZGZDDNKFFUDEH-QWRGUYRKSA-N 0.000 description 1
- FGVFBDZSGQTYQX-UFYCRDLUSA-N Tyr-Phe-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O FGVFBDZSGQTYQX-UFYCRDLUSA-N 0.000 description 1
- VSYROIRKNBCULO-BWAGICSOSA-N Tyr-Thr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N)O VSYROIRKNBCULO-BWAGICSOSA-N 0.000 description 1
- UDNYEPLJTRDMEJ-RCOVLWMOSA-N Val-Asn-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)NCC(=O)O)N UDNYEPLJTRDMEJ-RCOVLWMOSA-N 0.000 description 1
- LNYOXPDEIZJDEI-NHCYSSNCSA-N Val-Asn-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](C(C)C)N LNYOXPDEIZJDEI-NHCYSSNCSA-N 0.000 description 1
- ISERLACIZUGCDX-ZKWXMUAHSA-N Val-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N ISERLACIZUGCDX-ZKWXMUAHSA-N 0.000 description 1
- YODDULVCGFQRFZ-ZKWXMUAHSA-N Val-Asp-Ser Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O YODDULVCGFQRFZ-ZKWXMUAHSA-N 0.000 description 1
- FPCIBLUVDNXPJO-XPUUQOCRSA-N Val-Cys-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CS)C(=O)NCC(O)=O FPCIBLUVDNXPJO-XPUUQOCRSA-N 0.000 description 1
- LHADRQBREKTRLR-DCAQKATOSA-N Val-Cys-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](C(C)C)N LHADRQBREKTRLR-DCAQKATOSA-N 0.000 description 1
- YCMXFKWYJFZFKS-LAEOZQHASA-N Val-Gln-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N YCMXFKWYJFZFKS-LAEOZQHASA-N 0.000 description 1
- PIFJAFRUVWZRKR-QMMMGPOBSA-N Val-Gly-Gly Chemical compound CC(C)[C@H]([NH3+])C(=O)NCC(=O)NCC([O-])=O PIFJAFRUVWZRKR-QMMMGPOBSA-N 0.000 description 1
- URIRWLJVWHYLET-ONGXEEELSA-N Val-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)C(C)C URIRWLJVWHYLET-ONGXEEELSA-N 0.000 description 1
- HQYVQDRYODWONX-DCAQKATOSA-N Val-His-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CO)C(=O)O)N HQYVQDRYODWONX-DCAQKATOSA-N 0.000 description 1
- LKUDRJSNRWVGMS-QSFUFRPTSA-N Val-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N LKUDRJSNRWVGMS-QSFUFRPTSA-N 0.000 description 1
- JZWZACGUZVCQPS-RNJOBUHISA-N Val-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C(C)C)N JZWZACGUZVCQPS-RNJOBUHISA-N 0.000 description 1
- UMPVMAYCLYMYGA-ONGXEEELSA-N Val-Leu-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O UMPVMAYCLYMYGA-ONGXEEELSA-N 0.000 description 1
- SYSWVVCYSXBVJG-RHYQMDGZSA-N Val-Leu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C(C)C)N)O SYSWVVCYSXBVJG-RHYQMDGZSA-N 0.000 description 1
- RQOMPQGUGBILAG-AVGNSLFASA-N Val-Met-Leu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O RQOMPQGUGBILAG-AVGNSLFASA-N 0.000 description 1
- VNGKMNPAENRGDC-JYJNAYRXSA-N Val-Phe-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)CC1=CC=CC=C1 VNGKMNPAENRGDC-JYJNAYRXSA-N 0.000 description 1
- VCIYTVOBLZHFSC-XHSDSOJGSA-N Val-Phe-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N2CCC[C@@H]2C(=O)O)N VCIYTVOBLZHFSC-XHSDSOJGSA-N 0.000 description 1
- GBIUHAYJGWVNLN-AEJSXWLSSA-N Val-Ser-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N GBIUHAYJGWVNLN-AEJSXWLSSA-N 0.000 description 1
- 108010047495 alanylglycine Proteins 0.000 description 1
- 108010087924 alanylproline Proteins 0.000 description 1
- 108010086780 arginyl-glycyl-aspartyl-alanine Proteins 0.000 description 1
- 108010091092 arginyl-glycyl-proline Proteins 0.000 description 1
- 108010069926 arginyl-glycyl-serine Proteins 0.000 description 1
- 108010043240 arginyl-leucyl-glycine Proteins 0.000 description 1
- 108010018691 arginyl-threonyl-arginine Proteins 0.000 description 1
- 108010084758 arginyl-tyrosyl-aspartic acid Proteins 0.000 description 1
- 108010062796 arginyllysine Proteins 0.000 description 1
- 108010060035 arginylproline Proteins 0.000 description 1
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 1
- 230000001580 bacterial effect Effects 0.000 description 1
- 229960000074 biopharmaceutical Drugs 0.000 description 1
- 210000004899 c-terminal region Anatomy 0.000 description 1
- 239000003153 chemical reaction reagent Substances 0.000 description 1
- 206010012601 diabetes mellitus Diseases 0.000 description 1
- 239000000539 dimer Substances 0.000 description 1
- 108010054813 diprotin B Proteins 0.000 description 1
- 238000004090 dissolution Methods 0.000 description 1
- 229940079593 drug Drugs 0.000 description 1
- 239000003814 drug Substances 0.000 description 1
- 238000001962 electrophoresis Methods 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 238000011049 filling Methods 0.000 description 1
- 108010085059 glutamyl-arginyl-proline Proteins 0.000 description 1
- 108010057083 glutamyl-aspartyl-leucine Proteins 0.000 description 1
- 108010008237 glutamyl-valyl-glycine Proteins 0.000 description 1
- 108010049041 glutamylalanine Proteins 0.000 description 1
- 108010019832 glycyl-asparaginyl-glycine Proteins 0.000 description 1
- 108010081985 glycyl-cystinyl-aspartic acid Proteins 0.000 description 1
- 108010001064 glycyl-glycyl-glycyl-glycine Proteins 0.000 description 1
- 108010079413 glycyl-prolyl-glutamic acid Proteins 0.000 description 1
- 108010059898 glycyl-tyrosyl-lysine Proteins 0.000 description 1
- 108010010147 glycylglutamine Proteins 0.000 description 1
- 108010050848 glycylleucine Proteins 0.000 description 1
- 108010015792 glycyllysine Proteins 0.000 description 1
- 108010087823 glycyltyrosine Proteins 0.000 description 1
- 108010050343 histidyl-alanyl-glutamine Proteins 0.000 description 1
- 108010018006 histidylserine Proteins 0.000 description 1
- 102000044162 human IGF1 Human genes 0.000 description 1
- 230000005965 immune activity Effects 0.000 description 1
- 238000009776 industrial production Methods 0.000 description 1
- 239000004615 ingredient Substances 0.000 description 1
- 238000011031 large-scale manufacturing process Methods 0.000 description 1
- 108010009932 leucyl-alanyl-glycyl-valine Proteins 0.000 description 1
- 108010043322 lysyl-tryptophyl-alpha-lysine Proteins 0.000 description 1
- 108010009298 lysylglutamic acid Proteins 0.000 description 1
- 108010054155 lysyllysine Proteins 0.000 description 1
- 108010038320 lysylphenylalanine Proteins 0.000 description 1
- 238000004519 manufacturing process Methods 0.000 description 1
- 108010016686 methionyl-alanyl-serine Proteins 0.000 description 1
- 108700023046 methionyl-leucyl-phenylalanine Proteins 0.000 description 1
- 108010085203 methionylmethionine Proteins 0.000 description 1
- 239000000203 mixture Substances 0.000 description 1
- 238000000302 molecular modelling Methods 0.000 description 1
- 239000002547 new drug Substances 0.000 description 1
- 108010024607 phenylalanylalanine Proteins 0.000 description 1
- 108010012581 phenylalanylglutamate Proteins 0.000 description 1
- 229920001184 polypeptide Polymers 0.000 description 1
- 239000002244 precipitate Substances 0.000 description 1
- 102000004196 processed proteins & peptides Human genes 0.000 description 1
- 230000009465 prokaryotic expression Effects 0.000 description 1
- 230000035755 proliferation Effects 0.000 description 1
- 230000002035 prolonged effect Effects 0.000 description 1
- 108010087846 prolyl-prolyl-glycine Proteins 0.000 description 1
- 108010004914 prolylarginine Proteins 0.000 description 1
- 108010070643 prolylglutamic acid Proteins 0.000 description 1
- 230000001737 promoting effect Effects 0.000 description 1
- 230000001105 regulatory effect Effects 0.000 description 1
- 230000010076 replication Effects 0.000 description 1
- 239000013049 sediment Substances 0.000 description 1
- 108010026333 seryl-proline Proteins 0.000 description 1
- 108010071207 serylmethionine Proteins 0.000 description 1
- 238000004088 simulation Methods 0.000 description 1
- 229960005322 streptomycin Drugs 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 239000000758 substrate Substances 0.000 description 1
- 238000003786 synthesis reaction Methods 0.000 description 1
- 108010033670 threonyl-aspartyl-tyrosine Proteins 0.000 description 1
- 108010072986 threonyl-seryl-lysine Proteins 0.000 description 1
- 230000014616 translation Effects 0.000 description 1
- 108010060175 trypsinogen activation peptide Proteins 0.000 description 1
- 108010084932 tryptophyl-proline Proteins 0.000 description 1
- 108010044292 tryptophyltyrosine Proteins 0.000 description 1
- 108010051110 tyrosyl-lysine Proteins 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/475—Growth factors; Growth regulators
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/70—Vectors or expression systems specially adapted for E. coli
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/50—Fusion polypeptide containing protease site
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Genetics & Genomics (AREA)
- Engineering & Computer Science (AREA)
- Organic Chemistry (AREA)
- Zoology (AREA)
- Physics & Mathematics (AREA)
- Biophysics (AREA)
- Biotechnology (AREA)
- General Health & Medical Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Biochemistry (AREA)
- Biomedical Technology (AREA)
- Wood Science & Technology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Computational Biology (AREA)
- Theoretical Computer Science (AREA)
- Plant Pathology (AREA)
- Microbiology (AREA)
- Evolutionary Biology (AREA)
- Medical Informatics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Analytical Chemistry (AREA)
- Toxicology (AREA)
- Gastroenterology & Hepatology (AREA)
- Medicinal Chemistry (AREA)
- Peptides Or Proteins (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Enzymes And Modification Thereof (AREA)
Abstract
The invention discloses a method for screening enzyme digestion adaptive fusion proteins (fusion proteins suited to protease cleavage) and a method for preparing IGF-I. In the screening method, the gene encoding the target protein sequence is inserted into several different expression vectors, each containing a protease cleavage site, to obtain the amino acid sequences of several different fusion proteins; the secondary structure of each fusion protein amino acid sequence is predicted; the amino acid sequence of each fusion protein and its secondary structure prediction result are used as input files to predict the three-dimensional structure; the predicted three-dimensional structure is then docked with the protease corresponding to the fusion protein using the ClusPro algorithm, the cleavage systems whose proportion of correct docking results exceeds a threshold are selected for experimental verification, and the fusion protein best matched to its protease is screened out. Soluble expression systems for IGF-I based on thrombin and HRV 3C protease were screened out by this method; they offer stable, soluble, high-activity expression in large quantity and convenient protease cleavage.
Description
Technical Field
The invention relates to the technical field of recombinant protein production, and in particular to a method for screening enzyme digestion adaptive fusion proteins that saves manpower, material resources and time, and to a method for preparing IGF-I.
Background
Insulin-like growth factor-I (IGF-I) is a multifunctional regulatory factor that plays an important role in the growth, development and proliferation of cells. Human IGF-I is currently used clinically, with good results, to treat diabetes, insulin resistance syndrome, dwarfism, nervous system diseases and other conditions. When IGF-I is expressed directly in a prokaryotic expression system, the target protein usually forms inclusion bodies. To improve protein solubility and simplify purification, a solubility-enhancing tag and a purification tag are added for fusion expression. To recover the target protein, a protease cleavage site sequence is usually inserted between the target protein and these tags, so that the tags can be removed with the corresponding protease to give the final target protein.
Commonly used proteases include enterokinase, thrombin, Factor Xa and HRV 3C protease (human rhinovirus 3C protease). Each protease places certain restrictions on the amino acid sequence before and after its cleavage site (for example, the optimal cleavage site sequence of thrombin is LVPR↓GS, where ↓ indicates the position at which the protease cleaves the polypeptide chain; the same applies below), and protein drugs must not carry redundant amino acids ahead of the mature peptide. The protease paired with the fusion protein must therefore be chosen so that the target protein obtained after cleavage has a native N-terminus and/or C-terminus with no additional amino acid residues. In actual production, several fusion proteins containing different protease cleavage site sequences usually have to be expressed and purified, and the optimal expression system is chosen by comparing the cleavage efficiencies of the different proteases. Two problems can arise in this process: (1) the cleavage site is buried inside the fusion protein, so the protease cannot cut it; (2) the fusion protein contains several cleavage sites for the protease, so the protease cuts the fusion protein non-specifically and may even cut the target protein. Either case wastes manpower and material resources, prolongs the development period, and can even terminate the project.
Therefore, a rapid and efficient method is urgently needed to screen out a soluble expression system that can be cleaved successfully by protease, so that recombinant human insulin-like growth factor-I can be prepared and produced at large scale.
Disclosure of Invention
In view of the above drawbacks or needs for improvement in the prior art, one object of the present invention is to provide a method for screening fusion proteins adapted to protease cleavage. The method uses virtual screening to efficiently identify fusion proteins that their protease can cleave, thereby removing the technical risk that the cleavage effect could previously be verified only through a large number of experiments.
Another object of the invention is to provide a method for preparing insulin-like growth factor-I. The method obtains an insulin-like growth factor-I fusion protein with a good cleavage effect through the virtual screening method, achieves soluble expression and successful cleavage of the fusion protein, and solves the technical problem that recombinant human insulin-like growth factor-I is difficult to express in stable, soluble and highly active form in an Escherichia coli expression system.
The technical scheme of the invention is detailed as follows:
A method for screening enzyme digestion adaptive fusion proteins, comprising the following steps:
(1) inserting the gene encoding the target protein sequence into several different expression vectors, each containing a protease cleavage site, to obtain the amino acid sequences of several different fusion proteins; and predicting the secondary structure of each fusion protein amino acid sequence with the secondary structure prediction algorithm PSIPRED;
(2) using the amino acid sequences of the different fusion proteins from step (1) and their corresponding secondary structure prediction results as input files, and predicting the three-dimensional structure with the structure prediction algorithm I-TASSER;
(3) docking the three-dimensional structure prediction result obtained in step (2) with the protease corresponding to the fusion protein using the ClusPro algorithm, selecting the cleavage systems whose proportion of correct docking results exceeds a threshold for experimental verification, and screening out the fusion protein best matched to its protease.
In the screening method, the candidate proteases can be listed one by one, and for each protease a fusion protein sequence containing the target protein and the cleavage site of that protease is obtained, so that the number of fusion proteins equals the number of candidate proteases. The secondary structure prediction algorithm PSIPRED, the tertiary structure prediction algorithm I-TASSER and the molecular docking algorithm ClusPro used in the subsequent steps are all known technologies, and a person skilled in the art can operate them according to the instructions of each algorithm.
Preferably, the method further comprises the following steps:
(4) performing codon optimization on the gene encoding the best-matched fusion protein obtained in step (3). Codon optimization can increase the expression level.
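As a rough illustration of what codon optimization involves at its simplest, the sketch below back-translates a peptide using one fixed codon per amino acid. The codon table is an assumption chosen for illustration (codons frequently used in E. coli); it is not taken from the patent, and practical optimization also weighs factors such as GC content and mRNA structure, which are not modeled here.

```python
# Illustrative codon choices only: one frequently used E. coli codon per amino
# acid. This table is an assumption for the sketch, not data from the patent.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def back_translate(protein: str) -> str:
    """Return a DNA sequence encoding `protein`, one fixed codon per residue."""
    return "".join(PREFERRED_CODON[aa] for aa in protein)


def translate(dna: str) -> str:
    """Translate DNA produced by back_translate (round-trip check only)."""
    codon_to_aa = {codon: aa for aa, codon in PREFERRED_CODON.items()}
    return "".join(codon_to_aa[dna[i:i + 3]] for i in range(0, len(dna), 3))


# The optimal thrombin site quoted in the Background, used as a short example.
peptide = "LVPRGS"
dna = back_translate(peptide)
round_trip = translate(dna)
```

The round trip through `translate` is only a consistency check that the chosen codons still encode the same peptide.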
Preferably, in the above method, the target protein is human insulin-like growth factor-I.
Preferably, in the above method, the protease is enterokinase, thrombin or HRV 3C protease, and when the predicted three-dimensional structures of the fusion proteins corresponding to the different proteases are docked, the docking sites are selected as follows:
for the fusion protein corresponding to enterokinase, every lysine (Lys) in the amino acid sequence, together with the 4 amino acids on each side of it, is selected as a candidate cleavage site;
for the fusion protein corresponding to thrombin, arginine (Arg), together with the 4 amino acids on each side of it, is selected as a candidate cleavage site;
for the fusion protein corresponding to HRV 3C protease, glutamine (Gln), together with the 4 amino acids on each side of it, is selected as a candidate cleavage site.
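The selection rule above (a key residue plus 4 residues on each side) can be sketched as follows. This is a minimal illustration, not the inventors' implementation: the function name, the 1-based numbering and the choice to merge overlapping or adjacent windows are assumptions made here.

```python
# Key residue for each protease, as listed in the text above.
KEY_RESIDUE = {"enterokinase": "K", "thrombin": "R", "HRV 3C": "Q"}


def candidate_windows(sequence, protease, flank=4):
    """Return merged 1-based (start, end) windows around each key residue."""
    key = KEY_RESIDUE[protease]
    windows = []
    for position, residue in enumerate(sequence, start=1):
        if residue != key:
            continue
        # Clip the window to the ends of the sequence.
        start = max(1, position - flank)
        end = min(len(sequence), position + flank)
        if windows and start <= windows[-1][1] + 1:
            # Overlapping or adjacent to the previous window: merge them.
            windows[-1] = (windows[-1][0], end)
        else:
            windows.append((start, end))
    return windows


# Toy inputs: a short N-terminal fragment and an artificial peptide.
single = candidate_windows("MSDKIIHLTDDSFD", "enterokinase")
merged = candidate_windows("AAKAAKAA", "enterokinase")
```

Whether every window found this way is actually used for docking is a separate decision; the sketch only enumerates candidates.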
Preferably, in the above method, the threshold value in step (3) is 50%.
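The threshold rule can be written down in a few lines. The sketch below assumes each docking model has already been judged correct or incorrect by some criterion (that judgment is not modeled here), and the system names and values are hypothetical placeholders rather than results from the patent.

```python
THRESHOLD = 0.50  # threshold value stated in the text


def correct_ratio(model_is_correct):
    """Fraction of docking models judged correct (a list of booleans)."""
    if not model_is_correct:
        return 0.0
    return sum(model_is_correct) / len(model_is_correct)


def select_systems(results, threshold=THRESHOLD):
    """Return the systems whose correct-docking ratio exceeds the threshold."""
    return [name for name, flags in results.items()
            if correct_ratio(flags) > threshold]


# Hypothetical judgments for illustration only; not data from the patent.
example = {
    "system A": [True, True, True, False],    # ratio 0.75
    "system B": [True, False, False, False],  # ratio 0.25
    "system C": [True, True, False, False],   # ratio 0.50, does not exceed
}
selected = select_systems(example)
```

Note that a ratio exactly equal to the threshold is not selected, because the text requires the ratio to exceed it.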
A method for preparing recombinant human insulin-like growth factor-I, in which a fusion protein with the amino acid sequence shown in SEQ ID NO.1 is expressed in a prokaryotic host using the pET-32a (+) vector, and the expressed fusion protein is cleaved with thrombin. The gene encoding the fusion protein is cloned into the pET-32a (+) vector between the MscI and EcoRV restriction sites.
Preferably, in the above IGF-I preparation method, the encoding gene of the fusion protein shown in SEQ ID NO.1 is:
(1) a gene sequence shown as SEQ ID NO. 4; or
(2) a gene sequence that has 90 to 100 percent homology with the gene sequence shown in SEQ ID NO.4 and encodes a protein with the same function; or
(3) a gene sequence derived from the gene sequence of (1) by adding, deleting or replacing one or more codons, and encoding a protein with the same activity.
A method for preparing recombinant human insulin-like growth factor-I, in which a fusion protein with the amino acid sequence shown in SEQ ID NO.3 is expressed in a prokaryotic host using the pET-48b (+) vector, and the expressed fusion protein is cleaved with HRV 3C protease. The gene encoding the fusion protein is cloned into the pET-48b (+) vector between the SacII and HindIII restriction sites.
Preferably, in the above IGF-I preparation method, the coding gene corresponding to the fusion protein shown in SEQ ID NO.3 is:
(1) the gene sequence shown in SEQ ID NO.5; or
(2) a gene sequence that has 90 to 100 percent homology with the gene sequence shown in SEQ ID NO.5 and encodes a protein with the same function; or
(3) a gene sequence derived from the gene sequence of (1) by adding, deleting or replacing one or more codons, and encoding a protein with the same activity.
Preferably, in any of the above-described preparation methods, the prokaryotic host is BL21(DE3) E.coli strain, Rosetta-gami B (DE3) E.coli strain, Origami B (DE3) E.coli strain or Rosetta-gami2(DE3) E.coli strain.
Compared with the prior art, the invention has the following beneficial effects:
The method for screening enzyme digestion adaptive fusion proteins provided by the invention uses virtual screening: through structure simulation and molecular docking it compares the cleavage effects of several proteases in silico and selects a suitable fusion protein expression system and protease before any experiment is run. The experimental verification agreed with the prediction. Compared with the conventional, high-input approach of screening by testing each candidate experimentally, the method greatly reduces capital, labor and time costs and is more convenient and efficient.
Using this screening method, the invention obtained the cleavage effects of thrombin, enterokinase and HRV 3C protease by virtual screening, and the experimental verification agreed with the prediction: thrombin, HRV 3C protease and their corresponding fusion proteins are the more suitable for preparing IGF-I. A soluble expression system that suits recombinant human insulin-like growth factor-I and can be cleaved successfully was thereby obtained.
The method for preparing recombinant human insulin-like growth factor-I provided by the invention gives a fusion protein with soluble expression and a high expression level, from which recombinant human insulin-like growth factor-I is successfully released by protease cleavage, solving the technical problem that human insulin-like growth factor-I cannot be obtained by direct expression.
Drawings
FIG. 1 is a PSIPRED online prediction server input interface;
FIG. 2 is an I-TASSER online prediction server input interface;
FIG. 3 is a ClusPro online prediction server input interface;
FIG. 4 is a map of E.coli expression vector pET-32a (+);
FIG. 5 is a map of the multiple cloning site region of E.coli expression vector pET-32a (+);
FIG. 6 is an SDS-PAGE analysis of the thrombin fusion protein;
FIG. 7 is a Western blot analysis of the thrombin fusion protein;
FIG. 8 is a map of E.coli expression vector pET-48b (+);
FIG. 9 is a map of the multiple cloning site region of E.coli expression vector pET-48b (+);
FIG. 10 is a Western blot analysis of the HRV 3C enzyme fusion protein;
FIG. 11 is an SDS-PAGE analysis of the thrombin fusion protein after digestion with thrombin;
FIG. 12 is a Western blot analysis of the HRV 3C enzyme fusion protein after digestion with HRV 3C enzyme;
FIG. 13 is a Western blot analysis of the enterokinase fusion protein;
FIG. 14 is an SDS-PAGE analysis of enterokinase fusion protein after digestion with enterokinase.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention are described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
In earlier work, through in-depth study of protein structure and molecular modeling methods, the inventors successfully improved the thermal stability of proteins using several structure prediction algorithms (Li, et al. Appl Environ Microb, 2018, 84(2), e02129-17; Li, et al. RSC Adv, 2018, 8, 1948). The present invention uses a structure prediction algorithm and a molecular docking algorithm to virtually screen out fusion proteins suitable for preparing recombinant human insulin-like growth factor-I, and then cleaves them successfully with the corresponding protease to prepare recombinant human insulin-like growth factor-I. With the technical scheme of the invention, IGF-I can be expressed in stable, soluble and highly active form in an Escherichia coli expression system, which makes industrial production of IGF-I possible and benefits new drug development and clinical application.
With respect to the examples, it is noted that the following detailed description is illustrative and is intended to provide further explanation of the invention. Unless otherwise indicated, the technical terms used are terms commonly used by those of ordinary skill in the art; the experimental method without specific conditions noted is a conventional experimental method; the test materials used are commercially available products unless otherwise specified, and the ingredients and preparation methods of various reagents and media can be found in conventional laboratory manuals.
Example 1: obtaining amino acid sequences of fusion proteins
According to literature reports, direct expression of insulin-like growth factor-I (IGF-I) in Escherichia coli forms inclusion bodies (Rosano GL, et al. Front Microbiol, 2014, 5:172), and soluble expression of a protein can be achieved by fusing it to a solubility-enhancing protein tag. The pET system is currently the most widely used family of Escherichia coli expression vectors; these vectors carry a solubility-enhancing tag such as thioredoxin and a His-tag purification tag, which promote soluble expression and purification of the fusion protein.
So that no extra amino acid remains at the N-terminus of IGF-I after the fusion protein is cleaved by protease, the candidate proteases are thrombin, Factor Xa, enterokinase, TEV enzyme and HRV 3C enzyme. Taking enzyme cost and availability into account, the simulated cleavage experiments were performed with thrombin, enterokinase and HRV 3C enzyme. Different pET expression vectors contain the cleavage sites of thrombin, enterokinase and HRV 3C enzyme, so the target protein can be released by protease cleavage. Finally, three fusion proteins were chosen for the subsequent experiments: pET-32a (+)-thrombin cleavage site-IGF-I, pET-32a (+)-enterokinase cleavage site-IGF-I and pET-48b (+)-HRV 3C cleavage site-IGF-I.
The amino acid sequence of IGF-I is shown in SEQ ID No. 6:
GPETLCGAELVDALQFVCGDRGFYFNKPTGYGSSSRRAPQTGIVDECCFRSCDLRRLEMYCAPLKPAKSA
The amino acid sequence of IGF-I was placed immediately after the thrombin cleavage site sequence LVPR of the pET-32a (+) vector, after the enterokinase cleavage site sequence DDDDK of the pET-32a (+) vector, and after the HRV 3C cleavage site sequence LEVLFQ of the pET-48b (+) vector, respectively, to give the following fusion protein amino acid sequences.
Thrombin fusion protein amino acid sequence SEQ ID No.1:
MSDKIIHLTDDSFDTDVLKADGAILVDFWAEWCGPCKMIAPILDEIADEYQGKLTVAKLNIDQNPGTAPKYGIRGIPTLLLFKNGEVAATKVGALSKGQLKEFLDANLAGSGSGHMHHHHHHSSGLVPRGPETLCGAELVDALQFVCGDRGFYFNKPTGYGSSSRRAPQTGIVDECCFRSCDLRRLEMYCAPLKPAKSA
Enterokinase fusion protein amino acid sequence SEQ ID No.2:
MSDKIIHLTDDSFDTDVLKADGAILVDFWAEWCGPCKMIAPILDEIADEYQGKLTVAKLNIDQNPGTAPKYGIRGIPTLLLFKNGEVAATKVGALSKGQLKEFLDANLAGSGSGHMHHHHHHSSGLVPRGSGMKETAAAKFERQHMDSPDLGTDDDDKGPETLCGAELVDALQFVCGDRGFYFNKPTGYGSSSRRAPQTGIVDECCFRSCDLRRLEMYCAPLKPAKSA
HRV 3C enzyme fusion protein amino acid sequence SEQ ID No.3:
MSDKIIHLTDDSFDTDVLKADGAILVDFWAEWCGPCKMIAPILDEIADEYQGKLTVAKLNIDQNPGTAPKYGIRGIPTLLLFKNGEVAATKVGALSKGQLKEFLDANLAGSGSGHTSGGGGSNNNPPTPTPSSGSGHHHHHHSAALEVLFQGPETLCGAELVDALQFVCGDRGFYFNKPTGYGSSSRRAPQTGIVDECCFRSCDLRRLEMYCAPLKPAKSA
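Two properties of such a fusion sequence are easy to check before any structure prediction: that the cleavage site sits directly in front of IGF-I (so cleavage leaves a native N-terminus), and how many times the site motif occurs in the whole chain. The sketch below is a minimal illustration; the `upstream` tag segment is a hypothetical placeholder, not the real vector-derived sequence, and the function name is an assumption.

```python
# IGF-I sequence as given in SEQ ID No.6 (70 residues).
IGF_I = ("GPETLCGAELVDALQFVCGDRGFYFNKPTGYGSSSRRAPQTGIVDECCF"
         "RSCDLRRLEMYCAPLKPAKSA")

# Cleavage site sequences named in the text.
SITES = {"thrombin": "LVPR", "enterokinase": "DDDDK", "HRV 3C": "LEVLFQ"}


def check_fusion(fusion, protease, target=IGF_I):
    """Report whether the site directly precedes the target, and site count."""
    site = SITES[protease]
    return {
        "target_at_c_terminus": fusion.endswith(target),
        "site_directly_before_target": fusion.endswith(site + target),
        # str.count counts non-overlapping occurrences of the motif.
        "site_occurrences": fusion.count(site),
    }


# Hypothetical upstream tag segment, for illustration only.
upstream = "MHHHHHHSSG"
fusion = upstream + SITES["thrombin"] + IGF_I
report = check_fusion(fusion, "thrombin")
```

A site count above one flags a motif that also appears elsewhere in the chain, which is the non-specific cleavage risk raised in the Background.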
Example 2: predicting the secondary structure of the fusion proteins
Open the PSIPRED online prediction server (http://bioinf.cs.ucl.ac.uk/psipred/); its page is shown in FIG. 1. Enter the amino acid sequence obtained in Example 1 into the blank "Protein Sequence" field and fill in the Job Name and Email address. After the task is submitted, the ss2 file containing the prediction result is sent to the specified mailbox.
The predicted secondary structure of the thrombin fusion protein from Example 1 is as follows, where C represents a random coil, S represents a β-sheet, and H represents an α-helix:
CCCCCSSCCHHHHHHHHHHCCCCSSSSSSCCCCHHHHHHHHHHHHHHHHHCCCSSSSSSSCCCCCCHHHHHCCCCCCSSSSSSCCSSSSSSSCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHCCCCSSSCCCCCCCCCCCCCCCCCCCCCHHCCCCCSSSSSSSSCCCCCCCCC
The predicted secondary structure of the enterokinase fusion protein from Example 1 is as follows:
CCCCCSSCCCCCHHHHHHCCCCCSSSSSCCCCCHHHHHHHHHHHHHHHHHCCCSSSSSSSCCCCCCCCCCCCCCCCCSSSSSSCCSSSSSSSCCCCHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHCCCCSSSCCCCCCCCCCCCCCCCCCCCCHHCCCCCCCHHHHHCCCCCCCCCC
The predicted secondary structure of the HRV 3C enzyme fusion protein from Example 1 is as follows:
CCCCCSSCCCCCHHHHHHCCCCCSSSSSCCCCCHHHHHHHHHHHHHHHHHCCCSSSSSSSCCCCCCCCCCCCCCCCCSSSSSSCCSSSSSSSCCCCHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHCCCCCCHHHHHHHHHHHHHCCCCSSSCCCCCCCCCCCCCCCCCCCCCHHCCCCCCCHHHHHCCCCCCCCCC
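The per-residue strings above are easier to compare once they are collapsed into segments. The sketch below run-length encodes a string in the C/S/H alphabet used in this example into (state, start, end) tuples with 1-based positions; the function name is an assumption, and the input shown is only the first 19 characters of the thrombin fusion protein prediction.

```python
from itertools import groupby


def segments(ss):
    """Collapse a per-residue string into (state, start, end), 1-based."""
    result = []
    position = 1
    for state, run in groupby(ss):
        length = len(list(run))
        result.append((state, position, position + length - 1))
        position += length
    return result


# First 19 positions of the thrombin fusion protein prediction shown above.
prefix = "CCCCCSSCCHHHHHHHHHH"
prefix_segments = segments(prefix)
```

Applied to a full prediction, the same function makes it quick to see whether the residues around a cleavage site fall in a coil, helix or sheet segment.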
Example 3: predicting the three-dimensional structure of the fusion proteins
Open the I-TASSER online prediction server (https://zhanglab.ccmb.med.umich.edu/I-TASSER/); its page is shown in FIG. 2. Enter the amino acid sequence of the fusion protein into the sequence box, then use the upload button under "Option III: Specify secondary structure for specific residues" to upload the ss2 file obtained in Example 2. Fill in an email address and a task name and submit the prediction task. When the task is finished, the predicted PDB file of the three-dimensional structure of the fusion protein is received.
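Before a predicted model is passed on to docking, it is worth confirming that it covers the whole fusion sequence. The sketch below reads C-alpha atoms from PDB-format text using the standard fixed-column ATOM record layout; the two-residue sample is generated in code purely for illustration and is not an I-TASSER output, and both function names are assumptions.

```python
def format_atom(serial, name, res_name, chain, res_seq, x, y, z):
    """Build one fixed-column PDB ATOM record (toy helper for the sample)."""
    return (f"ATOM  {serial:>5} {name:<4} {res_name:>3} {chain}{res_seq:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{0.0:6.2f}")


def read_ca_atoms(pdb_text):
    """Return [(residue_number, residue_name, (x, y, z))] for C-alpha atoms."""
    atoms = []
    for line in pdb_text.splitlines():
        # Columns 13-16 hold the atom name, 18-20 the residue name,
        # 23-26 the residue number, 31-54 the x, y, z coordinates.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            coords = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            atoms.append((int(line[22:26]), line[17:20].strip(), coords))
    return atoms


# Toy two-residue model; coordinates are arbitrary illustration values.
sample = "\n".join([
    format_atom(1, " N", "MET", "A", 1, 11.104, 6.134, -6.504),
    format_atom(2, " CA", "MET", "A", 1, 11.639, 6.071, -5.147),
    format_atom(3, " CA", "SER", "A", 2, 14.200, 8.500, -4.000),
])
ca_atoms = read_ca_atoms(sample)
model_covers_sequence = len(ca_atoms) == len("MS")
```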
Example 4 molecular docking of fusion proteins with proteases
The ClusPro online prediction server (https://cluspro.bu.edu/home.php) is opened; the page is shown in FIG. 3. The "Upload PDB" button in the "Receptor" column is clicked to upload the fusion protein PDB file predicted in Example 3. In the "Ligand" column, the PDB ID of the protease corresponding to the fusion protein is entered: "1ETR" for thrombin, "1EKB" for enterokinase and "2IN2" for HRV 3C enzyme.
To speed up docking and improve its accuracy, in the "Attraction and Repulsion" options that follow, the "Attraction" field for the Receptor is filled with the candidate cleavage sites of the fusion protein, and the "Attraction" field for the Ligand is filled with the surface residues near the protease active site that may come into contact with the fusion protein.
For the thrombin fusion protein, Arg and the 4 amino acids on either side of it are selected as candidate cleavage sites; for the enterokinase fusion protein, every Lys in the amino acid sequence and the 4 amino acids on either side of it are selected; and for the HRV 3C enzyme fusion protein, Gln and the 4 amino acids on either side of it are selected. Preferably, the docking region of the thrombin fusion protein is positions 49-57, 66-78, 93-101, 125-133 and 146-154 of SEQ ID No. 1; the docking region of the enterokinase fusion protein is positions 1-8, 15-23, 33-41, 49-62, 66-75, 79-105, 130-144, 154-162, 181-189 and 219-228 of SEQ ID No. 2; and the docking region of the HRV 3C enzyme fusion protein is positions 47-55, 59-67, 95-103, 147-155, 162-170 and 187-195 of SEQ ID No. 3.
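The "residue plus 4 amino acids on either side" rule can be sketched as follows. This is a minimal illustration, not the inventors' procedure: the function name `candidate_windows` is hypothetical, and the preferred regions listed in the text were chosen by the inventors and are not all reproduced by this simple rule. For Arg at position 129 of SEQ ID No. 1, however, the rule gives 125-133, which is one of the listed thrombin regions.

```python
def candidate_windows(seq, residue, flank=4):
    """Return 1-based (start, end) windows spanning `flank` residues on either side
    of every occurrence of `residue`, clipped to the ends of the sequence."""
    windows = []
    for i, aa in enumerate(seq, start=1):
        if aa == residue:
            windows.append((max(1, i - flank), min(len(seq), i + flank)))
    return windows

# Short fragment around the thrombin site of the fusion protein (positions are
# relative to the fragment, not to SEQ ID No. 1)
fragment = "SSGLVPRGPETLC"
print(candidate_windows(fragment, "R"))
```

Overlapping windows are returned separately here; merging them into contiguous regions before entering them in the docking form would be a natural extension.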
The amino acid sequence of thrombin (PDB ID:1ETR) is shown in SEQ ID No. 7:
TFGAGEADCGLRPLFEKKQVQDQTEKELFESYIEGRIVEGQDAEVGLSPWQVMLFRKSPQELLCGASLISDRWVLTAAHCLLYPPWDKNFTVDDLLVRIGKHSRTRYERKVEKISMLDKIYIHPRYNWKENLDRDIALLKLKRPIELSDYIHPVCLPDKQTAAKLLHAGFKGRVTGWGNRRETWTTSVAEVQPSVLQVVNLPLVERPVCKASTRIRITDNMFCAGYKPGEGKRGDACEGDSGGPFVMKSPYNNRWYQMGIVSWGEGCDRDGKYGFYTHVFRLKKWIQKVIDRLGS
Its docking region comprises positions 51-53, 55-56, 58-65, 79-80, 83, 85-86, 88, 90, 101, 104, 105, 131, 132, 135, 175, 179, 182, 190, 192, 194, 214, 215, 224, 225, 234, 241, 261, 267, 269, 273 and 276.
The amino acid sequence of enterokinase (PDB ID: 1EKB) is shown in SEQ ID No. 8:
CGKKLVTQEVSPKIVGGSDSREGAWPWVVALYFDDQQVCGASLVSRDWLVSAAHCVYGRNMEPSKWKAVLGLHMASNLTSPQIETRLIDQIVINPHYNKRRKNNDIAMMHLEMKVNYTDYIQPICLPEENQVFPPGRICSIAGWGALIYQGSTADVLQEADVPLLSNEKCQQQMPEYNITENMVCAGYEAGGVDSCQGDSGGPLMCQENNRWLLAGVTSFGYQCALPNRPGVYARVPRFTEWIQSFLH
Its docking region comprises positions 33, 36-39, 54-55, 57-59, 97, 99-100, 102, 105, 195-221, 218-221 and 231-232.
The amino acid sequence of HRV 3C enzyme (PDB ID: 2IN2) is shown in SEQ ID No. 9:
GPNTEFALSLLRKNIMTITTSKGEFTGLGIHDRVCVIPTHAQPGDDVLVNGQKIRVKDKYKLVDPENINLELTVLTLDRNEKFRDIRGFISEDLEGVDATLVVHSNNFTNTILEVGPVTMAGLINLSSTPTNRMIRYDYATKTGQCGGVLCATGKIFGIHVGGNGRQGFSAQLKKQYFVEKQ
Its docking region comprises positions 26-30, 44-48, 64-68, 74, 76-78, 111-112, 131-137, 139, 147-152, 166-170 and 174.
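Position lists of this kind are easier to check or reuse once expanded into explicit residue numbers. The helper below is a hypothetical sketch (the name `parse_positions` is not part of any server interface); overlapping ranges, such as those in the enterokinase list, collapse into unique positions.

```python
def parse_positions(spec):
    """Expand a string such as "33, 36-39" into a sorted list of unique residue numbers."""
    positions = set()
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue
        if "-" in token:
            start, end = (int(x) for x in token.split("-"))
            positions.update(range(start, end + 1))
        else:
            positions.add(int(token))
    return sorted(positions)

# Docking region of HRV 3C enzyme as listed in the text
hrv3c = "26-30, 44-48, 64-68, 74, 76-78, 111-112, 131-137, 139, 147-152, 166-170, 174"
residues = parse_positions(hrv3c)
print(len(residues), residues[:8])
```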
After the docking files and docking sites have been entered, the "Dock" button is clicked to run the docking prediction. ClusPro outputs the 20 most probable sets of docked complex structures together with their pdb structure files, and gives the number of structures contained in each set. The 20 complex structure files are inspected to determine the docking position of each complex.
The output of the thrombin and fusion protein docking is as follows:
(The number in parentheses is the number of complex structures in each set.) 1) PR ↓ GP (220; the largest structure set); 2) IR ↓ GI (161); 3) no definite docking position; 4) PR ↓ GP (64); 5) no definite docking position; 6) IR ↓ GI (49); 7) PR ↓ GP (46); 8) PR ↓ GP (43); 9) PR ↓ GP (29); 10) DR ↓ GF (26); 11) no definite docking position; 12) no definite docking position; 13) no definite docking position; 14) DR ↓ GF (15); 15) IR ↓ GI (14); 16) PR ↓ GP (12); 17) PR ↓ GP (10); 18) no definite docking position; 19) IR ↓ GI (4); 20) DR ↓ GF (1).
Counting the docking positions and the number of structures at each position shows that 61.1% of the thrombin active-site dockings fall at the correct cleavage position (PR ↓ GP):
MSDKIIHLTDDSFDTDVLKADGAILVDFWAEWCGPCKMIAPILDEIADEYQGKLTVAKLNIDQNPGTAPKYGIR(32.8%)GIPTLLLFKNGEVAATKVGALSKGQLKEFLDANLAGSGSGHMHHHHHHSSGLVPR(61.1%)GPETLCGAELVDALQFVCGDR(6.1%)GFYFNKPTGYGSSSRRAPQTGIVDECCFRSCDLRRLEMYCAPLKPAKSA
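The percentages can be recomputed from the set sizes listed for the thrombin docking. The sketch below assumes that sets without a definite docking position are left out of the total; this assumption is not stated in the text, but it reproduces the 61.1% and 6.1% figures, while the IR ↓ GI share comes out at about 32.85%. The function name `site_percentages` is hypothetical.

```python
from collections import Counter

# (cleavage position, number of structures) for the sets with a definite docking
# position, transcribed from the thrombin docking output listed above
clusters = [
    ("PR↓GP", 220), ("IR↓GI", 161), ("PR↓GP", 64), ("IR↓GI", 49),
    ("PR↓GP", 46), ("PR↓GP", 43), ("PR↓GP", 29), ("DR↓GF", 26),
    ("DR↓GF", 15), ("IR↓GI", 14), ("PR↓GP", 12), ("PR↓GP", 10),
    ("IR↓GI", 4), ("DR↓GF", 1),
]

def site_percentages(clusters):
    """Sum the structures per cleavage position and express each as a percentage."""
    totals = Counter()
    for site, n in clusters:
        totals[site] += n
    grand_total = sum(totals.values())
    return {site: 100.0 * n / grand_total for site, n in totals.items()}

percentages = site_percentages(clusters)
for site, pct in percentages.items():
    print(f"{site}: {pct:.2f}%")
```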
Similarly, the cleavage positions predicted for enterokinase and their percentages are given below; the proportion at the correct cleavage site (DK ↓ GP) is 14.7%:
MSDK(17.1%)IIHLTDDSFDTDVLKADGAILVDFWAEWCGPCK(1.8%)MIAPILDEIADEYQGKLTVAKLNIDQNPGTAPK(30.1%)YGIRGIPTLLLFKNGEVAATK(10.4%)VGALSK(2.3%)GQLKEFLDANLAGSGSGHMHHHHHHSSGLVPRGSGMK(14.4%)ETAAAK(3.0%)FERQHMDSPDLGTDDDDK(14.7%)GPETLCGAELVDALQFVCGDRGFYFNKPTGYGSSSRRAPQTGIVDECCFRSCDLRRLEMYCAPLK(6.1%)PAKSA
The cleavage positions predicted for HRV 3C enzyme and their percentages are given below; the proportion at the correct cleavage site (FQ ↓ GP) is 56.6%:
MSDKIIHLTDDSFDTDVLKADGAILVDFWAEWCGPCKMIAPILDEIADEYQGKLTVAKLNIDQNPGTAPKYGIRGIPTLLLFKNGEVAATKVGALSKGQLKEFLDANLAGSGSGHTSGGGGSNNNPPTPTPSSGSGHHHHHHSAALEVLFQ(56.6%)GPETLCGAELVDALQ(43.4%)FVCGDRGFYFNKPTGYGSSSRRAPQTGIVDECCFRSCDLRRLEMYCAPLKPAKSA
A proportion of the correct cleavage site above 50% in the prediction result is taken to mean that the protease can readily contact the substrate and cleave specifically at the correct site, so 50% is selected as the threshold. Accordingly, the thrombin cleavage system and the HRV 3C cleavage system were selected as examples for expression experiments, and the enterokinase cleavage system was selected as a comparative example.
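The selection step amounts to a simple filter on the predicted proportions. The following sketch uses the three percentages reported above; the names `select_systems` and `correct_site_share` are hypothetical.

```python
THRESHOLD = 50.0  # percent, the threshold chosen in the text

# Proportion of dockings at the correct cleavage site, as reported above
correct_site_share = {
    "thrombin": 61.1,
    "enterokinase": 14.7,
    "HRV 3C": 56.6,
}

def select_systems(shares, threshold=THRESHOLD):
    """Return the proteases whose correct-site share exceeds the threshold."""
    return [name for name, pct in shares.items() if pct > threshold]

selected = select_systems(correct_site_share)
print(selected)
```

Thrombin and HRV 3C pass the filter, and enterokinase does not, which matches the choice of examples and comparative example.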
Example 5 preparation of fusion protein and enzyme cleavage
5.1 Obtaining the fusion protein gene sequences
To achieve large-scale, high-efficiency expression of each fusion protein in Escherichia coli, the inventors designed gene sequences suitable for expression with reference to the codon bias of Escherichia coli, taking into account factors such as codon degeneracy, GC content and suitable restriction endonuclease sites.
Specifically, the thrombin fusion protein gene sequence is shown in SEQ ID No. 4; its 5' end carries an MscI restriction endonuclease site (TGGCCA), and its 3' end carries two consecutive stop codons (TAATGA) followed by an EcoRV recognition site (GATATC):
TGGCCATATGCACCATCATCATCATCATTCTTCTGGTCTGGTGCCACGCGGTCCGGAGACCCTGTGCGGTGCGGAACTGGTGGACGCGCTGCAATTTGTTTGCGGTGATCGTGGCTTCTACTTTAACAAGCCGACCGGTTATGGTAGCAGCAGCCGTCGTGCGCCGCAGACCGGTATCGTTGACGAGTGCTGCTTCCGTAGCTGCGATCTGCGTCGTCTGGAAATGTATTGCGCGCCGCTGAAGCCGGCGAAAAGCGCGTAATGAGATATC
The HRV 3C enzyme fusion protein gene sequence is shown in SEQ ID No. 5; its 5' end carries a SacII restriction endonuclease site (CCGCGG), and its 3' end carries a stop codon (TAA) followed by a HindIII recognition site (AAGCTT):
CCGCGGCTCTGGAAGTGCTGTTTCAAGGTCCGGAGACCCTGTGCGGTGCGGAACTGGTGGACGCGCTGCAATTTGTTTGCGGTGATCGTGGCTTCTACTTTAACAAGCCGACCGGTTATGGTAGCAGCAGCCGTCGTGCGCCGCAGACCGGTATCGTTGACGAGTGCTGCTTCCGTAGCTGCGATCTGCGTCGTCTGGAAATGTATTGCGCGCCGCTGAAGCCGGCGAAAAGCGCGTAAAAGCTT
The above sequences were obtained by chemical synthesis.
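The design features described for SEQ ID No. 4 (5' MscI site, consecutive stop codons and 3' EcoRV site) can be checked programmatically, and the insert can be translated to confirm that it encodes the His-tag, the thrombin site and IGF-I. The sketch below is illustrative only: the reading-frame offset of 1 is an assumption chosen because it yields the expected His-tag and LVPR ↓ GP motif, and `translate` is a hypothetical helper built on the standard genetic code.

```python
# Standard genetic code, built from the conventional TCAG ordering
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna, offset=0):
    """Translate from `offset` up to (not including) the first stop codon."""
    protein = []
    for pos in range(offset, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[pos:pos + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# SEQ ID No. 4, split into 60-nucleotide lines as in the sequence listing
SEQ4 = (
    "TGGCCATATGCACCATCATCATCATCATTCTTCTGGTCTGGTGCCACGCGGTCCGGAGAC"
    "CCTGTGCGGTGCGGAACTGGTGGACGCGCTGCAATTTGTTTGCGGTGATCGTGGCTTCTA"
    "CTTTAACAAGCCGACCGGTTATGGTAGCAGCAGCCGTCGTGCGCCGCAGACCGGTATCGT"
    "TGACGAGTGCTGCTTCCGTAGCTGCGATCTGCGTCGTCTGGAAATGTATTGCGCGCCGCT"
    "GAAGCCGGCGAAAAGCGCGTAATGAGATATC"
)

has_5p_site = SEQ4.startswith("TGGCCA")             # MscI site
has_3p_tail = SEQ4.endswith("TAATGA" + "GATATC")    # two stop codons + EcoRV site
protein = translate(SEQ4, offset=1)                 # frame offset is an assumption
print(has_5p_site, has_3p_tail, len(protein))
print(protein)
```

The same checks apply to SEQ ID No. 5 and SEQ ID No. 10 by substituting the corresponding restriction sites and stop codons.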
5.2 Construction of the recombinant expression vector and the fusion-expression engineered strain for the thrombin fusion protein
The map of the pET-32a(+) expression vector and the sequence of its multiple cloning site are shown in FIGS. 4 and 5. The synthetic thrombin fusion protein cDNA sequence (SEQ ID No. 4) was double-digested with MscI and EcoRV and ligated with T4 DNA ligase into the pET-32a(+) expression vector digested with the same two enzymes, giving a recombinant expression vector designated pET-32a-Thrombin-IGF-I. The vector was transformed into the Escherichia coli cloning strain Top10, and recombinants were cultured and screened at 37 °C on LB medium containing 50 μg/ml ampicillin. After the recombinants had been verified by PCR and restriction digestion, their sequence was confirmed by sequencing.
Plasmid was extracted from the correctly identified recombinants and transformed into the Escherichia coli expression strain Origami B (DE3), and recombinants were cultured and screened at 37 °C on LB medium containing 50 μg/ml ampicillin (Amp), 15 μg/ml kanamycin (Kan) and 12.5 μg/ml tetracycline (Tet). The resulting recombinant is an engineered strain expressing IGF-I as a Trx fusion and was named Thrombin-IGF. It was stored in 15% glycerol in a -80 °C freezer. The fusion protein expressed by the Thrombin-IGF engineered strain contains a thioredoxin (Trx) tag, a His-tag purification tag, a thrombin cleavage site and the mature peptide IGF-I sequence, and has a theoretical molecular weight of about 21.6 kDa.
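The theoretical molecular weight can be estimated from the amino acid sequence by summing average residue masses and adding one water molecule. The sketch below uses standard average residue masses and the sequence of SEQ ID No. 1; how the inventors obtained their figure is not stated, so this is only one plausible way to arrive at a value of about 21.6 kDa. The function name `molecular_weight` is hypothetical.

```python
# Average masses (Da) of amino acid residues within a peptide chain
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.02  # one water per chain (free N- and C-termini)

def molecular_weight(seq):
    """Average molecular weight in Da of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER

# SEQ ID No. 1 (thrombin fusion protein, 199 residues)
SEQ1 = (
    "MSDKIIHLTDDSFDTDVLKADGAILVDFWAEWCGPCKMIAPILDEIADEYQGKLTVAKLNIDQNPGTAPKYGIR"
    "GIPTLLLFKNGEVAATKVGALSKGQLKEFLDANLAGSGSGHMHHHHHHSSGLVPR"
    "GPETLCGAELVDALQFVCGDRGFYFNKPTGYGSSSRRAPQTGIVDECCFRSCDLRRLEMYCAPLKPAKSA"
)

mw = molecular_weight(SEQ1)
print(f"{len(SEQ1)} residues, {mw / 1000:.1f} kDa")
```

Disulfide bonds and other modifications are ignored, which changes the result by only a few daltons.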
The frozen Thrombin-IGF engineered strain from a glycerol tube was inoculated at 0.2% into a test tube containing fresh LB selective medium (50 μg/ml Amp, 15 μg/ml Kan and 12.5 μg/ml Tet) and shake-cultured overnight to activate the strain. The culture was transferred into 300 mL LB medium and shake-cultured at 37 °C for about 3 h; when the OD600 reached 0.6-0.8, isopropyl-β-D-thiogalactoside (IPTG) was added to a final concentration of 0.05 mM, expression was induced at 25 °C for 24 h, and the cells were collected by centrifugation. The wet cell weight was measured, lysis buffer (50 mM Tris-HCl, 0.5 M NaCl, pH 8.0) was added at a cell-to-buffer weight ratio of 1:15, the cells were disrupted by sonication, and the samples were analysed by SDS-PAGE. The results are shown in FIG. 6, where M is the protein marker; 1 is the whole-cell culture before induction; 2 is the whole-cell culture after induction; and 3 is the lysate supernatant after induction. The lysate supernatant contains an expressed fusion protein with an apparent molecular weight of about 26 kDa, close to the theoretical molecular weight.
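The induction and lysis quantities follow from simple dilution arithmetic. The sketch below is illustrative: the 100 mM IPTG stock concentration and the 2 g wet cell weight are hypothetical values not given in the text, and the small volume added by the stock is ignored.

```python
def stock_volume_ul(final_mM, culture_ml, stock_mM):
    """Volume of stock solution (in microlitres) from C1 * V1 = C2 * V2."""
    return final_mM * culture_ml / stock_mM * 1000.0

def lysis_buffer_g(wet_weight_g, ratio=15):
    """Mass of lysis buffer for a cell-to-buffer weight ratio of 1:ratio."""
    return wet_weight_g * ratio

# 0.05 mM IPTG in a 300 mL culture, from a hypothetical 100 mM stock
iptg_ul = stock_volume_ul(0.05, 300, 100)
# Lysis buffer for a hypothetical 2 g wet cell pellet at 1:15
buffer_g = lysis_buffer_g(2.0)
print(iptg_ul, buffer_g)
```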
The fusion protein was identified by Western blot with an IGF-I monoclonal antibody (Abcam, cat. no. ab9572; the same below). The result is shown in FIG. 7, where M is the protein marker; 1 is the whole-cell culture before induction; 2 is the whole-cell culture after induction; and 3 is the lysate supernatant after induction. The fusion protein in the lysate supernatant shows IGF-I immunoreactivity, and the figure also shows that the fusion protein forms a small amount of dimer.
5.3 Construction of the recombinant expression vector and the fusion-expression engineered strain for the HRV 3C enzyme fusion protein
The map of the pET-48b(+) expression vector and the sequence of its multiple cloning site are shown in FIGS. 8 and 9. The synthetic HRV 3C enzyme fusion protein cDNA sequence (SEQ ID No. 5) was double-digested with SacII (recognition site CCGCGG) and HindIII and ligated with T4 DNA ligase into the pET-48b(+) expression vector digested with the same two enzymes, giving a recombinant expression vector designated pET-48b-HRV 3C-IGF-I. The vector was transformed into the Escherichia coli cloning strain Top10, and recombinants were cultured and screened at 37 °C on LB medium containing 50 μg/ml kanamycin. After the recombinants had been verified by PCR and restriction digestion, their sequence was confirmed by sequencing.
Plasmid was extracted from the correctly identified recombinants and transformed into the Escherichia coli expression strain Rosetta-gami2 (DE3), and recombinants were cultured and screened at 37 °C on LB medium containing 50 μg/ml Kan, 34 μg/ml chloramphenicol (Chl), 50 μg/ml streptomycin (Str) and 12.5 μg/ml Tet. The resulting recombinant is an engineered strain expressing IGF-I as a Trx fusion and was named HRV 3C-IGF. It was stored in 15% glycerol in a -80 °C freezer. The fusion protein expressed by the HRV 3C-IGF engineered strain contains a thioredoxin (Trx) tag, a His-tag purification tag, an HRV 3C enzyme cleavage site and the mature peptide IGF-I sequence, and has a theoretical molecular weight of about 23.5 kDa.
The frozen HRV 3C-IGF engineered strain from a glycerol tube was inoculated at 0.2% into a test tube containing fresh LB selective medium (50 μg/ml Kan, 50 μg/ml Str, 34 μg/ml Chl and 12.5 μg/ml Tet) and shake-cultured overnight to activate the strain. The culture was transferred into 300 mL LB medium and shake-cultured at 37 °C for about 3 h; when the OD600 reached 0.6-0.8, isopropyl-β-D-thiogalactoside (IPTG) was added to a final concentration of 0.5 mM, expression was induced at 25 °C for 25 h, and the cells were collected by centrifugation. The wet cell weight was measured, lysis buffer (50 mM Tris-HCl, 0.5 M NaCl, pH 8.0) was added at a cell-to-buffer weight ratio of 1:15, and the cells were disrupted by sonication. The whole-cell culture and the lysate supernatant were then analysed by Western blot with the IGF-I monoclonal antibody. The result is shown in FIG. 10, where M is the protein marker; 1 is the whole-cell culture after induction; and 2 is the lysate supernatant after induction. The fusion protein in the lysate supernatant shows IGF-I immunoreactivity and has an apparent size of about 26 kDa, close to the theoretical molecular weight.
5.4 Purification of the fusion proteins
The thrombin fusion protein and the HRV 3C enzyme fusion protein were expressed with the Thrombin-IGF and HRV 3C-IGF engineered strains constructed in Examples 5.2 and 5.3. The fermentation broth was centrifuged to collect the cells, the cells were disrupted by sonication, and the supernatant was collected. The supernatant was subjected to affinity chromatography on a Ni-Sepharose 6FF column equilibrated with 20 mM Tris-HCl (pH 7.8), 0.5 M NaCl, and the target protein was eluted with a 50 mM → 500 mM imidazole gradient and collected. To reduce the effect of the high salt concentration on protease activity, excess salt was removed with a G-15 desalting column.
5.5 Enzymatic cleavage of the fusion proteins
The IGF-I mature peptide is about 10 kDa in size; if cleavage is correct, a band with IGF-I immunoreactivity is detectable at around 10 kDa.
In the thrombin cleavage system, 100 μg of fusion protein was cleaved with 1 U of thrombin (Solarbio, cat. no. T8021). The final concentrations in the cleavage buffer were 20 mM Tris-HCl (pH 8.0), 0.15 M NaCl and 0, 2.5, 10 or 20 mM CaCl2, and the mixture was digested on a shaker at 37 °C for 24 h. The cleavage result is shown in FIG. 11, where M is the protein marker, 1 is without CaCl2, 2 is 2.5 mM CaCl2, 3 is 10 mM CaCl2 and 4 is 20 mM CaCl2. As shown in the figure, cleavage was best at a final CaCl2 concentration of 10 mM, where thrombin cleaved the fusion protein almost completely; raising the CaCl2 concentration further impaired enzyme activity and reduced cleavage.
In the HRV 3C enzyme cleavage system, 100 μg of fusion protein was cleaved with 1 U or 10 U of HRV 3C enzyme (Takara, cat. no. 7360) in a cleavage buffer of 50 mM Tris-HCl (pH 7.5) and 0.15 M NaCl at 4 °C for 24 h. The Western blot of the cleavage products with the IGF-I monoclonal antibody is shown in FIG. 12, where M is the protein marker, 1 is with 1 U of enzyme and 2 is with 10 U of enzyme. As can be seen, 1 U of enzyme cleaves the fusion protein to give a small amount of mature IGF-I peptide, whereas 10 U of enzyme cleaves it more effectively and gives a large amount of mature IGF-I peptide.
Comparative Example 1 Preparation and enzymatic cleavage of the enterokinase fusion protein
1.1 Obtaining the fusion protein gene sequence
The enterokinase fusion protein gene sequence is shown in SEQ ID No. 10; its 5' end carries a KpnI restriction endonuclease site (GGTACC), and its 3' end carries two consecutive stop codons (TAATGA) followed by a HindIII recognition site (AAGCTT):
GGTACCGACGACGACGACAAGGGTCCGGAGACCCTGTGCGGTGCGGAACTGGTGGACGCGCTGCAATTTGTTTGCGGTGATCGTGGCTTCTACTTTAACAAGCCGACCGGTTATGGTAGCAGCAGCCGTCGTGCGCCGCAGACCGGTATCGTTGACGAGTGCTGCTTCCGTAGCTGCGATCTGCGTCGTCTGGAAATGTATTGCGCGCCGCTGAAGCCGGCGAAAAGCGCGTAATGAAAGCTT
1.2 Construction of the recombinant expression vector and the fusion-expression engineered strain for the enterokinase fusion protein
The synthetic enterokinase fusion protein cDNA sequence (SEQ ID No. 10) was double-digested with KpnI and HindIII and ligated with T4 DNA ligase into the pET-32a(+) expression vector digested with the same two enzymes, giving a recombinant expression vector designated pET-32a-Enterokinase-IGF-I. The vector was transformed into the Escherichia coli cloning strain Top10, and recombinants were cultured and screened at 37 °C on LB medium containing 50 μg/ml ampicillin. After the recombinants had been verified by PCR and restriction digestion, their sequence was confirmed by sequencing.
Plasmid was extracted from the correctly identified recombinants and transformed into competent cells of the Escherichia coli expression strain Origami B (DE3), and recombinants were cultured and screened at 37 °C on LB medium containing 50 μg/ml ampicillin (Amp), 15 μg/ml kanamycin (Kan) and 12.5 μg/ml tetracycline (Tet). The resulting recombinant is an engineered strain expressing IGF-I as a Trx fusion and was named Ek-IGF. It was stored in 15% glycerol in a -80 °C freezer. The fusion protein expressed by the Ek-IGF engineered strain should contain a thioredoxin (Trx) tag, a His-tag purification tag, an enterokinase cleavage site and the mature peptide IGF-I sequence, and has a theoretical molecular weight of about 24.7 kDa.
The frozen Ek-IGF engineered strain from a glycerol tube was inoculated at 0.2% into a test tube containing fresh LB selective medium (50 μg/ml Amp, 15 μg/ml Kan and 12.5 μg/ml Tet) and shake-cultured overnight to activate the strain. The culture was transferred into 300 mL LB medium and shake-cultured at 37 °C for about 3 h; when the OD600 reached 0.6-0.8, IPTG was added to a final concentration of 0.5 mM, expression was induced at 25 °C for 24 h, and the cells were collected by centrifugation. The wet cell weight was measured, lysis buffer (50 mM Tris-HCl, 0.5 M NaCl, pH 8.0) was added at a cell-to-buffer weight ratio of 1:15, and the cells were disrupted by sonication. The lysate supernatant and pellet were then analysed by Western blot with the IGF-I monoclonal antibody. The result is shown in FIG. 13, where M is the protein marker; 1 is the lysate supernatant; and 2 is the lysate pellet. The fusion protein in the lysate supernatant shows IGF-I immunoreactivity and has an apparent size between 26 and 34 kDa.
1.3 Purification of the enterokinase fusion protein
The enterokinase fusion protein was expressed with the Ek-IGF engineered strain constructed in Comparative Example 1.2. The fermentation broth was centrifuged to collect the cells, the cells were disrupted by sonication, and the supernatant was collected. The supernatant was subjected to affinity chromatography on a Ni-Sepharose 6FF column equilibrated with 20 mM Tris-HCl (pH 7.8), 0.5 M NaCl, and the target protein was eluted with a 50 mM → 500 mM imidazole gradient and collected. To reduce the effect of the high salt concentration on protease activity, excess salt was removed with a G-15 desalting column.
1.4 Enzymatic cleavage of the enterokinase fusion protein
In the enterokinase cleavage system, 1 mg of fusion protein was cleaved with different amounts of recombinant bovine enterokinase (Yaohai biological company, cat. no. ez00001). The final concentrations in the cleavage buffer were 20 mM Tris-HCl (pH 8.0), 50 mM NaCl and 2 mM CaCl2, and the mixture was digested at 16 °C for 24 h. The SDS-PAGE result of the cleavage products is shown in FIG. 14, where M is the protein marker, 1 is 1 IU of enzyme, 2 is 5 IU of enzyme, 3 is 10 IU of enzyme, 4 is 50 IU of enzyme, 5 is 100 IU of enzyme, 6 is the positive control fusion protein cleaved with enterokinase, and 7 is the positive control fusion protein without enzyme. As can be seen, enterokinase cleaves the positive control protein but does not cleave the enterokinase fusion protein.
The embodiments described above are merely preferred embodiments given to illustrate the present invention fully, and the scope of protection of the invention is not limited to them. Equivalent substitutions or changes made by those skilled in the art on the basis of the invention all fall within its scope of protection, which is defined by the claims.
Sequence listing
<110> Wuhan Haite biopharmaceuticals GmbH
<120> method for screening enzyme digestion adaptive fusion protein and IGF-I preparation method
<130> WH2008252-1
<160> 10
<170> SIPOSequenceListing 1.0
<210> 1
<211> 199
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 1
Met Ser Asp Lys Ile Ile His Leu Thr Asp Asp Ser Phe Asp Thr Asp
1 5 10 15
Val Leu Lys Ala Asp Gly Ala Ile Leu Val Asp Phe Trp Ala Glu Trp
20 25 30
Cys Gly Pro Cys Lys Met Ile Ala Pro Ile Leu Asp Glu Ile Ala Asp
35 40 45
Glu Tyr Gln Gly Lys Leu Thr Val Ala Lys Leu Asn Ile Asp Gln Asn
50 55 60
Pro Gly Thr Ala Pro Lys Tyr Gly Ile Arg Gly Ile Pro Thr Leu Leu
65 70 75 80
Leu Phe Lys Asn Gly Glu Val Ala Ala Thr Lys Val Gly Ala Leu Ser
85 90 95
Lys Gly Gln Leu Lys Glu Phe Leu Asp Ala Asn Leu Ala Gly Ser Gly
100 105 110
Ser Gly His Met His His His His His His Ser Ser Gly Leu Val Pro
115 120 125
Arg Gly Pro Glu Thr Leu Cys Gly Ala Glu Leu Val Asp Ala Leu Gln
130 135 140
Phe Val Cys Gly Asp Arg Gly Phe Tyr Phe Asn Lys Pro Thr Gly Tyr
145 150 155 160
Gly Ser Ser Ser Arg Arg Ala Pro Gln Thr Gly Ile Val Asp Glu Cys
165 170 175
Cys Phe Arg Ser Cys Asp Leu Arg Arg Leu Glu Met Tyr Cys Ala Pro
180 185 190
Leu Lys Pro Ala Lys Ser Ala
195
<210> 2
<211> 228
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 2
Met Ser Asp Lys Ile Ile His Leu Thr Asp Asp Ser Phe Asp Thr Asp
1 5 10 15
Val Leu Lys Ala Asp Gly Ala Ile Leu Val Asp Phe Trp Ala Glu Trp
20 25 30
Cys Gly Pro Cys Lys Met Ile Ala Pro Ile Leu Asp Glu Ile Ala Asp
35 40 45
Glu Tyr Gln Gly Lys Leu Thr Val Ala Lys Leu Asn Ile Asp Gln Asn
50 55 60
Pro Gly Thr Ala Pro Lys Tyr Gly Ile Arg Gly Ile Pro Thr Leu Leu
65 70 75 80
Leu Phe Lys Asn Gly Glu Val Ala Ala Thr Lys Val Gly Ala Leu Ser
85 90 95
Lys Gly Gln Leu Lys Glu Phe Leu Asp Ala Asn Leu Ala Gly Ser Gly
100 105 110
Ser Gly His Met His His His His His His Ser Ser Gly Leu Val Pro
115 120 125
Arg Gly Ser Gly Met Lys Glu Thr Ala Ala Ala Lys Phe Glu Arg Gln
130 135 140
His Met Asp Ser Pro Asp Leu Gly Thr Asp Asp Asp Asp Lys Gly Pro
145 150 155 160
Glu Thr Leu Cys Gly Ala Glu Leu Val Asp Ala Leu Gln Phe Val Cys
165 170 175
Gly Asp Arg Gly Phe Tyr Phe Asn Lys Pro Thr Gly Tyr Gly Ser Ser
180 185 190
Ser Arg Arg Ala Pro Gln Thr Gly Ile Val Asp Glu Cys Cys Phe Arg
195 200 205
Ser Cys Asp Leu Arg Arg Leu Glu Met Tyr Cys Ala Pro Leu Lys Pro
210 215 220
Ala Lys Ser Ala
225
<210> 3
<211> 221
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 3
Met Ser Asp Lys Ile Ile His Leu Thr Asp Asp Ser Phe Asp Thr Asp
1 5 10 15
Val Leu Lys Ala Asp Gly Ala Ile Leu Val Asp Phe Trp Ala Glu Trp
20 25 30
Cys Gly Pro Cys Lys Met Ile Ala Pro Ile Leu Asp Glu Ile Ala Asp
35 40 45
Glu Tyr Gln Gly Lys Leu Thr Val Ala Lys Leu Asn Ile Asp Gln Asn
50 55 60
Pro Gly Thr Ala Pro Lys Tyr Gly Ile Arg Gly Ile Pro Thr Leu Leu
65 70 75 80
Leu Phe Lys Asn Gly Glu Val Ala Ala Thr Lys Val Gly Ala Leu Ser
85 90 95
Lys Gly Gln Leu Lys Glu Phe Leu Asp Ala Asn Leu Ala Gly Ser Gly
100 105 110
Ser Gly His Thr Ser Gly Gly Gly Gly Ser Asn Asn Asn Pro Pro Thr
115 120 125
Pro Thr Pro Ser Ser Gly Ser Gly His His His His His His Ser Ala
130 135 140
Ala Leu Glu Val Leu Phe Gln Gly Pro Glu Thr Leu Cys Gly Ala Glu
145 150 155 160
Leu Val Asp Ala Leu Gln Phe Val Cys Gly Asp Arg Gly Phe Tyr Phe
165 170 175
Asn Lys Pro Thr Gly Tyr Gly Ser Ser Ser Arg Arg Ala Pro Gln Thr
180 185 190
Gly Ile Val Asp Glu Cys Cys Phe Arg Ser Cys Asp Leu Arg Arg Leu
195 200 205
Glu Met Tyr Cys Ala Pro Leu Lys Pro Ala Lys Ser Ala
210 215 220
<210> 4
<211> 271
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 4
tggccatatg caccatcatc atcatcattc ttctggtctg gtgccacgcg gtccggagac 60
cctgtgcggt gcggaactgg tggacgcgct gcaatttgtt tgcggtgatc gtggcttcta 120
ctttaacaag ccgaccggtt atggtagcag cagccgtcgt gcgccgcaga ccggtatcgt 180
tgacgagtgc tgcttccgta gctgcgatct gcgtcgtctg gaaatgtatt gcgcgccgct 240
gaagccggcg aaaagcgcgt aatgagatat c 271
<210> 5
<211> 245
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 5
ccgcggctct ggaagtgctg tttcaaggtc cggagaccct gtgcggtgcg gaactggtgg 60
acgcgctgca atttgtttgc ggtgatcgtg gcttctactt taacaagccg accggttatg 120
gtagcagcag ccgtcgtgcg ccgcagaccg gtatcgttga cgagtgctgc ttccgtagct 180
gcgatctgcg tcgtctggaa atgtattgcg cgccgctgaa gccggcgaaa agcgcgtaaa 240
agctt 245
<210> 6
<211> 70
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 6
Gly Pro Glu Thr Leu Cys Gly Ala Glu Leu Val Asp Ala Leu Gln Phe
1 5 10 15
Val Cys Gly Asp Arg Gly Phe Tyr Phe Asn Lys Pro Thr Gly Tyr Gly
20 25 30
Ser Ser Ser Arg Arg Ala Pro Gln Thr Gly Ile Val Asp Glu Cys Cys
35 40 45
Phe Arg Ser Cys Asp Leu Arg Arg Leu Glu Met Tyr Cys Ala Pro Leu
50 55 60
Lys Pro Ala Lys Ser Ala
65 70
<210> 7
<211> 295
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 7
Thr Phe Gly Ala Gly Glu Ala Asp Cys Gly Leu Arg Pro Leu Phe Glu
1 5 10 15
Lys Lys Gln Val Gln Asp Gln Thr Glu Lys Glu Leu Phe Glu Ser Tyr
20 25 30
Ile Glu Gly Arg Ile Val Glu Gly Gln Asp Ala Glu Val Gly Leu Ser
35 40 45
Pro Trp Gln Val Met Leu Phe Arg Lys Ser Pro Gln Glu Leu Leu Cys
50 55 60
Gly Ala Ser Leu Ile Ser Asp Arg Trp Val Leu Thr Ala Ala His Cys
65 70 75 80
Leu Leu Tyr Pro Pro Trp Asp Lys Asn Phe Thr Val Asp Asp Leu Leu
85 90 95
Val Arg Ile Gly Lys His Ser Arg Thr Arg Tyr Glu Arg Lys Val Glu
100 105 110
Lys Ile Ser Met Leu Asp Lys Ile Tyr Ile His Pro Arg Tyr Asn Trp
115 120 125
Lys Glu Asn Leu Asp Arg Asp Ile Ala Leu Leu Lys Leu Lys Arg Pro
130 135 140
Ile Glu Leu Ser Asp Tyr Ile His Pro Val Cys Leu Pro Asp Lys Gln
145 150 155 160
Thr Ala Ala Lys Leu Leu His Ala Gly Phe Lys Gly Arg Val Thr Gly
165 170 175
Trp Gly Asn Arg Arg Glu Thr Trp Thr Thr Ser Val Ala Glu Val Gln
180 185 190
Pro Ser Val Leu Gln Val Val Asn Leu Pro Leu Val Glu Arg Pro Val
195 200 205
Cys Lys Ala Ser Thr Arg Ile Arg Ile Thr Asp Asn Met Phe Cys Ala
210 215 220
Gly Tyr Lys Pro Gly Glu Gly Lys Arg Gly Asp Ala Cys Glu Gly Asp
225 230 235 240
Ser Gly Gly Pro Phe Val Met Lys Ser Pro Tyr Asn Asn Arg Trp Tyr
245 250 255
Gln Met Gly Ile Val Ser Trp Gly Glu Gly Cys Asp Arg Asp Gly Lys
260 265 270
Tyr Gly Phe Tyr Thr His Val Phe Arg Leu Lys Lys Trp Ile Gln Lys
275 280 285
Val Ile Asp Arg Leu Gly Ser
290 295
<210> 8
<211> 248
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 8
Cys Gly Lys Lys Leu Val Thr Gln Glu Val Ser Pro Lys Ile Val Gly
1 5 10 15
Gly Ser Asp Ser Arg Glu Gly Ala Trp Pro Trp Val Val Ala Leu Tyr
20 25 30
Phe Asp Asp Gln Gln Val Cys Gly Ala Ser Leu Val Ser Arg Asp Trp
35 40 45
Leu Val Ser Ala Ala His Cys Val Tyr Gly Arg Asn Met Glu Pro Ser
50 55 60
Lys Trp Lys Ala Val Leu Gly Leu His Met Ala Ser Asn Leu Thr Ser
65 70 75 80
Pro Gln Ile Glu Thr Arg Leu Ile Asp Gln Ile Val Ile Asn Pro His
85 90 95
Tyr Asn Lys Arg Arg Lys Asn Asn Asp Ile Ala Met Met His Leu Glu
100 105 110
Met Lys Val Asn Tyr Thr Asp Tyr Ile Gln Pro Ile Cys Leu Pro Glu
115 120 125
Glu Asn Gln Val Phe Pro Pro Gly Arg Ile Cys Ser Ile Ala Gly Trp
130 135 140
Gly Ala Leu Ile Tyr Gln Gly Ser Thr Ala Asp Val Leu Gln Glu Ala
145 150 155 160
Asp Val Pro Leu Leu Ser Asn Glu Lys Cys Gln Gln Gln Met Pro Glu
165 170 175
Tyr Asn Ile Thr Glu Asn Met Val Cys Ala Gly Tyr Glu Ala Gly Gly
180 185 190
Val Asp Ser Cys Gln Gly Asp Ser Gly Gly Pro Leu Met Cys Gln Glu
195 200 205
Asn Asn Arg Trp Leu Leu Ala Gly Val Thr Ser Phe Gly Tyr Gln Cys
210 215 220
Ala Leu Pro Asn Arg Pro Gly Val Tyr Ala Arg Val Pro Arg Phe Thr
225 230 235 240
Glu Trp Ile Gln Ser Phe Leu His
245
<210> 9
<211> 182
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 9
Gly Pro Asn Thr Glu Phe Ala Leu Ser Leu Leu Arg Lys Asn Ile Met
1 5 10 15
Thr Ile Thr Thr Ser Lys Gly Glu Phe Thr Gly Leu Gly Ile His Asp
20 25 30
Arg Val Cys Val Ile Pro Thr His Ala Gln Pro Gly Asp Asp Val Leu
35 40 45
Val Asn Gly Gln Lys Ile Arg Val Lys Asp Lys Tyr Lys Leu Val Asp
50 55 60
Pro Glu Asn Ile Asn Leu Glu Leu Thr Val Leu Thr Leu Asp Arg Asn
65 70 75 80
Glu Lys Phe Arg Asp Ile Arg Gly Phe Ile Ser Glu Asp Leu Glu Gly
85 90 95
Val Asp Ala Thr Leu Val Val His Ser Asn Asn Phe Thr Asn Thr Ile
100 105 110
Leu Glu Val Gly Pro Val Thr Met Ala Gly Leu Ile Asn Leu Ser Ser
115 120 125
Thr Pro Thr Asn Arg Met Ile Arg Tyr Asp Tyr Ala Thr Lys Thr Gly
130 135 140
Gln Cys Gly Gly Val Leu Cys Ala Thr Gly Lys Ile Phe Gly Ile His
145 150 155 160
Val Gly Gly Asn Gly Arg Gln Gly Phe Ser Ala Gln Leu Lys Lys Gln
165 170 175
Tyr Phe Val Glu Lys Gln
180
<210> 10
<211> 243
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 10
ggtaccgacg acgacgacaa gggtccggag accctgtgcg gtgcggaact ggtggacgcg 60
ctgcaatttg tttgcggtga tcgtggcttc tactttaaca agccgaccgg ttatggtagc 120
agcagccgtc gtgcgccgca gaccggtatc gttgacgagt gctgcttccg tagctgcgat 180
ctgcgtcgtc tggaaatgta ttgcgcgccg ctgaagccgg cgaaaagcgc gtaatgaaag 240
ctt 243
Claims (10)
1. A method for screening enzyme digestion adaptive fusion proteins, comprising the steps of:
(1) inserting the gene encoding the target protein sequence into each of a plurality of different expression vectors containing protease cleavage sites to obtain the amino acid sequences of a plurality of different fusion proteins; predicting the secondary structure of each fusion protein amino acid sequence with the secondary structure prediction algorithm PSIPRED;
(2) using the amino acid sequences of the different fusion proteins of step (1) and their corresponding secondary structure prediction results as input files, predicting the three-dimensional structures with the structure prediction algorithm I-TASSER;
(3) performing molecular docking between each three-dimensional structure prediction result obtained in step (2) and the protease corresponding to the fusion protein with the ClusPro algorithm, selecting the cleavage systems in which the proportion of correct docking results exceeds a threshold for experimental verification, and screening out the fusion protein best adapted to its protease.
2. The method of claim 1, further comprising the steps of:
(4) performing codon optimization on the gene encoding the best-adapted fusion protein obtained in step (3).
3. The method of claim 1, wherein the target protein is human insulin-like growth factor-I.
4. The method according to claim 3, wherein the protease is enterokinase, thrombin or HRV 3C protease, and when the different proteases are docked with the three-dimensional structure prediction results of their corresponding fusion proteins, the docking sites are selected as follows:
for the fusion protein corresponding to enterokinase, every lysine (Lys) in the amino acid sequence and the 4 amino acids on either side of it are selected as candidate cleavage sites;
for the fusion protein corresponding to thrombin, arginine (Arg) and the 4 amino acids on either side of it are selected as candidate cleavage sites;
for the fusion protein corresponding to HRV 3C protease, glutamine (Gln) and the 4 amino acids on either side of it are selected as candidate cleavage sites.
5. The method of claim 3, wherein the threshold value in step (3) is 50%.
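The threshold rule of claims 1 and 5 amounts to a simple filter, sketched below. The claims do not say how a docking result is judged correct or how the results are counted, so the counts, the system names and the function `passes_threshold` are hypothetical; "exceeding" is read here as strictly greater than 50%.

```python
def passes_threshold(correct_poses, total_poses, threshold=0.5):
    """True when the fraction of correct docking results exceeds the threshold."""
    if total_poses == 0:
        return False  # nothing docked, so nothing to verify
    return correct_poses / total_poses > threshold


# Hypothetical counts for illustration only; these are not results from the patent.
docking_counts = {
    "system A": (7, 10),
    "system B": (5, 10),
    "system C": (2, 10),
}

selected = [
    name
    for name, (correct, total) in docking_counts.items()
    if passes_threshold(correct, total)
]
print(selected)
```

Under this reading a system with exactly half of its docking results correct is not carried forward to experimental verification.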
6. A method for preparing recombinant human insulin-like growth factor-I, characterized in that a fusion protein having the amino acid sequence shown in SEQ ID NO.1 is expressed in a prokaryotic host using the pET-32a (+) vector, and the expressed fusion protein is digested with thrombin.
7. The method according to claim 6, wherein the gene encoding the fusion protein of SEQ ID No.1 is:
(1) a gene sequence shown as SEQ ID NO. 4; or
(2) a gene sequence having 90 to 100 percent homology with the gene sequence shown in SEQ ID NO.4 and encoding a protein with the same function; or
(3) a gene sequence derived from the gene sequence of (1), shown in SEQ ID NO.4, by adding, deleting or replacing one or more codons, and encoding a protein with the same activity.
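The claim does not state how the 90 to 100 percent homology is measured. One common reading is percent identity over a pairwise alignment, and the sketch below computes that for two sequences that have already been aligned (gaps written as "-"). The function name `percent_identity` and the example sequences are hypothetical; the alignment step itself is not shown.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns in which both sequences carry the same base.

    Both inputs must come from the same pairwise alignment, so they have
    equal length. Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Hypothetical 10-base fragments differing at one position.
print(percent_identity("ATGGCTAAAG", "ATGGCAAAAG"))
```

Other definitions (for example, dividing by the length of the shorter sequence) give different values, so the chosen denominator should be stated whenever such a figure is reported.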
8. A method for preparing recombinant human insulin-like growth factor-I, characterized in that a fusion protein having the amino acid sequence shown in SEQ ID NO.3 is expressed in a prokaryotic host using the pET-48b (+) vector, and the expressed fusion protein is digested with HRV 3C protease.
9. The method according to claim 8, wherein the coding gene corresponding to the fusion protein of SEQ ID No.3 is:
(1) a gene sequence shown as SEQ ID NO. 5; or
(2) a gene sequence having 90 to 100 percent homology with the gene sequence shown in SEQ ID NO.5 and encoding a protein with the same function; or
(3) a gene sequence derived from the gene sequence of (1), shown in SEQ ID NO.5, by adding, deleting or replacing one or more codons, and encoding a protein with the same activity.
10. The method according to any one of claims 6 to 9, wherein the prokaryotic host is E. coli strain BL21(DE3), Rosetta-gami B (DE3), Origami B (DE3) or Rosetta-gami2(DE3).
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010952359.4A CN112111504A (en) | 2020-09-11 | 2020-09-11 | Method for screening enzyme digestion adaptive fusion protein and IGF-I preparation method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010952359.4A CN112111504A (en) | 2020-09-11 | 2020-09-11 | Method for screening enzyme digestion adaptive fusion protein and IGF-I preparation method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112111504A (en) | 2020-12-22 |
Family
ID=73801900
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010952359.4A Pending CN112111504A (en) | 2020-09-11 | 2020-09-11 | Method for screening enzyme digestion adaptive fusion protein and IGF-I preparation method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112111504A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115340996A (en) * | 2022-08-30 | 2022-11-15 | 态创生物科技(广州)有限公司 | Co-expression method of multi-subunit protein by using specific enzyme cutting site |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1769456A (en) * | 2005-05-20 | 2006-05-10 | 成都西玛生物科技有限公司 | Recombinant a human peptide production method |
Non-Patent Citations (2)
Title |
---|
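Ning Junkai (宁俊凯): "Progress in prokaryotic preparation processes for insulin-like growth factor-1 (IGF-1)", Strait Pharmaceutical Journal (《海峡药学》) * |
Yi Huawei (易华伟) et al.: "Research progress in predicting protein stability based on amino acid sequence and simulated structure", Biotechnology Bulletin (《生物技术通报》) * |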
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Cao et al. | Conjugal type IV macromolecular transfer systems of Gram-negative bacteria: organismal distribution, structural constraints and evolutionary conclusions | |
Li et al. | Molecular characterization of an ice nucleation protein variant (inaQ) from Pseudomonas syringae and the analysis of its transmembrane transport activity in Escherichia coli | |
Golden et al. | Ribosomal protein L6: structural evidence of gene duplication from a primitive RNA binding protein. | |
RU2007124731A (en) | GRAM POSITIVE BACTERIA PRODUCING RECOMBINANT PROTEINS | |
Unzueta et al. | Strategies for the production of difficult-to-express full-length eukaryotic proteins using microbial cell factories: production of human alpha-galactosidase A | |
CN110724187B (en) | Recombinant engineering bacterium for efficiently expressing liraglutide precursor and application thereof | |
Liu et al. | Fusion expression of pedA gene to obtain biologically active pediocin PA-1 in Escherichia coli | |
CN115785237B (en) | Recombinant botulinum toxin and preparation method thereof | |
JP2020529221A5 (en) | ||
CN112980865A (en) | Construction method of recombinant human-like collagen engineering bacteria | |
CN110835366B (en) | Tag polypeptide for promoting soluble expression of protein and application thereof | |
Shi et al. | Expression, purification and renaturation of truncated human integrin β1 from inclusion bodies of Escherichia coli | |
CN112111504A (en) | Method for screening enzyme digestion adaptive fusion protein and IGF-I preparation method | |
KR20100086717A (en) | Method for the secretory production of heterologous protein in escherichia coli | |
CN109055339B (en) | TEV protease mutant, gene, biological material, preparation method, reagent or kit and application | |
Rahman et al. | Topology-informed strategies for the overexpression and purification of membrane proteins | |
Aaltonen et al. | Transmembrane topology of the Acr3 family arsenite transporter from Bacillus subtilis | |
Durrani et al. | Expression and rapid purification of recombinant biologically active ovine growth hormone with DsbA targeting to Escherichia coli inner membrane | |
CN102676533A (en) | Recombinant human cystatin C coding gene and expression method | |
JP7016552B2 (en) | How to increase the secretion of recombinant proteins | |
MA | High expression level of human epidermal growth factor (hEGF) using a well-designed fusion protein-tagged construct in E. coli. | |
CN110540601B (en) | Recombinant PLB-hEGF fusion protein and application thereof | |
CN109880840B (en) | In vivo biotinylation labeling system for recombinant protein escherichia coli | |
CN109161557B (en) | Application of radiation-resistant deinococcus gobi alkaline protease gene KerB | |
CN113493780A (en) | Method for preparing recombinant heparinase II by utilizing SUMO fusion expression system and SUMO _ heparinase II fusion protein prepared by same |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20201222 |