CN116802284A - Polypeptide having β-hexosaminidase activity and polynucleotide encoding the polypeptide - Google Patents
Polypeptide having β-hexosaminidase activity and polynucleotide encoding the polypeptide
- Publication number
- CN116802284A (CN202180091682.XA)
- Authority
- CN
- China
- Prior art keywords
- polypeptide
- polynucleotide
- acid sequence
- amino acid
- leu
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 108090000765 processed proteins & peptides Proteins 0.000 title claims abstract description 101
- 229920001184 polypeptide Polymers 0.000 title claims abstract description 98
- 102000004196 processed proteins & peptides Human genes 0.000 title claims abstract description 98
- 108091033319 polynucleotide Proteins 0.000 title claims abstract description 85
- 102000040430 polynucleotide Human genes 0.000 title claims abstract description 85
- 239000002157 polynucleotide Substances 0.000 title claims abstract description 85
- 102000007478 beta-N-Acetylhexosaminidases Human genes 0.000 title claims abstract description 77
- 108010085377 beta-N-Acetylhexosaminidases Proteins 0.000 title claims abstract description 77
- 230000000694 effects Effects 0.000 title claims abstract description 56
- 125000003275 alpha amino acid group Chemical group 0.000 claims abstract description 47
- 210000005253 yeast cell Anatomy 0.000 claims abstract description 41
- 238000004519 manufacturing process Methods 0.000 claims abstract description 16
- 238000012258 culturing Methods 0.000 claims abstract description 7
- 210000004027 cell Anatomy 0.000 claims description 62
- 238000000034 method Methods 0.000 claims description 43
- 239000013598 vector Substances 0.000 claims description 22
- 241000235070 Saccharomyces Species 0.000 claims description 14
- 108091028043 Nucleic acid sequence Proteins 0.000 claims description 10
- 239000013604 expression vector Substances 0.000 claims description 10
- 150000007523 nucleic acids Chemical group 0.000 claims description 10
- 108020004705 Codon Proteins 0.000 claims description 6
- 206010046914 Vaginal infection Diseases 0.000 claims 2
- 108090000623 proteins and genes Proteins 0.000 description 33
- 108020004414 DNA Proteins 0.000 description 23
- 230000014509 gene expression Effects 0.000 description 21
- 102000004169 proteins and genes Human genes 0.000 description 18
- 102000004190 Enzymes Human genes 0.000 description 17
- 108090000790 Enzymes Proteins 0.000 description 17
- 239000002299 complementary DNA Substances 0.000 description 16
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 15
- 241000196324 Embryophyta Species 0.000 description 12
- 241000220451 Canavalia Species 0.000 description 11
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 11
- 235000014680 Saccharomyces cerevisiae Nutrition 0.000 description 11
- 244000045232 Canavalia ensiformis Species 0.000 description 10
- 239000012634 fragment Substances 0.000 description 10
- 235000010520 Canavalia ensiformis Nutrition 0.000 description 9
- 239000000411 inducer Substances 0.000 description 9
- 240000003049 Canavalia gladiata Species 0.000 description 7
- 230000001419 dependent effect Effects 0.000 description 7
- 238000013518 transcription Methods 0.000 description 7
- 230000035897 transcription Effects 0.000 description 7
- 235000010518 Canavalia gladiata Nutrition 0.000 description 6
- 235000010469 Glycine max Nutrition 0.000 description 6
- 244000068988 Glycine max Species 0.000 description 6
- 150000001413 amino acids Chemical class 0.000 description 6
- 108020004999 messenger RNA Proteins 0.000 description 6
- 239000000463 material Substances 0.000 description 5
- 239000002609 medium Substances 0.000 description 5
- 238000012163 sequencing technique Methods 0.000 description 5
- 239000000758 substrate Substances 0.000 description 5
- 101710194180 Alcohol oxidase 1 Proteins 0.000 description 4
- 241000235058 Komagataella pastoris Species 0.000 description 4
- 240000000377 Tussilago farfara Species 0.000 description 4
- 235000004869 Tussilago farfara Nutrition 0.000 description 4
- 210000000349 chromosome Anatomy 0.000 description 4
- 230000000813 microbial effect Effects 0.000 description 4
- 230000008569 process Effects 0.000 description 4
- 239000000243 solution Substances 0.000 description 4
- 210000001519 tissue Anatomy 0.000 description 4
- 102100022548 Beta-hexosaminidase subunit alpha Human genes 0.000 description 3
- 241000235649 Kluyveromyces Species 0.000 description 3
- 101001018085 Lysobacter enzymogenes Lysyl endopeptidase Proteins 0.000 description 3
- 241000235017 Zygosaccharomyces Species 0.000 description 3
- 238000007792 addition Methods 0.000 description 3
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 3
- 125000000539 amino acid group Chemical group 0.000 description 3
- 210000004102 animal cell Anatomy 0.000 description 3
- 108010038633 aspartylglutamate Proteins 0.000 description 3
- 210000002257 embryonic structure Anatomy 0.000 description 3
- 238000000605 extraction Methods 0.000 description 3
- XKUKSGPZAADMRA-UHFFFAOYSA-N glycyl-glycyl-glycine Natural products NCC(=O)NCC(=O)NCC(O)=O XKUKSGPZAADMRA-UHFFFAOYSA-N 0.000 description 3
- 230000010354 integration Effects 0.000 description 3
- 108010057821 leucylproline Proteins 0.000 description 3
- 239000002773 nucleotide Substances 0.000 description 3
- 125000003729 nucleotide group Chemical group 0.000 description 3
- 108010051242 phenylalanylserine Proteins 0.000 description 3
- 238000012545 processing Methods 0.000 description 3
- 239000000047 product Substances 0.000 description 3
- 238000003259 recombinant expression Methods 0.000 description 3
- 230000008685 targeting Effects 0.000 description 3
- 238000001890 transfection Methods 0.000 description 3
- 238000012546 transfer Methods 0.000 description 3
- 230000009466 transformation Effects 0.000 description 3
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 2
- SDZRIBWEVVRDQI-CIUDSAMLSA-N Ala-Lys-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O SDZRIBWEVVRDQI-CIUDSAMLSA-N 0.000 description 2
- PMQXMXAASGFUDX-SRVKXCTJSA-N Ala-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CCCCN PMQXMXAASGFUDX-SRVKXCTJSA-N 0.000 description 2
- 241000219194 Arabidopsis Species 0.000 description 2
- OQCWXQJLCDPRHV-UWVGGRQHSA-N Arg-Gly-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O OQCWXQJLCDPRHV-UWVGGRQHSA-N 0.000 description 2
- NUCUBYIUPVYGPP-XIRDDKMYSA-N Asn-Leu-Trp Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)CC(N)=O)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(O)=O NUCUBYIUPVYGPP-XIRDDKMYSA-N 0.000 description 2
- BLQBMRNMBAYREH-UWJYBYFXSA-N Asp-Ala-Tyr Chemical compound N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O BLQBMRNMBAYREH-UWJYBYFXSA-N 0.000 description 2
- YDJVIBMKAMQPPP-LAEOZQHASA-N Asp-Glu-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O YDJVIBMKAMQPPP-LAEOZQHASA-N 0.000 description 2
- QPDUWAUSSWGJSB-NGZCFLSTSA-N Asp-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)O)N QPDUWAUSSWGJSB-NGZCFLSTSA-N 0.000 description 2
- IJGRMHOSHXDMSA-UHFFFAOYSA-N Atomic nitrogen Chemical compound N#N IJGRMHOSHXDMSA-UHFFFAOYSA-N 0.000 description 2
- 241000222120 Candida <Saccharomycetales> Species 0.000 description 2
- OKTJSMMVPCPJKN-UHFFFAOYSA-N Carbon Chemical compound [C] OKTJSMMVPCPJKN-UHFFFAOYSA-N 0.000 description 2
- 108020004635 Complementary DNA Proteins 0.000 description 2
- SDDJEOCJUFKAPV-BPUTZDHNSA-N Cys-Met-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CS)CCSC)C(O)=O)=CNC2=C1 SDDJEOCJUFKAPV-BPUTZDHNSA-N 0.000 description 2
- 102000004163 DNA-directed RNA polymerases Human genes 0.000 description 2
- 108090000626 DNA-directed RNA polymerases Proteins 0.000 description 2
- 241000588724 Escherichia coli Species 0.000 description 2
- SYWCGQOIIARSIX-SRVKXCTJSA-N Glu-Pro-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O SYWCGQOIIARSIX-SRVKXCTJSA-N 0.000 description 2
- CQGBSALYGOXQPE-HTUGSXCWSA-N Glu-Thr-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O CQGBSALYGOXQPE-HTUGSXCWSA-N 0.000 description 2
- JSNNHGHYGYMVCK-XVKPBYJWSA-N Gly-Glu-Val Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O JSNNHGHYGYMVCK-XVKPBYJWSA-N 0.000 description 2
- NSVOVKWEKGEOQB-LURJTMIESA-N Gly-Pro-Gly Chemical compound NCC(=O)N1CCC[C@H]1C(=O)NCC(O)=O NSVOVKWEKGEOQB-LURJTMIESA-N 0.000 description 2
- 102000005744 Glycoside Hydrolases Human genes 0.000 description 2
- 108010031186 Glycoside Hydrolases Proteins 0.000 description 2
- 102000002268 Hexosaminidases Human genes 0.000 description 2
- 108010000540 Hexosaminidases Proteins 0.000 description 2
- 102100034343 Integrase Human genes 0.000 description 2
- 241001138401 Kluyveromyces lactis Species 0.000 description 2
- 241000880493 Leptailurus serval Species 0.000 description 2
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 2
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 2
- 239000001888 Peptone Substances 0.000 description 2
- 108010080698 Peptones Proteins 0.000 description 2
- 244000046052 Phaseolus vulgaris Species 0.000 description 2
- 235000010627 Phaseolus vulgaris Nutrition 0.000 description 2
- 241000235648 Pichia Species 0.000 description 2
- HFNPOYOKIPGAEI-SRVKXCTJSA-N Pro-Leu-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]1CCCN1 HFNPOYOKIPGAEI-SRVKXCTJSA-N 0.000 description 2
- 108010092799 RNA-directed DNA polymerase Proteins 0.000 description 2
- 241000256251 Spodoptera frugiperda Species 0.000 description 2
- PYJKETPLFITNKS-IHRRRGAJSA-N Tyr-Pro-Asn Chemical compound N[C@@H](Cc1ccc(O)cc1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O PYJKETPLFITNKS-IHRRRGAJSA-N 0.000 description 2
- UKEVLVBHRKWECS-LSJOCFKGSA-N Val-Ile-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](C(C)C)N UKEVLVBHRKWECS-LSJOCFKGSA-N 0.000 description 2
- VBTFUDNTMCHPII-UHFFFAOYSA-N Val-Trp-Tyr Natural products C=1NC2=CC=CC=C2C=1CC(NC(=O)C(N)C(C)C)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 VBTFUDNTMCHPII-UHFFFAOYSA-N 0.000 description 2
- 238000004458 analytical method Methods 0.000 description 2
- 108010001271 arginyl-glutamyl-arginine Proteins 0.000 description 2
- 108010062796 arginyllysine Proteins 0.000 description 2
- 108010092854 aspartyllysine Proteins 0.000 description 2
- QVGXLLKOCUKJST-UHFFFAOYSA-N atomic oxygen Chemical compound [O] QVGXLLKOCUKJST-UHFFFAOYSA-N 0.000 description 2
- 230000001580 bacterial effect Effects 0.000 description 2
- 229910052799 carbon Inorganic materials 0.000 description 2
- 230000015556 catabolic process Effects 0.000 description 2
- 238000010367 cloning Methods 0.000 description 2
- 238000006731 degradation reaction Methods 0.000 description 2
- 238000012217 deletion Methods 0.000 description 2
- 230000037430 deletion Effects 0.000 description 2
- 230000029087 digestion Effects 0.000 description 2
- 238000004520 electroporation Methods 0.000 description 2
- 238000000855 fermentation Methods 0.000 description 2
- 230000004151 fermentation Effects 0.000 description 2
- 230000002538 fungal effect Effects 0.000 description 2
- 108010080575 glutamyl-aspartyl-alanine Proteins 0.000 description 2
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 2
- 108010000434 glycyl-alanyl-leucine Proteins 0.000 description 2
- 108010067216 glycyl-glycyl-glycine Proteins 0.000 description 2
- 239000001963 growth medium Substances 0.000 description 2
- 239000000710 homodimer Substances 0.000 description 2
- 210000005260 human cell Anatomy 0.000 description 2
- 230000007062 hydrolysis Effects 0.000 description 2
- 238000006460 hydrolysis reaction Methods 0.000 description 2
- 230000000977 initiatory effect Effects 0.000 description 2
- 238000002955 isolation Methods 0.000 description 2
- 108010044311 leucyl-glycyl-glycine Proteins 0.000 description 2
- 239000007788 liquid Substances 0.000 description 2
- 108010003700 lysyl aspartic acid Proteins 0.000 description 2
- 239000011159 matrix material Substances 0.000 description 2
- 238000001906 matrix-assisted laser desorption--ionisation mass spectrometry Methods 0.000 description 2
- 230000001404 mediated effect Effects 0.000 description 2
- 102000039446 nucleic acids Human genes 0.000 description 2
- 108020004707 nucleic acids Proteins 0.000 description 2
- 229910052760 oxygen Inorganic materials 0.000 description 2
- 239000001301 oxygen Substances 0.000 description 2
- 125000000636 p-nitrophenyl group Chemical group [H]C1=C([H])C(=C([H])C([H])=C1*)[N+]([O-])=O 0.000 description 2
- 235000019319 peptone Nutrition 0.000 description 2
- 108010024654 phenylalanyl-prolyl-alanine Proteins 0.000 description 2
- 239000013612 plasmid Substances 0.000 description 2
- 230000001105 regulatory effect Effects 0.000 description 2
- 238000011160 research Methods 0.000 description 2
- FGDZQCVHDSGLHJ-UHFFFAOYSA-M rubidium chloride Chemical compound [Cl-].[Rb+] FGDZQCVHDSGLHJ-UHFFFAOYSA-M 0.000 description 2
- 230000028327 secretion Effects 0.000 description 2
- 108010026333 seryl-proline Proteins 0.000 description 2
- 239000002689 soil Substances 0.000 description 2
- 239000006228 supernatant Substances 0.000 description 2
- 230000001131 transforming effect Effects 0.000 description 2
- 108010084932 tryptophyl-proline Proteins 0.000 description 2
- 229930195730 Aflatoxin Natural products 0.000 description 1
- 241000589158 Agrobacterium Species 0.000 description 1
- YLTKNGYYPIWKHZ-ACZMJKKPSA-N Ala-Ala-Glu Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(O)=O YLTKNGYYPIWKHZ-ACZMJKKPSA-N 0.000 description 1
- GORKKVHIBWAQHM-GCJQMDKQSA-N Ala-Asn-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GORKKVHIBWAQHM-GCJQMDKQSA-N 0.000 description 1
- BTYTYHBSJKQBQA-GCJQMDKQSA-N Ala-Asp-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C)N)O BTYTYHBSJKQBQA-GCJQMDKQSA-N 0.000 description 1
- ZODMADSIQZZBSQ-FXQIFTODSA-N Ala-Gln-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZODMADSIQZZBSQ-FXQIFTODSA-N 0.000 description 1
- NHLAEBFGWPXFGI-WHFBIAKZSA-N Ala-Gly-Asn Chemical compound C[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)N)C(=O)O)N NHLAEBFGWPXFGI-WHFBIAKZSA-N 0.000 description 1
- BEMGNWZECGIJOI-WDSKDSINSA-N Ala-Gly-Glu Chemical compound [H]N[C@@H](C)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O BEMGNWZECGIJOI-WDSKDSINSA-N 0.000 description 1
- NIZKGBJVCMRDKO-KWQFWETISA-N Ala-Gly-Tyr Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NIZKGBJVCMRDKO-KWQFWETISA-N 0.000 description 1
- YHKANGMVQWRMAP-DCAQKATOSA-N Ala-Leu-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YHKANGMVQWRMAP-DCAQKATOSA-N 0.000 description 1
- OYJCVIGKMXUVKB-GARJFASQSA-N Ala-Leu-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N OYJCVIGKMXUVKB-GARJFASQSA-N 0.000 description 1
- MEFILNJXAVSUTO-JXUBOQSCSA-N Ala-Leu-Thr Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MEFILNJXAVSUTO-JXUBOQSCSA-N 0.000 description 1
- SUHLZMHFRALVSY-YUMQZZPRSA-N Ala-Lys-Gly Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)NCC(O)=O SUHLZMHFRALVSY-YUMQZZPRSA-N 0.000 description 1
- BFMIRJBURUXDRG-DLOVCJGASA-N Ala-Phe-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 BFMIRJBURUXDRG-DLOVCJGASA-N 0.000 description 1
- XWFWAXPOLRTDFZ-FXQIFTODSA-N Ala-Pro-Ser Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O XWFWAXPOLRTDFZ-FXQIFTODSA-N 0.000 description 1
- DCVYRWFAMZFSDA-ZLUOBGJFSA-N Ala-Ser-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O DCVYRWFAMZFSDA-ZLUOBGJFSA-N 0.000 description 1
- IOFVWPYSRSCWHI-JXUBOQSCSA-N Ala-Thr-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](C)N IOFVWPYSRSCWHI-JXUBOQSCSA-N 0.000 description 1
- YCTIYBUTCKNOTI-UWJYBYFXSA-N Ala-Tyr-Asp Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N YCTIYBUTCKNOTI-UWJYBYFXSA-N 0.000 description 1
- BGGAIXWIZCIFSG-XDTLVQLUSA-N Ala-Tyr-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O BGGAIXWIZCIFSG-XDTLVQLUSA-N 0.000 description 1
- REWSWYIDQIELBE-FXQIFTODSA-N Ala-Val-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O REWSWYIDQIELBE-FXQIFTODSA-N 0.000 description 1
- 108010025188 Alcohol oxidase Proteins 0.000 description 1
- 241000219195 Arabidopsis thaliana Species 0.000 description 1
- RWCLSUOSKWTXLA-FXQIFTODSA-N Arg-Asp-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O RWCLSUOSKWTXLA-FXQIFTODSA-N 0.000 description 1
- IGULQRCJLQQPSM-DCAQKATOSA-N Arg-Cys-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(O)=O IGULQRCJLQQPSM-DCAQKATOSA-N 0.000 description 1
- QAODJPUKWNNNRP-DCAQKATOSA-N Arg-Glu-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O QAODJPUKWNNNRP-DCAQKATOSA-N 0.000 description 1
- HPSVTWMFWCHKFN-GARJFASQSA-N Arg-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O HPSVTWMFWCHKFN-GARJFASQSA-N 0.000 description 1
- GMFAGHNRXPSSJS-SRVKXCTJSA-N Arg-Leu-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O GMFAGHNRXPSSJS-SRVKXCTJSA-N 0.000 description 1
- OGSQONVYSTZIJB-WDSOQIARSA-N Arg-Leu-Trp Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)CCCN=C(N)N)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(O)=O OGSQONVYSTZIJB-WDSOQIARSA-N 0.000 description 1
- BNYNOWJESJJIOI-XUXIUFHCSA-N Arg-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCN=C(N)N)N BNYNOWJESJJIOI-XUXIUFHCSA-N 0.000 description 1
- CLICCYPMVFGUOF-IHRRRGAJSA-N Arg-Lys-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O CLICCYPMVFGUOF-IHRRRGAJSA-N 0.000 description 1
- LOVIQNMIPQVIGT-BVSLBCMMSA-N Arg-Trp-Phe Chemical compound C([C@H](NC(=O)[C@H](CC=1C2=CC=CC=C2NC=1)NC(=O)[C@H](CCCN=C(N)N)N)C(O)=O)C1=CC=CC=C1 LOVIQNMIPQVIGT-BVSLBCMMSA-N 0.000 description 1
- QMQZYILAWUOLPV-JYJNAYRXSA-N Arg-Tyr-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)CC1=CC=C(O)C=C1 QMQZYILAWUOLPV-JYJNAYRXSA-N 0.000 description 1
- IZSMEUDYADKZTJ-KJEVXHAQSA-N Arg-Tyr-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O IZSMEUDYADKZTJ-KJEVXHAQSA-N 0.000 description 1
- IBLAOXSULLECQZ-IUKAMOBKSA-N Asn-Ile-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CC(N)=O IBLAOXSULLECQZ-IUKAMOBKSA-N 0.000 description 1
- VITDJIPIJZAVGC-VEVYYDQMSA-N Asn-Met-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VITDJIPIJZAVGC-VEVYYDQMSA-N 0.000 description 1
- OROMFUQQTSWUTI-IHRRRGAJSA-N Asn-Phe-Arg Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N OROMFUQQTSWUTI-IHRRRGAJSA-N 0.000 description 1
- BKFXFUPYETWGGA-XVSYOHENSA-N Asn-Phe-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BKFXFUPYETWGGA-XVSYOHENSA-N 0.000 description 1
- HPASIOLTWSNMFB-OLHMAJIHSA-N Asn-Thr-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O HPASIOLTWSNMFB-OLHMAJIHSA-N 0.000 description 1
- FHCRKXCTKSHNOE-QEJZJMRPSA-N Asn-Trp-Glu Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)N)N FHCRKXCTKSHNOE-QEJZJMRPSA-N 0.000 description 1
- MLJZMGIXXMTEPO-UBHSHLNASA-N Asn-Trp-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CO)C(O)=O MLJZMGIXXMTEPO-UBHSHLNASA-N 0.000 description 1
- LTDGPJKGJDIBQD-LAEOZQHASA-N Asn-Val-Gln Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O LTDGPJKGJDIBQD-LAEOZQHASA-N 0.000 description 1
- CBHVAFXKOYAHOY-NHCYSSNCSA-N Asn-Val-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O CBHVAFXKOYAHOY-NHCYSSNCSA-N 0.000 description 1
- ZELQAFZSJOBEQS-ACZMJKKPSA-N Asp-Asn-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZELQAFZSJOBEQS-ACZMJKKPSA-N 0.000 description 1
- NZJDBCYBYCUEDC-UBHSHLNASA-N Asp-Cys-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CC(=O)O)N NZJDBCYBYCUEDC-UBHSHLNASA-N 0.000 description 1
- XDGBFDYXZCMYEX-NUMRIWBASA-N Asp-Glu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)O)N)O XDGBFDYXZCMYEX-NUMRIWBASA-N 0.000 description 1
- WSXDIZFNQYTUJB-SRVKXCTJSA-N Asp-His-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(O)=O WSXDIZFNQYTUJB-SRVKXCTJSA-N 0.000 description 1
- KTTCQQNRRLCIBC-GHCJXIJMSA-N Asp-Ile-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O KTTCQQNRRLCIBC-GHCJXIJMSA-N 0.000 description 1
- GBSUGIXJAAKZOW-GMOBBJLQSA-N Asp-Ile-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O GBSUGIXJAAKZOW-GMOBBJLQSA-N 0.000 description 1
- USNJAPJZSGTTPX-XVSYOHENSA-N Asp-Phe-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O USNJAPJZSGTTPX-XVSYOHENSA-N 0.000 description 1
- KPSHWSWFPUDEGF-FXQIFTODSA-N Asp-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC(O)=O KPSHWSWFPUDEGF-FXQIFTODSA-N 0.000 description 1
- YIDFBWRHIYOYAA-LKXGYXEUSA-N Asp-Ser-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O YIDFBWRHIYOYAA-LKXGYXEUSA-N 0.000 description 1
- XYPJXLLXNSAWHZ-SRVKXCTJSA-N Asp-Ser-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O XYPJXLLXNSAWHZ-SRVKXCTJSA-N 0.000 description 1
- GCACQYDBDHRVGE-LKXGYXEUSA-N Asp-Thr-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CC(O)=O GCACQYDBDHRVGE-LKXGYXEUSA-N 0.000 description 1
- QOJJMJKTMKNFEF-ZKWXMUAHSA-N Asp-Val-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC(O)=O QOJJMJKTMKNFEF-ZKWXMUAHSA-N 0.000 description 1
- 239000002028 Biomass Substances 0.000 description 1
- UXVMQQNJUSDDNG-UHFFFAOYSA-L Calcium chloride Chemical compound [Cl-].[Cl-].[Ca+2] UXVMQQNJUSDDNG-UHFFFAOYSA-L 0.000 description 1
- 108091026890 Coding region Proteins 0.000 description 1
- 241000699802 Cricetulus griseus Species 0.000 description 1
- KKUVRYLJEXJSGX-MXAVVETBSA-N Cys-Ile-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CS)N KKUVRYLJEXJSGX-MXAVVETBSA-N 0.000 description 1
- ODDOYXKAHLKKQY-MMWGEVLESA-N Cys-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CS)N ODDOYXKAHLKKQY-MMWGEVLESA-N 0.000 description 1
- WVLZTXGTNGHPBO-SRVKXCTJSA-N Cys-Leu-Leu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O WVLZTXGTNGHPBO-SRVKXCTJSA-N 0.000 description 1
- HMWBPUDETPKSSS-DCAQKATOSA-N Cys-Pro-Lys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CS)N)C(=O)N[C@@H](CCCCN)C(=O)O HMWBPUDETPKSSS-DCAQKATOSA-N 0.000 description 1
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 1
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 1
- 229920002307 Dextran Polymers 0.000 description 1
- 235000016623 Fragaria vesca Nutrition 0.000 description 1
- 240000009088 Fragaria x ananassa Species 0.000 description 1
- 235000011363 Fragaria x ananassa Nutrition 0.000 description 1
- PONUFVLSGMQFAI-AVGNSLFASA-N Gln-Asn-Phe Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PONUFVLSGMQFAI-AVGNSLFASA-N 0.000 description 1
- NPTGGVQJYRSMCM-GLLZPBPUSA-N Gln-Gln-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NPTGGVQJYRSMCM-GLLZPBPUSA-N 0.000 description 1
- PNENQZWRFMUZOM-DCAQKATOSA-N Gln-Glu-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O PNENQZWRFMUZOM-DCAQKATOSA-N 0.000 description 1
- ORYMMTRPKVTGSJ-XVKPBYJWSA-N Gln-Gly-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCC(N)=O ORYMMTRPKVTGSJ-XVKPBYJWSA-N 0.000 description 1
- RGAOLBZBLOJUTP-GRLWGSQLSA-N Gln-Ile-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)NC(=O)[C@H](CCC(=O)N)N RGAOLBZBLOJUTP-GRLWGSQLSA-N 0.000 description 1
- CAXXTYYGFYTBPV-IUCAKERBSA-N Gln-Leu-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O CAXXTYYGFYTBPV-IUCAKERBSA-N 0.000 description 1
- UEILCTONAMOGBR-RWRJDSDZSA-N Gln-Thr-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O UEILCTONAMOGBR-RWRJDSDZSA-N 0.000 description 1
- KKCUFHUTMKQQCF-SRVKXCTJSA-N Glu-Arg-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O KKCUFHUTMKQQCF-SRVKXCTJSA-N 0.000 description 1
- GCYFUZJHAXJKKE-KKUMJFAQSA-N Glu-Arg-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O GCYFUZJHAXJKKE-KKUMJFAQSA-N 0.000 description 1
- XHUCVVHRLNPZSZ-CIUDSAMLSA-N Glu-Gln-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O XHUCVVHRLNPZSZ-CIUDSAMLSA-N 0.000 description 1
- MUSGDMDGNGXULI-DCAQKATOSA-N Glu-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O MUSGDMDGNGXULI-DCAQKATOSA-N 0.000 description 1
- IQACOVZVOMVILH-FXQIFTODSA-N Glu-Glu-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O IQACOVZVOMVILH-FXQIFTODSA-N 0.000 description 1
- CUXJIASLBRJOFV-LAEOZQHASA-N Glu-Gly-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O CUXJIASLBRJOFV-LAEOZQHASA-N 0.000 description 1
- ZWABFSSWTSAMQN-KBIXCLLPSA-N Glu-Ile-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O ZWABFSSWTSAMQN-KBIXCLLPSA-N 0.000 description 1
- DNPCBMNFQVTHMA-DCAQKATOSA-N Glu-Leu-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O DNPCBMNFQVTHMA-DCAQKATOSA-N 0.000 description 1
- RBXSZQRSEGYDFG-GUBZILKMSA-N Glu-Lys-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O RBXSZQRSEGYDFG-GUBZILKMSA-N 0.000 description 1
- BXSZPACYCMNKLS-AVGNSLFASA-N Glu-Ser-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O BXSZPACYCMNKLS-AVGNSLFASA-N 0.000 description 1
- TZXOPHFCAATANZ-QEJZJMRPSA-N Glu-Ser-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)O)N TZXOPHFCAATANZ-QEJZJMRPSA-N 0.000 description 1
- HZISRJBYZAODRV-XQXXSGGOSA-N Glu-Thr-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O HZISRJBYZAODRV-XQXXSGGOSA-N 0.000 description 1
- YQAQQKPWFOBSMU-WDCWCFNPSA-N Glu-Thr-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O YQAQQKPWFOBSMU-WDCWCFNPSA-N 0.000 description 1
- PMSDOVISAARGAV-FHWLQOOXSA-N Glu-Tyr-Phe Chemical compound C([C@H](NC(=O)[C@H](CCC(O)=O)N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 PMSDOVISAARGAV-FHWLQOOXSA-N 0.000 description 1
- UZWUBBRJWFTHTD-LAEOZQHASA-N Glu-Val-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCC(O)=O UZWUBBRJWFTHTD-LAEOZQHASA-N 0.000 description 1
- YPHPEHMXOYTEQG-LAEOZQHASA-N Glu-Val-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCC(O)=O YPHPEHMXOYTEQG-LAEOZQHASA-N 0.000 description 1
- RMWAOBGCZZSJHE-UMNHJUIQSA-N Glu-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)O)N RMWAOBGCZZSJHE-UMNHJUIQSA-N 0.000 description 1
- WGYHAAXZWPEBDQ-IFFSRLJSSA-N Glu-Val-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O WGYHAAXZWPEBDQ-IFFSRLJSSA-N 0.000 description 1
- QRWPTXLWHHTOCO-DZKIICNBSA-N Glu-Val-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O QRWPTXLWHHTOCO-DZKIICNBSA-N 0.000 description 1
- UGVQELHRNUDMAA-BYPYZUCNSA-N Gly-Ala-Gly Chemical compound [NH3+]CC(=O)N[C@@H](C)C(=O)NCC([O-])=O UGVQELHRNUDMAA-BYPYZUCNSA-N 0.000 description 1
- VSVZIEVNUYDAFR-YUMQZZPRSA-N Gly-Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN VSVZIEVNUYDAFR-YUMQZZPRSA-N 0.000 description 1
- GGEJHJIXRBTJPD-BYPYZUCNSA-N Gly-Asn-Gly Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O GGEJHJIXRBTJPD-BYPYZUCNSA-N 0.000 description 1
- QSTLUOIOYLYLLF-WDSKDSINSA-N Gly-Asp-Glu Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O QSTLUOIOYLYLLF-WDSKDSINSA-N 0.000 description 1
- MBOAPAXLTUSMQI-JHEQGTHGSA-N Gly-Glu-Thr Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MBOAPAXLTUSMQI-JHEQGTHGSA-N 0.000 description 1
- UFPXDFOYHVEIPI-BYPYZUCNSA-N Gly-Gly-Asp Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O UFPXDFOYHVEIPI-BYPYZUCNSA-N 0.000 description 1
- YWAQATDNEKZFFK-BYPYZUCNSA-N Gly-Gly-Ser Chemical compound NCC(=O)NCC(=O)N[C@@H](CO)C(O)=O YWAQATDNEKZFFK-BYPYZUCNSA-N 0.000 description 1
- FQKKPCWTZZEDIC-XPUUQOCRSA-N Gly-His-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)CN)CC1=CN=CN1 FQKKPCWTZZEDIC-XPUUQOCRSA-N 0.000 description 1
- HKSNHPVETYYJBK-LAEOZQHASA-N Gly-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)CN HKSNHPVETYYJBK-LAEOZQHASA-N 0.000 description 1
- AAHSHTLISQUZJL-QSFUFRPTSA-N Gly-Ile-Ile Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O AAHSHTLISQUZJL-QSFUFRPTSA-N 0.000 description 1
- UESJMAMHDLEHGM-NHCYSSNCSA-N Gly-Ile-Leu Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O UESJMAMHDLEHGM-NHCYSSNCSA-N 0.000 description 1
- ULZCYBYDTUMHNF-IUCAKERBSA-N Gly-Leu-Glu Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O ULZCYBYDTUMHNF-IUCAKERBSA-N 0.000 description 1
- FXLVSYVJDPCIHH-STQMWFEESA-N Gly-Phe-Arg Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FXLVSYVJDPCIHH-STQMWFEESA-N 0.000 description 1
- GGAPHLIUUTVYMX-QWRGUYRKSA-N Gly-Phe-Ser Chemical compound OC[C@@H](C([O-])=O)NC(=O)[C@@H](NC(=O)C[NH3+])CC1=CC=CC=C1 GGAPHLIUUTVYMX-QWRGUYRKSA-N 0.000 description 1
- LBDXVCBAJJNJNN-WHFBIAKZSA-N Gly-Ser-Cys Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(O)=O LBDXVCBAJJNJNN-WHFBIAKZSA-N 0.000 description 1
- FKYQEVBRZSFAMJ-QWRGUYRKSA-N Gly-Ser-Tyr Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 FKYQEVBRZSFAMJ-QWRGUYRKSA-N 0.000 description 1
- COZMNNJEGNPDED-HOCLYGCPSA-N Gly-Val-Trp Chemical compound [H]NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O COZMNNJEGNPDED-HOCLYGCPSA-N 0.000 description 1
- 102000003886 Glycoproteins Human genes 0.000 description 1
- 108090000288 Glycoproteins Proteins 0.000 description 1
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 1
- 241000238631 Hexapoda Species 0.000 description 1
- 101150071246 Hexb gene Proteins 0.000 description 1
- AFPFGFUGETYOSY-HGNGGELXSA-N His-Ala-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O AFPFGFUGETYOSY-HGNGGELXSA-N 0.000 description 1
- CHZKBLABUKSXDM-XIRDDKMYSA-N His-Asn-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC3=CN=CN3)N CHZKBLABUKSXDM-XIRDDKMYSA-N 0.000 description 1
- MLZVJIREOKTDAR-SIGLWIIPSA-N His-Ile-Ile Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MLZVJIREOKTDAR-SIGLWIIPSA-N 0.000 description 1
- VFBZWZXKCVBTJR-SRVKXCTJSA-N His-Leu-Asp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N VFBZWZXKCVBTJR-SRVKXCTJSA-N 0.000 description 1
- UMBKDWGQESDCTO-KKUMJFAQSA-N His-Lys-Lys Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O UMBKDWGQESDCTO-KKUMJFAQSA-N 0.000 description 1
- STGQSBKUYSPPIG-CIUDSAMLSA-N His-Ser-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CN=CN1 STGQSBKUYSPPIG-CIUDSAMLSA-N 0.000 description 1
- XSEAJSPAOTZXJE-IHPCNDPISA-N His-Trp-His Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC3=CN=CN3)C(=O)O)NC(=O)[C@H](CC4=CN=CN4)N XSEAJSPAOTZXJE-IHPCNDPISA-N 0.000 description 1
- DAKSMIWQZPHRIB-BZSNNMDCSA-N His-Tyr-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O DAKSMIWQZPHRIB-BZSNNMDCSA-N 0.000 description 1
- CYHYBSGMHMHKOA-CIQUZCHMSA-N Ile-Ala-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N CYHYBSGMHMHKOA-CIQUZCHMSA-N 0.000 description 1
- QIHJTGSVGIPHIW-QSFUFRPTSA-N Ile-Asn-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](C(C)C)C(=O)O)N QIHJTGSVGIPHIW-QSFUFRPTSA-N 0.000 description 1
- IDAHFEPYTJJZFD-PEFMBERDSA-N Ile-Asp-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N IDAHFEPYTJJZFD-PEFMBERDSA-N 0.000 description 1
- GECLQMBTZCPAFY-PEFMBERDSA-N Ile-Gln-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N GECLQMBTZCPAFY-PEFMBERDSA-N 0.000 description 1
- WZDCVAWMBUNDDY-KBIXCLLPSA-N Ile-Glu-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](C)C(=O)O)N WZDCVAWMBUNDDY-KBIXCLLPSA-N 0.000 description 1
- XLCZWMJPVGRWHJ-KQXIARHKSA-N Ile-Glu-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N XLCZWMJPVGRWHJ-KQXIARHKSA-N 0.000 description 1
- SPQWWEZBHXHUJN-KBIXCLLPSA-N Ile-Glu-Ser Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O SPQWWEZBHXHUJN-KBIXCLLPSA-N 0.000 description 1
- SJLVSMMIFYTSGY-GRLWGSQLSA-N Ile-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N SJLVSMMIFYTSGY-GRLWGSQLSA-N 0.000 description 1
- DMSVBUWGDLYNLC-IAVJCBSLSA-N Ile-Ile-Phe Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 DMSVBUWGDLYNLC-IAVJCBSLSA-N 0.000 description 1
- PHRWFSFCNJPWRO-PPCPHDFISA-N Ile-Leu-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N PHRWFSFCNJPWRO-PPCPHDFISA-N 0.000 description 1
- PNTWNAXGBOZMBO-MNXVOIDGSA-N Ile-Lys-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N PNTWNAXGBOZMBO-MNXVOIDGSA-N 0.000 description 1
- FHPZJWJWTWZKNA-LLLHUVSDSA-N Ile-Phe-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N2CCC[C@@H]2C(=O)O)N FHPZJWJWTWZKNA-LLLHUVSDSA-N 0.000 description 1
- ZLFNNVATRMCAKN-ZKWXMUAHSA-N Ile-Ser-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)NCC(=O)O)N ZLFNNVATRMCAKN-ZKWXMUAHSA-N 0.000 description 1
- QHUREMVLLMNUAX-OSUNSFLBSA-N Ile-Thr-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)O)N QHUREMVLLMNUAX-OSUNSFLBSA-N 0.000 description 1
- BQIIHAGJIYOQBP-YFYLHZKVSA-N Ile-Trp-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N3CCC[C@@H]3C(=O)O)N BQIIHAGJIYOQBP-YFYLHZKVSA-N 0.000 description 1
- ZUWSVOYKBCHLRR-MGHWNKPDSA-N Ile-Tyr-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCCN)C(=O)O)N ZUWSVOYKBCHLRR-MGHWNKPDSA-N 0.000 description 1
- 208000026350 Inborn Genetic disease Diseases 0.000 description 1
- 241001099156 Komagataella phaffii Species 0.000 description 1
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysin Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 1
- SENJXOPIZNYLHU-UHFFFAOYSA-N L-leucyl-L-arginine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CCCN=C(N)N SENJXOPIZNYLHU-UHFFFAOYSA-N 0.000 description 1
- REPPKAMYTOJTFC-DCAQKATOSA-N Leu-Arg-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O REPPKAMYTOJTFC-DCAQKATOSA-N 0.000 description 1
- POJPZSMTTMLSTG-SRVKXCTJSA-N Leu-Asn-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N POJPZSMTTMLSTG-SRVKXCTJSA-N 0.000 description 1
- TWQIYNGNYNJUFM-NHCYSSNCSA-N Leu-Asn-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O TWQIYNGNYNJUFM-NHCYSSNCSA-N 0.000 description 1
- CLVUXCBGKUECIT-HJGDQZAQSA-N Leu-Asp-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CLVUXCBGKUECIT-HJGDQZAQSA-N 0.000 description 1
- QCSFMCFHVGTLFF-NHCYSSNCSA-N Leu-Asp-Val Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O QCSFMCFHVGTLFF-NHCYSSNCSA-N 0.000 description 1
- PNUCWVAGVNLUMW-CIUDSAMLSA-N Leu-Cys-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)N[C@@H](CO)C(O)=O PNUCWVAGVNLUMW-CIUDSAMLSA-N 0.000 description 1
- VWHGTYCRDRBSFI-ZETCQYMHSA-N Leu-Gly-Gly Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)NCC(O)=O VWHGTYCRDRBSFI-ZETCQYMHSA-N 0.000 description 1
- POZULHZYLPGXMR-ONGXEEELSA-N Leu-Gly-Val Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O POZULHZYLPGXMR-ONGXEEELSA-N 0.000 description 1
- LKXANTUNFMVCNF-IHPCNDPISA-N Leu-His-Trp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O LKXANTUNFMVCNF-IHPCNDPISA-N 0.000 description 1
- AVEGDIAXTDVBJS-XUXIUFHCSA-N Leu-Ile-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O AVEGDIAXTDVBJS-XUXIUFHCSA-N 0.000 description 1
- FAELBUXXFQLUAX-AJNGGQMLSA-N Leu-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(C)C FAELBUXXFQLUAX-AJNGGQMLSA-N 0.000 description 1
- ZRHDPZAAWLXXIR-SRVKXCTJSA-N Leu-Lys-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O ZRHDPZAAWLXXIR-SRVKXCTJSA-N 0.000 description 1
- FKQPWMZLIIATBA-AJNGGQMLSA-N Leu-Lys-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FKQPWMZLIIATBA-AJNGGQMLSA-N 0.000 description 1
- VCHVSKNMTXWIIP-SRVKXCTJSA-N Leu-Lys-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O VCHVSKNMTXWIIP-SRVKXCTJSA-N 0.000 description 1
- ZDBMWELMUCLUPL-QEJZJMRPSA-N Leu-Phe-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CC=CC=C1 ZDBMWELMUCLUPL-QEJZJMRPSA-N 0.000 description 1
- KTOIECMYZZGVSI-BZSNNMDCSA-N Leu-Phe-His Chemical compound C([C@H](NC(=O)[C@@H](N)CC(C)C)C(=O)N[C@@H](CC=1NC=NC=1)C(O)=O)C1=CC=CC=C1 KTOIECMYZZGVSI-BZSNNMDCSA-N 0.000 description 1
- WMIOEVKKYIMVKI-DCAQKATOSA-N Leu-Pro-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O WMIOEVKKYIMVKI-DCAQKATOSA-N 0.000 description 1
- PPGBXYKMUMHFBF-KATARQTJSA-N Leu-Ser-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PPGBXYKMUMHFBF-KATARQTJSA-N 0.000 description 1
- SVBJIZVVYJYGLA-DCAQKATOSA-N Leu-Ser-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O SVBJIZVVYJYGLA-DCAQKATOSA-N 0.000 description 1
- QWWPYKKLXWOITQ-VOAKCMCISA-N Leu-Thr-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(C)C QWWPYKKLXWOITQ-VOAKCMCISA-N 0.000 description 1
- HGLKOTPFWOMPOB-MEYUZBJRSA-N Leu-Thr-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 HGLKOTPFWOMPOB-MEYUZBJRSA-N 0.000 description 1
- FDBTVENULFNTAL-XQQFMLRXSA-N Leu-Val-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N FDBTVENULFNTAL-XQQFMLRXSA-N 0.000 description 1
- VKVDRTGWLVZJOM-DCAQKATOSA-N Leu-Val-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O VKVDRTGWLVZJOM-DCAQKATOSA-N 0.000 description 1
- WSXTWLJHTLRFLW-SRVKXCTJSA-N Lys-Ala-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O WSXTWLJHTLRFLW-SRVKXCTJSA-N 0.000 description 1
- IXHKPDJKKCUKHS-GARJFASQSA-N Lys-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N IXHKPDJKKCUKHS-GARJFASQSA-N 0.000 description 1
- JBRWKVANRYPCAF-XIRDDKMYSA-N Lys-Asn-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N JBRWKVANRYPCAF-XIRDDKMYSA-N 0.000 description 1
- GHOIOYHDDKXIDX-SZMVWBNQSA-N Lys-Glu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCCCN)C(O)=O)=CNC2=C1 GHOIOYHDDKXIDX-SZMVWBNQSA-N 0.000 description 1
- PGLGNCVOWIORQE-SRVKXCTJSA-N Lys-His-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O PGLGNCVOWIORQE-SRVKXCTJSA-N 0.000 description 1
- XREQQOATSMMAJP-MGHWNKPDSA-N Lys-Ile-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O XREQQOATSMMAJP-MGHWNKPDSA-N 0.000 description 1
- OVAOHZIOUBEQCJ-IHRRRGAJSA-N Lys-Leu-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O OVAOHZIOUBEQCJ-IHRRRGAJSA-N 0.000 description 1
- QKXZCUCBFPEXNK-KKUMJFAQSA-N Lys-Leu-His Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 QKXZCUCBFPEXNK-KKUMJFAQSA-N 0.000 description 1
- PLDJDCJLRCYPJB-VOAKCMCISA-N Lys-Lys-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PLDJDCJLRCYPJB-VOAKCMCISA-N 0.000 description 1
- AFLBTVGQCQLOFJ-AVGNSLFASA-N Lys-Pro-Arg Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCN=C(N)N)C(O)=O AFLBTVGQCQLOFJ-AVGNSLFASA-N 0.000 description 1
- JOSAKOKSPXROGQ-BJDJZHNGSA-N Lys-Ser-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JOSAKOKSPXROGQ-BJDJZHNGSA-N 0.000 description 1
- DLCAXBGXGOVUCD-PPCPHDFISA-N Lys-Thr-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DLCAXBGXGOVUCD-PPCPHDFISA-N 0.000 description 1
- XGZDDOKIHSYHTO-SZMVWBNQSA-N Lys-Trp-Glu Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O)=CNC2=C1 XGZDDOKIHSYHTO-SZMVWBNQSA-N 0.000 description 1
- IKXQOBUBZSOWDY-AVGNSLFASA-N Lys-Val-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](CCCCN)N IKXQOBUBZSOWDY-AVGNSLFASA-N 0.000 description 1
- 241000124008 Mammalia Species 0.000 description 1
- OIFHHODAXVWKJN-ULQDDVLXSA-N Met-Phe-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C(O)=O)CC1=CC=CC=C1 OIFHHODAXVWKJN-ULQDDVLXSA-N 0.000 description 1
- SOAYQFDWEIWPPR-IHRRRGAJSA-N Met-Ser-Tyr Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O SOAYQFDWEIWPPR-IHRRRGAJSA-N 0.000 description 1
- KZNQNBZMBZJQJO-UHFFFAOYSA-N N-glycyl-L-proline Natural products NCC(=O)N1CCCC1C(O)=O KZNQNBZMBZJQJO-UHFFFAOYSA-N 0.000 description 1
- 241000143603 Nakaseomyces Species 0.000 description 1
- 241001489174 Ogataea minuta Species 0.000 description 1
- 241000209094 Oryza Species 0.000 description 1
- 240000000125 Oryza minuta Species 0.000 description 1
- 235000007164 Oryza sativa Nutrition 0.000 description 1
- 108091005804 Peptidases Proteins 0.000 description 1
- YMORXCKTSSGYIG-IHRRRGAJSA-N Phe-Arg-Cys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CS)C(=O)O)N YMORXCKTSSGYIG-IHRRRGAJSA-N 0.000 description 1
- HTKNPQZCMLBOTQ-XVSYOHENSA-N Phe-Asn-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CC=CC=C1)N)O HTKNPQZCMLBOTQ-XVSYOHENSA-N 0.000 description 1
- XMPUYNHKEPFERE-IHRRRGAJSA-N Phe-Asp-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 XMPUYNHKEPFERE-IHRRRGAJSA-N 0.000 description 1
- CUMXHKAOHNWRFQ-BZSNNMDCSA-N Phe-Asp-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 CUMXHKAOHNWRFQ-BZSNNMDCSA-N 0.000 description 1
- FRPVPGRXUKFEQE-YDHLFZDLSA-N Phe-Asp-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O FRPVPGRXUKFEQE-YDHLFZDLSA-N 0.000 description 1
- KJJROSNFBRWPHS-JYJNAYRXSA-N Phe-Glu-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O KJJROSNFBRWPHS-JYJNAYRXSA-N 0.000 description 1
- VADLTGVIOIOKGM-BZSNNMDCSA-N Phe-His-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC=CC=1)C1=CN=CN1 VADLTGVIOIOKGM-BZSNNMDCSA-N 0.000 description 1
- ZIQQNOXKEFDPBE-BZSNNMDCSA-N Phe-Lys-His Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N ZIQQNOXKEFDPBE-BZSNNMDCSA-N 0.000 description 1
- JLLJTMHNXQTMCK-UBHSHLNASA-N Phe-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC1=CC=CC=C1 JLLJTMHNXQTMCK-UBHSHLNASA-N 0.000 description 1
- WWPAHTZOWURIMR-ULQDDVLXSA-N Phe-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC1=CC=CC=C1 WWPAHTZOWURIMR-ULQDDVLXSA-N 0.000 description 1
- YMIZSYUAZJSOFL-SRVKXCTJSA-N Phe-Ser-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O YMIZSYUAZJSOFL-SRVKXCTJSA-N 0.000 description 1
- XNMYNGDKJNOKHH-BZSNNMDCSA-N Phe-Ser-Tyr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O XNMYNGDKJNOKHH-BZSNNMDCSA-N 0.000 description 1
- GNRMAQSIROFNMI-IXOXFDKPSA-N Phe-Thr-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O GNRMAQSIROFNMI-IXOXFDKPSA-N 0.000 description 1
- BQMFWUKNOCJDNV-HJWJTTGWSA-N Phe-Val-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BQMFWUKNOCJDNV-HJWJTTGWSA-N 0.000 description 1
- APKRGYLBSCWJJP-FXQIFTODSA-N Pro-Ala-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O APKRGYLBSCWJJP-FXQIFTODSA-N 0.000 description 1
- VCYJKOLZYPYGJV-AVGNSLFASA-N Pro-Arg-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O VCYJKOLZYPYGJV-AVGNSLFASA-N 0.000 description 1
- JFNPBBOGGNMSRX-CIUDSAMLSA-N Pro-Gln-Ala Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O JFNPBBOGGNMSRX-CIUDSAMLSA-N 0.000 description 1
- HAEGAELAYWSUNC-WPRPVWTQSA-N Pro-Gly-Val Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O HAEGAELAYWSUNC-WPRPVWTQSA-N 0.000 description 1
- BBFRBZYKHIKFBX-GMOBBJLQSA-N Pro-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@@H]1CCCN1 BBFRBZYKHIKFBX-GMOBBJLQSA-N 0.000 description 1
- JIWJRKNYLSHONY-KKUMJFAQSA-N Pro-Phe-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O JIWJRKNYLSHONY-KKUMJFAQSA-N 0.000 description 1
- CZCCVJUUWBMISW-FXQIFTODSA-N Pro-Ser-Cys Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)O CZCCVJUUWBMISW-FXQIFTODSA-N 0.000 description 1
- SNGZLPOXVRTNMB-LPEHRKFASA-N Pro-Ser-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CO)C(=O)N2CCC[C@@H]2C(=O)O SNGZLPOXVRTNMB-LPEHRKFASA-N 0.000 description 1
- IURWWZYKYPEANQ-HJGDQZAQSA-N Pro-Thr-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O IURWWZYKYPEANQ-HJGDQZAQSA-N 0.000 description 1
- BNUKRHFCHHLIGR-JYJNAYRXSA-N Pro-Trp-Asp Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)N[C@@H](CC(=O)O)C(=O)O BNUKRHFCHHLIGR-JYJNAYRXSA-N 0.000 description 1
- VPBQDHMASPJHGY-JYJNAYRXSA-N Pro-Trp-Ser Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)N[C@@H](CO)C(=O)O VPBQDHMASPJHGY-JYJNAYRXSA-N 0.000 description 1
- ZAUHSLVPDLNTRZ-QXEWZRGKSA-N Pro-Val-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O ZAUHSLVPDLNTRZ-QXEWZRGKSA-N 0.000 description 1
- 239000004365 Protease Substances 0.000 description 1
- 238000002123 RNA extraction Methods 0.000 description 1
- 102100037486 Reverse transcriptase/ribonuclease H Human genes 0.000 description 1
- 240000004110 Russelia equisetiformis Species 0.000 description 1
- 238000012300 Sequence Analysis Methods 0.000 description 1
- PZZJMBYSYAKYPK-UWJYBYFXSA-N Ser-Ala-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O PZZJMBYSYAKYPK-UWJYBYFXSA-N 0.000 description 1
- VQBLHWSPVYYZTB-DCAQKATOSA-N Ser-Arg-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CO)N VQBLHWSPVYYZTB-DCAQKATOSA-N 0.000 description 1
- BCKYYTVFBXHPOG-ACZMJKKPSA-N Ser-Asn-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CO)N BCKYYTVFBXHPOG-ACZMJKKPSA-N 0.000 description 1
- TYYBJUYSTWJHGO-ZKWXMUAHSA-N Ser-Asn-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O TYYBJUYSTWJHGO-ZKWXMUAHSA-N 0.000 description 1
- BYIROAKULFFTEK-CIUDSAMLSA-N Ser-Asp-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CO BYIROAKULFFTEK-CIUDSAMLSA-N 0.000 description 1
- BLPYXIXXCFVIIF-FXQIFTODSA-N Ser-Cys-Arg Chemical compound C(C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CO)N)CN=C(N)N BLPYXIXXCFVIIF-FXQIFTODSA-N 0.000 description 1
- WKLJLEXEENIYQE-SRVKXCTJSA-N Ser-Cys-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O WKLJLEXEENIYQE-SRVKXCTJSA-N 0.000 description 1
- RNMRYWZYFHHOEV-CIUDSAMLSA-N Ser-Gln-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O RNMRYWZYFHHOEV-CIUDSAMLSA-N 0.000 description 1
- OJPHFSOMBZKQKQ-GUBZILKMSA-N Ser-Gln-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CO OJPHFSOMBZKQKQ-GUBZILKMSA-N 0.000 description 1
- MUARUIBTKQJKFY-WHFBIAKZSA-N Ser-Gly-Asp Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O MUARUIBTKQJKFY-WHFBIAKZSA-N 0.000 description 1
- KDGARKCAKHBEDB-NKWVEPMBSA-N Ser-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CO)N)C(=O)O KDGARKCAKHBEDB-NKWVEPMBSA-N 0.000 description 1
- ZUDXUJSYCCNZQJ-DCAQKATOSA-N Ser-His-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CO)N ZUDXUJSYCCNZQJ-DCAQKATOSA-N 0.000 description 1
- HEUVHBXOVZONPU-BJDJZHNGSA-N Ser-Leu-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HEUVHBXOVZONPU-BJDJZHNGSA-N 0.000 description 1
- MUJQWSAWLLRJCE-KATARQTJSA-N Ser-Leu-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MUJQWSAWLLRJCE-KATARQTJSA-N 0.000 description 1
- HDBOEVPDIDDEPC-CIUDSAMLSA-N Ser-Lys-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O HDBOEVPDIDDEPC-CIUDSAMLSA-N 0.000 description 1
- VIIJCAQMJBHSJH-FXQIFTODSA-N Ser-Met-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(O)=O VIIJCAQMJBHSJH-FXQIFTODSA-N 0.000 description 1
- UGTZYIPOBYXWRW-SRVKXCTJSA-N Ser-Phe-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O UGTZYIPOBYXWRW-SRVKXCTJSA-N 0.000 description 1
- CKDXFSPMIDSMGV-GUBZILKMSA-N Ser-Pro-Val Chemical compound [H]N[C@@H](CO)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O CKDXFSPMIDSMGV-GUBZILKMSA-N 0.000 description 1
- GYDFRTRSSXOZCR-ACZMJKKPSA-N Ser-Ser-Glu Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O GYDFRTRSSXOZCR-ACZMJKKPSA-N 0.000 description 1
- JCLAFVNDBJMLBC-JBDRJPRFSA-N Ser-Ser-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JCLAFVNDBJMLBC-JBDRJPRFSA-N 0.000 description 1
- CUXJENOFJXOSOZ-BIIVOSGPSA-N Ser-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CO)N)C(=O)O CUXJENOFJXOSOZ-BIIVOSGPSA-N 0.000 description 1
- VGQVAVQWKJLIRM-FXQIFTODSA-N Ser-Ser-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O VGQVAVQWKJLIRM-FXQIFTODSA-N 0.000 description 1
- BCAVNDNYOGTQMQ-AAEUAGOBSA-N Ser-Trp-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)NCC(O)=O BCAVNDNYOGTQMQ-AAEUAGOBSA-N 0.000 description 1
- OQSQCUWQOIHECT-YJRXYDGGSA-N Ser-Tyr-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OQSQCUWQOIHECT-YJRXYDGGSA-N 0.000 description 1
- HAYADTTXNZFUDM-IHRRRGAJSA-N Ser-Tyr-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O HAYADTTXNZFUDM-IHRRRGAJSA-N 0.000 description 1
- PMTWIUBUQRGCSB-FXQIFTODSA-N Ser-Val-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O PMTWIUBUQRGCSB-FXQIFTODSA-N 0.000 description 1
- UKKROEYWYIHWBD-ZKWXMUAHSA-N Ser-Val-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O UKKROEYWYIHWBD-ZKWXMUAHSA-N 0.000 description 1
- JGUWRQWULDWNCM-FXQIFTODSA-N Ser-Val-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O JGUWRQWULDWNCM-FXQIFTODSA-N 0.000 description 1
- 241000974808 Spathaspora Species 0.000 description 1
- 108091081024 Start codon Proteins 0.000 description 1
- 208000022292 Tay-Sachs disease Diseases 0.000 description 1
- 241000183049 Tetrapisispora Species 0.000 description 1
- FQPQPTHMHZKGFM-XQXXSGGOSA-N Thr-Ala-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O FQPQPTHMHZKGFM-XQXXSGGOSA-N 0.000 description 1
- BSNZTJXVDOINSR-JXUBOQSCSA-N Thr-Ala-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O BSNZTJXVDOINSR-JXUBOQSCSA-N 0.000 description 1
- ZUXQFMVPAYGPFJ-JXUBOQSCSA-N Thr-Ala-Lys Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN ZUXQFMVPAYGPFJ-JXUBOQSCSA-N 0.000 description 1
- GKMYGVQDGVYCPC-IUKAMOBKSA-N Thr-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H]([C@@H](C)O)N GKMYGVQDGVYCPC-IUKAMOBKSA-N 0.000 description 1
- GCXFWAZRHBRYEM-NUMRIWBASA-N Thr-Gln-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N)O GCXFWAZRHBRYEM-NUMRIWBASA-N 0.000 description 1
- ONNSECRQFSTMCC-XKBZYTNZSA-N Thr-Glu-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O ONNSECRQFSTMCC-XKBZYTNZSA-N 0.000 description 1
- AHOLTQCAVBSUDP-PPCPHDFISA-N Thr-Ile-Lys Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](N)[C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O AHOLTQCAVBSUDP-PPCPHDFISA-N 0.000 description 1
- MECLEFZMPPOEAC-VOAKCMCISA-N Thr-Leu-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N)O MECLEFZMPPOEAC-VOAKCMCISA-N 0.000 description 1
- YOOAQCZYZHGUAZ-KATARQTJSA-N Thr-Leu-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YOOAQCZYZHGUAZ-KATARQTJSA-N 0.000 description 1
- QFCQNHITJPRQTB-IEGACIPQSA-N Thr-Lys-Trp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N)O QFCQNHITJPRQTB-IEGACIPQSA-N 0.000 description 1
- WYLAVUAWOUVUCA-XVSYOHENSA-N Thr-Phe-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O WYLAVUAWOUVUCA-XVSYOHENSA-N 0.000 description 1
- JMBRNXUOLJFURW-BEAPCOKYSA-N Thr-Phe-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N2CCC[C@@H]2C(=O)O)N)O JMBRNXUOLJFURW-BEAPCOKYSA-N 0.000 description 1
- ABWNZPOIUJMNKT-IXOXFDKPSA-N Thr-Phe-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O ABWNZPOIUJMNKT-IXOXFDKPSA-N 0.000 description 1
- PRTHQBSMXILLPC-XGEHTFHBSA-N Thr-Ser-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O PRTHQBSMXILLPC-XGEHTFHBSA-N 0.000 description 1
- BCYUHPXBHCUYBA-CUJWVEQBSA-N Thr-Ser-His Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O BCYUHPXBHCUYBA-CUJWVEQBSA-N 0.000 description 1
- RVMNUBQWPVOUKH-HEIBUPTGSA-N Thr-Ser-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O RVMNUBQWPVOUKH-HEIBUPTGSA-N 0.000 description 1
- DIHPMRTXPYMDJZ-KAOXEZKKSA-N Thr-Tyr-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N2CCC[C@@H]2C(=O)O)N)O DIHPMRTXPYMDJZ-KAOXEZKKSA-N 0.000 description 1
- LVRFMARKDGGZMX-IZPVPAKOSA-N Thr-Tyr-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)O)C(O)=O)CC1=CC=C(O)C=C1 LVRFMARKDGGZMX-IZPVPAKOSA-N 0.000 description 1
- KPMIQCXJDVKWKO-IFFSRLJSSA-N Thr-Val-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O KPMIQCXJDVKWKO-IFFSRLJSSA-N 0.000 description 1
- CURFABYITJVKEW-QTKMDUPCSA-N Thr-Val-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N)O CURFABYITJVKEW-QTKMDUPCSA-N 0.000 description 1
- KZTLZZQTJMCGIP-ZJDVBMNYSA-N Thr-Val-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KZTLZZQTJMCGIP-ZJDVBMNYSA-N 0.000 description 1
- BPGDJSUFQKWUBK-KJEVXHAQSA-N Thr-Val-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 BPGDJSUFQKWUBK-KJEVXHAQSA-N 0.000 description 1
- VYVBSMCZNHOZGD-RCWTZXSCSA-N Thr-Val-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C(C)C)C(O)=O VYVBSMCZNHOZGD-RCWTZXSCSA-N 0.000 description 1
- 108700009124 Transcription Initiation Site Proteins 0.000 description 1
- 108091023040 Transcription factor Proteins 0.000 description 1
- 102000040945 Transcription factor Human genes 0.000 description 1
- 239000007983 Tris buffer Substances 0.000 description 1
- CZWIHKFGHICAJX-BPUTZDHNSA-N Trp-Glu-Glu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O)=CNC2=C1 CZWIHKFGHICAJX-BPUTZDHNSA-N 0.000 description 1
- UJRIVCPPPMYCNA-HOCLYGCPSA-N Trp-Leu-Gly Chemical compound CC(C)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N UJRIVCPPPMYCNA-HOCLYGCPSA-N 0.000 description 1
- RWAYYYOZMHMEGD-XIRDDKMYSA-N Trp-Leu-Ser Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O)=CNC2=C1 RWAYYYOZMHMEGD-XIRDDKMYSA-N 0.000 description 1
- YTYHAYZPOARHAP-HOCLYGCPSA-N Trp-Lys-Gly Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)NCC(=O)O)N YTYHAYZPOARHAP-HOCLYGCPSA-N 0.000 description 1
- UHXOYRWHIQZAKV-SZMVWBNQSA-N Trp-Pro-Arg Chemical compound O=C([C@H](CC=1C2=CC=CC=C2NC=1)N)N1CCC[C@H]1C(=O)N[C@@H](CCCN=C(N)N)C(O)=O UHXOYRWHIQZAKV-SZMVWBNQSA-N 0.000 description 1
- XOLLWQIBBLBAHQ-WDSOQIARSA-N Trp-Pro-Leu Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O XOLLWQIBBLBAHQ-WDSOQIARSA-N 0.000 description 1
- OJKVFAWXPGCJMF-BPUTZDHNSA-N Trp-Pro-Ser Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC2=CNC3=CC=CC=C32)N)C(=O)N[C@@H](CO)C(=O)O OJKVFAWXPGCJMF-BPUTZDHNSA-N 0.000 description 1
- VMXLNDRJXVAJFT-JYBASQMISA-N Trp-Thr-Ser Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N)O VMXLNDRJXVAJFT-JYBASQMISA-N 0.000 description 1
- DVLHKUWLNKDINO-PMVMPFDFSA-N Trp-Tyr-Leu Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O DVLHKUWLNKDINO-PMVMPFDFSA-N 0.000 description 1
- ZWZOCUWOXSDYFZ-CQDKDKBSSA-N Tyr-Ala-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 ZWZOCUWOXSDYFZ-CQDKDKBSSA-N 0.000 description 1
- AKXBNSZMYAOGLS-STQMWFEESA-N Tyr-Arg-Gly Chemical compound NC(N)=NCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 AKXBNSZMYAOGLS-STQMWFEESA-N 0.000 description 1
- SMLCYZYQFRTLCO-UWJYBYFXSA-N Tyr-Cys-Ala Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CS)C(=O)N[C@@H](C)C(O)=O SMLCYZYQFRTLCO-UWJYBYFXSA-N 0.000 description 1
- WAPFQMXRSDEGOE-IHRRRGAJSA-N Tyr-Glu-Gln Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O WAPFQMXRSDEGOE-IHRRRGAJSA-N 0.000 description 1
- BSCBBPKDVOZICB-KKUMJFAQSA-N Tyr-Leu-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O BSCBBPKDVOZICB-KKUMJFAQSA-N 0.000 description 1
- ARJASMXQBRNAGI-YESZJQIVSA-N Tyr-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N ARJASMXQBRNAGI-YESZJQIVSA-N 0.000 description 1
- BJCILVZEZRDIDR-PMVMPFDFSA-N Tyr-Leu-Trp Chemical compound C([C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(O)=O)C1=CC=C(O)C=C1 BJCILVZEZRDIDR-PMVMPFDFSA-N 0.000 description 1
- FGVFBDZSGQTYQX-UFYCRDLUSA-N Tyr-Phe-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O FGVFBDZSGQTYQX-UFYCRDLUSA-N 0.000 description 1
- GQVZBMROTPEPIF-SRVKXCTJSA-N Tyr-Ser-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O GQVZBMROTPEPIF-SRVKXCTJSA-N 0.000 description 1
- XUIOBCQESNDTDE-FQPOAREZSA-N Tyr-Thr-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N)O XUIOBCQESNDTDE-FQPOAREZSA-N 0.000 description 1
- PWKMJDQXKCENMF-MEYUZBJRSA-N Tyr-Thr-Leu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O PWKMJDQXKCENMF-MEYUZBJRSA-N 0.000 description 1
- ASQFIHTXXMFENG-XPUUQOCRSA-N Val-Ala-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O ASQFIHTXXMFENG-XPUUQOCRSA-N 0.000 description 1
- RUCNAYOMFXRIKJ-DCAQKATOSA-N Val-Ala-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN RUCNAYOMFXRIKJ-DCAQKATOSA-N 0.000 description 1
Abstract
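The present invention relates to a method for producing a polypeptide having β-hexosaminidase activity, the method comprising the steps of: a) providing a yeast cell comprising a polynucleotide encoding a polypeptide that has β-hexosaminidase activity and an amino acid sequence at least 95% identical to the amino acid sequence shown in SEQ ID NO:1, b) culturing the yeast cell under conditions that allow production of the polypeptide, and c) obtaining the polypeptide produced in step b). The invention further relates to a polynucleotide encoding a polypeptide that has β-hexosaminidase activity and an amino acid sequence at least 95% identical to the amino acid sequence shown in SEQ ID NO:1, and to a polypeptide encoded by said polynucleotide. In addition, the invention relates to a yeast cell comprising the polynucleotide of the invention.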
Description
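Technical Field
The present invention relates to a method for producing a polypeptide having β-hexosaminidase activity, the method comprising the steps of: a) providing a yeast cell comprising a polynucleotide encoding a polypeptide that has β-hexosaminidase activity and an amino acid sequence at least 95% identical to the amino acid sequence shown in SEQ ID NO:1 or 16, b) culturing the yeast cell under conditions that allow production of the polypeptide, and c) obtaining the polypeptide produced in step b). The invention further relates to a polynucleotide encoding a polypeptide that has β-hexosaminidase activity and an amino acid sequence at least 95% identical to the amino acid sequence shown in SEQ ID NO:1, and to a polypeptide encoded by said polynucleotide. In addition, the invention relates to a yeast cell and a vector comprising the polynucleotide of the invention.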
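Background
β-Hexosaminidase (EC 3.2.1.52, abbreviated herein as "b-Hex") is an enzyme that catalyzes the hydrolysis of terminal non-reducing N-acetylhexosamine residues in N-acetyl-β-hexosaminides. The enzyme is also commonly called N-acetyl-β-glucosaminidase. N-acetylglucosides and N-acetylgalactosides are substrates.
Three main forms of β-hexosaminidase are found in mammals: a trimer composed of one α chain, one β-A chain and one β-B chain (form A), a tetramer composed of two β-A chains and two β-B chains (form B), and a homodimer of two α chains (form S). Some genetic disorders, such as Tay-Sachs disease and Sandhoff's disease, are known to be caused by mutations in the human b-Hex genes.
Glycosidases have been used as tools in glycobiology research for decades, and their role in glycoprotein maturation has been studied (reviewed by Léonard R, Strasser R, Altmann F. Plant glycosidases acting on protein-linked oligosaccharides. Phytochemistry. 2009 Feb;70(3):318-24. doi:10.1016/j.phytochem.2009.01.006. Epub 2009 Feb 4. PMID: 19200565).
The β-hexosaminidase preparations currently used for glycan modification are extracted from their natural source, jack beans (Canavalia ensiformis). A basic description of the enzyme, and the basis of the current extraction method, can be found in Li et al. (J. Biol. Chem. 1970 245:5153-5160). The enzyme has been used, for example, to study the enzymatic detachment of biofilms (J Med Microbiol. 2006 Aug;55(Pt 8):999-1008).
However, this current extraction method has several drawbacks:
As a field-grown plant, the jack bean has the drawbacks of such natural systems: poor reproducibility caused by weather, soil and similar conditions (see Li (1970)). As a result, new secondary metabolites may arise and be carried into the production process, where they may ultimately contaminate the drug substance, with incalculable effects on patient health. The use of agrochemicals to maintain fertility and prevent damage to the plants may leave residues in the product. The presence of fungi or other microbial contaminants in the soil, on the plants or on the beans during cultivation or storage may contaminate the product with toxins (such as aflatoxins) that can be highly toxic even in small amounts.
Because plants do not need large amounts of this enzyme under natural conditions, b-Hex is not a very abundant protein in jack beans. It is present at a very low activity of only about 1 U/g of bean material and must therefore be extracted from a large quantity of contaminating proteins and then separated and purified. This difficult procedure creates a high demand for the plant material used as starting material and makes the method very expensive.
Apart from the limited data on b-Hex described in the early 1970s (see Li et al., supra), little is known about this enzyme. Notably, no protein or DNA sequence is publicly available, and no detailed structural characterization of the enzyme has been carried out.
Gers-Barlag et al. describe the isolation of a β-hexosaminidase from soybean (Phytochemistry, vol. 27, no. 12, 1988, pp. 3739-3741). US 2004/0031072 discloses the sequence of a β-hexosaminidase from soybean (as SEQ ID NO:162900). Soybean β-hexosaminidase sequences can also be accessed via UniProt (see accession numbers I1KTU6 or I1JDS6, which corresponds to NCBI Reference Sequence XP_003518662.1).
CN 109 971 736 describes the identification of a hexosaminidase from strawberry.
Slámová et al. describe the cloning and high-yield expression of a fungal β-N-acetylhexosaminidase in Pichia pastoris (Protein Expr Purif. 2012 Mar;82(1):212-7. doi:10.1016/j.pep.2012.01.004. Epub 2012 Jan 11).
Strasser describes the heterologous expression of three putative b-Hex sequences present in the genome of Arabidopsis (Arabidopsis thaliana) (Strasser et al., Plant Physiol. 2008 Jun;147(2):931). The authors used the Spodoptera frugiperda Sf21 insect cell system for expression. They also showed that these plant enzymes share only very limited homology, about 30%, with the well-studied human b-Hex enzymes HexA and HexB. It is therefore not surprising that the microbial expression in the methylotrophic yeast Ogataea minuta described for human HexA by Akeboshi et al. cannot simply be transferred to the plant enzymes (Akeboshi et al., Appl Environ Microbiol. 2007 Aug;73(15):4805-12). This is particularly true because the authors describe two major differences between the recombinant HexA from O. minuta and the native HexA from human lysosomes, which correspond to the different post-translational processing in the two organisms.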
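Summary of the Invention
In the context of the studies underlying this application, the b-Hex enzyme isolated from jack bean (Canavalia ensiformis) was analyzed in detail in order to determine as much of the protein sequence as possible. This was done by a combination of protease digestion, Edman sequencing and LC-MS/MS analysis and resulted in about 40% protein sequence coverage (Example 2). With this result, database searches confirmed that there was in fact no match to the sequences found. The closest sequence found belongs to a b-Hex protein from soybean (Glycine max). In addition, the full-length cDNA sequence encoding the jack bean b-Hex enzyme was determined (Example 3). Alignment with the sequences available in databases revealed that no known sequence matches the determined sequence, so the β-hexosaminidase polypeptide detected appears not to have been known to the public.
Advantageously, the β-hexosaminidase polypeptide can be expressed in a microbial system, namely in Komagataella phaffii (sometimes also called Pichia pastoris). The supernatant of cultures of the resulting Komagataella phaffii strain showed significant amounts of b-Hex activity (Example 4). More than 100 U/mL of culture could be obtained. In addition, the culture supernatant does not contain large amounts of contaminating proteins, which allows a straightforward and reproducible protein purification process.
The Kluyveromyces lactis expression system described by Swennen (2002) was chosen as a second example. Recombinant expression of b-Hex was also successful in this yeast system, since biologically active b-Hex was found in the respective yeast cultures.
It was shown that the b-Hex enzyme from jack bean does not exist as a single polypeptide chain, as would be expected from translating the DNA sequence found into a protein sequence. Instead, two polypeptide chains were found that associate with each other without a covalent link. Dissociation of the two chains leads to complete loss of activity. It was therefore surprising that recombinant microbial expression yielded active enzyme, because the identified b-Hex enzyme was not expected to be reproduced in such a way that it is split into two chains while the correct structural association of the two chains is ensured.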
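Accordingly, the present invention relates to a method for producing a polypeptide having β-hexosaminidase activity, the method comprising the steps of:
a) providing a host cell comprising a polynucleotide encoding a polypeptide that has β-hexosaminidase activity and an amino acid sequence at least 85% identical to the amino acid sequence shown in SEQ ID NO:1 or 16,
b) culturing the host cell under conditions that allow production of the polypeptide, and
c) obtaining the polypeptide produced in step b).
The invention further relates to a polynucleotide encoding a polypeptide that has β-hexosaminidase activity and an amino acid sequence at least 85% identical to the amino acid sequence shown in SEQ ID NO:1 or 16.
The invention further encompasses an isolated polypeptide encoded by the polynucleotide of the invention.
In addition, the invention relates to a vector comprising the polynucleotide of the invention. In some embodiments, the vector is an expression vector.
The invention further relates to a host cell comprising the polynucleotide of the invention, the polypeptide of the invention and/or the vector of the invention.
In some embodiments, the host cell of the invention is a yeast cell or an animal cell. For example, the host cell may be a yeast cell belonging to the family Saccharomycetaceae, such as a Komagataella phaffii cell.
In some embodiments, the polypeptide of the invention has an amino acid sequence at least 90% identical, such as 95% or 98% identical, to the amino acid sequence shown in SEQ ID NO:1 or 16. In some embodiments, the polypeptide comprises the amino acid sequence shown in SEQ ID NO:1 or 16.
In some embodiments, the polynucleotide of the invention comprises the nucleic acid sequence shown in SEQ ID NO:2. In some embodiments, the polynucleotide of the invention comprises the nucleic acid sequence shown in SEQ ID NO:17.
In some embodiments, the polynucleotide of the invention is operably linked to a heterologous promoter.
In some embodiments, the polynucleotide of the invention is codon-optimized for a host cell, such as a yeast cell.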
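Detailed Description - Definitions
As stated above, the present invention relates to a method for producing a polypeptide having β-hexosaminidase activity, the method comprising the steps of:
a) providing a host cell comprising a polynucleotide encoding a polypeptide that has β-hexosaminidase activity and an amino acid sequence at least 85% identical to the amino acid sequence shown in SEQ ID NO:1 or 16,
b) culturing the host cell under conditions that allow production of the polypeptide, and
c) obtaining the polypeptide produced in step b).
In step a) of the method of the invention, a host cell comprising a polynucleotide encoding a polypeptide having β-hexosaminidase activity is provided.
The term "polynucleotide" as used herein refers to a linear or circular nucleic acid molecule. It encompasses DNA molecules as well as RNA molecules. The polynucleotide is provided as an isolated polynucleotide (i.e., isolated from its natural environment) or in genetically modified form. A polynucleotide as described herein is characterized in that it encodes a polypeptide as mentioned above, i.e., a polypeptide having β-hexosaminidase activity.
The terms "polypeptide" and "protein" are used interchangeably herein and refer to amino acids in polymeric form that are linked together by peptide bonds.
The polypeptide produced by the method of the invention has β-hexosaminidase activity.
As used herein, β-hexosaminidase (EC 3.2.1.52) typically refers to an enzyme capable of catalyzing the hydrolysis of terminal non-reducing N-acetylhexosamine residues in N-acetyl-β-hexosaminides. For example, N-acetylglucosides and N-acetylgalactosides are substrates. Assays for assessing whether a polypeptide has β-hexosaminidase activity are known in the art and are described, for example, in Li & Li (1970) J Biol Chem 245:5153, where β-hexosaminidase activity is shown toward the following substrates: p-nitrophenyl β-2-acetamido-2-deoxy-D-glucopyranoside and p-nitrophenyl β-2-acetamido-2-deoxy-D-galactopyranoside. Synonyms are β-hexosaminidase, β-(1-2,3,4,6) hexosaminidase, β-acetylamino-deoxyhexosidase, N-acetyl-β-D-hexosaminidase, N-acetyl-β-hexosaminidase, β-acetylhexosaminidase, β-D-N-acetylhexosaminidase, β-N-acetyl-D-hexosaminidase, β-N-acetylglucosaminidase, N-acetylhexosaminidase and β-D-hexosaminidase.
In some embodiments, the polypeptide having β-hexosaminidase activity forms a homodimer.
In some embodiments, the polypeptide having β-hexosaminidase activity is expressed from a heterologous polynucleotide, i.e., from a polynucleotide that has been introduced into the host cell transiently or stably, for example by using an expression vector. The term "heterologous" as used herein means that the polynucleotide does not occur naturally in the host cell. The term thus encompasses modified or unmodified polynucleotides derived from a different organism as well as modified polynucleotides derived from the host cell. It is to be understood that the heterologous polynucleotide may comprise expression control sequences that allow expression in the host cell, or sequences that allow integration of the heterologous polynucleotide at a locus in the genome of the host cell, in which case expression of the heterologous polynucleotide is controlled by endogenous expression control sequences of the host cell. Introducing a heterologous polynucleotide produces a transgenic host cell.
A polypeptide having β-hexosaminidase activity can be introduced by introducing a heterologous polynucleotide encoding the polypeptide into the host cell. The term "introduction" or "transformation" as referred to herein encompasses the transfer of a polynucleotide as described herein into a host cell, irrespective of the method used for the transfer. This includes transient introduction on an expression vector or stable integration into the genome of the host cell. In some embodiments, the polynucleotide is stably introduced into the genome of the host cell.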
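Accordingly, step a) of the method of the invention may comprise the following steps:
a1) introducing a polynucleotide encoding a polypeptide having β-hexosaminidase activity into a host cell; and a2) expressing the polypeptide from the polynucleotide.
The term "expression" or "gene expression" means the transcription of one or more specific genes or specific gene constructs. In particular, it means the transcription of one or more genes or gene constructs into structural mRNA, followed by translation of the latter into a polypeptide as referred to herein. The process includes transcription of DNA and processing of the resulting mRNA product.
As stated above, the polypeptide encoded by the polynucleotide of the invention has β-hexosaminidase activity. In addition, it has an amino acid sequence at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical to the amino acid sequence shown in SEQ ID NO:1 or 16.
In some embodiments, the polypeptide having β-hexosaminidase activity has an amino acid sequence at least 95% identical, such as at least 98% identical, to the amino acid sequence shown in SEQ ID NO:1 or 16.
In some embodiments, the polypeptide having β-hexosaminidase activity comprises the amino acid sequence shown in SEQ ID NO:16.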
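SEQ ID NO:16 is the amino acid sequence of the jack bean (Canavalia ensiformis) β-hexosaminidase identified in the studies underlying the invention. The sequence is as follows:
The potential leader sequence (aa 1 to 30) is underlined. In the studies underlying the invention, the polypeptide was expressed without the leader sequence. SEQ ID NO:1 is the amino acid sequence of jack bean (Canavalia ensiformis) β-hexosaminidase without the leader sequence. SEQ ID NO:1 thus comprises aa 31 to 553 of SEQ ID NO:16. SEQ ID NO:1 is as follows: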
AAAPVKNYYA RRAPSGPGSC YEQ
In one embodiment, SEQ ID NO:1 additionally comprises a methionine residue (M) at the N-terminus.
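The relationship between the two polypeptide sequences can be checked with a few lines of Python. The sketch below is illustrative only: it uses just the first residues of SEQ ID NO:16 and SEQ ID NO:1 as given in the sequence listing (converted to one-letter code), and the helper names are hypothetical.

```python
# First 48 residues of SEQ ID NO:16 and first 18 residues of SEQ ID NO:1,
# transcribed from the sequence listing into one-letter code.
SEQ16_START = "MFLCIPRWFSSPLLILFVIYCALFAPQAASATLKSIIEPTESLTYLWP"
SEQ1_START = "ATLKSIIEPTESLTYLWP"

LEADER_LENGTH = 30   # potential leader sequence, aa 1 to 30
FULL_LENGTH = 553    # length of SEQ ID NO:16


def mature_region(full_sequence, leader_length=LEADER_LENGTH):
    """Return the sequence without the N-terminal leader (aa 31 onward)."""
    return full_sequence[leader_length:]


def with_n_terminal_met(sequence):
    """Return the sequence with an additional N-terminal methionine."""
    return "M" + sequence


# Residues 31 to 48 of SEQ ID NO:16 are the first 18 residues of SEQ ID NO:1.
print(mature_region(SEQ16_START) == SEQ1_START)
# Removing 30 leader residues from 553 leaves the 523 residues of SEQ ID NO:1.
print(FULL_LENGTH - LEADER_LENGTH)
```

Python slices are zero-based, so `full_sequence[30:]` starts at residue 31 in the one-based numbering used in the text.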
In one embodiment, the above polypeptide is encoded by a polynucleotide comprising the nucleic acid sequence shown in SEQ ID NO:2:
gctactttgaagtccatcatcgagccaactgagtccttgacttacttgtggccattgccagctgacttcacttctggtgacgaaactttgtctgttgacccagctttgactttgtccgttgctggtaatggtggtggttcctccattttgagagatgctttcgacagatacagaggtattatcttcaagcactcctccgttggattctctttgatcagaaagttgagagagagattggtttccgtttccgcttacgacattgctactttgaagatcactgttcactccgacaacgaagagttgcagttgggtgttgacgagacttacactttgttggttccaaaggctaaggactcctacgttgctggtgaggttactatcgaggctaacactgtttacggtgctttgagaggtttggagactttctcccagttgtgttccttcgactactctgacaagactatcaagatttacaaggctccttggtccatccaggacaagccaagattttcctacagaggtttgttgttggacacttccagacactacttgccaatcaacgttatcaagcagatcatcgagtccatgtcctacgctaagttgaacgttttgcactggcacatcatcgacgaagagtctttcccattggaggttccaacttacccaaacttgtggaagggttcctacactaagtgggagagatacactgttgaggacgcttacgagatcgttaacttcgctaagatgagaggtattaacgttatggctgaggttgacgttccaggtcatgctgaatcttggggtgctggttatccaaatttgtggccatctccatcctgtagagagccattggacgtttccaagaacttcactttcgacgttatctccggaatcttgactgacatcagaaagatattcccattcgagttgttccacttgggaggtgacgaggttaatactgactgttggacttccacttcccacgttaaggaatggttgtccactcagaacatgactgctaaggatgcttacgaatacttcgttttgaaggctcaagagatcgctgtttctaagaactggtcccctgttaactgggaagagactttcaacactttcccagctaagttgcacaagaaaactgttgttcacaactggttgggtccaggtgtttgtccaaaggttgttgctaagggtttcagatgtatcttctccaaccagggtgtttggtacttggaccacttggatgttccttgggacgaggtttacactgctgaaccattggaaggtatcgagaagtcctctgagcaagagttggttatcggtggtgaagtttgtatgtggggtgagactgctgacacttctaacgttcagcagactatctggccaagagccgcagctgctgctgaaagattgtggtcccaaagagactccactaacatcactgttactgctttgccaagattgcagaacttcagatgtttgttgaacaagagaggtgttgctgctgctccagttaagaactactacgctagaagagccccatccggtccaggttcttgttacgaacaa
SEQ ID NO:2 may further comprise a start codon (ATG) at the 5' end and one or more stop codons at the 3' end.
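As a rough illustration of how SEQ ID NO:2 relates to SEQ ID NO:1, the following sketch translates the first ten codons of SEQ ID NO:2 with the standard genetic code and shows the effect of adding a start codon and a stop codon. The function names and the choice of TAA as the stop codon are assumptions made for the example.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence codon by codon ('*' marks a stop)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def add_start_and_stop(cds, stop_codon="TAA"):
    """Add a 5' ATG start codon and a 3' stop codon (TAA assumed here)."""
    return "ATG" + cds.upper() + stop_codon


# First 30 nucleotides of SEQ ID NO:2 as given in the sequence listing.
SEQ2_START = "gctactttgaagtccatcatcgagccaact"

print(translate(SEQ2_START))                      # ATLKSIIEPT
print(translate(add_start_and_stop(SEQ2_START)))  # MATLKSIIEPT*
print(1569 == 523 * 3)                            # True
```

The translated fragment matches the first ten residues of SEQ ID NO:1 (Ala Thr Leu Lys Ser Ile Ile Glu Pro Thr), and the 1569 nucleotides of SEQ ID NO:2 correspond to exactly 523 codons.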
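When the polypeptide is expressed in a host cell, it may be processed further. For example, the polypeptide may be processed into two subunits, wherein the first subunit comprises amino acids 35 to 100 of SEQ ID NO:16 and the second subunit comprises amino acids 110 to 553 of SEQ ID NO:16. The start and end of the first subunit and the start of the second subunit may vary slightly. For example, a subunit comprising amino acids 34 to 101 was also detected.
In addition, the polypeptide may be hexosylated and/or glycosylated. For example, the first subunit may be hexosylated.
In one embodiment, the polypeptide having the sequence shown in SEQ ID NO:16 is encoded by a polynucleotide having the sequence shown in SEQ ID NO:17. The sequence is as follows: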
1561 TTCAGATGTC TATTGAATAA ACGTGGAGTT GCAGCTGCTC CTGTGAAAAA TTATTATGCT
1621 AGAAGGGCTC CTAGTGGTCC AGGCTCATGT TATGAGCAAT AA
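In one embodiment, the polynucleotide encoding the polypeptide having β-hexosaminidase activity is codon-optimized for a host cell, such as a human cell. For example, the polynucleotide may comprise the sequence shown in SEQ ID NO:18:
A polynucleotide comprising the sequence shown in SEQ ID NO:18 encodes a polypeptide having β-hexosaminidase activity, wherein the polypeptide has the sequence shown in SEQ ID NO:1.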
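"Percent (%) amino acid sequence identity" with respect to a reference polypeptide sequence is defined as the percentage of amino acid residues in a candidate sequence that are identical to the amino acid residues in the reference polypeptide sequence, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity. In some embodiments, standard parameters are applied to determine the degree of sequence identity of two sequences. For example, the degree of identity is determined by comparing two optimally aligned sequences over a comparison window, wherein the fragment of the amino acid sequence in the comparison window may comprise additions or deletions (e.g., gaps or overhangs) as compared with the reference sequence (which does not comprise additions or deletions) for optimal alignment. The percentage is calculated by determining the number of positions at which the identical amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the comparison window, and multiplying the result by 100 to yield the percentage of sequence identity. Optimal alignment of sequences for comparison may be conducted by the local homology algorithm of Smith and Waterman, Adv. Appl. Math. 2:482 (1981); by the homology alignment algorithm of Needleman and Wunsch, J. Mol. Biol. 48:443 (1970); by the search for similarity method of Pearson and Lipman, Proc. Natl. Acad. Sci. (USA) 85:2444 (1988); by computerized implementations of these algorithms (GAP, BESTFIT, BLAST, FASTA and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group (GCG), 575 Science Dr., Madison, WI); or by visual inspection. In some embodiments, the degree of sequence identity is determined over the entire length of the sequences. Given that two sequences have been identified for comparison, GAP and BESTFIT are preferably employed to determine their optimal alignment and thus the degree of identity. Preferably, the default values of 5.00 for gap weight and 0.30 for gap weight length are used. In one embodiment, the sequence identity between two amino acid sequences is determined using the Needleman and Wunsch algorithm (Needleman 1970, J. Mol. Biol. (48):444-453), which has been incorporated into the needle program in the EMBOSS software package (EMBOSS: The European Molecular Biology Open Software Suite, Rice, P., Longden, I. and Bleasby, A., Trends in Genetics 16(6), 276-277, 2000), a BLOSUM62 scoring matrix, a gap opening penalty of 10 and a gap extension penalty of 0.5. A non-limiting example of parameters to be used for aligning two amino acid sequences with the needle program are the default parameters, including the EBLOSUM62 scoring matrix, a gap opening penalty of 10 and a gap extension penalty of 0.5.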
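The percentage calculation described above (matched positions divided by the total number of positions in the comparison window, multiplied by 100) can be sketched as follows. The sketch assumes the two sequences have already been aligned by a separate tool and are supplied with gap characters; it does not perform the alignment itself, and the example sequences and function names are purely hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over a pre-computed pairwise alignment.

    Both strings must have the same length; gap columns count as positions
    in the comparison window but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a, aligned_b, threshold_percent):
    """True if the aligned pair reaches the given identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


# Hypothetical 10-column alignment: one substitution and one gap.
reference = "ATLKSIIEPT"
candidate = "ATLRSI-EPT"
print(percent_identity(reference, candidate))     # 80.0
print(meets_threshold(reference, candidate, 85))  # False
```

Here the denominator is the full alignment length, including gap columns. Other conventions, such as dividing by the length of the shorter sequence, give different values, so the convention should be stated whenever a threshold such as 85% or 95% is applied.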
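A polynucleotide as referred to herein may essentially consist of, or comprise, the nucleic acid sequences described above. Thus, it may also contain additional nucleic acid sequences.
In some embodiments, the polynucleotide encoding the polypeptide having β-hexosaminidase activity is operably linked to a promoter, such as a heterologous promoter. Typically, a promoter comprises regulatory elements that mediate expression of a coding sequence segment in a host cell.
In one embodiment, the promoter is a constitutive promoter. In an alternative embodiment, the promoter is an inducible promoter.
A "promoter" or "promoter sequence" is a nucleotide sequence located on the same strand as a gene and upstream of that gene, which enables transcription of the gene. The promoter is followed by the transcription start site of the gene. The promoter is recognized by RNA polymerase (together with any required transcription factors), which initiates transcription. A functional fragment or functional variant of a promoter is a nucleotide sequence that can be recognized by RNA polymerase and is capable of initiating transcription.
An "active promoter fragment", "active promoter variant", "functional promoter fragment" or "functional promoter variant" describes a fragment or variant of the nucleotide sequence of a promoter that still has promoter activity.
A promoter may be an "inducer-dependent promoter" or an "inducer-independent promoter", the latter comprising constitutive promoters and promoters under the control of other cellular regulatory factors.
The person skilled in the art is able to select a suitable promoter for expressing the polypeptide of interest. For example, the polynucleotide encoding the polypeptide of interest is typically operably linked to an "inducer-dependent promoter" or an "inducer-independent promoter". In addition, the polynucleotide encoding the polypeptide having β-hexosaminidase activity is typically operably linked to an "inducer-independent promoter", such as a constitutive promoter.
An "inducer-dependent promoter" is understood herein as a promoter whose activity in enabling transcription of the gene to which it is operably linked increases upon addition of an "inducer molecule" to the fermentation medium. Thus, for an inducer-dependent promoter, the presence of the inducer molecule triggers, via signal transduction, an increase in expression of the gene operably linked to the promoter.
In one embodiment, the promoter is a CMV promoter. For example, the CMV promoter may be used when the polypeptide having β-hexosaminidase activity is expressed in a mammalian host cell, such as a HEK-293 host cell.
In another embodiment, the promoter is a Tac promoter. For example, the Tac promoter may be used when the polypeptide having β-hexosaminidase activity is expressed in a yeast host cell, such as the yeast cells disclosed below. The Tac promoter (abbreviated Ptac) is a synthetically produced DNA promoter generated from a combination of the promoters of the trp and lac operons. It is commonly used for protein production.
In one embodiment, the promoter is the promoter of a polynucleotide encoding an alcohol oxidase, such as the promoter of yeast AOX1 (alcohol oxidase 1).
The term "operably linked" typically refers to a functional linkage between the promoter sequence and the gene of interest (i.e., the polynucleotide encoding the polypeptide having β-hexosaminidase activity), such that the promoter sequence is able to initiate transcription of the gene of interest.
In addition, a polynucleotide as referred to herein may be operably linked to a terminator. The term "terminator" typically encompasses a control sequence that is a DNA sequence at the end of a transcriptional unit and that signals 3' processing and polyadenylation of the primary transcript as well as termination of transcription.
A polynucleotide as referred to herein may further be operably linked to a polynucleotide encoding a secretion leader sequence, i.e., a sequence that allows secretion of the β-hexosaminidase of the invention into the culture medium.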
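The host cell provided in step a) of the method of the invention may be any host cell deemed appropriate. For example, the host is selected from bacterial cells (such as E. coli cells), yeast cells, algal cells or plant cells. The term "host cell" further includes animal cells, such as non-human animal cells.
In some embodiments, the host cell is a eukaryotic host cell.
In some embodiments, the host cell is a yeast cell.
In some embodiments, the yeast cell belongs to the family Saccharomycetaceae, a family of yeasts in the order Saccharomycetales that reproduce by budding. In some embodiments, the Saccharomycetaceae include the following genera: Candida, Kluyveromyces, Komagataella, Kuraishia, Lachancea, Nakaseomyces, Pichia, Saccharomyces, Spathaspora, Tetrapisispora, Zygosaccharomyces and Zygotorulaspora.
In some embodiments, the yeast cell belongs to the genus Kluyveromyces. For example, the yeast cell may be a Kluyveromyces lactis cell.
In some embodiments, the yeast cell belongs to the genus Pichia. For example, the yeast cell may be a Pichia pastoris cell.
In some embodiments, the yeast cell belongs to the genus Komagataella. For example, the yeast cell may be a Komagataella phaffii cell, such as a cell of Komagataella phaffii strain ATCC 76273. More information on this strain can be found in the UniProt database (see Taxon identifier 981350).
In some embodiments, the host cell is not a Canavalia ensiformis cell.
In some embodiments, the host cell is a mammalian host cell. Suitable mammalian cells include, but are not limited to, CHO (Chinese hamster ovary) cells, BHK cells, HeLa cells, COS cells, HEK-293 cells and the like. In one embodiment, HEK-293 cells are used. In another embodiment, CHO cells are used.
Step b) of the method of the invention comprises culturing the host cell under conditions that allow production of the polypeptide having β-hexosaminidase activity. Such conditions are well known in the art and are described, for example, in the Examples section.
The method of the invention may further comprise step c) of obtaining the polypeptide produced in step b). The polypeptide is obtained from the culture medium by methods known in the art.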
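The invention further relates to a polynucleotide as defined above in connection with the method of the invention, i.e., a polynucleotide encoding a polypeptide that has β-hexosaminidase activity and an amino acid sequence at least 85% identical to the amino acid sequence shown in SEQ ID NO:1.
The invention further encompasses an isolated polypeptide encoded by the polynucleotide of the invention. The polypeptide has been defined above. The isolated polypeptide may be hexosylated and/or glycosylated.
The invention further relates to a host cell comprising the polynucleotide of the invention, the polypeptide of the invention and/or the vector of the invention.
In addition, the invention relates to a vector comprising the polynucleotide of the invention. In some embodiments, the vector is an expression vector.
The term "vector" typically encompasses phage, plasmid, viral or retroviral vectors as well as artificial chromosomes, such as bacterial or yeast artificial chromosomes. The term also relates to targeting constructs, which allow random or site-directed integration of the targeting construct into genomic DNA. Such targeting constructs preferably comprise DNA of sufficient length for homologous or heterologous recombination as described in detail below. A vector containing the polynucleotide of the invention preferably further comprises a selectable marker for propagation and/or selection in a host. The vector may be incorporated into a host cell by various techniques well known in the art. If introduced into a host cell, the vector may reside in the cytoplasm or may be incorporated into the genome. In the latter case, the vector may further comprise nucleic acid sequences that allow homologous recombination or heterologous insertion. Vectors can be introduced into prokaryotic or eukaryotic cells via conventional transformation or transfection techniques. The terms "transformation" and "transfection", conjugation and transduction, as used herein, are intended to include a variety of prior-art methods for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate, rubidium chloride or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, natural competence, carbon-based clusters, chemically mediated transfer, electroporation or particle bombardment (e.g., "gene gun"). Suitable methods for transforming or transfecting host cells, including yeast cells, can be found in Sambrook et al. (Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989) and other laboratory manuals, such as Methods in Molecular Biology, 1995, vol. 44, Agrobacterium protocols, Gartland and Davey, eds., Humana Press, Totowa, New Jersey. Alternatively, a plasmid vector may be introduced by heat shock or electroporation techniques.
In some embodiments, the vector referred to herein is suitable as a cloning vector, i.e., it is replicable in microbial systems, such as in E. coli or in yeast cells.
In addition, it is envisaged that the vector of the invention is an expression vector. In such an expression vector, the polynucleotide comprises an expression cassette as specified above that allows expression in a host cell. In addition to the polynucleotide of the invention, the expression vector may comprise further regulatory elements, such as a promoter (e.g., a promoter as described elsewhere herein). Preferably, the expression vector is also a gene transfer or targeting vector.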
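List of Embodiments
1. A method for producing a polypeptide having β-hexosaminidase activity, the method comprising the steps of:
a) providing a yeast cell comprising a polynucleotide encoding a polypeptide that has β-hexosaminidase activity and an amino acid sequence at least 95% identical to the amino acid sequence shown in SEQ ID NO:1 or 16,
b) culturing the yeast cell under conditions that allow production of the polypeptide, and
c) obtaining the polypeptide produced in step b).
2. The method according to embodiment 1, wherein the polypeptide having β-hexosaminidase activity has an amino acid sequence at least 98% identical to the amino acid sequence shown in SEQ ID NO:1.
3. The method according to embodiment 1 or 2, wherein the polypeptide having β-hexosaminidase activity comprises the amino acid sequence shown in SEQ ID NO:1.
4. The method according to any one of embodiments 1 to 3, wherein the yeast cell belongs to the family Saccharomycetaceae.
5. The method according to embodiment 4, wherein the yeast cell is a Komagataella cell, such as a Komagataella phaffii cell, such as a cell of Komagataella phaffii strain ATCC 76273.
6. The method according to any one of embodiments 1 to 5, wherein the polynucleotide encoding the polypeptide having β-hexosaminidase activity is operably linked to a heterologous promoter.
7. The method according to any one of embodiments 1 to 6, wherein the polynucleotide encoding the polypeptide having β-hexosaminidase activity is codon-optimized for the yeast cell.
8. The method according to any one of embodiments 1 to 7, wherein the polynucleotide comprises the nucleic acid sequence shown in SEQ ID NO:2 or 17.
9. A polynucleotide encoding a polypeptide that has β-hexosaminidase activity and an amino acid sequence at least 95% identical to the amino acid sequence shown in SEQ ID NO:1.
10. The polynucleotide according to embodiment 9, wherein the polypeptide having β-hexosaminidase activity comprises the amino acid sequence shown in SEQ ID NO:1.
11. The polynucleotide according to embodiment 9, wherein the polynucleotide is operably linked to a heterologous promoter.
12. A vector, such as an expression vector, comprising the polynucleotide according to any one of embodiments 9 to 11.
13. A yeast cell comprising the polynucleotide according to any one of embodiments 9 to 11 or the vector according to embodiment 12.
14. The yeast cell according to embodiment 13, wherein the yeast cell belongs to the family Saccharomycetaceae.
15. The yeast cell according to embodiment 14, wherein the yeast cell is Komagataella phaffii.
16. An isolated polypeptide encoded by the polynucleotide according to any one of embodiments 9 to 11.
The following examples merely illustrate the invention. They should not be construed in any way as limiting the scope of protection.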
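Examples
Example 1: Introduction
In the studies underlying the invention, the mRNA sequence and the protein sequence of the β-hexosaminidase from Canavalia ensiformis were determined. First, parts of the sequence of the β-hexosaminidase protein that had been extracted from Canavalia ensiformis plants were determined by preparative digestion, MS/MS and N-terminal sequencing. Subsequently, the cDNA sequence was determined by 3' and 5' RACE (rapid amplification of cDNA ends).
Database searches in NCBI and KEGG yielded mRNA-derived sequences of four β-hexosaminidases from soybean (Glycine max), the closest sequenced relative of jack bean (C. ensiformis), with the following NCBI IDs:
· Chromosome 2 (1668 nt); cDNA XM_003518614.2; protein XP_003518662.1
· Chromosome 10 (1632 nt); cDNA XM_003535730.2; protein XP_003535778.1
· Chromosome 18 (1698 nt); cDNA XM_003552624.2; protein XP_003552672.1
· Chromosome 20 (1641 nt); cDNA XM_003555573.2; protein XP_003555621.1
These sequences served as the basis for primer design and for comparison with the elucidated sequence.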
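Example 2: Determination of parts of the protein sequence of the β-hexosaminidase from Canavalia ensiformis
β-Hexosaminidase purified from Canavalia ensiformis (with an apparent molecular weight of about 55 kDa) was digested with Lys-C, and the resulting peptides were separated by HPLC. Edman degradation was then performed on these fractions.
100 μl of β-hexosaminidase (about 77 μg) was vortexed with 29 mg of guanidine hydrochloride (solid) to reach a final concentration of about 3 M GuaHCl. 7 μl of 1.5 M Tris/HCl pH 8.8 was added and the mixture was briefly vortexed again. 3 μl was removed to test the pH on a strip (about pH 8.5). The remaining solution was denatured at -80°C for 20 min and quenched in an ice bath.
One vial of Lys-C (5 μg, Roche cat. no. 11047825001) was reconstituted with 50 μl of water. 5 μl of this solution (0.5 μg Lys-C) was added to the quenched solution, which was vortexed again and incubated at 32°C for 3 h. 95 μl was injected directly onto an Agilent 1200 HPLC equipped with a fraction collector, using a Waters column (X-Select CSH C18, 2.5 μm, 2.1 x 150 mm, cat. no. 186006727). The chromatographic separation yielded sharp peaks with fraction volumes of 100 to 150 μl (containing about 25% ACN solvent). 36 fractions were collected (not shown). These were used directly for MALDI-MS to determine the peptide masses (e.g., to estimate the number of Edman cycles).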
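The quantities stated in the digestion protocol can be cross-checked with simple arithmetic. The sketch below is only a plausibility check: it uses the molar mass of guanidine hydrochloride (95.53 g/mol), neglects the volume contributed by the dissolved solid, and assumes the 3 μl pH sample removes protein in proportion to its volume. The variable and function names are illustrative.

```python
GUANIDINE_HCL_G_PER_MOL = 95.53  # molar mass of CH6ClN3


def molarity(mass_mg, molar_mass_g_per_mol, volume_ul):
    """Molar concentration from mass (mg) and solution volume (microlitres)."""
    moles = (mass_mg / 1000.0) / molar_mass_g_per_mol
    litres = volume_ul / 1_000_000.0
    return moles / litres


# 29 mg guanidine hydrochloride in 100 ul protein solution (solid volume neglected).
guanidine_molar = molarity(29.0, GUANIDINE_HCL_G_PER_MOL, 100.0)

# Lys-C: 5 ug reconstituted in 50 ul water, of which 5 ul is used.
lys_c_ug = 5.0 / 50.0 * 5.0

# Protein left after removing 3 ul of the 107 ul mixture for the pH test.
protein_ug = 77.0 * (107.0 - 3.0) / 107.0
substrate_per_enzyme = protein_ug / lys_c_ug

print(round(guanidine_molar, 2))    # about 3 M, as stated
print(lys_c_ug)                     # 0.5 ug Lys-C
print(round(substrate_per_enzyme))  # roughly 150 ug protein per ug Lys-C
```

Under these assumptions the stated values are mutually consistent: about 3 M guanidine hydrochloride, 0.5 μg of Lys-C, and an enzyme-to-substrate ratio of roughly 1:150 by mass.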
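For some of the fractions obtained, the amino acid sequence could be determined by N-terminal Edman sequencing under standard conditions using an Applied Biosystems Procise HT or a Shimadzu PPSQ-33A sequencer. The number of cycles (= amino acids) for each fraction was estimated by MALDI-MS measurement.
Edman degradation of the fractions yielded a large number of sequences, which were aligned using ClustalW. In total, 208 of 553 amino acids were identified. The de novo sequenced peptides overlapped with the translated cDNA sequence identified as described in Example 3 below (not shown). The results indicate that the correct cDNA sequence was identified in Canavalia ensiformis.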
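The coverage figure follows directly from the two numbers given above. The short sketch below computes it and also shows, with purely hypothetical peptide positions, how overlapping peptides could be merged so that no residue is counted twice; the function names are illustrative.

```python
def coverage_percent(identified_residues, protein_length):
    """Sequence coverage as a percentage of the full-length protein."""
    return 100.0 * identified_residues / protein_length


def covered_positions(peptides):
    """Count distinct residues covered by (start, end) peptides, 1-based inclusive."""
    covered = set()
    for start, end in peptides:
        covered.update(range(start, end + 1))
    return len(covered)


# Values stated in the example: 208 of 553 amino acids identified.
print(round(coverage_percent(208, 553), 1))  # 37.6

# Hypothetical overlapping peptides, only to illustrate the counting.
print(covered_positions([(1, 10), (5, 20)]))  # 20
```

The result of about 37.6% agrees with the sequence coverage of "about 40%" mentioned in the summary.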
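Example 3: Determination of the cDNA sequence of the β-hexosaminidase from Canavalia ensiformis
Jack bean (Canavalia ensiformis) seeds were placed between moist absorbent tissue paper in a plastic tray and stored in the dark at room temperature for about 48 hours (for germination). The germinated seeds were then grown for a further 5-6 days at room temperature in the light. The plantlets were then placed in 3-6 mm vermiculite as substrate (2-3 cm deep) at a sunny window at room temperature and watered when dry.
Germinated material from Canavalia ensiformis was cut with a scalpel into portions suitable for RNA extraction (about 200 mg of plant material), placed in 50 ml plastic tubes and snap-frozen in liquid nitrogen. This was done with shoot, cotyledon, embryo and leaf tissue. RNA was isolated from these tissues according to the manufacturer's instructions (RNeasy Plant Mini Kit, Qiagen cat. no. 74903).
For shoot, cotyledon, embryo and leaf, cDNA was synthesized with two reverse transcriptases in each case. The cDNAs from the two reverse transcriptase reactions were then pooled separately for shoot, cotyledon, embryo and leaf.
Subsequently, an internal fragment of each cDNA was amplified by PCR using Phusion Hot Start II DNA polymerase (Thermo Scientific, cat. no. F-549L) and the following primers:
JB-01 CTCACCTACCTCTGGCCCCTTCCCGC (SEQ ID NO:3)
JB-07 TTATTGGTCATAACATGACCCTGGACCAACAGG (SEQ ID NO:4)
The amplified fragments were then subjected to DNA sequence analysis using the BigDye Terminator cycle sequencing kit (Applied Biosystems, USA) and the following primers:
JB-01 CTCACCTACCTCTGGCCCCTTCCCGC (SEQ ID NO:3)
JB-02 GAGGAGCTTCAATTTGGAGTGGATG (SEQ ID NO:5)
JB-06 ATCAGCTGTCTCACCCCACATGCAAACTTCTC (SEQ ID NO:6)
JB-07 TTATTGGTCATAACATGACCCTGGACCAACAGG (SEQ ID NO:4)
About 100 ng of PCR fragment (or 300 ng of plasmid DNA) and 10 pmol of primer were amplified with the BigDye Terminator cycle sequencing kit, purified with the DyeEx 2.0 Spin Kit and sequenced. The kits and equipment were used according to the manufacturers' instructions.
Then, 3' RACE and 5' RACE were performed with the cDNA obtained from cotyledon tissue.
The following primers were used:
For 3' RACE:
JB-08 AAGTTTGCATGTGGGGTGAGAC (SEQ ID NO:7)
JB-09 GCAAACAATATGGCCTAGAGCTG (SEQ ID NO:8)
CDSIII-short ATTCTAGAGGCCGAGGCGGCCGACATGT (SEQ ID NO:9)
Two PCRs were performed, one with JB-08 + CDSIII-short and one with JB-09 + CDSIII-short. The PCR fragments were sequenced using the JB-09 primer.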
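For 5' RACE:
JB-10 AAGAGTCCTTGGCTTTGGGAAC (SEQ ID NO:10)
Okib57 adapter 5'-pGTAGGAATTCGGGTTGTAGGGAGGTCGACATTGCC-3' (SEQ ID NO:11)
JB-01 CTCACCTACCTCTGGCCCCTTCCCGC (SEQ ID NO:3)
JB-11 TCAATGTCGCAATGTCATAGGC (SEQ ID NO:12)
JB-12 ATGAGACTGAACCCAACACTGC (SEQ ID NO:13)
Okib58 5'-GGCAATGTCGACCTCCCTACAAC-3' (SEQ ID NO:14)
Okib59 5'-CTCCCTACAACCCGAATTCCTAC-3' (SEQ ID NO:15)
cDNA was synthesized from cotyledon RNA with both reverse transcriptases using the specific primer JB-10. The two cDNAs were then pooled. The Okib57 adapter was ligated to the freshly synthesized cDNA. One PCR was performed with primers JB-11 and Okib58, and one PCR with primers JB-12 and Okib59. The resulting fragments were subcloned into PCR-Blunt-II-TOPO and sequenced as described above.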
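The relationship between some of the listed primers can be made explicit by computing reverse complements. The sketch below is an observation drawn only from the primer sequences given in this example; the interpretation of Okib58 and Okib59 as nested adapter primers is an inference from that observation, not a statement of the source, and the function name is illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Primer sequences as listed in the example (SEQ ID NOs 11, 14, 15, 6, 7).
OKIB57_ADAPTER = "GTAGGAATTCGGGTTGTAGGGAGGTCGACATTGCC"
OKIB58 = "GGCAATGTCGACCTCCCTACAAC"
OKIB59 = "CTCCCTACAACCCGAATTCCTAC"
JB_06 = "ATCAGCTGTCTCACCCCACATGCAAACTTCTC"
JB_08 = "AAGTTTGCATGTGGGGTGAGAC"

adapter_rc = reverse_complement(OKIB57_ADAPTER)

# Okib58 matches the first 23 nt of the adapter's reverse complement,
# Okib59 matches its last 23 nt (offset 12), so both anneal to the adapter.
print(adapter_rc.find(OKIB58))  # 0
print(adapter_rc.find(OKIB59))  # 12

# The 3' RACE primer JB-08 lies within the reverse complement of the
# sequencing primer JB-06, i.e. both target the same region on opposite strands.
print(reverse_complement(JB_06).find(JB_08))  # 3
```

`str.find` returns the zero-based offset of the first match, or -1 if the primer is not contained in the sequence.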
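In summary, the mRNA sequence of the β-hexosaminidase from Canavalia ensiformis was obtained successfully. mRNA could be isolated from different freshly germinated plant materials. The corresponding cDNA was sequenced, and the sequence found was confirmed by partial elucidation of the protein sequence of the purified β-hexosaminidase.
Example 4: Recombinant expression of the identified polypeptide
The β-hexosaminidase from Canavalia ensiformis was recombinantly expressed in Komagataella phaffii strain ATCC 76273 (also known as CBS 7435) under control of the AOX1 promoter. For recombinant expression of β-hexosaminidase in 96-deep-well plates, single colonies were picked from transformation plates into individual wells of a 96-deep-well plate filled with optimized medium. After an initial growth phase to generate biomass, expression from the AOX1 promoter was induced by adding an optimized liquid mixture that allows derepressed expression. After a total of 108 hours from the initial inoculation, all deep-well plates were centrifuged, and the supernatants of all wells were harvested into stock microtiter plates for subsequent analysis.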
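For recombinant expression of β-hexosaminidase at fermentation scale, 50 mL of yeast/peptone/glycerol medium in a 300 mL shake flask was inoculated with the production strain and shaken overnight at 28°C and 110 rpm (preculture 1). Preculture 2 (200 mL of yeast/peptone/glycerol medium in a 2 L shake flask) was inoculated from preculture 1 so that the OD600nm reached approximately 20. Preculture 2 was shaken at 28°C and 220 rpm for about 8 h. A 2 L fermenter filled with 400 mL of defined medium containing glycerol as carbon source (pH = 5.5) was inoculated from preculture 2 to an OD600nm of 2.0. During the initial batch phase, the cultivation temperature was 28°C. One hour before the start of the production phase, the temperature was lowered to 24°C and kept at that level for the remainder of the process, while the pH was lowered to 5.0 and kept at that level. Oxygen saturation was set to 30% throughout the process (cascade control: stirrer, flow, oxygen supplementation). Stirring was carried out between 700 rpm and 1200 rpm, and a flow range of 1.0-2.0 L·min⁻¹ (air) was chosen. Glycerol fed-batch feeding was carried out by supplying a 60% glycerol solution at 6 g/L·h throughout the cultivation.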
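Inoculating the fermenter to a defined starting OD600nm amounts to a simple dilution calculation. The sketch below assumes that optical density scales linearly with dilution and that the 400 mL refers to the total volume after inoculation; the preculture OD used in the example call is a hypothetical measurement, not a value taken from this document, and the function name is illustrative.

```python
def inoculum_volume_ml(target_od, final_volume_ml, preculture_od):
    """Volume of preculture needed to reach target_od in final_volume_ml.

    Uses C1 * V1 = C2 * V2 and assumes OD is proportional to cell density.
    """
    if preculture_od <= target_od:
        raise ValueError("preculture OD must exceed the target OD")
    return target_od * final_volume_ml / preculture_od


# Target from the example: OD600nm of 2.0 in 400 mL.
# The preculture OD of 20.0 is a hypothetical measured value.
volume = inoculum_volume_ml(target_od=2.0, final_volume_ml=400.0, preculture_od=20.0)
print(volume)  # 40.0 mL of preculture, made up to 400 mL with medium
```

In practice the preculture OD would be measured on a suitably diluted sample, because spectrophotometer readings are only linear at low optical densities.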
SEQUENCE LISTING
<110> Genzyme Corporation
<120> Polypeptides having β-hexosaminidase activity and polynucleotides encoding said polypeptides
<130> PAT19110-WO-PCT
<160> 18
<170> PatentIn version 3.5
<210> 1
<211> 523
<212> PRT
<213> Canavalia ensiformis
<400> 1
Ala Thr Leu Lys Ser Ile Ile Glu Pro Thr Glu Ser Leu Thr Tyr Leu
1 5 10 15
Trp Pro Leu Pro Ala Asp Phe Thr Ser Gly Asp Glu Thr Leu Ser Val
20 25 30
Asp Pro Ala Leu Thr Leu Ser Val Ala Gly Asn Gly Gly Gly Ser Ser
35 40 45
Ile Leu Arg Asp Ala Phe Asp Arg Tyr Arg Gly Ile Ile Phe Lys His
50 55 60
Ser Ser Val Gly Phe Ser Leu Ile Arg Lys Leu Arg Glu Arg Leu Val
65 70 75 80
Ser Val Ser Ala Tyr Asp Ile Ala Thr Leu Lys Ile Thr Val His Ser
85 90 95
Asp Asn Glu Glu Leu Gln Leu Gly Val Asp Glu Thr Tyr Thr Leu Leu
100 105 110
Val Pro Lys Ala Lys Asp Ser Tyr Val Ala Gly Glu Val Thr Ile Glu
115 120 125
Ala Asn Thr Val Tyr Gly Ala Leu Arg Gly Leu Glu Thr Phe Ser Gln
130 135 140
Leu Cys Ser Phe Asp Tyr Ser Asp Lys Thr Ile Lys Ile Tyr Lys Ala
145 150 155 160
Pro Trp Ser Ile Gln Asp Lys Pro Arg Phe Ser Tyr Arg Gly Leu Leu
165 170 175
Leu Asp Thr Ser Arg His Tyr Leu Pro Ile Asn Val Ile Lys Gln Ile
180 185 190
Ile Glu Ser Met Ser Tyr Ala Lys Leu Asn Val Leu His Trp His Ile
195 200 205
Ile Asp Glu Glu Ser Phe Pro Leu Glu Val Pro Thr Tyr Pro Asn Leu
210 215 220
Trp Lys Gly Ser Tyr Thr Lys Trp Glu Arg Tyr Thr Val Glu Asp Ala
225 230 235 240
Tyr Glu Ile Val Asn Phe Ala Lys Met Arg Gly Ile Asn Val Met Ala
245 250 255
Glu Val Asp Val Pro Gly His Ala Glu Ser Trp Gly Ala Gly Tyr Pro
260 265 270
Asn Leu Trp Pro Ser Pro Ser Cys Arg Glu Pro Leu Asp Val Ser Lys
275 280 285
Asn Phe Thr Phe Asp Val Ile Ser Gly Ile Leu Thr Asp Ile Arg Lys
290 295 300
Ile Phe Pro Phe Glu Leu Phe His Leu Gly Gly Asp Glu Val Asn Thr
305 310 315 320
Asp Cys Trp Thr Ser Thr Ser His Val Lys Glu Trp Leu Ser Thr Gln
325 330 335
Asn Met Thr Ala Lys Asp Ala Tyr Glu Tyr Phe Val Leu Lys Ala Gln
340 345 350
Glu Ile Ala Val Ser Lys Asn Trp Ser Pro Val Asn Trp Glu Glu Thr
355 360 365
Phe Asn Thr Phe Pro Ala Lys Leu His Lys Lys Thr Val Val His Asn
370 375 380
Trp Leu Gly Pro Gly Val Cys Pro Lys Val Val Ala Lys Gly Phe Arg
385 390 395 400
Cys Ile Phe Ser Asn Gln Gly Val Trp Tyr Leu Asp His Leu Asp Val
405 410 415
Pro Trp Asp Glu Val Tyr Thr Ala Glu Pro Leu Glu Gly Ile Glu Lys
420 425 430
Ser Ser Glu Gln Glu Leu Val Ile Gly Gly Glu Val Cys Met Trp Gly
435 440 445
Glu Thr Ala Asp Thr Ser Asn Val Gln Gln Thr Ile Trp Pro Arg Ala
450 455 460
Ala Ala Ala Ala Glu Arg Leu Trp Ser Gln Arg Asp Ser Thr Asn Ile
465 470 475 480
Thr Val Thr Ala Leu Pro Arg Leu Gln Asn Phe Arg Cys Leu Leu Asn
485 490 495
Lys Arg Gly Val Ala Ala Ala Pro Val Lys Asn Tyr Tyr Ala Arg Arg
500 505 510
Ala Pro Ser Gly Pro Gly Ser Cys Tyr Glu Gln
515 520
<210> 2
<211> 1569
<212> DNA
<213> Canavalia ensiformis
<400> 2
gctactttga agtccatcat cgagccaact gagtccttga cttacttgtg gccattgcca 60
gctgacttca cttctggtga cgaaactttg tctgttgacc cagctttgac tttgtccgtt 120
gctggtaatg gtggtggttc ctccattttg agagatgctt tcgacagata cagaggtatt 180
atcttcaagc actcctccgt tggattctct ttgatcagaa agttgagaga gagattggtt 240
tccgtttccg cttacgacat tgctactttg aagatcactg ttcactccga caacgaagag 300
ttgcagttgg gtgttgacga gacttacact ttgttggttc caaaggctaa ggactcctac 360
gttgctggtg aggttactat cgaggctaac actgtttacg gtgctttgag aggtttggag 420
actttctccc agttgtgttc cttcgactac tctgacaaga ctatcaagat ttacaaggct 480
ccttggtcca tccaggacaa gccaagattt tcctacagag gtttgttgtt ggacacttcc 540
agacactact tgccaatcaa cgttatcaag cagatcatcg agtccatgtc ctacgctaag 600
ttgaacgttt tgcactggca catcatcgac gaagagtctt tcccattgga ggttccaact 660
tacccaaact tgtggaaggg ttcctacact aagtgggaga gatacactgt tgaggacgct 720
tacgagatcg ttaacttcgc taagatgaga ggtattaacg ttatggctga ggttgacgtt 780
ccaggtcatg ctgaatcttg gggtgctggt tatccaaatt tgtggccatc tccatcctgt 840
agagagccat tggacgtttc caagaacttc actttcgacg ttatctccgg aatcttgact 900
gacatcagaa agatattccc attcgagttg ttccacttgg gaggtgacga ggttaatact 960
gactgttgga cttccacttc ccacgttaag gaatggttgt ccactcagaa catgactgct 1020
aaggatgctt acgaatactt cgttttgaag gctcaagaga tcgctgtttc taagaactgg 1080
tcccctgtta actgggaaga gactttcaac actttcccag ctaagttgca caagaaaact 1140
gttgttcaca actggttggg tccaggtgtt tgtccaaagg ttgttgctaa gggtttcaga 1200
tgtatcttct ccaaccaggg tgtttggtac ttggaccact tggatgttcc ttgggacgag 1260
gtttacactg ctgaaccatt ggaaggtatc gagaagtcct ctgagcaaga gttggttatc 1320
ggtggtgaag tttgtatgtg gggtgagact gctgacactt ctaacgttca gcagactatc 1380
tggccaagag ccgcagctgc tgctgaaaga ttgtggtccc aaagagactc cactaacatc 1440
actgttactg ctttgccaag attgcagaac ttcagatgtt tgttgaacaa gagaggtgtt 1500
gctgctgctc cagttaagaa ctactacgct agaagagccc catccggtcc aggttcttgt 1560
tacgaacaa 1569
<210> 3
<211> 26
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer JB-01
<400> 3
ctcacctacc tctggcccct tcccgc 26
<210> 4
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer JB-07
<400> 4
ttattggtca taacatgacc ctggaccaac agg 33
<210> 5
<211> 25
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer JB-02
<400> 5
gaggagcttc aatttggagt ggatg 25
<210> 6
<211> 32
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer JB-06
<400> 6
atcagctgtc tcaccccaca tgcaaacttc tc 32
<210> 7
<211> 22
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer JB-08
<400> 7
aagtttgcat gtggggtgag ac 22
<210> 8
<211> 23
<212> DNA
<213> Artificial Sequence
<220>
<223> Primer JB-09
<400> 8
gcaaacaata tggcctagag ctg 23
<210> 9
<211> 28
<212> DNA
<213> Artificial Sequence
<220>
<223> CDSIII-short
<400> 9
attctagagg ccgaggcggc cgacatgt 28
<210> 10
<211> 22
<212> DNA
<213> Artificial Sequence
<220>
<223> JB-10
<400> 10
aagagtcctt ggctttggga ac 22
<210> 11
<211> 35
<212> DNA
<213> Artificial Sequence
<220>
<223> Okib57-Adapter
<400> 11
gtaggaattc gggttgtagg gaggtcgaca ttgcc 35
<210> 12
<211> 22
<212> DNA
<213> Artificial Sequence
<220>
<223> JB-11
<400> 12
tcaatgtcgc aatgtcatag gc 22
<210> 13
<211> 22
<212> DNA
<213> Artificial Sequence
<220>
<223> JB-12
<400> 13
atgagactga acccaacact gc 22
<210> 14
<211> 23
<212> DNA
<213> Artificial Sequence
<220>
<223> Okib58
<400> 14
ggcaatgtcg acctccctac aac 23
<210> 15
<211> 23
<212> DNA
<213> Artificial Sequence
<220>
<223> Okib59
<400> 15
ctccctacaa cccgaattcc tac 23
<210> 16
<211> 553
<212> PRT
<213> Canavalia ensiformis
<400> 16
Met Phe Leu Cys Ile Pro Arg Trp Phe Ser Ser Pro Leu Leu Ile Leu
1 5 10 15
Phe Val Ile Tyr Cys Ala Leu Phe Ala Pro Gln Ala Ala Ser Ala Thr
20 25 30
Leu Lys Ser Ile Ile Glu Pro Thr Glu Ser Leu Thr Tyr Leu Trp Pro
35 40 45
Leu Pro Ala Asp Phe Thr Ser Gly Asp Glu Thr Leu Ser Val Asp Pro
50 55 60
Ala Leu Thr Leu Ser Val Ala Gly Asn Gly Gly Gly Ser Ser Ile Leu
65 70 75 80
Arg Asp Ala Phe Asp Arg Tyr Arg Gly Ile Ile Phe Lys His Ser Ser
85 90 95
Val Gly Phe Ser Leu Ile Arg Lys Leu Arg Glu Arg Leu Val Ser Val
100 105 110
Ser Ala Tyr Asp Ile Ala Thr Leu Lys Ile Thr Val His Ser Asp Asn
115 120 125
Glu Glu Leu Gln Leu Gly Val Asp Glu Thr Tyr Thr Leu Leu Val Pro
130 135 140
Lys Ala Lys Asp Ser Tyr Val Ala Gly Glu Val Thr Ile Glu Ala Asn
145 150 155 160
Thr Val Tyr Gly Ala Leu Arg Gly Leu Glu Thr Phe Ser Gln Leu Cys
165 170 175
Ser Phe Asp Tyr Ser Asp Lys Thr Ile Lys Ile Tyr Lys Ala Pro Trp
180 185 190
Ser Ile Gln Asp Lys Pro Arg Phe Ser Tyr Arg Gly Leu Leu Leu Asp
195 200 205
Thr Ser Arg His Tyr Leu Pro Ile Asn Val Ile Lys Gln Ile Ile Glu
210 215 220
Ser Met Ser Tyr Ala Lys Leu Asn Val Leu His Trp His Ile Ile Asp
225 230 235 240
Glu Glu Ser Phe Pro Leu Glu Val Pro Thr Tyr Pro Asn Leu Trp Lys
245 250 255
Gly Ser Tyr Thr Lys Trp Glu Arg Tyr Thr Val Glu Asp Ala Tyr Glu
260 265 270
Ile Val Asn Phe Ala Lys Met Arg Gly Ile Asn Val Met Ala Glu Val
275 280 285
Asp Val Pro Gly His Ala Glu Ser Trp Gly Ala Gly Tyr Pro Asn Leu
290 295 300
Trp Pro Ser Pro Ser Cys Arg Glu Pro Leu Asp Val Ser Lys Asn Phe
305 310 315 320
Thr Phe Asp Val Ile Ser Gly Ile Leu Thr Asp Ile Arg Lys Ile Phe
325 330 335
Pro Phe Glu Leu Phe His Leu Gly Gly Asp Glu Val Asn Thr Asp Cys
340 345 350
Trp Thr Ser Thr Ser His Val Lys Glu Trp Leu Ser Thr Gln Asn Met
355 360 365
Thr Ala Lys Asp Ala Tyr Glu Tyr Phe Val Leu Lys Ala Gln Glu Ile
370 375 380
Ala Val Ser Lys Asn Trp Ser Pro Val Asn Trp Glu Glu Thr Phe Asn
385 390 395 400
Thr Phe Pro Ala Lys Leu His Lys Lys Thr Val Val His Asn Trp Leu
405 410 415
Gly Pro Gly Val Cys Pro Lys Val Val Ala Lys Gly Phe Arg Cys Ile
420 425 430
Phe Ser Asn Gln Gly Val Trp Tyr Leu Asp His Leu Asp Val Pro Trp
435 440 445
Asp Glu Val Tyr Thr Ala Glu Pro Leu Glu Gly Ile Glu Lys Ser Ser
450 455 460
Glu Gln Glu Leu Val Ile Gly Gly Glu Val Cys Met Trp Gly Glu Thr
465 470 475 480
Ala Asp Thr Ser Asn Val Gln Gln Thr Ile Trp Pro Arg Ala Ala Ala
485 490 495
Ala Ala Glu Arg Leu Trp Ser Gln Arg Asp Ser Thr Asn Ile Thr Val
500 505 510
Thr Ala Leu Pro Arg Leu Gln Asn Phe Arg Cys Leu Leu Asn Lys Arg
515 520 525
Gly Val Ala Ala Ala Pro Val Lys Asn Tyr Tyr Ala Arg Arg Ala Pro
530 535 540
Ser Gly Pro Gly Ser Cys Tyr Glu Gln
545 550
<210> 17
<211> 1662
<212> DNA
<213> Canavalia ensiformis
<400> 17
atgtttctgt gcatacccag atggttctct tcacctcttc tcattctctt tgtcatttac 60
tgtgccctct ttgctcctca agctgcttct gccacactca aatctatcat tgaacccact 120
gagtccctca catacctttg gcccctcccc gcagacttca cttcaggcga tgaaactctt 180
tccgttgacc ctgcacttac cctctctgtc gccggcaacg gtggtggctc ttccattctc 240
agagatgcat ttgaccgata cagaggaatc atattcaagc acagcagtgt tgggttcagt 300
ctcataagaa agttaaggga aagattggtg tctgtttctg cctatgacat tgcgacattg 360
aagatcactg tccattcaga taacgaggag cttcaacttg gagtggatga aacctatacc 420
ttgctggttc ccaaagccaa ggactcttat gttgctgggg aagtcacaat tgaggcaaac 480
actgtttatg gtgcattgcg cggattagag acattcagcc agttgtgttc tttcgattat 540
tcggataaaa caataaaaat atacaaggca ccttggtcca tccaagataa acctagattt 600
tcctatcgtg ggcttttgtt ggacacatcg aggcactatt taccaattaa cgtaattaag 660
cagattattg aatctatgtc ctatgctaaa cttaatgttc tacattggca catcatagac 720
gaggagtcat ttcctcttga ggtacctaca tatccaaact tgtggaaagg ttcatataca 780
aagtgggaac gttacacggt agaagacgca tatgaaattg tcaacttcgc caaaatgaga 840
ggcataaatg tgatggcaga agtggatgtt cctggtcatg cagaatcatg gggtgctgga 900
tatcccaatc tttggccgtc accttcctgt agggagccac tggatgtttc aaagaatttt 960
acttttgatg tcatttctgg tatcctgaca gatataagaa agattttccc gtttgagcta 1020
tttcacttgg gtggtgatga agttaataca gattgctgga ccagtacttc tcatgtgaag 1080
gaatggcttt cgactcaaaa catgactgct aaagatgcct atgaatattt tgtactgaag 1140
gcccaagaga tagctgtttc aaaaaattgg agtccggtga actgggaaga aaccttcaat 1200
acatttccag caaagctcca taagaaaact gtggtgcata actggttggg ccctggggtt 1260
tgtccaaagg ttgttgcaaa aggtttcagg tgcattttca gtaatcaggg tgtctggtat 1320
cttgaccatc tggatgtacc ttgggatgag gtctatactg ctgagccact agaaggaata 1380
gaaaaatctt ctgaacaaga gcttgtaatt ggaggagaag tttgcatgtg gggtgagaca 1440
gctgatacat ccaatgttca gcaaacaata tggcctagag ctgctgcagc tgcagaacgc 1500
ttatggagtc agagagattc tacaaatatt actgtaactg cgttgccccg gttacaaaac 1560
ttcagatgtc tattgaataa acgtggagtt gcagctgctc ctgtgaaaaa ttattatgct 1620
agaagggctc ctagtggtcc aggctcatgt tatgagcaat aa 1662
<210> 18
<211> 1572
<212> DNA
<213> Artificial Sequence
<220>
<223> codon optimized sequence (for human cells)
<400> 18
gccacactga agtccatcat cgagcccacc gagagcctga cctacctgtg gcctctgccc 60
gccgatttca ccagcggcga cgagacactg tccgtggatc ctgccctgac actgagcgtg 120
gccggaaatg gcggcggaag cagcatcctg agagatgcct tcgaccggta cagaggcatc 180
atcttcaagc acagcagcgt gggcttcagc ctgatccgga agctgcgcga gagactggtg 240
tccgtgtccg cctacgatat cgccaccctg aagatcaccg tgcactccga caacgaggaa 300
ctgcagctgg gcgtggacga gacatacacc ctgctggtgc ccaaggccaa ggacagctat 360
gtggccggcg aagtgaccat cgaggccaac acagtgtacg gcgccctgag aggcctggaa 420
accttcagcc agctgtgcag cttcgactac agcgacaaga ccatcaagat ctacaaggcc 480
ccttggagca tccaggacaa gccccggttc agctacagag gcctgctgct ggacaccagc 540
agacactacc tgcccatcaa cgtgatcaag cagatcatcg agagcatgag ctacgccaag 600
ctgaacgtgc tgcactggca catcatcgac gaggaatcct tcccactgga agtgcccacc 660
taccccaacc tgtggaaggg cagctacacc aagtgggagc ggtacaccgt ggaagatgcc 720
tacgagatcg tgaacttcgc caagatgcgg ggcatcaatg tgatggccga ggtggacgtg 780
ccaggccacg ctgaatcttg gggagccggc taccctaatc tgtggcccag ccccagctgt 840
cgcgaacccc tggacgtgtc caagaacttc accttcgacg tgatcagcgg catcctgacc 900
gatatcagaa agatcttccc attcgagctg ttccacctgg gaggcgacga agtgaacacc 960
gactgctgga ccagcaccag ccacgtgaaa gagtggctga gcacccagaa catgaccgcc 1020
aaggacgcct acgagtactt cgtgctgaag gcccaggaaa tcgccgtgtc taagaattgg 1080
agccccgtga actgggagga aacctttaac accttccctg ccaaactgca caagaaaacc 1140
gtggtgcaca attggctggg ccctggcgtg tgccctaagg tggtggccaa gggcttccgc 1200
tgcatattca gcaaccaggg cgtgtggtat ctggaccacc tggatgtgcc ctgggacgag 1260
gtgtacacag ccgagcctct ggaaggcatc gagaagtcct ccgagcagga actcgtgatc 1320
ggcggagaag tgtgcatgtg gggcgagaca gccgacacct ccaacgtgca gcagaccatc 1380
tggcctagag ccgccgctgc cgctgaaaga ctgtggtccc agagagacag caccaacatc 1440
accgtgaccg ccctgccccg gctgcagaac tttagatgcc tgctgaacaa gcggggcgtg 1500
gccgctgccc ccgtgaagaa ttactatgcc agaagggccc ccagcggccc tggcagctgt 1560
tatgaacagt ga 1572
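The nucleotide entries above can be cross-checked against the amino acid entries by translating them. The lengths are consistent with this reading: 1569 nt is 3 × 523 codons (SEQ ID NO: 2, no stop codon), 1662 nt is 3 × 554 (SEQ ID NO: 17, 553 residues plus a stop), and 1572 nt is 3 × 524 (SEQ ID NO: 18, 523 residues plus a stop). The following is a minimal sketch, assuming the standard genetic code and reading frame 1. The helper names `clean` and `translate` are illustrative and do not come from the patent.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def clean(listing_text):
    """Keep only A/C/G/T letters, dropping spaces and position counters."""
    return "".join(ch for ch in listing_text.upper() if ch in "ACGT")


def translate(dna):
    """Translate complete codons in frame 1; '*' marks a stop codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First line of SEQ ID NO: 2 and last line of SEQ ID NO: 18, copied from the listing.
first_line_seq2 = "gctactttga agtccatcat cgagccaact gagtccttga cttacttgtg gccattgcca 60"
last_line_seq18 = "tatgaacagt ga 1572"

print(translate(clean(first_line_seq2)))  # one-letter codes for the first 20 codons
print(translate(clean(last_line_seq18)))  # final residues followed by the stop marker
```

The first 20 codons of SEQ ID NO: 2 translate to Ala Thr Leu Lys Ser Ile Ile Glu Pro Thr Glu Ser Leu Thr Tyr Leu Trp Pro Leu Pro, which corresponds to residues 31 to 50 of SEQ ID NO: 16.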
Claims (15)
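1. A method for producing a polypeptide having β-hexosaminidase activity, the method comprising the steps of:
a) providing a yeast cell comprising a polynucleotide encoding a polypeptide that has β-hexosaminidase activity and has an amino acid sequence at least 95% identical to the amino acid sequence set forth in SEQ ID NO: 1 or 16,
b) culturing the yeast cell under conditions allowing production of the polypeptide, and
c) obtaining the polypeptide produced in step b).
2. The method according to claim 1, wherein the polypeptide having β-hexosaminidase activity has an amino acid sequence at least 98% identical to the amino acid sequence set forth in SEQ ID NO: 1.
3. The method according to claims 1 and 2, wherein the polypeptide having β-hexosaminidase activity comprises the amino acid sequence set forth in SEQ ID NO: 1.
4. The method according to claims 1 to 3, wherein the yeast cell belongs to the family Saccharomycetaceae.
5. The method according to claim 4, wherein the yeast cell is a Komagataella cell, such as a Komagataella phaffii cell.
6. The method according to any one of claims 1 to 5, wherein the polynucleotide encoding the polypeptide having β-hexosaminidase activity is operably linked to a heterologous promoter.
7. The method according to any one of claims 1 to 6, wherein the polynucleotide encoding the polypeptide having β-hexosaminidase activity is codon-optimized for the yeast cell.
8. The method according to any one of claims 1 to 7, wherein the polynucleotide comprises the nucleic acid sequence set forth in SEQ ID NO: 2 or 17.
9. A polynucleotide encoding a polypeptide that has β-hexosaminidase activity and has an amino acid sequence at least 95% identical to the amino acid sequence set forth in SEQ ID NO: 1.
10. The polynucleotide according to claim 9, wherein the polypeptide having β-hexosaminidase activity comprises the amino acid sequence set forth in SEQ ID NO: 1.
11. The polynucleotide according to claim 9, wherein the polynucleotide is operably linked to a heterologous promoter.
12. A vector, such as an expression vector, comprising the polynucleotide according to any one of claims 9 to 11.
13. A yeast cell comprising the polynucleotide according to any one of claims 9 to 11 or the vector according to claim 12.
14. The yeast cell according to claim 13, wherein the yeast cell belongs to the family Saccharomycetaceae.
15. An isolated polypeptide encoded by the polynucleotide according to any one of claims 9 to 11.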
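The 95% and 98% thresholds in the claims are percent-identity criteria between a candidate amino acid sequence and a reference sequence. How identity is determined (alignment algorithm, scoring, denominator) is governed by the patent's description, which is not part of this section. The sketch below only illustrates the arithmetic under stated assumptions: the two sequences are already aligned to equal length, `-` marks a gap, and identity is counted over all alignment columns. The function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding the same residue in both sequences.

    Assumption: inputs are pre-aligned strings of equal length; columns where
    both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == gap and y == gap)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a, aligned_b, threshold=95.0):
    """True if the identity reaches the given threshold (95.0 by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# First 20 residues of SEQ ID NO: 1 in one-letter code, and two made-up variants.
reference = "ATLKSIIEPTESLTYLWPLP"
one_change = "ATLKSIIEPTESLTYLWPLA"   # 19 of 20 positions identical
two_changes = "GTLKSIIEPTESLTYLWPLA"  # 18 of 20 positions identical

print(percent_identity(reference, one_change), meets_threshold(reference, one_change))
print(percent_identity(reference, two_changes), meets_threshold(reference, two_changes))
```

For the full-length 523-residue sequence, a 95% threshold under this simple column count would tolerate at most 26 differing positions, since 497/523 is about 95.03% and 496/523 is about 94.84%.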
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202063120608P | 2020-12-02 | 2020-12-02 | |
US63/120,608 | 2020-12-02 | | |
EP20213521.6 | 2020-12-11 | | |
PCT/EP2021/084004 WO2022117743A1 (en) | 2020-12-02 | 2021-12-02 | Polypeptide having beta-hexosaminidase activity, and polynucleotides coding for the same |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116802284A (zh) | 2023-09-22 |
Family
ID=88048411
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202180091682.XA Pending CN116802284A (zh) | Polypeptides having β-hexosaminidase activity and polynucleotides encoding said polypeptides | 2020-12-02 | 2021-12-02 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116802284A (zh) |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
USRE44266E1 (en) | Expression of eukaryotic polypeptides in chloroplasts | |
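CN108727479B (zh) | F-box protein regulating leaf inclination angle and application thereof | |
KR20210153106A (ko) | Materials and methods for protein production | |
CN110106187A (zh) | ARF gene of rubber tree, encoded protein thereof, and application | |
CN101928717A (zh) | Salvia miltiorrhiza squalene synthase (SmSQS) gene, protein encoded thereby, and application | |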
US10125371B2 (en) | Nucleotide sequence encoding WUSCHEL-related homeobox4 (WOX4) protein from Corchorus olitorius and Corchorus capsularis and methods of use for same | |
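CN107475264B (zh) | Application of DGM1 protein in improving the root hair formation capacity of plants | |
CN116802284A (zh) | Polypeptides having β-hexosaminidase activity and polynucleotides encoding said polypeptides | |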
Kawai et al. | Biochemical properties of rice adenylate kinase and subcellular location in plant cells | |
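CN110295175A (zh) | Application of the soybean NAC transcription factor family gene Glyma08g41995 | |
CN111826391A (zh) | Application of NHX2-GCD1 double gene or protein thereof | |
CN113248584B (zh) | Application of RALF protein in promoting phosphorus uptake by plants | |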
WO2022117743A1 (en) | Polypeptide having beta-hexosaminidase activity, and polynucleotides coding for the same | |
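CN113528567B (zh) | Application of FBA8 protein or a protein derived therefrom in regulating plant vascular bundle division and/or inflorescence stem cross-sectional area | |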
Mai et al. | Identification of a Sed5-like SNARE gene LjSYP32-1 that contributes to nodule tissue formation of Lotus japonicus | |
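CN113667675B (zh) | Improving plant disease resistance using soybean FLS2/BAK1 genes | |
CN115161306B (zh) | Apolygus lucorum RNA-degrading enzyme, encoding gene, vector, strain, and application thereof | |
KR20100100097A (ko) | MuS1 gene derived from sweet potato and use thereof | |
CN117430679B (zh) | Wheat-derived broad-spectrum disease resistance-related protein, related biological materials, and application | |
CN115197307B (zh) | Protein IbGER5 regulating plant stress resistance, encoding gene, and use thereof | |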
Huang et al. | Gene cloning and comparative analysis of pyridoxal 5′-phosphate salvage synthesis enzymes in tobacco plants | |
CN113005106B (zh) | Application of maize low-temperature tolerance gene ZmCIPK10.1 in improving plant cold resistance | |
Muliani et al. | Isolation and cloning of protein disulfide isomerase gene from soybean (Glycine max L. merrill) | |
KR101630839B1 (ko) | DnaJ-like protein derived from alfalfa and use thereof | |
KR100599824B1 (ko) | Novel stress-resistance protein derived from soybean, gene encoding the same, and use thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||