CN114686454B - Pe-p3 prime editing system and its application in genome base editing - Google Patents
Pe-p3 prime editing system and its application in genome base editing
- Publication number
- CN114686454B (application CN202011621690.4A)
- Authority
- CN
- China
- Prior art keywords
- sequence
- pegrna
- leu
- lys
- ala
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 108090000623 proteins and genes Proteins 0.000 claims abstract description 51
- 102000004169 proteins and genes Human genes 0.000 claims abstract description 32
- 102100034343 Integrase Human genes 0.000 claims abstract description 25
- 108010092799 RNA-directed DNA polymerase Proteins 0.000 claims abstract description 25
- 102000037865 fusion proteins Human genes 0.000 claims abstract description 24
- 108020001507 fusion proteins Proteins 0.000 claims abstract description 24
- 108010038807 Oligopeptides Proteins 0.000 claims abstract description 16
- 102000015636 Oligopeptides Human genes 0.000 claims abstract description 16
- 108091033409 CRISPR Proteins 0.000 claims abstract description 13
- 102000004190 Enzymes Human genes 0.000 claims abstract description 12
- 108090000790 Enzymes Proteins 0.000 claims abstract description 12
- 239000003550 marker Substances 0.000 claims abstract description 11
- 101150038500 cas9 gene Proteins 0.000 claims abstract description 5
- 108020004414 DNA Proteins 0.000 claims description 47
- 235000007164 Oryza sativa Nutrition 0.000 claims description 31
- 235000009566 rice Nutrition 0.000 claims description 31
- 102000053602 DNA Human genes 0.000 claims description 28
- 241000196324 Embryophyta Species 0.000 claims description 21
- 125000003275 alpha amino acid group Chemical group 0.000 claims description 13
- 108010002685 hygromycin-B kinase Proteins 0.000 claims description 12
- 238000012216 screening Methods 0.000 claims description 11
- 230000037429 base substitution Effects 0.000 claims description 10
- 108091032973 (ribonucleotides)n+m Proteins 0.000 claims description 7
- 108010008532 Deoxyribonuclease I Proteins 0.000 claims description 7
- 102000007260 Deoxyribonuclease I Human genes 0.000 claims description 7
- 240000007594 Oryza sativa Species 0.000 claims 1
- 239000012620 biological material Substances 0.000 abstract description 5
- 238000002474 experimental method Methods 0.000 abstract description 3
- 230000008901 benefit Effects 0.000 abstract description 2
- 239000013604 expression vector Substances 0.000 description 72
- 238000003259 recombinant expression Methods 0.000 description 64
- 230000035772 mutation Effects 0.000 description 37
- 108091033380 Coding strand Proteins 0.000 description 31
- 241000209094 Oryza Species 0.000 description 30
- 238000012408 PCR amplification Methods 0.000 description 30
- 206010020649 Hyperkeratosis Diseases 0.000 description 28
- 235000018102 proteins Nutrition 0.000 description 23
- 210000004027 cell Anatomy 0.000 description 22
- 239000002773 nucleotide Substances 0.000 description 20
- 125000003729 nucleotide group Chemical group 0.000 description 20
- 238000000034 method Methods 0.000 description 18
- 108091026890 Coding region Proteins 0.000 description 13
- 239000002299 complementary DNA Substances 0.000 description 13
- 239000002609 medium Substances 0.000 description 11
- 241000589158 Agrobacterium Species 0.000 description 9
- 108090000765 processed proteins & peptides Proteins 0.000 description 9
- 238000006467 substitution reaction Methods 0.000 description 8
- 241000700605 Viruses Species 0.000 description 7
- 239000000243 solution Substances 0.000 description 7
- 239000013598 vector Substances 0.000 description 7
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 6
- OJOBTAOGJIWAGB-UHFFFAOYSA-N acetosyringone Chemical compound COC1=CC(C(C)=O)=CC(OC)=C1O OJOBTAOGJIWAGB-UHFFFAOYSA-N 0.000 description 6
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical compound NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 6
- 238000013461 design Methods 0.000 description 6
- 108020005004 Guide RNA Proteins 0.000 description 5
- 108010008355 arginyl-glutamine Proteins 0.000 description 5
- 108010092854 aspartyllysine Proteins 0.000 description 5
- 239000003795 chemical substances by application Substances 0.000 description 5
- 208000015181 infectious disease Diseases 0.000 description 5
- 108010057821 leucylproline Proteins 0.000 description 5
- 230000008439 repair process Effects 0.000 description 5
- 238000010839 reverse transcription Methods 0.000 description 5
- 239000007787 solid Substances 0.000 description 5
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical compound OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 description 4
- 101710145242 Minor capsid protein P3-RTD Proteins 0.000 description 4
- 108010044940 alanylglutamine Proteins 0.000 description 4
- 125000000539 amino acid group Chemical group 0.000 description 4
- 108010013835 arginine glutamate Proteins 0.000 description 4
- 108010062796 arginyllysine Proteins 0.000 description 4
- 108010038633 aspartylglutamate Proteins 0.000 description 4
- 230000000295 complement effect Effects 0.000 description 4
- 238000012258 culturing Methods 0.000 description 4
- 238000012217 deletion Methods 0.000 description 4
- 230000037430 deletion Effects 0.000 description 4
- 238000005516 engineering process Methods 0.000 description 4
- 239000012634 fragment Substances 0.000 description 4
- 108010050848 glycylleucine Proteins 0.000 description 4
- 239000001963 growth medium Substances 0.000 description 4
- UYTPUPDQBNUYGX-UHFFFAOYSA-N guanine Chemical compound O=C1NC(N)=NC2=C1N=CN2 UYTPUPDQBNUYGX-UHFFFAOYSA-N 0.000 description 4
- 238000003780 insertion Methods 0.000 description 4
- 230000037431 insertion Effects 0.000 description 4
- RWQNBRDOKXIBIV-UHFFFAOYSA-N thymine Chemical compound CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 description 4
- 229930024421 Adenine Natural products 0.000 description 3
- GFFGJBXGBJISGV-UHFFFAOYSA-N Adenine Chemical compound NC1=NC=NC2=C1N=CN2 GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 3
- OYJCVIGKMXUVKB-GARJFASQSA-N Ala-Leu-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N OYJCVIGKMXUVKB-GARJFASQSA-N 0.000 description 3
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 3
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Natural products OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 description 3
- PMGDADKJMCOXHX-UHFFFAOYSA-N L-Arginyl-L-glutamin-acetat Natural products NC(=N)NCCCC(N)C(=O)NC(CCC(N)=O)C(O)=O PMGDADKJMCOXHX-UHFFFAOYSA-N 0.000 description 3
- LXKNSJLSGPNHSK-KKUMJFAQSA-N Leu-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N LXKNSJLSGPNHSK-KKUMJFAQSA-N 0.000 description 3
- IZPVWNSAVUQBGP-CIUDSAMLSA-N Leu-Ser-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O IZPVWNSAVUQBGP-CIUDSAMLSA-N 0.000 description 3
- YQFZRHYZLARWDY-IHRRRGAJSA-N Leu-Val-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN YQFZRHYZLARWDY-IHRRRGAJSA-N 0.000 description 3
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 3
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 3
- 241001672814 Porcine teschovirus 1 Species 0.000 description 3
- XNCUYZKGQOCOQH-YUMQZZPRSA-N Ser-Leu-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O XNCUYZKGQOCOQH-YUMQZZPRSA-N 0.000 description 3
- 229960000643 adenine Drugs 0.000 description 3
- 108010005233 alanylglutamic acid Proteins 0.000 description 3
- 108010070944 alanylhistidine Proteins 0.000 description 3
- 108010077245 asparaginyl-proline Proteins 0.000 description 3
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 3
- 229940104302 cytosine Drugs 0.000 description 3
- 238000010362 genome editing Methods 0.000 description 3
- 108010049041 glutamylalanine Proteins 0.000 description 3
- 108010015792 glycyllysine Proteins 0.000 description 3
- 108010003700 lysyl aspartic acid Proteins 0.000 description 3
- 108010064235 lysylglycine Proteins 0.000 description 3
- 230000006780 non-homologous end joining Effects 0.000 description 3
- 229920001184 polypeptide Polymers 0.000 description 3
- 102000004196 processed proteins & peptides Human genes 0.000 description 3
- 238000012163 sequencing technique Methods 0.000 description 3
- KDCGOANMDULRCW-UHFFFAOYSA-N 7H-purine Chemical compound N1=CNC2=NC=NC2=C1 KDCGOANMDULRCW-UHFFFAOYSA-N 0.000 description 2
- FUSPCLTUKXQREV-ACZMJKKPSA-N Ala-Glu-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O FUSPCLTUKXQREV-ACZMJKKPSA-N 0.000 description 2
- NBTGEURICRTMGL-WHFBIAKZSA-N Ala-Gly-Ser Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O NBTGEURICRTMGL-WHFBIAKZSA-N 0.000 description 2
- OBVSBEYOMDWLRJ-BFHQHQDPSA-N Ala-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N OBVSBEYOMDWLRJ-BFHQHQDPSA-N 0.000 description 2
- TZDNWXDLYFIFPT-BJDJZHNGSA-N Ala-Ile-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O TZDNWXDLYFIFPT-BJDJZHNGSA-N 0.000 description 2
- MDNAVFBZPROEHO-UHFFFAOYSA-N Ala-Lys-Val Natural products CC(C)C(C(O)=O)NC(=O)C(NC(=O)C(C)N)CCCCN MDNAVFBZPROEHO-UHFFFAOYSA-N 0.000 description 2
- UZGFHWIJWPUPOH-IHRRRGAJSA-N Arg-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UZGFHWIJWPUPOH-IHRRRGAJSA-N 0.000 description 2
- OLGCWMNDJTWQAG-GUBZILKMSA-N Asn-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC(N)=O OLGCWMNDJTWQAG-GUBZILKMSA-N 0.000 description 2
- DDPXDCKYWDGZAL-BQBZGAKWSA-N Asn-Gly-Arg Chemical compound NC(=O)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N DDPXDCKYWDGZAL-BQBZGAKWSA-N 0.000 description 2
- OOXUBGLNDRGOKT-FXQIFTODSA-N Asn-Ser-Arg Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O OOXUBGLNDRGOKT-FXQIFTODSA-N 0.000 description 2
- VWADICJNCPFKJS-ZLUOBGJFSA-N Asn-Ser-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O VWADICJNCPFKJS-ZLUOBGJFSA-N 0.000 description 2
- QXHVOUSPVAWEMX-ZLUOBGJFSA-N Asp-Asp-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O QXHVOUSPVAWEMX-ZLUOBGJFSA-N 0.000 description 2
- NVFSJIXJZCDICF-SRVKXCTJSA-N Asp-Lys-Lys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N NVFSJIXJZCDICF-SRVKXCTJSA-N 0.000 description 2
- ZBYLEBZCVKLPCY-FXQIFTODSA-N Asp-Ser-Arg Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O ZBYLEBZCVKLPCY-FXQIFTODSA-N 0.000 description 2
- 241000701489 Cauliflower mosaic virus Species 0.000 description 2
- 241000710198 Foot-and-mouth disease virus Species 0.000 description 2
- 108700007698 Genetic Terminator Regions Proteins 0.000 description 2
- SHAUZYVSXAMYAZ-JYJNAYRXSA-N Gln-Leu-Phe Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CCC(=O)N)N SHAUZYVSXAMYAZ-JYJNAYRXSA-N 0.000 description 2
- IOFDDSNZJDIGPB-GVXVVHGQSA-N Gln-Leu-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O IOFDDSNZJDIGPB-GVXVVHGQSA-N 0.000 description 2
- AFODTOLGSZQDSL-PEFMBERDSA-N Glu-Asn-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N AFODTOLGSZQDSL-PEFMBERDSA-N 0.000 description 2
- IESFZVCAVACGPH-PEFMBERDSA-N Glu-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCC(O)=O IESFZVCAVACGPH-PEFMBERDSA-N 0.000 description 2
- QQLBPVKLJBAXBS-FXQIFTODSA-N Glu-Glu-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O QQLBPVKLJBAXBS-FXQIFTODSA-N 0.000 description 2
- PJBVXVBTTFZPHJ-GUBZILKMSA-N Glu-Leu-Asp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)O)N PJBVXVBTTFZPHJ-GUBZILKMSA-N 0.000 description 2
- LJPIRKICOISLKN-WHFBIAKZSA-N Gly-Ala-Ser Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O LJPIRKICOISLKN-WHFBIAKZSA-N 0.000 description 2
- DTRUBYPMMVPQPD-YUMQZZPRSA-N Gly-Gln-Arg Chemical compound [H]NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O DTRUBYPMMVPQPD-YUMQZZPRSA-N 0.000 description 2
- STVHDEHTKFXBJQ-LAEOZQHASA-N Gly-Glu-Ile Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O STVHDEHTKFXBJQ-LAEOZQHASA-N 0.000 description 2
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 2
- VAXBXNPRXPHGHG-BJDJZHNGSA-N Ile-Ala-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)O)N VAXBXNPRXPHGHG-BJDJZHNGSA-N 0.000 description 2
- CMNMPCTVCWWYHY-MXAVVETBSA-N Ile-His-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CC(C)C)C(=O)O)N CMNMPCTVCWWYHY-MXAVVETBSA-N 0.000 description 2
- NZGTYCMLUGYMCV-XUXIUFHCSA-N Ile-Lys-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N NZGTYCMLUGYMCV-XUXIUFHCSA-N 0.000 description 2
- GLYJPWIRLBAIJH-FQUUOJAGSA-N Ile-Lys-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@@H]1C(=O)O)N GLYJPWIRLBAIJH-FQUUOJAGSA-N 0.000 description 2
- GLYJPWIRLBAIJH-UHFFFAOYSA-N Ile-Lys-Pro Natural products CCC(C)C(N)C(=O)NC(CCCCN)C(=O)N1CCCC1C(O)=O GLYJPWIRLBAIJH-UHFFFAOYSA-N 0.000 description 2
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysin Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 2
- BQSLGJHIAGOZCD-CIUDSAMLSA-N Leu-Ala-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O BQSLGJHIAGOZCD-CIUDSAMLSA-N 0.000 description 2
- ZTLGVASZOIKNIX-DCAQKATOSA-N Leu-Gln-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ZTLGVASZOIKNIX-DCAQKATOSA-N 0.000 description 2
- XVZCXCTYGHPNEM-UHFFFAOYSA-N Leu-Leu-Pro Natural products CC(C)CC(N)C(=O)NC(CC(C)C)C(=O)N1CCCC1C(O)=O XVZCXCTYGHPNEM-UHFFFAOYSA-N 0.000 description 2
- VCHVSKNMTXWIIP-SRVKXCTJSA-N Leu-Lys-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O VCHVSKNMTXWIIP-SRVKXCTJSA-N 0.000 description 2
- DAYQSYGBCUKVKT-VOAKCMCISA-N Leu-Thr-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O DAYQSYGBCUKVKT-VOAKCMCISA-N 0.000 description 2
- VJGQRELPQWNURN-JYJNAYRXSA-N Leu-Tyr-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O VJGQRELPQWNURN-JYJNAYRXSA-N 0.000 description 2
- HVAUKHLDSDDROB-KKUMJFAQSA-N Lys-Lys-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O HVAUKHLDSDDROB-KKUMJFAQSA-N 0.000 description 2
- KZNQNBZMBZJQJO-UHFFFAOYSA-N N-glycyl-L-proline Natural products NCC(=O)N1CCCC1C(O)=O KZNQNBZMBZJQJO-UHFFFAOYSA-N 0.000 description 2
- 108010047562 NGR peptide Proteins 0.000 description 2
- ZPPVJIJMIKTERM-YUMQZZPRSA-N Pro-Gln-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCC(=O)N)NC(=O)[C@@H]1CCCN1 ZPPVJIJMIKTERM-YUMQZZPRSA-N 0.000 description 2
- MCWHYUWXVNRXFV-RWMBFGLXSA-N Pro-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 MCWHYUWXVNRXFV-RWMBFGLXSA-N 0.000 description 2
- ONIBWKKTOPOVIA-UHFFFAOYSA-N Proline Natural products OC(=O)C1CCCN1 ONIBWKKTOPOVIA-UHFFFAOYSA-N 0.000 description 2
- UOLGINIHBRIECN-FXQIFTODSA-N Ser-Glu-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O UOLGINIHBRIECN-FXQIFTODSA-N 0.000 description 2
- YRBGKVIWMNEVCZ-WDSKDSINSA-N Ser-Glu-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O YRBGKVIWMNEVCZ-WDSKDSINSA-N 0.000 description 2
- DSGYZICNAMEJOC-AVGNSLFASA-N Ser-Glu-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O DSGYZICNAMEJOC-AVGNSLFASA-N 0.000 description 2
- KRDSCBLRHORMRK-JXUBOQSCSA-N Thr-Lys-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O KRDSCBLRHORMRK-JXUBOQSCSA-N 0.000 description 2
- RVMNUBQWPVOUKH-HEIBUPTGSA-N Thr-Ser-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O RVMNUBQWPVOUKH-HEIBUPTGSA-N 0.000 description 2
- MQVGIFJSFFVGFW-XEGUGMAKSA-N Trp-Ala-Glu Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O MQVGIFJSFFVGFW-XEGUGMAKSA-N 0.000 description 2
- UKWSFUSPGPBJGU-VFAJRCTISA-N Trp-Leu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N)O UKWSFUSPGPBJGU-VFAJRCTISA-N 0.000 description 2
- JLKVWTICWVWGSK-JYJNAYRXSA-N Tyr-Lys-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 JLKVWTICWVWGSK-JYJNAYRXSA-N 0.000 description 2
- VLOYGOZDPGYWFO-LAEOZQHASA-N Val-Asp-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O VLOYGOZDPGYWFO-LAEOZQHASA-N 0.000 description 2
- BMGOFDMKDVVGJG-NHCYSSNCSA-N Val-Asp-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N BMGOFDMKDVVGJG-NHCYSSNCSA-N 0.000 description 2
- SYSWVVCYSXBVJG-RHYQMDGZSA-N Val-Leu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C(C)C)N)O SYSWVVCYSXBVJG-RHYQMDGZSA-N 0.000 description 2
- 238000000246 agarose gel electrophoresis Methods 0.000 description 2
- 108010008685 alanyl-glutamyl-aspartic acid Proteins 0.000 description 2
- 108010024078 alanyl-glycyl-serine Proteins 0.000 description 2
- 108010047495 alanylglycine Proteins 0.000 description 2
- 210000004102 animal cell Anatomy 0.000 description 2
- 239000007864 aqueous solution Substances 0.000 description 2
- 108010060035 arginylproline Proteins 0.000 description 2
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 2
- 108010068265 aspartyltyrosine Proteins 0.000 description 2
- 238000010276 construction Methods 0.000 description 2
- 238000010586 diagram Methods 0.000 description 2
- 230000004069 differentiation Effects 0.000 description 2
- FSXRLASFHBWESK-UHFFFAOYSA-N dipeptide phenylalanyl-tyrosine Natural products C=1C=C(O)C=CC=1CC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FSXRLASFHBWESK-UHFFFAOYSA-N 0.000 description 2
- 229960002989 glutamic acid Drugs 0.000 description 2
- 108010078144 glutaminyl-glycine Proteins 0.000 description 2
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 2
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 2
- 108010040030 histidinoalanine Proteins 0.000 description 2
- 238000005286 illumination Methods 0.000 description 2
- 108010078274 isoleucylvaline Proteins 0.000 description 2
- 108010034529 leucyl-lysine Proteins 0.000 description 2
- 208000032839 leukemia Diseases 0.000 description 2
- 239000007788 liquid Substances 0.000 description 2
- 108010054155 lysyllysine Proteins 0.000 description 2
- 108010017391 lysylvaline Proteins 0.000 description 2
- 241000264288 mixed libraries Species 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 108010012581 phenylalanylglutamate Proteins 0.000 description 2
- 108010073025 phenylalanylphenylalanine Proteins 0.000 description 2
- 108010051242 phenylalanylserine Proteins 0.000 description 2
- 108010090894 prolylleucine Proteins 0.000 description 2
- 238000011084 recovery Methods 0.000 description 2
- 239000012882 rooting medium Substances 0.000 description 2
- 108010026333 seryl-proline Proteins 0.000 description 2
- 230000001954 sterilising effect Effects 0.000 description 2
- 230000008685 targeting Effects 0.000 description 2
- 108010071097 threonyl-lysyl-proline Proteins 0.000 description 2
- 229940113082 thymine Drugs 0.000 description 2
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Chemical compound O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 2
- BRPMXFSTKXXNHF-IUCAKERBSA-N (2s)-1-[2-[[(2s)-pyrrolidine-2-carbonyl]amino]acetyl]pyrrolidine-2-carboxylic acid Chemical compound OC(=O)[C@@H]1CCCN1C(=O)CNC(=O)[C@H]1NCCC1 BRPMXFSTKXXNHF-IUCAKERBSA-N 0.000 description 1
- AXFMEGAFCUULFV-BLFANLJRSA-N (2s)-2-[[(2s)-1-[(2s,3r)-2-amino-3-methylpentanoyl]pyrrolidine-2-carbonyl]amino]pentanedioic acid Chemical compound CC[C@@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O AXFMEGAFCUULFV-BLFANLJRSA-N 0.000 description 1
- CXRCVCURMBFFOL-FXQIFTODSA-N Ala-Ala-Pro Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(O)=O CXRCVCURMBFFOL-FXQIFTODSA-N 0.000 description 1
- SVBXIUDNTRTKHE-CIUDSAMLSA-N Ala-Arg-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O SVBXIUDNTRTKHE-CIUDSAMLSA-N 0.000 description 1
- JBGSZRYCXBPWGX-BQBZGAKWSA-N Ala-Arg-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](N)C)CCCN=C(N)N JBGSZRYCXBPWGX-BQBZGAKWSA-N 0.000 description 1
- LWUWMHIOBPTZBA-DCAQKATOSA-N Ala-Arg-Lys Chemical compound NC(=N)NCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CCCCN)C(O)=O LWUWMHIOBPTZBA-DCAQKATOSA-N 0.000 description 1
- NXSFUECZFORGOG-CIUDSAMLSA-N Ala-Asn-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O NXSFUECZFORGOG-CIUDSAMLSA-N 0.000 description 1
- BUDNAJYVCUHLSV-ZLUOBGJFSA-N Ala-Asp-Ser Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O BUDNAJYVCUHLSV-ZLUOBGJFSA-N 0.000 description 1
- IFTVANMRTIHKML-WDSKDSINSA-N Ala-Gln-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O IFTVANMRTIHKML-WDSKDSINSA-N 0.000 description 1
- MVBWLRJESQOQTM-ACZMJKKPSA-N Ala-Gln-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O MVBWLRJESQOQTM-ACZMJKKPSA-N 0.000 description 1
- PAIHPOGPJVUFJY-WDSKDSINSA-N Ala-Glu-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O PAIHPOGPJVUFJY-WDSKDSINSA-N 0.000 description 1
- HXNNRBHASOSVPG-GUBZILKMSA-N Ala-Glu-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O HXNNRBHASOSVPG-GUBZILKMSA-N 0.000 description 1
- BEMGNWZECGIJOI-WDSKDSINSA-N Ala-Gly-Glu Chemical compound [H]N[C@@H](C)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O BEMGNWZECGIJOI-WDSKDSINSA-N 0.000 description 1
- QHASENCZLDHBGX-ONGXEEELSA-N Ala-Gly-Phe Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 QHASENCZLDHBGX-ONGXEEELSA-N 0.000 description 1
- NIZKGBJVCMRDKO-KWQFWETISA-N Ala-Gly-Tyr Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NIZKGBJVCMRDKO-KWQFWETISA-N 0.000 description 1
- FDAZDMAFZYTHGS-XVYDVKMFSA-N Ala-His-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(O)=O FDAZDMAFZYTHGS-XVYDVKMFSA-N 0.000 description 1
- LXAARTARZJJCMB-CIQUZCHMSA-N Ala-Ile-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LXAARTARZJJCMB-CIQUZCHMSA-N 0.000 description 1
- LNNSWWRRYJLGNI-NAKRPEOUSA-N Ala-Ile-Val Chemical compound C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O LNNSWWRRYJLGNI-NAKRPEOUSA-N 0.000 description 1
- MNZHHDPWDWQJCQ-YUMQZZPRSA-N Ala-Leu-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O MNZHHDPWDWQJCQ-YUMQZZPRSA-N 0.000 description 1
- DPNZTBKGAUAZQU-DLOVCJGASA-N Ala-Leu-His Chemical compound C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N DPNZTBKGAUAZQU-DLOVCJGASA-N 0.000 description 1
- AJBVYEYZVYPFCF-CIUDSAMLSA-N Ala-Lys-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O AJBVYEYZVYPFCF-CIUDSAMLSA-N 0.000 description 1
- SUHLZMHFRALVSY-YUMQZZPRSA-N Ala-Lys-Gly Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)NCC(O)=O SUHLZMHFRALVSY-YUMQZZPRSA-N 0.000 description 1
- VCSABYLVNWQYQE-UHFFFAOYSA-N Ala-Lys-Lys Natural products NCCCCC(NC(=O)C(N)C)C(=O)NC(CCCCN)C(O)=O VCSABYLVNWQYQE-UHFFFAOYSA-N 0.000 description 1
- KQESEZXHYOUIIM-CQDKDKBSSA-N Ala-Lys-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KQESEZXHYOUIIM-CQDKDKBSSA-N 0.000 description 1
- MDNAVFBZPROEHO-DCAQKATOSA-N Ala-Lys-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O MDNAVFBZPROEHO-DCAQKATOSA-N 0.000 description 1
- 108010011667 Ala-Phe-Ala Proteins 0.000 description 1
- YCRAFFCYWOUEOF-DLOVCJGASA-N Ala-Phe-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 YCRAFFCYWOUEOF-DLOVCJGASA-N 0.000 description 1
- XAXHGSOBFPIRFG-LSJOCFKGSA-N Ala-Pro-His Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O XAXHGSOBFPIRFG-LSJOCFKGSA-N 0.000 description 1
- DCVYRWFAMZFSDA-ZLUOBGJFSA-N Ala-Ser-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O DCVYRWFAMZFSDA-ZLUOBGJFSA-N 0.000 description 1
- RTZCUEHYUQZIDE-WHFBIAKZSA-N Ala-Ser-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RTZCUEHYUQZIDE-WHFBIAKZSA-N 0.000 description 1
- OMCKWYSDUQBYCN-FXQIFTODSA-N Ala-Ser-Met Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCSC)C(O)=O OMCKWYSDUQBYCN-FXQIFTODSA-N 0.000 description 1
- OEVCHROQUIVQFZ-YTLHQDLWSA-N Ala-Thr-Ala Chemical compound C[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](C)C(O)=O OEVCHROQUIVQFZ-YTLHQDLWSA-N 0.000 description 1
- XQNRANMFRPCFFW-GCJQMDKQSA-N Ala-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](C)N)O XQNRANMFRPCFFW-GCJQMDKQSA-N 0.000 description 1
- IOFVWPYSRSCWHI-JXUBOQSCSA-N Ala-Thr-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](C)N IOFVWPYSRSCWHI-JXUBOQSCSA-N 0.000 description 1
- BHFOJPDOQPWJRN-XDTLVQLUSA-N Ala-Tyr-Gln Chemical compound C[C@H](N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CCC(N)=O)C(O)=O BHFOJPDOQPWJRN-XDTLVQLUSA-N 0.000 description 1
- VYMJAWXRWHJIMS-LKTVYLICSA-N Ala-Tyr-His Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N VYMJAWXRWHJIMS-LKTVYLICSA-N 0.000 description 1
- ZJLORAAXDAJLDC-CQDKDKBSSA-N Ala-Tyr-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O ZJLORAAXDAJLDC-CQDKDKBSSA-N 0.000 description 1
- QRIYOHQJRDHFKF-UWJYBYFXSA-N Ala-Tyr-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=C(O)C=C1 QRIYOHQJRDHFKF-UWJYBYFXSA-N 0.000 description 1
- IYKVSFNGSWTTNZ-GUBZILKMSA-N Ala-Val-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IYKVSFNGSWTTNZ-GUBZILKMSA-N 0.000 description 1
- YJHKTAMKPGFJCT-NRPADANISA-N Ala-Val-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O YJHKTAMKPGFJCT-NRPADANISA-N 0.000 description 1
- OMSKGWFGWCQFBD-KZVJFYERSA-N Ala-Val-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OMSKGWFGWCQFBD-KZVJFYERSA-N 0.000 description 1
- KGSJCPBERYUXCN-BPNCWPANSA-N Arg-Ala-Tyr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KGSJCPBERYUXCN-BPNCWPANSA-N 0.000 description 1
- MUXONAMCEUBVGA-DCAQKATOSA-N Arg-Arg-Gln Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(N)=O)C(O)=O MUXONAMCEUBVGA-DCAQKATOSA-N 0.000 description 1
- IASNWHAGGYTEKX-IUCAKERBSA-N Arg-Arg-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)NCC(O)=O IASNWHAGGYTEKX-IUCAKERBSA-N 0.000 description 1
- UISQLSIBJKEJSS-GUBZILKMSA-N Arg-Arg-Ser Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CO)C(O)=O UISQLSIBJKEJSS-GUBZILKMSA-N 0.000 description 1
- OTCJMMRQBVDQRK-DCAQKATOSA-N Arg-Asp-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O OTCJMMRQBVDQRK-DCAQKATOSA-N 0.000 description 1
- SQKPKIJVWHAWNF-DCAQKATOSA-N Arg-Asp-Lys Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(O)=O SQKPKIJVWHAWNF-DCAQKATOSA-N 0.000 description 1
- TTXYKSADPSNOIF-IHRRRGAJSA-N Arg-Asp-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O TTXYKSADPSNOIF-IHRRRGAJSA-N 0.000 description 1
- MFAMTAVAFBPXDC-LPEHRKFASA-N Arg-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O MFAMTAVAFBPXDC-LPEHRKFASA-N 0.000 description 1
- JTWOBPNAVBESFW-FXQIFTODSA-N Arg-Cys-Asp Chemical compound C(C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)CN=C(N)N JTWOBPNAVBESFW-FXQIFTODSA-N 0.000 description 1
- VNFWDYWTSHFRRG-SRVKXCTJSA-N Arg-Gln-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O VNFWDYWTSHFRRG-SRVKXCTJSA-N 0.000 description 1
- OGUPCHKBOKJFMA-SRVKXCTJSA-N Arg-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N OGUPCHKBOKJFMA-SRVKXCTJSA-N 0.000 description 1
- DJAIOAKQIOGULM-DCAQKATOSA-N Arg-Glu-Met Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(O)=O DJAIOAKQIOGULM-DCAQKATOSA-N 0.000 description 1
- SKTGPBFTMNLIHQ-KKUMJFAQSA-N Arg-Glu-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SKTGPBFTMNLIHQ-KKUMJFAQSA-N 0.000 description 1
- GOWZVQXTHUCNSQ-NHCYSSNCSA-N Arg-Glu-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O GOWZVQXTHUCNSQ-NHCYSSNCSA-N 0.000 description 1
- PNIGSVZJNVUVJA-BQBZGAKWSA-N Arg-Gly-Asn Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O PNIGSVZJNVUVJA-BQBZGAKWSA-N 0.000 description 1
- QKSAZKCRVQYYGS-UWVGGRQHSA-N Arg-Gly-His Chemical compound N[C@@H](CCCN=C(N)N)C(=O)NCC(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O QKSAZKCRVQYYGS-UWVGGRQHSA-N 0.000 description 1
- IRRMIGDCPOPZJW-ULQDDVLXSA-N Arg-His-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O IRRMIGDCPOPZJW-ULQDDVLXSA-N 0.000 description 1
- CRCCTGPNZUCAHE-DCAQKATOSA-N Arg-His-Ser Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CN=CN1 CRCCTGPNZUCAHE-DCAQKATOSA-N 0.000 description 1
- YBZMTKUDWXZLIX-UWVGGRQHSA-N Arg-Leu-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O YBZMTKUDWXZLIX-UWVGGRQHSA-N 0.000 description 1
- IIAXFBUTKIDDIP-ULQDDVLXSA-N Arg-Leu-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O IIAXFBUTKIDDIP-ULQDDVLXSA-N 0.000 description 1
- COXMUHNBYCVVRG-DCAQKATOSA-N Arg-Leu-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O COXMUHNBYCVVRG-DCAQKATOSA-N 0.000 description 1
- YVTHEZNOKSAWRW-DCAQKATOSA-N Arg-Lys-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O YVTHEZNOKSAWRW-DCAQKATOSA-N 0.000 description 1
- FSNVAJOPUDVQAR-AVGNSLFASA-N Arg-Lys-Arg Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FSNVAJOPUDVQAR-AVGNSLFASA-N 0.000 description 1
- CLICCYPMVFGUOF-IHRRRGAJSA-N Arg-Lys-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O CLICCYPMVFGUOF-IHRRRGAJSA-N 0.000 description 1
- BTJVOUQWFXABOI-IHRRRGAJSA-N Arg-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCNC(N)=N BTJVOUQWFXABOI-IHRRRGAJSA-N 0.000 description 1
- MTYLORHAQXVQOW-AVGNSLFASA-N Arg-Lys-Met Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(O)=O MTYLORHAQXVQOW-AVGNSLFASA-N 0.000 description 1
- JOADBFCFJGNIKF-GUBZILKMSA-N Arg-Met-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O JOADBFCFJGNIKF-GUBZILKMSA-N 0.000 description 1
- KSUALAGYYLQSHJ-RCWTZXSCSA-N Arg-Met-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KSUALAGYYLQSHJ-RCWTZXSCSA-N 0.000 description 1
- OVQJAKFLFTZDNC-GUBZILKMSA-N Arg-Pro-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O OVQJAKFLFTZDNC-GUBZILKMSA-N 0.000 description 1
- ATABBWFGOHKROJ-GUBZILKMSA-N Arg-Pro-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O ATABBWFGOHKROJ-GUBZILKMSA-N 0.000 description 1
- ADPACBMPYWJJCE-FXQIFTODSA-N Arg-Ser-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O ADPACBMPYWJJCE-FXQIFTODSA-N 0.000 description 1
- DNLQVHBBMPZUGJ-BQBZGAKWSA-N Arg-Ser-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O DNLQVHBBMPZUGJ-BQBZGAKWSA-N 0.000 description 1
- UVTGNSWSRSCPLP-UHFFFAOYSA-N Arg-Tyr Natural products NC(CCNC(=N)N)C(=O)NC(Cc1ccc(O)cc1)C(=O)O UVTGNSWSRSCPLP-UHFFFAOYSA-N 0.000 description 1
- LFWOQHSQNCKXRU-UFYCRDLUSA-N Arg-Tyr-Phe Chemical compound C([C@H](NC(=O)[C@H](CCCN=C(N)N)N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 LFWOQHSQNCKXRU-UFYCRDLUSA-N 0.000 description 1
- IZSMEUDYADKZTJ-KJEVXHAQSA-N Arg-Tyr-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O IZSMEUDYADKZTJ-KJEVXHAQSA-N 0.000 description 1
- PSUXEQYPYZLNER-QXEWZRGKSA-N Arg-Val-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O PSUXEQYPYZLNER-QXEWZRGKSA-N 0.000 description 1
- SLKLLQWZQHXYSV-CIUDSAMLSA-N Asn-Ala-Lys Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O SLKLLQWZQHXYSV-CIUDSAMLSA-N 0.000 description 1
- QEYJFBMTSMLPKZ-ZKWXMUAHSA-N Asn-Ala-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O QEYJFBMTSMLPKZ-ZKWXMUAHSA-N 0.000 description 1
- XHFXZQHTLJVZBN-FXQIFTODSA-N Asn-Arg-Asn Chemical compound C(C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N)CN=C(N)N XHFXZQHTLJVZBN-FXQIFTODSA-N 0.000 description 1
- QHBMKQWOIYJYMI-BYULHYEWSA-N Asn-Asn-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O QHBMKQWOIYJYMI-BYULHYEWSA-N 0.000 description 1
- QISZHYWZHJRDAO-CIUDSAMLSA-N Asn-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)N)N QISZHYWZHJRDAO-CIUDSAMLSA-N 0.000 description 1
- ZPMNECSEJXXNBE-CIUDSAMLSA-N Asn-Cys-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(O)=O ZPMNECSEJXXNBE-CIUDSAMLSA-N 0.000 description 1
- OKZOABJQOMAYEC-NUMRIWBASA-N Asn-Gln-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OKZOABJQOMAYEC-NUMRIWBASA-N 0.000 description 1
- ASCGFDYEKSRNPL-CIUDSAMLSA-N Asn-Glu-Met Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(O)=O ASCGFDYEKSRNPL-CIUDSAMLSA-N 0.000 description 1
- HYQYLOSCICEYTR-YUMQZZPRSA-N Asn-Gly-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O HYQYLOSCICEYTR-YUMQZZPRSA-N 0.000 description 1
- UDSVWSUXKYXSTR-QWRGUYRKSA-N Asn-Gly-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O UDSVWSUXKYXSTR-QWRGUYRKSA-N 0.000 description 1
- XLZCLJRGGMBKLR-PCBIJLKTSA-N Asn-Ile-Phe Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XLZCLJRGGMBKLR-PCBIJLKTSA-N 0.000 description 1
- WIDVAWAQBRAKTI-YUMQZZPRSA-N Asn-Leu-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O WIDVAWAQBRAKTI-YUMQZZPRSA-N 0.000 description 1
- BZWRLDPIWKOVKB-ZPFDUUQYSA-N Asn-Leu-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BZWRLDPIWKOVKB-ZPFDUUQYSA-N 0.000 description 1
- JLNFZLNDHONLND-GARJFASQSA-N Asn-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N JLNFZLNDHONLND-GARJFASQSA-N 0.000 description 1
- ALHMNHZJBYBYHS-DCAQKATOSA-N Asn-Lys-Arg Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O ALHMNHZJBYBYHS-DCAQKATOSA-N 0.000 description 1
- RZNAMKZJPBQWDJ-SRVKXCTJSA-N Asn-Lys-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)N)N RZNAMKZJPBQWDJ-SRVKXCTJSA-N 0.000 description 1
- LSJQOMAZIKQMTJ-SRVKXCTJSA-N Asn-Phe-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O LSJQOMAZIKQMTJ-SRVKXCTJSA-N 0.000 description 1
- RVHGJNGNKGDCPX-KKUMJFAQSA-N Asn-Phe-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N RVHGJNGNKGDCPX-KKUMJFAQSA-N 0.000 description 1
- NCXTYSVDWLAQGZ-ZKWXMUAHSA-N Asn-Ser-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(N)=O NCXTYSVDWLAQGZ-ZKWXMUAHSA-N 0.000 description 1
- HPASIOLTWSNMFB-OLHMAJIHSA-N Asn-Thr-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O HPASIOLTWSNMFB-OLHMAJIHSA-N 0.000 description 1
- GHWWTICYPDKPTE-NGZCFLSTSA-N Asn-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N GHWWTICYPDKPTE-NGZCFLSTSA-N 0.000 description 1
- KRXIWXCXOARFNT-ZLUOBGJFSA-N Asp-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC(O)=O KRXIWXCXOARFNT-ZLUOBGJFSA-N 0.000 description 1
- WSWYMRLTJVKRCE-ZLUOBGJFSA-N Asp-Ala-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O WSWYMRLTJVKRCE-ZLUOBGJFSA-N 0.000 description 1
- VPPXTHJNTYDNFJ-CIUDSAMLSA-N Asp-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N VPPXTHJNTYDNFJ-CIUDSAMLSA-N 0.000 description 1
- UGKZHCBLMLSANF-CIUDSAMLSA-N Asp-Asn-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O UGKZHCBLMLSANF-CIUDSAMLSA-N 0.000 description 1
- JDHOJQJMWBKHDB-CIUDSAMLSA-N Asp-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N JDHOJQJMWBKHDB-CIUDSAMLSA-N 0.000 description 1
- BFOYULZBKYOKAN-OLHMAJIHSA-N Asp-Asp-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BFOYULZBKYOKAN-OLHMAJIHSA-N 0.000 description 1
- KVPHTGVUMJGMCX-BIIVOSGPSA-N Asp-Cys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CS)NC(=O)[C@H](CC(=O)O)N)C(=O)O KVPHTGVUMJGMCX-BIIVOSGPSA-N 0.000 description 1
- NYQHSUGFEWDWPD-ACZMJKKPSA-N Asp-Gln-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)O)N NYQHSUGFEWDWPD-ACZMJKKPSA-N 0.000 description 1
- VILLWIDTHYPSLC-PEFMBERDSA-N Asp-Glu-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VILLWIDTHYPSLC-PEFMBERDSA-N 0.000 description 1
- KHGPWGKPYHPOIK-QWRGUYRKSA-N Asp-Gly-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O KHGPWGKPYHPOIK-QWRGUYRKSA-N 0.000 description 1
- HOBNTSHITVVNBN-ZPFDUUQYSA-N Asp-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CC(=O)O)N HOBNTSHITVVNBN-ZPFDUUQYSA-N 0.000 description 1
- RQHLMGCXCZUOGT-ZPFDUUQYSA-N Asp-Leu-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O RQHLMGCXCZUOGT-ZPFDUUQYSA-N 0.000 description 1
- UMHUHHJMEXNSIV-CIUDSAMLSA-N Asp-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O UMHUHHJMEXNSIV-CIUDSAMLSA-N 0.000 description 1
- ORRJQLIATJDMQM-HJGDQZAQSA-N Asp-Leu-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O ORRJQLIATJDMQM-HJGDQZAQSA-N 0.000 description 1
- CTWCFPWFIGRAEP-CIUDSAMLSA-N Asp-Lys-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O CTWCFPWFIGRAEP-CIUDSAMLSA-N 0.000 description 1
- MYLZFUMPZCPJCJ-NHCYSSNCSA-N Asp-Lys-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O MYLZFUMPZCPJCJ-NHCYSSNCSA-N 0.000 description 1
- WQSXAPPYLGNMQL-IHRRRGAJSA-N Asp-Met-Tyr Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CC(=O)O)N WQSXAPPYLGNMQL-IHRRRGAJSA-N 0.000 description 1
- GYWQGGUCMDCUJE-DLOVCJGASA-N Asp-Phe-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(O)=O GYWQGGUCMDCUJE-DLOVCJGASA-N 0.000 description 1
- LIJXJYGRSRWLCJ-IHRRRGAJSA-N Asp-Phe-Arg Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O LIJXJYGRSRWLCJ-IHRRRGAJSA-N 0.000 description 1
- UAXIKORUDGGIGA-DCAQKATOSA-N Asp-Pro-Lys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC(=O)O)N)C(=O)N[C@@H](CCCCN)C(=O)O UAXIKORUDGGIGA-DCAQKATOSA-N 0.000 description 1
- KGHLGJAXYSVNJP-WHFBIAKZSA-N Asp-Ser-Gly Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O KGHLGJAXYSVNJP-WHFBIAKZSA-N 0.000 description 1
- DRCOAZZDQRCGGP-GHCJXIJMSA-N Asp-Ser-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DRCOAZZDQRCGGP-GHCJXIJMSA-N 0.000 description 1
- QSFHZPQUAAQHAQ-CIUDSAMLSA-N Asp-Ser-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O QSFHZPQUAAQHAQ-CIUDSAMLSA-N 0.000 description 1
- JSHWXQIZOCVWIA-ZKWXMUAHSA-N Asp-Ser-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O JSHWXQIZOCVWIA-ZKWXMUAHSA-N 0.000 description 1
- GCACQYDBDHRVGE-LKXGYXEUSA-N Asp-Thr-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CC(O)=O GCACQYDBDHRVGE-LKXGYXEUSA-N 0.000 description 1
- RSMZEHCMIOKNMW-GSSVUCPTSA-N Asp-Thr-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O RSMZEHCMIOKNMW-GSSVUCPTSA-N 0.000 description 1
- XMKXONRMGJXCJV-LAEOZQHASA-N Asp-Val-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O XMKXONRMGJXCJV-LAEOZQHASA-N 0.000 description 1
- 241000894006 Bacteria Species 0.000 description 1
- 238000010354 CRISPR gene editing Methods 0.000 description 1
- 238000010356 CRISPR-Cas9 genome editing Methods 0.000 description 1
- 108020004705 Codon Proteins 0.000 description 1
- VFGADOJXRLWTBU-JBDRJPRFSA-N Cys-Ile-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CS)N VFGADOJXRLWTBU-JBDRJPRFSA-N 0.000 description 1
- POSRGGKLRWCUBE-CIUDSAMLSA-N Cys-Met-Glu Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CS)N POSRGGKLRWCUBE-CIUDSAMLSA-N 0.000 description 1
- LHJDLVVQRJIURS-SRVKXCTJSA-N Cys-Phe-Asp Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CS)N LHJDLVVQRJIURS-SRVKXCTJSA-N 0.000 description 1
- VIOQRFNAZDMVLO-NRPADANISA-N Cys-Val-Glu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O VIOQRFNAZDMVLO-NRPADANISA-N 0.000 description 1
- 108010090461 DFG peptide Proteins 0.000 description 1
- 230000008265 DNA repair mechanism Effects 0.000 description 1
- 241000710188 Encephalomyocarditis virus Species 0.000 description 1
- YQYJSBFKSSDGFO-UHFFFAOYSA-N Epihygromycin Natural products OC1C(O)C(C(=O)C)OC1OC(C(=C1)O)=CC=C1C=C(C)C(=O)NC1C(O)C(O)C2OCOC2C1O YQYJSBFKSSDGFO-UHFFFAOYSA-N 0.000 description 1
- 241000214054 Equine rhinitis A virus Species 0.000 description 1
- 241000283073 Equus caballus Species 0.000 description 1
- 241000588724 Escherichia coli Species 0.000 description 1
- KVYVOGYEMPEXBT-GUBZILKMSA-N Gln-Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(N)=O KVYVOGYEMPEXBT-GUBZILKMSA-N 0.000 description 1
- OYTPNWYZORARHL-XHNCKOQMSA-N Gln-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)N)N OYTPNWYZORARHL-XHNCKOQMSA-N 0.000 description 1
- JFOKLAPFYCTNHW-SRVKXCTJSA-N Gln-Arg-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCC(=O)N)N JFOKLAPFYCTNHW-SRVKXCTJSA-N 0.000 description 1
- DTMLKCYOQKZXKZ-HJGDQZAQSA-N Gln-Arg-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DTMLKCYOQKZXKZ-HJGDQZAQSA-N 0.000 description 1
- XEYMBRRKIFYQMF-GUBZILKMSA-N Gln-Asp-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O XEYMBRRKIFYQMF-GUBZILKMSA-N 0.000 description 1
- KVXVVDFOZNYYKZ-DCAQKATOSA-N Gln-Gln-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O KVXVVDFOZNYYKZ-DCAQKATOSA-N 0.000 description 1
- MADFVRSKEIEZHZ-DCAQKATOSA-N Gln-Gln-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCC(=O)N)N MADFVRSKEIEZHZ-DCAQKATOSA-N 0.000 description 1
- NPTGGVQJYRSMCM-GLLZPBPUSA-N Gln-Gln-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NPTGGVQJYRSMCM-GLLZPBPUSA-N 0.000 description 1
- MCAVASRGVBVPMX-FXQIFTODSA-N Gln-Glu-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O MCAVASRGVBVPMX-FXQIFTODSA-N 0.000 description 1
- DAAUVRPSZRDMBV-KBIXCLLPSA-N Gln-Ile-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCC(=O)N)N DAAUVRPSZRDMBV-KBIXCLLPSA-N 0.000 description 1
- FTIJVMLAGRAYMJ-MNXVOIDGSA-N Gln-Ile-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(N)=O FTIJVMLAGRAYMJ-MNXVOIDGSA-N 0.000 description 1
- ZNTDJIMJKNNSLR-RWRJDSDZSA-N Gln-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N ZNTDJIMJKNNSLR-RWRJDSDZSA-N 0.000 description 1
- LGIKBBLQVSWUGK-DCAQKATOSA-N Gln-Leu-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O LGIKBBLQVSWUGK-DCAQKATOSA-N 0.000 description 1
- CAXXTYYGFYTBPV-IUCAKERBSA-N Gln-Leu-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O CAXXTYYGFYTBPV-IUCAKERBSA-N 0.000 description 1
- XFAUJGNLHIGXET-AVGNSLFASA-N Gln-Leu-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O XFAUJGNLHIGXET-AVGNSLFASA-N 0.000 description 1
- HSHCEAUPUPJPTE-JYJNAYRXSA-N Gln-Leu-Tyr Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N HSHCEAUPUPJPTE-JYJNAYRXSA-N 0.000 description 1
- IHSGESFHTMFHRB-GUBZILKMSA-N Gln-Lys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCC(N)=O IHSGESFHTMFHRB-GUBZILKMSA-N 0.000 description 1
- UWKPRVKWEKEMSY-DCAQKATOSA-N Gln-Lys-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O UWKPRVKWEKEMSY-DCAQKATOSA-N 0.000 description 1
- ZVQZXPADLZIQFF-FHWLQOOXSA-N Gln-Phe-Tyr Chemical compound C([C@H](NC(=O)[C@H](CCC(N)=O)N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 ZVQZXPADLZIQFF-FHWLQOOXSA-N 0.000 description 1
- XQDGOJPVMSWZSO-SRVKXCTJSA-N Gln-Pro-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CCC(=O)N)N XQDGOJPVMSWZSO-SRVKXCTJSA-N 0.000 description 1
- WLRYGVYQFXRJDA-DCAQKATOSA-N Gln-Pro-Pro Chemical compound NC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 WLRYGVYQFXRJDA-DCAQKATOSA-N 0.000 description 1
- OSCLNNWLKKIQJM-WDSKDSINSA-N Gln-Ser-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)NCC(O)=O OSCLNNWLKKIQJM-WDSKDSINSA-N 0.000 description 1
- LPIKVBWNNVFHCQ-GUBZILKMSA-N Gln-Ser-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O LPIKVBWNNVFHCQ-GUBZILKMSA-N 0.000 description 1
- KPNWAJMEMRCLAL-GUBZILKMSA-N Gln-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)N)N KPNWAJMEMRCLAL-GUBZILKMSA-N 0.000 description 1
- SYZZMPFLOLSMHL-XHNCKOQMSA-N Gln-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)N)N)C(=O)O SYZZMPFLOLSMHL-XHNCKOQMSA-N 0.000 description 1
- ARYKRXHBIPLULY-XKBZYTNZSA-N Gln-Thr-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O ARYKRXHBIPLULY-XKBZYTNZSA-N 0.000 description 1
- XKPACHRGOWQHFH-IRIUXVKKSA-N Gln-Thr-Tyr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O XKPACHRGOWQHFH-IRIUXVKKSA-N 0.000 description 1
- ICRKQMRFXYDYMK-LAEOZQHASA-N Gln-Val-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O ICRKQMRFXYDYMK-LAEOZQHASA-N 0.000 description 1
- FITIQFSXXBKFFM-NRPADANISA-N Gln-Val-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O FITIQFSXXBKFFM-NRPADANISA-N 0.000 description 1
- FHPXTPQBODWBIY-CIUDSAMLSA-N Glu-Ala-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FHPXTPQBODWBIY-CIUDSAMLSA-N 0.000 description 1
- FYBSCGZLICNOBA-XQXXSGGOSA-N Glu-Ala-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O FYBSCGZLICNOBA-XQXXSGGOSA-N 0.000 description 1
- WOMUDRVDJMHTCV-DCAQKATOSA-N Glu-Arg-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O WOMUDRVDJMHTCV-DCAQKATOSA-N 0.000 description 1
- NLKVNZUFDPWPNL-YUMQZZPRSA-N Glu-Arg-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O NLKVNZUFDPWPNL-YUMQZZPRSA-N 0.000 description 1
- PBEQPAZRHDVJQI-SRVKXCTJSA-N Glu-Arg-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCC(=O)O)N PBEQPAZRHDVJQI-SRVKXCTJSA-N 0.000 description 1
- KEBACWCLVOXFNC-DCAQKATOSA-N Glu-Arg-Met Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(O)=O KEBACWCLVOXFNC-DCAQKATOSA-N 0.000 description 1
- RDDSZZJOKDVPAE-ACZMJKKPSA-N Glu-Asn-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O RDDSZZJOKDVPAE-ACZMJKKPSA-N 0.000 description 1
- VAZZOGXDUQSVQF-NUMRIWBASA-N Glu-Asn-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N)O VAZZOGXDUQSVQF-NUMRIWBASA-N 0.000 description 1
- QPRZKNOOOBWXSU-CIUDSAMLSA-N Glu-Asp-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N QPRZKNOOOBWXSU-CIUDSAMLSA-N 0.000 description 1
- NTBDVNJIWCKURJ-ACZMJKKPSA-N Glu-Asp-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O NTBDVNJIWCKURJ-ACZMJKKPSA-N 0.000 description 1
- JVSBYEDSSRZQGV-GUBZILKMSA-N Glu-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCC(O)=O JVSBYEDSSRZQGV-GUBZILKMSA-N 0.000 description 1
- PAQUJCSYVIBPLC-AVGNSLFASA-N Glu-Asp-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 PAQUJCSYVIBPLC-AVGNSLFASA-N 0.000 description 1
- OXEMJGCAJFFREE-FXQIFTODSA-N Glu-Gln-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O OXEMJGCAJFFREE-FXQIFTODSA-N 0.000 description 1
- UMIRPYLZFKOEOH-YVNDNENWSA-N Glu-Gln-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O UMIRPYLZFKOEOH-YVNDNENWSA-N 0.000 description 1
- LVCHEMOPBORRLB-DCAQKATOSA-N Glu-Gln-Lys Chemical compound NCCCC[C@H](NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CCC(O)=O)C(O)=O LVCHEMOPBORRLB-DCAQKATOSA-N 0.000 description 1
- SJPMNHCEWPTRBR-BQBZGAKWSA-N Glu-Glu-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O SJPMNHCEWPTRBR-BQBZGAKWSA-N 0.000 description 1
- YLJHCWNDBKKOEB-IHRRRGAJSA-N Glu-Glu-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O YLJHCWNDBKKOEB-IHRRRGAJSA-N 0.000 description 1
- MTAOBYXRYJZRGQ-WDSKDSINSA-N Glu-Gly-Asp Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O MTAOBYXRYJZRGQ-WDSKDSINSA-N 0.000 description 1
- CUXJIASLBRJOFV-LAEOZQHASA-N Glu-Gly-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O CUXJIASLBRJOFV-LAEOZQHASA-N 0.000 description 1
- NJPQBTJSYCKCNS-HVTMNAMFSA-N Glu-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CCC(=O)O)N NJPQBTJSYCKCNS-HVTMNAMFSA-N 0.000 description 1
- VGOFRWOTSXVPAU-SDDRHHMPSA-N Glu-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CCC(=O)O)N)C(=O)O VGOFRWOTSXVPAU-SDDRHHMPSA-N 0.000 description 1
- ITBHUUMCJJQUSC-LAEOZQHASA-N Glu-Ile-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O ITBHUUMCJJQUSC-LAEOZQHASA-N 0.000 description 1
- ZCOJVESMNGBGLF-GRLWGSQLSA-N Glu-Ile-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ZCOJVESMNGBGLF-GRLWGSQLSA-N 0.000 description 1
- XTZDZAXYPDISRR-MNXVOIDGSA-N Glu-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N XTZDZAXYPDISRR-MNXVOIDGSA-N 0.000 description 1
- ZSWGJYOZWBHROQ-RWRJDSDZSA-N Glu-Ile-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZSWGJYOZWBHROQ-RWRJDSDZSA-N 0.000 description 1
- KRRFFAHEAOCBCQ-SIUGBPQLSA-N Glu-Ile-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KRRFFAHEAOCBCQ-SIUGBPQLSA-N 0.000 description 1
- IRXNJYPKBVERCW-DCAQKATOSA-N Glu-Leu-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O IRXNJYPKBVERCW-DCAQKATOSA-N 0.000 description 1
- NWOUBJNMZDDGDT-AVGNSLFASA-N Glu-Leu-His Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 NWOUBJNMZDDGDT-AVGNSLFASA-N 0.000 description 1
- NJCALAAIGREHDR-WDCWCFNPSA-N Glu-Leu-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NJCALAAIGREHDR-WDCWCFNPSA-N 0.000 description 1
- SWRVAQHFBRZVNX-GUBZILKMSA-N Glu-Lys-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O SWRVAQHFBRZVNX-GUBZILKMSA-N 0.000 description 1
- UJMNFCAHLYKWOZ-DCAQKATOSA-N Glu-Lys-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O UJMNFCAHLYKWOZ-DCAQKATOSA-N 0.000 description 1
- HRBYTAIBKPNZKQ-AVGNSLFASA-N Glu-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCC(O)=O HRBYTAIBKPNZKQ-AVGNSLFASA-N 0.000 description 1
- MFNUFCFRAZPJFW-JYJNAYRXSA-N Glu-Lys-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 MFNUFCFRAZPJFW-JYJNAYRXSA-N 0.000 description 1
- AQNYKMCFCCZEEL-JYJNAYRXSA-N Glu-Lys-Tyr Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 AQNYKMCFCCZEEL-JYJNAYRXSA-N 0.000 description 1
- WIKMTDVSCUJIPJ-CIUDSAMLSA-N Glu-Ser-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCN=C(N)N WIKMTDVSCUJIPJ-CIUDSAMLSA-N 0.000 description 1
- HZISRJBYZAODRV-XQXXSGGOSA-N Glu-Thr-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O HZISRJBYZAODRV-XQXXSGGOSA-N 0.000 description 1
- TWYSSILQABLLME-HJGDQZAQSA-N Glu-Thr-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O TWYSSILQABLLME-HJGDQZAQSA-N 0.000 description 1
- JVYNYWXHZWVJEF-NUMRIWBASA-N Glu-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O JVYNYWXHZWVJEF-NUMRIWBASA-N 0.000 description 1
- UMZHHILWZBFPGL-LOKLDPHHSA-N Glu-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O UMZHHILWZBFPGL-LOKLDPHHSA-N 0.000 description 1
- NTHIHAUEXVTXQG-KKUMJFAQSA-N Glu-Tyr-Arg Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O NTHIHAUEXVTXQG-KKUMJFAQSA-N 0.000 description 1
- MFYLRRCYBBJYPI-JYJNAYRXSA-N Glu-Tyr-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O MFYLRRCYBBJYPI-JYJNAYRXSA-N 0.000 description 1
- KIEICAOUSNYOLM-NRPADANISA-N Glu-Val-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O KIEICAOUSNYOLM-NRPADANISA-N 0.000 description 1
- MLILEEIVMRUYBX-NHCYSSNCSA-N Glu-Val-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O MLILEEIVMRUYBX-NHCYSSNCSA-N 0.000 description 1
- FGGKGJHCVMYGCD-UKJIMTQDSA-N Glu-Val-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FGGKGJHCVMYGCD-UKJIMTQDSA-N 0.000 description 1
- ZYRXTRTUCAVNBQ-GVXVVHGQSA-N Glu-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N ZYRXTRTUCAVNBQ-GVXVVHGQSA-N 0.000 description 1
- WQZGKKKJIJFFOK-GASJEMHNSA-N Glucose Natural products OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-GASJEMHNSA-N 0.000 description 1
- QXPRJQPCFXMCIY-NKWVEPMBSA-N Gly-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN QXPRJQPCFXMCIY-NKWVEPMBSA-N 0.000 description 1
- OCQUNKSFDYDXBG-QXEWZRGKSA-N Gly-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N OCQUNKSFDYDXBG-QXEWZRGKSA-N 0.000 description 1
- DWUKOTKSTDWGAE-BQBZGAKWSA-N Gly-Asn-Arg Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DWUKOTKSTDWGAE-BQBZGAKWSA-N 0.000 description 1
- NZAFOTBEULLEQB-WDSKDSINSA-N Gly-Asn-Glu Chemical compound C(CC(=O)O)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)CN NZAFOTBEULLEQB-WDSKDSINSA-N 0.000 description 1
- BGVYNAQWHSTTSP-BYULHYEWSA-N Gly-Asn-Ile Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BGVYNAQWHSTTSP-BYULHYEWSA-N 0.000 description 1
- BYYNJRSNDARRBX-YFKPBYRVSA-N Gly-Gln-Gly Chemical compound NCC(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O BYYNJRSNDARRBX-YFKPBYRVSA-N 0.000 description 1
- AQLHORCVPGXDJW-IUCAKERBSA-N Gly-Gln-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)CN AQLHORCVPGXDJW-IUCAKERBSA-N 0.000 description 1
- HFXJIZNEXNIZIJ-BQBZGAKWSA-N Gly-Glu-Gln Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O HFXJIZNEXNIZIJ-BQBZGAKWSA-N 0.000 description 1
- LHRXAHLCRMQBGJ-RYUDHWBXSA-N Gly-Glu-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)CN LHRXAHLCRMQBGJ-RYUDHWBXSA-N 0.000 description 1
- MBOAPAXLTUSMQI-JHEQGTHGSA-N Gly-Glu-Thr Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MBOAPAXLTUSMQI-JHEQGTHGSA-N 0.000 description 1
- HQRHFUYMGCHHJS-LURJTMIESA-N Gly-Gly-Arg Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N HQRHFUYMGCHHJS-LURJTMIESA-N 0.000 description 1
- KAJAOGBVWCYGHZ-JTQLQIEISA-N Gly-Gly-Phe Chemical compound [NH3+]CC(=O)NCC(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 KAJAOGBVWCYGHZ-JTQLQIEISA-N 0.000 description 1
- ZKLYPEGLWFVRGF-IUCAKERBSA-N Gly-His-Gln Chemical compound [H]NCC(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(N)=O)C(O)=O ZKLYPEGLWFVRGF-IUCAKERBSA-N 0.000 description 1
- LPCKHUXOGVNZRS-YUMQZZPRSA-N Gly-His-Ser Chemical compound [H]NCC(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O LPCKHUXOGVNZRS-YUMQZZPRSA-N 0.000 description 1
- UESJMAMHDLEHGM-NHCYSSNCSA-N Gly-Ile-Leu Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O UESJMAMHDLEHGM-NHCYSSNCSA-N 0.000 description 1
- YTSVAIMKVLZUDU-YUMQZZPRSA-N Gly-Leu-Asp Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O YTSVAIMKVLZUDU-YUMQZZPRSA-N 0.000 description 1
- NNCSJUBVFBDDLC-YUMQZZPRSA-N Gly-Leu-Ser Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O NNCSJUBVFBDDLC-YUMQZZPRSA-N 0.000 description 1
- VBOBNHSVQKKTOT-YUMQZZPRSA-N Gly-Lys-Ala Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O VBOBNHSVQKKTOT-YUMQZZPRSA-N 0.000 description 1
- MHXKHKWHPNETGG-QWRGUYRKSA-N Gly-Lys-Leu Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O MHXKHKWHPNETGG-QWRGUYRKSA-N 0.000 description 1
- NTBOEZICHOSJEE-YUMQZZPRSA-N Gly-Lys-Ser Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O NTBOEZICHOSJEE-YUMQZZPRSA-N 0.000 description 1
- BBTCXWTXOXUNFX-IUCAKERBSA-N Gly-Met-Arg Chemical compound CSCC[C@H](NC(=O)CN)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O BBTCXWTXOXUNFX-IUCAKERBSA-N 0.000 description 1
- GAFKBWKVXNERFA-QWRGUYRKSA-N Gly-Phe-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 GAFKBWKVXNERFA-QWRGUYRKSA-N 0.000 description 1
- QVDGHDFFYHKJPN-QWRGUYRKSA-N Gly-Phe-Cys Chemical compound NCC(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@@H](CS)C(O)=O QVDGHDFFYHKJPN-QWRGUYRKSA-N 0.000 description 1
- IGOYNRWLWHWAQO-JTQLQIEISA-N Gly-Phe-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 IGOYNRWLWHWAQO-JTQLQIEISA-N 0.000 description 1
- WDXLKVQATNEAJQ-BQBZGAKWSA-N Gly-Pro-Asp Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O WDXLKVQATNEAJQ-BQBZGAKWSA-N 0.000 description 1
- HFPVRZWORNJRRC-UWVGGRQHSA-N Gly-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN HFPVRZWORNJRRC-UWVGGRQHSA-N 0.000 description 1
- CSMYMGFCEJWALV-WDSKDSINSA-N Gly-Ser-Gln Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(N)=O CSMYMGFCEJWALV-WDSKDSINSA-N 0.000 description 1
- VNNRLUNBJSWZPF-ZKWXMUAHSA-N Gly-Ser-Ile Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VNNRLUNBJSWZPF-ZKWXMUAHSA-N 0.000 description 1
- ABPRMMYHROQBLY-NKWVEPMBSA-N Gly-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)CN)C(=O)O ABPRMMYHROQBLY-NKWVEPMBSA-N 0.000 description 1
- WCORRBXVISTKQL-WHFBIAKZSA-N Gly-Ser-Ser Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WCORRBXVISTKQL-WHFBIAKZSA-N 0.000 description 1
- FFJQHWKSGAWSTJ-BFHQHQDPSA-N Gly-Thr-Ala Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O FFJQHWKSGAWSTJ-BFHQHQDPSA-N 0.000 description 1
- PYFIQROSWQERAS-LBPRGKRZSA-N Gly-Trp-Gly Chemical compound C1=CC=C2C(C[C@H](NC(=O)CN)C(=O)NCC(O)=O)=CNC2=C1 PYFIQROSWQERAS-LBPRGKRZSA-N 0.000 description 1
- UIQGJYUEQDOODF-KWQFWETISA-N Gly-Tyr-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 UIQGJYUEQDOODF-KWQFWETISA-N 0.000 description 1
- DUAWRXXTOQOECJ-JSGCOSHPSA-N Gly-Tyr-Val Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O DUAWRXXTOQOECJ-JSGCOSHPSA-N 0.000 description 1
- AFMOTCMSEBITOE-YEPSODPASA-N Gly-Val-Thr Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O AFMOTCMSEBITOE-YEPSODPASA-N 0.000 description 1
- KZTLOHBDLMIFSH-XVYDVKMFSA-N His-Ala-Asp Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O KZTLOHBDLMIFSH-XVYDVKMFSA-N 0.000 description 1
- WYWBYSPRCFADBM-GARJFASQSA-N His-Cys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CS)NC(=O)[C@H](CC2=CN=CN2)N)C(=O)O WYWBYSPRCFADBM-GARJFASQSA-N 0.000 description 1
- DVHGLDYMGWTYKW-GUBZILKMSA-N His-Gln-Ser Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O DVHGLDYMGWTYKW-GUBZILKMSA-N 0.000 description 1
- BDFCIKANUNMFGB-PMVVWTBXSA-N His-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CN=CN1 BDFCIKANUNMFGB-PMVVWTBXSA-N 0.000 description 1
- IDQNVIWPPWAFSY-AVGNSLFASA-N His-His-Gln Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(N)=O)C(O)=O IDQNVIWPPWAFSY-AVGNSLFASA-N 0.000 description 1
- WJGSTIMGSIWHJX-HVTMNAMFSA-N His-Ile-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N WJGSTIMGSIWHJX-HVTMNAMFSA-N 0.000 description 1
- TWROVBNEHJSXDG-IHRRRGAJSA-N His-Leu-Val Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O TWROVBNEHJSXDG-IHRRRGAJSA-N 0.000 description 1
- DEOQGJUXUQGUJN-KKUMJFAQSA-N His-Lys-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N DEOQGJUXUQGUJN-KKUMJFAQSA-N 0.000 description 1
- TTYKEFZRLKQTHH-MELADBBJSA-N His-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CC2=CN=CN2)N)C(=O)O TTYKEFZRLKQTHH-MELADBBJSA-N 0.000 description 1
- PYNPBMCLAKTHJL-SRVKXCTJSA-N His-Pro-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O PYNPBMCLAKTHJL-SRVKXCTJSA-N 0.000 description 1
- OWYIDJCNRWRSJY-QTKMDUPCSA-N His-Pro-Thr Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O OWYIDJCNRWRSJY-QTKMDUPCSA-N 0.000 description 1
- YAJQKIBLYPFAET-NAZCDGGXSA-N His-Thr-Trp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CC3=CN=CN3)N)O YAJQKIBLYPFAET-NAZCDGGXSA-N 0.000 description 1
- UXZMINKIEWBEQU-SZMVWBNQSA-N His-Trp-Gln Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC3=CN=CN3)N UXZMINKIEWBEQU-SZMVWBNQSA-N 0.000 description 1
- ZHMZWSFQRUGLEC-JYJNAYRXSA-N His-Tyr-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZHMZWSFQRUGLEC-JYJNAYRXSA-N 0.000 description 1
- KFQDSSNYWKZFOO-LSJOCFKGSA-N His-Val-Ala Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O KFQDSSNYWKZFOO-LSJOCFKGSA-N 0.000 description 1
- VSZALHITQINTGC-GHCJXIJMSA-N Ile-Ala-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC(=O)O)C(=O)O)N VSZALHITQINTGC-GHCJXIJMSA-N 0.000 description 1
- YKRYHWJRQUSTKG-KBIXCLLPSA-N Ile-Ala-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N YKRYHWJRQUSTKG-KBIXCLLPSA-N 0.000 description 1
- RWIKBYVJQAJYDP-BJDJZHNGSA-N Ile-Ala-Lys Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN RWIKBYVJQAJYDP-BJDJZHNGSA-N 0.000 description 1
- SACHLUOUHCVIKI-GMOBBJLQSA-N Ile-Arg-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N SACHLUOUHCVIKI-GMOBBJLQSA-N 0.000 description 1
- YKRIXHPEIZUDDY-GMOBBJLQSA-N Ile-Asn-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YKRIXHPEIZUDDY-GMOBBJLQSA-N 0.000 description 1
- QYZYJFXHXYUZMZ-UGYAYLCHSA-N Ile-Asn-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N QYZYJFXHXYUZMZ-UGYAYLCHSA-N 0.000 description 1
- UAVQIQOOBXFKRC-BYULHYEWSA-N Ile-Asn-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O UAVQIQOOBXFKRC-BYULHYEWSA-N 0.000 description 1
- UMYZBHKAVTXWIW-GMOBBJLQSA-N Ile-Asp-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N UMYZBHKAVTXWIW-GMOBBJLQSA-N 0.000 description 1
- NKRJALPCDNXULF-BYULHYEWSA-N Ile-Asp-Gly Chemical compound [H]N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O NKRJALPCDNXULF-BYULHYEWSA-N 0.000 description 1
- REJKOQYVFDEZHA-SLBDDTMCSA-N Ile-Asp-Trp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N REJKOQYVFDEZHA-SLBDDTMCSA-N 0.000 description 1
- LOXMWQOKYBGCHF-JBDRJPRFSA-N Ile-Cys-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@@H](C)C(O)=O LOXMWQOKYBGCHF-JBDRJPRFSA-N 0.000 description 1
- KIMHKBDJQQYLHU-PEFMBERDSA-N Ile-Glu-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N KIMHKBDJQQYLHU-PEFMBERDSA-N 0.000 description 1
- PHIXPNQDGGILMP-YVNDNENWSA-N Ile-Glu-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N PHIXPNQDGGILMP-YVNDNENWSA-N 0.000 description 1
- DFJJAVZIHDFOGQ-MNXVOIDGSA-N Ile-Glu-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N DFJJAVZIHDFOGQ-MNXVOIDGSA-N 0.000 description 1
- IGJWJGIHUFQANP-LAEOZQHASA-N Ile-Gly-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CCC(=O)N)C(=O)O)N IGJWJGIHUFQANP-LAEOZQHASA-N 0.000 description 1
- NYEYYMLUABXDMC-NHCYSSNCSA-N Ile-Gly-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CC(C)C)C(=O)O)N NYEYYMLUABXDMC-NHCYSSNCSA-N 0.000 description 1
- DFFTXLCCDFYRKD-MBLNEYKQSA-N Ile-Gly-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(=O)O)N DFFTXLCCDFYRKD-MBLNEYKQSA-N 0.000 description 1
- UASTVUQJMLZWGG-PEXQALLHSA-N Ile-His-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)NCC(=O)O)N UASTVUQJMLZWGG-PEXQALLHSA-N 0.000 description 1
- LNJLOZYNZFGJMM-DEQVHRJGSA-N Ile-His-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N2CCC[C@@H]2C(=O)O)N LNJLOZYNZFGJMM-DEQVHRJGSA-N 0.000 description 1
- CSQNHSGHAPRGPQ-YTFOTSKYSA-N Ile-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCCN)C(=O)O)N CSQNHSGHAPRGPQ-YTFOTSKYSA-N 0.000 description 1
- UWLHDGMRWXHFFY-HPCHECBXSA-N Ile-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N1CCC[C@@H]1C(=O)O)N UWLHDGMRWXHFFY-HPCHECBXSA-N 0.000 description 1
- HUORUFRRJHELPD-MNXVOIDGSA-N Ile-Leu-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N HUORUFRRJHELPD-MNXVOIDGSA-N 0.000 description 1
- TVYWVSJGSHQWMT-AJNGGQMLSA-N Ile-Leu-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N TVYWVSJGSHQWMT-AJNGGQMLSA-N 0.000 description 1
- PHRWFSFCNJPWRO-PPCPHDFISA-N Ile-Leu-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N PHRWFSFCNJPWRO-PPCPHDFISA-N 0.000 description 1
- PNTWNAXGBOZMBO-MNXVOIDGSA-N Ile-Lys-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N PNTWNAXGBOZMBO-MNXVOIDGSA-N 0.000 description 1
- YSGBJIQXTIVBHZ-AJNGGQMLSA-N Ile-Lys-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O YSGBJIQXTIVBHZ-AJNGGQMLSA-N 0.000 description 1
- GVNNAHIRSDRIII-AJNGGQMLSA-N Ile-Lys-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)O)N GVNNAHIRSDRIII-AJNGGQMLSA-N 0.000 description 1
- IDMNOFVUXYYZPF-DKIMLUQUSA-N Ile-Lys-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N IDMNOFVUXYYZPF-DKIMLUQUSA-N 0.000 description 1
- CIDLJWVDMNDKPT-FIRPJDEBSA-N Ile-Phe-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)O)N CIDLJWVDMNDKPT-FIRPJDEBSA-N 0.000 description 1
- FGBRXCZYVRFNKQ-MXAVVETBSA-N Ile-Phe-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)O)N FGBRXCZYVRFNKQ-MXAVVETBSA-N 0.000 description 1
- KCTIFOCXAIUQQK-QXEWZRGKSA-N Ile-Pro-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O KCTIFOCXAIUQQK-QXEWZRGKSA-N 0.000 description 1
- YCKPUHHMCFSUMD-IUKAMOBKSA-N Ile-Thr-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N YCKPUHHMCFSUMD-IUKAMOBKSA-N 0.000 description 1
- YBKKLDBBPFIXBQ-MBLNEYKQSA-N Ile-Thr-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(=O)O)N YBKKLDBBPFIXBQ-MBLNEYKQSA-N 0.000 description 1
- HJDZMPFEXINXLO-QPHKQPEJSA-N Ile-Thr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N HJDZMPFEXINXLO-QPHKQPEJSA-N 0.000 description 1
- UYODHPPSCXBNCS-XUXIUFHCSA-N Ile-Val-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(C)C UYODHPPSCXBNCS-XUXIUFHCSA-N 0.000 description 1
- NJGXXYLPDMMFJB-XUXIUFHCSA-N Ile-Val-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N NJGXXYLPDMMFJB-XUXIUFHCSA-N 0.000 description 1
- LHSGPCFBGJHPCY-UHFFFAOYSA-N L-leucine-L-tyrosine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 LHSGPCFBGJHPCY-UHFFFAOYSA-N 0.000 description 1
- SENJXOPIZNYLHU-UHFFFAOYSA-N L-leucyl-L-arginine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CCCN=C(N)N SENJXOPIZNYLHU-UHFFFAOYSA-N 0.000 description 1
- TYYLDKGBCJGJGW-UHFFFAOYSA-N L-tryptophan-L-tyrosine Natural products C=1NC2=CC=CC=C2C=1CC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 TYYLDKGBCJGJGW-UHFFFAOYSA-N 0.000 description 1
- 241000880493 Leptailurus serval Species 0.000 description 1
- CZCSUZMIRKFFFA-CIUDSAMLSA-N Leu-Ala-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O CZCSUZMIRKFFFA-CIUDSAMLSA-N 0.000 description 1
- CQQGCWPXDHTTNF-GUBZILKMSA-N Leu-Ala-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(O)=O CQQGCWPXDHTTNF-GUBZILKMSA-N 0.000 description 1
- YOZCKMXHBYKOMQ-IHRRRGAJSA-N Leu-Arg-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(=O)O)N YOZCKMXHBYKOMQ-IHRRRGAJSA-N 0.000 description 1
- IGUOAYLTQJLPPD-DCAQKATOSA-N Leu-Asn-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IGUOAYLTQJLPPD-DCAQKATOSA-N 0.000 description 1
- WGNOPSQMIQERPK-GARJFASQSA-N Leu-Asn-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N WGNOPSQMIQERPK-GARJFASQSA-N 0.000 description 1
- WGNOPSQMIQERPK-UHFFFAOYSA-N Leu-Asn-Pro Natural products CC(C)CC(N)C(=O)NC(CC(=O)N)C(=O)N1CCCC1C(=O)O WGNOPSQMIQERPK-UHFFFAOYSA-N 0.000 description 1
- BPANDPNDMJHFEV-CIUDSAMLSA-N Leu-Asp-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O BPANDPNDMJHFEV-CIUDSAMLSA-N 0.000 description 1
- PJYSOYLLTJKZHC-GUBZILKMSA-N Leu-Asp-Gln Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(N)=O PJYSOYLLTJKZHC-GUBZILKMSA-N 0.000 description 1
- ILJREDZFPHTUIE-GUBZILKMSA-N Leu-Asp-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ILJREDZFPHTUIE-GUBZILKMSA-N 0.000 description 1
- JQSXWJXBASFONF-KKUMJFAQSA-N Leu-Asp-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JQSXWJXBASFONF-KKUMJFAQSA-N 0.000 description 1
- ZYLJULGXQDNXDK-GUBZILKMSA-N Leu-Gln-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O ZYLJULGXQDNXDK-GUBZILKMSA-N 0.000 description 1
- BOFAFKVZQUMTID-AVGNSLFASA-N Leu-Gln-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N BOFAFKVZQUMTID-AVGNSLFASA-N 0.000 description 1
- LOLUPZNNADDTAA-AVGNSLFASA-N Leu-Gln-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LOLUPZNNADDTAA-AVGNSLFASA-N 0.000 description 1
- FQZPTCNSNPWHLJ-AVGNSLFASA-N Leu-Gln-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(O)=O FQZPTCNSNPWHLJ-AVGNSLFASA-N 0.000 description 1
- CIVKXGPFXDIQBV-WDCWCFNPSA-N Leu-Gln-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CIVKXGPFXDIQBV-WDCWCFNPSA-N 0.000 description 1
- HFBCHNRFRYLZNV-GUBZILKMSA-N Leu-Glu-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O HFBCHNRFRYLZNV-GUBZILKMSA-N 0.000 description 1
- WIDZHJTYKYBLSR-DCAQKATOSA-N Leu-Glu-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O WIDZHJTYKYBLSR-DCAQKATOSA-N 0.000 description 1
- YFBBUHJJUXXZOF-UWVGGRQHSA-N Leu-Gly-Pro Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N1CCC[C@H]1C(O)=O YFBBUHJJUXXZOF-UWVGGRQHSA-N 0.000 description 1
- UCDHVOALNXENLC-KBPBESRZSA-N Leu-Gly-Tyr Chemical compound CC(C)C[C@H]([NH3+])C(=O)NCC(=O)N[C@H](C([O-])=O)CC1=CC=C(O)C=C1 UCDHVOALNXENLC-KBPBESRZSA-N 0.000 description 1
- KXODZBLFVFSLAI-AVGNSLFASA-N Leu-His-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(C)C)CC1=CN=CN1 KXODZBLFVFSLAI-AVGNSLFASA-N 0.000 description 1
- AVEGDIAXTDVBJS-XUXIUFHCSA-N Leu-Ile-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O AVEGDIAXTDVBJS-XUXIUFHCSA-N 0.000 description 1
- JFSGIJSCJFQGSZ-MXAVVETBSA-N Leu-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC(C)C)N JFSGIJSCJFQGSZ-MXAVVETBSA-N 0.000 description 1
- HNDWYLYAYNBWMP-AJNGGQMLSA-N Leu-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(C)C)N HNDWYLYAYNBWMP-AJNGGQMLSA-N 0.000 description 1
- LIINDKYIGYTDLG-PPCPHDFISA-N Leu-Ile-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LIINDKYIGYTDLG-PPCPHDFISA-N 0.000 description 1
- JNDYEOUZBLOVOF-AVGNSLFASA-N Leu-Leu-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O JNDYEOUZBLOVOF-AVGNSLFASA-N 0.000 description 1
- YOKVEHGYYQEQOP-QWRGUYRKSA-N Leu-Leu-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O YOKVEHGYYQEQOP-QWRGUYRKSA-N 0.000 description 1
- UBZGNBKMIJHOHL-BZSNNMDCSA-N Leu-Leu-Phe Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 UBZGNBKMIJHOHL-BZSNNMDCSA-N 0.000 description 1
- ZRHDPZAAWLXXIR-SRVKXCTJSA-N Leu-Lys-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O ZRHDPZAAWLXXIR-SRVKXCTJSA-N 0.000 description 1
- JLWZLIQRYCTYBD-IHRRRGAJSA-N Leu-Lys-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JLWZLIQRYCTYBD-IHRRRGAJSA-N 0.000 description 1
- ZGUMORRUBUCXEH-AVGNSLFASA-N Leu-Lys-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O ZGUMORRUBUCXEH-AVGNSLFASA-N 0.000 description 1
- HVHRPWQEQHIQJF-AVGNSLFASA-N Leu-Lys-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O HVHRPWQEQHIQJF-AVGNSLFASA-N 0.000 description 1
- VVQJGYPTIYOFBR-IHRRRGAJSA-N Leu-Lys-Met Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)O)N VVQJGYPTIYOFBR-IHRRRGAJSA-N 0.000 description 1
- DDVHDMSBLRAKNV-IHRRRGAJSA-N Leu-Met-Leu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O DDVHDMSBLRAKNV-IHRRRGAJSA-N 0.000 description 1
- ZDBMWELMUCLUPL-QEJZJMRPSA-N Leu-Phe-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CC=CC=C1 ZDBMWELMUCLUPL-QEJZJMRPSA-N 0.000 description 1
- BIZNDKMFQHDOIE-KKUMJFAQSA-N Leu-Phe-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(O)=O)CC1=CC=CC=C1 BIZNDKMFQHDOIE-KKUMJFAQSA-N 0.000 description 1
- ZAVCJRJOQKIOJW-KKUMJFAQSA-N Leu-Phe-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)CC1=CC=CC=C1 ZAVCJRJOQKIOJW-KKUMJFAQSA-N 0.000 description 1
- SYRTUBLKWNDSDK-DKIMLUQUSA-N Leu-Phe-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O SYRTUBLKWNDSDK-DKIMLUQUSA-N 0.000 description 1
- PJWOOBTYQNNRBF-BZSNNMDCSA-N Leu-Phe-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)O)N PJWOOBTYQNNRBF-BZSNNMDCSA-N 0.000 description 1
- BMVFXOQHDQZAQU-DCAQKATOSA-N Leu-Pro-Asp Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(=O)O)C(=O)O)N BMVFXOQHDQZAQU-DCAQKATOSA-N 0.000 description 1
- VULJUQZPSOASBZ-SRVKXCTJSA-N Leu-Pro-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O VULJUQZPSOASBZ-SRVKXCTJSA-N 0.000 description 1
- UCBPDSYUVAAHCD-UWVGGRQHSA-N Leu-Pro-Gly Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O UCBPDSYUVAAHCD-UWVGGRQHSA-N 0.000 description 1
- DPURXCQCHSQPAN-AVGNSLFASA-N Leu-Pro-Pro Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 DPURXCQCHSQPAN-AVGNSLFASA-N 0.000 description 1
- RGUXWMDNCPMQFB-YUMQZZPRSA-N Leu-Ser-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RGUXWMDNCPMQFB-YUMQZZPRSA-N 0.000 description 1
- ICYRCNICGBJLGM-HJGDQZAQSA-N Leu-Thr-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(O)=O ICYRCNICGBJLGM-HJGDQZAQSA-N 0.000 description 1
- LCNASHSOFMRYFO-WDCWCFNPSA-N Leu-Thr-Gln Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCC(N)=O LCNASHSOFMRYFO-WDCWCFNPSA-N 0.000 description 1
- ODRREERHVHMIPT-OEAJRASXSA-N Leu-Thr-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ODRREERHVHMIPT-OEAJRASXSA-N 0.000 description 1
- KLSUAWUZBMAZCL-RHYQMDGZSA-N Leu-Thr-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(O)=O KLSUAWUZBMAZCL-RHYQMDGZSA-N 0.000 description 1
- RNYLNYTYMXACRI-VFAJRCTISA-N Leu-Thr-Trp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O RNYLNYTYMXACRI-VFAJRCTISA-N 0.000 description 1
- HGLKOTPFWOMPOB-MEYUZBJRSA-N Leu-Thr-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 HGLKOTPFWOMPOB-MEYUZBJRSA-N 0.000 description 1
- JGKHAFUAPZCCDU-BZSNNMDCSA-N Leu-Tyr-Leu Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)CC1=CC=C(O)C=C1 JGKHAFUAPZCCDU-BZSNNMDCSA-N 0.000 description 1
- AXVIGSRGTMNSJU-YESZJQIVSA-N Leu-Tyr-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N2CCC[C@@H]2C(=O)O)N AXVIGSRGTMNSJU-YESZJQIVSA-N 0.000 description 1
- FBNPMTNBFFAMMH-AVGNSLFASA-N Leu-Val-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-AVGNSLFASA-N 0.000 description 1
- FBNPMTNBFFAMMH-UHFFFAOYSA-N Leu-Val-Arg Natural products CC(C)CC(N)C(=O)NC(C(C)C)C(=O)NC(C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-UHFFFAOYSA-N 0.000 description 1
- VKVDRTGWLVZJOM-DCAQKATOSA-N Leu-Val-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O VKVDRTGWLVZJOM-DCAQKATOSA-N 0.000 description 1
- 241000209510 Liliopsida Species 0.000 description 1
- XFIHDSBIPWEYJJ-YUMQZZPRSA-N Lys-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN XFIHDSBIPWEYJJ-YUMQZZPRSA-N 0.000 description 1
- NFLFJGGKOHYZJF-BJDJZHNGSA-N Lys-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN NFLFJGGKOHYZJF-BJDJZHNGSA-N 0.000 description 1
- KNKHAVVBVXKOGX-JXUBOQSCSA-N Lys-Ala-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KNKHAVVBVXKOGX-JXUBOQSCSA-N 0.000 description 1
- VHNOAIFVYUQOOY-XUXIUFHCSA-N Lys-Arg-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VHNOAIFVYUQOOY-XUXIUFHCSA-N 0.000 description 1
- YNNPKXBBRZVIRX-IHRRRGAJSA-N Lys-Arg-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O YNNPKXBBRZVIRX-IHRRRGAJSA-N 0.000 description 1
- CKSXSQUVEYCDIW-AVGNSLFASA-N Lys-Arg-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCCCN)N CKSXSQUVEYCDIW-AVGNSLFASA-N 0.000 description 1
- GGAPIOORBXHMNY-ULQDDVLXSA-N Lys-Arg-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCCCN)N)O GGAPIOORBXHMNY-ULQDDVLXSA-N 0.000 description 1
- DNEJSAIMVANNPA-DCAQKATOSA-N Lys-Asn-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O DNEJSAIMVANNPA-DCAQKATOSA-N 0.000 description 1
- ZQCVMVCVPFYXHZ-SRVKXCTJSA-N Lys-Asn-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCCN ZQCVMVCVPFYXHZ-SRVKXCTJSA-N 0.000 description 1
- DGWXCIORNLWGGG-CIUDSAMLSA-N Lys-Asn-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O DGWXCIORNLWGGG-CIUDSAMLSA-N 0.000 description 1
- HKCCVDWHHTVVPN-CIUDSAMLSA-N Lys-Asp-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O HKCCVDWHHTVVPN-CIUDSAMLSA-N 0.000 description 1
- KPJJOZUXFOLGMQ-CIUDSAMLSA-N Lys-Asp-Asn Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N KPJJOZUXFOLGMQ-CIUDSAMLSA-N 0.000 description 1
- QIJVAFLRMVBHMU-KKUMJFAQSA-N Lys-Asp-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QIJVAFLRMVBHMU-KKUMJFAQSA-N 0.000 description 1
- PHHYNOUOUWYQRO-XIRDDKMYSA-N Lys-Asp-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCCCN)N PHHYNOUOUWYQRO-XIRDDKMYSA-N 0.000 description 1
- RZHLIPMZXOEJTL-AVGNSLFASA-N Lys-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCCN)N RZHLIPMZXOEJTL-AVGNSLFASA-N 0.000 description 1
- LLSUNJYOSCOOEB-GUBZILKMSA-N Lys-Glu-Asp Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O LLSUNJYOSCOOEB-GUBZILKMSA-N 0.000 description 1
- LPAJOCKCPRZEAG-MNXVOIDGSA-N Lys-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCCCN LPAJOCKCPRZEAG-MNXVOIDGSA-N 0.000 description 1
- DCRWPTBMWMGADO-AVGNSLFASA-N Lys-Glu-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O DCRWPTBMWMGADO-AVGNSLFASA-N 0.000 description 1
- WGLAORUKDGRINI-WDCWCFNPSA-N Lys-Glu-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O WGLAORUKDGRINI-WDCWCFNPSA-N 0.000 description 1
- ULUQBUKAPDUKOC-GVXVVHGQSA-N Lys-Glu-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O ULUQBUKAPDUKOC-GVXVVHGQSA-N 0.000 description 1
- GQZMPWBZQALKJO-UWVGGRQHSA-N Lys-Gly-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O GQZMPWBZQALKJO-UWVGGRQHSA-N 0.000 description 1
- HAUUXTXKJNVIFY-ONGXEEELSA-N Lys-Gly-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O HAUUXTXKJNVIFY-ONGXEEELSA-N 0.000 description 1
- PGLGNCVOWIORQE-SRVKXCTJSA-N Lys-His-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O PGLGNCVOWIORQE-SRVKXCTJSA-N 0.000 description 1
- QBEPTBMRQALPEV-MNXVOIDGSA-N Lys-Ile-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCCCN QBEPTBMRQALPEV-MNXVOIDGSA-N 0.000 description 1
- NJNRBRKHOWSGMN-SRVKXCTJSA-N Lys-Leu-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O NJNRBRKHOWSGMN-SRVKXCTJSA-N 0.000 description 1
- SKRGVGLIRUGANF-AVGNSLFASA-N Lys-Leu-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SKRGVGLIRUGANF-AVGNSLFASA-N 0.000 description 1
- WVJNGSFKBKOKRV-AJNGGQMLSA-N Lys-Leu-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WVJNGSFKBKOKRV-AJNGGQMLSA-N 0.000 description 1
- RBEATVHTWHTHTJ-KKUMJFAQSA-N Lys-Leu-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O RBEATVHTWHTHTJ-KKUMJFAQSA-N 0.000 description 1
- XOQMURBBIXRRCR-SRVKXCTJSA-N Lys-Lys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCN XOQMURBBIXRRCR-SRVKXCTJSA-N 0.000 description 1
- YUAXTFMFMOIMAM-QWRGUYRKSA-N Lys-Lys-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O YUAXTFMFMOIMAM-QWRGUYRKSA-N 0.000 description 1
- ATNKHRAIZCMCCN-BZSNNMDCSA-N Lys-Lys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N ATNKHRAIZCMCCN-BZSNNMDCSA-N 0.000 description 1
- GZGWILAQHOVXTD-DCAQKATOSA-N Lys-Met-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(O)=O GZGWILAQHOVXTD-DCAQKATOSA-N 0.000 description 1
- MTBLFIQZECOEBY-IHRRRGAJSA-N Lys-Met-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(O)=O MTBLFIQZECOEBY-IHRRRGAJSA-N 0.000 description 1
- LNMKRJJLEFASGA-BZSNNMDCSA-N Lys-Phe-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O LNMKRJJLEFASGA-BZSNNMDCSA-N 0.000 description 1
- BOJYMMBYBNOOGG-DCAQKATOSA-N Lys-Pro-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O BOJYMMBYBNOOGG-DCAQKATOSA-N 0.000 description 1
- PDIDTSZKKFEDMB-UWVGGRQHSA-N Lys-Pro-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O PDIDTSZKKFEDMB-UWVGGRQHSA-N 0.000 description 1
- MGKFCQFVPKOWOL-CIUDSAMLSA-N Lys-Ser-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)O)C(=O)O)N MGKFCQFVPKOWOL-CIUDSAMLSA-N 0.000 description 1
- QVTDVTONTRSQMF-WDCWCFNPSA-N Lys-Thr-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CCCCN QVTDVTONTRSQMF-WDCWCFNPSA-N 0.000 description 1
- DLCAXBGXGOVUCD-PPCPHDFISA-N Lys-Thr-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DLCAXBGXGOVUCD-PPCPHDFISA-N 0.000 description 1
- BDFHWFUAQLIMJO-KXNHARMFSA-N Lys-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N)O BDFHWFUAQLIMJO-KXNHARMFSA-N 0.000 description 1
- RMKJOQSYLQQRFN-KKUMJFAQSA-N Lys-Tyr-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O RMKJOQSYLQQRFN-KKUMJFAQSA-N 0.000 description 1
- RQILLQOQXLZTCK-KBPBESRZSA-N Lys-Tyr-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O RQILLQOQXLZTCK-KBPBESRZSA-N 0.000 description 1
- LMMBAXJRYSXCOQ-ACRUOGEOSA-N Lys-Tyr-Phe Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](Cc1ccccc1)C(O)=O LMMBAXJRYSXCOQ-ACRUOGEOSA-N 0.000 description 1
- IEIHKHYMBIYQTH-YESZJQIVSA-N Lys-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CCCCN)N)C(=O)O IEIHKHYMBIYQTH-YESZJQIVSA-N 0.000 description 1
- OHXUUQDOBQKSNB-AVGNSLFASA-N Lys-Val-Arg Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O OHXUUQDOBQKSNB-AVGNSLFASA-N 0.000 description 1
- DRRXXZBXDMLGFC-IHRRRGAJSA-N Lys-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN DRRXXZBXDMLGFC-IHRRRGAJSA-N 0.000 description 1
- NYTDJEZBAAFLLG-IHRRRGAJSA-N Lys-Val-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(O)=O NYTDJEZBAAFLLG-IHRRRGAJSA-N 0.000 description 1
- RIPJMCFGQHGHNP-RHYQMDGZSA-N Lys-Val-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CCCCN)N)O RIPJMCFGQHGHNP-RHYQMDGZSA-N 0.000 description 1
- UAPZLLPGGOOCRO-IHRRRGAJSA-N Met-Asn-Phe Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N UAPZLLPGGOOCRO-IHRRRGAJSA-N 0.000 description 1
- HDNOQCZWJGGHSS-VEVYYDQMSA-N Met-Asn-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O HDNOQCZWJGGHSS-VEVYYDQMSA-N 0.000 description 1
- VOOINLQYUZOREH-SRVKXCTJSA-N Met-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCSC)N VOOINLQYUZOREH-SRVKXCTJSA-N 0.000 description 1
- KQBJYJXPZBNEIK-DCAQKATOSA-N Met-Glu-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCNC(N)=N KQBJYJXPZBNEIK-DCAQKATOSA-N 0.000 description 1
- GVIVXNFKJQFTCE-YUMQZZPRSA-N Met-Gly-Gln Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O GVIVXNFKJQFTCE-YUMQZZPRSA-N 0.000 description 1
- SLQDSYZHHOKQSR-QXEWZRGKSA-N Met-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCSC SLQDSYZHHOKQSR-QXEWZRGKSA-N 0.000 description 1
- MYAPQOBHGWJZOM-UWVGGRQHSA-N Met-Gly-Leu Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(C)C MYAPQOBHGWJZOM-UWVGGRQHSA-N 0.000 description 1
- PZUUMQPMHBJJKE-AVGNSLFASA-N Met-Leu-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCNC(N)=N PZUUMQPMHBJJKE-AVGNSLFASA-N 0.000 description 1
- MSSJHBAKDDIRMJ-SRVKXCTJSA-N Met-Lys-Gln Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O MSSJHBAKDDIRMJ-SRVKXCTJSA-N 0.000 description 1
- HAQLBBVZAGMESV-IHRRRGAJSA-N Met-Lys-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O HAQLBBVZAGMESV-IHRRRGAJSA-N 0.000 description 1
- KRLKICLNEICJGV-STQMWFEESA-N Met-Phe-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=CC=C1 KRLKICLNEICJGV-STQMWFEESA-N 0.000 description 1
- 241001465754 Metazoa Species 0.000 description 1
- 241000713869 Moloney murine leukemia virus Species 0.000 description 1
- WUGMRIBZSVSJNP-UHFFFAOYSA-N N-L-alanyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C)C(O)=O)=CNC2=C1 WUGMRIBZSVSJNP-UHFFFAOYSA-N 0.000 description 1
- AJHCSUXXECOXOY-UHFFFAOYSA-N N-glycyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)CN)C(O)=O)=CNC2=C1 AJHCSUXXECOXOY-UHFFFAOYSA-N 0.000 description 1
- 108010079364 N-glycylalanine Proteins 0.000 description 1
- BQVUABVGYYSDCJ-UHFFFAOYSA-N Nalpha-L-Leucyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)CC(C)C)C(O)=O)=CNC2=C1 BQVUABVGYYSDCJ-UHFFFAOYSA-N 0.000 description 1
- CYZBFPYMSJGBRL-DRZSPHRISA-N Phe-Ala-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O CYZBFPYMSJGBRL-DRZSPHRISA-N 0.000 description 1
- NEHSHYOUIWBYSA-DCPHZVHLSA-N Phe-Ala-Trp Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CC3=CC=CC=C3)N NEHSHYOUIWBYSA-DCPHZVHLSA-N 0.000 description 1
- LGBVMDMZZFYSFW-HJWJTTGWSA-N Phe-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC1=CC=CC=C1)N LGBVMDMZZFYSFW-HJWJTTGWSA-N 0.000 description 1
- QCHNRQQVLJYDSI-DLOVCJGASA-N Phe-Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 QCHNRQQVLJYDSI-DLOVCJGASA-N 0.000 description 1
- KIAWKQJTSGRCSA-AVGNSLFASA-N Phe-Asn-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N KIAWKQJTSGRCSA-AVGNSLFASA-N 0.000 description 1
- WMGVYPPIMZPWPN-SRVKXCTJSA-N Phe-Asp-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N WMGVYPPIMZPWPN-SRVKXCTJSA-N 0.000 description 1
- WIVCOAKLPICYGY-KKUMJFAQSA-N Phe-Asp-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N WIVCOAKLPICYGY-KKUMJFAQSA-N 0.000 description 1
- FRPVPGRXUKFEQE-YDHLFZDLSA-N Phe-Asp-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O FRPVPGRXUKFEQE-YDHLFZDLSA-N 0.000 description 1
- MPFGIYLYWUCSJG-AVGNSLFASA-N Phe-Glu-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 MPFGIYLYWUCSJG-AVGNSLFASA-N 0.000 description 1
- KYYMILWEGJYPQZ-IHRRRGAJSA-N Phe-Glu-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 KYYMILWEGJYPQZ-IHRRRGAJSA-N 0.000 description 1
- ZZVUXQCQPXSUFH-JBACZVJFSA-N Phe-Glu-Trp Chemical compound C([C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(O)=O)C1=CC=CC=C1 ZZVUXQCQPXSUFH-JBACZVJFSA-N 0.000 description 1
- WPTYDQPGBMDUBI-QWRGUYRKSA-N Phe-Gly-Asn Chemical compound N[C@@H](Cc1ccccc1)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O WPTYDQPGBMDUBI-QWRGUYRKSA-N 0.000 description 1
- BIYWZVCPZIFGPY-QWRGUYRKSA-N Phe-Gly-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H](CO)C(O)=O BIYWZVCPZIFGPY-QWRGUYRKSA-N 0.000 description 1
- SWCOXQLDICUYOL-ULQDDVLXSA-N Phe-His-Arg Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O SWCOXQLDICUYOL-ULQDDVLXSA-N 0.000 description 1
- KRYSMKKRRRWOCZ-QEWYBTABSA-N Phe-Ile-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O KRYSMKKRRRWOCZ-QEWYBTABSA-N 0.000 description 1
- KXUZHWXENMYOHC-QEJZJMRPSA-N Phe-Leu-Ala Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O KXUZHWXENMYOHC-QEJZJMRPSA-N 0.000 description 1
- YKUGPVXSDOOANW-KKUMJFAQSA-N Phe-Leu-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O YKUGPVXSDOOANW-KKUMJFAQSA-N 0.000 description 1
- KDYPMIZMXDECSU-JYJNAYRXSA-N Phe-Leu-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 KDYPMIZMXDECSU-JYJNAYRXSA-N 0.000 description 1
- KZRQONDKKJCAOL-DKIMLUQUSA-N Phe-Leu-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KZRQONDKKJCAOL-DKIMLUQUSA-N 0.000 description 1
- OSBADCBXAMSPQD-YESZJQIVSA-N Phe-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N OSBADCBXAMSPQD-YESZJQIVSA-N 0.000 description 1
- YCCUXNNKXDGMAM-KKUMJFAQSA-N Phe-Leu-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YCCUXNNKXDGMAM-KKUMJFAQSA-N 0.000 description 1
- KNYPNEYICHHLQL-ACRUOGEOSA-N Phe-Leu-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 KNYPNEYICHHLQL-ACRUOGEOSA-N 0.000 description 1
- XZQYIJALMGEUJD-OEAJRASXSA-N Phe-Lys-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XZQYIJALMGEUJD-OEAJRASXSA-N 0.000 description 1
- PTLMYJOMJLTMCB-KKUMJFAQSA-N Phe-Met-Gln Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N PTLMYJOMJLTMCB-KKUMJFAQSA-N 0.000 description 1
- DSXPMZMSJHOKKK-HJOGWXRNSA-N Phe-Phe-Tyr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O DSXPMZMSJHOKKK-HJOGWXRNSA-N 0.000 description 1
- CZQZSMJXFGGBHM-KKUMJFAQSA-N Phe-Pro-Gln Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O CZQZSMJXFGGBHM-KKUMJFAQSA-N 0.000 description 1
- UNBFGVQVQGXXCK-KKUMJFAQSA-N Phe-Ser-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O UNBFGVQVQGXXCK-KKUMJFAQSA-N 0.000 description 1
- KLYYKKGCPOGDPE-OEAJRASXSA-N Phe-Thr-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O KLYYKKGCPOGDPE-OEAJRASXSA-N 0.000 description 1
- KCIKTPHTEYBXMG-BVSLBCMMSA-N Phe-Trp-Arg Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O KCIKTPHTEYBXMG-BVSLBCMMSA-N 0.000 description 1
- JSGWNFKWZNPDAV-YDHLFZDLSA-N Phe-Val-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 JSGWNFKWZNPDAV-YDHLFZDLSA-N 0.000 description 1
- FZHBZMDRDASUHN-NAKRPEOUSA-N Pro-Ala-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1)C(O)=O FZHBZMDRDASUHN-NAKRPEOUSA-N 0.000 description 1
- OOLOTUZJUBOMAX-GUBZILKMSA-N Pro-Ala-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O OOLOTUZJUBOMAX-GUBZILKMSA-N 0.000 description 1
- NHDVNAKDACFHPX-GUBZILKMSA-N Pro-Arg-Ala Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O NHDVNAKDACFHPX-GUBZILKMSA-N 0.000 description 1
- VCYJKOLZYPYGJV-AVGNSLFASA-N Pro-Arg-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O VCYJKOLZYPYGJV-AVGNSLFASA-N 0.000 description 1
- XUSDDSLCRPUKLP-QXEWZRGKSA-N Pro-Asp-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H]1CCCN1 XUSDDSLCRPUKLP-QXEWZRGKSA-N 0.000 description 1
- NXEYSLRNNPWCRN-SRVKXCTJSA-N Pro-Glu-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O NXEYSLRNNPWCRN-SRVKXCTJSA-N 0.000 description 1
- VOZIBWWZSBIXQN-SRVKXCTJSA-N Pro-Glu-Lys Chemical compound NCCCC[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1)C(O)=O VOZIBWWZSBIXQN-SRVKXCTJSA-N 0.000 description 1
- AFXCXDQNRXTSBD-FJXKBIBVSA-N Pro-Gly-Thr Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O AFXCXDQNRXTSBD-FJXKBIBVSA-N 0.000 description 1
- SSWJYJHXQOYTSP-SRVKXCTJSA-N Pro-His-Gln Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(N)=O)C(O)=O SSWJYJHXQOYTSP-SRVKXCTJSA-N 0.000 description 1
- LPGSNRSLPHRNBW-AVGNSLFASA-N Pro-His-Val Chemical compound C([C@@H](C(=O)N[C@@H](C(C)C)C([O-])=O)NC(=O)[C@H]1[NH2+]CCC1)C1=CN=CN1 LPGSNRSLPHRNBW-AVGNSLFASA-N 0.000 description 1
- SOACYAXADBWDDT-CYDGBPFRSA-N Pro-Ile-Arg Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O SOACYAXADBWDDT-CYDGBPFRSA-N 0.000 description 1
- BBFRBZYKHIKFBX-GMOBBJLQSA-N Pro-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@@H]1CCCN1 BBFRBZYKHIKFBX-GMOBBJLQSA-N 0.000 description 1
- AQGUSRZKDZYGGV-GMOBBJLQSA-N Pro-Ile-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O AQGUSRZKDZYGGV-GMOBBJLQSA-N 0.000 description 1
- BCNRNJWSRFDPTQ-HJWJTTGWSA-N Pro-Ile-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O BCNRNJWSRFDPTQ-HJWJTTGWSA-N 0.000 description 1
- BRJGUPWVFXKBQI-XUXIUFHCSA-N Pro-Leu-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BRJGUPWVFXKBQI-XUXIUFHCSA-N 0.000 description 1
- XYSXOCIWCPFOCG-IHRRRGAJSA-N Pro-Leu-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O XYSXOCIWCPFOCG-IHRRRGAJSA-N 0.000 description 1
- FKYKZHOKDOPHSA-DCAQKATOSA-N Pro-Leu-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O FKYKZHOKDOPHSA-DCAQKATOSA-N 0.000 description 1
- JUJCUYWRJMFJJF-AVGNSLFASA-N Pro-Lys-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H]1CCCN1 JUJCUYWRJMFJJF-AVGNSLFASA-N 0.000 description 1
- WCNVGGZRTNHOOS-ULQDDVLXSA-N Pro-Lys-Tyr Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O WCNVGGZRTNHOOS-ULQDDVLXSA-N 0.000 description 1
- WLJYLAQSUSIQNH-GUBZILKMSA-N Pro-Met-Ser Chemical compound CSCC[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@@H]1CCCN1 WLJYLAQSUSIQNH-GUBZILKMSA-N 0.000 description 1
- JIWJRKNYLSHONY-KKUMJFAQSA-N Pro-Phe-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O JIWJRKNYLSHONY-KKUMJFAQSA-N 0.000 description 1
- AJBQTGZIZQXBLT-STQMWFEESA-N Pro-Phe-Gly Chemical compound C([C@@H](C(=O)NCC(=O)O)NC(=O)[C@H]1NCCC1)C1=CC=CC=C1 AJBQTGZIZQXBLT-STQMWFEESA-N 0.000 description 1
- CHYAYDLYYIJCKY-OSUNSFLBSA-N Pro-Thr-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O CHYAYDLYYIJCKY-OSUNSFLBSA-N 0.000 description 1
- FDMCIBSQRKFSTJ-RHYQMDGZSA-N Pro-Thr-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O FDMCIBSQRKFSTJ-RHYQMDGZSA-N 0.000 description 1
- AIOWVDNPESPXRB-YTWAJWBKSA-N Pro-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2)O AIOWVDNPESPXRB-YTWAJWBKSA-N 0.000 description 1
- RSTWKJFWBKFOFC-JYJNAYRXSA-N Pro-Trp-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(N)=O)C(O)=O RSTWKJFWBKFOFC-JYJNAYRXSA-N 0.000 description 1
- YIPFBJGBRCJJJD-FHWLQOOXSA-N Pro-Trp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@@H]3CCCN3 YIPFBJGBRCJJJD-FHWLQOOXSA-N 0.000 description 1
- QHSSUIHLAIWXEE-IHRRRGAJSA-N Pro-Tyr-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O QHSSUIHLAIWXEE-IHRRRGAJSA-N 0.000 description 1
- XDKKMRPRRCOELJ-GUBZILKMSA-N Pro-Val-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 XDKKMRPRRCOELJ-GUBZILKMSA-N 0.000 description 1
- IIRBTQHFVNGPMQ-AVGNSLFASA-N Pro-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@@H]1CCCN1 IIRBTQHFVNGPMQ-AVGNSLFASA-N 0.000 description 1
- FIODMZKLZFLYQP-GUBZILKMSA-N Pro-Val-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O FIODMZKLZFLYQP-GUBZILKMSA-N 0.000 description 1
- CZPWVGJYEJSRLH-UHFFFAOYSA-N Pyrimidine Chemical compound C1=CN=CN=C1 CZPWVGJYEJSRLH-UHFFFAOYSA-N 0.000 description 1
- 108010003201 RGH 0205 Proteins 0.000 description 1
- MMGJPDWSIOAGTH-ACZMJKKPSA-N Ser-Ala-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O MMGJPDWSIOAGTH-ACZMJKKPSA-N 0.000 description 1
- YMEXHZTVKDAKIY-GHCJXIJMSA-N Ser-Asn-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CO)C(O)=O YMEXHZTVKDAKIY-GHCJXIJMSA-N 0.000 description 1
- KAAPNMOKUUPKOE-SRVKXCTJSA-N Ser-Asn-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KAAPNMOKUUPKOE-SRVKXCTJSA-N 0.000 description 1
- BGOWRLSWJCVYAQ-CIUDSAMLSA-N Ser-Asp-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O BGOWRLSWJCVYAQ-CIUDSAMLSA-N 0.000 description 1
- BQWCDDAISCPDQV-XHNCKOQMSA-N Ser-Gln-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CO)N)C(=O)O BQWCDDAISCPDQV-XHNCKOQMSA-N 0.000 description 1
- VDVYTKZBMFADQH-AVGNSLFASA-N Ser-Gln-Tyr Chemical compound OC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 VDVYTKZBMFADQH-AVGNSLFASA-N 0.000 description 1
- SMIDBHKWSYUBRZ-ACZMJKKPSA-N Ser-Glu-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O SMIDBHKWSYUBRZ-ACZMJKKPSA-N 0.000 description 1
- YQQKYAZABFEYAF-FXQIFTODSA-N Ser-Glu-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O YQQKYAZABFEYAF-FXQIFTODSA-N 0.000 description 1
- VQBCMLMPEWPUTB-ACZMJKKPSA-N Ser-Glu-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O VQBCMLMPEWPUTB-ACZMJKKPSA-N 0.000 description 1
- SNVIOQXAHVORQM-WDSKDSINSA-N Ser-Gly-Gln Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(O)=O SNVIOQXAHVORQM-WDSKDSINSA-N 0.000 description 1
- XXXAXOWMBOKTRN-XPUUQOCRSA-N Ser-Gly-Val Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O XXXAXOWMBOKTRN-XPUUQOCRSA-N 0.000 description 1
- ZFVFHHZBCVNLGD-GUBZILKMSA-N Ser-His-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(N)=O)C(O)=O ZFVFHHZBCVNLGD-GUBZILKMSA-N 0.000 description 1
- HBTCFCHYALPXME-HTFCKZLJSA-N Ser-Ile-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HBTCFCHYALPXME-HTFCKZLJSA-N 0.000 description 1
- RIAKPZVSNBBNRE-BJDJZHNGSA-N Ser-Ile-Leu Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O RIAKPZVSNBBNRE-BJDJZHNGSA-N 0.000 description 1
- VZQRNAYURWAEFE-KKUMJFAQSA-N Ser-Leu-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 VZQRNAYURWAEFE-KKUMJFAQSA-N 0.000 description 1
- NNFMANHDYSVNIO-DCAQKATOSA-N Ser-Lys-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O NNFMANHDYSVNIO-DCAQKATOSA-N 0.000 description 1
- PPNPDKGQRFSCAC-CIUDSAMLSA-N Ser-Lys-Asp Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)CO)C(=O)N[C@@H](CC(O)=O)C(O)=O PPNPDKGQRFSCAC-CIUDSAMLSA-N 0.000 description 1
- GVMUJUPXFQFBBZ-GUBZILKMSA-N Ser-Lys-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O GVMUJUPXFQFBBZ-GUBZILKMSA-N 0.000 description 1
- QJKPECIAWNNKIT-KKUMJFAQSA-N Ser-Lys-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O QJKPECIAWNNKIT-KKUMJFAQSA-N 0.000 description 1
- ZSLFCBHEINFXRS-LPEHRKFASA-N Ser-Met-Pro Chemical compound CSCC[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N ZSLFCBHEINFXRS-LPEHRKFASA-N 0.000 description 1
- UPLYXVPQLJVWMM-KKUMJFAQSA-N Ser-Phe-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O UPLYXVPQLJVWMM-KKUMJFAQSA-N 0.000 description 1
- FLONGDPORFIVQW-XGEHTFHBSA-N Ser-Pro-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO FLONGDPORFIVQW-XGEHTFHBSA-N 0.000 description 1
- ILZAUMFXKSIUEF-SRVKXCTJSA-N Ser-Ser-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ILZAUMFXKSIUEF-SRVKXCTJSA-N 0.000 description 1
- OJFFAQFRCVPHNN-JYBASQMISA-N Ser-Thr-Trp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O OJFFAQFRCVPHNN-JYBASQMISA-N 0.000 description 1
- PMTWIUBUQRGCSB-FXQIFTODSA-N Ser-Val-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O PMTWIUBUQRGCSB-FXQIFTODSA-N 0.000 description 1
- BEBVVQPDSHHWQL-NRPADANISA-N Ser-Val-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O BEBVVQPDSHHWQL-NRPADANISA-N 0.000 description 1
- 239000005708 Sodium hypochlorite Substances 0.000 description 1
- 108091027544 Subgenomic mRNA Proteins 0.000 description 1
- 229930006000 Sucrose Chemical group 0.000 description 1
- CZMRCDWAGMRECN-UGDNZRGBSA-N Sucrose Chemical group O[C@H]1[C@H](O)[C@@H](CO)O[C@@]1(CO)O[C@@H]1[C@H](O)[C@@H](O)[C@H](O)[C@@H](CO)O1 CZMRCDWAGMRECN-UGDNZRGBSA-N 0.000 description 1
- 108020005038 Terminator Codon Proteins 0.000 description 1
- 241001648840 Thosea asigna virus Species 0.000 description 1
- LVHHEVGYAZGXDE-KDXUFGMBSA-N Thr-Ala-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)N1CCC[C@@H]1C(=O)O)N)O LVHHEVGYAZGXDE-KDXUFGMBSA-N 0.000 description 1
- CAJFZCICSVBOJK-SHGPDSBTSA-N Thr-Ala-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CAJFZCICSVBOJK-SHGPDSBTSA-N 0.000 description 1
- XSLXHSYIVPGEER-KZVJFYERSA-N Thr-Ala-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O XSLXHSYIVPGEER-KZVJFYERSA-N 0.000 description 1
- MQBTXMPQNCGSSZ-OSUNSFLBSA-N Thr-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@@H](C)O)CCCN=C(N)N MQBTXMPQNCGSSZ-OSUNSFLBSA-N 0.000 description 1
- VFEHSAJCWWHDBH-RHYQMDGZSA-N Thr-Arg-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O VFEHSAJCWWHDBH-RHYQMDGZSA-N 0.000 description 1
- NAXBBCLCEOTAIG-RHYQMDGZSA-N Thr-Arg-Lys Chemical compound NC(N)=NCCC[C@H](NC(=O)[C@@H](N)[C@H](O)C)C(=O)N[C@@H](CCCCN)C(O)=O NAXBBCLCEOTAIG-RHYQMDGZSA-N 0.000 description 1
- IRKWVRSEQFTGGV-VEVYYDQMSA-N Thr-Asn-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O IRKWVRSEQFTGGV-VEVYYDQMSA-N 0.000 description 1
- YLXAMFZYJTZXFH-OLHMAJIHSA-N Thr-Asn-Asp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O YLXAMFZYJTZXFH-OLHMAJIHSA-N 0.000 description 1
- SKHPKKYKDYULDH-HJGDQZAQSA-N Thr-Asn-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O SKHPKKYKDYULDH-HJGDQZAQSA-N 0.000 description 1
- DCCGCVLVVSAJFK-NUMRIWBASA-N Thr-Asp-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N)O DCCGCVLVVSAJFK-NUMRIWBASA-N 0.000 description 1
- OYTNZCBFDXGQGE-XQXXSGGOSA-N Thr-Gln-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](C)C(=O)O)N)O OYTNZCBFDXGQGE-XQXXSGGOSA-N 0.000 description 1
- LAFLAXHTDVNVEL-WDCWCFNPSA-N Thr-Gln-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O LAFLAXHTDVNVEL-WDCWCFNPSA-N 0.000 description 1
- UDQBCBUXAQIZAK-GLLZPBPUSA-N Thr-Glu-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O UDQBCBUXAQIZAK-GLLZPBPUSA-N 0.000 description 1
- JMGJDTNUMAZNLX-RWRJDSDZSA-N Thr-Glu-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JMGJDTNUMAZNLX-RWRJDSDZSA-N 0.000 description 1
- LHEZGZQRLDBSRR-WDCWCFNPSA-N Thr-Glu-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LHEZGZQRLDBSRR-WDCWCFNPSA-N 0.000 description 1
- XOTBWOCSLMBGMF-SUSMZKCASA-N Thr-Glu-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XOTBWOCSLMBGMF-SUSMZKCASA-N 0.000 description 1
- XPNSAQMEAVSQRD-FBCQKBJTSA-N Thr-Gly-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(=O)NCC(O)=O XPNSAQMEAVSQRD-FBCQKBJTSA-N 0.000 description 1
- GXUWHVZYDAHFSV-FLBSBUHZSA-N Thr-Ile-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GXUWHVZYDAHFSV-FLBSBUHZSA-N 0.000 description 1
- IMDMLDSVUSMAEJ-HJGDQZAQSA-N Thr-Leu-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IMDMLDSVUSMAEJ-HJGDQZAQSA-N 0.000 description 1
- RFKVQLIXNVEOMB-WEDXCCLWSA-N Thr-Leu-Gly Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)NCC(=O)O)N)O RFKVQLIXNVEOMB-WEDXCCLWSA-N 0.000 description 1
- FLPZMPOZGYPBEN-PPCPHDFISA-N Thr-Leu-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FLPZMPOZGYPBEN-PPCPHDFISA-N 0.000 description 1
- MEJHFIOYJHTWMK-VOAKCMCISA-N Thr-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)[C@@H](C)O MEJHFIOYJHTWMK-VOAKCMCISA-N 0.000 description 1
- VRUFCJZQDACGLH-UVOCVTCTSA-N Thr-Leu-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VRUFCJZQDACGLH-UVOCVTCTSA-N 0.000 description 1
- QHUWWSQZTFLXPQ-FJXKBIBVSA-N Thr-Met-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCSC)C(=O)NCC(O)=O QHUWWSQZTFLXPQ-FJXKBIBVSA-N 0.000 description 1
- IQPWNQRRAJHOKV-KATARQTJSA-N Thr-Ser-Lys Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN IQPWNQRRAJHOKV-KATARQTJSA-N 0.000 description 1
- CSZFFQBUTMGHAH-UAXMHLISSA-N Thr-Thr-Trp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N)O CSZFFQBUTMGHAH-UAXMHLISSA-N 0.000 description 1
- LXXCHJKHJYRMIY-FQPOAREZSA-N Thr-Tyr-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O LXXCHJKHJYRMIY-FQPOAREZSA-N 0.000 description 1
- BZTSQFWJNJYZSX-JRQIVUDYSA-N Thr-Tyr-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O BZTSQFWJNJYZSX-JRQIVUDYSA-N 0.000 description 1
- XGFYGMKZKFRGAI-RCWTZXSCSA-N Thr-Val-Arg Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N XGFYGMKZKFRGAI-RCWTZXSCSA-N 0.000 description 1
- SBYQHZCMVSPQCS-RCWTZXSCSA-N Thr-Val-Met Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCSC)C(O)=O SBYQHZCMVSPQCS-RCWTZXSCSA-N 0.000 description 1
- HYVLNORXQGKONN-NUTKFTJISA-N Trp-Ala-Lys Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O)=CNC2=C1 HYVLNORXQGKONN-NUTKFTJISA-N 0.000 description 1
- BIJDDZBDSJLWJY-PJODQICGSA-N Trp-Ala-Val Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O BIJDDZBDSJLWJY-PJODQICGSA-N 0.000 description 1
- QNMIVTOQXUSGLN-SZMVWBNQSA-N Trp-Arg-Arg Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)=CNC2=C1 QNMIVTOQXUSGLN-SZMVWBNQSA-N 0.000 description 1
- BXKWZPXTTSCOMX-AQZXSJQPSA-N Trp-Asn-Thr Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BXKWZPXTTSCOMX-AQZXSJQPSA-N 0.000 description 1
- XGEUYEOEZYFHRL-KKXDTOCCSA-N Tyr-Ala-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 XGEUYEOEZYFHRL-KKXDTOCCSA-N 0.000 description 1
- HSVPZJLMPLMPOX-BPNCWPANSA-N Tyr-Arg-Ala Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O HSVPZJLMPLMPOX-BPNCWPANSA-N 0.000 description 1
- CRWOSTCODDFEKZ-HRCADAONSA-N Tyr-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC2=CC=C(C=C2)O)N)C(=O)O CRWOSTCODDFEKZ-HRCADAONSA-N 0.000 description 1
- YGKVNUAKYPGORG-AVGNSLFASA-N Tyr-Asp-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O YGKVNUAKYPGORG-AVGNSLFASA-N 0.000 description 1
- UABYBEBXFFNCIR-YDHLFZDLSA-N Tyr-Asp-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O UABYBEBXFFNCIR-YDHLFZDLSA-N 0.000 description 1
- AKLNEFNQWLHIGY-QWRGUYRKSA-N Tyr-Gly-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)O)C(=O)O)N)O AKLNEFNQWLHIGY-QWRGUYRKSA-N 0.000 description 1
- FIRUOPRJKCBLST-KKUMJFAQSA-N Tyr-His-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O FIRUOPRJKCBLST-KKUMJFAQSA-N 0.000 description 1
- LFCQXIXJQXWZJI-BZSNNMDCSA-N Tyr-His-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)N[C@@H](CC3=CN=CN3)C(=O)O)N)O LFCQXIXJQXWZJI-BZSNNMDCSA-N 0.000 description 1
- FBHBVXUBTYVCRU-BZSNNMDCSA-N Tyr-His-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CN=CN1 FBHBVXUBTYVCRU-BZSNNMDCSA-N 0.000 description 1
- BSCBBPKDVOZICB-KKUMJFAQSA-N Tyr-Leu-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O BSCBBPKDVOZICB-KKUMJFAQSA-N 0.000 description 1
- NSGZILIDHCIZAM-KKUMJFAQSA-N Tyr-Leu-Ser Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N NSGZILIDHCIZAM-KKUMJFAQSA-N 0.000 description 1
- CWVHKVVKAQIJKY-ACRUOGEOSA-N Tyr-Lys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC2=CC=C(C=C2)O)N CWVHKVVKAQIJKY-ACRUOGEOSA-N 0.000 description 1
- PMHLLBKTDHQMCY-ULQDDVLXSA-N Tyr-Lys-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O PMHLLBKTDHQMCY-ULQDDVLXSA-N 0.000 description 1
- PSALWJCUIAQKFW-ACRUOGEOSA-N Tyr-Phe-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N PSALWJCUIAQKFW-ACRUOGEOSA-N 0.000 description 1
- VBFVQTPETKJCQW-RPTUDFQQSA-N Tyr-Phe-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VBFVQTPETKJCQW-RPTUDFQQSA-N 0.000 description 1
- RCMWNNJFKNDKQR-UFYCRDLUSA-N Tyr-Pro-Phe Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 RCMWNNJFKNDKQR-UFYCRDLUSA-N 0.000 description 1
- QPOUERMDWKKZEG-HJPIBITLSA-N Tyr-Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 QPOUERMDWKKZEG-HJPIBITLSA-N 0.000 description 1
- UUBKSZNKJUJQEJ-JRQIVUDYSA-N Tyr-Thr-Asp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N)O UUBKSZNKJUJQEJ-JRQIVUDYSA-N 0.000 description 1
- AKRHKDCELJLTMD-BVSLBCMMSA-N Tyr-Trp-Arg Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CC3=CC=C(C=C3)O)N AKRHKDCELJLTMD-BVSLBCMMSA-N 0.000 description 1
- HZDQUVQEVVYDDA-ACRUOGEOSA-N Tyr-Tyr-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CC=C(O)C=C1 HZDQUVQEVVYDDA-ACRUOGEOSA-N 0.000 description 1
- UUJHRSTVQCFDPA-UFYCRDLUSA-N Tyr-Tyr-Val Chemical compound C([C@@H](C(=O)N[C@@H](C(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CC=C(O)C=C1 UUJHRSTVQCFDPA-UFYCRDLUSA-N 0.000 description 1
- RUCNAYOMFXRIKJ-DCAQKATOSA-N Val-Ala-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN RUCNAYOMFXRIKJ-DCAQKATOSA-N 0.000 description 1
- LABUITCFCAABSV-UHFFFAOYSA-N Val-Ala-Tyr Natural products CC(C)C(N)C(=O)NC(C)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 LABUITCFCAABSV-UHFFFAOYSA-N 0.000 description 1
- ISERLACIZUGCDX-ZKWXMUAHSA-N Val-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N ISERLACIZUGCDX-ZKWXMUAHSA-N 0.000 description 1
- VUTHNLMCXKLLFI-LAEOZQHASA-N Val-Asp-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N VUTHNLMCXKLLFI-LAEOZQHASA-N 0.000 description 1
- QHDXUYOYTPWCSK-RCOVLWMOSA-N Val-Asp-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)NCC(=O)O)N QHDXUYOYTPWCSK-RCOVLWMOSA-N 0.000 description 1
- TZVUSFMQWPWHON-NHCYSSNCSA-N Val-Asp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N TZVUSFMQWPWHON-NHCYSSNCSA-N 0.000 description 1
- YODDULVCGFQRFZ-ZKWXMUAHSA-N Val-Asp-Ser Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O YODDULVCGFQRFZ-ZKWXMUAHSA-N 0.000 description 1
- NYTKXWLZSNRILS-IFFSRLJSSA-N Val-Gln-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](C(C)C)N)O NYTKXWLZSNRILS-IFFSRLJSSA-N 0.000 description 1
- GBESYURLQOYWLU-LAEOZQHASA-N Val-Glu-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N GBESYURLQOYWLU-LAEOZQHASA-N 0.000 description 1
- VLDMQVZZWDOKQF-AUTRQRHGSA-N Val-Glu-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N VLDMQVZZWDOKQF-AUTRQRHGSA-N 0.000 description 1
- SZTTYWIUCGSURQ-AUTRQRHGSA-N Val-Glu-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SZTTYWIUCGSURQ-AUTRQRHGSA-N 0.000 description 1
- VCAWFLIWYNMHQP-UKJIMTQDSA-N Val-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](C(C)C)N VCAWFLIWYNMHQP-UKJIMTQDSA-N 0.000 description 1
- ZXAGTABZUOMUDO-GVXVVHGQSA-N Val-Glu-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N ZXAGTABZUOMUDO-GVXVVHGQSA-N 0.000 description 1
- VXDSPJJQUQDCKH-UKJIMTQDSA-N Val-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N VXDSPJJQUQDCKH-UKJIMTQDSA-N 0.000 description 1
- FTKXYXACXYOHND-XUXIUFHCSA-N Val-Ile-Leu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O FTKXYXACXYOHND-XUXIUFHCSA-N 0.000 description 1
- APQIVBCUIUDSMB-OSUNSFLBSA-N Val-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](C(C)C)N APQIVBCUIUDSMB-OSUNSFLBSA-N 0.000 description 1
- AGXGCFSECFQMKB-NHCYSSNCSA-N Val-Leu-Asp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N AGXGCFSECFQMKB-NHCYSSNCSA-N 0.000 description 1
- UMPVMAYCLYMYGA-ONGXEEELSA-N Val-Leu-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O UMPVMAYCLYMYGA-ONGXEEELSA-N 0.000 description 1
- ZHQWPWQNVRCXAX-XQQFMLRXSA-N Val-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C(C)C)N ZHQWPWQNVRCXAX-XQQFMLRXSA-N 0.000 description 1
- IJGPOONOTBNTFS-GVXVVHGQSA-N Val-Lys-Glu Chemical compound [H]N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O IJGPOONOTBNTFS-GVXVVHGQSA-N 0.000 description 1
- YMTOEGGOCHVGEH-IHRRRGAJSA-N Val-Lys-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O YMTOEGGOCHVGEH-IHRRRGAJSA-N 0.000 description 1
- UOUIMEGEPSBZIV-ULQDDVLXSA-N Val-Lys-Tyr Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 UOUIMEGEPSBZIV-ULQDDVLXSA-N 0.000 description 1
- VPGCVZRRBYOGCD-AVGNSLFASA-N Val-Lys-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O VPGCVZRRBYOGCD-AVGNSLFASA-N 0.000 description 1
- JVGHIFMSFBZDHH-WPRPVWTQSA-N Val-Met-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)NCC(=O)O)N JVGHIFMSFBZDHH-WPRPVWTQSA-N 0.000 description 1
- LGXUZJIQCGXKGZ-QXEWZRGKSA-N Val-Pro-Asn Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(=O)N)C(=O)O)N LGXUZJIQCGXKGZ-QXEWZRGKSA-N 0.000 description 1
- HPOSMQWRPMRMFO-GUBZILKMSA-N Val-Pro-Cys Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CS)C(=O)O)N HPOSMQWRPMRMFO-GUBZILKMSA-N 0.000 description 1
- SSYBNWFXCFNRFN-GUBZILKMSA-N Val-Pro-Ser Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O SSYBNWFXCFNRFN-GUBZILKMSA-N 0.000 description 1
- DEGUERSKQBRZMZ-FXQIFTODSA-N Val-Ser-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O DEGUERSKQBRZMZ-FXQIFTODSA-N 0.000 description 1
- UVHFONIHVHLDDQ-IFFSRLJSSA-N Val-Thr-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N)O UVHFONIHVHLDDQ-IFFSRLJSSA-N 0.000 description 1
- SUGRIIAOLCDLBD-ZOBUZTSGSA-N Val-Trp-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CC(=O)O)C(=O)O)N SUGRIIAOLCDLBD-ZOBUZTSGSA-N 0.000 description 1
- MIAZWUMFUURQNP-YDHLFZDLSA-N Val-Tyr-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N MIAZWUMFUURQNP-YDHLFZDLSA-N 0.000 description 1
- BGTDGENDNWGMDQ-KJEVXHAQSA-N Val-Tyr-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](C(C)C)N)O BGTDGENDNWGMDQ-KJEVXHAQSA-N 0.000 description 1
- AOILQMZPNLUXCM-AVGNSLFASA-N Val-Val-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN AOILQMZPNLUXCM-AVGNSLFASA-N 0.000 description 1
- 108010039538 alanyl-glycyl-aspartyl-valine Proteins 0.000 description 1
- 108010086434 alanyl-seryl-glycine Proteins 0.000 description 1
- 108010011559 alanylphenylalanine Proteins 0.000 description 1
- 108010087924 alanylproline Proteins 0.000 description 1
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 1
- 238000004458 analytical method Methods 0.000 description 1
- 108010001271 arginyl-glutamyl-arginine Proteins 0.000 description 1
- 108010043240 arginyl-leucyl-glycine Proteins 0.000 description 1
- 108010059459 arginyl-threonyl-phenylalanine Proteins 0.000 description 1
- 108010084758 arginyl-tyrosyl-aspartic acid Proteins 0.000 description 1
- 108010068380 arginylarginine Proteins 0.000 description 1
- 108010093581 aspartyl-proline Proteins 0.000 description 1
- 108010047857 aspartylglycine Proteins 0.000 description 1
- 230000001580 bacterial effect Effects 0.000 description 1
- -1 cas9n (H840A) Proteins 0.000 description 1
- 239000003153 chemical reaction reagent Substances 0.000 description 1
- 238000003776 cleavage reaction Methods 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- XEYBHCRIKKKOSS-UHFFFAOYSA-N disodium;azanylidyneoxidanium;iron(2+);pentacyanide Chemical compound [Na+].[Na+].[Fe+2].N#[C-].N#[C-].N#[C-].N#[C-].N#[C-].[O+]#N XEYBHCRIKKKOSS-UHFFFAOYSA-N 0.000 description 1
- 230000005782 double-strand break Effects 0.000 description 1
- 230000000694 effects Effects 0.000 description 1
- 210000002257 embryonic structure Anatomy 0.000 description 1
- 241001233957 eudicotyledons Species 0.000 description 1
- 230000004927 fusion Effects 0.000 description 1
- 239000008103 glucose Substances 0.000 description 1
- 125000002791 glucosyl group Chemical group C1([C@H](O)[C@@H](O)[C@H](O)[C@H](O1)CO)* 0.000 description 1
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 1
- 108010079547 glutamylmethionine Proteins 0.000 description 1
- HPAIKDPJURGQLN-UHFFFAOYSA-N glycyl-L-histidyl-L-phenylalanine Natural products C=1C=CC=CC=1CC(C(O)=O)NC(=O)C(NC(=O)CN)CC1=CN=CN1 HPAIKDPJURGQLN-UHFFFAOYSA-N 0.000 description 1
- 108010062266 glycyl-glycyl-argininal Proteins 0.000 description 1
- 108010026364 glycyl-glycyl-leucine Proteins 0.000 description 1
- 108010028188 glycyl-histidyl-serine Proteins 0.000 description 1
- 108010066198 glycyl-leucyl-phenylalanine Proteins 0.000 description 1
- 108010077435 glycyl-phenylalanyl-glycine Proteins 0.000 description 1
- 108010089804 glycyl-threonine Proteins 0.000 description 1
- 108010048994 glycyl-tyrosyl-alanine Proteins 0.000 description 1
- 108010077515 glycylproline Proteins 0.000 description 1
- 108010084389 glycyltryptophan Proteins 0.000 description 1
- 238000012165 high-throughput sequencing Methods 0.000 description 1
- 108010045383 histidyl-glycyl-glutamic acid Proteins 0.000 description 1
- 108010085325 histidylproline Proteins 0.000 description 1
- 230000006801 homologous recombination Effects 0.000 description 1
- 238000002744 homologous recombination Methods 0.000 description 1
- 239000005556 hormone Substances 0.000 description 1
- 229940088597 hormone Drugs 0.000 description 1
- 108010044374 isoleucyl-tyrosine Proteins 0.000 description 1
- 229930027917 kanamycin Natural products 0.000 description 1
- SBUJHOSQTJFQJX-NOAMYHISSA-N kanamycin Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 description 1
- 229960000318 kanamycin Drugs 0.000 description 1
- 229930182823 kanamycin A Natural products 0.000 description 1
- 108010044311 leucyl-glycyl-glycine Proteins 0.000 description 1
- 108010030617 leucyl-phenylalanyl-valine Proteins 0.000 description 1
- 108010000761 leucylarginine Proteins 0.000 description 1
- 108010012058 leucyltyrosine Proteins 0.000 description 1
- 108010045397 lysyl-tyrosyl-lysine Proteins 0.000 description 1
- 239000000463 material Substances 0.000 description 1
- 108010063431 methionyl-aspartyl-glycine Proteins 0.000 description 1
- 108010064486 phenylalanyl-leucyl-valine Proteins 0.000 description 1
- 238000006303 photolysis reaction Methods 0.000 description 1
- 230000015843 photosynthesis, light reaction Effects 0.000 description 1
- 108010025488 pinealon Proteins 0.000 description 1
- 239000013612 plasmid Substances 0.000 description 1
- 229920002401 polyacrylamide Polymers 0.000 description 1
- 108010014614 prolyl-glycyl-proline Proteins 0.000 description 1
- 108010031719 prolyl-serine Proteins 0.000 description 1
- 108010004914 prolylarginine Proteins 0.000 description 1
- 230000008263 repair mechanism Effects 0.000 description 1
- 206010039083 rhinitis Diseases 0.000 description 1
- JQXXHWHPUNPDRT-WLSIYKJHSA-N rifampicin Chemical compound O([C@](C1=O)(C)O/C=C/[C@@H]([C@H]([C@@H](OC(C)=O)[C@H](C)[C@H](O)[C@H](C)[C@@H](O)[C@@H](C)\C=C\C=C(C)/C(=O)NC=2C(O)=C3C([O-])=C4C)C)OC)C4=C1C3=C(O)C=2\C=N\N1CC[NH+](C)CC1 JQXXHWHPUNPDRT-WLSIYKJHSA-N 0.000 description 1
- 229960001225 rifampicin Drugs 0.000 description 1
- 238000007480 sanger sequencing Methods 0.000 description 1
- 230000007017 scission Effects 0.000 description 1
- 108010048818 seryl-histidine Proteins 0.000 description 1
- 108010069117 seryl-lysyl-aspartic acid Proteins 0.000 description 1
- 238000010008 shearing Methods 0.000 description 1
- 238000002791 soaking Methods 0.000 description 1
- SUKJFIGYRHOWBL-UHFFFAOYSA-N sodium hypochlorite Chemical compound [Na+].Cl[O-] SUKJFIGYRHOWBL-UHFFFAOYSA-N 0.000 description 1
- 229940083618 sodium nitroprusside Drugs 0.000 description 1
- 108010005652 splenotritin Proteins 0.000 description 1
- 239000008223 sterile water Substances 0.000 description 1
- 238000004659 sterilization and disinfection Methods 0.000 description 1
- 239000005720 sucrose Chemical group 0.000 description 1
- 108010061238 threonyl-glycine Proteins 0.000 description 1
- 108010072986 threonyl-seryl-lysine Proteins 0.000 description 1
- 229940027257 timentin Drugs 0.000 description 1
- 230000007704 transition Effects 0.000 description 1
- 108010080629 tryptophan-leucine Proteins 0.000 description 1
- 108010044292 tryptophyltyrosine Proteins 0.000 description 1
- 108010012567 tyrosyl-glycyl-glycyl-phenylalanyl Proteins 0.000 description 1
- 108010051110 tyrosyl-lysine Proteins 0.000 description 1
- 108010020532 tyrosyl-proline Proteins 0.000 description 1
- 230000003612 virological effect Effects 0.000 description 1
- 230000000007 visual effect Effects 0.000 description 1
- 238000005406 washing Methods 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/10—Transferases (2.)
- C12N9/12—Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
- C12N9/1241—Nucleotidyltransferases (2.7.7)
- C12N9/1276—RNA-directed DNA polymerase (2.7.7.49), i.e. reverse transcriptase or telomerase
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/113—Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex- forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
- C12N15/8216—Methods for controlling, regulating or enhancing expression of transgenes in plant cells
- C12N15/8218—Antisense, co-suppression, viral induced gene silencing [VIGS], post-transcriptional induced gene silencing [PTGS]
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/10—Transferases (2.)
- C12N9/12—Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
- C12N9/1205—Phosphotransferases with an alcohol group as acceptor (2.7.1), e.g. protein kinases
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases RNAses, DNAses
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y207/00—Transferases transferring phosphorus-containing groups (2.7)
- C12Y207/07—Nucleotidyltransferases (2.7.7)
- C12Y207/07049—RNA-directed DNA polymerase (2.7.7.49), i.e. telomerase or reverse-transcriptase
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/10—Type of nucleic acid
- C12N2310/20—Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Genetics & Genomics (AREA)
- Chemical & Material Sciences (AREA)
- Engineering & Computer Science (AREA)
- Organic Chemistry (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Wood Science & Technology (AREA)
- Zoology (AREA)
- Molecular Biology (AREA)
- Biomedical Technology (AREA)
- General Engineering & Computer Science (AREA)
- Biotechnology (AREA)
- Biochemistry (AREA)
- General Health & Medical Sciences (AREA)
- Microbiology (AREA)
- Medicinal Chemistry (AREA)
- Physics & Mathematics (AREA)
- Biophysics (AREA)
- Plant Pathology (AREA)
- Virology (AREA)
- Cell Biology (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
Abstract
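The invention discloses a PE-P3 prime editing system and its application in genome base editing. The PE-P3 prime editing system comprises a fusion protein or a biological material related to the fusion protein, and a pegRNA or a biological material related to the pegRNA. The fusion protein comprises a reverse transcriptase, a Cas9 nickase, a self-cleaving oligopeptide and a selection marker protein; the reverse transcriptase is fused to the N-terminus of the Cas9 nickase, and is fused to the selection marker protein through the self-cleaving oligopeptide. Experiments show that, compared with the PE-P2 prime editing system, the PE-P3 prime editing system of the invention edits target sites with significantly higher efficiency.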
Description
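Technical Field
The invention belongs to the field of biotechnology, and specifically relates to a PE-P3 prime editing system and its application in genome base editing.
Background Art
CRISPR-Cas9 technology has become a powerful genome editing tool and is widely used in many tissues and cells. The CRISPR/Cas9 protein-RNA complex is directed to the target site by a guide RNA and cleaves it to produce a DNA double-strand break (DSB), after which the organism activates its DNA repair mechanisms to repair the DSB. There are generally two repair pathways: non-homologous end joining (NHEJ) and homology-directed repair (HDR). NHEJ usually predominates, so repair produces random indels (insertions or deletions) far more often than precise repair. For precise base substitution, HDR is inefficient and requires a DNA template, which greatly limits its use. The cytosine base editors and adenine base editors reported in 2016 and 2017 can precisely convert cytosine (C) to thymine (T) and adenine (A) to guanine (G) without producing a DSB or introducing a DNA template, but they cannot achieve transversions between purines and pyrimidines, that is, A-to-T, T-to-A, C-to-G, G-to-C, A-to-C, T-to-G, C-to-A and G-to-T substitutions. In addition, base editors can only edit C or A within the activity window, and when the window contains several Cs or several As, the target C or A tends to be co-edited with non-target Cs or As, so the intended editing product is not obtained. All these drawbacks greatly limit the practical application of base editors.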
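The coverage gap described above can be made concrete with a short enumeration. The sketch below is illustrative only; its function and variable names are assumptions and do not come from the patent. It lists all 12 single-base substitutions, classifies each as a transition or a transversion, and marks the four transitions reachable by cytosine and adenine base editors (C-to-T and A-to-G, together with their opposite-strand readings G-to-A and T-to-C).

```python
from itertools import permutations

PURINES = {"A", "G"}

def classify(ref: str, alt: str) -> str:
    """Return 'transition' if both bases are purines or both are pyrimidines,
    otherwise 'transversion'."""
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"

# All 12 ordered pairs of distinct bases.
substitutions = list(permutations("ACGT", 2))

# CBE gives C>T (read as G>A on the other strand);
# ABE gives A>G (read as T>C on the other strand).
base_editor_reach = {("C", "T"), ("G", "A"), ("A", "G"), ("T", "C")}

transversions = [(r, a) for r, a in substitutions
                 if classify(r, a) == "transversion"]

for ref, alt in substitutions:
    reach = "base editor" if (ref, alt) in base_editor_reach else "not reachable by CBE/ABE"
    print(f"{ref}>{alt}: {classify(ref, alt)} ({reach})")
```

The eight substitutions left in `transversions` are exactly the ones the paragraph lists as out of reach for base editors.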
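In 2019, David Liu's laboratory reported a new genome editing technology, prime editing, and developed three prime editors (PE): PE1, PE2 and PE3. In all three, a reverse transcriptase (RT) is fused to the Cas9 H840A nickase (Cas9n(H840A)), and genome editing is achieved with a prime editing guide RNA (pegRNA). In addition to the usual guide RNA (sgRNA), the pegRNA contains an RT template carrying the intended base mutation and a primer binding site (PBS). Experiments showed that the technology can make all 12 types of base substitution in animal cell genomes, overcoming the limits of conventional base editors and greatly expanding the range of base editing. In plants, although prime editing can achieve all types of base substitution, editing efficiency remains low and some sites cannot be edited.
Summary of the Invention
In a first aspect, the invention claims a complete system.
The complete system claimed by the invention comprises a fusion protein or a biological material related to the fusion protein, and a pegRNA or a biological material related to the pegRNA;
the fusion protein comprises a reverse transcriptase, a Cas9 nickase, a self-cleaving oligopeptide and a selection marker protein; the reverse transcriptase is fused to the N-terminus of the Cas9 nickase, and is fused to the selection marker protein through the self-cleaving oligopeptide.
In the above system, the fusion protein consists of, in order, the reverse transcriptase, the Cas9 nickase, the self-cleaving oligopeptide and the selection marker protein, or of, in order, the selection marker protein, the self-cleaving oligopeptide, the reverse transcriptase and the Cas9 nickase.
In the above system, the Cas9 nickase may be Cas9n(H840A);
the Cas9 nickase (Cas9n(H840A)) may be any Cas9n or variant thereof known in the prior art, including Cas9n of bacterial origin (e.g., SpCas9n, SaCas9n, SaCas9n-KKH), SpCas9 variant nickases that recognize different PAMs (e.g., xCas9n, Cas9n-NG, Cas9n-VQR, Cas9n-VRER), and high-fidelity Cas9 variant nickases (e.g., HypaCas9n, eSpCas9(1.1)n, Cas9-HF1n).
Further, the Cas9n(H840A) is A1) or A2):
A1) a protein whose amino acid sequence is shown in sequence 2;
A2) a protein having the same function, obtained by substituting and/or deleting and/or adding one or several amino acid residues in the amino acid sequence shown in sequence 2.
Still further, the gene encoding the Cas9n(H840A) is a1), a2) or a3):
a1) the cDNA molecule or DNA molecule shown at positions 2293-6393 of sequence 1;
a2) a cDNA molecule or DNA molecule that has 75% or more identity to the nucleotide sequence defined in a1) and encodes the Cas9n(H840A);
a3) a cDNA molecule or DNA molecule that hybridizes under stringent conditions to the nucleotide sequence defined in a1) or a2) and encodes the Cas9n(H840A).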
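The 75% identity threshold in a2) can be illustrated with a simple calculation. The sketch below assumes the two sequences are already aligned to equal length (gaps written as "-") and divides matches by alignment length, which is only one of several conventions; the text does not specify an alignment tool or settings. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding identical, non-gap residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)

# Toy example: 20 aligned columns, 4 of which differ.
reference = "GACAAGAAGTACTCCATCGG"
variant = "GACAAAAAGTATTCGATCGC"

identity = percent_identity(reference, variant)
meets_threshold = identity >= 75.0
print(f"identity = {identity:.1f}%, meets 75% threshold: {meets_threshold}")
```

Identity alone does not satisfy a2): the variant must also encode the same protein, which this sketch does not check.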
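In the above system, the reverse transcriptase may be of viral origin, such as the reverse transcriptase of Moloney murine leukemia virus (M-MLV) or of cauliflower mosaic virus (CaMV), or of bacterial origin, such as a reverse transcriptase from Escherichia coli.
Further, the reverse transcriptase is the reverse transcriptase from Moloney murine leukemia virus (M-MLV RT); the M-MLV RT is B1) or B2):
B1) a protein whose amino acid sequence is shown in sequence 3;
B2) a protein having the same function, obtained by substituting and/or deleting and/or adding one or several amino acid residues in the amino acid sequence shown in sequence 3.
Still further, the gene encoding the M-MLV RT is b1), b2) or b3):
b1) the cDNA molecule or DNA molecule shown at positions 6493-8523 of sequence 1;
b2) a cDNA molecule or DNA molecule that has 75% or more identity to the nucleotide sequence defined in b1) and encodes the M-MLV RT;
b3) a cDNA molecule or DNA molecule that hybridizes under stringent conditions to the nucleotide sequence defined in b1) or b2) and encodes the M-MLV RT.
In the above system, the selection marker protein (also referred to below as the selection-agent resistance protein) is hygromycin phosphotransferase.
Further, the hygromycin phosphotransferase is D1) or D2):
D1) a protein whose amino acid sequence is shown in sequence 4;
D2) a protein having the same function, obtained by substituting and/or deleting and/or adding one or several amino acid residues in the amino acid sequence shown in sequence 4.
Still further, the gene encoding the hygromycin phosphotransferase is d1), d2) or d3):
d1) the cDNA molecule or DNA molecule shown at positions 8731-9756 of sequence 1;
d2) a cDNA molecule or DNA molecule that has 75% or more identity to the nucleotide sequence defined in d1) and encodes the hygromycin phosphotransferase;
d3) a cDNA molecule or DNA molecule that hybridizes under stringent conditions to the nucleotide sequence defined in d1) or d2) and encodes the hygromycin phosphotransferase.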
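In the above system, the self-cleaving oligopeptide may be a 2A self-cleaving oligopeptide derived from a viral genome, such as the foot-and-mouth disease virus (FMDV) F2A peptide, the equine rhinitis A virus (ERAV) E2A peptide, the Thosea asigna virus T2A peptide, the porcine teschovirus-1 (PTV-1) P2A peptide, the Theilovirus 2A peptide, or the encephalomyocarditis virus 2A peptide.
Further, the self-cleaving oligopeptide is the 2A self-cleaving oligopeptide from porcine teschovirus-1 (P2A); the P2A is C1) or C2):
C1) a protein whose amino acid sequence is shown in sequence 5;
C2) a protein having the same function, obtained by substituting and/or deleting and/or adding one or several amino acid residues in the amino acid sequence shown in sequence 5.
Still further, the gene encoding the P2A is c1), c2) or c3):
c1) the cDNA molecule or DNA molecule shown at positions 8674-8730 of sequence 1;
c2) a cDNA molecule or DNA molecule that has 75% or more identity to the nucleotide sequence defined in c1) and encodes the P2A;
c3) a cDNA molecule or DNA molecule that hybridizes under stringent conditions to the nucleotide sequence defined in c1) or c2) and encodes the P2A.
In the above system, the pegRNA consists of, in order, a target sequence (denoted target sequence A), an esgRNA scaffold, an RT sequence and a PBS sequence.
The RT sequence is the reverse complement of the three bases at the 3' end of the target sequence together with the contiguous genomic sequence that follows them, with the intended mutation introduced. It serves as the template from which the reverse transcriptase synthesizes cDNA, and that cDNA in turn serves as the repair template for the genomic DNA. The RT sequence may further be 8-34 bp long.
The PBS sequence (primer binding site sequence) is the reverse complement of the target sequence from its n-th base, counted from the 5' end, to its 17th base (1≤n<17).
The RT sequence and the PBS sequence may be designed following the methods or principles already reported in the prior art for the RT and PBS sequences of pegRNAs in prime editing (PE).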
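The RT and PBS definitions above translate fairly directly into string operations. The following is a minimal sketch under stated assumptions: the target sequence is exactly 20 nt, edits are given on the genomic strand, and edit offsets are counted from the first of the three 3'-terminal target bases. The function names, the offset convention and the toy sequences are hypothetical and are not taken from the patent or from Table 1.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (returned in uppercase)."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def design_pbs(target: str, n: int) -> str:
    """PBS: reverse complement of target bases n..17 (1-based, from the 5' end)."""
    if len(target) != 20:
        raise ValueError("this sketch assumes a 20-nt target sequence")
    if not 1 <= n < 17:
        raise ValueError("n must satisfy 1 <= n < 17")
    return reverse_complement(target[n - 1:17])

def design_rt(target: str, downstream: str, rt_len: int,
              edits: dict[int, str]) -> str:
    """RT sequence: reverse complement of the last 3 target bases plus the
    genomic sequence that follows, with the intended edits applied first.

    edits maps a 1-based offset (1 = first of the last 3 target bases) to the
    new base on the genomic strand; this convention is an assumption.
    """
    if len(target) != 20:
        raise ValueError("this sketch assumes a 20-nt target sequence")
    window = list((target[-3:] + downstream).upper()[:rt_len])
    if len(window) < rt_len:
        raise ValueError("not enough downstream sequence for this RT length")
    for offset, base in edits.items():
        if not 1 <= offset <= rt_len:
            raise ValueError("edit offset lies outside the RT window")
        window[offset - 1] = base.upper()
    return reverse_complement("".join(window))

target = "ACGTTGCAAGCTTGACCGTA"   # toy 20-nt target, written 5'->3'
downstream = "TGGCATTCAGGA"       # toy genomic sequence following the target
pbs = design_pbs(target, n=5)     # 13-nt PBS covering target bases 5..17
rt = design_rt(target, downstream, rt_len=10, edits={1: "T"})
extension = rt + pbs              # pegRNA 3' part: RT sequence, then PBS
print(rt, pbs)
```

Because both parts are reverse complements of the genomic strand, the RT sequence precedes the PBS in the pegRNA, matching the order "target sequence, esgRNA scaffold, RT sequence, PBS sequence" given above.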
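The esgRNA scaffold is F1), F2) or F3):
F1) the RNA molecule obtained by replacing T with U in positions 11008-11093 of sequence 1;
F2) an RNA molecule having the same function, obtained by substituting and/or deleting and/or adding one or several nucleotides in the RNA molecule of F1);
F3) an RNA molecule that has 75% or more identity to the nucleotide sequence defined in F1) or F2) and has the same function.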
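Definition F1) combines two operations: taking a region by 1-based inclusive coordinates and rewriting T as U. A minimal sketch follows; the helper names are hypothetical, and the short `toy_vector` string merely stands in for sequence 1, which is too long to embed here.

```python
def extract_region(seq: str, start: int, end: int) -> str:
    """Return seq[start..end] using 1-based inclusive coordinates,
    the convention used in the sequence listing."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("coordinates out of range")
    return seq[start - 1:end]

def dna_to_rna(dna: str) -> str:
    """Write a DNA sequence as RNA by replacing T with U."""
    return dna.upper().replace("T", "U")

# Toy stand-in for a vector: 4 nt of flank, a 12-nt region, 4 nt of flank.
toy_vector = "aaaaGTTTTAGAGCTAtttt"
region_rna = dna_to_rna(extract_region(toy_vector, 5, 16))

# Length of the scaffold region named in F1), from its coordinates alone.
scaffold_length = 11093 - 11008 + 1
print(region_rna, scaffold_length)
```

Applied to the full sequence 1, `extract_region(seq1, 11008, 11093)` would return the 86-nt scaffold region that F1) refers to.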
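The above system may further comprise an esgRNA or a biological material related to the esgRNA.
The esgRNA consists of, in order, a target sequence (denoted target sequence B) and the above esgRNA scaffold. This esgRNA is used to nick the non-edited strand, and the nick site on the non-edited strand can be chosen freely. Target sequence A and target sequence B lie on the two strands of the target DNA; they may be fully or partially complementary and overlapping, or separated by some distance.
The uses of the above system are specifically as follows:
S1) editing the genomic sequence of an organism or biological cell;
S2) preparing a product for editing the genomic sequence of an organism or biological cell;
S3) improving the efficiency of editing the genomic sequence of an organism or biological cell;
S4) preparing a product for improving the efficiency of editing the genomic sequence of an organism or biological cell.
In a second aspect, the invention claims new uses of the above system or of the fusion protein in the above system.
The invention claims the use of the above system, or of the fusion protein in the above system, in any one of the following S1)-S4):
S1) editing the genomic sequence of an organism or biological cell;
S2) preparing a product for editing the genomic sequence of an organism or biological cell;
S3) improving the efficiency of editing the genomic sequence of an organism or biological cell;
S4) preparing a product for improving the efficiency of editing the genomic sequence of an organism or biological cell.
In a third aspect, the invention claims the methods described in T1)-T3) below:
T1) a method for editing a genomic sequence, or for improving the efficiency of editing the genomic sequence of an organism or biological cell, comprising the following step: causing the organism or biological cell to express the above fusion protein and the above pegRNA; the pegRNA targets target sequence A and is used to edit the genomic sequence.
T2) a method for editing a genomic sequence, or for improving the efficiency of editing the genomic sequence of an organism or biological cell, comprising the following step: causing the organism or biological cell to express the above fusion protein, the above pegRNA and the above esgRNA; the pegRNA targets target sequence A and is used to edit the genomic sequence; the esgRNA targets target sequence B and is used to nick the non-edited strand so as to improve the editing efficiency of the intended mutation.
T3) a method for preparing a biological mutant, comprising the following step: editing the genomic sequence of an organism or biological cell according to the method of T1) or T2) to obtain the biological mutant.
In the above methods, in T1), the organism or biological cell is caused to express the fusion protein and the pegRNA by introducing the gene encoding the fusion protein and the DNA molecule transcribing the pegRNA into the plant of interest.
In T2), the organism or biological cell is caused to express the fusion protein, the pegRNA and the esgRNA by introducing the gene encoding the fusion protein, the DNA molecule transcribing the pegRNA and the DNA molecule transcribing the esgRNA into the plant of interest.
Further, in T2), the gene encoding the fusion protein, the DNA molecule transcribing the pegRNA and the DNA molecule transcribing the esgRNA are introduced into the plant of interest by means of a recombinant expression vector. They may be introduced on a single recombinant expression vector, or co-introduced on two or more recombinant expression vectors.
In a specific embodiment of the invention, the gene encoding the fusion protein, the DNA molecule transcribing the pegRNA and the DNA molecule transcribing the esgRNA are introduced into the plant of interest on the same recombinant expression vector. The recombinant expression vector comprises an expression cassette consisting of, in order, a promoter, the gene encoding the reverse transcriptase M-MLV RT, the gene encoding Cas9n(H840A), the gene encoding the self-cleaving oligopeptide P2A, the gene encoding the selection-agent resistance protein HPT and a terminator; an expression cassette consisting of, in order, a promoter, the DNA molecule transcribing the esgRNA and poly T; and an expression cassette consisting of, in order, a promoter, the DNA molecule transcribing the pegRNA and poly T.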
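In any of the above systems, uses or methods, the editing of the genomic sequence includes base substitution (single-base or multi-base), base insertion (single-base or multi-base) and base deletion (single-base or multi-base) in the genomic sequence. In a specific embodiment of the invention, the editing of the genomic sequence is base substitution.
In any of the above systems, uses or methods, the organism is X1), X2), X3) or X4):
X1) a plant or an animal;
X2) a monocotyledonous or dicotyledonous plant;
X3) a plant of the family Poaceae;
X4) rice.
The biological cell is Y1), Y2), Y3) or Y4):
Y1) a plant cell or an animal cell;
Y2) a monocotyledonous or dicotyledonous plant cell;
Y3) a cell of a plant of the family Poaceae;
Y4) a rice cell.
To further improve the editing efficiency of the PE-P2 prime editing system, the invention fuses M-MLV to the N-terminus of Cas9n(H840A) and fuses it to the selection-agent resistance protein through the self-cleaving peptide P2A, thereby providing the PE-P3 prime editing system. Compared with the PE-P2 prime editing system, the PE-P3 prime editing system of the invention edits target sites with significantly higher efficiency.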
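Brief Description of the Drawings
Figure 1 is a schematic diagram of the structures of the expression vectors of the prime editing systems PE-P3 and PE-P2.
Figure 2 is a schematic diagram of the RT-M and RT-S template forms.
Detailed Description
The invention is described in further detail below with reference to specific embodiments. The examples are given only to illustrate the invention and not to limit its scope. Unless otherwise specified, the experimental methods in the following examples are conventional methods, and the materials, reagents and instruments used are commercially available. Unless otherwise specified, in the following examples the first position of each nucleotide sequence in the sequence listing is the 5' terminal nucleotide of the corresponding DNA/RNA, and the last position is the 3' terminal nucleotide.
The names and sequences of the primers used to amplify the target genes in the following examples are shown in the table below.
In the following examples,
callus editing efficiency of a prime editor = (reads carrying all mutation sites in group 1 / total reads × 100% + reads carrying all mutation sites in group 2 / total reads × 100% + reads carrying all mutation sites in group 3 / total reads × 100%) / 3.
T0 plant editing efficiency of a prime editor = number of positive T0 plants mutated at all mutation sites / total number of positive T0 plants analyzed × 100%.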
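The two efficiency definitions above can be written out as a short calculation. This is a sketch, not the analysis pipeline actually used: it assumes each read has already been trimmed to the same window so that mutation positions can be checked by index, and every read, position and count below is invented for illustration.

```python
def has_all_mutations(read: str, mutations: dict[int, str]) -> bool:
    """True if the read carries every intended base
    (1-based positions within the trimmed read window)."""
    return all(read[pos - 1] == base for pos, base in mutations.items())

def callus_editing_efficiency(groups: list[list[str]],
                              mutations: dict[int, str]) -> float:
    """Mean over groups of (reads with all mutations / total reads x 100%)."""
    per_group = []
    for reads in groups:
        edited = sum(has_all_mutations(r, mutations) for r in reads)
        per_group.append(100.0 * edited / len(reads))
    return sum(per_group) / len(per_group)

def t0_editing_efficiency(edited_plants: int, analyzed_plants: int) -> float:
    """Positive T0 plants mutated at all sites / positive T0 plants analyzed x 100%."""
    return 100.0 * edited_plants / analyzed_plants

# Hypothetical data: two intended mutations, three pooled groups of four reads.
mutations = {2: "T", 5: "A"}
groups = [
    ["ATCGA", "ATCGA", "ACCGG", "ACCGG"],   # 2 of 4 reads fully edited
    ["ATCGA", "ACCGG", "ACCGG", "ACCGG"],   # 1 of 4
    ["ACCGG", "ACCGG", "ACCGG", "ACCGG"],   # 0 of 4
]
callus_eff = callus_editing_efficiency(groups, mutations)
t0_eff = t0_editing_efficiency(edited_plants=4, analyzed_plants=50)
print(callus_eff, t0_eff)
```

Note that the callus figure is the mean of three per-group percentages, not the pooled ratio of all edited reads to all reads; the two agree only when the groups have equal read depth.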
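Nipponbare rice: reference: Liang Weihong, Wang Gaohua, Du Jingyao, et al. Effects of sodium nitroprusside and its photolysis products on seedling growth and expression of five hormone marker genes in Nipponbare rice [J]. Journal of Henan Normal University (Natural Science Edition), 2017(2):48-52. Available to the public from the Beijing Academy of Agriculture and Forestry Sciences.
Recovery medium: N6 solid medium containing 200 mg/L timentin.
Selection medium: N6 solid medium containing 50 mg/L hygromycin.
Differentiation medium: N6 solid medium containing 2 mg/L KT, 0.2 mg/L NAA, 0.5 g/L glutamic acid and 0.5 g/L proline.
Rooting medium: N6 solid medium containing 0.2 mg/L NAA, 0.5 g/L glutamic acid and 0.5 g/L proline.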
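Example 1. Design of the different prime editing systems
A prime editing system comprises a fusion protein, an esgRNA and a pegRNA. The fusion protein comprises a Cas9 nickase (e.g., Cas9n(H840A)), a reverse transcriptase (e.g., M-MLV), a self-cleaving oligopeptide (e.g., P2A) and a selection marker protein (e.g., HPT). The pegRNA consists of, in order, an esgRNA, a reverse transcription template sequence (RT sequence) and a primer binding site sequence (PBS sequence).
I. Design of the reverse transcriptase and Cas9 nickase in the prime editing system
According to how the reverse transcriptase is joined to the Cas9 nickase, two prime editing systems are distinguished: the prior-art prime editing system PE-P2 and the prime editing system PE-P3 designed in the invention. Schematic diagrams of the structures of the expression vectors of PE-P3 and PE-P2 are shown in Figure 1.
The expression vector of the prime editing system PE-P2 comprises a Cas9n(H840A)&M-MLV&Hpt expression cassette, an esgRNA expression cassette and a pegRNA expression cassette. In the Cas9n(H840A)&M-MLV&Hpt expression cassette, M-MLV is fused to the C-terminus of Cas9n(H840A) and is also fused to the selection-agent resistance protein through the self-cleaving peptide P2A.
The expression vector of the prime editing system PE-P3 comprises an M-MLV&Cas9n(H840A)&Hpt expression cassette, an esgRNA expression cassette and a pegRNA expression cassette. In the M-MLV&Cas9n(H840A)&Hpt expression cassette, M-MLV is fused to the N-terminus of Cas9n(H840A) and is also fused to the selection-agent resistance protein through the self-cleaving peptide P2A.
II. Design of the reverse transcription template in the prime editing system
On the basis of the prime editing systems PE-P2 and PE-P3, two forms of reverse transcription template (RT template) are distinguished according to whether additional mutated bases are introduced into the template: the RT-S template form and the RT-M template form. Figure 2 shows both forms, using the target in that figure as an example.
RT-S template form: relative to the genomic sequence, the RT template contains only a single mutated base, and the site of that base is called the intended mutation site. In other words, in the RT-S form a mutated base is introduced only at the intended mutation site, and the full set of mutation sites is the intended mutation site alone.
RT-M template form: relative to the genomic sequence, the RT template carries additional mutated bases on top of the RT-S change, and the sites of those bases are called additional mutation sites. In other words, in the RT-M form mutated bases are introduced both at the intended mutation site and at other sites (the additional mutation sites), and the full set of mutation sites consists of the intended mutation site and the additional mutation sites.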
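The distinction between the two template forms comes down to how many positions differ from the genomic sequence. The sketch below makes that explicit; it assumes the edited sequence and the reference are written on the same strand and have the same length (substitutions only). The function names and the toy sequences are hypothetical.

```python
def mutation_sites(reference: str, edited: str) -> list[tuple[int, str, str]]:
    """List (1-based position, reference base, edited base) for each mismatch."""
    if len(reference) != len(edited):
        raise ValueError("this sketch handles substitutions only (equal lengths)")
    return [
        (i, r, e)
        for i, (r, e) in enumerate(zip(reference.upper(), edited.upper()), start=1)
        if r != e
    ]

def template_form(reference: str, edited: str) -> str:
    """Classify a template as RT-S (one mutated base) or RT-M (more than one)."""
    count = len(mutation_sites(reference, edited))
    if count == 0:
        return "no edit"
    return "RT-S" if count == 1 else "RT-M"

reference = "GTATGGCATT"   # toy genomic window
single = "TTATGGCATT"      # intended mutation only
multi = "TTACGGCAAT"       # intended mutation plus two additional mutations

print(template_form(reference, single), mutation_sites(reference, single))
print(template_form(reference, multi), mutation_sites(reference, multi))
```

For the RT-M case, the first mismatch is the intended mutation site shared with the RT-S template, and the remaining mismatches are the additional mutation sites.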
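Example 2. Construction of expression vectors for the different prime editing systems and comparison of their base editing efficiency in the rice genome
I. Construction of expression vectors for the different prime editing systems
The following recombinant vectors were synthesized; each is a circular plasmid:
There are 14 expression vectors for the prime editing system PE-P2: PE-P2-1, PE-P2-2, PE-P2-3, PE-P2-4, PE-P2-5, PE-P2-6, PE-P2-7, PE-P2-8, PE-P2-9, PE-P2-10, PE-P2-11, PE-P2-12, PE-P2-13 and PE-P2-14.
There are 14 expression vectors for the prime editing system PE-P3: PE-P3-1, PE-P3-2, PE-P3-3, PE-P3-4, PE-P3-5, PE-P3-6, PE-P3-7, PE-P3-8, PE-P3-9, PE-P3-10, PE-P3-11, PE-P3-12, PE-P3-13 and PE-P3-14.
The sequence of the PE-P2-1 recombinant expression vector is sequence 1 in the sequence listing. In sequence 1: positions 102-2073 are the ZmUbi1 promoter; positions 2293-6393 are the coding sequence of the Cas9n(H840A) protein (without a stop codon), encoding the Cas9n(H840A) protein shown in sequence 2; positions 6493-8523 are the coding sequence of the M-MLV RT protein, encoding the M-MLV RT protein shown in sequence 3; positions 8674-8730 are the coding sequence of P2A, encoding the protein shown in sequence 5; positions 8731-9756 are the coding sequence of hygromycin phosphotransferase, encoding the hygromycin phosphotransferase protein shown in sequence 4; positions 9763-10017 are the Nos terminator; positions 10026-10491 are the OsU6a promoter; positions 10492-10511 are the target sequence of the esgRNA that nicks the non-edited strand; positions 10512-10597 are the scaffold sequence of that esgRNA; positions 10598-10606 are poly T; positions 10607-10987 are the OsU3 promoter; positions 10988-11007 are the pegRNA-01 target sequence; positions 11008-11093 are the esgRNA scaffold sequence of pegRNA-01; positions 11094-11120 are the RT&PBS sequence of pegRNA-01; and positions 11121-11128 are poly T. The nicking esgRNA target sequence, the pegRNA-01 target sequence and the pegRNA-01 RT&PBS sequence in the PE-P2-1 recombinant expression vector are listed in Table 1.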
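The sequence of the PE-P2-2 recombinant expression vector is obtained from the PE-P2-1 sequence by replacing the nicking esgRNA target sequence (non-edited strand), the pegRNA-01 target sequence and the pegRNA-01 RT&PBS sequence with the corresponding nicking esgRNA target sequence, target sequence and RT&PBS sequence of pegRNA-02, leaving all other sequences unchanged. These pegRNA-02 sequences are listed in Table 1.
The sequence of the PE-P2-3 recombinant expression vector is obtained from the PE-P2-1 sequence by replacing the nicking esgRNA target sequence (non-edited strand), the pegRNA-01 target sequence and the pegRNA-01 RT&PBS sequence with the corresponding nicking esgRNA target sequence, target sequence and RT&PBS sequence of pegRNA-03, leaving all other sequences unchanged. These pegRNA-03 sequences are listed in Table 1.
The sequence of the PE-P2-4 recombinant expression vector is obtained from the PE-P2-1 sequence by replacing the nicking esgRNA target sequence (non-edited strand), the pegRNA-01 target sequence and the pegRNA-01 RT&PBS sequence with the corresponding nicking esgRNA target sequence, target sequence and RT&PBS sequence of pegRNA-04, leaving all other sequences unchanged. These pegRNA-04 sequences are listed in Table 1.
The sequence of the PE-P2-5 recombinant expression vector is obtained from the PE-P2-1 sequence by replacing the nicking esgRNA target sequence (non-edited strand), the pegRNA-01 target sequence and the pegRNA-01 RT&PBS sequence with the corresponding nicking esgRNA target sequence, target sequence and RT&PBS sequence of pegRNA-05, leaving all other sequences unchanged. These pegRNA-05 sequences are listed in Table 1.
The sequence of the PE-P2-6 recombinant expression vector is obtained from the PE-P2-1 sequence by replacing the nicking esgRNA target sequence (non-edited strand), the pegRNA-01 target sequence and the pegRNA-01 RT&PBS sequence with the corresponding nicking esgRNA target sequence, target sequence and RT&PBS sequence of pegRNA-06, leaving all other sequences unchanged. These pegRNA-06 sequences are listed in Table 1.
The sequence of the PE-P2-7 recombinant expression vector is obtained from the PE-P2-1 sequence by replacing the nicking esgRNA target sequence (non-edited strand), the pegRNA-01 target sequence and the pegRNA-01 RT&PBS sequence with the corresponding nicking esgRNA target sequence, target sequence and RT&PBS sequence of pegRNA-07, leaving all other sequences unchanged. These pegRNA-07 sequences are listed in Table 1.
The sequence of the PE-P2-8 recombinant expression vector is obtained from the PE-P2-1 sequence by replacing the nicking esgRNA target sequence (non-edited strand), the pegRNA-01 target sequence and the pegRNA-01 RT&PBS sequence with the corresponding nicking esgRNA target sequence, target sequence and RT&PBS sequence of pegRNA-08, leaving all other sequences unchanged. These pegRNA-08 sequences are listed in Table 1.
The sequence of the PE-P2-9 recombinant expression vector is obtained from the PE-P2-1 sequence by replacing the nicking esgRNA target sequence (non-edited strand), the pegRNA-01 target sequence and the pegRNA-01 RT&PBS sequence with the corresponding nicking esgRNA target sequence, target sequence and RT&PBS sequence of pegRNA-09, leaving all other sequences unchanged. These pegRNA-09 sequences are listed in Table 1.
The sequence of the PE-P2-10 recombinant expression vector is obtained from the PE-P2-1 sequence by replacing the nicking esgRNA target sequence (non-edited strand), the pegRNA-01 target sequence and the pegRNA-01 RT&PBS sequence with the corresponding nicking esgRNA target sequence, target sequence and RT&PBS sequence of pegRNA-10, leaving all other sequences unchanged. These pegRNA-10 sequences are listed in Table 1.
The sequence of the PE-P2-11 recombinant expression vector is obtained from the PE-P2-1 sequence by replacing the nicking esgRNA target sequence (non-edited strand), the pegRNA-01 target sequence and the pegRNA-01 RT&PBS sequence with the corresponding nicking esgRNA target sequence, target sequence and RT&PBS sequence of pegRNA-11, leaving all other sequences unchanged. These pegRNA-11 sequences are listed in Table 1.
The sequence of the PE-P2-12 recombinant expression vector is obtained from the PE-P2-1 sequence by replacing the nicking esgRNA target sequence (non-edited strand), the pegRNA-01 target sequence and the pegRNA-01 RT&PBS sequence with the corresponding nicking esgRNA target sequence, target sequence and RT&PBS sequence of pegRNA-12, leaving all other sequences unchanged. These pegRNA-12 sequences are listed in Table 1.
The sequence of the PE-P2-13 recombinant expression vector is obtained from the PE-P2-1 sequence by replacing the nicking esgRNA target sequence (non-edited strand), the pegRNA-01 target sequence and the pegRNA-01 RT&PBS sequence with the corresponding nicking esgRNA target sequence, target sequence and RT&PBS sequence of pegRNA-13, leaving all other sequences unchanged. These pegRNA-13 sequences are listed in Table 1.
The sequence of the PE-P2-14 recombinant expression vector is obtained from the PE-P2-1 sequence by replacing the nicking esgRNA target sequence (non-edited strand), the pegRNA-01 target sequence and the pegRNA-01 RT&PBS sequence with the corresponding nicking esgRNA target sequence, target sequence and RT&PBS sequence of pegRNA-14, leaving all other sequences unchanged. These pegRNA-14 sequences are listed in Table 1.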
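Each of the PE-P2-2 to PE-P2-14 definitions above is the same operation: swap three regions of the PE-P2-1 sequence and keep everything else. A minimal sketch of that operation follows. The function name is hypothetical, the demonstration runs on a toy string rather than on sequence 1, and the replacement sequences themselves (Table 1) are not reproduced here.

```python
def replace_regions(seq: str, replacements: dict[tuple[int, int], str]) -> str:
    """Replace 1-based inclusive regions of seq with new sequences.

    Regions are applied from right to left so that a replacement of a
    different length does not shift the coordinates of regions to its left.
    """
    spans = sorted(replacements)
    for (start, end) in spans:
        if not 1 <= start <= end <= len(seq):
            raise ValueError(f"region {start}-{end} is out of range")
    for (_, end_a), (start_b, _) in zip(spans, spans[1:]):
        if start_b <= end_a:
            raise ValueError("regions must not overlap")
    out = seq
    for start, end in reversed(spans):
        out = out[:start - 1] + replacements[(start, end)] + out[end:]
    return out

# The three variable regions of PE-P2-1, by the coordinates stated for sequence 1.
PE_P2_1_VARIABLE_REGIONS = {
    "nicking esgRNA target": (10492, 10511),
    "pegRNA target": (10988, 11007),
    "pegRNA RT&PBS": (11094, 11120),
}

# Toy demonstration: swap two regions, one of them for a longer sequence.
toy = "AAAACCCCGGGGTTTTAAAA"
derived = replace_regions(toy, {(5, 8): "cc", (13, 16): "tttttt"})
print(derived)
```

Right-to-left application matters because the RT&PBS sequence can differ in length between pegRNAs, so a left-to-right pass would need to re-offset every later coordinate.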
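The sequence of the PE-P3-1 recombinant expression vector is sequence 6 in the sequence listing. In sequence 6: positions 102-2073 are the ZmUbi1 promoter; positions 2290-4320 are the coding sequence of the M-MLV RT protein, encoding the M-MLV RT protein shown in sequence 3; positions 4420-8520 are the coding sequence of the Cas9n(H840A) protein (without a stop codon), encoding the Cas9n(H840A) protein shown in sequence 2; positions 8671-8727 are the coding sequence of P2A, encoding the protein shown in sequence 5; positions 8728-9753 are the coding sequence of hygromycin phosphotransferase, encoding the hygromycin phosphotransferase protein shown in sequence 4; positions 9760-10014 are the Nos terminator; positions 10023-10488 are the OsU6a promoter; positions 10489-10508 are the target sequence of the esgRNA that nicks the non-edited strand; positions 10509-10594 are the scaffold sequence of that esgRNA; positions 10595-10603 are poly T; positions 10604-10984 are the OsU3 promoter; positions 10985-11004 are the pegRNA-01 target sequence; positions 11005-11090 are the esgRNA scaffold sequence of pegRNA-01; positions 11091-11117 are the RT&PBS sequence of pegRNA-01; and positions 11118-11125 are poly T. The nicking esgRNA target sequence, the pegRNA-01 target sequence and the pegRNA-01 RT&PBS sequence in the PE-P3-1 recombinant expression vector are listed in Table 1.
The sequence of the PE-P3-2 recombinant expression vector is obtained from the PE-P3-1 sequence by replacing the nicking esgRNA target sequence (non-edited strand), the pegRNA-01 target sequence and the pegRNA-01 RT&PBS sequence with the corresponding nicking esgRNA target sequence, target sequence and RT&PBS sequence of pegRNA-02, leaving all other sequences unchanged. These pegRNA-02 sequences are listed in Table 1.
The sequence of the PE-P3-3 recombinant expression vector is obtained from the PE-P3-1 sequence by replacing the nicking esgRNA target sequence (non-edited strand), the pegRNA-01 target sequence and the pegRNA-01 RT&PBS sequence with the corresponding nicking esgRNA target sequence, target sequence and RT&PBS sequence of pegRNA-03, leaving all other sequences unchanged. These pegRNA-03 sequences are listed in Table 1.
The sequence of the PE-P3-4 recombinant expression vector is obtained from the PE-P3-1 sequence by replacing the nicking esgRNA target sequence (non-edited strand), the pegRNA-01 target sequence and the pegRNA-01 RT&PBS sequence with the corresponding nicking esgRNA target sequence, target sequence and RT&PBS sequence of pegRNA-04, leaving all other sequences unchanged. These pegRNA-04 sequences are listed in Table 1.
The sequence of the PE-P3-5 recombinant expression vector is obtained from the PE-P3-1 sequence by replacing the nicking esgRNA target sequence (non-edited strand), the pegRNA-01 target sequence and the pegRNA-01 RT&PBS sequence with the corresponding nicking esgRNA target sequence, target sequence and RT&PBS sequence of pegRNA-05, leaving all other sequences unchanged. These pegRNA-05 sequences are listed in Table 1.
The sequence of the PE-P3-6 recombinant expression vector is obtained from the PE-P3-1 sequence by replacing the nicking esgRNA target sequence (non-edited strand), the pegRNA-01 target sequence and the pegRNA-01 RT&PBS sequence with the corresponding nicking esgRNA target sequence, target sequence and RT&PBS sequence of pegRNA-06, leaving all other sequences unchanged. These pegRNA-06 sequences are listed in Table 1.
The sequence of the PE-P3-7 recombinant expression vector is obtained from the PE-P3-1 sequence by replacing the nicking esgRNA target sequence (non-edited strand), the pegRNA-01 target sequence and the pegRNA-01 RT&PBS sequence with the corresponding nicking esgRNA target sequence, target sequence and RT&PBS sequence of pegRNA-07, leaving all other sequences unchanged. These pegRNA-07 sequences are listed in Table 1.
The sequence of the PE-P3-8 recombinant expression vector is obtained from the PE-P3-1 sequence by replacing the nicking esgRNA target sequence (non-edited strand), the pegRNA-01 target sequence and the pegRNA-01 RT&PBS sequence with the corresponding nicking esgRNA target sequence, target sequence and RT&PBS sequence of pegRNA-08, leaving all other sequences unchanged. These pegRNA-08 sequences are listed in Table 1.
The sequence of the PE-P3-9 recombinant expression vector is obtained from the PE-P3-1 sequence by replacing the nicking esgRNA target sequence (non-edited strand), the pegRNA-01 target sequence and the pegRNA-01 RT&PBS sequence with the corresponding nicking esgRNA target sequence, target sequence and RT&PBS sequence of pegRNA-09, leaving all other sequences unchanged. These pegRNA-09 sequences are listed in Table 1.
The sequence of the PE-P3-10 recombinant expression vector is obtained from the PE-P3-1 sequence by replacing the nicking esgRNA target sequence (non-edited strand), the pegRNA-01 target sequence and the pegRNA-01 RT&PBS sequence with the corresponding nicking esgRNA target sequence, target sequence and RT&PBS sequence of pegRNA-10, leaving all other sequences unchanged. These pegRNA-10 sequences are listed in Table 1.
The sequence of the PE-P3-11 recombinant expression vector is obtained from the PE-P3-1 sequence by replacing the nicking esgRNA target sequence (non-edited strand), the pegRNA-01 target sequence and the pegRNA-01 RT&PBS sequence with the corresponding nicking esgRNA target sequence, target sequence and RT&PBS sequence of pegRNA-11, leaving all other sequences unchanged. These pegRNA-11 sequences are listed in Table 1.
The sequence of the PE-P3-12 recombinant expression vector is obtained from the PE-P3-1 sequence by replacing the nicking esgRNA target sequence (non-edited strand), the pegRNA-01 target sequence and the pegRNA-01 RT&PBS sequence with the corresponding nicking esgRNA target sequence, target sequence and RT&PBS sequence of pegRNA-12, leaving all other sequences unchanged. These pegRNA-12 sequences are listed in Table 1.
The sequence of the PE-P3-13 recombinant expression vector is obtained from the PE-P3-1 sequence by replacing the nicking esgRNA target sequence (non-edited strand), the pegRNA-01 target sequence and the pegRNA-01 RT&PBS sequence with the corresponding nicking esgRNA target sequence, target sequence and RT&PBS sequence of pegRNA-13, leaving all other sequences unchanged. These pegRNA-13 sequences are listed in Table 1.
The sequence of the PE-P3-14 recombinant expression vector is obtained from the PE-P3-1 sequence by replacing the nicking esgRNA target sequence (non-edited strand), the pegRNA-01 target sequence and the pegRNA-01 RT&PBS sequence with the corresponding nicking esgRNA target sequence, target sequence and RT&PBS sequence of pegRNA-14, leaving all other sequences unchanged. These pegRNA-14 sequences are listed in Table 1.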
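The coordinates given for sequence 1 (PE-P2-1) and sequence 6 (PE-P3-1) can be cross-checked with simple arithmetic. The sketch below uses only the positions stated in the text for the fusion-protein cassette; the dictionary and function names are hypothetical. It confirms that each coding region has the same length in both vectors, that each length is a whole number of codons, and that the only difference in the fusion is the order of the Cas9n(H840A) and M-MLV RT coding sequences.

```python
# 1-based inclusive coordinates as stated for each vector.
SEQ1_PE_P2_1 = {
    "Cas9n(H840A)": (2293, 6393),
    "M-MLV RT": (6493, 8523),
    "P2A": (8674, 8730),
    "HPT": (8731, 9756),
}
SEQ6_PE_P3_1 = {
    "M-MLV RT": (2290, 4320),
    "Cas9n(H840A)": (4420, 8520),
    "P2A": (8671, 8727),
    "HPT": (8728, 9753),
}

def span_length(span: tuple[int, int]) -> int:
    """Length of a 1-based inclusive region."""
    start, end = span
    return end - start + 1

def feature_order(features: dict[str, tuple[int, int]]) -> list[str]:
    """Feature names sorted by start coordinate (5' to 3', i.e. N- to C-terminal)."""
    return sorted(features, key=lambda name: features[name][0])

for name in SEQ1_PE_P2_1:
    len1 = span_length(SEQ1_PE_P2_1[name])
    len6 = span_length(SEQ6_PE_P3_1[name])
    print(f"{name}: {len1} nt in sequence 1, {len6} nt in sequence 6, "
          f"whole codons: {len1 % 3 == 0}")

print("PE-P2 order:", " > ".join(feature_order(SEQ1_PE_P2_1)))
print("PE-P3 order:", " > ".join(feature_order(SEQ6_PE_P3_1)))
```

The downstream elements in sequence 6 sit 3 nt earlier than in sequence 1 (for example P2A at 8671 rather than 8674), which is why the overall coordinates of the two vectors differ even though the parts are the same length.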
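The target nucleotide sequence and RT&PBS sequence of the pegRNA in each vector, and the target sequence of the esgRNA used to nick the non-edited strand, are shown in Table 1.
Table 1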
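II. Obtaining resistant rice calli and positive T0 plants
The recombinant expression vectors constructed in Step I (PE-P2-1 to PE-P2-14 and PE-P3-1 to PE-P3-14) were each processed according to steps 1-9 below:
1. The vector was introduced into Agrobacterium EHA105 (Shanghai Weidi Biotechnology Co., Ltd., CAT#: AC1010) to obtain recombinant Agrobacterium.
2. The recombinant Agrobacterium was cultured in YEP medium containing 50 μg/ml kanamycin and 25 μg/ml rifampicin at 28°C with shaking at 150 rpm until the OD600 reached 1.0-2.0, then centrifuged at 10000 rpm for 1 min at room temperature. The cells were resuspended in infection solution (N6 liquid medium in which the sugar is replaced by glucose and sucrose, at 10 g/L and 20 g/L respectively) and diluted to an OD600 of 0.2 to give the Agrobacterium infection solution.
3. Mature seeds of the rice cultivar Nipponbare were dehulled, placed in a 100 mL Erlenmeyer flask, soaked in 70% (v/v) aqueous ethanol for 30 sec, then sterilized in 25% (v/v) aqueous sodium hypochlorite with shaking at 120 rpm for 30 min, rinsed three times with sterile water and blotted dry with filter paper. The seeds were then placed embryo-side down on N6 solid medium and cultured in the dark at 28°C for 4-6 weeks to obtain rice calli.
4. After step 3, the rice calli were soaked for 10 min in Agrobacterium infection solution A (Agrobacterium infection solution supplemented with acetosyringone at a volume ratio of 25 μl acetosyringone to 50 ml infection solution), then placed in a Petri dish lined with two layers of sterile filter paper (containing about 200 ml of infection solution without Agrobacterium) and cultured in the dark at 21°C for 1 day.
5. The rice calli from step 4 were placed on recovery medium and cultured in the dark at 25-28°C for 3 days.
6. The rice calli from step 5 were placed on selection medium and cultured in the dark at 28°C for 2 weeks.
7. The rice calli from step 6 were placed on fresh selection medium and cultured in the dark at 28°C for a further 2 weeks to obtain resistant rice calli.
8. The resistant rice calli from step 7 were placed on differentiation medium and cultured in the light at 25°C for about 1 month. The regenerated plantlets were transferred to rooting medium and cultured in the light at 25°C for 2 weeks to obtain rice T0 plants.
9. For the PE-P2-1 to PE-P2-14 recombinant expression vectors, genomic DNA was extracted from the resulting rice T0 plants and used as template for PCR with the primer pair consisting of primer F (5'-GATCTTGATATACTTGGATGATGGC-3') and primer R (5'-GGGGTACTTCTCGTGGTAGG-3'). The PCR product was analyzed by agarose gel electrophoresis and scored as follows: if the product contained a DNA fragment of about 753 bp, the rice T0 plant was a positive T0 plant; if it did not, the plant was not a positive T0 plant. For the PE-P3-1 to PE-P3-14 recombinant expression vectors, genomic DNA was extracted from the resulting rice T0 plants and used as template for PCR with the primer pair consisting of primer F (5'-GATCTTGATATACTTGGATGATGGC-3') and primer R (5'-ATGACTGTCTCCTTCCTTGCC-3'). The PCR product was analyzed by agarose gel electrophoresis and scored as follows: if the product contained a DNA fragment of about 1220 bp, the rice T0 plant was a positive T0 plant; if it did not, the plant was not a positive T0 plant.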
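The screening rule in step 9 can be sketched as two small functions: one that predicts the product size from where a primer pair binds, and one that scores a plant from observed band sizes. This is an illustration under stated assumptions, not the authors' procedure: the 50-bp tolerance for "about" is arbitrary, the function names are hypothetical, and the demonstration uses a toy template. Read against the portion of sequence 1 reproduced in the sequence listing, primer F appears to match positions 1936-1960 and the reverse complement of the PE-P2 primer R appears to match positions 2669-2688, which is consistent with the stated product of about 753 bp (2688 - 1936 + 1).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (returned in uppercase)."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def amplicon_length(template: str, forward: str, reverse: str):
    """Length from the 5' end of the forward primer site to the end of the
    reverse primer site on the given strand, or None if either site is absent."""
    t = template.upper()
    start = t.find(forward.upper())
    if start == -1:
        return None
    site = reverse_complement(reverse)
    site_start = t.find(site, start)
    if site_start == -1:
        return None
    return site_start + len(site) - start

def is_positive(band_sizes: list[int], expected: int, tolerance: int = 50) -> bool:
    """A plant is scored positive if any band lies within the tolerance
    of the expected size (the tolerance value is an assumption)."""
    return any(abs(size - expected) <= tolerance for size in band_sizes)

# Toy template: 2 nt flank, 10-nt forward site, 8-nt spacer, 11-nt reverse site, 2 nt flank.
toy_template = "TT" + "GATCTTGATA" + "CCCCAAAA" + "CCTACCACGAG" + "TT"
toy_length = amplicon_length(toy_template, "GATCTTGATA", "CTCGTGGTAGG")

EXPECTED_BP = {"PE-P2": 753, "PE-P3": 1220}
print(toy_length, is_positive([760], EXPECTED_BP["PE-P2"]))
```

The two systems give different product sizes with the same forward primer because the reverse primers bind different coding sequences, which lie at different distances from the promoter in PE-P2 and PE-P3.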
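III. Analysis of results
1. For each vector, 24 resistant calli obtained in step 7 of Step II were selected at random. After DNA extraction, the DNA of 8 calli was randomly pooled at a time, giving 3 pooled DNA samples, i.e. 3 groups. Using the pooled DNA as template, a first-round PCR product was obtained with the primer pair matching each target: primer pair OsALS-1 for the OsALS-1 target, primer pair OsACC-2 for the OsACC-2 target, primer pair OsWaxy-1 for the OsWaxy-1 target, primer pair OsDEP1 for the OsDEP1 target, and primer pair OsALS-2 for the OsALS-2 target. Using the first-round PCR product as template, different forward and reverse barcodes were added to the ends of the PCR products for library construction, and the libraries were pooled and sequenced on the Illumina NovaSeq6000 high-throughput sequencing platform, with 2G of data per pooled library (Beijing Novogene Technology Co., Ltd.). Only the pegRNA region of each target was analyzed. The callus editing efficiency of a prime editor is the mean over the 3 groups of the proportion of reads carrying all mutation sites among total reads. The results are shown in Table 2.
The results show that for the OsALS-1 mutation (+1G/T) and the OsACC-2 mutation (+5G/C), the callus editing efficiency of the prime editor PE-P2 was 0 in both cases, whereas PE-P3 achieved both mutations, with callus editing efficiencies of 2.59% and 4.41% respectively. For the OsALS-1 mutations (+1,+2,+5G/T) and the OsACC-2 mutations (+3,+5,+12A/G,G/C,T/C), the callus editing efficiency of PE-P2 was low, 4.34% and 0.47% respectively, whereas PE-P3 raised it to 10.55% and 9.6%. For the OsWaxy-1 mutations (+1,+10,+14C/T,T/A,T/C), the callus editing efficiency was 2.21% with PE-P2 and 17.16% with PE-P3. PE-P3 therefore substantially increased callus editing efficiency.
For the OsALS-1 target, the callus editing efficiencies of PE-P2 with the RT-M template form (+1,+2,+5G/T) and the RT-S template form (+1G/T) were 4.34% and 0% respectively, and those of PE-P3 with the RT-M form (+1,+2,+5G/T) and the RT-S form (+1G/T) were 10.55% and 2.59%. For the OsACC-2 target, the callus editing efficiencies of PE-P2 with the RT-M form (+3,+5,+12A/G,G/C,T/C) and the RT-S form (+5G/C) were 0.47% and 0%, and those of PE-P3 with the RT-M form (+3,+5,+12A/G,G/C,T/C) and the RT-S form (+5G/C) were 9.6% and 4.41%. For the OsWaxy-1 target, the callus editing efficiencies of PE-P2 with the RT-M form (+1,+10,+14C/T,T/A,T/C) and the RT-S form (+14T/C) were 2.21% and 0%. For the OsDEP1 target, the callus editing efficiencies of PE-P2 with the RT-M form (+8,+10,+12,+16A/C,C/A,T/G,T/A) and the RT-S form (+8A/C) were 2.58% and 1.06%. For the OsALS-2 target, the callus editing efficiencies of PE-P3 with the RT-M form (+2,+5,+9C/A,G/A,C/T) and the RT-S form (+9C/T) were 3.86% and 0%. In every comparison the RT-M template design gave a higher callus editing efficiency than the RT-S form.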
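The PE-P2 versus PE-P3 comparison above can be summarized as fold changes. The sketch below uses only the callus editing efficiencies quoted in the text; the dictionary layout and function name are hypothetical. Where PE-P2 scored 0%, a ratio is undefined, so the function returns `None` rather than inventing a value.

```python
# (PE-P2 %, PE-P3 %) callus editing efficiencies quoted in the text.
CALLUS_EFFICIENCY = {
    "OsALS-1 RT-S (+1G/T)": (0.0, 2.59),
    "OsACC-2 RT-S (+5G/C)": (0.0, 4.41),
    "OsALS-1 RT-M (+1,+2,+5)": (4.34, 10.55),
    "OsACC-2 RT-M (+3,+5,+12)": (0.47, 9.6),
    "OsWaxy-1 RT-M (+1,+10,+14)": (2.21, 17.16),
}

def fold_change(baseline: float, improved: float):
    """Ratio improved / baseline, or None when the baseline is zero."""
    if baseline == 0:
        return None
    return improved / baseline

folds = {
    site: fold_change(pe_p2, pe_p3)
    for site, (pe_p2, pe_p3) in CALLUS_EFFICIENCY.items()
}

for site, fold in folds.items():
    pe_p2, pe_p3 = CALLUS_EFFICIENCY[site]
    if fold is None:
        print(f"{site}: PE-P2 {pe_p2}% -> PE-P3 {pe_p3}% (no edits detected with PE-P2)")
    else:
        print(f"{site}: PE-P2 {pe_p2}% -> PE-P3 {pe_p3}% ({fold:.2f}-fold)")
```

Reporting the absolute gain in percentage points alongside the ratio is often more informative when the baseline is close to zero, as for OsACC-2 with PE-P2.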
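2. For each vector, genomic DNA of the positive rice T0 plants obtained in step 9 of Step II was used as template, and a PCR product was obtained with the primer pair matching each target: primer pair OsACC-2 for the OsACC-2 target, primer pair OsACC-1 for the OsACC-1 target, primer pair OsChalk5 for the OsChalk5 target, primer pair OsDEP1 for the OsDEP1 target, primer pair OsALS-2 for the OsALS-2 target, and primer pair OsWaxy-1 for the OsWaxy-1 target. All PCR products were Sanger sequenced. Only the pegRNA region of each target was analyzed. The number of T0 plants carrying base substitutions at each target was counted and the T0 plant editing efficiency of each prime editor was calculated. The results are shown in Table 3.
The results show that for the OsACC-2 mutation (+5G/C) and the OsDEP1 mutation (+8A/C), the T0 plant editing efficiencies of PE-P2 were 0% and 2.0% respectively, whereas PE-P3 increased editing at both targets, with T0 plant editing efficiencies of 10.0% and 8.0%. For the OsACC-1 mutations (+2,+5,+10T/A,G/C,A/G), the OsChalk5 mutations (+5,+14,+17G/C,T/C,A/T) and the OsWaxy-1 mutations (+1,+10,+14C/T,T/A,T/C), the T0 plant editing efficiencies of PE-P2 were low, 5.4%, 1.9% and 7.1% respectively, whereas PE-P3 raised them to 8.0%, 5.9% and 59.2%. PE-P3 therefore substantially increased T0 plant editing efficiency.
For the OsACC-2 target, the T0 plant editing efficiencies of PE-P3 with the RT-M template form (+3,+5,+12A/G,G/C,T/C) and the RT-S template form (+5G/C) were 32.0% and 10.0% respectively. For the OsACC-1 target, those of PE-P2 with the RT-M form (+2,+5,+10T/A,G/C,A/G) and the RT-S form (+10A/G) were 5.4% and 0%. For the OsChalk5 target, those of PE-P2 with the RT-M form (+5,+14,+17G/C,T/C,A/T) and the RT-S form (+17A/T) were 1.9% and 0%. For the OsDEP1 target, those of PE-P2 with the RT-M form (+8,+10,+12,+16A/C,C/A,T/G,T/A) and the RT-S form (+8A/C) were 2.6% and 2.0%. For the OsALS-2 target, those of PE-P3 with the RT-M form (+2,+5,+9C/A,G/A,C/T) and the RT-S form (+9C/T) were 8.0% and 0%. For the OsWaxy-1 target, those of PE-P2 with the RT-M form (+1,+10,+14C/T,T/A,T/C) and the RT-S form (+14T/C) were 7.1% and 0%. In every comparison the RT-M template design gave a higher T0 plant editing efficiency than the RT-S form.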
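The RT-M versus RT-S comparison in the paragraph above can be checked mechanically. The sketch below tabulates only the T0 plant editing efficiencies quoted there; the record type and field names are hypothetical. It verifies that RT-M exceeds RT-S in every listed pair and reports the gain in percentage points.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TemplateComparison:
    editor: str      # prime editor used for both template forms
    target: str      # target site name
    rt_m: float      # T0 editing efficiency with the RT-M form, in %
    rt_s: float      # T0 editing efficiency with the RT-S form, in %

    @property
    def gain(self) -> float:
        """RT-M minus RT-S, in percentage points."""
        return self.rt_m - self.rt_s

# Values quoted in the text for T0 plants.
T0_COMPARISONS = [
    TemplateComparison("PE-P3", "OsACC-2", 32.0, 10.0),
    TemplateComparison("PE-P2", "OsACC-1", 5.4, 0.0),
    TemplateComparison("PE-P2", "OsChalk5", 1.9, 0.0),
    TemplateComparison("PE-P2", "OsDEP1", 2.6, 2.0),
    TemplateComparison("PE-P3", "OsALS-2", 8.0, 0.0),
    TemplateComparison("PE-P2", "OsWaxy-1", 7.1, 0.0),
]

rt_m_always_higher = all(c.rt_m > c.rt_s for c in T0_COMPARISONS)
largest = max(T0_COMPARISONS, key=lambda c: c.gain)

for c in sorted(T0_COMPARISONS, key=lambda c: c.gain, reverse=True):
    print(f"{c.editor} {c.target}: RT-M {c.rt_m}% vs RT-S {c.rt_s}% "
          f"(+{c.gain:.1f} points)")
print("RT-M higher in every pair:", rt_m_always_higher)
```

The gains vary widely between targets, from well under one percentage point at OsDEP1 to more than twenty at OsACC-2, so the benefit of the RT-M design appears to be site-dependent.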
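Table 2
Note: numbering of mutated bases in the RT template: the first base from the 3' end of the RT template sequence is denoted +1.
Table 3
Note: numbering of mutated bases in the RT template: the first base from the 3' end of the RT template sequence is denoted +1.
The above are only preferred embodiments of the invention. It should be noted that those of ordinary skill in the art may make improvements and refinements without departing from the technical principles of the invention, and such improvements and refinements shall also be regarded as falling within the scope of protection of the invention.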
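Sequence Listing
<110> Beijing Academy of Agriculture and Forestry Sciences
<120> PE-P3 prime editing system and its application in genome base editing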
<160> 6
<170> PatentIn version 3.5
<210> 1
<211> 17639
<212> DNA
<213> Artificial Sequence
<400> 1
ggtggcagga tatattgtgg tgtaaacaaa ttgacgctta gacaacttaa taacacattg 60
cggacgtttt taatgtaggt accacctaaa tttccaagct tgtcgtgccc ctctctagag 120
ataatgagca ttgcatgtct aagttataaa aaattaccac atattttttt tgtcacactt 180
gtttgaagtg cagtttatct atctttatac atatatttaa actttactct acgaataata 240
taatctatag tactacaata atatcagtgt tttagagaat catataaatg aacagttaga 300
catggtctaa aggacaattg agtattttga caacaggact ctacagtttt atctttttag 360
tgtgcatgtg ttctcctttt tttttgcaaa tagcttcacc tatataatac ttcatccatt 420
ttattagtac atccatttag ggtttagggt taatggtttt tatagactaa tttttttagt 480
acatctattt tattctattt tagcctctaa attaagaaaa ctaaaactct attttagttt 540
ttttatttaa taatttagat ataaaataga ataaaataaa gtgactaaaa attaaacaaa 600
taccctttaa gaaattaaaa aaactaagga aacatttttc ttgtttcgag tagataatgc 660
cagcctgtta aacgccgtcg acgagtctaa cggacaccaa ccagcgaacc agcagcgtcg 720
cgtcgggcca agcgaagcag acggcacggc atctctgtcg ctgcctctgg acccctctcg 780
agagttccgc tccaccgttg gacttgctcc gctgtcggca tccagaaatt gcgtggcgga 840
gcggcagacg tgagccggca cggcaggcgg cctcctcctc ctctcacggc accggcagct 900
acgggggatt cctttcccac cgctccttcg ctttcccttc ctcgcccgcc gtaataaata 960
gacaccccct ccacaccctc tttccccaac ctcgtgttgt tcggagcgca cacacacaca 1020
accagatctc ccccaaatcc acccgtcggc acctccgctt caaggtacgc cgctcgtcct 1080
cccccccccc cctctctacc ttctctagat cggcgttccg gtccatggtt agggcccggt 1140
agttctactt ctgttcatgt ttgtgttaga tccgtgtttg tgttagatcc gtgctgctag 1200
cgttcgtaca cggatgcgac ctgtacgtca gacacgttct gattgctaac ttgccagtgt 1260
ttctctttgg ggaatcctgg gatggctcta gccgttccgc agacgggatc gatttcatga 1320
ttttttttgt ttcgttgcat agggtttggt ttgccctttt cctttatttc aatatatgcc 1380
gtgcacttgt ttgtcgggtc atcttttcat gctttttttt gtcttggttg tgatgatgtg 1440
gtctggttgg gcggtcgttc tagatcggag tagaattctg tttcaaacta cctggtggat 1500
ttattaattt tggatctgta tgtgtgtgcc atacatattc atagttacga attgaagatg 1560
atggatggaa atatcgatct aggataggta tacatgttga tgcgggtttt actgatgcat 1620
atacagagat gctttttgtt cgcttggttg tgatgatgtg gtgtggttgg gcggtcgttc 1680
attcgttcta gatcggagta gaatactgtt tcaaactacc tggtgtattt attaattttg 1740
gaactgtatg tgtgtgtcat acatcttcat agttacgagt ttaagatgga tggaaatatc 1800
gatctaggat aggtatacat gttgatgtgg gttttactga tgcatataca tgatggcata 1860
tgcagcatct attcatatgc tctaaccttg agtacctatc tattataata aacaagtatg 1920
ttttataatt attttgatct tgatatactt ggatgatggc atatgcagca gctatatgtg 1980
gattttttta gccctgcctt catacgctat ttatttgctt ggtactgttt cttttgtcga 2040
tgctcaccct gttgtttggt gttacttctg cagtacgtaa gcatggacta caaggaccac 2100
gacggggatt acaaagacca cgacatagac tacaaggatg acgatgacaa aatggcaccg 2160
aagaaaaaaa ggaaggtcgg cggctccccg aagaaaaaaa ggaaggtcgg cggctccccg 2220
aagaaaaaaa ggaaggtcgg cggctccccg aagaaaaaaa ggaaggtcgg aatccatggc 2280
gttccagaat tcgacaagaa gtactccatc ggcctcgaca tcggcaccaa cagcgtcggc 2340
tgggcggtga tcaccgacga gtacaaggtc ccgtccaaga agttcaaggt cctgggcaac 2400
accgaccgcc actccatcaa gaagaacctc atcggcgccc tcctcttcga ctccggcgag 2460
acggcggagg cgacccgcct caagcgcacc gcccgccgcc gctacacccg ccgcaagaac 2520
cgcatctgct acctccagga gatcttctcc aacgagatgg cgaaggtcga cgactccttc 2580
ttccaccgcc tcgaggagtc cttcctcgtg gaggaggaca agaagcacga gcgccacccc 2640
atcttcggca acatcgtcga cgaggtcgcc taccacgaga agtaccccac tatctaccac 2700
cttcgtaaga agcttgttga ctctactgat aaggctgatc ttcgtctcat ctaccttgct 2760
ctcgctcaca tgatcaagtt ccgtggtcac ttccttatcg agggtgacct taaccctgat 2820
aactccgacg tggacaagct cttcatccag ctcgtccaga cctacaacca gctcttcgag 2880
gagaacccta tcaacgcttc cggtgtcgac gctaaggcga tcctttccgc taggctctcc 2940
aagtccaggc gtctcgagaa cctcatcgcc cagctccctg gtgagaagaa gaacggtctt 3000
ttcggtaacc tcatcgctct ctccctcggt ctgaccccta acttcaagtc caacttcgac 3060
ctcgctgagg acgctaagct tcagctctcc aaggatacct acgacgatga tctcgacaac 3120
ctcctcgctc agattggaga tcagtacgct gatctcttcc ttgctgctaa gaacctctcc 3180
gatgctatcc tcctttcgga tatccttagg gttaacactg agatcactaa ggctcctctt 3240
tctgcttcca tgatcaagcg ctacgacgag caccaccagg acctcaccct cctcaaggct 3300
cttgttcgtc agcagctccc cgagaagtac aaggagatct tcttcgacca gtccaagaac 3360
ggctacgccg gttacattga cggtggagct agccaggagg agttctacaa gttcatcaag 3420
ccaatccttg agaagatgga tggtactgag gagcttctcg ttaagcttaa ccgtgaggac 3480
ctccttagga agcagaggac tttcgataac ggctctatcc ctcaccagat ccaccttggt 3540
gagcttcacg ccatccttcg taggcaggag gacttctacc ctttcctcaa ggacaaccgt 3600
gagaagatcg agaagatcct tactttccgt attccttact acgttggtcc tcttgctcgt 3660
ggtaactccc gtttcgcttg gatgactagg aagtccgagg agactatcac cccttggaac 3720
ttcgaggagg ttgttgacaa gggtgcttcc gcccagtcct tcatcgagcg catgaccaac 3780
ttcgacaaga acctccccaa cgagaaggtc ctccccaagc actccctcct ctacgagtac 3840
ttcacggtct acaacgagct caccaaggtc aagtacgtca ccgagggtat gcgcaagcct 3900
gccttcctct ccggcgagca gaagaaggct atcgttgacc tcctcttcaa gaccaaccgc 3960
aaggtcaccg tcaagcagct caaggaggac tacttcaaga agatcgagtg cttcgactcc 4020
gtcgagatca gcggcgttga ggaccgtttc aacgcttctc tcggtaccta ccacgatctc 4080
ctcaagatca tcaaggacaa ggacttcctc gacaacgagg agaacgagga catcctcgag 4140
gacatcgtcc tcactcttac tctcttcgag gatagggaga tgatcgagga gaggctcaag 4200
acttacgctc atctcttcga tgacaaggtt atgaagcagc tcaagcgtcg ccgttacacc 4260
ggttggggta ggctctcccg caagctcatc aacggtatca gggataagca gagcggcaag 4320
actatcctcg acttcctcaa gtctgatggt ttcgctaaca ggaacttcat gcagctcatc 4380
cacgatgact ctcttacctt caaggaggat attcagaagg ctcaggtgtc cggtcagggc 4440
gactctctcc acgagcacat tgctaacctt gctggttccc ctgctatcaa gaagggcatc 4500
cttcagactg ttaaggttgt cgatgagctt gtcaaggtta tgggtcgtca caagcctgag 4560
aacatcgtca tcgagatggc tcgtgagaac cagactaccc agaagggtca gaagaactcg 4620
agggagcgca tgaagaggat tgaggagggt atcaaggagc ttggttctca gatccttaag 4680
gagcaccctg tcgagaacac ccagctccag aacgagaagc tctacctcta ctacctccag 4740
aacggtaggg atatgtacgt tgaccaggag ctcgacatca acaggctttc tgactacgac 4800
gtcgacgcca ttgttcctca gtctttcctt aaggatgact ccatcgacaa caaggtcctc 4860
acgaggtccg acaagaacag gggtaagtcg gacaacgtcc cttccgagga ggttgtcaag 4920
aagatgaaga actactggag gcagcttctc aacgctaagc tcattaccca gaggaagttc 4980
gacaacctca cgaaggctga gaggggtggc ctttccgagc ttgacaaggc tggtttcatc 5040
aagaggcagc ttgttgagac gaggcagatt accaagcacg ttgctcagat cctcgattct 5100
aggatgaaca ccaagtacga cgagaacgac aagctcatcc gcgaggtcaa ggtgatcacc 5160
ctcaagtcca agctcgtctc cgacttccgc aaggacttcc agttctacaa ggtccgcgag 5220
atcaacaact accaccacgc tcacgatgct taccttaacg ctgtcgttgg taccgctctt 5280
atcaagaagt accctaagct tgagtccgag ttcgtctacg gtgactacaa ggtctacgac 5340
gttcgtaaga tgatcgccaa gtccgagcag gagatcggca aggccaccgc caagtacttc 5400
ttctactcca acatcatgaa cttcttcaag accgagatca ccctcgccaa cggcgagatc 5460
cgcaagcgcc ctcttatcga gacgaacggt gagactggtg agatcgtttg ggacaagggt 5520
cgcgacttcg ctactgttcg caaggtcctt tctatgcctc aggttaacat cgtcaagaag 5580
accgaggtcc agaccggtgg cttctccaag gagtctatcc ttccaaagag aaactcggac 5640
aagctcatcg ctaggaagaa ggattgggac cctaagaagt acggtggttt cgactcccct 5700
actgtcgcct actccgtcct cgtggtcgcc aaggtggaga agggtaagtc gaagaagctc 5760
aagtccgtca aggagctcct cggcatcacc atcatggagc gctcctcctt cgagaagaac 5820
ccgatcgact tcctcgaggc caagggctac aaggaggtca agaaggacct catcatcaag 5880
ctccccaagt actctctttt cgagctcgag aacggtcgta agaggatgct ggcttccgct 5940
ggtgagctcc agaagggtaa cgagcttgct cttccttcca agtacgtgaa cttcctctac 6000
ctcgcctccc actacgagaa gctcaagggt tcccctgagg ataacgagca gaagcagctc 6060
ttcgtggagc agcacaagca ctacctcgac gagatcatcg agcagatctc cgagttctcc 6120
aagcgcgtca tcctcgctga cgctaacctc gacaaggtcc tctccgccta caacaagcac 6180
cgcgacaagc ccatccgcga gcaggccgag aacatcatcc acctcttcac gctcacgaac 6240
ctcggcgccc ctgctgcttt caagtacttc gacaccacca tcgacaggaa gcgttacacg 6300
tccaccaagg aggttctcga cgctactctc atccaccagt ccatcaccgg tctttacgag 6360
actcgtatcg acctttccca gcttggtggt gatagcggtg gctccagcgg tggtagcagc 6420
ggtagcgaaa ctccagggac ctcggaatcg gcgactccag aatccagtgg gggtagcagc 6480
ggcggatcca gcaccctcaa tatcgaggac gagtacaggc tgcatgagac atccaaggag 6540
ccggacgtgt cactcggctc tacatggctg agcgatttcc cacaggcctg ggcggagaca 6600
ggcggcatgg gcctcgcggt caggcaggcg ccgctcatca ttccactgaa ggcgacctcc 6660
acaccggtca gcatcaagca gtacccaatg tcacaggagg cacggctcgg catcaagcca 6720
cacattcaga ggctcctgga ccagggcatt ctggtccctt gccagagccc gtggaacacc 6780
cctctcctgc cggtgaagaa gcctggcaca aatgactaca ggccggtcca ggatctcagg 6840
gaggtgaaca agcgcgtcga ggatatccat ccgaccgtgc cgaacccata caatctcctg 6900
tcaggcctcc cgccatctca ccagtggtac accgtcctcg acctgaagga tgcgttcttc 6960
tgcctcaggc tgcatccaac aagccagcct ctcttcgcct tcgagtggcg cgatccagag 7020
atgggcattt caggccagct cacctggaca cggctgccac agggcttcaa gaactctcct 7080
accctcttca atgaggcgct ccatcgggac ctggccgatt tcaggatcca gcaccctgac 7140
ctcattctcc tgcagtacgt ggacgatctc ctgctcgccg cgacatcaga gctggattgc 7200
cagcagggca ccagggccct gctccagaca ctcggcaatc tgggctaccg ggcctctgcg 7260
aagaaggccc agatctgcca gaagcaggtg aagtacctcg gctacctgct caaggaggga 7320
cagaggtggc tgacagaggc aaggaaggag acagtcatgg gccagcctac cccgaagaca 7380
cctcggcagc tcagggagtt cctgggcaag gccggattct gcaggctctt cattccagga 7440
ttcgcggaga tggcggcgcc actctaccct ctgaccaagc cgggcacact gttcaactgg 7500
ggcccagacc agcagaaggc gtaccaggag attaagcagg cactgctcac agcacctgcg 7560
ctcggcctgc cggacctcac aaagccattc gagctgttcg tggatgagaa gcagggctac 7620
gcgaagggag tcctgacaca gaagctggga ccatggaggc gcccagtggc ctacctctca 7680
aagaagctcg acccagtggc ggccggatgg cctccgtgcc tgaggatggt ggcggccatt 7740
gccgtcctca ccaaggatgc cggcaagctg acaatgggcc agcctctcgt gattctggcg 7800
ccgcatgcgg tggaggccct ggtcaagcag ccacctgata ggtggctgtc caacgcgcgc 7860
atgacccact accaggccct gctcctggac acagataggg tccagttcgg accagtggtg 7920
gcactcaatc ctgccacact gctgccactc cctgaggagg gcctgcagca taactgcctc 7980
gatattctgg cggaggccca tggcacccgg ccagacctca cagatcagcc gctgccagac 8040
gccgatcaca cctggtacac agatggctca tctctcctgc aggagggcca gaggaaggcc 8100
ggagcagccg tgaccacaga gacagaggtc atctgggcaa aggccctccc agcgggcacc 8160
tcagcacaga gggccgagct cattgcactg acacaggcgc tcaagatggc cgagggcaag 8220
aagctgaatg tgtacacaga ctccaggtac gcattcgcca cagcacacat ccatggcgag 8280
atttacaggc ggaggggatg gctcacatca gagggaaagg agatcaagaa caaggatgag 8340
attctcgcgc tcctgaaggc cctcttcctg cctaagcgcc tgtcaatcat tcactgccca 8400
ggacatcaga agggacactc agccgaggca aggggaaata ggatggcaga ccaggcggcc 8460
aggaaggcag cgatcaccga gacaccagat acctccacac tcctgattga gaactccagc 8520
cctgacgatg acaaaatggc accgaagaaa aaaaggaagg tcggcggctc cccgaagaaa 8580
aaaaggaagg tcggcggctc cccgaagaaa aaaaggaagg tcggcggctc cccgaagaaa 8640
aaaaggaagg tcggaatcca tggcggatca ggagccacca acttctccct cctcaagcag 8700
gccggcgacg tggaggagaa cccgggccca atgaaaaagc ctgaactcac cgcgacgtct 8760
gtcgagaagt ttctgatcga aaagttcgac agcgtctccg acctgatgca gctctcggag 8820
ggcgaagaat ctcgtgcttt cagcttcgat gtaggagggc gtggatatgt cctgcgggta 8880
aatagctgcg ccgatggttt ctacaaagat cgttatgttt atcggcactt tgcatcggcc 8940
gcgctcccga ttccggaagt gcttgacatt ggggagttta gcgagagcct gacctattgc 9000
atctcccgcc gttcacaggg tgtcacgttg caagacctgc ctgaaaccga actgcccgct 9060
gttctacaac cggtcgcgga ggctatggat gcgatcgctg cggccgatct tagccagacg 9120
agcgggttcg gcccattcgg accgcaagga atcggtcaat acactacatg gcgtgatttc 9180
atatgcgcga ttgctgatcc ccatgtgtat cactggcaaa ctgtgatgga cgacaccgtc 9240
agtgcgtccg tcgcgcaggc tctcgatgag ctgatgcttt gggccgagga ctgccccgaa 9300
gtccggcacc tcgtgcacgc ggatttcggc tccaacaatg tcctgacgga caatggccgc 9360
ataacagcgg tcattgactg gagcgaggcg atgttcgggg attcccaata cgaggtcgcc 9420
aacatcttct tctggaggcc gtggttggct tgtatggagc agcagacgcg ctacttcgag 9480
cggaggcatc cggagcttgc aggatcgcca cgactccggg cgtatatgct ccgcattggt 9540
cttgaccaac tctatcagag cttggttgac ggcaatttcg atgatgcagc ttgggcgcag 9600
ggtcgatgcg acgcaatcgt ccgatccgga gccgggactg tcgggcgtac acaaatcgcc 9660
cgcagaagcg cggccgtctg gaccgatggc tgtgtagaag tactcgccga tagtggaaac 9720
cgacgcccca gcactcgtcc gagggcaaag aaatagacta gttcccgatc gttcaaacat 9780
ttggcaataa agtttcttaa gattgaatcc tgttgccggt cttgcgatga ttatcatata 9840
atttctgttg aattacgtta agcatgtaat aattaacatg taatgcatga cgttatttat 9900
gaggtgggtt tttatgatta gagtcccgca attatacatt taatacgcga tagaaaacaa 9960
aatatagcgc gcaaactagg ataaattatc gcgcgcggtg tcatctatgt tactagacct 10020
gcaggtggaa tcggcagcaa aggatttttt cctgtagttt tcccacaacc attttttacc 10080
atccgaatga taggatagga aaaatatcca agtgaacagt attcctataa aattcccgta 10140
aaaagcctgc aatccgaatg agccctgaag tctgaactag ccggtcacct gtacaggcta 10200
tcgagatgcc atacaagaga cggtagtagg aactaggaag acgatggttg attcgtcagg 10260
cgaaatcgtc gtcctgcagt cgcatctatg ggcctggacg gaatagggga aaaagttggc 10320
cggataggag ggaaaggccc aggtgcttac gtgcgaggta ggcctgggct ctcagcactt 10380
cgattcgttg gcaccggggt aggatgcaat agagagcaac gtttagtacc acctcgctta 10440
gctagagcaa actggactgc cttatatgcg cgggtgctgg cttggctgcc gatatctcgc 10500
tctcacattc cgtttcagag ctatgctgga aacagcatag caagttgaaa taaggctagt 10560
ccgttatcaa cttgaaaaag tggcaccgag tcggtgcttt ttttttagga atctttaaac 10620
atacgaacag atcacttaaa gttcttctga agcaacttaa agttatcagg catgcatgga 10680
tcttggagga atcagatgtg cagtcaggga ccatagcaca agacaggcgt cttctactgg 10740
tgctaccagc aaatgctgga agccgggaac actgggtacg ttggaaacca cgtgtgatgt 10800
gaaggagtaa gataaactgt aggagaaaag catttcgtag tgggccatga agcctttcag 10860
gacatgtatt gcagtatggg ccggcccatt acgcaattgg acgacaacaa agactagtat 10920
tagtaccacc tcggctatcc acatagatca aagctggttt aaaagagttg tgcagatgat 10980
ccgtggcggg tatggtggtg caatggggtt tcagagctat gctggaaaca gcatagcaag 11040
ttgaaataag gctagtccgt tatcaacttg aaaaagtggc accgagtcgg tgcaaaccta 11100
tcctccaatt gcaccaccat ttttttttgg catgcaagct tggcactggc cgtcgtttta 11160
caacgtcgtg actgggaaaa ccctggcgtt acccaactta atcgccttgc agcacatccc 11220
cctttcgcca gctggcgtaa tagcgaagag gcccgcaccg atcgcccttc ccaacagttg 11280
cgcagcctga atggcgaatg ctagagcagc ttgagcttgg atcagattgt cgtttcccgc 11340
cttcagttta aactatcagt gtttgacagg atatattggc gggtaaacct aagagaaaag 11400
agcgtttatt agaataacgg atatttaaaa gggcgtgaaa aggtttatcc gttcgtccat 11460
ttgtatgtgc atgccaacca cagggttccc ctcgggatca aagtactttg atccaacccc 11520
tccgctgcta tagtgcagtc ggcttctgac gttcagtgca gccgtcttct gaaaacgaca 11580
tgtcgcacaa gtcctaagtt acgcgacagg ctgccgccct gcccttttcc tggcgttttc 11640
ttgtcgcgtg ttttagtcgc ataaagtaga atacttgcga ctagaaccgg agacattacg 11700
ccatgaacaa gagcgccgcc gctggcctgc tgggctatgc ccgcgtcagc accgacgacc 11760
aggacttgac caaccaacgg gccgaactgc acgcggccgg ctgcaccaag ctgttttccg 11820
agaagatcac cggcaccagg cgcgaccgcc cggagctggc caggatgctt gaccacctac 11880
gccctggcga cgttgtgaca gtgaccaggc tagaccgcct ggcccgcagc acccgcgacc 11940
tactggacat tgccgagcgc atccaggagg ccggcgcggg cctgcgtagc ctggcagagc 12000
cgtgggccga caccaccacg ccggccggcc gcatggtgtt gaccgtgttc gccggcattg 12060
ccgagttcga gcgttcccta atcatcgacc gcacccggag cgggcgcgag gccgccaagg 12120
cccgaggcgt gaagtttggc ccccgcccta ccctcacccc ggcacagatc gcgcacgccc 12180
gcgagctgat cgaccaggaa ggccgcaccg tgaaagaggc ggctgcactg cttggcgtgc 12240
atcgctcgac cctgtaccgc gcacttgagc gcagcgagga agtgacgccc accgaggcca 12300
ggcggcgcgg tgccttccgt gaggacgcat tgaccgaggc cgacgccctg gcggccgccg 12360
agaatgaacg ccaagaggaa caagcatgaa accgcaccag gacggccagg acgaaccgtt 12420
tttcattacc gaagagatcg aggcggagat gatcgcggcc gggtacgtgt tcgagccgcc 12480
cgcgcacgtc tcaaccgtgc ggctgcatga aatcctggcc ggtttgtctg atgccaagct 12540
ggcggcctgg ccggccagct tggccgctga agaaaccgag cgccgccgtc taaaaaggtg 12600
atgtgtattt gagtaaaaca gcttgcgtca tgcggtcgct gcgtatatga tgcgatgagt 12660
aaataaacaa atacgcaagg ggaacgcatg aaggttatcg ctgtacttaa ccagaaaggc 12720
gggtcaggca agacgaccat cgcaacccat ctagcccgcg ccctgcaact cgccggggcc 12780
gatgttctgt tagtcgattc cgatccccag ggcagtgccc gcgattgggc ggccgtgcgg 12840
gaagatcaac cgctaaccgt tgtcggcatc gaccgcccga cgattgaccg cgacgtgaag 12900
gccatcggcc ggcgcgactt cgtagtgatc gacggagcgc cccaggcggc ggacttggct 12960
gtgtccgcga tcaaggcagc cgacttcgtg ctgattccgg tgcagccaag cccttacgac 13020
atatgggcca ccgccgacct ggtggagctg gttaagcagc gcattgaggt cacggatgga 13080
aggctacaag cggcctttgt cgtgtcgcgg gcgatcaaag gcacgcgcat cggcggtgag 13140
gttgccgagg cgctggccgg gtacgagctg cccattcttg agtcccgtat cacgcagcgc 13200
gtgagctacc caggcactgc cgccgccggc acaaccgttc ttgaatcaga acccgagggc 13260
gacgctgccc gcgaggtcca ggcgctggcc gctgaaatta aatcaaaact catttgagtt 13320
aatgaggtaa agagaaaatg agcaaaagca caaacacgct aagtgccggc cgtccgagcg 13380
cacgcagcag caaggctgca acgttggcca gcctggcaga cacgccagcc atgaagcggg 13440
tcaactttca gttgccggcg gaggatcaca ccaagctgaa gatgtacgcg gtacgccaag 13500
gcaagaccat taccgagctg ctatctgaat acatcgcgca gctaccagag taaatgagca 13560
aatgaataaa tgagtagatg aattttagcg gctaaaggag gcggcatgga aaatcaagaa 13620
caaccaggca ccgacgccgt ggaatgcccc atgtgtggag gaacgggcgg ttggccaggc 13680
gtaagcggct gggttgtctg ccggccctgc aatggcactg gaacccccaa gcccgaggaa 13740
tcggcgtgac ggtcgcaaac catccggccc ggtacaaatc ggcgcggcgc tgggtgatga 13800
cctggtggag aagttgaagg ccgcgcaggc cgcccagcgg caacgcatcg aggcagaagc 13860
acgccccggt gaatcgtggc aagcggccgc tgatcgaatc cgcaaagaat cccggcaacc 13920
gccggcagcc ggtgcgccgt cgattaggaa gccgcccaag ggcgacgagc aaccagattt 13980
tttcgttccg atgctctatg acgtgggcac ccgcgatagt cgcagcatca tggacgtggc 14040
cgttttccgt ctgtcgaagc gtgaccgacg agctggcgag gtgatccgct acgagcttcc 14100
agacgggcac gtagaggttt ccgcagggcc ggccggcatg gccagtgtgt gggattacga 14160
cctggtactg atggcggttt cccatctaac cgaatccatg aaccgatacc gggaagggaa 14220
gggagacaag cccggccgcg tgttccgtcc acacgttgcg gacgtactca agttctgccg 14280
gcgagccgat ggcggaaagc agaaagacga cctggtagaa acctgcattc ggttaaacac 14340
cacgcacgtt gccatgcagc gtacgaagaa ggccaagaac ggccgcctgg tgacggtatc 14400
cgagggtgaa gccttgatta gccgctacaa gatcgtaaag agcgaaaccg ggcggccgga 14460
gtacatcgag atcgagctag ctgattggat gtaccgcgag atcacagaag gcaagaaccc 14520
ggacgtgctg acggttcacc ccgattactt tttgatcgat cccggcatcg gccgttttct 14580
ctaccgcctg gcacgccgcg ccgcaggcaa ggcagaagcc agatggttgt tcaagacgat 14640
ctacgaacgc agtggcagcg ccggagagtt caagaagttc tgtttcaccg tgcgcaagct 14700
gatcgggtca aatgacctgc cggagtacga tttgaaggag gaggcggggc aggctggccc 14760
gatcctagtc atgcgctacc gcaacctgat cgagggcgaa gcatccgccg gttcctaatg 14820
tacggagcag atgctagggc aaattgccct agcaggggaa aaaggtcgaa aagttctctt 14880
tcctgtggat agcacgtaca ttgggaaccc aaagccgtac attgggaacc ggaacccgta 14940
cattgggaac ccaaagccgt acattgggaa ccggtcacac atgtaagtga ctgatataaa 15000
agagaaaaaa ggcgattttt ccgcctaaaa ctctttaaaa cttattaaaa ctcttaaaac 15060
ccgcctggcc tgtgcataac tgtctggcca gcgcacagcc gaagagctgc aaaaagcgcc 15120
tacccttcgg tcgctgcgct ccctacgccc cgccgcttcg cgtcggccta tcgcggccgc 15180
tggccgctca aaaatggctg gcctacggcc aggcaatcta ccagggcgcg gacaagccgc 15240
gccgtcgcca ctcgaccgcc ggcgcccaca tcaaggcacc ctgcctcgcg cgtttcggtg 15300
atgacggtga aaacctctga cacatgcagc tcccggagac ggtcacagct tgtctgtaag 15360
cggatgccgg gagcagacaa gcccgtcagg gcgcgtcagc gggtgttggc gggtgtcggg 15420
gcgcagccat gacccagtca cgtagcgata gcggagtgta tactggctta actatgcggc 15480
atcagagcag attgtactga gagtgcacca tatgcggtgt gaaataccgc acagatgcgt 15540
aaggagaaaa taccgcatca ggcgctcttc cgcttcctcg ctcactgact cgctgcgctc 15600
ggtcgttcgg ctgcggcgag cggtatcagc tcactcaaag gcggtaatac ggttatccac 15660
agaatcaggg gataacgcag gaaagaacat gtgagcaaaa ggccagcaaa aggccaggaa 15720
ccgtaaaaag gccgcgttgc tggcgttttt ccataggctc cgcccccctg acgagcatca 15780
caaaaatcga cgctcaagtc agaggtggcg aaacccgaca ggactataaa gataccaggc 15840
gtttccccct ggaagctccc tcgtgcgctc tcctgttccg accctgccgc ttaccggata 15900
cctgtccgcc tttctccctt cgggaagcgt ggcgctttct catagctcac gctgtaggta 15960
tctcagttcg gtgtaggtcg ttcgctccaa gctgggctgt gtgcacgaac cccccgttca 16020
gcccgaccgc tgcgccttat ccggtaacta tcgtcttgag tccaacccgg taagacacga 16080
cttatcgcca ctggcagcag ccactggtaa caggattagc agagcgaggt atgtaggcgg 16140
tgctacagag ttcttgaagt ggtggcctaa ctacggctac actagaagga cagtatttgg 16200
tatctgcgct ctgctgaagc cagttacctt cggaaaaaga gttggtagct cttgatccgg 16260
caaacaaacc accgctggta gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag 16320
aaaaaaagga tctcaagaag atcctttgat cttttctacg gggtctgacg ctcagtggaa 16380
cgaaaactca cgttaaggga ttttggtcat gcattctagg tactaaaaca attcatccag 16440
taaaatataa tattttattt tctcccaatc aggcttgatc cccagtaagt caaaaaatag 16500
ctcgacatac tgttcttccc cgatatcctc cctgatcgac cggacgcaga aggcaatgtc 16560
ataccacttg tccgccctgc cgcttctccc aagatcaata aagccactta ctttgccatc 16620
tttcacaaag atgttgctgt ctcccaggtc gccgtgggaa aagacaagtt cctcttcggg 16680
cttttccgtc tttaaaaaat catacagctc gcgcggatct ttaaatggag tgtcttcttc 16740
ccagttttcg caatccacat cggccagatc gttattcagt aagtaatcca attcggctaa 16800
gcggctgtct aagctattcg tatagggaca atccgatatg tcgatggagt gaaagagcct 16860
gatgcactcc gcatacagct cgataatctt ttcagggctt tgttcatctt catactcttc 16920
cgagcaaagg acgccatcgg cctcactcat gagcagattg ctccagccat catgccgttc 16980
aaagtgcagg acctttggaa caggcagctt tccttccagc catagcatca tgtccttttc 17040
ccgttccaca tcataggtgg tccctttata ccggctgtcc gtcattttta aatataggtt 17100
ttcattttct cccaccagct tatatacctt agcaggagac attccttccg tatcttttac 17160
gcagcggtat ttttcgatca gttttttcaa ttccggtgat attctcattt tagccattta 17220
ttatttcctt cctcttttct acagtattta aagatacccc aagaagctaa ttataacaag 17280
acgaactcca attcactgtt ccttgcattc taaaacctta aataccagaa aacagctttt 17340
tcaaagttgt tttcaaagtt ggcgtataac atagtatcga cggagccgat tttgaaaccg 17400
cggtgatcac aggcagcaac gctctgtcat cgttacaatc aacatgctac cctccgcgag 17460
atcatccgtg tttcaaaccc ggcagcttag ttgccgttct tccgaatagc atcggtaaca 17520
tgagcaaagt ctgccgcctt acaacggctc tcccgctgac gccgtcccgg actgatgggc 17580
tgcctgtatc gagtggtgat tttgtgccga gctgccggtc ggggagctgt tggctggct 17639
<210> 2
<211> 1367
<212> PRT
<213> Artificial Sequence
<400> 2
Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val Gly
1 5 10 15
Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys
20 25 30
Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly
35 40 45
Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys
50 55 60
Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr
65 70 75 80
Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe
85 90 95
Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His
100 105 110
Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His
115 120 125
Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser
130 135 140
Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met
145 150 155 160
Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp
165 170 175
Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn
180 185 190
Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys
195 200 205
Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu
210 215 220
Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu
225 230 235 240
Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp
245 250 255
Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp
260 265 270
Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu
275 280 285
Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp Ile
290 295 300
Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser Met
305 310 315 320
Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys Ala
325 330 335
Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe Asp
340 345 350
Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser Gln
355 360 365
Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp Gly
370 375 380
Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg Lys
385 390 395 400
Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu Gly
405 410 415
Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe Leu
420 425 430
Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro
435 440 445
Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met
450 455 460
Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val
465 470 475 480
Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn
485 490 495
Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu
500 505 510
Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr
515 520 525
Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln Lys
530 535 540
Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr Val
545 550 555 560
Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp Ser
565 570 575
Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly Thr
580 585 590
Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp Asn
595 600 605
Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr Leu
610 615 620
Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala His
625 630 635 640
Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr Thr
645 650 655
Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp Lys
660 665 670
Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe Ala
675 680 685
Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe Lys
690 695 700
Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu His
705 710 715 720
Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile
725 730 735
Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly Arg
740 745 750
His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln Thr
755 760 765
Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile Glu
770 775 780
Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro Val
785 790 795 800
Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln
805 810 815
Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg Leu
820 825 830
Ser Asp Tyr Asp Val Asp Ala Ile Val Pro Gln Ser Phe Leu Lys Asp
835 840 845
Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg Gly
850 855 860
Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys Asn
865 870 875 880
Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe
885 890 895
Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys
900 905 910
Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr Lys
915 920 925
His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu
930 935 940
Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser Lys
945 950 955 960
Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg Glu
965 970 975
Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val Val
980 985 990
Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val
995 1000 1005
Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys
1010 1015 1020
Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr
1025 1030 1035
Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn
1040 1045 1050
Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr
1055 1060 1065
Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg
1070 1075 1080
Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr Glu
1085 1090 1095
Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg
1100 1105 1110
Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys
1115 1120 1125
Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu
1130 1135 1140
Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser
1145 1150 1155
Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe
1160 1165 1170
Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu
1175 1180 1185
Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe
1190 1195 1200
Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly Glu
1205 1210 1215
Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val Asn
1220 1225 1230
Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser Pro
1235 1240 1245
Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His
1250 1255 1260
Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg
1265 1270 1275
Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr
1280 1285 1290
Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile
1295 1300 1305
Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe
1310 1315 1320
Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr
1325 1330 1335
Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr Gly
1340 1345 1350
Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
<210> 3
<211> 677
<212> PRT
<213> Artificial Sequence
<400> 3
Thr Leu Asn Ile Glu Asp Glu Tyr Arg Leu His Glu Thr Ser Lys Glu
1 5 10 15
Pro Asp Val Ser Leu Gly Ser Thr Trp Leu Ser Asp Phe Pro Gln Ala
20 25 30
Trp Ala Glu Thr Gly Gly Met Gly Leu Ala Val Arg Gln Ala Pro Leu
35 40 45
Ile Ile Pro Leu Lys Ala Thr Ser Thr Pro Val Ser Ile Lys Gln Tyr
50 55 60
Pro Met Ser Gln Glu Ala Arg Leu Gly Ile Lys Pro His Ile Gln Arg
65 70 75 80
Leu Leu Asp Gln Gly Ile Leu Val Pro Cys Gln Ser Pro Trp Asn Thr
85 90 95
Pro Leu Leu Pro Val Lys Lys Pro Gly Thr Asn Asp Tyr Arg Pro Val
100 105 110
Gln Asp Leu Arg Glu Val Asn Lys Arg Val Glu Asp Ile His Pro Thr
115 120 125
Val Pro Asn Pro Tyr Asn Leu Leu Ser Gly Leu Pro Pro Ser His Gln
130 135 140
Trp Tyr Thr Val Leu Asp Leu Lys Asp Ala Phe Phe Cys Leu Arg Leu
145 150 155 160
His Pro Thr Ser Gln Pro Leu Phe Ala Phe Glu Trp Arg Asp Pro Glu
165 170 175
Met Gly Ile Ser Gly Gln Leu Thr Trp Thr Arg Leu Pro Gln Gly Phe
180 185 190
Lys Asn Ser Pro Thr Leu Phe Asn Glu Ala Leu His Arg Asp Leu Ala
195 200 205
Asp Phe Arg Ile Gln His Pro Asp Leu Ile Leu Leu Gln Tyr Val Asp
210 215 220
Asp Leu Leu Leu Ala Ala Thr Ser Glu Leu Asp Cys Gln Gln Gly Thr
225 230 235 240
Arg Ala Leu Leu Gln Thr Leu Gly Asn Leu Gly Tyr Arg Ala Ser Ala
245 250 255
Lys Lys Ala Gln Ile Cys Gln Lys Gln Val Lys Tyr Leu Gly Tyr Leu
260 265 270
Leu Lys Glu Gly Gln Arg Trp Leu Thr Glu Ala Arg Lys Glu Thr Val
275 280 285
Met Gly Gln Pro Thr Pro Lys Thr Pro Arg Gln Leu Arg Glu Phe Leu
290 295 300
Gly Lys Ala Gly Phe Cys Arg Leu Phe Ile Pro Gly Phe Ala Glu Met
305 310 315 320
Ala Ala Pro Leu Tyr Pro Leu Thr Lys Pro Gly Thr Leu Phe Asn Trp
325 330 335
Gly Pro Asp Gln Gln Lys Ala Tyr Gln Glu Ile Lys Gln Ala Leu Leu
340 345 350
Thr Ala Pro Ala Leu Gly Leu Pro Asp Leu Thr Lys Pro Phe Glu Leu
355 360 365
Phe Val Asp Glu Lys Gln Gly Tyr Ala Lys Gly Val Leu Thr Gln Lys
370 375 380
Leu Gly Pro Trp Arg Arg Pro Val Ala Tyr Leu Ser Lys Lys Leu Asp
385 390 395 400
Pro Val Ala Ala Gly Trp Pro Pro Cys Leu Arg Met Val Ala Ala Ile
405 410 415
Ala Val Leu Thr Lys Asp Ala Gly Lys Leu Thr Met Gly Gln Pro Leu
420 425 430
Val Ile Leu Ala Pro His Ala Val Glu Ala Leu Val Lys Gln Pro Pro
435 440 445
Asp Arg Trp Leu Ser Asn Ala Arg Met Thr His Tyr Gln Ala Leu Leu
450 455 460
Leu Asp Thr Asp Arg Val Gln Phe Gly Pro Val Val Ala Leu Asn Pro
465 470 475 480
Ala Thr Leu Leu Pro Leu Pro Glu Glu Gly Leu Gln His Asn Cys Leu
485 490 495
Asp Ile Leu Ala Glu Ala His Gly Thr Arg Pro Asp Leu Thr Asp Gln
500 505 510
Pro Leu Pro Asp Ala Asp His Thr Trp Tyr Thr Asp Gly Ser Ser Leu
515 520 525
Leu Gln Glu Gly Gln Arg Lys Ala Gly Ala Ala Val Thr Thr Glu Thr
530 535 540
Glu Val Ile Trp Ala Lys Ala Leu Pro Ala Gly Thr Ser Ala Gln Arg
545 550 555 560
Ala Glu Leu Ile Ala Leu Thr Gln Ala Leu Lys Met Ala Glu Gly Lys
565 570 575
Lys Leu Asn Val Tyr Thr Asp Ser Arg Tyr Ala Phe Ala Thr Ala His
580 585 590
Ile His Gly Glu Ile Tyr Arg Arg Arg Gly Trp Leu Thr Ser Glu Gly
595 600 605
Lys Glu Ile Lys Asn Lys Asp Glu Ile Leu Ala Leu Leu Lys Ala Leu
610 615 620
Phe Leu Pro Lys Arg Leu Ser Ile Ile His Cys Pro Gly His Gln Lys
625 630 635 640
Gly His Ser Ala Glu Ala Arg Gly Asn Arg Met Ala Asp Gln Ala Ala
645 650 655
Arg Lys Ala Ala Ile Thr Glu Thr Pro Asp Thr Ser Thr Leu Leu Ile
660 665 670
Glu Asn Ser Ser Pro
675
<210> 4
<211> 341
<212> PRT
<213> Artificial Sequence
<400> 4
Met Lys Lys Pro Glu Leu Thr Ala Thr Ser Val Glu Lys Phe Leu Ile
1 5 10 15
Glu Lys Phe Asp Ser Val Ser Asp Leu Met Gln Leu Ser Glu Gly Glu
20 25 30
Glu Ser Arg Ala Phe Ser Phe Asp Val Gly Gly Arg Gly Tyr Val Leu
35 40 45
Arg Val Asn Ser Cys Ala Asp Gly Phe Tyr Lys Asp Arg Tyr Val Tyr
50 55 60
Arg His Phe Ala Ser Ala Ala Leu Pro Ile Pro Glu Val Leu Asp Ile
65 70 75 80
Gly Glu Phe Ser Glu Ser Leu Thr Tyr Cys Ile Ser Arg Arg Ser Gln
85 90 95
Gly Val Thr Leu Gln Asp Leu Pro Glu Thr Glu Leu Pro Ala Val Leu
100 105 110
Gln Pro Val Ala Glu Ala Met Asp Ala Ile Ala Ala Ala Asp Leu Ser
115 120 125
Gln Thr Ser Gly Phe Gly Pro Phe Gly Pro Gln Gly Ile Gly Gln Tyr
130 135 140
Thr Thr Trp Arg Asp Phe Ile Cys Ala Ile Ala Asp Pro His Val Tyr
145 150 155 160
His Trp Gln Thr Val Met Asp Asp Thr Val Ser Ala Ser Val Ala Gln
165 170 175
Ala Leu Asp Glu Leu Met Leu Trp Ala Glu Asp Cys Pro Glu Val Arg
180 185 190
His Leu Val His Ala Asp Phe Gly Ser Asn Asn Val Leu Thr Asp Asn
195 200 205
Gly Arg Ile Thr Ala Val Ile Asp Trp Ser Glu Ala Met Phe Gly Asp
210 215 220
Ser Gln Tyr Glu Val Ala Asn Ile Phe Phe Trp Arg Pro Trp Leu Ala
225 230 235 240
Cys Met Glu Gln Gln Thr Arg Tyr Phe Glu Arg Arg His Pro Glu Leu
245 250 255
Ala Gly Ser Pro Arg Leu Arg Ala Tyr Met Leu Arg Ile Gly Leu Asp
260 265 270
Gln Leu Tyr Gln Ser Leu Val Asp Gly Asn Phe Asp Asp Ala Ala Trp
275 280 285
Ala Gln Gly Arg Cys Asp Ala Ile Val Arg Ser Gly Ala Gly Thr Val
290 295 300
Gly Arg Thr Gln Ile Ala Arg Arg Ser Ala Ala Val Trp Thr Asp Gly
305 310 315 320
Cys Val Glu Val Leu Ala Asp Ser Gly Asn Arg Arg Pro Ser Thr Arg
325 330 335
Pro Arg Ala Lys Lys
340
<210> 5
<211> 19
<212> PRT
<213> Artificial Sequence
<400> 5
Ala Thr Asn Phe Ser Leu Leu Lys Gln Ala Gly Asp Val Glu Glu Asn
1 5 10 15
Pro Gly Pro
<210> 6
<211> 17636
<212> DNA
<213> Artificial Sequence
<400> 6
ggtggcagga tatattgtgg tgtaaacaaa ttgacgctta gacaacttaa taacacattg 60
cggacgtttt taatgtaggt accacctaaa tttccaagct tgtcgtgccc ctctctagag 120
ataatgagca ttgcatgtct aagttataaa aaattaccac atattttttt tgtcacactt 180
gtttgaagtg cagtttatct atctttatac atatatttaa actttactct acgaataata 240
taatctatag tactacaata atatcagtgt tttagagaat catataaatg aacagttaga 300
catggtctaa aggacaattg agtattttga caacaggact ctacagtttt atctttttag 360
tgtgcatgtg ttctcctttt tttttgcaaa tagcttcacc tatataatac ttcatccatt 420
ttattagtac atccatttag ggtttagggt taatggtttt tatagactaa tttttttagt 480
acatctattt tattctattt tagcctctaa attaagaaaa ctaaaactct attttagttt 540
ttttatttaa taatttagat ataaaataga ataaaataaa gtgactaaaa attaaacaaa 600
taccctttaa gaaattaaaa aaactaagga aacatttttc ttgtttcgag tagataatgc 660
cagcctgtta aacgccgtcg acgagtctaa cggacaccaa ccagcgaacc agcagcgtcg 720
cgtcgggcca agcgaagcag acggcacggc atctctgtcg ctgcctctgg acccctctcg 780
agagttccgc tccaccgttg gacttgctcc gctgtcggca tccagaaatt gcgtggcgga 840
gcggcagacg tgagccggca cggcaggcgg cctcctcctc ctctcacggc accggcagct 900
acgggggatt cctttcccac cgctccttcg ctttcccttc ctcgcccgcc gtaataaata 960
gacaccccct ccacaccctc tttccccaac ctcgtgttgt tcggagcgca cacacacaca 1020
accagatctc ccccaaatcc acccgtcggc acctccgctt caaggtacgc cgctcgtcct 1080
cccccccccc cctctctacc ttctctagat cggcgttccg gtccatggtt agggcccggt 1140
agttctactt ctgttcatgt ttgtgttaga tccgtgtttg tgttagatcc gtgctgctag 1200
cgttcgtaca cggatgcgac ctgtacgtca gacacgttct gattgctaac ttgccagtgt 1260
ttctctttgg ggaatcctgg gatggctcta gccgttccgc agacgggatc gatttcatga 1320
ttttttttgt ttcgttgcat agggtttggt ttgccctttt cctttatttc aatatatgcc 1380
gtgcacttgt ttgtcgggtc atcttttcat gctttttttt gtcttggttg tgatgatgtg 1440
gtctggttgg gcggtcgttc tagatcggag tagaattctg tttcaaacta cctggtggat 1500
ttattaattt tggatctgta tgtgtgtgcc atacatattc atagttacga attgaagatg 1560
atggatggaa atatcgatct aggataggta tacatgttga tgcgggtttt actgatgcat 1620
atacagagat gctttttgtt cgcttggttg tgatgatgtg gtgtggttgg gcggtcgttc 1680
attcgttcta gatcggagta gaatactgtt tcaaactacc tggtgtattt attaattttg 1740
gaactgtatg tgtgtgtcat acatcttcat agttacgagt ttaagatgga tggaaatatc 1800
gatctaggat aggtatacat gttgatgtgg gttttactga tgcatataca tgatggcata 1860
tgcagcatct attcatatgc tctaaccttg agtacctatc tattataata aacaagtatg 1920
ttttataatt attttgatct tgatatactt ggatgatggc atatgcagca gctatatgtg 1980
gattttttta gccctgcctt catacgctat ttatttgctt ggtactgttt cttttgtcga 2040
tgctcaccct gttgtttggt gttacttctg cagtacgtaa gcatggacta caaggaccac 2100
gacggggatt acaaagacca cgacatagac tacaaggatg acgatgacaa aatggcaccg 2160
aagaaaaaaa ggaaggtcgg cggctccccg aagaaaaaaa ggaaggtcgg cggctccccg 2220
aagaaaaaaa ggaaggtcgg cggctccccg aagaaaaaaa ggaaggtcgg aatccatggc 2280
gttccagaaa ccctcaatat cgaggacgag tacaggctgc atgagacatc caaggagccg 2340
gacgtgtcac tcggctctac atggctgagc gatttcccac aggcctgggc ggagacaggc 2400
ggcatgggcc tcgcggtcag gcaggcgccg ctcatcattc cactgaaggc gacctccaca 2460
ccggtcagca tcaagcagta cccaatgtca caggaggcac ggctcggcat caagccacac 2520
attcagaggc tcctggacca gggcattctg gtcccttgcc agagcccgtg gaacacccct 2580
ctcctgccgg tgaagaagcc tggcacaaat gactacaggc cggtccagga tctcagggag 2640
gtgaacaagc gcgtcgagga tatccatccg accgtgccga acccatacaa tctcctgtca 2700
ggcctcccgc catctcacca gtggtacacc gtcctcgacc tgaaggatgc gttcttctgc 2760
ctcaggctgc atccaacaag ccagcctctc ttcgccttcg agtggcgcga tccagagatg 2820
ggcatttcag gccagctcac ctggacacgg ctgccacagg gcttcaagaa ctctcctacc 2880
ctcttcaatg aggcgctcca tcgggacctg gccgatttca ggatccagca ccctgacctc 2940
attctcctgc agtacgtgga cgatctcctg ctcgccgcga catcagagct ggattgccag 3000
cagggcacca gggccctgct ccagacactc ggcaatctgg gctaccgggc ctctgcgaag 3060
aaggcccaga tctgccagaa gcaggtgaag tacctcggct acctgctcaa ggagggacag 3120
aggtggctga cagaggcaag gaaggagaca gtcatgggcc agcctacccc gaagacacct 3180
cggcagctca gggagttcct gggcaaggcc ggattctgca ggctcttcat tccaggattc 3240
gcggagatgg cggcgccact ctaccctctg accaagccgg gcacactgtt caactggggc 3300
ccagaccagc agaaggcgta ccaggagatt aagcaggcac tgctcacagc acctgcgctc 3360
ggcctgccgg acctcacaaa gccattcgag ctgttcgtgg atgagaagca gggctacgcg 3420
aagggagtcc tgacacagaa gctgggacca tggaggcgcc cagtggccta cctctcaaag 3480
aagctcgacc cagtggcggc cggatggcct ccgtgcctga ggatggtggc ggccattgcc 3540
gtcctcacca aggatgccgg caagctgaca atgggccagc ctctcgtgat tctggcgccg 3600
catgcggtgg aggccctggt caagcagcca cctgataggt ggctgtccaa cgcgcgcatg 3660
acccactacc aggccctgct cctggacaca gatagggtcc agttcggacc agtggtggca 3720
ctcaatcctg ccacactgct gccactccct gaggagggcc tgcagcataa ctgcctcgat 3780
attctggcgg aggcccatgg cacccggcca gacctcacag atcagccgct gccagacgcc 3840
gatcacacct ggtacacaga tggctcatct ctcctgcagg agggccagag gaaggccgga 3900
gcagccgtga ccacagagac agaggtcatc tgggcaaagg ccctcccagc gggcacctca 3960
gcacagaggg ccgagctcat tgcactgaca caggcgctca agatggccga gggcaagaag 4020
ctgaatgtgt acacagactc caggtacgca ttcgccacag cacacatcca tggcgagatt 4080
tacaggcgga ggggatggct cacatcagag ggaaaggaga tcaagaacaa ggatgagatt 4140
ctcgcgctcc tgaaggccct cttcctgcct aagcgcctgt caatcattca ctgcccagga 4200
catcagaagg gacactcagc cgaggcaagg ggaaatagga tggcagacca ggcggccagg 4260
aaggcagcga tcaccgagac accagatacc tccacactcc tgattgagaa ctccagccct 4320
agcggtggct ccagcggtgg tagcagcggt agcgaaactc cagggacctc ggaatcggcg 4380
actccagaat ccagtggggg tagcagcggc ggatccagcg acaagaagta ctccatcggc 4440
ctcgacatcg gcaccaacag cgtcggctgg gcggtgatca ccgacgagta caaggtcccg 4500
tccaagaagt tcaaggtcct gggcaacacc gaccgccact ccatcaagaa gaacctcatc 4560
ggcgccctcc tcttcgactc cggcgagacg gcggaggcga cccgcctcaa gcgcaccgcc 4620
cgccgccgct acacccgccg caagaaccgc atctgctacc tccaggagat cttctccaac 4680
gagatggcga aggtcgacga ctccttcttc caccgcctcg aggagtcctt cctcgtggag 4740
gaggacaaga agcacgagcg ccaccccatc ttcggcaaca tcgtcgacga ggtcgcctac 4800
cacgagaagt accccactat ctaccacctt cgtaagaagc ttgttgactc tactgataag 4860
gctgatcttc gtctcatcta ccttgctctc gctcacatga tcaagttccg tggtcacttc 4920
cttatcgagg gtgaccttaa ccctgataac tccgacgtgg acaagctctt catccagctc 4980
gtccagacct acaaccagct cttcgaggag aaccctatca acgcttccgg tgtcgacgct 5040
aaggcgatcc tttccgctag gctctccaag tccaggcgtc tcgagaacct catcgcccag 5100
ctccctggtg agaagaagaa cggtcttttc ggtaacctca tcgctctctc cctcggtctg 5160
acccctaact tcaagtccaa cttcgacctc gctgaggacg ctaagcttca gctctccaag 5220
gatacctacg acgatgatct cgacaacctc ctcgctcaga ttggagatca gtacgctgat 5280
ctcttccttg ctgctaagaa cctctccgat gctatcctcc tttcggatat ccttagggtt 5340
aacactgaga tcactaaggc tcctctttct gcttccatga tcaagcgcta cgacgagcac 5400
caccaggacc tcaccctcct caaggctctt gttcgtcagc agctccccga gaagtacaag 5460
gagatcttct tcgaccagtc caagaacggc tacgccggtt acattgacgg tggagctagc 5520
caggaggagt tctacaagtt catcaagcca atccttgaga agatggatgg tactgaggag 5580
cttctcgtta agcttaaccg tgaggacctc cttaggaagc agaggacttt cgataacggc 5640
tctatccctc accagatcca ccttggtgag cttcacgcca tccttcgtag gcaggaggac 5700
ttctaccctt tcctcaagga caaccgtgag aagatcgaga agatccttac tttccgtatt 5760
ccttactacg ttggtcctct tgctcgtggt aactcccgtt tcgcttggat gactaggaag 5820
tccgaggaga ctatcacccc ttggaacttc gaggaggttg ttgacaaggg tgcttccgcc 5880
cagtccttca tcgagcgcat gaccaacttc gacaagaacc tccccaacga gaaggtcctc 5940
cccaagcact ccctcctcta cgagtacttc acggtctaca acgagctcac caaggtcaag 6000
tacgtcaccg agggtatgcg caagcctgcc ttcctctccg gcgagcagaa gaaggctatc 6060
gttgacctcc tcttcaagac caaccgcaag gtcaccgtca agcagctcaa ggaggactac 6120
ttcaagaaga tcgagtgctt cgactccgtc gagatcagcg gcgttgagga ccgtttcaac 6180
gcttctctcg gtacctacca cgatctcctc aagatcatca aggacaagga cttcctcgac 6240
aacgaggaga acgaggacat cctcgaggac atcgtcctca ctcttactct cttcgaggat 6300
agggagatga tcgaggagag gctcaagact tacgctcatc tcttcgatga caaggttatg 6360
aagcagctca agcgtcgccg ttacaccggt tggggtaggc tctcccgcaa gctcatcaac 6420
ggtatcaggg ataagcagag cggcaagact atcctcgact tcctcaagtc tgatggtttc 6480
gctaacagga acttcatgca gctcatccac gatgactctc ttaccttcaa ggaggatatt 6540
cagaaggctc aggtgtccgg tcagggcgac tctctccacg agcacattgc taaccttgct 6600
ggttcccctg ctatcaagaa gggcatcctt cagactgtta aggttgtcga tgagcttgtc 6660
aaggttatgg gtcgtcacaa gcctgagaac atcgtcatcg agatggctcg tgagaaccag 6720
actacccaga agggtcagaa gaactcgagg gagcgcatga agaggattga ggagggtatc 6780
aaggagcttg gttctcagat ccttaaggag caccctgtcg agaacaccca gctccagaac 6840
gagaagctct acctctacta cctccagaac ggtagggata tgtacgttga ccaggagctc 6900
gacatcaaca ggctttctga ctacgacgtc gacgccattg ttcctcagtc tttccttaag 6960
gatgactcca tcgacaacaa ggtcctcacg aggtccgaca agaacagggg taagtcggac 7020
aacgtccctt ccgaggaggt tgtcaagaag atgaagaact actggaggca gcttctcaac 7080
gctaagctca ttacccagag gaagttcgac aacctcacga aggctgagag gggtggcctt 7140
tccgagcttg acaaggctgg tttcatcaag aggcagcttg ttgagacgag gcagattacc 7200
aagcacgttg ctcagatcct cgattctagg atgaacacca agtacgacga gaacgacaag 7260
ctcatccgcg aggtcaaggt gatcaccctc aagtccaagc tcgtctccga cttccgcaag 7320
gacttccagt tctacaaggt ccgcgagatc aacaactacc accacgctca cgatgcttac 7380
cttaacgctg tcgttggtac cgctcttatc aagaagtacc ctaagcttga gtccgagttc 7440
gtctacggtg actacaaggt ctacgacgtt cgtaagatga tcgccaagtc cgagcaggag 7500
atcggcaagg ccaccgccaa gtacttcttc tactccaaca tcatgaactt cttcaagacc 7560
gagatcaccc tcgccaacgg cgagatccgc aagcgccctc ttatcgagac gaacggtgag 7620
actggtgaga tcgtttggga caagggtcgc gacttcgcta ctgttcgcaa ggtcctttct 7680
atgcctcagg ttaacatcgt caagaagacc gaggtccaga ccggtggctt ctccaaggag 7740
tctatccttc caaagagaaa ctcggacaag ctcatcgcta ggaagaagga ttgggaccct 7800
aagaagtacg gtggtttcga ctcccctact gtcgcctact ccgtcctcgt ggtcgccaag 7860
gtggagaagg gtaagtcgaa gaagctcaag tccgtcaagg agctcctcgg catcaccatc 7920
atggagcgct cctccttcga gaagaacccg atcgacttcc tcgaggccaa gggctacaag 7980
gaggtcaaga aggacctcat catcaagctc cccaagtact ctcttttcga gctcgagaac 8040
ggtcgtaaga ggatgctggc ttccgctggt gagctccaga agggtaacga gcttgctctt 8100
ccttccaagt acgtgaactt cctctacctc gcctcccact acgagaagct caagggttcc 8160
cctgaggata acgagcagaa gcagctcttc gtggagcagc acaagcacta cctcgacgag 8220
atcatcgagc agatctccga gttctccaag cgcgtcatcc tcgctgacgc taacctcgac 8280
aaggtcctct ccgcctacaa caagcaccgc gacaagccca tccgcgagca ggccgagaac 8340
atcatccacc tcttcacgct cacgaacctc ggcgcccctg ctgctttcaa gtacttcgac 8400
accaccatcg acaggaagcg ttacacgtcc accaaggagg ttctcgacgc tactctcatc 8460
caccagtcca tcaccggtct ttacgagact cgtatcgacc tttcccagct tggtggtgat 8520
gacgatgaca aaatggcacc gaagaaaaaa aggaaggtcg gcggctcccc gaagaaaaaa 8580
aggaaggtcg gcggctcccc gaagaaaaaa aggaaggtcg gcggctcccc gaagaaaaaa 8640
aggaaggtcg gaatccatgg cggatcagga gccaccaact tctccctcct caagcaggcc 8700
ggcgacgtgg aggagaaccc gggcccaatg aaaaagcctg aactcaccgc gacgtctgtc 8760
gagaagtttc tgatcgaaaa gttcgacagc gtctccgacc tgatgcagct ctcggagggc 8820
gaagaatctc gtgctttcag cttcgatgta ggagggcgtg gatatgtcct gcgggtaaat 8880
agctgcgccg atggtttcta caaagatcgt tatgtttatc ggcactttgc atcggccgcg 8940
ctcccgattc cggaagtgct tgacattggg gagtttagcg agagcctgac ctattgcatc 9000
tcccgccgtt cacagggtgt cacgttgcaa gacctgcctg aaaccgaact gcccgctgtt 9060
ctacaaccgg tcgcggaggc tatggatgcg atcgctgcgg ccgatcttag ccagacgagc 9120
gggttcggcc cattcggacc gcaaggaatc ggtcaataca ctacatggcg tgatttcata 9180
tgcgcgattg ctgatcccca tgtgtatcac tggcaaactg tgatggacga caccgtcagt 9240
gcgtccgtcg cgcaggctct cgatgagctg atgctttggg ccgaggactg ccccgaagtc 9300
cggcacctcg tgcacgcgga tttcggctcc aacaatgtcc tgacggacaa tggccgcata 9360
acagcggtca ttgactggag cgaggcgatg ttcggggatt cccaatacga ggtcgccaac 9420
atcttcttct ggaggccgtg gttggcttgt atggagcagc agacgcgcta cttcgagcgg 9480
aggcatccgg agcttgcagg atcgccacga ctccgggcgt atatgctccg cattggtctt 9540
gaccaactct atcagagctt ggttgacggc aatttcgatg atgcagcttg ggcgcagggt 9600
cgatgcgacg caatcgtccg atccggagcc gggactgtcg ggcgtacaca aatcgcccgc 9660
agaagcgcgg ccgtctggac cgatggctgt gtagaagtac tcgccgatag tggaaaccga 9720
cgccccagca ctcgtccgag ggcaaagaaa tagactagtt cccgatcgtt caaacatttg 9780
gcaataaagt ttcttaagat tgaatcctgt tgccggtctt gcgatgatta tcatataatt 9840
tctgttgaat tacgttaagc atgtaataat taacatgtaa tgcatgacgt tatttatgag 9900
gtgggttttt atgattagag tcccgcaatt atacatttaa tacgcgatag aaaacaaaat 9960
atagcgcgca aactaggata aattatcgcg cgcggtgtca tctatgttac tagacctgca 10020
ggtggaatcg gcagcaaagg attttttcct gtagttttcc cacaaccatt ttttaccatc 10080
cgaatgatag gataggaaaa atatccaagt gaacagtatt cctataaaat tcccgtaaaa 10140
agcctgcaat ccgaatgagc cctgaagtct gaactagccg gtcacctgta caggctatcg 10200
agatgccata caagagacgg tagtaggaac taggaagacg atggttgatt cgtcaggcga 10260
aatcgtcgtc ctgcagtcgc atctatgggc ctggacggaa taggggaaaa agttggccgg 10320
ataggaggga aaggcccagg tgcttacgtg cgaggtaggc ctgggctctc agcacttcga 10380
ttcgttggca ccggggtagg atgcaataga gagcaacgtt tagtaccacc tcgcttagct 10440
agagcaaact ggactgcctt atatgcgcgg gtgctggctt ggctgccgat atctcgctct 10500
cacattccgt ttcagagcta tgctggaaac agcatagcaa gttgaaataa ggctagtccg 10560
ttatcaactt gaaaaagtgg caccgagtcg gtgctttttt tttaggaatc tttaaacata 10620
cgaacagatc acttaaagtt cttctgaagc aacttaaagt tatcaggcat gcatggatct 10680
tggaggaatc agatgtgcag tcagggacca tagcacaaga caggcgtctt ctactggtgc 10740
taccagcaaa tgctggaagc cgggaacact gggtacgttg gaaaccacgt gtgatgtgaa 10800
ggagtaagat aaactgtagg agaaaagcat ttcgtagtgg gccatgaagc ctttcaggac 10860
atgtattgca gtatgggccg gcccattacg caattggacg acaacaaaga ctagtattag 10920
taccacctcg gctatccaca tagatcaaag ctggtttaaa agagttgtgc agatgatccg 10980
tggcgggtat ggtggtgcaa tggggtttca gagctatgct ggaaacagca tagcaagttg 11040
aaataaggct agtccgttat caacttgaaa aagtggcacc gagtcggtgc aaacctatcc 11100
tccaattgca ccaccatttt tttttggcat gcaagcttgg cactggccgt cgttttacaa 11160
cgtcgtgact gggaaaaccc tggcgttacc caacttaatc gccttgcagc acatccccct 11220
ttcgccagct ggcgtaatag cgaagaggcc cgcaccgatc gcccttccca acagttgcgc 11280
agcctgaatg gcgaatgcta gagcagcttg agcttggatc agattgtcgt ttcccgcctt 11340
cagtttaaac tatcagtgtt tgacaggata tattggcggg taaacctaag agaaaagagc 11400
gtttattaga ataacggata tttaaaaggg cgtgaaaagg tttatccgtt cgtccatttg 11460
tatgtgcatg ccaaccacag ggttcccctc gggatcaaag tactttgatc caacccctcc 11520
gctgctatag tgcagtcggc ttctgacgtt cagtgcagcc gtcttctgaa aacgacatgt 11580
cgcacaagtc ctaagttacg cgacaggctg ccgccctgcc cttttcctgg cgttttcttg 11640
tcgcgtgttt tagtcgcata aagtagaata cttgcgacta gaaccggaga cattacgcca 11700
tgaacaagag cgccgccgct ggcctgctgg gctatgcccg cgtcagcacc gacgaccagg 11760
acttgaccaa ccaacgggcc gaactgcacg cggccggctg caccaagctg ttttccgaga 11820
agatcaccgg caccaggcgc gaccgcccgg agctggccag gatgcttgac cacctacgcc 11880
ctggcgacgt tgtgacagtg accaggctag accgcctggc ccgcagcacc cgcgacctac 11940
tggacattgc cgagcgcatc caggaggccg gcgcgggcct gcgtagcctg gcagagccgt 12000
gggccgacac caccacgccg gccggccgca tggtgttgac cgtgttcgcc ggcattgccg 12060
agttcgagcg ttccctaatc atcgaccgca cccggagcgg gcgcgaggcc gccaaggccc 12120
gaggcgtgaa gtttggcccc cgccctaccc tcaccccggc acagatcgcg cacgcccgcg 12180
agctgatcga ccaggaaggc cgcaccgtga aagaggcggc tgcactgctt ggcgtgcatc 12240
gctcgaccct gtaccgcgca cttgagcgca gcgaggaagt gacgcccacc gaggccaggc 12300
ggcgcggtgc cttccgtgag gacgcattga ccgaggccga cgccctggcg gccgccgaga 12360
atgaacgcca agaggaacaa gcatgaaacc gcaccaggac ggccaggacg aaccgttttt 12420
cattaccgaa gagatcgagg cggagatgat cgcggccggg tacgtgttcg agccgcccgc 12480
gcacgtctca accgtgcggc tgcatgaaat cctggccggt ttgtctgatg ccaagctggc 12540
ggcctggccg gccagcttgg ccgctgaaga aaccgagcgc cgccgtctaa aaaggtgatg 12600
tgtatttgag taaaacagct tgcgtcatgc ggtcgctgcg tatatgatgc gatgagtaaa 12660
taaacaaata cgcaagggga acgcatgaag gttatcgctg tacttaacca gaaaggcggg 12720
tcaggcaaga cgaccatcgc aacccatcta gcccgcgccc tgcaactcgc cggggccgat 12780
gttctgttag tcgattccga tccccagggc agtgcccgcg attgggcggc cgtgcgggaa 12840
gatcaaccgc taaccgttgt cggcatcgac cgcccgacga ttgaccgcga cgtgaaggcc 12900
atcggccggc gcgacttcgt agtgatcgac ggagcgcccc aggcggcgga cttggctgtg 12960
tccgcgatca aggcagccga cttcgtgctg attccggtgc agccaagccc ttacgacata 13020
tgggccaccg ccgacctggt ggagctggtt aagcagcgca ttgaggtcac ggatggaagg 13080
ctacaagcgg cctttgtcgt gtcgcgggcg atcaaaggca cgcgcatcgg cggtgaggtt 13140
gccgaggcgc tggccgggta cgagctgccc attcttgagt cccgtatcac gcagcgcgtg 13200
agctacccag gcactgccgc cgccggcaca accgttcttg aatcagaacc cgagggcgac 13260
gctgcccgcg aggtccaggc gctggccgct gaaattaaat caaaactcat ttgagttaat 13320
gaggtaaaga gaaaatgagc aaaagcacaa acacgctaag tgccggccgt ccgagcgcac 13380
gcagcagcaa ggctgcaacg ttggccagcc tggcagacac gccagccatg aagcgggtca 13440
actttcagtt gccggcggag gatcacacca agctgaagat gtacgcggta cgccaaggca 13500
agaccattac cgagctgcta tctgaataca tcgcgcagct accagagtaa atgagcaaat 13560
gaataaatga gtagatgaat tttagcggct aaaggaggcg gcatggaaaa tcaagaacaa 13620
ccaggcaccg acgccgtgga atgccccatg tgtggaggaa cgggcggttg gccaggcgta 13680
agcggctggg ttgtctgccg gccctgcaat ggcactggaa cccccaagcc cgaggaatcg 13740
gcgtgacggt cgcaaaccat ccggcccggt acaaatcggc gcggcgctgg gtgatgacct 13800
ggtggagaag ttgaaggccg cgcaggccgc ccagcggcaa cgcatcgagg cagaagcacg 13860
ccccggtgaa tcgtggcaag cggccgctga tcgaatccgc aaagaatccc ggcaaccgcc 13920
ggcagccggt gcgccgtcga ttaggaagcc gcccaagggc gacgagcaac cagatttttt 13980
cgttccgatg ctctatgacg tgggcacccg cgatagtcgc agcatcatgg acgtggccgt 14040
tttccgtctg tcgaagcgtg accgacgagc tggcgaggtg atccgctacg agcttccaga 14100
cgggcacgta gaggtttccg cagggccggc cggcatggcc agtgtgtggg attacgacct 14160
ggtactgatg gcggtttccc atctaaccga atccatgaac cgataccggg aagggaaggg 14220
agacaagccc ggccgcgtgt tccgtccaca cgttgcggac gtactcaagt tctgccggcg 14280
agccgatggc ggaaagcaga aagacgacct ggtagaaacc tgcattcggt taaacaccac 14340
gcacgttgcc atgcagcgta cgaagaaggc caagaacggc cgcctggtga cggtatccga 14400
gggtgaagcc ttgattagcc gctacaagat cgtaaagagc gaaaccgggc ggccggagta 14460
catcgagatc gagctagctg attggatgta ccgcgagatc acagaaggca agaacccgga 14520
cgtgctgacg gttcaccccg attacttttt gatcgatccc ggcatcggcc gttttctcta 14580
ccgcctggca cgccgcgccg caggcaaggc agaagccaga tggttgttca agacgatcta 14640
cgaacgcagt ggcagcgccg gagagttcaa gaagttctgt ttcaccgtgc gcaagctgat 14700
cgggtcaaat gacctgccgg agtacgattt gaaggaggag gcggggcagg ctggcccgat 14760
cctagtcatg cgctaccgca acctgatcga gggcgaagca tccgccggtt cctaatgtac 14820
ggagcagatg ctagggcaaa ttgccctagc aggggaaaaa ggtcgaaaag ttctctttcc 14880
tgtggatagc acgtacattg ggaacccaaa gccgtacatt gggaaccgga acccgtacat 14940
tgggaaccca aagccgtaca ttgggaaccg gtcacacatg taagtgactg atataaaaga 15000
gaaaaaaggc gatttttccg cctaaaactc tttaaaactt attaaaactc ttaaaacccg 15060
cctggcctgt gcataactgt ctggccagcg cacagccgaa gagctgcaaa aagcgcctac 15120
ccttcggtcg ctgcgctccc tacgccccgc cgcttcgcgt cggcctatcg cggccgctgg 15180
ccgctcaaaa atggctggcc tacggccagg caatctacca gggcgcggac aagccgcgcc 15240
gtcgccactc gaccgccggc gcccacatca aggcaccctg cctcgcgcgt ttcggtgatg 15300
acggtgaaaa cctctgacac atgcagctcc cggagacggt cacagcttgt ctgtaagcgg 15360
atgccgggag cagacaagcc cgtcagggcg cgtcagcggg tgttggcggg tgtcggggcg 15420
cagccatgac ccagtcacgt agcgatagcg gagtgtatac tggcttaact atgcggcatc 15480
agagcagatt gtactgagag tgcaccatat gcggtgtgaa ataccgcaca gatgcgtaag 15540
gagaaaatac cgcatcaggc gctcttccgc ttcctcgctc actgactcgc tgcgctcggt 15600
cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg gtaatacggt tatccacaga 15660
atcaggggat aacgcaggaa agaacatgtg agcaaaaggc cagcaaaagg ccaggaaccg 15720
taaaaaggcc gcgttgctgg cgtttttcca taggctccgc ccccctgacg agcatcacaa 15780
aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga ctataaagat accaggcgtt 15840
tccccctgga agctccctcg tgcgctctcc tgttccgacc ctgccgctta ccggatacct 15900
gtccgccttt ctcccttcgg gaagcgtggc gctttctcat agctcacgct gtaggtatct 15960
cagttcggtg taggtcgttc gctccaagct gggctgtgtg cacgaacccc ccgttcagcc 16020
cgaccgctgc gccttatccg gtaactatcg tcttgagtcc aacccggtaa gacacgactt 16080
atcgccactg gcagcagcca ctggtaacag gattagcaga gcgaggtatg taggcggtgc 16140
tacagagttc ttgaagtggt ggcctaacta cggctacact agaaggacag tatttggtat 16200
ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt ggtagctctt gatccggcaa 16260
acaaaccacc gctggtagcg gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa 16320
aaaaggatct caagaagatc ctttgatctt ttctacgggg tctgacgctc agtggaacga 16380
aaactcacgt taagggattt tggtcatgca ttctaggtac taaaacaatt catccagtaa 16440
aatataatat tttattttct cccaatcagg cttgatcccc agtaagtcaa aaaatagctc 16500
gacatactgt tcttccccga tatcctccct gatcgaccgg acgcagaagg caatgtcata 16560
ccacttgtcc gccctgccgc ttctcccaag atcaataaag ccacttactt tgccatcttt 16620
cacaaagatg ttgctgtctc ccaggtcgcc gtgggaaaag acaagttcct cttcgggctt 16680
ttccgtcttt aaaaaatcat acagctcgcg cggatcttta aatggagtgt cttcttccca 16740
gttttcgcaa tccacatcgg ccagatcgtt attcagtaag taatccaatt cggctaagcg 16800
gctgtctaag ctattcgtat agggacaatc cgatatgtcg atggagtgaa agagcctgat 16860
gcactccgca tacagctcga taatcttttc agggctttgt tcatcttcat actcttccga 16920
gcaaaggacg ccatcggcct cactcatgag cagattgctc cagccatcat gccgttcaaa 16980
gtgcaggacc tttggaacag gcagctttcc ttccagccat agcatcatgt ccttttcccg 17040
ttccacatca taggtggtcc ctttataccg gctgtccgtc atttttaaat ataggttttc 17100
attttctccc accagcttat ataccttagc aggagacatt ccttccgtat cttttacgca 17160
gcggtatttt tcgatcagtt ttttcaattc cggtgatatt ctcattttag ccatttatta 17220
tttccttcct cttttctaca gtatttaaag ataccccaag aagctaatta taacaagacg 17280
aactccaatt cactgttcct tgcattctaa aaccttaaat accagaaaac agctttttca 17340
aagttgtttt caaagttggc gtataacata gtatcgacgg agccgatttt gaaaccgcgg 17400
tgatcacagg cagcaacgct ctgtcatcgt tacaatcaac atgctaccct ccgcgagatc 17460
atccgtgttt caaacccggc agcttagttg ccgttcttcc gaatagcatc ggtaacatga 17520
gcaaagtctg ccgccttaca acggctctcc cgctgacgcc gtcccggact gatgggctgc 17580
ctgtatcgag tggtgatttt gtgccgagct gccggtcggg gagctgttgg ctggct 17636
Claims (8)
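1. Use of a complete system in any one of the following S1)-S4):
S1) editing the genomic sequence of a plant or plant cell;
S2) preparing a product for editing the genomic sequence of a plant or plant cell;
S3) improving the efficiency of editing the genomic sequence of a plant or plant cell;
S4) preparing a product for improving the efficiency of editing the genomic sequence of a plant or plant cell;
the complete system comprises a fusion protein and a pegRNA;
the fusion protein comprises a reverse transcriptase, a Cas9 nickase, a self-cleaving oligopeptide and a selectable marker protein; the reverse transcriptase is fused to the N-terminus of the Cas9 nickase and is fused to the selectable marker protein through the self-cleaving oligopeptide;
the Cas9 nickase is the protein whose amino acid sequence is shown in Sequence 2;
the reverse transcriptase is the protein whose amino acid sequence is shown in Sequence 3;
the self-cleaving oligopeptide is the protein whose amino acid sequence is shown in Sequence 5;
the pegRNA consists of, in order, a target sequence, an esgRNA scaffold, an RT sequence and a PBS sequence; the esgRNA scaffold is the RNA molecule obtained by replacing T with U in positions 11008-11093 of Sequence 1;
the plant is rice.
2. The use according to claim 1, characterized in that: the gene encoding the Cas9 nickase is the DNA molecule shown in positions 2293-6393 of Sequence 1.
3. The use according to claim 1, characterized in that: the gene encoding the reverse transcriptase is the DNA molecule shown in positions 6493-8523 of Sequence 1.
4. The use according to claim 1, characterized in that: the selectable marker protein is hygromycin phosphotransferase; the hygromycin phosphotransferase is the protein whose amino acid sequence is shown in Sequence 4.
5. The use according to claim 4, characterized in that: the gene encoding the hygromycin phosphotransferase is the DNA molecule shown in positions 8731-9756 of Sequence 1.
6. The use according to claim 1, characterized in that: the gene encoding the self-cleaving oligopeptide is the DNA molecule shown in positions 8674-8730 of Sequence 1.
7. The use according to claim 1, characterized in that: the complete system further comprises an esgRNA.
8. The use according to claim 1, characterized in that: the editing of the genomic sequence is base substitution in the genomic sequence.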
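The claims define the pegRNA by an ordered list of parts (target sequence, esgRNA scaffold, RT sequence, PBS sequence) and define the scaffold by 1-based, inclusive coordinates in Sequence 1 with T replaced by U. The following is a minimal sketch of that bookkeeping only, not the applicant's procedure. The function names (`slice_1based`, `dna_to_rna`, `assemble_pegrna`) and every sequence in it are hypothetical placeholders chosen for illustration; the toy string is not the real Sequence 1.

```python
def slice_1based(seq: str, start: int, end: int) -> str:
    """Return positions start..end of seq, counted 1-based and inclusive,
    which is how the claims cite coordinates."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("coordinates fall outside the sequence")
    return seq[start - 1:end]


def dna_to_rna(dna: str) -> str:
    """Replace every T with U, preserving the letter case of the input."""
    return dna.replace("t", "u").replace("T", "U")


def assemble_pegrna(target: str, scaffold_dna: str, rt: str, pbs: str) -> str:
    """Join target + scaffold + RT + PBS in that order and write the result as RNA."""
    return "".join(dna_to_rna(part) for part in (target, scaffold_dna, rt, pbs))


# Toy stand-in for Sequence 1 (hypothetical, NOT the real vector sequence).
toy_sequence_1 = "aaaagtttcagagctatgctcccc"

# Hypothetical coordinates 5-20 play the role that 11008-11093 plays in claim 1.
scaffold_dna = slice_1based(toy_sequence_1, 5, 20)

# Target, RT and PBS strings below are made-up placeholders.
pegrna = assemble_pegrna("gattaca", scaffold_dna, "ccgg", "ttaa")
print(len(scaffold_dna), pegrna)
```

With inclusive coordinates the length of a cited span is end - start + 1, so positions 11008-11093 in claim 1 describe a scaffold of 86 nucleotides.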
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011621690.4A CN114686454B (zh) | 2020-12-31 | 2020-12-31 | Pe-p3 prime editing system and its use in genome base editing |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011621690.4A CN114686454B (zh) | 2020-12-31 | 2020-12-31 | Pe-p3 prime editing system and its use in genome base editing |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114686454A (zh) | 2022-07-01 |
CN114686454B (zh) | 2024-04-26 |
Family
ID=82133798
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011621690.4A Active CN114686454B (zh) | Pe-p3 prime editing system and its use in genome base editing | 2020-12-31 | 2020-12-31 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114686454B (zh) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
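CN110577965A (zh) * | 2019-08-30 | 2019-12-17 | Beijing Academy of Agriculture and Forestry Sciences | Use of the xCas9n-epBE base editing system in gene editing |
CN110951743A (zh) * | 2019-12-31 | 2020-04-03 | Beijing Academy of Agriculture and Forestry Sciences | Method for improving plant gene replacement efficiency |
CN111378051A (zh) * | 2020-03-25 | 2020-07-07 | Beijing Academy of Agriculture and Forestry Sciences | Pe-p2 prime editing system and its use in genome base editing |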
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9840699B2 (en) * | 2013-12-12 | 2017-12-12 | President And Fellows Of Harvard College | Methods for nucleic acid editing |
CN110157727A (zh) * | 2017-12-21 | 2019-08-23 | Institute of Genetics and Developmental Biology, Chinese Academy of Sciences | Plant base editing method |
BR102019009665A2 (pt) * | 2018-12-21 | 2022-02-08 | Jacques P. Tremblay | Modification of the amyloid beta precursor protein (APP) through base editing using the CRISPR/Cas9 system |
Non-Patent Citations (5)
Title |
---|
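A design optimized prime editor with expanded scope and capability in plants; Wen Xu; Nature Plants; 45-52 * |
Prime editing opens a new chapter in precise plant genome editing; Qin Ruiying; Wei Pengcheng; Hereditas (Beijing) (06); 5-9 * |
Search-and-replace genome editing without double-strand breaks or donor DNA; Andrew V. Anzalone et al.; Nature; 149-178 * |
Creation of new maize germplasm with fragrant-rice flavor by editing BADH2-1/BADH2-2 using CRISPR/Cas9 technology; Zhang Xiang et al.; Scientia Agricultura Sinica; 2064-2072 * |
Expanding the base editing PAM to GA and relaxed NG by optimizing the xCas9 system; Zhang Chengwei et al.; Abstracts of the 2019 Annual Academic Conference of the Crop Science Society of China; 1 * |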
Also Published As
Publication number | Publication date |
---|---|
CN114686454A (zh) | 2022-07-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
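CN108495932B (zh) | | Method for converting the genomic sequence of a monocot plant by specifically converting nucleic acid bases of a targeted DNA sequence, and molecular complex used therein |
CN106939316B (zh) | | Method for site-directed knockout of the second exon of the rice OsPDCD5 gene using the CRISPR/Cas9 system |
CN110951741B (zh) | | CRISPR Cpf1-based multi-gene editing and expression regulation system for Bacillus subtilis |
CN110577965B (zh) | | Use of the xCas9n-epBE base editing system in gene editing |
CN109722439B (zh) | | Use of the tobacco mlo2, mlo6 and mlo12 genes in preparing powdery mildew-resistant tobacco varieties, and method thereof |
CN106929532B (zh) | | Artificial creation of maize male-sterile lines and an efficient transfer breeding method |
US20040034889A1 (en) | | Method of transforming soybean |
CN110951770B (zh) | | Simple, rapid and efficient CRISPR/Cas9 vector construction method and its use |
CN114763556B (zh) | | Prime base editing system with improved gene editing efficiency and its use |
CN109355306B (zh) | | Upland cotton transformation event icr24-397 and its specific identification method |
CN110724685A (zh) | | Flanking sequence of the exogenous insertion in transgenic salt-tolerant, herbicide-tolerant maize sr801 and its use |
CN114686454B (zh) | | Pe-p3 prime editing system and its use in genome base editing |
CN114438104A (zh) | | SlGRAS9 gene regulating sugar content in tomato fruit and its use in breeding high-sugar tomatoes |
AU2005252598A1 (en) | | Transformation vectors |
CN109266686A (zh) | | Method for site-directed replacement of genomic nucleotides |
CN110564752B (zh) | | Use of differential surrogate technology in enriching cells with C·T base substitution |
CN108949805B (zh) | | Plant genome multi-site editing vector pCXUN-CAS9-RGR |
CN103305541A (zh) | | Activation-tagging Ac/Ds transposon system and its use in constructing plant mutant libraries |
CN109265562B (zh) | | Nickase and its use in genomic base substitution |
CN109666693B (zh) | | Use of Mg132 in editing a recipient genome with a base editing system |
CN109666694B (zh) | | Use of Scr7 in editing a recipient genome with a base editing system |
CN113881670B (zh) | | Method for constructing transgenic plants resistant to soybean mosaic virus |
KR101570765B1 (ko) | | Agrobacterium tumefaciens mixture for reproducing the infectious activity of Broad bean wilt virus 2 |
CN113621641A (zh) | | gat vector mediating regulation of male fertility in plant recessive genic male-sterile lines and its use |
CN114317596B (zh) | | Method for mutating A to G in a plant genome target sequence |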
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |