CN117089572A - Low off-target base editor and construction thereof
- Publication number
- CN117089572A (application CN202210518789.4A)
- Authority
- CN
- China
- Prior art keywords
- leu
- lys
- glu
- asp
- ile
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Concepts (machine-extracted; each entry lists ID, name, type, query match, sections, count)
- 238000010276 construction Methods 0.000 title abstract description 7
- 108091033409 CRISPR Proteins 0.000 claims abstract description 103
- 241001465754 Metazoa Species 0.000 claims abstract description 22
- 102000008300 Mutant Proteins Human genes 0.000 claims abstract description 13
- 108010021466 Mutant Proteins Proteins 0.000 claims abstract description 13
- 210000004962 mammalian cell Anatomy 0.000 claims abstract description 8
- 125000000539 amino acid group Chemical group 0.000 claims abstract description 5
- 238000001415 gene therapy Methods 0.000 claims abstract description 4
- 108090000623 proteins and genes Proteins 0.000 claims description 61
- 210000004027 cell Anatomy 0.000 claims description 36
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical group NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 claims description 36
- 230000035772 mutation Effects 0.000 claims description 29
- 108020004707 nucleic acids Proteins 0.000 claims description 20
- 102000039446 nucleic acids Human genes 0.000 claims description 20
- 150000007523 nucleic acids Chemical class 0.000 claims description 20
- CKLJMWTZIZZHCS-REOHCLBHSA-N aspartic acid group Chemical group N[C@@H](CC(=O)O)C(=O)O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 claims description 18
- UYTPUPDQBNUYGX-UHFFFAOYSA-N guanine Chemical group O=C1NC(N)=NC2=C1N=CN2 UYTPUPDQBNUYGX-UHFFFAOYSA-N 0.000 claims description 18
- 239000012620 biological material Substances 0.000 claims description 16
- GFFGJBXGBJISGV-UHFFFAOYSA-N adenyl group Chemical group N1=CN=C2N=CNC2=C1N GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 claims description 15
- 230000008685 targeting Effects 0.000 claims description 15
- 230000014509 gene expression Effects 0.000 claims description 12
- 238000000034 method Methods 0.000 claims description 12
- 230000000694 effects Effects 0.000 claims description 11
- 125000001493 tyrosinyl group Chemical group [H]OC1=C([H])C([H])=C(C([H])=C1[H])C([H])([H])C([H])(N([H])[H])C(*)=O 0.000 claims description 11
- 238000002360 preparation method Methods 0.000 claims description 10
- 235000018102 proteins Nutrition 0.000 claims description 9
- 102000004169 proteins and genes Human genes 0.000 claims description 9
- 239000013598 vector Substances 0.000 claims description 9
- RWQNBRDOKXIBIV-UHFFFAOYSA-N thymine Chemical group CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 claims description 8
- 244000005700 microbiome Species 0.000 claims description 7
- 239000003153 chemical reaction reagent Substances 0.000 claims description 6
- 125000003588 lysine group Chemical group [H]N([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])(N([H])[H])C(*)=O 0.000 claims description 6
- 238000012986 modification Methods 0.000 claims description 6
- 230000004048 modification Effects 0.000 claims description 6
- 238000006243 chemical reaction Methods 0.000 claims description 5
- 239000002773 nucleotide Substances 0.000 claims description 5
- 125000003729 nucleotide group Chemical group 0.000 claims description 5
- 201000010099 disease Diseases 0.000 claims description 4
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 claims description 4
- 125000000404 glutamine group Chemical group N[C@@H](CCC(N)=O)C(=O)* 0.000 claims description 4
- 125000002987 valine group Chemical group [H]N([H])C([H])(C(*)=O)C([H])(C([H])([H])[H])C([H])([H])[H] 0.000 claims description 4
- 125000000637 arginyl group Chemical group N[C@@H](CCCNC(N)=N)C(=O)* 0.000 claims description 3
- 238000011282 treatment Methods 0.000 claims description 3
- 241000124008 Mammalia Species 0.000 claims description 2
- 239000003814 drug Substances 0.000 claims description 2
- 125000001500 prolyl group Chemical group [H]N1C([H])(C(=O)[*])C([H])([H])C([H])([H])C1([H])[H] 0.000 claims description 2
- 238000011321 prophylaxis Methods 0.000 claims description 2
- 230000009437 off-target effect Effects 0.000 abstract description 5
- 230000000051 modifying effect Effects 0.000 abstract description 4
- 241000193996 Streptococcus pyogenes Species 0.000 abstract description 2
- 238000007877 drug screening Methods 0.000 abstract description 2
- 239000013612 plasmid Substances 0.000 description 117
- 108020005004 Guide RNA Proteins 0.000 description 88
- 108010034529 leucyl-lysine Proteins 0.000 description 42
- 108010092854 aspartyllysine Proteins 0.000 description 36
- 108020004414 DNA Proteins 0.000 description 30
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 30
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 24
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 24
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 21
- DRRXXZBXDMLGFC-IHRRRGAJSA-N Lys-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN DRRXXZBXDMLGFC-IHRRRGAJSA-N 0.000 description 18
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 18
- 108010062796 arginyllysine Proteins 0.000 description 18
- 108010025306 histidylleucine Proteins 0.000 description 18
- 108010057821 leucylproline Proteins 0.000 description 18
- 108010073969 valyllysine Proteins 0.000 description 18
- 108010063718 gamma-glutamylaspartic acid Proteins 0.000 description 16
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 14
- MDNAVFBZPROEHO-UHFFFAOYSA-N Ala-Lys-Val Natural products CC(C)C(C(O)=O)NC(=O)C(NC(=O)C(C)N)CCCCN MDNAVFBZPROEHO-UHFFFAOYSA-N 0.000 description 12
- XPSGESXVBSQZPL-SRVKXCTJSA-N Arg-Arg-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O XPSGESXVBSQZPL-SRVKXCTJSA-N 0.000 description 12
- BDMIFVIWCNLDCT-CIUDSAMLSA-N Asn-Arg-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O BDMIFVIWCNLDCT-CIUDSAMLSA-N 0.000 description 12
- DDPXDCKYWDGZAL-BQBZGAKWSA-N Asn-Gly-Arg Chemical compound NC(=O)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N DDPXDCKYWDGZAL-BQBZGAKWSA-N 0.000 description 12
- QXHVOUSPVAWEMX-ZLUOBGJFSA-N Asp-Asp-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O QXHVOUSPVAWEMX-ZLUOBGJFSA-N 0.000 description 12
- ILWHFUZZCFYSKT-AVGNSLFASA-N Glu-Lys-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O ILWHFUZZCFYSKT-AVGNSLFASA-N 0.000 description 12
- TWYSSILQABLLME-HJGDQZAQSA-N Glu-Thr-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O TWYSSILQABLLME-HJGDQZAQSA-N 0.000 description 12
- SITLTJHOQZFJGG-XPUUQOCRSA-N Glu-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O SITLTJHOQZFJGG-XPUUQOCRSA-N 0.000 description 12
- NTBOEZICHOSJEE-YUMQZZPRSA-N Gly-Lys-Ser Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O NTBOEZICHOSJEE-YUMQZZPRSA-N 0.000 description 12
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 12
- BCISUQVFDGYZBO-QSFUFRPTSA-N Ile-Val-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(O)=O BCISUQVFDGYZBO-QSFUFRPTSA-N 0.000 description 12
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysin Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 12
- LHSGPCFBGJHPCY-UHFFFAOYSA-N L-leucine-L-tyrosine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 LHSGPCFBGJHPCY-UHFFFAOYSA-N 0.000 description 12
- DBSLVQBXKVKDKJ-BJDJZHNGSA-N Leu-Ile-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O DBSLVQBXKVKDKJ-BJDJZHNGSA-N 0.000 description 12
- IRMLZWSRWSGTOP-CIUDSAMLSA-N Leu-Ser-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O IRMLZWSRWSGTOP-CIUDSAMLSA-N 0.000 description 12
- DNEJSAIMVANNPA-DCAQKATOSA-N Lys-Asn-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O DNEJSAIMVANNPA-DCAQKATOSA-N 0.000 description 12
- HQVDJTYKCMIWJP-YUMQZZPRSA-N Lys-Asn-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O HQVDJTYKCMIWJP-YUMQZZPRSA-N 0.000 description 12
- FACUGMGEFUEBTI-SRVKXCTJSA-N Lys-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CCCCN FACUGMGEFUEBTI-SRVKXCTJSA-N 0.000 description 12
- WVJNGSFKBKOKRV-AJNGGQMLSA-N Lys-Leu-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WVJNGSFKBKOKRV-AJNGGQMLSA-N 0.000 description 12
- LJADEBULDNKJNK-IHRRRGAJSA-N Lys-Leu-Val Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O LJADEBULDNKJNK-IHRRRGAJSA-N 0.000 description 12
- JMNRXRPBHFGXQX-GUBZILKMSA-N Lys-Ser-Glu Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O JMNRXRPBHFGXQX-GUBZILKMSA-N 0.000 description 12
- SQRLLZAQNOQCEG-KKUMJFAQSA-N Lys-Tyr-Ser Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CC=C(O)C=C1 SQRLLZAQNOQCEG-KKUMJFAQSA-N 0.000 description 12
- RRIHXWPHQSXHAQ-XUXIUFHCSA-N Met-Ile-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCCN)C(O)=O RRIHXWPHQSXHAQ-XUXIUFHCSA-N 0.000 description 12
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 12
- 108010047562 NGR peptide Proteins 0.000 description 12
- WPTYDQPGBMDUBI-QWRGUYRKSA-N Phe-Gly-Asn Chemical compound N[C@@H](Cc1ccccc1)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O WPTYDQPGBMDUBI-QWRGUYRKSA-N 0.000 description 12
- WEMYTDDMDBLPMI-DKIMLUQUSA-N Phe-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N WEMYTDDMDBLPMI-DKIMLUQUSA-N 0.000 description 12
- IPFXYNKCXYGSSV-KKUMJFAQSA-N Phe-Ser-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N IPFXYNKCXYGSSV-KKUMJFAQSA-N 0.000 description 12
- MIJWOJAXARLEHA-WDSKDSINSA-N Ser-Gly-Glu Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O MIJWOJAXARLEHA-WDSKDSINSA-N 0.000 description 12
- 101150063416 add gene Proteins 0.000 description 12
- 108010050848 glycylleucine Proteins 0.000 description 12
- 108010018006 histidylserine Proteins 0.000 description 12
- 108010012058 leucyltyrosine Proteins 0.000 description 12
- 108010054155 lysyllysine Proteins 0.000 description 12
- 108010012581 phenylalanylglutamate Proteins 0.000 description 12
- 108010051110 tyrosyl-lysine Proteins 0.000 description 12
- 102100037964 E3 ubiquitin-protein ligase RING2 Human genes 0.000 description 11
- 101001095815 Homo sapiens E3 ubiquitin-protein ligase RING2 Proteins 0.000 description 11
- 229950010131 puromycin Drugs 0.000 description 11
- 125000006850 spacer group Chemical group 0.000 description 11
- VILLWIDTHYPSLC-PEFMBERDSA-N Asp-Glu-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VILLWIDTHYPSLC-PEFMBERDSA-N 0.000 description 10
- 238000012217 deletion Methods 0.000 description 10
- 230000037430 deletion Effects 0.000 description 10
- CELPEWWLSXMVPH-CIUDSAMLSA-N Asp-Asp-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(O)=O CELPEWWLSXMVPH-CIUDSAMLSA-N 0.000 description 9
- 238000010354 CRISPR gene editing Methods 0.000 description 9
- 238000010362 genome editing Methods 0.000 description 9
- 108010017391 lysylvaline Proteins 0.000 description 9
- 238000001890 transfection Methods 0.000 description 9
- 101000801640 Homo sapiens Phospholipid-transporting ATPase ABCA3 Proteins 0.000 description 8
- 102100033623 Phospholipid-transporting ATPase ABCA3 Human genes 0.000 description 8
- 229940104302 cytosine Drugs 0.000 description 7
- GYNQVPIDAQTZOY-ROUUACIJSA-N (2s)-2-[[2-[[2-[[(2s)-2-amino-3-(4-hydroxyphenyl)propanoyl]amino]acetyl]amino]acetyl]amino]-3-phenylpropanoic acid Chemical compound C([C@H](N)C(=O)NCC(=O)NCC(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 GYNQVPIDAQTZOY-ROUUACIJSA-N 0.000 description 6
- FJVAQLJNTSUQPY-CIUDSAMLSA-N Ala-Ala-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN FJVAQLJNTSUQPY-CIUDSAMLSA-N 0.000 description 6
- LWUWMHIOBPTZBA-DCAQKATOSA-N Ala-Arg-Lys Chemical compound NC(=N)NCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CCCCN)C(O)=O LWUWMHIOBPTZBA-DCAQKATOSA-N 0.000 description 6
- FVSOUJZKYWEFOB-KBIXCLLPSA-N Ala-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](C)N FVSOUJZKYWEFOB-KBIXCLLPSA-N 0.000 description 6
- MVBWLRJESQOQTM-ACZMJKKPSA-N Ala-Gln-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O MVBWLRJESQOQTM-ACZMJKKPSA-N 0.000 description 6
- YIGLXQRFQVWFEY-NRPADANISA-N Ala-Gln-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O YIGLXQRFQVWFEY-NRPADANISA-N 0.000 description 6
- NJPMYXWVWQWCSR-ACZMJKKPSA-N Ala-Glu-Asn Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O NJPMYXWVWQWCSR-ACZMJKKPSA-N 0.000 description 6
- PMQXMXAASGFUDX-SRVKXCTJSA-N Ala-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CCCCN PMQXMXAASGFUDX-SRVKXCTJSA-N 0.000 description 6
- ZBLQIYPCUWZSRZ-QEJZJMRPSA-N Ala-Phe-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CC1=CC=CC=C1 ZBLQIYPCUWZSRZ-QEJZJMRPSA-N 0.000 description 6
- RTZCUEHYUQZIDE-WHFBIAKZSA-N Ala-Ser-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RTZCUEHYUQZIDE-WHFBIAKZSA-N 0.000 description 6
- HOVPGJUNRLMIOZ-CIUDSAMLSA-N Ala-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@H](C)N HOVPGJUNRLMIOZ-CIUDSAMLSA-N 0.000 description 6
- VRTOMXFZHGWHIJ-KZVJFYERSA-N Ala-Thr-Arg Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O VRTOMXFZHGWHIJ-KZVJFYERSA-N 0.000 description 6
- IOFVWPYSRSCWHI-JXUBOQSCSA-N Ala-Thr-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](C)N IOFVWPYSRSCWHI-JXUBOQSCSA-N 0.000 description 6
- XEPSCVXTCUUHDT-AVGNSLFASA-N Arg-Arg-Leu Natural products CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CCCN=C(N)N XEPSCVXTCUUHDT-AVGNSLFASA-N 0.000 description 6
- IIABBYGHLYWVOS-FXQIFTODSA-N Arg-Asn-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O IIABBYGHLYWVOS-FXQIFTODSA-N 0.000 description 6
- VNFWDYWTSHFRRG-SRVKXCTJSA-N Arg-Gln-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O VNFWDYWTSHFRRG-SRVKXCTJSA-N 0.000 description 6
- QAODJPUKWNNNRP-DCAQKATOSA-N Arg-Glu-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O QAODJPUKWNNNRP-DCAQKATOSA-N 0.000 description 6
- RKRSYHCNPFGMTA-CIUDSAMLSA-N Arg-Glu-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O RKRSYHCNPFGMTA-CIUDSAMLSA-N 0.000 description 6
- MZRBYBIQTIKERR-GUBZILKMSA-N Arg-Glu-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MZRBYBIQTIKERR-GUBZILKMSA-N 0.000 description 6
- GOWZVQXTHUCNSQ-NHCYSSNCSA-N Arg-Glu-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O GOWZVQXTHUCNSQ-NHCYSSNCSA-N 0.000 description 6
- PNIGSVZJNVUVJA-BQBZGAKWSA-N Arg-Gly-Asn Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O PNIGSVZJNVUVJA-BQBZGAKWSA-N 0.000 description 6
- CYXCAHZVPFREJD-LURJTMIESA-N Arg-Gly-Gly Chemical compound NC(=N)NCCC[C@H](N)C(=O)NCC(=O)NCC(O)=O CYXCAHZVPFREJD-LURJTMIESA-N 0.000 description 6
- RKQRHMKFNBYOTN-IHRRRGAJSA-N Arg-His-Lys Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N RKQRHMKFNBYOTN-IHRRRGAJSA-N 0.000 description 6
- OTZMRMHZCMZOJZ-SRVKXCTJSA-N Arg-Leu-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O OTZMRMHZCMZOJZ-SRVKXCTJSA-N 0.000 description 6
- COXMUHNBYCVVRG-DCAQKATOSA-N Arg-Leu-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O COXMUHNBYCVVRG-DCAQKATOSA-N 0.000 description 6
- MJINRRBEMOLJAK-DCAQKATOSA-N Arg-Lys-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCN=C(N)N MJINRRBEMOLJAK-DCAQKATOSA-N 0.000 description 6
- GRRXPUAICOGISM-RWMBFGLXSA-N Arg-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O GRRXPUAICOGISM-RWMBFGLXSA-N 0.000 description 6
- PAPSMOYMQDWIOR-AVGNSLFASA-N Arg-Lys-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O PAPSMOYMQDWIOR-AVGNSLFASA-N 0.000 description 6
- KSUALAGYYLQSHJ-RCWTZXSCSA-N Arg-Met-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KSUALAGYYLQSHJ-RCWTZXSCSA-N 0.000 description 6
- YTMKMRSYXHBGER-IHRRRGAJSA-N Arg-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N YTMKMRSYXHBGER-IHRRRGAJSA-N 0.000 description 6
- NGYHSXDNNOFHNE-AVGNSLFASA-N Arg-Pro-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O NGYHSXDNNOFHNE-AVGNSLFASA-N 0.000 description 6
- MOGMYRUNTKYZFB-UNQGMJICSA-N Arg-Thr-Phe Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 MOGMYRUNTKYZFB-UNQGMJICSA-N 0.000 description 6
- AOJYORNRFWWEIV-IHRRRGAJSA-N Arg-Tyr-Asp Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)CC1=CC=C(O)C=C1 AOJYORNRFWWEIV-IHRRRGAJSA-N 0.000 description 6
- XHFXZQHTLJVZBN-FXQIFTODSA-N Asn-Arg-Asn Chemical compound C(C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N)CN=C(N)N XHFXZQHTLJVZBN-FXQIFTODSA-N 0.000 description 6
- BVLIJXXSXBUGEC-SRVKXCTJSA-N Asn-Asn-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O BVLIJXXSXBUGEC-SRVKXCTJSA-N 0.000 description 6
- WPOLSNAQGVHROR-GUBZILKMSA-N Asn-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)N)N WPOLSNAQGVHROR-GUBZILKMSA-N 0.000 description 6
- IIFDPDVJAHQFSR-WHFBIAKZSA-N Asn-Glu Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O IIFDPDVJAHQFSR-WHFBIAKZSA-N 0.000 description 6
- OPEPUCYIGFEGSW-WDSKDSINSA-N Asn-Gly-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O OPEPUCYIGFEGSW-WDSKDSINSA-N 0.000 description 6
- PBSQFBAJKPLRJY-BYULHYEWSA-N Asn-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N PBSQFBAJKPLRJY-BYULHYEWSA-N 0.000 description 6
- SPCONPVIDFMDJI-QSFUFRPTSA-N Asn-Ile-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O SPCONPVIDFMDJI-QSFUFRPTSA-N 0.000 description 6
- DJIMLSXHXKWADV-CIUDSAMLSA-N Asn-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(N)=O DJIMLSXHXKWADV-CIUDSAMLSA-N 0.000 description 6
- LSJQOMAZIKQMTJ-SRVKXCTJSA-N Asn-Phe-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O LSJQOMAZIKQMTJ-SRVKXCTJSA-N 0.000 description 6
- HZZIFFOVHLWGCS-KKUMJFAQSA-N Asn-Phe-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O HZZIFFOVHLWGCS-KKUMJFAQSA-N 0.000 description 6
- QUMKPKWYDVMGNT-NUMRIWBASA-N Asn-Thr-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O QUMKPKWYDVMGNT-NUMRIWBASA-N 0.000 description 6
- FYRVDDJMNISIKJ-UWVGGRQHSA-N Asn-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 FYRVDDJMNISIKJ-UWVGGRQHSA-N 0.000 description 6
- XBQSLMACWDXWLJ-GHCJXIJMSA-N Asp-Ala-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O XBQSLMACWDXWLJ-GHCJXIJMSA-N 0.000 description 6
- WSOKZUVWBXVJHX-CIUDSAMLSA-N Asp-Arg-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O WSOKZUVWBXVJHX-CIUDSAMLSA-N 0.000 description 6
- MFMJRYHVLLEMQM-DCAQKATOSA-N Asp-Arg-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(=O)O)N MFMJRYHVLLEMQM-DCAQKATOSA-N 0.000 description 6
- ZELQAFZSJOBEQS-ACZMJKKPSA-N Asp-Asn-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZELQAFZSJOBEQS-ACZMJKKPSA-N 0.000 description 6
- GWTLRDMPMJCNMH-WHFBIAKZSA-N Asp-Asn-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O GWTLRDMPMJCNMH-WHFBIAKZSA-N 0.000 description 6
- UGKZHCBLMLSANF-CIUDSAMLSA-N Asp-Asn-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O UGKZHCBLMLSANF-CIUDSAMLSA-N 0.000 description 6
- KNMRXHIAVXHCLW-ZLUOBGJFSA-N Asp-Asn-Ser Chemical compound C([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N)C(=O)O KNMRXHIAVXHCLW-ZLUOBGJFSA-N 0.000 description 6
- RDRMWJBLOSRRAW-BYULHYEWSA-N Asp-Asn-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O RDRMWJBLOSRRAW-BYULHYEWSA-N 0.000 description 6
- SBHUBSDEZQFJHJ-CIUDSAMLSA-N Asp-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(O)=O SBHUBSDEZQFJHJ-CIUDSAMLSA-N 0.000 description 6
- DXQOQMCLWWADMU-ACZMJKKPSA-N Asp-Gln-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O DXQOQMCLWWADMU-ACZMJKKPSA-N 0.000 description 6
- RRKCPMGSRIDLNC-AVGNSLFASA-N Asp-Glu-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O RRKCPMGSRIDLNC-AVGNSLFASA-N 0.000 description 6
- TVIZQBFURPLQDV-DJFWLOJKSA-N Asp-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC(=O)O)N TVIZQBFURPLQDV-DJFWLOJKSA-N 0.000 description 6
- HOBNTSHITVVNBN-ZPFDUUQYSA-N Asp-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CC(=O)O)N HOBNTSHITVVNBN-ZPFDUUQYSA-N 0.000 description 6
- JNNVNVRBYUJYGS-CIUDSAMLSA-N Asp-Leu-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O JNNVNVRBYUJYGS-CIUDSAMLSA-N 0.000 description 6
- CLUMZOKVGUWUFD-CIUDSAMLSA-N Asp-Leu-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O CLUMZOKVGUWUFD-CIUDSAMLSA-N 0.000 description 6
- UJGRZQYSNYTCAX-SRVKXCTJSA-N Asp-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O UJGRZQYSNYTCAX-SRVKXCTJSA-N 0.000 description 6
- GKWFMNNNYZHJHV-SRVKXCTJSA-N Asp-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC(O)=O GKWFMNNNYZHJHV-SRVKXCTJSA-N 0.000 description 6
- WQSXAPPYLGNMQL-IHRRRGAJSA-N Asp-Met-Tyr Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CC(=O)O)N WQSXAPPYLGNMQL-IHRRRGAJSA-N 0.000 description 6
- GYWQGGUCMDCUJE-DLOVCJGASA-N Asp-Phe-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(O)=O GYWQGGUCMDCUJE-DLOVCJGASA-N 0.000 description 6
- LTCKTLYKRMCFOC-KKUMJFAQSA-N Asp-Phe-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O LTCKTLYKRMCFOC-KKUMJFAQSA-N 0.000 description 6
- RSMZEHCMIOKNMW-GSSVUCPTSA-N Asp-Thr-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O RSMZEHCMIOKNMW-GSSVUCPTSA-N 0.000 description 6
- JDDYEZGPYBBPBN-JRQIVUDYSA-N Asp-Thr-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JDDYEZGPYBBPBN-JRQIVUDYSA-N 0.000 description 6
- SQIARYGNVQWOSB-BZSNNMDCSA-N Asp-Tyr-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SQIARYGNVQWOSB-BZSNNMDCSA-N 0.000 description 6
- WAEDSQFVZJUHLI-BYULHYEWSA-N Asp-Val-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O WAEDSQFVZJUHLI-BYULHYEWSA-N 0.000 description 6
- XEYMBRRKIFYQMF-GUBZILKMSA-N Gln-Asp-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O XEYMBRRKIFYQMF-GUBZILKMSA-N 0.000 description 6
- KVXVVDFOZNYYKZ-DCAQKATOSA-N Gln-Gln-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O KVXVVDFOZNYYKZ-DCAQKATOSA-N 0.000 description 6
- CGVWDTRDPLOMHZ-FXQIFTODSA-N Gln-Glu-Asp Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O CGVWDTRDPLOMHZ-FXQIFTODSA-N 0.000 description 6
- SNLOOPZHAQDMJG-CIUDSAMLSA-N Gln-Glu-Glu Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SNLOOPZHAQDMJG-CIUDSAMLSA-N 0.000 description 6
- FTIJVMLAGRAYMJ-MNXVOIDGSA-N Gln-Ile-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(N)=O FTIJVMLAGRAYMJ-MNXVOIDGSA-N 0.000 description 6
- ZNTDJIMJKNNSLR-RWRJDSDZSA-N Gln-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N ZNTDJIMJKNNSLR-RWRJDSDZSA-N 0.000 description 6
- IULKWYSYZSURJK-AVGNSLFASA-N Gln-Leu-Lys Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O IULKWYSYZSURJK-AVGNSLFASA-N 0.000 description 6
- ZBKUIQNCRIYVGH-SDDRHHMPSA-N Gln-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)N)N ZBKUIQNCRIYVGH-SDDRHHMPSA-N 0.000 description 6
- UWKPRVKWEKEMSY-DCAQKATOSA-N Gln-Lys-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O UWKPRVKWEKEMSY-DCAQKATOSA-N 0.000 description 6
- WTJIWXMJESRHMM-XDTLVQLUSA-N Gln-Tyr-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O WTJIWXMJESRHMM-XDTLVQLUSA-N 0.000 description 6
- KKCUFHUTMKQQCF-SRVKXCTJSA-N Glu-Arg-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O KKCUFHUTMKQQCF-SRVKXCTJSA-N 0.000 description 6
- YKLNMGJYMNPBCP-ACZMJKKPSA-N Glu-Asn-Asp Chemical compound C(CC(=O)O)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N YKLNMGJYMNPBCP-ACZMJKKPSA-N 0.000 description 6
- YYOBUPFZLKQUAX-FXQIFTODSA-N Glu-Asn-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O YYOBUPFZLKQUAX-FXQIFTODSA-N 0.000 description 6
- RDPOETHPAQEGDP-ACZMJKKPSA-N Glu-Asp-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O RDPOETHPAQEGDP-ACZMJKKPSA-N 0.000 description 6
- IESFZVCAVACGPH-PEFMBERDSA-N Glu-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCC(O)=O IESFZVCAVACGPH-PEFMBERDSA-N 0.000 description 6
- HJIFPJUEOGZWRI-GUBZILKMSA-N Glu-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)O)N HJIFPJUEOGZWRI-GUBZILKMSA-N 0.000 description 6
- ZXQPJYWZSFGWJB-AVGNSLFASA-N Glu-Cys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CCC(=O)O)N ZXQPJYWZSFGWJB-AVGNSLFASA-N 0.000 description 6
- CJWANNXUTOATSJ-DCAQKATOSA-N Glu-Gln-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCC(=O)O)N CJWANNXUTOATSJ-DCAQKATOSA-N 0.000 description 6
- QQLBPVKLJBAXBS-FXQIFTODSA-N Glu-Glu-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O QQLBPVKLJBAXBS-FXQIFTODSA-N 0.000 description 6
- SJPMNHCEWPTRBR-BQBZGAKWSA-N Glu-Glu-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O SJPMNHCEWPTRBR-BQBZGAKWSA-N 0.000 description 6
- VOORMNJKNBGYGK-YUMQZZPRSA-N Glu-Gly-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)O)N VOORMNJKNBGYGK-YUMQZZPRSA-N 0.000 description 6
- XOFYVODYSNKPDK-AVGNSLFASA-N Glu-His-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CCC(=O)O)N XOFYVODYSNKPDK-AVGNSLFASA-N 0.000 description 6
- SNFUTDLOCQQRQD-ZKWXMUAHSA-N Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CCC(O)=O SNFUTDLOCQQRQD-ZKWXMUAHSA-N 0.000 description 6
- WTMZXOPHTIVFCP-QEWYBTABSA-N Glu-Ile-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 WTMZXOPHTIVFCP-QEWYBTABSA-N 0.000 description 6
- ZSWGJYOZWBHROQ-RWRJDSDZSA-N Glu-Ile-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZSWGJYOZWBHROQ-RWRJDSDZSA-N 0.000 description 6
- PJBVXVBTTFZPHJ-GUBZILKMSA-N Glu-Leu-Asp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)O)N PJBVXVBTTFZPHJ-GUBZILKMSA-N 0.000 description 6
- MWMJCGBSIORNCD-AVGNSLFASA-N Glu-Leu-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O MWMJCGBSIORNCD-AVGNSLFASA-N 0.000 description 6
- NJCALAAIGREHDR-WDCWCFNPSA-N Glu-Leu-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NJCALAAIGREHDR-WDCWCFNPSA-N 0.000 description 6
- ZGEJRLJEAMPEDV-SRVKXCTJSA-N Glu-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCC(=O)O)N ZGEJRLJEAMPEDV-SRVKXCTJSA-N 0.000 description 6
- ZQYZDDXTNQXUJH-CIUDSAMLSA-N Glu-Met-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CCC(=O)O)N ZQYZDDXTNQXUJH-CIUDSAMLSA-N 0.000 description 6
- SYAYROHMAIHWFB-KBIXCLLPSA-N Glu-Ser-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O SYAYROHMAIHWFB-KBIXCLLPSA-N 0.000 description 6
- BXSZPACYCMNKLS-AVGNSLFASA-N Glu-Ser-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O BXSZPACYCMNKLS-AVGNSLFASA-N 0.000 description 6
- RGJKYNUINKGPJN-RWRJDSDZSA-N Glu-Thr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](CCC(=O)O)N RGJKYNUINKGPJN-RWRJDSDZSA-N 0.000 description 6
- KIEICAOUSNYOLM-NRPADANISA-N Glu-Val-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O KIEICAOUSNYOLM-NRPADANISA-N 0.000 description 6
- SOYWRINXUSUWEQ-DLOVCJGASA-N Glu-Val-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCC(O)=O SOYWRINXUSUWEQ-DLOVCJGASA-N 0.000 description 6
- VSVZIEVNUYDAFR-YUMQZZPRSA-N Gly-Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN VSVZIEVNUYDAFR-YUMQZZPRSA-N 0.000 description 6
- LJPIRKICOISLKN-WHFBIAKZSA-N Gly-Ala-Ser Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O LJPIRKICOISLKN-WHFBIAKZSA-N 0.000 description 6
- FMNHBTKMRFVGRO-FOHZUACHSA-N Gly-Asn-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)CN FMNHBTKMRFVGRO-FOHZUACHSA-N 0.000 description 6
- LCNXZQROPKFGQK-WHFBIAKZSA-N Gly-Asp-Ser Chemical compound NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O LCNXZQROPKFGQK-WHFBIAKZSA-N 0.000 description 6
- YYPFZVIXAVDHIK-IUCAKERBSA-N Gly-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)CN YYPFZVIXAVDHIK-IUCAKERBSA-N 0.000 description 6
- ZQIMMEYPEXIYBB-IUCAKERBSA-N Gly-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)CN ZQIMMEYPEXIYBB-IUCAKERBSA-N 0.000 description 6
- CCQOOWAONKGYKQ-BYPYZUCNSA-N Gly-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)CN CCQOOWAONKGYKQ-BYPYZUCNSA-N 0.000 description 6
- UFPXDFOYHVEIPI-BYPYZUCNSA-N Gly-Gly-Asp Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O UFPXDFOYHVEIPI-BYPYZUCNSA-N 0.000 description 6
- LHYJCVCQPWRMKZ-WEDXCCLWSA-N Gly-Leu-Thr Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LHYJCVCQPWRMKZ-WEDXCCLWSA-N 0.000 description 6
- AFWYPMDMDYCKMD-KBPBESRZSA-N Gly-Leu-Tyr Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 AFWYPMDMDYCKMD-KBPBESRZSA-N 0.000 description 6
- FXGRXIATVXUAHO-WEDXCCLWSA-N Gly-Lys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCCN FXGRXIATVXUAHO-WEDXCCLWSA-N 0.000 description 6
- FKESCSGWBPUTPN-FOHZUACHSA-N Gly-Thr-Asn Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O FKESCSGWBPUTPN-FOHZUACHSA-N 0.000 description 6
- DBUNZBWUWCIELX-JHEQGTHGSA-N Gly-Thr-Glu Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O DBUNZBWUWCIELX-JHEQGTHGSA-N 0.000 description 6
- RCHFYMASWAZQQZ-ZANVPECISA-N Gly-Trp-Ala Chemical compound C1=CC=C2C(C[C@@H](C(=O)N[C@@H](C)C(O)=O)NC(=O)CN)=CNC2=C1 RCHFYMASWAZQQZ-ZANVPECISA-N 0.000 description 6
- XINDHUAGVGCNSF-QSFUFRPTSA-N His-Ala-Ile Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O XINDHUAGVGCNSF-QSFUFRPTSA-N 0.000 description 6
- SVHKVHBPTOMLTO-DCAQKATOSA-N His-Arg-Asp Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O SVHKVHBPTOMLTO-DCAQKATOSA-N 0.000 description 6
- VOEGKUNRHYKYSU-XVYDVKMFSA-N His-Asp-Ala Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O VOEGKUNRHYKYSU-XVYDVKMFSA-N 0.000 description 6
- NNBWMLHQXBTIIT-HVTMNAMFSA-N His-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CN=CN1)N NNBWMLHQXBTIIT-HVTMNAMFSA-N 0.000 description 6
- IMCHNUANCIGUKS-SRVKXCTJSA-N His-Glu-Arg Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O IMCHNUANCIGUKS-SRVKXCTJSA-N 0.000 description 6
- XMENRVZYPBKBIL-AVGNSLFASA-N His-Glu-His Chemical compound N[C@@H](Cc1cnc[nH]1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O XMENRVZYPBKBIL-AVGNSLFASA-N 0.000 description 6
- JCOSMKPAOYDKRO-AVGNSLFASA-N His-Glu-Lys Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N JCOSMKPAOYDKRO-AVGNSLFASA-N 0.000 description 6
- KWBISLAEQZUYIC-UWJYBYFXSA-N His-His-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC2=CN=CN2)N KWBISLAEQZUYIC-UWJYBYFXSA-N 0.000 description 6
- ZSKJIISDJXJQPV-BZSNNMDCSA-N His-Leu-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CN=CN1 ZSKJIISDJXJQPV-BZSNNMDCSA-N 0.000 description 6
- SVVULKPWDBIPCO-BZSNNMDCSA-N His-Phe-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O SVVULKPWDBIPCO-BZSNNMDCSA-N 0.000 description 6
- WCHONUZTYDQMBY-PYJNHQTQSA-N His-Pro-Ile Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WCHONUZTYDQMBY-PYJNHQTQSA-N 0.000 description 6
- DAKSMIWQZPHRIB-BZSNNMDCSA-N His-Tyr-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O DAKSMIWQZPHRIB-BZSNNMDCSA-N 0.000 description 6
- LQSBBHNVAVNZSX-GHCJXIJMSA-N Ile-Ala-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC(=O)N)C(=O)O)N LQSBBHNVAVNZSX-GHCJXIJMSA-N 0.000 description 6
- YOTNPRLPIPHQSB-XUXIUFHCSA-N Ile-Arg-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(=O)O)N YOTNPRLPIPHQSB-XUXIUFHCSA-N 0.000 description 6
- YKRIXHPEIZUDDY-GMOBBJLQSA-N Ile-Asn-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YKRIXHPEIZUDDY-GMOBBJLQSA-N 0.000 description 6
- UMYZBHKAVTXWIW-GMOBBJLQSA-N Ile-Asp-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N UMYZBHKAVTXWIW-GMOBBJLQSA-N 0.000 description 6
- NBJAAWYRLGCJOF-UGYAYLCHSA-N Ile-Asp-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N NBJAAWYRLGCJOF-UGYAYLCHSA-N 0.000 description 6
- RGSOCXHDOPQREB-ZPFDUUQYSA-N Ile-Asp-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N RGSOCXHDOPQREB-ZPFDUUQYSA-N 0.000 description 6
- HGNUKGZQASSBKQ-PCBIJLKTSA-N Ile-Asp-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N HGNUKGZQASSBKQ-PCBIJLKTSA-N 0.000 description 6
- JSZMKEYEVLDPDO-ACZMJKKPSA-N Ile-Cys Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CS)C(O)=O JSZMKEYEVLDPDO-ACZMJKKPSA-N 0.000 description 6
- KUHFPGIVBOCRMV-MNXVOIDGSA-N Ile-Gln-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(C)C)C(=O)O)N KUHFPGIVBOCRMV-MNXVOIDGSA-N 0.000 description 6
- LKACSKJPTFSBHR-MNXVOIDGSA-N Ile-Gln-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N LKACSKJPTFSBHR-MNXVOIDGSA-N 0.000 description 6
- LGMUPVWZEYYUMU-YVNDNENWSA-N Ile-Glu-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N LGMUPVWZEYYUMU-YVNDNENWSA-N 0.000 description 6
- MTFVYKQRLXYAQN-LAEOZQHASA-N Ile-Glu-Gly Chemical compound [H]N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O MTFVYKQRLXYAQN-LAEOZQHASA-N 0.000 description 6
- KFVUBLZRFSVDGO-BYULHYEWSA-N Ile-Gly-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O KFVUBLZRFSVDGO-BYULHYEWSA-N 0.000 description 6
- NYEYYMLUABXDMC-NHCYSSNCSA-N Ile-Gly-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CC(C)C)C(=O)O)N NYEYYMLUABXDMC-NHCYSSNCSA-N 0.000 description 6
- HYLIOBDWPQNLKI-HVTMNAMFSA-N Ile-His-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N HYLIOBDWPQNLKI-HVTMNAMFSA-N 0.000 description 6
- HUWYGQOISIJNMK-SIGLWIIPSA-N Ile-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N HUWYGQOISIJNMK-SIGLWIIPSA-N 0.000 description 6
- PKGGWLOLRLOPGK-XUXIUFHCSA-N Ile-Leu-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N PKGGWLOLRLOPGK-XUXIUFHCSA-N 0.000 description 6
- OUUCIIJSBIBCHB-ZPFDUUQYSA-N Ile-Leu-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O OUUCIIJSBIBCHB-ZPFDUUQYSA-N 0.000 description 6
- ADDYYRVQQZFIMW-MNXVOIDGSA-N Ile-Lys-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ADDYYRVQQZFIMW-MNXVOIDGSA-N 0.000 description 6
- GLYJPWIRLBAIJH-UHFFFAOYSA-N Ile-Lys-Pro Natural products CCC(C)C(N)C(=O)NC(CCCCN)C(=O)N1CCCC1C(O)=O GLYJPWIRLBAIJH-UHFFFAOYSA-N 0.000 description 6
- RVNOXPZHMUWCLW-GMOBBJLQSA-N Ile-Met-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(=O)N)C(=O)O)N RVNOXPZHMUWCLW-GMOBBJLQSA-N 0.000 description 6
- CIDLJWVDMNDKPT-FIRPJDEBSA-N Ile-Phe-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)O)N CIDLJWVDMNDKPT-FIRPJDEBSA-N 0.000 description 6
- ZNOBVZFCHNHKHA-KBIXCLLPSA-N Ile-Ser-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ZNOBVZFCHNHKHA-KBIXCLLPSA-N 0.000 description 6
- ZLFNNVATRMCAKN-ZKWXMUAHSA-N Ile-Ser-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)NCC(=O)O)N ZLFNNVATRMCAKN-ZKWXMUAHSA-N 0.000 description 6
- NAFIFZNBSPWYOO-RWRJDSDZSA-N Ile-Thr-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N NAFIFZNBSPWYOO-RWRJDSDZSA-N 0.000 description 6
- HJDZMPFEXINXLO-QPHKQPEJSA-N Ile-Thr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N HJDZMPFEXINXLO-QPHKQPEJSA-N 0.000 description 6
- JSLIXOUMAOUGBN-JUKXBJQTSA-N Ile-Tyr-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N JSLIXOUMAOUGBN-JUKXBJQTSA-N 0.000 description 6
- DLEBSGAVWRPTIX-PEDHHIEDSA-N Ile-Val-Ile Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)[C@@H](C)CC DLEBSGAVWRPTIX-PEDHHIEDSA-N 0.000 description 6
- SWNRZNLXMXRCJC-VKOGCVSHSA-N Ile-Val-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)[C@@H](C)CC)C(O)=O)=CNC2=C1 SWNRZNLXMXRCJC-VKOGCVSHSA-N 0.000 description 6
- 108010065920 Insulin Lispro Proteins 0.000 description 6
- QLROSWPKSBORFJ-BQBZGAKWSA-N L-Prolyl-L-glutamic acid Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1 QLROSWPKSBORFJ-BQBZGAKWSA-N 0.000 description 6
- 241000880493 Leptailurus serval Species 0.000 description 6
- MJOZZTKJZQFKDK-GUBZILKMSA-N Leu-Ala-Gln Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(N)=O MJOZZTKJZQFKDK-GUBZILKMSA-N 0.000 description 6
- WNGVUZWBXZKQES-YUMQZZPRSA-N Leu-Ala-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O WNGVUZWBXZKQES-YUMQZZPRSA-N 0.000 description 6
- BQSLGJHIAGOZCD-CIUDSAMLSA-N Leu-Ala-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O BQSLGJHIAGOZCD-CIUDSAMLSA-N 0.000 description 6
- NTRAGDHVSGKUSF-AVGNSLFASA-N Leu-Arg-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O NTRAGDHVSGKUSF-AVGNSLFASA-N 0.000 description 6
- YOZCKMXHBYKOMQ-IHRRRGAJSA-N Leu-Arg-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(=O)O)N YOZCKMXHBYKOMQ-IHRRRGAJSA-N 0.000 description 6
- WGNOPSQMIQERPK-UHFFFAOYSA-N Leu-Asn-Pro Natural products CC(C)CC(N)C(=O)NC(CC(=O)N)C(=O)N1CCCC1C(=O)O WGNOPSQMIQERPK-UHFFFAOYSA-N 0.000 description 6
- PVMPDMIKUVNOBD-CIUDSAMLSA-N Leu-Asp-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O PVMPDMIKUVNOBD-CIUDSAMLSA-N 0.000 description 6
- DLCXCECTCPKKCD-GUBZILKMSA-N Leu-Gln-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O DLCXCECTCPKKCD-GUBZILKMSA-N 0.000 description 6
- DZQMXBALGUHGJT-GUBZILKMSA-N Leu-Glu-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O DZQMXBALGUHGJT-GUBZILKMSA-N 0.000 description 6
- YVKSMSDXKMSIRX-GUBZILKMSA-N Leu-Glu-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O YVKSMSDXKMSIRX-GUBZILKMSA-N 0.000 description 6
- OXRLYTYUXAQTHP-YUMQZZPRSA-N Leu-Gly-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](C)C(O)=O OXRLYTYUXAQTHP-YUMQZZPRSA-N 0.000 description 6
- VGPCJSXPPOQPBK-YUMQZZPRSA-N Leu-Gly-Ser Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O VGPCJSXPPOQPBK-YUMQZZPRSA-N 0.000 description 6
- JFSGIJSCJFQGSZ-MXAVVETBSA-N Leu-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC(C)C)N JFSGIJSCJFQGSZ-MXAVVETBSA-N 0.000 description 6
- SEMUSFOBZGKBGW-YTFOTSKYSA-N Leu-Ile-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O SEMUSFOBZGKBGW-YTFOTSKYSA-N 0.000 description 6
- HNDWYLYAYNBWMP-AJNGGQMLSA-N Leu-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(C)C)N HNDWYLYAYNBWMP-AJNGGQMLSA-N 0.000 description 6
- DSFYPIUSAMSERP-IHRRRGAJSA-N Leu-Leu-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DSFYPIUSAMSERP-IHRRRGAJSA-N 0.000 description 6
- IAJFFZORSWOZPQ-SRVKXCTJSA-N Leu-Leu-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O IAJFFZORSWOZPQ-SRVKXCTJSA-N 0.000 description 6
- YOKVEHGYYQEQOP-QWRGUYRKSA-N Leu-Leu-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O YOKVEHGYYQEQOP-QWRGUYRKSA-N 0.000 description 6
- RXGLHDWAZQECBI-SRVKXCTJSA-N Leu-Leu-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O RXGLHDWAZQECBI-SRVKXCTJSA-N 0.000 description 6
- RZXLZBIUTDQHJQ-SRVKXCTJSA-N Leu-Lys-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O RZXLZBIUTDQHJQ-SRVKXCTJSA-N 0.000 description 6
- HVHRPWQEQHIQJF-AVGNSLFASA-N Leu-Lys-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O HVHRPWQEQHIQJF-AVGNSLFASA-N 0.000 description 6
- ZAVCJRJOQKIOJW-KKUMJFAQSA-N Leu-Phe-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)CC1=CC=CC=C1 ZAVCJRJOQKIOJW-KKUMJFAQSA-N 0.000 description 6
- AIRUUHAOKGVJAD-JYJNAYRXSA-N Leu-Phe-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O AIRUUHAOKGVJAD-JYJNAYRXSA-N 0.000 description 6
- DRWMRVFCKKXHCH-BZSNNMDCSA-N Leu-Phe-Leu Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)CC1=CC=CC=C1 DRWMRVFCKKXHCH-BZSNNMDCSA-N 0.000 description 6
- PJWOOBTYQNNRBF-BZSNNMDCSA-N Leu-Phe-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)O)N PJWOOBTYQNNRBF-BZSNNMDCSA-N 0.000 description 6
- YWKNKRAKOCLOLH-OEAJRASXSA-N Leu-Phe-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)O)C(O)=O)CC1=CC=CC=C1 YWKNKRAKOCLOLH-OEAJRASXSA-N 0.000 description 6
- FYPWFNKQVVEELI-ULQDDVLXSA-N Leu-Phe-Val Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=CC=C1 FYPWFNKQVVEELI-ULQDDVLXSA-N 0.000 description 6
- IDGZVZJLYFTXSL-DCAQKATOSA-N Leu-Ser-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IDGZVZJLYFTXSL-DCAQKATOSA-N 0.000 description 6
- IZPVWNSAVUQBGP-CIUDSAMLSA-N Leu-Ser-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O IZPVWNSAVUQBGP-CIUDSAMLSA-N 0.000 description 6
- XOWMDXHFSBCAKQ-SRVKXCTJSA-N Leu-Ser-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC(C)C XOWMDXHFSBCAKQ-SRVKXCTJSA-N 0.000 description 6
- AMSSKPUHBUQBOQ-SRVKXCTJSA-N Leu-Ser-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N AMSSKPUHBUQBOQ-SRVKXCTJSA-N 0.000 description 6
- GOFJOGXGMPHOGL-DCAQKATOSA-N Leu-Ser-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(C)C GOFJOGXGMPHOGL-DCAQKATOSA-N 0.000 description 6
- AEDWWMMHUGYIFD-HJGDQZAQSA-N Leu-Thr-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O AEDWWMMHUGYIFD-HJGDQZAQSA-N 0.000 description 6
- DAYQSYGBCUKVKT-VOAKCMCISA-N Leu-Thr-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O DAYQSYGBCUKVKT-VOAKCMCISA-N 0.000 description 6
- ODRREERHVHMIPT-OEAJRASXSA-N Leu-Thr-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ODRREERHVHMIPT-OEAJRASXSA-N 0.000 description 6
- VJGQRELPQWNURN-JYJNAYRXSA-N Leu-Tyr-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O VJGQRELPQWNURN-JYJNAYRXSA-N 0.000 description 6
- FBNPMTNBFFAMMH-AVGNSLFASA-N Leu-Val-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-AVGNSLFASA-N 0.000 description 6
- FBNPMTNBFFAMMH-UHFFFAOYSA-N Leu-Val-Arg Natural products CC(C)CC(N)C(=O)NC(C(C)C)C(=O)NC(C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-UHFFFAOYSA-N 0.000 description 6
- AIMGJYMCTAABEN-GVXVVHGQSA-N Leu-Val-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O AIMGJYMCTAABEN-GVXVVHGQSA-N 0.000 description 6
- XFIHDSBIPWEYJJ-YUMQZZPRSA-N Lys-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN XFIHDSBIPWEYJJ-YUMQZZPRSA-N 0.000 description 6
- NFLFJGGKOHYZJF-BJDJZHNGSA-N Lys-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN NFLFJGGKOHYZJF-BJDJZHNGSA-N 0.000 description 6
- IXHKPDJKKCUKHS-GARJFASQSA-N Lys-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N IXHKPDJKKCUKHS-GARJFASQSA-N 0.000 description 6
- CKSXSQUVEYCDIW-AVGNSLFASA-N Lys-Arg-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCCCN)N CKSXSQUVEYCDIW-AVGNSLFASA-N 0.000 description 6
- SWWCDAGDQHTKIE-RHYQMDGZSA-N Lys-Arg-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SWWCDAGDQHTKIE-RHYQMDGZSA-N 0.000 description 6
- NCTDKZKNBDZDOL-GARJFASQSA-N Lys-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N)C(=O)O NCTDKZKNBDZDOL-GARJFASQSA-N 0.000 description 6
- DGWXCIORNLWGGG-CIUDSAMLSA-N Lys-Asn-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O DGWXCIORNLWGGG-CIUDSAMLSA-N 0.000 description 6
- LMVOVCYVZBBWQB-SRVKXCTJSA-N Lys-Asp-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN LMVOVCYVZBBWQB-SRVKXCTJSA-N 0.000 description 6
- PHHYNOUOUWYQRO-XIRDDKMYSA-N Lys-Asp-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCCCN)N PHHYNOUOUWYQRO-XIRDDKMYSA-N 0.000 description 6
- OAPNERBWQWUPTI-YUMQZZPRSA-N Lys-Gln Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CCC(N)=O OAPNERBWQWUPTI-YUMQZZPRSA-N 0.000 description 6
- NNCDAORZCMPZPX-GUBZILKMSA-N Lys-Gln-Ser Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N NNCDAORZCMPZPX-GUBZILKMSA-N 0.000 description 6
- LLSUNJYOSCOOEB-GUBZILKMSA-N Lys-Glu-Asp Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O LLSUNJYOSCOOEB-GUBZILKMSA-N 0.000 description 6
- KZOHPCYVORJBLG-AVGNSLFASA-N Lys-Glu-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCCN)N KZOHPCYVORJBLG-AVGNSLFASA-N 0.000 description 6
- GQZMPWBZQALKJO-UWVGGRQHSA-N Lys-Gly-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O GQZMPWBZQALKJO-UWVGGRQHSA-N 0.000 description 6
- XNKDCYABMBBEKN-IUCAKERBSA-N Lys-Gly-Gln Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O XNKDCYABMBBEKN-IUCAKERBSA-N 0.000 description 6
- FHIAJWBDZVHLAH-YUMQZZPRSA-N Lys-Gly-Ser Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O FHIAJWBDZVHLAH-YUMQZZPRSA-N 0.000 description 6
- NNKLKUUGESXCBS-KBPBESRZSA-N Lys-Gly-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O NNKLKUUGESXCBS-KBPBESRZSA-N 0.000 description 6
- SPCHLZUWJTYZFC-IHRRRGAJSA-N Lys-His-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C(C)C)C(O)=O SPCHLZUWJTYZFC-IHRRRGAJSA-N 0.000 description 6
- QBEPTBMRQALPEV-MNXVOIDGSA-N Lys-Ile-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCCCN QBEPTBMRQALPEV-MNXVOIDGSA-N 0.000 description 6
- JYXBNQOKPRQNQS-YTFOTSKYSA-N Lys-Ile-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JYXBNQOKPRQNQS-YTFOTSKYSA-N 0.000 description 6
- ZXFRGTAIIZHNHG-AJNGGQMLSA-N Lys-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CCCCN)N ZXFRGTAIIZHNHG-AJNGGQMLSA-N 0.000 description 6
- ONPDTSFZAIWMDI-AVGNSLFASA-N Lys-Leu-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O ONPDTSFZAIWMDI-AVGNSLFASA-N 0.000 description 6
- SKRGVGLIRUGANF-AVGNSLFASA-N Lys-Leu-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SKRGVGLIRUGANF-AVGNSLFASA-N 0.000 description 6
- XIZQPFCRXLUNMK-BZSNNMDCSA-N Lys-Leu-Phe Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CCCCN)N XIZQPFCRXLUNMK-BZSNNMDCSA-N 0.000 description 6
- YPLVCBKEPJPBDQ-MELADBBJSA-N Lys-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N YPLVCBKEPJPBDQ-MELADBBJSA-N 0.000 description 6
- NVGBPTNZLWRQSY-UWVGGRQHSA-N Lys-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(O)=O)CCCCN NVGBPTNZLWRQSY-UWVGGRQHSA-N 0.000 description 6
- XOQMURBBIXRRCR-SRVKXCTJSA-N Lys-Lys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCN XOQMURBBIXRRCR-SRVKXCTJSA-N 0.000 description 6
- ALGGDNMLQNFVIZ-SRVKXCTJSA-N Lys-Lys-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)O)C(=O)O)N ALGGDNMLQNFVIZ-SRVKXCTJSA-N 0.000 description 6
- YUAXTFMFMOIMAM-QWRGUYRKSA-N Lys-Lys-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O YUAXTFMFMOIMAM-QWRGUYRKSA-N 0.000 description 6
- GAHJXEMYXKLZRQ-AJNGGQMLSA-N Lys-Lys-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O GAHJXEMYXKLZRQ-AJNGGQMLSA-N 0.000 description 6
- HVAUKHLDSDDROB-KKUMJFAQSA-N Lys-Lys-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O HVAUKHLDSDDROB-KKUMJFAQSA-N 0.000 description 6
- KJIXWRWPOCKYLD-IHRRRGAJSA-N Lys-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N KJIXWRWPOCKYLD-IHRRRGAJSA-N 0.000 description 6
- PLDJDCJLRCYPJB-VOAKCMCISA-N Lys-Lys-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PLDJDCJLRCYPJB-VOAKCMCISA-N 0.000 description 6
- QBHGXFQJFPWJIH-XUXIUFHCSA-N Lys-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCCN QBHGXFQJFPWJIH-XUXIUFHCSA-N 0.000 description 6
- GHKXHCMRAUYLBS-CIUDSAMLSA-N Lys-Ser-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O GHKXHCMRAUYLBS-CIUDSAMLSA-N 0.000 description 6
- YFQSSOAGMZGXFT-MEYUZBJRSA-N Lys-Thr-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O YFQSSOAGMZGXFT-MEYUZBJRSA-N 0.000 description 6
- RMKJOQSYLQQRFN-KKUMJFAQSA-N Lys-Tyr-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O RMKJOQSYLQQRFN-KKUMJFAQSA-N 0.000 description 6
- IEIHKHYMBIYQTH-YESZJQIVSA-N Lys-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CCCCN)N)C(=O)O IEIHKHYMBIYQTH-YESZJQIVSA-N 0.000 description 6
- RPWQJSBMXJSCPD-XUXIUFHCSA-N Lys-Val-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CCCCN)C(C)C)C(O)=O RPWQJSBMXJSCPD-XUXIUFHCSA-N 0.000 description 6
- NYTDJEZBAAFLLG-IHRRRGAJSA-N Lys-Val-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(O)=O NYTDJEZBAAFLLG-IHRRRGAJSA-N 0.000 description 6
- OZVXDDFYCQOPFD-XQQFMLRXSA-N Lys-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N OZVXDDFYCQOPFD-XQQFMLRXSA-N 0.000 description 6
- RIPJMCFGQHGHNP-RHYQMDGZSA-N Lys-Val-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CCCCN)N)O RIPJMCFGQHGHNP-RHYQMDGZSA-N 0.000 description 6
- 239000004472 Lysine Substances 0.000 description 6
- QAHFGYLFLVGBNW-DCAQKATOSA-N Met-Ala-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN QAHFGYLFLVGBNW-DCAQKATOSA-N 0.000 description 6
- HDNOQCZWJGGHSS-VEVYYDQMSA-N Met-Asn-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O HDNOQCZWJGGHSS-VEVYYDQMSA-N 0.000 description 6
- WGBMNLCRYKSWAR-DCAQKATOSA-N Met-Asp-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN WGBMNLCRYKSWAR-DCAQKATOSA-N 0.000 description 6
- KQBJYJXPZBNEIK-DCAQKATOSA-N Met-Glu-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCNC(N)=N KQBJYJXPZBNEIK-DCAQKATOSA-N 0.000 description 6
- QGRJTULYDZUBAY-ZPFDUUQYSA-N Met-Ile-Glu Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O QGRJTULYDZUBAY-ZPFDUUQYSA-N 0.000 description 6
- BEZJTLKUMFMITF-AVGNSLFASA-N Met-Lys-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CCCNC(N)=N BEZJTLKUMFMITF-AVGNSLFASA-N 0.000 description 6
- KSIPKXNIQOWMIC-RCWTZXSCSA-N Met-Thr-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCCNC(N)=N KSIPKXNIQOWMIC-RCWTZXSCSA-N 0.000 description 6
- WUGMRIBZSVSJNP-UHFFFAOYSA-N N-L-alanyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C)C(O)=O)=CNC2=C1 WUGMRIBZSVSJNP-UHFFFAOYSA-N 0.000 description 6
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 6
- CGOMLCQJEMWMCE-STQMWFEESA-N Phe-Arg-Gly Chemical compound NC(N)=NCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 CGOMLCQJEMWMCE-STQMWFEESA-N 0.000 description 6
- WMGVYPPIMZPWPN-SRVKXCTJSA-N Phe-Asp-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N WMGVYPPIMZPWPN-SRVKXCTJSA-N 0.000 description 6
- GDBOREPXIRKSEQ-FHWLQOOXSA-N Phe-Gln-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O GDBOREPXIRKSEQ-FHWLQOOXSA-N 0.000 description 6
- JXWLMUIXUXLIJR-QWRGUYRKSA-N Phe-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 JXWLMUIXUXLIJR-QWRGUYRKSA-N 0.000 description 6
- KYYMILWEGJYPQZ-IHRRRGAJSA-N Phe-Glu-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 KYYMILWEGJYPQZ-IHRRRGAJSA-N 0.000 description 6
- KJJROSNFBRWPHS-JYJNAYRXSA-N Phe-Glu-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O KJJROSNFBRWPHS-JYJNAYRXSA-N 0.000 description 6
- KRYSMKKRRRWOCZ-QEWYBTABSA-N Phe-Ile-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O KRYSMKKRRRWOCZ-QEWYBTABSA-N 0.000 description 6
- LRBSWBVUCLLRLU-BZSNNMDCSA-N Phe-Leu-Lys Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)Cc1ccccc1)C(=O)N[C@@H](CCCCN)C(O)=O LRBSWBVUCLLRLU-BZSNNMDCSA-N 0.000 description 6
- BSHMIVKDJQGLNT-ACRUOGEOSA-N Phe-Lys-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 BSHMIVKDJQGLNT-ACRUOGEOSA-N 0.000 description 6
- PTLMYJOMJLTMCB-KKUMJFAQSA-N Phe-Met-Gln Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N PTLMYJOMJLTMCB-KKUMJFAQSA-N 0.000 description 6
- TXJJXEXCZBHDNA-ACRUOGEOSA-N Phe-Phe-His Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)N[C@@H](CC3=CN=CN3)C(=O)O)N TXJJXEXCZBHDNA-ACRUOGEOSA-N 0.000 description 6
- RBRNEFJTEHPDSL-ACRUOGEOSA-N Phe-Phe-Lys Chemical compound C([C@@H](C(=O)N[C@@H](CCCCN)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 RBRNEFJTEHPDSL-ACRUOGEOSA-N 0.000 description 6
- DBNGDEAQXGFGRA-ACRUOGEOSA-N Phe-Tyr-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)N[C@@H](CCCCN)C(=O)O)N DBNGDEAQXGFGRA-ACRUOGEOSA-N 0.000 description 6
- SJRQWEDYTKYHHL-SLFFLAALSA-N Phe-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CC3=CC=CC=C3)N)C(=O)O SJRQWEDYTKYHHL-SLFFLAALSA-N 0.000 description 6
- OBVCYFIHIIYIQF-CIUDSAMLSA-N Pro-Asn-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O OBVCYFIHIIYIQF-CIUDSAMLSA-N 0.000 description 6
- VOHFZDSRPZLXLH-IHRRRGAJSA-N Pro-Asn-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O VOHFZDSRPZLXLH-IHRRRGAJSA-N 0.000 description 6
- UAYHMOIGIQZLFR-NHCYSSNCSA-N Pro-Gln-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O UAYHMOIGIQZLFR-NHCYSSNCSA-N 0.000 description 6
- KIPIKSXPPLABPN-CIUDSAMLSA-N Pro-Glu-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1 KIPIKSXPPLABPN-CIUDSAMLSA-N 0.000 description 6
- VOZIBWWZSBIXQN-SRVKXCTJSA-N Pro-Glu-Lys Chemical compound NCCCC[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1)C(O)=O VOZIBWWZSBIXQN-SRVKXCTJSA-N 0.000 description 6
- BBFRBZYKHIKFBX-GMOBBJLQSA-N Pro-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@@H]1CCCN1 BBFRBZYKHIKFBX-GMOBBJLQSA-N 0.000 description 6
- VZKBJNBZMZHKRC-XUXIUFHCSA-N Pro-Ile-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O VZKBJNBZMZHKRC-XUXIUFHCSA-N 0.000 description 6
- CDGABSWLRMECHC-IHRRRGAJSA-N Pro-Lys-His Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O CDGABSWLRMECHC-IHRRRGAJSA-N 0.000 description 6
- FNGOXVQBBCMFKV-CIUDSAMLSA-N Pro-Ser-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O FNGOXVQBBCMFKV-CIUDSAMLSA-N 0.000 description 6
- RSTWKJFWBKFOFC-JYJNAYRXSA-N Pro-Trp-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(N)=O)C(O)=O RSTWKJFWBKFOFC-JYJNAYRXSA-N 0.000 description 6
- QDDJNKWPTJHROJ-UFYCRDLUSA-N Pro-Tyr-Tyr Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H]1NCCC1)C1=CC=C(O)C=C1 QDDJNKWPTJHROJ-UFYCRDLUSA-N 0.000 description 6
- 108010003201 RGH 0205 Proteins 0.000 description 6
- NLQUOHDCLSFABG-GUBZILKMSA-N Ser-Arg-Arg Chemical compound NC(N)=NCCC[C@H](NC(=O)[C@H](CO)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O NLQUOHDCLSFABG-GUBZILKMSA-N 0.000 description 6
- QGMLKFGTGXWAHF-IHRRRGAJSA-N Ser-Arg-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QGMLKFGTGXWAHF-IHRRRGAJSA-N 0.000 description 6
- ZXLUWXWISXIFIX-ACZMJKKPSA-N Ser-Asn-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZXLUWXWISXIFIX-ACZMJKKPSA-N 0.000 description 6
- BNFVPSRLHHPQKS-WHFBIAKZSA-N Ser-Asp-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O BNFVPSRLHHPQKS-WHFBIAKZSA-N 0.000 description 6
- OLIJLNWFEQEFDM-SRVKXCTJSA-N Ser-Asp-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 OLIJLNWFEQEFDM-SRVKXCTJSA-N 0.000 description 6
- OJPHFSOMBZKQKQ-GUBZILKMSA-N Ser-Gln-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CO OJPHFSOMBZKQKQ-GUBZILKMSA-N 0.000 description 6
- DSGYZICNAMEJOC-AVGNSLFASA-N Ser-Glu-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O DSGYZICNAMEJOC-AVGNSLFASA-N 0.000 description 6
- SNVIOQXAHVORQM-WDSKDSINSA-N Ser-Gly-Gln Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(O)=O SNVIOQXAHVORQM-WDSKDSINSA-N 0.000 description 6
- WEQAYODCJHZSJZ-KKUMJFAQSA-N Ser-His-Tyr Chemical compound C([C@H](NC(=O)[C@H](CO)N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CN=CN1 WEQAYODCJHZSJZ-KKUMJFAQSA-N 0.000 description 6
- JIPVNVNKXJLFJF-BJDJZHNGSA-N Ser-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CO)N JIPVNVNKXJLFJF-BJDJZHNGSA-N 0.000 description 6
- MOINZPRHJGTCHZ-MMWGEVLESA-N Ser-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N MOINZPRHJGTCHZ-MMWGEVLESA-N 0.000 description 6
- ZOPISOXXPQNOCO-SVSWQMSJSA-N Ser-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](CO)N ZOPISOXXPQNOCO-SVSWQMSJSA-N 0.000 description 6
- CRJZZXMAADSBBQ-SRVKXCTJSA-N Ser-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CO CRJZZXMAADSBBQ-SRVKXCTJSA-N 0.000 description 6
- UPLYXVPQLJVWMM-KKUMJFAQSA-N Ser-Phe-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O UPLYXVPQLJVWMM-KKUMJFAQSA-N 0.000 description 6
- ADJDNJCSPNFFPI-FXQIFTODSA-N Ser-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO ADJDNJCSPNFFPI-FXQIFTODSA-N 0.000 description 6
- ILVGMCVCQBJPSH-WDSKDSINSA-N Ser-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](N)CO ILVGMCVCQBJPSH-WDSKDSINSA-N 0.000 description 6
- BEBVVQPDSHHWQL-NRPADANISA-N Ser-Val-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O BEBVVQPDSHHWQL-NRPADANISA-N 0.000 description 6
- LGIMRDKGABDMBN-DCAQKATOSA-N Ser-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CO)N LGIMRDKGABDMBN-DCAQKATOSA-N 0.000 description 6
- FQPQPTHMHZKGFM-XQXXSGGOSA-N Thr-Ala-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O FQPQPTHMHZKGFM-XQXXSGGOSA-N 0.000 description 6
- ZUXQFMVPAYGPFJ-JXUBOQSCSA-N Thr-Ala-Lys Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN ZUXQFMVPAYGPFJ-JXUBOQSCSA-N 0.000 description 6
- XYEXCEPTALHNEV-RCWTZXSCSA-N Thr-Arg-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O XYEXCEPTALHNEV-RCWTZXSCSA-N 0.000 description 6
- CEXFELBFVHLYDZ-XGEHTFHBSA-N Thr-Arg-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O CEXFELBFVHLYDZ-XGEHTFHBSA-N 0.000 description 6
- IRKWVRSEQFTGGV-VEVYYDQMSA-N Thr-Asn-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O IRKWVRSEQFTGGV-VEVYYDQMSA-N 0.000 description 6
- TZKPNGDGUVREEB-FOHZUACHSA-N Thr-Asn-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O TZKPNGDGUVREEB-FOHZUACHSA-N 0.000 description 6
- JMGJDTNUMAZNLX-RWRJDSDZSA-N Thr-Glu-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JMGJDTNUMAZNLX-RWRJDSDZSA-N 0.000 description 6
- AQAMPXBRJJWPNI-JHEQGTHGSA-N Thr-Gly-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O AQAMPXBRJJWPNI-JHEQGTHGSA-N 0.000 description 6
- BVOVIGCHYNFJBZ-JXUBOQSCSA-N Thr-Leu-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O BVOVIGCHYNFJBZ-JXUBOQSCSA-N 0.000 description 6
- MEJHFIOYJHTWMK-VOAKCMCISA-N Thr-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)[C@@H](C)O MEJHFIOYJHTWMK-VOAKCMCISA-N 0.000 description 6
- MECLEFZMPPOEAC-VOAKCMCISA-N Thr-Leu-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N)O MECLEFZMPPOEAC-VOAKCMCISA-N 0.000 description 6
- SCSVNSNWUTYSFO-WDCWCFNPSA-N Thr-Lys-Glu Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O SCSVNSNWUTYSFO-WDCWCFNPSA-N 0.000 description 6
- WRQLCVIALDUQEQ-UNQGMJICSA-N Thr-Phe-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O WRQLCVIALDUQEQ-UNQGMJICSA-N 0.000 description 6
- MFMGPEKYBXFIRF-SUSMZKCASA-N Thr-Thr-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MFMGPEKYBXFIRF-SUSMZKCASA-N 0.000 description 6
- ABCLYRRGTZNIFU-BWAGICSOSA-N Thr-Tyr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N)O ABCLYRRGTZNIFU-BWAGICSOSA-N 0.000 description 6
- OGOYMQWIWHGTGH-KZVJFYERSA-N Thr-Val-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O OGOYMQWIWHGTGH-KZVJFYERSA-N 0.000 description 6
- RSUXQZNWAOTBQF-XIRDDKMYSA-N Trp-Arg-Gln Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N RSUXQZNWAOTBQF-XIRDDKMYSA-N 0.000 description 6
- DLZKEQQWXODGGZ-KWQFWETISA-N Tyr-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 DLZKEQQWXODGGZ-KWQFWETISA-N 0.000 description 6
- AYPAIRCDLARHLM-KKUMJFAQSA-N Tyr-Asn-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O AYPAIRCDLARHLM-KKUMJFAQSA-N 0.000 description 6
- UABYBEBXFFNCIR-YDHLFZDLSA-N Tyr-Asp-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O UABYBEBXFFNCIR-YDHLFZDLSA-N 0.000 description 6
- KIJLSRYAUGGZIN-CFMVVWHZSA-N Tyr-Ile-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O KIJLSRYAUGGZIN-CFMVVWHZSA-N 0.000 description 6
- GULIUBBXCYPDJU-CQDKDKBSSA-N Tyr-Leu-Ala Chemical compound [O-]C(=O)[C@H](C)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]([NH3+])CC1=CC=C(O)C=C1 GULIUBBXCYPDJU-CQDKDKBSSA-N 0.000 description 6
- MVFQLSPDMMFCMW-KKUMJFAQSA-N Tyr-Leu-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O MVFQLSPDMMFCMW-KKUMJFAQSA-N 0.000 description 6
- NKUGCYDFQKFVOJ-JYJNAYRXSA-N Tyr-Leu-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 NKUGCYDFQKFVOJ-JYJNAYRXSA-N 0.000 description 6
- OLYXUGBVBGSZDN-ACRUOGEOSA-N Tyr-Leu-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=C(O)C=C1 OLYXUGBVBGSZDN-ACRUOGEOSA-N 0.000 description 6
- JLKVWTICWVWGSK-JYJNAYRXSA-N Tyr-Lys-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 JLKVWTICWVWGSK-JYJNAYRXSA-N 0.000 description 6
- PMHLLBKTDHQMCY-ULQDDVLXSA-N Tyr-Lys-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O PMHLLBKTDHQMCY-ULQDDVLXSA-N 0.000 description 6
- JXGUUJMPCRXMSO-HJOGWXRNSA-N Tyr-Phe-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 JXGUUJMPCRXMSO-HJOGWXRNSA-N 0.000 description 6
- VBFVQTPETKJCQW-RPTUDFQQSA-N Tyr-Phe-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VBFVQTPETKJCQW-RPTUDFQQSA-N 0.000 description 6
- XGZBEGGGAUQBMB-KJEVXHAQSA-N Tyr-Pro-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC2=CC=C(C=C2)O)N)O XGZBEGGGAUQBMB-KJEVXHAQSA-N 0.000 description 6
- SOAUMCDLIUGXJJ-SRVKXCTJSA-N Tyr-Ser-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O SOAUMCDLIUGXJJ-SRVKXCTJSA-N 0.000 description 6
- TYFLVOUZHQUBGM-IHRRRGAJSA-N Tyr-Ser-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 TYFLVOUZHQUBGM-IHRRRGAJSA-N 0.000 description 6
- ABSXSJZNRAQDDI-KJEVXHAQSA-N Tyr-Val-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ABSXSJZNRAQDDI-KJEVXHAQSA-N 0.000 description 6
- RUCNAYOMFXRIKJ-DCAQKATOSA-N Val-Ala-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN RUCNAYOMFXRIKJ-DCAQKATOSA-N 0.000 description 6
- PVPAOIGJYHVWBT-KKHAAJSZSA-N Val-Asn-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](C(C)C)N)O PVPAOIGJYHVWBT-KKHAAJSZSA-N 0.000 description 6
- ISERLACIZUGCDX-ZKWXMUAHSA-N Val-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N ISERLACIZUGCDX-ZKWXMUAHSA-N 0.000 description 6
- HZYOWMGWKKRMBZ-BYULHYEWSA-N Val-Asp-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N HZYOWMGWKKRMBZ-BYULHYEWSA-N 0.000 description 6
- VUTHNLMCXKLLFI-LAEOZQHASA-N Val-Asp-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N VUTHNLMCXKLLFI-LAEOZQHASA-N 0.000 description 6
- BMGOFDMKDVVGJG-NHCYSSNCSA-N Val-Asp-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N BMGOFDMKDVVGJG-NHCYSSNCSA-N 0.000 description 6
- NYTKXWLZSNRILS-IFFSRLJSSA-N Val-Gln-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](C(C)C)N)O NYTKXWLZSNRILS-IFFSRLJSSA-N 0.000 description 6
- UPJONISHZRADBH-XPUUQOCRSA-N Val-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(O)=O)CCC(O)=O UPJONISHZRADBH-XPUUQOCRSA-N 0.000 description 6
- GBESYURLQOYWLU-LAEOZQHASA-N Val-Glu-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N GBESYURLQOYWLU-LAEOZQHASA-N 0.000 description 6
- ZXAGTABZUOMUDO-GVXVVHGQSA-N Val-Glu-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N ZXAGTABZUOMUDO-GVXVVHGQSA-N 0.000 description 6
- YTPLVNUZZOBFFC-SCZZXKLOSA-N Val-Gly-Pro Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N1CCC[C@@H]1C(O)=O YTPLVNUZZOBFFC-SCZZXKLOSA-N 0.000 description 6
- KZKMBGXCNLPYKD-YEPSODPASA-N Val-Gly-Thr Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O KZKMBGXCNLPYKD-YEPSODPASA-N 0.000 description 6
- APQIVBCUIUDSMB-OSUNSFLBSA-N Val-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](C(C)C)N APQIVBCUIUDSMB-OSUNSFLBSA-N 0.000 description 6
- AGXGCFSECFQMKB-NHCYSSNCSA-N Val-Leu-Asp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N AGXGCFSECFQMKB-NHCYSSNCSA-N 0.000 description 6
- SYSWVVCYSXBVJG-RHYQMDGZSA-N Val-Leu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C(C)C)N)O SYSWVVCYSXBVJG-RHYQMDGZSA-N 0.000 description 6
- WBAJDGWKRIHOAC-GVXVVHGQSA-N Val-Lys-Gln Chemical compound [H]N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O WBAJDGWKRIHOAC-GVXVVHGQSA-N 0.000 description 6
- ZRSZTKTVPNSUNA-IHRRRGAJSA-N Val-Lys-Leu Chemical compound CC(C)C[C@H](NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)C(C)C)C(O)=O ZRSZTKTVPNSUNA-IHRRRGAJSA-N 0.000 description 6
- UEPLNXPLHJUYPT-AVGNSLFASA-N Val-Met-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(O)=O UEPLNXPLHJUYPT-AVGNSLFASA-N 0.000 description 6
- GQMNEJMFMCJJTD-NHCYSSNCSA-N Val-Pro-Gln Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O GQMNEJMFMCJJTD-NHCYSSNCSA-N 0.000 description 6
- MIAZWUMFUURQNP-YDHLFZDLSA-N Val-Tyr-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N MIAZWUMFUURQNP-YDHLFZDLSA-N 0.000 description 6
- 108010008685 alanyl-glutamyl-aspartic acid Proteins 0.000 description 6
- 108010024078 alanyl-glycyl-serine Proteins 0.000 description 6
- 108010086434 alanyl-seryl-glycine Proteins 0.000 description 6
- 108010005233 alanylglutamic acid Proteins 0.000 description 6
- 108010044940 alanylglutamine Proteins 0.000 description 6
- 108010070944 alanylhistidine Proteins 0.000 description 6
- 108010087924 alanylproline Proteins 0.000 description 6
- 108010070783 alanyltyrosine Proteins 0.000 description 6
- 108010080488 arginyl-arginyl-leucine Proteins 0.000 description 6
- 108010008355 arginyl-glutamine Proteins 0.000 description 6
- 108010001271 arginyl-glutamyl-arginine Proteins 0.000 description 6
- 108010059459 arginyl-threonyl-phenylalanine Proteins 0.000 description 6
- 108010084758 arginyl-tyrosyl-aspartic acid Proteins 0.000 description 6
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 6
- 108010093581 aspartyl-proline Proteins 0.000 description 6
- 108010038633 aspartylglutamate Proteins 0.000 description 6
- 108010068265 aspartyltyrosine Proteins 0.000 description 6
- FSXRLASFHBWESK-UHFFFAOYSA-N dipeptide phenylalanyl-tyrosine Natural products C=1C=C(O)C=CC=1CC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FSXRLASFHBWESK-UHFFFAOYSA-N 0.000 description 6
- 108010057083 glutamyl-aspartyl-leucine Proteins 0.000 description 6
- 108010079547 glutamylmethionine Proteins 0.000 description 6
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 6
- 108010000434 glycyl-alanyl-leucine Proteins 0.000 description 6
- 108010050475 glycyl-leucyl-tyrosine Proteins 0.000 description 6
- 108010037850 glycylvaline Proteins 0.000 description 6
- 108010085325 histidylproline Proteins 0.000 description 6
- 108010027338 isoleucylcysteine Proteins 0.000 description 6
- 108010078274 isoleucylvaline Proteins 0.000 description 6
- 108010044311 leucyl-glycyl-glycine Proteins 0.000 description 6
- 108010030617 leucyl-phenylalanyl-valine Proteins 0.000 description 6
- 108010087810 leucyl-seryl-glutamyl-leucine Proteins 0.000 description 6
- 108010003700 lysyl aspartic acid Proteins 0.000 description 6
- 108010044348 lysyl-glutamyl-aspartic acid Proteins 0.000 description 6
- 108010045397 lysyl-tyrosyl-lysine Proteins 0.000 description 6
- 108010009298 lysylglutamic acid Proteins 0.000 description 6
- 108010064235 lysylglycine Proteins 0.000 description 6
- 108010038320 lysylphenylalanine Proteins 0.000 description 6
- 108010064486 phenylalanyl-leucyl-valine Proteins 0.000 description 6
- 108010051242 phenylalanylserine Proteins 0.000 description 6
- 108010025488 pinealon Proteins 0.000 description 6
- 108010031719 prolyl-serine Proteins 0.000 description 6
- 108010070643 prolylglutamic acid Proteins 0.000 description 6
- 108010015796 prolylisoleucine Proteins 0.000 description 6
- RXWNCPJZOCPEPQ-NVWDDTSBSA-N puromycin Chemical compound C1=CC(OC)=CC=C1C[C@H](N)C(=O)N[C@H]1[C@@H](O)[C@H](N2C3=NC=NC(=C3N=C2)N(C)C)O[C@@H]1CO RXWNCPJZOCPEPQ-NVWDDTSBSA-N 0.000 description 6
- 108010069117 seryl-lysyl-aspartic acid Proteins 0.000 description 6
- 108010031491 threonyl-lysyl-glutamic acid Proteins 0.000 description 6
- 108010005834 tyrosyl-alanyl-glycine Proteins 0.000 description 6
- 108010012567 tyrosyl-glycyl-glycyl-phenylalanyl Proteins 0.000 description 6
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Natural products CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 description 5
- ZDXPYRJPNDTMRX-UHFFFAOYSA-N glutamine Natural products OC(=O)C(N)CCC(N)=O ZDXPYRJPNDTMRX-UHFFFAOYSA-N 0.000 description 5
- 239000002609 medium Substances 0.000 description 5
- 239000004474 valine Substances 0.000 description 5
- 239000004475 Arginine Substances 0.000 description 4
- HRVQDZOWMLFAOD-BIIVOSGPSA-N Asp-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CC(=O)O)N)C(=O)O HRVQDZOWMLFAOD-BIIVOSGPSA-N 0.000 description 4
- HVYWQYLBVXMXSV-GUBZILKMSA-N Glu-Leu-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O HVYWQYLBVXMXSV-GUBZILKMSA-N 0.000 description 4
- DNPCBMNFQVTHMA-DCAQKATOSA-N Glu-Leu-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O DNPCBMNFQVTHMA-DCAQKATOSA-N 0.000 description 4
- KQDMENMTYNBWMR-WHFBIAKZSA-N Gly-Asp-Ala Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O KQDMENMTYNBWMR-WHFBIAKZSA-N 0.000 description 4
- YRRCOJOXAJNSAX-IHRRRGAJSA-N Leu-Pro-Lys Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)O)N YRRCOJOXAJNSAX-IHRRRGAJSA-N 0.000 description 4
- CHJKEDSZNSONPS-DCAQKATOSA-N Leu-Pro-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O CHJKEDSZNSONPS-DCAQKATOSA-N 0.000 description 4
- GGAPIOORBXHMNY-ULQDDVLXSA-N Lys-Arg-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCCCN)N)O GGAPIOORBXHMNY-ULQDDVLXSA-N 0.000 description 4
- QZONCCHVHCOBSK-YUMQZZPRSA-N Lys-Gly-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O QZONCCHVHCOBSK-YUMQZZPRSA-N 0.000 description 4
- VVURYEVJJTXWNE-ULQDDVLXSA-N Lys-Tyr-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O VVURYEVJJTXWNE-ULQDDVLXSA-N 0.000 description 4
- DZZCICYRSZASNF-FXQIFTODSA-N Pro-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 DZZCICYRSZASNF-FXQIFTODSA-N 0.000 description 4
- QHDXUYOYTPWCSK-RCOVLWMOSA-N Val-Asp-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)NCC(=O)O)N QHDXUYOYTPWCSK-RCOVLWMOSA-N 0.000 description 4
- 108010047495 alanylglycine Proteins 0.000 description 4
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 4
- 235000003704 aspartic acid Nutrition 0.000 description 4
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 description 4
- 238000002474 experimental method Methods 0.000 description 4
- 229930024421 Adenine Natural products 0.000 description 3
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 3
- 239000012097 Lipofectamine 2000 Substances 0.000 description 3
- WWEWGPOLIJXGNX-XUXIUFHCSA-N Lys-Met-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CCCCN)N WWEWGPOLIJXGNX-XUXIUFHCSA-N 0.000 description 3
- 238000012408 PCR amplification Methods 0.000 description 3
- 108010006785 Taq Polymerase Proteins 0.000 description 3
- 229960000643 adenine Drugs 0.000 description 3
- 150000001413 amino acids Chemical group 0.000 description 3
- 230000005782 double-strand break Effects 0.000 description 3
- 238000000605 extraction Methods 0.000 description 3
- 229920001184 polypeptide Polymers 0.000 description 3
- 108090000765 processed proteins & peptides Proteins 0.000 description 3
- 102000004196 processed proteins & peptides Human genes 0.000 description 3
- 108010060175 trypsinogen activation peptide Proteins 0.000 description 3
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 2
- 108010052875 Adenine deaminase Proteins 0.000 description 2
- FVBZXNSRIDVYJS-AVGNSLFASA-N Arg-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCN=C(N)N FVBZXNSRIDVYJS-AVGNSLFASA-N 0.000 description 2
- 238000010453 CRISPR/Cas method Methods 0.000 description 2
- 108010080611 Cytosine Deaminase Proteins 0.000 description 2
- 102000000311 Cytosine Deaminase Human genes 0.000 description 2
- 230000033616 DNA repair Effects 0.000 description 2
- KDXKFBSNIJYNNR-YVNDNENWSA-N Gln-Glu-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KDXKFBSNIJYNNR-YVNDNENWSA-N 0.000 description 2
- VBOBNHSVQKKTOT-YUMQZZPRSA-N Gly-Lys-Ala Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O VBOBNHSVQKKTOT-YUMQZZPRSA-N 0.000 description 2
- DMHGKBGOUAJRHU-UHFFFAOYSA-N Ile-Arg-Pro Natural products CCC(C)C(N)C(=O)NC(CCCN=C(N)N)C(=O)N1CCCC1C(O)=O DMHGKBGOUAJRHU-UHFFFAOYSA-N 0.000 description 2
- HMZPYMSEAALNAE-ULQDDVLXSA-N Lys-Val-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O HMZPYMSEAALNAE-ULQDDVLXSA-N 0.000 description 2
- NHDVNAKDACFHPX-GUBZILKMSA-N Pro-Arg-Ala Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O NHDVNAKDACFHPX-GUBZILKMSA-N 0.000 description 2
- GBIUHAYJGWVNLN-AEJSXWLSSA-N Val-Ser-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N GBIUHAYJGWVNLN-AEJSXWLSSA-N 0.000 description 2
- GBIUHAYJGWVNLN-UHFFFAOYSA-N Val-Ser-Pro Natural products CC(C)C(N)C(=O)NC(CO)C(=O)N1CCCC1C(O)=O GBIUHAYJGWVNLN-UHFFFAOYSA-N 0.000 description 2
- 238000013461 design Methods 0.000 description 2
- 238000005516 engineering process Methods 0.000 description 2
- 239000012634 fragment Substances 0.000 description 2
- 108010053037 kyotorphin Proteins 0.000 description 2
- 239000000463 material Substances 0.000 description 2
- 238000013518 transcription Methods 0.000 description 2
- 230000035897 transcription Effects 0.000 description 2
- SDHFVYLZFBDSQT-DCAQKATOSA-N Asp-Arg-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(=O)O)N SDHFVYLZFBDSQT-DCAQKATOSA-N 0.000 description 1
- LKIYSIYBKYLKPU-BIIVOSGPSA-N Asp-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)O)N)C(=O)O LKIYSIYBKYLKPU-BIIVOSGPSA-N 0.000 description 1
- 241000894006 Bacteria Species 0.000 description 1
- 108091003079 Bovine Serum Albumin Proteins 0.000 description 1
- 241000195493 Cryptophyta Species 0.000 description 1
- 239000006144 Dulbecco’s modified Eagle's medium Substances 0.000 description 1
- 241000588724 Escherichia coli Species 0.000 description 1
- 241000233866 Fungi Species 0.000 description 1
- 108700028146 Genetic Enhancer Elements Proteins 0.000 description 1
- 208000026350 Inborn Genetic disease Diseases 0.000 description 1
- FZUNSVYYPYJYAP-NAKRPEOUSA-N Met-Ile-Ala Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O FZUNSVYYPYJYAP-NAKRPEOUSA-N 0.000 description 1
- ONIBWKKTOPOVIA-UHFFFAOYSA-N Proline Natural products OC(=O)C1CCCN1 ONIBWKKTOPOVIA-UHFFFAOYSA-N 0.000 description 1
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 1
- VTIAEOKFUJJBTC-YDHLFZDLSA-N Val-Tyr-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N VTIAEOKFUJJBTC-YDHLFZDLSA-N 0.000 description 1
- 230000006978 adaptation Effects 0.000 description 1
- 230000000295 complement effect Effects 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 239000003937 drug carrier Substances 0.000 description 1
- 239000013604 expression vector Substances 0.000 description 1
- 239000012091 fetal bovine serum Substances 0.000 description 1
- 208000016361 genetic disease Diseases 0.000 description 1
- 238000011090 industrial biotechnology method and process Methods 0.000 description 1
- 230000000977 initiatory effect Effects 0.000 description 1
- 238000003780 insertion Methods 0.000 description 1
- 230000037431 insertion Effects 0.000 description 1
- 108010053725 prolylvaline Proteins 0.000 description 1
- 230000002797 proteolythic effect Effects 0.000 description 1
- 238000012113 quantitative test Methods 0.000 description 1
- 230000006798 recombination Effects 0.000 description 1
- 238000005215 recombination Methods 0.000 description 1
- 230000013120 recombinational repair Effects 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- 239000013603 viral vector Substances 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/85—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K48/00—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
- A61K48/005—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'active' part of the composition delivered, i.e. the nucleic acid delivered
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/102—Mutagenizing nucleic acids
- C12N15/1024—In vivo mutagenesis using high mutation rate "mutator" host strains by inserting genetic material, e.g. encoding an error prone polymerase, disrupting a gene for mismatch repair
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases RNAses, DNAses
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2800/00—Nucleic acids vectors
- C12N2800/10—Plasmid DNA
- C12N2800/106—Plasmid DNA for vertebrates
- C12N2800/107—Plasmid DNA for vertebrates for mammalian
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Genetics & Genomics (AREA)
- Chemical & Material Sciences (AREA)
- Engineering & Computer Science (AREA)
- Organic Chemistry (AREA)
- Wood Science & Technology (AREA)
- Biotechnology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Zoology (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Biomedical Technology (AREA)
- General Health & Medical Sciences (AREA)
- Microbiology (AREA)
- Biochemistry (AREA)
- Plant Pathology (AREA)
- Biophysics (AREA)
- Physics & Mathematics (AREA)
- Medicinal Chemistry (AREA)
- Crystallography & Structural Chemistry (AREA)
- Pharmacology & Pharmacy (AREA)
- Epidemiology (AREA)
- Animal Behavior & Ethology (AREA)
- Public Health (AREA)
- Veterinary Medicine (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
Abstract
The application discloses a low off-target base editor and its construction. The disclosed base editor contains or expresses a Cas9 mutant, which is a mutant protein obtained by mutating amino acid residues between positions 1010 and 1031 of Cas9. By modifying base editors (BEs) that contain Streptococcus pyogenes Cas9 or mutants thereof, the application constructs low off-target base editors, including low off-target CBE, low off-target ABE and low off-target GBE, which can markedly reduce the off-target activity of base editors in mammalian cells. These base editors have broad application prospects in gene therapy, drug screening, the construction of animal and plant models, and the like.
Description
Technical Field
The application relates to the technical field of biotechnology and gene editing, in particular to a low off-target base editor and construction thereof.
Background
Genome editing refers to the designed, efficient modification of cells at the genome level. CRISPR/Cas9 genome editing is simple to design, convenient to operate and highly efficient, and has been applied successfully to genome editing research in a variety of target cells. CRISPR/Cas9 genome editing mainly uses guide RNAs (gRNAs) to direct the Cas9 protein to cut precisely at a target site in the genome, producing DNA double-strand breaks (DSBs). The host cell then repairs the break by its own non-homologous end joining (NHEJ) or by homology-directed repair (HDR). Specific editing of a single base is difficult to achieve in this way: the outcome of a DNA double-strand break is highly uncertain, the probability of HDR is low, and NHEJ causes random insertions or deletions of bases. Conventional CRISPR/Cas techniques therefore have clear drawbacks for single-base gene editing.
The CRISPR base editor (BE) overcomes this shortcoming of conventional CRISPR/Cas technology. It is an editor formed mainly by fusing a deaminase with either Cas9 lacking DNA cutting activity (dCas9) or Cas9 with single-strand DNA nicking activity (nCas9). Under the guidance of a gRNA it can introduce a precise point mutation at a target site, and it has great application prospects in the treatment of genetic diseases caused by gene mutations. Existing base editors mainly comprise: the cytosine base editor (CBE), which converts cytosine nucleotides within the editing window of the target sequence to thymine nucleotides (C > T); the adenine base editor (ABE), which converts adenine nucleotides within the editing window to guanine nucleotides (A > G); and the more recent glycosylase base editor (GBE), which edits cytosine nucleotides to adenine nucleotides (C > A) in E. coli and specifically to guanine nucleotides (C > G) in mammalian cells.
CRISPR/Cas9 has a serious off-target effect: it can cut double-stranded DNA at unintended genomic sites, which creates potential risks and is a major factor limiting the clinical application of CRISPR/Cas9 gene editing. A number of clinical trials based on CRISPR/Cas9 gene editing have been started in China and abroad, so reducing off-target effects is an urgent problem. CRISPR/Cas9 identifies the site to be edited mainly through complementary base pairing between the targeting sequence of the gRNA and the target DNA. However, the Cas9 enzyme can sometimes still edit when the targeting sequence of the gRNA does not perfectly match the genomic DNA, which results in off-target effects. High-fidelity Cas9 proteins (Cas9-HF1, HypaCas9, evoCas9, etc.) constructed by protein engineering or directed evolution can reduce off-target editing, but they also reduce editing efficiency at the target site.
The CRISPR base editor mainly uses a dCas9 or nCas9 protein together with a gRNA to identify the genomic target site, deaminates the target base with a deaminase, and then relies on the DNA repair system of the cell to complete base editing at the target site. As with CRISPR/Cas9 genome editing, off-target editing can occur at incompletely matched sites.
Disclosure of Invention
The application aims to solve the technical problem of reducing off-target of a base editor.
In order to solve the above technical problem, the present application first provides a base editor that contains or expresses a Cas9 mutant, wherein the Cas9 mutant is a mutant protein obtained by mutating amino acid residues between positions 1010 and 1031 of Cas9, or a mutant protein obtained by mutating the amino acid residues at positions 1010, 1013, 1014, 1016, 1018, 1019, 1027 and/or 1031 of Cas9, or a mutant protein obtained by introducing any one or more of the following mutations into Cas9:
M1) mutating the tyrosine residue at position 1010 of Cas9 to an aspartic acid residue;
M2) mutating the tyrosine residue at position 1013 of Cas9 to an aspartic acid residue;
M3) mutating the lysine residue at position 1014 of Cas9 to a proline residue;
M4) mutating the tyrosine residue at position 1016 of Cas9 to an aspartic acid residue;
M5) mutating the valine residue at position 1018 of Cas9 to an aspartic acid residue;
M6) mutating the arginine residue at position 1019 of Cas9 to an aspartic acid residue;
M7) mutating the glutamine residue at position 1027 of Cas9 to an aspartic acid residue;
M8) mutating the lysine residue at position 1031 of Cas9 to an aspartic acid residue.
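As an illustration only, the eight substitutions listed above can be written as a lookup table and applied to a protein string. The sketch below is not part of the application: the function names are hypothetical, and the input is an artificial placeholder of 1368 residues (the length of Streptococcus pyogenes Cas9) carrying the expected wild-type residue only at the eight listed positions, not the protein of sequence 2 or sequence 6.

```python
# Position (1-based), wild-type residue and mutant residue for each listed substitution.
MUTATIONS = {
    "M1": (1010, "Y", "D"),
    "M2": (1013, "Y", "D"),
    "M3": (1014, "K", "P"),
    "M4": (1016, "Y", "D"),
    "M5": (1018, "V", "D"),
    "M6": (1019, "R", "D"),
    "M7": (1027, "Q", "D"),
    "M8": (1031, "K", "D"),
}


def apply_mutations(protein, names):
    """Return a copy of `protein` with the named substitutions applied.

    Raises ValueError if the residue found at a position is not the
    expected wild-type residue, which guards against numbering mistakes.
    """
    residues = list(protein)
    for name in names:
        position, wild_type, mutant = MUTATIONS[name]
        found = residues[position - 1]  # convert 1-based position to 0-based index
        if found != wild_type:
            raise ValueError(
                f"{name}: expected {wild_type} at position {position}, found {found}"
            )
        residues[position - 1] = mutant
    return "".join(residues)


def make_placeholder(length=1368):
    """Build an artificial sequence (assumption, not real Cas9): alanine
    everywhere except the wild-type residues at the eight listed positions."""
    residues = ["A"] * length
    for position, wild_type, _ in MUTATIONS.values():
        residues[position - 1] = wild_type
    return "".join(residues)


if __name__ == "__main__":
    wild_type_seq = make_placeholder()
    mutant_seq = apply_mutations(wild_type_seq, list(MUTATIONS))
    # Print the 1010-1031 window before and after substitution.
    print(wild_type_seq[1009:1031])
    print(mutant_seq[1009:1031])
```

With a real Cas9 sequence in place of the placeholder, the wild-type check would fail loudly if the residue numbering differed from the one used in the list.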
In the above base editor, the Cas9 may be a protein represented by sequence 2 or sequence 6 in the sequence table.
In the above base editor, the Cas9 mutant may be a mutant protein obtained by introducing into Cas9 the seven mutations M1), M2), M4), M5), M6), M7) and M8), or a mutant protein obtained by introducing into Cas9 the five mutations M2), M5), M7) and M8), or a mutant protein obtained by introducing into Cas9 the eight mutations M1) to M8).
The above base editor may also contain or express an sgRNA targeting the target sequence and/or a domain having base modification activity.
The domain having base modification activity may be a domain having deaminase activity, or a mutant, homolog or polypeptide that has, or partially has, deaminase activity. Specifically, the domain having base modification activity is an adenine deaminase, or a mutant, homolog or polypeptide fragment that has, or partially has, adenine deaminase activity; or the domain having base modification activity is a cytosine deaminase, or a mutant, homolog or polypeptide fragment that has, or partially has, cytosine deaminase activity.
In particular, the base editor may also contain the components of a base editor other than Cas9 or its mutant, such as an sgRNA targeting the target sequence and/or a domain having base modification activity. Further, the base editor may contain the components of a CBE base editor (e.g., BE4max or hyA3A-BE4max) other than Cas9 or its mutant, or the components of an ABE base editor (e.g., NG-ABEmax or ABE8e) other than Cas9 or its mutant, or the components of a GBE base editor (e.g., APOBEC_nCas9_Ung) other than Cas9 or its mutant.
The Cas9 mutants also fall within the scope of the present application.
The application also provides a biological material associated with the Cas9 mutant, which is any one of the following B1) to B5):
B1) a nucleic acid molecule encoding the Cas9 mutant;
B2) an expression cassette comprising the nucleic acid molecule of B1);
B3) a recombinant vector comprising the nucleic acid molecule of B1), or a recombinant vector comprising the expression cassette of B2);
B4) a recombinant microorganism comprising the nucleic acid molecule of B1), or a recombinant microorganism comprising the expression cassette of B2), or a recombinant microorganism comprising the recombinant vector of B3);
B5) a cell line containing the nucleic acid molecule of B1), or a cell line containing the expression cassette of B2).
In the above biological material, the nucleic acid molecule of B1) may be a mutant gene obtained by mutating the nucleotide sequence between positions 3028 and 3093 of the Cas9 gene.
Further, the nucleic acid molecule of B1) may be a nucleic acid molecule obtained by mutating positions 3028 to 3093 of the Cas9 gene shown in sequence 1 in the sequence listing to sequence 10 or sequence 11, or a nucleic acid molecule obtained by mutating positions 3028 to 3093 of the Cas9 gene shown in sequence 5 in the sequence listing to sequence 12, or a nucleic acid molecule obtained by mutating positions 3028 to 3093 of the Cas9 gene shown in sequence 9 in the sequence listing to sequence 13 or sequence 14.
In the above biological material, the expression cassette of B2) containing the nucleic acid molecule encoding the Cas9 mutant (the Cas9 mutant gene expression cassette) refers to DNA capable of expressing the Cas9 mutant in a host cell; the DNA may include not only a promoter that initiates transcription of the Cas9 mutant gene but also a terminator that terminates its transcription. Further, the expression cassette may also include an enhancer sequence.
Recombinant vectors containing the Cas9 mutant gene expression cassettes can be constructed using existing expression vectors.
In the above biological material, the vector may be a plasmid, cosmid, phage or viral vector.
The recombinant vector of B3) may be BE4maxM, BE4maxM2, NG-ABEmaxM, ABE8eM, APOBEC_nCas9_UngM2 or APOBEC_nCas9_UngM3;
BE4maxM is a mutant plasmid obtained by mutating positions 3028 to 3093 of the Cas9 gene (sequence 1) in the BE4max plasmid to sequence 10;
BE4maxM2 is a mutant plasmid obtained by mutating positions 3028 to 3093 of the Cas9 gene (sequence 1) in the BE4max plasmid to sequence 11;
NG-ABEmaxM is a mutant plasmid obtained by mutating positions 3028 to 3093 of the Cas9 gene (sequence 5) in the NG-ABEmax plasmid to sequence 12;
ABE8eM is a mutant plasmid obtained by mutating positions 3028 to 3093 of the Cas9 gene (sequence 1) in the ABE8e plasmid to sequence 10;
APOBEC_nCas9_UngM2 is a mutant plasmid obtained by mutating positions 3028 to 3093 of the Cas9 gene (sequence 9) in the APOBEC_nCas9_Ung plasmid to sequence 13;
APOBEC_nCas9_UngM3 is a mutant plasmid obtained by mutating positions 3028 to 3093 of the Cas9 gene (sequence 9) in the APOBEC_nCas9_Ung plasmid to sequence 14.
In the above biological material, the microorganism may be a yeast, bacterium, alga or fungus.
In the above biological material, the cell line does not include propagation material.
The application also provides a product comprising the base editor, or the Cas9 mutant, or the biological material.
The product may also contain one or more pharmaceutically acceptable carriers.
The application also provides use of the base editor, the Cas9 mutant, the biological material or the product in any of the following:
y1) converting a cytosine nucleotide residue in a biological cell or animal or subject to a thymine nucleotide residue;
y2) converting an adenine nucleotide residue to a guanine nucleotide residue in a biological cell or animal or subject;
y3) converting cytosine nucleotide residues in a biological cell or animal or subject to guanine nucleotide residues;
y4) preparing a product for converting a cytosine nucleotide residue to a thymine nucleotide residue in a biological cell or animal or subject;
y5) preparing a product for converting an adenine nucleotide residue to a guanine nucleotide residue in a biological cell or animal or subject;
y6) preparing a product for converting cytosine nucleotide residues in a biological cell or animal or subject to guanine nucleotide residues;
y7) as or preparing a single base editing reagent or kit;
y8) as or for the preparation of a medicament for gene therapy;
y9) treating or preventing a disease;
y10) for the preparation of a product for the treatment or prophylaxis of a disease.
The application also provides any one of the following methods:
x1) a method of converting a cytosine nucleotide residue to a thymine nucleotide residue in a biological cell or animal or subject, comprising: performing base editing on the target cytosine nucleotide residue by using the base editor, the Cas9 mutant, the biological material or the product to realize conversion of the target cytosine nucleotide residue into thymine nucleotide residue;
x2) a method of converting an adenine nucleotide residue to a guanine nucleotide residue in a biological cell or animal or subject, comprising: performing base editing on the target adenine nucleotide residue by using the base editor, the Cas9 mutant, the biological material or the product to realize the conversion of the target adenine nucleotide residue into guanine nucleotide residue;
x3) a method of converting a cytosine nucleotide residue to a guanine nucleotide residue in a biological cell or animal or subject, comprising: performing base editing on the target cytosine nucleotide residue by using the base editor, the Cas9 mutant, the biological material or the product to realize the conversion of the target cytosine nucleotide residue into guanine nucleotide residue.
In the above, the biological cell may be a mammalian cell; the animal may be a mammal.
The application constructs low off-target base editors, including a low off-target CBE, a low off-target ABE and a low off-target GBE, by modifying base editors (BE) containing Streptococcus pyogenes Cas9 or mutants thereof; these editors markedly reduce the off-target activity of base editing in mammalian cells. The base editors have broad application prospects in gene therapy, drug screening, construction of animal and plant models, and the like.
Detailed Description
The following detailed description of the application is provided in connection with the accompanying drawings that are presented to illustrate the application and not to limit the scope thereof. The examples provided below are intended as guidelines for further modifications by one of ordinary skill in the art and are not to be construed as limiting the application in any way.
The experimental methods in the following examples, unless otherwise specified, are conventional methods, and are carried out according to techniques or conditions described in the literature in the field or according to the product specifications. Materials, reagents, instruments and the like used in the examples described below are commercially available unless otherwise specified. The quantitative tests in the following examples were all set up in triplicate and the results averaged. In the following examples, unless otherwise specified, the 1 st position of each nucleotide sequence in the sequence listing is the 5 'terminal nucleotide of the corresponding DNA/RNA, and the last position is the 3' terminal nucleotide of the corresponding DNA/RNA.
The medium used hereinafter was DMEM medium (Thermo Fisher) containing 10% fetal bovine serum (Gibco).
Example 1: Preparation and application of low off-target CBE genome editor
In this example, Y (tyrosine) at position 1010, Y (tyrosine) at position 1013, Y (tyrosine) at position 1016, V (valine) at position 1018, R (arginine) at position 1019, Q (glutamine) at position 1027 and K (lysine) at position 1031 of the nCas9 in the commonly used CBE base editors BE4max (Addgene: 112093) and hyA3A-BE4max (Addgene: 157943) were each mutated to D (aspartic acid) (the mutated nCas9 protein is designated nCas9-M), giving BE4maxM and hyA3A-BE4maxM, respectively. BE4maxM2 was constructed by mutating Y (tyrosine) at position 1010, Y (tyrosine) at position 1013, V (valine) at position 1018, Q (glutamine) at position 1027 and K (lysine) at position 1031 of the nCas9 in BE4max (Addgene: 112093) to D (aspartic acid) (the mutated nCas9 protein is designated nCas9-M2). The sequences of nCas9, nCas9-M and nCas9-M2 are sequence 2, sequence 3 and sequence 4 in the sequence listing, respectively.
In mammalian cells, the genomic RNF2 site was base-edited with BE4max, BE4maxM, hyA3A-BE4max, hyA3A-BE4maxM and BE4maxM2, each with mismatched and mismatch-free gRNAs. The results were as follows: when a mismatch-free gRNA was used at the editing site, the editing efficiencies of BE4max and BE4maxM2 did not differ significantly, nor did those of hyA3A-BE4max and hyA3A-BE4maxM; when a mismatched gRNA was used at the editing site, the editing efficiency of BE4maxM was significantly lower than that of BE4max, the editing efficiency of BE4maxM2 was significantly lower than that of BE4max, and the editing efficiency of hyA3A-BE4maxM was significantly lower than that of hyA3A-BE4max.
The experimental procedure was as follows:
preparation of plasmids:
Using the BE4max plasmid (Addgene: 112093) as a template, point mutations were introduced into the nCas9 gene of the BE4max plasmid by primer embedding with the primer pair consisting of P1 and P2 in Table 6; the resulting mutant plasmid is BE4maxM. Using the BE4max plasmid (Addgene: 112093) as a template, point mutations were introduced into the nCas9 gene of the BE4max plasmid by primer embedding with the primer pair consisting of P1 and P3 in Table 6; the resulting mutant plasmid is BE4maxM2.
The BE4max plasmid contains the nCas9 gene shown in sequence 1 in the sequence listing and can express the nCas9 shown in sequence 2. BE4maxM is a mutant plasmid obtained by mutating positions 3028-3093 of the nCas9 gene (sequence 1) in the BE4max plasmid from TACGGCGACTACAAGGTGTACGACGTGCGGAAGATGATCGCCAAGAGCGAGCAGGAAATCGGCAAG to GACGGCGACGACAAGGTGGACGACGACGACAAGATGATCGCCAAGAGCGAGGACGAAATCGGCGAC (sequence 10); the mutated nCas9 gene is designated the nCas9-M gene, and BE4maxM contains the nCas9-M gene and can express the nCas9-M shown in sequence 3. BE4maxM2 is a mutant plasmid obtained by mutating positions 3028-3093 of the nCas9 gene (sequence 1) in the BE4max plasmid from TACGGCGACTACAAGGTGTACGACGTGCGGAAGATGATCGCCAAGAGCGAGCAGGAAATCGGCAAG to GACGGCGACGACAAGGTGTACGACGACCGGAAGATGATCGCCAAGAGCGAGGACGAAATCGGCGAC (sequence 11); the mutated nCas9 gene is designated the nCas9-M2 gene, and BE4maxM2 contains the nCas9-M2 gene and can express the nCas9-M2 shown in sequence 4.
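The codon changes described above can be checked by translating the 66-nt window. The following is a minimal Python sketch and is not part of the original disclosure: the helper names (`translate`, `substitutions`) are illustrative, and it assumes the standard genetic code and that nucleotide 3028 is the first base of codon 1010, since (3028 - 1) / 3 + 1 = 1010.

```python
from itertools import product

# Standard genetic code, codons enumerated in the order TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Positions 3028-3093 of the nCas9 gene (sequence 1) and the two replacement windows.
WILD_TYPE = "TACGGCGACTACAAGGTGTACGACGTGCGGAAGATGATCGCCAAGAGCGAGCAGGAAATCGGCAAG"
SEQ_10 = "GACGGCGACGACAAGGTGGACGACGACGACAAGATGATCGCCAAGAGCGAGGACGAAATCGGCGAC"
SEQ_11 = "GACGGCGACGACAAGGTGTACGACGACCGGAAGATGATCGCCAAGAGCGAGGACGAAATCGGCGAC"

# Assumption: the gene is read in frame from nucleotide 1, so nt 3028 starts codon 1010.
FIRST_CODON = (3028 - 1) // 3 + 1


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acids."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def substitutions(wild, mutant, first=FIRST_CODON):
    """List amino-acid changes as e.g. 'Y1010D', numbered from `first`."""
    pairs = zip(translate(wild), translate(mutant))
    return [f"{w}{first + i}{m}" for i, (w, m) in enumerate(pairs) if w != m]


print(substitutions(WILD_TYPE, SEQ_10))
# ['Y1010D', 'Y1013D', 'Y1016D', 'V1018D', 'R1019D', 'Q1027D', 'K1031D']
print(substitutions(WILD_TYPE, SEQ_11))
# ['Y1010D', 'Y1013D', 'V1018D', 'Q1027D', 'K1031D']
```

Under these assumptions, sequence 10 encodes the seven substitutions attributed to nCas9-M and sequence 11 the five substitutions attributed to nCas9-M2.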
Using the hyA3A-BE4max plasmid (Addgene: 157943) as a template, point mutations were introduced into the nCas9 gene of the hyA3A-BE4max plasmid by primer embedding with the primer pair consisting of P1 and P2 in Table 6; the resulting mutant plasmid is hyA3A-BE4maxM.
The hyA3A-BE4max plasmid contains the nCas9 gene shown in sequence 1 in the sequence listing and can express the nCas9 shown in sequence 2. hyA3A-BE4maxM is a mutant plasmid obtained by replacing the nCas9 gene (sequence 1) in the hyA3A-BE4max plasmid with the nCas9-M gene, and hyA3A-BE4maxM can express the nCas9-M shown in sequence 3.
Using the pGL3-U6-sgRNA-PGK-puromycin plasmid (Addgene: 51133) as a template, the pre-spacer sequences were inserted by primer embedding to obtain recombinant plasmids expressing gRNAs that target the genomic RNF2 site (gRNA plasmids for short): a mismatch-free gRNA plasmid (gRNA1), a mismatch-containing gRNA plasmid (gRNA2) and a deletion-mismatch-containing gRNA plasmid (gRNA3).
The primers used for the mismatch-free gRNA plasmid (gRNA1) are P10 and P11 in Table 6; the primers used for the mismatch-containing gRNA plasmid (gRNA2) are P10 and P12 in Table 6; the primers used for the deletion-mismatch-containing gRNA plasmid (gRNA3) are P10 and P13 in Table 6.
The mismatch-free gRNA plasmid (gRNA1) is obtained by replacing positions 322-343 of the pGL3-U6-sgRNA-PGK-puromycin plasmid with the pre-spacer sequence of gRNA1; the mismatch-containing gRNA plasmid (gRNA2) is obtained by replacing positions 322-343 of the pGL3-U6-sgRNA-PGK-puromycin plasmid with the pre-spacer sequence of gRNA2; the deletion-mismatch-containing gRNA plasmid (gRNA3) is obtained by replacing positions 322-343 of the pGL3-U6-sgRNA-PGK-puromycin plasmid with the pre-spacer sequence of gRNA3.
Plasmid transfection: HEK293T cells (ATCC, CRL-3216) were plated at 5×10⁵ in 24-well plates. When each well had grown to 40%-60% confluence, the different base editor plasmids were each co-transfected into the HEK293T cells with either the mismatch-free gRNA plasmid or a mismatch-containing gRNA plasmid targeting the genomic RNF2 site (the pre-spacer sequences of the gRNA plasmids are shown in Table 1), using 600 ng of base editor plasmid and 300 ng of gRNA plasmid with Lipofectamine 2000 reagent (Life, Invitrogen, 11668019). Each plasmid combination was transfected in 3 replicates, and 5 μg/ml puromycin (Merck, USA) was added to the medium 24 hours after transfection. Genomic DNA was extracted 120 hours after transfection using a rapid DNA extraction solution (Epicentre, USA), the region around the editing site was PCR-amplified using Taq DNA polymerase (Kangji, China), and the PCR products were sequenced. The primers used for PCR amplification are P4 and P5 in Table 6.
Table 1. Pre-spacer sequences of the gRNA plasmids for the RNF2 genomic editing site
gRNA | Pre-spacer sequence of target gene (5'-3') |
gRNA1 (targeting RNF2, no mismatch) | GTCATCTTAGTCATTACCTG (sequence 15) |
gRNA2 (targeting RNF2, with mismatch) | GTaATCTTAGTCATTACCTG (sequence 16) |
gRNA3 (targeting RNF2, with deletion mismatch) | G-CATCTTAGTCATTACCTG (sequence 17) |
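The notation in Table 1 (a lowercase letter for a substituted base, a hyphen for a deleted base) can be read off programmatically. The sketch below is illustrative only; the function name `describe` is hypothetical, and it assumes the three entries are aligned position by position, as they are printed in the table.

```python
# Pre-spacer sequences copied from Table 1 (5'-3').
REFERENCE = "GTCATCTTAGTCATTACCTG"          # gRNA1, no mismatch (sequence 15)
VARIANTS = {
    "gRNA2": "GTaATCTTAGTCATTACCTG",       # lowercase marks the substituted base
    "gRNA3": "G-CATCTTAGTCATTACCTG",       # '-' marks the deleted base
}


def describe(reference, variant):
    """Report how an aligned variant differs from the reference, 1-based from the 5' end."""
    changes = []
    for pos, (ref, var) in enumerate(zip(reference, variant), start=1):
        if var == "-":
            changes.append(f"deletion of {ref}{pos}")
        elif var.upper() != ref:
            changes.append(f"{ref}{pos}->{var.upper()}")
    return changes


for name, seq in VARIANTS.items():
    print(name, describe(REFERENCE, seq))
# gRNA2 ['C3->A']
# gRNA3 ['deletion of T2']

# The scored target base, C6, is the sixth base from the 5' end of gRNA1.
print(REFERENCE[5])  # C
```

Read this way, gRNA2 carries a single substitution at position 3 and gRNA3 lacks the base at position 2, while the scored cytosine at position 6 is unchanged in both.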
The percentage of the C corresponding to the sixth position of sequence 15 in the sequence listing that was mutated to T in the genome, i.e. the C-to-T editing efficiency, was counted, and the results are shown in Table 2. The results show that: for the mismatch-free gRNA1, the editing efficiencies of BE4max and BE4maxM did not differ significantly, those of BE4max and BE4maxM2 did not differ significantly, and those of hyA3A-BE4max and hyA3A-BE4maxM did not differ significantly; for the mismatch-containing gRNA2 and gRNA3, BE4maxM significantly reduced the editing efficiency with mismatched gRNA compared to BE4max, BE4maxM2 significantly reduced it compared to BE4max, and hyA3A-BE4maxM also significantly reduced it compared to hyA3A-BE4max.
Table 2. C-to-T editing efficiency of different gRNAs at the RNF2 site (C6, the sixth base from the 5' end, a cytosine) with different cytosine base editors
Example 2: preparation and application of low off-target ABE genome editor
In this example, NG-ABEmaxM and ABE8eM were constructed by mutating Y (tyrosine) at position 1010, Y (tyrosine) at position 1013, Y (tyrosine) at position 1016, V (valine) at position 1018, R (arginine) at position 1019, Q (glutamine) at position 1027 and K (lysine) at position 1031 of the NG-nCas9 in the commonly used ABE base editor NG-ABEmax (Addgene: 124163) and of the nCas9 in ABE8e (Addgene: 138489), respectively, to D (aspartic acid). The mutated NG-nCas9 protein is NG-nCas9-M, and the sequences of NG-nCas9 and NG-nCas9-M are sequence 6 and sequence 7 in the sequence listing, respectively; the mutated nCas9 protein is nCas9-M, and the sequences of nCas9 and nCas9-M are sequence 2 and sequence 3 in the sequence listing, respectively.
In mammalian cells, the genomic ABCA3 site was base-edited with NG-ABEmax, NG-ABEmaxM, ABE8e and ABE8eM, each with mismatched and mismatch-free gRNAs. The results were as follows: when a mismatch-free gRNA was used at the editing site, the editing efficiencies of NG-ABEmax and NG-ABEmaxM did not differ greatly, nor did those of ABE8e and ABE8eM; when a mismatched gRNA was used at the editing site, the editing efficiency of NG-ABEmaxM was significantly lower than that of NG-ABEmax, and the editing efficiency of ABE8eM was significantly lower than that of ABE8e.
The experimental procedure was as follows:
preparation of plasmids:
Using the NG-ABEmax plasmid (Addgene: 124163) as a template, point mutations were introduced into the NG-nCas9 gene of the NG-ABEmax plasmid by primer embedding with the primer pair consisting of P1 and P2 in Table 6; the resulting mutant plasmid is NG-ABEmaxM.
The NG-ABEmax plasmid contains the NG-nCas9 gene shown in sequence 5 in the sequence listing and can express the NG-nCas9 shown in sequence 6. NG-ABEmaxM is a mutant plasmid obtained by mutating positions 3028-3093 of the NG-nCas9 gene (sequence 5) in the NG-ABEmax plasmid from TACGGCGACTACAAGGTGTACGACGTGCGGAAGATGATCGCCAAGAGCGAGCAGGAAATCGGCAAG to GACGGCGACGACAAGGTGGACGACGACGACAAGATGATCGCCAAGAGCGAGGACGAAATCGGCGAC (sequence 12); the mutated NG-nCas9 gene is designated the NG-nCas9-M gene, and NG-ABEmaxM contains the NG-nCas9-M gene and can express the NG-nCas9-M shown in sequence 7.
Using the ABE8e plasmid (Addgene: 138489) as a template, point mutations were introduced into the nCas9 gene of the ABE8e plasmid by primer embedding with the primer pair consisting of P1 and P2 in Table 6; the resulting mutant plasmid is ABE8eM.
The ABE8e plasmid contains the nCas9 gene shown in sequence 1 in the sequence listing and can express the nCas9 shown in sequence 2. ABE8eM is a mutant plasmid obtained by replacing the nCas9 gene (sequence 1) in the ABE8e plasmid with the nCas9-M gene, and ABE8eM can express the nCas9-M shown in sequence 3.
Using the pGL3-U6-sgRNA-PGK-puromycin plasmid (Addgene: 51133) as a template, the pre-spacer sequences were inserted by primer embedding to obtain recombinant plasmids expressing gRNAs that target the genomic ABCA3 site (gRNA plasmids for short): a mismatch-free gRNA plasmid (gRNA4), a mismatch-containing gRNA plasmid (gRNA5) and a deletion-mismatch-containing gRNA plasmid (gRNA6).
The primers used for the mismatch-free gRNA plasmid (gRNA4) are P10 and P14 in Table 6; the primers used for the mismatch-containing gRNA plasmid (gRNA5) are P10 and P15 in Table 6; the primers used for the deletion-mismatch-containing gRNA plasmid (gRNA6) are P10 and P16 in Table 6.
The mismatch-free gRNA plasmid (gRNA4) is obtained by replacing positions 322-343 of the pGL3-U6-sgRNA-PGK-puromycin plasmid with the pre-spacer sequence of gRNA4; the mismatch-containing gRNA plasmid (gRNA5) is obtained by replacing positions 322-343 of the pGL3-U6-sgRNA-PGK-puromycin plasmid with the pre-spacer sequence of gRNA5; the deletion-mismatch-containing gRNA plasmid (gRNA6) is obtained by replacing positions 322-343 of the pGL3-U6-sgRNA-PGK-puromycin plasmid with the pre-spacer sequence of gRNA6.
Plasmid transfection: HEK293T cells (ATCC, CRL-3216) were plated at 5×10⁵ in 24-well plates. When each well had grown to 40%-60% confluence, the different base editor plasmids were each co-transfected into the HEK293T cells with either the mismatch-free gRNA plasmid or a mismatch-containing gRNA plasmid targeting the ABCA3 site (the pre-spacer sequences of the gRNA plasmids are shown in Table 3), using 600 ng of base editor plasmid and 300 ng of gRNA plasmid with Lipofectamine 2000 reagent (Life, Invitrogen, 11668019). Each plasmid combination was transfected in 3 replicates, and 5 μg/ml puromycin (Merck, USA) was added to the medium 24 hours after transfection. Genomic DNA was extracted 120 hours after transfection using a rapid DNA extraction solution (Epicentre, USA), the region around the editing site was PCR-amplified using Taq DNA polymerase (Kangji, China), and the PCR products were sequenced. The primers used for PCR amplification are P6 and P7 in Table 6.
Table 3. Pre-spacer sequences of the gRNA plasmids for the ABCA3 genomic editing site
gRNA | Pre-spacer sequence of target gene (5'-3') |
gRNA4 (targeting ABCA3, no mismatch) | GAAGAGCAGGGTCATGAAGG (sequence 18) |
gRNA5 (targeting ABCA3, with mismatch) | GAtGAGCAGGGTCATGAAGG (sequence 19) |
gRNA6 (targeting ABCA3, with deletion mismatch) | G-AGAGCAGGGTCATGAAGG (sequence 20) |
The percentage of the A corresponding to the fifth position of sequence 18 in the sequence listing that was mutated to G in the genome, i.e. the A-to-G editing efficiency, was counted, and the results are shown in Table 4. The results show that: for the mismatch-free gRNA4, the editing efficiencies of NG-ABEmax and NG-ABEmaxM did not differ significantly, nor did those of ABE8e and ABE8eM; for the mismatch-containing gRNA5 and gRNA6, NG-ABEmaxM significantly reduced the editing efficiency with mismatched gRNA compared to NG-ABEmax, and ABE8eM significantly reduced it compared to ABE8e.
Table 4. A-to-G editing efficiency of different gRNAs at the ABCA3 site (A5, the fifth base from the 5' end, an adenine) with different adenine base editors
gRNA | NG-ABEmax | NG-ABEmaxM | ABE8e | ABE8eM |
gRNA4 | 53.4±4.5% | 46.8±4.3% | 85.4±6.2% | 78.6±6.5% |
gRNA5 | 50.3±3.4% | 12.1±2.3% | 78.2±1.3% | 15.9±3.7% |
gRNA6 | 48.2±7.8% | 10.8±5.6% | 70.9±5.6% | 19.8±7.2% |
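One way to summarize Table 4 is to express each mismatched-gRNA efficiency as a fraction of the same editor's efficiency with the mismatch-free gRNA4. The sketch below uses only the mean values from Table 4 and ignores the reported standard deviations; the name `retained_fraction` and the choice of this ratio as a summary are assumptions made for illustration, not a metric defined in the application.

```python
# Mean A-to-G editing efficiencies (%) copied from Table 4.
TABLE_4 = {
    "NG-ABEmax":  {"gRNA4": 53.4, "gRNA5": 50.3, "gRNA6": 48.2},
    "NG-ABEmaxM": {"gRNA4": 46.8, "gRNA5": 12.1, "gRNA6": 10.8},
    "ABE8e":      {"gRNA4": 85.4, "gRNA5": 78.2, "gRNA6": 70.9},
    "ABE8eM":     {"gRNA4": 78.6, "gRNA5": 15.9, "gRNA6": 19.8},
}


def retained_fraction(editor, mismatched, matched="gRNA4"):
    """Efficiency with a mismatched gRNA relative to the mismatch-free gRNA."""
    row = TABLE_4[editor]
    return row[mismatched] / row[matched]


for editor in TABLE_4:
    fractions = {g: round(retained_fraction(editor, g), 2) for g in ("gRNA5", "gRNA6")}
    print(editor, fractions)
```

On these means, the unmodified editors keep more than 80% of their matched-gRNA efficiency with gRNA5 and gRNA6, whereas the mutant editors keep less than 30%.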
Example 3: preparation and application of low off-target GBE genome editor
In this example, Y (tyrosine) at position 1010, Y (tyrosine) at position 1013, Y (tyrosine) at position 1016, V (valine) at position 1018, R (arginine) at position 1019, Q (glutamine) at position 1027 and K (lysine) at position 1031 of the nCas9 in the commonly used GBE base editor APOBEC_nCas9_Ung (molecular clone: MC_0101154) were mutated to D (aspartic acid) to construct APOBEC_nCas9_UngM2; the mutated nCas9 protein is nCas9-M, and the amino acid sequences of nCas9 and nCas9-M are sequence 2 and sequence 3 in the sequence listing, respectively. In addition, Y (tyrosine) at position 1010, Y (tyrosine) at position 1013, Y (tyrosine) at position 1016, V (valine) at position 1018, R (arginine) at position 1019, Q (glutamine) at position 1027 and K (lysine) at position 1031 of the nCas9 in APOBEC_nCas9_Ung were mutated to D (aspartic acid) and K (lysine) at position 1014 was mutated to P (proline) to construct APOBEC_nCas9_UngM3; the mutated nCas9 protein is nCas9-M3, and the amino acid sequences of nCas9 and nCas9-M3 are sequence 2 and sequence 8 in the sequence listing, respectively.
In mammalian cells, the genomic RNF2 site was base-edited with APOBEC_nCas9_Ung, APOBEC_nCas9_UngM2 and APOBEC_nCas9_UngM3, each with mismatched and mismatch-free gRNAs. The results were as follows: when a mismatch-free gRNA was used at the editing site, the editing efficiency of APOBEC_nCas9_Ung did not differ greatly from that of APOBEC_nCas9_UngM2 or APOBEC_nCas9_UngM3; when a mismatched gRNA was used at the editing site, the editing efficiencies of both APOBEC_nCas9_UngM2 and APOBEC_nCas9_UngM3 were significantly lower than that of APOBEC_nCas9_Ung.
The experimental procedure was as follows:
preparation of plasmids:
Using the APOBEC_nCas9_Ung plasmid (molecular clone: MC_0101154) as a template, point mutations were introduced into the nCas9 gene of the APOBEC_nCas9_Ung plasmid by primer embedding with the primer pair consisting of P1 and P8 in Table 6; the resulting mutant plasmid is APOBEC_nCas9_UngM2. Using the APOBEC_nCas9_Ung plasmid as a template, point mutations were introduced into the nCas9 gene of the APOBEC_nCas9_Ung plasmid by primer embedding with the primer pair consisting of P1 and P9 in Table 6; the resulting mutant plasmid is APOBEC_nCas9_UngM3.
The APOBEC_nCas9_Ung plasmid contains the nCas9' gene shown in sequence 9 in the sequence listing and can express the nCas9 shown in sequence 2. APOBEC_nCas9_UngM2 is a mutant plasmid obtained by mutating positions 3028-3093 of the nCas9' gene (sequence 9) in the APOBEC_nCas9_Ung plasmid from TACGGGGACTACAAGGTTTACGATGTGCGCAAGATGATCGCCAAGTCGGAGCAAGAGATCGGCAAG to GACGGCGACGACAAGGTGGACGACGACGACAAGATGATCGCCAAGAGCGAGGACGAAATCGGCGAC (sequence 13); the mutated nCas9' gene is designated the nCas9-M' gene, and APOBEC_nCas9_UngM2 can express the nCas9-M shown in sequence 3. APOBEC_nCas9_UngM3 is a mutant plasmid obtained by mutating positions 3028-3093 of the nCas9' gene (sequence 9) in the APOBEC_nCas9_Ung plasmid from TACGGGGACTACAAGGTTTACGATGTGCGCAAGATGATCGCCAAGTCGGAGCAAGAGATCGGCAAG to GACGGCGACGACCCCGTGGACGACGACGACAAGATGATCGCCAAGAGCGAGGACGAAATCGGCGAC (sequence 14); the mutated nCas9' gene is designated the nCas9-M3 gene, and APOBEC_nCas9_UngM3 contains the nCas9-M3 gene and can express the nCas9-M3 shown in sequence 8.
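The nCas9' gene (sequence 9) uses different codons from the nCas9 gene (sequence 1) in this window, yet both are stated to express the same nCas9 (sequence 2). The following Python sketch, which is illustrative and assumes the standard genetic code and an in-frame window starting at codon 1010, checks that the two wild-type windows are synonymous and lists the changes encoded by sequence 14; the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in the order TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Positions 3028-3093 of sequence 1 (nCas9 gene) and sequence 9 (nCas9' gene).
WINDOW_SEQ_1 = "TACGGCGACTACAAGGTGTACGACGTGCGGAAGATGATCGCCAAGAGCGAGCAGGAAATCGGCAAG"
WINDOW_SEQ_9 = "TACGGGGACTACAAGGTTTACGATGTGCGCAAGATGATCGCCAAGTCGGAGCAAGAGATCGGCAAG"
SEQ_14 = "GACGGCGACGACCCCGTGGACGACGACGACAAGATGATCGCCAAGAGCGAGGACGAAATCGGCGAC"


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acids."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def substitutions(wild, mutant, first=1010):
    """List amino-acid changes, numbering the first codon of the window as `first`."""
    pairs = zip(translate(wild), translate(mutant))
    return [f"{w}{first + i}{m}" for i, (w, m) in enumerate(pairs) if w != m]


# Different nucleotides, same encoded peptide.
print(WINDOW_SEQ_1 != WINDOW_SEQ_9, translate(WINDOW_SEQ_1) == translate(WINDOW_SEQ_9))
# True True
print(substitutions(WINDOW_SEQ_9, SEQ_14))
# ['Y1010D', 'Y1013D', 'K1014P', 'Y1016D', 'V1018D', 'R1019D', 'Q1027D', 'K1031D']
```

Under these assumptions, sequence 14 encodes the seven aspartate substitutions plus K1014P, matching the description of nCas9-M3.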
The recombinant plasmids expressing gRNAs that target the genomic RNF2 site (gRNA plasmids for short) were those of Example 1: the mismatch-free gRNA plasmid (gRNA1 of Example 1), the mismatch-containing gRNA plasmid (gRNA2 of Example 1) and the deletion-mismatch-containing gRNA plasmid (gRNA3 of Example 1).
Plasmid transfection: HEK293T cells (ATCC, CRL-3216) were plated at 5×10⁵ in 24-well plates. When each well had grown to 40%-60% confluence, the different base editor plasmids were each co-transfected into the HEK293T cells with either the mismatch-free gRNA plasmid or a mismatch-containing gRNA plasmid targeting the genomic RNF2 site (the pre-spacer sequences of the gRNA plasmids are shown in Table 1), using 600 ng of base editor plasmid and 300 ng of gRNA plasmid with Lipofectamine 2000 reagent (Life, Invitrogen, 11668019). Each plasmid combination was transfected in 3 replicates, and 5 μg/ml puromycin (Merck, USA) was added to the medium 24 hours after transfection. Genomic DNA was extracted 120 hours after transfection using a rapid DNA extraction solution (Epicentre, USA), the region around the editing site was PCR-amplified using Taq DNA polymerase (Kangji, China), and the PCR products were sequenced. The primers used for PCR amplification are P4 and P5 in Table 6.
The percentage of the C corresponding to the sixth position of sequence 15 in the sequence listing that was mutated to G in the genome, i.e. the C-to-G editing efficiency, was counted, and the results are shown in Table 5. The results show that: for the mismatch-free gRNA1, the editing efficiency of APOBEC_nCas9_Ung did not differ significantly from that of APOBEC_nCas9_UngM2 or APOBEC_nCas9_UngM3; for the mismatch-containing gRNA2 and gRNA3, both APOBEC_nCas9_UngM2 and APOBEC_nCas9_UngM3 significantly reduced the editing efficiency with mismatched gRNA compared to APOBEC_nCas9_Ung.
Table 5. C-to-G editing efficiency of different gRNAs at the RNF2 site (C6, the sixth base from the 5' end, a cytosine) with different cytosine base editors
gRNA | APOBEC_nCas9_Ung | APOBEC_nCas9_UngM | APOBEC_nCas9_UngM2 |
gRNA1 | 27.3±3.2% | 22.3±4.2% | 24.4±5.6% |
gRNA2 | 18.6±2.4% | 2.4±1.2% | 7.8±4.7% |
gRNA3 | 21.3±2.8% | 3.2±2.4% | 9.6±1.5% |
Table 6. Primers
Primer | Sequence |
P1 | CCACGTCTCAGATGATCGCCAAGAGCGAGGACGAAATCGGCGACGCTACCGCCAAGTACTTCT |
P2 | CCACGTCTCACATCTTGTCGTCGTCGTCCACCTTGTCGTCGCCGTCCACGAACTCGCTTTCCAGC |
P3 | CCACGTCTCACATCTTCCGGTCGTCGTACACCTTGTCGTCGCCGTCCACGAACTCGCTTTCCAGC |
P4 | ACATTCAGACCATAGCACTTCC |
P5 | GTCTTCCTTGGTGCCTTATCAG |
P6 | ACAGCACGGCTACATTTGG |
P7 | CCAGGAGTTTGAGCAAGATGAG |
P8 | CCACGTCTCACATCTTGTCGTCGTCGTCCACCTTGTCGTCGCCGTCCACGAACTCGGACTCCAGC |
P9 | CCACGTCTCACATCTTGTCGTCGTCGTCCACGGGGTCGTCGCCGTCCACGAACTCGGACTCCAGC |
P10 | CCAGGTCTCACGGTGTTTCGTCCTTTCCACAAGATATATAAAGC |
P11 | CCAGGTCTCAACCGTCATCTTAGTCATTACCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC |
P12 | CCAGGTCTCAACCGTAATCTTAGTCATTACCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC |
P13 | CCAGGTCTCAACCGCATCTTAGTCATTACCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC |
P14 | CCAGGTCTCAACCGAAGAGCAGGGTCATGAAGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC |
P15 | CCAGGTCTCAACCGATGAGCAGGGTCATGAAGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC |
P16 | CCAGGTCTCAACCGAGAGCAGGGTCATGAAGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC |
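The gRNA primers in Table 6 can be cross-checked against the pre-spacer sequences in Tables 1 and 3. The sketch below is a consistency check only, not a description of the cloning procedure. It assumes that each of P11 to P16 carries its pre-spacer (with the hyphen removed and the lowercase base read as uppercase) directly upstream of the stretch `GTTTTAGAGCTAGAAATAGC` that all six primers share, taken here to be the start of the sgRNA scaffold; the name `embeds_spacer` is hypothetical.

```python
# Primers P11-P16 copied from Table 6.
PRIMERS = {
    "P11": "CCAGGTCTCAACCGTCATCTTAGTCATTACCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC",
    "P12": "CCAGGTCTCAACCGTAATCTTAGTCATTACCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC",
    "P13": "CCAGGTCTCAACCGCATCTTAGTCATTACCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC",
    "P14": "CCAGGTCTCAACCGAAGAGCAGGGTCATGAAGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC",
    "P15": "CCAGGTCTCAACCGATGAGCAGGGTCATGAAGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC",
    "P16": "CCAGGTCTCAACCGAGAGCAGGGTCATGAAGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC",
}

# Pre-spacer entries as printed in Tables 1 and 3, keyed by the primer stated to carry them.
PRE_SPACERS = {
    "P11": "GTCATCTTAGTCATTACCTG",  # gRNA1
    "P12": "GTaATCTTAGTCATTACCTG",  # gRNA2
    "P13": "G-CATCTTAGTCATTACCTG",  # gRNA3
    "P14": "GAAGAGCAGGGTCATGAAGG",  # gRNA4
    "P15": "GAtGAGCAGGGTCATGAAGG",  # gRNA5
    "P16": "G-AGAGCAGGGTCATGAAGG",  # gRNA6
}

# Assumption: sequence common to P11-P16 immediately 3' of the spacer.
SCAFFOLD_START = "GTTTTAGAGCTAGAAATAGC"


def embeds_spacer(primer, table_entry):
    """True if the primer contains the spacer followed directly by the scaffold start."""
    spacer = table_entry.replace("-", "").upper()
    return spacer + SCAFFOLD_START in primer


for name in PRIMERS:
    print(name, embeds_spacer(PRIMERS[name], PRE_SPACERS[name]))
```

Each primer is expected to report `True`, which would indicate that the mismatch and deletion variants in Tables 1 and 3 are carried by the corresponding primers.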
The present application is described in detail above. It will be apparent to those skilled in the art that the present application can be practiced in a wide range of equivalent parameters, concentrations, and conditions without departing from the spirit and scope of the application and without undue experimentation. While the application has been described with respect to specific embodiments, it will be appreciated that the application may be further modified. In general, this application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. The application of some of the basic features may be done in accordance with the scope of the claims that follow.
Sequence listing
<110> Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences
<120> a low off-target base editor and construction thereof
<160> 20
<170>PatentIn version 3.5
<210> 1
<211> 4104
<212> DNA
<213> Artificial sequence (Artificial sequence)
<400> 1
atggacaagaagtacagcatcggcctggccatcggcaccaactctgtgggctgggccgtg 60
atcaccgacgagtacaaggtgcccagcaagaaattcaaggtgctgggcaacaccgaccgg 120
cacagcatcaagaagaacctgatcggagccctgctgttcgacagcggcgaaacagccgag 180
gccacccggctgaagagaaccgccagaagaagatacaccagacggaagaaccggatctgc 240
tatctgcaagagatcttcagcaacgagatggccaaggtggacgacagcttcttccacaga 300
ctggaagagtccttcctggtggaagaggataagaagcacgagcggcaccccatcttcggc 360
aacatcgtggacgaggtggcctaccacgagaagtaccccaccatctaccacctgagaaag 420
aaactggtggacagcaccgacaaggccgacctgcggctgatctatctggccctggcccac 480
atgatcaagttccggggccacttcctgatcgagggcgacctgaaccccgacaacagcgac 540
gtggacaagctgttcatccagctggtgcagacctacaaccagctgttcgaggaaaacccc 600
atcaacgccagcggcgtggacgccaaggccatcctgtctgccagactgagcaagagcaga 660
cggctggaaaatctgatcgcccagctgcccggcgagaagaagaatggcctgttcggaaac 720
ctgattgccctgagcctgggcctgacccccaacttcaagagcaacttcgacctggccgag 780
gatgccaaactgcagctgagcaaggacacctacgacgacgacctggacaacctgctggcc 840
cagatcggcgaccagtacgccgacctgtttctggccgccaagaacctgtccgacgccatc 900
ctgctgagcgacatcctgagagtgaacaccgagatcaccaaggcccccctgagcgcctct 960
atgatcaagagatacgacgagcaccaccaggacctgaccctgctgaaagctctcgtgcgg 1020
cagcagctgcctgagaagtacaaagagattttcttcgaccagagcaagaacggctacgcc 1080
ggctacattgacggcggagccagccaggaagagttctacaagttcatcaagcccatcctg 1140
gaaaagatggacggcaccgaggaactgctcgtgaagctgaacagagaggacctgctgcgg 1200
aagcagcggaccttcgacaacggcagcatcccccaccagatccacctgggagagctgcac 1260
gccattctgcggcggcaggaagatttttacccattcctgaaggacaaccgggaaaagatc 1320
gagaagatcctgaccttccgcatcccctactacgtgggccctctggccaggggaaacagc 1380
agattcgcctggatgaccagaaagagcgaggaaaccatcaccccctggaacttcgaggaa 1440
gtggtggacaagggcgcttccgcccagagcttcatcgagcggatgaccaacttcgataag 1500
aacctgcccaacgagaaggtgctgcccaagcacagcctgctgtacgagtacttcaccgtg 1560
tataacgagctgaccaaagtgaaatacgtgaccgagggaatgagaaagcccgccttcctg 1620
agcggcgagcagaaaaaggccatcgtggacctgctgttcaagaccaaccggaaagtgacc 1680
gtgaagcagctgaaagaggactacttcaagaaaatcgagtgcttcgactccgtggaaatc 1740
tccggcgtggaagatcggttcaacgcctccctgggcacataccacgatctgctgaaaatt 1800
atcaaggacaaggacttcctggacaatgaggaaaacgaggacattctggaagatatcgtg 1860
ctgaccctgacactgtttgaggacagagagatgatcgaggaacggctgaaaacctatgcc 1920
cacctgttcgacgacaaagtgatgaagcagctgaagcggcggagatacaccggctggggc 1980
aggctgagccggaagctgatcaacggcatccgggacaagcagtccggcaagacaatcctg 2040
gatttcctgaagtccgacggcttcgccaacagaaacttcatgcagctgatccacgacgac 2100
agcctgacctttaaagaggacatccagaaagcccaggtgtccggccagggcgatagcctg 2160
cacgagcacattgccaatctggccggcagccccgccattaagaagggcatcctgcagaca 2220
gtgaaggtggtggacgagctcgtgaaagtgatgggccggcacaagcccgagaacatcgtg 2280
atcgaaatggccagagagaaccagaccacccagaagggacagaagaacagccgcgagaga 2340
atgaagcggatcgaagagggcatcaaagagctgggcagccagatcctgaaagaacacccc 2400
gtggaaaacacccagctgcagaacgagaagctgtacctgtactacctgcagaatgggcgg 2460
gatatgtacgtggaccaggaactggacatcaaccggctgtccgactacgatgtggaccat 2520
atcgtgcctcagagctttctgaaggacgactccatcgacaacaaggtgctgaccagaagc 2580
gacaagaaccggggcaagagcgacaacgtgccctccgaagaggtcgtgaagaagatgaag 2640
aactactggcggcagctgctgaacgccaagctgattacccagagaaagttcgacaatctg 2700
accaaggccgagagaggcggcctgagcgaactggataaggccggcttcatcaagagacag 2760
ctggtggaaacccggcagatcacaaagcacgtggcacagatcctggactcccggatgaac 2820
actaagtacgacgagaatgacaagctgatccgggaagtgaaagtgatcaccctgaagtcc 2880
aagctggtgtccgatttccggaaggatttccagttttacaaagtgcgcgagatcaacaac 2940
taccaccacgcccacgacgcctacctgaacgccgtcgtgggaaccgccctgatcaaaaag 3000
taccctaagctggaaagcgagttcgtgtacggcgactacaaggtgtacgacgtgcggaag 3060
atgatcgccaagagcgagcaggaaatcggcaaggctaccgccaagtacttcttctacagc 3120
aacatcatgaactttttcaagaccgagattaccctggccaacggcgagatccggaagcgg 3180
cctctgatcgagacaaacggcgaaaccggggagatcgtgtgggataagggccgggatttt 3240
gccaccgtgcggaaagtgctgagcatgccccaagtgaatatcgtgaaaaagaccgaggtg 3300
cagacaggcggcttcagcaaagagtctatcctgcccaagaggaacagcgataagctgatc 3360
gccagaaagaaggactgggaccctaagaagtacggcggcttcgacagccccaccgtggcc 3420
tattctgtgctggtggtggccaaagtggaaaagggcaagtccaagaaactgaagagtgtg 3480
aaagagctgctggggatcaccatcatggaaagaagcagcttcgagaagaatcccatcgac 3540
tttctggaagccaagggctacaaagaagtgaaaaaggacctgatcatcaagctgcctaag 3600
tactccctgttcgagctggaaaacggccggaagagaatgctggcctctgccggcgaactg 3660
cagaagggaaacgaactggccctgccctccaaatatgtgaacttcctgtacctggccagc 3720
cactatgagaagctgaagggctcccccgaggataatgagcagaaacagctgtttgtggaa 3780
cagcacaagcactacctggacgagatcatcgagcagatcagcgagttctccaagagagtg 3840
atcctggccgacgctaatctggacaaagtgctgtccgcctacaacaagcaccgggataag 3900
cccatcagagagcaggccgagaatatcatccacctgtttaccctgaccaatctgggagcc 3960
cctgccgccttcaagtactttgacaccaccatcgaccggaagaggtacaccagcaccaaa 4020
gaggtgctggacgccaccctgatccaccagagcatcaccggcctgtacgagacacggatc 4080
gacctgtctcagctgggaggtgac 4104
<210> 2
<211> 1368
<212> PRT
<213> artificial sequence
<400> 2
Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val
1 5 10 15
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys ThrAsnArg Lys Val Thr
545 550 555 560
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
Asn Glu GluAsn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
Leu Phe Glu Asp Arg Glu Met Ile Glu GluArg Leu Lys Thr Tyr Ala
625 630 635 640
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys ArgArgArg Tyr
645 650 655
ThrGlyTrpGlyArg Leu Ser Arg Lys Leu Ile AsnGly Ile Arg Asp
660 665 670
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
Ala AsnArgAsn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys LysGly
725 730 735
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
ThrThr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
Glu GluGly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
Val Glu AsnThr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
Gln AsnGlyArg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile AsnArg
820 825 830
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
Asp Asp Ser Ile Asp Asn Lys Val Leu ThrArg Ser Asp Lys AsnArg
850 855 860
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
Asn Tyr TrpArg Gln Leu LeuAsn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
Phe Asp Asn Leu Thr Lys Ala Glu ArgGlyGly Leu Ser Glu Leu Asp
900 905 910
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu ThrArg Gln Ile Thr
915 920 925
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met AsnThr Lys Tyr Asp
930 935 940
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
Glu Ile AsnAsn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
Val GlyThr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala
1010 1015 1020
Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
AsnGly Glu Ile Arg Lys Arg Pro Leu Ile Glu ThrAsnGly Glu
1055 1060 1065
ThrGly Glu Ile Val Trp Asp Lys GlyArg Asp Phe Ala Thr Val
1070 1075 1080
ArgLys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys LysThr
1085 1090 1095
Glu Val GlnThrGlyGly Phe Ser Lys Glu Ser Ile Leu Pro Lys
1100 1105 1110
ArgAsn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
Lys Lys TyrGlyGly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
Ser Val Lys Glu Leu LeuGly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
Phe Glu LysAsn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
Glu Val LysLys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
Phe Glu Leu Glu AsnGlyArg Lys Arg Met Leu Ala Ser Ala Gly
1205 1210 1215
Glu Leu Gln Lys GlyAsn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
AsnPhe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
Pro Glu AspAsn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
ArgVal Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
Ile Ile His Leu Phe Thr Leu ThrAsn Leu Gly Ala Pro Ala Ala
1310 1315 1320
Phe Lys Tyr Phe Asp ThrThr Ile Asp Arg Lys Arg Tyr Thr Ser
1325 1330 1335
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
<210> 3
<211> 1368
<212> PRT
<213> artificial sequence
<400> 3
Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val
1 5 10 15
GlyTrp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
Lys Val Leu GlyAsnThr Asp Arg His Ser Ile Lys LysAsn Leu Ile
35 40 45
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala ThrArg Leu
50 55 60
Lys ArgThr Ala ArgArgArg Tyr ThrArgArg Lys AsnArg Ile Cys
65 70 75 80
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
His Glu Arg His Pro Ile Phe GlyAsn Ile Val Asp Glu Val Ala Tyr
115 120 125
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
Met Ile Lys Phe ArgGly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
Asn Gln Leu Phe Glu GluAsn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser ArgArg Leu Glu Asn
210 215 220
Leu Ile Ala Gln Leu Pro Gly Glu Lys LysAsnGly Leu Phe GlyAsn
225 230 235 240
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
Ile Leu Arg Val AsnThr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
Asp Gln Ser Lys AsnGly Tyr Ala Gly Tyr Ile Asp GlyGly Ala Ser
355 360 365
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
GlyThr Glu Glu Leu Leu Val Lys Leu AsnArg Glu Asp Leu LeuArg
385 390 395 400
Lys Gln ArgThr Phe Asp AsnGly Ser Ile Pro His Gln Ile His Leu
405 410 415
Gly Glu Leu His Ala Ile Leu ArgArg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
Leu Lys Asp AsnArg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
Pro Tyr Tyr Val Gly Pro Leu Ala ArgGlyAsn Ser Arg Phe Ala Trp
450 455 460
Met ThrArg Lys Ser Glu GluThr Ile Thr Pro TrpAsn Phe Glu Glu
465 470 475 480
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys ThrAsnArg Lys Val Thr
545 550 555 560
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
Asn Glu GluAsn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
Leu Phe Glu Asp Arg Glu Met Ile Glu GluArg Leu Lys Thr Tyr Ala
625 630 635 640
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys ArgArgArg Tyr
645 650 655
ThrGlyTrpGlyArg Leu Ser Arg Lys Leu Ile AsnGly Ile Arg Asp
660 665 670
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
Ala AsnArgAsn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys LysGly
725 730 735
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
ThrThr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
Glu GluGly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
Val Glu AsnThr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
Gln AsnGlyArg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile AsnArg
820 825 830
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
Asp Asp Ser Ile Asp Asn Lys Val Leu ThrArg Ser Asp Lys AsnArg
850 855 860
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
Asn Tyr TrpArg Gln Leu LeuAsn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
Phe Asp Asn Leu Thr Lys Ala Glu ArgGlyGly Leu Ser Glu Leu Asp
900 905 910
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu ThrArg Gln Ile Thr
915 920 925
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met AsnThr Lys Tyr Asp
930 935 940
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
Glu Ile AsnAsn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
Val GlyThr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
Val Asp Gly Asp Asp Lys Val Asp Asp Asp Asp Lys Met Ile Ala
1010 1015 1020
Lys Ser Glu Asp Glu Ile Gly Asp Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
AsnGly Glu Ile Arg Lys Arg Pro Leu Ile Glu ThrAsnGly Glu
1055 1060 1065
ThrGly Glu Ile Val Trp Asp Lys GlyArg Asp Phe Ala Thr Val
1070 1075 1080
ArgLys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys LysThr
1085 1090 1095
Glu Val GlnThrGlyGly Phe Ser Lys Glu Ser Ile Leu Pro Lys
1100 1105 1110
ArgAsn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
Lys Lys TyrGlyGly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
Ser Val Lys Glu Leu LeuGly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
Phe Glu LysAsn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
Glu Val LysLys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
Phe Glu Leu Glu AsnGlyArg Lys Arg Met Leu Ala Ser Ala Gly
1205 1210 1215
Glu Leu Gln Lys GlyAsn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
AsnPhe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
Pro Glu AspAsn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
ArgVal Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
Ile Ile His Leu Phe Thr Leu ThrAsn Leu Gly Ala Pro Ala Ala
1310 1315 1320
Phe Lys Tyr Phe Asp ThrThr Ile Asp Arg Lys Arg Tyr Thr Ser
1325 1330 1335
ThrLys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
GlyLeu Tyr Glu ThrArg Ile Asp Leu Ser Gln Leu GlyGly Asp
1355 1360 1365
<210> 4
<211> 1368
<212> PRT
<213> artificial sequence
<400> 4
Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val
1 5 10 15
GlyTrp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
Lys Val Leu GlyAsnThr Asp Arg His Ser Ile Lys LysAsn Leu Ile
35 40 45
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala ThrArg Leu
50 55 60
Lys ArgThr Ala ArgArgArg Tyr ThrArgArg Lys AsnArg Ile Cys
65 70 75 80
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
His Glu Arg His Pro Ile Phe GlyAsn Ile Val Asp Glu Val Ala Tyr
115 120 125
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
Met Ile Lys Phe ArgGly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
Asn Gln Leu Phe Glu GluAsn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser ArgArg Leu Glu Asn
210 215 220
Leu Ile Ala Gln Leu Pro Gly Glu Lys LysAsnGly Leu Phe GlyAsn
225 230 235 240
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
Ile Leu Arg Val AsnThr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
Asp Gln Ser Lys AsnGly Tyr Ala Gly Tyr Ile Asp GlyGly Ala Ser
355 360 365
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
GlyThr Glu Glu Leu Leu Val Lys Leu AsnArg Glu Asp Leu LeuArg
385 390 395 400
Lys Gln ArgThr Phe Asp AsnGly Ser Ile Pro His Gln Ile His Leu
405 410 415
Gly Glu Leu His Ala Ile Leu ArgArg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
Leu Lys Asp AsnArg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
Pro Tyr Tyr Val Gly Pro Leu Ala ArgGlyAsn Ser Arg Phe Ala Trp
450 455 460
Met ThrArg Lys Ser Glu GluThr Ile Thr Pro TrpAsn Phe Glu Glu
465 470 475 480
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys ThrAsnArg Lys Val Thr
545 550 555 560
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
Asn Glu GluAsn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
Leu Phe Glu Asp Arg Glu Met Ile Glu GluArg Leu Lys Thr Tyr Ala
625 630 635 640
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys ArgArgArg Tyr
645 650 655
ThrGlyTrpGlyArg Leu Ser Arg Lys Leu Ile AsnGly Ile Arg Asp
660 665 670
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
Ala AsnArgAsn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys LysGly
725 730 735
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
ThrThr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
Glu GluGly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
Val Glu AsnThr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
Gln AsnGlyArg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile AsnArg
820 825 830
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
Asp Asp Ser Ile Asp Asn Lys Val Leu ThrArg Ser Asp Lys AsnArg
850 855 860
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
Asn Tyr TrpArg Gln Leu LeuAsn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
Phe Asp Asn Leu Thr Lys Ala Glu ArgGlyGly Leu Ser Glu Leu Asp
900 905 910
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu ThrArg Gln Ile Thr
915 920 925
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met AsnThr Lys Tyr Asp
930 935 940
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
Glu Ile AsnAsn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
Val GlyThr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
Val Asp Gly Asp Asp Lys Val Tyr Asp Asp Arg Lys Met Ile Ala
1010 1015 1020
Lys Ser Glu Asp Glu Ile Gly Asp Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
AsnGly Glu Ile Arg Lys Arg Pro Leu Ile Glu ThrAsnGly Glu
1055 1060 1065
ThrGly Glu Ile Val Trp Asp Lys GlyArg Asp Phe Ala Thr Val
1070 1075 1080
ArgLys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys LysThr
1085 1090 1095
Glu Val GlnThrGlyGly Phe Ser Lys Glu Ser Ile Leu Pro Lys
1100 1105 1110
ArgAsn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
Lys Lys TyrGlyGly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
Ser Val Lys Glu Leu LeuGly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
Phe Glu LysAsn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
Glu Val LysLys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
Phe Glu Leu Glu AsnGlyArg Lys Arg Met Leu Ala Ser Ala Gly
1205 1210 1215
Glu Leu Gln Lys GlyAsn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
AsnPhe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
Pro Glu AspAsn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
ArgVal Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
Ile Ile His Leu Phe Thr Leu ThrAsn Leu Gly Ala Pro Ala Ala
1310 1315 1320
Phe Lys Tyr Phe Asp ThrThr Ile Asp Arg Lys Arg Tyr Thr Ser
1325 1330 1335
ThrLys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
GlyLeu Tyr Glu ThrArg Ile Asp Leu Ser Gln Leu GlyGly Asp
1355 1360 1365
<210> 5
<211> 4104
<212> DNA
<213> artificial sequence
<400> 5
atggacaagaagtacagcatcggcctggccatcggcaccaactctgtgggctgggccgtg 60
atcaccgacgagtacaaggtgcccagcaagaaattcaaggtgctgggcaacaccgaccgg 120
cacagcatcaagaagaacctgatcggagccctgctgttcgacagcggcgaaacagccgag 180
gccacccggctgaagagaaccgccagaagaagatacaccagacggaagaaccggatctgc 240
tatctgcaagagatcttcagcaacgagatggccaaggtggacgacagcttcttccacaga 300
ctggaagagtccttcctggtggaagaggataagaagcacgagcggcaccccatcttcggc 360
aacatcgtggacgaggtggcctaccacgagaagtaccccaccatctaccacctgagaaag 420
aaactggtggacagcaccgacaaggccgacctgcggctgatctatctggccctggcccac 480
atgatcaagttccggggccacttcctgatcgagggcgacctgaaccccgacaacagcgac 540
gtggacaagctgttcatccagctggtgcagacctacaaccagctgttcgaggaaaacccc 600
atcaacgccagcggcgtggacgccaaggccatcctgtctgccagactgagcaagagcaga 660
cggctggaaaatctgatcgcccagctgcccggcgagaagaagaatggcctgttcggaaac 720
ctgattgccctgagcctgggcctgacccccaacttcaagagcaacttcgacctggccgag 780
gatgccaaactgcagctgagcaaggacacctacgacgacgacctggacaacctgctggcc 840
cagatcggcgaccagtacgccgacctgtttctggccgccaagaacctgtccgacgccatc 900
ctgctgagcgacatcctgagagtgaacaccgagatcaccaaggcccccctgagcgcctct 960
atgatcaagagatacgacgagcaccaccaggacctgaccctgctgaaagctctcgtgcgg 1020
cagcagctgcctgagaagtacaaagagattttcttcgaccagagcaagaacggctacgcc 1080
ggctacattgacggcggagccagccaggaagagttctacaagttcatcaagcccatcctg 1140
gaaaagatggacggcaccgaggaactgctcgtgaagctgaacagagaggacctgctgcgg 1200
aagcagcggaccttcgacaacggcagcatcccccaccagatccacctgggagagctgcac 1260
gccattctgcggcggcaggaagatttttacccattcctgaaggacaaccgggaaaagatc 1320
gagaagatcctgaccttccgcatcccctactacgtgggccctctggccaggggaaacagc 1380
agattcgcctggatgaccagaaagagcgaggaaaccatcaccccctggaacttcgaggaa 1440
gtggtggacaagggcgcttccgcccagagcttcatcgagcggatgaccaacttcgataag 1500
aacctgcccaacgagaaggtgctgcccaagcacagcctgctgtacgagtacttcaccgtg 1560
tataacgagctgaccaaagtgaaatacgtgaccgagggaatgagaaagcccgccttcctg 1620
agcggcgagcagaaaaaggccatcgtggacctgctgttcaagaccaaccggaaagtgacc 1680
gtgaagcagctgaaagaggactacttcaagaaaatcgagtgcttcgactccgtggaaatc 1740
tccggcgtggaagatcggttcaacgcctccctgggcacataccacgatctgctgaaaatt 1800
atcaaggacaaggacttcctggacaatgaggaaaacgaggacattctggaagatatcgtg 1860
ctgaccctgacactgtttgaggacagagagatgatcgaggaacggctgaaaacctatgcc 1920
cacctgttcgacgacaaagtgatgaagcagctgaagcggcggagatacaccggctggggc 1980
aggctgagccggaagctgatcaacggcatccgggacaagcagtccggcaagacaatcctg 2040
gatttcctgaagtccgacggcttcgccaacagaaacttcatgcagctgatccacgacgac 2100
agcctgacctttaaagaggacatccagaaagcccaggtgtccggccagggcgatagcctg 2160
cacgagcacattgccaatctggccggcagccccgccattaagaagggcatcctgcagaca 2220
gtgaaggtggtggacgagctcgtgaaagtgatgggccggcacaagcccgagaacatcgtg 2280
atcgaaatggccagagagaaccagaccacccagaagggacagaagaacagccgcgagaga 2340
atgaagcggatcgaagagggcatcaaagagctgggcagccagatcctgaaagaacacccc 2400
gtggaaaacacccagctgcagaacgagaagctgtacctgtactacctgcagaatgggcgg 2460
gatatgtacgtggaccaggaactggacatcaaccggctgtccgactacgatgtggaccat 2520
atcgtgcctcagagctttctgaaggacgactccatcgacaacaaggtgctgaccagaagc 2580
gacaagaaccggggcaagagcgacaacgtgccctccgaagaggtcgtgaagaagatgaag 2640
aactactggcggcagctgctgaacgccaagctgattacccagagaaagttcgacaatctg 2700
accaaggccgagagaggcggcctgagcgaactggataaggccggcttcatcaagagacag 2760
ctggtggaaacccggcagatcacaaagcacgtggcacagatcctggactcccggatgaac 2820
actaagtacgacgagaatgacaagctgatccgggaagtgaaagtgatcaccctgaagtcc 2880
aagctggtgtccgatttccggaaggatttccagttttacaaagtgcgcgagatcaacaac 2940
taccaccacgcccacgacgcctacctgaacgccgtcgtgggaaccgccctgatcaaaaag 3000
taccctaagctggaaagcgagttcgtgtacggcgactacaaggtgtacgacgtgcggaag 3060
atgatcgccaagagcgagcaggaaatcggcaaggctaccgccaagtacttcttctacagc 3120
aacatcatgaactttttcaagaccgagattaccctggccaacggcgagatccggaagcgg 3180
cctctgatcgagacaaacggcgaaaccggggagatcgtgtgggataagggccgggatttt 3240
gccaccgtgcggaaagtgctgagcatgccccaagtgaatatcgtgaaaaagaccgaggtg 3300
cagacaggcggcttcagcaaagagtctatcaggcccaagaggaacagcgataagctgatc 3360
gccagaaagaaggactgggaccctaagaagtacggcggcttcgtcagccccaccgtggcc 3420
tattctgtgctggtggtggccaaagtggaaaagggcaagtccaagaaactgaagagtgtg 3480
aaagagctgctggggatcaccatcatggaaagaagcagcttcgagaagaatcccatcgac 3540
tttctggaagccaagggctacaaagaagtgaaaaaggacctgatcatcaagctgcctaag 3600
tactccctgttcgagctggaaaacggccggaagagaatgctggcctctgccagattcctg 3660
cagaagggaaacgaactggccctgccctccaaatatgtgaacttcctgtacctggccagc 3720
cactatgagaagctgaagggctcccccgaggataatgagcagaaacagctgtttgtggaa 3780
cagcacaagcactacctggacgagatcatcgagcagatcagcgagttctccaagagagtg 3840
atcctggccgacgctaatctggacaaagtgctgtccgcctacaacaagcaccgggataag 3900
cccatcagagagcaggccgagaatatcatccacctgtttaccctgaccaatctgggagcc 3960
cctagggccttcaagtactttgacaccaccatcgaccggaaggtgtacaggagcaccaaa 4020
gaggtgctggacgccaccctgatccaccagagcatcaccggcctgtacgagacacggatc 4080
gacctgtctcagctgggaggtgac 4104
<210> 6
<211> 1368
<212> PRT
<213> artificial sequence
<400> 6
Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val
1 5 10 15
GlyTrp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
Lys Val Leu GlyAsnThr Asp Arg His Ser Ile Lys LysAsn Leu Ile
35 40 45
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala ThrArg Leu
50 55 60
Lys ArgThr Ala ArgArgArg Tyr ThrArgArg Lys AsnArg Ile Cys
65 70 75 80
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
His Glu Arg His Pro Ile Phe GlyAsn Ile Val Asp Glu Val Ala Tyr
115 120 125
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
Met Ile Lys Phe ArgGly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
Asn Gln Leu Phe Glu GluAsn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser ArgArg Leu Glu Asn
210 215 220
Leu Ile Ala Gln Leu Pro Gly Glu Lys LysAsnGly Leu Phe GlyAsn
225 230 235 240
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
Ile Leu Arg Val AsnThr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
Asp Gln Ser Lys AsnGly Tyr Ala Gly Tyr Ile Asp GlyGly Ala Ser
355 360 365
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
GlyThr Glu Glu Leu Leu Val Lys Leu AsnArg Glu Asp Leu LeuArg
385 390 395 400
Lys Gln ArgThr Phe Asp AsnGly Ser Ile Pro His Gln Ile His Leu
405 410 415
Gly Glu Leu His Ala Ile Leu ArgArg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
Leu Lys Asp AsnArg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
Pro Tyr Tyr Val Gly Pro Leu Ala ArgGlyAsn Ser Arg Phe Ala Trp
450 455 460
Met ThrArg Lys Ser Glu GluThr Ile Thr Pro TrpAsn Phe Glu Glu
465 470 475 480
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys ThrAsnArg Lys Val Thr
545 550 555 560
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
Asn Glu GluAsn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
Leu Phe Glu Asp Arg Glu Met Ile Glu GluArg Leu Lys Thr Tyr Ala
625 630 635 640
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys ArgArgArg Tyr
645 650 655
ThrGlyTrpGlyArg Leu Ser Arg Lys Leu Ile AsnGly Ile Arg Asp
660 665 670
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
Ala AsnArgAsn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys LysGly
725 730 735
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
ThrThr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
Glu GluGly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
Val Glu AsnThr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
Gln AsnGlyArg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile AsnArg
820 825 830
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
Asp Asp Ser Ile Asp Asn Lys Val Leu ThrArg Ser Asp Lys AsnArg
850 855 860
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
Asn Tyr TrpArg Gln Leu LeuAsn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
Phe Asp Asn Leu Thr Lys Ala Glu ArgGlyGly Leu Ser Glu Leu Asp
900 905 910
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu ThrArg Gln Ile Thr
915 920 925
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met AsnThr Lys Tyr Asp
930 935 940
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
Glu Ile AsnAsn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
Val GlyThr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala
1010 1015 1020
Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
AsnGly Glu Ile Arg Lys Arg Pro Leu Ile Glu ThrAsnGly Glu
1055 1060 1065
ThrGly Glu Ile Val Trp Asp Lys GlyArg Asp Phe Ala Thr Val
1070 1075 1080
ArgLys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys LysThr
1085 1090 1095
Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Arg Pro Lys
1100 1105 1110
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
Lys Lys Tyr Gly Gly Phe Val Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
Ser Val Lys Glu Leu LeuGly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
Phe Glu LysAsn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
Glu Val LysLys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Arg
1205 1210 1215
Phe Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
AsnPhe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
Pro Glu AspAsn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
ArgVal Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Arg Ala
1310 1315 1320
Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Val Tyr Arg Ser
1325 1330 1335
Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
<210> 7
<211> 1368
<212> PRT
<213> artificial sequence
<400> 7
Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile GlyThrAsn Ser Val
1 5 10 15
GlyTrp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
Lys Val Leu GlyAsnThr Asp Arg His Ser Ile Lys LysAsn Leu Ile
35 40 45
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala ThrArg Leu
50 55 60
Lys ArgThr Ala ArgArgArg Tyr ThrArgArg Lys AsnArg Ile Cys
65 70 75 80
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
His Glu Arg His Pro Ile Phe GlyAsn Ile Val Asp Glu Val Ala Tyr
115 120 125
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
Met Ile Lys Phe ArgGly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
Asn Gln Leu Phe Glu GluAsn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser ArgArg Leu Glu Asn
210 215 220
Leu Ile Ala Gln Leu Pro Gly Glu Lys LysAsnGly Leu Phe GlyAsn
225 230 235 240
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
Ile Leu Arg Val AsnThr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
Asp Gln Ser Lys AsnGly Tyr Ala Gly Tyr Ile Asp GlyGly Ala Ser
355 360 365
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
GlyThr Glu Glu Leu Leu Val Lys Leu AsnArg Glu Asp Leu LeuArg
385 390 395 400
Lys Gln ArgThr Phe Asp AsnGly Ser Ile Pro His Gln Ile His Leu
405 410 415
Gly Glu Leu His Ala Ile Leu ArgArg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
Leu Lys Asp AsnArg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
Pro Tyr Tyr Val Gly Pro Leu Ala ArgGlyAsn Ser Arg Phe Ala Trp
450 455 460
Met ThrArg Lys Ser Glu GluThr Ile Thr Pro TrpAsn Phe Glu Glu
465 470 475 480
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys ThrAsnArg Lys Val Thr
545 550 555 560
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
Asn Glu GluAsn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
Leu Phe Glu Asp Arg Glu Met Ile Glu GluArg Leu Lys Thr Tyr Ala
625 630 635 640
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys ArgArgArg Tyr
645 650 655
ThrGlyTrpGlyArg Leu Ser Arg Lys Leu Ile AsnGly Ile Arg Asp
660 665 670
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
Ala AsnArgAsn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys LysGly
725 730 735
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
ThrThr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
Glu GluGly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
Val Glu AsnThr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
Gln AsnGlyArg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile AsnArg
820 825 830
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
Asp Asp Ser Ile Asp Asn Lys Val Leu ThrArg Ser Asp Lys AsnArg
850 855 860
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
Asn Tyr TrpArg Gln Leu LeuAsn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
Phe Asp Asn Leu Thr Lys Ala Glu ArgGlyGly Leu Ser Glu Leu Asp
900 905 910
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu ThrArg Gln Ile Thr
915 920 925
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met AsnThr Lys Tyr Asp
930 935 940
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
Val Asp Gly Asp Asp Lys Val Asp Asp Asp Asp Lys Met Ile Ala
1010 1015 1020
Lys Ser Glu Asp Glu Ile Gly Asp Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
AsnGly Glu Ile Arg Lys Arg Pro Leu Ile Glu ThrAsnGly Glu
1055 1060 1065
ThrGly Glu Ile Val Trp Asp Lys GlyArg Asp Phe Ala Thr Val
1070 1075 1080
ArgLys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys LysThr
1085 1090 1095
Glu Val GlnThrGlyGly Phe Ser Lys Glu Ser Ile Arg Pro Lys
1100 1105 1110
ArgAsn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
Lys Lys TyrGlyGly Phe Val Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
Ser Val Lys Glu Leu LeuGly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
Phe Glu LysAsn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
Glu Val LysLys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
Phe Glu Leu Glu AsnGlyArg Lys Arg Met Leu Ala Ser Ala Arg
1205 1210 1215
Phe Leu Gln Lys GlyAsn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
AsnPhe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
Pro Glu AspAsn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
ArgVal Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
Ile Ile His Leu Phe Thr Leu ThrAsn Leu Gly Ala Pro Arg Ala
1310 1315 1320
Phe Lys Tyr Phe Asp ThrThr Ile Asp Arg Lys Val Tyr Arg Ser
1325 1330 1335
ThrLys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
GlyLeu Tyr Glu ThrArg Ile Asp Leu Ser Gln Leu GlyGly Asp
1355 1360 1365
<210> 8
<211> 1368
<212> PRT
<213> artificial sequence
<400> 8
Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile GlyThrAsn Ser Val
1 5 10 15
GlyTrp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
Lys Val Leu GlyAsnThr Asp Arg His Ser Ile Lys LysAsn Leu Ile
35 40 45
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala ThrArg Leu
50 55 60
Lys ArgThr Ala ArgArgArg Tyr ThrArgArg Lys AsnArg Ile Cys
65 70 75 80
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
His Glu Arg His Pro Ile Phe GlyAsn Ile Val Asp Glu Val Ala Tyr
115 120 125
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
Met Ile Lys Phe ArgGly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
Asn Gln Leu Phe Glu GluAsn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser ArgArg Leu Glu Asn
210 215 220
Leu Ile Ala Gln Leu Pro Gly Glu Lys LysAsnGly Leu Phe GlyAsn
225 230 235 240
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
Ile Leu Arg Val AsnThr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
Asp Gln Ser Lys AsnGly Tyr Ala Gly Tyr Ile Asp GlyGly Ala Ser
355 360 365
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
GlyThr Glu Glu Leu Leu Val Lys Leu AsnArg Glu Asp Leu LeuArg
385 390 395 400
Lys Gln ArgThr Phe Asp AsnGly Ser Ile Pro His Gln Ile His Leu
405 410 415
Gly Glu Leu His Ala Ile Leu ArgArg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
Leu Lys Asp AsnArg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
Pro Tyr Tyr Val Gly Pro Leu Ala ArgGlyAsn Ser Arg Phe Ala Trp
450 455 460
Met ThrArg Lys Ser Glu GluThr Ile Thr Pro TrpAsn Phe Glu Glu
465 470 475 480
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys ThrAsnArg Lys Val Thr
545 550 555 560
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
Asn Glu GluAsn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
Leu Phe Glu Asp Arg Glu Met Ile Glu GluArg Leu Lys Thr Tyr Ala
625 630 635 640
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys ArgArgArg Tyr
645 650 655
ThrGlyTrpGlyArg Leu Ser Arg Lys Leu Ile AsnGly Ile Arg Asp
660 665 670
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
Ala AsnArgAsn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys LysGly
725 730 735
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
ThrThr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
Glu GluGly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
Val Glu AsnThr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
Gln AsnGlyArg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile AsnArg
820 825 830
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
Asp Asp Ser Ile Asp Asn Lys Val Leu ThrArg Ser Asp Lys AsnArg
850 855 860
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
Asn Tyr TrpArg Gln Leu LeuAsn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
Phe Asp Asn Leu Thr Lys Ala Glu ArgGlyGly Leu Ser Glu Leu Asp
900 905 910
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu ThrArg Gln Ile Thr
915 920 925
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met AsnThr Lys Tyr Asp
930 935 940
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
Val Asp Gly Asp Asp Pro Val Asp Asp Asp Asp Lys Met Ile Ala
1010 1015 1020
Lys Ser Glu Asp Glu Ile Gly Asp Ala Thr Ala Lys Tyr Phe Phe
1025 1030 1035
Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1040 1045 1050
AsnGly Glu Ile Arg Lys Arg Pro Leu Ile Glu ThrAsnGly Glu
1055 1060 1065
ThrGly Glu Ile Val Trp Asp Lys GlyArg Asp Phe Ala Thr Val
1070 1075 1080
ArgLys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys LysThr
1085 1090 1095
Glu Val GlnThrGlyGly Phe Ser Lys Glu Ser Ile Leu Pro Lys
1100 1105 1110
ArgAsn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro
1115 1120 1125
Lys Lys TyrGlyGly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val
1130 1135 1140
Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys
1145 1150 1155
Ser Val Lys Glu Leu LeuGly Ile Thr Ile Met Glu Arg Ser Ser
1160 1165 1170
Phe Glu LysAsn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys
1175 1180 1185
Glu Val LysLys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
1190 1195 1200
Phe Glu Leu Glu AsnGlyArg Lys Arg Met Leu Ala Ser Ala Gly
1205 1210 1215
Glu Leu Gln Lys GlyAsn Glu Leu Ala Leu Pro Ser Lys Tyr Val
1220 1225 1230
AsnPhe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
Pro Glu AspAsn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
1250 1255 1260
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys
1265 1270 1275
ArgVal Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1280 1285 1290
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn
1295 1300 1305
Ile Ile His Leu Phe Thr Leu ThrAsn Leu Gly Ala Pro Ala Ala
1310 1315 1320
Phe Lys Tyr Phe Asp ThrThr Ile Asp Arg Lys Arg Tyr Thr Ser
1325 1330 1335
ThrLys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1340 1345 1350
GlyLeu Tyr Glu ThrArg Ile Asp Leu Ser Gln Leu GlyGly Asp
1355 1360 1365
<210> 9
<211> 4104
<212> DNA
<213> artificial sequence
<400> 9
atggacaagaagtactcgatcggcctcgccatcgggacgaactcagttggctgggccgtg 60
atcaccgacgagtacaaggtgccctctaagaagttcaaggtcctggggaacaccgaccgc 120
cattccatcaagaagaacctcatcggcgctctcctgttcgacagcggggagaccgctgag 180
gctacgaggctcaagagaaccgctaggcgccggtacacgagaaggaagaacaggatctgc 240
tacctccaagagattttctccaacgagatggccaaggttgacgattcattcttccaccgc 300
ctggaggagtctttcctcgtggaggaggataagaagcacgagcggcatcccatcttcggc 360
aacatcgtggacgaggttgcctaccacgagaagtaccctacgatctaccatctgcggaag 420
aagctcgtggactccaccgataaggcggacctcagactgatctacctcgctctggcccac 480
atgatcaagttccgcggccatttcctgatcgagggggatctcaacccagacaacagcgat 540
gttgacaagctgttcatccaactcgtgcagacctacaaccaactcttcgaggagaacccg 600
atcaacgcctctggcgtggacgcgaaggctatcctgtccgcgaggctctcgaagtccagg 660
aggctggagaacctgatcgctcagctcccaggcgagaagaagaacggcctgttcgggaac 720
ctcatcgctctcagcctggggctcaccccgaacttcaagtcgaacttcgatctcgctgag 780
gacgccaagctgcaactctccaaggacacctacgacgatgacctcgataacctcctggcc 840
cagatcggcgatcaatacgcggacctgttcctcgctgccaagaacctgtcggacgccatc 900
ctcctgtcagatatcctccgcgtgaacaccgagatcacgaaggctccactctctgcctcc 960
atgatcaagcgctacgacgagcaccatcaggatctgaccctcctgaaggcgctggtccgc 1020
caacagctcccggagaagtacaaggagattttcttcgatcagtcgaagaacggctacgct 1080
gggtacatcgacggcggggcctcacaagaggagttctacaagttcatcaagccaatcctg 1140
gagaagatggacggcacggaggagctcctggtgaagctcaacagggaggacctcctgcgg 1200
aagcagagaaccttcgataacggcagcatcccccaccaaatccatctcggggagctgcac 1260
gccatcctgagaaggcaagaggacttctaccctttcctcaaggataaccgggagaagatc 1320
gagaagatcctgaccttcagaatcccatactacgtcggccctctcgcgcgggggaactca 1380
agattcgcttggatgacccgcaagtctgaggagaccatcacgccgtggaacttcgaggag 1440
gtggtggacaagggcgctagcgctcagtcgttcatcgagaggatgaccaacttcgacaag 1500
aacctgcccaacgagaaggtgctccctaagcactcgctcctgtacgagtacttcaccgtc 1560
tacaacgagctcacgaaggtgaagtacgtcaccgagggcatgcgcaagccagcgttcctg 1620
tccggggagcagaagaaggctatcgtggacctcctgttcaagaccaaccggaaggtcacg 1680
gttaagcaactcaaggaggactacttcaagaagatcgagtgcttcgattcggtcgagatc 1740
agcggcgttgaggaccgcttcaacgccagcctcgggacctaccacgatctcctgaagatc 1800
atcaaggataaggacttcctggacaacgaggagaacgaggatatcctggaggacatcgtg 1860
ctgaccctcacgctgttcgaggacagggagatgatcgaggagcgcctgaagacgtacgcc 1920
catctcttcgatgacaaggtcatgaagcaactcaagcgccggagatacaccggctggggg 1980
aggctgtcccgcaagctcatcaacggcatccgggacaagcagtccgggaagaccatcctc 2040
gacttcctcaagagcgatggcttcgccaacaggaacttcatgcaactgatccacgatgac 2100
agcctcaccttcaaggaggatatccaaaaggctcaagtgagcggccagggggactcgctg 2160
cacgagcatatcgcgaacctcgctggctcccccgcgatcaagaagggcatcctccagacc 2220
gtgaaggttgtggacgagctcgtgaaggtcatgggccggcacaagcctgagaacatcgtc 2280
atcgagatggccagagagaaccaaaccacgcagaaggggcaaaagaactctagggagcgc 2340
atgaagcgcatcgaggagggcatcaaggagctggggtcccaaatcctcaaggagcaccca 2400
gtggagaacacccaactgcagaacgagaagctctacctgtactacctccagaacggcagg 2460
gatatgtacgtggaccaagagctggatatcaaccgcctcagcgattacgacgtcgatcat 2520
atcgttccccagtctttcctgaaggatgactccatcgacaacaaggtcctcaccaggtcg 2580
gacaagaaccgcggcaagtcagataacgttccatctgaggaggtcgttaagaagatgaag 2640
aactactggaggcagctcctgaacgccaagctgatcacgcaaaggaagttcgacaacctc 2700
accaaggctgagagaggcgggctctcagagctggacaaggccggcttcatcaagcggcag 2760
ctggtcgagaccagacaaatcacgaagcacgttgcgcaaatcctcgactctcggatgaac 2820
acgaagtacgatgagaacgacaagctgatcagggaggttaaggtgatcaccctgaagtct 2880
aagctcgtctccgacttcaggaaggatttccagttctacaaggttcgcgagatcaacaac 2940
taccaccatgcccatgacgcttacctcaacgctgtggtcggcaccgctctgatcaagaag 3000
tacccaaagctggagtccgagttcgtgtacggggactacaaggtttacgatgtgcgcaag 3060
atgatcgccaagtcggagcaagagatcggcaaggctaccgccaagtacttcttctactca 3120
aacatcatgaacttcttcaagaccgagatcacgctggccaacggcgagatccggaagaga 3180
ccgctcatcgagaccaacggcgagacgggggagatcgtgtgggacaagggcagggatttc 3240
gcgaccgtccgcaaggttctctccatgccccaggtgaacatcgtcaagaagaccgaggtc 3300
caaacgggcgggttctcaaaggagtctatcctgcctaagcggaacagcgacaagctcatc 3360
gccagaaagaaggactgggacccaaagaagtacggcgggttcgacagccctaccgtggcc 3420
tactcggtcctggttgtggcgaaggttgagaagggcaagtccaagaagctcaagagcgtg 3480
aaggagctcctggggatcaccatcatggagaggtccagcttcgagaagaacccaatcgac 3540
ttcctggaggccaagggctacaaggaggtgaagaaggacctgatcatcaagctcccgaag 3600
tactctctcttcgagctggagaacggcaggaagagaatgctggcttccgctggcgagctc 3660
cagaaggggaacgagctcgcgctgccaagcaagtacgtgaacttcctctacctggcttcc 3720
cactacgagaagctcaagggcagcccggaggacaacgagcaaaagcagctgttcgtcgag 3780
cagcacaagcattacctcgacgagatcatcgagcaaatctccgagttcagcaagcgcgtg 3840
atcctcgccgacgcgaacctggataaggtcctctccgcctacaacaagcaccgggacaag 3900
cccatcagagagcaagcggagaacatcatccatctcttcaccctgacgaacctcggcgct 3960
cctgctgctttcaagtacttcgacaccacgatcgatcggaagagatacacctccacgaag 4020
gaggtcctggacgcgaccctcatccaccagtcgatcaccggcctgtacgagacgaggatc 4080
gacctctcacaactcggcggggat 4104
<210> 10
<211> 66
<212> DNA
<213> artificial sequence
<400> 10
gacggcgacgacaaggtggacgacgacgacaagatgatcgccaagagcgaggacgaaatc 60
ggcgac 66
<210> 11
<211> 66
<212> DNA
<213> artificial sequence
<400> 11
gacggcgacgacaaggtgtacgacgaccggaagatgatcgccaagagcgaggacgaaatc 60
ggcgac 66
<210> 12
<211> 66
<212> DNA
<213> artificial sequence
<400> 12
gacggcgacgacaaggtggacgacgacgacaagatgatcgccaagagcgaggacgaaatc 60
ggcgac 66
<210> 13
<211> 66
<212> DNA
<213> artificial sequence
<400> 13
gacggcgacgacaaggtggacgacgacgacaagatgatcgccaagagcgaggacgaaatc 60
ggcgac 66
<210> 14
<211> 66
<212> DNA
<213> artificial sequence
<400> 14
gacggcgacgaccccgtggacgacgacgacaagatgatcgccaagagcgaggacgaaatc 60
ggcgac 66
<210> 15
<211> 20
<212> DNA
<213> artificial sequence
<400> 15
gtcatcttagtcattacctg 20
<210> 16
<211> 20
<212> DNA
<213> artificial sequence
<400> 16
gtaatcttagtcattacctg 20
<210> 17
<211> 19
<212> DNA
<213> artificial sequence
<400> 17
gcatcttagtcattacctg 19
<210> 18
<211> 20
<212> DNA
<213> artificial sequence
<400> 18
gaagagcagggtcatgaagg 20
<210> 19
<211> 20
<212> DNA
<213> artificial sequence
<400> 19
gatgagcagggtcatgaagg 20
<210> 20
<211> 19
<212> DNA
<213> artificial sequence
<400> 20
gagagcagggtcatgaagg 19
Claims (11)
1. A base editor comprising or expressing a Cas9 mutant, the Cas9 mutant being a mutant protein obtained by mutating amino acid residues between positions 1010 and 1031 of Cas9, or a mutant protein obtained by mutating the amino acid residues at positions 1010, 1013, 1014, 1016, 1018, 1019, 1027 and/or 1031 of Cas9, or a mutant protein obtained by subjecting Cas9 to any one or more of the following mutations:
M1) mutating the tyrosine residue at position 1010 of Cas9 to an aspartic acid residue;
M2) mutating the tyrosine residue at position 1013 of Cas9 to an aspartic acid residue;
M3) mutating the lysine residue at position 1014 of Cas9 to a proline residue;
M4) mutating the tyrosine residue at position 1016 of Cas9 to an aspartic acid residue;
M5) mutating the valine residue at position 1018 of Cas9 to an aspartic acid residue;
M6) mutating the arginine residue at position 1019 of Cas9 to an aspartic acid residue;
M7) mutating the glutamine residue at position 1027 of Cas9 to an aspartic acid residue;
M8) mutating the lysine residue at position 1031 of Cas9 to an aspartic acid residue.
2. The base editor of claim 1, wherein: the Cas9 is the protein shown as sequence 2 or sequence 6 in the sequence listing.
3. The base editor of claim 1 or 2, wherein: the Cas9 mutant is a mutant protein obtained by applying the seven mutations M1), M2), M4), M5), M6), M7) and M8) to Cas9, or a mutant protein obtained by applying the five mutations M1), M2), M5), M7) and M8) to Cas9, or a mutant protein obtained by applying all eight mutations M1) to M8) to Cas9.
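As an illustrative sketch only (not part of the claims), the three mutation combinations named in claim 3 can be applied to the wild-type stretch at positions 1010-1031. The wild-type window below is read from the unmutated 1368-residue sequence that precedes sequence 7 in the sequence listing and is written in one-letter code. The names `WT_WINDOW`, `MUTATIONS` and `apply_mutations` are hypothetical helpers introduced for this example.

```python
# Wild-type Cas9 residues 1010-1031 in one-letter code, as read from the
# unmutated sequence in the listing (assumption: this is the reference Cas9).
WT_WINDOW = "YGDYKVYDVRKMIAKSEQEIGK"
WINDOW_START = 1010

# Mutations M1) to M8) of claim 1: (position, wild-type residue, new residue).
MUTATIONS = {
    "M1": (1010, "Y", "D"),
    "M2": (1013, "Y", "D"),
    "M3": (1014, "K", "P"),
    "M4": (1016, "Y", "D"),
    "M5": (1018, "V", "D"),
    "M6": (1019, "R", "D"),
    "M7": (1027, "Q", "D"),
    "M8": (1031, "K", "D"),
}


def apply_mutations(names):
    """Return the 1010-1031 window after applying the named mutations."""
    residues = list(WT_WINDOW)
    for name in names:
        position, wild_type, replacement = MUTATIONS[name]
        index = position - WINDOW_START
        # Guard against a mismatch between the claim and the reference window.
        if residues[index] != wild_type:
            raise ValueError(f"{name}: expected {wild_type} at {position}")
        residues[index] = replacement
    return "".join(residues)


# The three combinations listed in claim 3.
SEVEN = ["M1", "M2", "M4", "M5", "M6", "M7", "M8"]
FIVE = ["M1", "M2", "M5", "M7", "M8"]
EIGHT = list(MUTATIONS)

seven_window = apply_mutations(SEVEN)
five_window = apply_mutations(FIVE)
eight_window = apply_mutations(EIGHT)
```

Under these assumptions, the seven-mutation window matches positions 1010-1031 of sequence 7 in the listing (Lys retained at 1014), and the eight-mutation window matches the same positions of sequence 8 (Pro at 1014).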
4. The base editor of any one of claims 1-3, wherein: the base editor further comprises or expresses an sgRNA targeting the target sequence and/or a domain having base modification activity.
5. The Cas9 mutant as defined in any one of claims 1-4.
6. A biological material associated with the Cas9 mutant as defined in any one of claims 1-4, which is any one of the following B1) to B5):
B1) a nucleic acid molecule encoding the Cas9 mutant as defined in any one of claims 1-4;
B2) an expression cassette comprising the nucleic acid molecule of B1);
B3) a recombinant vector comprising the nucleic acid molecule of B1) or a recombinant vector comprising the expression cassette of B2);
B4) a recombinant microorganism comprising the nucleic acid molecule of B1), or a recombinant microorganism comprising the expression cassette of B2), or a recombinant microorganism comprising the recombinant vector of B3);
B5) a cell line containing the nucleic acid molecule of B1) or a cell line containing the expression cassette of B2).
7. The biological material according to claim 6, wherein: the nucleic acid molecule of B1) is a mutant gene obtained by mutating the nucleotide sequence between positions 3028 and 3093 of the Cas9 gene;
further, the nucleic acid molecule of B1) is obtained by replacing positions 3028 to 3093 of the Cas9 gene shown as sequence 1 in the sequence listing with sequence 10 or sequence 11, or by replacing positions 3028 to 3093 of the Cas9 gene shown as sequence 5 in the sequence listing with sequence 12, or by replacing positions 3028 to 3093 of the Cas9 gene shown as sequence 9 in the sequence listing with sequence 13 or sequence 14.
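The nucleotide range 3028 to 3093 corresponds to codons 1010 to 1031, since residue n occupies nucleotides 3(n-1)+1 through 3n of the coding sequence. The following sketch, offered as an illustration rather than as part of the claims, checks that arithmetic and translates the 66-nucleotide replacement sequences 10, 11 and 14 from the sequence listing using the standard genetic code. The helper names `codon_span` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino
    for codon, amino in zip(product(BASES, repeat=3), AMINO)
}


def codon_span(first_residue, last_residue):
    """1-based nucleotide range that encodes residues first..last."""
    return (first_residue - 1) * 3 + 1, last_residue * 3


def translate(dna):
    """Translate an in-frame lowercase DNA string into one-letter residues."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Sequences 10, 11 and 14 as printed in the sequence listing (66 nt each).
SEQ10 = "gacggcgacgacaaggtggacgacgacgacaagatgatcgccaagagcgaggacgaaatc" "ggcgac"
SEQ11 = "gacggcgacgacaaggtgtacgacgaccggaagatgatcgccaagagcgaggacgaaatc" "ggcgac"
SEQ14 = "gacggcgacgaccccgtggacgacgacgacaagatgatcgccaagagcgaggacgaaatc" "ggcgac"

span = codon_span(1010, 1031)
peptides = {name: translate(seq)
            for name, seq in (("10", SEQ10), ("11", SEQ11), ("14", SEQ14))}
```

Read this way, sequence 10 encodes the seven-mutation window, sequence 11 the five-mutation window (Tyr retained at 1016 and Arg at 1019), and sequence 14 the eight-mutation window with Pro at position 1014.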
8. A product comprising the base editor of any one of claims 1-4, or the Cas9 mutant of claim 5, or the biological material of claim 6 or 7.
9. Use of the base editor of any one of claims 1-4, or the Cas9 mutant of claim 5, or the biological material of claim 6 or 7, or the product of claim 8, for any one of the following:
y1) converting a cytosine nucleotide residue in a biological cell or animal or subject to a thymine nucleotide residue;
y2) converting an adenine nucleotide residue to a guanine nucleotide residue in a biological cell or animal or subject;
y3) converting cytosine nucleotide residues in a biological cell or animal or subject to guanine nucleotide residues;
y4) preparing a product for converting a cytosine nucleotide residue to a thymine nucleotide residue in a biological cell or animal or subject;
y5) preparing a product for converting an adenine nucleotide residue to a guanine nucleotide residue in a biological cell or animal or subject;
y6) preparing a product for converting cytosine nucleotide residues in a biological cell or animal or subject to guanine nucleotide residues;
y7) serving as, or preparing, a single-base editing reagent or kit;
y8) serving as, or preparing, a medicament for gene therapy;
y9) treating or preventing a disease;
y10) preparing a product for treating or preventing a disease.
10. Any one of the following methods:
x1) a method of converting a cytosine nucleotide residue to a thymine nucleotide residue in a biological cell or animal or subject, comprising: performing base editing of a cytosine nucleotide residue of interest using the base editor of any one of claims 1-4, or the Cas9 mutant of claim 5, or the biological material of claim 6 or 7, or the product of claim 8 to effect conversion of the cytosine nucleotide residue of interest to a thymine nucleotide residue;
x2) a method of converting an adenine nucleotide residue to a guanine nucleotide residue in a biological cell or animal or subject, comprising: performing base editing on an adenine nucleotide residue of interest using the base editor of any one of claims 1-4, or the Cas9 mutant of claim 5, or the biological material of claim 6 or 7, or the product of claim 8, to convert the adenine nucleotide residue of interest to a guanine nucleotide residue;
x3) a method of converting a cytosine nucleotide residue to a guanine nucleotide residue in a biological cell or animal or subject, comprising: the conversion of a cytosine nucleotide residue of interest into a guanine nucleotide residue is accomplished by base editing the cytosine nucleotide residue of interest with a base editor as defined in any one of claims 1-4, or a Cas9 mutant as defined in claim 5, or a biological material as defined in claim 6 or 7, or a product as defined in claim 8.
11. The use according to claim 9, or the method according to claim 10, characterized in that: the biological cell is a mammalian cell; the animal is a mammal.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210518789.4A CN117089572A (en) | 2022-05-13 | 2022-05-13 | Low off-target base editor and construction thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210518789.4A CN117089572A (en) | 2022-05-13 | 2022-05-13 | Low off-target base editor and construction thereof |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117089572A (en) | 2023-11-21 |
Family
ID=88777528
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210518789.4A Pending CN117089572A (en) | 2022-05-13 | 2022-05-13 | Low off-target base editor and construction thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117089572A (en) |
- 2022-05-13: CN application CN202210518789.4A (publication CN117089572A), status: pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||