CN114835821B - 一种高效特异实现碱基颠换的编辑系统、方法及用途 - Google Patents
一种高效特异实现碱基颠换的编辑系统、方法及用途 Download PDFInfo
- Publication number
- CN114835821B CN114835821B CN202210415558.0A CN202210415558A CN114835821B CN 114835821 B CN114835821 B CN 114835821B CN 202210415558 A CN202210415558 A CN 202210415558A CN 114835821 B CN114835821 B CN 114835821B
- Authority
- CN
- China
- Prior art keywords
- leu
- lys
- glu
- asp
- ala
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000000034 method Methods 0.000 title claims abstract description 58
- 108090000623 proteins and genes Proteins 0.000 claims abstract description 84
- 108020001507 fusion proteins Proteins 0.000 claims abstract description 78
- 102000037865 fusion proteins Human genes 0.000 claims abstract description 78
- 108020005004 Guide RNA Proteins 0.000 claims abstract description 10
- 230000006870 function Effects 0.000 claims abstract description 10
- 230000003915 cell function Effects 0.000 claims abstract description 5
- 230000001717 pathogenic effect Effects 0.000 claims abstract description 4
- 108091033319 polynucleotide Proteins 0.000 claims description 59
- 102000040430 polynucleotide Human genes 0.000 claims description 59
- 239000002157 polynucleotide Substances 0.000 claims description 59
- 210000004027 cell Anatomy 0.000 claims description 54
- 125000003729 nucleotide group Chemical group 0.000 claims description 41
- 239000002773 nucleotide Substances 0.000 claims description 40
- 230000014509 gene expression Effects 0.000 claims description 32
- 238000010362 genome editing Methods 0.000 claims description 21
- 239000013598 vector Substances 0.000 claims description 20
- 125000003275 alpha amino acid group Chemical group 0.000 claims description 18
- 230000001105 regulatory effect Effects 0.000 claims description 14
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 claims description 13
- 206010028980 Neoplasm Diseases 0.000 claims description 9
- 201000010099 disease Diseases 0.000 claims description 9
- 238000000338 in vitro Methods 0.000 claims description 8
- 208000023275 Autoimmune disease Diseases 0.000 claims description 6
- 238000011282 treatment Methods 0.000 claims description 6
- 208000022362 bacterial infectious disease Diseases 0.000 claims description 5
- 239000003814 drug Substances 0.000 claims description 5
- 230000001225 therapeutic effect Effects 0.000 claims description 5
- 208000035473 Communicable disease Diseases 0.000 claims description 4
- 229940079593 drug Drugs 0.000 claims description 4
- 238000012937 correction Methods 0.000 claims description 3
- 210000003527 eukaryotic cell Anatomy 0.000 claims description 3
- 210000005260 human cell Anatomy 0.000 claims description 3
- 238000011160 research Methods 0.000 claims description 3
- 239000003795 chemical substances by application Substances 0.000 claims description 2
- 210000004748 cultured cell Anatomy 0.000 claims description 2
- 210000001236 prokaryotic cell Anatomy 0.000 claims description 2
- 230000003612 virological effect Effects 0.000 claims description 2
- 239000012634 fragment Substances 0.000 abstract description 68
- 108091033409 CRISPR Proteins 0.000 abstract description 48
- ISAKRJDGNUQOIC-UHFFFAOYSA-N Uracil Chemical compound O=C1C=CNC(=O)N1 ISAKRJDGNUQOIC-UHFFFAOYSA-N 0.000 abstract description 48
- 108010008532 Deoxyribonuclease I Proteins 0.000 abstract description 39
- 102000007260 Deoxyribonuclease I Human genes 0.000 abstract description 39
- 229940035893 uracil Drugs 0.000 abstract description 24
- 102000052510 DNA-Binding Proteins Human genes 0.000 abstract description 16
- 101710096438 DNA-binding protein Proteins 0.000 abstract description 14
- 230000035772 mutation Effects 0.000 abstract description 13
- 102000004169 proteins and genes Human genes 0.000 description 62
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 60
- 108020004414 DNA Proteins 0.000 description 57
- 235000018102 proteins Nutrition 0.000 description 52
- 108010062796 arginyllysine Proteins 0.000 description 27
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 25
- 108010054155 lysyllysine Proteins 0.000 description 23
- 229940024606 amino acid Drugs 0.000 description 22
- 235000001014 amino acid Nutrition 0.000 description 22
- 108010008355 arginyl-glutamine Proteins 0.000 description 22
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 21
- 230000000694 effects Effects 0.000 description 21
- 108010034529 leucyl-lysine Proteins 0.000 description 21
- 150000001413 amino acids Chemical class 0.000 description 20
- 108010050848 glycylleucine Proteins 0.000 description 20
- 108010092854 aspartyllysine Proteins 0.000 description 19
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 17
- 108010073969 valyllysine Proteins 0.000 description 17
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 16
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical compound NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 16
- FSXRLASFHBWESK-UHFFFAOYSA-N dipeptide phenylalanyl-tyrosine Natural products C=1C=C(O)C=CC=1CC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FSXRLASFHBWESK-UHFFFAOYSA-N 0.000 description 16
- 239000013604 expression vector Substances 0.000 description 16
- 239000013612 plasmid Substances 0.000 description 16
- PMGDADKJMCOXHX-UHFFFAOYSA-N L-Arginyl-L-glutamin-acetat Natural products NC(=N)NCCCC(N)C(=O)NC(CCC(N)=O)C(O)=O PMGDADKJMCOXHX-UHFFFAOYSA-N 0.000 description 15
- 108010044940 alanylglutamine Proteins 0.000 description 15
- 238000010276 construction Methods 0.000 description 15
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 15
- 108010057821 leucylproline Proteins 0.000 description 15
- 108010017391 lysylvaline Proteins 0.000 description 15
- 239000000203 mixture Substances 0.000 description 15
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 14
- 108010086434 alanyl-seryl-glycine Proteins 0.000 description 14
- 108010013835 arginine glutamate Proteins 0.000 description 14
- YWAQATDNEKZFFK-BYPYZUCNSA-N Gly-Gly-Ser Chemical compound NCC(=O)NCC(=O)N[C@@H](CO)C(O)=O YWAQATDNEKZFFK-BYPYZUCNSA-N 0.000 description 13
- 108010003700 lysyl aspartic acid Proteins 0.000 description 13
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysin Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 12
- 241000880493 Leptailurus serval Species 0.000 description 12
- JLKVWTICWVWGSK-JYJNAYRXSA-N Tyr-Lys-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 JLKVWTICWVWGSK-JYJNAYRXSA-N 0.000 description 12
- BMGOFDMKDVVGJG-NHCYSSNCSA-N Val-Asp-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N BMGOFDMKDVVGJG-NHCYSSNCSA-N 0.000 description 12
- 108010005233 alanylglutamic acid Proteins 0.000 description 12
- 108010087924 alanylproline Proteins 0.000 description 12
- 108010015792 glycyllysine Proteins 0.000 description 12
- GVNNAHIRSDRIII-AJNGGQMLSA-N Ile-Lys-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)O)N GVNNAHIRSDRIII-AJNGGQMLSA-N 0.000 description 11
- 108010047562 NGR peptide Proteins 0.000 description 11
- YGKVNUAKYPGORG-AVGNSLFASA-N Tyr-Asp-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O YGKVNUAKYPGORG-AVGNSLFASA-N 0.000 description 11
- 108010064235 lysylglycine Proteins 0.000 description 11
- 108010012581 phenylalanylglutamate Proteins 0.000 description 11
- 108090000765 processed proteins & peptides Proteins 0.000 description 11
- MDNAVFBZPROEHO-UHFFFAOYSA-N Ala-Lys-Val Natural products CC(C)C(C(O)=O)NC(=O)C(NC(=O)C(C)N)CCCCN MDNAVFBZPROEHO-UHFFFAOYSA-N 0.000 description 10
- OLGCWMNDJTWQAG-GUBZILKMSA-N Asn-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC(N)=O OLGCWMNDJTWQAG-GUBZILKMSA-N 0.000 description 10
- 108010080611 Cytosine Deaminase Proteins 0.000 description 10
- 102000000311 Cytosine Deaminase Human genes 0.000 description 10
- IOFDDSNZJDIGPB-GVXVVHGQSA-N Gln-Leu-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O IOFDDSNZJDIGPB-GVXVVHGQSA-N 0.000 description 10
- IESFZVCAVACGPH-PEFMBERDSA-N Glu-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCC(O)=O IESFZVCAVACGPH-PEFMBERDSA-N 0.000 description 10
- QQLBPVKLJBAXBS-FXQIFTODSA-N Glu-Glu-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O QQLBPVKLJBAXBS-FXQIFTODSA-N 0.000 description 10
- NZGTYCMLUGYMCV-XUXIUFHCSA-N Ile-Lys-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N NZGTYCMLUGYMCV-XUXIUFHCSA-N 0.000 description 10
- YQFZRHYZLARWDY-IHRRRGAJSA-N Leu-Val-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN YQFZRHYZLARWDY-IHRRRGAJSA-N 0.000 description 10
- DRRXXZBXDMLGFC-IHRRRGAJSA-N Lys-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN DRRXXZBXDMLGFC-IHRRRGAJSA-N 0.000 description 10
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 10
- 108010077850 Nuclear Localization Signals Proteins 0.000 description 10
- UOLGINIHBRIECN-FXQIFTODSA-N Ser-Glu-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O UOLGINIHBRIECN-FXQIFTODSA-N 0.000 description 10
- DSGYZICNAMEJOC-AVGNSLFASA-N Ser-Glu-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O DSGYZICNAMEJOC-AVGNSLFASA-N 0.000 description 10
- YMTLKLXDFCSCNX-BYPYZUCNSA-N Ser-Gly-Gly Chemical compound OC[C@H](N)C(=O)NCC(=O)NCC(O)=O YMTLKLXDFCSCNX-BYPYZUCNSA-N 0.000 description 10
- KRDSCBLRHORMRK-JXUBOQSCSA-N Thr-Lys-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O KRDSCBLRHORMRK-JXUBOQSCSA-N 0.000 description 10
- VLOYGOZDPGYWFO-LAEOZQHASA-N Val-Asp-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O VLOYGOZDPGYWFO-LAEOZQHASA-N 0.000 description 10
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 10
- 108010038633 aspartylglutamate Proteins 0.000 description 10
- 108010078274 isoleucylvaline Proteins 0.000 description 10
- 210000000822 natural killer cell Anatomy 0.000 description 10
- 108010073025 phenylalanylphenylalanine Proteins 0.000 description 10
- 108010061238 threonyl-glycine Proteins 0.000 description 10
- IZSMEUDYADKZTJ-KJEVXHAQSA-N Arg-Tyr-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O IZSMEUDYADKZTJ-KJEVXHAQSA-N 0.000 description 9
- 108091026890 Coding region Proteins 0.000 description 9
- TWYSSILQABLLME-HJGDQZAQSA-N Glu-Thr-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O TWYSSILQABLLME-HJGDQZAQSA-N 0.000 description 9
- JFSGIJSCJFQGSZ-MXAVVETBSA-N Leu-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC(C)C)N JFSGIJSCJFQGSZ-MXAVVETBSA-N 0.000 description 9
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 9
- 239000013613 expression plasmid Substances 0.000 description 9
- 108010063718 gamma-glutamylaspartic acid Proteins 0.000 description 9
- 108010025306 histidylleucine Proteins 0.000 description 9
- 108010085325 histidylproline Proteins 0.000 description 9
- 238000003780 insertion Methods 0.000 description 9
- 230000037431 insertion Effects 0.000 description 9
- 108010051242 phenylalanylserine Proteins 0.000 description 9
- 238000001890 transfection Methods 0.000 description 9
- 108010051110 tyrosyl-lysine Proteins 0.000 description 9
- QXHVOUSPVAWEMX-ZLUOBGJFSA-N Asp-Asp-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O QXHVOUSPVAWEMX-ZLUOBGJFSA-N 0.000 description 8
- 102000004190 Enzymes Human genes 0.000 description 8
- 108090000790 Enzymes Proteins 0.000 description 8
- LJPIRKICOISLKN-WHFBIAKZSA-N Gly-Ala-Ser Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O LJPIRKICOISLKN-WHFBIAKZSA-N 0.000 description 8
- FGPLUIQCSKGLTI-WDSKDSINSA-N Gly-Ser-Glu Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O FGPLUIQCSKGLTI-WDSKDSINSA-N 0.000 description 8
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 8
- YOZCKMXHBYKOMQ-IHRRRGAJSA-N Leu-Arg-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(=O)O)N YOZCKMXHBYKOMQ-IHRRRGAJSA-N 0.000 description 8
- MYGQXVYRZMKRDB-SRVKXCTJSA-N Leu-Asp-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN MYGQXVYRZMKRDB-SRVKXCTJSA-N 0.000 description 8
- VJGQRELPQWNURN-JYJNAYRXSA-N Leu-Tyr-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O VJGQRELPQWNURN-JYJNAYRXSA-N 0.000 description 8
- 108010047857 aspartylglycine Proteins 0.000 description 8
- 108010068265 aspartyltyrosine Proteins 0.000 description 8
- 229940104302 cytosine Drugs 0.000 description 8
- 238000005516 engineering process Methods 0.000 description 8
- 108010049041 glutamylalanine Proteins 0.000 description 8
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 8
- 244000005700 microbiome Species 0.000 description 8
- 150000007523 nucleic acids Chemical class 0.000 description 8
- 108010031719 prolyl-serine Proteins 0.000 description 8
- 108010069117 seryl-lysyl-aspartic acid Proteins 0.000 description 8
- FVSOUJZKYWEFOB-KBIXCLLPSA-N Ala-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](C)N FVSOUJZKYWEFOB-KBIXCLLPSA-N 0.000 description 7
- MVBWLRJESQOQTM-ACZMJKKPSA-N Ala-Gln-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O MVBWLRJESQOQTM-ACZMJKKPSA-N 0.000 description 7
- RTZCUEHYUQZIDE-WHFBIAKZSA-N Ala-Ser-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RTZCUEHYUQZIDE-WHFBIAKZSA-N 0.000 description 7
- MUXONAMCEUBVGA-DCAQKATOSA-N Arg-Arg-Gln Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(N)=O)C(O)=O MUXONAMCEUBVGA-DCAQKATOSA-N 0.000 description 7
- FSNVAJOPUDVQAR-AVGNSLFASA-N Arg-Lys-Arg Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FSNVAJOPUDVQAR-AVGNSLFASA-N 0.000 description 7
- UGKZHCBLMLSANF-CIUDSAMLSA-N Asp-Asn-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O UGKZHCBLMLSANF-CIUDSAMLSA-N 0.000 description 7
- WAEDSQFVZJUHLI-BYULHYEWSA-N Asp-Val-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O WAEDSQFVZJUHLI-BYULHYEWSA-N 0.000 description 7
- 108020004705 Codon Proteins 0.000 description 7
- KVXVVDFOZNYYKZ-DCAQKATOSA-N Gln-Gln-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O KVXVVDFOZNYYKZ-DCAQKATOSA-N 0.000 description 7
- FTIJVMLAGRAYMJ-MNXVOIDGSA-N Gln-Ile-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(N)=O FTIJVMLAGRAYMJ-MNXVOIDGSA-N 0.000 description 7
- FITIQFSXXBKFFM-NRPADANISA-N Gln-Val-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O FITIQFSXXBKFFM-NRPADANISA-N 0.000 description 7
- PAQUJCSYVIBPLC-AVGNSLFASA-N Glu-Asp-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 PAQUJCSYVIBPLC-AVGNSLFASA-N 0.000 description 7
- PJBVXVBTTFZPHJ-GUBZILKMSA-N Glu-Leu-Asp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)O)N PJBVXVBTTFZPHJ-GUBZILKMSA-N 0.000 description 7
- MWMJCGBSIORNCD-AVGNSLFASA-N Glu-Leu-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O MWMJCGBSIORNCD-AVGNSLFASA-N 0.000 description 7
- NJCALAAIGREHDR-WDCWCFNPSA-N Glu-Leu-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NJCALAAIGREHDR-WDCWCFNPSA-N 0.000 description 7
- WCORRBXVISTKQL-WHFBIAKZSA-N Gly-Ser-Ser Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WCORRBXVISTKQL-WHFBIAKZSA-N 0.000 description 7
- HUORUFRRJHELPD-MNXVOIDGSA-N Ile-Leu-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N HUORUFRRJHELPD-MNXVOIDGSA-N 0.000 description 7
- GLYJPWIRLBAIJH-UHFFFAOYSA-N Ile-Lys-Pro Natural products CCC(C)C(N)C(=O)NC(CCCCN)C(=O)N1CCCC1C(O)=O GLYJPWIRLBAIJH-UHFFFAOYSA-N 0.000 description 7
- VCHVSKNMTXWIIP-SRVKXCTJSA-N Leu-Lys-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O VCHVSKNMTXWIIP-SRVKXCTJSA-N 0.000 description 7
- ZAVCJRJOQKIOJW-KKUMJFAQSA-N Leu-Phe-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)CC1=CC=CC=C1 ZAVCJRJOQKIOJW-KKUMJFAQSA-N 0.000 description 7
- PJWOOBTYQNNRBF-BZSNNMDCSA-N Leu-Phe-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)O)N PJWOOBTYQNNRBF-BZSNNMDCSA-N 0.000 description 7
- AEDWWMMHUGYIFD-HJGDQZAQSA-N Leu-Thr-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(O)=O AEDWWMMHUGYIFD-HJGDQZAQSA-N 0.000 description 7
- JGKHAFUAPZCCDU-BZSNNMDCSA-N Leu-Tyr-Leu Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)CC1=CC=C(O)C=C1 JGKHAFUAPZCCDU-BZSNNMDCSA-N 0.000 description 7
- FBNPMTNBFFAMMH-AVGNSLFASA-N Leu-Val-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-AVGNSLFASA-N 0.000 description 7
- FBNPMTNBFFAMMH-UHFFFAOYSA-N Leu-Val-Arg Natural products CC(C)CC(N)C(=O)NC(C(C)C)C(=O)NC(C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-UHFFFAOYSA-N 0.000 description 7
- RZHLIPMZXOEJTL-AVGNSLFASA-N Lys-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCCN)N RZHLIPMZXOEJTL-AVGNSLFASA-N 0.000 description 7
- LLSUNJYOSCOOEB-GUBZILKMSA-N Lys-Glu-Asp Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O LLSUNJYOSCOOEB-GUBZILKMSA-N 0.000 description 7
- QBEPTBMRQALPEV-MNXVOIDGSA-N Lys-Ile-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCCCN QBEPTBMRQALPEV-MNXVOIDGSA-N 0.000 description 7
- SQRLLZAQNOQCEG-KKUMJFAQSA-N Lys-Tyr-Ser Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CC=C(O)C=C1 SQRLLZAQNOQCEG-KKUMJFAQSA-N 0.000 description 7
- WMGVYPPIMZPWPN-SRVKXCTJSA-N Phe-Asp-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N WMGVYPPIMZPWPN-SRVKXCTJSA-N 0.000 description 7
- KRYSMKKRRRWOCZ-QEWYBTABSA-N Phe-Ile-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O KRYSMKKRRRWOCZ-QEWYBTABSA-N 0.000 description 7
- VOZIBWWZSBIXQN-SRVKXCTJSA-N Pro-Glu-Lys Chemical compound NCCCC[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1)C(O)=O VOZIBWWZSBIXQN-SRVKXCTJSA-N 0.000 description 7
- 108010003201 RGH 0205 Proteins 0.000 description 7
- UPLYXVPQLJVWMM-KKUMJFAQSA-N Ser-Phe-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O UPLYXVPQLJVWMM-KKUMJFAQSA-N 0.000 description 7
- IRKWVRSEQFTGGV-VEVYYDQMSA-N Thr-Asn-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O IRKWVRSEQFTGGV-VEVYYDQMSA-N 0.000 description 7
- UABYBEBXFFNCIR-YDHLFZDLSA-N Tyr-Asp-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O UABYBEBXFFNCIR-YDHLFZDLSA-N 0.000 description 7
- PMHLLBKTDHQMCY-ULQDDVLXSA-N Tyr-Lys-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O PMHLLBKTDHQMCY-ULQDDVLXSA-N 0.000 description 7
- VBFVQTPETKJCQW-RPTUDFQQSA-N Tyr-Phe-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VBFVQTPETKJCQW-RPTUDFQQSA-N 0.000 description 7
- APQIVBCUIUDSMB-OSUNSFLBSA-N Val-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](C(C)C)N APQIVBCUIUDSMB-OSUNSFLBSA-N 0.000 description 7
- SSYBNWFXCFNRFN-GUBZILKMSA-N Val-Pro-Ser Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O SSYBNWFXCFNRFN-GUBZILKMSA-N 0.000 description 7
- MIAZWUMFUURQNP-YDHLFZDLSA-N Val-Tyr-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N MIAZWUMFUURQNP-YDHLFZDLSA-N 0.000 description 7
- 108010008685 alanyl-glutamyl-aspartic acid Proteins 0.000 description 7
- 108010024078 alanyl-glycyl-serine Proteins 0.000 description 7
- 238000004458 analytical method Methods 0.000 description 7
- 108010001271 arginyl-glutamyl-arginine Proteins 0.000 description 7
- 108010059459 arginyl-threonyl-phenylalanine Proteins 0.000 description 7
- 108010084758 arginyl-tyrosyl-aspartic acid Proteins 0.000 description 7
- 108010077245 asparaginyl-proline Proteins 0.000 description 7
- 238000006243 chemical reaction Methods 0.000 description 7
- 108010079547 glutamylmethionine Proteins 0.000 description 7
- HPAIKDPJURGQLN-UHFFFAOYSA-N glycyl-L-histidyl-L-phenylalanine Natural products C=1C=CC=CC=1CC(C(O)=O)NC(=O)C(NC(=O)CN)CC1=CN=CN1 HPAIKDPJURGQLN-UHFFFAOYSA-N 0.000 description 7
- 108010026364 glycyl-glycyl-leucine Proteins 0.000 description 7
- 108010066198 glycyl-leucyl-phenylalanine Proteins 0.000 description 7
- 108010089804 glycyl-threonine Proteins 0.000 description 7
- 108010048994 glycyl-tyrosyl-alanine Proteins 0.000 description 7
- 108010044374 isoleucyl-tyrosine Proteins 0.000 description 7
- 108010044311 leucyl-glycyl-glycine Proteins 0.000 description 7
- 108010045397 lysyl-tyrosyl-lysine Proteins 0.000 description 7
- 239000002609 medium Substances 0.000 description 7
- 102000039446 nucleic acids Human genes 0.000 description 7
- 108020004707 nucleic acids Proteins 0.000 description 7
- 108010064486 phenylalanyl-leucyl-valine Proteins 0.000 description 7
- 108010025488 pinealon Proteins 0.000 description 7
- 108010026333 seryl-proline Proteins 0.000 description 7
- 239000000243 solution Substances 0.000 description 7
- KDCGOANMDULRCW-UHFFFAOYSA-N 7H-purine Chemical compound N1=CNC2=NC=NC2=C1 KDCGOANMDULRCW-UHFFFAOYSA-N 0.000 description 6
- LWUWMHIOBPTZBA-DCAQKATOSA-N Ala-Arg-Lys Chemical compound NC(=N)NCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CCCCN)C(O)=O LWUWMHIOBPTZBA-DCAQKATOSA-N 0.000 description 6
- TZDNWXDLYFIFPT-BJDJZHNGSA-N Ala-Ile-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O TZDNWXDLYFIFPT-BJDJZHNGSA-N 0.000 description 6
- UZGFHWIJWPUPOH-IHRRRGAJSA-N Arg-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UZGFHWIJWPUPOH-IHRRRGAJSA-N 0.000 description 6
- BTJVOUQWFXABOI-IHRRRGAJSA-N Arg-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCNC(N)=N BTJVOUQWFXABOI-IHRRRGAJSA-N 0.000 description 6
- DDPXDCKYWDGZAL-BQBZGAKWSA-N Asn-Gly-Arg Chemical compound NC(=O)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N DDPXDCKYWDGZAL-BQBZGAKWSA-N 0.000 description 6
- OPEPUCYIGFEGSW-WDSKDSINSA-N Asn-Gly-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O OPEPUCYIGFEGSW-WDSKDSINSA-N 0.000 description 6
- LSJQOMAZIKQMTJ-SRVKXCTJSA-N Asn-Phe-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O LSJQOMAZIKQMTJ-SRVKXCTJSA-N 0.000 description 6
- OOXUBGLNDRGOKT-FXQIFTODSA-N Asn-Ser-Arg Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O OOXUBGLNDRGOKT-FXQIFTODSA-N 0.000 description 6
- NVFSJIXJZCDICF-SRVKXCTJSA-N Asp-Lys-Lys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N NVFSJIXJZCDICF-SRVKXCTJSA-N 0.000 description 6
- UAXIKORUDGGIGA-DCAQKATOSA-N Asp-Pro-Lys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC(=O)O)N)C(=O)N[C@@H](CCCCN)C(=O)O UAXIKORUDGGIGA-DCAQKATOSA-N 0.000 description 6
- SHAUZYVSXAMYAZ-JYJNAYRXSA-N Gln-Leu-Phe Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CCC(=O)N)N SHAUZYVSXAMYAZ-JYJNAYRXSA-N 0.000 description 6
- KPNWAJMEMRCLAL-GUBZILKMSA-N Gln-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)N)N KPNWAJMEMRCLAL-GUBZILKMSA-N 0.000 description 6
- PBEQPAZRHDVJQI-SRVKXCTJSA-N Glu-Arg-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCC(=O)O)N PBEQPAZRHDVJQI-SRVKXCTJSA-N 0.000 description 6
- MTAOBYXRYJZRGQ-WDSKDSINSA-N Glu-Gly-Asp Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O MTAOBYXRYJZRGQ-WDSKDSINSA-N 0.000 description 6
- ZYRXTRTUCAVNBQ-GVXVVHGQSA-N Glu-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N ZYRXTRTUCAVNBQ-GVXVVHGQSA-N 0.000 description 6
- SOYWRINXUSUWEQ-DLOVCJGASA-N Glu-Val-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCC(O)=O SOYWRINXUSUWEQ-DLOVCJGASA-N 0.000 description 6
- BGVYNAQWHSTTSP-BYULHYEWSA-N Gly-Asn-Ile Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BGVYNAQWHSTTSP-BYULHYEWSA-N 0.000 description 6
- NTBOEZICHOSJEE-YUMQZZPRSA-N Gly-Lys-Ser Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O NTBOEZICHOSJEE-YUMQZZPRSA-N 0.000 description 6
- FFALDIDGPLUDKV-ZDLURKLDSA-N Gly-Thr-Ser Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O FFALDIDGPLUDKV-ZDLURKLDSA-N 0.000 description 6
- VAXBXNPRXPHGHG-BJDJZHNGSA-N Ile-Ala-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)O)N VAXBXNPRXPHGHG-BJDJZHNGSA-N 0.000 description 6
- DFJJAVZIHDFOGQ-MNXVOIDGSA-N Ile-Glu-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N DFJJAVZIHDFOGQ-MNXVOIDGSA-N 0.000 description 6
- ZURHXHNAEJJRNU-CIUDSAMLSA-N Leu-Asp-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ZURHXHNAEJJRNU-CIUDSAMLSA-N 0.000 description 6
- ZTLGVASZOIKNIX-DCAQKATOSA-N Leu-Gln-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ZTLGVASZOIKNIX-DCAQKATOSA-N 0.000 description 6
- LOLUPZNNADDTAA-AVGNSLFASA-N Leu-Gln-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LOLUPZNNADDTAA-AVGNSLFASA-N 0.000 description 6
- LXKNSJLSGPNHSK-KKUMJFAQSA-N Leu-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N LXKNSJLSGPNHSK-KKUMJFAQSA-N 0.000 description 6
- RZXLZBIUTDQHJQ-SRVKXCTJSA-N Leu-Lys-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O RZXLZBIUTDQHJQ-SRVKXCTJSA-N 0.000 description 6
- QWWPYKKLXWOITQ-VOAKCMCISA-N Leu-Thr-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(C)C QWWPYKKLXWOITQ-VOAKCMCISA-N 0.000 description 6
- PHHYNOUOUWYQRO-XIRDDKMYSA-N Lys-Asp-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCCCN)N PHHYNOUOUWYQRO-XIRDDKMYSA-N 0.000 description 6
- DCRWPTBMWMGADO-AVGNSLFASA-N Lys-Glu-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O DCRWPTBMWMGADO-AVGNSLFASA-N 0.000 description 6
- RQILLQOQXLZTCK-KBPBESRZSA-N Lys-Tyr-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O RQILLQOQXLZTCK-KBPBESRZSA-N 0.000 description 6
- 241000124008 Mammalia Species 0.000 description 6
- BEZJTLKUMFMITF-AVGNSLFASA-N Met-Lys-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CCCNC(N)=N BEZJTLKUMFMITF-AVGNSLFASA-N 0.000 description 6
- 241000186364 Mycobacterium intracellulare Species 0.000 description 6
- 241000187480 Mycobacterium smegmatis Species 0.000 description 6
- XXXAXOWMBOKTRN-XPUUQOCRSA-N Ser-Gly-Val Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O XXXAXOWMBOKTRN-XPUUQOCRSA-N 0.000 description 6
- XNCUYZKGQOCOQH-YUMQZZPRSA-N Ser-Leu-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O XNCUYZKGQOCOQH-YUMQZZPRSA-N 0.000 description 6
- PPNPDKGQRFSCAC-CIUDSAMLSA-N Ser-Lys-Asp Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)CO)C(=O)N[C@@H](CC(O)=O)C(O)=O PPNPDKGQRFSCAC-CIUDSAMLSA-N 0.000 description 6
- JMGJDTNUMAZNLX-RWRJDSDZSA-N Thr-Glu-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JMGJDTNUMAZNLX-RWRJDSDZSA-N 0.000 description 6
- 102000008579 Transposases Human genes 0.000 description 6
- 108010020764 Transposases Proteins 0.000 description 6
- NKUGCYDFQKFVOJ-JYJNAYRXSA-N Tyr-Leu-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 NKUGCYDFQKFVOJ-JYJNAYRXSA-N 0.000 description 6
- UMPVMAYCLYMYGA-ONGXEEELSA-N Val-Leu-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O UMPVMAYCLYMYGA-ONGXEEELSA-N 0.000 description 6
- YMTOEGGOCHVGEH-IHRRRGAJSA-N Val-Lys-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O YMTOEGGOCHVGEH-IHRRRGAJSA-N 0.000 description 6
- 239000004480 active ingredient Substances 0.000 description 6
- 108010068380 arginylarginine Proteins 0.000 description 6
- 238000010586 diagram Methods 0.000 description 6
- -1 for example Substances 0.000 description 6
- 108010000434 glycyl-alanyl-leucine Proteins 0.000 description 6
- UYTPUPDQBNUYGX-UHFFFAOYSA-N guanine Chemical compound O=C1NC(N)=NC2=C1N=CN2 UYTPUPDQBNUYGX-UHFFFAOYSA-N 0.000 description 6
- 108010040030 histidinoalanine Proteins 0.000 description 6
- 108010018006 histidylserine Proteins 0.000 description 6
- 108010044348 lysyl-glutamyl-aspartic acid Proteins 0.000 description 6
- 108020004999 messenger RNA Proteins 0.000 description 6
- 238000002360 preparation method Methods 0.000 description 6
- 102000004196 processed proteins & peptides Human genes 0.000 description 6
- RWQNBRDOKXIBIV-UHFFFAOYSA-N thymine Chemical compound CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 description 6
- 108010020532 tyrosyl-proline Proteins 0.000 description 6
- SVBXIUDNTRTKHE-CIUDSAMLSA-N Ala-Arg-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O SVBXIUDNTRTKHE-CIUDSAMLSA-N 0.000 description 5
- NXSFUECZFORGOG-CIUDSAMLSA-N Ala-Asn-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O NXSFUECZFORGOG-CIUDSAMLSA-N 0.000 description 5
- NJPMYXWVWQWCSR-ACZMJKKPSA-N Ala-Glu-Asn Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O NJPMYXWVWQWCSR-ACZMJKKPSA-N 0.000 description 5
- BEMGNWZECGIJOI-WDSKDSINSA-N Ala-Gly-Glu Chemical compound [H]N[C@@H](C)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O BEMGNWZECGIJOI-WDSKDSINSA-N 0.000 description 5
- QHASENCZLDHBGX-ONGXEEELSA-N Ala-Gly-Phe Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 QHASENCZLDHBGX-ONGXEEELSA-N 0.000 description 5
- NBTGEURICRTMGL-WHFBIAKZSA-N Ala-Gly-Ser Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O NBTGEURICRTMGL-WHFBIAKZSA-N 0.000 description 5
- NIZKGBJVCMRDKO-KWQFWETISA-N Ala-Gly-Tyr Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NIZKGBJVCMRDKO-KWQFWETISA-N 0.000 description 5
- FDAZDMAFZYTHGS-XVYDVKMFSA-N Ala-His-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(O)=O FDAZDMAFZYTHGS-XVYDVKMFSA-N 0.000 description 5
- SUHLZMHFRALVSY-YUMQZZPRSA-N Ala-Lys-Gly Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)NCC(O)=O SUHLZMHFRALVSY-YUMQZZPRSA-N 0.000 description 5
- MDNAVFBZPROEHO-DCAQKATOSA-N Ala-Lys-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O MDNAVFBZPROEHO-DCAQKATOSA-N 0.000 description 5
- WEZNQZHACPSMEF-QEJZJMRPSA-N Ala-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 WEZNQZHACPSMEF-QEJZJMRPSA-N 0.000 description 5
- OMCKWYSDUQBYCN-FXQIFTODSA-N Ala-Ser-Met Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCSC)C(O)=O OMCKWYSDUQBYCN-FXQIFTODSA-N 0.000 description 5
- QOIGKCBMXUCDQU-KDXUFGMBSA-N Ala-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C)N)O QOIGKCBMXUCDQU-KDXUFGMBSA-N 0.000 description 5
- VYMJAWXRWHJIMS-LKTVYLICSA-N Ala-Tyr-His Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N VYMJAWXRWHJIMS-LKTVYLICSA-N 0.000 description 5
- ZJLORAAXDAJLDC-CQDKDKBSSA-N Ala-Tyr-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O ZJLORAAXDAJLDC-CQDKDKBSSA-N 0.000 description 5
- IIABBYGHLYWVOS-FXQIFTODSA-N Arg-Asn-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O IIABBYGHLYWVOS-FXQIFTODSA-N 0.000 description 5
- SQKPKIJVWHAWNF-DCAQKATOSA-N Arg-Asp-Lys Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(O)=O SQKPKIJVWHAWNF-DCAQKATOSA-N 0.000 description 5
- OGUPCHKBOKJFMA-SRVKXCTJSA-N Arg-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N OGUPCHKBOKJFMA-SRVKXCTJSA-N 0.000 description 5
- DJAIOAKQIOGULM-DCAQKATOSA-N Arg-Glu-Met Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(O)=O DJAIOAKQIOGULM-DCAQKATOSA-N 0.000 description 5
- QKSAZKCRVQYYGS-UWVGGRQHSA-N Arg-Gly-His Chemical compound N[C@@H](CCCN=C(N)N)C(=O)NCC(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O QKSAZKCRVQYYGS-UWVGGRQHSA-N 0.000 description 5
- CRCCTGPNZUCAHE-DCAQKATOSA-N Arg-His-Ser Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CN=CN1 CRCCTGPNZUCAHE-DCAQKATOSA-N 0.000 description 5
- AGVNTAUPLWIQEN-ZPFDUUQYSA-N Arg-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N AGVNTAUPLWIQEN-ZPFDUUQYSA-N 0.000 description 5
- COXMUHNBYCVVRG-DCAQKATOSA-N Arg-Leu-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O COXMUHNBYCVVRG-DCAQKATOSA-N 0.000 description 5
- CLICCYPMVFGUOF-IHRRRGAJSA-N Arg-Lys-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O CLICCYPMVFGUOF-IHRRRGAJSA-N 0.000 description 5
- PAPSMOYMQDWIOR-AVGNSLFASA-N Arg-Lys-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O PAPSMOYMQDWIOR-AVGNSLFASA-N 0.000 description 5
- ADPACBMPYWJJCE-FXQIFTODSA-N Arg-Ser-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O ADPACBMPYWJJCE-FXQIFTODSA-N 0.000 description 5
- PSUXEQYPYZLNER-QXEWZRGKSA-N Arg-Val-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O PSUXEQYPYZLNER-QXEWZRGKSA-N 0.000 description 5
- SLKLLQWZQHXYSV-CIUDSAMLSA-N Asn-Ala-Lys Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O SLKLLQWZQHXYSV-CIUDSAMLSA-N 0.000 description 5
- XHFXZQHTLJVZBN-FXQIFTODSA-N Asn-Arg-Asn Chemical compound C(C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N)CN=C(N)N XHFXZQHTLJVZBN-FXQIFTODSA-N 0.000 description 5
- QISZHYWZHJRDAO-CIUDSAMLSA-N Asn-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)N)N QISZHYWZHJRDAO-CIUDSAMLSA-N 0.000 description 5
- OKZOABJQOMAYEC-NUMRIWBASA-N Asn-Gln-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OKZOABJQOMAYEC-NUMRIWBASA-N 0.000 description 5
- ASCGFDYEKSRNPL-CIUDSAMLSA-N Asn-Glu-Met Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(O)=O ASCGFDYEKSRNPL-CIUDSAMLSA-N 0.000 description 5
- HYQYLOSCICEYTR-YUMQZZPRSA-N Asn-Gly-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O HYQYLOSCICEYTR-YUMQZZPRSA-N 0.000 description 5
- UDSVWSUXKYXSTR-QWRGUYRKSA-N Asn-Gly-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O UDSVWSUXKYXSTR-QWRGUYRKSA-N 0.000 description 5
- JLNFZLNDHONLND-GARJFASQSA-N Asn-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N JLNFZLNDHONLND-GARJFASQSA-N 0.000 description 5
- RVHGJNGNKGDCPX-KKUMJFAQSA-N Asn-Phe-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N RVHGJNGNKGDCPX-KKUMJFAQSA-N 0.000 description 5
- VWADICJNCPFKJS-ZLUOBGJFSA-N Asn-Ser-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O VWADICJNCPFKJS-ZLUOBGJFSA-N 0.000 description 5
- HPASIOLTWSNMFB-OLHMAJIHSA-N Asn-Thr-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O HPASIOLTWSNMFB-OLHMAJIHSA-N 0.000 description 5
- GHWWTICYPDKPTE-NGZCFLSTSA-N Asn-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N GHWWTICYPDKPTE-NGZCFLSTSA-N 0.000 description 5
- VPPXTHJNTYDNFJ-CIUDSAMLSA-N Asp-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N VPPXTHJNTYDNFJ-CIUDSAMLSA-N 0.000 description 5
- JDHOJQJMWBKHDB-CIUDSAMLSA-N Asp-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N JDHOJQJMWBKHDB-CIUDSAMLSA-N 0.000 description 5
- PAYPSKIBMDHZPI-CIUDSAMLSA-N Asp-Leu-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O PAYPSKIBMDHZPI-CIUDSAMLSA-N 0.000 description 5
- ORRJQLIATJDMQM-HJGDQZAQSA-N Asp-Leu-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O ORRJQLIATJDMQM-HJGDQZAQSA-N 0.000 description 5
- CTWCFPWFIGRAEP-CIUDSAMLSA-N Asp-Lys-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O CTWCFPWFIGRAEP-CIUDSAMLSA-N 0.000 description 5
- MYLZFUMPZCPJCJ-NHCYSSNCSA-N Asp-Lys-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O MYLZFUMPZCPJCJ-NHCYSSNCSA-N 0.000 description 5
- WQSXAPPYLGNMQL-IHRRRGAJSA-N Asp-Met-Tyr Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CC(=O)O)N WQSXAPPYLGNMQL-IHRRRGAJSA-N 0.000 description 5
- LIJXJYGRSRWLCJ-IHRRRGAJSA-N Asp-Phe-Arg Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O LIJXJYGRSRWLCJ-IHRRRGAJSA-N 0.000 description 5
- ZBYLEBZCVKLPCY-FXQIFTODSA-N Asp-Ser-Arg Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O ZBYLEBZCVKLPCY-FXQIFTODSA-N 0.000 description 5
- KGHLGJAXYSVNJP-WHFBIAKZSA-N Asp-Ser-Gly Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O KGHLGJAXYSVNJP-WHFBIAKZSA-N 0.000 description 5
- DRCOAZZDQRCGGP-GHCJXIJMSA-N Asp-Ser-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DRCOAZZDQRCGGP-GHCJXIJMSA-N 0.000 description 5
- DTMLKCYOQKZXKZ-HJGDQZAQSA-N Gln-Arg-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DTMLKCYOQKZXKZ-HJGDQZAQSA-N 0.000 description 5
- LGIKBBLQVSWUGK-DCAQKATOSA-N Gln-Leu-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O LGIKBBLQVSWUGK-DCAQKATOSA-N 0.000 description 5
- XFAUJGNLHIGXET-AVGNSLFASA-N Gln-Leu-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O XFAUJGNLHIGXET-AVGNSLFASA-N 0.000 description 5
- IHSGESFHTMFHRB-GUBZILKMSA-N Gln-Lys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCC(N)=O IHSGESFHTMFHRB-GUBZILKMSA-N 0.000 description 5
- ZVQZXPADLZIQFF-FHWLQOOXSA-N Gln-Phe-Tyr Chemical compound C([C@H](NC(=O)[C@H](CCC(N)=O)N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 ZVQZXPADLZIQFF-FHWLQOOXSA-N 0.000 description 5
- OSCLNNWLKKIQJM-WDSKDSINSA-N Gln-Ser-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)NCC(O)=O OSCLNNWLKKIQJM-WDSKDSINSA-N 0.000 description 5
- FYBSCGZLICNOBA-XQXXSGGOSA-N Glu-Ala-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O FYBSCGZLICNOBA-XQXXSGGOSA-N 0.000 description 5
- NLKVNZUFDPWPNL-YUMQZZPRSA-N Glu-Arg-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O NLKVNZUFDPWPNL-YUMQZZPRSA-N 0.000 description 5
- KEBACWCLVOXFNC-DCAQKATOSA-N Glu-Arg-Met Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(O)=O KEBACWCLVOXFNC-DCAQKATOSA-N 0.000 description 5
- AFODTOLGSZQDSL-PEFMBERDSA-N Glu-Asn-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N AFODTOLGSZQDSL-PEFMBERDSA-N 0.000 description 5
- VAZZOGXDUQSVQF-NUMRIWBASA-N Glu-Asn-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N)O VAZZOGXDUQSVQF-NUMRIWBASA-N 0.000 description 5
- QPRZKNOOOBWXSU-CIUDSAMLSA-N Glu-Asp-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N QPRZKNOOOBWXSU-CIUDSAMLSA-N 0.000 description 5
- JVSBYEDSSRZQGV-GUBZILKMSA-N Glu-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCC(O)=O JVSBYEDSSRZQGV-GUBZILKMSA-N 0.000 description 5
- LVCHEMOPBORRLB-DCAQKATOSA-N Glu-Gln-Lys Chemical compound NCCCC[C@H](NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CCC(O)=O)C(O)=O LVCHEMOPBORRLB-DCAQKATOSA-N 0.000 description 5
- YLJHCWNDBKKOEB-IHRRRGAJSA-N Glu-Glu-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O YLJHCWNDBKKOEB-IHRRRGAJSA-N 0.000 description 5
- CUXJIASLBRJOFV-LAEOZQHASA-N Glu-Gly-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O CUXJIASLBRJOFV-LAEOZQHASA-N 0.000 description 5
- NJPQBTJSYCKCNS-HVTMNAMFSA-N Glu-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CCC(=O)O)N NJPQBTJSYCKCNS-HVTMNAMFSA-N 0.000 description 5
- QIQABBIDHGQXGA-ZPFDUUQYSA-N Glu-Ile-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O QIQABBIDHGQXGA-ZPFDUUQYSA-N 0.000 description 5
- NWOUBJNMZDDGDT-AVGNSLFASA-N Glu-Leu-His Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 NWOUBJNMZDDGDT-AVGNSLFASA-N 0.000 description 5
- HRBYTAIBKPNZKQ-AVGNSLFASA-N Glu-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCC(O)=O HRBYTAIBKPNZKQ-AVGNSLFASA-N 0.000 description 5
- AQNYKMCFCCZEEL-JYJNAYRXSA-N Glu-Lys-Tyr Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 AQNYKMCFCCZEEL-JYJNAYRXSA-N 0.000 description 5
- HZISRJBYZAODRV-XQXXSGGOSA-N Glu-Thr-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O HZISRJBYZAODRV-XQXXSGGOSA-N 0.000 description 5
- MFYLRRCYBBJYPI-JYJNAYRXSA-N Glu-Tyr-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O MFYLRRCYBBJYPI-JYJNAYRXSA-N 0.000 description 5
- BYYNJRSNDARRBX-YFKPBYRVSA-N Gly-Gln-Gly Chemical compound NCC(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O BYYNJRSNDARRBX-YFKPBYRVSA-N 0.000 description 5
- AQLHORCVPGXDJW-IUCAKERBSA-N Gly-Gln-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)CN AQLHORCVPGXDJW-IUCAKERBSA-N 0.000 description 5
- YYPFZVIXAVDHIK-IUCAKERBSA-N Gly-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)CN YYPFZVIXAVDHIK-IUCAKERBSA-N 0.000 description 5
- PAWIVEIWWYGBAM-YUMQZZPRSA-N Gly-Leu-Ala Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O PAWIVEIWWYGBAM-YUMQZZPRSA-N 0.000 description 5
- NNCSJUBVFBDDLC-YUMQZZPRSA-N Gly-Leu-Ser Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O NNCSJUBVFBDDLC-YUMQZZPRSA-N 0.000 description 5
- BBTCXWTXOXUNFX-IUCAKERBSA-N Gly-Met-Arg Chemical compound CSCC[C@H](NC(=O)CN)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O BBTCXWTXOXUNFX-IUCAKERBSA-N 0.000 description 5
- GAFKBWKVXNERFA-QWRGUYRKSA-N Gly-Phe-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 GAFKBWKVXNERFA-QWRGUYRKSA-N 0.000 description 5
- GGAPHLIUUTVYMX-QWRGUYRKSA-N Gly-Phe-Ser Chemical compound OC[C@@H](C([O-])=O)NC(=O)[C@@H](NC(=O)C[NH3+])CC1=CC=CC=C1 GGAPHLIUUTVYMX-QWRGUYRKSA-N 0.000 description 5
- CSMYMGFCEJWALV-WDSKDSINSA-N Gly-Ser-Gln Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(N)=O CSMYMGFCEJWALV-WDSKDSINSA-N 0.000 description 5
- VNNRLUNBJSWZPF-ZKWXMUAHSA-N Gly-Ser-Ile Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VNNRLUNBJSWZPF-ZKWXMUAHSA-N 0.000 description 5
- PYFIQROSWQERAS-LBPRGKRZSA-N Gly-Trp-Gly Chemical compound C1=CC=C2C(C[C@H](NC(=O)CN)C(=O)NCC(O)=O)=CNC2=C1 PYFIQROSWQERAS-LBPRGKRZSA-N 0.000 description 5
- IDQNVIWPPWAFSY-AVGNSLFASA-N His-His-Gln Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(N)=O)C(O)=O IDQNVIWPPWAFSY-AVGNSLFASA-N 0.000 description 5
- IWXMHXYOACDSIA-PYJNHQTQSA-N His-Ile-Val Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O IWXMHXYOACDSIA-PYJNHQTQSA-N 0.000 description 5
- TTYKEFZRLKQTHH-MELADBBJSA-N His-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CC2=CN=CN2)N)C(=O)O TTYKEFZRLKQTHH-MELADBBJSA-N 0.000 description 5
- DAKSMIWQZPHRIB-BZSNNMDCSA-N His-Tyr-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O DAKSMIWQZPHRIB-BZSNNMDCSA-N 0.000 description 5
- KFQDSSNYWKZFOO-LSJOCFKGSA-N His-Val-Ala Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O KFQDSSNYWKZFOO-LSJOCFKGSA-N 0.000 description 5
- YKRYHWJRQUSTKG-KBIXCLLPSA-N Ile-Ala-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N YKRYHWJRQUSTKG-KBIXCLLPSA-N 0.000 description 5
- QYZYJFXHXYUZMZ-UGYAYLCHSA-N Ile-Asn-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N QYZYJFXHXYUZMZ-UGYAYLCHSA-N 0.000 description 5
- UAVQIQOOBXFKRC-BYULHYEWSA-N Ile-Asn-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O UAVQIQOOBXFKRC-BYULHYEWSA-N 0.000 description 5
- NKRJALPCDNXULF-BYULHYEWSA-N Ile-Asp-Gly Chemical compound [H]N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O NKRJALPCDNXULF-BYULHYEWSA-N 0.000 description 5
- RGSOCXHDOPQREB-ZPFDUUQYSA-N Ile-Asp-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N RGSOCXHDOPQREB-ZPFDUUQYSA-N 0.000 description 5
- PHIXPNQDGGILMP-YVNDNENWSA-N Ile-Glu-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N PHIXPNQDGGILMP-YVNDNENWSA-N 0.000 description 5
- DFFTXLCCDFYRKD-MBLNEYKQSA-N Ile-Gly-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(=O)O)N DFFTXLCCDFYRKD-MBLNEYKQSA-N 0.000 description 5
- CSQNHSGHAPRGPQ-YTFOTSKYSA-N Ile-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCCN)C(=O)O)N CSQNHSGHAPRGPQ-YTFOTSKYSA-N 0.000 description 5
- TVYWVSJGSHQWMT-AJNGGQMLSA-N Ile-Leu-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N TVYWVSJGSHQWMT-AJNGGQMLSA-N 0.000 description 5
- PHRWFSFCNJPWRO-PPCPHDFISA-N Ile-Leu-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N PHRWFSFCNJPWRO-PPCPHDFISA-N 0.000 description 5
- IDMNOFVUXYYZPF-DKIMLUQUSA-N Ile-Lys-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N IDMNOFVUXYYZPF-DKIMLUQUSA-N 0.000 description 5
- GLYJPWIRLBAIJH-FQUUOJAGSA-N Ile-Lys-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@@H]1C(=O)O)N GLYJPWIRLBAIJH-FQUUOJAGSA-N 0.000 description 5
- RCMNUBZKIIJCOI-ZPFDUUQYSA-N Ile-Met-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N RCMNUBZKIIJCOI-ZPFDUUQYSA-N 0.000 description 5
- FGBRXCZYVRFNKQ-MXAVVETBSA-N Ile-Phe-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)O)N FGBRXCZYVRFNKQ-MXAVVETBSA-N 0.000 description 5
- YCKPUHHMCFSUMD-IUKAMOBKSA-N Ile-Thr-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N YCKPUHHMCFSUMD-IUKAMOBKSA-N 0.000 description 5
- KBDIBHQICWDGDL-PPCPHDFISA-N Ile-Thr-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N KBDIBHQICWDGDL-PPCPHDFISA-N 0.000 description 5
- UYODHPPSCXBNCS-XUXIUFHCSA-N Ile-Val-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(C)C UYODHPPSCXBNCS-XUXIUFHCSA-N 0.000 description 5
- SENJXOPIZNYLHU-UHFFFAOYSA-N L-leucyl-L-arginine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CCCN=C(N)N SENJXOPIZNYLHU-UHFFFAOYSA-N 0.000 description 5
- CQQGCWPXDHTTNF-GUBZILKMSA-N Leu-Ala-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(O)=O CQQGCWPXDHTTNF-GUBZILKMSA-N 0.000 description 5
- IGUOAYLTQJLPPD-DCAQKATOSA-N Leu-Asn-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IGUOAYLTQJLPPD-DCAQKATOSA-N 0.000 description 5
- JQSXWJXBASFONF-KKUMJFAQSA-N Leu-Asp-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JQSXWJXBASFONF-KKUMJFAQSA-N 0.000 description 5
- CIVKXGPFXDIQBV-WDCWCFNPSA-N Leu-Gln-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CIVKXGPFXDIQBV-WDCWCFNPSA-N 0.000 description 5
- HFBCHNRFRYLZNV-GUBZILKMSA-N Leu-Glu-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O HFBCHNRFRYLZNV-GUBZILKMSA-N 0.000 description 5
- WIDZHJTYKYBLSR-DCAQKATOSA-N Leu-Glu-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O WIDZHJTYKYBLSR-DCAQKATOSA-N 0.000 description 5
- OXRLYTYUXAQTHP-YUMQZZPRSA-N Leu-Gly-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](C)C(O)=O OXRLYTYUXAQTHP-YUMQZZPRSA-N 0.000 description 5
- DBSLVQBXKVKDKJ-BJDJZHNGSA-N Leu-Ile-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O DBSLVQBXKVKDKJ-BJDJZHNGSA-N 0.000 description 5
- AVEGDIAXTDVBJS-XUXIUFHCSA-N Leu-Ile-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O AVEGDIAXTDVBJS-XUXIUFHCSA-N 0.000 description 5
- SEMUSFOBZGKBGW-YTFOTSKYSA-N Leu-Ile-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O SEMUSFOBZGKBGW-YTFOTSKYSA-N 0.000 description 5
- LIINDKYIGYTDLG-PPCPHDFISA-N Leu-Ile-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LIINDKYIGYTDLG-PPCPHDFISA-N 0.000 description 5
- UBZGNBKMIJHOHL-BZSNNMDCSA-N Leu-Leu-Phe Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 UBZGNBKMIJHOHL-BZSNNMDCSA-N 0.000 description 5
- JLWZLIQRYCTYBD-IHRRRGAJSA-N Leu-Lys-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JLWZLIQRYCTYBD-IHRRRGAJSA-N 0.000 description 5
- SYRTUBLKWNDSDK-DKIMLUQUSA-N Leu-Phe-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O SYRTUBLKWNDSDK-DKIMLUQUSA-N 0.000 description 5
- YWKNKRAKOCLOLH-OEAJRASXSA-N Leu-Phe-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)O)C(O)=O)CC1=CC=CC=C1 YWKNKRAKOCLOLH-OEAJRASXSA-N 0.000 description 5
- UCBPDSYUVAAHCD-UWVGGRQHSA-N Leu-Pro-Gly Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O UCBPDSYUVAAHCD-UWVGGRQHSA-N 0.000 description 5
- KLSUAWUZBMAZCL-RHYQMDGZSA-N Leu-Thr-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(O)=O KLSUAWUZBMAZCL-RHYQMDGZSA-N 0.000 description 5
- VKVDRTGWLVZJOM-DCAQKATOSA-N Leu-Val-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O VKVDRTGWLVZJOM-DCAQKATOSA-N 0.000 description 5
- NFLFJGGKOHYZJF-BJDJZHNGSA-N Lys-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN NFLFJGGKOHYZJF-BJDJZHNGSA-N 0.000 description 5
- FACUGMGEFUEBTI-SRVKXCTJSA-N Lys-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CCCCN FACUGMGEFUEBTI-SRVKXCTJSA-N 0.000 description 5
- KPJJOZUXFOLGMQ-CIUDSAMLSA-N Lys-Asp-Asn Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N KPJJOZUXFOLGMQ-CIUDSAMLSA-N 0.000 description 5
- QIJVAFLRMVBHMU-KKUMJFAQSA-N Lys-Asp-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QIJVAFLRMVBHMU-KKUMJFAQSA-N 0.000 description 5
- WVJNGSFKBKOKRV-AJNGGQMLSA-N Lys-Leu-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WVJNGSFKBKOKRV-AJNGGQMLSA-N 0.000 description 5
- BOJYMMBYBNOOGG-DCAQKATOSA-N Lys-Pro-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O BOJYMMBYBNOOGG-DCAQKATOSA-N 0.000 description 5
- MGKFCQFVPKOWOL-CIUDSAMLSA-N Lys-Ser-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)O)C(=O)O)N MGKFCQFVPKOWOL-CIUDSAMLSA-N 0.000 description 5
- ZUGVARDEGWMMLK-SRVKXCTJSA-N Lys-Ser-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN ZUGVARDEGWMMLK-SRVKXCTJSA-N 0.000 description 5
- DLCAXBGXGOVUCD-PPCPHDFISA-N Lys-Thr-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DLCAXBGXGOVUCD-PPCPHDFISA-N 0.000 description 5
- HDNOQCZWJGGHSS-VEVYYDQMSA-N Met-Asn-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O HDNOQCZWJGGHSS-VEVYYDQMSA-N 0.000 description 5
- MSSJHBAKDDIRMJ-SRVKXCTJSA-N Met-Lys-Gln Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O MSSJHBAKDDIRMJ-SRVKXCTJSA-N 0.000 description 5
- QCHNRQQVLJYDSI-DLOVCJGASA-N Phe-Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 QCHNRQQVLJYDSI-DLOVCJGASA-N 0.000 description 5
- WIVCOAKLPICYGY-KKUMJFAQSA-N Phe-Asp-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N WIVCOAKLPICYGY-KKUMJFAQSA-N 0.000 description 5
- MPFGIYLYWUCSJG-AVGNSLFASA-N Phe-Glu-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 MPFGIYLYWUCSJG-AVGNSLFASA-N 0.000 description 5
- KYYMILWEGJYPQZ-IHRRRGAJSA-N Phe-Glu-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 KYYMILWEGJYPQZ-IHRRRGAJSA-N 0.000 description 5
- SWCOXQLDICUYOL-ULQDDVLXSA-N Phe-His-Arg Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O SWCOXQLDICUYOL-ULQDDVLXSA-N 0.000 description 5
- KZRQONDKKJCAOL-DKIMLUQUSA-N Phe-Leu-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KZRQONDKKJCAOL-DKIMLUQUSA-N 0.000 description 5
- YCCUXNNKXDGMAM-KKUMJFAQSA-N Phe-Leu-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YCCUXNNKXDGMAM-KKUMJFAQSA-N 0.000 description 5
- PTLMYJOMJLTMCB-KKUMJFAQSA-N Phe-Met-Gln Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N PTLMYJOMJLTMCB-KKUMJFAQSA-N 0.000 description 5
- KAJLHCWRWDSROH-BZSNNMDCSA-N Phe-Phe-Asp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC(O)=O)C(O)=O)C1=CC=CC=C1 KAJLHCWRWDSROH-BZSNNMDCSA-N 0.000 description 5
- YUPRIZTWANWWHK-DZKIICNBSA-N Phe-Val-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N YUPRIZTWANWWHK-DZKIICNBSA-N 0.000 description 5
- FZHBZMDRDASUHN-NAKRPEOUSA-N Pro-Ala-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1)C(O)=O FZHBZMDRDASUHN-NAKRPEOUSA-N 0.000 description 5
- DIFXZGPHVCIVSQ-CIUDSAMLSA-N Pro-Gln-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O DIFXZGPHVCIVSQ-CIUDSAMLSA-N 0.000 description 5
- SSWJYJHXQOYTSP-SRVKXCTJSA-N Pro-His-Gln Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(N)=O)C(O)=O SSWJYJHXQOYTSP-SRVKXCTJSA-N 0.000 description 5
- BBFRBZYKHIKFBX-GMOBBJLQSA-N Pro-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@@H]1CCCN1 BBFRBZYKHIKFBX-GMOBBJLQSA-N 0.000 description 5
- AQGUSRZKDZYGGV-GMOBBJLQSA-N Pro-Ile-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O AQGUSRZKDZYGGV-GMOBBJLQSA-N 0.000 description 5
- BCNRNJWSRFDPTQ-HJWJTTGWSA-N Pro-Ile-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O BCNRNJWSRFDPTQ-HJWJTTGWSA-N 0.000 description 5
- FKYKZHOKDOPHSA-DCAQKATOSA-N Pro-Leu-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O FKYKZHOKDOPHSA-DCAQKATOSA-N 0.000 description 5
- CHYAYDLYYIJCKY-OSUNSFLBSA-N Pro-Thr-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O CHYAYDLYYIJCKY-OSUNSFLBSA-N 0.000 description 5
- RSTWKJFWBKFOFC-JYJNAYRXSA-N Pro-Trp-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(N)=O)C(O)=O RSTWKJFWBKFOFC-JYJNAYRXSA-N 0.000 description 5
- BTKUIVBNGBFTTP-WHFBIAKZSA-N Ser-Ala-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)NCC(O)=O BTKUIVBNGBFTTP-WHFBIAKZSA-N 0.000 description 5
- YMEXHZTVKDAKIY-GHCJXIJMSA-N Ser-Asn-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CO)C(O)=O YMEXHZTVKDAKIY-GHCJXIJMSA-N 0.000 description 5
- HEQPKICPPDOSIN-SRVKXCTJSA-N Ser-Asp-Tyr Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 HEQPKICPPDOSIN-SRVKXCTJSA-N 0.000 description 5
- GZBKRJVCRMZAST-XKBZYTNZSA-N Ser-Glu-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GZBKRJVCRMZAST-XKBZYTNZSA-N 0.000 description 5
- UIGMAMGZOJVTDN-WHFBIAKZSA-N Ser-Gly-Ser Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O UIGMAMGZOJVTDN-WHFBIAKZSA-N 0.000 description 5
- ZSLFCBHEINFXRS-LPEHRKFASA-N Ser-Met-Pro Chemical compound CSCC[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N ZSLFCBHEINFXRS-LPEHRKFASA-N 0.000 description 5
- BSXKBOUZDAZXHE-CIUDSAMLSA-N Ser-Pro-Glu Chemical compound [H]N[C@@H](CO)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O BSXKBOUZDAZXHE-CIUDSAMLSA-N 0.000 description 5
- NAXBBCLCEOTAIG-RHYQMDGZSA-N Thr-Arg-Lys Chemical compound NC(N)=NCCC[C@H](NC(=O)[C@@H](N)[C@H](O)C)C(=O)N[C@@H](CCCCN)C(O)=O NAXBBCLCEOTAIG-RHYQMDGZSA-N 0.000 description 5
- LAFLAXHTDVNVEL-WDCWCFNPSA-N Thr-Gln-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O LAFLAXHTDVNVEL-WDCWCFNPSA-N 0.000 description 5
- UDQBCBUXAQIZAK-GLLZPBPUSA-N Thr-Glu-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O UDQBCBUXAQIZAK-GLLZPBPUSA-N 0.000 description 5
- BNGDYRRHRGOPHX-IFFSRLJSSA-N Thr-Glu-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)[C@@H](C)O)C(O)=O BNGDYRRHRGOPHX-IFFSRLJSSA-N 0.000 description 5
- GXUWHVZYDAHFSV-FLBSBUHZSA-N Thr-Ile-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GXUWHVZYDAHFSV-FLBSBUHZSA-N 0.000 description 5
- NHQVWACSJZJCGJ-FLBSBUHZSA-N Thr-Thr-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O NHQVWACSJZJCGJ-FLBSBUHZSA-N 0.000 description 5
- XGFYGMKZKFRGAI-RCWTZXSCSA-N Thr-Val-Arg Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N XGFYGMKZKFRGAI-RCWTZXSCSA-N 0.000 description 5
- BIJDDZBDSJLWJY-PJODQICGSA-N Trp-Ala-Val Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O BIJDDZBDSJLWJY-PJODQICGSA-N 0.000 description 5
- SLCSPPCQWUHPPO-JYJNAYRXSA-N Tyr-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 SLCSPPCQWUHPPO-JYJNAYRXSA-N 0.000 description 5
- AKLNEFNQWLHIGY-QWRGUYRKSA-N Tyr-Gly-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)O)C(=O)O)N)O AKLNEFNQWLHIGY-QWRGUYRKSA-N 0.000 description 5
- FIRUOPRJKCBLST-KKUMJFAQSA-N Tyr-His-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O FIRUOPRJKCBLST-KKUMJFAQSA-N 0.000 description 5
- LFCQXIXJQXWZJI-BZSNNMDCSA-N Tyr-His-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)N[C@@H](CC3=CN=CN3)C(=O)O)N)O LFCQXIXJQXWZJI-BZSNNMDCSA-N 0.000 description 5
- FBHBVXUBTYVCRU-BZSNNMDCSA-N Tyr-His-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CN=CN1 FBHBVXUBTYVCRU-BZSNNMDCSA-N 0.000 description 5
- CWVHKVVKAQIJKY-ACRUOGEOSA-N Tyr-Lys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC2=CC=C(C=C2)O)N CWVHKVVKAQIJKY-ACRUOGEOSA-N 0.000 description 5
- LMKKMCGTDANZTR-BZSNNMDCSA-N Tyr-Phe-Asp Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC(O)=O)C(O)=O)C1=CC=C(O)C=C1 LMKKMCGTDANZTR-BZSNNMDCSA-N 0.000 description 5
- PSALWJCUIAQKFW-ACRUOGEOSA-N Tyr-Phe-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N PSALWJCUIAQKFW-ACRUOGEOSA-N 0.000 description 5
- QPOUERMDWKKZEG-HJPIBITLSA-N Tyr-Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 QPOUERMDWKKZEG-HJPIBITLSA-N 0.000 description 5
- AKRHKDCELJLTMD-BVSLBCMMSA-N Tyr-Trp-Arg Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CC3=CC=C(C=C3)O)N AKRHKDCELJLTMD-BVSLBCMMSA-N 0.000 description 5
- VUTHNLMCXKLLFI-LAEOZQHASA-N Val-Asp-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N VUTHNLMCXKLLFI-LAEOZQHASA-N 0.000 description 5
- TZVUSFMQWPWHON-NHCYSSNCSA-N Val-Asp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N TZVUSFMQWPWHON-NHCYSSNCSA-N 0.000 description 5
- SZTTYWIUCGSURQ-AUTRQRHGSA-N Val-Glu-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SZTTYWIUCGSURQ-AUTRQRHGSA-N 0.000 description 5
- VCAWFLIWYNMHQP-UKJIMTQDSA-N Val-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](C(C)C)N VCAWFLIWYNMHQP-UKJIMTQDSA-N 0.000 description 5
- VXDSPJJQUQDCKH-UKJIMTQDSA-N Val-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N VXDSPJJQUQDCKH-UKJIMTQDSA-N 0.000 description 5
- ZHQWPWQNVRCXAX-XQQFMLRXSA-N Val-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C(C)C)N ZHQWPWQNVRCXAX-XQQFMLRXSA-N 0.000 description 5
- SYSWVVCYSXBVJG-RHYQMDGZSA-N Val-Leu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C(C)C)N)O SYSWVVCYSXBVJG-RHYQMDGZSA-N 0.000 description 5
- VPGCVZRRBYOGCD-AVGNSLFASA-N Val-Lys-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O VPGCVZRRBYOGCD-AVGNSLFASA-N 0.000 description 5
- UVHFONIHVHLDDQ-IFFSRLJSSA-N Val-Thr-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N)O UVHFONIHVHLDDQ-IFFSRLJSSA-N 0.000 description 5
- AEFJNECXZCODJM-UWVGGRQHSA-N Val-Val-Gly Chemical compound CC(C)[C@H]([NH3+])C(=O)N[C@@H](C(C)C)C(=O)NCC([O-])=O AEFJNECXZCODJM-UWVGGRQHSA-N 0.000 description 5
- AOILQMZPNLUXCM-AVGNSLFASA-N Val-Val-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN AOILQMZPNLUXCM-AVGNSLFASA-N 0.000 description 5
- 108010070944 alanylhistidine Proteins 0.000 description 5
- 108010011559 alanylphenylalanine Proteins 0.000 description 5
- 229960000723 ampicillin Drugs 0.000 description 5
- AVKUERGKIZMTKX-NJBDSQKTSA-N ampicillin Chemical compound C1([C@@H](N)C(=O)N[C@H]2[C@H]3SC([C@@H](N3C2=O)C(O)=O)(C)C)=CC=CC=C1 AVKUERGKIZMTKX-NJBDSQKTSA-N 0.000 description 5
- 230000005782 double-strand break Effects 0.000 description 5
- 238000009472 formulation Methods 0.000 description 5
- 108010000761 leucylarginine Proteins 0.000 description 5
- 108010038320 lysylphenylalanine Proteins 0.000 description 5
- 210000004962 mammalian cell Anatomy 0.000 description 5
- 108010063431 methionyl-aspartyl-glycine Proteins 0.000 description 5
- 229920001184 polypeptide Polymers 0.000 description 5
- 230000008569 process Effects 0.000 description 5
- 108010070643 prolylglutamic acid Proteins 0.000 description 5
- 108010029020 prolylglycine Proteins 0.000 description 5
- 230000008439 repair process Effects 0.000 description 5
- 108010048397 seryl-lysyl-leucine Proteins 0.000 description 5
- 230000008685 targeting Effects 0.000 description 5
- 108010031491 threonyl-lysyl-glutamic acid Proteins 0.000 description 5
- 210000001519 tissue Anatomy 0.000 description 5
- 239000012096 transfection reagent Substances 0.000 description 5
- 108010012567 tyrosyl-glycyl-glycyl-phenylalanyl Proteins 0.000 description 5
- 238000011144 upstream manufacturing Methods 0.000 description 5
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 5
- CVGNCMIULZNYES-WHFBIAKZSA-N Ala-Asn-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O CVGNCMIULZNYES-WHFBIAKZSA-N 0.000 description 4
- AAXVGJXZKHQQHD-LSJOCFKGSA-N Ala-His-Met Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CCSC)C(=O)O)N AAXVGJXZKHQQHD-LSJOCFKGSA-N 0.000 description 4
- NINQYGGNRIBFSC-CIUDSAMLSA-N Ala-Lys-Ser Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CO)C(O)=O NINQYGGNRIBFSC-CIUDSAMLSA-N 0.000 description 4
- KQESEZXHYOUIIM-CQDKDKBSSA-N Ala-Lys-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KQESEZXHYOUIIM-CQDKDKBSSA-N 0.000 description 4
- YYAVDNKUWLAFCV-ACZMJKKPSA-N Ala-Ser-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O YYAVDNKUWLAFCV-ACZMJKKPSA-N 0.000 description 4
- CWRBRVZBMVJENN-UVBJJODRSA-N Ala-Trp-Met Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CCSC)C(=O)O)N CWRBRVZBMVJENN-UVBJJODRSA-N 0.000 description 4
- VNFWDYWTSHFRRG-SRVKXCTJSA-N Arg-Gln-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O VNFWDYWTSHFRRG-SRVKXCTJSA-N 0.000 description 4
- FLYANDHDFRGGTM-PYJNHQTQSA-N Arg-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N FLYANDHDFRGGTM-PYJNHQTQSA-N 0.000 description 4
- MTYLORHAQXVQOW-AVGNSLFASA-N Arg-Lys-Met Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(O)=O MTYLORHAQXVQOW-AVGNSLFASA-N 0.000 description 4
- XUGATJVGQUGQKY-ULQDDVLXSA-N Arg-Lys-Phe Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XUGATJVGQUGQKY-ULQDDVLXSA-N 0.000 description 4
- FRBAHXABMQXSJQ-FXQIFTODSA-N Arg-Ser-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O FRBAHXABMQXSJQ-FXQIFTODSA-N 0.000 description 4
- BDMIFVIWCNLDCT-CIUDSAMLSA-N Asn-Arg-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O BDMIFVIWCNLDCT-CIUDSAMLSA-N 0.000 description 4
- GXMSVVBIAMWMKO-BQBZGAKWSA-N Asn-Arg-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CCCN=C(N)N GXMSVVBIAMWMKO-BQBZGAKWSA-N 0.000 description 4
- MFFOYNGMOYFPBD-DCAQKATOSA-N Asn-Arg-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O MFFOYNGMOYFPBD-DCAQKATOSA-N 0.000 description 4
- GLWFAWNYGWBMOC-SRVKXCTJSA-N Asn-Leu-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O GLWFAWNYGWBMOC-SRVKXCTJSA-N 0.000 description 4
- YRTOMUMWSTUQAX-FXQIFTODSA-N Asn-Pro-Asp Chemical compound NC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O YRTOMUMWSTUQAX-FXQIFTODSA-N 0.000 description 4
- KVMPVNGOKHTUHZ-GCJQMDKQSA-N Asp-Ala-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KVMPVNGOKHTUHZ-GCJQMDKQSA-N 0.000 description 4
- SDHFVYLZFBDSQT-DCAQKATOSA-N Asp-Arg-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(=O)O)N SDHFVYLZFBDSQT-DCAQKATOSA-N 0.000 description 4
- ZELQAFZSJOBEQS-ACZMJKKPSA-N Asp-Asn-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZELQAFZSJOBEQS-ACZMJKKPSA-N 0.000 description 4
- VILLWIDTHYPSLC-PEFMBERDSA-N Asp-Glu-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VILLWIDTHYPSLC-PEFMBERDSA-N 0.000 description 4
- UJGRZQYSNYTCAX-SRVKXCTJSA-N Asp-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O UJGRZQYSNYTCAX-SRVKXCTJSA-N 0.000 description 4
- GKWFMNNNYZHJHV-SRVKXCTJSA-N Asp-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC(O)=O GKWFMNNNYZHJHV-SRVKXCTJSA-N 0.000 description 4
- LTCKTLYKRMCFOC-KKUMJFAQSA-N Asp-Phe-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O LTCKTLYKRMCFOC-KKUMJFAQSA-N 0.000 description 4
- ZQFRDAZBTSFGGW-SRVKXCTJSA-N Asp-Ser-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ZQFRDAZBTSFGGW-SRVKXCTJSA-N 0.000 description 4
- 241000894006 Bacteria Species 0.000 description 4
- 102000053602 DNA Human genes 0.000 description 4
- 239000006144 Dulbecco’s modified Eagle's medium Substances 0.000 description 4
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 4
- NNXIQPMZGZUFJJ-AVGNSLFASA-N Gln-His-Lys Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N NNXIQPMZGZUFJJ-AVGNSLFASA-N 0.000 description 4
- UWKPRVKWEKEMSY-DCAQKATOSA-N Gln-Lys-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O UWKPRVKWEKEMSY-DCAQKATOSA-N 0.000 description 4
- FQCILXROGNOZON-YUMQZZPRSA-N Gln-Pro-Gly Chemical compound NC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O FQCILXROGNOZON-YUMQZZPRSA-N 0.000 description 4
- SXFPZRRVWSUYII-KBIXCLLPSA-N Gln-Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)N)N SXFPZRRVWSUYII-KBIXCLLPSA-N 0.000 description 4
- ININBLZFFVOQIO-JHEQGTHGSA-N Gln-Thr-Gly Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)N)N)O ININBLZFFVOQIO-JHEQGTHGSA-N 0.000 description 4
- ICRKQMRFXYDYMK-LAEOZQHASA-N Gln-Val-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O ICRKQMRFXYDYMK-LAEOZQHASA-N 0.000 description 4
- ZOXBSICWUDAOHX-GUBZILKMSA-N Glu-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CCC(O)=O ZOXBSICWUDAOHX-GUBZILKMSA-N 0.000 description 4
- QJCKNLPMTPXXEM-AUTRQRHGSA-N Glu-Glu-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O QJCKNLPMTPXXEM-AUTRQRHGSA-N 0.000 description 4
- DNPCBMNFQVTHMA-DCAQKATOSA-N Glu-Leu-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O DNPCBMNFQVTHMA-DCAQKATOSA-N 0.000 description 4
- KXTAGESXNQEZKB-DZKIICNBSA-N Glu-Phe-Val Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=CC=C1 KXTAGESXNQEZKB-DZKIICNBSA-N 0.000 description 4
- OVSKVOOUFAKODB-UWVGGRQHSA-N Gly-Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N OVSKVOOUFAKODB-UWVGGRQHSA-N 0.000 description 4
- JVWPPCWUDRJGAE-YUMQZZPRSA-N Gly-Asn-Leu Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JVWPPCWUDRJGAE-YUMQZZPRSA-N 0.000 description 4
- XEJTYSCIXKYSHR-WDSKDSINSA-N Gly-Asp-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)CN XEJTYSCIXKYSHR-WDSKDSINSA-N 0.000 description 4
- UFPXDFOYHVEIPI-BYPYZUCNSA-N Gly-Gly-Asp Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O UFPXDFOYHVEIPI-BYPYZUCNSA-N 0.000 description 4
- SCWYHUQOOFRVHP-MBLNEYKQSA-N Gly-Ile-Thr Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SCWYHUQOOFRVHP-MBLNEYKQSA-N 0.000 description 4
- AFWYPMDMDYCKMD-KBPBESRZSA-N Gly-Leu-Tyr Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 AFWYPMDMDYCKMD-KBPBESRZSA-N 0.000 description 4
- WMGHDYWNHNLGBV-ONGXEEELSA-N Gly-Phe-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 WMGHDYWNHNLGBV-ONGXEEELSA-N 0.000 description 4
- GGLIDLCEPDHEJO-BQBZGAKWSA-N Gly-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)CN GGLIDLCEPDHEJO-BQBZGAKWSA-N 0.000 description 4
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 4
- AIPUZFXMXAHZKY-QWRGUYRKSA-N His-Leu-Gly Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O AIPUZFXMXAHZKY-QWRGUYRKSA-N 0.000 description 4
- FLXCRBXJRJSDHX-AVGNSLFASA-N His-Pro-Val Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O FLXCRBXJRJSDHX-AVGNSLFASA-N 0.000 description 4
- PZAJPILZRFPYJJ-SRVKXCTJSA-N His-Ser-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O PZAJPILZRFPYJJ-SRVKXCTJSA-N 0.000 description 4
- UMYZBHKAVTXWIW-GMOBBJLQSA-N Ile-Asp-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N UMYZBHKAVTXWIW-GMOBBJLQSA-N 0.000 description 4
- VCYVLFAWCJRXFT-HJPIBITLSA-N Ile-Cys-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N VCYVLFAWCJRXFT-HJPIBITLSA-N 0.000 description 4
- LGMUPVWZEYYUMU-YVNDNENWSA-N Ile-Glu-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N LGMUPVWZEYYUMU-YVNDNENWSA-N 0.000 description 4
- HUWYGQOISIJNMK-SIGLWIIPSA-N Ile-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N HUWYGQOISIJNMK-SIGLWIIPSA-N 0.000 description 4
- PMMMQRVUMVURGJ-XUXIUFHCSA-N Ile-Leu-Pro Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(O)=O PMMMQRVUMVURGJ-XUXIUFHCSA-N 0.000 description 4
- QGXQHJQPAPMACW-PPCPHDFISA-N Ile-Thr-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)O)N QGXQHJQPAPMACW-PPCPHDFISA-N 0.000 description 4
- BCISUQVFDGYZBO-QSFUFRPTSA-N Ile-Val-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(O)=O BCISUQVFDGYZBO-QSFUFRPTSA-N 0.000 description 4
- NJGXXYLPDMMFJB-XUXIUFHCSA-N Ile-Val-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N NJGXXYLPDMMFJB-XUXIUFHCSA-N 0.000 description 4
- SWNRZNLXMXRCJC-VKOGCVSHSA-N Ile-Val-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)[C@@H](C)CC)C(O)=O)=CNC2=C1 SWNRZNLXMXRCJC-VKOGCVSHSA-N 0.000 description 4
- REPPKAMYTOJTFC-DCAQKATOSA-N Leu-Arg-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O REPPKAMYTOJTFC-DCAQKATOSA-N 0.000 description 4
- WUFYAPWIHCUMLL-CIUDSAMLSA-N Leu-Asn-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O WUFYAPWIHCUMLL-CIUDSAMLSA-N 0.000 description 4
- YVKSMSDXKMSIRX-GUBZILKMSA-N Leu-Glu-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O YVKSMSDXKMSIRX-GUBZILKMSA-N 0.000 description 4
- WQWSMEOYXJTFRU-GUBZILKMSA-N Leu-Glu-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O WQWSMEOYXJTFRU-GUBZILKMSA-N 0.000 description 4
- HYMLKESRWLZDBR-WEDXCCLWSA-N Leu-Gly-Thr Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O HYMLKESRWLZDBR-WEDXCCLWSA-N 0.000 description 4
- HGFGEMSVBMCFKK-MNXVOIDGSA-N Leu-Ile-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O HGFGEMSVBMCFKK-MNXVOIDGSA-N 0.000 description 4
- QJXHMYMRGDOHRU-NHCYSSNCSA-N Leu-Ile-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O QJXHMYMRGDOHRU-NHCYSSNCSA-N 0.000 description 4
- ZRHDPZAAWLXXIR-SRVKXCTJSA-N Leu-Lys-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O ZRHDPZAAWLXXIR-SRVKXCTJSA-N 0.000 description 4
- LVTJJOJKDCVZGP-QWRGUYRKSA-N Leu-Lys-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O LVTJJOJKDCVZGP-QWRGUYRKSA-N 0.000 description 4
- AIRUUHAOKGVJAD-JYJNAYRXSA-N Leu-Phe-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O AIRUUHAOKGVJAD-JYJNAYRXSA-N 0.000 description 4
- IZPVWNSAVUQBGP-CIUDSAMLSA-N Leu-Ser-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O IZPVWNSAVUQBGP-CIUDSAMLSA-N 0.000 description 4
- FUKDBQGFSJUXGX-RWMBFGLXSA-N Lys-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCCCN)N)C(=O)O FUKDBQGFSJUXGX-RWMBFGLXSA-N 0.000 description 4
- HQVDJTYKCMIWJP-YUMQZZPRSA-N Lys-Asn-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O HQVDJTYKCMIWJP-YUMQZZPRSA-N 0.000 description 4
- VEGLGAOVLFODGC-GUBZILKMSA-N Lys-Glu-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O VEGLGAOVLFODGC-GUBZILKMSA-N 0.000 description 4
- ULUQBUKAPDUKOC-GVXVVHGQSA-N Lys-Glu-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O ULUQBUKAPDUKOC-GVXVVHGQSA-N 0.000 description 4
- QZONCCHVHCOBSK-YUMQZZPRSA-N Lys-Gly-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O QZONCCHVHCOBSK-YUMQZZPRSA-N 0.000 description 4
- DTUZCYRNEJDKSR-NHCYSSNCSA-N Lys-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN DTUZCYRNEJDKSR-NHCYSSNCSA-N 0.000 description 4
- RBEATVHTWHTHTJ-KKUMJFAQSA-N Lys-Leu-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O RBEATVHTWHTHTJ-KKUMJFAQSA-N 0.000 description 4
- YPLVCBKEPJPBDQ-MELADBBJSA-N Lys-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N YPLVCBKEPJPBDQ-MELADBBJSA-N 0.000 description 4
- LJADEBULDNKJNK-IHRRRGAJSA-N Lys-Leu-Val Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O LJADEBULDNKJNK-IHRRRGAJSA-N 0.000 description 4
- JQSIGLHQNSZZRL-KKUMJFAQSA-N Lys-Lys-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N JQSIGLHQNSZZRL-KKUMJFAQSA-N 0.000 description 4
- HVAUKHLDSDDROB-KKUMJFAQSA-N Lys-Lys-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O HVAUKHLDSDDROB-KKUMJFAQSA-N 0.000 description 4
- WBSCNDJQPKSPII-KKUMJFAQSA-N Lys-Lys-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O WBSCNDJQPKSPII-KKUMJFAQSA-N 0.000 description 4
- AZOFEHCPMBRNFD-BZSNNMDCSA-N Lys-Phe-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(O)=O)CC1=CC=CC=C1 AZOFEHCPMBRNFD-BZSNNMDCSA-N 0.000 description 4
- MEQLGHAMAUPOSJ-DCAQKATOSA-N Lys-Ser-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O MEQLGHAMAUPOSJ-DCAQKATOSA-N 0.000 description 4
- UAPZLLPGGOOCRO-IHRRRGAJSA-N Met-Asn-Phe Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N UAPZLLPGGOOCRO-IHRRRGAJSA-N 0.000 description 4
- FJVJLMZUIGMFFU-BQBZGAKWSA-N Met-Asp-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O FJVJLMZUIGMFFU-BQBZGAKWSA-N 0.000 description 4
- FYRUJIJAUPHUNB-IUCAKERBSA-N Met-Gly-Arg Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCNC(N)=N FYRUJIJAUPHUNB-IUCAKERBSA-N 0.000 description 4
- RRIHXWPHQSXHAQ-XUXIUFHCSA-N Met-Ile-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCCN)C(O)=O RRIHXWPHQSXHAQ-XUXIUFHCSA-N 0.000 description 4
- AFFKUNVPPLQUGA-DCAQKATOSA-N Met-Leu-Ala Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O AFFKUNVPPLQUGA-DCAQKATOSA-N 0.000 description 4
- JCMMNFZUKMMECJ-DCAQKATOSA-N Met-Lys-Asn Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O JCMMNFZUKMMECJ-DCAQKATOSA-N 0.000 description 4
- RIIFMEBFDDXGCV-VEVYYDQMSA-N Met-Thr-Asn Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(N)=O RIIFMEBFDDXGCV-VEVYYDQMSA-N 0.000 description 4
- AJHCSUXXECOXOY-UHFFFAOYSA-N N-glycyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)CN)C(O)=O)=CNC2=C1 AJHCSUXXECOXOY-UHFFFAOYSA-N 0.000 description 4
- 229930182555 Penicillin Natural products 0.000 description 4
- JGSARLDLIJGVTE-MBNYWOFBSA-N Penicillin G Chemical compound N([C@H]1[C@H]2SC([C@@H](N2C1=O)C(O)=O)(C)C)C(=O)CC1=CC=CC=C1 JGSARLDLIJGVTE-MBNYWOFBSA-N 0.000 description 4
- 102000007079 Peptide Fragments Human genes 0.000 description 4
- OJUMUUXGSXUZJZ-SRVKXCTJSA-N Phe-Asp-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O OJUMUUXGSXUZJZ-SRVKXCTJSA-N 0.000 description 4
- PSKRILMFHNIUAO-JYJNAYRXSA-N Phe-Glu-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N PSKRILMFHNIUAO-JYJNAYRXSA-N 0.000 description 4
- JWQWPTLEOFNCGX-AVGNSLFASA-N Phe-Glu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 JWQWPTLEOFNCGX-AVGNSLFASA-N 0.000 description 4
- WEMYTDDMDBLPMI-DKIMLUQUSA-N Phe-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N WEMYTDDMDBLPMI-DKIMLUQUSA-N 0.000 description 4
- KDYPMIZMXDECSU-JYJNAYRXSA-N Phe-Leu-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 KDYPMIZMXDECSU-JYJNAYRXSA-N 0.000 description 4
- XZQYIJALMGEUJD-OEAJRASXSA-N Phe-Lys-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XZQYIJALMGEUJD-OEAJRASXSA-N 0.000 description 4
- WHNJMTHJGCEKGA-ULQDDVLXSA-N Pro-Phe-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O WHNJMTHJGCEKGA-ULQDDVLXSA-N 0.000 description 4
- QPFJSHSJFIYDJZ-GHCJXIJMSA-N Ser-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CO QPFJSHSJFIYDJZ-GHCJXIJMSA-N 0.000 description 4
- OJPHFSOMBZKQKQ-GUBZILKMSA-N Ser-Gln-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CO OJPHFSOMBZKQKQ-GUBZILKMSA-N 0.000 description 4
- MIJWOJAXARLEHA-WDSKDSINSA-N Ser-Gly-Glu Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O MIJWOJAXARLEHA-WDSKDSINSA-N 0.000 description 4
- IUXGJEIKJBYKOO-SRVKXCTJSA-N Ser-Leu-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CO)N IUXGJEIKJBYKOO-SRVKXCTJSA-N 0.000 description 4
- NNFMANHDYSVNIO-DCAQKATOSA-N Ser-Lys-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O NNFMANHDYSVNIO-DCAQKATOSA-N 0.000 description 4
- FLONGDPORFIVQW-XGEHTFHBSA-N Ser-Pro-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO FLONGDPORFIVQW-XGEHTFHBSA-N 0.000 description 4
- SRSPTFBENMJHMR-WHFBIAKZSA-N Ser-Ser-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SRSPTFBENMJHMR-WHFBIAKZSA-N 0.000 description 4
- PCJLFYBAQZQOFE-KATARQTJSA-N Ser-Thr-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CO)N)O PCJLFYBAQZQOFE-KATARQTJSA-N 0.000 description 4
- JZRYFUGREMECBH-XPUUQOCRSA-N Ser-Val-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O JZRYFUGREMECBH-XPUUQOCRSA-N 0.000 description 4
- 241000736110 Sphingomonas paucimobilis Species 0.000 description 4
- NJEMRSFGDNECGF-GCJQMDKQSA-N Thr-Ala-Asp Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O NJEMRSFGDNECGF-GCJQMDKQSA-N 0.000 description 4
- BSNZTJXVDOINSR-JXUBOQSCSA-N Thr-Ala-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O BSNZTJXVDOINSR-JXUBOQSCSA-N 0.000 description 4
- AQAMPXBRJJWPNI-JHEQGTHGSA-N Thr-Gly-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O AQAMPXBRJJWPNI-JHEQGTHGSA-N 0.000 description 4
- VGYVVSQFSSKZRJ-OEAJRASXSA-N Thr-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@H](O)C)CC1=CC=CC=C1 VGYVVSQFSSKZRJ-OEAJRASXSA-N 0.000 description 4
- MXDOAJQRJBMGMO-FJXKBIBVSA-N Thr-Pro-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O MXDOAJQRJBMGMO-FJXKBIBVSA-N 0.000 description 4
- XHWCDRUPDNSDAZ-XKBZYTNZSA-N Thr-Ser-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N)O XHWCDRUPDNSDAZ-XKBZYTNZSA-N 0.000 description 4
- NJGMALCNYAMYCB-JRQIVUDYSA-N Thr-Tyr-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O NJGMALCNYAMYCB-JRQIVUDYSA-N 0.000 description 4
- SDNVRAKIJVKAGS-LKTVYLICSA-N Tyr-Ala-His Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N SDNVRAKIJVKAGS-LKTVYLICSA-N 0.000 description 4
- BEIGSKUPTIFYRZ-SRVKXCTJSA-N Tyr-Asp-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O BEIGSKUPTIFYRZ-SRVKXCTJSA-N 0.000 description 4
- MNWINJDPGBNOED-ULQDDVLXSA-N Tyr-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC1=CC=C(O)C=C1 MNWINJDPGBNOED-ULQDDVLXSA-N 0.000 description 4
- KKHRWGYHBZORMQ-NHCYSSNCSA-N Val-Arg-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N KKHRWGYHBZORMQ-NHCYSSNCSA-N 0.000 description 4
- CVUDMNSZAIZFAE-UHFFFAOYSA-N Val-Arg-Pro Natural products NC(N)=NCCCC(NC(=O)C(N)C(C)C)C(=O)N1CCCC1C(O)=O CVUDMNSZAIZFAE-UHFFFAOYSA-N 0.000 description 4
- QGFPYRPIUXBYGR-YDHLFZDLSA-N Val-Asn-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N QGFPYRPIUXBYGR-YDHLFZDLSA-N 0.000 description 4
- NYTKXWLZSNRILS-IFFSRLJSSA-N Val-Gln-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](C(C)C)N)O NYTKXWLZSNRILS-IFFSRLJSSA-N 0.000 description 4
- YTPLVNUZZOBFFC-SCZZXKLOSA-N Val-Gly-Pro Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N1CCC[C@@H]1C(O)=O YTPLVNUZZOBFFC-SCZZXKLOSA-N 0.000 description 4
- FTKXYXACXYOHND-XUXIUFHCSA-N Val-Ile-Leu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O FTKXYXACXYOHND-XUXIUFHCSA-N 0.000 description 4
- UOUIMEGEPSBZIV-ULQDDVLXSA-N Val-Lys-Tyr Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 UOUIMEGEPSBZIV-ULQDDVLXSA-N 0.000 description 4
- HTONZBWRYUKUKC-RCWTZXSCSA-N Val-Thr-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O HTONZBWRYUKUKC-RCWTZXSCSA-N 0.000 description 4
- 230000009471 action Effects 0.000 description 4
- 108010028939 alanyl-alanyl-lysyl-alanine Proteins 0.000 description 4
- 108010047495 alanylglycine Proteins 0.000 description 4
- 125000000539 amino acid group Chemical group 0.000 description 4
- 108010024668 arginyl-glutamyl-aspartyl-valine Proteins 0.000 description 4
- 108010036533 arginylvaline Proteins 0.000 description 4
- 108010093581 aspartyl-proline Proteins 0.000 description 4
- 210000004899 c-terminal region Anatomy 0.000 description 4
- 238000006481 deamination reaction Methods 0.000 description 4
- 238000000684 flow cytometry Methods 0.000 description 4
- 108010084389 glycyltryptophan Proteins 0.000 description 4
- 108010030617 leucyl-phenylalanyl-valine Proteins 0.000 description 4
- 108010057952 lysyl-phenylalanyl-lysine Proteins 0.000 description 4
- 239000000463 material Substances 0.000 description 4
- 229940049954 penicillin Drugs 0.000 description 4
- 108010015840 seryl-prolyl-lysyl-lysine Proteins 0.000 description 4
- 108010071207 serylmethionine Proteins 0.000 description 4
- NHCPCLJZRSIDHS-ZLUOBGJFSA-N Ala-Asp-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O NHCPCLJZRSIDHS-ZLUOBGJFSA-N 0.000 description 3
- KIUYPHAMDKDICO-WHFBIAKZSA-N Ala-Asp-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O KIUYPHAMDKDICO-WHFBIAKZSA-N 0.000 description 3
- OILNWMNBLIHXQK-ZLUOBGJFSA-N Ala-Cys-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CS)C(=O)N[C@@H](CO)C(O)=O OILNWMNBLIHXQK-ZLUOBGJFSA-N 0.000 description 3
- HXNNRBHASOSVPG-GUBZILKMSA-N Ala-Glu-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O HXNNRBHASOSVPG-GUBZILKMSA-N 0.000 description 3
- WGDNWOMKBUXFHR-BQBZGAKWSA-N Ala-Gly-Arg Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N WGDNWOMKBUXFHR-BQBZGAKWSA-N 0.000 description 3
- MEFILNJXAVSUTO-JXUBOQSCSA-N Ala-Leu-Thr Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MEFILNJXAVSUTO-JXUBOQSCSA-N 0.000 description 3
- AJBVYEYZVYPFCF-CIUDSAMLSA-N Ala-Lys-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O AJBVYEYZVYPFCF-CIUDSAMLSA-N 0.000 description 3
- OEVCHROQUIVQFZ-YTLHQDLWSA-N Ala-Thr-Ala Chemical compound C[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](C)C(O)=O OEVCHROQUIVQFZ-YTLHQDLWSA-N 0.000 description 3
- IETUUAHKCHOQHP-KZVJFYERSA-N Ala-Thr-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@H](C)N)[C@@H](C)O)C(O)=O IETUUAHKCHOQHP-KZVJFYERSA-N 0.000 description 3
- QRIYOHQJRDHFKF-UWJYBYFXSA-N Ala-Tyr-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=C(O)C=C1 QRIYOHQJRDHFKF-UWJYBYFXSA-N 0.000 description 3
- TTXYKSADPSNOIF-IHRRRGAJSA-N Arg-Asp-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O TTXYKSADPSNOIF-IHRRRGAJSA-N 0.000 description 3
- PPPXVIBMLFWNSK-BQBZGAKWSA-N Arg-Gly-Cys Chemical compound C(C[C@@H](C(=O)NCC(=O)N[C@@H](CS)C(=O)O)N)CN=C(N)N PPPXVIBMLFWNSK-BQBZGAKWSA-N 0.000 description 3
- PHHRSPBBQUFULD-UWVGGRQHSA-N Arg-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCCN=C(N)N)N PHHRSPBBQUFULD-UWVGGRQHSA-N 0.000 description 3
- RKQRHMKFNBYOTN-IHRRRGAJSA-N Arg-His-Lys Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N RKQRHMKFNBYOTN-IHRRRGAJSA-N 0.000 description 3
- KSUALAGYYLQSHJ-RCWTZXSCSA-N Arg-Met-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KSUALAGYYLQSHJ-RCWTZXSCSA-N 0.000 description 3
- BXUHCIXDSWRSBS-CIUDSAMLSA-N Asn-Leu-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O BXUHCIXDSWRSBS-CIUDSAMLSA-N 0.000 description 3
- NJSNXIOKBHPFMB-GMOBBJLQSA-N Asn-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC(=O)N)N NJSNXIOKBHPFMB-GMOBBJLQSA-N 0.000 description 3
- LBOVBQONZJRWPV-YUMQZZPRSA-N Asp-Lys-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O LBOVBQONZJRWPV-YUMQZZPRSA-N 0.000 description 3
- 241000606125 Bacteroides Species 0.000 description 3
- 241000235385 Caecomyces Species 0.000 description 3
- 108091033380 Coding strand Proteins 0.000 description 3
- FBPFZTCFMRRESA-KVTDHHQDSA-N D-Mannitol Chemical compound OC[C@@H](O)[C@@H](O)[C@H](O)[C@H](O)CO FBPFZTCFMRRESA-KVTDHHQDSA-N 0.000 description 3
- WZZSKAJIHTUUSG-ACZMJKKPSA-N Glu-Ala-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O WZZSKAJIHTUUSG-ACZMJKKPSA-N 0.000 description 3
- NCWOMXABNYEPLY-NRPADANISA-N Glu-Ala-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O NCWOMXABNYEPLY-NRPADANISA-N 0.000 description 3
- XHUCVVHRLNPZSZ-CIUDSAMLSA-N Glu-Gln-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O XHUCVVHRLNPZSZ-CIUDSAMLSA-N 0.000 description 3
- ZSWGJYOZWBHROQ-RWRJDSDZSA-N Glu-Ile-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZSWGJYOZWBHROQ-RWRJDSDZSA-N 0.000 description 3
- BFEZQZKEPRKKHV-SRVKXCTJSA-N Glu-Pro-Lys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CCC(=O)O)N)C(=O)N[C@@H](CCCCN)C(=O)O BFEZQZKEPRKKHV-SRVKXCTJSA-N 0.000 description 3
- BPLNJYHNAJVLRT-ACZMJKKPSA-N Glu-Ser-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O BPLNJYHNAJVLRT-ACZMJKKPSA-N 0.000 description 3
- VNCNWQPIQYAMAK-ACZMJKKPSA-N Glu-Ser-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O VNCNWQPIQYAMAK-ACZMJKKPSA-N 0.000 description 3
- VIPDPMHGICREIS-GVXVVHGQSA-N Glu-Val-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O VIPDPMHGICREIS-GVXVVHGQSA-N 0.000 description 3
- JPXNYFOHTHSREU-UWVGGRQHSA-N Gly-Arg-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)CN JPXNYFOHTHSREU-UWVGGRQHSA-N 0.000 description 3
- DTPOVRRYXPJJAZ-FJXKBIBVSA-N Gly-Arg-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N DTPOVRRYXPJJAZ-FJXKBIBVSA-N 0.000 description 3
- XCLCVBYNGXEVDU-WHFBIAKZSA-N Gly-Asn-Ser Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O XCLCVBYNGXEVDU-WHFBIAKZSA-N 0.000 description 3
- LCNXZQROPKFGQK-WHFBIAKZSA-N Gly-Asp-Ser Chemical compound NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O LCNXZQROPKFGQK-WHFBIAKZSA-N 0.000 description 3
- TTZAWSKKNCEINZ-AVGNSLFASA-N His-Arg-Val Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(O)=O TTZAWSKKNCEINZ-AVGNSLFASA-N 0.000 description 3
- ZSKJIISDJXJQPV-BZSNNMDCSA-N His-Leu-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CN=CN1 ZSKJIISDJXJQPV-BZSNNMDCSA-N 0.000 description 3
- KFVUBLZRFSVDGO-BYULHYEWSA-N Ile-Gly-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O KFVUBLZRFSVDGO-BYULHYEWSA-N 0.000 description 3
- PDTMWFVVNZYWTR-NHCYSSNCSA-N Ile-Gly-Lys Chemical compound CC[C@H](C)[C@H](N)C(=O)NCC(=O)N[C@@H](CCCCN)C(O)=O PDTMWFVVNZYWTR-NHCYSSNCSA-N 0.000 description 3
- CIDLJWVDMNDKPT-FIRPJDEBSA-N Ile-Phe-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)O)N CIDLJWVDMNDKPT-FIRPJDEBSA-N 0.000 description 3
- WGNOPSQMIQERPK-UHFFFAOYSA-N Leu-Asn-Pro Natural products CC(C)CC(N)C(=O)NC(CC(=O)N)C(=O)N1CCCC1C(=O)O WGNOPSQMIQERPK-UHFFFAOYSA-N 0.000 description 3
- HNDWYLYAYNBWMP-AJNGGQMLSA-N Leu-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(C)C)N HNDWYLYAYNBWMP-AJNGGQMLSA-N 0.000 description 3
- YRRCOJOXAJNSAX-IHRRRGAJSA-N Leu-Pro-Lys Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)O)N YRRCOJOXAJNSAX-IHRRRGAJSA-N 0.000 description 3
- 239000006142 Luria-Bertani Agar Substances 0.000 description 3
- SJNZALDHDUYDBU-IHRRRGAJSA-N Lys-Arg-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(O)=O SJNZALDHDUYDBU-IHRRRGAJSA-N 0.000 description 3
- SWWCDAGDQHTKIE-RHYQMDGZSA-N Lys-Arg-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SWWCDAGDQHTKIE-RHYQMDGZSA-N 0.000 description 3
- MQMIRLVJXQNTRJ-SDDRHHMPSA-N Lys-Gln-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCCN)N)C(=O)O MQMIRLVJXQNTRJ-SDDRHHMPSA-N 0.000 description 3
- SKRGVGLIRUGANF-AVGNSLFASA-N Lys-Leu-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SKRGVGLIRUGANF-AVGNSLFASA-N 0.000 description 3
- VUTWYNQUSJWBHO-BZSNNMDCSA-N Lys-Leu-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VUTWYNQUSJWBHO-BZSNNMDCSA-N 0.000 description 3
- GZGWILAQHOVXTD-DCAQKATOSA-N Lys-Met-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(O)=O GZGWILAQHOVXTD-DCAQKATOSA-N 0.000 description 3
- JMNRXRPBHFGXQX-GUBZILKMSA-N Lys-Ser-Glu Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O JMNRXRPBHFGXQX-GUBZILKMSA-N 0.000 description 3
- QVTDVTONTRSQMF-WDCWCFNPSA-N Lys-Thr-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CCCCN QVTDVTONTRSQMF-WDCWCFNPSA-N 0.000 description 3
- IEIHKHYMBIYQTH-YESZJQIVSA-N Lys-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CCCCN)N)C(=O)O IEIHKHYMBIYQTH-YESZJQIVSA-N 0.000 description 3
- NYTDJEZBAAFLLG-IHRRRGAJSA-N Lys-Val-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(O)=O NYTDJEZBAAFLLG-IHRRRGAJSA-N 0.000 description 3
- RIPJMCFGQHGHNP-RHYQMDGZSA-N Lys-Val-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CCCCN)N)O RIPJMCFGQHGHNP-RHYQMDGZSA-N 0.000 description 3
- 241001465754 Metazoa Species 0.000 description 3
- WPTYDQPGBMDUBI-QWRGUYRKSA-N Phe-Gly-Asn Chemical compound N[C@@H](Cc1ccccc1)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O WPTYDQPGBMDUBI-QWRGUYRKSA-N 0.000 description 3
- YKUGPVXSDOOANW-KKUMJFAQSA-N Phe-Leu-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O YKUGPVXSDOOANW-KKUMJFAQSA-N 0.000 description 3
- LRBSWBVUCLLRLU-BZSNNMDCSA-N Phe-Leu-Lys Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)Cc1ccccc1)C(=O)N[C@@H](CCCCN)C(O)=O LRBSWBVUCLLRLU-BZSNNMDCSA-N 0.000 description 3
- RBRNEFJTEHPDSL-ACRUOGEOSA-N Phe-Phe-Lys Chemical compound C([C@@H](C(=O)N[C@@H](CCCCN)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 RBRNEFJTEHPDSL-ACRUOGEOSA-N 0.000 description 3
- DSXPMZMSJHOKKK-HJOGWXRNSA-N Phe-Phe-Tyr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O DSXPMZMSJHOKKK-HJOGWXRNSA-N 0.000 description 3
- 241000155545 Phoma citri Species 0.000 description 3
- OBVCYFIHIIYIQF-CIUDSAMLSA-N Pro-Asn-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O OBVCYFIHIIYIQF-CIUDSAMLSA-N 0.000 description 3
- AFXCXDQNRXTSBD-FJXKBIBVSA-N Pro-Gly-Thr Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O AFXCXDQNRXTSBD-FJXKBIBVSA-N 0.000 description 3
- DNIAPMSPPWPWGF-UHFFFAOYSA-N Propylene glycol Chemical compound CC(O)CO DNIAPMSPPWPWGF-UHFFFAOYSA-N 0.000 description 3
- CZPWVGJYEJSRLH-UHFFFAOYSA-N Pyrimidine Chemical compound C1=CN=CN=C1 CZPWVGJYEJSRLH-UHFFFAOYSA-N 0.000 description 3
- VQBCMLMPEWPUTB-ACZMJKKPSA-N Ser-Glu-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O VQBCMLMPEWPUTB-ACZMJKKPSA-N 0.000 description 3
- CRJZZXMAADSBBQ-SRVKXCTJSA-N Ser-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CO CRJZZXMAADSBBQ-SRVKXCTJSA-N 0.000 description 3
- ADJDNJCSPNFFPI-FXQIFTODSA-N Ser-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO ADJDNJCSPNFFPI-FXQIFTODSA-N 0.000 description 3
- 108020004682 Single-Stranded DNA Proteins 0.000 description 3
- 241000589196 Sinorhizobium meliloti Species 0.000 description 3
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 3
- 108091081024 Start codon Proteins 0.000 description 3
- FQPQPTHMHZKGFM-XQXXSGGOSA-N Thr-Ala-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O FQPQPTHMHZKGFM-XQXXSGGOSA-N 0.000 description 3
- XKWABWFMQXMUMT-HJGDQZAQSA-N Thr-Pro-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O XKWABWFMQXMUMT-HJGDQZAQSA-N 0.000 description 3
- RCMWNNJFKNDKQR-UFYCRDLUSA-N Tyr-Pro-Phe Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 RCMWNNJFKNDKQR-UFYCRDLUSA-N 0.000 description 3
- RUCNAYOMFXRIKJ-DCAQKATOSA-N Val-Ala-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN RUCNAYOMFXRIKJ-DCAQKATOSA-N 0.000 description 3
- ZXAGTABZUOMUDO-GVXVVHGQSA-N Val-Glu-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N ZXAGTABZUOMUDO-GVXVVHGQSA-N 0.000 description 3
- CKTMJBPRVQWPHU-JSGCOSHPSA-N Val-Phe-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)O)N CKTMJBPRVQWPHU-JSGCOSHPSA-N 0.000 description 3
- ZHWZDZFWBXWPDW-GUBZILKMSA-N Val-Val-Cys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CS)C(O)=O ZHWZDZFWBXWPDW-GUBZILKMSA-N 0.000 description 3
- 101150063416 add gene Proteins 0.000 description 3
- 238000000137 annealing Methods 0.000 description 3
- 238000002659 cell therapy Methods 0.000 description 3
- KRKNYBCHXYNGOX-UHFFFAOYSA-N citric acid Chemical compound OC(=O)CC(O)(C(O)=O)CC(O)=O KRKNYBCHXYNGOX-UHFFFAOYSA-N 0.000 description 3
- 150000001875 compounds Chemical class 0.000 description 3
- 230000009615 deamination Effects 0.000 description 3
- 208000035475 disorder Diseases 0.000 description 3
- 238000002474 experimental method Methods 0.000 description 3
- 230000004927 fusion Effects 0.000 description 3
- 238000001476 gene delivery Methods 0.000 description 3
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 3
- JYPCXBJRLBHWME-UHFFFAOYSA-N glycyl-L-prolyl-L-arginine Natural products NCC(=O)N1CCCC1C(=O)NC(CCCN=C(N)N)C(O)=O JYPCXBJRLBHWME-UHFFFAOYSA-N 0.000 description 3
- 108010050475 glycyl-leucyl-tyrosine Proteins 0.000 description 3
- 108010087823 glycyltyrosine Proteins 0.000 description 3
- 239000007924 injection Substances 0.000 description 3
- 238000002347 injection Methods 0.000 description 3
- 230000004048 modification Effects 0.000 description 3
- 238000012986 modification Methods 0.000 description 3
- 230000003020 moisturizing effect Effects 0.000 description 3
- 239000000843 powder Substances 0.000 description 3
- 230000002265 prevention Effects 0.000 description 3
- 238000005215 recombination Methods 0.000 description 3
- 230000006798 recombination Effects 0.000 description 3
- 238000011084 recovery Methods 0.000 description 3
- 229940113082 thymine Drugs 0.000 description 3
- 238000012546 transfer Methods 0.000 description 3
- 108010038745 tryptophylglycine Proteins 0.000 description 3
- 108091012372 uracil binding proteins Proteins 0.000 description 3
- 229930024421 Adenine Natural products 0.000 description 2
- GFFGJBXGBJISGV-UHFFFAOYSA-N Adenine Chemical compound NC1=NC=NC2=C1N=CN2 GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 2
- FJVAQLJNTSUQPY-CIUDSAMLSA-N Ala-Ala-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN FJVAQLJNTSUQPY-CIUDSAMLSA-N 0.000 description 2
- SKHCUBQVZJHOFM-NAKRPEOUSA-N Ala-Arg-Ile Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O SKHCUBQVZJHOFM-NAKRPEOUSA-N 0.000 description 2
- XEXJJJRVTFGWIC-FXQIFTODSA-N Ala-Asn-Arg Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N XEXJJJRVTFGWIC-FXQIFTODSA-N 0.000 description 2
- YIGLXQRFQVWFEY-NRPADANISA-N Ala-Gln-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O YIGLXQRFQVWFEY-NRPADANISA-N 0.000 description 2
- NWVVKQZOVSTDBQ-CIUDSAMLSA-N Ala-Glu-Arg Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O NWVVKQZOVSTDBQ-CIUDSAMLSA-N 0.000 description 2
- FBHOPGDGELNWRH-DRZSPHRISA-N Ala-Glu-Phe Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O FBHOPGDGELNWRH-DRZSPHRISA-N 0.000 description 2
- ZVFVBBGVOILKPO-WHFBIAKZSA-N Ala-Gly-Ala Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O ZVFVBBGVOILKPO-WHFBIAKZSA-N 0.000 description 2
- DVJSJDDYCYSMFR-ZKWXMUAHSA-N Ala-Ile-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O DVJSJDDYCYSMFR-ZKWXMUAHSA-N 0.000 description 2
- SUMYEVXWCAYLLJ-GUBZILKMSA-N Ala-Leu-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O SUMYEVXWCAYLLJ-GUBZILKMSA-N 0.000 description 2
- OYJCVIGKMXUVKB-GARJFASQSA-N Ala-Leu-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N OYJCVIGKMXUVKB-GARJFASQSA-N 0.000 description 2
- ZBLQIYPCUWZSRZ-QEJZJMRPSA-N Ala-Phe-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CC1=CC=CC=C1 ZBLQIYPCUWZSRZ-QEJZJMRPSA-N 0.000 description 2
- NHWYNIZWLJYZAG-XVYDVKMFSA-N Ala-Ser-His Chemical compound C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N NHWYNIZWLJYZAG-XVYDVKMFSA-N 0.000 description 2
- HOVPGJUNRLMIOZ-CIUDSAMLSA-N Ala-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@H](C)N HOVPGJUNRLMIOZ-CIUDSAMLSA-N 0.000 description 2
- VRTOMXFZHGWHIJ-KZVJFYERSA-N Ala-Thr-Arg Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O VRTOMXFZHGWHIJ-KZVJFYERSA-N 0.000 description 2
- XCIGOVDXZULBBV-DCAQKATOSA-N Ala-Val-Lys Chemical compound CC(C)[C@H](NC(=O)[C@H](C)N)C(=O)N[C@@H](CCCCN)C(O)=O XCIGOVDXZULBBV-DCAQKATOSA-N 0.000 description 2
- PEFFAAKJGBZBKL-NAKRPEOUSA-N Arg-Ala-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O PEFFAAKJGBZBKL-NAKRPEOUSA-N 0.000 description 2
- SBVJJNJLFWSJOV-UBHSHLNASA-N Arg-Ala-Phe Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 SBVJJNJLFWSJOV-UBHSHLNASA-N 0.000 description 2
- DBKNLHKEVPZVQC-LPEHRKFASA-N Arg-Ala-Pro Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@@H]1C(O)=O DBKNLHKEVPZVQC-LPEHRKFASA-N 0.000 description 2
- RRGPUNYIPJXJBU-GUBZILKMSA-N Arg-Asp-Met Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCSC)C(O)=O RRGPUNYIPJXJBU-GUBZILKMSA-N 0.000 description 2
- JCAISGGAOQXEHJ-ZPFDUUQYSA-N Arg-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCN=C(N)N)N JCAISGGAOQXEHJ-ZPFDUUQYSA-N 0.000 description 2
- QAODJPUKWNNNRP-DCAQKATOSA-N Arg-Glu-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O QAODJPUKWNNNRP-DCAQKATOSA-N 0.000 description 2
- RKRSYHCNPFGMTA-CIUDSAMLSA-N Arg-Glu-Asn Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O RKRSYHCNPFGMTA-CIUDSAMLSA-N 0.000 description 2
- MZRBYBIQTIKERR-GUBZILKMSA-N Arg-Glu-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MZRBYBIQTIKERR-GUBZILKMSA-N 0.000 description 2
- GOWZVQXTHUCNSQ-NHCYSSNCSA-N Arg-Glu-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O GOWZVQXTHUCNSQ-NHCYSSNCSA-N 0.000 description 2
- AQPVUEJJARLJHB-BQBZGAKWSA-N Arg-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CCCN=C(N)N AQPVUEJJARLJHB-BQBZGAKWSA-N 0.000 description 2
- AUFHLLPVPSMEOG-YUMQZZPRSA-N Arg-Gly-Glu Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O AUFHLLPVPSMEOG-YUMQZZPRSA-N 0.000 description 2
- LVMUGODRNHFGRA-AVGNSLFASA-N Arg-Leu-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O LVMUGODRNHFGRA-AVGNSLFASA-N 0.000 description 2
- OTZMRMHZCMZOJZ-SRVKXCTJSA-N Arg-Leu-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O OTZMRMHZCMZOJZ-SRVKXCTJSA-N 0.000 description 2
- MJINRRBEMOLJAK-DCAQKATOSA-N Arg-Lys-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCN=C(N)N MJINRRBEMOLJAK-DCAQKATOSA-N 0.000 description 2
- GRRXPUAICOGISM-RWMBFGLXSA-N Arg-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O GRRXPUAICOGISM-RWMBFGLXSA-N 0.000 description 2
- PYZPXCZNQSEHDT-GUBZILKMSA-N Arg-Met-Asn Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N PYZPXCZNQSEHDT-GUBZILKMSA-N 0.000 description 2
- OISWSORSLQOGFV-AVGNSLFASA-N Arg-Met-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCSC)NC(=O)[C@@H](N)CCCN=C(N)N OISWSORSLQOGFV-AVGNSLFASA-N 0.000 description 2
- CZUHPNLXLWMYMG-UBHSHLNASA-N Arg-Phe-Ala Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CC=CC=C1 CZUHPNLXLWMYMG-UBHSHLNASA-N 0.000 description 2
- YTMKMRSYXHBGER-IHRRRGAJSA-N Arg-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N YTMKMRSYXHBGER-IHRRRGAJSA-N 0.000 description 2
- DNBMCNQKNOKOSD-DCAQKATOSA-N Arg-Pro-Gln Chemical compound NC(N)=NCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O DNBMCNQKNOKOSD-DCAQKATOSA-N 0.000 description 2
- NGYHSXDNNOFHNE-AVGNSLFASA-N Arg-Pro-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O NGYHSXDNNOFHNE-AVGNSLFASA-N 0.000 description 2
- ASQKVGRCKOFKIU-KZVJFYERSA-N Arg-Thr-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)O ASQKVGRCKOFKIU-KZVJFYERSA-N 0.000 description 2
- AOJYORNRFWWEIV-IHRRRGAJSA-N Arg-Tyr-Asp Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)CC1=CC=C(O)C=C1 AOJYORNRFWWEIV-IHRRRGAJSA-N 0.000 description 2
- QCTOLCVIGRLMQS-HRCADAONSA-N Arg-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O QCTOLCVIGRLMQS-HRCADAONSA-N 0.000 description 2
- QLSRIZIDQXDQHK-RCWTZXSCSA-N Arg-Val-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QLSRIZIDQXDQHK-RCWTZXSCSA-N 0.000 description 2
- CIWBSHSKHKDKBQ-JLAZNSOCSA-N Ascorbic acid Chemical compound OC[C@H](O)[C@H]1OC(=O)C(O)=C1O CIWBSHSKHKDKBQ-JLAZNSOCSA-N 0.000 description 2
- XWGJDUSDTRPQRK-ZLUOBGJFSA-N Asn-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC(N)=O XWGJDUSDTRPQRK-ZLUOBGJFSA-N 0.000 description 2
- QEYJFBMTSMLPKZ-ZKWXMUAHSA-N Asn-Ala-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O QEYJFBMTSMLPKZ-ZKWXMUAHSA-N 0.000 description 2
- WPOLSNAQGVHROR-GUBZILKMSA-N Asn-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)N)N WPOLSNAQGVHROR-GUBZILKMSA-N 0.000 description 2
- HCAUEJAQCXVQQM-ACZMJKKPSA-N Asn-Glu-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O HCAUEJAQCXVQQM-ACZMJKKPSA-N 0.000 description 2
- QYXNFROWLZPWPC-FXQIFTODSA-N Asn-Glu-Gln Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O QYXNFROWLZPWPC-FXQIFTODSA-N 0.000 description 2
- BZMWJLLUAKSIMH-FXQIFTODSA-N Asn-Glu-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O BZMWJLLUAKSIMH-FXQIFTODSA-N 0.000 description 2
- FTCGGKNCJZOPNB-WHFBIAKZSA-N Asn-Gly-Ser Chemical compound NC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O FTCGGKNCJZOPNB-WHFBIAKZSA-N 0.000 description 2
- YYSYDIYQTUPNQQ-SXTJYALSSA-N Asn-Ile-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O YYSYDIYQTUPNQQ-SXTJYALSSA-N 0.000 description 2
- DJIMLSXHXKWADV-CIUDSAMLSA-N Asn-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(N)=O DJIMLSXHXKWADV-CIUDSAMLSA-N 0.000 description 2
- RAUPFUCUDBQYHE-AVGNSLFASA-N Asn-Phe-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O RAUPFUCUDBQYHE-AVGNSLFASA-N 0.000 description 2
- YUUIAUXBNOHFRJ-IHRRRGAJSA-N Asn-Phe-Met Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(O)=O YUUIAUXBNOHFRJ-IHRRRGAJSA-N 0.000 description 2
- FTNRWCPWDWRPAV-BZSNNMDCSA-N Asn-Phe-Phe Chemical compound C([C@H](NC(=O)[C@H](CC(N)=O)N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 FTNRWCPWDWRPAV-BZSNNMDCSA-N 0.000 description 2
- KTDWFWNZLLFEFU-KKUMJFAQSA-N Asn-Tyr-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O KTDWFWNZLLFEFU-KKUMJFAQSA-N 0.000 description 2
- CGYKCTPUGXFPMG-IHPCNDPISA-N Asn-Tyr-Trp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O CGYKCTPUGXFPMG-IHPCNDPISA-N 0.000 description 2
- XBQSLMACWDXWLJ-GHCJXIJMSA-N Asp-Ala-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O XBQSLMACWDXWLJ-GHCJXIJMSA-N 0.000 description 2
- BLQBMRNMBAYREH-UWJYBYFXSA-N Asp-Ala-Tyr Chemical compound N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O BLQBMRNMBAYREH-UWJYBYFXSA-N 0.000 description 2
- RGKKALNPOYURGE-ZKWXMUAHSA-N Asp-Ala-Val Chemical compound N[C@@H](CC(=O)O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)O RGKKALNPOYURGE-ZKWXMUAHSA-N 0.000 description 2
- OERMIMJQPQUIPK-FXQIFTODSA-N Asp-Arg-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O OERMIMJQPQUIPK-FXQIFTODSA-N 0.000 description 2
- WSOKZUVWBXVJHX-CIUDSAMLSA-N Asp-Arg-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O WSOKZUVWBXVJHX-CIUDSAMLSA-N 0.000 description 2
- MFMJRYHVLLEMQM-DCAQKATOSA-N Asp-Arg-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(=O)O)N MFMJRYHVLLEMQM-DCAQKATOSA-N 0.000 description 2
- KNMRXHIAVXHCLW-ZLUOBGJFSA-N Asp-Asn-Ser Chemical compound C([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N)C(=O)O KNMRXHIAVXHCLW-ZLUOBGJFSA-N 0.000 description 2
- RDRMWJBLOSRRAW-BYULHYEWSA-N Asp-Asn-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O RDRMWJBLOSRRAW-BYULHYEWSA-N 0.000 description 2
- SBHUBSDEZQFJHJ-CIUDSAMLSA-N Asp-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(O)=O SBHUBSDEZQFJHJ-CIUDSAMLSA-N 0.000 description 2
- CELPEWWLSXMVPH-CIUDSAMLSA-N Asp-Asp-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(O)=O CELPEWWLSXMVPH-CIUDSAMLSA-N 0.000 description 2
- DXQOQMCLWWADMU-ACZMJKKPSA-N Asp-Gln-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O DXQOQMCLWWADMU-ACZMJKKPSA-N 0.000 description 2
- RRKCPMGSRIDLNC-AVGNSLFASA-N Asp-Glu-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O RRKCPMGSRIDLNC-AVGNSLFASA-N 0.000 description 2
- TVIZQBFURPLQDV-DJFWLOJKSA-N Asp-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC(=O)O)N TVIZQBFURPLQDV-DJFWLOJKSA-N 0.000 description 2
- CYCKJEFVFNRWEZ-UGYAYLCHSA-N Asp-Ile-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O CYCKJEFVFNRWEZ-UGYAYLCHSA-N 0.000 description 2
- KYQNAIMCTRZLNP-QSFUFRPTSA-N Asp-Ile-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O KYQNAIMCTRZLNP-QSFUFRPTSA-N 0.000 description 2
- JNNVNVRBYUJYGS-CIUDSAMLSA-N Asp-Leu-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O JNNVNVRBYUJYGS-CIUDSAMLSA-N 0.000 description 2
- CLUMZOKVGUWUFD-CIUDSAMLSA-N Asp-Leu-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O CLUMZOKVGUWUFD-CIUDSAMLSA-N 0.000 description 2
- XWSIYTYNLKCLJB-CIUDSAMLSA-N Asp-Lys-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O XWSIYTYNLKCLJB-CIUDSAMLSA-N 0.000 description 2
- GYWQGGUCMDCUJE-DLOVCJGASA-N Asp-Phe-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(O)=O GYWQGGUCMDCUJE-DLOVCJGASA-N 0.000 description 2
- JDDYEZGPYBBPBN-JRQIVUDYSA-N Asp-Thr-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JDDYEZGPYBBPBN-JRQIVUDYSA-N 0.000 description 2
- SQIARYGNVQWOSB-BZSNNMDCSA-N Asp-Tyr-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SQIARYGNVQWOSB-BZSNNMDCSA-N 0.000 description 2
- 108010075254 C-Peptide Proteins 0.000 description 2
- 238000010354 CRISPR gene editing Methods 0.000 description 2
- 238000010453 CRISPR/Cas method Methods 0.000 description 2
- 201000009030 Carcinoma Diseases 0.000 description 2
- CHRCKSPMGYDLIA-SRVKXCTJSA-N Cys-Phe-Ser Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O CHRCKSPMGYDLIA-SRVKXCTJSA-N 0.000 description 2
- CNAMJJOZGXPDHW-IHRRRGAJSA-N Cys-Pro-Phe Chemical compound N[C@@H](CS)C(=O)N1CCC[C@H]1C(=O)N[C@@H](Cc1ccccc1)C(O)=O CNAMJJOZGXPDHW-IHRRRGAJSA-N 0.000 description 2
- 241000701022 Cytomegalovirus Species 0.000 description 2
- FBPFZTCFMRRESA-FSIIMWSLSA-N D-Glucitol Natural products OC[C@H](O)[C@H](O)[C@@H](O)[C@H](O)CO FBPFZTCFMRRESA-FSIIMWSLSA-N 0.000 description 2
- FBPFZTCFMRRESA-JGWLITMVSA-N D-glucitol Chemical compound OC[C@H](O)[C@@H](O)[C@H](O)[C@H](O)CO FBPFZTCFMRRESA-JGWLITMVSA-N 0.000 description 2
- 101710177611 DNA polymerase II large subunit Proteins 0.000 description 2
- 101710184669 DNA polymerase II small subunit Proteins 0.000 description 2
- 230000033616 DNA repair Effects 0.000 description 2
- 108700020911 DNA-Binding Proteins Proteins 0.000 description 2
- 241000588724 Escherichia coli Species 0.000 description 2
- PHZYLYASFWHLHJ-FXQIFTODSA-N Gln-Asn-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O PHZYLYASFWHLHJ-FXQIFTODSA-N 0.000 description 2
- ZPDVKYLJTOFQJV-WDSKDSINSA-N Gln-Asn-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O ZPDVKYLJTOFQJV-WDSKDSINSA-N 0.000 description 2
- XEYMBRRKIFYQMF-GUBZILKMSA-N Gln-Asp-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O XEYMBRRKIFYQMF-GUBZILKMSA-N 0.000 description 2
- SNLOOPZHAQDMJG-CIUDSAMLSA-N Gln-Glu-Glu Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SNLOOPZHAQDMJG-CIUDSAMLSA-N 0.000 description 2
- PNENQZWRFMUZOM-DCAQKATOSA-N Gln-Glu-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O PNENQZWRFMUZOM-DCAQKATOSA-N 0.000 description 2
- HVQCEQTUSWWFOS-WDSKDSINSA-N Gln-Gly-Cys Chemical compound C(CC(=O)N)[C@@H](C(=O)NCC(=O)N[C@@H](CS)C(=O)O)N HVQCEQTUSWWFOS-WDSKDSINSA-N 0.000 description 2
- GIVHPCWYVWUUSG-HVTMNAMFSA-N Gln-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCC(=O)N)N GIVHPCWYVWUUSG-HVTMNAMFSA-N 0.000 description 2
- CAXXTYYGFYTBPV-IUCAKERBSA-N Gln-Leu-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O CAXXTYYGFYTBPV-IUCAKERBSA-N 0.000 description 2
- PSERKXGRRADTKA-MNXVOIDGSA-N Gln-Leu-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O PSERKXGRRADTKA-MNXVOIDGSA-N 0.000 description 2
- IULKWYSYZSURJK-AVGNSLFASA-N Gln-Leu-Lys Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O IULKWYSYZSURJK-AVGNSLFASA-N 0.000 description 2
- ZBKUIQNCRIYVGH-SDDRHHMPSA-N Gln-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)N)N ZBKUIQNCRIYVGH-SDDRHHMPSA-N 0.000 description 2
- WTJIWXMJESRHMM-XDTLVQLUSA-N Gln-Tyr-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O WTJIWXMJESRHMM-XDTLVQLUSA-N 0.000 description 2
- YKLNMGJYMNPBCP-ACZMJKKPSA-N Glu-Asn-Asp Chemical compound C(CC(=O)O)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N YKLNMGJYMNPBCP-ACZMJKKPSA-N 0.000 description 2
- CKRUHITYRFNUKW-WDSKDSINSA-N Glu-Asn-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O CKRUHITYRFNUKW-WDSKDSINSA-N 0.000 description 2
- RDPOETHPAQEGDP-ACZMJKKPSA-N Glu-Asp-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O RDPOETHPAQEGDP-ACZMJKKPSA-N 0.000 description 2
- HJIFPJUEOGZWRI-GUBZILKMSA-N Glu-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)O)N HJIFPJUEOGZWRI-GUBZILKMSA-N 0.000 description 2
- ZXQPJYWZSFGWJB-AVGNSLFASA-N Glu-Cys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CCC(=O)O)N ZXQPJYWZSFGWJB-AVGNSLFASA-N 0.000 description 2
- CJWANNXUTOATSJ-DCAQKATOSA-N Glu-Gln-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCC(=O)O)N CJWANNXUTOATSJ-DCAQKATOSA-N 0.000 description 2
- UMIRPYLZFKOEOH-YVNDNENWSA-N Glu-Gln-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O UMIRPYLZFKOEOH-YVNDNENWSA-N 0.000 description 2
- WPLGNDORMXTMQS-FXQIFTODSA-N Glu-Gln-Ser Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O WPLGNDORMXTMQS-FXQIFTODSA-N 0.000 description 2
- SJPMNHCEWPTRBR-BQBZGAKWSA-N Glu-Glu-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O SJPMNHCEWPTRBR-BQBZGAKWSA-N 0.000 description 2
- VOORMNJKNBGYGK-YUMQZZPRSA-N Glu-Gly-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)O)N VOORMNJKNBGYGK-YUMQZZPRSA-N 0.000 description 2
- XOFYVODYSNKPDK-AVGNSLFASA-N Glu-His-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CCC(=O)O)N XOFYVODYSNKPDK-AVGNSLFASA-N 0.000 description 2
- CXRWMMRLEMVSEH-PEFMBERDSA-N Glu-Ile-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O CXRWMMRLEMVSEH-PEFMBERDSA-N 0.000 description 2
- ITBHUUMCJJQUSC-LAEOZQHASA-N Glu-Ile-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O ITBHUUMCJJQUSC-LAEOZQHASA-N 0.000 description 2
- ZCOJVESMNGBGLF-GRLWGSQLSA-N Glu-Ile-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ZCOJVESMNGBGLF-GRLWGSQLSA-N 0.000 description 2
- WTMZXOPHTIVFCP-QEWYBTABSA-N Glu-Ile-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 WTMZXOPHTIVFCP-QEWYBTABSA-N 0.000 description 2
- IOUQWHIEQYQVFD-JYJNAYRXSA-N Glu-Leu-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O IOUQWHIEQYQVFD-JYJNAYRXSA-N 0.000 description 2
- SWRVAQHFBRZVNX-GUBZILKMSA-N Glu-Lys-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O SWRVAQHFBRZVNX-GUBZILKMSA-N 0.000 description 2
- ZGEJRLJEAMPEDV-SRVKXCTJSA-N Glu-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCC(=O)O)N ZGEJRLJEAMPEDV-SRVKXCTJSA-N 0.000 description 2
- ZQYZDDXTNQXUJH-CIUDSAMLSA-N Glu-Met-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CCC(=O)O)N ZQYZDDXTNQXUJH-CIUDSAMLSA-N 0.000 description 2
- RFTVTKBHDXCEEX-WDSKDSINSA-N Glu-Ser-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RFTVTKBHDXCEEX-WDSKDSINSA-N 0.000 description 2
- SYAYROHMAIHWFB-KBIXCLLPSA-N Glu-Ser-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O SYAYROHMAIHWFB-KBIXCLLPSA-N 0.000 description 2
- BXSZPACYCMNKLS-AVGNSLFASA-N Glu-Ser-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O BXSZPACYCMNKLS-AVGNSLFASA-N 0.000 description 2
- GPSHCSTUYOQPAI-JHEQGTHGSA-N Glu-Thr-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O GPSHCSTUYOQPAI-JHEQGTHGSA-N 0.000 description 2
- RGJKYNUINKGPJN-RWRJDSDZSA-N Glu-Thr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](CCC(=O)O)N RGJKYNUINKGPJN-RWRJDSDZSA-N 0.000 description 2
- UMZHHILWZBFPGL-LOKLDPHHSA-N Glu-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O UMZHHILWZBFPGL-LOKLDPHHSA-N 0.000 description 2
- KIEICAOUSNYOLM-NRPADANISA-N Glu-Val-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O KIEICAOUSNYOLM-NRPADANISA-N 0.000 description 2
- VSVZIEVNUYDAFR-YUMQZZPRSA-N Gly-Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN VSVZIEVNUYDAFR-YUMQZZPRSA-N 0.000 description 2
- QXPRJQPCFXMCIY-NKWVEPMBSA-N Gly-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN QXPRJQPCFXMCIY-NKWVEPMBSA-N 0.000 description 2
- NZAFOTBEULLEQB-WDSKDSINSA-N Gly-Asn-Glu Chemical compound C(CC(=O)O)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)CN NZAFOTBEULLEQB-WDSKDSINSA-N 0.000 description 2
- FMNHBTKMRFVGRO-FOHZUACHSA-N Gly-Asn-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)CN FMNHBTKMRFVGRO-FOHZUACHSA-N 0.000 description 2
- HFXJIZNEXNIZIJ-BQBZGAKWSA-N Gly-Glu-Gln Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O HFXJIZNEXNIZIJ-BQBZGAKWSA-N 0.000 description 2
- STVHDEHTKFXBJQ-LAEOZQHASA-N Gly-Glu-Ile Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O STVHDEHTKFXBJQ-LAEOZQHASA-N 0.000 description 2
- ZQIMMEYPEXIYBB-IUCAKERBSA-N Gly-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)CN ZQIMMEYPEXIYBB-IUCAKERBSA-N 0.000 description 2
- JSNNHGHYGYMVCK-XVKPBYJWSA-N Gly-Glu-Val Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O JSNNHGHYGYMVCK-XVKPBYJWSA-N 0.000 description 2
- CCQOOWAONKGYKQ-BYPYZUCNSA-N Gly-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)CN CCQOOWAONKGYKQ-BYPYZUCNSA-N 0.000 description 2
- XPJBQTCXPJNIFE-ZETCQYMHSA-N Gly-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)CN XPJBQTCXPJNIFE-ZETCQYMHSA-N 0.000 description 2
- SXJHOPPTOJACOA-QXEWZRGKSA-N Gly-Ile-Arg Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CCCN=C(N)N SXJHOPPTOJACOA-QXEWZRGKSA-N 0.000 description 2
- LHYJCVCQPWRMKZ-WEDXCCLWSA-N Gly-Leu-Thr Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LHYJCVCQPWRMKZ-WEDXCCLWSA-N 0.000 description 2
- VBOBNHSVQKKTOT-YUMQZZPRSA-N Gly-Lys-Ala Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O VBOBNHSVQKKTOT-YUMQZZPRSA-N 0.000 description 2
- FXGRXIATVXUAHO-WEDXCCLWSA-N Gly-Lys-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCCN FXGRXIATVXUAHO-WEDXCCLWSA-N 0.000 description 2
- DBUNZBWUWCIELX-JHEQGTHGSA-N Gly-Thr-Glu Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O DBUNZBWUWCIELX-JHEQGTHGSA-N 0.000 description 2
- RCHFYMASWAZQQZ-ZANVPECISA-N Gly-Trp-Ala Chemical compound C1=CC=C2C(C[C@@H](C(=O)N[C@@H](C)C(O)=O)NC(=O)CN)=CNC2=C1 RCHFYMASWAZQQZ-ZANVPECISA-N 0.000 description 2
- DNVDEMWIYLVIQU-RCOVLWMOSA-N Gly-Val-Asp Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O DNVDEMWIYLVIQU-RCOVLWMOSA-N 0.000 description 2
- 239000004471 Glycine Substances 0.000 description 2
- IPIVXQQRZXEUGW-UWJYBYFXSA-N His-Ala-His Chemical compound C([C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC=1NC=NC=1)C(O)=O)C1=CN=CN1 IPIVXQQRZXEUGW-UWJYBYFXSA-N 0.000 description 2
- XINDHUAGVGCNSF-QSFUFRPTSA-N His-Ala-Ile Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O XINDHUAGVGCNSF-QSFUFRPTSA-N 0.000 description 2
- HTZKFIYQMHJWSQ-INTQDDNPSA-N His-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N HTZKFIYQMHJWSQ-INTQDDNPSA-N 0.000 description 2
- TVQGUFGDVODUIF-LSJOCFKGSA-N His-Arg-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC1=CN=CN1)N TVQGUFGDVODUIF-LSJOCFKGSA-N 0.000 description 2
- MVADCDSCFTXCBT-CIUDSAMLSA-N His-Asp-Asp Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O MVADCDSCFTXCBT-CIUDSAMLSA-N 0.000 description 2
- IMCHNUANCIGUKS-SRVKXCTJSA-N His-Glu-Arg Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O IMCHNUANCIGUKS-SRVKXCTJSA-N 0.000 description 2
- XMENRVZYPBKBIL-AVGNSLFASA-N His-Glu-His Chemical compound N[C@@H](Cc1cnc[nH]1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O XMENRVZYPBKBIL-AVGNSLFASA-N 0.000 description 2
- JCOSMKPAOYDKRO-AVGNSLFASA-N His-Glu-Lys Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N JCOSMKPAOYDKRO-AVGNSLFASA-N 0.000 description 2
- MJUUWJJEUOBDGW-IHRRRGAJSA-N His-Leu-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CN=CN1 MJUUWJJEUOBDGW-IHRRRGAJSA-N 0.000 description 2
- SVVULKPWDBIPCO-BZSNNMDCSA-N His-Phe-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O SVVULKPWDBIPCO-BZSNNMDCSA-N 0.000 description 2
- ZFDKSLBEWYCOCS-BZSNNMDCSA-N His-Phe-Lys Chemical compound C([C@@H](C(=O)N[C@@H](CCCCN)C(O)=O)NC(=O)[C@@H](N)CC=1NC=NC=1)C1=CC=CC=C1 ZFDKSLBEWYCOCS-BZSNNMDCSA-N 0.000 description 2
- WCHONUZTYDQMBY-PYJNHQTQSA-N His-Pro-Ile Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WCHONUZTYDQMBY-PYJNHQTQSA-N 0.000 description 2
- PBVQWNDMFFCPIZ-ULQDDVLXSA-N His-Pro-Phe Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CN=CN1 PBVQWNDMFFCPIZ-ULQDDVLXSA-N 0.000 description 2
- YEKYGQZUBCRNGH-DCAQKATOSA-N His-Pro-Ser Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC2=CN=CN2)N)C(=O)N[C@@H](CO)C(=O)O YEKYGQZUBCRNGH-DCAQKATOSA-N 0.000 description 2
- 241000282414 Homo sapiens Species 0.000 description 2
- LQSBBHNVAVNZSX-GHCJXIJMSA-N Ile-Ala-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC(=O)N)C(=O)O)N LQSBBHNVAVNZSX-GHCJXIJMSA-N 0.000 description 2
- NBJAAWYRLGCJOF-UGYAYLCHSA-N Ile-Asp-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N NBJAAWYRLGCJOF-UGYAYLCHSA-N 0.000 description 2
- KUHFPGIVBOCRMV-MNXVOIDGSA-N Ile-Gln-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(C)C)C(=O)O)N KUHFPGIVBOCRMV-MNXVOIDGSA-N 0.000 description 2
- LKACSKJPTFSBHR-MNXVOIDGSA-N Ile-Gln-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N LKACSKJPTFSBHR-MNXVOIDGSA-N 0.000 description 2
- MTFVYKQRLXYAQN-LAEOZQHASA-N Ile-Glu-Gly Chemical compound [H]N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O MTFVYKQRLXYAQN-LAEOZQHASA-N 0.000 description 2
- PNDMHTTXXPUQJH-RWRJDSDZSA-N Ile-Glu-Thr Chemical compound N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H]([C@H](O)C)C(=O)O PNDMHTTXXPUQJH-RWRJDSDZSA-N 0.000 description 2
- NYEYYMLUABXDMC-NHCYSSNCSA-N Ile-Gly-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CC(C)C)C(=O)O)N NYEYYMLUABXDMC-NHCYSSNCSA-N 0.000 description 2
- CMNMPCTVCWWYHY-MXAVVETBSA-N Ile-His-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CC(C)C)C(=O)O)N CMNMPCTVCWWYHY-MXAVVETBSA-N 0.000 description 2
- PKGGWLOLRLOPGK-XUXIUFHCSA-N Ile-Leu-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N PKGGWLOLRLOPGK-XUXIUFHCSA-N 0.000 description 2
- OUUCIIJSBIBCHB-ZPFDUUQYSA-N Ile-Leu-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O OUUCIIJSBIBCHB-ZPFDUUQYSA-N 0.000 description 2
- ADDYYRVQQZFIMW-MNXVOIDGSA-N Ile-Lys-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ADDYYRVQQZFIMW-MNXVOIDGSA-N 0.000 description 2
- YSGBJIQXTIVBHZ-AJNGGQMLSA-N Ile-Lys-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O YSGBJIQXTIVBHZ-AJNGGQMLSA-N 0.000 description 2
- BKPPWVSPSIUXHZ-OSUNSFLBSA-N Ile-Met-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N BKPPWVSPSIUXHZ-OSUNSFLBSA-N 0.000 description 2
- HQEPKOFULQTSFV-JURCDPSOSA-N Ile-Phe-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(=O)O)N HQEPKOFULQTSFV-JURCDPSOSA-N 0.000 description 2
- BJECXJHLUJXPJQ-PYJNHQTQSA-N Ile-Pro-His Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N BJECXJHLUJXPJQ-PYJNHQTQSA-N 0.000 description 2
- ZLFNNVATRMCAKN-ZKWXMUAHSA-N Ile-Ser-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)NCC(=O)O)N ZLFNNVATRMCAKN-ZKWXMUAHSA-N 0.000 description 2
- JSLIXOUMAOUGBN-JUKXBJQTSA-N Ile-Tyr-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N JSLIXOUMAOUGBN-JUKXBJQTSA-N 0.000 description 2
- DLEBSGAVWRPTIX-PEDHHIEDSA-N Ile-Val-Ile Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)[C@@H](C)CC DLEBSGAVWRPTIX-PEDHHIEDSA-N 0.000 description 2
- 108010065920 Insulin Lispro Proteins 0.000 description 2
- 108020004684 Internal Ribosome Entry Sites Proteins 0.000 description 2
- KFZMGEQAYNKOFK-UHFFFAOYSA-N Isopropanol Chemical compound CC(C)O KFZMGEQAYNKOFK-UHFFFAOYSA-N 0.000 description 2
- LHSGPCFBGJHPCY-UHFFFAOYSA-N L-leucine-L-tyrosine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 LHSGPCFBGJHPCY-UHFFFAOYSA-N 0.000 description 2
- LJHGALIOHLRRQN-DCAQKATOSA-N Leu-Ala-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N LJHGALIOHLRRQN-DCAQKATOSA-N 0.000 description 2
- CZCSUZMIRKFFFA-CIUDSAMLSA-N Leu-Ala-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O CZCSUZMIRKFFFA-CIUDSAMLSA-N 0.000 description 2
- MJOZZTKJZQFKDK-GUBZILKMSA-N Leu-Ala-Gln Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(N)=O MJOZZTKJZQFKDK-GUBZILKMSA-N 0.000 description 2
- WNGVUZWBXZKQES-YUMQZZPRSA-N Leu-Ala-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O WNGVUZWBXZKQES-YUMQZZPRSA-N 0.000 description 2
- BQSLGJHIAGOZCD-CIUDSAMLSA-N Leu-Ala-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O BQSLGJHIAGOZCD-CIUDSAMLSA-N 0.000 description 2
- JUWJEAPUNARGCF-DCAQKATOSA-N Leu-Arg-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O JUWJEAPUNARGCF-DCAQKATOSA-N 0.000 description 2
- BPANDPNDMJHFEV-CIUDSAMLSA-N Leu-Asp-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O BPANDPNDMJHFEV-CIUDSAMLSA-N 0.000 description 2
- ILJREDZFPHTUIE-GUBZILKMSA-N Leu-Asp-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ILJREDZFPHTUIE-GUBZILKMSA-N 0.000 description 2
- PVMPDMIKUVNOBD-CIUDSAMLSA-N Leu-Asp-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O PVMPDMIKUVNOBD-CIUDSAMLSA-N 0.000 description 2
- IASQBRJGRVXNJI-YUMQZZPRSA-N Leu-Cys-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)NCC(O)=O IASQBRJGRVXNJI-YUMQZZPRSA-N 0.000 description 2
- FQZPTCNSNPWHLJ-AVGNSLFASA-N Leu-Gln-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(O)=O FQZPTCNSNPWHLJ-AVGNSLFASA-N 0.000 description 2
- CCQLQKZTXZBXTN-NHCYSSNCSA-N Leu-Gly-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O CCQLQKZTXZBXTN-NHCYSSNCSA-N 0.000 description 2
- VGPCJSXPPOQPBK-YUMQZZPRSA-N Leu-Gly-Ser Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O VGPCJSXPPOQPBK-YUMQZZPRSA-N 0.000 description 2
- USLNHQZCDQJBOV-ZPFDUUQYSA-N Leu-Ile-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O USLNHQZCDQJBOV-ZPFDUUQYSA-N 0.000 description 2
- RXGLHDWAZQECBI-SRVKXCTJSA-N Leu-Leu-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O RXGLHDWAZQECBI-SRVKXCTJSA-N 0.000 description 2
- HVHRPWQEQHIQJF-AVGNSLFASA-N Leu-Lys-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O HVHRPWQEQHIQJF-AVGNSLFASA-N 0.000 description 2
- INCJJHQRZGQLFC-KBPBESRZSA-N Leu-Phe-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)NCC(O)=O INCJJHQRZGQLFC-KBPBESRZSA-N 0.000 description 2
- DRWMRVFCKKXHCH-BZSNNMDCSA-N Leu-Phe-Leu Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)CC1=CC=CC=C1 DRWMRVFCKKXHCH-BZSNNMDCSA-N 0.000 description 2
- CHJKEDSZNSONPS-DCAQKATOSA-N Leu-Pro-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O CHJKEDSZNSONPS-DCAQKATOSA-N 0.000 description 2
- IRMLZWSRWSGTOP-CIUDSAMLSA-N Leu-Ser-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O IRMLZWSRWSGTOP-CIUDSAMLSA-N 0.000 description 2
- XOWMDXHFSBCAKQ-SRVKXCTJSA-N Leu-Ser-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC(C)C XOWMDXHFSBCAKQ-SRVKXCTJSA-N 0.000 description 2
- AMSSKPUHBUQBOQ-SRVKXCTJSA-N Leu-Ser-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N AMSSKPUHBUQBOQ-SRVKXCTJSA-N 0.000 description 2
- DAYQSYGBCUKVKT-VOAKCMCISA-N Leu-Thr-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O DAYQSYGBCUKVKT-VOAKCMCISA-N 0.000 description 2
- BTEMNFBEAAOGBR-BZSNNMDCSA-N Leu-Tyr-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCCN)C(=O)O)N BTEMNFBEAAOGBR-BZSNNMDCSA-N 0.000 description 2
- YIRIDPUGZKHMHT-ACRUOGEOSA-N Leu-Tyr-Tyr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O YIRIDPUGZKHMHT-ACRUOGEOSA-N 0.000 description 2
- AIMGJYMCTAABEN-GVXVVHGQSA-N Leu-Val-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O AIMGJYMCTAABEN-GVXVVHGQSA-N 0.000 description 2
- XFIHDSBIPWEYJJ-YUMQZZPRSA-N Lys-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN XFIHDSBIPWEYJJ-YUMQZZPRSA-N 0.000 description 2
- IXHKPDJKKCUKHS-GARJFASQSA-N Lys-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N IXHKPDJKKCUKHS-GARJFASQSA-N 0.000 description 2
- KNKHAVVBVXKOGX-JXUBOQSCSA-N Lys-Ala-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O KNKHAVVBVXKOGX-JXUBOQSCSA-N 0.000 description 2
- DGWXCIORNLWGGG-CIUDSAMLSA-N Lys-Asn-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O DGWXCIORNLWGGG-CIUDSAMLSA-N 0.000 description 2
- LMVOVCYVZBBWQB-SRVKXCTJSA-N Lys-Asp-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN LMVOVCYVZBBWQB-SRVKXCTJSA-N 0.000 description 2
- WTZUSCUIVPVCRH-SRVKXCTJSA-N Lys-Gln-Arg Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N WTZUSCUIVPVCRH-SRVKXCTJSA-N 0.000 description 2
- NNCDAORZCMPZPX-GUBZILKMSA-N Lys-Gln-Ser Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N NNCDAORZCMPZPX-GUBZILKMSA-N 0.000 description 2
- GJJQCBVRWDGLMQ-GUBZILKMSA-N Lys-Glu-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O GJJQCBVRWDGLMQ-GUBZILKMSA-N 0.000 description 2
- KZOHPCYVORJBLG-AVGNSLFASA-N Lys-Glu-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCCN)N KZOHPCYVORJBLG-AVGNSLFASA-N 0.000 description 2
- GQZMPWBZQALKJO-UWVGGRQHSA-N Lys-Gly-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O GQZMPWBZQALKJO-UWVGGRQHSA-N 0.000 description 2
- XNKDCYABMBBEKN-IUCAKERBSA-N Lys-Gly-Gln Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O XNKDCYABMBBEKN-IUCAKERBSA-N 0.000 description 2
- CAVGLNOOIFHJOF-SRVKXCTJSA-N Lys-His-Cys Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCCN)N CAVGLNOOIFHJOF-SRVKXCTJSA-N 0.000 description 2
- SPCHLZUWJTYZFC-IHRRRGAJSA-N Lys-His-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C(C)C)C(O)=O SPCHLZUWJTYZFC-IHRRRGAJSA-N 0.000 description 2
- JYXBNQOKPRQNQS-YTFOTSKYSA-N Lys-Ile-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JYXBNQOKPRQNQS-YTFOTSKYSA-N 0.000 description 2
- ZXFRGTAIIZHNHG-AJNGGQMLSA-N Lys-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CCCCN)N ZXFRGTAIIZHNHG-AJNGGQMLSA-N 0.000 description 2
- ONPDTSFZAIWMDI-AVGNSLFASA-N Lys-Leu-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O ONPDTSFZAIWMDI-AVGNSLFASA-N 0.000 description 2
- XIZQPFCRXLUNMK-BZSNNMDCSA-N Lys-Leu-Phe Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CCCCN)N XIZQPFCRXLUNMK-BZSNNMDCSA-N 0.000 description 2
- XOQMURBBIXRRCR-SRVKXCTJSA-N Lys-Lys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCN XOQMURBBIXRRCR-SRVKXCTJSA-N 0.000 description 2
- ALGGDNMLQNFVIZ-SRVKXCTJSA-N Lys-Lys-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)O)C(=O)O)N ALGGDNMLQNFVIZ-SRVKXCTJSA-N 0.000 description 2
- GAHJXEMYXKLZRQ-AJNGGQMLSA-N Lys-Lys-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O GAHJXEMYXKLZRQ-AJNGGQMLSA-N 0.000 description 2
- KJIXWRWPOCKYLD-IHRRRGAJSA-N Lys-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N KJIXWRWPOCKYLD-IHRRRGAJSA-N 0.000 description 2
- QBHGXFQJFPWJIH-XUXIUFHCSA-N Lys-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCCCN QBHGXFQJFPWJIH-XUXIUFHCSA-N 0.000 description 2
- GHKXHCMRAUYLBS-CIUDSAMLSA-N Lys-Ser-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O GHKXHCMRAUYLBS-CIUDSAMLSA-N 0.000 description 2
- LMMBAXJRYSXCOQ-ACRUOGEOSA-N Lys-Tyr-Phe Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](Cc1ccccc1)C(O)=O LMMBAXJRYSXCOQ-ACRUOGEOSA-N 0.000 description 2
- RPWQJSBMXJSCPD-XUXIUFHCSA-N Lys-Val-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CCCCN)C(C)C)C(O)=O RPWQJSBMXJSCPD-XUXIUFHCSA-N 0.000 description 2
- OZVXDDFYCQOPFD-XQQFMLRXSA-N Lys-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N OZVXDDFYCQOPFD-XQQFMLRXSA-N 0.000 description 2
- YRAWWKUTNBILNT-FXQIFTODSA-N Met-Ala-Ala Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O YRAWWKUTNBILNT-FXQIFTODSA-N 0.000 description 2
- QAHFGYLFLVGBNW-DCAQKATOSA-N Met-Ala-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN QAHFGYLFLVGBNW-DCAQKATOSA-N 0.000 description 2
- WGBMNLCRYKSWAR-DCAQKATOSA-N Met-Asp-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN WGBMNLCRYKSWAR-DCAQKATOSA-N 0.000 description 2
- FZUNSVYYPYJYAP-NAKRPEOUSA-N Met-Ile-Ala Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O FZUNSVYYPYJYAP-NAKRPEOUSA-N 0.000 description 2
- KSIPKXNIQOWMIC-RCWTZXSCSA-N Met-Thr-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCCNC(N)=N KSIPKXNIQOWMIC-RCWTZXSCSA-N 0.000 description 2
- WUGMRIBZSVSJNP-UHFFFAOYSA-N N-L-alanyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C)C(O)=O)=CNC2=C1 WUGMRIBZSVSJNP-UHFFFAOYSA-N 0.000 description 2
- PESQCPHRXOFIPX-UHFFFAOYSA-N N-L-methionyl-L-tyrosine Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 PESQCPHRXOFIPX-UHFFFAOYSA-N 0.000 description 2
- KZNQNBZMBZJQJO-UHFFFAOYSA-N N-glycyl-L-proline Natural products NCC(=O)N1CCCC1C(O)=O KZNQNBZMBZJQJO-UHFFFAOYSA-N 0.000 description 2
- 241001503673 Nocardia farcinica Species 0.000 description 2
- 108091028043 Nucleic acid sequence Proteins 0.000 description 2
- 108091034117 Oligonucleotide Proteins 0.000 description 2
- 108010033276 Peptide Fragments Proteins 0.000 description 2
- CGOMLCQJEMWMCE-STQMWFEESA-N Phe-Arg-Gly Chemical compound NC(N)=NCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 CGOMLCQJEMWMCE-STQMWFEESA-N 0.000 description 2
- GDBOREPXIRKSEQ-FHWLQOOXSA-N Phe-Gln-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O GDBOREPXIRKSEQ-FHWLQOOXSA-N 0.000 description 2
- KJJROSNFBRWPHS-JYJNAYRXSA-N Phe-Glu-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O KJJROSNFBRWPHS-JYJNAYRXSA-N 0.000 description 2
- CWFGECHCRMGPPT-MXAVVETBSA-N Phe-Ile-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O CWFGECHCRMGPPT-MXAVVETBSA-N 0.000 description 2
- BSHMIVKDJQGLNT-ACRUOGEOSA-N Phe-Lys-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 BSHMIVKDJQGLNT-ACRUOGEOSA-N 0.000 description 2
- TXJJXEXCZBHDNA-ACRUOGEOSA-N Phe-Phe-His Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)N[C@@H](CC3=CN=CN3)C(=O)O)N TXJJXEXCZBHDNA-ACRUOGEOSA-N 0.000 description 2
- IPFXYNKCXYGSSV-KKUMJFAQSA-N Phe-Ser-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N IPFXYNKCXYGSSV-KKUMJFAQSA-N 0.000 description 2
- GNRMAQSIROFNMI-IXOXFDKPSA-N Phe-Thr-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O GNRMAQSIROFNMI-IXOXFDKPSA-N 0.000 description 2
- DBNGDEAQXGFGRA-ACRUOGEOSA-N Phe-Tyr-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)N[C@@H](CCCCN)C(=O)O)N DBNGDEAQXGFGRA-ACRUOGEOSA-N 0.000 description 2
- 240000003888 Phoenix reclinata Species 0.000 description 2
- 102000011755 Phosphoglycerate Kinase Human genes 0.000 description 2
- NBIIXXVUZAFLBC-UHFFFAOYSA-N Phosphoric acid Chemical compound OP(O)(O)=O NBIIXXVUZAFLBC-UHFFFAOYSA-N 0.000 description 2
- 244000062594 Pithecellobium globosum Species 0.000 description 2
- 241000288906 Primates Species 0.000 description 2
- VOHFZDSRPZLXLH-IHRRRGAJSA-N Pro-Asn-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O VOHFZDSRPZLXLH-IHRRRGAJSA-N 0.000 description 2
- KIPIKSXPPLABPN-CIUDSAMLSA-N Pro-Glu-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1 KIPIKSXPPLABPN-CIUDSAMLSA-N 0.000 description 2
- CLNJSLSHKJECME-BQBZGAKWSA-N Pro-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H]1CCCN1 CLNJSLSHKJECME-BQBZGAKWSA-N 0.000 description 2
- BFXZQMWKTYWGCF-PYJNHQTQSA-N Pro-His-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BFXZQMWKTYWGCF-PYJNHQTQSA-N 0.000 description 2
- VZKBJNBZMZHKRC-XUXIUFHCSA-N Pro-Ile-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O VZKBJNBZMZHKRC-XUXIUFHCSA-N 0.000 description 2
- CDGABSWLRMECHC-IHRRRGAJSA-N Pro-Lys-His Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O CDGABSWLRMECHC-IHRRRGAJSA-N 0.000 description 2
- FNGOXVQBBCMFKV-CIUDSAMLSA-N Pro-Ser-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O FNGOXVQBBCMFKV-CIUDSAMLSA-N 0.000 description 2
- BNUKRHFCHHLIGR-JYJNAYRXSA-N Pro-Trp-Asp Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)N[C@@H](CC(=O)O)C(=O)O BNUKRHFCHHLIGR-JYJNAYRXSA-N 0.000 description 2
- QDDJNKWPTJHROJ-UFYCRDLUSA-N Pro-Tyr-Tyr Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H]1NCCC1)C1=CC=C(O)C=C1 QDDJNKWPTJHROJ-UFYCRDLUSA-N 0.000 description 2
- 241000190950 Rhodopseudomonas palustris Species 0.000 description 2
- JPIDMRXXNMIVKY-VZFHVOOUSA-N Ser-Ala-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JPIDMRXXNMIVKY-VZFHVOOUSA-N 0.000 description 2
- QFBNNYNWKYKVJO-DCAQKATOSA-N Ser-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CCCN=C(N)N QFBNNYNWKYKVJO-DCAQKATOSA-N 0.000 description 2
- ZXLUWXWISXIFIX-ACZMJKKPSA-N Ser-Asn-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZXLUWXWISXIFIX-ACZMJKKPSA-N 0.000 description 2
- KAAPNMOKUUPKOE-SRVKXCTJSA-N Ser-Asn-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KAAPNMOKUUPKOE-SRVKXCTJSA-N 0.000 description 2
- BNFVPSRLHHPQKS-WHFBIAKZSA-N Ser-Asp-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O BNFVPSRLHHPQKS-WHFBIAKZSA-N 0.000 description 2
- BYIROAKULFFTEK-CIUDSAMLSA-N Ser-Asp-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CO BYIROAKULFFTEK-CIUDSAMLSA-N 0.000 description 2
- OLIJLNWFEQEFDM-SRVKXCTJSA-N Ser-Asp-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 OLIJLNWFEQEFDM-SRVKXCTJSA-N 0.000 description 2
- PVDTYLHUWAEYGY-CIUDSAMLSA-N Ser-Glu-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O PVDTYLHUWAEYGY-CIUDSAMLSA-N 0.000 description 2
- LALNXSXEYFUUDD-GUBZILKMSA-N Ser-Glu-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LALNXSXEYFUUDD-GUBZILKMSA-N 0.000 description 2
- AEGUWTFAQQWVLC-BQBZGAKWSA-N Ser-Gly-Arg Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O AEGUWTFAQQWVLC-BQBZGAKWSA-N 0.000 description 2
- SNVIOQXAHVORQM-WDSKDSINSA-N Ser-Gly-Gln Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(O)=O SNVIOQXAHVORQM-WDSKDSINSA-N 0.000 description 2
- JIPVNVNKXJLFJF-BJDJZHNGSA-N Ser-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CO)N JIPVNVNKXJLFJF-BJDJZHNGSA-N 0.000 description 2
- ZOPISOXXPQNOCO-SVSWQMSJSA-N Ser-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](CO)N ZOPISOXXPQNOCO-SVSWQMSJSA-N 0.000 description 2
- MUJQWSAWLLRJCE-KATARQTJSA-N Ser-Leu-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MUJQWSAWLLRJCE-KATARQTJSA-N 0.000 description 2
- QJKPECIAWNNKIT-KKUMJFAQSA-N Ser-Lys-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O QJKPECIAWNNKIT-KKUMJFAQSA-N 0.000 description 2
- ILZAUMFXKSIUEF-SRVKXCTJSA-N Ser-Ser-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ILZAUMFXKSIUEF-SRVKXCTJSA-N 0.000 description 2
- ZWSZBWAFDZRBNM-UBHSHLNASA-N Ser-Trp-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CO)C(O)=O ZWSZBWAFDZRBNM-UBHSHLNASA-N 0.000 description 2
- BEBVVQPDSHHWQL-NRPADANISA-N Ser-Val-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O BEBVVQPDSHHWQL-NRPADANISA-N 0.000 description 2
- YEDSOSIKVUMIJE-DCAQKATOSA-N Ser-Val-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O YEDSOSIKVUMIJE-DCAQKATOSA-N 0.000 description 2
- LGIMRDKGABDMBN-DCAQKATOSA-N Ser-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CO)N LGIMRDKGABDMBN-DCAQKATOSA-N 0.000 description 2
- 238000010459 TALEN Methods 0.000 description 2
- 101001099217 Thermotoga maritima (strain ATCC 43589 / DSM 3109 / JCM 10099 / NBRC 100826 / MSB8) Triosephosphate isomerase Proteins 0.000 description 2
- CEXFELBFVHLYDZ-XGEHTFHBSA-N Thr-Arg-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O CEXFELBFVHLYDZ-XGEHTFHBSA-N 0.000 description 2
- SKHPKKYKDYULDH-HJGDQZAQSA-N Thr-Asn-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O SKHPKKYKDYULDH-HJGDQZAQSA-N 0.000 description 2
- OJRNZRROAIAHDL-LKXGYXEUSA-N Thr-Asn-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O OJRNZRROAIAHDL-LKXGYXEUSA-N 0.000 description 2
- OYTNZCBFDXGQGE-XQXXSGGOSA-N Thr-Gln-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](C)C(=O)O)N)O OYTNZCBFDXGQGE-XQXXSGGOSA-N 0.000 description 2
- QILPDQCTQZDHFM-HJGDQZAQSA-N Thr-Gln-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QILPDQCTQZDHFM-HJGDQZAQSA-N 0.000 description 2
- RKDFEMGVMMYYNG-WDCWCFNPSA-N Thr-Gln-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O RKDFEMGVMMYYNG-WDCWCFNPSA-N 0.000 description 2
- YSXYEJWDHBCTDJ-DVJZZOLTSA-N Thr-Gly-Trp Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N)O YSXYEJWDHBCTDJ-DVJZZOLTSA-N 0.000 description 2
- AMXMBCAXAZUCFA-RHYQMDGZSA-N Thr-Leu-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O AMXMBCAXAZUCFA-RHYQMDGZSA-N 0.000 description 2
- MEJHFIOYJHTWMK-VOAKCMCISA-N Thr-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)[C@@H](C)O MEJHFIOYJHTWMK-VOAKCMCISA-N 0.000 description 2
- MECLEFZMPPOEAC-VOAKCMCISA-N Thr-Leu-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N)O MECLEFZMPPOEAC-VOAKCMCISA-N 0.000 description 2
- VRUFCJZQDACGLH-UVOCVTCTSA-N Thr-Leu-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VRUFCJZQDACGLH-UVOCVTCTSA-N 0.000 description 2
- KKPOGALELPLJTL-MEYUZBJRSA-N Thr-Lys-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 KKPOGALELPLJTL-MEYUZBJRSA-N 0.000 description 2
- WRQLCVIALDUQEQ-UNQGMJICSA-N Thr-Phe-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O WRQLCVIALDUQEQ-UNQGMJICSA-N 0.000 description 2
- WYLAVUAWOUVUCA-XVSYOHENSA-N Thr-Phe-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O WYLAVUAWOUVUCA-XVSYOHENSA-N 0.000 description 2
- IWAVRIPRTCJAQO-HSHDSVGOSA-N Thr-Pro-Trp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O IWAVRIPRTCJAQO-HSHDSVGOSA-N 0.000 description 2
- MFMGPEKYBXFIRF-SUSMZKCASA-N Thr-Thr-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MFMGPEKYBXFIRF-SUSMZKCASA-N 0.000 description 2
- ABCLYRRGTZNIFU-BWAGICSOSA-N Thr-Tyr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N)O ABCLYRRGTZNIFU-BWAGICSOSA-N 0.000 description 2
- JAWUQFCGNVEDRN-MEYUZBJRSA-N Thr-Tyr-Leu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N)O JAWUQFCGNVEDRN-MEYUZBJRSA-N 0.000 description 2
- 108010043645 Transcription Activator-Like Effector Nucleases Proteins 0.000 description 2
- UKWSFUSPGPBJGU-VFAJRCTISA-N Trp-Leu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N)O UKWSFUSPGPBJGU-VFAJRCTISA-N 0.000 description 2
- DLZKEQQWXODGGZ-KWQFWETISA-N Tyr-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 DLZKEQQWXODGGZ-KWQFWETISA-N 0.000 description 2
- KOVXHANYYYMBRF-IRIUXVKKSA-N Tyr-Glu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N)O KOVXHANYYYMBRF-IRIUXVKKSA-N 0.000 description 2
- KIJLSRYAUGGZIN-CFMVVWHZSA-N Tyr-Ile-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O KIJLSRYAUGGZIN-CFMVVWHZSA-N 0.000 description 2
- XGZBEGGGAUQBMB-KJEVXHAQSA-N Tyr-Pro-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC2=CC=C(C=C2)O)N)O XGZBEGGGAUQBMB-KJEVXHAQSA-N 0.000 description 2
- SQUMHUZLJDUROQ-YDHLFZDLSA-N Tyr-Val-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O SQUMHUZLJDUROQ-YDHLFZDLSA-N 0.000 description 2
- ABSXSJZNRAQDDI-KJEVXHAQSA-N Tyr-Val-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ABSXSJZNRAQDDI-KJEVXHAQSA-N 0.000 description 2
- ASQFIHTXXMFENG-XPUUQOCRSA-N Val-Ala-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O ASQFIHTXXMFENG-XPUUQOCRSA-N 0.000 description 2
- PAPWZOJOLKZEFR-AVGNSLFASA-N Val-Arg-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(=O)O)N PAPWZOJOLKZEFR-AVGNSLFASA-N 0.000 description 2
- LIQJSDDOULTANC-QSFUFRPTSA-N Val-Asn-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](C(C)C)N LIQJSDDOULTANC-QSFUFRPTSA-N 0.000 description 2
- PVPAOIGJYHVWBT-KKHAAJSZSA-N Val-Asn-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](C(C)C)N)O PVPAOIGJYHVWBT-KKHAAJSZSA-N 0.000 description 2
- HZYOWMGWKKRMBZ-BYULHYEWSA-N Val-Asp-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N HZYOWMGWKKRMBZ-BYULHYEWSA-N 0.000 description 2
- CVIXTAITYJQMPE-LAEOZQHASA-N Val-Glu-Asn Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O CVIXTAITYJQMPE-LAEOZQHASA-N 0.000 description 2
- GBESYURLQOYWLU-LAEOZQHASA-N Val-Glu-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N GBESYURLQOYWLU-LAEOZQHASA-N 0.000 description 2
- XWYUBUYQMOUFRQ-IFFSRLJSSA-N Val-Glu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](C(C)C)N)O XWYUBUYQMOUFRQ-IFFSRLJSSA-N 0.000 description 2
- WFENBJPLZMPVAX-XVKPBYJWSA-N Val-Gly-Glu Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O WFENBJPLZMPVAX-XVKPBYJWSA-N 0.000 description 2
- KZKMBGXCNLPYKD-YEPSODPASA-N Val-Gly-Thr Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O KZKMBGXCNLPYKD-YEPSODPASA-N 0.000 description 2
- WBAJDGWKRIHOAC-GVXVVHGQSA-N Val-Lys-Gln Chemical compound [H]N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O WBAJDGWKRIHOAC-GVXVVHGQSA-N 0.000 description 2
- ZRSZTKTVPNSUNA-IHRRRGAJSA-N Val-Lys-Leu Chemical compound CC(C)C[C@H](NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)C(C)C)C(O)=O ZRSZTKTVPNSUNA-IHRRRGAJSA-N 0.000 description 2
- UEPLNXPLHJUYPT-AVGNSLFASA-N Val-Met-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(O)=O UEPLNXPLHJUYPT-AVGNSLFASA-N 0.000 description 2
- MJOUSKQHAIARKI-JYJNAYRXSA-N Val-Phe-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=CC=C1 MJOUSKQHAIARKI-JYJNAYRXSA-N 0.000 description 2
- GQMNEJMFMCJJTD-NHCYSSNCSA-N Val-Pro-Gln Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O GQMNEJMFMCJJTD-NHCYSSNCSA-N 0.000 description 2
- NLNCNKIVJPEFBC-DLOVCJGASA-N Val-Val-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCC(O)=O NLNCNKIVJPEFBC-DLOVCJGASA-N 0.000 description 2
- 241000700605 Viruses Species 0.000 description 2
- 108010017070 Zinc Finger Nucleases Proteins 0.000 description 2
- 229960000643 adenine Drugs 0.000 description 2
- 239000000443 aerosol Substances 0.000 description 2
- 230000027455 binding Effects 0.000 description 2
- 230000033228 biological regulation Effects 0.000 description 2
- 210000004556 brain Anatomy 0.000 description 2
- 239000000872 buffer Substances 0.000 description 2
- 201000011510 cancer Diseases 0.000 description 2
- 239000002775 capsule Substances 0.000 description 2
- 150000001720 carbohydrates Chemical class 0.000 description 2
- 230000008859 change Effects 0.000 description 2
- 210000000349 chromosome Anatomy 0.000 description 2
- 208000029742 colonic neoplasm Diseases 0.000 description 2
- 230000000295 complement effect Effects 0.000 description 2
- 239000002299 complementary DNA Substances 0.000 description 2
- 238000001816 cooling Methods 0.000 description 2
- 238000012258 culturing Methods 0.000 description 2
- 210000001671 embryonic stem cell Anatomy 0.000 description 2
- 238000001976 enzyme digestion Methods 0.000 description 2
- 238000001415 gene therapy Methods 0.000 description 2
- 239000006481 glucose medium Substances 0.000 description 2
- 108010078144 glutaminyl-glycine Proteins 0.000 description 2
- 108010081551 glycylphenylalanine Proteins 0.000 description 2
- 108010037850 glycylvaline Proteins 0.000 description 2
- 239000008187 granular material Substances 0.000 description 2
- 108010028295 histidylhistidine Proteins 0.000 description 2
- 108010092114 histidylphenylalanine Proteins 0.000 description 2
- 238000001727 in vivo Methods 0.000 description 2
- 210000004263 induced pluripotent stem cell Anatomy 0.000 description 2
- 229930027917 kanamycin Natural products 0.000 description 2
- SBUJHOSQTJFQJX-NOAMYHISSA-N kanamycin Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 description 2
- 229960000318 kanamycin Drugs 0.000 description 2
- 229930182823 kanamycin A Natural products 0.000 description 2
- 108010053037 kyotorphin Proteins 0.000 description 2
- 108010073472 leucyl-prolyl-proline Proteins 0.000 description 2
- 108010012058 leucyltyrosine Proteins 0.000 description 2
- 235000010355 mannitol Nutrition 0.000 description 2
- 239000002777 nucleoside Substances 0.000 description 2
- 150000003833 nucleoside derivatives Chemical class 0.000 description 2
- 239000000816 peptidomimetic Substances 0.000 description 2
- 108010084572 phenylalanyl-valine Proteins 0.000 description 2
- 239000006187 pill Substances 0.000 description 2
- 239000013600 plasmid vector Substances 0.000 description 2
- 229920000642 polymer Polymers 0.000 description 2
- 229920000136 polysorbate Polymers 0.000 description 2
- 108010015796 prolylisoleucine Proteins 0.000 description 2
- 230000000069 prophylactic effect Effects 0.000 description 2
- 238000012163 sequencing technique Methods 0.000 description 2
- 108010048818 seryl-histidine Proteins 0.000 description 2
- 229960002920 sorbitol Drugs 0.000 description 2
- 239000006188 syrup Substances 0.000 description 2
- 235000020357 syrup Nutrition 0.000 description 2
- 239000003826 tablet Substances 0.000 description 2
- 238000013518 transcription Methods 0.000 description 2
- 230000035897 transcription Effects 0.000 description 2
- 230000009385 viral infection Effects 0.000 description 2
- 108010029988 AICDA (activation-induced cytidine deaminase) Proteins 0.000 description 1
- 102000012758 APOBEC-1 Deaminase Human genes 0.000 description 1
- 108010004483 APOBEC-3G Deaminase Proteins 0.000 description 1
- 101150012656 APOBEC1 gene Proteins 0.000 description 1
- 208000035657 Abasia Diseases 0.000 description 1
- 241000589291 Acinetobacter Species 0.000 description 1
- 244000205574 Acorus calamus Species 0.000 description 1
- 102000007469 Actins Human genes 0.000 description 1
- 108010085238 Actins Proteins 0.000 description 1
- 208000031261 Acute myeloid leukaemia Diseases 0.000 description 1
- 208000010507 Adenocarcinoma of Lung Diseases 0.000 description 1
- 206010001197 Adenocarcinoma of the cervix Diseases 0.000 description 1
- 208000034246 Adenocarcinoma of the cervix uteri Diseases 0.000 description 1
- HHGYNJRJIINWAK-FXQIFTODSA-N Ala-Ala-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N HHGYNJRJIINWAK-FXQIFTODSA-N 0.000 description 1
- JBGSZRYCXBPWGX-BQBZGAKWSA-N Ala-Arg-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](N)C)CCCN=C(N)N JBGSZRYCXBPWGX-BQBZGAKWSA-N 0.000 description 1
- LZRNYBIJOSKKRJ-XVYDVKMFSA-N Ala-Asp-His Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N LZRNYBIJOSKKRJ-XVYDVKMFSA-N 0.000 description 1
- GWFSQQNGMPGBEF-GHCJXIJMSA-N Ala-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C)N GWFSQQNGMPGBEF-GHCJXIJMSA-N 0.000 description 1
- BTBUEVAGZCKULD-XPUUQOCRSA-N Ala-Gly-His Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CN=CN1 BTBUEVAGZCKULD-XPUUQOCRSA-N 0.000 description 1
- LMFXXZPPZDCPTA-ZKWXMUAHSA-N Ala-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N LMFXXZPPZDCPTA-ZKWXMUAHSA-N 0.000 description 1
- JDIQCVUDDFENPU-ZKWXMUAHSA-N Ala-His-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CNC=N1 JDIQCVUDDFENPU-ZKWXMUAHSA-N 0.000 description 1
- LXAARTARZJJCMB-CIQUZCHMSA-N Ala-Ile-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LXAARTARZJJCMB-CIQUZCHMSA-N 0.000 description 1
- CCDFBRZVTDDJNM-GUBZILKMSA-N Ala-Leu-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O CCDFBRZVTDDJNM-GUBZILKMSA-N 0.000 description 1
- SOBIAADAMRHGKH-CIUDSAMLSA-N Ala-Leu-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O SOBIAADAMRHGKH-CIUDSAMLSA-N 0.000 description 1
- IPZQNYYAYVRKKK-FXQIFTODSA-N Ala-Pro-Ala Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O IPZQNYYAYVRKKK-FXQIFTODSA-N 0.000 description 1
- YHBDGLZYNIARKJ-GUBZILKMSA-N Ala-Pro-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](C)N YHBDGLZYNIARKJ-GUBZILKMSA-N 0.000 description 1
- DCVYRWFAMZFSDA-ZLUOBGJFSA-N Ala-Ser-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O DCVYRWFAMZFSDA-ZLUOBGJFSA-N 0.000 description 1
- NZGRHTKZFSVPAN-BIIVOSGPSA-N Ala-Ser-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N NZGRHTKZFSVPAN-BIIVOSGPSA-N 0.000 description 1
- HCBKAOZYACJUEF-XQXXSGGOSA-N Ala-Thr-Gln Chemical compound N[C@@H](C)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CCC(N)=O)C(=O)O HCBKAOZYACJUEF-XQXXSGGOSA-N 0.000 description 1
- IYKVSFNGSWTTNZ-GUBZILKMSA-N Ala-Val-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IYKVSFNGSWTTNZ-GUBZILKMSA-N 0.000 description 1
- OAIGZYFGCNNVIE-ZPFDUUQYSA-N Ala-Val-Asp-Pro Chemical compound C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(O)=O OAIGZYFGCNNVIE-ZPFDUUQYSA-N 0.000 description 1
- ZDILXFDENZVOTL-BPNCWPANSA-N Ala-Val-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ZDILXFDENZVOTL-BPNCWPANSA-N 0.000 description 1
- 241000219195 Arabidopsis thaliana Species 0.000 description 1
- GIVATXIGCXFQQA-FXQIFTODSA-N Arg-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCN=C(N)N GIVATXIGCXFQQA-FXQIFTODSA-N 0.000 description 1
- IASNWHAGGYTEKX-IUCAKERBSA-N Arg-Arg-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)NCC(O)=O IASNWHAGGYTEKX-IUCAKERBSA-N 0.000 description 1
- HJVGMOYJDDXLMI-AVGNSLFASA-N Arg-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@@H](N)CCCNC(N)=N HJVGMOYJDDXLMI-AVGNSLFASA-N 0.000 description 1
- JUWQNWXEGDYCIE-YUMQZZPRSA-N Arg-Gln-Gly Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O JUWQNWXEGDYCIE-YUMQZZPRSA-N 0.000 description 1
- HAVKMRGWNXMCDR-STQMWFEESA-N Arg-Gly-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O HAVKMRGWNXMCDR-STQMWFEESA-N 0.000 description 1
- SLNCSSWAIDUUGF-LSJOCFKGSA-N Arg-His-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(O)=O SLNCSSWAIDUUGF-LSJOCFKGSA-N 0.000 description 1
- YKBHOXLMMPZPHQ-GMOBBJLQSA-N Arg-Ile-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O YKBHOXLMMPZPHQ-GMOBBJLQSA-N 0.000 description 1
- OFIYLHVAAJYRBC-HJWJTTGWSA-N Arg-Ile-Phe Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](N)CCCNC(N)=N)C(=O)N[C@@H](Cc1ccccc1)C(O)=O OFIYLHVAAJYRBC-HJWJTTGWSA-N 0.000 description 1
- NMRHDSAOIURTNT-RWMBFGLXSA-N Arg-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N NMRHDSAOIURTNT-RWMBFGLXSA-N 0.000 description 1
- PZBSKYJGKNNYNK-ULQDDVLXSA-N Arg-Leu-Tyr Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)CCCN=C(N)N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(O)=O PZBSKYJGKNNYNK-ULQDDVLXSA-N 0.000 description 1
- AUZAXCPWMDBWEE-HJGDQZAQSA-N Arg-Thr-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O AUZAXCPWMDBWEE-HJGDQZAQSA-N 0.000 description 1
- YNSUUAOAFCVINY-OSUNSFLBSA-N Arg-Thr-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O YNSUUAOAFCVINY-OSUNSFLBSA-N 0.000 description 1
- FMYQECOAIFGQGU-CYDGBPFRSA-N Arg-Val-Ile Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FMYQECOAIFGQGU-CYDGBPFRSA-N 0.000 description 1
- 239000004475 Arginine Substances 0.000 description 1
- MEFGKQUUYZOLHM-GMOBBJLQSA-N Asn-Arg-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MEFGKQUUYZOLHM-GMOBBJLQSA-N 0.000 description 1
- PCKRJVZAQZWNKM-WHFBIAKZSA-N Asn-Asn-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O PCKRJVZAQZWNKM-WHFBIAKZSA-N 0.000 description 1
- FAEFJTCTNZTPHX-ACZMJKKPSA-N Asn-Gln-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O FAEFJTCTNZTPHX-ACZMJKKPSA-N 0.000 description 1
- VJTWLBMESLDOMK-WDSKDSINSA-N Asn-Gln-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O VJTWLBMESLDOMK-WDSKDSINSA-N 0.000 description 1
- JREOBWLIZLXRIS-GUBZILKMSA-N Asn-Glu-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JREOBWLIZLXRIS-GUBZILKMSA-N 0.000 description 1
- PBSQFBAJKPLRJY-BYULHYEWSA-N Asn-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N PBSQFBAJKPLRJY-BYULHYEWSA-N 0.000 description 1
- WQLJRNRLHWJIRW-KKUMJFAQSA-N Asn-His-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CC(=O)N)N)O WQLJRNRLHWJIRW-KKUMJFAQSA-N 0.000 description 1
- LVHMEJJWEXBMKK-GMOBBJLQSA-N Asn-Ile-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC(=O)N)N LVHMEJJWEXBMKK-GMOBBJLQSA-N 0.000 description 1
- SPCONPVIDFMDJI-QSFUFRPTSA-N Asn-Ile-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O SPCONPVIDFMDJI-QSFUFRPTSA-N 0.000 description 1
- WIDVAWAQBRAKTI-YUMQZZPRSA-N Asn-Leu-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O WIDVAWAQBRAKTI-YUMQZZPRSA-N 0.000 description 1
- BZWRLDPIWKOVKB-ZPFDUUQYSA-N Asn-Leu-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BZWRLDPIWKOVKB-ZPFDUUQYSA-N 0.000 description 1
- RZNAMKZJPBQWDJ-SRVKXCTJSA-N Asn-Lys-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)N)N RZNAMKZJPBQWDJ-SRVKXCTJSA-N 0.000 description 1
- RTFWCVDISAMGEQ-SRVKXCTJSA-N Asn-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N RTFWCVDISAMGEQ-SRVKXCTJSA-N 0.000 description 1
- HZZIFFOVHLWGCS-KKUMJFAQSA-N Asn-Phe-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O HZZIFFOVHLWGCS-KKUMJFAQSA-N 0.000 description 1
- HPBNLFLSSQDFQW-WHFBIAKZSA-N Asn-Ser-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O HPBNLFLSSQDFQW-WHFBIAKZSA-N 0.000 description 1
- NCXTYSVDWLAQGZ-ZKWXMUAHSA-N Asn-Ser-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(N)=O NCXTYSVDWLAQGZ-ZKWXMUAHSA-N 0.000 description 1
- YHXNKGKUDJCAHB-PBCZWWQYSA-N Asn-Thr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O YHXNKGKUDJCAHB-PBCZWWQYSA-N 0.000 description 1
- WUQXMTITJLFXAU-JIOCBJNQSA-N Asn-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N)O WUQXMTITJLFXAU-JIOCBJNQSA-N 0.000 description 1
- UXHYOWXTJLBEPG-GSSVUCPTSA-N Asn-Thr-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O UXHYOWXTJLBEPG-GSSVUCPTSA-N 0.000 description 1
- RTFXPCYMDYBZNQ-SRVKXCTJSA-N Asn-Tyr-Asn Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O RTFXPCYMDYBZNQ-SRVKXCTJSA-N 0.000 description 1
- QNNBHTFDFFFHGC-KKUMJFAQSA-N Asn-Tyr-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O QNNBHTFDFFFHGC-KKUMJFAQSA-N 0.000 description 1
- XOQYDFCQPWAMSA-KKHAAJSZSA-N Asn-Val-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XOQYDFCQPWAMSA-KKHAAJSZSA-N 0.000 description 1
- GWTLRDMPMJCNMH-WHFBIAKZSA-N Asp-Asn-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O GWTLRDMPMJCNMH-WHFBIAKZSA-N 0.000 description 1
- FANQWNCPNFEPGZ-WHFBIAKZSA-N Asp-Asp-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O FANQWNCPNFEPGZ-WHFBIAKZSA-N 0.000 description 1
- YBMUFUWSMIKJQA-GUBZILKMSA-N Asp-Gln-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)O)N YBMUFUWSMIKJQA-GUBZILKMSA-N 0.000 description 1
- QCLHLXDWRKOHRR-GUBZILKMSA-N Asp-Glu-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)O)N QCLHLXDWRKOHRR-GUBZILKMSA-N 0.000 description 1
- OVPHVTCDVYYTHN-AVGNSLFASA-N Asp-Glu-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 OVPHVTCDVYYTHN-AVGNSLFASA-N 0.000 description 1
- WBDWQKRLTVCDSY-WHFBIAKZSA-N Asp-Gly-Asp Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O WBDWQKRLTVCDSY-WHFBIAKZSA-N 0.000 description 1
- QCVXMEHGFUMKCO-YUMQZZPRSA-N Asp-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(O)=O QCVXMEHGFUMKCO-YUMQZZPRSA-N 0.000 description 1
- KHGPWGKPYHPOIK-QWRGUYRKSA-N Asp-Gly-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O KHGPWGKPYHPOIK-QWRGUYRKSA-N 0.000 description 1
- SNDBKTFJWVEVPO-WHFBIAKZSA-N Asp-Gly-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](CO)C(O)=O SNDBKTFJWVEVPO-WHFBIAKZSA-N 0.000 description 1
- QHHVSXGWLYEAGX-GUBZILKMSA-N Asp-His-Gln Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)O)N QHHVSXGWLYEAGX-GUBZILKMSA-N 0.000 description 1
- LNENWJXDHCFVOF-DCAQKATOSA-N Asp-His-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC(=O)O)N LNENWJXDHCFVOF-DCAQKATOSA-N 0.000 description 1
- RQHLMGCXCZUOGT-ZPFDUUQYSA-N Asp-Leu-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O RQHLMGCXCZUOGT-ZPFDUUQYSA-N 0.000 description 1
- UMHUHHJMEXNSIV-CIUDSAMLSA-N Asp-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O UMHUHHJMEXNSIV-CIUDSAMLSA-N 0.000 description 1
- NZWDWXSWUQCNMG-GARJFASQSA-N Asp-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)O)N)C(=O)O NZWDWXSWUQCNMG-GARJFASQSA-N 0.000 description 1
- KPSHWSWFPUDEGF-FXQIFTODSA-N Asp-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC(O)=O KPSHWSWFPUDEGF-FXQIFTODSA-N 0.000 description 1
- QTIZKMMLNUMHHU-DCAQKATOSA-N Asp-Pro-His Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC(=O)O)N)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O QTIZKMMLNUMHHU-DCAQKATOSA-N 0.000 description 1
- YFGUZQQCSDZRBN-DCAQKATOSA-N Asp-Pro-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O YFGUZQQCSDZRBN-DCAQKATOSA-N 0.000 description 1
- QSFHZPQUAAQHAQ-CIUDSAMLSA-N Asp-Ser-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O QSFHZPQUAAQHAQ-CIUDSAMLSA-N 0.000 description 1
- GXHDGYOXPNQCKM-XVSYOHENSA-N Asp-Thr-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O GXHDGYOXPNQCKM-XVSYOHENSA-N 0.000 description 1
- RSMZEHCMIOKNMW-GSSVUCPTSA-N Asp-Thr-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O RSMZEHCMIOKNMW-GSSVUCPTSA-N 0.000 description 1
- DKQCWCQRAMAFLN-UBHSHLNASA-N Asp-Trp-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(O)=O)C(O)=O DKQCWCQRAMAFLN-UBHSHLNASA-N 0.000 description 1
- USENATHVGFXRNO-SRVKXCTJSA-N Asp-Tyr-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)CC1=CC=C(O)C=C1 USENATHVGFXRNO-SRVKXCTJSA-N 0.000 description 1
- UXIPUCUHQBIQOS-SRVKXCTJSA-N Asp-Tyr-Cys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O UXIPUCUHQBIQOS-SRVKXCTJSA-N 0.000 description 1
- BJDHEININLSZOT-KKUMJFAQSA-N Asp-Tyr-Lys Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(O)=O BJDHEININLSZOT-KKUMJFAQSA-N 0.000 description 1
- XWKBWZXGNXTDKY-ZKWXMUAHSA-N Asp-Val-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC(O)=O XWKBWZXGNXTDKY-ZKWXMUAHSA-N 0.000 description 1
- DCXYFEDJOCDNAF-UHFFFAOYSA-N Asparagine Natural products OC(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-N 0.000 description 1
- 208000003950 B-cell lymphoma Diseases 0.000 description 1
- 208000032791 BCR-ABL1 positive chronic myelogenous leukemia Diseases 0.000 description 1
- 208000035143 Bacterial infection Diseases 0.000 description 1
- 241000606124 Bacteroides fragilis Species 0.000 description 1
- 241001195773 Bacteroides massiliensis Species 0.000 description 1
- 241001135228 Bacteroides ovatus Species 0.000 description 1
- 241000606123 Bacteroides thetaiotaomicron Species 0.000 description 1
- 241000186012 Bifidobacterium breve Species 0.000 description 1
- 241000283690 Bos taurus Species 0.000 description 1
- 241000845990 Bradyrhizobium diazoefficiens Species 0.000 description 1
- 241000589174 Bradyrhizobium japonicum Species 0.000 description 1
- 206010006187 Breast cancer Diseases 0.000 description 1
- 208000026310 Breast neoplasm Diseases 0.000 description 1
- 241001453380 Burkholderia Species 0.000 description 1
- 241001646389 Burkholderia dolosa Species 0.000 description 1
- 241000020731 Burkholderia multivorans Species 0.000 description 1
- 125000001433 C-terminal amino-acid group Chemical group 0.000 description 1
- 101000909256 Caldicellulosiruptor bescii (strain ATCC BAA-1888 / DSM 6725 / Z-1320) DNA polymerase I Proteins 0.000 description 1
- 241000589875 Campylobacter jejuni Species 0.000 description 1
- 241000700199 Cavia porcellus Species 0.000 description 1
- 102100025064 Cellular tumor antigen p53 Human genes 0.000 description 1
- 241001603743 Cereibacter Species 0.000 description 1
- 241000282994 Cervidae Species 0.000 description 1
- 206010008342 Cervix carcinoma Diseases 0.000 description 1
- 108010077544 Chromatin Proteins 0.000 description 1
- 208000010833 Chronic myeloid leukaemia Diseases 0.000 description 1
- 208000030808 Clear cell renal carcinoma Diseases 0.000 description 1
- 206010009944 Colon cancer Diseases 0.000 description 1
- RYGMFSIKBFXOCR-UHFFFAOYSA-N Copper Chemical compound [Cu] RYGMFSIKBFXOCR-UHFFFAOYSA-N 0.000 description 1
- 241000709687 Coxsackievirus Species 0.000 description 1
- 241000699800 Cricetinae Species 0.000 description 1
- 241000366859 Cupriavidus taiwanensis Species 0.000 description 1
- CEZSLNCYQUFOSL-BQBZGAKWSA-N Cys-Arg-Gly Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O CEZSLNCYQUFOSL-BQBZGAKWSA-N 0.000 description 1
- KABHAOSDMIYXTR-GUBZILKMSA-N Cys-Glu-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CS)N KABHAOSDMIYXTR-GUBZILKMSA-N 0.000 description 1
- GCDLPNRHPWBKJJ-WDSKDSINSA-N Cys-Gly-Glu Chemical compound [H]N[C@@H](CS)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O GCDLPNRHPWBKJJ-WDSKDSINSA-N 0.000 description 1
- SSNJZBGOMNLSLA-CIUDSAMLSA-N Cys-Leu-Asn Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O SSNJZBGOMNLSLA-CIUDSAMLSA-N 0.000 description 1
- LHJDLVVQRJIURS-SRVKXCTJSA-N Cys-Phe-Asp Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CS)N LHJDLVVQRJIURS-SRVKXCTJSA-N 0.000 description 1
- SWJYSDXMTPMBHO-FXQIFTODSA-N Cys-Pro-Ser Chemical compound [H]N[C@@H](CS)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O SWJYSDXMTPMBHO-FXQIFTODSA-N 0.000 description 1
- WZJLBUPPZRZNTO-CIUDSAMLSA-N Cys-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CS)N WZJLBUPPZRZNTO-CIUDSAMLSA-N 0.000 description 1
- NAPULYCVEVVFRB-HEIBUPTGSA-N Cys-Thr-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@@H](N)CS NAPULYCVEVVFRB-HEIBUPTGSA-N 0.000 description 1
- XSELZJJGSKZZDO-UBHSHLNASA-N Cys-Trp-Asp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CS)N XSELZJJGSKZZDO-UBHSHLNASA-N 0.000 description 1
- OEDPLIBVQGRKGZ-AVGNSLFASA-N Cys-Tyr-Glu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O OEDPLIBVQGRKGZ-AVGNSLFASA-N 0.000 description 1
- WQZGKKKJIJFFOK-QTVWNMPRSA-N D-mannopyranose Chemical compound OC[C@H]1OC(O)[C@@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-QTVWNMPRSA-N 0.000 description 1
- 108020001738 DNA Glycosylase Proteins 0.000 description 1
- 102100040262 DNA dC->dU-editing enzyme APOBEC-3B Human genes 0.000 description 1
- 102100040261 DNA dC->dU-editing enzyme APOBEC-3C Human genes 0.000 description 1
- 102100040264 DNA dC->dU-editing enzyme APOBEC-3D Human genes 0.000 description 1
- 102100040266 DNA dC->dU-editing enzyme APOBEC-3F Human genes 0.000 description 1
- 102100038076 DNA dC->dU-editing enzyme APOBEC-3G Human genes 0.000 description 1
- 102100038050 DNA dC->dU-editing enzyme APOBEC-3H Human genes 0.000 description 1
- 101710082737 DNA dC->dU-editing enzyme APOBEC-3H Proteins 0.000 description 1
- 102000028381 DNA glycosylase Human genes 0.000 description 1
- 102100021429 DNA-directed RNA polymerase II subunit RPB1 Human genes 0.000 description 1
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 1
- 206010014733 Endometrial cancer Diseases 0.000 description 1
- 206010014759 Endometrial neoplasm Diseases 0.000 description 1
- 108010067770 Endopeptidase K Proteins 0.000 description 1
- 241000194032 Enterococcus faecalis Species 0.000 description 1
- 241000991587 Enterovirus C Species 0.000 description 1
- 241000283074 Equus asinus Species 0.000 description 1
- 241000283073 Equus caballus Species 0.000 description 1
- 208000000461 Esophageal Neoplasms Diseases 0.000 description 1
- 241000282326 Felis catus Species 0.000 description 1
- 240000009088 Fragaria x ananassa Species 0.000 description 1
- 241001536358 Fraxinus Species 0.000 description 1
- 241000605956 Fusobacterium mortiferum Species 0.000 description 1
- 241000605986 Fusobacterium nucleatum Species 0.000 description 1
- 108010010803 Gelatin Proteins 0.000 description 1
- 208000032320 Germ cell tumor of testis Diseases 0.000 description 1
- 208000032612 Glial tumor Diseases 0.000 description 1
- 201000010915 Glioblastoma multiforme Diseases 0.000 description 1
- 206010018338 Glioma Diseases 0.000 description 1
- RZSLYUUFFVHFRQ-FXQIFTODSA-N Gln-Ala-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O RZSLYUUFFVHFRQ-FXQIFTODSA-N 0.000 description 1
- KVYVOGYEMPEXBT-GUBZILKMSA-N Gln-Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(N)=O KVYVOGYEMPEXBT-GUBZILKMSA-N 0.000 description 1
- JSYULGSPLTZDHM-NRPADANISA-N Gln-Ala-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O JSYULGSPLTZDHM-NRPADANISA-N 0.000 description 1
- JFOKLAPFYCTNHW-SRVKXCTJSA-N Gln-Arg-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCC(=O)N)N JFOKLAPFYCTNHW-SRVKXCTJSA-N 0.000 description 1
- TWHDOEYLXXQYOZ-FXQIFTODSA-N Gln-Asn-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N TWHDOEYLXXQYOZ-FXQIFTODSA-N 0.000 description 1
- CITDWMLWXNUQKD-FXQIFTODSA-N Gln-Gln-Asn Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N CITDWMLWXNUQKD-FXQIFTODSA-N 0.000 description 1
- LWDGZZGWDMHBOF-FXQIFTODSA-N Gln-Glu-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O LWDGZZGWDMHBOF-FXQIFTODSA-N 0.000 description 1
- KDXKFBSNIJYNNR-YVNDNENWSA-N Gln-Glu-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KDXKFBSNIJYNNR-YVNDNENWSA-N 0.000 description 1
- XKBASPWPBXNVLQ-WDSKDSINSA-N Gln-Gly-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O XKBASPWPBXNVLQ-WDSKDSINSA-N 0.000 description 1
- DWDBJWAXPXXYLP-SRVKXCTJSA-N Gln-His-Arg Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CCC(=O)N)N DWDBJWAXPXXYLP-SRVKXCTJSA-N 0.000 description 1
- LTXLIIZACMCQTO-GUBZILKMSA-N Gln-His-Asp Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N LTXLIIZACMCQTO-GUBZILKMSA-N 0.000 description 1
- ITZWDGBYBPUZRG-KBIXCLLPSA-N Gln-Ile-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O ITZWDGBYBPUZRG-KBIXCLLPSA-N 0.000 description 1
- ZNTDJIMJKNNSLR-RWRJDSDZSA-N Gln-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N ZNTDJIMJKNNSLR-RWRJDSDZSA-N 0.000 description 1
- JKGHMESJHRTHIC-SIUGBPQLSA-N Gln-Ile-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N JKGHMESJHRTHIC-SIUGBPQLSA-N 0.000 description 1
- QBLMTCRYYTVUQY-GUBZILKMSA-N Gln-Leu-Asp Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O QBLMTCRYYTVUQY-GUBZILKMSA-N 0.000 description 1
- SXGMGNZEHFORAV-IUCAKERBSA-N Gln-Lys-Gly Chemical compound C(CCN)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)N)N SXGMGNZEHFORAV-IUCAKERBSA-N 0.000 description 1
- PBYFVIQRFLNQCO-GUBZILKMSA-N Gln-Pro-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O PBYFVIQRFLNQCO-GUBZILKMSA-N 0.000 description 1
- FGWRYRAVBVOHIB-XIRDDKMYSA-N Gln-Pro-Trp Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CCC(=O)N)N)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)O FGWRYRAVBVOHIB-XIRDDKMYSA-N 0.000 description 1
- MFHVAWMMKZBSRQ-ACZMJKKPSA-N Gln-Ser-Cys Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)O)N MFHVAWMMKZBSRQ-ACZMJKKPSA-N 0.000 description 1
- XKPACHRGOWQHFH-IRIUXVKKSA-N Gln-Thr-Tyr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O XKPACHRGOWQHFH-IRIUXVKKSA-N 0.000 description 1
- UTKUTMJSWKKHEM-WDSKDSINSA-N Glu-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O UTKUTMJSWKKHEM-WDSKDSINSA-N 0.000 description 1
- JJKKWYQVHRUSDG-GUBZILKMSA-N Glu-Ala-Lys Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O JJKKWYQVHRUSDG-GUBZILKMSA-N 0.000 description 1
- MXOODARRORARSU-ACZMJKKPSA-N Glu-Ala-Ser Chemical compound C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCC(=O)O)N MXOODARRORARSU-ACZMJKKPSA-N 0.000 description 1
- WOMUDRVDJMHTCV-DCAQKATOSA-N Glu-Arg-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O WOMUDRVDJMHTCV-DCAQKATOSA-N 0.000 description 1
- KKCUFHUTMKQQCF-SRVKXCTJSA-N Glu-Arg-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O KKCUFHUTMKQQCF-SRVKXCTJSA-N 0.000 description 1
- WOSRKEJQESVHGA-CIUDSAMLSA-N Glu-Arg-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O WOSRKEJQESVHGA-CIUDSAMLSA-N 0.000 description 1
- NTBDVNJIWCKURJ-ACZMJKKPSA-N Glu-Asp-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O NTBDVNJIWCKURJ-ACZMJKKPSA-N 0.000 description 1
- DSPQRJXOIXHOHK-WDSKDSINSA-N Glu-Asp-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O DSPQRJXOIXHOHK-WDSKDSINSA-N 0.000 description 1
- OXEMJGCAJFFREE-FXQIFTODSA-N Glu-Gln-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O OXEMJGCAJFFREE-FXQIFTODSA-N 0.000 description 1
- ILGFBUGLBSAQQB-GUBZILKMSA-N Glu-Glu-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O ILGFBUGLBSAQQB-GUBZILKMSA-N 0.000 description 1
- OGNJZUXUTPQVBR-BQBZGAKWSA-N Glu-Gly-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O OGNJZUXUTPQVBR-BQBZGAKWSA-N 0.000 description 1
- OPAINBJQDQTGJY-JGVFFNPUSA-N Glu-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CCC(=O)O)N)C(=O)O OPAINBJQDQTGJY-JGVFFNPUSA-N 0.000 description 1
- VGOFRWOTSXVPAU-SDDRHHMPSA-N Glu-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CCC(=O)O)N)C(=O)O VGOFRWOTSXVPAU-SDDRHHMPSA-N 0.000 description 1
- WVTIBGWZUMJBFY-GUBZILKMSA-N Glu-His-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O WVTIBGWZUMJBFY-GUBZILKMSA-N 0.000 description 1
- INGJLBQKTRJLFO-UKJIMTQDSA-N Glu-Ile-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(O)=O INGJLBQKTRJLFO-UKJIMTQDSA-N 0.000 description 1
- HVYWQYLBVXMXSV-GUBZILKMSA-N Glu-Leu-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O HVYWQYLBVXMXSV-GUBZILKMSA-N 0.000 description 1
- VSRCAOIHMGCIJK-SRVKXCTJSA-N Glu-Leu-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O VSRCAOIHMGCIJK-SRVKXCTJSA-N 0.000 description 1
- IRXNJYPKBVERCW-DCAQKATOSA-N Glu-Leu-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O IRXNJYPKBVERCW-DCAQKATOSA-N 0.000 description 1
- IVGJYOOGJLFKQE-AVGNSLFASA-N Glu-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N IVGJYOOGJLFKQE-AVGNSLFASA-N 0.000 description 1
- OCJRHJZKGGSPRW-IUCAKERBSA-N Glu-Lys-Gly Chemical compound NCCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O OCJRHJZKGGSPRW-IUCAKERBSA-N 0.000 description 1
- JDUKCSSHWNIQQZ-IHRRRGAJSA-N Glu-Phe-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O JDUKCSSHWNIQQZ-IHRRRGAJSA-N 0.000 description 1
- QNJNPKSWAHPYGI-JYJNAYRXSA-N Glu-Phe-Leu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C(O)=O)CC1=CC=CC=C1 QNJNPKSWAHPYGI-JYJNAYRXSA-N 0.000 description 1
- UQHGAYSULGRWRG-WHFBIAKZSA-N Glu-Ser Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CO)C(O)=O UQHGAYSULGRWRG-WHFBIAKZSA-N 0.000 description 1
- GMVCSRBOSIUTFC-FXQIFTODSA-N Glu-Ser-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O GMVCSRBOSIUTFC-FXQIFTODSA-N 0.000 description 1
- JVYNYWXHZWVJEF-NUMRIWBASA-N Glu-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O JVYNYWXHZWVJEF-NUMRIWBASA-N 0.000 description 1
- LWYUQLZOIORFFJ-XKBZYTNZSA-N Glu-Thr-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O LWYUQLZOIORFFJ-XKBZYTNZSA-N 0.000 description 1
- LZEUDRYSAZAJIO-AUTRQRHGSA-N Glu-Val-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O LZEUDRYSAZAJIO-AUTRQRHGSA-N 0.000 description 1
- WQZGKKKJIJFFOK-GASJEMHNSA-N Glucose Natural products OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-GASJEMHNSA-N 0.000 description 1
- GZUKEVBTYNNUQF-WDSKDSINSA-N Gly-Ala-Gln Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O GZUKEVBTYNNUQF-WDSKDSINSA-N 0.000 description 1
- QSDKBRMVXSWAQE-BFHQHQDPSA-N Gly-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN QSDKBRMVXSWAQE-BFHQHQDPSA-N 0.000 description 1
- CLODWIOAKCSBAN-BQBZGAKWSA-N Gly-Arg-Asp Chemical compound NC(N)=NCCC[C@H](NC(=O)CN)C(=O)N[C@@H](CC(O)=O)C(O)=O CLODWIOAKCSBAN-BQBZGAKWSA-N 0.000 description 1
- KFMBRBPXHVMDFN-UWVGGRQHSA-N Gly-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCNC(N)=N KFMBRBPXHVMDFN-UWVGGRQHSA-N 0.000 description 1
- KQDMENMTYNBWMR-WHFBIAKZSA-N Gly-Asp-Ala Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O KQDMENMTYNBWMR-WHFBIAKZSA-N 0.000 description 1
- QGZSAHIZRQHCEQ-QWRGUYRKSA-N Gly-Asp-Tyr Chemical compound NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 QGZSAHIZRQHCEQ-QWRGUYRKSA-N 0.000 description 1
- GVVKYKCOFMMTKZ-WHFBIAKZSA-N Gly-Cys-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CS)NC(=O)CN GVVKYKCOFMMTKZ-WHFBIAKZSA-N 0.000 description 1
- JUBDONGMHASUCN-IUCAKERBSA-N Gly-Glu-His Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](Cc1cnc[nH]1)C(O)=O JUBDONGMHASUCN-IUCAKERBSA-N 0.000 description 1
- KAJAOGBVWCYGHZ-JTQLQIEISA-N Gly-Gly-Phe Chemical compound [NH3+]CC(=O)NCC(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 KAJAOGBVWCYGHZ-JTQLQIEISA-N 0.000 description 1
- YFGONBOFGGWKKY-VHSXEESVSA-N Gly-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)CN)C(=O)O YFGONBOFGGWKKY-VHSXEESVSA-N 0.000 description 1
- HMHRTKOWRUPPNU-RCOVLWMOSA-N Gly-Ile-Gly Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O HMHRTKOWRUPPNU-RCOVLWMOSA-N 0.000 description 1
- UESJMAMHDLEHGM-NHCYSSNCSA-N Gly-Ile-Leu Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O UESJMAMHDLEHGM-NHCYSSNCSA-N 0.000 description 1
- ITZOBNKQDZEOCE-NHCYSSNCSA-N Gly-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)CN ITZOBNKQDZEOCE-NHCYSSNCSA-N 0.000 description 1
- YTSVAIMKVLZUDU-YUMQZZPRSA-N Gly-Leu-Asp Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O YTSVAIMKVLZUDU-YUMQZZPRSA-N 0.000 description 1
- LLZXNUUIBOALNY-QWRGUYRKSA-N Gly-Leu-Lys Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCCN LLZXNUUIBOALNY-QWRGUYRKSA-N 0.000 description 1
- CLNSYANKYVMZNM-UWVGGRQHSA-N Gly-Lys-Arg Chemical compound NCCCC[C@H](NC(=O)CN)C(=O)N[C@H](C(O)=O)CCCN=C(N)N CLNSYANKYVMZNM-UWVGGRQHSA-N 0.000 description 1
- MHXKHKWHPNETGG-QWRGUYRKSA-N Gly-Lys-Leu Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O MHXKHKWHPNETGG-QWRGUYRKSA-N 0.000 description 1
- JPVGHHQGKPQYIL-KBPBESRZSA-N Gly-Phe-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 JPVGHHQGKPQYIL-KBPBESRZSA-N 0.000 description 1
- QAMMIGULQSIRCD-IRXDYDNUSA-N Gly-Phe-Tyr Chemical compound C([C@H](NC(=O)C[NH3+])C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C([O-])=O)C1=CC=CC=C1 QAMMIGULQSIRCD-IRXDYDNUSA-N 0.000 description 1
- JYPCXBJRLBHWME-IUCAKERBSA-N Gly-Pro-Arg Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JYPCXBJRLBHWME-IUCAKERBSA-N 0.000 description 1
- HFPVRZWORNJRRC-UWVGGRQHSA-N Gly-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN HFPVRZWORNJRRC-UWVGGRQHSA-N 0.000 description 1
- BMWFDYIYBAFROD-WPRPVWTQSA-N Gly-Pro-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN BMWFDYIYBAFROD-WPRPVWTQSA-N 0.000 description 1
- BCCRXDTUTZHDEU-VKHMYHEASA-N Gly-Ser Chemical compound NCC(=O)N[C@@H](CO)C(O)=O BCCRXDTUTZHDEU-VKHMYHEASA-N 0.000 description 1
- WNGHUXFWEWTKAO-YUMQZZPRSA-N Gly-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)CN WNGHUXFWEWTKAO-YUMQZZPRSA-N 0.000 description 1
- ABPRMMYHROQBLY-NKWVEPMBSA-N Gly-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)CN)C(=O)O ABPRMMYHROQBLY-NKWVEPMBSA-N 0.000 description 1
- FFJQHWKSGAWSTJ-BFHQHQDPSA-N Gly-Thr-Ala Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O FFJQHWKSGAWSTJ-BFHQHQDPSA-N 0.000 description 1
- PNUFMLXHOLFRLD-KBPBESRZSA-N Gly-Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 PNUFMLXHOLFRLD-KBPBESRZSA-N 0.000 description 1
- NGRPGJGKJMUGDM-XVKPBYJWSA-N Gly-Val-Gln Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O NGRPGJGKJMUGDM-XVKPBYJWSA-N 0.000 description 1
- 244000068988 Glycine max Species 0.000 description 1
- 235000010469 Glycine max Nutrition 0.000 description 1
- AFPFGFUGETYOSY-HGNGGELXSA-N His-Ala-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O AFPFGFUGETYOSY-HGNGGELXSA-N 0.000 description 1
- SVHKVHBPTOMLTO-DCAQKATOSA-N His-Arg-Asp Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O SVHKVHBPTOMLTO-DCAQKATOSA-N 0.000 description 1
- ZIMTWPHIKZEHSE-UWVGGRQHSA-N His-Arg-Gly Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O ZIMTWPHIKZEHSE-UWVGGRQHSA-N 0.000 description 1
- FPNWKONEZAVQJF-GUBZILKMSA-N His-Asn-Gln Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N FPNWKONEZAVQJF-GUBZILKMSA-N 0.000 description 1
- UZZXGLOJRZKYEL-DJFWLOJKSA-N His-Asn-Ile Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O UZZXGLOJRZKYEL-DJFWLOJKSA-N 0.000 description 1
- HVCRQRQPIIRNLY-IUCAKERBSA-N His-Gln-Gly Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)NCC(=O)O)N HVCRQRQPIIRNLY-IUCAKERBSA-N 0.000 description 1
- DVHGLDYMGWTYKW-GUBZILKMSA-N His-Gln-Ser Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O DVHGLDYMGWTYKW-GUBZILKMSA-N 0.000 description 1
- GHAFKUCRIVBLDJ-IHRRRGAJSA-N His-His-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC2=CN=CN2)N GHAFKUCRIVBLDJ-IHRRRGAJSA-N 0.000 description 1
- MPXGJGBXCRQQJE-MXAVVETBSA-N His-Ile-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O MPXGJGBXCRQQJE-MXAVVETBSA-N 0.000 description 1
- DEOQGJUXUQGUJN-KKUMJFAQSA-N His-Lys-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N DEOQGJUXUQGUJN-KKUMJFAQSA-N 0.000 description 1
- LDFWDDVELNOGII-MXAVVETBSA-N His-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC1=CN=CN1)N LDFWDDVELNOGII-MXAVVETBSA-N 0.000 description 1
- IGBBXBFSLKRHJB-BZSNNMDCSA-N His-Lys-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CN=CN1 IGBBXBFSLKRHJB-BZSNNMDCSA-N 0.000 description 1
- WYSJPCTWSBJFCO-AVGNSLFASA-N His-Met-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CC1=CN=CN1)N WYSJPCTWSBJFCO-AVGNSLFASA-N 0.000 description 1
- PLCAEMGSYOYIPP-GUBZILKMSA-N His-Ser-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CN=CN1 PLCAEMGSYOYIPP-GUBZILKMSA-N 0.000 description 1
- PBJOQLUVSGXRSW-YTQUADARSA-N His-Trp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CNC3=CC=CC=C32)NC(=O)[C@H](CC4=CN=CN4)N)C(=O)O PBJOQLUVSGXRSW-YTQUADARSA-N 0.000 description 1
- WSWAUVHXQREQQG-JYJNAYRXSA-N His-Tyr-Gln Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O WSWAUVHXQREQQG-JYJNAYRXSA-N 0.000 description 1
- ZHMZWSFQRUGLEC-JYJNAYRXSA-N His-Tyr-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZHMZWSFQRUGLEC-JYJNAYRXSA-N 0.000 description 1
- SYPULFZAGBBIOM-GVXVVHGQSA-N His-Val-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N SYPULFZAGBBIOM-GVXVVHGQSA-N 0.000 description 1
- 101000964385 Homo sapiens DNA dC->dU-editing enzyme APOBEC-3B Proteins 0.000 description 1
- 101000964383 Homo sapiens DNA dC->dU-editing enzyme APOBEC-3C Proteins 0.000 description 1
- 101000964382 Homo sapiens DNA dC->dU-editing enzyme APOBEC-3D Proteins 0.000 description 1
- 101000964377 Homo sapiens DNA dC->dU-editing enzyme APOBEC-3F Proteins 0.000 description 1
- 101001106401 Homo sapiens DNA-directed RNA polymerase II subunit RPB1 Proteins 0.000 description 1
- 101000883798 Homo sapiens Probable ATP-dependent RNA helicase DDX53 Proteins 0.000 description 1
- 101000823473 Homo sapiens Protein FAM171B Proteins 0.000 description 1
- 101000800426 Homo sapiens Putative C->U-editing enzyme APOBEC-4 Proteins 0.000 description 1
- 101000666896 Homo sapiens V-type immunoglobulin domain-containing suppressor of T-cell activation Proteins 0.000 description 1
- 241000701044 Human gammaherpesvirus 4 Species 0.000 description 1
- RWIKBYVJQAJYDP-BJDJZHNGSA-N Ile-Ala-Lys Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN RWIKBYVJQAJYDP-BJDJZHNGSA-N 0.000 description 1
- CYHYBSGMHMHKOA-CIQUZCHMSA-N Ile-Ala-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N CYHYBSGMHMHKOA-CIQUZCHMSA-N 0.000 description 1
- SACHLUOUHCVIKI-GMOBBJLQSA-N Ile-Arg-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N SACHLUOUHCVIKI-GMOBBJLQSA-N 0.000 description 1
- FVEWRQXNISSYFO-ZPFDUUQYSA-N Ile-Arg-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N FVEWRQXNISSYFO-ZPFDUUQYSA-N 0.000 description 1
- YOTNPRLPIPHQSB-XUXIUFHCSA-N Ile-Arg-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(=O)O)N YOTNPRLPIPHQSB-XUXIUFHCSA-N 0.000 description 1
- YKRIXHPEIZUDDY-GMOBBJLQSA-N Ile-Asn-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YKRIXHPEIZUDDY-GMOBBJLQSA-N 0.000 description 1
- HGNUKGZQASSBKQ-PCBIJLKTSA-N Ile-Asp-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N HGNUKGZQASSBKQ-PCBIJLKTSA-N 0.000 description 1
- DMZOUKXXHJQPTL-GRLWGSQLSA-N Ile-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N DMZOUKXXHJQPTL-GRLWGSQLSA-N 0.000 description 1
- HYLIOBDWPQNLKI-HVTMNAMFSA-N Ile-His-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N HYLIOBDWPQNLKI-HVTMNAMFSA-N 0.000 description 1
- SJLVSMMIFYTSGY-GRLWGSQLSA-N Ile-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N SJLVSMMIFYTSGY-GRLWGSQLSA-N 0.000 description 1
- YGDWPQCLFJNMOL-MNXVOIDGSA-N Ile-Leu-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N YGDWPQCLFJNMOL-MNXVOIDGSA-N 0.000 description 1
- CEPIAEUVRKGPGP-DSYPUSFNSA-N Ile-Lys-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)[C@@H](C)CC)C(O)=O)=CNC2=C1 CEPIAEUVRKGPGP-DSYPUSFNSA-N 0.000 description 1
- RVNOXPZHMUWCLW-GMOBBJLQSA-N Ile-Met-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(=O)N)C(=O)O)N RVNOXPZHMUWCLW-GMOBBJLQSA-N 0.000 description 1
- IIWQTXMUALXGOV-PCBIJLKTSA-N Ile-Phe-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)O)C(=O)O)N IIWQTXMUALXGOV-PCBIJLKTSA-N 0.000 description 1
- VEPIBPGLTLPBDW-URLPEUOOSA-N Ile-Phe-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N VEPIBPGLTLPBDW-URLPEUOOSA-N 0.000 description 1
- ZNOBVZFCHNHKHA-KBIXCLLPSA-N Ile-Ser-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ZNOBVZFCHNHKHA-KBIXCLLPSA-N 0.000 description 1
- YBKKLDBBPFIXBQ-MBLNEYKQSA-N Ile-Thr-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(=O)O)N YBKKLDBBPFIXBQ-MBLNEYKQSA-N 0.000 description 1
- HJDZMPFEXINXLO-QPHKQPEJSA-N Ile-Thr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N HJDZMPFEXINXLO-QPHKQPEJSA-N 0.000 description 1
- HQLSBZFLOUHQJK-STECZYCISA-N Ile-Tyr-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N HQLSBZFLOUHQJK-STECZYCISA-N 0.000 description 1
- 108060003951 Immunoglobulin Proteins 0.000 description 1
- SITWEMZOJNKJCH-UHFFFAOYSA-N L-alanine-L-arginine Natural products CC(N)C(=O)NC(C(O)=O)CCCNC(N)=N SITWEMZOJNKJCH-UHFFFAOYSA-N 0.000 description 1
- UGTHTQWIQKEDEH-BQBZGAKWSA-N L-alanyl-L-prolylglycine zwitterion Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O UGTHTQWIQKEDEH-BQBZGAKWSA-N 0.000 description 1
- ODKSFYDXXFIFQN-BYPYZUCNSA-P L-argininium(2+) Chemical compound NC(=[NH2+])NCCC[C@H]([NH3+])C(O)=O ODKSFYDXXFIFQN-BYPYZUCNSA-P 0.000 description 1
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 1
- ZDXPYRJPNDTMRX-VKHMYHEASA-N L-glutamine Chemical compound OC(=O)[C@@H](N)CCC(N)=O ZDXPYRJPNDTMRX-VKHMYHEASA-N 0.000 description 1
- KDXKERNSBIXSRK-YFKPBYRVSA-N L-lysine Chemical compound NCCCC[C@H](N)C(O)=O KDXKERNSBIXSRK-YFKPBYRVSA-N 0.000 description 1
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 1
- LZDNBBYBDGBADK-UHFFFAOYSA-N L-valyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C(C)C)C(O)=O)=CNC2=C1 LZDNBBYBDGBADK-UHFFFAOYSA-N 0.000 description 1
- 244000199866 Lactobacillus casei Species 0.000 description 1
- 235000013958 Lactobacillus casei Nutrition 0.000 description 1
- 241000218588 Lactobacillus rhamnosus Species 0.000 description 1
- 208000031671 Large B-Cell Diffuse Lymphoma Diseases 0.000 description 1
- VKOAHIRLIUESLU-ULQDDVLXSA-N Leu-Arg-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O VKOAHIRLIUESLU-ULQDDVLXSA-N 0.000 description 1
- WGNOPSQMIQERPK-GARJFASQSA-N Leu-Asn-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N WGNOPSQMIQERPK-GARJFASQSA-N 0.000 description 1
- DLCOFDAHNMMQPP-SRVKXCTJSA-N Leu-Asp-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O DLCOFDAHNMMQPP-SRVKXCTJSA-N 0.000 description 1
- MMEDVBWCMGRKKC-GARJFASQSA-N Leu-Asp-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N MMEDVBWCMGRKKC-GARJFASQSA-N 0.000 description 1
- PIHFVNPEAHFNLN-KKUMJFAQSA-N Leu-Cys-Tyr Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N PIHFVNPEAHFNLN-KKUMJFAQSA-N 0.000 description 1
- DLCXCECTCPKKCD-GUBZILKMSA-N Leu-Gln-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O DLCXCECTCPKKCD-GUBZILKMSA-N 0.000 description 1
- AXZGZMGRBDQTEY-SRVKXCTJSA-N Leu-Gln-Met Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(O)=O AXZGZMGRBDQTEY-SRVKXCTJSA-N 0.000 description 1
- DZQMXBALGUHGJT-GUBZILKMSA-N Leu-Glu-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O DZQMXBALGUHGJT-GUBZILKMSA-N 0.000 description 1
- VWHGTYCRDRBSFI-ZETCQYMHSA-N Leu-Gly-Gly Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)NCC(O)=O VWHGTYCRDRBSFI-ZETCQYMHSA-N 0.000 description 1
- VBZOAGIPCULURB-QWRGUYRKSA-N Leu-Gly-His Chemical compound CC(C)C[C@@H](C(=O)NCC(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N VBZOAGIPCULURB-QWRGUYRKSA-N 0.000 description 1
- PBGDOSARRIJMEV-DLOVCJGASA-N Leu-His-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(O)=O PBGDOSARRIJMEV-DLOVCJGASA-N 0.000 description 1
- VZBIUJURDLFFOE-IHRRRGAJSA-N Leu-His-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O VZBIUJURDLFFOE-IHRRRGAJSA-N 0.000 description 1
- BTNXKBVLWJBTNR-SRVKXCTJSA-N Leu-His-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(N)=O)C(O)=O BTNXKBVLWJBTNR-SRVKXCTJSA-N 0.000 description 1
- IFMPDNRWZZEZSL-SRVKXCTJSA-N Leu-Leu-Cys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(O)=O IFMPDNRWZZEZSL-SRVKXCTJSA-N 0.000 description 1
- QNBVTHNJGCOVFA-AVGNSLFASA-N Leu-Leu-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCC(O)=O QNBVTHNJGCOVFA-AVGNSLFASA-N 0.000 description 1
- XVZCXCTYGHPNEM-UHFFFAOYSA-N Leu-Leu-Pro Natural products CC(C)CC(N)C(=O)NC(CC(C)C)C(=O)N1CCCC1C(O)=O XVZCXCTYGHPNEM-UHFFFAOYSA-N 0.000 description 1
- QNTJIDXQHWUBKC-BZSNNMDCSA-N Leu-Lys-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QNTJIDXQHWUBKC-BZSNNMDCSA-N 0.000 description 1
- WXZOHBVPVKABQN-DCAQKATOSA-N Leu-Met-Asp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(=O)O)C(=O)O)N WXZOHBVPVKABQN-DCAQKATOSA-N 0.000 description 1
- FYPWFNKQVVEELI-ULQDDVLXSA-N Leu-Phe-Val Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=CC=C1 FYPWFNKQVVEELI-ULQDDVLXSA-N 0.000 description 1
- BMVFXOQHDQZAQU-DCAQKATOSA-N Leu-Pro-Asp Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(=O)O)C(=O)O)N BMVFXOQHDQZAQU-DCAQKATOSA-N 0.000 description 1
- DPURXCQCHSQPAN-AVGNSLFASA-N Leu-Pro-Pro Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 DPURXCQCHSQPAN-AVGNSLFASA-N 0.000 description 1
- AKVBOOKXVAMKSS-GUBZILKMSA-N Leu-Ser-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O AKVBOOKXVAMKSS-GUBZILKMSA-N 0.000 description 1
- GOFJOGXGMPHOGL-DCAQKATOSA-N Leu-Ser-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(C)C GOFJOGXGMPHOGL-DCAQKATOSA-N 0.000 description 1
- PPGBXYKMUMHFBF-KATARQTJSA-N Leu-Ser-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PPGBXYKMUMHFBF-KATARQTJSA-N 0.000 description 1
- ZJZNLRVCZWUONM-JXUBOQSCSA-N Leu-Thr-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O ZJZNLRVCZWUONM-JXUBOQSCSA-N 0.000 description 1
- LFSQWRSVPNKJGP-WDCWCFNPSA-N Leu-Thr-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCC(O)=O LFSQWRSVPNKJGP-WDCWCFNPSA-N 0.000 description 1
- FGZVGOAAROXFAB-IXOXFDKPSA-N Leu-Thr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC(C)C)N)O FGZVGOAAROXFAB-IXOXFDKPSA-N 0.000 description 1
- ODRREERHVHMIPT-OEAJRASXSA-N Leu-Thr-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ODRREERHVHMIPT-OEAJRASXSA-N 0.000 description 1
- UIIMIKFNIYPDJF-WDSOQIARSA-N Leu-Trp-Met Chemical compound C1=CC=C2C(C[C@@H](C(=O)N[C@@H](CCSC)C(O)=O)NC(=O)[C@@H](N)CC(C)C)=CNC2=C1 UIIMIKFNIYPDJF-WDSOQIARSA-N 0.000 description 1
- UCRJTSIIAYHOHE-ULQDDVLXSA-N Leu-Tyr-Arg Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N UCRJTSIIAYHOHE-ULQDDVLXSA-N 0.000 description 1
- XZNJZXJZBMBGGS-NHCYSSNCSA-N Leu-Val-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O XZNJZXJZBMBGGS-NHCYSSNCSA-N 0.000 description 1
- FMFNIDICDKEMOE-XUXIUFHCSA-N Leu-Val-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FMFNIDICDKEMOE-XUXIUFHCSA-N 0.000 description 1
- FDBTVENULFNTAL-XQQFMLRXSA-N Leu-Val-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N FDBTVENULFNTAL-XQQFMLRXSA-N 0.000 description 1
- QESXLSQLQHHTIX-RHYQMDGZSA-N Leu-Val-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QESXLSQLQHHTIX-RHYQMDGZSA-N 0.000 description 1
- 102000003960 Ligases Human genes 0.000 description 1
- 108090000364 Ligases Proteins 0.000 description 1
- 206010025323 Lymphomas Diseases 0.000 description 1
- ALSRJRIWBNENFY-DCAQKATOSA-N Lys-Arg-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O ALSRJRIWBNENFY-DCAQKATOSA-N 0.000 description 1
- GQUDMNDPQTXZRV-DCAQKATOSA-N Lys-Arg-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O GQUDMNDPQTXZRV-DCAQKATOSA-N 0.000 description 1
- VHNOAIFVYUQOOY-XUXIUFHCSA-N Lys-Arg-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VHNOAIFVYUQOOY-XUXIUFHCSA-N 0.000 description 1
- GGAPIOORBXHMNY-ULQDDVLXSA-N Lys-Arg-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCCCN)N)O GGAPIOORBXHMNY-ULQDDVLXSA-N 0.000 description 1
- DNEJSAIMVANNPA-DCAQKATOSA-N Lys-Asn-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O DNEJSAIMVANNPA-DCAQKATOSA-N 0.000 description 1
- NCTDKZKNBDZDOL-GARJFASQSA-N Lys-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N)C(=O)O NCTDKZKNBDZDOL-GARJFASQSA-N 0.000 description 1
- IWWMPCPLFXFBAF-SRVKXCTJSA-N Lys-Asp-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O IWWMPCPLFXFBAF-SRVKXCTJSA-N 0.000 description 1
- QQUJSUFWEDZQQY-AVGNSLFASA-N Lys-Gln-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CCCCN QQUJSUFWEDZQQY-AVGNSLFASA-N 0.000 description 1
- NKKFVJRLCCUJNA-QWRGUYRKSA-N Lys-Gly-Lys Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN NKKFVJRLCCUJNA-QWRGUYRKSA-N 0.000 description 1
- FHIAJWBDZVHLAH-YUMQZZPRSA-N Lys-Gly-Ser Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O FHIAJWBDZVHLAH-YUMQZZPRSA-N 0.000 description 1
- NNKLKUUGESXCBS-KBPBESRZSA-N Lys-Gly-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O NNKLKUUGESXCBS-KBPBESRZSA-N 0.000 description 1
- KNKJPYAZQUFLQK-IHRRRGAJSA-N Lys-His-Arg Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CCCCN)N KNKJPYAZQUFLQK-IHRRRGAJSA-N 0.000 description 1
- ZMMDPRTXLAEMOD-BZSNNMDCSA-N Lys-His-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ZMMDPRTXLAEMOD-BZSNNMDCSA-N 0.000 description 1
- PGLGNCVOWIORQE-SRVKXCTJSA-N Lys-His-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O PGLGNCVOWIORQE-SRVKXCTJSA-N 0.000 description 1
- OIYWBDBHEGAVST-BZSNNMDCSA-N Lys-His-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O OIYWBDBHEGAVST-BZSNNMDCSA-N 0.000 description 1
- MXMDJEJWERYPMO-XUXIUFHCSA-N Lys-Ile-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O MXMDJEJWERYPMO-XUXIUFHCSA-N 0.000 description 1
- OIQSIMFSVLLWBX-VOAKCMCISA-N Lys-Leu-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OIQSIMFSVLLWBX-VOAKCMCISA-N 0.000 description 1
- YUAXTFMFMOIMAM-QWRGUYRKSA-N Lys-Lys-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O YUAXTFMFMOIMAM-QWRGUYRKSA-N 0.000 description 1
- ATNKHRAIZCMCCN-BZSNNMDCSA-N Lys-Lys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N ATNKHRAIZCMCCN-BZSNNMDCSA-N 0.000 description 1
- PLDJDCJLRCYPJB-VOAKCMCISA-N Lys-Lys-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PLDJDCJLRCYPJB-VOAKCMCISA-N 0.000 description 1
- WWEWGPOLIJXGNX-XUXIUFHCSA-N Lys-Met-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CCCCN)N WWEWGPOLIJXGNX-XUXIUFHCSA-N 0.000 description 1
- MTBLFIQZECOEBY-IHRRRGAJSA-N Lys-Met-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(O)=O MTBLFIQZECOEBY-IHRRRGAJSA-N 0.000 description 1
- ZJSZPXISKMDJKQ-JYJNAYRXSA-N Lys-Phe-Glu Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCC(O)=O)C(O)=O)CC1=CC=CC=C1 ZJSZPXISKMDJKQ-JYJNAYRXSA-N 0.000 description 1
- YFQSSOAGMZGXFT-MEYUZBJRSA-N Lys-Thr-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O YFQSSOAGMZGXFT-MEYUZBJRSA-N 0.000 description 1
- RMKJOQSYLQQRFN-KKUMJFAQSA-N Lys-Tyr-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O RMKJOQSYLQQRFN-KKUMJFAQSA-N 0.000 description 1
- VVURYEVJJTXWNE-ULQDDVLXSA-N Lys-Tyr-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O VVURYEVJJTXWNE-ULQDDVLXSA-N 0.000 description 1
- OHXUUQDOBQKSNB-AVGNSLFASA-N Lys-Val-Arg Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O OHXUUQDOBQKSNB-AVGNSLFASA-N 0.000 description 1
- UGCIQUYEJIEHKX-GVXVVHGQSA-N Lys-Val-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O UGCIQUYEJIEHKX-GVXVVHGQSA-N 0.000 description 1
- GILLQRYAWOMHED-DCAQKATOSA-N Lys-Val-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN GILLQRYAWOMHED-DCAQKATOSA-N 0.000 description 1
- HMZPYMSEAALNAE-ULQDDVLXSA-N Lys-Val-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O HMZPYMSEAALNAE-ULQDDVLXSA-N 0.000 description 1
- 239000004472 Lysine Substances 0.000 description 1
- 229930195725 Mannitol Natural products 0.000 description 1
- 241000712079 Measles morbillivirus Species 0.000 description 1
- VHGIWFGJIHTASW-FXQIFTODSA-N Met-Ala-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O VHGIWFGJIHTASW-FXQIFTODSA-N 0.000 description 1
- GODBLDDYHFTUAH-CIUDSAMLSA-N Met-Asp-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(O)=O GODBLDDYHFTUAH-CIUDSAMLSA-N 0.000 description 1
- KQBJYJXPZBNEIK-DCAQKATOSA-N Met-Glu-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCNC(N)=N KQBJYJXPZBNEIK-DCAQKATOSA-N 0.000 description 1
- QEDGNYFHLXXIDC-DCAQKATOSA-N Met-Pro-Gln Chemical compound CSCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O QEDGNYFHLXXIDC-DCAQKATOSA-N 0.000 description 1
- GMMLGMFBYCFCCX-KZVJFYERSA-N Met-Thr-Ala Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O GMMLGMFBYCFCCX-KZVJFYERSA-N 0.000 description 1
- NSMXRFMGZYTFEX-KJEVXHAQSA-N Met-Thr-Tyr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CCSC)N)O NSMXRFMGZYTFEX-KJEVXHAQSA-N 0.000 description 1
- 208000034578 Multiple myelomas Diseases 0.000 description 1
- 241000711386 Mumps virus Species 0.000 description 1
- 241000699666 Mus <mouse, genus> Species 0.000 description 1
- 241000419175 Mycobacterium colombiense Species 0.000 description 1
- 241000704107 Mycobacterium paraintracellulare Species 0.000 description 1
- 208000033761 Myelogenous Chronic BCR-ABL Positive Leukemia Diseases 0.000 description 1
- 208000033776 Myeloid Acute Leukemia Diseases 0.000 description 1
- XZFYRXDAULDNFX-UHFFFAOYSA-N N-L-cysteinyl-L-phenylalanine Natural products SCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XZFYRXDAULDNFX-UHFFFAOYSA-N 0.000 description 1
- AUEJLPRZGVVDNU-UHFFFAOYSA-N N-L-tyrosyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CC1=CC=C(O)C=C1 AUEJLPRZGVVDNU-UHFFFAOYSA-N 0.000 description 1
- 108010079364 N-glycylalanine Proteins 0.000 description 1
- 108010066427 N-valyltryptophan Proteins 0.000 description 1
- BQVUABVGYYSDCJ-UHFFFAOYSA-N Nalpha-L-Leucyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)CC(C)C)C(O)=O)=CNC2=C1 BQVUABVGYYSDCJ-UHFFFAOYSA-N 0.000 description 1
- 208000005890 Neuroma Diseases 0.000 description 1
- 241000948822 Nocardia cyriacigeorgica Species 0.000 description 1
- 206010030155 Oesophageal carcinoma Diseases 0.000 description 1
- 241000283973 Oryctolagus cuniculus Species 0.000 description 1
- 206010033128 Ovarian cancer Diseases 0.000 description 1
- 206010061535 Ovarian neoplasm Diseases 0.000 description 1
- 241000178960 Paenibacillus macerans Species 0.000 description 1
- 206010061902 Pancreatic neoplasm Diseases 0.000 description 1
- 241000160321 Parabacteroides Species 0.000 description 1
- 206010061332 Paraganglion neoplasm Diseases 0.000 description 1
- 208000002606 Paramyxoviridae Infections Diseases 0.000 description 1
- 241001494479 Pecora Species 0.000 description 1
- 241000517308 Pediculus humanus capitis Species 0.000 description 1
- 241001250141 Phacotus sphaericus Species 0.000 description 1
- BKWJQWJPZMUWEG-LFSVMHDDSA-N Phe-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 BKWJQWJPZMUWEG-LFSVMHDDSA-N 0.000 description 1
- NEHSHYOUIWBYSA-DCPHZVHLSA-N Phe-Ala-Trp Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CC3=CC=CC=C3)N NEHSHYOUIWBYSA-DCPHZVHLSA-N 0.000 description 1
- LGBVMDMZZFYSFW-HJWJTTGWSA-N Phe-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC1=CC=CC=C1)N LGBVMDMZZFYSFW-HJWJTTGWSA-N 0.000 description 1
- HTTYNOXBBOWZTB-SRVKXCTJSA-N Phe-Asn-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N HTTYNOXBBOWZTB-SRVKXCTJSA-N 0.000 description 1
- SWZKMTDPQXLQRD-XVSYOHENSA-N Phe-Asp-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SWZKMTDPQXLQRD-XVSYOHENSA-N 0.000 description 1
- CUMXHKAOHNWRFQ-BZSNNMDCSA-N Phe-Asp-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 CUMXHKAOHNWRFQ-BZSNNMDCSA-N 0.000 description 1
- LWPMGKSZPKFKJD-DZKIICNBSA-N Phe-Glu-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O LWPMGKSZPKFKJD-DZKIICNBSA-N 0.000 description 1
- ZLGQEBCCANLYRA-RYUDHWBXSA-N Phe-Gly-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O ZLGQEBCCANLYRA-RYUDHWBXSA-N 0.000 description 1
- KXUZHWXENMYOHC-QEJZJMRPSA-N Phe-Leu-Ala Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O KXUZHWXENMYOHC-QEJZJMRPSA-N 0.000 description 1
- METZZBCMDXHFMK-BZSNNMDCSA-N Phe-Leu-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N METZZBCMDXHFMK-BZSNNMDCSA-N 0.000 description 1
- KNYPNEYICHHLQL-ACRUOGEOSA-N Phe-Leu-Tyr Chemical compound C([C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 KNYPNEYICHHLQL-ACRUOGEOSA-N 0.000 description 1
- ZIQQNOXKEFDPBE-BZSNNMDCSA-N Phe-Lys-His Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N ZIQQNOXKEFDPBE-BZSNNMDCSA-N 0.000 description 1
- KLXQWABNAWDRAY-ACRUOGEOSA-N Phe-Lys-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 KLXQWABNAWDRAY-ACRUOGEOSA-N 0.000 description 1
- MGLBSROLWAWCKN-FCLVOEFKSA-N Phe-Phe-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MGLBSROLWAWCKN-FCLVOEFKSA-N 0.000 description 1
- MRWOVVNKSXXLRP-IHPCNDPISA-N Phe-Ser-Trp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O MRWOVVNKSXXLRP-IHPCNDPISA-N 0.000 description 1
- KLYYKKGCPOGDPE-OEAJRASXSA-N Phe-Thr-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O KLYYKKGCPOGDPE-OEAJRASXSA-N 0.000 description 1
- GCFNFKNPCMBHNT-IRXDYDNUSA-N Phe-Tyr-Gly Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)NCC(=O)O)N GCFNFKNPCMBHNT-IRXDYDNUSA-N 0.000 description 1
- APMXLWHMIVWLLR-BZSNNMDCSA-N Phe-Tyr-Ser Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CO)C(O)=O)C1=CC=CC=C1 APMXLWHMIVWLLR-BZSNNMDCSA-N 0.000 description 1
- JSGWNFKWZNPDAV-YDHLFZDLSA-N Phe-Val-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 JSGWNFKWZNPDAV-YDHLFZDLSA-N 0.000 description 1
- XALFIVXGQUEGKV-JSGCOSHPSA-N Phe-Val-Gly Chemical compound OC(=O)CNC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 XALFIVXGQUEGKV-JSGCOSHPSA-N 0.000 description 1
- GAMLAXHLYGLQBJ-UFYCRDLUSA-N Phe-Val-Tyr Chemical compound N[C@H](C(=O)N[C@H](C(=O)N[C@H](C(=O)O)CC1=CC=C(C=C1)O)C(C)C)CC1=CC=CC=C1 GAMLAXHLYGLQBJ-UFYCRDLUSA-N 0.000 description 1
- APZNYJFGVAGFCF-JYJNAYRXSA-N Phe-Val-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)Cc1ccccc1)C(C)C)C(O)=O APZNYJFGVAGFCF-JYJNAYRXSA-N 0.000 description 1
- 206010035226 Plasma cell myeloma Diseases 0.000 description 1
- 241000224017 Plasmodium berghei Species 0.000 description 1
- 241000223960 Plasmodium falciparum Species 0.000 description 1
- 206010035500 Plasmodium falciparum infection Diseases 0.000 description 1
- DZZCICYRSZASNF-FXQIFTODSA-N Pro-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 DZZCICYRSZASNF-FXQIFTODSA-N 0.000 description 1
- AJLVKXCNXIJHDV-CIUDSAMLSA-N Pro-Ala-Gln Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O AJLVKXCNXIJHDV-CIUDSAMLSA-N 0.000 description 1
- XQLBWXHVZVBNJM-FXQIFTODSA-N Pro-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 XQLBWXHVZVBNJM-FXQIFTODSA-N 0.000 description 1
- IHCXPSYCHXFXKT-DCAQKATOSA-N Pro-Arg-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O IHCXPSYCHXFXKT-DCAQKATOSA-N 0.000 description 1
- QBFONMUYNSNKIX-AVGNSLFASA-N Pro-Arg-His Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O QBFONMUYNSNKIX-AVGNSLFASA-N 0.000 description 1
- ZBAGOWGNNAXMOY-IHRRRGAJSA-N Pro-Cys-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CS)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ZBAGOWGNNAXMOY-IHRRRGAJSA-N 0.000 description 1
- HJSCRFZVGXAGNG-SRVKXCTJSA-N Pro-Gln-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H]1CCCN1 HJSCRFZVGXAGNG-SRVKXCTJSA-N 0.000 description 1
- UAYHMOIGIQZLFR-NHCYSSNCSA-N Pro-Gln-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O UAYHMOIGIQZLFR-NHCYSSNCSA-N 0.000 description 1
- MGDFPGCFVJFITQ-CIUDSAMLSA-N Pro-Glu-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O MGDFPGCFVJFITQ-CIUDSAMLSA-N 0.000 description 1
- NMELOOXSGDRBRU-YUMQZZPRSA-N Pro-Glu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCC(=O)O)NC(=O)[C@@H]1CCCN1 NMELOOXSGDRBRU-YUMQZZPRSA-N 0.000 description 1
- QGOZJLYCGRYYRW-KKUMJFAQSA-N Pro-Glu-Tyr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O QGOZJLYCGRYYRW-KKUMJFAQSA-N 0.000 description 1
- ULIWFCCJIOEHMU-BQBZGAKWSA-N Pro-Gly-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 ULIWFCCJIOEHMU-BQBZGAKWSA-N 0.000 description 1
- AJCRQOHDLCBHFA-SRVKXCTJSA-N Pro-His-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(O)=O)C(O)=O AJCRQOHDLCBHFA-SRVKXCTJSA-N 0.000 description 1
- SOACYAXADBWDDT-CYDGBPFRSA-N Pro-Ile-Arg Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O SOACYAXADBWDDT-CYDGBPFRSA-N 0.000 description 1
- BRJGUPWVFXKBQI-XUXIUFHCSA-N Pro-Leu-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BRJGUPWVFXKBQI-XUXIUFHCSA-N 0.000 description 1
- CPRLKHJUFAXVTD-ULQDDVLXSA-N Pro-Leu-Tyr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CPRLKHJUFAXVTD-ULQDDVLXSA-N 0.000 description 1
- JUJCUYWRJMFJJF-AVGNSLFASA-N Pro-Lys-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H]1CCCN1 JUJCUYWRJMFJJF-AVGNSLFASA-N 0.000 description 1
- DWGFLKQSGRUQTI-IHRRRGAJSA-N Pro-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H]1CCCN1 DWGFLKQSGRUQTI-IHRRRGAJSA-N 0.000 description 1
- WCNVGGZRTNHOOS-ULQDDVLXSA-N Pro-Lys-Tyr Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O WCNVGGZRTNHOOS-ULQDDVLXSA-N 0.000 description 1
- MLKVIVZCFYRTIR-KKUMJFAQSA-N Pro-Phe-Gln Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O MLKVIVZCFYRTIR-KKUMJFAQSA-N 0.000 description 1
- XYAFCOJKICBRDU-JYJNAYRXSA-N Pro-Phe-Val Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O XYAFCOJKICBRDU-JYJNAYRXSA-N 0.000 description 1
- LEIKGVHQTKHOLM-IUCAKERBSA-N Pro-Pro-Gly Chemical compound OC(=O)CNC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 LEIKGVHQTKHOLM-IUCAKERBSA-N 0.000 description 1
- QKDIHFHGHBYTKB-IHRRRGAJSA-N Pro-Ser-Phe Chemical compound N([C@@H](CO)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C(=O)[C@@H]1CCCN1 QKDIHFHGHBYTKB-IHRRRGAJSA-N 0.000 description 1
- YIPFBJGBRCJJJD-FHWLQOOXSA-N Pro-Trp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@@H]3CCCN3 YIPFBJGBRCJJJD-FHWLQOOXSA-N 0.000 description 1
- KHRLUIPIMIQFGT-AVGNSLFASA-N Pro-Val-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O KHRLUIPIMIQFGT-AVGNSLFASA-N 0.000 description 1
- VDHGTOHMHHQSKG-JYJNAYRXSA-N Pro-Val-Phe Chemical compound CC(C)[C@H](NC(=O)[C@@H]1CCCN1)C(=O)N[C@@H](Cc1ccccc1)C(O)=O VDHGTOHMHHQSKG-JYJNAYRXSA-N 0.000 description 1
- ZMLRZBWCXPQADC-TUAOUCFPSA-N Pro-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 ZMLRZBWCXPQADC-TUAOUCFPSA-N 0.000 description 1
- 102100038236 Probable ATP-dependent RNA helicase DDX53 Human genes 0.000 description 1
- 206010060862 Prostate cancer Diseases 0.000 description 1
- 208000000236 Prostatic Neoplasms Diseases 0.000 description 1
- 102100022632 Protein FAM171B Human genes 0.000 description 1
- 244000007021 Prunus avium Species 0.000 description 1
- 244000279064 Psophocarpus palustris Species 0.000 description 1
- 241000195350 Pterula multifida Species 0.000 description 1
- 102100033091 Putative C->U-editing enzyme APOBEC-4 Human genes 0.000 description 1
- 101000902592 Pyrococcus furiosus (strain ATCC 43587 / DSM 3638 / JCM 8422 / Vc1) DNA polymerase Proteins 0.000 description 1
- 238000003559 RNA-seq method Methods 0.000 description 1
- 241000700159 Rattus Species 0.000 description 1
- 108020004511 Recombinant DNA Proteins 0.000 description 1
- 208000015634 Rectal Neoplasms Diseases 0.000 description 1
- 241000725643 Respiratory syncytial virus Species 0.000 description 1
- 241000589180 Rhizobium Species 0.000 description 1
- 241000283984 Rodentia Species 0.000 description 1
- 206010039491 Sarcoma Diseases 0.000 description 1
- IYCBDVBJWDXQRR-FXQIFTODSA-N Ser-Ala-Met Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CCSC)C(O)=O IYCBDVBJWDXQRR-FXQIFTODSA-N 0.000 description 1
- PZZJMBYSYAKYPK-UWJYBYFXSA-N Ser-Ala-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O PZZJMBYSYAKYPK-UWJYBYFXSA-N 0.000 description 1
- CRZRTKAVUUGKEQ-ACZMJKKPSA-N Ser-Gln-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O CRZRTKAVUUGKEQ-ACZMJKKPSA-N 0.000 description 1
- DGHFNYXVIXNNMC-GUBZILKMSA-N Ser-Gln-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CO)N DGHFNYXVIXNNMC-GUBZILKMSA-N 0.000 description 1
- SMIDBHKWSYUBRZ-ACZMJKKPSA-N Ser-Glu-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O SMIDBHKWSYUBRZ-ACZMJKKPSA-N 0.000 description 1
- YQQKYAZABFEYAF-FXQIFTODSA-N Ser-Glu-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O YQQKYAZABFEYAF-FXQIFTODSA-N 0.000 description 1
- BRGQQXQKPUCUJQ-KBIXCLLPSA-N Ser-Glu-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BRGQQXQKPUCUJQ-KBIXCLLPSA-N 0.000 description 1
- KDGARKCAKHBEDB-NKWVEPMBSA-N Ser-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CO)N)C(=O)O KDGARKCAKHBEDB-NKWVEPMBSA-N 0.000 description 1
- RJHJPZQOMKCSTP-CIUDSAMLSA-N Ser-His-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(N)=O)C(O)=O RJHJPZQOMKCSTP-CIUDSAMLSA-N 0.000 description 1
- WEQAYODCJHZSJZ-KKUMJFAQSA-N Ser-His-Tyr Chemical compound C([C@H](NC(=O)[C@H](CO)N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CN=CN1 WEQAYODCJHZSJZ-KKUMJFAQSA-N 0.000 description 1
- RIAKPZVSNBBNRE-BJDJZHNGSA-N Ser-Ile-Leu Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O RIAKPZVSNBBNRE-BJDJZHNGSA-N 0.000 description 1
- GJFYFGOEWLDQGW-GUBZILKMSA-N Ser-Leu-Gln Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CO)N GJFYFGOEWLDQGW-GUBZILKMSA-N 0.000 description 1
- VZQRNAYURWAEFE-KKUMJFAQSA-N Ser-Leu-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 VZQRNAYURWAEFE-KKUMJFAQSA-N 0.000 description 1
- GVMUJUPXFQFBBZ-GUBZILKMSA-N Ser-Lys-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O GVMUJUPXFQFBBZ-GUBZILKMSA-N 0.000 description 1
- UGGWCAFQPKANMW-FXQIFTODSA-N Ser-Met-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O UGGWCAFQPKANMW-FXQIFTODSA-N 0.000 description 1
- BUYHXYIUQUBEQP-AVGNSLFASA-N Ser-Phe-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CO)N BUYHXYIUQUBEQP-AVGNSLFASA-N 0.000 description 1
- XVWDJUROVRQKAE-KKUMJFAQSA-N Ser-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CC1=CC=CC=C1 XVWDJUROVRQKAE-KKUMJFAQSA-N 0.000 description 1
- XQAPEISNMXNKGE-FXQIFTODSA-N Ser-Pro-Cys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CO)N)C(=O)N[C@@H](CS)C(=O)O XQAPEISNMXNKGE-FXQIFTODSA-N 0.000 description 1
- BVLGVLWFIZFEAH-BPUTZDHNSA-N Ser-Pro-Trp Chemical compound [H]N[C@@H](CO)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O BVLGVLWFIZFEAH-BPUTZDHNSA-N 0.000 description 1
- OZPDGESCTGGNAD-CIUDSAMLSA-N Ser-Ser-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CO OZPDGESCTGGNAD-CIUDSAMLSA-N 0.000 description 1
- VGQVAVQWKJLIRM-FXQIFTODSA-N Ser-Ser-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O VGQVAVQWKJLIRM-FXQIFTODSA-N 0.000 description 1
- BCAVNDNYOGTQMQ-AAEUAGOBSA-N Ser-Trp-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)NCC(O)=O BCAVNDNYOGTQMQ-AAEUAGOBSA-N 0.000 description 1
- JGUWRQWULDWNCM-FXQIFTODSA-N Ser-Val-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O JGUWRQWULDWNCM-FXQIFTODSA-N 0.000 description 1
- 108010071390 Serum Albumin Proteins 0.000 description 1
- 102000007562 Serum Albumin Human genes 0.000 description 1
- 241000165726 Solanum multiflorum Species 0.000 description 1
- 241000736131 Sphingomonas Species 0.000 description 1
- 241001645553 Sphingomonas melonis Species 0.000 description 1
- 208000000102 Squamous Cell Carcinoma of Head and Neck Diseases 0.000 description 1
- 208000034254 Squamous cell carcinoma of the cervix uteri Diseases 0.000 description 1
- 241001147691 Staphylococcus saprophyticus Species 0.000 description 1
- 208000005718 Stomach Neoplasms Diseases 0.000 description 1
- 241001468227 Streptomyces avermitilis Species 0.000 description 1
- 241000187181 Streptomyces scabiei Species 0.000 description 1
- 241000282898 Sus scrofa Species 0.000 description 1
- 201000009594 Systemic Scleroderma Diseases 0.000 description 1
- 206010042953 Systemic sclerosis Diseases 0.000 description 1
- 206010042971 T-cell lymphoma Diseases 0.000 description 1
- 208000027585 T-cell non-Hodgkin lymphoma Diseases 0.000 description 1
- 210000001744 T-lymphocyte Anatomy 0.000 description 1
- XSLXHSYIVPGEER-KZVJFYERSA-N Thr-Ala-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O XSLXHSYIVPGEER-KZVJFYERSA-N 0.000 description 1
- CAGTXGDOIFXLPC-KZVJFYERSA-N Thr-Arg-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CCCN=C(N)N CAGTXGDOIFXLPC-KZVJFYERSA-N 0.000 description 1
- MQBTXMPQNCGSSZ-OSUNSFLBSA-N Thr-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@@H](C)O)CCCN=C(N)N MQBTXMPQNCGSSZ-OSUNSFLBSA-N 0.000 description 1
- TZKPNGDGUVREEB-FOHZUACHSA-N Thr-Asn-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O TZKPNGDGUVREEB-FOHZUACHSA-N 0.000 description 1
- XPNSAQMEAVSQRD-FBCQKBJTSA-N Thr-Gly-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(=O)NCC(O)=O XPNSAQMEAVSQRD-FBCQKBJTSA-N 0.000 description 1
- QQWNRERCGGZOKG-WEDXCCLWSA-N Thr-Gly-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O QQWNRERCGGZOKG-WEDXCCLWSA-N 0.000 description 1
- YUPVPKZBKCLFLT-QTKMDUPCSA-N Thr-His-Val Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](C(C)C)C(=O)O)N)O YUPVPKZBKCLFLT-QTKMDUPCSA-N 0.000 description 1
- CRZNCABIJLRFKZ-IUKAMOBKSA-N Thr-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N CRZNCABIJLRFKZ-IUKAMOBKSA-N 0.000 description 1
- LCCSEJSPBWKBNT-OSUNSFLBSA-N Thr-Ile-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N LCCSEJSPBWKBNT-OSUNSFLBSA-N 0.000 description 1
- RFKVQLIXNVEOMB-WEDXCCLWSA-N Thr-Leu-Gly Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)NCC(=O)O)N)O RFKVQLIXNVEOMB-WEDXCCLWSA-N 0.000 description 1
- FLPZMPOZGYPBEN-PPCPHDFISA-N Thr-Leu-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FLPZMPOZGYPBEN-PPCPHDFISA-N 0.000 description 1
- KZSYAEWQMJEGRZ-RHYQMDGZSA-N Thr-Leu-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O KZSYAEWQMJEGRZ-RHYQMDGZSA-N 0.000 description 1
- NWECYMJLJGCBOD-UNQGMJICSA-N Thr-Phe-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O NWECYMJLJGCBOD-UNQGMJICSA-N 0.000 description 1
- STUAPCLEDMKXKL-LKXGYXEUSA-N Thr-Ser-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O STUAPCLEDMKXKL-LKXGYXEUSA-N 0.000 description 1
- NBIIPOKZPUGATB-BWBBJGPYSA-N Thr-Ser-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)O)N)O NBIIPOKZPUGATB-BWBBJGPYSA-N 0.000 description 1
- RVMNUBQWPVOUKH-HEIBUPTGSA-N Thr-Ser-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O RVMNUBQWPVOUKH-HEIBUPTGSA-N 0.000 description 1
- IEZVHOULSUULHD-XGEHTFHBSA-N Thr-Ser-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O IEZVHOULSUULHD-XGEHTFHBSA-N 0.000 description 1
- LXXCHJKHJYRMIY-FQPOAREZSA-N Thr-Tyr-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C)C(O)=O LXXCHJKHJYRMIY-FQPOAREZSA-N 0.000 description 1
- BZTSQFWJNJYZSX-JRQIVUDYSA-N Thr-Tyr-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O BZTSQFWJNJYZSX-JRQIVUDYSA-N 0.000 description 1
- PELIQFPESHBTMA-WLTAIBSBSA-N Thr-Tyr-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=C(O)C=C1 PELIQFPESHBTMA-WLTAIBSBSA-N 0.000 description 1
- 108010022394 Threonine synthase Proteins 0.000 description 1
- 208000000728 Thymus Neoplasms Diseases 0.000 description 1
- 208000024770 Thyroid neoplasm Diseases 0.000 description 1
- 239000007983 Tris buffer Substances 0.000 description 1
- PEYSVKMXSLPQRU-FJHTZYQYSA-N Trp-Ala-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N)O PEYSVKMXSLPQRU-FJHTZYQYSA-N 0.000 description 1
- PNKDNKGMEHJTJQ-BPUTZDHNSA-N Trp-Arg-Asn Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N PNKDNKGMEHJTJQ-BPUTZDHNSA-N 0.000 description 1
- KZIQDVNORJKTMO-WDSOQIARSA-N Trp-Arg-His Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC3=CN=CN3)C(=O)O)N KZIQDVNORJKTMO-WDSOQIARSA-N 0.000 description 1
- HJTYJQVRIQXMHM-XIRDDKMYSA-N Trp-Asp-Lys Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N HJTYJQVRIQXMHM-XIRDDKMYSA-N 0.000 description 1
- NKUIXQOJUAEIET-AQZXSJQPSA-N Trp-Asp-Thr Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@H](O)C)C(O)=O)=CNC2=C1 NKUIXQOJUAEIET-AQZXSJQPSA-N 0.000 description 1
- MHCLIYHJRXZBGJ-AAEUAGOBSA-N Trp-Gly-Cys Chemical compound N[C@@H](CC1=CNC2=CC=CC=C12)C(=O)NCC(=O)N[C@@H](CS)C(=O)O MHCLIYHJRXZBGJ-AAEUAGOBSA-N 0.000 description 1
- WHJVRIBYQWHRQA-NQCBNZPSSA-N Trp-Phe-Ile Chemical compound C([C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(O)=O)NC(=O)[C@@H](N)CC=1C2=CC=CC=C2NC=1)C1=CC=CC=C1 WHJVRIBYQWHRQA-NQCBNZPSSA-N 0.000 description 1
- PWPJLBWYRTVYQS-PMVMPFDFSA-N Trp-Phe-Leu Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O PWPJLBWYRTVYQS-PMVMPFDFSA-N 0.000 description 1
- IKUMWSDCGQVGHC-UMPQAUOISA-N Trp-Pro-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC2=CNC3=CC=CC=C32)N)O IKUMWSDCGQVGHC-UMPQAUOISA-N 0.000 description 1
- GEGYPBOPIGNZIF-CWRNSKLLSA-N Trp-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N)C(=O)O GEGYPBOPIGNZIF-CWRNSKLLSA-N 0.000 description 1
- AKXBNSZMYAOGLS-STQMWFEESA-N Tyr-Arg-Gly Chemical compound NC(N)=NCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 AKXBNSZMYAOGLS-STQMWFEESA-N 0.000 description 1
- AYPAIRCDLARHLM-KKUMJFAQSA-N Tyr-Asn-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O AYPAIRCDLARHLM-KKUMJFAQSA-N 0.000 description 1
- NRFTYDWKWGJLAR-MELADBBJSA-N Tyr-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N)C(=O)O NRFTYDWKWGJLAR-MELADBBJSA-N 0.000 description 1
- QUILOGWWLXMSAT-IHRRRGAJSA-N Tyr-Gln-Gln Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O QUILOGWWLXMSAT-IHRRRGAJSA-N 0.000 description 1
- UNUZEBFXGWVAOP-DZKIICNBSA-N Tyr-Glu-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O UNUZEBFXGWVAOP-DZKIICNBSA-N 0.000 description 1
- PMDWYLVWHRTJIW-STQMWFEESA-N Tyr-Gly-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=C(O)C=C1 PMDWYLVWHRTJIW-STQMWFEESA-N 0.000 description 1
- GULIUBBXCYPDJU-CQDKDKBSSA-N Tyr-Leu-Ala Chemical compound [O-]C(=O)[C@H](C)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]([NH3+])CC1=CC=C(O)C=C1 GULIUBBXCYPDJU-CQDKDKBSSA-N 0.000 description 1
- BSCBBPKDVOZICB-KKUMJFAQSA-N Tyr-Leu-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O BSCBBPKDVOZICB-KKUMJFAQSA-N 0.000 description 1
- DMWNPLOERDAHSY-MEYUZBJRSA-N Tyr-Leu-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O DMWNPLOERDAHSY-MEYUZBJRSA-N 0.000 description 1
- JXGUUJMPCRXMSO-HJOGWXRNSA-N Tyr-Phe-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 JXGUUJMPCRXMSO-HJOGWXRNSA-N 0.000 description 1
- SOAUMCDLIUGXJJ-SRVKXCTJSA-N Tyr-Ser-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O SOAUMCDLIUGXJJ-SRVKXCTJSA-N 0.000 description 1
- ZPFLBLFITJCBTP-QWRGUYRKSA-N Tyr-Ser-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)NCC(O)=O ZPFLBLFITJCBTP-QWRGUYRKSA-N 0.000 description 1
- MQGGXGKQSVEQHR-KKUMJFAQSA-N Tyr-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 MQGGXGKQSVEQHR-KKUMJFAQSA-N 0.000 description 1
- WQOHKVRQDLNDIL-YJRXYDGGSA-N Tyr-Thr-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O WQOHKVRQDLNDIL-YJRXYDGGSA-N 0.000 description 1
- HZDQUVQEVVYDDA-ACRUOGEOSA-N Tyr-Tyr-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CC=C(O)C=C1 HZDQUVQEVVYDDA-ACRUOGEOSA-N 0.000 description 1
- UUJHRSTVQCFDPA-UFYCRDLUSA-N Tyr-Tyr-Val Chemical compound C([C@@H](C(=O)N[C@@H](C(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CC=C(O)C=C1 UUJHRSTVQCFDPA-UFYCRDLUSA-N 0.000 description 1
- RGJZPXFZIUUQDN-BPNCWPANSA-N Tyr-Val-Ala Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O RGJZPXFZIUUQDN-BPNCWPANSA-N 0.000 description 1
- AEOFMCAKYIQQFY-YDHLFZDLSA-N Tyr-Val-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 AEOFMCAKYIQQFY-YDHLFZDLSA-N 0.000 description 1
- KLOZTPOXVVRVAQ-DZKIICNBSA-N Tyr-Val-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 KLOZTPOXVVRVAQ-DZKIICNBSA-N 0.000 description 1
- 208000006105 Uterine Cervical Neoplasms Diseases 0.000 description 1
- 201000005969 Uveal melanoma Diseases 0.000 description 1
- 102100038282 V-type immunoglobulin domain-containing suppressor of T-cell activation Human genes 0.000 description 1
- WOCYUGQDXPTQPY-FXQIFTODSA-N Val-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](C(C)C)N WOCYUGQDXPTQPY-FXQIFTODSA-N 0.000 description 1
- CVUDMNSZAIZFAE-TUAOUCFPSA-N Val-Arg-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N1CCC[C@@H]1C(=O)O)N CVUDMNSZAIZFAE-TUAOUCFPSA-N 0.000 description 1
- UDNYEPLJTRDMEJ-RCOVLWMOSA-N Val-Asn-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)NCC(=O)O)N UDNYEPLJTRDMEJ-RCOVLWMOSA-N 0.000 description 1
- IQQYYFPCWKWUHW-YDHLFZDLSA-N Val-Asn-Tyr Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N IQQYYFPCWKWUHW-YDHLFZDLSA-N 0.000 description 1
- ISERLACIZUGCDX-ZKWXMUAHSA-N Val-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N ISERLACIZUGCDX-ZKWXMUAHSA-N 0.000 description 1
- ZQGPWORGSNRQLN-NHCYSSNCSA-N Val-Asp-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N ZQGPWORGSNRQLN-NHCYSSNCSA-N 0.000 description 1
- YODDULVCGFQRFZ-ZKWXMUAHSA-N Val-Asp-Ser Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O YODDULVCGFQRFZ-ZKWXMUAHSA-N 0.000 description 1
- XEYUMGGWQCIWAR-XVKPBYJWSA-N Val-Gln-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)NCC(=O)O)N XEYUMGGWQCIWAR-XVKPBYJWSA-N 0.000 description 1
- VFOHXOLPLACADK-GVXVVHGQSA-N Val-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](C(C)C)N VFOHXOLPLACADK-GVXVVHGQSA-N 0.000 description 1
- XGJLNBNZNMVJRS-NRPADANISA-N Val-Glu-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O XGJLNBNZNMVJRS-NRPADANISA-N 0.000 description 1
- BRPKEERLGYNCNC-NHCYSSNCSA-N Val-Glu-Arg Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N BRPKEERLGYNCNC-NHCYSSNCSA-N 0.000 description 1
- ROLGIBMFNMZANA-GVXVVHGQSA-N Val-Glu-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](C(C)C)N ROLGIBMFNMZANA-GVXVVHGQSA-N 0.000 description 1
- YTUABZMPYKCWCQ-XQQFMLRXSA-N Val-His-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N2CCC[C@@H]2C(=O)O)N YTUABZMPYKCWCQ-XQQFMLRXSA-N 0.000 description 1
- BTWMICVCQLKKNR-DCAQKATOSA-N Val-Leu-Ser Chemical compound CC(C)[C@H]([NH3+])C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C([O-])=O BTWMICVCQLKKNR-DCAQKATOSA-N 0.000 description 1
- IJGPOONOTBNTFS-GVXVVHGQSA-N Val-Lys-Glu Chemical compound [H]N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O IJGPOONOTBNTFS-GVXVVHGQSA-N 0.000 description 1
- MLADEWAIYAPAAU-IHRRRGAJSA-N Val-Lys-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N MLADEWAIYAPAAU-IHRRRGAJSA-N 0.000 description 1
- IEBGHUMBJXIXHM-AVGNSLFASA-N Val-Lys-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)O)N IEBGHUMBJXIXHM-AVGNSLFASA-N 0.000 description 1
- HPANGHISDXDUQY-ULQDDVLXSA-N Val-Lys-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N HPANGHISDXDUQY-ULQDDVLXSA-N 0.000 description 1
- JVGHIFMSFBZDHH-WPRPVWTQSA-N Val-Met-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)NCC(=O)O)N JVGHIFMSFBZDHH-WPRPVWTQSA-N 0.000 description 1
- VIKZGAUAKQZDOF-NRPADANISA-N Val-Ser-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O VIKZGAUAKQZDOF-NRPADANISA-N 0.000 description 1
- PGQUDQYHWICSAB-NAKRPEOUSA-N Val-Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](C(C)C)N PGQUDQYHWICSAB-NAKRPEOUSA-N 0.000 description 1
- CEKSLIVSNNGOKH-KZVJFYERSA-N Val-Thr-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](C(C)C)N)O CEKSLIVSNNGOKH-KZVJFYERSA-N 0.000 description 1
- UEXPMFIAZZHEAD-HSHDSVGOSA-N Val-Thr-Trp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](C(C)C)N)O UEXPMFIAZZHEAD-HSHDSVGOSA-N 0.000 description 1
- SUGRIIAOLCDLBD-ZOBUZTSGSA-N Val-Trp-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CC(=O)O)C(=O)O)N SUGRIIAOLCDLBD-ZOBUZTSGSA-N 0.000 description 1
- VTIAEOKFUJJBTC-YDHLFZDLSA-N Val-Tyr-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N VTIAEOKFUJJBTC-YDHLFZDLSA-N 0.000 description 1
- GUIYPEKUEMQBIK-JSGCOSHPSA-N Val-Tyr-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)NCC(O)=O GUIYPEKUEMQBIK-JSGCOSHPSA-N 0.000 description 1
- GTACFKZDQFTVAI-STECZYCISA-N Val-Tyr-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)CC1=CC=C(O)C=C1 GTACFKZDQFTVAI-STECZYCISA-N 0.000 description 1
- 208000036142 Viral infection Diseases 0.000 description 1
- 241000589655 Xanthomonas citri Species 0.000 description 1
- 241000231754 Xanthomonas fragariae Species 0.000 description 1
- 230000002378 acidificating effect Effects 0.000 description 1
- 230000004913 activation Effects 0.000 description 1
- 230000001154 acute effect Effects 0.000 description 1
- 230000001919 adrenal effect Effects 0.000 description 1
- 229940008126 aerosol Drugs 0.000 description 1
- 229910000147 aluminium phosphate Inorganic materials 0.000 description 1
- 125000003277 amino group Chemical group 0.000 description 1
- 210000004102 animal cell Anatomy 0.000 description 1
- 238000010171 animal model Methods 0.000 description 1
- 239000003242 anti bacterial agent Substances 0.000 description 1
- 229940088710 antibiotic agent Drugs 0.000 description 1
- 239000003963 antioxidant agent Substances 0.000 description 1
- 235000006708 antioxidants Nutrition 0.000 description 1
- 239000007864 aqueous solution Substances 0.000 description 1
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 1
- 108010060035 arginylproline Proteins 0.000 description 1
- 229960005070 ascorbic acid Drugs 0.000 description 1
- 235000010323 ascorbic acid Nutrition 0.000 description 1
- 239000011668 ascorbic acid Substances 0.000 description 1
- 229960001230 asparagine Drugs 0.000 description 1
- 235000009582 asparagine Nutrition 0.000 description 1
- 230000037429 base substitution Effects 0.000 description 1
- 239000011230 binding agent Substances 0.000 description 1
- 206010005084 bladder transitional cell carcinoma Diseases 0.000 description 1
- 230000037396 body weight Effects 0.000 description 1
- 210000000988 bone and bone Anatomy 0.000 description 1
- 235000014633 carbohydrates Nutrition 0.000 description 1
- 125000003178 carboxy group Chemical group [H]OC(*)=O 0.000 description 1
- 238000004113 cell culture Methods 0.000 description 1
- 201000006662 cervical adenocarcinoma Diseases 0.000 description 1
- 201000010881 cervical cancer Diseases 0.000 description 1
- 201000006612 cervical squamous cell carcinoma Diseases 0.000 description 1
- 239000002738 chelating agent Substances 0.000 description 1
- 239000003153 chemical reaction reagent Substances 0.000 description 1
- 235000015111 chews Nutrition 0.000 description 1
- 208000006990 cholangiocarcinoma Diseases 0.000 description 1
- 210000003483 chromatin Anatomy 0.000 description 1
- 206010073251 clear cell renal cell carcinoma Diseases 0.000 description 1
- 238000003776 cleavage reaction Methods 0.000 description 1
- 238000011260 co-administration Methods 0.000 description 1
- 201000010897 colon adenocarcinoma Diseases 0.000 description 1
- 239000002131 composite material Substances 0.000 description 1
- 238000002856 computational phylogenetic analysis Methods 0.000 description 1
- 229910052802 copper Inorganic materials 0.000 description 1
- 239000010949 copper Substances 0.000 description 1
- 230000001054 cortical effect Effects 0.000 description 1
- 108010016616 cysteinylglycine Proteins 0.000 description 1
- 108010060199 cysteinylproline Proteins 0.000 description 1
- 230000009089 cytolysis Effects 0.000 description 1
- 108091009381 deaminase binding proteins Proteins 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 239000008367 deionised water Substances 0.000 description 1
- 229910021641 deionized water Inorganic materials 0.000 description 1
- 238000001514 detection method Methods 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 206010012818 diffuse large B-cell lymphoma Diseases 0.000 description 1
- 102000004419 dihydrofolate reductase Human genes 0.000 description 1
- 239000002552 dosage form Substances 0.000 description 1
- 230000034431 double-strand break repair via homologous recombination Effects 0.000 description 1
- 230000011559 double-strand break repair via nonhomologous end joining Effects 0.000 description 1
- 239000006196 drop Substances 0.000 description 1
- 239000003937 drug carrier Substances 0.000 description 1
- 206010013781 dry mouth Diseases 0.000 description 1
- 238000004520 electroporation Methods 0.000 description 1
- 210000002257 embryonic structure Anatomy 0.000 description 1
- 239000003623 enhancer Substances 0.000 description 1
- 229940032049 enterococcus faecalis Drugs 0.000 description 1
- 201000004101 esophageal cancer Diseases 0.000 description 1
- 150000002148 esters Chemical class 0.000 description 1
- 150000002170 ethers Chemical class 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 210000004700 fetal blood Anatomy 0.000 description 1
- 238000001506 fluorescence spectroscopy Methods 0.000 description 1
- IJJVMEJXYNJXOJ-UHFFFAOYSA-N fluquinconazole Chemical compound C=1C=C(Cl)C=C(Cl)C=1N1C(=O)C2=CC(F)=CC=C2N=C1N1C=NC=N1 IJJVMEJXYNJXOJ-UHFFFAOYSA-N 0.000 description 1
- 206010017758 gastric cancer Diseases 0.000 description 1
- 239000000499 gel Substances 0.000 description 1
- 239000008273 gelatin Substances 0.000 description 1
- 229920000159 gelatin Polymers 0.000 description 1
- 235000019322 gelatine Nutrition 0.000 description 1
- 235000011852 gelatine desserts Nutrition 0.000 description 1
- 230000002068 genetic effect Effects 0.000 description 1
- 238000012268 genome sequencing Methods 0.000 description 1
- 238000003205 genotyping method Methods 0.000 description 1
- 210000004602 germ cell Anatomy 0.000 description 1
- 208000005017 glioblastoma Diseases 0.000 description 1
- 239000008103 glucose Substances 0.000 description 1
- ZDXPYRJPNDTMRX-UHFFFAOYSA-N glutamine Natural products OC(=O)C(N)CCC(N)=O ZDXPYRJPNDTMRX-UHFFFAOYSA-N 0.000 description 1
- 108010042598 glutamyl-aspartyl-glycine Proteins 0.000 description 1
- 150000004676 glycans Chemical class 0.000 description 1
- 108010072405 glycyl-aspartyl-glycine Proteins 0.000 description 1
- 108010038983 glycyl-histidyl-lysine Proteins 0.000 description 1
- 108010077515 glycylproline Proteins 0.000 description 1
- 229940093915 gynecological organic acid Drugs 0.000 description 1
- 201000000459 head and neck squamous cell carcinoma Diseases 0.000 description 1
- 208000019691 hematopoietic and lymphoid cell neoplasm Diseases 0.000 description 1
- 206010073071 hepatocellular carcinoma Diseases 0.000 description 1
- 231100000844 hepatocellular carcinoma Toxicity 0.000 description 1
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 description 1
- 108010036413 histidylglycine Proteins 0.000 description 1
- 230000006801 homologous recombination Effects 0.000 description 1
- 238000002744 homologous recombination Methods 0.000 description 1
- 102000018358 immunoglobulin Human genes 0.000 description 1
- 229940072221 immunoglobulins Drugs 0.000 description 1
- 238000011534 incubation Methods 0.000 description 1
- 230000001939 inductive effect Effects 0.000 description 1
- 230000002401 inhibitory effect Effects 0.000 description 1
- 230000005764 inhibitory process Effects 0.000 description 1
- 230000003834 intracellular effect Effects 0.000 description 1
- BPHPUYQFMNQIOC-NXRLNHOXSA-N isopropyl beta-D-thiogalactopyranoside Chemical compound CC(C)S[C@@H]1O[C@H](CO)[C@H](O)[C@H](O)[C@H]1O BPHPUYQFMNQIOC-NXRLNHOXSA-N 0.000 description 1
- 239000000644 isotonic solution Substances 0.000 description 1
- 210000003292 kidney cell Anatomy 0.000 description 1
- 238000009533 lab test Methods 0.000 description 1
- 229940017800 lactobacillus casei Drugs 0.000 description 1
- 208000032839 leukemia Diseases 0.000 description 1
- 150000002632 lipids Chemical class 0.000 description 1
- 239000007788 liquid Substances 0.000 description 1
- 244000144972 livestock Species 0.000 description 1
- 239000007937 lozenge Substances 0.000 description 1
- 201000005249 lung adenocarcinoma Diseases 0.000 description 1
- 201000005243 lung squamous cell carcinoma Diseases 0.000 description 1
- 208000019420 lymphoid neoplasm Diseases 0.000 description 1
- 239000006166 lysate Substances 0.000 description 1
- 108010012988 lysyl-glutamyl-aspartyl-glycine Proteins 0.000 description 1
- 108010009298 lysylglutamic acid Proteins 0.000 description 1
- 238000012423 maintenance Methods 0.000 description 1
- 208000015486 malignant pancreatic neoplasm Diseases 0.000 description 1
- 239000000594 mannitol Substances 0.000 description 1
- 230000001404 mediated effect Effects 0.000 description 1
- 201000001441 melanoma Diseases 0.000 description 1
- 210000005033 mesothelial cell Anatomy 0.000 description 1
- 108010056582 methionylglutamic acid Proteins 0.000 description 1
- 230000002438 mitochondrial effect Effects 0.000 description 1
- 150000002772 monosaccharides Chemical class 0.000 description 1
- 210000003205 muscle Anatomy 0.000 description 1
- 239000013642 negative control Substances 0.000 description 1
- 210000002569 neuron Anatomy 0.000 description 1
- 239000002736 nonionic surfactant Substances 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 150000007524 organic acids Chemical class 0.000 description 1
- 235000005985 organic acids Nutrition 0.000 description 1
- 201000008968 osteosarcoma Diseases 0.000 description 1
- 201000002528 pancreatic cancer Diseases 0.000 description 1
- 208000008443 pancreatic carcinoma Diseases 0.000 description 1
- 208000007312 paraganglioma Diseases 0.000 description 1
- 210000004976 peripheral blood cell Anatomy 0.000 description 1
- 210000003819 peripheral blood mononuclear cell Anatomy 0.000 description 1
- 239000008194 pharmaceutical composition Substances 0.000 description 1
- 239000000546 pharmaceutical excipient Substances 0.000 description 1
- 108010024607 phenylalanylalanine Proteins 0.000 description 1
- 208000028591 pheochromocytoma Diseases 0.000 description 1
- 239000002504 physiological saline solution Substances 0.000 description 1
- 230000008488 polyadenylation Effects 0.000 description 1
- 208000005987 polymyositis Diseases 0.000 description 1
- 229920005862 polyol Polymers 0.000 description 1
- 150000003077 polyols Chemical class 0.000 description 1
- 235000010482 polyoxyethylene sorbitan monooleate Nutrition 0.000 description 1
- 229920001282 polysaccharide Polymers 0.000 description 1
- 239000005017 polysaccharide Substances 0.000 description 1
- 229920000053 polysorbate 80 Polymers 0.000 description 1
- 229940098458 powder spray Drugs 0.000 description 1
- 238000001556 precipitation Methods 0.000 description 1
- 239000003755 preservative agent Substances 0.000 description 1
- 238000011321 prophylaxis Methods 0.000 description 1
- 239000011535 reaction buffer Substances 0.000 description 1
- 206010038038 rectal cancer Diseases 0.000 description 1
- 201000001275 rectum cancer Diseases 0.000 description 1
- 230000009467 reduction Effects 0.000 description 1
- 230000008263 repair mechanism Effects 0.000 description 1
- 230000010076 replication Effects 0.000 description 1
- 206010039073 rheumatoid arthritis Diseases 0.000 description 1
- 229920002477 rna polymer Polymers 0.000 description 1
- 238000010079 rubber tapping Methods 0.000 description 1
- 238000007480 sanger sequencing Methods 0.000 description 1
- 230000007017 scission Effects 0.000 description 1
- 238000012216 screening Methods 0.000 description 1
- 230000003248 secreting effect Effects 0.000 description 1
- 210000003491 skin Anatomy 0.000 description 1
- 239000011780 sodium chloride Substances 0.000 description 1
- 239000002904 solvent Substances 0.000 description 1
- 210000001082 somatic cell Anatomy 0.000 description 1
- 239000000600 sorbitol Substances 0.000 description 1
- 108010005652 splenotritin Proteins 0.000 description 1
- 239000007921 spray Substances 0.000 description 1
- 239000003381 stabilizer Substances 0.000 description 1
- 201000011549 stomach cancer Diseases 0.000 description 1
- 125000001424 substituent group Chemical group 0.000 description 1
- 150000005846 sugar alcohols Chemical class 0.000 description 1
- 239000000829 suppository Substances 0.000 description 1
- 239000004094 surface-active agent Substances 0.000 description 1
- 208000024891 symptom Diseases 0.000 description 1
- 208000011580 syndromic disease Diseases 0.000 description 1
- 108700029760 synthetic LTSP Proteins 0.000 description 1
- 201000000596 systemic lupus erythematosus Diseases 0.000 description 1
- 208000002918 testicular germ cell tumor Diseases 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- 229940124597 therapeutic agent Drugs 0.000 description 1
- 201000009377 thymus cancer Diseases 0.000 description 1
- 201000002510 thyroid cancer Diseases 0.000 description 1
- 229940098465 tincture Drugs 0.000 description 1
- 230000005030 transcription termination Effects 0.000 description 1
- 238000010361 transduction Methods 0.000 description 1
- 230000026683 transduction Effects 0.000 description 1
- 108700004896 tripeptide FEG Proteins 0.000 description 1
- LENZDBCJOHFCAS-UHFFFAOYSA-N tris Chemical compound OCC(N)(CO)CO LENZDBCJOHFCAS-UHFFFAOYSA-N 0.000 description 1
- 108010080629 tryptophan-leucine Proteins 0.000 description 1
- 210000004881 tumor cell Anatomy 0.000 description 1
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 1
- 108010017949 tyrosyl-glycyl-glycine Proteins 0.000 description 1
- 108010078580 tyrosylleucine Proteins 0.000 description 1
- 241000701161 unidentified adenovirus Species 0.000 description 1
- 241001529453 unidentified herpesvirus Species 0.000 description 1
- 241000712461 unidentified influenza virus Species 0.000 description 1
- 208000037965 uterine sarcoma Diseases 0.000 description 1
- 108010009962 valyltyrosine Proteins 0.000 description 1
- 239000003981 vehicle Substances 0.000 description 1
- 238000012795 verification Methods 0.000 description 1
- 238000012070 whole genome sequencing analysis Methods 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases RNAses, DNAses
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K38/00—Medicinal preparations containing peptides
- A61K38/16—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- A61K38/43—Enzymes; Proenzymes; Derivatives thereof
- A61K38/46—Hydrolases (3)
- A61K38/465—Hydrolases (3) acting on ester bonds (3.1), e.g. lipases, ribonucleases
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K38/00—Medicinal preparations containing peptides
- A61K38/16—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- A61K38/43—Enzymes; Proenzymes; Derivatives thereof
- A61K38/46—Hydrolases (3)
- A61K38/50—Hydrolases (3) acting on carbon-nitrogen bonds, other than peptide bonds (3.5), e.g. asparaginase
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K48/00—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
- A61K48/005—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'active' part of the composition delivered, i.e. the nucleic acid delivered
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P31/00—Antiinfectives, i.e. antibiotics, antiseptics, chemotherapeutics
- A61P31/04—Antibacterial agents
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P31/00—Antiinfectives, i.e. antibiotics, antiseptics, chemotherapeutics
- A61P31/12—Antivirals
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P35/00—Antineoplastic agents
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P37/00—Drugs for immunological or allergic disorders
- A61P37/02—Immunomodulators
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/65—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression using markers
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/85—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N5/00—Undifferentiated human, animal or plant cells, e.g. cell lines; Tissues; Cultivation or maintenance thereof; Culture media therefor
- C12N5/06—Animal cells or tissues; Human cells or tissues
- C12N5/0602—Vertebrate cells
- C12N5/0684—Cells of the urinary tract or kidneys
- C12N5/0686—Kidney cells
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/78—Hydrolases (3) acting on carbon to nitrogen bonds other than peptide bonds (3.5)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y305/00—Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5)
- C12Y305/04—Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5) in cyclic amidines (3.5.4)
- C12Y305/04001—Cytosine deaminase (3.5.4.1)
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/01—Fusion polypeptide containing a localisation/targetting motif
- C07K2319/09—Fusion polypeptide containing a localisation/targetting motif containing a nuclear localisation signal
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/80—Fusion polypeptide containing a DNA binding domain, e.g. Lacl or Tet-repressor
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2510/00—Genetically modified cells
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2800/00—Nucleic acids vectors
- C12N2800/10—Plasmid DNA
- C12N2800/106—Plasmid DNA for vertebrates
- C12N2800/107—Plasmid DNA for vertebrates for mammalian
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Engineering & Computer Science (AREA)
- Genetics & Genomics (AREA)
- Organic Chemistry (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Health & Medical Sciences (AREA)
- Wood Science & Technology (AREA)
- Zoology (AREA)
- Medicinal Chemistry (AREA)
- Biotechnology (AREA)
- Biomedical Technology (AREA)
- General Engineering & Computer Science (AREA)
- Pharmacology & Pharmacy (AREA)
- Public Health (AREA)
- Veterinary Medicine (AREA)
- Animal Behavior & Ethology (AREA)
- Molecular Biology (AREA)
- Biochemistry (AREA)
- Microbiology (AREA)
- Immunology (AREA)
- Chemical Kinetics & Catalysis (AREA)
- General Chemical & Material Sciences (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Epidemiology (AREA)
- Biophysics (AREA)
- Gastroenterology & Hepatology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Urology & Nephrology (AREA)
- Plant Pathology (AREA)
- Physics & Mathematics (AREA)
- Communicable Diseases (AREA)
- Oncology (AREA)
- Cell Biology (AREA)
- Virology (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
Abstract
本发明属于生物技术领域,涉及一种高效特异实现碱基颠换的编辑系统、方法及用途。本发明所述融合蛋白自N端至C端依次包括第一Cas9切口酶片段、嵌合插入片段、第二Cas9切口酶片段,所述嵌合插入片段包含脱氨酶片段和尿嘧啶DNA结合蛋白片段。本发明提供的融合蛋白结合相应的引导RNA,可以对靶位点中的C颠换成G。本发明编辑系统或方法可以在多种细胞中高效实现C‑G碱基的颠换,为修复致病突变,研究基因功能以及改善细胞功能等提供了有效的工具,具备良好的应用前景。
Description
技术领域
本发明属于生物技术领域,特别是涉及一种高效特异实现碱基颠换的编辑系统、方法及用途。
背景技术
随着基因组测序技术的普及,改造基因组是后基因组时代的重要内容。以类转录激活因子效应物核酸酶(TALEN)、锌指核酸酶(ZFN)为代表的基因编辑技术使得改造基因序列变得简单,CRISPR/Cas技术的出现更是极大地加速了这一过程。以CRISPR/Cas9为代表的基因编辑技术,由于具备简单、易操作、高效、普适性等特点迅速成为分子生物学领域的常用技术。基因编辑技术的出现加速了基因治疗以及细胞治疗的进程。
CRISPR/Cas9改造DNA的过程是通过引入双链断裂,利用细胞内的修复机制,进行非同源末端链接(NHEJ)的修复和以同源模板的同源重组修复(HDR)。而目前多种研究表明双链断裂会富集p53突变的细胞,进而有癌变的可能。以双链断裂为基础的DNA改造在临床的应用方面面临着巨大的挑战。
在CRISPR/Cas基础上通过添加不同的功能元件,目前可以实现了包括基因表达激活、抑制、表观调控、碱基编辑等操作,极大地丰富了基因编辑工具库。此种类型的基因编辑或者基因调控方法不需要利用双链断裂,即可实现改变细胞功能的目的。特别是碱基编辑技术,其不同于以双链断裂为基础的同源重组技术,其诱导点突变的效率得到显著提高。研究表明在胚胎中碱基编辑技术甚至可以实现100%的编辑。目前的碱基编辑技术主要是C转换为T的胞嘧啶碱基编辑技术(CBE)和A转换为G的腺嘌呤碱基编辑技术(ABE)。碱基编辑技术因为引入了脱氨酶,存在脱靶的风险。研究表明胞嘧啶脱氨酶可以在DNA和RNA水平均引起脱靶,这种脱靶是不依赖sgRNA的。脱氨酶随机的结合到单链DNA或RNA上是这种脱靶发生的原因。有效控制脱氨酶与单链DNA或者RNA的结合是降低脱靶的有效手段。已报道的手段有对脱氨酶进行突变,调控脱氨酶的表达等。
目前两种碱基编辑方式都是在嘧啶与嘧啶或者嘌呤与嘌呤直接的转换,而嘌呤与嘧啶直接的颠换目前还没有一种高效、特异的方法。胞嘧啶脱氨以后形成尿嘧啶,以尿嘧啶为模板进行修复的时候,尿嘧啶将变成胸腺嘧啶。如果胞嘧啶颠换成鸟嘌呤,需要将尿嘧啶切除,形成一个无碱基的位点(AP位点)。最后DNA修复时在这个AP位点插入鸟嘌呤,也可能插入腺嘌呤或者胸腺嘧啶。对于尿嘧啶的切除效果将显著影响修复的结果。
胞嘧啶碱基编辑器是实现C-T的转换,其加入了UGI抑制尿嘧啶糖基化酶(UNG)的作用。目前报道的碱基颠换工具主要是在胞嘧啶碱基编辑器的基础上去掉了UGI。已有报道指出添加DNA糖基化酶有助于提高C-G比例。来自UNG的同源物UdgX蛋白展示出了较好的编辑效果。目前报道的UdgX来自耻垢分枝杆菌(Mycobacterium smegmatis),但其作用效果仍然存在位点选择性等问题。
基于此,开发更加高效、特异的碱基颠换的方法有助于实现更多种类的碱基替换,进而对基因与细胞治疗的进展起到促进作用。
发明内容
鉴于上述现有技术的缺点,本发明的目的在于提供一种高效、特异进行碱基颠换的碱基编辑蛋白、编辑系统、方法及其用途。
本发明的目的之一是提供一种融合蛋白,所述融合蛋白自N端至C端依次包括第一Cas9切口酶片段、嵌合插入片段、第二Cas9切口酶片段,所述嵌合插入片段包含脱氨酶片段和尿嘧啶DNA结合蛋白片段。
本发明的另一目的是提供一种分离的多核苷酸,其编码如上所述的融合蛋白。
本发明的另一目的是提供一种构建体,所述构建体含有如上所述的分离的多核苷酸。
本发明的另一目的是提供一种表达系统,所述表达系统含有如上所述的构建体或基因组中整合有外源的如上所述的多核苷酸。
本发明的另一目的是提供一种碱基编辑系统,包括如上所述的融合蛋白;优选地,所述碱基编辑系统还包括sgRNA。
本发明的另一目的是提供如上所述的融合蛋白、分离的多核苷酸、构建体、表达系统、碱基编辑系统在基因编辑中的用途。
本发明的另一目的是提供一种基因编辑方法,包括:通过如上所述的融合蛋白、分离的多核苷酸、构建体、表达系统或碱基编辑系统对靶标序列进行基因编辑。
本发明的另一目的是提供一种报告系统,所述报告系统包含如SEQ ID NO.31所示的核苷酸序列。当SEQ ID NO.31所示的核苷酸序列突变为SEQ ID NO.30时,所述报告系统呈现蓝色荧光;当SEQ ID NO.31所示的核苷酸序列突变为SEQ ID NO.32时,所述报告系统呈现绿色荧光。当SEQ ID NO.31所示的核苷酸序列突变为SEQ ID NO.33时,所述报告系统无荧光变化。在一优选实施方式中,所述报告系统包含质粒,所述质粒包含如SEQ ID NO.31所示的核苷酸序列,所述质粒的核苷酸序列如SEQ ID NO.2所示。
本发明的另一目的是提供一种新型的尿嘧啶结合蛋白,其作用是结合尿嘧啶,促进DNA的修复;所述尿嘧啶DNA结合蛋白选自来自微生物的Udgx家族的酶或其变体。
本发明的另一目的是提供如上所述的尿嘧啶DNA结合蛋白在用于构建碱基编辑蛋白或碱基编辑系统中的用途;所述碱基编辑蛋白或碱基编辑系统用于实现C至G或G至C的颠换。
附图说明
图1示出了荧光报告系统模式图(下方图中,横坐标从左至右依次为100、101、102、103、104、105,纵坐标从下至上依次为100、101、102、103、104、105)。
图2示出了三种碱基编辑蛋白结构示意图。
图3示出了三种碱基编辑蛋白对荧光比例的影响。
图4示出了不同来源的UdgX蛋白进化树分析。
图5示出了不同来源UdgX编辑蛋白对荧光比例的影响。
图6示出了利用转座酶将APOBEC-UdgX随机插入到Cas9蛋白中间的模式图。
图7示出了不同插入位点对报告系统编辑效率的影响。
图8示出了CE-CGBE-ancApobec1与CE-CGBE-A3A编辑蛋白模式图。
图9示出了C-G编辑蛋白对内源基因的编辑一代测序峰图。(A)展示了CE-CGBE-A3A的编辑效果,(B)展示了CE-CGBE-ancApobec1的编辑效果。
图10示出了CE-CGBE-ancApobec1与CE-CGBE-A3A对内源基因编辑效率分析。
图11示出了CE-CGBE-A3A在RNA与DNA水平脱靶分析;A为DNA水平脱靶情况,B为RNA水平脱靶情况。
具体实施方式
下面将结合实施例对本发明的实施方案进行清楚、完整的描述,显然,所描述的实施例仅用于说明本发明的一部分实施例,而不应视为限制本发明的范围。实施例中未注明具体条件者,按照常规条件或制造商建议的条件进行。所用试剂或仪器未注明生产厂商者,视为可以通过市售购买获得的常规产品。
本发明发明人经过大量研究探索,发现在nCas9蛋白内部的合适位置嵌合脱氨酶和尿嘧啶结合蛋白,能够构建适宜CG碱基颠换的碱基编辑器或碱基编辑系统,并且极大地降低在RNA和DNA上的脱靶情况,在此基础上完成了本发明。
本发明一方面提供了一种融合蛋白,所述融合蛋白自N端至C端依次包括第一Cas9切口酶片段、嵌合插入片段、第二Cas9切口酶片段,所述嵌合插入片段包含脱氨酶片段和尿嘧啶DNA结合蛋白片段。
本发明所述融合蛋白中,所述第一Cas9切口酶片段的氨基酸序列包括以下片段或其变体:如SEQ ID NO.34所示的片段(对应Cas9切口酶nCas9的第1位至第1003位氨基酸片段);或如SEQ ID NO.35所示的片段(对应Cas9切口酶nCas9的第1位至第1027位氨基酸片段);或,如SEQ ID NO.36所示的片段(对应Cas9切口酶nCas9的第1位至第1039位氨基酸片段)。
本发明所述融合蛋白中,所述第二Cas9切口酶片段的氨基酸序列包括以下片段或其变体:如SEQ ID NO.37所示的片段(对应Cas9切口酶nCas9的第1004位至C端氨基酸片段);或如SEQ ID NO.38所示的片段(对应Cas9切口酶nCas9的第1028位至C端氨基酸片段);或如SEQ ID NO.39所示的片段(对应Cas9切口酶nCas9的第1040位至C端氨基酸片段)。
本发明所述融合蛋白中,当所述Cas9切口酶的N端是从起始密码子至第1003位氨基酸的片段或其变体,相应的Cas9切口酶的C端是从第1004位氨基酸至Cas9切口酶的C末端的片段或其变体;当所述Cas9切口酶的N端是从起始密码子至第1027位氨基酸的片段或其变体,相应的Cas9切口酶的C端是从第1028位氨基酸至Cas9切口酶的C末端的片段或其变体;当所述Cas9切口酶的N端是从起始密码子至第1039位氨基酸的片段或其变体,相应的Cas9切口酶的C端是从第1040位氨基酸至Cas9切口酶的C末端的片段或其变体。所述变体是指与原始序列相比具有80%以上序列相似性、且具有原始序列的功能的氨基酸片段,
本发明所述融合蛋白中,所述胞嘧啶脱氨酶的作用是对靶位点上的C脱氨形成U;所述脱氨酶选自胞嘧啶脱氨酶或其变体,所述胞嘧啶脱氨酶选自:1)APOBEC1、APOBEC2、APOBEC3A、APOBEC3B、APOBEC3C、APOBEC3D、APOBEC3F、APOBEC3G、APOBEC3H、APOBEC4、激活诱导的脱氨酶(AID)和pmCDA1;APOBEC家族;优选地,所述胞嘧啶脱氨酶选自ancAPOBEC1、Apobec3A。
本发明所述融合蛋白中,所述尿嘧啶DNA结合蛋白选自来自微生物的Udgx家族的酶或其变体;优选地,所述微生物选自类胞内分枝杆菌(Mycobacteriumparaintracellulare,NCBI Reference Sequence:WP_014385600.1)、少动鞘氨醇单胞菌(Sphingomonas paucimobilis,NCBI Reference Sequence:WP_007405629.1)、球形红假单胞菌(Cereibacter sphaeroides,NCBI Reference Sequence:WP_012643796.1)、柑桔溃疡病菌(Xanthomonas citri,NCBI Reference Sequence:WP_011052458.1)、苜蓿中华根瘤菌(Sinorhizobium meliloti,NCBI Reference Sequence:WP_010976097.1)、皮诺卡氏菌(Nocardia farcinica,NCBI Reference Sequence:WP_011209261.1)、盖尔森基兴奴卡氏菌(Nocardia cyriacigeorgica,NCBI Reference Sequence:WP_014350460.1)、哥伦布分枝杆菌(Mycobacterium colombiense,NCBI Reference Sequence:WP_007772952.1)、贪铜菌(Cupriavidus taiwanensis,NCBI Reference Sequence:WP_012351902.1)、阿氟曼链霉菌(Streptomyces scabiei,NCBI Reference Sequence:WP_012998863.1)、欺诈伯克霍尔德菌(Burkholderia dolosa,NCBI Reference Sequence:WP_006767162.1)、梅洛尼鞘氨醇单胞菌(Sphingomonas melonis,NCBI Reference Sequence:WP_017980260.1)、沼泽红假单胞菌(Rhodopseudomonas palustris,NCBI Reference Sequence:WP_011157305.1)、耻垢分枝杆菌(Mycolicibacterium smegmatis,NCBI Reference Sequence:WP_011726794.1)、胞内分枝杆菌(Mycobacterium intracellulare,NCBI ReferenceSequence:WP_009956825.1)、多噬伯克霍尔德菌(Burkholderia multivorans,NCBIReference Sequence:WP_006411759.1)、慢生型大豆根瘤菌(Bradyrhizobium japonicum,NCBI Reference Sequence:WP_014496800.1)、高效固氮慢生根瘤菌(Bradyrhizobiumdiazoefficiens,NCBI Reference Sequence:WP_011085807.1)、草莓角斑病菌(Xanthomonas fragariae,NCBI Reference Sequence:WP_011085807.1);进一步优选地,所述微生物为皮诺卡氏菌。
本发明所述融合蛋白中,所述融合蛋白还包括核定位信号片段,所述核定位信号片段可以位于融合蛋白的N端,也可以位于融合蛋白的C端,或是同时位于融合蛋白的N端和C端;当位于N端时,所述核定位信号片段的N端与第一Cas9切口酶的C端连接;当位于C端时,所述核定位信号片段的C端与第一Cas9切口酶的N端连接。
其中,所述核定位信号片段优选为BPNLS或其变体,所述变体与BPNLS具有80%以上序列相似性且具有BPNLS的功能。
本发明所述融合蛋白中,所述融合蛋白还包括柔性连接肽片段。即,在第一Cas9切口酶片段与嵌合插入片段之间,嵌合插入片段与第二Cas9切口酶片段,嵌合插入片段中脱氨酶片段与尿嘧啶DNA结合蛋白片段之间,核定位信号与第一Cas9切口酶片段或第二Cas9切口酶片段之间均可以通过柔性连接肽连接。当所述融合蛋白还有多个柔性连接肽片段时,所述多个柔性连接肽片段可以是相同或不同的。优选的,所述柔性连接肽片段的氨基酸序列包括SEQ ID NO.40:SGSETPGTSESATPESGS、SEQ ID NO.41:SGSGSETPGTSESATPES;
本发明所述融合蛋白中,所述胞嘧啶脱氨酶片段、尿嘧啶DNA结合蛋白嵌合入Cas9切口酶中,所述胞嘧啶脱氨酶片段和尿嘧啶DNA结合蛋白在融合蛋白中的顺序不作限制,可以是胞嘧啶脱氨酶片段与第一Cas9切口酶片段连接,也可以是尿嘧啶DNA结合蛋白与第一Cas9切口酶片段连接。优选地,所述融合蛋白依次包括第一Cas9切口酶片段、脱氨酶片段、尿嘧啶DNA结合蛋白片段、第二Cas9切口酶片段。
在一些实施方式中,本发明所述融合蛋白自N端到C端依次是核定位信号片段、第一Cas9切口酶片段、尿嘧啶DNA结合蛋白、胞嘧啶脱氨酶片段、第二Cas9切口酶片段和核定位信号片段。
在一些实施方式中,本发明所述融合蛋白自N端到C端依次是核定位信号片段、第一Cas9切口酶片段、胞嘧啶脱氨酶片段、尿嘧啶DNA结合蛋白、第二Cas9切口酶片段和核定位信号片段。
在另一些优选实施方式中,所述融合蛋白的氨基酸序列如SEQ ID No.11、SEQ IDNo.12、SEQ ID No.13至少任一所示;其编码多核苷酸分别如SEQ ID NO.42、SEQ ID NO.43、SEQ ID NO.44所示。
本发明的第二方面是提供一种分离的多核苷酸,其编码本发明第一方面所提供的融合蛋白。
本发明的第三方面是提供一种构建体,所述构建体含有本发明第二方面所提供的分离的多核苷酸。所述构建体通常可以通过将所述分离的多核苷酸插入合适的表达载体中构建获得,本领域技术人员可选择合适的表达载体。
本发明的第四方面是提供一种表达系统,所述表达系统含有本发明第三方面所述的构建体或基因组中整合有外源的本发明第二方面所述的分离的多核苷酸。所述表达系统可以是宿主细胞,所述宿主细胞可以表达如上所述的融合蛋白,所述融合蛋白可以与sgRNA相配合,从而可以将所述融合蛋白定位到目标区域,实现目标区域的碱基编辑。所述宿主细胞选自真核细胞或原核细胞;优选地,所述宿主细胞选自小鼠细胞、人细胞,例如是NK细胞、T细胞、脑神经瘤细胞、胚胎肾细胞、宫颈癌细胞、结肠癌细胞、骨肉瘤细胞等。
本发明的第五方面是提供一种碱基编辑系统,其包括如上所述的融合蛋白。优选地,所述碱基编辑系统还包括引导RNA,所述融合蛋白在引导RNA(sgRNA)作用下对靶位点实现C至G或G至C的颠换。本领域技术人员可以根据基因的目标编辑区域,选择合适的靶向特异性位点的sgRNA。例如,所述sgRNA的序列通常可以与目标区域至少部分互补,从而可以与所述融合蛋白相配合,将所述融合蛋白定位到目标区域,实现靶点区域内的碱基编辑,例如,可以是胞嘧啶脱氨基反应,即将胞嘧啶(cytosin,C)脱氨产生胸腺嘧啶(thymine,T),进而通过Udgx将尿嘧啶切除产生AP位点,并通过基因修复插入鸟嘌呤,实现C至G的编辑(相对应的互补链上实现G至C的编辑)。
本发明所述碱基编辑系统中,所述碱基编辑系统包括:a)所述融合蛋白或其编码多核苷酸,b)引导RNA核苷酸序列或其编码多核苷酸。所述引导RNA将所述融合蛋白靶向靶标序列中的靶C碱基。本发明所述碱基编辑系统用于碱基编辑时,所述靶标序列可以与a)、b)分别接触,或与a)和b)形成的RNP复合物接触。
在一些实施方式中,所述碱基编辑系统包含一个或多个载体;所述一个或多个载体包含(i)第一调控元件,所述第一调控元件可操作地连接至所述融合蛋白的编码多核苷酸;以及(ii)第二调控元件,所述第二调控元件可操作地连接至所述引导RNA核苷酸序列的编码多核苷酸;所述(i)和(ii)位于相同或不同载体上。
在一些实施方式中,所述碱基编辑系统包含(i)融合蛋白,以及(ii)包含所述引导核苷酸序列的编码多核苷酸的载体。所述碱基编辑系统用于碱基编辑时,所述靶标序列可以与i)、ii)分别接触,或与i)和ii)形成的RNP复合物接触。
本发明第六方面是提供如上所述的融合蛋白、分离的多核苷酸、构建体、表达系统或碱基编辑系统在基因编辑中的用途。具体地,所述基因编辑是指碱基颠换;优选地,所述基因编辑实现C至G或G至C的编辑。所述基因编辑用于实现模型构建(例如,疾病模型、细胞模型、动物模型等)、致病位点的修正、基因功能研究、增强细胞功能、细胞治疗等。例如,所述的融合蛋白、分离的多核苷酸、构建体、表达系统或碱基编辑系统用于以下疾病治病位点的修正:自身免疫性疾病、肿瘤、病毒感染性疾病、细菌感染性疾病,进而,实现自身免疫性疾病、肿瘤、病毒感染性疾病、细菌感染性疾病的预防和/或治疗。
本发明第七方面是提供一种碱基编辑方法,包括:通过本发明第一方面所提供的融合蛋白、或本发明第五方面所提供的碱基编辑系统进行基因编辑。例如,所述基因编辑方法可以包括:在适当条件下培养本发明第四方面所提供的表达系统,从而表达所述融合蛋白,所述融合蛋白可以在与其配合的靶向目标区域的sgRNA存在的条件下,对靶标区域进行碱基编辑。提供所述sgRNA存在的条件的方法对于本领域技术人员来说应该是已知的,例如,可以是在适当条件下培养能够表达所述sgRNA的表达系统,所述表达系统可以是包括含有编码所述sgRNA的多核苷酸的表达载体的宿主细胞、或染色体中整合有编码所述sgRNA的多核苷酸的宿主细胞。在本发明一具体实施例中,所述sgRNA与所述融合蛋白可以在同一宿主细胞中表达,所述宿主细胞可以为靶细胞。本发明所述方法适用的基因编辑的对象没有特别的限制,可以在体外或体内实施。在一些实施方式中,所述方法在体外实施;优选地,所述方法在培养的细胞中实施;可以在体细胞或生殖细胞,可以是动物细胞或人细胞;例如所述细胞为自然杀伤细胞即NK细胞;所述NK细胞选自来自外周血细胞的原代NK细胞(优选来自PBMC的原代NK细胞)、脐带血来源NK细胞、胚胎干细胞(ESC)、诱导多能干细胞(ips)或诱导多能干细胞(ips)诱导的NK细胞。在一些实施方式中,所述方法在体内实施;优选地,所述方法在哺乳动物中实施;进一步优选地,所述哺乳动物是啮齿动物;更进一步优选地,所述哺乳动物是人。例如所述方法靶向哺乳动物的NK细胞。
可以使用基因递送载剂递送本文所述的多核苷酸至细胞或组织。本文所用“基因递送”、“基因转移”、“转导”等,其指代将外源性多核苷酸导入宿主细胞,如载体介导的基因转移(通过,例如,病毒感染/转染,或各种其他基于蛋白质或基于脂质的基因递送复合物)以及协助“裸”多核苷酸递送的技术(如电穿孔,“基因枪”递送以及用于导入多核苷酸的各种其他技术)。导入的多核苷酸可以稳定地或瞬时地维持在宿主细胞中。稳定维持通常需要导入的多核苷酸包含与宿主细胞相容的复制起点或者纳入到宿主细胞的复制子,诸如染色体外复制子(例如,质粒)或核或线粒体染色体。许多“载体”已知能够介导基因向哺乳动物细胞的转移,如本领域已知和本文所述。
本发明所述方法中,在一些实施方式中,可以直接将所述融合蛋白或其变体递送至待编辑对象如体外体系或宿主细胞中。
本发明所述方法中,在一些实施方式中,也可以将所述融合蛋白或其变体的编码多核苷酸递送至待编辑对象如体外体系或宿主细胞中,然后翻译成所述融合蛋白或其变体。其中所述融合蛋白或其变体的编码多核苷酸可以是DNA形式或RNA形式;DNA形式包括cDNA、基因组DNA或人工合成的DNA,DNA可以是单链的或是双链的,DNA可以是编码链或非编码链;RNA形式例如是信使RNA(mRNA)。在一些优选实施方式中,所述融合蛋白或其变体的编码多核苷酸可以以表达载体的形式递送,所述表达载体上包含编码一个或多个拷贝的编码融合蛋白或其变体的多核苷酸。
本发明所述方法中,在一些实施方式中,可以直接将所述引导核苷酸序列递送至待编辑对象如体外体系或宿主细胞中。
本发明所述方法中,在一些实施方式中,也可以将所述引导核苷酸序列的编码多核苷酸递送至待编辑对象如体外体系或宿主细胞中,然后翻译成所述引导核苷酸序列。其中所述引导核苷酸序列的编码多核苷酸可以是DNA形式;DNA形式包括cDNA、基因组DNA或人工合成的DNA,DNA可以是单链的或是双链的,DNA可以是编码链或非编码链。在一些优选实施方式中,所述引导核苷酸序列的编码多核苷酸可以以表达载体的形式递送,所述表达载体上包含编码一个或多个拷贝的编码引导核苷酸序列的多核苷酸。
本发明所述方法中,编码所述融合蛋白或其变体的多核苷酸包括:只编码融合蛋白或其变体的编码序列;融合蛋白或其变体的编码序列和各种附加编码序列;融合蛋白或其变体的编码序列(和任选的附加编码序列)以及非编码序列。编码所述引导核苷酸序列的多核苷酸包括:只编码引导核苷酸序列的编码序列;引导核苷酸序列的编码序列和各种附加编码序列;引导核苷酸序列的编码序列(和任选的附加编码序列)以及非编码序列。在一些实施方式中,本发明所述方法中,所述碱基编辑系统可以是包含一个或多个载体;所述一个或多个载体包含(i)第一调控元件,所述第一调控元件可操作地连接至所述融合蛋白或其变体的编码序列;以及(ii)第二调控元件,所述第二调控元件可操作地连接至所述引导核苷酸序列的编码序列;所述(i)和(ii)位于相同或不同载体上。在一些实施方式中,所述碱基编辑系统包含(i)融合蛋白或其变体,以及(ii)包含所述引导核苷酸序列的编码序列的载体。
所述第一调控元件可以调控所述融合蛋白或其变体的编码多核苷酸的转录。所述融合蛋白或其变体的编码多核苷酸可以是一个或多个,所述第一调控元件可以是一个或多个。所述第二调控元件可以调控所述引导核苷酸序列的编码多核苷酸的转录。所述引导核苷酸序列的编码多核苷酸可以是一个或多个,所述第二调控元件可以是一个或多个。
本发明第八方面提供了一种预防和/或病症的方法,所述方法包括向有需要的受试者施用治疗有效量的如上所述方法中所述的融合蛋白、分离的多核苷酸、构建体、表达系统或碱基编辑系统;所述病症选自以下至少任一项:自身免疫性疾病、肿瘤、病毒感染性疾病、细菌感染性疾病等。
本发明中,所述自身免疫性疾病选自系统性红斑狼疮、类风湿关节炎、系统性硬化症、口眼干燥综合征、多发性肌炎中的一种或多种。
本发明中,所述肿瘤选自淋巴瘤、血液瘤或实体瘤;优选地,选自肾上腺皮质癌、膀胱尿路上皮癌、乳腺癌、宫颈鳞状细胞癌、宫颈内腺癌、胆管癌、结肠腺癌、淋巴样肿瘤、弥散性大B细胞淋巴瘤、食管癌、多形性成胶质细胞瘤、头颈部鳞状细胞癌、肾嫌色细胞癌、肾透明细胞癌、肾乳头状细胞癌、急性髓性白血病、脑低度胶质瘤、肝细胞癌、肺腺癌、肺鳞状细胞癌、间皮细胞癌、卵巢癌、胰腺癌、嗜铬细胞瘤和副神经节瘤、前列腺癌、直肠癌、恶性肉瘤、黑色素瘤、胃癌、睾丸生殖细胞肿瘤、甲状腺癌、胸腺癌、子宫内膜癌、子宫肉瘤、葡萄膜黑色素瘤、多发性骨髓瘤、急性淋系白血病、慢性淋系白血病、慢性髓性白血病、T细胞淋巴瘤、B细胞淋巴瘤肿瘤细胞中的一种或多种。
本发明中,所述病毒选自流感病毒、副流感病毒、麻疹病毒、腮腺炎病毒、疱疹病毒、腺病毒、呼吸道合胞病毒、脊髓灰质炎病毒、柯萨奇病毒、或埃可病毒中的一种或几种。
本发明中,所述细菌选自大肠杆菌、干酪乳杆菌、脆弱拟杆菌、鲁氏不动杆菌、具核梭杆菌、乔氏拟杆菌、拟南芥拟杆菌、鼠李糖乳杆菌、马赛拟杆菌、卵形拟杆菌、空肠弯曲菌、腐生葡萄球菌、粪肠球菌、多形拟杆菌、普通拟杆菌、单形拟杆菌、粪副拟杆菌、死亡梭杆菌以及短双歧杆菌中的一种或几种。
本发明中,所述融合蛋白、分离的多核苷酸、构建体、表达系统或碱基编辑系统可以与其他药物或试剂联用。即,所述融合蛋白、分离的多核苷酸、构建体、表达系统或碱基编辑系统可以是单一有效成分,也可以与其他活性组分进行组合,构成联合制剂。所述其他活性组分可以是其他各种可以用于治疗自身免疫性疾病、肿瘤、病毒感染性疾病、细菌感染性疾病的药物。组合物中活性组分的含量通常为安全有效量,所述安全有效量对于本领域技术人员来说应该是可以调整的,例如,所述活性成分的施用量通常依赖于患者的体重、应用的类型、疾病的病情和严重程度,例如,作为活性成分的所述碱基编辑系统或NK细胞的施用量通常可以为1~1000mg/kg/day、20~200mg/kg/day、1~3mg/kg/day、3~5mg/kg/day、5~10mg/kg/day、10~20mg/kg/day、20~30mg/kg/day、30~40mg/kg/day、40~60mg/kg/day、60~80mg/kg/day、80~100mg/kg/day、100~150mg/kg/day、150~200mg/kg/day、200~300mg/kg/day、300~500mg/kg/day、或500~1000mg/kg/day。
本发明所述方法和应用中,当所述活性成分与其他治疗剂联用时,所述活性成分与其他治疗剂共同给予。“共同给予”表示在同一制剂中或在两种不同制剂中经由相同或不同途径同时给予,或通过相同或不同途径顺次给予。“顺次”给予表示在两种或多种不同化合物的给药之间具有以秒、分钟、小时或天计的时间差异。
本发明第九方面是提供一种碱基编辑的细胞,所述细胞包含C至G的突变。在一些优选实施方式中,所述细胞通过如上所述的碱基方法制备得到。
本发明第十方面是提供一种报告系统,用于检测C-G突变效率,所述报告系统包含如SEQ ID NO.31所示的核苷酸序列。如SEQ ID NO.31所示的核苷酸序列编码的蛋白不发荧光,且编码的报告蛋白的第67位氨基酸对应的密码子为GAT。当GAT突变为CAT,即当SEQ IDNO.31所示的核苷酸序列突变为SEQ ID NO.30时,所述报告系统呈现蓝色荧光(SEQ IDNO.1);当GAT突变为TAT,即当SEQ ID NO.31所示的核苷酸序列突变为SEQ ID NO.32时,所述报告系统呈现绿色荧光。当SEQ ID NO.31所示的核苷酸序列突变为SEQ ID NO.33时,所述报告系统无荧光变化。在一优选实施方式中,包含所述报告系统的质粒的核苷酸序列如SEQ ID NO.2所示。
本发明第十一方面是提供如上所述的报告系统,所述报告系统用于检测如上所述的融合蛋白、分离的多核苷酸、构建体、表达系统或碱基编辑系统的C-G编辑效率。
本发明第十二方面是提供一种尿嘧啶DNA结合蛋白,所述尿嘧啶DNA结合蛋白选自来自微生物的Udgx家族的酶或其变体;优选地,所述微生物选自类胞内分枝杆菌、少动鞘氨醇单胞菌、球形红假单胞菌、柑桔溃疡病菌、苜蓿中华根瘤菌、皮诺卡氏菌、盖尔森基兴奴卡氏菌、哥伦布分枝杆菌、贪铜菌、阿氟曼链霉菌、欺诈伯克霍尔德菌、梅洛尼鞘氨醇单胞菌、沼泽红假单胞菌、耻垢分枝杆菌、胞内分枝杆菌、多噬伯克霍尔德菌、慢生型大豆根瘤菌、高效固氮慢生根瘤菌、草莓角斑病菌;进一步优选地,所述微生物为皮诺卡氏菌。
本发明第十三方面是提供如上所述的尿嘧啶DNA结合蛋白在用于构建碱基编辑蛋白或碱基编辑系统中的用途;所述碱基编辑蛋白或碱基编辑系统用于实现C至G或G至C的颠换。
本发明第十四方面还提供了一种组合物,其包含如上任一项所述的融合蛋白、分离的多核苷酸、构建体、表达系统或碱基编辑系统。在一些优选实施方式,其进一步包含药学上可接受的载体。所述可接受的载体例如无菌水或生理盐水、稳定剂、赋形剂、抗氧化剂(抗坏血酸等)、缓冲剂(磷酸、枸橼酸、其它的有机酸等)、防腐剂、表面活性剂(PEG、Tween等)、螯合剂(EDTA等)、粘合剂等。而且,也可含有其它低分子量的多肽;血清白蛋白、明胶或免疫球蛋白等蛋白质;甘氨酸、谷酰胺、天冬酰胺、精氨酸和赖氨酸等氨基酸;多糖和单糖等糖类或碳水化物;甘露糖醇或山梨糖醇等糖醇。当制备用于注射的水溶液时,例如生理盐水、含有葡萄糖或其它的辅助药物的等渗溶液,如D-山梨糖醇、D-甘露糖、D-甘露糖醇、氯化钠,可并用适当的增溶剂例如醇(乙醇等)、多元醇(丙二醇,PEG等)、非离子表面活性剂(吐温80,HCO-50)等。
本发明还提供了一种试剂盒,其包含如上所述的组合物。
本发明中,所述组合物或药物组合物或联合制剂的剂型选自:注射剂、注射用无菌粉末、片剂、丸剂、胶囊、锭剂、醑剂、散剂、颗粒剂、糖浆剂、溶液剂、酊剂、气雾剂、粉雾剂、或栓剂。本领域技术人员可根据给药方式,选择合适的制剂形式,例如,适合于口服给药的制剂形式可以是包括但不限于丸剂、片剂、咀嚼剂、胶囊剂、颗粒剂、溶液剂、滴剂、糖浆、气雾剂或粉雾剂等。
本发明中,蛋白或其片段(如第一Cas9切口酶片段、第二Cas9切口酶片段、脱氨酶片段、尿嘧啶DNA结合蛋白片段)的变体为原始蛋白或其片段的片段、衍生物和类似物,其可以是(i)有一个或多个保守或非保守性氨基酸残基(优选保守性氨基酸残基)被取代的蛋白,而这样的取代的氨基酸残基可以是也可以不是由遗传密码编码的,或(ii)在一个或多个氨基酸残基中具有取代基团的蛋白,或(iii)附加的氨基酸序列融合到此蛋白序列而形成的蛋白(如前导序列或分泌序列或用来纯化此蛋白的序列或蛋白原序列)。根据本发明的定义,这些片段、衍生物和类似物属于本领域熟练技术人员公知的范围。例如,在一些实施方式中,所述第一Cas9切口酶片段的变体是指与所述融合蛋白的氨基酸序列具有75%或更高,或85%或更高,或90%或更高,或95%或更高同一性,且具有所述第一Cas9切口酶片段相同或相似功能的蛋白。上述75%或75%以上同一性,可为75%、80%、85%、90%或95%以上的同一性;具体地可为75%、76%、77%、78%、79%、80%、81%、82%、83%、84%、85%、86%、87%、88%、89%、90%、91%、92%、93%、94%、95%、96%、97%、98%、99%。上述90%或90%以上同一性,可为91%、92%、93%、94%、95%、96%、97%、98%、99%同一性。所述相似功能是指保留原蛋白的75%或更高,或85%或更高,或90%或更高,或95%或更高的功能。
本发明中,所述碱基编辑系统能够实现C至G或G至C的编辑或突变。例如,当靶向编码链时,所述碱基编辑系统能够实现C至G的编辑,相应地,对于非编码链,实现G至C的编辑;当靶向非编码链时,所述碱基编辑系统能够实现C至G的编辑,相应地,对于编码链,实现G至C的编辑。
除非另行定义,文中所使用的所有专业与科学用语与本领域熟练人员所熟悉的意义相同。此外,任何与所记载内容相似或均等的方法及材料皆可应用于本发明方法中。文中所述的较佳实施方法与材料仅作示范之用。
如本文所用,序列相似性或同一性可以用肉眼或计算机软件进行评价。使用计算机软件,两个或多个序列之间的同一性可以用百分比(%)表示,其可以用来评价相关序列之间的同一性。
如本文所用,“包含”、“含有”等应理解为是包括性的意思,而没有排他性或穷尽的意思;即“包括但不限于”的意思。
如本文所用,“治疗有效量”通常指一用量在经过适当的给药期间后,能够达到治疗如上所列出的疾病的效果。
如本文所用,“治疗性”和“预防性”应理解为其最宽的意义。术语“治疗性”不一定暗示哺乳动物接受治疗直至完全恢复。类似地,“预防性”不一定表示对象最终不会感染疾病病症。因此,治疗和预防包括缓解具体病症的症状或防止或降低具体病症产生的风险。术语“预防”可理解为降低具体病症发作的严重程度。治疗也可降低已有病症的严重程度或急性发作的频率。
如本文所用,进行治疗性或预防性治疗的对象或个体优选哺乳动物,例如但不限于人、灵长类、牲畜(如绵羊、牛、马、驴、猪)、宠物(如狗、猫)、实验室试验动物(如小鼠、家兔、大鼠、豚鼠、仓鼠)或被捕获的野生动物(如狐狸、鹿)。所述对象优选灵长类。所述对象最优选人。
如本文所用,术语“核酸”和“核酸组分”可互换使用,其指具有核碱基和酸性部分的化合物,例如核苷、核苷酸或核苷酸的聚合物。在一些实施方式中,“核酸”是指单个核酸残基(例如,核苷酸和/或核苷)。在一些实施方式中,“核酸”是指包含三个或更多个单个核苷酸残基的寡核苷酸链。本文中所用术语“寡核苷酸”和“多核苷酸”可互换使用以指核苷酸的聚合物(例如,一串至少三个核苷酸)。在一些实施方式中,“核酸”包括RNA以及单链和/或双链DNA。核酸可以是天然产生的或是非天然产生的分子。
如本文所用,术语“表达”指多核苷酸转录成mRNA的过程和/或转录的mRNA随后被翻译成肽、多肽或蛋白质的过程。如果多核苷酸源自基因组DNA,则表达可包括真核细胞中mRNA的剪接。基因的表达水平可以通过测量细胞或组织样品内mRNA或蛋白质的量予以确定。
术语“调节元件”包括启动子、增强子、内部核糖体进入位点(IRES)、和其他表达控制元件(例如转录终止信号,如多聚腺苷酸化信号和多聚U序列)。调节元件包括指导一个核苷酸序列在许多类型的宿主细胞中的组成型表达的那些序列以及指导该核苷酸序列只在某些宿主细胞中表达的那些序列(例如,组织特异型调节序列)。组织特异型启动子可指导在感兴趣的组织表达,例如肌肉、神经元、骨、皮肤等。在一些实施例中,一个载体包含一个或多个pol III启动子、pol II启动子、pol I启动子、或其组合。pol III启动子包括但不限于U6和H1启动子。pol II启动子包括但不限于巨细胞病毒(CMV)启动子、SV40启动子、二氢叶酸还原酶启动子、β-肌动蛋白启动子、磷酸甘油激酶(PGK)启动子等。
如本文所用,术语“蛋白质”、“肽”和“多肽”可以互换使用并以其最广泛的含义使用,是指两个或更多个亚基的氨基酸、氨基酸类似物或肽模拟物的化合物。亚基可以通过肽键连接。在另一方面中,亚基可以通过其他键连接,例如,酯、醚等。蛋白质或肽必须包含至少两个氨基酸,并且对构成蛋白质或肽序列的最大氨基酸数没有限制。已知蛋白质和肽具有C-末端和N-末端,所述C-末端指代在该末端氨基酸上存在未结合羧基的末端,所述N-末端指代在该末端氨基酸上存在未结合氨基的末端。本文所用术语“氨基酸”指天然和/或非天然或合成的氨基酸,包括甘氨酸,以及D和L光学异构体,氨基酸类似物和肽模拟物。术语“融合”在蛋白质或多肽上下文中是指两个或更多个蛋白质或多肽(或其结构域)末端之间形成融合蛋白的连接。
在进一步描述本发明具体实施方式之前,应理解,本发明的保护范围不局限于下述特定的具体实施方案;还应当理解,本发明实施例中使用的术语是为了描述特定的具体实施方案,而不是为了限制本发明的保护范围;在本发明说明书和权利要求书中,除非文中另外明确指出,单数形式“一个”、“一”和“这个”包括复数形式。
当实施例给出数值范围时,应理解,除非本发明另有说明,每个数值范围的两个端点以及两个端点之间任何一个数值均可选用。除非另外定义,本发明中使用的所有技术和科学术语与本技术领域技术人员通常理解的意义相同。除实施例中使用的具体方法、设备、材料外,根据本技术领域的技术人员对现有技术的掌握及本发明的记载,还可以使用与本发明实施例中所述的方法、设备、材料相似或等同的现有技术的任何方法、设备和材料来实现本发明。
除非另外说明,本发明中所公开的实验方法、检测方法、制备方法均采用本技术领域常规的分子生物学、生物化学、染色质结构和分析、分析化学、细胞培养、重组DNA技术及相关领域的常规技术。这些技术在现有文献中已有完善说明。
实施例1:用于测试C-G编辑系统编辑效率的荧光报告系统
在哺乳动物细胞系中,利用碱基颠换编辑器对获得性报告系统进行碱基编辑,通过荧光信号,判断编辑的有无与效率的高低。具体实施如下:
1、报告系统的构建
报告蛋白的编码核苷酸序列如SEQ ID NO.31所示。当所述报告蛋白的编码核酸的编码链上的第67位氨基酸密码子为GAT(对应非编码链上的ATC),所述报告系统编码的报告蛋白不显示荧光。当利用碱基编辑系统靶向非编码链引入突变,将非编码链上的ATC突变为ATG,相应地将第67位氨基酸的密码子突变为CAT时,表达组氨酸(密码子CAT),如SEQ IDNO.31所示的核苷酸编码的报告蛋白氨基酸如SEQ ID NO.1所示,所述报告系统显示出蓝色荧光;当利用碱基编辑引入突变,将第67位氨基酸的密码子突变为TAT时,表达酪氨酸(密码子TAT),所述报告系统显示出绿色荧光。通过流式细胞术分析蓝色荧光比例,可以推算出C-G碱基的颠换比例。本实施例构建的含有该报告蛋白的质粒(报告系统)的核苷酸序列如SEQID NO.2所示,相应的报告系统命名为BFP-CG报告系统。
2、sgRNA表达载体的构建
设计对应的sgRNA,靶向非编码链,靶序列为:SEQ ID NO.3:catCggtcagggtggtcacgagg,并构建相应的sgRNA表达载体,构建过程如下:
根据靶向位点序列设计碱基互补配对的上下游引物SEQ ID NO.4:accgcatCggtcagggtggtcacg;SEQ ID NO.5:aaaccgtgaccaccctgaccGatg,加灭菌水溶解至100μM。经退火后连接到pGL3-U6-sgRNA(Addgene#51133)载体上,以构建靶向特异性sgRNA。
2.1退火产物
退火体系如下所示:
表1
上游引物 | 4.5μL |
下游引物 | 4.5μL |
10×NEB buffer2 | 1μL |
退火程序如下:
表2
95℃ | 5min |
95-85℃ | -2℃/s |
85-25℃ | -0.1℃/s |
4℃ | ∞ |
2.2线性化载体
利用BsaI(NEB,R0535S)对pGL3-U6(Addgene#51133)质粒进行酶切以得到线性化sgRNA载体。酶切体系如下所示:
表3
水 | 补水至50μL |
PGL3-U6质粒 | 10μg |
10×cutsmart buffer | 5μL |
BsaI酶 | 5μL |
2.3退火产物与线性化载体连接
以上反应体系配置好后,置于37℃条件下反应5h,酶切产物用AxyPrep DNA凝胶回收试剂盒(Axygen,AP-GX-250G)做割胶回收得到线性化载体。取50ng线性化载体与3μL退火产物通过T4连接酶(NEB,M0202S)连接,16℃孵育2小时后并转化涂板,经Sanger测序得到正确的靶向特异性sgRNA。连接体系如下:
表4
水 | 补水至10μL |
PGL3-U6-BsaI酶切线性片段 | 20ng |
退火产物 | 1μL |
Solution I | 5μL |
连接产物随后进行转化,复苏30min,涂板于氨卞抗性的LB琼脂平板,37℃培养过夜。挑选单克隆进行测序验证。连接成功且无误后,对其进行质粒抽提。
3、碱基编辑蛋白表达质粒的构建
利用AncBE4max载体(addgene:112094)为骨架,用Apobec3A脱氨酶替换其中的Apobec1,构建成A3A-CBEmax的表达载体(SEQ ID NO.6)(图2所示),构建方法利用GibsonAssembly Master Mix重组试剂盒(NEB,E2611S)。如上所述,UGI是抑制UNG的作用,防止尿嘧啶的切除,为了提高C-G突变效率,进一步地,本发明构建了去掉UGI的编辑蛋白,A3A-CBEmax-△UGI编辑蛋白的表达载体(SEQ ID NO.7)。在此基础上,本发明添加了UdgX(来自Mycobacterium smegmatis),构建了A3A-Udgx-CBEmax-△UGI编辑蛋白的表达载体(SEQ IDNO.8)。
4、哺乳动物细胞系转染C-G编辑系统和获得型报告系统
(1)HEK293T细胞接种培养在含10%FBS的DMEM培养液中(HyClone,SH30022.01B),其中含penicillin(100U/ml)和streptomycin(100μg/ml)。
(2)转染前一天将细胞分至6孔板。次日,待密度至70%-80%时进行转染。
(3)按照LipofectamineTM2000 Transfection Reagent(Invitrogen,11668-019)的操作手册,将2μg碱基编辑蛋白质粒、1μg sgRNA表达载体与100ng的BFP-CG报告系统混匀,共转染至细胞中,6-8小时后换液,分别在24小时、48小时、72小时与96小时后进行荧光数据检测。
4、荧光报告系统分析碱基颠换效率
利用flowjo软件对流式细胞术结果进行分析,本发明发现,A3A-CBEmax-△UGI与A3A-Udgx-CBEmax-△UGI可以实现BFP荧光的转换,而A3A-CBEmax无法实现BFP荧光的转换(图3);且A3A-Udgx-CBEmax-△UGI相比A3A-CBEmax-△UGI具有更高的编辑效率。同时,荧光强度在48小时最高,后续实验将以48小时作为时间点检测荧光强度。
上述结果表明,本发明构建的报告系统可以较为准确地反应C-G编辑效率。
实施例2:不同类型的尿嘧啶结合蛋白UdgX对C-G编辑器的作用效果
1、不同种类UdgX的构建
目前报道的UdgX来自耻垢分枝杆菌,为了进一步提高碱基编辑系统的编辑效果,本发明利用进化树分析了19种不同类型的UdgX(图4),并分别构建了19种A3A-Udgx-CBEmax-△UGI编辑器蛋白。相应的每种UdgX来源的微生物名称分别为类胞内分枝杆菌(Mycobacterium paraintracellulare)、少动鞘氨醇单胞菌(Sphingomonaspaucimobilis)、球形红假单胞菌(Cereibacter sphaeroides)、柑桔溃疡病菌(Xanthomonas citri)、苜蓿中华根瘤菌(Sinorhizobium meliloti)、皮诺卡氏菌(Nocardia farcinica)、盖尔森基兴奴卡氏菌(Nocardia cyriacigeorgica)、哥伦布分枝杆菌(Mycobacterium colombiense)、贪铜菌(Cupriavidus taiwanensis)、阿氟曼链霉菌(Streptomyces scabiei)、欺诈伯克霍尔德菌(Burkholderia dolosa)、梅洛尼鞘氨醇单胞菌(Sphingomonas melonis)、沼泽红假单胞菌(Rhodopseudomonas palustris)、耻垢分枝杆菌(Mycolicibacterium smegmatis)、胞内分枝杆菌(Mycobacterium intracellulare)、多噬伯克霍尔德菌(Burkholderia multivorans)、慢生型大豆根瘤菌(Bradyrhizobiumjaponicum)、高效固氮慢生根瘤菌(Bradyrhizobium diazoefficiens)、草莓角斑病菌(Xanthomonas fragariae),分析不同类型的UdgX对编辑效果的影响。
2、哺乳动物细胞系转染C-G编辑系统和获得型报告系统
(1)HEK293T细胞接种培养在含10%FBS的DMEM培养液中(HyClone,SH30022.01B),其中含penicillin(100U/ml)和streptomycin(100μg/ml)。
(2)转染前一天将细胞分至6孔板。次日,待密度至70%-80%时进行转染。
(3)按照LipofectamineTM2000 Transfection Reagent(Invitrogen,11668-019)的操作手册,将2μg碱基编辑蛋白质粒、1μg sgRNA表达载体与100ng对应的BFP-CG报告系统混匀,共转染至细胞中,6-8小时后换液,48小时后对BFP荧光信号进行检测分析。
3、不同碱基编辑器对C-G突变效率影响分析
利用flowjo软件对流式细胞术结果进行分析,通过比对BFP荧光信号,判断C-G碱基颠换的效果。结果表明来自皮诺卡氏菌(Nocardia farcinica)的UdgX(nfUdgx)展示出了更高的C-G编辑效果(图5)。后续实验本发明将A3A-nfUdgx-CBEmax-△UGI编辑蛋白的表达载体(SEQ ID NO.8)对应的碱基编辑器命名为CGBE。
实施例3:嵌合型CGBE的构建
已有研究表明脱氨酶容易造成随机脱靶,对碱基编辑器的安全构成了威胁。为此,本发明通过将脱氨酶与UdgX复合蛋白,插入到Cas9蛋白结构域内,可以有效避免脱氨酶的随机脱氨,起到降低脱靶的效果(图6)。
1、pET-nCas9-gRNA-AmpR(Y260X)-KanR质粒载体的构建
利用Gibson Assembly Master Mix重组试剂盒(NEB,E2611S)构建pET-nCas9-gRNA-AmpR(Y260X)-KanR质粒载体(SEQ ID NO.9)。该质粒氨苄抗性基因在260位氨基酸处含有终止密码子TAG,当TAG通过碱基编辑器编辑成TAC时,氨苄抗性才发挥作用,相应的菌在氨苄抗生素的平板上才能生长。
2、用于MuA转座酶构建随机插入的受体质粒
利用扩增出的APOBEC-nfUdgX(SEQ ID NO.10),pET-nCas9-gRNA-AmpR(Y260X)-KanR)质粒(SEQ ID NO.9),在MuA转座酶(Thermo Fisher,F-701)作用下,在体外构建不同位置随机插入的载体。具体反应体系如下:
表5
APOBEC-nfUdgX片段 | 250ng |
质粒(SEQ ID NO.9) | 500ng |
MuA转座酶 | 1μL |
5×Reaction Buffer for MuA Transposase | 4μL |
水 | 补水至20μL |
反应液在30℃孵育1小时以实现随机插入,然后在75℃孵育10分钟以灭活MuA转座酶。随后用异丙醇沉淀纯化出DNA,并重悬于5μL去离子水中,然后电转到100μL BL21(DE3)Electro(Shanghai Weidi Biotechnology,EE1002)感受态细胞中。
3、大肠杆菌中筛选功能性嵌入融合CGBE蛋白的表达质粒
将上述电转后的感受态细胞在SOC培养基中复苏1小时后的细菌涂布在数个含10μg/mL卡那霉素的LB琼脂板上,并在37℃下孵育16小时。然后刮下来平板上的菌落,并将其重悬于含有500μM IPTG的100mL LB中。将培养物温育10-12h以诱导nCas9表达并修复AmpR(Y260X)上的突变。然后将减少量的细胞(5mL,1mL,500μL,100μL)接种到含有氨苄青霉素(10μg/mL)和卡那霉素(10μg/mL)的15cm LB琼脂平板上。37℃孵育过夜后,挑选菌落并进行Sanger测序,以评估AmpR(Y260X)上的碱基编辑并确定APOBEC-nfUdgX插入位点。筛选到以下插入位点,具体位置为328,645,698,794,960,979,993,998,1003,1009,1027,1039,1043,1053,1098,1102,1298,1300。
4、嵌合型CGBE对报告系统的编辑
通过上述获得插入位点,本发明构建了相应的哺乳动物碱基编辑蛋白表达质粒,分别命名为CE-CGBE-328,CE-CGBE-645,CE-CGBE-698,CE-CGBE-794,CE-CGBE-960,CE-CGBE-979,CE-CGBE-993,CE-CGBE-998,CE-CGBE-1003,CE-CGBE-1009,CE-CGBE-1027,CE-CGBE-1039,CE-CGBE-1043,CE-CGBE-1053,CE-CGBE-1098,CE-CGBE-1102,CE-CGBE-1298,CE-CGBE-1300。CE-CGBE-1003、CE-CGBE-1027和CE-CGBE-1039相对应表达的碱基编辑蛋白氨基酸序列分别为SEQ ID NO.11、SEQ ID NO.12和SEQ ID NO.13。
按照LipofectamineTM2000 Transfection Reagent(Invitrogen,11668-019)的操作手册,将2μg编辑蛋白表达质粒、1μg的sgRNA表达载体与100ng对应的BFP-CG报告系统混匀,共转染至HEK293T细胞中,6-8小时后换液,48小时后对BFP荧光信号进行分析。利用flowjo软件对流式细胞术结果进行分析,通过比对BFP效率,判断C-G碱基颠换的效果。结果表明,编辑效率最高的三个位点为1003,1027以及1039。后续本发明将以1027位置为插入位点,构建碱基编辑蛋白表达质粒,进行内源基因编辑,获得的嵌合型CGBE编辑器命名为CE-CGBE。
实施例4:CE-CGBE可以对内源基因实现高效的C-G突变
为进一步研究C-G碱基颠换编辑器的作用特点及效率,本发明针对5个内源基因进行编辑。具体实施如下:
1、靶位点的选择与相应sgRNA表达载体的构建。
5个位点选择如下:
FAM171b:ACAACAACAGCAAAAGCAGCTGG(SEQ ID NO.14);
POLR2A:ACTTCAAGAACTAGTGCGCAGG(SEQ ID NO.15);
VISTA:GCGGTACCACGTCTTGTAGAAGG(SEQ ID NO.16);
NANOG:ACCAGAGAATGAAATCTAAGAGG(SEQ ID NO.17);
DDX53:TGATCAAGAGCGAGCAGTAGAGG(SEQ ID NO.18)。
设计5个位点相应的sgRNA上下游引物,序列为SEQ ID NO.19-SEQ ID NO.28。上下游序列通过程序(95℃,5min;95℃-85℃(以2℃/s速度降温);85℃-25℃(以0.1℃/s的速度降温);维持4℃)退火,连接到经过BsaI(NEB:R0539L)线性化的PGL3-U6-sgRNA-EGFP(addgene:107721)载体上。对阳性克隆摇菌,提取质粒(Axygene:AP-MN-P-250G),测定浓度备用。
2、不同类型脱氨酶的CE-CGBE编辑器构建
利用Gibson Assembly Master Mix重组试剂盒(NEB,E2611S)构建不同脱氨酶的CE-CGBE编辑器(图8)。APOBEC-nfUdgX插入位点为1027。本发明利用了ancAPOBEC1与A3A两种脱氨酶分别构建碱基编辑蛋白,分别命名为CE-CGBE-ancApobec1(碱基编辑蛋白的氨基酸序列如SEQ ID NO.29所示),与CE-CGBE-A3A(碱基编辑蛋白的氨基酸序列如SEQ IDNO.12所示)。按照实施例1第3节“碱基编辑蛋白表达质粒的构建”的方法,构建相应的碱基编辑蛋白表达质粒。
2、CE-CGBE对内源基因的编辑
将HEK293T细胞接种培养于添加10%FBS的DMEM高糖培养液中(HyClone,SH30022.01B),其中含penicillin(100U/ml)和streptomycin(100μg/ml)。转染前两个小时换成无抗生素的培养基,按照LipofectamineTM2000 Transfection Reagent(Invitrogen,11668-019)的操作手册,将2μg编辑蛋白表达质粒、1μg的sgRNA表达载体混匀,共转染至细胞中,6-8小时后换液,72小时后分选出10000个GFP阳性细胞。细胞通过裂解鉴定基因型,裂解液的成分为50mM KCl,1.5mM MgCl2,10mM Tris pH 8.0,0.5%Nonidet P-40,0.5%Tween 20,100μg/ml protease K。
3、CE-CGBE对内源基因的编辑效果分析
利用一代sanger测序,本发明对两种编辑器和相应的5个位点进行了分析(图9),并进一步统计了相应的编辑效率(图10)。两种编辑器均能实现C-G的编辑,其中CE-CGBE-A3A最高能实现70%的C-G编辑,CE-CGBE-AncApobec1最高实现55%的C-G编辑。
实施例5:CE-CGBE可以显著降低DNA和RNA的脱靶
胞嘧啶碱基编辑器在DNA与RNA水平均被报道易发生脱靶,本实施例将分析CE-CGBE-A3A对内源基因编辑后DNA与RNA水平的脱靶。
HEK293T细胞接种培养于添加10%FBS的DMEM高糖培养液中(HyClone,SH30022.01B),其中含penicillin(100U/ml)和streptomycin(100μg/ml)。转染前两个小时换成无抗生素的培养基,按照LipofectamineTM2000 Transfection Reagent(Invitrogen,11668-019)的操作手册,将4μg碱基编辑蛋白表达质粒(实施例4构建的4μg CE-CGBE-A3A编辑蛋白表达质粒、实施例1构建的A3A-Udgx-CBEmax-△UGI碱基编辑蛋白)、2μg的sgRNA表达载体(对应FAM171B位点)混匀,共转染至细胞中,6-8小时后换液,72小时后分选出500000个GFP阳性细胞。
分选的细胞通过提取DNA与RNA,送全基因组测序与RNA-测序。通过与未转染编辑器的阴性对照比较,发现CE-CGBE在DNA与RNA水平的脱靶,与参考基因组没有显著差异,与A3A-Udgx-CBEmax-△UGI相比,具有显著的降低(图11)。表明本发明提出的碱基颠换编辑器具有高效、安全的特点。
综上所述,本发明高效特异实现碱基颠换的编辑系统、方法及用途,可以对靶位点中的C颠换成G。本发明所述的融合蛋白、编辑系统或方法可以在多种细胞如哺乳动物的细胞系及原代细胞中高效实现C-G碱基的颠换,为修复致病突变,研究基因功能以及改善细胞功能等提供了有效的工具,具备良好的应用前景。所以,本发明有效克服了现有技术中的种种缺点而具高度产业利用价值。
以上的实施例是为了说明本发明公开的实施方案,并不能理解为对本发明的限制。此外,本文所列出的各种修改以及发明中方法、组合物的变化,在不脱离本发明的范围和精神的前提下对本领域内的技术人员来说是显而易见的。虽然已结合本发明的多种具体优选实施例对本发明进行了具体的描述,但应当理解,本发明不应仅限于这些具体实施例。事实上,各种如上所述的对本领域内的技术人员来说显而易见的修改来获取发明都应包括在本发明的范围内。
序列表
<110> 上海贝斯昂科生物科技有限公司
<120> 一种高效特异实现碱基颠换的编辑系统、方法及用途
<160> 44
<170> SIPOSequenceListing 1.0
<210> 1
<211> 239
<212> PRT
<213> 人工序列(Artificial Sequence)
<400> 1
Met Val Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu
1 5 10 15
Val Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly
20 25 30
Glu Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile
35 40 45
Cys Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr
50 55 60
Leu Thr His Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys
65 70 75 80
Gln His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu
85 90 95
Arg Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu
100 105 110
Val Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly
115 120 125
Ile Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr
130 135 140
Asn Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn
145 150 155 160
Gly Ile Lys Val Asn Phe Lys Ile Arg His Asn Ile Glu Asp Gly Ser
165 170 175
Val Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly
180 185 190
Pro Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Lys Leu
195 200 205
Ser Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe
210 215 220
Val Thr Ala Ala Gly Ile Thr Leu Gly Met Asp Glu Leu Tyr Lys
225 230 235
<210> 2
<211> 6138
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 2
gacggatcgg gagatctccc gatcccctat ggtgcactct cagtacaatc tgctctgatg 60
ccgcatagtt aagccagtat ctgctccctg cttgtgtgtt ggaggtcgct gagtagtgcg 120
cgagcaaaat ttaagctaca acaaggcaag gcttgaccga caattgcatg aagaatctgc 180
ttagggttag gcgttttgcg ctgcttcgcg atgtacgggc cagatatacg cgttgacatt 240
gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata 300
tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc 360
cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc 420
attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt 480
atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt 540
atgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca 600
tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg 660
actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc 720
aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg 780
gtaggcgtgt acggtgggag gtctatataa gcagagctct ctggctaact agagaaccca 840
ctgcttactg gcttatcgaa attaatacga ctcactatag ggagacccaa gctggctagc 900
atggtgagca agggcgagga gctgttcacc ggggtggtgc ccatcctggt cgagctggac 960
ggcgacgtaa acggccacaa gttcagcgtg tccggcgagg gcgagggcga tgccacctac 1020
ggcaagctga ccctgaagtt catctgcacc accggcaaac tgcccgtgcc ctggcccacc 1080
ctcgtgacca ccctgaccga tggcgtgcag tgcttcagcc gctaccccga ccacatgaag 1140
cagcacgact tcttcaagtc cgccatgccc gaaggctacg tccaggagcg caccatcttc 1200
ttcaaggacg acggcaacta caagacccgc gccgaggtga agttcgaggg cgacaccctg 1260
gtgaaccgca tcgagctgaa gggcatcgac ttcaaggagg acggcaacat cctggggcac 1320
aagctggagt acaactacaa cagccacaac gtctatatca tggccgacaa gcagaagaac 1380
ggcatcaagg tgaacttcaa gatccgccac aacatcgagg acggcagcgt gcagctcgcc 1440
gaccactacc agcagaacac ccccatcggc gacggccccg tgctgctgcc cgacaaccac 1500
tacctgagca cccagtccaa gctgagcaaa gaccccaacg agaagcgcga tcacatggtc 1560
ctgctggagt tcgtgaccgc cgccgggatc actctcggca tggacgagct gtacaagtga 1620
aagcttggta ccgagctcgg atccactagt ccagtgtggt ggaattctgc agatatccag 1680
cacagtggcg gccgctcgag tctagagggc ccgtttaaac ccgctgatca gcctcgactg 1740
tgccttctag ttgccagcca tctgttgttt gcccctcccc cgtgccttcc ttgaccctgg 1800
aaggtgccac tcccactgtc ctttcctaat aaaatgagga aattgcatcg cattgtctga 1860
gtaggtgtca ttctattctg gggggtgggg tggggcagga cagcaagggg gaggattggg 1920
aagacaatag caggcatgct ggggatgcgg tgggctctat ggcttctgag gcggaaagaa 1980
ccagctgggg ctctaggggg tatccccacg cgccctgtag cggcgcatta agcgcggcgg 2040
gtgtggtggt tacgcgcagc gtgaccgcta cacttgccag cgccctagcg cccgctcctt 2100
tcgctttctt cccttccttt ctcgccacgt tcgccggctt tccccgtcaa gctctaaatc 2160
gggggctccc tttagggttc cgatttagtg ctttacggca cctcgacccc aaaaaacttg 2220
attagggtga tggttcacgt agtgggccat cgccctgata gacggttttt cgccctttga 2280
cgttggagtc cacgttcttt aatagtggac tcttgttcca aactggaaca acactcaacc 2340
ctatctcggt ctattctttt gatttataag ggattttgcc gatttcggcc tattggttaa 2400
aaaatgagct gatttaacaa aaatttaacg cgaattaatt ctgtggaatg tgtgtcagtt 2460
agggtgtgga aagtccccag gctccccagc aggcagaagt atgcaaagca tgcatctcaa 2520
ttagtcagca accaggtgtg gaaagtcccc aggctcccca gcaggcagaa gtatgcaaag 2580
catgcatctc aattagtcag caaccatagt cccgccccta actccgccca tcccgcccct 2640
aactccgccc agttccgccc attctccgcc ccatggctga ctaatttttt ttatttatgc 2700
agaggccgag gccgcctctg cctctgagct attccagaag tagtgaggag gcttttttgg 2760
aggcctaggc ttttgcaaaa agctcccggg agcttgtata tccattttcg gatctgatca 2820
agagacagga tgaggatcgt ttcgcatgat tgaacaagat ggattgcacg caggttctcc 2880
ggccgcttgg gtggagaggc tattcggcta tgactgggca caacagacaa tcggctgctc 2940
tgatgccgcc gtgttccggc tgtcagcgca ggggcgcccg gttctttttg tcaagaccga 3000
cctgtccggt gccctgaatg aactgcagga cgaggcagcg cggctatcgt ggctggccac 3060
gacgggcgtt ccttgcgcag ctgtgctcga cgttgtcact gaagcgggaa gggactggct 3120
gctattgggc gaagtgccgg ggcaggatct cctgtcatct caccttgctc ctgccgagaa 3180
agtatccatc atggctgatg caatgcggcg gctgcatacg cttgatccgg ctacctgccc 3240
attcgaccac caagcgaaac atcgcatcga gcgagcacgt actcggatgg aagccggtct 3300
tgtcgatcag gatgatctgg acgaagagca tcaggggctc gcgccagccg aactgttcgc 3360
caggctcaag gcgcgcatgc ccgacggcga ggatctcgtc gtgacccatg gcgatgcctg 3420
cttgccgaat atcatggtgg aaaatggccg cttttctgga ttcatcgact gtggccggct 3480
gggtgtggcg gaccgctatc aggacatagc gttggctacc cgtgatattg ctgaagagct 3540
tggcggcgaa tgggctgacc gcttcctcgt gctttacggt atcgccgctc ccgattcgca 3600
gcgcatcgcc ttctatcgcc ttcttgacga gttcttctga gcgggactct ggggttcgaa 3660
atgaccgacc aagcgacgcc caacctgcca tcacgagatt tcgattccac cgccgccttc 3720
tatgaaaggt tgggcttcgg aatcgttttc cgggacgccg gctggatgat cctccagcgc 3780
ggggatctca tgctggagtt cttcgcccac cccaacttgt ttattgcagc ttataatggt 3840
tacaaataaa gcaatagcat cacaaatttc acaaataaag catttttttc actgcattct 3900
agttgtggtt tgtccaaact catcaatgta tcttatcatg tctgtatacc gtcgacctct 3960
agctagagct tggcgtaatc atggtcatag ctgtttcctg tgtgaaattg ttatccgctc 4020
acaattccac acaacatacg agccggaagc ataaagtgta aagcctgggg tgcctaatga 4080
gtgagctaac tcacattaat tgcgttgcgc tcactgcccg ctttccagtc gggaaacctg 4140
tcgtgccagc tgcattaatg aatcggccaa cgcgcgggga gaggcggttt gcgtattggg 4200
cgctcttccg cttcctcgct cactgactcg ctgcgctcgg tcgttcggct gcggcgagcg 4260
gtatcagctc actcaaaggc ggtaatacgg ttatccacag aatcagggga taacgcagga 4320
aagaacatgt gagcaaaagg ccagcaaaag gccaggaacc gtaaaaaggc cgcgttgctg 4380
gcgtttttcc ataggctccg cccccctgac gagcatcaca aaaatcgacg ctcaagtcag 4440
aggtggcgaa acccgacagg actataaaga taccaggcgt ttccccctgg aagctccctc 4500
gtgcgctctc ctgttccgac cctgccgctt accggatacc tgtccgcctt tctcccttcg 4560
ggaagcgtgg cgctttctca tagctcacgc tgtaggtatc tcagttcggt gtaggtcgtt 4620
cgctccaagc tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg cgccttatcc 4680
ggtaactatc gtcttgagtc caacccggta agacacgact tatcgccact ggcagcagcc 4740
actggtaaca ggattagcag agcgaggtat gtaggcggtg ctacagagtt cttgaagtgg 4800
tggcctaact acggctacac tagaagaaca gtatttggta tctgcgctct gctgaagcca 4860
gttaccttcg gaaaaagagt tggtagctct tgatccggca aacaaaccac cgctggtagc 4920
ggtttttttg tttgcaagca gcagattacg cgcagaaaaa aaggatctca agaagatcct 4980
ttgatctttt ctacggggtc tgacgctcag tggaacgaaa actcacgtta agggattttg 5040
gtcatgagat tatcaaaaag gatcttcacc tagatccttt taaattaaaa atgaagtttt 5100
aaatcaatct aaagtatata tgagtaaact tggtctgaca gttaccaatg cttaatcagt 5160
gaggcaccta tctcagcgat ctgtctattt cgttcatcca tagttgcctg actccccgtc 5220
gtgtagataa ctacgatacg ggagggctta ccatctggcc ccagtgctgc aatgataccg 5280
cgagacccac gctcaccggc tccagattta tcagcaataa accagccagc cggaagggcc 5340
gagcgcagaa gtggtcctgc aactttatcc gcctccatcc agtctattaa ttgttgccgg 5400
gaagctagag taagtagttc gccagttaat agtttgcgca acgttgttgc cattgctaca 5460
ggcatcgtgg tgtcacgctc gtcgtttggt atggcttcat tcagctccgg ttcccaacga 5520
tcaaggcgag ttacatgatc ccccatgttg tgcaaaaaag cggttagctc cttcggtcct 5580
ccgatcgttg tcagaagtaa gttggccgca gtgttatcac tcatggttat ggcagcactg 5640
cataattctc ttactgtcat gccatccgta agatgctttt ctgtgactgg tgagtactca 5700
accaagtcat tctgagaata gtgtatgcgg cgaccgagtt gctcttgccc ggcgtcaata 5760
cgggataata ccgcgccaca tagcagaact ttaaaagtgc tcatcattgg aaaacgttct 5820
tcggggcgaa aactctcaag gatcttaccg ctgttgagat ccagttcgat gtaacccact 5880
cgtgcaccca actgatcttc agcatctttt actttcacca gcgtttctgg gtgagcaaaa 5940
acaggaaggc aaaatgccgc aaaaaaggga ataagggcga cacggaaatg ttgaatactc 6000
atactcttcc tttttcaata ttattgaagc atttatcagg gttattgtct catgagcgga 6060
tacatatttg aatgtattta gaaaaataaa caaatagggg ttccgcgcac atttccccga 6120
aaagtgccac ctgacgtc 6138
<210> 3
<211> 23
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 3
catcggtcag ggtggtcacg agg 23
<210> 4
<211> 24
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 4
accgcatcgg tcagggtggt cacg 24
<210> 5
<211> 24
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 5
aaaccgtgac caccctgacc gatg 24
<210> 6
<211> 8877
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 6
gcactaatct gagcgacatc attgagaagg agactgggaa acagctggtc attcaggagt 60
ccatcctgat gctgcctgag gaggtggagg aagtgatcgg caacaagcca gagtctgaca 120
tcctggtgca caccgcctac gacgagtcca cagatgagaa tgtgatgctg ctgacctctg 180
acgcccccga gtataagcct tgggccctgg tcatccagga ttctaacggc gagaataaga 240
tcaagatgct gagcggagga tccggaggat ctggaggcag caccaacctg tctgacatca 300
tcgagaagga gacaggcaag cagctggtca tccaggagag catcctgatg ctgcccgaag 360
aagtcgaaga agtgatcgga aacaagcctg agagcgatat cctggtccat accgcctacg 420
acgagagtac cgacgaaaat gtgatgctgc tgacatccga cgccccagag tataagccct 480
gggctctggt catccaggat tccaacggag agaacaaaat caaaatgctg tctggcggct 540
caaaaagaac cgccgacggc agcgaattcg agcccaagaa gaagaggaaa gtctaaccgg 600
tcatcatcac catcaccatt gagtttaaac ccgctgatca gcctcgactg tgccttctag 660
ttgccagcca tctgttgttt gcccctcccc cgtgccttcc ttgaccctgg aaggtgccac 720
tcccactgtc ctttcctaat aaaatgagga aattgcatcg cattgtctga gtaggtgtca 780
ttctattctg gggggtgggg tggggcagga cagcaagggg gaggattggg aagacaatag 840
caggcatgct ggggatgcgg tgggctctat ggcttctgag gcggaaagaa ccagctgggg 900
ctcgataccg tcgacctcta gctagagctt ggcgtaatca tggtcatagc tgtttcctgt 960
gtgaaattgt tatccgctca caattccaca caacatacga gccggaagca taaagtgtaa 1020
agcctaggat gcctaatgag tgagctaact cacattaatt gcgttgcgct cactgcccgc 1080
tttccagtcg ggaaacctgt cgtgccagct gcattaatga atcggccaac gcgcgggaag 1140
aggcggtttg cgtattgggc gctcttccgc ttcctcgctc actgactcgc tgcgctcggt 1200
cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg gtaatacggt tatccacaga 1260
atcaggggat aacgcaggaa agaacatgtg agcaaaaggc cagcaaaagg ccaggaaccg 1320
taaaaaggcc gcgttgctgg cgtttttcca taggctccgc ccccctgacg agcatcacaa 1380
aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga ctataaagat accaggcgtt 1440
tccccctgga agctccctcg tgcgctctcc tgttccgacc ctgccgctta ccggatacct 1500
gtccgccttt ctcccttcgg gaagcgtggc gctttctcat agctcacgct gtaggtatct 1560
cagttcggtg taggtcgttc gctccaagct gggctgtgtg cacgaacccc ccgttcagcc 1620
cgaccgctgc gccttatccg gtaactatcg tcttgagtcc aacccggtaa gacacgactt 1680
atcgccactg gcagcagcca ctggtaacag gattagcaga gcgaggtatg taggcggtgc 1740
tacagagttc ttgaagtggt ggcctaacta cggctacact agaagaacag tatttggtat 1800
ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt ggtagctctt gatccggcaa 1860
acaaaccacc gctggtagcg gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa 1920
aaaaggatct caagaagatc ctttgatctt ttctacgggg tctgacactc agtggaacga 1980
aaactcacgt taagggattt tggtcatgag attatcaaaa aggatcttca cctagatcct 2040
tttaaattaa aaatgaagtt ttaaatcaat ctaaagtata tatgagtaaa cttggtctga 2100
cagttaccaa tgcttaatca gtgaggcacc tatctcagcg atctgtctat ttcgttcatc 2160
catagttgcc tgactccccg tcgtgtagat aactacgata cgggagggct taccatctgg 2220
ccccagtgct gcaatgatac cgcgagaccc acgctcaccg gctccagatt tatcagcaat 2280
aaaccagcca gccggaaggg ccgagcgcag aagtggtcct gcaactttat ccgcctccat 2340
ccagtctatt aattgttgcc gggaagctag agtaagtagt tcgccagtta atagtttgcg 2400
caacgttgtt gccattgcta caggcatcgt ggtgtcacgc tcgtcgtttg gtatggcttc 2460
attcagctcc ggttcccaac gatcaaggcg agttacatga tcccccatgt tgtgcaaaaa 2520
agcggttagc tccttcggtc ctccgatcgt tgtcagaagt aagttggccg cagtgttatc 2580
actcatggtt atggcagcac tgcataattc tcttactgtc atgccatccg taagatgctt 2640
ttctgtgact ggtgagtact caaccaagtc attctgagaa tagtgtatgc ggcgaccgag 2700
ttgctcttgc ccggcgtcaa tacgggataa taccgcgcca catagcagaa ctttaaaagt 2760
gctcatcatt ggaaaacgtt cttcggggcg aaaactctca aggatcttac cgctgttgag 2820
atccagttcg atgtaaccca ctcgtgcacc caactgatct tcagcatctt ttactttcac 2880
cagcgtttct gggtgagcaa aaacaggaag gcaaaatgcc gcaaaaaagg gaataagggc 2940
gacacggaaa tgttgaatac tcatactctt cctttttcaa tattattgaa gcatttatca 3000
gggttattgt ctcatgagcg gatacatatt tgaatgtatt tagaaaaata aacaaatagg 3060
ggttccgcgc acatttcccc gaaaagtgcc acctgacgtc gacggatcgg gagatcgatc 3120
tcccgatccc ctagggtcga ctctcagtac aatctgctct gatgccgcat agttaagcca 3180
gtatctgctc cctgcttgtg tgttggaggt cgctgagtag tgcgcgagca aaatttaagc 3240
tacaacaagg caaggcttga ccgacaattg catgaagaat ctgcttaggg ttaggcgttt 3300
tgcgctgctt cgcgatgtac gggccagata tacgcgttga cattgattat tgactagtta 3360
ttaatagtaa tcaattacgg ggtcattagt tcatagccca tatatggagt tccgcgttac 3420
ataacttacg gtaaatggcc cgcctggctg accgcccaac gacccccgcc cattgacgtc 3480
aataatgacg tatgttccca tagtaacgcc aatagggact ttccattgac gtcaatgggt 3540
ggagtattta cggtaaactg cccacttggc agtacatcaa gtgtatcata tgccaagtac 3600
gccccctatt gacgtcaatg acggtaaatg gcccgcctgg cattatgccc agtacatgac 3660
cttatgggac tttcctactt ggcagtacat ctacgtatta gtcatcgcta ttaccatggt 3720
gatgcggttt tggcagtaca tcaatgggcg tggatagcgg tttgactcac ggggatttcc 3780
aagtctccac cccattgacg tcaatgggag tttgttttgg caccaaaatc aacgggactt 3840
tccaaaatgt cgtaacaact ccgccccatt gacgcaaatg ggcggtaggc gtgtacggtg 3900
ggaggtctat ataagcagag ctggtttagt gaaccgtcag atccgctaga gatccgcggc 3960
cgctaatacg actcactata gggagagccg ccaccatgaa acggacagcc gacggaagcg 4020
agttcgagtc accaaagaag aagcggaaag tcagcagtga ggcatctcca gcaagcggac 4080
caaggcacct gatggacccc cacatcttca cctctaactt taacaatggc atcggcaggc 4140
acaagacata cctgtgctat gaggtggagc gcctggacaa tggcaccagc gtgaagatgg 4200
atcagcacag aggcttcctg cacaaccagg ccaagaatct gctgtgcggc ttctacggcc 4260
ggcacgcaga gctgagattt ctggacctgg tgcctagcct gcagctggat ccagcccaga 4320
tctatagggt gacctggttc atcagctggt ccccatgctt ttcctgggga tgtgcaggag 4380
aggtgcgcgc ctttctgcag gagaacacac acgtgcggct gagaatcttc gccgcccgga 4440
tctttgacta cgatcctctg tataaggagg ccctgcagat gctgagagac gcaggagccc 4500
aggtgtccat catgacctac gatgagttca agcactgctg ggacacattt gtggatcacc 4560
agggctgtcc cttccagcct tgggacggac tggatgagca ctcccaggcc ctgtctggca 4620
ggctgagggc catcctgcag aaccagggca attctggagg atctagcgga ggatcctctg 4680
gcagcgagac accaggaaca agcgagtcag caacaccaga gagcagtggc ggcagcagcg 4740
gcggcagcga caagaagtac agcatcggcc tggccatcgg caccaactct gtgggctggg 4800
ccgtgatcac cgacgagtac aaggtgccca gcaagaaatt caaggtgctg ggcaacaccg 4860
accggcacag catcaagaag aacctgatcg gagccctgct gttcgacagc ggcgaaacag 4920
ccgaggccac ccggctgaag agaaccgcca gaagaagata caccagacgg aagaaccgga 4980
tctgctatct gcaagagatc ttcagcaacg agatggccaa ggtggacgac agcttcttcc 5040
acagactgga agagtccttc ctggtggaag aggataagaa gcacgagcgg caccccatct 5100
tcggcaacat cgtggacgag gtggcctacc acgagaagta ccccaccatc taccacctga 5160
gaaagaaact ggtggacagc accgacaagg ccgacctgcg gctgatctat ctggccctgg 5220
cccacatgat caagttccgg ggccacttcc tgatcgaggg cgacctgaac cccgacaaca 5280
gcgacgtgga caagctgttc atccagctgg tgcagaccta caaccagctg ttcgaggaaa 5340
accccatcaa cgccagcggc gtggacgcca aggccatcct gtctgccaga ctgagcaaga 5400
gcagacggct ggaaaatctg atcgcccagc tgcccggcga gaagaagaat ggcctgttcg 5460
gaaacctgat tgccctgagc ctgggcctga cccccaactt caagagcaac ttcgacctgg 5520
ccgaggatgc caaactgcag ctgagcaagg acacctacga cgacgacctg gacaacctgc 5580
tggcccagat cggcgaccag tacgccgacc tgtttctggc cgccaagaac ctgtccgacg 5640
ccatcctgct gagcgacatc ctgagagtga acaccgagat caccaaggcc cccctgagcg 5700
cctctatgat caagagatac gacgagcacc accaggacct gaccctgctg aaagctctcg 5760
tgcggcagca gctgcctgag aagtacaaag agattttctt cgaccagagc aagaacggct 5820
acgccggcta cattgacggc ggagccagcc aggaagagtt ctacaagttc atcaagccca 5880
tcctggaaaa gatggacggc accgaggaac tgctcgtgaa gctgaacaga gaggacctgc 5940
tgcggaagca gcggaccttc gacaacggca gcatccccca ccagatccac ctgggagagc 6000
tgcacgccat tctgcggcgg caggaagatt tttacccatt cctgaaggac aaccgggaaa 6060
agatcgagaa gatcctgacc ttccgcatcc cctactacgt gggccctctg gccaggggaa 6120
acagcagatt cgcctggatg accagaaaga gcgaggaaac catcaccccc tggaacttcg 6180
aggaagtggt ggacaagggc gcttccgccc agagcttcat cgagcggatg accaacttcg 6240
ataagaacct gcccaacgag aaggtgctgc ccaagcacag cctgctgtac gagtacttca 6300
ccgtgtataa cgagctgacc aaagtgaaat acgtgaccga gggaatgaga aagcccgcct 6360
tcctgagcgg cgagcagaaa aaggccatcg tggacctgct gttcaagacc aaccggaaag 6420
tgaccgtgaa gcagctgaaa gaggactact tcaagaaaat cgagtgcttc gactccgtgg 6480
aaatctccgg cgtggaagat cggttcaacg cctccctggg cacataccac gatctgctga 6540
aaattatcaa ggacaaggac ttcctggaca atgaggaaaa cgaggacatt ctggaagata 6600
tcgtgctgac cctgacactg tttgaggaca gagagatgat cgaggaacgg ctgaaaacct 6660
atgcccacct gttcgacgac aaagtgatga agcagctgaa gcggcggaga tacaccggct 6720
ggggcaggct gagccggaag ctgatcaacg gcatccggga caagcagtcc ggcaagacaa 6780
tcctggattt cctgaagtcc gacggcttcg ccaacagaaa cttcatgcag ctgatccacg 6840
acgacagcct gacctttaaa gaggacatcc agaaagccca ggtgtccggc cagggcgata 6900
gcctgcacga gcacattgcc aatctggccg gcagccccgc cattaagaag ggcatcctgc 6960
agacagtgaa ggtggtggac gagctcgtga aagtgatggg ccggcacaag cccgagaaca 7020
tcgtgatcga aatggccaga gagaaccaga ccacccagaa gggacagaag aacagccgcg 7080
agagaatgaa gcggatcgaa gagggcatca aagagctggg cagccagatc ctgaaagaac 7140
accccgtgga aaacacccag ctgcagaacg agaagctgta cctgtactac ctgcagaatg 7200
ggcgggatat gtacgtggac caggaactgg acatcaaccg gctgtccgac tacgatgtgg 7260
accatatcgt gcctcagagc tttctgaagg acgactccat cgacaacaag gtgctgacca 7320
gaagcgacaa gaaccggggc aagagcgaca acgtgccctc cgaagaggtc gtgaagaaga 7380
tgaagaacta ctggcggcag ctgctgaacg ccaagctgat tacccagaga aagttcgaca 7440
atctgaccaa ggccgagaga ggcggcctga gcgaactgga taaggccggc ttcatcaaga 7500
gacagctggt ggaaacccgg cagatcacaa agcacgtggc acagatcctg gactcccgga 7560
tgaacactaa gtacgacgag aatgacaagc tgatccggga agtgaaagtg atcaccctga 7620
agtccaagct ggtgtccgat ttccggaagg atttccagtt ttacaaagtg cgcgagatca 7680
acaactacca ccacgcccac gacgcctacc taaacgccgt cgtgggaacc gccctgatca 7740
aaaagtaccc taagctggaa agcgagttcg tgtacggcga ctacaaggtg tacgacgtgc 7800
ggaagatgat cgccaagagc gagcaggaaa tcggcaaggc taccgccaag tacttcttct 7860
acagcaacat catgaacttt ttcaagaccg agattaccct ggccaacggc gagatccgga 7920
agcggcctct gatcgagaca aacggcgaaa ccggggagat cgtgtgggat aagggccggg 7980
attttgccac cgtgcggaaa gtgctgagca tgccccaagt gaatatcgtg aaaaagaccg 8040
aggtgcagac aggcggcttc agcaaagagt ctatcctgcc caagaggaac agcgataagc 8100
tgatcgccag aaagaaggac tgggacccta agaagtacgg cggcttcgac agccccaccg 8160
tggcctattc tgtgctggtg gtggccaaag tggaaaaggg caagtccaag aaactgaaga 8220
gtgtgaaaga gctgctgggg atcaccatca tggaaagaag cagcttcgag aagaatccca 8280
tcgactttct ggaagccaag ggctacaaag aagtgaaaaa ggacctgatc atcaagctgc 8340
ctaagtactc cctgttcgag ctggaaaacg gccggaagag aatgctggcc tctgccggcg 8400
aactgcagaa gggaaacgaa ctggccctgc cctccaaata tgtgaacttc ctgtacctgg 8460
ccagccacta tgagaagctg aagggctccc ccgaggataa tgagcagaaa cagctgtttg 8520
tggaacagca caagcactac ctggacgaga tcatcgagca gatcagcgag ttctccaaga 8580
gagtgatcct ggccgacgct aatctggaca aagtgctgtc cgcctacaac aagcaccggg 8640
ataagcccat cagagagcag gccgagaata tcatccacct gtttaccctg accaatctgg 8700
gagcccctgc cgccttcaag tactttgaca ccaccatcga ccggaagagg tacaccagca 8760
ccaaagaggt gctggacgcc accctgatcc accagagcat caccggcctg tacgagacac 8820
ggatcgacct gtctcagctg ggaggtgaca gcggcgggag cggcgggagc gggggga 8877
<210> 7
<211> 8319
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 7
atgagtgagc taactcacat taattgcgtt gcgctcactg cccgctttcc agtcgggaaa 60
cctgtcgtgc cagctgcatt aatgaatcgg ccaacgcgcg ggaagaggcg gtttgcgtat 120
tgggcgctct tccgcttcct cgctcactga ctcgctgcgc tcggtcgttc ggctgcggcg 180
agcggtatca gctcactcaa aggcggtaat acggttatcc acagaatcag gggataacgc 240
aggaaagaac atgtgagcaa aaggccagca aaaggccagg aaccgtaaaa aggccgcgtt 300
gctggcgttt ttccataggc tccgcccccc tgacgagcat cacaaaaatc gacgctcaag 360
tcagaggtgg cgaaacccga caggactata aagataccag gcgtttcccc ctggaagctc 420
cctcgtgcgc tctcctgttc cgaccctgcc gcttaccgga tacctgtccg cctttctccc 480
ttcgggaagc gtggcgcttt ctcatagctc acgctgtagg tatctcagtt cggtgtaggt 540
cgttcgctcc aagctgggct gtgtgcacga accccccgtt cagcccgacc gctgcgcctt 600
atccggtaac tatcgtcttg agtccaaccc ggtaagacac gacttatcgc cactggcagc 660
agccactggt aacaggatta gcagagcgag gtatgtaggc ggtgctacag agttcttgaa 720
gtggtggcct aactacggct acactagaag aacagtattt ggtatctgcg ctctgctgaa 780
gccagttacc ttcggaaaaa gagttggtag ctcttgatcc ggcaaacaaa ccaccgctgg 840
tagcggtggt ttttttgttt gcaagcagca gattacgcgc agaaaaaaag gatctcaaga 900
agatcctttg atcttttcta cggggtctga cactcagtgg aacgaaaact cacgttaagg 960
gattttggtc atgagattat caaaaaggat cttcacctag atccttttaa attaaaaatg 1020
aagttttaaa tcaatctaaa gtatatatga gtaaacttgg tctgacagtt accaatgctt 1080
aatcagtgag gcacctatct cagcgatctg tctatttcgt tcatccatag ttgcctgact 1140
ccccgtcgtg tagataacta cgatacggga gggcttacca tctggcccca gtgctgcaat 1200
gataccgcga gacccacgct caccggctcc agatttatca gcaataaacc agccagccgg 1260
aagggccgag cgcagaagtg gtcctgcaac tttatccgcc tccatccagt ctattaattg 1320
ttgccgggaa gctagagtaa gtagttcgcc agttaatagt ttgcgcaacg ttgttgccat 1380
tgctacaggc atcgtggtgt cacgctcgtc gtttggtatg gcttcattca gctccggttc 1440
ccaacgatca aggcgagtta catgatcccc catgttgtgc aaaaaagcgg ttagctcctt 1500
cggtcctccg atcgttgtca gaagtaagtt ggccgcagtg ttatcactca tggttatggc 1560
agcactgcat aattctctta ctgtcatgcc atccgtaaga tgcttttctg tgactggtga 1620
gtactcaacc aagtcattct gagaatagtg tatgcggcga ccgagttgct cttgcccggc 1680
gtcaatacgg gataataccg cgccacatag cagaacttta aaagtgctca tcattggaaa 1740
acgttcttcg gggcgaaaac tctcaaggat cttaccgctg ttgagatcca gttcgatgta 1800
acccactcgt gcacccaact gatcttcagc atcttttact ttcaccagcg tttctgggtg 1860
agcaaaaaca ggaaggcaaa atgccgcaaa aaagggaata agggcgacac ggaaatgttg 1920
aatactcata ctcttccttt ttcaatatta ttgaagcatt tatcagggtt attgtctcat 1980
gagcggatac atatttgaat gtatttagaa aaataaacaa ataggggttc cgcgcacatt 2040
tccccgaaaa gtgccacctg acgtcgacgg atcgggagat cgatctcccg atcccctagg 2100
gtcgactctc agtacaatct gctctgatgc cgcatagtta agccagtatc tgctccctgc 2160
ttgtgtgttg gaggtcgctg agtagtgcgc gagcaaaatt taagctacaa caaggcaagg 2220
cttgaccgac aattgcatga agaatctgct tagggttagg cgttttgcgc tgcttcgcga 2280
tgtacgggcc agatatacgc gttgacattg attattgact agttattaat agtaatcaat 2340
tacggggtca ttagttcata gcccatatat ggagttccgc gttacataac ttacggtaaa 2400
tggcccgcct ggctgaccgc ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt 2460
tcccatagta acgccaatag ggactttcca ttgacgtcaa tgggtggagt atttacggta 2520
aactgcccac ttggcagtac atcaagtgta tcatatgcca agtacgcccc ctattgacgt 2580
caatgacggt aaatggcccg cctggcatta tgcccagtac atgaccttat gggactttcc 2640
tacttggcag tacatctacg tattagtcat cgctattacc atggtgatgc ggttttggca 2700
gtacatcaat gggcgtggat agcggtttga ctcacgggga tttccaagtc tccaccccat 2760
tgacgtcaat gggagtttgt tttggcacca aaatcaacgg gactttccaa aatgtcgtaa 2820
caactccgcc ccattgacgc aaatgggcgg taggcgtgta cggtgggagg tctatataag 2880
cagagctggt ttagtgaacc gtcagatccg ctagagatcc gcggccgcta atacgactca 2940
ctatagggag agccgccacc atgaaacgga cagccgacgg aagcgagttc gagtcaccaa 3000
agaagaagcg gaaagtcagc agtgaggcat ctccagcaag cggaccaagg cacctgatgg 3060
acccccacat cttcacctct aactttaaca atggcatcgg caggcacaag acatacctgt 3120
gctatgaggt ggagcgcctg gacaatggca ccagcgtgaa gatggatcag cacagaggct 3180
tcctgcacaa ccaggccaag aatctgctgt gcggcttcta cggccggcac gcagagctga 3240
gatttctgga cctggtgcct agcctgcagc tggatccagc ccagatctat agggtgacct 3300
ggttcatcag ctggtcccca tgcttttcct ggggatgtgc aggagaggtg cgcgcctttc 3360
tgcaggagaa cacacacgtg cggctgagaa tcttcgccgc ccggatcttt gactacgatc 3420
ctctgtataa ggaggccctg cagatgctga gagacgcagg agcccaggtg tccatcatga 3480
cctacgatga gttcaagcac tgctgggaca catttgtgga tcaccagggc tgtcccttcc 3540
agccttggga cggactggat gagcactccc aggccctgtc tggcaggctg agggccatcc 3600
tgcagaacca gggcaattct ggaggatcta gcggaggatc ctctggcagc gagacaccag 3660
gaacaagcga gtcagcaaca ccagagagca gtggcggcag cagcggcggc agcgacaaga 3720
agtacagcat cggcctggcc atcggcacca actctgtggg ctgggccgtg atcaccgacg 3780
agtacaaggt gcccagcaag aaattcaagg tgctgggcaa caccgaccgg cacagcatca 3840
agaagaacct gatcggagcc ctgctgttcg acagcggcga aacagccgag gccacccggc 3900
tgaagagaac cgccagaaga agatacacca gacggaagaa ccggatctgc tatctgcaag 3960
agatcttcag caacgagatg gccaaggtgg acgacagctt cttccacaga ctggaagagt 4020
ccttcctggt ggaagaggat aagaagcacg agcggcaccc catcttcggc aacatcgtgg 4080
acgaggtggc ctaccacgag aagtacccca ccatctacca cctgagaaag aaactggtgg 4140
acagcaccga caaggccgac ctgcggctga tctatctggc cctggcccac atgatcaagt 4200
tccggggcca cttcctgatc gagggcgacc tgaaccccga caacagcgac gtggacaagc 4260
tgttcatcca gctggtgcag acctacaacc agctgttcga ggaaaacccc atcaacgcca 4320
gcggcgtgga cgccaaggcc atcctgtctg ccagactgag caagagcaga cggctggaaa 4380
atctgatcgc ccagctgccc ggcgagaaga agaatggcct gttcggaaac ctgattgccc 4440
tgagcctggg cctgaccccc aacttcaaga gcaacttcga cctggccgag gatgccaaac 4500
tgcagctgag caaggacacc tacgacgacg acctggacaa cctgctggcc cagatcggcg 4560
accagtacgc cgacctgttt ctggccgcca agaacctgtc cgacgccatc ctgctgagcg 4620
acatcctgag agtgaacacc gagatcacca aggcccccct gagcgcctct atgatcaaga 4680
gatacgacga gcaccaccag gacctgaccc tgctgaaagc tctcgtgcgg cagcagctgc 4740
ctgagaagta caaagagatt ttcttcgacc agagcaagaa cggctacgcc ggctacattg 4800
acggcggagc cagccaggaa gagttctaca agttcatcaa gcccatcctg gaaaagatgg 4860
acggcaccga ggaactgctc gtgaagctga acagagagga cctgctgcgg aagcagcgga 4920
ccttcgacaa cggcagcatc ccccaccaga tccacctggg agagctgcac gccattctgc 4980
ggcggcagga agatttttac ccattcctga aggacaaccg ggaaaagatc gagaagatcc 5040
tgaccttccg catcccctac tacgtgggcc ctctggccag gggaaacagc agattcgcct 5100
ggatgaccag aaagagcgag gaaaccatca ccccctggaa cttcgaggaa gtggtggaca 5160
agggcgcttc cgcccagagc ttcatcgagc ggatgaccaa cttcgataag aacctgccca 5220
acgagaaggt gctgcccaag cacagcctgc tgtacgagta cttcaccgtg tataacgagc 5280
tgaccaaagt gaaatacgtg accgagggaa tgagaaagcc cgccttcctg agcggcgagc 5340
agaaaaaggc catcgtggac ctgctgttca agaccaaccg gaaagtgacc gtgaagcagc 5400
tgaaagagga ctacttcaag aaaatcgagt gcttcgactc cgtggaaatc tccggcgtgg 5460
aagatcggtt caacgcctcc ctgggcacat accacgatct gctgaaaatt atcaaggaca 5520
aggacttcct ggacaatgag gaaaacgagg acattctgga agatatcgtg ctgaccctga 5580
cactgtttga ggacagagag atgatcgagg aacggctgaa aacctatgcc cacctgttcg 5640
acgacaaagt gatgaagcag ctgaagcggc ggagatacac cggctggggc aggctgagcc 5700
ggaagctgat caacggcatc cgggacaagc agtccggcaa gacaatcctg gatttcctga 5760
agtccgacgg cttcgccaac agaaacttca tgcagctgat ccacgacgac agcctgacct 5820
ttaaagagga catccagaaa gcccaggtgt ccggccaggg cgatagcctg cacgagcaca 5880
ttgccaatct ggccggcagc cccgccatta agaagggcat cctgcagaca gtgaaggtgg 5940
tggacgagct cgtgaaagtg atgggccggc acaagcccga gaacatcgtg atcgaaatgg 6000
ccagagagaa ccagaccacc cagaagggac agaagaacag ccgcgagaga atgaagcgga 6060
tcgaagaggg catcaaagag ctgggcagcc agatcctgaa agaacacccc gtggaaaaca 6120
cccagctgca gaacgagaag ctgtacctgt actacctgca gaatgggcgg gatatgtacg 6180
tggaccagga actggacatc aaccggctgt ccgactacga tgtggaccat atcgtgcctc 6240
agagctttct gaaggacgac tccatcgaca acaaggtgct gaccagaagc gacaagaacc 6300
ggggcaagag cgacaacgtg ccctccgaag aggtcgtgaa gaagatgaag aactactggc 6360
ggcagctgct gaacgccaag ctgattaccc agagaaagtt cgacaatctg accaaggccg 6420
agagaggcgg cctgagcgaa ctggataagg ccggcttcat caagagacag ctggtggaaa 6480
cccggcagat cacaaagcac gtggcacaga tcctggactc ccggatgaac actaagtacg 6540
acgagaatga caagctgatc cgggaagtga aagtgatcac cctgaagtcc aagctggtgt 6600
ccgatttccg gaaggatttc cagttttaca aagtgcgcga gatcaacaac taccaccacg 6660
cccacgacgc ctacctaaac gccgtcgtgg gaaccgccct gatcaaaaag taccctaagc 6720
tggaaagcga gttcgtgtac ggcgactaca aggtgtacga cgtgcggaag atgatcgcca 6780
agagcgagca ggaaatcggc aaggctaccg ccaagtactt cttctacagc aacatcatga 6840
actttttcaa gaccgagatt accctggcca acggcgagat ccggaagcgg cctctgatcg 6900
agacaaacgg cgaaaccggg gagatcgtgt gggataaggg ccgggatttt gccaccgtgc 6960
ggaaagtgct gagcatgccc caagtgaata tcgtgaaaaa gaccgaggtg cagacaggcg 7020
gcttcagcaa agagtctatc ctgcccaaga ggaacagcga taagctgatc gccagaaaga 7080
aggactggga ccctaagaag tacggcggct tcgacagccc caccgtggcc tattctgtgc 7140
tggtggtggc caaagtggaa aagggcaagt ccaagaaact gaagagtgtg aaagagctgc 7200
tggggatcac catcatggaa agaagcagct tcgagaagaa tcccatcgac tttctggaag 7260
ccaagggcta caaagaagtg aaaaaggacc tgatcatcaa gctgcctaag tactccctgt 7320
tcgagctgga aaacggccgg aagagaatgc tggcctctgc cggcgaactg cagaagggaa 7380
acgaactggc cctgccctcc aaatatgtga acttcctgta cctggccagc cactatgaga 7440
agctgaaggg ctcccccgag gataatgagc agaaacagct gtttgtggaa cagcacaagc 7500
actacctgga cgagatcatc gagcagatca gcgagttctc caagagagtg atcctggccg 7560
acgctaatct ggacaaagtg ctgtccgcct acaacaagca ccgggataag cccatcagag 7620
agcaggccga gaatatcatc cacctgttta ccctgaccaa tctgggagcc cctgccgcct 7680
tcaagtactt tgacaccacc atcgaccgga agaggtacac cagcaccaaa gaggtgctgg 7740
acgccaccct gatccaccag agcatcaccg gcctgtacga gacacggatc gacctgtctc 7800
agctgggagg tgactctggc ggctcaaaaa gaaccgccga cggcagcgaa ttcgagccca 7860
agaagaagag gaaagtctaa ccggtcatca tcaccatcac cattgagttt aaacccgctg 7920
atcagcctcg actgtgcctt ctagttgcca gccatctgtt gtttgcccct cccccgtgcc 7980
ttccttgacc ctggaaggtg ccactcccac tgtcctttcc taataaaatg aggaaattgc 8040
atcgcattgt ctgagtaggt gtcattctat tctggggggt ggggtggggc aggacagcaa 8100
gggggaggat tgggaagaca atagcaggca tgctggggat gcggtgggct ctatggcttc 8160
tgaggcggaa agaaccagct ggggctcgat accgtcgacc tctagctaga gcttggcgta 8220
atcatggtca tagctgtttc ctgtgtgaaa ttgttatccg ctcacaattc cacacaacat 8280
acgagccgga agcataaagt gtaaagccta ggatgccta 8319
<210> 8
<211> 8976
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 8
atgagtgagc taactcacat taattgcgtt gcgctcactg cccgctttcc agtcgggaaa 60
cctgtcgtgc cagctgcatt aatgaatcgg ccaacgcgcg ggaagaggcg gtttgcgtat 120
tgggcgctct tccgcttcct cgctcactga ctcgctgcgc tcggtcgttc ggctgcggcg 180
agcggtatca gctcactcaa aggcggtaat acggttatcc acagaatcag gggataacgc 240
aggaaagaac atgtgagcaa aaggccagca aaaggccagg aaccgtaaaa aggccgcgtt 300
gctggcgttt ttccataggc tccgcccccc tgacgagcat cacaaaaatc gacgctcaag 360
tcagaggtgg cgaaacccga caggactata aagataccag gcgtttcccc ctggaagctc 420
cctcgtgcgc tctcctgttc cgaccctgcc gcttaccgga tacctgtccg cctttctccc 480
ttcgggaagc gtggcgcttt ctcatagctc acgctgtagg tatctcagtt cggtgtaggt 540
cgttcgctcc aagctgggct gtgtgcacga accccccgtt cagcccgacc gctgcgcctt 600
atccggtaac tatcgtcttg agtccaaccc ggtaagacac gacttatcgc cactggcagc 660
agccactggt aacaggatta gcagagcgag gtatgtaggc ggtgctacag agttcttgaa 720
gtggtggcct aactacggct acactagaag aacagtattt ggtatctgcg ctctgctgaa 780
gccagttacc ttcggaaaaa gagttggtag ctcttgatcc ggcaaacaaa ccaccgctgg 840
tagcggtggt ttttttgttt gcaagcagca gattacgcgc agaaaaaaag gatctcaaga 900
agatcctttg atcttttcta cggggtctga cactcagtgg aacgaaaact cacgttaagg 960
gattttggtc atgagattat caaaaaggat cttcacctag atccttttaa attaaaaatg 1020
aagttttaaa tcaatctaaa gtatatatga gtaaacttgg tctgacagtt accaatgctt 1080
aatcagtgag gcacctatct cagcgatctg tctatttcgt tcatccatag ttgcctgact 1140
ccccgtcgtg tagataacta cgatacggga gggcttacca tctggcccca gtgctgcaat 1200
gataccgcga gacccacgct caccggctcc agatttatca gcaataaacc agccagccgg 1260
aagggccgag cgcagaagtg gtcctgcaac tttatccgcc tccatccagt ctattaattg 1320
ttgccgggaa gctagagtaa gtagttcgcc agttaatagt ttgcgcaacg ttgttgccat 1380
tgctacaggc atcgtggtgt cacgctcgtc gtttggtatg gcttcattca gctccggttc 1440
ccaacgatca aggcgagtta catgatcccc catgttgtgc aaaaaagcgg ttagctcctt 1500
cggtcctccg atcgttgtca gaagtaagtt ggccgcagtg ttatcactca tggttatggc 1560
agcactgcat aattctctta ctgtcatgcc atccgtaaga tgcttttctg tgactggtga 1620
gtactcaacc aagtcattct gagaatagtg tatgcggcga ccgagttgct cttgcccggc 1680
gtcaatacgg gataataccg cgccacatag cagaacttta aaagtgctca tcattggaaa 1740
acgttcttcg gggcgaaaac tctcaaggat cttaccgctg ttgagatcca gttcgatgta 1800
acccactcgt gcacccaact gatcttcagc atcttttact ttcaccagcg tttctgggtg 1860
agcaaaaaca ggaaggcaaa atgccgcaaa aaagggaata agggcgacac ggaaatgttg 1920
aatactcata ctcttccttt ttcaatatta ttgaagcatt tatcagggtt attgtctcat 1980
gagcggatac atatttgaat gtatttagaa aaataaacaa ataggggttc cgcgcacatt 2040
tccccgaaaa gtgccacctg acgtcgacgg atcgggagat cgatctcccg atcccctagg 2100
gtcgactctc agtacaatct gctctgatgc cgcatagtta agccagtatc tgctccctgc 2160
ttgtgtgttg gaggtcgctg agtagtgcgc gagcaaaatt taagctacaa caaggcaagg 2220
cttgaccgac aattgcatga agaatctgct tagggttagg cgttttgcgc tgcttcgcga 2280
tgtacgggcc agatatacgc gttgacattg attattgact agttattaat agtaatcaat 2340
tacggggtca ttagttcata gcccatatat ggagttccgc gttacataac ttacggtaaa 2400
tggcccgcct ggctgaccgc ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt 2460
tcccatagta acgccaatag ggactttcca ttgacgtcaa tgggtggagt atttacggta 2520
aactgcccac ttggcagtac atcaagtgta tcatatgcca agtacgcccc ctattgacgt 2580
caatgacggt aaatggcccg cctggcatta tgcccagtac atgaccttat gggactttcc 2640
tacttggcag tacatctacg tattagtcat cgctattacc atggtgatgc ggttttggca 2700
gtacatcaat gggcgtggat agcggtttga ctcacgggga tttccaagtc tccaccccat 2760
tgacgtcaat gggagtttgt tttggcacca aaatcaacgg gactttccaa aatgtcgtaa 2820
caactccgcc ccattgacgc aaatgggcgg taggcgtgta cggtgggagg tctatataag 2880
cagagctggt ttagtgaacc gtcagatccg ctagagatcc gcggccgcta atacgactca 2940
ctatagggag agccgccacc atgaaacgga cagccgacgg aagcgagttc gagtcaccaa 3000
agaagaagcg gaaagtcagc agtgaggcat ctccagcaag cggaccaagg cacctgatgg 3060
acccccacat cttcacctct aactttaaca atggcatcgg caggcacaag acatacctgt 3120
gctatgaggt ggagcgcctg gacaatggca ccagcgtgaa gatggatcag cacagaggct 3180
tcctgcacaa ccaggccaag aatctgctgt gcggcttcta cggccggcac gcagagctga 3240
gatttctgga cctggtgcct agcctgcagc tggatccagc ccagatctat agggtgacct 3300
ggttcatcag ctggtcccca tgcttttcct ggggatgtgc aggagaggtg cgcgcctttc 3360
tgcaggagaa cacacacgtg cggctgagaa tcttcgccgc ccggatcttt gactacgatc 3420
ctctgtataa ggaggccctg cagatgctga gagacgcagg agcccaggtg tccatcatga 3480
cctacgatga gttcaagcac tgctgggaca catttgtgga tcaccagggc tgtcccttcc 3540
agccttggga cggactggat gagcactccc aggccctgtc tggcaggctg agggccatcc 3600
tgcagaacca gggcaatagc ggaggatccg gaggatctgg aggcagcatg gccggagcac 3660
aggattttgt cccacatact gccgatctgg ctgagctggc tgccgccgct ggagagtgta 3720
gaggatgcgg gctgtaccgg gatgccacac aggccgtgtt cggagcaggc ggcaggagcg 3780
cccgcatcat gatgatcggc gagcagccag gcgataagga ggacctggcc ggcctgccat 3840
ttgtgggacc agcaggccgg ctgctggaca gagccctgga ggccgccgac atcgataggg 3900
acgccctgta cgtgaccaac gccgtgaagc acttcaagtt tacacgcgcc gcaggaggca 3960
agaggagaat ccacaagacc ccctctcgga cagaggtggt ggcctgcaga ccttggctga 4020
tcgccgagat gaccagcgtg gagccagatg tggtggtgct gctgggagca acagcagcaa 4080
aggccctgct gggcaatgac ttcagggtga cccagcacag gggagaggtg ctgcacgtgg 4140
acgatgtgcc aggcgatcct gccctggtgg caaccgtgca ccctagctcc ctgctgagag 4200
gcccaaagga ggagcgggaa tccgccttcg ctgggctggt ggacgacctg agagtggctg 4260
ccgatgtgag accttctgga ggatctagcg gaggatcctc tggcagcgag acaccaggaa 4320
caagcgagtc agcaacacca gagagcagtg gcggcagcag cggcggcagc gacaagaagt 4380
acagcatcgg cctggccatc ggcaccaact ctgtgggctg ggccgtgatc accgacgagt 4440
acaaggtgcc cagcaagaaa ttcaaggtgc tgggcaacac cgaccggcac agcatcaaga 4500
agaacctgat cggagccctg ctgttcgaca gcggcgaaac agccgaggcc acccggctga 4560
agagaaccgc cagaagaaga tacaccagac ggaagaaccg gatctgctat ctgcaagaga 4620
tcttcagcaa cgagatggcc aaggtggacg acagcttctt ccacagactg gaagagtcct 4680
tcctggtgga agaggataag aagcacgagc ggcaccccat cttcggcaac atcgtggacg 4740
aggtggccta ccacgagaag taccccacca tctaccacct gagaaagaaa ctggtggaca 4800
gcaccgacaa ggccgacctg cggctgatct atctggccct ggcccacatg atcaagttcc 4860
ggggccactt cctgatcgag ggcgacctga accccgacaa cagcgacgtg gacaagctgt 4920
tcatccagct ggtgcagacc tacaaccagc tgttcgagga aaaccccatc aacgccagcg 4980
gcgtggacgc caaggccatc ctgtctgcca gactgagcaa gagcagacgg ctggaaaatc 5040
tgatcgccca gctgcccggc gagaagaaga atggcctgtt cggaaacctg attgccctga 5100
gcctgggcct gacccccaac ttcaagagca acttcgacct ggccgaggat gccaaactgc 5160
agctgagcaa ggacacctac gacgacgacc tggacaacct gctggcccag atcggcgacc 5220
agtacgccga cctgtttctg gccgccaaga acctgtccga cgccatcctg ctgagcgaca 5280
tcctgagagt gaacaccgag atcaccaagg cccccctgag cgcctctatg atcaagagat 5340
acgacgagca ccaccaggac ctgaccctgc tgaaagctct cgtgcggcag cagctgcctg 5400
agaagtacaa agagattttc ttcgaccaga gcaagaacgg ctacgccggc tacattgacg 5460
gcggagccag ccaggaagag ttctacaagt tcatcaagcc catcctggaa aagatggacg 5520
gcaccgagga actgctcgtg aagctgaaca gagaggacct gctgcggaag cagcggacct 5580
tcgacaacgg cagcatcccc caccagatcc acctgggaga gctgcacgcc attctgcggc 5640
ggcaggaaga tttttaccca ttcctgaagg acaaccggga aaagatcgag aagatcctga 5700
ccttccgcat cccctactac gtgggccctc tggccagggg aaacagcaga ttcgcctgga 5760
tgaccagaaa gagcgaggaa accatcaccc cctggaactt cgaggaagtg gtggacaagg 5820
gcgcttccgc ccagagcttc atcgagcgga tgaccaactt cgataagaac ctgcccaacg 5880
agaaggtgct gcccaagcac agcctgctgt acgagtactt caccgtgtat aacgagctga 5940
ccaaagtgaa atacgtgacc gagggaatga gaaagcccgc cttcctgagc ggcgagcaga 6000
aaaaggccat cgtggacctg ctgttcaaga ccaaccggaa agtgaccgtg aagcagctga 6060
aagaggacta cttcaagaaa atcgagtgct tcgactccgt ggaaatctcc ggcgtggaag 6120
atcggttcaa cgcctccctg ggcacatacc acgatctgct gaaaattatc aaggacaagg 6180
acttcctgga caatgaggaa aacgaggaca ttctggaaga tatcgtgctg accctgacac 6240
tgtttgagga cagagagatg atcgaggaac ggctgaaaac ctatgcccac ctgttcgacg 6300
acaaagtgat gaagcagctg aagcggcgga gatacaccgg ctggggcagg ctgagccgga 6360
agctgatcaa cggcatccgg gacaagcagt ccggcaagac aatcctggat ttcctgaagt 6420
ccgacggctt cgccaacaga aacttcatgc agctgatcca cgacgacagc ctgaccttta 6480
aagaggacat ccagaaagcc caggtgtccg gccagggcga tagcctgcac gagcacattg 6540
ccaatctggc cggcagcccc gccattaaga agggcatcct gcagacagtg aaggtggtgg 6600
acgagctcgt gaaagtgatg ggccggcaca agcccgagaa catcgtgatc gaaatggcca 6660
gagagaacca gaccacccag aagggacaga agaacagccg cgagagaatg aagcggatcg 6720
aagagggcat caaagagctg ggcagccaga tcctgaaaga acaccccgtg gaaaacaccc 6780
agctgcagaa cgagaagctg tacctgtact acctgcagaa tgggcgggat atgtacgtgg 6840
accaggaact ggacatcaac cggctgtccg actacgatgt ggaccatatc gtgcctcaga 6900
gctttctgaa ggacgactcc atcgacaaca aggtgctgac cagaagcgac aagaaccggg 6960
gcaagagcga caacgtgccc tccgaagagg tcgtgaagaa gatgaagaac tactggcggc 7020
agctgctgaa cgccaagctg attacccaga gaaagttcga caatctgacc aaggccgaga 7080
gaggcggcct gagcgaactg gataaggccg gcttcatcaa gagacagctg gtggaaaccc 7140
ggcagatcac aaagcacgtg gcacagatcc tggactcccg gatgaacact aagtacgacg 7200
agaatgacaa gctgatccgg gaagtgaaag tgatcaccct gaagtccaag ctggtgtccg 7260
atttccggaa ggatttccag ttttacaaag tgcgcgagat caacaactac caccacgccc 7320
acgacgccta cctaaacgcc gtcgtgggaa ccgccctgat caaaaagtac cctaagctgg 7380
aaagcgagtt cgtgtacggc gactacaagg tgtacgacgt gcggaagatg atcgccaaga 7440
gcgagcagga aatcggcaag gctaccgcca agtacttctt ctacagcaac atcatgaact 7500
ttttcaagac cgagattacc ctggccaacg gcgagatccg gaagcggcct ctgatcgaga 7560
caaacggcga aaccggggag atcgtgtggg ataagggccg ggattttgcc accgtgcgga 7620
aagtgctgag catgccccaa gtgaatatcg tgaaaaagac cgaggtgcag acaggcggct 7680
tcagcaaaga gtctatcctg cccaagagga acagcgataa gctgatcgcc agaaagaagg 7740
actgggaccc taagaagtac ggcggcttcg acagccccac cgtggcctat tctgtgctgg 7800
tggtggccaa agtggaaaag ggcaagtcca agaaactgaa gagtgtgaaa gagctgctgg 7860
ggatcaccat catggaaaga agcagcttcg agaagaatcc catcgacttt ctggaagcca 7920
agggctacaa agaagtgaaa aaggacctga tcatcaagct gcctaagtac tccctgttcg 7980
agctggaaaa cggccggaag agaatgctgg cctctgccgg cgaactgcag aagggaaacg 8040
aactggccct gccctccaaa tatgtgaact tcctgtacct ggccagccac tatgagaagc 8100
tgaagggctc ccccgaggat aatgagcaga aacagctgtt tgtggaacag cacaagcact 8160
acctggacga gatcatcgag cagatcagcg agttctccaa gagagtgatc ctggccgacg 8220
ctaatctgga caaagtgctg tccgcctaca acaagcaccg ggataagccc atcagagagc 8280
aggccgagaa tatcatccac ctgtttaccc tgaccaatct gggagcccct gccgccttca 8340
agtactttga caccaccatc gaccggaaga ggtacaccag caccaaagag gtgctggacg 8400
ccaccctgat ccaccagagc atcaccggcc tgtacgagac acggatcgac ctgtctcagc 8460
tgggaggtga ctctggcggc tcaaaaagaa ccgccgacgg cagcgaattc gagcccaaga 8520
agaagaggaa agtctaaccg gtcatcatca ccatcaccat tgagtttaaa cccgctgatc 8580
agcctcgact gtgccttcta gttgccagcc atctgttgtt tgcccctccc ccgtgccttc 8640
cttgaccctg gaaggtgcca ctcccactgt cctttcctaa taaaatgagg aaattgcatc 8700
gcattgtctg agtaggtgtc attctattct ggggggtggg gtggggcagg acagcaaggg 8760
ggaggattgg gaagacaata gcaggcatgc tggggatgcg gtgggctcta tggcttctga 8820
ggcggaaaga accagctggg gctcgatacc gtcgacctct agctagagct tggcgtaatc 8880
atggtcatag ctgtttcctg tgtgaaattg ttatccgctc acaattccac acaacatacg 8940
agccggaagc ataaagtgta aagcctagga tgccta 8976
<210> 9
<211> 10864
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 9
gacaagaagt acagcatcgg cctggccatc ggcaccaact ctgtgggctg ggccgtgatc 60
accgacgagt acaaggtgcc cagcaagaaa ttcaaggtgc tgggcaacac cgaccggcac 120
agcatcaaga agaacctgat cggagccctg ctgttcgaca gcggcgaaac agccgaggcc 180
acccggctga agagaaccgc cagaagaaga tacaccagac ggaagaaccg gatctgctat 240
ctgcaagaga tcttcagcaa cgagatggcc aaggtggacg acagcttctt ccacagactg 300
gaagagtcct tcctggtgga agaggataag aagcacgagc ggcaccccat cttcggcaac 360
atcgtggacg aggtggccta ccacgagaag taccccacca tctaccacct gagaaagaaa 420
ctggtggaca gcaccgacaa ggccgacctg cggctgatct atctggccct ggcccacatg 480
atcaagttcc ggggccactt cctgatcgag ggcgacctga accccgacaa cagcgacgtg 540
gacaagctgt tcatccagct ggtgcagacc tacaaccagc tgttcgagga aaaccccatc 600
aacgccagcg gcgtggacgc caaggccatc ctgtctgcca gactgagcaa gagcagacgg 660
ctggaaaatc tgatcgccca gctgcccggc gagaagaaga atggcctgtt cggaaacctg 720
attgccctga gcctgggcct gacccccaac ttcaagagca acttcgacct ggccgaggat 780
gccaaactgc agctgagcaa ggacacctac gacgacgacc tggacaacct gctggcccag 840
atcggcgacc agtacgccga cctgtttctg gccgccaaga acctgtccga cgccatcctg 900
ctgagcgaca tcctgagagt gaacaccgag atcaccaagg cccccctgag cgcctctatg 960
atcaagagat acgacgagca ccaccaggac ctgaccctgc tgaaagctct cgtgcggcag 1020
cagctgcctg agaagtacaa agagattttc ttcgaccaga gcaagaacgg ctacgccggc 1080
tacattgacg gcggagccag ccaggaagag ttctacaagt tcatcaagcc catcctggaa 1140
aagatggacg gcaccgagga actgctcgtg aagctgaaca gagaggacct gctgcggaag 1200
cagcggacct tcgacaacgg cagcatcccc caccagatcc acctgggaga gctgcacgcc 1260
attctgcggc ggcaggaaga tttttaccca ttcctgaagg acaaccggga aaagatcgag 1320
aagatcctga ccttccgcat cccctactac gtgggccctc tggccagggg aaacagcaga 1380
ttcgcctgga tgaccagaaa gagcgaggaa accatcaccc cctggaactt cgaggaagtg 1440
gtggacaagg gcgcttccgc ccagagcttc atcgagcgga tgaccaactt cgataagaac 1500
ctgcccaacg agaaggtgct gcccaagcac agcctgctgt acgagtactt caccgtgtat 1560
aacgagctga ccaaagtgaa atacgtgacc gagggaatga gaaagcccgc cttcctgagc 1620
ggcgagcaga aaaaggccat cgtggacctg ctgttcaaga ccaaccggaa agtgaccgtg 1680
aagcagctga aagaggacta cttcaagaaa atcgagtgct tcgactccgt ggaaatctcc 1740
ggcgtggaag atcggttcaa cgcctccctg ggcacatacc acgatctgct gaaaattatc 1800
aaggacaagg acttcctgga caatgaggaa aacgaggaca ttctggaaga tatcgtgctg 1860
accctgacac tgtttgagga cagagagatg atcgaggaac ggctgaaaac ctatgcccac 1920
ctgttcgacg acaaagtgat gaagcagctg aagcggcgga gatacaccgg ctggggcagg 1980
ctgagccgga agctgatcaa cggcatccgg gacaagcagt ccggcaagac aatcctggat 2040
ttcctgaagt ccgacggctt cgccaacaga aacttcatgc agctgatcca cgacgacagc 2100
ctgaccttta aagaggacat ccagaaagcc caggtgtccg gccagggcga tagcctgcac 2160
gagcacattg ccaatctggc cggcagcccc gccattaaga agggcatcct gcagacagtg 2220
aaggtggtgg acgagctcgt gaaagtgatg ggccggcaca agcccgagaa catcgtgatc 2280
gaaatggcca gagagaacca gaccacccag aagggacaga agaacagccg cgagagaatg 2340
aagcggatcg aagagggcat caaagagctg ggcagccaga tcctgaaaga acaccccgtg 2400
gaaaacaccc agctgcagaa cgagaagctg tacctgtact acctgcagaa tgggcgggat 2460
atgtacgtgg accaggaact ggacatcaac cggctgtccg actacgatgt ggaccatatc 2520
gtgcctcaga gctttctgaa ggacgactcc atcgacaaca aggtgctgac cagaagcgac 2580
aagaaccggg gcaagagcga caacgtgccc tccgaagagg tcgtgaagaa gatgaagaac 2640
tactggcggc agctgctgaa cgccaagctg attacccaga gaaagttcga caatctgacc 2700
aaggccgaga gaggcggcct gagcgaactg gataaggccg gcttcatcaa gagacagctg 2760
gtggaaaccc ggcagatcac aaagcacgtg gcacagatcc tggactcccg gatgaacact 2820
aagtacgacg agaatgacaa gctgatccgg gaagtgaaag tgatcaccct gaagtccaag 2880
ctggtgtccg atttccggaa ggatttccag ttttacaaag tgcgcgagat caacaactac 2940
caccacgccc acgacgccta cctgaacgcc gtcgtgggaa ccgccctgat caaaaagtac 3000
cctaagctgg aaagcgagtt cgtgtacggc gactacaagg tgtacgacgt gcggaagatg 3060
atcgccaaga gcgagcagga aatcggcaag gctaccgcca agtacttctt ctacagcaac 3120
atcatgaact ttttcaagac cgagattacc ctggccaacg gcgagatccg gaagcggcct 3180
ctgatcgaga caaacggcga aaccggggag atcgtgtggg ataagggccg ggattttgcc 3240
accgtgcgga aagtgctgag catgccccaa gtgaatatcg tgaaaaagac cgaggtgcag 3300
acaggcggct tcagcaaaga gtctatcctg cccaagagga acagcgataa gctgatcgcc 3360
agaaagaagg actgggaccc taagaagtac ggcggcttcg acagccccac cgtggcctat 3420
tctgtgctgg tggtggccaa agtggaaaag ggcaagtcca agaaactgaa gagtgtgaaa 3480
gagctgctgg ggatcaccat catggaaaga agcagcttcg agaagaatcc catcgacttt 3540
ctggaagcca agggctacaa agaagtgaaa aaggacctga tcatcaagct gcctaagtac 3600
tccctgttcg agctggaaaa cggccggaag agaatgctgg cctctgccgg cgaactgcag 3660
aagggaaacg aactggccct gccctccaaa tatgtgaact tcctgtacct ggccagccac 3720
tatgagaagc tgaagggctc ccccgaggat aatgagcaga aacagctgtt tgtggaacag 3780
cacaagcact acctggacga gatcatcgag cagatcagcg agttctccaa gagagtgatc 3840
ctggccgacg ctaatctgga caaagtgctg tccgcctaca acaagcaccg ggataagccc 3900
atcagagagc aggccgagaa tatcatccac ctgtttaccc tgaccaatct gggagcccct 3960
gccgccttca agtactttga caccaccatc gaccggaaga ggtacaccag caccaaagag 4020
gtgctggacg ccaccctgat ccaccagagc atcaccggcc tgtacgagac acggatcgac 4080
ctgtctcagc tgggaggtga ctctggcggc tcaaaaagaa ccgccgacgg cagcgaattc 4140
gagcccaaga agaagaggaa agtctaaccg gtcatcatca ccatcaccat tgagtttaaa 4200
cccgctgatc agcctcgact gtgccttcta gttgccagcc atctgttgtt tgcccctccc 4260
ccgtgccttc cttgaccctg gaaggtgcca ctcccactgt cctttcctaa taaaatgagg 4320
aaattgcatc gcattgtctg agtaggtgtc attctattct ggggggtggg gtggggcagg 4380
acagcaaggg ggaggattgg gaagacaata gcaggcatgc tggggatgcg gtgggctcta 4440
tggcttctga ggcggaaaga accagctggg gctcgttgac agctagctca gtcctaggta 4500
taatactagt gtcgtctaga taactacgat agttttagag ctagaaatag caagttaaaa 4560
taaggctagt ccgttatcaa cttgaaaaag tggcaccgag tcggtgcttt ttttgatccg 4620
gctgctaaca aagcccgaaa ggaagctgag ttggctgctg ccaccgctga gcaataacta 4680
gcataacccc ttggggcctc taaacgggtc ttgaggggtt ttttgctgaa aggaggaact 4740
atatccggat tggcgaatgg gacgcgccct gtagcggcgc attaagcgcg gcgggtgtgg 4800
tggttacgcg cagcgtgacc gctacacttg ccagcgccct agcgcccgct cctttcgctt 4860
tcttcccttc ctttctcgcc acgttcgccg gctttccccg tcaagctcta aatcgggggc 4920
tccctttagg gttccgattt agtgctttac ggcacctcga ccccaaaaaa cttgattagg 4980
gtgatggttc acgtagtggg ccatcgccct gatagacggt ttttcgccct ttgacgttgg 5040
agtccacgtt ctttaatagt ggactcttgt tccaaactgg aacaacactc aaccctatct 5100
cggtctattc ttttgattta taagggattt tgccgatttc ggcctattgg ttaaaaaatg 5160
agctgattta acaaaaattt aacgcgaatt ttaacaaaat attaacgttt acaatttcag 5220
gtggcacttt tcggggaaat gtgggaaatg tgcgcggaac ccctatttgt ttatttttct 5280
aaatacattc aaatatgtat ccgctcatga gacaataacc ctgataaatg cttcaataat 5340
attgaaaaag gaagagtatg agtattcaac atttccgtgt cgcccttatt cccttttttg 5400
cggcattttg ccttcctgtt tttgctcacc cagaaacgct ggtgaaagta aaagatgctg 5460
aagatcagtt gggtgcacga gtgggttaca tcgaactgga tctcaacagc ggtaagatcc 5520
ttgagagttt tcgccccgaa gaacgttttc caatgatgag cacttttaaa gttctgctat 5580
gtggcgcggt attatcccgt attgacgccg ggcaagagca actcggtcgc cgcatacact 5640
attctcagaa tgacttggtt gagtactcac cagtcacaga aaagcatctt acggatggca 5700
tgacagtaag agaattatgc agtgctgcca taaccatgag tgataacact gcggccaact 5760
tacttctgac aacgatcgga ggaccgaagg agctaaccgc ttttttgcac aacatggggg 5820
atcatgtaac tcgccttgat cgttgggaac cggagctgaa tgaagccata ccaaacgacg 5880
agcgtgacac cacgatgcct gtagcaatgg caacaacgtt gcgcaaacta ttaactggcg 5940
aactacttac tctagcttcc cggcaacaat taatagactg gatggaggcg gataaagttg 6000
caggaccact tctgcgctcg gcccttccgg ctggctggtt tattgctgat aaatctggag 6060
ccggtgagcg tgggtctcgc ggtatcattg cagcactggg gccagatggt aagccctccc 6120
gtatcgtagt tatctagacg acggggagtc aggcaactat ggatgaacga aatagacaga 6180
tcgctgagat aggtgcctca ctgattaagc attggtaagc gcggaacccc tatttgttta 6240
tttttctaaa tacattcaaa tatgtatccg ctcatgaatt aattcttaga aaaactcatc 6300
gagcatcaaa tgaaactgca atttattcat atcaggatta tcaataccat atttttgaaa 6360
aagccgtttc tgtaatgaag gagaaaactc accgaggcag ttccatagga tggcaagatc 6420
ctggtatcgg tctgcgattc cgactcgtcc aacatcaata caacctatta atttcccctc 6480
gtcaaaaata aggttatcaa gtgagaaatc accatgagtg acgactgaat ccggtgagaa 6540
tggcaaaagt ttatgcattt ctttccagac ttgttcaaca ggccagccat tacgctcgtc 6600
atcaaaatca ctcgcatcaa ccaaaccgtt attcattcgt gattgcgcct gagcgagacg 6660
aaatacgcga tcgctgttaa aaggacaatt acaaacagga atcgaatgca accggcgcag 6720
gaacactgcc agcgcatcaa caatattttc acctgaatca ggatattctt ctaatacctg 6780
gaatgctgtt ttcccgggga tcgcagtggt gagtaaccat gcatcatcag gagtacggat 6840
aaaatgcttg atggtcggaa gaggcataaa ttccgtcagc cagtttagtc tgaccatctc 6900
atctgtaaca tcattggcaa cgctaccttt gccatgtttc agaaacaact ctggcgcatc 6960
gggcttccca tacaatcgat agattgtcgc acctgattgc ccgacattat cgcgagccca 7020
tttataccca tataaatcag catccatgtt ggaatttaat cgcggcctag agcaagacgt 7080
ttcccgttga atatggctca taacacccct tgtattactg tttatgtaag cagacagttt 7140
tattgttcat gaccaaaatc ccttaacgtg agttttcgtt ccactgagcg tcagaccccg 7200
tagaaaagat caaaggatct tcttgagatc ctttttttct gcgcgtaatc tgctgcttgc 7260
aaacaaaaaa accaccgcta ccagcggtgg tttgtttgcc ggatcaagag ctaccaactc 7320
tttttccgaa ggtaactggc ttcagcagag cgcagatacc aaatactgtc cttctagtgt 7380
agccgtagtt aggccaccac ttcaagaact ctgtagcacc gcctacatac ctcgctctgc 7440
taatcctgtt accagtggct gctgccagtg gcgataagtc gtgtcttacc gggttggact 7500
caagacgata gttaccggat aaggcgcagc ggtcgggctg aacggggggt tcgtgcacac 7560
agcccagctt ggagcgaacg acctacaccg aactgagata cctacagcgt gagctatgag 7620
aaagcgccac gcttcccgaa gggagaaagg cggacaggta tccggtaagc ggcagggtcg 7680
gaacaggaga gcgcacgagg gagcttccag ggggaaacgc ctggtatctt tatagtcctg 7740
tcgggtttcg ccacctctga cttgagcgtc gatttttgtg atgctcgtca ggggggcgga 7800
gcctatggaa aaacgccagc aacgcggcct ttttacggtt cctggccttt tgctggcctt 7860
ttgctcacat gttctttcct gcgttatccc ctgattctgt ggataaccgt attaccgcct 7920
ttgagtgagc tgataccgct cgccgcagcc gaacgaccga gcgcagcgag tcagtgagcg 7980
aggaagcgga agagcgcctg atgcggtatt ttctccttac gcatctgtgc ggtatttcac 8040
accgcatata tggtgcactc tcagtacaat ctgctctgat gccgcatagt taagccagta 8100
tacactccgc tatcgctacg tgactgggtc atggctgcgc cccgacaccc gccaacaccc 8160
gctgacgcgc cctgacgggc ttgtctgctc ccggcatccg cttacagaca agctgtgacc 8220
gtctccggga gctgcatgtg tcagaggttt tcaccgtcat caccgaaacg cgcgaggcag 8280
ctgcggtaaa gctcatcagc gtggtcgtga agcgattcac agatgtctgc ctgttcatcc 8340
gcgtccagct cgttgagttt ctccagaagc gttaatgtct ggcttctgat aaagcgggcc 8400
atgttaaggg cggttttttc ctgtttggtc actgatgcct ccgtgtaagg gggatttctg 8460
ttcatggggg taatgatacc gatgaaacga gagaggatgc tcacgatacg ggttactgat 8520
gatgaacatg cccggttact ggaacgttgt gagggtaaac aactggcggt atggatgcgg 8580
cgggaccaga gaaaaatcac tcagggtcaa tgccagcgct tcgttaatac agatgtaggt 8640
gttccacagg gtagccagca gcatcctgcg atgcagatcc ggaacataat ggtgcagggc 8700
gctgacttcc gcgtttccag actttacgaa acacggaaac cgaagaccat tcatgttgtt 8760
gctcaggtcg cagacgtttt gcagcagcag tcgcttcacg ttcgctcgcg tatcggtgat 8820
tcattctgct aaccagtaag gcaaccccgc cagcctagcc gggtcctcaa cgacaggagc 8880
acgatcatgc gcacccgtgg ggccgccatg ccggcgataa tggcctgctt ctcgccgaaa 8940
cgtttggtgg cgggaccagt gacgaaggct tgagcgaggg cgtgcaagat tccgaatacc 9000
gcaagcgaca ggccgatcat cgtcgcgctc cagcgaaagc ggtcctcgcc gaaaatgacc 9060
cagagcgctg ccggcacctg tcctacgagt tgcatgataa agaagacagt cataagtgcg 9120
gcgacgatag tcatgccccg cgcccaccgg aaggagctga ctgggttgaa ggctctcaag 9180
ggcatcggtc gagatcccgg tgcctaatga gtgagctaac ttacattaat tgcgttgcgc 9240
tcactgcccg ctttccagtc gggaaacctg tcgtgccagc tgcattaatg aatcggccaa 9300
cgcgcgggga gaggcggttt gcgtattggg cgccagggtg gtttttcttt tcaccagtga 9360
gacgggcaac agctgattgc ccttcaccgc ctggccctga gagagttgca gcaagcggtc 9420
cacgctggtt tgccccagca ggcgaaaatc ctgtttgatg gtggttaacg gcgggatata 9480
acatgagctg tcttcggtat cgtcgtatcc cactaccgag atgtccgcac caacgcgcag 9540
cccggactcg gtaatggcgc gcattgcgcc cagcgccatc tgatcgttgg caaccagcat 9600
cgcagtggga acgatgccct cattcagcat ttgcatggtt tgttgaaaac cggacatggc 9660
actccagtcg ccttcccgtt ccgctatcgg ctgaatttga ttgcgagtga gatatttatg 9720
ccagccagcc agacgcagac gcgccgagac agaacttaat gggcccgcta acagcgcgat 9780
ttgctggtga cccaatgcga ccagatgctc cacgcccagt cgcgtaccgt cttcatggga 9840
gaaaataata ctgttgatgg gtgtctggtc agagacatca agaaataacg ccggaacatt 9900
agtgcaggca gcttccacag caatggcatc ctggtcatcc agcggatagt taatgatcag 9960
cccactgacg cgttgcgcga gaagattgtg caccgccgct ttacaggctt cgacgccgct 10020
tcgttctacc atcgacacca ccacgctggc acccagttga tcggcgcgag atttaatcgc 10080
cgcgacaatt tgcgacggcg cgtgcagggc cagactggag gtggcaacgc caatcagcaa 10140
cgactgtttg cccgccagtt gttgtgccac gcggttggga atgtaattca gctccgccat 10200
cgccgcttcc actttttccc gcgttttcgc agaaacgtgg ctggcctggt tcaccacgcg 10260
ggaaacggtc tgataagaga caccggcata ctctgcgaca tcgtataacg ttactggttt 10320
cacattcacc accctgaatt gactctcttc cgggcgctat catgccatac cgcgaaaggt 10380
tttgcgccat tcgatggtgt ccgggatctc gacgctctcc cttatgcgac tcctgcatta 10440
ggaagcagcc cagtagtagg ttgaggccgt tgagcaccgc cgccgcaagg aatggtgcat 10500
gcaaggagat ggcgcccaac agtcccccgg ccacggggcc tgccaccata cccacgccga 10560
aacaagcgct catgagcccg aagtggcgag cccgatcttc cccatcggtg atgtcggcga 10620
tataggcgcc agcaaccgca cctgtggcgc cggtgatgcc ggccacgatg cgtccggcgt 10680
agaggatcga gatcgatctc gatcccgcga aattaatacg actcactata ggggaattgt 10740
gagcggataa caattcccct ctagaaataa ttttgtttaa ctttaagaag gagatataca 10800
tgccaccatg aaacggacag ccgacggaag cgagttcgag tcaccaaaga agaagcggaa 10860
agtc 10864
<210> 10
<211> 1251
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 10
gaggcatctc cagcaagcgg accaaggcac ctgatggacc cccacatctt cacctctaac 60
tttaacaatg gcatcggcag gcacaagaca tacctgtgct atgaggtgga gcgcctggac 120
aatggcacca gcgtgaagat ggatcagcac agaggcttcc tgcacaacca ggccaagaat 180
ctgctgtgcg gcttctacgg ccggcacgca gagctgagat ttctggacct ggtgcctagc 240
ctgcagctgg atccagccca gatctatagg gtgacctggt tcatcagctg gtccccatgc 300
ttttcctggg gatgtgcagg agaggtgcgc gcctttctgc aggagaacac acacgtgcgg 360
ctgagaatct tcgccgcccg gatctttgac tacgatcctc tgtataagga ggccctgcag 420
atgctgagag acgcaggagc ccaggtgtcc atcatgacct acgatgagtt caagcactgc 480
tgggacacat ttgtggatca ccagggctgt cccttccagc cttgggacgg actggatgag 540
cactcccagg ccctgtctgg caggctgagg gccatcctgc agaaccaggg caatagcgga 600
ggatccggag gatctggagg cagcatggcc ggagcacagg attttgtccc acatactgcc 660
gatctggctg agctggctgc cgccgctgga gagtgtagag gatgcgggct gtaccgggat 720
gccacacagg ccgtgttcgg agcaggcggc aggagcgccc gcatcatgat gatcggcgag 780
cagccaggcg ataaggagga cctggccggc ctgccatttg tgggaccagc aggccggctg 840
ctggacagag ccctggaggc cgccgacatc gatagggacg ccctgtacgt gaccaacgcc 900
gtgaagcact tcaagtttac acgcgccgca ggaggcaaga ggagaatcca caagaccccc 960
tctcggacag aggtggtggc ctgcagacct tggctgatcg ccgagatgac cagcgtggag 1020
ccagatgtgg tggtgctgct gggagcaaca gcagcaaagg ccctgctggg caatgacttc 1080
agggtgaccc agcacagggg agaggtgctg cacgtggacg atgtgccagg cgatcctgcc 1140
ctggtggcaa ccgtgcaccc tagctccctg ctgagaggcc caaaggagga gcgggaatcc 1200
gccttcgctg ggctggtgga cgacctgaga gtggctgccg atgtgagacc t 1251
<210> 11
<211> 1876
<212> PRT
<213> 人工序列(Artificial Sequence)
<400> 11
Met Lys Arg Thr Ala Asp Gly Ser Glu Phe Glu Ser Pro Lys Lys Lys
1 5 10 15
Arg Lys Val Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn
20 25 30
Ser Val Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys
35 40 45
Lys Phe Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn
50 55 60
Leu Ile Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr
65 70 75 80
Arg Leu Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg
85 90 95
Ile Cys Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp
100 105 110
Asp Ser Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp
115 120 125
Lys Lys His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val
130 135 140
Ala Tyr His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu
145 150 155 160
Val Asp Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu
165 170 175
Ala His Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu
180 185 190
Asn Pro Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln
195 200 205
Thr Tyr Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val
210 215 220
Asp Ala Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu
225 230 235 240
Glu Asn Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe
245 250 255
Gly Asn Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser
260 265 270
Asn Phe Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr
275 280 285
Tyr Asp Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr
290 295 300
Ala Asp Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu
305 310 315 320
Ser Asp Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser
325 330 335
Ala Ser Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu
340 345 350
Leu Lys Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile
355 360 365
Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly
370 375 380
Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys
385 390 395 400
Met Asp Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu
405 410 415
Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile
420 425 430
His Leu Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr
435 440 445
Pro Phe Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe
450 455 460
Arg Ile Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe
465 470 475 480
Ala Trp Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe
485 490 495
Glu Glu Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg
500 505 510
Met Thr Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys
515 520 525
His Ser Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys
530 535 540
Val Lys Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly
545 550 555 560
Glu Gln Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys
565 570 575
Val Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys
580 585 590
Phe Asp Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser
595 600 605
Leu Gly Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe
610 615 620
Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr
625 630 635 640
Leu Thr Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr
645 650 655
Tyr Ala His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg
660 665 670
Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile
675 680 685
Arg Asp Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp
690 695 700
Gly Phe Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu
705 710 715 720
Thr Phe Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp
725 730 735
Ser Leu His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys
740 745 750
Lys Gly Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val
755 760 765
Met Gly Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu
770 775 780
Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys
785 790 795 800
Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu
805 810 815
His Pro Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr
820 825 830
Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile
835 840 845
Asn Arg Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe
850 855 860
Leu Lys Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys
865 870 875 880
Asn Arg Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys
885 890 895
Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln
900 905 910
Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu
915 920 925
Leu Asp Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln
930 935 940
Ile Thr Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys
945 950 955 960
Tyr Asp Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu
965 970 975
Lys Ser Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys
980 985 990
Val Arg Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn
995 1000 1005
Ala Val Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Ser Gly Ser
1010 1015 1020
Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser Gly Ser Glu
1025 1030 1035 1040
Ala Ser Pro Ala Ser Gly Pro Arg His Leu Met Asp Pro His Ile Phe
1045 1050 1055
Thr Ser Asn Phe Asn Asn Gly Ile Gly Arg His Lys Thr Tyr Leu Cys
1060 1065 1070
Tyr Glu Val Glu Arg Leu Asp Asn Gly Thr Ser Val Lys Met Asp Gln
1075 1080 1085
His Arg Gly Phe Leu His Asn Gln Ala Lys Asn Leu Leu Cys Gly Phe
1090 1095 1100
Tyr Gly Arg His Ala Glu Leu Arg Phe Leu Asp Leu Val Pro Ser Leu
1105 1110 1115 1120
Gln Leu Asp Pro Ala Gln Ile Tyr Arg Val Thr Trp Phe Ile Ser Trp
1125 1130 1135
Ser Pro Cys Phe Ser Trp Gly Cys Ala Gly Glu Val Arg Ala Phe Leu
1140 1145 1150
Gln Glu Asn Thr His Val Arg Leu Arg Ile Phe Ala Ala Arg Ile Phe
1155 1160 1165
Asp Tyr Asp Pro Leu Tyr Lys Glu Ala Leu Gln Met Leu Arg Asp Ala
1170 1175 1180
Gly Ala Gln Val Ser Ile Met Thr Tyr Asp Glu Phe Lys His Cys Trp
1185 1190 1195 1200
Asp Thr Phe Val Asp His Gln Gly Cys Pro Phe Gln Pro Trp Asp Gly
1205 1210 1215
Leu Asp Glu His Ser Gln Ala Leu Ser Gly Arg Leu Arg Ala Ile Leu
1220 1225 1230
Gln Asn Gln Gly Asn Ser Gly Gly Ser Gly Gly Ser Gly Gly Ser Met
1235 1240 1245
Ala Ala Ala Lys Ala Pro Gly Ala Ala Glu Phe Val Pro Ala Asp Ala
1250 1255 1260
Asp Leu Asp Thr Leu Arg Thr Ala Val Gln Gly Cys Arg Gly Cys Glu
1265 1270 1275 1280
Leu Tyr Arg Gly Ala Thr Gln Ala Val Phe Gly Glu Gly Pro Ala His
1285 1290 1295
Ala Pro Val Phe Val Val Gly Glu Gln Pro Gly Asp Arg Glu Asp Val
1300 1305 1310
Ala Gly His Pro Phe Val Gly Pro Ala Gly Arg Leu Leu Asp Lys Ala
1315 1320 1325
Leu Thr Glu Ala Asp Ile Asp Arg Glu Ala Val Tyr Leu Thr Asn Ala
1330 1335 1340
Val Lys His Phe Lys Phe Glu Glu Arg Gly Lys Arg Arg Ile His Lys
1345 1350 1355 1360
Gln Pro Gly Arg Thr Glu Val Val Ala Cys Ser Pro Trp Leu Thr Ala
1365 1370 1375
Glu Leu Asp Ala Val Arg Pro Gln Leu Val Val Cys Leu Gly Ala Val
1380 1385 1390
Ala Ala Lys Ala Val Leu Gly Pro Ser Phe Lys Val Ser Glu Arg Arg
1395 1400 1405
Gly Glu Val Val Glu Ala Gly Glu His Arg Val Ile Ala Thr Val His
1410 1415 1420
Pro Ser Ser Val Leu Arg Ala Pro Asp Arg Ala Ala Ala Tyr Ala Asp
1425 1430 1435 1440
Phe Leu Ala Asp Leu Arg Lys Val Arg Thr Ala Ala Gly Glu Leu His
1445 1450 1455
Arg Ala Ser Gly Gly Ser Ser Gly Gly Ser Ser Gly Ser Glu Thr Pro
1460 1465 1470
Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser Ser Gly Gly Ser Ser Gly
1475 1480 1485
Gly Ser Leu Glu Ser Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp
1490 1495 1500
Val Arg Lys Met Ile Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr
1505 1510 1515 1520
Ala Lys Tyr Phe Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu
1525 1530 1535
Ile Thr Leu Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr
1540 1545 1550
Asn Gly Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala
1555 1560 1565
Thr Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys
1570 1575 1580
Thr Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys
1585 1590 1595 1600
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys
1605 1610 1615
Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu Val
1620 1625 1630
Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser Val Lys
1635 1640 1645
Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe Glu Lys Asn
1650 1655 1660
Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu Val Lys Lys Asp
1665 1670 1675 1680
Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe Glu Leu Glu Asn Gly
1685 1690 1695
Arg Lys Arg Met Leu Ala Ser Ala Gly Glu Leu Gln Lys Gly Asn Glu
1700 1705 1710
Leu Ala Leu Pro Ser Lys Tyr Val Asn Phe Leu Tyr Leu Ala Ser His
1715 1720 1725
Tyr Glu Lys Leu Lys Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu
1730 1735 1740
Phe Val Glu Gln His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile
1745 1750 1755 1760
Ser Glu Phe Ser Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys
1765 1770 1775
Val Leu Ser Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln
1780 1785 1790
Ala Glu Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro
1795 1800 1805
Ala Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr
1810 1815 1820
Ser Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1825 1830 1835 1840
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp Ser
1845 1850 1855
Gly Gly Ser Lys Arg Thr Ala Asp Gly Ser Glu Phe Glu Pro Lys Lys
1860 1865 1870
Lys Arg Lys Val
1875
<210> 12
<211> 1876
<212> PRT
<213> 人工序列(Artificial Sequence)
<400> 12
Met Lys Arg Thr Ala Asp Gly Ser Glu Phe Glu Ser Pro Lys Lys Lys
1 5 10 15
Arg Lys Val Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn
20 25 30
Ser Val Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys
35 40 45
Lys Phe Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn
50 55 60
Leu Ile Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr
65 70 75 80
Arg Leu Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg
85 90 95
Ile Cys Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp
100 105 110
Asp Ser Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp
115 120 125
Lys Lys His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val
130 135 140
Ala Tyr His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu
145 150 155 160
Val Asp Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu
165 170 175
Ala His Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu
180 185 190
Asn Pro Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln
195 200 205
Thr Tyr Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val
210 215 220
Asp Ala Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu
225 230 235 240
Glu Asn Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe
245 250 255
Gly Asn Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser
260 265 270
Asn Phe Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr
275 280 285
Tyr Asp Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr
290 295 300
Ala Asp Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu
305 310 315 320
Ser Asp Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser
325 330 335
Ala Ser Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu
340 345 350
Leu Lys Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile
355 360 365
Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly
370 375 380
Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys
385 390 395 400
Met Asp Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu
405 410 415
Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile
420 425 430
His Leu Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr
435 440 445
Pro Phe Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe
450 455 460
Arg Ile Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe
465 470 475 480
Ala Trp Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe
485 490 495
Glu Glu Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg
500 505 510
Met Thr Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys
515 520 525
His Ser Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys
530 535 540
Val Lys Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly
545 550 555 560
Glu Gln Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys
565 570 575
Val Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys
580 585 590
Phe Asp Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser
595 600 605
Leu Gly Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe
610 615 620
Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr
625 630 635 640
Leu Thr Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr
645 650 655
Tyr Ala His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg
660 665 670
Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile
675 680 685
Arg Asp Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp
690 695 700
Gly Phe Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu
705 710 715 720
Thr Phe Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp
725 730 735
Ser Leu His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys
740 745 750
Lys Gly Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val
755 760 765
Met Gly Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu
770 775 780
Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys
785 790 795 800
Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu
805 810 815
His Pro Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr
820 825 830
Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile
835 840 845
Asn Arg Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe
850 855 860
Leu Lys Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys
865 870 875 880
Asn Arg Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys
885 890 895
Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln
900 905 910
Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu
915 920 925
Leu Asp Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln
930 935 940
Ile Thr Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys
945 950 955 960
Tyr Asp Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu
965 970 975
Lys Ser Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys
980 985 990
Val Arg Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn
995 1000 1005
Ala Val Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser
1010 1015 1020
Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile
1025 1030 1035 1040
Ala Lys Ser Glu Gln Ser Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser
1045 1050 1055
Ala Thr Pro Glu Ser Gly Ser Glu Ala Ser Pro Ala Ser Gly Pro Arg
1060 1065 1070
His Leu Met Asp Pro His Ile Phe Thr Ser Asn Phe Asn Asn Gly Ile
1075 1080 1085
Gly Arg His Lys Thr Tyr Leu Cys Tyr Glu Val Glu Arg Leu Asp Asn
1090 1095 1100
Gly Thr Ser Val Lys Met Asp Gln His Arg Gly Phe Leu His Asn Gln
1105 1110 1115 1120
Ala Lys Asn Leu Leu Cys Gly Phe Tyr Gly Arg His Ala Glu Leu Arg
1125 1130 1135
Phe Leu Asp Leu Val Pro Ser Leu Gln Leu Asp Pro Ala Gln Ile Tyr
1140 1145 1150
Arg Val Thr Trp Phe Ile Ser Trp Ser Pro Cys Phe Ser Trp Gly Cys
1155 1160 1165
Ala Gly Glu Val Arg Ala Phe Leu Gln Glu Asn Thr His Val Arg Leu
1170 1175 1180
Arg Ile Phe Ala Ala Arg Ile Phe Asp Tyr Asp Pro Leu Tyr Lys Glu
1185 1190 1195 1200
Ala Leu Gln Met Leu Arg Asp Ala Gly Ala Gln Val Ser Ile Met Thr
1205 1210 1215
Tyr Asp Glu Phe Lys His Cys Trp Asp Thr Phe Val Asp His Gln Gly
1220 1225 1230
Cys Pro Phe Gln Pro Trp Asp Gly Leu Asp Glu His Ser Gln Ala Leu
1235 1240 1245
Ser Gly Arg Leu Arg Ala Ile Leu Gln Asn Gln Gly Asn Ser Gly Gly
1250 1255 1260
Ser Gly Gly Ser Gly Gly Ser Met Ala Ala Ala Lys Ala Pro Gly Ala
1265 1270 1275 1280
Ala Glu Phe Val Pro Ala Asp Ala Asp Leu Asp Thr Leu Arg Thr Ala
1285 1290 1295
Val Gln Gly Cys Arg Gly Cys Glu Leu Tyr Arg Gly Ala Thr Gln Ala
1300 1305 1310
Val Phe Gly Glu Gly Pro Ala His Ala Pro Val Phe Val Val Gly Glu
1315 1320 1325
Gln Pro Gly Asp Arg Glu Asp Val Ala Gly His Pro Phe Val Gly Pro
1330 1335 1340
Ala Gly Arg Leu Leu Asp Lys Ala Leu Thr Glu Ala Asp Ile Asp Arg
1345 1350 1355 1360
Glu Ala Val Tyr Leu Thr Asn Ala Val Lys His Phe Lys Phe Glu Glu
1365 1370 1375
Arg Gly Lys Arg Arg Ile His Lys Gln Pro Gly Arg Thr Glu Val Val
1380 1385 1390
Ala Cys Ser Pro Trp Leu Thr Ala Glu Leu Asp Ala Val Arg Pro Gln
1395 1400 1405
Leu Val Val Cys Leu Gly Ala Val Ala Ala Lys Ala Val Leu Gly Pro
1410 1415 1420
Ser Phe Lys Val Ser Glu Arg Arg Gly Glu Val Val Glu Ala Gly Glu
1425 1430 1435 1440
His Arg Val Ile Ala Thr Val His Pro Ser Ser Val Leu Arg Ala Pro
1445 1450 1455
Asp Arg Ala Ala Ala Tyr Ala Asp Phe Leu Ala Asp Leu Arg Lys Val
1460 1465 1470
Arg Thr Ala Ala Gly Glu Leu His Arg Ala Ser Gly Gly Ser Ser Gly
1475 1480 1485
Gly Ser Ser Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro
1490 1495 1500
Glu Ser Ser Gly Gly Ser Ser Gly Gly Ser Glu Ile Gly Lys Ala Thr
1505 1510 1515 1520
Ala Lys Tyr Phe Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu
1525 1530 1535
Ile Thr Leu Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr
1540 1545 1550
Asn Gly Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala
1555 1560 1565
Thr Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys
1570 1575 1580
Thr Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys
1585 1590 1595 1600
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys
1605 1610 1615
Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu Val
1620 1625 1630
Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser Val Lys
1635 1640 1645
Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe Glu Lys Asn
1650 1655 1660
Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu Val Lys Lys Asp
1665 1670 1675 1680
Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe Glu Leu Glu Asn Gly
1685 1690 1695
Arg Lys Arg Met Leu Ala Ser Ala Gly Glu Leu Gln Lys Gly Asn Glu
1700 1705 1710
Leu Ala Leu Pro Ser Lys Tyr Val Asn Phe Leu Tyr Leu Ala Ser His
1715 1720 1725
Tyr Glu Lys Leu Lys Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu
1730 1735 1740
Phe Val Glu Gln His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile
1745 1750 1755 1760
Ser Glu Phe Ser Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys
1765 1770 1775
Val Leu Ser Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln
1780 1785 1790
Ala Glu Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro
1795 1800 1805
Ala Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr
1810 1815 1820
Ser Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1825 1830 1835 1840
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp Ser
1845 1850 1855
Gly Gly Ser Lys Arg Thr Ala Asp Gly Ser Glu Phe Glu Pro Lys Lys
1860 1865 1870
Lys Arg Lys Val
1875
<210> 13
<211> 1876
<212> PRT
<213> 人工序列(Artificial Sequence)
<400> 13
Met Lys Arg Thr Ala Asp Gly Ser Glu Phe Glu Ser Pro Lys Lys Lys
1 5 10 15
Arg Lys Val Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn
20 25 30
Ser Val Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys
35 40 45
Lys Phe Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn
50 55 60
Leu Ile Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr
65 70 75 80
Arg Leu Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg
85 90 95
Ile Cys Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp
100 105 110
Asp Ser Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp
115 120 125
Lys Lys His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val
130 135 140
Ala Tyr His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu
145 150 155 160
Val Asp Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu
165 170 175
Ala His Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu
180 185 190
Asn Pro Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln
195 200 205
Thr Tyr Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val
210 215 220
Asp Ala Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu
225 230 235 240
Glu Asn Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe
245 250 255
Gly Asn Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser
260 265 270
Asn Phe Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr
275 280 285
Tyr Asp Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr
290 295 300
Ala Asp Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu
305 310 315 320
Ser Asp Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser
325 330 335
Ala Ser Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu
340 345 350
Leu Lys Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile
355 360 365
Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly
370 375 380
Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys
385 390 395 400
Met Asp Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu
405 410 415
Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile
420 425 430
His Leu Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr
435 440 445
Pro Phe Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe
450 455 460
Arg Ile Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe
465 470 475 480
Ala Trp Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe
485 490 495
Glu Glu Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg
500 505 510
Met Thr Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys
515 520 525
His Ser Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys
530 535 540
Val Lys Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly
545 550 555 560
Glu Gln Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys
565 570 575
Val Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys
580 585 590
Phe Asp Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser
595 600 605
Leu Gly Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe
610 615 620
Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr
625 630 635 640
Leu Thr Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr
645 650 655
Tyr Ala His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg
660 665 670
Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile
675 680 685
Arg Asp Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp
690 695 700
Gly Phe Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu
705 710 715 720
Thr Phe Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp
725 730 735
Ser Leu His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys
740 745 750
Lys Gly Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val
755 760 765
Met Gly Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu
770 775 780
Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys
785 790 795 800
Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu
805 810 815
His Pro Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr
820 825 830
Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile
835 840 845
Asn Arg Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe
850 855 860
Leu Lys Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys
865 870 875 880
Asn Arg Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys
885 890 895
Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln
900 905 910
Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu
915 920 925
Leu Asp Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln
930 935 940
Ile Thr Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys
945 950 955 960
Tyr Asp Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu
965 970 975
Lys Ser Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys
980 985 990
Val Arg Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn
995 1000 1005
Ala Val Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser
1010 1015 1020
Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile
1025 1030 1035 1040
Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe
1045 1050 1055
Tyr Ser Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu
1060 1065 1070
Ser Gly Ser Glu Ala Ser Pro Ala Ser Gly Pro Arg His Leu Met Asp
1075 1080 1085
Pro His Ile Phe Thr Ser Asn Phe Asn Asn Gly Ile Gly Arg His Lys
1090 1095 1100
Thr Tyr Leu Cys Tyr Glu Val Glu Arg Leu Asp Asn Gly Thr Ser Val
1105 1110 1115 1120
Lys Met Asp Gln His Arg Gly Phe Leu His Asn Gln Ala Lys Asn Leu
1125 1130 1135
Leu Cys Gly Phe Tyr Gly Arg His Ala Glu Leu Arg Phe Leu Asp Leu
1140 1145 1150
Val Pro Ser Leu Gln Leu Asp Pro Ala Gln Ile Tyr Arg Val Thr Trp
1155 1160 1165
Phe Ile Ser Trp Ser Pro Cys Phe Ser Trp Gly Cys Ala Gly Glu Val
1170 1175 1180
Arg Ala Phe Leu Gln Glu Asn Thr His Val Arg Leu Arg Ile Phe Ala
1185 1190 1195 1200
Ala Arg Ile Phe Asp Tyr Asp Pro Leu Tyr Lys Glu Ala Leu Gln Met
1205 1210 1215
Leu Arg Asp Ala Gly Ala Gln Val Ser Ile Met Thr Tyr Asp Glu Phe
1220 1225 1230
Lys His Cys Trp Asp Thr Phe Val Asp His Gln Gly Cys Pro Phe Gln
1235 1240 1245
Pro Trp Asp Gly Leu Asp Glu His Ser Gln Ala Leu Ser Gly Arg Leu
1250 1255 1260
Arg Ala Ile Leu Gln Asn Gln Gly Asn Ser Gly Gly Ser Gly Gly Ser
1265 1270 1275 1280
Gly Gly Ser Met Ala Ala Ala Lys Ala Pro Gly Ala Ala Glu Phe Val
1285 1290 1295
Pro Ala Asp Ala Asp Leu Asp Thr Leu Arg Thr Ala Val Gln Gly Cys
1300 1305 1310
Arg Gly Cys Glu Leu Tyr Arg Gly Ala Thr Gln Ala Val Phe Gly Glu
1315 1320 1325
Gly Pro Ala His Ala Pro Val Phe Val Val Gly Glu Gln Pro Gly Asp
1330 1335 1340
Arg Glu Asp Val Ala Gly His Pro Phe Val Gly Pro Ala Gly Arg Leu
1345 1350 1355 1360
Leu Asp Lys Ala Leu Thr Glu Ala Asp Ile Asp Arg Glu Ala Val Tyr
1365 1370 1375
Leu Thr Asn Ala Val Lys His Phe Lys Phe Glu Glu Arg Gly Lys Arg
1380 1385 1390
Arg Ile His Lys Gln Pro Gly Arg Thr Glu Val Val Ala Cys Ser Pro
1395 1400 1405
Trp Leu Thr Ala Glu Leu Asp Ala Val Arg Pro Gln Leu Val Val Cys
1410 1415 1420
Leu Gly Ala Val Ala Ala Lys Ala Val Leu Gly Pro Ser Phe Lys Val
1425 1430 1435 1440
Ser Glu Arg Arg Gly Glu Val Val Glu Ala Gly Glu His Arg Val Ile
1445 1450 1455
Ala Thr Val His Pro Ser Ser Val Leu Arg Ala Pro Asp Arg Ala Ala
1460 1465 1470
Ala Tyr Ala Asp Phe Leu Ala Asp Leu Arg Lys Val Arg Thr Ala Ala
1475 1480 1485
Gly Glu Leu His Arg Ala Ser Gly Gly Ser Ser Gly Gly Ser Ser Gly
1490 1495 1500
Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser Ser Gly
1505 1510 1515 1520
Gly Ser Ser Gly Gly Ser Ser Asn Ile Met Asn Phe Phe Lys Thr Glu
1525 1530 1535
Ile Thr Leu Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr
1540 1545 1550
Asn Gly Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala
1555 1560 1565
Thr Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys
1570 1575 1580
Thr Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys
1585 1590 1595 1600
Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys
1605 1610 1615
Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu Val
1620 1625 1630
Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser Val Lys
1635 1640 1645
Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe Glu Lys Asn
1650 1655 1660
Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu Val Lys Lys Asp
1665 1670 1675 1680
Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe Glu Leu Glu Asn Gly
1685 1690 1695
Arg Lys Arg Met Leu Ala Ser Ala Gly Glu Leu Gln Lys Gly Asn Glu
1700 1705 1710
Leu Ala Leu Pro Ser Lys Tyr Val Asn Phe Leu Tyr Leu Ala Ser His
1715 1720 1725
Tyr Glu Lys Leu Lys Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu
1730 1735 1740
Phe Val Glu Gln His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile
1745 1750 1755 1760
Ser Glu Phe Ser Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys
1765 1770 1775
Val Leu Ser Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln
1780 1785 1790
Ala Glu Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro
1795 1800 1805
Ala Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr
1810 1815 1820
Ser Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr
1825 1830 1835 1840
Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp Ser
1845 1850 1855
Gly Gly Ser Lys Arg Thr Ala Asp Gly Ser Glu Phe Glu Pro Lys Lys
1860 1865 1870
Lys Arg Lys Val
1875
<210> 14
<211> 23
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 14
acaacaacag caaaagcagc tgg 23
<210> 15
<211> 22
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 15
acttcaagaa ctagtgcgca gg 22
<210> 16
<211> 23
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 16
gcggtaccac gtcttgtaga agg 23
<210> 17
<211> 23
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 17
accagagaat gaaatctaag agg 23
<210> 18
<211> 23
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 18
tgatcaagag cgagcagtag agg 23
<210> 19
<211> 24
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 19
accgacaaca acagcaaaag cagc 24
<210> 20
<211> 24
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 20
aaacgctgct tttgctgttg ttgt 24
<210> 21
<211> 23
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 21
accgacttca agaactagtg cgc 23
<210> 22
<211> 23
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 22
aaacgcgcac tagttcttga agt 23
<210> 23
<211> 24
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 23
accggcggta ccacgtcttg taga 24
<210> 24
<211> 24
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 24
aaactctaca agacgtggta ccgc 24
<210> 25
<211> 24
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 25
accgaccaga gaatgaaatc taag 24
<210> 26
<211> 24
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 26
aaaccttaga tttcattctc tggt 24
<210> 27
<211> 24
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 27
accgtgatca agagcgagca gtag 24
<210> 28
<211> 24
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 28
aaacctactg ctcgctcttg atca 24
<210> 29
<211> 1904
<212> PRT
<213> 人工序列(Artificial Sequence)
<400> 29
Met Lys Arg Thr Ala Asp Gly Ser Glu Phe Glu Ser Pro Lys Lys Lys
1 5 10 15
Arg Lys Val Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn
20 25 30
Ser Val Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys
35 40 45
Lys Phe Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn
50 55 60
Leu Ile Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr
65 70 75 80
Arg Leu Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg
85 90 95
Ile Cys Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp
100 105 110
Asp Ser Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp
115 120 125
Lys Lys His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val
130 135 140
Ala Tyr His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu
145 150 155 160
Val Asp Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu
165 170 175
Ala His Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu
180 185 190
Asn Pro Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln
195 200 205
Thr Tyr Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val
210 215 220
Asp Ala Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu
225 230 235 240
Glu Asn Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe
245 250 255
Gly Asn Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser
260 265 270
Asn Phe Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr
275 280 285
Tyr Asp Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr
290 295 300
Ala Asp Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu
305 310 315 320
Ser Asp Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser
325 330 335
Ala Ser Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu
340 345 350
Leu Lys Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile
355 360 365
Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly
370 375 380
Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys
385 390 395 400
Met Asp Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu
405 410 415
Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile
420 425 430
His Leu Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr
435 440 445
Pro Phe Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe
450 455 460
Arg Ile Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe
465 470 475 480
Ala Trp Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe
485 490 495
Glu Glu Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg
500 505 510
Met Thr Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys
515 520 525
His Ser Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys
530 535 540
Val Lys Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly
545 550 555 560
Glu Gln Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys
565 570 575
Val Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys
580 585 590
Phe Asp Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser
595 600 605
Leu Gly Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe
610 615 620
Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr
625 630 635 640
Leu Thr Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr
645 650 655
Tyr Ala His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg
660 665 670
Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile
675 680 685
Arg Asp Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp
690 695 700
Gly Phe Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu
705 710 715 720
Thr Phe Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp
725 730 735
Ser Leu His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys
740 745 750
Lys Gly Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val
755 760 765
Met Gly Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu
770 775 780
Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys
785 790 795 800
Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu
805 810 815
His Pro Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr
820 825 830
Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile
835 840 845
Asn Arg Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe
850 855 860
Leu Lys Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys
865 870 875 880
Asn Arg Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys
885 890 895
Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln
900 905 910
Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu
915 920 925
Leu Asp Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln
930 935 940
Ile Thr Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys
945 950 955 960
Tyr Asp Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu
965 970 975
Lys Ser Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys
980 985 990
Val Arg Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn
995 1000 1005
Ala Val Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser
1010 1015 1020
Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile
1025 1030 1035 1040
Ala Lys Ser Glu Gln Ser Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser
1045 1050 1055
Ala Thr Pro Glu Ser Gly Ser Glu Thr Gly Pro Val Ala Val Asp Pro
1060 1065 1070
Thr Leu Arg Arg Arg Ile Glu Pro His Glu Phe Glu Val Phe Phe Asp
1075 1080 1085
Pro Arg Glu Leu Arg Lys Glu Thr Cys Leu Leu Tyr Glu Ile Lys Trp
1090 1095 1100
Gly Thr Ser His Lys Ile Trp Arg His Ser Ser Lys Asn Thr Thr Lys
1105 1110 1115 1120
His Val Glu Val Asn Phe Ile Glu Lys Phe Thr Ser Glu Arg His Phe
1125 1130 1135
Cys Pro Ser Thr Ser Cys Ser Ile Thr Trp Phe Leu Ser Trp Ser Pro
1140 1145 1150
Cys Gly Glu Cys Ser Lys Ala Ile Thr Glu Phe Leu Ser Gln His Pro
1155 1160 1165
Asn Val Thr Leu Val Ile Tyr Val Ala Arg Leu Tyr His His Met Asp
1170 1175 1180
Gln Gln Asn Arg Gln Gly Leu Arg Asp Leu Val Asn Ser Gly Val Thr
1185 1190 1195 1200
Ile Gln Ile Met Thr Ala Pro Glu Tyr Asp Tyr Cys Trp Arg Asn Phe
1205 1210 1215
Val Asn Tyr Pro Pro Gly Lys Glu Ala His Trp Pro Arg Tyr Pro Pro
1220 1225 1230
Leu Trp Met Lys Leu Tyr Ala Leu Glu Leu His Ala Gly Ile Leu Gly
1235 1240 1245
Leu Pro Pro Cys Leu Asn Ile Leu Arg Arg Lys Gln Pro Gln Leu Thr
1250 1255 1260
Phe Phe Thr Ile Ala Leu Gln Ser Cys His Tyr Gln Arg Leu Pro Pro
1265 1270 1275 1280
His Ile Leu Trp Ala Thr Gly Leu Lys Ser Gly Gly Ser Gly Gly Ser
1285 1290 1295
Gly Gly Ser Met Ala Ala Ala Lys Ala Pro Gly Ala Ala Glu Phe Val
1300 1305 1310
Pro Ala Asp Ala Asp Leu Asp Thr Leu Arg Thr Ala Val Gln Gly Cys
1315 1320 1325
Arg Gly Cys Glu Leu Tyr Arg Gly Ala Thr Gln Ala Val Phe Gly Glu
1330 1335 1340
Gly Pro Ala His Ala Pro Val Phe Val Val Gly Glu Gln Pro Gly Asp
1345 1350 1355 1360
Arg Glu Asp Val Ala Gly His Pro Phe Val Gly Pro Ala Gly Arg Leu
1365 1370 1375
Leu Asp Lys Ala Leu Thr Glu Ala Asp Ile Asp Arg Glu Ala Val Tyr
1380 1385 1390
Leu Thr Asn Ala Val Lys His Phe Lys Phe Glu Glu Arg Gly Lys Arg
1395 1400 1405
Arg Ile His Lys Gln Pro Gly Arg Thr Glu Val Val Ala Cys Ser Pro
1410 1415 1420
Trp Leu Thr Ala Glu Leu Asp Ala Val Arg Pro Gln Leu Val Val Cys
1425 1430 1435 1440
Leu Gly Ala Val Ala Ala Lys Ala Val Leu Gly Pro Ser Phe Lys Val
1445 1450 1455
Ser Glu Arg Arg Gly Glu Val Val Glu Ala Gly Glu His Arg Val Ile
1460 1465 1470
Ala Thr Val His Pro Ser Ser Val Leu Arg Ala Pro Asp Arg Ala Ala
1475 1480 1485
Ala Tyr Ala Asp Phe Leu Ala Asp Leu Arg Lys Val Arg Thr Ala Ala
1490 1495 1500
Gly Glu Leu His Arg Ala Ser Gly Gly Ser Ser Gly Gly Ser Ser Gly
1505 1510 1515 1520
Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser Ser Gly
1525 1530 1535
Gly Ser Ser Gly Gly Ser Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe
1540 1545 1550
Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala
1555 1560 1565
Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr
1570 1575 1580
Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg Lys
1585 1590 1595 1600
Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr Glu Val Gln
1605 1610 1615
Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg Asn Ser Asp
1620 1625 1630
Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys Lys Tyr Gly Gly
1635 1640 1645
Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu Val Val Ala Lys Val
1650 1655 1660
Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser Val Lys Glu Leu Leu Gly
1665 1670 1675 1680
Ile Thr Ile Met Glu Arg Ser Ser Phe Glu Lys Asn Pro Ile Asp Phe
1685 1690 1695
Leu Glu Ala Lys Gly Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile Lys
1700 1705 1710
Leu Pro Lys Tyr Ser Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met
1715 1720 1725
Leu Ala Ser Ala Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro
1730 1735 1740
Ser Lys Tyr Val Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu
1745 1750 1755 1760
Lys Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln
1765 1770 1775
His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser
1780 1785 1790
Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala
1795 1800 1805
Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile
1810 1815 1820
Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe Lys
1825 1830 1835 1840
Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr Lys Glu
1845 1850 1855
Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr Gly Leu Tyr Glu
1860 1865 1870
Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp Ser Gly Gly Ser Lys
1875 1880 1885
Arg Thr Ala Asp Gly Ser Glu Phe Glu Pro Lys Lys Lys Arg Lys Val
1890 1895 1900
<210> 30
<211> 56
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 30
cctggcccac cctcgtgacc accctgaccc atggcgtgca gtgcttcagc cgctac 56
<210> 31
<211> 56
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 31
cctggcccac cctcgtgacc accctgaccg atggcgtgca gtgcttcagc cgctac 56
<210> 32
<211> 56
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 32
cctggcccac cctcgtgacc accctgacct atggcgtgca gtgcttcagc cgctac 56
<210> 33
<211> 56
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 33
cctggcccac cctcgtgacc accctgacca atggcgtgca gtgcttcagc cgctac 56
<210> 34
<211> 1003
<212> PRT
<213> 人工序列(Artificial Sequence)
<400> 34
Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val
1 5 10 15
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
545 550 555 560
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
820 825 830
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
850 855 860
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
900 905 910
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
930 935 940
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys
995 1000
<210> 35
<211> 1027
<212> PRT
<213> 人工序列(Artificial Sequence)
<400> 35
Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val
1 5 10 15
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
545 550 555 560
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
820 825 830
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
850 855 860
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
900 905 910
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
930 935 940
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys
1010 1015 1020
Ser Glu Gln
1025
<210> 36
<211> 1038
<212> PRT
<213> 人工序列(Artificial Sequence)
<400> 36
Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val Gly
1 5 10 15
Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys
20 25 30
Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly
35 40 45
Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys
50 55 60
Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr
65 70 75 80
Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe
85 90 95
Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His
100 105 110
Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His
115 120 125
Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser
130 135 140
Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met
145 150 155 160
Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp
165 170 175
Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn
180 185 190
Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys
195 200 205
Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu
210 215 220
Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu
225 230 235 240
Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp
245 250 255
Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp
260 265 270
Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu
275 280 285
Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp Ile
290 295 300
Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser Met
305 310 315 320
Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys Ala
325 330 335
Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe Asp
340 345 350
Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser Gln
355 360 365
Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp Gly
370 375 380
Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg Lys
385 390 395 400
Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu Gly
405 410 415
Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe Leu
420 425 430
Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro
435 440 445
Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met
450 455 460
Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val
465 470 475 480
Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn
485 490 495
Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu
500 505 510
Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr
515 520 525
Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln Lys
530 535 540
Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr Val
545 550 555 560
Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp Ser
565 570 575
Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly Thr
580 585 590
Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp Asn
595 600 605
Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr Leu
610 615 620
Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala His
625 630 635 640
Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr Thr
645 650 655
Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp Lys
660 665 670
Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe Ala
675 680 685
Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe Lys
690 695 700
Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu His
705 710 715 720
Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile
725 730 735
Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly Arg
740 745 750
His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln Thr
755 760 765
Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile Glu
770 775 780
Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro Val
785 790 795 800
Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln
805 810 815
Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg Leu
820 825 830
Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys Asp
835 840 845
Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg Gly
850 855 860
Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys Asn
865 870 875 880
Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe
885 890 895
Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys
900 905 910
Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr Lys
915 920 925
His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu
930 935 940
Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser Lys
945 950 955 960
Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg Glu
965 970 975
Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val Val
980 985 990
Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val
995 1000 1005
Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys Ser
1010 1015 1020
Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr
1025 1030 1035
<210> 37
<211> 365
<212> PRT
<213> 人工序列(Artificial Sequence)
<400> 37
Leu Glu Ser Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg
1 5 10 15
Lys Met Ile Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys
20 25 30
Tyr Phe Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr
35 40 45
Leu Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly
50 55 60
Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val
65 70 75 80
Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr Glu
85 90 95
Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg Asn
100 105 110
Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys Lys Tyr
115 120 125
Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu Val Val Ala
130 135 140
Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser Val Lys Glu Leu
145 150 155 160
Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe Glu Lys Asn Pro Ile
165 170 175
Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu Val Lys Lys Asp Leu Ile
180 185 190
Ile Lys Leu Pro Lys Tyr Ser Leu Phe Glu Leu Glu Asn Gly Arg Lys
195 200 205
Arg Met Leu Ala Ser Ala Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala
210 215 220
Leu Pro Ser Lys Tyr Val Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu
225 230 235 240
Lys Leu Lys Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val
245 250 255
Glu Gln His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu
260 265 270
Phe Ser Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu
275 280 285
Ser Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu
290 295 300
Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala
305 310 315 320
Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr
325 330 335
Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr Gly Leu
340 345 350
Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
355 360 365
<210> 38
<211> 341
<212> PRT
<213> 人工序列(Artificial Sequence)
<400> 38
Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr Ser Asn Ile Met
1 5 10 15
Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn Gly Glu Ile Arg Lys
20 25 30
Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr Gly Glu Ile Val Trp Asp
35 40 45
Lys Gly Arg Asp Phe Ala Thr Val Arg Lys Val Leu Ser Met Pro Gln
50 55 60
Val Asn Ile Val Lys Lys Thr Glu Val Gln Thr Gly Gly Phe Ser Lys
65 70 75 80
Glu Ser Ile Leu Pro Lys Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys
85 90 95
Lys Asp Trp Asp Pro Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val
100 105 110
Ala Tyr Ser Val Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys
115 120 125
Lys Leu Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg
130 135 140
Ser Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr
145 150 155 160
Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu
165 170 175
Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly Glu
180 185 190
Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val Asn Phe
195 200 205
Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser Pro Glu Asp
210 215 220
Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His Tyr Leu Asp
225 230 235 240
Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg Val Ile Leu Ala
245 250 255
Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr Asn Lys His Arg Asp
260 265 270
Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile Ile His Leu Phe Thr Leu
275 280 285
Thr Asn Leu Gly Ala Pro Ala Ala Phe Lys Tyr Phe Asp Thr Thr Ile
290 295 300
Asp Arg Lys Arg Tyr Thr Ser Thr Lys Glu Val Leu Asp Ala Thr Leu
305 310 315 320
Ile His Gln Ser Ile Thr Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser
325 330 335
Gln Leu Gly Gly Asp
340
<210> 39
<211> 329
<212> PRT
<213> 人工序列(Artificial Sequence)
<400> 39
Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn Gly
1 5 10 15
Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr Gly Glu
20 25 30
Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg Lys Val Leu
35 40 45
Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr Glu Val Gln Thr Gly
50 55 60
Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg Asn Ser Asp Lys Leu
65 70 75 80
Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys Lys Tyr Gly Gly Phe Asp
85 90 95
Ser Pro Thr Val Ala Tyr Ser Val Leu Val Val Ala Lys Val Glu Lys
100 105 110
Gly Lys Ser Lys Lys Leu Lys Ser Val Lys Glu Leu Leu Gly Ile Thr
115 120 125
Ile Met Glu Arg Ser Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu
130 135 140
Ala Lys Gly Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro
145 150 155 160
Lys Tyr Ser Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala
165 170 175
Ser Ala Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys
180 185 190
Tyr Val Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly
195 200 205
Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys
210 215 220
His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg
225 230 235 240
Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr Asn
245 250 255
Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile Ile His
260 265 270
Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe Lys Tyr Phe
275 280 285
Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr Lys Glu Val Leu
290 295 300
Asp Ala Thr Leu Ile His Gln Ser Ile Thr Gly Leu Tyr Glu Thr Arg
305 310 315 320
Ile Asp Leu Ser Gln Leu Gly Gly Asp
325
<210> 40
<211> 18
<212> PRT
<213> 人工序列(Artificial Sequence)
<400> 40
Ser Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser
1 5 10 15
Gly Ser
<210> 41
<211> 18
<212> PRT
<213> 人工序列(Artificial Sequence)
<400> 41
Ser Gly Ser Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro
1 5 10 15
Glu Ser
<210> 42
<211> 5631
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 42
atgaaacgga cagccgacgg aagcgagttc gagtcaccaa agaagaagcg gaaagtcgac 60
aagaagtaca gcatcggcct ggccatcggc accaactctg tgggctgggc cgtgatcacc 120
gacgagtaca aggtgcccag caagaaattc aaggtgctgg gcaacaccga ccggcacagc 180
atcaagaaga acctgatcgg agccctgctg ttcgacagcg gcgaaacagc cgaggccacc 240
cggctgaaga gaaccgccag aagaagatac accagacgga agaaccggat ctgctatctg 300
caagagatct tcagcaacga gatggccaag gtggacgaca gcttcttcca cagactggaa 360
gagtccttcc tggtggaaga ggataagaag cacgagcggc accccatctt cggcaacatc 420
gtggacgagg tggcctacca cgagaagtac cccaccatct accacctgag aaagaaactg 480
gtggacagca ccgacaaggc cgacctgcgg ctgatctatc tggccctggc ccacatgatc 540
aagttccggg gccacttcct gatcgagggc gacctgaacc ccgacaacag cgacgtggac 600
aagctgttca tccagctggt gcagacctac aaccagctgt tcgaggaaaa ccccatcaac 660
gccagcggcg tggacgccaa ggccatcctg tctgccagac tgagcaagag cagacggctg 720
gaaaatctga tcgcccagct gcccggcgag aagaagaatg gcctgttcgg aaacctgatt 780
gccctgagcc tgggcctgac ccccaacttc aagagcaact tcgacctggc cgaggatgcc 840
aaactgcagc tgagcaagga cacctacgac gacgacctgg acaacctgct ggcccagatc 900
ggcgaccagt acgccgacct gtttctggcc gccaagaacc tgtccgacgc catcctgctg 960
agcgacatcc tgagagtgaa caccgagatc accaaggccc ccctgagcgc ctctatgatc 1020
aagagatacg acgagcacca ccaggacctg accctgctga aagctctcgt gcggcagcag 1080
ctgcctgaga agtacaaaga gattttcttc gaccagagca agaacggcta cgccggctac 1140
attgacggcg gagccagcca ggaagagttc tacaagttca tcaagcccat cctggaaaag 1200
atggacggca ccgaggaact gctcgtgaag ctgaacagag aggacctgct gcggaagcag 1260
cggaccttcg acaacggcag catcccccac cagatccacc tgggagagct gcacgccatt 1320
ctgcggcggc aggaagattt ttacccattc ctgaaggaca accgggaaaa gatcgagaag 1380
atcctgacct tccgcatccc ctactacgtg ggccctctgg ccaggggaaa cagcagattc 1440
gcctggatga ccagaaagag cgaggaaacc atcaccccct ggaacttcga ggaagtggtg 1500
gacaagggcg cttccgccca gagcttcatc gagcggatga ccaacttcga taagaacctg 1560
cccaacgaga aggtgctgcc caagcacagc ctgctgtacg agtacttcac cgtgtataac 1620
gagctgacca aagtgaaata cgtgaccgag ggaatgagaa agcccgcctt cctgagcggc 1680
gagcagaaaa aggccatcgt ggacctgctg ttcaagacca accggaaagt gaccgtgaag 1740
cagctgaaag aggactactt caagaaaatc gagtgcttcg actccgtgga aatctccggc 1800
gtggaagatc ggttcaacgc ctccctgggc acataccacg atctgctgaa aattatcaag 1860
gacaaggact tcctggacaa tgaggaaaac gaggacattc tggaagatat cgtgctgacc 1920
ctgacactgt ttgaggacag agagatgatc gaggaacggc tgaaaaccta tgcccacctg 1980
ttcgacgaca aagtgatgaa gcagctgaag cggcggagat acaccggctg gggcaggctg 2040
agccggaagc tgatcaacgg catccgggac aagcagtccg gcaagacaat cctggatttc 2100
ctgaagtccg acggcttcgc caacagaaac ttcatgcagc tgatccacga cgacagcctg 2160
acctttaaag aggacatcca gaaagcccag gtgtccggcc agggcgatag cctgcacgag 2220
cacattgcca atctggccgg cagccccgcc attaagaagg gcatcctgca gacagtgaag 2280
gtggtggacg agctcgtgaa agtgatgggc cggcacaagc ccgagaacat cgtgatcgaa 2340
atggccagag agaaccagac cacccagaag ggacagaaga acagccgcga gagaatgaag 2400
cggatcgaag agggcatcaa agagctgggc agccagatcc tgaaagaaca ccccgtggaa 2460
aacacccagc tgcagaacga gaagctgtac ctgtactacc tgcagaatgg gcgggatatg 2520
tacgtggacc aggaactgga catcaaccgg ctgtccgact acgatgtgga ccatatcgtg 2580
cctcagagct ttctgaagga cgactccatc gacaacaagg tgctgaccag aagcgacaag 2640
aaccggggca agagcgacaa cgtgccctcc gaagaggtcg tgaagaagat gaagaactac 2700
tggcggcagc tgctgaacgc caagctgatt acccagagaa agttcgacaa tctgaccaag 2760
gccgagagag gcggcctgag cgaactggat aaggccggct tcatcaagag acagctggtg 2820
gaaacccggc agatcacaaa gcacgtggca cagatcctgg actcccggat gaacactaag 2880
tacgacgaga atgacaagct gatccgggaa gtgaaagtga tcaccctgaa gtccaagctg 2940
gtgtccgatt tccggaagga tttccagttt tacaaagtgc gcgagatcaa caactaccac 3000
cacgcccacg acgcctacct aaacgccgtc gtgggaaccg ccctgatcaa aaagtaccct 3060
aagtctggca gcgagacacc aggaacaagc gagtcagcaa caccagagag cggcagcgag 3120
gcatctccag caagcggacc aaggcacctg atggaccccc acatcttcac ctctaacttt 3180
aacaatggca tcggcaggca caagacatac ctgtgctatg aggtggagcg cctggacaat 3240
ggcaccagcg tgaagatgga tcagcacaga ggcttcctgc acaaccaggc caagaatctg 3300
ctgtgcggct tctacggccg gcacgcagag ctgagatttc tggacctggt gcctagcctg 3360
cagctggatc cagcccagat ctatagggtg acctggttca tcagctggtc cccatgcttt 3420
tcctggggat gtgcaggaga ggtgcgcgcc tttctgcagg agaacacaca cgtgcggctg 3480
agaatcttcg ccgcccggat ctttgactac gatcctctgt ataaggaggc cctgcagatg 3540
ctgagagacg caggagccca ggtgtccatc atgacctacg atgagttcaa gcactgctgg 3600
gacacatttg tggatcacca gggctgtccc ttccagcctt gggacggact ggatgagcac 3660
tcccaggccc tgtctggcag gctgagggcc atcctgcaga accagggcaa tagcggagga 3720
tccggaggat ctggaggcag catggccgct gctaaagccc ccggcgctgc tgaattcgtc 3780
cccgctgatg ccgacctgga tacactgcgg accgccgtgc agggctgcag aggctgtgaa 3840
ctgtacagag gcgccaccca ggccgtgttc ggcgagggcc ctgctcacgc ccctgtgttt 3900
gtggtcggcg agcagcctgg cgaccgggaa gatgtggccg gccacccctt cgtgggcccc 3960
gccggcagac tgctggacaa ggccctgaca gaggccgaca tcgacaggga agccgtgtac 4020
ctgaccaacg ccgttaagca cttcaagttc gaggaaagag gaaaaagaag aatccacaag 4080
caacctggca gaaccgaggt ggtggcatgc agcccttggc tgaccgccga gctggacgcc 4140
gtgcggcctc agctggtggt gtgcctgggc gccgtggccg ccaaggccgt gctgggacca 4200
tcttttaagg tgtccgagcg gcggggcgaa gtggtcgagg ccggagagca cagagtgatc 4260
gccacagtgc accctagcag cgtgctgaga gccccagacc gcgccgctgc ctacgccgac 4320
ttcctggccg atctgagaaa ggtgcggacc gccgctggag agctccatag agcctctgga 4380
ggatctagcg gaggatcctc tggcagcgag acaccaggaa caagcgagtc agcaacacca 4440
gagagcagtg gcggcagcag cggcggcagc ctggaaagcg agttcgtgta cggcgactac 4500
aaggtgtacg acgtgcggaa gatgatcgcc aagagcgagc aggaaatcgg caaggctacc 4560
gccaagtact tcttctacag caacatcatg aactttttca agaccgagat taccctggcc 4620
aacggcgaga tccggaagcg gcctctgatc gagacaaacg gcgaaaccgg ggagatcgtg 4680
tgggataagg gccgggattt tgccaccgtg cggaaagtgc tgagcatgcc ccaagtgaat 4740
atcgtgaaaa agaccgaggt gcagacaggc ggcttcagca aagagtctat cctgcccaag 4800
aggaacagcg ataagctgat cgccagaaag aaggactggg accctaagaa gtacggcggc 4860
ttcgacagcc ccaccgtggc ctattctgtg ctggtggtgg ccaaagtgga aaagggcaag 4920
tccaagaaac tgaagagtgt gaaagagctg ctggggatca ccatcatgga aagaagcagc 4980
ttcgagaaga atcccatcga ctttctggaa gccaagggct acaaagaagt gaaaaaggac 5040
ctgatcatca agctgcctaa gtactccctg ttcgagctgg aaaacggccg gaagagaatg 5100
ctggcctctg ccggcgaact gcagaaggga aacgaactgg ccctgccctc caaatatgtg 5160
aacttcctgt acctggccag ccactatgag aagctgaagg gctcccccga ggataatgag 5220
cagaaacagc tgtttgtgga acagcacaag cactacctgg acgagatcat cgagcagatc 5280
agcgagttct ccaagagagt gatcctggcc gacgctaatc tggacaaagt gctgtccgcc 5340
tacaacaagc accgggataa gcccatcaga gagcaggccg agaatatcat ccacctgttt 5400
accctgacca atctgggagc ccctgccgcc ttcaagtact ttgacaccac catcgaccgg 5460
aagaggtaca ccagcaccaa agaggtgctg gacgccaccc tgatccacca gagcatcacc 5520
ggcctgtacg agacacggat cgacctgtct cagctgggag gtgactctgg cggctcaaaa 5580
agaaccgccg acggcagcga attcgagccc aagaagaaga ggaaagtcta a 5631
<210> 43
<211> 5631
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 43
atgaaacgga cagccgacgg aagcgagttc gagtcaccaa agaagaagcg gaaagtcgac 60
aagaagtaca gcatcggcct ggccatcggc accaactctg tgggctgggc cgtgatcacc 120
gacgagtaca aggtgcccag caagaaattc aaggtgctgg gcaacaccga ccggcacagc 180
atcaagaaga acctgatcgg agccctgctg ttcgacagcg gcgaaacagc cgaggccacc 240
cggctgaaga gaaccgccag aagaagatac accagacgga agaaccggat ctgctatctg 300
caagagatct tcagcaacga gatggccaag gtggacgaca gcttcttcca cagactggaa 360
gagtccttcc tggtggaaga ggataagaag cacgagcggc accccatctt cggcaacatc 420
gtggacgagg tggcctacca cgagaagtac cccaccatct accacctgag aaagaaactg 480
gtggacagca ccgacaaggc cgacctgcgg ctgatctatc tggccctggc ccacatgatc 540
aagttccggg gccacttcct gatcgagggc gacctgaacc ccgacaacag cgacgtggac 600
aagctgttca tccagctggt gcagacctac aaccagctgt tcgaggaaaa ccccatcaac 660
gccagcggcg tggacgccaa ggccatcctg tctgccagac tgagcaagag cagacggctg 720
gaaaatctga tcgcccagct gcccggcgag aagaagaatg gcctgttcgg aaacctgatt 780
gccctgagcc tgggcctgac ccccaacttc aagagcaact tcgacctggc cgaggatgcc 840
aaactgcagc tgagcaagga cacctacgac gacgacctgg acaacctgct ggcccagatc 900
ggcgaccagt acgccgacct gtttctggcc gccaagaacc tgtccgacgc catcctgctg 960
agcgacatcc tgagagtgaa caccgagatc accaaggccc ccctgagcgc ctctatgatc 1020
aagagatacg acgagcacca ccaggacctg accctgctga aagctctcgt gcggcagcag 1080
ctgcctgaga agtacaaaga gattttcttc gaccagagca agaacggcta cgccggctac 1140
attgacggcg gagccagcca ggaagagttc tacaagttca tcaagcccat cctggaaaag 1200
atggacggca ccgaggaact gctcgtgaag ctgaacagag aggacctgct gcggaagcag 1260
cggaccttcg acaacggcag catcccccac cagatccacc tgggagagct gcacgccatt 1320
ctgcggcggc aggaagattt ttacccattc ctgaaggaca accgggaaaa gatcgagaag 1380
atcctgacct tccgcatccc ctactacgtg ggccctctgg ccaggggaaa cagcagattc 1440
gcctggatga ccagaaagag cgaggaaacc atcaccccct ggaacttcga ggaagtggtg 1500
gacaagggcg cttccgccca gagcttcatc gagcggatga ccaacttcga taagaacctg 1560
cccaacgaga aggtgctgcc caagcacagc ctgctgtacg agtacttcac cgtgtataac 1620
gagctgacca aagtgaaata cgtgaccgag ggaatgagaa agcccgcctt cctgagcggc 1680
gagcagaaaa aggccatcgt ggacctgctg ttcaagacca accggaaagt gaccgtgaag 1740
cagctgaaag aggactactt caagaaaatc gagtgcttcg actccgtgga aatctccggc 1800
gtggaagatc ggttcaacgc ctccctgggc acataccacg atctgctgaa aattatcaag 1860
gacaaggact tcctggacaa tgaggaaaac gaggacattc tggaagatat cgtgctgacc 1920
ctgacactgt ttgaggacag agagatgatc gaggaacggc tgaaaaccta tgcccacctg 1980
ttcgacgaca aagtgatgaa gcagctgaag cggcggagat acaccggctg gggcaggctg 2040
agccggaagc tgatcaacgg catccgggac aagcagtccg gcaagacaat cctggatttc 2100
ctgaagtccg acggcttcgc caacagaaac ttcatgcagc tgatccacga cgacagcctg 2160
acctttaaag aggacatcca gaaagcccag gtgtccggcc agggcgatag cctgcacgag 2220
cacattgcca atctggccgg cagccccgcc attaagaagg gcatcctgca gacagtgaag 2280
gtggtggacg agctcgtgaa agtgatgggc cggcacaagc ccgagaacat cgtgatcgaa 2340
atggccagag agaaccagac cacccagaag ggacagaaga acagccgcga gagaatgaag 2400
cggatcgaag agggcatcaa agagctgggc agccagatcc tgaaagaaca ccccgtggaa 2460
aacacccagc tgcagaacga gaagctgtac ctgtactacc tgcagaatgg gcgggatatg 2520
tacgtggacc aggaactgga catcaaccgg ctgtccgact acgatgtgga ccatatcgtg 2580
cctcagagct ttctgaagga cgactccatc gacaacaagg tgctgaccag aagcgacaag 2640
aaccggggca agagcgacaa cgtgccctcc gaagaggtcg tgaagaagat gaagaactac 2700
tggcggcagc tgctgaacgc caagctgatt acccagagaa agttcgacaa tctgaccaag 2760
gccgagagag gcggcctgag cgaactggat aaggccggct tcatcaagag acagctggtg 2820
gaaacccggc agatcacaaa gcacgtggca cagatcctgg actcccggat gaacactaag 2880
tacgacgaga atgacaagct gatccgggaa gtgaaagtga tcaccctgaa gtccaagctg 2940
gtgtccgatt tccggaagga tttccagttt tacaaagtgc gcgagatcaa caactaccac 3000
cacgcccacg acgcctacct aaacgccgtc gtgggaaccg ccctgatcaa aaagtaccct 3060
aagctggaaa gcgagttcgt gtacggcgac tacaaggtgt acgacgtgcg gaagatgatc 3120
gccaagagcg agcagtctgg cagcgagaca ccaggaacaa gcgagtcagc aacaccagag 3180
agcggcagcg aggcatctcc agcaagcgga ccaaggcacc tgatggaccc ccacatcttc 3240
acctctaact ttaacaatgg catcggcagg cacaagacat acctgtgcta tgaggtggag 3300
cgcctggaca atggcaccag cgtgaagatg gatcagcaca gaggcttcct gcacaaccag 3360
gccaagaatc tgctgtgcgg cttctacggc cggcacgcag agctgagatt tctggacctg 3420
gtgcctagcc tgcagctgga tccagcccag atctataggg tgacctggtt catcagctgg 3480
tccccatgct tttcctgggg atgtgcagga gaggtgcgcg cctttctgca ggagaacaca 3540
cacgtgcggc tgagaatctt cgccgcccgg atctttgact acgatcctct gtataaggag 3600
gccctgcaga tgctgagaga cgcaggagcc caggtgtcca tcatgaccta cgatgagttc 3660
aagcactgct gggacacatt tgtggatcac cagggctgtc ccttccagcc ttgggacgga 3720
ctggatgagc actcccaggc cctgtctggc aggctgaggg ccatcctgca gaaccagggc 3780
aatagcggag gatccggagg atctggaggc agcatggccg ctgctaaagc ccccggcgct 3840
gctgaattcg tccccgctga tgccgacctg gatacactgc ggaccgccgt gcagggctgc 3900
agaggctgtg aactgtacag aggcgccacc caggccgtgt tcggcgaggg ccctgctcac 3960
gcccctgtgt ttgtggtcgg cgagcagcct ggcgaccggg aagatgtggc cggccacccc 4020
ttcgtgggcc ccgccggcag actgctggac aaggccctga cagaggccga catcgacagg 4080
gaagccgtgt acctgaccaa cgccgttaag cacttcaagt tcgaggaaag aggaaaaaga 4140
agaatccaca agcaacctgg cagaaccgag gtggtggcat gcagcccttg gctgaccgcc 4200
gagctggacg ccgtgcggcc tcagctggtg gtgtgcctgg gcgccgtggc cgccaaggcc 4260
gtgctgggac catcttttaa ggtgtccgag cggcggggcg aagtggtcga ggccggagag 4320
cacagagtga tcgccacagt gcaccctagc agcgtgctga gagccccaga ccgcgccgct 4380
gcctacgccg acttcctggc cgatctgaga aaggtgcgga ccgccgctgg agagctccat 4440
agagcctctg gaggatctag cggaggatcc tctggcagcg agacaccagg aacaagcgag 4500
tcagcaacac cagagagcag tggcggcagc agcggcggca gcgaaatcgg caaggctacc 4560
gccaagtact tcttctacag caacatcatg aactttttca agaccgagat taccctggcc 4620
aacggcgaga tccggaagcg gcctctgatc gagacaaacg gcgaaaccgg ggagatcgtg 4680
tgggataagg gccgggattt tgccaccgtg cggaaagtgc tgagcatgcc ccaagtgaat 4740
atcgtgaaaa agaccgaggt gcagacaggc ggcttcagca aagagtctat cctgcccaag 4800
aggaacagcg ataagctgat cgccagaaag aaggactggg accctaagaa gtacggcggc 4860
ttcgacagcc ccaccgtggc ctattctgtg ctggtggtgg ccaaagtgga aaagggcaag 4920
tccaagaaac tgaagagtgt gaaagagctg ctggggatca ccatcatgga aagaagcagc 4980
ttcgagaaga atcccatcga ctttctggaa gccaagggct acaaagaagt gaaaaaggac 5040
ctgatcatca agctgcctaa gtactccctg ttcgagctgg aaaacggccg gaagagaatg 5100
ctggcctctg ccggcgaact gcagaaggga aacgaactgg ccctgccctc caaatatgtg 5160
aacttcctgt acctggccag ccactatgag aagctgaagg gctcccccga ggataatgag 5220
cagaaacagc tgtttgtgga acagcacaag cactacctgg acgagatcat cgagcagatc 5280
agcgagttct ccaagagagt gatcctggcc gacgctaatc tggacaaagt gctgtccgcc 5340
tacaacaagc accgggataa gcccatcaga gagcaggccg agaatatcat ccacctgttt 5400
accctgacca atctgggagc ccctgccgcc ttcaagtact ttgacaccac catcgaccgg 5460
aagaggtaca ccagcaccaa agaggtgctg gacgccaccc tgatccacca gagcatcacc 5520
ggcctgtacg agacacggat cgacctgtct cagctgggag gtgactctgg cggctcaaaa 5580
agaaccgccg acggcagcga attcgagccc aagaagaaga ggaaagtcta a 5631
<210> 44
<211> 5631
<212> DNA
<213> 人工序列(Artificial Sequence)
<400> 44
atgaaacgga cagccgacgg aagcgagttc gagtcaccaa agaagaagcg gaaagtcgac 60
aagaagtaca gcatcggcct ggccatcggc accaactctg tgggctgggc cgtgatcacc 120
gacgagtaca aggtgcccag caagaaattc aaggtgctgg gcaacaccga ccggcacagc 180
atcaagaaga acctgatcgg agccctgctg ttcgacagcg gcgaaacagc cgaggccacc 240
cggctgaaga gaaccgccag aagaagatac accagacgga agaaccggat ctgctatctg 300
caagagatct tcagcaacga gatggccaag gtggacgaca gcttcttcca cagactggaa 360
gagtccttcc tggtggaaga ggataagaag cacgagcggc accccatctt cggcaacatc 420
gtggacgagg tggcctacca cgagaagtac cccaccatct accacctgag aaagaaactg 480
gtggacagca ccgacaaggc cgacctgcgg ctgatctatc tggccctggc ccacatgatc 540
aagttccggg gccacttcct gatcgagggc gacctgaacc ccgacaacag cgacgtggac 600
aagctgttca tccagctggt gcagacctac aaccagctgt tcgaggaaaa ccccatcaac 660
gccagcggcg tggacgccaa ggccatcctg tctgccagac tgagcaagag cagacggctg 720
gaaaatctga tcgcccagct gcccggcgag aagaagaatg gcctgttcgg aaacctgatt 780
gccctgagcc tgggcctgac ccccaacttc aagagcaact tcgacctggc cgaggatgcc 840
aaactgcagc tgagcaagga cacctacgac gacgacctgg acaacctgct ggcccagatc 900
ggcgaccagt acgccgacct gtttctggcc gccaagaacc tgtccgacgc catcctgctg 960
agcgacatcc tgagagtgaa caccgagatc accaaggccc ccctgagcgc ctctatgatc 1020
aagagatacg acgagcacca ccaggacctg accctgctga aagctctcgt gcggcagcag 1080
ctgcctgaga agtacaaaga gattttcttc gaccagagca agaacggcta cgccggctac 1140
attgacggcg gagccagcca ggaagagttc tacaagttca tcaagcccat cctggaaaag 1200
atggacggca ccgaggaact gctcgtgaag ctgaacagag aggacctgct gcggaagcag 1260
cggaccttcg acaacggcag catcccccac cagatccacc tgggagagct gcacgccatt 1320
ctgcggcggc aggaagattt ttacccattc ctgaaggaca accgggaaaa gatcgagaag 1380
atcctgacct tccgcatccc ctactacgtg ggccctctgg ccaggggaaa cagcagattc 1440
gcctggatga ccagaaagag cgaggaaacc atcaccccct ggaacttcga ggaagtggtg 1500
gacaagggcg cttccgccca gagcttcatc gagcggatga ccaacttcga taagaacctg 1560
cccaacgaga aggtgctgcc caagcacagc ctgctgtacg agtacttcac cgtgtataac 1620
gagctgacca aagtgaaata cgtgaccgag ggaatgagaa agcccgcctt cctgagcggc 1680
gagcagaaaa aggccatcgt ggacctgctg ttcaagacca accggaaagt gaccgtgaag 1740
cagctgaaag aggactactt caagaaaatc gagtgcttcg actccgtgga aatctccggc 1800
gtggaagatc ggttcaacgc ctccctgggc acataccacg atctgctgaa aattatcaag 1860
gacaaggact tcctggacaa tgaggaaaac gaggacattc tggaagatat cgtgctgacc 1920
ctgacactgt ttgaggacag agagatgatc gaggaacggc tgaaaaccta tgcccacctg 1980
ttcgacgaca aagtgatgaa gcagctgaag cggcggagat acaccggctg gggcaggctg 2040
agccggaagc tgatcaacgg catccgggac aagcagtccg gcaagacaat cctggatttc 2100
ctgaagtccg acggcttcgc caacagaaac ttcatgcagc tgatccacga cgacagcctg 2160
acctttaaag aggacatcca gaaagcccag gtgtccggcc agggcgatag cctgcacgag 2220
cacattgcca atctggccgg cagccccgcc attaagaagg gcatcctgca gacagtgaag 2280
gtggtggacg agctcgtgaa agtgatgggc cggcacaagc ccgagaacat cgtgatcgaa 2340
atggccagag agaaccagac cacccagaag ggacagaaga acagccgcga gagaatgaag 2400
cggatcgaag agggcatcaa agagctgggc agccagatcc tgaaagaaca ccccgtggaa 2460
aacacccagc tgcagaacga gaagctgtac ctgtactacc tgcagaatgg gcgggatatg 2520
tacgtggacc aggaactgga catcaaccgg ctgtccgact acgatgtgga ccatatcgtg 2580
cctcagagct ttctgaagga cgactccatc gacaacaagg tgctgaccag aagcgacaag 2640
aaccggggca agagcgacaa cgtgccctcc gaagaggtcg tgaagaagat gaagaactac 2700
tggcggcagc tgctgaacgc caagctgatt acccagagaa agttcgacaa tctgaccaag 2760
gccgagagag gcggcctgag cgaactggat aaggccggct tcatcaagag acagctggtg 2820
gaaacccggc agatcacaaa gcacgtggca cagatcctgg actcccggat gaacactaag 2880
tacgacgaga atgacaagct gatccgggaa gtgaaagtga tcaccctgaa gtccaagctg 2940
gtgtccgatt tccggaagga tttccagttt tacaaagtgc gcgagatcaa caactaccac 3000
cacgcccacg acgcctacct aaacgccgtc gtgggaaccg ccctgatcaa aaagtaccct 3060
aagctggaaa gcgagttcgt gtacggcgac tacaaggtgt acgacgtgcg gaagatgatc 3120
gccaagagcg agcaggaaat cggcaaggct accgccaagt acttcttcta ctctggcagc 3180
gagacaccag gaacaagcga gtcagcaaca ccagagagcg gcagcgaggc atctccagca 3240
agcggaccaa ggcacctgat ggacccccac atcttcacct ctaactttaa caatggcatc 3300
ggcaggcaca agacatacct gtgctatgag gtggagcgcc tggacaatgg caccagcgtg 3360
aagatggatc agcacagagg cttcctgcac aaccaggcca agaatctgct gtgcggcttc 3420
tacggccggc acgcagagct gagatttctg gacctggtgc ctagcctgca gctggatcca 3480
gcccagatct atagggtgac ctggttcatc agctggtccc catgcttttc ctggggatgt 3540
gcaggagagg tgcgcgcctt tctgcaggag aacacacacg tgcggctgag aatcttcgcc 3600
gcccggatct ttgactacga tcctctgtat aaggaggccc tgcagatgct gagagacgca 3660
ggagcccagg tgtccatcat gacctacgat gagttcaagc actgctggga cacatttgtg 3720
gatcaccagg gctgtccctt ccagccttgg gacggactgg atgagcactc ccaggccctg 3780
tctggcaggc tgagggccat cctgcagaac cagggcaata gcggaggatc cggaggatct 3840
ggaggcagca tggccgctgc taaagccccc ggcgctgctg aattcgtccc cgctgatgcc 3900
gacctggata cactgcggac cgccgtgcag ggctgcagag gctgtgaact gtacagaggc 3960
gccacccagg ccgtgttcgg cgagggccct gctcacgccc ctgtgtttgt ggtcggcgag 4020
cagcctggcg accgggaaga tgtggccggc caccccttcg tgggccccgc cggcagactg 4080
ctggacaagg ccctgacaga ggccgacatc gacagggaag ccgtgtacct gaccaacgcc 4140
gttaagcact tcaagttcga ggaaagagga aaaagaagaa tccacaagca acctggcaga 4200
accgaggtgg tggcatgcag cccttggctg accgccgagc tggacgccgt gcggcctcag 4260
ctggtggtgt gcctgggcgc cgtggccgcc aaggccgtgc tgggaccatc ttttaaggtg 4320
tccgagcggc ggggcgaagt ggtcgaggcc ggagagcaca gagtgatcgc cacagtgcac 4380
cctagcagcg tgctgagagc cccagaccgc gccgctgcct acgccgactt cctggccgat 4440
ctgagaaagg tgcggaccgc cgctggagag ctccatagag cctctggagg atctagcgga 4500
ggatcctctg gcagcgagac accaggaaca agcgagtcag caacaccaga gagcagtggc 4560
ggcagcagcg gcggcagcag caacatcatg aactttttca agaccgagat taccctggcc 4620
aacggcgaga tccggaagcg gcctctgatc gagacaaacg gcgaaaccgg ggagatcgtg 4680
tgggataagg gccgggattt tgccaccgtg cggaaagtgc tgagcatgcc ccaagtgaat 4740
atcgtgaaaa agaccgaggt gcagacaggc ggcttcagca aagagtctat cctgcccaag 4800
aggaacagcg ataagctgat cgccagaaag aaggactggg accctaagaa gtacggcggc 4860
ttcgacagcc ccaccgtggc ctattctgtg ctggtggtgg ccaaagtgga aaagggcaag 4920
tccaagaaac tgaagagtgt gaaagagctg ctggggatca ccatcatgga aagaagcagc 4980
ttcgagaaga atcccatcga ctttctggaa gccaagggct acaaagaagt gaaaaaggac 5040
ctgatcatca agctgcctaa gtactccctg ttcgagctgg aaaacggccg gaagagaatg 5100
ctggcctctg ccggcgaact gcagaaggga aacgaactgg ccctgccctc caaatatgtg 5160
aacttcctgt acctggccag ccactatgag aagctgaagg gctcccccga ggataatgag 5220
cagaaacagc tgtttgtgga acagcacaag cactacctgg acgagatcat cgagcagatc 5280
agcgagttct ccaagagagt gatcctggcc gacgctaatc tggacaaagt gctgtccgcc 5340
tacaacaagc accgggataa gcccatcaga gagcaggccg agaatatcat ccacctgttt 5400
accctgacca atctgggagc ccctgccgcc ttcaagtact ttgacaccac catcgaccgg 5460
aagaggtaca ccagcaccaa agaggtgctg gacgccaccc tgatccacca gagcatcacc 5520
ggcctgtacg agacacggat cgacctgtct cagctgggag gtgactctgg cggctcaaaa 5580
agaaccgccg acggcagcga attcgagccc aagaagaaga ggaaagtcta a 5631
Claims (15)
1. 一种融合蛋白,其特征在于,所述融合蛋白的氨基酸序列如SEQ ID No. 11~13至少任一所示。
2.一种分离的多核苷酸,其特征在于,其编码如权利要求1所述的融合蛋白。
3.一种构建体,其特征在于,所述构建体含有如权利要求2所述的分离的多核苷酸。
4.一种表达系统,其特征在于,所述表达系统含有如权利要求3所述的构建体或基因组中整合有外源的如权利要求2所述的多核苷酸。
5.如权利要求4所述的表达系统,其特征在于,所述表达系统的宿主细胞选自真核细胞或原核细胞。
6.权利要求5所述的表达系统,其特征在于,所述宿主细胞选自小鼠细胞或人细胞。
7.一种碱基编辑系统,其包括如权利要求1所述的融合蛋白或其编码多核苷酸。
8.如权利要求7所述的碱基编辑系统,其特征在于,所述碱基编辑系统还包括引导RNA;包括以下至少任一项:
1) 所述碱基编辑系统包括:a) 所述融合蛋白或其编码多核苷酸,b) 引导RNA核苷酸序列或其编码多核苷酸;和/或,
2) 所述引导RNA将所述融合蛋白靶向靶标序列中的C碱基。
9.如权利要求8所述的碱基编辑系统,其特征在于,包括以下至少任一项:
1) 所述碱基编辑系统包含一个或多个载体;所述一个或多个载体包含(i)第一调控元件,所述第一调控元件可操作地连接至所述融合蛋白的编码多核苷酸;以及(ii)第二调控元件,所述第二调控元件可操作地连接至所述引导RNA核苷酸序列的编码多核苷酸;
所述(i)和(ii)位于相同或不同载体上;
2) 所述碱基编辑系统包含(i)融合蛋白,以及(ii)包含所述引导核苷酸序列的编码多核苷酸的载体。
10.如权利要求1所述的融合蛋白、如权利要求2所述的分离的多核苷酸、如权利要求3所述的构建体或如权利要求4~6任一项所述的表达系统或如权利要求7~9任一项所述的碱基编辑系统在非疾病的诊断和治疗目的的基因编辑中的用途。
11.如权利要求10所述的用途,其特征在于,包括以下至少任一项:
1) 所述基因编辑实现碱基颠换;
2) 所述基因编辑实现C至G的颠换或者G至C的颠换;
3) 所述基因编辑用于实现致病位点的修正、基因功能研究、增强细胞功能、细胞治疗至少任一。
12.如权利要求11所述的用途,其特征在于,包括以下至少任一项:
1) 所述治病位点导致的疾病选自以下至少任一项:自身免疫性疾病、肿瘤、病毒感染性疾病、细菌感染性疾病;
2) 所述融合蛋白、分离的多核苷酸、构建体、表达系统或碱基编辑系统与其他药物或试剂联用。
13.一种非疾病的诊断和治疗目的的基因编辑方法,包括:通过如权利要求1所述的融合蛋白、如权利要求2所述的分离的多核苷酸、如权利要求3所述的构建体或如权利要求4~6任一项所述的表达系统或如权利要求7~9任一项所述的碱基编辑系统对靶标序列进行碱基编辑。
14.如权利要求13所述的方法,其特征在于,所述方法在体外实施。
15.如权利要求13所述的方法,其特征在于,所述方法在培养的细胞中实施。
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210415558.0A CN114835821B (zh) | 2022-04-18 | 2022-04-18 | 一种高效特异实现碱基颠换的编辑系统、方法及用途 |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210415558.0A CN114835821B (zh) | 2022-04-18 | 2022-04-18 | 一种高效特异实现碱基颠换的编辑系统、方法及用途 |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114835821A CN114835821A (zh) | 2022-08-02 |
CN114835821B true CN114835821B (zh) | 2023-12-22 |
Family
ID=82566429
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210415558.0A Active CN114835821B (zh) | 2022-04-18 | 2022-04-18 | 一种高效特异实现碱基颠换的编辑系统、方法及用途 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114835821B (zh) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114686456B (zh) * | 2022-05-10 | 2023-02-17 | 中山大学 | 基于双分子脱氨酶互补的碱基编辑系统及其应用 |
CN115820728A (zh) * | 2022-07-11 | 2023-03-21 | 上海贝斯昂科生物科技有限公司 | 一种基因编辑的方法和用途 |
CN116515766A (zh) * | 2023-06-30 | 2023-08-01 | 上海贝斯昂科生物科技有限公司 | 一种自然杀伤细胞、其制备方法及用途 |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111172133A (zh) * | 2020-03-10 | 2020-05-19 | 上海科技大学 | 一种碱基编辑工具及其用途 |
WO2021042047A1 (en) * | 2019-08-30 | 2021-03-04 | The General Hospital Corporation | C-to-g transversion dna base editors |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017048969A1 (en) * | 2015-09-17 | 2017-03-23 | The Regents Of The University Of California | Variant cas9 polypeptides comprising internal insertions |
US10745677B2 (en) * | 2016-12-23 | 2020-08-18 | President And Fellows Of Harvard College | Editing of CCR5 receptor gene to protect against HIV infection |
-
2022
- 2022-04-18 CN CN202210415558.0A patent/CN114835821B/zh active Active
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021042047A1 (en) * | 2019-08-30 | 2021-03-04 | The General Hospital Corporation | C-to-g transversion dna base editors |
CN111172133A (zh) * | 2020-03-10 | 2020-05-19 | 上海科技大学 | 一种碱基编辑工具及其用途 |
CN114058604A (zh) * | 2020-03-10 | 2022-02-18 | 上海科技大学 | 一种融合蛋白及其在碱基编辑中的用途 |
Also Published As
Publication number | Publication date |
---|---|
CN114835821A (zh) | 2022-08-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114835821B (zh) | 一种高效特异实现碱基颠换的编辑系统、方法及用途 | |
KR102606929B1 (ko) | 동족 항원과 t-세포 수용체 상호작용의 발견 및 특징규명을 위한 조작된 2-부분 세포 디바이스 | |
CN110055224B (zh) | 一种基因修饰的免疫细胞及其制备方法和应用 | |
KR101982360B1 (ko) | 콤팩트 tale-뉴클레아제의 발생 방법 및 이의 용도 | |
US20040077572A1 (en) | Transposon system and methods of use | |
AU2022200903B2 (en) | Engineered Cascade components and Cascade complexes | |
KR20230091894A (ko) | 부위 특이적 표적화 요소를 통한 프로그램 가능한 첨가(paste)를 사용하는 부위 특이적 유전 공학을 위한 시스템, 방법, 및 조성물 | |
DK2718440T3 (en) | NUCLEASE ACTIVITY PROTEIN, FUSION PROTEINS AND APPLICATIONS THEREOF | |
KR102614328B1 (ko) | T-세포 수용체 합성 및 tcr-제시 세포에 대한 안정적인 게놈 통합을 위한 2-부분 디바이스 | |
KR102584628B1 (ko) | T-세포 수용체, t-세포 항원 및 이들의 기능성 상호작용의 식별 및 특징규명을 위한 조작된 다성분 시스템 | |
GB2443186A (en) | Expression system for mediating alternative splicing | |
JP2023025182A (ja) | T細胞レセプター及びt細胞抗原の同定及び特徴決定のための遺伝子操作された多成分システム | |
CN111094569A (zh) | 光控性病毒蛋白质、其基因及包含该基因的病毒载体 | |
CN113692225B (zh) | 经基因组编辑的鸟类 | |
KR20230131229A (ko) | 부위 특이적 유전자 변형 | |
CN111315212B (zh) | 经过基因组编辑的鸟 | |
KR20240029020A (ko) | Dna 변형을 위한 crispr-트랜스포손 시스템 | |
CN116323955A (zh) | 通过crispr/cas介导的体内末端解析拯救重组腺病毒 | |
RU2827658C2 (ru) | Сконструированные компоненты cascade и комплексы cascade | |
RU2774631C1 (ru) | Сконструированные компоненты cascade и комплексы cascade | |
KR102727950B1 (ko) | 헬퍼 플라스미드 기반 가틀리스 아데노바이러스 생산 시스템 | |
RU2812852C2 (ru) | Невирусные днк-векторы и варианты их применения для экспрессии терапевтического средства на основе фактора viii (fviii) | |
KR20220027164A (ko) | 헬퍼 플라스미드 기반 가틀리스 아데노바이러스 생산 시스템 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |