KR102390485B1 - Method for designing a fusion protein comprising a DNA-binding protein having a PPR motif - Google Patents
Method for designing a fusion protein comprising a DNA-binding protein having a PPR motif
- Publication number
- KR102390485B1 (KR1020217009426A)
- Authority
- KR
- South Korea
- Prior art keywords
- leu
- ser
- glu
- ala
- amino acid
- Prior art date
Links
- 238000000034 method Methods 0.000 title claims description 36
- 108020001507 fusion proteins Proteins 0.000 title claims description 6
- 102000052510 DNA-Binding Proteins Human genes 0.000 title description 8
- 101710096438 DNA-binding protein Proteins 0.000 title description 6
- 108090000623 proteins and genes Proteins 0.000 claims abstract description 157
- 102000004169 proteins and genes Human genes 0.000 claims abstract description 148
- 108020004414 DNA Proteins 0.000 claims abstract description 129
- 125000003275 alpha amino acid group Chemical group 0.000 claims description 56
- 230000027455 binding Effects 0.000 claims description 47
- 108091028043 Nucleic acid sequence Proteins 0.000 claims description 27
- 102000004190 Enzymes Human genes 0.000 claims description 24
- 108090000790 Enzymes Proteins 0.000 claims description 24
- 150000007523 nucleic acids Chemical class 0.000 claims description 18
- 108020004707 nucleic acids Proteins 0.000 claims description 15
- 102000039446 nucleic acids Human genes 0.000 claims description 15
- 101100083945 Arabidopsis thaliana GUN1 gene Proteins 0.000 claims description 13
- FWMNVWWHGCHHJJ-SKKKGAJSSA-N 4-amino-1-[(2r)-6-amino-2-[[(2r)-2-[[(2r)-2-[[(2r)-2-amino-3-phenylpropanoyl]amino]-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]hexanoyl]piperidine-4-carboxylic acid Chemical compound C([C@H](C(=O)N[C@H](CC(C)C)C(=O)N[C@H](CCCCN)C(=O)N1CCC(N)(CC1)C(O)=O)NC(=O)[C@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FWMNVWWHGCHHJJ-SKKKGAJSSA-N 0.000 claims description 12
- 239000013598 vector Substances 0.000 claims description 12
- 102100034579 Desmoglein-1 Human genes 0.000 claims description 11
- 101150046388 GRP23 gene Proteins 0.000 claims description 11
- 230000022532 regulation of transcription, DNA-dependent Effects 0.000 claims description 9
- 101710163270 Nuclease Proteins 0.000 claims description 7
- 241001465754 Metazoa Species 0.000 claims description 5
- 102000037865 fusion proteins Human genes 0.000 claims description 5
- 230000002103 transcriptional effect Effects 0.000 claims description 4
- 102100027881 Tumor protein 63 Human genes 0.000 claims 3
- 241000588724 Escherichia coli Species 0.000 claims 2
- 241000282412 Homo Species 0.000 claims 2
- 108091008324 binding proteins Proteins 0.000 claims 2
- 102000023732 binding proteins Human genes 0.000 claims 2
- 150000001413 amino acids Chemical class 0.000 abstract description 341
- 230000004568 DNA-binding Effects 0.000 abstract description 27
- 229940024606 amino acid Drugs 0.000 description 341
- 235000001014 amino acid Nutrition 0.000 description 341
- 235000018102 proteins Nutrition 0.000 description 142
- DCXYFEDJOCDNAF-UHFFFAOYSA-N Asparagine Natural products OC(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-N 0.000 description 91
- 235000009582 asparagine Nutrition 0.000 description 80
- 229960001230 asparagine Drugs 0.000 description 80
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 70
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 52
- 235000003704 aspartic acid Nutrition 0.000 description 52
- 229960005261 aspartic acid Drugs 0.000 description 52
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 description 52
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 36
- 230000006870 function Effects 0.000 description 33
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 31
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Natural products CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 description 31
- 239000004474 valine Substances 0.000 description 31
- 229960004295 valine Drugs 0.000 description 31
- 235000014393 valine Nutrition 0.000 description 31
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 28
- AYFVYJQAPQTCCC-UHFFFAOYSA-N Threonine Natural products CC(O)C(N)C(O)=O AYFVYJQAPQTCCC-UHFFFAOYSA-N 0.000 description 28
- 239000004473 Threonine Substances 0.000 description 28
- 229960002898 threonine Drugs 0.000 description 28
- 235000008521 threonine Nutrition 0.000 description 28
- MTCFGRXMJLQNBG-REOHCLBHSA-N (2S)-2-Amino-3-hydroxypropansäure Chemical compound OC[C@H](N)C(O)=O MTCFGRXMJLQNBG-REOHCLBHSA-N 0.000 description 25
- AGPKZVBTJJNPAG-WHFBIAKZSA-N L-isoleucine Chemical compound CC[C@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-WHFBIAKZSA-N 0.000 description 24
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 24
- 235000014705 isoleucine Nutrition 0.000 description 24
- 229960000310 isoleucine Drugs 0.000 description 24
- AGPKZVBTJJNPAG-UHFFFAOYSA-N isoleucine Natural products CCC(C)C(N)C(O)=O AGPKZVBTJJNPAG-UHFFFAOYSA-N 0.000 description 24
- 235000004400 serine Nutrition 0.000 description 24
- 229960001153 serine Drugs 0.000 description 24
- AYFVYJQAPQTCCC-GBXIJSLDSA-N L-threonine Chemical compound C[C@@H](O)[C@H](N)C(O)=O AYFVYJQAPQTCCC-GBXIJSLDSA-N 0.000 description 23
- 230000007018 DNA scission Effects 0.000 description 18
- 239000004471 Glycine Substances 0.000 description 18
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Natural products OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 description 17
- 241000219194 Arabidopsis Species 0.000 description 15
- 210000004027 cell Anatomy 0.000 description 15
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 14
- 230000004570 RNA-binding Effects 0.000 description 14
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 13
- 241000219195 Arabidopsis thaliana Species 0.000 description 12
- 102100028629 Cytoskeleton-associated protein 4 Human genes 0.000 description 12
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 12
- 239000002773 nucleotide Substances 0.000 description 12
- 125000003729 nucleotide group Chemical group 0.000 description 12
- COLNVLDHVKWLRT-UHFFFAOYSA-N phenylalanine Natural products OC(=O)C(N)CC1=CC=CC=C1 COLNVLDHVKWLRT-UHFFFAOYSA-N 0.000 description 12
- 235000008729 phenylalanine Nutrition 0.000 description 12
- 229960005190 phenylalanine Drugs 0.000 description 12
- 238000013518 transcription Methods 0.000 description 12
- 230000035897 transcription Effects 0.000 description 12
- 241000880493 Leptailurus serval Species 0.000 description 11
- 108010050848 glycylleucine Proteins 0.000 description 11
- ONIBWKKTOPOVIA-UHFFFAOYSA-N Proline Natural products OC(=O)C1CCCN1 ONIBWKKTOPOVIA-UHFFFAOYSA-N 0.000 description 10
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 10
- 229960002429 proline Drugs 0.000 description 10
- 235000013930 proline Nutrition 0.000 description 10
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 9
- 101710185494 Zinc finger protein Proteins 0.000 description 9
- 102100023597 Zinc finger protein 816 Human genes 0.000 description 9
- 241000196324 Embryophyta Species 0.000 description 8
- ONIBWKKTOPOVIA-BYPYZUCNSA-N L-Proline Chemical compound OC(=O)[C@@H]1CCCN1 ONIBWKKTOPOVIA-BYPYZUCNSA-N 0.000 description 8
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 8
- 108010073062 Transcription Activator-Like Effectors Proteins 0.000 description 8
- 238000010362 genome editing Methods 0.000 description 8
- 108010049041 glutamylalanine Proteins 0.000 description 8
- 229960002449 glycine Drugs 0.000 description 8
- 229930182817 methionine Natural products 0.000 description 8
- 235000006109 methionine Nutrition 0.000 description 8
- RWQNBRDOKXIBIV-UHFFFAOYSA-N thymine Chemical compound CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 description 8
- PKVWNYGXMNWJSI-CIUDSAMLSA-N Gln-Gln-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O PKVWNYGXMNWJSI-CIUDSAMLSA-N 0.000 description 7
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 7
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 7
- 108010047495 alanylglycine Proteins 0.000 description 7
- 230000015572 biosynthetic process Effects 0.000 description 7
- 230000000694 effects Effects 0.000 description 7
- 239000013604 expression vector Substances 0.000 description 7
- 229960003136 leucine Drugs 0.000 description 7
- 235000005772 leucine Nutrition 0.000 description 7
- 108010073969 valyllysine Proteins 0.000 description 7
- QIVBCDIJIAJPQS-VIFPVBQESA-N L-tryptophane Chemical compound C1=CC=C2C(C[C@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-VIFPVBQESA-N 0.000 description 6
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 6
- 108060001084 Luciferase Proteins 0.000 description 6
- 239000005089 Luciferase Substances 0.000 description 6
- 238000010459 TALEN Methods 0.000 description 6
- 108010043645 Transcription Activator-Like Effector Nucleases Proteins 0.000 description 6
- QIVBCDIJIAJPQS-UHFFFAOYSA-N Tryptophan Natural products C1=CC=C2C(CC(N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-UHFFFAOYSA-N 0.000 description 6
- ISAKRJDGNUQOIC-UHFFFAOYSA-N Uracil Chemical compound O=C1C=CNC(=O)N1 ISAKRJDGNUQOIC-UHFFFAOYSA-N 0.000 description 6
- 230000014509 gene expression Effects 0.000 description 6
- 235000013922 glutamic acid Nutrition 0.000 description 6
- 239000004220 glutamic acid Substances 0.000 description 6
- 108010015792 glycyllysine Proteins 0.000 description 6
- 108010056582 methionylglutamic acid Proteins 0.000 description 6
- 238000012986 modification Methods 0.000 description 6
- 230000004048 modification Effects 0.000 description 6
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 6
- 229960004441 tyrosine Drugs 0.000 description 6
- 235000002374 tyrosine Nutrition 0.000 description 6
- OKZOABJQOMAYEC-NUMRIWBASA-N Asn-Gln-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OKZOABJQOMAYEC-NUMRIWBASA-N 0.000 description 5
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 5
- 241000209140 Triticum Species 0.000 description 5
- 235000021307 Triticum Nutrition 0.000 description 5
- HCHKCACWOHOZIP-UHFFFAOYSA-N Zinc Chemical compound [Zn] HCHKCACWOHOZIP-UHFFFAOYSA-N 0.000 description 5
- 108010013835 arginine glutamate Proteins 0.000 description 5
- 238000003776 cleavage reaction Methods 0.000 description 5
- 238000013461 design Methods 0.000 description 5
- 238000002474 experimental method Methods 0.000 description 5
- 108010073628 glutamyl-valyl-phenylalanine Proteins 0.000 description 5
- 238000001727 in vivo Methods 0.000 description 5
- 238000003780 insertion Methods 0.000 description 5
- 230000037431 insertion Effects 0.000 description 5
- 108010034529 leucyl-lysine Proteins 0.000 description 5
- 108010054155 lysyllysine Proteins 0.000 description 5
- 108010017391 lysylvaline Proteins 0.000 description 5
- 108010051242 phenylalanylserine Proteins 0.000 description 5
- 108090000765 processed proteins & peptides Proteins 0.000 description 5
- 230000007017 scission Effects 0.000 description 5
- 238000003786 synthesis reaction Methods 0.000 description 5
- 239000011701 zinc Substances 0.000 description 5
- 229910052725 zinc Inorganic materials 0.000 description 5
- PBVLJOIPOGUQQP-CIUDSAMLSA-N Asp-Ala-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O PBVLJOIPOGUQQP-CIUDSAMLSA-N 0.000 description 4
- 108090000331 Firefly luciferases Proteins 0.000 description 4
- CQZDZKRHFWJXDF-WDSKDSINSA-N Gly-Gln-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCC(N)=O)NC(=O)CN CQZDZKRHFWJXDF-WDSKDSINSA-N 0.000 description 4
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical compound OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 description 4
- LZDNBBYBDGBADK-UHFFFAOYSA-N L-valyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C(C)C)C(O)=O)=CNC2=C1 LZDNBBYBDGBADK-UHFFFAOYSA-N 0.000 description 4
- MYGQXVYRZMKRDB-SRVKXCTJSA-N Leu-Asp-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN MYGQXVYRZMKRDB-SRVKXCTJSA-N 0.000 description 4
- 239000004472 Lysine Substances 0.000 description 4
- AJHCSUXXECOXOY-UHFFFAOYSA-N N-glycyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)CN)C(O)=O)=CNC2=C1 AJHCSUXXECOXOY-UHFFFAOYSA-N 0.000 description 4
- 108010066427 N-valyltryptophan Proteins 0.000 description 4
- GUIYPEKUEMQBIK-JSGCOSHPSA-N Val-Tyr-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)NCC(O)=O GUIYPEKUEMQBIK-JSGCOSHPSA-N 0.000 description 4
- 108010087924 alanylproline Proteins 0.000 description 4
- 238000004458 analytical method Methods 0.000 description 4
- 108010008355 arginyl-glutamine Proteins 0.000 description 4
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 4
- 210000004899 c-terminal region Anatomy 0.000 description 4
- 210000003763 chloroplast Anatomy 0.000 description 4
- 238000010276 construction Methods 0.000 description 4
- 108010057821 leucylproline Proteins 0.000 description 4
- 108010064235 lysylglycine Proteins 0.000 description 4
- 229960004452 methionine Drugs 0.000 description 4
- 102000004196 processed proteins & peptides Human genes 0.000 description 4
- 108020001775 protein parts Proteins 0.000 description 4
- 230000008439 repair process Effects 0.000 description 4
- 230000010076 replication Effects 0.000 description 4
- 239000000126 substance Substances 0.000 description 4
- 229960004799 tryptophan Drugs 0.000 description 4
- 238000011144 upstream manufacturing Methods 0.000 description 4
- LVMUGODRNHFGRA-AVGNSLFASA-N Arg-Leu-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O LVMUGODRNHFGRA-AVGNSLFASA-N 0.000 description 3
- 239000006144 Dulbecco’s modified Eagle's medium Substances 0.000 description 3
- RBWKVOSARCFSQQ-FXQIFTODSA-N Gln-Gln-Ser Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O RBWKVOSARCFSQQ-FXQIFTODSA-N 0.000 description 3
- ITYRYNUZHPNCIK-GUBZILKMSA-N Glu-Ala-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O ITYRYNUZHPNCIK-GUBZILKMSA-N 0.000 description 3
- NCWOMXABNYEPLY-NRPADANISA-N Glu-Ala-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O NCWOMXABNYEPLY-NRPADANISA-N 0.000 description 3
- CUXJIASLBRJOFV-LAEOZQHASA-N Glu-Gly-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O CUXJIASLBRJOFV-LAEOZQHASA-N 0.000 description 3
- YKBUCXNNBYZYAY-MNXVOIDGSA-N Glu-Lys-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O YKBUCXNNBYZYAY-MNXVOIDGSA-N 0.000 description 3
- GWCRIHNSVMOBEQ-BQBZGAKWSA-N Gly-Arg-Ser Chemical compound [H]NCC(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O GWCRIHNSVMOBEQ-BQBZGAKWSA-N 0.000 description 3
- YWAQATDNEKZFFK-BYPYZUCNSA-N Gly-Gly-Ser Chemical compound NCC(=O)NCC(=O)N[C@@H](CO)C(O)=O YWAQATDNEKZFFK-BYPYZUCNSA-N 0.000 description 3
- PMGDADKJMCOXHX-UHFFFAOYSA-N L-Arginyl-L-glutamin-acetat Natural products NC(=N)NCCCC(N)C(=O)NC(CCC(N)=O)C(O)=O PMGDADKJMCOXHX-UHFFFAOYSA-N 0.000 description 3
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysin Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 3
- KDXKERNSBIXSRK-YFKPBYRVSA-N L-lysine Chemical compound NCCCC[C@H](N)C(O)=O KDXKERNSBIXSRK-YFKPBYRVSA-N 0.000 description 3
- DZQMXBALGUHGJT-GUBZILKMSA-N Leu-Glu-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O DZQMXBALGUHGJT-GUBZILKMSA-N 0.000 description 3
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 3
- DDYIRGBOZVKRFR-AVGNSLFASA-N Phe-Asp-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N DDYIRGBOZVKRFR-AVGNSLFASA-N 0.000 description 3
- PSKRILMFHNIUAO-JYJNAYRXSA-N Phe-Glu-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N PSKRILMFHNIUAO-JYJNAYRXSA-N 0.000 description 3
- YYKZDTVQHTUKDW-RYUDHWBXSA-N Phe-Gly-Gln Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)NCC(=O)N[C@@H](CCC(=O)N)C(=O)O)N YYKZDTVQHTUKDW-RYUDHWBXSA-N 0.000 description 3
- BONHGTUEEPIMPM-AVGNSLFASA-N Phe-Ser-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O BONHGTUEEPIMPM-AVGNSLFASA-N 0.000 description 3
- ITUDDXVFGFEKPD-NAKRPEOUSA-N Pro-Ser-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ITUDDXVFGFEKPD-NAKRPEOUSA-N 0.000 description 3
- 102000044126 RNA-Binding Proteins Human genes 0.000 description 3
- HBZBPFLJNDXRAY-FXQIFTODSA-N Ser-Ala-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(O)=O HBZBPFLJNDXRAY-FXQIFTODSA-N 0.000 description 3
- HHJFMHQYEAAOBM-ZLUOBGJFSA-N Ser-Ser-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O HHJFMHQYEAAOBM-ZLUOBGJFSA-N 0.000 description 3
- YEDSOSIKVUMIJE-DCAQKATOSA-N Ser-Val-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O YEDSOSIKVUMIJE-DCAQKATOSA-N 0.000 description 3
- NJGMALCNYAMYCB-JRQIVUDYSA-N Thr-Tyr-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O NJGMALCNYAMYCB-JRQIVUDYSA-N 0.000 description 3
- 108091023040 Transcription factor Proteins 0.000 description 3
- 102000040945 Transcription factor Human genes 0.000 description 3
- 108010060035 arginylproline Proteins 0.000 description 3
- 108010092854 aspartyllysine Proteins 0.000 description 3
- 239000003814 drug Substances 0.000 description 3
- 229940079593 drug Drugs 0.000 description 3
- 108010079547 glutamylmethionine Proteins 0.000 description 3
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 3
- 108010040030 histidinoalanine Proteins 0.000 description 3
- 108010025306 histidylleucine Proteins 0.000 description 3
- 108010044374 isoleucyl-tyrosine Proteins 0.000 description 3
- 125000005647 linker group Chemical group 0.000 description 3
- 239000000463 material Substances 0.000 description 3
- 239000002609 medium Substances 0.000 description 3
- 210000003470 mitochondria Anatomy 0.000 description 3
- 230000009456 molecular mechanism Effects 0.000 description 3
- 108010012581 phenylalanylglutamate Proteins 0.000 description 3
- 229920001184 polypeptide Polymers 0.000 description 3
- 238000002360 preparation method Methods 0.000 description 3
- 108010090894 prolylleucine Proteins 0.000 description 3
- 108010048818 seryl-histidine Proteins 0.000 description 3
- 108010026333 seryl-proline Proteins 0.000 description 3
- 230000002194 synthesizing effect Effects 0.000 description 3
- 230000008685 targeting Effects 0.000 description 3
- 229940113082 thymine Drugs 0.000 description 3
- 229940035893 uracil Drugs 0.000 description 3
- KDCGOANMDULRCW-UHFFFAOYSA-N 7H-purine Chemical compound N1=CNC2=NC=NC2=C1 KDCGOANMDULRCW-UHFFFAOYSA-N 0.000 description 2
- YLTKNGYYPIWKHZ-ACZMJKKPSA-N Ala-Ala-Glu Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(O)=O YLTKNGYYPIWKHZ-ACZMJKKPSA-N 0.000 description 2
- ZEXDYVGDZJBRMO-ACZMJKKPSA-N Ala-Asn-Gln Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N ZEXDYVGDZJBRMO-ACZMJKKPSA-N 0.000 description 2
- NWVVKQZOVSTDBQ-CIUDSAMLSA-N Ala-Glu-Arg Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O NWVVKQZOVSTDBQ-CIUDSAMLSA-N 0.000 description 2
- WGDNWOMKBUXFHR-BQBZGAKWSA-N Ala-Gly-Arg Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N WGDNWOMKBUXFHR-BQBZGAKWSA-N 0.000 description 2
- WMYJZJRILUVVRG-WDSKDSINSA-N Ala-Gly-Gln Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O WMYJZJRILUVVRG-WDSKDSINSA-N 0.000 description 2
- TZDNWXDLYFIFPT-BJDJZHNGSA-N Ala-Ile-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O TZDNWXDLYFIFPT-BJDJZHNGSA-N 0.000 description 2
- SOBIAADAMRHGKH-CIUDSAMLSA-N Ala-Leu-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O SOBIAADAMRHGKH-CIUDSAMLSA-N 0.000 description 2
- MEFILNJXAVSUTO-JXUBOQSCSA-N Ala-Leu-Thr Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MEFILNJXAVSUTO-JXUBOQSCSA-N 0.000 description 2
- UWIQWPWWZUHBAO-ZLIFDBKOSA-N Ala-Leu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)[C@H](C)N)CC(C)C)C(O)=O)=CNC2=C1 UWIQWPWWZUHBAO-ZLIFDBKOSA-N 0.000 description 2
- QUIGLPSHIFPEOV-CIUDSAMLSA-N Ala-Lys-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O QUIGLPSHIFPEOV-CIUDSAMLSA-N 0.000 description 2
- DHBKYZYFEXXUAK-ONGXEEELSA-N Ala-Phe-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 DHBKYZYFEXXUAK-ONGXEEELSA-N 0.000 description 2
- OLVCTPPSXNRGKV-GUBZILKMSA-N Ala-Pro-Pro Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 OLVCTPPSXNRGKV-GUBZILKMSA-N 0.000 description 2
- 108700022317 Arabidopsis GRP23 Proteins 0.000 description 2
- 108700010519 Arabidopsis GUN1 Proteins 0.000 description 2
- GIVATXIGCXFQQA-FXQIFTODSA-N Arg-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCN=C(N)N GIVATXIGCXFQQA-FXQIFTODSA-N 0.000 description 2
- NABSCJGZKWSNHX-RCWTZXSCSA-N Arg-Arg-Thr Chemical compound NC(N)=NCCC[C@@H](C(=O)N[C@@H]([C@H](O)C)C(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N NABSCJGZKWSNHX-RCWTZXSCSA-N 0.000 description 2
- VNFWDYWTSHFRRG-SRVKXCTJSA-N Arg-Gln-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O VNFWDYWTSHFRRG-SRVKXCTJSA-N 0.000 description 2
- DJAIOAKQIOGULM-DCAQKATOSA-N Arg-Glu-Met Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(O)=O DJAIOAKQIOGULM-DCAQKATOSA-N 0.000 description 2
- OQCWXQJLCDPRHV-UWVGGRQHSA-N Arg-Gly-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O OQCWXQJLCDPRHV-UWVGGRQHSA-N 0.000 description 2
- UHFUZWSZQKMDSX-DCAQKATOSA-N Arg-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UHFUZWSZQKMDSX-DCAQKATOSA-N 0.000 description 2
- OTZMRMHZCMZOJZ-SRVKXCTJSA-N Arg-Leu-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O OTZMRMHZCMZOJZ-SRVKXCTJSA-N 0.000 description 2
- SSZGOKWBHLOCHK-DCAQKATOSA-N Arg-Lys-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCN=C(N)N SSZGOKWBHLOCHK-DCAQKATOSA-N 0.000 description 2
- KXOPYFNQLVUOAQ-FXQIFTODSA-N Arg-Ser-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O KXOPYFNQLVUOAQ-FXQIFTODSA-N 0.000 description 2
- KMFPQTITXUKJOV-DCAQKATOSA-N Arg-Ser-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O KMFPQTITXUKJOV-DCAQKATOSA-N 0.000 description 2
- SUMJNGAMIQSNGX-TUAOUCFPSA-N Arg-Val-Pro Chemical compound CC(C)[C@H](NC(=O)[C@@H](N)CCCNC(N)=N)C(=O)N1CCC[C@@H]1C(O)=O SUMJNGAMIQSNGX-TUAOUCFPSA-N 0.000 description 2
- ZKDGORKGHPCZOV-DCAQKATOSA-N Asn-His-Arg Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N ZKDGORKGHPCZOV-DCAQKATOSA-N 0.000 description 2
- GQRDIVQPSMPQME-ZPFDUUQYSA-N Asn-Ile-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O GQRDIVQPSMPQME-ZPFDUUQYSA-N 0.000 description 2
- JTXVXGXTRXMOFJ-FXQIFTODSA-N Asn-Pro-Asn Chemical compound NC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O JTXVXGXTRXMOFJ-FXQIFTODSA-N 0.000 description 2
- BCADFFUQHIMQAA-KKHAAJSZSA-N Asn-Thr-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O BCADFFUQHIMQAA-KKHAAJSZSA-N 0.000 description 2
- WQAOZCVOOYUWKG-LSJOCFKGSA-N Asn-Val-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](CC(=O)N)N WQAOZCVOOYUWKG-LSJOCFKGSA-N 0.000 description 2
- LTXGDRFJRZSZAV-CIUDSAMLSA-N Asp-Glu-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)O)N LTXGDRFJRZSZAV-CIUDSAMLSA-N 0.000 description 2
- HOBNTSHITVVNBN-ZPFDUUQYSA-N Asp-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CC(=O)O)N HOBNTSHITVVNBN-ZPFDUUQYSA-N 0.000 description 2
- RPUYTJJZXQBWDT-SRVKXCTJSA-N Asp-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC(=O)O)N RPUYTJJZXQBWDT-SRVKXCTJSA-N 0.000 description 2
- 241000894006 Bacteria Species 0.000 description 2
- 108020000946 Bacterial DNA Proteins 0.000 description 2
- 108091033409 CRISPR Proteins 0.000 description 2
- 238000010442 DNA editing Methods 0.000 description 2
- 108700020911 DNA-Binding Proteins Proteins 0.000 description 2
- QFJPFPCSXOXMKI-BPUTZDHNSA-N Gln-Gln-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCC(=O)N)N QFJPFPCSXOXMKI-BPUTZDHNSA-N 0.000 description 2
- FGWRYRAVBVOHIB-XIRDDKMYSA-N Gln-Pro-Trp Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CCC(=O)N)N)C(=O)N[C@@H](CC2=CNC3=CC=CC=C32)C(=O)O FGWRYRAVBVOHIB-XIRDDKMYSA-N 0.000 description 2
- OSCLNNWLKKIQJM-WDSKDSINSA-N Gln-Ser-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)NCC(O)=O OSCLNNWLKKIQJM-WDSKDSINSA-N 0.000 description 2
- PAOHIZNRJNIXQY-XQXXSGGOSA-N Gln-Thr-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O PAOHIZNRJNIXQY-XQXXSGGOSA-N 0.000 description 2
- VLOLPWWCNKWRNB-LOKLDPHHSA-N Gln-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O VLOLPWWCNKWRNB-LOKLDPHHSA-N 0.000 description 2
- RBSKVTZUFMIWFU-XEGUGMAKSA-N Gln-Trp-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C)C(O)=O RBSKVTZUFMIWFU-XEGUGMAKSA-N 0.000 description 2
- RLZBLVSJDFHDBL-KBIXCLLPSA-N Glu-Ala-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O RLZBLVSJDFHDBL-KBIXCLLPSA-N 0.000 description 2
- HUWSBFYAGXCXKC-CIUDSAMLSA-N Glu-Ala-Met Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCSC)C(O)=O HUWSBFYAGXCXKC-CIUDSAMLSA-N 0.000 description 2
- RDPOETHPAQEGDP-ACZMJKKPSA-N Glu-Asp-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O RDPOETHPAQEGDP-ACZMJKKPSA-N 0.000 description 2
- JPHYJQHPILOKHC-ACZMJKKPSA-N Glu-Asp-Asp Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O JPHYJQHPILOKHC-ACZMJKKPSA-N 0.000 description 2
- BUZMZDDKFCSKOT-CIUDSAMLSA-N Glu-Glu-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O BUZMZDDKFCSKOT-CIUDSAMLSA-N 0.000 description 2
- HILMIYALTUQTRC-XVKPBYJWSA-N Glu-Gly-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O HILMIYALTUQTRC-XVKPBYJWSA-N 0.000 description 2
- CXRWMMRLEMVSEH-PEFMBERDSA-N Glu-Ile-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O CXRWMMRLEMVSEH-PEFMBERDSA-N 0.000 description 2
- SOEPMWQCTJITPZ-SRVKXCTJSA-N Glu-Met-Lys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N SOEPMWQCTJITPZ-SRVKXCTJSA-N 0.000 description 2
- IDEODOAVGCMUQV-GUBZILKMSA-N Glu-Ser-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O IDEODOAVGCMUQV-GUBZILKMSA-N 0.000 description 2
- OCQUNKSFDYDXBG-QXEWZRGKSA-N Gly-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N OCQUNKSFDYDXBG-QXEWZRGKSA-N 0.000 description 2
- XLFHCWHXKSFVIB-BQBZGAKWSA-N Gly-Gln-Gln Chemical compound NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O XLFHCWHXKSFVIB-BQBZGAKWSA-N 0.000 description 2
- NTOWAXLMQFKJPT-YUMQZZPRSA-N Gly-Glu-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)CN NTOWAXLMQFKJPT-YUMQZZPRSA-N 0.000 description 2
- JSNNHGHYGYMVCK-XVKPBYJWSA-N Gly-Glu-Val Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O JSNNHGHYGYMVCK-XVKPBYJWSA-N 0.000 description 2
- UHPAZODVFFYEEL-QWRGUYRKSA-N Gly-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)CN UHPAZODVFFYEEL-QWRGUYRKSA-N 0.000 description 2
- PDUHNKAFQXQNLH-ZETCQYMHSA-N Gly-Lys-Gly Chemical compound NCCCC[C@H](NC(=O)CN)C(=O)NCC(O)=O PDUHNKAFQXQNLH-ZETCQYMHSA-N 0.000 description 2
- IRJWAYCXIYUHQE-WHFBIAKZSA-N Gly-Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)CN IRJWAYCXIYUHQE-WHFBIAKZSA-N 0.000 description 2
- HQSKKSLNLSTONK-JTQLQIEISA-N Gly-Tyr-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 HQSKKSLNLSTONK-JTQLQIEISA-N 0.000 description 2
- RVKIPWVMZANZLI-UHFFFAOYSA-N H-Lys-Trp-OH Natural products C1=CC=C2C(CC(NC(=O)C(N)CCCCN)C(O)=O)=CNC2=C1 RVKIPWVMZANZLI-UHFFFAOYSA-N 0.000 description 2
- ZRSJXIKQXUGKRB-TUBUOCAGSA-N His-Ile-Thr Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZRSJXIKQXUGKRB-TUBUOCAGSA-N 0.000 description 2
- 101000966742 Homo sapiens Leucine-rich PPR motif-containing protein, mitochondrial Proteins 0.000 description 2
- QADCTXFNLZBZAB-GHCJXIJMSA-N Ile-Asn-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](C)C(=O)O)N QADCTXFNLZBZAB-GHCJXIJMSA-N 0.000 description 2
- NCSIQAFSIPHVAN-IUKAMOBKSA-N Ile-Asn-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N NCSIQAFSIPHVAN-IUKAMOBKSA-N 0.000 description 2
- WUKLZPHVWAMZQV-UKJIMTQDSA-N Ile-Glu-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](C(C)C)C(=O)O)N WUKLZPHVWAMZQV-UKJIMTQDSA-N 0.000 description 2
- PELCGFMHLZXWBQ-BJDJZHNGSA-N Ile-Ser-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)O)N PELCGFMHLZXWBQ-BJDJZHNGSA-N 0.000 description 2
- APQYGMBHIVXFML-OSUNSFLBSA-N Ile-Val-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N APQYGMBHIVXFML-OSUNSFLBSA-N 0.000 description 2
- IBMVEYRWAWIOTN-UHFFFAOYSA-N L-Leucyl-L-Arginyl-L-Proline Natural products CC(C)CC(N)C(=O)NC(CCCN=C(N)N)C(=O)N1CCCC1C(O)=O IBMVEYRWAWIOTN-UHFFFAOYSA-N 0.000 description 2
- SENJXOPIZNYLHU-UHFFFAOYSA-N L-leucyl-L-arginine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CCCN=C(N)N SENJXOPIZNYLHU-UHFFFAOYSA-N 0.000 description 2
- KFKWRHQBZQICHA-STQMWFEESA-N L-leucyl-L-phenylalanine Natural products CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KFKWRHQBZQICHA-STQMWFEESA-N 0.000 description 2
- MJOZZTKJZQFKDK-GUBZILKMSA-N Leu-Ala-Gln Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(N)=O MJOZZTKJZQFKDK-GUBZILKMSA-N 0.000 description 2
- NTRAGDHVSGKUSF-AVGNSLFASA-N Leu-Arg-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O NTRAGDHVSGKUSF-AVGNSLFASA-N 0.000 description 2
- JKGHDYGZRDWHGA-SRVKXCTJSA-N Leu-Asn-Leu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JKGHDYGZRDWHGA-SRVKXCTJSA-N 0.000 description 2
- WGNOPSQMIQERPK-UHFFFAOYSA-N Leu-Asn-Pro Natural products CC(C)CC(N)C(=O)NC(CC(=O)N)C(=O)N1CCCC1C(=O)O WGNOPSQMIQERPK-UHFFFAOYSA-N 0.000 description 2
- BPANDPNDMJHFEV-CIUDSAMLSA-N Leu-Asp-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O BPANDPNDMJHFEV-CIUDSAMLSA-N 0.000 description 2
- YKNBJXOJTURHCU-DCAQKATOSA-N Leu-Asp-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YKNBJXOJTURHCU-DCAQKATOSA-N 0.000 description 2
- DLCOFDAHNMMQPP-SRVKXCTJSA-N Leu-Asp-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O DLCOFDAHNMMQPP-SRVKXCTJSA-N 0.000 description 2
- ZYLJULGXQDNXDK-GUBZILKMSA-N Leu-Gln-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O ZYLJULGXQDNXDK-GUBZILKMSA-N 0.000 description 2
- CIVKXGPFXDIQBV-WDCWCFNPSA-N Leu-Gln-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CIVKXGPFXDIQBV-WDCWCFNPSA-N 0.000 description 2
- WIDZHJTYKYBLSR-DCAQKATOSA-N Leu-Glu-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O WIDZHJTYKYBLSR-DCAQKATOSA-N 0.000 description 2
- HPBCTWSUJOGJSH-MNXVOIDGSA-N Leu-Glu-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HPBCTWSUJOGJSH-MNXVOIDGSA-N 0.000 description 2
- HQUXQAMSWFIRET-AVGNSLFASA-N Leu-Glu-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN HQUXQAMSWFIRET-AVGNSLFASA-N 0.000 description 2
- HVJVUYQWFYMGJS-GVXVVHGQSA-N Leu-Glu-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O HVJVUYQWFYMGJS-GVXVVHGQSA-N 0.000 description 2
- HGFGEMSVBMCFKK-MNXVOIDGSA-N Leu-Ile-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O HGFGEMSVBMCFKK-MNXVOIDGSA-N 0.000 description 2
- YOKVEHGYYQEQOP-QWRGUYRKSA-N Leu-Leu-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O YOKVEHGYYQEQOP-QWRGUYRKSA-N 0.000 description 2
- HVHRPWQEQHIQJF-AVGNSLFASA-N Leu-Lys-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O HVHRPWQEQHIQJF-AVGNSLFASA-N 0.000 description 2
- AKVBOOKXVAMKSS-GUBZILKMSA-N Leu-Ser-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O AKVBOOKXVAMKSS-GUBZILKMSA-N 0.000 description 2
- AMSSKPUHBUQBOQ-SRVKXCTJSA-N Leu-Ser-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N AMSSKPUHBUQBOQ-SRVKXCTJSA-N 0.000 description 2
- LFSQWRSVPNKJGP-WDCWCFNPSA-N Leu-Thr-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCC(O)=O LFSQWRSVPNKJGP-WDCWCFNPSA-N 0.000 description 2
- DAYQSYGBCUKVKT-VOAKCMCISA-N Leu-Thr-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O DAYQSYGBCUKVKT-VOAKCMCISA-N 0.000 description 2
- GZRABTMNWJXFMH-UVOCVTCTSA-N Leu-Thr-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O GZRABTMNWJXFMH-UVOCVTCTSA-N 0.000 description 2
- AAKRWBIIGKPOKQ-ONGXEEELSA-N Leu-Val-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O AAKRWBIIGKPOKQ-ONGXEEELSA-N 0.000 description 2
- 102100040589 Leucine-rich PPR motif-containing protein, mitochondrial Human genes 0.000 description 2
- JGAMUXDWYSXYLM-SRVKXCTJSA-N Lys-Arg-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O JGAMUXDWYSXYLM-SRVKXCTJSA-N 0.000 description 2
- GAOJCVKPIGHTGO-UWVGGRQHSA-N Lys-Arg-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O GAOJCVKPIGHTGO-UWVGGRQHSA-N 0.000 description 2
- VHNOAIFVYUQOOY-XUXIUFHCSA-N Lys-Arg-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VHNOAIFVYUQOOY-XUXIUFHCSA-N 0.000 description 2
- HQVDJTYKCMIWJP-YUMQZZPRSA-N Lys-Asn-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O HQVDJTYKCMIWJP-YUMQZZPRSA-N 0.000 description 2
- LMVOVCYVZBBWQB-SRVKXCTJSA-N Lys-Asp-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN LMVOVCYVZBBWQB-SRVKXCTJSA-N 0.000 description 2
- VSRXPEHZMHSFKU-IUCAKERBSA-N Lys-Gln-Gly Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O VSRXPEHZMHSFKU-IUCAKERBSA-N 0.000 description 2
- DCRWPTBMWMGADO-AVGNSLFASA-N Lys-Glu-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O DCRWPTBMWMGADO-AVGNSLFASA-N 0.000 description 2
- DUTMKEAPLLUGNO-JYJNAYRXSA-N Lys-Glu-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O DUTMKEAPLLUGNO-JYJNAYRXSA-N 0.000 description 2
- GPJGFSFYBJGYRX-YUMQZZPRSA-N Lys-Gly-Asp Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O GPJGFSFYBJGYRX-YUMQZZPRSA-N 0.000 description 2
- OWRUUFUVXFREBD-KKUMJFAQSA-N Lys-His-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(O)=O OWRUUFUVXFREBD-KKUMJFAQSA-N 0.000 description 2
- RBEATVHTWHTHTJ-KKUMJFAQSA-N Lys-Leu-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O RBEATVHTWHTHTJ-KKUMJFAQSA-N 0.000 description 2
- MTBLFIQZECOEBY-IHRRRGAJSA-N Lys-Met-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(O)=O MTBLFIQZECOEBY-IHRRRGAJSA-N 0.000 description 2
- WGILOYIKJVQUPT-DCAQKATOSA-N Lys-Pro-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O WGILOYIKJVQUPT-DCAQKATOSA-N 0.000 description 2
- JMNRXRPBHFGXQX-GUBZILKMSA-N Lys-Ser-Glu Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O JMNRXRPBHFGXQX-GUBZILKMSA-N 0.000 description 2
- WXHHTBVYQOSYSL-FXQIFTODSA-N Met-Ala-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O WXHHTBVYQOSYSL-FXQIFTODSA-N 0.000 description 2
- YLLWCSDBVGZLOW-CIUDSAMLSA-N Met-Gln-Ala Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O YLLWCSDBVGZLOW-CIUDSAMLSA-N 0.000 description 2
- VZBXCMCHIHEPBL-SRVKXCTJSA-N Met-Glu-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN VZBXCMCHIHEPBL-SRVKXCTJSA-N 0.000 description 2
- YCUSPBPZVJDMII-YUMQZZPRSA-N Met-Gly-Glu Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O YCUSPBPZVJDMII-YUMQZZPRSA-N 0.000 description 2
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 2
- PESQCPHRXOFIPX-UHFFFAOYSA-N N-L-methionyl-L-tyrosine Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 PESQCPHRXOFIPX-UHFFFAOYSA-N 0.000 description 2
- 108010079364 N-glycylalanine Proteins 0.000 description 2
- 108010047562 NGR peptide Proteins 0.000 description 2
- 101100342977 Neurospora crassa (strain ATCC 24698 / 74-OR23-1A / CBS 708.71 / DSM 1257 / FGSC 987) leu-1 gene Proteins 0.000 description 2
- ZLGQEBCCANLYRA-RYUDHWBXSA-N Phe-Gly-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O ZLGQEBCCANLYRA-RYUDHWBXSA-N 0.000 description 2
- PEFJUUYFEGBXFA-BZSNNMDCSA-N Phe-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 PEFJUUYFEGBXFA-BZSNNMDCSA-N 0.000 description 2
- QSWKNJAPHQDAAS-MELADBBJSA-N Phe-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CC2=CC=CC=C2)N)C(=O)O QSWKNJAPHQDAAS-MELADBBJSA-N 0.000 description 2
- XNMYNGDKJNOKHH-BZSNNMDCSA-N Phe-Ser-Tyr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O XNMYNGDKJNOKHH-BZSNNMDCSA-N 0.000 description 2
- AHXPYZRZRMQOAU-QXEWZRGKSA-N Pro-Asn-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H]1CCCN1)C(O)=O AHXPYZRZRMQOAU-QXEWZRGKSA-N 0.000 description 2
- QVIZLAUEAMQKGS-GUBZILKMSA-N Pro-Asp-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@@H]1CCCN1 QVIZLAUEAMQKGS-GUBZILKMSA-N 0.000 description 2
- JMVQDLDPDBXAAX-YUMQZZPRSA-N Pro-Gly-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 JMVQDLDPDBXAAX-YUMQZZPRSA-N 0.000 description 2
- FDMKYQQYJKYCLV-GUBZILKMSA-N Pro-Pro-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 FDMKYQQYJKYCLV-GUBZILKMSA-N 0.000 description 2
- MKGIILKDUGDRRO-FXQIFTODSA-N Pro-Ser-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H]1CCCN1 MKGIILKDUGDRRO-FXQIFTODSA-N 0.000 description 2
- 108700020471 RNA-Binding Proteins Proteins 0.000 description 2
- 108010092799 RNA-directed DNA polymerase Proteins 0.000 description 2
- 108010052090 Renilla Luciferases Proteins 0.000 description 2
- OBXVZEAMXFSGPU-FXQIFTODSA-N Ser-Asn-Arg Chemical compound C(C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CO)N)CN=C(N)N OBXVZEAMXFSGPU-FXQIFTODSA-N 0.000 description 2
- AEGUWTFAQQWVLC-BQBZGAKWSA-N Ser-Gly-Arg Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O AEGUWTFAQQWVLC-BQBZGAKWSA-N 0.000 description 2
- IXCHOHLPHNGFTJ-YUMQZZPRSA-N Ser-Gly-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CO)N IXCHOHLPHNGFTJ-YUMQZZPRSA-N 0.000 description 2
- UIGMAMGZOJVTDN-WHFBIAKZSA-N Ser-Gly-Ser Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O UIGMAMGZOJVTDN-WHFBIAKZSA-N 0.000 description 2
- HEUVHBXOVZONPU-BJDJZHNGSA-N Ser-Leu-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HEUVHBXOVZONPU-BJDJZHNGSA-N 0.000 description 2
- VMLONWHIORGALA-SRVKXCTJSA-N Ser-Leu-Leu Chemical compound CC(C)C[C@@H](C([O-])=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]([NH3+])CO VMLONWHIORGALA-SRVKXCTJSA-N 0.000 description 2
- OWCVUSJMEBGMOK-YUMQZZPRSA-N Ser-Lys-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O OWCVUSJMEBGMOK-YUMQZZPRSA-N 0.000 description 2
- GZGFSPWOMUKKCV-NAKRPEOUSA-N Ser-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO GZGFSPWOMUKKCV-NAKRPEOUSA-N 0.000 description 2
- BMKNXTJLHFIAAH-CIUDSAMLSA-N Ser-Ser-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O BMKNXTJLHFIAAH-CIUDSAMLSA-N 0.000 description 2
- ILZAUMFXKSIUEF-SRVKXCTJSA-N Ser-Ser-Phe Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 ILZAUMFXKSIUEF-SRVKXCTJSA-N 0.000 description 2
- ZWSZBWAFDZRBNM-UBHSHLNASA-N Ser-Trp-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CO)C(O)=O ZWSZBWAFDZRBNM-UBHSHLNASA-N 0.000 description 2
- CEXFELBFVHLYDZ-XGEHTFHBSA-N Thr-Arg-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O CEXFELBFVHLYDZ-XGEHTFHBSA-N 0.000 description 2
- MEJHFIOYJHTWMK-VOAKCMCISA-N Thr-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)[C@@H](C)O MEJHFIOYJHTWMK-VOAKCMCISA-N 0.000 description 2
- NYQIZWROIMIQSL-VEVYYDQMSA-N Thr-Pro-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O NYQIZWROIMIQSL-VEVYYDQMSA-N 0.000 description 2
- MXDOAJQRJBMGMO-FJXKBIBVSA-N Thr-Pro-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O MXDOAJQRJBMGMO-FJXKBIBVSA-N 0.000 description 2
- BDWDMRSGCXEDMR-WFBYXXMGSA-N Trp-Ala-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N BDWDMRSGCXEDMR-WFBYXXMGSA-N 0.000 description 2
- HRHYJNLMIJWGLF-BZSNNMDCSA-N Tyr-Ser-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 HRHYJNLMIJWGLF-BZSNNMDCSA-N 0.000 description 2
- YODDULVCGFQRFZ-ZKWXMUAHSA-N Val-Asp-Ser Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O YODDULVCGFQRFZ-ZKWXMUAHSA-N 0.000 description 2
- SZTTYWIUCGSURQ-AUTRQRHGSA-N Val-Glu-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SZTTYWIUCGSURQ-AUTRQRHGSA-N 0.000 description 2
- UKEVLVBHRKWECS-LSJOCFKGSA-N Val-Ile-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](C(C)C)N UKEVLVBHRKWECS-LSJOCFKGSA-N 0.000 description 2
- OTJMMKPMLUNTQT-AVGNSLFASA-N Val-Leu-Arg Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](C(C)C)N OTJMMKPMLUNTQT-AVGNSLFASA-N 0.000 description 2
- BTWMICVCQLKKNR-DCAQKATOSA-N Val-Leu-Ser Chemical compound CC(C)[C@H]([NH3+])C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C([O-])=O BTWMICVCQLKKNR-DCAQKATOSA-N 0.000 description 2
- HWNYVQMOLCYHEA-IHRRRGAJSA-N Val-Ser-Tyr Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N HWNYVQMOLCYHEA-IHRRRGAJSA-N 0.000 description 2
- LCHZBEUVGAVMKS-RHYQMDGZSA-N Val-Thr-Leu Chemical compound CC(C)C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)[C@@H](C)O)C(O)=O LCHZBEUVGAVMKS-RHYQMDGZSA-N 0.000 description 2
- MIAZWUMFUURQNP-YDHLFZDLSA-N Val-Tyr-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N MIAZWUMFUURQNP-YDHLFZDLSA-N 0.000 description 2
- 108010017070 Zinc Finger Nucleases Proteins 0.000 description 2
- 230000004913 activation Effects 0.000 description 2
- 108010069020 alanyl-prolyl-glycine Proteins 0.000 description 2
- 108010005233 alanylglutamic acid Proteins 0.000 description 2
- 108010044940 alanylglutamine Proteins 0.000 description 2
- 125000000539 amino acid group Chemical group 0.000 description 2
- 108010069926 arginyl-glycyl-serine Proteins 0.000 description 2
- 108010068380 arginylarginine Proteins 0.000 description 2
- 108010077245 asparaginyl-proline Proteins 0.000 description 2
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 2
- 108010093581 aspartyl-proline Proteins 0.000 description 2
- 108010038633 aspartylglutamate Proteins 0.000 description 2
- 108010047857 aspartylglycine Proteins 0.000 description 2
- 108010068265 aspartyltyrosine Proteins 0.000 description 2
- 238000000546 chi-square test Methods 0.000 description 2
- 210000004748 cultured cell Anatomy 0.000 description 2
- 108010016616 cysteinylglycine Proteins 0.000 description 2
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical compound NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- 230000018109 developmental process Effects 0.000 description 2
- FSXRLASFHBWESK-UHFFFAOYSA-N dipeptide phenylalanyl-tyrosine Natural products C=1C=C(O)C=CC=1CC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FSXRLASFHBWESK-UHFFFAOYSA-N 0.000 description 2
- 201000010099 disease Diseases 0.000 description 2
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 2
- 238000011156 evaluation Methods 0.000 description 2
- 230000035558 fertility Effects 0.000 description 2
- 108010042598 glutamyl-aspartyl-glycine Proteins 0.000 description 2
- 108010008237 glutamyl-valyl-glycine Proteins 0.000 description 2
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 2
- 108010026364 glycyl-glycyl-leucine Proteins 0.000 description 2
- 108010010096 glycyl-glycyl-tyrosine Proteins 0.000 description 2
- 108010082286 glycyl-seryl-alanine Proteins 0.000 description 2
- 108010010147 glycylglutamine Proteins 0.000 description 2
- 108010084389 glycyltryptophan Proteins 0.000 description 2
- UYTPUPDQBNUYGX-UHFFFAOYSA-N guanine Chemical compound O=C1NC(N)=NC2=C1N=CN2 UYTPUPDQBNUYGX-UHFFFAOYSA-N 0.000 description 2
- 108010092114 histidylphenylalanine Proteins 0.000 description 2
- 108010044311 leucyl-glycyl-glycine Proteins 0.000 description 2
- 108010051673 leucyl-glycyl-phenylalanine Proteins 0.000 description 2
- 108010044056 leucyl-phenylalanine Proteins 0.000 description 2
- 108010000761 leucylarginine Proteins 0.000 description 2
- 238000003670 luciferase enzyme activity assay Methods 0.000 description 2
- 108010003700 lysyl aspartic acid Proteins 0.000 description 2
- 108010016686 methionyl-alanyl-serine Proteins 0.000 description 2
- 108010085203 methionylmethionine Proteins 0.000 description 2
- 239000013642 negative control Substances 0.000 description 2
- 210000002706 plastid Anatomy 0.000 description 2
- 108010077112 prolyl-proline Proteins 0.000 description 2
- 108010020432 prolyl-prolylisoleucine Proteins 0.000 description 2
- 108010004914 prolylarginine Proteins 0.000 description 2
- 108020001580 protein domains Proteins 0.000 description 2
- 108091008146 restriction endonucleases Proteins 0.000 description 2
- 238000012552 review Methods 0.000 description 2
- 230000009870 specific binding Effects 0.000 description 2
- 208000011580 syndromic disease Diseases 0.000 description 2
- 238000012360 testing method Methods 0.000 description 2
- 108010061238 threonyl-glycine Proteins 0.000 description 2
- 108010051110 tyrosyl-lysine Proteins 0.000 description 2
- IBIDRSSEHFLGSD-UHFFFAOYSA-N valinyl-arginine Natural products CC(C)C(N)C(=O)NC(C(O)=O)CCCN=C(N)N IBIDRSSEHFLGSD-UHFFFAOYSA-N 0.000 description 2
- 108010009962 valyltyrosine Proteins 0.000 description 2
- 108010027345 wheylin-1 peptide Proteins 0.000 description 2
- MDNRBNZIOBQHHK-KWBADKCTSA-N (2s)-2-[[(2s)-2-[[2-[[(2s)-2-amino-5-(diaminomethylideneamino)pentanoyl]amino]acetyl]amino]-3-carboxypropanoyl]amino]-3-methylbutanoic acid Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCN=C(N)N MDNRBNZIOBQHHK-KWBADKCTSA-N 0.000 description 1
- KGMROVDTWHJSIM-YFKPBYRVSA-N (2s)-2-hydrazinyl-4-methylpentanoic acid Chemical compound CC(C)C[C@H](NN)C(O)=O KGMROVDTWHJSIM-YFKPBYRVSA-N 0.000 description 1
- 108010036211 5-HT-moduline Proteins 0.000 description 1
- 101710159080 Aconitate hydratase A Proteins 0.000 description 1
- 101710159078 Aconitate hydratase B Proteins 0.000 description 1
- 229930024421 Adenine Natural products 0.000 description 1
- GFFGJBXGBJISGV-UHFFFAOYSA-N Adenine Chemical compound NC1=NC=NC2=C1N=CN2 GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 1
- BUANFPRKJKJSRR-ACZMJKKPSA-N Ala-Ala-Gln Chemical compound C[C@H]([NH3+])C(=O)N[C@@H](C)C(=O)N[C@H](C([O-])=O)CCC(N)=O BUANFPRKJKJSRR-ACZMJKKPSA-N 0.000 description 1
- RLMISHABBKUNFO-WHFBIAKZSA-N Ala-Ala-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O RLMISHABBKUNFO-WHFBIAKZSA-N 0.000 description 1
- CXRCVCURMBFFOL-FXQIFTODSA-N Ala-Ala-Pro Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(O)=O CXRCVCURMBFFOL-FXQIFTODSA-N 0.000 description 1
- YYSWCHMLFJLLBJ-ZLUOBGJFSA-N Ala-Ala-Ser Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O YYSWCHMLFJLLBJ-ZLUOBGJFSA-N 0.000 description 1
- ODWSTKXGQGYHSH-FXQIFTODSA-N Ala-Arg-Ala Chemical compound C[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O ODWSTKXGQGYHSH-FXQIFTODSA-N 0.000 description 1
- WRDANSJTFOHBPI-FXQIFTODSA-N Ala-Arg-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CS)C(=O)O)N WRDANSJTFOHBPI-FXQIFTODSA-N 0.000 description 1
- SVBXIUDNTRTKHE-CIUDSAMLSA-N Ala-Arg-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O SVBXIUDNTRTKHE-CIUDSAMLSA-N 0.000 description 1
- JBGSZRYCXBPWGX-BQBZGAKWSA-N Ala-Arg-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](N)C)CCCN=C(N)N JBGSZRYCXBPWGX-BQBZGAKWSA-N 0.000 description 1
- KVWLTGNCJYDJET-LSJOCFKGSA-N Ala-Arg-His Chemical compound C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N KVWLTGNCJYDJET-LSJOCFKGSA-N 0.000 description 1
- YAXNATKKPOWVCP-ZLUOBGJFSA-N Ala-Asn-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O YAXNATKKPOWVCP-ZLUOBGJFSA-N 0.000 description 1
- LBJYAILUMSUTAM-ZLUOBGJFSA-N Ala-Asn-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O LBJYAILUMSUTAM-ZLUOBGJFSA-N 0.000 description 1
- YBPLKDWJFYCZSV-ZLUOBGJFSA-N Ala-Asn-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CS)C(=O)O)N YBPLKDWJFYCZSV-ZLUOBGJFSA-N 0.000 description 1
- JYEBJTDTPNKQJG-FXQIFTODSA-N Ala-Asn-Met Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCSC)C(=O)O)N JYEBJTDTPNKQJG-FXQIFTODSA-N 0.000 description 1
- GSCLWXDNIMNIJE-ZLUOBGJFSA-N Ala-Asp-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O GSCLWXDNIMNIJE-ZLUOBGJFSA-N 0.000 description 1
- KIUYPHAMDKDICO-WHFBIAKZSA-N Ala-Asp-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O KIUYPHAMDKDICO-WHFBIAKZSA-N 0.000 description 1
- BUDNAJYVCUHLSV-ZLUOBGJFSA-N Ala-Asp-Ser Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O BUDNAJYVCUHLSV-ZLUOBGJFSA-N 0.000 description 1
- FVSOUJZKYWEFOB-KBIXCLLPSA-N Ala-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](C)N FVSOUJZKYWEFOB-KBIXCLLPSA-N 0.000 description 1
- AWAXZRDKUHOPBO-GUBZILKMSA-N Ala-Gln-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(O)=O AWAXZRDKUHOPBO-GUBZILKMSA-N 0.000 description 1
- MVBWLRJESQOQTM-ACZMJKKPSA-N Ala-Gln-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O MVBWLRJESQOQTM-ACZMJKKPSA-N 0.000 description 1
- WKOBSJOZRJJVRZ-FXQIFTODSA-N Ala-Glu-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O WKOBSJOZRJJVRZ-FXQIFTODSA-N 0.000 description 1
- HXNNRBHASOSVPG-GUBZILKMSA-N Ala-Glu-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O HXNNRBHASOSVPG-GUBZILKMSA-N 0.000 description 1
- UHMQKOBNPRAZGB-CIUDSAMLSA-N Ala-Glu-Met Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCSC)C(=O)O)N UHMQKOBNPRAZGB-CIUDSAMLSA-N 0.000 description 1
- OMMDTNGURYRDAC-NRPADANISA-N Ala-Glu-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O OMMDTNGURYRDAC-NRPADANISA-N 0.000 description 1
- LJFNNUBZSZCZFN-WHFBIAKZSA-N Ala-Gly-Cys Chemical compound N[C@@H](C)C(=O)NCC(=O)N[C@@H](CS)C(=O)O LJFNNUBZSZCZFN-WHFBIAKZSA-N 0.000 description 1
- VGPWRRFOPXVGOH-BYPYZUCNSA-N Ala-Gly-Gly Chemical compound C[C@H](N)C(=O)NCC(=O)NCC(O)=O VGPWRRFOPXVGOH-BYPYZUCNSA-N 0.000 description 1
- PCIFXPRIFWKWLK-YUMQZZPRSA-N Ala-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N PCIFXPRIFWKWLK-YUMQZZPRSA-N 0.000 description 1
- BLIMFWGRQKRCGT-YUMQZZPRSA-N Ala-Gly-Lys Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN BLIMFWGRQKRCGT-YUMQZZPRSA-N 0.000 description 1
- NBTGEURICRTMGL-WHFBIAKZSA-N Ala-Gly-Ser Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O NBTGEURICRTMGL-WHFBIAKZSA-N 0.000 description 1
- OBVSBEYOMDWLRJ-BFHQHQDPSA-N Ala-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N OBVSBEYOMDWLRJ-BFHQHQDPSA-N 0.000 description 1
- NIZKGBJVCMRDKO-KWQFWETISA-N Ala-Gly-Tyr Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NIZKGBJVCMRDKO-KWQFWETISA-N 0.000 description 1
- ANGAOPNEPIDLPO-XVYDVKMFSA-N Ala-His-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CS)C(=O)O)N ANGAOPNEPIDLPO-XVYDVKMFSA-N 0.000 description 1
- GSHKMNKPMLXSQW-KBIXCLLPSA-N Ala-Ile-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](C)N GSHKMNKPMLXSQW-KBIXCLLPSA-N 0.000 description 1
- QCTFKEJEIMPOLW-JURCDPSOSA-N Ala-Ile-Phe Chemical compound C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 QCTFKEJEIMPOLW-JURCDPSOSA-N 0.000 description 1
- OKIKVSXTXVVFDV-MMWGEVLESA-N Ala-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C)N OKIKVSXTXVVFDV-MMWGEVLESA-N 0.000 description 1
- CCDFBRZVTDDJNM-GUBZILKMSA-N Ala-Leu-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O CCDFBRZVTDDJNM-GUBZILKMSA-N 0.000 description 1
- WUHJHHGYVVJMQE-BJDJZHNGSA-N Ala-Leu-Ile Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WUHJHHGYVVJMQE-BJDJZHNGSA-N 0.000 description 1
- VGMNWQOPSFBBBG-XUXIUFHCSA-N Ala-Leu-Leu-Asp Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O VGMNWQOPSFBBBG-XUXIUFHCSA-N 0.000 description 1
- LDLSENBXQNDTPB-DCAQKATOSA-N Ala-Lys-Arg Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N LDLSENBXQNDTPB-DCAQKATOSA-N 0.000 description 1
- MFMDKJIPHSWSBM-GUBZILKMSA-N Ala-Lys-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O MFMDKJIPHSWSBM-GUBZILKMSA-N 0.000 description 1
- BLTRAARCJYVJKV-QEJZJMRPSA-N Ala-Lys-Phe Chemical compound C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](Cc1ccccc1)C(O)=O BLTRAARCJYVJKV-QEJZJMRPSA-N 0.000 description 1
- NINQYGGNRIBFSC-CIUDSAMLSA-N Ala-Lys-Ser Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CO)C(O)=O NINQYGGNRIBFSC-CIUDSAMLSA-N 0.000 description 1
- DWYROCSXOOMOEU-CIUDSAMLSA-N Ala-Met-Glu Chemical compound C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N DWYROCSXOOMOEU-CIUDSAMLSA-N 0.000 description 1
- XSTZMVAYYCJTNR-DCAQKATOSA-N Ala-Met-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O XSTZMVAYYCJTNR-DCAQKATOSA-N 0.000 description 1
- FVNAUOZKIPAYNA-BPNCWPANSA-N Ala-Met-Tyr Chemical compound CSCC[C@H](NC(=O)[C@H](C)N)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 FVNAUOZKIPAYNA-BPNCWPANSA-N 0.000 description 1
- 108010011667 Ala-Phe-Ala Proteins 0.000 description 1
- WEZNQZHACPSMEF-QEJZJMRPSA-N Ala-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 WEZNQZHACPSMEF-QEJZJMRPSA-N 0.000 description 1
- XWFWAXPOLRTDFZ-FXQIFTODSA-N Ala-Pro-Ser Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O XWFWAXPOLRTDFZ-FXQIFTODSA-N 0.000 description 1
- JNLDTVRGXMSYJC-UVBJJODRSA-N Ala-Pro-Trp Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(O)=O JNLDTVRGXMSYJC-UVBJJODRSA-N 0.000 description 1
- VJVQKGYHIZPSNS-FXQIFTODSA-N Ala-Ser-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCN=C(N)N VJVQKGYHIZPSNS-FXQIFTODSA-N 0.000 description 1
- RMAWDDRDTRSZIR-ZLUOBGJFSA-N Ala-Ser-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O RMAWDDRDTRSZIR-ZLUOBGJFSA-N 0.000 description 1
- MMLHRUJLOUSRJX-CIUDSAMLSA-N Ala-Ser-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN MMLHRUJLOUSRJX-CIUDSAMLSA-N 0.000 description 1
- OMCKWYSDUQBYCN-FXQIFTODSA-N Ala-Ser-Met Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCSC)C(O)=O OMCKWYSDUQBYCN-FXQIFTODSA-N 0.000 description 1
- NCQMBSJGJMYKCK-ZLUOBGJFSA-N Ala-Ser-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O NCQMBSJGJMYKCK-ZLUOBGJFSA-N 0.000 description 1
- ARHJJAAWNWOACN-FXQIFTODSA-N Ala-Ser-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O ARHJJAAWNWOACN-FXQIFTODSA-N 0.000 description 1
- OEVCHROQUIVQFZ-YTLHQDLWSA-N Ala-Thr-Ala Chemical compound C[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](C)C(O)=O OEVCHROQUIVQFZ-YTLHQDLWSA-N 0.000 description 1
- XQNRANMFRPCFFW-GCJQMDKQSA-N Ala-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](C)N)O XQNRANMFRPCFFW-GCJQMDKQSA-N 0.000 description 1
- KUFVXLQLDHJVOG-SHGPDSBTSA-N Ala-Thr-Thr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](C)N)O KUFVXLQLDHJVOG-SHGPDSBTSA-N 0.000 description 1
- CREYEAPXISDKSB-FQPOAREZSA-N Ala-Thr-Tyr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CREYEAPXISDKSB-FQPOAREZSA-N 0.000 description 1
- IETUUAHKCHOQHP-KZVJFYERSA-N Ala-Thr-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@H](C)N)[C@@H](C)O)C(O)=O IETUUAHKCHOQHP-KZVJFYERSA-N 0.000 description 1
- PGNNQOJOEGFAOR-KWQFWETISA-N Ala-Tyr-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=C(O)C=C1 PGNNQOJOEGFAOR-KWQFWETISA-N 0.000 description 1
- VYMJAWXRWHJIMS-LKTVYLICSA-N Ala-Tyr-His Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N VYMJAWXRWHJIMS-LKTVYLICSA-N 0.000 description 1
- XAXMJQUMRJAFCH-CQDKDKBSSA-N Ala-Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=C(O)C=C1 XAXMJQUMRJAFCH-CQDKDKBSSA-N 0.000 description 1
- QRIYOHQJRDHFKF-UWJYBYFXSA-N Ala-Tyr-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=C(O)C=C1 QRIYOHQJRDHFKF-UWJYBYFXSA-N 0.000 description 1
- IYKVSFNGSWTTNZ-GUBZILKMSA-N Ala-Val-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IYKVSFNGSWTTNZ-GUBZILKMSA-N 0.000 description 1
- BOKLLPVAQDSLHC-FXQIFTODSA-N Ala-Val-Cys Chemical compound C[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CS)C(=O)O)N BOKLLPVAQDSLHC-FXQIFTODSA-N 0.000 description 1
- VHAQSYHSDKERBS-XPUUQOCRSA-N Ala-Val-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O VHAQSYHSDKERBS-XPUUQOCRSA-N 0.000 description 1
- LYILPUNCKACNGF-NAKRPEOUSA-N Ala-Val-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C)N LYILPUNCKACNGF-NAKRPEOUSA-N 0.000 description 1
- XCIGOVDXZULBBV-DCAQKATOSA-N Ala-Val-Lys Chemical compound CC(C)[C@H](NC(=O)[C@H](C)N)C(=O)N[C@@H](CCCCN)C(O)=O XCIGOVDXZULBBV-DCAQKATOSA-N 0.000 description 1
- REWSWYIDQIELBE-FXQIFTODSA-N Ala-Val-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O REWSWYIDQIELBE-FXQIFTODSA-N 0.000 description 1
- 101100189945 Arabidopsis thaliana PER63 gene Proteins 0.000 description 1
- 101000688247 Arabidopsis thaliana Pentatricopeptide repeat-containing protein At5g67570, chloroplastic Proteins 0.000 description 1
- MCYJBCKCAPERSE-FXQIFTODSA-N Arg-Ala-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCN=C(N)N MCYJBCKCAPERSE-FXQIFTODSA-N 0.000 description 1
- QEKBCDODJBBWHV-GUBZILKMSA-N Arg-Arg-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O QEKBCDODJBBWHV-GUBZILKMSA-N 0.000 description 1
- XPSGESXVBSQZPL-SRVKXCTJSA-N Arg-Arg-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O XPSGESXVBSQZPL-SRVKXCTJSA-N 0.000 description 1
- HJVGMOYJDDXLMI-AVGNSLFASA-N Arg-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@@H](N)CCCNC(N)=N HJVGMOYJDDXLMI-AVGNSLFASA-N 0.000 description 1
- HJWQFFYRVFEWRM-SRVKXCTJSA-N Arg-Arg-Met Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(O)=O HJWQFFYRVFEWRM-SRVKXCTJSA-N 0.000 description 1
- PVSNBTCXCQIXSE-JYJNAYRXSA-N Arg-Arg-Phe Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 PVSNBTCXCQIXSE-JYJNAYRXSA-N 0.000 description 1
- OVVUNXXROOFSIM-SDDRHHMPSA-N Arg-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O OVVUNXXROOFSIM-SDDRHHMPSA-N 0.000 description 1
- DPXDVGDLWJYZBH-GUBZILKMSA-N Arg-Asn-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O DPXDVGDLWJYZBH-GUBZILKMSA-N 0.000 description 1
- RVDVDRUZWZIBJQ-CIUDSAMLSA-N Arg-Asn-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O RVDVDRUZWZIBJQ-CIUDSAMLSA-N 0.000 description 1
- BVBKBQRPOJFCQM-DCAQKATOSA-N Arg-Asn-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O BVBKBQRPOJFCQM-DCAQKATOSA-N 0.000 description 1
- IIABBYGHLYWVOS-FXQIFTODSA-N Arg-Asn-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O IIABBYGHLYWVOS-FXQIFTODSA-N 0.000 description 1
- XVLLUZMFSAYKJV-GUBZILKMSA-N Arg-Asp-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O XVLLUZMFSAYKJV-GUBZILKMSA-N 0.000 description 1
- JTWOBPNAVBESFW-FXQIFTODSA-N Arg-Cys-Asp Chemical compound C(C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)CN=C(N)N JTWOBPNAVBESFW-FXQIFTODSA-N 0.000 description 1
- BEXGZLUHRXTZCC-CIUDSAMLSA-N Arg-Gln-Ser Chemical compound C(C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N)CN=C(N)N BEXGZLUHRXTZCC-CIUDSAMLSA-N 0.000 description 1
- LMPKCSXZJSXBBL-NHCYSSNCSA-N Arg-Gln-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O LMPKCSXZJSXBBL-NHCYSSNCSA-N 0.000 description 1
- HPKSHFSEXICTLI-CIUDSAMLSA-N Arg-Glu-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O HPKSHFSEXICTLI-CIUDSAMLSA-N 0.000 description 1
- QAODJPUKWNNNRP-DCAQKATOSA-N Arg-Glu-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O QAODJPUKWNNNRP-DCAQKATOSA-N 0.000 description 1
- PNQWAUXQDBIJDY-GUBZILKMSA-N Arg-Glu-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O PNQWAUXQDBIJDY-GUBZILKMSA-N 0.000 description 1
- PBSOQGZLPFVXPU-YUMQZZPRSA-N Arg-Glu-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O PBSOQGZLPFVXPU-YUMQZZPRSA-N 0.000 description 1
- SKTGPBFTMNLIHQ-KKUMJFAQSA-N Arg-Glu-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SKTGPBFTMNLIHQ-KKUMJFAQSA-N 0.000 description 1
- HPSVTWMFWCHKFN-GARJFASQSA-N Arg-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O HPSVTWMFWCHKFN-GARJFASQSA-N 0.000 description 1
- JAYIQMNQDMOBFY-KKUMJFAQSA-N Arg-Glu-Tyr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JAYIQMNQDMOBFY-KKUMJFAQSA-N 0.000 description 1
- YNSGXDWWPCGGQS-YUMQZZPRSA-N Arg-Gly-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(O)=O YNSGXDWWPCGGQS-YUMQZZPRSA-N 0.000 description 1
- CYXCAHZVPFREJD-LURJTMIESA-N Arg-Gly-Gly Chemical compound NC(=N)NCCC[C@H](N)C(=O)NCC(=O)NCC(O)=O CYXCAHZVPFREJD-LURJTMIESA-N 0.000 description 1
- ZZZWQALDSQQBEW-STQMWFEESA-N Arg-Gly-Tyr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ZZZWQALDSQQBEW-STQMWFEESA-N 0.000 description 1
- YKBHOXLMMPZPHQ-GMOBBJLQSA-N Arg-Ile-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O YKBHOXLMMPZPHQ-GMOBBJLQSA-N 0.000 description 1
- AGVNTAUPLWIQEN-ZPFDUUQYSA-N Arg-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N AGVNTAUPLWIQEN-ZPFDUUQYSA-N 0.000 description 1
- FLYANDHDFRGGTM-PYJNHQTQSA-N Arg-Ile-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N FLYANDHDFRGGTM-PYJNHQTQSA-N 0.000 description 1
- OOIMKQRCPJBGPD-XUXIUFHCSA-N Arg-Ile-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O OOIMKQRCPJBGPD-XUXIUFHCSA-N 0.000 description 1
- OKKMBOSPBDASEP-CYDGBPFRSA-N Arg-Ile-Met Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCSC)C(O)=O OKKMBOSPBDASEP-CYDGBPFRSA-N 0.000 description 1
- YKZJPIPFKGYHKY-DCAQKATOSA-N Arg-Leu-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O YKZJPIPFKGYHKY-DCAQKATOSA-N 0.000 description 1
- UZGFHWIJWPUPOH-IHRRRGAJSA-N Arg-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UZGFHWIJWPUPOH-IHRRRGAJSA-N 0.000 description 1
- FSNVAJOPUDVQAR-AVGNSLFASA-N Arg-Lys-Arg Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FSNVAJOPUDVQAR-AVGNSLFASA-N 0.000 description 1
- RIIVUOJDDQXHRV-SRVKXCTJSA-N Arg-Lys-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O RIIVUOJDDQXHRV-SRVKXCTJSA-N 0.000 description 1
- CVXXSWQORBZAAA-SRVKXCTJSA-N Arg-Lys-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCN=C(N)N CVXXSWQORBZAAA-SRVKXCTJSA-N 0.000 description 1
- BNYNOWJESJJIOI-XUXIUFHCSA-N Arg-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCN=C(N)N)N BNYNOWJESJJIOI-XUXIUFHCSA-N 0.000 description 1
- BTJVOUQWFXABOI-IHRRRGAJSA-N Arg-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCNC(N)=N BTJVOUQWFXABOI-IHRRRGAJSA-N 0.000 description 1
- GRRXPUAICOGISM-RWMBFGLXSA-N Arg-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O GRRXPUAICOGISM-RWMBFGLXSA-N 0.000 description 1
- RIQBRKVTFBWEDY-RHYQMDGZSA-N Arg-Lys-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O RIQBRKVTFBWEDY-RHYQMDGZSA-N 0.000 description 1
- NYDIVDKTULRINZ-AVGNSLFASA-N Arg-Met-Lys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N NYDIVDKTULRINZ-AVGNSLFASA-N 0.000 description 1
- JCROZIFVIYMXHM-GUBZILKMSA-N Arg-Met-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CCSC)NC(=O)[C@@H](N)CCCN=C(N)N JCROZIFVIYMXHM-GUBZILKMSA-N 0.000 description 1
- BSGSDLYGGHGMND-IHRRRGAJSA-N Arg-Phe-Cys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N BSGSDLYGGHGMND-IHRRRGAJSA-N 0.000 description 1
- VEAIMHJZTIDCIH-KKUMJFAQSA-N Arg-Phe-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O VEAIMHJZTIDCIH-KKUMJFAQSA-N 0.000 description 1
- AOHKLEBWKMKITA-IHRRRGAJSA-N Arg-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N AOHKLEBWKMKITA-IHRRRGAJSA-N 0.000 description 1
- PRLPSDIHSRITSF-UNQGMJICSA-N Arg-Phe-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PRLPSDIHSRITSF-UNQGMJICSA-N 0.000 description 1
- UIUXXFIKWQVMEX-UFYCRDLUSA-N Arg-Phe-Tyr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O UIUXXFIKWQVMEX-UFYCRDLUSA-N 0.000 description 1
- AWMAZIIEFPFHCP-RCWTZXSCSA-N Arg-Pro-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O AWMAZIIEFPFHCP-RCWTZXSCSA-N 0.000 description 1
- OWSMKCJUBAPHED-JYJNAYRXSA-N Arg-Pro-Tyr Chemical compound NC(N)=NCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 OWSMKCJUBAPHED-JYJNAYRXSA-N 0.000 description 1
- VENMDXUVHSKEIN-GUBZILKMSA-N Arg-Ser-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O VENMDXUVHSKEIN-GUBZILKMSA-N 0.000 description 1
- VRTWYUYCJGNFES-CIUDSAMLSA-N Arg-Ser-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O VRTWYUYCJGNFES-CIUDSAMLSA-N 0.000 description 1
- DNLQVHBBMPZUGJ-BQBZGAKWSA-N Arg-Ser-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O DNLQVHBBMPZUGJ-BQBZGAKWSA-N 0.000 description 1
- JPAWCMXVNZPJLO-IHRRRGAJSA-N Arg-Ser-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JPAWCMXVNZPJLO-IHRRRGAJSA-N 0.000 description 1
- LYJXHXGPWDTLKW-HJGDQZAQSA-N Arg-Thr-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)O LYJXHXGPWDTLKW-HJGDQZAQSA-N 0.000 description 1
- DDBMKOCQWNFDBH-RHYQMDGZSA-N Arg-Thr-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)O DDBMKOCQWNFDBH-RHYQMDGZSA-N 0.000 description 1
- ZPWMEWYQBWSGAO-ZJDVBMNYSA-N Arg-Thr-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZPWMEWYQBWSGAO-ZJDVBMNYSA-N 0.000 description 1
- QUBKBPZGMZWOKQ-SZMVWBNQSA-N Arg-Trp-Arg Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCCN=C(N)N)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)=CNC2=C1 QUBKBPZGMZWOKQ-SZMVWBNQSA-N 0.000 description 1
- ZUVDFJXRAICIAJ-BPUTZDHNSA-N Arg-Trp-Asp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCCN=C(N)N)N)C(=O)N[C@@H](CC(O)=O)C(O)=O)=CNC2=C1 ZUVDFJXRAICIAJ-BPUTZDHNSA-N 0.000 description 1
- NZQFXJKVNUZYAG-BPUTZDHNSA-N Arg-Trp-Cys Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCCN=C(N)N)N)C(=O)N[C@@H](CS)C(O)=O)=CNC2=C1 NZQFXJKVNUZYAG-BPUTZDHNSA-N 0.000 description 1
- CGWVCWFQGXOUSJ-ULQDDVLXSA-N Arg-Tyr-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O CGWVCWFQGXOUSJ-ULQDDVLXSA-N 0.000 description 1
- LLQIAIUAKGNOSE-NHCYSSNCSA-N Arg-Val-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCN=C(N)N LLQIAIUAKGNOSE-NHCYSSNCSA-N 0.000 description 1
- XEOXPCNONWHHSW-AVGNSLFASA-N Arg-Val-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N XEOXPCNONWHHSW-AVGNSLFASA-N 0.000 description 1
- JWCCFNZJIRZUCL-AVGNSLFASA-N Arg-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCN=C(N)N JWCCFNZJIRZUCL-AVGNSLFASA-N 0.000 description 1
- BRCVLJZIIFBSPF-ZLUOBGJFSA-N Asn-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(=O)N)N BRCVLJZIIFBSPF-ZLUOBGJFSA-N 0.000 description 1
- JEPNYDRDYNSFIU-QXEWZRGKSA-N Asn-Arg-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CC(N)=O)C(O)=O JEPNYDRDYNSFIU-QXEWZRGKSA-N 0.000 description 1
- PCKRJVZAQZWNKM-WHFBIAKZSA-N Asn-Asn-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O PCKRJVZAQZWNKM-WHFBIAKZSA-N 0.000 description 1
- QISZHYWZHJRDAO-CIUDSAMLSA-N Asn-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)N)N QISZHYWZHJRDAO-CIUDSAMLSA-N 0.000 description 1
- BGINHSZTXRJIPP-FXQIFTODSA-N Asn-Asp-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)N)N BGINHSZTXRJIPP-FXQIFTODSA-N 0.000 description 1
- JZRLLSOWDYUKOK-SRVKXCTJSA-N Asn-Asp-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)N)N JZRLLSOWDYUKOK-SRVKXCTJSA-N 0.000 description 1
- WQSCVMQDZYTFQU-FXQIFTODSA-N Asn-Cys-Arg Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CS)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O WQSCVMQDZYTFQU-FXQIFTODSA-N 0.000 description 1
- HLTLEIXYIJDFOY-ZLUOBGJFSA-N Asn-Cys-Asn Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(N)=O)C(O)=O HLTLEIXYIJDFOY-ZLUOBGJFSA-N 0.000 description 1
- HJRBIWRXULGMOA-ACZMJKKPSA-N Asn-Gln-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O HJRBIWRXULGMOA-ACZMJKKPSA-N 0.000 description 1
- UEONJSPBTSWKOI-CIUDSAMLSA-N Asn-Gln-Met Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(O)=O UEONJSPBTSWKOI-CIUDSAMLSA-N 0.000 description 1
- KUYKVGODHGHFDI-ACZMJKKPSA-N Asn-Gln-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O KUYKVGODHGHFDI-ACZMJKKPSA-N 0.000 description 1
- HCAUEJAQCXVQQM-ACZMJKKPSA-N Asn-Glu-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O HCAUEJAQCXVQQM-ACZMJKKPSA-N 0.000 description 1
- GNKVBRYFXYWXAB-WDSKDSINSA-N Asn-Glu-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O GNKVBRYFXYWXAB-WDSKDSINSA-N 0.000 description 1
- GYOHQKJEQQJBOY-QEJZJMRPSA-N Asn-Glu-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)N)N GYOHQKJEQQJBOY-QEJZJMRPSA-N 0.000 description 1
- CTQIOCMSIJATNX-WHFBIAKZSA-N Asn-Gly-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](C)C(O)=O CTQIOCMSIJATNX-WHFBIAKZSA-N 0.000 description 1
- DDPXDCKYWDGZAL-BQBZGAKWSA-N Asn-Gly-Arg Chemical compound NC(=O)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N DDPXDCKYWDGZAL-BQBZGAKWSA-N 0.000 description 1
- OPEPUCYIGFEGSW-WDSKDSINSA-N Asn-Gly-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O OPEPUCYIGFEGSW-WDSKDSINSA-N 0.000 description 1
- OLVIPTLKNSAYRJ-YUMQZZPRSA-N Asn-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N OLVIPTLKNSAYRJ-YUMQZZPRSA-N 0.000 description 1
- RAQMSGVCGSJKCL-FOHZUACHSA-N Asn-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(N)=O RAQMSGVCGSJKCL-FOHZUACHSA-N 0.000 description 1
- OOWSBIOUKIUWLO-RCOVLWMOSA-N Asn-Gly-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O OOWSBIOUKIUWLO-RCOVLWMOSA-N 0.000 description 1
- OLISTMZJGQUOGS-GMOBBJLQSA-N Asn-Ile-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N OLISTMZJGQUOGS-GMOBBJLQSA-N 0.000 description 1
- XVBDDUPJVQXDSI-PEFMBERDSA-N Asn-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)N)N XVBDDUPJVQXDSI-PEFMBERDSA-N 0.000 description 1
- NLRJGXZWTKXRHP-DCAQKATOSA-N Asn-Leu-Arg Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O NLRJGXZWTKXRHP-DCAQKATOSA-N 0.000 description 1
- BZWRLDPIWKOVKB-ZPFDUUQYSA-N Asn-Leu-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BZWRLDPIWKOVKB-ZPFDUUQYSA-N 0.000 description 1
- GLWFAWNYGWBMOC-SRVKXCTJSA-N Asn-Leu-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O GLWFAWNYGWBMOC-SRVKXCTJSA-N 0.000 description 1
- YVXRYLVELQYAEQ-SRVKXCTJSA-N Asn-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N YVXRYLVELQYAEQ-SRVKXCTJSA-N 0.000 description 1
- JLNFZLNDHONLND-GARJFASQSA-N Asn-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)N)N JLNFZLNDHONLND-GARJFASQSA-N 0.000 description 1
- DJIMLSXHXKWADV-CIUDSAMLSA-N Asn-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(N)=O DJIMLSXHXKWADV-CIUDSAMLSA-N 0.000 description 1
- RZNAMKZJPBQWDJ-SRVKXCTJSA-N Asn-Lys-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)N)N RZNAMKZJPBQWDJ-SRVKXCTJSA-N 0.000 description 1
- ORJQQZIXTOYGGH-SRVKXCTJSA-N Asn-Lys-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O ORJQQZIXTOYGGH-SRVKXCTJSA-N 0.000 description 1
- ZYPWIUFLYMQZBS-SRVKXCTJSA-N Asn-Lys-Lys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N ZYPWIUFLYMQZBS-SRVKXCTJSA-N 0.000 description 1
- COWITDLVHMZSIW-CIUDSAMLSA-N Asn-Lys-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O COWITDLVHMZSIW-CIUDSAMLSA-N 0.000 description 1
- MYVBTYXSWILFCG-BQBZGAKWSA-N Asn-Met-Gly Chemical compound CSCC[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC(=O)N)N MYVBTYXSWILFCG-BQBZGAKWSA-N 0.000 description 1
- GMUOCGCDOYYWPD-FXQIFTODSA-N Asn-Pro-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O GMUOCGCDOYYWPD-FXQIFTODSA-N 0.000 description 1
- IDUUACUJKUXKKD-VEVYYDQMSA-N Asn-Pro-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O IDUUACUJKUXKKD-VEVYYDQMSA-N 0.000 description 1
- VHQSGALUSWIYOD-QXEWZRGKSA-N Asn-Pro-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O VHQSGALUSWIYOD-QXEWZRGKSA-N 0.000 description 1
- XTMZYFMTYJNABC-ZLUOBGJFSA-N Asn-Ser-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(=O)N)N XTMZYFMTYJNABC-ZLUOBGJFSA-N 0.000 description 1
- GZXOUBTUAUAVHD-ACZMJKKPSA-N Asn-Ser-Glu Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O GZXOUBTUAUAVHD-ACZMJKKPSA-N 0.000 description 1
- MKJBPDLENBUHQU-CIUDSAMLSA-N Asn-Ser-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O MKJBPDLENBUHQU-CIUDSAMLSA-N 0.000 description 1
- ZNYKKCADEQAZKA-FXQIFTODSA-N Asn-Ser-Met Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCSC)C(O)=O ZNYKKCADEQAZKA-FXQIFTODSA-N 0.000 description 1
- VLDRQOHCMKCXLY-SRVKXCTJSA-N Asn-Ser-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O VLDRQOHCMKCXLY-SRVKXCTJSA-N 0.000 description 1
- FMNBYVSGRCXWEK-FOHZUACHSA-N Asn-Thr-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O FMNBYVSGRCXWEK-FOHZUACHSA-N 0.000 description 1
- JBDLMLZNDRLDIX-HJGDQZAQSA-N Asn-Thr-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O JBDLMLZNDRLDIX-HJGDQZAQSA-N 0.000 description 1
- XIDSGDJNUJRUHE-VEVYYDQMSA-N Asn-Thr-Met Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCSC)C(O)=O XIDSGDJNUJRUHE-VEVYYDQMSA-N 0.000 description 1
- AMGQTNHANMRPOE-LKXGYXEUSA-N Asn-Thr-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O AMGQTNHANMRPOE-LKXGYXEUSA-N 0.000 description 1
- KZYSHAMXEBPJBD-JRQIVUDYSA-N Asn-Thr-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KZYSHAMXEBPJBD-JRQIVUDYSA-N 0.000 description 1
- JZLFYAAGGYMRIK-BYULHYEWSA-N Asn-Val-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O JZLFYAAGGYMRIK-BYULHYEWSA-N 0.000 description 1
- CBHVAFXKOYAHOY-NHCYSSNCSA-N Asn-Val-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O CBHVAFXKOYAHOY-NHCYSSNCSA-N 0.000 description 1
- LMIWYCWRJVMAIQ-NHCYSSNCSA-N Asn-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N LMIWYCWRJVMAIQ-NHCYSSNCSA-N 0.000 description 1
- GBAWQWASNGUNQF-ZLUOBGJFSA-N Asp-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(=O)O)N GBAWQWASNGUNQF-ZLUOBGJFSA-N 0.000 description 1
- XPGVTUBABLRGHY-BIIVOSGPSA-N Asp-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(=O)O)N XPGVTUBABLRGHY-BIIVOSGPSA-N 0.000 description 1
- IXIWEFWRKIUMQX-DCAQKATOSA-N Asp-Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CC(O)=O IXIWEFWRKIUMQX-DCAQKATOSA-N 0.000 description 1
- ATYWBXGNXZYZGI-ACZMJKKPSA-N Asp-Asn-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O ATYWBXGNXZYZGI-ACZMJKKPSA-N 0.000 description 1
- VBVKSAFJPVXMFJ-CIUDSAMLSA-N Asp-Asn-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N VBVKSAFJPVXMFJ-CIUDSAMLSA-N 0.000 description 1
- UGKZHCBLMLSANF-CIUDSAMLSA-N Asp-Asn-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O UGKZHCBLMLSANF-CIUDSAMLSA-N 0.000 description 1
- JDHOJQJMWBKHDB-CIUDSAMLSA-N Asp-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N JDHOJQJMWBKHDB-CIUDSAMLSA-N 0.000 description 1
- VPSHHQXIWLGVDD-ZLUOBGJFSA-N Asp-Asp-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O VPSHHQXIWLGVDD-ZLUOBGJFSA-N 0.000 description 1
- WCFCYFDBMNFSPA-ACZMJKKPSA-N Asp-Asp-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(O)=O WCFCYFDBMNFSPA-ACZMJKKPSA-N 0.000 description 1
- PXLNPFOJZQMXAT-BYULHYEWSA-N Asp-Asp-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(O)=O PXLNPFOJZQMXAT-BYULHYEWSA-N 0.000 description 1
- WXASLRQUSYWVNE-FXQIFTODSA-N Asp-Cys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CC(=O)O)N WXASLRQUSYWVNE-FXQIFTODSA-N 0.000 description 1
- NYQHSUGFEWDWPD-ACZMJKKPSA-N Asp-Gln-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)O)N NYQHSUGFEWDWPD-ACZMJKKPSA-N 0.000 description 1
- RYKWOUUZJFSJOH-FXQIFTODSA-N Asp-Gln-Glu Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)O)N RYKWOUUZJFSJOH-FXQIFTODSA-N 0.000 description 1
- SPKRHJOVRVDJGG-CIUDSAMLSA-N Asp-Gln-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)O)N SPKRHJOVRVDJGG-CIUDSAMLSA-N 0.000 description 1
- VAWNQIGQPUOPQW-ACZMJKKPSA-N Asp-Glu-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O VAWNQIGQPUOPQW-ACZMJKKPSA-N 0.000 description 1
- IJHUZMGJRGNXIW-CIUDSAMLSA-N Asp-Glu-Arg Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O IJHUZMGJRGNXIW-CIUDSAMLSA-N 0.000 description 1
- XAJRHVUUVUPFQL-ACZMJKKPSA-N Asp-Glu-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O XAJRHVUUVUPFQL-ACZMJKKPSA-N 0.000 description 1
- GHODABZPVZMWCE-FXQIFTODSA-N Asp-Glu-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O GHODABZPVZMWCE-FXQIFTODSA-N 0.000 description 1
- VFUXXFVCYZPOQG-WDSKDSINSA-N Asp-Glu-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O VFUXXFVCYZPOQG-WDSKDSINSA-N 0.000 description 1
- PDECQIHABNQRHN-GUBZILKMSA-N Asp-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC(O)=O PDECQIHABNQRHN-GUBZILKMSA-N 0.000 description 1
- YDJVIBMKAMQPPP-LAEOZQHASA-N Asp-Glu-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O YDJVIBMKAMQPPP-LAEOZQHASA-N 0.000 description 1
- VIRHEUMYXXLCBF-WDSKDSINSA-N Asp-Gly-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O VIRHEUMYXXLCBF-WDSKDSINSA-N 0.000 description 1
- PSLSTUMPZILTAH-BYULHYEWSA-N Asp-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(O)=O PSLSTUMPZILTAH-BYULHYEWSA-N 0.000 description 1
- KHGPWGKPYHPOIK-QWRGUYRKSA-N Asp-Gly-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O KHGPWGKPYHPOIK-QWRGUYRKSA-N 0.000 description 1
- PGUYEUCYVNZGGV-QWRGUYRKSA-N Asp-Gly-Tyr Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 PGUYEUCYVNZGGV-QWRGUYRKSA-N 0.000 description 1
- CYCKJEFVFNRWEZ-UGYAYLCHSA-N Asp-Ile-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O CYCKJEFVFNRWEZ-UGYAYLCHSA-N 0.000 description 1
- QNFRBNZGVVKBNJ-PEFMBERDSA-N Asp-Ile-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)O)N QNFRBNZGVVKBNJ-PEFMBERDSA-N 0.000 description 1
- PYXXJFRXIYAESU-PCBIJLKTSA-N Asp-Ile-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PYXXJFRXIYAESU-PCBIJLKTSA-N 0.000 description 1
- KYQNAIMCTRZLNP-QSFUFRPTSA-N Asp-Ile-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O KYQNAIMCTRZLNP-QSFUFRPTSA-N 0.000 description 1
- JNNVNVRBYUJYGS-CIUDSAMLSA-N Asp-Leu-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O JNNVNVRBYUJYGS-CIUDSAMLSA-N 0.000 description 1
- PAYPSKIBMDHZPI-CIUDSAMLSA-N Asp-Leu-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O PAYPSKIBMDHZPI-CIUDSAMLSA-N 0.000 description 1
- AYFVRYXNDHBECD-YUMQZZPRSA-N Asp-Leu-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O AYFVRYXNDHBECD-YUMQZZPRSA-N 0.000 description 1
- OEDJQRXNDRUGEU-SRVKXCTJSA-N Asp-Leu-His Chemical compound N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)O OEDJQRXNDRUGEU-SRVKXCTJSA-N 0.000 description 1
- TZBJAXGYGSIUHQ-XUXIUFHCSA-N Asp-Leu-Leu-Ser Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O TZBJAXGYGSIUHQ-XUXIUFHCSA-N 0.000 description 1
- QNMKWNONJGKJJC-NHCYSSNCSA-N Asp-Leu-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O QNMKWNONJGKJJC-NHCYSSNCSA-N 0.000 description 1
- GKWFMNNNYZHJHV-SRVKXCTJSA-N Asp-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC(O)=O GKWFMNNNYZHJHV-SRVKXCTJSA-N 0.000 description 1
- JXGJJQJHXHXJQF-CIUDSAMLSA-N Asp-Met-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(O)=O)C(O)=O JXGJJQJHXHXJQF-CIUDSAMLSA-N 0.000 description 1
- UCHSVZYJKJLPHF-BZSNNMDCSA-N Asp-Phe-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O UCHSVZYJKJLPHF-BZSNNMDCSA-N 0.000 description 1
- KPSHWSWFPUDEGF-FXQIFTODSA-N Asp-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC(O)=O KPSHWSWFPUDEGF-FXQIFTODSA-N 0.000 description 1
- BWJZSLQJNBSUPM-FXQIFTODSA-N Asp-Pro-Asn Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O BWJZSLQJNBSUPM-FXQIFTODSA-N 0.000 description 1
- FAUPLTGRUBTXNU-FXQIFTODSA-N Asp-Pro-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O FAUPLTGRUBTXNU-FXQIFTODSA-N 0.000 description 1
- RVMXMLSYBTXCAV-VEVYYDQMSA-N Asp-Pro-Thr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O RVMXMLSYBTXCAV-VEVYYDQMSA-N 0.000 description 1
- XXAMCEGRCZQGEM-ZLUOBGJFSA-N Asp-Ser-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O XXAMCEGRCZQGEM-ZLUOBGJFSA-N 0.000 description 1
- CUQDCPXNZPDYFQ-ZLUOBGJFSA-N Asp-Ser-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O CUQDCPXNZPDYFQ-ZLUOBGJFSA-N 0.000 description 1
- FIAKNCXQFFKSSI-ZLUOBGJFSA-N Asp-Ser-Cys Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(O)=O FIAKNCXQFFKSSI-ZLUOBGJFSA-N 0.000 description 1
- QSFHZPQUAAQHAQ-CIUDSAMLSA-N Asp-Ser-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O QSFHZPQUAAQHAQ-CIUDSAMLSA-N 0.000 description 1
- UTLCRGFJFSZWAW-OLHMAJIHSA-N Asp-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O UTLCRGFJFSZWAW-OLHMAJIHSA-N 0.000 description 1
- NAAAPCLFJPURAM-HJGDQZAQSA-N Asp-Thr-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O NAAAPCLFJPURAM-HJGDQZAQSA-N 0.000 description 1
- CZIVKMOEXPILDK-SRVKXCTJSA-N Asp-Tyr-Ser Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O CZIVKMOEXPILDK-SRVKXCTJSA-N 0.000 description 1
- BYLPQJAWXJWUCJ-YDHLFZDLSA-N Asp-Tyr-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O BYLPQJAWXJWUCJ-YDHLFZDLSA-N 0.000 description 1
- PLOKOIJSGCISHE-BYULHYEWSA-N Asp-Val-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O PLOKOIJSGCISHE-BYULHYEWSA-N 0.000 description 1
- XWKPSMRPIKKDDU-RCOVLWMOSA-N Asp-Val-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O XWKPSMRPIKKDDU-RCOVLWMOSA-N 0.000 description 1
- XQFLFQWOBXPMHW-NHCYSSNCSA-N Asp-Val-His Chemical compound N[C@@H](CC(=O)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)O XQFLFQWOBXPMHW-NHCYSSNCSA-N 0.000 description 1
- QOJJMJKTMKNFEF-ZKWXMUAHSA-N Asp-Val-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC(O)=O QOJJMJKTMKNFEF-ZKWXMUAHSA-N 0.000 description 1
- RKXVTTIQNKPCHU-KKHAAJSZSA-N Asp-Val-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC(O)=O RKXVTTIQNKPCHU-KKHAAJSZSA-N 0.000 description 1
- GZYDPEJSZYZWEF-MXAVVETBSA-N Asp-Val-Val-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC(O)=O GZYDPEJSZYZWEF-MXAVVETBSA-N 0.000 description 1
- 241000283690 Bos taurus Species 0.000 description 1
- 108091003079 Bovine Serum Albumin Proteins 0.000 description 1
- 238000010354 CRISPR gene editing Methods 0.000 description 1
- 238000010453 CRISPR/Cas method Methods 0.000 description 1
- OKTJSMMVPCPJKN-UHFFFAOYSA-N Carbon Chemical compound [C] OKTJSMMVPCPJKN-UHFFFAOYSA-N 0.000 description 1
- 108700031407 Chloroplast Genes Proteins 0.000 description 1
- AMRLSQGGERHDHJ-FXQIFTODSA-N Cys-Ala-Arg Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O AMRLSQGGERHDHJ-FXQIFTODSA-N 0.000 description 1
- FMDCYTBSPZMPQE-JBDRJPRFSA-N Cys-Ala-Ile Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FMDCYTBSPZMPQE-JBDRJPRFSA-N 0.000 description 1
- CLDCTNHPILWQCW-CIUDSAMLSA-N Cys-Arg-Glu Chemical compound C(C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CS)N)CN=C(N)N CLDCTNHPILWQCW-CIUDSAMLSA-N 0.000 description 1
- HRJLVSQKBLZHSR-ZLUOBGJFSA-N Cys-Asn-Ala Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O HRJLVSQKBLZHSR-ZLUOBGJFSA-N 0.000 description 1
- VZKXOWRNJDEGLZ-WHFBIAKZSA-N Cys-Asp-Gly Chemical compound SC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O VZKXOWRNJDEGLZ-WHFBIAKZSA-N 0.000 description 1
- YMBAVNPKBWHDAW-CIUDSAMLSA-N Cys-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CS)N YMBAVNPKBWHDAW-CIUDSAMLSA-N 0.000 description 1
- ZEXHDOQQYZKOIB-ACZMJKKPSA-N Cys-Glu-Ser Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O ZEXHDOQQYZKOIB-ACZMJKKPSA-N 0.000 description 1
- LYSHSHHDBVKJRN-JBDRJPRFSA-N Cys-Ile-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](CS)N LYSHSHHDBVKJRN-JBDRJPRFSA-N 0.000 description 1
- DYBIDOHFRRUMLW-CIUDSAMLSA-N Cys-Leu-Cys Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)CS)C(=O)N[C@@H](CS)C(O)=O DYBIDOHFRRUMLW-CIUDSAMLSA-N 0.000 description 1
- CIVXDCMSSFGWAL-YUMQZZPRSA-N Cys-Lys-Gly Chemical compound C(CCN)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CS)N CIVXDCMSSFGWAL-YUMQZZPRSA-N 0.000 description 1
- PGBLJHDDKCVSTC-CIUDSAMLSA-N Cys-Met-Gln Chemical compound CSCC[C@H](NC(=O)[C@@H](N)CS)C(=O)N[C@@H](CCC(N)=O)C(O)=O PGBLJHDDKCVSTC-CIUDSAMLSA-N 0.000 description 1
- IDZDFWJNPOOOHE-KKUMJFAQSA-N Cys-Phe-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CS)N IDZDFWJNPOOOHE-KKUMJFAQSA-N 0.000 description 1
- BCFXQBXXDSEHRS-FXQIFTODSA-N Cys-Ser-Arg Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O BCFXQBXXDSEHRS-FXQIFTODSA-N 0.000 description 1
- HJXSYJVCMUOUNY-SRVKXCTJSA-N Cys-Ser-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CS)N HJXSYJVCMUOUNY-SRVKXCTJSA-N 0.000 description 1
- DRXOWZZHCSBUOI-YJRXYDGGSA-N Cys-Thr-Tyr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CS)N)O DRXOWZZHCSBUOI-YJRXYDGGSA-N 0.000 description 1
- KFYPRIGJTICABD-XGEHTFHBSA-N Cys-Thr-Val Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](CS)N)O KFYPRIGJTICABD-XGEHTFHBSA-N 0.000 description 1
- JIZRUFJGHPIYPS-SRVKXCTJSA-N Cys-Tyr-Cys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CS)N)O JIZRUFJGHPIYPS-SRVKXCTJSA-N 0.000 description 1
- 102000053602 DNA Human genes 0.000 description 1
- 108010076804 DNA Restriction Enzymes Proteins 0.000 description 1
- 230000006820 DNA synthesis Effects 0.000 description 1
- 108010008532 Deoxyribonuclease I Proteins 0.000 description 1
- 102000007260 Deoxyribonuclease I Human genes 0.000 description 1
- 108091029865 Exogenous DNA Proteins 0.000 description 1
- KVYVOGYEMPEXBT-GUBZILKMSA-N Gln-Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(N)=O KVYVOGYEMPEXBT-GUBZILKMSA-N 0.000 description 1
- OYTPNWYZORARHL-XHNCKOQMSA-N Gln-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)N)N OYTPNWYZORARHL-XHNCKOQMSA-N 0.000 description 1
- RGXXLQWXBFNXTG-CIUDSAMLSA-N Gln-Arg-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O RGXXLQWXBFNXTG-CIUDSAMLSA-N 0.000 description 1
- WOACHWLUOFZLGJ-GUBZILKMSA-N Gln-Arg-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O WOACHWLUOFZLGJ-GUBZILKMSA-N 0.000 description 1
- JESJDAAGXULQOP-CIUDSAMLSA-N Gln-Arg-Ser Chemical compound C(C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)CN=C(N)N JESJDAAGXULQOP-CIUDSAMLSA-N 0.000 description 1
- MQANCSUBSBJNLU-KKUMJFAQSA-N Gln-Arg-Tyr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MQANCSUBSBJNLU-KKUMJFAQSA-N 0.000 description 1
- INFBPLSHYFALDE-ACZMJKKPSA-N Gln-Asn-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O INFBPLSHYFALDE-ACZMJKKPSA-N 0.000 description 1
- SOBBAYVQSNXYPQ-ACZMJKKPSA-N Gln-Asn-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O SOBBAYVQSNXYPQ-ACZMJKKPSA-N 0.000 description 1
- IXFVOPOHSRKJNG-LAEOZQHASA-N Gln-Asp-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O IXFVOPOHSRKJNG-LAEOZQHASA-N 0.000 description 1
- NVEASDQHBRZPSU-BQBZGAKWSA-N Gln-Gln-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O NVEASDQHBRZPSU-BQBZGAKWSA-N 0.000 description 1
- IVCOYUURLWQDJQ-LPEHRKFASA-N Gln-Gln-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCC(=O)N)N)C(=O)O IVCOYUURLWQDJQ-LPEHRKFASA-N 0.000 description 1
- MCAVASRGVBVPMX-FXQIFTODSA-N Gln-Glu-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O MCAVASRGVBVPMX-FXQIFTODSA-N 0.000 description 1
- MFJAPSYJQJCQDN-BQBZGAKWSA-N Gln-Gly-Glu Chemical compound NC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O MFJAPSYJQJCQDN-BQBZGAKWSA-N 0.000 description 1
- QQAPDATZKKTBIY-YUMQZZPRSA-N Gln-Gly-Met Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)NCC(=O)N[C@@H](CCSC)C(O)=O QQAPDATZKKTBIY-YUMQZZPRSA-N 0.000 description 1
- ORYMMTRPKVTGSJ-XVKPBYJWSA-N Gln-Gly-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCC(N)=O ORYMMTRPKVTGSJ-XVKPBYJWSA-N 0.000 description 1
- RGAOLBZBLOJUTP-GRLWGSQLSA-N Gln-Ile-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)NC(=O)[C@H](CCC(=O)N)N RGAOLBZBLOJUTP-GRLWGSQLSA-N 0.000 description 1
- MWERYIXRDZDXOA-QEWYBTABSA-N Gln-Ile-Phe Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O MWERYIXRDZDXOA-QEWYBTABSA-N 0.000 description 1
- FFVXLVGUJBCKRX-UKJIMTQDSA-N Gln-Ile-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](CCC(=O)N)N FFVXLVGUJBCKRX-UKJIMTQDSA-N 0.000 description 1
- HYPVLWGNBIYTNA-GUBZILKMSA-N Gln-Leu-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O HYPVLWGNBIYTNA-GUBZILKMSA-N 0.000 description 1
- MLSKFHLRFVGNLL-WDCWCFNPSA-N Gln-Leu-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MLSKFHLRFVGNLL-WDCWCFNPSA-N 0.000 description 1
- JNENSVNAUWONEZ-GUBZILKMSA-N Gln-Lys-Asn Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O JNENSVNAUWONEZ-GUBZILKMSA-N 0.000 description 1
- HPCOBEHVEHWREJ-DCAQKATOSA-N Gln-Lys-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O HPCOBEHVEHWREJ-DCAQKATOSA-N 0.000 description 1
- SXGMGNZEHFORAV-IUCAKERBSA-N Gln-Lys-Gly Chemical compound C(CCN)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)N)N SXGMGNZEHFORAV-IUCAKERBSA-N 0.000 description 1
- XBWGJWXGUNSZAT-CIUDSAMLSA-N Gln-Met-Asp Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N XBWGJWXGUNSZAT-CIUDSAMLSA-N 0.000 description 1
- DOMHVQBSRJNNKD-ZPFDUUQYSA-N Gln-Met-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DOMHVQBSRJNNKD-ZPFDUUQYSA-N 0.000 description 1
- FALJZCPMTGJOHX-SRVKXCTJSA-N Gln-Met-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O FALJZCPMTGJOHX-SRVKXCTJSA-N 0.000 description 1
- BZULIEARJFRINC-IHRRRGAJSA-N Gln-Phe-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N BZULIEARJFRINC-IHRRRGAJSA-N 0.000 description 1
- DOQUICBEISTQHE-CIUDSAMLSA-N Gln-Pro-Asp Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O DOQUICBEISTQHE-CIUDSAMLSA-N 0.000 description 1
- XQDGOJPVMSWZSO-SRVKXCTJSA-N Gln-Pro-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CCC(=O)N)N XQDGOJPVMSWZSO-SRVKXCTJSA-N 0.000 description 1
- OKARHJKJTKFQBM-ACZMJKKPSA-N Gln-Ser-Asn Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)N)C(=O)O)N OKARHJKJTKFQBM-ACZMJKKPSA-N 0.000 description 1
- RWQCWSGOOOEGPB-FXQIFTODSA-N Gln-Ser-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O RWQCWSGOOOEGPB-FXQIFTODSA-N 0.000 description 1
- OKQLXOYFUPVEHI-CIUDSAMLSA-N Gln-Ser-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)N)N OKQLXOYFUPVEHI-CIUDSAMLSA-N 0.000 description 1
- SYZZMPFLOLSMHL-XHNCKOQMSA-N Gln-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)N)N)C(=O)O SYZZMPFLOLSMHL-XHNCKOQMSA-N 0.000 description 1
- QENSHQJGWGRPQS-QEJZJMRPSA-N Gln-Ser-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CO)NC(=O)[C@H](CCC(N)=O)N)C(O)=O)=CNC2=C1 QENSHQJGWGRPQS-QEJZJMRPSA-N 0.000 description 1
- OTQSTOXRUBVWAP-NRPADANISA-N Gln-Ser-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O OTQSTOXRUBVWAP-NRPADANISA-N 0.000 description 1
- OUBUHIODTNUUTC-WDCWCFNPSA-N Gln-Thr-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O OUBUHIODTNUUTC-WDCWCFNPSA-N 0.000 description 1
- STHSGOZLFLFGSS-SUSMZKCASA-N Gln-Thr-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O STHSGOZLFLFGSS-SUSMZKCASA-N 0.000 description 1
- IIMZHVKZBGSEKZ-SZMVWBNQSA-N Gln-Trp-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(C)C)C(O)=O IIMZHVKZBGSEKZ-SZMVWBNQSA-N 0.000 description 1
- BETSEXMYBWCDAE-SZMVWBNQSA-N Gln-Trp-Lys Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N BETSEXMYBWCDAE-SZMVWBNQSA-N 0.000 description 1
- UGEZSPWLJABDAR-KKUMJFAQSA-N Gln-Tyr-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CCC(=O)N)N UGEZSPWLJABDAR-KKUMJFAQSA-N 0.000 description 1
- RUFHOVYUYSNDNY-ACZMJKKPSA-N Glu-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O RUFHOVYUYSNDNY-ACZMJKKPSA-N 0.000 description 1
- SZXSSXUNOALWCH-ACZMJKKPSA-N Glu-Ala-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O SZXSSXUNOALWCH-ACZMJKKPSA-N 0.000 description 1
- ATRHMOJQJWPVBQ-DRZSPHRISA-N Glu-Ala-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ATRHMOJQJWPVBQ-DRZSPHRISA-N 0.000 description 1
- IRDASPPCLZIERZ-XHNCKOQMSA-N Glu-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCC(=O)O)N IRDASPPCLZIERZ-XHNCKOQMSA-N 0.000 description 1
- FYBSCGZLICNOBA-XQXXSGGOSA-N Glu-Ala-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O FYBSCGZLICNOBA-XQXXSGGOSA-N 0.000 description 1
- RSUVOPBMWMTVDI-XEGUGMAKSA-N Glu-Ala-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CCC(O)=O)C)C(O)=O)=CNC2=C1 RSUVOPBMWMTVDI-XEGUGMAKSA-N 0.000 description 1
- KBKGRMNVKPSQIF-XDTLVQLUSA-N Glu-Ala-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O KBKGRMNVKPSQIF-XDTLVQLUSA-N 0.000 description 1
- DIXKFOPPGWKZLY-CIUDSAMLSA-N Glu-Arg-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O DIXKFOPPGWKZLY-CIUDSAMLSA-N 0.000 description 1
- DYFJZDDQPNIPAB-NHCYSSNCSA-N Glu-Arg-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(O)=O DYFJZDDQPNIPAB-NHCYSSNCSA-N 0.000 description 1
- AFODTOLGSZQDSL-PEFMBERDSA-N Glu-Asn-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N AFODTOLGSZQDSL-PEFMBERDSA-N 0.000 description 1
- PCBBLFVHTYNQGG-LAEOZQHASA-N Glu-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N PCBBLFVHTYNQGG-LAEOZQHASA-N 0.000 description 1
- DSPQRJXOIXHOHK-WDSKDSINSA-N Glu-Asp-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O DSPQRJXOIXHOHK-WDSKDSINSA-N 0.000 description 1
- JVSBYEDSSRZQGV-GUBZILKMSA-N Glu-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCC(O)=O JVSBYEDSSRZQGV-GUBZILKMSA-N 0.000 description 1
- PXHABOCPJVTGEK-BQBZGAKWSA-N Glu-Gln-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O PXHABOCPJVTGEK-BQBZGAKWSA-N 0.000 description 1
- CGOHAEBMDSEKFB-FXQIFTODSA-N Glu-Glu-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O CGOHAEBMDSEKFB-FXQIFTODSA-N 0.000 description 1
- SJPMNHCEWPTRBR-BQBZGAKWSA-N Glu-Glu-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O SJPMNHCEWPTRBR-BQBZGAKWSA-N 0.000 description 1
- AUTNXSQEVVHSJK-YVNDNENWSA-N Glu-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O AUTNXSQEVVHSJK-YVNDNENWSA-N 0.000 description 1
- LGYZYFFDELZWRS-DCAQKATOSA-N Glu-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O LGYZYFFDELZWRS-DCAQKATOSA-N 0.000 description 1
- KASDBWKLWJKTLJ-GUBZILKMSA-N Glu-Glu-Met Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(O)=O KASDBWKLWJKTLJ-GUBZILKMSA-N 0.000 description 1
- YLJHCWNDBKKOEB-IHRRRGAJSA-N Glu-Glu-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O YLJHCWNDBKKOEB-IHRRRGAJSA-N 0.000 description 1
- IQACOVZVOMVILH-FXQIFTODSA-N Glu-Glu-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O IQACOVZVOMVILH-FXQIFTODSA-N 0.000 description 1
- QYPKJXSMLMREKF-BPUTZDHNSA-N Glu-Glu-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)O)N QYPKJXSMLMREKF-BPUTZDHNSA-N 0.000 description 1
- QJCKNLPMTPXXEM-AUTRQRHGSA-N Glu-Glu-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O QJCKNLPMTPXXEM-AUTRQRHGSA-N 0.000 description 1
- UHVIQGKBMXEVGN-WDSKDSINSA-N Glu-Gly-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O UHVIQGKBMXEVGN-WDSKDSINSA-N 0.000 description 1
- LRPXYSGPOBVBEH-IUCAKERBSA-N Glu-Gly-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O LRPXYSGPOBVBEH-IUCAKERBSA-N 0.000 description 1
- KRGZZKWSBGPLKL-IUCAKERBSA-N Glu-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)O)N KRGZZKWSBGPLKL-IUCAKERBSA-N 0.000 description 1
- VXQOONWNIWFOCS-HGNGGELXSA-N Glu-His-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CCC(=O)O)N VXQOONWNIWFOCS-HGNGGELXSA-N 0.000 description 1
- NJPQBTJSYCKCNS-HVTMNAMFSA-N Glu-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CCC(=O)O)N NJPQBTJSYCKCNS-HVTMNAMFSA-N 0.000 description 1
- LGYCLOCORAEQSZ-PEFMBERDSA-N Glu-Ile-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O LGYCLOCORAEQSZ-PEFMBERDSA-N 0.000 description 1
- WVYJNPCWJYBHJG-YVNDNENWSA-N Glu-Ile-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(N)=O)C(O)=O WVYJNPCWJYBHJG-YVNDNENWSA-N 0.000 description 1
- WTMZXOPHTIVFCP-QEWYBTABSA-N Glu-Ile-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 WTMZXOPHTIVFCP-QEWYBTABSA-N 0.000 description 1
- ZSWGJYOZWBHROQ-RWRJDSDZSA-N Glu-Ile-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZSWGJYOZWBHROQ-RWRJDSDZSA-N 0.000 description 1
- INGJLBQKTRJLFO-UKJIMTQDSA-N Glu-Ile-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(O)=O INGJLBQKTRJLFO-UKJIMTQDSA-N 0.000 description 1
- PJBVXVBTTFZPHJ-GUBZILKMSA-N Glu-Leu-Asp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)O)N PJBVXVBTTFZPHJ-GUBZILKMSA-N 0.000 description 1
- IRXNJYPKBVERCW-DCAQKATOSA-N Glu-Leu-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O IRXNJYPKBVERCW-DCAQKATOSA-N 0.000 description 1
- ATVYZJGOZLVXDK-IUCAKERBSA-N Glu-Leu-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O ATVYZJGOZLVXDK-IUCAKERBSA-N 0.000 description 1
- MWMJCGBSIORNCD-AVGNSLFASA-N Glu-Leu-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O MWMJCGBSIORNCD-AVGNSLFASA-N 0.000 description 1
- WNRZUESNGGDCJX-JYJNAYRXSA-N Glu-Leu-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O WNRZUESNGGDCJX-JYJNAYRXSA-N 0.000 description 1
- GJBUAAAIZSRCDC-GVXVVHGQSA-N Glu-Leu-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O GJBUAAAIZSRCDC-GVXVVHGQSA-N 0.000 description 1
- OQXDUSZKISQQSS-GUBZILKMSA-N Glu-Lys-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O OQXDUSZKISQQSS-GUBZILKMSA-N 0.000 description 1
- OCJRHJZKGGSPRW-IUCAKERBSA-N Glu-Lys-Gly Chemical compound NCCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O OCJRHJZKGGSPRW-IUCAKERBSA-N 0.000 description 1
- YGLCLCMAYUYZSG-AVGNSLFASA-N Glu-Lys-His Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 YGLCLCMAYUYZSG-AVGNSLFASA-N 0.000 description 1
- HRBYTAIBKPNZKQ-AVGNSLFASA-N Glu-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCC(O)=O HRBYTAIBKPNZKQ-AVGNSLFASA-N 0.000 description 1
- RBXSZQRSEGYDFG-GUBZILKMSA-N Glu-Lys-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O RBXSZQRSEGYDFG-GUBZILKMSA-N 0.000 description 1
- SUIAHERNFYRBDZ-GVXVVHGQSA-N Glu-Lys-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O SUIAHERNFYRBDZ-GVXVVHGQSA-N 0.000 description 1
- AOCARQDSFTWWFT-DCAQKATOSA-N Glu-Met-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O AOCARQDSFTWWFT-DCAQKATOSA-N 0.000 description 1
- CBEUFCJRFNZMCU-SRVKXCTJSA-N Glu-Met-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O CBEUFCJRFNZMCU-SRVKXCTJSA-N 0.000 description 1
- XEKAJTCACGEBOK-KKUMJFAQSA-N Glu-Met-Phe Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CCC(=O)O)N XEKAJTCACGEBOK-KKUMJFAQSA-N 0.000 description 1
- YHOJJFFTSMWVGR-HJGDQZAQSA-N Glu-Met-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O YHOJJFFTSMWVGR-HJGDQZAQSA-N 0.000 description 1
- JZJGEKDPWVJOLD-QEWYBTABSA-N Glu-Phe-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JZJGEKDPWVJOLD-QEWYBTABSA-N 0.000 description 1
- YTRBQAQSUDSIQE-FHWLQOOXSA-N Glu-Phe-Phe Chemical compound C([C@H](NC(=O)[C@H](CCC(O)=O)N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 YTRBQAQSUDSIQE-FHWLQOOXSA-N 0.000 description 1
- UDEPRBFQTWGLCW-CIUDSAMLSA-N Glu-Pro-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O UDEPRBFQTWGLCW-CIUDSAMLSA-N 0.000 description 1
- CQAHWYDHKUWYIX-YUMQZZPRSA-N Glu-Pro-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O CQAHWYDHKUWYIX-YUMQZZPRSA-N 0.000 description 1
- BPLNJYHNAJVLRT-ACZMJKKPSA-N Glu-Ser-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O BPLNJYHNAJVLRT-ACZMJKKPSA-N 0.000 description 1
- GMVCSRBOSIUTFC-FXQIFTODSA-N Glu-Ser-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O GMVCSRBOSIUTFC-FXQIFTODSA-N 0.000 description 1
- RFTVTKBHDXCEEX-WDSKDSINSA-N Glu-Ser-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RFTVTKBHDXCEEX-WDSKDSINSA-N 0.000 description 1
- VNCNWQPIQYAMAK-ACZMJKKPSA-N Glu-Ser-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O VNCNWQPIQYAMAK-ACZMJKKPSA-N 0.000 description 1
- WXONSNSSBYQGNN-AVGNSLFASA-N Glu-Ser-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O WXONSNSSBYQGNN-AVGNSLFASA-N 0.000 description 1
- HZISRJBYZAODRV-XQXXSGGOSA-N Glu-Thr-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O HZISRJBYZAODRV-XQXXSGGOSA-N 0.000 description 1
- MWTGQXBHVRTCOR-GLLZPBPUSA-N Glu-Thr-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MWTGQXBHVRTCOR-GLLZPBPUSA-N 0.000 description 1
- CQGBSALYGOXQPE-HTUGSXCWSA-N Glu-Thr-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O CQGBSALYGOXQPE-HTUGSXCWSA-N 0.000 description 1
- MXJYXYDREQWUMS-XKBZYTNZSA-N Glu-Thr-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O MXJYXYDREQWUMS-XKBZYTNZSA-N 0.000 description 1
- VHPVBPCCWVDGJL-IRIUXVKKSA-N Glu-Thr-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VHPVBPCCWVDGJL-IRIUXVKKSA-N 0.000 description 1
- BKMOHWJHXQLFEX-IRIUXVKKSA-N Glu-Tyr-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CCC(=O)O)N)O BKMOHWJHXQLFEX-IRIUXVKKSA-N 0.000 description 1
- QLNKFGTZOBVMCS-JBACZVJFSA-N Glu-Tyr-Trp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O QLNKFGTZOBVMCS-JBACZVJFSA-N 0.000 description 1
- HBMRTXJZQDVRFT-DZKIICNBSA-N Glu-Tyr-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O HBMRTXJZQDVRFT-DZKIICNBSA-N 0.000 description 1
- HQTDNEZTGZUWSY-XVKPBYJWSA-N Glu-Val-Gly Chemical compound CC(C)[C@H](NC(=O)[C@@H](N)CCC(O)=O)C(=O)NCC(O)=O HQTDNEZTGZUWSY-XVKPBYJWSA-N 0.000 description 1
- ZYRXTRTUCAVNBQ-GVXVVHGQSA-N Glu-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N ZYRXTRTUCAVNBQ-GVXVVHGQSA-N 0.000 description 1
- QRWPTXLWHHTOCO-DZKIICNBSA-N Glu-Val-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O QRWPTXLWHHTOCO-DZKIICNBSA-N 0.000 description 1
- GQGAFTPXAPKSCF-WHFBIAKZSA-N Gly-Ala-Cys Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CS)C(=O)O GQGAFTPXAPKSCF-WHFBIAKZSA-N 0.000 description 1
- YMUFWNJHVPQNQD-ZKWXMUAHSA-N Gly-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN YMUFWNJHVPQNQD-ZKWXMUAHSA-N 0.000 description 1
- VSVZIEVNUYDAFR-YUMQZZPRSA-N Gly-Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN VSVZIEVNUYDAFR-YUMQZZPRSA-N 0.000 description 1
- LJPIRKICOISLKN-WHFBIAKZSA-N Gly-Ala-Ser Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O LJPIRKICOISLKN-WHFBIAKZSA-N 0.000 description 1
- JXYMPBCYRKWJEE-BQBZGAKWSA-N Gly-Arg-Ala Chemical compound [H]NCC(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O JXYMPBCYRKWJEE-BQBZGAKWSA-N 0.000 description 1
- UPOJUWHGMDJUQZ-IUCAKERBSA-N Gly-Arg-Arg Chemical compound NC(=N)NCCC[C@H](NC(=O)CN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O UPOJUWHGMDJUQZ-IUCAKERBSA-N 0.000 description 1
- RQZGFWKQLPJOEQ-YUMQZZPRSA-N Gly-Arg-Gln Chemical compound C(C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)CN)CN=C(N)N RQZGFWKQLPJOEQ-YUMQZZPRSA-N 0.000 description 1
- KFMBRBPXHVMDFN-UWVGGRQHSA-N Gly-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCNC(N)=N KFMBRBPXHVMDFN-UWVGGRQHSA-N 0.000 description 1
- KRRMJKMGWWXWDW-STQMWFEESA-N Gly-Arg-Phe Chemical compound NC(=N)NCCC[C@H](NC(=O)CN)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KRRMJKMGWWXWDW-STQMWFEESA-N 0.000 description 1
- XUORRGAFUQIMLC-STQMWFEESA-N Gly-Arg-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)CN)O XUORRGAFUQIMLC-STQMWFEESA-N 0.000 description 1
- XRTDOIOIBMAXCT-NKWVEPMBSA-N Gly-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)CN)C(=O)O XRTDOIOIBMAXCT-NKWVEPMBSA-N 0.000 description 1
- KQDMENMTYNBWMR-WHFBIAKZSA-N Gly-Asp-Ala Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O KQDMENMTYNBWMR-WHFBIAKZSA-N 0.000 description 1
- FUTAPPOITCCWTH-WHFBIAKZSA-N Gly-Asp-Asp Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O FUTAPPOITCCWTH-WHFBIAKZSA-N 0.000 description 1
- QSTLUOIOYLYLLF-WDSKDSINSA-N Gly-Asp-Glu Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O QSTLUOIOYLYLLF-WDSKDSINSA-N 0.000 description 1
- RPLLQZBOVIVGMX-QWRGUYRKSA-N Gly-Asp-Phe Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O RPLLQZBOVIVGMX-QWRGUYRKSA-N 0.000 description 1
- LXXLEUBUOMCAMR-NKWVEPMBSA-N Gly-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)CN)C(=O)O LXXLEUBUOMCAMR-NKWVEPMBSA-N 0.000 description 1
- JPWIMMUNWUKOAD-STQMWFEESA-N Gly-Asp-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)CN JPWIMMUNWUKOAD-STQMWFEESA-N 0.000 description 1
- QGZSAHIZRQHCEQ-QWRGUYRKSA-N Gly-Asp-Tyr Chemical compound NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 QGZSAHIZRQHCEQ-QWRGUYRKSA-N 0.000 description 1
- YZACQYVWLCQWBT-BQBZGAKWSA-N Gly-Cys-Arg Chemical compound [H]NCC(=O)N[C@@H](CS)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O YZACQYVWLCQWBT-BQBZGAKWSA-N 0.000 description 1
- CEXINUGNTZFNRY-BYPYZUCNSA-N Gly-Cys-Gly Chemical compound [NH3+]CC(=O)N[C@@H](CS)C(=O)NCC([O-])=O CEXINUGNTZFNRY-BYPYZUCNSA-N 0.000 description 1
- BULIVUZUDBHKKZ-WDSKDSINSA-N Gly-Gln-Asn Chemical compound NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O BULIVUZUDBHKKZ-WDSKDSINSA-N 0.000 description 1
- YZPVGIVFMZLQMM-YUMQZZPRSA-N Gly-Gln-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)CN YZPVGIVFMZLQMM-YUMQZZPRSA-N 0.000 description 1
- NPSWCZIRBAYNSB-JHEQGTHGSA-N Gly-Gln-Thr Chemical compound [H]NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NPSWCZIRBAYNSB-JHEQGTHGSA-N 0.000 description 1
- MOJKRXIRAZPZLW-WDSKDSINSA-N Gly-Glu-Ala Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O MOJKRXIRAZPZLW-WDSKDSINSA-N 0.000 description 1
- YYPFZVIXAVDHIK-IUCAKERBSA-N Gly-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)CN YYPFZVIXAVDHIK-IUCAKERBSA-N 0.000 description 1
- KMSGYZQRXPUKGI-BYPYZUCNSA-N Gly-Gly-Asn Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC(N)=O KMSGYZQRXPUKGI-BYPYZUCNSA-N 0.000 description 1
- XMPXVJIDADUOQB-RCOVLWMOSA-N Gly-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C([O-])=O)NC(=O)CNC(=O)C[NH3+] XMPXVJIDADUOQB-RCOVLWMOSA-N 0.000 description 1
- QPCVIQJVRGXUSA-LURJTMIESA-N Gly-Gly-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)CNC(=O)CN QPCVIQJVRGXUSA-LURJTMIESA-N 0.000 description 1
- INLIXXRWNUKVCF-JTQLQIEISA-N Gly-Gly-Tyr Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 INLIXXRWNUKVCF-JTQLQIEISA-N 0.000 description 1
- ZKLYPEGLWFVRGF-IUCAKERBSA-N Gly-His-Gln Chemical compound [H]NCC(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(N)=O)C(O)=O ZKLYPEGLWFVRGF-IUCAKERBSA-N 0.000 description 1
- SWQALSGKVLYKDT-ZKWXMUAHSA-N Gly-Ile-Ala Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O SWQALSGKVLYKDT-ZKWXMUAHSA-N 0.000 description 1
- SWQALSGKVLYKDT-UHFFFAOYSA-N Gly-Ile-Ala Natural products NCC(=O)NC(C(C)CC)C(=O)NC(C)C(O)=O SWQALSGKVLYKDT-UHFFFAOYSA-N 0.000 description 1
- ITZOBNKQDZEOCE-NHCYSSNCSA-N Gly-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)CN ITZOBNKQDZEOCE-NHCYSSNCSA-N 0.000 description 1
- ZOTGXWMKUFSKEU-QXEWZRGKSA-N Gly-Ile-Met Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCSC)C(O)=O ZOTGXWMKUFSKEU-QXEWZRGKSA-N 0.000 description 1
- BHPQOIPBLYJNAW-NGZCFLSTSA-N Gly-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN BHPQOIPBLYJNAW-NGZCFLSTSA-N 0.000 description 1
- NSTUFLGQJCOCDL-UWVGGRQHSA-N Gly-Leu-Arg Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N NSTUFLGQJCOCDL-UWVGGRQHSA-N 0.000 description 1
- YTSVAIMKVLZUDU-YUMQZZPRSA-N Gly-Leu-Asp Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O YTSVAIMKVLZUDU-YUMQZZPRSA-N 0.000 description 1
- TWTPDFFBLQEBOE-IUCAKERBSA-N Gly-Leu-Gln Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O TWTPDFFBLQEBOE-IUCAKERBSA-N 0.000 description 1
- LIXWIUAORXJNBH-QWRGUYRKSA-N Gly-Leu-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)CN LIXWIUAORXJNBH-QWRGUYRKSA-N 0.000 description 1
- LRQXRHGQEVWGPV-NHCYSSNCSA-N Gly-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)CN LRQXRHGQEVWGPV-NHCYSSNCSA-N 0.000 description 1
- NNCSJUBVFBDDLC-YUMQZZPRSA-N Gly-Leu-Ser Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O NNCSJUBVFBDDLC-YUMQZZPRSA-N 0.000 description 1
- JPAACTMBBBGAAR-HOTGVXAUSA-N Gly-Leu-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](NC(=O)CN)CC(C)C)C(O)=O)=CNC2=C1 JPAACTMBBBGAAR-HOTGVXAUSA-N 0.000 description 1
- MIIVFRCYJABHTQ-ONGXEEELSA-N Gly-Leu-Val Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O MIIVFRCYJABHTQ-ONGXEEELSA-N 0.000 description 1
- IUKIDFVOUHZRAK-QWRGUYRKSA-N Gly-Lys-His Chemical compound NCCCC[C@H](NC(=O)CN)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 IUKIDFVOUHZRAK-QWRGUYRKSA-N 0.000 description 1
- PTIIBFKSLCYQBO-NHCYSSNCSA-N Gly-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)CN PTIIBFKSLCYQBO-NHCYSSNCSA-N 0.000 description 1
- MHZXESQPPXOING-KBPBESRZSA-N Gly-Lys-Phe Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O MHZXESQPPXOING-KBPBESRZSA-N 0.000 description 1
- OQQKUTVULYLCDG-ONGXEEELSA-N Gly-Lys-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCCCN)NC(=O)CN)C(O)=O OQQKUTVULYLCDG-ONGXEEELSA-N 0.000 description 1
- SJLKKOZFHSJJAW-YUMQZZPRSA-N Gly-Met-Glu Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)CN SJLKKOZFHSJJAW-YUMQZZPRSA-N 0.000 description 1
- WMGHDYWNHNLGBV-ONGXEEELSA-N Gly-Phe-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 WMGHDYWNHNLGBV-ONGXEEELSA-N 0.000 description 1
- IALQAMYQJBZNSK-WHFBIAKZSA-N Gly-Ser-Asn Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O IALQAMYQJBZNSK-WHFBIAKZSA-N 0.000 description 1
- VNNRLUNBJSWZPF-ZKWXMUAHSA-N Gly-Ser-Ile Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VNNRLUNBJSWZPF-ZKWXMUAHSA-N 0.000 description 1
- ABPRMMYHROQBLY-NKWVEPMBSA-N Gly-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)CN)C(=O)O ABPRMMYHROQBLY-NKWVEPMBSA-N 0.000 description 1
- WCORRBXVISTKQL-WHFBIAKZSA-N Gly-Ser-Ser Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WCORRBXVISTKQL-WHFBIAKZSA-N 0.000 description 1
- LCRDMSSAKLTKBU-ZDLURKLDSA-N Gly-Ser-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)CN LCRDMSSAKLTKBU-ZDLURKLDSA-N 0.000 description 1
- OCRQUYDOYKCOQG-IRXDYDNUSA-N Gly-Tyr-Phe Chemical compound C([C@H](NC(=O)CN)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 OCRQUYDOYKCOQG-IRXDYDNUSA-N 0.000 description 1
- AFMOTCMSEBITOE-YEPSODPASA-N Gly-Val-Thr Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O AFMOTCMSEBITOE-YEPSODPASA-N 0.000 description 1
- KSOBNUBCYHGUKH-UWVGGRQHSA-N Gly-Val-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)CN KSOBNUBCYHGUKH-UWVGGRQHSA-N 0.000 description 1
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 1
- 108010068250 Herpes Simplex Virus Protein Vmw65 Proteins 0.000 description 1
- TVQGUFGDVODUIF-LSJOCFKGSA-N His-Arg-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC1=CN=CN1)N TVQGUFGDVODUIF-LSJOCFKGSA-N 0.000 description 1
- DFHVLUKTTVTCKY-PBCZWWQYSA-N His-Asn-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CN=CN1)N)O DFHVLUKTTVTCKY-PBCZWWQYSA-N 0.000 description 1
- MDBYBTWRMOAJAY-NHCYSSNCSA-N His-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CN=CN1)N MDBYBTWRMOAJAY-NHCYSSNCSA-N 0.000 description 1
- FYVHHKMHFPMBBG-GUBZILKMSA-N His-Gln-Asp Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N FYVHHKMHFPMBBG-GUBZILKMSA-N 0.000 description 1
- ZNNNYCXPCKACHX-DCAQKATOSA-N His-Gln-Gln Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O ZNNNYCXPCKACHX-DCAQKATOSA-N 0.000 description 1
- FIMNVXRZGUAGBI-AVGNSLFASA-N His-Glu-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O FIMNVXRZGUAGBI-AVGNSLFASA-N 0.000 description 1
- WGHJXSONOOTTCZ-JYJNAYRXSA-N His-Glu-Tyr Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O WGHJXSONOOTTCZ-JYJNAYRXSA-N 0.000 description 1
- STWGDDDFLUFCCA-GVXVVHGQSA-N His-Glu-Val Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O STWGDDDFLUFCCA-GVXVVHGQSA-N 0.000 description 1
- MPXGJGBXCRQQJE-MXAVVETBSA-N His-Ile-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O MPXGJGBXCRQQJE-MXAVVETBSA-N 0.000 description 1
- SKOKHBGDXGTDDP-MELADBBJSA-N His-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N SKOKHBGDXGTDDP-MELADBBJSA-N 0.000 description 1
- LVWIJITYHRZHBO-IXOXFDKPSA-N His-Leu-Thr Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LVWIJITYHRZHBO-IXOXFDKPSA-N 0.000 description 1
- UXSATKFPUVZVDK-KKUMJFAQSA-N His-Lys-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC1=CN=CN1)N UXSATKFPUVZVDK-KKUMJFAQSA-N 0.000 description 1
- LNDVNHOSZQPJGI-AVGNSLFASA-N His-Pro-Pro Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)N1[C@@H](CCC1)C(O)=O)C1=CN=CN1 LNDVNHOSZQPJGI-AVGNSLFASA-N 0.000 description 1
- CWSZWFILCNSNEX-CIUDSAMLSA-N His-Ser-Asn Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)N)C(=O)O)N CWSZWFILCNSNEX-CIUDSAMLSA-N 0.000 description 1
- ZHHLTWUOWXHVQJ-YUMQZZPRSA-N His-Ser-Gly Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CO)C(=O)NCC(=O)O)N ZHHLTWUOWXHVQJ-YUMQZZPRSA-N 0.000 description 1
- PZAJPILZRFPYJJ-SRVKXCTJSA-N His-Ser-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O PZAJPILZRFPYJJ-SRVKXCTJSA-N 0.000 description 1
- VIJMRAIWYWRXSR-CIUDSAMLSA-N His-Ser-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CN=CN1 VIJMRAIWYWRXSR-CIUDSAMLSA-N 0.000 description 1
- UPJODPVSKKWGDQ-KLHWPWHYSA-N His-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N)O UPJODPVSKKWGDQ-KLHWPWHYSA-N 0.000 description 1
- ZNTSGDNUITWTRA-WDSOQIARSA-N His-Trp-Val Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C(C)C)C(O)=O ZNTSGDNUITWTRA-WDSOQIARSA-N 0.000 description 1
- KDDKJKKQODQQBR-NHCYSSNCSA-N His-Val-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N KDDKJKKQODQQBR-NHCYSSNCSA-N 0.000 description 1
- FFYYUUWROYYKFY-IHRRRGAJSA-N His-Val-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O FFYYUUWROYYKFY-IHRRRGAJSA-N 0.000 description 1
- XGBVLRJLHUVCNK-DCAQKATOSA-N His-Val-Ser Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O XGBVLRJLHUVCNK-DCAQKATOSA-N 0.000 description 1
- VSZALHITQINTGC-GHCJXIJMSA-N Ile-Ala-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC(=O)O)C(=O)O)N VSZALHITQINTGC-GHCJXIJMSA-N 0.000 description 1
- TZCGZYWNIDZZMR-UHFFFAOYSA-N Ile-Arg-Ala Natural products CCC(C)C(N)C(=O)NC(C(=O)NC(C)C(O)=O)CCCN=C(N)N TZCGZYWNIDZZMR-UHFFFAOYSA-N 0.000 description 1
- QLRMMMQNCWBNPQ-QXEWZRGKSA-N Ile-Arg-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)NCC(=O)O)N QLRMMMQNCWBNPQ-QXEWZRGKSA-N 0.000 description 1
- UKTUOMWSJPXODT-GUDRVLHUSA-N Ile-Asn-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N UKTUOMWSJPXODT-GUDRVLHUSA-N 0.000 description 1
- HVWXAQVMRBKKFE-UGYAYLCHSA-N Ile-Asp-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N HVWXAQVMRBKKFE-UGYAYLCHSA-N 0.000 description 1
- HGNUKGZQASSBKQ-PCBIJLKTSA-N Ile-Asp-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N HGNUKGZQASSBKQ-PCBIJLKTSA-N 0.000 description 1
- AQTWDZDISVGCAC-CFMVVWHZSA-N Ile-Asp-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N AQTWDZDISVGCAC-CFMVVWHZSA-N 0.000 description 1
- LLZLRXBTOOFODM-QSFUFRPTSA-N Ile-Asp-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](C(C)C)C(=O)O)N LLZLRXBTOOFODM-QSFUFRPTSA-N 0.000 description 1
- KMBPQYKVZBMRMH-PEFMBERDSA-N Ile-Gln-Asn Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O KMBPQYKVZBMRMH-PEFMBERDSA-N 0.000 description 1
- KUHFPGIVBOCRMV-MNXVOIDGSA-N Ile-Gln-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(C)C)C(=O)O)N KUHFPGIVBOCRMV-MNXVOIDGSA-N 0.000 description 1
- OVPYIUNCVSOVNF-ZPFDUUQYSA-N Ile-Gln-Pro Natural products CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N1CCC[C@H]1C(O)=O OVPYIUNCVSOVNF-ZPFDUUQYSA-N 0.000 description 1
- WZDCVAWMBUNDDY-KBIXCLLPSA-N Ile-Glu-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](C)C(=O)O)N WZDCVAWMBUNDDY-KBIXCLLPSA-N 0.000 description 1
- JDAWAWXGAUZPNJ-ZPFDUUQYSA-N Ile-Glu-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N JDAWAWXGAUZPNJ-ZPFDUUQYSA-N 0.000 description 1
- UBHUJPVCJHPSEU-GRLWGSQLSA-N Ile-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N UBHUJPVCJHPSEU-GRLWGSQLSA-N 0.000 description 1
- LPXHYGGZJOCAFR-MNXVOIDGSA-N Ile-Glu-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N LPXHYGGZJOCAFR-MNXVOIDGSA-N 0.000 description 1
- XLCZWMJPVGRWHJ-KQXIARHKSA-N Ile-Glu-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N XLCZWMJPVGRWHJ-KQXIARHKSA-N 0.000 description 1
- PNDMHTTXXPUQJH-RWRJDSDZSA-N Ile-Glu-Thr Chemical compound N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H]([C@H](O)C)C(=O)O PNDMHTTXXPUQJH-RWRJDSDZSA-N 0.000 description 1
- SLQVFYWBGNNOTK-BYULHYEWSA-N Ile-Gly-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)N)C(=O)O)N SLQVFYWBGNNOTK-BYULHYEWSA-N 0.000 description 1
- DFFTXLCCDFYRKD-MBLNEYKQSA-N Ile-Gly-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(=O)O)N DFFTXLCCDFYRKD-MBLNEYKQSA-N 0.000 description 1
- HYLIOBDWPQNLKI-HVTMNAMFSA-N Ile-His-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N HYLIOBDWPQNLKI-HVTMNAMFSA-N 0.000 description 1
- WIZPFZKOFZXDQG-HTFCKZLJSA-N Ile-Ile-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O WIZPFZKOFZXDQG-HTFCKZLJSA-N 0.000 description 1
- KYLIZSDYWQQTFM-PEDHHIEDSA-N Ile-Ile-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CCCN=C(N)N KYLIZSDYWQQTFM-PEDHHIEDSA-N 0.000 description 1
- HUORUFRRJHELPD-MNXVOIDGSA-N Ile-Leu-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N HUORUFRRJHELPD-MNXVOIDGSA-N 0.000 description 1
- PMMMQRVUMVURGJ-XUXIUFHCSA-N Ile-Leu-Pro Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(O)=O PMMMQRVUMVURGJ-XUXIUFHCSA-N 0.000 description 1
- NZGTYCMLUGYMCV-XUXIUFHCSA-N Ile-Lys-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N NZGTYCMLUGYMCV-XUXIUFHCSA-N 0.000 description 1
- ADDYYRVQQZFIMW-MNXVOIDGSA-N Ile-Lys-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ADDYYRVQQZFIMW-MNXVOIDGSA-N 0.000 description 1
- YSGBJIQXTIVBHZ-AJNGGQMLSA-N Ile-Lys-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O YSGBJIQXTIVBHZ-AJNGGQMLSA-N 0.000 description 1
- GLYJPWIRLBAIJH-UHFFFAOYSA-N Ile-Lys-Pro Natural products CCC(C)C(N)C(=O)NC(CCCCN)C(=O)N1CCCC1C(O)=O GLYJPWIRLBAIJH-UHFFFAOYSA-N 0.000 description 1
- HQEPKOFULQTSFV-JURCDPSOSA-N Ile-Phe-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(=O)O)N HQEPKOFULQTSFV-JURCDPSOSA-N 0.000 description 1
- FHPZJWJWTWZKNA-LLLHUVSDSA-N Ile-Phe-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N2CCC[C@@H]2C(=O)O)N FHPZJWJWTWZKNA-LLLHUVSDSA-N 0.000 description 1
- XHBYEMIUENPZLY-GMOBBJLQSA-N Ile-Pro-Asn Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O XHBYEMIUENPZLY-GMOBBJLQSA-N 0.000 description 1
- BJECXJHLUJXPJQ-PYJNHQTQSA-N Ile-Pro-His Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N BJECXJHLUJXPJQ-PYJNHQTQSA-N 0.000 description 1
- ZNOBVZFCHNHKHA-KBIXCLLPSA-N Ile-Ser-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ZNOBVZFCHNHKHA-KBIXCLLPSA-N 0.000 description 1
- ZLFNNVATRMCAKN-ZKWXMUAHSA-N Ile-Ser-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)NCC(=O)O)N ZLFNNVATRMCAKN-ZKWXMUAHSA-N 0.000 description 1
- ZDNNDIJTUHQCAM-MXAVVETBSA-N Ile-Ser-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N ZDNNDIJTUHQCAM-MXAVVETBSA-N 0.000 description 1
- JNLSTRPWUXOORL-MMWGEVLESA-N Ile-Ser-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N JNLSTRPWUXOORL-MMWGEVLESA-N 0.000 description 1
- KBDIBHQICWDGDL-PPCPHDFISA-N Ile-Thr-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N KBDIBHQICWDGDL-PPCPHDFISA-N 0.000 description 1
- WXLYNEHOGRYNFU-URLPEUOOSA-N Ile-Thr-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)N WXLYNEHOGRYNFU-URLPEUOOSA-N 0.000 description 1
- WCNWGAUZWWSYDG-SVSWQMSJSA-N Ile-Thr-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)O)N WCNWGAUZWWSYDG-SVSWQMSJSA-N 0.000 description 1
- CZOAJJGXTGUYOJ-SPOWBLRKSA-N Ile-Trp-Cys Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)[C@@H](C)CC)C(=O)N[C@@H](CS)C(O)=O)=CNC2=C1 CZOAJJGXTGUYOJ-SPOWBLRKSA-N 0.000 description 1
- DTPGSUQHUMELQB-GVARAGBVSA-N Ile-Tyr-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CC=C(O)C=C1 DTPGSUQHUMELQB-GVARAGBVSA-N 0.000 description 1
- NJGXXYLPDMMFJB-XUXIUFHCSA-N Ile-Val-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N NJGXXYLPDMMFJB-XUXIUFHCSA-N 0.000 description 1
- RQZFWBLDTBDEOF-RNJOBUHISA-N Ile-Val-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N RQZFWBLDTBDEOF-RNJOBUHISA-N 0.000 description 1
- HGCNKOLVKRAVHD-UHFFFAOYSA-N L-Met-L-Phe Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 HGCNKOLVKRAVHD-UHFFFAOYSA-N 0.000 description 1
- SITWEMZOJNKJCH-UHFFFAOYSA-N L-alanine-L-arginine Natural products CC(N)C(=O)NC(C(O)=O)CCCNC(N)=N SITWEMZOJNKJCH-UHFFFAOYSA-N 0.000 description 1
- RCFDOSNHHZGBOY-UHFFFAOYSA-N L-isoleucyl-L-alanine Natural products CCC(C)C(N)C(=O)NC(C)C(O)=O RCFDOSNHHZGBOY-UHFFFAOYSA-N 0.000 description 1
- LJHGALIOHLRRQN-DCAQKATOSA-N Leu-Ala-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N LJHGALIOHLRRQN-DCAQKATOSA-N 0.000 description 1
- CZCSUZMIRKFFFA-CIUDSAMLSA-N Leu-Ala-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O CZCSUZMIRKFFFA-CIUDSAMLSA-N 0.000 description 1
- KVRKAGGMEWNURO-CIUDSAMLSA-N Leu-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(C)C)N KVRKAGGMEWNURO-CIUDSAMLSA-N 0.000 description 1
- BQSLGJHIAGOZCD-CIUDSAMLSA-N Leu-Ala-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O BQSLGJHIAGOZCD-CIUDSAMLSA-N 0.000 description 1
- GRZSCTXVCDUIPO-SRVKXCTJSA-N Leu-Arg-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O GRZSCTXVCDUIPO-SRVKXCTJSA-N 0.000 description 1
- FJUKMPUELVROGK-IHRRRGAJSA-N Leu-Arg-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N FJUKMPUELVROGK-IHRRRGAJSA-N 0.000 description 1
- YOZCKMXHBYKOMQ-IHRRRGAJSA-N Leu-Arg-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(=O)O)N YOZCKMXHBYKOMQ-IHRRRGAJSA-N 0.000 description 1
- IBMVEYRWAWIOTN-RWMBFGLXSA-N Leu-Arg-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N1CCC[C@@H]1C(O)=O IBMVEYRWAWIOTN-RWMBFGLXSA-N 0.000 description 1
- XYUBOFCTGPZFSA-WDSOQIARSA-N Leu-Arg-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CC(C)C)C(O)=O)=CNC2=C1 XYUBOFCTGPZFSA-WDSOQIARSA-N 0.000 description 1
- DUBAVOVZNZKEQQ-AVGNSLFASA-N Leu-Arg-Val Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CCCN=C(N)N DUBAVOVZNZKEQQ-AVGNSLFASA-N 0.000 description 1
- WUFYAPWIHCUMLL-CIUDSAMLSA-N Leu-Asn-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O WUFYAPWIHCUMLL-CIUDSAMLSA-N 0.000 description 1
- DBVWMYGBVFCRBE-CIUDSAMLSA-N Leu-Asn-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O DBVWMYGBVFCRBE-CIUDSAMLSA-N 0.000 description 1
- OIARJGNVARWKFP-YUMQZZPRSA-N Leu-Asn-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O OIARJGNVARWKFP-YUMQZZPRSA-N 0.000 description 1
- WGNOPSQMIQERPK-GARJFASQSA-N Leu-Asn-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N1CCC[C@@H]1C(=O)O)N WGNOPSQMIQERPK-GARJFASQSA-N 0.000 description 1
- OGCQGUIWMSBHRZ-CIUDSAMLSA-N Leu-Asn-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O OGCQGUIWMSBHRZ-CIUDSAMLSA-N 0.000 description 1
- IASQBRJGRVXNJI-YUMQZZPRSA-N Leu-Cys-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)NCC(O)=O IASQBRJGRVXNJI-YUMQZZPRSA-N 0.000 description 1
- LJKJVTCIRDCITR-SRVKXCTJSA-N Leu-Cys-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N LJKJVTCIRDCITR-SRVKXCTJSA-N 0.000 description 1
- FOEHRHOBWFQSNW-KATARQTJSA-N Leu-Cys-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CC(C)C)N)O FOEHRHOBWFQSNW-KATARQTJSA-N 0.000 description 1
- AXZGZMGRBDQTEY-SRVKXCTJSA-N Leu-Gln-Met Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(O)=O AXZGZMGRBDQTEY-SRVKXCTJSA-N 0.000 description 1
- CQGSYZCULZMEDE-SRVKXCTJSA-N Leu-Gln-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N1CCC[C@H]1C(O)=O CQGSYZCULZMEDE-SRVKXCTJSA-N 0.000 description 1
- CQGSYZCULZMEDE-UHFFFAOYSA-N Leu-Gln-Pro Natural products CC(C)CC(N)C(=O)NC(CCC(N)=O)C(=O)N1CCCC1C(O)=O CQGSYZCULZMEDE-UHFFFAOYSA-N 0.000 description 1
- KUEVMUXNILMJTK-JYJNAYRXSA-N Leu-Gln-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 KUEVMUXNILMJTK-JYJNAYRXSA-N 0.000 description 1
- NEEOBPIXKWSBRF-IUCAKERBSA-N Leu-Glu-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O NEEOBPIXKWSBRF-IUCAKERBSA-N 0.000 description 1
- QVFGXCVIXXBFHO-AVGNSLFASA-N Leu-Glu-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O QVFGXCVIXXBFHO-AVGNSLFASA-N 0.000 description 1
- OGUUKPXUTHOIAV-SDDRHHMPSA-N Leu-Glu-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N OGUUKPXUTHOIAV-SDDRHHMPSA-N 0.000 description 1
- ZFNLIDNJUWNIJL-WDCWCFNPSA-N Leu-Glu-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZFNLIDNJUWNIJL-WDCWCFNPSA-N 0.000 description 1
- BABSVXFGKFLIGW-UWVGGRQHSA-N Leu-Gly-Arg Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCNC(N)=N BABSVXFGKFLIGW-UWVGGRQHSA-N 0.000 description 1
- LAPSXOAUPNOINL-YUMQZZPRSA-N Leu-Gly-Asp Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O LAPSXOAUPNOINL-YUMQZZPRSA-N 0.000 description 1
- KGCLIYGPQXUNLO-IUCAKERBSA-N Leu-Gly-Glu Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O KGCLIYGPQXUNLO-IUCAKERBSA-N 0.000 description 1
- VWHGTYCRDRBSFI-ZETCQYMHSA-N Leu-Gly-Gly Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)NCC(O)=O VWHGTYCRDRBSFI-ZETCQYMHSA-N 0.000 description 1
- CCQLQKZTXZBXTN-NHCYSSNCSA-N Leu-Gly-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O CCQLQKZTXZBXTN-NHCYSSNCSA-N 0.000 description 1
- PBGDOSARRIJMEV-DLOVCJGASA-N Leu-His-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(O)=O PBGDOSARRIJMEV-DLOVCJGASA-N 0.000 description 1
- BTNXKBVLWJBTNR-SRVKXCTJSA-N Leu-His-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(N)=O)C(O)=O BTNXKBVLWJBTNR-SRVKXCTJSA-N 0.000 description 1
- YWYQSLOTVIRCFE-SRVKXCTJSA-N Leu-His-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(O)=O YWYQSLOTVIRCFE-SRVKXCTJSA-N 0.000 description 1
- BKTXKJMNTSMJDQ-AVGNSLFASA-N Leu-His-Gln Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N BKTXKJMNTSMJDQ-AVGNSLFASA-N 0.000 description 1
- KXODZBLFVFSLAI-AVGNSLFASA-N Leu-His-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(C)C)CC1=CN=CN1 KXODZBLFVFSLAI-AVGNSLFASA-N 0.000 description 1
- KVOFSTUWVSQMDK-KKUMJFAQSA-N Leu-His-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(C)C)CC1=CN=CN1 KVOFSTUWVSQMDK-KKUMJFAQSA-N 0.000 description 1
- QJXHMYMRGDOHRU-NHCYSSNCSA-N Leu-Ile-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O QJXHMYMRGDOHRU-NHCYSSNCSA-N 0.000 description 1
- HNDWYLYAYNBWMP-AJNGGQMLSA-N Leu-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(C)C)N HNDWYLYAYNBWMP-AJNGGQMLSA-N 0.000 description 1
- HRTRLSRYZZKPCO-BJDJZHNGSA-N Leu-Ile-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O HRTRLSRYZZKPCO-BJDJZHNGSA-N 0.000 description 1
- DSFYPIUSAMSERP-IHRRRGAJSA-N Leu-Leu-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DSFYPIUSAMSERP-IHRRRGAJSA-N 0.000 description 1
- QNBVTHNJGCOVFA-AVGNSLFASA-N Leu-Leu-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCC(O)=O QNBVTHNJGCOVFA-AVGNSLFASA-N 0.000 description 1
- FAELBUXXFQLUAX-AJNGGQMLSA-N Leu-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(C)C FAELBUXXFQLUAX-AJNGGQMLSA-N 0.000 description 1
- LXKNSJLSGPNHSK-KKUMJFAQSA-N Leu-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N LXKNSJLSGPNHSK-KKUMJFAQSA-N 0.000 description 1
- RXGLHDWAZQECBI-SRVKXCTJSA-N Leu-Leu-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O RXGLHDWAZQECBI-SRVKXCTJSA-N 0.000 description 1
- IEWBEPKLKUXQBU-VOAKCMCISA-N Leu-Leu-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O IEWBEPKLKUXQBU-VOAKCMCISA-N 0.000 description 1
- FKQPWMZLIIATBA-AJNGGQMLSA-N Leu-Lys-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FKQPWMZLIIATBA-AJNGGQMLSA-N 0.000 description 1
- KPYAOIVPJKPIOU-KKUMJFAQSA-N Leu-Lys-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O KPYAOIVPJKPIOU-KKUMJFAQSA-N 0.000 description 1
- LZHJZLHSRGWBBE-IHRRRGAJSA-N Leu-Lys-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O LZHJZLHSRGWBBE-IHRRRGAJSA-N 0.000 description 1
- HDHQQEDVWQGBEE-DCAQKATOSA-N Leu-Met-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(O)=O HDHQQEDVWQGBEE-DCAQKATOSA-N 0.000 description 1
- NJMXCOOEFLMZSR-AVGNSLFASA-N Leu-Met-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C(C)C)C(O)=O NJMXCOOEFLMZSR-AVGNSLFASA-N 0.000 description 1
- ZDBMWELMUCLUPL-QEJZJMRPSA-N Leu-Phe-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CC=CC=C1 ZDBMWELMUCLUPL-QEJZJMRPSA-N 0.000 description 1
- BIZNDKMFQHDOIE-KKUMJFAQSA-N Leu-Phe-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(O)=O)CC1=CC=CC=C1 BIZNDKMFQHDOIE-KKUMJFAQSA-N 0.000 description 1
- ZAVCJRJOQKIOJW-KKUMJFAQSA-N Leu-Phe-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)CC1=CC=CC=C1 ZAVCJRJOQKIOJW-KKUMJFAQSA-N 0.000 description 1
- KQFZKDITNUEVFJ-JYJNAYRXSA-N Leu-Phe-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(C)C)CC1=CC=CC=C1 KQFZKDITNUEVFJ-JYJNAYRXSA-N 0.000 description 1
- AIRUUHAOKGVJAD-JYJNAYRXSA-N Leu-Phe-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O AIRUUHAOKGVJAD-JYJNAYRXSA-N 0.000 description 1
- PJWOOBTYQNNRBF-BZSNNMDCSA-N Leu-Phe-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)O)N PJWOOBTYQNNRBF-BZSNNMDCSA-N 0.000 description 1
- MJWVXZABPOKJJF-ACRUOGEOSA-N Leu-Phe-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O MJWVXZABPOKJJF-ACRUOGEOSA-N 0.000 description 1
- UHNQRAFSEBGZFZ-YESZJQIVSA-N Leu-Phe-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N2CCC[C@@H]2C(=O)O)N UHNQRAFSEBGZFZ-YESZJQIVSA-N 0.000 description 1
- FYPWFNKQVVEELI-ULQDDVLXSA-N Leu-Phe-Val Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C(C)C)C(O)=O)CC1=CC=CC=C1 FYPWFNKQVVEELI-ULQDDVLXSA-N 0.000 description 1
- BMVFXOQHDQZAQU-DCAQKATOSA-N Leu-Pro-Asp Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(=O)O)C(=O)O)N BMVFXOQHDQZAQU-DCAQKATOSA-N 0.000 description 1
- XWEVVRRSIOBJOO-SRVKXCTJSA-N Leu-Pro-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O XWEVVRRSIOBJOO-SRVKXCTJSA-N 0.000 description 1
- XXXXOVFBXRERQL-ULQDDVLXSA-N Leu-Pro-Phe Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XXXXOVFBXRERQL-ULQDDVLXSA-N 0.000 description 1
- IRMLZWSRWSGTOP-CIUDSAMLSA-N Leu-Ser-Ala Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O IRMLZWSRWSGTOP-CIUDSAMLSA-N 0.000 description 1
- XOWMDXHFSBCAKQ-SRVKXCTJSA-N Leu-Ser-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC(C)C XOWMDXHFSBCAKQ-SRVKXCTJSA-N 0.000 description 1
- IWMJFLJQHIDZQW-KKUMJFAQSA-N Leu-Ser-Phe Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 IWMJFLJQHIDZQW-KKUMJFAQSA-N 0.000 description 1
- SBANPBVRHYIMRR-UHFFFAOYSA-N Leu-Ser-Pro Natural products CC(C)CC(N)C(=O)NC(CO)C(=O)N1CCCC1C(O)=O SBANPBVRHYIMRR-UHFFFAOYSA-N 0.000 description 1
- BRTVHXHCUSXYRI-CIUDSAMLSA-N Leu-Ser-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O BRTVHXHCUSXYRI-CIUDSAMLSA-N 0.000 description 1
- PPGBXYKMUMHFBF-KATARQTJSA-N Leu-Ser-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PPGBXYKMUMHFBF-KATARQTJSA-N 0.000 description 1
- SVBJIZVVYJYGLA-DCAQKATOSA-N Leu-Ser-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O SVBJIZVVYJYGLA-DCAQKATOSA-N 0.000 description 1
- ICYRCNICGBJLGM-HJGDQZAQSA-N Leu-Thr-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(O)=O ICYRCNICGBJLGM-HJGDQZAQSA-N 0.000 description 1
- LCNASHSOFMRYFO-WDCWCFNPSA-N Leu-Thr-Gln Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCC(N)=O LCNASHSOFMRYFO-WDCWCFNPSA-N 0.000 description 1
- QWWPYKKLXWOITQ-VOAKCMCISA-N Leu-Thr-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(C)C QWWPYKKLXWOITQ-VOAKCMCISA-N 0.000 description 1
- HGLKOTPFWOMPOB-MEYUZBJRSA-N Leu-Thr-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 HGLKOTPFWOMPOB-MEYUZBJRSA-N 0.000 description 1
- AIQWYVFNBNNOLU-RHYQMDGZSA-N Leu-Thr-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O AIQWYVFNBNNOLU-RHYQMDGZSA-N 0.000 description 1
- URJUVJDTPXCQFL-IHPCNDPISA-N Leu-Trp-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CC3=CN=CN3)C(=O)O)N URJUVJDTPXCQFL-IHPCNDPISA-N 0.000 description 1
- UCRJTSIIAYHOHE-ULQDDVLXSA-N Leu-Tyr-Arg Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N UCRJTSIIAYHOHE-ULQDDVLXSA-N 0.000 description 1
- VJGQRELPQWNURN-JYJNAYRXSA-N Leu-Tyr-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O VJGQRELPQWNURN-JYJNAYRXSA-N 0.000 description 1
- BTEMNFBEAAOGBR-BZSNNMDCSA-N Leu-Tyr-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCCN)C(=O)O)N BTEMNFBEAAOGBR-BZSNNMDCSA-N 0.000 description 1
- VUBIPAHVHMZHCM-KKUMJFAQSA-N Leu-Tyr-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CC=C(O)C=C1 VUBIPAHVHMZHCM-KKUMJFAQSA-N 0.000 description 1
- FBNPMTNBFFAMMH-AVGNSLFASA-N Leu-Val-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-AVGNSLFASA-N 0.000 description 1
- FBNPMTNBFFAMMH-UHFFFAOYSA-N Leu-Val-Arg Natural products CC(C)CC(N)C(=O)NC(C(C)C)C(=O)NC(C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-UHFFFAOYSA-N 0.000 description 1
- YQFZRHYZLARWDY-IHRRRGAJSA-N Leu-Val-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN YQFZRHYZLARWDY-IHRRRGAJSA-N 0.000 description 1
- FZIJIFCXUCZHOL-CIUDSAMLSA-N Lys-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN FZIJIFCXUCZHOL-CIUDSAMLSA-N 0.000 description 1
- NFLFJGGKOHYZJF-BJDJZHNGSA-N Lys-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN NFLFJGGKOHYZJF-BJDJZHNGSA-N 0.000 description 1
- WSXTWLJHTLRFLW-SRVKXCTJSA-N Lys-Ala-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O WSXTWLJHTLRFLW-SRVKXCTJSA-N 0.000 description 1
- VHXMZJGOKIMETG-CQDKDKBSSA-N Lys-Ala-Tyr Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CCCCN)N VHXMZJGOKIMETG-CQDKDKBSSA-N 0.000 description 1
- IRNSXVOWSXSULE-DCAQKATOSA-N Lys-Ala-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN IRNSXVOWSXSULE-DCAQKATOSA-N 0.000 description 1
- FACUGMGEFUEBTI-SRVKXCTJSA-N Lys-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CCCCN FACUGMGEFUEBTI-SRVKXCTJSA-N 0.000 description 1
- ZQCVMVCVPFYXHZ-SRVKXCTJSA-N Lys-Asn-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCCN ZQCVMVCVPFYXHZ-SRVKXCTJSA-N 0.000 description 1
- SQXUUGUCGJSWCK-CIUDSAMLSA-N Lys-Asp-Cys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CS)C(=O)O)N SQXUUGUCGJSWCK-CIUDSAMLSA-N 0.000 description 1
- WGCKDDHUFPQSMZ-ZPFDUUQYSA-N Lys-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCCCN WGCKDDHUFPQSMZ-ZPFDUUQYSA-N 0.000 description 1
- IWWMPCPLFXFBAF-SRVKXCTJSA-N Lys-Asp-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O IWWMPCPLFXFBAF-SRVKXCTJSA-N 0.000 description 1
- GKFNXYMAMKJSKD-NHCYSSNCSA-N Lys-Asp-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O GKFNXYMAMKJSKD-NHCYSSNCSA-N 0.000 description 1
- DFXQCCBKGUNYGG-GUBZILKMSA-N Lys-Gln-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CCCCN DFXQCCBKGUNYGG-GUBZILKMSA-N 0.000 description 1
- RZHLIPMZXOEJTL-AVGNSLFASA-N Lys-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCCN)N RZHLIPMZXOEJTL-AVGNSLFASA-N 0.000 description 1
- PGBPWPTUOSCNLE-JYJNAYRXSA-N Lys-Gln-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCCN)N PGBPWPTUOSCNLE-JYJNAYRXSA-N 0.000 description 1
- NDORZBUHCOJQDO-GVXVVHGQSA-N Lys-Gln-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O NDORZBUHCOJQDO-GVXVVHGQSA-N 0.000 description 1
- GCMWRRQAKQXDED-IUCAKERBSA-N Lys-Glu-Gly Chemical compound [NH3+]CCCC[C@H]([NH3+])C(=O)N[C@@H](CCC([O-])=O)C(=O)NCC([O-])=O GCMWRRQAKQXDED-IUCAKERBSA-N 0.000 description 1
- PAMDBWYMLWOELY-SDDRHHMPSA-N Lys-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCCN)N)C(=O)O PAMDBWYMLWOELY-SDDRHHMPSA-N 0.000 description 1
- VEGLGAOVLFODGC-GUBZILKMSA-N Lys-Glu-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O VEGLGAOVLFODGC-GUBZILKMSA-N 0.000 description 1
- ODUQLUADRKMHOZ-JYJNAYRXSA-N Lys-Glu-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCCN)N)O ODUQLUADRKMHOZ-JYJNAYRXSA-N 0.000 description 1
- ULUQBUKAPDUKOC-GVXVVHGQSA-N Lys-Glu-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O ULUQBUKAPDUKOC-GVXVVHGQSA-N 0.000 description 1
- ITWQLSZTLBKWJM-YUMQZZPRSA-N Lys-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CCCCN ITWQLSZTLBKWJM-YUMQZZPRSA-N 0.000 description 1
- GQFDWEDHOQRNLC-QWRGUYRKSA-N Lys-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN GQFDWEDHOQRNLC-QWRGUYRKSA-N 0.000 description 1
- SQJSXOQXJYAVRV-SRVKXCTJSA-N Lys-His-Asn Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N SQJSXOQXJYAVRV-SRVKXCTJSA-N 0.000 description 1
- QBEPTBMRQALPEV-MNXVOIDGSA-N Lys-Ile-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCCCN QBEPTBMRQALPEV-MNXVOIDGSA-N 0.000 description 1
- IZJGPPIGYTVXLB-FQUUOJAGSA-N Lys-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N IZJGPPIGYTVXLB-FQUUOJAGSA-N 0.000 description 1
- MYZMQWHPDAYKIE-SRVKXCTJSA-N Lys-Leu-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O MYZMQWHPDAYKIE-SRVKXCTJSA-N 0.000 description 1
- NJNRBRKHOWSGMN-SRVKXCTJSA-N Lys-Leu-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O NJNRBRKHOWSGMN-SRVKXCTJSA-N 0.000 description 1
- AIRZWUMAHCDDHR-KKUMJFAQSA-N Lys-Leu-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O AIRZWUMAHCDDHR-KKUMJFAQSA-N 0.000 description 1
- ORVFEGYUJITPGI-IHRRRGAJSA-N Lys-Leu-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CCCCN ORVFEGYUJITPGI-IHRRRGAJSA-N 0.000 description 1
- LJADEBULDNKJNK-IHRRRGAJSA-N Lys-Leu-Val Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O LJADEBULDNKJNK-IHRRRGAJSA-N 0.000 description 1
- ZJWIXBZTAAJERF-IHRRRGAJSA-N Lys-Lys-Arg Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CCCN=C(N)N ZJWIXBZTAAJERF-IHRRRGAJSA-N 0.000 description 1
- ALGGDNMLQNFVIZ-SRVKXCTJSA-N Lys-Lys-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)O)C(=O)O)N ALGGDNMLQNFVIZ-SRVKXCTJSA-N 0.000 description 1
- UQRZFMQQXXJTTF-AVGNSLFASA-N Lys-Lys-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O UQRZFMQQXXJTTF-AVGNSLFASA-N 0.000 description 1
- VSTNAUBHKQPVJX-IHRRRGAJSA-N Lys-Met-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O VSTNAUBHKQPVJX-IHRRRGAJSA-N 0.000 description 1
- TWPCWKVOZDUYAA-KKUMJFAQSA-N Lys-Phe-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O TWPCWKVOZDUYAA-KKUMJFAQSA-N 0.000 description 1
- MSSJJDVQTFTLIF-KBPBESRZSA-N Lys-Phe-Gly Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](Cc1ccccc1)C(=O)NCC(O)=O MSSJJDVQTFTLIF-KBPBESRZSA-N 0.000 description 1
- OBZHNHBAAVEWKI-DCAQKATOSA-N Lys-Pro-Asn Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O OBZHNHBAAVEWKI-DCAQKATOSA-N 0.000 description 1
- CNGOEHJCLVCJHN-SRVKXCTJSA-N Lys-Pro-Glu Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O CNGOEHJCLVCJHN-SRVKXCTJSA-N 0.000 description 1
- YTJFXEDRUOQGSP-DCAQKATOSA-N Lys-Pro-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O YTJFXEDRUOQGSP-DCAQKATOSA-N 0.000 description 1
- MIROMRNASYKZNL-ULQDDVLXSA-N Lys-Pro-Tyr Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 MIROMRNASYKZNL-ULQDDVLXSA-N 0.000 description 1
- HKXSZKJMDBHOTG-CIUDSAMLSA-N Lys-Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CCCCN HKXSZKJMDBHOTG-CIUDSAMLSA-N 0.000 description 1
- MGKFCQFVPKOWOL-CIUDSAMLSA-N Lys-Ser-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(=O)O)C(=O)O)N MGKFCQFVPKOWOL-CIUDSAMLSA-N 0.000 description 1
- IOQWIOPSKJOEKI-SRVKXCTJSA-N Lys-Ser-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O IOQWIOPSKJOEKI-SRVKXCTJSA-N 0.000 description 1
- ZUGVARDEGWMMLK-SRVKXCTJSA-N Lys-Ser-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN ZUGVARDEGWMMLK-SRVKXCTJSA-N 0.000 description 1
- QVTDVTONTRSQMF-WDCWCFNPSA-N Lys-Thr-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CCCCN QVTDVTONTRSQMF-WDCWCFNPSA-N 0.000 description 1
- CAVRAQIDHUPECU-UVOCVTCTSA-N Lys-Thr-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CAVRAQIDHUPECU-UVOCVTCTSA-N 0.000 description 1
- YFQSSOAGMZGXFT-MEYUZBJRSA-N Lys-Thr-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O YFQSSOAGMZGXFT-MEYUZBJRSA-N 0.000 description 1
- RMKJOQSYLQQRFN-KKUMJFAQSA-N Lys-Tyr-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O RMKJOQSYLQQRFN-KKUMJFAQSA-N 0.000 description 1
- NQOQDINRVQCAKD-ULQDDVLXSA-N Lys-Tyr-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CCCCN)N NQOQDINRVQCAKD-ULQDDVLXSA-N 0.000 description 1
- SQRLLZAQNOQCEG-KKUMJFAQSA-N Lys-Tyr-Ser Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CC=C(O)C=C1 SQRLLZAQNOQCEG-KKUMJFAQSA-N 0.000 description 1
- MDDUIRLQCYVRDO-NHCYSSNCSA-N Lys-Val-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN MDDUIRLQCYVRDO-NHCYSSNCSA-N 0.000 description 1
- QFSYGUMEANRNJE-DCAQKATOSA-N Lys-Val-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCCN)N QFSYGUMEANRNJE-DCAQKATOSA-N 0.000 description 1
- DRRXXZBXDMLGFC-IHRRRGAJSA-N Lys-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN DRRXXZBXDMLGFC-IHRRRGAJSA-N 0.000 description 1
- TXTZMVNJIRZABH-ULQDDVLXSA-N Lys-Val-Phe Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 TXTZMVNJIRZABH-ULQDDVLXSA-N 0.000 description 1
- RIPJMCFGQHGHNP-RHYQMDGZSA-N Lys-Val-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CCCCN)N)O RIPJMCFGQHGHNP-RHYQMDGZSA-N 0.000 description 1
- ONGCSGVHCSAATF-CIUDSAMLSA-N Met-Ala-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(O)=O ONGCSGVHCSAATF-CIUDSAMLSA-N 0.000 description 1
- HUKLXYYPZWPXCC-KZVJFYERSA-N Met-Ala-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O HUKLXYYPZWPXCC-KZVJFYERSA-N 0.000 description 1
- CWFYZYQMUDWGTI-GUBZILKMSA-N Met-Arg-Asp Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O CWFYZYQMUDWGTI-GUBZILKMSA-N 0.000 description 1
- IIPHCNKHEZYSNE-DCAQKATOSA-N Met-Arg-Gln Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O IIPHCNKHEZYSNE-DCAQKATOSA-N 0.000 description 1
- CTVJSFRHUOSCQQ-DCAQKATOSA-N Met-Arg-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O CTVJSFRHUOSCQQ-DCAQKATOSA-N 0.000 description 1
- IVCPHARVJUYDPA-FXQIFTODSA-N Met-Asn-Asp Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N IVCPHARVJUYDPA-FXQIFTODSA-N 0.000 description 1
- YNOVBMBQSQTLFM-DCAQKATOSA-N Met-Asn-Leu Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O YNOVBMBQSQTLFM-DCAQKATOSA-N 0.000 description 1
- CAODKDAPYGUMLK-FXQIFTODSA-N Met-Asn-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O CAODKDAPYGUMLK-FXQIFTODSA-N 0.000 description 1
- UZVWDRPUTHXQAM-FXQIFTODSA-N Met-Asp-Ala Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O UZVWDRPUTHXQAM-FXQIFTODSA-N 0.000 description 1
- SQUTUWHAAWJYES-GUBZILKMSA-N Met-Asp-Arg Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O SQUTUWHAAWJYES-GUBZILKMSA-N 0.000 description 1
- GODBLDDYHFTUAH-CIUDSAMLSA-N Met-Asp-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(O)=O GODBLDDYHFTUAH-CIUDSAMLSA-N 0.000 description 1
- FJVJLMZUIGMFFU-BQBZGAKWSA-N Met-Asp-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O FJVJLMZUIGMFFU-BQBZGAKWSA-N 0.000 description 1
- NCFZHKMKRCYQBJ-CIUDSAMLSA-N Met-Cys-Gln Chemical compound CSCC[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N NCFZHKMKRCYQBJ-CIUDSAMLSA-N 0.000 description 1
- OXHSZBRPUGNMKW-DCAQKATOSA-N Met-Gln-Arg Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O OXHSZBRPUGNMKW-DCAQKATOSA-N 0.000 description 1
- FWTBMGAKKPSTBT-GUBZILKMSA-N Met-Gln-Glu Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O FWTBMGAKKPSTBT-GUBZILKMSA-N 0.000 description 1
- DJDFBVNNDAUPRW-GUBZILKMSA-N Met-Glu-Gln Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O DJDFBVNNDAUPRW-GUBZILKMSA-N 0.000 description 1
- YORIKIDJCPKBON-YUMQZZPRSA-N Met-Glu-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O YORIKIDJCPKBON-YUMQZZPRSA-N 0.000 description 1
- GVIVXNFKJQFTCE-YUMQZZPRSA-N Met-Gly-Gln Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O GVIVXNFKJQFTCE-YUMQZZPRSA-N 0.000 description 1
- BMHIFARYXOJDLD-WPRPVWTQSA-N Met-Gly-Val Chemical compound [H]N[C@@H](CCSC)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O BMHIFARYXOJDLD-WPRPVWTQSA-N 0.000 description 1
- CFRRIZLGFGJEDB-SRVKXCTJSA-N Met-His-Gln Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(N)=O)C(O)=O CFRRIZLGFGJEDB-SRVKXCTJSA-N 0.000 description 1
- XPCLRYNQMZOOFB-ULQDDVLXSA-N Met-His-Phe Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)O)N XPCLRYNQMZOOFB-ULQDDVLXSA-N 0.000 description 1
- AEQVPPGEJJBFEE-CYDGBPFRSA-N Met-Ile-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O AEQVPPGEJJBFEE-CYDGBPFRSA-N 0.000 description 1
- JHDNAOVJJQSMMM-GMOBBJLQSA-N Met-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCSC)N JHDNAOVJJQSMMM-GMOBBJLQSA-N 0.000 description 1
- UROWNMBTQGGTHB-DCAQKATOSA-N Met-Leu-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O UROWNMBTQGGTHB-DCAQKATOSA-N 0.000 description 1
- ZIIMORLEZLVRIP-SRVKXCTJSA-N Met-Leu-Gln Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O ZIIMORLEZLVRIP-SRVKXCTJSA-N 0.000 description 1
- SODXFJOPSCXOHE-IHRRRGAJSA-N Met-Leu-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O SODXFJOPSCXOHE-IHRRRGAJSA-N 0.000 description 1
- KMSMNUFBNCHMII-IHRRRGAJSA-N Met-Leu-Lys Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCCN KMSMNUFBNCHMII-IHRRRGAJSA-N 0.000 description 1
- DBXMFHGGHMXYHY-DCAQKATOSA-N Met-Leu-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O DBXMFHGGHMXYHY-DCAQKATOSA-N 0.000 description 1
- JCMMNFZUKMMECJ-DCAQKATOSA-N Met-Lys-Asn Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O JCMMNFZUKMMECJ-DCAQKATOSA-N 0.000 description 1
- YYEIFXZOBZVDPH-DCAQKATOSA-N Met-Lys-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O YYEIFXZOBZVDPH-DCAQKATOSA-N 0.000 description 1
- CGUYGMFQZCYJSG-DCAQKATOSA-N Met-Lys-Ser Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O CGUYGMFQZCYJSG-DCAQKATOSA-N 0.000 description 1
- WUYLWZRHRLLEGB-AVGNSLFASA-N Met-Met-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O WUYLWZRHRLLEGB-AVGNSLFASA-N 0.000 description 1
- MIAZEQZXAFTCCG-UBHSHLNASA-N Met-Phe-Ala Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CC=CC=C1 MIAZEQZXAFTCCG-UBHSHLNASA-N 0.000 description 1
- NHXXGBXJTLRGJI-GUBZILKMSA-N Met-Pro-Ser Chemical compound [H]N[C@@H](CCSC)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O NHXXGBXJTLRGJI-GUBZILKMSA-N 0.000 description 1
- HLZORBMOISUNIV-DCAQKATOSA-N Met-Ser-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC(C)C HLZORBMOISUNIV-DCAQKATOSA-N 0.000 description 1
- RIIFMEBFDDXGCV-VEVYYDQMSA-N Met-Thr-Asn Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(N)=O RIIFMEBFDDXGCV-VEVYYDQMSA-N 0.000 description 1
- OVTOTTGZBWXLFU-QXEWZRGKSA-N Met-Val-Asp Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(O)=O OVTOTTGZBWXLFU-QXEWZRGKSA-N 0.000 description 1
- FSTWDRPCQQUJIT-NHCYSSNCSA-N Met-Val-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCSC)N FSTWDRPCQQUJIT-NHCYSSNCSA-N 0.000 description 1
- VEKRTVRZDMUOQN-AVGNSLFASA-N Met-Val-His Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CNC=N1 VEKRTVRZDMUOQN-AVGNSLFASA-N 0.000 description 1
- QAVZUKIPOMBLMC-AVGNSLFASA-N Met-Val-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC(C)C QAVZUKIPOMBLMC-AVGNSLFASA-N 0.000 description 1
- WYBVBIHNJWOLCJ-UHFFFAOYSA-N N-L-arginyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCCN=C(N)N WYBVBIHNJWOLCJ-UHFFFAOYSA-N 0.000 description 1
- 108010087066 N2-tryptophyllysine Proteins 0.000 description 1
- 108010065395 Neuropep-1 Proteins 0.000 description 1
- 240000007594 Oryza sativa Species 0.000 description 1
- 235000007164 Oryza sativa Nutrition 0.000 description 1
- 108020002230 Pancreatic Ribonuclease Proteins 0.000 description 1
- 102000005891 Pancreatic ribonuclease Human genes 0.000 description 1
- BJEYSVHMGIJORT-NHCYSSNCSA-N Phe-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 BJEYSVHMGIJORT-NHCYSSNCSA-N 0.000 description 1
- FPTXMUIBLMGTQH-ONGXEEELSA-N Phe-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 FPTXMUIBLMGTQH-ONGXEEELSA-N 0.000 description 1
- SEPNOAFMZLLCEW-UBHSHLNASA-N Phe-Ala-Val Chemical compound N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)O SEPNOAFMZLLCEW-UBHSHLNASA-N 0.000 description 1
- YYRCPTVAPLQRNC-ULQDDVLXSA-N Phe-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@@H](N)CC1=CC=CC=C1 YYRCPTVAPLQRNC-ULQDDVLXSA-N 0.000 description 1
- PLNHHOXNVSYKOB-JYJNAYRXSA-N Phe-Arg-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC1=CC=CC=C1)N PLNHHOXNVSYKOB-JYJNAYRXSA-N 0.000 description 1
- HTTYNOXBBOWZTB-SRVKXCTJSA-N Phe-Asn-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N HTTYNOXBBOWZTB-SRVKXCTJSA-N 0.000 description 1
- LDSOBEJVGGVWGD-DLOVCJGASA-N Phe-Asp-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 LDSOBEJVGGVWGD-DLOVCJGASA-N 0.000 description 1
- RJYBHZVWJPUSLB-QEWYBTABSA-N Phe-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CC=CC=C1)N RJYBHZVWJPUSLB-QEWYBTABSA-N 0.000 description 1
- NKLDZIPTGKBDBB-HTUGSXCWSA-N Phe-Gln-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CC=CC=C1)N)O NKLDZIPTGKBDBB-HTUGSXCWSA-N 0.000 description 1
- KYYMILWEGJYPQZ-IHRRRGAJSA-N Phe-Glu-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 KYYMILWEGJYPQZ-IHRRRGAJSA-N 0.000 description 1
- AKJAKCBHLJGRBU-JYJNAYRXSA-N Phe-Glu-His Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N AKJAKCBHLJGRBU-JYJNAYRXSA-N 0.000 description 1
- MGECUMGTSHYHEJ-QEWYBTABSA-N Phe-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 MGECUMGTSHYHEJ-QEWYBTABSA-N 0.000 description 1
- JWQWPTLEOFNCGX-AVGNSLFASA-N Phe-Glu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 JWQWPTLEOFNCGX-AVGNSLFASA-N 0.000 description 1
- JJHVFCUWLSKADD-ONGXEEELSA-N Phe-Gly-Ala Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H](C)C(O)=O JJHVFCUWLSKADD-ONGXEEELSA-N 0.000 description 1
- HBGFEEQFVBWYJQ-KBPBESRZSA-N Phe-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=CC=C1 HBGFEEQFVBWYJQ-KBPBESRZSA-N 0.000 description 1
- FXYXBEZMRACDDR-KKUMJFAQSA-N Phe-His-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(O)=O)C(O)=O FXYXBEZMRACDDR-KKUMJFAQSA-N 0.000 description 1
- PPHFTNABKQRAJV-JYJNAYRXSA-N Phe-His-Gln Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N PPHFTNABKQRAJV-JYJNAYRXSA-N 0.000 description 1
- VADLTGVIOIOKGM-BZSNNMDCSA-N Phe-His-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC=CC=1)C1=CN=CN1 VADLTGVIOIOKGM-BZSNNMDCSA-N 0.000 description 1
- KDYPMIZMXDECSU-JYJNAYRXSA-N Phe-Leu-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 KDYPMIZMXDECSU-JYJNAYRXSA-N 0.000 description 1
- LRBSWBVUCLLRLU-BZSNNMDCSA-N Phe-Leu-Lys Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)Cc1ccccc1)C(=O)N[C@@H](CCCCN)C(O)=O LRBSWBVUCLLRLU-BZSNNMDCSA-N 0.000 description 1
- YCCUXNNKXDGMAM-KKUMJFAQSA-N Phe-Leu-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YCCUXNNKXDGMAM-KKUMJFAQSA-N 0.000 description 1
- AUJWXNGCAQWLEI-KBPBESRZSA-N Phe-Lys-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O AUJWXNGCAQWLEI-KBPBESRZSA-N 0.000 description 1
- KLXQWABNAWDRAY-ACRUOGEOSA-N Phe-Lys-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 KLXQWABNAWDRAY-ACRUOGEOSA-N 0.000 description 1
- FUAIIFPQELBNJF-ULQDDVLXSA-N Phe-Met-Lys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N FUAIIFPQELBNJF-ULQDDVLXSA-N 0.000 description 1
- YVIVIQWMNCWUFS-UFYCRDLUSA-N Phe-Met-Tyr Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N YVIVIQWMNCWUFS-UFYCRDLUSA-N 0.000 description 1
- JKJSIYKSGIDHPM-WBAXXEDZSA-N Phe-Phe-Ala Chemical compound C[C@H](NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@@H](N)Cc1ccccc1)C(O)=O JKJSIYKSGIDHPM-WBAXXEDZSA-N 0.000 description 1
- WWPAHTZOWURIMR-ULQDDVLXSA-N Phe-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC1=CC=CC=C1 WWPAHTZOWURIMR-ULQDDVLXSA-N 0.000 description 1
- YMIZSYUAZJSOFL-SRVKXCTJSA-N Phe-Ser-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O YMIZSYUAZJSOFL-SRVKXCTJSA-N 0.000 description 1
- IPFXYNKCXYGSSV-KKUMJFAQSA-N Phe-Ser-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N IPFXYNKCXYGSSV-KKUMJFAQSA-N 0.000 description 1
- GLJZDMZJHFXJQG-BZSNNMDCSA-N Phe-Ser-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O GLJZDMZJHFXJQG-BZSNNMDCSA-N 0.000 description 1
- MCIXMYKSPQUMJG-SRVKXCTJSA-N Phe-Ser-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O MCIXMYKSPQUMJG-SRVKXCTJSA-N 0.000 description 1
- XNQMZHLAYFWSGJ-HTUGSXCWSA-N Phe-Thr-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O XNQMZHLAYFWSGJ-HTUGSXCWSA-N 0.000 description 1
- BSKMOCNNLNDIMU-CDMKHQONSA-N Phe-Thr-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O BSKMOCNNLNDIMU-CDMKHQONSA-N 0.000 description 1
- AGTHXWTYCLLYMC-FHWLQOOXSA-N Phe-Tyr-Glu Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CCC(O)=O)C(O)=O)C1=CC=CC=C1 AGTHXWTYCLLYMC-FHWLQOOXSA-N 0.000 description 1
- 241000219000 Populus Species 0.000 description 1
- APKRGYLBSCWJJP-FXQIFTODSA-N Pro-Ala-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O APKRGYLBSCWJJP-FXQIFTODSA-N 0.000 description 1
- AJLVKXCNXIJHDV-CIUDSAMLSA-N Pro-Ala-Gln Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O AJLVKXCNXIJHDV-CIUDSAMLSA-N 0.000 description 1
- LCRSGSIRKLXZMZ-BPNCWPANSA-N Pro-Ala-Tyr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O LCRSGSIRKLXZMZ-BPNCWPANSA-N 0.000 description 1
- VCYJKOLZYPYGJV-AVGNSLFASA-N Pro-Arg-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O VCYJKOLZYPYGJV-AVGNSLFASA-N 0.000 description 1
- KDIIENQUNVNWHR-JYJNAYRXSA-N Pro-Arg-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O KDIIENQUNVNWHR-JYJNAYRXSA-N 0.000 description 1
- ICTZKEXYDDZZFP-SRVKXCTJSA-N Pro-Arg-Pro Chemical compound N([C@@H](CCCN=C(N)N)C(=O)N1[C@@H](CCC1)C(O)=O)C(=O)[C@@H]1CCCN1 ICTZKEXYDDZZFP-SRVKXCTJSA-N 0.000 description 1
- UVKNEILZSJMKSR-FXQIFTODSA-N Pro-Asn-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H]1CCCN1 UVKNEILZSJMKSR-FXQIFTODSA-N 0.000 description 1
- AMBLXEMWFARNNQ-DCAQKATOSA-N Pro-Asn-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@@H]1CCCN1 AMBLXEMWFARNNQ-DCAQKATOSA-N 0.000 description 1
- UTAUEDINXUMHLG-FXQIFTODSA-N Pro-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@@H]1CCCN1 UTAUEDINXUMHLG-FXQIFTODSA-N 0.000 description 1
- JARJPEMLQAWNBR-GUBZILKMSA-N Pro-Asp-Arg Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JARJPEMLQAWNBR-GUBZILKMSA-N 0.000 description 1
- ZCXQTRXYZOSGJR-FXQIFTODSA-N Pro-Asp-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O ZCXQTRXYZOSGJR-FXQIFTODSA-N 0.000 description 1
- LSIWVWRUTKPXDS-DCAQKATOSA-N Pro-Gln-Arg Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O LSIWVWRUTKPXDS-DCAQKATOSA-N 0.000 description 1
- ODPIUQVTULPQEP-CIUDSAMLSA-N Pro-Gln-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@@H]1CCCN1 ODPIUQVTULPQEP-CIUDSAMLSA-N 0.000 description 1
- WFHYFCWBLSKEMS-KKUMJFAQSA-N Pro-Glu-Phe Chemical compound N([C@@H](CCC(=O)O)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C(=O)[C@@H]1CCCN1 WFHYFCWBLSKEMS-KKUMJFAQSA-N 0.000 description 1
- UEHYFUCOGHWASA-HJGDQZAQSA-N Pro-Glu-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1 UEHYFUCOGHWASA-HJGDQZAQSA-N 0.000 description 1
- VYWNORHENYEQDW-YUMQZZPRSA-N Pro-Gly-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 VYWNORHENYEQDW-YUMQZZPRSA-N 0.000 description 1
- LCUOTSLIVGSGAU-AVGNSLFASA-N Pro-His-Arg Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O LCUOTSLIVGSGAU-AVGNSLFASA-N 0.000 description 1
- FKVNLUZHSFCNGY-RVMXOQNASA-N Pro-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@@H]2CCCN2 FKVNLUZHSFCNGY-RVMXOQNASA-N 0.000 description 1
- FKYKZHOKDOPHSA-DCAQKATOSA-N Pro-Leu-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O FKYKZHOKDOPHSA-DCAQKATOSA-N 0.000 description 1
- JUJCUYWRJMFJJF-AVGNSLFASA-N Pro-Lys-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H]1CCCN1 JUJCUYWRJMFJJF-AVGNSLFASA-N 0.000 description 1
- VWHJZETTZDAGOM-XUXIUFHCSA-N Pro-Lys-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VWHJZETTZDAGOM-XUXIUFHCSA-N 0.000 description 1
- MHHQQZIFLWFZGR-DCAQKATOSA-N Pro-Lys-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O MHHQQZIFLWFZGR-DCAQKATOSA-N 0.000 description 1
- ZUZINZIJHJFJRN-UBHSHLNASA-N Pro-Phe-Ala Chemical compound C([C@@H](C(=O)N[C@@H](C)C(O)=O)NC(=O)[C@H]1NCCC1)C1=CC=CC=C1 ZUZINZIJHJFJRN-UBHSHLNASA-N 0.000 description 1
- JLMZKEQFMVORMA-SRVKXCTJSA-N Pro-Pro-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 JLMZKEQFMVORMA-SRVKXCTJSA-N 0.000 description 1
- GFHOSBYCLACKEK-GUBZILKMSA-N Pro-Pro-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O GFHOSBYCLACKEK-GUBZILKMSA-N 0.000 description 1
- DWPXHLIBFQLKLK-CYDGBPFRSA-N Pro-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 DWPXHLIBFQLKLK-CYDGBPFRSA-N 0.000 description 1
- SBVPYBFMIGDIDX-SRVKXCTJSA-N Pro-Pro-Pro Chemical compound OC(=O)[C@@H]1CCCN1C(=O)[C@H]1N(C(=O)[C@H]2NCCC2)CCC1 SBVPYBFMIGDIDX-SRVKXCTJSA-N 0.000 description 1
- SEZGGSHLMROBFX-CIUDSAMLSA-N Pro-Ser-Gln Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O SEZGGSHLMROBFX-CIUDSAMLSA-N 0.000 description 1
- FDMCIBSQRKFSTJ-RHYQMDGZSA-N Pro-Thr-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O FDMCIBSQRKFSTJ-RHYQMDGZSA-N 0.000 description 1
- AJJDPGVVNPUZCR-RHYQMDGZSA-N Pro-Thr-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@@H]1CCCN1)O AJJDPGVVNPUZCR-RHYQMDGZSA-N 0.000 description 1
- NBDHWLZEMKSVHH-UVBJJODRSA-N Pro-Trp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@@H]3CCCN3 NBDHWLZEMKSVHH-UVBJJODRSA-N 0.000 description 1
- YIPFBJGBRCJJJD-FHWLQOOXSA-N Pro-Trp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@@H]3CCCN3 YIPFBJGBRCJJJD-FHWLQOOXSA-N 0.000 description 1
- WWXNZNWZNZPDIF-SRVKXCTJSA-N Pro-Val-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 WWXNZNWZNZPDIF-SRVKXCTJSA-N 0.000 description 1
- IIRBTQHFVNGPMQ-AVGNSLFASA-N Pro-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@@H]1CCCN1 IIRBTQHFVNGPMQ-AVGNSLFASA-N 0.000 description 1
- 102000017742 Pumilio homology domains Human genes 0.000 description 1
- 108050005947 Pumilio homology domains Proteins 0.000 description 1
- CZPWVGJYEJSRLH-UHFFFAOYSA-N Pyrimidine Chemical compound C1=CN=CN=C1 CZPWVGJYEJSRLH-UHFFFAOYSA-N 0.000 description 1
- 102000009572 RNA Polymerase II Human genes 0.000 description 1
- 108010009460 RNA Polymerase II Proteins 0.000 description 1
- 238000010357 RNA editing Methods 0.000 description 1
- 230000026279 RNA modification Effects 0.000 description 1
- 101710105008 RNA-binding protein Proteins 0.000 description 1
- 241000195974 Selaginella Species 0.000 description 1
- FIXILCYTSAUERA-FXQIFTODSA-N Ser-Ala-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FIXILCYTSAUERA-FXQIFTODSA-N 0.000 description 1
- SRTCFKGBYBZRHA-ACZMJKKPSA-N Ser-Ala-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SRTCFKGBYBZRHA-ACZMJKKPSA-N 0.000 description 1
- BTKUIVBNGBFTTP-WHFBIAKZSA-N Ser-Ala-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)NCC(O)=O BTKUIVBNGBFTTP-WHFBIAKZSA-N 0.000 description 1
- DWUIECHTAMYEFL-XVYDVKMFSA-N Ser-Ala-His Chemical compound OC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 DWUIECHTAMYEFL-XVYDVKMFSA-N 0.000 description 1
- WTWGOQRNRFHFQD-JBDRJPRFSA-N Ser-Ala-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WTWGOQRNRFHFQD-JBDRJPRFSA-N 0.000 description 1
- IYCBDVBJWDXQRR-FXQIFTODSA-N Ser-Ala-Met Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CCSC)C(O)=O IYCBDVBJWDXQRR-FXQIFTODSA-N 0.000 description 1
- IDQFQFVEWMWRQQ-DLOVCJGASA-N Ser-Ala-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O IDQFQFVEWMWRQQ-DLOVCJGASA-N 0.000 description 1
- BRKHVZNDAOMAHX-BIIVOSGPSA-N Ser-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N BRKHVZNDAOMAHX-BIIVOSGPSA-N 0.000 description 1
- JJKSSJVYOVRJMZ-FXQIFTODSA-N Ser-Arg-Cys Chemical compound C(C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CO)N)CN=C(N)N JJKSSJVYOVRJMZ-FXQIFTODSA-N 0.000 description 1
- HQTKVSCNCDLXSX-BQBZGAKWSA-N Ser-Arg-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O HQTKVSCNCDLXSX-BQBZGAKWSA-N 0.000 description 1
- WDXYVIIVDIDOSX-DCAQKATOSA-N Ser-Arg-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CCCN=C(N)N WDXYVIIVDIDOSX-DCAQKATOSA-N 0.000 description 1
- QVOGDCQNGLBNCR-FXQIFTODSA-N Ser-Arg-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O QVOGDCQNGLBNCR-FXQIFTODSA-N 0.000 description 1
- OOKCGAYXSNJBGQ-ZLUOBGJFSA-N Ser-Asn-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O OOKCGAYXSNJBGQ-ZLUOBGJFSA-N 0.000 description 1
- BCKYYTVFBXHPOG-ACZMJKKPSA-N Ser-Asn-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CO)N BCKYYTVFBXHPOG-ACZMJKKPSA-N 0.000 description 1
- RDFQNDHEHVSONI-ZLUOBGJFSA-N Ser-Asn-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O RDFQNDHEHVSONI-ZLUOBGJFSA-N 0.000 description 1
- UGJRQLURDVGULT-LKXGYXEUSA-N Ser-Asn-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O UGJRQLURDVGULT-LKXGYXEUSA-N 0.000 description 1
- CTLVSHXLRVEILB-UBHSHLNASA-N Ser-Asn-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CO)N CTLVSHXLRVEILB-UBHSHLNASA-N 0.000 description 1
- MESDJCNHLZBMEP-ZLUOBGJFSA-N Ser-Asp-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O MESDJCNHLZBMEP-ZLUOBGJFSA-N 0.000 description 1
- SFZKGGOGCNQPJY-CIUDSAMLSA-N Ser-Asp-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CO)N SFZKGGOGCNQPJY-CIUDSAMLSA-N 0.000 description 1
- BTPAWKABYQMKKN-LKXGYXEUSA-N Ser-Asp-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BTPAWKABYQMKKN-LKXGYXEUSA-N 0.000 description 1
- INCNPLPRPOYTJI-JBDRJPRFSA-N Ser-Cys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CO)N INCNPLPRPOYTJI-JBDRJPRFSA-N 0.000 description 1
- IXUGADGDCQDLSA-FXQIFTODSA-N Ser-Gln-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CO)N IXUGADGDCQDLSA-FXQIFTODSA-N 0.000 description 1
- LALNXSXEYFUUDD-GUBZILKMSA-N Ser-Glu-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LALNXSXEYFUUDD-GUBZILKMSA-N 0.000 description 1
- QKQDTEYDEIJPNK-GUBZILKMSA-N Ser-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CO QKQDTEYDEIJPNK-GUBZILKMSA-N 0.000 description 1
- UFKPDBLKLOBMRH-XHNCKOQMSA-N Ser-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CO)N)C(=O)O UFKPDBLKLOBMRH-XHNCKOQMSA-N 0.000 description 1
- BRIZMMZEYSAKJX-QEJZJMRPSA-N Ser-Glu-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CO)N BRIZMMZEYSAKJX-QEJZJMRPSA-N 0.000 description 1
- UQFYNFTYDHUIMI-WHFBIAKZSA-N Ser-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CO UQFYNFTYDHUIMI-WHFBIAKZSA-N 0.000 description 1
- BPMRXBZYPGYPJN-WHFBIAKZSA-N Ser-Gly-Asn Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O BPMRXBZYPGYPJN-WHFBIAKZSA-N 0.000 description 1
- WSTIOCFMWXNOCX-YUMQZZPRSA-N Ser-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CO)N WSTIOCFMWXNOCX-YUMQZZPRSA-N 0.000 description 1
- JFWDJFULOLKQFY-QWRGUYRKSA-N Ser-Gly-Phe Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JFWDJFULOLKQFY-QWRGUYRKSA-N 0.000 description 1
- CXBFHZLODKPIJY-AAEUAGOBSA-N Ser-Gly-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CO)N CXBFHZLODKPIJY-AAEUAGOBSA-N 0.000 description 1
- OQPNSDWGAMFJNU-QWRGUYRKSA-N Ser-Gly-Tyr Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 OQPNSDWGAMFJNU-QWRGUYRKSA-N 0.000 description 1
- XXXAXOWMBOKTRN-XPUUQOCRSA-N Ser-Gly-Val Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O XXXAXOWMBOKTRN-XPUUQOCRSA-N 0.000 description 1
- HMRAQFJFTOLDKW-GUBZILKMSA-N Ser-His-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(O)=O)C(O)=O HMRAQFJFTOLDKW-GUBZILKMSA-N 0.000 description 1
- IOVBCLGAJJXOHK-SRVKXCTJSA-N Ser-His-His Chemical compound C([C@H](NC(=O)[C@H](CO)N)C(=O)N[C@@H](CC=1NC=NC=1)C(O)=O)C1=CN=CN1 IOVBCLGAJJXOHK-SRVKXCTJSA-N 0.000 description 1
- CICQXRWZNVXFCU-SRVKXCTJSA-N Ser-His-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(O)=O CICQXRWZNVXFCU-SRVKXCTJSA-N 0.000 description 1
- JEHPKECJCALLRW-CUJWVEQBSA-N Ser-His-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JEHPKECJCALLRW-CUJWVEQBSA-N 0.000 description 1
- SFTZTYBXIXLRGQ-JBDRJPRFSA-N Ser-Ile-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O SFTZTYBXIXLRGQ-JBDRJPRFSA-N 0.000 description 1
- HBTCFCHYALPXME-HTFCKZLJSA-N Ser-Ile-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HBTCFCHYALPXME-HTFCKZLJSA-N 0.000 description 1
- RIAKPZVSNBBNRE-BJDJZHNGSA-N Ser-Ile-Leu Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O RIAKPZVSNBBNRE-BJDJZHNGSA-N 0.000 description 1
- YMDNFPNTIPQMJP-NAKRPEOUSA-N Ser-Ile-Met Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCSC)C(O)=O YMDNFPNTIPQMJP-NAKRPEOUSA-N 0.000 description 1
- UIPXCLNLUUAMJU-JBDRJPRFSA-N Ser-Ile-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O UIPXCLNLUUAMJU-JBDRJPRFSA-N 0.000 description 1
- MQQBBLVOUUJKLH-HJPIBITLSA-N Ser-Ile-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MQQBBLVOUUJKLH-HJPIBITLSA-N 0.000 description 1
- DOSZISJPMCYEHT-NAKRPEOUSA-N Ser-Ile-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O DOSZISJPMCYEHT-NAKRPEOUSA-N 0.000 description 1
- GJFYFGOEWLDQGW-GUBZILKMSA-N Ser-Leu-Gln Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CO)N GJFYFGOEWLDQGW-GUBZILKMSA-N 0.000 description 1
- XNCUYZKGQOCOQH-YUMQZZPRSA-N Ser-Leu-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O XNCUYZKGQOCOQH-YUMQZZPRSA-N 0.000 description 1
- YUJLIIRMIAGMCQ-CIUDSAMLSA-N Ser-Leu-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YUJLIIRMIAGMCQ-CIUDSAMLSA-N 0.000 description 1
- JLPMFVAIQHCBDC-CIUDSAMLSA-N Ser-Lys-Cys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CO)N JLPMFVAIQHCBDC-CIUDSAMLSA-N 0.000 description 1
- BYCVMHKULKRVPV-GUBZILKMSA-N Ser-Lys-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O BYCVMHKULKRVPV-GUBZILKMSA-N 0.000 description 1
- PTWIYDNFWPXQSD-GARJFASQSA-N Ser-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CO)N)C(=O)O PTWIYDNFWPXQSD-GARJFASQSA-N 0.000 description 1
- FPCGZYMRFFIYIH-CIUDSAMLSA-N Ser-Lys-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O FPCGZYMRFFIYIH-CIUDSAMLSA-N 0.000 description 1
- PMCMLDNPAZUYGI-DCAQKATOSA-N Ser-Lys-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O PMCMLDNPAZUYGI-DCAQKATOSA-N 0.000 description 1
- XNXRTQZTFVMJIJ-DCAQKATOSA-N Ser-Met-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O XNXRTQZTFVMJIJ-DCAQKATOSA-N 0.000 description 1
- XVWDJUROVRQKAE-KKUMJFAQSA-N Ser-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CO)CC1=CC=CC=C1 XVWDJUROVRQKAE-KKUMJFAQSA-N 0.000 description 1
- ADJDNJCSPNFFPI-FXQIFTODSA-N Ser-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO ADJDNJCSPNFFPI-FXQIFTODSA-N 0.000 description 1
- NUEHQDHDLDXCRU-GUBZILKMSA-N Ser-Pro-Arg Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCN=C(N)N)C(O)=O NUEHQDHDLDXCRU-GUBZILKMSA-N 0.000 description 1
- SRSPTFBENMJHMR-WHFBIAKZSA-N Ser-Ser-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SRSPTFBENMJHMR-WHFBIAKZSA-N 0.000 description 1
- OZPDGESCTGGNAD-CIUDSAMLSA-N Ser-Ser-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CO OZPDGESCTGGNAD-CIUDSAMLSA-N 0.000 description 1
- XQJCEKXQUJQNNK-ZLUOBGJFSA-N Ser-Ser-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O XQJCEKXQUJQNNK-ZLUOBGJFSA-N 0.000 description 1
- VGQVAVQWKJLIRM-FXQIFTODSA-N Ser-Ser-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O VGQVAVQWKJLIRM-FXQIFTODSA-N 0.000 description 1
- XJDMUQCLVSCRSJ-VZFHVOOUSA-N Ser-Thr-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(O)=O XJDMUQCLVSCRSJ-VZFHVOOUSA-N 0.000 description 1
- RXUOAOOZIWABBW-XGEHTFHBSA-N Ser-Thr-Arg Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N RXUOAOOZIWABBW-XGEHTFHBSA-N 0.000 description 1
- WUXCHQZLUHBSDJ-LKXGYXEUSA-N Ser-Thr-Asp Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CC(O)=O)C(O)=O WUXCHQZLUHBSDJ-LKXGYXEUSA-N 0.000 description 1
- NADLKBTYNKUJEP-KATARQTJSA-N Ser-Thr-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O NADLKBTYNKUJEP-KATARQTJSA-N 0.000 description 1
- PCJLFYBAQZQOFE-KATARQTJSA-N Ser-Thr-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CO)N)O PCJLFYBAQZQOFE-KATARQTJSA-N 0.000 description 1
- ZKOKTQPHFMRSJP-YJRXYDGGSA-N Ser-Thr-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ZKOKTQPHFMRSJP-YJRXYDGGSA-N 0.000 description 1
- BDMWLJLPPUCLNV-XGEHTFHBSA-N Ser-Thr-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O BDMWLJLPPUCLNV-XGEHTFHBSA-N 0.000 description 1
- PQEQXWRVHQAAKS-SRVKXCTJSA-N Ser-Tyr-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](CO)N)CC1=CC=C(O)C=C1 PQEQXWRVHQAAKS-SRVKXCTJSA-N 0.000 description 1
- PCMZJFMUYWIERL-ZKWXMUAHSA-N Ser-Val-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O PCMZJFMUYWIERL-ZKWXMUAHSA-N 0.000 description 1
- HNDMFDBQXYZSRM-IHRRRGAJSA-N Ser-Val-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O HNDMFDBQXYZSRM-IHRRRGAJSA-N 0.000 description 1
- JGUWRQWULDWNCM-FXQIFTODSA-N Ser-Val-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O JGUWRQWULDWNCM-FXQIFTODSA-N 0.000 description 1
- HSWXBJCBYSWBPT-GUBZILKMSA-N Ser-Val-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CO)C(C)C)C(O)=O HSWXBJCBYSWBPT-GUBZILKMSA-N 0.000 description 1
- 108700005078 Synthetic Genes Proteins 0.000 description 1
- DFTCYYILCSQGIZ-GCJQMDKQSA-N Thr-Ala-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O DFTCYYILCSQGIZ-GCJQMDKQSA-N 0.000 description 1
- DDPVJPIGACCMEH-XQXXSGGOSA-N Thr-Ala-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O DDPVJPIGACCMEH-XQXXSGGOSA-N 0.000 description 1
- PXQUBKWZENPDGE-CIQUZCHMSA-N Thr-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](C)NC(=O)[C@H]([C@@H](C)O)N PXQUBKWZENPDGE-CIQUZCHMSA-N 0.000 description 1
- BSNZTJXVDOINSR-JXUBOQSCSA-N Thr-Ala-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O BSNZTJXVDOINSR-JXUBOQSCSA-N 0.000 description 1
- DWYAUVCQDTZIJI-VZFHVOOUSA-N Thr-Ala-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O DWYAUVCQDTZIJI-VZFHVOOUSA-N 0.000 description 1
- CAJFZCICSVBOJK-SHGPDSBTSA-N Thr-Ala-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CAJFZCICSVBOJK-SHGPDSBTSA-N 0.000 description 1
- LHUBVKCLOVALIA-HJGDQZAQSA-N Thr-Arg-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O LHUBVKCLOVALIA-HJGDQZAQSA-N 0.000 description 1
- UTSWGQNAQRIHAI-UNQGMJICSA-N Thr-Arg-Phe Chemical compound NC(N)=NCCC[C@H](NC(=O)[C@@H](N)[C@H](O)C)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 UTSWGQNAQRIHAI-UNQGMJICSA-N 0.000 description 1
- YLXAMFZYJTZXFH-OLHMAJIHSA-N Thr-Asn-Asp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O YLXAMFZYJTZXFH-OLHMAJIHSA-N 0.000 description 1
- NLSNVZAREYQMGR-HJGDQZAQSA-N Thr-Asp-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O NLSNVZAREYQMGR-HJGDQZAQSA-N 0.000 description 1
- KZUJCMPVNXOBAF-LKXGYXEUSA-N Thr-Cys-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(O)=O)C(O)=O KZUJCMPVNXOBAF-LKXGYXEUSA-N 0.000 description 1
- WLDUCKSCDRIVLJ-NUMRIWBASA-N Thr-Gln-Asp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O WLDUCKSCDRIVLJ-NUMRIWBASA-N 0.000 description 1
- LIXBDERDAGNVAV-XKBZYTNZSA-N Thr-Gln-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O LIXBDERDAGNVAV-XKBZYTNZSA-N 0.000 description 1
- SHOMROOOQBDGRL-JHEQGTHGSA-N Thr-Glu-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O SHOMROOOQBDGRL-JHEQGTHGSA-N 0.000 description 1
- LHEZGZQRLDBSRR-WDCWCFNPSA-N Thr-Glu-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LHEZGZQRLDBSRR-WDCWCFNPSA-N 0.000 description 1
- BIENEHRYNODTLP-HJGDQZAQSA-N Thr-Glu-Met Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCSC)C(=O)O)N)O BIENEHRYNODTLP-HJGDQZAQSA-N 0.000 description 1
- ONNSECRQFSTMCC-XKBZYTNZSA-N Thr-Glu-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O ONNSECRQFSTMCC-XKBZYTNZSA-N 0.000 description 1
- XOTBWOCSLMBGMF-SUSMZKCASA-N Thr-Glu-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XOTBWOCSLMBGMF-SUSMZKCASA-N 0.000 description 1
- XFTYVCHLARBHBQ-FOHZUACHSA-N Thr-Gly-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O XFTYVCHLARBHBQ-FOHZUACHSA-N 0.000 description 1
- VYEHBMMAJFVTOI-JHEQGTHGSA-N Thr-Gly-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(O)=O VYEHBMMAJFVTOI-JHEQGTHGSA-N 0.000 description 1
- DJDSEDOKJTZBAR-ZDLURKLDSA-N Thr-Gly-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O DJDSEDOKJTZBAR-ZDLURKLDSA-N 0.000 description 1
- YSXYEJWDHBCTDJ-DVJZZOLTSA-N Thr-Gly-Trp Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N)O YSXYEJWDHBCTDJ-DVJZZOLTSA-N 0.000 description 1
- JKGGPMOUIAAJAA-YEPSODPASA-N Thr-Gly-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O JKGGPMOUIAAJAA-YEPSODPASA-N 0.000 description 1
- AYCQVUUPIJHJTA-IXOXFDKPSA-N Thr-His-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(O)=O AYCQVUUPIJHJTA-IXOXFDKPSA-N 0.000 description 1
- XOWKUMFHEZLKLT-CIQUZCHMSA-N Thr-Ile-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O XOWKUMFHEZLKLT-CIQUZCHMSA-N 0.000 description 1
- LCCSEJSPBWKBNT-OSUNSFLBSA-N Thr-Ile-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N LCCSEJSPBWKBNT-OSUNSFLBSA-N 0.000 description 1
- XYFISNXATOERFZ-OSUNSFLBSA-N Thr-Ile-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N XYFISNXATOERFZ-OSUNSFLBSA-N 0.000 description 1
- AMXMBCAXAZUCFA-RHYQMDGZSA-N Thr-Leu-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O AMXMBCAXAZUCFA-RHYQMDGZSA-N 0.000 description 1
- VTVVYQOXJCZVEB-WDCWCFNPSA-N Thr-Leu-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O VTVVYQOXJCZVEB-WDCWCFNPSA-N 0.000 description 1
- RFKVQLIXNVEOMB-WEDXCCLWSA-N Thr-Leu-Gly Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)NCC(=O)O)N)O RFKVQLIXNVEOMB-WEDXCCLWSA-N 0.000 description 1
- FLPZMPOZGYPBEN-PPCPHDFISA-N Thr-Leu-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FLPZMPOZGYPBEN-PPCPHDFISA-N 0.000 description 1
- MECLEFZMPPOEAC-VOAKCMCISA-N Thr-Leu-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N)O MECLEFZMPPOEAC-VOAKCMCISA-N 0.000 description 1
- ZSPQUTWLWGWTPS-HJGDQZAQSA-N Thr-Lys-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O ZSPQUTWLWGWTPS-HJGDQZAQSA-N 0.000 description 1
- XSEPSRUDSPHMPX-KATARQTJSA-N Thr-Lys-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O XSEPSRUDSPHMPX-KATARQTJSA-N 0.000 description 1
- DXPURPNJDFCKKO-RHYQMDGZSA-N Thr-Lys-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)[C@@H](C)O)C(O)=O DXPURPNJDFCKKO-RHYQMDGZSA-N 0.000 description 1
- UJQVSMNQMQHVRY-KZVJFYERSA-N Thr-Met-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](C)C(O)=O UJQVSMNQMQHVRY-KZVJFYERSA-N 0.000 description 1
- YJVJPJPHHFOVMG-VEVYYDQMSA-N Thr-Met-Asp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O YJVJPJPHHFOVMG-VEVYYDQMSA-N 0.000 description 1
- KDGBLMDAPJTQIW-RHYQMDGZSA-N Thr-Met-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(=O)O)N)O KDGBLMDAPJTQIW-RHYQMDGZSA-N 0.000 description 1
- KZURUCDWKDEAFZ-XVSYOHENSA-N Thr-Phe-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)N)C(=O)O)N)O KZURUCDWKDEAFZ-XVSYOHENSA-N 0.000 description 1
- BDYBHQWMHYDRKJ-UNQGMJICSA-N Thr-Phe-Met Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(=O)O)N)O BDYBHQWMHYDRKJ-UNQGMJICSA-N 0.000 description 1
- MXNAOGFNFNKUPD-JHYOHUSXSA-N Thr-Phe-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MXNAOGFNFNKUPD-JHYOHUSXSA-N 0.000 description 1
- JAJOFWABAUKAEJ-QTKMDUPCSA-N Thr-Pro-His Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N)O JAJOFWABAUKAEJ-QTKMDUPCSA-N 0.000 description 1
- MROIJTGJGIDEEJ-RCWTZXSCSA-N Thr-Pro-Pro Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 MROIJTGJGIDEEJ-RCWTZXSCSA-N 0.000 description 1
- FWTFAZKJORVTIR-VZFHVOOUSA-N Thr-Ser-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O FWTFAZKJORVTIR-VZFHVOOUSA-N 0.000 description 1
- PRTHQBSMXILLPC-XGEHTFHBSA-N Thr-Ser-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O PRTHQBSMXILLPC-XGEHTFHBSA-N 0.000 description 1
- XHWCDRUPDNSDAZ-XKBZYTNZSA-N Thr-Ser-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N)O XHWCDRUPDNSDAZ-XKBZYTNZSA-N 0.000 description 1
- WKGAAMOJPMBBMC-IXOXFDKPSA-N Thr-Ser-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O WKGAAMOJPMBBMC-IXOXFDKPSA-N 0.000 description 1
- WPSKTVVMQCXPRO-BWBBJGPYSA-N Thr-Ser-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O WPSKTVVMQCXPRO-BWBBJGPYSA-N 0.000 description 1
- AAZOYLQUEQRUMZ-GSSVUCPTSA-N Thr-Thr-Asn Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(N)=O AAZOYLQUEQRUMZ-GSSVUCPTSA-N 0.000 description 1
- UQCNIMDPYICBTR-KYNKHSRBSA-N Thr-Thr-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O UQCNIMDPYICBTR-KYNKHSRBSA-N 0.000 description 1
- PJCYRZVSACOYSN-ZJDVBMNYSA-N Thr-Thr-Met Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCSC)C(O)=O PJCYRZVSACOYSN-ZJDVBMNYSA-N 0.000 description 1
- ZESGVALRVJIVLZ-VFCFLDTKSA-N Thr-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@@H]1C(=O)O)N)O ZESGVALRVJIVLZ-VFCFLDTKSA-N 0.000 description 1
- COYHRQWNJDJCNA-NUJDXYNKSA-N Thr-Thr-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O COYHRQWNJDJCNA-NUJDXYNKSA-N 0.000 description 1
- CSZFFQBUTMGHAH-UAXMHLISSA-N Thr-Thr-Trp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)N)O CSZFFQBUTMGHAH-UAXMHLISSA-N 0.000 description 1
- PELIQFPESHBTMA-WLTAIBSBSA-N Thr-Tyr-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CC1=CC=C(O)C=C1 PELIQFPESHBTMA-WLTAIBSBSA-N 0.000 description 1
- CJEHCEOXPLASCK-MEYUZBJRSA-N Thr-Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@H](O)C)CC1=CC=C(O)C=C1 CJEHCEOXPLASCK-MEYUZBJRSA-N 0.000 description 1
- CYCGARJWIQWPQM-YJRXYDGGSA-N Thr-Tyr-Ser Chemical compound C[C@@H](O)[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CO)C([O-])=O)CC1=CC=C(O)C=C1 CYCGARJWIQWPQM-YJRXYDGGSA-N 0.000 description 1
- MNYNCKZAEIAONY-XGEHTFHBSA-N Thr-Val-Ser Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O MNYNCKZAEIAONY-XGEHTFHBSA-N 0.000 description 1
- MHNHRNHJMXAVHZ-AAEUAGOBSA-N Trp-Asn-Gly Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)NCC(=O)O)N MHNHRNHJMXAVHZ-AAEUAGOBSA-N 0.000 description 1
- WACMTVIJWRNVSO-CWRNSKLLSA-N Trp-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N)C(=O)O WACMTVIJWRNVSO-CWRNSKLLSA-N 0.000 description 1
- OENGVSDBQHHGBU-QEJZJMRPSA-N Trp-Glu-Asn Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O OENGVSDBQHHGBU-QEJZJMRPSA-N 0.000 description 1
- CZWIHKFGHICAJX-BPUTZDHNSA-N Trp-Glu-Glu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O)=CNC2=C1 CZWIHKFGHICAJX-BPUTZDHNSA-N 0.000 description 1
- VMBBTANKMSRJSS-JSGCOSHPSA-N Trp-Glu-Gly Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O VMBBTANKMSRJSS-JSGCOSHPSA-N 0.000 description 1
- OKAMOYTUQMIFJO-JBACZVJFSA-N Trp-Glu-Phe Chemical compound C([C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](CC=1C2=CC=CC=C2NC=1)N)C(O)=O)C1=CC=CC=C1 OKAMOYTUQMIFJO-JBACZVJFSA-N 0.000 description 1
- HXNVJPQADLRHGR-JBACZVJFSA-N Trp-Glu-Tyr Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC3=CC=C(C=C3)O)C(=O)O)N HXNVJPQADLRHGR-JBACZVJFSA-N 0.000 description 1
- FNOQJVHFVLVMOS-AAEUAGOBSA-N Trp-Gly-Asn Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)N)C(=O)O)N FNOQJVHFVLVMOS-AAEUAGOBSA-N 0.000 description 1
- AIISTODACBDQLW-WDSOQIARSA-N Trp-Leu-Arg Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)=CNC2=C1 AIISTODACBDQLW-WDSOQIARSA-N 0.000 description 1
- XGFGVFMXDXALEV-XIRDDKMYSA-N Trp-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N XGFGVFMXDXALEV-XIRDDKMYSA-N 0.000 description 1
- YVXIAOOYAKBAAI-SZMVWBNQSA-N Trp-Leu-Gln Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O)=CNC2=C1 YVXIAOOYAKBAAI-SZMVWBNQSA-N 0.000 description 1
- UJRIVCPPPMYCNA-HOCLYGCPSA-N Trp-Leu-Gly Chemical compound CC(C)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N UJRIVCPPPMYCNA-HOCLYGCPSA-N 0.000 description 1
- CCZXBOFIBYQLEV-IHPCNDPISA-N Trp-Leu-Leu Chemical compound CC(C)C[C@H](NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)Cc1c[nH]c2ccccc12)C(O)=O CCZXBOFIBYQLEV-IHPCNDPISA-N 0.000 description 1
- OSYOKZZRVGUDMO-HSCHXYMDSA-N Trp-Lys-Ile Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O OSYOKZZRVGUDMO-HSCHXYMDSA-N 0.000 description 1
- GRSCONMARGNYHA-PMVMPFDFSA-N Trp-Lys-Phe Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O GRSCONMARGNYHA-PMVMPFDFSA-N 0.000 description 1
- NLWCSMOXNKBRLC-WDSOQIARSA-N Trp-Lys-Val Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O NLWCSMOXNKBRLC-WDSOQIARSA-N 0.000 description 1
- ADMHZNPMMVKGJW-BPUTZDHNSA-N Trp-Ser-Arg Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N ADMHZNPMMVKGJW-BPUTZDHNSA-N 0.000 description 1
- KXFYAQUYJKOQMI-QEJZJMRPSA-N Trp-Ser-Gln Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O)=CNC2=C1 KXFYAQUYJKOQMI-QEJZJMRPSA-N 0.000 description 1
- BOBZBMOTRORUPT-XIRDDKMYSA-N Trp-Ser-Leu Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O)=CNC2=C1 BOBZBMOTRORUPT-XIRDDKMYSA-N 0.000 description 1
- ARKBYVBCEOWRNR-UBHSHLNASA-N Trp-Ser-Ser Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O ARKBYVBCEOWRNR-UBHSHLNASA-N 0.000 description 1
- HIZDHWHVOLUGOX-BPUTZDHNSA-N Trp-Ser-Val Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O HIZDHWHVOLUGOX-BPUTZDHNSA-N 0.000 description 1
- HHPSUFUXXBOFQY-AQZXSJQPSA-N Trp-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N)O HHPSUFUXXBOFQY-AQZXSJQPSA-N 0.000 description 1
- LNGFWVPNKLWATF-ZVZYQTTQSA-N Trp-Val-Glu Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O LNGFWVPNKLWATF-ZVZYQTTQSA-N 0.000 description 1
- IEESWNWYUOETOT-BVSLBCMMSA-N Trp-Val-Phe Chemical compound CC(C)[C@H](NC(=O)[C@@H](N)Cc1c[nH]c2ccccc12)C(=O)N[C@@H](Cc1ccccc1)C(O)=O IEESWNWYUOETOT-BVSLBCMMSA-N 0.000 description 1
- RWTFCAMQLFNPTK-UMPQAUOISA-N Trp-Val-Thr Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O)=CNC2=C1 RWTFCAMQLFNPTK-UMPQAUOISA-N 0.000 description 1
- ZWZOCUWOXSDYFZ-CQDKDKBSSA-N Tyr-Ala-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 ZWZOCUWOXSDYFZ-CQDKDKBSSA-N 0.000 description 1
- XGEUYEOEZYFHRL-KKXDTOCCSA-N Tyr-Ala-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 XGEUYEOEZYFHRL-KKXDTOCCSA-N 0.000 description 1
- LGEYOIQBBIPHQN-UWJYBYFXSA-N Tyr-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 LGEYOIQBBIPHQN-UWJYBYFXSA-N 0.000 description 1
- AKXBNSZMYAOGLS-STQMWFEESA-N Tyr-Arg-Gly Chemical compound NC(N)=NCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 AKXBNSZMYAOGLS-STQMWFEESA-N 0.000 description 1
- DKKHULUSOSWGHS-UWJYBYFXSA-N Tyr-Asn-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CC=C(C=C1)O)N DKKHULUSOSWGHS-UWJYBYFXSA-N 0.000 description 1
- CYDVHRFXDMDMGX-KKUMJFAQSA-N Tyr-Asn-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N)O CYDVHRFXDMDMGX-KKUMJFAQSA-N 0.000 description 1
- ZNFPUOSTMUMUDR-JRQIVUDYSA-N Tyr-Asn-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZNFPUOSTMUMUDR-JRQIVUDYSA-N 0.000 description 1
- NLMXVDDEQFKQQU-CFMVVWHZSA-N Tyr-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 NLMXVDDEQFKQQU-CFMVVWHZSA-N 0.000 description 1
- UABYBEBXFFNCIR-YDHLFZDLSA-N Tyr-Asp-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O UABYBEBXFFNCIR-YDHLFZDLSA-N 0.000 description 1
- XBWKCYFGRXKWGO-SRVKXCTJSA-N Tyr-Cys-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(N)=O)C(O)=O XBWKCYFGRXKWGO-SRVKXCTJSA-N 0.000 description 1
- QUILOGWWLXMSAT-IHRRRGAJSA-N Tyr-Gln-Gln Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O QUILOGWWLXMSAT-IHRRRGAJSA-N 0.000 description 1
- MPKPIWFFDWVJGC-IRIUXVKKSA-N Tyr-Gln-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CC=C(C=C1)O)N)O MPKPIWFFDWVJGC-IRIUXVKKSA-N 0.000 description 1
- HVHJYXDXRIWELT-RYUDHWBXSA-N Tyr-Glu-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O HVHJYXDXRIWELT-RYUDHWBXSA-N 0.000 description 1
- IMXAAEFAIBRCQF-SIUGBPQLSA-N Tyr-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N IMXAAEFAIBRCQF-SIUGBPQLSA-N 0.000 description 1
- PMDWYLVWHRTJIW-STQMWFEESA-N Tyr-Gly-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC1=CC=C(O)C=C1 PMDWYLVWHRTJIW-STQMWFEESA-N 0.000 description 1
- CTDPLKMBVALCGN-JSGCOSHPSA-N Tyr-Gly-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O CTDPLKMBVALCGN-JSGCOSHPSA-N 0.000 description 1
- NXRGXTBPMOGFID-CFMVVWHZSA-N Tyr-Ile-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O NXRGXTBPMOGFID-CFMVVWHZSA-N 0.000 description 1
- DWAMXBFJNZIHMC-KBPBESRZSA-N Tyr-Leu-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O DWAMXBFJNZIHMC-KBPBESRZSA-N 0.000 description 1
- PRONOHBTMLNXCZ-BZSNNMDCSA-N Tyr-Leu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 PRONOHBTMLNXCZ-BZSNNMDCSA-N 0.000 description 1
- JAGGEZACYAAMIL-CQDKDKBSSA-N Tyr-Lys-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC1=CC=C(C=C1)O)N JAGGEZACYAAMIL-CQDKDKBSSA-N 0.000 description 1
- GYKDRHDMGQUZPU-MGHWNKPDSA-N Tyr-Lys-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC1=CC=C(C=C1)O)N GYKDRHDMGQUZPU-MGHWNKPDSA-N 0.000 description 1
- JXGUUJMPCRXMSO-HJOGWXRNSA-N Tyr-Phe-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 JXGUUJMPCRXMSO-HJOGWXRNSA-N 0.000 description 1
- PYJKETPLFITNKS-IHRRRGAJSA-N Tyr-Pro-Asn Chemical compound N[C@@H](Cc1ccc(O)cc1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O PYJKETPLFITNKS-IHRRRGAJSA-N 0.000 description 1
- YYLHVUCSTXXKBS-IHRRRGAJSA-N Tyr-Pro-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O YYLHVUCSTXXKBS-IHRRRGAJSA-N 0.000 description 1
- GQVZBMROTPEPIF-SRVKXCTJSA-N Tyr-Ser-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O GQVZBMROTPEPIF-SRVKXCTJSA-N 0.000 description 1
- QFXVAFIHVWXXBJ-AVGNSLFASA-N Tyr-Ser-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O QFXVAFIHVWXXBJ-AVGNSLFASA-N 0.000 description 1
- ZPFLBLFITJCBTP-QWRGUYRKSA-N Tyr-Ser-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)NCC(O)=O ZPFLBLFITJCBTP-QWRGUYRKSA-N 0.000 description 1
- BCOBSVIZMQXKFY-KKUMJFAQSA-N Tyr-Ser-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N)O BCOBSVIZMQXKFY-KKUMJFAQSA-N 0.000 description 1
- NHOVZGFNTGMYMI-KKUMJFAQSA-N Tyr-Ser-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 NHOVZGFNTGMYMI-KKUMJFAQSA-N 0.000 description 1
- XUIOBCQESNDTDE-FQPOAREZSA-N Tyr-Thr-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N)O XUIOBCQESNDTDE-FQPOAREZSA-N 0.000 description 1
- LVFZXRQQQDTBQH-IRIUXVKKSA-N Tyr-Thr-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O LVFZXRQQQDTBQH-IRIUXVKKSA-N 0.000 description 1
- JHDZONWZTCKTJR-KJEVXHAQSA-N Tyr-Thr-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 JHDZONWZTCKTJR-KJEVXHAQSA-N 0.000 description 1
- JQOMHZMWQHXALX-FHWLQOOXSA-N Tyr-Tyr-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O JQOMHZMWQHXALX-FHWLQOOXSA-N 0.000 description 1
- SQUMHUZLJDUROQ-YDHLFZDLSA-N Tyr-Val-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O SQUMHUZLJDUROQ-YDHLFZDLSA-N 0.000 description 1
- OBKOPLHSRDATFO-XHSDSOJGSA-N Tyr-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N OBKOPLHSRDATFO-XHSDSOJGSA-N 0.000 description 1
- DDRBQONWVBDQOY-GUBZILKMSA-N Val-Ala-Arg Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O DDRBQONWVBDQOY-GUBZILKMSA-N 0.000 description 1
- UEOOXDLMQZBPFR-ZKWXMUAHSA-N Val-Ala-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N UEOOXDLMQZBPFR-ZKWXMUAHSA-N 0.000 description 1
- FZSPNKUFROZBSG-ZKWXMUAHSA-N Val-Ala-Asp Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(O)=O FZSPNKUFROZBSG-ZKWXMUAHSA-N 0.000 description 1
- SLLKXDSRVAOREO-KZVJFYERSA-N Val-Ala-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C)NC(=O)[C@H](C(C)C)N)O SLLKXDSRVAOREO-KZVJFYERSA-N 0.000 description 1
- COYSIHFOCOMGCF-UHFFFAOYSA-N Val-Arg-Gly Natural products CC(C)C(N)C(=O)NC(C(=O)NCC(O)=O)CCCN=C(N)N COYSIHFOCOMGCF-UHFFFAOYSA-N 0.000 description 1
- XPYNXORPPVTVQK-SRVKXCTJSA-N Val-Arg-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCSC)C(=O)O)N XPYNXORPPVTVQK-SRVKXCTJSA-N 0.000 description 1
- QPZMOUMNTGTEFR-ZKWXMUAHSA-N Val-Asn-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](C(C)C)N QPZMOUMNTGTEFR-ZKWXMUAHSA-N 0.000 description 1
- ZMDCGGKHRKNWKD-LAEOZQHASA-N Val-Asn-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ZMDCGGKHRKNWKD-LAEOZQHASA-N 0.000 description 1
- LNYOXPDEIZJDEI-NHCYSSNCSA-N Val-Asn-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](C(C)C)N LNYOXPDEIZJDEI-NHCYSSNCSA-N 0.000 description 1
- XQVRMLRMTAGSFJ-QXEWZRGKSA-N Val-Asp-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N XQVRMLRMTAGSFJ-QXEWZRGKSA-N 0.000 description 1
- VLOYGOZDPGYWFO-LAEOZQHASA-N Val-Asp-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O VLOYGOZDPGYWFO-LAEOZQHASA-N 0.000 description 1
- ZSZFTYVFQLUWBF-QXEWZRGKSA-N Val-Asp-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCSC)C(=O)O)N ZSZFTYVFQLUWBF-QXEWZRGKSA-N 0.000 description 1
- SCBITHMBEJNRHC-LSJOCFKGSA-N Val-Asp-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](C(C)C)C(=O)O)N SCBITHMBEJNRHC-LSJOCFKGSA-N 0.000 description 1
- CFSSLXZJEMERJY-NRPADANISA-N Val-Gln-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O CFSSLXZJEMERJY-NRPADANISA-N 0.000 description 1
- XGJLNBNZNMVJRS-NRPADANISA-N Val-Glu-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O XGJLNBNZNMVJRS-NRPADANISA-N 0.000 description 1
- VCAWFLIWYNMHQP-UKJIMTQDSA-N Val-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](C(C)C)N VCAWFLIWYNMHQP-UKJIMTQDSA-N 0.000 description 1
- ROLGIBMFNMZANA-GVXVVHGQSA-N Val-Glu-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](C(C)C)N ROLGIBMFNMZANA-GVXVVHGQSA-N 0.000 description 1
- ZXAGTABZUOMUDO-GVXVVHGQSA-N Val-Glu-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N ZXAGTABZUOMUDO-GVXVVHGQSA-N 0.000 description 1
- XWYUBUYQMOUFRQ-IFFSRLJSSA-N Val-Glu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](C(C)C)N)O XWYUBUYQMOUFRQ-IFFSRLJSSA-N 0.000 description 1
- UEHRGZCNLSWGHK-DLOVCJGASA-N Val-Glu-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O UEHRGZCNLSWGHK-DLOVCJGASA-N 0.000 description 1
- WFENBJPLZMPVAX-XVKPBYJWSA-N Val-Gly-Glu Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(O)=O WFENBJPLZMPVAX-XVKPBYJWSA-N 0.000 description 1
- LAYSXAOGWHKNED-XPUUQOCRSA-N Val-Gly-Ser Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O LAYSXAOGWHKNED-XPUUQOCRSA-N 0.000 description 1
- DJQIUOKSNRBTSV-CYDGBPFRSA-N Val-Ile-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](C(C)C)N DJQIUOKSNRBTSV-CYDGBPFRSA-N 0.000 description 1
- FEXILLGKGGTLRI-NHCYSSNCSA-N Val-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N FEXILLGKGGTLRI-NHCYSSNCSA-N 0.000 description 1
- AGXGCFSECFQMKB-NHCYSSNCSA-N Val-Leu-Asp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N AGXGCFSECFQMKB-NHCYSSNCSA-N 0.000 description 1
- LYERIXUFCYVFFX-GVXVVHGQSA-N Val-Leu-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N LYERIXUFCYVFFX-GVXVVHGQSA-N 0.000 description 1
- UMPVMAYCLYMYGA-ONGXEEELSA-N Val-Leu-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O UMPVMAYCLYMYGA-ONGXEEELSA-N 0.000 description 1
- GVJUTBOZZBTBIG-AVGNSLFASA-N Val-Lys-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N GVJUTBOZZBTBIG-AVGNSLFASA-N 0.000 description 1
- OJOMXGVLFKYDKP-QXEWZRGKSA-N Val-Met-Asp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(=O)O)C(=O)O)N OJOMXGVLFKYDKP-QXEWZRGKSA-N 0.000 description 1
- VENKIVFKIPGEJN-NHCYSSNCSA-N Val-Met-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N VENKIVFKIPGEJN-NHCYSSNCSA-N 0.000 description 1
- RQOMPQGUGBILAG-AVGNSLFASA-N Val-Met-Leu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O RQOMPQGUGBILAG-AVGNSLFASA-N 0.000 description 1
- VNGKMNPAENRGDC-JYJNAYRXSA-N Val-Phe-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)CC1=CC=CC=C1 VNGKMNPAENRGDC-JYJNAYRXSA-N 0.000 description 1
- NZGOVKLVQNOEKP-YDHLFZDLSA-N Val-Phe-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)N)C(=O)O)N NZGOVKLVQNOEKP-YDHLFZDLSA-N 0.000 description 1
- CKTMJBPRVQWPHU-JSGCOSHPSA-N Val-Phe-Gly Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)O)N CKTMJBPRVQWPHU-JSGCOSHPSA-N 0.000 description 1
- HJSLDXZAZGFPDK-ULQDDVLXSA-N Val-Phe-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](C(C)C)N HJSLDXZAZGFPDK-ULQDDVLXSA-N 0.000 description 1
- YKNOJPJWNVHORX-UNQGMJICSA-N Val-Phe-Thr Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)O)C(O)=O)CC1=CC=CC=C1 YKNOJPJWNVHORX-UNQGMJICSA-N 0.000 description 1
- LGXUZJIQCGXKGZ-QXEWZRGKSA-N Val-Pro-Asn Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(=O)N)C(=O)O)N LGXUZJIQCGXKGZ-QXEWZRGKSA-N 0.000 description 1
- ZXYPHBKIZLAQTL-QXEWZRGKSA-N Val-Pro-Asp Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(=O)O)C(=O)O)N ZXYPHBKIZLAQTL-QXEWZRGKSA-N 0.000 description 1
- DEGUERSKQBRZMZ-FXQIFTODSA-N Val-Ser-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O DEGUERSKQBRZMZ-FXQIFTODSA-N 0.000 description 1
- KSFXWENSJABBFI-ZKWXMUAHSA-N Val-Ser-Asn Chemical compound [H]N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O KSFXWENSJABBFI-ZKWXMUAHSA-N 0.000 description 1
- QTPQHINADBYBNA-DCAQKATOSA-N Val-Ser-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN QTPQHINADBYBNA-DCAQKATOSA-N 0.000 description 1
- GBIUHAYJGWVNLN-AEJSXWLSSA-N Val-Ser-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N GBIUHAYJGWVNLN-AEJSXWLSSA-N 0.000 description 1
- GBIUHAYJGWVNLN-UHFFFAOYSA-N Val-Ser-Pro Natural products CC(C)C(N)C(=O)NC(CO)C(=O)N1CCCC1C(O)=O GBIUHAYJGWVNLN-UHFFFAOYSA-N 0.000 description 1
- NZYNRRGJJVSSTJ-GUBZILKMSA-N Val-Ser-Val Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O NZYNRRGJJVSSTJ-GUBZILKMSA-N 0.000 description 1
- UVHFONIHVHLDDQ-IFFSRLJSSA-N Val-Thr-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N)O UVHFONIHVHLDDQ-IFFSRLJSSA-N 0.000 description 1
- PDDJTOSAVNRJRH-UNQGMJICSA-N Val-Thr-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](C(C)C)N)O PDDJTOSAVNRJRH-UNQGMJICSA-N 0.000 description 1
- OFTXTCGQJXTNQS-XGEHTFHBSA-N Val-Thr-Ser Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](C(C)C)N)O OFTXTCGQJXTNQS-XGEHTFHBSA-N 0.000 description 1
- YLBNZCJFSVJDRJ-KJEVXHAQSA-N Val-Thr-Tyr Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](Cc1ccc(O)cc1)C(O)=O YLBNZCJFSVJDRJ-KJEVXHAQSA-N 0.000 description 1
- SVLAAUGFIHSJPK-JYJNAYRXSA-N Val-Trp-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CO)C(=O)O)N SVLAAUGFIHSJPK-JYJNAYRXSA-N 0.000 description 1
- IECQJCJNPJVUSB-IHRRRGAJSA-N Val-Tyr-Ser Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CO)C(O)=O IECQJCJNPJVUSB-IHRRRGAJSA-N 0.000 description 1
- BGTDGENDNWGMDQ-KJEVXHAQSA-N Val-Tyr-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](C(C)C)N)O BGTDGENDNWGMDQ-KJEVXHAQSA-N 0.000 description 1
- NLNCNKIVJPEFBC-DLOVCJGASA-N Val-Val-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCC(O)=O NLNCNKIVJPEFBC-DLOVCJGASA-N 0.000 description 1
- SSKKGOWRPNIVDW-AVGNSLFASA-N Val-Val-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N SSKKGOWRPNIVDW-AVGNSLFASA-N 0.000 description 1
- LLJLBRRXKZTTRD-GUBZILKMSA-N Val-Val-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(=O)O)N LLJLBRRXKZTTRD-GUBZILKMSA-N 0.000 description 1
- 241000589634 Xanthomonas Species 0.000 description 1
- 230000005856 abnormality Effects 0.000 description 1
- 230000002378 acidificating effect Effects 0.000 description 1
- 239000012190 activator Substances 0.000 description 1
- 229960000643 adenine Drugs 0.000 description 1
- 108010008685 alanyl-glutamyl-aspartic acid Proteins 0.000 description 1
- 108010076324 alanyl-glycyl-glycine Proteins 0.000 description 1
- 108010024078 alanyl-glycyl-serine Proteins 0.000 description 1
- 108010086434 alanyl-seryl-glycine Proteins 0.000 description 1
- 108010041407 alanylaspartic acid Proteins 0.000 description 1
- 108010070783 alanyltyrosine Proteins 0.000 description 1
- 108010050025 alpha-glutamyltryptophan Proteins 0.000 description 1
- 108010001271 arginyl-glutamyl-arginine Proteins 0.000 description 1
- 108010052670 arginyl-glutamyl-glutamic acid Proteins 0.000 description 1
- 108010072041 arginyl-glycyl-aspartic acid Proteins 0.000 description 1
- 108010091818 arginyl-glycyl-aspartyl-valine Proteins 0.000 description 1
- 108010043240 arginyl-leucyl-glycine Proteins 0.000 description 1
- 108010094001 arginyl-tryptophyl-arginine Proteins 0.000 description 1
- 108010062796 arginyllysine Proteins 0.000 description 1
- 108010031045 aspartyl-glycyl-aspartyl-alanine Proteins 0.000 description 1
- 238000003556 assay Methods 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- 230000000975 bioactive effect Effects 0.000 description 1
- 230000008827 biological function Effects 0.000 description 1
- 230000033228 biological regulation Effects 0.000 description 1
- 229960002685 biotin Drugs 0.000 description 1
- 239000011616 biotin Substances 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 229910052799 carbon Inorganic materials 0.000 description 1
- 230000003197 catalytic effect Effects 0.000 description 1
- 238000006555 catalytic reaction Methods 0.000 description 1
- 230000030570 cellular localization Effects 0.000 description 1
- 238000012512 characterization method Methods 0.000 description 1
- 239000003795 chemical substances by application Substances 0.000 description 1
- 210000000349 chromosome Anatomy 0.000 description 1
- 238000010367 cloning Methods 0.000 description 1
- 108091036078 conserved sequence Proteins 0.000 description 1
- 239000000470 constituent Substances 0.000 description 1
- 108010004073 cysteinylcysteine Proteins 0.000 description 1
- 229940104302 cytosine Drugs 0.000 description 1
- 230000002950 deficient Effects 0.000 description 1
- 108010004031 deoxyribonuclease A Proteins 0.000 description 1
- 239000005547 deoxyribonucleotide Substances 0.000 description 1
- 125000002637 deoxyribonucleotide group Chemical group 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- 238000010586 diagram Methods 0.000 description 1
- 239000000539 dimer Substances 0.000 description 1
- 230000013020 embryo development Effects 0.000 description 1
- 230000008011 embryonic death Effects 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 230000002708 enhancing effect Effects 0.000 description 1
- 230000009088 enzymatic function Effects 0.000 description 1
- 239000012091 fetal bovine serum Substances 0.000 description 1
- 108010063718 gamma-glutamylaspartic acid Proteins 0.000 description 1
- 238000012239 gene modification Methods 0.000 description 1
- 230000002068 genetic effect Effects 0.000 description 1
- 108010080575 glutamyl-aspartyl-alanine Proteins 0.000 description 1
- 108010013768 glutamyl-aspartyl-proline Proteins 0.000 description 1
- HPAIKDPJURGQLN-UHFFFAOYSA-N glycyl-L-histidyl-L-phenylalanine Natural products C=1C=CC=CC=1CC(C(O)=O)NC(=O)C(NC(=O)CN)CC1=CN=CN1 HPAIKDPJURGQLN-UHFFFAOYSA-N 0.000 description 1
- 108010072405 glycyl-aspartyl-glycine Proteins 0.000 description 1
- 108010062266 glycyl-glycyl-argininal Proteins 0.000 description 1
- 108010066198 glycyl-leucyl-phenylalanine Proteins 0.000 description 1
- 108010089804 glycyl-threonine Proteins 0.000 description 1
- 108010020688 glycylhistidine Proteins 0.000 description 1
- 108010081551 glycylphenylalanine Proteins 0.000 description 1
- 108010087823 glycyltyrosine Proteins 0.000 description 1
- 108010037850 glycylvaline Proteins 0.000 description 1
- 108010036413 histidylglycine Proteins 0.000 description 1
- 108010085325 histidylproline Proteins 0.000 description 1
- 108010018006 histidylserine Proteins 0.000 description 1
- 229910052739 hydrogen Inorganic materials 0.000 description 1
- 239000001257 hydrogen Substances 0.000 description 1
- 230000002209 hydrophobic effect Effects 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 230000002779 inactivation Effects 0.000 description 1
- 230000002401 inhibitory effect Effects 0.000 description 1
- 108010060857 isoleucyl-valyl-tyrosine Proteins 0.000 description 1
- 108010027338 isoleucylcysteine Proteins 0.000 description 1
- 125000000468 ketone group Chemical group 0.000 description 1
- 108010053037 kyotorphin Proteins 0.000 description 1
- 238000002372 labelling Methods 0.000 description 1
- 108010077158 leucinyl-arginyl-tryptophan Proteins 0.000 description 1
- 108010047926 leucyl-lysyl-tyrosine Proteins 0.000 description 1
- 108010030617 leucyl-phenylalanyl-valine Proteins 0.000 description 1
- 108010073472 leucyl-prolyl-proline Proteins 0.000 description 1
- 108010091871 leucylmethionine Proteins 0.000 description 1
- 210000004185 liver Anatomy 0.000 description 1
- 108010025153 lysyl-alanyl-alanine Proteins 0.000 description 1
- 108010009298 lysylglutamic acid Proteins 0.000 description 1
- 238000004519 manufacturing process Methods 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 230000004060 metabolic process Effects 0.000 description 1
- 108010063431 methionyl-aspartyl-glycine Proteins 0.000 description 1
- 108010068488 methionylphenylalanine Proteins 0.000 description 1
- 239000000178 monomer Substances 0.000 description 1
- 230000035772 mutation Effects 0.000 description 1
- 230000002956 necrotizing effect Effects 0.000 description 1
- 230000007935 neutral effect Effects 0.000 description 1
- 102000044158 nucleic acid binding protein Human genes 0.000 description 1
- 108700020942 nucleic acid binding protein Proteins 0.000 description 1
- 210000000056 organ Anatomy 0.000 description 1
- 108010074082 phenylalanyl-alanyl-lysine Proteins 0.000 description 1
- 108010024607 phenylalanylalanine Proteins 0.000 description 1
- 244000000003 plant pathogen Species 0.000 description 1
- 102000054765 polymorphisms of proteins Human genes 0.000 description 1
- 230000029279 positive regulation of transcription, DNA-dependent Effects 0.000 description 1
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 description 1
- 230000008569 process Effects 0.000 description 1
- 108010014614 prolyl-glycyl-proline Proteins 0.000 description 1
- 108010031719 prolyl-serine Proteins 0.000 description 1
- 108010079317 prolyl-tyrosine Proteins 0.000 description 1
- 108010070643 prolylglutamic acid Proteins 0.000 description 1
- 238000010379 pull-down assay Methods 0.000 description 1
- 238000011084 recovery Methods 0.000 description 1
- 238000012827 research and development Methods 0.000 description 1
- 235000009566 rice Nutrition 0.000 description 1
- 238000012216 screening Methods 0.000 description 1
- 238000012163 sequencing technique Methods 0.000 description 1
- 108010048397 seryl-lysyl-leucine Proteins 0.000 description 1
- 108010071207 serylmethionine Proteins 0.000 description 1
- 241000894007 species Species 0.000 description 1
- 108010005652 splenotritin Proteins 0.000 description 1
- 239000000758 substrate Substances 0.000 description 1
- 238000010998 test method Methods 0.000 description 1
- 108010031491 threonyl-lysyl-glutamic acid Proteins 0.000 description 1
- 108010072986 threonyl-seryl-lysine Proteins 0.000 description 1
- 210000001519 tissue Anatomy 0.000 description 1
- 108091006106 transcriptional activators Proteins 0.000 description 1
- 108091008023 transcriptional regulators Proteins 0.000 description 1
- 238000010361 transduction Methods 0.000 description 1
- 230000026683 transduction Effects 0.000 description 1
- 238000012546 transfer Methods 0.000 description 1
- 238000013519 translation Methods 0.000 description 1
- ZSDSQXJSNMTJDA-UHFFFAOYSA-N trifluralin Chemical compound CCCN(CCC)C1=C([N+]([O-])=O)C=C(C(F)(F)F)C=C1[N+]([O-])=O ZSDSQXJSNMTJDA-UHFFFAOYSA-N 0.000 description 1
- 108010038745 tryptophylglycine Proteins 0.000 description 1
- 108010045269 tryptophyltryptophan Proteins 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/415—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from plants
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
- C12N15/8201—Methods for introducing genetic material into plant cells, e.g. DNA, RNA, stable or transient incorporation, tissue culture methods adapted for transformation
- C12N15/8213—Targeted insertion of genes into the plant genome by homologous recombination
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
- C12N15/8216—Methods for controlling, regulating or enhancing expression of transgenes in plant cells
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/82—Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
- C12N15/8216—Methods for controlling, regulating or enhancing expression of transgenes in plant cells
- C12N15/8217—Gene switch
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/85—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases RNAses, DNAses
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/80—Fusion polypeptide containing a DNA binding domain, e.g. Lacl or Tet-repressor
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y301/00—Hydrolases acting on ester bonds (3.1)
- C12Y301/21—Endodeoxyribonucleases producing 5'-phosphomonoesters (3.1.21)
- C12Y301/21004—Type II site-specific deoxyribonuclease (3.1.21.4)
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Genetics & Genomics (AREA)
- Chemical & Material Sciences (AREA)
- Engineering & Computer Science (AREA)
- Organic Chemistry (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Biotechnology (AREA)
- Biomedical Technology (AREA)
- General Engineering & Computer Science (AREA)
- Molecular Biology (AREA)
- Biochemistry (AREA)
- General Health & Medical Sciences (AREA)
- Biophysics (AREA)
- Microbiology (AREA)
- Physics & Mathematics (AREA)
- Plant Pathology (AREA)
- Cell Biology (AREA)
- Medicinal Chemistry (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Botany (AREA)
- Gastroenterology & Hepatology (AREA)
- Peptides Or Proteins (AREA)
- Analytical Chemistry (AREA)
- Enzymes And Modification Thereof (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Immunology (AREA)
- Organic Low-Molecular-Weight Compounds And Preparation Thereof (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
- Investigating Or Analysing Biological Materials (AREA)
- Catching Or Destruction (AREA)
Abstract
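Based on the prediction that the RNA recognition rule of the PPR motif can also be used for recognition of DNA, the object is to analyze PPR proteins that act in DNA binding and to search for PPR proteins having such a property. The present invention found that this object can be achieved by a protein that can bind in a DNA base-selective or DNA base sequence-specific manner and that contains a plurality of, preferably 2 to 30, more preferably 5 to 25, most preferably 9 to 15, PPR motifs having the structure of the following general formula (1): (Helix A)-X-(Helix B)-L ... general formula (1) (wherein Helix A is a moiety capable of forming an α-helix structure; X does not exist or is a moiety consisting of 1 to 9 amino acids; Helix B is a moiety capable of forming an α-helix structure; and L is a moiety consisting of 2 to 7 amino acids), in which the three amino acids, namely the No. 1 amino acid and No. 4 amino acid of Helix A and the No. "ii"(-2) amino acid contained in L, form a specific combination of amino acids corresponding to a target DNA base or target base sequence.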
Description
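The present invention relates to a protein that binds selectively or specifically to an intended DNA base or DNA sequence. The present invention uses the pentatricopeptide repeat (PPR) motif. The present invention can be used for identification and design of DNA-binding proteins, identification of the target DNA of a protein having PPR motifs, and control of DNA function. The present invention is useful in the medical field, the agricultural field, and so forth. The present invention also relates to a novel DNA-cleaving enzyme that uses a complex of a protein containing PPR motifs and a protein that defines a functional region.
In recent years, techniques for binding to an intended sequence using nucleic acid-binding protein factors elucidated by various analyses have been established and put to use. Use of such sequence-specific binding has made it possible to analyze the intracellular localization of a target nucleic acid (DNA or RNA), to remove a target DNA sequence, or to control (activate or inactivate) the expression of a protein-coding gene located downstream of it.
Research and development are under way using zinc finger proteins (Non-Patent Documents 1 and 2), TAL effectors (TALE; Non-Patent Document 3, Patent Document 1) and CRISPR (Non-Patent Documents 4 and 5) as protein engineering materials that act on DNA, but the kinds of such protein factors are still extremely limited.
For example, the zinc finger nuclease (ZFN), known as an artificial DNA-cleaving enzyme, is a chimeric protein in which one DNA cleavage domain of a bacterial DNA-cleaving enzyme (for example, FokI) is linked to a portion that recognizes a base sequence in units of 3 to 4 bases, built by linking 3 to 6 zinc fingers, each of which specifically recognizes and binds 3 to 4 bases of DNA (Non-Patent Document 2). This chimeric protein is based on the knowledge that the zinc finger domain is a protein domain known to bind DNA, and that many transcriptional regulators have this domain and bind to specific DNA sequences to regulate gene expression. By using two such ZFNs, each having three zinc fingers, cleavage can theoretically be induced at one site per about 70 billion bases.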
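The figure of roughly 70 billion bases can be reproduced with a short calculation. The sketch below assumes that each zinc finger reads exactly 3 bases (the lower end of the 3 to 4 stated above) and that the genome behaves like a uniformly random sequence; both are simplifying assumptions for illustration, not statements from the source.

```python
# Illustrative assumptions: 3 bases per finger, uniformly random genome.
BASES_PER_FINGER = 3
FINGERS_PER_ZFN = 3
ZFNS_PER_SITE = 2  # two ZFNs act together at one cleavage site

# Total number of bases specified by the ZFN pair.
recognized_bases = BASES_PER_FINGER * FINGERS_PER_ZFN * ZFNS_PER_SITE

# With 4 possible bases per position, a given site of this length is
# expected once per 4**recognized_bases bases.
expected_spacing = 4 ** recognized_bases

print(recognized_bases)   # 18
print(expected_spacing)   # 68719476736
```

Under these assumptions the pair specifies 18 bases, and 4^18 is about 6.9 × 10^10, which matches the order of magnitude quoted in the text.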
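However, the method using ZFNs has not come into wide use, for reasons such as the cost of constructing them. It has also been suggested that the efficiency of selecting functional ZFNs is poor, which is a further problem. In addition, a zinc finger domain consisting of n zinc fingers tends to recognize a (GNN)n sequence, so the degree of freedom in choosing the target gene sequence is low.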
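To make the (GNN)n restriction concrete, the following minimal sketch tests whether a candidate target consists only of GNN triplets. The function name `is_gnn_repeat` is hypothetical and chosen for illustration only.

```python
import re

def is_gnn_repeat(target: str) -> bool:
    """Return True if target is made only of GNN triplets (N = A, C, G or T)."""
    return re.fullmatch(r"(?:G[ACGT]{2})+", target.upper()) is not None

print(is_gnn_repeat("GATGCCGTA"))  # True: GAT, GCC, GTA
print(is_gnn_repeat("GATACCGTA"))  # False: the second triplet starts with A
```

Because every third base must be G, only a small fraction of arbitrary sequences pass this test, which is the limited degree of freedom referred to above.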
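Meanwhile, TALEN, in which the DNA cleavage domain of a bacterial DNA-cleaving enzyme (for example, FokI) is bound to a TAL effector (TALE), a protein consisting of a linked series of module parts each of which can recognize one base, has been developed and is being investigated as an artificial enzyme to replace ZFN (Non-Patent Document 3). TALEN is an enzyme in which the DNA-binding domain of a transcription factor of the plant pathogen Xanthomonas is fused with the DNA cleavage domain of the restriction enzyme FokI, and it is known to bind to adjacent DNA sequences, form a dimer, and cleave double-stranded DNA. Because the DNA-binding domain of TALE, found in bacteria that infect plants, recognizes one base through a combination of two amino acids in a TALE motif of 34 amino acids, binding to a target DNA can be chosen by selecting the repeating structure of the TALE modules. Like ZFN, TALEN using a DNA-binding domain with this feature can introduce mutations into a target gene, but compared with ZFN it has the major advantages that the degree of freedom of the target gene (base sequence) is greatly improved and that the bases to be bound can be encoded.
However, because the complete three-dimensional structure of TALEN has not been elucidated, it is currently impossible to identify the DNA cleavage site of TALEN. Consequently, compared with ZFN, the cleavage position of TALEN is inaccurate and not fixed, and TALEN has the problem of cleaving even similar sequences. There is thus the problem that the target base sequence cannot be cleaved accurately by the DNA-cleaving enzyme. For these reasons, development of a new artificial DNA-cleaving enzyme free from the above problems has been desired.
From genome sequence information, PPR proteins (proteins having pentatricopeptide repeat (PPR) motifs) were found as proteins that form a large family of 500 members in plants alone (Non-Patent Document 6). PPR proteins are nucleus-encoded proteins, but they are known to act exclusively in organelles (chloroplasts and mitochondria), in a gene-specific manner, on control, cleavage, translation, splicing, RNA editing and RNA stability at the RNA level. In general, a PPR protein has a structure of about 10 consecutive PPR motifs, each a poorly conserved motif of 35 amino acids, and the combination of PPR motifs is considered to be responsible for sequence-selective binding to RNA. Most PPR proteins consist only of about 10 repeats of the PPR motif, and in many cases no domain required for catalytic activity can be found. The PPR protein is therefore considered to be, in essence, an RNA adapter (Non-Patent Document 7).
In general, binding between protein and DNA and binding between protein and RNA are achieved by different molecular mechanisms; DNA-binding proteins generally do not bind RNA, and conversely RNA-binding proteins usually do not bind DNA. For example, for the Pumilio protein, which is known as an RNA-binding factor and for which the RNA to be recognized can be encoded, binding to DNA has not been reported (Non-Patent Documents 8 and 9).
However, in the course of examining the properties of various PPR proteins, several types of PPR proteins have come to be considered to act as DNA-binding factors.
p63 of wheat is a PPR protein having 9 PPR motifs, and a gel shift assay suggested that it binds DNA in a sequence-specific manner (Non-Patent Document 10).
GUN1 of Arabidopsis thaliana has 11 PPR motifs, and a pull-down assay suggested that it binds DNA (Non-Patent Document 11).
pTac2 of Arabidopsis thaliana (a protein having 15 PPR motifs; Non-Patent Document 12) and DG1 of Arabidopsis thaliana (a protein having 10 PPR motifs; Non-Patent Document 13) were shown by run-on assay to be directly involved in transcription, in which RNA is produced from a DNA template, and are considered to bind DNA.
A strain deficient in the gene for GRP23 of Arabidopsis thaliana (a protein having 11 PPR motifs; Non-Patent Document 14) shows an embryonic lethal phenotype. This protein is known to interact physically with the major subunit of eukaryotic-type RNA polymerase II, a DNA-dependent RNA polymerase, and for this reason GRP23 is also considered to act in DNA binding.
However, binding of these PPR proteins to DNA has only been suggested indirectly, and it has not been sufficiently demonstrated that they actually bind in a sequence-specific manner. Moreover, even if these proteins do bind DNA sequence-specifically, binding between protein and DNA and binding between protein and RNA are generally believed to be achieved by different molecular mechanisms, so nothing could be predicted about what kind of sequence rule governs the binding.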
Non-Patent Document 1: Maeder, M.L., et al. (2008) Rapid "open-source" engineering of customized zinc-finger nucleases for highly efficient gene modification. Mol. Cell 31, 294-301.
Non-Patent Document 2: Urnov, F.D., et al. (2010) Genome editing with engineered zinc finger nucleases. Nature Reviews Genetics 11, 636-646.
Non-Patent Document 3: Miller, J.C., et al. (2011) A TALE nuclease architecture for efficient genome editing. Nature Biotech. 29, 143-148.
Non-Patent Document 4: Mali, P., et al. (2013) RNA-guided human genome engineering via Cas9. Science 339, 823-826.
Non-Patent Document 5: Cong, L., et al. (2013) Multiplex genome engineering using CRISPR/Cas systems. Science 339, 819-823.
Non-Patent Document 6: Small, I.D., and Peeters, N. (2000) The PPR motif - a TPR-related motif prevalent in plant organellar proteins. Trends Biochem. Sci. 25, 46-47.
Non-Patent Document 7: Woodson, J.D., and Chory, J. (2008) Coordination of gene expression between organellar and nuclear genomes. Nature Rev. Genet. 9, 383-395.
Non-Patent Document 8: Wang, X., et al. (2002) Modular recognition of RNA by a human pumilio-homology domain. Cell 110, 501-512.
Non-Patent Document 9: Cheong, C.G., and Hall, T.M. (2006) Engineering RNA sequence specificity of Pumilio repeats. Proc. Natl. Acad. Sci. USA 103, 13635-13639.
Non-Patent Document 10: Ikeda, T.M., and Gray, M.W. (1999) Characterization of a DNA-binding protein implicated in transcription in wheat mitochondria. Mol. Cell. Biol. 19(12), 8113-8122.
Non-Patent Document 11: Koussevitzky, S., et al. (2007) Signals from chloroplasts converge to regulate nuclear gene expression. Science 316, 715-719.
Non-Patent Document 12: Pfalz, J., et al. (2006) pTAC2, -6, and -12 are components of the transcriptionally active plastid chromosome that are required for plastid gene expression. Plant Cell 18, 176-197.
Non-Patent Document 13: Chi, W., et al. (2008) The pentratricopeptide repeat protein DELAYED GREENING1 is involved in the regulation of early chloroplast development and chloroplast gene expression in Arabidopsis. Plant Physiol. 147, 573-584.
Non-Patent Document 14: Ding, Y.H., et al. (2006) Arabidopsis GLUTAMINE-RICH PROTEIN23 is essential for early embryogenesis and encodes a novel nuclear PPR motif protein that interacts with RNA polymerase II subunit III. Plant Cell 18, 815-830.
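The present inventors predicted that the properties of a PPR protein (a protein having PPR motifs) as an RNA adapter are determined by the properties of the individual PPR motifs constituting it and by the combination of the plurality of PPR motifs, and proposed a method for modifying RNA-binding proteins using PPR motifs (Patent Document 2). They further found that PPR motifs and RNA bases correspond one to one, so that consecutive PPR motifs recognize consecutive RNA bases in an RNA sequence, and that RNA recognition is determined by the combination of three specific amino acids among the 35 amino acids constituting a PPR motif. On this basis they filed a patent application on a method for designing custom RNA-binding proteins using the RNA recognition code of the PPR motif and on uses thereof (PCT/JP2012/077274; Yagi, Y., et al. (2013) PLoS One, 8, e57286; and Barkan, A., et al. (2012) PLoS Genet., 8, e1002910).
In general, binding between protein and DNA and binding between protein and RNA are considered to rest on different molecular mechanisms. In contrast, in the present invention it was predicted that the RNA recognition rule of the PPR motif can also be used for recognition of DNA, and the object was to analyze PPR proteins that act in DNA binding and to search for PPR proteins having such a property. A further object was to prepare custom DNA-binding proteins that bind to a desired sequence using the PPR proteins thus obtained that can bind DNA specifically, to provide a new artificial enzyme by using them together with a protein that defines a functional region, and to provide a new artificial DNA-cleaving enzyme by using them together with a DNA cleavage region as the functional region.
For PPR proteins, various domain search programs (Pfam, Prosite, Interpro, etc.) were found to make no particular distinction between the PPR motifs contained in ordinary RNA-binding PPR proteins and the PPR motifs contained in the several DNA-binding PPR proteins mentioned above. It is therefore expected that PPR proteins contain, separately from the amino acids required for nucleic acid recognition, an amino acid (or group of amino acids) that determines whether the protein binds DNA or RNA.
In PCT/JP2012/077274, the present inventors found that an RNA-binding PPR motif and an RNA base correspond one to one, that consecutive PPR motifs recognize consecutive RNA bases in an RNA sequence, and that base-selective RNA binding is determined by the combination of RNA-recognizing amino acids at three specific positions among the 35 amino acids constituting the PPR motif (namely, the No. 1 and No. 4 amino acids (No. 1 AA and No. 4 AA) of the first helix (Helix A) of the two α-helix structures constituting the motif, and the second amino acid from the C-terminus (No. "ii"(-2) AA)), and filed a patent application on a method for designing custom RNA-binding proteins using the RNA recognition code of the PPR motif and on uses thereof.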
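Accordingly, the following PPR proteins for which DNA binding has been suggested were examined: wheat p63 described above (Non-Patent Document 10; the amino acid sequence of the homologous protein of Arabidopsis thaliana is shown as SEQ ID NO: 1), GUN1 of Arabidopsis thaliana (Non-Patent Document 11; amino acid sequence shown as SEQ ID NO: 2), pTac2 of Arabidopsis thaliana (Non-Patent Document 12; amino acid sequence shown as SEQ ID NO: 3), DG1 (Non-Patent Document 13; amino acid sequence shown as SEQ ID NO: 4) and GRP23 of Arabidopsis thaliana (Non-Patent Document 14; amino acid sequence shown as SEQ ID NO: 5). For these proteins, the frequencies of amino acids at the three positions that carry the nucleic acid recognition code in the PPR motif and are considered important when RNA is the target (No. 1 AA, No. 4 AA and No. "ii"(-2) AA) were compared with those of RNA-binding motifs. As a result, the tendencies of amino acid frequency were found to be almost the same between the PPR motifs of these PPR proteins with suggested DNA-binding ability and the RNA-binding motifs.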
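This suggested that the nucleic acid recognition code of the RNA-binding PPR motif can also be applied to DNA-binding PPR motifs. Thymine (T), also called 5-methyluracil, is a uracil (U) derivative in which the carbon at position 5 of uracil is methylated. From this property of the bases constituting nucleic acids, it was suggested that the amino acid combinations that recognize uracil (U) in RNA-binding PPR motifs are used for recognition of thymine (T) in the case of DNA.
Based on these findings, the inventors found that a custom DNA-binding protein that binds to any DNA sequence can be constructed by using as a template one of the DNA-binding PPR proteins described above, namely p63 (amino acid sequence of SEQ ID NO: 1), GUN1 of Arabidopsis thaliana (SEQ ID NO: 2), pTac2 of Arabidopsis thaliana (SEQ ID NO: 3), DG1 (SEQ ID NO: 4) or GRP23 of Arabidopsis thaliana (SEQ ID NO: 5), and arranging the three amino acids (No. 1 AA, No. 4 AA and No. "ii"(-2) AA) in accordance with the knowledge obtained from the study of RNA-binding PPR motifs.
That is, the present inventors completed the present invention by providing a protein that binds in a DNA base-selective or DNA base sequence-specific manner and that contains a plurality of, preferably 2 to 30, more preferably 5 to 25, most preferably 9 to 15, PPR motifs in which the three amino acids (No. 1 AA, No. 4 AA and No. "ii"(-2) AA) of each PPR motif, as represented by the amino acid sequences of SEQ ID NOs: 1 to 5, are set to the specific amino acids described below.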
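The present invention provides the following:
[1] A protein that contains one or more motifs having the structure of the following general formula (1) and that binds in a DNA base-selective or DNA base sequence-specific manner:
(Helix A)-X-(Helix B)-L ..... general formula (1)
(in formula (1),
Helix A is a moiety capable of forming an α-helix structure;
X does not exist or is a moiety consisting of 1 to 9 amino acids;
Helix B is a moiety capable of forming an α-helix structure; and
L is a moiety consisting of 2 to 7 amino acids)
provided that, for one PPR motif (Mn) contained in the protein,
when the first amino acid of Helix A is referred to as the No. 1 amino acid (No. 1 AA), the fourth amino acid as the No. 4 amino acid (No. 4 AA), and
· when a next PPR motif (Mn+1) follows contiguously on the C-terminal side of the PPR motif (Mn) (no amino acid insertion between the PPR motifs), the -2nd amino acid counted from the end (C-terminus) of the amino acids constituting the PPR motif (Mn);
· when a non-PPR motif of 1 to 20 amino acids exists between the PPR motif (Mn) and the next PPR motif (Mn+1) on its C-terminal side, the amino acid two positions upstream of the No. 1 amino acid of the next PPR motif (Mn+1), that is, the -2nd amino acid; or
· when no next PPR motif (Mn+1) exists on the C-terminal side of the PPR motif (Mn), or when 21 or more amino acids constituting a non-PPR motif exist between it and the next PPR motif (Mn+1) on the C-terminal side, the 2nd amino acid counted from the end (C-terminal side) of the amino acids constituting the PPR motif (Mn)
is referred to as the No. "ii"(-2) amino acid (No. "ii"(-2) AA),
the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA form a specific combination of amino acids corresponding to a target DNA base or target DNA base sequence.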
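The three cases that define the No. "ii"(-2) amino acid in item [1] can be written as a small position calculation. This is a minimal sketch under stated assumptions: motifs are given as hypothetical `(start, end)` pairs of 1-based, inclusive residue positions sorted from N-terminus to C-terminus, and "the 2nd amino acid counted from the end" is read as the residue immediately before the last residue of the motif. The function name `ii_minus2_position` is illustrative only.

```python
from typing import List, Tuple

def ii_minus2_position(motifs: List[Tuple[int, int]], n: int) -> int:
    """Return the 1-based protein position of the "ii"(-2) residue of motif n.

    motifs: (start, end) positions of each PPR motif, sorted N- to C-terminus.
    n: 0-based index of the motif of interest.
    """
    start, end = motifs[n]
    if n + 1 < len(motifs):
        next_start = motifs[n + 1][0]
        gap = next_start - end - 1  # residues between Mn and Mn+1
        if 1 <= gap <= 20:
            # Short non-PPR insertion: two residues upstream of Mn+1's No. 1 AA.
            return next_start - 2
    # Contiguous motifs (gap of 0), no following motif, or a gap of 21 or more:
    # second residue from the C-terminal end of Mn itself.
    return end - 1

motifs = [(10, 44), (45, 79), (85, 119), (150, 184)]
print([ii_minus2_position(motifs, i) for i in range(len(motifs))])
# [43, 83, 118, 183]
```

For contiguous motifs the two readings coincide, since two positions upstream of the next motif's first residue is the same residue as the second from the end of the current motif.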
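[2] The protein according to [1], wherein the combination of the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA corresponding to the target DNA base or target DNA base sequence is determined on the basis of any one of the following:
(1-1) when No. 4 AA is glycine (G), No. 1 AA may be any amino acid, and No. "ii"(-2) AA is aspartic acid (D), asparagine (N) or serine (S);
(1-2) when No. 4 AA is isoleucine (I), No. 1 AA and No. "ii"(-2) AA may each be any amino acid;
(1-3) when No. 4 AA is leucine (L), No. 1 AA and No. "ii"(-2) AA may each be any amino acid;
(1-4) when No. 4 AA is methionine (M), No. 1 AA and No. "ii"(-2) AA may each be any amino acid;
(1-5) when No. 4 AA is asparagine (N), No. 1 AA and No. "ii"(-2) AA may each be any amino acid;
(1-6) when No. 4 AA is proline (P), No. 1 AA and No. "ii"(-2) AA may each be any amino acid;
(1-7) when No. 4 AA is serine (S), No. 1 AA and No. "ii"(-2) AA may each be any amino acid;
(1-8) when No. 4 AA is threonine (T), No. 1 AA and No. "ii"(-2) AA may each be any amino acid; and
(1-9) when No. 4 AA is valine (V), No. 1 AA and No. "ii"(-2) AA may each be any amino acid.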
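Conditions (1-1) to (1-9) reduce to a simple membership test on No. 4 AA, with an extra restriction on No. "ii"(-2) AA only when No. 4 AA is glycine. The sketch below uses one-letter amino acid codes; the function name `satisfies_item2` is hypothetical.

```python
# No. 4 AA residues for which No. 1 AA and "ii"(-2) AA are unrestricted,
# per (1-2) to (1-9).
UNRESTRICTED_AA4 = set("ILMNPSTV")
# Allowed "ii"(-2) AA residues when No. 4 AA is glycine, per (1-1).
GLYCINE_ALLOWED_II = set("DNS")

def satisfies_item2(aa1: str, aa4: str, aa_ii: str) -> bool:
    """Check a (No. 1, No. 4, "ii"(-2)) triple against (1-1) to (1-9).

    aa1 is accepted as-is because every condition allows any No. 1 AA.
    """
    if aa4 == "G":
        return aa_ii in GLYCINE_ALLOWED_II
    return aa4 in UNRESTRICTED_AA4

print(satisfies_item2("E", "G", "D"))  # True, by (1-1)
print(satisfies_item2("E", "G", "K"))  # False
print(satisfies_item2("V", "N", "T"))  # True, by (1-5)
```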
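[3] The protein according to [1], wherein the combination of the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA corresponding to the target DNA base or target DNA base sequence is determined on the basis of any one of the following:
(2-1) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, any amino acid, glycine and aspartic acid, the PPR motif binds selectively to G;
(2-2) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, glutamic acid, glycine and aspartic acid, the PPR motif binds selectively to G;
(2-3) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, any amino acid, glycine and asparagine, the PPR motif binds selectively to A;
(2-4) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, glutamic acid, glycine and asparagine, the PPR motif binds selectively to A;
(2-5) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, any amino acid, glycine and serine, the PPR motif binds selectively to A, and next to C;
(2-6) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, any amino acid, isoleucine and any amino acid, the PPR motif binds selectively to T and C;
(2-7) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, any amino acid, isoleucine and asparagine, the PPR motif binds selectively to T, and next to C;
(2-8) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, any amino acid, leucine and any amino acid, the PPR motif binds selectively to T and C;
(2-9) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, any amino acid, leucine and aspartic acid, the PPR motif binds selectively to C;
(2-10) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, any amino acid, leucine and lysine, the PPR motif binds selectively to T;
(2-11) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, any amino acid, methionine and any amino acid, the PPR motif binds selectively to T;
(2-12) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, any amino acid, methionine and aspartic acid, the PPR motif binds selectively to T;
(2-13) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, isoleucine, methionine and aspartic acid, the PPR motif binds selectively to T, and next to C;
(2-14) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, any amino acid, asparagine and any amino acid, the PPR motif binds selectively to C and T;
(2-15) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, any amino acid, asparagine and aspartic acid, the PPR motif binds selectively to T;
(2-16) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, phenylalanine, asparagine and aspartic acid, the PPR motif binds selectively to T;
(2-17) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, glycine, asparagine and aspartic acid, the PPR motif binds selectively to T;
(2-18) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, isoleucine, asparagine and aspartic acid, the PPR motif binds selectively to T;
(2-19) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, threonine, asparagine and aspartic acid, the PPR motif binds selectively to T;
(2-20) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, valine, asparagine and aspartic acid, the PPR motif binds selectively to T, and next to C;
(2-21) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, tyrosine, asparagine and aspartic acid, the PPR motif binds selectively to T, and next to C;
(2-22) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, any amino acid, asparagine and asparagine, the PPR motif binds selectively to C;
(2-23) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, isoleucine, asparagine and asparagine, the PPR motif binds selectively to C;
(2-24) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, serine, asparagine and asparagine, the PPR motif binds selectively to C;
(2-25) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, valine, asparagine and asparagine, the PPR motif binds selectively to C;
(2-26) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, any amino acid, asparagine and serine, the PPR motif binds selectively to C;
(2-27) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, valine, asparagine and serine, the PPR motif binds selectively to C;
(2-28) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, any amino acid, asparagine and threonine, the PPR motif binds selectively to C;
(2-29) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, valine, asparagine and threonine, the PPR motif binds selectively to C;
(2-30) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, any amino acid, asparagine and tryptophan, the PPR motif binds selectively to C, and next to T;
(2-31) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, isoleucine, asparagine and tryptophan, the PPR motif binds selectively to T, and next to C;
(2-32) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, any amino acid, proline and any amino acid, the PPR motif binds selectively to T;
(2-33) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, any amino acid, proline and aspartic acid, the PPR motif binds selectively to T;
(2-34) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, phenylalanine, proline and aspartic acid, the PPR motif binds selectively to T;
(2-35) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, tyrosine, proline and aspartic acid, the PPR motif binds selectively to T;
(2-36) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, any amino acid, serine and any amino acid, the PPR motif binds selectively to A and G;
(2-37) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, any amino acid, serine and asparagine, the PPR motif binds selectively to A;
(2-38) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, phenylalanine, serine and asparagine, the PPR motif binds selectively to A;
(2-39) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, valine, serine and asparagine, the PPR motif binds selectively to A;
(2-40) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, any amino acid, threonine and any amino acid, the PPR motif binds selectively to A and G;
(2-41) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, any amino acid, threonine and aspartic acid, the PPR motif binds selectively to G;
(2-42) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, valine, threonine and aspartic acid, the PPR motif binds selectively to G;
(2-43) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, any amino acid, threonine and asparagine, the PPR motif binds selectively to A;
(2-44) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, phenylalanine, threonine and asparagine, the PPR motif binds selectively to A;
(2-45) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, isoleucine, threonine and asparagine, the PPR motif binds selectively to A;
(2-46) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, valine, threonine and asparagine, the PPR motif binds selectively to A;
(2-47) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, any amino acid, valine and any amino acid, the PPR motif binds to A, C and T but not to G;
(2-48) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, isoleucine, valine and aspartic acid, the PPR motif binds selectively to C, and next to A;
(2-49) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, any amino acid, valine and glycine, the PPR motif binds selectively to C; and
(2-50) when the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in order, any amino acid, valine and threonine, the PPR motif binds selectively to T.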
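Rules (2-1) to (2-50) can be held in a lookup table keyed by the three amino acids. The sketch below is one possible encoding, not the inventors' implementation: `"*"` stands for "any amino acid", each value lists the bound bases in the order the rule names them, and the lookup tries the most specific key first. That precedence, the flattening of "selectively to X, and next to Y" into a two-letter string, and the names `PPR_DNA_CODE` and `lookup_bases` are all assumptions made for illustration.

```python
from typing import Optional

# Keys: (No. 1 AA, No. 4 AA, "ii"(-2) AA) in one-letter code; "*" = any.
# Values: bases in the order named by the rule (illustrative flattening).
PPR_DNA_CODE = {
    ("*", "G", "D"): "G",  ("E", "G", "D"): "G",  ("*", "G", "N"): "A",    # 2-1..2-3
    ("E", "G", "N"): "A",  ("*", "G", "S"): "AC",                          # 2-4, 2-5
    ("*", "I", "*"): "TC", ("*", "I", "N"): "TC",                          # 2-6, 2-7
    ("*", "L", "*"): "TC", ("*", "L", "D"): "C",  ("*", "L", "K"): "T",    # 2-8..2-10
    ("*", "M", "*"): "T",  ("*", "M", "D"): "T",  ("I", "M", "D"): "TC",   # 2-11..2-13
    ("*", "N", "*"): "CT", ("*", "N", "D"): "T",  ("F", "N", "D"): "T",    # 2-14..2-16
    ("G", "N", "D"): "T",  ("I", "N", "D"): "T",  ("T", "N", "D"): "T",    # 2-17..2-19
    ("V", "N", "D"): "TC", ("Y", "N", "D"): "TC",                          # 2-20, 2-21
    ("*", "N", "N"): "C",  ("I", "N", "N"): "C",  ("S", "N", "N"): "C",    # 2-22..2-24
    ("V", "N", "N"): "C",  ("*", "N", "S"): "C",  ("V", "N", "S"): "C",    # 2-25..2-27
    ("*", "N", "T"): "C",  ("V", "N", "T"): "C",                           # 2-28, 2-29
    ("*", "N", "W"): "CT", ("I", "N", "W"): "TC",                          # 2-30, 2-31
    ("*", "P", "*"): "T",  ("*", "P", "D"): "T",  ("F", "P", "D"): "T",    # 2-32..2-34
    ("Y", "P", "D"): "T",                                                  # 2-35
    ("*", "S", "*"): "AG", ("*", "S", "N"): "A",  ("F", "S", "N"): "A",    # 2-36..2-38
    ("V", "S", "N"): "A",                                                  # 2-39
    ("*", "T", "*"): "AG", ("*", "T", "D"): "G",  ("V", "T", "D"): "G",    # 2-40..2-42
    ("*", "T", "N"): "A",  ("F", "T", "N"): "A",  ("I", "T", "N"): "A",    # 2-43..2-45
    ("V", "T", "N"): "A",                                                  # 2-46
    ("*", "V", "*"): "ACT", ("I", "V", "D"): "CA",                         # 2-47, 2-48
    ("*", "V", "G"): "C",  ("*", "V", "T"): "T",                           # 2-49, 2-50
}

def lookup_bases(aa1: str, aa4: str, aa_ii: str) -> Optional[str]:
    """Return the bases of the most specific matching rule, or None."""
    for key in ((aa1, aa4, aa_ii), ("*", aa4, aa_ii), ("*", aa4, "*")):
        if key in PPR_DNA_CODE:
            return PPR_DNA_CODE[key]
    return None

print(lookup_bases("V", "N", "D"))  # TC  (rule 2-20)
print(lookup_bases("A", "N", "D"))  # T   (rule 2-15)
print(lookup_bases("A", "N", "K"))  # CT  (rule 2-14)
```

Applying `lookup_bases` to each motif of a protein in N- to C-terminal order gives a per-position base preference, which is the kind of prediction the rules are meant to support.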
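[4] The protein according to any one of [1] to [3], which contains 2 to 30 PPR motifs (Mn) as defined in [1].
[5] The protein according to any one of [1] to [3], which contains 5 to 25 PPR motifs (Mn) as defined in [1].
[6] The protein according to any one of [1] to [3], which contains 9 to 15 PPR motifs (Mn) as defined in [1].
[7] The PPR protein according to [6], consisting of a sequence selected from the amino acid sequence of SEQ ID NO: 1 having 9 PPR motifs, the amino acid sequence of SEQ ID NO: 2 having 11 PPR motifs, the amino acid sequence of SEQ ID NO: 3 having 15 PPR motifs, the amino acid sequence of SEQ ID NO: 4 having 10 PPR motifs, and the amino acid sequence of SEQ ID NO: 5 having 11 PPR motifs.
[8] A method for identifying the DNA base or DNA base sequence targeted by a DNA-binding protein containing one or more (preferably 2 to 30) PPR motifs (Mn) as defined in [1], by determining, on the basis of any one of (1-1) to (1-9) described in [2] or (2-1) to (2-50) described in [3], the presence or absence of a DNA base corresponding to the combination of the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA of a PPR motif.
[9] A method for identifying a PPR protein that contains one or more (preferably 2 to 30) PPR motifs (Mn) as defined in [1] and that can bind to a target DNA base or to a target DNA having a specific base sequence, by determining, on the basis of any one of (1-1) to (1-9) described in [2] or (2-1) to (2-50) described in [3], the presence or absence of a combination of the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA of a PPR motif that corresponds to the target DNA base or to a specific base constituting the target DNA.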
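The direction described in [9], going from a target sequence to amino acid combinations, can be sketched by choosing one single-base rule per base. The choices below are taken from rules (2-43), (2-26), (2-41) and (2-15), in which No. 1 AA is unrestricted; they are only one of several possible selections, and the names `DESIGN_CHOICES` and `design_motif_codes` are hypothetical.

```python
from typing import Dict, List, Tuple

# Base -> (No. 4 AA, "ii"(-2) AA), one-letter code. No. 1 AA is left open
# because the selected rules accept any amino acid at that position.
DESIGN_CHOICES: Dict[str, Tuple[str, str]] = {
    "A": ("T", "N"),  # rule (2-43)
    "C": ("N", "S"),  # rule (2-26)
    "G": ("T", "D"),  # rule (2-41)
    "T": ("N", "D"),  # rule (2-15)
}

def design_motif_codes(target: str) -> List[Tuple[str, str]]:
    """Return one (No. 4 AA, "ii"(-2) AA) pair per base of the target DNA."""
    codes = []
    for base in target.upper():
        if base not in DESIGN_CHOICES:
            raise ValueError(f"unsupported base: {base!r}")
        codes.append(DESIGN_CHOICES[base])
    return codes

print(design_motif_codes("GAT"))
# [('T', 'D'), ('T', 'N'), ('N', 'D')]
```

Each returned pair would be placed in one PPR motif of a template protein, with the motifs ordered to match the target sequence.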
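[10] A method for controlling DNA function using the protein according to [1].
[11] A complex in which a region containing the protein according to [1] is linked to a functional region.
[12] The complex according to [11], constituted by fusing the functional region to the C-terminus of the protein according to [1].
[13] The complex according to [11] or [12], wherein the functional region is a DNA-cleaving enzyme, a nuclease region or a transcription control region, and the complex serves as a target sequence-specific DNA-cleaving enzyme or transcriptional regulator.
[14] The complex according to [13], wherein the DNA-cleaving enzyme is the nuclease region of FokI (SEQ ID NO: 6).
[15] A method for modifying the genetic material of a cell, comprising the following steps:
a step of preparing a cell containing DNA having a target sequence; and
a step of introducing the complex according to [11] into the cell so that the region of the complex consisting of the protein binds to the DNA having the target sequence, whereby the functional region modifies the DNA having the target sequence.
[16] A method for identifying, recognizing or targeting a DNA base or a DNA having a specific base sequence using a PPR protein containing one or more PPR motifs.
[17] The method according to [16], wherein the protein contains one or more PPR motifs in which three of the constituent amino acids form a specific combination of amino acids.
[18] The method according to [16] or [17], wherein the protein contains one or more PPR motifs (Mn) as defined in [1].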
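According to the present invention, PPR motifs capable of binding to a given DNA base, and proteins containing them, are provided. By arranging a plurality of PPR motifs, a protein that binds to a target DNA of any sequence and length is provided.
According to the present invention, the target DNA of any PPR protein can be predicted and identified, and conversely a PPR protein that binds to a given DNA can be predicted and identified. Predicting the target DNA sequence clarifies the genetic identity of the gene and can broaden its usefulness. Furthermore, according to the present invention, for genes of industrially useful PPR proteins, the functions of homologous genes carrying various amino acid polymorphisms can be assessed from differences in their target DNA sequences.
The present invention also provides a new DNA-cleaving enzyme using PPR motifs. That is, by linking a functional region protein to a PPR motif or PPR protein provided by the present invention, a complex can be produced that contains a protein having binding activity for a specific nucleic acid sequence and having a specific function.
The functional region usable in the present invention means a region that confers any one of various functions such as DNA cleavage, transcription, replication, repair, synthesis and modification. By adjusting the sequence of PPR motifs, which is the feature of the present invention, to specify the base sequence of the target DNA, almost any DNA sequence can be used as a target, and genome editing that uses the function of the functional region, such as DNA cleavage, transcription, replication, repair, synthesis or modification, can be realized at that target.
For example, when the functional region has a DNA cleavage function, a complex is provided in which the PPR protein part produced according to the present invention is linked to a DNA cleavage region. Such a complex acts as an artificial DNA-cleaving enzyme that recognizes the base sequence of the target DNA through the PPR protein part and then cleaves the DNA through the DNA cleavage region. When the functional region has a transcription control function, a complex is provided in which the PPR protein part produced according to the present invention is linked to a DNA transcription control region. Such a complex can act as an artificial transcriptional regulator that recognizes the base sequence of the target DNA through the PPR protein part and then promotes transcription of the intended DNA.
Furthermore, the present invention can be used in methods for delivering the complex into a living body and making it function, in the construction of transformants using nucleic acid sequences (DNA, RNA) encoding the proteins obtained by the present invention, and in specific modification, control and conferral of functions in various settings in organisms (cells, tissues, individuals).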
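FIG. 1 shows the conserved sequence and amino acid numbers of the PPR motif. (A) The amino acids constituting the PPR motif as defined in the present invention and their amino acid numbers. (B) Positions in the predicted structure of the three amino acids (No. 1 AA, No. 4 AA and No. "ii"(-2) AA) that control binding base selectivity. (C) Two structural examples of PPR motifs and the positions of the amino acids in the predicted structure in each case. Here, No. 1 AA, No. 4 AA and No. "ii"(-2) AA are shown as magenta sticks (dark gray in black-and-white display) in the protein conformational diagrams.
FIG. 2 summarizes the structural outlines of the DNA-binding PPR proteins that act in DNA metabolism, namely Arabidopsis thaliana p63 (amino acid sequence of SEQ ID NO: 1), GUN1 of Arabidopsis thaliana (SEQ ID NO: 2), pTac2 of Arabidopsis thaliana (SEQ ID NO: 3), DG1 (SEQ ID NO: 4) and GRP23 of Arabidopsis thaliana (SEQ ID NO: 5), together with the outlines of the assay systems that indicated their binding to DNA.
FIG. 3 summarizes the amino acid frequencies at the three positions that carry the nucleic acid recognition code in the PPR motif (No. 1 AA, No. 4 AA and No. "ii"(-2) AA), comparing the PPR motifs of the PPR proteins with suggested DNA-binding ability (SEQ ID NOs: 1 to 5) with known RNA-binding motifs.
FIG. 4a shows, for (A) Arabidopsis thaliana p63 (amino acid sequence of SEQ ID NO: 1) and (B) GUN1 of Arabidopsis thaliana (SEQ ID NO: 2), the positions of the PPR motifs contained in each protein and the positions of the three amino acids that carry the nucleic acid recognition code in the PPR motif (No. 1 AA, No. 4 AA and No. "ii"(-2) AA).
FIG. 4b shows, for (C) pTac2 of Arabidopsis thaliana (amino acid sequence of SEQ ID NO: 3) and (D) DG1 (SEQ ID NO: 4), the positions of the PPR motifs contained in each protein and the positions of the three amino acids that carry the nucleic acid recognition code in the PPR motif (No. 1 AA, No. 4 AA and No. "ii"(-2) AA).
FIG. 4c shows, for (E) GRP23 of Arabidopsis thaliana (amino acid sequence of SEQ ID NO: 5), the positions of the PPR motifs contained in the protein and the positions of the three amino acids that carry the nucleic acid recognition code in the PPR motif (No. 1 AA, No. 4 AA and No. "ii"(-2) AA).
FIG. 5 shows the evaluation of the sequence-specific DNA-binding ability of PPR molecules. Artificial transcription factors were constructed by fusing VP64, a transcriptional activation domain, to three kinds of (putatively) DNA-binding PPR molecules, and it was examined whether they could activate luciferase reporters carrying the respective target sequences in cultured human cells.
FIG. 6 compares luciferase activity for pTac2-VP64 and GUN1-VP64, each introduced either together with the negative control pminCMV-luc2 or together with a reporter vector carrying 4 or 8 copies of the target sequence. In both cases activity tended to rise as the number of target sequences increased, demonstrating that these PPR-VP64 molecules act as site-specific transcriptional activators that bind specifically to their respective target sequences.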
[PPR motifs and PPR proteins]
In the present invention, unless otherwise specified, "PPR motif" means a polypeptide of 30 to 38 amino acids whose amino acid sequence, when analyzed with a web-based protein domain search program (for example, Pfam, Prosite or Uniprot), gives an E value at or below a predetermined value (preferably E-03) for PF01535 in Pfam (http://pfam.sanger.ac.uk/) or for PS51375 in Prosite (http://www.expasy.org/prosite/). PPR motifs in various proteins are also defined in the Uniprot database (http://www.uniprot.org).
Although the amino acid sequence of the PPR motif is poorly conserved, its secondary structure of helix, loop, helix, loop is well conserved, as shown in general formula (1) below.
(Helix A)-X-(Helix B)-L ..... General formula (1)
The position numbers of the amino acids constituting the PPR motif as defined in the present invention follow the inventors' paper (Kobayashi K, et al., Nucleic Acids Res., 40, 2712-2723 (2012)). That is, they are almost identical to the amino acid numbers of PF01535 in Pfam, whereas they correspond to the amino acid numbers of PS51375 in Prosite minus 2 (e.g., No. 1 of the present invention → No. 3 of PS51375), and likewise to the amino acid numbers of the PPR motif defined in Uniprot minus 2.
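The numbering offset described above can be expressed as a small helper. This is only an illustrative sketch: the function names are hypothetical and do not come from the original disclosure, and Pfam numbering is left out because the text says only that it is almost identical.

```python
# Offset stated in the text: position used here = PS51375 (Prosite) position - 2.
# The same offset is stated for the Uniprot PPR motif numbering.
PROSITE_OFFSET = 2


def to_prosite(position):
    """Convert a position number used in this document to PS51375 numbering."""
    return position + PROSITE_OFFSET


def from_prosite(prosite_position):
    """Convert a PS51375 position number to the numbering used in this document."""
    return prosite_position - PROSITE_OFFSET
```

For instance, `to_prosite(1)` gives 3, matching the example in the text (No. 1 of the present invention corresponds to No. 3 of PS51375).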
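Specifically, in the present invention, the No. 1 amino acid is the first amino acid at which Helix A in general formula (1) begins. The No. 4 amino acid is the fourth amino acid counted from the No. 1 amino acid. The No. "ii"(-2) amino acid is defined as follows:
· when the next PPR motif (Mn+1) follows contiguously on the C-terminal side of a PPR motif (Mn) (that is, there is no amino acid insertion between the PPR motifs; for example, Motif Nos. 1, 2, 3, 4, 6 and 7 in Fig. 4a (A)), it means the -2nd amino acid counted from the last (C-terminal) amino acid constituting the PPR motif (Mn);
· when a non-PPR motif portion of 1 to 20 amino acids lies between a PPR motif (Mn) and the next PPR motif (Mn+1) on its C-terminal side (for example, Motif Nos. 5 and 8 in Fig. 4a (A), and Motif Nos. 1, 2, 7 and 8 in Fig. 4c (D)), the amino acid two residues upstream of the No. 1 amino acid of the next PPR motif (Mn+1), i.e., the -2nd amino acid, is taken as the No. "ii"(-2) amino acid (see Fig. 1); and
· when no next PPR motif (Mn+1) exists on the C-terminal side of a PPR motif (Mn) (for example, Motif No. 9 in Fig. 4a (A) and Motif No. 11 in (B)), or when 21 or more amino acids constituting a non-PPR motif lie between it and the next PPR motif (Mn+1) on the C-terminal side, the second amino acid counted from the last (C-terminal) amino acid constituting the PPR motif (Mn) is taken as the No. "ii"(-2) amino acid.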
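The three cases above can be sketched as a single function. This is a minimal illustration under stated assumptions, not the inventors' implementation: motifs are given as hypothetical `(start, end)` pairs of 1-based, inclusive residue positions ordered from N- to C-terminus; "the -2nd (second) amino acid from the last amino acid" is read as the penultimate residue of the motif (`end - 1`); and "two residues upstream of the No. 1 amino acid of the next motif" is read as `next_start - 2`.

```python
def locate_ii_minus2(motifs, index, max_gap=20):
    """Return the residue position of the No. "ii"(-2) amino acid of motifs[index].

    motifs: list of (start, end) tuples, 1-based inclusive, N- to C-terminal order.
    max_gap: largest non-PPR spacer (in residues) still handled by the second case.
    """
    start, end = motifs[index]
    is_last = index == len(motifs) - 1
    if not is_last:
        next_start = motifs[index + 1][0]
        gap = next_start - end - 1  # residues between Mn and Mn+1
        if 1 <= gap <= max_gap:
            # Case 2: spacer of 1 to 20 residues, count back from Mn+1.
            return next_start - 2
    # Case 1 (contiguous), case 3 (no next motif or spacer of 21 or more):
    # penultimate residue of Mn itself.
    return end - 1


# Hypothetical motif coordinates, chosen only to exercise all three cases.
example_motifs = [(10, 44), (45, 79), (85, 119), (150, 184)]
positions = [locate_ii_minus2(example_motifs, i) for i in range(len(example_motifs))]
```

With the hypothetical coordinates above, the first motif is contiguous with the second, the second is followed by a 5-residue spacer, the third by a 30-residue spacer, and the fourth is the last motif.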
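In the present invention, unless otherwise specified, "PPR protein" means a protein having a plurality of the PPR motifs described above. In this specification, unless otherwise specified, "protein" means any substance composed of a polypeptide (a chain of multiple amino acids joined by peptide bonds), including substances composed of relatively low-molecular-weight polypeptides. It will be clear to those skilled in the art from the context that "amino acid" in the present invention may refer either to an ordinary amino acid molecule or to an amino acid residue constituting a peptide chain.
PPR proteins are abundant in plants; in Arabidopsis thaliana, 500 proteins and about 5,000 motifs have been found. PPR motifs and PPR proteins with diverse amino acid sequences are also present in many land plants such as rice, poplar and Selaginella. Some PPR proteins are fertility restoration factors that act in pollen (male gamete) formation and are known to be important genes for obtaining F1 seeds for heterosis. Some PPR proteins similar to those involved in fertility restoration have been shown to act in speciation. It is also known that most PPR proteins act on RNA in mitochondria or chloroplasts.
In animals, an abnormality in the PPR protein identified as LRPPRC is known to cause Leigh syndrome French Canadian (LSFC; Leigh syndrome, subacute necrotizing encephalomyelopathy).
In the present invention, with regard to the binding of a PPR motif to a DNA base, "selective" means, unless otherwise specified, that the binding activity for any one DNA base is higher than the binding activity for the other bases. Those skilled in the art can confirm this selectivity by designing experiments, and it can also be obtained by calculation as disclosed in the Examples of this specification.
In the present invention, unless otherwise specified, "DNA base" means a base of a deoxyribonucleotide constituting DNA, specifically any one of adenine (A), guanine (G), cytosine (C) or thymine (T). Although PPR proteins have selectivity for bases in DNA, they do not bind to nucleic acid monomers.
Before the present invention, methods for searching sequences for amino acids conserved as PPR motifs had been established, but no rule governing selective binding to DNA bases had been found.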
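[Findings provided by the present invention]
The present invention provides the following findings.
(I) Information on the positions of the amino acids important for selective binding
Specifically, in a PPR motif, the first amino acid of Helix A is designated the No. 1 amino acid and the fourth amino acid the No. 4 amino acid, and the No. "ii"(-2) amino acid is taken to be:
· when the next PPR motif (Mn+1) follows contiguously on the C-terminal side of a PPR motif (Mn) (no amino acid insertion between the PPR motifs), the -2nd amino acid counted from the last (C-terminal) amino acid constituting the PPR motif (Mn);
· when a non-PPR motif portion of 1 to 20 amino acids lies between a PPR motif (Mn) and the next PPR motif (Mn+1) on its C-terminal side, the amino acid two residues upstream of the No. 1 amino acid of the next PPR motif (Mn+1), i.e., the -2nd amino acid; or
· when no next PPR motif (Mn+1) exists on the C-terminal side of a PPR motif (Mn), or when 21 or more amino acids constituting a non-PPR motif lie between it and the next PPR motif (Mn+1) on the C-terminal side, the second amino acid counted from the last (C-terminal) amino acid constituting the PPR motif (Mn).
Under these definitions, the combination of the three amino acids consisting of No. 1 AA and No. 4 AA, which are the first and fourth amino acids of Helix A, and No. "ii"(-2) AA as defined above (No. 1 AA, No. 4 AA and No. "ii"(-2) AA) is important for selective binding to a DNA base, and the kind of DNA base that is bound can be determined from this combination.
The present invention is based on the findings made by the inventors concerning the combination of the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA. Specifically: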
(1-1) when No. 4 AA is glycine (G), No. 1 AA may be any amino acid and No. "ii"(-2) AA is aspartic acid (D), asparagine (N) or serine (S); for example, the combination of No. 1 AA and No. "ii"(-2) AA may be
· any amino acid and aspartic acid (D) (*GD),
· preferably glutamic acid (E) and aspartic acid (D) (EGD),
· any amino acid and asparagine (N) (*GN),
· preferably glutamic acid (E) and asparagine (N) (EGN), or
· any amino acid and serine (S) (*GS);
(1-2) when No. 4 AA is isoleucine (I), No. 1 AA and No. "ii"(-2) AA may each be any amino acid; for example, the combination of No. 1 AA and No. "ii"(-2) AA may be
· any amino acid and asparagine (N) (*IN);
(1-3) when No. 4 AA is leucine (L), No. 1 AA and No. "ii"(-2) AA may each be any amino acid; for example, the combination of No. 1 AA and No. "ii"(-2) AA may be
· any amino acid and aspartic acid (D) (*LD), or
· any amino acid and lysine (K) (*LK);
(1-4) when No. 4 AA is methionine (M), No. 1 AA and No. "ii"(-2) AA may each be any amino acid; for example, the combination of No. 1 AA and No. "ii"(-2) AA may be
· any amino acid and aspartic acid (D) (*MD), or
· isoleucine (I) and aspartic acid (D) (IMD);
(1-5) when No. 4 AA is asparagine (N), No. 1 AA and No. "ii"(-2) AA may each be any amino acid; for example, the combination of No. 1 AA and No. "ii"(-2) AA may be
· any amino acid and aspartic acid (D) (*ND),
· any one of phenylalanine (F), glycine (G), isoleucine (I), threonine (T), valine (V) and tyrosine (Y) with aspartic acid (D) (FND, GND, IND, TND, VND or YND),
· any amino acid and asparagine (N) (*NN),
· any one of isoleucine (I), serine (S) and valine (V) with asparagine (N) (INN, SNN or VNN),
· any amino acid and serine (S) (*NS),
· valine (V) and serine (S) (VNS),
· any amino acid and threonine (T) (*NT),
· valine (V) and threonine (T) (VNT),
· any amino acid and tryptophan (W) (*NW), or
· isoleucine (I) and tryptophan (W) (INW);
(1-6) when No. 4 AA is proline (P), No. 1 AA and No. "ii"(-2) AA may each be any amino acid; for example, the combination of No. 1 AA and No. "ii"(-2) AA may be
· any amino acid and aspartic acid (D) (*PD),
· phenylalanine (F) and aspartic acid (D) (FPD), or
· tyrosine (Y) and aspartic acid (D) (YPD);
(1-7) when No. 4 AA is serine (S), No. 1 AA and No. "ii"(-2) AA may each be any amino acid; for example, the combination of No. 1 AA and No. "ii"(-2) AA may be
· any amino acid and asparagine (N) (*SN),
· phenylalanine (F) and asparagine (N) (FSN), or
· valine (V) and asparagine (N) (VSN);
(1-8) when No. 4 AA is threonine (T), No. 1 AA and No. "ii"(-2) AA may each be any amino acid; for example, the combination of No. 1 AA and No. "ii"(-2) AA may be
· any amino acid and aspartic acid (D) (*TD),
· valine (V) and aspartic acid (D) (VTD),
· any amino acid and asparagine (N) (*TN),
· phenylalanine (F) and asparagine (N) (FTN),
· isoleucine (I) and asparagine (N) (ITN), or
· valine (V) and asparagine (N) (VTN);
(1-9) when No. 4 AA is valine (V), No. 1 AA and No. "ii"(-2) AA may each be any amino acid; for example, the combination of No. 1 AA and No. "ii"(-2) AA may be
· isoleucine (I) and aspartic acid (D) (IVD),
· any amino acid and glycine (G) (*VG), or
· any amino acid and threonine (T) (*VT).
(II) Information on the correspondence between combinations of the three amino acids No. 1 AA, No. 4 AA and No. "ii"(-2) AA and DNA bases
Specifically, a protein determined on the following basis has a selective DNA base binding function.
(2-1) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, any amino acid, glycine and aspartic acid, the PPR motif binds selectively to G;
(2-2) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, glutamic acid, glycine and aspartic acid, the PPR motif binds selectively to G;
(2-3) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, any amino acid, glycine and asparagine, the PPR motif binds selectively to A;
(2-4) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, glutamic acid, glycine and asparagine, the PPR motif binds selectively to A;
(2-5) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, any amino acid, glycine and serine, the PPR motif binds selectively to A, and next to C;
(2-6) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, any amino acid, isoleucine and any amino acid, the PPR motif binds selectively to T and C;
(2-7) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, any amino acid, isoleucine and asparagine, the PPR motif binds selectively to T, and next to C;
(2-8) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, any amino acid, leucine and any amino acid, the PPR motif binds selectively to T and C;
(2-9) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, any amino acid, leucine and aspartic acid, the PPR motif binds selectively to C;
(2-10) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, any amino acid, leucine and lysine, the PPR motif binds selectively to T;
(2-11) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, any amino acid, methionine and any amino acid, the PPR motif binds selectively to T;
(2-12) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, any amino acid, methionine and aspartic acid, the PPR motif binds selectively to T;
(2-13) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, isoleucine, methionine and aspartic acid, the PPR motif binds selectively to T, and next to C;
(2-14) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, any amino acid, asparagine and any amino acid, the PPR motif binds selectively to C and T;
(2-15) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, any amino acid, asparagine and aspartic acid, the PPR motif binds selectively to T;
(2-16) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, phenylalanine, asparagine and aspartic acid, the PPR motif binds selectively to T;
(2-17) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, glycine, asparagine and aspartic acid, the PPR motif binds selectively to T;
(2-18) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, isoleucine, asparagine and aspartic acid, the PPR motif binds selectively to T;
(2-19) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, threonine, asparagine and aspartic acid, the PPR motif binds selectively to T;
(2-20) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, valine, asparagine and aspartic acid, the PPR motif binds selectively to T, and next to C;
(2-21) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, tyrosine, asparagine and aspartic acid, the PPR motif binds selectively to T, and next to C;
(2-22) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, any amino acid, asparagine and asparagine, the PPR motif binds selectively to C;
(2-23) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, isoleucine, asparagine and asparagine, the PPR motif binds selectively to C;
(2-24) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, serine, asparagine and asparagine, the PPR motif binds selectively to C;
(2-25) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, valine, asparagine and asparagine, the PPR motif binds selectively to C;
(2-26) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, any amino acid, asparagine and serine, the PPR motif binds selectively to C;
(2-27) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, valine, asparagine and serine, the PPR motif binds selectively to C;
(2-28) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, any amino acid, asparagine and threonine, the PPR motif binds selectively to C;
(2-29) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, valine, asparagine and threonine, the PPR motif binds selectively to C;
(2-30) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, any amino acid, asparagine and tryptophan, the PPR motif binds selectively to C, and next to T;
(2-31) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, isoleucine, asparagine and tryptophan, the PPR motif binds selectively to T, and next to C;
(2-32) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, any amino acid, proline and any amino acid, the PPR motif binds selectively to T;
(2-33) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, any amino acid, proline and aspartic acid, the PPR motif binds selectively to T;
(2-34) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, phenylalanine, proline and aspartic acid, the PPR motif binds selectively to T;
(2-35) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, tyrosine, proline and aspartic acid, the PPR motif binds selectively to T;
(2-36) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, any amino acid, serine and any amino acid, the PPR motif binds selectively to A and G;
(2-37) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, any amino acid, serine and asparagine, the PPR motif binds selectively to A;
(2-38) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, phenylalanine, serine and asparagine, the PPR motif binds selectively to A;
(2-39) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, valine, serine and asparagine, the PPR motif binds selectively to A;
(2-40) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, any amino acid, threonine and any amino acid, the PPR motif binds selectively to A and G;
(2-41) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, any amino acid, threonine and aspartic acid, the PPR motif binds selectively to G;
(2-42) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, valine, threonine and aspartic acid, the PPR motif binds selectively to G;
(2-43) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, any amino acid, threonine and asparagine, the PPR motif binds selectively to A;
(2-44) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, phenylalanine, threonine and asparagine, the PPR motif binds selectively to A;
(2-45) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, isoleucine, threonine and asparagine, the PPR motif binds selectively to A;
(2-46) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, valine, threonine and asparagine, the PPR motif binds selectively to A;
(2-47) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, any amino acid, valine and any amino acid, the PPR motif binds to A, C and T but does not bind to G;
(2-48) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, isoleucine, valine and aspartic acid, the PPR motif binds selectively to C, and next to A;
(2-49) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, any amino acid, valine and glycine, the PPR motif binds selectively to C;
(2-50) when No. 1 AA, No. 4 AA and No. "ii"(-2) AA are, in this order, any amino acid, valine and threonine, the PPR motif binds selectively to T.
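Items (2-1) to (2-50) amount to a lookup table, which can be sketched as follows. This is an illustrative encoding only, not part of the original disclosure: the key format (`*` for "any amino acid"), the function name and the most-specific-match-first fallback order are assumptions. Each value lists the bases in the order the items state them, and the encoding does not distinguish "X, and next Y" from "X and Y".

```python
# Keys: No. 1 AA + No. 4 AA + No. "ii"(-2) AA in one-letter code, "*" = any.
RECOGNITION_CODE = {
    "*GD": "G", "EGD": "G", "*GN": "A", "EGN": "A", "*GS": "AC",   # (2-1)-(2-5)
    "*I*": "TC", "*IN": "TC",                                      # (2-6)-(2-7)
    "*L*": "TC", "*LD": "C", "*LK": "T",                           # (2-8)-(2-10)
    "*M*": "T", "*MD": "T", "IMD": "TC",                           # (2-11)-(2-13)
    "*N*": "CT", "*ND": "T", "FND": "T", "GND": "T", "IND": "T",   # (2-14)-(2-18)
    "TND": "T", "VND": "TC", "YND": "TC",                          # (2-19)-(2-21)
    "*NN": "C", "INN": "C", "SNN": "C", "VNN": "C",                # (2-22)-(2-25)
    "*NS": "C", "VNS": "C", "*NT": "C", "VNT": "C",                # (2-26)-(2-29)
    "*NW": "CT", "INW": "TC",                                      # (2-30)-(2-31)
    "*P*": "T", "*PD": "T", "FPD": "T", "YPD": "T",                # (2-32)-(2-35)
    "*S*": "AG", "*SN": "A", "FSN": "A", "VSN": "A",               # (2-36)-(2-39)
    "*T*": "AG", "*TD": "G", "VTD": "G",                           # (2-40)-(2-42)
    "*TN": "A", "FTN": "A", "ITN": "A", "VTN": "A",                # (2-43)-(2-46)
    "*V*": "ACT", "IVD": "CA", "*VG": "C", "*VT": "T",             # (2-47)-(2-50)
}


def predict_bases(aa1, aa4, aa_ii):
    """Return the listed base(s) for a residue triplet, or None if no item applies.

    Tries the fully specified triplet first, then the entry with any No. 1 AA,
    then the entry with any No. 1 AA and any No. "ii"(-2) AA.
    """
    for key in (aa1 + aa4 + aa_ii, "*" + aa4 + aa_ii, "*" + aa4 + "*"):
        if key in RECOGNITION_CODE:
            return RECOGNITION_CODE[key]
    return None
```

For example, `predict_bases("V", "N", "D")` matches item (2-20) directly, while `predict_bases("A", "N", "D")` falls back to the `*ND` entry of item (2-15).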
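The binding between a specific combination of amino acids at the specified positions and a DNA base can be confirmed experimentally. Experiments for this purpose include preparation of a PPR motif or of a protein containing a plurality of PPR motifs, preparation of substrate DNA, and a binding assay (for example, a gel shift assay). Each of these experiments is well known to those skilled in the art; for more specific procedures and conditions, see, for example, Patent Document 2.
[Use of PPR motifs and PPR proteins]
Identification and design:
One PPR motif recognizes one specific base of DNA, and a plurality of consecutive PPR motifs can recognize consecutive bases in a DNA sequence. According to the present invention, by appropriately selecting the amino acids at the specified positions, PPR motifs selective for each of A, T, G and C can be chosen or designed, and a protein containing an appropriate series of such PPR motifs can recognize the corresponding specific sequence. Accordingly, the present invention makes it possible to predict and identify a naturally occurring PPR protein that binds selectively to DNA having a specific base sequence and, conversely, to predict and identify the DNA that is the binding target of a PPR protein. Predicting and identifying targets helps clarify their genetic identity and is also useful in that it can broaden how the targets may be used.
Furthermore, according to the present invention, it is possible to design a PPR motif capable of binding selectively to a desired DNA base, and a protein having a plurality of PPR motifs capable of binding sequence-specifically to a desired DNA. In the design, for the parts of the PPR motif other than the amino acids at the important positions, reference can be made to the sequence information of naturally occurring PPR motifs in DNA-binding PPR proteins such as those set forth in SEQ ID NOs: 1 to 5. Alternatively, a naturally occurring protein can be used as a whole, with only the amino acids at the relevant positions substituted. The number of PPR motif repeats can be chosen as appropriate for the target sequence, for example 2 or more, 2 to 30, more preferably 5 to 25, and most preferably 9 to 15.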
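The design direction described above (from a target DNA sequence to the residues placed in each motif) might be sketched as below. This is a hedged illustration, not the inventors' procedure: the choice of one representative pair per base is an assumption made here for brevity, taking the pairs from items (2-43), (2-41), (2-22) and (2-15), where No. 1 AA may be any amino acid; the function names and the repeat-count check are likewise hypothetical.

```python
# One representative (No. 4 AA, No. "ii"(-2) AA) pair per target base.
# Illustrative picks only; the text lists several alternatives for each base.
REPRESENTATIVE_PAIR = {
    "A": ("T", "N"),  # item (2-43): *TN binds selectively to A
    "G": ("T", "D"),  # item (2-41): *TD binds selectively to G
    "C": ("N", "N"),  # item (2-22): *NN binds selectively to C
    "T": ("N", "D"),  # item (2-15): *ND binds selectively to T
}


def design_motif_residues(target):
    """Return one (No. 4 AA, No. "ii"(-2) AA) pair per base of the target DNA."""
    target = target.upper()
    unknown = set(target) - set(REPRESENTATIVE_PAIR)
    if unknown:
        raise ValueError(f"unsupported base(s): {sorted(unknown)}")
    return [REPRESENTATIVE_PAIR[base] for base in target]


def in_most_preferred_range(target):
    """True if the motif count falls in the 9 to 15 range the text calls most preferable."""
    return 9 <= len(target) <= 15


plan = design_motif_residues("TCTATCACT")  # 9-base example target
```

One motif is planned per base, so the length of the returned list equals the number of PPR motif repeats needed for the target.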
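In the design, amino acids other than the combination of No. 1 AA, No. 4 AA and No. "ii"(-2) AA may also be taken into account. For example, the choice of the No. 8 and No. 12 amino acids described in Patent Document 2 above can be important for exhibiting DNA-binding activity. According to the inventors' studies, the No. 8 amino acid of a given PPR motif and the No. 12 amino acid of the same PPR motif may cooperate in DNA binding. The No. 8 amino acid may be a basic amino acid, preferably lysine, or an acidic amino acid, preferably aspartic acid, and the No. 12 amino acid may be a basic, neutral or hydrophobic amino acid.
The designed motif or protein can be produced by methods well known to those skilled in the art. That is, the present invention provides PPR motifs that bind selectively to a specific DNA base and PPR proteins that bind specifically to DNA having a specific sequence, both based on the combination of No. 1 AA, No. 4 AA and No. "ii"(-2) AA. Such motifs and proteins can be produced, even in relatively large quantities, by methods well known to those skilled in the art; such methods may include determining, from the amino acid sequence of the desired motif or protein, a nucleic acid sequence encoding it, cloning that sequence, and then constructing a transformant that produces the desired motif or protein.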
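Preparation of complexes and their use:
The PPR motif or PPR protein provided by the present invention can be linked to a functional region to produce a complex. A functional region generally means a portion that has a specific biological function in vivo or in cells, such as an enzymatic, catalytic, inhibitory or enhancing function, or a portion that functions as a label. Such a region is composed of, for example, a protein, a peptide, a nucleic acid, a physiologically active substance or a drug.
In the present invention, linking a functional region to a PPR protein allows the target DNA sequence binding function of the PPR protein to be exerted in combination with the function of the functional region. For example, when a protein having a DNA cleavage function (e.g., a restriction enzyme such as FokI) or its nuclease domain is used as the functional region, the complex functions as an artificial DNA-cleaving enzyme.
Such a complex can be produced by methods generally available in the art; known methods include synthesizing the complex as a single protein molecule, and synthesizing multiple protein components separately and then joining those components to form the complex.
As one example, when the complex is synthesized as a single protein molecule, a protein complex in which a cleavage enzyme is fused to the C-terminus of the PPR protein via an amino acid linker is designed, an expression vector construct for expressing the protein complex is built, and the desired complex can be expressed from that construct. For such a production method, the method described in Japanese Patent Application No. 2011-242250, among others, can be used.
The PPR protein and the functional-region protein may be joined by any linking means known in the art, such as linking via an amino acid linker, linking via a specific affinity such as avidin-biotin, or linking via a chemical linker.
A functional region usable in the present invention means a region capable of conferring any one of various functions such as DNA cleavage, transcription, replication, repair, synthesis and modification. By adjusting the sequence of the PPR motifs, which is the feature of the present invention, so as to define the base sequence of the target DNA, almost any DNA sequence can be used as a target, and genome editing that uses the function of the functional region, such as DNA cleavage, transcription, replication, repair, synthesis or modification, can be realized at that target.
For example, when the function of the functional region is DNA cleavage, a complex is provided in which a PPR protein portion produced according to the present invention is linked to a DNA cleavage domain. Such a complex recognizes the base sequence of the target DNA through the PPR protein portion and then cleaves the DNA through the DNA cleavage domain, thereby functioning as an artificial DNA-cleaving enzyme.
An example of a functional region with a cleavage function usable in the present invention is a deoxyribonuclease (DNase) that acts as an endodeoxyribonuclease. Examples of such DNases include endodeoxyribonucleases such as DNase A (for example, bovine pancreatic ribonuclease A: PDB 2AAS), DNase H and DNase I, as well as restriction enzymes derived from various bacteria (e.g., FokI (SEQ ID NO: 6)) and their nuclease domains. Such a complex containing a PPR protein and a functional region does not exist in nature and is novel.
When the function of the functional region is transcriptional control, a complex is provided in which a PPR protein portion produced according to the present invention is linked to a DNA transcriptional control domain. Such a complex recognizes the base sequence of the target DNA through the PPR protein portion and can then function as an artificial transcriptional regulator that controls transcription of the desired DNA.
The functional region with a transcriptional control function usable in the present invention may be a region that activates transcription or a region that represses transcription. Examples of transcriptional control domains include VP16, VP64, TA2, STAT-6 and p65. Such a complex containing a PPR protein and a transcriptional control domain does not exist in nature and is novel.
In addition, the complex obtained by the present invention can deliver a functional region in a DNA sequence-specific manner in vivo or in cells and cause it to function. Accordingly, like protein complexes using zinc finger proteins (Non-Patent Documents 1 and 2 cited above) or TAL effectors (Non-Patent Document 3 and Patent Document 1 cited above), it can modify or disrupt DNA in a sequence-specific manner in vivo or in cells, and can confer new functions such as DNA cleavage and genome editing that uses that function. Specifically, a specific DNA sequence can be recognized by a PPR protein in which several PPR motifs, each capable of binding a specific base, are linked. Genome editing of the recognized DNA region can then be realized by using the function of the functional region bound to the PPR protein.
Furthermore, by attaching a drug to a PPR protein that binds DNA sequence-specifically, the drug can be delivered to the vicinity of that DNA sequence. The present invention therefore also provides a method for DNA sequence-specific delivery of a functional substance.
It was found that the PPR proteins used as material in the present invention act in designating the editing site in DNA editing, and that a PPR motif in which specific amino acids are placed at the No. 1 AA, No. 4 AA and No. "ii"(-2) AA residue positions recognizes a specific base on DNA and then exhibits binding activity toward that DNA. On the basis of this property, PPR proteins in which specific amino acids are placed at the No. 1 AA, No. 4 AA and No. "ii"(-2) AA residue positions are each expected to recognize a specific base on DNA and, as a result, to introduce a base polymorphism or to treat a disease or condition caused by a base polymorphism; in combination with other functional regions as described above, they are also expected to contribute to the modification or improvement of functions for cleaving DNA and realizing genome editing.
In addition, an exogenous DNA-cleaving enzyme can be fused to the C-terminus of the PPR protein. A DNA sequence-specific DNA-cleaving enzyme can also be constructed by improving the DNA base selectivity of the PPR motifs on the N-terminal side. A complex to which a label portion such as GFP is linked can also be used to visualize a desired DNA in vivo.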
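Collection of PPR proteins involved in DNA editing and their target sequences
With reference to the information provided in the prior art documents (Non-Patent Documents 11 to 15), the structures and functions of the p63 protein (SEQ ID NO: 1), GUN1 (SEQ ID NO: 2), the pTac2 protein (SEQ ID NO: 3), the DG1 protein (SEQ ID NO: 4) and the GRP23 protein (SEQ ID NO: 5) were analyzed.
For the PPR motif structures in these proteins, the amino acid numbers defined in the present invention were assigned together with the information in the Uniprot database (http://www.uniprot.org/). The PPR motifs contained in the five Arabidopsis thaliana PPR proteins used in the experiments (SEQ ID NOs: 1 to 5) and their amino acid numbers are shown in Figs. 4a to 4c.
Specifically, for the p63 protein (SEQ ID NO: 1), GUN1 (SEQ ID NO: 2), the pTac2 protein (SEQ ID NO: 3), the DG1 protein (SEQ ID NO: 4) and the GRP23 protein (SEQ ID NO: 5) mentioned above, the frequency of occurrence of amino acids at the three positions responsible for the nucleic acid recognition code in the PPR motif (No. 1 AA, No. 4 AA and No. "ii"(-2) AA), which are considered important when RNA is the target, was compared with that of RNA-binding motifs.
The Arabidopsis thaliana p63 protein (SEQ ID NO: 1) has 9 PPR motifs, and the positions of the No. 1 AA, No. 4 AA and No. "ii"(-2) AA residues in its amino acid sequence are summarized in the table below and in Fig. 4a.
Arabidopsis thaliana GUN1 (SEQ ID NO: 2) has 11 PPR motifs, and the positions of the No. 1 AA, No. 4 AA and No. "ii"(-2) AA residues in its amino acid sequence are summarized in the table below and in Fig. 4a.
Arabidopsis thaliana pTac2 (SEQ ID NO: 3) has 15 PPR motifs, and the positions of the No. 1 AA, No. 4 AA and No. "ii"(-2) AA residues in its amino acid sequence are summarized in the table below and in Fig. 4b.
The Arabidopsis thaliana DG1 protein (SEQ ID NO: 4) has 10 PPR motifs, and the positions of the No. 1 AA, No. 4 AA and No. "ii"(-2) AA residues in its amino acid sequence are summarized in the table below and in Fig. 4b.
The Arabidopsis thaliana GRP23 protein (SEQ ID NO: 5) has 11 PPR motifs, and the positions of the No. 1 AA, No. 4 AA and No. "ii"(-2) AA residues in its amino acid sequence are summarized in the table below and in Fig. 4c.
The amino acid frequencies at these positions were determined for each protein and compared with the amino acid frequencies at the same positions in RNA-binding motifs. The results are shown in Fig. 3. The trends in amino acid frequency were found to be nearly identical between the PPR motifs of these PPR proteins for which DNA binding is suggested and the RNA-binding motifs. That is, it was found that PPR proteins acting in DNA binding bind nucleic acid according to the same sequence rules as PPR proteins acting in RNA binding, and that the RNA recognition code described by the inventors in their patent application (PCT/JP2012/077274) can be applied as a DNA recognition code for PPR proteins acting in DNA binding.
With reference to the RNA recognition code of the non-patent document (Yagi, Y. et al., Plos One 2013, 8, e57286), DNA-binding PPR motifs that bind selectively to each base were evaluated. Specifically, the evaluation was made by a chi-square test based on the observed base frequencies (occurrence nucleotide frequency) shown in Tables 6 and 7 (base selectivity of the DNA-binding code) and the expected values (expected nucleotide frequency) calculated from the background frequency. Tests were performed for each base (NT), for purine/pyrimidine (AG or CT; PY), for the hydrogen-bonding group (AT or GC; HB) and for the amino/keto type (AC or GT). The significance threshold was set at P < 0.05 (5.E-02; 5% significance level), and a combination of the No. 1, No. 4 and No. "ii"(-2) amino acids was selected when a significant value was obtained in any one of the tests.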
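The per-base (NT) chi-square test described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the calculation behind Tables 6 and 7: the counts and the uniform background are made-up placeholder values, the function name is hypothetical, and the P value uses the closed-form survival function of the chi-square distribution with 3 degrees of freedom (four base categories), so it applies only to the four-category NT test.

```python
import math


def chi_square_nt(observed, background):
    """Chi-square goodness-of-fit of observed base counts against a background.

    observed:   dict base -> count of aligned bases for one residue combination
    background: dict base -> background frequency (should sum to 1)
    Returns (statistic, p_value) for 3 degrees of freedom.
    """
    total = sum(observed[base] for base in "ACGT")
    statistic = 0.0
    for base in "ACGT":
        expected = total * background[base]  # expected count from background
        statistic += (observed[base] - expected) ** 2 / expected
    # Survival function of chi-square with df = 3:
    # erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2)
    p_value = (math.erfc(math.sqrt(statistic / 2))
               + math.sqrt(2 * statistic / math.pi) * math.exp(-statistic / 2))
    return statistic, p_value


# Placeholder data (hypothetical, for illustration only).
counts = {"A": 2, "C": 3, "G": 1, "T": 24}
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
stat, p = chi_square_nt(counts, uniform)
is_significant = p < 0.05  # 5% significance level
```

The grouped tests (PY, HB, amino/keto) would collapse the counts into two categories and use 1 degree of freedom instead; that variant is not shown here.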
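Tables 6 and 7 list the amino acid combinations that show significant base selectivity. That is, these results mean that a PPR motif having the amino acid species at the No. 1, No. 4 and No. "ii"(-2) positions (shown in the tables as NSRs; 1, 4, ii) for which a significant P value was obtained is a PPR motif that confers base-selective binding ability, and that the larger the 'positive' value after background subtraction, the higher the selectivity for that base. Of the No. 1, No. 4 and No. "ii"(-2) amino acids, the No. 4 amino acid contributes most strongly to base selectivity, the No. "ii"(-2) amino acid contributes next most strongly, and the No. 1 amino acid contributes most weakly of the three.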
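Evaluation of the sequence-specific DNA-binding ability of PPR molecules
In this example, artificial transcription factors were constructed by fusing VP64, a transcriptional activation domain, to three PPR molecules considered to be of the DNA-binding type, namely p63, pTac2 and GUN1, and whether each PPR molecule has sequence-specific DNA-binding ability was investigated by examining whether the factors could activate luciferase reporters carrying their respective target sequences in cultured human cells (Fig. 5).
(Experimental methods)
1. Construction of PPR-VP64 expression vectors
Of the p63, pTac2 and GUN1 coding sequences, only the portions corresponding to the PPR motifs were prepared by artificial synthesis. DNA synthesis was carried out using the artificial gene synthesis service of Biomatik. The pCS2P vector, which carries the CMV promoter, was used as the backbone vector, and the synthesized PPR sequences were inserted into it. In addition, a Flag tag and a nuclear localization signal were inserted at the N-terminus of the PPR sequence, and the VP64 sequence at the C-terminus. The sequences of the constructed p63-VP64, pTac2-VP64 and GUN1-VP64 are shown as SEQ ID NOs: 7 to 9 in the sequence listing.
2. Construction of reporter vectors carrying PPR target sequences
A reporter vector (pminCMV-luc2, SEQ ID NO: 10) was constructed in which the firefly luciferase gene is linked downstream of the minimal CMV promoter and a multi-cloning site is placed upstream of the promoter. The predicted target sequence of each PPR was inserted into the multi-cloning site of this vector. The target sequence of each PPR (TCTATCACT for p63, AACTTTCGTCACTCA for pTac2 and AATTTGTCGAT for GUN1; SEQ ID NOs: 11 to 13 in the sequence listing) was determined by predicting the motif-DNA code of DNA-binding PPRs from the motif-RNA recognition code of RNA-binding PPRs. For each PPR, one vector with four inserted copies of the target sequence and one with eight inserted copies were constructed and used in the analyses below. The base sequence of each vector is shown as SEQ ID NOs: 14 to 19 in the sequence listing.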
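Because one PPR motif is stated to recognize one base, the predicted target sequences above can be checked against the motif counts given earlier for each protein (9, 15 and 11 motifs for p63, pTac2 and GUN1). The following is a small consistency check written for illustration; the dictionary layout and function name are hypothetical.

```python
# (number of PPR motifs stated in the text, predicted target sequence)
CONSTRUCTS = {
    "p63": (9, "TCTATCACT"),
    "pTac2": (15, "AACTTTCGTCACTCA"),
    "GUN1": (11, "AATTTGTCGAT"),
}


def one_base_per_motif(motif_count, target):
    """True if the target is plain DNA and has exactly one base per PPR motif."""
    return set(target) <= set("ACGT") and len(target) == motif_count


check = {name: one_base_per_motif(count, target)
         for name, (count, target) in CONSTRUCTS.items()}
```

All three targets have the same length as the number of motifs in the corresponding protein, which is what a one-motif-one-base code would predict.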
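3. Transfection into HEK293T cells
The PPR-VP64 expression vector constructed in item 1 above, the firefly luciferase expression vector constructed in item 2 above and, as a reference, the pRL-CMV vector from Promega (a Renilla luciferase expression vector) were introduced using Lipofectamine LTX from Life Technologies. To each well of a 96-well plate was added a solution in which 25 μl of DMEM medium was mixed with 400 ng of the PPR-VP64 expression vector, 100 ng of the firefly luciferase expression vector and 20 ng of the pRL-CMV vector. A solution in which 25 μl of DMEM medium was mixed with 0.7 μl of Lipofectamine LTX was then added to each well, and the plate was left at room temperature for 30 minutes, after which 6 × 10^4 HEK293T cells suspended in 100 μl of DMEM medium containing 15% fetal bovine serum were added and cultured for 24 hours in a CO2 incubator at 37°C.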
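The per-well amounts above scale linearly when several wells are prepared together. The sketch below only multiplies the quantities stated in the text; the dictionary keys, the function name and the optional `overage` factor for pipetting loss are assumptions added for illustration and are not part of the described protocol.

```python
# Per-well amounts as stated in the text.
PER_WELL = {
    "DMEM for DNA mix (ul)": 25.0,
    "PPR-VP64 expression vector (ng)": 400.0,
    "firefly luciferase reporter (ng)": 100.0,
    "pRL-CMV reference vector (ng)": 20.0,
    "DMEM for lipid mix (ul)": 25.0,
    "Lipofectamine LTX (ul)": 0.7,
    "cell suspension, 15% FBS DMEM (ul)": 100.0,
    "HEK293T cells": 6e4,
}


def master_mix(n_wells, overage=1.0):
    """Scale the per-well amounts to n_wells, optionally with an overage factor."""
    if n_wells < 1 or overage < 1.0:
        raise ValueError("n_wells must be >= 1 and overage >= 1.0")
    return {name: amount * n_wells * overage for name, amount in PER_WELL.items()}


mix_for_10 = master_mix(10)
```

For 10 wells this gives, for example, 4,000 ng of PPR-VP64 expression vector and 7 μl of Lipofectamine LTX.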
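4. Luciferase assay
Luciferase was assayed using the Dual-Glo Luciferase Assay System from Promega according to the instructions supplied with the kit. Luciferase activity was measured with a TriStar LB 941 plate reader from Berthold.
(Results and discussion)
For each of pTac2-VP64 and GUN1-VP64, luciferase activity was compared between introduction together with the negative control pminCMV-luc2 and introduction together with a reporter vector carrying four or eight copies of the target sequence (table below, Fig. 6). Activity was compared on the basis of a normalized score (Fluc/Rluc) obtained by dividing the measured value for Fluc (firefly luciferase) by the measured value for the reference Rluc (Renilla luciferase). For both constructs, activity tended to rise as the number of target sequences increased. This demonstrated that these PPR-VP64 molecules bind specifically to their respective target sequences and function as site-specific transcription activators.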
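The normalization described above (Fluc divided by Rluc) can be written out as a short sketch. The readings below are hypothetical placeholder numbers, not the measured values behind Fig. 6, and the function and variable names are illustrative.

```python
def normalized_score(fluc, rluc):
    """Normalize a firefly luciferase reading by the Renilla reference reading."""
    if rluc <= 0:
        raise ValueError("Renilla reading must be positive")
    return fluc / rluc


# Hypothetical (Fluc, Rluc) readings keyed by number of target-sequence copies;
# 0 stands for the negative control reporter without target sequence.
readings = {0: (1200.0, 6000.0), 4: (4800.0, 6000.0), 8: (9000.0, 6000.0)}

scores = {copies: normalized_score(fluc, rluc)
          for copies, (fluc, rluc) in readings.items()}

# The trend reported in the text: activity rises with the number of target copies.
rises_with_copies = scores[0] < scores[4] < scores[8]
```

Dividing by the co-transfected Renilla signal corrects each well for differences in transfection efficiency and cell number before conditions are compared.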
<110> Kyushu University
Hiroshima University
<120> DNA binding proteins using PPR motifs and use thereof
<130> FP 1504-SIKs-JP
<150> JP 2013-089840
<151> 2013-04-22
<160> 19
<170> KopatentIn 2.0
<210> 1
<211> 596
<212> PRT
<213> Arabidopsis thaliana
<400> 1
Met Phe Ala Leu Ser Lys Val Leu Arg Arg Thr Gln Arg Leu Arg Leu
1 5 10 15
Gly Ala Cys Ser Ala Val Phe Ser Lys Asp Ile Gln Leu Gly Gly Glu
20 25 30
Arg Ser Phe Asp Ser Asn Ser Ile Ala Ser Thr Lys Arg Glu Ala Val
35 40 45
Pro Arg Phe Tyr Glu Ile Ser Ser Leu Ser Asn Arg Ala Leu Ser Ser
50 55 60
Ser Ala Gly Thr Lys Ser Asp Gln Glu Glu Asp Asp Leu Glu Asp Gly
65 70 75 80
Phe Ser Glu Leu Glu Gly Ser Lys Ser Gly Gln Gly Ser Thr Ser Ser
85 90 95
Asp Glu Asp Glu Gly Lys Leu Ser Ala Asp Glu Glu Glu Glu Glu Glu
100 105 110
Leu Asp Leu Ile Glu Thr Asp Val Ser Arg Lys Thr Val Glu Lys Lys
115 120 125
Gln Ser Glu Leu Phe Lys Thr Ile Val Ser Ala Pro Gly Leu Ser Ile
130 135 140
Gly Ser Ala Leu Asp Lys Trp Val Glu Glu Gly Asn Glu Ile Thr Arg
145 150 155 160
Val Glu Ile Ala Lys Ala Met Leu Gln Leu Arg Arg Arg Arg Met Tyr
165 170 175
Gly Arg Ala Leu Gln Met Ser Glu Trp Leu Glu Ala Asn Lys Lys Ile
180 185 190
Glu Met Thr Glu Arg Asp Tyr Ala Ser Arg Leu Asp Leu Thr Val Lys
195 200 205
Ile Arg Gly Leu Glu Lys Gly Glu Ala Cys Met Gln Lys Ile Pro Lys
210 215 220
Ser Phe Lys Gly Glu Val Leu Tyr Arg Thr Leu Leu Ala Asn Cys Val
225 230 235 240
Ala Ala Gly Asn Val Lys Lys Ser Glu Leu Val Phe Asn Lys Met Lys
245 250 255
Asp Leu Gly Phe Pro Leu Ser Gly Phe Thr Cys Asp Gln Met Leu Leu
260 265 270
Leu His Lys Arg Ile Asp Arg Lys Lys Ile Ala Asp Val Leu Leu Leu
275 280 285
Met Glu Lys Glu Asn Ile Lys Pro Ser Leu Leu Thr Tyr Lys Ile Leu
290 295 300
Ile Asp Val Lys Gly Ala Thr Asn Asp Ile Ser Gly Met Glu Gln Ile
305 310 315 320
Leu Glu Thr Met Lys Asp Glu Gly Val Glu Leu Asp Phe Gln Thr Gln
325 330 335
Ala Leu Thr Ala Arg His Tyr Ser Gly Ala Gly Leu Lys Asp Lys Ala
340 345 350
Glu Lys Val Leu Lys Glu Met Glu Gly Glu Ser Leu Glu Ala Asn Arg
355 360 365
Arg Ala Phe Lys Asp Leu Leu Ser Ile Tyr Ala Ser Leu Gly Arg Glu
370 375 380
Asp Glu Val Lys Arg Ile Trp Lys Ile Cys Glu Ser Lys Pro Tyr Phe
385 390 395 400
Glu Glu Ser Leu Ala Ala Ile Gln Ala Phe Gly Lys Leu Asn Lys Val
405 410 415
Gln Glu Ala Glu Ala Ile Phe Glu Lys Ile Val Lys Met Asp Arg Arg
420 425 430
Ala Ser Ser Ser Thr Tyr Ser Val Leu Leu Arg Val Tyr Val Asp His
435 440 445
Lys Met Leu Ser Lys Gly Lys Asp Leu Val Lys Arg Met Ala Glu Ser
450 455 460
Gly Cys Arg Ile Glu Ala Thr Thr Trp Asp Ala Leu Ile Lys Leu Tyr
465 470 475 480
Val Glu Ala Gly Glu Val Glu Lys Ala Asp Ser Leu Leu Asp Lys Ala
485 490 495
Ser Lys Gln Ser His Thr Lys Leu Met Met Asn Ser Phe Met Tyr Ile
500 505 510
Met Asp Glu Tyr Ser Lys Arg Gly Asp Val His Asn Thr Glu Lys Ile
515 520 525
Phe Leu Lys Met Arg Glu Ala Gly Tyr Thr Ser Arg Leu Arg Gln Phe
530 535 540
Gln Ala Leu Met Gln Ala Tyr Ile Asn Ala Lys Ser Pro Ala Tyr Gly
545 550 555 560
Met Arg Asp Arg Leu Lys Ala Asp Asn Ile Phe Pro Asn Lys Ser Met
565 570 575
Ala Ala Gln Leu Ala Gln Gly Asp Pro Phe Lys Lys Thr Ala Ile Ser
580 585 590
Asp Ile Leu Asp
595
<210> 2
<211> 918
<212> PRT
<213> Arabidopsis thaliana
<400> 2
Met Ala Ser Thr Pro Pro His Trp Val Thr Thr Thr Asn Asn His Arg
1 5 10 15
Pro Trp Leu Pro Gln Arg Pro Arg Pro Gly Arg Ser Val Thr Ser Ala
20 25 30
Pro Pro Ser Ser Ser Ala Ser Val Ser Ser Ala His Leu Ser Gln Thr
35 40 45
Thr Pro Asn Phe Ser Pro Leu Gln Thr Pro Lys Ser Asp Phe Ser Gly
50 55 60
Arg Gln Ser Thr Arg Phe Val Ser Pro Ala Thr Asn Asn His Arg Gln
65 70 75 80
Thr Arg Gln Asn Pro Asn Tyr Asn His Arg Pro Tyr Gly Ala Ser Ser
85 90 95
Ser Pro Arg Gly Ser Ala Pro Pro Pro Ser Ser Val Ala Thr Val Ala
100 105 110
Pro Ala Gln Leu Ser Gln Pro Pro Asn Phe Ser Pro Leu Gln Thr Pro
115 120 125
Lys Ser Asp Leu Ser Ser Asp Phe Ser Gly Arg Arg Ser Thr Arg Phe
130 135 140
Val Ser Lys Met His Phe Gly Arg Gln Lys Thr Thr Met Ala Thr Arg
145 150 155 160
His Ser Ser Ala Ala Glu Asp Ala Leu Gln Asn Ala Ile Asp Phe Ser
165 170 175
Gly Asp Asp Glu Met Phe His Ser Leu Met Leu Ser Phe Glu Ser Lys
180 185 190
Leu Cys Gly Ser Asp Asp Cys Thr Tyr Ile Ile Arg Glu Leu Gly Asn
195 200 205
Arg Asn Glu Cys Asp Lys Ala Val Gly Phe Tyr Glu Phe Ala Val Lys
210 215 220
Arg Glu Arg Arg Lys Asn Glu Gln Gly Lys Leu Ala Ser Ala Met Ile
225 230 235 240
Ser Thr Leu Gly Arg Tyr Gly Lys Val Thr Ile Ala Lys Arg Ile Phe
245 250 255
Glu Thr Ala Phe Ala Gly Gly Tyr Gly Asn Thr Val Tyr Ala Phe Ser
260 265 270
Ala Leu Ile Ser Ala Tyr Gly Arg Ser Gly Leu His Glu Glu Ala Ile
275 280 285
Ser Val Phe Asn Ser Met Lys Glu Tyr Gly Leu Arg Pro Asn Leu Val
290 295 300
Thr Tyr Asn Ala Val Ile Asp Ala Cys Gly Lys Gly Gly Met Glu Phe
305 310 315 320
Lys Gln Val Ala Lys Phe Phe Asp Glu Met Gln Arg Asn Gly Val Gln
325 330 335
Pro Asp Arg Ile Thr Phe Asn Ser Leu Leu Ala Val Cys Ser Arg Gly
340 345 350
Gly Leu Trp Glu Ala Ala Arg Asn Leu Phe Asp Glu Met Thr Asn Arg
355 360 365
Arg Ile Glu Gln Asp Val Phe Ser Tyr Asn Thr Leu Leu Asp Ala Ile
370 375 380
Cys Lys Gly Gly Gln Met Asp Leu Ala Phe Glu Ile Leu Ala Gln Met
385 390 395 400
Pro Val Lys Arg Ile Met Pro Asn Val Val Ser Tyr Ser Thr Val Ile
405 410 415
Asp Gly Phe Ala Lys Ala Gly Arg Phe Asp Glu Ala Leu Asn Leu Phe
420 425 430
Gly Glu Met Arg Tyr Leu Gly Ile Ala Leu Asp Arg Val Ser Tyr Asn
435 440 445
Thr Leu Leu Ser Ile Tyr Thr Lys Val Gly Arg Ser Glu Glu Ala Leu
450 455 460
Asp Ile Leu Arg Glu Met Ala Ser Val Gly Ile Lys Lys Asp Val Val
465 470 475 480
Thr Tyr Asn Ala Leu Leu Gly Gly Tyr Gly Lys Gln Gly Lys Tyr Asp
485 490 495
Glu Val Lys Lys Val Phe Thr Glu Met Lys Arg Glu His Val Leu Pro
500 505 510
Asn Leu Leu Thr Tyr Ser Thr Leu Ile Asp Gly Tyr Ser Lys Gly Gly
515 520 525
Leu Tyr Lys Glu Ala Met Glu Ile Phe Arg Glu Phe Lys Ser Ala Gly
530 535 540
Leu Arg Ala Asp Val Val Leu Tyr Ser Ala Leu Ile Asp Ala Leu Cys
545 550 555 560
Lys Asn Gly Leu Val Gly Ser Ala Val Ser Leu Ile Asp Glu Met Thr
565 570 575
Lys Glu Gly Ile Ser Pro Asn Val Val Thr Tyr Asn Ser Ile Ile Asp
580 585 590
Ala Phe Gly Arg Ser Ala Thr Met Asp Arg Ser Ala Asp Tyr Ser Asn
595 600 605
Gly Gly Ser Leu Pro Phe Ser Ser Ser Ala Leu Ser Ala Leu Thr Glu
610 615 620
Thr Glu Gly Asn Arg Val Ile Gln Leu Phe Gly Gln Leu Thr Thr Glu
625 630 635 640
Ser Asn Asn Arg Thr Thr Lys Asp Cys Glu Glu Gly Met Gln Glu Leu
645 650 655
Ser Cys Ile Leu Glu Val Phe Arg Lys Met His Gln Leu Glu Ile Lys
660 665 670
Pro Asn Val Val Thr Phe Ser Ala Ile Leu Asn Ala Cys Ser Arg Cys
675 680 685
Asn Ser Phe Glu Asp Ala Ser Met Leu Leu Glu Glu Leu Arg Leu Phe
690 695 700
Asp Asn Lys Val Tyr Gly Val Val His Gly Leu Leu Met Gly Gln Arg
705 710 715 720
Glu Asn Val Trp Leu Gln Ala Gln Ser Leu Phe Asp Lys Val Asn Glu
725 730 735
Met Asp Gly Ser Thr Ala Ser Ala Phe Tyr Asn Ala Leu Thr Asp Met
740 745 750
Leu Trp His Phe Gly Gln Lys Arg Gly Ala Glu Leu Val Ala Leu Glu
755 760 765
Gly Arg Ser Arg Gln Val Trp Glu Asn Val Trp Ser Asp Ser Cys Leu
770 775 780
Asp Leu His Leu Met Ser Ser Gly Ala Ala Arg Ala Met Val His Ala
785 790 795 800
Trp Leu Leu Asn Ile Arg Ser Ile Val Tyr Glu Gly His Glu Leu Pro
805 810 815
Lys Val Leu Ser Ile Leu Thr Gly Trp Gly Lys His Ser Lys Val Val
820 825 830
Gly Asp Gly Ala Leu Arg Arg Ala Val Glu Val Leu Leu Arg Gly Met
835 840 845
Asp Ala Pro Phe His Leu Ser Lys Cys Asn Met Gly Arg Phe Thr Ser
850 855 860
Ser Gly Ser Val Val Ala Thr Trp Leu Arg Glu Ser Ala Thr Leu Lys
865 870 875 880
Leu Leu Ile Leu His Asp His Ile Thr Thr Ala Thr Ala Thr Thr Thr
885 890 895
Thr Met Lys Ser Thr Asp Gln Gln Gln Arg Lys Gln Thr Ser Phe Ala
900 905 910
Leu Gln Pro Leu Leu Leu
915
<210> 3
<211> 862
<212> PRT
<213> Arabidopsis thaliana
<400> 3
Met Asn Leu Ala Ile Pro Asn Pro Asn Ser His His Leu Ser Phe Leu
1 5 10 15
Ile Gln Asn Ser Ser Phe Ile Gly Asn Arg Arg Phe Ala Asp Gly Asn
20 25 30
Arg Leu Arg Phe Leu Ser Gly Gly Asn Arg Lys Pro Cys Ser Phe Ser
35 40 45
Gly Lys Ile Lys Ala Lys Thr Lys Asp Leu Val Leu Gly Asn Pro Ser
50 55 60
Val Ser Val Glu Lys Gly Lys Tyr Ser Tyr Asp Val Glu Ser Leu Ile
65 70 75 80
Asn Lys Leu Ser Ser Leu Pro Pro Arg Gly Ser Ile Ala Arg Cys Leu
85 90 95
Asp Ile Phe Lys Asn Lys Leu Ser Leu Asn Asp Phe Ala Leu Val Phe
100 105 110
Lys Glu Phe Ala Gly Arg Gly Asp Trp Gln Arg Ser Leu Arg Leu Phe
115 120 125
Lys Tyr Met Gln Arg Gln Ile Trp Cys Lys Pro Asn Glu His Ile Tyr
130 135 140
Thr Ile Met Ile Ser Leu Leu Gly Arg Glu Gly Leu Leu Asp Lys Cys
145 150 155 160
Leu Glu Val Phe Asp Glu Met Pro Ser Gln Gly Val Ser Arg Ser Val
165 170 175
Phe Ser Tyr Thr Ala Leu Ile Asn Ala Tyr Gly Arg Asn Gly Arg Tyr
180 185 190
Glu Thr Ser Leu Glu Leu Leu Asp Arg Met Lys Asn Glu Lys Ile Ser
195 200 205
Pro Ser Ile Leu Thr Tyr Asn Thr Val Ile Asn Ala Cys Ala Arg Gly
210 215 220
Gly Leu Asp Trp Glu Gly Leu Leu Gly Leu Phe Ala Glu Met Arg His
225 230 235 240
Glu Gly Ile Gln Pro Asp Ile Val Thr Tyr Asn Thr Leu Leu Ser Ala
245 250 255
Cys Ala Ile Arg Gly Leu Gly Asp Glu Ala Glu Met Val Phe Arg Thr
260 265 270
Met Asn Asp Gly Gly Ile Val Pro Asp Leu Thr Thr Tyr Ser His Leu
275 280 285
Val Glu Thr Phe Gly Lys Leu Arg Arg Leu Glu Lys Val Cys Asp Leu
290 295 300
Leu Gly Glu Met Ala Ser Gly Gly Ser Leu Pro Asp Ile Thr Ser Tyr
305 310 315 320
Asn Val Leu Leu Glu Ala Tyr Ala Lys Ser Gly Ser Ile Lys Glu Ala
325 330 335
Met Gly Val Phe His Gln Met Gln Ala Ala Gly Cys Thr Pro Asn Ala
340 345 350
Asn Thr Tyr Ser Val Leu Leu Asn Leu Phe Gly Gln Ser Gly Arg Tyr
355 360 365
Asp Asp Val Arg Gln Leu Phe Leu Glu Met Lys Ser Ser Asn Thr Asp
370 375 380
Pro Asp Ala Ala Thr Tyr Asn Ile Leu Ile Glu Val Phe Gly Glu Gly
385 390 395 400
Gly Tyr Phe Lys Glu Val Val Thr Leu Phe His Asp Met Val Glu Glu
405 410 415
Asn Ile Glu Pro Asp Met Glu Thr Tyr Glu Gly Ile Ile Phe Ala Cys
420 425 430
Gly Lys Gly Gly Leu His Glu Asp Ala Arg Lys Ile Leu Gln Tyr Met
435 440 445
Thr Ala Asn Asp Ile Val Pro Ser Ser Lys Ala Tyr Thr Gly Val Ile
450 455 460
Glu Ala Phe Gly Gln Ala Ala Leu Tyr Glu Glu Ala Leu Val Ala Phe
465 470 475 480
Asn Thr Met His Glu Val Gly Ser Asn Pro Ser Ile Glu Thr Phe His
485 490 495
Ser Leu Leu Tyr Ser Phe Ala Arg Gly Gly Leu Val Lys Glu Ser Glu
500 505 510
Ala Ile Leu Ser Arg Leu Val Asp Ser Gly Ile Pro Arg Asn Arg Asp
515 520 525
Thr Phe Asn Ala Gln Ile Glu Ala Tyr Lys Gln Gly Gly Lys Phe Glu
530 535 540
Glu Ala Val Lys Thr Tyr Val Asp Met Glu Lys Ser Arg Cys Asp Pro
545 550 555 560
Asp Glu Arg Thr Leu Glu Ala Val Leu Ser Val Tyr Ser Phe Ala Arg
565 570 575
Leu Val Asp Glu Cys Arg Glu Gln Phe Glu Glu Met Lys Ala Ser Asp
580 585 590
Ile Leu Pro Ser Ile Met Cys Tyr Cys Met Met Leu Ala Val Tyr Gly
595 600 605
Lys Thr Glu Arg Trp Asp Asp Val Asn Glu Leu Leu Glu Glu Met Leu
610 615 620
Ser Asn Arg Val Ser Asn Ile His Gln Val Ile Gly Gln Met Ile Lys
625 630 635 640
Gly Asp Tyr Asp Asp Asp Ser Asn Trp Gln Ile Val Glu Tyr Val Leu
645 650 655
Asp Lys Leu Asn Ser Glu Gly Cys Gly Leu Gly Ile Arg Phe Tyr Asn
660 665 670
Ala Leu Leu Asp Ala Leu Trp Trp Leu Gly Gln Lys Glu Arg Ala Ala
675 680 685
Arg Val Leu Asn Glu Ala Thr Lys Arg Gly Leu Phe Pro Glu Leu Phe
690 695 700
Arg Lys Asn Lys Leu Val Trp Ser Val Asp Val His Arg Met Ser Glu
705 710 715 720
Gly Gly Met Tyr Thr Ala Leu Ser Val Trp Leu Asn Asp Ile Asn Asp
725 730 735
Met Leu Leu Lys Gly Asp Leu Pro Gln Leu Ala Val Val Val Ser Val
740 745 750
Arg Gly Gln Leu Glu Lys Ser Ser Ala Ala Arg Glu Ser Pro Ile Ala
755 760 765
Lys Ala Ala Phe Ser Phe Leu Gln Asp His Val Ser Ser Ser Phe Ser
770 775 780
Phe Thr Gly Trp Asn Gly Gly Arg Ile Met Cys Gln Arg Ser Gln Leu
785 790 795 800
Lys Gln Leu Leu Ser Thr Lys Glu Pro Thr Ser Glu Glu Ser Glu Asn
805 810 815
Lys Asn Leu Val Ala Leu Ala Asn Ser Pro Ile Phe Ala Ala Gly Thr
820 825 830
Arg Ala Ser Thr Ser Ser Asp Thr Asn His Ser Gly Asn Pro Thr Gln
835 840 845
Arg Arg Thr Arg Thr Lys Lys Glu Leu Ala Gly Ser Thr Ala
850 855 860
<210> 4
<211> 798
<212> PRT
<213> Arabidopsis thaliana
<400> 4
Met Asp Ala Ser Val Val Arg Phe Ser Gln Ser Pro Ala Arg Val Pro
1 5 10 15
Pro Glu Phe Glu Pro Asp Met Glu Lys Ile Lys Arg Arg Leu Leu Lys
20 25 30
Tyr Gly Val Asp Pro Thr Pro Lys Ile Leu Asn Asn Leu Arg Lys Lys
35 40 45
Glu Ile Gln Lys His Asn Arg Arg Thr Lys Arg Glu Thr Glu Ser Glu
50 55 60
Ala Glu Val Tyr Thr Glu Ala Gln Lys Gln Ser Met Glu Glu Glu Ala
65 70 75 80
Arg Phe Gln Thr Leu Arg Arg Glu Tyr Lys Gln Phe Thr Arg Ser Ile
85 90 95
Ser Gly Lys Arg Gly Gly Asp Val Gly Leu Met Val Gly Asn Pro Trp
100 105 110
Glu Gly Ile Glu Arg Val Lys Leu Lys Glu Leu Val Ser Gly Val Arg
115 120 125
Arg Glu Glu Val Ser Ala Gly Glu Leu Lys Lys Glu Asn Leu Lys Glu
130 135 140
Leu Lys Lys Ile Leu Glu Lys Asp Leu Arg Trp Val Leu Asp Asp Asp
145 150 155 160
Val Asp Val Glu Glu Phe Asp Leu Asp Lys Glu Phe Asp Pro Ala Lys
165 170 175
Arg Trp Arg Asn Glu Gly Glu Ala Val Arg Val Leu Val Asp Arg Leu
180 185 190
Ser Gly Arg Glu Ile Asn Glu Lys His Trp Lys Phe Val Arg Met Met
195 200 205
Asn Gln Ser Gly Leu Gln Phe Thr Glu Asp Gln Met Leu Lys Ile Val
210 215 220
Asp Arg Leu Gly Arg Lys Gln Ser Trp Lys Gln Ala Ser Ala Val Val
225 230 235 240
His Trp Val Tyr Ser Asp Lys Lys Arg Lys His Leu Arg Ser Arg Phe
245 250 255
Val Tyr Thr Lys Leu Leu Ser Val Leu Gly Phe Ala Arg Arg Pro Gln
260 265 270
Glu Ala Leu Gln Ile Phe Asn Gln Met Leu Gly Asp Arg Gln Leu Tyr
275 280 285
Pro Asp Met Ala Ala Tyr His Cys Ile Ala Val Thr Leu Gly Gln Ala
290 295 300
Gly Leu Leu Lys Glu Leu Leu Lys Val Ile Glu Arg Met Arg Gln Lys
305 310 315 320
Pro Thr Lys Leu Thr Lys Asn Leu Arg Gln Lys Asn Trp Asp Pro Val
325 330 335
Leu Glu Pro Asp Leu Val Val Tyr Asn Ala Ile Leu Asn Ala Cys Val
340 345 350
Pro Thr Leu Gln Trp Lys Ala Val Ser Trp Val Phe Val Glu Leu Arg
355 360 365
Lys Asn Gly Leu Arg Pro Asn Gly Ala Thr Tyr Gly Leu Ala Met Glu
370 375 380
Val Met Leu Glu Ser Gly Lys Phe Asp Arg Val His Asp Phe Phe Arg
385 390 395 400
Lys Met Lys Ser Ser Gly Glu Ala Pro Lys Ala Ile Thr Tyr Lys Val
405 410 415
Leu Val Arg Ala Leu Trp Arg Glu Gly Lys Ile Glu Glu Ala Val Glu
420 425 430
Ala Val Arg Asp Met Glu Gln Lys Gly Val Ile Gly Thr Gly Ser Val
435 440 445
Tyr Tyr Glu Leu Ala Cys Cys Leu Cys Asn Asn Gly Arg Trp Cys Asp
450 455 460
Ala Met Leu Glu Val Gly Arg Met Lys Arg Leu Glu Asn Cys Arg Pro
465 470 475 480
Leu Glu Ile Thr Phe Thr Gly Leu Ile Ala Ala Ser Leu Asn Gly Gly
485 490 495
His Val Asp Asp Cys Met Ala Ile Phe Gln Tyr Met Lys Asp Lys Cys
500 505 510
Asp Pro Asn Ile Gly Thr Ala Asn Met Met Leu Lys Val Tyr Gly Arg
515 520 525
Asn Asp Met Phe Ser Glu Ala Lys Glu Leu Phe Glu Glu Ile Val Ser
530 535 540
Arg Lys Glu Thr His Leu Val Pro Asn Glu Tyr Thr Tyr Ser Phe Met
545 550 555 560
Leu Glu Ala Ser Ala Arg Ser Leu Gln Trp Glu Tyr Phe Glu His Val
565 570 575
Tyr Gln Thr Met Val Leu Ser Gly Tyr Gln Met Asp Gln Thr Lys His
580 585 590
Ala Ser Met Leu Ile Glu Ala Ser Arg Ala Gly Lys Trp Ser Leu Leu
595 600 605
Glu His Ala Phe Asp Ala Val Leu Glu Asp Gly Glu Ile Pro His Pro
610 615 620
Leu Phe Phe Thr Glu Leu Leu Cys His Ala Thr Ala Lys Gly Asp Phe
625 630 635 640
Gln Arg Ala Ile Thr Leu Ile Asn Thr Val Ala Leu Ala Ser Phe Gln
645 650 655
Ile Ser Glu Glu Glu Trp Thr Asp Leu Phe Glu Glu His Gln Asp Trp
660 665 670
Leu Thr Gln Asp Asn Leu His Lys Leu Ser Asp His Leu Ile Glu Cys
675 680 685
Asp Tyr Val Ser Glu Pro Thr Val Ser Asn Leu Ser Lys Ser Leu Lys
690 695 700
Ser Arg Cys Gly Ser Ser Ser Ser Ser Ala Gln Pro Leu Leu Ala Val
705 710 715 720
Asp Val Thr Thr Gln Ser Gln Gly Glu Lys Pro Glu Glu Asp Leu Leu
725 730 735
Leu Gln Asp Thr Thr Met Glu Asp Asp Asn Ser Ala Asn Gly Glu Ala
740 745 750
Trp Glu Phe Thr Glu Thr Glu Leu Glu Thr Leu Gly Leu Glu Glu Leu
755 760 765
Glu Ile Asp Asp Asp Glu Glu Ser Ser Asp Ser Asp Ser Leu Ser Val
770 775 780
Tyr Asp Ile Leu Lys Glu Trp Glu Glu Ser Ser Lys Lys Glu
785 790 795
<210> 5
<211> 913
<212> PRT
<213> Arabidopsis thaliana
<400> 5
Met Ser Leu Ser His Leu Leu Arg Arg Leu Cys Thr Thr Thr Thr Thr
1 5 10 15
Thr Arg Ser Pro Leu Ser Ile Ser Phe Leu His Gln Arg Ile His Asn
20 25 30
Ile Ser Leu Ser Pro Ala Asn Glu Asp Pro Glu Thr Thr Thr Gly Asn
35 40 45
Asn Gln Asp Ser Glu Lys Tyr Pro Asn Leu Asn Pro Ile Pro Asn Asp
50 55 60
Pro Ser Gln Phe Gln Ile Pro Gln Asn His Thr Pro Pro Ile Pro Tyr
65 70 75 80
Pro Pro Ile Pro His Arg Thr Met Ala Phe Ser Ser Ala Glu Glu Ala
85 90 95
Ala Ala Glu Arg Arg Arg Arg Lys Arg Arg Leu Arg Ile Glu Pro Pro
100 105 110
Leu His Ala Leu Arg Arg Asp Pro Ser Ala Pro Pro Pro Lys Arg Asp
115 120 125
Pro Asn Ala Pro Arg Leu Pro Asp Ser Thr Ser Ala Leu Val Gly Gln
130 135 140
Arg Leu Asn Leu His Asn Arg Val Gln Ser Leu Ile Arg Ala Ser Asp
145 150 155 160
Leu Asp Ala Ala Ser Lys Leu Ala Arg Gln Ser Val Phe Ser Asn Thr
165 170 175
Arg Pro Thr Val Phe Thr Cys Asn Ala Ile Ile Ala Ala Met Tyr Arg
180 185 190
Ala Lys Arg Tyr Ser Glu Ser Ile Ser Leu Phe Gln Tyr Phe Phe Lys
195 200 205
Gln Ser Asn Ile Val Pro Asn Val Val Ser Tyr Asn Gln Ile Ile Asn
210 215 220
Ala His Cys Asp Glu Gly Asn Val Asp Glu Ala Leu Glu Val Tyr Arg
225 230 235 240
His Ile Leu Ala Asn Ala Pro Phe Ala Pro Ser Ser Val Thr Tyr Arg
245 250 255
His Leu Thr Lys Gly Leu Val Gln Ala Gly Arg Ile Gly Asp Ala Ala
260 265 270
Ser Leu Leu Arg Glu Met Leu Ser Lys Gly Gln Ala Ala Asp Ser Thr
275 280 285
Val Tyr Asn Asn Leu Ile Arg Gly Tyr Leu Asp Leu Gly Asp Phe Asp
290 295 300
Lys Ala Val Glu Phe Phe Asp Glu Leu Lys Ser Lys Cys Thr Val Tyr
305 310 315 320
Asp Gly Ile Val Asn Ala Thr Phe Met Glu Tyr Trp Phe Glu Lys Gly
325 330 335
Asn Asp Lys Glu Ala Met Glu Ser Tyr Arg Ser Leu Leu Asp Lys Lys
340 345 350
Phe Arg Met His Pro Pro Thr Gly Asn Val Leu Leu Glu Val Phe Leu
355 360 365
Lys Phe Gly Lys Lys Asp Glu Ala Trp Ala Leu Phe Asn Glu Met Leu
370 375 380
Asp Asn His Ala Pro Pro Asn Ile Leu Ser Val Asn Ser Asp Thr Val
385 390 395 400
Gly Ile Met Val Asn Glu Cys Phe Lys Met Gly Glu Phe Ser Glu Ala
405 410 415
Ile Asn Thr Phe Lys Lys Val Gly Ser Lys Val Thr Ser Lys Pro Phe
420 425 430
Val Met Asp Tyr Leu Gly Tyr Cys Asn Ile Val Thr Arg Phe Cys Glu
435 440 445
Gln Gly Met Leu Thr Glu Ala Glu Arg Phe Phe Ala Glu Gly Val Ser
450 455 460
Arg Ser Leu Pro Ala Asp Ala Pro Ser His Arg Ala Met Ile Asp Ala
465 470 475 480
Tyr Leu Lys Ala Glu Arg Ile Asp Asp Ala Val Lys Met Leu Asp Arg
485 490 495
Met Val Asp Val Asn Leu Arg Val Val Ala Asp Phe Gly Ala Arg Val
500 505 510
Phe Gly Glu Leu Ile Lys Asn Gly Lys Leu Thr Glu Ser Ala Glu Val
515 520 525
Leu Thr Lys Met Gly Glu Arg Glu Pro Lys Pro Asp Pro Ser Ile Tyr
530 535 540
Asp Val Val Val Arg Gly Leu Cys Asp Gly Asp Ala Leu Asp Gln Ala
545 550 555 560
Lys Asp Ile Val Gly Glu Met Ile Arg His Asn Val Gly Val Thr Thr
565 570 575
Val Leu Arg Glu Phe Ile Ile Glu Val Phe Glu Lys Ala Gly Arg Arg
580 585 590
Glu Glu Ile Glu Lys Ile Leu Asn Ser Val Ala Arg Pro Val Arg Asn
595 600 605
Ala Gly Gln Ser Gly Asn Thr Pro Pro Arg Val Pro Ala Val Phe Gly
610 615 620
Thr Thr Pro Ala Ala Pro Gln Gln Pro Arg Asp Arg Ala Pro Trp Thr
625 630 635 640
Ser Gln Gly Val Val His Ser Asn Ser Gly Trp Ala Asn Gly Thr Ala
645 650 655
Gly Gln Thr Ala Gly Gly Ala Tyr Lys Ala Asn Asn Gly Gln Asn Pro
660 665 670
Ser Trp Ser Asn Thr Ser Asp Asn Gln Gln Gln Gln Ser Trp Ser Asn
675 680 685
Gln Thr Ala Gly Gln Gln Pro Pro Ser Trp Ser Arg Gln Ala Pro Gly
690 695 700
Tyr Gln Gln Gln Gln Ser Trp Ser Gln Gln Ser Gly Trp Ser Ser Pro
705 710 715 720
Ser Gly His Gln Gln Ser Trp Thr Asn Gln Thr Ala Gly Gln Gln Gln
725 730 735
Pro Trp Ala Asn Gln Thr Pro Gly Gln Gln Gln Gln Trp Ala Asn Gln
740 745 750
Thr Pro Gly Gln Gln Gln Gln Leu Ala Asn Gln Thr Pro Gly Gln Gln
755 760 765
Gln Gln Trp Ala Asn Gln Thr Pro Gly Gln Gln Gln Gln Trp Ala Asn
770 775 780
Gln Asn Asn Gly His Gln Gln Pro Trp Ala Asn Gln Asn Thr Gly His
785 790 795 800
Gln Gln Ser Trp Ala Asn Gln Thr Pro Ser Gln Gln Gln Pro Trp Ala
805 810 815
Asn Gln Thr Thr Gly Gln Gln Gln Gly Trp Gly Asn Gln Thr Thr Gly
820 825 830
Gln Gln Gln Gln Trp Ala Asn Gln Thr Ala Gly Gln Gln Ser Gly Trp
835 840 845
Thr Ala Gln Gln Gln Trp Ser Asn Gln Thr Ala Ser His Gln Gln Ser
850 855 860
Gln Trp Leu Asn Pro Val Pro Gly Glu Val Ala Asn Gln Thr Pro Trp
865 870 875 880
Ser Asn Ser Val Asp Ser His Leu Pro Gln Gln Gln Glu Pro Gly Pro
885 890 895
Ser His Glu Cys Gln Glu Thr Gln Glu Lys Lys Val Val Glu Leu Arg
900 905 910
Asn
<210> 6
<211> 196
<212> PRT
<213> Flavobacterium okeanokoites
<400> 6
Ala Leu Val Lys Ser Glu Leu Glu Glu Lys Lys Ser Glu Leu Arg His
1 5 10 15
Lys Leu Lys Tyr Val Pro His Glu Tyr Ile Glu Leu Ile Glu Ile Ala
20 25 30
Arg Asn Ser Thr Gln Asp Arg Ile Leu Glu Met Lys Val Met Glu Phe
35 40 45
Phe Met Lys Val Tyr Gly Tyr Arg Gly Lys His Leu Gly Gly Ser Arg
50 55 60
Lys Pro Asp Gly Ala Ile Tyr Thr Val Gly Ser Pro Ile Asp Tyr Gly
65 70 75 80
Val Ile Val Asp Thr Lys Ala Tyr Ser Gly Gly Tyr Asn Leu Pro Ile
85 90 95
Gly Gln Ala Asp Glu Met Gln Arg Tyr Val Glu Glu Asn Gln Thr Arg
100 105 110
Asn Lys His Ile Asn Pro Asn Glu Trp Trp Lys Val Tyr Pro Ser Ser
115 120 125
Val Thr Glu Phe Lys Phe Leu Phe Val Ser Gly His Phe Lys Gly Asn
130 135 140
Tyr Lys Ala Gln Leu Thr Arg Leu Asn His Ile Thr Asn Cys Asn Gly
145 150 155 160
Ala Val Leu Ser Val Glu Glu Leu Leu Ile Gly Gly Glu Met Ile Lys
165 170 175
Ala Gly Thr Leu Thr Leu Glu Glu Val Arg Arg Lys Phe Asn Asn Gly
180 185 190
Glu Ile Asn Phe
195
<210> 7
<211> 5303
<212> DNA
<213> Artificial Sequence
<220>
<223> p63-VP64
<400> 7
cgccattctg cctggggacg tcggagcaag cttgatttag gtgacactat agaatacaag 60
ctacttgttc tttttgcaag atctccacca tggactataa ggaccacgac ggagactaca 120
aggatcatga tattgattac aaagacgatg acgataagat ggccccaaag aagaagcgga 180
aggtcggtat ccccgggggc gaagtgctgt ataggacact gctggccaac tgcgtggctg 240
ctgggaacgt gaagaagtcc gaactggtct tcaacaagat gaaggatctg gggttccccc 300
tgagcggctt tacctgtgac caaatgctgc tgctgcacaa aaggattgat agaaagaaaa 360
tcgctgatgt cctgctgctg atggaaaagg aaaatatcaa gccaagcctg ctgacctaca 420
agatcctgat cgatgtgaag ggcgccacca acgacattag cgggatggaa cagattctgg 480
aaacaatgaa agacgagggc gtggagctgg atttccaaac acaggccctg acagccaggc 540
attactccgg cgctggactg aaagataagg cagaaaaggt gctgaaggaa atggagggag 600
agtccctgga agcaaatagg agggccttta aggacctgct gtccatttac gcctccctgg 660
gcagagaaga cgaagtgaaa agaatttgga agatttgcga gtccaaacca tactttgagg 720
aatccctggc cgctatccaa gcattcggca agctgaataa ggtgcaagaa gccgaggcaa 780
tcttcgaaaa gattgtgaag atggatagaa gagcaagctc cagcacatac tccgtcctgc 840
tgagagtgta cgtggatcat aagatgctga gcaaaggcaa agacctggtg aagagaatgg 900
ccgagagcgg gtgcagaatt gaagccacca cctgggacgc tctgatcaaa ctgtatgtcg 960
aggctgggga ggtggaaaaa gccgattccc tgctggataa agccagcaaa caatcccaca 1020
ctaaactgat gatgaatagc ttcatgtata tcatggacga gtatagcaag aggggcgacg 1080
tgcacaatac cgaaaaaatc tttctgaaaa tgagggaagc cgggtatact agcggatccg 1140
gacgggctga cgcattggac gattttgatc tggatatgct gggaagtgac gccctcgatg 1200
attttgacct tgacatgctt ggttcggatg cccttgatga ctttgacctc gacatgctcg 1260
gcagtgacgc ccttgatgat ttcgacctgg acatgctgat taactctagt tgatctagat 1320
tctgcagccc tatagtgagt cgtattacgt agatccagac atgataagat acattgatga 1380
gtttggacaa accacaacta gaatgcagtg aaaaaaatgc tttatttgtg aaatttgtga 1440
tgctattgct ttatttgtaa ccattataag ctgcaataaa caagttaaca acaacaattg 1500
cattcatttt atgtttcagg ttcaggggga ggtgtgggag gttttttaat tcgcggccgc 1560
ggcgccaatg cattgggccc ggtacccagc ttttgttccc tttagtgagg gttaattgcg 1620
cgcttggcgt aatcatggtc atagctgttt cctgtgtgaa attgttatcc gctcacaatt 1680
ccacacaaca tacgagccgg aagcataaag tgtaaagcct ggggtgccta atgagtgagc 1740
taactcacat taattgcgtt gcgctcactg cccgctttcc agtcgggaaa cctgtcgtgc 1800
cagctgcatt aatgaatcgg ccaacgcgcg gggagaggcg gtttgcgtat tgggcgctct 1860
tccgcttcct cgctcactga ctcgctgcgc tcggtcgttc ggctgcggcg agcggtatca 1920
gctcactcaa aggcggtaat acggttatcc acagaatcag gggataacgc aggaaagaac 1980
atgtgagcaa aaggccagca aaaggccagg aaccgtaaaa aggccgcgtt gctggcgttt 2040
ttccataggc tccgcccccc tgacgagcat cacaaaaatc gacgctcaag tcagaggtgg 2100
cgaaacccga caggactata aagataccag gcgtttcccc ctggaagctc cctcgtgcgc 2160
tctcctgttc cgaccctgcc gcttaccgga tacctgtccg cctttctccc ttcgggaagc 2220
gtggcgcttt ctcatagctc acgctgtagg tatctcagtt cggtgtaggt cgttcgctcc 2280
aagctgggct gtgtgcacga accccccgtt cagcccgacc gctgcgcctt atccggtaac 2340
tatcgtcttg agtccaaccc ggtaagacac gacttatcgc cactggcagc agccactggt 2400
aacaggatta gcagagcgag gtatgtaggc ggtgctacag agttcttgaa gtggtggcct 2460
aactacggct acactagaag gacagtattt ggtatctgcg ctctgctgaa gccagttacc 2520
ttcggaaaaa gagttggtag ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt 2580
ttttttgttt gcaagcagca gattacgcgc agaaaaaaag gatctcaaga agatcctttg 2640
atcttttcta cggggtctga cgctcagtgg aacgaaaact cacgttaagg gattttggtc 2700
atgagattat caaaaaggat cttcacctag atccttttaa attaaaaatg aagttttaaa 2760
tcaatctaaa gtatatatga gtaaacttgg tctgacagtt accaatgctt aatcagtgag 2820
gcacctatct cagcgatctg tctatttcgt tcatccatag ttgcctgact ccccgtcgtg 2880
tagataacta cgatacggga gggcttacca tctggcccca gtgctgcaat gataccgcga 2940
gacccacgct caccggctcc agatttatca gcaataaacc agccagccgg aagggccgag 3000
cgcagaagtg gtcctgcaac tttatccgcc tccatccagt ctattaattg ttgccgggaa 3060
gctagagtaa gtagttcgcc agttaatagt ttgcgcaacg ttgttgccat tgctacaggc 3120
atcgtggtgt cacgctcgtc gtttggtatg gcttcattca gctccggttc ccaacgatca 3180
aggcgagtta catgatcccc catgttgtgc aaaaaagcgg ttagctcctt cggtcctccg 3240
atcgttgtca gaagtaagtt ggccgcagtg ttatcactca tggttatggc agcactgcat 3300
aattctctta ctgtcatgcc atccgtaaga tgcttttctg tgactggtga gtactcaacc 3360
aagtcattct gagaatagtg tatgcggcga ccgagttgct cttgcccggc gtcaatacgg 3420
gataataccg cgccacatag cagaacttta aaagtgctca tcattggaaa acgttcttcg 3480
gggcgaaaac tctcaaggat cttaccgctg ttgagatcca gttcgatgta acccactcgt 3540
gcacccaact gatcttcagc atcttttact ttcaccagcg tttctgggtg agcaaaaaca 3600
ggaaggcaaa atgccgcaaa aaagggaata agggcgacac ggaaatgttg aatactcata 3660
ctcttccttt ttcaatatta ttgaagcatt tatcagggtt attgtctcat gagcggatac 3720
atatttgaat gtatttagaa aaataaacaa ataggggttc cgcgcacatt tccccgaaaa 3780
gtgccaccta aattgtaagc gttaatattt tgttaaaatt cgcgttaaat ttttgttaaa 3840
tcagctcatt ttttaaccaa taggccgaaa tcggcaaaat cccttataaa tcaaaagaat 3900
agaccgagat agggttgagt gttgttccag tttggaacaa gagtccacta ttaaagaacg 3960
tggactccaa cgtcaaaggg cgaaaaaccg tctatcaggg cgatggccca ctacgtgaac 4020
catcacccta atcaagtttt ttggggtcga ggtgccgtaa agcactaaat cggaacccta 4080
aagggagccc ccgatttaga gcttgacggg gaaagccggc gaacgtggcg agaaaggaag 4140
ggaagaaagc gaaaggagcg ggcgctaggg cgctggcaag tgtagcggtc acgctgcgcg 4200
taaccaccac acccgccgcg cttaatgcgc cgctacaggg cgcgtcccat tcgccattca 4260
ggctgcgcaa ctgttgggaa gggcgatcgg tgcgggcctc ttcgctatta cgccagtcga 4320
ccatagccaa ttcaatatgg cgtatatgga ctcatgccaa ttcaatatgg tggatctgga 4380
cctgtgccaa ttcaatatgg cgtatatgga ctcgtgccaa ttcaatatgg tggatctgga 4440
ccccagccaa ttcaatatgg cggacttggc accatgccaa ttcaatatgg cggacttggc 4500
actgtgccaa ctggggaggg gtctacttgg cacggtgcca agtttgagga ggggtcttgg 4560
ccctgtgcca agtccgccat attgaattgg catggtgcca ataatggcgg ccatattggc 4620
tatatgccag gatcaatata taggcaatat ccaatatggc cctatgccaa tatggctatt 4680
ggccaggttc aatactatgt attggcccta tgccatatag tattccatat atgggttttc 4740
ctattgacgt agatagcccc tcccaatggg cggtcccata taccatatat ggggcttcct 4800
aataccgccc atagccactc ccccattgac gtcaatggtc tctatatatg gtctttccta 4860
ttgacgtcat atgggcggtc ctattgacgt atatggcgcc tcccccattg acgtcaatta 4920
cggtaaatgg cccgcctggc tcaatgccca ttgacgtcaa taggaccacc caccattgac 4980
gtcaatggga tggctcattg cccattcata tccgttctca cgccccctat tgacgtcaat 5040
gacggtaaat ggcccacttg gcagtacatc aatatctatt aatagtaact tggcaagtac 5100
attactattg gaaggacgcc agggtacatt ggcagtactc ccattgacgt caatggcggt 5160
aaatggcccg cgatggctgc caagtacatc cccattgacg tcaatgggga ggggcaatga 5220
cgcaaatggg cgttccattg acgtaaatgg gcggtaggcg tgcctaatgg gaggtctata 5280
taagcaatgc tcgtttaggg aac 5303
<210> 8
<211> 5948
<212> DNA
<213> Artificial Sequence
<220>
<223> pTac2-VP64
<400> 8
cgccattctg cctggggacg tcggagcaag cttgatttag gtgacactat agaatacaag 60
ctacttgttc tttttgcaag atctccacca tggactataa ggaccacgac ggagactaca 120
aggatcatga tattgattac aaagacgatg acgataagat ggccccaaag aagaagcgga 180
aggtcggtat ccccgggtcc ctgaacgact ttgcactggt ctttaaggaa ttcgcaggaa 240
ggggggattg gcaaagatcc ctgagactgt ttaagtatat gcagaggcaa atctggtgca 300
aacccaatga gcatatctat accattatga tttccctgct ggggagagaa ggactgctgg 360
ataaatgtct ggaagtgttt gacgaaatgc cttcccaagg agtgagcagg agcgtgttca 420
gctacactgc actgattaac gcctacggca gaaacggcag gtacgaaact agcctggagc 480
tgctggacag aatgaaaaac gagaagatca gcccaagcat cctgacttat aacacagtga 540
tcaatgcttg tgccagaggc ggactggact gggagggcct gctgggcctg ttcgcagaga 600
tgaggcacga agggattcaa cccgacatcg tgacttacaa tactctgctg tccgcatgtg 660
caattagggg cctgggggac gaagctgaaa tggtcttcag gactatgaat gacggcggaa 720
tcgtgcccga tctgaccaca tattcccatc tggtcgagac ctttgggaaa ctgaggagac 780
tggagaaggt gtgcgatctg ctgggagaaa tggctagcgg aggctccctg ccagatatta 840
cctcctacaa cgtgctgctg gaagcctacg ccaagtccgg ctccatcaag gaggctatgg 900
gcgtgtttca tcagatgcaa gccgctggct gtacccccaa tgccaacacc tattccgtcc 960
tgctgaatct gttcggccag agcgggagat acgatgacgt gaggcagctg tttctggaaa 1020
tgaagagcag caacaccgac cccgacgctg caacatacaa cattctgatc gaggtgtttg 1080
gcgagggggg ctacttcaaa gaagtcgtca ccctgttcca cgacatggtg gaggaaaaca 1140
tcgagcccga tatggagacc tatgagggga tcatcttcgc ttgcggcaaa ggcggcctgc 1200
atgaggacgc taggaagatc ctgcagtaca tgaccgctaa tgacattgtc ccatcctcca 1260
aagcttatac cggcgtgatc gaggccttcg gccaggctgc cctgtacgag gaagcactgg 1320
tcgcctttaa caccatgcac gaggtcggca gcaacccttc catcgagacc ttccactccc 1380
tgctgtatag cttcgccaga ggcgggctgg tgaaggagtc cgaggcaatc ctgagcaggc 1440
tggtcgattc cggcatcccc aggaacagag acacctttaa tgctcaaatt gaggcctaca 1500
aacagggggg gaagttcgaa gaggctgtga agacctacgt cgacatggaa aagagcaggt 1560
gcgaccccga cgagaggacc ctggaggccg tcctgtccgt gtattccttc gcaagactgg 1620
tggatgagtg cagggaacag tttgaagaaa tgaaggccag cgacattctg cccagcatta 1680
tgtgctactg catgatgctg gcagtgtacg ggaagaccga gaggtgggac gacgtgaacg 1740
aactgctgga ggagatgctg agcaacaggg tcagcaacgg atccggacgg gctgacgcat 1800
tggacgattt tgatctggat atgctgggaa gtgacgccct cgatgatttt gaccttgaca 1860
tgcttggttc ggatgccctt gatgactttg acctcgacat gctcggcagt gacgcccttg 1920
atgatttcga cctggacatg ctgattaact ctagttgatc tagattctgc agccctatag 1980
tgagtcgtat tacgtagatc cagacatgat aagatacatt gatgagtttg gacaaaccac 2040
aactagaatg cagtgaaaaa aatgctttat ttgtgaaatt tgtgatgcta ttgctttatt 2100
tgtaaccatt ataagctgca ataaacaagt taacaacaac aattgcattc attttatgtt 2160
tcaggttcag ggggaggtgt gggaggtttt ttaattcgcg gccgcggcgc caatgcattg 2220
ggcccggtac ccagcttttg ttccctttag tgagggttaa ttgcgcgctt ggcgtaatca 2280
tggtcatagc tgtttcctgt gtgaaattgt tatccgctca caattccaca caacatacga 2340
gccggaagca taaagtgtaa agcctggggt gcctaatgag tgagctaact cacattaatt 2400
gcgttgcgct cactgcccgc tttccagtcg ggaaacctgt cgtgccagct gcattaatga 2460
atcggccaac gcgcggggag aggcggtttg cgtattgggc gctcttccgc ttcctcgctc 2520
actgactcgc tgcgctcggt cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg 2580
gtaatacggt tatccacaga atcaggggat aacgcaggaa agaacatgtg agcaaaaggc 2640
cagcaaaagg ccaggaaccg taaaaaggcc gcgttgctgg cgtttttcca taggctccgc 2700
ccccctgacg agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga 2760
ctataaagat accaggcgtt tccccctgga agctccctcg tgcgctctcc tgttccgacc 2820
ctgccgctta ccggatacct gtccgccttt ctcccttcgg gaagcgtggc gctttctcat 2880
agctcacgct gtaggtatct cagttcggtg taggtcgttc gctccaagct gggctgtgtg 2940
cacgaacccc ccgttcagcc cgaccgctgc gccttatccg gtaactatcg tcttgagtcc 3000
aacccggtaa gacacgactt atcgccactg gcagcagcca ctggtaacag gattagcaga 3060
gcgaggtatg taggcggtgc tacagagttc ttgaagtggt ggcctaacta cggctacact 3120
agaaggacag tatttggtat ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt 3180
ggtagctctt gatccggcaa acaaaccacc gctggtagcg gtggtttttt tgtttgcaag 3240
cagcagatta cgcgcagaaa aaaaggatct caagaagatc ctttgatctt ttctacgggg 3300
tctgacgctc agtggaacga aaactcacgt taagggattt tggtcatgag attatcaaaa 3360
aggatcttca cctagatcct tttaaattaa aaatgaagtt ttaaatcaat ctaaagtata 3420
tatgagtaaa cttggtctga cagttaccaa tgcttaatca gtgaggcacc tatctcagcg 3480
atctgtctat ttcgttcatc catagttgcc tgactccccg tcgtgtagat aactacgata 3540
cgggagggct taccatctgg ccccagtgct gcaatgatac cgcgagaccc acgctcaccg 3600
gctccagatt tatcagcaat aaaccagcca gccggaaggg ccgagcgcag aagtggtcct 3660
gcaactttat ccgcctccat ccagtctatt aattgttgcc gggaagctag agtaagtagt 3720
tcgccagtta atagtttgcg caacgttgtt gccattgcta caggcatcgt ggtgtcacgc 3780
tcgtcgtttg gtatggcttc attcagctcc ggttcccaac gatcaaggcg agttacatga 3840
tcccccatgt tgtgcaaaaa agcggttagc tccttcggtc ctccgatcgt tgtcagaagt 3900
aagttggccg cagtgttatc actcatggtt atggcagcac tgcataattc tcttactgtc 3960
atgccatccg taagatgctt ttctgtgact ggtgagtact caaccaagtc attctgagaa 4020
tagtgtatgc ggcgaccgag ttgctcttgc ccggcgtcaa tacgggataa taccgcgcca 4080
catagcagaa ctttaaaagt gctcatcatt ggaaaacgtt cttcggggcg aaaactctca 4140
aggatcttac cgctgttgag atccagttcg atgtaaccca ctcgtgcacc caactgatct 4200
tcagcatctt ttactttcac cagcgtttct gggtgagcaa aaacaggaag gcaaaatgcc 4260
gcaaaaaagg gaataagggc gacacggaaa tgttgaatac tcatactctt cctttttcaa 4320
tattattgaa gcatttatca gggttattgt ctcatgagcg gatacatatt tgaatgtatt 4380
tagaaaaata aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc acctaaattg 4440
taagcgttaa tattttgtta aaattcgcgt taaatttttg ttaaatcagc tcatttttta 4500
accaataggc cgaaatcggc aaaatccctt ataaatcaaa agaatagacc gagatagggt 4560
tgagtgttgt tccagtttgg aacaagagtc cactattaaa gaacgtggac tccaacgtca 4620
aagggcgaaa aaccgtctat cagggcgatg gcccactacg tgaaccatca ccctaatcaa 4680
gttttttggg gtcgaggtgc cgtaaagcac taaatcggaa ccctaaaggg agcccccgat 4740
ttagagcttg acggggaaag ccggcgaacg tggcgagaaa ggaagggaag aaagcgaaag 4800
gagcgggcgc tagggcgctg gcaagtgtag cggtcacgct gcgcgtaacc accacacccg 4860
ccgcgcttaa tgcgccgcta cagggcgcgt cccattcgcc attcaggctg cgcaactgtt 4920
gggaagggcg atcggtgcgg gcctcttcgc tattacgcca gtcgaccata gccaattcaa 4980
tatggcgtat atggactcat gccaattcaa tatggtggat ctggacctgt gccaattcaa 5040
tatggcgtat atggactcgt gccaattcaa tatggtggat ctggacccca gccaattcaa 5100
tatggcggac ttggcaccat gccaattcaa tatggcggac ttggcactgt gccaactggg 5160
gaggggtcta cttggcacgg tgccaagttt gaggaggggt cttggccctg tgccaagtcc 5220
gccatattga attggcatgg tgccaataat ggcggccata ttggctatat gccaggatca 5280
atatataggc aatatccaat atggccctat gccaatatgg ctattggcca ggttcaatac 5340
tatgtattgg ccctatgcca tatagtattc catatatggg ttttcctatt gacgtagata 5400
gcccctccca atgggcggtc ccatatacca tatatggggc ttcctaatac cgcccatagc 5460
cactccccca ttgacgtcaa tggtctctat atatggtctt tcctattgac gtcatatggg 5520
cggtcctatt gacgtatatg gcgcctcccc cattgacgtc aattacggta aatggcccgc 5580
ctggctcaat gcccattgac gtcaatagga ccacccacca ttgacgtcaa tgggatggct 5640
cattgcccat tcatatccgt tctcacgccc cctattgacg tcaatgacgg taaatggccc 5700
acttggcagt acatcaatat ctattaatag taacttggca agtacattac tattggaagg 5760
acgccagggt acattggcag tactcccatt gacgtcaatg gcggtaaatg gcccgcgatg 5820
gctgccaagt acatccccat tgacgtcaat ggggaggggc aatgacgcaa atgggcgttc 5880
cattgacgta aatgggcggt aggcgtgcct aatgggaggt ctatataagc aatgctcgtt 5940
tagggaac 5948
<210> 9
<211> 5531
<212> DNA
<213> Artificial Sequence
<220>
<223> GUN1-VP64
<400> 9
cgccattctg cctggggacg tcggagcaag cttgatttag gtgacactat agaatacaag 60
ctacttgttc tttttgcaag atctccacca tggactataa ggaccacgac ggagactaca 120
aggatcatga tattgattac aaagacgatg acgataagat ggccccaaag aagaagcgga 180
aggtcggtat ccccgggcaa ggcaagctgg caagcgccat gatctccacc ctgggcaggt 240
acggaaaggt gaccattgcc aagaggatct tcgagaccgc cttcgcaggc gggtacggca 300
acaccgtgta tgctttttcc gccctgatta gcgcatatgg cagaagcggc ctgcacgaag 360
aggccattag cgtgtttaac agcatgaagg agtatggact gaggcccaac ctggtgacct 420
acaacgccgt cattgatgct tgcggcaagg gcggcatgga attcaagcag gtggccaagt 480
tcttcgatga aatgcagagg aacggcgtgc agcctgacag aattacattc aatagcctgc 540
tggctgtgtg cagcagaggg ggcctgtggg aggcagctag gaatctgttt gacgagatga 600
ccaatagaag gatcgagcag gacgtgttct cctataatac actgctggac gccatttgta 660
aaggcgggca aatggacctg gccttcgaaa tcctggccca gatgcccgtc aaaaggatca 720
tgcccaacgt ggtcagctac tccacagtca tcgacgggtt cgccaaggct ggcaggtttg 780
atgaagcact gaacctgttc ggggaaatga gatacctggg aatcgccctg gacagggtga 840
gctacaacac cctgctgagc atctacacta aggtcggcag atccgaggaa gccctggaca 900
tcctgaggga aatggcctcc gtgggcatta agaaggacgt cgtgacatac aatgccctgc 960
tgggcggcta cggcaaacag ggcaagtacg acgaggtcaa gaaggtcttc acagagatga 1020
agagggaaca cgtgctgcca aatctgctga cttattccac tctgattgat ggctactcca 1080
aaggcggact gtacaaggaa gccatggaga ttttcagaga gttcaagagc gctggcctga 1140
gagccgatgt cgtgctgtat tccgcactga tcgatgcact gtgcaaaaac ggcctggtcg 1200
gcagcgccgt gagcctgatc gacgagatga ccaaggaggg aattagcccc aatgtggtga 1260
cttacaatag catcattgat gctttcggca gaagcgccac catggacaga tccgccgact 1320
atagcaacgg cggcagcctg ccattttcct ccagcgccct gggatccgga cgggctgacg 1380
cattggacga ttttgatctg gatatgctgg gaagtgacgc cctcgatgat tttgaccttg 1440
acatgcttgg ttcggatgcc cttgatgact ttgacctcga catgctcggc agtgacgccc 1500
ttgatgattt cgacctggac atgctgatta actctagttg atctagattc tgcagcccta 1560
tagtgagtcg tattacgtag atccagacat gataagatac attgatgagt ttggacaaac 1620
cacaactaga atgcagtgaa aaaaatgctt tatttgtgaa atttgtgatg ctattgcttt 1680
atttgtaacc attataagct gcaataaaca agttaacaac aacaattgca ttcattttat 1740
gtttcaggtt cagggggagg tgtgggaggt tttttaattc gcggccgcgg cgccaatgca 1800
ttgggcccgg tacccagctt ttgttccctt tagtgagggt taattgcgcg cttggcgtaa 1860
tcatggtcat agctgtttcc tgtgtgaaat tgttatccgc tcacaattcc acacaacata 1920
cgagccggaa gcataaagtg taaagcctgg ggtgcctaat gagtgagcta actcacatta 1980
attgcgttgc gctcactgcc cgctttccag tcgggaaacc tgtcgtgcca gctgcattaa 2040
tgaatcggcc aacgcgcggg gagaggcggt ttgcgtattg ggcgctcttc cgcttcctcg 2100
ctcactgact cgctgcgctc ggtcgttcgg ctgcggcgag cggtatcagc tcactcaaag 2160
gcggtaatac ggttatccac agaatcaggg gataacgcag gaaagaacat gtgagcaaaa 2220
ggccagcaaa aggccaggaa ccgtaaaaag gccgcgttgc tggcgttttt ccataggctc 2280
cgcccccctg acgagcatca caaaaatcga cgctcaagtc agaggtggcg aaacccgaca 2340
ggactataaa gataccaggc gtttccccct ggaagctccc tcgtgcgctc tcctgttccg 2400
accctgccgc ttaccggata cctgtccgcc tttctccctt cgggaagcgt ggcgctttct 2460
catagctcac gctgtaggta tctcagttcg gtgtaggtcg ttcgctccaa gctgggctgt 2520
gtgcacgaac cccccgttca gcccgaccgc tgcgccttat ccggtaacta tcgtcttgag 2580
tccaacccgg taagacacga cttatcgcca ctggcagcag ccactggtaa caggattagc 2640
agagcgaggt atgtaggcgg tgctacagag ttcttgaagt ggtggcctaa ctacggctac 2700
actagaagga cagtatttgg tatctgcgct ctgctgaagc cagttacctt cggaaaaaga 2760
gttggtagct cttgatccgg caaacaaacc accgctggta gcggtggttt ttttgtttgc 2820
aagcagcaga ttacgcgcag aaaaaaagga tctcaagaag atcctttgat cttttctacg 2880
gggtctgacg ctcagtggaa cgaaaactca cgttaaggga ttttggtcat gagattatca 2940
aaaaggatct tcacctagat ccttttaaat taaaaatgaa gttttaaatc aatctaaagt 3000
atatatgagt aaacttggtc tgacagttac caatgcttaa tcagtgaggc acctatctca 3060
gcgatctgtc tatttcgttc atccatagtt gcctgactcc ccgtcgtgta gataactacg 3120
atacgggagg gcttaccatc tggccccagt gctgcaatga taccgcgaga cccacgctca 3180
ccggctccag atttatcagc aataaaccag ccagccggaa gggccgagcg cagaagtggt 3240
cctgcaactt tatccgcctc catccagtct attaattgtt gccgggaagc tagagtaagt 3300
agttcgccag ttaatagttt gcgcaacgtt gttgccattg ctacaggcat cgtggtgtca 3360
cgctcgtcgt ttggtatggc ttcattcagc tccggttccc aacgatcaag gcgagttaca 3420
tgatccccca tgttgtgcaa aaaagcggtt agctccttcg gtcctccgat cgttgtcaga 3480
agtaagttgg ccgcagtgtt atcactcatg gttatggcag cactgcataa ttctcttact 3540
gtcatgccat ccgtaagatg cttttctgtg actggtgagt actcaaccaa gtcattctga 3600
gaatagtgta tgcggcgacc gagttgctct tgcccggcgt caatacggga taataccgcg 3660
ccacatagca gaactttaaa agtgctcatc attggaaaac gttcttcggg gcgaaaactc 3720
tcaaggatct taccgctgtt gagatccagt tcgatgtaac ccactcgtgc acccaactga 3780
tcttcagcat cttttacttt caccagcgtt tctgggtgag caaaaacagg aaggcaaaat 3840
gccgcaaaaa agggaataag ggcgacacgg aaatgttgaa tactcatact cttccttttt 3900
caatattatt gaagcattta tcagggttat tgtctcatga gcggatacat atttgaatgt 3960
atttagaaaa ataaacaaat aggggttccg cgcacatttc cccgaaaagt gccacctaaa 4020
ttgtaagcgt taatattttg ttaaaattcg cgttaaattt ttgttaaatc agctcatttt 4080
ttaaccaata ggccgaaatc ggcaaaatcc cttataaatc aaaagaatag accgagatag 4140
ggttgagtgt tgttccagtt tggaacaaga gtccactatt aaagaacgtg gactccaacg 4200
tcaaagggcg aaaaaccgtc tatcagggcg atggcccact acgtgaacca tcaccctaat 4260
caagtttttt ggggtcgagg tgccgtaaag cactaaatcg gaaccctaaa gggagccccc 4320
gatttagagc ttgacgggga aagccggcga acgtggcgag aaaggaaggg aagaaagcga 4380
aaggagcggg cgctagggcg ctggcaagtg tagcggtcac gctgcgcgta accaccacac 4440
ccgccgcgct taatgcgccg ctacagggcg cgtcccattc gccattcagg ctgcgcaact 4500
gttgggaagg gcgatcggtg cgggcctctt cgctattacg ccagtcgacc atagccaatt 4560
caatatggcg tatatggact catgccaatt caatatggtg gatctggacc tgtgccaatt 4620
caatatggcg tatatggact cgtgccaatt caatatggtg gatctggacc ccagccaatt 4680
caatatggcg gacttggcac catgccaatt caatatggcg gacttggcac tgtgccaact 4740
ggggaggggt ctacttggca cggtgccaag tttgaggagg ggtcttggcc ctgtgccaag 4800
tccgccatat tgaattggca tggtgccaat aatggcggcc atattggcta tatgccagga 4860
tcaatatata ggcaatatcc aatatggccc tatgccaata tggctattgg ccaggttcaa 4920
tactatgtat tggccctatg ccatatagta ttccatatat gggttttcct attgacgtag 4980
atagcccctc ccaatgggcg gtcccatata ccatatatgg ggcttcctaa taccgcccat 5040
agccactccc ccattgacgt caatggtctc tatatatggt ctttcctatt gacgtcatat 5100
gggcggtcct attgacgtat atggcgcctc ccccattgac gtcaattacg gtaaatggcc 5160
cgcctggctc aatgcccatt gacgtcaata ggaccaccca ccattgacgt caatgggatg 5220
gctcattgcc cattcatatc cgttctcacg ccccctattg acgtcaatga cggtaaatgg 5280
cccacttggc agtacatcaa tatctattaa tagtaacttg gcaagtacat tactattgga 5340
aggacgccag ggtacattgg cagtactccc attgacgtca atggcggtaa atggcccgcg 5400
atggctgcca agtacatccc cattgacgtc aatggggagg ggcaatgacg caaatgggcg 5460
ttccattgac gtaaatgggc ggtaggcgtg cctaatggga ggtctatata agcaatgctc 5520
gtttagggaa c 5531
<210> 10
<211> 5135
<212> DNA
<213> Artificial Sequence
<220>
<223> pminCMV-luc2
<400> 10
ggtaccgagc tcttacgcgt gctagcccgg gctcgagatc tgatatcaag cttactagtg 60
tcgaggtagg cgtgtacggt gggaggccta tataagcaga gctcgtttag tgaaccgtca 120
gatcgcctgg aggtaccgcc accatggaag atgccaaaaa cattaagaag ggcccagcgc 180
cattctaccc actcgaagac gggaccgccg gcgagcagct gcacaaagcc atgaagcgct 240
acgccctggt gcccggcacc atcgccttta ccgacgcaca tatcgaggtg gacattacct 300
acgccgagta cttcgagatg agcgttcggc tggcagaagc tatgaagcgc tatgggctga 360
atacaaacca tcggatcgtg gtgtgcagcg agaatagctt gcagttcttc atgcccgtgt 420
tgggtgccct gttcatcggt gtggctgtgg ccccagctaa cgacatctac aacgagcgcg 480
agctgctgaa cagcatgggc atcagccagc ccaccgtcgt attcgtgagc aagaaagggc 540
tgcaaaagat cctcaacgtg caaaagaagc taccgatcat acaaaagatc atcatcatgg 600
atagcaagac cgactaccag ggcttccaaa gcatgtacac cttcgtgact tcccatttgc 660
cacccggctt caacgagtac gacttcgtgc ccgagagctt cgaccgggac aaaaccatcg 720
ccctgatcat gaacagtagt ggcagtaccg gattgcccaa gggcgtagcc ctaccgcacc 780
gcaccgcttg tgtccgattc agtcatgccc gcgaccccat cttcggcaac cagatcatcc 840
ccgacaccgc tatcctcagc gtggtgccat ttcaccacgg cttcggcatg ttcaccacgc 900
tgggctactt gatctgcggc tttcgggtcg tgctcatgta ccgcttcgag gaggagctat 960
tcttgcgcag cttgcaagac tataagattc aatctgccct gctggtgccc acactattta 1020
gcttcttcgc taagagcact ctcatcgaca agtacgacct aagcaacttg cacgagatcg 1080
ccagcggcgg ggcgccgctc agcaaggagg taggtgaggc cgtggccaaa cgcttccacc 1140
taccaggcat ccgccagggc tacggcctga cagaaacaac cagcgccatt ctgatcaccc 1200
ccgaagggga cgacaagcct ggcgcagtag gcaaggtggt gcccttcttc gaggctaagg 1260
tggtggactt ggacaccggt aagacactgg gtgtgaacca gcgcggcgag ctgtgcgtcc 1320
gtggccccat gatcatgagc ggctacgtta acaaccccga ggctacaaac gctctcatcg 1380
acaaggacgg ctggctgcac agcggcgaca tcgcctactg ggacgaggac gagcacttct 1440
tcatcgtgga ccggctgaag agcctgatca aatacaaggg ctaccaggta gccccagccg 1500
aactggagag catcctgctg caacacccca acatcttcga cgccggggtc gccggcctgc 1560
ccgacgacga tgccggcgag ctgcccgccg cagtcgtcgt gctggaacac ggtaaaacca 1620
tgaccgagaa ggagatcgtg gactatgtgg ccagccaggt tacaaccgcc aagaagctgc 1680
gcggtggtgt tgtgttcgtg gacgaggtgc ctaaaggact gaccggcaag ttggacgccc 1740
gcaagatccg cgagattctc attaaggcca agaagggcgg caagatcgcc gtgaattctt 1800
aactgcagtt aatctagagt cggggcggcc ggccgcttcg agcagacatg ataagataca 1860
ttgatgagtt tggacaaacc acaactagaa tgcagtgaaa aaaatgcttt atttgtgaaa 1920
tttgtgatgc tattgcttta tttgtaacca ttataagctg caataaacaa gttaacaaca 1980
acaattgcat tcattttatg tttcaggttc agggggaggt gtgggaggtt ttttaaagca 2040
agtaaaacct ctacaaatgt ggtaaaatcg ataaggatct gaacgatgga gcggagaatg 2100
ggcggaactg ggcggagtta ggggcgggat gggcggagtt aggggcggga ctatggttgc 2160
tgactaattg agatgcatgc tttgcatact tctgcctgct ggggagcctg gggactttcc 2220
acacctggtt gctgactaat tgagatgcat gctttgcata cttctgcctg ctggggagcc 2280
tggggacttt ccacacccta actgacacac attccacagc ggatccgtcg accgatgccc 2340
ttgagagcct tcaacccagt cagctccttc cggtgggcgc ggggcatgac tatcgtcgcc 2400
gcacttatga ctgtcttctt tatcatgcaa ctcgtaggac aggtgccggc agcgctcttc 2460
cgcttcctcg ctcactgact cgctgcgctc ggtcgttcgg ctgcggcgag cggtatcagc 2520
tcactcaaag gcggtaatac ggttatccac agaatcaggg gataacgcag gaaagaacat 2580
gtgagcaaaa ggccagcaaa aggccaggaa ccgtaaaaag gccgcgttgc tggcgttttt 2640
ccataggctc cgcccccctg acgagcatca caaaaatcga cgctcaagtc agaggtggcg 2700
aaacccgaca ggactataaa gataccaggc gtttccccct ggaagctccc tcgtgcgctc 2760
tcctgttccg accctgccgc ttaccggata cctgtccgcc tttctccctt cgggaagcgt 2820
ggcgctttct catagctcac gctgtaggta tctcagttcg gtgtaggtcg ttcgctccaa 2880
gctgggctgt gtgcacgaac cccccgttca gcccgaccgc tgcgccttat ccggtaacta 2940
tcgtcttgag tccaacccgg taagacacga cttatcgcca ctggcagcag ccactggtaa 3000
caggattagc agagcgaggt atgtaggcgg tgctacagag ttcttgaagt ggtggcctaa 3060
ctacggctac actagaagaa cagtatttgg tatctgcgct ctgctgaagc cagttacctt 3120
cggaaaaaga gttggtagct cttgatccgg caaacaaacc accgctggta gcggtggttt 3180
ttttgtttgc aagcagcaga ttacgcgcag aaaaaaagga tctcaagaag atcctttgat 3240
cttttctacg gggtctgacg ctcagtggaa cgaaaactca cgttaaggga ttttggtcat 3300
gagattatca aaaaggatct tcacctagat ccttttaaat taaaaatgaa gttttaaatc 3360
aatctaaagt atatatgagt aaacttggtc tgacagttac caatgcttaa tcagtgaggc 3420
acctatctca gcgatctgtc tatttcgttc atccatagtt gcctgactcc ccgtcgtgta 3480
gataactacg atacgggagg gcttaccatc tggccccagt gctgcaatga taccgcgaga 3540
cccacgctca ccggctccag atttatcagc aataaaccag ccagccggaa gggccgagcg 3600
cagaagtggt cctgcaactt tatccgcctc catccagtct attaattgtt gccgggaagc 3660
tagagtaagt agttcgccag ttaatagttt gcgcaacgtt gttgccattg ctacaggcat 3720
cgtggtgtca cgctcgtcgt ttggtatggc ttcattcagc tccggttccc aacgatcaag 3780
gcgagttaca tgatccccca tgttgtgcaa aaaagcggtt agctccttcg gtcctccgat 3840
cgttgtcaga agtaagttgg ccgcagtgtt atcactcatg gttatggcag cactgcataa 3900
ttctcttact gtcatgccat ccgtaagatg cttttctgtg actggtgagt actcaaccaa 3960
gtcattctga gaatagtgta tgcggcgacc gagttgctct tgcccggcgt caatacggga 4020
taataccgcg ccacatagca gaactttaaa agtgctcatc attggaaaac gttcttcggg 4080
gcgaaaactc tcaaggatct taccgctgtt gagatccagt tcgatgtaac ccactcgtgc 4140
acccaactga tcttcagcat cttttacttt caccagcgtt tctgggtgag caaaaacagg 4200
aaggcaaaat gccgcaaaaa agggaataag ggcgacacgg aaatgttgaa tactcatact 4260
cttccttttt caatattatt gaagcattta tcagggttat tgtctcatga gcggatacat 4320
atttgaatgt atttagaaaa ataaacaaat aggggttccg cgcacatttc cccgaaaagt 4380
gccacctgac gcgccctgta gcggcgcatt aagcgcggcg ggtgtggtgg ttacgcgcag 4440
cgtgaccgct acacttgcca gcgccctagc gcccgctcct ttcgctttct tcccttcctt 4500
tctcgccacg ttcgccggct ttccccgtca agctctaaat cgggggctcc ctttagggtt 4560
ccgatttagt gctttacggc acctcgaccc caaaaaactt gattagggtg atggttcacg 4620
tagtgggcca tcgccctgat agacggtttt tcgccctttg acgttggagt ccacgttctt 4680
taatagtgga ctcttgttcc aaactggaac aacactcaac cctatctcgg tctattcttt 4740
tgatttataa gggattttgc cgatttcggc ctattggtta aaaaatgagc tgatttaaca 4800
aaaatttaac gcgaatttta acaaaatatt aacgcttaca atttgccatt cgccattcag 4860
gctgcgcaac tgttgggaag ggcgatcggt gcgggcctct tcgctattac gccagcccaa 4920
gctaccatga taagtaagta atattaaggt acgggaggta cttggagcgg ccgcaataaa 4980
atatctttat tttcattaca tctgtgtgtt ggttttttgt gtgaatcgat agtactaaca 5040
tacgctctcc atcaaaacaa aacgaaacaa aacaaactag caaaataggc tgtccccagt 5100
gcaagtgcag gtgccagaac atttctctat cgata 5135
<210> 11
<211> 9
<212> DNA
<213> Artificial Sequence
<220>
<223> Target sequence in p63
<400> 11
tctatcact 9
<210> 12
<211> 15
<212> DNA
<213> Artificial Sequence
<220>
<223> Target sequence in pTac2
<400> 12
aactttcgtc actca 15
<210> 13
<211> 11
<212> DNA
<213> Artificial Sequence
<220>
<223> Target sequence in GUN1
<400> 13
aatttgtcga t 11
<210> 14
<211> 5204
<212> DNA
<213> Artificial Sequence
<220>
<223> p63-4x target
<400> 14
ggtaccgagc tcttacgcgt gctagcccgg gctcgagatc tgatatcaag ttctatcact 60
ttgtttcttc tatcacttga attcttctat cacttatctt cttctatcac ttcagttcgc 120
ttactagtgt cgaggtaggc gtgtacggtg ggaggcctat ataagcagag ctcgtttagt 180
gaaccgtcag atcgcctgga ggtaccgcca ccatggaaga tgccaaaaac attaagaagg 240
gcccagcgcc attctaccca ctcgaagacg ggaccgccgg cgagcagctg cacaaagcca 300
tgaagcgcta cgccctggtg cccggcacca tcgcctttac cgacgcacat atcgaggtgg 360
acattaccta cgccgagtac ttcgagatga gcgttcggct ggcagaagct atgaagcgct 420
atgggctgaa tacaaaccat cggatcgtgg tgtgcagcga gaatagcttg cagttcttca 480
tgcccgtgtt gggtgccctg ttcatcggtg tggctgtggc cccagctaac gacatctaca 540
acgagcgcga gctgctgaac agcatgggca tcagccagcc caccgtcgta ttcgtgagca 600
agaaagggct gcaaaagatc ctcaacgtgc aaaagaagct accgatcata caaaagatca 660
tcatcatgga tagcaagacc gactaccagg gcttccaaag catgtacacc ttcgtgactt 720
cccatttgcc acccggcttc aacgagtacg acttcgtgcc cgagagcttc gaccgggaca 780
aaaccatcgc cctgatcatg aacagtagtg gcagtaccgg attgcccaag ggcgtagccc 840
taccgcaccg caccgcttgt gtccgattca gtcatgcccg cgaccccatc ttcggcaacc 900
agatcatccc cgacaccgct atcctcagcg tggtgccatt tcaccacggc ttcggcatgt 960
tcaccacgct gggctacttg atctgcggct ttcgggtcgt gctcatgtac cgcttcgagg 1020
aggagctatt cttgcgcagc ttgcaagact ataagattca atctgccctg ctggtgccca 1080
cactatttag cttcttcgct aagagcactc tcatcgacaa gtacgaccta agcaacttgc 1140
acgagatcgc cagcggcggg gcgccgctca gcaaggaggt aggtgaggcc gtggccaaac 1200
gcttccacct accaggcatc cgccagggct acggcctgac agaaacaacc agcgccattc 1260
tgatcacccc cgaaggggac gacaagcctg gcgcagtagg caaggtggtg cccttcttcg 1320
aggctaaggt ggtggacttg gacaccggta agacactggg tgtgaaccag cgcggcgagc 1380
tgtgcgtccg tggccccatg atcatgagcg gctacgttaa caaccccgag gctacaaacg 1440
ctctcatcga caaggacggc tggctgcaca gcggcgacat cgcctactgg gacgaggacg 1500
agcacttctt catcgtggac cggctgaaga gcctgatcaa atacaagggc taccaggtag 1560
ccccagccga actggagagc atcctgctgc aacaccccaa catcttcgac gccggggtcg 1620
ccggcctgcc cgacgacgat gccggcgagc tgcccgccgc agtcgtcgtg ctggaacacg 1680
gtaaaaccat gaccgagaag gagatcgtgg actatgtggc cagccaggtt acaaccgcca 1740
agaagctgcg cggtggtgtt gtgttcgtgg acgaggtgcc taaaggactg accggcaagt 1800
tggacgcccg caagatccgc gagattctca ttaaggccaa gaagggcggc aagatcgccg 1860
tgaattctta actgcagtta atctagagtc ggggcggccg gccgcttcga gcagacatga 1920
taagatacat tgatgagttt ggacaaacca caactagaat gcagtgaaaa aaatgcttta 1980
tttgtgaaat ttgtgatgct attgctttat ttgtaaccat tataagctgc aataaacaag 2040
ttaacaacaa caattgcatt cattttatgt ttcaggttca gggggaggtg tgggaggttt 2100
tttaaagcaa gtaaaacctc tacaaatgtg gtaaaatcga taaggatctg aacgatggag 2160
cggagaatgg gcggaactgg gcggagttag gggcgggatg ggcggagtta ggggcgggac 2220
tatggttgct gactaattga gatgcatgct ttgcatactt ctgcctgctg gggagcctgg 2280
ggactttcca cacctggttg ctgactaatt gagatgcatg ctttgcatac ttctgcctgc 2340
tggggagcct ggggactttc cacaccctaa ctgacacaca ttccacagcg gatccgtcga 2400
ccgatgccct tgagagcctt caacccagtc agctccttcc ggtgggcgcg gggcatgact 2460
atcgtcgccg cacttatgac tgtcttcttt atcatgcaac tcgtaggaca ggtgccggca 2520
gcgctcttcc gcttcctcgc tcactgactc gctgcgctcg gtcgttcggc tgcggcgagc 2580
ggtatcagct cactcaaagg cggtaatacg gttatccaca gaatcagggg ataacgcagg 2640
aaagaacatg tgagcaaaag gccagcaaaa ggccaggaac cgtaaaaagg ccgcgttgct 2700
ggcgtttttc cataggctcc gcccccctga cgagcatcac aaaaatcgac gctcaagtca 2760
gaggtggcga aacccgacag gactataaag ataccaggcg tttccccctg gaagctccct 2820
cgtgcgctct cctgttccga ccctgccgct taccggatac ctgtccgcct ttctcccttc 2880
gggaagcgtg gcgctttctc atagctcacg ctgtaggtat ctcagttcgg tgtaggtcgt 2940
tcgctccaag ctgggctgtg tgcacgaacc ccccgttcag cccgaccgct gcgccttatc 3000
cggtaactat cgtcttgagt ccaacccggt aagacacgac ttatcgccac tggcagcagc 3060
cactggtaac aggattagca gagcgaggta tgtaggcggt gctacagagt tcttgaagtg 3120
gtggcctaac tacggctaca ctagaagaac agtatttggt atctgcgctc tgctgaagcc 3180
agttaccttc ggaaaaagag ttggtagctc ttgatccggc aaacaaacca ccgctggtag 3240
cggtggtttt tttgtttgca agcagcagat tacgcgcaga aaaaaaggat ctcaagaaga 3300
tcctttgatc ttttctacgg ggtctgacgc tcagtggaac gaaaactcac gttaagggat 3360
tttggtcatg agattatcaa aaaggatctt cacctagatc cttttaaatt aaaaatgaag 3420
ttttaaatca atctaaagta tatatgagta aacttggtct gacagttacc aatgcttaat 3480
cagtgaggca cctatctcag cgatctgtct atttcgttca tccatagttg cctgactccc 3540
cgtcgtgtag ataactacga tacgggaggg cttaccatct ggccccagtg ctgcaatgat 3600
accgcgagac ccacgctcac cggctccaga tttatcagca ataaaccagc cagccggaag 3660
ggccgagcgc agaagtggtc ctgcaacttt atccgcctcc atccagtcta ttaattgttg 3720
ccgggaagct agagtaagta gttcgccagt taatagtttg cgcaacgttg ttgccattgc 3780
tacaggcatc gtggtgtcac gctcgtcgtt tggtatggct tcattcagct ccggttccca 3840
acgatcaagg cgagttacat gatcccccat gttgtgcaaa aaagcggtta gctccttcgg 3900
tcctccgatc gttgtcagaa gtaagttggc cgcagtgtta tcactcatgg ttatggcagc 3960
actgcataat tctcttactg tcatgccatc cgtaagatgc ttttctgtga ctggtgagta 4020
ctcaaccaag tcattctgag aatagtgtat gcggcgaccg agttgctctt gcccggcgtc 4080
aatacgggat aataccgcgc cacatagcag aactttaaaa gtgctcatca ttggaaaacg 4140
ttcttcgggg cgaaaactct caaggatctt accgctgttg agatccagtt cgatgtaacc 4200
cactcgtgca cccaactgat cttcagcatc ttttactttc accagcgttt ctgggtgagc 4260
aaaaacagga aggcaaaatg ccgcaaaaaa gggaataagg gcgacacgga aatgttgaat 4320
actcatactc ttcctttttc aatattattg aagcatttat cagggttatt gtctcatgag 4380
cggatacata tttgaatgta tttagaaaaa taaacaaata ggggttccgc gcacatttcc 4440
ccgaaaagtg ccacctgacg cgccctgtag cggcgcatta agcgcggcgg gtgtggtggt 4500
tacgcgcagc gtgaccgcta cacttgccag cgccctagcg cccgctcctt tcgctttctt 4560
cccttccttt ctcgccacgt tcgccggctt tccccgtcaa gctctaaatc gggggctccc 4620
tttagggttc cgatttagtg ctttacggca cctcgacccc aaaaaacttg attagggtga 4680
tggttcacgt agtgggccat cgccctgata gacggttttt cgccctttga cgttggagtc 4740
cacgttcttt aatagtggac tcttgttcca aactggaaca acactcaacc ctatctcggt 4800
ctattctttt gatttataag ggattttgcc gatttcggcc tattggttaa aaaatgagct 4860
gatttaacaa aaatttaacg cgaattttaa caaaatatta acgcttacaa tttgccattc 4920
gccattcagg ctgcgcaact gttgggaagg gcgatcggtg cgggcctctt cgctattacg 4980
ccagcccaag ctaccatgat aagtaagtaa tattaaggta cgggaggtac ttggagcggc 5040
cgcaataaaa tatctttatt ttcattacat ctgtgtgttg gttttttgtg tgaatcgata 5100
gtactaacat acgctctcca tcaaaacaaa acgaaacaaa acaaactagc aaaataggct 5160
gtccccagtg caagtgcagg tgccagaaca tttctctatc gata 5204
<210> 15
<211> 5272
<212> DNA
<213> Artificial Sequence
<220>
<223> p63-8x target
<400> 15
ggtaccgagc tcttacgcgt gctagcccgg gctcgagatc tgatatcaag ttctatcact 60
tccatggttc tatcacttca cgacttctat cactttgttt cttctatcac ttgaattctt 120
ctatcactta agttcttcta tcacttttcg aattctatca cttatcttct tctatcactt 180
cagttcgctt actagtgtcg aggtaggcgt gtacggtggg aggcctatat aagcagagct 240
cgtttagtga accgtcagat cgcctggagg taccgccacc atggaagatg ccaaaaacat 300
taagaagggc ccagcgccat tctacccact cgaagacggg accgccggcg agcagctgca 360
caaagccatg aagcgctacg ccctggtgcc cggcaccatc gcctttaccg acgcacatat 420
cgaggtggac attacctacg ccgagtactt cgagatgagc gttcggctgg cagaagctat 480
gaagcgctat gggctgaata caaaccatcg gatcgtggtg tgcagcgaga atagcttgca 540
gttcttcatg cccgtgttgg gtgccctgtt catcggtgtg gctgtggccc cagctaacga 600
catctacaac gagcgcgagc tgctgaacag catgggcatc agccagccca ccgtcgtatt 660
cgtgagcaag aaagggctgc aaaagatcct caacgtgcaa aagaagctac cgatcataca 720
aaagatcatc atcatggata gcaagaccga ctaccagggc ttccaaagca tgtacacctt 780
cgtgacttcc catttgccac ccggcttcaa cgagtacgac ttcgtgcccg agagcttcga 840
ccgggacaaa accatcgccc tgatcatgaa cagtagtggc agtaccggat tgcccaaggg 900
cgtagcccta ccgcaccgca ccgcttgtgt ccgattcagt catgcccgcg accccatctt 960
cggcaaccag atcatccccg acaccgctat cctcagcgtg gtgccatttc accacggctt 1020
cggcatgttc accacgctgg gctacttgat ctgcggcttt cgggtcgtgc tcatgtaccg 1080
cttcgaggag gagctattct tgcgcagctt gcaagactat aagattcaat ctgccctgct 1140
ggtgcccaca ctatttagct tcttcgctaa gagcactctc atcgacaagt acgacctaag 1200
caacttgcac gagatcgcca gcggcggggc gccgctcagc aaggaggtag gtgaggccgt 1260
ggccaaacgc ttccacctac caggcatccg ccagggctac ggcctgacag aaacaaccag 1320
cgccattctg atcacccccg aaggggacga caagcctggc gcagtaggca aggtggtgcc 1380
cttcttcgag gctaaggtgg tggacttgga caccggtaag acactgggtg tgaaccagcg 1440
cggcgagctg tgcgtccgtg gccccatgat catgagcggc tacgttaaca accccgaggc 1500
tacaaacgct ctcatcgaca aggacggctg gctgcacagc ggcgacatcg cctactggga 1560
cgaggacgag cacttcttca tcgtggaccg gctgaagagc ctgatcaaat acaagggcta 1620
ccaggtagcc ccagccgaac tggagagcat cctgctgcaa caccccaaca tcttcgacgc 1680
cggggtcgcc ggcctgcccg acgacgatgc cggcgagctg cccgccgcag tcgtcgtgct 1740
ggaacacggt aaaaccatga ccgagaagga gatcgtggac tatgtggcca gccaggttac 1800
aaccgccaag aagctgcgcg gtggtgttgt gttcgtggac gaggtgccta aaggactgac 1860
cggcaagttg gacgcccgca agatccgcga gattctcatt aaggccaaga agggcggcaa 1920
gatcgccgtg aattcttaac tgcagttaat ctagagtcgg ggcggccggc cgcttcgagc 1980
agacatgata agatacattg atgagtttgg acaaaccaca actagaatgc agtgaaaaaa 2040
atgctttatt tgtgaaattt gtgatgctat tgctttattt gtaaccatta taagctgcaa 2100
taaacaagtt aacaacaaca attgcattca ttttatgttt caggttcagg gggaggtgtg 2160
ggaggttttt taaagcaagt aaaacctcta caaatgtggt aaaatcgata aggatctgaa 2220
cgatggagcg gagaatgggc ggaactgggc ggagttaggg gcgggatggg cggagttagg 2280
ggcgggacta tggttgctga ctaattgaga tgcatgcttt gcatacttct gcctgctggg 2340
gagcctgggg actttccaca cctggttgct gactaattga gatgcatgct ttgcatactt 2400
ctgcctgctg gggagcctgg ggactttcca caccctaact gacacacatt ccacagcgga 2460
tccgtcgacc gatgcccttg agagccttca acccagtcag ctccttccgg tgggcgcggg 2520
gcatgactat cgtcgccgca cttatgactg tcttctttat catgcaactc gtaggacagg 2580
tgccggcagc gctcttccgc ttcctcgctc actgactcgc tgcgctcggt cgttcggctg 2640
cggcgagcgg tatcagctca ctcaaaggcg gtaatacggt tatccacaga atcaggggat 2700
aacgcaggaa agaacatgtg agcaaaaggc cagcaaaagg ccaggaaccg taaaaaggcc 2760
gcgttgctgg cgtttttcca taggctccgc ccccctgacg agcatcacaa aaatcgacgc 2820
tcaagtcaga ggtggcgaaa cccgacagga ctataaagat accaggcgtt tccccctgga 2880
agctccctcg tgcgctctcc tgttccgacc ctgccgctta ccggatacct gtccgccttt 2940
ctcccttcgg gaagcgtggc gctttctcat agctcacgct gtaggtatct cagttcggtg 3000
taggtcgttc gctccaagct gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc 3060
gccttatccg gtaactatcg tcttgagtcc aacccggtaa gacacgactt atcgccactg 3120
gcagcagcca ctggtaacag gattagcaga gcgaggtatg taggcggtgc tacagagttc 3180
ttgaagtggt ggcctaacta cggctacact agaagaacag tatttggtat ctgcgctctg 3240
ctgaagccag ttaccttcgg aaaaagagtt ggtagctctt gatccggcaa acaaaccacc 3300
gctggtagcg gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct 3360
caagaagatc ctttgatctt ttctacgggg tctgacgctc agtggaacga aaactcacgt 3420
taagggattt tggtcatgag attatcaaaa aggatcttca cctagatcct tttaaattaa 3480
aaatgaagtt ttaaatcaat ctaaagtata tatgagtaaa cttggtctga cagttaccaa 3540
tgcttaatca gtgaggcacc tatctcagcg atctgtctat ttcgttcatc catagttgcc 3600
tgactccccg tcgtgtagat aactacgata cgggagggct taccatctgg ccccagtgct 3660
gcaatgatac cgcgagaccc acgctcaccg gctccagatt tatcagcaat aaaccagcca 3720
gccggaaggg ccgagcgcag aagtggtcct gcaactttat ccgcctccat ccagtctatt 3780
aattgttgcc gggaagctag agtaagtagt tcgccagtta atagtttgcg caacgttgtt 3840
gccattgcta caggcatcgt ggtgtcacgc tcgtcgtttg gtatggcttc attcagctcc 3900
ggttcccaac gatcaaggcg agttacatga tcccccatgt tgtgcaaaaa agcggttagc 3960
tccttcggtc ctccgatcgt tgtcagaagt aagttggccg cagtgttatc actcatggtt 4020
atggcagcac tgcataattc tcttactgtc atgccatccg taagatgctt ttctgtgact 4080
ggtgagtact caaccaagtc attctgagaa tagtgtatgc ggcgaccgag ttgctcttgc 4140
ccggcgtcaa tacgggataa taccgcgcca catagcagaa ctttaaaagt gctcatcatt 4200
ggaaaacgtt cttcggggcg aaaactctca aggatcttac cgctgttgag atccagttcg 4260
atgtaaccca ctcgtgcacc caactgatct tcagcatctt ttactttcac cagcgtttct 4320
gggtgagcaa aaacaggaag gcaaaatgcc gcaaaaaagg gaataagggc gacacggaaa 4380
tgttgaatac tcatactctt cctttttcaa tattattgaa gcatttatca gggttattgt 4440
ctcatgagcg gatacatatt tgaatgtatt tagaaaaata aacaaatagg ggttccgcgc 4500
acatttcccc gaaaagtgcc acctgacgcg ccctgtagcg gcgcattaag cgcggcgggt 4560
gtggtggtta cgcgcagcgt gaccgctaca cttgccagcg ccctagcgcc cgctcctttc 4620
gctttcttcc cttcctttct cgccacgttc gccggctttc cccgtcaagc tctaaatcgg 4680
gggctccctt tagggttccg atttagtgct ttacggcacc tcgaccccaa aaaacttgat 4740
tagggtgatg gttcacgtag tgggccatcg ccctgataga cggtttttcg ccctttgacg 4800
ttggagtcca cgttctttaa tagtggactc ttgttccaaa ctggaacaac actcaaccct 4860
atctcggtct attcttttga tttataaggg attttgccga tttcggccta ttggttaaaa 4920
aatgagctga tttaacaaaa atttaacgcg aattttaaca aaatattaac gcttacaatt 4980
tgccattcgc cattcaggct gcgcaactgt tgggaagggc gatcggtgcg ggcctcttcg 5040
ctattacgcc agcccaagct accatgataa gtaagtaata ttaaggtacg ggaggtactt 5100
ggagcggccg caataaaata tctttatttt cattacatct gtgtgttggt tttttgtgtg 5160
aatcgatagt actaacatac gctctccatc aaaacaaaac gaaacaaaac aaactagcaa 5220
aataggctgt ccccagtgca agtgcaggtg ccagaacatt tctctatcga ta 5272
<210> 16
<211> 5228
<212> DNA
<213> Artificial Sequence
<220>
<223> pTac2-4x target
<400> 16
ggtaccgagc tcttacgcgt gctagcccgg gctcgagatc tgatatcaag taactttcgt 60
cactcattgt ttctaacttt cgtcactcat ggattctaac tttcgtcact catatcttct 120
aactttcgtc actcatcagt tcgcttacta gtgtcgaggt aggcgtgtac ggtgggaggc 180
ctatataagc agagctcgtt tagtgaaccg tcagatcgcc tggaggtacc gccaccatgg 240
aagatgccaa aaacattaag aagggcccag cgccattcta cccactcgaa gacgggaccg 300
ccggcgagca gctgcacaaa gccatgaagc gctacgccct ggtgcccggc accatcgcct 360
ttaccgacgc acatatcgag gtggacatta cctacgccga gtacttcgag atgagcgttc 420
ggctggcaga agctatgaag cgctatgggc tgaatacaaa ccatcggatc gtggtgtgca 480
gcgagaatag cttgcagttc ttcatgcccg tgttgggtgc cctgttcatc ggtgtggctg 540
tggccccagc taacgacatc tacaacgagc gcgagctgct gaacagcatg ggcatcagcc 600
agcccaccgt cgtattcgtg agcaagaaag ggctgcaaaa gatcctcaac gtgcaaaaga 660
agctaccgat catacaaaag atcatcatca tggatagcaa gaccgactac cagggcttcc 720
aaagcatgta caccttcgtg acttcccatt tgccacccgg cttcaacgag tacgacttcg 780
tgcccgagag cttcgaccgg gacaaaacca tcgccctgat catgaacagt agtggcagta 840
ccggattgcc caagggcgta gccctaccgc accgcaccgc ttgtgtccga ttcagtcatg 900
cccgcgaccc catcttcggc aaccagatca tccccgacac cgctatcctc agcgtggtgc 960
catttcacca cggcttcggc atgttcacca cgctgggcta cttgatctgc ggctttcggg 1020
tcgtgctcat gtaccgcttc gaggaggagc tattcttgcg cagcttgcaa gactataaga 1080
ttcaatctgc cctgctggtg cccacactat ttagcttctt cgctaagagc actctcatcg 1140
acaagtacga cctaagcaac ttgcacgaga tcgccagcgg cggggcgccg ctcagcaagg 1200
aggtaggtga ggccgtggcc aaacgcttcc acctaccagg catccgccag ggctacggcc 1260
tgacagaaac aaccagcgcc attctgatca cccccgaagg ggacgacaag cctggcgcag 1320
taggcaaggt ggtgcccttc ttcgaggcta aggtggtgga cttggacacc ggtaagacac 1380
tgggtgtgaa ccagcgcggc gagctgtgcg tccgtggccc catgatcatg agcggctacg 1440
ttaacaaccc cgaggctaca aacgctctca tcgacaagga cggctggctg cacagcggcg 1500
acatcgccta ctgggacgag gacgagcact tcttcatcgt ggaccggctg aagagcctga 1560
tcaaatacaa gggctaccag gtagccccag ccgaactgga gagcatcctg ctgcaacacc 1620
ccaacatctt cgacgccggg gtcgccggcc tgcccgacga cgatgccggc gagctgcccg 1680
ccgcagtcgt cgtgctggaa cacggtaaaa ccatgaccga gaaggagatc gtggactatg 1740
tggccagcca ggttacaacc gccaagaagc tgcgcggtgg tgttgtgttc gtggacgagg 1800
tgcctaaagg actgaccggc aagttggacg cccgcaagat ccgcgagatt ctcattaagg 1860
ccaagaaggg cggcaagatc gccgtgaatt cttaactgca gttaatctag agtcggggcg 1920
gccggccgct tcgagcagac atgataagat acattgatga gtttggacaa accacaacta 1980
gaatgcagtg aaaaaaatgc tttatttgtg aaatttgtga tgctattgct ttatttgtaa 2040
ccattataag ctgcaataaa caagttaaca acaacaattg cattcatttt atgtttcagg 2100
ttcaggggga ggtgtgggag gttttttaaa gcaagtaaaa cctctacaaa tgtggtaaaa 2160
tcgataagga tctgaacgat ggagcggaga atgggcggaa ctgggcggag ttaggggcgg 2220
gatgggcgga gttaggggcg ggactatggt tgctgactaa ttgagatgca tgctttgcat 2280
acttctgcct gctggggagc ctggggactt tccacacctg gttgctgact aattgagatg 2340
catgctttgc atacttctgc ctgctgggga gcctggggac tttccacacc ctaactgaca 2400
cacattccac agcggatccg tcgaccgatg cccttgagag ccttcaaccc agtcagctcc 2460
ttccggtggg cgcggggcat gactatcgtc gccgcactta tgactgtctt ctttatcatg 2520
caactcgtag gacaggtgcc ggcagcgctc ttccgcttcc tcgctcactg actcgctgcg 2580
ctcggtcgtt cggctgcggc gagcggtatc agctcactca aaggcggtaa tacggttatc 2640
cacagaatca ggggataacg caggaaagaa catgtgagca aaaggccagc aaaaggccag 2700
gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg ctccgccccc ctgacgagca 2760
tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg acaggactat aaagatacca 2820
ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc cgcttaccgg 2880
atacctgtcc gcctttctcc cttcgggaag cgtggcgctt tctcatagct cacgctgtag 2940
gtatctcagt tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg aaccccccgt 3000
tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc cggtaagaca 3060
cgacttatcg ccactggcag cagccactgg taacaggatt agcagagcga ggtatgtagg 3120
cggtgctaca gagttcttga agtggtggcc taactacggc tacactagaa gaacagtatt 3180
tggtatctgc gctctgctga agccagttac cttcggaaaa agagttggta gctcttgatc 3240
cggcaaacaa accaccgctg gtagcggtgg tttttttgtt tgcaagcagc agattacgcg 3300
cagaaaaaaa ggatctcaag aagatccttt gatcttttct acggggtctg acgctcagtg 3360
gaacgaaaac tcacgttaag ggattttggt catgagatta tcaaaaagga tcttcaccta 3420
gatcctttta aattaaaaat gaagttttaa atcaatctaa agtatatatg agtaaacttg 3480
gtctgacagt taccaatgct taatcagtga ggcacctatc tcagcgatct gtctatttcg 3540
ttcatccata gttgcctgac tccccgtcgt gtagataact acgatacggg agggcttacc 3600
atctggcccc agtgctgcaa tgataccgcg agacccacgc tcaccggctc cagatttatc 3660
agcaataaac cagccagccg gaagggccga gcgcagaagt ggtcctgcaa ctttatccgc 3720
ctccatccag tctattaatt gttgccggga agctagagta agtagttcgc cagttaatag 3780
tttgcgcaac gttgttgcca ttgctacagg catcgtggtg tcacgctcgt cgtttggtat 3840
ggcttcattc agctccggtt cccaacgatc aaggcgagtt acatgatccc ccatgttgtg 3900
caaaaaagcg gttagctcct tcggtcctcc gatcgttgtc agaagtaagt tggccgcagt 3960
gttatcactc atggttatgg cagcactgca taattctctt actgtcatgc catccgtaag 4020
atgcttttct gtgactggtg agtactcaac caagtcattc tgagaatagt gtatgcggcg 4080
accgagttgc tcttgcccgg cgtcaatacg ggataatacc gcgccacata gcagaacttt 4140
aaaagtgctc atcattggaa aacgttcttc ggggcgaaaa ctctcaagga tcttaccgct 4200
gttgagatcc agttcgatgt aacccactcg tgcacccaac tgatcttcag catcttttac 4260
tttcaccagc gtttctgggt gagcaaaaac aggaaggcaa aatgccgcaa aaaagggaat 4320
aagggcgaca cggaaatgtt gaatactcat actcttcctt tttcaatatt attgaagcat 4380
ttatcagggt tattgtctca tgagcggata catatttgaa tgtatttaga aaaataaaca 4440
aataggggtt ccgcgcacat ttccccgaaa agtgccacct gacgcgccct gtagcggcgc 4500
attaagcgcg gcgggtgtgg tggttacgcg cagcgtgacc gctacacttg ccagcgccct 4560
agcgcccgct cctttcgctt tcttcccttc ctttctcgcc acgttcgccg gctttccccg 4620
tcaagctcta aatcgggggc tccctttagg gttccgattt agtgctttac ggcacctcga 4680
ccccaaaaaa cttgattagg gtgatggttc acgtagtggg ccatcgccct gatagacggt 4740
ttttcgccct ttgacgttgg agtccacgtt ctttaatagt ggactcttgt tccaaactgg 4800
aacaacactc aaccctatct cggtctattc ttttgattta taagggattt tgccgatttc 4860
ggcctattgg ttaaaaaatg agctgattta acaaaaattt aacgcgaatt ttaacaaaat 4920
attaacgctt acaatttgcc attcgccatt caggctgcgc aactgttggg aagggcgatc 4980
ggtgcgggcc tcttcgctat tacgccagcc caagctacca tgataagtaa gtaatattaa 5040
ggtacgggag gtacttggag cggccgcaat aaaatatctt tattttcatt acatctgtgt 5100
gttggttttt tgtgtgaatc gatagtacta acatacgctc tccatcaaaa caaaacgaaa 5160
caaaacaaac tagcaaaata ggctgtcccc agtgcaagtg caggtgccag aacatttctc 5220
tatcgata 5228
<210> 17
<211> 5320
<212> DNA
<213> Artificial Sequence
<220>
<223> pTac2-8x target
<400> 17
ggtaccgagc tcttacgcgt gctagcccgg gctcgagatc tgatatcaag taactttcgt 60
cactcatcca tggtaacttt cgtcactcat cacgactaac tttcgtcact cattgtttct 120
aactttcgtc actcatgaat tctaactttc gtcactcata agttctaact ttcgtcactc 180
atttcgaata actttcgtca ctcatatctt ctaactttcg tcactcatca gttcgcttac 240
tagtgtcgag gtaggcgtgt acggtgggag gcctatataa gcagagctcg tttagtgaac 300
cgtcagatcg cctggaggta ccgccaccat ggaagatgcc aaaaacatta agaagggccc 360
agcgccattc tacccactcg aagacgggac cgccggcgag cagctgcaca aagccatgaa 420
gcgctacgcc ctggtgcccg gcaccatcgc ctttaccgac gcacatatcg aggtggacat 480
tacctacgcc gagtacttcg agatgagcgt tcggctggca gaagctatga agcgctatgg 540
gctgaataca aaccatcgga tcgtggtgtg cagcgagaat agcttgcagt tcttcatgcc 600
cgtgttgggt gccctgttca tcggtgtggc tgtggcccca gctaacgaca tctacaacga 660
gcgcgagctg ctgaacagca tgggcatcag ccagcccacc gtcgtattcg tgagcaagaa 720
agggctgcaa aagatcctca acgtgcaaaa gaagctaccg atcatacaaa agatcatcat 780
catggatagc aagaccgact accagggctt ccaaagcatg tacaccttcg tgacttccca 840
tttgccaccc ggcttcaacg agtacgactt cgtgcccgag agcttcgacc gggacaaaac 900
catcgccctg atcatgaaca gtagtggcag taccggattg cccaagggcg tagccctacc 960
gcaccgcacc gcttgtgtcc gattcagtca tgcccgcgac cccatcttcg gcaaccagat 1020
catccccgac accgctatcc tcagcgtggt gccatttcac cacggcttcg gcatgttcac 1080
cacgctgggc tacttgatct gcggctttcg ggtcgtgctc atgtaccgct tcgaggagga 1140
gctattcttg cgcagcttgc aagactataa gattcaatct gccctgctgg tgcccacact 1200
atttagcttc ttcgctaaga gcactctcat cgacaagtac gacctaagca acttgcacga 1260
gatcgccagc ggcggggcgc cgctcagcaa ggaggtaggt gaggccgtgg ccaaacgctt 1320
ccacctacca ggcatccgcc agggctacgg cctgacagaa acaaccagcg ccattctgat 1380
cacccccgaa ggggacgaca agcctggcgc agtaggcaag gtggtgccct tcttcgaggc 1440
taaggtggtg gacttggaca ccggtaagac actgggtgtg aaccagcgcg gcgagctgtg 1500
cgtccgtggc cccatgatca tgagcggcta cgttaacaac cccgaggcta caaacgctct 1560
catcgacaag gacggctggc tgcacagcgg cgacatcgcc tactgggacg aggacgagca 1620
cttcttcatc gtggaccggc tgaagagcct gatcaaatac aagggctacc aggtagcccc 1680
agccgaactg gagagcatcc tgctgcaaca ccccaacatc ttcgacgccg gggtcgccgg 1740
cctgcccgac gacgatgccg gcgagctgcc cgccgcagtc gtcgtgctgg aacacggtaa 1800
aaccatgacc gagaaggaga tcgtggacta tgtggccagc caggttacaa ccgccaagaa 1860
gctgcgcggt ggtgttgtgt tcgtggacga ggtgcctaaa ggactgaccg gcaagttgga 1920
cgcccgcaag atccgcgaga ttctcattaa ggccaagaag ggcggcaaga tcgccgtgaa 1980
ttcttaactg cagttaatct agagtcgggg cggccggccg cttcgagcag acatgataag 2040
atacattgat gagtttggac aaaccacaac tagaatgcag tgaaaaaaat gctttatttg 2100
tgaaatttgt gatgctattg ctttatttgt aaccattata agctgcaata aacaagttaa 2160
caacaacaat tgcattcatt ttatgtttca ggttcagggg gaggtgtggg aggtttttta 2220
aagcaagtaa aacctctaca aatgtggtaa aatcgataag gatctgaacg atggagcgga 2280
gaatgggcgg aactgggcgg agttaggggc gggatgggcg gagttagggg cgggactatg 2340
gttgctgact aattgagatg catgctttgc atacttctgc ctgctgggga gcctggggac 2400
tttccacacc tggttgctga ctaattgaga tgcatgcttt gcatacttct gcctgctggg 2460
gagcctgggg actttccaca ccctaactga cacacattcc acagcggatc cgtcgaccga 2520
tgcccttgag agccttcaac ccagtcagct ccttccggtg ggcgcggggc atgactatcg 2580
tcgccgcact tatgactgtc ttctttatca tgcaactcgt aggacaggtg ccggcagcgc 2640
tcttccgctt cctcgctcac tgactcgctg cgctcggtcg ttcggctgcg gcgagcggta 2700
tcagctcact caaaggcggt aatacggtta tccacagaat caggggataa cgcaggaaag 2760
aacatgtgag caaaaggcca gcaaaaggcc aggaaccgta aaaaggccgc gttgctggcg 2820
tttttccata ggctccgccc ccctgacgag catcacaaaa atcgacgctc aagtcagagg 2880
tggcgaaacc cgacaggact ataaagatac caggcgtttc cccctggaag ctccctcgtg 2940
cgctctcctg ttccgaccct gccgcttacc ggatacctgt ccgcctttct cccttcggga 3000
agcgtggcgc tttctcatag ctcacgctgt aggtatctca gttcggtgta ggtcgttcgc 3060
tccaagctgg gctgtgtgca cgaacccccc gttcagcccg accgctgcgc cttatccggt 3120
aactatcgtc ttgagtccaa cccggtaaga cacgacttat cgccactggc agcagccact 3180
ggtaacagga ttagcagagc gaggtatgta ggcggtgcta cagagttctt gaagtggtgg 3240
cctaactacg gctacactag aagaacagta tttggtatct gcgctctgct gaagccagtt 3300
accttcggaa aaagagttgg tagctcttga tccggcaaac aaaccaccgc tggtagcggt 3360
ggtttttttg tttgcaagca gcagattacg cgcagaaaaa aaggatctca agaagatcct 3420
ttgatctttt ctacggggtc tgacgctcag tggaacgaaa actcacgtta agggattttg 3480
gtcatgagat tatcaaaaag gatcttcacc tagatccttt taaattaaaa atgaagtttt 3540
aaatcaatct aaagtatata tgagtaaact tggtctgaca gttaccaatg cttaatcagt 3600
gaggcaccta tctcagcgat ctgtctattt cgttcatcca tagttgcctg actccccgtc 3660
gtgtagataa ctacgatacg ggagggctta ccatctggcc ccagtgctgc aatgataccg 3720
cgagacccac gctcaccggc tccagattta tcagcaataa accagccagc cggaagggcc 3780
gagcgcagaa gtggtcctgc aactttatcc gcctccatcc agtctattaa ttgttgccgg 3840
gaagctagag taagtagttc gccagttaat agtttgcgca acgttgttgc cattgctaca 3900
ggcatcgtgg tgtcacgctc gtcgtttggt atggcttcat tcagctccgg ttcccaacga 3960
tcaaggcgag ttacatgatc ccccatgttg tgcaaaaaag cggttagctc cttcggtcct 4020
ccgatcgttg tcagaagtaa gttggccgca gtgttatcac tcatggttat ggcagcactg 4080
cataattctc ttactgtcat gccatccgta agatgctttt ctgtgactgg tgagtactca 4140
accaagtcat tctgagaata gtgtatgcgg cgaccgagtt gctcttgccc ggcgtcaata 4200
cgggataata ccgcgccaca tagcagaact ttaaaagtgc tcatcattgg aaaacgttct 4260
tcggggcgaa aactctcaag gatcttaccg ctgttgagat ccagttcgat gtaacccact 4320
cgtgcaccca actgatcttc agcatctttt actttcacca gcgtttctgg gtgagcaaaa 4380
acaggaaggc aaaatgccgc aaaaaaggga ataagggcga cacggaaatg ttgaatactc 4440
atactcttcc tttttcaata ttattgaagc atttatcagg gttattgtct catgagcgga 4500
tacatatttg aatgtattta gaaaaataaa caaatagggg ttccgcgcac atttccccga 4560
aaagtgccac ctgacgcgcc ctgtagcggc gcattaagcg cggcgggtgt ggtggttacg 4620
cgcagcgtga ccgctacact tgccagcgcc ctagcgcccg ctcctttcgc tttcttccct 4680
tcctttctcg ccacgttcgc cggctttccc cgtcaagctc taaatcgggg gctcccttta 4740
gggttccgat ttagtgcttt acggcacctc gaccccaaaa aacttgatta gggtgatggt 4800
tcacgtagtg ggccatcgcc ctgatagacg gtttttcgcc ctttgacgtt ggagtccacg 4860
ttctttaata gtggactctt gttccaaact ggaacaacac tcaaccctat ctcggtctat 4920
tcttttgatt tataagggat tttgccgatt tcggcctatt ggttaaaaaa tgagctgatt 4980
taacaaaaat ttaacgcgaa ttttaacaaa atattaacgc ttacaatttg ccattcgcca 5040
ttcaggctgc gcaactgttg ggaagggcga tcggtgcggg cctcttcgct attacgccag 5100
cccaagctac catgataagt aagtaatatt aaggtacggg aggtacttgg agcggccgca 5160
ataaaatatc tttattttca ttacatctgt gtgttggttt tttgtgtgaa tcgatagtac 5220
taacatacgc tctccatcaa aacaaaacga aacaaaacaa actagcaaaa taggctgtcc 5280
ccagtgcaag tgcaggtgcc agaacatttc tctatcgata 5320
<210> 18
<211> 5212
<212> DNA
<213> Artificial Sequence
<220>
<223> GUN1-4x target
<400> 18
ggtaccgagc tcttacgcgt gctagcccgg gctcgagatc tgatatcaag taatttgtcg 60
atttgtttct aatttgtcga ttgaattcta atttgtcgat tatcttctaa tttgtcgatt 120
cagttcgctt actagtgtcg aggtaggcgt gtacggtggg aggcctatat aagcagagct 180
cgtttagtga accgtcagat cgcctggagg taccgccacc atggaagatg ccaaaaacat 240
taagaagggc ccagcgccat tctacccact cgaagacggg accgccggcg agcagctgca 300
caaagccatg aagcgctacg ccctggtgcc cggcaccatc gcctttaccg acgcacatat 360
cgaggtggac attacctacg ccgagtactt cgagatgagc gttcggctgg cagaagctat 420
gaagcgctat gggctgaata caaaccatcg gatcgtggtg tgcagcgaga atagcttgca 480
gttcttcatg cccgtgttgg gtgccctgtt catcggtgtg gctgtggccc cagctaacga 540
catctacaac gagcgcgagc tgctgaacag catgggcatc agccagccca ccgtcgtatt 600
cgtgagcaag aaagggctgc aaaagatcct caacgtgcaa aagaagctac cgatcataca 660
aaagatcatc atcatggata gcaagaccga ctaccagggc ttccaaagca tgtacacctt 720
cgtgacttcc catttgccac ccggcttcaa cgagtacgac ttcgtgcccg agagcttcga 780
ccgggacaaa accatcgccc tgatcatgaa cagtagtggc agtaccggat tgcccaaggg 840
cgtagcccta ccgcaccgca ccgcttgtgt ccgattcagt catgcccgcg accccatctt 900
cggcaaccag atcatccccg acaccgctat cctcagcgtg gtgccatttc accacggctt 960
cggcatgttc accacgctgg gctacttgat ctgcggcttt cgggtcgtgc tcatgtaccg 1020
cttcgaggag gagctattct tgcgcagctt gcaagactat aagattcaat ctgccctgct 1080
ggtgcccaca ctatttagct tcttcgctaa gagcactctc atcgacaagt acgacctaag 1140
caacttgcac gagatcgcca gcggcggggc gccgctcagc aaggaggtag gtgaggccgt 1200
ggccaaacgc ttccacctac caggcatccg ccagggctac ggcctgacag aaacaaccag 1260
cgccattctg atcacccccg aaggggacga caagcctggc gcagtaggca aggtggtgcc 1320
cttcttcgag gctaaggtgg tggacttgga caccggtaag acactgggtg tgaaccagcg 1380
cggcgagctg tgcgtccgtg gccccatgat catgagcggc tacgttaaca accccgaggc 1440
tacaaacgct ctcatcgaca aggacggctg gctgcacagc ggcgacatcg cctactggga 1500
cgaggacgag cacttcttca tcgtggaccg gctgaagagc ctgatcaaat acaagggcta 1560
ccaggtagcc ccagccgaac tggagagcat cctgctgcaa caccccaaca tcttcgacgc 1620
cggggtcgcc ggcctgcccg acgacgatgc cggcgagctg cccgccgcag tcgtcgtgct 1680
ggaacacggt aaaaccatga ccgagaagga gatcgtggac tatgtggcca gccaggttac 1740
aaccgccaag aagctgcgcg gtggtgttgt gttcgtggac gaggtgccta aaggactgac 1800
cggcaagttg gacgcccgca agatccgcga gattctcatt aaggccaaga agggcggcaa 1860
gatcgccgtg aattcttaac tgcagttaat ctagagtcgg ggcggccggc cgcttcgagc 1920
agacatgata agatacattg atgagtttgg acaaaccaca actagaatgc agtgaaaaaa 1980
atgctttatt tgtgaaattt gtgatgctat tgctttattt gtaaccatta taagctgcaa 2040
taaacaagtt aacaacaaca attgcattca ttttatgttt caggttcagg gggaggtgtg 2100
ggaggttttt taaagcaagt aaaacctcta caaatgtggt aaaatcgata aggatctgaa 2160
cgatggagcg gagaatgggc ggaactgggc ggagttaggg gcgggatggg cggagttagg 2220
ggcgggacta tggttgctga ctaattgaga tgcatgcttt gcatacttct gcctgctggg 2280
gagcctgggg actttccaca cctggttgct gactaattga gatgcatgct ttgcatactt 2340
ctgcctgctg gggagcctgg ggactttcca caccctaact gacacacatt ccacagcgga 2400
tccgtcgacc gatgcccttg agagccttca acccagtcag ctccttccgg tgggcgcggg 2460
gcatgactat cgtcgccgca cttatgactg tcttctttat catgcaactc gtaggacagg 2520
tgccggcagc gctcttccgc ttcctcgctc actgactcgc tgcgctcggt cgttcggctg 2580
cggcgagcgg tatcagctca ctcaaaggcg gtaatacggt tatccacaga atcaggggat 2640
aacgcaggaa agaacatgtg agcaaaaggc cagcaaaagg ccaggaaccg taaaaaggcc 2700
gcgttgctgg cgtttttcca taggctccgc ccccctgacg agcatcacaa aaatcgacgc 2760
tcaagtcaga ggtggcgaaa cccgacagga ctataaagat accaggcgtt tccccctgga 2820
agctccctcg tgcgctctcc tgttccgacc ctgccgctta ccggatacct gtccgccttt 2880
ctcccttcgg gaagcgtggc gctttctcat agctcacgct gtaggtatct cagttcggtg 2940
taggtcgttc gctccaagct gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc 3000
gccttatccg gtaactatcg tcttgagtcc aacccggtaa gacacgactt atcgccactg 3060
gcagcagcca ctggtaacag gattagcaga gcgaggtatg taggcggtgc tacagagttc 3120
ttgaagtggt ggcctaacta cggctacact agaagaacag tatttggtat ctgcgctctg 3180
ctgaagccag ttaccttcgg aaaaagagtt ggtagctctt gatccggcaa acaaaccacc 3240
gctggtagcg gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct 3300
caagaagatc ctttgatctt ttctacgggg tctgacgctc agtggaacga aaactcacgt 3360
taagggattt tggtcatgag attatcaaaa aggatcttca cctagatcct tttaaattaa 3420
aaatgaagtt ttaaatcaat ctaaagtata tatgagtaaa cttggtctga cagttaccaa 3480
tgcttaatca gtgaggcacc tatctcagcg atctgtctat ttcgttcatc catagttgcc 3540
tgactccccg tcgtgtagat aactacgata cgggagggct taccatctgg ccccagtgct 3600
gcaatgatac cgcgagaccc acgctcaccg gctccagatt tatcagcaat aaaccagcca 3660
gccggaaggg ccgagcgcag aagtggtcct gcaactttat ccgcctccat ccagtctatt 3720
aattgttgcc gggaagctag agtaagtagt tcgccagtta atagtttgcg caacgttgtt 3780
gccattgcta caggcatcgt ggtgtcacgc tcgtcgtttg gtatggcttc attcagctcc 3840
ggttcccaac gatcaaggcg agttacatga tcccccatgt tgtgcaaaaa agcggttagc 3900
tccttcggtc ctccgatcgt tgtcagaagt aagttggccg cagtgttatc actcatggtt 3960
atggcagcac tgcataattc tcttactgtc atgccatccg taagatgctt ttctgtgact 4020
ggtgagtact caaccaagtc attctgagaa tagtgtatgc ggcgaccgag ttgctcttgc 4080
ccggcgtcaa tacgggataa taccgcgcca catagcagaa ctttaaaagt gctcatcatt 4140
ggaaaacgtt cttcggggcg aaaactctca aggatcttac cgctgttgag atccagttcg 4200
atgtaaccca ctcgtgcacc caactgatct tcagcatctt ttactttcac cagcgtttct 4260
gggtgagcaa aaacaggaag gcaaaatgcc gcaaaaaagg gaataagggc gacacggaaa 4320
tgttgaatac tcatactctt cctttttcaa tattattgaa gcatttatca gggttattgt 4380
ctcatgagcg gatacatatt tgaatgtatt tagaaaaata aacaaatagg ggttccgcgc 4440
acatttcccc gaaaagtgcc acctgacgcg ccctgtagcg gcgcattaag cgcggcgggt 4500
gtggtggtta cgcgcagcgt gaccgctaca cttgccagcg ccctagcgcc cgctcctttc 4560
gctttcttcc cttcctttct cgccacgttc gccggctttc cccgtcaagc tctaaatcgg 4620
gggctccctt tagggttccg atttagtgct ttacggcacc tcgaccccaa aaaacttgat 4680
tagggtgatg gttcacgtag tgggccatcg ccctgataga cggtttttcg ccctttgacg 4740
ttggagtcca cgttctttaa tagtggactc ttgttccaaa ctggaacaac actcaaccct 4800
atctcggtct attcttttga tttataaggg attttgccga tttcggccta ttggttaaaa 4860
aatgagctga tttaacaaaa atttaacgcg aattttaaca aaatattaac gcttacaatt 4920
tgccattcgc cattcaggct gcgcaactgt tgggaagggc gatcggtgcg ggcctcttcg 4980
ctattacgcc agcccaagct accatgataa gtaagtaata ttaaggtacg ggaggtactt 5040
ggagcggccg caataaaata tctttatttt cattacatct gtgtgttggt tttttgtgtg 5100
aatcgatagt actaacatac gctctccatc aaaacaaaac gaaacaaaac aaactagcaa 5160
aataggctgt ccccagtgca agtgcaggtg ccagaacatt tctctatcga ta 5212
<210> 19
<211> 5288
<212> DNA
<213> Artificial Sequence
<220>
<223> GUN1-8x target
<400> 19
ggtaccgagc tcttacgcgt gctagcccgg gctcgagatc tgatatcaag taatttgtcg 60
attccatggt aatttgtcga ttcacgacta atttgtcgat ttgtttctaa tttgtcgatt 120
gaattctaat ttgtcgatta agttctaatt tgtcgatttt cgaataattt gtcgattatc 180
ttctaatttg tcgattcagt tcgcttacta gtgtcgaggt aggcgtgtac ggtgggaggc 240
ctatataagc agagctcgtt tagtgaaccg tcagatcgcc tggaggtacc gccaccatgg 300
aagatgccaa aaacattaag aagggcccag cgccattcta cccactcgaa gacgggaccg 360
ccggcgagca gctgcacaaa gccatgaagc gctacgccct ggtgcccggc accatcgcct 420
ttaccgacgc acatatcgag gtggacatta cctacgccga gtacttcgag atgagcgttc 480
ggctggcaga agctatgaag cgctatgggc tgaatacaaa ccatcggatc gtggtgtgca 540
gcgagaatag cttgcagttc ttcatgcccg tgttgggtgc cctgttcatc ggtgtggctg 600
tggccccagc taacgacatc tacaacgagc gcgagctgct gaacagcatg ggcatcagcc 660
agcccaccgt cgtattcgtg agcaagaaag ggctgcaaaa gatcctcaac gtgcaaaaga 720
agctaccgat catacaaaag atcatcatca tggatagcaa gaccgactac cagggcttcc 780
aaagcatgta caccttcgtg acttcccatt tgccacccgg cttcaacgag tacgacttcg 840
tgcccgagag cttcgaccgg gacaaaacca tcgccctgat catgaacagt agtggcagta 900
ccggattgcc caagggcgta gccctaccgc accgcaccgc ttgtgtccga ttcagtcatg 960
cccgcgaccc catcttcggc aaccagatca tccccgacac cgctatcctc agcgtggtgc 1020
catttcacca cggcttcggc atgttcacca cgctgggcta cttgatctgc ggctttcggg 1080
tcgtgctcat gtaccgcttc gaggaggagc tattcttgcg cagcttgcaa gactataaga 1140
ttcaatctgc cctgctggtg cccacactat ttagcttctt cgctaagagc actctcatcg 1200
acaagtacga cctaagcaac ttgcacgaga tcgccagcgg cggggcgccg ctcagcaagg 1260
aggtaggtga ggccgtggcc aaacgcttcc acctaccagg catccgccag ggctacggcc 1320
tgacagaaac aaccagcgcc attctgatca cccccgaagg ggacgacaag cctggcgcag 1380
taggcaaggt ggtgcccttc ttcgaggcta aggtggtgga cttggacacc ggtaagacac 1440
tgggtgtgaa ccagcgcggc gagctgtgcg tccgtggccc catgatcatg agcggctacg 1500
ttaacaaccc cgaggctaca aacgctctca tcgacaagga cggctggctg cacagcggcg 1560
acatcgccta ctgggacgag gacgagcact tcttcatcgt ggaccggctg aagagcctga 1620
tcaaatacaa gggctaccag gtagccccag ccgaactgga gagcatcctg ctgcaacacc 1680
ccaacatctt cgacgccggg gtcgccggcc tgcccgacga cgatgccggc gagctgcccg 1740
ccgcagtcgt cgtgctggaa cacggtaaaa ccatgaccga gaaggagatc gtggactatg 1800
tggccagcca ggttacaacc gccaagaagc tgcgcggtgg tgttgtgttc gtggacgagg 1860
tgcctaaagg actgaccggc aagttggacg cccgcaagat ccgcgagatt ctcattaagg 1920
ccaagaaggg cggcaagatc gccgtgaatt cttaactgca gttaatctag agtcggggcg 1980
gccggccgct tcgagcagac atgataagat acattgatga gtttggacaa accacaacta 2040
gaatgcagtg aaaaaaatgc tttatttgtg aaatttgtga tgctattgct ttatttgtaa 2100
ccattataag ctgcaataaa caagttaaca acaacaattg cattcatttt atgtttcagg 2160
ttcaggggga ggtgtgggag gttttttaaa gcaagtaaaa cctctacaaa tgtggtaaaa 2220
tcgataagga tctgaacgat ggagcggaga atgggcggaa ctgggcggag ttaggggcgg 2280
gatgggcgga gttaggggcg ggactatggt tgctgactaa ttgagatgca tgctttgcat 2340
acttctgcct gctggggagc ctggggactt tccacacctg gttgctgact aattgagatg 2400
catgctttgc atacttctgc ctgctgggga gcctggggac tttccacacc ctaactgaca 2460
cacattccac agcggatccg tcgaccgatg cccttgagag ccttcaaccc agtcagctcc 2520
ttccggtggg cgcggggcat gactatcgtc gccgcactta tgactgtctt ctttatcatg 2580
caactcgtag gacaggtgcc ggcagcgctc ttccgcttcc tcgctcactg actcgctgcg 2640
ctcggtcgtt cggctgcggc gagcggtatc agctcactca aaggcggtaa tacggttatc 2700
cacagaatca ggggataacg caggaaagaa catgtgagca aaaggccagc aaaaggccag 2760
gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg ctccgccccc ctgacgagca 2820
tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg acaggactat aaagatacca 2880
ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc cgcttaccgg 2940
atacctgtcc gcctttctcc cttcgggaag cgtggcgctt tctcatagct cacgctgtag 3000
gtatctcagt tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg aaccccccgt 3060
tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc cggtaagaca 3120
cgacttatcg ccactggcag cagccactgg taacaggatt agcagagcga ggtatgtagg 3180
cggtgctaca gagttcttga agtggtggcc taactacggc tacactagaa gaacagtatt 3240
tggtatctgc gctctgctga agccagttac cttcggaaaa agagttggta gctcttgatc 3300
cggcaaacaa accaccgctg gtagcggtgg tttttttgtt tgcaagcagc agattacgcg 3360
cagaaaaaaa ggatctcaag aagatccttt gatcttttct acggggtctg acgctcagtg 3420
gaacgaaaac tcacgttaag ggattttggt catgagatta tcaaaaagga tcttcaccta 3480
gatcctttta aattaaaaat gaagttttaa atcaatctaa agtatatatg agtaaacttg 3540
gtctgacagt taccaatgct taatcagtga ggcacctatc tcagcgatct gtctatttcg 3600
ttcatccata gttgcctgac tccccgtcgt gtagataact acgatacggg agggcttacc 3660
atctggcccc agtgctgcaa tgataccgcg agacccacgc tcaccggctc cagatttatc 3720
agcaataaac cagccagccg gaagggccga gcgcagaagt ggtcctgcaa ctttatccgc 3780
ctccatccag tctattaatt gttgccggga agctagagta agtagttcgc cagttaatag 3840
tttgcgcaac gttgttgcca ttgctacagg catcgtggtg tcacgctcgt cgtttggtat 3900
ggcttcattc agctccggtt cccaacgatc aaggcgagtt acatgatccc ccatgttgtg 3960
caaaaaagcg gttagctcct tcggtcctcc gatcgttgtc agaagtaagt tggccgcagt 4020
gttatcactc atggttatgg cagcactgca taattctctt actgtcatgc catccgtaag 4080
atgcttttct gtgactggtg agtactcaac caagtcattc tgagaatagt gtatgcggcg 4140
accgagttgc tcttgcccgg cgtcaatacg ggataatacc gcgccacata gcagaacttt 4200
aaaagtgctc atcattggaa aacgttcttc ggggcgaaaa ctctcaagga tcttaccgct 4260
gttgagatcc agttcgatgt aacccactcg tgcacccaac tgatcttcag catcttttac 4320
tttcaccagc gtttctgggt gagcaaaaac aggaaggcaa aatgccgcaa aaaagggaat 4380
aagggcgaca cggaaatgtt gaatactcat actcttcctt tttcaatatt attgaagcat 4440
ttatcagggt tattgtctca tgagcggata catatttgaa tgtatttaga aaaataaaca 4500
aataggggtt ccgcgcacat ttccccgaaa agtgccacct gacgcgccct gtagcggcgc 4560
attaagcgcg gcgggtgtgg tggttacgcg cagcgtgacc gctacacttg ccagcgccct 4620
agcgcccgct cctttcgctt tcttcccttc ctttctcgcc acgttcgccg gctttccccg 4680
tcaagctcta aatcgggggc tccctttagg gttccgattt agtgctttac ggcacctcga 4740
ccccaaaaaa cttgattagg gtgatggttc acgtagtggg ccatcgccct gatagacggt 4800
ttttcgccct ttgacgttgg agtccacgtt ctttaatagt ggactcttgt tccaaactgg 4860
aacaacactc aaccctatct cggtctattc ttttgattta taagggattt tgccgatttc 4920
ggcctattgg ttaaaaaatg agctgattta acaaaaattt aacgcgaatt ttaacaaaat 4980
attaacgctt acaatttgcc attcgccatt caggctgcgc aactgttggg aagggcgatc 5040
ggtgcgggcc tcttcgctat tacgccagcc caagctacca tgataagtaa gtaatattaa 5100
ggtacgggag gtacttggag cggccgcaat aaaatatctt tattttcatt acatctgtgt 5160
gttggttttt tgtgtgaatc gatagtacta acatacgctc tccatcaaaa caaaacgaaa 5220
caaaacaaac tagcaaaata ggctgtcccc agtgcaagtg caggtgccag aacatttctc 5280
tatcgata 5288
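SEQ ID NO: 19 is labeled "GUN1-8x target". As a minimal sketch, a listing in this format can be flattened into one string and scanned for a repeated unit. The helper names and the 11-nt unit `aatttgtcgat` are assumptions made for illustration: the unit was read off the first four lines of the listing above and is not identified as the GUN1 target anywhere in this section.

```python
import re

# First four lines of SEQ ID NO: 19, copied verbatim from the listing above.
LISTING_HEAD = """\
ggtaccgagc tcttacgcgt gctagcccgg gctcgagatc tgatatcaag taatttgtcg 60
attccatggt aatttgtcga ttcacgacta atttgtcgat ttgtttctaa tttgtcgatt 120
gaattctaat ttgtcgatta agttctaatt tgtcgatttt cgaataattt gtcgattatc 180
ttctaatttg tcgattcagt tcgcttacta gtgtcgaggt aggcgtgtac ggtgggaggc 240
"""


def flatten_listing(text: str) -> str:
    """Drop position counters and spaces, keeping only nucleotide letters."""
    return "".join(re.findall(r"[acgt]+", text.lower()))


def find_repeats(sequence: str, unit: str) -> list[int]:
    """Return the 0-based start offset of every occurrence of `unit`."""
    pattern = f"(?={re.escape(unit)})"  # lookahead, so overlaps are not skipped
    return [m.start() for m in re.finditer(pattern, sequence)]


# Hypothetical repeat unit, read off the listing; not named in the source.
ASSUMED_UNIT = "aatttgtcgat"

sequence = flatten_listing(LISTING_HEAD)
offsets = find_repeats(sequence, ASSUMED_UNIT)
print(len(sequence), len(offsets), offsets)
```

On these four lines the scan finds eight evenly spaced copies of the assumed unit, which is consistent with the "8x" in the sequence name. The same two helpers could be pointed at the full 5288-nt listing to check whether the unit occurs anywhere else.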
Claims (16)
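- A method of designing a fusion protein having a functional region and a binding region that is fused to the functional region and binds to a DNA base or a specific DNA base sequence,
wherein the binding region that binds to the DNA base or the specific DNA base sequence is
selected from the group consisting of the 9 PPR motifs of the p63 protein consisting of the amino acid sequence of SEQ ID NO: 1, the 11 PPR motifs of the GUN1 protein consisting of the amino acid sequence of SEQ ID NO: 2, the 15 PPR motifs of the pTac2 protein consisting of the amino acid sequence of SEQ ID NO: 3, the 10 PPR motifs of the DG1 protein consisting of the amino acid sequence of SEQ ID NO: 4, and the 11 PPR motifs of the GRP23 protein consisting of the amino acid sequence of SEQ ID NO: 5.
- (Deleted)
- (Deleted)
- The method according to claim 1,
wherein the functional region is a DNA-cleaving enzyme, a nuclease domain, or a transcriptional control domain, and the fusion protein functions as a target-sequence-specific DNA-cleaving enzyme or a transcriptional regulator.
- The method according to claim 4,
wherein the DNA-cleaving enzyme is the nuclease domain of FokI represented by SEQ ID NO: 6.
- A method of editing a genome in a non-human animal, characterized by using a fusion protein designed by the method according to claim 1, 4, or 5.
- A method, in a non-human animal, of editing a genome by the method according to claim 6 and producing the edited genome.
- A method, in a non-human animal, of editing a genome by the method according to claim 6 and producing a cell containing the edited genome.
- A method of using, in a non-human animal, a binding protein that binds to a DNA base or a specific DNA base sequence, wherein the binding protein that binds to the DNA base or the specific DNA base sequence is selected from the group consisting of the 9 PPR motifs of the p63 protein consisting of the amino acid sequence of SEQ ID NO: 1, the 11 PPR motifs of the GUN1 protein consisting of the amino acid sequence of SEQ ID NO: 2, the 15 PPR motifs of the pTac2 protein consisting of the amino acid sequence of SEQ ID NO: 3, the 10 PPR motifs of the DG1 protein consisting of the amino acid sequence of SEQ ID NO: 4, and the 11 PPR motifs of the GRP23 protein consisting of the amino acid sequence of SEQ ID NO: 5.
- (Deleted)
- A nucleic acid encoding a protein selected from the group consisting of the 9 PPR motifs of the p63 protein consisting of the amino acid sequence of SEQ ID NO: 1, the 11 PPR motifs of the GUN1 protein consisting of the amino acid sequence of SEQ ID NO: 2, the 15 PPR motifs of the pTac2 protein consisting of the amino acid sequence of SEQ ID NO: 3, the 10 PPR motifs of the DG1 protein consisting of the amino acid sequence of SEQ ID NO: 4, and the 11 PPR motifs of the GRP23 protein consisting of the amino acid sequence of SEQ ID NO: 5.
- (Deleted)
- (Deleted)
- A vector comprising the nucleic acid according to claim 11.
- A cell into which the vector according to claim 14 has been introduced.
- The cell according to claim 15,
which is Escherichia coli (E. coli).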
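The group recited in claims 1, 9, and 11 pairs each protein with a SEQ ID NO and a PPR motif count. The sketch below records those pairs as a lookup table. The class and function names are hypothetical, and the helper that equates motif count with the number of bases covered assumes one base per motif, which this section does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PprBindingRegion:
    protein: str     # protein name as recited in the claims
    seq_id_no: int   # SEQ ID NO of the amino acid sequence
    ppr_motifs: int  # number of PPR motifs recited in the claims


# Values copied from claims 1, 9, and 11.
CLAIMED_GROUP = (
    PprBindingRegion("p63", 1, 9),
    PprBindingRegion("GUN1", 2, 11),
    PprBindingRegion("pTac2", 3, 15),
    PprBindingRegion("DG1", 4, 10),
    PprBindingRegion("GRP23", 5, 11),
)


def lookup(protein: str) -> PprBindingRegion:
    """Return the claimed entry for `protein` (case-insensitive)."""
    for region in CLAIMED_GROUP:
        if region.protein.lower() == protein.lower():
            return region
    raise KeyError(f"{protein!r} is not in the claimed group")


def assumed_target_length(region: PprBindingRegion) -> int:
    """Bases covered, ASSUMING one base per PPR motif (not stated here)."""
    return region.ppr_motifs


print(lookup("gun1"), assumed_target_length(lookup("gun1")))
```

Keeping the group as immutable records makes it easy to check a candidate binding region against the claim language before any sequence-level work is done.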
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020227013315A KR102606057B1 (ko) | 2013-04-22 | 2014-04-22 | Method for identifying, recognizing, or targeting a DNA base or DNA having a specific base sequence using a PPR protein |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2013089840 | 2013-04-22 | ||
JPJP-P-2013-089840 | 2013-04-22 | ||
PCT/JP2014/061329 WO2014175284A1 (ja) | 2013-04-22 | 2014-04-22 | DNA-binding protein using PPR motif, and use thereof |
KR1020157033320A KR20160007541A (ko) | 2013-04-22 | 2014-04-22 | DNA-binding protein using PPR motif, and use thereof |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020157033320A Division KR20160007541A (ko) | 2013-04-22 | 2014-04-22 | DNA-binding protein using PPR motif, and use thereof |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020227013315A Division KR102606057B1 (ko) | 2013-04-22 | 2014-04-22 | Method for identifying, recognizing, or targeting a DNA base or DNA having a specific base sequence using a PPR protein |
Publications (2)
Publication Number | Publication Date |
---|---|
KR20210037757A (ko) | 2021-04-06 |
KR102390485B1 (ko) | 2022-04-22 |
Family
ID=51791855
Family Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020157033320A KR20160007541A (ko) | 2013-04-22 | 2014-04-22 | DNA-binding protein using PPR motif, and use thereof |
KR1020237039952A KR20230164213A (ko) | 2013-04-22 | 2014-04-22 | DNA-binding protein using PPR protein, and use thereof |
KR1020217016484A KR102326842B1 (ko) | 2013-04-22 | 2014-04-22 | Fusion protein comprising a DNA-binding protein having PPR motifs |
KR1020227013315A KR102606057B1 (ko) | 2013-04-22 | 2014-04-22 | Method for identifying, recognizing, or targeting a DNA base or DNA having a specific base sequence using a PPR protein |
KR1020217009426A KR102390485B1 (ko) | 2013-04-22 | 2014-04-22 | Method for designing a fusion protein comprising a DNA-binding protein having PPR motifs |
Family Applications Before (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020157033320A KR20160007541A (ko) | 2013-04-22 | 2014-04-22 | DNA-binding protein using PPR motif, and use thereof |
KR1020237039952A KR20230164213A (ko) | 2013-04-22 | 2014-04-22 | DNA-binding protein using PPR protein, and use thereof |
KR1020217016484A KR102326842B1 (ko) | 2013-04-22 | 2014-04-22 | Fusion protein comprising a DNA-binding protein having PPR motifs |
KR1020227013315A KR102606057B1 (ko) | 2013-04-22 | 2014-04-22 | Method for identifying, recognizing, or targeting a DNA base or DNA having a specific base sequence using a PPR protein |
Country Status (12)
Country | Link |
---|---|
US (3) | US10189879B2 (ko) |
EP (2) | EP3020722A4 (ko) |
JP (6) | JP5896547B2 (ko) |
KR (5) | KR20160007541A (ko) |
CN (7) | CN116589544A (ko) |
AU (5) | AU2014258386B2 (ko) |
BR (3) | BR112015026707B1 (ko) |
CA (1) | CA2910050A1 (ko) |
HK (2) | HK1221966A1 (ko) |
NZ (3) | NZ752706A (ko) |
SG (2) | SG11201508730TA (ko) |
WO (1) | WO2014175284A1 (ko) |
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116589544A (zh) * | 2013-04-22 | 2023-08-15 | Kyushu University, National University Corporation | DNA-binding protein using PPR motif, and use thereof |
KR102407776B1 (ko) * | 2016-06-03 | 2022-06-10 | Kyushu University, National University Corporation | Fusion protein for improving protein expression level from target mRNA |
JP6928386B2 (ja) * | 2016-06-03 | 2021-09-01 | Kyushu University, National University Corporation | Fusion protein for improving protein expression level from target mRNA |
US20190177378A1 (en) * | 2016-08-10 | 2019-06-13 | Fujifilm Wako Pure Chemical Corporation | DNA-binding protein using PPR motif, and use thereof |
CN106591367B (zh) * | 2017-01-19 | 2019-05-07 | Second Military Medical University of the Chinese People's Liberation Army | Method for obtaining and purifying a large amount of target lncRNA in vivo |
US20230287060A1 (en) * | 2018-06-06 | 2023-09-14 | The University Of Western Australia | Proteins and their use for nucleotide binding |
WO2020023893A1 (en) * | 2018-07-27 | 2020-01-30 | Seekin, Inc. | Reducing noise in sequencing data |
EP3845651A4 (en) * | 2018-08-27 | 2022-06-22 | Hiroshima University | NEW FIELD OF NUCLEASE AND USES THEREOF |
WO2020204159A1 (ja) | 2019-04-05 | 2020-10-08 | Osaka University, National University Corporation | Method for producing knock-in cells |
WO2020241876A1 (ja) | 2019-05-29 | 2020-12-03 | EditForce, Inc. | Efficient method for producing PPR protein, and use thereof |
AU2020283367A1 (en) | 2019-05-29 | 2022-01-06 | Editforce, Inc. | PPR protein causing less aggregation and use of the same |
CA3179365A1 (en) | 2020-03-31 | 2021-10-07 | Editforce, Inc. | Method for editing target rna |
JPWO2022097663A1 (ko) | 2020-11-06 | 2022-05-12 |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
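WO2011111829A1 (ja) | 2010-03-11 | 2011-09-15 | Kyushu University, National University Corporation | Method for modifying RNA-binding protein using PPR motif |
WO2013058404A1 (ja) | 2011-10-21 | 2013-04-25 | Kyushu University, National University Corporation | Method for designing RNA-binding protein using PPR motif, and use thereof |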
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2010327998B2 (en) | 2009-12-10 | 2015-11-12 | Iowa State University Research Foundation, Inc. | TAL effector-mediated DNA modification |
JP5485015B2 (ja) | 2010-05-18 | 2014-05-07 | Global Nuclear Fuel Japan Co., Ltd. | Fuel assembly |
CN116589544A (zh) * | 2013-04-22 | 2023-08-15 | Kyushu University, National University Corporation | DNA-binding protein using PPR motif, and use thereof |
US20190177378A1 (en) * | 2016-08-10 | 2019-06-13 | Fujifilm Wako Pure Chemical Corporation | DNA-binding protein using PPR motif, and use thereof |
-
2014
- 2014-04-22 CN CN202310066077.8A patent/CN116589544A/zh active Pending
- 2014-04-22 BR BR112015026707-6A patent/BR112015026707B1/pt active IP Right Grant
- 2014-04-22 CA CA2910050A patent/CA2910050A1/en active Pending
- 2014-04-22 NZ NZ752706A patent/NZ752706A/en unknown
- 2014-04-22 US US14/785,952 patent/US10189879B2/en active Active
- 2014-04-22 CN CN201480035686.6A patent/CN105392796A/zh active Pending
- 2014-04-22 BR BR122020018288-1A patent/BR122020018288B1/pt active IP Right Grant
- 2014-04-22 KR KR1020157033320A patent/KR20160007541A/ko not_active IP Right Cessation
- 2014-04-22 CN CN202310067824.XA patent/CN116444634A/zh active Pending
- 2014-04-22 NZ NZ714264A patent/NZ714264A/en unknown
- 2014-04-22 KR KR1020237039952A patent/KR20230164213A/ko active Application Filing
- 2014-04-22 EP EP14787853.2A patent/EP3020722A4/en not_active Withdrawn
- 2014-04-22 BR BR122020018292-0A patent/BR122020018292B1/pt active IP Right Grant
- 2014-04-22 NZ NZ752705A patent/NZ752705A/en unknown
- 2014-04-22 KR KR1020217016484A patent/KR102326842B1/ko active IP Right Grant
- 2014-04-22 AU AU2014258386A patent/AU2014258386B2/en active Active
- 2014-04-22 SG SG11201508730TA patent/SG11201508730TA/en unknown
- 2014-04-22 CN CN202310070165.5A patent/CN118530318A/zh active Pending
- 2014-04-22 EP EP20151941.0A patent/EP3696186B1/en active Active
- 2014-04-22 WO PCT/JP2014/061329 patent/WO2014175284A1/ja active Application Filing
- 2014-04-22 SG SG10201802430VA patent/SG10201802430VA/en unknown
- 2014-04-22 CN CN202310070449.4A patent/CN116731198A/zh active Pending
- 2014-04-22 KR KR1020227013315A patent/KR102606057B1/ko active IP Right Grant
- 2014-04-22 CN CN202310064591.8A patent/CN116731137A/zh active Pending
- 2014-04-22 JP JP2015500687A patent/JP5896547B2/ja active Active
- 2014-04-22 CN CN202310066485.3A patent/CN116425883A/zh active Pending
- 2014-04-22 KR KR1020217009426A patent/KR102390485B1/ko active IP Right Grant
-
2015
- 2015-09-28 JP JP2015190124A patent/JP6487306B2/ja active Active
-
2016
- 2016-08-29 HK HK16110233.8A patent/HK1221966A1/zh unknown
- 2016-10-26 HK HK16112290.4A patent/HK1223956A1/zh unknown
-
2018
- 2018-12-03 AU AU2018271425A patent/AU2018271425B2/en active Active
- 2018-12-11 US US16/216,617 patent/US20190169240A1/en not_active Abandoned
-
2019
- 2019-02-21 JP JP2019029017A patent/JP6806822B2/ja active Active
-
2020
- 2020-11-06 AU AU2020264412A patent/AU2020264412B2/en active Active
- 2020-12-04 JP JP2020201866A patent/JP7057983B2/ja active Active
-
2021
- 2021-07-01 US US17/365,090 patent/US20210324019A1/en active Pending
-
2022
- 2022-03-29 JP JP2022053594A patent/JP7290233B2/ja active Active
- 2022-04-27 AU AU2022202757A patent/AU2022202757A1/en not_active Abandoned
-
2023
- 2023-05-19 JP JP2023083041A patent/JP2023096153A/ja active Pending
-
2024
- 2024-04-04 AU AU2024202150A patent/AU2024202150A1/en active Pending
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
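WO2011111829A1 (ja) | 2010-03-11 | 2011-09-15 | Kyushu University, National University Corporation | Method for modifying RNA-binding protein using PPR motif |
WO2013058404A1 (ja) | 2011-10-21 | 2013-04-25 | Kyushu University, National University Corporation | Method for designing RNA-binding protein using PPR motif, and use thereof |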
Non-Patent Citations (2)
Title |
---|
Pfalz J. et al, The Plant Cell, 18:pp.176-197 (2006) |
Uyttewaal et al, PPR336 is associated with polysomes in plant mitochondria, Jour. of Molecul. Biology, 375(3):pp.626-636 (2007.11.13.) |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
AU2020264412B2 (en) | DNA-binding protein using PPR motif, and use thereof | |
KR101666228B1 (ko) | Therapeutic gene-switch constructs and bioreactors for the expression of biotherapeutic molecules, and uses thereof | |
RU2763170C2 (ru) | Production of human milk oligosaccharides in microbial hosts with modified import/export | |
US20030119104A1 (en) | Chromosome-based platforms | |
CN109689856A (zh) | CRISPR-Cas system for algal host cells | |
KR20200022486A (ko) | Engineered and fully functional customized glycoproteins | |
KR102584628B1 (ko) | Engineered multi-component system for identification and characterization of T-cell receptors, T-cell antigens, and their functional interaction | |
KR20100049084A (ko) | Methods and compositions for diagnosing disease | |
KR20210105382A (ko) | RNA encoding a protein | |
CN111094569A (zh) | Light-controlled viral protein, gene thereof, and viral vector containing the gene | |
JP2023025182A (ja) | Engineered multi-component system for identification and characterization of T-cell receptors and T-cell antigens | |
DK2935601T3 (en) | RECOMBINANT MICROBELL CELLS PRODUCING AT LEAST 28% EICOSAPENTAIC ACID AS DRY WEIGHT | |
KR20240022571A (ko) | Systems, methods, and components for RNA-guided effector recruitment | |
KR20230159994A (ko) | Recombinant vector comprising a hybrid signal sequence, and method for secretory production of human insulin-like growth factor-1 using the same | |
RU2798786C2 (ru) | Production of human milk oligosaccharides in microbial producers with engineered import/export | |
CA2522166C (en) | Lambda integrase mutein for use in recombination | |
CN115003815A (zh) | Wheat transgenic event ind-øø412-7 | |
KR20240024172A (ko) | Compositions and methods for regulating expression of a gene | |
KR20240023100A (ko) | Compositions and methods for regulating gene expression | |
Lai | Analysis of Drosophila melanogaster snRNA activating protein complex binding to the U1 gene promoter | |
KR20240134049A (ko) | Production of human milk oligosaccharides in microbial hosts with engineered import/export | |
ITMI20131142A1 (it) | Recombinant bacterial strains for the production of natural nucleosides and modified analogues. |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
A107 | Divisional application of patent | ||
E902 | Notification of reason for refusal | ||
E701 | Decision to grant or registration of patent right | ||
GRNT | Written decision to grant |