CN113166738A - 经改造的Cas9蛋白及其用途 - Google Patents
经改造的Cas9蛋白及其用途 Download PDFInfo
- Publication number
- CN113166738A CN113166738A CN201980069492.0A CN201980069492A CN113166738A CN 113166738 A CN113166738 A CN 113166738A CN 201980069492 A CN201980069492 A CN 201980069492A CN 113166738 A CN113166738 A CN 113166738A
- Authority
- CN
- China
- Prior art keywords
- lys
- leu
- glu
- ile
- asn
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 108091033409 CRISPR Proteins 0.000 title description 14
- 108090000623 proteins and genes Proteins 0.000 claims abstract description 156
- 102000004169 proteins and genes Human genes 0.000 claims abstract description 124
- 108020005004 Guide RNA Proteins 0.000 claims abstract description 59
- 150000001413 amino acids Chemical class 0.000 claims abstract description 35
- 238000012217 deletion Methods 0.000 claims abstract description 32
- 230000037430 deletion Effects 0.000 claims abstract description 32
- 125000000539 amino acid group Chemical group 0.000 claims abstract description 6
- 235000018102 proteins Nutrition 0.000 claims description 121
- 210000004027 cell Anatomy 0.000 claims description 65
- 108091033319 polynucleotide Proteins 0.000 claims description 62
- 102000040430 polynucleotide Human genes 0.000 claims description 62
- 239000002157 polynucleotide Substances 0.000 claims description 62
- 238000000034 method Methods 0.000 claims description 54
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 claims description 42
- 235000001014 amino acid Nutrition 0.000 claims description 33
- 239000002773 nucleotide Substances 0.000 claims description 30
- 125000003729 nucleotide group Chemical group 0.000 claims description 30
- 230000014509 gene expression Effects 0.000 claims description 29
- 238000011144 upstream manufacturing Methods 0.000 claims description 27
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Natural products OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 claims description 23
- 230000000295 complement effect Effects 0.000 claims description 20
- 230000035772 mutation Effects 0.000 claims description 20
- 230000027455 binding Effects 0.000 claims description 13
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 claims description 8
- 102000039446 nucleic acids Human genes 0.000 claims description 7
- 108020004707 nucleic acids Proteins 0.000 claims description 7
- 150000007523 nucleic acids Chemical class 0.000 claims description 7
- 239000004472 Lysine Substances 0.000 claims description 6
- 230000002103 transcriptional effect Effects 0.000 claims description 6
- 210000004102 animal cell Anatomy 0.000 claims description 5
- 235000013922 glutamic acid Nutrition 0.000 claims description 5
- 239000004220 glutamic acid Substances 0.000 claims description 5
- 239000004471 Glycine Substances 0.000 claims description 4
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 claims description 4
- 230000001965 increasing effect Effects 0.000 claims description 4
- 108091006104 gene-regulatory proteins Proteins 0.000 claims description 3
- 102000034356 gene-regulatory proteins Human genes 0.000 claims description 3
- 210000003527 eukaryotic cell Anatomy 0.000 claims description 2
- 230000003584 silencer Effects 0.000 claims description 2
- 108091006106 transcriptional activators Proteins 0.000 claims description 2
- 108091006107 transcriptional repressors Proteins 0.000 claims description 2
- 210000005253 yeast cell Anatomy 0.000 claims description 2
- 102100025169 Max-binding protein MNT Human genes 0.000 claims 1
- 239000013598 vector Substances 0.000 abstract description 25
- 125000003275 alpha amino acid group Chemical group 0.000 abstract description 22
- 230000004568 DNA-binding Effects 0.000 abstract description 8
- 230000002829 reductive effect Effects 0.000 abstract description 4
- 108010054155 lysyllysine Proteins 0.000 description 30
- 229940024606 amino acid Drugs 0.000 description 25
- 108020004414 DNA Proteins 0.000 description 23
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 20
- UZGFHWIJWPUPOH-IHRRRGAJSA-N Arg-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UZGFHWIJWPUPOH-IHRRRGAJSA-N 0.000 description 18
- 241000880493 Leptailurus serval Species 0.000 description 18
- 108010068380 arginylarginine Proteins 0.000 description 18
- 108010068265 aspartyltyrosine Proteins 0.000 description 18
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 14
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 14
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 14
- 108010050848 glycylleucine Proteins 0.000 description 14
- 241000191967 Staphylococcus aureus Species 0.000 description 13
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 12
- SNLOOPZHAQDMJG-CIUDSAMLSA-N Gln-Glu-Glu Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SNLOOPZHAQDMJG-CIUDSAMLSA-N 0.000 description 12
- IESFZVCAVACGPH-PEFMBERDSA-N Glu-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCC(O)=O IESFZVCAVACGPH-PEFMBERDSA-N 0.000 description 12
- AUTNXSQEVVHSJK-YVNDNENWSA-N Glu-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O AUTNXSQEVVHSJK-YVNDNENWSA-N 0.000 description 12
- GVNNAHIRSDRIII-AJNGGQMLSA-N Ile-Lys-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)O)N GVNNAHIRSDRIII-AJNGGQMLSA-N 0.000 description 12
- UCOCBWDBHCUPQP-DCAQKATOSA-N Leu-Arg-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O UCOCBWDBHCUPQP-DCAQKATOSA-N 0.000 description 12
- HVAUKHLDSDDROB-KKUMJFAQSA-N Lys-Lys-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O HVAUKHLDSDDROB-KKUMJFAQSA-N 0.000 description 12
- SKHPKKYKDYULDH-HJGDQZAQSA-N Thr-Asn-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O SKHPKKYKDYULDH-HJGDQZAQSA-N 0.000 description 12
- BYOHPUZJVXWHAE-BYULHYEWSA-N Val-Asn-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N BYOHPUZJVXWHAE-BYULHYEWSA-N 0.000 description 12
- ZEVNVXYRZRIRCH-GVXVVHGQSA-N Val-Gln-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N ZEVNVXYRZRIRCH-GVXVVHGQSA-N 0.000 description 12
- 108010034529 leucyl-lysine Proteins 0.000 description 12
- 108010076718 lysyl-glutamyl-tryptophan Proteins 0.000 description 12
- 108010061238 threonyl-glycine Proteins 0.000 description 12
- 108010073969 valyllysine Proteins 0.000 description 12
- 239000013604 expression vector Substances 0.000 description 11
- 238000013518 transcription Methods 0.000 description 11
- 230000035897 transcription Effects 0.000 description 11
- UJGRZQYSNYTCAX-SRVKXCTJSA-N Asp-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O UJGRZQYSNYTCAX-SRVKXCTJSA-N 0.000 description 10
- RGAOLBZBLOJUTP-GRLWGSQLSA-N Gln-Ile-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)NC(=O)[C@H](CCC(=O)N)N RGAOLBZBLOJUTP-GRLWGSQLSA-N 0.000 description 10
- KZNQNBZMBZJQJO-UHFFFAOYSA-N N-glycyl-L-proline Natural products NCC(=O)N1CCCC1C(O)=O KZNQNBZMBZJQJO-UHFFFAOYSA-N 0.000 description 10
- 206010028980 Neoplasm Diseases 0.000 description 10
- AOILQMZPNLUXCM-AVGNSLFASA-N Val-Val-Lys Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN AOILQMZPNLUXCM-AVGNSLFASA-N 0.000 description 10
- 108010077515 glycylproline Proteins 0.000 description 10
- 108010083708 leucyl-aspartyl-valine Proteins 0.000 description 10
- 108010003700 lysyl aspartic acid Proteins 0.000 description 10
- 108010017391 lysylvaline Proteins 0.000 description 10
- 101150053046 MYD88 gene Proteins 0.000 description 9
- 201000011510 cancer Diseases 0.000 description 9
- 238000002741 site-directed mutagenesis Methods 0.000 description 9
- 210000000130 stem cell Anatomy 0.000 description 9
- VCSABYLVNWQYQE-UHFFFAOYSA-N Ala-Lys-Lys Natural products NCCCCC(NC(=O)C(N)C)C(=O)NC(CCCCN)C(O)=O VCSABYLVNWQYQE-UHFFFAOYSA-N 0.000 description 8
- MMLHRUJLOUSRJX-CIUDSAMLSA-N Ala-Ser-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN MMLHRUJLOUSRJX-CIUDSAMLSA-N 0.000 description 8
- UGXYFDQFLVCDFC-CIUDSAMLSA-N Asn-Ser-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(N)=O UGXYFDQFLVCDFC-CIUDSAMLSA-N 0.000 description 8
- SWRVAQHFBRZVNX-GUBZILKMSA-N Glu-Lys-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O SWRVAQHFBRZVNX-GUBZILKMSA-N 0.000 description 8
- UAVQIQOOBXFKRC-BYULHYEWSA-N Ile-Asn-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O UAVQIQOOBXFKRC-BYULHYEWSA-N 0.000 description 8
- VUBIPAHVHMZHCM-KKUMJFAQSA-N Leu-Tyr-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CC=C(O)C=C1 VUBIPAHVHMZHCM-KKUMJFAQSA-N 0.000 description 8
- RBEATVHTWHTHTJ-KKUMJFAQSA-N Lys-Leu-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O RBEATVHTWHTHTJ-KKUMJFAQSA-N 0.000 description 8
- JQSIGLHQNSZZRL-KKUMJFAQSA-N Lys-Lys-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N JQSIGLHQNSZZRL-KKUMJFAQSA-N 0.000 description 8
- FPQMQEOVSKMVMA-ACRUOGEOSA-N Lys-Tyr-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)NC(=O)[C@H](CCCCN)N)O FPQMQEOVSKMVMA-ACRUOGEOSA-N 0.000 description 8
- JAKHAONCJJZVHT-DCAQKATOSA-N Val-Lys-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)O)N JAKHAONCJJZVHT-DCAQKATOSA-N 0.000 description 8
- 108010005233 alanylglutamic acid Proteins 0.000 description 8
- 108010013835 arginine glutamate Proteins 0.000 description 8
- 108010028295 histidylhistidine Proteins 0.000 description 8
- 108010053037 kyotorphin Proteins 0.000 description 8
- 238000006467 substitution reaction Methods 0.000 description 8
- 241000282326 Felis catus Species 0.000 description 7
- 108091028113 Trans-activating crRNA Proteins 0.000 description 7
- LWUWMHIOBPTZBA-DCAQKATOSA-N Ala-Arg-Lys Chemical compound NC(=N)NCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CCCCN)C(O)=O LWUWMHIOBPTZBA-DCAQKATOSA-N 0.000 description 6
- ZIWWTZWAKYBUOB-CIUDSAMLSA-N Ala-Asp-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O ZIWWTZWAKYBUOB-CIUDSAMLSA-N 0.000 description 6
- MKZCBYZBCINNJN-DLOVCJGASA-N Ala-Asp-Phe Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 MKZCBYZBCINNJN-DLOVCJGASA-N 0.000 description 6
- AWAXZRDKUHOPBO-GUBZILKMSA-N Ala-Gln-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(O)=O AWAXZRDKUHOPBO-GUBZILKMSA-N 0.000 description 6
- HXNNRBHASOSVPG-GUBZILKMSA-N Ala-Glu-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O HXNNRBHASOSVPG-GUBZILKMSA-N 0.000 description 6
- DVJSJDDYCYSMFR-ZKWXMUAHSA-N Ala-Ile-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O DVJSJDDYCYSMFR-ZKWXMUAHSA-N 0.000 description 6
- WUHJHHGYVVJMQE-BJDJZHNGSA-N Ala-Leu-Ile Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WUHJHHGYVVJMQE-BJDJZHNGSA-N 0.000 description 6
- LDLSENBXQNDTPB-DCAQKATOSA-N Ala-Lys-Arg Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N LDLSENBXQNDTPB-DCAQKATOSA-N 0.000 description 6
- PIXQDIGKDNNOOV-GUBZILKMSA-N Ala-Lys-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O PIXQDIGKDNNOOV-GUBZILKMSA-N 0.000 description 6
- SUHLZMHFRALVSY-YUMQZZPRSA-N Ala-Lys-Gly Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)NCC(O)=O SUHLZMHFRALVSY-YUMQZZPRSA-N 0.000 description 6
- AOAKQKVICDWCLB-UWJYBYFXSA-N Ala-Tyr-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N AOAKQKVICDWCLB-UWJYBYFXSA-N 0.000 description 6
- VYMJAWXRWHJIMS-LKTVYLICSA-N Ala-Tyr-His Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N VYMJAWXRWHJIMS-LKTVYLICSA-N 0.000 description 6
- XPSGESXVBSQZPL-SRVKXCTJSA-N Arg-Arg-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O XPSGESXVBSQZPL-SRVKXCTJSA-N 0.000 description 6
- HJVGMOYJDDXLMI-AVGNSLFASA-N Arg-Arg-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@@H](N)CCCNC(N)=N HJVGMOYJDDXLMI-AVGNSLFASA-N 0.000 description 6
- DPXDVGDLWJYZBH-GUBZILKMSA-N Arg-Asn-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O DPXDVGDLWJYZBH-GUBZILKMSA-N 0.000 description 6
- BVBKBQRPOJFCQM-DCAQKATOSA-N Arg-Asn-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O BVBKBQRPOJFCQM-DCAQKATOSA-N 0.000 description 6
- OZNSCVPYWZRQPY-CIUDSAMLSA-N Arg-Asp-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O OZNSCVPYWZRQPY-CIUDSAMLSA-N 0.000 description 6
- VXXHDZKEQNGXNU-QXEWZRGKSA-N Arg-Asp-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N VXXHDZKEQNGXNU-QXEWZRGKSA-N 0.000 description 6
- NKNILFJYKKHBKE-WPRPVWTQSA-N Arg-Gly-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O NKNILFJYKKHBKE-WPRPVWTQSA-N 0.000 description 6
- YBIAYFFIVAZXPK-AVGNSLFASA-N Arg-His-Arg Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O YBIAYFFIVAZXPK-AVGNSLFASA-N 0.000 description 6
- KZXPVYVSHUJCEO-ULQDDVLXSA-N Arg-Phe-Lys Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(O)=O)CC1=CC=CC=C1 KZXPVYVSHUJCEO-ULQDDVLXSA-N 0.000 description 6
- AOHKLEBWKMKITA-IHRRRGAJSA-N Arg-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N AOHKLEBWKMKITA-IHRRRGAJSA-N 0.000 description 6
- JPAWCMXVNZPJLO-IHRRRGAJSA-N Arg-Ser-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JPAWCMXVNZPJLO-IHRRRGAJSA-N 0.000 description 6
- OQPAZKMGCWPERI-GUBZILKMSA-N Arg-Ser-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O OQPAZKMGCWPERI-GUBZILKMSA-N 0.000 description 6
- ZJBUILVYSXQNSW-YTWAJWBKSA-N Arg-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)O ZJBUILVYSXQNSW-YTWAJWBKSA-N 0.000 description 6
- OGZBJJLRKQZRHL-KJEVXHAQSA-N Arg-Thr-Tyr Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 OGZBJJLRKQZRHL-KJEVXHAQSA-N 0.000 description 6
- NVPHRWNWTKYIST-BPNCWPANSA-N Arg-Tyr-Ala Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CC=C(O)C=C1 NVPHRWNWTKYIST-BPNCWPANSA-N 0.000 description 6
- QLSRIZIDQXDQHK-RCWTZXSCSA-N Arg-Val-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QLSRIZIDQXDQHK-RCWTZXSCSA-N 0.000 description 6
- XYOVHPDDWCEUDY-CIUDSAMLSA-N Asn-Ala-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O XYOVHPDDWCEUDY-CIUDSAMLSA-N 0.000 description 6
- SLKLLQWZQHXYSV-CIUDSAMLSA-N Asn-Ala-Lys Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O SLKLLQWZQHXYSV-CIUDSAMLSA-N 0.000 description 6
- DAPLJWATMAXPPZ-CIUDSAMLSA-N Asn-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC(N)=O DAPLJWATMAXPPZ-CIUDSAMLSA-N 0.000 description 6
- APHUDFFMXFYRKP-CIUDSAMLSA-N Asn-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)N)N APHUDFFMXFYRKP-CIUDSAMLSA-N 0.000 description 6
- NLCDVZJDEXIDDL-BIIVOSGPSA-N Asn-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)N)N)C(=O)O NLCDVZJDEXIDDL-BIIVOSGPSA-N 0.000 description 6
- UGXVKHRDGLYFKR-CIUDSAMLSA-N Asn-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(N)=O UGXVKHRDGLYFKR-CIUDSAMLSA-N 0.000 description 6
- QISZHYWZHJRDAO-CIUDSAMLSA-N Asn-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)N)N QISZHYWZHJRDAO-CIUDSAMLSA-N 0.000 description 6
- UPALZCBCKAMGIY-PEFMBERDSA-N Asn-Gln-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O UPALZCBCKAMGIY-PEFMBERDSA-N 0.000 description 6
- GNKVBRYFXYWXAB-WDSKDSINSA-N Asn-Glu-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O GNKVBRYFXYWXAB-WDSKDSINSA-N 0.000 description 6
- DMLSCRJBWUEALP-LAEOZQHASA-N Asn-Glu-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O DMLSCRJBWUEALP-LAEOZQHASA-N 0.000 description 6
- BZWRLDPIWKOVKB-ZPFDUUQYSA-N Asn-Leu-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BZWRLDPIWKOVKB-ZPFDUUQYSA-N 0.000 description 6
- YVXRYLVELQYAEQ-SRVKXCTJSA-N Asn-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N YVXRYLVELQYAEQ-SRVKXCTJSA-N 0.000 description 6
- DJIMLSXHXKWADV-CIUDSAMLSA-N Asn-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(N)=O DJIMLSXHXKWADV-CIUDSAMLSA-N 0.000 description 6
- NYGILGUOUOXGMJ-YUMQZZPRSA-N Asn-Lys-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O NYGILGUOUOXGMJ-YUMQZZPRSA-N 0.000 description 6
- SUIJFTJDTJKSRK-IHRRRGAJSA-N Asn-Pro-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 SUIJFTJDTJKSRK-IHRRRGAJSA-N 0.000 description 6
- GZXOUBTUAUAVHD-ACZMJKKPSA-N Asn-Ser-Glu Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O GZXOUBTUAUAVHD-ACZMJKKPSA-N 0.000 description 6
- VLDRQOHCMKCXLY-SRVKXCTJSA-N Asn-Ser-Phe Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O VLDRQOHCMKCXLY-SRVKXCTJSA-N 0.000 description 6
- BEHQTVDBCLSCBY-CFMVVWHZSA-N Asn-Tyr-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BEHQTVDBCLSCBY-CFMVVWHZSA-N 0.000 description 6
- SVFOIXMRMLROHO-SRVKXCTJSA-N Asp-Asp-Phe Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 SVFOIXMRMLROHO-SRVKXCTJSA-N 0.000 description 6
- PMEHKVHZQKJACS-PEFMBERDSA-N Asp-Gln-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O PMEHKVHZQKJACS-PEFMBERDSA-N 0.000 description 6
- VIRHEUMYXXLCBF-WDSKDSINSA-N Asp-Gly-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O VIRHEUMYXXLCBF-WDSKDSINSA-N 0.000 description 6
- CYCKJEFVFNRWEZ-UGYAYLCHSA-N Asp-Ile-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O CYCKJEFVFNRWEZ-UGYAYLCHSA-N 0.000 description 6
- YFSLJHLQOALGSY-ZPFDUUQYSA-N Asp-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N YFSLJHLQOALGSY-ZPFDUUQYSA-N 0.000 description 6
- UMHUHHJMEXNSIV-CIUDSAMLSA-N Asp-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O UMHUHHJMEXNSIV-CIUDSAMLSA-N 0.000 description 6
- VNXQRBXEQXLERQ-CIUDSAMLSA-N Asp-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(=O)O)N VNXQRBXEQXLERQ-CIUDSAMLSA-N 0.000 description 6
- JDDYEZGPYBBPBN-JRQIVUDYSA-N Asp-Thr-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JDDYEZGPYBBPBN-JRQIVUDYSA-N 0.000 description 6
- DRXOWZZHCSBUOI-YJRXYDGGSA-N Cys-Thr-Tyr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CS)N)O DRXOWZZHCSBUOI-YJRXYDGGSA-N 0.000 description 6
- LJEPDHWNQXPXMM-NHCYSSNCSA-N Gln-Arg-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(O)=O LJEPDHWNQXPXMM-NHCYSSNCSA-N 0.000 description 6
- HDUDGCZEOZEFOA-KBIXCLLPSA-N Gln-Ile-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](CCC(=O)N)N HDUDGCZEOZEFOA-KBIXCLLPSA-N 0.000 description 6
- QKCZZAZNMMVICF-DCAQKATOSA-N Gln-Leu-Glu Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O QKCZZAZNMMVICF-DCAQKATOSA-N 0.000 description 6
- HPCOBEHVEHWREJ-DCAQKATOSA-N Gln-Lys-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O HPCOBEHVEHWREJ-DCAQKATOSA-N 0.000 description 6
- JILRMFFFCHUUTJ-ACZMJKKPSA-N Gln-Ser-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O JILRMFFFCHUUTJ-ACZMJKKPSA-N 0.000 description 6
- UXXIVIQGOODKQC-NUMRIWBASA-N Gln-Thr-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O UXXIVIQGOODKQC-NUMRIWBASA-N 0.000 description 6
- FHPXTPQBODWBIY-CIUDSAMLSA-N Glu-Ala-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FHPXTPQBODWBIY-CIUDSAMLSA-N 0.000 description 6
- SZXSSXUNOALWCH-ACZMJKKPSA-N Glu-Ala-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O SZXSSXUNOALWCH-ACZMJKKPSA-N 0.000 description 6
- VTTSANCGJWLPNC-ZPFDUUQYSA-N Glu-Arg-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VTTSANCGJWLPNC-ZPFDUUQYSA-N 0.000 description 6
- AKJRHDMTEJXTPV-ACZMJKKPSA-N Glu-Asn-Ala Chemical compound C[C@H](NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CCC(O)=O)C(O)=O AKJRHDMTEJXTPV-ACZMJKKPSA-N 0.000 description 6
- PCBBLFVHTYNQGG-LAEOZQHASA-N Glu-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N PCBBLFVHTYNQGG-LAEOZQHASA-N 0.000 description 6
- UMIRPYLZFKOEOH-YVNDNENWSA-N Glu-Gln-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O UMIRPYLZFKOEOH-YVNDNENWSA-N 0.000 description 6
- ILGFBUGLBSAQQB-GUBZILKMSA-N Glu-Glu-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O ILGFBUGLBSAQQB-GUBZILKMSA-N 0.000 description 6
- NKLRYVLERDYDBI-FXQIFTODSA-N Glu-Glu-Asp Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O NKLRYVLERDYDBI-FXQIFTODSA-N 0.000 description 6
- YLJHCWNDBKKOEB-IHRRRGAJSA-N Glu-Glu-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O YLJHCWNDBKKOEB-IHRRRGAJSA-N 0.000 description 6
- PHONAZGUEGIOEM-GLLZPBPUSA-N Glu-Glu-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PHONAZGUEGIOEM-GLLZPBPUSA-N 0.000 description 6
- ZCOJVESMNGBGLF-GRLWGSQLSA-N Glu-Ile-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ZCOJVESMNGBGLF-GRLWGSQLSA-N 0.000 description 6
- HVYWQYLBVXMXSV-GUBZILKMSA-N Glu-Leu-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O HVYWQYLBVXMXSV-GUBZILKMSA-N 0.000 description 6
- MWMJCGBSIORNCD-AVGNSLFASA-N Glu-Leu-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O MWMJCGBSIORNCD-AVGNSLFASA-N 0.000 description 6
- YKBUCXNNBYZYAY-MNXVOIDGSA-N Glu-Lys-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O YKBUCXNNBYZYAY-MNXVOIDGSA-N 0.000 description 6
- ILWHFUZZCFYSKT-AVGNSLFASA-N Glu-Lys-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O ILWHFUZZCFYSKT-AVGNSLFASA-N 0.000 description 6
- MFNUFCFRAZPJFW-JYJNAYRXSA-N Glu-Lys-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 MFNUFCFRAZPJFW-JYJNAYRXSA-N 0.000 description 6
- XNOWYPDMSLSRKP-GUBZILKMSA-N Glu-Met-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(N)=O)C(O)=O XNOWYPDMSLSRKP-GUBZILKMSA-N 0.000 description 6
- CBEUFCJRFNZMCU-SRVKXCTJSA-N Glu-Met-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O CBEUFCJRFNZMCU-SRVKXCTJSA-N 0.000 description 6
- CQGBSALYGOXQPE-HTUGSXCWSA-N Glu-Thr-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O CQGBSALYGOXQPE-HTUGSXCWSA-N 0.000 description 6
- YOTHMZZSJKKEHZ-SZMVWBNQSA-N Glu-Trp-Lys Chemical compound C1=CC=C2C(C[C@@H](C(=O)N[C@@H](CCCCN)C(O)=O)NC(=O)[C@@H](N)CCC(O)=O)=CNC2=C1 YOTHMZZSJKKEHZ-SZMVWBNQSA-N 0.000 description 6
- CGWHAXBNGYQBBK-JBACZVJFSA-N Glu-Trp-Tyr Chemical compound C([C@H](NC(=O)[C@H](CC=1C2=CC=CC=C2NC=1)NC(=O)[C@H](CCC(O)=O)N)C(O)=O)C1=CC=C(O)C=C1 CGWHAXBNGYQBBK-JBACZVJFSA-N 0.000 description 6
- LSYFGBRDBIQYAQ-FHWLQOOXSA-N Glu-Tyr-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O LSYFGBRDBIQYAQ-FHWLQOOXSA-N 0.000 description 6
- UZWUBBRJWFTHTD-LAEOZQHASA-N Glu-Val-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCC(O)=O UZWUBBRJWFTHTD-LAEOZQHASA-N 0.000 description 6
- YPHPEHMXOYTEQG-LAEOZQHASA-N Glu-Val-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCC(O)=O YPHPEHMXOYTEQG-LAEOZQHASA-N 0.000 description 6
- PUUYVMYCMIWHFE-BQBZGAKWSA-N Gly-Ala-Arg Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N PUUYVMYCMIWHFE-BQBZGAKWSA-N 0.000 description 6
- NZAFOTBEULLEQB-WDSKDSINSA-N Gly-Asn-Glu Chemical compound C(CC(=O)O)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)CN NZAFOTBEULLEQB-WDSKDSINSA-N 0.000 description 6
- VLIJYPMATZSOLL-YUMQZZPRSA-N Gly-Lys-Cys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)CN VLIJYPMATZSOLL-YUMQZZPRSA-N 0.000 description 6
- GMTXWRIDLGTVFC-IUCAKERBSA-N Gly-Lys-Glu Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O GMTXWRIDLGTVFC-IUCAKERBSA-N 0.000 description 6
- WNZOCXUOGVYYBJ-CDMKHQONSA-N Gly-Phe-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)CN)O WNZOCXUOGVYYBJ-CDMKHQONSA-N 0.000 description 6
- ABPRMMYHROQBLY-NKWVEPMBSA-N Gly-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)CN)C(=O)O ABPRMMYHROQBLY-NKWVEPMBSA-N 0.000 description 6
- RHRLHXQWHCNJKR-PMVVWTBXSA-N Gly-Thr-His Chemical compound NCC(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 RHRLHXQWHCNJKR-PMVVWTBXSA-N 0.000 description 6
- HQSKKSLNLSTONK-JTQLQIEISA-N Gly-Tyr-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 HQSKKSLNLSTONK-JTQLQIEISA-N 0.000 description 6
- GBYYQVBXFVDJPJ-WLTAIBSBSA-N Gly-Tyr-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)CN)O GBYYQVBXFVDJPJ-WLTAIBSBSA-N 0.000 description 6
- RVKIPWVMZANZLI-UHFFFAOYSA-N H-Lys-Trp-OH Natural products C1=CC=C2C(CC(NC(=O)C(N)CCCCN)C(O)=O)=CNC2=C1 RVKIPWVMZANZLI-UHFFFAOYSA-N 0.000 description 6
- AFPFGFUGETYOSY-HGNGGELXSA-N His-Ala-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O AFPFGFUGETYOSY-HGNGGELXSA-N 0.000 description 6
- MDBYBTWRMOAJAY-NHCYSSNCSA-N His-Asn-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC1=CN=CN1)N MDBYBTWRMOAJAY-NHCYSSNCSA-N 0.000 description 6
- ZJSMFRTVYSLKQU-DJFWLOJKSA-N His-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC1=CN=CN1)N ZJSMFRTVYSLKQU-DJFWLOJKSA-N 0.000 description 6
- MLZVJIREOKTDAR-SIGLWIIPSA-N His-Ile-Ile Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MLZVJIREOKTDAR-SIGLWIIPSA-N 0.000 description 6
- LQSBBHNVAVNZSX-GHCJXIJMSA-N Ile-Ala-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CC(=O)N)C(=O)O)N LQSBBHNVAVNZSX-GHCJXIJMSA-N 0.000 description 6
- QTUSJASXLGLJSR-OSUNSFLBSA-N Ile-Arg-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H]([C@@H](C)O)C(=O)O)N QTUSJASXLGLJSR-OSUNSFLBSA-N 0.000 description 6
- QADCTXFNLZBZAB-GHCJXIJMSA-N Ile-Asn-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](C)C(=O)O)N QADCTXFNLZBZAB-GHCJXIJMSA-N 0.000 description 6
- RPZFUIQVAPZLRH-GHCJXIJMSA-N Ile-Asp-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](C)C(=O)O)N RPZFUIQVAPZLRH-GHCJXIJMSA-N 0.000 description 6
- RGSOCXHDOPQREB-ZPFDUUQYSA-N Ile-Asp-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(C)C)C(=O)O)N RGSOCXHDOPQREB-ZPFDUUQYSA-N 0.000 description 6
- YBJWJQQBWRARLT-KBIXCLLPSA-N Ile-Gln-Ser Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O YBJWJQQBWRARLT-KBIXCLLPSA-N 0.000 description 6
- PWDSHAAAFXISLE-SXTJYALSSA-N Ile-Ile-Asp Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O PWDSHAAAFXISLE-SXTJYALSSA-N 0.000 description 6
- BBQABUDWDUKJMB-LZXPERKUSA-N Ile-Ile-Ile Chemical compound CC[C@H](C)[C@H]([NH3+])C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C([O-])=O BBQABUDWDUKJMB-LZXPERKUSA-N 0.000 description 6
- GVKKVHNRTUFCCE-BJDJZHNGSA-N Ile-Leu-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)O)N GVKKVHNRTUFCCE-BJDJZHNGSA-N 0.000 description 6
- UDBPXJNOEWDBDF-XUXIUFHCSA-N Ile-Lys-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(=O)O)N UDBPXJNOEWDBDF-XUXIUFHCSA-N 0.000 description 6
- OTSVBELRDMSPKY-PCBIJLKTSA-N Ile-Phe-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)N)C(=O)O)N OTSVBELRDMSPKY-PCBIJLKTSA-N 0.000 description 6
- LRAUKBMYHHNADU-DKIMLUQUSA-N Ile-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@@H](C)CC)CC1=CC=CC=C1 LRAUKBMYHHNADU-DKIMLUQUSA-N 0.000 description 6
- IVXJIMGDOYRLQU-XUXIUFHCSA-N Ile-Pro-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O IVXJIMGDOYRLQU-XUXIUFHCSA-N 0.000 description 6
- KTNGVMMGIQWIDV-OSUNSFLBSA-N Ile-Pro-Thr Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O KTNGVMMGIQWIDV-OSUNSFLBSA-N 0.000 description 6
- JHNJNTMTZHEDLJ-NAKRPEOUSA-N Ile-Ser-Arg Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O JHNJNTMTZHEDLJ-NAKRPEOUSA-N 0.000 description 6
- VGSPNSSCMOHRRR-BJDJZHNGSA-N Ile-Ser-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N VGSPNSSCMOHRRR-BJDJZHNGSA-N 0.000 description 6
- WLRJHVNFGAOYPS-HJPIBITLSA-N Ile-Ser-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N WLRJHVNFGAOYPS-HJPIBITLSA-N 0.000 description 6
- WCNWGAUZWWSYDG-SVSWQMSJSA-N Ile-Thr-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)O)N WCNWGAUZWWSYDG-SVSWQMSJSA-N 0.000 description 6
- PMGDADKJMCOXHX-UHFFFAOYSA-N L-Arginyl-L-glutamin-acetat Natural products NC(=N)NCCCC(N)C(=O)NC(CCC(N)=O)C(O)=O PMGDADKJMCOXHX-UHFFFAOYSA-N 0.000 description 6
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysin Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 6
- SITWEMZOJNKJCH-UHFFFAOYSA-N L-alanine-L-arginine Natural products CC(N)C(=O)NC(C(O)=O)CCCNC(N)=N SITWEMZOJNKJCH-UHFFFAOYSA-N 0.000 description 6
- JKGHDYGZRDWHGA-SRVKXCTJSA-N Leu-Asn-Leu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JKGHDYGZRDWHGA-SRVKXCTJSA-N 0.000 description 6
- PJYSOYLLTJKZHC-GUBZILKMSA-N Leu-Asp-Gln Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(N)=O PJYSOYLLTJKZHC-GUBZILKMSA-N 0.000 description 6
- ILJREDZFPHTUIE-GUBZILKMSA-N Leu-Asp-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ILJREDZFPHTUIE-GUBZILKMSA-N 0.000 description 6
- QCSFMCFHVGTLFF-NHCYSSNCSA-N Leu-Asp-Val Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O QCSFMCFHVGTLFF-NHCYSSNCSA-N 0.000 description 6
- DZQMXBALGUHGJT-GUBZILKMSA-N Leu-Glu-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O DZQMXBALGUHGJT-GUBZILKMSA-N 0.000 description 6
- WIDZHJTYKYBLSR-DCAQKATOSA-N Leu-Glu-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O WIDZHJTYKYBLSR-DCAQKATOSA-N 0.000 description 6
- ZFNLIDNJUWNIJL-WDCWCFNPSA-N Leu-Glu-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZFNLIDNJUWNIJL-WDCWCFNPSA-N 0.000 description 6
- HYIFFZAQXPUEAU-QWRGUYRKSA-N Leu-Gly-Leu Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(C)C HYIFFZAQXPUEAU-QWRGUYRKSA-N 0.000 description 6
- CSFVADKICPDRRF-KKUMJFAQSA-N Leu-His-Leu Chemical compound CC(C)C[C@H]([NH3+])C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C([O-])=O)CC1=CN=CN1 CSFVADKICPDRRF-KKUMJFAQSA-N 0.000 description 6
- LXKNSJLSGPNHSK-KKUMJFAQSA-N Leu-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N LXKNSJLSGPNHSK-KKUMJFAQSA-N 0.000 description 6
- ZRHDPZAAWLXXIR-SRVKXCTJSA-N Leu-Lys-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O ZRHDPZAAWLXXIR-SRVKXCTJSA-N 0.000 description 6
- ZAVCJRJOQKIOJW-KKUMJFAQSA-N Leu-Phe-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)CC1=CC=CC=C1 ZAVCJRJOQKIOJW-KKUMJFAQSA-N 0.000 description 6
- PJWOOBTYQNNRBF-BZSNNMDCSA-N Leu-Phe-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)O)N PJWOOBTYQNNRBF-BZSNNMDCSA-N 0.000 description 6
- JIHDFWWRYHSAQB-GUBZILKMSA-N Leu-Ser-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O JIHDFWWRYHSAQB-GUBZILKMSA-N 0.000 description 6
- SBANPBVRHYIMRR-UHFFFAOYSA-N Leu-Ser-Pro Natural products CC(C)CC(N)C(=O)NC(CO)C(=O)N1CCCC1C(O)=O SBANPBVRHYIMRR-UHFFFAOYSA-N 0.000 description 6
- PPGBXYKMUMHFBF-KATARQTJSA-N Leu-Ser-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PPGBXYKMUMHFBF-KATARQTJSA-N 0.000 description 6
- ICYRCNICGBJLGM-HJGDQZAQSA-N Leu-Thr-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(O)=O ICYRCNICGBJLGM-HJGDQZAQSA-N 0.000 description 6
- LCNASHSOFMRYFO-WDCWCFNPSA-N Leu-Thr-Gln Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCC(N)=O LCNASHSOFMRYFO-WDCWCFNPSA-N 0.000 description 6
- URJUVJDTPXCQFL-IHPCNDPISA-N Leu-Trp-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CC3=CN=CN3)C(=O)O)N URJUVJDTPXCQFL-IHPCNDPISA-N 0.000 description 6
- YQFZRHYZLARWDY-IHRRRGAJSA-N Leu-Val-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN YQFZRHYZLARWDY-IHRRRGAJSA-N 0.000 description 6
- FDBTVENULFNTAL-XQQFMLRXSA-N Leu-Val-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N FDBTVENULFNTAL-XQQFMLRXSA-N 0.000 description 6
- KPJJOZUXFOLGMQ-CIUDSAMLSA-N Lys-Asp-Asn Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N KPJJOZUXFOLGMQ-CIUDSAMLSA-N 0.000 description 6
- WGCKDDHUFPQSMZ-ZPFDUUQYSA-N Lys-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCCCN WGCKDDHUFPQSMZ-ZPFDUUQYSA-N 0.000 description 6
- QQUJSUFWEDZQQY-AVGNSLFASA-N Lys-Gln-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CCCCN QQUJSUFWEDZQQY-AVGNSLFASA-N 0.000 description 6
- DRCILAJNUJKAHC-SRVKXCTJSA-N Lys-Glu-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O DRCILAJNUJKAHC-SRVKXCTJSA-N 0.000 description 6
- GRADYHMSAUIKPS-DCAQKATOSA-N Lys-Glu-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O GRADYHMSAUIKPS-DCAQKATOSA-N 0.000 description 6
- LPAJOCKCPRZEAG-MNXVOIDGSA-N Lys-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCCCN LPAJOCKCPRZEAG-MNXVOIDGSA-N 0.000 description 6
- GQZMPWBZQALKJO-UWVGGRQHSA-N Lys-Gly-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O GQZMPWBZQALKJO-UWVGGRQHSA-N 0.000 description 6
- QZONCCHVHCOBSK-YUMQZZPRSA-N Lys-Gly-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O QZONCCHVHCOBSK-YUMQZZPRSA-N 0.000 description 6
- NNKLKUUGESXCBS-KBPBESRZSA-N Lys-Gly-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O NNKLKUUGESXCBS-KBPBESRZSA-N 0.000 description 6
- ZXFRGTAIIZHNHG-AJNGGQMLSA-N Lys-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CCCCN)N ZXFRGTAIIZHNHG-AJNGGQMLSA-N 0.000 description 6
- QKXZCUCBFPEXNK-KKUMJFAQSA-N Lys-Leu-His Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 QKXZCUCBFPEXNK-KKUMJFAQSA-N 0.000 description 6
- YXPJCVNIDDKGOE-MELADBBJSA-N Lys-Lys-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N)C(=O)O YXPJCVNIDDKGOE-MELADBBJSA-N 0.000 description 6
- QQPSCXKFDSORFT-IHRRRGAJSA-N Lys-Lys-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCCN QQPSCXKFDSORFT-IHRRRGAJSA-N 0.000 description 6
- AZOFEHCPMBRNFD-BZSNNMDCSA-N Lys-Phe-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(O)=O)CC1=CC=CC=C1 AZOFEHCPMBRNFD-BZSNNMDCSA-N 0.000 description 6
- SQRLLZAQNOQCEG-KKUMJFAQSA-N Lys-Tyr-Ser Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CC=C(O)C=C1 SQRLLZAQNOQCEG-KKUMJFAQSA-N 0.000 description 6
- VVURYEVJJTXWNE-ULQDDVLXSA-N Lys-Tyr-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(O)=O VVURYEVJJTXWNE-ULQDDVLXSA-N 0.000 description 6
- HMZPYMSEAALNAE-ULQDDVLXSA-N Lys-Val-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O HMZPYMSEAALNAE-ULQDDVLXSA-N 0.000 description 6
- YNOVBMBQSQTLFM-DCAQKATOSA-N Met-Asn-Leu Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O YNOVBMBQSQTLFM-DCAQKATOSA-N 0.000 description 6
- FWTBMGAKKPSTBT-GUBZILKMSA-N Met-Gln-Glu Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O FWTBMGAKKPSTBT-GUBZILKMSA-N 0.000 description 6
- XKJUFUPCHARJKX-UWVGGRQHSA-N Met-Gly-His Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CNC=N1 XKJUFUPCHARJKX-UWVGGRQHSA-N 0.000 description 6
- DJBCKVNHEIJLQA-GMOBBJLQSA-N Met-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCSC)N DJBCKVNHEIJLQA-GMOBBJLQSA-N 0.000 description 6
- BEZJTLKUMFMITF-AVGNSLFASA-N Met-Lys-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CCCNC(N)=N BEZJTLKUMFMITF-AVGNSLFASA-N 0.000 description 6
- LXVFHIBXOWJTKZ-BZSNNMDCSA-N Phe-Asn-Tyr Chemical compound N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O LXVFHIBXOWJTKZ-BZSNNMDCSA-N 0.000 description 6
- ABQFNJAFONNUTH-FHWLQOOXSA-N Phe-Gln-Tyr Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O)N ABQFNJAFONNUTH-FHWLQOOXSA-N 0.000 description 6
- OVJMCXAPGFDGMG-HKUYNNGSSA-N Phe-Gly-Trp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O OVJMCXAPGFDGMG-HKUYNNGSSA-N 0.000 description 6
- WKTSCAXSYITIJJ-PCBIJLKTSA-N Phe-Ile-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O WKTSCAXSYITIJJ-PCBIJLKTSA-N 0.000 description 6
- SMCHPSMKAFIERP-FXQIFTODSA-N Pro-Asn-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@@H]1CCCN1 SMCHPSMKAFIERP-FXQIFTODSA-N 0.000 description 6
- FRKBNXCFJBPJOL-GUBZILKMSA-N Pro-Glu-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O FRKBNXCFJBPJOL-GUBZILKMSA-N 0.000 description 6
- WFHYFCWBLSKEMS-KKUMJFAQSA-N Pro-Glu-Phe Chemical compound N([C@@H](CCC(=O)O)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C(=O)[C@@H]1CCCN1 WFHYFCWBLSKEMS-KKUMJFAQSA-N 0.000 description 6
- VYWNORHENYEQDW-YUMQZZPRSA-N Pro-Gly-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 VYWNORHENYEQDW-YUMQZZPRSA-N 0.000 description 6
- 108010079005 RDV peptide Proteins 0.000 description 6
- 108010003201 RGH 0205 Proteins 0.000 description 6
- ZUGXSSFMTXKHJS-ZLUOBGJFSA-N Ser-Ala-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(O)=O ZUGXSSFMTXKHJS-ZLUOBGJFSA-N 0.000 description 6
- VMVNCJDKFOQOHM-GUBZILKMSA-N Ser-Gln-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CO)N VMVNCJDKFOQOHM-GUBZILKMSA-N 0.000 description 6
- LALNXSXEYFUUDD-GUBZILKMSA-N Ser-Glu-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LALNXSXEYFUUDD-GUBZILKMSA-N 0.000 description 6
- IOVHBRCQOGWAQH-ZKWXMUAHSA-N Ser-Gly-Ile Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O IOVHBRCQOGWAQH-ZKWXMUAHSA-N 0.000 description 6
- LQESNKGTTNHZPZ-GHCJXIJMSA-N Ser-Ile-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O LQESNKGTTNHZPZ-GHCJXIJMSA-N 0.000 description 6
- NNFMANHDYSVNIO-DCAQKATOSA-N Ser-Lys-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O NNFMANHDYSVNIO-DCAQKATOSA-N 0.000 description 6
- PPNPDKGQRFSCAC-CIUDSAMLSA-N Ser-Lys-Asp Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)CO)C(=O)N[C@@H](CC(O)=O)C(O)=O PPNPDKGQRFSCAC-CIUDSAMLSA-N 0.000 description 6
- UGTZYIPOBYXWRW-SRVKXCTJSA-N Ser-Phe-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O UGTZYIPOBYXWRW-SRVKXCTJSA-N 0.000 description 6
- KZPRPBLHYMZIMH-MXAVVETBSA-N Ser-Phe-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KZPRPBLHYMZIMH-MXAVVETBSA-N 0.000 description 6
- UPLYXVPQLJVWMM-KKUMJFAQSA-N Ser-Phe-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O UPLYXVPQLJVWMM-KKUMJFAQSA-N 0.000 description 6
- XQJCEKXQUJQNNK-ZLUOBGJFSA-N Ser-Ser-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O XQJCEKXQUJQNNK-ZLUOBGJFSA-N 0.000 description 6
- PURRNJBBXDDWLX-ZDLURKLDSA-N Ser-Thr-Gly Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CO)N)O PURRNJBBXDDWLX-ZDLURKLDSA-N 0.000 description 6
- TWLMXDWFVNEFFK-FJXKBIBVSA-N Thr-Arg-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O TWLMXDWFVNEFFK-FJXKBIBVSA-N 0.000 description 6
- YLXAMFZYJTZXFH-OLHMAJIHSA-N Thr-Asn-Asp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N)O YLXAMFZYJTZXFH-OLHMAJIHSA-N 0.000 description 6
- IHAPJUHCZXBPHR-WZLNRYEVSA-N Thr-Ile-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N IHAPJUHCZXBPHR-WZLNRYEVSA-N 0.000 description 6
- MECLEFZMPPOEAC-VOAKCMCISA-N Thr-Leu-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N)O MECLEFZMPPOEAC-VOAKCMCISA-N 0.000 description 6
- KZSYAEWQMJEGRZ-RHYQMDGZSA-N Thr-Leu-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O KZSYAEWQMJEGRZ-RHYQMDGZSA-N 0.000 description 6
- MGJLBZFUXUGMML-VOAKCMCISA-N Thr-Lys-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)O)N)O MGJLBZFUXUGMML-VOAKCMCISA-N 0.000 description 6
- IVDFVBVIVLJJHR-LKXGYXEUSA-N Thr-Ser-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O IVDFVBVIVLJJHR-LKXGYXEUSA-N 0.000 description 6
- AYHSJESDFKREAR-KKUMJFAQSA-N Tyr-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 AYHSJESDFKREAR-KKUMJFAQSA-N 0.000 description 6
- HVHJYXDXRIWELT-RYUDHWBXSA-N Tyr-Glu-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O HVHJYXDXRIWELT-RYUDHWBXSA-N 0.000 description 6
- KOVXHANYYYMBRF-IRIUXVKKSA-N Tyr-Glu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N)O KOVXHANYYYMBRF-IRIUXVKKSA-N 0.000 description 6
- KCPFDGNYAMKZQP-KBPBESRZSA-N Tyr-Gly-Leu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O KCPFDGNYAMKZQP-KBPBESRZSA-N 0.000 description 6
- QHLIUFUEUDFAOT-MGHWNKPDSA-N Tyr-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC1=CC=C(C=C1)O)N QHLIUFUEUDFAOT-MGHWNKPDSA-N 0.000 description 6
- KHCSOLAHNLOXJR-BZSNNMDCSA-N Tyr-Leu-Leu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O KHCSOLAHNLOXJR-BZSNNMDCSA-N 0.000 description 6
- MXFPBNFKVBHIRW-BZSNNMDCSA-N Tyr-Lys-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N)O MXFPBNFKVBHIRW-BZSNNMDCSA-N 0.000 description 6
- CNNVVEPJTFOGHI-ACRUOGEOSA-N Tyr-Lys-Tyr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CNNVVEPJTFOGHI-ACRUOGEOSA-N 0.000 description 6
- IGXLNVIYDYONFB-UFYCRDLUSA-N Tyr-Phe-Arg Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)C1=CC=C(O)C=C1 IGXLNVIYDYONFB-UFYCRDLUSA-N 0.000 description 6
- LUMQYLVYUIRHHU-YJRXYDGGSA-N Tyr-Ser-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LUMQYLVYUIRHHU-YJRXYDGGSA-N 0.000 description 6
- COYSIHFOCOMGCF-WPRPVWTQSA-N Val-Arg-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(=O)NCC(O)=O)CCCN=C(N)N COYSIHFOCOMGCF-WPRPVWTQSA-N 0.000 description 6
- ZMDCGGKHRKNWKD-LAEOZQHASA-N Val-Asn-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ZMDCGGKHRKNWKD-LAEOZQHASA-N 0.000 description 6
- OVLIFGQSBSNGHY-KKHAAJSZSA-N Val-Asp-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N)O OVLIFGQSBSNGHY-KKHAAJSZSA-N 0.000 description 6
- CVIXTAITYJQMPE-LAEOZQHASA-N Val-Glu-Asn Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O CVIXTAITYJQMPE-LAEOZQHASA-N 0.000 description 6
- APQIVBCUIUDSMB-OSUNSFLBSA-N Val-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](C(C)C)N APQIVBCUIUDSMB-OSUNSFLBSA-N 0.000 description 6
- IJGPOONOTBNTFS-GVXVVHGQSA-N Val-Lys-Glu Chemical compound [H]N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O IJGPOONOTBNTFS-GVXVVHGQSA-N 0.000 description 6
- DIOSYUIWOQCXNR-ONGXEEELSA-N Val-Lys-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O DIOSYUIWOQCXNR-ONGXEEELSA-N 0.000 description 6
- UOUIMEGEPSBZIV-ULQDDVLXSA-N Val-Lys-Tyr Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 UOUIMEGEPSBZIV-ULQDDVLXSA-N 0.000 description 6
- 108010008685 alanyl-glutamyl-aspartic acid Proteins 0.000 description 6
- 108010047495 alanylglycine Proteins 0.000 description 6
- 108010008355 arginyl-glutamine Proteins 0.000 description 6
- 108010069926 arginyl-glycyl-serine Proteins 0.000 description 6
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical compound NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 6
- 230000006870 function Effects 0.000 description 6
- 108010049041 glutamylalanine Proteins 0.000 description 6
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 6
- 108010050475 glycyl-leucyl-tyrosine Proteins 0.000 description 6
- 108010045126 glycyl-tyrosyl-glycine Proteins 0.000 description 6
- UYTPUPDQBNUYGX-UHFFFAOYSA-N guanine Chemical compound O=C1NC(N)=NC2=C1N=CN2 UYTPUPDQBNUYGX-UHFFFAOYSA-N 0.000 description 6
- 108010057952 lysyl-phenylalanyl-lysine Proteins 0.000 description 6
- 108010009298 lysylglutamic acid Proteins 0.000 description 6
- 108010051242 phenylalanylserine Proteins 0.000 description 6
- 108010069117 seryl-lysyl-aspartic acid Proteins 0.000 description 6
- 108010005652 splenotritin Proteins 0.000 description 6
- 108010031491 threonyl-lysyl-glutamic acid Proteins 0.000 description 6
- RWQNBRDOKXIBIV-UHFFFAOYSA-N thymine Chemical compound CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 description 6
- 108010051110 tyrosyl-lysine Proteins 0.000 description 6
- 108010068794 tyrosyl-tyrosyl-glutamyl-glutamic acid Proteins 0.000 description 6
- 108010003137 tyrosyltyrosine Proteins 0.000 description 6
- YMTLKLXDFCSCNX-BYPYZUCNSA-N Ser-Gly-Gly Chemical compound OC[C@H](N)C(=O)NCC(=O)NCC(O)=O YMTLKLXDFCSCNX-BYPYZUCNSA-N 0.000 description 5
- 238000007792 addition Methods 0.000 description 5
- 102000037865 fusion proteins Human genes 0.000 description 5
- 108020001507 fusion proteins Proteins 0.000 description 5
- 108010067216 glycyl-glycyl-glycine Proteins 0.000 description 5
- XKUKSGPZAADMRA-UHFFFAOYSA-N glycyl-glycyl-glycine Natural products NCC(=O)NCC(=O)NCC(O)=O XKUKSGPZAADMRA-UHFFFAOYSA-N 0.000 description 5
- 239000002609 medium Substances 0.000 description 5
- 239000013612 plasmid Substances 0.000 description 5
- 108090000765 processed proteins & peptides Proteins 0.000 description 5
- MAISCYVJLBBRNU-DCAQKATOSA-N Arg-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCN=C(N)N)N MAISCYVJLBBRNU-DCAQKATOSA-N 0.000 description 4
- MJINRRBEMOLJAK-DCAQKATOSA-N Arg-Lys-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCN=C(N)N MJINRRBEMOLJAK-DCAQKATOSA-N 0.000 description 4
- 239000004475 Arginine Substances 0.000 description 4
- NXVGBGZQQFDUTM-XVYDVKMFSA-N Asn-Ala-His Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC(=O)N)N NXVGBGZQQFDUTM-XVYDVKMFSA-N 0.000 description 4
- KSBHCUSPLWRVEK-ZLUOBGJFSA-N Asn-Asn-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O KSBHCUSPLWRVEK-ZLUOBGJFSA-N 0.000 description 4
- XQQVCUIBGYFKDC-OLHMAJIHSA-N Asn-Asp-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XQQVCUIBGYFKDC-OLHMAJIHSA-N 0.000 description 4
- ORJQQZIXTOYGGH-SRVKXCTJSA-N Asn-Lys-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O ORJQQZIXTOYGGH-SRVKXCTJSA-N 0.000 description 4
- COWITDLVHMZSIW-CIUDSAMLSA-N Asn-Lys-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O COWITDLVHMZSIW-CIUDSAMLSA-N 0.000 description 4
- GWTLRDMPMJCNMH-WHFBIAKZSA-N Asp-Asn-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O GWTLRDMPMJCNMH-WHFBIAKZSA-N 0.000 description 4
- KHBLRHKVXICFMY-GUBZILKMSA-N Asp-Glu-Lys Chemical compound N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O KHBLRHKVXICFMY-GUBZILKMSA-N 0.000 description 4
- LBOVBQONZJRWPV-YUMQZZPRSA-N Asp-Lys-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O LBOVBQONZJRWPV-YUMQZZPRSA-N 0.000 description 4
- NVFSJIXJZCDICF-SRVKXCTJSA-N Asp-Lys-Lys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N NVFSJIXJZCDICF-SRVKXCTJSA-N 0.000 description 4
- HJZLUGQGJWXJCJ-CIUDSAMLSA-N Asp-Pro-Gln Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O HJZLUGQGJWXJCJ-CIUDSAMLSA-N 0.000 description 4
- 241000894006 Bacteria Species 0.000 description 4
- 241000196324 Embryophyta Species 0.000 description 4
- 206010051066 Gastrointestinal stromal tumour Diseases 0.000 description 4
- RZSLYUUFFVHFRQ-FXQIFTODSA-N Gln-Ala-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O RZSLYUUFFVHFRQ-FXQIFTODSA-N 0.000 description 4
- JJKKWYQVHRUSDG-GUBZILKMSA-N Glu-Ala-Lys Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O JJKKWYQVHRUSDG-GUBZILKMSA-N 0.000 description 4
- JZJGEKDPWVJOLD-QEWYBTABSA-N Glu-Phe-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JZJGEKDPWVJOLD-QEWYBTABSA-N 0.000 description 4
- UUTGYDAKPISJAO-JYJNAYRXSA-N Glu-Tyr-Leu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C(O)=O)CC1=CC=C(O)C=C1 UUTGYDAKPISJAO-JYJNAYRXSA-N 0.000 description 4
- LURCIJSJAKFCRO-QWRGUYRKSA-N Gly-Asn-Tyr Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O LURCIJSJAKFCRO-QWRGUYRKSA-N 0.000 description 4
- LCRDMSSAKLTKBU-ZDLURKLDSA-N Gly-Ser-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)CN LCRDMSSAKLTKBU-ZDLURKLDSA-N 0.000 description 4
- TTZAWSKKNCEINZ-AVGNSLFASA-N His-Arg-Val Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(O)=O TTZAWSKKNCEINZ-AVGNSLFASA-N 0.000 description 4
- 108010091358 Hypoxanthine Phosphoribosyltransferase Proteins 0.000 description 4
- BGZIJZJBXRVBGJ-SXTJYALSSA-N Ile-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N BGZIJZJBXRVBGJ-SXTJYALSSA-N 0.000 description 4
- CSQNHSGHAPRGPQ-YTFOTSKYSA-N Ile-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCCN)C(=O)O)N CSQNHSGHAPRGPQ-YTFOTSKYSA-N 0.000 description 4
- RMNMUUCYTMLWNA-ZPFDUUQYSA-N Ile-Lys-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)O)C(=O)O)N RMNMUUCYTMLWNA-ZPFDUUQYSA-N 0.000 description 4
- XDUVMJCBYUKNFJ-MXAVVETBSA-N Ile-Lys-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N XDUVMJCBYUKNFJ-MXAVVETBSA-N 0.000 description 4
- RCMNUBZKIIJCOI-ZPFDUUQYSA-N Ile-Met-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N RCMNUBZKIIJCOI-ZPFDUUQYSA-N 0.000 description 4
- WYUHAXJAMDTOAU-IAVJCBSLSA-N Ile-Phe-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N WYUHAXJAMDTOAU-IAVJCBSLSA-N 0.000 description 4
- ZYVTXBXHIKGZMD-QSFUFRPTSA-N Ile-Val-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(=O)N)C(=O)O)N ZYVTXBXHIKGZMD-QSFUFRPTSA-N 0.000 description 4
- 108010065920 Insulin Lispro Proteins 0.000 description 4
- OIARJGNVARWKFP-YUMQZZPRSA-N Leu-Asn-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O OIARJGNVARWKFP-YUMQZZPRSA-N 0.000 description 4
- KTFHTMHHKXUYPW-ZPFDUUQYSA-N Leu-Asp-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KTFHTMHHKXUYPW-ZPFDUUQYSA-N 0.000 description 4
- FMEICTQWUKNAGC-YUMQZZPRSA-N Leu-Gly-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O FMEICTQWUKNAGC-YUMQZZPRSA-N 0.000 description 4
- HNDWYLYAYNBWMP-AJNGGQMLSA-N Leu-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(C)C)N HNDWYLYAYNBWMP-AJNGGQMLSA-N 0.000 description 4
- PPQRKXHCLYCBSP-IHRRRGAJSA-N Leu-Leu-Met Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCSC)C(=O)O)N PPQRKXHCLYCBSP-IHRRRGAJSA-N 0.000 description 4
- BGZCJDGBBUUBHA-KKUMJFAQSA-N Leu-Lys-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O BGZCJDGBBUUBHA-KKUMJFAQSA-N 0.000 description 4
- XOWMDXHFSBCAKQ-SRVKXCTJSA-N Leu-Ser-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC(C)C XOWMDXHFSBCAKQ-SRVKXCTJSA-N 0.000 description 4
- DAYQSYGBCUKVKT-VOAKCMCISA-N Leu-Thr-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O DAYQSYGBCUKVKT-VOAKCMCISA-N 0.000 description 4
- ISSAURVGLGAPDK-KKUMJFAQSA-N Leu-Tyr-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O ISSAURVGLGAPDK-KKUMJFAQSA-N 0.000 description 4
- DZQYZKPINJLLEN-KKUMJFAQSA-N Lys-Cys-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CCCCN)N)O DZQYZKPINJLLEN-KKUMJFAQSA-N 0.000 description 4
- ZXEUFAVXODIPHC-GUBZILKMSA-N Lys-Glu-Asn Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ZXEUFAVXODIPHC-GUBZILKMSA-N 0.000 description 4
- GNLJXWBNLAIPEP-MELADBBJSA-N Lys-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CCCCN)N)C(=O)O GNLJXWBNLAIPEP-MELADBBJSA-N 0.000 description 4
- PRSBSVAVOQOAMI-BJDJZHNGSA-N Lys-Ile-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCCCN PRSBSVAVOQOAMI-BJDJZHNGSA-N 0.000 description 4
- YUAXTFMFMOIMAM-QWRGUYRKSA-N Lys-Lys-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O YUAXTFMFMOIMAM-QWRGUYRKSA-N 0.000 description 4
- GAHJXEMYXKLZRQ-AJNGGQMLSA-N Lys-Lys-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O GAHJXEMYXKLZRQ-AJNGGQMLSA-N 0.000 description 4
- ZUGVARDEGWMMLK-SRVKXCTJSA-N Lys-Ser-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN ZUGVARDEGWMMLK-SRVKXCTJSA-N 0.000 description 4
- PESQCPHRXOFIPX-UHFFFAOYSA-N N-L-methionyl-L-tyrosine Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 PESQCPHRXOFIPX-UHFFFAOYSA-N 0.000 description 4
- 208000009565 Pharyngeal Neoplasms Diseases 0.000 description 4
- 206010034811 Pharyngeal cancer Diseases 0.000 description 4
- VPVHXWGPALPDGP-GUBZILKMSA-N Pro-Asn-Arg Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O VPVHXWGPALPDGP-GUBZILKMSA-N 0.000 description 4
- VOZIBWWZSBIXQN-SRVKXCTJSA-N Pro-Glu-Lys Chemical compound NCCCC[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1)C(O)=O VOZIBWWZSBIXQN-SRVKXCTJSA-N 0.000 description 4
- SSWJYJHXQOYTSP-SRVKXCTJSA-N Pro-His-Gln Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(N)=O)C(O)=O SSWJYJHXQOYTSP-SRVKXCTJSA-N 0.000 description 4
- CPRLKHJUFAXVTD-ULQDDVLXSA-N Pro-Leu-Tyr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CPRLKHJUFAXVTD-ULQDDVLXSA-N 0.000 description 4
- JLMZKEQFMVORMA-SRVKXCTJSA-N Pro-Pro-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 JLMZKEQFMVORMA-SRVKXCTJSA-N 0.000 description 4
- OQSGBXGNAFQGGS-CYDGBPFRSA-N Pro-Val-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O OQSGBXGNAFQGGS-CYDGBPFRSA-N 0.000 description 4
- CRJZZXMAADSBBQ-SRVKXCTJSA-N Ser-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CO CRJZZXMAADSBBQ-SRVKXCTJSA-N 0.000 description 4
- ZKBKUWQVDWWSRI-BZSNNMDCSA-N Ser-Phe-Tyr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ZKBKUWQVDWWSRI-BZSNNMDCSA-N 0.000 description 4
- GKMYGVQDGVYCPC-IUKAMOBKSA-N Thr-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H]([C@@H](C)O)N GKMYGVQDGVYCPC-IUKAMOBKSA-N 0.000 description 4
- XOWKUMFHEZLKLT-CIQUZCHMSA-N Thr-Ile-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O XOWKUMFHEZLKLT-CIQUZCHMSA-N 0.000 description 4
- BEZTUFWTPVOROW-KJEVXHAQSA-N Thr-Tyr-Arg Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N)O BEZTUFWTPVOROW-KJEVXHAQSA-N 0.000 description 4
- WZQZUVWEPMGIMM-JYJNAYRXSA-N Tyr-Gln-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O WZQZUVWEPMGIMM-JYJNAYRXSA-N 0.000 description 4
- UNUZEBFXGWVAOP-DZKIICNBSA-N Tyr-Glu-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O UNUZEBFXGWVAOP-DZKIICNBSA-N 0.000 description 4
- PYJKETPLFITNKS-IHRRRGAJSA-N Tyr-Pro-Asn Chemical compound N[C@@H](Cc1ccc(O)cc1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O PYJKETPLFITNKS-IHRRRGAJSA-N 0.000 description 4
- JQOMHZMWQHXALX-FHWLQOOXSA-N Tyr-Tyr-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O JQOMHZMWQHXALX-FHWLQOOXSA-N 0.000 description 4
- UKEVLVBHRKWECS-LSJOCFKGSA-N Val-Ile-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](C(C)C)N UKEVLVBHRKWECS-LSJOCFKGSA-N 0.000 description 4
- SDUBQHUJJWQTEU-XUXIUFHCSA-N Val-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](C(C)C)N SDUBQHUJJWQTEU-XUXIUFHCSA-N 0.000 description 4
- 239000012190 activator Substances 0.000 description 4
- 239000003242 anti bacterial agent Substances 0.000 description 4
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 4
- 108010029539 arginyl-prolyl-proline Proteins 0.000 description 4
- 108010092854 aspartyllysine Proteins 0.000 description 4
- 230000003115 biocidal effect Effects 0.000 description 4
- 230000000694 effects Effects 0.000 description 4
- 201000011243 gastrointestinal stromal tumor Diseases 0.000 description 4
- 108010037850 glycylvaline Proteins 0.000 description 4
- 108010090333 leucyl-lysyl-proline Proteins 0.000 description 4
- 230000004048 modification Effects 0.000 description 4
- 238000012986 modification Methods 0.000 description 4
- 229920000642 polymer Polymers 0.000 description 4
- 230000001105 regulatory effect Effects 0.000 description 4
- 108010026333 seryl-proline Proteins 0.000 description 4
- 210000001082 somatic cell Anatomy 0.000 description 4
- 229930024421 Adenine Natural products 0.000 description 3
- GFFGJBXGBJISGV-UHFFFAOYSA-N Adenine Chemical compound NC1=NC=NC2=C1N=CN2 GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 3
- 241000701022 Cytomegalovirus Species 0.000 description 3
- 241000702421 Dependoparvovirus Species 0.000 description 3
- 102000004190 Enzymes Human genes 0.000 description 3
- 108090000790 Enzymes Proteins 0.000 description 3
- 241001465754 Metazoa Species 0.000 description 3
- 208000015914 Non-Hodgkin lymphomas Diseases 0.000 description 3
- 101710163270 Nuclease Proteins 0.000 description 3
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 3
- UIGMAMGZOJVTDN-WHFBIAKZSA-N Ser-Gly-Ser Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O UIGMAMGZOJVTDN-WHFBIAKZSA-N 0.000 description 3
- 230000004913 activation Effects 0.000 description 3
- 229960000643 adenine Drugs 0.000 description 3
- 239000002299 complementary DNA Substances 0.000 description 3
- 238000007796 conventional method Methods 0.000 description 3
- 229940104302 cytosine Drugs 0.000 description 3
- 238000010362 genome editing Methods 0.000 description 3
- 238000000338 in vitro Methods 0.000 description 3
- 238000001727 in vivo Methods 0.000 description 3
- 201000011061 large intestine cancer Diseases 0.000 description 3
- 201000007270 liver cancer Diseases 0.000 description 3
- 239000003550 marker Substances 0.000 description 3
- 201000001441 melanoma Diseases 0.000 description 3
- 239000000203 mixture Substances 0.000 description 3
- 125000006850 spacer group Chemical group 0.000 description 3
- 239000000126 substance Substances 0.000 description 3
- 229940113082 thymine Drugs 0.000 description 3
- 210000001519 tissue Anatomy 0.000 description 3
- 238000001890 transfection Methods 0.000 description 3
- VCSABYLVNWQYQE-SRVKXCTJSA-N Ala-Lys-Lys Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CCCCN)C(O)=O VCSABYLVNWQYQE-SRVKXCTJSA-N 0.000 description 2
- PEEYDECOOVQKRZ-DLOVCJGASA-N Ala-Ser-Phe Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PEEYDECOOVQKRZ-DLOVCJGASA-N 0.000 description 2
- NKBQZKVMKJJDLX-SRVKXCTJSA-N Arg-Glu-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O NKBQZKVMKJJDLX-SRVKXCTJSA-N 0.000 description 2
- JAYIQMNQDMOBFY-KKUMJFAQSA-N Arg-Glu-Tyr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JAYIQMNQDMOBFY-KKUMJFAQSA-N 0.000 description 2
- AGVNTAUPLWIQEN-ZPFDUUQYSA-N Arg-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N AGVNTAUPLWIQEN-ZPFDUUQYSA-N 0.000 description 2
- FFEUXEAKYRCACT-PEDHHIEDSA-N Arg-Ile-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CCCNC(N)=N)[C@@H](C)CC)C(O)=O FFEUXEAKYRCACT-PEDHHIEDSA-N 0.000 description 2
- FKQITMVNILRUCQ-IHRRRGAJSA-N Arg-Phe-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O FKQITMVNILRUCQ-IHRRRGAJSA-N 0.000 description 2
- YCYXHLZRUSJITQ-SRVKXCTJSA-N Arg-Pro-Pro Chemical compound NC(=N)NCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 YCYXHLZRUSJITQ-SRVKXCTJSA-N 0.000 description 2
- MEFGKQUUYZOLHM-GMOBBJLQSA-N Asn-Arg-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MEFGKQUUYZOLHM-GMOBBJLQSA-N 0.000 description 2
- FAEFJTCTNZTPHX-ACZMJKKPSA-N Asn-Gln-Ala Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O FAEFJTCTNZTPHX-ACZMJKKPSA-N 0.000 description 2
- UEONJSPBTSWKOI-CIUDSAMLSA-N Asn-Gln-Met Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(O)=O UEONJSPBTSWKOI-CIUDSAMLSA-N 0.000 description 2
- OPEPUCYIGFEGSW-WDSKDSINSA-N Asn-Gly-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O OPEPUCYIGFEGSW-WDSKDSINSA-N 0.000 description 2
- ZMUQQMGITUJQTI-CIUDSAMLSA-N Asn-Leu-Asn Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O ZMUQQMGITUJQTI-CIUDSAMLSA-N 0.000 description 2
- BXUHCIXDSWRSBS-CIUDSAMLSA-N Asn-Leu-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O BXUHCIXDSWRSBS-CIUDSAMLSA-N 0.000 description 2
- KNENKKKUYGEZIO-FXQIFTODSA-N Asn-Met-Asn Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N KNENKKKUYGEZIO-FXQIFTODSA-N 0.000 description 2
- BYLSYQASFJJBCL-DCAQKATOSA-N Asn-Pro-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O BYLSYQASFJJBCL-DCAQKATOSA-N 0.000 description 2
- JBDLMLZNDRLDIX-HJGDQZAQSA-N Asn-Thr-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O JBDLMLZNDRLDIX-HJGDQZAQSA-N 0.000 description 2
- DATSKXOXPUAOLK-KKUMJFAQSA-N Asn-Tyr-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O DATSKXOXPUAOLK-KKUMJFAQSA-N 0.000 description 2
- LRCIOEVFVGXZKB-BZSNNMDCSA-N Asn-Tyr-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O LRCIOEVFVGXZKB-BZSNNMDCSA-N 0.000 description 2
- CELPEWWLSXMVPH-CIUDSAMLSA-N Asp-Asp-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(O)=O CELPEWWLSXMVPH-CIUDSAMLSA-N 0.000 description 2
- VZNOVQKGJQJOCS-SRVKXCTJSA-N Asp-Asp-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VZNOVQKGJQJOCS-SRVKXCTJSA-N 0.000 description 2
- HOBNTSHITVVNBN-ZPFDUUQYSA-N Asp-Ile-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CC(=O)O)N HOBNTSHITVVNBN-ZPFDUUQYSA-N 0.000 description 2
- RQHLMGCXCZUOGT-ZPFDUUQYSA-N Asp-Leu-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O RQHLMGCXCZUOGT-ZPFDUUQYSA-N 0.000 description 2
- UZFHNLYQWMGUHU-DCAQKATOSA-N Asp-Lys-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O UZFHNLYQWMGUHU-DCAQKATOSA-N 0.000 description 2
- CTWCFPWFIGRAEP-CIUDSAMLSA-N Asp-Lys-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O CTWCFPWFIGRAEP-CIUDSAMLSA-N 0.000 description 2
- GKWFMNNNYZHJHV-SRVKXCTJSA-N Asp-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC(O)=O GKWFMNNNYZHJHV-SRVKXCTJSA-N 0.000 description 2
- JUWISGAGWSDGDH-KKUMJFAQSA-N Asp-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(O)=O)CC1=CC=CC=C1 JUWISGAGWSDGDH-KKUMJFAQSA-N 0.000 description 2
- JSNWZMFSLIWAHS-HJGDQZAQSA-N Asp-Thr-Leu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O JSNWZMFSLIWAHS-HJGDQZAQSA-N 0.000 description 2
- UXRVDHVARNBOIO-QSFUFRPTSA-N Asp-Val-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](CC(=O)O)N UXRVDHVARNBOIO-QSFUFRPTSA-N 0.000 description 2
- 206010003571 Astrocytoma Diseases 0.000 description 2
- 244000063299 Bacillus subtilis Species 0.000 description 2
- 235000014469 Bacillus subtilis Nutrition 0.000 description 2
- 238000010453 CRISPR/Cas method Methods 0.000 description 2
- 241000701489 Cauliflower mosaic virus Species 0.000 description 2
- 206010008342 Cervix carcinoma Diseases 0.000 description 2
- OEDPLIBVQGRKGZ-AVGNSLFASA-N Cys-Tyr-Glu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O OEDPLIBVQGRKGZ-AVGNSLFASA-N 0.000 description 2
- 241000588724 Escherichia coli Species 0.000 description 2
- 108091029865 Exogenous DNA Proteins 0.000 description 2
- MTCXQQINVAFZKW-MNXVOIDGSA-N Gln-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)N)N MTCXQQINVAFZKW-MNXVOIDGSA-N 0.000 description 2
- SXFPZRRVWSUYII-KBIXCLLPSA-N Gln-Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)N)N SXFPZRRVWSUYII-KBIXCLLPSA-N 0.000 description 2
- SGVGIVDZLSHSEN-RYUDHWBXSA-N Gln-Tyr-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)NCC(O)=O SGVGIVDZLSHSEN-RYUDHWBXSA-N 0.000 description 2
- LJLPOZGRPLORTF-CIUDSAMLSA-N Glu-Asn-Met Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCSC)C(O)=O LJLPOZGRPLORTF-CIUDSAMLSA-N 0.000 description 2
- XHUCVVHRLNPZSZ-CIUDSAMLSA-N Glu-Gln-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O XHUCVVHRLNPZSZ-CIUDSAMLSA-N 0.000 description 2
- HUFCEIHAFNVSNR-IHRRRGAJSA-N Glu-Gln-Tyr Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 HUFCEIHAFNVSNR-IHRRRGAJSA-N 0.000 description 2
- CGOHAEBMDSEKFB-FXQIFTODSA-N Glu-Glu-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O CGOHAEBMDSEKFB-FXQIFTODSA-N 0.000 description 2
- VGUYMZGLJUJRBV-YVNDNENWSA-N Glu-Ile-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O VGUYMZGLJUJRBV-YVNDNENWSA-N 0.000 description 2
- VGBSZQSKQRMLHD-MNXVOIDGSA-N Glu-Leu-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VGBSZQSKQRMLHD-MNXVOIDGSA-N 0.000 description 2
- IOUQWHIEQYQVFD-JYJNAYRXSA-N Glu-Leu-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O IOUQWHIEQYQVFD-JYJNAYRXSA-N 0.000 description 2
- MFYLRRCYBBJYPI-JYJNAYRXSA-N Glu-Tyr-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)O MFYLRRCYBBJYPI-JYJNAYRXSA-N 0.000 description 2
- JVWPPCWUDRJGAE-YUMQZZPRSA-N Gly-Asn-Leu Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JVWPPCWUDRJGAE-YUMQZZPRSA-N 0.000 description 2
- OCDLPQDYTJPWNG-YUMQZZPRSA-N Gly-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)CN OCDLPQDYTJPWNG-YUMQZZPRSA-N 0.000 description 2
- FMNHBTKMRFVGRO-FOHZUACHSA-N Gly-Asn-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)CN FMNHBTKMRFVGRO-FOHZUACHSA-N 0.000 description 2
- YWAQATDNEKZFFK-BYPYZUCNSA-N Gly-Gly-Ser Chemical compound NCC(=O)NCC(=O)N[C@@H](CO)C(O)=O YWAQATDNEKZFFK-BYPYZUCNSA-N 0.000 description 2
- AFWYPMDMDYCKMD-KBPBESRZSA-N Gly-Leu-Tyr Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 AFWYPMDMDYCKMD-KBPBESRZSA-N 0.000 description 2
- BMWFDYIYBAFROD-WPRPVWTQSA-N Gly-Pro-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN BMWFDYIYBAFROD-WPRPVWTQSA-N 0.000 description 2
- SOEGEPHNZOISMT-BYPYZUCNSA-N Gly-Ser-Gly Chemical compound NCC(=O)N[C@@H](CO)C(=O)NCC(O)=O SOEGEPHNZOISMT-BYPYZUCNSA-N 0.000 description 2
- 101710160287 Heterochromatin protein 1 Proteins 0.000 description 2
- 241000238631 Hexapoda Species 0.000 description 2
- KAFZDWMZKGQDEE-SRVKXCTJSA-N His-His-Asp Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)N[C@@H](CC(=O)O)C(=O)O)N KAFZDWMZKGQDEE-SRVKXCTJSA-N 0.000 description 2
- QMUHTRISZMFKAY-MXAVVETBSA-N His-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N QMUHTRISZMFKAY-MXAVVETBSA-N 0.000 description 2
- VFBZWZXKCVBTJR-SRVKXCTJSA-N His-Leu-Asp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC1=CN=CN1)N VFBZWZXKCVBTJR-SRVKXCTJSA-N 0.000 description 2
- 102000003893 Histone acetyltransferases Human genes 0.000 description 2
- 108090000246 Histone acetyltransferases Proteins 0.000 description 2
- 102000003964 Histone deacetylase Human genes 0.000 description 2
- 108090000353 Histone deacetylase Proteins 0.000 description 2
- FJWYJQRCVNGEAQ-ZPFDUUQYSA-N Ile-Asn-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N FJWYJQRCVNGEAQ-ZPFDUUQYSA-N 0.000 description 2
- YCKPUHHMCFSUMD-IUKAMOBKSA-N Ile-Thr-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(=O)O)C(=O)O)N YCKPUHHMCFSUMD-IUKAMOBKSA-N 0.000 description 2
- DGTOKVBDZXJHNZ-WZLNRYEVSA-N Ile-Thr-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N DGTOKVBDZXJHNZ-WZLNRYEVSA-N 0.000 description 2
- HGCNKOLVKRAVHD-UHFFFAOYSA-N L-Met-L-Phe Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 HGCNKOLVKRAVHD-UHFFFAOYSA-N 0.000 description 2
- ODKSFYDXXFIFQN-BYPYZUCNSA-P L-argininium(2+) Chemical compound NC(=[NH2+])NCCC[C@H]([NH3+])C(O)=O ODKSFYDXXFIFQN-BYPYZUCNSA-P 0.000 description 2
- ZDXPYRJPNDTMRX-VKHMYHEASA-N L-glutamine Chemical compound OC(=O)[C@@H](N)CCC(N)=O ZDXPYRJPNDTMRX-VKHMYHEASA-N 0.000 description 2
- HNDVDQJCIGZPNO-YFKPBYRVSA-N L-histidine Chemical compound OC(=O)[C@@H](N)CC1=CN=CN1 HNDVDQJCIGZPNO-YFKPBYRVSA-N 0.000 description 2
- KDXKERNSBIXSRK-YFKPBYRVSA-N L-lysine Chemical compound NCCCC[C@H](N)C(O)=O KDXKERNSBIXSRK-YFKPBYRVSA-N 0.000 description 2
- WUFYAPWIHCUMLL-CIUDSAMLSA-N Leu-Asn-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(O)=O WUFYAPWIHCUMLL-CIUDSAMLSA-N 0.000 description 2
- MYGQXVYRZMKRDB-SRVKXCTJSA-N Leu-Asp-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN MYGQXVYRZMKRDB-SRVKXCTJSA-N 0.000 description 2
- YVKSMSDXKMSIRX-GUBZILKMSA-N Leu-Glu-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O YVKSMSDXKMSIRX-GUBZILKMSA-N 0.000 description 2
- USLNHQZCDQJBOV-ZPFDUUQYSA-N Leu-Ile-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O USLNHQZCDQJBOV-ZPFDUUQYSA-N 0.000 description 2
- ZALAVHVPPOHAOL-XUXIUFHCSA-N Leu-Ile-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC(C)C)N ZALAVHVPPOHAOL-XUXIUFHCSA-N 0.000 description 2
- RTIRBWJPYJYTLO-MELADBBJSA-N Leu-Lys-Pro Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@@H]1C(=O)O)N RTIRBWJPYJYTLO-MELADBBJSA-N 0.000 description 2
- UCRJTSIIAYHOHE-ULQDDVLXSA-N Leu-Tyr-Arg Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N UCRJTSIIAYHOHE-ULQDDVLXSA-N 0.000 description 2
- VJGQRELPQWNURN-JYJNAYRXSA-N Leu-Tyr-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O VJGQRELPQWNURN-JYJNAYRXSA-N 0.000 description 2
- 239000012097 Lipofectamine 2000 Substances 0.000 description 2
- FACUGMGEFUEBTI-SRVKXCTJSA-N Lys-Asn-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CCCCN FACUGMGEFUEBTI-SRVKXCTJSA-N 0.000 description 2
- NTBFKPBULZGXQL-KKUMJFAQSA-N Lys-Asp-Tyr Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NTBFKPBULZGXQL-KKUMJFAQSA-N 0.000 description 2
- KYNNSEJZFVCDIV-ZPFDUUQYSA-N Lys-Ile-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O KYNNSEJZFVCDIV-ZPFDUUQYSA-N 0.000 description 2
- QOJDBRUCOXQSSK-AJNGGQMLSA-N Lys-Ile-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCCN)C(O)=O QOJDBRUCOXQSSK-AJNGGQMLSA-N 0.000 description 2
- WVJNGSFKBKOKRV-AJNGGQMLSA-N Lys-Leu-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WVJNGSFKBKOKRV-AJNGGQMLSA-N 0.000 description 2
- AIRZWUMAHCDDHR-KKUMJFAQSA-N Lys-Leu-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O AIRZWUMAHCDDHR-KKUMJFAQSA-N 0.000 description 2
- WRODMZBHNNPRLN-SRVKXCTJSA-N Lys-Leu-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O WRODMZBHNNPRLN-SRVKXCTJSA-N 0.000 description 2
- UQRZFMQQXXJTTF-AVGNSLFASA-N Lys-Lys-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O UQRZFMQQXXJTTF-AVGNSLFASA-N 0.000 description 2
- OBZHNHBAAVEWKI-DCAQKATOSA-N Lys-Pro-Asn Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O OBZHNHBAAVEWKI-DCAQKATOSA-N 0.000 description 2
- MIROMRNASYKZNL-ULQDDVLXSA-N Lys-Pro-Tyr Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 MIROMRNASYKZNL-ULQDDVLXSA-N 0.000 description 2
- UWHCKWNPWKTMBM-WDCWCFNPSA-N Lys-Thr-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(O)=O UWHCKWNPWKTMBM-WDCWCFNPSA-N 0.000 description 2
- DLCAXBGXGOVUCD-PPCPHDFISA-N Lys-Thr-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DLCAXBGXGOVUCD-PPCPHDFISA-N 0.000 description 2
- IKXQOBUBZSOWDY-AVGNSLFASA-N Lys-Val-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](CCCCN)N IKXQOBUBZSOWDY-AVGNSLFASA-N 0.000 description 2
- JHDNAOVJJQSMMM-GMOBBJLQSA-N Met-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CCSC)N JHDNAOVJJQSMMM-GMOBBJLQSA-N 0.000 description 2
- TWEWRDAAIYBJTO-ULQDDVLXSA-N Met-Tyr-His Chemical compound CSCC[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N TWEWRDAAIYBJTO-ULQDDVLXSA-N 0.000 description 2
- 102000010168 Myeloid Differentiation Factor 88 Human genes 0.000 description 2
- 108010077432 Myeloid Differentiation Factor 88 Proteins 0.000 description 2
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 2
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 2
- 206010033128 Ovarian cancer Diseases 0.000 description 2
- 206010061902 Pancreatic neoplasm Diseases 0.000 description 2
- KYYMILWEGJYPQZ-IHRRRGAJSA-N Phe-Glu-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 KYYMILWEGJYPQZ-IHRRRGAJSA-N 0.000 description 2
- VZFPYFRVHMSSNA-JURCDPSOSA-N Phe-Ile-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CC1=CC=CC=C1 VZFPYFRVHMSSNA-JURCDPSOSA-N 0.000 description 2
- OQTDZEJJWWAGJT-KKUMJFAQSA-N Phe-Lys-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O OQTDZEJJWWAGJT-KKUMJFAQSA-N 0.000 description 2
- FUVBEZJCRMHWEM-FXQIFTODSA-N Pro-Asn-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O FUVBEZJCRMHWEM-FXQIFTODSA-N 0.000 description 2
- WGAQWMRJUFQXMF-ZPFDUUQYSA-N Pro-Gln-Ile Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O WGAQWMRJUFQXMF-ZPFDUUQYSA-N 0.000 description 2
- XZONQWUEBAFQPO-HJGDQZAQSA-N Pro-Gln-Thr Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XZONQWUEBAFQPO-HJGDQZAQSA-N 0.000 description 2
- 206010060862 Prostate cancer Diseases 0.000 description 2
- 208000000236 Prostatic Neoplasms Diseases 0.000 description 2
- 108091028664 Ribonucleotide Proteins 0.000 description 2
- 206010039491 Sarcoma Diseases 0.000 description 2
- FCRMLGJMPXCAHD-FXQIFTODSA-N Ser-Arg-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O FCRMLGJMPXCAHD-FXQIFTODSA-N 0.000 description 2
- BCKYYTVFBXHPOG-ACZMJKKPSA-N Ser-Asn-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CO)N BCKYYTVFBXHPOG-ACZMJKKPSA-N 0.000 description 2
- QBUWQRKEHJXTOP-DCAQKATOSA-N Ser-His-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QBUWQRKEHJXTOP-DCAQKATOSA-N 0.000 description 2
- JIPVNVNKXJLFJF-BJDJZHNGSA-N Ser-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CO)N JIPVNVNKXJLFJF-BJDJZHNGSA-N 0.000 description 2
- LRZLZIUXQBIWTB-KATARQTJSA-N Ser-Lys-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LRZLZIUXQBIWTB-KATARQTJSA-N 0.000 description 2
- ZSLFCBHEINFXRS-LPEHRKFASA-N Ser-Met-Pro Chemical compound CSCC[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N ZSLFCBHEINFXRS-LPEHRKFASA-N 0.000 description 2
- BSXKBOUZDAZXHE-CIUDSAMLSA-N Ser-Pro-Glu Chemical compound [H]N[C@@H](CO)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O BSXKBOUZDAZXHE-CIUDSAMLSA-N 0.000 description 2
- 108091027544 Subgenomic mRNA Proteins 0.000 description 2
- NAXBBCLCEOTAIG-RHYQMDGZSA-N Thr-Arg-Lys Chemical compound NC(N)=NCCC[C@H](NC(=O)[C@@H](N)[C@H](O)C)C(=O)N[C@@H](CCCCN)C(O)=O NAXBBCLCEOTAIG-RHYQMDGZSA-N 0.000 description 2
- YOSLMIPKOUAHKI-OLHMAJIHSA-N Thr-Asp-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O YOSLMIPKOUAHKI-OLHMAJIHSA-N 0.000 description 2
- LIXBDERDAGNVAV-XKBZYTNZSA-N Thr-Gln-Ser Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O LIXBDERDAGNVAV-XKBZYTNZSA-N 0.000 description 2
- GKWNLDNXMMLRMC-GLLZPBPUSA-N Thr-Glu-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N)O GKWNLDNXMMLRMC-GLLZPBPUSA-N 0.000 description 2
- FLPZMPOZGYPBEN-PPCPHDFISA-N Thr-Leu-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FLPZMPOZGYPBEN-PPCPHDFISA-N 0.000 description 2
- KKPOGALELPLJTL-MEYUZBJRSA-N Thr-Lys-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 KKPOGALELPLJTL-MEYUZBJRSA-N 0.000 description 2
- JAJOFWABAUKAEJ-QTKMDUPCSA-N Thr-Pro-His Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N)O JAJOFWABAUKAEJ-QTKMDUPCSA-N 0.000 description 2
- KAJRRNHOVMZYBL-IRIUXVKKSA-N Thr-Tyr-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O KAJRRNHOVMZYBL-IRIUXVKKSA-N 0.000 description 2
- PWONLXBUSVIZPH-RHYQMDGZSA-N Thr-Val-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N)O PWONLXBUSVIZPH-RHYQMDGZSA-N 0.000 description 2
- 108091023040 Transcription factor Proteins 0.000 description 2
- 102000040945 Transcription factor Human genes 0.000 description 2
- PZXUIGWOEWWFQM-SRVKXCTJSA-N Tyr-Asn-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O PZXUIGWOEWWFQM-SRVKXCTJSA-N 0.000 description 2
- AKLNEFNQWLHIGY-QWRGUYRKSA-N Tyr-Gly-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)O)C(=O)O)N)O AKLNEFNQWLHIGY-QWRGUYRKSA-N 0.000 description 2
- LFCQXIXJQXWZJI-BZSNNMDCSA-N Tyr-His-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)N[C@@H](CC3=CN=CN3)C(=O)O)N)O LFCQXIXJQXWZJI-BZSNNMDCSA-N 0.000 description 2
- JLKVWTICWVWGSK-JYJNAYRXSA-N Tyr-Lys-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 JLKVWTICWVWGSK-JYJNAYRXSA-N 0.000 description 2
- NHOVZGFNTGMYMI-KKUMJFAQSA-N Tyr-Ser-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 NHOVZGFNTGMYMI-KKUMJFAQSA-N 0.000 description 2
- WYOBRXPIZVKNMF-IRXDYDNUSA-N Tyr-Tyr-Gly Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)NCC(O)=O)C1=CC=C(O)C=C1 WYOBRXPIZVKNMF-IRXDYDNUSA-N 0.000 description 2
- 208000006105 Uterine Cervical Neoplasms Diseases 0.000 description 2
- NWDOPHYLSORNEX-QXEWZRGKSA-N Val-Asn-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCSC)C(=O)O)N NWDOPHYLSORNEX-QXEWZRGKSA-N 0.000 description 2
- JLFKWDAZBRYCGX-ZKWXMUAHSA-N Val-Asn-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N JLFKWDAZBRYCGX-ZKWXMUAHSA-N 0.000 description 2
- BMGOFDMKDVVGJG-NHCYSSNCSA-N Val-Asp-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N BMGOFDMKDVVGJG-NHCYSSNCSA-N 0.000 description 2
- VENKIVFKIPGEJN-NHCYSSNCSA-N Val-Met-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N VENKIVFKIPGEJN-NHCYSSNCSA-N 0.000 description 2
- 241000700605 Viruses Species 0.000 description 2
- 108010070944 alanylhistidine Proteins 0.000 description 2
- 108010077245 asparaginyl-proline Proteins 0.000 description 2
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 2
- 108010093581 aspartyl-proline Proteins 0.000 description 2
- 108010038633 aspartylglutamate Proteins 0.000 description 2
- 210000002798 bone marrow cell Anatomy 0.000 description 2
- 201000010881 cervical cancer Diseases 0.000 description 2
- 238000006243 chemical reaction Methods 0.000 description 2
- 239000003153 chemical reaction reagent Substances 0.000 description 2
- 208000006990 cholangiocarcinoma Diseases 0.000 description 2
- 238000010367 cloning Methods 0.000 description 2
- 230000003247 decreasing effect Effects 0.000 description 2
- 239000005547 deoxyribonucleotide Substances 0.000 description 2
- 125000002637 deoxyribonucleotide group Chemical group 0.000 description 2
- FSXRLASFHBWESK-UHFFFAOYSA-N dipeptide phenylalanyl-tyrosine Natural products C=1C=C(O)C=CC=1CC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FSXRLASFHBWESK-UHFFFAOYSA-N 0.000 description 2
- 239000012634 fragment Substances 0.000 description 2
- 125000000291 glutamic acid group Chemical class N[C@@H](CCC(O)=O)C(=O)* 0.000 description 2
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 description 2
- 108010085325 histidylproline Proteins 0.000 description 2
- 230000006054 immunological memory Effects 0.000 description 2
- 238000011534 incubation Methods 0.000 description 2
- 230000002401 inhibitory effect Effects 0.000 description 2
- 230000005764 inhibitory process Effects 0.000 description 2
- 238000003780 insertion Methods 0.000 description 2
- 230000037431 insertion Effects 0.000 description 2
- 108010091871 leucylmethionine Proteins 0.000 description 2
- 208000032839 leukemia Diseases 0.000 description 2
- 208000014018 liver neoplasm Diseases 0.000 description 2
- 208000015486 malignant pancreatic neoplasm Diseases 0.000 description 2
- 108010056582 methionylglutamic acid Proteins 0.000 description 2
- 108010068488 methionylphenylalanine Proteins 0.000 description 2
- 210000001616 monocyte Anatomy 0.000 description 2
- 238000002703 mutagenesis Methods 0.000 description 2
- 231100000350 mutagenesis Toxicity 0.000 description 2
- 239000013642 negative control Substances 0.000 description 2
- 210000002569 neuron Anatomy 0.000 description 2
- 201000002528 pancreatic cancer Diseases 0.000 description 2
- 208000008443 pancreatic carcinoma Diseases 0.000 description 2
- 239000013600 plasmid vector Substances 0.000 description 2
- 108010070643 prolylglutamic acid Proteins 0.000 description 2
- 230000009467 reduction Effects 0.000 description 2
- 239000002336 ribonucleotide Substances 0.000 description 2
- 125000002652 ribonucleotide group Chemical group 0.000 description 2
- 239000000523 sample Substances 0.000 description 2
- 108010048818 seryl-histidine Proteins 0.000 description 2
- DAEPDZWVDSPTHF-UHFFFAOYSA-M sodium pyruvate Chemical compound [Na+].CC(=O)C([O-])=O DAEPDZWVDSPTHF-UHFFFAOYSA-M 0.000 description 2
- 239000012096 transfection reagent Substances 0.000 description 2
- 108010020532 tyrosyl-proline Proteins 0.000 description 2
- MTCFGRXMJLQNBG-REOHCLBHSA-N (2S)-2-Amino-3-hydroxypropansäure Chemical compound OC[C@H](N)C(O)=O MTCFGRXMJLQNBG-REOHCLBHSA-N 0.000 description 1
- BFSVOASYOCHEOV-UHFFFAOYSA-N 2-diethylaminoethanol Chemical compound CCN(CC)CCO BFSVOASYOCHEOV-UHFFFAOYSA-N 0.000 description 1
- FWMNVWWHGCHHJJ-SKKKGAJSSA-N 4-amino-1-[(2r)-6-amino-2-[[(2r)-2-[[(2r)-2-[[(2r)-2-amino-3-phenylpropanoyl]amino]-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]hexanoyl]piperidine-4-carboxylic acid Chemical compound C([C@H](C(=O)N[C@H](CC(C)C)C(=O)N[C@H](CCCCN)C(=O)N1CCC(N)(CC1)C(O)=O)NC(=O)[C@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FWMNVWWHGCHHJJ-SKKKGAJSSA-N 0.000 description 1
- 208000030507 AIDS Diseases 0.000 description 1
- 208000024893 Acute lymphoblastic leukemia Diseases 0.000 description 1
- 208000014697 Acute lymphocytic leukaemia Diseases 0.000 description 1
- 208000031261 Acute myeloid leukaemia Diseases 0.000 description 1
- 206010068873 Adenosquamous cell carcinoma Diseases 0.000 description 1
- 208000003120 Angiofibroma Diseases 0.000 description 1
- 241000203069 Archaea Species 0.000 description 1
- DCXYFEDJOCDNAF-UHFFFAOYSA-N Asparagine Natural products OC(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-N 0.000 description 1
- 108091032955 Bacterial small RNA Proteins 0.000 description 1
- 206010005003 Bladder cancer Diseases 0.000 description 1
- 206010005949 Bone cancer Diseases 0.000 description 1
- 208000018084 Bone neoplasm Diseases 0.000 description 1
- 241000283690 Bos taurus Species 0.000 description 1
- 208000003174 Brain Neoplasms Diseases 0.000 description 1
- 206010006187 Breast cancer Diseases 0.000 description 1
- 208000026310 Breast neoplasm Diseases 0.000 description 1
- 108091079001 CRISPR RNA Proteins 0.000 description 1
- 101710172824 CRISPR-associated endonuclease Cas9 Proteins 0.000 description 1
- 241000282472 Canis lupus familiaris Species 0.000 description 1
- 241000282693 Cercopithecidae Species 0.000 description 1
- 206010009944 Colon cancer Diseases 0.000 description 1
- 108020004394 Complementary RNA Proteins 0.000 description 1
- 108010008532 Deoxyribonuclease I Proteins 0.000 description 1
- 102000007260 Deoxyribonuclease I Human genes 0.000 description 1
- 208000008334 Dermatofibrosarcoma Diseases 0.000 description 1
- 206010057070 Dermatofibrosarcoma protuberans Diseases 0.000 description 1
- 229920002307 Dextran Polymers 0.000 description 1
- 208000021994 Diffuse astrocytoma Diseases 0.000 description 1
- 206010059866 Drug resistance Diseases 0.000 description 1
- 239000006144 Dulbecco’s modified Eagle's medium Substances 0.000 description 1
- 206010061825 Duodenal neoplasm Diseases 0.000 description 1
- 102000002322 Egg Proteins Human genes 0.000 description 1
- 108010000912 Egg Proteins Proteins 0.000 description 1
- 206010014733 Endometrial cancer Diseases 0.000 description 1
- 206010014759 Endometrial neoplasm Diseases 0.000 description 1
- 108010042407 Endonucleases Proteins 0.000 description 1
- 102000004533 Endonucleases Human genes 0.000 description 1
- 241000701959 Escherichia virus Lambda Species 0.000 description 1
- 208000000461 Esophageal Neoplasms Diseases 0.000 description 1
- 208000006168 Ewing Sarcoma Diseases 0.000 description 1
- 208000017259 Extragonadal germ cell tumor Diseases 0.000 description 1
- 208000022072 Gallbladder Neoplasms Diseases 0.000 description 1
- 102000008157 Histone Demethylases Human genes 0.000 description 1
- 108010074870 Histone Demethylases Proteins 0.000 description 1
- 102000011787 Histone Methyltransferases Human genes 0.000 description 1
- 108010036115 Histone Methyltransferases Proteins 0.000 description 1
- 108010033040 Histones Proteins 0.000 description 1
- 208000017604 Hodgkin disease Diseases 0.000 description 1
- 208000010747 Hodgkins lymphoma Diseases 0.000 description 1
- 241000282412 Homo Species 0.000 description 1
- 101000600434 Homo sapiens Putative uncharacterized protein encoded by MIR7-3HG Proteins 0.000 description 1
- 206010071119 Hormone-dependent prostate cancer Diseases 0.000 description 1
- 102000018251 Hypoxanthine Phosphoribosyltransferase Human genes 0.000 description 1
- 208000005726 Inflammatory Breast Neoplasms Diseases 0.000 description 1
- 206010021980 Inflammatory carcinoma of the breast Diseases 0.000 description 1
- 206010073094 Intraductal proliferative breast lesion Diseases 0.000 description 1
- 208000007766 Kaposi sarcoma Diseases 0.000 description 1
- 208000008839 Kidney Neoplasms Diseases 0.000 description 1
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 1
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 1
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 1
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical compound OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 description 1
- 229930182816 L-glutamine Natural products 0.000 description 1
- AGPKZVBTJJNPAG-WHFBIAKZSA-N L-isoleucine Chemical compound CC[C@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-WHFBIAKZSA-N 0.000 description 1
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 1
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 1
- AYFVYJQAPQTCCC-GBXIJSLDSA-N L-threonine Chemical compound C[C@@H](O)[C@H](N)C(O)=O AYFVYJQAPQTCCC-GBXIJSLDSA-N 0.000 description 1
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 1
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 1
- 208000018142 Leiomyosarcoma Diseases 0.000 description 1
- 241000713666 Lentivirus Species 0.000 description 1
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 1
- 206010058467 Lung neoplasm malignant Diseases 0.000 description 1
- 206010025323 Lymphomas Diseases 0.000 description 1
- 208000032271 Malignant tumor of penis Diseases 0.000 description 1
- 208000005450 Maxillary Sinus Neoplasms Diseases 0.000 description 1
- 208000009018 Medullary thyroid cancer Diseases 0.000 description 1
- 208000000172 Medulloblastoma Diseases 0.000 description 1
- 102000016397 Methyltransferase Human genes 0.000 description 1
- 108060004795 Methyltransferase Proteins 0.000 description 1
- 101100219625 Mus musculus Casd1 gene Proteins 0.000 description 1
- 241000699670 Mus sp. Species 0.000 description 1
- 102000008300 Mutant Proteins Human genes 0.000 description 1
- 108010021466 Mutant Proteins Proteins 0.000 description 1
- 208000033776 Myeloid Acute Leukemia Diseases 0.000 description 1
- 102100024134 Myeloid differentiation primary response protein MyD88 Human genes 0.000 description 1
- 201000007224 Myeloproliferative neoplasm Diseases 0.000 description 1
- 208000010505 Nose Neoplasms Diseases 0.000 description 1
- 108091028043 Nucleic acid sequence Proteins 0.000 description 1
- 206010030155 Oesophageal carcinoma Diseases 0.000 description 1
- 241000283973 Oryctolagus cuniculus Species 0.000 description 1
- 208000007571 Ovarian Epithelial Carcinoma Diseases 0.000 description 1
- 206010061535 Ovarian neoplasm Diseases 0.000 description 1
- 229910019142 PO4 Inorganic materials 0.000 description 1
- 208000008900 Pancreatic Ductal Carcinoma Diseases 0.000 description 1
- 208000000821 Parathyroid Neoplasms Diseases 0.000 description 1
- 208000002471 Penile Neoplasms Diseases 0.000 description 1
- 206010034299 Penile cancer Diseases 0.000 description 1
- 102000002508 Peptide Elongation Factors Human genes 0.000 description 1
- 108010068204 Peptide Elongation Factors Proteins 0.000 description 1
- 208000006664 Precursor Cell Lymphoblastic Leukemia-Lymphoma Diseases 0.000 description 1
- 102100037401 Putative uncharacterized protein encoded by MIR7-3HG Human genes 0.000 description 1
- 230000004570 RNA-binding Effects 0.000 description 1
- 241000700159 Rattus Species 0.000 description 1
- 208000015634 Rectal Neoplasms Diseases 0.000 description 1
- 206010038389 Renal cancer Diseases 0.000 description 1
- 208000006265 Renal cell carcinoma Diseases 0.000 description 1
- 108091081062 Repeated sequence (DNA) Proteins 0.000 description 1
- 108020005091 Replication Origin Proteins 0.000 description 1
- 208000004337 Salivary Gland Neoplasms Diseases 0.000 description 1
- 206010061934 Salivary gland cancer Diseases 0.000 description 1
- 208000000453 Skin Neoplasms Diseases 0.000 description 1
- 206010054184 Small intestine carcinoma Diseases 0.000 description 1
- 208000021712 Soft tissue sarcoma Diseases 0.000 description 1
- 208000005718 Stomach Neoplasms Diseases 0.000 description 1
- 241000282887 Suidae Species 0.000 description 1
- 210000001744 T-lymphocyte Anatomy 0.000 description 1
- 208000024313 Testicular Neoplasms Diseases 0.000 description 1
- AYFVYJQAPQTCCC-UHFFFAOYSA-N Threonine Natural products CC(O)C(N)C(O)=O AYFVYJQAPQTCCC-UHFFFAOYSA-N 0.000 description 1
- 239000004473 Threonine Substances 0.000 description 1
- 208000024770 Thyroid neoplasm Diseases 0.000 description 1
- 206010062129 Tongue neoplasm Diseases 0.000 description 1
- 101710183280 Topoisomerase Proteins 0.000 description 1
- 208000007097 Urinary Bladder Neoplasms Diseases 0.000 description 1
- 208000006593 Urologic Neoplasms Diseases 0.000 description 1
- 241000700618 Vaccinia virus Species 0.000 description 1
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Natural products CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 description 1
- 208000008383 Wilms tumor Diseases 0.000 description 1
- 210000005006 adaptive immune system Anatomy 0.000 description 1
- 201000008395 adenosquamous carcinoma Diseases 0.000 description 1
- 210000001789 adipocyte Anatomy 0.000 description 1
- 210000004100 adrenal gland Anatomy 0.000 description 1
- 235000004279 alanine Nutrition 0.000 description 1
- 230000004075 alteration Effects 0.000 description 1
- 125000003277 amino group Chemical group 0.000 description 1
- 235000009582 asparagine Nutrition 0.000 description 1
- 229960001230 asparagine Drugs 0.000 description 1
- 235000003704 aspartic acid Nutrition 0.000 description 1
- 210000001130 astrocyte Anatomy 0.000 description 1
- 210000003719 b-lymphocyte Anatomy 0.000 description 1
- 210000000227 basophil cell of anterior lobe of hypophysis Anatomy 0.000 description 1
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 description 1
- 210000004369 blood Anatomy 0.000 description 1
- 239000008280 blood Substances 0.000 description 1
- 210000001772 blood platelet Anatomy 0.000 description 1
- 210000000988 bone and bone Anatomy 0.000 description 1
- 210000004556 brain Anatomy 0.000 description 1
- 210000004899 c-terminal region Anatomy 0.000 description 1
- 229910000389 calcium phosphate Inorganic materials 0.000 description 1
- 239000001506 calcium phosphate Substances 0.000 description 1
- 235000011010 calcium phosphates Nutrition 0.000 description 1
- 210000004413 cardiac myocyte Anatomy 0.000 description 1
- 210000000845 cartilage Anatomy 0.000 description 1
- 101150055766 cat gene Proteins 0.000 description 1
- 210000003855 cell nucleus Anatomy 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 201000006392 childhood kidney cancer Diseases 0.000 description 1
- 208000013549 childhood kidney neoplasm Diseases 0.000 description 1
- 210000004978 chinese hamster ovary cell Anatomy 0.000 description 1
- 210000001612 chondrocyte Anatomy 0.000 description 1
- 210000001072 colon Anatomy 0.000 description 1
- 208000029742 colonic neoplasm Diseases 0.000 description 1
- 239000003184 complementary RNA Substances 0.000 description 1
- 210000002808 connective tissue Anatomy 0.000 description 1
- 210000001771 cumulus cell Anatomy 0.000 description 1
- 208000035250 cutaneous malignant susceptibility to 1 melanoma Diseases 0.000 description 1
- 125000004122 cyclic group Chemical group 0.000 description 1
- 230000003412 degenerative effect Effects 0.000 description 1
- 210000004443 dendritic cell Anatomy 0.000 description 1
- 238000010586 diagram Methods 0.000 description 1
- 206010012818 diffuse large B-cell lymphoma Diseases 0.000 description 1
- 201000000312 duodenum cancer Diseases 0.000 description 1
- 238000004520 electroporation Methods 0.000 description 1
- 210000001671 embryonic stem cell Anatomy 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 239000003623 enhancer Substances 0.000 description 1
- 230000002708 enhancing effect Effects 0.000 description 1
- 210000001339 epidermal cell Anatomy 0.000 description 1
- 210000002919 epithelial cell Anatomy 0.000 description 1
- 210000003743 erythrocyte Anatomy 0.000 description 1
- 201000004101 esophageal cancer Diseases 0.000 description 1
- 239000003797 essential amino acid Substances 0.000 description 1
- 235000020776 essential amino acid Nutrition 0.000 description 1
- 238000011156 evaluation Methods 0.000 description 1
- 238000010195 expression analysis Methods 0.000 description 1
- 210000001508 eye Anatomy 0.000 description 1
- 230000001605 fetal effect Effects 0.000 description 1
- 201000001169 fibrillary astrocytoma Diseases 0.000 description 1
- 210000002950 fibroblast Anatomy 0.000 description 1
- 102000034287 fluorescent proteins Human genes 0.000 description 1
- 108091006047 fluorescent proteins Proteins 0.000 description 1
- 230000004927 fusion Effects 0.000 description 1
- 201000010175 gallbladder cancer Diseases 0.000 description 1
- 206010017758 gastric cancer Diseases 0.000 description 1
- 238000001415 gene therapy Methods 0.000 description 1
- 210000004602 germ cell Anatomy 0.000 description 1
- ZDXPYRJPNDTMRX-UHFFFAOYSA-N glutamine Natural products OC(=O)C(N)CCC(N)=O ZDXPYRJPNDTMRX-UHFFFAOYSA-N 0.000 description 1
- 235000004554 glutamine Nutrition 0.000 description 1
- 210000003780 hair follicle Anatomy 0.000 description 1
- 201000010536 head and neck cancer Diseases 0.000 description 1
- 208000014829 head and neck neoplasm Diseases 0.000 description 1
- 210000002216 heart Anatomy 0.000 description 1
- 201000011066 hemangioma Diseases 0.000 description 1
- 210000003958 hematopoietic stem cell Anatomy 0.000 description 1
- 210000003897 hepatic stem cell Anatomy 0.000 description 1
- 210000003494 hepatocyte Anatomy 0.000 description 1
- 239000005556 hormone Substances 0.000 description 1
- 229940088597 hormone Drugs 0.000 description 1
- 210000002865 immune cell Anatomy 0.000 description 1
- 230000002779 inactivation Effects 0.000 description 1
- 238000010348 incorporation Methods 0.000 description 1
- 210000004263 induced pluripotent stem cell Anatomy 0.000 description 1
- 239000000411 inducer Substances 0.000 description 1
- 230000001939 inductive effect Effects 0.000 description 1
- 201000004653 inflammatory breast carcinoma Diseases 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 210000004966 intestinal stem cell Anatomy 0.000 description 1
- 206010073095 invasive ductal breast carcinoma Diseases 0.000 description 1
- 201000010985 invasive ductal carcinoma Diseases 0.000 description 1
- 210000004153 islets of langerhan Anatomy 0.000 description 1
- 229960000310 isoleucine Drugs 0.000 description 1
- AGPKZVBTJJNPAG-UHFFFAOYSA-N isoleucine Natural products CCC(C)C(N)C(O)=O AGPKZVBTJJNPAG-UHFFFAOYSA-N 0.000 description 1
- 210000002510 keratinocyte Anatomy 0.000 description 1
- 210000003734 kidney Anatomy 0.000 description 1
- 201000010982 kidney cancer Diseases 0.000 description 1
- 210000003292 kidney cell Anatomy 0.000 description 1
- 210000000244 kidney pelvis Anatomy 0.000 description 1
- 210000002429 large intestine Anatomy 0.000 description 1
- 230000000670 limiting effect Effects 0.000 description 1
- -1 linear Chemical group 0.000 description 1
- 238000001638 lipofection Methods 0.000 description 1
- 210000004185 liver Anatomy 0.000 description 1
- 210000004072 lung Anatomy 0.000 description 1
- 201000005202 lung cancer Diseases 0.000 description 1
- 208000020816 lung neoplasm Diseases 0.000 description 1
- 208000025036 lymphosarcoma Diseases 0.000 description 1
- 210000002540 macrophage Anatomy 0.000 description 1
- 230000036210 malignancy Effects 0.000 description 1
- 208000006178 malignant mesothelioma Diseases 0.000 description 1
- 208000026045 malignant tumor of parathyroid gland Diseases 0.000 description 1
- 239000011159 matrix material Substances 0.000 description 1
- 208000023356 medullary thyroid gland carcinoma Diseases 0.000 description 1
- 210000002752 melanocyte Anatomy 0.000 description 1
- 210000002901 mesenchymal stem cell Anatomy 0.000 description 1
- 230000001394 metastastic effect Effects 0.000 description 1
- 206010061289 metastatic neoplasm Diseases 0.000 description 1
- 230000011987 methylation Effects 0.000 description 1
- 238000007069 methylation reaction Methods 0.000 description 1
- 244000005700 microbiome Species 0.000 description 1
- 210000000274 microglia Anatomy 0.000 description 1
- 238000000520 microinjection Methods 0.000 description 1
- 201000010879 mucinous adenocarcinoma Diseases 0.000 description 1
- 210000003205 muscle Anatomy 0.000 description 1
- 210000000663 muscle cell Anatomy 0.000 description 1
- 210000001665 muscle stem cell Anatomy 0.000 description 1
- 208000037830 nasal cancer Diseases 0.000 description 1
- 208000025189 neoplasm of testis Diseases 0.000 description 1
- 230000001537 neural effect Effects 0.000 description 1
- 210000001178 neural stem cell Anatomy 0.000 description 1
- 208000007538 neurilemmoma Diseases 0.000 description 1
- 210000004498 neuroglial cell Anatomy 0.000 description 1
- 210000000440 neutrophil Anatomy 0.000 description 1
- 208000002154 non-small cell lung carcinoma Diseases 0.000 description 1
- 210000004248 oligodendroglia Anatomy 0.000 description 1
- 210000004409 osteocyte Anatomy 0.000 description 1
- 201000008968 osteosarcoma Diseases 0.000 description 1
- 230000002611 ovarian Effects 0.000 description 1
- 210000001672 ovary Anatomy 0.000 description 1
- 210000004681 ovum Anatomy 0.000 description 1
- 210000000496 pancreas Anatomy 0.000 description 1
- 201000008129 pancreatic ductal adenocarcinoma Diseases 0.000 description 1
- 208000004019 papillary adenocarcinoma Diseases 0.000 description 1
- 239000002245 particle Substances 0.000 description 1
- 210000003668 pericyte Anatomy 0.000 description 1
- COLNVLDHVKWLRT-UHFFFAOYSA-N phenylalanine Natural products OC(=O)C(N)CC1=CC=CC=C1 COLNVLDHVKWLRT-UHFFFAOYSA-N 0.000 description 1
- NBIIXXVUZAFLBC-UHFFFAOYSA-K phosphate Chemical compound [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 1
- 239000010452 phosphate Substances 0.000 description 1
- 125000002467 phosphate group Chemical group [H]OP(=O)(O[H])O[*] 0.000 description 1
- 201000007315 pineal gland astrocytoma Diseases 0.000 description 1
- 230000008488 polyadenylation Effects 0.000 description 1
- 229920001184 polypeptide Polymers 0.000 description 1
- 208000015768 polyposis Diseases 0.000 description 1
- 102000004196 processed proteins & peptides Human genes 0.000 description 1
- 230000035755 proliferation Effects 0.000 description 1
- 210000002307 prostate Anatomy 0.000 description 1
- 125000000561 purinyl group Chemical group N1=C(N=C2N=CNC2=C1)* 0.000 description 1
- 125000000714 pyrimidinyl group Chemical group 0.000 description 1
- 206010038038 rectal cancer Diseases 0.000 description 1
- 201000001275 rectum cancer Diseases 0.000 description 1
- 201000006845 reticulosarcoma Diseases 0.000 description 1
- 208000029922 reticulum cell sarcoma Diseases 0.000 description 1
- 230000002207 retinal effect Effects 0.000 description 1
- 201000009410 rhabdomyosarcoma Diseases 0.000 description 1
- 206010039667 schwannoma Diseases 0.000 description 1
- 230000035939 shock Effects 0.000 description 1
- 208000037968 sinus cancer Diseases 0.000 description 1
- 210000002363 skeletal muscle cell Anatomy 0.000 description 1
- 210000003491 skin Anatomy 0.000 description 1
- 201000000849 skin cancer Diseases 0.000 description 1
- 210000000813 small intestine Anatomy 0.000 description 1
- 201000002314 small intestine cancer Diseases 0.000 description 1
- 210000000329 smooth muscle myocyte Anatomy 0.000 description 1
- 229940054269 sodium pyruvate Drugs 0.000 description 1
- 239000007787 solid Substances 0.000 description 1
- 210000002325 somatostatin-secreting cell Anatomy 0.000 description 1
- 210000000952 spleen Anatomy 0.000 description 1
- 210000002784 stomach Anatomy 0.000 description 1
- 201000011549 stomach cancer Diseases 0.000 description 1
- 201000003120 testicular cancer Diseases 0.000 description 1
- 210000001550 testis Anatomy 0.000 description 1
- RYYWUUFWQRZTIU-UHFFFAOYSA-K thiophosphate Chemical group [O-]P([O-])([O-])=S RYYWUUFWQRZTIU-UHFFFAOYSA-K 0.000 description 1
- 210000001541 thymus gland Anatomy 0.000 description 1
- 201000002510 thyroid cancer Diseases 0.000 description 1
- 201000006134 tongue cancer Diseases 0.000 description 1
- 108091008023 transcriptional regulators Proteins 0.000 description 1
- QORWJWZARLRLPR-UHFFFAOYSA-H tricalcium bis(phosphate) Chemical compound [Ca+2].[Ca+2].[Ca+2].[O-]P([O-])([O-])=O.[O-]P([O-])([O-])=O QORWJWZARLRLPR-UHFFFAOYSA-H 0.000 description 1
- 210000004881 tumor cell Anatomy 0.000 description 1
- 208000029729 tumor suppressor gene on chromosome 11 Diseases 0.000 description 1
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 1
- 241000701161 unidentified adenovirus Species 0.000 description 1
- 241000701447 unidentified baculovirus Species 0.000 description 1
- 241001515965 unidentified phage Species 0.000 description 1
- 210000003932 urinary bladder Anatomy 0.000 description 1
- 201000005112 urinary bladder cancer Diseases 0.000 description 1
- 210000003741 urothelium Anatomy 0.000 description 1
- 208000037965 uterine sarcoma Diseases 0.000 description 1
- 210000004291 uterus Anatomy 0.000 description 1
- 239000004474 valine Substances 0.000 description 1
- 230000002792 vascular Effects 0.000 description 1
- 210000003556 vascular endothelial cell Anatomy 0.000 description 1
- 239000013603 viral vector Substances 0.000 description 1
Images
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/87—Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
- C12N15/90—Stable introduction of foreign DNA into chromosome
- C12N15/902—Stable introduction of foreign DNA into chromosome using homologous recombination
- C12N15/907—Stable introduction of foreign DNA into chromosome using homologous recombination in mammalian cells
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases RNAses, DNAses
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/113—Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex- forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/52—Genes encoding for enzymes or proenzymes
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/10—Type of nucleic acid
- C12N2310/20—Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Genetics & Genomics (AREA)
- Engineering & Computer Science (AREA)
- Chemical & Material Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Organic Chemistry (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Biotechnology (AREA)
- General Engineering & Computer Science (AREA)
- Biochemistry (AREA)
- Microbiology (AREA)
- General Health & Medical Sciences (AREA)
- Biophysics (AREA)
- Physics & Mathematics (AREA)
- Plant Pathology (AREA)
- Medicinal Chemistry (AREA)
- Cell Biology (AREA)
- Mycology (AREA)
- Peptides Or Proteins (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Apparatus Associated With Microorganisms And Enzymes (AREA)
- Enzymes And Modification Thereof (AREA)
Abstract
本发明涉及一种蛋白质,其由下述序列构成且具有与向导RNA结合的能力,所述序列包含如下氨基酸序列:在序列号2所示的氨基酸序列中,具有在第721~755位之间连续的缺失区域,与该缺失区域分别相邻的氨基酸通过由3~10个氨基酸残基构成的接头连接。该蛋白质尽管存在缺失区域、与全长的dSaCas9相比缩小,但是与向导RNA结合的能力以及DNA结合亲和性得到了维持。小型化的dSaCas9蛋白的使用使更多的基因向载体中的搭载成为可能。
Description
技术领域
本发明涉及维持了与向导RNA结合的能力且小型化的经改造的Cas9蛋白及其用途。
背景技术
已知成簇的规律性间隔的短回文重复序列(Clustered Regularly InterspacedShort Palindromic Repeats:CRISPR)与Cas(CRISPR相关)基因一起构成细菌及古细菌中提供对入侵外源核酸的获得性抗性的适应性免疫系统。CRISPR大多起源于噬菌体或质粒DNA,由在其间插入了尺寸相似的被称为间隔子的独特的可变DNA序列的、24~48bp的短的保守重复序列构成。另外,在重复序列及间隔子序列附近存在编码Cas蛋白家族的基因群。
CRISPR/Cas系统中,外源性DNA被Cas蛋白家族切成30bp左右的片段并被插入到CRISPR中。作为Cas蛋白家族之一的Cas1及Cas2蛋白识别外源性DNA的被称为前间隔序列邻近基序(proto-spacer adjacent motif,PAM)的碱基序列,切取其上游并插入到宿主的CRISPR序列中,其形成细菌的免疫记忆。转录包含免疫记忆的CRISPR序列而生成的RNA(称为pre-crRNA。)与部分互补的RNA(trans-activating crRNA:tracrRNA)配对,整合到作为Cas蛋白家族之一的Cas9蛋白中。整合到Cas9中的pre-crRNA及tracrRNA被RNaseIII切断,成为包含外源序列(向导序列)的小的RNA片段(CRISPR-RNAs:crRNAs),形成Cas9-crRNA-tracrRNA复合物。Cas9-crRNA-tracrRNA复合物结合到与crRNA互补的外源入侵DNA上,作为切断DNA的酶(nuclease)的Cas9蛋白将外源入侵DNA切断,由此抑制和消除从外部入侵的DNA的功能。
近年来,正在积极开发将CRISPR/Cas系统应用于基因组编辑的技术。使crRNA和tracrRNA融合而以tracrRNA-crRNA嵌合体(以下称为向导RNA(guide RNA:gRNA)。)形式表达,并有效利用。由此引进核酸酶(RNA-guided nuclease:RGN)并在目标位点切断基因组DNA。
另一方面,通过在作为基因组编辑系统之一的CRISPR/Cas9中的Cas9蛋白的核酸酶已失活的变异体(nuclease-null,dCas9)中融合VP64、VP160等转录活化因子或KRAB等转录抑制因子等转录调控因子,从而能够形成能调节靶基因的表达水平的系统。例如,为了进一步提高基因活化的效率而与将2个转录活性因子连接而成的活化因子(VP64-Rta)、将3个转录活性因子连接而成的活化因子(VP64-p65-Rta,VPR)融合,该融合而得的dCas9蛋白(dCas9-VPR;dCas9融合蛋白)在不切断DNA的情况下强烈活化靶基因的表达。
对于Cas9蛋白,以PAM特异性的缓和、核酸酶活性的改造(活化/失活)、小型化为目的,已研制并报道了各种变异体(专利文献1~3)。
现有技术文献
专利文献
专利文献1:WO2016/141224A1
专利文献2:WO2017/010543A1
专利文献3:WO2018/074979A1
发明内容
发明所要解决的问题
在体内的dCas9融合蛋白的表达中,需要表达载体。基因治疗中,从安全性和效率高角度来看,腺相关病毒载体(AAV)逐渐成为主流,但是AAV的可搭载的尺寸为4.4kb左右,而dCas9蛋白已占据4kb左右,搭载于AAV中时,融合蛋白的构成极其受限。
因此,本发明人的目的在于,提供具有与全长蛋白质实质上同等的DNA结合亲和性且进一步小型化的dCas9蛋白的变异体。
用于解决问题的方法
本发明人着眼于作为Cas9蛋白的源自金黄色葡萄球菌(S.aureus)的Cas9(本说明书中也称为SaCas9)的无核酸酶(nuclease-null)变异体(dSaCas9),为了解决上述问题进行了深入研究。结果发现了即使发生缺失也较少影响与向导RNA结合的能力的特定区域,进而通过将规定位置的氨基酸置换为特定氨基酸,成功地制造出DNA结合亲和性得到了维持、甚至增强且小型化的dSaCas9蛋白,从而完成了本发明。
将缺失及置换一并称为变异。
本说明书中,有时将导入变异前的dSaCas9蛋白称为野生型dSaCas9(蛋白)、将导入变异后的dSaCas9蛋白称为dSaCas9变异体(蛋白)。
即,本发明如下所述。
[1]一种蛋白质,其由下述序列构成且具有与向导RNA结合的能力,所述序列包含如下氨基酸序列:
在序列号2所示的氨基酸序列中,具有在第721~755位之间连续的缺失区域,
与该缺失区域分别相邻的氨基酸通过由3~10个氨基酸残基构成的接头连接。
[2]根据上述[1]所述的蛋白质,其中,接头是由甘氨酸(G)及丝氨酸(S)构成的、长度为5~9个氨基酸的接头。
[3]根据上述[2]所述的蛋白质,其中,接头从以下接头中选择:
-SGGGS-、
-GGSGGS-、
-SGSGSGSG-、
-SGSGSGSGS-。
[4]根据上述[1]~[3]中任一项所述的蛋白质,其中,该缺失区域为第721~745位的区域。
[5]根据上述[4]所述的蛋白质,其由序列号4表示。
[6]根据上述[1]~[3]中任一项所述的蛋白质,其中,该缺失区域为第721~755位的区域。
[7]根据上述[6]所述的蛋白质,其由序列号6表示。
[8]根据上述[1]~[7]中任一项所述的蛋白质,其中,进一步地,第45位和/或第163位的谷氨酸(E)被置换为其它氨基酸。
[9]根据上述[8]所述的蛋白质,其中,其它氨基酸为碱性氨基酸。
[10]根据上述[9]所述的蛋白质,其中,碱性氨基酸为赖氨酸(K)。
[11]根据上述[1]~[10]中任一项所述的蛋白质,其中,在序列号2的除实施了变异和/或缺失的位置以外的部位具有80%以上的一致性。
[12]根据上述[1]~[10]中任一项所述的蛋白质,其中,在序列号2的除实施了变异和/或缺失的位置以外的部位置换、缺失、插入和/或添加了1个~几个氨基酸。
[13]根据上述[1]~[12]中任一项所述的蛋白质,其连接有转录调控因子蛋白或结构域。
[14]根据上述[13]所述的蛋白质,其中,转录调控因子为转录活化因子。
[15]根据上述[13]所述的蛋白质,其中,转录调控因子为转录沉默子或转录抑制因子。
[16]一种核酸,其编码上述[1]~[15]中任一项所述的蛋白质。
[17]一种蛋白质-RNA复合物,其具备上述[1]~[16]中任一项所述的蛋白质和包含下述多核苷酸的向导RNA,所述多核苷酸由与靶双链多核苷酸中的PAM(Proto-spacerAdjacent Motif,前间隔序列邻近基序)序列的上游1个碱基至上游20个碱基以上且24个碱基以下的碱基序列互补的碱基序列构成。
[18]一种用于对靶双链多核苷酸进行位点特异性改造的方法,
其具备:
将靶双链多核苷酸、蛋白质和向导RNA混合并进行温育的步骤;以及
上述蛋白质在位于PAM序列的上游的结合位点处对上述靶双链多核苷酸进行改造的步骤,
上述蛋白质为上述[1]~[15]中任一项所述的蛋白质,
上述向导RNA包含下述多核苷酸,所述多核苷酸由与上述靶双链多核苷酸中的上述PAM序列的上游1个碱基至上游20个碱基以上且24个碱基以下的碱基序列互补的碱基序列构成。
[19]一种使细胞的靶基因的表达增加的方法,其包括:使上述[14]所述的蛋白质和针对上述靶基因的一种或两种以上向导RNA在上述细胞内表达。
[20]一种使细胞的靶基因的表达减少的方法,其包括:使上述[15]所述的蛋白质和针对上述靶基因的一种或两种以上向导RNA在上述细胞内表达。
[21]根据上述[19]或[20]所述的方法,其中,细胞为真核细胞。
[22]根据上述[19]或[20]所述的方法,其中,细胞为酵母细胞、植物细胞或动物细胞。
发明效果
根据本发明,能够得到具有向导RNA结合能力、并且进一步小型化的dSaCas9蛋白。利用该小型化的dSaCas9蛋白,能够使更多的基因搭载到容量方面有制限的表达载体中。
附图说明
图1为示意性示出野生型dSaCas9(FL)及dSaCas9变异体(T10~T11)的结构的图。T10中的缺失区域(第721~745位)及T11中的缺失区域(721~755位)的两端通过接头肽(SGGGS)连接。
图2为示出利用使用A2(序列号12)作为crRNA的向导RNA、和在野生型dSaCas9(FL)或dSaCas9变异体(T10、T11)上连接有KRAB基因的分子来抑制MyD88基因表达的结果的图。
图3为示出利用使用A2(序列号12)或A3(序列号13)作为crRNA的向导RNA、和在野生型dSaCas9(FL)或dSaCas9变异体(T10)上连接有KRAB基因的分子来抑制MyD88基因表达的结果的图。
具体实施方式
以下,对本发明进行说明。只要没有特别声明,则本说明书中使用的术语具有该领域中通常使用的含义。
<dSaCas9变异体>
本发明的dSaCas9变异体为具有与向导RNA结合的能力且进一步小型化的dSaCas9蛋白。若使用小型化的dSaCas9变异体,则能够使更多的基因搭载到载体中。
本说明书中,“向导RNA”是指模仿tracrRNA-crRNA的发夹结构的RNA,在其5’末端区域包含由与靶双链多核苷酸中的PAM序列的上游1个碱基至优选20个碱基以上且24个碱基以下、更优选22个碱基以上且24个碱基以下的碱基序列互补的碱基序列构成的多核苷酸。此外,也可以包含1个以上的下述多核苷酸,所述多核苷酸由包含与靶双链多核苷酸不互补的碱基序列、按照以一点为轴对称地形成互补的序列的方式排列、并能够形成发夹结构的碱基序列构成。
向导RNA具有与本发明的dSaCas9变异体结合、并且将该蛋白质引导至靶DNA的功能。向导RNA在其5’末端具有与靶DNA互补的序列,介由该互补的序列与靶DNA结合而将本发明的dSaCas9变异体引导至靶DNA。dSaCas9变异体不具有DNA内切核酸酶,因此与靶DNA进行结合但不会将其切断。
向导RNA可以基于靶DNA的序列信息来设计、制备。具体而言,可列举实施例中使用的序列。
本说明书中,“多肽”、“肽”及“蛋白质”是指氨基酸残基的聚合物,可互换使用。另外,是指一个或两个以上氨基酸为天然存在的对应氨基酸的化学类似物或修饰衍生物的氨基酸聚合物。
本说明书中,“碱性氨基酸”是指赖氨酸、精氨酸、组氨酸等在分子内除具有一个氨基以外还具有显示碱性的残基的氨基酸。
本说明书中,“序列”是指任意长度的核苷酸序列,为脱氧核糖核苷酸或核糖核苷酸,为线状、环状或支链状,为单链或双链。
本说明书中,“PAM序列”是指存在于靶双链多核苷酸中、能够被Cas9蛋白识别的序列,PAM序列的长度、碱基序列根据细菌种类而不同。
需要说明的是,本说明书中,“N”是指选自由腺嘌呤、胞嘧啶、胸腺嘧啶及鸟嘌呤组成的组中的任意的一种碱基,“A”是指腺嘌呤,“G”是指鸟嘌呤,“C”是指胞嘧啶,“T”是指胸腺嘧啶,“R”是指具有嘌呤骨架的碱基(腺嘌呤或鸟嘌呤),“Y”是指具有嘧啶骨架的碱基(胞嘧啶或胸腺嘧啶)。
本说明书中,“多核苷酸”是指为线状或环状构象、为单链或双链中的任一形态的脱氧核糖核苷酸或核糖核苷酸聚合物,对于聚合物的长度,不进行限定性解释。另外,包括天然核苷酸的公知的类似物以及碱基部分、糖部分及磷酸部分中的至少一个部分经修饰的核苷酸(例如硫代磷酸酯骨架)。通常,特定核苷酸的类似物具有相同的碱基配对特异性,例如A的类似物与T进行碱基配对。
本发明提供一种蛋白质(方式1),其由下述序列构成且具有与向导RNA结合的能力,所述序列包含如下氨基酸序列:在序列号2所示的氨基酸序列中,具有在第721~755位之间连续的缺失区域,与该缺失区域分别相邻的氨基酸通过由3~10个氨基酸残基构成的接头连接。
序列号2为dSaCas9蛋白的全长氨基酸序列。dSaCas9蛋白由REC叶(第41~425位的残基)及NUC叶(第1~40位的残基及第435~1053位的残基)这2个叶构成。2个叶通过富含精氨酸的桥状螺旋(BH:第41~73位的残基)和接头环(第426~434位的残基)而连接。NUC叶由RuvC结构域(第1~40位、第435~480位及第650~774位的残基)、HNH结构域(第520~628位的残基)、WED结构域(第788~909位的残基)及PI结构域(第910~1053位的残基)构成。PI结构域分为拓扑异构酶同源(TOPO)结构域和C末端结构域(CTD)。RuvC结构域由3个被隔开的基序(RuvC-I~III)构成,与HNH结构域及PI结构域相关联。HNH结构域介由L1(第481~519位的残基)接头及L2(第629~649位的残基)接头分别与RuvC-II及RuvC-III连接。WED结构域及RucV结构域通过“phosphate lock(磷酸酯锁)”环(第775~787位的残基)连接。
本发明的一个实施方式中,在序列号2所示的氨基酸序列的第721~755位之间连续存在的缺失区域为第721~745位的区域(方式1-1)。
作为方式1-1的蛋白质及编码其的基因,可分别列举由序列号4所示的氨基酸序列构成的蛋白质、及由序列号3所示的碱基序列构成的基因。该蛋白质相当于后述的dSaCas9变异体T10。
本发明的一个实施方式中,在序列号2所示的氨基酸序列的第721~755位之间连续存在的缺失区域为第721~755位的区域(方式1-2)。
作为方式1-2的蛋白质及编码其的基因,可分别列举由序列号6所示的氨基酸序列构成的蛋白质、及由序列号5所示的碱基序列构成的基因。该蛋白质相当于后述的dSaCas9变异体T11。
本发明的另一实施方式中,本发明提供一种蛋白质(方式2),其在上述方式1、1-1及1-2各自的变异的基础上还在第45位和/或第163位具有变异、并且具有与向导RNA结合的能力。
第45位和/或第163位的变异具体而言是谷氨酸置换为碱性氨基酸,优选置换为赖氨酸、精氨酸或组氨酸,更优选置换为赖氨酸。
作为在序列号2所示的氨基酸序列中任选形成“在第721~755位之间连续的缺失区域”时的方法及“将第45位和/或第163位的谷氨酸置换为其它氨基酸”时的方法,可列举如下方法:对于编码规定的氨基酸序列的DNA实施惯用的定点诱变,然后通过常规方法使该DNA表达。在此,作为定点诱变法,可列举例如利用琥珀突变的方法(缺口双链体法、NucleicAcids Res.,12,9441-9456(1984))、利用使用变异导入用引物的PCR的方法等。另外,可使用Q5定点诱变试剂盒(NEB)按照手册简便地实施。
本发明的另一实施方式中,本发明提供与上述方式1、1-1、1-2及2的蛋白质在功能上同等的蛋白质(方式3)。为了与上述方式1、1-1、1-2及2的蛋白质在功能上同等,在序列号2所示的氨基酸序列中,在上述方式1、1-1、1-2及2中除实施了变异的位置以外的部位具有80%以上的序列一致性、且具有与向导RNA结合的能力。在由于变异而使氨基酸存在增减的情况下,该“除实施了变异的位置以外的部位”可理解为“除与实施了变异的位置相当的位置以外的部位”。作为所述一致性,优选为80%以上,更优选为85%以上,进一步优选为90%以上,特别优选为95%以上,最优选为99%以上。氨基酸序列一致性可通过本身公知的方法来确定。例如,氨基酸序列一致性(%)可以使用该领域中惯用的程序(例如BLAST、FASTA等)在初始设定下确定。另外,另外,在另一侧面中,一致性(%)可以使用该领域中公知的任意算法、例如Needleman等(1970)(J.Mol.Biol.48:444-453)、Myers及Miller(CABIOS,1988,4:11-17)的算法等来确定。Needleman等的算法已整合到GCG软件包(可由www.gcg.com获得)的GAP程序中,一致性(%)例如可通过使用BLOSUM 62matrix或PAM250 matrix;以及gapweight:16、14、12、10、8、6或4,及length weight:1、2、3、4、5或6中的任一者来确定。另外,Myers及Miller的算法已整合到作为GCG序列比对软件包的一部分的ALIGN程序中。为了比较氨基酸序列而利用ALIGN程序时,例如可以使用PAM120 weight residue table、gaplength penalty 12、gap penalty 4。
作为与上述方式1、1-1、1-2及2的蛋白质在功能上同等的蛋白质,提供在序列号2所示的氨基酸序列中在上述方式1、1-1、1-2及2中除实施了变异的位置以外的部位置换、缺失、插入和/或添加1个~几个氨基酸、并且具有与向导RNA结合的能力的蛋白质(方式3-1)。在由于变异而使氨基酸存在增减的情况下,该“除实施了变异的位置以外的部位”可以理解为“除与实施了变异的位置相当的位置以外的部位”。
作为人工进行“氨基酸的置换、缺失、插入和/或添加”时的方法,可列举例如以下方法:对于编码规定的氨基酸序列的DNA实施惯用的定点诱变,然后通过常规方法使该DNA表达。在此,作为定点诱变法,可列举例如利用琥珀突变的方法(缺口双链体法、NucleicAcids Res.,12,9441-9456(1984))、利用使用变异导入用引物的PCR的方法等。另外,可以使用Q5定点诱变试剂盒(NEB)按照手册简便地实施。
关于上述所改造的氨基酸的数量,为至少1个残基,具体为1个或几个或其以上。另外,在上述置换、缺失、插入或添加中,特别优选氨基酸的置换。该置换更优选置换为在疏水性、电荷、pK、立体结构上的特征等方面具有类似性质的氨基酸。作为这样的置换,可列举例如i)甘氨酸、丙氨酸;ii)缬氨酸、异亮氨酸、亮氨酸;iii)天冬氨酸、谷氨酸、天冬酰胺、谷氨酰胺;iv)丝氨酸、苏氨酸;v)赖氨酸、精氨酸;vi)苯丙氨酸、酪氨酸中的组内置换。
本发明的dSaCas9变异体由于缺失变异而呈dSaCas9变异体蛋白被切断的状态,该缺失区域的两端通过接头连接。即,本发明的dSaCas9变异体中,位于与缺失区域分别相邻的位置的氨基酸通过由3~10个氨基酸残基构成的接头连接。通过所述连接,本申请发明的dSaCas9变异体具有连续的氨基酸序列。
接头只要能够连接被切断的蛋白质的两端、且不影响其功能就没有特别限定,优选能够采取根据其它蛋白质而自由地改变自身形状并进行结合的天然变性结构的基团,更优选由甘氨酸(G)及丝氨酸(S)构成的长度为5~9个氨基酸的肽残基。具体可列举以下的残基。
-SGGGS-(序列号7)
-GGSGGS-(序列号8)
-SGSGSGSG-(序列号9)
-SGSGSGSGS-(序列号10)
dSaCas9变异体T10及T11中,该接头使用的是-SGGGS-。
接头向各变异体中的导入也可以通过如下方法来实施:对于编码规定的氨基酸序列的DNA实施惯用的定点诱变并插入编码接头的碱基序列,然后通过常规方法使该DNA表达。作为定点诱变法,可列举与上述相同的方法。
本实施方式中的dSaCas9变异体例如可通过如下方法来制作。首先,使用包含编码上述本发明的dSaCas9变异体的核酸的载体转化宿主。接着,对该宿主进行培养而表达上述蛋白质。培养基的组成、培养的温度、时间、诱导物质的添加等条件可以按照公知的方法由本领域技术人员来决定,以使转化体能够生长、高效地产生上述蛋白质。另外,例如在表达载体中整合有作为选择标志物的抗生素抗性基因的情况下,可以向培养基中加入抗生素来选择转化体。接着,通过适宜地利用本身公知的方法来纯化宿主所表达的上述蛋白质,可得到本发明的dSaCas9变异体。
作为宿主,没有特别限定,可列举动物细胞、植物细胞、昆虫细胞或大肠杆菌、枯草杆菌、酵母等微生物。优选动物细胞。
<dSaCas9变异体-向导RNA复合物>
一个实施方式中,本发明提供如下蛋白质-RNA复合物,其具备:上述的<dSaCas9变异体>中所示的蛋白质和包含下述多核苷酸的向导RNA,所述多核苷酸由与靶双链多核苷酸中的PAM(前间隔序列邻近基序)序列的上游1个碱基至上游20个碱基以上且24个碱基以下的碱基序列互补的碱基序列构成。
上述蛋白质及上述向导RNA可通过在体外及体内在温和的条件下混合而形成蛋白质-RNA复合物。温和的条件表示蛋白质不发生分解或变性的程度的温度及pH,温度优选为4℃以上且40℃以下,pH优选为4以上且10以下。
另外,将上述蛋白质及上述向导RNA混合并进行温育的时间优选为0.5小时以上且1小时以下。由上述蛋白质及上述向导RNA形成的复合物稳定,即使在室温下静置数小时也可以保持稳定性。
<CRISPR-Cas载体系统>
一个实施方式中,本发明提供如下CRISPR-Cas载体系统,其具备第一载体和第二载体,所述第一载体包含编码上述的<dSaCas9变异体>中所示的蛋白质的基因;所述第二载体含有包含下述多核苷酸的向导RNA,所述多核苷酸由与靶双链多核苷酸中的PAM序列的上游1个碱基至上游20个碱基以上且24个碱基以下的碱基序列互补的碱基序列构成。
另一实施方式中,本发明提供如下CRISPR-Cas载体系统,其在同一载体中具有编码上述的<dSaCas9变异体>中所示的蛋白质的基因和包含下述多核苷酸的向导RNA,所述多核苷酸由与靶双链多核苷酸中的PAM序列的上游1个碱基至上游20个碱基以上且24个碱基以下的碱基序列互补的碱基序列构成。
向导RNA可以适宜地设计成:在其5’末端区域包含由与靶双链多核苷酸中的PAM序列的上游1个碱基至优选20个碱基以上且24个碱基以下、更优选22个碱基以上且24个碱基以下的碱基序列互补的碱基序列构成的多核苷酸。此外,也可以包含1个以上的下述多核苷酸,所述多核苷酸由包含与靶双链多核苷酸不互补的碱基序列、按照以一点为轴对称地形成互补的序列的方式排列、并能够形成发夹结构的碱基序列构成。
本实施方式的载体优选为表达载体。作为表达载体,可使用例如pBR322、pBR325、pUC12、pUC13等大肠杆菌来源的质粒;pUB110、pTP5、pC194等枯草杆菌来源的质粒;pSH19、pSH15等酵母来源的质粒;λ噬菌体等噬菌体;腺病毒、腺相关病毒、慢病毒、牛痘病毒、杆状病毒、巨细胞病毒等病毒;及对这些进行改造而成的载体等,在考虑体内的基因表达的活化的情况下,优选病毒载体、特别是腺相关病毒。
上述表达载体中,作为上述dSaCas9变异体蛋白及上述向导RNA表达用启动子,没有特别限定,可以使用例如EF1α启动子、SRα启动子、SV40启动子、LTR启动子、CMV(巨细胞病毒)启动子、HSV-tk启动子等用于在动物细胞中表达的启动子、花椰菜花叶病毒(CaMV)的35S启动子、REF(rubber elongation factor,橡胶延长因子)启动子等用于在植物细胞中表达的启动子、多角体启动子、p10启动子等用于在昆虫细胞中表达的启动子等。这些启动子可以根据上述dSaCas9变异体蛋白及上述向导RNA、或表达上述Cas9蛋白及上述向导RNA的细胞的种类来适宜选择。
上述表达载体可以还具有多克隆位点、增强子、剪接信号、多聚腺苷酸化信号、选择标志物(抗药性)及其启动子、复制起点等。
<用于对靶双链多核苷酸进行位点特异性改造的方法>
[第一实施方式]
一个实施方式中,本发明提供用于对靶双链多核苷酸进行位点特异性改造的方法,所述方法具备:
将靶双链多核苷酸、蛋白质和向导RNA混合并进行温育的步骤;和上述蛋白质在位于PAM序列的上游的结合位点处对上述靶双链多核苷酸进行改造的步骤,
上述靶双链多核苷酸具有PAM序列,
上述蛋白质为上述的<dSaCas9变异体>中所示的蛋白质,
上述向导RNA包含下述多核苷酸,所述多核苷酸由与上述靶双链多核苷酸中的上述PAM序列的上游1个碱基至上游20个碱基以上且24个碱基以下的碱基序列互补的碱基序列构成。
本实施方式中,靶双链多核苷酸只要具有PAM序列即可,没有特别限定。
本实施方式中,关于蛋白质及向导RNA,如上述的<dSaCas9变异体>中所示。
以下对用于对靶双链多核苷酸进行位点特异性改造的方法进行详细说明。
首先,将上述蛋白质及上述向导RNA在温和的条件下混合并进行温育。温和的条件如上所述。进行温育的时间优选为0.5小时以上且1小时以下。由上述蛋白质及上述向导RNA形成的复合物稳定,在室温下静置数小时也能够保持稳定性。
然后,上述蛋白质及上述向导RNA在上述靶双链多核苷酸上形成复合物。上述蛋白质识别PAM序列并在位于PAM序列的上游的结合位点处与上述靶双链多核苷酸结合。接着,可以得到在通过上述向导RNA与上述双链多核苷酸的互补性结合而确定的区域中实施了目标改造的靶双链多核苷酸。
本说明书中,“改造”是指靶双链多核苷酸结构上或功能上发生变化。可列举例如通过添加功能性蛋白质、碱基序列而进行的靶双链多核苷酸的结构上或功能上的变化。通过该改造,能够使靶双链多核苷酸的功能发生改变、缺失、增强或抑制而添加新功能。
本发明的dSaCas9变异体不具有内切核酸酶活性,因此该蛋白质能够在位于PAM序列的上游的结合位点处与上述靶双链多核苷酸结合,但止步于此,不会将其切断。因此,例如预先在该蛋白质上融合荧光蛋白(例如GFP)等标记蛋白时,可以介由dSaCas9变异体蛋白-向导RNA使标记蛋白与靶双链多核苷酸结合。通过适宜地选择与dSaCas9变异体结合的物质,能够给靶双链多核苷酸带来多样的功能。
进一步地,可以在dSaCas9变异体蛋白的N末端或C末端连接转录调控因子蛋白或结构域。作为转录调控因子或其结构域,可列举转录活化因子或其结构域(例如VP64、VP160、NF-κB p65)及转录沉默子或其结构域(例如异染色质蛋白1(HP1))或转录抑制因子或其结构域(例如KRUPPEL相关盒(KRAB)、ERF阻抑结构域(ERD)、mSin3A相互作用结构域(SID))。
也可以连接修饰DNA的甲基化状态的酶(例如DNA甲基转移酶(DNMT)、TET)、修饰组蛋白亚基的酶(例如组蛋白乙酰基转移酶(HAT)、组蛋白去乙酰化酶(HDAC)、组蛋白甲基转移酶、组蛋白去甲基化酶)。
[第二实施方式]
本实施方式中,在温育步骤之前可以还具备下述表达步骤:使用上述CRISPR-Cas载体系统表达上述的<dSaCas9变异体>中所示的蛋白质和向导RNA。
本实施方式的表达步骤中,首先使用上述CRISPR-Cas载体系统表达dSaCas9变异体蛋白及向导RNA。作为表达的具体方法,使用包含编码dSaCas9变异体蛋白的基因的表达载体及包含向导RNA的表达载体各载体(或同时包含编码dSaCas9变异体蛋白的基因及向导RNA的表达载体)转化宿主。接着,对该宿主进行培养,表达dSaCas9变异体蛋白及向导RNA。培养基的组成、培养的温度、时间、诱导物质的添加等条件可以按照公知的方法由本领域技术人员决定,以使转化体能够生长、高效地产生融合蛋白。另外,例如在表达载体中整合有作为选择标志物的抗生素抗性基因时,可以向培养基中加入抗生素来选择转化体。接着,通过利用适宜的方法纯化宿主所表达的dSaCas9变异体蛋白及向导RNA,可得到dSaCas9变异体蛋白及向导RNA。
<用于在细胞内对靶双链多核苷酸进行位点特异性改造的方法>
一个实施方式中,本发明提供在细胞内对靶双链多核苷酸进行位点特异性改造的方法,所述方法具备:
将上述CRISPR-Cas载体系统导入到细胞中而表达上述<dSaCas9变异体>中所示的蛋白质和向导RNA的表达步骤;
上述蛋白质在位于PAM序列的上游的结合位点处与上述靶双链多核苷酸结合的步骤;和
得到在通过上述向导RNA与上述靶双链多核苷酸的互补性结合而确定的区域中进行了改造的上述靶双链多核苷酸的步骤,
上述向导RNA包含如下多核苷酸,所述多核苷酸由与上述靶双链多核苷酸中的上述PAM序列的上游1个碱基至上游20个碱基以上且24个碱基以下的碱基序列互补的碱基序列构成。
本实施方式的表达步骤中,首先使用上述CRISPR-Cas载体系统在细胞内表达dSaCas9变异体蛋白及向导RNA。
作为成为本实施方式的方法的适用对象的细胞所来源的生物,可列举例如原核生物、酵母、动物、植物、昆虫等。作为上述动物,没有特别限定,可列举例如人、猴、狗、猫、兔、猪、牛、小鼠、大鼠等,但不限于这些。另外,成为细胞的来源的生物的种类也可以根据期望的靶双链多核苷酸的种类、目的等来任意选择。
作为成为本实施方式的方法的适用对象的动物来源的细胞,可列举例如生殖细胞(精子、卵子等)、构成生物体的体细胞、干细胞、祖细胞、从生物体分离出的癌细胞、从生物体分离且获得了永生能力而在体外稳定维持的细胞(细胞系)、从生物体分离且人工进行了基因改造的细胞、从生物体分离且人工替换了细胞核的细胞等,但不限于这些。
作为构成生物体的体细胞,可列举例如从皮肤、肾脏、脾脏、肾上腺、肝脏、肺、卵巢、胰腺、子宫、胃、结肠、小肠、大肠、膀胱、前列腺、睾丸、胸腺、肌肉、结缔组织、骨、软骨、血管组织、血液、心脏、眼、脑、神经组织等任意组织采集的细胞等,但不限于这些。作为体细胞,更具体而言,可列举例如成纤维细胞、骨髓细胞、免疫细胞(例如B淋巴细胞、T淋巴细胞、嗜中性粒细胞、巨噬细胞、单核细胞等)、红细胞、血小板、骨细胞、骨髓细胞、周皮细胞、树突细胞、角质形成细胞、脂肪细胞、间充质细胞、上皮细胞、表皮细胞、内皮细胞、血管内皮细胞、淋巴管内皮细胞、肝细胞、胰岛细胞(例如α细胞、β细胞、δ细胞、ε细胞、PP细胞等)、软骨细胞、卵丘细胞、胶质细胞、神经细胞(神经元)、少突胶质细胞、小胶质细胞、星形胶质细胞、心肌细胞、食道细胞、肌肉细胞(例如平滑肌细胞、骨骼肌细胞等)、黑色素细胞、单核细胞等,但不限于这些。
干细胞是兼具自我复制能力和分化为其它多系统的细胞的能力的细胞。作为干细胞,可列举例如胚胎干细胞(ES细胞)、胚胎肿瘤细胞、胚胎生殖干细胞、诱导多能干细胞(iPS细胞)、神经干细胞、造血干细胞、间充质干细胞、肝干细胞、胰干细胞、肌干细胞、生殖干细胞、肠干细胞、癌干细胞、毛囊干细胞等,但不限于这些。
癌细胞是指由体细胞派生并获得了无限增殖能力的细胞。作为成为癌细胞的来源的癌症,可列举例如乳腺癌(例如浸润性导管癌、非浸润性导管癌、炎性乳腺癌等)、前列腺癌(例如激素依赖性前列腺癌、激素非依赖性前列腺癌等)、胰腺癌(例如胰管癌等)、胃癌(例如乳头状腺癌、粘液性腺癌、腺鳞状细胞癌等)、肺癌(例如非小细胞肺癌、小细胞肺癌、恶性间皮瘤等)、结肠癌(例如消化道间质瘤等)、直肠癌(例如消化道间质瘤等)、大肠癌(例如家族性大肠癌、遗传性非息肉性大肠癌、消化道间质瘤等)、小肠癌(例如非霍奇金淋巴瘤、消化道间质瘤等)、食道癌、十二指肠癌、舌癌、咽癌(例如上咽癌、中咽癌、下咽癌等)、头颈癌、唾液腺癌、脑瘤(例如松果体星形细胞瘤、毛细胞星形细胞瘤、弥漫性星形细胞瘤、退行性星形细胞瘤等)、神经鞘瘤、肝癌(例如原发性肝癌、肝外胆管癌等)、肾癌(例如肾细胞癌、肾盂和输尿管移行上皮癌等)、胆囊癌、胆管癌、胰腺癌、子宫内膜癌、宫颈癌、卵巢癌(例如上皮性卵巢癌、性腺外生殖细胞瘤、卵巢性生殖细胞瘤、卵巢低恶性度肿瘤等)、膀胱癌、尿道癌、皮肤癌(例如眼内(眼)黑色素瘤、默克细胞癌等)、血管瘤、恶性淋巴瘤(例如网状细胞肉瘤、淋巴肉瘤、霍奇金病等)、黑色素瘤(恶性黑色素瘤)、甲状腺癌(例如甲状腺髓样癌等)、甲状旁腺癌、鼻癌、鼻窦癌、骨癌(例如骨肉瘤、尤文肉瘤、子宫肉瘤、软组织肉瘤等)、转移性髓母细胞瘤、血管纤维瘤、隆突性皮肤纤维肉瘤、视网膜肉瘤、阴茎癌、睾丸肿瘤、小儿实体癌(例如威尔姆氏瘤、小儿肾肿瘤等)、卡波西肉瘤、起因于AIDS的卡波西肉瘤、上颌窦肿瘤、纤维组织细胞瘤、平滑肌肉瘤、横纹肌肉瘤、慢性骨髓增殖性疾病、白血病(例如急性髓系白血病、急性淋巴母细胞性白血病等)等,但不限于这些。
细胞系是通过生物体外的人工操作而获得了无限增殖能力的细胞。作为细胞系,可列举例如HCT116、Huh7、HEK293(人胎肾细胞)、HeLa(人宫颈癌细胞系)、HepG2(人肝癌细胞系)、UT7/TPO(人白血病细胞系)、CHO(中华仓鼠卵巢细胞系)、MDCK、MDBK、BHK、C-33A、HT-29、AE-1、3D9、Ns0/1、Jurkat、NIH3T3、PC12、S2、Sf9、Sf21、High Five、Vero等,但不限于这些。
作为将CRISPR-Cas载体系统导入到细胞中的方法,可通过适合于所用的活细胞的方法来进行,可列举电穿孔法、热休克法、磷酸钙法、脂质体转染法、DEAE葡聚糖法、显微注射法、基因枪法、使用病毒的方法以及使用FuGENE(注册商标)6转染试剂(罗氏公司制)、Lipofectamine 2000试剂(Invitrogen公司制)、Lipofectamine LTX试剂(Invitrogen公司制)、Lipofectamine 3000试剂(Invitrogen公司制)等市售的转染试剂的方法等。
接下来的改造步骤与上述的<用于对靶双链核苷酸进行位点特异性改造的方法>的[第一实施方式]中所示的方法相同。
通过本实施方式的靶双链多核苷酸的修饰,可得到靶双链多核苷酸被改造的细胞。
以下示出实施例来更详细地说明本发明,但是这些并非对本发明的范围进行限定。
实施例
实施例1:dSaCas9变异体的DNA结合亲和性评价
(方法)
1.克隆
使用NEB Q5定点诱变试剂盒,在dSaCas9基因内形成规定的缺失区域,导入编码接头的基因,进一步融合作为转录调控因子的KRAB基因,由此制作各种dSaCas9(图1)。对于这些dSaCas9,使用MYD88基因调查其表达抑制活性。将全部dSaCas9的基因构建物整合到pX601载体中(F.Ann Ran et al.,Nature 2015;520(7546);pp.186-191)。各载体表达以下的dSaCas9构建物(FL、T10、T11)及sgRNA。
[dSaCas9变异体]
FL:野生型dSaCas9(序列号1、序列号2)
T10:在野生型dSaCas9中具有缺失区域(第721~745位)的缺失变异体(序列号3、序列号4)
T11:在野生型dSaCas9中具有缺失区域(第721~755位)的缺失变异体(序列号5、序列号6)
T10及T11中的各缺失区域的两端通过接头肽连接。
dSaCas9构建物以dSaCas9-KRAB-P2A-sfGFP或dSaCas9-VR-P2A-sfGFP的融合蛋白形式进行表达。
[crRNA序列]
C1:ACGGAGGCUAAGCGUCGCAA(序列号11)
A2:GGAGCCACAGUUCUUCCACGG(序列号12)
A3:CUCUACCCUUGAGGUCUCGAG(序列号13)
上述crRNA序列与以下的tracrRNA序列融合形成sgRNA并从载体表达。
[tracrRNA序列]
GUUUUAGUACUCUGGAAACAGAAUCUACUAAAACAAGGCAAAAUGCCGUGUUUAUCACGUCAACUUGUUGGCGAGAUUUUUUU(序列号14)
2.细胞转染
在转染前24小时,将HEK293FT细胞以每孔中75000个细胞的密度播种于24孔板,用添加有10%FBS、2mM的新鲜的L-谷氨酰胺、1mM的丙酮酸钠及非必需氨基酸的DMEM培养基进行培养。使用1.5μl的Lipofectamine 2000(Life technologies),按照操作说明书使各质粒(500ng)转染细胞。为了进行基因表达分析,在转染后48~72小时回收细胞,溶解于RLT缓冲液(Qiagen),使用RNeasy试剂盒(Qiagen)提取总RNA。
3.基因表达分析
为了进行Taqman分析,以20μl的容量使用TaqManTM高容量RNA-to-cDNA试剂盒(Applied Biosystems)由1.5μg的总RNA制备cDNA。将制备的cDNA稀释至20倍,每个Taqman反应使用6.33μl。针对MYD88基因及HPRT基因的Taqman引物及探针由Applied Biosystems获得。在Roche LightCycler 96或LightCycler 480中,使用Taqman基因表达预混物(geneexpression master mix)(ThermoFisher)进行Taqman反应,使用LightCycler 96analysis软件进行分析。
Taqman探针产物IDs:
MYD88:Hs01573837_g1(FAM)
HPRT:Hs99999909_m1(FAM,VIC)
Taqman QPCR条件:
步骤1:95℃10分钟
步骤2:95℃15秒
步骤3:60℃30秒
重复步骤2和3:40次
(结果)
利用表达pX601-dSaCas9(FL、T10或T11)-KRAB-P2A-sfGFP及sgRNA-A2的质粒载体转染HEK293FT细胞。使用利用表达sgRNA-C1来代替表达sgRNA-A2的载体进行转染而得的细胞作为阴性对照。3天后回收所转染的细胞,单离出RNA。通过Taqman分析来分析MYD88基因的表达,利用HPRT基因的表达水平来标准化。将MYD88基因的相对表达水平的结果示于图2。
本发明的dSaCas9变异体的MYD88基因表达水平与对照相比降低至与野生型dSaCas9匹敌的程度。该结果表明,本发明的dSaCas9变异体尽管存在缺失区域、与全长的dSaCas9相比缩小,但是与向导RNA结合的能力以及DNA结合亲和性得到了维持。
上述结果中,特别是对于DNA结合亲和性高的T10,即使改变向导RNA也确认到其结合亲和性。利用表达pX601-dSaCas9(FL或T10)-KRAB-P2A-sfGFP及sgRNA-A2或sgRNA-A3的质粒载体转染HEK293FT细胞。使用利用不表达dSaCas9变异体的载体转染而得的细胞作为阴性对照。3天后回收所转染的细胞,单离出RNA。利用Taqman分析来分析MYD88基因的表达,利用HPRT基因的表达水平进行标准化。将MYD88基因的相对表达水平的结果示于图3。在使用sgRNA-A3的情况下也确认到与sgRNA-A2同样的效果。
本发明的dSaCas9变异体中的MYD88基因表达水平与对照相比降低至与野生型dSaCas9匹敌的程度。该结果表明,本发明的dSaCas9变异体尽管存在缺失区域、与全长的dSaCas9相比缩小,但是与向导RNA结合的能力以及DNA结合亲和性得到了维持。
产业上的可利用性
根据本发明,能够得到保持了DNA结合亲和性且小型化的dSaCas9蛋白。小型化的dSaCas9蛋白的使用使更多的基因向载体中的搭载成为可能,因此能够提供多样的基因组编辑技术。
本申请以在美国提出的美国临时专利申请第62/749855(申请日2018年10月24日)为基础,其内容全部包含在本说明书中。
序列表
<110> 摩大力斯医疗株式会社
<120> 经改造的Cas9蛋白及其用途
<130> 092961
<150> US62/749855
<151> 2018-10-24
<160> 14
<170> PatentIn version 3.5
<210> 1
<211> 3162
<212> DNA
<213> 金黄色葡萄球菌(Staphylococcus aureus)
<220>
<221> CDS
<222> (1)..(3162)
<400> 1
atg aag cgg aac tac atc ctg ggc ctg gcc atc ggc atc acc agc gtg 48
Met Lys Arg Asn Tyr Ile Leu Gly Leu Ala Ile Gly Ile Thr Ser Val
1 5 10 15
ggc tac ggc atc atc gac tac gag aca cgg gac gtg atc gat gcc ggc 96
Gly Tyr Gly Ile Ile Asp Tyr Glu Thr Arg Asp Val Ile Asp Ala Gly
20 25 30
gtg cgg ctg ttc aaa gag gcc aac gtg gaa aac aac gag ggc agg cgg 144
Val Arg Leu Phe Lys Glu Ala Asn Val Glu Asn Asn Glu Gly Arg Arg
35 40 45
agc aag aga ggc gcc aga agg ctg aag cgg cgg agg cgg cat aga atc 192
Ser Lys Arg Gly Ala Arg Arg Leu Lys Arg Arg Arg Arg His Arg Ile
50 55 60
cag aga gtg aag aag ctg ctg ttc gac tac aac ctg ctg acc gac cac 240
Gln Arg Val Lys Lys Leu Leu Phe Asp Tyr Asn Leu Leu Thr Asp His
65 70 75 80
agc gag ctg agc ggc atc aac ccc tac gag gcc aga gtg aag ggc ctg 288
Ser Glu Leu Ser Gly Ile Asn Pro Tyr Glu Ala Arg Val Lys Gly Leu
85 90 95
agc cag aag ctg agc gag gaa gag ttc tct gcc gcc ctg ctg cac ctg 336
Ser Gln Lys Leu Ser Glu Glu Glu Phe Ser Ala Ala Leu Leu His Leu
100 105 110
gcc aag aga aga ggc gtg cac aac gtg aac gag gtg gaa gag gac acc 384
Ala Lys Arg Arg Gly Val His Asn Val Asn Glu Val Glu Glu Asp Thr
115 120 125
ggc aac gag ctg tcc acc aaa gag cag atc agc cgg aac agc aag gcc 432
Gly Asn Glu Leu Ser Thr Lys Glu Gln Ile Ser Arg Asn Ser Lys Ala
130 135 140
ctg gaa gag aaa tac gtg gcc gaa ctg cag ctg gaa cgg ctg aag aaa 480
Leu Glu Glu Lys Tyr Val Ala Glu Leu Gln Leu Glu Arg Leu Lys Lys
145 150 155 160
gac ggc gaa gtg cgg ggc agc atc aac aga ttc aag acc agc gac tac 528
Asp Gly Glu Val Arg Gly Ser Ile Asn Arg Phe Lys Thr Ser Asp Tyr
165 170 175
gtg aaa gaa gcc aaa cag ctg ctg aag gtg cag aag gcc tac cac cag 576
Val Lys Glu Ala Lys Gln Leu Leu Lys Val Gln Lys Ala Tyr His Gln
180 185 190
ctg gac cag agc ttc atc gac acc tac atc gac ctg ctg gaa acc cgg 624
Leu Asp Gln Ser Phe Ile Asp Thr Tyr Ile Asp Leu Leu Glu Thr Arg
195 200 205
cgg acc tac tat gag gga cct ggc gag ggc agc ccc ttc ggc tgg aag 672
Arg Thr Tyr Tyr Glu Gly Pro Gly Glu Gly Ser Pro Phe Gly Trp Lys
210 215 220
gac atc aaa gaa tgg tac gag atg ctg atg ggc cac tgc acc tac ttc 720
Asp Ile Lys Glu Trp Tyr Glu Met Leu Met Gly His Cys Thr Tyr Phe
225 230 235 240
ccc gag gaa ctg cgg agc gtg aag tac gcc tac aac gcc gac ctg tac 768
Pro Glu Glu Leu Arg Ser Val Lys Tyr Ala Tyr Asn Ala Asp Leu Tyr
245 250 255
aac gcc ctg aac gac ctg aac aat ctc gtg atc acc agg gac gag aac 816
Asn Ala Leu Asn Asp Leu Asn Asn Leu Val Ile Thr Arg Asp Glu Asn
260 265 270
gag aag ctg gaa tat tac gag aag ttc cag atc atc gag aac gtg ttc 864
Glu Lys Leu Glu Tyr Tyr Glu Lys Phe Gln Ile Ile Glu Asn Val Phe
275 280 285
aag cag aag aag aag ccc acc ctg aag cag atc gcc aaa gaa atc ctc 912
Lys Gln Lys Lys Lys Pro Thr Leu Lys Gln Ile Ala Lys Glu Ile Leu
290 295 300
gtg aac gaa gag gat att aag ggc tac aga gtg acc agc acc ggc aag 960
Val Asn Glu Glu Asp Ile Lys Gly Tyr Arg Val Thr Ser Thr Gly Lys
305 310 315 320
ccc gag ttc acc aac ctg aag gtg tac cac gac atc aag gac att acc 1008
Pro Glu Phe Thr Asn Leu Lys Val Tyr His Asp Ile Lys Asp Ile Thr
325 330 335
gcc cgg aaa gag att att gag aac gcc gag ctg ctg gat cag att gcc 1056
Ala Arg Lys Glu Ile Ile Glu Asn Ala Glu Leu Leu Asp Gln Ile Ala
340 345 350
aag atc ctg acc atc tac cag agc agc gag gac atc cag gaa gaa ctg 1104
Lys Ile Leu Thr Ile Tyr Gln Ser Ser Glu Asp Ile Gln Glu Glu Leu
355 360 365
acc aat ctg aac tcc gag ctg acc cag gaa gag atc gag cag atc tct 1152
Thr Asn Leu Asn Ser Glu Leu Thr Gln Glu Glu Ile Glu Gln Ile Ser
370 375 380
aat ctg aag ggc tat acc ggc acc cac aac ctg agc ctg aag gcc atc 1200
Asn Leu Lys Gly Tyr Thr Gly Thr His Asn Leu Ser Leu Lys Ala Ile
385 390 395 400
aac ctg atc ctg gac gag ctg tgg cac acc aac gac aac cag atc gct 1248
Asn Leu Ile Leu Asp Glu Leu Trp His Thr Asn Asp Asn Gln Ile Ala
405 410 415
atc ttc aac cgg ctg aag ctg gtg ccc aag aag gtg gac ctg tcc cag 1296
Ile Phe Asn Arg Leu Lys Leu Val Pro Lys Lys Val Asp Leu Ser Gln
420 425 430
cag aaa gag atc ccc acc acc ctg gtg gac gac ttc atc ctg agc ccc 1344
Gln Lys Glu Ile Pro Thr Thr Leu Val Asp Asp Phe Ile Leu Ser Pro
435 440 445
gtc gtg aag aga agc ttc atc cag agc atc aaa gtg atc aac gcc atc 1392
Val Val Lys Arg Ser Phe Ile Gln Ser Ile Lys Val Ile Asn Ala Ile
450 455 460
atc aag aag tac ggc ctg ccc aac gac atc att atc gag ctg gcc cgc 1440
Ile Lys Lys Tyr Gly Leu Pro Asn Asp Ile Ile Ile Glu Leu Ala Arg
465 470 475 480
gag aag aac tcc aag gac gcc cag aaa atg atc aac gag atg cag aag 1488
Glu Lys Asn Ser Lys Asp Ala Gln Lys Met Ile Asn Glu Met Gln Lys
485 490 495
cgg aac cgg cag acc aac gag cgg atc gag gaa atc atc cgg acc acc 1536
Arg Asn Arg Gln Thr Asn Glu Arg Ile Glu Glu Ile Ile Arg Thr Thr
500 505 510
ggc aaa gag aac gcc aag tac ctg atc gag aag atc aag ctg cac gac 1584
Gly Lys Glu Asn Ala Lys Tyr Leu Ile Glu Lys Ile Lys Leu His Asp
515 520 525
atg cag gaa ggc aag tgc ctg tac agc ctg gaa gcc atc cct ctg gaa 1632
Met Gln Glu Gly Lys Cys Leu Tyr Ser Leu Glu Ala Ile Pro Leu Glu
530 535 540
gat ctg ctg aac aac ccc ttc aac tat gag gtg gac cac atc atc ccc 1680
Asp Leu Leu Asn Asn Pro Phe Asn Tyr Glu Val Asp His Ile Ile Pro
545 550 555 560
aga agc gtg tcc ttc gac aac agc ttc aac aac aag gtg ctc gtg aag 1728
Arg Ser Val Ser Phe Asp Asn Ser Phe Asn Asn Lys Val Leu Val Lys
565 570 575
cag gaa gaa gcc agc aag aag ggc aac cgg acc cca ttc cag tac ctg 1776
Gln Glu Glu Ala Ser Lys Lys Gly Asn Arg Thr Pro Phe Gln Tyr Leu
580 585 590
agc agc agc gac agc aag atc agc tac gaa acc ttc aag aag cac atc 1824
Ser Ser Ser Asp Ser Lys Ile Ser Tyr Glu Thr Phe Lys Lys His Ile
595 600 605
ctg aat ctg gcc aag ggc aag ggc aga atc agc aag acc aag aaa gag 1872
Leu Asn Leu Ala Lys Gly Lys Gly Arg Ile Ser Lys Thr Lys Lys Glu
610 615 620
tat ctg ctg gaa gaa cgg gac atc aac agg ttc tcc gtg cag aaa gac 1920
Tyr Leu Leu Glu Glu Arg Asp Ile Asn Arg Phe Ser Val Gln Lys Asp
625 630 635 640
ttc atc aac cgg aac ctg gtg gat acc aga tac gcc acc aga ggc ctg 1968
Phe Ile Asn Arg Asn Leu Val Asp Thr Arg Tyr Ala Thr Arg Gly Leu
645 650 655
atg aac ctg ctg cgg agc tac ttc aga gtg aac aac ctg gac gtg aaa 2016
Met Asn Leu Leu Arg Ser Tyr Phe Arg Val Asn Asn Leu Asp Val Lys
660 665 670
gtg aag tcc atc aat ggc ggc ttc acc agc ttt ctg cgg cgg aag tgg 2064
Val Lys Ser Ile Asn Gly Gly Phe Thr Ser Phe Leu Arg Arg Lys Trp
675 680 685
aag ttt aag aaa gag cgg aac aag ggg tac aag cac cac gcc gag gac 2112
Lys Phe Lys Lys Glu Arg Asn Lys Gly Tyr Lys His His Ala Glu Asp
690 695 700
gcc ctg atc att gcc aac gcc gat ttc atc ttc aaa gag tgg aag aaa 2160
Ala Leu Ile Ile Ala Asn Ala Asp Phe Ile Phe Lys Glu Trp Lys Lys
705 710 715 720
ctg gac aag gcc aaa aaa gtg atg gaa aac cag atg ttc gag gaa aag 2208
Leu Asp Lys Ala Lys Lys Val Met Glu Asn Gln Met Phe Glu Glu Lys
725 730 735
cag gcc gag agc atg ccc gag atc gaa acc gag cag gag tac aaa gag 2256
Gln Ala Glu Ser Met Pro Glu Ile Glu Thr Glu Gln Glu Tyr Lys Glu
740 745 750
atc ttc atc acc ccc cac cag atc aag cac att aag gac ttc aag gac 2304
Ile Phe Ile Thr Pro His Gln Ile Lys His Ile Lys Asp Phe Lys Asp
755 760 765
tac aag tac agc cac cgg gtg gac aag aag cct aat aga gag ctg att 2352
Tyr Lys Tyr Ser His Arg Val Asp Lys Lys Pro Asn Arg Glu Leu Ile
770 775 780
aac gac acc ctg tac tcc acc cgg aag gac gac aag ggc aac acc ctg 2400
Asn Asp Thr Leu Tyr Ser Thr Arg Lys Asp Asp Lys Gly Asn Thr Leu
785 790 795 800
atc gtg aac aat ctg aac ggc ctg tac gac aag gac aat gac aag ctg 2448
Ile Val Asn Asn Leu Asn Gly Leu Tyr Asp Lys Asp Asn Asp Lys Leu
805 810 815
aaa aag ctg atc aac aag agc ccc gaa aag ctg ctg atg tac cac cac 2496
Lys Lys Leu Ile Asn Lys Ser Pro Glu Lys Leu Leu Met Tyr His His
820 825 830
gac ccc cag acc tac cag aaa ctg aag ctg att atg gaa cag tac ggc 2544
Asp Pro Gln Thr Tyr Gln Lys Leu Lys Leu Ile Met Glu Gln Tyr Gly
835 840 845
gac gag aag aat ccc ctg tac aag tac tac gag gaa acc ggg aac tac 2592
Asp Glu Lys Asn Pro Leu Tyr Lys Tyr Tyr Glu Glu Thr Gly Asn Tyr
850 855 860
ctg acc aag tac tcc aaa aag gac aac ggc ccc gtg atc aag aag att 2640
Leu Thr Lys Tyr Ser Lys Lys Asp Asn Gly Pro Val Ile Lys Lys Ile
865 870 875 880
aag tat tac ggc aac aaa ctg aac gcc cat ctg gac atc acc gac gac 2688
Lys Tyr Tyr Gly Asn Lys Leu Asn Ala His Leu Asp Ile Thr Asp Asp
885 890 895
tac ccc aac agc aga aac aag gtc gtg aag ctg tcc ctg aag ccc tac 2736
Tyr Pro Asn Ser Arg Asn Lys Val Val Lys Leu Ser Leu Lys Pro Tyr
900 905 910
aga ttc gac gtg tac ctg gac aat ggc gtg tac aag ttc gtg acc gtg 2784
Arg Phe Asp Val Tyr Leu Asp Asn Gly Val Tyr Lys Phe Val Thr Val
915 920 925
aag aat ctg gat gtg atc aaa aaa gaa aac tac tac gaa gtg aat agc 2832
Lys Asn Leu Asp Val Ile Lys Lys Glu Asn Tyr Tyr Glu Val Asn Ser
930 935 940
aag tgc tat gag gaa gct aag aag ctg aag aag atc agc aac cag gcc 2880
Lys Cys Tyr Glu Glu Ala Lys Lys Leu Lys Lys Ile Ser Asn Gln Ala
945 950 955 960
gag ttt atc gcc tcc ttc tac aac aac gat ctg atc aag atc aac ggc 2928
Glu Phe Ile Ala Ser Phe Tyr Asn Asn Asp Leu Ile Lys Ile Asn Gly
965 970 975
gag ctg tat aga gtg atc ggc gtg aac aac gac ctg ctg aac cgg atc 2976
Glu Leu Tyr Arg Val Ile Gly Val Asn Asn Asp Leu Leu Asn Arg Ile
980 985 990
gaa gtg aac atg atc gac atc acc tac cgc gag tac ctg gaa aac atg 3024
Glu Val Asn Met Ile Asp Ile Thr Tyr Arg Glu Tyr Leu Glu Asn Met
995 1000 1005
aac gac aag agg ccc ccc agg atc att aag aca atc gcc tcc aag 3069
Asn Asp Lys Arg Pro Pro Arg Ile Ile Lys Thr Ile Ala Ser Lys
1010 1015 1020
acc cag agc att aag aag tac agc aca gac att ctg ggc aac ctg 3114
Thr Gln Ser Ile Lys Lys Tyr Ser Thr Asp Ile Leu Gly Asn Leu
1025 1030 1035
tat gaa gtg aaa tct aag aag cac cct cag atc atc aaa aag ggc 3159
Tyr Glu Val Lys Ser Lys Lys His Pro Gln Ile Ile Lys Lys Gly
1040 1045 1050
taa 3162
<210> 2
<211> 1053
<212> PRT
<213> 金黄色葡萄球菌(Staphylococcus aureus)
<400> 2
Met Lys Arg Asn Tyr Ile Leu Gly Leu Ala Ile Gly Ile Thr Ser Val
1 5 10 15
Gly Tyr Gly Ile Ile Asp Tyr Glu Thr Arg Asp Val Ile Asp Ala Gly
20 25 30
Val Arg Leu Phe Lys Glu Ala Asn Val Glu Asn Asn Glu Gly Arg Arg
35 40 45
Ser Lys Arg Gly Ala Arg Arg Leu Lys Arg Arg Arg Arg His Arg Ile
50 55 60
Gln Arg Val Lys Lys Leu Leu Phe Asp Tyr Asn Leu Leu Thr Asp His
65 70 75 80
Ser Glu Leu Ser Gly Ile Asn Pro Tyr Glu Ala Arg Val Lys Gly Leu
85 90 95
Ser Gln Lys Leu Ser Glu Glu Glu Phe Ser Ala Ala Leu Leu His Leu
100 105 110
Ala Lys Arg Arg Gly Val His Asn Val Asn Glu Val Glu Glu Asp Thr
115 120 125
Gly Asn Glu Leu Ser Thr Lys Glu Gln Ile Ser Arg Asn Ser Lys Ala
130 135 140
Leu Glu Glu Lys Tyr Val Ala Glu Leu Gln Leu Glu Arg Leu Lys Lys
145 150 155 160
Asp Gly Glu Val Arg Gly Ser Ile Asn Arg Phe Lys Thr Ser Asp Tyr
165 170 175
Val Lys Glu Ala Lys Gln Leu Leu Lys Val Gln Lys Ala Tyr His Gln
180 185 190
Leu Asp Gln Ser Phe Ile Asp Thr Tyr Ile Asp Leu Leu Glu Thr Arg
195 200 205
Arg Thr Tyr Tyr Glu Gly Pro Gly Glu Gly Ser Pro Phe Gly Trp Lys
210 215 220
Asp Ile Lys Glu Trp Tyr Glu Met Leu Met Gly His Cys Thr Tyr Phe
225 230 235 240
Pro Glu Glu Leu Arg Ser Val Lys Tyr Ala Tyr Asn Ala Asp Leu Tyr
245 250 255
Asn Ala Leu Asn Asp Leu Asn Asn Leu Val Ile Thr Arg Asp Glu Asn
260 265 270
Glu Lys Leu Glu Tyr Tyr Glu Lys Phe Gln Ile Ile Glu Asn Val Phe
275 280 285
Lys Gln Lys Lys Lys Pro Thr Leu Lys Gln Ile Ala Lys Glu Ile Leu
290 295 300
Val Asn Glu Glu Asp Ile Lys Gly Tyr Arg Val Thr Ser Thr Gly Lys
305 310 315 320
Pro Glu Phe Thr Asn Leu Lys Val Tyr His Asp Ile Lys Asp Ile Thr
325 330 335
Ala Arg Lys Glu Ile Ile Glu Asn Ala Glu Leu Leu Asp Gln Ile Ala
340 345 350
Lys Ile Leu Thr Ile Tyr Gln Ser Ser Glu Asp Ile Gln Glu Glu Leu
355 360 365
Thr Asn Leu Asn Ser Glu Leu Thr Gln Glu Glu Ile Glu Gln Ile Ser
370 375 380
Asn Leu Lys Gly Tyr Thr Gly Thr His Asn Leu Ser Leu Lys Ala Ile
385 390 395 400
Asn Leu Ile Leu Asp Glu Leu Trp His Thr Asn Asp Asn Gln Ile Ala
405 410 415
Ile Phe Asn Arg Leu Lys Leu Val Pro Lys Lys Val Asp Leu Ser Gln
420 425 430
Gln Lys Glu Ile Pro Thr Thr Leu Val Asp Asp Phe Ile Leu Ser Pro
435 440 445
Val Val Lys Arg Ser Phe Ile Gln Ser Ile Lys Val Ile Asn Ala Ile
450 455 460
Ile Lys Lys Tyr Gly Leu Pro Asn Asp Ile Ile Ile Glu Leu Ala Arg
465 470 475 480
Glu Lys Asn Ser Lys Asp Ala Gln Lys Met Ile Asn Glu Met Gln Lys
485 490 495
Arg Asn Arg Gln Thr Asn Glu Arg Ile Glu Glu Ile Ile Arg Thr Thr
500 505 510
Gly Lys Glu Asn Ala Lys Tyr Leu Ile Glu Lys Ile Lys Leu His Asp
515 520 525
Met Gln Glu Gly Lys Cys Leu Tyr Ser Leu Glu Ala Ile Pro Leu Glu
530 535 540
Asp Leu Leu Asn Asn Pro Phe Asn Tyr Glu Val Asp His Ile Ile Pro
545 550 555 560
Arg Ser Val Ser Phe Asp Asn Ser Phe Asn Asn Lys Val Leu Val Lys
565 570 575
Gln Glu Glu Ala Ser Lys Lys Gly Asn Arg Thr Pro Phe Gln Tyr Leu
580 585 590
Ser Ser Ser Asp Ser Lys Ile Ser Tyr Glu Thr Phe Lys Lys His Ile
595 600 605
Leu Asn Leu Ala Lys Gly Lys Gly Arg Ile Ser Lys Thr Lys Lys Glu
610 615 620
Tyr Leu Leu Glu Glu Arg Asp Ile Asn Arg Phe Ser Val Gln Lys Asp
625 630 635 640
Phe Ile Asn Arg Asn Leu Val Asp Thr Arg Tyr Ala Thr Arg Gly Leu
645 650 655
Met Asn Leu Leu Arg Ser Tyr Phe Arg Val Asn Asn Leu Asp Val Lys
660 665 670
Val Lys Ser Ile Asn Gly Gly Phe Thr Ser Phe Leu Arg Arg Lys Trp
675 680 685
Lys Phe Lys Lys Glu Arg Asn Lys Gly Tyr Lys His His Ala Glu Asp
690 695 700
Ala Leu Ile Ile Ala Asn Ala Asp Phe Ile Phe Lys Glu Trp Lys Lys
705 710 715 720
Leu Asp Lys Ala Lys Lys Val Met Glu Asn Gln Met Phe Glu Glu Lys
725 730 735
Gln Ala Glu Ser Met Pro Glu Ile Glu Thr Glu Gln Glu Tyr Lys Glu
740 745 750
Ile Phe Ile Thr Pro His Gln Ile Lys His Ile Lys Asp Phe Lys Asp
755 760 765
Tyr Lys Tyr Ser His Arg Val Asp Lys Lys Pro Asn Arg Glu Leu Ile
770 775 780
Asn Asp Thr Leu Tyr Ser Thr Arg Lys Asp Asp Lys Gly Asn Thr Leu
785 790 795 800
Ile Val Asn Asn Leu Asn Gly Leu Tyr Asp Lys Asp Asn Asp Lys Leu
805 810 815
Lys Lys Leu Ile Asn Lys Ser Pro Glu Lys Leu Leu Met Tyr His His
820 825 830
Asp Pro Gln Thr Tyr Gln Lys Leu Lys Leu Ile Met Glu Gln Tyr Gly
835 840 845
Asp Glu Lys Asn Pro Leu Tyr Lys Tyr Tyr Glu Glu Thr Gly Asn Tyr
850 855 860
Leu Thr Lys Tyr Ser Lys Lys Asp Asn Gly Pro Val Ile Lys Lys Ile
865 870 875 880
Lys Tyr Tyr Gly Asn Lys Leu Asn Ala His Leu Asp Ile Thr Asp Asp
885 890 895
Tyr Pro Asn Ser Arg Asn Lys Val Val Lys Leu Ser Leu Lys Pro Tyr
900 905 910
Arg Phe Asp Val Tyr Leu Asp Asn Gly Val Tyr Lys Phe Val Thr Val
915 920 925
Lys Asn Leu Asp Val Ile Lys Lys Glu Asn Tyr Tyr Glu Val Asn Ser
930 935 940
Lys Cys Tyr Glu Glu Ala Lys Lys Leu Lys Lys Ile Ser Asn Gln Ala
945 950 955 960
Glu Phe Ile Ala Ser Phe Tyr Asn Asn Asp Leu Ile Lys Ile Asn Gly
965 970 975
Glu Leu Tyr Arg Val Ile Gly Val Asn Asn Asp Leu Leu Asn Arg Ile
980 985 990
Glu Val Asn Met Ile Asp Ile Thr Tyr Arg Glu Tyr Leu Glu Asn Met
995 1000 1005
Asn Asp Lys Arg Pro Pro Arg Ile Ile Lys Thr Ile Ala Ser Lys
1010 1015 1020
Thr Gln Ser Ile Lys Lys Tyr Ser Thr Asp Ile Leu Gly Asn Leu
1025 1030 1035
Tyr Glu Val Lys Ser Lys Lys His Pro Gln Ile Ile Lys Lys Gly
1040 1045 1050
<210> 3
<211> 3102
<212> DNA
<213> 金黄色葡萄球菌(Staphylococcus aureus)
<220>
<221> CDS
<222> (1)..(3102)
<400> 3
atg aag cgg aac tac atc ctg ggc ctg gcc atc ggc atc acc agc gtg 48
Met Lys Arg Asn Tyr Ile Leu Gly Leu Ala Ile Gly Ile Thr Ser Val
1 5 10 15
ggc tac ggc atc atc gac tac gag aca cgg gac gtg atc gat gcc ggc 96
Gly Tyr Gly Ile Ile Asp Tyr Glu Thr Arg Asp Val Ile Asp Ala Gly
20 25 30
gtg cgg ctg ttc aaa gag gcc aac gtg gaa aac aac gag ggc agg cgg 144
Val Arg Leu Phe Lys Glu Ala Asn Val Glu Asn Asn Glu Gly Arg Arg
35 40 45
agc aag aga ggc gcc aga agg ctg aag cgg cgg agg cgg cat aga atc 192
Ser Lys Arg Gly Ala Arg Arg Leu Lys Arg Arg Arg Arg His Arg Ile
50 55 60
cag aga gtg aag aag ctg ctg ttc gac tac aac ctg ctg acc gac cac 240
Gln Arg Val Lys Lys Leu Leu Phe Asp Tyr Asn Leu Leu Thr Asp His
65 70 75 80
agc gag ctg agc ggc atc aac ccc tac gag gcc aga gtg aag ggc ctg 288
Ser Glu Leu Ser Gly Ile Asn Pro Tyr Glu Ala Arg Val Lys Gly Leu
85 90 95
agc cag aag ctg agc gag gaa gag ttc tct gcc gcc ctg ctg cac ctg 336
Ser Gln Lys Leu Ser Glu Glu Glu Phe Ser Ala Ala Leu Leu His Leu
100 105 110
gcc aag aga aga ggc gtg cac aac gtg aac gag gtg gaa gag gac acc 384
Ala Lys Arg Arg Gly Val His Asn Val Asn Glu Val Glu Glu Asp Thr
115 120 125
ggc aac gag ctg tcc acc aaa gag cag atc agc cgg aac agc aag gcc 432
Gly Asn Glu Leu Ser Thr Lys Glu Gln Ile Ser Arg Asn Ser Lys Ala
130 135 140
ctg gaa gag aaa tac gtg gcc gaa ctg cag ctg gaa cgg ctg aag aaa 480
Leu Glu Glu Lys Tyr Val Ala Glu Leu Gln Leu Glu Arg Leu Lys Lys
145 150 155 160
gac ggc gaa gtg cgg ggc agc atc aac aga ttc aag acc agc gac tac 528
Asp Gly Glu Val Arg Gly Ser Ile Asn Arg Phe Lys Thr Ser Asp Tyr
165 170 175
gtg aaa gaa gcc aaa cag ctg ctg aag gtg cag aag gcc tac cac cag 576
Val Lys Glu Ala Lys Gln Leu Leu Lys Val Gln Lys Ala Tyr His Gln
180 185 190
ctg gac cag agc ttc atc gac acc tac atc gac ctg ctg gaa acc cgg 624
Leu Asp Gln Ser Phe Ile Asp Thr Tyr Ile Asp Leu Leu Glu Thr Arg
195 200 205
cgg acc tac tat gag gga cct ggc gag ggc agc ccc ttc ggc tgg aag 672
Arg Thr Tyr Tyr Glu Gly Pro Gly Glu Gly Ser Pro Phe Gly Trp Lys
210 215 220
gac atc aaa gaa tgg tac gag atg ctg atg ggc cac tgc acc tac ttc 720
Asp Ile Lys Glu Trp Tyr Glu Met Leu Met Gly His Cys Thr Tyr Phe
225 230 235 240
ccc gag gaa ctg cgg agc gtg aag tac gcc tac aac gcc gac ctg tac 768
Pro Glu Glu Leu Arg Ser Val Lys Tyr Ala Tyr Asn Ala Asp Leu Tyr
245 250 255
aac gcc ctg aac gac ctg aac aat ctc gtg atc acc agg gac gag aac 816
Asn Ala Leu Asn Asp Leu Asn Asn Leu Val Ile Thr Arg Asp Glu Asn
260 265 270
gag aag ctg gaa tat tac gag aag ttc cag atc atc gag aac gtg ttc 864
Glu Lys Leu Glu Tyr Tyr Glu Lys Phe Gln Ile Ile Glu Asn Val Phe
275 280 285
aag cag aag aag aag ccc acc ctg aag cag atc gcc aaa gaa atc ctc 912
Lys Gln Lys Lys Lys Pro Thr Leu Lys Gln Ile Ala Lys Glu Ile Leu
290 295 300
gtg aac gaa gag gat att aag ggc tac aga gtg acc agc acc ggc aag 960
Val Asn Glu Glu Asp Ile Lys Gly Tyr Arg Val Thr Ser Thr Gly Lys
305 310 315 320
ccc gag ttc acc aac ctg aag gtg tac cac gac atc aag gac att acc 1008
Pro Glu Phe Thr Asn Leu Lys Val Tyr His Asp Ile Lys Asp Ile Thr
325 330 335
gcc cgg aaa gag att att gag aac gcc gag ctg ctg gat cag att gcc 1056
Ala Arg Lys Glu Ile Ile Glu Asn Ala Glu Leu Leu Asp Gln Ile Ala
340 345 350
aag atc ctg acc atc tac cag agc agc gag gac atc cag gaa gaa ctg 1104
Lys Ile Leu Thr Ile Tyr Gln Ser Ser Glu Asp Ile Gln Glu Glu Leu
355 360 365
acc aat ctg aac tcc gag ctg acc cag gaa gag atc gag cag atc tct 1152
Thr Asn Leu Asn Ser Glu Leu Thr Gln Glu Glu Ile Glu Gln Ile Ser
370 375 380
aat ctg aag ggc tat acc ggc acc cac aac ctg agc ctg aag gcc atc 1200
Asn Leu Lys Gly Tyr Thr Gly Thr His Asn Leu Ser Leu Lys Ala Ile
385 390 395 400
aac ctg atc ctg gac gag ctg tgg cac acc aac gac aac cag atc gct 1248
Asn Leu Ile Leu Asp Glu Leu Trp His Thr Asn Asp Asn Gln Ile Ala
405 410 415
atc ttc aac cgg ctg aag ctg gtg ccc aag aag gtg gac ctg tcc cag 1296
Ile Phe Asn Arg Leu Lys Leu Val Pro Lys Lys Val Asp Leu Ser Gln
420 425 430
cag aaa gag atc ccc acc acc ctg gtg gac gac ttc atc ctg agc ccc 1344
Gln Lys Glu Ile Pro Thr Thr Leu Val Asp Asp Phe Ile Leu Ser Pro
435 440 445
gtc gtg aag aga agc ttc atc cag agc atc aaa gtg atc aac gcc atc 1392
Val Val Lys Arg Ser Phe Ile Gln Ser Ile Lys Val Ile Asn Ala Ile
450 455 460
atc aag aag tac ggc ctg ccc aac gac atc att atc gag ctg gcc cgc 1440
Ile Lys Lys Tyr Gly Leu Pro Asn Asp Ile Ile Ile Glu Leu Ala Arg
465 470 475 480
gag aag aac tcc aag gac gcc cag aaa atg atc aac gag atg cag aag 1488
Glu Lys Asn Ser Lys Asp Ala Gln Lys Met Ile Asn Glu Met Gln Lys
485 490 495
cgg aac cgg cag acc aac gag cgg atc gag gaa atc atc cgg acc acc 1536
Arg Asn Arg Gln Thr Asn Glu Arg Ile Glu Glu Ile Ile Arg Thr Thr
500 505 510
ggc aaa gag aac gcc aag tac ctg atc gag aag atc aag ctg cac gac 1584
Gly Lys Glu Asn Ala Lys Tyr Leu Ile Glu Lys Ile Lys Leu His Asp
515 520 525
atg cag gaa ggc aag tgc ctg tac agc ctg gaa gcc atc cct ctg gaa 1632
Met Gln Glu Gly Lys Cys Leu Tyr Ser Leu Glu Ala Ile Pro Leu Glu
530 535 540
gat ctg ctg aac aac ccc ttc aac tat gag gtg gac cac atc atc ccc 1680
Asp Leu Leu Asn Asn Pro Phe Asn Tyr Glu Val Asp His Ile Ile Pro
545 550 555 560
aga agc gtg tcc ttc gac aac agc ttc aac aac aag gtg ctc gtg aag 1728
Arg Ser Val Ser Phe Asp Asn Ser Phe Asn Asn Lys Val Leu Val Lys
565 570 575
cag gaa gaa gcc agc aag aag ggc aac cgg acc cca ttc cag tac ctg 1776
Gln Glu Glu Ala Ser Lys Lys Gly Asn Arg Thr Pro Phe Gln Tyr Leu
580 585 590
agc agc agc gac agc aag atc agc tac gaa acc ttc aag aag cac atc 1824
Ser Ser Ser Asp Ser Lys Ile Ser Tyr Glu Thr Phe Lys Lys His Ile
595 600 605
ctg aat ctg gcc aag ggc aag ggc aga atc agc aag acc aag aaa gag 1872
Leu Asn Leu Ala Lys Gly Lys Gly Arg Ile Ser Lys Thr Lys Lys Glu
610 615 620
tat ctg ctg gaa gaa cgg gac atc aac agg ttc tcc gtg cag aaa gac 1920
Tyr Leu Leu Glu Glu Arg Asp Ile Asn Arg Phe Ser Val Gln Lys Asp
625 630 635 640
ttc atc aac cgg aac ctg gtg gat acc aga tac gcc acc aga ggc ctg 1968
Phe Ile Asn Arg Asn Leu Val Asp Thr Arg Tyr Ala Thr Arg Gly Leu
645 650 655
atg aac ctg ctg cgg agc tac ttc aga gtg aac aac ctg gac gtg aaa 2016
Met Asn Leu Leu Arg Ser Tyr Phe Arg Val Asn Asn Leu Asp Val Lys
660 665 670
gtg aag tcc atc aat ggc ggc ttc acc agc ttt ctg cgg cgg aag tgg 2064
Val Lys Ser Ile Asn Gly Gly Phe Thr Ser Phe Leu Arg Arg Lys Trp
675 680 685
aag ttt aag aaa gag cgg aac aag ggg tac aag cac cac gcc gag gac 2112
Lys Phe Lys Lys Glu Arg Asn Lys Gly Tyr Lys His His Ala Glu Asp
690 695 700
gcc ctg atc att gcc aac gcc gat ttc atc ttc aaa gag tgg aag aaa 2160
Ala Leu Ile Ile Ala Asn Ala Asp Phe Ile Phe Lys Glu Trp Lys Lys
705 710 715 720
tcc ggc ggc ggt tcg acc gag cag gag tac aaa gag atc ttc atc acc 2208
Ser Gly Gly Gly Ser Thr Glu Gln Glu Tyr Lys Glu Ile Phe Ile Thr
725 730 735
ccc cac cag atc aag cac att aag gac ttc aag gac tac aag tac agc 2256
Pro His Gln Ile Lys His Ile Lys Asp Phe Lys Asp Tyr Lys Tyr Ser
740 745 750
cac cgg gtg gac aag aag cct aat aga gag ctg att aac gac acc ctg 2304
His Arg Val Asp Lys Lys Pro Asn Arg Glu Leu Ile Asn Asp Thr Leu
755 760 765
tac tcc acc cgg aag gac gac aag ggc aac acc ctg atc gtg aac aat 2352
Tyr Ser Thr Arg Lys Asp Asp Lys Gly Asn Thr Leu Ile Val Asn Asn
770 775 780
ctg aac ggc ctg tac gac aag gac aat gac aag ctg aaa aag ctg atc 2400
Leu Asn Gly Leu Tyr Asp Lys Asp Asn Asp Lys Leu Lys Lys Leu Ile
785 790 795 800
aac aag agc ccc gaa aag ctg ctg atg tac cac cac gac ccc cag acc 2448
Asn Lys Ser Pro Glu Lys Leu Leu Met Tyr His His Asp Pro Gln Thr
805 810 815
tac cag aaa ctg aag ctg att atg gaa cag tac ggc gac gag aag aat 2496
Tyr Gln Lys Leu Lys Leu Ile Met Glu Gln Tyr Gly Asp Glu Lys Asn
820 825 830
ccc ctg tac aag tac tac gag gaa acc ggg aac tac ctg acc aag tac 2544
Pro Leu Tyr Lys Tyr Tyr Glu Glu Thr Gly Asn Tyr Leu Thr Lys Tyr
835 840 845
tcc aaa aag gac aac ggc ccc gtg atc aag aag att aag tat tac ggc 2592
Ser Lys Lys Asp Asn Gly Pro Val Ile Lys Lys Ile Lys Tyr Tyr Gly
850 855 860
aac aaa ctg aac gcc cat ctg gac atc acc gac gac tac ccc aac agc 2640
Asn Lys Leu Asn Ala His Leu Asp Ile Thr Asp Asp Tyr Pro Asn Ser
865 870 875 880
aga aac aag gtc gtg aag ctg tcc ctg aag ccc tac aga ttc gac gtg 2688
Arg Asn Lys Val Val Lys Leu Ser Leu Lys Pro Tyr Arg Phe Asp Val
885 890 895
tac ctg gac aat ggc gtg tac aag ttc gtg acc gtg aag aat ctg gat 2736
Tyr Leu Asp Asn Gly Val Tyr Lys Phe Val Thr Val Lys Asn Leu Asp
900 905 910
gtg atc aaa aaa gaa aac tac tac gaa gtg aat agc aag tgc tat gag 2784
Val Ile Lys Lys Glu Asn Tyr Tyr Glu Val Asn Ser Lys Cys Tyr Glu
915 920 925
gaa gct aag aag ctg aag aag atc agc aac cag gcc gag ttt atc gcc 2832
Glu Ala Lys Lys Leu Lys Lys Ile Ser Asn Gln Ala Glu Phe Ile Ala
930 935 940
tcc ttc tac aac aac gat ctg atc aag atc aac ggc gag ctg tat aga 2880
Ser Phe Tyr Asn Asn Asp Leu Ile Lys Ile Asn Gly Glu Leu Tyr Arg
945 950 955 960
gtg atc ggc gtg aac aac gac ctg ctg aac cgg atc gaa gtg aac atg 2928
Val Ile Gly Val Asn Asn Asp Leu Leu Asn Arg Ile Glu Val Asn Met
965 970 975
atc gac atc acc tac cgc gag tac ctg gaa aac atg aac gac aag agg 2976
Ile Asp Ile Thr Tyr Arg Glu Tyr Leu Glu Asn Met Asn Asp Lys Arg
980 985 990
ccc ccc agg atc att aag aca atc gcc tcc aag acc cag agc att aag 3024
Pro Pro Arg Ile Ile Lys Thr Ile Ala Ser Lys Thr Gln Ser Ile Lys
995 1000 1005
aag tac agc aca gac att ctg ggc aac ctg tat gaa gtg aaa tct 3069
Lys Tyr Ser Thr Asp Ile Leu Gly Asn Leu Tyr Glu Val Lys Ser
1010 1015 1020
aag aag cac cct cag atc atc aaa aag ggc taa 3102
Lys Lys His Pro Gln Ile Ile Lys Lys Gly
1025 1030
<210> 4
<211> 1033
<212> PRT
<213> 金黄色葡萄球菌(Staphylococcus aureus)
<400> 4
Met Lys Arg Asn Tyr Ile Leu Gly Leu Ala Ile Gly Ile Thr Ser Val
1 5 10 15
Gly Tyr Gly Ile Ile Asp Tyr Glu Thr Arg Asp Val Ile Asp Ala Gly
20 25 30
Val Arg Leu Phe Lys Glu Ala Asn Val Glu Asn Asn Glu Gly Arg Arg
35 40 45
Ser Lys Arg Gly Ala Arg Arg Leu Lys Arg Arg Arg Arg His Arg Ile
50 55 60
Gln Arg Val Lys Lys Leu Leu Phe Asp Tyr Asn Leu Leu Thr Asp His
65 70 75 80
Ser Glu Leu Ser Gly Ile Asn Pro Tyr Glu Ala Arg Val Lys Gly Leu
85 90 95
Ser Gln Lys Leu Ser Glu Glu Glu Phe Ser Ala Ala Leu Leu His Leu
100 105 110
Ala Lys Arg Arg Gly Val His Asn Val Asn Glu Val Glu Glu Asp Thr
115 120 125
Gly Asn Glu Leu Ser Thr Lys Glu Gln Ile Ser Arg Asn Ser Lys Ala
130 135 140
Leu Glu Glu Lys Tyr Val Ala Glu Leu Gln Leu Glu Arg Leu Lys Lys
145 150 155 160
Asp Gly Glu Val Arg Gly Ser Ile Asn Arg Phe Lys Thr Ser Asp Tyr
165 170 175
Val Lys Glu Ala Lys Gln Leu Leu Lys Val Gln Lys Ala Tyr His Gln
180 185 190
Leu Asp Gln Ser Phe Ile Asp Thr Tyr Ile Asp Leu Leu Glu Thr Arg
195 200 205
Arg Thr Tyr Tyr Glu Gly Pro Gly Glu Gly Ser Pro Phe Gly Trp Lys
210 215 220
Asp Ile Lys Glu Trp Tyr Glu Met Leu Met Gly His Cys Thr Tyr Phe
225 230 235 240
Pro Glu Glu Leu Arg Ser Val Lys Tyr Ala Tyr Asn Ala Asp Leu Tyr
245 250 255
Asn Ala Leu Asn Asp Leu Asn Asn Leu Val Ile Thr Arg Asp Glu Asn
260 265 270
Glu Lys Leu Glu Tyr Tyr Glu Lys Phe Gln Ile Ile Glu Asn Val Phe
275 280 285
Lys Gln Lys Lys Lys Pro Thr Leu Lys Gln Ile Ala Lys Glu Ile Leu
290 295 300
Val Asn Glu Glu Asp Ile Lys Gly Tyr Arg Val Thr Ser Thr Gly Lys
305 310 315 320
Pro Glu Phe Thr Asn Leu Lys Val Tyr His Asp Ile Lys Asp Ile Thr
325 330 335
Ala Arg Lys Glu Ile Ile Glu Asn Ala Glu Leu Leu Asp Gln Ile Ala
340 345 350
Lys Ile Leu Thr Ile Tyr Gln Ser Ser Glu Asp Ile Gln Glu Glu Leu
355 360 365
Thr Asn Leu Asn Ser Glu Leu Thr Gln Glu Glu Ile Glu Gln Ile Ser
370 375 380
Asn Leu Lys Gly Tyr Thr Gly Thr His Asn Leu Ser Leu Lys Ala Ile
385 390 395 400
Asn Leu Ile Leu Asp Glu Leu Trp His Thr Asn Asp Asn Gln Ile Ala
405 410 415
Ile Phe Asn Arg Leu Lys Leu Val Pro Lys Lys Val Asp Leu Ser Gln
420 425 430
Gln Lys Glu Ile Pro Thr Thr Leu Val Asp Asp Phe Ile Leu Ser Pro
435 440 445
Val Val Lys Arg Ser Phe Ile Gln Ser Ile Lys Val Ile Asn Ala Ile
450 455 460
Ile Lys Lys Tyr Gly Leu Pro Asn Asp Ile Ile Ile Glu Leu Ala Arg
465 470 475 480
Glu Lys Asn Ser Lys Asp Ala Gln Lys Met Ile Asn Glu Met Gln Lys
485 490 495
Arg Asn Arg Gln Thr Asn Glu Arg Ile Glu Glu Ile Ile Arg Thr Thr
500 505 510
Gly Lys Glu Asn Ala Lys Tyr Leu Ile Glu Lys Ile Lys Leu His Asp
515 520 525
Met Gln Glu Gly Lys Cys Leu Tyr Ser Leu Glu Ala Ile Pro Leu Glu
530 535 540
Asp Leu Leu Asn Asn Pro Phe Asn Tyr Glu Val Asp His Ile Ile Pro
545 550 555 560
Arg Ser Val Ser Phe Asp Asn Ser Phe Asn Asn Lys Val Leu Val Lys
565 570 575
Gln Glu Glu Ala Ser Lys Lys Gly Asn Arg Thr Pro Phe Gln Tyr Leu
580 585 590
Ser Ser Ser Asp Ser Lys Ile Ser Tyr Glu Thr Phe Lys Lys His Ile
595 600 605
Leu Asn Leu Ala Lys Gly Lys Gly Arg Ile Ser Lys Thr Lys Lys Glu
610 615 620
Tyr Leu Leu Glu Glu Arg Asp Ile Asn Arg Phe Ser Val Gln Lys Asp
625 630 635 640
Phe Ile Asn Arg Asn Leu Val Asp Thr Arg Tyr Ala Thr Arg Gly Leu
645 650 655
Met Asn Leu Leu Arg Ser Tyr Phe Arg Val Asn Asn Leu Asp Val Lys
660 665 670
Val Lys Ser Ile Asn Gly Gly Phe Thr Ser Phe Leu Arg Arg Lys Trp
675 680 685
Lys Phe Lys Lys Glu Arg Asn Lys Gly Tyr Lys His His Ala Glu Asp
690 695 700
Ala Leu Ile Ile Ala Asn Ala Asp Phe Ile Phe Lys Glu Trp Lys Lys
705 710 715 720
Ser Gly Gly Gly Ser Thr Glu Gln Glu Tyr Lys Glu Ile Phe Ile Thr
725 730 735
Pro His Gln Ile Lys His Ile Lys Asp Phe Lys Asp Tyr Lys Tyr Ser
740 745 750
His Arg Val Asp Lys Lys Pro Asn Arg Glu Leu Ile Asn Asp Thr Leu
755 760 765
Tyr Ser Thr Arg Lys Asp Asp Lys Gly Asn Thr Leu Ile Val Asn Asn
770 775 780
Leu Asn Gly Leu Tyr Asp Lys Asp Asn Asp Lys Leu Lys Lys Leu Ile
785 790 795 800
Asn Lys Ser Pro Glu Lys Leu Leu Met Tyr His His Asp Pro Gln Thr
805 810 815
Tyr Gln Lys Leu Lys Leu Ile Met Glu Gln Tyr Gly Asp Glu Lys Asn
820 825 830
Pro Leu Tyr Lys Tyr Tyr Glu Glu Thr Gly Asn Tyr Leu Thr Lys Tyr
835 840 845
Ser Lys Lys Asp Asn Gly Pro Val Ile Lys Lys Ile Lys Tyr Tyr Gly
850 855 860
Asn Lys Leu Asn Ala His Leu Asp Ile Thr Asp Asp Tyr Pro Asn Ser
865 870 875 880
Arg Asn Lys Val Val Lys Leu Ser Leu Lys Pro Tyr Arg Phe Asp Val
885 890 895
Tyr Leu Asp Asn Gly Val Tyr Lys Phe Val Thr Val Lys Asn Leu Asp
900 905 910
Val Ile Lys Lys Glu Asn Tyr Tyr Glu Val Asn Ser Lys Cys Tyr Glu
915 920 925
Glu Ala Lys Lys Leu Lys Lys Ile Ser Asn Gln Ala Glu Phe Ile Ala
930 935 940
Ser Phe Tyr Asn Asn Asp Leu Ile Lys Ile Asn Gly Glu Leu Tyr Arg
945 950 955 960
Val Ile Gly Val Asn Asn Asp Leu Leu Asn Arg Ile Glu Val Asn Met
965 970 975
Ile Asp Ile Thr Tyr Arg Glu Tyr Leu Glu Asn Met Asn Asp Lys Arg
980 985 990
Pro Pro Arg Ile Ile Lys Thr Ile Ala Ser Lys Thr Gln Ser Ile Lys
995 1000 1005
Lys Tyr Ser Thr Asp Ile Leu Gly Asn Leu Tyr Glu Val Lys Ser
1010 1015 1020
Lys Lys His Pro Gln Ile Ile Lys Lys Gly
1025 1030
<210> 5
<211> 3072
<212> DNA
<213> 金黄色葡萄球菌(Staphylococcus aureus)
<220>
<221> CDS
<222> (1)..(3072)
<400> 5
atg aag cgg aac tac atc ctg ggc ctg gcc atc ggc atc acc agc gtg 48
Met Lys Arg Asn Tyr Ile Leu Gly Leu Ala Ile Gly Ile Thr Ser Val
1 5 10 15
ggc tac ggc atc atc gac tac gag aca cgg gac gtg atc gat gcc ggc 96
Gly Tyr Gly Ile Ile Asp Tyr Glu Thr Arg Asp Val Ile Asp Ala Gly
20 25 30
gtg cgg ctg ttc aaa gag gcc aac gtg gaa aac aac gag ggc agg cgg 144
Val Arg Leu Phe Lys Glu Ala Asn Val Glu Asn Asn Glu Gly Arg Arg
35 40 45
agc aag aga ggc gcc aga agg ctg aag cgg cgg agg cgg cat aga atc 192
Ser Lys Arg Gly Ala Arg Arg Leu Lys Arg Arg Arg Arg His Arg Ile
50 55 60
cag aga gtg aag aag ctg ctg ttc gac tac aac ctg ctg acc gac cac 240
Gln Arg Val Lys Lys Leu Leu Phe Asp Tyr Asn Leu Leu Thr Asp His
65 70 75 80
agc gag ctg agc ggc atc aac ccc tac gag gcc aga gtg aag ggc ctg 288
Ser Glu Leu Ser Gly Ile Asn Pro Tyr Glu Ala Arg Val Lys Gly Leu
85 90 95
agc cag aag ctg agc gag gaa gag ttc tct gcc gcc ctg ctg cac ctg 336
Ser Gln Lys Leu Ser Glu Glu Glu Phe Ser Ala Ala Leu Leu His Leu
100 105 110
gcc aag aga aga ggc gtg cac aac gtg aac gag gtg gaa gag gac acc 384
Ala Lys Arg Arg Gly Val His Asn Val Asn Glu Val Glu Glu Asp Thr
115 120 125
ggc aac gag ctg tcc acc aaa gag cag atc agc cgg aac agc aag gcc 432
Gly Asn Glu Leu Ser Thr Lys Glu Gln Ile Ser Arg Asn Ser Lys Ala
130 135 140
ctg gaa gag aaa tac gtg gcc gaa ctg cag ctg gaa cgg ctg aag aaa 480
Leu Glu Glu Lys Tyr Val Ala Glu Leu Gln Leu Glu Arg Leu Lys Lys
145 150 155 160
gac ggc gaa gtg cgg ggc agc atc aac aga ttc aag acc agc gac tac 528
Asp Gly Glu Val Arg Gly Ser Ile Asn Arg Phe Lys Thr Ser Asp Tyr
165 170 175
gtg aaa gaa gcc aaa cag ctg ctg aag gtg cag aag gcc tac cac cag 576
Val Lys Glu Ala Lys Gln Leu Leu Lys Val Gln Lys Ala Tyr His Gln
180 185 190
ctg gac cag agc ttc atc gac acc tac atc gac ctg ctg gaa acc cgg 624
Leu Asp Gln Ser Phe Ile Asp Thr Tyr Ile Asp Leu Leu Glu Thr Arg
195 200 205
cgg acc tac tat gag gga cct ggc gag ggc agc ccc ttc ggc tgg aag 672
Arg Thr Tyr Tyr Glu Gly Pro Gly Glu Gly Ser Pro Phe Gly Trp Lys
210 215 220
gac atc aaa gaa tgg tac gag atg ctg atg ggc cac tgc acc tac ttc 720
Asp Ile Lys Glu Trp Tyr Glu Met Leu Met Gly His Cys Thr Tyr Phe
225 230 235 240
ccc gag gaa ctg cgg agc gtg aag tac gcc tac aac gcc gac ctg tac 768
Pro Glu Glu Leu Arg Ser Val Lys Tyr Ala Tyr Asn Ala Asp Leu Tyr
245 250 255
aac gcc ctg aac gac ctg aac aat ctc gtg atc acc agg gac gag aac 816
Asn Ala Leu Asn Asp Leu Asn Asn Leu Val Ile Thr Arg Asp Glu Asn
260 265 270
gag aag ctg gaa tat tac gag aag ttc cag atc atc gag aac gtg ttc 864
Glu Lys Leu Glu Tyr Tyr Glu Lys Phe Gln Ile Ile Glu Asn Val Phe
275 280 285
aag cag aag aag aag ccc acc ctg aag cag atc gcc aaa gaa atc ctc 912
Lys Gln Lys Lys Lys Pro Thr Leu Lys Gln Ile Ala Lys Glu Ile Leu
290 295 300
gtg aac gaa gag gat att aag ggc tac aga gtg acc agc acc ggc aag 960
Val Asn Glu Glu Asp Ile Lys Gly Tyr Arg Val Thr Ser Thr Gly Lys
305 310 315 320
ccc gag ttc acc aac ctg aag gtg tac cac gac atc aag gac att acc 1008
Pro Glu Phe Thr Asn Leu Lys Val Tyr His Asp Ile Lys Asp Ile Thr
325 330 335
gcc cgg aaa gag att att gag aac gcc gag ctg ctg gat cag att gcc 1056
Ala Arg Lys Glu Ile Ile Glu Asn Ala Glu Leu Leu Asp Gln Ile Ala
340 345 350
aag atc ctg acc atc tac cag agc agc gag gac atc cag gaa gaa ctg 1104
Lys Ile Leu Thr Ile Tyr Gln Ser Ser Glu Asp Ile Gln Glu Glu Leu
355 360 365
acc aat ctg aac tcc gag ctg acc cag gaa gag atc gag cag atc tct 1152
Thr Asn Leu Asn Ser Glu Leu Thr Gln Glu Glu Ile Glu Gln Ile Ser
370 375 380
aat ctg aag ggc tat acc ggc acc cac aac ctg agc ctg aag gcc atc 1200
Asn Leu Lys Gly Tyr Thr Gly Thr His Asn Leu Ser Leu Lys Ala Ile
385 390 395 400
aac ctg atc ctg gac gag ctg tgg cac acc aac gac aac cag atc gct 1248
Asn Leu Ile Leu Asp Glu Leu Trp His Thr Asn Asp Asn Gln Ile Ala
405 410 415
atc ttc aac cgg ctg aag ctg gtg ccc aag aag gtg gac ctg tcc cag 1296
Ile Phe Asn Arg Leu Lys Leu Val Pro Lys Lys Val Asp Leu Ser Gln
420 425 430
cag aaa gag atc ccc acc acc ctg gtg gac gac ttc atc ctg agc ccc 1344
Gln Lys Glu Ile Pro Thr Thr Leu Val Asp Asp Phe Ile Leu Ser Pro
435 440 445
gtc gtg aag aga agc ttc atc cag agc atc aaa gtg atc aac gcc atc 1392
Val Val Lys Arg Ser Phe Ile Gln Ser Ile Lys Val Ile Asn Ala Ile
450 455 460
atc aag aag tac ggc ctg ccc aac gac atc att atc gag ctg gcc cgc 1440
Ile Lys Lys Tyr Gly Leu Pro Asn Asp Ile Ile Ile Glu Leu Ala Arg
465 470 475 480
gag aag aac tcc aag gac gcc cag aaa atg atc aac gag atg cag aag 1488
Glu Lys Asn Ser Lys Asp Ala Gln Lys Met Ile Asn Glu Met Gln Lys
485 490 495
cgg aac cgg cag acc aac gag cgg atc gag gaa atc atc cgg acc acc 1536
Arg Asn Arg Gln Thr Asn Glu Arg Ile Glu Glu Ile Ile Arg Thr Thr
500 505 510
ggc aaa gag aac gcc aag tac ctg atc gag aag atc aag ctg cac gac 1584
Gly Lys Glu Asn Ala Lys Tyr Leu Ile Glu Lys Ile Lys Leu His Asp
515 520 525
atg cag gaa ggc aag tgc ctg tac agc ctg gaa gcc atc cct ctg gaa 1632
Met Gln Glu Gly Lys Cys Leu Tyr Ser Leu Glu Ala Ile Pro Leu Glu
530 535 540
gat ctg ctg aac aac ccc ttc aac tat gag gtg gac cac atc atc ccc 1680
Asp Leu Leu Asn Asn Pro Phe Asn Tyr Glu Val Asp His Ile Ile Pro
545 550 555 560
aga agc gtg tcc ttc gac aac agc ttc aac aac aag gtg ctc gtg aag 1728
Arg Ser Val Ser Phe Asp Asn Ser Phe Asn Asn Lys Val Leu Val Lys
565 570 575
cag gaa gaa gcc agc aag aag ggc aac cgg acc cca ttc cag tac ctg 1776
Gln Glu Glu Ala Ser Lys Lys Gly Asn Arg Thr Pro Phe Gln Tyr Leu
580 585 590
agc agc agc gac agc aag atc agc tac gaa acc ttc aag aag cac atc 1824
Ser Ser Ser Asp Ser Lys Ile Ser Tyr Glu Thr Phe Lys Lys His Ile
595 600 605
ctg aat ctg gcc aag ggc aag ggc aga atc agc aag acc aag aaa gag 1872
Leu Asn Leu Ala Lys Gly Lys Gly Arg Ile Ser Lys Thr Lys Lys Glu
610 615 620
tat ctg ctg gaa gaa cgg gac atc aac agg ttc tcc gtg cag aaa gac 1920
Tyr Leu Leu Glu Glu Arg Asp Ile Asn Arg Phe Ser Val Gln Lys Asp
625 630 635 640
ttc atc aac cgg aac ctg gtg gat acc aga tac gcc acc aga ggc ctg 1968
Phe Ile Asn Arg Asn Leu Val Asp Thr Arg Tyr Ala Thr Arg Gly Leu
645 650 655
atg aac ctg ctg cgg agc tac ttc aga gtg aac aac ctg gac gtg aaa 2016
Met Asn Leu Leu Arg Ser Tyr Phe Arg Val Asn Asn Leu Asp Val Lys
660 665 670
gtg aag tcc atc aat ggc ggc ttc acc agc ttt ctg cgg cgg aag tgg 2064
Val Lys Ser Ile Asn Gly Gly Phe Thr Ser Phe Leu Arg Arg Lys Trp
675 680 685
aag ttt aag aaa gag cgg aac aag ggg tac aag cac cac gcc gag gac 2112
Lys Phe Lys Lys Glu Arg Asn Lys Gly Tyr Lys His His Ala Glu Asp
690 695 700
gcc ctg atc att gcc aac gcc gat ttc atc ttc aaa gag tgg aag aaa 2160
Ala Leu Ile Ile Ala Asn Ala Asp Phe Ile Phe Lys Glu Trp Lys Lys
705 710 715 720
tcc ggc ggc ggt tcg acc ccc cac cag atc aag cac att aag gac ttc 2208
Ser Gly Gly Gly Ser Thr Pro His Gln Ile Lys His Ile Lys Asp Phe
725 730 735
aag gac tac aag tac agc cac cgg gtg gac aag aag cct aat aga gag 2256
Lys Asp Tyr Lys Tyr Ser His Arg Val Asp Lys Lys Pro Asn Arg Glu
740 745 750
ctg att aac gac acc ctg tac tcc acc cgg aag gac gac aag ggc aac 2304
Leu Ile Asn Asp Thr Leu Tyr Ser Thr Arg Lys Asp Asp Lys Gly Asn
755 760 765
acc ctg atc gtg aac aat ctg aac ggc ctg tac gac aag gac aat gac 2352
Thr Leu Ile Val Asn Asn Leu Asn Gly Leu Tyr Asp Lys Asp Asn Asp
770 775 780
aag ctg aaa aag ctg atc aac aag agc ccc gaa aag ctg ctg atg tac 2400
Lys Leu Lys Lys Leu Ile Asn Lys Ser Pro Glu Lys Leu Leu Met Tyr
785 790 795 800
cac cac gac ccc cag acc tac cag aaa ctg aag ctg att atg gaa cag 2448
His His Asp Pro Gln Thr Tyr Gln Lys Leu Lys Leu Ile Met Glu Gln
805 810 815
tac ggc gac gag aag aat ccc ctg tac aag tac tac gag gaa acc ggg 2496
Tyr Gly Asp Glu Lys Asn Pro Leu Tyr Lys Tyr Tyr Glu Glu Thr Gly
820 825 830
aac tac ctg acc aag tac tcc aaa aag gac aac ggc ccc gtg atc aag 2544
Asn Tyr Leu Thr Lys Tyr Ser Lys Lys Asp Asn Gly Pro Val Ile Lys
835 840 845
aag att aag tat tac ggc aac aaa ctg aac gcc cat ctg gac atc acc 2592
Lys Ile Lys Tyr Tyr Gly Asn Lys Leu Asn Ala His Leu Asp Ile Thr
850 855 860
gac gac tac ccc aac agc aga aac aag gtc gtg aag ctg tcc ctg aag 2640
Asp Asp Tyr Pro Asn Ser Arg Asn Lys Val Val Lys Leu Ser Leu Lys
865 870 875 880
ccc tac aga ttc gac gtg tac ctg gac aat ggc gtg tac aag ttc gtg 2688
Pro Tyr Arg Phe Asp Val Tyr Leu Asp Asn Gly Val Tyr Lys Phe Val
885 890 895
acc gtg aag aat ctg gat gtg atc aaa aaa gaa aac tac tac gaa gtg 2736
Thr Val Lys Asn Leu Asp Val Ile Lys Lys Glu Asn Tyr Tyr Glu Val
900 905 910
aat agc aag tgc tat gag gaa gct aag aag ctg aag aag atc agc aac 2784
Asn Ser Lys Cys Tyr Glu Glu Ala Lys Lys Leu Lys Lys Ile Ser Asn
915 920 925
cag gcc gag ttt atc gcc tcc ttc tac aac aac gat ctg atc aag atc 2832
Gln Ala Glu Phe Ile Ala Ser Phe Tyr Asn Asn Asp Leu Ile Lys Ile
930 935 940
aac ggc gag ctg tat aga gtg atc ggc gtg aac aac gac ctg ctg aac 2880
Asn Gly Glu Leu Tyr Arg Val Ile Gly Val Asn Asn Asp Leu Leu Asn
945 950 955 960
cgg atc gaa gtg aac atg atc gac atc acc tac cgc gag tac ctg gaa 2928
Arg Ile Glu Val Asn Met Ile Asp Ile Thr Tyr Arg Glu Tyr Leu Glu
965 970 975
aac atg aac gac aag agg ccc ccc agg atc att aag aca atc gcc tcc 2976
Asn Met Asn Asp Lys Arg Pro Pro Arg Ile Ile Lys Thr Ile Ala Ser
980 985 990
aag acc cag agc att aag aag tac agc aca gac att ctg ggc aac ctg 3024
Lys Thr Gln Ser Ile Lys Lys Tyr Ser Thr Asp Ile Leu Gly Asn Leu
995 1000 1005
tat gaa gtg aaa tct aag aag cac cct cag atc atc aaa aag ggc 3069
Tyr Glu Val Lys Ser Lys Lys His Pro Gln Ile Ile Lys Lys Gly
1010 1015 1020
taa 3072
<210> 6
<211> 1023
<212> PRT
<213> 金黄色葡萄球菌(Staphylococcus aureus)
<400> 6
Met Lys Arg Asn Tyr Ile Leu Gly Leu Ala Ile Gly Ile Thr Ser Val
1 5 10 15
Gly Tyr Gly Ile Ile Asp Tyr Glu Thr Arg Asp Val Ile Asp Ala Gly
20 25 30
Val Arg Leu Phe Lys Glu Ala Asn Val Glu Asn Asn Glu Gly Arg Arg
35 40 45
Ser Lys Arg Gly Ala Arg Arg Leu Lys Arg Arg Arg Arg His Arg Ile
50 55 60
Gln Arg Val Lys Lys Leu Leu Phe Asp Tyr Asn Leu Leu Thr Asp His
65 70 75 80
Ser Glu Leu Ser Gly Ile Asn Pro Tyr Glu Ala Arg Val Lys Gly Leu
85 90 95
Ser Gln Lys Leu Ser Glu Glu Glu Phe Ser Ala Ala Leu Leu His Leu
100 105 110
Ala Lys Arg Arg Gly Val His Asn Val Asn Glu Val Glu Glu Asp Thr
115 120 125
Gly Asn Glu Leu Ser Thr Lys Glu Gln Ile Ser Arg Asn Ser Lys Ala
130 135 140
Leu Glu Glu Lys Tyr Val Ala Glu Leu Gln Leu Glu Arg Leu Lys Lys
145 150 155 160
Asp Gly Glu Val Arg Gly Ser Ile Asn Arg Phe Lys Thr Ser Asp Tyr
165 170 175
Val Lys Glu Ala Lys Gln Leu Leu Lys Val Gln Lys Ala Tyr His Gln
180 185 190
Leu Asp Gln Ser Phe Ile Asp Thr Tyr Ile Asp Leu Leu Glu Thr Arg
195 200 205
Arg Thr Tyr Tyr Glu Gly Pro Gly Glu Gly Ser Pro Phe Gly Trp Lys
210 215 220
Asp Ile Lys Glu Trp Tyr Glu Met Leu Met Gly His Cys Thr Tyr Phe
225 230 235 240
Pro Glu Glu Leu Arg Ser Val Lys Tyr Ala Tyr Asn Ala Asp Leu Tyr
245 250 255
Asn Ala Leu Asn Asp Leu Asn Asn Leu Val Ile Thr Arg Asp Glu Asn
260 265 270
Glu Lys Leu Glu Tyr Tyr Glu Lys Phe Gln Ile Ile Glu Asn Val Phe
275 280 285
Lys Gln Lys Lys Lys Pro Thr Leu Lys Gln Ile Ala Lys Glu Ile Leu
290 295 300
Val Asn Glu Glu Asp Ile Lys Gly Tyr Arg Val Thr Ser Thr Gly Lys
305 310 315 320
Pro Glu Phe Thr Asn Leu Lys Val Tyr His Asp Ile Lys Asp Ile Thr
325 330 335
Ala Arg Lys Glu Ile Ile Glu Asn Ala Glu Leu Leu Asp Gln Ile Ala
340 345 350
Lys Ile Leu Thr Ile Tyr Gln Ser Ser Glu Asp Ile Gln Glu Glu Leu
355 360 365
Thr Asn Leu Asn Ser Glu Leu Thr Gln Glu Glu Ile Glu Gln Ile Ser
370 375 380
Asn Leu Lys Gly Tyr Thr Gly Thr His Asn Leu Ser Leu Lys Ala Ile
385 390 395 400
Asn Leu Ile Leu Asp Glu Leu Trp His Thr Asn Asp Asn Gln Ile Ala
405 410 415
Ile Phe Asn Arg Leu Lys Leu Val Pro Lys Lys Val Asp Leu Ser Gln
420 425 430
Gln Lys Glu Ile Pro Thr Thr Leu Val Asp Asp Phe Ile Leu Ser Pro
435 440 445
Val Val Lys Arg Ser Phe Ile Gln Ser Ile Lys Val Ile Asn Ala Ile
450 455 460
Ile Lys Lys Tyr Gly Leu Pro Asn Asp Ile Ile Ile Glu Leu Ala Arg
465 470 475 480
Glu Lys Asn Ser Lys Asp Ala Gln Lys Met Ile Asn Glu Met Gln Lys
485 490 495
Arg Asn Arg Gln Thr Asn Glu Arg Ile Glu Glu Ile Ile Arg Thr Thr
500 505 510
Gly Lys Glu Asn Ala Lys Tyr Leu Ile Glu Lys Ile Lys Leu His Asp
515 520 525
Met Gln Glu Gly Lys Cys Leu Tyr Ser Leu Glu Ala Ile Pro Leu Glu
530 535 540
Asp Leu Leu Asn Asn Pro Phe Asn Tyr Glu Val Asp His Ile Ile Pro
545 550 555 560
Arg Ser Val Ser Phe Asp Asn Ser Phe Asn Asn Lys Val Leu Val Lys
565 570 575
Gln Glu Glu Ala Ser Lys Lys Gly Asn Arg Thr Pro Phe Gln Tyr Leu
580 585 590
Ser Ser Ser Asp Ser Lys Ile Ser Tyr Glu Thr Phe Lys Lys His Ile
595 600 605
Leu Asn Leu Ala Lys Gly Lys Gly Arg Ile Ser Lys Thr Lys Lys Glu
610 615 620
Tyr Leu Leu Glu Glu Arg Asp Ile Asn Arg Phe Ser Val Gln Lys Asp
625 630 635 640
Phe Ile Asn Arg Asn Leu Val Asp Thr Arg Tyr Ala Thr Arg Gly Leu
645 650 655
Met Asn Leu Leu Arg Ser Tyr Phe Arg Val Asn Asn Leu Asp Val Lys
660 665 670
Val Lys Ser Ile Asn Gly Gly Phe Thr Ser Phe Leu Arg Arg Lys Trp
675 680 685
Lys Phe Lys Lys Glu Arg Asn Lys Gly Tyr Lys His His Ala Glu Asp
690 695 700
Ala Leu Ile Ile Ala Asn Ala Asp Phe Ile Phe Lys Glu Trp Lys Lys
705 710 715 720
Ser Gly Gly Gly Ser Thr Pro His Gln Ile Lys His Ile Lys Asp Phe
725 730 735
Lys Asp Tyr Lys Tyr Ser His Arg Val Asp Lys Lys Pro Asn Arg Glu
740 745 750
Leu Ile Asn Asp Thr Leu Tyr Ser Thr Arg Lys Asp Asp Lys Gly Asn
755 760 765
Thr Leu Ile Val Asn Asn Leu Asn Gly Leu Tyr Asp Lys Asp Asn Asp
770 775 780
Lys Leu Lys Lys Leu Ile Asn Lys Ser Pro Glu Lys Leu Leu Met Tyr
785 790 795 800
His His Asp Pro Gln Thr Tyr Gln Lys Leu Lys Leu Ile Met Glu Gln
805 810 815
Tyr Gly Asp Glu Lys Asn Pro Leu Tyr Lys Tyr Tyr Glu Glu Thr Gly
820 825 830
Asn Tyr Leu Thr Lys Tyr Ser Lys Lys Asp Asn Gly Pro Val Ile Lys
835 840 845
Lys Ile Lys Tyr Tyr Gly Asn Lys Leu Asn Ala His Leu Asp Ile Thr
850 855 860
Asp Asp Tyr Pro Asn Ser Arg Asn Lys Val Val Lys Leu Ser Leu Lys
865 870 875 880
Pro Tyr Arg Phe Asp Val Tyr Leu Asp Asn Gly Val Tyr Lys Phe Val
885 890 895
Thr Val Lys Asn Leu Asp Val Ile Lys Lys Glu Asn Tyr Tyr Glu Val
900 905 910
Asn Ser Lys Cys Tyr Glu Glu Ala Lys Lys Leu Lys Lys Ile Ser Asn
915 920 925
Gln Ala Glu Phe Ile Ala Ser Phe Tyr Asn Asn Asp Leu Ile Lys Ile
930 935 940
Asn Gly Glu Leu Tyr Arg Val Ile Gly Val Asn Asn Asp Leu Leu Asn
945 950 955 960
Arg Ile Glu Val Asn Met Ile Asp Ile Thr Tyr Arg Glu Tyr Leu Glu
965 970 975
Asn Met Asn Asp Lys Arg Pro Pro Arg Ile Ile Lys Thr Ile Ala Ser
980 985 990
Lys Thr Gln Ser Ile Lys Lys Tyr Ser Thr Asp Ile Leu Gly Asn Leu
995 1000 1005
Tyr Glu Val Lys Ser Lys Lys His Pro Gln Ile Ile Lys Lys Gly
1010 1015 1020
<210> 7
<211> 5
<212> PRT
<213> 人工序列
<220>
<223> 接头
<400> 7
Ser Gly Gly Gly Ser
1 5
<210> 8
<211> 6
<212> PRT
<213> 人工序列
<220>
<223> 接头
<400> 8
Gly Gly Ser Gly Gly Ser
1 5
<210> 9
<211> 8
<212> PRT
<213> 人工序列
<220>
<223> 接头
<400> 9
Ser Gly Ser Gly Ser Gly Ser Gly
1 5
<210> 10
<211> 9
<212> PRT
<213> 人工序列
<220>
<223> 接头
<400> 10
Ser Gly Ser Gly Ser Gly Ser Gly Ser
1 5
<210> 11
<211> 20
<212> RNA
<213> 人工序列
<220>
<223> crRNA (C1)
<400> 11
acggaggcua agcgucgcaa 20
<210> 12
<211> 21
<212> RNA
<213> 人工序列
<220>
<223> crRNA(A2)
<400> 12
ggagccacag uucuuccacg g 21
<210> 13
<211> 21
<212> RNA
<213> 人工序列
<220>
<223> crRNA(A3)
<400> 13
cucuacccuu gaggucucga g 21
<210> 14
<211> 83
<212> RNA
<213> 人工序列
<220>
<223> tracrRNA
<400> 14
guuuuaguac ucuggaaaca gaaucuacua aaacaaggca aaaugccgug uuuaucacgu 60
caacuuguug gcgagauuuu uuu 83
Claims (22)
1.一种蛋白质,其由下述序列构成且具有与向导RNA结合的能力,所述序列包含如下氨基酸序列:
在序列号2所示的氨基酸序列中,具有在第721~755位之间连续的缺失区域,
与该缺失区域分别相邻的氨基酸通过由3~10个氨基酸残基构成的接头连接。
2.根据权利要求1所述的蛋白质,其中,接头是由甘氨酸(G)及丝氨酸(S)构成的长度为5~9个氨基酸的接头。
3.根据权利要求2所述的蛋白质,其中,接头从以下接头中选择:
-SGGGS-、
-GGSGGS-、
-SGSGSGSG-、
-SGSGSGSGS-。
4.根据权利要求1~3中任一项所述的蛋白质,其中,该缺失区域为第721~745位的区域。
5.根据权利要求4所述的蛋白质,其由序列号4表示。
6.根据权利要求1~3中任一项所述的蛋白质,其中,该缺失区域为第721~755位的区域。
7.根据权利要求6所述的蛋白质,其由序列号6表示。
8.根据权利要求1~7中任一项所述的蛋白质,其中,进一步地,第45位和/或第163位的谷氨酸(E)被置换为其它氨基酸。
9.根据权利要求8所述的蛋白质,其中,其它氨基酸为碱性氨基酸。
10.根据权利要求9所述的蛋白质,其中,碱性氨基酸为赖氨酸(K)。
11.根据权利要求1~10中任一项所述的蛋白质,其中,在序列号2的除实施了变异和/或缺失的位置以外的部位具有80%以上的一致性。
12.根据权利要求1~10中任一项所述的蛋白质,其中,在序列号2的除实施了变异和/或缺失的位置以外的部位置换、缺失、插入和/或添加了1个~几个氨基酸。
13.根据权利要求1~12中任一项所述的蛋白质,其连接有转录调控因子蛋白或结构域。
14.根据权利要求13所述的蛋白质,其中,转录调控因子为转录活化因子。
15.根据权利要求13所述的蛋白质,其中,转录调控因子为转录沉默子或转录抑制因子。
16.一种核酸,其编码权利要求1~15中任一项所述的蛋白质。
17.一种蛋白质-RNA复合物,其具备权利要求1~16中任一项所述的蛋白质和包含下述多核苷酸的向导RNA,所述多核苷酸由与靶双链多核苷酸中的PAM(前间隔序列邻近基序)序列的上游1个碱基至上游20个碱基以上且24个碱基以下的碱基序列互补的碱基序列构成。
18.一种用于对靶双链多核苷酸进行位点特异性改造的方法,
其具备:
将靶双链多核苷酸、蛋白质和向导RNA混合并进行温育的步骤;以及
所述蛋白质在位于PAM序列的上游的结合位点处对所述靶双链多核苷酸进行改造的步骤,
所述蛋白质为权利要求1~15中任一项所述的蛋白质,
所述向导RNA包含下述多核苷酸,所述多核苷酸由与所述靶双链多核苷酸中的所述PAM序列的上游1个碱基至上游20个碱基以上且24个碱基以下的碱基序列互补的碱基序列构成。
19.一种使细胞的靶基因的表达增加的方法,其包括:使权利要求14所述的蛋白质和针对所述靶基因的一种或两种以上向导RNA在所述细胞内表达。
20.一种使细胞的靶基因的表达减少的方法,其包括:使权利要求15所述的蛋白质和针对所述靶基因的一种或两种以上向导RNA在所述细胞内表达。
21.根据权利要求19或20所述的方法,其中,细胞为真核细胞。
22.根据权利要求19或20所述的方法,其中,细胞为酵母细胞、植物细胞或动物细胞。
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201862749855P | 2018-10-24 | 2018-10-24 | |
US62/749,855 | 2018-10-24 | ||
PCT/JP2019/041751 WO2020085441A1 (ja) | 2018-10-24 | 2019-10-24 | 改変されたCas9タンパク質及びその用途 |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113166738A true CN113166738A (zh) | 2021-07-23 |
Family
ID=70331763
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201980069492.0A Pending CN113166738A (zh) | 2018-10-24 | 2019-10-24 | 经改造的Cas9蛋白及其用途 |
Country Status (13)
Country | Link |
---|---|
US (1) | US20210246473A1 (zh) |
EP (1) | EP3878956A4 (zh) |
JP (1) | JPWO2020085441A1 (zh) |
KR (1) | KR20210082185A (zh) |
CN (1) | CN113166738A (zh) |
AU (1) | AU2019365440A1 (zh) |
BR (1) | BR112021007672A2 (zh) |
CA (1) | CA3116994A1 (zh) |
IL (1) | IL282423A (zh) |
MX (1) | MX2021004714A (zh) |
SG (1) | SG11202104097TA (zh) |
WO (1) | WO2020085441A1 (zh) |
ZA (1) | ZA202103259B (zh) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
MX2021014478A (es) | 2019-05-28 | 2022-01-06 | Astellas Pharma Inc | Metodo para tratar distrofia muscular por direccionamiento del gen dmpk. |
WO2021230385A1 (en) * | 2020-05-15 | 2021-11-18 | Astellas Pharma Inc. | Method for treating muscular dystrophy by targeting utrophin gene |
US20230323456A1 (en) * | 2020-08-31 | 2023-10-12 | Modalis Therapeutics Corporation | Method for treating facioscapulohumeral muscular dystrophy (fshd) by targeting dux4 gene |
WO2022114243A1 (en) | 2020-11-25 | 2022-06-02 | Astellas Pharma Inc. | Method for treating muscular dystrophy by targeting dmpk gene |
WO2022145495A1 (en) * | 2021-01-04 | 2022-07-07 | Modalis Therapeutics Corporation | Method for treating spinocerebellar ataxias (sca) by targeting atxn7 gene |
WO2023069923A1 (en) * | 2021-10-18 | 2023-04-27 | Duke University | Compositions and methods relating to epigenetic modulation |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016205759A1 (en) * | 2015-06-18 | 2016-12-22 | The Broad Institute Inc. | Engineering and optimization of systems, methods, enzymes and guide scaffolds of cas9 orthologs and variants for sequence manipulation |
JP2017501688A (ja) * | 2013-11-19 | 2017-01-19 | プレジデント アンド フェローズ オブ ハーバード カレッジ | 変異cas9タンパク質 |
CN107532161A (zh) * | 2015-03-03 | 2018-01-02 | 通用医疗公司 | 具有改变的PAM特异性的工程化CRISPR‑Cas9核酸酶 |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016196655A1 (en) * | 2015-06-03 | 2016-12-08 | The Regents Of The University Of California | Cas9 variants and methods of use thereof |
WO2017010543A1 (ja) | 2015-07-14 | 2017-01-19 | 国立大学法人東京大学 | 改変されたFnCas9タンパク質及びその使用 |
WO2018074979A1 (en) | 2016-10-17 | 2018-04-26 | Nanyang Technological University | Truncated crispr-cas proteins for dna targeting |
US11674128B2 (en) * | 2016-12-12 | 2023-06-13 | Tsinghua University | Engineering of a minimal SaCas9 CRISPR/Cas system for gene editing and transcriptional regulation optimized by enhanced guide RNA |
WO2019089910A1 (en) * | 2017-11-01 | 2019-05-09 | Ohio State Innovation Foundation | Highly compact cas9-based transcriptional regulators for in vivo gene regulation |
KR20210040985A (ko) * | 2018-08-07 | 2021-04-14 | 가부시키가이샤 모달리스 | 신규 전사 액티베이터 |
-
2019
- 2019-10-24 MX MX2021004714A patent/MX2021004714A/es unknown
- 2019-10-24 SG SG11202104097TA patent/SG11202104097TA/en unknown
- 2019-10-24 AU AU2019365440A patent/AU2019365440A1/en not_active Abandoned
- 2019-10-24 CN CN201980069492.0A patent/CN113166738A/zh active Pending
- 2019-10-24 BR BR112021007672-7A patent/BR112021007672A2/pt not_active Application Discontinuation
- 2019-10-24 KR KR1020217014297A patent/KR20210082185A/ko unknown
- 2019-10-24 EP EP19876272.6A patent/EP3878956A4/en not_active Withdrawn
- 2019-10-24 CA CA3116994A patent/CA3116994A1/en active Pending
- 2019-10-24 JP JP2020552595A patent/JPWO2020085441A1/ja active Pending
- 2019-10-24 US US17/288,379 patent/US20210246473A1/en active Pending
- 2019-10-24 WO PCT/JP2019/041751 patent/WO2020085441A1/ja unknown
-
2021
- 2021-04-19 IL IL282423A patent/IL282423A/en unknown
- 2021-05-13 ZA ZA2021/03259A patent/ZA202103259B/en unknown
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2017501688A (ja) * | 2013-11-19 | 2017-01-19 | プレジデント アンド フェローズ オブ ハーバード カレッジ | 変異cas9タンパク質 |
CN107532161A (zh) * | 2015-03-03 | 2018-01-02 | 通用医疗公司 | 具有改变的PAM特异性的工程化CRISPR‑Cas9核酸酶 |
WO2016205759A1 (en) * | 2015-06-18 | 2016-12-22 | The Broad Institute Inc. | Engineering and optimization of systems, methods, enzymes and guide scaffolds of cas9 orthologs and variants for sequence manipulation |
Non-Patent Citations (1)
Title |
---|
F. ANN RAN ET AL.: "In vivo genome editing using Staphylococcus aureus Cas9", 《NATURE》, vol. 520, 31 January 2015 (2015-01-31), pages 186 * |
Also Published As
Publication number | Publication date |
---|---|
BR112021007672A2 (pt) | 2021-08-10 |
ZA202103259B (en) | 2022-08-31 |
US20210246473A1 (en) | 2021-08-12 |
KR20210082185A (ko) | 2021-07-02 |
IL282423A (en) | 2021-06-30 |
CA3116994A1 (en) | 2020-04-30 |
JPWO2020085441A1 (ja) | 2021-09-16 |
MX2021004714A (es) | 2021-06-04 |
EP3878956A1 (en) | 2021-09-15 |
EP3878956A4 (en) | 2022-07-06 |
WO2020085441A1 (ja) | 2020-04-30 |
SG11202104097TA (en) | 2021-05-28 |
AU2019365440A1 (en) | 2021-05-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11702645B2 (en) | Polynucleotide encoding modified CAS9 protein | |
CN113166738A (zh) | 经改造的Cas9蛋白及其用途 | |
US20230279374A1 (en) | Modified cas9 protein, and use thereof | |
JP7412001B2 (ja) | 改変されたCas9タンパク質及びその用途 | |
JPWO2018221685A6 (ja) | 改変されたCas9タンパク質及びその用途 | |
JP7138712B2 (ja) | ゲノム編集のためのシステム及び方法 | |
WO2017010543A1 (ja) | 改変されたFnCas9タンパク質及びその使用 | |
JP7240007B2 (ja) | 改良されたトランスポザーゼポリペプチドおよびその使用 | |
CN115244177A (zh) | 用于基因组修饰的高保真SpCas9核酸酶 | |
WO2019026976A1 (ja) | 改変されたCas9タンパク質及びその用途 | |
JP2023539569A (ja) | ヌクレアーゼを含む組成物及びその使用 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |