CN115397995A - Use of Cas9 protein from the bacterium Pasteurella pneumotropica - Google Patents
Use of Cas9 protein from the bacterium Pasteurella pneumotropica
- Publication number
- CN115397995A, CN202080092630.XA
- Authority
- CN
- China
- Prior art keywords
- leu
- lys
- glu
- dna
- arg
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 108091033409 CRISPR Proteins 0.000 title abstract description 47
- 241000606860 Pasteurella Species 0.000 title abstract description 20
- 108020004414 DNA Proteins 0.000 claims abstract description 126
- 108091028043 Nucleic acid sequence Proteins 0.000 claims abstract description 24
- 108090000623 proteins and genes Proteins 0.000 claims description 92
- 102000004169 proteins and genes Human genes 0.000 claims description 74
- 239000002773 nucleotide Substances 0.000 claims description 41
- 125000003729 nucleotide group Chemical group 0.000 claims description 41
- 210000004027 cell Anatomy 0.000 claims description 36
- 108020005004 Guide RNA Proteins 0.000 claims description 34
- 238000000034 method Methods 0.000 claims description 27
- 125000003275 alpha amino acid group Chemical group 0.000 claims description 25
- 230000004048 modification Effects 0.000 claims description 24
- 238000012986 modification Methods 0.000 claims description 24
- 230000005782 double-strand break Effects 0.000 claims description 19
- 102000053602 DNA Human genes 0.000 claims description 15
- 230000015572 biosynthetic process Effects 0.000 claims description 12
- 125000000539 amino acid group Chemical group 0.000 claims description 9
- 210000004962 mammalian cell Anatomy 0.000 claims description 6
- 102000039446 nucleic acids Human genes 0.000 claims description 6
- 108020004707 nucleic acids Proteins 0.000 claims description 6
- 150000007523 nucleic acids Chemical class 0.000 claims description 6
- 108091029865 Exogenous DNA Proteins 0.000 claims description 4
- 230000003993 interaction Effects 0.000 claims description 2
- 101710163270 Nuclease Proteins 0.000 abstract description 44
- 239000013612 plasmid Substances 0.000 abstract description 12
- 238000010356 CRISPR-Cas9 genome editing Methods 0.000 abstract description 8
- 230000001580 bacterial effect Effects 0.000 abstract description 4
- 238000003776 cleavage reaction Methods 0.000 description 28
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 24
- 230000007017 scission Effects 0.000 description 23
- 238000010354 CRISPR gene editing Methods 0.000 description 18
- 238000000338 in vitro Methods 0.000 description 18
- 230000000694 effects Effects 0.000 description 15
- 125000006850 spacer group Chemical group 0.000 description 14
- 108010077850 Nuclear Localization Signals Proteins 0.000 description 11
- 239000012634 fragment Substances 0.000 description 11
- 238000006467 substitution reaction Methods 0.000 description 11
- 208000009869 Neu-Laxova syndrome Diseases 0.000 description 10
- 230000000295 complement effect Effects 0.000 description 10
- 238000002474 experimental method Methods 0.000 description 10
- 210000005260 human cell Anatomy 0.000 description 10
- RAXXELZNTBOGNW-UHFFFAOYSA-N imidazole Natural products C1=CNC=N1 RAXXELZNTBOGNW-UHFFFAOYSA-N 0.000 description 9
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 8
- 108010062796 arginyllysine Proteins 0.000 description 8
- 108010064235 lysylglycine Proteins 0.000 description 8
- 238000012217 deletion Methods 0.000 description 7
- 230000037430 deletion Effects 0.000 description 7
- 238000003780 insertion Methods 0.000 description 7
- 230000037431 insertion Effects 0.000 description 7
- 210000004940 nucleus Anatomy 0.000 description 7
- HQUXQAMSWFIRET-AVGNSLFASA-N Leu-Glu-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN HQUXQAMSWFIRET-AVGNSLFASA-N 0.000 description 6
- QNBVTHNJGCOVFA-AVGNSLFASA-N Leu-Leu-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCC(O)=O QNBVTHNJGCOVFA-AVGNSLFASA-N 0.000 description 6
- 101000910035 Streptococcus pyogenes serotype M1 CRISPR-associated endonuclease Cas9/Csn1 Proteins 0.000 description 6
- 238000006243 chemical reaction Methods 0.000 description 6
- 239000012636 effector Substances 0.000 description 6
- 108010050848 glycylleucine Proteins 0.000 description 6
- 239000000700 radioactive tracer Substances 0.000 description 6
- 108020004705 Codon Proteins 0.000 description 5
- 101001048956 Homo sapiens Homeobox protein EMX1 Proteins 0.000 description 5
- 108091027544 Subgenomic mRNA Proteins 0.000 description 5
- 239000000872 buffer Substances 0.000 description 5
- OTZMRMHZCMZOJZ-SRVKXCTJSA-N Arg-Leu-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O OTZMRMHZCMZOJZ-SRVKXCTJSA-N 0.000 description 4
- SLQQPJBDBVPVQV-JYJNAYRXSA-N Arg-Phe-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(O)=O SLQQPJBDBVPVQV-JYJNAYRXSA-N 0.000 description 4
- UVTGNSWSRSCPLP-UHFFFAOYSA-N Arg-Tyr Natural products NC(CCNC(=N)N)C(=O)NC(Cc1ccc(O)cc1)C(=O)O UVTGNSWSRSCPLP-UHFFFAOYSA-N 0.000 description 4
- ZMUQQMGITUJQTI-CIUDSAMLSA-N Asn-Leu-Asn Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O ZMUQQMGITUJQTI-CIUDSAMLSA-N 0.000 description 4
- 102000004190 Enzymes Human genes 0.000 description 4
- 108090000790 Enzymes Proteins 0.000 description 4
- ILWHFUZZCFYSKT-AVGNSLFASA-N Glu-Lys-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O ILWHFUZZCFYSKT-AVGNSLFASA-N 0.000 description 4
- NTBOEZICHOSJEE-YUMQZZPRSA-N Gly-Lys-Ser Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O NTBOEZICHOSJEE-YUMQZZPRSA-N 0.000 description 4
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysin Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 4
- YVKSMSDXKMSIRX-GUBZILKMSA-N Leu-Glu-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O YVKSMSDXKMSIRX-GUBZILKMSA-N 0.000 description 4
- ATNKHRAIZCMCCN-BZSNNMDCSA-N Lys-Lys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCCCN)N ATNKHRAIZCMCCN-BZSNNMDCSA-N 0.000 description 4
- ZIIMORLEZLVRIP-SRVKXCTJSA-N Met-Leu-Gln Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O ZIIMORLEZLVRIP-SRVKXCTJSA-N 0.000 description 4
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 4
- WEDZFLRYSIDIRX-IHRRRGAJSA-N Phe-Ser-Arg Chemical compound NC(=N)NCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=CC=C1 WEDZFLRYSIDIRX-IHRRRGAJSA-N 0.000 description 4
- SRTCFKGBYBZRHA-ACZMJKKPSA-N Ser-Ala-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SRTCFKGBYBZRHA-ACZMJKKPSA-N 0.000 description 4
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 4
- SYOMXKPPFZRELL-ONGXEEELSA-N Val-Gly-Lys Chemical compound CC(C)[C@@H](C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)O)N SYOMXKPPFZRELL-ONGXEEELSA-N 0.000 description 4
- 238000004458 analytical method Methods 0.000 description 4
- 108010013835 arginine glutamate Proteins 0.000 description 4
- 108010038633 aspartylglutamate Proteins 0.000 description 4
- 238000005516 engineering process Methods 0.000 description 4
- 229940088598 enzyme Drugs 0.000 description 4
- 108010078144 glutaminyl-glycine Proteins 0.000 description 4
- UYTPUPDQBNUYGX-UHFFFAOYSA-N guanine Chemical compound O=C1NC(N)=NC2=C1N=CN2 UYTPUPDQBNUYGX-UHFFFAOYSA-N 0.000 description 4
- 108010034529 leucyl-lysine Proteins 0.000 description 4
- 108010009298 lysylglutamic acid Proteins 0.000 description 4
- 108010054155 lysyllysine Proteins 0.000 description 4
- 230000001404 mediated effect Effects 0.000 description 4
- 108010051242 phenylalanylserine Proteins 0.000 description 4
- 239000013641 positive control Substances 0.000 description 4
- 238000012216 screening Methods 0.000 description 4
- 238000003786 synthesis reaction Methods 0.000 description 4
- 108010051110 tyrosyl-lysine Proteins 0.000 description 4
- 238000010200 validation analysis Methods 0.000 description 4
- 108010073969 valyllysine Proteins 0.000 description 4
- 241000894006 Bacteria Species 0.000 description 3
- 230000007018 DNA scission Effects 0.000 description 3
- 108010008532 Deoxyribonuclease I Proteins 0.000 description 3
- 102000007260 Deoxyribonuclease I Human genes 0.000 description 3
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 3
- 241000606583 Rodentibacter pneumotropicus Species 0.000 description 3
- 241000191967 Staphylococcus aureus Species 0.000 description 3
- 239000011543 agarose gel Substances 0.000 description 3
- 230000008901 benefit Effects 0.000 description 3
- 239000007795 chemical reaction product Substances 0.000 description 3
- 210000003527 eukaryotic cell Anatomy 0.000 description 3
- 238000012165 high-throughput sequencing Methods 0.000 description 3
- 238000011534 incubation Methods 0.000 description 3
- 239000012139 lysis buffer Substances 0.000 description 3
- 239000000203 mixture Substances 0.000 description 3
- 238000002703 mutagenesis Methods 0.000 description 3
- 231100000350 mutagenesis Toxicity 0.000 description 3
- 229920002477 rna polymer Polymers 0.000 description 3
- 238000012360 testing method Methods 0.000 description 3
- 102000040650 (ribonucleotides)n+m Human genes 0.000 description 2
- GFFGJBXGBJISGV-UHFFFAOYSA-N Adenine Chemical compound NC1=NC=NC2=C1N=CN2 GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 2
- 229930024421 Adenine Natural products 0.000 description 2
- FJVAQLJNTSUQPY-CIUDSAMLSA-N Ala-Ala-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN FJVAQLJNTSUQPY-CIUDSAMLSA-N 0.000 description 2
- JBVSSSZFNTXJDX-YTLHQDLWSA-N Ala-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@H](C)N JBVSSSZFNTXJDX-YTLHQDLWSA-N 0.000 description 2
- SVBXIUDNTRTKHE-CIUDSAMLSA-N Ala-Arg-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O SVBXIUDNTRTKHE-CIUDSAMLSA-N 0.000 description 2
- GSCLWXDNIMNIJE-ZLUOBGJFSA-N Ala-Asp-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O GSCLWXDNIMNIJE-ZLUOBGJFSA-N 0.000 description 2
- HMRWQTHUDVXMGH-GUBZILKMSA-N Ala-Glu-Lys Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN HMRWQTHUDVXMGH-GUBZILKMSA-N 0.000 description 2
- VNYMOTCMNHJGTG-JBDRJPRFSA-N Ala-Ile-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O VNYMOTCMNHJGTG-JBDRJPRFSA-N 0.000 description 2
- HHRAXZAYZFFRAM-CIUDSAMLSA-N Ala-Leu-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O HHRAXZAYZFFRAM-CIUDSAMLSA-N 0.000 description 2
- OYJCVIGKMXUVKB-GARJFASQSA-N Ala-Leu-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@@H]1C(=O)O)N OYJCVIGKMXUVKB-GARJFASQSA-N 0.000 description 2
- QUIGLPSHIFPEOV-CIUDSAMLSA-N Ala-Lys-Ala Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O QUIGLPSHIFPEOV-CIUDSAMLSA-N 0.000 description 2
- IAUSCRHURCZUJP-CIUDSAMLSA-N Ala-Lys-Cys Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CS)C(O)=O IAUSCRHURCZUJP-CIUDSAMLSA-N 0.000 description 2
- VCSABYLVNWQYQE-SRVKXCTJSA-N Ala-Lys-Lys Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CCCCN)C(O)=O VCSABYLVNWQYQE-SRVKXCTJSA-N 0.000 description 2
- VCSABYLVNWQYQE-UHFFFAOYSA-N Ala-Lys-Lys Natural products NCCCCC(NC(=O)C(N)C)C(=O)NC(CCCCN)C(O)=O VCSABYLVNWQYQE-UHFFFAOYSA-N 0.000 description 2
- JAQNUEWEJWBVAY-WBAXXEDZSA-N Ala-Phe-Phe Chemical compound C([C@H](NC(=O)[C@@H](N)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 JAQNUEWEJWBVAY-WBAXXEDZSA-N 0.000 description 2
- YCRAFFCYWOUEOF-DLOVCJGASA-N Ala-Phe-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 YCRAFFCYWOUEOF-DLOVCJGASA-N 0.000 description 2
- DYXOFPBJBAHWFY-JBDRJPRFSA-N Ala-Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@H](C)N DYXOFPBJBAHWFY-JBDRJPRFSA-N 0.000 description 2
- KTXKIYXZQFWJKB-VZFHVOOUSA-N Ala-Thr-Ser Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O KTXKIYXZQFWJKB-VZFHVOOUSA-N 0.000 description 2
- XUBLMYHWSFRACH-CYDGBPFRSA-N Arg-Asn-Gln-Arg Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O XUBLMYHWSFRACH-CYDGBPFRSA-N 0.000 description 2
- OFIYLHVAAJYRBC-HJWJTTGWSA-N Arg-Ile-Phe Chemical compound CC[C@H](C)[C@H](NC(=O)[C@@H](N)CCCNC(N)=N)C(=O)N[C@@H](Cc1ccccc1)C(O)=O OFIYLHVAAJYRBC-HJWJTTGWSA-N 0.000 description 2
- NMRHDSAOIURTNT-RWMBFGLXSA-N Arg-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N NMRHDSAOIURTNT-RWMBFGLXSA-N 0.000 description 2
- LCBSSOCDWUTQQV-SDDRHHMPSA-N Arg-Met-Pro Chemical compound CSCC[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N LCBSSOCDWUTQQV-SDDRHHMPSA-N 0.000 description 2
- CZUHPNLXLWMYMG-UBHSHLNASA-N Arg-Phe-Ala Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CC=CC=C1 CZUHPNLXLWMYMG-UBHSHLNASA-N 0.000 description 2
- UGZUVYDKAYNCII-ULQDDVLXSA-N Arg-Phe-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O UGZUVYDKAYNCII-ULQDDVLXSA-N 0.000 description 2
- FOQFHANLUJDQEE-GUBZILKMSA-N Arg-Pro-Cys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CCCN=C(N)N)N)C(=O)N[C@@H](CS)C(=O)O FOQFHANLUJDQEE-GUBZILKMSA-N 0.000 description 2
- AWMAZIIEFPFHCP-RCWTZXSCSA-N Arg-Pro-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O AWMAZIIEFPFHCP-RCWTZXSCSA-N 0.000 description 2
- ZUVMUOOHJYNJPP-XIRDDKMYSA-N Arg-Trp-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCC(N)=O)C(O)=O ZUVMUOOHJYNJPP-XIRDDKMYSA-N 0.000 description 2
- BWMMKQPATDUYKB-IHRRRGAJSA-N Arg-Tyr-Asn Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(O)=O)CC1=CC=C(O)C=C1 BWMMKQPATDUYKB-IHRRRGAJSA-N 0.000 description 2
- CGWVCWFQGXOUSJ-ULQDDVLXSA-N Arg-Tyr-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(O)=O CGWVCWFQGXOUSJ-ULQDDVLXSA-N 0.000 description 2
- WOZDCBHUGJVJPL-AVGNSLFASA-N Arg-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N WOZDCBHUGJVJPL-AVGNSLFASA-N 0.000 description 2
- KXFCBAHYSLJCCY-ZLUOBGJFSA-N Asn-Asn-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O KXFCBAHYSLJCCY-ZLUOBGJFSA-N 0.000 description 2
- UGXVKHRDGLYFKR-CIUDSAMLSA-N Asn-Asp-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(N)=O UGXVKHRDGLYFKR-CIUDSAMLSA-N 0.000 description 2
- ZDOQDYFZNGASEY-BIIVOSGPSA-N Asn-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)N)N)C(=O)O ZDOQDYFZNGASEY-BIIVOSGPSA-N 0.000 description 2
- MOHUTCNYQLMARY-GUBZILKMSA-N Asn-His-Gln Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N MOHUTCNYQLMARY-GUBZILKMSA-N 0.000 description 2
- PTSDPWIHOYMRGR-UGYAYLCHSA-N Asn-Ile-Asn Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O PTSDPWIHOYMRGR-UGYAYLCHSA-N 0.000 description 2
- NLDNNZKUSLAYFW-NHCYSSNCSA-N Asn-Lys-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C(C)C)C(O)=O NLDNNZKUSLAYFW-NHCYSSNCSA-N 0.000 description 2
- MVXJBVVLACEGCG-PCBIJLKTSA-N Asn-Phe-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MVXJBVVLACEGCG-PCBIJLKTSA-N 0.000 description 2
- BYLSYQASFJJBCL-DCAQKATOSA-N Asn-Pro-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O BYLSYQASFJJBCL-DCAQKATOSA-N 0.000 description 2
- VHQSGALUSWIYOD-QXEWZRGKSA-N Asn-Pro-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O VHQSGALUSWIYOD-QXEWZRGKSA-N 0.000 description 2
- VWADICJNCPFKJS-ZLUOBGJFSA-N Asn-Ser-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O VWADICJNCPFKJS-ZLUOBGJFSA-N 0.000 description 2
- MYTHOBCLNIOFBL-SRVKXCTJSA-N Asn-Ser-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MYTHOBCLNIOFBL-SRVKXCTJSA-N 0.000 description 2
- BEHQTVDBCLSCBY-CFMVVWHZSA-N Asn-Tyr-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BEHQTVDBCLSCBY-CFMVVWHZSA-N 0.000 description 2
- WSOKZUVWBXVJHX-CIUDSAMLSA-N Asp-Arg-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O WSOKZUVWBXVJHX-CIUDSAMLSA-N 0.000 description 2
- MFMJRYHVLLEMQM-DCAQKATOSA-N Asp-Arg-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(=O)O)N MFMJRYHVLLEMQM-DCAQKATOSA-N 0.000 description 2
- SDHFVYLZFBDSQT-DCAQKATOSA-N Asp-Arg-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(=O)O)N SDHFVYLZFBDSQT-DCAQKATOSA-N 0.000 description 2
- CASGONAXMZPHCK-FXQIFTODSA-N Asp-Asn-Arg Chemical compound C(C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N)CN=C(N)N CASGONAXMZPHCK-FXQIFTODSA-N 0.000 description 2
- ZELQAFZSJOBEQS-ACZMJKKPSA-N Asp-Asn-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ZELQAFZSJOBEQS-ACZMJKKPSA-N 0.000 description 2
- JDHOJQJMWBKHDB-CIUDSAMLSA-N Asp-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N JDHOJQJMWBKHDB-CIUDSAMLSA-N 0.000 description 2
- VAWNQIGQPUOPQW-ACZMJKKPSA-N Asp-Glu-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O VAWNQIGQPUOPQW-ACZMJKKPSA-N 0.000 description 2
- XJQRWGXKUSDEFI-ACZMJKKPSA-N Asp-Glu-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O XJQRWGXKUSDEFI-ACZMJKKPSA-N 0.000 description 2
- PDECQIHABNQRHN-GUBZILKMSA-N Asp-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC(O)=O PDECQIHABNQRHN-GUBZILKMSA-N 0.000 description 2
- JUWZKMBALYLZCK-WHFBIAKZSA-N Asp-Gly-Asn Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O JUWZKMBALYLZCK-WHFBIAKZSA-N 0.000 description 2
- PZXPWHFYZXTFBI-YUMQZZPRSA-N Asp-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(O)=O PZXPWHFYZXTFBI-YUMQZZPRSA-N 0.000 description 2
- SWTQDYFZVOJVLL-KKUMJFAQSA-N Asp-His-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CC(=O)O)N)O SWTQDYFZVOJVLL-KKUMJFAQSA-N 0.000 description 2
- LBFYTUPYYZENIR-GHCJXIJMSA-N Asp-Ile-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(=O)O)N LBFYTUPYYZENIR-GHCJXIJMSA-N 0.000 description 2
- PAYPSKIBMDHZPI-CIUDSAMLSA-N Asp-Leu-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O PAYPSKIBMDHZPI-CIUDSAMLSA-N 0.000 description 2
- AYFVRYXNDHBECD-YUMQZZPRSA-N Asp-Leu-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O AYFVRYXNDHBECD-YUMQZZPRSA-N 0.000 description 2
- MJJIHRWNWSQTOI-VEVYYDQMSA-N Asp-Thr-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O MJJIHRWNWSQTOI-VEVYYDQMSA-N 0.000 description 2
- VHUKCUHLFMRHOD-MELADBBJSA-N Asp-Tyr-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CC(=O)O)N)C(=O)O VHUKCUHLFMRHOD-MELADBBJSA-N 0.000 description 2
- 108091079001 CRISPR RNA Proteins 0.000 description 2
- 108091035707 Consensus sequence Proteins 0.000 description 2
- INKFLNZBTSNFON-CIUDSAMLSA-N Gln-Ala-Arg Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O INKFLNZBTSNFON-CIUDSAMLSA-N 0.000 description 2
- RZSLYUUFFVHFRQ-FXQIFTODSA-N Gln-Ala-Glu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O RZSLYUUFFVHFRQ-FXQIFTODSA-N 0.000 description 2
- MWLYSLMKFXWZPW-ZPFDUUQYSA-N Gln-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)CCC(N)=O MWLYSLMKFXWZPW-ZPFDUUQYSA-N 0.000 description 2
- MQANCSUBSBJNLU-KKUMJFAQSA-N Gln-Arg-Tyr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MQANCSUBSBJNLU-KKUMJFAQSA-N 0.000 description 2
- XFKUFUJECJUQTQ-CIUDSAMLSA-N Gln-Gln-Glu Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O XFKUFUJECJUQTQ-CIUDSAMLSA-N 0.000 description 2
- XSBGUANSZDGULP-IUCAKERBSA-N Gln-Gly-Lys Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)NCC(=O)N[C@@H](CCCCN)C(O)=O XSBGUANSZDGULP-IUCAKERBSA-N 0.000 description 2
- TWTWUBHEWQPMQW-ZPFDUUQYSA-N Gln-Ile-Arg Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O TWTWUBHEWQPMQW-ZPFDUUQYSA-N 0.000 description 2
- ZNTDJIMJKNNSLR-RWRJDSDZSA-N Gln-Ile-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N ZNTDJIMJKNNSLR-RWRJDSDZSA-N 0.000 description 2
- QBEWLBKBGXVVPD-RYUDHWBXSA-N Gln-Phe-Gly Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)N)N QBEWLBKBGXVVPD-RYUDHWBXSA-N 0.000 description 2
- XQDGOJPVMSWZSO-SRVKXCTJSA-N Gln-Pro-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CCC(=O)N)N XQDGOJPVMSWZSO-SRVKXCTJSA-N 0.000 description 2
- DCWNCMRZIZSZBL-KKUMJFAQSA-N Gln-Pro-Tyr Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CCC(=O)N)N)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O DCWNCMRZIZSZBL-KKUMJFAQSA-N 0.000 description 2
- YRHZWVKUFWCEPW-GLLZPBPUSA-N Gln-Thr-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O YRHZWVKUFWCEPW-GLLZPBPUSA-N 0.000 description 2
- ARYKRXHBIPLULY-XKBZYTNZSA-N Gln-Thr-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O ARYKRXHBIPLULY-XKBZYTNZSA-N 0.000 description 2
- WPJDPEOQUIXXOY-AVGNSLFASA-N Gln-Tyr-Asn Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)N)N)O WPJDPEOQUIXXOY-AVGNSLFASA-N 0.000 description 2
- ZZLDMBMFKZFQMU-NRPADANISA-N Gln-Val-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O ZZLDMBMFKZFQMU-NRPADANISA-N 0.000 description 2
- RCCDHXSRMWCOOY-GUBZILKMSA-N Glu-Arg-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O RCCDHXSRMWCOOY-GUBZILKMSA-N 0.000 description 2
- MLCPTRRNICEKIS-FXQIFTODSA-N Glu-Asn-Gln Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O MLCPTRRNICEKIS-FXQIFTODSA-N 0.000 description 2
- HJIFPJUEOGZWRI-GUBZILKMSA-N Glu-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)O)N HJIFPJUEOGZWRI-GUBZILKMSA-N 0.000 description 2
- ALCAUWPAMLVUDB-FXQIFTODSA-N Glu-Gln-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ALCAUWPAMLVUDB-FXQIFTODSA-N 0.000 description 2
- UHVIQGKBMXEVGN-WDSKDSINSA-N Glu-Gly-Asn Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O UHVIQGKBMXEVGN-WDSKDSINSA-N 0.000 description 2
- LRPXYSGPOBVBEH-IUCAKERBSA-N Glu-Gly-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O LRPXYSGPOBVBEH-IUCAKERBSA-N 0.000 description 2
- KRGZZKWSBGPLKL-IUCAKERBSA-N Glu-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)O)N KRGZZKWSBGPLKL-IUCAKERBSA-N 0.000 description 2
- QIQABBIDHGQXGA-ZPFDUUQYSA-N Glu-Ile-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O QIQABBIDHGQXGA-ZPFDUUQYSA-N 0.000 description 2
- IVGJYOOGJLFKQE-AVGNSLFASA-N Glu-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N IVGJYOOGJLFKQE-AVGNSLFASA-N 0.000 description 2
- OCJRHJZKGGSPRW-IUCAKERBSA-N Glu-Lys-Gly Chemical compound NCCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H](N)CCC(O)=O OCJRHJZKGGSPRW-IUCAKERBSA-N 0.000 description 2
- RBXSZQRSEGYDFG-GUBZILKMSA-N Glu-Lys-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O RBXSZQRSEGYDFG-GUBZILKMSA-N 0.000 description 2
- SOEPMWQCTJITPZ-SRVKXCTJSA-N Glu-Met-Lys Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CCC(=O)O)N SOEPMWQCTJITPZ-SRVKXCTJSA-N 0.000 description 2
- BFEZQZKEPRKKHV-SRVKXCTJSA-N Glu-Pro-Lys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CCC(=O)O)N)C(=O)N[C@@H](CCCCN)C(=O)O BFEZQZKEPRKKHV-SRVKXCTJSA-N 0.000 description 2
- BPLNJYHNAJVLRT-ACZMJKKPSA-N Glu-Ser-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O BPLNJYHNAJVLRT-ACZMJKKPSA-N 0.000 description 2
- VNCNWQPIQYAMAK-ACZMJKKPSA-N Glu-Ser-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O VNCNWQPIQYAMAK-ACZMJKKPSA-N 0.000 description 2
- DLISPGXMKZTWQG-IFFSRLJSSA-N Glu-Thr-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O DLISPGXMKZTWQG-IFFSRLJSSA-N 0.000 description 2
- ZQNCUVODKOBSSO-XEGUGMAKSA-N Glu-Trp-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C)C(O)=O ZQNCUVODKOBSSO-XEGUGMAKSA-N 0.000 description 2
- RZMXBFUSQNLEQF-QEJZJMRPSA-N Glu-Trp-Asn Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N RZMXBFUSQNLEQF-QEJZJMRPSA-N 0.000 description 2
- HGJREIGJLUQBTJ-SZMVWBNQSA-N Glu-Trp-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC(C)C)C(O)=O HGJREIGJLUQBTJ-SZMVWBNQSA-N 0.000 description 2
- MIWJDJAMMKHUAR-ZVZYQTTQSA-N Glu-Trp-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@H](CCC(=O)O)N MIWJDJAMMKHUAR-ZVZYQTTQSA-N 0.000 description 2
- VSVZIEVNUYDAFR-YUMQZZPRSA-N Gly-Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)CN VSVZIEVNUYDAFR-YUMQZZPRSA-N 0.000 description 2
- JVWPPCWUDRJGAE-YUMQZZPRSA-N Gly-Asn-Leu Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JVWPPCWUDRJGAE-YUMQZZPRSA-N 0.000 description 2
- DHDOADIPGZTAHT-YUMQZZPRSA-N Gly-Glu-Arg Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DHDOADIPGZTAHT-YUMQZZPRSA-N 0.000 description 2
- QITBQGJOXQYMOA-ZETCQYMHSA-N Gly-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)CN QITBQGJOXQYMOA-ZETCQYMHSA-N 0.000 description 2
- HHSOPSCKAZKQHQ-PEXQALLHSA-N Gly-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)CN HHSOPSCKAZKQHQ-PEXQALLHSA-N 0.000 description 2
- UUWOBINZFGTFMS-UWVGGRQHSA-N Gly-His-Met Chemical compound [H]NCC(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCSC)C(O)=O UUWOBINZFGTFMS-UWVGGRQHSA-N 0.000 description 2
- SWQALSGKVLYKDT-UHFFFAOYSA-N Gly-Ile-Ala Natural products NCC(=O)NC(C(C)CC)C(=O)NC(C)C(O)=O SWQALSGKVLYKDT-UHFFFAOYSA-N 0.000 description 2
- UESJMAMHDLEHGM-NHCYSSNCSA-N Gly-Ile-Leu Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O UESJMAMHDLEHGM-NHCYSSNCSA-N 0.000 description 2
- XVYKMNXXJXQKME-XEGUGMAKSA-N Gly-Ile-Tyr Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 XVYKMNXXJXQKME-XEGUGMAKSA-N 0.000 description 2
- TWTPDFFBLQEBOE-IUCAKERBSA-N Gly-Leu-Gln Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O TWTPDFFBLQEBOE-IUCAKERBSA-N 0.000 description 2
- LLZXNUUIBOALNY-QWRGUYRKSA-N Gly-Leu-Lys Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCCN LLZXNUUIBOALNY-QWRGUYRKSA-N 0.000 description 2
- CLNSYANKYVMZNM-UWVGGRQHSA-N Gly-Lys-Arg Chemical compound NCCCC[C@H](NC(=O)CN)C(=O)N[C@H](C(O)=O)CCCN=C(N)N CLNSYANKYVMZNM-UWVGGRQHSA-N 0.000 description 2
- MHXKHKWHPNETGG-QWRGUYRKSA-N Gly-Lys-Leu Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O MHXKHKWHPNETGG-QWRGUYRKSA-N 0.000 description 2
- GGAPHLIUUTVYMX-QWRGUYRKSA-N Gly-Phe-Ser Chemical compound OC[C@@H](C([O-])=O)NC(=O)[C@@H](NC(=O)C[NH3+])CC1=CC=CC=C1 GGAPHLIUUTVYMX-QWRGUYRKSA-N 0.000 description 2
- FKYQEVBRZSFAMJ-QWRGUYRKSA-N Gly-Ser-Tyr Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 FKYQEVBRZSFAMJ-QWRGUYRKSA-N 0.000 description 2
- DBUNZBWUWCIELX-JHEQGTHGSA-N Gly-Thr-Glu Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O DBUNZBWUWCIELX-JHEQGTHGSA-N 0.000 description 2
- RCHFYMASWAZQQZ-ZANVPECISA-N Gly-Trp-Ala Chemical compound C1=CC=C2C(C[C@@H](C(=O)N[C@@H](C)C(O)=O)NC(=O)CN)=CNC2=C1 RCHFYMASWAZQQZ-ZANVPECISA-N 0.000 description 2
- RIYIFUFFFBIOEU-KBPBESRZSA-N Gly-Tyr-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=C(O)C=C1 RIYIFUFFFBIOEU-KBPBESRZSA-N 0.000 description 2
- GJHWILMUOANXTG-WPRPVWTQSA-N Gly-Val-Arg Chemical compound [H]NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O GJHWILMUOANXTG-WPRPVWTQSA-N 0.000 description 2
- 108010043121 Green Fluorescent Proteins Proteins 0.000 description 2
- 102000004144 Green Fluorescent Proteins Human genes 0.000 description 2
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 2
- VSLXGYMEHVAJBH-DLOVCJGASA-N His-Ala-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O VSLXGYMEHVAJBH-DLOVCJGASA-N 0.000 description 2
- SYMSVYVUSPSAAO-IHRRRGAJSA-N His-Arg-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O SYMSVYVUSPSAAO-IHRRRGAJSA-N 0.000 description 2
- MJICNEVRDVQXJH-WDSOQIARSA-N His-Arg-Trp Chemical compound N[C@@H](Cc1cnc[nH]1)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(O)=O MJICNEVRDVQXJH-WDSOQIARSA-N 0.000 description 2
- NNBWMLHQXBTIIT-HVTMNAMFSA-N His-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CN=CN1)N NNBWMLHQXBTIIT-HVTMNAMFSA-N 0.000 description 2
- VDHOMPFVSABJKU-ULQDDVLXSA-N His-Phe-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CC2=CN=CN2)N VDHOMPFVSABJKU-ULQDDVLXSA-N 0.000 description 2
- WYKXJGWSJUULSL-AVGNSLFASA-N His-Val-Arg Chemical compound CC(C)[C@H](NC(=O)[C@@H](N)Cc1cnc[nH]1)C(=O)N[C@@H](CCCNC(=N)N)C(=O)O WYKXJGWSJUULSL-AVGNSLFASA-N 0.000 description 2
- 102100023823 Homeobox protein EMX1 Human genes 0.000 description 2
- 101100026538 Homo sapiens GRIN2B gene Proteins 0.000 description 2
- HDOYNXLPTRQLAD-JBDRJPRFSA-N Ile-Ala-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(=O)O)N HDOYNXLPTRQLAD-JBDRJPRFSA-N 0.000 description 2
- MKWSZEHGHSLNPF-NAKRPEOUSA-N Ile-Ala-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)O)N MKWSZEHGHSLNPF-NAKRPEOUSA-N 0.000 description 2
- DMHGKBGOUAJRHU-UHFFFAOYSA-N Ile-Arg-Pro Natural products CCC(C)C(N)C(=O)NC(CCCN=C(N)N)C(=O)N1CCCC1C(O)=O DMHGKBGOUAJRHU-UHFFFAOYSA-N 0.000 description 2
- IDAHFEPYTJJZFD-PEFMBERDSA-N Ile-Asp-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N IDAHFEPYTJJZFD-PEFMBERDSA-N 0.000 description 2
- JDAWAWXGAUZPNJ-ZPFDUUQYSA-N Ile-Glu-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N JDAWAWXGAUZPNJ-ZPFDUUQYSA-N 0.000 description 2
- PNDMHTTXXPUQJH-RWRJDSDZSA-N Ile-Glu-Thr Chemical compound N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H]([C@H](O)C)C(=O)O PNDMHTTXXPUQJH-RWRJDSDZSA-N 0.000 description 2
- DFFTXLCCDFYRKD-MBLNEYKQSA-N Ile-Gly-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(=O)O)N DFFTXLCCDFYRKD-MBLNEYKQSA-N 0.000 description 2
- UWLHDGMRWXHFFY-HPCHECBXSA-N Ile-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N1CCC[C@@H]1C(=O)O)N UWLHDGMRWXHFFY-HPCHECBXSA-N 0.000 description 2
- DBXXASNNDTXOLU-MXAVVETBSA-N Ile-Leu-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N DBXXASNNDTXOLU-MXAVVETBSA-N 0.000 description 2
- TVYWVSJGSHQWMT-AJNGGQMLSA-N Ile-Leu-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N TVYWVSJGSHQWMT-AJNGGQMLSA-N 0.000 description 2
- ADDYYRVQQZFIMW-MNXVOIDGSA-N Ile-Lys-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ADDYYRVQQZFIMW-MNXVOIDGSA-N 0.000 description 2
- YSGBJIQXTIVBHZ-AJNGGQMLSA-N Ile-Lys-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O YSGBJIQXTIVBHZ-AJNGGQMLSA-N 0.000 description 2
- UAELWXJFLZBKQS-WHOFXGATSA-N Ile-Phe-Gly Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](Cc1ccccc1)C(=O)NCC(O)=O UAELWXJFLZBKQS-WHOFXGATSA-N 0.000 description 2
- LRAUKBMYHHNADU-DKIMLUQUSA-N Ile-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@@H](C)CC)CC1=CC=CC=C1 LRAUKBMYHHNADU-DKIMLUQUSA-N 0.000 description 2
- IITVUURPOYGCTD-NAKRPEOUSA-N Ile-Pro-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O IITVUURPOYGCTD-NAKRPEOUSA-N 0.000 description 2
- PRTZQMBYUZFSFA-XEGUGMAKSA-N Ile-Tyr-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)NCC(=O)O)N PRTZQMBYUZFSFA-XEGUGMAKSA-N 0.000 description 2
- HGCNKOLVKRAVHD-UHFFFAOYSA-N L-Met-L-Phe Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 HGCNKOLVKRAVHD-UHFFFAOYSA-N 0.000 description 2
- SITWEMZOJNKJCH-UHFFFAOYSA-N L-alanine-L-arginine Natural products CC(N)C(=O)NC(C(O)=O)CCCNC(N)=N SITWEMZOJNKJCH-UHFFFAOYSA-N 0.000 description 2
- RCFDOSNHHZGBOY-UHFFFAOYSA-N L-isoleucyl-L-alanine Natural products CCC(C)C(N)C(=O)NC(C)C(O)=O RCFDOSNHHZGBOY-UHFFFAOYSA-N 0.000 description 2
- KFKWRHQBZQICHA-STQMWFEESA-N L-leucyl-L-phenylalanine Natural products CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KFKWRHQBZQICHA-STQMWFEESA-N 0.000 description 2
- LZDNBBYBDGBADK-UHFFFAOYSA-N L-valyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C(C)C)C(O)=O)=CNC2=C1 LZDNBBYBDGBADK-UHFFFAOYSA-N 0.000 description 2
- CZCSUZMIRKFFFA-CIUDSAMLSA-N Leu-Ala-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O CZCSUZMIRKFFFA-CIUDSAMLSA-N 0.000 description 2
- CQQGCWPXDHTTNF-GUBZILKMSA-N Leu-Ala-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCC(O)=O CQQGCWPXDHTTNF-GUBZILKMSA-N 0.000 description 2
- UILIPCLTHRPCRB-XUXIUFHCSA-N Leu-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(C)C)N UILIPCLTHRPCRB-XUXIUFHCSA-N 0.000 description 2
- IGUOAYLTQJLPPD-DCAQKATOSA-N Leu-Asn-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IGUOAYLTQJLPPD-DCAQKATOSA-N 0.000 description 2
- DBVWMYGBVFCRBE-CIUDSAMLSA-N Leu-Asn-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O DBVWMYGBVFCRBE-CIUDSAMLSA-N 0.000 description 2
- VIWUBXKCYJGNCL-SRVKXCTJSA-N Leu-Asn-His Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 VIWUBXKCYJGNCL-SRVKXCTJSA-N 0.000 description 2
- ILJREDZFPHTUIE-GUBZILKMSA-N Leu-Asp-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O ILJREDZFPHTUIE-GUBZILKMSA-N 0.000 description 2
- DLCOFDAHNMMQPP-SRVKXCTJSA-N Leu-Asp-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O DLCOFDAHNMMQPP-SRVKXCTJSA-N 0.000 description 2
- KAFOIVJDVSZUMD-DCAQKATOSA-N Leu-Gln-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O KAFOIVJDVSZUMD-DCAQKATOSA-N 0.000 description 2
- QVFGXCVIXXBFHO-AVGNSLFASA-N Leu-Glu-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O QVFGXCVIXXBFHO-AVGNSLFASA-N 0.000 description 2
- HVJVUYQWFYMGJS-GVXVVHGQSA-N Leu-Glu-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O HVJVUYQWFYMGJS-GVXVVHGQSA-N 0.000 description 2
- OXRLYTYUXAQTHP-YUMQZZPRSA-N Leu-Gly-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](C)C(O)=O OXRLYTYUXAQTHP-YUMQZZPRSA-N 0.000 description 2
- FMEICTQWUKNAGC-YUMQZZPRSA-N Leu-Gly-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O FMEICTQWUKNAGC-YUMQZZPRSA-N 0.000 description 2
- HYIFFZAQXPUEAU-QWRGUYRKSA-N Leu-Gly-Leu Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(C)C HYIFFZAQXPUEAU-QWRGUYRKSA-N 0.000 description 2
- XBCWOTOCBXXJDG-BZSNNMDCSA-N Leu-His-Phe Chemical compound C([C@H](NC(=O)[C@@H](N)CC(C)C)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CN=CN1 XBCWOTOCBXXJDG-BZSNNMDCSA-N 0.000 description 2
- DSFYPIUSAMSERP-IHRRRGAJSA-N Leu-Leu-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DSFYPIUSAMSERP-IHRRRGAJSA-N 0.000 description 2
- KYIIALJHAOIAHF-KKUMJFAQSA-N Leu-Leu-His Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 KYIIALJHAOIAHF-KKUMJFAQSA-N 0.000 description 2
- LXKNSJLSGPNHSK-KKUMJFAQSA-N Leu-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N LXKNSJLSGPNHSK-KKUMJFAQSA-N 0.000 description 2
- XVZCXCTYGHPNEM-UHFFFAOYSA-N Leu-Leu-Pro Natural products CC(C)CC(N)C(=O)NC(CC(C)C)C(=O)N1CCCC1C(O)=O XVZCXCTYGHPNEM-UHFFFAOYSA-N 0.000 description 2
- BGZCJDGBBUUBHA-KKUMJFAQSA-N Leu-Lys-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O BGZCJDGBBUUBHA-KKUMJFAQSA-N 0.000 description 2
- KPYAOIVPJKPIOU-KKUMJFAQSA-N Leu-Lys-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(O)=O KPYAOIVPJKPIOU-KKUMJFAQSA-N 0.000 description 2
- VVQJGYPTIYOFBR-IHRRRGAJSA-N Leu-Lys-Met Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(=O)O)N VVQJGYPTIYOFBR-IHRRRGAJSA-N 0.000 description 2
- KQFZKDITNUEVFJ-JYJNAYRXSA-N Leu-Phe-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(C)C)CC1=CC=CC=C1 KQFZKDITNUEVFJ-JYJNAYRXSA-N 0.000 description 2
- VULJUQZPSOASBZ-SRVKXCTJSA-N Leu-Pro-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O VULJUQZPSOASBZ-SRVKXCTJSA-N 0.000 description 2
- KWLWZYMNUZJKMZ-IHRRRGAJSA-N Leu-Pro-Leu Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(O)=O KWLWZYMNUZJKMZ-IHRRRGAJSA-N 0.000 description 2
- PWPBLZXWFXJFHE-RHYQMDGZSA-N Leu-Pro-Thr Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O PWPBLZXWFXJFHE-RHYQMDGZSA-N 0.000 description 2
- RGUXWMDNCPMQFB-YUMQZZPRSA-N Leu-Ser-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RGUXWMDNCPMQFB-YUMQZZPRSA-N 0.000 description 2
- XOWMDXHFSBCAKQ-SRVKXCTJSA-N Leu-Ser-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC(C)C XOWMDXHFSBCAKQ-SRVKXCTJSA-N 0.000 description 2
- AMSSKPUHBUQBOQ-SRVKXCTJSA-N Leu-Ser-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N AMSSKPUHBUQBOQ-SRVKXCTJSA-N 0.000 description 2
- LCNASHSOFMRYFO-WDCWCFNPSA-N Leu-Thr-Gln Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCC(N)=O LCNASHSOFMRYFO-WDCWCFNPSA-N 0.000 description 2
- BTEMNFBEAAOGBR-BZSNNMDCSA-N Leu-Tyr-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CCCCN)C(=O)O)N BTEMNFBEAAOGBR-BZSNNMDCSA-N 0.000 description 2
- VUBIPAHVHMZHCM-KKUMJFAQSA-N Leu-Tyr-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CC=C(O)C=C1 VUBIPAHVHMZHCM-KKUMJFAQSA-N 0.000 description 2
- FBNPMTNBFFAMMH-AVGNSLFASA-N Leu-Val-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-AVGNSLFASA-N 0.000 description 2
- FBNPMTNBFFAMMH-UHFFFAOYSA-N Leu-Val-Arg Natural products CC(C)CC(N)C(=O)NC(C(C)C)C(=O)NC(C(O)=O)CCCN=C(N)N FBNPMTNBFFAMMH-UHFFFAOYSA-N 0.000 description 2
- KCXUCYYZNZFGLL-SRVKXCTJSA-N Lys-Ala-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O KCXUCYYZNZFGLL-SRVKXCTJSA-N 0.000 description 2
- NTEVEUCLFMWSND-SRVKXCTJSA-N Lys-Arg-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O NTEVEUCLFMWSND-SRVKXCTJSA-N 0.000 description 2
- YNNPKXBBRZVIRX-IHRRRGAJSA-N Lys-Arg-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O YNNPKXBBRZVIRX-IHRRRGAJSA-N 0.000 description 2
- QYOXSYXPHUHOJR-GUBZILKMSA-N Lys-Asn-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O QYOXSYXPHUHOJR-GUBZILKMSA-N 0.000 description 2
- DEFGUIIUYAUEDU-ZPFDUUQYSA-N Lys-Asn-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O DEFGUIIUYAUEDU-ZPFDUUQYSA-N 0.000 description 2
- WGCKDDHUFPQSMZ-ZPFDUUQYSA-N Lys-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCCCN WGCKDDHUFPQSMZ-ZPFDUUQYSA-N 0.000 description 2
- XFBBBRDEQIPGNR-KATARQTJSA-N Lys-Cys-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CCCCN)N)O XFBBBRDEQIPGNR-KATARQTJSA-N 0.000 description 2
- WTZUSCUIVPVCRH-SRVKXCTJSA-N Lys-Gln-Arg Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N WTZUSCUIVPVCRH-SRVKXCTJSA-N 0.000 description 2
- ZXEUFAVXODIPHC-GUBZILKMSA-N Lys-Glu-Asn Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ZXEUFAVXODIPHC-GUBZILKMSA-N 0.000 description 2
- VQXAVLQBQJMENB-SRVKXCTJSA-N Lys-Glu-Met Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(O)=O VQXAVLQBQJMENB-SRVKXCTJSA-N 0.000 description 2
- MUXNCRWTWBMNHX-SRVKXCTJSA-N Lys-Leu-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O MUXNCRWTWBMNHX-SRVKXCTJSA-N 0.000 description 2
- SKRGVGLIRUGANF-AVGNSLFASA-N Lys-Leu-Glu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O SKRGVGLIRUGANF-AVGNSLFASA-N 0.000 description 2
- YUAXTFMFMOIMAM-QWRGUYRKSA-N Lys-Lys-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O YUAXTFMFMOIMAM-QWRGUYRKSA-N 0.000 description 2
- YDDDRTIPNTWGIG-SRVKXCTJSA-N Lys-Lys-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O YDDDRTIPNTWGIG-SRVKXCTJSA-N 0.000 description 2
- PLDJDCJLRCYPJB-VOAKCMCISA-N Lys-Lys-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PLDJDCJLRCYPJB-VOAKCMCISA-N 0.000 description 2
- HKXSZKJMDBHOTG-CIUDSAMLSA-N Lys-Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CCCCN HKXSZKJMDBHOTG-CIUDSAMLSA-N 0.000 description 2
- SBQDRNOLGSYHQA-YUMQZZPRSA-N Lys-Ser-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)NCC(O)=O SBQDRNOLGSYHQA-YUMQZZPRSA-N 0.000 description 2
- ZUGVARDEGWMMLK-SRVKXCTJSA-N Lys-Ser-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN ZUGVARDEGWMMLK-SRVKXCTJSA-N 0.000 description 2
- VHTOGMKQXXJOHG-RHYQMDGZSA-N Lys-Thr-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O VHTOGMKQXXJOHG-RHYQMDGZSA-N 0.000 description 2
- OHXUUQDOBQKSNB-AVGNSLFASA-N Lys-Val-Arg Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O OHXUUQDOBQKSNB-AVGNSLFASA-N 0.000 description 2
- OZVXDDFYCQOPFD-XQQFMLRXSA-N Lys-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CCCCN)N OZVXDDFYCQOPFD-XQQFMLRXSA-N 0.000 description 2
- HUKLXYYPZWPXCC-KZVJFYERSA-N Met-Ala-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O HUKLXYYPZWPXCC-KZVJFYERSA-N 0.000 description 2
- GODBLDDYHFTUAH-CIUDSAMLSA-N Met-Asp-Glu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(O)=O GODBLDDYHFTUAH-CIUDSAMLSA-N 0.000 description 2
- MYKLINMAGAIRPJ-CIUDSAMLSA-N Met-Gln-Asn Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O MYKLINMAGAIRPJ-CIUDSAMLSA-N 0.000 description 2
- SJDQOYTYNGZZJX-SRVKXCTJSA-N Met-Glu-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O SJDQOYTYNGZZJX-SRVKXCTJSA-N 0.000 description 2
- HZVXPUHLTZRQEL-UWVGGRQHSA-N Met-Leu-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O HZVXPUHLTZRQEL-UWVGGRQHSA-N 0.000 description 2
- SODXFJOPSCXOHE-IHRRRGAJSA-N Met-Leu-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O SODXFJOPSCXOHE-IHRRRGAJSA-N 0.000 description 2
- 241001465754 Metazoa Species 0.000 description 2
- WYBVBIHNJWOLCJ-UHFFFAOYSA-N N-L-arginyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCCN=C(N)N WYBVBIHNJWOLCJ-UHFFFAOYSA-N 0.000 description 2
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 2
- 108010066427 N-valyltryptophan Proteins 0.000 description 2
- CYZBFPYMSJGBRL-DRZSPHRISA-N Phe-Ala-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O CYZBFPYMSJGBRL-DRZSPHRISA-N 0.000 description 2
- HHOOEUSPFGPZFP-QWRGUYRKSA-N Phe-Asn-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O HHOOEUSPFGPZFP-QWRGUYRKSA-N 0.000 description 2
- WIVCOAKLPICYGY-KKUMJFAQSA-N Phe-Asp-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N WIVCOAKLPICYGY-KKUMJFAQSA-N 0.000 description 2
- GDBOREPXIRKSEQ-FHWLQOOXSA-N Phe-Gln-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O GDBOREPXIRKSEQ-FHWLQOOXSA-N 0.000 description 2
- CSDMCMITJLKBAH-SOUVJXGZSA-N Phe-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N)C(=O)O CSDMCMITJLKBAH-SOUVJXGZSA-N 0.000 description 2
- MIICYIIBVYQNKE-QEWYBTABSA-N Phe-Ile-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N MIICYIIBVYQNKE-QEWYBTABSA-N 0.000 description 2
- GZGPMBKUJDRICD-ULQDDVLXSA-N Phe-Pro-His Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)N)C(=O)N[C@@H](CC3=CN=CN3)C(=O)O GZGPMBKUJDRICD-ULQDDVLXSA-N 0.000 description 2
- NHHZWPNMYQUNEH-ACRUOGEOSA-N Phe-Tyr-His Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)N[C@@H](CC3=CN=CN3)C(=O)O)N NHHZWPNMYQUNEH-ACRUOGEOSA-N 0.000 description 2
- XALFIVXGQUEGKV-JSGCOSHPSA-N Phe-Val-Gly Chemical compound OC(=O)CNC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 XALFIVXGQUEGKV-JSGCOSHPSA-N 0.000 description 2
- IEIFEYBAYFSRBQ-IHRRRGAJSA-N Phe-Val-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N IEIFEYBAYFSRBQ-IHRRRGAJSA-N 0.000 description 2
- IWNOFCGBMSFTBC-CIUDSAMLSA-N Pro-Ala-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O IWNOFCGBMSFTBC-CIUDSAMLSA-N 0.000 description 2
- VPVHXWGPALPDGP-GUBZILKMSA-N Pro-Asn-Arg Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O VPVHXWGPALPDGP-GUBZILKMSA-N 0.000 description 2
- BBFRBZYKHIKFBX-GMOBBJLQSA-N Pro-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@@H]1CCCN1 BBFRBZYKHIKFBX-GMOBBJLQSA-N 0.000 description 2
- HOTVCUAVDQHUDB-UFYCRDLUSA-N Pro-Phe-Tyr Chemical compound C([C@@H](C(=O)O)NC(=O)[C@H](CC=1C=CC=CC=1)NC(=O)[C@H]1NCCC1)C1=CC=C(O)C=C1 HOTVCUAVDQHUDB-UFYCRDLUSA-N 0.000 description 2
- SNGZLPOXVRTNMB-LPEHRKFASA-N Pro-Ser-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CO)C(=O)N2CCC[C@@H]2C(=O)O SNGZLPOXVRTNMB-LPEHRKFASA-N 0.000 description 2
- KDCGOANMDULRCW-UHFFFAOYSA-N Purine Natural products N1=CNC2=NC=NC2=C1 KDCGOANMDULRCW-UHFFFAOYSA-N 0.000 description 2
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 2
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 2
- 102000004389 Ribonucleoproteins Human genes 0.000 description 2
- 108010081734 Ribonucleoproteins Proteins 0.000 description 2
- 238000012300 Sequence Analysis Methods 0.000 description 2
- VAUMZJHYZQXZBQ-WHFBIAKZSA-N Ser-Asn-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O VAUMZJHYZQXZBQ-WHFBIAKZSA-N 0.000 description 2
- RNMRYWZYFHHOEV-CIUDSAMLSA-N Ser-Gln-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O RNMRYWZYFHHOEV-CIUDSAMLSA-N 0.000 description 2
- SQBLRDDJTUJDMV-ACZMJKKPSA-N Ser-Glu-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O SQBLRDDJTUJDMV-ACZMJKKPSA-N 0.000 description 2
- WBINSDOPZHQPPM-AVGNSLFASA-N Ser-Glu-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CO)N)O WBINSDOPZHQPPM-AVGNSLFASA-N 0.000 description 2
- BKZYBLLIBOBOOW-GHCJXIJMSA-N Ser-Ile-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O BKZYBLLIBOBOOW-GHCJXIJMSA-N 0.000 description 2
- IAORETPTUDBBGV-CIUDSAMLSA-N Ser-Leu-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CO)N IAORETPTUDBBGV-CIUDSAMLSA-N 0.000 description 2
- UBRMZSHOOIVJPW-SRVKXCTJSA-N Ser-Leu-Lys Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O UBRMZSHOOIVJPW-SRVKXCTJSA-N 0.000 description 2
- LRZLZIUXQBIWTB-KATARQTJSA-N Ser-Lys-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LRZLZIUXQBIWTB-KATARQTJSA-N 0.000 description 2
- GDUZTEQRAOXYJS-SRVKXCTJSA-N Ser-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CO)N GDUZTEQRAOXYJS-SRVKXCTJSA-N 0.000 description 2
- GYDFRTRSSXOZCR-ACZMJKKPSA-N Ser-Ser-Glu Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCC(O)=O GYDFRTRSSXOZCR-ACZMJKKPSA-N 0.000 description 2
- VLMIUSLQONKLDV-HEIBUPTGSA-N Ser-Thr-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VLMIUSLQONKLDV-HEIBUPTGSA-N 0.000 description 2
- VEVYMLNYMULSMS-AVGNSLFASA-N Ser-Tyr-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O VEVYMLNYMULSMS-AVGNSLFASA-N 0.000 description 2
- OQSQCUWQOIHECT-YJRXYDGGSA-N Ser-Tyr-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OQSQCUWQOIHECT-YJRXYDGGSA-N 0.000 description 2
- YEDSOSIKVUMIJE-DCAQKATOSA-N Ser-Val-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O YEDSOSIKVUMIJE-DCAQKATOSA-N 0.000 description 2
- 239000012505 Superdex™ Substances 0.000 description 2
- VFEHSAJCWWHDBH-RHYQMDGZSA-N Thr-Arg-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O VFEHSAJCWWHDBH-RHYQMDGZSA-N 0.000 description 2
- NAXBBCLCEOTAIG-RHYQMDGZSA-N Thr-Arg-Lys Chemical compound NC(N)=NCCC[C@H](NC(=O)[C@@H](N)[C@H](O)C)C(=O)N[C@@H](CCCCN)C(O)=O NAXBBCLCEOTAIG-RHYQMDGZSA-N 0.000 description 2
- YOSLMIPKOUAHKI-OLHMAJIHSA-N Thr-Asp-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O YOSLMIPKOUAHKI-OLHMAJIHSA-N 0.000 description 2
- XOTBWOCSLMBGMF-SUSMZKCASA-N Thr-Glu-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XOTBWOCSLMBGMF-SUSMZKCASA-N 0.000 description 2
- VYEHBMMAJFVTOI-JHEQGTHGSA-N Thr-Gly-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(O)=O VYEHBMMAJFVTOI-JHEQGTHGSA-N 0.000 description 2
- AQAMPXBRJJWPNI-JHEQGTHGSA-N Thr-Gly-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O AQAMPXBRJJWPNI-JHEQGTHGSA-N 0.000 description 2
- SXAGUVRFGJSFKC-ZEILLAHLSA-N Thr-His-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SXAGUVRFGJSFKC-ZEILLAHLSA-N 0.000 description 2
- RFKVQLIXNVEOMB-WEDXCCLWSA-N Thr-Leu-Gly Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)NCC(=O)O)N)O RFKVQLIXNVEOMB-WEDXCCLWSA-N 0.000 description 2
- FLPZMPOZGYPBEN-PPCPHDFISA-N Thr-Leu-Ile Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O FLPZMPOZGYPBEN-PPCPHDFISA-N 0.000 description 2
- VRUFCJZQDACGLH-UVOCVTCTSA-N Thr-Leu-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VRUFCJZQDACGLH-UVOCVTCTSA-N 0.000 description 2
- YGZWVPBHYABGLT-KJEVXHAQSA-N Thr-Pro-Tyr Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 YGZWVPBHYABGLT-KJEVXHAQSA-N 0.000 description 2
- KHTIUAKJRUIEMA-HOUAVDHOSA-N Thr-Trp-Asp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)[C@H](O)C)C(=O)N[C@@H](CC(O)=O)C(O)=O)=CNC2=C1 KHTIUAKJRUIEMA-HOUAVDHOSA-N 0.000 description 2
- DVAAUUVLDFKTAQ-VHWLVUOQSA-N Trp-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N DVAAUUVLDFKTAQ-VHWLVUOQSA-N 0.000 description 2
- UKWSFUSPGPBJGU-VFAJRCTISA-N Trp-Leu-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N)O UKWSFUSPGPBJGU-VFAJRCTISA-N 0.000 description 2
- SGFIXFAHVWJKTD-KJEVXHAQSA-N Tyr-Arg-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SGFIXFAHVWJKTD-KJEVXHAQSA-N 0.000 description 2
- KEHKBBUYZWAMHL-DZKIICNBSA-N Tyr-Gln-Val Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O KEHKBBUYZWAMHL-DZKIICNBSA-N 0.000 description 2
- DWAMXBFJNZIHMC-KBPBESRZSA-N Tyr-Leu-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O DWAMXBFJNZIHMC-KBPBESRZSA-N 0.000 description 2
- SCZJKZLFSSPJDP-ACRUOGEOSA-N Tyr-Phe-Leu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(O)=O SCZJKZLFSSPJDP-ACRUOGEOSA-N 0.000 description 2
- AKKYBQGHUAWPJR-MNSWYVGCSA-N Tyr-Thr-Trp Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CC3=CC=C(C=C3)O)N)O AKKYBQGHUAWPJR-MNSWYVGCSA-N 0.000 description 2
- RGJZPXFZIUUQDN-BPNCWPANSA-N Tyr-Val-Ala Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(O)=O RGJZPXFZIUUQDN-BPNCWPANSA-N 0.000 description 2
- PQPWEALFTLKSEB-DZKIICNBSA-N Tyr-Val-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O PQPWEALFTLKSEB-DZKIICNBSA-N 0.000 description 2
- UBTBGUDNDFZLGP-SRVKXCTJSA-N Val-Arg-Val Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](C(C)C)C(=O)O)N UBTBGUDNDFZLGP-SRVKXCTJSA-N 0.000 description 2
- ZQGPWORGSNRQLN-NHCYSSNCSA-N Val-Asp-His Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N ZQGPWORGSNRQLN-NHCYSSNCSA-N 0.000 description 2
- SZTTYWIUCGSURQ-AUTRQRHGSA-N Val-Glu-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SZTTYWIUCGSURQ-AUTRQRHGSA-N 0.000 description 2
- VCAWFLIWYNMHQP-UKJIMTQDSA-N Val-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](C(C)C)N VCAWFLIWYNMHQP-UKJIMTQDSA-N 0.000 description 2
- OTJMMKPMLUNTQT-AVGNSLFASA-N Val-Leu-Arg Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)NC(=O)[C@H](C(C)C)N OTJMMKPMLUNTQT-AVGNSLFASA-N 0.000 description 2
- FEXILLGKGGTLRI-NHCYSSNCSA-N Val-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N FEXILLGKGGTLRI-NHCYSSNCSA-N 0.000 description 2
- RWOGENDAOGMHLX-DCAQKATOSA-N Val-Lys-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](C(C)C)N RWOGENDAOGMHLX-DCAQKATOSA-N 0.000 description 2
- ZRSZTKTVPNSUNA-IHRRRGAJSA-N Val-Lys-Leu Chemical compound CC(C)C[C@H](NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)C(C)C)C(O)=O ZRSZTKTVPNSUNA-IHRRRGAJSA-N 0.000 description 2
- LJSZPMSUYKKKCP-UBHSHLNASA-N Val-Phe-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CC1=CC=CC=C1 LJSZPMSUYKKKCP-UBHSHLNASA-N 0.000 description 2
- KISFXYYRKKNLOP-IHRRRGAJSA-N Val-Phe-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)O)N KISFXYYRKKNLOP-IHRRRGAJSA-N 0.000 description 2
- USLVEJAHTBLSIL-CYDGBPFRSA-N Val-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)C(C)C USLVEJAHTBLSIL-CYDGBPFRSA-N 0.000 description 2
- DEGUERSKQBRZMZ-FXQIFTODSA-N Val-Ser-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O DEGUERSKQBRZMZ-FXQIFTODSA-N 0.000 description 2
- GVNLOVJNNDZUHS-RHYQMDGZSA-N Val-Thr-Lys Chemical compound [H]N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(O)=O GVNLOVJNNDZUHS-RHYQMDGZSA-N 0.000 description 2
- NGXQOQNXSGOYOI-BQFCYCMXSA-N Val-Trp-Gln Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O)=CNC2=C1 NGXQOQNXSGOYOI-BQFCYCMXSA-N 0.000 description 2
- NLNCNKIVJPEFBC-DLOVCJGASA-N Val-Val-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCC(O)=O NLNCNKIVJPEFBC-DLOVCJGASA-N 0.000 description 2
- 229960000643 adenine Drugs 0.000 description 2
- 238000001042 affinity chromatography Methods 0.000 description 2
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 2
- 150000001413 amino acids Chemical class 0.000 description 2
- 108010069926 arginyl-glycyl-serine Proteins 0.000 description 2
- 108010007483 arginyl-leucyl-tyrosyl-glutamic acid Proteins 0.000 description 2
- 108010084758 arginyl-tyrosyl-aspartic acid Proteins 0.000 description 2
- 108010077245 asparaginyl-proline Proteins 0.000 description 2
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 2
- 108010092854 aspartyllysine Proteins 0.000 description 2
- 238000003766 bioinformatics method Methods 0.000 description 2
- 238000004587 chromatography analysis Methods 0.000 description 2
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical compound NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 2
- 230000003247 decreasing effect Effects 0.000 description 2
- 108010054813 diprotin B Proteins 0.000 description 2
- 238000001962 electrophoresis Methods 0.000 description 2
- 108010063718 gamma-glutamylaspartic acid Proteins 0.000 description 2
- 238000010363 gene targeting Methods 0.000 description 2
- 108010008237 glutamyl-valyl-glycine Proteins 0.000 description 2
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 2
- 108010000434 glycyl-alanyl-leucine Proteins 0.000 description 2
- 108010010147 glycylglutamine Proteins 0.000 description 2
- 108010087823 glycyltyrosine Proteins 0.000 description 2
- 108010037850 glycylvaline Proteins 0.000 description 2
- 239000005090 green fluorescent protein Substances 0.000 description 2
- 108010040030 histidinoalanine Proteins 0.000 description 2
- 108010028295 histidylhistidine Proteins 0.000 description 2
- 108010025306 histidylleucine Proteins 0.000 description 2
- 108010092114 histidylphenylalanine Proteins 0.000 description 2
- 108010018006 histidylserine Proteins 0.000 description 2
- 230000007062 hydrolysis Effects 0.000 description 2
- 238000006460 hydrolysis reaction Methods 0.000 description 2
- 238000001727 in vivo Methods 0.000 description 2
- 108010091871 leucylmethionine Proteins 0.000 description 2
- 108010057821 leucylproline Proteins 0.000 description 2
- 108010003700 lysyl aspartic acid Proteins 0.000 description 2
- 108010076718 lysyl-glutamyl-tryptophan Proteins 0.000 description 2
- 108010038320 lysylphenylalanine Proteins 0.000 description 2
- 239000003550 marker Substances 0.000 description 2
- 108010056582 methionylglutamic acid Proteins 0.000 description 2
- 108010068488 methionylphenylalanine Proteins 0.000 description 2
- 238000000520 microinjection Methods 0.000 description 2
- 238000005457 optimization Methods 0.000 description 2
- 239000008188 pellet Substances 0.000 description 2
- 108010064486 phenylalanyl-leucyl-valine Proteins 0.000 description 2
- 108010031719 prolyl-serine Proteins 0.000 description 2
- 108010090894 prolylleucine Proteins 0.000 description 2
- 125000000561 purinyl group Chemical group N1=C(N=C2N=CNC2=C1)* 0.000 description 2
- 108010026333 seryl-proline Proteins 0.000 description 2
- 239000011780 sodium chloride Substances 0.000 description 2
- 230000008685 targeting Effects 0.000 description 2
- RWQNBRDOKXIBIV-UHFFFAOYSA-N thymine Chemical compound CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 description 2
- 238000013518 transcription Methods 0.000 description 2
- 230000035897 transcription Effects 0.000 description 2
- 238000001890 transfection Methods 0.000 description 2
- 108091032955 Bacterial small RNA Proteins 0.000 description 1
- 238000010453 CRISPR/Cas method Methods 0.000 description 1
- 108091026890 Coding region Proteins 0.000 description 1
- 230000008836 DNA modification Effects 0.000 description 1
- 241000588724 Escherichia coli Species 0.000 description 1
- 101150066002 GFP gene Proteins 0.000 description 1
- 102100022630 Glutamate receptor ionotropic, NMDA 2B Human genes 0.000 description 1
- 101150022990 Grin2b gene Proteins 0.000 description 1
- 101000972850 Homo sapiens Glutamate receptor ionotropic, NMDA 2B Proteins 0.000 description 1
- 108010015268 Integration Host Factors Proteins 0.000 description 1
- 239000012097 Lipofectamine 2000 Substances 0.000 description 1
- 241000124008 Mammalia Species 0.000 description 1
- 102000007474 Multiprotein Complexes Human genes 0.000 description 1
- 108010085220 Multiprotein Complexes Proteins 0.000 description 1
- 102000016943 Muramidase Human genes 0.000 description 1
- 108010014251 Muramidase Proteins 0.000 description 1
- 101100219625 Mus musculus Casd1 gene Proteins 0.000 description 1
- 108010062010 N-Acetylmuramoyl-L-alanine Amidase Proteins 0.000 description 1
- 108010054200 NR2B NMDA receptor Proteins 0.000 description 1
- 230000006819 RNA synthesis Effects 0.000 description 1
- 108091081062 Repeated sequence (DNA) Proteins 0.000 description 1
- 102000003661 Ribonuclease III Human genes 0.000 description 1
- 108010057163 Ribonuclease III Proteins 0.000 description 1
- 241000283984 Rodentia Species 0.000 description 1
- 108020004682 Single-Stranded DNA Proteins 0.000 description 1
- 238000010459 TALEN Methods 0.000 description 1
- 108010043645 Transcription Activator-Like Effector Nucleases Proteins 0.000 description 1
- 241000700605 Viruses Species 0.000 description 1
- HCHKCACWOHOZIP-UHFFFAOYSA-N Zinc Chemical compound [Zn] HCHKCACWOHOZIP-UHFFFAOYSA-N 0.000 description 1
- 230000006978 adaptation Effects 0.000 description 1
- 238000000246 agarose gel electrophoresis Methods 0.000 description 1
- 230000000840 anti-viral effect Effects 0.000 description 1
- 239000000427 antigen Substances 0.000 description 1
- 102000036639 antigens Human genes 0.000 description 1
- 108091007433 antigens Proteins 0.000 description 1
- 229940121357 antivirals Drugs 0.000 description 1
- 238000003556 assay Methods 0.000 description 1
- 238000007622 bioinformatic analysis Methods 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 101150055766 cat gene Proteins 0.000 description 1
- 238000004113 cell culture Methods 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 210000000349 chromosome Anatomy 0.000 description 1
- 239000013068 control sample Substances 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 210000000805 cytoplasm Anatomy 0.000 description 1
- 229940104302 cytosine Drugs 0.000 description 1
- 238000007405 data analysis Methods 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- 201000010099 disease Diseases 0.000 description 1
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 1
- 108010025678 empty spiracles homeobox proteins Proteins 0.000 description 1
- 230000007717 exclusion Effects 0.000 description 1
- 230000005714 functional activity Effects 0.000 description 1
- 238000010230 functional analysis Methods 0.000 description 1
- 230000002538 fungal effect Effects 0.000 description 1
- 239000000499 gel Substances 0.000 description 1
- 238000001502 gel electrophoresis Methods 0.000 description 1
- 238000002523 gelfiltration Methods 0.000 description 1
- 238000012239 gene modification Methods 0.000 description 1
- 238000001415 gene therapy Methods 0.000 description 1
- 230000002068 genetic effect Effects 0.000 description 1
- 238000010353 genetic engineering Methods 0.000 description 1
- 230000005017 genetic modification Effects 0.000 description 1
- 235000013617 genetically modified food Nutrition 0.000 description 1
- 238000010362 genome editing Methods 0.000 description 1
- 238000009396 hybridization Methods 0.000 description 1
- 210000000987 immune system Anatomy 0.000 description 1
- 230000008676 import Effects 0.000 description 1
- 230000010354 integration Effects 0.000 description 1
- BPHPUYQFMNQIOC-NXRLNHOXSA-N isopropyl beta-D-thiogalactopyranoside Chemical compound CC(C)S[C@@H]1O[C@H](CO)[C@H](O)[C@H](O)[C@H]1O BPHPUYQFMNQIOC-NXRLNHOXSA-N 0.000 description 1
- 210000004072 lung Anatomy 0.000 description 1
- 239000006166 lysate Substances 0.000 description 1
- 229960000274 lysozyme Drugs 0.000 description 1
- 235000010335 lysozyme Nutrition 0.000 description 1
- 239000004325 lysozyme Substances 0.000 description 1
- 238000013507 mapping Methods 0.000 description 1
- 230000007246 mechanism Effects 0.000 description 1
- 108020004999 messenger RNA Proteins 0.000 description 1
- 230000000813 microbial effect Effects 0.000 description 1
- 238000010369 molecular cloning Methods 0.000 description 1
- 230000035772 mutation Effects 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 230000008520 organization Effects 0.000 description 1
- 244000052769 pathogen Species 0.000 description 1
- 230000001717 pathogenic effect Effects 0.000 description 1
- 239000013600 plasmid vector Substances 0.000 description 1
- 239000002243 precursor Substances 0.000 description 1
- 238000004321 preservation Methods 0.000 description 1
- 230000008569 process Effects 0.000 description 1
- 230000004952 protein activity Effects 0.000 description 1
- 230000004853 protein function Effects 0.000 description 1
- 238000000746 purification Methods 0.000 description 1
- 125000000714 pyrimidinyl group Chemical group 0.000 description 1
- 239000011541 reaction mixture Substances 0.000 description 1
- 230000001105 regulatory effect Effects 0.000 description 1
- 230000008439 repair process Effects 0.000 description 1
- 230000010076 replication Effects 0.000 description 1
- 238000002864 sequence alignment Methods 0.000 description 1
- 230000003007 single stranded DNA break Effects 0.000 description 1
- 238000000527 sonication Methods 0.000 description 1
- 239000006228 supernatant Substances 0.000 description 1
- 229940113082 thymine Drugs 0.000 description 1
- 230000009466 transformation Effects 0.000 description 1
- 230000001131 transforming effect Effects 0.000 description 1
- 238000013519 translation Methods 0.000 description 1
- 241001515965 unidentified phage Species 0.000 description 1
- 239000013598 vector Substances 0.000 description 1
- 238000012795 verification Methods 0.000 description 1
- 230000003612 virological effect Effects 0.000 description 1
- 239000011701 zinc Substances 0.000 description 1
- 229910052725 zinc Inorganic materials 0.000 description 1
- DGVVWUTYPXICAM-UHFFFAOYSA-N β‐Mercaptoethanol Chemical compound OCCS DGVVWUTYPXICAM-UHFFFAOYSA-N 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases RNAses, DNAses
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/102—Mutagenizing nucleic acids
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6806—Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/113—Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex- forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/10—Type of nucleic acid
- C12N2310/20—Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y301/00—Hydrolases acting on ester bonds (3.1)
Landscapes
- Chemical & Material Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Engineering & Computer Science (AREA)
- Genetics & Genomics (AREA)
- Organic Chemistry (AREA)
- Wood Science & Technology (AREA)
- Zoology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- Biotechnology (AREA)
- Molecular Biology (AREA)
- Biomedical Technology (AREA)
- Biochemistry (AREA)
- General Health & Medical Sciences (AREA)
- Microbiology (AREA)
- Biophysics (AREA)
- Physics & Mathematics (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Plant Pathology (AREA)
- Analytical Chemistry (AREA)
- Crystallography & Structural Chemistry (AREA)
- Medicinal Chemistry (AREA)
- Chemical Kinetics & Catalysis (AREA)
- Immunology (AREA)
- Gastroenterology & Hepatology (AREA)
- Enzymes And Modification Thereof (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
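The present invention describes a novel bacterial nuclease of the CRISPR-Cas9 system from the bacterium Pasteurella pneumotropica (P. pneumotropica), and the use of this nuclease to produce strictly specific double-strand breaks in DNA molecules. The nuclease has unusual properties and can be used to alter the genomic DNA sequence in cells of unicellular or multicellular organisms. The invention thus increases the versatility of the available CRISPR-Cas9 systems, making it possible to use different Cas9 nuclease variants to cleave genomic or plasmid DNA in various organisms at a larger number of specific sites and/or under different conditions.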
Description
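Technical Field
The present invention relates to biotechnology, specifically to novel enzymes, Cas nucleases of CRISPR-Cas systems, that are used to cleave DNA and to edit the genomes of various organisms. In the future, this technology may be used for gene therapy of hereditary human diseases and for editing the genomes of other organisms.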
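Background
Modification of DNA sequences is one of the topical problems in modern biotechnology. Editing and modifying the genomes of eukaryotic and prokaryotic organisms, as well as manipulating DNA in vitro, require the targeted introduction of double-strand breaks into a DNA sequence.
To solve this problem, the following techniques are currently used: artificial nuclease systems containing zinc-finger domains, TALEN systems, and bacterial CRISPR-Cas systems. The first two techniques require laborious optimization of the nuclease amino acid sequence for recognition of a specific DNA sequence. In CRISPR-Cas systems, by contrast, the structure that recognizes the DNA target is not a protein but a short guide RNA. Cleaving a particular DNA target does not require de novo synthesis of the nuclease or its gene; it is achieved by using a guide RNA complementary to the target sequence. This makes CRISPR-Cas systems a convenient and efficient means of cleaving various DNA sequences. The technique allows DNA to be cleaved at several regions simultaneously by using guide RNAs of different sequences. This approach is also used to modify several genes simultaneously in eukaryotic organisms.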
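By their nature, CRISPR-Cas systems are prokaryotic immune systems capable of introducing highly specific breaks into viral genetic material (Mojica F. J. M. et al., Intervening sequences of regularly spaced prokaryotic repeats derive from foreign genetic elements // Journal of Molecular Evolution. 2005. Vol. 60, No. 2, pp. 174-182). The abbreviation CRISPR-Cas stands for "Clustered Regularly Interspaced Short Palindromic Repeats and CRISPR-associated genes" (Jansen R. et al., Identification of genes that are associated with DNA repeats in prokaryotes // Molecular Microbiology. 2002. Vol. 43, No. 6, pp. 1565-1575). All CRISPR-Cas systems consist of CRISPR cassettes and genes encoding various Cas proteins (Jansen R. et al., Molecular Microbiology. 2002. Vol. 43, No. 6, pp. 1565-1575). CRISPR cassettes consist of spacers, each with a unique nucleotide sequence, separated by repeated palindromic repeats (Jansen R. et al., Molecular Microbiology. 2002. Vol. 43, No. 6, pp. 1565-1575). Transcription of a CRISPR cassette followed by processing of the transcript yields guide crRNAs, which together with Cas proteins form the effector complex (Brouns S. J. J. et al., Small CRISPR RNAs guide antiviral defense in prokaryotes // Science. 2008. Vol. 321, No. 5891, pp. 960-964). Owing to complementary pairing between the crRNA and the target DNA site, called the protospacer, the Cas nuclease recognizes the DNA target and introduces a break into it with high specificity.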
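Depending on the Cas proteins included in the system, CRISPR-Cas systems with a single effector protein are grouped into six different types (types I-VI). In 2013, the use of the type II CRISPR-Cas9 system for editing the genomic DNA of human cells was proposed for the first time (Cong L. et al., Multiplex genome engineering using CRISPR/Cas systems. Science. 2013 Feb 15;339(6121):819-23). The type II CRISPR-Cas9 system is characterized by its simple composition and mechanism of activity: to function, it requires the formation of an effector complex consisting of only one Cas9 protein and two short RNAs, the crRNA and the tracer RNA (tracrRNA). The tracer RNA pairs complementarily with the crRNA region derived from the CRISPR repeat, forming the secondary structure required for the guide RNAs to bind the Cas effector. Determining the sequence of the guide RNAs is an important step in characterizing previously unstudied Cas orthologs. The Cas9 effector protein is an RNA-dependent DNA endonuclease with two nuclease domains (HNH and RuvC) that introduce breaks into the complementary strands of the target DNA, thereby producing a double-strand DNA break (Deltcheva E. et al., CRISPR RNA maturation by trans-encoded small RNA and host factor RNase III // Nature. 2011. Vol. 471, No. 7340, p. 602).
To date, several CRISPR-Cas nucleases capable of introducing targeted, specific double-strand breaks into DNA are known. CRISPR-Cas9 technology is one of the most modern and fastest-developing techniques for introducing breaks into the DNA of organisms ranging from bacterial strains to human cells, and it also has in vitro applications (Song M. The CRISPR/Cas9 system: Their delivery, in vivo and ex vivo applications and clinical development by startups. Biotechnol Prog. 2017 Jul;33(4):1035-1045).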
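In addition to complementarity between the crRNA spacer and the protospacer, the effector ribonucleoprotein complex consisting of Cas9 and the crRNA/tracrRNA duplex requires the presence of a PAM (protospacer adjacent motif) on the DNA target in order to recognize and subsequently hydrolyze the DNA (Mojica F. J. M. et al. 2009). In type II systems, the PAM is a strictly defined sequence of several nucleotides located adjacent to, or a few nucleotides away from, the 3' end of the protospacer on the non-target strand. In the absence of a PAM, hydrolysis of the DNA bonds, and hence formation of a double-strand break, does not occur. The requirement for a PAM sequence on the target increases recognition specificity but at the same time restricts the choice of target DNA regions into which a break can be introduced. The need for the required PAM sequence flanking the DNA target at its 3' end is therefore the feature that prevents CRISPR-Cas systems from being used at any arbitrary DNA site.
Different CRISPR-Cas proteins use different, unique PAM sequences for their activity. CRISPR-Cas proteins with novel, diverse PAM sequences are needed to make it possible to modify any DNA region, both in vitro and in the genomes of living organisms. Modification of eukaryotic genomes also requires small nucleases, to allow AAV-mediated delivery of CRISPR-Cas systems into cells.
Although many techniques for cleaving DNA and modifying genomic DNA sequences are known, there is still a need for novel, effective means of modifying DNA in various organisms and at strictly specific sites in a DNA sequence.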
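Summary of the Invention
The object of the present invention is to provide novel means of modifying the genomic DNA sequence of unicellular or multicellular organisms using a CRISPR-Cas9 system. Currently available systems have limited use because of the specific PAM sequence that must be present at the 3' end of the DNA region to be modified. Finding novel Cas9 enzymes with other PAM sequences will broaden the range of available means for forming double-strand breaks at desired, strictly specific sites in the DNA molecules of various organisms. To solve this problem, the authors characterized PpCas9, a type II CRISPR nuclease previously predicted for Pasteurella pneumotropica (P. pneumotropica), which can be used to introduce targeted modifications into the genome of this organism as well as of other organisms. The invention has the following essential features: (a) a short PAM sequence that differs from other known PAM sequences; (b) the relatively small size of the characterized PpCas9 protein, namely 1055 amino acid residues (a.a.r.).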
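The problem is solved by the use of a protein comprising the amino acid sequence of SEQ ID NO: 1, or comprising an amino acid sequence that is at least 95% identical to the amino acid sequence of SEQ ID NO: 1 and differs from SEQ ID NO: 1 only in non-conserved amino acid residues, to form in a DNA molecule a double-strand break located immediately before the nucleotide sequence 5'-NNNN(A/G)TT-3' in said DNA molecule. In some embodiments of the invention, the use is characterized in that the double-strand break in the DNA molecule is formed at a temperature of 35 °C to 45 °C. In some embodiments of the invention, the use is characterized in that the double-strand break is formed in the genomic DNA of a mammalian cell. In some embodiments of the invention, the use is characterized in that formation of the double-strand break in the DNA molecule results in modification of the genomic DNA of said mammalian cell.
The problem is further solved by providing a method for modifying a genomic DNA sequence in a cell of a unicellular or multicellular organism, the method comprising introducing into said cell of the organism an effective amount of: a) a protein comprising the amino acid sequence of SEQ ID NO: 1, or a nucleic acid encoding a protein comprising the amino acid sequence of SEQ ID NO: 1, and b) a guide RNA comprising a sequence that forms a duplex with the nucleotide sequence of a region of the organism's genomic DNA, or a DNA sequence encoding said guide RNA, wherein said nucleotide sequence is directly adjacent to the nucleotide sequence 5'-NNNN(A/G)TT-3' and the guide RNA interacts with said protein after duplex formation; wherein the interaction of said protein with the guide RNA and the nucleotide sequence 5'-NNNN(A/G)TT-3' results in formation of a double-strand break in the genomic DNA sequence immediately adjacent to the sequence 5'-NNNN(A/G)TT-3'.
In some embodiments of the invention, the method is characterized in that it further comprises introducing an exogenous DNA sequence simultaneously with the guide RNA. In some embodiments of the invention, the method is characterized in that said cell is a mammalian cell.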
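A mixture of crRNA and tracer RNA (tracrRNA) that can form a complex with the target DNA region and the PpCas9 protein can be used as the guide RNA. In preferred embodiments of the invention, a hybrid RNA constructed from the crRNA and the tracer RNA can be used as the guide RNA. Methods for constructing hybrid guide RNAs are known to those skilled in the art (Hsu P. D. et al., DNA targeting specificity of RNA-guided Cas9 nucleases. Nat Biotechnol. 2013 Sep;31(9):827-32). One of the methods for constructing a hybrid RNA is disclosed in the Examples below.
The invention can be used both for cleaving target DNA in vitro and for modifying the genomes of living organisms. Genomic DNA can be modified directly, that is, by cleaving the genomic DNA at the corresponding site, or by inserting an exogenous DNA sequence through homologous repair.
Any region of double-stranded or single-stranded DNA from the genome of an organism other than the one used for administration (or a combination of such regions with one another and with other DNA fragments) can be used as the exogenous DNA sequence, where that region (or combination of regions) is intended to be integrated at the site of the double-strand break induced in the target DNA by the PpCas9 nuclease. In some embodiments of the invention, a double-stranded DNA region from the genomic DNA of the organism into which the PpCas9 protein is introduced, further modified by mutations (nucleotide substitutions) or by insertion or deletion of one or more nucleotides, can be used as the exogenous DNA sequence.
The technical effect of the invention is an increase in the versatility of the available CRISPR-Cas9 systems, making it possible to use Cas9 nucleases to cleave genomic or plasmid DNA at a larger number of specific sites and under specific conditions. The novel nuclease can be used in cells of bacteria, mammals, or other organisms.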
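Brief Description of the Drawings
Figure 1. Schematic of the locus of the CRISPR PpCas9 system. DR (direct repeat) is a regularly repeated region that forms part of the CRISPR cassette.
Figure 2. In vitro PAM screen. Schematic of the experiment.
Figure 3. Cleavage of 7N library fragments by the PpCas9 nuclease at different reaction temperatures.
Figure 4. (A) Analysis of the in vitro screening results for the PpCas9 nuclease, calculated as the logarithm of the fold change (FC) in the proportion of each specific nucleotide at each PAM position. (B) PAM logo of the PpCas9 nuclease. The occurrence of adenine, cytosine, thymine, and guanine is shown for each position. The height of each letter corresponds to the occurrence of that nucleotide at the given position of the PAM sequence.
Figure 5. Verification of the effect of single-nucleotide substitutions at PAM position 1 on the efficiency of DNA target cleavage by the PpCas9 nuclease.
Figure 6. Verification of the importance of the nucleotide positions in the PpCas9 PAM sequence.
Figure 7. Verification of the effect of the A-to-G substitution at PAM position 5 on the efficiency of DNA target cleavage by the PpCas9 nuclease.
Figure 8. Verification of the effect of single-nucleotide substitutions at PAM position 7 on the efficiency of DNA target cleavage by the PpCas9 nuclease.
Figure 9. Cleavage of various DNA sites using the PpCas9 protein. Lanes 1 and 2 are positive controls.
Figure 10. Verification of recognition of the PAM sequence CAGCATT by the PpCas9 nuclease. Lanes 1 and 2 are positive controls.
Figure 11. Diagram of the DNA-cleavage tool PpCas9.
Figure 12. DNA target cleavage experiment. Hybrid guide RNAs of different lengths were used.
Figure 13. Alignment of the amino acid sequences of PpCas9 and the Cas9 protein from Staphylococcus aureus using NCBI BLASTp software (default parameters).
Figure 14. Modification of the genomic DNA of human cells using PpCas9. (A) Schematic of the experiment to determine the efficiency with which a plasmid carrying PpCas9 modifies the genomic DNA of human cells. (B) Results of the analysis of nucleotide insertions and deletions within the target-site sequence in the genomic DNA of human cells (top: products of the reaction with T7 endonuclease I resolved by agarose gel electrophoresis; bottom: examples of insertions and deletions formed by PpCas9 in the EMX1 gene, as determined by high-throughput sequencing).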
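Detailed Description
As used in the description of the present invention, the terms "includes" and "including" should be interpreted as meaning "includes, among other things". These terms are not intended to be interpreted as "consists only of". Unless defined separately, the technical and scientific terms in this application have the ordinary meanings generally accepted in the scientific and technical literature.
As used herein, the term "percent homology of two sequences" is equivalent to the term "percent identity of two sequences". Sequence identity is determined relative to a reference sequence. Algorithms for sequence analysis are known in the art, for example BLAST, described in Altschul et al., J. Mol. Biol., 215, pp. 403-10 (1990). For the purposes of the present invention, the level of identity and similarity between nucleotide sequences and between amino acid sequences can be determined by comparing the sequences with the BLAST software package provided by the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/blast), using gapped alignment with standard parameters. The percent identity of two sequences is determined from the number of positions with identical amino acids in the two sequences, taking into account the number of gaps and the length of each gap introduced to align the two sequences optimally. Percent identity equals the number of identical amino acids at given positions in the alignment, divided by the total number of positions and multiplied by 100.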
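The percent-identity definition above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned (for example by BLAST) and are supplied as equal-length strings with gaps written as "-"; the function name and the toy alignment are hypothetical and are not taken from the patent.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences (gaps written as '-').

    Identical residues are counted column by column and divided by the
    total number of alignment positions, then multiplied by 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as identical only if both residues match and
    # neither is a gap character.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical toy alignment: 8 identical columns out of 10.
print(percent_identity("MQNNPLNY-I", "MQDNPLNYAI"))
```

Dividing by the full alignment length, gaps included, is one reading of "total number of positions"; other conventions (for example dividing by the shorter sequence length) give different values.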
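The term "specifically hybridizes" refers to binding between two single-stranded nucleic acid molecules, or sufficiently complementary sequences, that permits such hybridization under predetermined conditions commonly used in the art.
The phrase "a double-strand break located immediately before the nucleotide PAM sequence" means that the double-strand break in the target DNA sequence will be produced at a distance of 0 to 25 nucleotides before the nucleotide PAM sequence.
An exogenous DNA sequence introduced simultaneously with the guide RNA is understood to mean a DNA sequence specifically prepared for the specific modification of double-stranded target DNA at the break site determined by the specificity of the guide RNA. Such a modification can be, for example, the insertion or deletion of certain nucleotides at the break site in the target DNA. The exogenous DNA can be a DNA region from a different organism or a DNA region from the same organism as the target DNA.
A protein comprising a particular amino acid sequence is understood to mean a protein whose amino acid sequence consists of said amino acid sequence and, possibly, other sequences linked to it by peptide bonds. Examples of such other sequences are a nuclear localization signal (NLS) or other sequences that provide increased functionality for said amino acid sequence.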
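An effective amount of protein and RNA introduced into a cell is understood to mean an amount of protein and RNA that, when introduced into said cell, is able to form a functional complex, that is, a complex that binds specifically to the target DNA and produces a double-strand break in it at the site determined by the guide RNA and the PAM sequence on the DNA. The efficiency of this process can be assessed by analyzing the target DNA isolated from said cell using conventional techniques known to those skilled in the art.
The protein and RNA can be delivered to the cell by various techniques. For example, the protein can be delivered as a DNA plasmid carrying the gene encoding the protein, as mRNA for translation of the protein in the cytoplasm, or as a ribonucleoprotein complex comprising the protein and the guide RNA. Delivery can be performed by various techniques known to those skilled in the art.
Nucleic acids encoding the components of the system can be introduced into the cell directly or indirectly: by transfecting or transforming the cell using methods known to those skilled in the art, by using recombinant viruses, or by manipulating the cell, for example by DNA microinjection.
The ribonucleoprotein complex consisting of the nuclease, the guide RNA, and (if required) exogenous DNA can be delivered by transfecting the complex into the cell or by introducing the complex into the cell mechanically, for example by microinjection.
The nucleic acid molecule encoding the protein to be introduced into the cell can be integrated into a chromosome or can be extrachromosomally replicating DNA. In some embodiments, to ensure efficient expression of the gene for a protein whose DNA is introduced into the cell, the sequence of said DNA must be modified according to the cell type so as to optimize the codons for expression, because synonymous codons occur at unequal frequencies in the coding regions of the genomes of different organisms. Codon optimization is necessary to increase expression in animal, plant, fungal, or microbial cells.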
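For a protein whose sequence is at least 95% identical to the amino acid sequence of SEQ ID NO: 1 to function in a eukaryotic cell, the protein must end up in the nucleus of that cell. Therefore, in some embodiments of the invention, a protein whose sequence is at least 95% identical to the amino acid sequence of SEQ ID NO: 1 and is further modified at one or both ends by the addition of one or more nuclear localization signals is used to form double-strand breaks in target DNA. For example, the nuclear localization signal from the SV40 virus can be used. To provide efficient delivery to the nucleus, the nuclear localization signal can be separated from the main protein sequence by a spacer sequence, for example as described in Shen B. et al., "Generation of gene-modified mice via Cas9/RNA-mediated gene targeting", Cell Res. 2013 May;23(5):720-3. In other embodiments, a different nuclear localization signal, or an alternative method of delivering said protein into the cell nucleus, can be used.
The present invention encompasses the use of a protein from the organism P. pneumotropica, homologous to previously characterized Cas9 proteins, to introduce double-strand breaks into DNA molecules at strictly specified positions. Using CRISPR nucleases to introduce targeted modifications into a genome has a number of advantages. First, the specificity of the system's activity is determined by the crRNA sequence, which allows one type of nuclease to be used for all target loci. Second, the technique makes it possible to deliver several guide RNAs complementary to different gene targets into a cell at once, so that several genes can be modified simultaneously.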
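PpCas9 is a Cas nuclease found in Pasteurella pneumotropica ATCC 35149, a rodent pathogen that lives in the lungs of the animals. The Pasteurella pneumotropica (P. pneumotropica) CRISPR-Cas9 system (hereinafter CRISPR PpCas9) belongs to the type II-C CRISPR-Cas systems and consists of a CRISPR cassette carrying four direct repeats (DR) with the sequence 5'ATTATAGCACTGCGAAATGAAAAAGGGAGCTACAAC3', separated by unique spacer sequences. None of the spacers of this system matches the sequence of any currently known bacteriophage or plasmid, which made it impossible to determine the PpCas9 PAM of interest by bioinformatic analysis. Adjacent to the CRISPR cassette are the gene for the effector Cas9 protein, PpCas9, and the genes for the Cas1 and Cas2 proteins, which are involved in the adaptation and integration of new spacers. Near the Cas genes, a sequence was found that is partially complementary to the direct repeat and folds into a characteristic secondary structure; it is considered to be the tracer RNA (tracrRNA) (Figure 1).
Knowledge of the characteristic architecture of the RNA-Cas protein complexes of type II-C systems made it possible to predict the direction of transcription of the CRISPR cassette: the precursor crRNA is transcribed in the direction opposite to that of the Cas genes (Figure 1).
Sequence analysis of the PpCas9 locus thus made it possible to predict the sequences of the tracer RNA and the guide RNA (Table 1).
Table 1. Guide RNA sequences of the CRISPR PpCas9 system determined by bioinformatic methods. Bold indicates the direct repeat (DR) sequence.
To verify the activity of the PpCas9 nuclease and to determine the PpCas9 PAM of interest, we carried out experiments reconstituting the DNA cleavage reaction in vitro. To determine the PAM sequence of the PpCas9 protein, in vitro cleavage of a double-stranded PAM library was used. This required obtaining all the components of the PpCas9 effector complex: the guide RNAs and the nuclease in recombinant form. Determination of the guide RNA sequences made it possible to synthesize the crRNA and tracrRNA molecules in vitro. The synthesis was carried out with the NEB HiScribe T7 RNA synthesis kit. The double-stranded DNA library consisted of 374-base-pair (bp) fragments comprising a protospacer flanked at its 3' end by seven randomized nucleotides (5'-NNNNNNN-3'): 5'-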
cccggggtaccacggagagatggtggaaatcatctttctcgtgggcatccttgatggccacctcgtcggaagtgcccacgaggatgacagcaatgccaatgctgggggggctcttctgagaacgagctctgctgcctgacacggccaggacggccaacaccaaccagaacttgggagaacagcactccgctctgggcttcatcttcaactcgtcgactccctgcaaacacaaagaaagagcatgttaaaataggatctacatcacgtaacctgtcttagaagaggctagatactgcaattcaaggaccttatctcctttcattgagcacNNNNNNNaactccatcta ccagcctactctcttatctctggtatt -3’
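To cleave this target, guide RNAs with the following sequences were used:
tracrRNA:
5'GCGAAAUGAAAAACGUUGUUACAAUAAGAGAUGAAUUUCUCGCAAAGCUCUGCCUCUUGAAAUUUCGGUUUCAAGAGGCAUCUUUUU
and crRNA: 5'
uaucuccuuucauugagcacGUUGUAGCUCCCUUUUUCAUUUCGC
Bold indicates the crRNA sequence complementary to the protospacer (the target DNA sequence).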
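To produce the recombinant PpCas9 protein, its gene was cloned into the plasmid pET21a. DNA synthesized by Integrated DNA Technologies (IDT) was used as the DNA encoding the gene. The sequence was codon-optimized to exclude rare codons found in the P. pneumotropica genome. Escherichia coli Rosetta cells were transformed with the resulting plasmid, pET21a-6xHis-PpCas9.
500 μl of overnight culture was diluted in 500 ml of LB medium, and the cells were grown at 37 °C until an optical density of 0.6 (relative units) was reached. Synthesis of the target protein was induced by adding IPTG to a concentration of 1 mM, after which the cells were incubated at 20 °C for 6 hours. The cells were then centrifuged at 5,000 g for 30 minutes, and the resulting cell pellet was frozen at -20 °C.
The pellet was thawed on ice for 30 minutes, resuspended in 15 ml of lysis buffer (50 mM Tris-HCl pH 8, 500 mM NaCl, 1 mM β-mercaptoethanol, 10 mM imidazole) supplemented with 15 mg of lysozyme, and incubated on ice for a further 30 minutes. The cells were then disrupted by sonication for 30 minutes and centrifuged at 16,000 g for 40 minutes. The resulting supernatant was passed through a 0.2 µm filter and applied to a HisTrap HP 1 mL column (GE Healthcare) at 1 ml/min.
Chromatography was performed on an AKTA FPLC chromatograph (GE Healthcare) at 1 ml/min. The column with the bound protein was washed with 20 ml of lysis buffer supplemented with 30 mM imidazole, after which the protein was eluted with lysis buffer supplemented with 300 mM imidazole.
The protein fraction obtained by affinity chromatography was then passed through a Superdex 200 10/300 GL gel-filtration column (24 ml) equilibrated with the following buffer: 50 mM Tris-HCl pH 8, 500 mM NaCl, 1 mM DTT. The fractions corresponding to the monomeric form of the PpCas9 protein were concentrated to 3 mg/ml using an Amicon concentrator (with a 30 kDa filter), after which the purified protein was stored at -80 °C in buffer containing 10% glycerol.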
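The in vitro reaction for cleaving the linear PAM library was carried out in a volume of 20 μl under the following conditions. The reaction mixture consisted of 1X CutSmart buffer (NEB), 5 mM DTT, 100 nM PAM library, 2 μM tracrRNA/crRNA, and 400 nM PpCas9 protein. A sample without RNA, prepared in the same way, served as the control. The samples were incubated at different temperatures and analyzed by electrophoresis in a 2% agarose gel. If the DNA is correctly recognized and specifically cleaved by the PpCas9 protein, two DNA fragments of about 326 and 48 base pairs should be generated (see Figure 2).
The results showed that PpCas9 has nuclease activity and cleaved a portion of the PAM library fragments. The temperature gradient (Figure 3) showed that the protein is active in the temperature range of 35-45 °C. A temperature of 42 °C was used as the working temperature in the rest of the study.
The library cleavage reaction was repeated under the selected conditions. The reaction products were loaded onto a 1.5% agarose gel and subjected to electrophoresis. The uncleaved 374-bp DNA fragments were extracted from the gel and prepared for high-throughput sequencing with the NEBNext Ultra II kit. The samples were sequenced on the Illumina platform, and the sequences were then analyzed by bioinformatic methods: using the approach described in Maxwell C. S. et al., A detailed cell-free transcription-translation-based assay to decipher CRISPR protospacer-adjacent motifs. Methods. 2018 Jul 1;143:48-57, we determined the difference in the occurrence of nucleotides at each position of the PAM (NNNNNNN) relative to the control sample. In addition, a PAM logo was constructed to analyze the results (Figure 4).
Both methods of data analysis (Figure 4) indicated the importance of PAM positions 5, 6, and 7. The in vitro analysis thus allowed the following putative PAM sequence to be established for PpCas9: NNNNATT. However, this sequence is only putative, given the imprecision of results obtained by screening methods for PAM determination.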
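The depletion analysis described above can be sketched in Python as follows. This is a simplified illustration, not the pipeline of Maxwell et al.: it assumes the 7-nt PAM has already been extracted from each sequencing read, compares per-position nucleotide frequencies between the uncleaved fraction of the PpCas9-treated sample and the control, and adds a pseudocount to avoid division by zero. All names and the toy reads are hypothetical.

```python
import math
from collections import Counter


def position_frequencies(pams, pseudocount=1.0):
    """Per-position nucleotide frequencies for a list of equal-length PAMs."""
    freqs = []
    for i in range(len(pams[0])):
        counts = Counter(p[i] for p in pams)
        total = len(pams) + 4 * pseudocount
        freqs.append(
            {nt: (counts.get(nt, 0) + pseudocount) / total for nt in "ACGT"}
        )
    return freqs


def log2_fold_change(treated, control, pseudocount=1.0):
    """log2(frequency in treated / frequency in control) per position and base.

    Negative values mark nucleotides depleted from the uncleaved pool,
    i.e. nucleotides that favour cleavage at that PAM position.
    """
    f_treated = position_frequencies(treated, pseudocount)
    f_control = position_frequencies(control, pseudocount)
    return [
        {nt: math.log2(f_treated[i][nt] / f_control[i][nt]) for nt in "ACGT"}
        for i in range(len(f_treated))
    ]


# Hypothetical toy data: PAMs ending in TT are missing from the uncleaved pool.
control_pams = ["AAAAATT", "CCCCGTT", "GGGGCAA", "TTTTTCC"]
uncleaved_pams = ["GGGGCAA", "TTTTTCC"]
lfc = log2_fold_change(uncleaved_pams, control_pams)
print(round(lfc[5]["T"], 2))  # position 6 of the PAM (0-based index 5)
```

With real data, each list would hold the PAMs of many thousands of reads, and the resulting table corresponds to the kind of plot described for Figure 4A.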
To determine the sequence more precisely, the importance of each PAM position was therefore verified. For this purpose, we performed in vitro cleavage reactions on a DNA fragment containing the DNA target 5'-tatctcctttcattgagcac-3' flanked by the PAM sequence CAACATT (or derivatives thereof): 5'-cccggggtaccacggagagatggtggaaatcatctttctcgtgggcatccttgatggccacctcgtcggaagtgcccacgaggatgacagcaatgccaatgctgggggggctcttctgagaacgagctctgctgcctgacacggccaggacggccaacaccaaccagaacttgggagaacagcactccgctctgggcttcatcttcaactcgtcgactccctgcaaacacaaagaaagagcatgttaaaataggatctacatcacgtaacctgtcttagaagaggctagatactgcaattcaaggaccttatctcctttcattgagcacCAACATTaactccatcta ccagcctactctcttatctctggtatt-3'
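All DNA cleavage reactions were performed under the following conditions:
1x CutSmart buffer
400 nM PpCas9
20 nM DNA
2 µM crRNA
2 µM tracrRNA
Incubation time: 30 minutes; reaction temperature: 42 °C.
Substituting PAM position 1 with each of the four possible nucleotides did not affect the efficiency of the protein's activity (Figure 5).
The predicted importance of positions 5 and 6 was confirmed experimentally by single-nucleotide substitutions at each PAM position (a purine replaced by a pyrimidine and vice versa). When the substitution was at position 5 or 6, the protein's activity practically ceased. When the substitution was at position 7, the efficiency of PpCas9 activity was halved, which reflects a less stringent requirement for the nucleotide at this position (Figure 6). According to the in vitro PAM screening results for the PpCas9 nuclease, the most probable nucleotide at PAM position 5 is adenine or guanine, and this was confirmed experimentally (Figure 7): the A-to-G substitution did not reduce the efficiency of fragment cleavage.
According to the in vitro screening results, fragments with "T" or "S" at position 7 should be recognized more efficiently. Additional experiments were performed to establish unambiguously the importance of the nucleotide at this position. In vitro tests showed that replacing the nucleotide "T" at position 7 with A or G reduced cleavage efficiency by 40-50% (Figure 8). PAM position 7 is therefore less conserved than positions 5 and 6: a purine at position 7 reduces recognition efficiency but does not prevent the PpCas9 protein from introducing a double-strand break into the DNA.
The results of the study are as follows: the PAM recognized by the PpCas9 nuclease corresponds to the formula 5'-NNNN(A/G)TT-3'. Position 7 is less conserved.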
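As an illustration of how the PAM formula 5'-NNNN(A/G)TT-3' can be applied, the Python sketch below scans a DNA sequence for candidate protospacers followed directly by a matching PAM. It is a simplification under stated assumptions: only the strand given is scanned (not its reverse complement), the default protospacer length of 20 nt is a parameter, and the function name is hypothetical. It does not predict cleavage efficiency.

```python
import re


def find_ppcas9_sites(dna: str, spacer_len: int = 20):
    """Return (start, protospacer, pam) for every site on the given strand
    where a protospacer is followed directly by 5'-NNNN(A/G)TT-3'."""
    dna = dna.upper()
    # The lookahead keeps matches zero-width so overlapping sites are found.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]{4}[AG]TT))" % spacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


# Target and PAM taken from the validation experiment described above.
example = "tatctcctttcattgagcac" + "CAACATT"
for start, protospacer, pam in find_ppcas9_sites(example):
    print(start, protospacer, pam)
```

For a 24-nt guide, the same function can be called with `spacer_len=24`.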
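The following exemplary embodiments of the method are given to disclose the features of the invention and should not be construed as limiting the scope of the invention in any way.
Example 1. Testing the activity of the PpCas9 protein in cleaving various DNA targets.
To test the ability of PpCas9 to recognize various DNA sequences flanked by the sequence 5'-NNNN(A/G)TT-3', in vitro cleavage experiments were performed on DNA targets from the sequence of the human GRIN2B gene (see Table 2).
Table 2. DNA targets from the human GRIN2B gene.
PCR fragments of the GRIN2B gene carrying recognition sites likely to be recognized by PpCas9 according to the PAM consensus 5'-NNNN(A/G)TT-3' (Table 2) were used as targets in the cleavage reactions. crRNAs directing PpCas9 to these sites were synthesized to recognize these sequences.
The cleavage reactions were performed under the conditions selected for PpCas9; the results are shown in Figure 9. Figure 9 shows that the PpCas9 enzyme successfully cleaved three of the four targets with a suitable PAM.
The target in lane 6 has the PAM sequence CAGCATT, which, according to predictions based on the results of the depletion analysis, should be recognized efficiently by the protein. However, this fragment was not recognized in this experiment.
The PAM CAGCATT was therefore additionally verified on another protospacer target bounded by the same PAM (Figure 10). In this case the PAM was recognized efficiently, resulting in cleavage of the DNA. The protein therefore has some further preference with respect to the DNA target sequence. This preference is most likely related to the secondary structure of the DNA.
The study thus showed that PpCas9 has nuclease activity, and it also made it possible to determine its PAM sequence and to verify the sequences of the guide RNAs.
The PpCas9 ribonucleoprotein complex specifically introduces breaks, from the 5' end of the protospacer, in targets bounded by the PAM 5'-NNNN(A/G)TT-3'. A schematic of the PpCas9/RNA complex is shown in Figure 11.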
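Example 2. Use of hybrid guide RNAs for cleaving DNA targets.
An sgRNA is a form of guide RNA in which the tracrRNA (tracer RNA) and the crRNA are fused. To select the optimal sgRNA, we constructed three variants of this sequence that differ in the length of the tracrRNA-crRNA duplex. The RNAs were synthesized in vitro and used in DNA target cleavage experiments (Figure 12).
The following RNA sequences were used as hybrid RNAs:
1 - sgRNA1 25DR: UAUCUCCUUUCAUUGAGCACGUUGUAGCUCCCUUUUUCAUUUCGCGAAAGCGAAAUGAAAAACGUUGUUACAAUAAGAGAUGAAUUUCUCGCAAAGCUCUGCCUCUUGAAAUUUCGGUUUCAAGAGGCAUCUUUUU
2 - sgRNA2 36DR: UAUCUCCUUUCAUUGAGCACGUUGUAGCUCCCUUUUUUCAUUUCGCAGUGCUAUAAUGAAAAUUAUAGCACUGCGAAAUGAAAAACGUUGUUACAAUAAGAGAUGAAUUUCUCGCAAAGCUCUGCCUCUUGAAAUUUCGGUUUCAAGAGGCAUCUUUUU
Bold indicates the 20-nucleotide sequence that pairs with the DNA target (the variable part of the sgRNA). The experiment also included a control sample without RNA and a positive control in which the target was cleaved using crRNA + tracrRNA.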
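A sequence containing the recognition site 5' tatctcctttcattgagcac 3' with the corresponding consensus PAM CAACATT was used as the DNA target: 5'-cccggggtaccacggagagatggtggaaatcatctttctcgtgggcatccttgatggccacctcgtcggaagtgcccacgaggatgacagcaatgccaatgctgggggggctcttctgagaacgagctctgctgcctgacacggccaggacggccaacaccaaccagaacttgggagaacagcactccgctctgggcttcatcttcaactcgtcgactccctgcaaacacaaagaaagagcatgttaaaataggatctacatcacgtaacctgtcttagaagaggctagatactgcaattcaaggaccttatctcctttcattgagcacCAACATTcaactccat ctaccagcctactctcttatctctggtatt-3'
Bold indicates the recognition site; uppercase letters denote the PAM.
The reactions were performed under the following conditions: the DNA sequence containing the PAM (CAACATT) at 20 nM, protein at 400 nM, and RNA at 2 μM; incubation for 30 minutes at 37 °C.
The selected sgRNA1 and sgRNA2 were found to be as effective as the native tracrRNA and crRNA sequences: more than 80% of the DNA target was cleaved (Figure 12).
After the sequence that pairs directly with the DNA target has been modified, these hybrid RNA variants can be used to cleave any other target DNA.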
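Retargeting a hybrid RNA by swapping its variable part can be sketched in Python as follows. The constant part is taken from sgRNA1 as listed in this example, written with U throughout as expected for RNA; the split into a repeat-derived segment, a GAAA linker, and a tracrRNA-derived segment is this sketch's reading of that sequence, and the function name is hypothetical. The target is supplied as the protospacer sequence 5'→3', exactly as the recognition site is written above.

```python
# Constant part of sgRNA1 (everything after the 20-nt variable region).
SGRNA1_CONSTANT = (
    "GUUGUAGCUCCCUUUUUCAUUUCGC"  # 25 nt derived from the direct repeat
    "GAAA"                       # linker joining crRNA and tracrRNA parts
    "GCGAAAUGAAAAACGUUGUUACAAUAAGAGAUGAAUUUCUCGCAAAGCUCUGCC"
    "UCUUGAAAUUUCGGUUUCAAGAGGCAUCUUUUU"
)


def build_sgrna(protospacer_dna: str, constant: str = SGRNA1_CONSTANT) -> str:
    """Return the sgRNA for a protospacer given as DNA (5'->3')."""
    spacer = protospacer_dna.upper().replace("T", "U")
    if not spacer or set(spacer) - set("ACGU"):
        raise ValueError("protospacer must contain only A, C, G, T")
    return spacer + constant


# Recognition site used in this example.
sgrna = build_sgrna("tatctcctttcattgagcac")
print(sgrna[:20], len(sgrna))
```

Whether a retargeted sgRNA actually supports cleavage still has to be checked experimentally, since the text notes that PpCas9 shows some preference for particular target sequences.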
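Example 3. Cas9 proteins from closely related organisms belonging to P. pneumotropica.
To date, no CRISPR-Cas9 enzymes have been characterized in P. pneumotropica. The Cas9 protein from Staphylococcus aureus, which is comparable in size, is 28% identical to PpCas9 (Figure 13; degree of identity calculated with BLASTp software, default parameters). A similar degree of identity is found with other known Cas9 proteins (not shown).
The PpCas9 protein therefore differs significantly in its amino acid sequence from the other Cas9 proteins studied to date.
Those skilled in the art of genetic engineering will appreciate that variants of the PpCas9 protein sequence obtained and characterized by the applicant in this specification can be modified without changing the function of the protein itself (for example, by directed mutagenesis of amino acid residues that do not directly affect functional activity) (Sambrook et al., Molecular Cloning: A Laboratory Manual (1989), CSH Press, pp. 15.3-15.108). In particular, the skilled person will recognize that non-conserved amino acid residues can be modified without affecting the residues responsible for the functionality of the protein (those that determine its function or structure). Examples of such modifications include replacing non-conserved amino acid residues with homologous amino acid residues. Some regions containing non-conserved amino acid residues are shown in Figure 12. In some embodiments of the invention, it is possible to use a protein comprising an amino acid sequence that is at least 95% identical to the amino acid sequence of SEQ ID NO: 1, and differs from SEQ ID NO: 1 only in non-conserved amino acid residues, to form in a DNA molecule a double-strand break located immediately before the nucleotide sequence 5'-NNNN(A/G)TT-3' in said DNA molecule. Homologous proteins can be obtained by mutagenesis of the corresponding nucleic acid molecule (for example, site-directed or PCR-mediated mutagenesis), followed by testing the encoded modified Cas9 protein for retention of function using the functional assays described herein.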
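Example 4. Modification of the genomic DNA of human cells using PpCas9.
To modify the genomic DNA of human cells, the PpCas9 nuclease gene was cloned into a eukaryotic plasmid vector under the control of the CMV promoter. Sequences encoding nuclear localization signals, which ensure delivery of the nuclease to the cell nucleus, were added to the 5' and 3' ends of the PpCas9 gene. The sgRNA sequence was cloned into the vector under the control of the U6 promoter. To test the activity of the system, sgRNAs with sequences complementary to the target DNA over a length of 20 or 24 nucleotides were used. A similar plasmid carrying an SpCas9-based genomic DNA modification system known from the prior art served as the positive control. To assess transfection efficiency, the plasmids additionally carried the GFP (green fluorescent protein) gene. The following regions of human genomic DNA were used as DNA targets (Table 3).
Table 3. DNA targets in the human EMX1 and GRIN2B genes.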
Nuclease | Site name | Target sequence | PAM |
PpCas9 | EMX1.1 sg20 | GCCCTTCCTCCTCCAGCTTC | GTT |
PpCas9 | EMX1.1 sg24 | TCAGGCCCTTCCTCCTCCAGCTTC | GTT |
PpCas9 | EMX1.2 sg20 | GGAGGTGACATCGATGTCCT | ATT |
PpCas9 | EMX1.2 sg24 | CATTGGAGGTGACATCGATGTCCT | ATT |
PpCas9 | GRIN2B1.1 sg20 | CAGCTGAAGTAATGTTAGAG | ATT |
PpCas9 | GRIN2B1.1 sg24 | TTAGCAGCTGAAGTAATGTTAGAG | ATT |
PpCas9 | GRIN2B1.2 sg20 | AATAAGAAAAACATTATTAT | ATT |
PpCas9 | GRIN2B1.2 sg24 | ATAAAATAAGAAAAACATTATTAT | ATT |
SpCas9 | EMX1 sg20 | GAGTCCGAGCAGAAGAAGAA | GGG |
SpCas9 | GRIN2B sg20 | ACCTTTTATTGCCTTGTTCA | AGG |
EMX1.1 and EMX1.2 are two different modification sites in the EMX1 gene; likewise, GRIN2B1.1 and GRIN2B1.2 are two different modification sites in the GRIN2B gene.
The 3' end of each DNA target is flanked by the PAM sequence of PpCas9 (5'-NNNNRTT-3') or of SpCas9 (5'-NGG-3').
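For the PpCas9 nuclease to be active in eukaryotic cells, the protein must be imported into the cell nucleus. This can be achieved by using the nuclear localization signal from the SV40 T antigen (Lanford et al., Cell, 1986, 46: 575-582), linked to the PpCas9 sequence either through the spacer sequence described in Shen B. et al., "Generation of gene-modified mice via Cas9/RNA-mediated gene targeting", Cell Res. 2013 May;23(5):720-3, or without a spacer sequence.
In the example given, the complete amino acid sequence of the nuclease transported into the nucleus of human cells is as follows: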
MAPKKKRKVGIHGVPAAEQNNPLNYILGLDLGIASIGWAVVEIDEESSPIRLIDVGVRTFERAEVAKTGESLALSRRLARSSRRLIKRRAERLKKAKRLLKAEKILHSIDEKLPINVWQLRVKGLKEKLERQEWAAVLLHLSKHRGYLSQRKNEGKSDNKELGALLSGIASNHQMLQSSEYRTPAEIAVKKFQVEEGHIRNQRGSYTHTFSRLDLLAEMELLFQRQAELGNSYTSTTLLENLTALLMWQKPALAGDAILKMLGKCTFEPSEYKAAKNSYSAERFVWLTKLNNLRILENGTERALNDNERFALLEQPYEKSKLTYAQVRAMLALSDNAIFKGVRYLGEDKKTVESKTTLIEMKFYHQIRKTLGSAELKKEWNELKGNSDLLDEIGTAFSLYKTDDDICRYLEGKLPERVLNALLENLNFDKFIQLSLKALHQILPLMLQGQRYDEAVSAIYGDHYGKKSTETTRLLPTIPADEIRNPVVLRTLTQARKVINAVVRLYGSPARIHIETAREVGKSYQDRKKLEKQQEDNRKQRESAVKKFKEMFPHFVGEPKGKDILKMRLYELQQAKCLYSGKSLELHRLLEKGYVEVDHALPFSRTWDDSFNNKVLVLANENQNKGNLTPYEWLDGKNNSERWQHFVVRVQTSGFSYAKKQRILNHKLDEKGFIERNLNDTRYVARFLCNFIADNMLLVGKGKRNVFASNGQITALLRHRWGLQKVREQNDRHHALDAVVVACSTVAMQQKITRFVRYNEGNVFSGERIDRETGEIIPLHFPSPWAFFKENVEIRIFSENPKLELENRLPDYPQYNHEWVQPLFVSRMPTRKMTGQGHMETVKSAKRLNEGLSVLKVPLTQLKLSDLERMVNRDREIALYESLKARLEQFGNDPAKAFAEPFYKKGGALVKAVRLEQTQKSGVLVRDGNGVADNASMVRVDVFTKGGKYFLVPIYTWQVAKGILPNRAATQGKDENDWDIMDEMATFQFSLCQNDLIKLVTKKKTIFGYFNGLNRATSNINIKEHDLDKSKGKLGIYLEVGVKLAISLEKYQVDELGKNIRPCRPTKRQHVRFKRPAATKKAGQAKKKK
The plasmid used in this experiment has the following sequence:
gagggcctatttcccatgattccttcatatttgcatatacgatacaaggctgttagagagataattggaattaatttgactgtaaacacaaagatattagtacaaaatacgtgacgtagaaagtaataatttcttgggtagtttgcagttttaaaattatgttttaaaatggactatcatatgcttaccgtaacttgaaagtatttcgatttcttggctttatatatcttgtggaaaggacgaaacaccgXXXXXXXXXXXXXXXXXXXXXXXGTTGTAGCTCCCTTTTTCATTTCGCGAAAGCGAAATGAAAAACGTTGTTACAATAAGAGATGAATTTCTCGCAAAGCTCTGCCTCTTGAAATTTCGGTTTCAAGAGGCATCTTTTTtgctTCTCATGTCCAATATGACCGCCATGTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTacataacttacggtaaatggcccgcctggctgaccgcccaacgacccccgcccattgacgtcaataatgacgtatgttcccatagtaacgccaatagggactttccattgacgtcaatgggtggagtatttacggtaaactgcccacttggcagtacatcaagtgtatcatatgccaagtccgccccctattgacgtcaatgacggtaaatggcccgcctggcattatgcccagtacatgaccttacgggactttcctacttggcagtacatctacgtattagtcatcgctattaccatggtgatgcggttttggcagtacaccaatgggcgtggatagcggtttgactcacggggatttccaagtctccaccccattgacgtcaatgggagtttgttttggcaccaaaatcaacgggactttccaaaatgtcgtaataaccccgccccgttgacgcaaatgggcggtaggcgtgtacggtgggaggtctatataagcAGAGCTCGTTTAGTGAACCGTCAGAATTAATTCAGATCGATCTACCaccgccaccATGATGGCCCCAAAGAAGAAGCGGAAGGTCGGTATCCACGGAGTCCCAGCAGCCGAACAGAATAATCCGCTTAACTACATTCTTGGGCTGGATTTGGGAATTGCGAGTATAGGCTGGGCGGTGGTTGAGATCGATGAAGAGAGTAGTCCGATACGCCTTATCGACGTTGGAGTTAGGACGTTCGAGAGGGCGGAGGTCGCCAAGACCGGTGAGAGCTTGGCCCTCAGCCGGCGGCTCGCTCGATCTAGTCGCAGGCTTATAAAGAGGAGGGCTGAGCGCCTTAAGAAAGCTAAGAGGCTCCTTAAGGCAGAAAAAATTCTGCATAGTATCGACGAAAAGCTGCCGATAAATGTTTGGCAGCTCCGAGTAAAAGGGCTGAAGGAAAAATTGGAAAGGCAGGAGTGGGCGGCGGTACTGCTTCATCTCTCCAAGCACCGGGGCTATCTGTCTCAGCGAAAAAACGAAGGTAAGTCAGACAACAAGGAGCTGGGCGCACTTTTGTCCGGGATAGCGTCAAATCATCAGATGCTCCAATCAAGTGAGTATCGGACCCCTGCGGAGATCGCCGTTAAAAAGTTTCAAGTTGAGGAGGGCCACATCAGAAATCAGAGGGGGTCTTACACCCATACGTTCTCTAGACTCGACCTCCTTGCGGAAATGGAACTCCTGTTTCAGCGCCAGGCGGAGCTTGGTAACTCCTACACGTCCACTACCCTCCTGGAAAACCTGACAGCCCTGCTGATGTGGCAGAAGCCCGCTTTGGCGGGGGATGCCATCCTGAAGATGCTGGGTAAATGCACCTTTGAGCCGTCAGAATATAAAGCCGCCAAGAATAGTTACTCTGCGGAGCGATTTGTTTGGTTGACAAAGTTGAATAACCTGCGCATCCTGGAGAACGGTACCGAGCGCGCACTCAATGATAATGAGCGCTTCGCCCTCCTGGAACA
GCCCTACGAGAAGTCCAAGCTCACCTACGCCCAAGTCAGAGCCATGCTGGCTCTTAGTGACAACGCGATTTTTAAGGGCGTGCGATACTTGGGCGAGGATAAGAAAACCGTAGAGTCAAAAACGACTCTGATCGAGATGAAATTCTATCACCAAATTAGAAAGACCCTCGGTTCTGCCGAGCTGAAAAAGGAATGGAACGAACTTAAGGGTAACAGCGACCTGCTCGATGAAATCGGTACCGCATTTAGCCTTTATAAAACGGACGACGACATCTGCCGATATTTGGAGGGGAAGCTCCCAGAGCGAGTATTGAATGCACTCCTTGAGAACCTTAATTTTGACAAGTTCATTCAGCTGTCCCTCAAAGCACTGCATCAAATCCTCCCACTTATGCTGCAAGGACAACGATACGACGAAGCCGTCAGCGCGATATATGGAGATCATTACGGAAAAAAGTCCACCGAGACCACACGACTGCTTCCTACGATCCCCGCCGATGAGATCAGAAATCCCGTAGTCCTTCGAACACTTACTCAGGCTAGGAAGGTGATTAATGCGGTAGTTAGGTTGTATGGATCTCCGGCACGGATACATATAGAAACAGCTCGCGAAGTGGGTAAATCTTACCAAGACCGCAAGAAATTGGAGAAACAACAGGAGGATAACCGAAAGCAACGAGAATCTGCCGTTAAAAAGTTTAAGGAAATGTTTCCTCACTTTGTAGGAGAACCGAAGGGTAAAGATATCTTGAAAATGCGGTTGTACGAGTTGCAGCAAGCTAAGTGTCTCTATAGCGGCAAGAGTTTGGAATTGCACCGCCTCCTGGAGAAAGGCTACGTGGAAGTAGACCATGCGCTCCCGTTTTCCCGAACCTGGGATGATTCTTTCAATAACAAAGTCCTTGTGCTGGCAAATGAGAACCAGAACAAAGGAAATCTGACTCCTTATGAGTGGTTGGATGGCAAGAATAATTCTGAGCGGTGGCAACATTTCGTTGTCCGCGTCCAAACGTCAGGGTTCAGCTATGCTAAGAAACAAAGGATCCTCAATCACAAGCTCGACGAGAAAGGATTCATAGAACGAAATTTGAATGACACTAGGTATGTGGCTCGATTTCTCTGCAATTTTATTGCTGACAATATGCTCCTCGTTGGGAAGGGAAAGCGGAATGTTTTTGCATCAAATGGGCAGATAACGGCGCTCTTGAGACATAGATGGGGGCTGCAAAAGGTGAGAGAGCAAAATGATAGACATCACGCCCTGGATGCCGTTGTAGTCGCCTGTTCAACGGTTGCGATGCAGCAAAAGATCACTCGGTTCGTTAGGTATAACGAAGGGAACGTTTTTAGTGGAGAGCGCATAGATCGGGAAACAGGCGAAATCATCCCTTTGCATTTCCCAAGTCCTTGGGCTTTTTTCAAAGAGAATGTGGAAATAAGGATATTCAGTGAAAACCCTAAGTTGGAGCTTGAGAATCGGTTGCCCGATTATCCCCAGTACAATCATGAGTGGGTTCAACCGCTGTTCGTATCCCGCATGCCAACCCGAAAGATGACCGGGCAGGGTCACATGGAGACTGTGAAATCTGCAAAGAGACTTAATGAGGGCCTGTCAGTGTTGAAGGTGCCCTTGACTCAACTGAAATTGAGCGACCTCGAGCGCATGGTAAACCGCGATAGAGAAATCGCACTTTATGAGAGTCTGAAGGCGCGATTGGAACAATTCGGTAATGATCCGGCAAAGGCTTTCGCTGAGCCATTCTACAAGAAGGGTGGAGCGCTGGTTAAGGCTGTCCGACTCGAACAGACACAAAAGTCAGGGGTCTTGGTCAGAGATGGTAACGGGGTTGCCGACAACGCCTCCATGGTACGAGTAGATGTTTTCACGAAAGGAGGAAAATACTTTCTGGTACCTATCTATACCTGGCAAGTTGCCAAGGGAATACTCCCGAATAGGGCGGCGACCCAGGGAAAGGATGAAAACGACTGGGATATAATGG
ATGAAATGGCTACGTTTCAGTTTAGCTTGTGCCAGAATGACCTCATAAAACTGGTAACCAAAAAAAAGACTATATTCGGGTATTTCAATGGCCTTAATCGGGCAACTTCCAATATCAACATCAAGGAACATGATCTGGATAAGAGCAAGGGAAAGCTTGGTATCTATCTCGAAGTTGGAGTCAAGCTCGCTATTTCCCTCGAGAAATATCAAGTAGATGAACTGGGAAAGAATATACGGCCATGCCGGCCCACAAAAAGACAACACGTACGGTTCAAAAGGCCGGCGGCCACGAAAAAGGCCGGCCAGGCAAAAAAGAAAAAGGGATCCTACCCATACGATGTTCCAGATTACGCTTATCCCTACGACGTGCCTGATTATGCATACCCATATGATGTCCCCGACTATGCCGGCGCAACAAACTTCTCTCTGCTGAAACAAGCCGGAGATGTCGAAGAGAATCCTGGACCGgtgagcaagggcgaggagctgttcaccggggtggtgcccatcctggtcgagctggacggcgacgtaaacggccacaagttcagcgtgtccggcgagggcgagggcgatgccacctacggcaagctgaccctgaagttcatctgcaccaccggcaagctgcccgtgccctggcccaccctcgtgaccaccctgacctacggcgtgcagtgcttcagccgctaccccgaccacatgaagcagcacgacttcttcaagtccgccatgcccgaaggctacgtccaggagcgcaccatcttcttcaaggacgacggcaactacaagacccgcgccgaggtgaagttcgagggcgacaccctggtgaaccgcatcgagctgaagggcatcgacttcaaggaggacggcaacatcctggggcacaagctggagtacaactacaacagccacaacgtctatatcatggccgacaagcagaagaacggcatcaaggtgaacttcaagatccgccacaacatcgaggacggcagcgtgcagctcgccgaccactaccagcagaacacccccatcggcgacggccccgtgctgctgcccgacaaccactacctgagcacccagtccgccctgagcaaagaccccaacgagaagcgcgatcacatggtcctgctggagttcgtgaccgccgccgggatcactctcggcatggacgagctgtacaagTAA
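The following parts are distinguished in the plasmid sequence: the U6 promoter (first region, uppercase), the sequence complementary to the protospacer ("XXX-XXX"), the conserved part of the sgRNA (third region, uppercase), the PpCas9 gene (highlighted in bold), and the GFP gene (last region, uppercase).
Plasmids carrying PpCas9 or SpCas9 were transfected into a culture of human HEK293T cells using Lipofectamine 2000 reagent. The cells were lysed 72 hours after transfection, and the resulting lysates were subjected to PCR to amplify the regions containing the target modification sites in the genomic DNA. The resulting PCR fragments were subjected to an in vitro reaction with T7 endonuclease I to determine the frequency of insertions and deletions at the target sites in the genomic DNA. The reaction products were loaded onto an agarose gel and subjected to electrophoresis. Figure 14A shows that PpCas9 actively introduces modifications into the EMX1 and GRIN2B genes with an efficiency similar to that of the SpCas9 nuclease described in the prior art.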
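The patent does not state how indel frequencies were calculated from the T7 endonuclease I gel. One widely used estimate, shown here purely as an assumption, converts the fraction of cleaved PCR product into an indel percentage as 100 × (1 − √(1 − f_cut)). The Python sketch below applies that formula to band intensities; the function name and the intensity values are hypothetical.

```python
import math


def t7e1_indel_percent(uncut: float, cut_bands) -> float:
    """Estimate indel frequency (%) from T7E1 gel band intensities.

    uncut     -- intensity of the full-length (uncleaved) PCR band
    cut_bands -- intensities of the cleavage-product bands
    Uses the common approximation 100 * (1 - sqrt(1 - fraction_cut)),
    which is an assumption of this sketch, not a method from the patent.
    """
    cut = sum(cut_bands)
    total = uncut + cut
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    fraction_cut = cut / total
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cut))


# Hypothetical intensities in arbitrary densitometry units.
print(round(t7e1_indel_percent(64.0, [18.0, 18.0]), 1))
```

High-throughput sequencing of the target site, which the example also uses, gives a more direct measurement than this gel-based estimate.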
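This experiment showed that, for efficient modification of genomic DNA, PpCas9 requires a longer sgRNA than SpCas9: in the example given, genetic modification was more efficient with an sgRNA whose sequence is complementary to the DNA target over 24 nucleotides than with one complementary over 20 nucleotides.
High-throughput sequencing confirmed the modifications introduced at the target DNA sites. Figure 14B shows examples of detectable modifications of the nucleotide sequence of the EMX1 gene.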
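Delivery in the form of a ribonucleoprotein complex can also be used to deliver NLS_PpCas9_NLS into human cells. This is done by incubating the recombinant form of PpCas9 NLS with the guide RNA in CutSmart buffer (NEB). The recombinant protein is produced in bacterial producer cells and purified by affinity chromatography (Ni-NTA, Qiagen) followed by size exclusion (Superdex 200).
The protein is mixed with the RNA at a ratio of 1:2 (PpCas9 NLS : sgRNA), and the mixture is incubated at room temperature for 10 minutes and then transfected into the cells.
Next, DNA extracted from the cells is analyzed for insertions/deletions at the target DNA site (as described above).
The PpCas9 nuclease from the bacterium P. pneumotropica characterized in this invention can be delivered into cells of various origins for DNA modification using standard approaches and methods known to those skilled in the art. PpCas9 has a number of advantages over previously characterized Cas9 proteins.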
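PpCas9 has a short three-letter PAM, different from those of other known Cas nucleases, that is required for the system to function. The invention has shown that the presence of the short PAM (RTT), located 4 nucleotides away from the protospacer, is sufficient for PpCas9 to function successfully in vivo.
Many of the small Cas nucleases known to date that can introduce double-strand breaks into DNA have complex multi-letter PAM sequences, which limits the choice of sequences suitable for cleavage. Among the Cas nucleases studied to date that recognize short PAMs, only PpCas9 can recognize sequences flanked by the RTT motif.
The second advantage of PpCas9 is the small size of the protein (1055 a.a.r.). To date, it is the only small protein studied that has a three-letter RTT PAM sequence.
PpCas9 is a novel, small Cas nuclease with a short, easy-to-use PAM that differs from the PAM sequences of other currently known nucleases. The PpCas9 protein cleaves various DNA targets with high efficiency at 37 °C, including genomic DNA in human cells, and may become the basis of a new genome-editing tool.
Although the invention has been described with reference to the disclosed embodiments, those skilled in the art will appreciate that the specific embodiments described in detail are provided to illustrate the invention and should not be construed as limiting its scope in any way. It should be understood that various modifications can be made without departing from the spirit of the invention.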
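Sequence Listing
<110> Autonomous Non-Profit Organization for Higher Education "Skolkovo Institute of Science and Technology"
<120> Use of Cas9 protein from the bacterium Pasteurella pneumotropica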
<130> 424316
<150> RU 2019136164
<151> 2019-11-11
<160> 2
<210> 1
<211> 1055
<212> PRT
<213> Pasteurella pneumotropica
<220>
<223> Protein homologous to Cas9
<400> 1
Met Gln Asn Asn Pro Leu Asn Tyr Ile Leu Gly Leu Asp Leu Gly Ile
1 5 10 15
Ala Ser Ile Gly Trp Ala Val Val Glu Ile Asp Glu Glu Ser Ser Pro
20 25 30
Ile Arg Leu Ile Asp Val Gly Val Arg Thr Phe Glu Arg Ala Glu Val
35 40 45
Ala Lys Thr Gly Glu Ser Leu Ala Leu Ser Arg Arg Leu Ala Arg Ser
50 55 60
Ser Arg Arg Leu Ile Lys Arg Arg Ala Glu Arg Leu Lys Lys Ala Lys
65 70 75 80
Arg Leu Leu Lys Ala Glu Lys Ile Leu His Ser Ile Asp Glu Lys Leu
85 90 95
Pro Ile Asn Val Trp Gln Leu Arg Val Lys Gly Leu Lys Glu Lys Leu
100 105 110
Glu Arg Gln Glu Trp Ala Ala Val Leu Leu His Leu Ser Lys His Arg
115 120 125
Gly Tyr Leu Ser Gln Arg Lys Asn Glu Gly Lys Ser Asp Asn Lys Glu
130 135 140
Leu Gly Ala Leu Leu Ser Gly Ile Ala Ser Asn His Gln Met Leu Gln
145 150 155 160
Ser Ser Glu Tyr Arg Thr Pro Ala Glu Ile Ala Val Lys Lys Phe Gln
165 170 175
Val Glu Glu Gly His Ile Arg Asn Gln Arg Gly Ser Tyr Thr His Thr
180 185 190
Phe Ser Arg Leu Asp Leu Leu Ala Glu Met Glu Leu Leu Phe Gln Arg
195 200 205
Gln Ala Glu Leu Gly Asn Ser Tyr Thr Ser Thr Thr Leu Leu Glu Asn
210 215 220
Leu Thr Ala Leu Leu Met Trp Gln Lys Pro Ala Leu Ala Gly Asp Ala
225 230 235 240
Ile Leu Lys Met Leu Gly Lys Cys Thr Phe Glu Pro Ser Glu Tyr Lys
245 250 255
Ala Ala Lys Asn Ser Tyr Ser Ala Glu Arg Phe Val Trp Leu Thr Lys
260 265 270
Leu Asn Asn Leu Arg Ile Leu Glu Asn Gly Thr Glu Arg Ala Leu Asn
275 280 285
Asp Asn Glu Arg Phe Ala Leu Leu Glu Gln Pro Tyr Glu Lys Ser Lys
290 295 300
Leu Thr Tyr Ala Gln Val Arg Ala Met Leu Ala Leu Ser Asp Asn Ala
305 310 315 320
Ile Phe Lys Gly Val Arg Tyr Leu Gly Glu Asp Lys Lys Thr Val Glu
325 330 335
Ser Lys Thr Thr Leu Ile Glu Met Lys Phe Tyr His Gln Ile Arg Lys
340 345 350
Thr Leu Gly Ser Ala Glu Leu Lys Lys Glu Trp Asn Glu Leu Lys Gly
355 360 365
Asn Ser Asp Leu Leu Asp Glu Ile Gly Thr Ala Phe Ser Leu Tyr Lys
370 375 380
Thr Asp Asp Asp Ile Cys Arg Tyr Leu Glu Gly Lys Leu Pro Glu Arg
385 390 395 400
Val Leu Asn Ala Leu Leu Glu Asn Leu Asn Phe Asp Lys Phe Ile Gln
405 410 415
Leu Ser Leu Lys Ala Leu His Gln Ile Leu Pro Leu Met Leu Gln Gly
420 425 430
Gln Arg Tyr Asp Glu Ala Val Ser Ala Ile Tyr Gly Asp His Tyr Gly
435 440 445
Lys Lys Ser Thr Glu Thr Thr Arg Leu Leu Pro Thr Ile Pro Ala Asp
450 455 460
Glu Ile Arg Asn Pro Val Val Leu Arg Thr Leu Thr Gln Ala Arg Lys
465 470 475 480
Val Ile Asn Ala Val Val Arg Leu Tyr Gly Ser Pro Ala Arg Ile His
485 490 495
Ile Glu Thr Ala Arg Glu Val Gly Lys Ser Tyr Gln Asp Arg Lys Lys
500 505 510
Leu Glu Lys Gln Gln Glu Asp Asn Arg Lys Gln Arg Glu Ser Ala Val
515 520 525
Lys Lys Phe Lys Glu Met Phe Pro His Phe Val Gly Glu Pro Lys Gly
530 535 540
Lys Asp Ile Leu Lys Met Arg Leu Tyr Glu Leu Gln Gln Ala Lys Cys
545 550 555 560
Leu Tyr Ser Gly Lys Ser Leu Glu Leu His Arg Leu Leu Glu Lys Gly
565 570 575
Tyr Val Glu Val Asp His Ala Leu Pro Phe Ser Arg Thr Trp Asp Asp
580 585 590
Ser Phe Asn Asn Lys Val Leu Val Leu Ala Asn Glu Asn Gln Asn Lys
595 600 605
Gly Asn Leu Thr Pro Tyr Glu Trp Leu Asp Gly Lys Asn Asn Ser Glu
610 615 620
Arg Trp Gln His Phe Val Val Arg Val Gln Thr Ser Gly Phe Ser Tyr
625 630 635 640
Ala Lys Lys Gln Arg Ile Leu Asn His Lys Leu Asp Glu Lys Gly Phe
645 650 655
Ile Glu Arg Asn Leu Asn Asp Thr Arg Tyr Val Ala Arg Phe Leu Cys
660 665 670
Asn Phe Ile Ala Asp Asn Met Leu Leu Val Gly Lys Gly Lys Arg Asn
675 680 685
Val Phe Ala Ser Asn Gly Gln Ile Thr Ala Leu Leu Arg His Arg Trp
690 695 700
Gly Leu Gln Lys Val Arg Glu Gln Asn Asp Arg His His Ala Leu Asp
705 710 715 720
Ala Val Val Val Ala Cys Ser Thr Val Ala Met Gln Gln Lys Ile Thr
725 730 735
Arg Phe Val Arg Tyr Asn Glu Gly Asn Val Phe Ser Gly Glu Arg Ile
740 745 750
Asp Arg Glu Thr Gly Glu Ile Ile Pro Leu His Phe Pro Ser Pro Trp
755 760 765
Ala Phe Phe Lys Glu Asn Val Glu Ile Arg Ile Phe Ser Glu Asn Pro
770 775 780
Lys Leu Glu Leu Glu Asn Arg Leu Pro Asp Tyr Pro Gln Tyr Asn His
785 790 795 800
Glu Trp Val Gln Pro Leu Phe Val Ser Arg Met Pro Thr Arg Lys Met
805 810 815
Thr Gly Gln Gly His Met Glu Thr Val Lys Ser Ala Lys Arg Leu Asn
820 825 830
Glu Gly Leu Ser Val Leu Lys Val Pro Leu Thr Gln Leu Lys Leu Ser
835 840 845
Asp Leu Glu Arg Met Val Asn Arg Asp Arg Glu Ile Ala Leu Tyr Glu
850 855 860
Ser Leu Lys Ala Arg Leu Glu Gln Phe Gly Asn Asp Pro Ala Lys Ala
865 870 875 880
Phe Ala Glu Pro Phe Tyr Lys Lys Gly Gly Ala Leu Val Lys Ala Val
885 890 895
Arg Leu Glu Gln Thr Gln Lys Ser Gly Val Leu Val Arg Asp Gly Asn
900 905 910
Gly Val Ala Asp Asn Ala Ser Met Val Arg Val Asp Val Phe Thr Lys
915 920 925
Gly Gly Lys Tyr Phe Leu Val Pro Ile Tyr Thr Trp Gln Val Ala Lys
930 935 940
Gly Ile Leu Pro Asn Arg Ala Ala Thr Gln Gly Lys Asp Glu Asn Asp
945 950 955 960
Trp Asp Ile Met Asp Glu Met Ala Thr Phe Gln Phe Ser Leu Cys Gln
965 970 975
Asn Asp Leu Ile Lys Leu Val Thr Lys Lys Lys Thr Ile Phe Gly Tyr
980 985 990
Phe Asn Gly Leu Asn Arg Ala Thr Ser Asn Ile Asn Ile Lys Glu His
995 1000 1005
Asp Leu Asp Lys Ser Lys Gly Lys Leu Gly Ile Tyr Leu Glu Val Gly
1010 1015 1020
Val Lys Leu Ala Ile Ser Leu Glu Lys Tyr Gln Val Asp Glu Leu Gly
1025 1030 1035 1040
Lys Asn Ile Arg Pro Cys Arg Pro Thr Lys Arg Gln His Val Arg
1045 1050 1055
<210> 2
<211> 3168
<212> DNA
<213> Pasteurella pneumotropica
<220>
<223> Protein homologous to Cas9
<400> 2
atgcaaaata atccattaaa ttacatttta gggttagatt taggcattgc ttctattggt 60
tgggcggttg tggaaattga tgaggagagt tcacctattc gcttaattga tgtgggcgtc 120
cgtacatttg aacgggctga agtcgctaaa accggcgaaa gtttagcatt gtctcgtcgt 180
ttagctcgtt catcacggcg attaattaaa cgccgagcag agcgattaaa aaaagcaaaa 240
cgtttattaa aagcagaaaa gattttacat tctattgatg aaaaattacc cattaatgtt 300
tggcagcttc gagtaaaagg attgaaggaa aaactcgaac gtcaggagtg ggcagcggtt 360
ttattacatt tgtcaaagca tcgtggctat ttatcacaac gtaaaaatga gggtaaaagt 420
gataataaag agctgggggc attactttca ggtatcgcaa gtaaccacca aatgttgcaa 480
tcctccgaat atcgtacccc tgcagaaatt gcagtcaaaa aatttcaagt agaagaagga 540
catattcgta atcaacgtgg atcttatacc cataccttta gccgtttgga tttgttggca 600
gaaatggaat tattatttca acgccaagct gagttaggca attcttacac gtccaccaca 660
ttattagaaa atttgacggc gttactaatg tggcaaaagc cagctcttgc gggtgatgcg 720
attttaaaaa tgttgggcaa gtgtaccttc gaacccagcg aatataaagc cgcaaaaaat 780
agttattctg ctgaacgttt tgtgtggtta accaagctga ataatttacg cattttagaa 840
aatggcacgg aaagagcttt aaatgacaat gaacgttttg ctttgcttga gcaaccgtat 900
gagaaatcaa aattaactta tgctcaagtg agagcaatgc ttgcgttatc tgataatgct 960
attttcaaag gggttcgtta tttaggcgaa gataaaaaaa cagtagagag caaaactacg 1020
ttgatagaaa tgaagtttta tcatcaaatc cgcaaaacat taggcagtgc agaattaaaa 1080
aaggaatgga atgagttaaa aggcaattcc gatttattag atgagattgg cacggcattt 1140
tcgttgtata aaacggatga tgatatttgc cgttatttag agggaaaact accagaaagg 1200
gtattaaatg cgttattgga aaatttaaat ttcgataaat ttattcaact ttcacttaaa 1260
gccttacacc aaattttacc attgatgctg caagggcaac gttatgatga ggcggtttct 1320
gcgatttatg gtgatcatta tggtaaaaaa tcgacagaaa caacccgctt gttgccgact 1380
attcctgccg atgaaatccg aaatcctgtg gtattacgca ccctgaccca agcccgtaaa 1440
gtgatcaatg cggtggtgcg gttatatggt tcgcctgccc gtattcatat tgaaacagcg 1500
agagaagtcg gcaaatctta ccaagatcgt aaaaaacttg aaaaacagca agaagataat 1560
cgtaagcaac gtgaaagtgc ggtcaaaaaa tttaaagaaa tgtttccgca ttttgtgggg 1620
gagccgaaag gtaaagatat tttaaaaatg cgattgtatg agttacaaca agcgaaatgt 1680
ttatattctg gaaaatcttt agaacttcat cgtttgcttg agaaggggta tgtagaagtg 1740
gatcacgctt tgccattttc tcgcacgtgg gatgatagct ttaataataa agtactggtg 1800
cttgccaacg agaaccaaaa taaaggcaat ttaacgcctt atgaatggtt agatggtaaa 1860
aataacagtg agcgttggca acattttgtt gtacgagtac aaaccagcgg tttctcttat 1920
gctaaaaaac aacgcatttt gaaccataaa ttggatgaaa aagggtttat cgaacgtaat 1980
ttaaacgata ctcgctatgt agctcgtttc ttatgtaact ttattgccga taatatgttg 2040
ttggttggta aaggcaagcg aaacgtgttt gcttcaaacg ggcaaatcac ggcgttattg 2100
cggcatcgtt ggggcttaca aaaagtgcgt gaacagaatg atcgccacca cgcactggac 2160
gcggttgtgg tggcttgctc tactgtggca atgcaacaaa aaatcactcg atttgtgaga 2220
tataacgaag gaaatgtctt tagcggtgaa cgtatcgatc gtgaaactgg cgagattatt 2280
ccattacatt ttccaagccc ttgggctttt ttcaaagaga atgtggaaat tcgcattttt 2340
agtgaaaatc caaaattgga attagaaaat cgcctgcctg attatccgca atataatcac 2400
gaatgggtgc aaccattgtt tgtttcgaga atgccaaccc gaaaaatgac agggcaaggg 2460
catatggaaa cggtaaaatc cgcaaaacga ttaaatgaag gtttaagtgt gttaaaagtc 2520
cctttaacac aacttaaatt gagtgattta gaacgaatgg ttaatcgtga tcgtgaaatt 2580
gcattgtatg aatccttaaa agcacgttta gagcaatttg gtaacgaccc agccaaagcc 2640
tttgccgaac cattctataa aaagggtggg gcattagtca aagcagtccg attggaacaa 2700
acacaaaaat cgggggtatt agtacgtgat ggtaacggtg ttgcggataa tgcttcaatg 2760
gtacgggttg atgtttttac taaaggtgga aaatatttct tagtgccgat ttatacttgg 2820
caggtagcga aagggatttt accgaatagg gctgcgacac aaggtaaaga tgaaaatgat 2880
tgggatatta tggatgaaat ggctactttc caattttctc tatgtcaaaa tgatctaatt 2940
aaattagtta ccaaaaagaa aacaatcttt ggatatttta atggattaaa tagagctact 3000
agcaatataa atattaaaga gcatgatcta gataagtcta aagggaaatt aggtatttac 3060
ttagaagttg gtgtaaaact agctatttcc cttgaaaagt accaagtcga cgaactcggc 3120
aaaaatatcc gtccttgtcg tccgactaaa cgacagcacg tgcgttaa 3168
Sequence Listing
<110> JSC BIOCAD
<120> Use of Cas9 protein from the bacterium Pasteurella pneumotropica
<150> RU 2019136164
<151> 2019-11-11
<160> 2
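Claims (8)
1. Use of a protein for forming, in a DNA molecule, a double-strand break located immediately before the nucleotide sequence 5'-NNNN(A/G)TT-3' in said DNA molecule, wherein the protein comprises the amino acid sequence of SEQ ID NO: 1, or comprises an amino acid sequence that is at least 95% identical to the amino acid sequence of SEQ ID NO: 1 and differs from SEQ ID NO: 1 only in non-conserved amino acid residues.
2. The use according to claim 1, characterized in that the double-strand break in the DNA molecule is formed at a temperature of 35 °C to 45 °C.
3. The use of the protein according to claim 1, wherein the protein comprises the amino acid sequence of SEQ ID NO: 1.
4. The use according to claim 1, characterized in that the double-strand break in the DNA molecule is formed in the genomic DNA of a mammalian cell.
5. The use according to claim 4, characterized in that the double-strand break in the DNA molecule results in modification of the genomic DNA of said mammalian cell.
6. A method for modifying a genomic DNA sequence in a cell of a unicellular or multicellular organism comprising genomic DNA, the method comprising introducing into said cell of the organism an effective amount of: a) a protein comprising the amino acid sequence of SEQ ID NO: 1, or a nucleic acid encoding a protein comprising the amino acid sequence of SEQ ID NO: 1, and b) a guide RNA comprising a sequence that forms a duplex with a nucleotide sequence of a region of the organism's genomic DNA, or a DNA sequence encoding said guide RNA, said nucleotide sequence being directly adjacent to the nucleotide sequence 5'-NNNN(A/G)TT-3', and the guide RNA interacting with said protein after duplex formation;
wherein the interaction of said protein with the guide RNA and the nucleotide sequence 5'-NNNN(A/G)TT-3' results in the formation of a double-strand break in the genomic DNA sequence immediately adjacent to the sequence 5'-NNNN(A/G)TT-3'.
7. The method according to claim 6, further comprising introducing an exogenous DNA sequence simultaneously with said guide RNA.
8. The method according to claim 6, characterized in that said cell is a mammalian cell.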
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
RU2019136164 | 2019-11-11 | ||
RU2019136164A RU2724470C1 (ru) | 2019-11-11 | 2019-11-11 | Use of Cas9 protein from the bacterium Pasteurella pneumotropica for modifying genomic DNA in cells |
PCT/RU2020/050145 WO2021096391A1 (ru) | 2019-11-11 | 2020-07-02 | Use of Cas9 protein from the bacterium Pasteurella pneumotropica |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115397995A (zh) | 2022-11-25 |
Family
ID=71136150
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202080092630.XA Pending CN115397995A (zh) | Use of Cas9 protein from the bacterium Pasteurella pneumotropica | 2020-07-02 | 2020-07-02 |
Country Status (16)
Country | Link |
---|---|
US (1) | US20220403369A1 (zh) |
EP (1) | EP4056705A4 (zh) |
JP (1) | JP2023501524A (zh) |
KR (1) | KR20220145324A (zh) |
CN (1) | CN115397995A (zh) |
AU (1) | AU2020384851A1 (zh) |
BR (1) | BR112022009148A2 (zh) |
CA (1) | CA3157898A1 (zh) |
CL (1) | CL2022001220A1 (zh) |
CO (1) | CO2022006156A2 (zh) |
MA (1) | MA57032A1 (zh) |
MX (1) | MX2022005685A (zh) |
PE (1) | PE20230035A1 (zh) |
RU (1) | RU2724470C1 (zh) |
WO (1) | WO2021096391A1 (zh) |
ZA (1) | ZA202205208B (zh) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019165168A1 (en) * | 2018-02-23 | 2019-08-29 | Pioneer Hi-Bred International, Inc. | Novel cas9 orthologs |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
BR112022017713A2 (pt) * | 2020-03-04 | 2022-11-16 | Flagship Pioneering Innovations Vi Llc | Métodos e composições para modular um genoma |
2019
- 2019-11-11 RU RU2019136164A patent/RU2724470C1/ru active
2020
- 2020-07-02 CA CA3157898A patent/CA3157898A1/en active Pending
- 2020-07-02 AU AU2020384851A patent/AU2020384851A1/en active Pending
- 2020-07-02 US US17/775,626 patent/US20220403369A1/en active Pending
- 2020-07-02 CN CN202080092630.XA patent/CN115397995A/zh active Pending
- 2020-07-02 PE PE2022000763A patent/PE20230035A1/es unknown
- 2020-07-02 EP EP20887900.7A patent/EP4056705A4/en active Pending
- 2020-07-02 JP JP2022527121A patent/JP2023501524A/ja active Pending
- 2020-07-02 WO PCT/RU2020/050145 patent/WO2021096391A1/ru unknown
- 2020-07-02 MX MX2022005685A patent/MX2022005685A/es unknown
- 2020-07-02 MA MA57032A patent/MA57032A1/fr unknown
- 2020-07-02 BR BR112022009148A patent/BR112022009148A2/pt unknown
- 2020-07-02 KR KR1020227019785A patent/KR20220145324A/ko unknown
2022
- 2022-05-10 CL CL2022001220A patent/CL2022001220A1/es unknown
- 2022-05-11 CO CONC2022/0006156A patent/CO2022006156A2/es unknown
- 2022-05-11 ZA ZA2022/05208A patent/ZA202205208B/en unknown
Non-Patent Citations (3)
Title |
---|
ASTRID WENINGER, ANNA-MARIA HATZL, CHRISTIAN SCHMID, THOMAS VOGL, ANTON GLIEDER: "Combinatorial optimization of CRISPR/Cas9 expression enables precision genome engineering in the methylotrophic yeast Pichia pastoris", JOURNAL OF BIOTECHNOLOGY, vol. 235, pages 139 - 149 * |
NCBI: "type II CRISPR RNA-guided endonuclease Cas9 [Rodentibacter pneumotropicus]", NCBI REFERENCE SEQUENCE: WP_018356570.1 * |
NEGIN P MARTIN , PAGE MYERS , EUGENIA GOULDING , SHIH-HENG CHEN , MITZIE WALKER , THOMAS M PORTER , LUCAS VAN GORDER , AMANDA MATH: "En masse lentiviral gene delivery to mouse fertilized eggs via laser perforation of zona pellucida", TRANSGENIC RES, vol. 27, no. 1, pages 39 - 49, XP036439454, DOI: 10.1007/s11248-017-0056-8 * |
Also Published As
Publication number | Publication date |
---|---|
US20220403369A1 (en) | 2022-12-22 |
CL2022001220A1 (es) | 2023-01-06 |
MX2022005685A (es) | 2022-07-27 |
PE20230035A1 (es) | 2023-01-10 |
ZA202205208B (en) | 2023-04-26 |
JP2023501524A (ja) | 2023-01-18 |
CA3157898A1 (en) | 2021-05-20 |
RU2724470C1 (ru) | 2020-06-23 |
KR20220145324A (ko) | 2022-10-28 |
CO2022006156A2 (es) | 2023-01-26 |
AU2020384851A1 (en) | 2022-12-01 |
EP4056705A1 (en) | 2022-09-14 |
EP4056705A4 (en) | 2023-12-27 |
WO2021096391A1 (ru) | 2021-05-20 |
MA57032A1 (fr) | 2023-01-31 |
BR112022009148A2 (pt) | 2023-05-02 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
REG | Reference to a national code | Ref country code: HK; Ref legal event code: DE; Ref document number: 40084463 |