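CN117070497A - Gene editing protein variant capable of reducing the off-target rate of gene editing - Google Patents
Gene editing protein variant capable of reducing the off-target rate of gene editing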
- Publication number
- CN117070497A (application CN202210508434.7A)
- Authority
- CN
- China
- Prior art keywords
- lys
- gene
- leu
- ile
- asn
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
- 108090000623 proteins and genes Proteins 0.000 title claims abstract description 150
- 238000010362 genome editing Methods 0.000 title claims abstract description 140
- 102000004169 proteins and genes Human genes 0.000 title claims abstract description 136
- 238000003776 cleavage reaction Methods 0.000 claims abstract description 79
- 230000000694 effects Effects 0.000 claims abstract description 77
- 230000007017 scission Effects 0.000 claims abstract description 34
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 claims abstract description 27
- 150000001413 amino acids Chemical group 0.000 claims abstract description 26
- 230000035772 mutation Effects 0.000 claims abstract description 20
- 239000004472 Lysine Substances 0.000 claims abstract description 12
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 claims abstract description 10
- COLNVLDHVKWLRT-UHFFFAOYSA-N phenylalanine Natural products OC(=O)C(N)CC1=CC=CC=C1 COLNVLDHVKWLRT-UHFFFAOYSA-N 0.000 claims abstract description 10
- 238000000034 method Methods 0.000 claims description 35
- 102000040430 polynucleotide Human genes 0.000 claims description 33
- 108091033319 polynucleotide Proteins 0.000 claims description 33
- 239000002157 polynucleotide Substances 0.000 claims description 33
- 239000013604 expression vector Substances 0.000 claims description 29
- 239000000203 mixture Substances 0.000 claims description 28
- 239000013598 vector Substances 0.000 claims description 21
- 108020005004 Guide RNA Proteins 0.000 claims description 15
- 239000003814 drug Substances 0.000 claims description 11
- 239000013612 plasmid Substances 0.000 claims description 10
- 238000002360 preparation method Methods 0.000 claims description 9
- 150000007523 nucleic acids Chemical group 0.000 claims description 8
- 102000004190 Enzymes Human genes 0.000 claims description 4
- 108090000790 Enzymes Proteins 0.000 claims description 4
- 229940079593 drug Drugs 0.000 claims description 4
- 239000003153 chemical reaction reagent Substances 0.000 claims description 3
- 108091034117 Oligonucleotide Proteins 0.000 claims description 2
- 238000012258 culturing Methods 0.000 claims description 2
- 239000003937 drug carrier Substances 0.000 claims description 2
- 230000008439 repair process Effects 0.000 claims description 2
- 101000860092 Francisella tularensis subsp. novicida (strain U112) CRISPR-associated endonuclease Cas12a Proteins 0.000 abstract description 62
- 235000018102 proteins Nutrition 0.000 description 109
- 210000004027 cell Anatomy 0.000 description 51
- 108020004414 DNA Proteins 0.000 description 45
- 235000001014 amino acid Nutrition 0.000 description 26
- 229940024606 amino acid Drugs 0.000 description 23
- 239000012634 fragment Substances 0.000 description 22
- 102000053602 DNA Human genes 0.000 description 18
- 102000008300 Mutant Proteins Human genes 0.000 description 18
- 108010021466 Mutant Proteins Proteins 0.000 description 18
- 125000003275 alpha amino acid group Chemical group 0.000 description 18
- 108700004991 Cas12a Proteins 0.000 description 17
- FADYJNXDPBKVCA-UHFFFAOYSA-N L-Phenylalanyl-L-lysin Natural products NCCCCC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FADYJNXDPBKVCA-UHFFFAOYSA-N 0.000 description 12
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 12
- 108091028043 Nucleic acid sequence Proteins 0.000 description 12
- 125000000539 amino acid group Chemical group 0.000 description 12
- 108010092854 aspartyllysine Proteins 0.000 description 12
- 108010003700 lysyl aspartic acid Proteins 0.000 description 12
- 108090000765 processed proteins & peptides Proteins 0.000 description 12
- 229920001184 polypeptide Polymers 0.000 description 11
- 102000004196 processed proteins & peptides Human genes 0.000 description 11
- 239000004475 Arginine Substances 0.000 description 10
- 241000196324 Embryophyta Species 0.000 description 10
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 10
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 9
- GOUWCZRDTWTODO-YDHLFZDLSA-N Phe-Val-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O GOUWCZRDTWTODO-YDHLFZDLSA-N 0.000 description 9
- 108010044940 alanylglutamine Proteins 0.000 description 9
- 108010038633 aspartylglutamate Proteins 0.000 description 9
- 108010009298 lysylglutamic acid Proteins 0.000 description 9
- 108010038320 lysylphenylalanine Proteins 0.000 description 9
- 239000002773 nucleotide Substances 0.000 description 9
- 125000003729 nucleotide group Chemical group 0.000 description 9
- 108010073025 phenylalanylphenylalanine Proteins 0.000 description 9
- 108010051110 tyrosyl-lysine Proteins 0.000 description 9
- 210000003527 eukaryotic cell Anatomy 0.000 description 8
- 230000004048 modification Effects 0.000 description 8
- 238000012986 modification Methods 0.000 description 8
- 238000003786 synthesis reaction Methods 0.000 description 8
- 241000894006 Bacteria Species 0.000 description 7
- 230000001580 bacterial effect Effects 0.000 description 7
- 238000005516 engineering process Methods 0.000 description 7
- 239000002609 medium Substances 0.000 description 7
- 210000005253 yeast cell Anatomy 0.000 description 7
- CVGNCMIULZNYES-WHFBIAKZSA-N Ala-Asn-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O CVGNCMIULZNYES-WHFBIAKZSA-N 0.000 description 6
- VCSABYLVNWQYQE-UHFFFAOYSA-N Ala-Lys-Lys Natural products NCCCCC(NC(=O)C(N)C)C(=O)NC(CCCCN)C(O)=O VCSABYLVNWQYQE-UHFFFAOYSA-N 0.000 description 6
- 101100385358 Alicyclobacillus acidoterrestris (strain ATCC 49025 / DSM 3922 / CIP 106132 / NCIMB 13137 / GD3B) cas12b gene Proteins 0.000 description 6
- BXUHCIXDSWRSBS-CIUDSAMLSA-N Asn-Leu-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O BXUHCIXDSWRSBS-CIUDSAMLSA-N 0.000 description 6
- GKWFMNNNYZHJHV-SRVKXCTJSA-N Asp-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC(O)=O GKWFMNNNYZHJHV-SRVKXCTJSA-N 0.000 description 6
- RXBGWGRSWXOBGK-KKUMJFAQSA-N Asp-Lys-Tyr Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O RXBGWGRSWXOBGK-KKUMJFAQSA-N 0.000 description 6
- 241000588724 Escherichia coli Species 0.000 description 6
- HJIFPJUEOGZWRI-GUBZILKMSA-N Glu-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)O)N HJIFPJUEOGZWRI-GUBZILKMSA-N 0.000 description 6
- KBAPKNDWAGVGTH-IGISWZIWSA-N Ile-Ile-Tyr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 KBAPKNDWAGVGTH-IGISWZIWSA-N 0.000 description 6
- ZNOBVZFCHNHKHA-KBIXCLLPSA-N Ile-Ser-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ZNOBVZFCHNHKHA-KBIXCLLPSA-N 0.000 description 6
- KFKWRHQBZQICHA-STQMWFEESA-N L-leucyl-L-phenylalanine Natural products CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KFKWRHQBZQICHA-STQMWFEESA-N 0.000 description 6
- KUIDCYNIEJBZBU-AJNGGQMLSA-N Leu-Ile-Leu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O KUIDCYNIEJBZBU-AJNGGQMLSA-N 0.000 description 6
- RZXLZBIUTDQHJQ-SRVKXCTJSA-N Leu-Lys-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O RZXLZBIUTDQHJQ-SRVKXCTJSA-N 0.000 description 6
- AMSSKPUHBUQBOQ-SRVKXCTJSA-N Leu-Ser-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N AMSSKPUHBUQBOQ-SRVKXCTJSA-N 0.000 description 6
- MPOHDJKRBLVGCT-CIUDSAMLSA-N Lys-Ala-Asn Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCCN)N MPOHDJKRBLVGCT-CIUDSAMLSA-N 0.000 description 6
- WSXTWLJHTLRFLW-SRVKXCTJSA-N Lys-Ala-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(O)=O WSXTWLJHTLRFLW-SRVKXCTJSA-N 0.000 description 6
- NNCDAORZCMPZPX-GUBZILKMSA-N Lys-Gln-Ser Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N NNCDAORZCMPZPX-GUBZILKMSA-N 0.000 description 6
- NJNRBRKHOWSGMN-SRVKXCTJSA-N Lys-Leu-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O NJNRBRKHOWSGMN-SRVKXCTJSA-N 0.000 description 6
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 6
- 101710163270 Nuclease Proteins 0.000 description 6
- 238000012408 PCR amplification Methods 0.000 description 6
- ZENDEDYRYVHBEG-SRVKXCTJSA-N Phe-Asp-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 ZENDEDYRYVHBEG-SRVKXCTJSA-N 0.000 description 6
- FMMIYCMOVGXZIP-AVGNSLFASA-N Phe-Glu-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O FMMIYCMOVGXZIP-AVGNSLFASA-N 0.000 description 6
- KYYMILWEGJYPQZ-IHRRRGAJSA-N Phe-Glu-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CC1=CC=CC=C1 KYYMILWEGJYPQZ-IHRRRGAJSA-N 0.000 description 6
- OYQBFWWQSVIHBN-FHWLQOOXSA-N Phe-Glu-Phe Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O OYQBFWWQSVIHBN-FHWLQOOXSA-N 0.000 description 6
- 108020004682 Single-Stranded DNA Proteins 0.000 description 6
- WAPFQMXRSDEGOE-IHRRRGAJSA-N Tyr-Glu-Gln Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O WAPFQMXRSDEGOE-IHRRRGAJSA-N 0.000 description 6
- NSGZILIDHCIZAM-KKUMJFAQSA-N Tyr-Leu-Ser Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N NSGZILIDHCIZAM-KKUMJFAQSA-N 0.000 description 6
- CDKZJGMPZHPAJC-ULQDDVLXSA-N Tyr-Leu-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 CDKZJGMPZHPAJC-ULQDDVLXSA-N 0.000 description 6
- KOSRFJWDECSPRO-UHFFFAOYSA-N alpha-L-glutamyl-L-glutamic acid Natural products OC(=O)CCC(N)C(=O)NC(CCC(O)=O)C(O)=O KOSRFJWDECSPRO-UHFFFAOYSA-N 0.000 description 6
- 230000015572 biosynthetic process Effects 0.000 description 6
- 238000007796 conventional method Methods 0.000 description 6
- 239000012636 effector Substances 0.000 description 6
- 239000003623 enhancer Substances 0.000 description 6
- 238000009472 formulation Methods 0.000 description 6
- 230000006870 function Effects 0.000 description 6
- 108010055341 glutamyl-glutamic acid Proteins 0.000 description 6
- 108010050848 glycylleucine Proteins 0.000 description 6
- 108010092114 histidylphenylalanine Proteins 0.000 description 6
- RAXXELZNTBOGNW-UHFFFAOYSA-N imidazole Natural products C1=CNC=N1 RAXXELZNTBOGNW-UHFFFAOYSA-N 0.000 description 6
- 108010044374 isoleucyl-tyrosine Proteins 0.000 description 6
- 108010034529 leucyl-lysine Proteins 0.000 description 6
- 108010044056 leucyl-phenylalanine Proteins 0.000 description 6
- 108010057952 lysyl-phenylalanyl-lysine Proteins 0.000 description 6
- 108010064235 lysylglycine Proteins 0.000 description 6
- 108010054155 lysyllysine Proteins 0.000 description 6
- 238000003752 polymerase chain reaction Methods 0.000 description 6
- 239000000047 product Substances 0.000 description 6
- 238000006467 substitution reaction Methods 0.000 description 6
- 108010061238 threonyl-glycine Proteins 0.000 description 6
- 239000002299 complementary DNA Substances 0.000 description 5
- 230000003013 cytotoxicity Effects 0.000 description 5
- 231100000135 cytotoxicity Toxicity 0.000 description 5
- 239000002552 dosage form Substances 0.000 description 5
- 239000000499 gel Substances 0.000 description 5
- 238000011068 loading method Methods 0.000 description 5
- 210000004962 mammalian cell Anatomy 0.000 description 5
- 102000039446 nucleic acids Human genes 0.000 description 5
- 108020004707 nucleic acids Proteins 0.000 description 5
- 238000010188 recombinant method Methods 0.000 description 5
- 241000235649 Kluyveromyces Species 0.000 description 4
- 238000010459 TALEN Methods 0.000 description 4
- 238000007792 addition Methods 0.000 description 4
- 239000000872 buffer Substances 0.000 description 4
- 238000006243 chemical reaction Methods 0.000 description 4
- 238000012217 deletion Methods 0.000 description 4
- 230000037430 deletion Effects 0.000 description 4
- 238000009396 hybridization Methods 0.000 description 4
- 238000000338 in vitro Methods 0.000 description 4
- 239000012669 liquid formulation Substances 0.000 description 4
- 239000003550 marker Substances 0.000 description 4
- 210000001236 prokaryotic cell Anatomy 0.000 description 4
- 230000010076 replication Effects 0.000 description 4
- 238000013518 transcription Methods 0.000 description 4
- 230000035897 transcription Effects 0.000 description 4
- CNKBMTKICGGSCQ-ACRUOGEOSA-N (2S)-2-[[(2S)-2-[[(2S)-2,6-diamino-1-oxohexyl]amino]-1-oxo-3-phenylpropyl]amino]-3-(4-hydroxyphenyl)propanoic acid Chemical compound C([C@H](NC(=O)[C@@H](N)CCCCN)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 CNKBMTKICGGSCQ-ACRUOGEOSA-N 0.000 description 3
- JBGSZRYCXBPWGX-BQBZGAKWSA-N Ala-Arg-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](N)C)CCCN=C(N)N JBGSZRYCXBPWGX-BQBZGAKWSA-N 0.000 description 3
- STACJSVFHSEZJV-GHCJXIJMSA-N Ala-Asn-Ile Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O STACJSVFHSEZJV-GHCJXIJMSA-N 0.000 description 3
- XQGIRPGAVLFKBJ-CIUDSAMLSA-N Ala-Asn-Lys Chemical compound N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)O XQGIRPGAVLFKBJ-CIUDSAMLSA-N 0.000 description 3
- IFKQPMZRDQZSHI-GHCJXIJMSA-N Ala-Ile-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O IFKQPMZRDQZSHI-GHCJXIJMSA-N 0.000 description 3
- FOHXUHGZZKETFI-JBDRJPRFSA-N Ala-Ile-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](C)N FOHXUHGZZKETFI-JBDRJPRFSA-N 0.000 description 3
- CKLDHDOIYBVUNP-KBIXCLLPSA-N Ala-Ile-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O CKLDHDOIYBVUNP-KBIXCLLPSA-N 0.000 description 3
- TZDNWXDLYFIFPT-BJDJZHNGSA-N Ala-Ile-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O TZDNWXDLYFIFPT-BJDJZHNGSA-N 0.000 description 3
- RZZMZYZXNJRPOJ-BJDJZHNGSA-N Ala-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](C)N RZZMZYZXNJRPOJ-BJDJZHNGSA-N 0.000 description 3
- CCDFBRZVTDDJNM-GUBZILKMSA-N Ala-Leu-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O CCDFBRZVTDDJNM-GUBZILKMSA-N 0.000 description 3
- PMQXMXAASGFUDX-SRVKXCTJSA-N Ala-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CCCCN PMQXMXAASGFUDX-SRVKXCTJSA-N 0.000 description 3
- IORKCNUBHNIMKY-CIUDSAMLSA-N Ala-Pro-Glu Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O IORKCNUBHNIMKY-CIUDSAMLSA-N 0.000 description 3
- BTRULDJUUVGRNE-DCAQKATOSA-N Ala-Pro-Lys Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(O)=O BTRULDJUUVGRNE-DCAQKATOSA-N 0.000 description 3
- PEEYDECOOVQKRZ-DLOVCJGASA-N Ala-Ser-Phe Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PEEYDECOOVQKRZ-DLOVCJGASA-N 0.000 description 3
- VYMJAWXRWHJIMS-LKTVYLICSA-N Ala-Tyr-His Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N VYMJAWXRWHJIMS-LKTVYLICSA-N 0.000 description 3
- QRIYOHQJRDHFKF-UWJYBYFXSA-N Ala-Tyr-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=C(O)C=C1 QRIYOHQJRDHFKF-UWJYBYFXSA-N 0.000 description 3
- USNSOPDIZILSJP-FXQIFTODSA-N Arg-Asn-Asn Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O USNSOPDIZILSJP-FXQIFTODSA-N 0.000 description 3
- FEZJJKXNPSEYEV-CIUDSAMLSA-N Arg-Gln-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O FEZJJKXNPSEYEV-CIUDSAMLSA-N 0.000 description 3
- GOWZVQXTHUCNSQ-NHCYSSNCSA-N Arg-Glu-Val Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O GOWZVQXTHUCNSQ-NHCYSSNCSA-N 0.000 description 3
- NVUIWHJLPSZZQC-CYDGBPFRSA-N Arg-Ile-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O NVUIWHJLPSZZQC-CYDGBPFRSA-N 0.000 description 3
- LLUGJARLJCGLAR-CYDGBPFRSA-N Arg-Ile-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N LLUGJARLJCGLAR-CYDGBPFRSA-N 0.000 description 3
- SSZGOKWBHLOCHK-DCAQKATOSA-N Arg-Lys-Asn Chemical compound NC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCN=C(N)N SSZGOKWBHLOCHK-DCAQKATOSA-N 0.000 description 3
- MJINRRBEMOLJAK-DCAQKATOSA-N Arg-Lys-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCN=C(N)N MJINRRBEMOLJAK-DCAQKATOSA-N 0.000 description 3
- XUGATJVGQUGQKY-ULQDDVLXSA-N Arg-Lys-Phe Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 XUGATJVGQUGQKY-ULQDDVLXSA-N 0.000 description 3
- DPLFNLDACGGBAK-KKUMJFAQSA-N Arg-Phe-Glu Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N DPLFNLDACGGBAK-KKUMJFAQSA-N 0.000 description 3
- KZXPVYVSHUJCEO-ULQDDVLXSA-N Arg-Phe-Lys Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(O)=O)CC1=CC=CC=C1 KZXPVYVSHUJCEO-ULQDDVLXSA-N 0.000 description 3
- PRLPSDIHSRITSF-UNQGMJICSA-N Arg-Phe-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PRLPSDIHSRITSF-UNQGMJICSA-N 0.000 description 3
- WVCJSDCHTUTONA-FXQIFTODSA-N Asn-Asp-Arg Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O WVCJSDCHTUTONA-FXQIFTODSA-N 0.000 description 3
- ZWASIOHRQWRWAS-UGYAYLCHSA-N Asn-Asp-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ZWASIOHRQWRWAS-UGYAYLCHSA-N 0.000 description 3
- VJTWLBMESLDOMK-WDSKDSINSA-N Asn-Gln-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O VJTWLBMESLDOMK-WDSKDSINSA-N 0.000 description 3
- COUZKSSMBFADSB-AVGNSLFASA-N Asn-Glu-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)N)N COUZKSSMBFADSB-AVGNSLFASA-N 0.000 description 3
- JZDZLBJVYWIIQU-AVGNSLFASA-N Asn-Glu-Tyr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O JZDZLBJVYWIIQU-AVGNSLFASA-N 0.000 description 3
- PBSQFBAJKPLRJY-BYULHYEWSA-N Asn-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N PBSQFBAJKPLRJY-BYULHYEWSA-N 0.000 description 3
- ODBSSLHUFPJRED-CIUDSAMLSA-N Asn-His-Asn Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CC(=O)N)N ODBSSLHUFPJRED-CIUDSAMLSA-N 0.000 description 3
- AITGTTNYKAWKDR-CIUDSAMLSA-N Asn-His-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O AITGTTNYKAWKDR-CIUDSAMLSA-N 0.000 description 3
- YYSYDIYQTUPNQQ-SXTJYALSSA-N Asn-Ile-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O YYSYDIYQTUPNQQ-SXTJYALSSA-N 0.000 description 3
- ACKNRKFVYUVWAC-ZPFDUUQYSA-N Asn-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)N)N ACKNRKFVYUVWAC-ZPFDUUQYSA-N 0.000 description 3
- SPCONPVIDFMDJI-QSFUFRPTSA-N Asn-Ile-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O SPCONPVIDFMDJI-QSFUFRPTSA-N 0.000 description 3
- PNHQRQTVBRDIEF-CIUDSAMLSA-N Asn-Leu-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(=O)N)N PNHQRQTVBRDIEF-CIUDSAMLSA-N 0.000 description 3
- HFPXZWPUVFVNLL-GUBZILKMSA-N Asn-Leu-Gln Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O HFPXZWPUVFVNLL-GUBZILKMSA-N 0.000 description 3
- MYCSPQIARXTUTP-SRVKXCTJSA-N Asn-Leu-His Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CC(=O)N)N MYCSPQIARXTUTP-SRVKXCTJSA-N 0.000 description 3
- GLWFAWNYGWBMOC-SRVKXCTJSA-N Asn-Leu-Leu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O GLWFAWNYGWBMOC-SRVKXCTJSA-N 0.000 description 3
- TZFQICWZWFNIKU-KKUMJFAQSA-N Asn-Leu-Tyr Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 TZFQICWZWFNIKU-KKUMJFAQSA-N 0.000 description 3
- NCFJQJRLQJEECD-NHCYSSNCSA-N Asn-Leu-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O NCFJQJRLQJEECD-NHCYSSNCSA-N 0.000 description 3
- FTSAJSADJCMDHH-CIUDSAMLSA-N Asn-Lys-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)N)N FTSAJSADJCMDHH-CIUDSAMLSA-N 0.000 description 3
- JWKDQOORUCYUIW-ZPFDUUQYSA-N Asn-Lys-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JWKDQOORUCYUIW-ZPFDUUQYSA-N 0.000 description 3
- WXVGISRWSYGEDK-KKUMJFAQSA-N Asn-Lys-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)N)N WXVGISRWSYGEDK-KKUMJFAQSA-N 0.000 description 3
- AWXDRZJQCVHCIT-DCAQKATOSA-N Asn-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC(N)=O AWXDRZJQCVHCIT-DCAQKATOSA-N 0.000 description 3
- GMUOCGCDOYYWPD-FXQIFTODSA-N Asn-Pro-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O GMUOCGCDOYYWPD-FXQIFTODSA-N 0.000 description 3
- HCZQKHSRYHCPSD-IUKAMOBKSA-N Asn-Thr-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HCZQKHSRYHCPSD-IUKAMOBKSA-N 0.000 description 3
- KTDWFWNZLLFEFU-KKUMJFAQSA-N Asn-Tyr-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CC(=O)N)N)O KTDWFWNZLLFEFU-KKUMJFAQSA-N 0.000 description 3
- BEHQTVDBCLSCBY-CFMVVWHZSA-N Asn-Tyr-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BEHQTVDBCLSCBY-CFMVVWHZSA-N 0.000 description 3
- DPWDPEVGACCWTC-SRVKXCTJSA-N Asn-Tyr-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O DPWDPEVGACCWTC-SRVKXCTJSA-N 0.000 description 3
- WSWYMRLTJVKRCE-ZLUOBGJFSA-N Asp-Ala-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O WSWYMRLTJVKRCE-ZLUOBGJFSA-N 0.000 description 3
- VPPXTHJNTYDNFJ-CIUDSAMLSA-N Asp-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N VPPXTHJNTYDNFJ-CIUDSAMLSA-N 0.000 description 3
- AXXCUABIFZPKPM-BQBZGAKWSA-N Asp-Arg-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O AXXCUABIFZPKPM-BQBZGAKWSA-N 0.000 description 3
- XACXDSRQIXRMNS-OLHMAJIHSA-N Asp-Asn-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CC(=O)O)N)O XACXDSRQIXRMNS-OLHMAJIHSA-N 0.000 description 3
- QOVWVLLHMMCFFY-ZLUOBGJFSA-N Asp-Asp-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O QOVWVLLHMMCFFY-ZLUOBGJFSA-N 0.000 description 3
- VPSHHQXIWLGVDD-ZLUOBGJFSA-N Asp-Asp-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O VPSHHQXIWLGVDD-ZLUOBGJFSA-N 0.000 description 3
- WCFCYFDBMNFSPA-ACZMJKKPSA-N Asp-Asp-Glu Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(O)=O WCFCYFDBMNFSPA-ACZMJKKPSA-N 0.000 description 3
- CELPEWWLSXMVPH-CIUDSAMLSA-N Asp-Asp-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(O)=O CELPEWWLSXMVPH-CIUDSAMLSA-N 0.000 description 3
- QCLHLXDWRKOHRR-GUBZILKMSA-N Asp-Glu-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)O)N QCLHLXDWRKOHRR-GUBZILKMSA-N 0.000 description 3
- VILLWIDTHYPSLC-PEFMBERDSA-N Asp-Glu-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VILLWIDTHYPSLC-PEFMBERDSA-N 0.000 description 3
- PZXPWHFYZXTFBI-YUMQZZPRSA-N Asp-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(O)=O PZXPWHFYZXTFBI-YUMQZZPRSA-N 0.000 description 3
- TZOZNVLBTAFJRW-UGYAYLCHSA-N Asp-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)[C@H](CC(=O)O)N TZOZNVLBTAFJRW-UGYAYLCHSA-N 0.000 description 3
- JNNVNVRBYUJYGS-CIUDSAMLSA-N Asp-Leu-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O JNNVNVRBYUJYGS-CIUDSAMLSA-N 0.000 description 3
- RQHLMGCXCZUOGT-ZPFDUUQYSA-N Asp-Leu-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O RQHLMGCXCZUOGT-ZPFDUUQYSA-N 0.000 description 3
- JTRDJYIZIKCIRC-AJNGGQMLSA-N Asp-Leu-Leu-Gln Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O JTRDJYIZIKCIRC-AJNGGQMLSA-N 0.000 description 3
- UMHUHHJMEXNSIV-CIUDSAMLSA-N Asp-Leu-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC(O)=O UMHUHHJMEXNSIV-CIUDSAMLSA-N 0.000 description 3
- XWSIYTYNLKCLJB-CIUDSAMLSA-N Asp-Lys-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O XWSIYTYNLKCLJB-CIUDSAMLSA-N 0.000 description 3
- NVFSJIXJZCDICF-SRVKXCTJSA-N Asp-Lys-Lys Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N NVFSJIXJZCDICF-SRVKXCTJSA-N 0.000 description 3
- DONWIPDSZZJHHK-HJGDQZAQSA-N Asp-Lys-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC(=O)O)N)O DONWIPDSZZJHHK-HJGDQZAQSA-N 0.000 description 3
- JUWISGAGWSDGDH-KKUMJFAQSA-N Asp-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CC(O)=O)CC1=CC=CC=C1 JUWISGAGWSDGDH-KKUMJFAQSA-N 0.000 description 3
- RPUYTJJZXQBWDT-SRVKXCTJSA-N Asp-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC(=O)O)N RPUYTJJZXQBWDT-SRVKXCTJSA-N 0.000 description 3
- NONWUQAWAANERO-BZSNNMDCSA-N Asp-Phe-Tyr Chemical compound C([C@H](NC(=O)[C@H](CC(O)=O)N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(O)=O)C1=CC=CC=C1 NONWUQAWAANERO-BZSNNMDCSA-N 0.000 description 3
- WMLFFCRUSPNENW-ZLUOBGJFSA-N Asp-Ser-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O WMLFFCRUSPNENW-ZLUOBGJFSA-N 0.000 description 3
- KBJVTFWQWXCYCQ-IUKAMOBKSA-N Asp-Thr-Ile Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KBJVTFWQWXCYCQ-IUKAMOBKSA-N 0.000 description 3
- GXHDGYOXPNQCKM-XVSYOHENSA-N Asp-Thr-Phe Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CC(=O)O)N)O GXHDGYOXPNQCKM-XVSYOHENSA-N 0.000 description 3
- BJDHEININLSZOT-KKUMJFAQSA-N Asp-Tyr-Lys Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(O)=O BJDHEININLSZOT-KKUMJFAQSA-N 0.000 description 3
- PLOKOIJSGCISHE-BYULHYEWSA-N Asp-Val-Asn Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O PLOKOIJSGCISHE-BYULHYEWSA-N 0.000 description 3
- XQFLFQWOBXPMHW-NHCYSSNCSA-N Asp-Val-His Chemical compound N[C@@H](CC(=O)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CNC=N1)C(=O)O XQFLFQWOBXPMHW-NHCYSSNCSA-N 0.000 description 3
- JGLWFWXGOINXEA-YDHLFZDLSA-N Asp-Val-Tyr Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 JGLWFWXGOINXEA-YDHLFZDLSA-N 0.000 description 3
- KCSDYJSCUWLILX-BJDJZHNGSA-N Cys-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CS)N KCSDYJSCUWLILX-BJDJZHNGSA-N 0.000 description 3
- XCDDSPYIMNXECQ-NAKRPEOUSA-N Cys-Pro-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CS XCDDSPYIMNXECQ-NAKRPEOUSA-N 0.000 description 3
- HPZAJRPYUIHDIN-BZSNNMDCSA-N Cys-Tyr-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)NC(=O)[C@H](CS)N HPZAJRPYUIHDIN-BZSNNMDCSA-N 0.000 description 3
- 239000003298 DNA probe Substances 0.000 description 3
- 241000589602 Francisella tularensis Species 0.000 description 3
- TWHDOEYLXXQYOZ-FXQIFTODSA-N Gln-Asn-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N TWHDOEYLXXQYOZ-FXQIFTODSA-N 0.000 description 3
- AAOBFSKXAVIORT-GUBZILKMSA-N Gln-Asn-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O AAOBFSKXAVIORT-GUBZILKMSA-N 0.000 description 3
- ODBLJLZVLAWVMS-GUBZILKMSA-N Gln-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)N)N ODBLJLZVLAWVMS-GUBZILKMSA-N 0.000 description 3
- IXFVOPOHSRKJNG-LAEOZQHASA-N Gln-Asp-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O IXFVOPOHSRKJNG-LAEOZQHASA-N 0.000 description 3
- PCKOTDPDHIBGRW-CIUDSAMLSA-N Gln-Cys-Arg Chemical compound C(C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CCC(=O)N)N)CN=C(N)N PCKOTDPDHIBGRW-CIUDSAMLSA-N 0.000 description 3
- AJDMYLOISOCHHC-YVNDNENWSA-N Gln-Gln-Ile Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O AJDMYLOISOCHHC-YVNDNENWSA-N 0.000 description 3
- UFNSPPFJOHNXRE-AUTRQRHGSA-N Gln-Gln-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O UFNSPPFJOHNXRE-AUTRQRHGSA-N 0.000 description 3
- KCJJFESQRXGTGC-BQBZGAKWSA-N Gln-Glu-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O KCJJFESQRXGTGC-BQBZGAKWSA-N 0.000 description 3
- LFIVHGMKWFGUGK-IHRRRGAJSA-N Gln-Glu-Phe Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)N)N LFIVHGMKWFGUGK-IHRRRGAJSA-N 0.000 description 3
- XSBGUANSZDGULP-IUCAKERBSA-N Gln-Gly-Lys Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)NCC(=O)N[C@@H](CCCCN)C(O)=O XSBGUANSZDGULP-IUCAKERBSA-N 0.000 description 3
- YXQCLIVLWCKCRS-RYUDHWBXSA-N Gln-Gly-Tyr Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCC(=O)N)N)O YXQCLIVLWCKCRS-RYUDHWBXSA-N 0.000 description 3
- RGAOLBZBLOJUTP-GRLWGSQLSA-N Gln-Ile-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)NC(=O)[C@H](CCC(=O)N)N RGAOLBZBLOJUTP-GRLWGSQLSA-N 0.000 description 3
- FTIJVMLAGRAYMJ-MNXVOIDGSA-N Gln-Ile-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCC(N)=O FTIJVMLAGRAYMJ-MNXVOIDGSA-N 0.000 description 3
- ITZWDGBYBPUZRG-KBIXCLLPSA-N Gln-Ile-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O ITZWDGBYBPUZRG-KBIXCLLPSA-N 0.000 description 3
- HSHCEAUPUPJPTE-JYJNAYRXSA-N Gln-Leu-Tyr Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N HSHCEAUPUPJPTE-JYJNAYRXSA-N 0.000 description 3
- FKXCBKCOSVIGCT-AVGNSLFASA-N Gln-Lys-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O FKXCBKCOSVIGCT-AVGNSLFASA-N 0.000 description 3
- ZGHMRONFHDVXEF-AVGNSLFASA-N Gln-Ser-Phe Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ZGHMRONFHDVXEF-AVGNSLFASA-N 0.000 description 3
- ININBLZFFVOQIO-JHEQGTHGSA-N Gln-Thr-Gly Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)N)N)O ININBLZFFVOQIO-JHEQGTHGSA-N 0.000 description 3
- CSMHMEATMDCQNY-DZKIICNBSA-N Gln-Val-Tyr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O CSMHMEATMDCQNY-DZKIICNBSA-N 0.000 description 3
- SOEXCCGNHQBFPV-DLOVCJGASA-N Gln-Val-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C(C)C)C(O)=O SOEXCCGNHQBFPV-DLOVCJGASA-N 0.000 description 3
- LKDIBBOKUAASNP-FXQIFTODSA-N Glu-Ala-Glu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O LKDIBBOKUAASNP-FXQIFTODSA-N 0.000 description 3
- ITYRYNUZHPNCIK-GUBZILKMSA-N Glu-Ala-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O ITYRYNUZHPNCIK-GUBZILKMSA-N 0.000 description 3
- CVPXINNKRTZBMO-CIUDSAMLSA-N Glu-Arg-Asn Chemical compound C(C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N)CN=C(N)N CVPXINNKRTZBMO-CIUDSAMLSA-N 0.000 description 3
- PBEQPAZRHDVJQI-SRVKXCTJSA-N Glu-Arg-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCC(=O)O)N PBEQPAZRHDVJQI-SRVKXCTJSA-N 0.000 description 3
- RJONUNZIMUXUOI-GUBZILKMSA-N Glu-Asn-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCC(=O)O)N RJONUNZIMUXUOI-GUBZILKMSA-N 0.000 description 3
- JPHYJQHPILOKHC-ACZMJKKPSA-N Glu-Asp-Asp Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O JPHYJQHPILOKHC-ACZMJKKPSA-N 0.000 description 3
- NADWTMLCUDMDQI-ACZMJKKPSA-N Glu-Asp-Cys Chemical compound C(CC(=O)O)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CS)C(=O)O)N NADWTMLCUDMDQI-ACZMJKKPSA-N 0.000 description 3
- IESFZVCAVACGPH-PEFMBERDSA-N Glu-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCC(O)=O IESFZVCAVACGPH-PEFMBERDSA-N 0.000 description 3
- XHUCVVHRLNPZSZ-CIUDSAMLSA-N Glu-Gln-Glu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O XHUCVVHRLNPZSZ-CIUDSAMLSA-N 0.000 description 3
- BUAKRRKDHSSIKK-IHRRRGAJSA-N Glu-Glu-Tyr Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 BUAKRRKDHSSIKK-IHRRRGAJSA-N 0.000 description 3
- ZWABFSSWTSAMQN-KBIXCLLPSA-N Glu-Ile-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O ZWABFSSWTSAMQN-KBIXCLLPSA-N 0.000 description 3
- ZCOJVESMNGBGLF-GRLWGSQLSA-N Glu-Ile-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O ZCOJVESMNGBGLF-GRLWGSQLSA-N 0.000 description 3
- HVYWQYLBVXMXSV-GUBZILKMSA-N Glu-Leu-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O HVYWQYLBVXMXSV-GUBZILKMSA-N 0.000 description 3
- WNRZUESNGGDCJX-JYJNAYRXSA-N Glu-Leu-Phe Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O WNRZUESNGGDCJX-JYJNAYRXSA-N 0.000 description 3
- NJCALAAIGREHDR-WDCWCFNPSA-N Glu-Leu-Thr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NJCALAAIGREHDR-WDCWCFNPSA-N 0.000 description 3
- OQXDUSZKISQQSS-GUBZILKMSA-N Glu-Lys-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O OQXDUSZKISQQSS-GUBZILKMSA-N 0.000 description 3
- ILWHFUZZCFYSKT-AVGNSLFASA-N Glu-Lys-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O ILWHFUZZCFYSKT-AVGNSLFASA-N 0.000 description 3
- ZGEJRLJEAMPEDV-SRVKXCTJSA-N Glu-Lys-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCC(=O)O)N ZGEJRLJEAMPEDV-SRVKXCTJSA-N 0.000 description 3
- MFNUFCFRAZPJFW-JYJNAYRXSA-N Glu-Lys-Phe Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 MFNUFCFRAZPJFW-JYJNAYRXSA-N 0.000 description 3
- UERORLSAFUHDGU-AVGNSLFASA-N Glu-Phe-Asn Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)O)N UERORLSAFUHDGU-AVGNSLFASA-N 0.000 description 3
- MRWYPDWDZSLWJM-ACZMJKKPSA-N Glu-Ser-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O MRWYPDWDZSLWJM-ACZMJKKPSA-N 0.000 description 3
- QOXDAWODGSIDDI-GUBZILKMSA-N Glu-Ser-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(=O)O)N QOXDAWODGSIDDI-GUBZILKMSA-N 0.000 description 3
- DMYACXMQUABZIQ-NRPADANISA-N Glu-Ser-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O DMYACXMQUABZIQ-NRPADANISA-N 0.000 description 3
- YQAQQKPWFOBSMU-WDCWCFNPSA-N Glu-Thr-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O YQAQQKPWFOBSMU-WDCWCFNPSA-N 0.000 description 3
- HAGKYCXGTRUUFI-RYUDHWBXSA-N Glu-Tyr-Gly Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CCC(=O)O)N)O HAGKYCXGTRUUFI-RYUDHWBXSA-N 0.000 description 3
- OCQUNKSFDYDXBG-QXEWZRGKSA-N Gly-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N OCQUNKSFDYDXBG-QXEWZRGKSA-N 0.000 description 3
- KKBWDNZXYLGJEY-UHFFFAOYSA-N Gly-Arg-Pro Natural products NCC(=O)NC(CCNC(=N)N)C(=O)N1CCCC1C(=O)O KKBWDNZXYLGJEY-UHFFFAOYSA-N 0.000 description 3
- BGVYNAQWHSTTSP-BYULHYEWSA-N Gly-Asn-Ile Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BGVYNAQWHSTTSP-BYULHYEWSA-N 0.000 description 3
- JVACNFOPSUPDTK-QWRGUYRKSA-N Gly-Asn-Phe Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 JVACNFOPSUPDTK-QWRGUYRKSA-N 0.000 description 3
- HDNXXTBKOJKWNN-WDSKDSINSA-N Gly-Glu-Asn Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O HDNXXTBKOJKWNN-WDSKDSINSA-N 0.000 description 3
- XTQFHTHIAKKCTM-YFKPBYRVSA-N Gly-Glu-Gly Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O XTQFHTHIAKKCTM-YFKPBYRVSA-N 0.000 description 3
- QSVCIFZPGLOZGH-WDSKDSINSA-N Gly-Glu-Ser Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O QSVCIFZPGLOZGH-WDSKDSINSA-N 0.000 description 3
- OLPPXYMMIARYAL-QMMMGPOBSA-N Gly-Gly-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)CNC(=O)CN OLPPXYMMIARYAL-QMMMGPOBSA-N 0.000 description 3
- VEPBEGNDJYANCF-QWRGUYRKSA-N Gly-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCCN VEPBEGNDJYANCF-QWRGUYRKSA-N 0.000 description 3
- FXLVSYVJDPCIHH-STQMWFEESA-N Gly-Phe-Arg Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FXLVSYVJDPCIHH-STQMWFEESA-N 0.000 description 3
- YOBGUCWZPXJHTN-BQBZGAKWSA-N Gly-Ser-Arg Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YOBGUCWZPXJHTN-BQBZGAKWSA-N 0.000 description 3
- DBUNZBWUWCIELX-JHEQGTHGSA-N Gly-Thr-Glu Chemical compound [H]NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O DBUNZBWUWCIELX-JHEQGTHGSA-N 0.000 description 3
- UVTSZKIATYSKIR-RYUDHWBXSA-N Gly-Tyr-Glu Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O UVTSZKIATYSKIR-RYUDHWBXSA-N 0.000 description 3
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 3
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 3
- RVKIPWVMZANZLI-UHFFFAOYSA-N H-Lys-Trp-OH Natural products C1=CC=C2C(CC(NC(=O)C(N)CCCCN)C(O)=O)=CNC2=C1 RVKIPWVMZANZLI-UHFFFAOYSA-N 0.000 description 3
- SVHKVHBPTOMLTO-DCAQKATOSA-N His-Arg-Asp Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O SVHKVHBPTOMLTO-DCAQKATOSA-N 0.000 description 3
- FLYSHWAAHYNKRT-JYJNAYRXSA-N His-Gln-Phe Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O FLYSHWAAHYNKRT-JYJNAYRXSA-N 0.000 description 3
- TVRMJKNELJKNRS-GUBZILKMSA-N His-Glu-Asn Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(=O)N)C(=O)O)N TVRMJKNELJKNRS-GUBZILKMSA-N 0.000 description 3
- HQKADFMLECZIQJ-HVTMNAMFSA-N His-Glu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC1=CN=CN1)N HQKADFMLECZIQJ-HVTMNAMFSA-N 0.000 description 3
- HAPWZEVRQYGLSG-IUCAKERBSA-N His-Gly-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O HAPWZEVRQYGLSG-IUCAKERBSA-N 0.000 description 3
- ZVKDCQVQTGYBQT-LSJOCFKGSA-N His-Pro-Ala Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O ZVKDCQVQTGYBQT-LSJOCFKGSA-N 0.000 description 3
- PYNPBMCLAKTHJL-SRVKXCTJSA-N His-Pro-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O PYNPBMCLAKTHJL-SRVKXCTJSA-N 0.000 description 3
- NKVZTQVGUNLLQW-JBDRJPRFSA-N Ile-Ala-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)O)N NKVZTQVGUNLLQW-JBDRJPRFSA-N 0.000 description 3
- QYZYJFXHXYUZMZ-UGYAYLCHSA-N Ile-Asn-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N QYZYJFXHXYUZMZ-UGYAYLCHSA-N 0.000 description 3
- HZMLFETXHFHGBB-UGYAYLCHSA-N Ile-Asn-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N HZMLFETXHFHGBB-UGYAYLCHSA-N 0.000 description 3
- IDAHFEPYTJJZFD-PEFMBERDSA-N Ile-Asp-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N IDAHFEPYTJJZFD-PEFMBERDSA-N 0.000 description 3
- QSPLUJGYOPZINY-ZPFDUUQYSA-N Ile-Asp-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N QSPLUJGYOPZINY-ZPFDUUQYSA-N 0.000 description 3
- WEWCEPOYKANMGZ-MMWGEVLESA-N Ile-Cys-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)N1CCC[C@@H]1C(=O)O)N WEWCEPOYKANMGZ-MMWGEVLESA-N 0.000 description 3
- VCYVLFAWCJRXFT-HJPIBITLSA-N Ile-Cys-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N VCYVLFAWCJRXFT-HJPIBITLSA-N 0.000 description 3
- CDGLBYSAZFIIJO-RCOVLWMOSA-N Ile-Gly-Gly Chemical compound CC[C@H](C)[C@H]([NH3+])C(=O)NCC(=O)NCC([O-])=O CDGLBYSAZFIIJO-RCOVLWMOSA-N 0.000 description 3
- NYEYYMLUABXDMC-NHCYSSNCSA-N Ile-Gly-Leu Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H](CC(C)C)C(=O)O)N NYEYYMLUABXDMC-NHCYSSNCSA-N 0.000 description 3
- DFFTXLCCDFYRKD-MBLNEYKQSA-N Ile-Gly-Thr Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(=O)O)N DFFTXLCCDFYRKD-MBLNEYKQSA-N 0.000 description 3
- KLBVGHCGHUNHEA-BJDJZHNGSA-N Ile-Leu-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)O)N KLBVGHCGHUNHEA-BJDJZHNGSA-N 0.000 description 3
- YGDWPQCLFJNMOL-MNXVOIDGSA-N Ile-Leu-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N YGDWPQCLFJNMOL-MNXVOIDGSA-N 0.000 description 3
- GVKKVHNRTUFCCE-BJDJZHNGSA-N Ile-Leu-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)O)N GVKKVHNRTUFCCE-BJDJZHNGSA-N 0.000 description 3
- OVDKXUDMKXAZIV-ZPFDUUQYSA-N Ile-Lys-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)N)C(=O)O)N OVDKXUDMKXAZIV-ZPFDUUQYSA-N 0.000 description 3
- RMNMUUCYTMLWNA-ZPFDUUQYSA-N Ile-Lys-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)O)C(=O)O)N RMNMUUCYTMLWNA-ZPFDUUQYSA-N 0.000 description 3
- PNTWNAXGBOZMBO-MNXVOIDGSA-N Ile-Lys-Gln Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N PNTWNAXGBOZMBO-MNXVOIDGSA-N 0.000 description 3
- YSGBJIQXTIVBHZ-AJNGGQMLSA-N Ile-Lys-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O YSGBJIQXTIVBHZ-AJNGGQMLSA-N 0.000 description 3
- GVNNAHIRSDRIII-AJNGGQMLSA-N Ile-Lys-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)O)N GVNNAHIRSDRIII-AJNGGQMLSA-N 0.000 description 3
- FFJQAEYLAQMGDL-MGHWNKPDSA-N Ile-Lys-Tyr Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 FFJQAEYLAQMGDL-MGHWNKPDSA-N 0.000 description 3
- IIWQTXMUALXGOV-PCBIJLKTSA-N Ile-Phe-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(=O)O)C(=O)O)N IIWQTXMUALXGOV-PCBIJLKTSA-N 0.000 description 3
- USXAYNCLFSUSBA-MGHWNKPDSA-N Ile-Phe-His Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)N USXAYNCLFSUSBA-MGHWNKPDSA-N 0.000 description 3
- NLZVTPYXYXMCIP-XUXIUFHCSA-N Ile-Pro-Lys Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(O)=O NLZVTPYXYXMCIP-XUXIUFHCSA-N 0.000 description 3
- CZWANIQKACCEKW-CYDGBPFRSA-N Ile-Pro-Met Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCSC)C(=O)O)N CZWANIQKACCEKW-CYDGBPFRSA-N 0.000 description 3
- VGSPNSSCMOHRRR-BJDJZHNGSA-N Ile-Ser-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N VGSPNSSCMOHRRR-BJDJZHNGSA-N 0.000 description 3
- DZMWFIRHFFVBHS-ZEWNOJEFSA-N Ile-Tyr-Phe Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)N[C@@H](CC2=CC=CC=C2)C(=O)O)N DZMWFIRHFFVBHS-ZEWNOJEFSA-N 0.000 description 3
- 108010065920 Insulin Lispro Proteins 0.000 description 3
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 3
- 241000880493 Leptailurus serval Species 0.000 description 3
- SUPVSFFZWVOEOI-CQDKDKBSSA-N Leu-Ala-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 SUPVSFFZWVOEOI-CQDKDKBSSA-N 0.000 description 3
- SUPVSFFZWVOEOI-UHFFFAOYSA-N Leu-Ala-Tyr Natural products CC(C)CC(N)C(=O)NC(C)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 SUPVSFFZWVOEOI-UHFFFAOYSA-N 0.000 description 3
- JUWJEAPUNARGCF-DCAQKATOSA-N Leu-Arg-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O JUWJEAPUNARGCF-DCAQKATOSA-N 0.000 description 3
- RFUBXQQFJFGJFV-GUBZILKMSA-N Leu-Asn-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O RFUBXQQFJFGJFV-GUBZILKMSA-N 0.000 description 3
- OIARJGNVARWKFP-YUMQZZPRSA-N Leu-Asn-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O OIARJGNVARWKFP-YUMQZZPRSA-N 0.000 description 3
- FIJMQLGQLBLBOL-HJGDQZAQSA-N Leu-Asn-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O FIJMQLGQLBLBOL-HJGDQZAQSA-N 0.000 description 3
- PJYSOYLLTJKZHC-GUBZILKMSA-N Leu-Asp-Gln Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCC(N)=O PJYSOYLLTJKZHC-GUBZILKMSA-N 0.000 description 3
- MYGQXVYRZMKRDB-SRVKXCTJSA-N Leu-Asp-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN MYGQXVYRZMKRDB-SRVKXCTJSA-N 0.000 description 3
- QLQHWWCSCLZUMA-KKUMJFAQSA-N Leu-Asp-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 QLQHWWCSCLZUMA-KKUMJFAQSA-N 0.000 description 3
- YVKSMSDXKMSIRX-GUBZILKMSA-N Leu-Glu-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O YVKSMSDXKMSIRX-GUBZILKMSA-N 0.000 description 3
- ZFNLIDNJUWNIJL-WDCWCFNPSA-N Leu-Glu-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZFNLIDNJUWNIJL-WDCWCFNPSA-N 0.000 description 3
- DBSLVQBXKVKDKJ-BJDJZHNGSA-N Leu-Ile-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O DBSLVQBXKVKDKJ-BJDJZHNGSA-N 0.000 description 3
- USLNHQZCDQJBOV-ZPFDUUQYSA-N Leu-Ile-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(O)=O USLNHQZCDQJBOV-ZPFDUUQYSA-N 0.000 description 3
- HGFGEMSVBMCFKK-MNXVOIDGSA-N Leu-Ile-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O HGFGEMSVBMCFKK-MNXVOIDGSA-N 0.000 description 3
- OMHLATXVNQSALM-FQUUOJAGSA-N Leu-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC(C)C)N OMHLATXVNQSALM-FQUUOJAGSA-N 0.000 description 3
- HRTRLSRYZZKPCO-BJDJZHNGSA-N Leu-Ile-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CO)C(O)=O HRTRLSRYZZKPCO-BJDJZHNGSA-N 0.000 description 3
- KYIIALJHAOIAHF-KKUMJFAQSA-N Leu-Leu-His Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CC1=CN=CN1 KYIIALJHAOIAHF-KKUMJFAQSA-N 0.000 description 3
- ZRHDPZAAWLXXIR-SRVKXCTJSA-N Leu-Lys-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O ZRHDPZAAWLXXIR-SRVKXCTJSA-N 0.000 description 3
- HVHRPWQEQHIQJF-AVGNSLFASA-N Leu-Lys-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O HVHRPWQEQHIQJF-AVGNSLFASA-N 0.000 description 3
- BIZNDKMFQHDOIE-KKUMJFAQSA-N Leu-Phe-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(O)=O)CC1=CC=CC=C1 BIZNDKMFQHDOIE-KKUMJFAQSA-N 0.000 description 3
- ZAVCJRJOQKIOJW-KKUMJFAQSA-N Leu-Phe-Asp Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(O)=O)C(O)=O)CC1=CC=CC=C1 ZAVCJRJOQKIOJW-KKUMJFAQSA-N 0.000 description 3
- PJWOOBTYQNNRBF-BZSNNMDCSA-N Leu-Phe-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)O)N PJWOOBTYQNNRBF-BZSNNMDCSA-N 0.000 description 3
- MVVSHHJKJRZVNY-ACRUOGEOSA-N Leu-Phe-Tyr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MVVSHHJKJRZVNY-ACRUOGEOSA-N 0.000 description 3
- MVHXGBZUJLWZOH-BJDJZHNGSA-N Leu-Ser-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MVHXGBZUJLWZOH-BJDJZHNGSA-N 0.000 description 3
- WUHBLPVELFTPQK-KKUMJFAQSA-N Leu-Tyr-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(N)=O)C(O)=O WUHBLPVELFTPQK-KKUMJFAQSA-N 0.000 description 3
- FZIJIFCXUCZHOL-CIUDSAMLSA-N Lys-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN FZIJIFCXUCZHOL-CIUDSAMLSA-N 0.000 description 3
- WXJKFRMKJORORD-DCAQKATOSA-N Lys-Arg-Ala Chemical compound NC(=N)NCCC[C@@H](C(=O)N[C@@H](C)C(O)=O)NC(=O)[C@@H](N)CCCCN WXJKFRMKJORORD-DCAQKATOSA-N 0.000 description 3
- GAOJCVKPIGHTGO-UWVGGRQHSA-N Lys-Arg-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O GAOJCVKPIGHTGO-UWVGGRQHSA-N 0.000 description 3
- ABHIXYDMILIUKV-CIUDSAMLSA-N Lys-Asn-Asn Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O ABHIXYDMILIUKV-CIUDSAMLSA-N 0.000 description 3
- 108010062166 Lys-Asn-Asp Proteins 0.000 description 3
- BYPMOIFBQPEWOH-CIUDSAMLSA-N Lys-Asn-Asp Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)O)C(=O)O)N BYPMOIFBQPEWOH-CIUDSAMLSA-N 0.000 description 3
- HQVDJTYKCMIWJP-YUMQZZPRSA-N Lys-Asn-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O HQVDJTYKCMIWJP-YUMQZZPRSA-N 0.000 description 3
- FLCMXEFCTLXBTL-DCAQKATOSA-N Lys-Asp-Arg Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N FLCMXEFCTLXBTL-DCAQKATOSA-N 0.000 description 3
- LMVOVCYVZBBWQB-SRVKXCTJSA-N Lys-Asp-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN LMVOVCYVZBBWQB-SRVKXCTJSA-N 0.000 description 3
- NTBFKPBULZGXQL-KKUMJFAQSA-N Lys-Asp-Tyr Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 NTBFKPBULZGXQL-KKUMJFAQSA-N 0.000 description 3
- GJJQCBVRWDGLMQ-GUBZILKMSA-N Lys-Glu-Ala Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O GJJQCBVRWDGLMQ-GUBZILKMSA-N 0.000 description 3
- GCMWRRQAKQXDED-IUCAKERBSA-N Lys-Glu-Gly Chemical compound [NH3+]CCCC[C@H]([NH3+])C(=O)N[C@@H](CCC([O-])=O)C(=O)NCC([O-])=O GCMWRRQAKQXDED-IUCAKERBSA-N 0.000 description 3
- DCRWPTBMWMGADO-AVGNSLFASA-N Lys-Glu-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O DCRWPTBMWMGADO-AVGNSLFASA-N 0.000 description 3
- VQXAVLQBQJMENB-SRVKXCTJSA-N Lys-Glu-Met Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCSC)C(O)=O VQXAVLQBQJMENB-SRVKXCTJSA-N 0.000 description 3
- PAMDBWYMLWOELY-SDDRHHMPSA-N Lys-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCCCN)N)C(=O)O PAMDBWYMLWOELY-SDDRHHMPSA-N 0.000 description 3
- VEGLGAOVLFODGC-GUBZILKMSA-N Lys-Glu-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O VEGLGAOVLFODGC-GUBZILKMSA-N 0.000 description 3
- GQZMPWBZQALKJO-UWVGGRQHSA-N Lys-Gly-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O GQZMPWBZQALKJO-UWVGGRQHSA-N 0.000 description 3
- XNKDCYABMBBEKN-IUCAKERBSA-N Lys-Gly-Gln Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O XNKDCYABMBBEKN-IUCAKERBSA-N 0.000 description 3
- DTUZCYRNEJDKSR-NHCYSSNCSA-N Lys-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN DTUZCYRNEJDKSR-NHCYSSNCSA-N 0.000 description 3
- GQFDWEDHOQRNLC-QWRGUYRKSA-N Lys-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN GQFDWEDHOQRNLC-QWRGUYRKSA-N 0.000 description 3
- NKKFVJRLCCUJNA-QWRGUYRKSA-N Lys-Gly-Lys Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN NKKFVJRLCCUJNA-QWRGUYRKSA-N 0.000 description 3
- PBLLTSKBTAHDNA-KBPBESRZSA-N Lys-Gly-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O PBLLTSKBTAHDNA-KBPBESRZSA-N 0.000 description 3
- ZASPELYMPSACER-HOCLYGCPSA-N Lys-Gly-Trp Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O ZASPELYMPSACER-HOCLYGCPSA-N 0.000 description 3
- NNKLKUUGESXCBS-KBPBESRZSA-N Lys-Gly-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O NNKLKUUGESXCBS-KBPBESRZSA-N 0.000 description 3
- MXMDJEJWERYPMO-XUXIUFHCSA-N Lys-Ile-Arg Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O MXMDJEJWERYPMO-XUXIUFHCSA-N 0.000 description 3
- NCZIQZYZPUPMKY-PPCPHDFISA-N Lys-Ile-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NCZIQZYZPUPMKY-PPCPHDFISA-N 0.000 description 3
- MUXNCRWTWBMNHX-SRVKXCTJSA-N Lys-Leu-Asp Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O MUXNCRWTWBMNHX-SRVKXCTJSA-N 0.000 description 3
- RBEATVHTWHTHTJ-KKUMJFAQSA-N Lys-Leu-Lys Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(O)=O RBEATVHTWHTHTJ-KKUMJFAQSA-N 0.000 description 3
- OIQSIMFSVLLWBX-VOAKCMCISA-N Lys-Leu-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OIQSIMFSVLLWBX-VOAKCMCISA-N 0.000 description 3
- VUTWYNQUSJWBHO-BZSNNMDCSA-N Lys-Leu-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O VUTWYNQUSJWBHO-BZSNNMDCSA-N 0.000 description 3
- RIJCHEVHFWMDKD-SRVKXCTJSA-N Lys-Lys-Asn Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O RIJCHEVHFWMDKD-SRVKXCTJSA-N 0.000 description 3
- PYFNONMJYNJENN-AVGNSLFASA-N Lys-Lys-Gln Chemical compound C(CCN)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N PYFNONMJYNJENN-AVGNSLFASA-N 0.000 description 3
- HVAUKHLDSDDROB-KKUMJFAQSA-N Lys-Lys-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O HVAUKHLDSDDROB-KKUMJFAQSA-N 0.000 description 3
- YDDDRTIPNTWGIG-SRVKXCTJSA-N Lys-Lys-Ser Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O YDDDRTIPNTWGIG-SRVKXCTJSA-N 0.000 description 3
- PLDJDCJLRCYPJB-VOAKCMCISA-N Lys-Lys-Thr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PLDJDCJLRCYPJB-VOAKCMCISA-N 0.000 description 3
- BXPHMHQHYHILBB-BZSNNMDCSA-N Lys-Lys-Tyr Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O BXPHMHQHYHILBB-BZSNNMDCSA-N 0.000 description 3
- ZJSZPXISKMDJKQ-JYJNAYRXSA-N Lys-Phe-Glu Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CCC(O)=O)C(O)=O)CC1=CC=CC=C1 ZJSZPXISKMDJKQ-JYJNAYRXSA-N 0.000 description 3
- JOSAKOKSPXROGQ-BJDJZHNGSA-N Lys-Ser-Ile Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O JOSAKOKSPXROGQ-BJDJZHNGSA-N 0.000 description 3
- IOQWIOPSKJOEKI-SRVKXCTJSA-N Lys-Ser-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O IOQWIOPSKJOEKI-SRVKXCTJSA-N 0.000 description 3
- SQXZLVXQXWILKW-KKUMJFAQSA-N Lys-Ser-Phe Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SQXZLVXQXWILKW-KKUMJFAQSA-N 0.000 description 3
- RPWTZTBIFGENIA-VOAKCMCISA-N Lys-Thr-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O RPWTZTBIFGENIA-VOAKCMCISA-N 0.000 description 3
- SQRLLZAQNOQCEG-KKUMJFAQSA-N Lys-Tyr-Ser Chemical compound NCCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CC=C(O)C=C1 SQRLLZAQNOQCEG-KKUMJFAQSA-N 0.000 description 3
- 241000829100 Macaca mulatta polyomavirus 1 Species 0.000 description 3
- BLIPQDLSCFGUFA-GUBZILKMSA-N Met-Arg-Asn Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O BLIPQDLSCFGUFA-GUBZILKMSA-N 0.000 description 3
- SODXFJOPSCXOHE-IHRRRGAJSA-N Met-Leu-Leu Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O SODXFJOPSCXOHE-IHRRRGAJSA-N 0.000 description 3
- LCPUWQLULVXROY-RHYQMDGZSA-N Met-Lys-Thr Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LCPUWQLULVXROY-RHYQMDGZSA-N 0.000 description 3
- QEDGNYFHLXXIDC-DCAQKATOSA-N Met-Pro-Gln Chemical compound CSCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O QEDGNYFHLXXIDC-DCAQKATOSA-N 0.000 description 3
- SMVTWPOATVIXTN-NAKRPEOUSA-N Met-Ser-Ile Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O SMVTWPOATVIXTN-NAKRPEOUSA-N 0.000 description 3
- GGXZOTSDJJTDGB-GUBZILKMSA-N Met-Ser-Val Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O GGXZOTSDJJTDGB-GUBZILKMSA-N 0.000 description 3
- AUEJLPRZGVVDNU-UHFFFAOYSA-N N-L-tyrosyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CC1=CC=C(O)C=C1 AUEJLPRZGVVDNU-UHFFFAOYSA-N 0.000 description 3
- AJHCSUXXECOXOY-UHFFFAOYSA-N N-glycyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)CN)C(O)=O)=CNC2=C1 AJHCSUXXECOXOY-UHFFFAOYSA-N 0.000 description 3
- 108010079364 N-glycylalanine Proteins 0.000 description 3
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 3
- 108010087066 N2-tryptophyllysine Proteins 0.000 description 3
- BQVUABVGYYSDCJ-UHFFFAOYSA-N Nalpha-L-Leucyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)CC(C)C)C(O)=O)=CNC2=C1 BQVUABVGYYSDCJ-UHFFFAOYSA-N 0.000 description 3
- BJEYSVHMGIJORT-NHCYSSNCSA-N Phe-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 BJEYSVHMGIJORT-NHCYSSNCSA-N 0.000 description 3
- AYPMIIKUMNADSU-IHRRRGAJSA-N Phe-Arg-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(N)=O)C(O)=O AYPMIIKUMNADSU-IHRRRGAJSA-N 0.000 description 3
- KIEPQOIQHFKQLK-PCBIJLKTSA-N Phe-Asn-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KIEPQOIQHFKQLK-PCBIJLKTSA-N 0.000 description 3
- IUVYJBMTHARMIP-PCBIJLKTSA-N Phe-Asp-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O IUVYJBMTHARMIP-PCBIJLKTSA-N 0.000 description 3
- WIVCOAKLPICYGY-KKUMJFAQSA-N Phe-Asp-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N WIVCOAKLPICYGY-KKUMJFAQSA-N 0.000 description 3
- OJUMUUXGSXUZJZ-SRVKXCTJSA-N Phe-Asp-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O OJUMUUXGSXUZJZ-SRVKXCTJSA-N 0.000 description 3
- RJYBHZVWJPUSLB-QEWYBTABSA-N Phe-Gln-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CC=CC=C1)N RJYBHZVWJPUSLB-QEWYBTABSA-N 0.000 description 3
- BFYHIHGIHGROAT-HTUGSXCWSA-N Phe-Glu-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BFYHIHGIHGROAT-HTUGSXCWSA-N 0.000 description 3
- JEBWZLWTRPZQRX-QWRGUYRKSA-N Phe-Gly-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O JEBWZLWTRPZQRX-QWRGUYRKSA-N 0.000 description 3
- NHCKESBLOMHIIE-IRXDYDNUSA-N Phe-Gly-Phe Chemical compound C([C@H](N)C(=O)NCC(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 NHCKESBLOMHIIE-IRXDYDNUSA-N 0.000 description 3
- KRYSMKKRRRWOCZ-QEWYBTABSA-N Phe-Ile-Glu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCC(O)=O)C(O)=O KRYSMKKRRRWOCZ-QEWYBTABSA-N 0.000 description 3
- WEMYTDDMDBLPMI-DKIMLUQUSA-N Phe-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)N WEMYTDDMDBLPMI-DKIMLUQUSA-N 0.000 description 3
- RMKGXGPQIPLTFC-KKUMJFAQSA-N Phe-Lys-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O RMKGXGPQIPLTFC-KKUMJFAQSA-N 0.000 description 3
- OQTDZEJJWWAGJT-KKUMJFAQSA-N Phe-Lys-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O OQTDZEJJWWAGJT-KKUMJFAQSA-N 0.000 description 3
- DOXQMJCSSYZSNM-BZSNNMDCSA-N Phe-Lys-Leu Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O DOXQMJCSSYZSNM-BZSNNMDCSA-N 0.000 description 3
- PEFJUUYFEGBXFA-BZSNNMDCSA-N Phe-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CC1=CC=CC=C1 PEFJUUYFEGBXFA-BZSNNMDCSA-N 0.000 description 3
- SCKXGHWQPPURGT-KKUMJFAQSA-N Phe-Lys-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O SCKXGHWQPPURGT-KKUMJFAQSA-N 0.000 description 3
- XZQYIJALMGEUJD-OEAJRASXSA-N Phe-Lys-Thr Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XZQYIJALMGEUJD-OEAJRASXSA-N 0.000 description 3
- CBENHWCORLVGEQ-HJOGWXRNSA-N Phe-Phe-Phe Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 CBENHWCORLVGEQ-HJOGWXRNSA-N 0.000 description 3
- AFNJAQVMTIQTCB-DLOVCJGASA-N Phe-Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=CC=C1 AFNJAQVMTIQTCB-DLOVCJGASA-N 0.000 description 3
- XDMMOISUAHXXFD-SRVKXCTJSA-N Phe-Ser-Asp Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O XDMMOISUAHXXFD-SRVKXCTJSA-N 0.000 description 3
- IPFXYNKCXYGSSV-KKUMJFAQSA-N Phe-Ser-Lys Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O)N IPFXYNKCXYGSSV-KKUMJFAQSA-N 0.000 description 3
- BQMFWUKNOCJDNV-HJWJTTGWSA-N Phe-Val-Ile Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O BQMFWUKNOCJDNV-HJWJTTGWSA-N 0.000 description 3
- SXMSEHDMNIUTSP-DCAQKATOSA-N Pro-Lys-Asn Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O SXMSEHDMNIUTSP-DCAQKATOSA-N 0.000 description 3
- SMFQZMGHCODUPQ-ULQDDVLXSA-N Pro-Lys-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O SMFQZMGHCODUPQ-ULQDDVLXSA-N 0.000 description 3
- WCNVGGZRTNHOOS-ULQDDVLXSA-N Pro-Lys-Tyr Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC2=CC=C(C=C2)O)C(=O)O WCNVGGZRTNHOOS-ULQDDVLXSA-N 0.000 description 3
- SXJOPONICMGFCR-DCAQKATOSA-N Pro-Ser-Lys Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O SXJOPONICMGFCR-DCAQKATOSA-N 0.000 description 3
- RMJZWERKFFNNNS-XGEHTFHBSA-N Pro-Thr-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O RMJZWERKFFNNNS-XGEHTFHBSA-N 0.000 description 3
- VEUACYMXJKXALX-IHRRRGAJSA-N Pro-Tyr-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(O)=O VEUACYMXJKXALX-IHRRRGAJSA-N 0.000 description 3
- XDKKMRPRRCOELJ-GUBZILKMSA-N Pro-Val-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C(C)C)NC(=O)[C@@H]1CCCN1 XDKKMRPRRCOELJ-GUBZILKMSA-N 0.000 description 3
- 108010003201 RGH 0205 Proteins 0.000 description 3
- 108020004511 Recombinant DNA Proteins 0.000 description 3
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 3
- WTUJZHKANPDPIN-CIUDSAMLSA-N Ser-Ala-Lys Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CO)N WTUJZHKANPDPIN-CIUDSAMLSA-N 0.000 description 3
- QPFJSHSJFIYDJZ-GHCJXIJMSA-N Ser-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CO QPFJSHSJFIYDJZ-GHCJXIJMSA-N 0.000 description 3
- BYIROAKULFFTEK-CIUDSAMLSA-N Ser-Asp-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CO BYIROAKULFFTEK-CIUDSAMLSA-N 0.000 description 3
- BTPAWKABYQMKKN-LKXGYXEUSA-N Ser-Asp-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BTPAWKABYQMKKN-LKXGYXEUSA-N 0.000 description 3
- SWSRFJZZMNLMLY-ZKWXMUAHSA-N Ser-Asp-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O SWSRFJZZMNLMLY-ZKWXMUAHSA-N 0.000 description 3
- IXUGADGDCQDLSA-FXQIFTODSA-N Ser-Gln-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CO)N IXUGADGDCQDLSA-FXQIFTODSA-N 0.000 description 3
- FMDHKPRACUXATF-ACZMJKKPSA-N Ser-Gln-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O FMDHKPRACUXATF-ACZMJKKPSA-N 0.000 description 3
- QKQDTEYDEIJPNK-GUBZILKMSA-N Ser-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CO QKQDTEYDEIJPNK-GUBZILKMSA-N 0.000 description 3
- VQBCMLMPEWPUTB-ACZMJKKPSA-N Ser-Glu-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O VQBCMLMPEWPUTB-ACZMJKKPSA-N 0.000 description 3
- UQFYNFTYDHUIMI-WHFBIAKZSA-N Ser-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CO UQFYNFTYDHUIMI-WHFBIAKZSA-N 0.000 description 3
- IOVHBRCQOGWAQH-ZKWXMUAHSA-N Ser-Gly-Ile Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H]([C@@H](C)CC)C(O)=O IOVHBRCQOGWAQH-ZKWXMUAHSA-N 0.000 description 3
- JIPVNVNKXJLFJF-BJDJZHNGSA-N Ser-Ile-Lys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CO)N JIPVNVNKXJLFJF-BJDJZHNGSA-N 0.000 description 3
- VMLONWHIORGALA-SRVKXCTJSA-N Ser-Leu-Leu Chemical compound CC(C)C[C@@H](C([O-])=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]([NH3+])CO VMLONWHIORGALA-SRVKXCTJSA-N 0.000 description 3
- PPNPDKGQRFSCAC-CIUDSAMLSA-N Ser-Lys-Asp Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)CO)C(=O)N[C@@H](CC(O)=O)C(O)=O PPNPDKGQRFSCAC-CIUDSAMLSA-N 0.000 description 3
- FPCGZYMRFFIYIH-CIUDSAMLSA-N Ser-Lys-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O FPCGZYMRFFIYIH-CIUDSAMLSA-N 0.000 description 3
- LRZLZIUXQBIWTB-KATARQTJSA-N Ser-Lys-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LRZLZIUXQBIWTB-KATARQTJSA-N 0.000 description 3
- UGTZYIPOBYXWRW-SRVKXCTJSA-N Ser-Phe-Asp Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O UGTZYIPOBYXWRW-SRVKXCTJSA-N 0.000 description 3
- WNDUPCKKKGSKIQ-CIUDSAMLSA-N Ser-Pro-Gln Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O WNDUPCKKKGSKIQ-CIUDSAMLSA-N 0.000 description 3
- NADLKBTYNKUJEP-KATARQTJSA-N Ser-Thr-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O NADLKBTYNKUJEP-KATARQTJSA-N 0.000 description 3
- SGZVZUCRAVSPKQ-FXQIFTODSA-N Ser-Val-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CO)N SGZVZUCRAVSPKQ-FXQIFTODSA-N 0.000 description 3
- HSWXBJCBYSWBPT-GUBZILKMSA-N Ser-Val-Val Chemical compound CC(C)[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)CO)C(C)C)C(O)=O HSWXBJCBYSWBPT-GUBZILKMSA-N 0.000 description 3
- LVHHEVGYAZGXDE-KDXUFGMBSA-N Thr-Ala-Pro Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)N1CCC[C@@H]1C(=O)O)N)O LVHHEVGYAZGXDE-KDXUFGMBSA-N 0.000 description 3
- VIBXMCZWVUOZLA-OLHMAJIHSA-N Thr-Asn-Asn Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N)O VIBXMCZWVUOZLA-OLHMAJIHSA-N 0.000 description 3
- QILPDQCTQZDHFM-HJGDQZAQSA-N Thr-Gln-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QILPDQCTQZDHFM-HJGDQZAQSA-N 0.000 description 3
- LAFLAXHTDVNVEL-WDCWCFNPSA-N Thr-Gln-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O LAFLAXHTDVNVEL-WDCWCFNPSA-N 0.000 description 3
- SXAGUVRFGJSFKC-ZEILLAHLSA-N Thr-His-Thr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O SXAGUVRFGJSFKC-ZEILLAHLSA-N 0.000 description 3
- PAXANSWUSVPFNK-IUKAMOBKSA-N Thr-Ile-Asn Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H]([C@@H](C)O)N PAXANSWUSVPFNK-IUKAMOBKSA-N 0.000 description 3
- VTVVYQOXJCZVEB-WDCWCFNPSA-N Thr-Leu-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O VTVVYQOXJCZVEB-WDCWCFNPSA-N 0.000 description 3
- IJVNLNRVDUTWDD-MEYUZBJRSA-N Thr-Leu-Tyr Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O IJVNLNRVDUTWDD-MEYUZBJRSA-N 0.000 description 3
- TZJSEJOXAIWOST-RHYQMDGZSA-N Thr-Lys-Arg Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CCCN=C(N)N TZJSEJOXAIWOST-RHYQMDGZSA-N 0.000 description 3
- QNCFWHZVRNXAKW-OEAJRASXSA-N Thr-Lys-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O QNCFWHZVRNXAKW-OEAJRASXSA-N 0.000 description 3
- XHWCDRUPDNSDAZ-XKBZYTNZSA-N Thr-Ser-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N)O XHWCDRUPDNSDAZ-XKBZYTNZSA-N 0.000 description 3
- IQPWNQRRAJHOKV-KATARQTJSA-N Thr-Ser-Lys Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCCN IQPWNQRRAJHOKV-KATARQTJSA-N 0.000 description 3
- IEZVHOULSUULHD-XGEHTFHBSA-N Thr-Ser-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O IEZVHOULSUULHD-XGEHTFHBSA-N 0.000 description 3
- XVHAUVJXBFGUPC-RPTUDFQQSA-N Thr-Tyr-Phe Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O XVHAUVJXBFGUPC-RPTUDFQQSA-N 0.000 description 3
- 108010043645 Transcription Activator-Like Effector Nucleases Proteins 0.000 description 3
- 235000021307 Triticum Nutrition 0.000 description 3
- 244000098338 Triticum aestivum Species 0.000 description 3
- NKUIXQOJUAEIET-AQZXSJQPSA-N Trp-Asp-Thr Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@H](O)C)C(O)=O)=CNC2=C1 NKUIXQOJUAEIET-AQZXSJQPSA-N 0.000 description 3
- RRVUOLRWIZXBRQ-IHPCNDPISA-N Trp-Leu-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N RRVUOLRWIZXBRQ-IHPCNDPISA-N 0.000 description 3
- NLLARHRWSFNEMH-NUTKFTJISA-N Trp-Lys-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N NLLARHRWSFNEMH-NUTKFTJISA-N 0.000 description 3
- KRCPXGSWDOGHAM-XIRDDKMYSA-N Trp-Lys-Asp Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O KRCPXGSWDOGHAM-XIRDDKMYSA-N 0.000 description 3
- UUIYFDAWNBSWPG-IHPCNDPISA-N Trp-Lys-Lys Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)O)N UUIYFDAWNBSWPG-IHPCNDPISA-N 0.000 description 3
- WBZOZLNLXVBCNW-LTHWPDAASA-N Trp-Thr-Ile Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)CC)C(O)=O)[C@@H](C)O)=CNC2=C1 WBZOZLNLXVBCNW-LTHWPDAASA-N 0.000 description 3
- HTHCZRWCFXMENJ-KKUMJFAQSA-N Tyr-Arg-Glu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O HTHCZRWCFXMENJ-KKUMJFAQSA-N 0.000 description 3
- AYPAIRCDLARHLM-KKUMJFAQSA-N Tyr-Asn-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCCCN)C(=O)O)N)O AYPAIRCDLARHLM-KKUMJFAQSA-N 0.000 description 3
- SCCKSNREWHMKOJ-SRVKXCTJSA-N Tyr-Asn-Ser Chemical compound N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O SCCKSNREWHMKOJ-SRVKXCTJSA-N 0.000 description 3
- IYHNBRUWVBIVJR-IHRRRGAJSA-N Tyr-Gln-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 IYHNBRUWVBIVJR-IHRRRGAJSA-N 0.000 description 3
- TWAVEIJGFCBWCG-JYJNAYRXSA-N Tyr-Gln-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CC=C(C=C1)O)N TWAVEIJGFCBWCG-JYJNAYRXSA-N 0.000 description 3
- ZRPLVTZTKPPSBT-AVGNSLFASA-N Tyr-Glu-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O ZRPLVTZTKPPSBT-AVGNSLFASA-N 0.000 description 3
- KIJLSRYAUGGZIN-CFMVVWHZSA-N Tyr-Ile-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O KIJLSRYAUGGZIN-CFMVVWHZSA-N 0.000 description 3
- BXPOOVDVGWEXDU-WZLNRYEVSA-N Tyr-Ile-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(O)=O BXPOOVDVGWEXDU-WZLNRYEVSA-N 0.000 description 3
- DWAMXBFJNZIHMC-KBPBESRZSA-N Tyr-Leu-Gly Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O DWAMXBFJNZIHMC-KBPBESRZSA-N 0.000 description 3
- GITNQBVCEQBDQC-KKUMJFAQSA-N Tyr-Lys-Asn Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O GITNQBVCEQBDQC-KKUMJFAQSA-N 0.000 description 3
- ZOBLBMGJKVJVEV-BZSNNMDCSA-N Tyr-Lys-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)O)N)O ZOBLBMGJKVJVEV-BZSNNMDCSA-N 0.000 description 3
- XGZBEGGGAUQBMB-KJEVXHAQSA-N Tyr-Pro-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC2=CC=C(C=C2)O)N)O XGZBEGGGAUQBMB-KJEVXHAQSA-N 0.000 description 3
- QPOUERMDWKKZEG-HJPIBITLSA-N Tyr-Ser-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 QPOUERMDWKKZEG-HJPIBITLSA-N 0.000 description 3
- TYFLVOUZHQUBGM-IHRRRGAJSA-N Tyr-Ser-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 TYFLVOUZHQUBGM-IHRRRGAJSA-N 0.000 description 3
- PWKMJDQXKCENMF-MEYUZBJRSA-N Tyr-Thr-Leu Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O PWKMJDQXKCENMF-MEYUZBJRSA-N 0.000 description 3
- OBKOPLHSRDATFO-XHSDSOJGSA-N Tyr-Val-Pro Chemical compound CC(C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N OBKOPLHSRDATFO-XHSDSOJGSA-N 0.000 description 3
- GXAZTLJYINLMJL-LAEOZQHASA-N Val-Asn-Gln Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CCC(=O)N)C(=O)O)N GXAZTLJYINLMJL-LAEOZQHASA-N 0.000 description 3
- OUUBKKIJQIAPRI-LAEOZQHASA-N Val-Gln-Asn Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O OUUBKKIJQIAPRI-LAEOZQHASA-N 0.000 description 3
- CVIXTAITYJQMPE-LAEOZQHASA-N Val-Glu-Asn Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O CVIXTAITYJQMPE-LAEOZQHASA-N 0.000 description 3
- SZTTYWIUCGSURQ-AUTRQRHGSA-N Val-Glu-Glu Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O SZTTYWIUCGSURQ-AUTRQRHGSA-N 0.000 description 3
- ZXAGTABZUOMUDO-GVXVVHGQSA-N Val-Glu-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)O)N ZXAGTABZUOMUDO-GVXVVHGQSA-N 0.000 description 3
- LYERIXUFCYVFFX-GVXVVHGQSA-N Val-Leu-Glu Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N LYERIXUFCYVFFX-GVXVVHGQSA-N 0.000 description 3
- SVFRYKBZHUGKLP-QXEWZRGKSA-N Val-Met-Asn Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(=O)N)C(=O)O)N SVFRYKBZHUGKLP-QXEWZRGKSA-N 0.000 description 3
- YLRAFVVWZRSZQC-DZKIICNBSA-N Val-Phe-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N YLRAFVVWZRSZQC-DZKIICNBSA-N 0.000 description 3
- YQYFYUSYEDNLSD-YEPSODPASA-N Val-Thr-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O YQYFYUSYEDNLSD-YEPSODPASA-N 0.000 description 3
- JAIZPWVHPQRYOU-ZJDVBMNYSA-N Val-Thr-Thr Chemical compound C[C@H]([C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)O)NC(=O)[C@H](C(C)C)N)O JAIZPWVHPQRYOU-ZJDVBMNYSA-N 0.000 description 3
- RLVTVHSDKHBFQP-ULQDDVLXSA-N Val-Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C(C)C)CC1=CC=C(O)C=C1 RLVTVHSDKHBFQP-ULQDDVLXSA-N 0.000 description 3
- IECQJCJNPJVUSB-IHRRRGAJSA-N Val-Tyr-Ser Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CO)C(O)=O IECQJCJNPJVUSB-IHRRRGAJSA-N 0.000 description 3
- 241000700605 Viruses Species 0.000 description 3
- HCHKCACWOHOZIP-UHFFFAOYSA-N Zinc Chemical compound [Zn] HCHKCACWOHOZIP-UHFFFAOYSA-N 0.000 description 3
- 108010005233 alanylglutamic acid Proteins 0.000 description 3
- 108010087924 alanylproline Proteins 0.000 description 3
- 108010050025 alpha-glutamyltryptophan Proteins 0.000 description 3
- 108010009111 arginyl-glycyl-glutamic acid Proteins 0.000 description 3
- 108010062796 arginyllysine Proteins 0.000 description 3
- 108010068265 aspartyltyrosine Proteins 0.000 description 3
- 239000007795 chemical reaction product Substances 0.000 description 3
- 239000003795 chemical substances by application Substances 0.000 description 3
- 238000005520 cutting process Methods 0.000 description 3
- 238000001514 detection method Methods 0.000 description 3
- FSXRLASFHBWESK-UHFFFAOYSA-N dipeptide phenylalanyl-tyrosine Natural products C=1C=C(O)C=CC=1CC(C(O)=O)NC(=O)C(N)CC1=CC=CC=C1 FSXRLASFHBWESK-UHFFFAOYSA-N 0.000 description 3
- 238000002474 experimental method Methods 0.000 description 3
- 229940118764 francisella tularensis Drugs 0.000 description 3
- 108010078144 glutaminyl-glycine Proteins 0.000 description 3
- 108010057083 glutamyl-aspartyl-leucine Proteins 0.000 description 3
- 108010049041 glutamylalanine Proteins 0.000 description 3
- 230000013595 glycosylation Effects 0.000 description 3
- 238000006206 glycosylation reaction Methods 0.000 description 3
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 3
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 3
- 108010033719 glycyl-histidyl-glycine Proteins 0.000 description 3
- 108010059898 glycyl-tyrosyl-lysine Proteins 0.000 description 3
- 108010015792 glycyllysine Proteins 0.000 description 3
- 108010084389 glycyltryptophan Proteins 0.000 description 3
- 108010087823 glycyltyrosine Proteins 0.000 description 3
- 108010025306 histidylleucine Proteins 0.000 description 3
- 108010091871 leucylmethionine Proteins 0.000 description 3
- 239000007788 liquid Substances 0.000 description 3
- 108010025153 lysyl-alanyl-alanine Proteins 0.000 description 3
- 108010045397 lysyl-tyrosyl-lysine Proteins 0.000 description 3
- 108010017391 lysylvaline Proteins 0.000 description 3
- 108010012581 phenylalanylglutamate Proteins 0.000 description 3
- 108010051242 phenylalanylserine Proteins 0.000 description 3
- 108010053725 prolylvaline Proteins 0.000 description 3
- 238000003259 recombinant expression Methods 0.000 description 3
- 238000012163 sequencing technique Methods 0.000 description 3
- 108010069117 seryl-lysyl-aspartic acid Proteins 0.000 description 3
- 108010026333 seryl-proline Proteins 0.000 description 3
- 239000006228 supernatant Substances 0.000 description 3
- 108010031491 threonyl-lysyl-glutamic acid Proteins 0.000 description 3
- 108010072986 threonyl-seryl-lysine Proteins 0.000 description 3
- 108010020532 tyrosyl-proline Proteins 0.000 description 3
- 108010003137 tyrosyltyrosine Proteins 0.000 description 3
- 241000701161 unidentified adenovirus Species 0.000 description 3
- 241001430294 unidentified retrovirus Species 0.000 description 3
- 239000013603 viral vector Substances 0.000 description 3
- 239000011701 zinc Substances 0.000 description 3
- 229910052725 zinc Inorganic materials 0.000 description 3
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 2
- QHASENCZLDHBGX-ONGXEEELSA-N Ala-Gly-Phe Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 QHASENCZLDHBGX-ONGXEEELSA-N 0.000 description 2
- 241001408449 Asca Species 0.000 description 2
- DCXYFEDJOCDNAF-UHFFFAOYSA-N Asparagine Natural products OC(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-N 0.000 description 2
- 241000181825 Bacteroidetes oral taxon 274 Species 0.000 description 2
- 241000283690 Bos taurus Species 0.000 description 2
- 235000011299 Brassica oleracea var botrytis Nutrition 0.000 description 2
- 240000003259 Brassica oleracea var. botrytis Species 0.000 description 2
- 241001478240 Coccus Species 0.000 description 2
- 108020003215 DNA Probes Proteins 0.000 description 2
- 241000238557 Decapoda Species 0.000 description 2
- 108010042407 Endonucleases Proteins 0.000 description 2
- 102000004533 Endonucleases Human genes 0.000 description 2
- ZHNUHDYFZUAESO-UHFFFAOYSA-N Formamide Chemical compound NC=O ZHNUHDYFZUAESO-UHFFFAOYSA-N 0.000 description 2
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Natural products OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 description 2
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 2
- 108010043121 Green Fluorescent Proteins Proteins 0.000 description 2
- 102000004144 Green Fluorescent Proteins Human genes 0.000 description 2
- 241001430278 Helcococcus Species 0.000 description 2
- 241000238631 Hexapoda Species 0.000 description 2
- 235000014663 Kluyveromyces fragilis Nutrition 0.000 description 2
- 241001138401 Kluyveromyces lactis Species 0.000 description 2
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 2
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 2
- 241000689677 Lachnospiraceae bacterium NC2008 Species 0.000 description 2
- 241000689670 Lachnospiraceae bacterium ND2006 Species 0.000 description 2
- 240000008415 Lactuca sativa Species 0.000 description 2
- JACAKCWAOHKQBV-UWVGGRQHSA-N Met-Gly-Lys Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN JACAKCWAOHKQBV-UWVGGRQHSA-N 0.000 description 2
- 241000204031 Mycoplasma Species 0.000 description 2
- PXHVJJICTQNCMI-UHFFFAOYSA-N Nickel Chemical compound [Ni] PXHVJJICTQNCMI-UHFFFAOYSA-N 0.000 description 2
- 241001365624 Oribacterium sp. NK2B42 Species 0.000 description 2
- 241000083652 Osca Species 0.000 description 2
- 241000235648 Pichia Species 0.000 description 2
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 2
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 2
- 244000253911 Saccharomyces fragilis Species 0.000 description 2
- 235000018368 Saccharomyces fragilis Nutrition 0.000 description 2
- 244000300264 Spinacia oleracea Species 0.000 description 2
- 241000589886 Treponema Species 0.000 description 2
- 208000034784 Tularaemia Diseases 0.000 description 2
- 244000078534 Vaccinium myrtillus Species 0.000 description 2
- 108010017070 Zinc Finger Nucleases Proteins 0.000 description 2
- 238000000246 agarose gel electrophoresis Methods 0.000 description 2
- 230000003321 amplification Effects 0.000 description 2
- 235000009582 asparagine Nutrition 0.000 description 2
- 229960001230 asparagine Drugs 0.000 description 2
- 235000003704 aspartic acid Nutrition 0.000 description 2
- 238000003556 assay Methods 0.000 description 2
- 230000008901 benefit Effects 0.000 description 2
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 description 2
- 238000005119 centrifugation Methods 0.000 description 2
- 238000007385 chemical modification Methods 0.000 description 2
- 150000001875 compounds Chemical class 0.000 description 2
- 238000010276 construction Methods 0.000 description 2
- 230000009089 cytolysis Effects 0.000 description 2
- 230000003247 decreasing effect Effects 0.000 description 2
- 238000013461 design Methods 0.000 description 2
- 238000010586 diagram Methods 0.000 description 2
- 201000010099 disease Diseases 0.000 description 2
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 2
- 238000001962 electrophoresis Methods 0.000 description 2
- 238000004520 electroporation Methods 0.000 description 2
- 230000004927 fusion Effects 0.000 description 2
- 210000004602 germ cell Anatomy 0.000 description 2
- 235000013922 glutamic acid Nutrition 0.000 description 2
- 239000004220 glutamic acid Substances 0.000 description 2
- ZDXPYRJPNDTMRX-UHFFFAOYSA-N glutamine Natural products OC(=O)C(N)CCC(N)=O ZDXPYRJPNDTMRX-UHFFFAOYSA-N 0.000 description 2
- 239000005090 green fluorescent protein Substances 0.000 description 2
- 238000004128 high performance liquid chromatography Methods 0.000 description 2
- 239000012535 impurity Substances 0.000 description 2
- 238000001727 in vivo Methods 0.000 description 2
- 238000003780 insertion Methods 0.000 description 2
- 230000037431 insertion Effects 0.000 description 2
- 229940031154 kluyveromyces marxianus Drugs 0.000 description 2
- 238000004519 manufacturing process Methods 0.000 description 2
- 239000000463 material Substances 0.000 description 2
- 238000010369 molecular cloning Methods 0.000 description 2
- 238000003199 nucleic acid amplification method Methods 0.000 description 2
- 239000006186 oral dosage form Substances 0.000 description 2
- YBYRMVIVWMBXKQ-UHFFFAOYSA-N phenylmethanesulfonyl fluoride Chemical compound FS(=O)(=O)CC1=CC=CC=C1 YBYRMVIVWMBXKQ-UHFFFAOYSA-N 0.000 description 2
- 239000002244 precipitate Substances 0.000 description 2
- 230000008569 process Effects 0.000 description 2
- 238000012545 processing Methods 0.000 description 2
- 108020001580 protein domains Proteins 0.000 description 2
- 238000000746 purification Methods 0.000 description 2
- 238000010791 quenching Methods 0.000 description 2
- 230000000171 quenching effect Effects 0.000 description 2
- 230000035484 reaction time Effects 0.000 description 2
- 238000003753 real-time PCR Methods 0.000 description 2
- 230000003362 replicative effect Effects 0.000 description 2
- 238000012216 screening Methods 0.000 description 2
- 230000003248 secreting effect Effects 0.000 description 2
- 239000000243 solution Substances 0.000 description 2
- ATHGHQPFGPMSJY-UHFFFAOYSA-N spermidine Chemical compound NCCCCNCCCN ATHGHQPFGPMSJY-UHFFFAOYSA-N 0.000 description 2
- 239000000126 substance Substances 0.000 description 2
- 239000000758 substrate Substances 0.000 description 2
- 230000009466 transformation Effects 0.000 description 2
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 2
- 241000186046 Actinomyces Species 0.000 description 1
- 241000251468 Actinopterygii Species 0.000 description 1
- WGDNWOMKBUXFHR-BQBZGAKWSA-N Ala-Gly-Arg Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N WGDNWOMKBUXFHR-BQBZGAKWSA-N 0.000 description 1
- 235000006667 Aleurites moluccana Nutrition 0.000 description 1
- 241000187643 Amycolatopsis Species 0.000 description 1
- 244000144730 Amygdalus persica Species 0.000 description 1
- 244000003416 Asparagus officinalis Species 0.000 description 1
- 235000005340 Asparagus officinalis Nutrition 0.000 description 1
- 241000972773 Aulopiformes Species 0.000 description 1
- 244000075850 Avena orientalis Species 0.000 description 1
- 235000007319 Avena orientalis Nutrition 0.000 description 1
- 235000007558 Avena sp Nutrition 0.000 description 1
- 241000271566 Aves Species 0.000 description 1
- 241000589941 Azospirillum Species 0.000 description 1
- 241000606125 Bacteroides Species 0.000 description 1
- 241000237519 Bivalvia Species 0.000 description 1
- 238000009010 Bradford assay Methods 0.000 description 1
- 241000219198 Brassica Species 0.000 description 1
- 240000007124 Brassica oleracea Species 0.000 description 1
- 235000003899 Brassica oleracea var acephala Nutrition 0.000 description 1
- 235000011301 Brassica oleracea var capitata Nutrition 0.000 description 1
- 235000017647 Brassica oleracea var italica Nutrition 0.000 description 1
- 235000001169 Brassica oleracea var oleracea Nutrition 0.000 description 1
- 238000010453 CRISPR/Cas method Methods 0.000 description 1
- 101100112111 Caenorhabditis elegans cand-1 gene Proteins 0.000 description 1
- 241000589876 Campylobacter Species 0.000 description 1
- 241000282465 Canis Species 0.000 description 1
- 235000002566 Capsicum Nutrition 0.000 description 1
- 240000008574 Capsicum frutescens Species 0.000 description 1
- 241000282693 Cercopithecidae Species 0.000 description 1
- 108091062157 Cis-regulatory element Proteins 0.000 description 1
- 241000207199 Citrus Species 0.000 description 1
- 235000005979 Citrus limon Nutrition 0.000 description 1
- 244000248349 Citrus limon Species 0.000 description 1
- 240000000560 Citrus x paradisi Species 0.000 description 1
- 240000007154 Coffea arabica Species 0.000 description 1
- 229920000742 Cotton Polymers 0.000 description 1
- 241000238424 Crustacea Species 0.000 description 1
- 241000195493 Cryptophyta Species 0.000 description 1
- 230000006820 DNA synthesis Effects 0.000 description 1
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 1
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 1
- 235000002767 Daucus carota Nutrition 0.000 description 1
- 244000000626 Daucus carota Species 0.000 description 1
- 241000702421 Dependoparvovirus Species 0.000 description 1
- 241000701959 Escherichia virus Lambda Species 0.000 description 1
- 229920001917 Ficoll Polymers 0.000 description 1
- 241000589565 Flavobacterium Species 0.000 description 1
- 235000016623 Fragaria vesca Nutrition 0.000 description 1
- 240000009088 Fragaria x ananassa Species 0.000 description 1
- 235000011363 Fragaria x ananassa Nutrition 0.000 description 1
- 241000287828 Gallus gallus Species 0.000 description 1
- 108700028146 Genetic Enhancer Elements Proteins 0.000 description 1
- 241000032681 Gluconacetobacter Species 0.000 description 1
- 239000004471 Glycine Substances 0.000 description 1
- 235000010469 Glycine max Nutrition 0.000 description 1
- 244000068988 Glycine max Species 0.000 description 1
- 241000219146 Gossypium Species 0.000 description 1
- 101100412102 Haemophilus influenzae (strain ATCC 51907 / DSM 11121 / KW20 / Rd) rec2 gene Proteins 0.000 description 1
- 101100356020 Haemophilus influenzae (strain ATCC 51907 / DSM 11121 / KW20 / Rd) recA gene Proteins 0.000 description 1
- 235000009496 Juglans regia Nutrition 0.000 description 1
- 240000007049 Juglans regia Species 0.000 description 1
- QIVBCDIJIAJPQS-VIFPVBQESA-N L-tryptophane Chemical compound C1=CC=C2C(C[C@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-VIFPVBQESA-N 0.000 description 1
- 241000186660 Lactobacillus Species 0.000 description 1
- 235000003228 Lactuca sativa Nutrition 0.000 description 1
- 241000283953 Lagomorpha Species 0.000 description 1
- 241000589248 Legionella Species 0.000 description 1
- 208000007764 Legionnaires' Disease Diseases 0.000 description 1
- 241000713666 Lentivirus Species 0.000 description 1
- 241000209510 Liliopsida Species 0.000 description 1
- 241000186781 Listeria Species 0.000 description 1
- 235000007688 Lycopersicon esculentum Nutrition 0.000 description 1
- 244000081841 Malus domestica Species 0.000 description 1
- 235000011430 Malus pumila Nutrition 0.000 description 1
- 235000015103 Malus silvestris Nutrition 0.000 description 1
- 240000003183 Manihot esculenta Species 0.000 description 1
- 235000016735 Manihot esculenta subsp esculenta Nutrition 0.000 description 1
- FYRUJIJAUPHUNB-IUCAKERBSA-N Met-Gly-Arg Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCNC(N)=N FYRUJIJAUPHUNB-IUCAKERBSA-N 0.000 description 1
- 241001465754 Metazoa Species 0.000 description 1
- 241000699666 Mus <mouse, genus> Species 0.000 description 1
- 101100042680 Mus musculus Slc7a1 gene Proteins 0.000 description 1
- 241000588653 Neisseria Species 0.000 description 1
- 101710118186 Neomycin resistance protein Proteins 0.000 description 1
- 235000002637 Nicotiana tabacum Nutrition 0.000 description 1
- 244000061176 Nicotiana tabacum Species 0.000 description 1
- BZQFBWGGLXLEPQ-UHFFFAOYSA-N O-phosphoryl-L-serine Natural products OC(=O)C(N)COP(O)(O)=O BZQFBWGGLXLEPQ-UHFFFAOYSA-N 0.000 description 1
- 108700026244 Open Reading Frames Proteins 0.000 description 1
- 241000283973 Oryctolagus cuniculus Species 0.000 description 1
- 240000007594 Oryza sativa Species 0.000 description 1
- 235000007164 Oryza sativa Nutrition 0.000 description 1
- 241000237502 Ostreidae Species 0.000 description 1
- 235000016496 Panda oleosa Nutrition 0.000 description 1
- 240000000220 Panda oleosa Species 0.000 description 1
- 241001494479 Pecora Species 0.000 description 1
- 239000006002 Pepper Substances 0.000 description 1
- 241000009328 Perro Species 0.000 description 1
- 235000016761 Piper aduncum Nutrition 0.000 description 1
- 240000003889 Piper guineense Species 0.000 description 1
- 235000017804 Piper guineense Nutrition 0.000 description 1
- 235000008184 Piper nigrum Nutrition 0.000 description 1
- 235000003447 Pistacia vera Nutrition 0.000 description 1
- 240000006711 Pistacia vera Species 0.000 description 1
- 229920002538 Polyethylene Glycol 20000 Polymers 0.000 description 1
- 239000002202 Polyethylene glycol Substances 0.000 description 1
- 241000288906 Primates Species 0.000 description 1
- 108010076504 Protein Sorting Signals Proteins 0.000 description 1
- 235000009827 Prunus armeniaca Nutrition 0.000 description 1
- 244000018633 Prunus armeniaca Species 0.000 description 1
- 235000006029 Prunus persica var nucipersica Nutrition 0.000 description 1
- 235000006040 Prunus persica var persica Nutrition 0.000 description 1
- 244000017714 Prunus persica var. nucipersica Species 0.000 description 1
- 235000014443 Pyrus communis Nutrition 0.000 description 1
- 240000001987 Pyrus communis Species 0.000 description 1
- 241000700159 Rattus Species 0.000 description 1
- 102000006382 Ribonucleases Human genes 0.000 description 1
- 108010083644 Ribonucleases Proteins 0.000 description 1
- 241000283984 Rodentia Species 0.000 description 1
- 235000017848 Rubus fruticosus Nutrition 0.000 description 1
- 240000007651 Rubus glaucus Species 0.000 description 1
- 235000011034 Rubus glaucus Nutrition 0.000 description 1
- 235000009122 Rubus idaeus Nutrition 0.000 description 1
- 241000207763 Solanum Species 0.000 description 1
- 240000003768 Solanum lycopersicum Species 0.000 description 1
- 235000002597 Solanum melongena Nutrition 0.000 description 1
- 244000061458 Solanum melongena Species 0.000 description 1
- 240000003829 Sorghum propinquum Species 0.000 description 1
- 235000011684 Sorghum saccharatum Nutrition 0.000 description 1
- 241000949716 Sphaerochaeta Species 0.000 description 1
- 235000009337 Spinacia oleracea Nutrition 0.000 description 1
- 241000191940 Staphylococcus Species 0.000 description 1
Classifications
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K31/00—Medicinal preparations containing organic active ingredients
- A61K31/70—Carbohydrates; Sugars; Derivatives thereof
- A61K31/7088—Compounds having three or more nucleosides or nucleotides
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K38/00—Medicinal preparations containing peptides
- A61K38/16—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- A61K38/43—Enzymes; Proenzymes; Derivatives thereof
- A61K38/46—Hydrolases (3)
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K38/00—Medicinal preparations containing peptides
- A61K38/16—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- A61K38/43—Enzymes; Proenzymes; Derivatives thereof
- A61K38/46—Hydrolases (3)
- A61K38/465—Hydrolases (3) acting on ester bonds (3.1), e.g. lipases, ribonucleases
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K48/00—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K48/00—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
- A61K48/005—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'active' part of the composition delivered, i.e. the nucleic acid delivered
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/195—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/113—Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex- forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/70—Vectors or expression systems specially adapted for E. coli
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/74—Vectors or expression systems specially adapted for prokaryotic hosts other than E. coli, e.g. Lactobacillus, Micromonospora
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases RNAses, DNAses
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/10—Type of nucleic acid
- C12N2310/20—Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12R—INDEXING SCHEME ASSOCIATED WITH SUBCLASSES C12C - C12Q, RELATING TO MICROORGANISMS
- C12R2001/00—Microorganisms ; Processes using microorganisms
- C12R2001/01—Bacteria or Actinomycetales ; using bacteria or Actinomycetales
- C12R2001/185—Escherichia
- C12R2001/19—Escherichia coli
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Genetics & Genomics (AREA)
- Engineering & Computer Science (AREA)
- Chemical & Material Sciences (AREA)
- Organic Chemistry (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Biomedical Technology (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Molecular Biology (AREA)
- Biotechnology (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Biochemistry (AREA)
- Microbiology (AREA)
- Medicinal Chemistry (AREA)
- Biophysics (AREA)
- Physics & Mathematics (AREA)
- Plant Pathology (AREA)
- Animal Behavior & Ethology (AREA)
- Veterinary Medicine (AREA)
- Public Health (AREA)
- Epidemiology (AREA)
- Pharmacology & Pharmacy (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Gastroenterology & Hepatology (AREA)
- Immunology (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
Abstract
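The present invention relates to gene editing protein variants capable of reducing the off-target rate of gene editing. Specifically, the invention provides a gene editing protein variant that is a non-natural protein having cis-cleavage activity, whose trans-cleavage activity is reduced relative to its wild-type gene editing protein, and which is mutated at one or more core amino acid sites of the wild-type gene editing protein associated with cleavage activity, selected from the group consisting of: the phenylalanine (F) site corresponding to position 1081 of FnCas12a; and/or the lysine (K) site corresponding to position 1069 of FnCas12a. The invention is the first to find that such variants can have cis-cleavage activity together with reduced trans-cleavage activity, and that the variants, and gene editing systems containing them, can significantly reduce the off-target rate of gene editing.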
Description
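Technical Field
The present invention relates to the field of biotechnology, and in particular to gene editing protein variants capable of reducing the off-target rate of gene editing.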
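Background Art
Gene editing refers to operations such as deletion, insertion, or replacement performed on a DNA sequence, and is widely used in gene function research, disease model construction, disease treatment, and the engineering of transgenic animals and plants. First-generation gene editing technology is based on zinc finger nucleases (ZFNs). A ZFN contains a zinc finger DNA-binding domain that recognizes a specific sequence, and engineering this domain allows different DNA sequences to be targeted. A zinc finger DNA-binding domain generally consists of several zinc fingers, each of which recognizes 3 bases, so the length of a ZFN target sequence must be a multiple of 3. Because the ZFN recognition domain exhibits context-dependent effects, design and screening are very difficult and the range of application is limited; the technology also suffers from high cost, heavy workload, long turnaround, low success rate, a tendency toward off-target events, and high cytotoxicity. Second-generation gene editing technology is based on transcription activator-like effector nucleases (TALENs), in which the unit module that confers specificity for target-site DNA is a pair of amino acids (a di-residue) spaced by 32 constant amino acid residues; different di-residues correspond one-to-one with the four nucleotide bases A, G, T, and C. The corresponding di-residue sequence is deduced from the target DNA sequence to build the TALEN target recognition module. Assembling this module requires extensive molecular cloning and sequencing, which has limited adoption of the technology.
Third-generation gene editing technology is based on CRISPR-Cas, which achieves specific recognition of the target DNA sequence through a guide RNA. Designing and synthesizing a guide RNA takes far less work than constructing the DNA recognition modules of the TALEN and ZFN technologies. The guide RNA binds a Cas protein having nuclease activity and directs it to cleave the target DNA.
Gene editing proteins currently still show a certain off-target rate. After a gene editing protein (such as Cas12a) forms a ternary complex with the guide RNA and the target DNA, it has not only cis-cleavage activity toward the target DNA but also non-specific trans-cleavage activity toward single-stranded DNA present in the system. When DNA is being replicated or transcribed, double-stranded DNA unwinds into single-stranded DNA, and the trans-cleavage activity of the gene editing protein (such as Cas12a) may then cut this DNA, producing off-target events and causing cytotoxicity. The trans-cleavage activity of gene editing proteins (such as Cas12a) therefore needs to be eliminated in order to resolve the cytotoxicity caused by off-target cleavage.
Accordingly, there is an urgent need in the art for a method of eliminating the trans-cleavage activity of gene editing proteins (such as Cas12a) so as to resolve the cytotoxicity caused by off-target cleavage.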
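Summary of the Invention
The object of the present invention is to provide a method of reducing, or even eliminating, the trans-cleavage activity of gene editing proteins (such as Cas12a) so as to resolve the cytotoxicity caused by off-target cleavage.
A first aspect of the invention provides a gene editing protein variant, the variant being a non-natural protein having cis-cleavage activity, the variant having reduced trans-cleavage activity relative to its wild-type gene editing protein, and the variant being mutated at one or more core amino acid sites of the wild-type gene editing protein associated with cleavage activity, selected from the group consisting of:
the phenylalanine (F) site corresponding to position 1081 of FnCas12a; and/or
the lysine (K) site corresponding to position 1069 of FnCas12a.
In another preferred embodiment, the reduced trans-cleavage activity of the variant relative to its wild-type gene editing protein means that, compared with the wild-type gene editing protein, the trans-cleavage activity of the variant is reduced by ≥50%, preferably ≥80%, more preferably ≥90% or 100%.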
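The reduction thresholds in the preceding embodiment can be illustrated with a short calculation. The following Python sketch is only an illustration: the text does not specify how trans-cleavage activity is quantified, so the inputs are assumed to be any non-negative activity readout (for example an initial rate of fluorescence increase), and the function names and tier labels are hypothetical.

```python
def trans_activity_reduction(wild_type_activity: float, variant_activity: float) -> float:
    """Return the percent reduction of the variant's activity relative to wild type.

    Both inputs are assumed to be non-negative readouts measured under the
    same conditions; the result is clamped to the range 0-100.
    """
    if wild_type_activity <= 0:
        raise ValueError("wild-type activity must be positive")
    if variant_activity < 0:
        raise ValueError("variant activity must be non-negative")
    reduction = 100.0 * (wild_type_activity - variant_activity) / wild_type_activity
    return max(0.0, min(100.0, reduction))


def preference_tier(reduction_percent: float) -> str:
    """Map a percent reduction onto the tiers named in the embodiment."""
    if reduction_percent >= 90.0:
        return "more preferred (>=90%)"
    if reduction_percent >= 80.0:
        return "preferred (>=80%)"
    if reduction_percent >= 50.0:
        return "reduced (>=50%)"
    return "below the 50% threshold"


if __name__ == "__main__":
    # Made-up numbers purely for demonstration, not measured data.
    r = trans_activity_reduction(wild_type_activity=10.0, variant_activity=1.0)
    print(r, preference_tier(r))
```

The clamping is a design choice of this sketch: a variant that happens to read slightly above wild type is reported as 0% reduction rather than as a negative value.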
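In another preferred embodiment, the phenylalanine (F) at position 1081 of FnCas12a is mutated to one or more amino acids selected from the group consisting of: arginine (R), tyrosine (Y), tryptophan (W), glutamine (Q), asparagine (N), lysine (K), glutamic acid (E), aspartic acid (D), or a combination thereof.
In another preferred embodiment, the lysine (K) at position 1069 of FnCas12a is mutated to one or more amino acids selected from the group consisting of: arginine (R), tyrosine (Y), glutamine (Q), asparagine (N), lysine (K), glutamic acid (E), aspartic acid (D), or a combination thereof.
In another preferred embodiment, the phenylalanine (F) at position 1081 of FnCas12a is mutated to arginine (R).
In another preferred embodiment, the lysine (K) at position 1069 of FnCas12a is mutated to arginine (R).
In another preferred embodiment, the mutation is selected from the group consisting of: F1081R, K1069R, or a combination thereof.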
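Mutation labels such as F1081R and K1069R encode the original residue, the 1-based position, and the replacement residue. As a minimal sketch of how such a label could be applied to a protein sequence in silico, the following Python function parses the label, checks that the expected wild-type residue is present, and returns the substituted sequence. The function name and the short toy sequence are assumptions made for illustration; the toy sequence uses its own local numbering, not the FnCas12a numbering.

```python
import re


def apply_substitution(sequence: str, mutation: str) -> str:
    """Apply a single substitution label (e.g. 'F1081R') to a protein sequence.

    The position in the label is 1-based. A ValueError is raised if the
    label is malformed, the position is out of range, or the residue found
    at that position differs from the one named in the label.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognised mutation label: {mutation!r}")
    original, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    index = position - 1
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[index] != original:
        raise ValueError(
            f"expected {original} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


if __name__ == "__main__":
    toy = "MGKQTGIIYYVPAGFTSK"  # toy peptide with local numbering (K3, F15)
    print(apply_substitution(toy, "K3R"))
    print(apply_substitution(toy, "F15R"))
```

Checking the original residue before substituting guards against applying a label to a sequence whose numbering is shifted relative to the reference.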
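In another preferred embodiment, the gene editing protein is a type V CRISPR/Cas protein.
In another preferred embodiment, the gene editing protein is selected from the group consisting of: Cas12, Cas14, or a combination thereof.
In another preferred embodiment, the gene editing protein is selected from the group consisting of: Cas12a, Cas12b, Cas12e, or a combination thereof.
In another preferred embodiment, the Cas12a is selected from the group consisting of: FnCas12a, AsCas12a, LbCas12a, Lb5Cas12a, HkCas12a, OsCas12a, TsCas12a, BbCas12a, BoCas12a, Lb4Cas12a, or a combination thereof.
In another preferred embodiment, the source of the gene editing protein is selected from the group consisting of: Leptotrichia, Listeria, Corynebacter, Sutterella, Legionella, Treponema, Filifactor, Eubacterium, Streptococcus, Lactobacillus, Mycoplasma, Bacteroides, Flaviivola, Flavobacterium, Azospirillum, Sphaerochaeta, Gluconacetobacter, Neisseria, Roseburia, Parvibaculum, Staphylococcus, Nitratifractor, Campylobacter, Lachnospira, or a combination thereof.
In another preferred embodiment, the source of the gene editing protein is selected from the group consisting of: Lachnospiraceae bacterium ND2006 (LbCas12a), Thiomicrospira sp. XS5 (TsCas12a), Francisella tularensis (FnCas12a), Bacteroidetes oral taxon 274 (BoCas12a), Oribacterium sp. NK2B42 (OsCas12a), Acidaminococcus sp. BV3L6 (AsCas12a), Helcococcus kunzii (HkCas12a), Lachnospiraceae bacterium NC2008 (Lb5Cas12a), or a combination thereof.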
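In another preferred embodiment, the sites corresponding to positions 1081 and 1069 of FnCas12a are located at positions 1081 and 1069 of FnCas12a.
In another preferred embodiment, the sites corresponding to positions 1081 and 1069 of FnCas12a are located at positions 1019 and 1007 of BbCas12a.
In another preferred embodiment, the sites corresponding to positions 1081 and 1069 of FnCas12a are located at positions 1069 and 1057 of AsCas12a.
In another preferred embodiment, the sites corresponding to positions 1081 and 1069 of FnCas12a are located at positions 1033 and 1021 of BoCas12a.
In another preferred embodiment, the sites corresponding to positions 1081 and 1069 of FnCas12a are located at positions 1090 and 1078 of HkCas12a.
In another preferred embodiment, the sites corresponding to positions 1081 and 1069 of FnCas12a are located at positions 1004 and 992 of Lb4Cas12a.
In another preferred embodiment, the sites corresponding to positions 1081 and 1069 of FnCas12a are located at positions 980 and 968 of Lb5Cas12a.
In another preferred embodiment, the sites corresponding to positions 1081 and 1069 of FnCas12a are located at positions 1018 and 1006 of LbCas12a.
In another preferred embodiment, the sites corresponding to positions 1081 and 1069 of FnCas12a are located at positions 1001 and 989 of OsCas12a.
In another preferred embodiment, the sites corresponding to positions 1081 and 1069 of FnCas12a are located at positions 1070 and 1058 of TsCas12a.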
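The correspondences listed in the embodiments above can be collected into a lookup table. The Python sketch below is illustrative only: the dictionary and function names are hypothetical, while the positions are copied from the embodiments. It also shows that in every listed ortholog the two sites are 12 residues apart, as they are in FnCas12a.

```python
# Positions in each Cas12a ortholog that correspond to FnCas12a
# positions 1081 and 1069, as listed in the embodiments above.
CORRESPONDING_SITES = {
    "FnCas12a": (1081, 1069),
    "BbCas12a": (1019, 1007),
    "AsCas12a": (1069, 1057),
    "BoCas12a": (1033, 1021),
    "HkCas12a": (1090, 1078),
    "Lb4Cas12a": (1004, 992),
    "Lb5Cas12a": (980, 968),
    "LbCas12a": (1018, 1006),
    "OsCas12a": (1001, 989),
    "TsCas12a": (1070, 1058),
}


def corresponding_sites(ortholog: str) -> tuple[int, int]:
    """Return (site matching Fn 1081, site matching Fn 1069) for an ortholog."""
    try:
        return CORRESPONDING_SITES[ortholog]
    except KeyError:
        raise KeyError(f"no correspondence listed for {ortholog!r}") from None


def site_spacing(ortholog: str) -> int:
    """Number of residues between the two corresponding sites."""
    upper, lower = corresponding_sites(ortholog)
    return upper - lower


if __name__ == "__main__":
    for name in CORRESPONDING_SITES:
        print(name, corresponding_sites(name), site_spacing(name))
```

The table records positions only; which residues occupy those positions in each ortholog is not stated in the embodiments and is therefore not encoded here.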
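In another preferred embodiment, the gene editing protein is FnCas12a.
In another preferred embodiment, the sequence of the gene editing protein is as shown in SEQ ID NO. 1.
In another preferred embodiment, the amino acid sequence of the variant is as shown in any one of SEQ ID NO. 2-3.
In another preferred embodiment, the variant is a polypeptide having the amino acid sequence shown in any one of SEQ ID NO.: 2-3, an active fragment thereof, or a conservative variant polypeptide thereof.
In another preferred embodiment, apart from the mutation(s) (e.g., at position 1081 and/or position 1069), the remaining amino acid sequence of the variant is identical or substantially identical to the sequence of the wild-type gene editing protein.
In another preferred embodiment, "substantially identical" means that at most 50 (preferably 1-20, more preferably 1-10, still more preferably 1-5) amino acids differ, where the differences include amino acid substitutions, deletions, or additions, and the variant has cis-cleavage activity and reduced trans-cleavage activity.
In another preferred embodiment, the homology between the variant and the wild-type gene editing protein is at least 80%, preferably at least 85% or 90%, more preferably at least 95%, most preferably at least 98% or 99%.
In another preferred embodiment, the variant is selected from the group consisting of:
(a) a polypeptide having the amino acid sequence shown in any one of SEQ ID NO.: 2-3;
(b) a polypeptide derived from (a), formed by substitution, deletion, or addition of one or more (e.g., 2, 3, 4, or 5) amino acid residues in the amino acid sequence shown in any one of SEQ ID NO.: 2-3, and having cis-cleavage activity and reduced trans-cleavage activity.
In another preferred embodiment, the homology between the derived polypeptide and the sequence shown in any one of SEQ ID NO.: 2-3 is at least 60%, preferably at least 70%, more preferably at least 80%, most preferably at least 90%, such as 95%, 97%, or 99%.
In another preferred embodiment, the variant is formed by mutation of the wild-type gene editing protein.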
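A second aspect of the invention provides a polynucleotide encoding the variant of the first aspect of the invention.
In another preferred embodiment, the polynucleotide is selected from the group consisting of:
(a) a polynucleotide encoding a polypeptide as shown in any one of SEQ ID NO. 2-3;
(b) a polynucleotide having a sequence as shown in any one of SEQ ID NO.: 4-5;
(c) a polynucleotide whose nucleotide sequence has ≥80% (preferably ≥90%, more preferably ≥95%, most preferably ≥98%) homology with the sequence shown in any one of SEQ ID NO.: 4-5 and which encodes a polypeptide as shown in any one of SEQ ID NO.: 2-3;
(d) a polynucleotide complementary to the polynucleotide of any one of (a)-(c).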
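The relationship between an encoding polynucleotide, its complement, and the encoded polypeptide in items (a)-(d) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: it uses the standard genetic code only, ignores ambiguous bases, and the function names are hypothetical. The example input is the first 21 nucleotides of the FnCas12aK1069R coding sequence given later in this document.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return dna.translate(_COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate a coding sequence, read from its first base, into protein.

    Translation proceeds codon by codon; a stop codon is rendered as '*'.
    Trailing bases that do not form a complete codon are ignored.
    """
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    start = "atgagcatctatcaggagttc"
    print(translate(start))
    print(reverse_complement(start))
```

Because the genetic code is degenerate, many different polynucleotides satisfy item (a); translating a candidate and comparing the result with the polypeptide sequence is one simple way to check item (c).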
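In another preferred embodiment, the polynucleotide additionally contains, flanking the ORF of the variant, an auxiliary element selected from the group consisting of: a signal peptide, a secretory peptide, a tag sequence (e.g., 6His), or a combination thereof.
In another preferred embodiment, the polynucleotide is selected from the group consisting of: a genomic sequence, a cDNA sequence, an RNA sequence, or a combination thereof.
In another preferred embodiment, the polynucleotide further comprises a promoter operably linked to the ORF sequence of the variant.
In another preferred embodiment, the promoter is selected from the group consisting of: a constitutive promoter, a tissue-specific promoter, an inducible promoter, or a strong promoter.
A third aspect of the invention provides a vector containing the polynucleotide of the second aspect of the invention.
In another preferred embodiment, the vector comprises one or more promoters operably linked to the nucleic acid sequence, an enhancer, a transcription termination signal, a polyadenylation sequence, an origin of replication, a selectable marker, a nucleic acid restriction site, and/or a homologous recombination site.
In another preferred embodiment, the vector includes plasmids and viral vectors.
In another preferred embodiment, the viral vector is selected from the group consisting of: adeno-associated virus (AAV), adenovirus, lentivirus, retrovirus, herpesvirus, SV40, poxvirus, or a combination thereof.
In another preferred embodiment, the vector includes cloning vectors, transformation vectors, expression vectors, shuttle vectors, integration vectors, and multifunctional vectors.
A fourth aspect of the invention provides a host cell that contains the vector of the third aspect of the invention, or that has the polynucleotide of the second aspect of the invention integrated into its genome.
In another preferred embodiment, the host cell is a eukaryotic cell, such as a yeast cell, a plant cell, or a mammalian cell (including human and non-human mammals).
In another preferred embodiment, the host cell is a prokaryotic cell, such as Escherichia coli.
In another preferred embodiment, the yeast cell is a yeast from one or more sources selected from the group consisting of: Pichia, Kluyveromyces, or a combination thereof; preferably, the yeast cell includes Kluyveromyces, more preferably Kluyveromyces marxianus and/or Kluyveromyces lactis.
In another preferred embodiment, the host cell is selected from the group consisting of: Escherichia coli, wheat germ cells, insect cells, SF9, HeLa, HEK293, CHO, yeast cells, or a combination thereof.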
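A fifth aspect of the invention provides a method for preparing a gene editing protein variant, the method comprising the steps of:
(a) culturing the host cell of the fourth aspect of the invention under conditions suitable for expression, thereby expressing the gene editing protein variant; and
(b) isolating the gene editing protein variant.
A sixth aspect of the invention provides an enzyme preparation comprising the gene editing protein variant of the first aspect of the invention.
In another preferred embodiment, the enzyme preparation includes an injection and/or a lyophilized preparation.
A seventh aspect of the invention provides a gene editing system, comprising:
the gene editing protein variant of the first aspect of the invention, or a gene encoding it, or an expression vector thereof; and
a gRNA or an expression vector thereof, and/or an oligonucleotide, nucleic acid fragment, or plasmid for repair of the break at the target site.
In another preferred embodiment, the expression vector includes plasmids and viral vectors.
In another preferred embodiment, the gRNA includes crRNA, tracrRNA, and sgRNA.
In another preferred embodiment, the gRNA includes unmodified and modified gRNA.
In another preferred embodiment, the modified gRNA includes chemical modification of bases.
In another preferred embodiment, the chemical modification includes methylation, methoxy modification, fluorination, or thio modification.
In another preferred embodiment, the gene editing includes CRISPR-based gene editing.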
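An eighth aspect of the invention provides a composition, comprising:
the system of the seventh aspect of the invention; and
a pharmaceutically acceptable carrier.
In another preferred embodiment, the composition includes a pharmaceutical composition.
In another preferred embodiment, the dosage form of the composition is selected from the group consisting of: a lyophilized preparation, a liquid preparation, or a combination thereof.
In another preferred embodiment, the dosage form of the composition is a liquid preparation.
In another preferred embodiment, the dosage form of the composition is an injectable dosage form.
In another preferred embodiment, the composition is a cell preparation.
In another preferred embodiment, the expression vector of the gene editing protein variant and the expression vector of the gRNA are the same vector or different vectors.
In another preferred embodiment, in the composition, the system of the seventh aspect of the invention accounts for 1-99 wt%, preferably 10-90 wt%, more preferably 30-70 wt%, of the total weight of the composition.
A ninth aspect of the invention provides a kit, comprising: the gene editing protein variant of the first aspect of the invention or the gene editing system of the seventh aspect of the invention.
In another preferred embodiment, the kit further includes a label or instructions.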
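A tenth aspect of the invention provides a pharmaceutical kit, comprising:
a first container and, located in the first container, the gene editing system of the seventh aspect of the invention or the composition of the eighth aspect of the invention, or a medicament containing the gene editing system of the seventh aspect or the composition of the eighth aspect.
In another preferred embodiment, the medicament in the first container is a single-agent formulation containing the gene editing system of the seventh aspect of the invention or the composition of the eighth aspect of the invention.
In another preferred embodiment, the dosage form of the medicament is selected from the group consisting of: a lyophilized preparation, a liquid preparation, or a combination thereof.
In another preferred embodiment, the dosage form of the medicament is an oral or injectable dosage form.
In another preferred embodiment, the kit further contains instructions.
An eleventh aspect of the invention provides a pharmaceutical kit, comprising:
(a1) a first container and, located in the first container, the gene editing protein variant of the first aspect of the invention, or a gene encoding it, or an expression vector thereof, or a medicament containing the gene editing protein variant of the first aspect, or a gene encoding it, or an expression vector thereof;
(b1) a second container and, located in the second container, a gRNA or an expression vector thereof, or a medicament containing a gRNA or an expression vector thereof.
In another preferred embodiment, the first container and the second container are different containers.
In another preferred embodiment, the medicament in the first container is a single-agent formulation containing the gene editing protein variant of the first aspect of the invention, or a gene encoding it, or an expression vector thereof.
In another preferred embodiment, the medicament in the second container is a single-agent formulation containing a gRNA or an expression vector thereof.
In another preferred embodiment, the dosage form of the medicament is selected from the group consisting of: a lyophilized preparation, a liquid preparation, or a combination thereof.
In another preferred embodiment, the dosage form of the medicament is an oral or injectable dosage form.
In another preferred embodiment, the kit further contains instructions.
A twelfth aspect of the invention provides use of the gene editing protein variant of the first aspect of the invention, the gene editing system of the seventh aspect, the composition of the eighth aspect, the kit of the ninth aspect, or the pharmaceutical kit of the tenth or eleventh aspect, in the preparation of a reagent or kit for reducing the off-target rate of gene editing.
In another preferred embodiment, the reagent or kit is used to reduce the trans-cleavage activity of gene editing.
In another preferred embodiment, the reagent or kit is used to reduce the trans-cleavage activity of gene editing while retaining cis-cleavage activity.
In another preferred embodiment, reducing the trans-cleavage activity of gene editing means reducing it by ≥80%, more preferably ≥90% or 100%.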
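A thirteenth aspect of the invention provides a method of reducing the off-target rate of gene editing, comprising the step of:
performing gene editing on a cell in the presence of the gene editing protein variant of the first aspect of the invention, the gene editing system of the seventh aspect, the composition of the eighth aspect, the kit of the ninth aspect, or the pharmaceutical kit of the tenth or eleventh aspect, thereby reducing the off-target rate of gene editing.
In another preferred embodiment, the cell is a prokaryotic cell or a eukaryotic cell.
In another preferred embodiment, the cell is a mammalian cell.
In another preferred embodiment, the mammalian cell is a cell of a non-human mammal, such as a primate, bovine, ovine, porcine, canine, rodent, or leporid, for example a monkey, cow, sheep, pig, dog, rabbit, rat, or mouse.
In another preferred embodiment, the cell is a non-mammalian eukaryotic cell, such as a cell of poultry (e.g., chicken), a vertebrate fish (e.g., salmon), or a shellfish (e.g., oyster, clam, lobster, shrimp).
In another preferred embodiment, the cell is a plant cell.
In another preferred embodiment, the plant cell is a cell of a monocot or dicot, or of a cultivated or food crop plant such as cassava, maize, sorghum, soybean, wheat, oat, or rice.
In another preferred embodiment, the plant cell is a cell of an alga, tree, or production plant, fruit, or vegetable (e.g., trees such as citrus trees, e.g., orange, grapefruit, or lemon trees; peach or nectarine trees; apple or pear trees; nut trees such as almond, walnut, or pistachio trees; plants of the genus Solanum; plants of the genus Brassica; plants of the genus Lactuca; plants of the genus Spinacia; plants of the genus Capsicum; cotton, tobacco, asparagus, carrot, cabbage, broccoli, cauliflower, tomato, eggplant, pepper, lettuce, spinach, strawberry, blueberry, raspberry, blackberry, grape, coffee, cocoa, etc.).
In another preferred embodiment, the gene editing is performed in an in vitro reaction system.
In another preferred embodiment, the method is non-diagnostic and non-therapeutic.
In another preferred embodiment, the cell is an in vitro cell.
It should be understood that, within the scope of the invention, the technical features described above and the technical features specifically described below (e.g., in the examples) may be combined with one another to form new or preferred technical solutions. For brevity, they are not enumerated one by one here.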
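Brief Description of the Drawings
Figure 1 is a gel image of the purified gene editing proteins, showing the molecular weights of the three gene editing proteins, all 150 kDa. Lanes 1 and 2 are wild-type FnCas12a, loaded at 3 μg and 5 μg respectively; lanes 3 and 4 are the mutant protein FnCas12aK1069R, loaded at 2 μg and 3 μg respectively; lanes 5 and 6 are the mutant protein FnCas12aF1081R, loaded at 2 μg and 3 μg respectively.
Figure 2 is an electrophoresis image of the products of the cis-cleavage reaction of the gene editing proteins with target dsDNA. It shows that all three proteins have cis-cleavage activity, and that the cis-cleavage activity of the mutant proteins FnCas12aK1069R and FnCas12aF1081R does not differ significantly from that of the wild-type gene editing protein. In Figure 2, M is a 1 kb DNA marker; S is the target dsDNA fragment, 829 bp in size; P denotes the cis-cleavage products of the target dsDNA, 529 bp and 300 bp in size.
Figure 3 shows the change in fluorescence signal in the trans-cleavage reaction of the gene editing proteins with non-target ssDNA. The fluorescence signal of the reaction system was measured with a real-time quantitative PCR instrument. "Control" is the negative-control trans-cleavage reaction system, i.e., a trans-cleavage system to which no target dsDNA was added; no fluorescence signal was detected in the control system over time. "WT" is the wild-type FnCas12a protein, whose trans-cleavage reaction system showed a fluorescence signal that increased with reaction time, indicating that wild-type FnCas12a has trans-cleavage activity. The fluorescence signals of the trans-cleavage reaction systems of the mutant proteins FnCas12aK1069R and FnCas12aF1081R remained unchanged at background level throughout the reaction, indicating that the mutant proteins FnCas12aK1069R and FnCas12aF1081R have no significant trans-cleavage activity.
Figure 4 (4a-4e) is an amino acid sequence alignment of 10 Cas12a proteins. The figure shows that the amino acid sequences of these 10 Cas12a proteins share relatively high homology.
Figure 5 is a phylogenetic tree of CRISPR type V Cas proteins (i.e., Cas12 proteins). As shown in the figure, all Cas12 proteins contain a RuvC functional domain. (Yan, Winston X., et al. Functionally diverse type V CRISPR-Cas systems. [J]. Science (New York, N.Y.), 2018, 363(6422).)
Figure 6 is a schematic diagram of the protein domains of FnCas12a, indicating the start and end amino acid residue positions of each functional domain (Stella, Stefano; Mesa, Pablo; et al. Conformational Activation Promotes CRISPR-Cas12a Catalysis and Resetting of the Endonuclease Activity. [J]. Cell, 2018, 175:1856–1871).
Figure 7 is a schematic diagram of the protein domains of Cas12a, Cas12b, and Cas12e (Tong, Baisong, et al. The Versatile Type V CRISPR Effectors and Their Application Prospects [J]. Frontiers in Cell and Developmental Biology, 2021, 8:622103.)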
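Detailed Description
After extensive and intensive research, the inventors, who had originally set out to mutate effector proteins of the type V family in the hope of increasing their interaction with the DNA substrate of trans-cleavage, instead unexpectedly obtained, after extensive screening, a gene editing protein variant. Compared with the wild-type gene editing protein, the gene editing protein variant of the invention can have cis-cleavage activity with reduced, or even no, trans-cleavage activity, and the variant and gene editing systems containing it can significantly reduce the off-target rate of gene editing. The inventors completed the invention on this basis.
Terms
So that the present disclosure may be more readily understood, certain terms are first defined. As used in this application, unless otherwise expressly stated herein, each of the following terms shall have the meaning given below. Other definitions are set forth throughout the application.
The term "about" may refer to a value or composition within an acceptable error range for the particular value or composition as determined by one of ordinary skill in the art, which will depend in part on how the value or composition is measured or determined. For example, as used herein, the expression "about 100" includes 99, 101, and all values in between (e.g., 99.1, 99.2, 99.3, 99.4, etc.).
As used herein, the terms "containing" and "including (comprising)" may be open-ended, semi-closed, or closed. In other words, the terms also cover "consisting essentially of" and "consisting of".
Sequence identity (or homology) is determined by comparing two aligned sequences along a predetermined comparison window (which may be 50%, 60%, 70%, 80%, 90%, 95%, or 100% of the length of the reference nucleotide sequence or protein) and counting the number of positions at which identical residues occur. It is usually expressed as a percentage. Measuring the sequence identity of nucleotide sequences is a method well known to those skilled in the art.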
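The identity calculation described in the preceding paragraph can be sketched as follows. This Python example is an illustration under assumptions that the text does not fix: the two sequences are already aligned to equal length with "-" marking gaps, columns where both sequences have a gap are skipped, and the denominator is the number of remaining columns in the window. Other denominator conventions exist, and the function name is hypothetical.

```python
from typing import Optional, Tuple


def percent_identity(
    aligned_a: str,
    aligned_b: str,
    window: Optional[Tuple[int, int]] = None,
) -> float:
    """Percent identity of two pre-aligned sequences over a comparison window.

    `window` is a (start, end) pair of 0-based alignment columns, end
    exclusive; by default the whole alignment is used. Columns in which
    both sequences carry a gap are ignored. A residue paired with a gap
    counts as a compared, non-identical position.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    start, end = window if window is not None else (0, len(aligned_a))
    compared = 0
    identical = 0
    for a, b in zip(aligned_a[start:end], aligned_b[start:end]):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("comparison window contains no residues")
    return 100.0 * identical / compared


if __name__ == "__main__":
    print(percent_identity("MGKQTG", "MGRQTG"))
    print(percent_identity("MGK-TG", "MGKQTG", window=(0, 3)))
```

In practice the alignment itself would come from a dedicated alignment tool; this sketch covers only the counting step.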
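cis-cleavage activity
In the present invention, cis-cleavage activity refers to the specific cleavage activity of a Cas protein toward a target nucleic acid molecule.
trans-cleavage activity
In the present invention, trans-cleavage activity refers to the non-specific cleavage activity of a Cas protein toward non-target nucleic acid molecules (mainly non-target single-stranded nucleic acid molecules).
When DNA is being replicated or transcribed, double-stranded DNA unwinds into single-stranded DNA, and the trans-cleavage activity of a gene editing protein (such as Cas12a) may then cut this single-stranded DNA, causing off-target cleavage. Reducing the trans-cleavage activity of a gene editing protein is therefore equivalent to reducing its gene editing off-target rate.
Wild-type gene editing protein
As used herein, "wild-type gene editing protein" refers to a naturally occurring gene editing protein that has not been artificially modified; its nucleotide sequence can be obtained by genetic engineering techniques such as genome sequencing and polymerase chain reaction (PCR), and its amino acid sequence can be deduced from the nucleotide sequence. Sources of the wild-type gene editing protein include Lachnospiraceae bacterium ND2006 (LbCas12a), Thiomicrospira sp. XS5 (TsCas12a), Francisella tularensis (FnCas12a), Bacteroidetes oral taxon 274 (BoCas12a), Oribacterium sp. NK2B42 (OsCas12a), Acidaminococcus sp. BV3L6 (AsCas12a), Helcococcus kunzii (HkCas12a), and Lachnospiraceae bacterium NC2008 (Lb5Cas12a). Wild-type gene editing proteins include Cas12 and Cas14, and further include Cas12a, Cas12b, and Cas12e; still further, the Cas12a is selected from the group consisting of: FnCas12a, AsCas12a, LbCas12a, Lb5Cas12a, HkCas12a, OsCas12a, TsCas12a, BbCas12a, BoCas12a, Lb4Cas12a, or a combination thereof.
In a preferred embodiment of the invention, the wild-type gene editing protein is FnCas12a, with the sequence shown in SEQ ID NO. 1.
Amino acid sequence of wild-type FnCas12a (SEQ ID NO. 1):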
MSIYQEFVNKYSLSKTLRFELIPQGKTLENIKARGLILDDEKRAKDYKKAKQIIDKYHQFFIEEILSSVCISEDLLQNYSDVYFKLKKSDDDNLQKDFKSAKDTIKKQISEYIKDSEKFKNLFNQNLIDAKKGQESDLILWLKQSKDNGIELFKANSDITDIDEALEIIKSFKGWTTYFKGFHENRKNVYSSNDIPTSIIYRIVDDNLPKFLENKAKYESLKDKAPEAINYEQIKKDLAEELTFDIDYKTSEVNQRVFSLDEVFEIANFNNYLNQSGITKFNTIIGGKFVNGENTKRKGINEYINLYSQQINDKTLKKYKMSVLFKQILSDTESKSFVIDKLEDDSDVVTTMQSFYEQIAAFKTVEEKSIKETLSLLFDDLKAQKLDLSKIYFKNDKSLTDLSQQVFDDYSVIGTAVLEYITQQIAPKNLDNPSKEQELIAKKTEKAKYLSLETIKLALEEFNKHRDIDKQCRFEEILANFAAIPMIFDEIAQNKDNLAQISIKYQNQGKKDLLQASAEDDVKAIKDLLDQTNNLLHKLKIFHISQSEDKANILDKDEHFYLVFEECYFELANIVPLYNKIRNYITQKPYSDEKFKLNFENSTLANGWDKNKEPDNTAILFIKDDKYYLGVMNKKNNKIFDDKAIKENKGEGYKKIVYKLLPGANKMLPKVFFSAKSIKFYNPSEDILRIRNHSTHTKNGSPQKGYEKFEFNIEDCRKFIDFYKQSISKHPEWKDFGFRFSDTQRYNSIDEFYREVENQGYKLTFENISESYIDSVVNQGKLYLFQIYNKDFSAYSKGRPNLHTLYWKALFDERNLQDVVYKLNGEAELFYRKQSIPKKITHPAKEAIANKNKDNPKKESVFEYDLIKDKRFTEDKFFFHCPITINFKSSGANKFNDEINLLLKEKANDVHILSIDRGERHLAYYTLVDGKGNIIKQDTFNIIGNDRMKTNYHDKLAAIEKDRDSARKDWKKINNIKEMKEGYLSQVVHEIAKLVIEYNAIVVFEDLNFGFKRGRFKVEKQVYQKLEKMLIEKLNYLVFKDNEFDKTGGVLRAYQLTAPFETFKKMGKQTGIIYYVPAGFTSKICPVTGFVNQLYPKYESVSKSQEFFSKFDKICYNLDKGYFEFSFDYKNFGDKAAKGKWTIASFGSRLINFRNSDKNHNWDTREVYPTKELEKLLKDYSIEYGHGECIKAAICGESDKKFFAKLTSVLNTILQMRNSKTGTELDYLISPVADVNGNFFDSRQAPKNMPQDADANGAYHIGLKGLMLLGRIKNNQEGKKLNLVIKNEEYFEFVQNRNN*
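Gene editing protein variants and their encoding nucleic acids
As used herein, the terms "gene editing protein variant", "variant of the invention", "gene editing mutant protein of the invention", and "mutant protein" are used interchangeably, and all refer to a non-naturally occurring mutated gene editing protein having cis-cleavage activity, the mutant protein being mutated at one or more core amino acid sites of the wild-type gene editing protein associated with cleavage activity, selected from the group consisting of:
the phenylalanine (F) site corresponding to position 1081 of FnCas12a; and/or
the lysine (K) site corresponding to position 1069 of FnCas12a, the mutant protein having reduced, or even no, trans-cleavage activity relative to its wild-type gene editing protein.
The term "core amino acid" means that, in a sequence based on the wild-type gene editing protein and having at least 80% (e.g., 84%, 85%, 90%, 92%, 95%, 98%, or 99%) homology with it, the corresponding site is the specific amino acid described herein; for example, based on the wild-type gene editing protein, the core amino acids are:
the phenylalanine (F) corresponding to position 1081 of FnCas12a; and/or
the lysine (K) corresponding to position 1069 of FnCas12a.
A mutant protein obtained by mutating the above core amino acids has cis-cleavage activity and reduced, or even no, trans-cleavage activity.
Preferably, in the present invention, the core amino acids of the invention are mutated as follows:
the phenylalanine (F) corresponding to position 1081 of FnCas12a is mutated to arginine (R);
the lysine (K) corresponding to position 1069 of FnCas12a is mutated to arginine (R).
It should be understood that amino acid numbering in the mutant proteins of the invention is based on the wild-type gene editing protein. When a particular mutant protein has 80% or more homology with the sequence of the wild-type gene editing protein, its amino acid numbering may be shifted relative to that of the wild-type gene editing protein, for example by 1-100 positions toward the N-terminus or C-terminus. Using sequence alignment techniques routine in the art, those skilled in the art will generally understand that such a shift is within a reasonable range, and a mutant protein having 80% (e.g., 90%, 95%, 98%) homology and the same or similar property of cis-cleavage activity with reduced trans-cleavage activity should not be excluded from the scope of the mutant proteins of the invention merely because of a shift in amino acid numbering.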
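The numbering shift described in the preceding paragraph is usually resolved by reading positions off a pairwise alignment. The Python sketch below is a minimal illustration, assuming the two sequences have already been aligned to equal length with "-" as the gap character; the function name and the toy alignment are hypothetical and are not taken from the sequence listing.

```python
from typing import Optional


def map_position(aligned_reference: str, aligned_query: str, reference_position: int) -> Optional[int]:
    """Map a 1-based reference position to the matching 1-based query position.

    Walks the alignment column by column, counting non-gap residues in each
    sequence. Returns None when the reference residue is aligned to a gap in
    the query. Raises ValueError if the reference has fewer residues than
    `reference_position`.
    """
    if len(aligned_reference) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    ref_count = 0
    query_count = 0
    for ref_residue, query_residue in zip(aligned_reference, aligned_query):
        if ref_residue != "-":
            ref_count += 1
        if query_residue != "-":
            query_count += 1
        if ref_residue != "-" and ref_count == reference_position:
            return query_count if query_residue != "-" else None
    raise ValueError(f"reference has no position {reference_position}")


if __name__ == "__main__":
    reference = "MSKQ-TGF"  # toy reference, residues numbered 1-7
    query = "--KQATGF"      # toy homolog, shifted by two at the N-terminus
    print(map_position(reference, query, 3))  # K3 in the reference
    print(map_position(reference, query, 7))  # F7 in the reference
```

With a real alignment of FnCas12a against another Cas12a, the same walk would yield the kind of position correspondences listed for the orthologs earlier in this document.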
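The mutant protein of the invention is a synthetic or recombinant protein; that is, it may be a product of chemical synthesis, or may be produced from a prokaryotic or eukaryotic host (e.g., bacteria, yeast, plants) using recombinant techniques. Depending on the host used in the recombinant production protocol, the mutant protein of the invention may be glycosylated or non-glycosylated. The mutant protein of the invention may or may not include an initial methionine residue.
The invention also includes fragments, derivatives, and analogs of the mutant protein. As used herein, the terms "fragment", "derivative", and "analog" refer to a protein that substantially retains the same biological function or activity as the mutant protein.
A fragment, derivative, or analog of the mutant protein of the invention may be (i) a mutant protein in which one or more conservative or non-conservative amino acid residues (preferably conservative amino acid residues) are substituted, where the substituted amino acid residues may or may not be encoded by the genetic code, or (ii) a mutant protein having a substituent group in one or more amino acid residues, or (iii) a mutant protein formed by fusing the mature mutant protein to another compound (such as a compound that extends the half-life of the mutant protein, e.g., polyethylene glycol), or (iv) a mutant protein formed by fusing an additional amino acid sequence to the mutant protein sequence (such as a leader sequence or secretory sequence, a sequence used to purify the mutant protein, a proprotein sequence, or a fusion protein formed with an antigen IgG fragment). In light of the teachings herein, such fragments, derivatives, and analogs are within the knowledge of those skilled in the art. In the present invention, conservatively substituted amino acids are preferably generated by amino acid substitution according to Table I.
Table I
The active mutant protein of the invention has cis-cleavage activity and reduced, or even no, trans-cleavage activity.
Preferably, the mutant protein is as shown in any one of SEQ ID NO.: 2-3.
Amino acid sequence of the mutant protein FnCas12aK1069R: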
MSIYQEFVNKYSLSKTLRFELIPQGKTLENIKARGLILDDEKRAKDYKKAKQIIDKYHQFFIEEILSSVCISEDLLQNYSDVYFKLKKSDDDNLQKDFKSAKDTIKKQISEYIKDSEKFKNLFNQNLIDAKKGQESDLILWLKQSKDNGIELFKANSDITDIDEALEIIKSFKGWTTYFKGFHENRKNVYSSNDIPTSIIYRIVDDNLPKFLENKAKYESLKDKAPEAINYEQIKKDLAEELTFDIDYKTSEVNQRVFSLDEVFEIANFNNYLNQSGITKFNTIIGGKFVNGENTKRKGINEYINLYSQQINDKTLKKYKMSVLFKQILSDTESKSFVIDKLEDDSDVVTTMQSFYEQIAAFKTVEEKSIKETLSLLFDDLKAQKLDLSKIYFKNDKSLTDLSQQVFDDYSVIGTAVLEYITQQIAPKNLDNPSKEQELIAKKTEKAKYLSLETIKLALEEFNKHRDIDKQCRFEEILANFAAIPMIFDEIAQNKDNLAQISIKYQNQGKKDLLQASAEDDVKAIKDLLDQTNNLLHKLKIFHISQSEDKANILDKDEHFYLVFEECYFELANIVPLYNKIRNYITQKPYSDEKFKLNFENSTLANGWDKNKEPDNTAILFIKDDKYYLGVMNKKNNKIFDDKAIKENKGEGYKKIVYKLLPGANKMLPKVFFSAKSIKFYNPSEDILRIRNHSTHTKNGSPQKGYEKFEFNIEDCRKFIDFYKQSISKHPEWKDFGFRFSDTQRYNSIDEFYREVENQGYKLTFENISESYIDSVVNQGKLYLFQIYNKDFSAYSKGRPNLHTLYWKALFDERNLQDVVYKLNGEAELFYRKQSIPKKITHPAKEAIANKNKDNPKKESVFEYDLIKDKRFTEDKFFFHCPITINFKSSGANKFNDEINLLLKEKANDVHILSIDRGERHLAYYTLVDGKGNIIKQDTFNIIGNDRMKTNYHDKLAAIEKDRDSARKDWKKINNIKEMKEGYLSQVVHEIAKLVIEYNAIVVFEDLNFGFKRGRFKVEKQVYQKLEKMLIEKLNYLVFKDNEFDKTGGVLRAYQLTAPFETFKKMGRQTGIIYYVPAGFTSKICPVTGFVNQLYPKYESVSKSQEFFSKFDKICYNLDKGYFEFSFDYKNFGDKAAKGKWTIASFGSRLINFRNSDKNHNWDTREVYPTKELEKLLKDYSIEYGHGECIKAAICGESDKKFFAKLTSVLNTILQMRNSKTGTELDYLISPVADVNGNFFDSRQAPKNMPQDADANGAYHIGLKGLMLLGRIKNNQEGKKLNLVIKNEEYFEFVQNRNN*
(SEQ ID NO.2)
Amino acid sequence of the mutant protein FnCas12aF1081R:
MSIYQEFVNKYSLSKTLRFELIPQGKTLENIKARGLILDDEKRAKDYKKAKQIIDKYHQFFIEEILSSVCISEDLLQNYSDVYFKLKKSDDDNLQKDFKSAKDTIKKQISEYIKDSEKFKNLFNQNLIDAKKGQESDLILWLKQSKDNGIELFKANSDITDIDEALEIIKSFKGWTTYFKGFHENRKNVYSSNDIPTSIIYRIVDDNLPKFLENKAKYESLKDKAPEAINYEQIKKDLAEELTFDIDYKTSEVNQRVFSLDEVFEIANFNNYLNQSGITKFNTIIGGKFVNGENTKRKGINEYINLYSQQINDKTLKKYKMSVLFKQILSDTESKSFVIDKLEDDSDVVTTMQSFYEQIAAFKTVEEKSIKETLSLLFDDLKAQKLDLSKIYFKNDKSLTDLSQQVFDDYSVIGTAVLEYITQQIAPKNLDNPSKEQELIAKKTEKAKYLSLETIKLALEEFNKHRDIDKQCRFEEILANFAAIPMIFDEIAQNKDNLAQISIKYQNQGKKDLLQASAEDDVKAIKDLLDQTNNLLHKLKIFHISQSEDKANILDKDEHFYLVFEECYFELANIVPLYNKIRNYITQKPYSDEKFKLNFENSTLANGWDKNKEPDNTAILFIKDDKYYLGVMNKKNNKIFDDKAIKENKGEGYKKIVYKLLPGANKMLPKVFFSAKSIKFYNPSEDILRIRNHSTHTKNGSPQKGYEKFEFNIEDCRKFIDFYKQSISKHPEWKDFGFRFSDTQRYNSIDEFYREVENQGYKLTFENISESYIDSVVNQGKLYLFQIYNKDFSAYSKGRPNLHTLYWKALFDERNLQDVVYKLNGEAELFYRKQSIPKKITHPAKEAIANKNKDNPKKESVFEYDLIKDKRFTEDKFFFHCPITINFKSSGANKFNDEINLLLKEKANDVHILSIDRGERHLAYYTLVDGKGNIIKQDTFNIIGNDRMKTNYHDKLAAIEKDRDSARKDWKKINNIKEMKEGYLSQVVHEIAKLVIEYNAIVVFEDLNFGFKRGRFKVEKQVYQKLEKMLIEKLNYLVFKDNEFDKTGGVLRAYQLTAPFETFKKMGKQTGIIYYVPAGRTSKICPVTGFVNQLYPKYESVSKSQEFFSKFDKICYNLDKGYFEFSFDYKNFGDKAAKGKWTIASFGSRLINFRNSDKNHNWDTREVYPTKELEKLLKDYSIEYGHGECIKAAICGESDKKFFAKLTSVLNTILQMRNSKTGTELDYLISPVADVNGNFFDSRQAPKNMPQDADANGAYHIGLKGLMLLGRIKNNQEGKKLNLVIKNEEYFEFVQNRNN*(SEQ ID NO.3)
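The variant names follow the usual substitution notation: original residue, 1-based position, new residue. The sketch below shows one way such a substitution could be applied to a sequence string, with a check that the expected wild-type residue really sits at the stated position. The helper `apply_substitution` is hypothetical and not part of the patent. The 21-residue excerpt is the wild-type context around the two sites, renumbered from 1 purely for illustration, so K1069 and F1081 appear here as K6 and F18.

```python
import re


def apply_substitution(seq: str, mutation: str) -> str:
    """Apply a substitution written as <old residue><1-based position><new residue>."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation: {mutation!r}")
    old, pos, new = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= pos <= len(seq):
        raise ValueError(f"position {pos} is outside the sequence (length {len(seq)})")
    if seq[pos - 1] != old:
        raise ValueError(f"expected {old} at position {pos}, found {seq[pos - 1]}")
    # Strings are immutable, so rebuild the sequence around the changed residue.
    return seq[:pos - 1] + new + seq[pos:]


# Wild-type excerpt spanning both sites; positions are renumbered from 1 (assumption).
WT_EXCERPT = "FKKMGKQTGIIYYVPAGFTSK"

print(apply_substitution(WT_EXCERPT, "K6R"))   # lysine to arginine
print(apply_substitution(WT_EXCERPT, "F18R"))  # phenylalanine to arginine
```

Checking the wild-type residue before substituting is a cheap guard against the numbering shifts that arise when a sequence is truncated or extended relative to the reference.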
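It should be understood that the mutant protein of the invention generally has high homology (identity) to the sequence shown in either of SEQ ID NO.: 2-3. Preferably, the homology of the mutant protein to the sequence shown in either of SEQ ID NO.: 2-3 is at least 80%, preferably at least 85%-90%, more preferably at least 95%, and most preferably at least 98% or 99%.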
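To make the identity thresholds concrete, here is a minimal sketch of a percent-identity calculation. It assumes the two sequences are already aligned to the same length with no gaps, which is a simplification: real homology comparisons would first run a sequence alignment. The function name `percent_identity` and the placeholder sequences are illustrative assumptions, not part of the patent.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Placeholder sequences: a 1299-residue reference and a copy with one substitution.
reference = "A" * 1299
single_mutant = "A" * 1068 + "R" + "A" * 230

identity = percent_identity(reference, single_mutant)
print(f"{identity:.2f}%")          # one change in 1299 residues
print(identity >= 99.0)            # clears the strictest threshold listed above
```

A single substitution in a protein of this length leaves the identity far above every threshold named in the text, which is why the thresholds mainly matter for more divergent variants or orthologs.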
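In addition, the mutant protein of the invention may be modified. Modified forms (which usually do not alter the primary structure) include chemically derivatized forms of the mutant protein produced in vivo or in vitro, such as acetylated or carboxylated forms. Modifications also include glycosylation, such as mutant proteins produced by glycosylation during synthesis and processing of the mutant protein or during further processing steps. Such modification can be accomplished by exposing the mutant protein to an enzyme that carries out glycosylation (such as a mammalian glycosylase or deglycosylase). Modified forms also include sequences having phosphorylated amino acid residues (e.g., phosphotyrosine, phosphoserine, phosphothreonine). Also included are mutant proteins modified to improve their resistance to proteolysis or to optimize their solubility.
The term "polynucleotide encoding the mutant protein" may refer to a polynucleotide that includes the sequence encoding the mutant protein of the invention, or to a polynucleotide that additionally includes further coding and/or non-coding sequences.
In a preferred embodiment, the sequence of the polynucleotide encoding the mutant protein of the invention is as shown in either of SEQ ID NO.: 4-5.
Nucleotide sequence of FnCas12aK1069R (SEQ ID NO.4):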
atgagcatctatcaggagttcgtgaataagtacagcctgtccaagaccctgcggtttgagctgatcccccagggcaagacactggagaacatcaaggccaggggcctgatcctggacgatgagaagcgcgccaaggactataagaaggccaagcagatcatcgataagtaccaccagttctttatcgaggagatcctgagcagcgtgtgcatctctgaggatctgctgcagaattacagcgacgtgtatttcaagctgaagaagtctgacgatgacaacctgcagaaggacttcaagagcgccaaggacaccatcaagaagcagatcagcgagtatatcaaggactccgagaagtttaagaatctgttcaaccagaatctgatcgatgccaagaagggccaggagtccgacctgatcctgtggctgaagcagtctaaggacaatggcatcgagctgttcaaggccaactctgatatcaccgatatcgacgaggccctggagatcatcaagagctttaagggctggaccacatactttaagggcttccacgagaacaggaagaacgtgtacagcagcaacgacatccctacaagcatcatctaccgcatcgtggatgacaatctgccaaagttcctggagaacaaggccaagtatgagtccctgaaggacaaggcccccgaggccatcaattacgagcagatcaagaaggatctggccgaggagctgaccttcgatatcgactataagacatccgaggtgaaccagcgggtgttttctctggacgaggtgtttgagatcgccaatttcaacaattacctgaaccagtccggcatcaccaagttcaatacaatcatcggcggcaagtttgtgaacggcgagaataccaagagaaagggcatcaacgagtacatcaatctgtatagccagcagatcaacgacaagaccctgaagaagtacaagatgagcgtgctgttcaagcagatcctgtccgatacagagtctaagagctttgtgatcgataagctggaggatgactctgacgtggtgaccacaatgcagagcttttatgagcagatcgccgccttcaagaccgtggaggagaagtctatcaaggagacactgagcctgctgttcgatgacctgaaggcccagaagctggacctgtctaagatctacttcaagaacgataagtccctgaccgacctgtctcagcaggtgtttgatgactatagcgtgatcggcaccgccgtgctggagtacatcacacagcagatcgccccaaagaacctggataatccctctaagaaggagcaggagctgatcgccaagaagaccgagaaggccaagtatctgagcctggagacaatcaagctggccctggaggagttcaataagcaccgggatatcgacaagcagtgcagatttgaggagatcctggccaacttcgccgccatccccatgatctttgatgagatcgcccagaacaaggacaatctggcccagatctccatcaagtaccagaaccagggcaagaaggacctgctgcaggcctctgccgaggatgacgtgaaggccatcaaggatctgctggaccagaccaacaatctgctgcacaagctgaagatcttccacatctcccagtctgaggataaggccaatatcctggataaggacgagcacttttatctggtgttcgaggagtgttacttcgagctggccaacatcgtgcccctgtacaacaagatcagaaattatatcacacagaagccttactccgacgagaagtttaagctgaacttcgagaacagcaccctggccaacggctgggataagaataaggagcctgacaacacagccatcctgttcatcaaggatgacaagtactatctgggcgtgatgaataagaagaacaataagatcttcgatgacaaggccatcaaggagaacaagggcgagggctacaagaagatcgtgtataagctgctgcccggcgccaataa
gatgctgcctaaggtgttcttttccgccaagtctatcaagttctacaacccatccgaggacatcctgcggatcagaaatcactccacccacacaaagaacggctctccccagaagggctatgagaagtttgagttcaatatcgaggattgccggaagtttatcgacttctacaagcagagcatctccaagcaccctgagtggaaggattttggcttcaggtttagcgacacccagcggtacaactccatcgacgagttctacagagaggtggagaatcagggctataagctgacatttgagaacatctctgagagctacatcgacagcgtggtgaatcagggcaagctgtacctgttccagatctataacaaggacttcagcgcctattccaagggccggccaaacctgcacaccctgtactggaaggccctgttcgatgagagaaatctgcaggacgtggtgtataagctgaacggcgaggccgagctgttttacaggaagcagtccatccctaagaagatcacacacccagccaaggaggccatcgccaacaagaataaggacaatcctaagaaggagagcgtgttcgagtacgatctgatcaaggacaagcggttcaccgaggataagttctttttccactgtccaatcacaatcaacttcaagtcctctggcgccaacaagtttaatgacgagatcaatctgctgctgaaggagaaggccaacgatgtgcacatcctgagcatcgaccggggcgagagacacctggcctactataccctggtggatggcaagggcaatatcatcaagcaggataccttcaacatcatcggcaatgacaggatgaagacaaactaccacgataagctggccgccatcgagaaggatagggactccgcccgcaaggactggaagaagatcaacaatatcaaggagatgaaggagggctatctgtctcaggtggtgcacgagatcgccaagctggtcatcgagtacaatgccatcgtggtgttcgaggatctgaacttcggctttaagaggggccgctttaaggtggagaagcaggtgtatcagaagctggagaagatgctgatcgagaagctgaattacctggtgtttaaggataacgagttcgacaagaccggaggcgtgctgagggcataccagctgaccgccccctttgagacattcaagaagatgggcAGgcagacaggcatcatctactatgtgccagccggcttcacctccaagatctgccccgtgacaggctttgtgaaccagctgtaccctaagtatgagtccgtgtctaagagccaggagtttttcagcaagttcgataagatctgttataatctggacaagggctacttcgagttttccttcgattataagaactttggcgacaaggccgccaagggcaagtggaccatcgcctctttcggcagccggctgatcaactttagaaattccgataagaaccacaattgggacacccgggaggtgtacccaacaaaggagctggagaagctgctgaaggactacagcatcgagtatggccacggcgagtgcatcaaggccgccatctgtggcgagagcgataagaagtttttcgccaagctgacctccgtgctgaatacaatcctgcagatgcggaacagcaagaccggcacagagctggactacctgatctcccccgtggccgatgtgaacggcaacttcttcgacagcagacaggcccccaagaatatgcctcaggatgccgacgccaacggcgcctatcacatcggcctgaagggcctgatgctgctgggcaggatcaagaacaatcaggagggcaagaagctgaacctggtcatcaagaacgaggagtactttgagttcgtgcagaaccgcaacaattga
Nucleotide sequence of FnCas12aF1081R (SEQ ID NO.5):
atgagcatctatcaggagttcgtgaataagtacagcctgtccaagaccctgcggtttgagctgatcccccagggcaagacactggagaacatcaaggccaggggcctgatcctggacgatgagaagcgcgccaaggactataagaaggccaagcagatcatcgataagtaccaccagttctttatcgaggagatcctgagcagcgtgtgcatctctgaggatctgctgcagaattacagcgacgtgtatttcaagctgaagaagtctgacgatgacaacctgcagaaggacttcaagagcgccaaggacaccatcaagaagcagatcagcgagtatatcaaggactccgagaagtttaagaatctgttcaaccagaatctgatcgatgccaagaagggccaggagtccgacctgatcctgtggctgaagcagtctaaggacaatggcatcgagctgttcaaggccaactctgatatcaccgatatcgacgaggccctggagatcatcaagagctttaagggctggaccacatactttaagggcttccacgagaacaggaagaacgtgtacagcagcaacgacatccctacaagcatcatctaccgcatcgtggatgacaatctgccaaagttcctggagaacaaggccaagtatgagtccctgaaggacaaggcccccgaggccatcaattacgagcagatcaagaaggatctggccgaggagctgaccttcgatatcgactataagacatccgaggtgaaccagcgggtgttttctctggacgaggtgtttgagatcgccaatttcaacaattacctgaaccagtccggcatcaccaagttcaatacaatcatcggcggcaagtttgtgaacggcgagaataccaagagaaagggcatcaacgagtacatcaatctgtatagccagcagatcaacgacaagaccctgaagaagtacaagatgagcgtgctgttcaagcagatcctgtccgatacagagtctaagagctttgtgatcgataagctggaggatgactctgacgtggtgaccacaatgcagagcttttatgagcagatcgccgccttcaagaccgtggaggagaagtctatcaaggagacactgagcctgctgttcgatgacctgaaggcccagaagctggacctgtctaagatctacttcaagaacgataagtccctgaccgacctgtctcagcaggtgtttgatgactatagcgtgatcggcaccgccgtgctggagtacatcacacagcagatcgccccaaagaacctggataatccctctaagaaggagcaggagctgatcgccaagaagaccgagaaggccaagtatctgagcctggagacaatcaagctggccctggaggagttcaataagcaccgggatatcgacaagcagtgcagatttgaggagatcctggccaacttcgccgccatccccatgatctttgatgagatcgcccagaacaaggacaatctggcccagatctccatcaagtaccagaaccagggcaagaaggacctgctgcaggcctctgccgaggatgacgtgaaggccatcaaggatctgctggaccagaccaacaatctgctgcacaagctgaagatcttccacatctcccagtctgaggataaggccaatatcctggataaggacgagcacttttatctggtgttcgaggagtgttacttcgagctggccaacatcgtgcccctgtacaacaagatcagaaattatatcacacagaagccttactccgacgagaagtttaagctgaacttcgagaacagcaccctggccaacggctgggataagaataaggagcctgacaacacagccatcctgttcatcaaggatgacaagtactatctgggcgtgatgaataagaagaacaataagatcttcgatgacaaggccatcaaggagaacaagggcgagggctacaagaagatcgtgtataagctgctgcccggcgccaataa
gatgctgcctaaggtgttcttttccgccaagtctatcaagttctacaacccatccgaggacatcctgcggatcagaaatcactccacccacacaaagaacggctctccccagaagggctatgagaagtttgagttcaatatcgaggattgccggaagtttatcgacttctacaagcagagcatctccaagcaccctgagtggaaggattttggcttcaggtttagcgacacccagcggtacaactccatcgacgagttctacagagaggtggagaatcagggctataagctgacatttgagaacatctctgagagctacatcgacagcgtggtgaatcagggcaagctgtacctgttccagatctataacaaggacttcagcgcctattccaagggccggccaaacctgcacaccctgtactggaaggccctgttcgatgagagaaatctgcaggacgtggtgtataagctgaacggcgaggccgagctgttttacaggaagcagtccatccctaagaagatcacacacccagccaaggaggccatcgccaacaagaataaggacaatcctaagaaggagagcgtgttcgagtacgatctgatcaaggacaagcggttcaccgaggataagttctttttccactgtccaatcacaatcaacttcaagtcctctggcgccaacaagtttaatgacgagatcaatctgctgctgaaggagaaggccaacgatgtgcacatcctgagcatcgaccggggcgagagacacctggcctactataccctggtggatggcaagggcaatatcatcaagcaggataccttcaacatcatcggcaatgacaggatgaagacaaactaccacgataagctggccgccatcgagaaggatagggactccgcccgcaaggactggaagaagatcaacaatatcaaggagatgaaggagggctatctgtctcaggtggtgcacgagatcgccaagctggtcatcgagtacaatgccatcgtggtgttcgaggatctgaacttcggctttaagaggggccgctttaaggtggagaagcaggtgtatcagaagctggagaagatgctgatcgagaagctgaattacctggtgtttaaggataacgagttcgacaagaccggaggcgtgctgagggcataccagctgaccgccccctttgagacattcaagaagatgggcaagcagacaggcatcatctactatgtgccagccggcCGcacctccaagatctgccccgtgacaggctttgtgaaccagctgtaccctaagtatgagtccgtgtctaagagccaggagtttttcagcaagttcgataagatctgttataatctggacaagggctacttcgagttttccttcgattataagaactttggcgacaaggccgccaagggcaagtggaccatcgcctctttcggcagccggctgatcaactttagaaattccgataagaaccacaattgggacacccgggaggtgtacccaacaaaggagctggagaagctgctgaaggactacagcatcgagtatggccacggcgagtgcatcaaggccgccatctgtggcgagagcgataagaagtttttcgccaagctgacctccgtgctgaatacaatcctgcagatgcggaacagcaagaccggcacagagctggactacctgatctcccccgtggccgatgtgaacggcaacttcttcgacagcagacaggcccccaagaatatgcctcaggatgccgacgccaacggcgcctatcacatcggcctgaagggcctgatgctgctgggcaggatcaagaacaatcaggagggcaagaagctgaacctggtcatcaagaacgaggagtactttgagttcgtgcagaaccgcaacaattga
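In the two nucleotide sequences above, the changed bases are written in upper case (`AG` in SEQ ID NO.4, `CG` in SEQ ID NO.5). The sketch below translates short in-frame excerpts around each site with the standard genetic code and reports which codon differs. The excerpts are copied from the sequences above; the function names are hypothetical, and the choice of excerpt boundaries is an assumption made so that each excerpt starts on a codon boundary.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string (any letter case) into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def codon_changes(ref: str, alt: str):
    """List (codon index, ref codon, alt codon, ref residue, alt residue) for each difference."""
    ref, alt = ref.upper(), alt.upper()
    if len(ref) != len(alt) or len(ref) % 3 != 0:
        raise ValueError("inputs must be in-frame and of equal length")
    changes = []
    for i in range(0, len(ref), 3):
        r, a = ref[i:i + 3], alt[i:i + 3]
        if r != a:
            changes.append((i // 3, r, a, CODON_TABLE[r], CODON_TABLE[a]))
    return changes


# Excerpts around site 1069: unchanged codon (as in SEQ ID NO.5) vs SEQ ID NO.4.
site_1069 = codon_changes("ttcaagaagatgggcaagcagaca", "ttcaagaagatgggcAGgcagaca")
# Excerpts around site 1081: unchanged codon (as in SEQ ID NO.4) vs SEQ ID NO.5.
site_1081 = codon_changes("gccggcttcacctcc", "gccggcCGcacctcc")

print(translate("ttcaagaagatgggcAGgcagaca"), site_1069)
print(translate("gccggcCGcacctcc"), site_1081)
```

Building the codon table from the two ordered strings keeps the example self-contained and avoids a dependency on an external bioinformatics library.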
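The invention also relates to variants of the above polynucleotides that encode polypeptides having the same amino acid sequence as that of the invention, or fragments, analogs, and derivatives of the mutant protein. These nucleotide variants include substitution variants, deletion variants, and insertion variants. As is known in the art, an allelic variant is an alternative form of a polynucleotide; it may involve the substitution, deletion, or insertion of one or more nucleotides but does not substantially alter the function of the mutant protein it encodes.
The invention also relates to polynucleotides that hybridize to the above sequences and have at least 50%, preferably at least 70%, and more preferably at least 80% identity between the two sequences. The invention particularly relates to polynucleotides that hybridize to the polynucleotides of the invention under stringent conditions. In the invention, "stringent conditions" means: (1) hybridization and washing at lower ionic strength and higher temperature, such as 0.2×SSC, 0.1% SDS, 60℃; or (2) hybridization in the presence of a denaturing agent, such as 50% (v/v) formamide, 0.1% calf serum/0.1% Ficoll, 42℃; or (3) hybridization that occurs only when the identity between the two sequences is at least 90%, preferably at least 95%.
The mutant protein and polynucleotide of the invention are preferably provided in isolated form, and more preferably are purified to homogeneity.
The full-length polynucleotide sequence of the invention can usually be obtained by PCR amplification, recombination, or artificial synthesis. For PCR amplification, primers can be designed according to the relevant nucleotide sequences disclosed herein, especially the open reading frame sequence, and a commercially available cDNA library, or a cDNA library prepared by conventional methods known to those skilled in the art, can be used as the template to amplify the relevant sequence. When the sequence is long, two or more rounds of PCR amplification are often required, and the fragments from each round are then spliced together in the correct order.
Once the relevant sequence has been obtained, it can be produced in large quantities by recombination. This usually involves cloning it into a vector, transferring the vector into cells, and then isolating the relevant sequence from the propagated host cells by conventional methods.
In addition, the relevant sequence can be produced by artificial synthesis, especially when the fragment is short. Typically, a very long fragment can be obtained by first synthesizing multiple small fragments and then ligating them.
At present, a DNA sequence encoding the protein of the invention (or a fragment or derivative thereof) can be obtained entirely by chemical synthesis. The DNA sequence can then be introduced into various existing DNA molecules (such as vectors) and cells known in the art. In addition, mutations can be introduced into the protein sequence of the invention by chemical synthesis.
Methods that amplify DNA/RNA using PCR are preferably used to obtain the polynucleotide of the invention. In particular, when it is difficult to obtain full-length cDNA from a library, the RACE method (rapid amplification of cDNA ends) is preferably used. Primers for PCR can be appropriately selected according to the sequence information of the invention disclosed herein and synthesized by conventional methods. The amplified DNA/RNA fragments can be separated and purified by conventional methods such as gel electrophoresis.
It should be noted that positions 1081 and 1069 in the amino acid sequence of the gene-editing protein derived from Francisella tularensis (FnCas12a) correspond to conserved sites in Cas12a proteins from other sources. The specific correspondences are shown in Table II.
Table II. Corresponding sites of the mutated amino acids
NCBI accession number | Protein | Corresponding mutation site 1 | Corresponding mutation site 2 |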
489130501 | FnCas12a | K1069 | F1081 |
987324269 | BbCas12a | K1007 | N1019 |
545612232 | AsCas12a | T1057 | Y1069 |
496509559 | BoCas12a | K1021 | N1033 |
491540987 | HkCas12a | G1078 | Y1090 |
769130406 | Lb4Cas12a | R992 | N1004 |
652820612 | Lb5Cas12a | K968 | L980 |
917059416 | LbCas12a | T1006 | L1018 |
909652572 | OsCas12a | K989 | L1001 |
972924080 | TsCas12a | K1058 | Y1070 |
Therefore, mutation of the above sites plays a crucial role in reducing the off-target rate of gene editing.
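The correspondences in Table II can be held in a small lookup structure. The sketch below copies the table rows as given and adds one sanity check that is easy to compute: the distance between the two listed positions in each ortholog. The names `TABLE_II`, `parse_site`, and `site_spacing` are hypothetical helpers introduced for illustration.

```python
import re

# (NCBI accession, protein, corresponding site 1, corresponding site 2), copied from Table II.
TABLE_II = [
    ("489130501", "FnCas12a", "K1069", "F1081"),
    ("987324269", "BbCas12a", "K1007", "N1019"),
    ("545612232", "AsCas12a", "T1057", "Y1069"),
    ("496509559", "BoCas12a", "K1021", "N1033"),
    ("491540987", "HkCas12a", "G1078", "Y1090"),
    ("769130406", "Lb4Cas12a", "R992", "N1004"),
    ("652820612", "Lb5Cas12a", "K968", "L980"),
    ("917059416", "LbCas12a", "T1006", "L1018"),
    ("909652572", "OsCas12a", "K989", "L1001"),
    ("972924080", "TsCas12a", "K1058", "Y1070"),
]


def parse_site(site: str) -> tuple[str, int]:
    """Split a site label such as 'K1069' into (residue, position)."""
    match = re.fullmatch(r"([A-Z])(\d+)", site)
    if match is None:
        raise ValueError(f"unrecognized site label: {site!r}")
    return match.group(1), int(match.group(2))


def site_spacing() -> dict[str, int]:
    """Distance in residues between site 1 and site 2 for every protein in the table."""
    return {
        protein: parse_site(site2)[1] - parse_site(site1)[1]
        for _, protein, site1, site2 in TABLE_II
    }


spacing = site_spacing()
for protein, gap in spacing.items():
    print(f"{protein}: sites are {gap} residues apart")
```

In every row the two positions come out the same distance apart, which is what one would expect if the rows refer to matching positions in an alignment of the orthologs.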
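Expression vectors and host cells
The invention also relates to vectors comprising the polynucleotide of the invention, to host cells genetically engineered with the vector of the invention or with the coding sequence of the mutant protein of the invention, and to methods of producing the polypeptide of the invention by recombinant techniques.
The polynucleotide sequence of the invention can be used to express or produce the recombinant mutant protein by conventional recombinant DNA techniques. Generally, this involves the following steps:
(1) transforming or transducing a suitable host cell with the polynucleotide (or variant) of the invention encoding the mutant protein of the invention, or with a recombinant expression vector containing the polynucleotide;
(2) culturing the host cell in a suitable medium;
(3) isolating and purifying the protein from the medium or the cells.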
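In the invention, the polynucleotide sequence encoding the mutant protein can be inserted into a recombinant expression vector. The term "recombinant expression vector" refers to bacterial plasmids, bacteriophages, yeast plasmids, plant cell viruses, mammalian cell viruses such as adenoviruses and retroviruses, or other vectors well known in the art. Any plasmid or vector can be used as long as it can replicate and remain stable in the host. An important feature of expression vectors is that they usually contain an origin of replication, a promoter, a marker gene, and translation control elements.
Methods well known to those skilled in the art can be used to construct expression vectors containing the DNA sequence encoding the mutant protein of the invention and suitable transcription/translation control signals. These methods include in vitro recombinant DNA techniques, DNA synthesis techniques, and in vivo recombination techniques. The DNA sequence can be operably linked to an appropriate promoter in the expression vector to direct mRNA synthesis. Representative examples of such promoters include: the lac or trp promoter of E. coli; the PL promoter of λ phage; and eukaryotic promoters, including the CMV immediate-early promoter, the HSV thymidine kinase promoter, the early and late SV40 promoters, retroviral LTRs, and other promoters known to control gene expression in prokaryotic or eukaryotic cells or their viruses. The expression vector also includes a ribosome binding site for translation initiation and a transcription terminator.
In addition, the expression vector preferably contains one or more selectable marker genes to provide a phenotypic trait for selecting transformed host cells, such as dihydrofolate reductase, neomycin resistance, and green fluorescent protein (GFP) for eukaryotic cell culture, or tetracycline or ampicillin resistance for E. coli.
A vector containing the appropriate DNA sequence described above together with an appropriate promoter or control sequence can be used to transform an appropriate host cell so that it can express the protein.
The host cell may be a prokaryotic cell (such as E. coli), a lower eukaryotic cell, or a higher eukaryotic cell, such as a yeast cell, plant cell, or mammalian cell (including human and non-human mammals). Representative examples include E. coli, wheat germ cells, insect cells, SF9, HeLa, HEK293, CHO, and yeast cells. In a preferred embodiment of the invention, yeast cells are selected as the host cells (such as Pichia, Kluyveromyces, or a combination thereof; preferably, the yeast cells include Kluyveromyces, more preferably Kluyveromyces marxianus and/or Kluyveromyces lactis).
When the polynucleotide of the invention is expressed in higher eukaryotic cells, transcription is enhanced if an enhancer sequence is inserted into the vector. Enhancers are cis-acting elements of DNA, usually about 10 to 300 base pairs in length, that act on a promoter to enhance transcription of a gene. Examples include the SV40 enhancer of 100 to 270 base pairs on the late side of the replication origin, the polyoma enhancer on the late side of the replication origin, and adenovirus enhancers.
Those of ordinary skill in the art know how to select appropriate vectors, promoters, enhancers, and host cells.
Transformation of host cells with recombinant DNA can be carried out by conventional techniques well known to those skilled in the art. When the host is a prokaryote such as E. coli, competent cells capable of taking up DNA can be harvested after the exponential growth phase and treated by the CaCl2 method, using steps well known in the art. Another method uses MgCl2. If desired, transformation can also be carried out by electroporation. When the host is a eukaryote, the following DNA transfection methods may be used: calcium phosphate co-precipitation, or conventional mechanical methods such as microinjection, electroporation, and liposome packaging.
The resulting transformants can be cultured by conventional methods to express the polypeptide encoded by the gene of the invention. Depending on the host cell used, the medium used in the culture may be selected from various conventional media. Culture is carried out under conditions suitable for growth of the host cell. After the host cells have grown to an appropriate cell density, the selected promoter is induced by a suitable method (such as a temperature shift or chemical induction), and the cells are cultured for a further period.
The recombinant polypeptide in the above methods may be expressed intracellularly or on the cell membrane, or secreted out of the cell. If desired, the recombinant protein can be isolated and purified by various separation methods that exploit its physical, chemical, and other properties. These methods are well known to those skilled in the art. Examples include, but are not limited to: conventional renaturation treatment, treatment with a protein precipitant (salting out), centrifugation, osmotic lysis, sonication, ultracentrifugation, molecular sieve chromatography (gel filtration), adsorption chromatography, ion exchange chromatography, high-performance liquid chromatography (HPLC), various other liquid chromatography techniques, and combinations of these methods.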
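The main advantages of the invention include:
(1) The invention discloses for the first time a novel gene-editing protein variant. Compared with the wild-type gene-editing protein, the variant retains cis-cleavage activity while having reduced, or even no, trans-cleavage activity, and the variant, as well as gene-editing systems containing it, can significantly reduce the off-target rate of gene editing.
The invention is further described below with reference to specific examples. It should be understood that these examples are intended only to illustrate the invention and not to limit its scope. Experimental methods for which specific conditions are not indicated in the following examples generally follow conventional conditions, such as those described in Sambrook et al., Molecular Cloning: A Laboratory Manual (New York: Cold Spring Harbor Laboratory Press, 1989), or the conditions recommended by the manufacturer. Unless otherwise stated, percentages and parts are percentages by weight and parts by weight.
Unless otherwise specified, the reagents and materials used in the examples of the invention are commercially available products.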
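(I) Materials and Methods
1. FnCas12a protein mutagenesis
(1) Construction of expression vectors for FnCas12a mutant proteins
Primers containing the mutation sites were designed (sequences shown in Table 1). Using the wild-type FnCas12a expression plasmid as the template, linear fragments carrying the target-site mutations were amplified with Phanta DNA polymerase. The amplification products were seamlessly assembled into circular expression vectors with Ezmax (obtained from 安徽吐露港生物科技有限公司). The reaction products were transformed into DH10B (obtained from 安徽吐露港生物科技有限公司) and cultured overnight at 37℃ on LB medium containing 50 μg/mL Kan. Single colonies were picked and grown with shaking overnight at 37℃ in liquid LB medium containing 50 μg/mL Kan, and plasmids were extracted. Plasmids verified by sequencing were stored at -80℃.
Table 1
(2) Purification of FnCas12a mutant proteins
The constructed pET28TEV-FnCas12a plasmid was transformed into E. coli BL21(DE3) competent cells, which were grown on solid LB medium containing 50 μg/mL kanamycin (hereinafter Kan) at 37℃ for 12-14 h. Three single colonies were picked into 50 mL of liquid LB medium containing Kan and cultured overnight in a 37℃ shaker. The culture was then inoculated at 1% (v/v) into 1 L of liquid LB medium containing Kan and grown at 37℃ to an OD600 of 0.6-0.8. After 30 min on ice, IPTG was added to a final concentration of 0.2-0.5 mM, and the culture was grown at 16℃ and 220 rpm for 14-16 h. Cells were harvested by centrifugation at 16℃ and 6000 rpm for 5 min. The cell pellet was weighed and then lysed, or stored temporarily at -80℃. (All of the following steps were performed at 4℃.) The pellet was resuspended in protein lysis buffer at a ratio of 5-10 mL of buffer per gram of cells, and the protease inhibitor PMSF was added to a final concentration of 1 mM. Once the cells were evenly resuspended, they were lysed at high pressure in a cell disruptor. The lysate was centrifuged at 14000 rpm for 30 min and the supernatant was collected. The supernatant was mixed with Ni-NTA (天地人和生物科技有限公司) and rocked gently at 4℃ for 1 h so that the protein bound fully to the nickel resin, and the mixture was then loaded onto a 30 mL column. After the supernatant had drained, contaminating proteins were washed off with a wash buffer containing a low concentration of imidazole, and the target protein was then eluted in small volumes with an elution buffer containing a high concentration of imidazole (for the detailed procedure, refer to the Ni-NTA instruction manual). The purity of the target protein was checked on a 10% (v/v) SDS-PAGE gel. The purer fractions were pooled, dialyzed overnight, and concentrated with a 50 kDa ultrafiltration tube; the protein purity is shown in Figure 1. An equal volume of glycerol (pre-chilled to 4℃) was mixed thoroughly with the protein, the protein concentration was determined by the Bradford method, and the protein was stored in small aliquots at -80℃; for short-term use it can be stored at -20℃.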
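Several quantities in the purification protocol are given as ratios or final concentrations rather than volumes. The sketch below turns them into volumes for a given culture. The ratios (1% v/v inoculum, 0.2-0.5 mM IPTG, 5-10 mL lysis buffer per gram of cells) come from the text; the 1 M IPTG stock and the 4 g pellet weight are hypothetical values chosen only to make the arithmetic concrete.

```python
def inoculum_volume_ml(culture_ml: float, percent_vv: float = 1.0) -> float:
    """Volume of overnight culture needed for a given % (v/v) inoculation."""
    return culture_ml * percent_vv / 100.0


def stock_volume_ul(final_mm: float, culture_ml: float, stock_mm: float) -> float:
    """Volume of stock (in µL) giving the requested final concentration (C1*V1 = C2*V2)."""
    return final_mm * culture_ml * 1000.0 / stock_mm


def lysis_buffer_range_ml(pellet_g: float, low: float = 5.0, high: float = 10.0):
    """Lower and upper lysis buffer volumes for the stated mL-per-gram ratio."""
    return pellet_g * low, pellet_g * high


CULTURE_ML = 1000.0     # 1 L expression culture, from the protocol
IPTG_STOCK_MM = 1000.0  # hypothetical 1 M stock, not stated in the source
PELLET_G = 4.0          # hypothetical wet cell weight, not stated in the source

print("inoculum (mL):", inoculum_volume_ml(CULTURE_ML))
print("IPTG for 0.2 mM (µL):", stock_volume_ul(0.2, CULTURE_ML, IPTG_STOCK_MM))
print("IPTG for 0.5 mM (µL):", stock_volume_ul(0.5, CULTURE_ML, IPTG_STOCK_MM))
print("lysis buffer (mL):", lysis_buffer_range_ml(PELLET_G))
```

The added stock volume is small relative to the culture, so the sketch ignores the slight dilution it causes.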
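2. Preparation of the target dsDNA sequence:
PCR amplification was performed with AMED16s-F/R as primers (AMED16s-F: 5'-gtgaactaagccagtagagc-3'; AMED16s-R: 5'-ctttcgctcctcagcgtcag-3'; synthesized by 生工生物工程(上海)股份有限公司) and the genome of Amycolatopsis mediterranei U32 (NCBI accession number: SAMN02603409) as the template. The PCR amplification system for the target dsDNA fragment is shown in Table 2. The PCR program was: initial denaturation at 95℃ for 10 min; 32 cycles of denaturation at 95℃ for 15 s, annealing at 57℃ for 15 s, and extension at 72℃ for 30 s (2 kb can be amplified per minute); and a final extension at 75℃ for 5 min. Fragment size was verified by 1.5% (w/v) agarose gel electrophoresis, and the amplification product was a single DNA fragment of the correct size. The target fragment was recovered by column purification using the Wizard SV Gel and PCR Clean-Up System kit from Promega.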
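The PCR program can be written down as data, which makes it easy to check the total hold time and whether the extension step is long enough for a given amplicon. This is a minimal sketch using the temperatures, times, cycle count, and the stated rate of 2 kb per minute from the text; the class and function names are hypothetical, and instrument ramp times are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    seconds: int


INITIAL = Step("initial denaturation", 95, 600)
CYCLE = (
    Step("denaturation", 95, 15),
    Step("annealing", 57, 15),
    Step("extension", 72, 30),
)
FINAL = Step("final extension", 75, 300)  # temperature as stated in the text
CYCLES = 32
RATE_BP_PER_MIN = 2000  # "2 kb can be amplified per minute"


def total_hold_seconds() -> int:
    """Sum of programmed hold times, excluding ramping between temperatures."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return INITIAL.seconds + CYCLES * per_cycle + FINAL.seconds


def max_amplicon_bp(extension_s: int, rate_bp_per_min: int = RATE_BP_PER_MIN) -> int:
    """Longest product the extension step can cover at the stated polymerase rate."""
    return extension_s * rate_bp_per_min // 60


print("total hold time (min):", total_hold_seconds() / 60)
print("max amplicon for 30 s extension (bp):", max_amplicon_bp(30))
```

If the target fragment is roughly the sum of the two cis-cleavage products reported for this substrate (about 529 bp plus 300 bp), a 30 s extension at this rate would leave some margin, though the source does not state the amplicon length directly.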
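Table 2. PCR amplification system for the target dsDNA fragment
3. cis-cleavage reaction:
Table 3. cis-cleavage reaction system
Table 4. Composition of 10× HOLMES buffer
Component | Concentration (mM or %) |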
Spermidine | 25 |
Tris | 400 |
MgCl2 | 60 |
DTT | 10 |
Glycine | 400 |
Triton X-100 | 0.01% |
PEG20000 | 4% |
pH | 8.4 |
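Table 4 lists the 10× stock, so the working concentrations depend on how much buffer goes into the reaction. The sketch below computes them for 2 µL of 10× buffer in a 20 µL reaction, the proportion used in the trans-cleavage system of Table 5. The dictionary name and function name are hypothetical; pH is left out because it is not a concentration and does not scale with dilution.

```python
# Component: (concentration in the 10x stock, unit), copied from Table 4.
HOLMES_10X = {
    "Spermidine": (25, "mM"),
    "Tris": (400, "mM"),
    "MgCl2": (60, "mM"),
    "DTT": (10, "mM"),
    "Glycine": (400, "mM"),
    "Triton X-100": (0.01, "%"),
    "PEG20000": (4, "%"),
}


def final_concentrations(stock: dict, buffer_ul: float, reaction_ul: float) -> dict:
    """Working concentration of each component after dilution into the reaction."""
    if not 0 < buffer_ul <= reaction_ul:
        raise ValueError("buffer volume must be positive and no larger than the reaction")
    return {
        name: (value * buffer_ul / reaction_ul, unit)
        for name, (value, unit) in stock.items()
    }


working = final_concentrations(HOLMES_10X, buffer_ul=2, reaction_ul=20)
for name, (value, unit) in working.items():
    print(f"{name}: {value:g} {unit}")
```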
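crRNA sequence: 5'-AAUUUCUACUCUUGUAGAUGCCAGGGACGAAGCGCAAGUGACGGAAU-3', synthesized by 南京金斯瑞生物科技有限公司 and purified by HPLC. Detection was performed as follows: the reaction was incubated at 37℃ for 40 min and inactivated at 85℃ for 5 min, and DNA loading buffer was added to a final concentration of 1×. The entire reaction product was loaded onto a 2% (w/v) agarose gel and electrophoresed at 140 V for 25 min. The gel was stained by soaking in EB for 30 min and imaged on a gel imager. The cis-cleavage products are DNA fragments of approximately 529 bp and 300 bp. The control reaction contained no FnCas12a protein. The results are shown in Figure 2.
4. trans-cleavage activity assay
Table 5. trans-cleavage reaction system
Component | Amount |
FnCas12a protein | 5μM |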
target dsDNA | 30nM |
crRNA | 50nM |
10×HOLMES buffer | 2μL |
RRI(Takara) | 0.25μL |
HOLMES-P(FQ-reporter) | 1μM |
ddH2O(RNase free) | Up to 20μL |
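HOLMES-P (FQ-reporter), purchased from 安徽吐露港生物科技有限公司, is a short single-stranded DNA probe (5'-TTTTTT-3') modified with a FAM fluorophore at one end and a fluorescence quencher at the other. While the probe is intact, it does not fluoresce; only when the single-stranded DNA is cleaved, separating the quencher from the fluorophore, can the fluorescence signal of the probe be detected. Immediately after the reaction was assembled, it was placed in a real-time quantitative PCR instrument to monitor fluorescence and incubated at 37℃, with fluorescence collected once every minute for a total of 30 readings (60 min in total). The results are shown in Figure 3. All components of the system except the FnCas12a protein were first prepared as a master mix. The control was the reaction system without target dsDNA.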
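One simple way to compare trans-cleavage between a variant and the wild type from such a time course is to take the net fluorescence increase of each reaction, subtract the increase seen in the no-target control, and express the variant as a fraction of the wild type. The sketch below illustrates that calculation. It is an assumption about how the curves might be summarized, not the analysis used in the patent, and the readings are made-up placeholder numbers, not measurements.

```python
from typing import Sequence


def net_increase(readings: Sequence[float]) -> float:
    """Fluorescence gained between the first and the last reading."""
    if len(readings) < 2:
        raise ValueError("need at least two readings")
    return readings[-1] - readings[0]


def relative_trans_activity(
    variant: Sequence[float],
    wild_type: Sequence[float],
    control: Sequence[float],
) -> float:
    """Background-corrected signal gain of the variant as a fraction of the wild type."""
    background = net_increase(control)
    wt_gain = net_increase(wild_type) - background
    if wt_gain <= 0:
        raise ValueError("wild-type signal does not exceed the control")
    return (net_increase(variant) - background) / wt_gain


# Placeholder readings in arbitrary fluorescence units (illustrative only).
control_curve = [100.0, 102.0, 104.0]
wild_type_curve = [100.0, 600.0, 1104.0]
variant_curve = [100.0, 110.0, 154.0]

ratio = relative_trans_activity(variant_curve, wild_type_curve, control_curve)
print(f"variant trans signal relative to wild type: {ratio:.2%}")
```

An endpoint comparison is the crudest possible summary; fitting initial rates would be more informative when the wild-type curve saturates before the last reading.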
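(II) Results and Discussion
The structure of FnCas12a was analyzed. According to the crystal structure 6i1k, the FnCas12a amino acids that interact with the DNA substrate include K1069, F1081, F1010, V1285, and N1288, and these sites may be related to trans-cleavage activity. These sites were mutated, and the cis- and trans-cleavage activities of the resulting proteins were measured. Two mutant proteins were obtained that have cis-cleavage activity and no trans-cleavage activity. The mutations in these two proteins are a change of the amino acid at position 1081 from phenylalanine to arginine (F1081R) and a change of the amino acid at position 1069 from lysine to arginine (K1069R); the corresponding proteins are named FnCas12aF1081R and FnCas12aK1069R, respectively. The purification results for the wild-type protein (WT) and the mutant proteins (F1081R and K1069R) are shown in Figure 1. The cis-cleavage assay showed that the cis activity of FnCas12aF1081R and FnCas12aK1069R did not differ significantly from that of FnCas12a (Figure 2). The trans-cleavage assay showed that the trans-cleavage activity of FnCas12aF1081R and FnCas12aK1069R was significantly lower than that of the wild-type FnCas12a protein (Figure 3).
In summary, the invention identifies two mutant proteins of FnCas12a, with mutations of the amino acid at position 1081 from phenylalanine to arginine (F1081R) and of the amino acid at position 1069 from lysine to arginine (K1069R), respectively. Both mutant proteins retain cis-cleavage activity and lose (or show a significant reduction in) the trans-cleavage activity of the original wild-type gene-editing protein. Because wild-type Cas12a not only cleaves target DNA specifically but also has non-specific trans-cleavage activity toward single-stranded DNA, it causes a degree of off-target effect during gene editing. By engineering the wild-type gene-editing protein, the invention removes (or reduces) the trans-cleavage activity of Cas12a while retaining its cis-cleavage activity. This overcomes the off-target problem caused by the trans-cleavage activity of the gene-editing protein and makes the Cas12a mutant proteins more advantageous for gene editing.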
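In addition, class 2 clustered regularly interspaced short palindromic repeats (CRISPR)-Cas systems are characterized by a single effector protein and can be further subdivided into types II, V, and VI. Effector proteins of the type V family are diverse at the N-terminus but retain a common RuvC-like endonuclease domain at the C-terminus. Type V systems are further subdivided into many subtypes, including types V-A to V-I, V-K, V-U, and CRISPR–Cas8φ (see Figure 5). Cas12a (type V-A), Cas12b (type V-B), and Cas12e (type V-E) all belong to the type V system. After the effector protein binds the gRNA to form a binary complex, it specifically recognizes a 5'-T-rich PAM and promotes unwinding of the target DNA; at the same time, the non-target strand (NTS) of the target sequence is displaced, forming the so-called "R-loop" structure. The RuvC domain cleaves the NTS and the target strand (TS) sequentially at a site distal to the PAM, forming a staggered cut with a 5' overhang of 5, 7, or 10 nt. Cas12a, Cas12b, and Cas12e all have a bilobed architecture consisting of an α-helical recognition (REC) lobe and a nuclease (NUC) lobe (see Figure 7). The two lobes are connected by the bridge helix (BH) domain. The REC lobe contains two REC domains (REC1 and REC2), which mainly help to coordinate and stabilize the crRNA-target DNA hybrid after the "R-loop" forms. (Tong Baisong et al. The Versatile Type V CRISPR Effectors and Their Application Prospects[J]. Frontiers in Cell and Developmental Biology, 2021, 8: 622103-622103.)
According to Figure 6, position 1069 of FnCas12a is located in the RuvC domain. All Cas proteins of the type V system have a RuvC domain (see Figure 5) and a corresponding site. It can therefore be expected that mutating the amino acid residue corresponding to position 1069 of FnCas12a in any Cas protein of the type V system will give a similar effect. This applies all the more to Cas12a, Cas12b, and Cas12e, which are more similar in structure and function and share higher homology, and most of all to Cas12a proteins from other sources (see Figure 4, a-e, and Table II), which are still more similar in structure and function and share still higher homology.
Likewise, according to Figure 6, position 1081 of FnCas12a lies at the junction of the RuvC domain and the NUC domain. Cas12a, Cas12b, and Cas12e all have a RuvC domain and a NUC domain (see Figure 7). It can be expected that mutating the amino acid residue corresponding to position 1081 of FnCas12a in Cas12a, Cas12b, or Cas12e, which are similar in structure and function and highly homologous, will give a similar effect. This applies all the more to Cas12a proteins from other sources (see Figure 4, a-e, and Table II), which are even more similar in structure and function and highly homologous.
All documents mentioned in the invention are incorporated by reference in this application as if each were individually incorporated by reference. In addition, it should be understood that, after reading the above teachings of the invention, those skilled in the art may make various changes or modifications to the invention, and such equivalents likewise fall within the scope defined by the claims appended to this application.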
Sequence Listing
<110> 上海吐露港生物科技有限公司
<120> Gene-editing protein variants capable of reducing the off-target rate of gene editing
<130> P2022-0226
<160> 5
<170> PatentIn version 3.5
<210> 1
<211> 1299
<212> PRT
<213> Artificial sequence
<400> 1
Met Ser Ile Tyr Gln Glu Phe Val Asn Lys Tyr Ser Leu Ser Lys Thr
1 5 10 15
Leu Arg Phe Glu Leu Ile Pro Gln Gly Lys Thr Leu Glu Asn Ile Lys
20 25 30
Ala Arg Gly Leu Ile Leu Asp Asp Glu Lys Arg Ala Lys Asp Tyr Lys
35 40 45
Lys Ala Lys Gln Ile Ile Asp Lys Tyr His Gln Phe Phe Ile Glu Glu
50 55 60
Ile Leu Ser Ser Val Cys Ile Ser Glu Asp Leu Leu Gln Asn Tyr Ser
65 70 75 80
Asp Val Tyr Phe Lys Leu Lys Lys Ser Asp Asp Asp Asn Leu Gln Lys
85 90 95
Asp Phe Lys Ser Ala Lys Asp Thr Ile Lys Lys Gln Ile Ser Glu Tyr
100 105 110
Ile Lys Asp Ser Glu Lys Phe Lys Asn Leu Phe Asn Gln Asn Leu Ile
115 120 125
Asp Ala Lys Lys Gly Gln Glu Ser Asp Leu Ile Leu Trp Leu Lys Gln
130 135 140
Ser Lys Asp Asn Gly Ile Glu Leu Phe Lys Ala Asn Ser Asp Ile Thr
145 150 155 160
Asp Ile Asp Glu Ala Leu Glu Ile Ile Lys Ser Phe Lys Gly Trp Thr
165 170 175
Thr Tyr Phe Lys Gly Phe His Glu Asn Arg Lys Asn Val Tyr Ser Ser
180 185 190
Asn Asp Ile Pro Thr Ser Ile Ile Tyr Arg Ile Val Asp Asp Asn Leu
195 200 205
Pro Lys Phe Leu Glu Asn Lys Ala Lys Tyr Glu Ser Leu Lys Asp Lys
210 215 220
Ala Pro Glu Ala Ile Asn Tyr Glu Gln Ile Lys Lys Asp Leu Ala Glu
225 230 235 240
Glu Leu Thr Phe Asp Ile Asp Tyr Lys Thr Ser Glu Val Asn Gln Arg
245 250 255
Val Phe Ser Leu Asp Glu Val Phe Glu Ile Ala Asn Phe Asn Asn Tyr
260 265 270
Leu Asn Gln Ser Gly Ile Thr Lys Phe Asn Thr Ile Ile Gly Gly Lys
275 280 285
Phe Val Asn Gly Glu Asn Thr Lys Arg Lys Gly Ile Asn Glu Tyr Ile
290 295 300
Asn Leu Tyr Ser Gln Gln Ile Asn Asp Lys Thr Leu Lys Lys Tyr Lys
305 310 315 320
Met Ser Val Leu Phe Lys Gln Ile Leu Ser Asp Thr Glu Ser Lys Ser
325 330 335
Phe Val Ile Asp Lys Leu Glu Asp Asp Ser Asp Val Val Thr Thr Met
340 345 350
Gln Ser Phe Tyr Glu Gln Ile Ala Ala Phe Lys Thr Val Glu Glu Lys
355 360 365
Ser Ile Lys Glu Thr Leu Ser Leu Leu Phe Asp Asp Leu Lys Ala Gln
370 375 380
Lys Leu Asp Leu Ser Lys Ile Tyr Phe Lys Asn Asp Lys Ser Leu Thr
385 390 395 400
Asp Leu Ser Gln Gln Val Phe Asp Asp Tyr Ser Val Ile Gly Thr Ala
405 410 415
Val Leu Glu Tyr Ile Thr Gln Gln Ile Ala Pro Lys Asn Leu Asp Asn
420 425 430
Pro Ser Lys Glu Gln Glu Leu Ile Ala Lys Lys Thr Glu Lys Ala Lys
435 440 445
Tyr Leu Ser Leu Glu Thr Ile Lys Leu Ala Leu Glu Glu Phe Asn Lys
450 455 460
His Arg Asp Ile Asp Lys Gln Cys Arg Phe Glu Glu Ile Leu Ala Asn
465 470 475 480
Phe Ala Ala Ile Pro Met Ile Phe Asp Glu Ile Ala Gln Asn Lys Asp
485 490 495
Asn Leu Ala Gln Ile Ser Ile Lys Tyr Gln Asn Gln Gly Lys Lys Asp
500 505 510
Leu Leu Gln Ala Ser Ala Glu Asp Asp Val Lys Ala Ile Lys Asp Leu
515 520 525
Leu Asp Gln Thr Asn Asn Leu Leu His Lys Leu Lys Ile Phe His Ile
530 535 540
Ser Gln Ser Glu Asp Lys Ala Asn Ile Leu Asp Lys Asp Glu His Phe
545 550 555 560
Tyr Leu Val Phe Glu Glu Cys Tyr Phe Glu Leu Ala Asn Ile Val Pro
565 570 575
Leu Tyr Asn Lys Ile Arg Asn Tyr Ile Thr Gln Lys Pro Tyr Ser Asp
580 585 590
Glu Lys Phe Lys Leu Asn Phe Glu Asn Ser Thr Leu Ala Asn Gly Trp
595 600 605
Asp Lys Asn Lys Glu Pro Asp Asn Thr Ala Ile Leu Phe Ile Lys Asp
610 615 620
Asp Lys Tyr Tyr Leu Gly Val Met Asn Lys Lys Asn Asn Lys Ile Phe
625 630 635 640
Asp Asp Lys Ala Ile Lys Glu Asn Lys Gly Glu Gly Tyr Lys Lys Ile
645 650 655
Val Tyr Lys Leu Leu Pro Gly Ala Asn Lys Met Leu Pro Lys Val Phe
660 665 670
Phe Ser Ala Lys Ser Ile Lys Phe Tyr Asn Pro Ser Glu Asp Ile Leu
675 680 685
Arg Ile Arg Asn His Ser Thr His Thr Lys Asn Gly Ser Pro Gln Lys
690 695 700
Gly Tyr Glu Lys Phe Glu Phe Asn Ile Glu Asp Cys Arg Lys Phe Ile
705 710 715 720
Asp Phe Tyr Lys Gln Ser Ile Ser Lys His Pro Glu Trp Lys Asp Phe
725 730 735
Gly Phe Arg Phe Ser Asp Thr Gln Arg Tyr Asn Ser Ile Asp Glu Phe
740 745 750
Tyr Arg Glu Val Glu Asn Gln Gly Tyr Lys Leu Thr Phe Glu Asn Ile
755 760 765
Ser Glu Ser Tyr Ile Asp Ser Val Val Asn Gln Gly Lys Leu Tyr Leu
770 775 780
Phe Gln Ile Tyr Asn Lys Asp Phe Ser Ala Tyr Ser Lys Gly Arg Pro
785 790 795 800
Asn Leu His Thr Leu Tyr Trp Lys Ala Leu Phe Asp Glu Arg Asn Leu
805 810 815
Gln Asp Val Val Tyr Lys Leu Asn Gly Glu Ala Glu Leu Phe Tyr Arg
820 825 830
Lys Gln Ser Ile Pro Lys Lys Ile Thr His Pro Ala Lys Glu Ala Ile
835 840 845
Ala Asn Lys Asn Lys Asp Asn Pro Lys Lys Glu Ser Val Phe Glu Tyr
850 855 860
Asp Leu Ile Lys Asp Lys Arg Phe Thr Glu Asp Lys Phe Phe Phe His
865 870 875 880
Cys Pro Ile Thr Ile Asn Phe Lys Ser Ser Gly Ala Asn Lys Phe Asn
885 890 895
Asp Glu Ile Asn Leu Leu Leu Lys Glu Lys Ala Asn Asp Val His Ile
900 905 910
Leu Ser Ile Asp Arg Gly Glu Arg His Leu Ala Tyr Tyr Thr Leu Val
915 920 925
Asp Gly Lys Gly Asn Ile Ile Lys Gln Asp Thr Phe Asn Ile Ile Gly
930 935 940
Asn Asp Arg Met Lys Thr Asn Tyr His Asp Lys Leu Ala Ala Ile Glu
945 950 955 960
Lys Asp Arg Asp Ser Ala Arg Lys Asp Trp Lys Lys Ile Asn Asn Ile
965 970 975
Lys Glu Met Lys Glu Gly Tyr Leu Ser Gln Val Val His Glu Ile Ala
980 985 990
Lys Leu Val Ile Glu Tyr Asn Ala Ile Val Val Phe Glu Asp Leu Asn
995 1000 1005
Phe Gly Phe Lys Arg Gly Arg Phe Lys Val Glu Lys Gln Val Tyr
1010 1015 1020
Gln Lys Leu Glu Lys Met Leu Ile Glu Lys Leu Asn Tyr Leu Val
1025 1030 1035
Phe Lys Asp Asn Glu Phe Asp Lys Thr Gly Gly Val Leu Arg Ala
1040 1045 1050
Tyr Gln Leu Thr Ala Pro Phe Glu Thr Phe Lys Lys Met Gly Lys
1055 1060 1065
Gln Thr Gly Ile Ile Tyr Tyr Val Pro Ala Gly Phe Thr Ser Lys
1070 1075 1080
Ile Cys Pro Val Thr Gly Phe Val Asn Gln Leu Tyr Pro Lys Tyr
1085 1090 1095
Glu Ser Val Ser Lys Ser Gln Glu Phe Phe Ser Lys Phe Asp Lys
1100 1105 1110
Ile Cys Tyr Asn Leu Asp Lys Gly Tyr Phe Glu Phe Ser Phe Asp
1115 1120 1125
Tyr Lys Asn Phe Gly Asp Lys Ala Ala Lys Gly Lys Trp Thr Ile
1130 1135 1140
Ala Ser Phe Gly Ser Arg Leu Ile Asn Phe Arg Asn Ser Asp Lys
1145 1150 1155
Asn His Asn Trp Asp Thr Arg Glu Val Tyr Pro Thr Lys Glu Leu
1160 1165 1170
Glu Lys Leu Leu Lys Asp Tyr Ser Ile Glu Tyr Gly His Gly Glu
1175 1180 1185
Cys Ile Lys Ala Ala Ile Cys Gly Glu Ser Asp Lys Lys Phe Phe
1190 1195 1200
Ala Lys Leu Thr Ser Val Leu Asn Thr Ile Leu Gln Met Arg Asn
1205 1210 1215
Ser Lys Thr Gly Thr Glu Leu Asp Tyr Leu Ile Ser Pro Val Ala
1220 1225 1230
Asp Val Asn Gly Asn Phe Phe Asp Ser Arg Gln Ala Pro Lys Asn
1235 1240 1245
Met Pro Gln Asp Ala Asp Ala Asn Gly Ala Tyr His Ile Gly Leu
1250 1255 1260
Lys Gly Leu Met Leu Leu Gly Arg Ile Lys Asn Asn Gln Glu Gly
1265 1270 1275
Lys Lys Leu Asn Leu Val Ile Lys Asn Glu Glu Tyr Phe Glu Phe
1280 1285 1290
Val Gln Asn Arg Asn Asn
1295
<210> 2
<211> 1299
<212> PRT
<213> Artificial sequence
<400> 2
Met Ser Ile Tyr Gln Glu Phe Val Asn Lys Tyr Ser Leu Ser Lys Thr
1 5 10 15
Leu Arg Phe Glu Leu Ile Pro Gln Gly Lys Thr Leu Glu Asn Ile Lys
20 25 30
Ala Arg Gly Leu Ile Leu Asp Asp Glu Lys Arg Ala Lys Asp Tyr Lys
35 40 45
Lys Ala Lys Gln Ile Ile Asp Lys Tyr His Gln Phe Phe Ile Glu Glu
50 55 60
Ile Leu Ser Ser Val Cys Ile Ser Glu Asp Leu Leu Gln Asn Tyr Ser
65 70 75 80
Asp Val Tyr Phe Lys Leu Lys Lys Ser Asp Asp Asp Asn Leu Gln Lys
85 90 95
Asp Phe Lys Ser Ala Lys Asp Thr Ile Lys Lys Gln Ile Ser Glu Tyr
100 105 110
Ile Lys Asp Ser Glu Lys Phe Lys Asn Leu Phe Asn Gln Asn Leu Ile
115 120 125
Asp Ala Lys Lys Gly Gln Glu Ser Asp Leu Ile Leu Trp Leu Lys Gln
130 135 140
Ser Lys Asp Asn Gly Ile Glu Leu Phe Lys Ala Asn Ser Asp Ile Thr
145 150 155 160
Asp Ile Asp Glu Ala Leu Glu Ile Ile Lys Ser Phe Lys Gly Trp Thr
165 170 175
Thr Tyr Phe Lys Gly Phe His Glu Asn Arg Lys Asn Val Tyr Ser Ser
180 185 190
Asn Asp Ile Pro Thr Ser Ile Ile Tyr Arg Ile Val Asp Asp Asn Leu
195 200 205
Pro Lys Phe Leu Glu Asn Lys Ala Lys Tyr Glu Ser Leu Lys Asp Lys
210 215 220
Ala Pro Glu Ala Ile Asn Tyr Glu Gln Ile Lys Lys Asp Leu Ala Glu
225 230 235 240
Glu Leu Thr Phe Asp Ile Asp Tyr Lys Thr Ser Glu Val Asn Gln Arg
245 250 255
Val Phe Ser Leu Asp Glu Val Phe Glu Ile Ala Asn Phe Asn Asn Tyr
260 265 270
Leu Asn Gln Ser Gly Ile Thr Lys Phe Asn Thr Ile Ile Gly Gly Lys
275 280 285
Phe Val Asn Gly Glu Asn Thr Lys Arg Lys Gly Ile Asn Glu Tyr Ile
290 295 300
Asn Leu Tyr Ser Gln Gln Ile Asn Asp Lys Thr Leu Lys Lys Tyr Lys
305 310 315 320
Met Ser Val Leu Phe Lys Gln Ile Leu Ser Asp Thr Glu Ser Lys Ser
325 330 335
Phe Val Ile Asp Lys Leu Glu Asp Asp Ser Asp Val Val Thr Thr Met
340 345 350
Gln Ser Phe Tyr Glu Gln Ile Ala Ala Phe Lys Thr Val Glu Glu Lys
355 360 365
Ser Ile Lys Glu Thr Leu Ser Leu Leu Phe Asp Asp Leu Lys Ala Gln
370 375 380
Lys Leu Asp Leu Ser Lys Ile Tyr Phe Lys Asn Asp Lys Ser Leu Thr
385 390 395 400
Asp Leu Ser Gln Gln Val Phe Asp Asp Tyr Ser Val Ile Gly Thr Ala
405 410 415
Val Leu Glu Tyr Ile Thr Gln Gln Ile Ala Pro Lys Asn Leu Asp Asn
420 425 430
Pro Ser Lys Glu Gln Glu Leu Ile Ala Lys Lys Thr Glu Lys Ala Lys
435 440 445
Tyr Leu Ser Leu Glu Thr Ile Lys Leu Ala Leu Glu Glu Phe Asn Lys
450 455 460
His Arg Asp Ile Asp Lys Gln Cys Arg Phe Glu Glu Ile Leu Ala Asn
465 470 475 480
Phe Ala Ala Ile Pro Met Ile Phe Asp Glu Ile Ala Gln Asn Lys Asp
485 490 495
Asn Leu Ala Gln Ile Ser Ile Lys Tyr Gln Asn Gln Gly Lys Lys Asp
500 505 510
Leu Leu Gln Ala Ser Ala Glu Asp Asp Val Lys Ala Ile Lys Asp Leu
515 520 525
Leu Asp Gln Thr Asn Asn Leu Leu His Lys Leu Lys Ile Phe His Ile
530 535 540
Ser Gln Ser Glu Asp Lys Ala Asn Ile Leu Asp Lys Asp Glu His Phe
545 550 555 560
Tyr Leu Val Phe Glu Glu Cys Tyr Phe Glu Leu Ala Asn Ile Val Pro
565 570 575
Leu Tyr Asn Lys Ile Arg Asn Tyr Ile Thr Gln Lys Pro Tyr Ser Asp
580 585 590
Glu Lys Phe Lys Leu Asn Phe Glu Asn Ser Thr Leu Ala Asn Gly Trp
595 600 605
Asp Lys Asn Lys Glu Pro Asp Asn Thr Ala Ile Leu Phe Ile Lys Asp
610 615 620
Asp Lys Tyr Tyr Leu Gly Val Met Asn Lys Lys Asn Asn Lys Ile Phe
625 630 635 640
Asp Asp Lys Ala Ile Lys Glu Asn Lys Gly Glu Gly Tyr Lys Lys Ile
645 650 655
Val Tyr Lys Leu Leu Pro Gly Ala Asn Lys Met Leu Pro Lys Val Phe
660 665 670
Phe Ser Ala Lys Ser Ile Lys Phe Tyr Asn Pro Ser Glu Asp Ile Leu
675 680 685
Arg Ile Arg Asn His Ser Thr His Thr Lys Asn Gly Ser Pro Gln Lys
690 695 700
Gly Tyr Glu Lys Phe Glu Phe Asn Ile Glu Asp Cys Arg Lys Phe Ile
705 710 715 720
Asp Phe Tyr Lys Gln Ser Ile Ser Lys His Pro Glu Trp Lys Asp Phe
725 730 735
Gly Phe Arg Phe Ser Asp Thr Gln Arg Tyr Asn Ser Ile Asp Glu Phe
740 745 750
Tyr Arg Glu Val Glu Asn Gln Gly Tyr Lys Leu Thr Phe Glu Asn Ile
755 760 765
Ser Glu Ser Tyr Ile Asp Ser Val Val Asn Gln Gly Lys Leu Tyr Leu
770 775 780
Phe Gln Ile Tyr Asn Lys Asp Phe Ser Ala Tyr Ser Lys Gly Arg Pro
785 790 795 800
Asn Leu His Thr Leu Tyr Trp Lys Ala Leu Phe Asp Glu Arg Asn Leu
805 810 815
Gln Asp Val Val Tyr Lys Leu Asn Gly Glu Ala Glu Leu Phe Tyr Arg
820 825 830
Lys Gln Ser Ile Pro Lys Lys Ile Thr His Pro Ala Lys Glu Ala Ile
835 840 845
Ala Asn Lys Asn Lys Asp Asn Pro Lys Lys Glu Ser Val Phe Glu Tyr
850 855 860
Asp Leu Ile Lys Asp Lys Arg Phe Thr Glu Asp Lys Phe Phe Phe His
865 870 875 880
Cys Pro Ile Thr Ile Asn Phe Lys Ser Ser Gly Ala Asn Lys Phe Asn
885 890 895
Asp Glu Ile Asn Leu Leu Leu Lys Glu Lys Ala Asn Asp Val His Ile
900 905 910
Leu Ser Ile Asp Arg Gly Glu Arg His Leu Ala Tyr Tyr Thr Leu Val
915 920 925
Asp Gly Lys Gly Asn Ile Ile Lys Gln Asp Thr Phe Asn Ile Ile Gly
930 935 940
Asn Asp Arg Met Lys Thr Asn Tyr His Asp Lys Leu Ala Ala Ile Glu
945 950 955 960
Lys Asp Arg Asp Ser Ala Arg Lys Asp Trp Lys Lys Ile Asn Asn Ile
965 970 975
Lys Glu Met Lys Glu Gly Tyr Leu Ser Gln Val Val His Glu Ile Ala
980 985 990
Lys Leu Val Ile Glu Tyr Asn Ala Ile Val Val Phe Glu Asp Leu Asn
995 1000 1005
Phe Gly Phe Lys Arg Gly Arg Phe Lys Val Glu Lys Gln Val Tyr
1010 1015 1020
Gln Lys Leu Glu Lys Met Leu Ile Glu Lys Leu Asn Tyr Leu Val
1025 1030 1035
Phe Lys Asp Asn Glu Phe Asp Lys Thr Gly Gly Val Leu Arg Ala
1040 1045 1050
Tyr Gln Leu Thr Ala Pro Phe Glu Thr Phe Lys Lys Met Gly Arg
1055 1060 1065
Gln Thr Gly Ile Ile Tyr Tyr Val Pro Ala Gly Phe Thr Ser Lys
1070 1075 1080
Ile Cys Pro Val Thr Gly Phe Val Asn Gln Leu Tyr Pro Lys Tyr
1085 1090 1095
Glu Ser Val Ser Lys Ser Gln Glu Phe Phe Ser Lys Phe Asp Lys
1100 1105 1110
Ile Cys Tyr Asn Leu Asp Lys Gly Tyr Phe Glu Phe Ser Phe Asp
1115 1120 1125
Tyr Lys Asn Phe Gly Asp Lys Ala Ala Lys Gly Lys Trp Thr Ile
1130 1135 1140
Ala Ser Phe Gly Ser Arg Leu Ile Asn Phe Arg Asn Ser Asp Lys
1145 1150 1155
Asn His Asn Trp Asp Thr Arg Glu Val Tyr Pro Thr Lys Glu Leu
1160 1165 1170
Glu Lys Leu Leu Lys Asp Tyr Ser Ile Glu Tyr Gly His Gly Glu
1175 1180 1185
Cys Ile Lys Ala Ala Ile Cys Gly Glu Ser Asp Lys Lys Phe Phe
1190 1195 1200
Ala Lys Leu Thr Ser Val Leu Asn Thr Ile Leu Gln Met Arg Asn
1205 1210 1215
Ser Lys Thr Gly Thr Glu Leu Asp Tyr Leu Ile Ser Pro Val Ala
1220 1225 1230
Asp Val Asn Gly Asn Phe Phe Asp Ser Arg Gln Ala Pro Lys Asn
1235 1240 1245
Met Pro Gln Asp Ala Asp Ala Asn Gly Ala Tyr His Ile Gly Leu
1250 1255 1260
Lys Gly Leu Met Leu Leu Gly Arg Ile Lys Asn Asn Gln Glu Gly
1265 1270 1275
Lys Lys Leu Asn Leu Val Ile Lys Asn Glu Glu Tyr Phe Glu Phe
1280 1285 1290
Val Gln Asn Arg Asn Asn
1295
<210> 3
<211> 1299
<212> PRT
<213> Artificial sequence
<400> 3
Met Ser Ile Tyr Gln Glu Phe Val Asn Lys Tyr Ser Leu Ser Lys Thr
1 5 10 15
Leu Arg Phe Glu Leu Ile Pro Gln Gly Lys Thr Leu Glu Asn Ile Lys
20 25 30
Ala Arg Gly Leu Ile Leu Asp Asp Glu Lys Arg Ala Lys Asp Tyr Lys
35 40 45
Lys Ala Lys Gln Ile Ile Asp Lys Tyr His Gln Phe Phe Ile Glu Glu
50 55 60
Ile Leu Ser Ser Val Cys Ile Ser Glu Asp Leu Leu Gln Asn Tyr Ser
65 70 75 80
Asp Val Tyr Phe Lys Leu Lys Lys Ser Asp Asp Asp Asn Leu Gln Lys
85 90 95
Asp Phe Lys Ser Ala Lys Asp Thr Ile Lys Lys Gln Ile Ser Glu Tyr
100 105 110
Ile Lys Asp Ser Glu Lys Phe Lys Asn Leu Phe Asn Gln Asn Leu Ile
115 120 125
Asp Ala Lys Lys Gly Gln Glu Ser Asp Leu Ile Leu Trp Leu Lys Gln
130 135 140
Ser Lys Asp Asn Gly Ile Glu Leu Phe Lys Ala Asn Ser Asp Ile Thr
145 150 155 160
Asp Ile Asp Glu Ala Leu Glu Ile Ile Lys Ser Phe Lys Gly Trp Thr
165 170 175
Thr Tyr Phe Lys Gly Phe His Glu Asn Arg Lys Asn Val Tyr Ser Ser
180 185 190
Asn Asp Ile Pro Thr Ser Ile Ile Tyr Arg Ile Val Asp Asp Asn Leu
195 200 205
Pro Lys Phe Leu Glu Asn Lys Ala Lys Tyr Glu Ser Leu Lys Asp Lys
210 215 220
Ala Pro Glu Ala Ile Asn Tyr Glu Gln Ile Lys Lys Asp Leu Ala Glu
225 230 235 240
Glu Leu Thr Phe Asp Ile Asp Tyr Lys Thr Ser Glu Val Asn Gln Arg
245 250 255
Val Phe Ser Leu Asp Glu Val Phe Glu Ile Ala Asn Phe Asn Asn Tyr
260 265 270
Leu Asn Gln Ser Gly Ile Thr Lys Phe Asn Thr Ile Ile Gly Gly Lys
275 280 285
Phe Val Asn Gly Glu Asn Thr Lys Arg Lys Gly Ile Asn Glu Tyr Ile
290 295 300
Asn Leu Tyr Ser Gln Gln Ile Asn Asp Lys Thr Leu Lys Lys Tyr Lys
305 310 315 320
Met Ser Val Leu Phe Lys Gln Ile Leu Ser Asp Thr Glu Ser Lys Ser
325 330 335
Phe Val Ile Asp Lys Leu Glu Asp Asp Ser Asp Val Val Thr Thr Met
340 345 350
Gln Ser Phe Tyr Glu Gln Ile Ala Ala Phe Lys Thr Val Glu Glu Lys
355 360 365
Ser Ile Lys Glu Thr Leu Ser Leu Leu Phe Asp Asp Leu Lys Ala Gln
370 375 380
Lys Leu Asp Leu Ser Lys Ile Tyr Phe Lys Asn Asp Lys Ser Leu Thr
385 390 395 400
Asp Leu Ser Gln Gln Val Phe Asp Asp Tyr Ser Val Ile Gly Thr Ala
405 410 415
Val Leu Glu Tyr Ile Thr Gln Gln Ile Ala Pro Lys Asn Leu Asp Asn
420 425 430
Pro Ser Lys Glu Gln Glu Leu Ile Ala Lys Lys Thr Glu Lys Ala Lys
435 440 445
Tyr Leu Ser Leu Glu Thr Ile Lys Leu Ala Leu Glu Glu Phe Asn Lys
450 455 460
His Arg Asp Ile Asp Lys Gln Cys Arg Phe Glu Glu Ile Leu Ala Asn
465 470 475 480
Phe Ala Ala Ile Pro Met Ile Phe Asp Glu Ile Ala Gln Asn Lys Asp
485 490 495
Asn Leu Ala Gln Ile Ser Ile Lys Tyr Gln Asn Gln Gly Lys Lys Asp
500 505 510
Leu Leu Gln Ala Ser Ala Glu Asp Asp Val Lys Ala Ile Lys Asp Leu
515 520 525
Leu Asp Gln Thr Asn Asn Leu Leu His Lys Leu Lys Ile Phe His Ile
530 535 540
Ser Gln Ser Glu Asp Lys Ala Asn Ile Leu Asp Lys Asp Glu His Phe
545 550 555 560
Tyr Leu Val Phe Glu Glu Cys Tyr Phe Glu Leu Ala Asn Ile Val Pro
565 570 575
Leu Tyr Asn Lys Ile Arg Asn Tyr Ile Thr Gln Lys Pro Tyr Ser Asp
580 585 590
Glu Lys Phe Lys Leu Asn Phe Glu Asn Ser Thr Leu Ala Asn Gly Trp
595 600 605
Asp Lys Asn Lys Glu Pro Asp Asn Thr Ala Ile Leu Phe Ile Lys Asp
610 615 620
Asp Lys Tyr Tyr Leu Gly Val Met Asn Lys Lys Asn Asn Lys Ile Phe
625 630 635 640
Asp Asp Lys Ala Ile Lys Glu Asn Lys Gly Glu Gly Tyr Lys Lys Ile
645 650 655
Val Tyr Lys Leu Leu Pro Gly Ala Asn Lys Met Leu Pro Lys Val Phe
660 665 670
Phe Ser Ala Lys Ser Ile Lys Phe Tyr Asn Pro Ser Glu Asp Ile Leu
675 680 685
Arg Ile Arg Asn His Ser Thr His Thr Lys Asn Gly Ser Pro Gln Lys
690 695 700
Gly Tyr Glu Lys Phe Glu Phe Asn Ile Glu Asp Cys Arg Lys Phe Ile
705 710 715 720
Asp Phe Tyr Lys Gln Ser Ile Ser Lys His Pro Glu Trp Lys Asp Phe
725 730 735
Gly Phe Arg Phe Ser Asp Thr Gln Arg Tyr Asn Ser Ile Asp Glu Phe
740 745 750
Tyr Arg Glu Val Glu Asn Gln Gly Tyr Lys Leu Thr Phe Glu Asn Ile
755 760 765
Ser Glu Ser Tyr Ile Asp Ser Val Val Asn Gln Gly Lys Leu Tyr Leu
770 775 780
Phe Gln Ile Tyr Asn Lys Asp Phe Ser Ala Tyr Ser Lys Gly Arg Pro
785 790 795 800
Asn Leu His Thr Leu Tyr Trp Lys Ala Leu Phe Asp Glu Arg Asn Leu
805 810 815
Gln Asp Val Val Tyr Lys Leu Asn Gly Glu Ala Glu Leu Phe Tyr Arg
820 825 830
Lys Gln Ser Ile Pro Lys Lys Ile Thr His Pro Ala Lys Glu Ala Ile
835 840 845
Ala Asn Lys Asn Lys Asp Asn Pro Lys Lys Glu Ser Val Phe Glu Tyr
850 855 860
Asp Leu Ile Lys Asp Lys Arg Phe Thr Glu Asp Lys Phe Phe Phe His
865 870 875 880
Cys Pro Ile Thr Ile Asn Phe Lys Ser Ser Gly Ala Asn Lys Phe Asn
885 890 895
Asp Glu Ile Asn Leu Leu Leu Lys Glu Lys Ala Asn Asp Val His Ile
900 905 910
Leu Ser Ile Asp Arg Gly Glu Arg His Leu Ala Tyr Tyr Thr Leu Val
915 920 925
Asp Gly Lys Gly Asn Ile Ile Lys Gln Asp Thr Phe Asn Ile Ile Gly
930 935 940
Asn Asp Arg Met Lys Thr Asn Tyr His Asp Lys Leu Ala Ala Ile Glu
945 950 955 960
Lys Asp Arg Asp Ser Ala Arg Lys Asp Trp Lys Lys Ile Asn Asn Ile
965 970 975
Lys Glu Met Lys Glu Gly Tyr Leu Ser Gln Val Val His Glu Ile Ala
980 985 990
Lys Leu Val Ile Glu Tyr Asn Ala Ile Val Val Phe Glu Asp Leu Asn
995 1000 1005
Phe Gly Phe Lys Arg Gly Arg Phe Lys Val Glu Lys Gln Val Tyr
1010 1015 1020
Gln Lys Leu Glu Lys Met Leu Ile Glu Lys Leu Asn Tyr Leu Val
1025 1030 1035
Phe Lys Asp Asn Glu Phe Asp Lys Thr Gly Gly Val Leu Arg Ala
1040 1045 1050
Tyr Gln Leu Thr Ala Pro Phe Glu Thr Phe Lys Lys Met Gly Lys
1055 1060 1065
Gln Thr Gly Ile Ile Tyr Tyr Val Pro Ala Gly Arg Thr Ser Lys
1070 1075 1080
Ile Cys Pro Val Thr Gly Phe Val Asn Gln Leu Tyr Pro Lys Tyr
1085 1090 1095
Glu Ser Val Ser Lys Ser Gln Glu Phe Phe Ser Lys Phe Asp Lys
1100 1105 1110
Ile Cys Tyr Asn Leu Asp Lys Gly Tyr Phe Glu Phe Ser Phe Asp
1115 1120 1125
Tyr Lys Asn Phe Gly Asp Lys Ala Ala Lys Gly Lys Trp Thr Ile
1130 1135 1140
Ala Ser Phe Gly Ser Arg Leu Ile Asn Phe Arg Asn Ser Asp Lys
1145 1150 1155
Asn His Asn Trp Asp Thr Arg Glu Val Tyr Pro Thr Lys Glu Leu
1160 1165 1170
Glu Lys Leu Leu Lys Asp Tyr Ser Ile Glu Tyr Gly His Gly Glu
1175 1180 1185
Cys Ile Lys Ala Ala Ile Cys Gly Glu Ser Asp Lys Lys Phe Phe
1190 1195 1200
Ala Lys Leu Thr Ser Val Leu Asn Thr Ile Leu Gln Met Arg Asn
1205 1210 1215
Ser Lys Thr Gly Thr Glu Leu Asp Tyr Leu Ile Ser Pro Val Ala
1220 1225 1230
Asp Val Asn Gly Asn Phe Phe Asp Ser Arg Gln Ala Pro Lys Asn
1235 1240 1245
Met Pro Gln Asp Ala Asp Ala Asn Gly Ala Tyr His Ile Gly Leu
1250 1255 1260
Lys Gly Leu Met Leu Leu Gly Arg Ile Lys Asn Asn Gln Glu Gly
1265 1270 1275
Lys Lys Leu Asn Leu Val Ile Lys Asn Glu Glu Tyr Phe Glu Phe
1280 1285 1290
Val Gln Asn Arg Asn Asn
1295
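For readers who want to work with the listing programmatically, the following is a minimal sketch of turning the three-letter residue rows into a one-letter string. The helper name `parse_protein_listing` and the excerpt variable are illustrative assumptions, not part of the patent. The excerpt is the first two residue rows of SEQ ID NO: 3 as printed above.

```python
# Sketch only: helper and variable names are hypothetical, not taken from the patent.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def parse_protein_listing(lines):
    """Join residue rows into a one-letter string, skipping numbering rows."""
    residues = []
    for line in lines:
        tokens = line.split()
        # Numbering rows consist only of digits; residue rows hold three-letter codes.
        if not tokens or all(t.isdigit() for t in tokens):
            continue
        residues.extend(THREE_TO_ONE[t] for t in tokens)
    return "".join(residues)


# First two residue rows (and their numbering rows) of SEQ ID NO: 3.
listing_excerpt = [
    "Met Ser Ile Tyr Gln Glu Phe Val Asn Lys Tyr Ser Leu Ser Lys Thr",
    "1 5 10 15",
    "Leu Arg Phe Glu Leu Ile Pro Gln Gly Lys Thr Leu Glu Asn Ile Lys",
    "20 25 30",
]
one_letter = parse_protein_listing(listing_excerpt)
print(one_letter)       # MSIYQEFVNKYSLSKTLRFELIPQGKTLENIK
print(len(one_letter))  # 32
```

An unknown three-letter code raises `KeyError`, which is a deliberate choice here so that extraction damage in a listing is noticed rather than silently skipped.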
<210> 4
<211> 3903
<212> DNA
<213> 人工序列(artificial sequence)
<400> 4
atgagcatct atcaggagtt cgtgaataag tacagcctgt ccaagaccct gcggtttgag 60
ctgatccccc agggcaagac actggagaac atcaaggcca ggggcctgat cctggacgat 120
gagaagcgcg ccaaggacta taagaaggcc aagcagatca tcgataagta ccaccagttc 180
tttatcgagg agatcctgag cagcgtgtgc atctctgagg atctgctgca gaattacagc 240
gacgtgtatt tcaagctgaa gaagtctgac gatgacaacc tgcagaagga cttcaagagc 300
gccaaggaca ccatcaagaa gcagatcagc gagtatatca aggactccga gaagtttaag 360
aatctgttca accagaatct gatcgatgcc aagaagggcc aggagtccga cctgatcctg 420
tggctgaagc agtctaagga caatggcatc gagctgttca aggccaactc tgatatcacc 480
gatatcgacg aggccctgga gatcatcaag agctttaagg gctggaccac atactttaag 540
ggcttccacg agaacaggaa gaacgtgtac agcagcaacg acatccctac aagcatcatc 600
taccgcatcg tggatgacaa tctgccaaag ttcctggaga acaaggccaa gtatgagtcc 660
ctgaaggaca aggcccccga ggccatcaat tacgagcaga tcaagaagga tctggccgag 720
gagctgacct tcgatatcga ctataagaca tccgaggtga accagcgggt gttttctctg 780
gacgaggtgt ttgagatcgc caatttcaac aattacctga accagtccgg catcaccaag 840
ttcaatacaa tcatcggcgg caagtttgtg aacggcgaga ataccaagag aaagggcatc 900
aacgagtaca tcaatctgta tagccagcag atcaacgaca agaccctgaa gaagtacaag 960
atgagcgtgc tgttcaagca gatcctgtcc gatacagagt ctaagagctt tgtgatcgat 1020
aagctggagg atgactctga cgtggtgacc acaatgcaga gcttttatga gcagatcgcc 1080
gccttcaaga ccgtggagga gaagtctatc aaggagacac tgagcctgct gttcgatgac 1140
ctgaaggccc agaagctgga cctgtctaag atctacttca agaacgataa gtccctgacc 1200
gacctgtctc agcaggtgtt tgatgactat agcgtgatcg gcaccgccgt gctggagtac 1260
atcacacagc agatcgcccc aaagaacctg gataatccct ctaagaagga gcaggagctg 1320
atcgccaaga agaccgagaa ggccaagtat ctgagcctgg agacaatcaa gctggccctg 1380
gaggagttca ataagcaccg ggatatcgac aagcagtgca gatttgagga gatcctggcc 1440
aacttcgccg ccatccccat gatctttgat gagatcgccc agaacaagga caatctggcc 1500
cagatctcca tcaagtacca gaaccagggc aagaaggacc tgctgcaggc ctctgccgag 1560
gatgacgtga aggccatcaa ggatctgctg gaccagacca acaatctgct gcacaagctg 1620
aagatcttcc acatctccca gtctgaggat aaggccaata tcctggataa ggacgagcac 1680
ttttatctgg tgttcgagga gtgttacttc gagctggcca acatcgtgcc cctgtacaac 1740
aagatcagaa attatatcac acagaagcct tactccgacg agaagtttaa gctgaacttc 1800
gagaacagca ccctggccaa cggctgggat aagaataagg agcctgacaa cacagccatc 1860
ctgttcatca aggatgacaa gtactatctg ggcgtgatga ataagaagaa caataagatc 1920
ttcgatgaca aggccatcaa ggagaacaag ggcgagggct acaagaagat cgtgtataag 1980
ctgctgcccg gcgccaataa gatgctgcct aaggtgttct tttccgccaa gtctatcaag 2040
ttctacaacc catccgagga catcctgcgg atcagaaatc actccaccca cacaaagaac 2100
ggctctcccc agaagggcta tgagaagttt gagttcaata tcgaggattg ccggaagttt 2160
atcgacttct acaagcagag catctccaag caccctgagt ggaaggattt tggcttcagg 2220
tttagcgaca cccagcggta caactccatc gacgagttct acagagaggt ggagaatcag 2280
ggctataagc tgacatttga gaacatctct gagagctaca tcgacagcgt ggtgaatcag 2340
ggcaagctgt acctgttcca gatctataac aaggacttca gcgcctattc caagggccgg 2400
ccaaacctgc acaccctgta ctggaaggcc ctgttcgatg agagaaatct gcaggacgtg 2460
gtgtataagc tgaacggcga ggccgagctg ttttacagga agcagtccat ccctaagaag 2520
atcacacacc cagccaagga ggccatcgcc aacaagaata aggacaatcc taagaaggag 2580
agcgtgttcg agtacgatct gatcaaggac aagcggttca ccgaggataa gttctttttc 2640
cactgtccaa tcacaatcaa cttcaagtcc tctggcgcca acaagtttaa tgacgagatc 2700
aatctgctgc tgaaggagaa ggccaacgat gtgcacatcc tgagcatcga ccggggcgag 2760
agacacctgg cctactatac cctggtggat ggcaagggca atatcatcaa gcaggatacc 2820
ttcaacatca tcggcaatga caggatgaag acaaactacc acgataagct ggccgccatc 2880
gagaaggata gggactccgc ccgcaaggac tggaagaaga tcaacaatat caaggagatg 2940
aaggagggct atctgtctca ggtggtgcac gagatcgcca agctggtcat cgagtacaat 3000
gccatcgtgg tgttcgagga tctgaacttc ggctttaaga ggggccgctt taaggtggag 3060
aagcaggtgt atcagaagct ggagaagatg ctgatcgaga agctgaatta cctggtgttt 3120
aaggataacg agttcgacaa gaccggaggc gtgctgaggg cataccagct gaccgccccc 3180
tttgagacat tcaagaagat gggcaggcag acaggcatca tctactatgt gccagccggc 3240
ttcacctcca agatctgccc cgtgacaggc tttgtgaacc agctgtaccc taagtatgag 3300
tccgtgtcta agagccagga gtttttcagc aagttcgata agatctgtta taatctggac 3360
aagggctact tcgagttttc cttcgattat aagaactttg gcgacaaggc cgccaagggc 3420
aagtggacca tcgcctcttt cggcagccgg ctgatcaact ttagaaattc cgataagaac 3480
cacaattggg acacccggga ggtgtaccca acaaaggagc tggagaagct gctgaaggac 3540
tacagcatcg agtatggcca cggcgagtgc atcaaggccg ccatctgtgg cgagagcgat 3600
aagaagtttt tcgccaagct gacctccgtg ctgaatacaa tcctgcagat gcggaacagc 3660
aagaccggca cagagctgga ctacctgatc tcccccgtgg ccgatgtgaa cggcaacttc 3720
ttcgacagca gacaggcccc caagaatatg cctcaggatg ccgacgccaa cggcgcctat 3780
cacatcggcc tgaagggcct gatgctgctg ggcaggatca agaacaatca ggagggcaag 3840
aagctgaacc tggtcatcaa gaacgaggag tactttgagt tcgtgcagaa ccgcaacaat 3900
tga 3903
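The DNA rows above carry a running nucleotide count at the end of each row. As a hedged illustration, the sketch below joins such rows, checks each running count, and translates the result with the standard genetic code. The function names are assumptions made for this example. The excerpt is the first two rows of SEQ ID NO: 4.

```python
# Sketch only: function names are hypothetical; the table is the standard genetic code.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def parse_dna_listing(lines):
    """Concatenate rows such as 'atgagcatct ... 60', verifying each running count."""
    seq = ""
    for line in lines:
        *blocks, count = line.split()
        seq += "".join(blocks)
        if len(seq) != int(count):
            raise ValueError(f"row ending at {count} gives length {len(seq)}")
    return seq


def translate(dna):
    """Translate complete codons from the first base; '*' marks a stop codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First two rows of SEQ ID NO: 4 as printed in the listing.
excerpt = [
    "atgagcatct atcaggagtt cgtgaataag tacagcctgt ccaagaccct gcggtttgag 60",
    "ctgatccccc agggcaagac actggagaac atcaaggcca ggggcctgat cctggacgat 120",
]
dna = parse_dna_listing(excerpt)
protein = translate(dna)
print(len(dna))  # 120
print(protein)   # MSIYQEFVNKYSLSKTLRFELIPQGKTLENIKARGLILDD
```

The translated excerpt agrees with the opening residues of the protein listing of SEQ ID NO: 3 (Met Ser Ile Tyr Gln Glu ...), which is a quick way to confirm that the rows were read in the intended frame.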
<210> 5
<211> 3903
<212> DNA
<213> 人工序列(artificial sequence)
<400> 5
atgagcatct atcaggagtt cgtgaataag tacagcctgt ccaagaccct gcggtttgag 60
ctgatccccc agggcaagac actggagaac atcaaggcca ggggcctgat cctggacgat 120
gagaagcgcg ccaaggacta taagaaggcc aagcagatca tcgataagta ccaccagttc 180
tttatcgagg agatcctgag cagcgtgtgc atctctgagg atctgctgca gaattacagc 240
gacgtgtatt tcaagctgaa gaagtctgac gatgacaacc tgcagaagga cttcaagagc 300
gccaaggaca ccatcaagaa gcagatcagc gagtatatca aggactccga gaagtttaag 360
aatctgttca accagaatct gatcgatgcc aagaagggcc aggagtccga cctgatcctg 420
tggctgaagc agtctaagga caatggcatc gagctgttca aggccaactc tgatatcacc 480
gatatcgacg aggccctgga gatcatcaag agctttaagg gctggaccac atactttaag 540
ggcttccacg agaacaggaa gaacgtgtac agcagcaacg acatccctac aagcatcatc 600
taccgcatcg tggatgacaa tctgccaaag ttcctggaga acaaggccaa gtatgagtcc 660
ctgaaggaca aggcccccga ggccatcaat tacgagcaga tcaagaagga tctggccgag 720
gagctgacct tcgatatcga ctataagaca tccgaggtga accagcgggt gttttctctg 780
gacgaggtgt ttgagatcgc caatttcaac aattacctga accagtccgg catcaccaag 840
ttcaatacaa tcatcggcgg caagtttgtg aacggcgaga ataccaagag aaagggcatc 900
aacgagtaca tcaatctgta tagccagcag atcaacgaca agaccctgaa gaagtacaag 960
atgagcgtgc tgttcaagca gatcctgtcc gatacagagt ctaagagctt tgtgatcgat 1020
aagctggagg atgactctga cgtggtgacc acaatgcaga gcttttatga gcagatcgcc 1080
gccttcaaga ccgtggagga gaagtctatc aaggagacac tgagcctgct gttcgatgac 1140
ctgaaggccc agaagctgga cctgtctaag atctacttca agaacgataa gtccctgacc 1200
gacctgtctc agcaggtgtt tgatgactat agcgtgatcg gcaccgccgt gctggagtac 1260
atcacacagc agatcgcccc aaagaacctg gataatccct ctaagaagga gcaggagctg 1320
atcgccaaga agaccgagaa ggccaagtat ctgagcctgg agacaatcaa gctggccctg 1380
gaggagttca ataagcaccg ggatatcgac aagcagtgca gatttgagga gatcctggcc 1440
aacttcgccg ccatccccat gatctttgat gagatcgccc agaacaagga caatctggcc 1500
cagatctcca tcaagtacca gaaccagggc aagaaggacc tgctgcaggc ctctgccgag 1560
gatgacgtga aggccatcaa ggatctgctg gaccagacca acaatctgct gcacaagctg 1620
aagatcttcc acatctccca gtctgaggat aaggccaata tcctggataa ggacgagcac 1680
ttttatctgg tgttcgagga gtgttacttc gagctggcca acatcgtgcc cctgtacaac 1740
aagatcagaa attatatcac acagaagcct tactccgacg agaagtttaa gctgaacttc 1800
gagaacagca ccctggccaa cggctgggat aagaataagg agcctgacaa cacagccatc 1860
ctgttcatca aggatgacaa gtactatctg ggcgtgatga ataagaagaa caataagatc 1920
ttcgatgaca aggccatcaa ggagaacaag ggcgagggct acaagaagat cgtgtataag 1980
ctgctgcccg gcgccaataa gatgctgcct aaggtgttct tttccgccaa gtctatcaag 2040
ttctacaacc catccgagga catcctgcgg atcagaaatc actccaccca cacaaagaac 2100
ggctctcccc agaagggcta tgagaagttt gagttcaata tcgaggattg ccggaagttt 2160
atcgacttct acaagcagag catctccaag caccctgagt ggaaggattt tggcttcagg 2220
tttagcgaca cccagcggta caactccatc gacgagttct acagagaggt ggagaatcag 2280
ggctataagc tgacatttga gaacatctct gagagctaca tcgacagcgt ggtgaatcag 2340
ggcaagctgt acctgttcca gatctataac aaggacttca gcgcctattc caagggccgg 2400
ccaaacctgc acaccctgta ctggaaggcc ctgttcgatg agagaaatct gcaggacgtg 2460
gtgtataagc tgaacggcga ggccgagctg ttttacagga agcagtccat ccctaagaag 2520
atcacacacc cagccaagga ggccatcgcc aacaagaata aggacaatcc taagaaggag 2580
agcgtgttcg agtacgatct gatcaaggac aagcggttca ccgaggataa gttctttttc 2640
cactgtccaa tcacaatcaa cttcaagtcc tctggcgcca acaagtttaa tgacgagatc 2700
aatctgctgc tgaaggagaa ggccaacgat gtgcacatcc tgagcatcga ccggggcgag 2760
agacacctgg cctactatac cctggtggat ggcaagggca atatcatcaa gcaggatacc 2820
ttcaacatca tcggcaatga caggatgaag acaaactacc acgataagct ggccgccatc 2880
gagaaggata gggactccgc ccgcaaggac tggaagaaga tcaacaatat caaggagatg 2940
aaggagggct atctgtctca ggtggtgcac gagatcgcca agctggtcat cgagtacaat 3000
gccatcgtgg tgttcgagga tctgaacttc ggctttaaga ggggccgctt taaggtggag 3060
aagcaggtgt atcagaagct ggagaagatg ctgatcgaga agctgaatta cctggtgttt 3120
aaggataacg agttcgacaa gaccggaggc gtgctgaggg cataccagct gaccgccccc 3180
tttgagacat tcaagaagat gggcaagcag acaggcatca tctactatgt gccagccggc 3240
cgcacctcca agatctgccc cgtgacaggc tttgtgaacc agctgtaccc taagtatgag 3300
tccgtgtcta agagccagga gtttttcagc aagttcgata agatctgtta taatctggac 3360
aagggctact tcgagttttc cttcgattat aagaactttg gcgacaaggc cgccaagggc 3420
aagtggacca tcgcctcttt cggcagccgg ctgatcaact ttagaaattc cgataagaac 3480
cacaattggg acacccggga ggtgtaccca acaaaggagc tggagaagct gctgaaggac 3540
tacagcatcg agtatggcca cggcgagtgc atcaaggccg ccatctgtgg cgagagcgat 3600
aagaagtttt tcgccaagct gacctccgtg ctgaatacaa tcctgcagat gcggaacagc 3660
aagaccggca cagagctgga ctacctgatc tcccccgtgg ccgatgtgaa cggcaacttc 3720
ttcgacagca gacaggcccc caagaatatg cctcaggatg ccgacgccaa cggcgcctat 3780
cacatcggcc tgaagggcct gatgctgctg ggcaggatca agaacaatca ggagggcaag 3840
aagctgaacc tggtcatcaa gaacgaggag tactttgagt tcgtgcagaa ccgcaacaat 3900
tga 3903
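SEQ ID NO: 4 and SEQ ID NO: 5 are printed in full, so where they differ is not obvious by eye. The sketch below compares the two listings codon by codon over rows 3181-3300, the only rows that differ between the two listings as printed here. The function names are assumptions for illustration, and the codon numbering assumes translation starts at nucleotide 1.

```python
# Sketch only: function names are hypothetical; the table is the standard genetic code.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def strip_listing(rows):
    """Drop the trailing running count and the spaces from each listing row."""
    return "".join("".join(row.split()[:-1]) for row in rows)


def codon_differences(seq_a, seq_b, first_nt):
    """Return (codon_number, codon_a, aa_a, codon_b, aa_b) for each differing codon.

    first_nt is the 1-based position of seq_a[0] in the full sequence
    and must be the first base of a codon.
    """
    if len(seq_a) != len(seq_b) or (first_nt - 1) % 3:
        raise ValueError("sequences must be equal length and start on a codon boundary")
    first_codon = (first_nt - 1) // 3 + 1
    diffs = []
    for i in range(0, len(seq_a) - 2, 3):
        ca, cb = seq_a[i:i + 3], seq_b[i:i + 3]
        if ca != cb:
            diffs.append((first_codon + i // 3, ca, CODON_TABLE[ca], cb, CODON_TABLE[cb]))
    return diffs


# Rows covering nucleotides 3181-3300, copied from the two listings.
seq4_rows = [
    "tttgagacat tcaagaagat gggcaggcag acaggcatca tctactatgt gccagccggc 3240",
    "ttcacctcca agatctgccc cgtgacaggc tttgtgaacc agctgtaccc taagtatgag 3300",
]
seq5_rows = [
    "tttgagacat tcaagaagat gggcaagcag acaggcatca tctactatgt gccagccggc 3240",
    "cgcacctcca agatctgccc cgtgacaggc tttgtgaacc agctgtaccc taagtatgag 3300",
]
diffs = codon_differences(strip_listing(seq4_rows), strip_listing(seq5_rows), first_nt=3181)
for d in diffs:
    print(d)
# (1069, 'agg', 'R', 'aag', 'K')
# (1081, 'ttc', 'F', 'cgc', 'R')
```

Under these assumptions the two coding sequences differ at codons 1069 and 1081, the positions named in claim 1. SEQ ID NO: 4 reads Arg at 1069 where SEQ ID NO: 5 reads Lys, and SEQ ID NO: 5 reads Arg at 1081 where SEQ ID NO: 4 reads Phe. Which named variant each listing represents is an inference from this comparison, not something stated in this part of the document.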
Claims (13)
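1. A gene editing protein variant, characterized in that the variant is a non-natural protein having cis-cleavage activity, the trans-cleavage activity of the variant is reduced compared with that of its wild-type gene editing protein, and the variant is mutated at one or more core amino acid sites of the wild-type gene editing protein that are associated with cleavage activity, selected from the group consisting of:
the phenylalanine (F) site corresponding to position 1081 of FnCas12a; and/or
the lysine (K) site corresponding to position 1069 of FnCas12a.
2. A polynucleotide, characterized in that the polynucleotide encodes the variant of claim 1.
3. A vector, characterized in that the vector contains the polynucleotide of claim 2.
4. A host cell, characterized in that the host cell contains the vector of claim 3, or has the polynucleotide of claim 2 integrated into its genome.
5. A method for preparing a gene editing protein variant, characterized in that the method comprises the steps of:
(a) culturing the host cell of claim 4 under conditions suitable for expression, thereby expressing the gene editing protein variant; and
(b) isolating the gene editing protein variant.
6. An enzyme preparation, characterized in that the enzyme preparation comprises the gene editing protein variant of claim 1.
7. A gene editing system, characterized by comprising:
the gene editing protein variant of claim 1, or its coding gene or its expression vector; and
a gRNA or its expression vector, and/or an oligonucleotide, nucleic acid fragment or plasmid for break repair at the target site.
8. A composition, characterized by comprising:
the system of claim 7; and
a pharmaceutically acceptable carrier.
9. A reagent kit, characterized by comprising: the gene editing protein variant of claim 1 or the gene editing system of claim 7.
10. A pharmaceutical kit, characterized by comprising:
a first container and, located in the first container, the gene editing system of claim 7 or the composition of claim 8, or a medicament containing the gene editing system of claim 7 or the composition of claim 8.
11. A pharmaceutical kit, characterized by comprising:
(a1) a first container and, located in the first container, the gene editing protein variant of claim 1, or its coding gene or its expression vector, or a medicament containing the gene editing protein variant of claim 1, or its coding gene or its expression vector;
(b1) a second container and, located in the second container, a gRNA or its expression vector, or a medicament containing a gRNA or its expression vector.
12. Use of the gene editing protein variant of claim 1, the gene editing system of claim 7, the composition of claim 8, the reagent kit of claim 9, or the pharmaceutical kit of claim 10 or 11, characterized in that the use is for preparing a reagent or reagent kit for reducing the off-target rate of gene editing.
13. A method for reducing the off-target rate of gene editing, characterized by comprising the step of:
performing gene editing on a cell in the presence of the gene editing protein variant of claim 1, the gene editing system of claim 7, the composition of claim 8, the reagent kit of claim 9, or the pharmaceutical kit of claim 10 or 11, thereby reducing the off-target rate of gene editing.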
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
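CN202210508434.7A CN117070497A (zh) | 2022-05-10 | 2022-05-10 | Gene editing protein variant capable of reducing the off-target rate of gene editing |
PCT/CN2023/085720 WO2023216764A1 (zh) | 2022-05-10 | 2023-03-31 | Gene editing protein variant capable of reducing the off-target rate of gene editing |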
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210508434.7A CN117070497A (zh) | 2022-05-10 | 2022-05-10 | Gene editing protein variant capable of reducing the off-target rate of gene editing |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117070497A (zh) | 2023-11-17 |
Family
ID=88717707
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210508434.7A Pending CN117070497A (zh) | 2022-05-10 | 2022-05-10 | Gene editing protein variant capable of reducing the off-target rate of gene editing |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN117070497A (zh) |
WO (1) | WO2023216764A1 (zh) |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200216825A1 (en) * | 2019-01-08 | 2020-07-09 | Integrated Dna Technologies, Inc. | CAS12a MUTANT GENES AND POLYPEPTIDES ENCODED BY SAME |
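CN113811608A (zh) * | 2019-02-22 | 2021-12-17 | Integrated DNA Technologies, Inc. | Lachnospiraceae bacterium ND2006 Cas12a mutant genes and polypeptides encoded by same |
CN109825532B (zh) * | 2019-03-04 | 2019-12-10 | Kunming Institute of Botany, Chinese Academy of Sciences | Application of the CRISPR/Cas12a gene editing system in gene editing of Physcomitrella patens |
CN111235232B (zh) * | 2020-01-19 | 2022-05-27 | Huazhong Agricultural University | Visual rapid nucleic acid detection method based on the CRISPR-Cas12a system and application thereof |
CN112359057B (zh) * | 2020-10-23 | 2022-11-22 | Zhejiang University | Application of the CRISPR/Cas12a gene editing system in gene editing of 84K poplar |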
- 2022-05-10: CN CN202210508434.7A, published as CN117070497A (zh), active, Pending
- 2023-03-31: WO PCT/CN2023/085720, published as WO2023216764A1 (zh), status unknown
Also Published As
Publication number | Publication date |
---|---|
WO2023216764A1 (zh) | 2023-11-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
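CN108239633B (zh) | | Mutant of D-psicose-3-epimerase with improved catalytic activity and application thereof |
CN100587068C (zh) | | Use of multiple recombination sites with unique specificity in recombinational cloning |
JP2882775B2 (ja) | | Human glia-derived neurite factor |
CN113881652B (zh) | | Novel Cas enzymes and systems and applications |
CN114672473B (zh) | | Optimized Cas protein and application thereof |
JPH0789934B2 (ja) | | Production and expression of structural genes |
CN114380922A (zh) | | Fusion protein for generating point mutations in cells, and preparation and use thereof |
JP2001506855A (ja) | | Methods and compositions for polypeptide engineering |
JP2015535010A (ja) | | Site-specific enzymes and methods of use |
CN114410609B (zh) | | Cas protein with improved activity and application thereof |
CN109929839B (zh) | | Split single-base gene editing system and application thereof |
CN114015676B (zh) | | Method for constructing a cellulase adapted to traditional Chinese medicine feed additives |
KR102584789B1 (ko) | | ADH protein family mutants and uses thereof |
CN113604455B (zh) | | Duplex-specific nuclease variants and applications thereof |
TW201625665A (zh) | | Protein secretion factor with high secretion efficiency and expression vector comprising the same |
CN114438055B (zh) | | Novel CRISPR enzymes and systems and applications |
WO1993000353A1 (en) | | Sequences characteristic of human gene transcription product |
JPS62501538A (ja) | | Replicable expression vehicle containing the araB promoter |
US20110088121A1 (en) | | Genes for improving salt tolerance and drought tolerance of plant and the uses thereof |
CN108239632B (zh) | | Mutant of D-psicose-3-epimerase with improved thermal stability and application thereof |
US6607899B2 (en) | | Amplification-based cloning method |
CN117070497A (zh) | | Gene editing protein variant capable of reducing the off-target rate of gene editing |
CN113564145B (zh) | | Fusion protein for cytosine base editing and application thereof |
EP1670932B1 (en) | | Libraries of recombinant chimeric proteins |
CN114958893A (zh) | | Method for constructing a lactase required for preparing high-temperature creep feed for suckling pigs |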
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||