CN104093855B - Method for specifically binding and targeting DNA-RNA hybrid duplexes - Google Patents
Method for specifically binding and targeting DNA-RNA hybrid duplexes
- Publication number
- CN104093855B (application number CN201280060126.7A)
- Authority
- CN
- China
- Prior art keywords
- leu
- dna
- ala
- val
- gln
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000000034 method Methods 0.000 title claims abstract description 50
- 230000008685 targeting Effects 0.000 title abstract description 5
- 108091032973 (ribonucleotides)n+m Proteins 0.000 claims description 79
- 102100034343 Integrase Human genes 0.000 claims description 43
- 101710203526 Integrase Proteins 0.000 claims description 37
- 238000012986 modification Methods 0.000 claims description 13
- 230000004048 modification Effects 0.000 claims description 13
- 230000001629 suppression Effects 0.000 claims description 8
- 108090000604 Hydrolases Proteins 0.000 claims description 3
- 102000004157 Hydrolases Human genes 0.000 claims description 3
- 230000001225 therapeutic effect Effects 0.000 claims 4
- 108090000623 proteins and genes Proteins 0.000 abstract description 78
- 102000004169 proteins and genes Human genes 0.000 abstract description 60
- 108020004414 DNA Proteins 0.000 description 117
- 108010073062 Transcription Activator-Like Effectors Proteins 0.000 description 77
- XVZCXCTYGHPNEM-UHFFFAOYSA-N Leu-Leu-Pro Natural products CC(C)CC(N)C(=O)NC(CC(C)C)C(=O)N1CCCC1C(O)=O XVZCXCTYGHPNEM-UHFFFAOYSA-N 0.000 description 69
- LTFLDDDGWOVIHY-NAKRPEOUSA-N Val-Ala-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](C)NC(=O)[C@H](C(C)C)N LTFLDDDGWOVIHY-NAKRPEOUSA-N 0.000 description 47
- XVZCXCTYGHPNEM-IHRRRGAJSA-N (2s)-1-[(2s)-2-[[(2s)-2-amino-4-methylpentanoyl]amino]-4-methylpentanoyl]pyrrolidine-2-carboxylic acid Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(O)=O XVZCXCTYGHPNEM-IHRRRGAJSA-N 0.000 description 46
- BMOFUVHDBROBSE-DCAQKATOSA-N Val-Leu-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](C(C)C)N BMOFUVHDBROBSE-DCAQKATOSA-N 0.000 description 44
- XKUKSGPZAADMRA-UHFFFAOYSA-N glycyl-glycyl-glycine Chemical compound NCC(=O)NCC(=O)NCC(O)=O XKUKSGPZAADMRA-UHFFFAOYSA-N 0.000 description 41
- UWZLBXOBVKRUFE-HGNGGELXSA-N Gln-Ala-His Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCC(=O)N)N UWZLBXOBVKRUFE-HGNGGELXSA-N 0.000 description 40
- 230000003252 repetitive effect Effects 0.000 description 38
- LHYJCVCQPWRMKZ-WEDXCCLWSA-N Gly-Leu-Thr Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LHYJCVCQPWRMKZ-WEDXCCLWSA-N 0.000 description 34
- DFXQCCBKGUNYGG-GUBZILKMSA-N Lys-Gln-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CCCCN DFXQCCBKGUNYGG-GUBZILKMSA-N 0.000 description 33
- 102000053602 DNA Human genes 0.000 description 30
- PXHVJJICTQNCMI-UHFFFAOYSA-N Nickel Chemical compound [Ni] PXHVJJICTQNCMI-UHFFFAOYSA-N 0.000 description 30
- PRBLYKYHAJEABA-SRVKXCTJSA-N Gln-Arg-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(C)C)C(O)=O PRBLYKYHAJEABA-SRVKXCTJSA-N 0.000 description 29
- QITBQGJOXQYMOA-ZETCQYMHSA-N Gly-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)CN QITBQGJOXQYMOA-ZETCQYMHSA-N 0.000 description 29
- ZFNLIDNJUWNIJL-WDCWCFNPSA-N Leu-Glu-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O ZFNLIDNJUWNIJL-WDCWCFNPSA-N 0.000 description 28
- KLALXKYLOMZDQT-ZLUOBGJFSA-N Ala-Ser-Asn Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CC(N)=O KLALXKYLOMZDQT-ZLUOBGJFSA-N 0.000 description 27
- KVYVOGYEMPEXBT-GUBZILKMSA-N Gln-Ala-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(N)=O KVYVOGYEMPEXBT-GUBZILKMSA-N 0.000 description 27
- SOEXCCGNHQBFPV-DLOVCJGASA-N Gln-Val-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C(C)C)C(O)=O SOEXCCGNHQBFPV-DLOVCJGASA-N 0.000 description 27
- DLISPGXMKZTWQG-IFFSRLJSSA-N Glu-Thr-Val Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O DLISPGXMKZTWQG-IFFSRLJSSA-N 0.000 description 26
- 108010067216 glycyl-glycyl-glycine Proteins 0.000 description 26
- PNALXAODQKTNLV-JBDRJPRFSA-N Ala-Ile-Ala Chemical compound C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O PNALXAODQKTNLV-JBDRJPRFSA-N 0.000 description 24
- QKIBIXAQKAFZGL-GUBZILKMSA-N Leu-Cys-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)N[C@@H](CCC(N)=O)C(O)=O QKIBIXAQKAFZGL-GUBZILKMSA-N 0.000 description 24
- JDBQSGMJBMPNFT-AVGNSLFASA-N Leu-Pro-Val Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(O)=O JDBQSGMJBMPNFT-AVGNSLFASA-N 0.000 description 24
- QGVBFDIREUUSHX-IFFSRLJSSA-N Thr-Val-Gln Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O QGVBFDIREUUSHX-IFFSRLJSSA-N 0.000 description 24
- 108010050848 glycylleucine Proteins 0.000 description 24
- KLSUAWUZBMAZCL-RHYQMDGZSA-N Leu-Thr-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(O)=O KLSUAWUZBMAZCL-RHYQMDGZSA-N 0.000 description 23
- 108010015792 glycyllysine Proteins 0.000 description 23
- OKEWAFFWMHBGPT-XPUUQOCRSA-N Ala-His-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CN=CN1 OKEWAFFWMHBGPT-XPUUQOCRSA-N 0.000 description 22
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 22
- XTAUQCGQFJQGEJ-NHCYSSNCSA-N Val-Gln-Arg Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N XTAUQCGQFJQGEJ-NHCYSSNCSA-N 0.000 description 22
- NHWYNIZWLJYZAG-XVYDVKMFSA-N Ala-Ser-His Chemical compound C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N NHWYNIZWLJYZAG-XVYDVKMFSA-N 0.000 description 20
- CCDFBRZVTDDJNM-GUBZILKMSA-N Ala-Leu-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O CCDFBRZVTDDJNM-GUBZILKMSA-N 0.000 description 19
- OMMIEVATLAGRCK-BYPYZUCNSA-N Asp-Gly-Gly Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)NCC(O)=O OMMIEVATLAGRCK-BYPYZUCNSA-N 0.000 description 19
- 239000013078 crystal Substances 0.000 description 19
- 230000004568 DNA-binding Effects 0.000 description 18
- RGPWUJOMKFYFSR-QWRGUYRKSA-N His-Gly-Leu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O RGPWUJOMKFYFSR-QWRGUYRKSA-N 0.000 description 16
- 239000012634 fragment Substances 0.000 description 16
- 241000700605 Viruses Species 0.000 description 15
- 230000029087 digestion Effects 0.000 description 15
- 239000000499 gel Substances 0.000 description 15
- 229910052759 nickel Inorganic materials 0.000 description 15
- 241000894006 Bacteria Species 0.000 description 14
- 150000001413 amino acids Chemical class 0.000 description 14
- FHQRLHFYVZAQHU-IUCAKERBSA-N Gly-Lys-Gln Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(N)=O)C(O)=O FHQRLHFYVZAQHU-IUCAKERBSA-N 0.000 description 13
- 210000004027 cell Anatomy 0.000 description 13
- 241000725303 Human immunodeficiency virus Species 0.000 description 12
- 238000006243 chemical reaction Methods 0.000 description 12
- 230000008569 process Effects 0.000 description 12
- AJLVKXCNXIJHDV-CIUDSAMLSA-N Pro-Ala-Gln Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O AJLVKXCNXIJHDV-CIUDSAMLSA-N 0.000 description 11
- 238000002425 crystallisation Methods 0.000 description 11
- 230000008025 crystallization Effects 0.000 description 11
- 238000002474 experimental method Methods 0.000 description 11
- 239000011780 sodium chloride Substances 0.000 description 11
- QKNYBSVHEMOAJP-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol;hydron;chloride Chemical compound Cl.OCC(N)(CO)CO QKNYBSVHEMOAJP-UHFFFAOYSA-N 0.000 description 10
- BVFQOPGFOQVZTE-ACZMJKKPSA-N Cys-Gln-Ala Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O BVFQOPGFOQVZTE-ACZMJKKPSA-N 0.000 description 10
- 108091028043 Nucleic acid sequence Proteins 0.000 description 10
- VAUMZJHYZQXZBQ-WHFBIAKZSA-N Ser-Asn-Gly Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O VAUMZJHYZQXZBQ-WHFBIAKZSA-N 0.000 description 10
- XERQKTRGJIKTRB-CIUDSAMLSA-N Ser-His-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](CO)N)CC1=CN=CN1 XERQKTRGJIKTRB-CIUDSAMLSA-N 0.000 description 10
- 108010070944 alanylhistidine Proteins 0.000 description 10
- 108010047857 aspartylglycine Proteins 0.000 description 10
- 230000027455 binding Effects 0.000 description 10
- 230000009182 swimming Effects 0.000 description 10
- UFNSPPFJOHNXRE-AUTRQRHGSA-N Gln-Gln-Val Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O UFNSPPFJOHNXRE-AUTRQRHGSA-N 0.000 description 9
- FISHYTLIMUYTQY-GUBZILKMSA-N Pro-Gln-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H]1CCCN1 FISHYTLIMUYTQY-GUBZILKMSA-N 0.000 description 9
- 108020004682 Single-Stranded DNA Proteins 0.000 description 9
- 108010044940 alanylglutamine Proteins 0.000 description 9
- 239000007853 buffer solution Substances 0.000 description 9
- 108020001507 fusion proteins Proteins 0.000 description 9
- 102000037865 fusion proteins Human genes 0.000 description 9
- 230000004224 protection Effects 0.000 description 9
- 238000010839 reverse transcription Methods 0.000 description 9
- 238000003786 synthesis reaction Methods 0.000 description 9
- 241001430294 unidentified retrovirus Species 0.000 description 9
- NDXSOKGYKCGYKT-VEVYYDQMSA-N Thr-Pro-Asp Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O NDXSOKGYKCGYKT-VEVYYDQMSA-N 0.000 description 8
- XKWABWFMQXMUMT-HJGDQZAQSA-N Thr-Pro-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O XKWABWFMQXMUMT-HJGDQZAQSA-N 0.000 description 8
- 238000001962 electrophoresis Methods 0.000 description 8
- 239000007788 liquid Substances 0.000 description 8
- 150000007523 nucleic acids Chemical class 0.000 description 8
- YIGLXQRFQVWFEY-NRPADANISA-N Ala-Gln-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O YIGLXQRFQVWFEY-NRPADANISA-N 0.000 description 7
- 108091027305 Heteroduplex Proteins 0.000 description 7
- 230000014509 gene expression Effects 0.000 description 7
- 238000000746 purification Methods 0.000 description 7
- 108010061238 threonyl-glycine Proteins 0.000 description 7
- HDOYNXLPTRQLAD-JBDRJPRFSA-N Ile-Ala-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(=O)O)N HDOYNXLPTRQLAD-JBDRJPRFSA-N 0.000 description 6
- WPQKSRHDTMRSJM-CIUDSAMLSA-N Pro-Asp-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H]1CCCN1 WPQKSRHDTMRSJM-CIUDSAMLSA-N 0.000 description 6
- KHRLUIPIMIQFGT-AVGNSLFASA-N Pro-Val-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O KHRLUIPIMIQFGT-AVGNSLFASA-N 0.000 description 6
- 108010092799 RNA-directed DNA polymerase Proteins 0.000 description 6
- YMEXHZTVKDAKIY-GHCJXIJMSA-N Ser-Asn-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CO)C(O)=O YMEXHZTVKDAKIY-GHCJXIJMSA-N 0.000 description 6
- 230000015572 biosynthetic process Effects 0.000 description 6
- 210000004899 c-terminal region Anatomy 0.000 description 6
- 150000001875 compounds Chemical class 0.000 description 6
- 201000010099 disease Diseases 0.000 description 6
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 6
- 239000012636 effector Substances 0.000 description 6
- 230000000694 effects Effects 0.000 description 6
- 238000005516 engineering process Methods 0.000 description 6
- 238000000605 extraction Methods 0.000 description 6
- 239000003292 glue Substances 0.000 description 6
- RAXXELZNTBOGNW-UHFFFAOYSA-N imidazole Natural products C1=CNC=N1 RAXXELZNTBOGNW-UHFFFAOYSA-N 0.000 description 6
- 238000011534 incubation Methods 0.000 description 6
- 230000003993 interaction Effects 0.000 description 6
- 238000002156 mixing Methods 0.000 description 6
- 239000000243 solution Substances 0.000 description 6
- 102000040650 (ribonucleotides)n+m Human genes 0.000 description 5
- 206010028980 Neoplasm Diseases 0.000 description 5
- WTMPKZWHRCMMMT-KZVJFYERSA-N Thr-Pro-Ala Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O WTMPKZWHRCMMMT-KZVJFYERSA-N 0.000 description 5
- 238000001042 affinity chromatography Methods 0.000 description 5
- 230000003321 amplification Effects 0.000 description 5
- 238000000137 annealing Methods 0.000 description 5
- 230000004663 cell proliferation Effects 0.000 description 5
- 238000004140 cleaning Methods 0.000 description 5
- 238000013480 data collection Methods 0.000 description 5
- 238000001502 gel electrophoresis Methods 0.000 description 5
- 108010049041 glutamylalanine Proteins 0.000 description 5
- 239000000203 mixture Substances 0.000 description 5
- 230000035772 mutation Effects 0.000 description 5
- 238000003199 nucleic acid amplification method Methods 0.000 description 5
- 241000894007 species Species 0.000 description 5
- 239000006228 supernatant Substances 0.000 description 5
- 210000004881 tumor cell Anatomy 0.000 description 5
- MEFILNJXAVSUTO-JXUBOQSCSA-N Ala-Leu-Thr Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MEFILNJXAVSUTO-JXUBOQSCSA-N 0.000 description 4
- BSYKSCBTTQKOJG-GUBZILKMSA-N Arg-Pro-Ala Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O BSYKSCBTTQKOJG-GUBZILKMSA-N 0.000 description 4
- KLKHFFMNGWULBN-VKHMYHEASA-N Asn-Gly Chemical compound NC(=O)C[C@H](N)C(=O)NCC(O)=O KLKHFFMNGWULBN-VKHMYHEASA-N 0.000 description 4
- ZSJFGGSPCCHMNE-LAEOZQHASA-N Asp-Gln-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)O)N ZSJFGGSPCCHMNE-LAEOZQHASA-N 0.000 description 4
- 230000004543 DNA replication Effects 0.000 description 4
- 241000196324 Embryophyta Species 0.000 description 4
- 241000588724 Escherichia coli Species 0.000 description 4
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 4
- HTTJABKRGRZYRN-UHFFFAOYSA-N Heparin Chemical compound OC1C(NC(=O)C)C(O)OC(COS(O)(=O)=O)C1OC1C(OS(O)(=O)=O)C(O)C(OC2C(C(OS(O)(=O)=O)C(OC3C(C(O)C(O)C(O3)C(O)=O)OS(O)(=O)=O)C(CO)O2)NS(O)(=O)=O)C(C(O)=O)O1 HTTJABKRGRZYRN-UHFFFAOYSA-N 0.000 description 4
- LSQHWKPPOFDHHZ-YUMQZZPRSA-N His-Asp-Gly Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)NCC(=O)O)N LSQHWKPPOFDHHZ-YUMQZZPRSA-N 0.000 description 4
- TWRXJAOTZQYOKJ-UHFFFAOYSA-L Magnesium chloride Chemical compound [Mg+2].[Cl-].[Cl-] TWRXJAOTZQYOKJ-UHFFFAOYSA-L 0.000 description 4
- VDGTVWFMRXVQCT-GUBZILKMSA-N Pro-Glu-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1 VDGTVWFMRXVQCT-GUBZILKMSA-N 0.000 description 4
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 4
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 4
- 239000003153 chemical reaction reagent Substances 0.000 description 4
- 239000003814 drug Substances 0.000 description 4
- 230000006870 function Effects 0.000 description 4
- 229960002897 heparin Drugs 0.000 description 4
- 229920000669 heparin Polymers 0.000 description 4
- 239000000463 material Substances 0.000 description 4
- 102000039446 nucleic acids Human genes 0.000 description 4
- 108020004707 nucleic acids Proteins 0.000 description 4
- 125000002467 phosphate group Chemical group [H]OP(=O)(O[H])O[*] 0.000 description 4
- 239000013612 plasmid Substances 0.000 description 4
- 230000010076 replication Effects 0.000 description 4
- PCKRJVZAQZWNKM-WHFBIAKZSA-N Asn-Asn-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)NCC(O)=O PCKRJVZAQZWNKM-WHFBIAKZSA-N 0.000 description 3
- 108091033380 Coding strand Proteins 0.000 description 3
- 102000004190 Enzymes Human genes 0.000 description 3
- 108090000790 Enzymes Proteins 0.000 description 3
- HTTSBEBKVNEDFE-AUTRQRHGSA-N Glu-Gln-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCC(=O)O)N HTTSBEBKVNEDFE-AUTRQRHGSA-N 0.000 description 3
- QXPRJQPCFXMCIY-NKWVEPMBSA-N Gly-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN QXPRJQPCFXMCIY-NKWVEPMBSA-N 0.000 description 3
- KKBWDNZXYLGJEY-UHFFFAOYSA-N Gly-Arg-Pro Natural products NCC(=O)NC(CCNC(=N)N)C(=O)N1CCCC1C(=O)O KKBWDNZXYLGJEY-UHFFFAOYSA-N 0.000 description 3
- AFMOTCMSEBITOE-YEPSODPASA-N Gly-Val-Thr Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O AFMOTCMSEBITOE-YEPSODPASA-N 0.000 description 3
- CDGLBYSAZFIIJO-RCOVLWMOSA-N Ile-Gly-Gly Chemical compound CC[C@H](C)[C@H]([NH3+])C(=O)NCC(=O)NCC([O-])=O CDGLBYSAZFIIJO-RCOVLWMOSA-N 0.000 description 3
- YJRSIJZUIUANHO-NAKRPEOUSA-N Ile-Val-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](C)C(=O)O)N YJRSIJZUIUANHO-NAKRPEOUSA-N 0.000 description 3
- WQWSMEOYXJTFRU-GUBZILKMSA-N Leu-Glu-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CO)C(O)=O WQWSMEOYXJTFRU-GUBZILKMSA-N 0.000 description 3
- GAOJCVKPIGHTGO-UWVGGRQHSA-N Lys-Arg-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O GAOJCVKPIGHTGO-UWVGGRQHSA-N 0.000 description 3
- SLQJJFAVWSZLBL-BJDJZHNGSA-N Lys-Ile-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCCCN SLQJJFAVWSZLBL-BJDJZHNGSA-N 0.000 description 3
- 241000714177 Murine leukemia virus Species 0.000 description 3
- IFMDQWDAJUMMJC-DCAQKATOSA-N Pro-Ala-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O IFMDQWDAJUMMJC-DCAQKATOSA-N 0.000 description 3
- SNGZLPOXVRTNMB-LPEHRKFASA-N Pro-Ser-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CO)C(=O)N2CCC[C@@H]2C(=O)O SNGZLPOXVRTNMB-LPEHRKFASA-N 0.000 description 3
- 208000000389 T-cell leukemia Diseases 0.000 description 3
- 208000028530 T-cell lymphoblastic leukemia/lymphoma Diseases 0.000 description 3
- LKJCABTUFGTPPY-HJGDQZAQSA-N Thr-Pro-Gln Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O LKJCABTUFGTPPY-HJGDQZAQSA-N 0.000 description 3
- PNKDNKGMEHJTJQ-BPUTZDHNSA-N Trp-Arg-Asn Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(=O)N)C(=O)O)N PNKDNKGMEHJTJQ-BPUTZDHNSA-N 0.000 description 3
- 241000589634 Xanthomonas Species 0.000 description 3
- 108010091092 arginyl-glycyl-proline Proteins 0.000 description 3
- 108010093581 aspartyl-proline Proteins 0.000 description 3
- 238000005119 centrifugation Methods 0.000 description 3
- 238000010276 construction Methods 0.000 description 3
- 238000013461 design Methods 0.000 description 3
- 238000010586 diagram Methods 0.000 description 3
- 238000010828 elution Methods 0.000 description 3
- 230000002255 enzymatic effect Effects 0.000 description 3
- 108010062266 glycyl-glycyl-argininal Proteins 0.000 description 3
- 108010078326 glycyl-glycyl-valine Proteins 0.000 description 3
- 208000032839 leukemia Diseases 0.000 description 3
- 230000029279 positive regulation of transcription, DNA-dependent Effects 0.000 description 3
- 239000000047 product Substances 0.000 description 3
- 108091008146 restriction endonucleases Proteins 0.000 description 3
- 239000000523 sample Substances 0.000 description 3
- 238000012216 screening Methods 0.000 description 3
- 108010048818 seryl-histidine Proteins 0.000 description 3
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 3
- 108010073969 valyllysine Proteins 0.000 description 3
- 230000003612 virological effect Effects 0.000 description 3
- 229920001817 Agar Polymers 0.000 description 2
- YJHKTAMKPGFJCT-NRPADANISA-N Ala-Val-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O YJHKTAMKPGFJCT-NRPADANISA-N 0.000 description 2
- CLOMBHBBUKAUBP-LSJOCFKGSA-N Ala-Val-His Chemical compound C[C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N CLOMBHBBUKAUBP-LSJOCFKGSA-N 0.000 description 2
- ZATRYQNPUHGXCU-DTWKUNHWSA-N Arg-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CCCN=C(N)N)N)C(=O)O ZATRYQNPUHGXCU-DTWKUNHWSA-N 0.000 description 2
- OVQJAKFLFTZDNC-GUBZILKMSA-N Arg-Pro-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O OVQJAKFLFTZDNC-GUBZILKMSA-N 0.000 description 2
- RGKKALNPOYURGE-ZKWXMUAHSA-N Asp-Ala-Val Chemical compound N[C@@H](CC(=O)O)C(=O)N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)O RGKKALNPOYURGE-ZKWXMUAHSA-N 0.000 description 2
- LLUXQOVDMQZMPJ-KKUMJFAQSA-N Cys-Tyr-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)CS)CC1=CC=C(O)C=C1 LLUXQOVDMQZMPJ-KKUMJFAQSA-N 0.000 description 2
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 2
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 2
- ZHNUHDYFZUAESO-UHFFFAOYSA-N Formamide Chemical compound NC=O ZHNUHDYFZUAESO-UHFFFAOYSA-N 0.000 description 2
- YPMDZWPZFOZYFG-GUBZILKMSA-N Gln-Leu-Ser Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YPMDZWPZFOZYFG-GUBZILKMSA-N 0.000 description 2
- XOFYVODYSNKPDK-AVGNSLFASA-N Glu-His-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)[C@H](CCC(=O)O)N XOFYVODYSNKPDK-AVGNSLFASA-N 0.000 description 2
- UPOJUWHGMDJUQZ-IUCAKERBSA-N Gly-Arg-Arg Chemical compound NC(=N)NCCC[C@H](NC(=O)CN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O UPOJUWHGMDJUQZ-IUCAKERBSA-N 0.000 description 2
- LXXANCRPFBSSKS-IUCAKERBSA-N Gly-Gln-Leu Chemical compound [H]NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LXXANCRPFBSSKS-IUCAKERBSA-N 0.000 description 2
- YYPFZVIXAVDHIK-IUCAKERBSA-N Gly-Glu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)CN YYPFZVIXAVDHIK-IUCAKERBSA-N 0.000 description 2
- HQRHFUYMGCHHJS-LURJTMIESA-N Gly-Gly-Arg Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CCCN=C(N)N HQRHFUYMGCHHJS-LURJTMIESA-N 0.000 description 2
- STOOMQFEJUVAKR-KKUMJFAQSA-N His-His-His Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1N=CNC=1)C(=O)N[C@@H](CC=1N=CNC=1)C(O)=O)C1=CNC=N1 STOOMQFEJUVAKR-KKUMJFAQSA-N 0.000 description 2
- JKGHDYGZRDWHGA-SRVKXCTJSA-N Leu-Asn-Leu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JKGHDYGZRDWHGA-SRVKXCTJSA-N 0.000 description 2
- CLVUXCBGKUECIT-HJGDQZAQSA-N Leu-Asp-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O CLVUXCBGKUECIT-HJGDQZAQSA-N 0.000 description 2
- DPURXCQCHSQPAN-AVGNSLFASA-N Leu-Pro-Pro Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 DPURXCQCHSQPAN-AVGNSLFASA-N 0.000 description 2
- HVAUKHLDSDDROB-KKUMJFAQSA-N Lys-Lys-Leu Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O HVAUKHLDSDDROB-KKUMJFAQSA-N 0.000 description 2
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 2
- 101710163270 Nuclease Proteins 0.000 description 2
- OAICVXFJPJFONN-UHFFFAOYSA-N Phosphorus Chemical compound [P] OAICVXFJPJFONN-UHFFFAOYSA-N 0.000 description 2
- GURGCNUWVSDYTP-SRVKXCTJSA-N Pro-Leu-Gln Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O GURGCNUWVSDYTP-SRVKXCTJSA-N 0.000 description 2
- 206010038997 Retroviral infections Diseases 0.000 description 2
- 241000712907 Retroviridae Species 0.000 description 2
- 108010083644 Ribonucleases Proteins 0.000 description 2
- 102000006382 Ribonucleases Human genes 0.000 description 2
- YMTLKLXDFCSCNX-BYPYZUCNSA-N Ser-Gly-Gly Chemical compound OC[C@H](N)C(=O)NCC(=O)NCC(O)=O YMTLKLXDFCSCNX-BYPYZUCNSA-N 0.000 description 2
- AHERARIZBPOMNU-KATARQTJSA-N Thr-Ser-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O AHERARIZBPOMNU-KATARQTJSA-N 0.000 description 2
- 238000001261 affinity purification Methods 0.000 description 2
- 239000008272 agar Substances 0.000 description 2
- 125000000539 amino acid group Chemical group 0.000 description 2
- 229910021529 ammonia Inorganic materials 0.000 description 2
- QGZKDVFQNNGYKY-UHFFFAOYSA-N ammonia Natural products N QGZKDVFQNNGYKY-UHFFFAOYSA-N 0.000 description 2
- 229960000723 ampicillin Drugs 0.000 description 2
- AVKUERGKIZMTKX-NJBDSQKTSA-N ampicillin Chemical compound C1([C@@H](N)C(=O)N[C@H]2[C@H]3SC([C@@H](N3C2=O)C(O)=O)(C)C)=CC=CC=C1 AVKUERGKIZMTKX-NJBDSQKTSA-N 0.000 description 2
- 108010018691 arginyl-threonyl-arginine Proteins 0.000 description 2
- 239000000872 buffer Substances 0.000 description 2
- 230000015556 catabolic process Effects 0.000 description 2
- 230000001413 cellular effect Effects 0.000 description 2
- 230000008859 change Effects 0.000 description 2
- 230000000295 complement effect Effects 0.000 description 2
- NKLPQNGYXWVELD-UHFFFAOYSA-M coomassie brilliant blue Chemical compound [Na+].C1=CC(OCC)=CC=C1NC1=CC=C(C(=C2C=CC(C=C2)=[N+](CC)CC=2C=C(C=CC=2)S([O-])(=O)=O)C=2C=CC(=CC=2)N(CC)CC=2C=C(C=CC=2)S([O-])(=O)=O)C=C1 NKLPQNGYXWVELD-UHFFFAOYSA-M 0.000 description 2
- 238000005138 cryopreservation Methods 0.000 description 2
- 238000006731 degradation reaction Methods 0.000 description 2
- 239000003599 detergent Substances 0.000 description 2
- 238000009826 distribution Methods 0.000 description 2
- 238000002337 electrophoretic mobility shift assay Methods 0.000 description 2
- -1 elute incomplete Chemical compound 0.000 description 2
- 239000012149 elution buffer Substances 0.000 description 2
- 230000002708 enhancing effect Effects 0.000 description 2
- 239000013604 expression vector Substances 0.000 description 2
- 108010089804 glycyl-threonine Proteins 0.000 description 2
- 108010040030 histidinoalanine Proteins 0.000 description 2
- 229910052739 hydrogen Inorganic materials 0.000 description 2
- 239000001257 hydrogen Substances 0.000 description 2
- 239000012535 impurity Substances 0.000 description 2
- 108010073472 leucyl-prolyl-proline Proteins 0.000 description 2
- 108010057821 leucylproline Proteins 0.000 description 2
- 229910001629 magnesium chloride Inorganic materials 0.000 description 2
- 239000003550 marker Substances 0.000 description 2
- 238000010369 molecular cloning Methods 0.000 description 2
- 239000002808 molecular sieve Substances 0.000 description 2
- 239000010413 mother solution Substances 0.000 description 2
- 239000002773 nucleotide Substances 0.000 description 2
- 125000003729 nucleotide group Chemical group 0.000 description 2
- 230000002018 overexpression Effects 0.000 description 2
- 244000052769 pathogen Species 0.000 description 2
- 230000001717 pathogenic effect Effects 0.000 description 2
- 229910052698 phosphorus Inorganic materials 0.000 description 2
- 239000011574 phosphorus Substances 0.000 description 2
- 238000002360 preparation method Methods 0.000 description 2
- 238000012545 processing Methods 0.000 description 2
- 108010070643 prolylglutamic acid Proteins 0.000 description 2
- 108010090894 prolylleucine Proteins 0.000 description 2
- 230000001681 protective effect Effects 0.000 description 2
- 238000001742 protein purification Methods 0.000 description 2
- 238000010583 slow cooling Methods 0.000 description 2
- URGAHOPLAPQHLN-UHFFFAOYSA-N sodium aluminosilicate Chemical compound [Na+].[Al+3].[O-][Si]([O-])=O.[O-][Si]([O-])=O URGAHOPLAPQHLN-UHFFFAOYSA-N 0.000 description 2
- 238000012916 structural analysis Methods 0.000 description 2
- 208000011580 syndromic disease Diseases 0.000 description 2
- 230000009466 transformation Effects 0.000 description 2
- 230000029812 viral genome replication Effects 0.000 description 2
- AXFMEGAFCUULFV-BLFANLJRSA-N (2s)-2-[[(2s)-1-[(2s,3r)-2-amino-3-methylpentanoyl]pyrrolidine-2-carbonyl]amino]pentanedioic acid Chemical compound CC[C@@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O AXFMEGAFCUULFV-BLFANLJRSA-N 0.000 description 1
- FFRBMBIXVSCUFS-UHFFFAOYSA-N 2,4-dinitro-1-naphthol Chemical group C1=CC=C2C(O)=C([N+]([O-])=O)C=C([N+]([O-])=O)C2=C1 FFRBMBIXVSCUFS-UHFFFAOYSA-N 0.000 description 1
- IKHKJYWPWWBSFZ-UHFFFAOYSA-N 4-[[4-(diethylamino)phenyl]-(4-diethylazaniumylidenecyclohexa-2,5-dien-1-ylidene)methyl]benzene-1,3-disulfonate;hydron Chemical compound C1=CC(N(CC)CC)=CC=C1C(C=1C(=CC(=CC=1)S([O-])(=O)=O)S(O)(=O)=O)=C1C=CC(=[N+](CC)CC)C=C1 IKHKJYWPWWBSFZ-UHFFFAOYSA-N 0.000 description 1
- 208000030507 AIDS Diseases 0.000 description 1
- BUANFPRKJKJSRR-ACZMJKKPSA-N Ala-Ala-Gln Chemical compound C[C@H]([NH3+])C(=O)N[C@@H](C)C(=O)N[C@H](C([O-])=O)CCC(N)=O BUANFPRKJKJSRR-ACZMJKKPSA-N 0.000 description 1
- JBVSSSZFNTXJDX-YTLHQDLWSA-N Ala-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@H](C)N JBVSSSZFNTXJDX-YTLHQDLWSA-N 0.000 description 1
- SVBXIUDNTRTKHE-CIUDSAMLSA-N Ala-Arg-Glu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O SVBXIUDNTRTKHE-CIUDSAMLSA-N 0.000 description 1
- PBAMJJXWDQXOJA-FXQIFTODSA-N Ala-Asp-Arg Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N PBAMJJXWDQXOJA-FXQIFTODSA-N 0.000 description 1
- WXERCAHAIKMTKX-ZLUOBGJFSA-N Ala-Asp-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O WXERCAHAIKMTKX-ZLUOBGJFSA-N 0.000 description 1
- LZRNYBIJOSKKRJ-XVYDVKMFSA-N Ala-Asp-His Chemical compound C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)N LZRNYBIJOSKKRJ-XVYDVKMFSA-N 0.000 description 1
- BUDNAJYVCUHLSV-ZLUOBGJFSA-N Ala-Asp-Ser Chemical compound C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O BUDNAJYVCUHLSV-ZLUOBGJFSA-N 0.000 description 1
- RXTBLQVXNIECFP-FXQIFTODSA-N Ala-Gln-Gln Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O RXTBLQVXNIECFP-FXQIFTODSA-N 0.000 description 1
- OBVSBEYOMDWLRJ-BFHQHQDPSA-N Ala-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N OBVSBEYOMDWLRJ-BFHQHQDPSA-N 0.000 description 1
- LNNSWWRRYJLGNI-NAKRPEOUSA-N Ala-Ile-Val Chemical compound C[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O LNNSWWRRYJLGNI-NAKRPEOUSA-N 0.000 description 1
- QPBSRMDNJOTFAL-AICCOOGYSA-N Ala-Leu-Leu-Thr Chemical compound C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QPBSRMDNJOTFAL-AICCOOGYSA-N 0.000 description 1
- CHFFHQUVXHEGBY-GARJFASQSA-N Ala-Lys-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N1CCC[C@@H]1C(=O)O)N CHFFHQUVXHEGBY-GARJFASQSA-N 0.000 description 1
- YYAVDNKUWLAFCV-ACZMJKKPSA-N Ala-Ser-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O YYAVDNKUWLAFCV-ACZMJKKPSA-N 0.000 description 1
- HOVPGJUNRLMIOZ-CIUDSAMLSA-N Ala-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@H](C)N HOVPGJUNRLMIOZ-CIUDSAMLSA-N 0.000 description 1
- NZGRHTKZFSVPAN-BIIVOSGPSA-N Ala-Ser-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N NZGRHTKZFSVPAN-BIIVOSGPSA-N 0.000 description 1
- AAWLEICNDUHIJM-MBLNEYKQSA-N Ala-Thr-His Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](C)N)O AAWLEICNDUHIJM-MBLNEYKQSA-N 0.000 description 1
- OMSKGWFGWCQFBD-KZVJFYERSA-N Ala-Val-Thr Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OMSKGWFGWCQFBD-KZVJFYERSA-N 0.000 description 1
- OOIMKQRCPJBGPD-XUXIUFHCSA-N Arg-Ile-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O OOIMKQRCPJBGPD-XUXIUFHCSA-N 0.000 description 1
- WKPXXXUSUHAXDE-SRVKXCTJSA-N Arg-Pro-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCN=C(N)N)C(O)=O WKPXXXUSUHAXDE-SRVKXCTJSA-N 0.000 description 1
- VENMDXUVHSKEIN-GUBZILKMSA-N Arg-Ser-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O VENMDXUVHSKEIN-GUBZILKMSA-N 0.000 description 1
- DNLQVHBBMPZUGJ-BQBZGAKWSA-N Arg-Ser-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O DNLQVHBBMPZUGJ-BQBZGAKWSA-N 0.000 description 1
- AIFHRTPABBBHKU-RCWTZXSCSA-N Arg-Thr-Arg Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O AIFHRTPABBBHKU-RCWTZXSCSA-N 0.000 description 1
- RYQSYXFGFOTJDJ-RHYQMDGZSA-N Arg-Thr-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O RYQSYXFGFOTJDJ-RHYQMDGZSA-N 0.000 description 1
- INOIAEUXVVNJKA-XGEHTFHBSA-N Arg-Thr-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O INOIAEUXVVNJKA-XGEHTFHBSA-N 0.000 description 1
- ZUVDFJXRAICIAJ-BPUTZDHNSA-N Arg-Trp-Asp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCCN=C(N)N)N)C(=O)N[C@@H](CC(O)=O)C(O)=O)=CNC2=C1 ZUVDFJXRAICIAJ-BPUTZDHNSA-N 0.000 description 1
- VYZBPPBKFCHCIS-WPRPVWTQSA-N Arg-Val-Gly Chemical compound OC(=O)CNC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCN=C(N)N VYZBPPBKFCHCIS-WPRPVWTQSA-N 0.000 description 1
- SUMJNGAMIQSNGX-TUAOUCFPSA-N Arg-Val-Pro Chemical compound CC(C)[C@H](NC(=O)[C@@H](N)CCCNC(N)=N)C(=O)N1CCC[C@@H]1C(O)=O SUMJNGAMIQSNGX-TUAOUCFPSA-N 0.000 description 1
- HUAOKVVEVHACHR-CIUDSAMLSA-N Asn-Asp-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)N)N HUAOKVVEVHACHR-CIUDSAMLSA-N 0.000 description 1
- FHETWELNCBMRMG-HJGDQZAQSA-N Asn-Leu-Thr Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O FHETWELNCBMRMG-HJGDQZAQSA-N 0.000 description 1
- WSWYMRLTJVKRCE-ZLUOBGJFSA-N Asp-Ala-Asp Chemical compound OC(=O)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O WSWYMRLTJVKRCE-ZLUOBGJFSA-N 0.000 description 1
- PBVLJOIPOGUQQP-CIUDSAMLSA-N Asp-Ala-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O PBVLJOIPOGUQQP-CIUDSAMLSA-N 0.000 description 1
- NAPNAGZWHQHZLG-ZLUOBGJFSA-N Asp-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)O)N NAPNAGZWHQHZLG-ZLUOBGJFSA-N 0.000 description 1
- YDJVIBMKAMQPPP-LAEOZQHASA-N Asp-Glu-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O YDJVIBMKAMQPPP-LAEOZQHASA-N 0.000 description 1
- QCVXMEHGFUMKCO-YUMQZZPRSA-N Asp-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(O)=O QCVXMEHGFUMKCO-YUMQZZPRSA-N 0.000 description 1
- WSGVTKZFVJSJOG-RCOVLWMOSA-N Asp-Gly-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O WSGVTKZFVJSJOG-RCOVLWMOSA-N 0.000 description 1
- PAYPSKIBMDHZPI-CIUDSAMLSA-N Asp-Leu-Asp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O PAYPSKIBMDHZPI-CIUDSAMLSA-N 0.000 description 1
- HICVMZCGVFKTPM-BQBZGAKWSA-N Asp-Pro-Gly Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O HICVMZCGVFKTPM-BQBZGAKWSA-N 0.000 description 1
- XUVTWGPERWIERB-IHRRRGAJSA-N Asp-Pro-Phe Chemical compound N[C@@H](CC(O)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](Cc1ccccc1)C(O)=O XUVTWGPERWIERB-IHRRRGAJSA-N 0.000 description 1
- ZQFRDAZBTSFGGW-SRVKXCTJSA-N Asp-Ser-Phe Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O ZQFRDAZBTSFGGW-SRVKXCTJSA-N 0.000 description 1
- JJQGZGOEDSSHTE-FOHZUACHSA-N Asp-Thr-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O JJQGZGOEDSSHTE-FOHZUACHSA-N 0.000 description 1
- 241000193830 Bacillus <bacterium> Species 0.000 description 1
- 241000219193 Brassicaceae Species 0.000 description 1
- 101000914947 Bungarus multicinctus Long neurotoxin homolog TA-bm16 Proteins 0.000 description 1
- 241000579895 Chlorostilbon Species 0.000 description 1
- 108020004635 Complementary DNA Proteins 0.000 description 1
- 108020004394 Complementary RNA Proteins 0.000 description 1
- WAJDEKCJRKGRPG-CIUDSAMLSA-N Cys-His-Ser Chemical compound C1=C(NC=N1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CS)N WAJDEKCJRKGRPG-CIUDSAMLSA-N 0.000 description 1
- 102100025698 Cytosolic carboxypeptidase 4 Human genes 0.000 description 1
- 102000012410 DNA Ligases Human genes 0.000 description 1
- 108010061982 DNA Ligases Proteins 0.000 description 1
- 102100037840 Dehydrogenase/reductase SDR family member 2, mitochondrial Human genes 0.000 description 1
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 1
- NUMFTVCBONFQIQ-DRZSPHRISA-N Gln-Ala-Phe Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O NUMFTVCBONFQIQ-DRZSPHRISA-N 0.000 description 1
- SHERTACNJPYHAR-ACZMJKKPSA-N Gln-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(N)=O SHERTACNJPYHAR-ACZMJKKPSA-N 0.000 description 1
- CYTSBCIIEHUPDU-ACZMJKKPSA-N Gln-Asp-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O CYTSBCIIEHUPDU-ACZMJKKPSA-N 0.000 description 1
- SOIAHPSKKUYREP-CIUDSAMLSA-N Gln-Asp-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(=O)N)N SOIAHPSKKUYREP-CIUDSAMLSA-N 0.000 description 1
- PKVWNYGXMNWJSI-CIUDSAMLSA-N Gln-Gln-Gln Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O PKVWNYGXMNWJSI-CIUDSAMLSA-N 0.000 description 1
- XFAUJGNLHIGXET-AVGNSLFASA-N Gln-Leu-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O XFAUJGNLHIGXET-AVGNSLFASA-N 0.000 description 1
- UWMDGPFFTKDUIY-HJGDQZAQSA-N Gln-Pro-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(O)=O UWMDGPFFTKDUIY-HJGDQZAQSA-N 0.000 description 1
- OSCLNNWLKKIQJM-WDSKDSINSA-N Gln-Ser-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)NCC(O)=O OSCLNNWLKKIQJM-WDSKDSINSA-N 0.000 description 1
- GFLQTABMFBXRIY-GUBZILKMSA-N Glu-Gln-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O GFLQTABMFBXRIY-GUBZILKMSA-N 0.000 description 1
- VSRCAOIHMGCIJK-SRVKXCTJSA-N Glu-Leu-Arg Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O VSRCAOIHMGCIJK-SRVKXCTJSA-N 0.000 description 1
- YKBUCXNNBYZYAY-MNXVOIDGSA-N Glu-Lys-Ile Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O YKBUCXNNBYZYAY-MNXVOIDGSA-N 0.000 description 1
- XEJTYSCIXKYSHR-WDSKDSINSA-N Gly-Asp-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)CN XEJTYSCIXKYSHR-WDSKDSINSA-N 0.000 description 1
- JNGJGFMFXREJNF-KBPBESRZSA-N Gly-Glu-Trp Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O JNGJGFMFXREJNF-KBPBESRZSA-N 0.000 description 1
- XPJBQTCXPJNIFE-ZETCQYMHSA-N Gly-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)CN XPJBQTCXPJNIFE-ZETCQYMHSA-N 0.000 description 1
- UHPAZODVFFYEEL-QWRGUYRKSA-N Gly-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)CN UHPAZODVFFYEEL-QWRGUYRKSA-N 0.000 description 1
- HFPVRZWORNJRRC-UWVGGRQHSA-N Gly-Pro-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN HFPVRZWORNJRRC-UWVGGRQHSA-N 0.000 description 1
- OOCFXNOVSLSHAB-IUCAKERBSA-N Gly-Pro-Pro Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 OOCFXNOVSLSHAB-IUCAKERBSA-N 0.000 description 1
- HAOUOFNNJJLVNS-BQBZGAKWSA-N Gly-Pro-Ser Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O HAOUOFNNJJLVNS-BQBZGAKWSA-N 0.000 description 1
- RYAOJUMWLWUGNW-QMMMGPOBSA-N Gly-Val-Gly Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O RYAOJUMWLWUGNW-QMMMGPOBSA-N 0.000 description 1
- FNXSYBOHALPRHV-ONGXEEELSA-N Gly-Val-Lys Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN FNXSYBOHALPRHV-ONGXEEELSA-N 0.000 description 1
- 241000238631 Hexapoda Species 0.000 description 1
- FLUVGKKRRMLNPU-CQDKDKBSSA-N His-Ala-Phe Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](C)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O FLUVGKKRRMLNPU-CQDKDKBSSA-N 0.000 description 1
- TTZAWSKKNCEINZ-AVGNSLFASA-N His-Arg-Val Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C(C)C)C(O)=O TTZAWSKKNCEINZ-AVGNSLFASA-N 0.000 description 1
- RAVLQPXCMRCLKT-KBPBESRZSA-N His-Gly-Phe Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O RAVLQPXCMRCLKT-KBPBESRZSA-N 0.000 description 1
- CTGZVVQVIBSOBB-AVGNSLFASA-N His-His-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(O)=O)C(O)=O CTGZVVQVIBSOBB-AVGNSLFASA-N 0.000 description 1
- SYIPVNMWBZXKMU-HJPIBITLSA-N His-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CC2=CN=CN2)N SYIPVNMWBZXKMU-HJPIBITLSA-N 0.000 description 1
- SKOKHBGDXGTDDP-MELADBBJSA-N His-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N SKOKHBGDXGTDDP-MELADBBJSA-N 0.000 description 1
- ZVKDCQVQTGYBQT-LSJOCFKGSA-N His-Pro-Ala Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(O)=O ZVKDCQVQTGYBQT-LSJOCFKGSA-N 0.000 description 1
- DQZCEKQPSOBNMJ-NKIYYHGXSA-N His-Thr-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(O)=O DQZCEKQPSOBNMJ-NKIYYHGXSA-N 0.000 description 1
- 101000932590 Homo sapiens Cytosolic carboxypeptidase 4 Proteins 0.000 description 1
- 241000598436 Human T-cell lymphotropic virus Species 0.000 description 1
- NKVZTQVGUNLLQW-JBDRJPRFSA-N Ile-Ala-Ala Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)O)N NKVZTQVGUNLLQW-JBDRJPRFSA-N 0.000 description 1
- NZGTYCMLUGYMCV-XUXIUFHCSA-N Ile-Lys-Arg Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)N NZGTYCMLUGYMCV-XUXIUFHCSA-N 0.000 description 1
- GLYJPWIRLBAIJH-UHFFFAOYSA-N Ile-Lys-Pro Natural products CCC(C)C(N)C(=O)NC(CCCCN)C(=O)N1CCCC1C(O)=O GLYJPWIRLBAIJH-UHFFFAOYSA-N 0.000 description 1
- 206010061598 Immunodeficiency Diseases 0.000 description 1
- 208000029462 Immunodeficiency disease Diseases 0.000 description 1
- KFKWRHQBZQICHA-STQMWFEESA-N L-leucyl-L-phenylalanine Natural products CC(C)C[C@H](N)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 KFKWRHQBZQICHA-STQMWFEESA-N 0.000 description 1
- KVRKAGGMEWNURO-CIUDSAMLSA-N Leu-Ala-Cys Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CC(C)C)N KVRKAGGMEWNURO-CIUDSAMLSA-N 0.000 description 1
- GRZSCTXVCDUIPO-SRVKXCTJSA-N Leu-Arg-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(O)=O GRZSCTXVCDUIPO-SRVKXCTJSA-N 0.000 description 1
- BPANDPNDMJHFEV-CIUDSAMLSA-N Leu-Asp-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O BPANDPNDMJHFEV-CIUDSAMLSA-N 0.000 description 1
- LOLUPZNNADDTAA-AVGNSLFASA-N Leu-Gln-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O LOLUPZNNADDTAA-AVGNSLFASA-N 0.000 description 1
- DZQMXBALGUHGJT-GUBZILKMSA-N Leu-Glu-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O DZQMXBALGUHGJT-GUBZILKMSA-N 0.000 description 1
- RVVBWTWPNFDYBE-SRVKXCTJSA-N Leu-Glu-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O RVVBWTWPNFDYBE-SRVKXCTJSA-N 0.000 description 1
- VWHGTYCRDRBSFI-ZETCQYMHSA-N Leu-Gly-Gly Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)NCC(O)=O VWHGTYCRDRBSFI-ZETCQYMHSA-N 0.000 description 1
- WXZOHBVPVKABQN-DCAQKATOSA-N Leu-Met-Asp Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(=O)O)C(=O)O)N WXZOHBVPVKABQN-DCAQKATOSA-N 0.000 description 1
- GCXGCIYIHXSKAY-ULQDDVLXSA-N Leu-Phe-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O GCXGCIYIHXSKAY-ULQDDVLXSA-N 0.000 description 1
- BIZNDKMFQHDOIE-KKUMJFAQSA-N Leu-Phe-Asn Chemical compound CC(C)C[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(O)=O)CC1=CC=CC=C1 BIZNDKMFQHDOIE-KKUMJFAQSA-N 0.000 description 1
- VULJUQZPSOASBZ-SRVKXCTJSA-N Leu-Pro-Glu Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O VULJUQZPSOASBZ-SRVKXCTJSA-N 0.000 description 1
- CHJKEDSZNSONPS-DCAQKATOSA-N Leu-Pro-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O CHJKEDSZNSONPS-DCAQKATOSA-N 0.000 description 1
- IDGZVZJLYFTXSL-DCAQKATOSA-N Leu-Ser-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IDGZVZJLYFTXSL-DCAQKATOSA-N 0.000 description 1
- RGUXWMDNCPMQFB-YUMQZZPRSA-N Leu-Ser-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RGUXWMDNCPMQFB-YUMQZZPRSA-N 0.000 description 1
- HWMQRQIFVGEAPH-XIRDDKMYSA-N Leu-Ser-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(C)C)C(O)=O)=CNC2=C1 HWMQRQIFVGEAPH-XIRDDKMYSA-N 0.000 description 1
- AAKRWBIIGKPOKQ-ONGXEEELSA-N Leu-Val-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O AAKRWBIIGKPOKQ-ONGXEEELSA-N 0.000 description 1
- QFGVDCBPDGLVTA-SZMVWBNQSA-N Lys-Gln-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CCCCN)C(O)=O)=CNC2=C1 QFGVDCBPDGLVTA-SZMVWBNQSA-N 0.000 description 1
- GQFDWEDHOQRNLC-QWRGUYRKSA-N Lys-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN GQFDWEDHOQRNLC-QWRGUYRKSA-N 0.000 description 1
- LUTDBHBIHHREDC-IHRRRGAJSA-N Lys-Pro-Lys Chemical compound NCCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(O)=O LUTDBHBIHHREDC-IHRRRGAJSA-N 0.000 description 1
- 241000124008 Mammalia Species 0.000 description 1
- RKIIYGUHIQJCBW-SRVKXCTJSA-N Met-His-Glu Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(O)=O)C(O)=O RKIIYGUHIQJCBW-SRVKXCTJSA-N 0.000 description 1
- RXWPLVRJQNWXRQ-IHRRRGAJSA-N Met-His-His Chemical compound C([C@H](NC(=O)[C@@H](N)CCSC)C(=O)N[C@@H](CC=1N=CNC=1)C(O)=O)C1=CNC=N1 RXWPLVRJQNWXRQ-IHRRRGAJSA-N 0.000 description 1
- BEZJTLKUMFMITF-AVGNSLFASA-N Met-Lys-Arg Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CCCNC(N)=N BEZJTLKUMFMITF-AVGNSLFASA-N 0.000 description 1
- SPSSJSICDYYTQN-HJGDQZAQSA-N Met-Thr-Gln Chemical compound CSCC[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCC(N)=O SPSSJSICDYYTQN-HJGDQZAQSA-N 0.000 description 1
- 241001465754 Metazoa Species 0.000 description 1
- 241001529936 Murinae Species 0.000 description 1
- 101001033003 Mus musculus Granzyme F Proteins 0.000 description 1
- WUGMRIBZSVSJNP-UHFFFAOYSA-N N-L-alanyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)C)C(O)=O)=CNC2=C1 WUGMRIBZSVSJNP-UHFFFAOYSA-N 0.000 description 1
- KZNQNBZMBZJQJO-UHFFFAOYSA-N N-glycyl-L-proline Natural products NCC(=O)N1CCCC1C(O)=O KZNQNBZMBZJQJO-UHFFFAOYSA-N 0.000 description 1
- 241000270276 Natrix Species 0.000 description 1
- 108020004711 Nucleic Acid Probes Proteins 0.000 description 1
- 238000012408 PCR amplification Methods 0.000 description 1
- 108020002230 Pancreatic Ribonuclease Proteins 0.000 description 1
- 102000005891 Pancreatic ribonuclease Human genes 0.000 description 1
- MQVFHOPCKNTHGT-MELADBBJSA-N Phe-Asp-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC2=CC=CC=C2)N)C(=O)O MQVFHOPCKNTHGT-MELADBBJSA-N 0.000 description 1
- OJUMUUXGSXUZJZ-SRVKXCTJSA-N Phe-Asp-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O OJUMUUXGSXUZJZ-SRVKXCTJSA-N 0.000 description 1
- LWPMGKSZPKFKJD-DZKIICNBSA-N Phe-Glu-Val Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C(C)C)C(O)=O LWPMGKSZPKFKJD-DZKIICNBSA-N 0.000 description 1
- JJHVFCUWLSKADD-ONGXEEELSA-N Phe-Gly-Ala Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H](C)C(O)=O JJHVFCUWLSKADD-ONGXEEELSA-N 0.000 description 1
- VJLLEKDQJSMHRU-STQMWFEESA-N Phe-Gly-Met Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)NCC(=O)N[C@@H](CCSC)C(O)=O VJLLEKDQJSMHRU-STQMWFEESA-N 0.000 description 1
- ROOQMPCUFLDOSB-FHWLQOOXSA-N Phe-Phe-Gln Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1C=CC=CC=1)C(=O)N[C@@H](CCC(N)=O)C(O)=O)C1=CC=CC=C1 ROOQMPCUFLDOSB-FHWLQOOXSA-N 0.000 description 1
- AFNJAQVMTIQTCB-DLOVCJGASA-N Phe-Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC1=CC=CC=C1 AFNJAQVMTIQTCB-DLOVCJGASA-N 0.000 description 1
- 229920002562 Polyethylene Glycol 3350 Polymers 0.000 description 1
- VXCHGLYSIOOZIS-GUBZILKMSA-N Pro-Ala-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 VXCHGLYSIOOZIS-GUBZILKMSA-N 0.000 description 1
- KIZQGKLMXKGDIV-BQBZGAKWSA-N Pro-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 KIZQGKLMXKGDIV-BQBZGAKWSA-N 0.000 description 1
- YFNOUBWUIIJQHF-LPEHRKFASA-N Pro-Asp-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC(=O)O)C(=O)N2CCC[C@@H]2C(=O)O YFNOUBWUIIJQHF-LPEHRKFASA-N 0.000 description 1
- SKICPQLTOXGWGO-GARJFASQSA-N Pro-Gln-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCC(=O)N)C(=O)N2CCC[C@@H]2C(=O)O SKICPQLTOXGWGO-GARJFASQSA-N 0.000 description 1
- PEYNRYREGPAOAK-LSJOCFKGSA-N Pro-His-Ala Chemical compound C([C@@H](C(=O)N[C@@H](C)C([O-])=O)NC(=O)[C@H]1[NH2+]CCC1)C1=CN=CN1 PEYNRYREGPAOAK-LSJOCFKGSA-N 0.000 description 1
- SOACYAXADBWDDT-CYDGBPFRSA-N Pro-Ile-Arg Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O SOACYAXADBWDDT-CYDGBPFRSA-N 0.000 description 1
- KDBHVPXBQADZKY-GUBZILKMSA-N Pro-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 KDBHVPXBQADZKY-GUBZILKMSA-N 0.000 description 1
- GMJDSFYVTAMIBF-FXQIFTODSA-N Pro-Ser-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O GMJDSFYVTAMIBF-FXQIFTODSA-N 0.000 description 1
- 241000169446 Promethis Species 0.000 description 1
- 101710188053 Protein D Proteins 0.000 description 1
- 101710132893 Resolvase Proteins 0.000 description 1
- 108010046983 Ribonuclease T1 Proteins 0.000 description 1
- 241000714474 Rous sarcoma virus Species 0.000 description 1
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 1
- 229920002684 Sepharose Polymers 0.000 description 1
- VQBLHWSPVYYZTB-DCAQKATOSA-N Ser-Arg-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CO)N VQBLHWSPVYYZTB-DCAQKATOSA-N 0.000 description 1
- OYEDZGNMSBZCIM-XGEHTFHBSA-N Ser-Arg-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OYEDZGNMSBZCIM-XGEHTFHBSA-N 0.000 description 1
- CNIIKZQXBBQHCX-FXQIFTODSA-N Ser-Asp-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O CNIIKZQXBBQHCX-FXQIFTODSA-N 0.000 description 1
- BGOWRLSWJCVYAQ-CIUDSAMLSA-N Ser-Asp-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O BGOWRLSWJCVYAQ-CIUDSAMLSA-N 0.000 description 1
- UQFYNFTYDHUIMI-WHFBIAKZSA-N Ser-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CO UQFYNFTYDHUIMI-WHFBIAKZSA-N 0.000 description 1
- FLONGDPORFIVQW-XGEHTFHBSA-N Ser-Pro-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CO FLONGDPORFIVQW-XGEHTFHBSA-N 0.000 description 1
- KQNDIKOYWZTZIX-FXQIFTODSA-N Ser-Ser-Arg Chemical compound OC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCNC(N)=N KQNDIKOYWZTZIX-FXQIFTODSA-N 0.000 description 1
- SZRNDHWMVSFPSP-XKBZYTNZSA-N Ser-Thr-Gln Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CO)N)O SZRNDHWMVSFPSP-XKBZYTNZSA-N 0.000 description 1
- BDMWLJLPPUCLNV-XGEHTFHBSA-N Ser-Thr-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O BDMWLJLPPUCLNV-XGEHTFHBSA-N 0.000 description 1
- 210000001744 T-lymphocyte Anatomy 0.000 description 1
- 238000010459 TALEN Methods 0.000 description 1
- CAGTXGDOIFXLPC-KZVJFYERSA-N Thr-Arg-Ala Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CCCN=C(N)N CAGTXGDOIFXLPC-KZVJFYERSA-N 0.000 description 1
- MQBTXMPQNCGSSZ-OSUNSFLBSA-N Thr-Arg-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)[C@@H](C)O)CCCN=C(N)N MQBTXMPQNCGSSZ-OSUNSFLBSA-N 0.000 description 1
- IRKWVRSEQFTGGV-VEVYYDQMSA-N Thr-Asn-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O IRKWVRSEQFTGGV-VEVYYDQMSA-N 0.000 description 1
- SIMKLINEDYOTKL-MBLNEYKQSA-N Thr-His-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N[C@@H](C)C(=O)O)N)O SIMKLINEDYOTKL-MBLNEYKQSA-N 0.000 description 1
- WRUWXBBEFUTJOU-XGEHTFHBSA-N Thr-Met-Ser Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CO)C(=O)O)N)O WRUWXBBEFUTJOU-XGEHTFHBSA-N 0.000 description 1
- BDENGIGFTNYZSJ-RCWTZXSCSA-N Thr-Pro-Met Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCSC)C(O)=O BDENGIGFTNYZSJ-RCWTZXSCSA-N 0.000 description 1
- 108010043645 Transcription Activator-Like Effector Nucleases Proteins 0.000 description 1
- ZOBLBMGJKVJVEV-BZSNNMDCSA-N Tyr-Lys-Lys Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)O)N)O ZOBLBMGJKVJVEV-BZSNNMDCSA-N 0.000 description 1
- IEWKKXZRJLTIOV-AVGNSLFASA-N Tyr-Ser-Gln Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O IEWKKXZRJLTIOV-AVGNSLFASA-N 0.000 description 1
- XSQUKJJJFZCRTK-UHFFFAOYSA-N Urea Chemical compound NC(N)=O XSQUKJJJFZCRTK-UHFFFAOYSA-N 0.000 description 1
- IZFVRRYRMQFVGX-NRPADANISA-N Val-Ala-Gln Chemical compound C[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](C(C)C)N IZFVRRYRMQFVGX-NRPADANISA-N 0.000 description 1
- ASQFIHTXXMFENG-XPUUQOCRSA-N Val-Ala-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O ASQFIHTXXMFENG-XPUUQOCRSA-N 0.000 description 1
- AZSHAZJLOZQYAY-FXQIFTODSA-N Val-Ala-Ser Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O AZSHAZJLOZQYAY-FXQIFTODSA-N 0.000 description 1
- VMRFIKXKOFNMHW-GUBZILKMSA-N Val-Arg-Ser Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CO)C(=O)O)N VMRFIKXKOFNMHW-GUBZILKMSA-N 0.000 description 1
- TZVUSFMQWPWHON-NHCYSSNCSA-N Val-Asp-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N TZVUSFMQWPWHON-NHCYSSNCSA-N 0.000 description 1
- XGJLNBNZNMVJRS-NRPADANISA-N Val-Glu-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O XGJLNBNZNMVJRS-NRPADANISA-N 0.000 description 1
- FEFZWCSXEMVSPO-LSJOCFKGSA-N Val-His-Ala Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](Cc1cnc[nH]1)C(=O)N[C@@H](C)C(O)=O FEFZWCSXEMVSPO-LSJOCFKGSA-N 0.000 description 1
- UMPVMAYCLYMYGA-ONGXEEELSA-N Val-Leu-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(O)=O UMPVMAYCLYMYGA-ONGXEEELSA-N 0.000 description 1
- UOUIMEGEPSBZIV-ULQDDVLXSA-N Val-Lys-Tyr Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 UOUIMEGEPSBZIV-ULQDDVLXSA-N 0.000 description 1
- GBIUHAYJGWVNLN-AEJSXWLSSA-N Val-Ser-Pro Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CO)C(=O)N1CCC[C@@H]1C(=O)O)N GBIUHAYJGWVNLN-AEJSXWLSSA-N 0.000 description 1
- GBIUHAYJGWVNLN-UHFFFAOYSA-N Val-Ser-Pro Natural products CC(C)C(N)C(=O)NC(CO)C(=O)N1CCCC1C(O)=O GBIUHAYJGWVNLN-UHFFFAOYSA-N 0.000 description 1
- UVHFONIHVHLDDQ-IFFSRLJSSA-N Val-Thr-Glu Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](C(C)C)N)O UVHFONIHVHLDDQ-IFFSRLJSSA-N 0.000 description 1
- UFCHCOKFAGOQSF-BQFCYCMXSA-N Val-Trp-Glu Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N UFCHCOKFAGOQSF-BQFCYCMXSA-N 0.000 description 1
- 108020000999 Viral RNA Proteins 0.000 description 1
- 241000589636 Xanthomonas campestris Species 0.000 description 1
- 241000048615 Xanthomonas campestris pv. armoraciae Species 0.000 description 1
- 101100165011 Xanthomonas euvesicatoria avrBs3 gene Proteins 0.000 description 1
- ZKHQWZAMYRWXGA-KNYAHOBESA-N [[(2r,3s,4r,5r)-5-(6-aminopurin-9-yl)-3,4-dihydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl] dihydroxyphosphoryl hydrogen phosphate Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](COP(O)(=O)OP(O)(=O)O[32P](O)(O)=O)[C@@H](O)[C@H]1O ZKHQWZAMYRWXGA-KNYAHOBESA-N 0.000 description 1
- 239000002253 acid Substances 0.000 description 1
- 150000007513 acids Chemical class 0.000 description 1
- 239000012190 activator Substances 0.000 description 1
- 238000000246 agarose gel electrophoresis Methods 0.000 description 1
- 108010076324 alanyl-glycyl-glycine Proteins 0.000 description 1
- 108010086434 alanyl-seryl-glycine Proteins 0.000 description 1
- 108010047495 alanylglycine Proteins 0.000 description 1
- 108010087924 alanylproline Proteins 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- 108010068380 arginylarginine Proteins 0.000 description 1
- 108010062796 arginyllysine Proteins 0.000 description 1
- 108010060035 arginylproline Proteins 0.000 description 1
- 108010069205 aspartyl-phenylalanine Proteins 0.000 description 1
- 208000004668 avian leukosis Diseases 0.000 description 1
- 230000000680 avirulence Effects 0.000 description 1
- 238000010876 biochemical test Methods 0.000 description 1
- 230000003115 biocidal effect Effects 0.000 description 1
- 230000033228 biological regulation Effects 0.000 description 1
- UDSAIICHUKSCKT-UHFFFAOYSA-N bromophenol blue Chemical compound C1=C(Br)C(O)=C(Br)C=C1C1(C=2C=C(Br)C(O)=C(Br)C=2)C2=CC=CC=C2S(=O)(=O)O1 UDSAIICHUKSCKT-UHFFFAOYSA-N 0.000 description 1
- 238000009933 burial Methods 0.000 description 1
- 238000010804 cDNA synthesis Methods 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 239000004202 carbamide Substances 0.000 description 1
- 230000032823 cell division Effects 0.000 description 1
- 238000012512 characterization method Methods 0.000 description 1
- 239000003638 chemical reducing agent Substances 0.000 description 1
- 229960005091 chloramphenicol Drugs 0.000 description 1
- WIIZWVCIJKGZOK-RKDXNWHRSA-N chloramphenicol Chemical compound ClC(Cl)C(=O)N[C@H](CO)[C@H](O)C1=CC=C([N+]([O-])=O)C=C1 WIIZWVCIJKGZOK-RKDXNWHRSA-N 0.000 description 1
- YTRQFSDWAXHJCC-UHFFFAOYSA-N chloroform;phenol Chemical compound ClC(Cl)Cl.OC1=CC=CC=C1 YTRQFSDWAXHJCC-UHFFFAOYSA-N 0.000 description 1
- 238000002288 cocrystallisation Methods 0.000 description 1
- 239000002299 complementary DNA Substances 0.000 description 1
- 239000003184 complementary RNA Substances 0.000 description 1
- 238000012790 confirmation Methods 0.000 description 1
- 239000000470 constituent Substances 0.000 description 1
- 238000002447 crystallographic data Methods 0.000 description 1
- 238000005520 cutting process Methods 0.000 description 1
- 238000005034 decoration Methods 0.000 description 1
- 230000007812 deficiency Effects 0.000 description 1
- 230000000593 degrading effect Effects 0.000 description 1
- 238000010612 desalination reaction Methods 0.000 description 1
- 238000011033 desalting Methods 0.000 description 1
- 238000009792 diffusion process Methods 0.000 description 1
- 238000006073 displacement reaction Methods 0.000 description 1
- 230000005782 double-strand break Effects 0.000 description 1
- 241001493065 dsRNA viruses Species 0.000 description 1
- 230000009977 dual effect Effects 0.000 description 1
- 230000005684 electric field Effects 0.000 description 1
- 230000008030 elimination Effects 0.000 description 1
- 238000003379 elimination reaction Methods 0.000 description 1
- 229910052876 emerald Inorganic materials 0.000 description 1
- 239000010976 emerald Substances 0.000 description 1
- 238000012869 ethanol precipitation Methods 0.000 description 1
- 239000013613 expression plasmid Substances 0.000 description 1
- 238000013467 fragmentation Methods 0.000 description 1
- 238000006062 fragmentation reaction Methods 0.000 description 1
- 230000002068 genetic effect Effects 0.000 description 1
- 108010026364 glycyl-glycyl-leucine Proteins 0.000 description 1
- 108010033719 glycyl-histidyl-glycine Proteins 0.000 description 1
- 108010077515 glycylproline Proteins 0.000 description 1
- 108010037850 glycylvaline Proteins 0.000 description 1
- 201000009277 hairy cell leukemia Diseases 0.000 description 1
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 description 1
- 108010050343 histidyl-alanyl-glutamine Proteins 0.000 description 1
- 108010036413 histidylglycine Proteins 0.000 description 1
- 108010028295 histidylhistidine Proteins 0.000 description 1
- 108010025306 histidylleucine Proteins 0.000 description 1
- 230000007813 immunodeficiency Effects 0.000 description 1
- 238000000338 in vitro Methods 0.000 description 1
- NDDAHWYSQHTHNT-UHFFFAOYSA-N indapamide Chemical compound CC1CC2=CC=CC=C2N1NC(=O)C1=CC=C(Cl)C(S(N)(=O)=O)=C1 NDDAHWYSQHTHNT-UHFFFAOYSA-N 0.000 description 1
- 230000006698 induction Effects 0.000 description 1
- 208000015181 infectious disease Diseases 0.000 description 1
- 238000009830 intercalation Methods 0.000 description 1
- 230000002687 intercalation Effects 0.000 description 1
- 230000003834 intracellular effect Effects 0.000 description 1
- 150000002500 ions Chemical class 0.000 description 1
- 210000002429 large intestine Anatomy 0.000 description 1
- 239000010985 leather Substances 0.000 description 1
- 108010044311 leucyl-glycyl-glycine Proteins 0.000 description 1
- 108010051673 leucyl-glycyl-phenylalanine Proteins 0.000 description 1
- 108010034529 leucyl-lysine Proteins 0.000 description 1
- 108010044056 leucyl-phenylalanine Proteins 0.000 description 1
- 239000012160 loading buffer Substances 0.000 description 1
- 230000033001 locomotion Effects 0.000 description 1
- 239000006210 lotion Substances 0.000 description 1
- 108010017391 lysylvaline Proteins 0.000 description 1
- 239000011159 matrix material Substances 0.000 description 1
- 235000013372 meat Nutrition 0.000 description 1
- 230000009456 molecular mechanism Effects 0.000 description 1
- 238000002703 mutagenesis Methods 0.000 description 1
- 231100000350 mutagenesis Toxicity 0.000 description 1
- 239000002159 nanocrystal Substances 0.000 description 1
- 229910052757 nitrogen Inorganic materials 0.000 description 1
- 102000044158 nucleic acid binding protein Human genes 0.000 description 1
- 108700020942 nucleic acid binding protein Proteins 0.000 description 1
- 239000002853 nucleic acid probe Substances 0.000 description 1
- 235000015097 nutrients Nutrition 0.000 description 1
- 238000005457 optimization Methods 0.000 description 1
- 108010082795 phenylalanyl-arginyl-arginine Proteins 0.000 description 1
- 108010024607 phenylalanylalanine Proteins 0.000 description 1
- 108010051242 phenylalanylserine Proteins 0.000 description 1
- 244000000003 plant pathogen Species 0.000 description 1
- 239000002574 poison Substances 0.000 description 1
- 231100000614 poison Toxicity 0.000 description 1
- 229920002401 polyacrylamide Polymers 0.000 description 1
- 229920001223 polyethylene glycol Polymers 0.000 description 1
- 230000001376 precipitating effect Effects 0.000 description 1
- 238000001556 precipitation Methods 0.000 description 1
- 108090000765 processed proteins & peptides Proteins 0.000 description 1
- 102000004196 processed proteins & peptides Human genes 0.000 description 1
- 230000006916 protein interaction Effects 0.000 description 1
- 108010012557 prothrombin complex concentrates Proteins 0.000 description 1
- 230000002285 radioactive effect Effects 0.000 description 1
- 238000003259 recombinant expression Methods 0.000 description 1
- 238000004064 recycling Methods 0.000 description 1
- 230000003362 replicative effect Effects 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 150000003839 salts Chemical class 0.000 description 1
- 230000028327 secretion Effects 0.000 description 1
- 108010026333 seryl-proline Proteins 0.000 description 1
- 230000035939 shock Effects 0.000 description 1
- 230000008054 signal transmission Effects 0.000 description 1
- 150000003384 small molecules Chemical class 0.000 description 1
- 230000009870 specific binding Effects 0.000 description 1
- 230000007480 spreading Effects 0.000 description 1
- 238000003892 spreading Methods 0.000 description 1
- 230000005469 synchrotron radiation Effects 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- 238000002560 therapeutic procedure Methods 0.000 description 1
- 238000012549 training Methods 0.000 description 1
- 238000013518 transcription Methods 0.000 description 1
- 230000035897 transcription Effects 0.000 description 1
- 230000001960 triggered effect Effects 0.000 description 1
- 108010038745 tryptophylglycine Proteins 0.000 description 1
- 238000000108 ultra-filtration Methods 0.000 description 1
- 238000012795 verification Methods 0.000 description 1
- 230000009385 viral infection Effects 0.000 description 1
- 230000001018 virulence Effects 0.000 description 1
- 238000005406 washing Methods 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6813—Hybridisation assays
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K38/00—Medicinal preparations containing peptides
- A61K38/16—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- A61K38/164—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61P—SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
- A61P35/00—Antineoplastic agents
Landscapes
- Chemical & Material Sciences (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Organic Chemistry (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Engineering & Computer Science (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- General Health & Medical Sciences (AREA)
- Immunology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Physics & Mathematics (AREA)
- Public Health (AREA)
- Pharmacology & Pharmacy (AREA)
- Analytical Chemistry (AREA)
- Biophysics (AREA)
- Biotechnology (AREA)
- Medicinal Chemistry (AREA)
- Genetics & Genomics (AREA)
- General Engineering & Computer Science (AREA)
- Animal Behavior & Ethology (AREA)
- Molecular Biology (AREA)
- Veterinary Medicine (AREA)
- Microbiology (AREA)
- Biochemistry (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- General Chemical & Material Sciences (AREA)
- Gastroenterology & Hepatology (AREA)
- Epidemiology (AREA)
- Chemical Kinetics & Catalysis (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Enzymes And Modification Thereof (AREA)
Abstract
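The present invention discloses a method for specifically binding and targeting DNA-RNA hybrid duplexes. The method comprises using a TALE or a TALE-derived protein to specifically recognize and bind a particular DNA-RNA hybrid duplex.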
Description
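Technical Field
The present invention relates to the field of biotechnology, and more specifically to a method for specifically binding and targeting DNA-RNA hybrid duplexes.
Background Art
TALEs (Transcription Activator-Like Effectors) are proteins produced by plant-pathogenic bacteria of the genus Xanthomonas. When the pathogen infects a plant, it uses its type III secretion system to inject a series of effector molecules, including TALEs, into the plant cell. These effectors help the pathogen proliferate by altering host cell signaling, gene expression, and other processes. TALEs are the largest class of these effectors, and they function like the plant's own transcription activators.
TALE family proteins generally consist of three main functional domains. The N-terminal domain is involved in secretion and translocation of the TALE; the C-terminal region contains a transcription activation domain and a nuclear localization signal; and the central region is the DNA-binding domain. Unlike other known DNA-binding domains, the TALE DNA-binding domain is built from tandem repeat units. In most cases each repeat unit consists of 34 amino acids, and a few repeat units consist of 33 or 35 residues. Within these 34 amino acids, the residues at positions 12 and 13 are highly variable, whereas the others are highly conserved. These two non-conserved residues are called the RVD (repeat variable diresidue). In 2009, J. Boch et al. and M.J. Moscou et al. (see J. Boch, H. Scholze, S. Schornack, A. Landgraf, S. Hahn, S. Kay, T. Lahaye, A. Nickstadt, U. Bonas, Breaking the code of DNA binding specificity of TAL-type III effectors, Science, 326 (2009) 1509-1512, and M.J. Moscou, A.J. Bogdanove, A simple cipher governs DNA recognition by TAL effectors, Science, 326 (2009) 1501) found, by experiment and by bioinformatic analysis respectively, that the residues at positions 12 and 13 of each repeat unit (the RVD) correspond in a specific way to the nucleotide that the repeat recognizes, for example:
Table 1. Correspondence between selected RVDs and DNA bases
The sequence-specific DNA recognition of TALE proteins, combined with their modular assembly, opens up broad prospects for their use in molecular biology: TALE repeat units can be designed and assembled to recognize any double-helical DNA sequence. This property has already been used to construct TALENs (TALE nucleases), DNases that cleave specific double-stranded DNA sequences, in order to introduce site-directed mutations, targeted knockouts, and similar modifications into cellular genomes (A.J. Bogdanove, D.F. Voytas, TAL effectors: customizable proteins for DNA targeting, Science, 333 (2011) 1843-1846). In all reports known to date, TALEs recognize double-stranded DNA helices (dsDNA).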
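The modular RVD code described above can be sketched in a few lines of Python. The mapping below is an assumption: it uses RVD preferences commonly reported in the published TALE literature (HD for C, NG for T, NI for A, NN for G or A, NS tolerating any base), not the contents of Table 1, and the function names `design_rvds` and `rvds_match` are hypothetical helpers introduced only for illustration.

```python
# Commonly reported RVD-to-base preferences (assumption: taken from the
# published TALE code, not from Table 1 of this document).
RVD_TO_BASES = {
    "HD": "C",
    "NG": "T",
    "NI": "A",
    "NN": "GA",    # reported to accept G or A
    "NS": "ACGT",  # reported to tolerate any base
}

# Hypothetical design choice: one preferred RVD per target base.
BASE_TO_RVD = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def design_rvds(target: str) -> list[str]:
    """Return one RVD per base of a DNA target written 5'->3'."""
    target = target.upper()
    unknown = set(target) - set(BASE_TO_RVD)
    if unknown:
        raise ValueError(f"unsupported bases: {sorted(unknown)}")
    return [BASE_TO_RVD[base] for base in target]


def rvds_match(rvds: list[str], target: str) -> bool:
    """Return True if every RVD accepts the base at its position."""
    target = target.upper()
    if len(rvds) != len(target):
        return False
    return all(base in RVD_TO_BASES[rvd] for rvd, base in zip(rvds, target))


if __name__ == "__main__":
    rvds = design_rvds("CCCTTTATCTCT")
    print(rvds)
    print(rvds_match(rvds, "CCCTTTATCTCT"))
```

The sketch ignores the 5' thymine that usually precedes a TALE target and treats every repeat as independent, which is a simplification.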
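Summary of the Invention
The present invention provides a method for specifically binding a DNA-RNA hybrid duplex, comprising using a TALE or a TALE-derived protein to specifically recognize and bind a particular DNA-RNA hybrid duplex.
The present invention provides a method for inhibiting the synthesis of DNA from an RNA template, comprising using a TALE or a TALE-derived protein to specifically recognize and bind a DNA-RNA hybrid duplex. In a preferred embodiment, the present invention provides a method for inhibiting replication of a retroviral genome, the method comprising using a TALE or a TALE-derived protein to specifically recognize and bind a DNA-RNA hybrid duplex.
The present invention provides a method for inhibiting the synthesis of DNA from a DNA template with an RNA primer, comprising using a TALE or a TALE-derived protein to specifically recognize and bind a DNA-RNA hybrid duplex. In a preferred embodiment, the present invention provides a method for inhibiting cell proliferation, comprising using a TALE or a TALE-derived protein to specifically recognize and bind a DNA-RNA hybrid duplex, thereby inhibiting replication of the cellular genome. In a more preferred embodiment, the present invention provides a method for inhibiting tumor cell proliferation, comprising using a TALE or a TALE-derived protein to specifically recognize and bind a DNA-RNA hybrid duplex, thereby inhibiting replication of the tumor cell genome.
The present invention provides a method for inhibiting the synthesis of RNA from a DNA template with an RNA primer, comprising using a TALE or a TALE-derived protein to specifically recognize and bind a DNA-RNA hybrid duplex, provided that the RNA produced can form a transiently stable duplex with the DNA.
The present invention provides a method for protecting the RNA molecule in a DNA-RNA hybrid duplex from degradation by the ribonuclease RNase H, comprising using a TALE or a TALE-derived protein to specifically recognize and bind the DNA-RNA hybrid duplex.
The TALE protein may be a naturally occurring TALE protein, or a TALE-derived protein obtained from one by genetic mutation, modification, or assembly that retains or enhances the ability to bind DNA-RNA hybrid duplexes. The TALE-derived proteins also include recombinant proteins that contain the DNA-binding domain of a TALE protein.
The DNA may also comprise modified DNA derivatives, for example methylated bases, hydroxymethylated bases, and the like.
The RNA may also comprise modified RNA derivatives, for example methylated bases, hydroxymethylated bases, and the like.
In a preferred embodiment, the retroviruses include viruses belonging to the family Retroviridae, including but not limited to Human Immunodeficiency Virus (HIV), Rous Sarcoma Virus (RSV), Murine Leukemia Virus (MLV), Human T-cell Leukemia Virus (HTLV), and so on. The retroviruses also include other RNA viruses that form RNA-DNA hybrid duplexes during replication or that replicate their genomes in a manner similar to known retroviruses, including virus species not yet discovered.
In a preferred embodiment, the method is used to inhibit tumor cell proliferation in a mammal.
The present invention provides the use of a TALE protein in the preparation of a reagent that specifically recognizes DNA-RNA hybrid duplexes.
The present invention provides the use of a TALE protein in the preparation of a medicament for treating or preventing a disease caused by retroviral infection, for example a retrovirus-induced disease of humans, livestock, or plants, including but not limited to acquired immunodeficiency syndrome (AIDS), human T-cell leukemia, human hairy cell leukemia, murine leukemia, avian leukosis, and so on.
The present invention provides the use of a TALE protein in the preparation of a medicament for treating or preventing tumors.
The present invention provides a method for treating or preventing a disease caused by retroviral infection, in which a TALE or a TALE-derived protein interferes with RNA-templated DNA synthesis and thereby inhibits replication of the retrovirus.
The present invention provides a method for treating or preventing tumors, in which a TALE or a TALE-derived protein interferes with RNA-primed DNA replication and thereby inhibits tumor cell proliferation.
The present invention provides a TALE protein for specifically recognizing DNA-RNA hybrid duplexes.
The present invention provides a TALE protein for treating or preventing a disease caused by retroviral infection, or for treating or preventing tumors.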
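Brief Description of the Drawings
Figure 1 is a schematic of the high-resolution (1.85 Å) crystal structure of the DNA-binding domain of dHax3 (the dHax3 truncation, denoted dHax3-Δ) bound to double-stranded DNA. The numbers 1-10 in the left panel mark the repeat units of the dHax3 DNA-binding domain, each of which recognizes the corresponding DNA base shown on the right. Each repeat unit consists of two α-helices, labeled a and b. The structure has been deposited in the PDB under accession code 3V6T. dHax3 (designed Hax3) refers to an engineered version of the TALE protein Hax3.
Figure 2 is a schematic showing that the interactions between dHax3 and DNA are concentrated on the coding strand of the DNA. A, electrostatic surface potential of dHax3, showing a stripe of positive charge on the protein surface. The blue (positively charged) region interacts directly with the phosphate groups of the DNA (the DNA lies in the center of the protein; the golden-yellow groups are the phosphates). B, this interaction exists only between dHax3 and the DNA strand that carries its recognition sequence. C, residues K and Q at positions 16 and 17 of each repeat unit interact with the DNA phosphate groups through hydrogen bonds. D, the main chain of each repeat unit also forms hydrogen bonds with the DNA phosphate groups.
Figure 3 shows electrophoretic mobility shift assays of the dHax3-NI variant with double-stranded DNA (panel A, lanes 1-5, dsDNA), single-stranded DNA (panel A, lanes 6-10, ssDNA), DNA-RNA hybrid duplexes (panel B, lanes 1-5: fDNA+rRNA; lanes 6-10: fRNA+rDNA), double-stranded RNA (panel C, lanes 1-5, dsRNA), and single-stranded RNA (panel C, lanes 6-10, ssRNA). In the dHax3-NI variant, the RVD of the seventh repeat unit of the dHax3 DNA-binding domain, NS, has been changed to NI by point mutation; the variant has the same DNA recognition sequence as dHax3 but higher recognition specificity. In lanes 1-5 and in lanes 6-10, the dHax3-NI concentrations are 0, 0.15 μM, 0.44 μM, 1.33 μM, and 4 μM, and each lane contains approximately 4 nM of 32P-labeled nucleic acid probe. The results show that dHax3-NI specifically recognizes double-stranded DNA and one of the two DNA-RNA hybrid duplexes. "f": forward strand. "r": reverse strand.
Figure 4 shows the crystal structure of the DNA-binding domain of dHax3-NI (the dHax3-NI truncation, denoted dHax3-NI-Δ) in complex with a DNA-RNA hybrid duplex. dHax3-NI-Δ is shown as a ribbon model, and the "DNA coding strand" and the "complementary RNA strand" are labeled. The structure has been deposited in the PDB under accession code 4GG4.
Figure 5 is a gel image showing the purification of full-length dHax3. Lanes: 1, whole-cell lysate; 2, pellet after centrifugation of the lysate; 3, supernatant after centrifugation of the lysate; 4, nickel column flow-through (discarded after incubation); 5, nickel column wash; 6, nickel column eluate; 7, nickel column resin; 8, molecular weight marker.
Figure 6 is a gel image showing the purification of the dHax3 truncation (dHax3-Δ). Lanes: A, whole-cell lysate; P, pellet after centrifugation of the lysate; S, supernatant after centrifugation of the lysate; F, nickel column flow-through; W1, nickel column wash 1; W2, nickel column wash 2; E, nickel column eluate; R, nickel column resin; M, molecular weight marker.
Figure 7 is a schematic of the principle of eukaryotic DNA replication.
Figure 8 is a gel image showing that dHax3-NI protects DNA-RNA and prevents RNase H from cleaving the RNA in a DNA-RNA hybrid duplex. Lanes 1 and 2 are controls without RNase H, with or without dHax3-NI; lane 3 is the control with RNase H and without dHax3-NI; lanes 4-10 contain RNase H and a concentration gradient of dHax3-NI, with final protein concentrations of 0.004, 0.015, 0.05, 0.025, 0.1, 0.4, and 1.6 μM. Lanes 13 and 14 are RNA ladders (T1 and A) used to locate the RNase H cleavage sites in the DNA-RNA hybrid duplex.
Figure 9 is a gel image showing that the dHax3-TALE24 repeat chimeric protein protects DNA-RNA and prevents RNase H from cleaving the RNA in a DNA-RNA hybrid duplex. Lanes 0 and 11 are RNA ladders (T1 and A) used to locate the RNase H cleavage sites in the DNA-RNA hybrid duplex. Lanes 1 and 2 are controls without RNase H, with or without the dHax3-TALE24 repeat protein; lane 3 is the control with RNase H and without the dHax3-TALE24 repeat protein; lanes 4-10 contain RNase H and a concentration gradient of the dHax3-TALE24 repeat protein, with final protein concentrations of 0.004, 0.015, 0.05, 0.025, 0.1, 0.4, and 1.6 μM.
Figure 10 is a gel image showing that the dHax3-TALEHIV repeat chimeric protein protects DNA-RNA and prevents RNase H from cleaving the RNA in a DNA-RNA hybrid duplex. Lanes 1 and 2 are controls without RNase H, with or without the dHax3-TALEHIV repeat protein; lane 3 is the control with RNase H and without the dHax3-TALEHIV repeat protein; lanes 4-10 contain RNase H and a concentration gradient of the dHax3-TALEHIV repeat protein, with final protein concentrations of 0.004, 0.015, 0.05, 0.025, 0.1, 0.4, and 1.6 μM; lanes 11 and 12 are controls containing the dHax3-TALEHIV repeat protein, with or without RNase H; lanes 13 and 14 are controls containing BSA, with or without RNase H.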
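Detailed Description of Embodiments
The inventors solved the crystal structure of the DNA-binding domain of an engineered TALE protein, Hax3 (referred to herein as dHax3, designed Hax3), in complex with dsDNA. Besides revealing the molecular basis by which a TALE protein specifically recognizes each DNA base, the structure shows that only one strand of the double-stranded DNA (the strand carrying the TALE recognition sequence) interacts with the TALE.
Through biochemical experiments, the inventors found that TALE proteins can specifically recognize DNA-RNA hybrid duplexes, and they solved the crystal structure of the DNA-binding domain of a dHax3 protein in complex with a DNA-RNA hybrid duplex.
Through structural observation and biochemical methods, the inventors found for the first time that TALE proteins can specifically recognize DNA-RNA hybrid duplexes, a finding that broadens the potential applications of TALE proteins.
(1) Treatment of retroviral infection.
Retroviruses use RNA as their genetic material; examples include human immunodeficiency virus and human T-cell leukemia virus, which cause serious disease in humans. To proliferate, they must replicate their genome by reverse transcription inside the host cell. A key step in this process is the synthesis, inside the host, of a DNA strand complementary to the viral RNA genome, using that genome as the template. Once the genomic information has been transferred to the single-stranded DNA, the RNase H domain of the viral reverse transcriptase degrades the RNA strand of the DNA-RNA hybrid duplex. The released single-stranded DNA then serves as a template that the reverse transcriptase copies into double-stranded DNA, which is finally inserted into the host genome.
If the RNase H domain of the reverse transcriptase cannot degrade the RNA after reverse transcription, the virus cannot complete genome replication. Based on this principle and on the inventors' new finding that TALE proteins can specifically bind DNA-RNA hybrid duplexes, it can be inferred that a TALE specifically bound to a DNA-RNA hybrid duplex will occupy the binding sites of the reverse transcriptase and RNase H, so that RNase H cannot degrade the RNA, thereby inhibiting viral replication.
The inventors' finding that TALEs can bind DNA-RNA hybrid duplexes provides a new way to inhibit retroviral genome replication, and thus a new approach to treating retrovirus-induced diseases such as acquired immunodeficiency syndrome and human T-cell leukemia. The method can also be used to treat diseases caused by viruses that form RNA-DNA hybrid duplexes during replication, including other RNA viruses whose genome replication resembles that of known retroviruses and viruses not yet discovered.
(2) Interference with eukaryotic DNA replication, providing a new method for inhibiting tumor cell proliferation.
As shown in Figure 7, double-stranded DNA exists in linear form in eukaryotic genomes. Because DNA is synthesized in the 5'→3' direction, the leading strand can be synthesized continuously from its 5' end to its 3' end, whereas the lagging strand must be synthesized from RNA primers, 5' to 3', as a series of discontinuous Okazaki fragments.
Since the inventors have found that TALEs bind DNA-RNA hybrid duplexes efficiently, a TALE may compete with DNA polymerase for binding to the DNA-RNA hybrid and thereby inhibit DNA replication. This could in turn inhibit cell division, which suggests a new approach to inhibiting tumor cell proliferation.
This new method of specifically recognizing DNA-RNA hybrid duplexes therefore offers a way to interfere with any cellular process that proceeds through a DNA-RNA hybrid duplex, such as retroviral replication in host cells and replication of cellular genomic DNA.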
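Unless otherwise defined herein, the scientific and technical terms used in the present invention have the meanings commonly understood by a person of ordinary skill in the art. Furthermore, unless the context requires otherwise, singular terms include the plural and plural terms include the singular. In general, the nomenclature and techniques of molecular biology, biochemistry, structural biology, and related fields described herein are those well known and commonly used in the art. Unless otherwise indicated, the following terms are to be understood as having the following meanings:
The term "TALE protein" as used herein refers to Transcription Activator-Like Effectors. A TALE protein may be a naturally occurring TALE protein, or a TALE-derived protein obtained from one by genetic mutation, modification, or assembly that retains or enhances the ability to bind DNA or DNA-RNA hybrid duplexes.
The term "Hax3" as used herein refers to a member of the TALE protein family. Hax stands for "Homolog of avrBs3 in Xanthomonas", and Hax3 is one of three homologous proteins identified in Xanthomonas campestris pv. armoraciae. As a member of the TALE protein family, it functions similarly to other known TALE proteins such as AvrBs3 (see S. Kay, J. Boch, U. Bonas, Characterization of AvrBs3-like effectors from a Brassicaceae pathogen reveals virulence and avirulence activities and a protein with a novel repeat architecture, Molecular Plant-Microbe Interactions: MPMI, 18 (2005) 838-848).
The term "dHax3" as used herein refers to an artificially engineered Hax3 (designed Hax3). The nucleotide sequence of its gene is SEQ ID NO:1, and its amino acid sequence is given in SEQ ID NO:2 (into which a 6×His tag has been inserted). M.M. Mahfouz et al. designed dHax3 to specifically recognize the following DNA sequence: TCCCTTTATCTCT (M.M. Mahfouz, L. Li, M. Shamimuzzaman, A. Wibowo, X. Fang, J.K. Zhu, De novo-engineered transcription activator-like effector (TALE) hybrid nuclease with novel DNA binding specificity creates double-strand breaks, Proceedings of the National Academy of Sciences of the United States of America, 108 (2011) 2623-2628).
The term "dHax3 truncation" ("dHax3-Δ") as used herein refers to a truncated dHax3 protein from which the N-terminal and C-terminal domains have been removed. It corresponds to residues 230-721 of the dHax3 protein sequence and has 11.5 repeat units.
The term "dHax3-NI" as used herein refers to a variant of dHax3 in which the NS in the seventh repeat unit of the DNA-binding domain has been changed to NI by point mutation, in order to bind the corresponding DNA strand more specifically. dHax3-NI and dHax3 both specifically recognize the following DNA sequence: TCCCTTTATCTCT.
The term "dHax3-NI-Δ" as used herein refers to the truncation of the dHax3-NI variant corresponding to residues 230-721 of the protein sequence.
The term "TALE24 repeat units" as used herein refers to an artificially synthesized DNA-binding domain consisting of 24 repeat units. For its design and preparation, see P. Yin, D. Deng, C. Yan, X. Pan, J.J. Xi, N. Yan, Y. Shi, Specific DNA-RNA Hybrid Recognition by TAL Effectors, Cell Reports, 2 (2012) 707-713.
The term "TALEHIV repeat units" as used herein refers to an artificially synthesized set of DNA-binding domain repeat units that specifically recognizes a particular segment of the HIV genome. For its design and preparation, see P. Yin, D. Deng, C. Yan, X. Pan, J.J. Xi, N. Yan, Y. Shi, Specific DNA-RNA Hybrid Recognition by TAL Effectors, Cell Reports, 2 (2012) 707-713.
The term "dHax3-TALE24 repeat units" as used herein refers to the chimeric protein formed by replacing the repeat units in the DNA-binding domain of dHax3 with the TALE24 repeat units.
The term "dHax3-TALEHIV repeat units" as used herein refers to the chimeric protein formed by replacing the repeat units in the DNA-binding domain of dHax3 with the TALEHIV repeat units.
Because the RVDs of all TALE proteins recognize DNA bases by the same molecular mechanism, the ability of dHax3 to specifically recognize DNA-RNA hybrid duplexes, as shown in the examples, applies equally to other TALE proteins whose sequences differ from that of dHax3, even though TALE proteins vary somewhat in sequence. Likewise, TALE proteins that use RVDs not listed in Table 1, for example ND, NK, NH, HG, or N* (where * represents any amino acid), recognize DNA by the same molecular mechanism as dHax3 and therefore also have the ability to recognize DNA-RNA hybrid duplexes, so they too fall within the scope of protection of this patent.
The reagents used in the examples, including buffers, enzymes, vectors, and kits, are all commercially available or can be prepared by the methods recommended in Molecular Cloning: A Laboratory Manual (《分子克隆实验指南》), 3rd edition (Huang Peitang, Science Press, 2002).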
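Examples
Example 1: Construction and purification of several TALE proteins
1. Molecular cloning and expression vector construction were carried out as follows:
●PCR amplification of the target gene fragment
The composition of a standard 50 µl PCR reaction is shown in the table below; the reaction can be scaled up proportionally if needed.
Standard 50 µl PCR reaction
After successful amplification, the amplified target gene fragment is recovered directly with a standard DNA purification kit. Note that for an amplified fragment carrying a point mutation, the DNA template must first be removed by agarose gel electrophoresis, after which the target gene is recovered with an agarose gel DNA extraction kit.
●Restriction enzyme digestion of the amplified fragment and the vector
The amplified fragment and the vector are digested with the same restriction enzymes to generate matching cohesive ends. The composition of a 50 µl double digestion is shown in the table below:
Standard 50 µl double-digestion reaction
Incubate at 37 °C for 30-180 min. When the reaction is judged to be complete, run the products on an agarose gel and recover the DNA fragment from the excised gel slice with an agarose gel DNA extraction kit.
●DNA ligation
The digested target gene fragment is ligated into the vector with T4 DNA ligase at 16 °C or room temperature for 30-120 min. The ligation mixture is shown in the table below:
Standard 10 µl ligation reaction
●Transformation
The ligation product is transformed into DH5α competent cells as follows, in preparation for screening positive clones: add 50-100 µl of DH5α competent cells to the ligation product and keep on ice for 30 min; heat-shock at 42 °C for 90 s; keep on ice for 2 min; plate the entire mixture on an ampicillin agar plate, spread evenly with a spreader, and incubate inverted at 37 °C for 14-16 hours.
●Screening positive clones by colony PCR
Mark 4-8 colonies on the plate from the previous step and test them for positive clones with the following reaction:
Colony PCR reaction
Confirm the result by gel electrophoresis, pick the positive clones, and grow them overnight in LB medium containing ampicillin at 37 °C and 220 rpm.
●Plasmid extraction
Plasmids are extracted with a standard plasmid miniprep kit. Sequencing was performed by GENEWIZ (金唯智生物科技有限公司).
●Induced expression of recombinant proteins
Obtaining large amounts of purified protein requires overexpression. Available overexpression systems include Escherichia coli (E. coli), yeast, insect cells, and others, and different proteins may be best expressed in different systems. Because the target protein comes from a Gram-negative bacterium, E. coli was chosen as the system for protein expression and purification.
Well-behaved, highly pure protein is a prerequisite for biochemical and crystallization experiments, and the purification of recombinantly expressed proteins from E. coli is a mature technology. To allow convenient purification by affinity chromatography, recombinant proteins carrying various tags were constructed. After comparison, the histidine-tagged recombinant proteins were used for subsequent experiments. A tag of six histidines binds through coordinate bonds to resins carrying metal ions such as nickel. Nickel affinity chromatography followed by heparin affinity chromatography yields protein of approximately 95% purity or higher.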
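The purification steps are as follows:
a. Inoculate BL21(DE3) or Rosetta(DE3) cells transformed with the TAL effector expression plasmid into 50 ml of LB medium containing ampicillin, or ampicillin plus chloramphenicol, and grow overnight in a shaker at 37 °C.
b. Transfer 5-10 ml of the starter culture into 1 L of LB medium containing antibiotics and grow in a shaker at 37 °C for about 3 hours. When OD600 reaches 0.8-1.0, add IPTG to a final concentration of 0.2 mM and induce expression at 22 °C for 14-16 hours.
c. Centrifuge the induced E. coli at 4 °C and 4400 rpm for 10 min and discard the supernatant. Resuspend the wet cell pellet from each liter of culture in 20 ml of lysis buffer (25 mM Tris-HCl pH 8.0, 500 mM NaCl).
d. After sonication, centrifuge at 14000 rpm for 50 min and keep the supernatant for subsequent purification.
e. Slowly load the supernatant onto a nickel column pre-equilibrated with lysis buffer (25 mM Tris-HCl pH 8.0, 500 mM NaCl). Reload the flow-through 1-2 times.
f. Add 10 ml of wash buffer I (25 mM Tris-HCl pH 8.0, 1000 mM NaCl) to remove some of the impurities. Repeat 3 times.
g. Add 10 ml of wash buffer II (25 mM Tris-HCl pH 8.0, 100 mM NaCl, 10 mM imidazole) to further remove contaminating proteins.
h. Add 10 ml of elution buffer (25 mM Tris-HCl pH 8.0, 50 mM NaCl, 300 mM imidazole) to elute the target protein from the nickel column. Check with Coomassie Brilliant Blue G-250 whether elution is complete; if not, repeat the elution.
i. Slowly load the eluted protein onto a heparin column (Heparin Sepharose 6 Fast Flow) pre-equilibrated with buffer (25 mM Tris-HCl pH 8.0, 50 mM NaCl). Reload the flow-through 1-2 times.
j. Add 10 ml of wash buffer I (25 mM Tris-HCl pH 8.0, 100 mM NaCl) to remove impurities. Repeat 3 times.
k. Add 10 ml of elution buffer (25 mM Tris-HCl pH 8.0, 1000 mM NaCl, 10 mM DTT) to elute the target protein from the heparin column. Check with Coomassie Brilliant Blue G-250 whether elution is complete; if not, repeat the elution. Assess protein purity by SDS-PAGE.
l. Concentrate the protein obtained from the two affinity steps to about 10 mg/ml with an ultrafiltration concentrator. Finally, purify the protein further and assess its behavior by size-exclusion chromatography (Superdex 200) in 25 mM Tris-HCl pH 8.0, 150 mM NaCl, 10 mM DTT. Use a desalting column (HiPrep 26/10) to exchange the dHax3 (231-720) protein into 25 mM MES pH 6.0, 50 mM NaCl, 5 mM MgCl2, 10 mM DTT.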
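The buffers named in steps c through l are all simple dilutions of stock solutions, so their preparation can be sketched with the relation C1·V1 = C2·V2. The stock concentrations below (1 M Tris-HCl pH 8.0, 5 M NaCl, 2 M imidazole) and the function name `stock_volumes_ml` are assumptions made for illustration; the document does not specify which stocks were used.

```python
# Hypothetical stock concentrations in mM (not stated in the document).
STOCKS_MM = {"Tris-HCl pH 8.0": 1000.0, "NaCl": 5000.0, "imidazole": 2000.0}


def stock_volumes_ml(final_volume_ml, targets_mM, stocks_mM=STOCKS_MM):
    """Return the volume of each stock (ml) and of water for one buffer.

    Uses C1 * V1 = C2 * V2 for every component, then fills up with water.
    """
    volumes = {
        name: final_volume_ml * conc / stocks_mM[name]
        for name, conc in targets_mM.items()
    }
    water = final_volume_ml - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks are too dilute for the requested buffer")
    volumes["water"] = water
    return volumes


if __name__ == "__main__":
    # Lysis buffer from step c: 25 mM Tris-HCl pH 8.0, 500 mM NaCl.
    lysis = stock_volumes_ml(1000.0, {"Tris-HCl pH 8.0": 25.0, "NaCl": 500.0})
    for component, volume in lysis.items():
        print(f"{component}: {volume:.1f} ml")
```

The same call with `{"Tris-HCl pH 8.0": 25.0, "NaCl": 100.0, "imidazole": 10.0}` would give wash buffer II from step g. pH adjustment and additives such as DTT are outside the scope of this sketch.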
2. Construction and expression of dHax3 and dHax3-Δ
The dHax3 (designed Hax3) gene was obtained by total gene synthesis. Its sequence is as follows (SEQ ID NO:1):
ATGGACCCAATACGAAGCAGAACGCCATCACCAGCTAGGGAACTTCTCTCTGGACCACAGCCTGATGGAGTTCAGCCAACTGCAGATCGAGGTGTTTCTCCGCCAGCCGGTGGCCCTTTAGATGGTCTCCCAGCAAGAAGAACAATGTCCCGTACCAGACTCCCAAGTCCCCCTGCCCCGTCGCCAGCCTTTTCAGCTGACTCCTTCTCTGATCTTCTTAGGCAATTTGACCCTTCTCTTTTCAATACATCCCTTTTCGATTCACTTCCTCCTTTCGGCGCACATCATACTGAGGCAGCCACCGGCGAATGGGACGAAGTCCAAAGTGGTTTAAGGGCAGCTGATGCTCCACCACCGACGATGAGAGTCGCTGTTACCGCCGCACGTCCTCCTAGAGCCAAGCCAGCCCCTAGAAGACGAGCTGCGCAACCCTCCGATGCAAGCCCTGCAGCTCAAGTAGACCTTCGAACACTAGGTTACTCCCAGCAACAACAAGAAAAAATAAAGCCAAAGGTTAGATCTACAGTTGCACAACATCACGAAGCCCTAGTCGGACACGGATTTACACATGCTCATATCGTGGCTCTTTCACAACATCCTGCAGCTCTTGGAACAGTCGCTGTCAAATATCAGGATATGATTGCTGCATTGCCAGAAGCTACTCACGAAGCTATCGTCGGAGTTGGGAAACAATGGTCAGGCGCAAGAGCATTAGAGGCGCTTCTCACCGTAGCTGGTGAATTACGAGGTCCTCCACTCCAATTGGATACTGGGCAATTATTAAAAATCGCTAAACGAGGTGGAGTCACTGCTGTCGAAGCCGTTCATGCATGGCGTAACGCTCTCACGGGCGCACCACTAAACCTTACTCCTGAACAGGTTGTCGCAATAGCTTCACATGATGGCGGAAAACAAGCTCTTGAAACAGTGCAACGTCTCCTTCCCGTCCTCTGTCAGGCTCACGGATTGACTCCTCAGCAGGTCGTCGCAATTGCATCACATGATGGAGGCAAACAAGCTTTAGAAACAGTACAAAGACTATTGCCCGTTCTTTGCCAAGCGCATGGGTTAACTCCCGAACAAGTCGTTGCCATTGCAAGTCACGACGGAGGTAAACAAGCTCTCGAAACGGTTCAAGCACTTTTACCCGTTCTCTGTCAAGCACATGGACTCACACCTGAACAAGTAGTTGCTATCGCATCGAATGGAGGTGGAAAACAAGCACTGGAAACTGTACAAAGACTTTTGCCAGTTTTATGTCAAGCGCACGGTCTTACTCCTCAACAAGTTGTCGCCATTGCCTCTAACGGTGGTGGAAAACAAGCTCTTGAAACTGTCCAGAGACTTCTGCCCGTTCTATGTCAGGCTCATGGGCTAACCCCTCAACAGGTTGTTGCAATCGCATCTAATGGAGGAGGAAAACAAGCTTTAGAAACTGTCCAACGACTACTGCCCGTTCTCTGCCAAGCACACGGACTTACCCCACAACAAGTTGTGGCAATAGCTTCTAATTCTGGTGGTAAACAAGCCCTTGAGACGGTTCAAAGACTTCTACCAGTTCTTTGTCAGGCACATGGATTGACCCCACAACAGGTCGTAGCAATCGCATCTAATGGAGGTGGTAAGCAAGCTCTAGAAACGGTACAAAGATTACTTCCCGTGCTTTGTCAAGCTCATGGACTCACTCCTCAACAAGTGGTCGCTATTGCAAGTCATGATGGTGGAAAGCAAGCACTAGAAACCGTCCAACGACTCCTTCCTGTTCTCTGTCAAGCACATGGTCTTACGCCCGAACAAGTTGTTGCTATAGCTTCGAACGGAGGTGGAAAACAAGCTCTCGAAACCGTCCAAAGGCTCCTCCCAGTACTTTGCCAAGCACATGGATTAACCCCTGAGCAAGTAGTTGCAATTGCCTCGCACGACGGAGGAAAGCAAGCATTAGAAACTGTTCAGAGACTTTTGCCTGTCCTGTGTCAAGCCCACGGTCTAACACCACAACA
AGTCGTCGCAATCGCTAGTAATGGAGGAGGTAGACCTGCATTGGAGTCGATAGTCGCACAACTATCACGACCTGATCCCGCTCTTGCAGCATTGACAAACGATCATTTAGTCGCACTTGCATGTTTAGGAGGACGACCAGCACTTGATGCCGTTAAGAAAGGACTACCGCACGCCCCTGCATTGATTAAAAGAACAAACAGACGAATCCCGGAGAGAACTTCACATCGTGTAGCCGATCATGCTCAAGTCGTAAGAGTTTTGGGTTTCTTCCAATGTCATTCCCACCCAGCTCAAGCTTTTGACGATGCAATGACTCAATTTGGAATGAGTAGACATGGACTCCTGCAATTATTTCGAAGGGTCGGAGTTACAGAGCTCGAAGCCAGGTCAGGAACGCTGCCCCCCGCATCTCAACGATGGGATAGAATTCTCCAAGCCTCTGGAATGAAAAGAGCTAAACCTTCACCAACGTCCACACAAACACCAGACCAAGCTTCTCTCCACGCTTTTGCCGACTCACTAGAGAGAGATCTAGATGCACCGTCACCTATGCATGAAGGAGACCAAACAAGAGCCTCTTCAAGAAAACGTTCTCGTTCTGATAGAGCTGTCACTGGACCTTCCGCCCAACAATCTTTCGAAGTCCGAGTTCCTGAGCAACGAGATGCCCTACACCTGCCTTTGCTTTCTTGGGGAGTTAAGCGACCACGTACTAGAATTGGTGGACTACTCGATCCAGGTACACCAATGGATGCTGATCTCGTTGCTTCCTCTACCGTAGTATGGGAGCAAGACGCAGACCCCTTCGCTGGAACTGCTGACGATTTCCCAGCCTTTAACGAGGAAGAATTGGCTTGGTTAATGGAACTTCTACCGCAATGA。
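The synthesized gene was ligated directly into the pET300 plasmid (Invitrogen). The expressed full-length protein carries a six-histidine tag at its N-terminus for affinity purification on a nickel column. The full-length protein sequence is as follows (SEQ ID NO:2):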
MHHHHHHITSLYKKAGLMDPIRSRTPSPARELLSGPQPDGVQPTADRGVSPPAGGPLDGLPARRTMSRTRLPSPPAPSPAFSADSFSDLLRQFDPSLFNTSLFDSLPPFGAHHTEAATGEWDEVQSGLRAADAPPPTMRVAVTAARPPRAKPAPRRRAAQPSDASPAAQVDLRTLGYSQQQQEKIKPKVRSTVAQHHEALVGHGFTHAHIVALSQHPAALGTVAVKYQDMIAALPEATHEAIVGVGKQWSGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVHAWRNALTGAPLNLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPQQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQALLPVLCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPQQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPQQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPQQVVAIASNSGGKQALETVQRLLPVLCQAHGLTPQQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPQQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPQQVVAIASNGGGRPALESIVAQLSRPDPALAALTNDHLVALACLGGRPALDAVKKGLPHAPALIKRTNRRIPERTSHRVADHAQVVRVLGFFQCHSHPAQAFDDAMTQFGMSRHGLLQLFRRVGVTELEARSGTLPPASQRWDRILQASGMKRAKPSPTSTQTPDQASLHAFADSLERDLDAPSPMHEGDQTRASSRKRSRSDRAVTGPSAQQSFEVRVPEQRDALHLPLLSWGVKRPRTRIGGLLDPGTPMDADLVASSTVVWEQDADPFAGTADDFPAFNEEELAWLMELLPQ。
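The purification of full-length dHax3 is shown in Figure 5 (purified by nickel affinity chromatography via the 6×His tag; SDS-PAGE followed by Coomassie Brilliant Blue staining).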
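As a consistency check between SEQ ID NO:1 and SEQ ID NO:2, the start of the gene can be translated with the standard genetic code. The sketch below translates only the first 60 nucleotides of SEQ ID NO:1; the function name `translate` is a hypothetical helper. The result corresponds to the residues of SEQ ID NO:2 that follow the N-terminal tag and linker (MHHHHHHITSLYKKAGL), which is contributed by the vector rather than by the synthesized gene.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence (5'->3') until the first stop codon."""
    dna = dna.upper().replace(" ", "")
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First 60 nucleotides of SEQ ID NO:1.
FIRST_60_NT = "ATGGACCCAATACGAAGCAGAACGCCATCACCAGCTAGGGAACTTCTCTCTGGACCACAG"

if __name__ == "__main__":
    print(translate(FIRST_60_NT))
```

Applying `translate` to the complete 2883-nt sequence would allow the whole of SEQ ID NO:2 after the tag to be checked in the same way.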
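Secondary structure prediction showed that both the N-terminus and the C-terminus of the protein contain long regions without secondary structure. Because such regions are unfavorable for protein crystallization, the inventors designed a truncated protein (the dHax3 truncation, denoted dHax3-Δ, comprising residues 230-721 of the protein sequence) to obtain a more stable protein. The dHax3 truncation was cloned into the pET21 expression vector (Novagen). The sequence of the expressed dHax3 truncation is given below; it carries a His6 tag at its C-terminus for affinity purification on a nickel column (SEQ ID NO:3):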
MQWSGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVHAWRNALTGAPLNLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPQQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQALLPVLCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPQQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPQQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPQQVVAIASNSGGKQALETVQRLLPVLCQAHGLTPQQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPQQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPQQVVAIASNGGGRPALESIVAQLSRPDPALAALTNDHLVALACLGGRPALDAVKKLEHHHHHH。
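The purification of the dHax3 truncation is shown in Figure 6 (purified by nickel affinity chromatography via the His6 tag; SDS-PAGE followed by Coomassie Brilliant Blue staining).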
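The repeat architecture of the truncation can be read directly from SEQ ID NO:3. The sketch below extracts the residues at positions 12 and 13 of each repeat and models the NS to NI point mutation in the seventh repeat. It assumes that every repeat in this particular protein begins with the motif LTP[EQ]QVVAIAS, which holds for SEQ ID NO:3 but is not a general rule for TALE proteins; `extract_rvds` and `mutate_rvd` are hypothetical helper names.

```python
import re

# dHax3 truncation (SEQ ID NO:3), split into pieces only for readability.
DHAX3_DELTA = (
    "MQWSGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVHAWRNALTGAPLN"
    "LTPEQVVAIASHDGGKQALETVQRLLPVLCQAHG"
    "LTPQQVVAIASHDGGKQALETVQRLLPVLCQAHG"
    "LTPEQVVAIASHDGGKQALETVQALLPVLCQAHG"
    "LTPEQVVAIASNGGGKQALETVQRLLPVLCQAHG"
    "LTPQQVVAIASNGGGKQALETVQRLLPVLCQAHG"
    "LTPQQVVAIASNGGGKQALETVQRLLPVLCQAHG"
    "LTPQQVVAIASNSGGKQALETVQRLLPVLCQAHG"
    "LTPQQVVAIASNGGGKQALETVQRLLPVLCQAHG"
    "LTPQQVVAIASHDGGKQALETVQRLLPVLCQAHG"
    "LTPEQVVAIASNGGGKQALETVQRLLPVLCQAHG"
    "LTPEQVVAIASHDGGKQALETVQRLLPVLCQAHG"
    "LTPQQVVAIASNGGGRPALESIVAQLSRPDPALAALTNDHLVALACLGGRPALDAVKKLEHHHHHH"
)

# Assumption: each repeat starts with this 11-residue motif, so the two
# residues captured after it are positions 12 and 13 (the RVD).
REPEAT_START = re.compile(r"LTP[EQ]QVVAIAS(..)")


def extract_rvds(protein: str) -> list[str]:
    """Return the RVD of every repeat, in N- to C-terminal order."""
    return REPEAT_START.findall(protein)


def mutate_rvd(protein: str, repeat_number: int, new_rvd: str) -> str:
    """Replace the RVD of the given repeat (1-based) with new_rvd."""
    match = list(REPEAT_START.finditer(protein))[repeat_number - 1]
    start, end = match.span(1)
    return protein[:start] + new_rvd + protein[end:]


if __name__ == "__main__":
    print(extract_rvds(DHAX3_DELTA))
    print(extract_rvds(mutate_rvd(DHAX3_DELTA, 7, "NI")))
```

With the commonly reported preferences HD for C, NG for T, and NI for A (an assumption taken from the literature), the twelve RVDs of the mutated protein read CCCTTTATCTCT, that is, the stated target TCCCTTTATCTCT without its leading T.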
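3. Construction and expression of dHax3-NI and dHax3-NI-Δ
The inventors also constructed and expressed the dHax3-NI-Δ protein, in which the NS in the seventh repeat unit of the DNA-binding domain was changed to NI by point mutation, for co-crystallization with DNA-RNA, and constructed and expressed dHax3-NI for the EMSA and RNase H protection experiments. The amino acid sequence of dHax3-NI-Δ is as follows (SEQ ID NO:4):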
MQWSGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVHAWRNALTGAPLNLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPQQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQALLPVLCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPQQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPQQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPQQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPQQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPQQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPQQVVAIASNGGGRPALESIVAQLSRPDPALAALTNDHLVALACLGGRPALDAVKKLEHHHHHH。
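4. Construction of the TALE24 repeat units and the TALEHIV repeat units
Two further sets of TALE DNA-binding domain repeat units were designed (the TALEHIV repeat units and the TALE24 repeat units). The corresponding DNA-binding domain repeat units were obtained by synthesis, with SpeI and SalI restriction sites at the two ends of each synthesized fragment. The DNA and protein sequences of the TALE24 repeat units and the TALEHIV repeat units are given in Table 2:
Table 2: DNA and protein sequences of the TALE24 repeat units and the TALEHIV repeat units
SpeI (ACTAGT) and SalI (GTCGAC) sites are highlighted in the table.
5. Construction of the dHax3-TALE24 repeat chimeric protein and the dHax3-TALEHIV repeat chimeric protein
The synthesized TALE24 repeat units or TALEHIV repeat units were inserted between the NheI and SalI sites of the dHax3 gene, replacing the dHax3 repeat units to form two chimeric proteins, dHax3-TALE24 repeat units and dHax3-TALEHIV repeat units.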
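Locating the flanking restriction sites in a synthesized fragment is a simple string search, sketched below. The example fragment is invented for illustration and is not a sequence from Table 2; `find_sites` and `excise_insert` are hypothetical helper names. SpeI (A^CTAGT) and NheI (G^CTAGC) both leave a 5'-CTAG overhang, which is general knowledge about these enzymes and would explain how a SpeI-cut fragment can be ligated into an NheI-cut vector.

```python
# Recognition sequences; all three are palindromic, so searching one
# strand is enough to find every site.
RESTRICTION_SITES = {"SpeI": "ACTAGT", "SalI": "GTCGAC", "NheI": "GCTAGC"}


def find_sites(dna: str, enzyme: str) -> list[int]:
    """Return the 0-based start positions of every site for the enzyme."""
    dna = dna.upper()
    site = RESTRICTION_SITES[enzyme]
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)
    return positions


def excise_insert(dna: str, left: str = "SpeI", right: str = "SalI") -> str:
    """Return the sequence from the first left site to the last right site."""
    left_hits = find_sites(dna, left)
    right_hits = find_sites(dna, right)
    if not left_hits or not right_hits or right_hits[-1] < left_hits[0]:
        raise ValueError("flanking sites not found in the expected order")
    end = right_hits[-1] + len(RESTRICTION_SITES[right])
    return dna.upper()[left_hits[0]:end]


if __name__ == "__main__":
    # Invented example fragment, not taken from Table 2.
    fragment = "GG" + "ACTAGT" + "CTGACCCCG" + "GTCGAC" + "CC"
    print(find_sites(fragment, "SpeI"), find_sites(fragment, "SalI"))
    print(excise_insert(fragment))
```

Before cloning, the same search can confirm that neither enzyme cuts inside the repeat region itself.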
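Example 2: Crystal structures of dHax3-Δ in complex with double-stranded DNA and of dHax3-NI-Δ in complex with a DNA-RNA duplex
●Preparation of single- and double-stranded DNA
To test the binding of dHax3 to single- and double-stranded DNA and to obtain crystals of the protein-dsDNA complex, the inventors obtained single-stranded DNAs (17 nt) by chemical synthesis (Invitrogen and Takara):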
5’ TG TCCCTTTATCTCT CT 3’ (SEQ ID NO:9 )
3’ AC AGGGAAATAGAGA GA 5’ (SEQ ID NO:10)
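The synthesized single-stranded DNAs were dissolved to 1 mM and the two strands were mixed in equimolar amounts, incubated at 85 °C for at least 3 min, and cooled slowly to 22 °C over no less than 3 hours. For long-term storage, the annealed double-stranded DNA can be lyophilized and kept at ultra-low temperature.
●Preparation of the DNA-RNA hybrid duplex
To test the binding of dHax3 to DNA-RNA hybrid duplexes and to obtain crystals of the protein-DNA-RNA complex, the inventors obtained single-stranded DNA (17 nt) and RNA by chemical synthesis (nucleic acids were synthesized by Invitrogen and Takara):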
DNA 5’ TG TCCCTTTATCTCT CT 3’ (SEQ ID NO:9)
RNA 3’ AC AGGGAAAUAGAGA GA 5’ (SEQ ID NO:11)
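The synthesized single-stranded DNA and RNA were dissolved to 1 mM and the two strands were mixed in equimolar amounts, incubated at 85 °C for at least 3 min, and cooled slowly to 22 °C over no less than 3 hours. For long-term storage, the annealed DNA-RNA hybrid duplex can be lyophilized and kept at ultra-low temperature.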
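Before annealing, it is worth confirming that the two strands are fully complementary. The sketch below checks that SEQ ID NO:10 and SEQ ID NO:11, written 3'→5' as in the text above, are the DNA and RNA complements of SEQ ID NO:9. The helper names are hypothetical.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement_dna(strand: str) -> str:
    """Base-by-base DNA complement, read in the opposite polarity."""
    return strand.upper().translate(DNA_COMPLEMENT)


def complement_rna(dna_strand: str) -> str:
    """Base-by-base RNA complement of a DNA strand (U in place of T)."""
    return complement_dna(dna_strand).replace("T", "U")


# Sequences from the text, with the spacing removed.
SEQ_ID_9 = "TGTCCCTTTATCTCTCT"    # DNA, written 5'->3'
SEQ_ID_10 = "ACAGGGAAATAGAGAGA"   # DNA, written 3'->5'
SEQ_ID_11 = "ACAGGGAAAUAGAGAGA"   # RNA, written 3'->5'

if __name__ == "__main__":
    print(complement_dna(SEQ_ID_9) == SEQ_ID_10)
    print(complement_rna(SEQ_ID_9) == SEQ_ID_11)
```

Because the partner strands are written 3'→5', no reversal is needed; reversing the result would give the conventional 5'→3' notation.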
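●Crystallization of the complex
Purified dHax3-Δ (residues 231-720 of the full-length sequence) was adjusted to a protein concentration of 6-7 mg/ml, annealed double-stranded DNA was added at a molar ratio of 1.5:1, and the mixture was incubated at 4 °C for 30 min.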
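Setting up the molar ratio requires converting the protein concentration from mg/ml to molar units. The sketch below does this with a rough rule of thumb of 110 Da per residue, which is an assumption and not a measured molecular weight, and it assumes that the 1.5:1 ratio means 1.5 DNA duplexes per protein molecule, which the text does not state explicitly. The function names are hypothetical.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average, an approximation


def approx_protein_conc_uM(mg_per_ml: float, n_residues: int) -> float:
    """Convert mg/ml to micromolar using an estimated molecular weight."""
    molecular_weight = n_residues * AVG_RESIDUE_MASS_DA  # g/mol
    # mg/ml equals g/L, so g/L divided by g/mol gives mol/L.
    return mg_per_ml / molecular_weight * 1e6


def duplex_conc_uM(protein_uM: float, dna_per_protein: float = 1.5) -> float:
    """Duplex concentration needed for the chosen DNA:protein ratio."""
    return protein_uM * dna_per_protein


if __name__ == "__main__":
    # 499 residues is the length given for SEQ ID NO:3 in the sequence listing.
    protein = approx_protein_conc_uM(6.5, 499)
    print(f"protein ~{protein:.0f} uM, duplex ~{duplex_conc_uM(protein):.0f} uM")
```

For real experiments the molecular weight should be computed from the actual sequence and the concentration measured spectroscopically; the estimate here is only meant to show the arithmetic.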
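Initial screening of crystallization conditions relied mainly on commercial screening kits, including SaltRx, Natrix, PEG/Ion, Crystal Screen, and Index from Hampton; Wizard I, II, and III from Emerald; and ProPlex from Molecular Dimensions.
Conditions that produced protein crystals were selected from these kits and optimized by varying the concentration and type of precipitant, the concentration and type of salt, and the concentration and type of buffer. The crystals were further optimized with the Additive Screen and Detergent Screen kits, and dehydration and annealing were also tried in order to improve diffraction quality.
Protein crystallization follows no fixed rules and so far remains something of an art. The usual starting point is a sparse matrix screen, that is, screening with the pre-formulated conditions sold by various companies. In most cases the initial hits do not yield crystals of high diffraction quality, so in subsequent experiments the inventors refined the initial conditions further, including adjusting the precipitant, pH buffer, and salt; adding reducing agents, detergents, or alcohols; and adjusting the temperature and duration of the crystallization experiment. Under the final conditions, the reservoir solution below was mixed 1:1 (v/v) with the pre-incubated protein-nucleic acid complex, and crystals were obtained by the hanging drop vapor diffusion method after two days at 18 °C.
Reservoir solution: 8-10% PEG3350 (w/v), 12% ethanol, 0.1 M MES pH 6.0.
●Data collection and processing
Data were collected at beamline BL17U of the Shanghai Synchrotron Radiation Facility (SSRF) or at beamline BL41XU of SPring-8 in Japan. All diffraction data were integrated with HKL2000, and further data processing was carried out with CCP4. The structure of the dHax3-DNA complex was solved by molecular replacement, using DNA-free dHax3 as the search model. The structures were refined with Phenix and COOT. After data processing, structure determination, and refinement, the resolution of the dHax3 protein structure was 2.4 Å, that of the dHax3-Δ-dsDNA complex was 1.85 Å, and that of dHax3-NI-Δ bound to the DNA-RNA duplex was 2.5 Å. Data collection and refinement statistics are given in Tables 4 and 5:
Data collection and refinement statistics
Table 4. Data collection and refinement statistics for the dHax3 crystal structure and the DNA-bound dHax3-Δ complex crystal structure
Table 5. Data collection and refinement statistics for the crystal structure of the dHax3-NI-Δ-DNA/RNA duplex complex
The inventors solved the high-resolution (1.85 Å) crystal structure of dHax3-Δ bound to double-stranded DNA (dsDNA). The structure clearly shows that dHax3 adopts a right-handed superhelical structure that wraps the dsDNA in the center of the complex. The protein winds around the outside of the DNA and inserts into its major groove (see Figure 1).
Structural analysis shows that the interactions between dHax3 and DNA are concentrated on the DNA strand that carries the recognition sequence, while the complementary strand does not take part in protein-DNA interactions (see Figure 2). dHax3 should therefore still bind even if the complementary strand is RNA. The analysis further shows that the non-coding strand does not contact dHax3 directly and can therefore tolerate considerable modification of its bases and backbone; that is, the non-coding strand may be DNA, RNA, or a derivative or modified form of either.
Figure 4 shows the crystal structure of dHax3-NI-Δ in complex with a DNA-RNA hybrid duplex.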
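Example 3: Gel shift assays confirm that dHax3-NI interacts with DNA-RNA hybrid duplexes
●EMSA (electrophoretic mobility shift assay, also called gel shift or gel retardation assay)
The gel shift assay is a specialized gel electrophoresis technique for studying interactions between DNA or RNA and proteins in vitro. Its principle is as follows: in an electric field, a small free nucleic acid fragment migrates toward the anode faster than the same fragment bound to protein. A short nucleic acid fragment can therefore be labeled, mixed with protein, and run on a gel; if the target DNA is bound by a specific protein, its migration is retarded, and autoradiography of the gel reveals the nucleic acid-binding protein. In addition, by quantifying the protein-bound and free DNA, the binding affinity of the protein for the nucleic acid can be estimated fairly accurately by curve fitting.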
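The affinity estimate mentioned above can be sketched as a one-site binding fit. The code assumes the simplest model, fraction bound = [P] / (Kd + [P]), which holds when the probe concentration is far below the protein concentration, as with a probe of about 4 nM against protein in the range 0.15 to 4 μM. The data in the example are synthetic, generated from an arbitrary Kd of 0.5 μM purely to show that the fit recovers it; they are not measurements from this document. The function names are hypothetical.

```python
def fraction_bound(protein_uM: float, kd_uM: float) -> float:
    """One-site binding isotherm, valid when probe << protein."""
    return protein_uM / (kd_uM + protein_uM)


def fit_kd(protein_uM, bound, lo=1e-3, hi=100.0, steps=2000):
    """Least-squares estimate of Kd by a log-spaced grid search."""
    best_kd, best_sse = lo, float("inf")
    for i in range(steps + 1):
        kd = lo * (hi / lo) ** (i / steps)
        sse = sum(
            (fraction_bound(p, kd) - f) ** 2
            for p, f in zip(protein_uM, bound)
        )
        if sse < best_sse:
            best_kd, best_sse = kd, sse
    return best_kd


if __name__ == "__main__":
    # Protein concentrations as in the gel shift figure (uM).
    concentrations = [0.0, 0.15, 0.44, 1.33, 4.0]
    # Synthetic fractions bound, generated with an arbitrary Kd of 0.5 uM.
    synthetic = [fraction_bound(p, 0.5) for p in concentrations]
    print(f"fitted Kd ~ {fit_kd(concentrations, synthetic):.2f} uM")
```

In practice the fraction bound would come from band intensities (shifted signal divided by total signal per lane), and a dedicated curve-fitting routine with error estimates would replace the grid search.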
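●DNA/DNA oligos and DNA/RNA oligos
The DNA/DNA oligo fragments used in the gel shift assays are listed in the table below:
The DNA/RNA oligo fragments used in the gel shift assays are listed in the table below:
●End labeling of DNA/RNA
After the reaction has been set up according to the table above, mix gently and incubate at 37 °C for 30 min. Remove excess [γ-32P]-ATP with a prepacked G25 desalting column, add an excess of the unlabeled complementary strand, and anneal to form double-stranded DNA or the DNA-RNA hybrid duplex.
●DNA/RNA-protein binding reaction
Full-length protein (various concentrations) | 5 µl |
DNA/RNA | 2 µl |
5× buffer | 2 µl |
ddH2O | 1 µl |
Combine the components in the proportions above, mix, and incubate at 4 °C for 20 min; then run the samples on a 6% native gel.
After the run, dry the gel completely on a gel dryer and expose it to a phosphor screen overnight.
Read the image with a Typhoon 9400 Variable Mode Imager.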
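The binding reaction above is a 10 µl mix, so the concentration of each protein stock follows from the dilution factor. The sketch below checks the volumes and computes the stock concentrations needed, assuming that the concentrations quoted for the gel shift figure (0.15, 0.44, 1.33, and 4 μM) are final concentrations in the reaction; that assumption, and the observation that they form a roughly three-fold dilution series, are interpretations rather than statements from the document. The function names are hypothetical.

```python
# Volumes from the binding reaction table, in microliters.
REACTION_UL = {"protein": 5.0, "probe": 2.0, "buffer_5x": 2.0, "ddH2O": 1.0}


def total_volume_ul(reaction: dict) -> float:
    """Total reaction volume in microliters."""
    return sum(reaction.values())


def final_buffer_strength(reaction: dict, stock_x: float = 5.0) -> float:
    """Final buffer strength (in X) after dilution into the reaction."""
    return stock_x * reaction["buffer_5x"] / total_volume_ul(reaction)


def protein_stock_uM(final_uM: float, reaction: dict) -> float:
    """Stock concentration needed to reach final_uM in the reaction."""
    return final_uM * total_volume_ul(reaction) / reaction["protein"]


def threefold_series(top_uM: float, n: int) -> list[float]:
    """Serial three-fold dilutions starting from top_uM."""
    return [top_uM / 3 ** i for i in range(n)]


if __name__ == "__main__":
    print(total_volume_ul(REACTION_UL), final_buffer_strength(REACTION_UL))
    for final in threefold_series(4.0, 4):
        stock = protein_stock_uM(final, REACTION_UL)
        print(f"final {final:.2f} uM -> stock {stock:.2f} uM")
```

Because the protein makes up half of the reaction volume, each stock is simply twice the desired final concentration.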
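The gel shift assays demonstrate that the dHax3-NI protein interacts with a DNA-RNA hybrid duplex and retains strong binding; see Figure 3 for details.
Example 4: Verification by RNase H protection assay
The sequences of the DNA-RNA strands used in the RNase H protection assay are as follows:
The DNA-RNA hybrid duplexes were prepared as for the EMSA experiments, except that the hybrid duplexes used in the RNase H protection assay were radioactively labeled.
The 32P-labeled DNA-RNA duplex was mixed with each of the three TALE proteins described above (dHax3-NI, TALE24 repeats, and TALEHIV repeats), or with BSA as a control, and incubated on ice for 20 minutes. The incubation buffer was 20 mM Tris-HCl (pH 8.0), 50 mM NaCl, 5 mM MgCl2, 10 mM DTT. After incubation, RNase H (Takara) was added to 0.1 U/μl and allowed to react at room temperature for 5 minutes. The reaction was stopped with phenol-chloroform, and the resulting nucleic acid fragments were purified by ethanol precipitation. The precipitated samples were resuspended in RNA loading buffer (95% formamide, 18 mM EDTA, 0.025% xylene cyanol, 0.025% bromophenol blue) and analyzed on a 12% polyacrylamide gel containing 7 M urea. After the run, the gel was dried completely on a gel dryer and exposed to a phosphor screen overnight, and the image was read with a Typhoon 9400. RNA ladders were prepared by digesting ssRNA with RNase T1 or RNase A.
As shown in Figure 8, dHax3-NI protects DNA-RNA and prevents RNase H from cleaving the RNA in the DNA-RNA hybrid duplex. In lanes 1 and 2, the controls without RNase H and with or without dHax3-NI, no obvious RNA degradation bands appear. In lane 3, the control with RNase H and without dHax3-NI, most of the RNA is degraded into small fragments. In lanes 4-10, which contain RNase H together with a concentration gradient of dHax3-NI (0.004, 0.015, 0.05, 0.025, 0.1, 0.4, 1.6 μM), partially degraded RNA bands appear as indicated by the arrows. These bands show directly that dHax3-NI binds the DNA-RNA duplex and protects it, preventing RNase H from degrading the RNA strand. Lanes 13 and 14 are RNA ladders (T1 and A) used to locate the RNase H cleavage sites in the DNA-RNA hybrid duplex.
To determine whether protection of DNA-RNA duplexes is a general property of TALE proteins rather than one limited to dHax3, the inventors designed a DNA-binding domain of a different length, the TALE24 repeat units, which has 24 repeat units and recognizes a longer DNA-RNA hybrid duplex (see P. Yin, D. Deng, C. Yan, X. Pan, J.J. Xi, N. Yan, Y. Shi, Specific DNA-RNA Hybrid Recognition by TAL Effectors, Cell Reports, 2 (2012) 707-713). As shown in Figure 9, the RNase H protection assay revealed, to the inventors' surprise, that the dHax3-TALE24 repeat chimeric protein also protects DNA-RNA and prevents RNase H from cleaving the RNA in the DNA-RNA hybrid duplex. The TALE24 repeat units can therefore likewise prevent RNase H from degrading the RNA strand of a DNA-RNA duplex.
To explore the potential of TALEs in HIV therapy, the inventors designed the TALEHIV repeat units, which specifically recognize a particular segment of the HIV genome (see P. Yin, D. Deng, C. Yan, X. Pan, J.J. Xi, N. Yan, Y. Shi, Specific DNA-RNA Hybrid Recognition by TAL Effectors, Cell Reports, 2 (2012) 707-713), and constructed the dHax3-TALEHIV repeat chimeric protein for RNase H degradation experiments. The inventors were surprised to find that the TALEHIV repeat units prevent RNase H from degrading the RNA strand of the DNA-RNA duplex. As shown in Figure 10, RNA degradation decreases progressively under protection by a concentration gradient of the dHax3-TALEHIV repeat protein (0.004, 0.015, 0.05, 0.025, 0.1, 0.4, 1.6 μM). No partially degraded products appear here because the portion of the DNA-RNA duplex left exposed outside the region protected by the TALEHIV repeat units is short. This indicates that a TALE protein carrying the TALEHIV repeat units can prevent degradation of the RNA strand during replication of the HIV genome.
During nucleic acid replication, HIV uses reverse transcriptase to reverse-transcribe its RNA into DNA; the RNase H domain of the reverse transcriptase then degrades the RNA, and the released single-stranded DNA is used as a template by the DNA polymerase activity to synthesize the complementary DNA strand, forming double-stranded DNA. Degradation of the RNA in the DNA-RNA duplex is therefore an essential step in HIV replication. A TALE that recognizes a particular segment of the HIV genome can prevent degradation of the RNA strand during replication of the HIV genome and thereby inhibit or slow HIV replication.
Although the present invention has been described in detail herein with reference to exemplary embodiments, it should be understood that the invention is not limited to those embodiments. A person of ordinary skill in the art with access to the teachings herein will recognize other variations, modifications, and embodiments within the scope of the invention. Accordingly, the invention should be construed broadly, consistent with the claims that follow.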
Sequence Listing
<110> Tsinghua University
<120> Method for specifically binding and targeting DNA-RNA hybrid duplexes
<130> FPCH12160040P
<150> CN 201210021004.9
<151> 2012-01-04
<160> 20
<170> PatentIn version 3.3
<210> 1
<211> 2883
<212> DNA
<213> Artificial
<220>
<223> dHax3 DNA sequence
<400> 1
atggacccaa tacgaagcag aacgccatca ccagctaggg aacttctctc tggaccacag 60
cctgatggag ttcagccaac tgcagatcga ggtgtttctc cgccagccgg tggcccttta 120
gatggtctcc cagcaagaag aacaatgtcc cgtaccagac tcccaagtcc ccctgccccg 180
tcgccagcct tttcagctga ctccttctct gatcttctta ggcaatttga cccttctctt 240
ttcaatacat cccttttcga ttcacttcct cctttcggcg cacatcatac tgaggcagcc 300
accggcgaat gggacgaagt ccaaagtggt ttaagggcag ctgatgctcc accaccgacg 360
atgagagtcg ctgttaccgc cgcacgtcct cctagagcca agccagcccc tagaagacga 420
gctgcgcaac cctccgatgc aagccctgca gctcaagtag accttcgaac actaggttac 480
tcccagcaac aacaagaaaa aataaagcca aaggttagat ctacagttgc acaacatcac 540
gaagccctag tcggacacgg atttacacat gctcatatcg tggctctttc acaacatcct 600
gcagctcttg gaacagtcgc tgtcaaatat caggatatga ttgctgcatt gccagaagct 660
actcacgaag ctatcgtcgg agttgggaaa caatggtcag gcgcaagagc attagaggcg 720
cttctcaccg tagctggtga attacgaggt cctccactcc aattggatac tgggcaatta 780
ttaaaaatcg ctaaacgagg tggagtcact gctgtcgaag ccgttcatgc atggcgtaac 840
gctctcacgg gcgcaccact aaaccttact cctgaacagg ttgtcgcaat agcttcacat 900
gatggcggaa aacaagctct tgaaacagtg caacgtctcc ttcccgtcct ctgtcaggct 960
cacggattga ctcctcagca ggtcgtcgca attgcatcac atgatggagg caaacaagct 1020
ttagaaacag tacaaagact attgcccgtt ctttgccaag cgcatgggtt aactcccgaa 1080
caagtcgttg ccattgcaag tcacgacgga ggtaaacaag ctctcgaaac ggttcaagca 1140
cttttacccg ttctctgtca agcacatgga ctcacacctg aacaagtagt tgctatcgca 1200
tcgaatggag gtggaaaaca agcactggaa actgtacaaa gacttttgcc agttttatgt 1260
caagcgcacg gtcttactcc tcaacaagtt gtcgccattg cctctaacgg tggtggaaaa 1320
caagctcttg aaactgtcca gagacttctg cccgttctat gtcaggctca tgggctaacc 1380
cctcaacagg ttgttgcaat cgcatctaat ggaggaggaa aacaagcttt agaaactgtc 1440
caacgactac tgcccgttct ctgccaagca cacggactta ccccacaaca agttgtggca 1500
atagcttcta attctggtgg taaacaagcc cttgagacgg ttcaaagact tctaccagtt 1560
ctttgtcagg cacatggatt gaccccacaa caggtcgtag caatcgcatc taatggaggt 1620
ggtaagcaag ctctagaaac ggtacaaaga ttacttcccg tgctttgtca agctcatgga 1680
ctcactcctc aacaagtggt cgctattgca agtcatgatg gtggaaagca agcactagaa 1740
accgtccaac gactccttcc tgttctctgt caagcacatg gtcttacgcc cgaacaagtt 1800
gttgctatag cttcgaacgg aggtggaaaa caagctctcg aaaccgtcca aaggctcctc 1860
ccagtacttt gccaagcaca tggattaacc cctgagcaag tagttgcaat tgcctcgcac 1920
gacggaggaa agcaagcatt agaaactgtt cagagacttt tgcctgtcct gtgtcaagcc 1980
cacggtctaa caccacaaca agtcgtcgca atcgctagta atggaggagg tagacctgca 2040
ttggagtcga tagtcgcaca actatcacga cctgatcccg ctcttgcagc attgacaaac 2100
gatcatttag tcgcacttgc atgtttagga ggacgaccag cacttgatgc cgttaagaaa 2160
ggactaccgc acgcccctgc attgattaaa agaacaaaca gacgaatccc ggagagaact 2220
tcacatcgtg tagccgatca tgctcaagtc gtaagagttt tgggtttctt ccaatgtcat 2280
tcccacccag ctcaagcttt tgacgatgca atgactcaat ttggaatgag tagacatgga 2340
ctcctgcaat tatttcgaag ggtcggagtt acagagctcg aagccaggtc aggaacgctg 2400
ccccccgcat ctcaacgatg ggatagaatt ctccaagcct ctggaatgaa aagagctaaa 2460
ccttcaccaa cgtccacaca aacaccagac caagcttctc tccacgcttt tgccgactca 2520
ctagagagag atctagatgc accgtcacct atgcatgaag gagaccaaac aagagcctct 2580
tcaagaaaac gttctcgttc tgatagagct gtcactggac cttccgccca acaatctttc 2640
gaagtccgag ttcctgagca acgagatgcc ctacacctgc ctttgctttc ttggggagtt 2700
aagcgaccac gtactagaat tggtggacta ctcgatccag gtacaccaat ggatgctgat 2760
ctcgttgctt cctctaccgt agtatgggag caagacgcag accccttcgc tggaactgct 2820
gacgatttcc cagcctttaa cgaggaagaa ttggcttggt taatggaact tctaccgcaa 2880
tga 2883
<210> 2
<211> 977
<212> PRT
<213> Artificial
<220>
<223> dHax3 protein amino acid sequence
<400> 2
Met His His His His His His Ile Thr Ser Leu Tyr Lys Lys Ala Gly
1 5 10 15
Leu Met Asp Pro Ile Arg Ser Arg Thr Pro Ser Pro Ala Arg Glu Leu
20 25 30
Leu Ser Gly Pro Gln Pro Asp Gly Val Gln Pro Thr Ala Asp Arg Gly
35 40 45
Val Ser Pro Pro Ala Gly Gly Pro Leu Asp Gly Leu Pro Ala Arg Arg
50 55 60
Thr Met Ser Arg Thr Arg Leu Pro Ser Pro Pro Ala Pro Ser Pro Ala
65 70 75 80
Phe Ser Ala Asp Ser Phe Ser Asp Leu Leu Arg Gln Phe Asp Pro Ser
85 90 95
Leu Phe Asn Thr Ser Leu Phe Asp Ser Leu Pro Pro Phe Gly Ala His
100 105 110
His Thr Glu Ala Ala Thr Gly Glu Trp Asp Glu Val Gln Ser Gly Leu
115 120 125
Arg Ala Ala Asp Ala Pro Pro Pro Thr Met Arg Val Ala Val Thr Ala
130 135 140
Ala Arg Pro Pro Arg Ala Lys Pro Ala Pro Arg Arg Arg Ala Ala Gln
145 150 155 160
Pro Ser Asp Ala Ser Pro Ala Ala Gln Val Asp Leu Arg Thr Leu Gly
165 170 175
Tyr Ser Gln Gln Gln Gln Glu Lys Ile Lys Pro Lys Val Arg Ser Thr
180 185 190
Val Ala Gln His His Glu Ala Leu Val Gly His Gly Phe Thr His Ala
195 200 205
His Ile Val Ala Leu Ser Gln His Pro Ala Ala Leu Gly Thr Val Ala
210 215 220
Val Lys Tyr Gln Asp Met Ile Ala Ala Leu Pro Glu Ala Thr His Glu
225 230 235 240
Ala Ile Val Gly Val Gly Lys Gln Trp Ser Gly Ala Arg Ala Leu Glu
245 250 255
Ala Leu Leu Thr Val Ala Gly Glu Leu Arg Gly Pro Pro Leu Gln Leu
260 265 270
Asp Thr Gly Gln Leu Leu Lys Ile Ala Lys Arg Gly Gly Val Thr Ala
275 280 285
Val Glu Ala Val His Ala Trp Arg Asn Ala Leu Thr Gly Ala Pro Leu
290 295 300
Asn Leu Thr Pro Glu Gln Val Val Ala Ile Ala Ser His Asp Gly Gly
305 310 315 320
Lys Gln Ala Leu Glu Thr Val Gln Arg Leu Leu Pro Val Leu Cys Gln
325 330 335
Ala His Gly Leu Thr Pro Gln Gln Val Val Ala Ile Ala Ser His Asp
340 345 350
Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Arg Leu Leu Pro Val Leu
355 360 365
Cys Gln Ala His Gly Leu Thr Pro Glu Gln Val Val Ala Ile Ala Ser
370 375 380
His Asp Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Ala Leu Leu Pro
385 390 395 400
Val Leu Cys Gln Ala His Gly Leu Thr Pro Glu Gln Val Val Ala Ile
405 410 415
Ala Ser Asn Gly Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Arg Leu
420 425 430
Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro Gln Gln Val Val
435 440 445
Ala Ile Ala Ser Asn Gly Gly Gly Lys Gln Ala Leu Glu Thr Val Gln
450 455 460
Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro Gln Gln
465 470 475 480
Val Val Ala Ile Ala Ser Asn Gly Gly Gly Lys Gln Ala Leu Glu Thr
485 490 495
Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro
500 505 510
Gln Gln Val Val Ala Ile Ala Ser Asn Ser Gly Gly Lys Gln Ala Leu
515 520 525
Glu Thr Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu
530 535 540
Thr Pro Gln Gln Val Val Ala Ile Ala Ser Asn Gly Gly Gly Lys Gln
545 550 555 560
Ala Leu Glu Thr Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His
565 570 575
Gly Leu Thr Pro Gln Gln Val Val Ala Ile Ala Ser His Asp Gly Gly
580 585 590
Lys Gln Ala Leu Glu Thr Val Gln Arg Leu Leu Pro Val Leu Cys Gln
595 600 605
Ala His Gly Leu Thr Pro Glu Gln Val Val Ala Ile Ala Ser Asn Gly
610 615 620
Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Arg Leu Leu Pro Val Leu
625 630 635 640
Cys Gln Ala His Gly Leu Thr Pro Glu Gln Val Val Ala Ile Ala Ser
645 650 655
His Asp Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Arg Leu Leu Pro
660 665 670
Val Leu Cys Gln Ala His Gly Leu Thr Pro Gln Gln Val Val Ala Ile
675 680 685
Ala Ser Asn Gly Gly Gly Arg Pro Ala Leu Glu Ser Ile Val Ala Gln
690 695 700
Leu Ser Arg Pro Asp Pro Ala Leu Ala Ala Leu Thr Asn Asp His Leu
705 710 715 720
Val Ala Leu Ala Cys Leu Gly Gly Arg Pro Ala Leu Asp Ala Val Lys
725 730 735
Lys Gly Leu Pro His Ala Pro Ala Leu Ile Lys Arg Thr Asn Arg Arg
740 745 750
Ile Pro Glu Arg Thr Ser His Arg Val Ala Asp His Ala Gln Val Val
755 760 765
Arg Val Leu Gly Phe Phe Gln Cys His Ser His Pro Ala Gln Ala Phe
770 775 780
Asp Asp Ala Met Thr Gln Phe Gly Met Ser Arg His Gly Leu Leu Gln
785 790 795 800
Leu Phe Arg Arg Val Gly Val Thr Glu Leu Glu Ala Arg Ser Gly Thr
805 810 815
Leu Pro Pro Ala Ser Gln Arg Trp Asp Arg Ile Leu Gln Ala Ser Gly
820 825 830
Met Lys Arg Ala Lys Pro Ser Pro Thr Ser Thr Gln Thr Pro Asp Gln
835 840 845
Ala Ser Leu His Ala Phe Ala Asp Ser Leu Glu Arg Asp Leu Asp Ala
850 855 860
Pro Ser Pro Met His Glu Gly Asp Gln Thr Arg Ala Ser Ser Arg Lys
865 870 875 880
Arg Ser Arg Ser Asp Arg Ala Val Thr Gly Pro Ser Ala Gln Gln Ser
885 890 895
Phe Glu Val Arg Val Pro Glu Gln Arg Asp Ala Leu His Leu Pro Leu
900 905 910
Leu Ser Trp Gly Val Lys Arg Pro Arg Thr Arg Ile Gly Gly Leu Leu
915 920 925
Asp Pro Gly Thr Pro Met Asp Ala Asp Leu Val Ala Ser Ser Thr Val
930 935 940
Val Trp Glu Gln Asp Ala Asp Pro Phe Ala Gly Thr Ala Asp Asp Phe
945 950 955 960
Pro Ala Phe Asn Glu Glu Glu Leu Ala Trp Leu Met Glu Leu Leu Pro
965 970 975
Gln
<210> 3
<211> 499
<212> PRT
<213> Artificial
<220>
<223> Amino acid sequence of truncated dHax3 (with a 6×His tag at the C-terminus)
<400> 3
Met Gln Trp Ser Gly Ala Arg Ala Leu Glu Ala Leu Leu Thr Val Ala
1 5 10 15
Gly Glu Leu Arg Gly Pro Pro Leu Gln Leu Asp Thr Gly Gln Leu Leu
20 25 30
Lys Ile Ala Lys Arg Gly Gly Val Thr Ala Val Glu Ala Val His Ala
35 40 45
Trp Arg Asn Ala Leu Thr Gly Ala Pro Leu Asn Leu Thr Pro Glu Gln
50 55 60
Val Val Ala Ile Ala Ser His Asp Gly Gly Lys Gln Ala Leu Glu Thr
65 70 75 80
Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro
85 90 95
Gln Gln Val Val Ala Ile Ala Ser His Asp Gly Gly Lys Gln Ala Leu
100 105 110
Glu Thr Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu
115 120 125
Thr Pro Glu Gln Val Val Ala Ile Ala Ser His Asp Gly Gly Lys Gln
130 135 140
Ala Leu Glu Thr Val Gln Ala Leu Leu Pro Val Leu Cys Gln Ala His
145 150 155 160
Gly Leu Thr Pro Glu Gln Val Val Ala Ile Ala Ser Asn Gly Gly Gly
165 170 175
Lys Gln Ala Leu Glu Thr Val Gln Arg Leu Leu Pro Val Leu Cys Gln
180 185 190
Ala His Gly Leu Thr Pro Gln Gln Val Val Ala Ile Ala Ser Asn Gly
195 200 205
Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Arg Leu Leu Pro Val Leu
210 215 220
Cys Gln Ala His Gly Leu Thr Pro Gln Gln Val Val Ala Ile Ala Ser
225 230 235 240
Asn Gly Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Arg Leu Leu Pro
245 250 255
Val Leu Cys Gln Ala His Gly Leu Thr Pro Gln Gln Val Val Ala Ile
260 265 270
Ala Ser Asn Ser Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Arg Leu
275 280 285
Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro Gln Gln Val Val
290 295 300
Ala Ile Ala Ser Asn Gly Gly Gly Lys Gln Ala Leu Glu Thr Val Gln
305 310 315 320
Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro Gln Gln
325 330 335
Val Val Ala Ile Ala Ser His Asp Gly Gly Lys Gln Ala Leu Glu Thr
340 345 350
Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro
355 360 365
Glu Gln Val Val Ala Ile Ala Ser Asn Gly Gly Gly Lys Gln Ala Leu
370 375 380
Glu Thr Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu
385 390 395 400
Thr Pro Glu Gln Val Val Ala Ile Ala Ser His Asp Gly Gly Lys Gln
405 410 415
Ala Leu Glu Thr Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His
420 425 430
Gly Leu Thr Pro Gln Gln Val Val Ala Ile Ala Ser Asn Gly Gly Gly
435 440 445
Arg Pro Ala Leu Glu Ser Ile Val Ala Gln Leu Ser Arg Pro Asp Pro
450 455 460
Ala Leu Ala Ala Leu Thr Asn Asp His Leu Val Ala Leu Ala Cys Leu
465 470 475 480
Gly Gly Arg Pro Ala Leu Asp Ala Val Lys Lys Leu Glu His His His
485 490 495
His His His
<210> 4
<211> 499
<212> PRT
<213> Artificial
<220>
<223> Amino acid sequence of truncated dHax3-NI (with a 6×His tag at the C-terminus)
<400> 4
Met Gln Trp Ser Gly Ala Arg Ala Leu Glu Ala Leu Leu Thr Val Ala
1 5 10 15
Gly Glu Leu Arg Gly Pro Pro Leu Gln Leu Asp Thr Gly Gln Leu Leu
20 25 30
Lys Ile Ala Lys Arg Gly Gly Val Thr Ala Val Glu Ala Val His Ala
35 40 45
Trp Arg Asn Ala Leu Thr Gly Ala Pro Leu Asn Leu Thr Pro Glu Gln
50 55 60
Val Val Ala Ile Ala Ser His Asp Gly Gly Lys Gln Ala Leu Glu Thr
65 70 75 80
Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro
85 90 95
Gln Gln Val Val Ala Ile Ala Ser His Asp Gly Gly Lys Gln Ala Leu
100 105 110
Glu Thr Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu
115 120 125
Thr Pro Glu Gln Val Val Ala Ile Ala Ser His Asp Gly Gly Lys Gln
130 135 140
Ala Leu Glu Thr Val Gln Ala Leu Leu Pro Val Leu Cys Gln Ala His
145 150 155 160
Gly Leu Thr Pro Glu Gln Val Val Ala Ile Ala Ser Asn Gly Gly Gly
165 170 175
Lys Gln Ala Leu Glu Thr Val Gln Arg Leu Leu Pro Val Leu Cys Gln
180 185 190
Ala His Gly Leu Thr Pro Gln Gln Val Val Ala Ile Ala Ser Asn Gly
195 200 205
Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Arg Leu Leu Pro Val Leu
210 215 220
Cys Gln Ala His Gly Leu Thr Pro Gln Gln Val Val Ala Ile Ala Ser
225 230 235 240
Asn Gly Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Arg Leu Leu Pro
245 250 255
Val Leu Cys Gln Ala His Gly Leu Thr Pro Gln Gln Val Val Ala Ile
260 265 270
Ala Ser Asn Ile Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Arg Leu
275 280 285
Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro Gln Gln Val Val
290 295 300
Ala Ile Ala Ser Asn Gly Gly Gly Lys Gln Ala Leu Glu Thr Val Gln
305 310 315 320
Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro Gln Gln
325 330 335
Val Val Ala Ile Ala Ser His Asp Gly Gly Lys Gln Ala Leu Glu Thr
340 345 350
Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro
355 360 365
Glu Gln Val Val Ala Ile Ala Ser Asn Gly Gly Gly Lys Gln Ala Leu
370 375 380
Glu Thr Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu
385 390 395 400
Thr Pro Glu Gln Val Val Ala Ile Ala Ser His Asp Gly Gly Lys Gln
405 410 415
Ala Leu Glu Thr Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His
420 425 430
Gly Leu Thr Pro Gln Gln Val Val Ala Ile Ala Ser Asn Gly Gly Gly
435 440 445
Arg Pro Ala Leu Glu Ser Ile Val Ala Gln Leu Ser Arg Pro Asp Pro
450 455 460
Ala Leu Ala Ala Leu Thr Asn Asp His Leu Val Ala Leu Ala Cys Leu
465 470 475 480
Gly Gly Arg Pro Ala Leu Asp Ala Val Lys Lys Leu Glu His His His
485 490 495
His His His
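As a reading aid (not part of the original listing), the following Python sketch compares residues 273-288 of SEQ ID NO: 3 and SEQ ID NO: 4, copied from the rows above. The variable names are illustrative. In this listing the two sequences appear to differ only within this row.

```python
# Residues 273-288 of SEQ ID NO: 3 (dHax3) and SEQ ID NO: 4 (dHax3-NI),
# copied from the listing above. Variable names are illustrative only.
ROW_START = 273
seq3_row = "Ala Ser Asn Ser Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Arg Leu".split()
seq4_row = "Ala Ser Asn Ile Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Arg Leu".split()

# Collect (position, residue in SEQ 3, residue in SEQ 4) wherever the rows differ.
differences = [
    (ROW_START + offset, a, b)
    for offset, (a, b) in enumerate(zip(seq3_row, seq4_row))
    if a != b
]
print(differences)  # [(276, 'Ser', 'Ile')]
```

The single substitution at position 276 changes the residue pair Asn-Ser to Asn-Ile, consistent with the "NI" suffix in the name of SEQ ID NO: 4.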
<210> 5
<211> 794
<212> PRT
<213> Artificial
<220>
<223> Amino acid sequence of the TALE24 repeat units
<400> 5
Ser His Asp Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Arg Leu Leu
1 5 10 15
Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro Ala Gln Val Val Ala
20 25 30
Ile Ala Ser His Asp Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Arg
35 40 45
Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro Ala Gln Val
50 55 60
Val Ala Ile Ala Ser His Asp Gly Gly Lys Gln Ala Leu Glu Thr Val
65 70 75 80
Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro Ala
85 90 95
Gln Val Val Ala Ile Ala Ser Asn Gly Gly Gly Lys Gln Ala Leu Glu
100 105 110
Thr Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr
115 120 125
Pro Ala Gln Val Val Ala Ile Ala Ser Asn Gly Gly Gly Lys Gln Ala
130 135 140
Leu Glu Thr Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly
145 150 155 160
Leu Thr Pro Asp Gln Val Val Ala Ile Ala Ser Asn Gly Gly Gly Lys
165 170 175
Gln Ala Leu Glu Thr Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala
180 185 190
His Gly Leu Thr Pro Asp Gln Val Val Ala Ile Ala Ser Asn Ile Gly
195 200 205
Gly Lys Gln Ala Leu Glu Thr Val Gln Arg Leu Leu Pro Val Leu Cys
210 215 220
Gln Ala His Gly Leu Thr Pro Asp Gln Val Val Ala Ile Ala Ser Asn
225 230 235 240
Gly Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Arg Leu Leu Pro Val
245 250 255
Leu Cys Gln Ala His Gly Leu Thr Pro Ala Gln Val Val Ala Ile Ala
260 265 270
Ser His Asp Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Arg Leu Leu
275 280 285
Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro Ala Gln Val Val Ala
290 295 300
Ile Ala Ser Asn Gly Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Arg
305 310 315 320
Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro Ala Gln Val
325 330 335
Val Ala Ile Ala Ser His Asp Gly Gly Lys Gln Ala Leu Glu Thr Val
340 345 350
Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro Ala
355 360 365
Gln Val Val Ala Ile Ala Ser Asn Gly Gly Gly Lys Gln Ala Leu Glu
370 375 380
Thr Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr
385 390 395 400
Pro Ala Gln Val Val Ala Ile Ala Ser His Asp Gly Gly Lys Gln Ala
405 410 415
Leu Glu Thr Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly
420 425 430
Leu Thr Pro Ala Gln Val Val Ala Ile Ala Ser Asn Gly Gly Gly Lys
435 440 445
Gln Ala Leu Glu Thr Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala
450 455 460
His Gly Leu Thr Pro Ala Gln Val Val Ala Ile Ala Ser His Asp Gly
465 470 475 480
Gly Lys Gln Ala Leu Glu Thr Val Gln Arg Leu Leu Pro Val Leu Cys
485 490 495
Gln Ala His Gly Leu Thr Pro Ala Gln Val Val Ala Ile Ala Ser His
500 505 510
Asp Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Arg Leu Leu Pro Val
515 520 525
Leu Cys Gln Ala His Gly Leu Thr Pro Ala Gln Val Val Ala Ile Ala
530 535 540
Ser Asn Ile Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Arg Leu Leu
545 550 555 560
Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro Asp Gln Val Val Ala
565 570 575
Ile Ala Ser Asn Asn Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Arg
580 585 590
Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro Asp Gln Val
595 600 605
Val Ala Ile Ala Ser His Asp Gly Gly Lys Gln Ala Leu Glu Thr Val
610 615 620
Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro Ala
625 630 635 640
Gln Val Val Ala Ile Ala Ser Asn Gly Gly Gly Lys Gln Ala Leu Glu
645 650 655
Thr Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr
660 665 670
Pro Ala Gln Val Val Ala Ile Ala Ser His Asp Gly Gly Lys Gln Ala
675 680 685
Leu Glu Thr Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly
690 695 700
Leu Thr Pro Ala Gln Val Val Ala Ile Ala Ser Asn Asn Gly Gly Lys
705 710 715 720
Gln Ala Leu Glu Thr Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala
725 730 735
His Gly Leu Thr Pro Asp Gln Val Val Ala Ile Ala Ser Asn Ile Gly
740 745 750
Gly Lys Gln Ala Leu Glu Thr Val Gln Arg Leu Leu Pro Val Leu Cys
755 760 765
Gln Ala His Gly Leu Thr Pro Asp Gln Val Val Ala Ile Ala Ser Asn
770 775 780
Asn Gly Gly Arg Arg Cys Tyr Lys Ala Leu
785 790
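The repeat structure of SEQ ID NO: 5 can be inspected programmatically. The Python sketch below converts the first 112 residues (copied from the listing above) to one-letter code and extracts the variable residue pair of each repeat. The term "RVD" (repeat-variable diresidue) is standard TALE terminology rather than wording from this listing, and the regular expression is an assumed pattern inferred from the repeats shown here, not a general TALE parser.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Residues 1-112 of SEQ ID NO: 5, copied from the listing above.
FRAGMENT = """
Ser His Asp Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Arg Leu Leu
Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro Ala Gln Val Val Ala
Ile Ala Ser His Asp Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Arg
Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro Ala Gln Val
Val Ala Ile Ala Ser His Asp Gly Gly Lys Gln Ala Leu Glu Thr Val
Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro Ala
Gln Val Val Ala Ile Ala Ser Asn Gly Gly Gly Lys Gln Ala Leu Glu
"""


def to_one_letter(three_letter_text):
    """Convert whitespace-separated three-letter residues to one-letter code."""
    return "".join(THREE_TO_ONE[residue] for residue in three_letter_text.split())


def find_rvds(protein):
    """Return the two residues that follow Ser and precede Gly-Gly-(Lys|Arg).

    This pattern is an assumption based on the repeats in this listing.
    """
    return re.findall(r"S([A-Z]{2})GG[KR]", protein)


protein = to_one_letter(FRAGMENT)
rvds = find_rvds(protein)
print(len(protein), rvds)  # 112 ['HD', 'HD', 'HD', 'NG']
```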
<210> 6
<211> 760
<212> PRT
<213> Artificial
<220>
<223> Amino acid sequence of the TALEHIV repeat units
<400> 6
Ser His Asp Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Arg Leu Leu
1 5 10 15
Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro Ala Gln Val Val Ala
20 25 30
Ile Ala Ser His Asp Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Arg
35 40 45
Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro Ala Gln Val
50 55 60
Val Ala Ile Ala Ser His Asp Gly Gly Lys Gln Ala Leu Glu Thr Val
65 70 75 80
Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro Ala
85 90 95
Gln Val Val Ala Ile Ala Ser Asn Gly Gly Gly Lys Gln Ala Leu Glu
100 105 110
Thr Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr
115 120 125
Pro Ala Gln Val Val Ala Ile Ala Ser Asn Ile Gly Gly Lys Gln Ala
130 135 140
Leu Glu Thr Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly
145 150 155 160
Leu Thr Pro Asp Gln Val Val Ala Ile Ala Ser Asn Asn Gly Gly Lys
165 170 175
Gln Ala Leu Glu Thr Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala
180 185 190
His Gly Leu Thr Pro Asp Gln Val Val Ala Ile Ala Ser Asn Gly Gly
195 200 205
Gly Lys Gln Ala Leu Glu Thr Val Gln Arg Leu Leu Pro Val Leu Cys
210 215 220
Gln Ala His Gly Leu Thr Pro Ala Gln Val Val Ala Ile Ala Ser Asn
225 230 235 240
Gly Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Arg Leu Leu Pro Val
245 250 255
Leu Cys Gln Ala His Gly Leu Thr Pro Ala Gln Val Val Ala Ile Ala
260 265 270
Ser Asn Ile Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Arg Leu Leu
275 280 285
Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro Asp Gln Val Val Ala
290 295 300
Ile Ala Ser Asn Asn Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Arg
305 310 315 320
Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro Asp Gln Val
325 330 335
Val Ala Ile Ala Ser His Asp Gly Gly Lys Gln Ala Leu Glu Thr Val
340 345 350
Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro Ala
355 360 365
Gln Val Val Ala Ile Ala Ser His Asp Gly Gly Lys Gln Ala Leu Glu
370 375 380
Thr Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr
385 390 395 400
Pro Ala Gln Val Val Ala Ile Ala Ser Asn Ile Gly Gly Lys Gln Ala
405 410 415
Leu Glu Thr Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly
420 425 430
Leu Thr Pro Asp Gln Val Val Ala Ile Ala Ser Asn Asn Gly Gly Lys
435 440 445
Gln Ala Leu Glu Thr Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala
450 455 460
His Gly Leu Thr Pro Asp Gln Val Val Ala Ile Ala Ser Asn Ile Gly
465 470 475 480
Gly Lys Gln Ala Leu Glu Thr Val Gln Arg Leu Leu Pro Val Leu Cys
485 490 495
Gln Ala His Gly Leu Thr Pro Asp Gln Val Val Ala Ile Ala Ser Asn
500 505 510
Asn Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Arg Leu Leu Pro Val
515 520 525
Leu Cys Gln Ala His Gly Leu Thr Pro Asp Gln Val Val Ala Ile Ala
530 535 540
Ser Asn Ile Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Arg Leu Leu
545 550 555 560
Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro Asp Gln Val Val Ala
565 570 575
Ile Ala Ser Asn Asn Gly Gly Lys Gln Ala Leu Glu Thr Val Gln Arg
580 585 590
Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro Asp Gln Val
595 600 605
Val Ala Ile Ala Ser His Asp Gly Gly Lys Gln Ala Leu Glu Thr Val
610 615 620
Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr Pro Ala
625 630 635 640
Gln Val Val Ala Ile Ala Ser Asn Gly Gly Gly Lys Gln Ala Leu Glu
645 650 655
Thr Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly Leu Thr
660 665 670
Pro Ala Gln Val Val Ala Ile Ala Ser His Asp Gly Gly Lys Gln Ala
675 680 685
Leu Glu Thr Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala His Gly
690 695 700
Leu Thr Pro Ala Gln Val Val Ala Ile Ala Ser His Asp Gly Gly Lys
705 710 715 720
Gln Ala Leu Glu Thr Val Gln Arg Leu Leu Pro Val Leu Cys Gln Ala
725 730 735
His Gly Leu Thr Pro Ala Gln Val Val Ala Ile Ala Ser His Asp Gly
740 745 750
Gly Arg Arg Cys Tyr Lys Ala Leu
755 760
<210> 7
<211> 2397
<212> DNA
<213> Artificial
<220>
<223> DNA sequence of the TALE24 repeat units
<400> 7
attctagaag acactagtca tgacggtggc aaacaggctc ttgagaccgt ccaacgcctt 60
ctaccagttc tctgtcaagc ccacggacta accccagcgc aagttgtagc gattgctagt 120
catgacggtg gcaaacaggc ccttgagaca gtccaacgcc ttctaccagt tctctgccaa 180
gcacacggac taaccccagc gcaagttgta gcgattgcta gtcatgacgg tggcaaacag 240
gctcttgaaa ccgtgcaacg actgctccca gttctctgtc aagcccacgg cctcaccccg 300
gcgcaagttg tagcgattgc tagtaatggg ggtggcaaac aggctcttga aaccgtgcaa 360
cgactgctcc cagttctctg tcaagcccac ggcctcaccc cggcgcaagt tgtagcgatt 420
gctagtaatg ggggtggcaa acaggcactt gagactgttc agcgactact accagttctc 480
tgccaagccc acggacttac cccagatcaa gttgtagcga ttgctagtaa tgggggtggc 540
aaacaggcac ttgagactgt tcagcgacta ctaccagttc tctgccaagc ccacggactt 600
accccagatc aagttgtagc gattgctagt aatattggtg gcaaacaggc acttgagacg 660
gttcagcgcc tccttccagt tctttgtcaa gctcacggac tcaccccaga tcaagttgta 720
gcgattgcta gtaatggggg tggcaaacag gctcttgaaa ccgtgcaacg actgctccca 780
gttctctgtc aagcccacgg cctcaccccg gcgcaagttg tagcgattgc tagtcatgac 840
ggtggcaaac aggctcttga aaccgtgcaa cgactgctcc cagttctctg tcaagcccac 900
ggcctcaccc cggcgcaagt tgtagcgatt gctagtaatg ggggtggcaa acaggctctt 960
gaaaccgtgc aacgactgct cccagttctc tgtcaagccc acggcctcac cccggcgcaa 1020
gttgtagcga ttgctagtca tgacggtggc aaacaggctc ttgagaccgt ccaacgcctt 1080
ctaccagttc tctgtcaagc ccacggacta accccagcgc aagttgtagc gattgctagt 1140
aatgggggtg gcaaacaggc tcttgaaacc gtgcaacgac tgctcccagt tctctgtcaa 1200
gcccacggcc tcaccccggc gcaagttgta gcgattgcta gtcatgacgg tggcaaacag 1260
gctcttgaga ccgtccaacg ccttctacca gttctctgtc aagcccacgg actaacccca 1320
gcgcaagttg tagcgattgc tagtaatggg ggtggcaaac aggctcttga aaccgtgcaa 1380
cgactgctcc cagttctctg tcaagcccac ggcctcaccc cggcgcaagt tgtagcgatt 1440
gctagtcatg acggtggcaa acaggctctt gaaaccgtgc aacgactgct cccagttctc 1500
tgtcaagccc acggcctcac cccggcgcaa gttgtagcga ttgctagtca tgacggtggc 1560
aaacaggctc ttgagaccgt ccaacgcctt ctaccagttc tctgtcaagc ccacggacta 1620
accccagcgc aagttgtagc gattgctagt aatattggtg gcaaacaggc acttgagacg 1680
gttcagcgcc tccttccagt tctttgtcaa gctcacggac tcaccccaga tcaagttgta 1740
gcgattgcta gtaacaatgg tggcaaacag gctctcgaaa ccgtacaacg actcctccca 1800
gttctctgtc aagcccacgg actaactcct gatcaagttg tagcgattgc tagtcatgac 1860
ggtggcaaac aggctcttga gaccgtccaa cgccttctac cagttctctg tcaagcccac 1920
ggactaaccc cagcgcaagt tgtagcgatt gctagtaatg ggggtggcaa acaggctctt 1980
gaaaccgtgc aacgactgct cccagttctc tgtcaagccc acggcctcac cccggcgcaa 2040
gttgtagcga ttgctagtca tgacggtggc aaacaggctc ttgaaaccgt gcaacgactg 2100
ctcccagttc tctgtcaagc ccacggcctc accccggcgc aagttgtagc gattgctagt 2160
aacaatggtg gcaaacaggc tctcgaaacc gtacaacgac tcctcccagt tctctgtcaa 2220
gcccacggac taactcctga tcaagttgta gcgattgcta gtaatattgg tggcaaacag 2280
gcacttgaga cggttcagcg cctccttcca gttctttgtc aagctcacgg actcacccca 2340
gatcaagttg tagcgattgc tagcaacaat ggcggtcgac gctgctataa agcttta 2397
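The relationship between SEQ ID NO: 7 (DNA, 2397 nt) and SEQ ID NO: 5 (protein, 794 residues) can be checked with a short translation. The sketch below uses the standard genetic code on the first 60 nucleotides of SEQ ID NO: 7, copied from the listing above. Since 2397 / 3 = 799 codons, five codons lie outside SEQ ID NO: 5; the check shows they are the first five. Reading those leading codons as cloning-derived flanking sequence is an assumption, not something the listing states.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a lowercase DNA string in frame 0; ignores a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Nucleotides 1-60 of SEQ ID NO: 7, copied from the listing above.
seq7_head = "attctagaagacactagtcatgacggtggcaaacaggctcttgagaccgtccaacgcctt"

protein = translate(seq7_head)
print(protein)      # ILEDTSHDGGKQALETVQRL
print(protein[5:])  # SHDGGKQALETVQRL, the first 15 residues of SEQ ID NO: 5
```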
<210> 8
<211> 2295
<212> DNA
<213> Artificial
<220>
<223> DNA sequence of the TALEHIV repeat units
<400> 8
attctagaag acactagtca tgacggtggc aaacaggctc ttgagaccgt ccaacgcctt 60
ctaccagttc tctgtcaagc ccacggacta accccagcgc aagttgtagc gattgctagt 120
catgacggtg gcaaacaggc tcttgagacc gtccaacgcc ttctaccagt tctctgtcaa 180
gcccacggac taaccccagc gcaagttgta gcgattgcta gtcatgacgg tggcaaacag 240
gctcttgaaa ccgtgcaacg actgctccca gttctctgtc aagcccacgg cctcaccccg 300
gcgcaagttg tagcgattgc tagtaatggg ggtggcaaac aggctcttga aaccgtgcaa 360
cgactgctcc cagttctctg tcaagcccac ggcctcaccc cggcgcaagt tgtagcgatt 420
gctagtaata ttggtggcaa acaggcactt gagacggttc agcgcctcct tccagttctt 480
tgtcaagctc acggactcac cccagatcaa gttgtagcga ttgctagtaa caatggtggc 540
aaacaggctc tcgaaaccgt acaacgactc ctcccagttc tctgtcaagc ccacggacta 600
actcctgatc aagttgtagc gattgctagt aatgggggtg gcaaacaggc tcttgaaacc 660
gtgcaacgac tgctcccagt tctctgtcaa gcccacggcc tcaccccggc gcaagttgta 720
gcgattgcta gtaatggggg tggcaaacag gctcttgaaa ccgtgcaacg actgctccca 780
gttctctgtc aagcccacgg cctcaccccg gcgcaagttg tagcgattgc tagtaatatt 840
ggtggcaaac aggcacttga gacggttcag cgcctccttc cagttctttg tcaagctcac 900
ggactcaccc cagatcaagt tgtagcgatt gctagtaaca atggtggcaa acaggctctc 960
gaaaccgtac aacgactcct cccagttctc tgtcaagccc acggactaac tcctgatcaa 1020
gttgtagcga ttgctagtca tgacggtggc aaacaggctc ttgagaccgt ccaacgcctt 1080
ctaccagttc tctgtcaagc ccacggacta accccagcgc aagttgtagc gattgctagt 1140
catgacggtg gcaaacaggc tcttgaaacc gtgcaacgac tgctcccagt tctctgtcaa 1200
gcccacggcc tcaccccggc gcaagttgta gcgattgcta gtaatattgg tggcaaacag 1260
gcacttgaga cggttcagcg cctccttcca gttctttgtc aagctcacgg actcacccca 1320
gatcaagttg tagcgattgc tagtaacaat ggtggcaaac aggctctcga aaccgtacaa 1380
cgactcctcc cagttctctg tcaagcccac ggactaactc ctgatcaagt tgtagcgatt 1440
gctagtaata ttggtggcaa acaggcactt gagacggttc agcgcctcct tccagttctt 1500
tgtcaagctc acggactcac cccagatcaa gttgtagcga ttgctagtaa caatggtggc 1560
aaacaggctc tcgaaaccgt acaacgactc ctcccagttc tctgtcaagc ccacggacta 1620
actcctgatc aagttgtagc gattgctagt aatattggtg gcaaacaggc acttgagacg 1680
gttcagcgcc tccttccagt tctttgtcaa gctcacggac tcaccccaga tcaagttgta 1740
gcgattgcta gtaacaatgg tggcaaacag gctctcgaaa ccgtacaacg actcctccca 1800
gttctctgtc aagcccacgg actaactcct gatcaagttg tagcgattgc tagtcatgac 1860
ggtggcaaac aggctcttga gaccgtccaa cgccttctac cagttctctg tcaagcccac 1920
ggactaaccc cagcgcaagt tgtagcgatt gctagtaatg ggggtggcaa acaggctctt 1980
gaaaccgtgc aacgactgct cccagttctc tgtcaagccc acggcctcac cccggcgcaa 2040
gttgtagcga ttgctagtca tgacggtggc aaacaggccc ttgagacagt ccaacgcctt 2100
ctaccagttc tctgccaagc acacggacta accccagcgc aagttgtagc gattgctagt 2160
catgacggtg gcaaacaggc ccttgagaca gtccaacgcc ttctaccagt tctctgccaa 2220
gcacacggac taaccccagc gcaagttgta gcgattgcta gccatgacgg cggtcgacgc 2280
tgctataaag cttta 2295
<210> 9
<211> 17
<212> DNA
<213> Artificial
<220>
<223> Synthetic DNA strand, 5' to 3'
<400> 9
tgtcccttta tctctct 17
<210> 10
<211> 17
<212> DNA
<213> Artificial
<220>
<223> Synthetic DNA strand, 3' to 5'
<400> 10
acagggaaat agagaga 17
<210> 11
<211> 17
<212> RNA
<213> Artificial
<220>
<223> Synthetic RNA strand, 3' to 5'
<400> 11
acagggaaau agagaga 17
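SEQ ID NOs: 9 to 11 form the 17-nt forward DNA strand and its DNA and RNA reverse strands. Because SEQ ID NO: 9 is written 5' to 3' and SEQ ID NOs: 10 and 11 are written 3' to 5', the pairing can be checked base by base without reversing either string. The Python sketch below does this; the function names are illustrative.

```python
# Base-by-base complement for lowercase DNA.
DNA_COMPLEMENT = str.maketrans("acgt", "tgca")


def complement(dna):
    """Return the complementary DNA strand, read in the opposite orientation."""
    return dna.translate(DNA_COMPLEMENT)


def dna_to_rna(dna):
    """Replace thymine with uracil."""
    return dna.replace("t", "u")


seq9 = "tgtccctttatctctct"   # SEQ ID NO: 9, DNA, 5' to 3'
seq10 = "acagggaaatagagaga"  # SEQ ID NO: 10, DNA, 3' to 5'
seq11 = "acagggaaauagagaga"  # SEQ ID NO: 11, RNA, 3' to 5'

print(complement(seq9) == seq10)              # True
print(dna_to_rna(complement(seq9)) == seq11)  # True
```

SEQ ID NO: 9 paired with SEQ ID NO: 11 therefore gives a duplex with a DNA forward strand and an RNA reverse strand, the arrangement recited in the claims.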
<210> 12
<211> 49
<212> DNA
<213> Artificial
<220>
<223> Synthetic DNA strand, 5' to 3'
<400> 12
ccacatatgt catacgtgtc cctttatctc tctccagctc gaggaattc 49
<210> 13
<211> 48
<212> DNA
<213> Artificial
<220>
<223> Synthetic DNA strand, 5' to 3'
<400> 13
gaattcctga gctggagaga gataaaggga cacgtatgac atatgtgg 48
<210> 14
<211> 49
<212> RNA
<213> Artificial
<220>
<223> Synthetic RNA strand, 5' to 3'
<400> 14
gaauuccucg agcuggagag agauaaaggg acacguauga cauaugugg 49
<210> 15
<211> 31
<212> DNA
<213> Artificial
<220>
<223> Synthetic DNA strand, 5' to 3'
<400> 15
ccacatatgt catacgtgtc cctttatctc t 31
<210> 16
<211> 49
<212> RNA
<213> Artificial
<220>
<223> Synthetic RNA strand, 5' to 3'
<400> 16
gaauuccucg agcuggagag agauaaaggg acacguauga cauaugugg 49
<210> 17
<211> 43
<212> DNA
<213> Artificial
<220>
<223> Synthetic DNA strand, 5' to 3'
<400> 17
ccacatatgt catacgtgtc cctttatctc tctccagctc gag 43
<210> 18
<211> 49
<212> RNA
<213> Artificial
<220>
<223> Synthetic RNA strand, 5' to 3'
<400> 18
gaauuccucg agcuggagag agauaaaggg acacguauga cauaugugg 49
<210> 19
<211> 26
<212> DNA
<213> Artificial
<220>
<223> Synthetic DNA strand
<400> 19
gtgggttccc tagccagaga gctccc 26
<210> 20
<211> 36
<212> RNA
<213> Artificial
<220>
<223> Synthetic RNA strand
<400> 20
agaucugagc cugggagcuc ucuggcuaac uaggga 36
Claims (6)
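1. A method, not for diagnostic or therapeutic purposes, of specifically binding a DNA-RNA hybrid strand, comprising using a TALE protein to specifically recognize and bind a particular DNA-RNA hybrid duplex whose forward strand is DNA and whose reverse strand is RNA.
2. A method, not for diagnostic or therapeutic purposes, of inhibiting the generation of DNA from an RNA template, comprising using a TALE protein to specifically recognize and bind a DNA-RNA hybrid duplex whose forward strand is DNA and whose reverse strand is RNA.
3. A method, not for diagnostic or therapeutic purposes, of inhibiting the generation of DNA using RNA as a primer and DNA as a template, comprising using a TALE protein to specifically recognize and bind a DNA-RNA hybrid duplex whose forward strand is DNA and whose reverse strand is RNA.
4. A method, not for diagnostic or therapeutic purposes, of protecting the RNA molecule in a DNA-RNA hybrid strand from degradation by the ribonuclease RNase H, comprising using a TALE protein to specifically recognize and bind a DNA-RNA hybrid duplex whose forward strand is DNA and whose reverse strand is RNA.
5. The method of any one of claims 1-4, wherein the DNA further comprises a modified DNA derivative, wherein the modification is a methylated base or a hydroxymethylated base.
6. The method of any one of claims 1-4, wherein the RNA further comprises a modified RNA derivative, wherein the modification is a methylated base or a hydroxymethylated base.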
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201280060126.7A CN104093855B (zh) | 2012-01-04 | 2012-12-21 | Method for specifically binding and targeting DNA-RNA hybrid duplexes |
Applications Claiming Priority (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2012100210049 | 2012-01-04 | ||
CN201210021004.9 | 2012-01-04 | ||
CN201210021004 | 2012-01-04 | ||
CN201280060126.7A CN104093855B (zh) | 2012-01-04 | 2012-12-21 | Method for specifically binding and targeting DNA-RNA hybrid duplexes |
PCT/CN2012/001717 WO2013102289A1 (zh) | 2012-01-04 | 2012-12-21 | Method for specifically binding and targeting DNA-RNA hybrid duplexes |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104093855A CN104093855A (zh) | 2014-10-08 |
CN104093855B CN104093855B (zh) | 2018-04-13 |
Family
ID=48744960
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201280060126.7A Active CN104093855B (zh) | 2012-01-04 | 2012-12-21 | Method for specifically binding and targeting DNA-RNA hybrid duplexes |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN104093855B (zh) |
WO (1) | WO2013102289A1 (zh) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3055423B1 (en) * | 2013-10-11 | 2019-12-25 | Cellectis | Method for detecting nucleic acid sequences of interest using talen protein |
CN105802992B (zh) * | 2016-03-29 | 2019-08-20 | 中国科学院植物研究所 | Method for inhibiting plant gene transcription |
CN108314736B (zh) * | 2017-01-18 | 2021-08-31 | 李燕强 | Method for promoting RNA degradation |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9567573B2 (en) * | 2010-04-26 | 2017-02-14 | Sangamo Biosciences, Inc. | Genome editing of a Rosa locus using nucleases |
CN103025344B (zh) * | 2010-05-17 | 2016-06-29 | 桑格摩生物科学股份有限公司 | Novel DNA-binding proteins and uses thereof |
2012
- 2012-12-21 WO PCT/CN2012/001717 patent/WO2013102289A1/zh active Application Filing
- 2012-12-21 CN CN201280060126.7A patent/CN104093855B/zh active Active
Also Published As
Publication number | Publication date |
---|---|
CN104093855A (zh) | 2014-10-08 |
WO2013102289A1 (zh) | 2013-07-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
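KR20190059966A (ko) | S. pyogenes Cas9 mutant genes and polypeptides encoded by same | |
KR20190082318A (ko) | CRISPR/Cpf1 systems and methods | |
Zhou et al. | pheS*, an effective host-genotype-independent counter-selectable marker for marker-free chromosome deletion in Bacillus amyloliquefaciens | |
CN103987860B (zh) | Method for specifically recognizing DNA containing 5-methylcytosine | |
CN109021086B (zh) | Antimicrobial peptide cecropin A mutant, and encoding gene, preparation method and use thereof | |
WO2020032711A1 (ko) | Novel CRISPR-associated protein and use thereof | |
CN102421892A (zh) | Diguanylate cyclase, production method thereof, and use thereof in preparing cyclic di-GMP and analogues thereof | |
CN104093855B (zh) | Method for specifically binding and targeting DNA-RNA hybrid duplexes | |
CN110777155B (zh) | Minimycin biosynthetic gene cluster, recombinant strain and use thereof | |
CN106834252B (zh) | Highly stable MazF mutant and use thereof | |
CN101696414A (zh) | Gene for improving radiation resistance of organisms and use thereof | |
CN110498847B (zh) | Preparation and crystallization method of rice receptor protein RGA5A_S | |
Nogawa et al. | Genetic structure and polymorphisms of the N16 gene in Pinctada fucata | |
CN102140444B (zh) | Low-temperature alkaline phosphatase and preparation method thereof | |
KR102152142B1 (ko) | Method for producing cyclic oligoadenylate using Cas10/Csm4 | |
CN109234300B (zh) | Use of gene spkD for regulating the growth rate of Synechocystis | |
CN106906223A (zh) | Soybean salt-tolerance gene GmMYB173, expression vector thereof, and use of the expression vector | |
Ri et al. | Functional expression of an antimicrobial peptide, belonging to halocin C8 family, from Natrinema sp. RNS21 in Escherichia coli | |
CN108864273B (zh) | Mimetic human-derived antimicrobial peptide and preparation method thereof | |
Luna-Chávez et al. | Molecular basis of inhibition of the ribonuclease activity in colicin E5 by its cognate immunity protein | |
CN103193871B (zh) | Method for designing novel TALEs based on the crystal structure of a protein-DNA complex | |
CN111647577B (zh) | Three-dimensional structure of RNA helicase and use thereof | |
CN108588040B (zh) | Recombinant MtMetRS, crystals thereof, and their use in preparing anti-tuberculosis drugs | |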
Kavitha et al. | Cloning and molecular characterisation of resuscitation promoting factor-like gene from Mycobacterium avium subspecies avium | |
Cho et al. | Structural insight of the role of the Hahella chejuensis HapK protein in prodigiosin biosynthesis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||