WO2024063273A1 - Novel adenine deaminase variants and base editing method using same - Google Patents
- Publication number
- WO2024063273A1 (PCT/KR2023/009361)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- dna
- fusion protein
- target
- editing
- amino acid
- Prior art date
Links
- 108010052875 Adenine deaminase Proteins 0.000 title claims abstract description 72
- 238000000034 method Methods 0.000 title claims abstract description 55
- 108020001507 fusion proteins Proteins 0.000 claims abstract description 76
- 102000037865 fusion proteins Human genes 0.000 claims abstract description 76
- 230000009437 off-target effect Effects 0.000 claims abstract description 38
- 239000000203 mixture Substances 0.000 claims abstract description 35
- 125000003729 nucleotide group Chemical group 0.000 claims abstract description 31
- 230000002829 reductive effect Effects 0.000 claims abstract description 26
- 230000004075 alteration Effects 0.000 claims abstract description 15
- 108091032973 (ribonucleotides)n+m Proteins 0.000 claims description 118
- 108020004414 DNA Proteins 0.000 claims description 108
- 102000053602 DNA Human genes 0.000 claims description 98
- 210000004027 cell Anatomy 0.000 claims description 72
- 108090000623 proteins and genes Proteins 0.000 claims description 66
- 230000000694 effects Effects 0.000 claims description 61
- 108010080611 Cytosine Deaminase Proteins 0.000 claims description 59
- 102000000311 Cytosine Deaminase Human genes 0.000 claims description 59
- 125000003275 alpha amino acid group Chemical group 0.000 claims description 51
- 102000004169 proteins and genes Human genes 0.000 claims description 42
- 102000052510 DNA-Binding Proteins Human genes 0.000 claims description 33
- 101710096438 DNA-binding protein Proteins 0.000 claims description 31
- 230000035772 mutation Effects 0.000 claims description 30
- 150000001413 amino acids Chemical class 0.000 claims description 28
- 108091033409 CRISPR Proteins 0.000 claims description 27
- 230000002438 mitochondrial effect Effects 0.000 claims description 27
- 102000040430 polynucleotide Human genes 0.000 claims description 21
- 108091033319 polynucleotide Proteins 0.000 claims description 21
- 239000002157 polynucleotide Substances 0.000 claims description 21
- 108020005196 Mitochondrial DNA Proteins 0.000 claims description 19
- 125000000539 amino acid group Chemical group 0.000 claims description 18
- 238000006467 substitution reaction Methods 0.000 claims description 18
- GFFGJBXGBJISGV-UHFFFAOYSA-N Adenine Chemical class NC1=NC=NC2=C1N=CN2 GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 claims description 17
- 230000001603 reducing effect Effects 0.000 claims description 15
- 101710185494 Zinc finger protein Proteins 0.000 claims description 14
- 102100023597 Zinc finger protein 816 Human genes 0.000 claims description 14
- 102000007260 Deoxyribonuclease I Human genes 0.000 claims description 12
- 108010008532 Deoxyribonuclease I Proteins 0.000 claims description 11
- ISAKRJDGNUQOIC-UHFFFAOYSA-N Uracil Chemical compound O=C1C=CNC(=O)N1 ISAKRJDGNUQOIC-UHFFFAOYSA-N 0.000 claims description 10
- 239000013604 expression vector Substances 0.000 claims description 10
- 241000282414 Homo sapiens Species 0.000 claims description 9
- 102220573234 Ras-related protein Ral-A_A48W_mutation Human genes 0.000 claims description 9
- 108010042407 Endonucleases Proteins 0.000 claims description 8
- 208000026350 Inborn Genetic disease Diseases 0.000 claims description 8
- 101710163270 Nuclease Proteins 0.000 claims description 8
- 108010031100 chloroplast transit peptides Proteins 0.000 claims description 8
- 108091093105 Nuclear DNA Proteins 0.000 claims description 7
- 108010066154 Nuclear Export Signals Proteins 0.000 claims description 7
- 208000024556 Mendelian disease Diseases 0.000 claims description 6
- 108091092740 Organellar DNA Proteins 0.000 claims description 6
- 208000028782 Hereditary disease Diseases 0.000 claims description 5
- 108010077850 Nuclear Localization Signals Proteins 0.000 claims description 5
- 230000008685 targeting Effects 0.000 claims description 5
- 229940035893 uracil Drugs 0.000 claims description 5
- 230000009977 dual effect Effects 0.000 claims description 4
- 229940113491 Glycosylase inhibitor Drugs 0.000 claims description 3
- 238000001727 in vivo Methods 0.000 claims description 3
- 108010073062 Transcription Activator-Like Effectors Proteins 0.000 claims 7
- 101710095342 Apolipoprotein B Proteins 0.000 claims 1
- 102100040202 Apolipoprotein B-100 Human genes 0.000 claims 1
- 238000010354 CRISPR gene editing Methods 0.000 claims 1
- 102000004533 Endonucleases Human genes 0.000 claims 1
- 230000004913 activation Effects 0.000 claims 1
- 235000018102 proteins Nutrition 0.000 description 42
- 235000001014 amino acid Nutrition 0.000 description 32
- 239000013612 plasmid Substances 0.000 description 29
- 229940024606 amino acid Drugs 0.000 description 27
- 230000000981 bystander Effects 0.000 description 23
- 239000002773 nucleotide Substances 0.000 description 22
- 238000001890 transfection Methods 0.000 description 20
- 150000007523 nucleic acids Chemical class 0.000 description 16
- 238000006243 chemical reaction Methods 0.000 description 15
- 238000012163 sequencing technique Methods 0.000 description 15
- 239000013598 vector Substances 0.000 description 15
- 229960000643 adenine Drugs 0.000 description 14
- 102000039446 nucleic acids Human genes 0.000 description 12
- 108020004707 nucleic acids Proteins 0.000 description 12
- 229930024421 Adenine Natural products 0.000 description 11
- 108020005004 Guide RNA Proteins 0.000 description 10
- OIRDTQYFTABQOQ-KQYNXXCUSA-N adenosine Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O OIRDTQYFTABQOQ-KQYNXXCUSA-N 0.000 description 10
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical compound NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 10
- 108091028043 Nucleic acid sequence Proteins 0.000 description 9
- 238000002347 injection Methods 0.000 description 9
- 239000007924 injection Substances 0.000 description 9
- 238000007481 next generation sequencing Methods 0.000 description 9
- 102100031780 Endonuclease Human genes 0.000 description 8
- 230000027455 binding Effects 0.000 description 8
- 238000012350 deep sequencing Methods 0.000 description 8
- 108090000765 processed proteins & peptides Proteins 0.000 description 8
- 108700028369 Alleles Proteins 0.000 description 7
- 238000004458 analytical method Methods 0.000 description 7
- 239000000758 substrate Substances 0.000 description 7
- 108091028113 Trans-activating crRNA Proteins 0.000 description 6
- 229940104302 cytosine Drugs 0.000 description 6
- 239000012634 fragment Substances 0.000 description 6
- 238000012268 genome sequencing Methods 0.000 description 6
- 230000004048 modification Effects 0.000 description 6
- 238000012986 modification Methods 0.000 description 6
- 102000055025 Adenosine deaminases Human genes 0.000 description 5
- 108091093088 Amplicon Proteins 0.000 description 5
- 108091079001 CRISPR RNA Proteins 0.000 description 5
- 101100275473 Caenorhabditis elegans ctc-3 gene Proteins 0.000 description 5
- 230000004568 DNA-binding Effects 0.000 description 5
- 239000006144 Dulbecco’s modified Eagle's medium Substances 0.000 description 5
- 241001465754 Metazoa Species 0.000 description 5
- 101100168274 Neurospora crassa (strain ATCC 24698 / 74-OR23-1A / CBS 708.71 / DSM 1257 / FGSC 987) cox-3 gene Proteins 0.000 description 5
- 238000003559 RNA-seq method Methods 0.000 description 5
- 241000193996 Streptococcus pyogenes Species 0.000 description 5
- 230000003197 catalytic effect Effects 0.000 description 5
- 108020004999 messenger RNA Proteins 0.000 description 5
- 239000013642 negative control Substances 0.000 description 5
- 229920001184 polypeptide Polymers 0.000 description 5
- 102000004196 processed proteins & peptides Human genes 0.000 description 5
- 239000013603 viral vector Substances 0.000 description 5
- 108091003079 Bovine Serum Albumin Proteins 0.000 description 4
- 101000884048 Burkholderia cenocepacia (strain H111) Double-stranded DNA deaminase toxin A Proteins 0.000 description 4
- 239000002126 C01EB10 - Adenosine Substances 0.000 description 4
- 241000282693 Cercopithecidae Species 0.000 description 4
- 241000196324 Embryophyta Species 0.000 description 4
- 101000964382 Homo sapiens DNA dC->dU-editing enzyme APOBEC-3D Proteins 0.000 description 4
- 101000606129 Homo sapiens Tyrosine-protein kinase receptor TYRO3 Proteins 0.000 description 4
- 241000829100 Macaca mulatta polyomavirus 1 Species 0.000 description 4
- 241000699670 Mus sp. Species 0.000 description 4
- 241000700159 Rattus Species 0.000 description 4
- 102100039127 Tyrosine-protein kinase receptor TYRO3 Human genes 0.000 description 4
- 229960005305 adenosine Drugs 0.000 description 4
- 238000006481 deamination reaction Methods 0.000 description 4
- 239000012091 fetal bovine serum Substances 0.000 description 4
- 230000000051 modifying effect Effects 0.000 description 4
- 239000000047 product Substances 0.000 description 4
- 230000000717 retained effect Effects 0.000 description 4
- 239000000243 solution Substances 0.000 description 4
- 125000006850 spacer group Chemical group 0.000 description 4
- 238000012049 whole transcriptome sequencing Methods 0.000 description 4
- 108010079649 APOBEC-1 Deaminase Proteins 0.000 description 3
- 108700040115 Adenosine deaminases Proteins 0.000 description 3
- 102220470957 Amiloride-sensitive sodium channel subunit delta_R21A_mutation Human genes 0.000 description 3
- DCXYFEDJOCDNAF-UHFFFAOYSA-N Asparagine Natural products OC(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-N 0.000 description 3
- 241000894006 Bacteria Species 0.000 description 3
- 241000283690 Bos taurus Species 0.000 description 3
- 238000010453 CRISPR/Cas method Methods 0.000 description 3
- 241000282472 Canis lupus familiaris Species 0.000 description 3
- 108020004638 Circular DNA Proteins 0.000 description 3
- 102100040264 DNA dC->dU-editing enzyme APOBEC-3D Human genes 0.000 description 3
- 238000010442 DNA editing Methods 0.000 description 3
- 241000589601 Francisella Species 0.000 description 3
- 241000282412 Homo Species 0.000 description 3
- 101000964385 Homo sapiens DNA dC->dU-editing enzyme APOBEC-3B Proteins 0.000 description 3
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 3
- 241000904817 Lachnospiraceae bacterium Species 0.000 description 3
- 241000251745 Petromyzon marinus Species 0.000 description 3
- 241000288906 Primates Species 0.000 description 3
- 108010076504 Protein Sorting Signals Proteins 0.000 description 3
- 102220513001 Serine/arginine repetitive matrix protein 1_K20A_mutation Human genes 0.000 description 3
- 235000009582 asparagine Nutrition 0.000 description 3
- 229960001230 asparagine Drugs 0.000 description 3
- 235000003704 aspartic acid Nutrition 0.000 description 3
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 description 3
- 210000004899 c-terminal region Anatomy 0.000 description 3
- 239000003795 chemical substances by application Substances 0.000 description 3
- 210000000349 chromosome Anatomy 0.000 description 3
- 231100000433 cytotoxic Toxicity 0.000 description 3
- 230000001472 cytotoxic effect Effects 0.000 description 3
- 230000009615 deamination Effects 0.000 description 3
- 239000013613 expression plasmid Substances 0.000 description 3
- 230000003301 hydrolyzing effect Effects 0.000 description 3
- 238000005259 measurement Methods 0.000 description 3
- 244000005700 microbiome Species 0.000 description 3
- 239000000178 monomer Substances 0.000 description 3
- 230000032965 negative regulation of cell volume Effects 0.000 description 3
- 229920000642 polymer Polymers 0.000 description 3
- 238000000746 purification Methods 0.000 description 3
- 238000003259 recombinant expression Methods 0.000 description 3
- 230000001225 therapeutic effect Effects 0.000 description 3
- 238000012070 whole genome sequencing analysis Methods 0.000 description 3
- 102000040650 (ribonucleotides)n+m Human genes 0.000 description 2
- QKNYBSVHEMOAJP-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol;hydron;chloride Chemical compound Cl.OCC(N)(CO)CO QKNYBSVHEMOAJP-UHFFFAOYSA-N 0.000 description 2
- 101710169336 5'-deoxyadenosine deaminase Proteins 0.000 description 2
- 102000012758 APOBEC-1 Deaminase Human genes 0.000 description 2
- 241000203069 Archaea Species 0.000 description 2
- 241000701822 Bovine papillomavirus Species 0.000 description 2
- 241000605902 Butyrivibrio Species 0.000 description 2
- 241000223282 Candidatus Peregrinibacteria Species 0.000 description 2
- 241000283707 Capra Species 0.000 description 2
- 241000195628 Chlorophyta Species 0.000 description 2
- 201000000915 Chronic Progressive External Ophthalmoplegia Diseases 0.000 description 2
- 241000195493 Cryptophyta Species 0.000 description 2
- 102220605874 Cytosolic arginine sensor for mTORC1 subunit 2_D10A_mutation Human genes 0.000 description 2
- 102100040262 DNA dC->dU-editing enzyme APOBEC-3B Human genes 0.000 description 2
- 108700020911 DNA-Binding Proteins Proteins 0.000 description 2
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 2
- 108010067770 Endopeptidase K Proteins 0.000 description 2
- 102000004190 Enzymes Human genes 0.000 description 2
- 108090000790 Enzymes Proteins 0.000 description 2
- 241000589599 Francisella tularensis subsp. novicida Species 0.000 description 2
- 241000233866 Fungi Species 0.000 description 2
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Natural products OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 description 2
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 2
- 244000068988 Glycine max Species 0.000 description 2
- 235000010469 Glycine max Nutrition 0.000 description 2
- 208000032087 Hereditary Leber Optic Atrophy Diseases 0.000 description 2
- 101000964322 Homo sapiens C->U-editing enzyme APOBEC-2 Proteins 0.000 description 2
- 101000964383 Homo sapiens DNA dC->dU-editing enzyme APOBEC-3C Proteins 0.000 description 2
- 101000964377 Homo sapiens DNA dC->dU-editing enzyme APOBEC-3F Proteins 0.000 description 2
- 101000800426 Homo sapiens Putative C->U-editing enzyme APOBEC-4 Proteins 0.000 description 2
- 201000000639 Leber hereditary optic neuropathy Diseases 0.000 description 2
- 208000035177 MELAS Diseases 0.000 description 2
- 238000000719 MTS assay Methods 0.000 description 2
- 231100000070 MTS assay Toxicity 0.000 description 2
- 108091034117 Oligonucleotide Proteins 0.000 description 2
- 240000007594 Oryza sativa Species 0.000 description 2
- 235000007164 Oryza sativa Nutrition 0.000 description 2
- 241001494479 Pecora Species 0.000 description 2
- 241000605861 Prevotella Species 0.000 description 2
- 238000010357 RNA editing Methods 0.000 description 2
- 230000026279 RNA modification Effects 0.000 description 2
- 230000004570 RNA-binding Effects 0.000 description 2
- 108091028664 Ribonucleotide Proteins 0.000 description 2
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 2
- 108020004682 Single-Stranded DNA Proteins 0.000 description 2
- 102100022433 Single-stranded DNA cytosine deaminase Human genes 0.000 description 2
- DBMJMQXJHONAFJ-UHFFFAOYSA-M Sodium laurylsulphate Chemical compound [Na+].CCCCCCCCCCCCOS([O-])(=O)=O DBMJMQXJHONAFJ-UHFFFAOYSA-M 0.000 description 2
- 241000194022 Streptococcus sp. Species 0.000 description 2
- 241000282887 Suidae Species 0.000 description 2
- IQFYYKKMVGJFEH-XLPZGREQSA-N Thymidine Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](CO)[C@@H](O)C1 IQFYYKKMVGJFEH-XLPZGREQSA-N 0.000 description 2
- 241000209140 Triticum Species 0.000 description 2
- 235000021307 Triticum Nutrition 0.000 description 2
- 102000006943 Uracil-DNA Glycosidase Human genes 0.000 description 2
- 108010072685 Uracil-DNA Glycosidase Proteins 0.000 description 2
- 240000008042 Zea mays Species 0.000 description 2
- 235000005824 Zea mays ssp. parviglumis Nutrition 0.000 description 2
- 235000002017 Zea mays subsp mays Nutrition 0.000 description 2
- 238000002835 absorbance Methods 0.000 description 2
- 239000003242 anti bacterial agent Substances 0.000 description 2
- 229940088710 antibiotic agent Drugs 0.000 description 2
- 239000002543 antimycotic Substances 0.000 description 2
- 125000000613 asparagine group Chemical group N[C@@H](CC(N)=O)C(=O)* 0.000 description 2
- 238000004422 calculation algorithm Methods 0.000 description 2
- 244000309466 calf Species 0.000 description 2
- 238000004113 cell culture Methods 0.000 description 2
- 239000008004 cell lysis buffer Substances 0.000 description 2
- 230000003833 cell viability Effects 0.000 description 2
- 238000003570 cell viability assay Methods 0.000 description 2
- 239000002299 complementary DNA Substances 0.000 description 2
- 239000013068 control sample Substances 0.000 description 2
- 235000005822 corn Nutrition 0.000 description 2
- 238000012217 deletion Methods 0.000 description 2
- 230000037430 deletion Effects 0.000 description 2
- 239000005547 deoxyribonucleotide Substances 0.000 description 2
- 125000002637 deoxyribonucleotide group Chemical group 0.000 description 2
- 239000012636 effector Substances 0.000 description 2
- 210000002308 embryonic cell Anatomy 0.000 description 2
- 238000005516 engineering process Methods 0.000 description 2
- 210000003527 eukaryotic cell Anatomy 0.000 description 2
- 208000016361 genetic disease Diseases 0.000 description 2
- 238000010362 genome editing Methods 0.000 description 2
- 235000013922 glutamic acid Nutrition 0.000 description 2
- 239000004220 glutamic acid Substances 0.000 description 2
- 239000001963 growth medium Substances 0.000 description 2
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 description 2
- 238000011534 incubation Methods 0.000 description 2
- 230000001939 inductive effect Effects 0.000 description 2
- 238000003780 insertion Methods 0.000 description 2
- 230000037431 insertion Effects 0.000 description 2
- 238000007912 intraperitoneal administration Methods 0.000 description 2
- 230000000670 limiting effect Effects 0.000 description 2
- 238000001638 lipofection Methods 0.000 description 2
- 239000000463 material Substances 0.000 description 2
- 230000001404 mediated effect Effects 0.000 description 2
- 230000011278 mitosis Effects 0.000 description 2
- 238000007911 parenteral administration Methods 0.000 description 2
- 230000001717 pathogenic effect Effects 0.000 description 2
- 230000008569 process Effects 0.000 description 2
- 108020001580 protein domains Proteins 0.000 description 2
- 230000009467 reduction Effects 0.000 description 2
- 230000001105 regulatory effect Effects 0.000 description 2
- 108091008146 restriction endonucleases Proteins 0.000 description 2
- 230000002441 reversible effect Effects 0.000 description 2
- 239000002336 ribonucleotide Substances 0.000 description 2
- 125000002652 ribonucleotide group Chemical group 0.000 description 2
- 235000009566 rice Nutrition 0.000 description 2
- 239000000523 sample Substances 0.000 description 2
- 210000002966 serum Anatomy 0.000 description 2
- 210000001082 somatic cell Anatomy 0.000 description 2
- 210000000130 stem cell Anatomy 0.000 description 2
- 238000007920 subcutaneous administration Methods 0.000 description 2
- 231100000419 toxicity Toxicity 0.000 description 2
- 230000001988 toxicity Effects 0.000 description 2
- 230000002103 transcriptional effect Effects 0.000 description 2
- 230000001052 transient effect Effects 0.000 description 2
- MTCFGRXMJLQNBG-REOHCLBHSA-N (2S)-2-Amino-3-hydroxypropansäure Chemical compound OC[C@H](N)C(O)=O MTCFGRXMJLQNBG-REOHCLBHSA-N 0.000 description 1
- UHDGCWIWMRVCDJ-UHFFFAOYSA-N 1-beta-D-Xylofuranosyl-NH-Cytosine Natural products O=C1N=C(N)C=CN1C1C(O)C(O)C(CO)O1 UHDGCWIWMRVCDJ-UHFFFAOYSA-N 0.000 description 1
- VGONTNSXDCQUGY-RRKCRQDMSA-N 2'-deoxyinosine Chemical compound C1[C@H](O)[C@@H](CO)O[C@H]1N1C(N=CNC2=O)=C2N=C1 VGONTNSXDCQUGY-RRKCRQDMSA-N 0.000 description 1
- MZZYGYNZAOVRTG-UHFFFAOYSA-N 2-hydroxy-n-(1h-1,2,4-triazol-5-yl)benzamide Chemical compound OC1=CC=CC=C1C(=O)NC1=NC=NN1 MZZYGYNZAOVRTG-UHFFFAOYSA-N 0.000 description 1
- FWMNVWWHGCHHJJ-SKKKGAJSSA-N 4-amino-1-[(2r)-6-amino-2-[[(2r)-2-[[(2r)-2-[[(2r)-2-amino-3-phenylpropanoyl]amino]-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]hexanoyl]piperidine-4-carboxylic acid Chemical compound C([C@H](C(=O)N[C@H](CC(C)C)C(=O)N[C@H](CCCCN)C(=O)N1CCC(N)(CC1)C(O)=O)NC(=O)[C@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FWMNVWWHGCHHJJ-SKKKGAJSSA-N 0.000 description 1
- 108010029988 AICDA (activation-induced cytidine deaminase) Proteins 0.000 description 1
- 108010004483 APOBEC-3G Deaminase Proteins 0.000 description 1
- 241000093740 Acidaminococcus sp. Species 0.000 description 1
- 241000219194 Arabidopsis Species 0.000 description 1
- 239000004475 Arginine Substances 0.000 description 1
- 238000012935 Averaging Methods 0.000 description 1
- DWRXFEITVBNRMK-UHFFFAOYSA-N Beta-D-1-Arabinofuranosylthymine Natural products O=C1NC(=O)C(C)=CN1C1C(O)C(O)C(CO)O1 DWRXFEITVBNRMK-UHFFFAOYSA-N 0.000 description 1
- 241000371430 Burkholderia cenocepacia Species 0.000 description 1
- 102100040397 C->U-editing enzyme APOBEC-1 Human genes 0.000 description 1
- 102100040399 C->U-editing enzyme APOBEC-2 Human genes 0.000 description 1
- 241000589875 Campylobacter jejuni Species 0.000 description 1
- 241000589994 Campylobacter sp. Species 0.000 description 1
- 241001502303 Candidatus Methanoplasma Species 0.000 description 1
- 241001040999 Candidatus Methanoplasma termitum Species 0.000 description 1
- 241001437378 Candidatus Paceibacter Species 0.000 description 1
- 241000243205 Candidatus Parcubacteria Species 0.000 description 1
- 102000014914 Carrier Proteins Human genes 0.000 description 1
- -1 Cas3 Proteins 0.000 description 1
- 108020004705 Codon Proteins 0.000 description 1
- 108700010070 Codon Usage Proteins 0.000 description 1
- 108020004635 Complementary DNA Proteins 0.000 description 1
- UHDGCWIWMRVCDJ-PSQAKQOGSA-N Cytidine Natural products O=C1N=C(N)C=CN1[C@@H]1[C@@H](O)[C@@H](O)[C@H](CO)O1 UHDGCWIWMRVCDJ-PSQAKQOGSA-N 0.000 description 1
- 238000007702 DNA assembly Methods 0.000 description 1
- 102100040261 DNA dC->dU-editing enzyme APOBEC-3C Human genes 0.000 description 1
- 102100040266 DNA dC->dU-editing enzyme APOBEC-3F Human genes 0.000 description 1
- 102100038076 DNA dC->dU-editing enzyme APOBEC-3G Human genes 0.000 description 1
- 102100038050 DNA dC->dU-editing enzyme APOBEC-3H Human genes 0.000 description 1
- 101710082737 DNA dC->dU-editing enzyme APOBEC-3H Proteins 0.000 description 1
- 238000012270 DNA recombination Methods 0.000 description 1
- 206010011878 Deafness Diseases 0.000 description 1
- 241000588724 Escherichia coli Species 0.000 description 1
- 241000186394 Eubacterium Species 0.000 description 1
- 239000004471 Glycine Substances 0.000 description 1
- 241000282575 Gorilla Species 0.000 description 1
- 101000964330 Homo sapiens C->U-editing enzyme APOBEC-1 Proteins 0.000 description 1
- 101000742736 Homo sapiens DNA dC->dU-editing enzyme APOBEC-3G Proteins 0.000 description 1
- 101000742769 Homo sapiens DNA dC->dU-editing enzyme APOBEC-3H Proteins 0.000 description 1
- 101000755690 Homo sapiens Single-stranded DNA cytosine deaminase Proteins 0.000 description 1
- 101000658622 Homo sapiens Testis-specific Y-encoded-like protein 2 Proteins 0.000 description 1
- 229930010555 Inosine Natural products 0.000 description 1
- UGQMRVRMYYASKQ-KQYNXXCUSA-N Inosine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C2=NC=NC(O)=C2N=C1 UGQMRVRMYYASKQ-KQYNXXCUSA-N 0.000 description 1
- XUJNEKJLAYXESH-REOHCLBHSA-N L-Cysteine Chemical compound SC[C@H](N)C(O)=O XUJNEKJLAYXESH-REOHCLBHSA-N 0.000 description 1
- ONIBWKKTOPOVIA-BYPYZUCNSA-N L-Proline Chemical compound OC(=O)[C@@H]1CCCN1 ONIBWKKTOPOVIA-BYPYZUCNSA-N 0.000 description 1
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 1
- ODKSFYDXXFIFQN-BYPYZUCNSA-P L-argininium(2+) Chemical compound NC(=[NH2+])NCCC[C@H]([NH3+])C(O)=O ODKSFYDXXFIFQN-BYPYZUCNSA-P 0.000 description 1
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 1
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical compound OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 description 1
- ZDXPYRJPNDTMRX-VKHMYHEASA-N L-glutamine Chemical compound OC(=O)[C@@H](N)CCC(N)=O ZDXPYRJPNDTMRX-VKHMYHEASA-N 0.000 description 1
- HNDVDQJCIGZPNO-YFKPBYRVSA-N L-histidine Chemical compound OC(=O)[C@@H](N)CC1=CN=CN1 HNDVDQJCIGZPNO-YFKPBYRVSA-N 0.000 description 1
- AGPKZVBTJJNPAG-WHFBIAKZSA-N L-isoleucine Chemical compound CC[C@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-WHFBIAKZSA-N 0.000 description 1
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 1
- KDXKERNSBIXSRK-YFKPBYRVSA-N L-lysine Chemical compound NCCCC[C@H](N)C(O)=O KDXKERNSBIXSRK-YFKPBYRVSA-N 0.000 description 1
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 1
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 1
- AYFVYJQAPQTCCC-GBXIJSLDSA-N L-threonine Chemical compound C[C@@H](O)[C@H](N)C(O)=O AYFVYJQAPQTCCC-GBXIJSLDSA-N 0.000 description 1
- QIVBCDIJIAJPQS-VIFPVBQESA-N L-tryptophane Chemical compound C1=CC=C2C(C[C@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-VIFPVBQESA-N 0.000 description 1
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 1
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 1
- 241001134638 Lachnospira Species 0.000 description 1
- 101710128836 Large T antigen Proteins 0.000 description 1
- 208000006136 Leigh Disease Diseases 0.000 description 1
- 208000017507 Leigh syndrome Diseases 0.000 description 1
- 241001148627 Leptospira inadai Species 0.000 description 1
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 1
- 239000012097 Lipofectamine 2000 Substances 0.000 description 1
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 1
- 239000004472 Lysine Substances 0.000 description 1
- 241000124008 Mammalia Species 0.000 description 1
- 206010052641 Mitochondrial DNA mutation Diseases 0.000 description 1
- 206010058799 Mitochondrial encephalomyopathy Diseases 0.000 description 1
- 241000542065 Moraxella bovoculi Species 0.000 description 1
- 241000699666 Mus <mouse, genus> Species 0.000 description 1
- 241000699660 Mus musculus Species 0.000 description 1
- 101100377883 Mus musculus Apobec1 gene Proteins 0.000 description 1
- 101100377889 Mus musculus Apobec2 gene Proteins 0.000 description 1
- 101100489915 Mus musculus Apobec4 gene Proteins 0.000 description 1
- 101000755751 Mus musculus Single-stranded DNA cytosine deaminase Proteins 0.000 description 1
- 241000588650 Neisseria meningitidis Species 0.000 description 1
- 206010028980 Neoplasm Diseases 0.000 description 1
- 101710147059 Nicking endonuclease Proteins 0.000 description 1
- 102000019040 Nuclear Antigens Human genes 0.000 description 1
- 108010051791 Nuclear Antigens Proteins 0.000 description 1
- 238000012408 PCR amplification Methods 0.000 description 1
- 241000282577 Pan troglodytes Species 0.000 description 1
- 241000606856 Pasteurella multocida Species 0.000 description 1
- 241000606580 Pasteurella sp. Species 0.000 description 1
- 241000605894 Porphyromonas Species 0.000 description 1
- 241000878522 Porphyromonas crevioricanis Species 0.000 description 1
- 241001135241 Porphyromonas macacae Species 0.000 description 1
- 241001135219 Prevotella disiens Species 0.000 description 1
- ONIBWKKTOPOVIA-UHFFFAOYSA-N Proline Natural products OC(=O)C1CCCN1 ONIBWKKTOPOVIA-UHFFFAOYSA-N 0.000 description 1
- 102100033091 Putative C->U-editing enzyme APOBEC-4 Human genes 0.000 description 1
- 108010092799 RNA-directed DNA polymerase Proteins 0.000 description 1
- 241000283984 Rodentia Species 0.000 description 1
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 1
- 101710143275 Single-stranded DNA cytosine deaminase Proteins 0.000 description 1
- 241000194017 Streptococcus Species 0.000 description 1
- 101000910035 Streptococcus pyogenes serotype M1 CRISPR-associated endonuclease Cas9/Csn1 Proteins 0.000 description 1
- 102100034917 Testis-specific Y-encoded-like protein 2 Human genes 0.000 description 1
- AYFVYJQAPQTCCC-UHFFFAOYSA-N Threonine Natural products CC(O)C(N)C(O)=O AYFVYJQAPQTCCC-UHFFFAOYSA-N 0.000 description 1
- 239000004473 Threonine Substances 0.000 description 1
- QIVBCDIJIAJPQS-UHFFFAOYSA-N Tryptophan Natural products C1=CC=C2C(CC(N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-UHFFFAOYSA-N 0.000 description 1
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Natural products CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 description 1
- 108020005202 Viral DNA Proteins 0.000 description 1
- 241000700605 Viruses Species 0.000 description 1
- HCHKCACWOHOZIP-UHFFFAOYSA-N Zinc Chemical compound [Zn] HCHKCACWOHOZIP-UHFFFAOYSA-N 0.000 description 1
- 241001531273 [Eubacterium] eligens Species 0.000 description 1
- 101150063416 add gene Proteins 0.000 description 1
- 238000007792 addition Methods 0.000 description 1
- 235000004279 alanine Nutrition 0.000 description 1
- 239000000427 antigen Substances 0.000 description 1
- 108091007433 antigens Proteins 0.000 description 1
- 102000036639 antigens Human genes 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 1
- 238000003491 array Methods 0.000 description 1
- 239000013602 bacteriophage vector Substances 0.000 description 1
- IQFYYKKMVGJFEH-UHFFFAOYSA-N beta-L-thymidine Natural products O=C1NC(=O)C(C)=CN1C1OC(CO)C(O)C1 IQFYYKKMVGJFEH-UHFFFAOYSA-N 0.000 description 1
- 108091008324 binding proteins Proteins 0.000 description 1
- 230000015572 biosynthetic process Effects 0.000 description 1
- 108020001778 catalytic domains Proteins 0.000 description 1
- 230000006037 cell lysis Effects 0.000 description 1
- 238000001516 cell proliferation assay Methods 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 238000003776 cleavage reaction Methods 0.000 description 1
- 238000004737 colorimetric analysis Methods 0.000 description 1
- 238000004590 computer program Methods 0.000 description 1
- 238000012790 confirmation Methods 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 238000012937 correction Methods 0.000 description 1
- 239000013601 cosmid vector Substances 0.000 description 1
- 210000004748 cultured cell Anatomy 0.000 description 1
- XUJNEKJLAYXESH-UHFFFAOYSA-N cysteine Natural products SCC(N)C(O)=O XUJNEKJLAYXESH-UHFFFAOYSA-N 0.000 description 1
- 235000018417 cysteine Nutrition 0.000 description 1
- UHDGCWIWMRVCDJ-ZAKLUEHWSA-N cytidine Chemical compound O=C1N=C(N)C=CN1[C@H]1[C@H](O)[C@@H](O)[C@H](CO)O1 UHDGCWIWMRVCDJ-ZAKLUEHWSA-N 0.000 description 1
- 230000006378 damage Effects 0.000 description 1
- VGONTNSXDCQUGY-UHFFFAOYSA-N desoxyinosine Natural products C1C(O)C(CO)OC1N1C(NC=NC2=O)=C2N=C1 VGONTNSXDCQUGY-UHFFFAOYSA-N 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 201000010099 disease Diseases 0.000 description 1
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 1
- 230000005782 double-strand break Effects 0.000 description 1
- 239000003814 drug Substances 0.000 description 1
- 238000001493 electron microscopy Methods 0.000 description 1
- 238000004520 electroporation Methods 0.000 description 1
- 230000007613 environmental effect Effects 0.000 description 1
- 238000001914 filtration Methods 0.000 description 1
- 238000001502 gel electrophoresis Methods 0.000 description 1
- 238000001415 gene therapy Methods 0.000 description 1
- 125000000291 glutamic acid group Chemical group N[C@@H](CCC(O)=O)C(=O)* 0.000 description 1
- ZDXPYRJPNDTMRX-UHFFFAOYSA-N glutamine Natural products OC(=O)C(N)CCC(N)=O ZDXPYRJPNDTMRX-UHFFFAOYSA-N 0.000 description 1
- 235000004554 glutamine Nutrition 0.000 description 1
- PCHJSUWPFVWCPO-UHFFFAOYSA-N gold Chemical compound [Au] PCHJSUWPFVWCPO-UHFFFAOYSA-N 0.000 description 1
- 239000010931 gold Substances 0.000 description 1
- 229910052737 gold Inorganic materials 0.000 description 1
- 238000012165 high-throughput sequencing Methods 0.000 description 1
- 125000000487 histidyl group Chemical group [H]N([H])C(C(=O)O*)C([H])([H])C1=C([H])N([H])C([H])=N1 0.000 description 1
- 238000009396 hybridization Methods 0.000 description 1
- 238000000338 in vitro Methods 0.000 description 1
- 208000015181 infectious disease Diseases 0.000 description 1
- 230000002458 infectious effect Effects 0.000 description 1
- 230000002401 inhibitory effect Effects 0.000 description 1
- 229960003786 inosine Drugs 0.000 description 1
- 238000007918 intramuscular administration Methods 0.000 description 1
- 238000001990 intravenous administration Methods 0.000 description 1
- AGPKZVBTJJNPAG-UHFFFAOYSA-N isoleucine Natural products CCC(C)C(N)C(O)=O AGPKZVBTJJNPAG-UHFFFAOYSA-N 0.000 description 1
- 229960000310 isoleucine Drugs 0.000 description 1
- 210000003734 kidney Anatomy 0.000 description 1
- 230000003902 lesion Effects 0.000 description 1
- 210000004962 mammalian cell Anatomy 0.000 description 1
- 238000004519 manufacturing process Methods 0.000 description 1
- 238000013507 mapping Methods 0.000 description 1
- 229930182817 methionine Natural products 0.000 description 1
- 231100000350 mutagenesis Toxicity 0.000 description 1
- VMGAPWLDMVPYIA-HIDZBRGKSA-N n'-amino-n-iminomethanimidamide Chemical compound N\N=C\N=N VMGAPWLDMVPYIA-HIDZBRGKSA-N 0.000 description 1
- 239000002105 nanoparticle Substances 0.000 description 1
- 238000007857 nested PCR Methods 0.000 description 1
- 231100000252 nontoxic Toxicity 0.000 description 1
- 230000003000 nontoxic effect Effects 0.000 description 1
- 230000030648 nucleus localization Effects 0.000 description 1
- 210000003463 organelle Anatomy 0.000 description 1
- 229940051027 pasteurella multocida Drugs 0.000 description 1
- 230000001575 pathological effect Effects 0.000 description 1
- 230000010412 perfusion Effects 0.000 description 1
- COLNVLDHVKWLRT-UHFFFAOYSA-N phenylalanine Natural products OC(=O)C(N)CC1=CC=CC=C1 COLNVLDHVKWLRT-UHFFFAOYSA-N 0.000 description 1
- 239000013600 plasmid vector Substances 0.000 description 1
- 238000002360 preparation method Methods 0.000 description 1
- 235000004252 protein component Nutrition 0.000 description 1
- 238000001814 protein method Methods 0.000 description 1
- 230000007398 protein translocation Effects 0.000 description 1
- 238000011002 quantification Methods 0.000 description 1
- 238000009877 rendering Methods 0.000 description 1
- BOLDJAUMGUJJKM-LSDHHAIUSA-N renifolin D Natural products CC(=C)[C@@H]1Cc2c(O)c(O)ccc2[C@H]1CC(=O)c3ccc(O)cc3O BOLDJAUMGUJJKM-LSDHHAIUSA-N 0.000 description 1
- 230000008439 repair process Effects 0.000 description 1
- 230000010076 replication Effects 0.000 description 1
- 230000001177 retroviral effect Effects 0.000 description 1
- 230000007017 scission Effects 0.000 description 1
- 238000002864 sequence alignment Methods 0.000 description 1
- 238000001228 spectrum Methods 0.000 description 1
- 239000013589 supplement Substances 0.000 description 1
- 208000011580 syndromic disease Diseases 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- 229940104230 thymidine Drugs 0.000 description 1
- RWQNBRDOKXIBIV-UHFFFAOYSA-N thymine Chemical class CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 description 1
- 238000013518 transcription Methods 0.000 description 1
- 230000035897 transcription Effects 0.000 description 1
- 230000007704 transition Effects 0.000 description 1
- 238000013519 translation Methods 0.000 description 1
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 1
- 239000004474 valine Substances 0.000 description 1
- 235000014393 valine Nutrition 0.000 description 1
- 230000035899 viability Effects 0.000 description 1
- 230000003612 virological effect Effects 0.000 description 1
- 239000011701 zinc Substances 0.000 description 1
- 229910052725 zinc Inorganic materials 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/78—Hydrolases (3) acting on carbon to nitrogen bonds other than peptide bonds (3.5)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases RNAses, DNAses
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y305/00—Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5)
- C12Y305/04—Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5) in cyclic amidines (3.5.4)
- C12Y305/04001—Cytosine deaminase (3.5.4.1)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y305/00—Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5)
- C12Y305/04—Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5) in cyclic amidines (3.5.4)
- C12Y305/04002—Adenine deaminase (3.5.4.2)
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/01—Fusion polypeptide containing a localisation/targetting motif
- C07K2319/09—Fusion polypeptide containing a localisation/targetting motif containing a nuclear localisation signal
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/80—Fusion polypeptide containing a DNA binding domain, e.g. Lacl or Tet-repressor
Definitions
- an adenine deaminase variant capable of reducing off-target editing
- a fusion protein comprising a DNA-binding protein, and the adenine deaminase variant
- a base editing composition for A-to-G base editing in DNA comprising the fusion protein
- a method for A-to-G base editing in DNA comprising delivering the base editing composition to a cell containing a target DNA.
- mtDNA mammalian mitochondrial DNA
- Programmable deaminases which are composed of a custom DNA-binding protein and a nucleobase deaminase, enable mitochondrial DNA editing in a targeted manner.
- adenine base editors In contrast to cytosine base editors, adenine base editors, known as TALEDs, incorporate TadA8e, a deoxy-adenine deaminase engineered from the tRNA-specific TadA protein derived from E. coli. TadA8e and related variants of TadA are crucial components in CRISPR RNA-guided adenine base editors, extensively employed for A-to-G base editing of nuclear DNA. Nevertheless, the use of RNA-guided ABEs for editing organellar DNA poses a challenge due to the difficulty of delivering guide RNA into organelles.
- TadA8e present in ABEs retains residual deaminase activity for RNA substrates, resulting in unintended, transcriptome-wide off-target base editing. Off-target effects arise when the editing machinery mistakenly acts on unintended genomic regions, leading to undesired modifications.
- the present invention relates to a base editing composition and its method of application in gene therapy and genome engineering. More specifically, the composition and method involve base editor variants that exhibit a significant reduction in off-target genome cleavage and side effects, such as RNA off-target toxicity, when employed in clinical settings.
- the present invention offers a fresh avenue for targeted base editing with wide-ranging applications in the fields of medicine and biotechnology.
- an adenine deaminase variant in one embodiment, provided herein is an adenine deaminase variant.
- a fusion protein comprising a DNA-binding protein, and the adenine deaminase variant.
- a polynucleotide encoding the adenine deaminase variant or the fusion protein, and an expression vector comprising the polynucleotide.
- a base editing composition for A-to-G base editing in DNA comprising the fusion protein, a polynucleotide encoding the fusion protein or an expression vector comprising the polynucleotide.
- a method for A-to-G base editing in DNA comprising delivering the base editing composition to a cell containing a target DNA.
- described herein is a method for reducing off-target editing effects, the method comprising delivering the base editing composition to a cell containing a target DNA.
- adenine deaminase variant or the fusion protein in A-to-G base editing in DNA and/or reducing off-target effect in A-to-G base editing in DNA, or in preparing a composition for A-to-G base editing in DNA and/or reducing off-target effect in A-to-G base editing in DNA.
- FIG. 1(a) exemplifies the structures of base editors used in the present invention.
- AD TadA8e adenine deaminase
- AD* TadA8e adenine deaminase variant
- MTS mitochondrial targeting sequence
- UGI uracil glycosylase inhibitor
- FIG. 1(b) is a graph showing the number of RNA edits and editing frequencies for the Cox3.1-specific sTALEDs.
- FIG. 1(c) is a graph showing A-to-G edits and C-to-T edits for the Cox3.1-specific sTALEDs (left), as well as on-target activity of Cox3.1-specific sTALED variants relative to that of the wild-type Cox3.1-specific sTALED (right).
- FIG. 1(d) is a graph showing the number of RNA edits and editing frequencies for the ND1-specific sTALEDs.
- FIG. 1(e) shows graphs of A-to-G edits and C-to-T edits for the ND1-specific sTALEDs (left), as well as on-target activity of the ND1-specific sTALED variants relative to that of the wild-type ND1-specific sTALED (right).
- FIG. 2(a) shows structural representations of the TadA portion of ABE8e (Protein Data Bank (PDB) accession number 6VPC).
- FIG. 2(b) is a heat map showing DNA on-target activity (left), RNA off-target activity (middle), and the relative ratio of DNA on-target editing frequencies to RNA off-target editing frequencies (right) of the 101 sTALED variants among the total of 209 sTALED variants that retain mitochondrial DNA on-target activity.
- the relative ratio is normalized to that for the original sTALED, which has a value of 1.
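The normalization described for FIG. 2(b) can be illustrated with a minimal sketch. The function name, argument names, and all numeric values below are hypothetical and are not taken from the disclosure; the sketch only assumes that the relative ratio is the on-target/off-target ratio of a variant divided by the same ratio for the original sTALED, so that the original sTALED scores 1.

```python
def relative_specificity(on_target, off_target, ref_on_target, ref_off_target):
    """Return (on/off) for a variant divided by (on/off) for the reference editor.

    All inputs are editing frequencies in the same unit (e.g., percent).
    The reference editor itself always yields 1.0.
    """
    if off_target <= 0 or ref_on_target <= 0 or ref_off_target <= 0:
        raise ValueError("frequencies used as divisors must be positive")
    variant_ratio = on_target / off_target
    reference_ratio = ref_on_target / ref_off_target
    return variant_ratio / reference_ratio


# Hypothetical numbers for illustration only (not measured data).
reference = (20.0, 10.0)   # (DNA on-target %, RNA off-target %)
variant = (20.0, 2.0)
print(relative_specificity(*variant, *reference))
print(relative_specificity(*reference, *reference))
```

A value above 1 in this sketch would indicate a variant with a more favorable balance of DNA on-target to RNA off-target editing than the reference.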
- FIG. 3 shows graphs of DNA on-target base editing frequencies induced by 209 sTALED variants.
- FIG. 4(a) is a graph showing the editing frequencies at six RNA off-target sites measured by targeted RNA sequencing.
- FIG. 4(b) is a graph showing RNA off-target editing frequencies induced by sTALED and sTALED variants extracted from transcriptome-wide sequencing at six selected RNA off-target sites.
- FIG. 4(c) is a graph showing RNA off-target editing frequencies induced by sTALED and sTALED variants analyzed by targeted RNA amplicon sequencing.
- FIG. 5(a) is a graph showing the number of RNA edits and editing frequencies for the Cox3.1-specific sTALEDs.
- FIG. 5(b) shows graphs of A-to-G edits and C-to-T edits for the Cox3.1-specific sTALEDs.
- FIG. 5(c) is a graph showing the number of RNA edits and editing frequencies for the ND1-specific sTALEDs.
- FIG. 5(d) shows graphs of A-to-G edits and C-to-T edits for the ND1-specific sTALEDs.
- FIG. 5(e) is a graph showing the number of RNA edits and editing frequencies for the ND6-specific sTALEDs.
- FIG. 5(f) is a graph showing A-to-G edits and C-to-T edits for the ND6-specific sTALEDs.
- FIG. 6(a) is a graph showing on-target activities of ND1-specific sTALED variants relative to that of wild-type ND1-specific sTALED.
- FIG. 6(b) is a graph showing on-target activities of ND6-specific sTALED variants relative to that of wild-type ND6-specific sTALED.
- FIG. 6(c) is a heat map showing RNA off-target activities of ND1-specific sTALED and sTALED variants at six representative sites.
- FIG. 6(d) is a heat map showing RNA off-target activities of ND6-specific sTALED and sTALED variants at six representative sites.
- FIG. 6(g) is a bar graph showing the ratio of DNA on-target editing frequencies relative to RNA off-target editing frequencies induced by ND1-specific sTALED and sTALED variants.
- FIG. 6(h) is a bar graph showing the ratio of DNA on-target editing frequencies relative to RNA off-target editing frequencies induced by ND6-specific sTALED and sTALED variants. This ratio is normalized to that for sTALED, which has a value of 1.
- FIG. 6(i) is a graph showing average relative ratio values for the sTALEDs and sTALED variants targeted to Cox3.1, ND1 (FIG. 6(g)), and ND6 (FIG. 6(h)).
- FIG. 7(a) is a heat map depicting A-to-G conversions caused by sTALED or sTALED variants targeted to the Cox3.1 site.
- FIG. 7(b) is a heat map depicting A-to-G conversions caused by sTALED or sTALED variants targeted to the ND1 site.
- FIG. 7(c) is a heat map depicting A-to-G conversions caused by sTALED or sTALED variants targeted to the ND6 site.
- FIGs. 7(d)-(f) show the analysis of Cox3.1, ND1, ND6 alleles summarized in FIGs. 7(a)-(c), respectively.
- the spacer sequence is shown on the left, and bar graphs displaying the frequency of each allele are shown on the right.
- the reference sequence is written all in capital letters, whereas the lowercase letters indicate the positions at which base editing has taken place.
- FIG. 8(a) shows the plots indicating the positions of on-target and off-target edits across the mitochondrial genome at day 4 post-transfection.
- Black and gray dots indicate off-target edits and naturally occurring single-nucleotide variations (SNVs), respectively, and the arrows indicate on-target (and bystander) edits.
- Nucleotide positions in the human mitochondrial genome are represented on the X axis.
- FIG. 8(b) shows the average frequencies of genome-wide off-target edits induced by wild-type TALED and TALED variants.
- FIG. 9(a) shows the plots indicating the positions of on-target and off-target edits across the mitochondrial genome at day 2 post-transfection.
- Black and gray dots indicate off-target edits and naturally occurring single-nucleotide variations (SNVs), respectively, and the arrows indicate on-target (and bystander) edits.
- Nucleotide positions in the human mitochondrial genome are represented on the X axis.
- FIG. 11(a) illustrates an experimental scheme of one embodiment of the present invention.
- FIGs. 11(b) and (c) are bar graphs showing the viability of cells transfected with plasmids expressing sTALED, sTALED-V106W, sTALED-V28R and sTALED-R111S targeted to the indicated sites, as determined by observing the color change caused by formazan formation in an MTS assay at day 2 (b) and day 4 (c) post-transfection.
- FIG. 12(a) exemplifies the architecture of ABE8e and ABE8e variant constructs.
- AD TadA8e adenine deaminase
- AD* TadA8e adenine deaminase variant
- NLS nuclear localization sequence
- FIG. 12(b) is a graph showing on-target activity of ABE8e and the ABE8e variants (ABE8e-V106W, ABE8e-V28R, ABE8e-R111S) at the nuclear TYRO3 site.
- FIG. 12(c) depicts a heat map showing the frequencies of A-to-G conversions caused by ABE8e and the ABE8e variants (ABE8e-V106W, ABE8e-V28R, ABE8e-R111S) at the nuclear TYRO3 site.
- FIG. 12(d) is a graph showing RNA off-target activity of TYRO3-targeted ABE8e and ABE8e variants (ABE8e-V106W, ABE8e-V28R, ABE8e-R111S) at six representative sites.
- FIGs. 12(e) and 12(f) are graphs illustrating the total number of RNA edits found in HEK 293T cells that expressed ABE8e or the ABE8e variants targeted to the nuclear TYRO3 site as assessed by whole transcriptome sequencing.
- FIG. 13(a) is a graph showing DNA on-target activity of Cox3-specific TALEDs including dimeric TALEDs (dTALEDs), half monomer (dTALED-ADs), monomeric TALEDs (mTALEDs), and untreated samples.
- dTALEDs dimeric TALEDs
- dTALED-ADs half monomer
- mTALEDs monomeric TALEDs
- FIG. 13(b) is a graph showing RNA off-target activity of Cox3-specific TALEDs including dimeric TALEDs (dTALEDs), half monomer (dTALED-ADs), monomeric TALEDs (mTALEDs), and untreated samples at six representative sites.
- FIG. 13(c) is a graph showing specificity ratios of RNA off-target editing relative to on-target editing induced by Cox3-specific TALEDs.
- the term “about” or “approximately” means within an acceptable error range for the particular value as determined by one of ordinary skill in the art, which will depend in part on how the value is measured or determined, i.e., the limitations of the measurement system. For example, “about” can mean within 1 or more than 1 standard deviation, per the practice in the art. Alternatively, “about” can mean a range of up to 20%, up to 10%, up to 5%, or up to 1% of a given value. Alternatively, particularly with respect to biological systems or processes, the term can mean within an order of magnitude, such as within 5-fold or within 2-fold, of a value. Where particular values are described in the application and claims, unless otherwise stated the term “about” meaning within an acceptable error range for the particular value should be assumed.
- corresponding refers to amino acid residues at positions listed in the polypeptide or amino acid residues that are similar, identical, or homologous to those listed in the polypeptide. Identifying the amino acid at the corresponding position may involve determining a specific amino acid in a sequence with reference to a specific sequence.
- corresponding region generally refers to a similar or corresponding position in a related protein or a reference protein. For example, an arbitrary amino acid sequence is aligned with SEQ ID NO: 3, and based on this, each amino acid residue of the amino acid sequence may be numbered with reference to the amino acid residue of SEQ ID NO: 3 and the numerical position of the corresponding amino acid residue.
- a sequence alignment algorithm as described in the present disclosure may determine the position of an amino acid or a position at which modification such as substitution, insertion, or deletion occurs through comparison with that in a query sequence (also referred to as a "reference sequence").
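The reference-based numbering described above can be sketched as follows. This is a minimal illustration that assumes a pairwise alignment has already been computed and is supplied as two gapped strings of equal length; the function name and the toy sequences are hypothetical and are not SEQ ID NO: 3 or any sequence from the disclosure.

```python
def map_reference_positions(aligned_ref, aligned_query, gap="-"):
    """Map each 1-based reference position to the aligned query residue.

    Returns a dict: reference position -> (query position, query residue),
    or None where the query has a gap (i.e., a deletion) at that position.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    mapping = {}
    ref_pos = 0
    query_pos = 0
    for ref_res, query_res in zip(aligned_ref, aligned_query):
        if query_res != gap:
            query_pos += 1
        if ref_res != gap:
            ref_pos += 1
            mapping[ref_pos] = (query_pos, query_res) if query_res != gap else None
        # Columns with a gap in the reference are insertions in the query
        # and receive no reference number.
    return mapping


# Toy alignment for illustration only.
mapping = map_reference_positions("MSE-VEF", "M-EAVEF")
print(mapping[3])   # query residue corresponding to reference position 3
print(mapping[2])   # None: the query lacks a residue at reference position 2
```

With such a mapping, a substitution defined at a reference position can be located in a related sequence even when the two sequences differ in length.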
- alignment means mapping sequence reads to a reference genome and then aligning the bases having identical sites in genomes to fit for each site. Accordingly, so long as it can align sequence reads in the same manner as above, any computer program may be employed.
- the program may be one already known in the pertinent art or may be selected from among programs tailored to the purpose. In one embodiment, alignment is performed using ISAAC, but is not limited thereto.
- the term “host cell” (or “recombinant host cell”), as used herein, is intended to refer to a cell that has been genetically altered, or is capable of being genetically altered by introduction of an exogenous polynucleotide molecule, such as a recombinant plasmid or vector. It should be understood that such terms are intended to refer not only to the particular subject cell but to the progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term “host cell” as used herein.
- the expression “base editor (BE)” means an agent that binds a polynucleotide and has nucleobase modifying activity.
- the base editor comprises a nucleobase modifying polypeptide (e.g., a deaminase) and a nucleic acid programmable nucleotide binding domain.
- the base editor comprises a nucleobase modifying polypeptide (e.g., a deaminase) and a nucleic acid programmable nucleotide binding domain in conjunction with a guide polynucleotide (e.g., guide RNA).
- the agent is a biomolecular complex comprising a protein domain having base editing activity, i.e., a domain capable of modifying a base (e.g., A, T, C, G, I, or U) within a nucleic acid molecule (e.g., DNA).
- a protein domain having base editing activity i.e., a domain capable of modifying a base (e.g., A, T, C, G, I, or U) within a nucleic acid molecule (e.g., DNA).
- the polynucleotide programmable DNA binding domain is fused or linked to a deaminase domain.
- the agent is a fusion protein comprising a domain having base editing activity.
- the protein domain having base editing activity is linked to the guide RNA (e.g., via an RNA binding motif on the guide RNA and an RNA binding domain fused to the deaminase).
- the domain having base editing activity is capable of deaminating a base within a nucleic acid molecule.
- the base editor is capable of deaminating one or more bases within a DNA molecule.
- the base editor is capable of deaminating an adenine (A) within DNA.
- the base editor is an adenine base editor (ABE).
- composition administration e.g., injection
- composition administration can be performed by intravenous (i.v.) injection, sub-cutaneous (s.c.) injection, intradermal (i.d.) injection, intraperitoneal (i.p.) injection, or intramuscular (i.m.) injection.
- s.c. sub-cutaneous injection
- i.d. intradermal
- i.p. intraperitoneal
- i.m. intramuscular injection
- Parenteral administration can be, for example, by bolus injection or by gradual perfusion over time.
- parenteral administration includes infusing or injecting intravascularly, intravenously, intramuscularly, intraarterially, intrathecally, intratumorally, intradermally, intraperitoneally, transtracheally, subcutaneously, subcuticularly, intraarticularly, subcapsularly, subarachnoidly and intrasternally.
- administration can be by the oral route.
- another amino acid may be intended to refer to an amino acid selected from among alanine, isoleucine, leucine, methionine, phenylalanine, proline, tryptophan, valine, asparagine, cysteine, glutamine, glycine, serine, threonine, tyrosine, aspartic acid, glutamic acid, arginine, histidine, lysine, and all known variants thereof, exclusive of the amino acid that the wild-type protein has at the original substitution position.
- off-target site may refer to a site that is not an on-target site, but to which the adenine base editors show activity. That is, the off-target site may refer to a site where base editing occurs, besides an on-target site.
- the term “off-target site” may be used to cover not only sites that are not on-target sites of the adenine base editors, but also sites that could potentially be off-target sites thereof.
- the term “whole genome sequencing” refers to a method of reading the genome by many multiples such as in 10X, 20X, and 40X formats for whole genome sequencing by next generation sequencing.
- Next generation sequencing means a technology that fragments the whole genome or targeted regions of genome in a chip-based and PCR-based paired end format and performs sequencing of the fragments by high throughput on the basis of chemical reaction (hybridization).
- nucleic acid refers to either DNA or RNA.
- Nucleic acid sequence or “polynucleotide sequence” refers to a single- or double-stranded polymer of deoxyribonucleotide or ribonucleotide bases read from the 5′ to the 3′ end. It includes both self-replicating plasmids, infectious polymers of DNA or RNA, and nonfunctional DNA or RNA.
- nucleic acid molecule encoding refers to a nucleic acid molecule which directs the expression of a specific protein or peptide.
- the nucleic acid sequences include both the DNA strand sequence that is transcribed into RNA and the RNA sequence that is translated into protein or peptide.
- the nucleic acid molecule includes both the full-length nucleic acid sequences as well as non-full length sequences derived from the full length protein. It is further understood that the sequence includes the degenerate codons of the native sequence or sequences which may be introduced to provide codon preference in a specific host cell.
- vector refers to viral expression systems, autonomous self-replicating circular DNA (plasmids), and includes both expression and nonexpression plasmids. Where a recombinant microorganism or cell is described as hosting an “expression vector,” this includes both extrachromosomal circular DNA and DNA that has been incorporated into the host chromosome(s). Where a vector is being maintained by a host cell, the vector may either be stably replicated by the cells during mitosis as an autonomous structure, or the vector may be incorporated within the host's genome.
- plasmid refers to an autonomous circular DNA molecule capable of replication in a cell, and includes both the expression and nonexpression types. Where a recombinant microorganism or cell is described as hosting an “expression plasmid”, this includes latent viral DNA integrated into the host chromosome(s). Where a plasmid is being maintained by a host cell, the plasmid is either being stably replicated by the cell during mitosis as an autonomous structure, or the plasmid is incorporated within the host's genome.
- the “percentage amino acid sequence homology” or “percent amino acid sequence identity” between a first amino acid sequence and a second amino acid sequence may be calculated by dividing [the number of amino acid residues in the first amino acid sequence that are identical to the amino acid residues at the corresponding positions in the second amino acid sequence] by [the total number of amino acid residues in the first amino acid sequence] and multiplying by [100%], in which each deletion, insertion, substitution or addition of an amino acid residue in the second amino acid sequence, compared to the first amino acid sequence, is considered as a difference at a single amino acid residue (position), i.e. as an “amino acid difference” as defined herein.
- the degree of sequence identity between two amino acid sequences may be calculated using a known computer algorithm, such as those mentioned above for determining the degree of sequence identity for nucleotide sequences, again using standard settings.
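The percent identity calculation defined above can be sketched in a few lines. This is a simplified illustration, not the algorithm of any particular alignment program: it assumes the two sequences are already aligned and supplied as gapped strings of equal length, and it divides the count of identical aligned residues by the number of residues in the first sequence. The function name and toy sequences are hypothetical. Note that in this sketch an insertion in the second sequence does not change the result, because it adds neither an identical residue nor a residue to the first sequence.

```python
def percent_identity(aligned_first, aligned_second, gap="-"):
    """Percent of residues in the first sequence that are identical to the
    residue at the corresponding aligned position in the second sequence."""
    if len(aligned_first) != len(aligned_second):
        raise ValueError("aligned sequences must have the same length")
    # Residues of the first sequence (gap columns are not residues).
    first_length = sum(1 for a in aligned_first if a != gap)
    if first_length == 0:
        raise ValueError("first sequence contains no residues")
    # Identical residues at corresponding positions.
    identical = sum(
        1 for a, b in zip(aligned_first, aligned_second) if a != gap and a == b
    )
    return 100.0 * identical / first_length


# Toy sequences for illustration only.
print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))  # one substitution
print(percent_identity("ACDEF", "AC-EF"))            # one deletion in the second
```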
- RNA off-target effects include, for example, bystander off-target effects
- DNA off-target effects include, for example, bystander off-target effects
- such objective was accomplished by substituting different amino acid residues at specific locations within TadA8e that interact with nucleotides.
- RNA sequencing as well as whole mitochondrial genome sequencing were performed. This comprehensive analysis allowed for the identification and confirmation of either the entire spectrum of RNA off-target sites or a representative selection of six prominent RNA off-target sites.
- the dynamics of RNA off-target effects over time were also measured. Such measurements were taken at various time points to assess how the RNA off-target landscape changed over the course of expression.
- an adenine deaminase comprising the amino acid sequence of SEQ ID NO:1 or an amino acid sequence having at least 80% sequence homology to the amino acid sequence of SEQ ID NO:1, wherein at least one amino acid residue selected from residues 28, 30, 46, 48, 49, 82, 84, 106, 108, 110, and 111 of SEQ ID NO:1 or the corresponding amino acid residue of the amino acid sequence having at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, or at least 99.9% sequence homology (or sequence identity) to the amino acid sequence of SEQ ID NO:1 is substituted with another amino acid.
- adenine deaminase refers to a polypeptide or a fragment capable of catalyzing the hydrolytic deamination of adenine or adenosine.
- the deaminase or deaminase domain represents an adenine deaminase that facilitates the hydrolytic deamination of adenosine to inosine or deoxyadenosine to deoxyinosine.
- the adenine deaminase performs the hydrolytic deamination of adenine or adenosine in DNA (deoxyribonucleic acid).
- the adenosine deaminases such as engineered adenosine deaminases or evolved adenosine deaminases, described herein can be derived from any organism, including bacteria.
- the adenosine deaminase is a TadA deaminase. In some embodiments, the TadA deaminase is a TadA variant. In some embodiments, the TadA variant is a TadA8e. In some embodiments, the deaminase or deaminase domain is a variant of a naturally occurring deaminase from an organism, such as bacteria, archaea, human, chimpanzee, gorilla, monkey, cow, dog, rat, or mouse. In some embodiments, the deaminase or deaminase domain does not occur in nature.
- the deaminase or deaminase domain is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, or at least 99.9% identical to a naturally occurring deaminase.
- TadA8e adenosine deaminase has the following sequence:
- the amino acid substitution may be at least one selected from the group consisting of V28Q or V28R; A48W; F84M; V106A; K110S, K110T, or K110V; and R111F, R111Q, R111S, R111T, or R111Y of the amino acid sequence of SEQ ID NO:1.
- the amino acid substitution having the lowest RNA or DNA off-target editing efficiency may be at least one selected from the group consisting of V28Q or V28R, A48W, and R111S of the amino acid sequence of SEQ ID NO:1.
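Substitution labels such as V28R follow the usual convention of wild-type residue, 1-based position, and replacement residue. A minimal sketch of applying such a label to a sequence is given below. The function name and the toy sequence are hypothetical; the toy sequence is not SEQ ID NO:1, which is not reproduced here.

```python
import re

# Wild-type residue, 1-based position, replacement residue (e.g., "V28R").
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitution(sequence, substitution):
    """Return a copy of `sequence` carrying the given point substitution.

    Raises ValueError if the label is malformed, the position is out of
    range, or the wild-type residue does not match the sequence.
    """
    match = _SUBSTITUTION.match(substitution)
    if match is None:
        raise ValueError(f"malformed substitution: {substitution!r}")
    wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    if position < 1 or position > len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {sequence[position - 1]}"
        )
    return sequence[: position - 1] + replacement + sequence[position:]


# Toy example only: substitute the valine at position 3 of a made-up peptide.
print(apply_substitution("MKVLA", "V3R"))
```

Checking the wild-type residue before substituting guards against off-by-one numbering errors, which matter when positions are defined relative to a reference sequence.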
- the adenine deaminase variant may exhibit remarkably reduced off-target effects involving an unintended base alteration in DNA and/or RNA.
- the adenine deaminase variant may reduce unwanted bystander effects while narrowing activity windows.
- the adenine deaminase variant may induce base editing only at a single nucleotide residue without any unintended off-target editing in the target DNA with a frequency of at least 10%, at least 11%, at least 12%, at least 13%, at least 14%, or at least 15%.
- a fusion protein comprising a DNA-binding protein; and the adenine deaminase variant.
- the amino acid substitution of the adenine deaminase variant may be at least one selected from the group consisting of V28Q or V28R; A48W; F84M; V106A; K110S, K110T, or K110V; and R111F, R111Q, R111S, R111T, or R111Y of the amino acid sequence of SEQ ID NO:1.
- the amino acid substitution of the adenine deaminase variant having the lowest RNA or DNA off-target editing effects may be at least one selected from the group consisting of V28Q or V28R; A48W; and R111S of the amino acid sequence of SEQ ID NO:1.
- the fusion protein may exhibit remarkably reduced off-target effects involving an unintended base alteration in DNA and/or RNA.
- the fusion protein may reduce unwanted bystander effects while narrowing activity windows.
- the fusion protein may induce base editing only at a single nucleotide residue without any unintended off-target editing in the target DNA with a frequency of at least 10%, at least 11%, at least 12%, at least 13%, at least 14%, or at least 15%.
- the DNA binding protein may be, for example, but is not limited to, 1) a zinc finger protein, 2) a transcriptional activator-like effector (TALE) protein, or 3) a CRISPR-associated nuclease.
- the nuclease is a type II and/or type V CRISPR-associated nuclease, such as a Cas protein (e.g., Cas9 protein (CRISPR (clustered regularly interspaced short palindromic repeats)-associated protein 9)) or Cpf1 protein (CRISPR from Prevotella and Francisella 1).
- a nuclease associated with the CRISPR system, for example, an endonuclease or the like, may be used.
- the nuclease may be a Cas protein such as Cas3, Cas9, Cpf1, Cas6, or C2c2, specifically the Cas protein of CRISPR/Cas type II, and more specifically a Cas9 protein derived from Streptococcus pyogenes.
- TALE: Transcriptional Activator-Like Effector
- RVD: Repeat Variable Diresidue
- the RVD motif determines binding specificity to a nucleic acid sequence, and can be engineered according to methods well known to those of skill in the art to specifically bind a desired DNA sequence.
- the simple relationship between amino acid sequence and DNA recognition has allowed for the engineering of specific DNA binding domains by selecting a combination of repeat segments containing the appropriate RVDs.
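The one-repeat-to-one-base relationship described above can be illustrated with a lookup table. The sketch below assumes the commonly reported canonical RVD code (NI for A, HD for C, NN for G, NG for T); this mapping is not stated in the text, NN is known to tolerate A as well, and real designs often also consider a 5' T preceding the target, which is not modeled here. The function name `rvds_for_target` is hypothetical.

```python
from typing import List

# Assumed canonical RVD code (not taken from the text above).
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}

def rvds_for_target(target: str) -> List[str]:
    """Return one RVD per base of the target DNA, in 5'-to-3' order."""
    rvds = []
    for base in target.upper():
        if base not in RVD_FOR_BASE:
            raise ValueError(f"unsupported base: {base!r}")
        rvds.append(RVD_FOR_BASE[base])
    return rvds

# Each repeat segment in the array carries the RVD chosen for its base.
array = rvds_for_target("TACG")
```

Choosing a combination of repeat segments then amounts to ordering repeats so that their RVDs follow this list.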
- the DNA binding protein may be a TALE.
- the TALE may be a dual TALE module consisting of a first TALE module and a second TALE module.
- each of the first and second TALE modules may be connected to various deaminases.
- the first TALE module may be linked to a cytosine deaminase, such as DddAtox in a full-length form.
- the second TALE module may be linked to the adenine deaminase variant.
- Cas9 protein is a main protein component of the CRISPR/Cas system, which can function as an activated endonuclease or nickase.
- Cas9 protein or gene information thereof may be acquired from a well-known database such as the GenBank of NCBI (National Center for Biotechnology Information).
- the Cas9 protein may be at least one selected from the group consisting of, but not limited to:
- a Cas9 protein derived from Streptococcus pyogenes (e.g., SwissProt Accession number Q99ZW2 (NP_269215.1)) (encoding gene: SEQ ID NO: 229);
- a Cas9 protein derived from Streptococcus sp., for example, Streptococcus thermophilus or Streptococcus aureus;
- a Cas9 protein derived from Pasteurella sp., for example, Pasteurella multocida;
- a Cas9 protein derived from Francisella sp., for example, Francisella novicida.
- Cpf1 protein, which is an endonuclease of a new CRISPR system distinguished from the CRISPR/Cas system, is small in size compared to Cas9, requires no tracrRNA, and can function with a single guide RNA.
- Cpf1 can recognize thymidine-rich PAM (protospacer-adjacent motif) sequences and produce cohesive double-strand breaks (cohesive ends).
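Because Cas9 and Cpf1 recognize different PAMs, candidate target sites differ between the two nucleases. The sketch below scans one strand of a DNA sequence for PAM matches. The patterns NGG (often cited for Streptococcus pyogenes Cas9) and TTTV (often cited for Cpf1) are assumptions used for illustration and are not given in the text; the function name `find_pam_sites` is hypothetical, and the reverse strand is not searched.

```python
import re
from typing import List

# IUPAC codes needed for the example patterns only.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "V": "[ACG]"}

def find_pam_sites(sequence: str, pam: str) -> List[int]:
    """Return 0-based start positions of every PAM match on this strand."""
    body = "".join(IUPAC[code] for code in pam.upper())
    # A lookahead keeps matches zero-width so overlapping PAMs are reported.
    pattern = re.compile("(?=(" + body + "))")
    return [match.start() for match in pattern.finditer(sequence.upper())]

cas9_like = find_pam_sites("ACGGGTTTAC", "NGG")   # G-rich PAM
cpf1_like = find_pam_sites("ACGGGTTTAC", "TTTV")  # T-rich PAM
```

The same toy sequence yields different candidate positions for the two patterns, which is why the choice of nuclease changes which sites are addressable.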
- the Cpf1 protein may be an endonuclease derived from Candidatus spp., Lachnospira spp., Butyrivibrio spp., Peregrinibacteria, Acidaminococcus spp., Porphyromonas spp., Prevotella spp., Francisella spp., Candidatus Methanoplasma, or Eubacterium spp.
- Examples of the microorganism from which the Cpf1 protein may be derived include, but are not limited to, Parcubacteria bacterium (GWC2011_GWC2_44_17), Lachnospiraceae bacterium (MC2017), Butyrivibrio proteoclasticus, Peregrinibacteria bacterium (GW2011_GWA_33_10), Acidaminococcus sp. (BV3L6), Porphyromonas macacae, Lachnospiraceae bacterium (ND2006), Porphyromonas crevioricanis, Prevotella disiens, Moraxella bovoculi (237), Smithella sp. (SC_KO8D17), Leptospira inadai, Lachnospiraceae bacterium (MA2020), Francisella novicida (U112), Candidatus Methanoplasma termitum, Candidatus Paceibacter, and Eubacterium eligens.
- the Cas9 protein may be at least one selected from the group consisting of modified Cas9 that lacks endonuclease activity and retains nickase activity as a result of introducing mutations (e.g., substitution with a different amino acid) to D10 of Streptococcus pyogenes -derived Cas9 protein (e.g., SwissProt Accession number Q99ZW2(NP_269215.1)), and modified Cas9 protein that lacks both endonuclease activity and nickase activity as a result of introducing mutations (e.g., substitutions with different amino acids) to both D10 and H840 of Streptococcus pyogenes -derived Cas9 protein.
- the mutation at D10 may be a D10A mutation (the amino acid D at position 10 in Cas9 protein is substituted with A).
- the mutation at H840 may be H840A.
- the nick may be introduced simultaneously with the deaminase-mediated base modification (e.g., cytidine converted to uridine) or sequentially, in any order, on the strand on which the base modification occurred or on the opposite strand thereof (e.g., the strand opposite to the strand where base conversion occurred) (e.g., a nick is introduced at a position between the third nucleotide and the fourth nucleotide positions in the direction of the 5' end of the PAM sequence on the strand opposite to the strand where the PAM is located).
- Nuclease mutations (e.g., amino acid substitutions, etc.) can occur in the catalytically active domain of the nuclease (e.g., in the case of Cas9, the RuvC catalytic domain).
- the mutations may be a substitution of at least one amino acid selected from the group consisting of catalytic aspartic acid at position 10 (D10), glutamic acid at position 762 (E762), histidine at position 840 (H840), asparagine at position 854 (N854), asparagine at position 863 (N863), and aspartic acid at position 986 (D986) for another amino acid.
- it can include variants in which amino acids at N863, H840-N863, or H839-H840-N863 of Cas9 are replaced with another amino acid.
- a D10A SpCas9 nickase or an SpCas9 nickase prepared by removing some catalytic domains may also be used.
- the fusion protein may further include a guide RNA.
- the guide RNA may be, for example, at least one selected from the group consisting of CRISPR RNA (crRNA), trans-activating crRNA (tracrRNA), and single guide RNA (sgRNA). Specifically, it may be a double-stranded crRNA:tracrRNA complex in which crRNA and tracrRNA are bonded to each other, or a single-stranded guide RNA (sgRNA) in which crRNA or a portion thereof and tracrRNA or a portion thereof are linked by an oligonucleotide linker.
- the adenine deaminase variant and the DNA binding protein may be used in the form of a fusion protein in which they are fused to each other directly or via a peptide linker (e.g., in the order of adenine deaminase variant-DNA binding protein in the N- to C-terminus direction (i.e., DNA binding protein fused to the C-terminus of the adenine deaminase variant) or in the order of DNA binding protein-adenine deaminase variant in the N- to C-terminus direction (i.e., adenine deaminase variant fused to the C-terminus of the DNA binding protein)); a mixture of the adenine deaminase variant or mRNA coding therefor and the DNA binding protein or mRNA coding therefor; or a plasmid carrying both an adenine deaminase variant-encoding gene and a DNA binding protein-encoding gene (e.g., with the two genes arranged to encode the fusion protein described above), or a mixture of an adenine deaminase variant expression plasmid and a DNA binding protein expression plasmid, which carry an adenine deaminase variant-encoding gene and a DNA binding protein-encoding gene, respectively.
- the fusion protein may further comprise a cytosine deaminase.
- the cytosine deaminase refers to any enzyme having activity to convert a cytosine found in a nucleotide (e.g., cytosine present in double-stranded DNA or RNA) to uracil (C-to-U conversion activity or C-to-U editing activity).
- the cytosine deaminase converts cytosine positioned on a strand where a PAM sequence linked to the target sequence is present to uracil.
- the cytosine deaminase may originate from bacteria, archaea, or mammals, including primates such as humans and monkeys, rodents such as rats and mice, and the like, but is not limited thereto.
- the cytosine deaminase may be at least one selected from the group consisting of PmCDA1 (Petromyzon marinus cytosine deaminase 1) from Petromyzon marinus, DddAtox from Burkholderia cenocepacia, and the APOBEC (apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like) family, but is not limited thereto.
- the cytosine deaminase is wild-type Petromyzon marinus CDA1 (pmCDA1) or a catalytic domain thereof. In some embodiments, the cytosine deaminase comprises one or more mutations in the pmCDA1 sequence, such that the editing efficiency, and/or substrate editing preference of pmCDA1 is changed according to specific needs.
- pmCDA1 has the following amino acid sequence:
- DddAtox is cytotoxic, and thus, in order to avoid toxicity in host cells, DddAtox is split into two inactive halves, each of which is fused to a DNA-binding protein in a DddA-derived cytosine base editor (DdCBE).
- a functional deaminase is reassembled at a target DNA site, when two inactive halves are brought together by the DNA-binding protein.
- the full-length DddAtox has the following amino acid sequence:
- 6U08_A of Burkholderia cenocepacia can include fragments or variants thereof, including amino acid sequences having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity with DddA of 6U08_A.
- the APOBEC (apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like) family may be, for example, at least one selected from the following group, but is not limited thereto:
- APOBEC1: Homo sapiens APOBEC1 (protein: GenBank Accession Nos. NP_001291495.1, NP_001635.2, NP_005880.2, etc.; gene (mRNA or cDNA; described in the order of the above listed corresponding proteins): GenBank Accession Nos. NM_001304566.1, NM_001644.4, NM_005889.3, etc.), Mus musculus APOBEC1 (protein: GenBank Accession Nos. NP_001127863.1, NP_112436.1, etc.; gene: GenBank Accession Nos. NM_001134391.1, NM_031159.3, etc.);
- APOBEC2: Homo sapiens APOBEC2 (protein: GenBank Accession No. NP_006780.1, etc.; gene: GenBank Accession No. NM_006789.3, etc.), mouse APOBEC2 (protein: GenBank Accession No. NP_033824.1, etc.; gene: GenBank Accession No. NM_009694.3, etc.);
- APOBEC3B: Homo sapiens APOBEC3B (protein: GenBank Accession Nos. NP_001257340.1, NP_004891.4, etc.; gene: GenBank Accession Nos. NM_001270411.1, NM_004900.4, etc.), Mus musculus APOBEC3B (proteins: GenBank Accession Nos. NP_001153887.1, NP_001333970.1, NP_084531.1, etc.; gene: GenBank Accession Nos. NM_001160415.1, NM_001347041.1, NM_030255.3, etc.);
- APOBEC3C: Homo sapiens APOBEC3C (protein: GenBank Accession No. NP_055323.2, etc.; gene: GenBank Accession No. NM_014508.2, etc.);
- APOBEC3D: Homo sapiens APOBEC3D (protein: GenBank Accession No. NP_689639.2, etc.; gene: GenBank Accession No. NM_152426.3, etc.);
- APOBEC3F: Homo sapiens APOBEC3F (protein: GenBank Accession Nos. NP_660341.2, NP_001006667.1, etc.; gene: GenBank Accession Nos. NM_145298.5, NM_001006666.1, etc.);
- APOBEC3G: Homo sapiens APOBEC3G (protein: GenBank Accession Nos. NP_068594.1, NP_001336365.1, NP_001336366.1, NP_001336367.1, etc.; gene: GenBank Accession Nos. NM_021822.3, NM_001349436.1, NM_001349437.1, NM_001349438.1, etc.);
- APOBEC3H: Homo sapiens APOBEC3H (protein: GenBank Accession Nos. NP_001159474.2, NP_001159475.2, NP_001159476.2, NP_861438.3, etc.; gene: GenBank Accession Nos. NM_001166002.2, NM_001166003.2, NM_001166004.2, NM_181773.4, etc.);
- APOBEC4 (including APOBEC3E): Homo sapiens APOBEC4 (protein: GenBank Accession No. NP_982279.1, etc.; gene: GenBank Accession No. NM_203454.2 etc.); mouse APOBEC4 (protein: GenBank Accession No. NP_001074666.1, etc.; gene: GenBank Accession No. NM_001081197.1, etc.); and
- Activation-induced cytidine deaminase (AID): Homo sapiens AID (protein: GenBank Accession Nos. NP_001317272.1, NP_065712.1, etc.; gene: GenBank Accession Nos. NM_001330343.1, NM_020661.3, etc.); mouse AID (protein: GenBank Accession No. NP_033775.1, etc.; gene: GenBank Accession No. NM_009645.2, etc.), and the like.
- the cytosine deaminase may be a non-toxic full-length deaminase (i.e., monomeric cytosine deaminase) or be in a two-split form (i.e., dimeric deaminase) comprising separated first and second domains, each of which may be characterized by the absence of deaminase activity.
- the adenine deaminase variant may bind to the N- or C-terminus of a DNA binding protein or cytosine deaminase or variant thereof.
- when the DNA-binding protein is ZFP, the adenine deaminase variant is a TadA8e variant, and the cytosine deaminase or its variant is DddAtox, they can be included in the following order, but are not limited thereto: ZFP-TadA8e variant-DddAtox, ZFP-DddAtox-TadA8e variant, TadA-DddAtox-ZFP, or DddAtox-TadA8e variant-ZFP.
- the adenine deaminase variant may be attached to the C-terminus of the zinc finger protein (ZF-Left), the N-terminus or C-terminus of the first domain of the cytosine deaminase, the N-terminus of zinc finger protein (ZF-Right), or the N-terminus or C-terminus of the second domain of the cytosine deaminase.
- the adenine deaminase variant may bind to the C-terminus of zinc finger protein (ZF-Left), the N-terminus or C-terminus of the first domain of the cytosine deaminase, zinc finger protein (ZF-Right), or the N-terminus or C-terminus of the second domain of the cytosine deaminase.
- when the cytosine deaminase is in a split form and the DNA binding protein is a TALE, the first domain of the cytosine deaminase is attached to a first TALE and the second domain of the cytosine deaminase is attached to a second TALE, having the structures N'-TALE-first domain DddA-C' and N'-TALE-second domain DddA-C', respectively.
- the adenine deaminase variant may bind to the N-terminus or C-terminus of the first domain of cytosine deaminase or the N-terminus or C-terminus of the second domain of cytosine deaminase.
- when the cytosine deaminase is included in a full-length form and the DNA binding protein is a TALE, the fusion protein can include a single TALE module and a cytosine deaminase in the N-to-C orientation, wherein an adenine deaminase variant can bind to the C-terminus of the single TALE module, or to the N- or C-terminus of the cytosine deaminase.
- when the cytosine deaminase is included in a full-length form and the DNA binding protein is a TALE, a dual TALE module may be included.
- a first TALE module and cytosine deaminase are included in the N-C direction, and a second domain including an adenine deaminase variant and a second TALE may be further included.
- the adenine deaminase variant can bind to the N-terminus or C-terminus of TALE.
- the fusion protein may further comprise UGI (uracil glycosylase inhibitor).
- UGI can increase the efficiency of base correction by inhibiting the activity of UDG (uracil DNA glycosylase), an enzyme that repairs mutant DNA by catalyzing the removal of U from DNA.
- the DNA may be nuclear DNA or organellar DNA.
- the fusion protein may further comprise NLS (nuclear localization signal).
- the nuclear localization signal protein may be, for example, derived from the simian virus 40 large tumor antigen (SV40 large T antigen), but is not limited thereto.
- the nuclear localization signal protein may contain, for example, the following amino acid sequence, but is not limited thereto:
- PKKKRKV (SEQ ID NO: 4)
- the fusion protein may further comprise MTS (mitochondrial targeting sequence) or CTP (chloroplast transit peptide).
- the mitochondrial targeting sequence protein may be, for example, SOD2-MTS or COX8A-MTS, and may contain the following amino acid sequences, but are not limited thereto:
- SOD2-MTS LSRAVCGTSRQLAPVLGYLGSRQKHSLPD (SEQ ID NO: 5)
- COX8A-MTS SVLTPLLLRGLTGSARRLPVPRAKIHSL (SEQ ID NO: 6).
- the chloroplast transit peptide protein may be, for example, derived from Arabidopsis RECA1 but is not limited thereto.
- the chloroplast transit peptide protein may contain, for example, the following amino acid sequence, but is not limited thereto:
- the fusion protein may further comprise NES (nuclear export signal).
- the nuclear export signal protein may be, for example, derived from MVM (minute virus of mice), but is not limited thereto.
- the nuclear export signal protein may contain, for example, the following amino acid sequence, but is not limited thereto:
- VDEMTKKFGTLTIHDTEK (SEQ ID NO: 8)
- when the signal peptides are attached to the fusion protein, the structure may be as follows: Signal peptide - DNA binding protein - Deaminase. In another embodiment, the structure could be Signal peptide - Deaminase - DNA binding protein. In one embodiment, the nuclear export signal protein, CTP (chloroplast transit peptide), or a polynucleotide encoding the same may be attached to the N-terminus of a DNA-binding protein, cytosine deaminase (DdCBE), or a polynucleotide encoding the same.
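The domain orders described above can be made concrete by concatenating component sequences from the N-terminus to the C-terminus. The sketch below is illustrative only: `assemble_fusion` is a hypothetical helper, the "DBP" and "Deaminase" sequences are short placeholders rather than real domains, and the "GS" linker is an assumption. Only the NLS sequence PKKKRKV (SEQ ID NO: 4) comes from the text.

```python
from typing import List, Tuple

NLS_SV40 = "PKKKRKV"  # SEQ ID NO: 4, as listed in the text

def assemble_fusion(parts: List[Tuple[str, str]],
                    linker: str = "") -> Tuple[str, str]:
    """Join (name, sequence) parts in N-to-C order.

    Returns the layout string and the concatenated amino acid sequence.
    """
    layout = "-".join(name for name, _ in parts)
    sequence = linker.join(seq for _, seq in parts)
    return layout, sequence

# Signal peptide - DNA binding protein - Deaminase (placeholder domains).
layout, sequence = assemble_fusion(
    [("NLS", NLS_SV40), ("DBP", "AAAA"), ("Deaminase", "CCCC")],
    linker="GS",  # hypothetical flexible linker
)
```

Swapping the last two entries of the list gives the alternative order, Signal peptide - Deaminase - DNA binding protein.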
- the fusion protein may further comprise a nickase, for example, MutH, MutH variants, or Nt.BspD6I(C), but not limited thereto.
- MutH is a weak endonuclease that is activated once bound to MutL. It nicks unmethylated DNA and the unmethylated strand of hemimethylated DNA but does not nick fully methylated DNA.
- nicking endonuclease Nt.BspD6I (Nt.BspD6I) is the large subunit of the heterodimeric restriction endonuclease R.BspD6I.
- the fusion protein further comprising a nickase may be in a dimeric form comprising a first fusion protein and a second protein.
- the first fusion protein may include the DNA-binding protein, and the adenine deaminase variant, and the second protein may include another DNA-binding protein, and the nickase.
- a polynucleotide encoding the adenine deaminase or the fusion protein is provided.
- the term “polynucleotide” is used interchangeably with “nucleic acid”, “oligonucleotide”, “nucleotide”, “nucleotide sequence.” It can contain polymeric forms of nucleotides of any length, deoxyribonucleotides or ribonucleotides, or analogs thereof. Polynucleotides can have any three-dimensional structure and can perform any known or unknown function.
- a polynucleotide can comprise one or more modified nucleotides, such as methylated nucleotides and nucleotide analogs. Modifications to the nucleotide structure are possible before or after assembly of the polymer.
- the nucleic acid can be an RNA sequence, in particular an mRNA sequence, a DNA sequence, or a combination thereof (RNA-DNA combination sequences).
- the nucleic acid may be delivered using a viral vector such as an Adeno-Associated Viral Vector (AAV), Adenoviral Vector (AdV), Lentiviral Vector (LV), or Retroviral Vector (RV), or other viral vectors such as an episomal vector including Simian virus 40 (SV40) ori, bovine papilloma virus (BPV) ori, or Epstein-Barr nuclear antigen (EBV).
- the delivery may be also carried out using a non-viral vector, or through plasmid or mRNA delivery.
- the vector may be delivered in vivo or into cells by a local injection method (e.g., direct injection into a lesion or target site), electroporation, lipofection, viral vector, nanoparticles, PTD (protein translocation domain) fusion protein method, or the like.
- known expression vectors such as plasmid vectors, cosmid vectors and bacteriophage vectors can be used.
- vectors can be readily prepared by those skilled in the art according to any known method using DNA recombination techniques.
- a recombinant expression vector is designed to carry nucleic acid in a format that facilitates its expression within a host cell.
- the nucleic acid sequence intended for expression is operably linked to the recombinant expression vector, and this vector is equipped with one or more regulatory elements, which can be chosen according to the specific host cell.
- "operably linked" means that the nucleotide sequence of interest is linked to a regulatory element in a manner that allows expression of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a host cell when the vector is introduced to the host cell).
- a base editing composition for A-to-G base editing in DNA comprising the fusion protein, a polynucleotide encoding the fusion protein or an expression vector comprising the polynucleotide.
- the fusion protein has been explained above in detail.
- the fusion protein of the base editing composition may further comprise a cytosine deaminase.
- the cytosine deaminase has been explained above.
- the cytosine deaminase may be present in a two split form, and the fusion protein may comprise a first fusion protein comprising a first split of the cytosine deaminase and a second fusion protein comprising a second split of the cytosine deaminase.
- the first split may comprise the amino acid sequence of SEQ ID NO: 9 or 10
- the second split may comprise the amino acid sequence of SEQ ID NO: 11 or 12, but is not limited thereto.
- two TALEDs may consist of the left- or right-side TALE fused to the N-terminal DddAtox half split at G1397 (L-1397N or R-1397N, respectively) and the right- or left-side TALE fused to the C-terminal DddAtox half split at G1397 and TadA8e (R-1397C-AD or L-1397C-AD).
- the base editing composition may exhibit remarkably reduced off-target effects involving an unintended base alteration in DNA and/or RNA.
- the base editing composition may reduce unwanted bystander effects while narrowing activity windows.
- the base editing composition may induce base editing only at a single nucleotide residue without any unintended off-target editing in the target DNA with a frequency of at least 10%, at least 11%, at least 12%, at least 13%, at least 14%, or at least 15%.
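A frequency of editing "only at a single nucleotide residue" can be read as the share of sequencing reads that carry the intended A-to-G change and no other change in the analyzed window. The sketch below computes that share under simplifying assumptions: reads are already aligned and trimmed to the same length as the reference window, and any other difference (bystander edit or otherwise) disqualifies a read. The function name `single_site_edit_frequency` and the toy reads are hypothetical; this is not the analysis pipeline used in the patent.

```python
from typing import Sequence

def single_site_edit_frequency(reference: str,
                               reads: Sequence[str],
                               target_index: int,
                               edited_base: str = "G") -> float:
    """Percent of reads edited at `target_index` and nowhere else."""
    if not reads:
        raise ValueError("no reads supplied")
    if reference[target_index] != "A":
        raise ValueError("target position is not an adenine")
    # The only acceptable read: reference with the target A replaced.
    expected = (reference[:target_index] + edited_base
                + reference[target_index + 1:])
    clean = sum(1 for read in reads if read == expected)
    return 100.0 * clean / len(reads)

# Toy window "CATAG" with the target A at index 1.
frequency = single_site_edit_frequency(
    "CATAG", ["CGTAG", "CGTGG", "CATAG", "CGTAG"], target_index=1)
```

In the toy data, the read "CGTGG" carries the intended edit plus a bystander edit at the second adenine, so it is excluded from the single-site count.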
- a method for A-to-G base editing in DNA comprising delivering the base editing composition to a cell containing a target DNA.
- the cell may be a eukaryotic cell (e.g., a cell of a fungus such as yeast, or a cell derived from a eukaryotic animal and/or eukaryotic plant (e.g., embryonic cells, stem cells, somatic cells, gametes, etc.)), a eukaryotic animal (e.g., humans, monkeys, primates, dogs, pigs, cattle, sheep, goats, mice, rats, etc.), or a eukaryotic plant (e.g., algae such as green algae, corn, soybeans, wheat, rice, etc.), but is not limited thereto.
- the delivery of the base editing composition to a cell containing a target DNA may be carried out ex vivo or in vivo.
- the target DNA may be nuclear DNA, organellar DNA, or mitochondrial DNA of a human subject with a hereditary disease.
- hereditary disease refers to a pathological condition that occurs due to a mutation that is harmful to a gene or chromosome.
- hereditary diseases include, but are not limited to, mitochondrial encephalopathy, lactic acidosis, and stroke-like episodes (MELAS) syndrome, DEAF, Leber hereditary optic neuropathy (LHON), Leigh syndrome, myopathy, and chronic progressive external ophthalmoplegia (CPEO).
- the method may exhibit reduced off-target effects compared to a case when using a base editor comprising an adenine deaminase having the amino acid sequence of SEQ ID NO:1, wherein the off-target editing is characterized by an unintended base alteration in DNA and/or RNA.
- the method may reduce unwanted bystander effects while narrowing activity windows compared to when using a base editor comprising an adenine deaminase having the amino acid sequence of SEQ ID NO:1.
- the method may induce base editing only at a single nucleotide residue without any unintended off-target editing in the target DNA with a frequency of at least 10%, at least 11%, at least 12%, at least 13%, at least 14%, or at least 15%.
- the method may exhibit a reduced off-target effect compared to when using a base editor comprising an adenine deaminase having the amino acid sequence of SEQ ID NO:1 with a V106W mutation, wherein the off-target editing is characterized by an unintended base alteration in DNA and/or RNA.
- the method may reduce unwanted bystander effects while narrowing activity windows compared to when using a base editor comprising an adenine deaminase having the amino acid sequence of SEQ ID NO:1 with a V106W mutation.
- the method may induce base editing only at a single nucleotide residue without any unintended off-target editing in the target DNA with a frequency of at least 10%, at least 11%, at least 12%, at least 13%, at least 14%, or at least 15%.
- a method for reducing off-target effect and/or unwanted bystander effects while narrowing activity windows comprising delivering the base editing composition to a cell containing a target DNA.
- the cell may be a eukaryotic cell (e.g., a cell of a fungus such as yeast, or a cell derived from a eukaryotic animal and/or eukaryotic plant (e.g., embryonic cells, stem cells, somatic cells, gametes, etc.)), a eukaryotic animal (e.g., humans, monkeys, primates, dogs, pigs, cattle, sheep, goats, mice, rats, etc.), or a eukaryotic plant (e.g., algae such as green algae, corn, soybeans, wheat, rice, etc.), but is not limited thereto.
- the target DNA may be nuclear DNA, organellar DNA, or mitochondrial DNA of a human subject with a hereditary disease.
- the hereditary disease has been defined above.
- the method may exhibit reduced off-target effects compared to a case when using a base editor comprising an adenine deaminase having the amino acid sequence of SEQ ID NO:1, wherein the off-target editing is characterized by an unintended base alteration in DNA and/or RNA.
- the method may reduce unwanted bystander effects while narrowing activity windows compared to when using a base editor comprising an adenine deaminase having the amino acid sequence of SEQ ID NO:1.
- the method may induce base editing only at a single nucleotide residue without any unintended off-target editing in the target DNA with a frequency of at least 10%, at least 11%, at least 12%, at least 13%, at least 14%, or at least 15%.
- the method may exhibit a reduced off-target effect compared to when using a base editor comprising an adenine deaminase having the amino acid sequence of SEQ ID NO:1 with a V106W mutation, wherein the off-target editing is characterized by an unintended base alteration in DNA and/or RNA.
- the method may reduce unwanted bystander effects while narrowing activity windows compared to when using a base editor comprising an adenine deaminase having the amino acid sequence of SEQ ID NO:1 with a V106W mutation.
- the method may induce base editing only at a single nucleotide residue without any unintended off-target editing in the target DNA with a frequency of at least 10%, at least 11%, at least 12%, at least 13%, at least 14%, or at least 15%.
- a use of the adenine deaminase variant or the fusion protein in A-to-G base editing in DNA and/or reducing off-target effects in A-to-G base editing in DNA, or in preparing a composition for A-to-G base editing in DNA and/or reducing off-target effects in A-to-G base editing in DNA.
- the adenine deaminase variant and the fusion protein have been explained above in detail.
- the fusion protein may further comprise a cytosine deaminase.
- the cytosine deaminase has been explained above.
- the cytosine deaminase may be present in a two-split form, and the fusion protein comprises a first fusion protein comprising a first split of the cytosine deaminase and a second fusion protein comprising a second split of the cytosine deaminase.
- the first split may comprise the amino acid sequence of SEQ ID NO: 9 or 10 and the second split may comprise the amino acid sequence of SEQ ID NO: 11 or 12, but is not limited thereto.
- the composition for A-to-G base editing in DNA and/or reducing off-target effect in A-to-G base editing in DNA may exhibit remarkably reduced off-target effects involving an unintended base alteration in DNA and/or RNA.
- the composition may reduce unwanted bystander effects while narrowing activity windows. Alternatively, the composition may induce base editing only at a single nucleotide residue without any unintended off-target editing in the target DNA with a frequency of at least 10%, at least 11%, at least 12%, at least 13%, at least 14%, or at least 15%.
- HEK 293T cells were purchased from the American Type Culture Collection (ATCC) (CRL-11268). NIH3T3 and B16F10 cells were purchased from ATCC (CRL-1658, CRL-6475). HEK 293T cells were cultured in Dulbecco's Modified Eagle Medium (DMEM; Welgene) supplemented with 10% (v/v) fetal bovine serum (Welgene) and 1% (v/v) antibiotic-antimycotic solution (Welgene).
- NIH3T3 and B16F10 cells were cultured in DMEM supplemented with 10% (v/v) bovine calf serum (Gibco) for NIH3T3 cells or 10% fetal bovine serum (Gibco) for B16F10 cells in the absence of any antibiotics. The cells were incubated at 37°C with 5% CO2. All cell lines were passaged before approaching 90% confluency.
- the ABE8e (PDB accession number 6VPC) structure was downloaded from PDB and visualized with PyMOL v.2.5.4. Some elements including Cas9, single guide RNA, and double-stranded DNA were excluded from the PDB file; 8AZ, a transition state analog for adenosine deamination reactions, and the TadA monomer were retained. 11 residues (V28, V30, N46, A48, I49, V82, F84, V106, N108, K110, and R111) that were near 8AZ, including residues previously known to contact DNA, were selected.
- to construct plasmids encoding DdCBEs, DddA-split TALEDs (sTALEDs), and mTALEDs that target specific sites, plasmids containing a stuffer, which is a sequence between two restriction enzyme sites that helps separate fragments during gel electrophoresis, were prepared.
- the architectures of the plasmids are as follows: sTALED plasmids (p3s-stuffer-DddA tox half (1397C)-AD and p3s-stuffer-DddA tox half (1397N)); mTALED plasmids (p3s-stuffer-E1347A DddA tox full-AD).
- Plasmids encoding DdCBEs, sTALEDs, and mTALEDs targeting specific sites were constructed by inserting custom-designed TALE array sequences as shown in Table 1 below.
- each plasmid was digested with BsaI or BsmBI (NEB), and inserts, including a custom-designed TALE array sequence synthesized by IDT, were inserted into the digested vector using a HiFi DNA assembly kit (NEB).
- the desired TALED construct was generated by digesting the master vector with BsaI or BsmBI to cleave the site in the stuffer and then assembling six TALE arrays in that position using the Golden Gate method.
- HEK 293T cells were grown in DMEM (Welgene) with 10% fetal bovine serum (Welgene) and 1% antibiotic-antimycotic solution (Welgene).
- NIH3T3 (CRL-1658, ATCC) and B16F10 (CRL-6475, ATCC) cells were grown in DMEM supplemented with 10% (v/v) bovine calf serum (Gibco) for NIH3T3 cells or 10% fetal bovine serum (Gibco) for B16F10 cells in the absence of any antibiotics.
- Cell lines were maintained at 37°C with 5% CO2 and were passaged before reaching 90% confluency, with timing depending on the doubling period of the specific cell line.
- For HEK 293T cells, cells were seeded onto 48-well plates (Corning) at a density of 7.5 × 10⁴ cells per well prior to transfection. After 24 h, the cells were transfected with plasmids (1 µg in total) using 1.5 µL of Lipofectamine 2000 (Invitrogen). For delivery of an sTALED pair, a DdCBE pair, or a CRISPR-Base Editor with sgRNA, the total amount of plasmid was 1 µg (500 ng each). When a single construct was used, the amount of transfected plasmid was 500 ng. The transfected cells were harvested after 96 h.
- NIH3T3 and B16F10 cells were seeded in 24-well cell culture plates (SPL, Seoul, Korea) at a density of 1 × 10⁵ cells per well, 18-24 h before transfection.
- Lipofection using Lipofectamine 3000 was performed with 500 ng of each sTALED-encoding plasmid to make up 1000 ng of total plasmid DNA.
- for mTALED, 500 ng of plasmid was used.
- Cells were harvested 3 days after transfection.
- RNA libraries were prepared using a TruSeq Stranded Total RNA Library Prep Gold kit (Illumina). RNA library quality was assessed using a 2200 TapeStation with a D1000 ScreenTape system (Agilent).
- Total RNA sequencing was performed using a NovaSeq 6000 Sequencer (Illumina) at Macrogen with paired-end sequencing (2 × 100 bp).
- RNA base-editing variants were called using GATK HaplotypeCaller.
- RNA variant loci were compared to those of control samples and filtered based on the following criteria: (1) loci with a read depth of at least 10 were retained; (2) loci with a variant count of at least 2 were retained; (3) loci also present in the control sample were removed; and (4) undeterminable loci due to insufficient sequencing depth in the control sample were excluded.
- for the replicate-1 experimental sets, untreated replicate-2 was used as the control for filtering, and for the replicate-2 experimental sets, untreated replicate-1 was used as the control.
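The four filtering criteria above can be sketched in a few lines of Python. This is only an illustrative sketch, not the pipeline used in the source: the `Locus` record, the function name, and in particular the `min_control_depth` cutoff for "insufficient sequencing depth in the control sample" are assumptions, since the source does not state that cutoff.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Locus:
    """Hypothetical record for one RNA variant locus in a treated sample."""
    chrom: str
    pos: int
    depth: int          # total read depth at the locus
    variant_count: int  # reads supporting the variant allele


def filter_rna_variants(treated, control_variant_keys, control_depth,
                        min_depth=10, min_variant_count=2,
                        min_control_depth=10):
    """Apply the four criteria; min_control_depth is an assumed cutoff."""
    kept = []
    for locus in treated:
        key = (locus.chrom, locus.pos)
        if locus.depth < min_depth:                  # (1) read depth >= 10
            continue
        if locus.variant_count < min_variant_count:  # (2) variant count >= 2
            continue
        if key in control_variant_keys:              # (3) also in control
            continue
        if control_depth.get(key, 0) < min_control_depth:  # (4) undeterminable
            continue
        kept.append(locus)
    return kept
```

Swapping which untreated replicate supplies `control_variant_keys` and `control_depth` would reproduce the cross-replicate control scheme described above.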
- A-to-G edits were counted as the number of RNA variant loci with A-to-G edits on the positive strand or T-to-C edits on the negative strand.
- C-to-T edits were counted as the number of RNA variant loci with C-to-T edits on the positive strand or G-to-A edits on the negative strand.
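The strand-aware counting rule can be illustrated as follows. This is a minimal sketch under the assumption that each variant locus is available as a `(ref, alt, strand)` tuple; the function names are hypothetical and not taken from the source.

```python
def classify_edit(ref, alt, strand):
    """Return 'A-to-G', 'C-to-T', or None for a single variant locus."""
    pair = (ref.upper(), alt.upper())
    if strand == "+":
        if pair == ("A", "G"):
            return "A-to-G"
        if pair == ("C", "T"):
            return "C-to-T"
    elif strand == "-":
        # On the negative strand the complementary change is observed.
        if pair == ("T", "C"):
            return "A-to-G"
        if pair == ("G", "A"):
            return "C-to-T"
    return None


def count_edits(loci):
    """Count A-to-G and C-to-T loci from (ref, alt, strand) tuples."""
    counts = {"A-to-G": 0, "C-to-T": 0}
    for ref, alt, strand in loci:
        kind = classify_edit(ref, alt, strand)
        if kind is not None:
            counts[kind] += 1
    return counts
```

Loci that match neither pattern on their strand (for example T-to-C on the positive strand) are simply left out of both counts.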
- HEK 293T cells were treated with 100 µL of cell lysis buffer (50 mM Tris-HCl, pH 8.0; 1 mM EDTA; 0.005% sodium dodecyl sulfate) supplemented with 5 µL of Proteinase K (Qiagen). The cells were lysed by incubation at 55°C for 1 h, and then at 95°C for 10 min. The genomic DNA mixture was subjected to targeted deep sequencing.
- NGS Libraries for targeted deep sequencing were created using nested PCR.
- the target area was initially amplified by PCR using PrimeSTAR® GXL polymerase (Takara). Amplicons were amplified again by PCR with TruSeq DNA-RNA CD index-containing primers to label each fragment with adapter and index sequences in order to build NGS libraries.
- the PCR primers are listed in Table 2 below.
- the final PCR products were purified with a PCR purification kit (MGmed) and sequenced using a MiniSeq sequencer (Illumina). Base editing frequencies from targeted deep sequencing data were measured with source code (https://github.com/ibs-cge/maund).
- the DNA on-target activity and RNA off-target activity of the variants were normalized to the sTALED values. The normalized DNA on-target value was then divided by the normalized RNA off-target value to give a relative ratio: relative ratio = (normalized DNA on-target activity) / (normalized RNA off-target activity).
- for the sTALED itself, this value is 1 because both the normalized DNA on-target value and the normalized RNA off-target value are 1. A higher relative ratio indicates lower off-target activity relative to on-target activity.
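As a worked illustration of this ratio, the following sketch normalizes a variant's activities to reference (sTALED) values and divides them. The function name and the handling of a zero RNA off-target value are assumptions made for the example; the numbers in the usage note are hypothetical, not data from the source.

```python
def relative_ratio(dna_on, rna_off, ref_dna_on, ref_rna_off):
    """Normalized DNA on-target activity divided by normalized RNA off-target.

    ref_dna_on and ref_rna_off are the reference (sTALED) activities used
    for normalization.
    """
    if ref_dna_on <= 0 or ref_rna_off <= 0:
        raise ValueError("reference activities must be positive")
    norm_dna = dna_on / ref_dna_on
    norm_rna = rna_off / ref_rna_off
    if norm_rna == 0:
        # Assumption: no detectable RNA off-target activity maps to infinity.
        return float("inf")
    return norm_dna / norm_rna
```

With hypothetical values, a variant retaining half of the reference on-target activity (20 vs. 40) and a quarter of its RNA off-target activity (2 vs. 8) would give a ratio of 0.5 / 0.25 = 2, while the reference against itself gives 1.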
- For whole mitochondrial genome sequencing, three procedures were required: PCR amplification, NGS library creation, and NGS. Initially, after the growth medium was removed, cells were treated with 100 µL of cell lysis buffer (50 mM Tris-HCl, pH 8.0; 1 mM EDTA; 0.005% sodium dodecyl sulfate) supplemented with 5 µL of Proteinase K (Qiagen). The cells were lysed by incubation at 55°C for 1 h, and then at 95°C for 10 min. Mitochondrial DNA was then amplified by PCR using PrimeSTAR® GXL polymerase (Takara). PCR was performed using two sets of slightly overlapping primers, shown in Table 2 above, to reduce primer bias.
- PCR products were then purified with a PCR purification kit (MGmed). Finally, an Illumina DNA Prep kit with Nextera DNA CD Indexes was used to create an NGS library from the purified PCR products (Illumina). The libraries were then pooled and transferred onto a MiniSeq sequencer (Illumina).
- the remaining sites were regarded as off-target sites, and we counted the number of edited A/T nucleotides with an editing frequency > 0.1%.
- the average A/T to G/C editing frequency was calculated for all bases in the mitochondrial genome by averaging the conversion rates at each base location in the off-target sites, as shown below.
- Mitochondrial genome-wide graphs were constructed by plotting the conversion rates at on-target and off-target sites with an editing frequency ≥ 1% across the entire mitochondrial genome.
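The off-target counting and averaging steps described above might be sketched as follows. This is an illustration under stated assumptions, not the source's analysis code: per-position data are assumed to be a mapping from position to `(reference base, A/T-to-G/C conversion rate in %)`, the function name is hypothetical, and the average is taken over all off-target A/T positions supplied, which is one reading of the description.

```python
def summarize_mito_off_targets(rates, on_target_positions, threshold_pct=0.1):
    """Return (number of edited off-target A/T bases, average rate in %).

    rates: dict mapping position -> (reference base, conversion rate in %).
    on_target_positions: set of positions excluded as on-target.
    """
    off_target_rates = [
        rate
        for pos, (base, rate) in rates.items()
        if pos not in on_target_positions and base.upper() in ("A", "T")
    ]
    # Count A/T nucleotides edited above the 0.1% threshold from the text.
    n_edited = sum(1 for rate in off_target_rates if rate > threshold_pct)
    # Assumed denominator: every off-target A/T position that was supplied.
    average = (sum(off_target_rates) / len(off_target_rates)
               if off_target_rates else 0.0)
    return n_edited, average
```

A different choice of denominator (for example, every base in the mitochondrial genome) would change only the final division.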
- Cell viability assays were performed using CellTiter 96® Aqueous One Solution (Promega) at day 2 or day 4 after plasmid transfection.
- the MTS assay measured the number of viable cells with a colorimetric method.
- Cells were treated with CellTiter 96® Aqueous One Solution, and the quantification of bio-reduced product was measured by recording the absorbance at 490 nm according to the manufacturer’s instructions.
- to reduce transcriptome-wide off-target A-to-I conversions induced by TALEDs, site-specific mutations were introduced in TadA8e, including V106W, V106G, K20A/R21A (dual mutations), or F148A, which are known to reduce off-target RNA editing when incorporated in CRISPR RNA-guided adenine base editors (ABEs).
- Transcriptome-wide sequencing showed that sTALED variants incorporating these mutations in TadA8e reduced the number of off-target A-to-G edits significantly but not completely, while retaining DNA on-target editing efficiencies (FIG. 1).
- TALEDs were engineered by mutating amino-acid residues, including V106, at the substrate binding site in TadA8e.
- plasmids encoding each of the resulting sTALED pairs were transfected into HEK 293T cells.
- RNA off-target editing frequencies measured by targeted deep sequencing were in good agreement with those estimated by transcriptome-wide sequencing (FIGs 4(b) and 4(c)).
- a total of 12 TadA8e variants were chosen, which minimized RNA off-target editing efficiencies at the six representative sites, while retaining mtDNA on-target editing efficiencies (FIG. 2(b)).
- Example 3 Engineered TALEDs reduce bystander and off-target editing
- sTALED variants with site-specific mutations at the TadA8e substrate-binding site might also reduce bystander editing at the target site and off-target editing in the mitochondrial genome, because these mutations could potentially fine-tune adenine deaminase activity for DNA substrates in addition to reducing activity for RNA substrates. To test this, base editing frequencies at each nucleotide position were examined. Both sTALED-V28R and -R111S variants specific to the Cox3.1, ND1, and ND6 sites induced A-to-I edits in a narrower window than did the wild-type sTALEDs and sTALED-V106W variants (FIG. 7).
- the wild-type sTALED and sTALED-V106W targeted to the Cox3.1 site induced A-to-I edits at multiple positions, with frequencies of >1.1%, not only in the spacer region between the two TALE-binding sites but also in the TALE-binding sites.
- the Cox3.1-specific sTALED-V28R induced an A-to-I edit at a single position in the spacer region (FIG. 7a).
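The distinction drawn here between edits in the spacer and edits in the TALE-binding sites can be made concrete with a small helper. This is a hypothetical sketch: the function name, the position-to-frequency mapping, and the coordinates in the usage note are assumptions for illustration, and the threshold is left as a required argument because the cutoff used for FIG. 7 is not restated here.

```python
def split_edits_by_region(freqs, spacer_start, spacer_end, threshold_pct):
    """Split edited positions into spacer and non-spacer groups.

    freqs: dict mapping position -> A-to-G editing frequency in %.
    spacer_start, spacer_end: inclusive bounds of the spacer region.
    """
    def edited(freq):
        return freq > threshold_pct

    def in_spacer(pos):
        return spacer_start <= pos <= spacer_end

    spacer_hits = sorted(p for p, f in freqs.items()
                         if edited(f) and in_spacer(p))
    outside_hits = sorted(p for p, f in freqs.items()
                          if edited(f) and not in_spacer(p))
    return spacer_hits, outside_hits
```

For a variant with a narrow editing window, `spacer_hits` would hold a single position and `outside_hits` would be empty; a broader window shows up as several entries in either list.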
- TALEDs inducing single-base substitutions with no or few bystander edits are desired, because the vast majority of pathogenic mitochondrial DNA mutations, responsible for mitochondrial genetic disorders, are single-nucleotide variations rather than multiple-nucleotide variations.
- the number of off-target edits induced in the mitochondrial genome was also reduced by the sTALED variants.
- the ND1-specific, wild-type sTALED caused A-to-I off-target edits at 108 sites in human mitochondrial DNA with frequencies of > 0.1%.
- sTALED-V28R and -R111S induced off-target edits at 14 and 17 sites, respectively, similar to the baseline number (that is, 24) seen in the untreated sample (FIG. 8(a)).
- Whole mitochondrial genome sequencing at day 2 post-transfection was also performed (FIG. 9), when RNA off-target edits were induced at the highest level.
- referring to FIG. 10, it was investigated whether on-target mutations induced by various forms of sTALEDs were stably maintained and how long RNA off-target variations persisted over time.
- RNA edits were heavily induced by sTALEDs and sTALED-V106W variants at day 1 and 2 post-transfection but almost completely disappeared by day 8 post-transfection.
- the new sTALED variants reduced RNA off-target editing frequencies by several fold, compared to sTALEDs or the sTALED-V106W variants, even at day 1 and 2 post-transfection (FIG. 10(g)). Mitochondrial DNA on-target edits induced by the sTALEDs and sTALED-V106W variants were not stably maintained. Thus, the frequencies of mitochondrial DNA on-target edits induced by the sTALEDs and sTALED-V106W variants dropped significantly over time.
- mitochondrial DNA on-target edits were induced by the ND6-specific sTALED and sTALED-V106W with high frequencies of up to 47% and 40%, respectively, at day 1 and 2 post-transfection but with low frequencies of ≤ 10% at day 8 or later (FIG. 10(c)).
- the frequencies of on-target edits induced by sTALED-V28R and -R111S increased or were more stably maintained over time.
- the editing frequencies observed for our new sTALED variants were lower at day 1 and 2 post-transfection but were higher at day 8 or later than those induced by previous versions of sTALEDs (FIG. 10(a)-(c)).
- these observations suggest that sTALEDs and sTALED-V106W variants were cytotoxic, because they induced too many RNA off-target edits at high frequencies, and that mtDNA-edited cells could not divide or died out over time.
- sTALED variants of the present invention could be tolerated, possibly because they avoided RNA off-target editing or did not induce too many bystander edits at the target site and off-target mutations in the mitochondrial genome.
- MTS cell proliferation assays (FIG. 11(a)) were performed to confirm that the sTALEDs and sTALED-V106W variants were indeed cytotoxic, reducing cell viability significantly, compared to the negative control (pEGFP transfection) (FIG. 11(b)). The sTALED variants were tolerated much better, such that cell viability was not reduced at day 4 post-transfection (FIG. 11(c)).
- referring to FIG. 12, it was investigated whether the V28R and R111S mutations in TadA8e could also reduce bystander editing and RNA off-target editing induced by CRISPR RNA-guided ABEs, widely used for nuclear DNA editing.
- on-target and bystander editing frequencies at the TYRO3 site were measured.
- Both ABE8e-V28R and -R111S were as efficient as ABE8e and ABE8e-V106W (ABE8eW) at the target site with editing frequencies of > 30% (FIGs. 12(b) and 12(c)).
- the V28R and R111S variants exhibited a narrowed editing window with efficient editing (up to 37%) at positions 5 and 7 (A5 and A7 in FIG.
- ABE8e and ABE8eW showed a broader editing window with maximum editing at A5 (29% and 30%, respectively) and A7 (29% and 30%, respectively) and substantial bystander editing at A10 (9.3% and 3.1%, respectively).
- RNA off-target editing activities were also assessed.
- ABE8e-V28R and -R111S reduced average RNA off-target editing frequencies measured at a total of 6 sites by 3.8 fold and 2.5 fold, respectively, compared to ABE8e, and 3.1 fold and 2.0 fold, compared to ABE8eW (FIG. 12(d)).
- whole transcriptome sequencing showed that ABE8e-V28R and -R111S reduced the number of RNA off-target edits substantially, compared to ABE8e and ABE8eW (FIGs. 12(e) and 12(f)).
- Example 6 DNA on-target and RNA off-target editing by mTALEDs and sTALEDs
- each TALE unit contains a cytosine deaminase (e.g. DddA tox variant (E1347A)) on one side and an adenine deaminase (e.g. TadA8e) on the other side.
- as shown in FIG. 13(a), it was confirmed that both dTALEDs and mTALEDs targeted to the Cox3 site exhibited on-target activity. Furthermore, it was observed that gene editing did not occur when only the adenine deaminase was present in the TALE unit.
- FIG. 13(b) relates to the results of editing efficiency at six representative sites. Similar to the sTALEDs, a significant reduction in RNA off-target efficiencies was observed for both dTALED and mTALED variants with V28R and R111S compared to TadA8e.
- RNA off-target editing was analyzed.
- the results demonstrate that the TadA variants with V28R and R111S reduced RNA off-target efficiency not only in sTALEDs and other systems such as ABEs but also in dTALEDs and mTALEDs.
- RNA off-target effects were predominantly induced by the TadA adenine deaminase rather than by DNA-binding proteins or the cytosine deaminase.
- the TadA variants can remarkably reduce such undesired RNA off-target effects.
Abstract
The present invention relates to a novel adenine deaminase variant, a fusion protein comprising the adenine deaminase variant, a base editing composition for A-to-G base editing in DNA comprising the fusion protein, and a method for A-to-G base editing in DNA comprising administering the base editing composition to a cell containing a target DNA. The novel adenine deaminase variant can remarkably reduce off-target effects involving unintended base alteration in DNA and/or RNA, and can induce base editing only at a single nucleotide residue without any unintended off-target editing in the target DNA.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR10-2022-0120697 | 2022-09-23 | ||
KR20220120697 | 2022-09-23 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024063273A1 (fr) | 2024-03-28 |
Family
ID=90454517
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/KR2023/009361 WO2024063273A1 (fr) | 2022-09-23 | 2023-07-04 | Novel adenine deaminase variants and base editing method using same |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2024063273A1 (fr) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200308571A1 (en) * | 2019-02-04 | 2020-10-01 | The General Hospital Corporation | Adenine DNA base editor variants with reduced off-target RNA editing |
WO2021050571A1 (fr) * | 2019-09-09 | 2021-03-18 | Beam Therapeutics Inc. | Novel nucleobase editors and methods of using same |
CN114045277A (zh) * | 2021-10-21 | 2022-02-15 | 复旦大学 | Base editor and construction method and application thereof |
WO2022119294A1 (fr) * | 2020-12-01 | 2022-06-09 | 한양대학교 산학협력단 | Adenine base editor lacking cytosine editing activity and use thereof |
Non-Patent Citations (1)
Title |
---|
MOK BEVERLY Y.; DE MORAES MARCOS H.; ZENG JUN; BOSCH DUSTIN E.; KOTRYS ANNA V.; RAGURAM ADITYA; HSU FOSHENG; RADEY MATTHEW C.; PET: "A bacterial cytidine deaminase toxin enables CRISPR-free mitochondrial base editing", NATURE, vol. 583, no. 7817, 8 July 2020 (2020-07-08), pages 631 - 637, XP037200062, DOI: 10.1038/s41586-020-2477-4 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 23868333 Country of ref document: EP Kind code of ref document: A1 |