CN116790566A - Nuclease with novel base editing function and editing system

Info

Publication number: CN116790566A
Application number: CN202310149826.3A
Authority: CN
Inventors: 汪阳明; 米黎
Original assignee: Peking University
Current assignee: Peking University
Priority date: 2023-02-13
Filing date: 2023-02-13
Publication date: 2023-09-22

Abstract

The present application provides an enzyme, Ddd_SS, from Simiaoa sunii that can efficiently deaminate cytosine in a GC sequence context. The application further provides DdBE, an optimized derivative of Ddd_SS. With the enzymes provided by the application, base editing can be performed in a GC context, so that disease-related mutations can be introduced into mitochondrial DNA (mtDNA), which could not be achieved in the prior art.

Description

Nuclease with novel base editing function and editing system

Background

Base editing enables precise mutation of target DNA sequences and helps to analyze the functions of genes and regulatory elements, to model diseases and to develop therapeutics. Base editing in nuclear DNA is achieved with base editing systems based on clustered regularly interspaced short palindromic repeats (CRISPR). Unfortunately, CRISPR base editing systems are currently not suitable for mitochondrial DNA editing because there is no method for delivering guide RNAs into mitochondria. Recently, DddA-derived cytosine base editors (DdCBEs) built on transcription activator-like effectors (TALEs) have been developed to catalyze C→T editing in mitochondrial DNA. These methods rely on DddAtox, a double-stranded DNA deaminase from Burkholderia cenocepacia. The original DddAtox strictly requires a TC sequence context. Through phage-assisted evolution, Mok et al. further obtained DddAtox variants that catalyze C→T editing at TC/AC/CC sequence sites. However, there is currently no DdBE suitable for GC sequence sites. Furthermore, it is unclear whether double-stranded DNA deaminases exist in other species. In this study, the present inventors identified and engineered novel double-stranded DNA deaminase homologs to solve the above problems.

Disclosure of Invention

Drawings

FIG. 1 shows that the C-terminal sequence of the DddAtox protein plays an important role in its dsDNA deamination activity. FIG. 1 (a): boxes indicate sequences similar to the SPKK motif in the reported histone H2B.1 protein sequence of the sea urchin Psammechinus miliaris and in the DddA protein sequence of Burkholderia cenocepacia. FIG. 1 (b): schematic of the protein structures of wild-type DddAtox (WT), a truncation of DddAtox lacking the SPKK-like sequences (Delta), and the truncation fused to an AT-hook sequence (AT-Hook). FIG. 1 (c): the proteins shown in FIG. 1 (b) were mixed at different concentrations with an equal amount of substrate dsDNA, and deamination activity was measured as the proportion of deaminated dsDNA relative to total substrate; data shown are mean ± standard deviation of 3 independent replicates.

FIG. 2 shows a comparison of various dsDNA cytosine deaminases homologous to DddAtox. FIG. 2 (a): summary of the sequence characteristics and activity of the candidate dsDNA cytosine deaminases homologous to DddAtox. #1: ranked from high to low by PSI-BLAST score. #2: +, the C-terminus of the protein contains one sequence similar to the SPKK motif; ++, the C-terminus of the protein contains two sequences similar to the SPKK motif. #3: symbols indicate in vitro deamination activity comparable to DddAtox or (*) higher than DddAtox; weak, deaminated product can be identified, but the average deamination ratio is <5% at a protein concentration of 10 μM. FIG. 2 (b): probability logos of the bases flanking the mutated cytosines in the genome of UNG-knockout E. coli overexpressing Ddd_SS, Ddd_Fa or DddAtox.

FIG. 3 shows the results of constructing a mitochondrial base editor using Ddd_SS. FIG. 3 (a): schematic of DdBE construction. FIG. 3 (b): mitochondrial DNA editing ratio 3 days after transfection of HEK293T cells with ND5.1-DdBE. A_G1397: DddAtox-G1397; SS_N94: Ddd_SS-N94; A_G1333: DddAtox-G1333; SS_N29: Ddd_SS-N29. Data shown are mean ± standard deviation of 3 independent replicates.

FIG. 4 shows editing of pathogenic gene sites by Ddd_SS mutants. FIG. 4 (a): diseases associated with mutations in two mitochondrial genes. FIG. 4 (b): mitochondrial DNA editing ratio 3 days after transfection of HEK293T cells with ND4-DdBE. FIG. 4 (c): mitochondrial DNA editing ratio 3 days after transfection of HEK293T cells with ND6-DdBE. In FIGS. 4 (b) and 4 (c), the gray background indicates the disease-associated site in (a), and gray bases are the TALE-binding portions. Ddd_SS1: Ddd_SS (T26I+T77I+T110I); Ddd_SS2: Ddd_SS (T26I); Ddd_SS3: Ddd_SS (T77I); Ddd_SS4: Ddd_SS (T110I); Ddd_SS5: Ddd_SS (T77I+T110I); Dead-Ddd_SS5: Ddd_SS (E44A+T77I+T110I). Ddd_SS and its mutants used the N94 split site; DddA and its mutants used the G1397 split site. Data shown are mean ± standard deviation of 3 independent replicates.

FIG. 5 shows the expansion of editable sites by DddAtox mutants. FIG. 5 (a): mitochondrial DNA editing ratio 3 days after transfection of HEK293T cells with ND5.1-DdBE. FIG. 5 (b): mitochondrial DNA editing ratio 3 days after transfection of HEK293T cells with ND1-DdBE. Ddd_SS and its mutants used the N94 split site; DddA and its mutants used the G1397 split site. Data shown are mean ± standard deviation of 3 independent replicates.

Detailed Description

The molecular biological methods used in the present application, unless otherwise indicated, are all experimental methods well known to those skilled in the art. The reagents used, unless otherwise indicated, are commercially available and common. Thus, the experiments of the present application can be repeated entirely and the same results can be obtained by those skilled in the art upon reading the following description.

Test method

Strains and culture conditions

All strains were grown at 37 °C in Luria-Bertani (LB) medium or on LB agar. Kanamycin (50 mg/L), ampicillin (100 mg/L), L-arabinose (2 g/L) and IPTG (0.5 mM) were added to the medium as required. E. coli DH5α, BL21 (DE3) and BW25113 Δung were used, respectively, for plasmid construction and propagation, for protein expression, and for heterologous expression of candidate deaminases to determine substrate preference.

Plasmid construction

To construct deaminase expression plasmids, deaminase genes and the corresponding immunity protein genes synthesized by Azenta Life Sciences (Suzhou) were cloned into MCS-1 (BamHI and NotI sites, introducing an N-terminal hexahistidine tag) and MCS-2 (NdeI and XhoI sites) of pCOLADuet-1. For expression of deaminases in E. coli BW25113 Δung, DddAtox, Ddd_SS and Ddd_Fa were cloned downstream of the araBAD promoter in pBAD, and the corresponding immunity protein gene, driven by the T7 promoter, was cloned downstream of the deaminase gene on the same plasmid. To assemble the DdBEs, TALE arrays were assembled from tetrameric templates by Golden Gate cloning, then digested with NdeI and BamHI and ligated with the split deaminase, UGI and the other DdBE sequences. The DdBE constructs in this study contained a mitochondrial targeting sequence from the TXN2 gene, a Flag/HA tag, a TALE array, a split deaminase and UGI. Except for L-A1397N and L-SS94N in FIGS. 2b and 2c, all DdBEs were fused to GFP through a self-cleaving T2A sequence. The DdBEs were then cloned into a piggyBac vector under the control of the CAGGS promoter.

Protein purification for in vitro DNA deamination assays

For expression of the deaminases, pCOLADuet-1 carrying a deaminase gene and its corresponding immunity protein gene was transformed into BL21 (DE3). Single colonies were picked for verification and expansion. Cultures at an OD600 of 0.6-0.8 were induced with 0.5 mM IPTG and protein was expressed at 18 °C overnight. Cells were harvested, resuspended in lysis buffer (50 mM Tris-HCl pH 7.5, 500 mM NaCl, 5% glycerol, 20 mM imidazole, 5 mM 2-mercaptoethanol and 1 mM PMSF) and lysed by sonication, and the supernatant was separated by centrifugation in a JA-25.50 rotor (Beckman) at 18,000 r.p.m. for 30 minutes. The deaminase-immunity protein complex was purified from the cell lysate by nickel affinity chromatography using 1 mL of Ni Sepharose 6 Fast Flow agarose beads loaded into a gravity-flow column (GE Healthcare). The supernatant was loaded onto the column and the resin was washed with 10 mL of wash buffer (50 mM Tris-HCl pH 7.5, 500 mM NaCl, 20 mM imidazole and 5 mM 2-mercaptoethanol). The deaminase-immunity protein complex was eluted with 3 mL of elution buffer (50 mM Tris-HCl pH 7.5, 300 mM imidazole, 500 mM NaCl and 5 mM 2-mercaptoethanol), and the deaminase was then separated from the complex by denaturation and renaturation. For denaturation, the eluted protein sample was added to 25 mL of 6 M guanidine hydrochloride denaturation buffer (50 mM Tris-HCl pH 7.5, 20 mM imidazole, 500 mM NaCl and 5 mM 2-mercaptoethanol) and incubated at 4 °C for 1 hour. The protein in 6 M guanidine hydrochloride buffer was loaded onto a gravity-flow column containing 1 mL of Ni Sepharose 6 Fast Flow agarose beads. The column was washed with 10 mL of 6 M guanidine hydrochloride buffer to remove any remaining immunity protein. While the deaminase was still bound to the Ni agarose beads, the column was washed sequentially with 8 mL of denaturation buffer containing 10 μM ZnCl2 and decreasing concentrations of guanidine hydrochloride (5 M, 4 M, 3 M, 2 M, 1 M) to renature the protein, and finally with wash buffer to remove residual guanidine hydrochloride. The column-bound protein was then eluted with 3 mL of elution buffer. The eluted deaminase was further purified by size exclusion chromatography on a Superdex 75 column (GE Healthcare) in sizing buffer (20 mM Tris-HCl pH 7.5, 200 mM NaCl, 5 mM 2-mercaptoethanol and 5% glycerol). Fraction purity was assessed by SDS-PAGE with Coomassie blue staining, and the highest-quality fractions were stored at -80 °C.

DNA deamination assay

The DNA deamination assay was performed essentially as described previously, with some modifications. DNA substrates carrying a 6-FAM fluorophore at the 5' end for visualization were purchased from Sangon Biotech (Shanghai, China). To generate double-stranded DNA substrates, the unmodified reverse-complement oligonucleotide was annealed at an equimolar concentration to the 6-FAM-labeled oligonucleotide. Reactions were carried out in 10 μL of deamination buffer containing 20 mM Tris-HCl pH 7.5, 200 mM NaCl, 5 mM 2-mercaptoethanol and 1 μM substrate. Reactions were incubated at 37 °C for 1 hour, then 5 μL of UDG reaction solution (New England Biolabs, 0.02 U/μL UDG in 1× UDG buffer) was added and incubation continued at 37 °C for another 30 minutes. Cleavage at the abasic sites generated by UDG-mediated removal of uracil from the substrate was induced by adding 100 mM NaOH and incubating at 95 °C for 2 min. Reactions were analyzed by electrophoresis on a 20% acrylamide, 8 M urea gel in 1× TBE buffer, and the 6-FAM signal was detected by fluorescence imaging on a ChemiDoc MP imaging system (Bio-Rad). The percentage of deamination was quantified with ImageJ.
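The quantification step can be illustrated with a small calculation. The following is a minimal sketch, assuming the percent deamination is taken as the cleaved-product band intensity divided by the sum of the product and uncleaved-substrate band intensities in one lane, in line with the description of FIG. 1 (c). The function name and the intensity values are hypothetical, and ImageJ itself is not invoked.

```python
from statistics import mean, stdev


def percent_deamination(product_intensity: float, substrate_intensity: float) -> float:
    """Return the deaminated fraction (%) from two band intensities of one gel lane."""
    total = product_intensity + substrate_intensity
    if total <= 0:
        raise ValueError("lane has no measurable signal")
    return 100.0 * product_intensity / total


# Hypothetical background-subtracted intensities (product, substrate)
# for three replicate lanes; these are not data from the application.
lanes = [(1200.0, 1800.0), (1100.0, 1900.0), (1300.0, 1700.0)]
values = [percent_deamination(p, s) for p, s in lanes]

# Summarize replicates as mean and sample standard deviation.
summary = (mean(values), stdev(values))
```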

Single Nucleotide Variation (SNV) analysis

To obtain genomic DNA, E. coli BW25113 Δung strains expressing candidate deaminases were inoculated into 20 mL of LB broth at a 1:100 dilution, grown to an OD600 of about 0.6, and then treated with 2 g/L L-arabinose for 1 hour to induce deaminase expression. Bacterial genomic DNA was extracted from 3 mL of culture using the FastPure Blood/Cell/Tissue/Bacteria DNA Isolation Mini Kit (Vazyme, DC112). The yield was quantified using Qubit (ThermoFisher Scientific). Sequencing libraries were constructed using the VAHTS Universal Plus DNA Library Prep Kit for Illumina V2 (Vazyme, ND627) according to the manufacturer's instructions, except that the VAHTS HiFi Amplification Mix was replaced with KAPA HiFi HotStart Uracil+ ReadyMix (KAPA Biosystems, KK2801) to allow efficient amplification of templates containing uracil. Library concentration and quality were assessed by Qubit and 1% agarose gel electrophoresis. Sequencing was performed on the Illumina NovaSeq 6000 system (Novogene), and reads were mapped to the reference genome (NC_000913.3) using BWA (version 0.7.17). Duplicates were removed using Picard tools (version 2.18.29). Pileup data were generated from the alignments using SAMtools (version 1.14), and variant calling was performed using VarScan (version 2.4.4). The SNV thresholds were set to: variant frequency >0.01, coverage >50 reads per base and p-value <0.01. Probability logos of the regions flanking the modified bases were generated using the WebLogo online tool (https://weblogo.berkeley.edu).
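The SNV thresholds above can be expressed as a simple filter. The following is a minimal sketch, assuming each called variant has already been parsed into a record; the `Snv` class, its field names and the example records are hypothetical and do not reproduce VarScan's output format. Keeping G>A calls alongside C>T calls is also an assumption, made because a cytosine deaminated on the opposite strand is reported as G>A on the reference strand.

```python
from dataclasses import dataclass


@dataclass
class Snv:
    position: int             # 1-based reference coordinate
    ref: str                  # reference base
    alt: str                  # variant base
    variant_frequency: float  # fraction of reads supporting the variant
    coverage: int             # reads covering this base
    p_value: float            # significance reported by the variant caller


def passes_thresholds(snv: Snv, min_freq: float = 0.01,
                      min_cov: int = 50, max_p: float = 0.01) -> bool:
    """Apply the stated thresholds: frequency >0.01, coverage >50, p <0.01."""
    return (snv.variant_frequency > min_freq
            and snv.coverage > min_cov
            and snv.p_value < max_p)


def is_cytosine_deamination(snv: Snv) -> bool:
    """C>T on the reference strand, or G>A when the edited C is on the other strand."""
    return (snv.ref, snv.alt) in {("C", "T"), ("G", "A")}


# Hypothetical calls for illustration only.
calls = [
    Snv(1001, "C", "T", 0.12, 230, 1e-6),   # kept
    Snv(2040, "G", "A", 0.03, 95, 5e-4),    # kept
    Snv(3100, "C", "T", 0.005, 400, 1e-3),  # frequency too low
    Snv(4500, "C", "T", 0.20, 40, 1e-5),    # coverage too low
    Snv(5200, "A", "G", 0.30, 300, 1e-8),   # not a C deamination signature
]
kept = [s for s in calls if passes_thresholds(s) and is_cytosine_deamination(s)]
```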

Cell culture and transfection

HEK293T cells were cultured in high-glucose DMEM medium (Hyclone, D6429) supplemented with 10% FBS (PANSera, 2602-P130707) at 37 °C under 5% CO2. To edit the mitochondrial genome, 8×10⁴ HEK293T cells were seeded on poly-D-lysine (PDL)-coated 24-well plates approximately 20 hours before transfection. Cells were transfected using jetPRIME transfection reagent (Polyplus-transfection, PT-114-75); the amount of each DdBE monomer was 250 ng, for a total of 500 ng of plasmid DNA. Cells were harvested 72 h after transfection and genomic DNA was extracted. The regions containing the editing sites of interest were then PCR-amplified for Sanger sequencing (Azenta Life Sciences) or for library preparation for next-generation sequencing. EditR was used to evaluate base editing efficiency from Sanger sequencing data.

Target amplicon sequencing and analysis

For target amplicon sequencing, the region of interest was first amplified in a round 1 PCR using KAPA HiFi HotStart Uracil+ ReadyMix (KAPA Biosystems, KK2801), and the round 1 product was then amplified again in a round 2 PCR using the VAHTS RNA Multiplex Oligos Set for Illumina (Vazyme, N323) to add Illumina indexes. qPCR was used to set the PCR cycle number at the top of the linear range in order to minimize amplification bias. For example, the inventors used 100 ng of whole-cell genomic DNA as the initial template, with 12 cycles for the round 1 PCR and 10 cycles for the round 2 PCR. The product was further purified with 1× VAHTS DNA Clean Beads (Vazyme, N411), and the library was then sequenced on the Illumina NovaSeq 6000 system. Target amplicon sequencing analysis was performed with CRISPResso2. The output file "Nucleotide_percentage_table.txt" was used to quantify the editing frequency.
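How an editing frequency follows from per-base read percentages can be sketched as below. This is a minimal illustration, assuming the percentages for one reference-C position have already been read into a dictionary. The function name and the replicate values are hypothetical, and parsing of the CRISPResso2 output file is deliberately left out because its exact layout is not described here.

```python
from statistics import mean, stdev


def c_to_t_percent(base_percent: dict) -> float:
    """C-to-T editing (%) at one reference-C position.

    base_percent maps 'A', 'C', 'G', 'T' to the percentage of reads
    carrying that base; other keys (e.g. 'N', '-') are ignored.
    """
    called = sum(base_percent.get(b, 0.0) for b in "ACGT")
    if called <= 0:
        raise ValueError("no base calls at this position")
    return 100.0 * base_percent.get("T", 0.0) / called


# Hypothetical percentages at one target cytosine in three replicates.
replicates = [
    {"A": 0.0, "C": 62.0, "G": 0.0, "T": 38.0},
    {"A": 0.0, "C": 60.0, "G": 0.0, "T": 40.0},
    {"A": 0.0, "C": 58.0, "G": 0.0, "T": 42.0},
]
freqs = [c_to_t_percent(r) for r in replicates]

# Report as mean and standard deviation of the independent replicates.
summary = (mean(freqs), stdev(freqs))
```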

Statistics

Unless otherwise indicated, data are expressed as mean ± standard deviation, and statistical analysis was performed using GraphPad Prism 9.

Example 1. Modification of DddAtox

From analysis of the amino acid sequence of DddAtox, the inventors noted that its C-terminus comprises two SPKK-related peptide motifs (FIG. 1 (a)), which are known to preferentially bind A/T-rich sequences in the minor groove of double-stranded DNA. Deletion of these two SPKK-related motifs completely abolished the double-stranded DNA deaminase activity of DddAtox, while addition of an AT-hook, which has DNA binding properties similar to those of the SPKK-related motif, restored the deaminase activity of the truncated DddAtox (FIGS. 1 (b) and 1 (c)). These data indicate that the SPKK-related motifs at the C-terminus of DddAtox are important for its double-stranded DNA deamination activity.

Next, the inventors searched for homologs of DddAtox using PSI-BLAST in the MPI Bioinformatics Toolkit. The inventors searched the non-redundant (NR) protein database (nr50_1_Nov, 2021), iterating until no new sequences appeared, and finally identified 555 candidate homologous sequences. The inventors then selected 8 candidate proteins, 4 with an SPKK-related motif and 4 without, and tested their deaminase activity on a double-stranded DNA substrate (FIG. 2 (a)). As can be seen from FIG. 2 (a), all 4 proteins containing an SPKK-related motif showed deamination activity comparable to or higher than that of DddAtox. In contrast, all 4 proteins without an SPKK-related motif showed little or no deamination activity. These results indicate that the SPKK-related motif can be used to identify highly active double-stranded DNA deaminases.

The four protein sequences with the SPKK-associated motifs are as follows:

Ddd_SS, from Simiaoa sunii:

MSLPEYDGTTTHGVLVLDDGTQIGFTSGNGDPRYTNYRNNGHVEQKSALYMRENNISNATVYHNNTNGTCGYCNTMTATFLPEGATLTVVPPENAVANNSRAIDYVKTYTGTSNDPKISPRYKGN(SEQ ID NO：1)

Ddd_Ru, from Ruminococcus sp. MSJ-25:

VLPKYDGKTTEGVMVTPDGKQISFKSGNSSTPSYPQYKAQSASHVEGKAALYMRENGINEATVFHNNPNGTCGFCDRQVPALLPKGAKLTVVPPSNSVANNVRAIPVPKTYIGNSTVPKIK(SEQ ID NO：2)

Ddd_Fa, from Falcatimonas sp. MSJ-15:

SINLPEYDGKTTHGVLVLDDGTQVPFSSGNANPNYKNYIPASHVEGKSAIYMRENGINNGTVFHNNTDGTCPYCDKMLPTLLEEGSTLTVVPPANANAPKPSWVDTVKTYIGNDKIPKKPK(SEQ ID NO：3)

Ddd_Ca, from Chondromyces apiculatus:

MGNTLPGWDGGKTQGWFVYPDGTERHLISGYDGPSKFTQGIPGMNGNIKSHVEAHAAALMRQYELSKATLYINRVPCPGVRGCDALLARMLPEGVQLEIIGPNGFKKTYTGLPDPKLKPKGCS(SEQ ID NO：4)

In the deamination assay, the inventors noted that the double-stranded DNA deaminase from Simiaoa sunii (Ddd_SS) produced multiple bands, indicating that it has broad deaminase activity in non-TC contexts. Indeed, whole-genome sequencing of the UNG-deficient E. coli strain expressing Ddd_SS showed that it preferentially deaminates at AC/GC/TC sequence sites, whereas another double-stranded DNA deaminase, Ddd_Fa, deaminated at AC/CC/TC sequence sites with a slightly higher preference for TC, and the original DddAtox had a very strong preference for TC sequence sites (FIG. 2 (b)). Because Ddd_SS deaminates at GC sequence sites, which DddAtox and its derived variants do not, and showed the highest deamination activity among all deaminases tested, including DddAtox (compare the product yields of the deaminases at 0.5 μM), the inventors focused on Ddd_SS in subsequent studies.
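The sequence preference described above comes down to tallying which base sits immediately 5' of each deaminated cytosine. The following is a minimal sketch of such a tally, assuming C>T and G>A substitutions have already been called against a reference. The toy reference string, the variant list and the function name are hypothetical; G>A calls are read as deamination of the cytosine on the opposite strand, so their 5' neighbor is the complement of the next reference base.

```python
from collections import Counter
from typing import Optional

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def dinucleotide_context(genome: str, pos: int, ref: str, alt: str) -> Optional[str]:
    """Return 'NC' for the edited cytosine, where N is its 5' neighbor.

    pos is a 0-based index into the reference-strand sequence.
    """
    if genome[pos] != ref:
        raise ValueError("reference base does not match the genome")
    if (ref, alt) == ("C", "T") and pos > 0:
        return genome[pos - 1] + "C"
    if (ref, alt) == ("G", "A") and pos + 1 < len(genome):
        # The edited C is on the opposite strand; its 5' neighbor pairs
        # with the reference base at pos + 1.
        return COMPLEMENT[genome[pos + 1]] + "C"
    return None


# Toy reference and hypothetical variant calls, for illustration only.
genome = "ATGCAGCTTCGA"
variants = [(3, "C", "T"), (6, "C", "T"), (9, "C", "T"), (5, "G", "A"), (10, "G", "A")]

contexts = Counter()
for pos, ref, alt in variants:
    ctx = dinucleotide_context(genome, pos, ref, alt)
    if ctx is not None:
        contexts[ctx] += 1
```

The resulting counts per context (AC, CC, GC, TC) are the kind of input from which a probability logo of the flanking position could be drawn.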

Example 2. Base editing of mitochondrial DNA by Ddd_SS constructs

In this example, the inventors tested Ddd_SS for base editing of target mitochondrial DNA by fusing each half of Ddd_SS to a TALE array protein carrying a mitochondrial targeting sequence (MTS). Based on alignment of the ColabFold-predicted Ddd_SS structure with the DddAtox crystal structure, N29 and N94 of Ddd_SS correspond to the optimal split sites G1333 and G1397 reported for DddAtox (ref. 6). Following the original DdCBE study, the inventors designed the following mitochondrial DdBE containing Ddd_SS (DdBE_SS): a pair of mitochondrial TALEs each linked to one half of a split Ddd_SS, comprising an MTS, a TALE array, a Ddd_SS half split at the N29 or N94 site, and a UGI protein. The C-terminal half and the N-terminal half of Ddd_SS were fused to the right and left TALEs, respectively (FIG. 3 (a)).
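The split at N29 or N94 can be sketched on the Ddd_SS sequence of SEQ ID NO: 1. This is a minimal illustration, assuming that residues are numbered with the initial Met as residue 1 (under which positions 29 and 94 are both Asn in SEQ ID NO: 1) and that the named residue stays with the N-terminal half, as is commonly described for the DddAtox G1333 and G1397 splits. The function name is hypothetical, and linkers, TALE arrays, MTS and UGI are not modeled.

```python
# Ddd_SS amino acid sequence as listed for SEQ ID NO: 1.
DDD_SS = (
    "MSLPEYDGTTTHGVLVLDDGTQIGFTSGNGDPRYTNYRNNGHVEQKSALY"
    "MRENNISNATVYHNNTNGTCGYCNTMTATFLPEGATLTVVPPENAVANNS"
    "RAIDYVKTYTGTSNDPKISPRYKGN"
)


def split_after(seq: str, residue: str, position: int) -> tuple:
    """Split seq into (N-half, C-half) after the 1-based position.

    Raises if the residue at that position is not the expected one,
    which guards against an off-by-one in the numbering.
    """
    if seq[position - 1] != residue:
        raise ValueError(f"expected {residue} at position {position}, "
                         f"found {seq[position - 1]}")
    return seq[:position], seq[position:]


# Assumption: the split residue itself belongs to the N-terminal half.
n94_n_half, n94_c_half = split_after(DDD_SS, "N", 94)
n29_n_half, n29_c_half = split_after(DDD_SS, "N", 29)
```

Checking the expected residue before slicing makes a numbering mismatch fail loudly instead of silently producing shifted halves.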

The inventors first tested DdBE_SS targeting MT-ND5, which encodes subunit 5 of NADH dehydrogenase in Complex I. The inventors found that DdBE_SS achieves C→T editing at the poly-C site at a level comparable to DdBEs containing DddAtox, but with a different sequence preference. Importantly, DdBE_SS with the N94 split site (DdBE_SS_N94) edited C6 at the GC site with an efficiency of about 40%, whereas DdBE_A with the G1397 split site (DdBE_A_G1397) edited C6 with an efficiency of only about 8%. In addition, the editing efficiency of DdBE_SS_N94 for C7 at the GC site was about 33 times higher than that of DdBE_A_G1397. At this MT-ND5.1 site, the DdBE_SS_N29 construct edited the two GC sites less efficiently than the DdBE_SS_N94 construct (6.1% for C6 and 5.2% for C7), but still significantly more efficiently than the DdBE_A_G1333 construct (less than 0.3% at both sites) (FIG. 3 (b)). These results indicate that the inventors successfully used Ddd_SS to engineer a DdBE that edits mtDNA at GC sites far more efficiently than DddAtox. It should be noted that, because the base editor edits only one strand of dsDNA, the editing efficiency measured by the above method is theoretically at most 50%.

To test the versatility of DdBE_SS in mitochondrial DNA, the inventors subsequently constructed DdBE_SS targeting MT-ATP6. The inventors observed editing efficiencies of up to 38% for DdBE_SS_N94 and up to 24% for DdBE_SS_N29 (FIGS. 2c and 3). Encouraged by these results, the inventors subsequently tested editing at 8 additional sites in 6 mitochondrial genes (Table 2), where DdBE_SS_N94 showed editing efficiencies of about 7-42% (FIGS. 4a-4c). Taken together, these results indicate that DdBE_SS is a versatile and efficient mitochondrial DNA editing tool.

Example 3. Editing of pathogenic mutation sites by DdBE_SS

In this example, the inventors introduced two mutations at GC sites of MT-ND4 and MT-ND6 that are associated with the human diseases Leber hereditary optic neuropathy (LHON) and Leigh syndrome (FIG. 2d), both of which are devastating genetic diseases with no effective treatment currently available. DdBE_SS_N94 produced an editing efficiency of 10% for MT-ND4 (C8) (FIG. 4 (b)) and about 20% for MT-ND6 (C8) (FIG. 4 (c)). To further increase the efficiency of DdBE_SS_N94, the present inventors mutated three sites, T26I, T77I and T110I (corresponding to S1330I, T1380I and T1413I of DddAtox, respectively). Notably, the DdBE_SS_N94 mutant carrying the T77I and T110I mutations (DdBE_SS5) had significantly improved editing efficiency at the disease-associated target sites: about 25% for MT-ND4 (C8) (FIG. 4 (b)) and about 30% for MT-ND6 (C8) (FIG. 4 (c)). These results indicate that the optimized DdBE obtained by the present inventors from Ddd_SS can introduce disease-related mtDNA mutations at GC sites, which was not previously possible.

The amino acid sequences of the Ddd_SS mutants are shown below:

Ddd_SS1:Ddd_SS(T26I+T77I+T110I)；

SLPEYDGTTTHGVLVLDDGTQIGFISGNGDPRYTNYRNNGHVEQKSALYMRENNISNATVYHNNTNGTCGYCNTMIATFLPEGATLTVVPPENAVANNSRAIDYVKTYIGTSNDPKISPRYKGN(SEQ ID NO：5)

Ddd_SS2:Ddd_SS(T26I)；

SLPEYDGTTTHGVLVLDDGTQIGFISGNGDPRYTNYRNNGHVEQKSALYMRENNISNATVYHNNTNGTCGYCNTMTATFLPEGATLTVVPPENAVANNSRAIDYVKTYTGTSNDPKISPRYKGN(SEQ ID NO：6)

Ddd_SS3:Ddd_SS(T77I)；

SLPEYDGTTTHGVLVLDDGTQIGFTSGNGDPRYTNYRNNGHVEQKSALYMRENNISNATVYHNNTNGTCGYCNTMIATFLPEGATLTVVPPENAVANNSRAIDYVKTYTGTSNDPKISPRYKGN(SEQ ID NO：7)

Ddd_SS4:Ddd_SS(T110I)；

SLPEYDGTTTHGVLVLDDGTQIGFTSGNGDPRYTNYRNNGHVEQKSALYMRENNISNATVYHNNTNGTCGYCNTMTATFLPEGATLTVVPPENAVANNSRAIDYVKTYIGTSNDPKISPRYKGN(SEQ ID NO：8)

Ddd_SS5:Ddd_SS(T77I+T110I)；

SLPEYDGTTTHGVLVLDDGTQIGFTSGNGDPRYTNYRNNGHVEQKSALYMRENNISNATVYHNNTNGTCGYCNTMIATFLPEGATLTVVPPENAVANNSRAIDYVKTYIGTSNDPKISPRYKGN(SEQ ID NO：9)

Dead-Ddd_SS5:Ddd_SS(E44A+T77I+T110I)，

SLPEYDGTTTHGVLVLDDGTQIGFTSGNGDPRYTNYRNNGHVAQKSALYMRENNISNATVYHNNTNGTCGYCNTMIATFLPEGATLTVVPPENAVANNSRAIDYVKTYIGTSNDPKISPRYKGN(SEQ ID NO：10)
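The relationship between the mutation names and the listed sequences can be checked programmatically. The following is a minimal sketch, assuming residues are numbered on SEQ ID NO: 1 with the initial Met as residue 1; under that numbering T26, E44, T77 and T110 match the wild-type sequence, and the mutant sequences listed above appear to omit the initial Met. The function name is hypothetical.

```python
import re

# Wild-type Ddd_SS as listed for SEQ ID NO: 1.
DDD_SS = (
    "MSLPEYDGTTTHGVLVLDDGTQIGFTSGNGDPRYTNYRNNGHVEQKSALY"
    "MRENNISNATVYHNNTNGTCGYCNTMTATFLPEGATLTVVPPENAVANNS"
    "RAIDYVKTYTGTSNDPKISPRYKGN"
)


def apply_point_mutations(seq: str, mutations: list) -> str:
    """Apply mutations written like 'T77I' (wild type, 1-based position, new residue)."""
    residues = list(seq)
    for name in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", name)
        if match is None:
            raise ValueError(f"cannot parse mutation {name!r}")
        wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
        # Verify the wild-type residue before substituting, to catch numbering errors.
        if residues[position - 1] != wild_type:
            raise ValueError(f"{name}: found {residues[position - 1]} at {position}")
        residues[position - 1] = new
    return "".join(residues)


ddd_ss5 = apply_point_mutations(DDD_SS, ["T77I", "T110I"])
dead_ddd_ss5 = apply_point_mutations(DDD_SS, ["E44A", "T77I", "T110I"])

# Drop the initial Met to compare with the way the mutants are listed above.
ddd_ss5_listed_form = ddd_ss5[1:]
```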

Example 4. Expansion of editable sites by DddAtox mutants

DdBE_SS has a different sequence preference than DdBE_A. Previous studies on the C→U RNA editing enzyme APOBEC have shown that loop sequences are important for the sequence preference of the enzyme. To identify the sequences that determine the different sequence preferences of Ddd_SS and DddAtox, the inventors separately mutated three loop sequences around the DddAtox active site according to the corresponding Ddd_SS loop sequences. Again using the MT-ND5.1 site as the target, the DdBE_A mutant carrying the loop 2 mutation (E1370N) showed an improvement in editing efficiency of about 3.2-fold at C6 (GC site), about 2.0-fold at C13 (AC site) and about 2.3-fold at C14 (CC site) (FIG. 5 (b)). In contrast, the loop 3 mutation had little effect on the activity of DdBE_A, and the loop 1 mutation slightly decreased the activity of DdBE_A (FIG. 5 (a)). Furthermore, the DdBE_A mutant carrying the loop 2 mutation (E1370N) also achieved higher editing efficiency at the MT-ND1 and MT-ND5.2 sites than the original DdBE_A (FIG. 5 (b)). Taken together, these results demonstrate that the sequence compatibility and editing efficiency of DdCBEs can be rationally optimized by exchanging sequences between different homologs.

From the above examples, the inventors of the present application found that the SPKK-related motif at the C-terminus of the protein is important for the DNA deamination efficiency of DddAtox. Furthermore, the present inventors identified many DddAtox homolog candidates by PSI-BLAST and identified four homologs with double-stranded DNA deaminase activity, all of which have an SPKK-related motif. The present inventors then constructed a number of DdCBEs from Ddd_SS that efficiently edit 14 mitochondrial DNA sites in 10 mitochondrial genes. Importantly, the constructed DdBE_SS variants achieved efficient C→T editing at GC sites of mitochondrial DNA, which could not previously be achieved. Finally, by introducing Ddd_SS-derived mutations into DddAtox, the inventors constructed a DdBE_A with wider sequence compatibility and higher editing efficiency. In summary, the present application provides previously unachievable C→T editing at GC sites of mitochondrial DNA and opens the possibility of further screening and designing mtDNA base editors with higher efficiency and wider sequence compatibility.

Claims

1. An enzyme having deamination activity toward cytosine at GC sequence sites of double-stranded DNA in mitochondria, which is a protein as described in (a) or (b) below:

(a) a protein consisting of the amino acid sequence shown in SEQ ID NO: 1;

(b) a protein derived from the amino acid sequence shown in SEQ ID NO: 1 and having deamination activity toward cytosine at GC sequence sites of double-stranded DNA in mitochondria.

2. A base editing system for deamination of cytosine at GC sequence sites of double-stranded DNA in mitochondria, comprising an N-terminal half and a C-terminal half of the enzyme of claim 1 linked, respectively, to a pair of transcription activator-like effectors (TALEs).

3. The base editing system of claim 2, further comprising a uracil DNA glycosylase inhibitor UGI protein.

4. The base editing system according to claim 2, wherein the enzyme according to claim 1 is split in two from its N94 site or N29 site.

5. The base editing system of claim 2, wherein the enzyme of claim 1 further comprises a point mutation of T77I and/or T110I.