CN116410963A - Mini gene editing system for realizing efficient base transversion and application thereof

Info

Publication number: CN116410963A
Application number: CN202111641225.1A
Authority: CN
Inventors: 陈亮; 李大力; 汝高盟; 李长青; 高弘毅; 刘明耀
Original assignee: East China Normal University; Bioray Laboratories Inc
Current assignee: East China Normal University; Bioray Laboratories Inc
Priority date: 2021-12-29
Filing date: 2021-12-29
Publication date: 2023-07-11

Abstract

The invention discloses a mini gene editing system for efficient base transversion and applications thereof. Starting from the single-base editing tool ABE8e, amino acid substitutions were introduced at the catalytic active site of the adenine deaminase TadA-8e to obtain 30 single-point mutants. Among them, ABE8e-N46L (named ADS-BE1) is the smallest CGBE to date: it induces precise C-G to G-C base transversion editing at positions 5-7 and completely eliminates the original ABE function (editing of adenine A). Compared with the conventional CGBE series, ADS-BE1 shows higher editing activity and fewer off-target events, and is therefore very safe. The ADS-BE1 base editor is a technical improvement in the field of single-base gene editing and can greatly promote applications in gene editing, precision medicine, disease model construction, crop genetic breeding and the like.

Description

Mini gene editing system for realizing efficient base transversion and application thereof

Technical Field

The invention relates to the technical field of gene editing, in particular to a mini gene editing system for realizing efficient base transversion and application thereof.

Background

About 60% of human genetic diseases are caused by single-base mutations, and the conventional approach of correcting them by homologous recombination mediated by genome editing is very inefficient (0.1%-5%) ^[1,2]. Single-base editors derived from the CRISPR system are an emerging, highly efficient base editing technology. Because they require neither a DNA double-strand break nor a recombination template and edit efficiently, they have great prospects in basic research and clinical disease treatment.

Classical base editors are mainly divided into cytosine base editors (CBE) and adenine base editors (ABE). The former consists of spCas9n derived from Streptococcus pyogenes, the rat-derived cytosine deaminase rAPOBEC1 and a uracil glycosylase inhibitor. The Cas9 protein recognizes and specifically binds DNA with NGG as the PAM; then, through the action of the deaminase and DNA repair, a C·G-to-T·A substitution is produced within the 20 bp targeting sequence upstream of the NGG (positions 21-23), with the editing window mainly at positions 4-8 ^[2]. The latter fuses bacterial TadA to spCas9; with the aid of directed evolution and protein engineering, 7 rounds of evolution yielded the adenine base editor ABE7.10, which acts on single-stranded DNA. Its active editing window is mainly at positions 4-7, and its average efficiency of A·T-to-G·C conversion in human cells is about 53%, far higher than that of base mutation mediated by homologous recombination, with product purity as high as 99.9% and very few indels (insertions and deletions) ^[3]. During CBE development, it was found that the uracil glycosylase UNG affects the formation of C-to-G products ^[4]. The CGBE series was therefore developed by fusing UNG ^[5,6], or repair factors such as UNG variants or the base excision repair protein XRCC1 ^[7], to a CBE, so as to cause C-G to G-C base transversions in specific regions of DNA.

However, because several fusion proteins are incorporated, the existing CGBE series forms large and complex constructs that hinder later gene therapy applications. Moreover, the editing range of CGBEs fused with repair factors can cover positions 3-9, so precise targeting and efficient editing cannot both be achieved ^[7,8]. Furthermore, CGBE is derived from CBE, which produces more off-target edits on the genome than ABE, so its safety is a concern. Meanwhile, several research groups have reported that ABE induces low-frequency cytosine editing in addition to efficiently mediating A-to-G editing ^[5,9-11].
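The position numbering used above (protospacer positions 1-20, PAM at positions 21-23) can be made concrete with a minimal sketch that lists the cytosines falling inside a given editing window. The function name and the example protospacer are hypothetical and chosen only for illustration; they are not taken from the patent.

```python
def cytosines_in_window(protospacer: str, window: tuple = (5, 7)) -> list:
    """Return the 1-based protospacer positions that hold a C inside the window.

    Positions count from the PAM-distal end, so the PAM occupies 21-23.
    """
    seq = protospacer.upper()
    if len(seq) != 20:
        raise ValueError("expected a 20-nt protospacer")
    start, end = window
    return [pos for pos in range(start, end + 1) if seq[pos - 1] == "C"]


# Hypothetical protospacer, not one of the targets in Table 2.
example = "GATCCCACTTAGCATGCAAT"
print(cytosines_in_window(example))          # narrow window, positions 5-7
print(cytosines_in_window(example, (3, 9)))  # wider window, positions 3-9
```

With the same protospacer, a wider window exposes more cytosines to editing, which is the bystander problem described for repair-factor-fused CGBEs.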

Disclosure of Invention

The invention solves the technical problems that:

To overcome the shortcomings of prior-art base editors, the invention takes the adenine deaminase TadA-8e (only 167 amino acids) as its entry point. Through analysis of the TadA-8e crystal structure and rational design, a "molecular switch" that converts ABE into CGBE was identified (shown in FIG. 1): it completely eliminates the adenine editing capability of ABE and greatly improves editing of cytosine as a substrate. After multiple rounds of screening and engineering, the ABE-derived ADS-BE1 (the smallest CGBE to date) was developed, which induces precise C-G to G-C base transversion editing at positions 5-7. Compared with the CGBE series derived from conventional cytosine deaminases, ADS-BE1 shows higher editing activity and fewer off-target events, and is therefore very safe.

The technical scheme of the invention is as follows:

The first technical scheme provided by the invention is as follows: an adenine deaminase having an amino acid difference at one or more of positions 28, 30 and 46 compared with the amino acid sequence shown in SEQ ID NO. 2; preferably at position 46.

In some preferred embodiments, amino acid residue V at position 28 of the adenine deaminase is replaced by A or G; amino acid residue V at position 30 is replaced by F; amino acid residue N at position 46 is replaced by A, G, L or P, preferably L or P;

preferably, the adenine deaminase comprises a nuclear localization signal sequence, the amino acid sequence of which is preferably as shown in SEQ ID NO. 1;

more preferably, the adenine deaminase comprises a linker sequence, the amino acid sequence of which is preferably shown in SEQ ID NO. 3.
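As an illustration of the substitutions described above, the following minimal sketch applies a point mutation such as N46L to the TadA-8e sequence of SEQ ID NO. 2 and checks that the reference residue really is at the stated position. The helper name `apply_substitution` is hypothetical; the sequence is copied from SEQ ID NO. 2, split into 16-residue chunks to match the sequence listing.

```python
import re

# TadA-8e, SEQ ID NO. 2 (167 aa), written in 16-residue chunks.
TADA_8E = (
    "MSEVEFSHEYWMRHAL" "TLAKRARDEREVPVGA" "VLVLNNRVIGEGWNRA"
    "IGLHDPTAHAEIMALR" "QGGLVMQNYRLIDATL" "YVTFEPCVMCAGAMIH"
    "SRIGRVVFGVRNSKRG" "AAGSLMNVLNYPGMNH" "RVEITEGILADECAAL"
    "LCDFYRMPRQVFNAQK" "KAQSSIN"
)


def apply_substitution(seq: str, mutation: str) -> str:
    """Apply a substitution written like 'N46L' (1-based position)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"cannot parse mutation {mutation!r}")
    ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= pos <= len(seq):
        raise ValueError(f"position {pos} is outside the sequence")
    if seq[pos - 1] != ref:
        raise ValueError(f"expected {ref} at {pos}, found {seq[pos - 1]}")
    return seq[: pos - 1] + alt + seq[pos:]


# Deaminase domain of ADS-BE1 (TadA-8e carrying N46L).
ads_be1_deaminase = apply_substitution(TADA_8E, "N46L")
print(TADA_8E[43:48], "->", ads_be1_deaminase[43:48])
```

The reference-residue check guards against off-by-one numbering, for example when the initial M is omitted in the construct.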

The second technical scheme provided by the invention is as follows: a gene editor for realizing efficient base transversion, which introduces the adenine deaminase according to the first technical scheme into a single-base editing tool; and/or introduces other mutations or modifications, such as changing the length of the linker, fusing new functional proteins, etc.

As a preferred embodiment, the gene editor for realizing efficient base transversion comprises: a promoter element, a nuclease, a polyadenylation signal (polyA) and the adenine deaminase according to the first technical scheme; wherein:

the promoter element is preferably a broad-spectrum promoter, a tissue-specific promoter or a liver-specific promoter, more preferably a CMV promoter;

the nuclease is preferably a Cas protein or a variant thereof; the Cas protein is preferably spCas9 from Streptococcus pyogenes, SaCas9 from Staphylococcus aureus, LbCas12a from Lachnospiraceae bacterium or enAsCas12a from Acidaminococcus sp.; the Cas protein variant is preferably VQR-spCas9, VRER-spCas9, SpRY, SpNG, SaCas9-KKH or SaCas9-NG; preferably, the amino acid sequence of the nuclease is shown as SEQ ID NO. 4;

the polyadenylation signal is preferably the bovine growth hormone polyadenylation signal or a polyadenylation signal from another species, more preferably the bovine growth hormone polyadenylation signal BGHpolyA.

The third technical scheme provided by the invention is as follows: a gene editing system for effecting efficient base transversions, comprising: sgRNA and a gene editor for realizing efficient base transversion according to the second technical scheme;

preferably, the target sequence of the sgRNA is shown as the nucleotide sequence of SEQ ID NO. 5-21.

The fourth technical scheme provided by the invention is as follows: a pharmaceutical composition comprising the adenine deaminase according to the first technical scheme, the gene editor for realizing efficient base transversion according to the second technical scheme, or the gene editing system for realizing efficient base transversion according to the third technical scheme.

The fifth technical scheme provided by the invention is as follows: a method of gene editing for non-therapeutic purposes, the method comprising: expressing the adenine deaminase according to the first technical scheme, the gene editor for realizing efficient base transversion according to the second technical scheme, and the gene editing system for realizing efficient base transversion according to the third technical scheme in target cells, so that base transversion gene editing occurs in the target cells;

preferably, the source of the target cells is an isolated cell line;

more preferably, the isolated cell line is a 293T cell, HELA cell, U2OS cell, NIH3T3 cell or N2A cell.

The sixth technical scheme provided by the invention is as follows: use of the adenine deaminase according to the first technical scheme, the gene editor for realizing efficient base transversion according to the second technical scheme, and the gene editing system for realizing efficient base transversion according to the third technical scheme in the preparation of medicines for gene editing or medicines for gene therapy.

The seventh technical scheme provided by the invention is as follows: use of the adenine deaminase according to the first technical scheme, the gene editor for realizing efficient base transversion according to the second technical scheme, and the gene editing system for realizing efficient base transversion according to the third technical scheme in the construction of animal models and in crop breeding.

The eighth technical scheme provided by the invention is as follows: the adenine deaminase according to one of the technical schemes, the gene editor for realizing efficient base transversion according to the second technical scheme and the gene editing system for realizing efficient base transversion according to the third technical scheme are applied to the preparation of base editing tools.

The TadA-8e single-point and combination mutations used in the invention are not limited to the 30 single-point mutations mentioned herein, and also cover TadA from other species, including other prokaryotic sources. The single-base editing tool used in the invention is ABE8e; other similar single-base tools and mutants capable of A-to-G mutation are also included. The invention is used for gene editing in eukaryotic cells, and also covers non-eukaryotic cells, such as prokaryotes, archaea and the like. The ADS-BE1 series used in the invention consists of CMV-TadA-8e mutant-Cas9n-BGHpolyA; it also covers arrangements and combinations that give more precise and efficient C-to-G activity relative to ABE8e, as well as other positional arrangements, such as embedding the TadA protein in the middle of Cas9.

The key innovation point of the invention is that:

The invention takes the adenine deaminase TadA-8e (only 167 amino acids) as its entry point. (ABE, the adenine base editor, is a fusion of the adenosine deaminase TadA and the Cas9 protein; TadA-8e is the latest evolved version of TadA.) Through analysis of the TadA-8e crystal structure and rational design, a molecular switch that converts ABE into CGBE was identified: it completely eliminates the adenine editing capability of ABE and greatly improves editing of cytosine as a substrate. After multiple rounds of screening and engineering, ADS-BE1 (the smallest CGBE to date) was developed, which induces precise C-G to G-C base transversion editing at positions 5-7. Conventional CGBEs are engineered from CBE, which itself has more off-target events. ADS-BE1 is engineered from ABE, which is smaller than CBE, and ADS-BE1 is not fused to any other factor, so it is smaller still. In addition, ABE has fewer off-target events than CBE, so ADS-BE1 shows higher editing activity and fewer off-target events and is very safe. Overall, the ADS-BE1 base editor provided by the invention is a major technical improvement in the field of single-base gene editing and can greatly promote applications in gene editing, precision medicine, disease model construction, crop genetic breeding and the like.

The invention has the positive progress effects that:

1. through analysis of the TadA-8e crystal structure and rational design, the invention identifies a "molecular switch" that converts ABE into CGBE, which completely eliminates the adenine editing capability of ABE and greatly improves editing of cytosine as a substrate;

2. the invention develops the ABE-derived ADS-BE1, which is the smallest CGBE to date;

3. compared with the CGBE series derived from conventional cytosine deaminases, the ADS-BE1 base editor provided by the invention has higher editing activity and fewer off-target events, and is therefore very safe;

4. the ADS-BE1 base editor induces precise C-G to G-C base transversion editing at positions 5-7;

5. the ADS-BE1 base editor is a technical improvement in the field of single-base gene editing and can greatly promote applications in gene editing, precision medicine, disease model construction, crop genetic breeding and the like.

Drawings

Fig. 1: construct design based on the TadA-8e crystal structure (PDB: 6VPC).

Fig. 2: comparison of A3, A4 and C6 base editing by the 30 ABE8e mutants at the FANCF site1 target in 293T cells.

Fig. 3: comparison of C > G base editing by ADS-BE1, CGBE1 and APO1-nCas9-XRCC1 at 12 endogenous sites in 293T cells.

Fig. 4: comparison of average C > G base editing at different positions for ADS-BE1, CGBE1 and APO1-nCas9-XRCC1 (statistics of 12 endogenous sites).

Fig. 5: comparison of indels generated by ADS-BE1, CGBE1, APO1-nCas9-XRCC1 and ABE8e (statistics of 12 endogenous sites).

Fig. 6A, 6B, and 6C: comparison of Cas9-independent off-target editing generated by ADS-BE1, CGBE1, APO1-nCas9-XRCC1 and ABE8e, with ABE site 23 as the target site.

Fig. 7: comparison of C-to-G efficiency generated by ADS-BE1, CGBE1, APO1-nCas9-XRCC1 and ABE8e at the target site ABE site 23.

Fig. 8: nSaCas9-R-loop pattern.

Detailed Description

The invention is further illustrated by the following examples, which are not intended to limit its scope. Where specific conditions are not given for an experimental method in the following examples, conventional methods and conditions were used, or the conditions in the product instructions were followed.

Example 1: the single-point mutants ABE8e-N46L and ABE8e-N46P eliminate A-to-G editing and produce efficient C-to-G editing

1.1 plasmid design and construction

1.1.1 Based on the crystal structure of ABE8e bound to substrate DNA, 30 single-point mutations of ABE8e were designed, together with 1 endogenous human target, FANCF site1 (Table 2);

1.1.2 The 30 ABE8e single-point mutation sequences were synthesized according to Table 1 and assembled by seamless cloning with ABE8e (Addgene #138489) as the vector (kit: Vazyme ClonExpress MultiS One Step Cloning Kit, C113-01). For the target, two oligos were synthesized according to Table 2, with CACC added to the positive strand and AAAC added to the negative strand, and ligated into BbsI-digested U6-sgRNA-EF1α-GFP. The nucleotide sequence of plasmid U6-sgRNA-EF1α-GFP is shown as SEQ ID NO: 56, in which the sequence encoding the sgRNA targeting sequence is represented by consecutive N.

1.1.3 The plasmids constructed in 1.1.1 and 1.1.2 were verified by Sanger sequencing to ensure they were completely correct.
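The oligo design in 1.1.2 (CACC on the positive strand, AAAC on the negative strand) can be sketched as follows. This is a minimal illustration: the function names are hypothetical, the example target is made up rather than taken from Table 2, and no extra 5' G is added because the text does not mention one.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_sgrna_oligos(target: str) -> tuple:
    """Return (positive-strand oligo, negative-strand oligo), written 5'->3'.

    The CACC and AAAC prefixes give the annealed duplex overhangs for
    ligation into the BbsI-digested sgRNA vector.
    """
    target = target.upper()
    if not target or set(target) - set("ACGT"):
        raise ValueError("target must contain only A, C, G, T")
    return "CACC" + target, "AAAC" + reverse_complement(target)


# Hypothetical 20-nt target, not one of the sequences in Table 2.
top, bottom = design_sgrna_oligos("GATCCCACTTAGCATGCAAT")
print(top)
print(bottom)
```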

Table 1: TadA-8e single-point mutation sequences

The order of arrangement of the functional blocks of ABE8e is: bNLS+TadA8e+linker+spCas9n (D10A) +bNLS.

The order of arrangement of the functional blocks of ADS-BE1 is as follows: bNLS+TadA8e (N46L) +linker+spCas9n (D10A) +bNLS.

The amino acid sequence of bNLS in the construct is 2-19 of SEQ ID NO. 1, and M at the first position is omitted.

SEQ ID NO:1:MKRTADGSEFESPKKKRKV；

The amino acid sequence of TadA-8e in the construct is 2-167 of SEQ ID NO. 2, and M at the first position is omitted.

SEQ ID NO:2:MSEVEFSHEYWMRHALTLAKRARDEREVPVGAVLVLNNRVIGEGWNRAIGLHDPTAHAEIMALRQGGLVMQNYRLIDATLYVTFEPCVMCAGAMIHSRIGRVVFGVRNSKRGAAGSLMNVLNYPGMNHRVEITEGILADECAALLCDFYRMPRQVFNAQKKAQSSIN；

The amino acid sequence of the Linker is shown as SEQ ID NO. 3,

SEQ ID NO:3:SGGSSGGSSGSETPGTSESATPESSGGSSGGS；

the amino acid sequence of spCas9n (D10A) in the construct is from position 2 to 1368 of SEQ ID NO. 4, omitting the M at the first position.

SEQ ID NO:4:MDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDHIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGD。
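The module order given above (bNLS + TadA8e (N46L) + linker + spCas9n (D10A) + bNLS) allows a rough size estimate of ADS-BE1. The sketch below only adds up the module lengths stated in the text, dropping the initial M where the text says it is omitted. It assumes that no start codon residue or junction residues are added between modules, which the text does not specify, and the class and variable names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    name: str
    seq_id_length: int      # length of the corresponding SEQ ID NO
    drop_initial_met: bool  # True where the text omits the first M

    @property
    def length(self) -> int:
        return self.seq_id_length - (1 if self.drop_initial_met else 0)


ADS_BE1_MODULES = [
    Module("bNLS", 19, True),             # residues 2-19 of SEQ ID NO. 1
    Module("TadA-8e (N46L)", 167, True),  # residues 2-167 of SEQ ID NO. 2
    Module("linker", 32, False),          # SEQ ID NO. 3
    Module("spCas9n (D10A)", 1368, True), # residues 2-1368 of SEQ ID NO. 4
    Module("bNLS", 19, True),
]

total_length = sum(m.length for m in ADS_BE1_MODULES)
for m in ADS_BE1_MODULES:
    print(f"{m.name:16s} {m.length:5d} aa")
print(f"{'total':16s} {total_length:5d} aa")
```

Under these assumptions the deaminase contributes only 166 of the residues, which is the basis for describing ADS-BE1 as a small CGBE with no extra fused factors.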

Table 2: targets and sequences used in the examples

1.2 cell transfection

Day 1: HEK293T cells (ATCC CRL-3216 cell line) were seeded into 24-well plates:

(1) HEK293T cells were digested and seeded into 24-well plates at 2 × 10⁵ cells per well;

Note: after thawing, cells generally need to be passaged twice before they can be used for transfection experiments;

Day 2: transfection:

(2) The state of the cells in each well was observed;

Note: the cell density before transfection should be 70%-90% and the cells should be in a normal state;

(3) Plasmid transfection;

Each ABE8e mutant plasmid newly constructed in step 1.1 and the U6-sgRNA-EF1α-GFP plasmid were co-transfected into HEK293T cells at 750 ng : 250 ng, using PEI as the transfection reagent (3 μL PEI per 1 μg plasmid), with ABE8e as the control; n = 3 wells per group.
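The transfection amounts above reduce to simple arithmetic: 750 ng of editor plasmid plus 250 ng of sgRNA plasmid is 1 μg of DNA, which at 3 μL PEI per 1 μg plasmid needs 3 μL of PEI per well. A minimal sketch of that calculation follows; the function name is hypothetical and no pipetting overage is included, since the text does not mention one.

```python
def pei_volume_ul(plasmid_ng: list, ul_per_ug: float = 3.0) -> float:
    """PEI volume (uL) for the given plasmid amounts (ng), at ul_per_ug uL per ug DNA."""
    total_ug = sum(plasmid_ng) / 1000.0
    return ul_per_ug * total_ug


per_well_ng = [750, 250]   # editor plasmid, sgRNA plasmid
wells = 3                  # n = 3 wells per group

print("PEI per well (uL):", pei_volume_ul(per_well_ng))
print("PEI per group (uL):", pei_volume_ul(per_well_ng) * wells)
print("DNA per group (ng):", [amount * wells for amount in per_well_ng])
```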

1.3 genome extraction and preparation of amplicon libraries

72 h after transfection, cellular genomic DNA was extracted using the TIANGEN cell genome extraction kit (DP304). Identification primers (Table 3) were then designed following the workflow of the Hi-TOM Gene Editing Detection Kit (Novogene): the bridging sequence 5'-ggagtgagtacggtgtgc-3' (SEQ ID NO: 57) was added to the 5' end of the forward identification primer and the bridging sequence 5'-gagttggatgctggatgg-3' (SEQ ID NO: 58) was added to the 5' end of the reverse identification primer, giving the first-round PCR products. The first-round PCR products were used as templates for the second-round PCR; the second-round products were pooled, gel-purified and sent for sequencing (sequencing company: Suzhou GENEWIZ Biotechnology Co.).

Table 3: identification primers for target used
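The primer modification described in 1.3 amounts to prepending the two bridging sequences (SEQ ID NO: 57 and SEQ ID NO: 58) to the target-specific primers. A minimal sketch follows, in which the target-specific primer sequences are hypothetical placeholders rather than entries from Table 3.

```python
FORWARD_BRIDGE = "ggagtgagtacggtgtgc"  # SEQ ID NO: 57
REVERSE_BRIDGE = "gagttggatgctggatgg"  # SEQ ID NO: 58


def add_bridges(forward_primer: str, reverse_primer: str) -> tuple:
    """Prepend the bridging sequences to the 5' ends of both primers.

    Bridges stay lowercase and target-specific parts uppercase, so the
    two segments remain distinguishable in the output.
    """
    return (
        FORWARD_BRIDGE + forward_primer.upper(),
        REVERSE_BRIDGE + reverse_primer.upper(),
    )


# Hypothetical target-specific primers, for illustration only.
fwd, rev = add_bridges("ACGTTGCAAGGCTTACCGTA", "TTGACCGGATACGTTCAGGA")
print(fwd)
print(rev)
```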

1.4 analysis and statistics of deep sequencing results

Deep sequencing results were analyzed using the BE-analyzer website (http://www.rgenome.net/BE-analyzer/) to obtain the ratios of A to G, C to T, C to G and C to A, and statistical plots were made with GraphPad Prism 9.1.0, as shown in Table 4 and FIG. 2.

Table 4: editing efficiency results of target FANCF site1 (unit:%)

The next-generation sequencing results (FIG. 2 and Table 4) show that, relative to ABE8e, ABE8e-N46A, ABE8e-N46G, ABE8e-N46L, ABE8e-N46P and ABE8e-N46R all completely abolished A-to-G editing. Of these, ABE8e-N46L and ABE8e-N46P both showed efficient editing of cytosine C6 (C to G, T or A), with C to G as the main product: C-to-G editing reached 43.93% and 43.87%, respectively. In contrast, C7 and C8 on the 3' side were hardly edited (Table 4), indicating high precision. ABE8e-N46L was therefore designated ADS-BE1 (ABE Derived Single Base Editor), a novel mini CGBE that does not depend on cytosine deaminase, uracil glycosylase, DNA repair factors or the like.
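The per-position percentages discussed above (C to G, C to T, C to A) are fractions of sequencing reads carrying each base at the target position. The sketch below shows one way such values and the share of C to G among all edited reads could be computed from base counts. It is not the BE-analyzer implementation, the function names are hypothetical, and the read counts are invented for illustration.

```python
def conversion_percentages(ref_base: str, base_counts: dict) -> dict:
    """Percentage of reads carrying each non-reference base at one position."""
    total = sum(base_counts.values())
    if total == 0:
        raise ValueError("no reads at this position")
    return {
        f"{ref_base} to {base}": 100.0 * count / total
        for base, count in base_counts.items()
        if base != ref_base
    }


def product_share(ref_base: str, product: str, base_counts: dict) -> float:
    """Fraction of edited reads that carry the given product base."""
    edited = sum(n for base, n in base_counts.items() if base != ref_base)
    if edited == 0:
        raise ValueError("no edited reads at this position")
    return base_counts.get(product, 0) / edited


# Invented read counts at a target C, for illustration only.
counts_c6 = {"C": 5000, "G": 4000, "T": 600, "A": 400}
print(conversion_percentages("C", counts_c6))
print("C to G share of edited reads:", product_share("C", "G", counts_c6))
```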

Example 2: ADS-BE1 operation characterization

2.1 plasmid design and construction

2.1.1 12 endogenous human targets were designed for characterizing the editing behaviour: SSH2 sg10, TIM3-sg4, ABE site 8, EGFR-sg4, HBG-sg14, PPP1R12C site 5, FANCF-sg15, ABE site23, PPP1R12C site7, FANCF-sg17, hp53-sg1 and Ox40-sg1 (Table 2); the plasmids were constructed as in 1.1.2;

2.1.2 The plasmids constructed in 2.1.1 were verified by Sanger sequencing to ensure they were completely correct.

2.2 cell transfection

Day 2: transfection:

(2) The state of the cells in each well was observed;

(3) Plasmid transfection amounts were as follows:

Base editor plasmid : U6-sgRNA-EF1α-GFP plasmid newly constructed in 2.1 = 750 ng : 250 ng, with PEI as the transfection reagent (3 μL PEI per 1 μg plasmid), co-transfected into HEK293T cells; ABE8e, CGBE1 (Addgene #140252) and APO1-nCas9-XRCC1 (Addgene #165444) were used as controls; n = 3 wells per group.

2.3 genome extraction and preparation of amplicon libraries

72 h after transfection, cellular genomic DNA was extracted using the TIANGEN cell genome extraction kit (DP304). Identification primers (Table 3) were then designed following the workflow of the Hi-TOM Gene Editing Detection Kit (Novogene): the bridging sequence shown as SEQ ID NO: 57 was added to the 5' end of the forward identification primer and the bridging sequence shown as SEQ ID NO: 58 was added to the 5' end of the reverse identification primer, giving the first-round PCR products. The first-round PCR products were used as templates for the second-round PCR; the second-round products were pooled, gel-purified and sent for sequencing (sequencing company: Suzhou GENEWIZ Biotechnology Co.).

2.4 analysis and statistics of deep sequencing results

Deep sequencing results were analyzed using the BE-analyzer website to obtain the ratios of A to G, C to T, C to G and C to A, and statistical plots were made with GraphPad Prism 9.1.0.

Table 5: comparison of C > G base editing efficiency exhibited by ADS-BE1, CGBE1, APO1-nCas9-XRCC1, ABE8e at 12 endogenous sites on 293T (unit:%)

Table 6: comparison of average C > G base editing at different positions for ADS-BE1, CGBE1, APO1-nCas9-XRCC1 and ABE8e (statistics of 12 endogenous sites) (unit: %)

The results (FIG. 3 and Table 5) show that, with CGBE1 and APO1-nCas9-XRCC1 (two conventional CGBEs) and ABE8e as controls, ADS-BE1 exhibits precise editing at the FANCF-sg15, PPP1R12C site7, hp53-sg1 and Ox40-sg1 targets. According to the average C-to-G efficiency at different positions across the 12 targets (FIG. 4 and Table 6), ADS-BE1 preferentially edits C5/C6. At C5, the average efficiencies of the three CGBEs ADS-BE1, CGBE1 and APO1-nCas9-XRCC1 are 27.87%, 17.96% and 7.72%, respectively; at C6, they are 32.23%, 41.36% and 26.26%, respectively. At the bystander positions C7, C8 and C4, CGBE1 produced average C-to-G efficiencies of 11.04%, 11.39% and 9.55%, APO1-nCas9-XRCC1 produced 6.29%, 6.1% and 4.45%, and ADS-BE1 produced only 2.15%, 0.7% and 0.5%, showing that ADS-BE1 does not edit effectively at positions other than C5/C6. In addition, ADS-BE1 has a motif preference (the sequence context of the target C). For TC targets at the C5 and C6 positions, ADS-BE1 is as efficient as the conventional CGBE1 and APO1-nCas9-XRCC1; for GC and CC targets, ADS-BE1 is more efficient than both; for AC targets, however, ADS-BE1 is extremely insensitive, and its editing efficiency is far lower than that of CGBE1 and APO1-nCas9-XRCC1. It follows that the extremely narrow C5/C6 window and the motif preference give ADS-BE1 high-precision editing performance.
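One way to summarize the position averages reported above is the ratio of mean C-to-G efficiency at C5/C6 to the mean at the bystander positions C4, C7 and C8. This ratio is an illustrative metric introduced here, not one used in the patent; the input values are the averages quoted in the paragraph above.

```python
from statistics import mean

# Average C-to-G efficiency (%) by position, as quoted in the text.
AVERAGE_C_TO_G = {
    "ADS-BE1": {"C4": 0.5, "C5": 27.87, "C6": 32.23, "C7": 2.15, "C8": 0.7},
    "CGBE1": {"C4": 9.55, "C5": 17.96, "C6": 41.36, "C7": 11.04, "C8": 11.39},
    "APO1-nCas9-XRCC1": {"C4": 4.45, "C5": 7.72, "C6": 26.26, "C7": 6.29, "C8": 6.1},
}


def window_selectivity(by_position: dict) -> float:
    """Mean efficiency at C5/C6 divided by mean efficiency at C4/C7/C8."""
    on_window = mean(by_position[p] for p in ("C5", "C6"))
    bystander = mean(by_position[p] for p in ("C4", "C7", "C8"))
    return on_window / bystander


selectivity = {name: window_selectivity(v) for name, v in AVERAGE_C_TO_G.items()}
for name, ratio in selectivity.items():
    print(f"{name:18s} C5/C6 vs bystander ratio: {ratio:.1f}")
```

On these figures the ratio for ADS-BE1 is roughly an order of magnitude higher than for the two conventional CGBEs, which is consistent with the narrow-window conclusion drawn in the text.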

Example 3: ADS-BE1 safety evaluation

3.1 plasmid design and construction

3.1.1 To further assess the safety of ADS-BE1, ABE8e was again used as the control and ABE site 23 as the target site. 4 Cas9-independent off-target detection targets were designed (Table 2); for each, two oligos were synthesized, with CACC added to the positive strand and AAAC added to the negative strand, and ligated into the BbsI-digested nSaCas9-R-loop vector (FIG. 8);

3.1.2 The plasmids constructed in 3.1.1 were verified by Sanger sequencing to ensure they were completely correct;

3.2 cell transfection

Day 1: HEK293T cells (ATCC CRL-3216 cell line) were seeded into 24-well plates:

Day 2: transfection:

(2) Observing the cell state of each hole;

(3) Plasmid transfection amounts were as follows:

ADS-BE1/CGBE1/APO1-nCas9-XRCC1/ABE8e : plasmid newly constructed in 3.1 : ABE site 23 target plasmid = 400 ng : 300 ng plasmid dose; transfection reagent PEI (3 μL PEI per 1 μg plasmid); co-transfected into HEK293T cells; n = 3 wells per group.

3.3 genome extraction and preparation of amplicon library

72 h after transfection, cellular genomic DNA was extracted using the TIANGEN cell genome extraction kit (DP304). Identification primers (Table 3) were then designed following the workflow of the Hi-TOM Gene Editing Detection Kit (Novogene): the bridging sequence shown as SEQ ID NO: 57 was added to the 5' end of the forward identification primer and the bridging sequence shown as SEQ ID NO: 58 was added to the 5' end of the reverse identification primer, giving the first-round PCR products. The first-round PCR products were used as templates for the second-round PCR; the second-round products were pooled, gel-purified and sent for sequencing (sequencing company: Suzhou GENEWIZ Biotechnology Co.).

3.4 analysis and statistics of deep sequencing results

Deep sequencing results were analyzed using the BE-analyzer website to obtain the ratios of C to T, C to G and C to A and the indel frequencies, and statistical plots were made with GraphPad Prism 9.1.0.

Table 7: comparison of indels generated by ADS-BE1, CGBE1, APO1-nCas9-XRCC1 and ABE8e (statistics of 12 endogenous sites) (unit: %)

Table 8: Cas9-independent off-target comparison (unit: %)

Table 9: comparison of C-to-G efficiency (unit: %)

The indel (insertion and deletion) statistics (FIG. 5 and Table 7) show that the average indel frequency is 5.25% for ABE8e, 8.1% for ADS-BE1, 15.88% for CGBE1 and 26.93% for APO1-nCas9-XRCC1. ADS-BE1 therefore produces far fewer indels than the conventional CGBE series, at a level close to that of the original ABE8e. In the Cas9-independent off-target evaluation (FIG. 6A, 6B, 6C and Table 8), ADS-BE1 caused only very low C editing (C to G, C to T, C to A) at the 4 R-loop sites tested; these minor off-target editing events indicate the higher safety of ADS-BE1. At the same time, efficient C-to-G editing was produced at the target site (FIG. 7 and Table 9).
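The indel averages quoted above can be put on a common scale by expressing each editor relative to ABE8e. The sketch below does only that arithmetic on the four reported means; the function name is hypothetical.

```python
# Average indel frequency (%) across the 12 endogenous sites, as quoted in the text.
MEAN_INDEL = {
    "ABE8e": 5.25,
    "ADS-BE1": 8.1,
    "CGBE1": 15.88,
    "APO1-nCas9-XRCC1": 26.93,
}


def fold_over(reference: str, values: dict) -> dict:
    """Each value divided by the value of the reference editor."""
    base = values[reference]
    return {name: value / base for name, value in values.items()}


indel_fold = fold_over("ABE8e", MEAN_INDEL)
for name, fold in indel_fold.items():
    print(f"{name:18s} {MEAN_INDEL[name]:6.2f}%  ({fold:.2f}x ABE8e)")
```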

References:

[1] Rees HA, Liu DR. Base editing: precision chemistry on the genome and transcriptome of living cells. Nat Rev Genet, 2018, 19:770-788.

[2] Komor AC, Kim YB, Packer MS, et al. Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature, 2016, 533:420-424.

[3] Gaudelli NM, Komor AC, Rees HA, et al. Programmable base editing of A•T to G•C in genomic DNA without DNA cleavage. Nature, 2017, 551:464-471.

[4] Komor AC, Zhao KT, Packer MS, et al. Improved base excision repair inhibition and bacteriophage Mu Gam protein yields C:G-to-T:A base editors with higher efficiency and product purity. Sci Adv, 2017, 3:eaao4774.

[5] Kurt IC, Zhou R, Iyer S, et al. CRISPR C-to-G base editors for inducing targeted DNA transversions in human cells. Nat Biotechnol, 2021, 39:41-46.

[6] Zhao D, Li J, Li S, et al. Glycosylase base editors enable C-to-A and C-to-G base changes. Nat Biotechnol, 2021, 39:35-40.

[7] Koblan LW, Arbab M, Shen MW, et al. Efficient C•G-to-G•C base editors developed using CRISPRi screens, target-library analysis, and machine learning. Nat Biotechnol, 2021.

[8] Chen L, Park JE, Paa P, et al. Programmable C:G to G:C genome editing with CRISPR-Cas9-directed base excision repair proteins. Nat Commun, 2021, 12:1384.

[9] Li S, Yuan B, Cao J, et al. Docking sites inside Cas9 for adenine base editing diversification and RNA off-target elimination. Nat Commun, 2020, 11:5827.

[10] Kim HS, Jeong YK, Hur JK, et al. Adenine base editors catalyze cytosine conversions in human cells. Nat Biotechnol, 2019, 37:1145-1148.

[11] Grunewald J, Zhou R, Iyer S, et al. CRISPR DNA base editors with reduced RNA off-target and self-editing activities. Nat Biotechnol, 2019, 37:1041-1048.

SEQUENCE LISTING

<110> East China Normal University

Shanghai Bangyao Biological Technology Co.,Ltd.

<120> Mini gene editing system for realizing efficient base transversion and application thereof

<130> P21018913C

<160> 58

<170> PatentIn version 3.5

<210> 1

<211> 19

<212> PRT

<213> Artificial Sequence

<220>

<223> Nuclear localization Signal sequence

<400> 1

Met Lys Arg Thr Ala Asp Gly Ser Glu Phe Glu Ser Pro Lys Lys Lys

1 5 10 15

Arg Lys Val

<210> 2

<211> 167

<212> PRT

<213> Artificial Sequence

<220>

<223> adenine deaminase TadA-8e sequence

<400> 2

Met Ser Glu Val Glu Phe Ser His Glu Tyr Trp Met Arg His Ala Leu

1 5 10 15

Thr Leu Ala Lys Arg Ala Arg Asp Glu Arg Glu Val Pro Val Gly Ala

20 25 30

Val Leu Val Leu Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Ala

35 40 45

Ile Gly Leu His Asp Pro Thr Ala His Ala Glu Ile Met Ala Leu Arg

50 55 60

Gln Gly Gly Leu Val Met Gln Asn Tyr Arg Leu Ile Asp Ala Thr Leu

65 70 75 80

Tyr Val Thr Phe Glu Pro Cys Val Met Cys Ala Gly Ala Met Ile His

85 90 95

Ser Arg Ile Gly Arg Val Val Phe Gly Val Arg Asn Ser Lys Arg Gly

100 105 110

Ala Ala Gly Ser Leu Met Asn Val Leu Asn Tyr Pro Gly Met Asn His

115 120 125

Arg Val Glu Ile Thr Glu Gly Ile Leu Ala Asp Glu Cys Ala Ala Leu

130 135 140

Leu Cys Asp Phe Tyr Arg Met Pro Arg Gln Val Phe Asn Ala Gln Lys

145 150 155 160

Lys Ala Gln Ser Ser Ile Asn

165

<210> 3

<211> 32

<212> PRT

<213> Artificial Sequence

<220>

<223> linker sequence

<400> 3

Ser Gly Gly Ser Ser Gly Gly Ser Ser Gly Ser Glu Thr Pro Gly Thr

1 5 10 15

Ser Glu Ser Ala Thr Pro Glu Ser Ser Gly Gly Ser Ser Gly Gly Ser

20 25 30

<210> 4

<211> 1368

<212> PRT

<213> Artificial Sequence

<220>

<223> nuclease SpCas9n (D10A) sequence

<400> 4

Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val

1 5 10 15

Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe

20 25 30

Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile

35 40 45

Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu

50 55 60

Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys

65 70 75 80

Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser

85 90 95

Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys

100 105 110

His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr

115 120 125

His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp

130 135 140

Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His

145 150 155 160

Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro

165 170 175

Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr

180 185 190

Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala

195 200 205

Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn

210 215 220

Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn

225 230 235 240

Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe

245 250 255

Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp

260 265 270

Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp

275 280 285

Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp

290 295 300

Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser

305 310 315 320

Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys

325 330 335

Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe

340 345 350

Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser

355 360 365

Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp

370 375 380

Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg

385 390 395 400

Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu

405 410 415

Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe

420 425 430

Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile

435 440 445

Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp

450 455 460

Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu

465 470 475 480

Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr

485 490 495

Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser

500 505 510

Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys

515 520 525

Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln

530 535 540

Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr

545 550 555 560

Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp

565 570 575

Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly

580 585 590

Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp

595 600 605

Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr

610 615 620

Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala

625 630 635 640

His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr

645 650 655

Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp

660 665 670

Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe

675 680 685

Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe

690 695 700

Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu

705 710 715 720

His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly

725 730 735

Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly

740 745 750

Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln

755 760 765

Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile

770 775 780

Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro

785 790 795 800

Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu

805 810 815

Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg

820 825 830

Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys

835 840 845

Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg

850 855 860

Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys

865 870 875 880

Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys

885 890 895

Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp

900 905 910

Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr

915 920 925

Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp

930 935 940

Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser

945 950 955 960

Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg

965 970 975

Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val

980 985 990

Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe

995 1000 1005

Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala

1010 1015 1020

Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe

1025 1030 1035

Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala

1040 1045 1050

Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu

1055 1060 1065

Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val

1070 1075 1080

Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr

1085 1090 1095

Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys

1100 1105 1110

Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro

1115 1120 1125

Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val

1130 1135 1140

Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys

1145 1150 1155

Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser

1160 1165 1170

Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys

1175 1180 1185

Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu

1190 1195 1200

Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly

1205 1210 1215

Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val

1220 1225 1230

Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser

1235 1240 1245

Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys

1250 1255 1260

His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys

1265 1270 1275

Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala

1280 1285 1290

Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn

1295 1300 1305

Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala

1310 1315 1320

Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser

1325 1330 1335

Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr

1340 1345 1350

Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp

1355 1360 1365

<210> 5

<211> 20

<212> DNA

<213> Artificial Sequence

<220>

<223> target FANCF site1 sequence

<400> 5

ggaatccctt ctgcagcacc 20

<210> 6

<211> 20

<212> DNA

<213> Artificial Sequence

<220>

<223> target SSH2-sg10 sequence

<400> 6

tcactccctc ttcaagctga 20

<210> 7

<211> 20

<212> DNA

<213> Artificial Sequence

<220>

<223> target TIM3-sg4 sequence

<400> 7

gaacctcgtg cccgtctgct 20

<210> 8

<211> 20

<212> DNA

<213> Artificial Sequence

<220>

<223> target ABE site 8 sequence

<400> 8

gtaaacaaag catagactga 20

<210> 9

<211> 20

<212> DNA

<213> Artificial Sequence

<220>

<223> target EGFR-sg4 sequence

<400> 9

aagatcaaag tgctgggctc 20

<210> 10

<211> 20

<212> DNA

<213> Artificial Sequence

<220>

<223> target HBG-sg14 sequence

<400> 10

agctcctagt ccagacgcca 20

<210> 11

<211> 20

<212> DNA

<213> Artificial Sequence

<220>

<223> target PPP1R12C site 5 sequence

<400> 11

gctgactcag agaccctgag 20

<210> 12

<211> 20

<212> DNA

<213> Artificial Sequence

<220>

<223> target FANCF-sg15 sequence

<400> 12

gaagctcgga aaagcgatcc 20

<210> 13

<211> 20

<212> DNA

<213> Artificial Sequence

<220>

<223> target ABE site23 sequence

<400> 13

taagcataga ctccaggata 20

<210> 14

<211> 20

<212> DNA

<213> Artificial Sequence

<220>

<223> target PPP1R12C site7 sequence

<400> 14

gctggctcag gttcaggaga 20

<210> 15

<211> 20

<212> DNA

<213> Artificial Sequence

<220>

<223> target FANCF-sg17 sequence

<400> 15

gcgatccagg tgctgcagaa 20

<210> 16

<211> 21

<212> DNA

<213> Artificial Sequence

<220>

<223> target Hp53-sg1 sequence

<400> 16

cttacctcgc ttagtgctcc c 21

<210> 17

<211> 20

<212> DNA

<213> Artificial Sequence

<220>

<223> target OX40-sg1 sequence

<400> 17

acacctaccc cagcaacgac 20

<210> 18

<211> 22

<212> DNA

<213> Artificial Sequence

<220>

<223> target R-loop 2 sequence

<400> 18

atttacagcc tggcctttgg gg 22

<210> 19

<211> 22

<212> DNA

<213> Artificial Sequence

<220>

<223> target R-loop 3 sequence

<400> 19

gtgtcaggta atgtgctaaa ca 22

<210> 20

<211> 20

<212> DNA

<213> Artificial Sequence

<220>

<223> target R-loop 5 sequence

<400> 20

tctgcttctc cagccctggc 20

<210> 21

<211> 20

<212> DNA

<213> Artificial Sequence

<220>

<223> target R-loop 6 sequence

<400> 21

gatgttccaa tcagtacgca 20

<210> 22

<211> 43

<212> DNA

<213> Artificial Sequence

<220>

<223> FANCF site1 forward primer sequence

<400> 22

ggagtgagta cggtgtgcaa ggaacacgga taaagacgct ggg 43

<210> 23

<211> 43

<212> DNA

<213> Artificial Sequence

<220>

<223> FANCF site1 reverse primer sequence

<400> 23

gagttggatg ctggatggta ggtagtgctt gagaccgcca gaa 43

<210> 24

<211> 43

<212> DNA

<213> Artificial Sequence

<220>

<223> SSH2-sg10 forward primer sequence

<400> 24

ggagtgagta cggtgtgctg gaagtcacga gttcttgatg gac 43

<210> 25

<211> 43

<212> DNA

<213> Artificial Sequence

<220>

<223> SSH2-sg10 reverse primer sequence

<400> 25

gagttggatg ctggatggcc tgggtgatag ggcaagactc tgt 43

<210> 26

<211> 38

<212> DNA

<213> Artificial Sequence

<220>

<223> TIM3-sg4 forward primer sequence

<400> 26

ggagtgagta cggtgtgcct caccgcttga gtcttggc 38

<210> 27

<211> 39

<212> DNA

<213> Artificial Sequence

<220>

<223> TIM3-sg4 reverse primer sequence

<400> 27

gagttggatg ctggatggac gttgccacat tcaaacaca 39

<210> 28

<211> 43

<212> DNA

<213> Artificial Sequence

<220>

<223> ABE site 8 forward primer sequence

<400> 28

ggagtgagta cggtgtgcct gctgccgtgg gagacaattc ata 43

<210> 29

<211> 43

<212> DNA

<213> Artificial Sequence

<220>

<223> ABE site 8 reverse primer sequence

<400> 29

gagttggatg ctggatggag ctgttgcatg aggaaaggga cta 43

<210> 30

<211> 43

<212> DNA

<213> Artificial Sequence

<220>

<223> EGFR-sg4 forward primer sequence

<400> 30

ggagtgagta cggtgtgcct tgtggagcct cttacaccca gtg 43

<210> 31

<211> 41

<212> DNA

<213> Artificial Sequence

<220>

<223> EGFR-sg4 reverse primer sequence

<400> 31

gagttggatg ctggatggct ccccaccaga ccatgagagg c 41

<210> 32

<211> 43

<212> DNA

<213> Artificial Sequence

<220>

<223> HBG-sg14 forward primer sequence

<400> 32

ggagtgagta cggtgtgctt agagtatcca gtgaggccag ggg 43

<210> 33

<211> 43

<212> DNA

<213> Artificial Sequence

<220>

<223> HBG-sg14 reverse primer sequence

<400> 33

gagttggatg ctggatggtt gccccacagg cttgtgatag tag 43

<210> 34

<211> 38

<212> DNA

<213> Artificial Sequence

<220>

<223> PPP1R12C site 5 forward primer sequence

<400> 34

ggagtgagta cggtgtgcga cttgcccaga gctcttcc 38

<210> 35

<211> 45

<212> DNA

<213> Artificial Sequence

<220>

<223> PPP1R12C site 5 reverse primer sequence

<400> 35

gagttggatg ctggatggaa taaaaatacg gtgaatttct ggttg 45

<210> 36

<211> 40

<212> DNA

<213> Artificial Sequence

<220>

<223> FANCF-sg15 forward primer sequence

<400> 36

ggagtgagta cggtgtgcga ccaaagcgcc gatggatgtg 40

<210> 37

<211> 43

<212> DNA

<213> Artificial Sequence

<220>

<223> FANCF-sg15 reverse primer sequence

<400> 37

gagttggatg ctggatggct ccaaggtgaa agcggaagta ggg 43

<210> 38

<211> 43

<212> DNA

<213> Artificial Sequence

<220>

<223> ABE site23 forward primer sequence

<400> 38

ggagtgagta cggtgtgcag gagttccacc gccttgttta ctg 43

<210> 39

<211> 43

<212> DNA

<213> Artificial Sequence

<220>

<223> ABE site23 reverse primer sequence

<400> 39

gagttggatg ctggatggct cggactttgg ggtaggtttg cat 43

<210> 40

<211> 43

<212> DNA

<213> Artificial Sequence

<220>

<223> PPP1R12C site7 forward primer sequence

<400> 40

ggagtgagta cggtgtgcgg ccaggcagat agaccagact gag 43

<210> 41

<211> 38

<212> DNA

<213> Artificial Sequence

<220>

<223> PPP1R12C site7 reverse primer sequence

<400> 41

gagttggatg ctggatggac tggccctggc tttggcag 38

<210> 42

<211> 40

<212> DNA

<213> Artificial Sequence

<220>

<223> FANCF-sg17 forward primer sequence

<400> 42

ggagtgagta cggtgtgcga ccaaagcgcc gatggatgtg 40

<210> 43

<211> 43

<212> DNA

<213> Artificial Sequence

<220>

<223> FANCF-sg17 reverse primer sequence

<400> 43

gagttggatg ctggatggct ccaaggtgaa agcggaagta ggg 43

<210> 44

<211> 43

<212> DNA

<213> Artificial Sequence

<220>

<223> Hp53-sg1 forward primer sequence

<400> 44

ggagtgagta cggtgtgcaa cagctttgag gtgcgtgttt gtg 43

<210> 45

<211> 43

<212> DNA

<213> Artificial Sequence

<220>

<223> Hp53-sg1 reverse primer sequence

<400> 45

gagttggatg ctggatggat ctgaggcata actgcaccct tgg 43

<210> 46

<211> 42

<212> DNA

<213> Artificial Sequence

<220>

<223> OX40-sg1 forward primer sequence

<400> 46

ggagtgagta cggtgtgcga cagcagagac gaggatgtgc gt 42

<210> 47

<211> 39

<212> DNA

<213> Artificial Sequence

<220>

<223> OX40-sg1 reverse primer sequence

<400> 47

gagttggatg ctggatggcc tcacctggcc tgcactcgt 39

<210> 48

<211> 43

<212> DNA

<213> Artificial Sequence

<220>

<223> R-loop 2 forward primer sequence

<400> 48

ggagtgagta cggtgtgcaa aggacatttc caccgcaaaa tgg 43

<210> 49

<211> 41

<212> DNA

<213> Artificial Sequence

<220>

<223> R-loop 2 reverse primer sequence

<400> 49

gagttggatg ctggatggga atggggagaa gggcaggttc c 41

<210> 50

<211> 43

<212> DNA

<213> Artificial Sequence

<220>

<223> R-loop 3 forward primer sequence

<400> 50

ggagtgagta cggtgtgctc tttgctccag atttcccttc ata 43

<210> 51

<211> 43

<212> DNA

<213> Artificial Sequence

<220>

<223> R-loop 3 reverse primer sequence

<400> 51

gagttggatg ctggatggcc ttaagtgttc agctgctttt ctt 43

<210> 52

<211> 43

<212> DNA

<213> Artificial Sequence

<220>

<223> R-loop 5 forward primer sequence

<400> 52

ggagtgagta cggtgtgcag tctatttctg ctgcaagtaa gca 43

<210> 53

<211> 43

<212> DNA

<213> Artificial Sequence

<220>

<223> R-loop 5 reverse primer sequence

<400> 53

gagttggatg ctggatggac atactagccc ctgtctagga aaa 43

<210> 54

<211> 39

<212> DNA

<213> Artificial Sequence

<220>

<223> R-loop 6 forward primer sequence

<400> 54

ggagtgagta cggtgtgcga acacggataa agacgctgg 39

<210> 55

<211> 39

<212> DNA

<213> Artificial Sequence

<220>

<223> R-loop 6 reverse primer sequence

<400> 55

gagttggatg ctggatgggc agaagggatt ccatgaggt 39

<210> 56

<211> 2340

<212> DNA

<213> Artificial Sequence

<220>

<223> plasmid U6-sgRNA-EF1 alpha-GFP sequence

<220>

<221> misc_feature

<222> (250)..(269)

<223> n is a, c, g or t

<400> 56

gagggcctat ttcccatgat tccttcatat ttgcatatac gatacaaggc tgttagagag 60

ataattagaa ttaatttgac tgtaaacaca aagatattag tacaaaatac gtgacgtaga 120

aagtaataat ttcttgggta gtttgcagtt ttaaaattat gttttaaaat ggactatcat 180

atgcttaccg taacttgaaa gtatttcgat ttcttggctt tatatatctt gtggaaagga 240

cgaaacaccn nnnnnnnnnn nnnnnnnnng ttttagagct agaaatagca agttaaaata 300

aggctagtcc gttatcaact tgaaaaagtg gcaccgagtc ggtgcttttt ttaggcctga 360

attctgcaga tatccatcac actggcggct ccggtgcccg tcagtgggca gagcgcacat 420

cgcccacagt ccccgagaag ttggggggag gggtcggcaa ttgaaccggt gcctagagaa 480

ggtggcgcgg ggtaaactgg gaaagtgatg tcgtgtactg gctccgcctt tttcccgagg 540

gtgggggaga accgtatata agtgcagtag tcgccgtgaa cgttcttttt cgcaacgggt 600

ttgccgccag aacacaggta agtgccgtgt gtggttcccg cgggcctggc ctctttacgg 660

gttatggccc ttgcgtgcct tgaattactt ccactggctg cagtacgtga ttcttgatcc 720

cgagcttcgg gttggaagtg ggtgggagag ttcgaggcct tgcgcttaag gagccccttc 780

gcctcgtgct tgagttgagg cctggcctgg gcgctggggc cgccgcgtgc gaatctggtg 840

gcaccttcgc gcctgtctcg ctgctttcga taagtctcta gccatttaaa atttttgatg 900

acctgctgcg acgctttttt tctggcaaga tagtcttgta aatgcgggcc aagatctgca 960

cactggtatt tcggtttttg gggccgcggg cggcgacggg gcccgtgcgt cccagcgcac 1020

atgttcggcg aggcggggcc tgcgagcgcg gccaccgaga atcggacggg ggtagtctca 1080

agctggccgg cctgctctgg tgcctggcct cgcgccgccg tgtatcgccc cgccctgggc 1140

ggcaaggctg gcccggtcgg caccagttgc gtgagcggaa agatggccgc ttcccggccc 1200

tgctgcaggg agctcaaaat ggaggacgcg gcgctcggga gagcgggcgg gtgagtcacc 1260

cacacaaagg aaaagggcct ttccgtcctc agccgtcgct tcatgtgact ccacggagta 1320

ccgggcgccg tccaggcacc tcgattagtt ctcgagcttt tggagtacgt cgtctttagg 1380

ttggggggag gggttttatg cgatggagtt tccccacact gagtgggtgg agactgaagt 1440

taggccagct tggcacttga tgtaattctc cttggaattt gccctttttg agtttggatc 1500

ttggttcatt ctcaagcctc agacagtggt tcaaagtttt tttcttccat ttcaggtgtc 1560

gtgaaatacg actcactata gggagaccca agctggctag ttaagcttgg taccgccacc 1620

atggtgagca agggcgagga gctgttcacc ggggtggtgc ccatcctggt cgagctggac 1680

ggcgacgtaa acggccacaa gttcagcgtg tccggcgagg gcgagggcga tgccacctac 1740

ggcaagctga ccctgaagtt catctgcacc accggcaagc tgcccgtgcc ctggcccacc 1800

ctcgtgacca ccctgaccta cggcgtgcag tgcttcagcc gctaccccga ccacatgaag 1860

cagcacgact tcttcaagtc cgccatgccc gaaggctacg tccaggagcg caccatcttc 1920

ttcaaggacg acggcaacta caagacccgc gccgaggtga agttcgaggg cgacaccctg 1980

gtgaaccgca tcgagctgaa gggcatcgac ttcaaggagg acggcaacat cctggggcac 2040

aagctggagt acaactacaa cagccacaac gtctatatca tggccgacaa gcagaagaac 2100

ggcatcaagg tgaacttcaa gatccgccac aacatcgagg acggcagcgt gcagctcgcc 2160

gaccactacc agcagaacac ccccatcggc gacggccccg tgctgctgcc cgacaaccac 2220

tacctgagca cccagtccgc cctgagcaaa gaccccaacg agaagcgcga tcacatggtc 2280

ctgctggagt tcgtgaccgc cgccgggatc actctcggca tggacgagct gtacaagtaa 2340
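As an illustration of how the placeholder in SEQ ID NO. 56 can be read, the following minimal Python sketch replaces the run of n at positions 250 to 269 with a target sequence. The fragment is copied from positions 231 to 290 of the listing. The function name `fill_spacer` and the choice of FANCF site1 (SEQ ID NO. 5) as the example target are assumptions made for this sketch, which does not describe the cloning procedure actually used.

```python
import re

# Positions 231-290 of SEQ ID NO. 56: end of the U6 promoter, the 20-nt
# placeholder (positions 250-269), and the start of the sgRNA scaffold.
CASSETTE_FRAGMENT = "gtggaaaggacgaaacacc" + "n" * 20 + "gttttagagctagaaatagca"

# Example target taken from the listing (SEQ ID NO. 5, FANCF site1).
FANCF_SITE1 = "ggaatcccttctgcagcacc"


def fill_spacer(template: str, spacer: str) -> str:
    """Replace the first run of 'n' in the template with the spacer (hypothetical helper)."""
    spacer = spacer.lower()
    if not spacer or set(spacer) - set("acgt"):
        raise ValueError("spacer must contain only a, c, g or t")
    match = re.search(r"n+", template)
    if match is None:
        raise ValueError("template has no 'n' placeholder")
    return template[:match.start()] + spacer + template[match.end():]


filled = fill_spacer(CASSETTE_FRAGMENT, FANCF_SITE1)
print(filled)
```

Because the whole run of n is replaced, the same helper also accepts the 21-nt and 22-nt targets in the listing (SEQ ID NO. 16, 18 and 19); whether those targets are cloned at full length is not stated here.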

<210> 57

<211> 18

<212> DNA

<213> Artificial Sequence

<220>

<223> forward primer end bridge sequence

<400> 57

ggagtgagta cggtgtgc 18

<210> 58

<211> 18

<212> DNA

<213> Artificial Sequence

<220>

<223> reverse primer end bridge sequence

<400> 58

gagttggatg ctggatgg 18
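The amplification primers in the listing (SEQ ID NO. 22 to 55) each begin with one of the two bridge sequences (SEQ ID NO. 57 and 58), followed by a locus-specific part. The following minimal Python sketch checks this for four primers copied from the listing and returns the locus-specific part. The helper name `strip_bridge` is hypothetical and is used only for illustration.

```python
FORWARD_BRIDGE = "ggagtgagtacggtgtgc"  # SEQ ID NO. 57
REVERSE_BRIDGE = "gagttggatgctggatgg"  # SEQ ID NO. 58

# Primers copied from the listing, keyed by SEQ ID NO.
# The spaces follow the 10-nt grouping used in the listing.
PRIMERS = {
    22: ("forward", "ggagtgagta cggtgtgcaa ggaacacgga taaagacgct ggg"),
    23: ("reverse", "gagttggatg ctggatggta ggtagtgctt gagaccgcca gaa"),
    26: ("forward", "ggagtgagta cggtgtgcct caccgcttga gtcttggc"),
    27: ("reverse", "gagttggatg ctggatggac gttgccacat tcaaacaca"),
}


def strip_bridge(direction: str, primer: str) -> str:
    """Return the locus-specific part of a primer after its 5' bridge sequence."""
    bridge = FORWARD_BRIDGE if direction == "forward" else REVERSE_BRIDGE
    sequence = primer.replace(" ", "").lower()
    if not sequence.startswith(bridge):
        raise ValueError(f"{direction} primer does not start with the bridge sequence")
    return sequence[len(bridge):]


locus_parts = {
    seq_id: strip_bridge(direction, primer)
    for seq_id, (direction, primer) in PRIMERS.items()
}
for seq_id, part in sorted(locus_parts.items()):
    print(seq_id, part, len(part))
```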

Claims

1. An adenine deaminase characterized in that the adenine deaminase has an amino acid difference in one or more of positions 28, 30 and 46 compared to the amino acid sequence shown in SEQ ID No. 2; preferably at position 46.

2. The adenine deaminase of claim 1, wherein,

the amino acid residue V at position 28 is replaced with A or G;

the amino acid residue V at position 30 is replaced with F; and/or

the amino acid residue N at position 46 is replaced with A, G, L or P, preferably L or P.

3. A gene editor for effecting efficient base transversions, characterized in that the adenine deaminase of claim 1 or 2 is introduced into a single-base editing tool; and/or other mutations or modifications are introduced, such as changing the length of the linker or fusing new functional proteins.

4. The gene editor for effecting efficient base transversions as claimed in claim 3, characterized in that it comprises: a promoter element, a nuclease, a polyadenylation (polyA) signal, and the adenine deaminase according to claim 1 or 2; wherein:

5. A gene editing system for effecting efficient base transversions, comprising: an sgRNA and the gene editor for effecting efficient base transversions according to claim 3;

6. A pharmaceutical composition comprising an adenine deaminase according to claim 1 or 2, a gene editor for effecting efficient base transversions according to claim 3 or 4, or a gene editing system for effecting efficient base transversions according to claim 5.

7. A method of gene editing for non-therapeutic purposes, the method comprising:

expressing the adenine deaminase of claim 1 or 2, the gene editor for effecting efficient base transversions of claim 3 or 4, or the gene editing system for effecting efficient base transversions of claim 5 in a target cell, so as to perform base-transversion gene editing on the target cell;

preferably, the target cell is derived from an isolated cell line;

8. Use of the adenine deaminase according to claim 1 or 2, the gene editor for effecting efficient base transversions according to claim 3 or 4, or the gene editing system for effecting efficient base transversions according to claim 5 for the preparation of a medicament for gene editing or a medicament for gene therapy.

9. Use of the adenine deaminase of claim 1 or 2, the gene editor for effecting efficient base transversions of claim 3 or 4, or the gene editing system for effecting efficient base transversions of claim 5 for constructing animal models or for crop breeding.

10. Use of the adenine deaminase according to claim 1 or 2, the gene editor for effecting efficient base transversions according to claim 3 or 4, or the gene editing system for effecting efficient base transversions according to claim 5 for the preparation of a base editing tool.
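For illustration only, the substitutions recited in claim 2 can be checked against SEQ ID NO. 2 with a minimal Python sketch. It uses residues 17 to 48 of the TadA-8e sequence as copied from the sequence listing, confirms that positions 28, 30 and 46 are V, V and N, and builds the N46L variant (the substitution of ADS-BE1). The names `to_one_letter` and `substitute` are hypothetical helpers and do not form part of the claimed subject matter.

```python
# Three-letter to one-letter amino acid codes (standard 20 residues).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Residues 17-48 of SEQ ID NO. 2 (TadA-8e), copied from the sequence listing.
SEGMENT_START = 17
SEGMENT_3LETTER = (
    "Thr Leu Ala Lys Arg Ala Arg Asp Glu Arg Glu Val Pro Val Gly Ala "
    "Val Leu Val Leu Asn Asn Arg Val Ile Gly Glu Gly Trp Asn Arg Ala"
)

# Claim 2: position -> (residue in SEQ ID NO. 2, permitted replacements).
CLAIMED = {28: ("V", "AG"), 30: ("V", "F"), 46: ("N", "AGLP")}


def to_one_letter(three_letter: str) -> str:
    """Convert a space-separated three-letter sequence to one-letter codes."""
    return "".join(THREE_TO_ONE[residue] for residue in three_letter.split())


def substitute(segment: str, position: int, new_residue: str,
               start: int = SEGMENT_START) -> str:
    """Apply one claimed substitution; position uses SEQ ID NO. 2 numbering."""
    expected, allowed = CLAIMED[position]
    index = position - start
    if segment[index] != expected:
        raise ValueError(f"position {position} is {segment[index]}, not {expected}")
    if new_residue not in allowed:
        raise ValueError(f"{new_residue} is not recited for position {position}")
    return segment[:index] + new_residue + segment[index + 1:]


segment = to_one_letter(SEGMENT_3LETTER)
n46l = substitute(segment, 46, "L")
print(segment)
print(n46l)
```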