CN117904069A

CN117904069A - VirEN protein-mediated DNA splicing and gene editing method

Info

Publication number: CN117904069A
Application number: CN202410009324.5A
Authority: CN
Inventors: 韩文元; 王雨薇; 付攀; 夏永振
Original assignee: Huazhong Agricultural University
Current assignee: Huazhong Agricultural University
Priority date: 2024-01-04
Filing date: 2024-01-04
Publication date: 2024-04-19

Abstract

The invention discloses a DNA splicing and gene editing method mediated by the VirEN protein, a member of the BT4734 branch of the archaeo-eukaryotic primase (AEP) superfamily, and belongs to the field of biotechnology and genetic engineering. The inventors found that VirEN has a microhomology-mediated end joining (MMEJ) function: it mediates annealing of short complementary 3' protruding sequences between nucleic acid molecules and catalyzes DNA synthesis in vitro, and it supports gene editing in vivo, such as knockdown or large fragment deletion.

Description

VirEN protein-mediated DNA splicing and gene editing method

Technical Field

The invention discloses a VirEN protein-mediated DNA splicing and gene editing method, belonging to the field of biotechnology and genetic engineering.

Background

DNA polymerases are proteins that use DNA as a template and dNTPs as substrates to catalyze the synthesis of progeny DNA molecules. They have important applications in biotechnology and genetic engineering, for example catalyzing DNA synthesis in PCR and assembling DNA molecules with homologous ends in Gibson assembly. In Gibson assembly, T5 exonuclease degrades dsDNA in the 5'-3' direction to create 3' overhangs; homologous 3' overhangs anneal spontaneously, DNA polymerase catalyzes DNA synthesis from the annealed 3' overhangs, and DNA ligase then repairs the remaining gaps. Kits for DNA assembly can also be developed using T5 exonuclease (e.g., as disclosed in patent CN108841901A). However, the DNA polymerases in use come from a narrow range of sources and have drawbacks such as high cost and the need for long homologous ends (15-30 bp), which have limited the development of related biotechnology to some extent. Thus, there is a need in the art to mine more DNA polymerases of different sources and functions.

The archaeo-eukaryotic primase (AEP) superfamily contains an atypical class of DNA polymerases that function as primases in archaea and eukaryotes and are also widespread in mobile genetic elements of archaea and bacteria (including plasmids, transposons, viruses, etc.). AEP superfamily members from mobile elements exhibit functional diversity in DNA metabolism, yet studies of their functions and activities are relatively lacking. Thus, the AEP superfamily is a potential source of novel DNA polymerases.

DNA double-strand breaks (DSBs) are the most common form of DNA damage. Nuclease- or radiation-induced two-ended DSBs can be repaired by several DNA repair systems: non-homologous end joining (NHEJ), homologous recombination (HR) and microhomology-mediated end joining (MMEJ). MMEJ repairs ends by annealing that depends on short homologous sequences (about 3-20 bp). MMEJ depends on polymerase theta (Pol θ, encoded by the gene POLQ). In addition, PARP1, FEN1, XRCC1, APE2 and DNA ligase 3 are all closely related to MMEJ function. Studies have shown that MMEJ is, most of the time, an alternative to the NHEJ and HR repair systems, and functions only when those two systems are not functioning. As a form of alternative end joining, MMEJ requires only a very small region of homology for repair, which makes targeting vectors easier to construct. Methods for CRISPR-based gene editing using MMEJ have been reported (Van Vu T, Thi Hai Doan D, Kim J, et al. CRISPR/Cas-based precision genome editing via microhomology-mediated end joining. Plant Biotechnol J. 2021, 19(2):230-239.).

Disclosure of Invention

In order to achieve the above purpose, the invention adopts the following technical scheme:

The invention provides an application of VirEN protein in DNA assembly or genome editing, which is characterized in that the protein simultaneously meets all the following characteristics:

(1) Contains a VirE N-terminal domain;

In some embodiments, the VirE N-terminal domain contains two αβ units at the N-terminus and one RNA recognition motif (RRM) fold at the C-terminus;

(2) Contains three conserved motifs: motif I is hhhDhD/E (h is a hydrophobic residue), motif II is sxK (s is a small residue, x can be any residue), and motif III is hD/E;

Wherein motifs I and III are involved in binding divalent metal ions and motif II is involved in binding nucleotides;

(3) The amino acid sequence has more than 30% identity with the sequence shown in SEQ ID NO. 1.

In some embodiments, the protein described above is a prokaryotic Argonaute-associated protein;

In some embodiments, the gene encoding the protein described above is located within 20 genes upstream or downstream of the Argonaute protein-encoding gene;

In some embodiments, the amino acid sequence of the above protein is as shown in any one of SEQ ID NO. 1-5.

The invention also provides an in-vitro DNA assembly reaction system, which is characterized by comprising the following components:

1) The VirEN protein described above;

2) T5 exonuclease; optionally, the T5 exonuclease is used in an amount of 0.02 to 0.32U per total reaction system;

3) dNTPs;

4) A buffer containing divalent metal ions.

In some embodiments, the concentration of the VirEN protein described above is no less than 0.05 μM;

In some embodiments, the concentration of the VirEN protein described above is 0.2 μM.

In some embodiments, the concentration of dNTPs is 5 to 500 μM;

In some embodiments, the concentration of dNTPs described above is 50 μM.

In some embodiments, the buffer containing divalent metal ions described above is a PEG8000 buffer;

In some embodiments, the mass fraction of PEG8000 in the PEG8000 buffer is 0-5%;

In some embodiments, the above PEG8000 buffer consists of 100 ± 5 mM Tris-HCl (pH 7.5), 10 ± 1 mM MgCl₂, 10 ± 1 mM DTT, and PEG8000 at a mass fraction of 0-5%.

The invention also provides an in vitro DNA assembly method, characterized in that 2-4 fragments to be assembled are added to the above system and the reaction is carried out at 30-37 ℃ for 20-40 min.
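The ranges stated for the reaction system and the assembly method can be gathered into a simple pre-flight check. The following Python sketch is illustrative only: the class name, field names and function are hypothetical, and the bounds are copied from the embodiments stated in this section (VirEN at no less than 0.05 μM, T5 exonuclease at 0.02 to 0.32 U per reaction, dNTPs at 5 to 500 μM, PEG8000 at 0-5% by mass, 2-4 fragments, 30-37 ℃, 20-40 min).

```python
from dataclasses import dataclass


@dataclass
class MedaReaction:
    """Hypothetical container for one assembly reaction (names are illustrative)."""
    viren_uM: float
    t5_units: float
    dntp_uM: float
    peg8000_pct: float
    n_fragments: int
    temp_c: float
    time_min: float


# Inclusive (low, high) bounds taken from the embodiments; None means no upper bound.
RANGES = {
    "viren_uM": (0.05, None),
    "t5_units": (0.02, 0.32),
    "dntp_uM": (5, 500),
    "peg8000_pct": (0, 5),
    "n_fragments": (2, 4),
    "temp_c": (30, 37),
    "time_min": (20, 40),
}


def out_of_range(rxn):
    """Return the names of all parameters that fall outside the stated ranges."""
    problems = []
    for name, (low, high) in RANGES.items():
        value = getattr(rxn, name)
        if value < low or (high is not None and value > high):
            problems.append(name)
    return problems


typical = MedaReaction(0.2, 0.04, 50, 5, 2, 30, 40)
print(out_of_range(typical))
```

A reaction that passes this check is merely inside the claimed ranges; the check says nothing about how efficient the assembly will be.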

The present invention also provides a composition characterized by comprising:

1) VirEN proteins as described above; 2) A nuclease system capable of causing a double strand break at the target site;

In some embodiments, the nuclease system of 2) comprises a CRISPR-Cas system, a TALEN (transcription activator-like effector nuclease) system or a zinc-finger nuclease (ZFN) system.

The invention also provides a vector system capable of producing the above composition.

The invention also provides the use of the above composition or vector system for deleting a fragment at a genomic target site, characterized in that the fragment to be deleted is flanked on both sides by microhomologous sequences;

In some embodiments, the deleted fragment described above is between 486 bp and 16377 bp.
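To make the role of the flanking microhomologous sequences concrete, the following Python sketch models the expected outcome of a microhomology-mediated deletion on a toy sequence: the two copies of the microhomology anneal, and the repaired product keeps a single copy while losing everything between them. The function name and the toy sequence are hypothetical; this is a string model of the general MMEJ outcome, not the inventors' analysis pipeline.

```python
def mmej_deletion(seq, microhomology):
    """Collapse the first two copies of `microhomology` in `seq` into one.

    Returns (product, deleted_length). The deleted stretch is one copy of the
    microhomology plus everything between the two copies.
    """
    first = seq.find(microhomology)
    if first == -1:
        raise ValueError("microhomology not found")
    second = seq.find(microhomology, first + len(microhomology))
    if second == -1:
        raise ValueError("a second, non-overlapping copy is required")
    product = seq[:first] + seq[second:]
    return product, second - first


# Toy example: the microhomology CCGG flanks the stretch TTTTTT.
product, deleted = mmej_deletion("AAATCCGGTTTTTTCCGGGGGA", "CCGG")
print(product, deleted)
```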

The beneficial effects of the invention are as follows: compared with the prior art, the invention discloses for the first time the MMEJ function mediated by the VirEN protein. Based on this property, VirEN proteins can be used for DNA splicing and gene editing. The invention develops a VirEN-based DNA splicing method in which the required homologous sequence is significantly shorter than in traditional DNA splicing methods. The invention also demonstrates the technical effect of VirEN on gene editing in vivo.

Drawings

FIG. 1 is a schematic diagram of the structure of Argonaute (Ago) and its associated VirEN gene clusters from different sources.

FIG. 2 shows the analysis of VirEN polymerase activity using fluorescently labeled substrates with different end sequences.

FIG. 3 shows the verification of VirEN-mediated DNA splicing using fluorescently labeled dsDNA with 3' overhanging ends.

Fig. 4 is a schematic diagram of the MEDA principle.

FIG. 5 is a schematic illustration of the effectiveness of the MEDA assembly method with homologous sequences of different lengths.

FIG. 6 shows the positive cloning rate and accuracy of MEDA without selection pressure. A: the number of transformants obtained; B: colony PCR results for 24 transformants per group; C: sequencing results of positive clones, where the upper and lower panels are the sequencing results upstream and downstream of the cloning site, respectively.

FIG. 7 shows the effect of different VirEN protein concentrations on MEDA assembly efficiency.

FIG. 8 shows the effect of different dNTP concentrations on assembly efficiency.

FIG. 9 shows the results of the MEDA method on the assembly of different numbers of fragments and linearized vectors. The KanR, smR and CmR resistance gene expression cassette inserts were designated A, B and C fragments, respectively, and the pUC19 plasmid PCR linearized fragment was designated V.

FIG. 10 shows MEDA assembly efficiencies at different reaction times and reaction temperatures.

FIG. 11 shows the effect of PEG8000 on MEDA assembly efficiency.

FIG. 12 shows the effect of VirEN protein on the dinB deletion mutation rate after a double-strand break is introduced in the dinB gene using the CRISPR-Cas9 system.

Detailed Description

In order to more clearly illustrate the technical solutions and embodiments of the present invention, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings and specific technical methods. The described embodiments are only some, but not all, embodiments of the invention and do not constitute a limitation of the invention in any way.

The experimental methods in the following examples, unless otherwise specified, are conventional methods, and are carried out according to techniques or conditions described in the literature in the field or according to the product specifications. Materials, reagents, instruments and the like used in the examples described below are commercially available unless otherwise specified. The quantitative tests in the following examples were all set up in triplicate and the results averaged.

Among them, the primers and primer sequences used in the examples below are shown below.

Examples

Example 1 VirEN has the function of mediating MMEJ

Argonaute (Ago) proteins are found in archaea, bacteria and eukaryotes; they provide anchor sites for small non-coding RNAs and thereby direct degradation of target genes or inhibition of translation. Studies have shown that, upon recognizing viral invasion, Ago in archaea activates a transmembrane toxic effector protein that kills infected cells by inducing cell membrane depolarization, thereby inhibiting viral proliferation and providing immune protection (Zeng Z. et al. A short prokaryotic Argonaute activates membrane effector to confer antiviral defense. Cell Host & Microbe, 2022, 30, 930-943, e936). The function of pAgo at the level of the cell population depends on its associated proteins (proteins encoded by genes adjacent to the pAgo gene). While mining pAgo-associated proteins, the inventors found by genome structure analysis, phylogenetic analysis and neighboring-gene analysis that a gene encoding a member of the archaeo-eukaryotic primase (AEP) superfamily is conserved within 20 genes upstream or downstream of a class of LongB pAgo genes (FIG. 1). This class of AEP proteins belongs to the BT4734-like branch; it is annotated as VirE N-terminal domain-containing protein in NCBI and other databases and is designated VirEN in the present invention.

VirEN proteins are widely present in different prokaryotes, such as Seonamhaeicola sp. S2-3 (NCBI accession number: WP_083692401.1, SEQ ID NO.1), Bacteroides timonensis (NCBI accession number: WP_052356213.1, SEQ ID NO.2), Bacteroides oleiciplenus (NCBI accession number: WP_009128905.1, SEQ ID NO.3), Cyclobacterium marinum (NCBI accession number: WP_014022284.1, SEQ ID NO.4), a Flavobacteriaceae bacterium (NCBI accession number: WP_027879785.1, SEQ ID NO.5), and the like.

The VirEN protein structure has distinct features. Taking SEQ ID NO.1 as an example, amino acids 121-294 form the VirE N-terminal domain, in which amino acids 121-182 form the two αβ units and amino acids 183-294 form the RRM fold; LTIDFD (residues 205-210) is motif I, GLK (residues 241-243) is motif II, and VD (residues 274-275) is motif III. Motifs I and III are involved in binding divalent metal ions, and motif II is involved in binding nucleotides.
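The three motif definitions can be written as regular expressions and checked against the example peptides from SEQ ID NO.1. The sketch below is a minimal illustration under stated assumptions: the residue classes for "hydrophobic" (h) and "small" (s) are not defined in the source, so the sets used here are assumptions (threonine is included in the hydrophobic class only because the source's own motif I example, LTIDFD, has T at an h position), and the names are hypothetical.

```python
import re

# ASSUMPTION: residue classes are not specified in the source.
HYDROPHOBIC = "AVLIMFWYCT"  # T included so that the source example LTIDFD fits
SMALL = "GAS"

MOTIFS = {
    "I": re.compile(f"[{HYDROPHOBIC}]{{3}}D[{HYDROPHOBIC}][DE]"),  # hhhDhD/E
    "II": re.compile(f"[{SMALL}].K"),                              # sxK
    "III": re.compile(f"[{HYDROPHOBIC}][DE]"),                     # hD/E
}


def matches_motif(name, peptide):
    """True if the whole peptide fits the named motif pattern."""
    return MOTIFS[name].fullmatch(peptide) is not None


# Example peptides quoted for SEQ ID NO.1.
for name, peptide in [("I", "LTIDFD"), ("II", "GLK"), ("III", "VD")]:
    print(name, peptide, matches_motif(name, peptide))
```

Because motifs II and III are very short, a match on its own is weak evidence; the source combines the motifs with the domain architecture and the sequence identity criterion.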

Thus, any protein that contains a VirE N-terminal domain (with two αβ units at the N-terminus and one RNA recognition motif (RRM) fold at the C-terminus), contains the three conserved motifs (motif I, hhhDhD/E, where h is a hydrophobic residue; motif II, sxK, where s is a small residue and x can be any residue; and motif III, hD/E), and has an amino acid sequence with more than 30% identity to the sequence shown in SEQ ID NO.1 is a VirEN protein of the invention. In addition, VirEN proteins are prokaryotic Argonaute-associated proteins, so the native VirEN-encoding gene is located within 20 genes upstream or downstream of the Argonaute protein-encoding gene.

The inventors selected VirEN from Seonamhaeicola sp. S2-3 for study. In addition to its typical polymerase and primase activities, when the polymerase reaction of VirEN was analyzed with single-stranded DNA as the substrate, it was unexpectedly found that VirEN catalyzed a primer extension reaction on a substrate containing a terminal complementary sequence (ssDNA-4) but showed no such activity on a substrate without a terminal complementary sequence (ssDNA-0) (FIG. 2). These results indicate that VirEN has a microhomology-mediated end joining (MMEJ) function that could potentially be used for DNA splicing.

Example 2 in vitro DNA Assembly based on VirEN-mediated MMEJ

When analyzing the VirEN-mediated terminal transferase activity reactions on single-stranded DNA by non-denaturing gel electrophoresis, the inventors observed that VirEN converted ssDNA-4, a substrate containing the complementary 3'-CCGG sequence, into a product of twice the size, whereas no product was observed for the substrate with the non-complementary 3'-CCAA sequence. This confirms that VirEN mediates base pairing between complementary 3' protruding ends of single strands, with each strand then serving as the template for extension of the other (FIG. 2).

To test whether VirEN can mediate DNA splicing, MMEJ reactions were performed using different dsDNA substrates with 3' overhangs (containing no microhomologous ends, or microhomologous ends of 2, 4 or 6 nt); the results are shown in FIG. 3. In the non-denaturing gel, VirEN converted pssDNA-4, a substrate containing the complementary 3'-CCGG sequence, into a product of twice the size, but no product was observed for the substrate with the non-complementary 3'-CCAA overhang, confirming that the VirEN-mediated MMEJ reaction depends on base pairing between complementary 3' overhangs. VirEN also promoted formation of the MMEJ product at the 3'-CCCGGG terminus, but was less efficient in joining pssDNA-7 (3'-GTAC) and pssDNA-9 (3'-TTTAAA). In addition, there was little MMEJ product for pssDNA-8 (3'-TTAA) and pssDNA-2 (3'-GC). These results indicate that VirEN can catalyze the MMEJ reaction and that the reaction efficiency is related to the GC content and the length of the complementary sequence: GC-rich homologous ends require a complementary sequence of at least 4 nt, while AT-rich homologous ends require at least 6 nt.
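Whether two copies of the same substrate can pair through their 3' ends comes down to whether the terminal sequence equals its own reverse complement. The following Python sketch checks this for the end sequences named above. It assumes the sequences are read 5' to 3'; for the ends listed here the conclusion is the same in either reading direction. The function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def overhangs_can_pair(a, b):
    """True if 3' overhangs a and b (both 5'->3') are fully complementary."""
    return a.upper() == reverse_complement(b)


for end in ["CCGG", "CCAA", "CCCGGG", "GTAC", "TTTAAA", "TTAA", "GC"]:
    print(end, overhangs_can_pair(end, end))
```

Only CCAA fails the check, which matches the absence of product for that substrate. Full complementarity is necessary but not sufficient: GTAC, TTAA and GC also pass, yet the source reports low efficiency or almost no product for them, consistent with the stated dependence on GC content and length.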

When the homologous sequence is at least 4 nt long and forms at least 12 hydrogen bonds, the reaction is more efficient. When the homologous sequence is shorter than 4 nt or forms fewer than 12 hydrogen bonds, the efficiency of the reaction is significantly reduced. For example, substrates ending in GTAC (10 hydrogen bonds) and TTAA (8 hydrogen bonds) react with lower efficiency, while substrates ending in GC (homologous sequence of 2 nt, 6 hydrogen bonds) yield almost no MMEJ product.
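The hydrogen bond counts quoted here follow from Watson-Crick pairing, with three bonds per G·C pair and two per A·T pair. A minimal Python sketch of the count, applied to the end sequences tested in this example (the function name is illustrative):

```python
# Hydrogen bonds per Watson-Crick base pair: G-C pairs form 3, A-T pairs form 2.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def hydrogen_bonds(seq):
    """Total hydrogen bonds formed when `seq` is fully base-paired."""
    return sum(H_BONDS[base] for base in seq.upper())


for end in ["CCGG", "CCCGGG", "GTAC", "TTTAAA", "TTAA", "GC"]:
    print(end, len(end), hydrogen_bonds(end))
```

The count reproduces the values given in the text (GTAC 10, TTAA 8, GC 6). Note that TTTAAA reaches 12 bonds only at 6 nt, which fits the statement that AT-rich ends need a longer complementary sequence than GC-rich ends.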

Based on the ability of VirEN to catalyze MMEJ, the invention provides a DNA splicing method: VirEN mediates annealing of two DNA molecules whose ends carry homologous 3' protruding sequences and catalyzes DNA synthesis using one 3' protruding sequence as the template and the other as the primer, thereby splicing the DNA molecules. This MMEJ-assisted DNA assembly method, designated MEDA (MMEJ-Assisted DNA Assembly), can be used in DNA cloning (FIG. 4).

The advantages of the MEDA over the traditional Gibson assembly method are:

(1) The assembly reaction is aided by MMEJ, so that DNA assembly can be performed under a shorter homologous sequence, the cost of primer synthesis is reduced, and the adverse effect of the overlong homologous sequence on PCR amplification or DNA assembly is reduced;

(2) The assembled reaction product can be directly transformed into escherichia coli without subsequent operation;

(3) The reaction system is simple, low in cost, convenient and fast to operate and suitable for large-scale popularization and use.

Example 3 Assembly of homologous sequences of different lengths

The invention further tested the effectiveness of VirEN-mediated MEDA assembly methods on homologous sequences of different lengths.

The specific process is as follows:

1) The linearized vector was prepared from plasmid pUC19 by PCR amplification with primers 19k-F/19k-R (product length 2686 bp). The inserted KanR resistance gene fragment (length 900 bp) was amplified by PCR with the primer pairs 4-Kan-F/4-Kan-R, 6-Kan-F/6-Kan-R, 8-Kan-F/8-Kan-R and 15-Kan-F/15-Kan-R, respectively, which introduce homology arms of 4 bp to 15 bp at the ends. The PCR products were then separated by agarose gel electrophoresis, and the target fragments were recovered by gel excision and quantified.

2) The linearized pUC19 vector and the Kan fragments with homology arms of different lengths were assembled as 2 fragments. The assembly system contained 1 μM VirEN protein and 0.04 U T5 exonuclease per total reaction system, and the reaction buffer consisted of 100 mM Tris-HCl (pH 7.5), 10 mM MgCl₂, 1 mM DTT, 5% PEG8000 and 100 μM dNTPs. The total volume was 20 μL. The amount of linearized vector was 100 ng, the molar ratio of insert to vector was 4:1, and the reaction was carried out at 30 ℃ for 40 min. The control group used the TEDA technique (see CN108841901A), with the same reaction system except that VirEN and dNTPs were not added. The PEG8000 buffer may comprise 100 ± 5 mM Tris-HCl (pH 7.5), 10 ± 1 mM MgCl₂, 10 ± 1 mM DTT, and PEG8000 at a mass fraction of 0-5%.

3) After completion of the reaction, 10 μL of the assembly product was transformed into DH5α competent cells prepared by the Inoue method. The transformants were plated onto plates containing 50 μg/mL Kan to select positive clones, and the number of single colonies was counted after overnight incubation.
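The insert amount implied by step 2) can be estimated from the fragment lengths. The sketch below is a back-of-the-envelope calculation, assuming that the mass of double-stranded DNA is proportional to its length, so that the average mass per base pair cancels out; the source does not state how the insert amount was calculated, and the function name is illustrative.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio):
    """Mass of insert (ng) giving `molar_ratio` insert:vector molecules.

    ASSUMPTION: dsDNA mass scales linearly with length, so the per-bp mass cancels.
    """
    return vector_ng * (insert_bp / vector_bp) * molar_ratio


# 100 ng of the 2686 bp linearized pUC19 vector, 900 bp Kan fragment, 4:1 ratio.
print(round(insert_mass_ng(100, 2686, 900, 4), 1))
```

With these inputs the estimate is about 134 ng of insert per reaction; the short homology arms added by the primers change the insert length by only a few base pairs and are ignored here.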

As a result, as shown in FIG. 5, the MEDA and TEDA assembly efficiency was the same at 15bp homologous sequence, and also at the same level as that of the commercial DNA assembly method. When the homologous sequence is 4-8bp, the assembly efficiency of the MEDA is far higher than that of the TEDA. Meanwhile, for the MEDA technology, the 8bp homologous sequence and the 15bp homologous sequence obtain similar assembly efficiency. These results indicate that MEDA can utilize shorter homologous sequences for DNA assembly.

The positive cloning rate and accuracy of MEDA in the absence of selection pressure were further verified. The assembly product of the pUC19 PCR product and the Kan fragment with 8 bp homology arms was transformed and plated onto plates containing 100 μg/mL Amp. Positive clones were identified by colony PCR to determine the positive cloning rate of MEDA, and positive clones were then sequenced to analyze its accuracy. The results are shown in FIG. 6. Among 24 randomly picked single colonies per group, the positive rate was 30% for TEDA and 75% for MEDA under these conditions; 10 randomly selected positive clones generated by MEDA were sequenced, and all sequences were correct.

Example 4 Effect of different VirEN protein concentrations on MEDA assembly efficiency

The invention further tests the effect of different VirEN protein concentrations on MEDA assembly efficiency. The experimental procedure follows Example 3: the Kan PCR fragment carried 8 bp homology arms, and different VirEN protein concentrations were used in the MEDA reaction. The specific process is as follows:

1) The linearized vector was prepared from plasmid pUC19 by PCR with primers 19k-F/19k-R. The inserted KanR resistance gene expression cassette was amplified by PCR with the 8-Kan-F/8-Kan-R primer pair, which introduces 8 bp homology arms at the ends. The PCR products were then separated by agarose gel electrophoresis, and the target fragments were recovered by gel excision and quantified.

2) MEDA reaction systems with different VirEN protein concentrations were prepared by adding VirEN to final concentrations of 0, 0.01, 0.05, 0.1, 0.2, 0.5, 0.75, 1 and 2 μM, respectively, in a 20 μL MEDA assembly system. The amount of T5 exonuclease was 0.04 U per total reaction system, and the reaction buffer consisted of 100 mM Tris-HCl (pH 7.5), 10 mM MgCl₂, 1 mM DTT, 5% PEG8000 and 100 μM dNTPs.

3) The linearized pUC19 vector and the Kan fragment with 8 bp homology arms were assembled as 2 fragments in the MEDA reaction systems with different VirEN protein concentrations prepared above. The amount of linearized vector was 100 ng, the molar ratio of insert to vector was 4:1, and the reaction was carried out at 30 ℃ for 40 min. In the control group, the insert with an 8 bp homologous sequence at its ends was assembled with the linearized pUC19 vector in the same reaction system, except that VirEN and dNTPs were omitted.

4) After completion of the reaction, 10 μL of the assembly product was transformed into DH5α competent cells prepared by the Inoue method. The transformants were plated onto plates containing 50 μg/mL Kan to select positive clones, and the number of single colonies was counted after overnight incubation.

As shown in FIG. 7, VirEN had little effect on assembly efficiency at concentrations below 0.05 μM; assembly efficiency improved greatly once the VirEN concentration reached 0.05 μM and increased with increasing VirEN concentration, while above 0.1 μM further increases in VirEN concentration produced no significant change in assembly efficiency.
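For setting up such a concentration series, the amount of protein per reaction and the volume of stock to pipette follow from simple dilution arithmetic (C1·V1 = C2·V2). The Python sketch below uses the 20 μL reaction volume and the concentration series from this example; the 10 μM stock concentration is a hypothetical value chosen for illustration, since the source does not give the stock concentration.

```python
def amount_pmol(final_uM, total_uL):
    """Amount of protein in the reaction; 1 uM x 1 uL = 1 pmol."""
    return final_uM * total_uL


def stock_volume_uL(final_uM, total_uL, stock_uM):
    """Volume of stock needed to reach `final_uM` in `total_uL` (C1*V1 = C2*V2)."""
    if final_uM > stock_uM:
        raise ValueError("final concentration exceeds stock concentration")
    return final_uM * total_uL / stock_uM


SERIES_uM = [0, 0.01, 0.05, 0.1, 0.2, 0.5, 0.75, 1, 2]  # series tested in this example
STOCK_uM = 10  # HYPOTHETICAL stock concentration, not given in the source

for c in SERIES_uM:
    print(c, round(amount_pmol(c, 20), 2), round(stock_volume_uL(c, 20, STOCK_uM), 3))
```

At 0.2 μM in 20 μL, for instance, the reaction contains 4 pmol of VirEN. For the lowest concentrations the stock volume drops far below 1 μL, so an intermediate dilution of the stock would be needed in practice.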

Example 5 Effect of dNTP concentration on MEDA assembly efficiency

The invention further tests the effect of different dNTP concentrations on MEDA assembly efficiency. The experimental procedure follows Example 4: VirEN protein was used at 0.2 μM, and different dNTP concentrations were used in the MEDA assembly system as follows:

2) MEDA reaction systems with different dNTP concentrations were prepared by adding dNTPs to final concentrations of 0, 5, 20, 50, 100, 200 and 500 μM, respectively, in the 20 μL MEDA assembly system. VirEN protein was used at 0.2 μM, T5 exonuclease at 0.04 U per total reaction system, and the reaction buffer consisted of 100 mM Tris-HCl (pH 7.5), 10 mM MgCl₂, 1 mM DTT and 5% PEG8000 by mass.

3) The linearized pUC19 vector and the Kan fragment with 8 bp homology arms were assembled as 2 fragments in the MEDA reaction systems with different dNTP concentrations prepared above. The amount of linearized vector was 100 ng, the molar ratio of insert to vector was 4:1, and the reaction was carried out at 30 ℃ for 40 min. In the control group, the insert with an 8 bp homologous sequence at its ends was assembled with the linearized pUC19 vector in the same reaction system, except that VirEN and dNTPs were omitted.

As shown in FIG. 8, MEDA assembly requires dNTPs, which are essential to the assembly system established by the present invention. Good assembly results were achieved between 5 and 500 μM dNTPs, with 50 μM being the most efficient, whereas MEDA assembly could not occur below 0.5 μM. To ensure assembly efficiency, the amount of dNTPs added to the reaction system was set to 50 μM. After VirEN and dNTPs are added to the assembly system, VirEN catalyzes annealing of the homologous 3' protruding sequences at the fragment ends and then catalyzes DNA synthesis using one 3' protruding sequence as the template and the other as the primer, filling in the single-stranded gaps and thereby splicing the DNA molecules.

Example 6 The MEDA assembly method can assemble multiple fragments

The invention further tests the assembly of different numbers (1-3) of insert fragments with a linearized vector by the MEDA method. The specific process is as follows:

1) The linearized vector was prepared from plasmid pUC19 by PCR with primers 19k-F/19k-R. The inserts were derived from the KanR, SmR and CmR resistance gene expression cassettes. The 8-Kan, 10-Kan and 15-Kan fragments were amplified using 8-Kan-F/8-Kan-R, 10-Kan-F/10-Kan-R and 15-Kan-F/15-Kan-R, respectively; 8-Kan-2, 10-Kan-2 and 15-Kan-2 were amplified using 8-Kan-F/8-Kan-SPC-R, 10-Kan-F/10-Kan-SPC-R and 15-Kan-F/15-Kan-SPC-R; 8-SPC-2, 10-SPC-2 and 15-SPC-2 were amplified using SPC-F/8-SPC-19K-R, SPC-F/10-SPC-19K-R and SPC-F/15-SPC-19K-R; 8-SPC-3, 10-SPC-3 and 15-SPC-3 were amplified using SPC-F/8-SPC-CHL-R, SPC-F/10-SPC-CHL-R and SPC-F/15-SPC-CHL-R; 8-CHL-3, 10-CHL-3 and 15-CHL-3 were amplified using CHL-F/8-CHL-19K-R, CHL-F/10-CHL-19K-R and CHL-F/15-CHL-19K-R. These PCR amplifications introduced homology arms of 8 bp, 10 bp or 15 bp.

2) The linearized pUC19 vector and the Kan fragments with 8 bp, 10 bp and 15 bp homologous sequences, respectively, were assembled as 2 fragments; the linearized pUC19 vector, Kan-2 and SPC-2 with 8 bp, 10 bp and 15 bp homologous sequences, respectively, were assembled as 3 fragments; the linearized pUC19 vector, Kan-2, SPC-3 and CHL-3 with 8 bp, 10 bp and 15 bp homologous sequences, respectively, were assembled as 4 fragments. The assembly system contained 0.2 μM VirEN protein and 0.04 U T5 exonuclease per total reaction system, and the reaction buffer consisted of 100 mM Tris-HCl (pH 7.5), 10 mM MgCl₂, 1 mM DTT, 5% PEG8000 and 50 μM dNTPs. The amount of linearized vector was 100 ng, the molar ratio of insert to vector was 2:1, and the reaction was carried out at 30 ℃ for 40 min.

3) After completion of the assembly reaction, 10 μL of the assembly product was transformed into DH5α competent cells prepared by the Inoue method. The transformants were plated onto plates containing 50 μg/mL Kan, 50 μg/mL Kan + 50 μg/mL Str, and 50 μg/mL Kan + 50 μg/mL Str + 25 μg/mL Chl, respectively, to select positive clones, and the number of single colonies was counted after overnight incubation.

As shown in FIG. 9, the MEDA assembly method can assemble 2, 3 and 4 fragments. The assembly efficiency for 3 and 4 fragments is slightly lower than for 2 fragments, but a sufficient number of clones is still obtained, and the assembly efficiency with 10 bp homology arms is similar to that with 15 bp homology arms. With 10 bp homologous sequences, MEDA assembles multiple fragments more efficiently than TEDA.

Example 7 Effect of reaction time and reaction temperature on MEDA assembly efficiency

The MEDA assembly reaction is carried out at 30 ℃ rather than the 50 ℃ used for Gibson assembly, since the lower temperature may allow some short single-stranded DNA to anneal. However, the biochemical reactions in the above study of the molecular mechanism of VirEN were carried out at 37 ℃, so the invention further tested the effect of different reaction conditions on MEDA assembly efficiency. The experimental procedure follows Example 3: the Kan PCR fragment carried 8 bp homology arms, and different reaction temperatures and reaction times were used for the MEDA reaction. The specific process is as follows:

2) The linearized pUC19 vector and the Kan fragment with 8 bp homology arms were assembled as 2 fragments. The assembly system contained 1 μM VirEN protein and 0.04 U T5 exonuclease per total reaction system, and the reaction buffer consisted of 100 mM Tris-HCl (pH 7.5), 10 mM MgCl₂, 1 mM DTT, 5% PEG8000 and 100 μM dNTPs. The total volume was 20 μL. The amount of linearized vector was 100 ng, and the molar ratio of insert to vector was 4:1. In the experimental groups, the reaction temperature and time were set to 30 ℃ for 20 min, 30 ℃ for 40 min, 33 ℃ for 20 min, 33 ℃ for 40 min, 37 ℃ for 20 min and 37 ℃ for 40 min, respectively. The control group used the TEDA technique (see CN108841901A), with the same reaction system except that VirEN and dNTPs were not added, at a reaction temperature of 30 ℃ and a reaction time of 40 min.

3) After completion of the reaction, 10 μL of the assembly product was transformed into DH5α competent cells prepared by the Inoue method. The transformants were plated onto plates containing 50 μg/mL Kan to select positive clones, and the number of single colonies was counted after overnight incubation.

The experimental results are shown in FIG. 10. MEDA assembly efficiency is optimal when the reaction is carried out at 30 ℃ for 20-40 min. As the reaction time increases, T5 exonuclease digests the 5' ends of the linearized vector and the insert for longer; in this test, at a reaction temperature of 37 ℃, the 40 min reaction yielded more positive clones than the 20 min reaction.

Example 8 Effect of PEG8000 on DNA Assembly

PEG8000 molecules in the buffer solution system can further enhance the annealing effect, and have a protective effect on the annealing of DNA molecules. VirEN also promote annealing of short single stranded DNA, thus the present invention explores the effect of PEG8000 on MEDA assembly. Experimental procedure referring to example 3, kan PCR fragment was introduced into 8bp homology arm, and buffer solution without or with PEG8000 was used for DNA assembly test, and the specific procedure is as follows:

2) An MEDA reaction system without or with PEG8000 was configured, and the brand of PEG8000 was biofroxx by not introducing or introducing 5% by mass of PEG8000 into a 20. Mu.L MEDA assembly system. VirEN protein was used in an amount of 0.2. Mu.M, T5 exonuclease was used in an amount of 0.04U per total reaction system, and the reaction buffer system consisted of 100mM Tris-HCl, pH=7.5, 10mM MgCl ₂, 1mM DTT.

3) The linearized pUC19 vector and the Kan fragment carrying 8 bp homology arms were assembled as a 2-fragment assembly using the MEDA reaction systems prepared above, without or with PEG8000 (5%). The amount of linearized vector was 100 ng, the molar ratio of insert to vector was 4:1, and the reaction was carried out at 30 ℃ for 40 min. In the control group, an insert with 8 bp homologous sequences at its ends was assembled with the linearized pUC19 vector in the same reaction system, except that VirEN and dNTPs were omitted.

As shown in FIG. 11, with 8 bp homology arms, the MEDA assembly group without PEG8000 formed significantly fewer positive clones than the group with PEG8000. With 15 bp homology arms, VirEN can substitute for the effect of PEG8000. Therefore, MEDA assembly can be completed with PEG8000 at a mass ratio of 0-5%.

Based on the results of the above tests, this example provides a typical reaction system consisting of VirEN protein, T5 exonuclease and a buffer containing dNTPs and PEG8000. For a total reaction volume of 20 μL, VirEN protein is used at 0.2 μM and T5 exonuclease at 0.04 U per total reaction system, and the reaction buffer consists of 100 mM Tris-HCl (pH 7.5), 10 mM MgCl₂, 10 mM DTT, 5% PEG8000 and 50 μM dNTPs.

VirEN and T5 exonuclease in the reaction system are premixed in advance at 2× concentration and stored until DNA is added. Repeated freezing and thawing of the 2× premixed reaction system at -80 ℃ does not affect assembly performance.

5× MEDA reaction buffer

2× MEDA assembly mixture

In some embodiments, the amount of T5 exonuclease may be 0.04-0.08 U per total reaction system, and the PEG8000-containing buffer may comprise 110±5 mM Tris-HCl (pH 7.5), 10±1 mM MgCl₂, 10±1 mM DTT and 5±1% PEG8000 by mass.

Example 9 VirEN-Mediated In Vivo Gene Editing

In addition to mediating in vitro DNA assembly, MMEJ can also be used in vivo for template-independent gene knockout or large-fragment DNA deletion. After a DNA break is introduced in the genome using CRISPR technology (or another technology that can cause DNA breaks, such as TALEN), the break can be repaired by the homologous recombination (HR) pathway, which relies on a repair template, or by the non-homologous end joining (NHEJ) pathway. MMEJ is a third repair pathway; it does not depend on a repair template and can produce longer deletions. It therefore allows gene function to be verified conveniently and quickly, or large fragments to be deleted. In particular, most microorganisms lack the NHEJ pathway, which further highlights the practical value of the MMEJ pathway.

The invention tested the gene editing effect of VirEN-mediated MMEJ in prokaryotes.

Double-strand breaks (DSBs) were introduced in the dinB gene of the E. coli MG1655 strain using CRISPR-Cas9, and VirEN-mediated MMEJ then randomly repaired the DSBs using microhomologous sequences on both sides of the dinB gene, resulting in gene knockout or large-fragment deletion in a process that is independent of a repair template. The technical scheme adopted by the invention is as follows:

1) An editing plasmid, pCas-VirEN, was constructed in which Cas9, derived from Streptococcus pyogenes, is expressed from the arabinose-inducible promoter (P_BAD), and VirEN is expressed as a fusion with the MBP domain from the T5-lac promoter. A plasmid that does not express VirEN (pCas-MBP) was also constructed as a control.

2) Before the sgRNA plasmid is constructed, a suitable target site must be selected; in this example, dinB was chosen. The target site can be designed with the aid of the CHOPCHOP website, following the principles of high efficiency and low off-target rate. The target site selected here is 5'-ggtaaggtttgtaaaaatgccgg-3', which consists of a protospacer (20 bp) and a PAM (3 bp); the spacer sequence is GGTAAGGTTTGTAAAAATGC and the PAM is CGG. PCR was performed with primers dinB-sgRNA-F/dinB-sgRNA-R, using the pTargetF plasmid stored in the laboratory as template, to obtain a DNA fragment containing the spacer sequence. The template was then digested with DpnI (Thermo Scientific FastDigest DpnI) at 37 ℃ for 1 h. After clean-up, 10 μL of the recovered product was transformed directly into DH5α competent cells, and after overnight incubation at 37 ℃ single colonies were picked for verification and sequencing. The resulting plasmid, pTargetF-sgRNA^dinB, carries a spectinomycin resistance gene as its selection marker, and the sgRNA is expressed under the control of the P_BAD promoter.

3) For gene editing, the pCas-VirEN and pTargetF-sgRNA^dinB plasmids were first transformed into the E. coli K-12 MG1655 strain to be edited as the experimental group, and the pCas-MBP and pTargetF-sgRNA^dinB plasmids were transformed into E. coli K-12 MG1655 as the control group. Transformants were then picked and inoculated into culture medium (containing kanamycin, spectinomycin and glucose at a final concentration of 2 mg/ml). The overnight culture was transferred at 1:100 into 10 ml LB medium (containing kanamycin and spectinomycin) and incubated at 37 ℃ until OD600 = 0.6-0.7. VirEN protein expression was induced by adding 0.4 mM IPTG, and the culture was incubated at 18 ℃ for 2-3 hours. Cas9 and sgRNA expression was then induced by adding 3 g/L L-arabinose, and incubation continued at 37 ℃ for 2-3 hours. The culture was plated at different dilutions on LB induction plates (containing kanamycin, spectinomycin, L-arabinose and IPTG) and incubated at 37 ℃ overnight.

4) 96 single colonies were randomly picked from each of the control and experimental plates, and colony PCR with primers dinB-F/dinB-R was used to analyze the mutation rate, that is, the proportion of colonies in which dinB was successfully knocked out. If editing was unsuccessful, agarose gel electrophoresis gives a band of 1056 bp; if a smaller fragment was deleted, a band smaller than 1056 bp is obtained; if the dinB gene and its surrounding sequences were deleted, no band is obtained. The dinB mutation rates are shown in FIG. 12: the frequency of dinB deletion obtained with VirEN was about 40%, a 4-fold improvement over the control group.

5) To further determine whether the gene deletions were generated by the MMEJ pathway, 9 clones from the experimental group were randomly picked and the PCR products obtained with dinB-F/dinB-R were sequenced; the sequencing results are summarized in Table 1. The MMEJ pathway is characterized by a pair of microhomologous sequences flanking the DSB before repair, of which only one copy remains after repair. Sequencing showed that VirEN-mediated MMEJ produced deletions of 486 bp to 16377 bp that included the sgRNA-targeted region, and that every deletion mutant had a microhomologous sequence at one end, which mapped to a short (6-11 nucleotide) sequence repeat in the genome. These results indicate that VirEN can repair double-strand breaks via the MMEJ pathway and thereby complete gene editing.
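The band-size interpretation described in step 4) can be sketched in Python as follows. This is only an illustrative sketch: the function names, the category labels and the 20 bp gel-reading tolerance are assumptions and do not come from the original procedure; only the 1056 bp wild-type amplicon size is taken from the text.

```python
from collections import Counter

WILD_TYPE_BP = 1056  # dinB-F/dinB-R amplicon size for an unedited colony (from the text)


def classify_band(band_bp, wild_type_bp=WILD_TYPE_BP, tolerance_bp=20):
    """Classify one colony PCR result.

    band_bp is the estimated band size in bp, or None when no band is seen.
    tolerance_bp is an assumed allowance for gel-reading error.
    """
    if band_bp is None:
        # No product: dinB and the surrounding primer sites were deleted.
        return "large_deletion"
    if band_bp >= wild_type_bp - tolerance_bp:
        return "unedited"
    return "small_deletion"


def deletion_rate(bands):
    """Fraction of colonies carrying any deletion."""
    if not bands:
        return 0.0
    counts = Counter(classify_band(b) for b in bands)
    return (counts["small_deletion"] + counts["large_deletion"]) / len(bands)


if __name__ == "__main__":
    example = [1056, 1056, 570, None, 1056]  # hypothetical gel readings
    print(deletion_rate(example))
```

A missing band can also result from a failed PCR, so colonies in the "large_deletion" class would still need confirmation, for example by sequencing as in step 5).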

Table 1 Sequencing results of MMEJ-mediated dinB mutations

The micro-homologous sequences flanking the deleted fragment are marked in bold
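The MMEJ signature described above, in which two copies of a microhomologous sequence flank the break before repair and a single copy remains afterwards, can be illustrated with a short Python sketch. The function name and the toy sequence are hypothetical and are not taken from Table 1; the sketch models only the sequence outcome, not the enzymatic mechanism.

```python
def mmej_repair(seq, left_start, right_start, mh_len):
    """Return (repaired_sequence, deleted_length) for an idealized MMEJ event.

    left_start and right_start are 0-based positions of the two microhomology
    copies; mh_len is their length. One copy is kept, and everything between
    the copies plus the second copy is removed.
    """
    left = seq[left_start:left_start + mh_len]
    right = seq[right_start:right_start + mh_len]
    if len(left) != mh_len or len(right) != mh_len:
        raise ValueError("microhomology runs past the end of the sequence")
    if left != right:
        raise ValueError("the two sites are not identical microhomologies")
    if right_start < left_start + mh_len:
        raise ValueError("microhomology copies must not overlap")
    repaired = seq[:left_start + mh_len] + seq[right_start + mh_len:]
    return repaired, right_start - left_start


if __name__ == "__main__":
    # Toy sequence: a 6 nt microhomology (GCTAGC) on each side of an 8 nt stretch.
    toy = "AAAA" + "GCTAGC" + "TTTTTTTT" + "GCTAGC" + "CCCC"
    repaired, deleted = mmej_repair(toy, left_start=4, right_start=18, mh_len=6)
    print(repaired, deleted)
```

The deleted length equals the distance between the two copies, which is why a sequenced junction with a single retained repeat is consistent with repair by MMEJ.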

The invention also used the type I-C CRISPR-Cas system to knock out the lacZ gene of E. coli MG1655.

Using the reported Pseudomonas aeruginosa as a template, the 4 Cas genes (Cas5, Cas7, Cas8 and Cas3) of the type I-C CRISPR-Cas system Cascade-Cas3 were amplified and cloned into an expression vector in the order listed, giving editing plasmid 1. On another expression vector, a linear dsDNA fragment obtained by annealing was cloned, in which repeat sequences of the type I-C CRISPR-Cas system flank two BsaI restriction enzyme recognition sites, giving the crRNA expression vector. To verify the gene editing effect of VirEN-mediated MMEJ in prokaryotes, for example by knocking out the lacZ gene of the E. coli K-12 MG1655 strain, a suitable target sequence was first selected with the aid of the CHOPCHOP website tool. Oligonucleotide primers encoding the spacer sequence targeting the lacZ gene were annealed, phosphorylated with T4 PNK (NEB), and cloned into the crRNA expression vector via the BsaI sites to construct pcrRNA-LacZ as editing plasmid 2.

The Cascade-Cas3 and crRNA expression elements are expressed from the inducible promoter P_BAD, and VirEN protein is expressed from the inducible T5-lac promoter. Editing plasmids 1 and 2 use different resistance genes as selection markers, and editing plasmid 2 carries the sacB gene, which confers sensitivity to sucrose.

For gene editing, editing plasmids 1 and 2 were first transformed into the E. coli K-12 MG1655 strain to be edited, and transformants were then picked and inoculated into culture medium (containing antibiotics and glucose). The overnight culture was transferred at 1:100 into 10 ml LB medium (containing antibiotics) and incubated at 37 ℃ until OD600 = 0.6-0.7. VirEN protein expression was induced by adding 0.4 mM IPTG, and the culture was incubated at 18 ℃ for 2-3 hours. Cascade-Cas3 and crRNA expression was then induced by adding 3 g/L L-arabinose, and incubation continued at 37 ℃ for 2-3 hours. The culture was plated at different dilutions on LB induction plates (containing antibiotics, L-arabinose and IPTG) supplemented with X-Gal. After overnight incubation, successfully edited cells were identified by blue-white screening (white colonies positive, blue colonies negative) and verified by picking single colonies for colony PCR and sequencing.

To achieve large-fragment deletion in the genome, two targets can be used to generate DNA double-strand breaks simultaneously. In the design, a pair of microhomologous sequences is first identified on the two sides of the segment to be deleted; then, when editing plasmid 2 is designed, the target sequences should each lie inside the segment to be deleted and as close as possible to the two microhomologous sequences. After the targets are cleaved and double-strand breaks form, the two microhomologous sequences at the breaks can anneal through MMEJ, resulting in deletion of the large fragment at the target position.
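The search for a pair of microhomologous sequences on the two sides of the segment to be deleted can be sketched as a simple k-mer lookup. The Python code below is a minimal illustration under stated assumptions: the function name, the default microhomology length of 8 nt and the 50 nt flank window are hypothetical choices, and only exact repeats on the same strand are considered.

```python
def find_flanking_microhomologies(seq, del_start, del_end, k=8, flank=50):
    """List (kmer, upstream_pos, downstream_pos) for exact k-mers found both
    upstream of del_start and downstream of del_end (0-based, end-exclusive),
    searching at most `flank` nucleotides on each side.
    """
    seq = seq.upper()
    if not (0 <= del_start < del_end <= len(seq)):
        raise ValueError("invalid deletion coordinates")
    up_lo = max(0, del_start - flank)
    down_hi = min(len(seq), del_end + flank)

    # Index every k-mer in the upstream flank by its start position.
    upstream = {}
    for i in range(up_lo, del_start - k + 1):
        upstream.setdefault(seq[i:i + k], []).append(i)

    # Look up each downstream k-mer in the upstream index.
    pairs = []
    for j in range(del_end, down_hi - k + 1):
        kmer = seq[j:j + k]
        for i in upstream.get(kmer, []):
            pairs.append((kmer, i, j))

    # Prefer pairs lying closest to the segment boundaries.
    pairs.sort(key=lambda p: (del_start - (p[1] + k)) + (p[2] - del_end))
    return pairs


if __name__ == "__main__":
    # Toy example: GATTACAG occurs on both sides of a 20 nt segment.
    toy = "CCCCCCCC" + "GATTACAG" + "T" * 20 + "GATTACAG" + "AAAAAAAA"
    print(find_flanking_microhomologies(toy, del_start=16, del_end=36))
```

Candidate pairs found this way would then guide where the two target sequences are placed inside the segment, as close as possible to each microhomology.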

While the invention has been described in detail in the foregoing general description and with reference to specific embodiments thereof, it will be apparent to one skilled in the art that modifications and improvements can be made thereto. Accordingly, such modifications or improvements may be made without departing from the spirit of the invention and are intended to be within the scope of the invention as claimed.

Claims

1. Use of VirEN protein in DNA assembly or genome editing, characterized in that said protein simultaneously satisfies all of the following characteristics:

(1) Contains a VirE N-terminal domain;

Optionally, the VirE N-terminal domain contains two αβ units at the N-terminus and one RNA Recognition Motif (RRM) fold at the C-terminus;

2. The use according to claim 1, wherein said protein is a prokaryotic Argonaute-associated protein;

Optionally, the gene encoding the protein is located within 20 gene regions upstream or downstream of the Argonaute-associated protein encoding gene;

Optionally, the amino acid sequence of the protein is shown as SEQ ID NO. 1-5.

3. An in vitro DNA assembly reaction system, comprising the following components:

1) A VirEN protein according to any one of claims 1-2;

3)dNTP；

4) A buffer containing divalent metal ions.

4. A system according to claim 3, wherein the VirEN protein is at a concentration of no less than 0.05 μM;

Optionally, the VirEN protein is at a concentration of 0.2 μM.

5. A system according to claim 3, wherein the concentration of dNTPs is 5 to 500 μM;

Optionally, the concentration of dNTPs is 50 μM.

6. The system of claim 3, wherein the buffer containing divalent metal ions is a PEG8000 buffer;

Optionally, the mass ratio of PEG8000 in the PEG8000 buffer is 0-5%;

Optionally, the PEG8000 buffer comprises 100±5 mM Tris-HCl (pH 7.5), 10±1 mM MgCl₂, 10±1 mM DTT and PEG8000 at a mass ratio of 0-5%.

7. An in vitro DNA assembly method, characterized in that 2-4 fragments to be assembled are added and reacted for 20-40 min at 30-37 ℃ using the system of any one of claims 3-6.

8. A composition, comprising:

1) A VirEN protein according to any one of claims 1-2;

2) A nuclease system capable of causing a double-strand break at the target site;

Optionally, the nuclease system of 2) comprises a CRISPR-Cas system, a TALEN (transcription activator-like effector nuclease) system or a zinc-finger nuclease (ZFN) system.

9. A vector system capable of producing the composition of claim 8.

10. Use of the composition of claim 8 or the vector system of claim 9 for the deletion of fragments at genomic target positions, wherein the deleted fragments comprise microhomologous sequences on both sides;

Optionally, the deletion fragment is between 486bp and 16377 bp.