WO2019153902A1 - Method for site-directed substitution in plant genomes - Google Patents

Method for site-directed substitution in plant genomes

Info

Publication number
WO2019153902A1
Authority
WO
WIPO (PCT)
Prior art keywords
nucleic acid
acid construct
promoter
plant
vector
Prior art date
Application number
PCT/CN2018/122014
Other languages
English (en)
French (fr)
Inventor
朱健康
华凯
Original Assignee
中国科学院上海生命科学研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国科学院上海生命科学研究院
Publication of WO2019153902A1

Classifications

    • A: HUMAN NECESSITIES
    • A01: AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01H: NEW PLANTS OR NON-TRANSGENIC PROCESSES FOR OBTAINING THEM; PLANT REPRODUCTION BY TISSUE CULTURE TECHNIQUES
    • A01H4/00: Plant reproduction by tissue culture techniques; Tissue culture techniques therefor
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79: Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82: Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79: Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82: Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8201: Methods for introducing genetic material into plant cells, e.g. DNA, RNA, stable or transient incorporation, tissue culture methods adapted for transformation
    • C12N15/8213: Targeted insertion of genes into the plant genome by homologous recombination
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N5/00: Undifferentiated human, animal or plant cells, e.g. cell lines; Tissues; Cultivation or maintenance thereof; Culture media therefor
    • C12N5/10: Cells modified by introduction of foreign genetic material

Definitions

  • the present invention relates to the field of biotechnology, and in particular to a method for site-directed substitution in plant genomes.
  • a first aspect of the invention provides a nucleic acid construct having, from 5' to 3', the structure of formula I:
  • I1 is the first integration element
  • I2 is the second integration element
  • Z1 is the first expression cassette
  • Z2 is a second expression cassette
  • one of the Z1 and Z2 expression cassettes has the structure of formula Ia, and the other expression cassette has the structure of formula Ib:
  • P1, S1, X1, L1, X2, L2, X3, P2, and Y1 are each elements of the construct
  • P1 is a first promoter, and the first promoter includes a ubiquitin promoter
  • S1 is absent or is the coding sequence of a signal peptide
  • X1 is the coding sequence of an adenine deaminase (such as wild-type and/or mutant TadA)
  • L1 is absent or is the coding sequence of a first linker peptide
  • X2 is the coding sequence of a Cas9 nuclease that has no cleavage activity or only single-strand cleavage activity
  • L2 is absent or is the coding sequence of a second linker peptide
  • X3 is a coding sequence of a nuclear localization signal, and the nuclear localization signal is VirD2;
  • P2 is the second promoter
  • Y1 is the coding sequence of sgRNA
  • each "-" is a bond or nucleotide linkage sequence.
  • the ubiquitin promoter comprises a maize ubiquitin promoter.
  • the ubiquitin promoter comprises a UBI promoter.
  • the second promoter comprises a U6 promoter.
  • the "no cleavage activity or single-strand cleavage activity" means that the Cas9 nuclease does not cleave the single strand on which the target site T is located.
  • nucleotide elements of the present invention are ligated in-frame to express a fusion protein having the correct amino acid sequence.
  • the construct has the structure of Formula IIa or Formula IIb:
  • the 5th to 10th positions of the sgRNA correspond to a position (i.e., T) at which a T→C site-directed mutation is predetermined to occur.
  • positions 6-14 of the sgRNA correspond to a position (i.e., T) at which a T→C site-directed mutation is predetermined to occur.
  • the 12th, 13th, and/or 14th position of the sgRNA corresponds to a position (i.e., T) at which a T→C site-directed mutation is predetermined to occur.
  • sequence lengths of L1 and L2 are each independently from 3 to 120 nt, preferably from 3 to 96 nt, and preferably a multiple of three.
  • the nucleotide linkage sequence is from 1 to 300 nt in length, preferably from 1 to 100 nt.
  • the first expression cassette and the second expression cassette each have a terminator.
  • the first integration element comprises a 5' homology arm sequence.
  • the first integration element is an RB sequence.
  • the RB sequence is set forth in SEQ ID NO.: 1 (TAAACGCTCTTTTCTCTTAGGTTTAC).
  • the signal peptide comprises a nuclear localization signal peptide of VirD2.
  • the Cas9 nuclease is selected from the group consisting of Cas9, Cas9n, or a combination thereof.
  • the Cas9 nuclease is a mutated Cas9 nuclease.
  • the mutation site is at the D10A position of the Cas9 nuclease (SEQ ID NO.: 2).
  • amino acid sequence of the Cas9 nuclease in the X2 element is set forth in SEQ ID NO.:69.
  • the source of the X2 element is selected from the group consisting of Streptococcus pyogenes, Staphylococcus aureus, or a combination thereof.
  • the linker sequence comprises XTEN.
  • the linker sequence is set forth in SEQ ID NO.: 3 (TCTGGAGGGTCCTCCGGCGGATCGTCCGGCAGCGAGACGCCAGGCACCTCCGAGAGCGCTACGCCTGAATCCTCCGGGGGATCTTCAGGAGGATCA).
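As a quick check on the linker coding sequence of SEQ ID NO.: 3, the following Python sketch translates it with the standard genetic code and confirms that its length is a multiple of three. The helper name `translate` and the compact codon-table construction are illustrative choices and are not part of the application; any whitespace in the pasted sequence is stripped before translation.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    dna = "".join(dna.split()).upper()  # drop whitespace left over from copy-paste
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# SEQ ID NO.: 3 as quoted in the text.
LINKER = (
    "TCTGGAGGGTCCTCCGGCGGATCGTCCGGCAGCGAGACGCCAGGCACCTCCGAGAGCGCTACGCC"
    "TGAATCCTCCGGGGGATCTTCAGGAGGATCA"
)

linker_peptide = translate(LINKER)
print(len(LINKER), "nt ->", len(linker_peptide), "aa:", linker_peptide)
```

The 96 nt sequence encodes 32 residues with no stop codon, which is consistent with the in-frame fusion requirement and the stated length range for L1 and L2.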
  • the adenine deaminase comprises TadA.
  • the adenine deaminase comprises wild type and mutant form.
  • the adenine deaminase mutant comprises TadA7-10.
  • the adenine deaminase is a tandem adenine deaminase, and the tandem adenine deaminase structure is as shown in formula II:
  • Z8 is the amino acid sequence of the wild type adenine deaminase TadA
  • L8 is an optional linker peptide sequence
  • Z9 is the amino acid sequence of the mutant adenine deaminase TadA7-10.
  • the first promoter is derived from one or more plants selected from the group consisting of corn, rice, soybean, Arabidopsis, and tomato.
  • the second promoter is derived from one or more plants selected from the group consisting of rice, corn, soybean, Arabidopsis, and tomato.
  • the second integration element comprises a 3' homology arm sequence.
  • the second integration element is an LB sequence.
  • the LB sequence is set forth in SEQ ID NO.: 4 (TGTTTACACCACAATATATCCTGCCA).
  • the nuclear localization signal is derived from Agrobacterium.
  • the nucleic acid construct has a length of from 5,000 to 10,000 bp, preferably from 7,500 to 8,500 bp.
  • one or more additional expression cassettes are additionally inserted.
  • the additional expression cassette is independent of the first expression cassette and the second expression cassette.
  • the additional expression cassette expresses a substance selected from the group consisting of:
  • the marker gene comprises a resistance gene (hygromycin gene), a fluorescent gene, or a combination thereof.
  • a second aspect of the invention provides a vector comprising the nucleic acid construct of the first aspect of the invention.
  • the vector is a plant expression vector.
  • the vector is an expression vector that can be transfected or transformed into a plant cell.
  • the vector is an Agrobacterium Ti vector.
  • the construct is integrated into the T-DNA region of the vector.
  • the vector is circular or linear.
  • a third aspect of the invention provides a genetically engineered cell that comprises the nucleic acid construct of the first aspect of the invention, or whose genome has integrated one or more of the nucleic acid constructs of the first aspect of the invention.
  • the cell is a plant cell.
  • the plant is selected from the group consisting of a gramineous plant, a leguminous plant, a cruciferous plant, or a combination thereof.
  • the plant comprises: Arabidopsis thaliana, wheat, barley, oats, corn, rice, sorghum, millet, soybean, peanut, tobacco, tomato, or a combination thereof.
  • the genetically engineered cell is obtained by introducing the construct of the first aspect of the invention into a cell by a method selected from the group consisting of Agrobacterium transformation, gene gun, microinjection, electroporation, sonication, and polyethylene glycol (PEG)-mediated methods.
  • a fourth aspect of the invention provides a method for genetically editing a plant, comprising the steps of:
  • step (ii) and step (iii) are from the same site.
  • the introduction is performed by Agrobacterium.
  • the introduction is performed by a gene gun.
  • the gene editing is a site-directed base substitution (or mutation).
  • the site-directed substitution comprises mutating T to C and/or A to G.
  • the plant is selected from the group consisting of a gramineous plant, a leguminous plant, a cruciferous plant, or a combination thereof.
  • the plant comprises: Arabidopsis thaliana, wheat, barley, oats, corn, rice, sorghum, millet, soybean, peanut, tobacco, tomato, or a combination thereof.
  • a fifth aspect of the invention provides a method of preparing a transgenic plant cell, comprising the steps of:
  • the transfection is performed using an Agrobacterium transformation method or a gene gun bombardment method.
  • a sixth aspect of the invention provides a method of preparing a transgenic plant cell, comprising the steps of:
  • a seventh aspect of the invention provides a method of preparing a transgenic plant, comprising the steps of:
  • the transgenic plant cell prepared by the method of the fifth aspect of the invention or the method of the sixth aspect of the invention is regenerated into a plant body, thereby obtaining the transgenic plant.
  • An eighth aspect of the invention provides a transgenic plant, characterized in that the plant is prepared by the method of the seventh aspect of the invention.
  • Figure 1 shows efficient base editing of A·T to G·C in rice.
  • (A) Schematic diagrams of the two rice adenine base editing vectors pRABEsp-OsU6 and pRABEsa-OsU6sa.
  • (B) Schematic representation of the sgRNA1 target site in the OsSPL14 gene. The sequence to which OsmiR156 binds in OsSPL14 is highlighted in red.
  • (C) Sequencing chromatograms of two representative OsSPL14-edited plants, SG1-7 and SG1-15.
  • (D) Schematic representation of the sgRNA2 target site in the SLR1 gene. The T-C substitution at position 6 of the protospacer region may change the amino acid V at position 92 in the TVHYNP motif to A.
  • Figure 2 shows the results of TA clone sequencing of two base-edited plants (SG1-7 and SG1-15) in which sgRNA1 targets the OsSPL14 gene. Twenty clones per plant were randomly selected for sequencing; representative sequencing chromatograms are shown for each genotype.
  • Figure 3 shows that pRABEsp-OsU6 can be used for multiplex base editing in rice.
  • (A) sgRNA3 was designed to simultaneously target and edit three genes in the rice genome. The OsmiR156-binding nucleotides in the three genes are highlighted in red.
  • (B) Representative sequencing chromatograms of the three target sites. In the SG3-11 and SG3-12 plants, the target sites in OsSPL16 and OsSPL18 were simultaneously edited, but the target site in LOC_Os02g24720 was wild-type in both plants.
  • Figure 4 shows that pRABEsa-OsU6sa performs simultaneous base editing on OsSPL16 and OsSPL18.
  • (A) sgRNA5 can simultaneously target two sites, in OsSPL16 and OsSPL18.
  • (B) Sequencing chromatograms of three representative transgenic plants, SG5-7, SG5-18, and SG5-44, at the two target sites. Both genes were simultaneously edited in the three selected transgenic plants.
  • Figure 5 shows the editing of the OsSPL14 target site by ABE-P1S.
  • (A) Vector schematic of the ABE-P1S base editor. (B) Schematic diagram of the sgRNA1 target site in the OsSPL14 gene. The sequence to which OsmiR156 binds in OsSPL14 is highlighted in red. (C) Sequencing chromatograms of the two representative lines, Line 7 and Line 12, at the OsSPL14 target site, with black arrows indicating the sites where base substitution occurred. (D) ABE-P1S off-target editing of the OsSPL17 locus. The sequence to which OsmiR156 binds in OsSPL17 is highlighted in red, with mismatched bases in lowercase and black arrows indicating the site where base substitution occurred.
  • Figure 6 shows the editing of the SPX2-MFS2 locus by ABE-P2S.
  • Figure 7 shows the editing of the OsSPL13 locus by ABE-P5 and ABE-P5S.
  • after extensive and intensive research, the present inventors screened for a specific promoter that drives expression of a fusion protein composed of a Cas9 nuclease, an adenine deaminase, and the nuclear localization signal VirD2, and for a specific promoter that drives transcription of the sgRNA. By using a nucleic acid construct of a specific structure, the present invention achieves, for the first time, sgRNA-guided site-directed base mutation in plants (e.g., T to C or A to G) with very high mutation efficiency (reaching 60% or higher). The applicant also unexpectedly discovered that the simplified adenine base editor ABE-P1S has higher base editing efficiency. The present invention has been completed on this basis.
  • homologous arm refers to a flanking sequence on a targeting vector that is identical to the genomic sequence flanking the site where a foreign sequence is to be inserted, and that serves to identify the region where recombination occurs.
  • plant promoter refers to a nucleic acid sequence capable of initiating transcription of a nucleic acid in a plant cell.
  • the plant promoter may be derived from a plant, a microorganism (such as a bacterium, a virus) or an animal, or a synthetic or engineered promoter.
  • base mutation refers to a substitution, insertion, and/or deletion of a base at a position of a nucleotide sequence.
  • base substitution refers to the mutation of a base at a position of a nucleotide sequence to another different base, such as a T mutation to C.
  • A·T to G·C refers to the mutation or replacement of an A-T base pair at a given position with a G-C base pair in a double-stranded nucleic acid sequence, particularly a genomic sequence.
  • screening marker gene refers to a gene used for screening a transgenic cell or a transgenic animal in a transgenic process
  • the screening marker gene useful in the present application is not particularly limited and includes the various screening marker genes commonly used in the transgenic field. Representative examples include (but are not limited to): the hygromycin resistance gene (HYG), the G418 and kanamycin resistance gene (NPTII), the Basta resistance gene (BAR), the puromycin resistance gene (PAC), and/or the neomycin resistance gene (NEO).
  • Cas protein refers to a nuclease.
  • a preferred Cas protein is the Cas9 protein.
  • Typical Cas9 proteins include, but are not limited to, Cas9 derived from Streptococcus pyogenes.
  • the Cas9 protein is a mutated Cas9 protein, specifically, a mutant Cas9 protein having no cleavage activity or only single-strand cleavage activity.
  • the term "coding sequence of a Cas protein" refers to a nucleotide sequence that encodes a Cas protein.
  • the skilled artisan will recognize that, because of the degeneracy of the genetic code, a large number of polynucleotide sequences can encode the same polypeptide.
  • the skilled person will also recognize that different species have different codon preferences, and that the codons of the Cas protein may be optimized according to the needs of expression in a given species. These variants are all specifically covered by the term "coding sequence of a Cas protein".
  • the term specifically encompasses a full-length sequence substantially identical to the Cas gene sequence, as well as a sequence encoding a protein that retains the function of the Cas protein.
  • nucleotide sequence is from the 5' to 3' direction unless otherwise specified.
  • adenine deaminase refers to TadA adenine deaminase, derived from E. coli, which acts on tRNA and is capable of deaminating a specific adenine in a tRNA.
  • a suitable TadA comprises both a wild-type form and a specific mutant form thereof, TadA7-10, and may also comprise a combination of a wild-type form and a mutant form.
  • TadA7-10 is capable of performing a deamination reaction using DNA as a substrate.
  • the coding sequence of the adenine deaminase of the present invention is optimized for codons to enable more efficient expression in plants.
  • the present invention provides a nucleic acid construct for gene editing of a plant having a structure of formula I of 5'-3':
  • I1 is the first integration element
  • I2 is the second integration element
  • Z1 is the first expression cassette
  • Z2 is the second expression cassette
  • one of the Z1 and Z2 expression cassettes has the structure of formula Ia, and the other expression cassette has the structure of formula Ib:
  • I1, P1, S1, X1, L1, X2, L2, X3, L3, P2, Y1, and I2 are each elements of the construct, and are defined as described in the first aspect of the invention
  • each "-" is a bond or nucleotide linkage sequence.
  • the I1 element (or the left integration element) and the I2 element (or the right integration element) can cooperate to integrate the elements located between them (i.e., the nucleotide sequence from P1 to Y1) into the genome of plant cells.
  • I1 and I2 are Ti elements derived from Agrobacterium. Of course, other elements that function similarly can also be used in the present invention.
  • constructs of the invention are either known in the art or can be prepared by methods known to those skilled in the art.
  • the constructs of the present invention can be formed by conventional methods, such as PCR methods, full artificial chemical synthesis, and enzymatic cleavage methods, and then joined together by well-known DNA ligation techniques.
  • the vector of the present invention is constructed by inserting the construct of the present invention into an exogenous vector, especially a vector suitable for transgenic plant manipulation.
  • the transgenic plant cells are prepared by transforming the vector of the present invention into plant cells so that the vector mediates integration of the construct into the plant cell chromosome.
  • transgenic plant cells of the present invention are regenerated into plant bodies to obtain transgenic plants.
  • the above nucleic acid construct constructed by the present invention can be introduced into a plant cell by a conventional plant recombination technique (for example, Agrobacterium transfer technology) to obtain plant cells carrying the nucleic acid construct (or a vector carrying the nucleic acid construct), or plant cells having the nucleic acid construct integrated into their genome.
  • the main feature of this vector is the linkage of adenine deaminase to the Cas protein in the CRISPR/Cas system and the coding sequence of the nuclear localization signal VirD2 to form the coding sequence for the fusion protein.
  • after the fusion protein encoded by the coding sequence is expressed in the cytoplasm, it can be transferred to the nucleus very efficiently and directed to the target site in the genome by the guide RNA encoded by the construct of formula I. A·T to G·C base substitutions are thereby made at the target site, and the risk of insertion/deletion is substantially avoided or eliminated.
  • the Cas protein is a mutant Cas protein having no cleavage activity.
  • the Cas protein of the present invention may be SaKKH-Cas9 (D10A), the amino acid sequence of which is set forth in SEQ ID NO.:69.
  • the proteins are usually linked by some flexible short peptides, namely Linker (linker peptide sequence).
  • the Linker can use XTEN.
  • suitable promoters include constitutive and/or inducible promoters.
  • a strong promoter suitable for plant cells can be selected, and representative examples include, but are not limited to, the CaMV 35S promoter or the UBI promoter or the Actin promoter and the like.
  • the expression cassette of the guide RNA suitable for plant cells is selected and constructed in the same vector as the open reading frame (ORF) of the above fusion protein.
  • the region on which the deaminase acts is fixed.
  • in human cell lines, the deamination region is the 4th to 7th bases of the protospacer. The experimental results obtained by the present invention indicate that with SpCas9 the base editing window is positions 5 to 10 of the protospacer (original spacer sequence), and with SaCas9 the base editing window is positions 6 to 14 of the protospacer. On this principle, the target should be designed so that the adenine to be edited lies within the "deamination window".
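The design rule above can be sketched in Python as a lookup of which occurrences of the target base fall inside the reported editing window. The window bounds (positions 5 to 10 for SpCas9, 6 to 14 for SaCas9) come from the text; the function name, the dictionary, and the toy protospacer are hypothetical and serve only as illustration. The `base` parameter allows the same check for T when the protospacer is written on the strand where the edit reads as T to C.

```python
# Editing windows reported in the text, as 1-based inclusive protospacer positions.
EDITING_WINDOWS = {"SpCas9": (5, 10), "SaCas9": (6, 14)}


def editable_positions(protospacer: str, cas: str, base: str = "A") -> list[int]:
    """Return 1-based positions of `base` that lie inside the editing window."""
    start, end = EDITING_WINDOWS[cas]
    seq = protospacer.upper()
    last = min(end, len(seq))  # guard against protospacers shorter than the window
    return [pos for pos in range(start, last + 1) if seq[pos - 1] == base]


# Toy 20-nt protospacer (hypothetical, not a sequence from the application).
toy_protospacer = "ACGTACGTACGTACGTACGT"
sp_hits = editable_positions(toy_protospacer, "SpCas9")
sa_hits = editable_positions(toy_protospacer, "SaCas9")
print("SpCas9 window:", sp_hits)
print("SaCas9 window:", sa_hits)
```

A candidate target would be kept only if the base to be edited appears in the returned list; an empty list suggests choosing a different protospacer or a different Cas9 variant.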
  • there is no particular limitation on the method of introducing the construct of the present invention into a cell or integrating it into a genome.
  • This can be carried out by a conventional method, for example, by introducing a construct of formula I or a corresponding vector into a plant cell by a suitable method.
  • Representative methods of introduction include, but are not limited to, Agrobacterium-mediated transformation, gene gun, microinjection, electroporation, sonication, and polyethylene glycol (PEG)-mediated methods.
  • the recipient plant is not particularly limited, and includes various crop plants (such as gramineous plants), forestry plants, horticultural plants (such as flowering plants), and the like.
  • Representative examples include, but are not limited to, rice, soybean, tomato, corn, tobacco, wheat, sorghum, and the like.
  • the DNA in the transformed plant cell expresses the fusion protein and gRNA.
  • the Cas protein fused to adenine deaminase mutates the T at the target position to C (and thus the A mutation of the complementary strand to G) under the guidance of the corresponding gRNA.
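The equivalence between a T-to-C change on one strand and an A-to-G change on the complementary strand can be illustrated with a short Python sketch. The sequence and helper names below are hypothetical and chosen only to make the strand relationship visible.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def substitute(seq: str, position: int, old: str, new: str) -> str:
    """Replace `old` with `new` at a 1-based position, checking the original base."""
    i = position - 1
    if seq[i] != old:
        raise ValueError(f"expected {old} at position {position}, found {seq[i]}")
    return seq[:i] + new + seq[i + 1:]


top = "GATTACA"                            # toy strand carrying the T to be changed
edited_top = substitute(top, 3, "T", "C")  # T -> C on this strand

bottom = reverse_complement(top)
edited_bottom = reverse_complement(edited_top)

# Collect (0-based index, before, after) for every difference on the other strand.
changes = [
    (i, before, after)
    for i, (before, after) in enumerate(zip(bottom, edited_bottom))
    if before != after
]
print(edited_top, changes)
```

The single difference on the complementary strand is an A replaced by G, at the position that pairs with the edited T.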
  • the corresponding transgenic plants can be regenerated by conventional methods.
  • a plant after base substitution is obtained by tissue culture.
  • the invention can be used in the field of plant genetic engineering for plant research and breeding, especially genetic improvement of economically valuable crops, forestry crops or horticultural plants.
  • the present invention provides, for the first time, an efficient method for realizing site-directed base mutations (T→C or A→G) in plants, which can be widely used for plant research and breeding.
  • the method of site-directed base mutation of the present invention can efficiently perform base mutation (e.g., T→C or A→G) at a specific position of a plant cell.
  • the present invention enables simultaneous editing of multiple sites in a plant genome.
  • by using different forms of Cas9, the present invention expands the range of sites in the plant genome that can be edited in a targeted manner.
  • the present invention significantly reduces or substantially eliminates the risk of insertion and/or deletion at the target site.
  • the present inventors have found that base substitution efficiency is higher when the 12th, 13th, and/or 14th position of the sgRNA (counted from 5' to 3') corresponds to the position (i.e., T) at which a T→C site-directed mutation is intended to occur.
  • the Escherichia coli wild-type tRNA adenine deaminase gene TadA (ecTadA) and its mutant form TadA7-10 (ecTadA*7.10) were synthesized by a conventional method.
  • the 3' ends of TadA and TadA7-10 were each further supplemented with a 96 bp linker coding sequence encoding a 32 amino acid residue linker sequence.
  • AarI restriction sites were added to both ends of TadA-linker and TadA7-10-linker, respectively.
  • the Cas9 gene in the pCas9 (OsU6) vector was driven by the maize ubiquitin promoter and the sgRNA backbone was driven by the rice U6 promoter.
  • the Cas9 (D10A) nickase and SaCas9 (D10A) nickase were amplified by PCR using pCas9 (OsU6) and pX600 (Addgene, #61592) as templates.
  • the upstream primer has two AarI restriction sites at the 5' end, and the VirD2 nuclear localization signal is added to the 3' end of the downstream primer.
  • the amplified product is recovered and used to replace the Cas9 gene in pCas9 (OsU6).
  • the intermediate vectors pRSp-OsU6 and pRSa-OsU6 were obtained, respectively.
  • the TadA-linker and TadA7-10-linker fragments were ligated into pRSp-OsU6 and pRSa-OsU6 by Golden Gate, and the pRABESp-OsU6 and pRBESa-OsU6 vectors were obtained, respectively.
  • the rice U6 promoter, two BsaI cleavage sites, and the sgRNA backbone of SaCas9 were joined by overlap PCR and used to replace the OsU6-sgRNA fragment in pRBESa-OsU6 to obtain the vector pRABESa-OsU6Sa.
  • the OsU6-SsgRNA expression cassette was excised with HindIII and XmaI from the previously constructed pRABESa-OsU6Sa vector and used to replace the OsU6-SpsgRNA expression cassette in pRSABESa-OsU6, yielding the pRSABESa-OsU6Sa vector, which corresponds to the adenine base editor ABE-P2S.
  • the pRABESp-OsU6 and pRABESa-OsU6Sa vectors contain a hygromycin gene on the backbone for screening of transgenic plants.
  • the sequence of sgRNA1 to sgRNA5 was synthesized.
  • the primers were annealed on a PCR machine to form a short oligonucleotide linker, and the oligonucleotide linker was inserted into the pRABESp-OsU6, pRABESa-OsU6Sa vector digested with BsaI.
  • Leaf DNA of all transgenic rice plants was extracted by the CTAB method. Primers were designed about 250 bp upstream and downstream of the target site, and the target site was amplified by PCR. The PCR product was recovered and sent directly to a sequencing company. The sequencing results were analyzed with Sequencher software. For some plants that were edited at the target site, further validation was performed by TA cloning. Specifically, the target site was amplified by PCR, purified, and ligated into the p-EASY blunt Zero vector, which was then transformed into Escherichia coli. Twenty clones were randomly selected for sequencing. The efficiency of base editing is calculated by dividing the number of plants carrying base edits by the total number of transgenic plants.
  • Predictive analysis was performed on the off-target sites of sgRNA1 and sgRNA4. Sites in the rice genome with fewer than 5 mismatched bases relative to the sgRNA1 or sgRNA4 sequence are considered potential off-target sites. Primers were designed about 250 bp upstream and downstream of all potential off-target sites, and these potential off-target sites were amplified by PCR. The PCR products were purified and sent directly to a sequencing company. The sequencing results were analyzed with Sequencher software.
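The mismatch criterion above (fewer than 5 mismatched bases) can be sketched as a brute-force scan in Python. This is a toy illustration under stated assumptions, not the prediction tool used in the application, which is not named in the text: the guide and genome strings are hypothetical, PAM requirements are ignored, and a real rice-genome search would need an indexed aligner rather than a sliding window.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def mismatches(a: str, b: str) -> int:
    """Hamming distance between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def find_potential_off_targets(genome: str, guide: str, max_mismatches: int = 4):
    """Return (start, strand, mismatch_count) for every window within the threshold.

    Start coordinates are 0-based on the forward strand; "-" hits are windows
    that match the reverse complement of the guide.
    """
    k = len(guide)
    references = (("+", guide), ("-", reverse_complement(guide)))
    hits = []
    for start in range(len(genome) - k + 1):
        window = genome[start:start + k]
        for strand, ref in references:
            count = mismatches(window, ref)
            if count <= max_mismatches:  # "fewer than 5" mismatches
                hits.append((start, strand, count))
    return hits


# Hypothetical 20-nt guide and a toy "genome" holding an exact site and a 2-mismatch site.
guide = "GATTACAGATTACAGGCCTA"
near_match = "GATTACAGCTTACAGGCTTA"
genome = "TTTT" + guide + "CCCC" + near_match + "AAAA"

hits = find_potential_off_targets(genome, guide)
for hit in hits:
    print(hit)
```

The exact match corresponds to the on-target site; every other hit would be a candidate for the PCR and sequencing check described above.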
  • the present invention synthesizes the coding sequence of wild-type ecTadA and its mutant form ecTadA*7.10. They are then ligated together by using a linker coding sequence encoding 32 amino acid residues. Next, the coding sequence of the recombinant protein was fused to the N-terminus of the Cas9 (D10A) nickase coding sequence having the same linker. Finally, the coding sequence of the VirD2 nuclear localization signal was ligated to the C-terminus of the Cas9 (D10A) nickase to form ABE7-10.
  • ABE7-10 was then cloned into a binary vector under the control of the maize ubiquitin promoter and the sgRNA was driven by the rice U6 promoter to form the vector pRABEsp-OsU6 (Fig. 1A).
  • IPA1 which regulates the ideal plant type of rice, as a target gene (Fig. 1B).
  • the present invention designed an sgRNA (sgRNA1) to target the OsmiR156 binding site sequence of OsSPL14 (Fig. 1B), and the corresponding primers for sgRNA1 vector construction are sgRNA1F and sgRNA1R in Table 3.
  • the binary vector was transformed into Agrobacterium by freeze-thaw method, and then Agrobacterium tumefaciens was used to infect the callus of Nipponbare, and 23 independent transgenic lines were obtained. Then, the target region was amplified by PCR and genotyped by Sanger sequencing, and the amplification primers were SPL14-seq-F and SPL14-seq-R in Table 3.
  • the present invention found that two transgenic plants (SG1-7 and SG1-15) were edited at position 10 of the protospacer (Fig. 1C). To further confirm the Sanger sequencing results, SG1-7 and SG1-15 were selected for TA cloning, and 20 clones from each were randomly selected for sequencing. Interestingly, 11 clones from SG1-7 had a T-C substitution at position 5 of the protospacer and 9 clones had a T-C substitution at position 10, indicating that this plant is bi-allelic (Fig. 2). However, only 15% of the SG1-15 clones (3/20) had T-C substitutions at positions 5 and 7 of the protospacer, indicating that this plant is chimeric (Fig. 2).
  • SLR1 encodes a DELLA protein in rice that acts as a repressor in the GA signaling pathway.
  • the present invention designs sgRNA (sgRNA2) directed against the TVHYNP domain of SLR1.
  • the primers corresponding to the sgRNA2 vector construction were sgRNA2F and sgRNA2R in Table 3.
  • the T-C base substitution at position 6 of the Protospacer resulted in a V92A substitution in the TVHYNP motif (Fig. 1D).
  • the target region amplification primers are SLR-seq-F and SLR-seq-R in Table 3. Since there is only one editable T in the base editing window of SLR1, no other mutant forms were found in the target locus.
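Why a single T-to-C substitution can produce the V92A change follows from the standard genetic code: every valine codon has the form GTN and every alanine codon has the form GCN. The Python sketch below checks this for all four valine codons. The actual SLR1 codon at position 92 is not given in the text, so no specific codon is assumed, and the variable names are illustrative.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

valine_codons = sorted(codon for codon, aa in CODON_TABLE.items() if aa == "V")

# Apply T -> C at the second codon position, the only T shared by all valine codons.
edited_codons = {codon: codon[0] + "C" + codon[2] for codon in valine_codons}

for before, after in edited_codons.items():
    print(f"{before} ({CODON_TABLE[before]}) -> {after} ({CODON_TABLE[after]})")
```

Whichever valine codon is present, the edit at its middle base yields alanine, so the amino acid outcome does not depend on the third codon position.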
  • Example 3 Detecting whether the base editing system of the present invention can simultaneously edit two or more sites in the rice genome
  • a third sgRNA (sgRNA3) was designed to simultaneously target the OsmiR156 binding sites of OsSPL16 and OsSPL18 (Fig. 3, A and B).
  • the primers corresponding to the construction of the sgRNA3 vector were sgRNA3F and sgRNA3R in Table 3.
  • an off-target site that is 100% matched to the sgRNA3 sequence was found in the intron of the LOC_Os02g24720 gene (Fig. 3, A and B). Therefore, sgRNA3 can simultaneously target three sites in the rice genome.
  • the present invention genotyped these three target sites in 21 transgenic lines.
  • Base editing using the pRABEsp-OsU6 vector requires a PAM sequence containing NGG downstream of the protospacer. This requirement significantly limits the number of loci in the rice genome that can be edited by pRABEsp-OsU6.
  • to further expand the targets in the rice genome that the rice adenine base editor can edit, the Cas9 (D10A) nickase and its sgRNA scaffold in the pRABEsp-OsU6 vector were replaced with the SaCas9 (D10A) nickase and the SaCas9 sgRNA scaffold; the resulting vector pRABEsa-OsU6sa recognizes a different PAM sequence, NNGRRT (Fig. 1A).
  • a fourth sgRNA (sgRNA4) was designed that simultaneously targets the OsmiR156 binding sites of OsSPL14 and OsSPL17 (Fig. 1F).
  • the primers corresponding to the sgRNA4 vector construction were sgRNA4F and sgRNA4R in Table 3. Notably, despite the different PAM sequences, the protospacer sequence recognized by sgRNA4 overlaps with the protospacer sequence recognized by sgRNA2 (Figures 1F and 1B). Of the 31 transgenic lines genotyped, 14 lines had T-C substitutions at the OsSPL14 target site and 19 lines had T-C substitutions at the OsSPL17 target site.
  • the amplification primers for the target sites are SPL14-seq-F/SPL14-seq-R and SPL17-seq-F/SPL17-seq-R in Table 3. The base editing efficiency of pRABEsa-OsU6sa at the OsSPL14 and OsSPL17 target sites was therefore 45.2% and 61.3%, respectively, much higher than that of sgRNA2-targeted pRABEsp-OsU6 (Fig. 1H). More importantly, 13 lines (41.9%) were edited at both target sites simultaneously.
  • the base editing window of the SaCas9 (D10A) nickase is broader than that of the Cas9 (D10A) nickase, probably because more single-stranded DNA is exposed to the adenine deaminase while SaCas9 (D10A) induces formation of the R-loop complex.
  • unexpectedly, even the T at positions 12 and 14 of the protospacer was edited at both target sites.
  • the specific base editing windows are shown in Tables 4 and 5.
  • Note (Table 4): base editing positions were counted from the PAM-distal end, scoring the PAM as positions 21-23.
  • Note (Table 5): base editing positions were counted from the PAM-distal end, scoring the PAM as positions 22-27.
  • the potential off-target site of sgRNA4 in the rice genome was predicted using the online tool CRISPR-GE.
  • the potential off-target sites of sgRNA4 were sequenced and found to be free of any form of mutation at these sites, indicating that pRABEsa-OsU6sa is also highly specific in rice. See Table 2 for information on potential off-target sites for sgRNA4 and primers for amplifying potential off-target sites for sgRNA4.
  • the present invention also designed another sgRNA (sgRNA5) that simultaneously targets the OsmiR156 binding site of OsSPL16 and OsSPL18 (Fig. 4).
  • the corresponding primers for sgRNA5 vector construction were sgRNA5F and sgRNA5R in Table 3.
  • the editing efficiency of sgRNA5 is much lower than that of sgRNA4. Only 17% (8/47) of the OsSPL16 target sites had T-C substitutions, and 23.4% (11/47) of the OsSPL18 loci were edited ( Figure 4, A and B).
  • the amplification primers for the target sites are SPL16-seq-F/SPL16-seq-R and SPL18-seq-F/SPL18-seq-R in Table 3. Notably, 14.6% (6/47) of the lines carried T-C substitutions at both sites simultaneously, further confirming that pRABEsa-OsU6sa can also be used for multi-site base editing in rice.
  • Example 6 Testing the editing effect of the simplified adenine base editing vector pRSABESp-OsU6 in rice
  • to further improve the efficiency of adenine base editing in rice, the original base editing vector pRABEsp-OsU6 (referred to as adenine base editor ABE-P1) was simplified to obtain a new adenine base editing vector,
  • pRSABESp-OsU6 (referred to as adenine base editor ABE-P1S) (Fig. 5A).
  • compared with pRABEsp-OsU6, in vector pRSABESp-OsU6 only ecTadA*7.10 is ligated to the N-terminus of the SpCas9 (D10A) nickase coding sequence via a linker encoding 32 amino acid residues, with no changes to other sequences.
  • sgRNA1 was again selected to target the OsmiR156 binding site sequence of OsSPL14 (Fig. 5B).
  • the corresponding primers for sgRNA1 vector construction are sgRNA1F and sgRNA1R in Table 6.
  • the binary vector was transformed into Agrobacterium by the freeze-thaw method, Agrobacterium was then used to infect Nipponbare callus, and 17 independent transgenic lines were obtained. The target region was then amplified by PCR and genotyped by Sanger sequencing; the amplification primers were SPL14-seq-F and SPL14-seq-R in Table 10.
  • Example 7 Testing the editing effect of the simplified adenine base editing vector pRSABESa-OsU6Sa in rice
  • the adenine base editing vector pRABEsa-OsU6sa (referred to as adenine base editor ABE-P2) uses the SaCas9 (D10A) nickase, which recognizes a different PAM sequence, NNGRRT; this expands the targets in the rice genome that the rice adenine base editor can edit.
  • the pRABEsa-OsU6Sa vector was also simplified, giving a new adenine base editing vector, pRSABESa-OsU6Sa (referred to as adenine base editor ABE-P2S) (Fig. 6A).
  • compared with pRABEsa-OsU6Sa, in the vector pRSABESa-OsU6Sa only ecTadA*7.10 is ligated to the N-terminus of the SaCas9 (D10A) nickase coding sequence via a linker encoding 32 amino acid residues, with no changes to other sequences.
  • to compare the editing effects of the new adenine base editor ABE-P2S and ABE-P2 in rice, sgRNA8 was selected to target the OsmiR827 binding site of rice SPX-MFS2 (Fig. 6B); the corresponding primers for sgRNA8 are sgRNA8F and sgRNA8R in Table 6.
  • sgRNA8 was loaded separately into pRABEsa-OsU6Sa and pRSABEsa-OsU6Sa, the binary vectors were transformed into Agrobacterium by the freeze-thaw method, and Nipponbare rice callus was then infected with Agrobacterium.
  • for pRABEsa-OsU6Sa, 41 transgene-positive seedlings were obtained.
  • the target region was amplified by PCR and genotyped by Sanger sequencing; the amplification primers were SPX-MFS2-F and SPX-MFS2-R in Table 10. Four plants were found to carry a single-base A-G substitution at the target site, an editing efficiency of 9.8% (Table 8).
  • the substituted base was the adenine at position 1, 9 or 15 of the protospacer region, but none of the lines showed a homozygous substitution.
  • Example 8 Simplifying the adenine base editor ABE-P5, which contains a Cas9 protein variant, and observing its editing effect in rice
  • the adenine base editor ABE-P5 uses the SaKKH-Cas9 (D10A) nickase (Fig. 7A); SaKKH-Cas9 (D10A) carries three mutations, E782K/N968K/R1015H, introduced into SaCas9 (D10A), and can recognize a different PAM sequence, NNNRRT.
  • to further improve the editing efficiency of the adenine base editor ABE-P5, it was simplified to obtain a new adenine base editor, ABE-P5S (Fig. 7A).
  • compared with ABE-P5, in the vector ABE-P5S only ecTadA*7.10 is ligated to the N-terminus of the SaKKH-Cas9 (D10A) nickase coding sequence via a linker encoding 32 amino acid residues, with no changes to other sequences.
  • to compare the editing efficiency of ABE-P5 and ABE-P5S, sgRNA11 was designed to target the OsmiR156 binding site of OsSPL13 (Fig. 7B).
  • the primers corresponding to sgRNA11 are sgRNA11-F and sgRNA11-R in Table 6.
  • sgRNA11 was loaded separately into ABE-P5 and ABE-P5S, the binary vectors were transformed into Agrobacterium by the freeze-thaw method, and Nipponbare rice callus was then infected with Agrobacterium.
  • for ABE-P5, a total of 46 transgenic seedlings were obtained.
  • Line 23 has an A-G substitution at position 7 of the protospacer, and Line 27 has an A-G substitution at position 9 (Fig. 7C). Therefore, at the sgRNA11 target site, the efficiency of ABE-P5S is 2.8 times that of ABE-P5.
  • the method was the same as in Example 1, except that the rice Actin promoter was used to drive expression of the Cas9 nuclease, the nuclear localization signal VirD2 and the adenine deaminase, and an RNA polymerase II-dependent promoter or a U3 promoter was used to drive transcription of the sgRNA.
  • the method was the same as in Example 1, except that VirD2 was replaced with the SV40 nuclear localization signal.

Abstract

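The present invention provides a method for site-directed substitution in a plant genome. Specifically, the invention relates to a nucleic acid construct; by using a nucleic acid construct of a specific structure, the invention achieves, for the first time in plants, sgRNA-guided site-directed base mutation (such as T to C or A to G).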

Description

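Method for site-directed substitution in a plant genome
Technical Field
The present invention relates to the field of biotechnology, and in particular to a method for site-directed substitution in a plant genome.
Background
In eukaryotes, precise genome modification at single-base resolution is achieved mainly through homologous recombination. However, the inherently low efficiency of homologous recombination and its dependence on an exogenous donor DNA template greatly limit its use in many species (Komor et al., 2017a). C-to-T base substitution mediated by the APOBEC/AID gene family is currently efficient, but the type of mutation is limited, and the mutation products are often accompanied by indels. Because insertion or deletion mutations often cause frameshifts, they increase the risk of gene editing.
There is therefore an urgent need in the art for a method that achieves T-to-C (or A-to-G) conversion in plant cells efficiently and precisely while significantly reducing the risk of insertion or deletion mutations.
Summary of the Invention
The object of the present invention is to provide a method that achieves T-to-C conversion in plant cells efficiently and precisely.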
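A first aspect of the present invention provides a nucleic acid construct having the structure of Formula I in the 5'-3' (5' to 3') direction:
I1-Z1-Z2-I2  (I)
wherein
I1 is a first integration element;
I2 is a second integration element;
Z1 is a first expression cassette;
Z2 is a second expression cassette;
and one of the expression cassettes Z1 and Z2 has the structure of Formula Ia, while the other has the structure of Formula Ib:
P1-S1-X1-L1-X2-L2-X3  (Ia)
P2-Y1  (Ib);
wherein
P1, S1, X1, L1, X2, L2, X3, P2 and Y1 are each elements that make up the construct;
P1 is a first promoter, the first promoter including a ubiquitin promoter;
S1 is absent or a coding sequence of a signal peptide;
X1 is a coding sequence of an adenine deaminase (such as wild-type and/or mutant TadA);
L1 is absent or a coding sequence of a first linker peptide;
X2 is a coding sequence of a Cas9 nuclease, the Cas9 nuclease having no cleavage activity or only single-strand cleavage activity;
L2 is absent or a coding sequence of a second linker peptide;
X3 is a coding sequence of a nuclear localization signal, the nuclear localization signal being VirD2;
P2 is a second promoter;
Y1 is a coding sequence of an sgRNA;
and each "-" is a bond or a nucleotide linking sequence.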
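In another preferred embodiment, the ubiquitin promoter includes the maize ubiquitin promoter.
In another preferred embodiment, the ubiquitin promoter includes the UBI promoter.
In another preferred embodiment, the second promoter includes the U6 promoter.
In another preferred embodiment, "no cleavage activity or only single-strand cleavage activity" means that the Cas9 nuclease has no cleavage activity on the single strand that carries the target-site T.
In another preferred embodiment, the above nucleotide elements of the invention are joined in-frame, so that a fusion protein with the correct amino acid sequence is expressed.
In another preferred embodiment, the construct has the structure of Formula IIa or Formula IIb:
I1-P1-S1-X1-L1-X2-L2-X3-P2-Y1-I2  (IIa);
I1-P2-Y1-P1-S1-X1-L1-X2-L2-X3-I2  (IIb);
wherein each element is as defined above.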
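The element order of Formulas Ia, Ib, IIa and IIb can be sketched in a few lines of Python. This is only an illustration of the ordering rules stated above: the function name `assemble` and the `omit` parameter are hypothetical, and only the element labels come from the text.

```python
# Element labels taken from Formula Ia and Formula Ib.
FUSION_CASSETTE = ["P1", "S1", "X1", "L1", "X2", "L2", "X3"]  # Formula Ia
SGRNA_CASSETTE = ["P2", "Y1"]                                  # Formula Ib

# Elements that the text allows to be absent ("absent or ...").
OPTIONAL_ELEMENTS = {"S1", "L1", "L2"}


def assemble(layout: str, omit: frozenset = frozenset()) -> str:
    """Return the 5'->3' element order for layout 'IIa' or 'IIb'.

    `omit` names optional elements to leave out (hypothetical helper).
    """
    if not set(omit) <= OPTIONAL_ELEMENTS:
        raise ValueError("only S1, L1 and L2 may be absent")
    if layout == "IIa":
        cassettes = FUSION_CASSETTE + SGRNA_CASSETTE
    elif layout == "IIb":
        cassettes = SGRNA_CASSETTE + FUSION_CASSETTE
    else:
        raise ValueError("layout must be 'IIa' or 'IIb'")
    elements = ["I1"] + [e for e in cassettes if e not in omit] + ["I2"]
    return "-".join(elements)


print(assemble("IIa"))
print(assemble("IIb", omit=frozenset({"S1"})))
```

Either layout places both cassettes between the integration elements I1 and I2; only the relative order of the fusion-protein cassette and the sgRNA cassette differs.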
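In another preferred embodiment, positions 5 to 10 of the sgRNA correspond to the position at which the T→C site-directed mutation is intended to occur (that is, a T).
In another preferred embodiment, positions 6 to 14 of the sgRNA correspond to the position at which the T→C site-directed mutation is intended to occur (that is, a T).
In another preferred embodiment, position 12, 13 and/or 14 of the sgRNA corresponds to the position at which the T→C site-directed mutation is intended to occur (that is, a T).
In another preferred embodiment, the sequences L1 and L2 are each independently 3-120 nt long, preferably 3-96 nt, and preferably a multiple of 3.
In another preferred embodiment, the nucleotide linking sequence is 1-300 nt long, preferably 1-100 nt.
In another preferred embodiment, the first expression cassette and the second expression cassette each have a terminator.
In another preferred embodiment, the first integration element includes a 5' homology arm sequence.
In another preferred embodiment, the first integration element is an RB sequence.
In another preferred embodiment, the RB sequence is as shown in SEQ ID NO.: 1 (TAAACGCTCTTTTCTCTTAGGTTTAC).
In another preferred embodiment, the signal peptide includes the nuclear localization signal peptide of VirD2.
In another preferred embodiment, the Cas9 nuclease is selected from the group consisting of Cas9, Cas9n, or a combination thereof.
In another preferred embodiment, the Cas9 nuclease is a mutated Cas9 nuclease.
In another preferred embodiment, in the X2 element, the mutation site is at position D10A of the Cas9 nuclease (SEQ ID NO.: 2).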
Figure PCTCN2018122014-appb-000001
Figure PCTCN2018122014-appb-000002
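In another preferred embodiment, in the X2 element, the amino acid sequence of the Cas9 nuclease is as shown in SEQ ID NO.: 69.
In another preferred embodiment, the source of the X2 element is selected from the group consisting of Streptococcus pyogenes, Staphylococcus aureus, or a combination thereof.
In another preferred embodiment, the linking sequence includes XTEN.
In another preferred embodiment, the linking sequence is as shown in SEQ ID NO.: 3 (TCTGGAGGGTCCTCCGGCGGATCGTCCGGCAGCGAGACGCCAGGCACCTCCGAGAGCGCTACGCCTGAATCCTCCGGGGGATCTTCAGGAGGATCA).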
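As a quick consistency check, the linker coding sequence of SEQ ID NO.: 3 can be translated with the standard genetic code to confirm that it is 96 nt long, a multiple of 3, and encodes 32 amino acid residues. The sketch below uses only the sequence quoted above; the helper name `translate` is an assumption for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Linker coding sequence as given for SEQ ID NO.: 3 (internal space removed).
LINKER_NT = (
    "TCTGGAGGGTCCTCCGGCGGATCGTCCGGCAGCGAGACGCCAGGCACCTCCGAGAGCGCTACGCC"
    "TGAATCCTCCGGGGGATCTTCAGGAGGATCA"
)


def translate(nt: str) -> str:
    """Translate an in-frame coding sequence into one-letter amino acids."""
    if len(nt) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, len(nt), 3))


linker_aa = translate(LINKER_NT)
print(len(LINKER_NT), len(linker_aa), linker_aa)
```

The length check matters because L1 and L2 sit inside the fusion reading frame: a linker whose length is not a multiple of 3 would shift the frame of the downstream Cas9 coding sequence.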
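In another preferred embodiment, the adenine deaminase includes TadA.
In another preferred embodiment, the adenine deaminase includes wild-type and mutant forms.
In another preferred embodiment, the mutant form of the adenine deaminase includes TadA7-10.
In another preferred embodiment, the adenine deaminase is a tandem adenine deaminase, the structure of which is shown in Formula II:
Z8-L8-Z9 (II)
wherein
Z8 is the amino acid sequence of the wild-type adenine deaminase TadA;
L8 is an optional linker peptide sequence;
Z9 is the amino acid sequence of the mutant adenine deaminase TadA7-10.
In another preferred embodiment, the first promoter is derived from one or more plants selected from the group consisting of maize, rice, soybean, Arabidopsis and tomato.
In another preferred embodiment, the second promoter is derived from one or more plants selected from the group consisting of rice, maize, soybean, Arabidopsis and tomato.
In another preferred embodiment, the second integration element includes a 3' homology arm sequence.
In another preferred embodiment, the second integration element is an LB sequence.
In another preferred embodiment, the LB sequence is as shown in SEQ ID NO.: 4 (TGTTTACACCACAATATATCCTGCCA).
In another preferred embodiment, the nuclear localization signal is derived from Agrobacterium.
In another preferred embodiment, the nucleic acid construct is 5000-10000 bp long, preferably 7500-8500 bp.
In another preferred embodiment, one or more additional expression cassettes are further inserted between the I1 and I2 elements.
In another preferred embodiment, the additional expression cassette is independent of the first expression cassette and the second expression cassette.
In another preferred embodiment, the additional expression cassette expresses a substance selected from the group consisting of:
(a1) a marker gene;
(a2) one or more sgRNAs different from the sgRNA encoded by Y1.
In another preferred embodiment, the marker gene includes a resistance gene (hygromycin gene), a fluorescence gene, or a combination thereof.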
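A second aspect of the present invention provides a vector containing the nucleic acid construct of the first aspect of the invention.
In another preferred embodiment, the vector is a plant expression vector.
In another preferred embodiment, the vector is an expression vector capable of transfecting or transforming plant cells.
In another preferred embodiment, the vector is an Agrobacterium Ti vector.
In another preferred embodiment, the construct is integrated into the T-DNA region of the vector.
In another preferred embodiment, the vector is circular or linear.
A third aspect of the present invention provides a genetically engineered cell that contains the nucleic acid construct of the first aspect of the invention, or whose genome has integrated one or more nucleic acid constructs of the first aspect of the invention.
In another preferred embodiment, the cell is a plant cell.
In another preferred embodiment, the plant is selected from the group consisting of Gramineae, Leguminosae, Cruciferae, or a combination thereof.
In another preferred embodiment, the plant includes Arabidopsis, wheat, barley, oat, maize, rice, sorghum, millet, soybean, peanut, tobacco, tomato, or a combination thereof.
In another preferred embodiment, the nucleic acid construct of the first aspect of the invention is introduced into the genetically engineered cell by a method selected from the group consisting of Agrobacterium-mediated transformation, particle bombardment, microinjection, electroporation, sonication and polyethylene glycol (PEG)-mediated transformation.
A fourth aspect of the present invention provides a method for gene editing in a plant, comprising the steps of:
(i) providing a plant to be edited; and
(ii) introducing the nucleic acid construct of the first aspect of the invention or the vector of the second aspect of the invention into plant cells of the plant to be edited, so that gene editing takes place in the plant cells.
In another preferred embodiment, the plant cells in step (ii) and step (iii) are from the same part of the plant.
In another preferred embodiment, the introduction is carried out by Agrobacterium.
In another preferred embodiment, the introduction is carried out by particle bombardment.
In another preferred embodiment, the gene editing is site-directed base substitution (or mutation).
In another preferred embodiment, the site-directed substitution (or mutation) includes mutating T to C and/or A to G.
In another preferred embodiment, the plant is selected from the group consisting of Gramineae, Leguminosae, Cruciferae, or a combination thereof.
In another preferred embodiment, the plant includes Arabidopsis, wheat, barley, oat, maize, rice, sorghum, millet, soybean, peanut, tobacco, tomato, or a combination thereof.
A fifth aspect of the present invention provides a method for preparing a transgenic plant cell, comprising the step of:
(i) transfecting a plant cell with the nucleic acid construct of the first aspect of the invention or the vector of the second aspect of the invention, so that site-directed substitution (or mutation) occurs between the nucleic acid construct and a chromosome in the plant cell, thereby producing the transgenic plant cell.
In another preferred embodiment, the transfection uses Agrobacterium-mediated transformation or particle bombardment.
A sixth aspect of the present invention provides a method for preparing a transgenic plant cell, comprising the step of:
(i) transfecting a plant cell with the nucleic acid construct of the first aspect of the invention or the vector of the second aspect of the invention, so that the plant cell contains the nucleic acid construct, thereby producing the transgenic plant cell.
A seventh aspect of the present invention provides a method for preparing a transgenic plant, comprising the step of:
regenerating the transgenic plant cell prepared by the method of the fifth or sixth aspect of the invention into a plant body, thereby obtaining the transgenic plant.
An eighth aspect of the present invention provides a transgenic plant, characterized in that the plant is prepared by the method of the seventh aspect of the invention.
It should be understood that, within the scope of the present invention, the technical features described above and those described in detail below (for example in the Examples) can be combined with one another to form new or preferred technical solutions. For reasons of space, they are not listed one by one here.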
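Brief Description of the Drawings
Figure 1 shows efficient A.T to G.C base editing in rice.
In the figure: (A) Schematic of the two rice adenine base editing vectors pRABEsp-OsU6 and pRABEsa-OsU6sa. (B) Schematic of the sgRNA1 target site in the OsSPL14 gene. The sequence bound by OsmiR156 in OsSPL14 is highlighted in red. (C) Sequencing chromatograms of two representative OsSPL14-edited plants, SG1-7 and SG1-15. (D) Schematic of the sgRNA2 target site in the SLR1 gene. A T-C substitution at position 6 of the protospacer region may change amino acid V at position 92 of the TVHYNP motif to A. (E) Sequencing chromatograms of two SLR1-edited plants, SG2-18 and SG2-19. (F) sgRNA4 was designed to target and edit the OsSPL14 and OsSPL17 genes simultaneously. The OsmiR156 binding sequences in OsSPL14 and OsSPL17 are highlighted in red. (G) Both OsSPL14 and OsSPL17 were base-edited in two transgenic rice lines, SG4-5 and SG4-25. Sequencing chromatograms of both target sites in both lines are shown. (H) Summary of base editing efficiencies for the different sgRNAs used in the present invention. Multiplex base editing in rice was guided by sgRNA3 to sgRNA5, respectively.
Figure 2 shows TA clone sequencing results for two base-edited plants (SG1-7 and SG1-15) in which sgRNA1 targets the OsSPL14 gene. Twenty clones per line were randomly selected for sequencing, and a representative sequencing chromatogram is shown for each genotype.
Figure 3 shows that pRABEsp-OsU6 can be used for multiplex base editing in rice.
In the figure: (A) sgRNA3 was designed to target and edit three genes in the rice genome simultaneously. The OsmiR156 binding nucleotides in the three genes are highlighted in red. (B) Representative sequencing chromatograms of the three target sites. In plants SG3-11 and SG3-12, the target sites in OsSPL16 and OsSPL18 were edited simultaneously, but the target site in LOC_Os02g24720 was wild type in both plants.
Figure 4 shows simultaneous base editing of OsSPL16 and OsSPL18 by pRABEsa-OsU6sa.
In the figure: (A) sgRNA5 can target both the OsSPL16 and OsSPL18 sites simultaneously. (B) Sequencing chromatograms of the two target sites in three representative transgenic plants, SG5-7, SG5-18 and SG5-44. Both genes were edited simultaneously in the three selected transgenic plants.
Figure 5 shows editing of the OsSPL14 target site by ABE-P1S.
In the figure: A. Vector schematic of the ABE-P1S base editor. B. Schematic of the sgRNA1 target site in the OsSPL14 gene. The sequence bound by OsmiR156 in OsSPL14 is highlighted in red. C. Sequencing chromatograms of the OsSPL14 target site in two representative lines, Line 7 and Line 12; black arrows indicate the sites of base substitution. D. Off-target editing of the OsSPL17 site by ABE-P1S. The sequence bound by OsmiR156 in OsSPL17 is highlighted in red, mismatched bases are in lowercase, and black arrows indicate the sites of base substitution.
Figure 6 shows editing of the SPX2-MFS2 site by ABE-P2S.
In the figure: A. Vector schematic of the ABE-P2S base editor. B. Schematic of the sgRNA8 target site in the SPX2-MFS2 gene. The sequence bound by OsmiR827 in SPX2-MFS2 is highlighted in red. C. Sequencing chromatograms of the SPX2-MFS2 target site in two representative lines, Line 38 and Line 40; black arrows indicate the sites of base substitution.
Figure 7 shows editing of the OsSPL13 site by ABE-P5 and ABE-P5S.
In the figure: A. Vector schematic of the ABE-P5 and ABE-P5S base editors. B. Schematic of the sgRNA11 target site in the OsSPL13 gene. The sequence bound by OsmiR156 in OsSPL13 is highlighted in blue, and the PAM sequence is marked in red letters. C. Sequencing chromatograms of the OsSPL13 target site in representative lines Line 3, Line 23 and Line 27; black arrows indicate the sites of base substitution.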
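Detailed Description
After extensive and in-depth research and large-scale screening, the inventors identified a specific promoter that drives expression of a fusion protein composed of a Cas9 nuclease, an adenine deaminase and the nuclear localization signal VirD2, as well as a specific promoter that drives sgRNA transcription. By using a nucleic acid construct of a specific structure, the invention achieves, for the first time in plants, sgRNA-guided site-directed base mutation (such as T to C or A to G) with very high mutation efficiency (up to 60% or higher). The applicant also unexpectedly found that the simplified adenine base editor ABE-P1S has a higher base editing efficiency. The present invention was completed on this basis.
Terms
As used herein, the term "homology arm" refers to the flanking sequences on a targeting vector that lie on both sides of the exogenous sequence to be inserted and are identical to the genomic sequence; they are the regions that are recognized and undergo recombination.
As used herein, the term "plant promoter" refers to a nucleic acid sequence capable of initiating transcription of a nucleic acid in a plant cell. The plant promoter may be derived from a plant, a microorganism (such as a bacterium or virus) or an animal, or may be a synthetic or engineered promoter.
As used herein, the term "base mutation" refers to a substitution, insertion and/or deletion of a base at a position in a nucleotide sequence.
As used herein, the term "base substitution" refers to the mutation of a base at a position in a nucleotide sequence to a different base, for example T to C.
As used herein, the term "A.T to G.C" means that, in a double-stranded nucleic acid sequence (especially a genomic sequence), an A-T base pair at a given position is mutated to or replaced by a G-C base pair.
As used herein, "selectable marker gene" refers to a gene used during transgenesis to select transgenic cells or transgenic animals. The selectable marker genes that can be used in this application are not particularly limited and include the various selectable marker genes commonly used in the transgenic field. Representative examples include (but are not limited to): hygromycin resistance gene (Hyg), kanamycin resistance gene (NPTII), neomycin gene, puromycin resistance gene, hygromycin resistance gene (HYG), G418 and kanamycin resistance gene (NPTII), Basta resistance gene (BAR), puromycin resistance gene (PAC), and/or neomycin resistance gene (NEO).
As used herein, the term "Cas protein" refers to a nuclease. A preferred Cas protein is the Cas9 protein. Typical Cas9 proteins include (but are not limited to) Cas9 derived from Streptococcus pyogenes. In the present invention, the Cas9 protein is a mutated Cas9 protein, specifically a mutated Cas9 protein that has no cleavage activity or only single-strand cleavage activity.
As used herein, the term "coding sequence of a Cas protein" refers to a nucleotide sequence that encodes a Cas protein. Where the inserted polynucleotide sequence is transcribed and translated to produce a functional Cas protein, the skilled person will recognize that, because of codon degeneracy, a large number of polynucleotide sequences can encode the same polypeptide. The skilled person will also recognize that different species have certain codon preferences, and that the codons of the Cas protein may be optimized as required for expression in different species; all of these variants are specifically covered by the term "coding sequence of a Cas protein". In addition, the term specifically includes full-length sequences that are substantially identical to the Cas gene sequence, as well as sequences that encode proteins retaining Cas protein function.
In the present invention, nucleotide sequences are described in the 5' to 3' direction unless otherwise noted.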
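Adenine deaminase
As used herein, the term "adenine deaminase" refers to the TadA adenine deaminase, which is derived from Escherichia coli, originally acts on tRNA, and deaminates a specific adenine in tRNA.
In the present invention, suitable TadA includes the wild-type form, its specific mutant form TadA7-10, or a combination of the wild-type and mutant forms. TadA7-10 can use DNA as a substrate for deamination.
In another preferred embodiment, the coding sequence of the adenine deaminase of the invention is codon-optimized so that it is expressed more efficiently in plants.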
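Constructs of the invention
The present invention provides a nucleic acid construct for gene editing in plants, the nucleic acid construct having the structure of Formula I in the 5'-3' direction:
I1-Z1-Z2-I2  (I)
wherein
I1 is a first integration element;
I2 is a second integration element;
Z1 is a first expression cassette;
Z2 is a second expression cassette;
and one of the expression cassettes Z1 and Z2 has the structure of Formula Ia, while the other has the structure of Formula Ib:
P1-S1-X1-L1-X2-L2-X3  (Ia)
P2-Y1  (Ib);
I1, P1, S1, X1, L1, X2, L2, X3, L3, P2, Y1 and I2 are each elements that make up the construct, as defined in the first aspect of the invention;
and each "-" is a bond or a nucleotide linking sequence.
In the structure of Formula I above, the I1 element (or left integration element) and the I2 element (or right integration element) act together to integrate the elements located between them (that is, the nucleotide sequence from P1 to Y1) into the genome of the plant cell.
Representative I1 and I2 elements are Ti elements from Agrobacterium. Other elements that perform a similar integration function can of course also be used in the invention.
The various elements used in the constructs of the invention are either known in the art or can be prepared by methods known to those skilled in the art. For example, the corresponding elements can be obtained by conventional methods such as PCR, total chemical synthesis or restriction digestion, and then joined by well-known DNA ligation techniques to form the construct of the invention.
Inserting the construct of the invention into an exogenous vector (especially a vector suitable for transgenic plant work) gives the vector of the invention.
Transforming plant cells with the vector of the invention, so that the vector integrates into the plant cell chromosome, produces transgenic plant cells.
Regenerating the transgenic plant cells of the invention into plant bodies gives transgenic plants.
The nucleic acid construct built according to the invention can be introduced into plant cells by conventional plant recombination techniques (for example Agrobacterium-mediated transformation), giving plant cells that carry the nucleic acid construct (or a vector carrying the nucleic acid construct), or plant cells whose genome has integrated the nucleic acid construct.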
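Vector construction
The main feature of the vector is that the coding sequences of the adenine deaminase, the Cas protein of the CRISPR/Cas system and the nuclear localization signal VirD2 are joined to form the coding sequence of a fusion protein. Once the fusion protein encoded by this sequence is expressed in the cytoplasm, it is transferred into the nucleus very efficiently and is guided to the target position in the genome by the guide RNA encoded by the Formula I construct, where it carries out A.T to G.C base substitution while essentially avoiding or eliminating the risk of insertions/deletions.
Because the adenine deaminase directly mutates T to C or A to G, the double-strand DNA cleavage activity of the Cas protein is not required. In the present invention the Cas protein is therefore a mutated Cas protein without cleavage activity. In a preferred embodiment, the Cas protein of the invention may be SaKKH-Cas9 (D10A), whose amino acid sequence is shown in SEQ ID NO.: 69. To increase the activity of the fusion protein, the proteins are generally joined by a short flexible peptide, that is, a linker (linker peptide sequence). Preferably, XTEN can be used as the linker.
In the present invention, suitable promoters include constitutive and/or inducible promoters. Preferably, to increase efficiency, a strong promoter suitable for plant cells can be chosen; representative examples include (but are not limited to) the CaMV 35S promoter, the UBI promoter and the Actin promoter.
An expression cassette for the guide RNA suitable for plant cells is selected and built into the same vector as the open reading frame (ORF) of the fusion protein described above.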
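Target design
In the present invention, once the adenine deaminase has been guided to the target position by the CRISPR/Cas9 system, the region on which the deaminase acts is fixed.
For example, when the adenine deaminase TadA or the mutant adenine deaminase TadA7-10 protein is joined to the N-terminus of Cas9 through the 32-amino-acid XTEN linker, its deamination region in human cell lines is usually bases 4-7 of the protospacer region. The experimental results obtained in the present invention show that the base editing window is positions 5-10 of the protospacer region with SpCas9 and positions 6-14 of the protospacer region with SaCas9. Following this principle, when designing a target, the adenine to be edited must lie within the "deamination window".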
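The window rule above can be turned into a small design check. The sketch below assumes position 1 is the PAM-distal (5') end of the protospacer, as in the table notes of this document; the window bounds are the ones stated in the text, while the function name and the example protospacer are hypothetical.

```python
# Base editing windows (1-based, inclusive) reported in the text.
EDITING_WINDOWS = {
    "SpCas9": (5, 10),
    "SaCas9": (6, 14),
}


def editable_positions(protospacer: str, nickase: str, base: str = "A") -> list:
    """Return 1-based positions of `base` that fall inside the editing window.

    Position 1 is the PAM-distal end of the protospacer.
    """
    start, end = EDITING_WINDOWS[nickase]
    return [
        pos
        for pos, nt in enumerate(protospacer.upper(), start=1)
        if nt == base and start <= pos <= end
    ]


# Hypothetical protospacer, for illustration only.
example = "GCATACGTAGCTAGCTAGCA"
print(editable_positions(example, "SpCas9"))        # adenines in window 5-10
print(editable_positions(example, "SaCas9", "T"))   # thymines in window 6-14
```

A target with a single editable base in the window, as described for SLR1 in Example 2, would return a one-element list, which is a simple way to anticipate how many mutant forms to expect.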
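Genetic transformation
In the present invention, there is no particular limitation on the method used to introduce the Formula I construct into cells or to integrate it into the genome. Conventional methods can be used, for example introducing the Formula I construct or the corresponding vector into plant cells by a suitable method. Representative introduction methods include, but are not limited to, Agrobacterium-mediated transfection, particle bombardment, microinjection, electroporation, sonication and polyethylene glycol (PEG)-mediated transformation.
In the present invention, there is no particular limitation on the recipient plant, which includes various crop plants (such as Gramineae), forestry plants and horticultural plants (such as flowering plants). Representative examples include, but are not limited to, rice, soybean, tomato, maize, tobacco, wheat and sorghum.
After the above DNA vector or fragment is introduced into plant cells, the DNA in the transformed plant cells expresses the fusion protein and the gRNA. Guided by the corresponding gRNA, the Cas protein fused to the adenine deaminase mutates the T at the target position to C (and thereby the A on the complementary strand to G).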
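The equivalence between a T-to-C change on one strand and an A-to-G change on the complementary strand can be illustrated with a short sketch. The sequences and helper names below are hypothetical and serve only to show the strand bookkeeping.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return seq.translate(COMPLEMENT)[::-1]


def substitute(seq: str, position: int, new_base: str) -> str:
    """Replace the base at a 1-based position."""
    return seq[: position - 1] + new_base + seq[position:]


top = "GGTCA"                      # hypothetical strand read 5'->3'
bottom = reverse_complement(top)   # complementary strand, also 5'->3'

# The T at position 3 of `top` pairs with the A at this position of `bottom`.
paired = len(top) - 3 + 1
assert bottom[paired - 1] == "A"

# Changing A to G on the bottom strand ...
edited_bottom = substitute(bottom, paired, "G")
# ... reads as T to C when the top strand is sequenced.
edited_top = reverse_complement(edited_bottom)
print(top, "->", edited_top)
```

This is why the same editing event is reported as a T-C substitution in some Examples and as an A-G substitution in others, depending on which strand the protospacer is written on.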
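Plant cells, tissues or organs that have undergone site-directed genome substitution by the method of the invention can be regenerated into the corresponding transgenic plants by conventional methods, for example by tissue culture to regenerate base-substituted plants.
Applications
The present invention can be used in plant genetic engineering, for plant research and breeding, and especially for the genetic improvement of economically valuable crops, forestry crops or horticultural plants.
The main advantages of the present invention include:
(1) The invention provides, for the first time, an efficient method for site-directed base mutation (T→C or A→G) in plants, which can be widely used in plant research and breeding.
(2) The site-directed base mutation method of the invention can efficiently mutate bases at specific positions in plant cells (such as T→C or A→G).
(3) The efficiency of base substitution in plant cells with the method of the invention is very high, up to 60% or higher.
(4) The invention can edit multiple sites in a plant genome simultaneously.
(5) By using different forms of Cas9, the invention can expand the range of sites in a plant genome that are amenable to site-directed editing.
(6) The invention can significantly reduce or essentially eliminate the risk of insertions and/or deletions at the target site.
(7) The invention found that base substitution efficiency is higher when position 12, 13 and/or 14 of the sgRNA (from 5' to 3') corresponds to the position at which the T→C site-directed mutation is intended to occur (that is, a T).
The present invention is further described below with reference to specific examples. It should be understood that these examples only illustrate the invention and do not limit its scope. Experimental methods for which no specific conditions are given in the following examples generally follow conventional conditions, such as those described in Sambrook et al., Molecular Cloning: A Laboratory Manual (New York: Cold Spring Harbor Laboratory Press, 1989), or the conditions recommended by the manufacturer. Unless otherwise stated, percentages and parts are by weight. Unless otherwise specified, the experimental materials and reagents used in the invention are commercially available.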
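General methods
1. Vector construction
The E. coli wild-type tRNA adenine deaminase gene TadA (ecTadA) and its mutant form TadA7-10 (ecTadA*7.10) were synthesized by conventional methods. A 96 bp linker coding sequence, encoding a linker peptide of 32 amino acid residues, was also added to the 3' end of each of TadA and TadA7-10. AarI restriction sites were added at both ends of TadA-linker and TadA7-10-linker.
We obtained the rice adenine base editing vectors by modifying the pCas9(OsU6) vector.
In the pCas9(OsU6) vector, the Cas9 gene is driven by the maize ubiquitin promoter and the sgRNA scaffold is driven by the rice U6 promoter. The Cas9 (D10A) nickase and the SaCas9 (D10A) nickase were amplified by PCR using pCas9(OsU6) and pX600 (Addgene, #61592) as templates. The upstream primer contains two AarI restriction sites at its 5' end and the downstream primer carries the VirD2 nuclear localization signal at its 3' end; the recovered amplification products were used to replace the Cas9 gene in pCas9(OsU6), giving the intermediate vectors pRSp-OsU6 and pRSa-OsU6, respectively. The TadA-linker and TadA7-10-linker fragments were inserted into pRSp-OsU6 and pRSa-OsU6 by Golden Gate assembly, giving the pRABESp-OsU6 and pRBESa-OsU6 vectors, respectively.
By overlap PCR, the rice U6 promoter, two BsaI restriction sites and the SaCas9 sgRNA scaffold were joined together and used to replace the OsU6-sgRNA fragment in pRBESa-OsU6, giving the vector pRABESa-OsU6Sa.
Construction of the simplified adenine base editing vectors
To construct the simplified adenine base editing vectors, we first modified the AarI restriction site adapters at both ends of the synthesized ecTadA*(7.10), and then inserted a single ecTadA*(7.10) into the previously constructed intermediate vectors pRSp-OsU6 and pRSa-OsU6 by Golden Gate cloning, giving the vectors pRSABESp-OsU6 and pRSABESa-OsU6. pRSABESp-OsU6 corresponds to adenine base editor ABE-P1S. The OsU6-SasgRNA expression cassette was excised from the previously constructed pRABESa-OsU6Sa vector with HindIII and XmaI and used to replace the OsU6-SpsgRNA expression cassette in pRSABESa-OsU6, giving the pRSABESa-OsU6Sa vector, which corresponds to adenine base editor ABE-P2S. We amplified the nSaKKH-Cas9 (D10A) nickase by PCR from the pRABEsa-SaKKH vector and used it to replace nSaCas9 (D10A) in the pRSABESa-OsU6Sa vector, giving the pRSABEsa-SaKKH vector, which corresponds to adenine base editor ABE-P5S.
The backbones of the pRABESp-OsU6 and pRABESa-OsU6Sa vectors contain a hygromycin gene for selecting transgenic plants.
The sequences of sgRNA1 to sgRNA5 were synthesized. The primers were annealed in a PCR machine to form short oligonucleotide adapters, which were inserted into the BsaI-digested pRABESp-OsU6 and pRABESa-OsU6Sa vectors.
All vectors were verified by Sanger sequencing.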
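2. Rice transformation
All binary vectors were introduced into the standard Agrobacterium strain EHA105 by the freeze-thaw method. The rice recipient for transformation was Nipponbare. The experimental procedure followed that reported by Nishimura et al. (2007). Two days after Agrobacterium infection, the callus was transferred to selection medium containing 50 mg/L hygromycin. After 2 weeks of selection, resistant callus was transferred directly to differentiation medium. When the plantlets were 4-5 cm tall, they were transferred to rooting medium to induce rooting. After 10 days, the rooting boxes were opened to harden the seedlings for 3 days, after which the seedlings were transferred to nutrient soil and grown in a greenhouse at 28°C with 12 hours of light and 22°C with 12 hours of darkness.
3. Genotyping of target sites
Leaf DNA was extracted from all transgenic rice plants by the CTAB method. Primers were designed about 250 bp upstream and downstream of the target site, and the target site was amplified by PCR. The recovered PCR products were sent directly to a company for sequencing. The sequencing results were analyzed with Sequencher software. Some plants edited at the target site were further verified by TA cloning: the target site was amplified by PCR, purified, ligated into the p-EASY blunt Zero vector and transformed into E. coli, and 20 clones were randomly selected for sequencing. Base editing efficiency was calculated as the number of plants with base edits divided by the total number of transgenic plants.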
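The efficiency definition above is a simple ratio, and a short sketch makes it easy to re-check the percentages quoted in the Examples. The function name is hypothetical; the counts in the usage lines are the ones reported for sgRNA1 with pRABEsp-OsU6 (6 of 23 lines) and for sgRNA4 with pRABEsa-OsU6sa (14, 19 and 13 of 31 lines).

```python
def editing_efficiency(edited_plants: int, total_plants: int) -> float:
    """Base editing efficiency in percent: edited plants / total transgenic plants."""
    if total_plants <= 0:
        raise ValueError("total_plants must be positive")
    if not 0 <= edited_plants <= total_plants:
        raise ValueError("edited_plants must be between 0 and total_plants")
    return round(100 * edited_plants / total_plants, 1)


print(editing_efficiency(6, 23))    # sgRNA1, OsSPL14
print(editing_efficiency(14, 31))   # sgRNA4, OsSPL14
print(editing_efficiency(19, 31))   # sgRNA4, OsSPL17
print(editing_efficiency(13, 31))   # sgRNA4, both sites edited together
```

Because the denominator is the number of transgenic plants rather than the number of alleles or sequencing reads, a chimeric plant with only a few edited cells counts the same as a bi-allelic one.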
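4. Detection of off-target sites
Off-target sites were predicted for sgRNA1 and sgRNA4. Sites in the rice genome with fewer than 5 mismatched bases relative to the sgRNA1 or sgRNA4 sequence were considered potential off-target sites. Primers were designed about 250 bp upstream and downstream of every potential off-target site, the sites were amplified by PCR, and the purified PCR products were sent directly to a company for sequencing. The sequencing results were analyzed with Sequencher software.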
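The mismatch criterion above amounts to a Hamming-distance filter. The sketch below shows only that criterion, not the CRISPR-GE tool used in the Examples; the function names and the example sequences are hypothetical.

```python
def count_mismatches(guide: str, site: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(1 for g, s in zip(guide.upper(), site.upper()) if g != s)


def is_potential_off_target(guide: str, site: str, max_mismatches: int = 4) -> bool:
    """True if the site has fewer than 5 mismatches to the guide (threshold from the text).

    A perfectly matching site at a different locus also counts, so the caller
    is expected to exclude the intended target itself.
    """
    return count_mismatches(guide, site) <= max_mismatches


# Hypothetical sequences, for illustration only.
guide = "ACGTACGTACGTACGTACGT"
near = "ACGTACGTACcTACGTAtGT"   # 2 mismatches, shown in lowercase
far = "tgcaACGTACGTACGTtgca"    # 8 mismatches

print(count_mismatches(guide, near), is_potential_off_target(guide, near))
print(count_mismatches(guide, far), is_potential_off_target(guide, far))
```

A genome-wide search would also need to require a valid PAM next to each candidate site and to scan both strands, which this sketch leaves out.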
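Example 1 Testing the editing effect of ABE7-10 in rice
In this example, following the general methods, the coding sequences of wild-type ecTadA and its mutant form ecTadA*7.10 were synthesized. They were then joined together using a linker coding sequence encoding 32 amino acid residues. Next, the coding sequence of this recombinant protein was fused to the N-terminus of the Cas9 (D10A) nickase coding sequence through the same linker. Finally, the coding sequence of the VirD2 nuclear localization signal was joined to the C-terminus of the Cas9 (D10A) nickase to form ABE7-10. ABE7-10 was then cloned into a binary vector under the control of the maize ubiquitin promoter, with the sgRNA driven by the rice U6 promoter, to form the vector pRABEsp-OsU6 (Fig. 1A).
To test the editing effect of ABE7-10 in rice, we selected IPA1 (OsSPL14), which regulates ideal plant architecture in rice, as the target gene (Fig. 1B).
An sgRNA (sgRNA1) was designed to target the OsmiR156 binding site sequence of OsSPL14 (Fig. 1B); the primers for sgRNA1 vector construction are sgRNA1F and sgRNA1R in Table 3. The binary vector was transformed into Agrobacterium by the freeze-thaw method, Agrobacterium was then used to infect callus of rice cultivar Nipponbare, and 23 independent transgenic lines were obtained. The target region was then amplified by PCR and genotyped by Sanger sequencing; the amplification primers were SPL14-seq-F and SPL14-seq-R in Table 3.
Table 3. Primers used in the present invention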
Figure PCTCN2018122014-appb-000003
Figure PCTCN2018122014-appb-000004
Figure PCTCN2018122014-appb-000005
Figure PCTCN2018122014-appb-000006
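According to the sequencing results, 6 of the 23 transgenic plants showed the expected T-C substitution at the target site, a base editing efficiency of 26% (6/23) (Fig. 1H). Of these 6 lines, two (SG1-11, SG1-21) had a T-C substitution at position 5 of the protospacer (scoring the PAM as positions 21-23), two (SG1-10, SG1-15) had T-C substitutions at positions 5 and 7 of the protospacer, and the remaining two (SG1-7, SG1-23) had T-C substitutions at positions 5 and 10 of the protospacer (Fig. 1C).
The present invention found that two transgenic plants (SG1-7 and SG1-10) were edited at position 10 of the protospacer (Fig. 1C). To further confirm the Sanger sequencing results, SG1-7 and SG1-15 were selected for TA cloning, and 20 clones were randomly picked for sequencing. Interestingly, 11 clones from SG1-7 had a T-C substitution at position 5 of the protospacer and 9 clones had a T-C substitution at position 10, indicating that this transgenic plant is bi-allelic (Fig. 2). In contrast, only 15% of the clones from line SG1-15 (3/20) had T-C substitutions at positions 5 and 7 of the protospacer, indicating that this transgenic plant is chimeric (Fig. 2).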
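The reasoning from TA clone counts to a genotype call can be sketched as below. This is a rough illustration rather than the authors' procedure: the function name, the allele labels and the 30% cutoff separating a chimeric plant from a heterozygous one are all assumptions.

```python
def classify_plant(clone_counts: dict, chimeric_cutoff: float = 0.3) -> str:
    """Classify a plant from TA clone counts per allele.

    `clone_counts` maps an allele label to the number of clones; the label
    "WT" denotes the unedited allele. `chimeric_cutoff` is an assumed
    threshold on the edited fraction, not a value from the source.
    """
    total = sum(clone_counts.values())
    if total == 0:
        raise ValueError("no clones sequenced")
    wild_type = clone_counts.get("WT", 0)
    edited_alleles = [a for a, n in clone_counts.items() if a != "WT" and n > 0]
    if not edited_alleles:
        return "wild type"
    if wild_type == 0:
        return "homozygous" if len(edited_alleles) == 1 else "bi-allelic"
    edited_fraction = 1 - wild_type / total
    return "chimeric" if edited_fraction < chimeric_cutoff else "heterozygous"


# Clone counts reported for SG1-7 and SG1-15 (20 clones each).
print(classify_plant({"T5C": 11, "T10C": 9}))
print(classify_plant({"WT": 17, "T5C+T7C": 3}))
```

With 20 clones per plant the call is only approximate; an edited fraction near 50% could not be separated from a heterozygous plant by clone counts alone.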
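To assess the specificity of ABE7-10 in rice, potential off-target sites of sgRNA1 in the rice genome were predicted with the online tool CRISPR-GE. Sequencing of nine potential off-target sites revealed no base editing events. Information on the nine off-target sites is given in Table 1, and the primers used to amplify them are given in Table 3.
Table 1. Editing efficiency at potential off-target sites of sgRNA1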
Figure PCTCN2018122014-appb-000007
Figure PCTCN2018122014-appb-000008
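Note: Nucleotides of the PAM sequence are in bold, and mismatched bases at potential off-target sites are shown in lowercase.
The sequencing results show that adenine base editing is highly specific in rice.
Example 2 Testing the editing capability of the adenine base editing system in rice
To further test the editing capability of the adenine base editing system in rice, additional experiments were carried out at the SLR1 locus. SLR1 encodes a DELLA protein in rice that acts as a repressor in the GA signaling pathway.
An sgRNA (sgRNA2) was designed against the TVHYNP domain of SLR1. The primers for sgRNA2 vector construction are sgRNA2F and sgRNA2R in Table 3. A T-C base substitution at position 6 of the protospacer results in a V92A substitution in the TVHYNP motif (Fig. 1D).
Of the 40 transgenic lines obtained, 5 had the expected T-C substitution at position 6 of the protospacer (Fig. 1E, 1H). The primers for amplifying the target region are SLR-seq-F and SLR-seq-R in Table 3. Because there is only one editable T in the base editing window of SLR1, no other mutant forms were found at the target locus. These results confirm the high specificity of the adenine base editing system in rice.
Example 3 Testing whether the base editing system of the invention can edit two or more sites in the rice genome simultaneously
To test whether the base editing system of the invention can edit two or more sites in the rice genome simultaneously, a third sgRNA (sgRNA3) was designed to target the OsmiR156 binding sites of OsSPL16 and OsSPL18 at the same time (Fig. 3, A and B). The primers for sgRNA3 vector construction are sgRNA3F and sgRNA3R in Table 3. In addition, an off-target site with a 100% match to the sgRNA3 sequence was found in an intron of the LOC_Os02g24720 gene (Fig. 3, A and B). sgRNA3 can therefore target three sites in the rice genome simultaneously. These three target sites were genotyped in 21 transgenic lines. Four transgenic lines each showed T-C substitutions in the OsSPL16 and OsSPL18 target regions. However, only one line contained a T-C substitution in LOC_Os02g24720 (Fig. 1H and Fig. 3). The amplification primers for the three target sites are SPL16-seq-F/SPL16-seq-R, SPL18-seq-F/SPL18-seq-R and 02G-seq-F/02G-seq-R in Table 3. Interestingly, OsSPL16 and OsSPL18 were edited simultaneously in two transgenic lines, SG3-11 and SG3-12, showing that the base editing system of the invention can target multiple genes in rice at the same time (Fig. 3, A and B).
Base editing with the pRABEsp-OsU6 vector requires an NGG PAM sequence downstream of the protospacer. This requirement significantly limits the number of sites in the rice genome that pRABEsp-OsU6 can edit. To further expand the targets in the rice genome that the rice adenine base editor can edit, we replaced the Cas9 (D10A) nickase and its sgRNA scaffold in the pRABEsp-OsU6 vector with the SaCas9 (D10A) nickase and the SaCas9 sgRNA scaffold. The resulting vector, pRABEsa-OsU6sa, recognizes a different PAM sequence, NNGRRT (Fig. 1A).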
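The PAM requirements named in this document (NGG for SpCas9, NNGRRT for SaCas9, and NNNRRT for the SaKKH variant described in Example 8) can be checked with IUPAC-aware pattern matching. The sketch below is an illustration under those stated PAMs; the helper names and the test sequences are hypothetical.

```python
import re

# IUPAC codes needed for the PAMs named in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "R": "[AG]"}

# PAM requirements as stated in the text.
PAM_BY_NUCLEASE = {
    "SpCas9": "NGG",
    "SaCas9": "NNGRRT",
    "SaKKH-Cas9": "NNNRRT",
}


def pam_pattern(pam: str) -> "re.Pattern":
    """Compile an IUPAC PAM such as 'NNGRRT' into a regular expression."""
    return re.compile("".join(IUPAC[code] for code in pam))


def has_valid_pam(downstream: str, nuclease: str) -> bool:
    """Check the bases immediately 3' of the protospacer against the PAM."""
    pam = PAM_BY_NUCLEASE[nuclease]
    candidate = downstream.upper()[: len(pam)]
    return pam_pattern(pam).fullmatch(candidate) is not None


print(has_valid_pam("TGG", "SpCas9"))
print(has_valid_pam("TTGAAT", "SaCas9"))
print(has_valid_pam("TTCAAT", "SaCas9"))
print(has_valid_pam("TTCAAT", "SaKKH-Cas9"))
```

The last two lines show the point of the variant nickase: the same downstream sequence fails the NNGRRT requirement but satisfies the relaxed NNNRRT one.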
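Example 4 Testing the feasibility and efficiency of the vector
To test the feasibility and efficiency of the vector, we designed a fourth sgRNA (sgRNA4) that simultaneously targets the OsmiR156 binding sites of OsSPL14 and OsSPL17 (Fig. 1F). The primers for sgRNA4 vector construction are sgRNA4F and sgRNA4R in Table 3. Notably, despite the different PAM sequences, the protospacer sequence recognized by sgRNA4 overlaps with the protospacer sequence recognized by sgRNA2 (Fig. 1F and 1B). Of the 31 transgenic lines we genotyped, 14 lines had T-C substitutions at the OsSPL14 target site and 19 lines had T-C substitutions at the OsSPL17 target site. The amplification primers for the target sites are SPL14-seq-F/SPL14-seq-R and SPL17-seq-F/SPL17-seq-R in Table 3. The base editing efficiency of pRABEsa-OsU6sa at the OsSPL14 and OsSPL17 target sites was therefore 45.2% and 61.3%, respectively, much higher than that of sgRNA2-targeted pRABEsp-OsU6 (Fig. 1H). More importantly, 13 lines (41.9%) were edited at both target sites simultaneously.
In addition, the base editing window of the SaCas9 (D10A) nickase is broader than that of the Cas9 (D10A) nickase, probably because more single-stranded DNA is exposed to the adenine deaminase while SaCas9 (D10A) induces formation of the R-loop complex. Unexpectedly, even the T at positions 12 and 14 of the protospacer was edited at both target sites. The specific base editing windows are shown in Tables 4 and 5.
Table 4. Base editing window of ABE-P1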
Figure PCTCN2018122014-appb-000009
Figure PCTCN2018122014-appb-000010
Note: Base editing positions were counted from the PAM-distal end, scoring the PAM as positions 21-23.
Table 5. Base editing window of ABE-P2
Figure PCTCN2018122014-appb-000011
Figure PCTCN2018122014-appb-000012
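Note: Base editing positions were counted from the PAM-distal end, scoring the PAM as positions 22-27.
To assess off-target editing by pRABEsa-OsU6sa, potential off-target sites of sgRNA4 in the rice genome were predicted with the online tool CRISPR-GE. The potential off-target sites of sgRNA4 were sequenced, and no mutation of any kind was found at these sites, showing that pRABEsa-OsU6sa is also highly specific in rice. Information on the potential off-target sites of sgRNA4 is given in Table 2, and the primers used to amplify them are given in Table 3.
Table 2. Editing efficiency at potential off-target sites of sgRNA4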
Figure PCTCN2018122014-appb-000013
Figure PCTCN2018122014-appb-000014
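Note: Nucleotides of the PAM sequence are in bold, and mismatched bases at potential off-target sites are shown in lowercase.
Example 5 Testing whether pRABEsa-OsU6sa can be used for multi-site base editing in rice
Another sgRNA (sgRNA5) was designed to simultaneously target the OsmiR156 binding sites of OsSPL16 and OsSPL18 (Fig. 4). The primers for sgRNA5 vector construction are sgRNA5F and sgRNA5R in Table 3. However, the editing efficiency of sgRNA5 was much lower than that of sgRNA4. Only 17% (8/47) of the lines had T-C substitutions at the OsSPL16 target site, and 23.4% (11/47) of the lines were edited at the OsSPL18 site (Fig. 4, A and B). The amplification primers for the target sites are SPL16-seq-F/SPL16-seq-R and SPL18-seq-F/SPL18-seq-R in Table 3. Notably, 14.6% (6/47) of the lines carried T-C substitutions at both sites simultaneously, further confirming that pRABEsa-OsU6sa can also be used for multi-site base editing in rice.
Example 6 Testing the editing effect of the simplified adenine base editing vector pRSABESp-OsU6 in rice
To further improve the efficiency of adenine base editing in rice, the original base editing vector pRABEsp-OsU6 (referred to as adenine base editor ABE-P1) was simplified, giving a new adenine base editing vector, pRSABESp-OsU6 (referred to as adenine base editor ABE-P1S) (Fig. 5A). Compared with pRABEsp-OsU6, in the vector pRSABESp-OsU6 only ecTadA*7.10 is ligated to the N-terminus of the SpCas9 (D10A) nickase coding sequence via a linker encoding 32 amino acid residues, with no changes to other sequences.
To test the editing effect of the new adenine base editor ABE-P1S in rice, we again selected sgRNA1 to target the OsmiR156 binding site sequence of OsSPL14 (Fig. 5B); the primers for sgRNA1 vector construction are sgRNA1F and sgRNA1R in Table 6. The binary vector was transformed into Agrobacterium by the freeze-thaw method, Agrobacterium was then used to infect callus of rice cultivar Nipponbare, and 17 independent transgenic lines were obtained. The target region was then amplified by PCR and genotyped by Sanger sequencing; the amplification primers were SPL14-seq-F and SPL14-seq-R in Table 10.
Table 10. Primers used to amplify and sequence the target sites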
Figure PCTCN2018122014-appb-000015
Figure PCTCN2018122014-appb-000016
Figure PCTCN2018122014-appb-000017
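According to the sequencing results, 12 of the 17 transgenic plants showed the expected T-C substitution at the target site, a base editing efficiency of 70.6% (12/17), which is 2.7 times the editing efficiency of pRABEsp-OsU6 at this site (compare Table 7 and Fig. 1H). After tallying all positions at which base substitution occurred, we found that pRSABESp-OsU6 can edit the adenine at positions 1, 3, 5, 7, 10 and 12 of the OsSPL14 protospacer region; the base editing window is somewhat wider than that of pRABEsp-OsU6 (Fig. 5C shows representative results).
In addition, we designed other sgRNAs targeting different sites in the rice genome and used different rice cultivars (Nipponbare and Kitaake) as transgenic recipient materials. The results show that, except at the OsDEP1 target site, the simplified adenine base editor ABE-P1S has a higher base editing efficiency than the original adenine base editor ABE-P1 (Table 7).
Table 7. Editing efficiencies of ABE-P1 and ABE-P1S at different target sites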
Figure PCTCN2018122014-appb-000018
Figure PCTCN2018122014-appb-000019
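To assess the specificity of pRSABESp-OsU6 in rice, we sequenced the nine potential off-target sites of sgRNA1. The sequencing results showed that one line had a base substitution at the off-target site within the OsSPL17 gene; this off-target site has a single-base mismatch with sgRNA1 at the fifth position upstream of the PAM (Fig. 5D). No mutation of any kind was detected at the other potential off-target sites. Information on the nine off-target sites is given in Table 1, and the primers used to amplify them are given in Table 3.
Example 7 Testing the editing effect of the simplified adenine base editing vector pRSABESa-OsU6Sa in rice
The adenine base editing vector pRABEsa-OsU6sa (referred to as adenine base editor ABE-P2) uses the SaCas9 (D10A) nickase, which recognizes a different PAM sequence, NNGRRT; this expands the targets in the rice genome that the rice adenine base editor can edit.
To further improve the editing efficiency of adenine base editor ABE-P2, the pRABEsa-OsU6Sa vector was also simplified, giving a new adenine base editing vector, pRSABESa-OsU6Sa (referred to as adenine base editor ABE-P2S) (Fig. 6A). Compared with pRABEsa-OsU6Sa, in the vector pRSABESa-OsU6Sa only ecTadA*7.10 is ligated to the N-terminus of the SaCas9 (D10A) nickase coding sequence via a linker encoding 32 amino acid residues, with no changes to other sequences.
To compare the editing effects of the new adenine base editor ABE-P2S and ABE-P2 in rice, we selected sgRNA8 to target the OsmiR827 binding site of rice SPX-MFS2 (Fig. 6B); the primers for sgRNA8 are sgRNA8F and sgRNA8R in Table 6. sgRNA8 was loaded separately into pRABEsa-OsU6Sa and pRSABEsa-OsU6Sa, the binary vectors were transformed into Agrobacterium by the freeze-thaw method, and callus of rice cultivar Nipponbare was then infected with Agrobacterium. For pRABEsa-OsU6Sa, we obtained 41 transgene-positive seedlings. The target region was amplified by PCR and genotyped by Sanger sequencing; the amplification primers were SPX-MFS2-F and SPX-MFS2-R in Table 10. Four plants were found to carry a single-base A-G substitution at the target site, an editing efficiency of 9.8% (Table 8). The substituted base was the adenine at position 1, 9 or 15 of the protospacer region, but no line showed a homozygous substitution. For pRSABEsa-OsU6Sa, we obtained 47 transgene-positive seedlings, 7 of which carried an A-G substitution at the target site, an editing efficiency of 14.9% (Table 8), which is 1.5 times the editing efficiency of pRABEsa-OsU6Sa at this site. Except for Line 40, all lines carried a single-base A-G substitution at position 3, 6, 9 or 15 of the protospacer region, and Line 38 showed a homozygous substitution at position 6 of the protospacer (Fig. 6C).
Table 8. Base editing efficiencies of ABE-P2 and ABE-P2S at different target sites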
Figure PCTCN2018122014-appb-000020
Figure PCTCN2018122014-appb-000021
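In addition, we designed other sgRNAs targeting different sites in the rice genome. The results show that, except at the OsSPL17 target site, the simplified adenine base editor ABE-P2S has a higher base editing efficiency than the original adenine base editor ABE-P2 (compare Table 8 and Fig. 1H).
Example 8 Simplifying the adenine base editor ABE-P5, which contains a Cas9 protein variant, and observing its editing effect in rice
The adenine base editor ABE-P5 uses the SaKKH-Cas9 (D10A) nickase (Fig. 7A); SaKKH-Cas9 (D10A) carries three mutations, E782K/N968K/R1015H, introduced into SaCas9 (D10A), and can recognize a different PAM sequence, NNNRRT.
To further improve the editing efficiency of adenine base editor ABE-P5, it was simplified to give a new adenine base editor, ABE-P5S (Fig. 7A). Compared with ABE-P5, in the vector ABE-P5S only ecTadA*7.10 is ligated to the N-terminus of the SaKKH-Cas9 (D10A) nickase coding sequence via a linker encoding 32 amino acid residues, with no changes to other sequences.
To compare the editing efficiency of ABE-P5 and ABE-P5S, we designed sgRNA11 to target the OsmiR156 binding site of OsSPL13 (Fig. 7B). The primers for sgRNA11 are sgRNA11-F and sgRNA11-R in Table 6. sgRNA11 was loaded separately into ABE-P5 and ABE-P5S, the binary vectors were transformed into Agrobacterium by the freeze-thaw method, and callus of rice cultivar Nipponbare was then infected with Agrobacterium. For ABE-P5, we obtained a total of 46 transgenic seedlings. After the target site was amplified with primers OsSPL13-F and OsSPL13-R (Table 10) and sequenced, only one transgenic seedling, Line 3, was found to carry an A-G substitution at position 11 of the protospacer, an editing efficiency of only 2.2% (Table 9). Moreover, the Sanger sequencing chromatogram suggests that only a very small fraction of cells in this line may carry the A-G substitution at position 11 of the protospacer (Fig. 7C). For ABE-P5S, we obtained a total of 33 transgenic seedlings, 2 of which carried an A-G substitution at the target site, an editing efficiency of 6.1% (Table 9). Line 23 carried an A-G substitution at position 7 of the protospacer, and Line 27 carried an A-G substitution at position 9 (Fig. 7C). At the sgRNA11 target site, the efficiency of ABE-P5S is therefore 2.8 times that of ABE-P5.
In addition, we designed another sgRNA, sgRNA12, targeting the SNB gene in the rice genome. The results show that the editing efficiency of the simplified adenine base editor ABE-P5S (33.9%) is 5.2 times that of ABE-P5 (6.5%) (Table 9).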
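The fold comparisons between a simplified editor and its original can be recomputed from the plant counts reported in Examples 6 to 8. The sketch below is only an arithmetic check; the function name is hypothetical and rounding to one decimal place is an assumption about how the quoted values were obtained.

```python
def fold_change(edited_new: int, total_new: int, edited_old: int, total_old: int) -> float:
    """Ratio of two editing efficiencies (new editor / original editor)."""
    if min(total_new, total_old) <= 0 or edited_old <= 0:
        raise ValueError("totals and the original edited count must be positive")
    efficiency_new = edited_new / total_new
    efficiency_old = edited_old / total_old
    return round(efficiency_new / efficiency_old, 1)


# ABE-P1S (12/17) vs ABE-P1 (6/23) at the OsSPL14 site
print(fold_change(12, 17, 6, 23))
# ABE-P2S (7/47) vs ABE-P2 (4/41) at the SPX-MFS2 site
print(fold_change(7, 47, 4, 41))
# ABE-P5S (2/33) vs ABE-P5 (1/46) at the OsSPL13 site
print(fold_change(2, 33, 1, 46))
```

With only one or two edited plants in a sample, as at the OsSPL13 site, the ratio is very sensitive to a single additional edited line, so such fold values are best read as rough indications.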
Table 9. Base editing efficiencies of ABE-P5 and ABE-P5S at different target sites
Figure PCTCN2018122014-appb-000022
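Comparative Example 1
The method was the same as in Example 1, except that the rice Actin promoter was used to drive expression of the Cas9 nuclease, the nuclear localization signal VirD2 and the adenine deaminase, and an RNA polymerase II-dependent promoter or a U3 promoter was used to drive transcription of the sgRNA.
The results show that the editing efficiency in rice (that is, the site-directed substitution efficiency) was only 50% of that in Example 1.
Comparative Example 2
The method was the same as in Example 1, except that VirD2 was replaced with the SV40 nuclear localization signal.
The results show that the editing efficiency in rice (that is, the site-directed substitution efficiency) was only about 50%-70% of that in Example 1.
References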
1.Komor,A.C.,Badran,A.H.,and Liu,D.R.(2017a).Editing the Genome  Without Double-Stranded DNA Breaks.ACS Chem Biol.Advance Access published September 28,2017,doi:10.1021/acschembio.7b00710.
2.Nishimura,A.,Aichi,I.,and Matsuoka,M.(2007).A protocol for Agrobacterium-mediated transformation in rice.Nat.Protoc.1:2796-2802.
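All documents mentioned in the present invention are incorporated by reference in this application, as if each document were individually incorporated by reference. It should also be understood that, after reading the above teachings of the invention, those skilled in the art may make various changes or modifications to the invention, and such equivalents likewise fall within the scope defined by the claims appended to this application.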

Claims (14)

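  1. A nucleic acid construct, characterized in that the nucleic acid construct has the structure of Formula I in the 5'-3' (5' to 3') direction:
    I1-Z1-Z2-I2   (I)
    wherein
    I1 is a first integration element;
    I2 is a second integration element;
    Z1 is a first expression cassette;
    Z2 is a second expression cassette;
    and one of the expression cassettes Z1 and Z2 has the structure of Formula Ia, while the other has the structure of Formula Ib:
    P1-S1-X1-L1-X2-L2-X3   (Ia)
    P2-Y1   (Ib);
    wherein
    P1, S1, X1, L1, X2, L2, X3, P2 and Y1 are each elements that make up the construct;
    P1 is a first promoter, the first promoter including a ubiquitin promoter;
    S1 is absent or a coding sequence of a signal peptide;
    X1 is a coding sequence of an adenine deaminase (such as wild-type and/or mutant TadA);
    L1 is absent or a coding sequence of a first linker peptide;
    X2 is a coding sequence of a Cas9 nuclease, the Cas9 nuclease having no cleavage activity or only single-strand cleavage activity;
    L2 is absent or a coding sequence of a second linker peptide;
    X3 is a coding sequence of a nuclear localization signal, the nuclear localization signal being VirD2;
    P2 is a second promoter;
    Y1 is a coding sequence of an sgRNA;
    and each "-" is a bond or a nucleotide linking sequence.
  2. The nucleic acid construct of claim 1, characterized in that the ubiquitin promoter includes the maize ubiquitin promoter.
  3. The nucleic acid construct of claim 1, characterized in that the second promoter includes the U6 promoter.
  4. The nucleic acid construct of claim 1, characterized in that positions 5 to 10 of the sgRNA correspond to the position at which the T→C site-directed mutation is intended to occur (that is, a T).
  5. The nucleic acid construct of claim 1, characterized in that positions 6 to 14 of the sgRNA correspond to the position at which the T→C site-directed mutation is intended to occur (that is, a T).
  6. The nucleic acid construct of claim 1, characterized in that the signal peptide includes the nuclear localization signal peptide of VirD2.
  7. The nucleic acid construct of claim 1, characterized in that the first promoter is derived from one or more plants selected from the group consisting of maize, rice, soybean, Arabidopsis and tomato.
  8. The nucleic acid construct of claim 1, characterized in that the second promoter is derived from one or more plants selected from the group consisting of rice, maize, soybean, Arabidopsis and tomato.
  9. A vector, characterized in that the vector contains the nucleic acid construct of claim 1.
  10. A genetically engineered cell, characterized in that the cell contains the nucleic acid construct of claim 1, or its genome has integrated one or more nucleic acid constructs of claim 1.
  11. A method for gene editing in a plant, characterized by comprising the steps of:
    (i) providing a plant to be edited; and
    (ii) introducing the nucleic acid construct of claim 1 or the vector of claim 9 into plant cells of the plant to be edited, so that gene editing takes place in the plant cells.
  12. A method for preparing a transgenic plant cell, characterized by comprising the step of:
    (i) transfecting a plant cell with the nucleic acid construct of claim 1 or the vector of claim 9, so that site-directed substitution occurs between the nucleic acid construct and a chromosome in the plant cell, thereby producing the transgenic plant cell.
  13. A method for preparing a transgenic plant cell, characterized by comprising the step of:
    (i) transfecting a plant cell with the nucleic acid construct of claim 1 or the vector of claim 9, so that the plant cell contains the nucleic acid construct, thereby producing the transgenic plant cell.
  14. A method for preparing a transgenic plant, characterized by comprising the step of:
    regenerating the transgenic plant cell prepared by the method of claim 12 or claim 13 into a plant body, thereby obtaining the transgenic plant.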
PCT/CN2018/122014 2018-02-11 2018-12-19 植物基因组定点替换的方法 WO2019153902A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810142052 2018-02-11
CN201810142052.0 2018-02-11

Publications (1)

Publication Number Publication Date
WO2019153902A1 true WO2019153902A1 (zh) 2019-08-15

Family

ID=67548751

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/122014 WO2019153902A1 (zh) 2018-02-11 2018-12-19 植物基因组定点替换的方法

Country Status (2)

Country Link
CN (1) CN110157726B (zh)
WO (1) WO2019153902A1 (zh)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020177751A1 (zh) * 2019-03-06 2020-09-10 山东舜丰生物科技有限公司 一种用于基因编辑的核酸构建物
CN110526993B (zh) * 2019-03-06 2020-06-16 山东舜丰生物科技有限公司 一种用于基因编辑的核酸构建物
CN110527695B (zh) * 2019-03-07 2020-06-16 山东舜丰生物科技有限公司 一种用于基因定点突变的核酸构建物
CN112725348B (zh) * 2019-10-28 2022-04-01 安徽省农业科学院水稻研究所 一种提高水稻单碱基编辑效率的基因、方法及应用
CN110878305B (zh) * 2019-12-09 2022-04-12 安徽省农业科学院水稻研究所 一种宽窗口单碱基编辑基因及其应用和育种方法
EP4116426A1 (en) * 2020-03-04 2023-01-11 Suzhou Qi Biodesign biotechnology Company Limited Multiplex genome editing method and system
CN111508558B (zh) * 2020-03-23 2021-12-14 广州赛业百沐生物科技有限公司 一种基于CRISPR-Cas9技术设计点突变模型的方法及系统
CN113774082A (zh) * 2020-05-22 2021-12-10 山东舜丰生物科技有限公司 一种核酸表达的方法
CN112553246A (zh) * 2020-12-08 2021-03-26 安徽省农业科学院水稻研究所 一种基于CRISPR-SaCas9系统的高效基因组编辑载体及其应用
CN113717961B (zh) * 2021-09-10 2023-05-05 成都赛恩吉诺生物科技有限公司 一种融合蛋白及其多核苷酸、碱基编辑器及其在药物制备中的应用

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106609282A (zh) * 2016-12-02 2017-05-03 中国科学院上海生命科学研究院 Vector for site-directed base substitution in plant genomes

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
SG11201707025WA (en) * 2015-03-02 2017-09-28 Synlogic Inc Bacteria engineered to treat diseases that benefit from reduced gut inflammation and/or tightened gut mucosal barrier
WO2017165724A1 (en) * 2016-03-24 2017-09-28 Morflora Llc Introducing dna into organisms for transient expression
CN107012164B (zh) * 2017-01-11 2023-03-03 电子科技大学 CRISPR/Cpf1 functional unit for targeted plant genome modification, vector comprising the functional unit, and use thereof

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106609282A (zh) * 2016-12-02 2017-05-03 中国科学院上海生命科学研究院 Vector for site-directed base substitution in plant genomes

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
GAUDELLI, N.M. ET AL.: "Programmable Base Editing of A*T to G*C in Genomic DNA without DNA Cleavage", NATURE, vol. 551, no. 7681, 23 November 2017 (2017-11-23), pages 14, XP002785203 *

Also Published As

Publication number Publication date
CN110157726B (zh) 2023-06-23
CN110157726A (zh) 2019-08-23

Similar Documents

Publication Publication Date Title
WO2019153902A1 (zh) Method for site-directed replacement in plant genomes
CN108795972B (zh) Method for isolating cells without using transgenic marker sequences
US10308947B2 (en) Methods and compositions for multiplex RNA guided genome editing and other RNA technologies
CA2883800C (en) Fluorescence activated cell sorting (facs) enrichment to generate plants
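CN108130342B (zh) Cpf1-based method for site-directed editing of plant genomes
CN111263810A (zh) Organelle genome modification using polynucleotide-guided endonucleases
CN110526993B (zh) Nucleic acid construct for gene editing
WO2018098935A1 (zh) Vector for site-directed base substitution in plant genomes
CN108064297B (zh) Wheat fertility-related gene TaMS7 and methods of use thereof
WO2019205939A1 (zh) Repeat-fragment-mediated method for site-directed recombination in plants
CN114829600A (zh) Plant MAD7 nuclease and its expanded PAM recognition capability
JP2022511508A (ja) Gene silencing by genome editing
CN113717960A (zh) Novel Cas9 protein, CRISPR-Cas9 vector for targeted genome editing, and genome editing method
CN110951743A (zh) Method for improving gene replacement efficiency in plants
CN113846075A (zh) MAD7-NLS fusion protein, nucleic acid construct for site-directed editing of plant genomes, and use thereof
KR20200004382A (ko) Method for isolating cells without using transgenic marker sequences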
EP3052633B1 (en) Zea mays metallothionein-like regulatory elements and uses thereof
CN114686456B (zh) Base editing system based on bimolecular deaminase complementation and use thereof
JP7288915B2 (ja) DNA construct for use in plant genome editing
US11932861B2 (en) Virus-based replicon for plant genome editing without inserting replicon into plant genome and uses thereof
WO2021056302A1 (en) Methods and compositions for dna base editing
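WO2020171192A1 (ja) Nucleic acid for genome editing of plant cells and use thereof
JP7452884B2 (ja) Method for producing DNA-edited plant cells, and kit for use therein
WO2020177751A1 (zh) Nucleic acid construct for gene editing
KR101050048B1 (ko) Vector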

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 18905180

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 18905180

Country of ref document: EP

Kind code of ref document: A1