CN110607320A - Plant genome directed base editing framework vector and application thereof - Google Patents


Info

Publication number
CN110607320A
Authority
CN
China
Prior art keywords
lys
pmcda1
leu
vector
ncas9
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811403794.0A
Other languages
Chinese (zh)
Other versions
CN110607320B (en)
Inventor
张勇
唐旭
郑雪莲
任秋蓉
周建平
邓科君
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Application filed by University of Electronic Science and Technology of China
Priority to CN201811403794.0A
Publication of CN110607320A
Application granted
Publication of CN110607320B
Legal status: Active

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/63: Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79: Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82: Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8201: Methods for introducing genetic material into plant cells, e.g. DNA, RNA, stable or transient incorporation, tissue culture methods adapted for transformation
    • C12N15/8213: Targeted insertion of genes into the plant genome by homologous recombination
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2810/00: Vectors comprising a targeting moiety
    • C12N2810/10: Vectors comprising a non-peptidic targeting moiety

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Biotechnology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Organic Chemistry (AREA)
  • Chemical & Material Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Cell Biology (AREA)
  • Microbiology (AREA)
  • Plant Pathology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Breeding Of Plants And Reproduction By Means Of Culturing (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)

Abstract

The invention belongs to the technical field of genetic engineering, and particularly relates to a plant genome directed base editing framework vector and application thereof. The invention aims to improve the efficiency of directed base editing in plant cell genomes and to expand the base editing window. The technical solution is a plant genome directed base editing framework vector in which a Pol II type promoter drives transcription of a core unit consisting of an nCas9-PmCDA1 nuclease-cytosine deaminase fusion protein expression unit and a synthetic guide RNA (sgRNA) transcription expression unit. This single transcription unit directed base editing framework vector enables simple, rapid and efficient directed conversion of cytosine (C) to thymine (T), and is an effective molecular tool for directed base editing of plant genomes.

Description

Plant genome directed base editing framework vector and application thereof
Technical Field
The invention belongs to the field of plant genetic engineering, and relates to a plant genome directed base editing framework vector and application thereof.
Background
Targeted genome modification has long been a leading and active field of biological research. It works by precisely targeting and modifying specific regions of the genome: on the one hand, a target sequence can be precisely mutated to obtain mutant material and clearly identify the function of a target gene; on the other hand, a target sequence can be precisely replaced or inserted, minimizing the uncertainty in expression and inheritance caused by random introduction of foreign genes.
In 2012, researchers demonstrated for the first time that CRISPR-Cas (clustered regularly interspaced short palindromic repeats-CRISPR associated protein) can achieve sequence-specific DNA double-strand cleavage. The CRISPR-Cas9 system was subsequently used for RNA-guided targeted genome editing in cells of animal and plant systems including cynomolgus monkey, zebrafish, mouse, human cell lines, Arabidopsis thaliana and rice. In this genome targeted editing system, the Cas protein, directed by a guide RNA, recognizes and cuts a specific DNA sequence to generate a DNA double-strand break (DSB), and targeted editing of the target site DNA sequence is then achieved through the cell's endogenous DNA repair system. The eukaryotic DNA repair systems currently known fall into two broad categories: homologous recombination (HR) repair and non-homologous end joining (NHEJ) repair. HR repairs the damaged DNA region precisely, using a homologous sequence as a template; NHEJ requires no homologous sequence and directly joins the broken ends formed by DNA damage, so it often introduces sequence variation of varying degree while completing the repair.
Although CRISPR-Cas genome editing tools can effectively achieve targeted editing of a target genome sequence, editing events based on the NHEJ repair pathway mainly introduce random base insertions or deletions at the target site. Editing events based on the HR repair pathway can precisely replace the target site sequence according to a donor template DNA, but they occur far less frequently than NHEJ-mediated events. This greatly limits the use of CRISPR-Cas genome editing tools in basic research and practical applications that require precise base editing.
To improve the efficiency of precise editing of specific bases at a genomic target site, and to achieve precise single-base replacement at the target site, researchers fused specific base deaminases to dCas9, nCas9 or Cas12a on the basis of CRISPR-Cas genome editing tools, achieving precise replacement of specific bases at genomic target sites (for example, base C replaced by base T; base A replaced by base G). This new genome editing tool is called a base editor (BE). Directed base editing can effectively replace a specific single base at a genomic target site and is a useful complement to CRISPR-Cas genome editing technology; it was named one of the ten scientific breakthroughs of 2017 by the journal Science, which highlights its potential in basic research and practical applications.
Directed base editing tools based on the CRISPR-Cas system broaden the application range of the CRISPR-Cas system and show wide application prospects. However, existing directed base editing tools generally suffer from low editing efficiency and a limited editing window, and these problems are more pronounced in plant genome editing practice. An enhanced plant directed base editing tool with high base editing efficiency and a wide base editing window is therefore urgently needed, so that CRISPR-Cas-based directed base editing can be applied more widely in plant genome function research and breeding practice.
Disclosure of Invention
The invention aims to improve the directional base editing efficiency of plant cell genome and expand the base editing window.
The technical solution to this problem is a plant genome directed base editing framework vector. The framework vector comprises a core unit consisting of two core regions, an nCas9-PmCDA1 nuclease-cytosine deaminase fusion protein expression unit and a synthetic guide RNA (sgRNA) transcription expression unit, and transcription of the core unit is driven by a Pol II type promoter;
the core unit is, in order from 5' to 3', nCas9 ORF-PmCDA1-Poly A-sgRNA cloning scaffold-T; nCas9 ORF is the coding frame of the Streptococcus pyogenes Cas9 nuclease protein D10A mutant; PmCDA1 is the functional unit of the cytosine deaminase coding region; Poly A is the Poly A region; sgRNA cloning scaffold is the sgRNA cloning and transcription unit, of which there is at least one; T is a terminator.
Wherein, the functional unit of the PmCDA1 cytosine deaminase coding region in the framework vector comprises, in order from the N terminus to the C terminus, a GGGS linker, an SH3 linker, the PmCDA1 coding region, an NLS signal peptide, the UGI coding region, an SGGS linker and an NLS signal peptide.
Wherein the above framework vector satisfies at least one of the following:
a. the amino acid sequence encoded by the nCas9 ORF (the coding frame of the nCas9 nuclease protein D10A mutant) is shown as amino acids 1 to 1382 of Seq ID No. 2;
b. the amino acid sequence encoded by the functional unit of the PmCDA1 cytosine deaminase coding region is shown as amino acids 1383 to 1788 of Seq ID No. 2.
Wherein, the sgRNA cloning and transcription unit (sgRNA cloning scaffold) in the framework vector comprises, in order from the 5' end to the 3' end, a tRNA-Gly coding sequence, a BsaI-ccdB-BsaI unit, an sgRNA scaffold coding sequence and a tRNA-Gly coding sequence.
Wherein, the number of sgRNA cloning and transcription units in the framework vector is 1 to 6.
The nucleotide sequence of the sgRNA cloning and transcription unit (sgRNA cloning scaffold) in the framework vector is shown as 7432bp to 8300bp of Seq ID No. 1.
Wherein the above framework vector satisfies at least one of the following:
a. the nucleotide sequence of the nCas9 ORF (the coding frame of the nCas9 nuclease protein D10A mutant) is shown as 2011bp to 6156bp of Seq ID No. 1;
b. the nucleotide sequence of the functional unit of the PmCDA1 cytosine deaminase coding region is shown as 6157bp to 7374bp of Seq ID No. 1;
c. the nucleotide sequence of the Poly A region (Poly A) is shown as 7384bp to 7431bp of Seq ID No. 1;
d. the terminator is the rice HSP terminator (HSP T), whose nucleotide sequence is shown as 8307bp to 8556bp of Seq ID No. 1;
e. the Pol II type promoter is the maize pZmUbi1 promoter (pZmUbi1), whose nucleotide sequence is shown as 1bp to 2008bp of Seq ID No. 1.
Wherein the core unit in the framework vector has the structure pZmUbi1-nCas9 ORF-PmCDA1-Poly A-sgRNA cloning scaffold-HSP T. Further, the nucleotide sequence of the pZmUbi1-nCas9 ORF-PmCDA1-Poly A-sgRNA cloning scaffold-HSP T core unit is shown in Seq ID No. 1.
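The coordinates stated above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, assuming the positions are 1-based and inclusive; the dictionary and function names are illustrative and do not come from the patent. It checks that each coding span is exactly three nucleotides per stated amino acid and that the elements appear in 5' to 3' order without overlap.

```python
# Coordinates as stated in the text for Seq ID No. 1 (assumed 1-based, inclusive).
FEATURES = {
    "pZmUbi1": (1, 2008),
    "nCas9 ORF": (2011, 6156),
    "PmCDA1 unit": (6157, 7374),
    "Poly A": (7384, 7431),
    "sgRNA cloning scaffold": (7432, 8300),
    "HSP T": (8307, 8556),
}

# Amino acid ranges as stated in the text for Seq ID No. 2.
PROTEIN_RANGES = {
    "nCas9 ORF": (1, 1382),
    "PmCDA1 unit": (1383, 1788),
}


def span_length(start, end):
    """Length of a 1-based inclusive interval."""
    return end - start + 1


def check_coding_consistency():
    """Map each coding feature to (nt_length, aa_length, nt_length == 3 * aa_length)."""
    report = {}
    for name, (aa_start, aa_end) in PROTEIN_RANGES.items():
        nt_len = span_length(*FEATURES[name])
        aa_len = span_length(aa_start, aa_end)
        report[name] = (nt_len, aa_len, nt_len == 3 * aa_len)
    return report


def features_in_order():
    """True if the features are listed 5' to 3' with no overlap."""
    spans = list(FEATURES.values())
    return all(prev[1] < nxt[0] for prev, nxt in zip(spans, spans[1:]))


if __name__ == "__main__":
    for name, (nt_len, aa_len, ok) in check_coding_consistency().items():
        print(f"{name}: {nt_len} nt, {aa_len} aa, consistent={ok}")
    print("ordered without overlap:", features_in_order())
```

Both coding spans come out consistent with the stated amino acid ranges, which is a quick sanity check on the transcribed coordinates rather than a validation of the sequence listing itself.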
Based on the skeleton vector, the invention also provides a preparation method of the recombinant expression vector for carrying out directional base editing on the specific cytosine base of the target site of the plant genome. The method comprises the following steps:
a. define the target DNA region of the genome of a specific organism, analyze it for regions with a PAM (protospacer adjacent motif) feature, and select a 15-30 bp DNA sequence adjacent to the 5' end of the PAM as the specific target sequence;
b. according to the selected specific target sequence, synthesize a forward oligonucleotide strand of the form 5'-CGGA-N_X-3' and a reverse oligonucleotide strand of the form 5'-AAAC-N_X-3', where N represents any of A, G, C, T, X is an integer with 14 ≤ X ≤ 30, and N_X in the forward oligonucleotide strand and N_X in the reverse oligonucleotide strand are reverse complements of each other; anneal the two strands to obtain a complementary double-stranded oligonucleotide fragment;
c. mix the plant genome directed base editing framework vector of any one of claims 1 to 9 with the complementary double-stranded oligonucleotide fragment obtained in step b, add BsaI endonuclease and T4 DNA ligase to the reaction system at the same time, and run a digestion-ligation cycling reaction to obtain the recombinant expression vector for directed base editing of the site.
Further, the length of the specific target sequence in the step a is 18-21 bp. Preferably, the length of the specific target sequence in step a is 20 bp.
Preferably, in step b, 18 ≤ X ≤ 21.
Preferably, in practice, a fusion PCR amplification strategy can be applied in step c to obtain a tandem amplification product of multiple sgRNA transcription units separated by tRNA sequences. Through the "BsaI digestion-T4 DNA ligase ligation" cycling reaction, this product replaces the BsaI-ccdB-BsaI unit, so that the multiple sgRNA transcription units are cloned into the sgRNA cloning and transcription unit, yielding a recombinant expression vector that can perform directed base editing at multiple target sites.
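Step b can be illustrated with a short script. This is a minimal sketch, not the inventors' tooling: it assumes the forward oligonucleotide carries the target sequence itself and the reverse oligonucleotide carries its reverse complement (the text only states that the two N_X parts are reverse complementary), and the function names and example target are hypothetical.

```python
# Complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_oligos(target, min_len=14, max_len=30):
    """Return (forward, reverse) oligos of the forms 5'-CGGA-N_X-3' and 5'-AAAC-N_X-3'.

    Assumption: the forward strand carries the target as given and the
    reverse strand carries its reverse complement.
    """
    target = target.upper()
    if set(target) - set("ACGT"):
        raise ValueError("target must contain only A, C, G, T")
    if not min_len <= len(target) <= max_len:
        raise ValueError(f"target length must be {min_len}-{max_len} nt")
    forward = "CGGA" + target
    reverse = "AAAC" + reverse_complement(target)
    return forward, reverse


if __name__ == "__main__":
    # Hypothetical 20 nt target, used only to show the output format.
    fwd, rev = design_oligos("GATTACAGATTACAGATTAC")
    print("forward 5'-" + fwd + "-3'")
    print("reverse 5'-" + rev + "-3'")
```

After annealing, the two 4 nt prefixes remain single-stranded, which is what lets the duplex ligate into the BsaI-cut vector in step c.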
The beneficial effects of the invention are as follows: the invention constructs the core unit of a single transcription unit directed base editing framework vector in which one promoter drives two core regions, the nCas9-PmCDA1 fusion protein expression unit and the synthetic guide RNA (sgRNA) transcription expression unit. A directed base editing framework vector containing this core unit enables simple, rapid and efficient directed conversion of cytosine (C) to thymine (T) at a plant genome target sequence. Compared with the plant base editing tools currently in use, the invention improves base editing efficiency and expands the base editing window, promotes the effective application of directed base editing strategies in plant genome editing, is an effective molecular tool for directed base editing of plant genomes, and has good application prospects.
Drawings
FIG. 1 is a schematic diagram of the core unit structure and the working of the single transcription unit directional base editing framework vector of the plant STU nCas9-PmCDA1 of the invention.
FIG. 2 shows the analysis of cytosine directed editing efficiency at the targeted sites, based on Illumina high-throughput sequencing after transient transformation of rice protoplasts with the STU nCas9-PmCDA1-OsCDC48-sgRNA01, STU nCas9-PmCDA1-OsROC5-gRNA04 and STU nCas9-PmCDA1-OsROC5-gRNA05 recombinant expression vectors. Here nCas9-PmCDA1 denotes the plant STU nCas9-PmCDA1 single transcription unit directed base editing framework vector of the invention, and nCas9-rApobec1 is the control group (in which, following the literature report (Komor AC, Kim YB, Packer MS, Zuris JA, Liu DR. 2016. Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature, 533(7603):420-424.), rApobec1 cytosine deaminase replaces the PmCDA1 unit in the plant STU nCas9-PmCDA1 single transcription unit directed base editing framework vector of the invention).
FIG. 3 shows that rice protoplasts are transiently transformed based on the STU nCas9-PmCDA1-OsCDC48-sgRNA01 recombinant expression vector of the invention, Illumina high-throughput sequencing is performed, and the editing efficiency analysis of replacement of cytosine base sites at different positions of a specific editing site into thymine bases is performed. Wherein nCas9-PmCDA1 and nCas9-rApobec1 are as illustrated in FIG. 2.
FIG. 4 shows that rice protoplasts are transiently transformed based on the STU nCas9-PmCDA1-OsROC5-gRNA04 recombinant expression vector of the invention, Illumina high-throughput sequencing is performed, and the editing efficiency analysis of replacement of cytosine base sites at different positions of specific editing sites into thymine bases is performed. Wherein nCas9-PmCDA1 and nCas9-rApobec1 are as illustrated in FIG. 2.
FIG. 5 shows that rice protoplasts are transiently transformed based on the STU nCas9-PmCDA1-OsROC5-gRNA05 recombinant expression vector of the invention, Illumina high-throughput sequencing is performed, and the editing efficiency analysis of replacement of cytosine base sites at different positions of specific editing sites into thymine bases is performed. Wherein nCas9-PmCDA1 and nCas9-rApobec1 are as illustrated in FIG. 2.
FIG. 6 shows the results of Agrobacterium-mediated rice genetic transformation based on the recombinant expression vectors of STU nCas9-PmCDA1-OsCDC48-sgRNA01, STU nCas9-PmCDA1-OsROC5-gRNA04 and STU nCas9-PmCDA1-OsROC5-gRNA05, extraction of rice transformation regeneration seedling genome DNA, PCR amplification and Sanger sequencing analysis, and targeted site cytosine targeted editing efficiency analysis.
Detailed Description
Through strategies such as codon optimization of coding regions and multiplex assembly of functional units, the invention constructs a plant genome directed base editing framework vector (also called the plant STU nCas9-PmCDA1 single transcription unit directed base editing framework vector in the invention) based on a CRISPR-Cas9 single transcription unit system and PmCDA1 cytosine deaminase.
The plant genome directed base editing framework vector comprises a core unit consisting of two core regions, an nCas9-PmCDA1 nuclease-cytosine deaminase fusion protein expression unit and a synthetic guide RNA (sgRNA) transcription expression unit, and transcription of the core unit is driven by a Pol II type promoter;
the nCas9-PmCDA1 nuclease-cytosine deaminase fusion protein expression unit is nCas9 ORF-PmCDA1-Poly A, in which nCas9 ORF is the coding frame of the Streptococcus pyogenes Cas9 nuclease protein D10A mutant, PmCDA1 is the functional unit of the cytosine deaminase coding region, and Poly A is the Poly A region;
the core unit is, in order from 5' to 3', nCas9 ORF-PmCDA1-Poly A-sgRNA cloning scaffold-T; sgRNA cloning scaffold is the sgRNA cloning and transcription unit, of which there is at least one; T is a terminator.
Wherein, the functional unit of the PmCDA1 cytosine deaminase coding region in the framework vector comprises, in order from the N terminus to the C terminus, a GGGS linker, an SH3 linker, the PmCDA1 coding region, an NLS signal peptide, the UGI coding region, an SGGS linker and an NLS signal peptide.
Wherein the above framework vector satisfies at least one of the following:
a. the amino acid sequence encoded by the nCas9 ORF (the coding frame of the nCas9 nuclease protein D10A mutant) is shown as amino acids 1 to 1382 of Seq ID No. 2;
b. the amino acid sequence encoded by the functional unit of the PmCDA1 cytosine deaminase coding region is shown as amino acids 1383 to 1788 of Seq ID No. 2. The two are joined to form the nCas9-PmCDA1 nuclease-cytosine deaminase fusion protein expression frame, whose amino acid sequence is shown as Seq ID No. 2 in the sequence listing.
Wherein, the sgRNA cloning and transcription unit (sgRNA cloning scaffold) in the framework vector comprises, in order from the 5' end to the 3' end, a tRNA-Gly coding sequence, a BsaI-ccdB-BsaI unit, an sgRNA scaffold coding sequence and a tRNA-Gly coding sequence.
Wherein, the number of sgRNA cloning and transcription units in the framework vector is 1 to 6.
The nucleotide sequence of the sgRNA cloning and transcription unit (sgRNA cloning scaffold) in the framework vector is shown as 7432bp to 8300bp of Seq ID No. 1.
Wherein the above framework vector satisfies at least one of the following:
a. the nucleotide sequence of the nCas9 ORF (the coding frame of the nCas9 nuclease protein D10A mutant) is shown as 2011bp to 6156bp of Seq ID No. 1;
b. the nucleotide sequence of the functional unit of the PmCDA1 cytosine deaminase coding region is shown as 6157bp to 7374bp of Seq ID No. 1;
c. the nucleotide sequence of the Poly A region (Poly A) is shown as 7384bp to 7431bp of Seq ID No. 1;
d. the terminator is the rice HSP terminator (HSP T), whose nucleotide sequence is shown as 8307bp to 8556bp of Seq ID No. 1;
e. the Pol II type promoter is the maize pZmUbi1 promoter (pZmUbi1), whose nucleotide sequence is shown as 1bp to 2008bp of Seq ID No. 1.
Wherein the core unit in the framework vector has the structure pZmUbi1-nCas9 ORF-PmCDA1-Poly A-sgRNA cloning scaffold-HSP T. Further, the nucleotide sequence of the pZmUbi1-nCas9 ORF-PmCDA1-Poly A-sgRNA cloning scaffold-HSP T core unit is shown in Seq ID No. 1.
The promoter and terminator of the core unit (pZmUbi1-nCas9 ORF-PmCDA1-Poly A-sgRNA cloning scaffold-HSP T) can be replaced, according to the requirements of the specific host organism to be transformed and of the experiment, with any Pol II type promoter element (for example promoter elements commonly used in plants, such as OsUb1, CaMV35S and AtUb10) and any terminator element (for example terminator elements commonly used in plants, such as Nos T and 35S T), and the core unit can be placed in any plant expression backbone vector (for example vector series commonly used in plants, such as pCambia, pBI, pMDC and pGreen) to achieve site-specific directed base editing.
In the invention, a site-specific STU nCas9-PmCDA1-sgRNA directed base editing recombinant expression vector for a particular plant genome is constructed from the plant STU nCas9-PmCDA1 single transcription unit directed base editing framework vector and transformed into cells. In living cells, the Pol II promoter drives transcription of "nCas9 ORF-PmCDA1-Poly A-sgRNA cloning scaffold" as a single transcription unit, giving a single-stranded primary transcript. Under the action of the cell's endogenous tRNA processing factors, the primary transcript is cleaved at the two tRNA sites, releasing the complete mRNA of the nCas9 ORF-PmCDA1 nuclease-cytosine deaminase fusion protein expression frame (containing Poly A) and the sgRNA transcription unit. In the cell, this mRNA is translated to give the nCas9-PmCDA1 nuclease-cytosine deaminase fusion protein, which combines with the sgRNA unit to form a functional nCas9-PmCDA1-sgRNA complex for directed editing of specific cytosine bases at the genomic target site.
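The processing step described above can be pictured with a toy model. This is a simplified sketch under stated assumptions: the primary transcript is represented only as an ordered list of element names taken from the text, tRNA processing is modeled as excision of every tRNA-Gly element, and the function name is hypothetical. It says nothing about the actual enzymes or cleavage chemistry.

```python
# Ordered elements of the primary transcript after a protospacer has
# replaced the BsaI-ccdB-BsaI unit (names follow the text).
PRIMARY_TRANSCRIPT = [
    "nCas9 ORF",
    "PmCDA1",
    "Poly A",
    "tRNA-Gly",
    "protospacer",
    "sgRNA scaffold",
    "tRNA-Gly",
]


def process_transcript(segments, cleavage_element="tRNA-Gly"):
    """Split a transcript into the products left after excising tRNA elements."""
    products = []
    current = []
    for segment in segments:
        if segment == cleavage_element:
            # A tRNA element ends the current product and is itself removed.
            if current:
                products.append(current)
            current = []
        else:
            current.append(segment)
    if current:
        products.append(current)
    return products


if __name__ == "__main__":
    for product in process_transcript(PRIMARY_TRANSCRIPT):
        print("-".join(product))
```

In this model the first product corresponds to the fusion protein mRNA with its Poly A region and the second to the mature sgRNA.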
In the invention, the complete sgRNA is formed by replacing the BsaI-ccdB-BsaI unit in the sgRNA cloning and transcription unit of the framework vector with an 18-21 bp fragment whose RNA can pair complementarily with the target fragment. The sgRNA is a chimera of a crRNA, which can bind the protospacer site, and a tracrRNA, joined in sequence to form a hairpin-like functional RNA; its scaffold RNA segment can bind Cas9 nuclease.
After the sgRNA site is determined for a specific target gene (5'-N_X-NGG-3'; N represents any of A, G, C, T, X is an integer, and 14 ≤ X ≤ 30 (18, 19, 20 and 21 are common values)), the designed sgRNA specific target sequence (protospacer) is cloned into the sgRNA cloning and transcription unit by the "BsaI digestion-T4 DNA ligase ligation" cycling reaction, following the construction method for the STU nCas9-PmCDA1-sgRNA recombinant expression vector provided by the invention, to obtain a specific, functional STU nCas9-PmCDA1-sgRNA recombinant expression vector.
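Candidate sites of the form 5'-N_X-NGG-3' can be located with a simple scan. The sketch below makes several simplifying assumptions: it searches only the strand it is given (a fuller search would also scan the reverse complement), it uses a fixed protospacer length, and the function name and example sequence are hypothetical.

```python
import re


def find_protospacers(seq, length=20):
    """Return (start_index, protospacer, pam) for every N_X-NGG site on the given strand.

    start_index is 0-based. A lookahead is used so that overlapping
    candidate sites are all reported.
    """
    seq = seq.upper()
    pattern = r"(?=([ACGT]{%d})([ACGT]GG))" % length
    return [(m.start(), m.group(1), m.group(2)) for m in re.finditer(pattern, seq)]


if __name__ == "__main__":
    # Hypothetical sequence: 2 nt flank, 20 nt protospacer, TGG PAM, 2 nt flank.
    example = "TT" + "GATTACAGATTACAGATTAC" + "TGG" + "AA"
    for start, protospacer, pam in find_protospacers(example):
        print(start, protospacer, pam)
```

Changing `length` to any value from 14 to 30 covers the range of X given in the text.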
In the invention, a BsaI-ccdB-BsaI unit is fused at the end of the sgRNA cloning and transcription scaffold unit and serves as the cloning site: digesting the CRISPR/Cas9 single transcription unit framework vector at this unit allows the specific target sequence (protospacer) of the intended gRNA to be cloned in. The key content of the invention can equally be achieved by replacing the BsaI-ccdB-BsaI unit with a site for another restriction enzyme able to cut the framework vector of the invention and modifying the cloning ends of the sgRNA specific target sequence accordingly.
While constructing the plant genome site-specific STU nCas9-PmCDA1-sgRNA directed base editing recombinant expression vector, recombinant clones containing the correct Cas9-gRNA expression vector can be selected by transforming Escherichia coli and applying bacterial selection pressure, and can be verified by colony PCR, plasmid restriction digestion, sequencing and similar methods, confirming that the intended plant genome site-specific STU nCas9-PmCDA1-sgRNA directed base editing recombinant expression vector has been obtained.
By applying a fusion PCR amplification strategy, a tandem amplification product of multiple sgRNA transcription units separated by tRNA sequences can be obtained. Through the "BsaI digestion-T4 DNA ligase ligation" cycling reaction, this product replaces the BsaI-ccdB-BsaI unit, so that the multiple sgRNA transcription units are cloned into the sgRNA cloning and transcription unit, giving an STU nCas9-PmCDA1-sgRNA1-sgRNA2-…-sgRNAx recombinant expression vector that can perform specific modification at multiple target sites (see FIG. 1). Preferably, the number of sgRNA cloning and transcription units in the framework vector is 1 to 6.
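The element order of such a multiplexed array can be written out programmatically. This is a sketch under an explicit assumption: each spacer is followed by an sgRNA scaffold and the units are flanked and separated by tRNA-Gly elements, which is inferred from the single-unit layout described in the text rather than stated for the multiplex case. The function name and spacer labels are hypothetical.

```python
def build_sgrna_array(spacers, max_units=6):
    """Return the 5'-to-3' element order of a tRNA-separated sgRNA array.

    Assumed layout: tRNA-Gly, then (spacer, sgRNA scaffold, tRNA-Gly)
    repeated once per spacer. The text gives 1 to 6 as the preferred
    number of units.
    """
    if not 1 <= len(spacers) <= max_units:
        raise ValueError(f"expected 1 to {max_units} spacers, got {len(spacers)}")
    layout = ["tRNA-Gly"]
    for spacer in spacers:
        layout.extend([spacer, "sgRNA scaffold", "tRNA-Gly"])
    return layout


if __name__ == "__main__":
    # Hypothetical spacer labels, for illustration only.
    print(" | ".join(build_sgrna_array(["spacer1", "spacer2", "spacer3"])))
```

With one spacer the result reduces to the single-unit layout given for the sgRNA cloning scaffold, with the spacer in place of the BsaI-ccdB-BsaI unit.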
In the invention, the site-specific STU nCas9-PmCDA1-sgRNA directed base editing recombinant expression vector constructed according to the invention can be introduced into plant cells by various transformation methods, such as protoplast transformation, gene gun delivery and Agrobacterium-mediated transformation, so that the transformed cells contain both the nCas9-PmCDA1 nuclease-cytosine deaminase fusion protein and the sgRNA unit directed at the specific genomic target sequence. Under the combined action of the fusion protein and the sgRNA unit, specific cytosine bases of the genomic target sequence are edited (the cytosine is replaced by T (high probability), A (low probability) or G (low probability)). When the STU nCas9-PmCDA1 single transcription unit directed base editing framework vector is applied to plants, resistance genes such as those for kanamycin, hygromycin and Basta can be used to select plant transformants, and cells or tissues of positive transformants (such as protoplasts or callus) are differentiated and regenerated to obtain regenerated plants carrying the directed modification at the target site.
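To see which bases in a target are candidates for the C-to-T outcome described above, the cytosines in a protospacer can be listed together with the corresponding single-substitution sequences. This is only an illustrative sketch: position 1 is assumed to be the 5'-most protospacer base (the text does not fix a numbering convention), no editing window or efficiency is modeled, and the function names and example sequence are hypothetical.

```python
def cytosine_positions(protospacer):
    """Return 1-based positions of every C, counting from the 5' end (assumed convention)."""
    return [i for i, base in enumerate(protospacer.upper(), start=1) if base == "C"]


def c_to_t_variants(protospacer):
    """Map each cytosine position to the sequence with that single C replaced by T."""
    seq = protospacer.upper()
    return {
        pos: seq[: pos - 1] + "T" + seq[pos:]
        for pos in cytosine_positions(seq)
    }


if __name__ == "__main__":
    # Hypothetical protospacer, for illustration only.
    for pos, variant in c_to_t_variants("GATTACAGATTACAGATTAC").items():
        print(pos, variant)
```

Comparing sequencing reads against such variants is one simple way to tally per-position substitutions of the kind plotted in FIGS. 3 to 5.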
Example 1 construction of a plant STU nCas9-PmCDA1 Single transcription Unit directed base editing backbone vector
The invention designs an STU nCas9-PmCDA1 single transcription unit directional base editing framework vector for plant genome engineering, and the core unit of the vector is pZmUbi1-nCas9 ORF-PmCDA1-Poly A-sgRNA cloning scaffold-HSPT in sequence from 5 'to 3'. Wherein: pZUbi 1, maize pZUbi 1 promoter (the substitution of different PolII type promoters can be realized by the scheme of single transcription unit directional base editing framework vector of AscI, SbfI double digestion basic STU nCas9-PmCDA 1); the nCas9ORF is the coding frame of Streptococcus pyogenes (Streptococcus pyogenes) nuclease protein D10A mutant (containing a C-terminal NLS signal peptide); the PmCDA1 is a functional unit of a cytosine deaminase coding region (sequentially comprising a GGGS joint, an SH3 joint, a PmCDA1 coding region, an NLS signal peptide, an UGI coding region, an SGGS joint and an NLS signal peptide from the N end to the C end); poly A is Poly A area; a sgRNA cloning and transcription unit (comprising a tRNA-Gly coding sequence, a BsaI-ccdB-BsaI unit, a sgRNA framework coding sequence and a tRNA-Gly coding sequence from a 5 'end to a 3' end); HSP T is rice HSP terminator (the replacement of different terminators can be realized by a scheme of directional base editing of a framework vector by a single transcription unit of BamHI and SacI double-enzyme digestion basic STUnCas9-PmCDA 1). The structure and the working principle of the single transcription unit directional base editing framework vector of the plant STU nCas9-PmCDA1 are shown in figure 1.
Optionally, the scaffold carrier further comprises: the left and right border sequences of the T-DNA, "pZmUbi 1-nCas9 ORF-PmCDA1-Poly A-sgRNA cloning scaffold-HSPT" core unit is located between the left and right borders of the T-DNA; the T-DNA can also comprise a hygromycin resistance gene expression unit (sequentially comprising the elements of 2 xCaMV 35S promoter-hygromycin phosphotransferase ORF-CaMV poly A; and the replacement of different resistance gene ORFs can be realized through the scheme of an AvrII and PacI double-enzyme digestion basic CRISPR/Cas9 single transcription unit framework vector) between the left and right border sequences as a plant transformant screening marker.
In order to realize the quick and efficient construction of a specific genome target STU nCas9-PmCDA1-sgRNA directed base editing recombinant expression vector, the plant STU nCas9-PmCDA1 single transcription unit directed base editing skeleton vector is fused with 637bp BsaI-ccdB-BsaI unit at the 5' end of the sgRNA transcription expression unit, based on the design strategy, in the subsequent construction process of the target STU nCas9-PmCDA1-sgRNA directional base editing recombinant expression vector, only needs to mix the plant STU nCas9-PmCDA1 single transcription unit directional base editing framework vector, the annealed specific target sequence complementary oligonucleotide double-stranded fragment, BsaI endonuclease and T4DNA ligase in a construction system, and setting a circulating reaction of' enzyme digestion at 37 ℃ and connection at 16 ℃ to realize effective construction of a specific Cas9-gRNA expression vector. The vector can be constructed by the conventional method in the existing molecular cloning technology, and it should be noted that the above element sequence is a unique part of the backbone plasmid vector, and it can also include the general structure of some conventional vectors, which will not be described in the present invention.
Based on the gene encoding the Streptococcus pyogenes Cas9 nuclease protein (SpCas9), plant codon optimization was carried out (with an NLS signal added at the 3' end), the D10A mutation was introduced, and the complete ORF of the nCas9 nuclease protein D10A mutant coding frame (containing the C-terminal NLS coding sequence) was artificially synthesized; its nucleotide sequence is shown as 2011 bp to 6156 bp in Seq ID No. 1 (the encoded amino acid sequence is shown as 1 AA to 1382 AA in Seq ID No. 2). Meanwhile, with reference to the published sequence of the sea lamprey (Petromyzon marinus) cytosine deaminase PmCDA1 (Nishida K, Arazoe T, Yachie N, Banno S, Kakimoto M, Tabata M, Mochizuki M, Miyabe A, Araki M, Hara KY, Shimatani Z, Kondo A. 2016. Targeted nucleotide editing using hybrid prokaryotic and vertebrate adaptive immune systems. Science, 353(6305): aaf8729), a PmCDA1 cytosine deaminase expression unit coding frame was designed (comprising, from the N-terminus to the C-terminus, a GGGS linker, an SH3 linker, the PmCDA1 coding region, an NLS signal peptide, the UGI coding region, an SGGS linker and an NLS signal peptide), plant codon optimized and artificially synthesized; its nucleotide sequence is shown as 6157 bp to 7374 bp in Seq ID No. 1 (the encoded amino acid sequence is shown as 1383 AA to 1788 AA in Seq ID No. 2). In addition, three further basic units were obtained by artificial synthesis:
a. frag-A: Poly A; the nucleotide sequence is shown as 7384 bp to 7431 bp in Seq ID No. 1;
b. frag-B: the sgRNA cloning and transcription unit coding sequence (comprising, from the 5' end to the 3' end, a tRNA-Gly coding sequence, a BsaI-ccdB-BsaI unit, the sgRNA scaffold coding sequence and a tRNA-Gly coding sequence); the nucleotide sequence is shown as 7432 bp to 8300 bp in Seq ID No. 1;
c. frag-C: the rice HSP terminator sequence; the nucleotide sequence is shown as 8307 bp to 8556 bp in Seq ID No. 1.
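As a quick consistency check on the coordinates quoted above, the following minimal Python sketch computes the length of each element and verifies that the two coding frames translate to the stated amino acid ranges. The dictionary names and the helper functions are illustrative assumptions; only the coordinates come from the text.

```python
# Coordinates (1-based, inclusive) in Seq ID No. 1, as stated in the text.
FEATURES = {
    "nCas9 ORF": (2011, 6156),
    "PmCDA1 unit": (6157, 7374),
    "frag-A (Poly A)": (7384, 7431),
    "frag-B (sgRNA cloning and transcription unit)": (7432, 8300),
    "frag-C (HSP terminator)": (8307, 8556),
}

# Amino acid ranges (1-based, inclusive) in Seq ID No. 2, as stated in the text.
PROTEIN_RANGES = {
    "nCas9 ORF": (1, 1382),
    "PmCDA1 unit": (1383, 1788),
}


def span_length(start, end):
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def check_features():
    """Return element lengths in bp; raise if a coding frame is inconsistent."""
    lengths = {name: span_length(*coords) for name, coords in FEATURES.items()}
    for name, aa_range in PROTEIN_RANGES.items():
        nt_len = lengths[name]
        if nt_len % 3 != 0:
            raise ValueError(f"{name}: {nt_len} bp is not a multiple of 3")
        if nt_len // 3 != span_length(*aa_range):
            raise ValueError(f"{name}: codon count does not match AA range")
    return lengths


if __name__ == "__main__":
    for name, length in check_features().items():
        print(f"{name}: {length} bp")
```

Under these coordinates the nCas9 frame spans 4146 bp (1382 codons) and the PmCDA1 unit 1218 bp (406 codons), which agrees with the amino acid ranges given for Seq ID No. 2.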
The coding sequence of the nCas9 nuclease protein D10A mutant coding frame, the coding sequence of the PmCDA1 cytosine deaminase expression unit coding frame, and frag-A, frag-B and frag-C were fused sequentially by fusion PCR, and SbfI and SacI restriction sites were added at the 5' and 3' ends of the fusion PCR product, respectively, to obtain a 6560 bp assembly unit.
The vector backbone pGEL026 (Tang X, Zheng X, Qi YP, Zhang D, Cheng Y, Tang A, Voytas DF, Zhang Y. 2016. A single transcript CRISPR-Cas9 system for efficient genome editing in plants. Molecular Plant, 9(7): 1088-1091) and the 6560 bp assembly unit were each double digested with SbfI and SacI, and the target fragments were recovered, ligated and transformed. The positive clones obtained by screening were examined by colony PCR, plasmid restriction digestion and DNA sequencing to confirm that the 6560 bp assembly unit had been cloned downstream of the original pZmUbi1 of pGEL026, which completed construction of the plant STU nCas9-PmCDA1 single transcription unit directed base editing framework vector.
Example 2 High-throughput identification of directed cytosine base editing efficiency at rice endogenous genes based on the STU nCas9-PmCDA1 system
1. Rice endogenous gene sgRNA design and construction of STU nCas9-PmCDA1 base editing recombinant expression vector
Sites with a 5'-NGG-3' PAM were searched in the coding genes of rice OsCDC48 (LOC_Os03g05730) and OsROC5 (LOC_Os02g45250), and the 20 bp sequence upstream of the PAM was selected to design the sgRNAs (Table 1).
Table 1 Design, synthesis and detection information for the sgRNAs of rice endogenous genes
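The PAM search described above can be sketched in Python as follows. This is a minimal illustration, not the procedure actually used by the inventors: the function name `find_ngg_targets` is hypothetical, only the forward strand is scanned, and the demonstration sequence is an arbitrary toy string rather than rice genomic sequence.

```python
import re


def find_ngg_targets(seq, protospacer_len=20):
    """List (start, protospacer, PAM) for every 5'-NGG-3' PAM on the given strand.

    `start` is the 0-based index of the first protospacer base. PAMs that lie
    too close to the 5' end to leave a full-length protospacer are skipped.
    """
    seq = seq.upper()
    targets = []
    # A lookahead is used so that overlapping PAMs (e.g. in "AGGG") are all found.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        if pam_start >= protospacer_len:
            start = pam_start - protospacer_len
            targets.append((start, seq[start:pam_start], match.group(1)))
    return targets


if __name__ == "__main__":
    toy = "ACGTACGTACGTACGTACGT" + "TGG" + "A"  # toy sequence, not from rice
    for start, protospacer, pam in find_ngg_targets(toy):
        print(start, protospacer, pam)
```

To cover both strands, the same function could be applied to the reverse complement of the input sequence.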
According to the designed sgRNA target site sequences, the corresponding forward and reverse oligonucleotide strands were artificially synthesized. The specific sequences are as follows (upper-case bases are the designed site-specific sgRNA guide sequence; lower-case bases are the sticky ends complementary to the framework vector):
BE-OsCDC48-sgRNA01-F(Seq ID No.10):tgcaGACCAGCCAGCGTCTGGCGC;
BE-OsCDC48-sgRNA01-R(Seq ID No.11):aaacGCGCCAGACGCTGGCTGGTC;
BE-OsROC5-gRNA04-F(Seq ID No.12):tgcaGCAGCTGGCTGAGGGTGCAT;
BE-OsROC5-gRNA04-R(Seq ID No.13):aaacATGCACCCTCAGCCAGCTGC;
BE-OsROC5-gRNA05-F(Seq ID No.14):tgcaAGCCAGCTGCTTACAAAAC;
BE-OsROC5-gRNA05-R(Seq ID No.15):aaacGTTTTGTAAGCAGCTGGCT.
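The relationship between each forward and reverse oligonucleotide listed above can be reproduced with a short Python sketch: the forward oligo is the `tgca` overhang followed by the guide sequence, and the reverse oligo is the `aaac` overhang followed by the reverse complement of the guide. The function names are illustrative assumptions; the overhangs and guide sequences are taken from the list above.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper- or lower-case DNA string (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_oligos(guide, fwd_overhang="tgca", rev_overhang="aaac"):
    """Return (forward, reverse) oligos with lower-case sticky-end overhangs."""
    guide = guide.upper()
    if set(guide) - set("ACGT"):
        raise ValueError("guide must contain only A, C, G and T")
    return fwd_overhang + guide, rev_overhang + reverse_complement(guide)


if __name__ == "__main__":
    guides = {
        "BE-OsCDC48-sgRNA01": "GACCAGCCAGCGTCTGGCGC",
        "BE-OsROC5-gRNA04": "GCAGCTGGCTGAGGGTGCAT",
        "BE-OsROC5-gRNA05": "AGCCAGCTGCTTACAAAAC",
    }
    for name, guide in guides.items():
        fwd, rev = design_oligos(guide)
        print(f"{name}-F: {fwd}")
        print(f"{name}-R: {rev}")
```

Run on the three guide sequences, the sketch regenerates the six oligonucleotides of Seq ID No. 10 to Seq ID No. 15.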
The forward and reverse strands of BE-OsCDC48-sgRNA01-F/R, BE-OsROC5-gRNA04-F/R and BE-OsROC5-gRNA05-F/R were each mixed in equal proportion, held in a boiling water bath for 10 min, and then allowed to cool naturally and anneal, forming double-stranded DNA with sticky ends for use as the insert in recombinant vector construction. The plant STU nCas9-PmCDA1 single transcription unit directed base editing framework vector, the sticky-end insert, BsaI endonuclease and T4 DNA ligase were added to a 200 µL PCR tube, 10 cycles of digestion at 37 °C and ligation at 16 °C were run, the endonuclease and ligase were inactivated at 80 °C, and the reaction product was used to transform Escherichia coli.
Positive transformants were identified by kanamycin resistance selection, colony PCR and restriction digestion and were finally verified by sequencing, giving the recombinant expression vectors STU nCas9-PmCDA1-OsCDC48-sgRNA01, STU nCas9-PmCDA1-OsROC5-gRNA04 and STU nCas9-PmCDA1-OsROC5-gRNA05, respectively.
2. Rice protoplast transformation with the rice endogenous gene STU nCas9-PmCDA1-sgRNA base editing recombinant expression vectors
Protoplasts of rice cultivar Nipponbare were isolated and transformed separately with the recombinant expression vectors STU nCas9-PmCDA1-OsCDC48-sgRNA01, STU nCas9-PmCDA1-OsROC5-gRNA04 and STU nCas9-PmCDA1-OsROC5-gRNA05 by the PEG-mediated transformation method. For the specific procedure of rice protoplast transformation, see the experimental procedures disclosed in the literature (Tang X, Zheng X, Qi YP, Zhang D, Cheng Y, Tang A, Voytas DF, Zhang Y. 2016. A single transcript CRISPR-Cas9 system for efficient genome editing in plants. Molecular Plant, 9(7): 1088-1091).
3. Detection of directed cytosine base editing at specific sites of the endogenous rice OsCDC48 and OsROC5 genes
After transformation, the rice protoplasts were cultured at 25 °C in the dark for 48 hours, the transformed cells were collected, and protoplast genomic DNA was extracted by the CTAB method and used as the template for PCR amplification and Illumina high-throughput sequencing. For the specific method, see the reference (Tang X, Lowder LG, Zhang T, Malzahn A, Zheng X, Voytas DF, Zhong Z, Chen Y, Ren Q, Li Q, Kirkland ER, Zhang Y, Qi Y. 2017. A CRISPR-Cpf1 system for efficient genome editing and transcriptional repression in plants. Nature Plants, 3: 17018).
Analysis of the Illumina high-throughput sequencing results shows that, at the three cytosine base editing target sequences in the endogenous rice genes OsCDC48 and OsROC5, cytosine-to-thymine editing efficiencies of 28.62% (OsCDC48-sgRNA01), 30.99% (OsROC5-gRNA04) and 49.41% (OsROC5-gRNA05) were achieved, respectively (FIG. 2: nCas9-PmCDA1). In the control group, in which the PmCDA1 unit of the plant STU nCas9-PmCDA1 single transcription unit directed base editing framework vector constructed in the present invention was replaced with the rApobec1 cytosine deaminase according to a published report (Komor AC, Kim YB, Packer MS, Zuris JA, Liu DR. 2016. Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature, 533(7603): 420-424), the cytosine editing efficiencies at the same target sequences were 7.65% (OsCDC48-sgRNA01), 4.44% (OsROC5-gRNA04) and 4.16% (OsROC5-gRNA05), respectively (FIG. 2: nCas9-rApobec1).
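To make the comparison with the control group concrete, the reported efficiencies can be expressed as fold differences. The Python sketch below is purely arithmetic on the percentages quoted above; the dictionary layout and function name are illustrative assumptions and no additional data are introduced.

```python
# Reported C-to-T editing efficiencies in protoplasts (%), from the text:
# (nCas9-PmCDA1, nCas9-rApobec1 control)
EFFICIENCY = {
    "OsCDC48-sgRNA01": (28.62, 7.65),
    "OsROC5-gRNA04": (30.99, 4.44),
    "OsROC5-gRNA05": (49.41, 4.16),
}


def fold_change(pmcda1_pct, control_pct):
    """Ratio of the PmCDA1 efficiency to the control efficiency."""
    if control_pct <= 0:
        raise ValueError("control efficiency must be positive")
    return pmcda1_pct / control_pct


if __name__ == "__main__":
    for target, (pmcda1, control) in EFFICIENCY.items():
        print(f"{target}: {fold_change(pmcda1, control):.2f}-fold")
```

On these figures the PmCDA1 construct is roughly 3.7-fold, 7.0-fold and 11.9-fold more efficient than the control at the three targets, respectively.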
For the three rice cytosine base editing target sequences, the Illumina high-throughput sequencing results were further analysed for the efficiency of cytosine-to-thymine substitution at each individual cytosine within the editing sites. OsCDC48-sgRNA01 showed effective thymine substitution at six cytosines, C3, C4, C7, C8, C11 and C14 (FIG. 3: nCas9-PmCDA1); OsROC5-gRNA04 at two cytosines, C2 and C5 (FIG. 4: nCas9-PmCDA1); and OsROC5-gRNA05 at three cytosines, C-1, C3 and C4 (FIG. 5: nCas9-PmCDA1). In particular, with the plant STU nCas9-PmCDA1 single transcription unit directed base editing framework vector constructed in the invention, a cytosine-to-thymine editing efficiency of 37.94% was detected at the C-1 site, at position -1 bp relative to the OsROC5-gRNA05 target sequence (FIG. 5: nCas9-PmCDA1); cytosine base editing events located outside the sgRNA targeting sequence interval have not been achieved in existing research. Meanwhile, in the control group, both the editing window and the per-site cytosine-to-thymine editing efficiency at the three tested rice target sequences were markedly lower than those of the embodiment of the invention (FIG. 3: nCas9-rApobec1; FIG. 4: nCas9-rApobec1; FIG. 5: nCas9-rApobec1).
The above test results show that the plant STU nCas9-PmCDA1 single transcription unit directed base editing framework vector constructed in the invention can efficiently achieve directed editing of cytosine bases in specific regions of rice endogenous genes and can further broaden the editing window.
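The site labels used above (C3, C4 and so on) can be related to the guide sequences with a small Python sketch that lists every cytosine position in a guide. It assumes that positions are counted from 1 at the 5' end of the guide sequence as written in the oligonucleotides of Example 2; this numbering convention is an assumption, although it is consistent with the positions reported. The function name is hypothetical.

```python
def cytosine_positions(guide):
    """1-based positions of every C in the guide, counted from its 5' end."""
    return [i for i, base in enumerate(guide.upper(), start=1) if base == "C"]


# Guide sequences from Example 2 and the edited positions reported in the text
# (the C-1 site of OsROC5-gRNA05 lies outside the guide and is not listed here).
REPORTED = {
    "OsCDC48-sgRNA01": ("GACCAGCCAGCGTCTGGCGC", [3, 4, 7, 8, 11, 14]),
    "OsROC5-gRNA04": ("GCAGCTGGCTGAGGGTGCAT", [2, 5]),
    "OsROC5-gRNA05": ("AGCCAGCTGCTTACAAAAC", [3, 4]),
}

if __name__ == "__main__":
    for name, (guide, edited) in REPORTED.items():
        all_c = cytosine_positions(guide)
        unedited = [p for p in all_c if p not in edited]
        print(f"{name}: all C at {all_c}; reported edited {edited}; other C at {unedited}")
```

Every reported in-guide edited position corresponds to a cytosine under this numbering, and the edited cytosines cluster toward the 5' (PAM-distal) end of each guide.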
Example 3 creation and efficiency analysis of Rice regenerated plants with cytosine-directed base editing of endogenous Gene of Rice based on STU nCas9-PmCDA1 System
1. Agrobacterium transformation with the rice endogenous gene STU nCas9-PmCDA1-sgRNA base editing recombinant expression vectors
The recombinant expression vectors successfully constructed in example 2 and tested for directional modification activity in rice protoplasts, STU nCas9-PmCDA1-OsCDC48-sgRNA01, STU nCas9-PmCDA1-OsROC5-gRNA04, and STU nCas9-PmCDA1-OsROC5-gRNA05 were transformed into Agrobacterium EHA105 competent cells by heat shock method, spread on LB solid medium containing 50 mg/L kanamycin and 50 mg/L rifampicin, and cultured in the dark at 28 ℃ for 2 days to obtain positive clones. Positive clones were activated in LB liquid medium containing 50 mg/L kanamycin and 50 mg/L rifampicin for subsequent transformation.
2. Agrobacterium-mediated rice callus transformation of rice endogenous gene STU nCas9-PmCDA1-sgRNA base editing recombinant expression vector
Rice callus was transformed separately with the recombinant expression vectors STU nCas9-PmCDA1-OsCDC48-sgRNA01, STU nCas9-PmCDA1-OsROC5-gRNA04 and STU nCas9-PmCDA1-OsROC5-gRNA05 by the Agrobacterium-mediated transformation method. For the specific transformation procedure, see the reference (Tang X, Zheng X, Qi YP, Zhang D, Cheng Y, Tang A, Voytas DF, Zhang Y. 2016. A single transcript CRISPR-Cas9 system for efficient genome editing in plants. Molecular Plant, 9(7): 1088-1091).
3. Directional base editing detection and efficiency analysis of rice endogenous gene STU nCas9-PmCDA1-sgRNA base editing recombinant expression vector stable transformation regeneration plant
After transformation, the resistant calli were induced to form rice seedlings, genomic DNA was extracted from the regenerated transformed seedlings, and PCR amplification and Sanger sequencing analysis were carried out using this DNA as the template. Analysis of the regenerated rice plants transformed with the recombinant expression vectors STU nCas9-PmCDA1-OsCDC48-sgRNA01, STU nCas9-PmCDA1-OsROC5-gRNA04 and STU nCas9-PmCDA1-OsROC5-gRNA05 shows that base editing efficiencies of 44.44% (OsCDC48-sgRNA01: 8/18), 100.00% (OsROC5-gRNA04: 26/26) and 68.75% (OsROC5-gRNA05: 11/16) were achieved at the three cytosine base editing target sequences of the endogenous rice genes OsCDC48 and OsROC5, respectively (FIG. 6: nCas9-PmCDA1). These results further indicate that the plant STU nCas9-PmCDA1 single transcription unit directed base editing framework vector constructed in the invention can efficiently achieve directed editing of cytosine bases in specific regions of rice endogenous genes and yield base-edited regenerated plants; it is an effective molecular tool for directed base editing of plant genomes and can increase the efficiency of crop improvement by genome engineering.
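The percentages quoted for the regenerated plants follow directly from the plant counts given in parentheses. A minimal Python sketch of that calculation is shown below; the function name is illustrative, and it is assumed that each fraction is the number of base-edited plants over the number of plants analysed.

```python
def editing_efficiency(edited_plants, total_plants):
    """Percentage of regenerated plants carrying a base edit, to two decimals."""
    if total_plants <= 0:
        raise ValueError("total_plants must be positive")
    if not 0 <= edited_plants <= total_plants:
        raise ValueError("edited_plants must lie between 0 and total_plants")
    return round(100.0 * edited_plants / total_plants, 2)


if __name__ == "__main__":
    # Counts as reported in the text (edited plants, plants analysed).
    counts = {
        "OsCDC48-sgRNA01": (8, 18),
        "OsROC5-gRNA04": (26, 26),
        "OsROC5-gRNA05": (11, 16),
    }
    for target, (edited, total) in counts.items():
        print(f"{target}: {editing_efficiency(edited, total):.2f}%")
```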
Nucleotide and amino acid sequences
Seq ID No. 1: pZmUbi1-nCas9 ORF-PmCDA1-Poly A-sgRNA cloning scaffold-HSPT nucleotide sequence
CGCGCCTGCAGTGCAGCGTGACCCGGTCGTGCCCCTCTCTTGAGATAATGAGCATTGCATGTCTAAGTTATAAAAAATTACCACATATTTTTTTTGTCACACTTGTTTGAAGTGCAGTTTATCTATCTTTATACATATATTTAAACTTTACTCTACGAATAATATAATCTATAGTACTACAATAATATCAGTGTTTTAGAGAATCATATAAATGAACAGTTAGACATGGTCTAAAGGACAATTGAGTATTTTGACAACAGGACTCTACAGTTTTATCTTTTTAGTGTGCATGTGTTCTCCTTTTTTTTTGCAAATAGCTTCACCTATATAATACTTCATCCATTTTATTAGTACATCCATTTAGGGTTTAGGGTTAATGGTTTTTATAGACTAATTTTTTTAGTACATCTATTTTATTCTATTTTAGCCTCTAAATTAAGAAAACTAAAACTCTATTTTAGTTTTTTTATTTAATAATTTAGATATAAAATAGAATAAAATAAAGTGACTAAAAATTAAACAAATACCCTTTAAGAAATTAAAAAAACTAAGGAAACATTTTTCTTGTTTCGAGTAGATAATGCCAGCCTGTTAAACGCCGTCGACGAGTCTAACGGACACCAACCAGCGAACCAGCAGCGTCGCGTCGGGCCAAGCGAAGCAGACGGCACGGCATCTCTGTCGCTGCCTCTGGACCCCTCTCGAGAGTTCCGCTCCACCGTTGGACTTGCTCCGCTGTCGGCATCCAGAAATTGCGTGGCGGAGCGGCAGACGTGAGCCGGCACGGCAGGCGGCCTCCTCCTCCTCTCACGGCACCGGCAGCTACGGGGGATTCCTTTCCCACCGCTCCTTCGCTTTCCCTTCCTCGCCCGCCGTAATAAATAGACACCCCCTCCACACCCTCTTTCCCCAACCTCGTGTTGTTCGGAGCGCACACACACACAACCAGAACTCCCCCAAATCCACCCGTCGGCACCTCCGCTTCAAGGTACGCCGCTCGTCCTCCCCCCCCCCCCCTCTCTACCTTCTCAAGATCGGCGTTCCGGTCCATGGTTAGGGCCCGGTAGTTCTACTTCTGTTCATGTTTGTGTTAGATCCGTGTTTGTGTTAGATCCGTGCTACTAGCGTTCGTACACGGATGCGACCTGTACGTCAGACACGTTCTGATTGCTAACTTGCCAGTGTTTCTCTTTGGGGAATCCTGGGATGGCTCTAGCCGTTCCGCAGACGGGATCGATTTCATGATTTTTTTTGTTTCGTTGCATAGGGTTTGGTTTGCCCTTTTCCTTTATTTCAATATATGCCGTGCACTTGTTTGTCGGGTCATCTTTTCATGCTTTTTTTTGTCTTGGTTGTGATGATGTGGTCTGGTTGGGCGGTCGTTCAAGATCGGAGTAGAATTAATTCTGTTTCAAACTACCTGGTGGATTTATTAATTTTGGATCTGTATGTGTGTGCCATACATATTCATAGTTACGAATTGAAGATGATGGATGGAAATATCGATCTAGGATAGGTATACATGTTGATGCGGGTTTTACTGATGCATATACAGAGATGCTTTTTGTTCGCTTGGTTGTGATGATGTGGTGTGGTTGGGCGGTCGTTCATTCGTTCAAGATCGGAGTAGAATACTGTTTCAAACTACCTGGTGTATTTATTAATTTTGGAACTGTATGTGTGTGTCATACATCTTCATAGTTACGAGTTTAAGATGGATGGAAATATCGATCTAGGATAGGTATACATGTTGATGTGGGTTTTACTGATGCATATACATGATGGCATATGCAGCATCTATTCATATGCTCTAACCTTGAGTACCTATCTATTATAATAAACAAGTATGTTTTATAATTATTTTGATCTTGATATACTTGGATGATGGCATATGCAGCAGCTATATGTGGATTTTTTTAGCCCTGCCTTCATACGCTATTTATTTGCTTGGTACTGTTTCTTTTGTCGATGCTCACCCTGTTGTTTGGTGTTACTTCTGC
AGCCTGCAGGATGGATAAGAAGTACTCTATCGGACTCGCTATCGGAACTAACTCTGTGGGATGGGCTGTGATCACCGATGAGTACAAGGTGCCATCTAAGAAGTTCAAGGTTCTCGGAAACACCGATAGGCACTCTATCAAGAAAAACCTTATCGGTGCTCTCCTCTTCGATTCTGGTGAAACTGCTGAGGCTACCAGACTCAAGAGAACCGCTAGAAGAAGGTACACCAGAAGAAAGAACAGGATCTGCTACCTCCAAGAGATCTTCTCTAACGAGATGGCTAAAGTGGATGATTCATTCTTCCACAGGCTCGAAGAGTCATTCCTCGTGGAAGAAGATAAGAAGCACGAGAGGCACCCTATCTTCGGAAACATCGTTGATGAGGTGGCATACCACGAGAAGTACCCTACTATCTACCACCTCAGAAAGAAGCTCGTTGATTCTACTGATAAGGCTGATCTCAGGCTCATCTACCTCGCTCTCGCTCACATGATCAAGTTCAGAGGACACTTCCTCATCGAGGGTGATCTCAACCCTGATAACTCTGATGTGGATAAGTTGTTCATCCAGCTCGTGCAGACCTACAACCAGCTTTTCGAAGAGAACCCTATCAACGCTTCAGGTGTGGATGCTAAGGCTATCCTCTCTGCTAGGCTCTCTAAGTCAAGAAGGCTTGAGAACCTCATTGCTCAGCTCCCTGGTGAGAAGAAGAACGGACTTTTCGGAAACTTGATCGCTCTCTCTCTCGGACTCACCCCTAACTTCAAGTCTAACTTCGATCTCGCTGAGGATGCAAAGCTCCAGCTCTCAAAGGATACCTACGATGATGATCTCGATAACCTCCTCGCTCAGATCGGAGATCAGTACGCTGATTTGTTCCTCGCTGCTAAGAACCTCTCTGATGCTATCCTCCTCAGTGATATCCTCAGAGTGAACACCGAGATCACCAAGGCTCCACTCTCAGCTTCTATGATCAAGAGATACGATGAGCACCACCAGGATCTCACACTTCTCAAGGCTCTTGTTAGACAGCAGCTCCCAGAGAAGTACAAAGAGATTTTCTTCGATCAGTCTAAGAACGGATACGCTGGTTACATCGATGGTGGTGCATCTCAAGAAGAGTTCTACAAGTTCATCAAGCCTATCCTCGAGAAGATGGATGGAACCGAGGAACTCCTCGTGAAGCTCAATAGAGAGGATCTTCTCAGAAAGCAGAGGACCTTCGATAACGGATCTATCCCTCATCAGATCCACCTCGGAGAGTTGCACGCTATCCTTAGAAGGCAAGAGGATTTCTACCCATTCCTCAAGGATAACAGGGAAAAGATTGAGAAGATTCTCACCTTCAGAATCCCTTACTACGTGGGACCTCTCGCTAGAGGAAACTCAAGATTCGCTTGGATGACCAGAAAGTCTGAGGAAACCATCACCCCTTGGAACTTCGAAGAGGTGGTGGATAAGGGTGCTAGTGCTCAGTCTTTCATCGAGAGGATGACCAACTTCGATAAGAACCTTCCAAACGAGAAGGTGCTCCCTAAGCACTCTTTGCTCTACGAGTACTTCACCGTGTACAACGAGTTGACCAAGGTTAAGTACGTGACCGAGGGAATGAGGAAGCCTGCTTTTTTGTCAGGTGAGCAAAAGAAGGCTATCGTTGATCTCTTGTTCAAGACCAACAGAAAGGTGACCGTGAAGCAGCTCAAAGAGGATTACTTCAAGAAAATCGAGTGCTTCGATTCAGTTGAGATTTCTGGTGTTGAGGATAGGTTCAACGCATCTCTCGGAACCTACCACGATCTCCTCAAGATCATTAAGGATAAGGATTTCTTGGATAACGAGGAAAACGAGGATATCTTGGAGGATATCGTTCTTACCCTCACCCTCTTTGAAGATAGAGAGATGATTGAAGAAAGGCTCAAGACCTACGCTCATCTCTTCGATGATAAGGTGATGAAGCAGTTGAAGAGAAGAAGATACACTGGTTGGGGAAGGCTCTCAA
GAAAGCTCATTAACGGAATCAGGGATAAGCAGTCTGGAAAGACAATCCTTGATTTCCTCAAGTCTGATGGATTCGCTAACAGAAACTTCATGCAGCTCATCCACGATGATTCTCTCACCTTTAAAGAGGATATCCAGAAGGCTCAGGTTTCAGGACAGGGTGATAGTCTCCATGAGCATATCGCTAACCTCGCTGGATCTCCTGCAATCAAGAAGGGAATCCTCCAGACTGTGAAGGTTGTGGATGAGTTGGTGAAGGTGATGGGAAGGCATAAGCCTGAGAACATCGTGATCGAAATGGCTAGAGAGAACCAGACCACTCAGAAGGGACAGAAGAACTCTAGGGAAAGGATGAAGAGGATCGAGGAAGGTATCAAAGAGCTTGGATCTCAGATCCTCAAAGAGCACCCTGTTGAGAACACTCAGCTCCAGAATGAGAAGCTCTACCTCTACTACCTCCAGAACGGAAGGGATATGTATGTGGATCAAGAGTTGGATATCAACAGGCTCTCTGATTACGATGTTGATCATATCGTGCCACAGTCATTCTTGAAGGATGATTCTATCGATAACAAGGTGCTCACCAGGTCTGATAAGAACAGGGGTAAGAGTGATAACGTGCCAAGTGAAGAGGTTGTGAAGAAAATGAAGAACTATTGGAGGCAGCTCCTCAACGCTAAGCTCATCACTCAGAGAAAGTTCGATAACTTGACTAAGGCTGAGAGGGGAGGACTCTCTGAATTGGATAAGGCAGGATTCATCAAGAGGCAGCTTGTGGAAACCAGGCAGATCACTAAGCACGTTGCACAGATCCTCGATTCTAGGATGAACACCAAGTACGATGAGAACGATAAGTTGATCAGGGAAGTGAAGGTTATCACCCTCAAGTCAAAGCTCGTGTCTGATTTCAGAAAGGATTTCCAATTCTACAAGGTGAGGGAAATCAACAACTACCACCACGCTCACGATGCTTACCTTAACGCTGTTGTTGGAACCGCTCTCATCAAGAAGTATCCTAAGCTCGAGTCAGAGTTCGTGTACGGTGATTACAAGGTGTACGATGTGAGGAAGATGATCGCTAAGTCTGAGCAAGAGATCGGAAAGGCTACCGCTAAGTATTTCTTCTACTCTAACATCATGAATTTCTTCAAGACCGAGATTACCCTCGCTAACGGTGAGATCAGAAAGAGGCCACTCATCGAGACAAACGGTGAAACAGGTGAGATCGTGTGGGATAAGGGAAGGGATTTCGCTACCGTTAGAAAGGTGCTCTCTATGCCACAGGTGAACATCGTTAAGAAAACCGAGGTGCAGACCGGTGGATTCTCTAAAGAGTCTATCCTCCCTAAGAGGAACTCTGATAAGCTCATTGCTAGGAAGAAGGATTGGGACCCTAAGAAATACGGTGGTTTCGATTCTCCTACCGTGGCTTACTCTGTTCTCGTTGTGGCTAAGGTTGAGAAGGGAAAGAGTAAGAAGCTCAAGTCTGTTAAGGAACTTCTCGGAATCACTATCATGGAAAGGTCATCTTTCGAGAAGAACCCAATCGATTTCCTCGAGGCTAAGGGATACAAAGAGGTTAAGAAGGATCTCATCATCAAGCTCCCAAAGTACTCACTCTTCGAACTCGAGAACGGTAGAAAGAGGATGCTCGCTTCTGCTGGTGAGCTTCAAAAGGGAAACGAGCTTGCTCTCCCATCTAAGTACGTTAACTTTCTTTACCTCGCTTCTCACTACGAGAAGTTGAAGGGATCTCCAGAAGATAACGAGCAGAAGCAACTTTTCGTTGAGCAGCACAAGCACTACTTGGATGAGATCATCGAGCAGATCTCTGAGTTCTCTAAAAGGGTGATCCTCGCTGATGCAAACCTCGATAAGGTGTTGTCTGCTTACAACAAGCACAGAGATAAGCCTATCAGGGAACAGGCAGAGAACATCATCCATCTCTTCACCCTTACCAACCTCGGTGCTCCTGCTGCTTTCAAGTACTTCGATACAACC
ATCGATAGGAAGAGATACACCTCTACCAAAGAAGTGCTCGATGCTACCCTCATCCATCAGTCTATCACTGGACTCTACGAGACTAGGATCGATCTCTCACAGCTCGGTGGTGATTCAAGGGCTGATCCTAAGAAGAAGAGGAAGGTTGGAGACGACGGAGGTGGCGGTACAGGAGGGGGTGGGTCCGCTGAGTATGTCAGGGCGTTGTTCGACTTCAATGGAAACGACGAGGAAGATCTGCCTTTTAAAAAGGGAGATATTCTCAGGATCAGAGATAAGCCGGAAGAACAATGGTGGAACGCTGAAGACTCTGAAGGTAAGAGAGGTATGATTCTTGTCCCCTACGTCGAGAAGTATTCGGGTGACTATAAAGACCACGATGGAGATTATAAGGACCACGATATAGATTATAAGGATGATGATGATAAGAGCGGAATGACCGATGCAGAGTACGTCAGGATTCATGAGAAACTTGACATCTACACGTTTAAGAAACAGTTTTTCAACAACAAAAAATCTGTTAGTCACCGCTGTTACGTGCTGTTCGAATTGAAACGCAGAGGTGAGAGGAGAGCCTGCTTTTGGGGCTATGCCGTCAACAAGCCGCAAAGCGGCACAGAAAGGGGCATTCACGCGGAGATATTTAGCATTAGAAAGGTCGAGGAATACCTTCGGGATAATCCCGGGCAATTCACTATCAATTGGTACTCTTCATGGTCCCCGTGTGCAGATTGCGCTGAAAAGATACTGGAGTGGTATAATCAAGAACTCAGAGGAAACGGTCACACCCTCAAGATTTGGGCTTGCAAGCTTTACTACGAGAAAAATGCAAGGAACCAGATCGGCCTCTGGAACTTGCGCGACAACGGCGTGGGGTTGAATGTGATGGTGTCGGAGCATTACCAGTGCTGCCGGAAGATATTCATTCAGTCGTCACATAATCAATTGAACGAGAATAGGTGGCTCGAAAAAACCCTGAAGCGGGCCGAGAAGTGGAGGAGTGAACTCTCGATAATGATCCAGGTTAAAATACTGCATACTACCAAATCTCCGGCGGTGGGACCGAAGAAGAAGCGCAAGGTGGGGACCATGACTAATCTCTCAGATATAATCGAGAAGGAAACAGGAAAGCAACTGGTCATCCAAGAATCGATTTTGATGCTTCCCGAAGAAGTCGAAGAAGTTATAGGAAATAAGCCCGAGTCTGACATACTGGTTCACACAGCGTACGATGAAAGTACGGACGAGAATGTCATGTTGCTGACATCGGACGCACCTGAATACAAGCCTTGGGCTCTGGTCATACAAGATAGTAACGGAGAAAATAAGATTAAAATGCTTTCAGGTGGCTCCCCAAAGAAGAAACGCAAGGTTTGAGGATCTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACAAAGCACCAGTGGTCTAGTGGTAGAATAGTACCCTGCCACGGTACAGACCCGGGTTCGATTCCCGGCTGGTGCAGGAGACCTTATATTCCCCAGAACATCAGGTTAATGGCGTTTTTGATGTCATTTTCGCGGTGGCTGAGATCAGCCACTTCTTCCCCGATAACGGAAACCGGCACACTGGCCATATCGGTGGTCATCATGCGCCAGCTTTCATCCCCGATATGCACCACCGGGTAAAGTTCACGGGAGACTTTATCTGACAGCAGACGTGCACTGGCCAGGGGGATCACCATCCGTCGCCCGGGCGTGTCAATAATATCACTCTGTACATCCACAAACAGACGATAACGGCTCTCTCTTTTATAGGTGTAAACCTTAAACTGCATTTCACCAGCCCCTGTTCTCGTCAGCAAAAGAGCCGTTCATTTCAATAAACCGGGCGACCTCAGCCATCCCTTCCTGATTTTCCGCTTTCCAGCGTTCGGCACGCAGACGACGGGCTTCATTCTGCATGGTTGTGCTTACCAGACCGGAGATATTGACATCATATATGCCTTGAGCAACT
GATAGCTGTCGCTGTCAACTGTCACTGTAATACGCTGCTTCATAGCATACCTCTTTTTGACATACTTCGGGTATACATATCAGTATATATTCTTATACCGCAAAAATCAGCGCGCAAATACGCATACTGTTATCTGGCTTGGTCTCAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCAACAAAGCACCAGTGGTCTAGTGGTAGAATAGTACCCTGCCACGGTACAGACCCGGGTTCGATTCCCGGCTGGTGCAGGATCCATATGAAGATGAAGATGAAATATTTGGTGTGTCAAATAAAAAGCTTGTGTGCTTAAGTTTGTGTTTTTTTCTTGGCTTGTTGTGTTATGAATTTGTGGCTTTTTCTAATATTAAATGAATGTAAGATCACATTATAATGAATAAACAAATGTTTCTATAATCCATTGTGAATGTTTTGTTGGATCTCTTCTGCAGCATATAACTACTGTATGTGCTATGGTATGGACTATGGAATATGATTAAAGATAAG
Seq ID No. 2: nCas9-PmCDA1 nuclease-cytosine deaminase fusion protein expression frame amino acid sequence
MDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDHIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSRADPKKKRKVGDDGGGGTGGGGSAEYVRALFDFNGNDEEDLPFKKGDILRIRDKPEEQWWNAEDSEGKRGMILVPYVEKYSGDYKDHDGDYKDHDIDYKDDDDKSGMTDAEYVRIHEKLDIYTFKKQFFNNKKSVSHRCYVLFELKRRGERRACFWGYAVNKPQSGTERGIHAEIFSIRKVEEYLRDNPGQFTINWYSSWSPCADCAEKILEWYNQELRGNGHTLKIWACKLYYEKNARNQIGLWNLRDNGVGLNVMVSEHYQCCRKIFIQSSHNQLNENRWLEKTLKRAEKWRSELSIMIQVKILHTTKSPAVGPKKKRKVGTMTNLSDIIEKETGKQLVIQESILMLPEEVEEVIGNKPESDILVHTAYDESTDENVMLLTSDAPEYKPWALVIQDSNGENKIKMLSGGSPKKKRKV。
Sequence listing
<110> University of Electronic Science and Technology of China
<120> plant genome directed base editing framework vector and application thereof
<130> 20180001
<141> 2018-11-23
<160> 15
<170> SIPOSequenceListing 1.0
<210> 1
<211> 8556
<212> DNA
<213> Artificial Sequence
<400> 1
cgcgcctgca gtgcagcgtg acccggtcgt gcccctctct tgagataatg agcattgcat 60
gtctaagtta taaaaaatta ccacatattt tttttgtcac acttgtttga agtgcagttt 120
atctatcttt atacatatat ttaaacttta ctctacgaat aatataatct atagtactac 180
aataatatca gtgttttaga gaatcatata aatgaacagt tagacatggt ctaaaggaca 240
attgagtatt ttgacaacag gactctacag ttttatcttt ttagtgtgca tgtgttctcc 300
tttttttttg caaatagctt cacctatata atacttcatc cattttatta gtacatccat 360
ttagggttta gggttaatgg tttttataga ctaatttttt tagtacatct attttattct 420
attttagcct ctaaattaag aaaactaaaa ctctatttta gtttttttat ttaataattt 480
agatataaaa tagaataaaa taaagtgact aaaaattaaa caaataccct ttaagaaatt 540
aaaaaaacta aggaaacatt tttcttgttt cgagtagata atgccagcct gttaaacgcc 600
gtcgacgagt ctaacggaca ccaaccagcg aaccagcagc gtcgcgtcgg gccaagcgaa 660
gcagacggca cggcatctct gtcgctgcct ctggacccct ctcgagagtt ccgctccacc 720
gttggacttg ctccgctgtc ggcatccaga aattgcgtgg cggagcggca gacgtgagcc 780
ggcacggcag gcggcctcct cctcctctca cggcaccggc agctacgggg gattcctttc 840
ccaccgctcc ttcgctttcc cttcctcgcc cgccgtaata aatagacacc ccctccacac 900
cctctttccc caacctcgtg ttgttcggag cgcacacaca cacaaccaga actcccccaa 960
atccacccgt cggcacctcc gcttcaaggt acgccgctcg tcctcccccc ccccccctct 1020
ctaccttctc aagatcggcg ttccggtcca tggttagggc ccggtagttc tacttctgtt 1080
catgtttgtg ttagatccgt gtttgtgtta gatccgtgct actagcgttc gtacacggat 1140
gcgacctgta cgtcagacac gttctgattg ctaacttgcc agtgtttctc tttggggaat 1200
cctgggatgg ctctagccgt tccgcagacg ggatcgattt catgattttt tttgtttcgt 1260
tgcatagggt ttggtttgcc cttttccttt atttcaatat atgccgtgca cttgtttgtc 1320
gggtcatctt ttcatgcttt tttttgtctt ggttgtgatg atgtggtctg gttgggcggt 1380
cgttcaagat cggagtagaa ttaattctgt ttcaaactac ctggtggatt tattaatttt 1440
ggatctgtat gtgtgtgcca tacatattca tagttacgaa ttgaagatga tggatggaaa 1500
tatcgatcta ggataggtat acatgttgat gcgggtttta ctgatgcata tacagagatg 1560
ctttttgttc gcttggttgt gatgatgtgg tgtggttggg cggtcgttca ttcgttcaag 1620
atcggagtag aatactgttt caaactacct ggtgtattta ttaattttgg aactgtatgt 1680
gtgtgtcata catcttcata gttacgagtt taagatggat ggaaatatcg atctaggata 1740
ggtatacatg ttgatgtggg ttttactgat gcatatacat gatggcatat gcagcatcta 1800
ttcatatgct ctaaccttga gtacctatct attataataa acaagtatgt tttataatta 1860
ttttgatctt gatatacttg gatgatggca tatgcagcag ctatatgtgg atttttttag 1920
ccctgccttc atacgctatt tatttgcttg gtactgtttc ttttgtcgat gctcaccctg 1980
ttgtttggtg ttacttctgc agcctgcagg atggataaga agtactctat cggactcgct 2040
atcggaacta actctgtggg atgggctgtg atcaccgatg agtacaaggt gccatctaag 2100
aagttcaagg ttctcggaaa caccgatagg cactctatca agaaaaacct tatcggtgct 2160
ctcctcttcg attctggtga aactgctgag gctaccagac tcaagagaac cgctagaaga 2220
aggtacacca gaagaaagaa caggatctgc tacctccaag agatcttctc taacgagatg 2280
gctaaagtgg atgattcatt cttccacagg ctcgaagagt cattcctcgt ggaagaagat 2340
aagaagcacg agaggcaccc tatcttcgga aacatcgttg atgaggtggc ataccacgag 2400
aagtacccta ctatctacca cctcagaaag aagctcgttg attctactga taaggctgat 2460
ctcaggctca tctacctcgc tctcgctcac atgatcaagt tcagaggaca cttcctcatc 2520
gagggtgatc tcaaccctga taactctgat gtggataagt tgttcatcca gctcgtgcag 2580
acctacaacc agcttttcga agagaaccct atcaacgctt caggtgtgga tgctaaggct 2640
atcctctctg ctaggctctc taagtcaaga aggcttgaga acctcattgc tcagctccct 2700
ggtgagaaga agaacggact tttcggaaac ttgatcgctc tctctctcgg actcacccct 2760
aacttcaagt ctaacttcga tctcgctgag gatgcaaagc tccagctctc aaaggatacc 2820
tacgatgatg atctcgataa cctcctcgct cagatcggag atcagtacgc tgatttgttc 2880
ctcgctgcta agaacctctc tgatgctatc ctcctcagtg atatcctcag agtgaacacc 2940
gagatcacca aggctccact ctcagcttct atgatcaaga gatacgatga gcaccaccag 3000
gatctcacac ttctcaaggc tcttgttaga cagcagctcc cagagaagta caaagagatt 3060
ttcttcgatc agtctaagaa cggatacgct ggttacatcg atggtggtgc atctcaagaa 3120
gagttctaca agttcatcaa gcctatcctc gagaagatgg atggaaccga ggaactcctc 3180
gtgaagctca atagagagga tcttctcaga aagcagagga ccttcgataa cggatctatc 3240
cctcatcaga tccacctcgg agagttgcac gctatcctta gaaggcaaga ggatttctac 3300
ccattcctca aggataacag ggaaaagatt gagaagattc tcaccttcag aatcccttac 3360
tacgtgggac ctctcgctag aggaaactca agattcgctt ggatgaccag aaagtctgag 3420
gaaaccatca ccccttggaa cttcgaagag gtggtggata agggtgctag tgctcagtct 3480
ttcatcgaga ggatgaccaa cttcgataag aaccttccaa acgagaaggt gctccctaag 3540
cactctttgc tctacgagta cttcaccgtg tacaacgagt tgaccaaggt taagtacgtg 3600
accgagggaa tgaggaagcc tgcttttttg tcaggtgagc aaaagaaggc tatcgttgat 3660
ctcttgttca agaccaacag aaaggtgacc gtgaagcagc tcaaagagga ttacttcaag 3720
aaaatcgagt gcttcgattc agttgagatt tctggtgttg aggataggtt caacgcatct 3780
ctcggaacct accacgatct cctcaagatc attaaggata aggatttctt ggataacgag 3840
gaaaacgagg atatcttgga ggatatcgtt cttaccctca ccctctttga agatagagag 3900
atgattgaag aaaggctcaa gacctacgct catctcttcg atgataaggt gatgaagcag 3960
ttgaagagaa gaagatacac tggttgggga aggctctcaa gaaagctcat taacggaatc 4020
agggataagc agtctggaaa gacaatcctt gatttcctca agtctgatgg attcgctaac 4080
agaaacttca tgcagctcat ccacgatgat tctctcacct ttaaagagga tatccagaag 4140
gctcaggttt caggacaggg tgatagtctc catgagcata tcgctaacct cgctggatct 4200
cctgcaatca agaagggaat cctccagact gtgaaggttg tggatgagtt ggtgaaggtg 4260
atgggaaggc ataagcctga gaacatcgtg atcgaaatgg ctagagagaa ccagaccact 4320
cagaagggac agaagaactc tagggaaagg atgaagagga tcgaggaagg tatcaaagag 4380
cttggatctc agatcctcaa agagcaccct gttgagaaca ctcagctcca gaatgagaag 4440
ctctacctct actacctcca gaacggaagg gatatgtatg tggatcaaga gttggatatc 4500
aacaggctct ctgattacga tgttgatcat atcgtgccac agtcattctt gaaggatgat 4560
tctatcgata acaaggtgct caccaggtct gataagaaca ggggtaagag tgataacgtg 4620
ccaagtgaag aggttgtgaa gaaaatgaag aactattgga ggcagctcct caacgctaag 4680
ctcatcactc agagaaagtt cgataacttg actaaggctg agaggggagg actctctgaa 4740
ttggataagg caggattcat caagaggcag cttgtggaaa ccaggcagat cactaagcac 4800
gttgcacaga tcctcgattc taggatgaac accaagtacg atgagaacga taagttgatc 4860
agggaagtga aggttatcac cctcaagtca aagctcgtgt ctgatttcag aaaggatttc 4920
caattctaca aggtgaggga aatcaacaac taccaccacg ctcacgatgc ttaccttaac 4980
gctgttgttg gaaccgctct catcaagaag tatcctaagc tcgagtcaga gttcgtgtac 5040
ggtgattaca aggtgtacga tgtgaggaag atgatcgcta agtctgagca agagatcgga 5100
aaggctaccg ctaagtattt cttctactct aacatcatga atttcttcaa gaccgagatt 5160
accctcgcta acggtgagat cagaaagagg ccactcatcg agacaaacgg tgaaacaggt 5220
gagatcgtgt gggataaggg aagggatttc gctaccgtta gaaaggtgct ctctatgcca 5280
caggtgaaca tcgttaagaa aaccgaggtg cagaccggtg gattctctaa agagtctatc 5340
ctccctaaga ggaactctga taagctcatt gctaggaaga aggattggga ccctaagaaa 5400
tacggtggtt tcgattctcc taccgtggct tactctgttc tcgttgtggc taaggttgag 5460
aagggaaaga gtaagaagct caagtctgtt aaggaacttc tcggaatcac tatcatggaa 5520
aggtcatctt tcgagaagaa cccaatcgat ttcctcgagg ctaagggata caaagaggtt 5580
aagaaggatc tcatcatcaa gctcccaaag tactcactct tcgaactcga gaacggtaga 5640
aagaggatgc tcgcttctgc tggtgagctt caaaagggaa acgagcttgc tctcccatct 5700
aagtacgtta actttcttta cctcgcttct cactacgaga agttgaaggg atctccagaa 5760
gataacgagc agaagcaact tttcgttgag cagcacaagc actacttgga tgagatcatc 5820
gagcagatct ctgagttctc taaaagggtg atcctcgctg atgcaaacct cgataaggtg 5880
ttgtctgctt acaacaagca cagagataag cctatcaggg aacaggcaga gaacatcatc 5940
catctcttca cccttaccaa cctcggtgct cctgctgctt tcaagtactt cgatacaacc 6000
atcgatagga agagatacac ctctaccaaa gaagtgctcg atgctaccct catccatcag 6060
tctatcactg gactctacga gactaggatc gatctctcac agctcggtgg tgattcaagg 6120
gctgatccta agaagaagag gaaggttgga gacgacggag gtggcggtac aggagggggt 6180
gggtccgctg agtatgtcag ggcgttgttc gacttcaatg gaaacgacga ggaagatctg 6240
ccttttaaaa agggagatat tctcaggatc agagataagc cggaagaaca atggtggaac 6300
gctgaagact ctgaaggtaa gagaggtatg attcttgtcc cctacgtcga gaagtattcg 6360
ggtgactata aagaccacga tggagattat aaggaccacg atatagatta taaggatgat 6420
gatgataaga gcggaatgac cgatgcagag tacgtcagga ttcatgagaa acttgacatc 6480
tacacgttta agaaacagtt tttcaacaac aaaaaatctg ttagtcaccg ctgttacgtg 6540
ctgttcgaat tgaaacgcag aggtgagagg agagcctgct tttggggcta tgccgtcaac 6600
aagccgcaaa gcggcacaga aaggggcatt cacgcggaga tatttagcat tagaaaggtc 6660
gaggaatacc ttcgggataa tcccgggcaa ttcactatca attggtactc ttcatggtcc 6720
ccgtgtgcag attgcgctga aaagatactg gagtggtata atcaagaact cagaggaaac 6780
ggtcacaccc tcaagatttg ggcttgcaag ctttactacg agaaaaatgc aaggaaccag 6840
atcggcctct ggaacttgcg cgacaacggc gtggggttga atgtgatggt gtcggagcat 6900
taccagtgct gccggaagat attcattcag tcgtcacata atcaattgaa cgagaatagg 6960
tggctcgaaa aaaccctgaa gcgggccgag aagtggagga gtgaactctc gataatgatc 7020
caggttaaaa tactgcatac taccaaatct ccggcggtgg gaccgaagaa gaagcgcaag 7080
gtggggacca tgactaatct ctcagatata atcgagaagg aaacaggaaa gcaactggtc 7140
atccaagaat cgattttgat gcttcccgaa gaagtcgaag aagttatagg aaataagccc 7200
gagtctgaca tactggttca cacagcgtac gatgaaagta cggacgagaa tgtcatgttg 7260
ctgacatcgg acgcacctga atacaagcct tgggctctgg tcatacaaga tagtaacgga 7320
gaaaataaga ttaaaatgct ttcaggtggc tccccaaaga agaaacgcaa ggtttgagga 7380
tctaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaacaaagca 7440
ccagtggtct agtggtagaa tagtaccctg ccacggtaca gacccgggtt cgattcccgg 7500
ctggtgcagg agaccttata ttccccagaa catcaggtta atggcgtttt tgatgtcatt 7560
ttcgcggtgg ctgagatcag ccacttcttc cccgataacg gaaaccggca cactggccat 7620
atcggtggtc atcatgcgcc agctttcatc cccgatatgc accaccgggt aaagttcacg 7680
ggagacttta tctgacagca gacgtgcact ggccaggggg atcaccatcc gtcgcccggg 7740
cgtgtcaata atatcactct gtacatccac aaacagacga taacggctct ctcttttata 7800
ggtgtaaacc ttaaactgca tttcaccagc ccctgttctc gtcagcaaaa gagccgttca 7860
tttcaataaa ccgggcgacc tcagccatcc cttcctgatt ttccgctttc cagcgttcgg 7920
cacgcagacg acgggcttca ttctgcatgg ttgtgcttac cagaccggag atattgacat 7980
catatatgcc ttgagcaact gatagctgtc gctgtcaact gtcactgtaa tacgctgctt 8040
catagcatac ctctttttga catacttcgg gtatacatat cagtatatat tcttataccg 8100
caaaaatcag cgcgcaaata cgcatactgt tatctggctt ggtctcagtt ttagagctag 8160
aaatagcaag ttaaaataag gctagtccgt tatcaacttg aaaaagtggc accgagtcgg 8220
tgcaacaaag caccagtggt ctagtggtag aatagtaccc tgccacggta cagacccggg 8280
ttcgattccc ggctggtgca ggatccatat gaagatgaag atgaaatatt tggtgtgtca 8340
aataaaaagc ttgtgtgctt aagtttgtgt ttttttcttg gcttgttgtg ttatgaattt 8400
gtggcttttt ctaatattaa atgaatgtaa gatcacatta taatgaataa acaaatgttt 8460
ctataatcca ttgtgaatgt tttgttggat ctcttctgca gcatataact actgtatgtg 8520
ctatggtatg gactatggaa tatgattaaa gataag 8556
<210> 2
<211> 1788
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 2
Met Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val
1 5 10 15
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
20 25 30
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
35 40 45
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
50 55 60
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
65 70 75 80
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
85 90 95
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
100 105 110
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
115 120 125
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
130 135 140
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
145 150 155 160
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
165 170 175
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
180 185 190
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
195 200 205
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
210 215 220
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
225 230 235 240
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
245 250 255
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
260 265 270
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
275 280 285
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
290 295 300
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
305 310 315 320
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
325 330 335
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
340 345 350
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
355 360 365
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
370 375 380
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
385 390 395 400
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
405 410 415
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
420 425 430
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
435 440 445
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
450 455 460
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
465 470 475 480
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
485 490 495
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
500 505 510
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
515 520 525
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
530 535 540
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
545 550 555 560
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
565 570 575
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
580 585 590
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
595 600 605
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
610 615 620
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
625 630 635 640
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
645 650 655
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
660 665 670
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
675 680 685
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
690 695 700
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
705 710 715 720
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
725 730 735
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
740 745 750
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
755 760 765
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
770 775 780
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
785 790 795 800
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
805 810 815
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
820 825 830
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
835 840 845
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
850 855 860
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
865 870 875 880
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
885 890 895
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
900 905 910
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
915 920 925
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
930 935 940
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
945 950 955 960
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
965 970 975
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
980 985 990
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe
995 1000 1005
Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys
1010 1015 1020
Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr Ser
1025 1030 1035 1040
Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn Gly Glu
1045 1050 1055
Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr Gly Glu Ile
1060 1065 1070
Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg Lys Val Leu Ser
1075 1080 1085
Met Pro Gln Val Asn Ile Val Lys Lys Thr Glu Val Gln Thr Gly Gly
1090 1095 1100
Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg Asn Ser Asp Lys Leu Ile
1105 1110 1115 1120
Ala Arg Lys Lys Asp Trp Asp Pro Lys Lys Tyr Gly Gly Phe Asp Ser
1125 1130 1135
Pro Thr Val Ala Tyr Ser Val Leu Val Val Ala Lys Val Glu Lys Gly
1140 1145 1150
Lys Ser Lys Lys Leu Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile
1155 1160 1165
Met Glu Arg Ser Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala
1170 1175 1180
Lys Gly Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys
1185 1190 1195 1200
Tyr Ser Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser
1205 1210 1215
Ala Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr
1220 1225 1230
Val Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser
1235 1240 1245
Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His
1250 1255 1260
Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg Val
1265 1270 1275 1280
Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr Asn Lys
1285 1290 1295
His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile Ile His Leu
1300 1305 1310
Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe Lys Tyr Phe Asp
1315 1320 1325
Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr Lys Glu Val Leu Asp
1330 1335 1340
Ala Thr Leu Ile His Gln Ser Ile Thr Gly Leu Tyr Glu Thr Arg Ile
1345 1350 1355 1360
Asp Leu Ser Gln Leu Gly Gly Asp Ser Arg Ala Asp Pro Lys Lys Lys
1365 1370 1375
Arg Lys Val Gly Asp Asp Gly Gly Gly Gly Thr Gly Gly Gly Gly Ser
1380 1385 1390
Ala Glu Tyr Val Arg Ala Leu Phe Asp Phe Asn Gly Asn Asp Glu Glu
1395 1400 1405
Asp Leu Pro Phe Lys Lys Gly Asp Ile Leu Arg Ile Arg Asp Lys Pro
1410 1415 1420
Glu Glu Gln Trp Trp Asn Ala Glu Asp Ser Glu Gly Lys Arg Gly Met
1425 1430 1435 1440
Ile Leu Val Pro Tyr Val Glu Lys Tyr Ser Gly Asp Tyr Lys Asp His
1445 1450 1455
Asp Gly Asp Tyr Lys Asp His Asp Ile Asp Tyr Lys Asp Asp Asp Asp
1460 1465 1470
Lys Ser Gly Met Thr Asp Ala Glu Tyr Val Arg Ile His Glu Lys Leu
1475 1480 1485
Asp Ile Tyr Thr Phe Lys Lys Gln Phe Phe Asn Asn Lys Lys Ser Val
1490 1495 1500
Ser His Arg Cys Tyr Val Leu Phe Glu Leu Lys Arg Arg Gly Glu Arg
1505 1510 1515 1520
Arg Ala Cys Phe Trp Gly Tyr Ala Val Asn Lys Pro Gln Ser Gly Thr
1525 1530 1535
Glu Arg Gly Ile His Ala Glu Ile Phe Ser Ile Arg Lys Val Glu Glu
1540 1545 1550
Tyr Leu Arg Asp Asn Pro Gly Gln Phe Thr Ile Asn Trp Tyr Ser Ser
1555 1560 1565
Trp Ser Pro Cys Ala Asp Cys Ala Glu Lys Ile Leu Glu Trp Tyr Asn
1570 1575 1580
Gln Glu Leu Arg Gly Asn Gly His Thr Leu Lys Ile Trp Ala Cys Lys
1585 1590 1595 1600
Leu Tyr Tyr Glu Lys Asn Ala Arg Asn Gln Ile Gly Leu Trp Asn Leu
1605 1610 1615
Arg Asp Asn Gly Val Gly Leu Asn Val Met Val Ser Glu His Tyr Gln
1620 1625 1630
Cys Cys Arg Lys Ile Phe Ile Gln Ser Ser His Asn Gln Leu Asn Glu
1635 1640 1645
Asn Arg Trp Leu Glu Lys Thr Leu Lys Arg Ala Glu Lys Trp Arg Ser
1650 1655 1660
Glu Leu Ser Ile Met Ile Gln Val Lys Ile Leu His Thr Thr Lys Ser
1665 1670 1675 1680
Pro Ala Val Gly Pro Lys Lys Lys Arg Lys Val Gly Thr Met Thr Asn
1685 1690 1695
Leu Ser Asp Ile Ile Glu Lys Glu Thr Gly Lys Gln Leu Val Ile Gln
1700 1705 1710
Glu Ser Ile Leu Met Leu Pro Glu Glu Val Glu Glu Val Ile Gly Asn
1715 1720 1725
Lys Pro Glu Ser Asp Ile Leu Val His Thr Ala Tyr Asp Glu Ser Thr
1730 1735 1740
Asp Glu Asn Val Met Leu Leu Thr Ser Asp Ala Pro Glu Tyr Lys Pro
1745 1750 1755 1760
Trp Ala Leu Val Ile Gln Asp Ser Asn Gly Glu Asn Lys Ile Lys Met
1765 1770 1775
Leu Ser Gly Gly Ser Pro Lys Lys Lys Arg Lys Val
1780 1785
<210> 3
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 3
cgacatccgc aagtaccagg 20
<210> 4
<211> 22
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 4
agtacactgt ttccccgtat gt 22
<210> 5
<211> 19
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 5
gctgctggtg agtgctgat 19
<210> 6
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 6
acccattggg agtgtcttgc 20
<210> 7
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 7
gaccagccag cgtctggcgc 20
<210> 8
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 8
gcagctggct gagggtgcat 20
<210> 9
<211> 19
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 9
agccagctgc ttacaaaac 19
<210> 10
<211> 24
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 10
tgcagaccag ccagcgtctg gcgc 24
<210> 11
<211> 24
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 11
aaacgcgcca gacgctggct ggtc 24
<210> 12
<211> 24
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 12
tgcagcagct ggctgagggt gcat 24
<210> 13
<211> 24
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 13
aaacatgcac cctcagccag ctgc 24
<210> 14
<211> 23
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 14
tgcaagccag ctgcttacaa aac 23
<210> 15
<211> 23
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 15
aaacgttttg taagcagctg gct 23
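
The six 23 to 24 nt oligonucleotides that close the listing come in pairs: an entry beginning with tgca is followed by an entry beginning with aaac, and the remainder of each should anneal to its partner. The following is a minimal sketch of that check in Python. The helper names are illustrative and not part of the patent, and the 4 nt overhang length is an assumption inferred from the listed sequences.

```python
# Illustrative check (not from the patent): confirm that each forward/reverse
# oligonucleotide pair in the listing is reverse complementary over the
# target portion, i.e. after the assumed 4-nt 5' overhang is removed.

COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def is_annealing_pair(forward: str, reverse: str, overhang: int = 4) -> bool:
    """True if the oligos are reverse complementary once the 5' overhang
    (assumed to be 4 nt) is stripped from each strand."""
    return reverse_complement(forward[overhang:]) == reverse[overhang:]


# Oligo pairs copied from the sequence listing, with the spaces removed.
PAIRS = [
    ("tgcagaccagccagcgtctggcgc", "aaacgcgccagacgctggctggtc"),
    ("tgcagcagctggctgagggtgcat", "aaacatgcaccctcagccagctgc"),
    ("tgcaagccagctgcttacaaaac", "aaacgttttgtaagcagctggct"),
]

# The 20/20/19-nt target entries that precede the oligos in the listing.
TARGETS = [
    "gaccagccagcgtctggcgc",
    "gcagctggctgagggtgcat",
    "agccagctgcttacaaaac",
]

if __name__ == "__main__":
    for (fwd, rev), target in zip(PAIRS, TARGETS):
        # First flag: forward oligo carries the target; second: pair anneals.
        print(fwd[4:] == target, is_annealing_pair(fwd, rev))
```

Under these assumptions all three pairs pass both checks, which is consistent with the oligonucleotides being annealed into double-stranded inserts for the three target sequences.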

Claims (10)

1. A plant genome directed base editing backbone vector, characterized in that it comprises a core unit consisting of two core regions, namely an nCas9-PmCDA1 nuclease-cytosine deaminase fusion protein expression unit and a synthetic guide RNA (sgRNA) transcription expression unit, wherein transcription of the core unit is driven by a Pol II type promoter;
the core unit is, in order from the 5' end to the 3' end, nCas9 ORF-PmCDA1-Poly A-sgRNA cloning scaffold-T; the nCas9 ORF is the coding frame of the D10A mutant of the Streptococcus pyogenes nuclease protein; PmCDA1 is the functional unit of the cytosine deaminase coding region; Poly A is the poly(A) region; the sgRNA cloning scaffold is the sgRNA cloning and transcription unit, of which there is at least one; and T is a terminator.
2. The plant genome directed base editing backbone vector of claim 1, wherein the functional unit of the PmCDA1 cytosine deaminase coding region comprises, in order from the N-terminus to the C-terminus, a GGGS linker, an SH3 linker, the PmCDA1 coding region, an NLS signal peptide, a UGI coding region, an SGGS linker and an NLS signal peptide.
3. The plant genome directed base editing backbone vector of claim 1 or 2, satisfying at least one of the following:
a. the amino acid sequence encoded by the nCas9 ORF (the coding frame of the nCas9 nuclease protein D10A mutant) is shown as amino acids 1 to 1382 of Seq ID No. 2;
b. the amino acid sequence encoded by the functional unit of the PmCDA1 cytosine deaminase coding region is shown as amino acids 1383 to 1788 of Seq ID No. 2.
4. The plant genome directed base editing backbone vector of any one of claims 1 to 3, wherein the sgRNA cloning scaffold (the sgRNA cloning and transcription unit) comprises, in order from the 5' end to the 3' end, a tRNA-Gly coding sequence, a BsaI-ccdB-BsaI unit, an sgRNA framework coding sequence and a tRNA-Gly coding sequence.
5. The plant genome directed base editing backbone vector of any one of claims 1 to 4, wherein the number of sgRNA cloning and transcription units is 1 to 6.
6. The plant genome directed base editing backbone vector of any one of claims 1 to 5, wherein the nucleotide sequence of the sgRNA cloning scaffold (the sgRNA cloning and transcription unit) is shown as nucleotides 7432 to 8300 of Seq ID No. 1.
7. The plant genome directed base editing backbone vector of any one of claims 1 to 6, satisfying at least one of the following:
a. the nucleotide sequence of the nCas9 ORF (the coding frame of the nCas9 nuclease protein D10A mutant) is shown as nucleotides 2011 to 6156 of Seq ID No. 1;
b. the nucleotide sequence of the functional unit of the PmCDA1 cytosine deaminase coding region is shown as nucleotides 6157 to 7374 of Seq ID No. 1;
c. the nucleotide sequence of the Poly A region is shown as nucleotides 7384 to 7431 of Seq ID No. 1.
8. The plant genome directed base editing backbone vector of any one of claims 1 to 6, satisfying at least one of the following:
a. the terminator is the rice HSP terminator HSPT, and its nucleotide sequence is shown as nucleotides 8307 to 8556 of Seq ID No. 1;
b. the Pol II type promoter is the maize promoter pZmUbi1, and its nucleotide sequence is shown as nucleotides 1 to 2008 of Seq ID No. 1.
9. The plant genome directed base editing backbone vector of any one of claims 1 to 8, wherein the core unit of the backbone vector has the structure pZmUbi1-nCas9 ORF-PmCDA1-Poly A-sgRNA cloning scaffold-HSPT, and its nucleotide sequence is shown in Seq ID No. 1.
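
The nucleotide and amino-acid coordinates recited in claims 3 and 6 to 9 can be cross-checked against each other with simple arithmetic. The sketch below is illustrative only: the `Segment` type and helper names are hypothetical, the coordinates are copied from the claims, and positions are treated as 1-based and inclusive, which is an assumption about the claim wording.

```python
# Illustrative consistency check (not from the patent). Coordinates are the
# ones recited in the claims, read here as 1-based inclusive positions in
# Seq ID No. 1.
from typing import List, NamedTuple, Tuple


class Segment(NamedTuple):
    name: str
    start: int  # first nucleotide, 1-based
    end: int    # last nucleotide, inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1


SEGMENTS = [
    Segment("pZmUbi1 promoter", 1, 2008),
    Segment("nCas9 ORF", 2011, 6156),
    Segment("PmCDA1 functional unit", 6157, 7374),
    Segment("Poly A", 7384, 7431),
    Segment("sgRNA cloning scaffold", 7432, 8300),
    Segment("HSPT terminator", 8307, 8556),
]


def codon_count(segment: Segment) -> int:
    """Number of codons in a coding segment; fails if not a multiple of 3."""
    if segment.length % 3 != 0:
        raise ValueError(f"{segment.name} is not a whole number of codons")
    return segment.length // 3


def gaps(segments: List[Segment]) -> List[Tuple[str, str, int]]:
    """Unassigned nucleotides between each pair of consecutive segments."""
    return [(a.name, b.name, b.start - a.end - 1)
            for a, b in zip(segments, segments[1:])]


if __name__ == "__main__":
    for seg in SEGMENTS:
        print(f"{seg.name}: {seg.length} bp")
    print("nCas9 codons:", codon_count(SEGMENTS[1]))
    print("PmCDA1 unit codons:", codon_count(SEGMENTS[2]))
    print("gaps:", gaps(SEGMENTS))
```

Read this way, the two coding segments span 1382 and 406 codons, which matches the amino-acid ranges 1 to 1382 and 1383 to 1788 of Seq ID No. 2. The 9 nucleotides left unassigned after position 7374 begin with tga in the listing, consistent with a stop codon closing the fusion reading frame.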
10. A method for preparing a recombinant expression vector for directed base editing of a specific cytosine base at a target site in a plant genome, comprising the following steps:
a. defining a target DNA region in the genome of a specific organism, analyzing the region for PAM features, and selecting a DNA sequence of 15-30 bp adjacent to the 5' end of a PAM as the specific target sequence;
b. according to the selected specific target sequence, synthesizing a forward oligonucleotide strand of the form 5'-CGGA-N_X-3' and a reverse oligonucleotide strand of the form 5'-AAAC-N_X-3', where N represents any one of A, G, C and T, X is an integer with 14 ≤ X ≤ 30, and N_X in the forward strand is the reverse complement of N_X in the reverse strand; and annealing the two strands to obtain a complementary double-stranded oligonucleotide fragment;
c. mixing the plant genome directed base editing backbone vector of any one of claims 1 to 9 with the complementary double-stranded oligonucleotide fragment obtained in step b, adding BsaI endonuclease and T4 DNA ligase to the same reaction system, and running a cyclic digestion-ligation reaction to obtain the recombinant expression vector for directed base editing of the target site.
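
Steps a and b of claim 10 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the inventors' software: the function names are hypothetical, the PAM is assumed to be NGG (the canonical PAM of Streptococcus pyogenes Cas9; the claim itself only speaks of PAM features), only the given strand is scanned, and the default target length of 20 nt is simply one value inside the claimed range.

```python
# Minimal sketch of target selection and oligo design (claim 10, steps a-b).
# Assumptions: NGG PAM, single-strand scan, hypothetical function names.
import re
from typing import Iterator, Tuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (returned in upper case)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_targets(region: str, length: int = 20) -> Iterator[Tuple[int, str, str]]:
    """Yield (start, target, pam) for each NGG PAM on the given strand that
    has at least `length` bases on its 5' side. `start` is 0-based."""
    region = region.upper()
    # A lookahead is used so that overlapping PAM candidates are all found.
    for match in re.finditer(r"(?=([ACGT]GG))", region):
        pam_start = match.start()
        if pam_start >= length:
            yield (pam_start - length,
                   region[pam_start - length:pam_start],
                   match.group(1))


def design_oligos(target: str,
                  fwd_overhang: str = "CGGA",
                  rev_overhang: str = "AAAC") -> Tuple[str, str]:
    """Return (forward, reverse) oligos: overhang + target, and overhang +
    reverse complement of the target. Overhang defaults follow the claim."""
    target = target.upper()
    if not 14 <= len(target) <= 30:
        raise ValueError("target length must satisfy 14 <= X <= 30")
    if set(target) - set("ACGT"):
        raise ValueError("target may contain only A, C, G and T")
    return fwd_overhang + target, rev_overhang + reverse_complement(target)


if __name__ == "__main__":
    # Toy region: one target from the sequence listing followed by an AGG PAM.
    region = "GACCAGCCAGCGTCTGGCGC" + "AGG"
    for start, target, pam in find_targets(region):
        print(start, target, pam, design_oligos(target))
```

The overhangs are left as parameters because the forward oligonucleotides in the sequence listing begin with tgca rather than the CGGA recited in the claim; passing `fwd_overhang="TGCA"` reproduces the listed form.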
CN201811403794.0A 2018-11-23 2018-11-23 Plant genome directional base editing framework vector and application thereof Active CN110607320B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811403794.0A CN110607320B (en) 2018-11-23 2018-11-23 Plant genome directional base editing framework vector and application thereof

Publications (2)

Publication Number Publication Date
CN110607320A true CN110607320A (en) 2019-12-24
CN110607320B CN110607320B (en) 2023-05-12

Family

ID=68888837

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811403794.0A Active CN110607320B (en) 2018-11-23 2018-11-23 Plant genome directional base editing framework vector and application thereof

Country Status (1)

Country Link
CN (1) CN110607320B (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105132451A (en) * 2015-07-08 2015-12-09 电子科技大学 CRISPR/Cas9 single transcription unit directionally modified backbone vector and application thereof
WO2017090761A1 (en) * 2015-11-27 2017-06-01 国立大学法人神戸大学 Method for converting monocot plant genome sequence in which nucleic acid base in targeted dna sequence is specifically converted, and molecular complex used therein
US20200377910A1 (en) * 2016-04-21 2020-12-03 National University Corporation Kobe University Method for increasing mutation introduction efficiency in genome sequence modification technique, and molecular complex to be used therefor
CN107012164A (en) * 2017-01-11 2017-08-04 电子科技大学 CRISPR/Cpf1 Plant Genome directed modifications functional unit, the carrier comprising the functional unit and its application
WO2018143477A1 (en) * 2017-02-06 2018-08-09 国立大学法人 筑波大学 Method of modifying genome of dicotyledonous plant
WO2022060185A1 (en) * 2020-09-18 2022-03-24 기초과학연구원 Targeted deaminase and base editing using same
CN114317590A (en) * 2020-09-30 2022-04-12 北京市农林科学院 Method for mutating base C in plant genome into base T
CN113717960A (en) * 2021-08-27 2021-11-30 电子科技大学 Novel Cas9 protein, CRISPR-Cas9 genome directed editing vector and genome editing method

Non-Patent Citations (10)

* Cited by examiner, † Cited by third party
Title
"Precise base editing in rice, wheat and maize with a Cas9-cytidine deaminase fusion", NAT BIOTECHNOL *
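LEE HK et al.: "Development of CRISPR technology for precise single-base genome editing: a brief review", BMB REP *
NISHIDA K et al.: "Targeted nucleotide editing using hybrid prokaryotic and vertebrate adaptive immune systems", SCIENCE *
SHIMATANI Z et al.: "Targeted base editing in rice and tomato using a CRISPR-Cas9 cytidine deaminase fusion", NAT BIOTECHNOL *
WU Y et al.: "Increasing Cytosine Base Editing Scope and Efficiency With Engineered Cas9-PmCDA1 Fusions and the Modified sgRNA in Rice", FRONT GENET *
ZHONG Z et al.: "Improving Plant Genome Editing with High-Fidelity xCas9 and Non-canonical PAM-Targeting Cas9-NG", MOL PLANT *
仲昭辉: "Construction and application of a highly efficient CRISPR-Cas-based rice genome editing system", China Doctoral and Master's Theses Full-text Database (Doctoral), Agricultural Science and Technology Series *
唐旭: "Construction and application of a highly efficient CRISPR-Cas-based rice genome editing system", China Doctoral and Master's Theses Full-text Database (Doctoral), Agricultural Science and Technology Series (Monthly) *
张爱霞 et al.: "Single-base gene editing technology based on the CRISPR/Cas9 system and its application in pharmaceutical research", Chinese Journal of Pharmacology and Toxicology *
黄华媚 et al.: "Development and application of a BE3-type cytosine base editor in Corynebacterium glutamicum", Biotechnology Bulletin *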

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111850034A (en) * 2020-06-24 2020-10-30 中国农业大学 Gene editing carrier and method
CN111850034B (en) * 2020-06-24 2023-01-10 中国农业大学 Gene editing carrier and method
CN112080517A (en) * 2020-09-08 2020-12-15 南京农业大学 Screening system for improving probability of obtaining gene editing plants, construction method and application thereof
CN112852791A (en) * 2020-11-20 2021-05-28 中国农业科学院植物保护研究所 Adenine base editor and related biological material and application thereof
CN112852791B (en) * 2020-11-20 2022-05-24 中国农业科学院植物保护研究所 Adenine base editor and related biological material and application thereof
CN114540406A (en) * 2020-11-26 2022-05-27 电子科技大学 Genome editing expression box, vector and application thereof
CN114540406B (en) * 2020-11-26 2023-09-29 电子科技大学 Genome editing expression frame, vector and application thereof
CN112575014A (en) * 2020-12-11 2021-03-30 安徽省农业科学院水稻研究所 Novel base editor SpCas9-LjCDAL1 and construction and application thereof
CN112575014B (en) * 2020-12-11 2022-04-01 安徽省农业科学院水稻研究所 Base editor SpCas9-LjCDAL1 and construction and application thereof
CN116135974A (en) * 2021-11-17 2023-05-19 中国科学院天津工业生物技术研究所 Recombinant glycosylase base editing system and application thereof
CN114507683A (en) * 2021-11-19 2022-05-17 杭州嘉因生物科技有限公司 SURE strain with Kan resistance gene knocked out and construction method and application thereof

Also Published As

Publication number Publication date
CN110607320B (en) 2023-05-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant