CN113201517B - Cytosine single base editor tool and application thereof - Google Patents

Cytosine single base editor tool and application thereof

Info

Publication number
CN113201517B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110519757.1A
Other languages
Chinese (zh)
Other versions
CN113201517A (en)
Inventor
乔云波
李丽平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou University
Original Assignee
Guangzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou University
Priority to CN202110519757.1A
Publication of CN113201517A
Application granted
Publication of CN113201517B
Legal status: Active
Anticipated expiration

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/78 Hydrolases (3) acting on carbon to nitrogen bonds other than peptide bonds (3.5)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y305/00 Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5)
    • C12Y305/04 Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5) in cyclic amidines (3.5.4)
    • C12Y305/04001 Cytosine deaminase (3.5.4.1)
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]

Abstract

The invention belongs to the technical field of biological engineering and gene editing, and particularly relates to a cytosine single-base editor tool and application thereof. The cytosine single-base editor contains a polynucleotide sequence encoding a fusion protein. The fusion protein comprises the SgoCas9 D9A nickase, a codon-optimized Cas9 nickase homologue derived from Streptococcus gordonii. The cytosine single-base editor recognizes NNAAAG as its PAM, edits cytosines within a window at positions 8-14 of the gRNA targeting sequence, and achieves specific C-to-T base conversion, thereby widening the targeting range and application range of base editing.

Description

Cytosine single base editor tool and application thereof
Technical Field
The invention belongs to the technical field of biological engineering and gene editing, and particularly relates to a cytosine single-base editor tool and application thereof.
Background
The development and application of the CRISPR/Cas9 system has contributed significantly to biology and medicine. As the most classical gene editing system, it consists of two parts: a gRNA (guide RNA) sequence that recognizes the genomic target, and Cas9, which has endonuclease activity. Guided by the gRNA, Cas9 uses its PAM-interacting domain (PID) to recognize the protospacer adjacent motif (PAM; a specific base sequence on the target genome, such as NGG), and then cuts the double-stranded DNA through its RuvC and HNH domains, which have endonuclease activity (Anders, C., et al., Structural basis of PAM-dependent target DNA recognition by the Cas9 endonuclease. Nature, 2014. 513(7519): p. 569-73). The DNA damage repair response of the organism then joins the broken ends by homologous recombination repair or non-homologous repair, achieving knockout or insertion of the target gene in cells (Lin, S., et al., Enhanced homology-directed human genome engineering by controlled timing of CRISPR/Cas9 delivery. eLife, 2014. 3: p. e04766).
The editing efficiency of the CRISPR/Cas9 system is high. However, cutting double-stranded DNA causes relatively many insertions and deletions of gene fragments, and the repaired bases cannot be controlled precisely. Scientists therefore invented single-base editing tools based on the CRISPR/Cas9 system. These include the adenine base editor ABE (Gaudelli, N.M., et al., Programmable base editing of A•T to G•C in genomic DNA without DNA cleavage. Nature, 2017. 551(7681): p. 464-471), which replaces the target-site base pair A/T with G/C (A/T-to-G/C), and the cytidine base editor CBE (Komor, A.C., et al., Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature, 2016. 533(7603): p. 420-4), which replaces C/G with T/A.
A single-base editing tool is a base editor formed by fusing a cytidine deaminase or adenosine deaminase to a Cas9 nickase (D10A mutation, which inactivates the RuvC active domain); it can replace a single base precisely. CBE was first reported by the David Liu laboratory at Harvard University as the single-base editor BE3 (APOBEC-XTEN-Cas9n(D10A)-UGI), formed by fusing rat-derived cytosine deaminase (APOBEC1) to nCas9 (Komor, A.C., et al., Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature, 2016. 533(7603): p. 420-4). BE3 has high editing efficiency, achieves accurate C-to-T conversion, and produces a far lower proportion of deletions and insertions than a CRISPR/Cas9 knockout system. BE4 (APOBEC-XTEN-Cas9n(D10A)-2×UGI-SV40 NLS) was subsequently developed from BE3 by adding a uracil DNA glycosylase inhibitor (UGI) and a nuclear localization signal (NLS) and by changing the linker length (Komor, A.C., et al., Improved base excision repair inhibition and bacteriophage Mu Gam protein yields C:G-to-T:A base editors with higher efficiency and product purity. Sci Adv, 2017. 3(8): p. eaao4774). On the basis of BE4, the deaminase APOBEC1 was replaced with Anc689 APOBEC and a nuclear localization signal was added, giving AncBE4max (SV40 NLS-Anc689 APOBEC-XTEN-Cas9n(D10A)-2×UGI-SV40 NLS) (Koblan, L.W., et al., Improving cytidine and adenosine base editors by expression optimization and ancestral reconstruction. Nat Biotechnol, 2018. 36(9): p. 843-846), which achieves the highest editing efficiency and the lowest proportion of by-products. The PAM recognized by ancBE4max is NGG, the corresponding editing window is positions 4-8 from the 5' end of the gRNA range, and its Cas9n is derived from Streptococcus pyogenes (SpCas9; 1369 amino acids in total).
However, the targeting window and PAM restriction of ancBE4max (it primarily recognizes NGG PAM sequences) greatly limit the range of the genome that can be targeted.
In view of the above problems, scientists developed a series of SpCas9 protein mutants by protein engineering and directed evolution, such as SpCas9-NG, SpRY (Walton, R.T., et al., Unconstrained genome targeting with near-PAMless engineered CRISPR-Cas9 variants. Science, 2020. 368(6488): p. 290-296), SpCas9-VQR, SpCas9-VRER (Kleinstiver, B.P., et al., Engineered CRISPR-Cas9 nucleases with altered PAM specificities. Nature, 2015. 523(7561): p. 481-5), and SpCas9-HF1 (Kleinstiver, B.P., et al., High-fidelity CRISPR-Cas9 nucleases with no detectable genome-wide off-target effects. Nature, 2016. 529(7587): p. 490-5). Scientists have also sought new Cas9 protein homologues, such as Nme2Cas9 (Edraki, A., et al., A Compact, High-Accuracy Cas9 with a Dinucleotide PAM for In Vivo Genome Editing. Mol Cell, 2019. 73(4): p. 714-726.e4), SaCas9 (Nishimasu, H., et al., Crystal Structure of Staphylococcus aureus Cas9. Cell, 2015. 162(5): p. 1113-26), St1Cas9 (Zhang, Y., et al., Catalytic-state structure and engineering of Streptococcus thermophilus Cas9. Nature Catalysis, 2020. 3(10): p. 813-823), and others. A series of base editors with various targeting properties and PAM recognition was then developed from these variants. The editing windows of the classical editors are mainly positions 4-8, and all of these editors show PAM preference or low targeting efficiency at some sites. Moreover, the expression plasmid of a classical base editor far exceeds the packaging range of adenovirus, which hinders clinical research and application.
Therefore, developing novel base editors with different editing windows, different PAM recognition and smaller expression plasmids is key to current gene editing research and clinical application.
Disclosure of Invention
In view of the problems of existing base editors, the invention aims to provide a cytosine single-base editor tool and application thereof. The tool efficiently induces C-to-T conversion within an editing window at positions 8-14 from the 5' end and recognizes NNAAAG as its PAM. It thereby enlarges the genome targeting range of base editing, reduces the size of the gene editing tool so that it better fits the packaging range of adenovirus, and has good application prospects.
Based on the purpose, the invention adopts the following technical scheme:
In a first aspect, the invention provides a fusion protein comprising a codon-optimized Cas9 nickase homologous protein derived from Streptococcus gordonii, a cytosine deaminase, and a uracil glycosylase inhibitor protein;
wherein the amino acid sequence of the Cas9 nickase homologous protein is:
a): the amino acid sequence of positions 2 to 1136 from the N-terminus of the SgoCas9 D9A nickase, shown in SEQ ID NO. 1;
or b): an amino acid sequence having 90% or more sequence identity with SEQ ID NO. 1 and having the function of the amino acid sequence shown in SEQ ID NO. 1;
or c): an amino acid sequence that, together with SEQ ID NO. 1, yields the complete fusion protein through cleavage and splicing of an intron sequence, or an amino acid sequence that is partially identical to SEQ ID NO. 1 and has the function of the amino acid sequence shown in SEQ ID NO. 1.
The DNA coding sequence of the codon-optimized Cas9 nickase homologous protein derived from Streptococcus gordonii is:
i): the DNA coding sequence corresponding to the SgoCas9 D9A nickase, shown in SEQ ID NO. 2, which is suitable for eukaryotic expression after codon optimization;
or ii): a DNA coding sequence corresponding to an amino acid sequence having 90% or more sequence identity with the amino acid sequence shown in SEQ ID NO. 1 and having the function of the amino acid sequence shown in SEQ ID NO. 1;
or iii): a DNA sequence that differs from the DNA sequence shown in SEQ ID NO. 2 only by synonymous codons.
Further, the fusion protein also comprises an N-terminal BPNLS-ancAPOBEC1 polypeptide and a C-terminal 2×UGI-BPNLS polypeptide; the BPNLS-ancAPOBEC1 polypeptide is formed by fusing the BPNLS polypeptide and the ancAPOBEC1 polypeptide, and the 2×UGI-BPNLS polypeptide is formed by fusing the UGI polypeptide and the BPNLS polypeptide;
the amino acid sequence of the BPNLS-ancAPOBEC1 polypeptide is:
d): the amino acid sequence shown in SEQ ID NO. 3;
or e): an amino acid sequence having 90% or more sequence identity with SEQ ID NO. 3 and having the function of the amino acid sequence shown in SEQ ID NO. 3;
the amino acid sequence of the 2×UGI-BPNLS polypeptide is:
f): the amino acid sequence shown in SEQ ID NO. 4;
or g): an amino acid sequence having 90% or more sequence identity with SEQ ID NO. 4 and having the function of the amino acid sequence shown in SEQ ID NO. 4.
Further, from the N-terminus to the C-terminus the fusion protein comprises, in order: the BPNLS polypeptide, the ancAPOBEC1 polypeptide, a 32aa linker, amino acids 2-1136 of the SgoCas9 D9A nickase, a 10aa linker, the 2×UGI polypeptide and the BPNLS polypeptide;
the complete amino acid sequence of the fusion protein is:
h): the amino acid sequence shown in SEQ ID NO. 5;
or i): an amino acid sequence in which the 6 elements (the BPNLS polypeptide, the ancAPOBEC1 polypeptide, the 32aa linker, amino acids 2-1136 from the N-terminus of the SgoCas9 D9A nickase, the 10aa linker and the 2×UGI polypeptide) are rearranged, added to or reduced while retaining the function of editing a cytosine base into thymine;
or j): an amino acid sequence having 90% or more sequence identity with the amino acid sequence shown in SEQ ID NO. 5 that recognizes NNAAAG as PAM and edits a cytosine base into thymine.
Further, the fusion protein recognizes NNAAAG as PAM, wherein N represents any base; the fusion protein edits cytosine to thymine at positions 8-14 of the editing window.
Further, the fusion protein also comprises a nuclear localization signal polypeptide fragment, whose amino acid sequence is shown in SEQ ID NO. 8.
In a second aspect, the present invention provides a polynucleotide sequence that encodes the above fusion protein and is shown in SEQ ID NO. 6.
In a third aspect, the present invention provides a cytosine single-base editor obtained by integrating the polynucleotide sequence encoding the above fusion protein into an expression vector.
Furthermore, the expression vector carries a gRNA scaffold designed from a tandem repeat sequence of Streptococcus gordonii, and the nucleotide sequence of the expression vector is shown in SEQ ID NO. 7.
In a fourth aspect, the present invention provides a cell expression system comprising a cytosine single base editor of claim 7 or claim 8; the cell is a host cell, and the host cell is a eukaryotic cell or a prokaryotic cell.
Further, the cell is a mouse neuroblastoma cell, a human embryonic kidney cell or a human colon cancer cell.
Furthermore, the mouse neuroblastoma cell is an N2a cell, the human embryonic kidney cell is an HEK293T cell, and the human colon cancer cell is an HCT116 cell.
Compared with the prior art, the invention has the following beneficial effects:
On the basis of AncBE4max, the invention replaces the Cas9n protein with SgoCas9n, whose recognized PAM is NNAAAG, and replaces the scaffold of the gRNA expression vector with a gRNA scaffold designed from a tandem repeat sequence of Streptococcus gordonii. Together these form a novel, efficient single-base editor, Sgo-ancBE4max. Its editing range is the cytosines at positions 8-14 from the 5' end of the target sequence, and the editing system converts cytosine into thymine (C-to-T). This widens the targeting range of base editing, and the protein size of the base editing tool suits the packaging requirement of adenovirus.
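The targeting rule described above (a 24-nt target sequence followed by an NNAAAG PAM, with editable cytosines at positions 8-14) can be illustrated with a minimal Python sketch. The function name `find_sgo_targets`, the forward-strand-only scan, and the convention that position 1 is the 5'-most base of the 24-nt target are assumptions made for illustration; they are not taken from the patent.

```python
import re

# Lookahead so that overlapping PAM candidates are all reported.
# NNAAAG: two arbitrary bases followed by AAAG.
PAM_PATTERN = re.compile(r"(?=([ACGT]{2}AAAG))")


def find_sgo_targets(sequence, spacer_len=24, window=(8, 14)):
    """List candidate target sites on the given (forward) strand.

    Assumption: position 1 is the 5'-most base of the target sequence
    that sits immediately upstream of the PAM.
    """
    sequence = sequence.upper()
    targets = []
    for match in PAM_PATTERN.finditer(sequence):
        pam_start = match.start()
        if pam_start < spacer_len:
            continue  # not enough sequence upstream for a full target
        spacer = sequence[pam_start - spacer_len:pam_start]
        editable = [
            pos for pos in range(window[0], window[1] + 1)
            if spacer[pos - 1] == "C"
        ]
        targets.append(
            {"spacer": spacer, "pam": match.group(1), "editable_c": editable}
        )
    return targets


# Hypothetical toy sequence: C at positions 3, 8 and 14 of the target.
example = "GTCGTTGCTTGTTCTTGTTGTTGT" + "TGAAAG"
for hit in find_sgo_targets(example):
    print(hit["spacer"], hit["pam"], hit["editable_c"])
```

The cytosine at position 3 of the toy target is ignored because it falls outside the 8-14 window. A fuller scan would also search the reverse complement strand.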
Drawings
FIG. 1 is a schematic diagram of the domains of the Sgo protein;
FIG. 2 is a schematic diagram of the domains of the Sgo-ancBE4max protein;
FIG. 3 is a schematic diagram of the plasmid structure of Sgo-ancBE4max;
FIG. 4 is a schematic diagram of the gRNA plasmid structure of the Sgo-ancBE4max system;
FIG. 5 shows the experimental results of Example 3 of the present invention: Sanger sequencing results after HEK293T cells were co-transfected with Sgo-ancBE4max and gRNA. A schematic of the targeted DNA sequence and the PAM sequence is shown at the top of the figure, and the editing-efficiency statistics for the 4 corresponding target sites are shown on the right;
FIG. 6 is a heat map of the editing efficiency of the Sgo-ancBE4max editing system in HEK293T cells;
FIG. 7 is a histogram of the normalized editing efficiency of the Sgo-ancBE4max editing system in HEK293T cells, in which the dashed box marks the editing window;
FIG. 8 is a statistical chart of the editing efficiency of the Sgo-ancBE4max editing system in HCT116 cells;
FIG. 9 is a statistical chart of the editing efficiency of the SgoCas9-ancBE4max system in N2a cells.
Detailed Description
To better illustrate the objects, aspects and advantages of the present invention, the present invention will be further described with reference to the following examples. It will be understood by those skilled in the art that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The test methods used in the examples are all conventional methods unless otherwise specified; the materials, reagents and the like used are commercially available unless otherwise specified.
Example 1 construction of the SgoCas9-ancBE4max plasmid
This example provides a method for constructing the SgoCas9-ancBE4max plasmid, with the following specific steps:
First, the amino acid sequence alignment tool ClustalW2 was used to align the amino acid sequence of the Cas9 protein homologue SgoCas9 from Streptococcus gordonii (Streptococcus gordonii str. Challis substr. CH1; Sgo for short) with that of SpCas9, in order to divide SgoCas9 into functional domains (as shown in FIG. 1) and to locate the functional site of its RuvC domain relative to that of SpCas9. The functional site of the RuvC domain of SpCas9 is the aspartic acid at position 10 (D10); the corresponding functional site of the RuvC domain of SgoCas9 is the aspartic acid at position 9 (D9). Mutating D9 to alanine (A) gives the SgoCas9 D9A nickase; amino acids 2 to 1136 of the SgoCas9 D9A nickase are shown in SEQ ID NO. 1.
Second, the prokaryotic codons of the Streptococcus gordonii SgoCas9 D9A were optimized for eukaryotes to obtain a coding DNA sequence of SgoCas9 D9A suitable for expression in eukaryotic cells, shown in SEQ ID NO. 2. The optimized SgoCas9 D9A sequence was produced by whole-gene synthesis by a commercial company.
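The alignment-based mapping of a reference residue (SpCas9 D10) to its counterpart in a homologue, followed by the point mutation, can be sketched in Python as below. The gapped strings are a toy alignment invented for illustration; the second string is not the real SgoCas9 sequence, and the function names are hypothetical.

```python
def map_position(aligned_ref, aligned_query, ref_pos):
    """Map a 1-based residue position in the reference onto the query,
    using two gapped strings taken from a pairwise alignment."""
    ref_index = 0
    query_index = 0
    for ref_char, query_char in zip(aligned_ref, aligned_query):
        if ref_char != "-":
            ref_index += 1
        if query_char != "-":
            query_index += 1
        if ref_char != "-" and ref_index == ref_pos:
            # None means the reference residue aligns to a gap.
            return query_index if query_char != "-" else None
    raise ValueError("ref_pos is beyond the end of the reference")


def point_mutant(sequence, position, expected, replacement):
    """Return the sequence with one residue replaced (1-based position)."""
    if sequence[position - 1] != expected:
        raise ValueError(
            f"expected {expected} at position {position}, "
            f"found {sequence[position - 1]}"
        )
    return sequence[:position - 1] + replacement + sequence[position:]


# Toy alignment (hypothetical): the query lacks one residue near the
# N-terminus, so reference D10 corresponds to query D9.
aligned_ref = "MDKKYSIGLDIGT"
aligned_query = "M-KKYSIGLDIGT"

query_pos = map_position(aligned_ref, aligned_query, 10)
query_seq = aligned_query.replace("-", "")
print(query_pos, point_mutant(query_seq, query_pos, "D", "A"))
```

Checking the expected residue before substituting guards against off-by-one errors when positions are carried over from an alignment.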
The construction strategy is to replace SpCas 9D 10A of the ancBE4max with SgoCas9D9A on the basis of the ancBE4max, wherein the ancBE4max is synthesized by a commercial company through the whole gene. To reduce PCR-introduced point mutations, a portion of ancBE4max, 32aa linker-SpCas 9D 10A-10aa linker-UGI, was excised by endonuclease BamHI, and then supplemented with the excised portion of ancBE4max, 32aa linker-SgoCas 9D 9A-10aa linker-UGI, when SgoCas9D9A was synthesized by commercial companies, the nucleotide sequence of the sequences being shown in SEQ ID NO.9, and the linker of the sequences containing the endonuclease BamHI site.
The plasmid AncBE4max (vector pCMV) was digested with restriction enzyme BamHI (R0136L) in a water bath at 37 ℃ for 2h, and the digestion system (50. Mu.L) was: 10xBuffer:5 μ L, vector: 5 μ g, bamHI enzyme: 3 μ L, ddH2O: adding to 50 μ L; identifying whether the enzyme digestion is complete through gel electrophoresis; after the completion of the digestion, the linearized vector was purified using clean up kit (AxyPrep PCR clean kit) using 15. Mu.L ddH2And (4) eluting by using oxygen. Carrying out PCR amplification on the synthesized 32aa linker-SgoCas 9D 9A-10aa linker-UGI, introducing protective bases outside enzyme cutting sites at two ends, and carrying out PCR reaction amplification on a carrier fragment by using a PCR primer synthesized by Jinzhi Biotechnology Limited, wherein the sequence of the Sgo PCR forward primer is as follows: agcggatcctggcagcgagacacca; the sequence of the Sgo PCR reverse primer is as follows: cctccgggatctccgctcagcatcttgatctta. And purified using clean up kit (AxyPrep PCR clean-up kit). And carrying out BamH1 enzyme digestion reaction on the purified PCR product, wherein the enzyme digestion system refers to the system.
The purified 32aa linker-SgoCas 9D 9A-10aa linker-UGI was enzymatically ligated to the BamH1 linearization vector pCMV _ AncBE4max to obtain the primary ligation product.
Ligation system (10 μ L): purification of the linearized vector pCMV _ AncBE4max: mu.L (50 ng), 32aa linker-SgoCas 9D 9A-10aa linker-UGI BamH1 cleavage product: 1 μ L (100 ng), T4 DNA Ligase Buffer: 1. Mu.L, T4 DNA Ligase:1 μ L, ddH2O:6 μ L. Enzyme-linked conditions: ligation was carried out at 16 ℃ for 2h.
After transformation of the ligation product, the cells were plated, and single colonies were picked, cultured with shaking, and sequenced for clone identification. This yielded the SgoCas9-ancBE4max construct encoding the fusion protein, whose complete amino acid sequence is shown in SEQ ID NO. 5 and whose DNA sequence is shown in SEQ ID NO. 6. A schematic of the fusion protein is shown in FIG. 2; from N-terminus to C-terminus it comprises the BPNLS polypeptide, the ancAPOBEC1 polypeptide, the 32aa linker, amino acids 2-1136 of the SgoCas9 D9A nickase, the 10aa linker, the 2×UGI polypeptide and the BPNLS polypeptide. The SgoCas9-ancBE4max plasmid map is shown in FIG. 3 and includes the fusion protein coding region and an ampicillin resistance sequence.
The bacterial culture of a clone identified as positive was expanded, and the plasmid was extracted following the kit protocol (TIANGEN: TIANPure Midi Plasmid Kit) and its concentration measured, to ensure a sufficient amount for transfection free of contaminants such as salts and proteins.
Example 2 construction of gRNA plasmid for the SgoCas9-ancBE4max System
This example provides a method for constructing the gRNA plasmid of the SgoCas9-ancBE4max system, comprising the following steps:
2.1 Vector construction of gRNA plasmid of SgoCas9-ancBE4max system
pGL3-U6-sgRNA (Addgene #51133) was used as the backbone to construct a gRNA expression vector for the SgoCas9 editing system. A scaffold sequence suited to the SgoCas9 gRNA system was designed from a tandem repeat sequence of Streptococcus gordonii, and the scaffold of pGL3-U6-sgRNA (Addgene #51133; suited to SpCas9) was replaced with the SgoCas9 gRNA scaffold. The complete plasmid is shown in SEQ ID NO. 7 and is named pGL3-U6-Sgo gRNA; a schematic of the plasmid is shown in FIG. 4. Two BsaI sites serve as the restriction sites for inserting the targeting gRNA sequence, and the plasmid was produced by whole-gene synthesis by a commercial company.
2.2 Construction of targeting gRNA plasmid of SgoCas9-ancBE4max system
gRNAs were designed, and two complementary oligos were synthesized for each. The upstream oligo is 5'-accg-24nt-3' and the downstream oligo is 5'-aaac-24nt-3' (its 24 nt are complementary to the 24 nt of the upstream oligo). The target site is 24nt-NNAAAG on the DNA strand that carries the PAM. The upstream and downstream oligos were annealed with the program 95 ℃, 5 min; 95 ℃ to 85 ℃ at -2 ℃/s; 85 ℃ to 25 ℃ at -0.1 ℃/s; hold at 4 ℃, and then ligated into the pGL3-U6-Sgo gRNA vector linearized with BsaI (NEB: R0539L).
The linearization digest was: pGL3-U6-Sgo gRNA, 2 μg; buffer (NEB: R0539L), 6 μL; BsaI, 2 μL; ddH2O to 60 μL. Digestion was carried out overnight at 37 ℃. The ligation system was: T4 ligation buffer (NEB: M0202L), 1 μL; linearized vector, 20 ng; annealed oligo fragment (10 μM), 5 μL; T4 DNA ligase (NEB: M0202L), 0.5 μL; ddH2O to 10 μL. Ligation was carried out overnight at 16 ℃. The ligated vector was transformed, and colonies were picked and identified. Positive clones were expanded, and the plasmid was extracted (Axygen: AP-MN-P-250G) and its concentration measured.
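The oligo design rule in section 2.2 can be sketched in Python as follows. This is an illustration under stated assumptions: the function names are hypothetical, and the 24 nt of the downstream oligo are taken to be the reverse complement of the target written 5' to 3', which is the usual reading of "complementary" for annealed oligo pairs.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of an uppercase A/C/G/T string."""
    return dna.translate(_COMPLEMENT)[::-1]


def design_oligo_pair(target, spacer_len=24):
    """Build the upstream and downstream oligos for one target.

    Upstream:   5'-accg + target-3'
    Downstream: 5'-aaac + reverse complement of target-3' (assumption)
    """
    target = target.upper()
    if len(target) != spacer_len or set(target) - set("ACGT"):
        raise ValueError(f"target must be {spacer_len} nt of A/C/G/T")
    upstream = "accg" + target
    downstream = "aaac" + reverse_complement(target)
    return upstream, downstream


# Hypothetical 24-nt target, not one of the gRNAs in Table 1.
up, down = design_oligo_pair("GTCGTTGCTTGTTCTTGTTGTTGT")
print(up)
print(down)
```

The lowercase prefixes mark the 4-nt overhangs that pair with the BsaI-cut vector ends; the uppercase part is the target-specific 24 nt.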
Example 3 Gene editing Effect test of Gene editing tool SgoCas9-ancBE4max
Human endogenous genes EMX1, FANCF, CDKN2A, CFTR, DNMT1, DYRK1A, RUNX1, VEGFA and the like are selected, 10 gRNAs are designed in total, and 20 Oligos are synthesized, wherein the sequences are shown in Table 1.
Table 1. The 20 oligos synthesized for EMX1, FANCF, CDKN2A, CFTR, DNMT1, DYRK1A, RUNX1 and VEGFA, respectively
[Table 1 is provided as images in the original publication.]
HEK293T cells were transfected with a base editing system consisting of the SgoCas9-ancBE4max plasmid constructed in Example 1 and the pGL3-U6-Sgo gRNA plasmids constructed in Example 2 (the 20 oligos in Table 1 were annealed and ligated into the linearized pGL3-U6-Sgo gRNA vector to give sgSgo-1, sgSgo-2, sgSgo-3, sgSgo-4, sgSgo-5, sgSgo-6, sgSgo-7, sgSgo-8, sgSgo-9 and sgSgo-10) as follows:
3.1 HEK293T cells (from ATCC) were thawed and cultured in 10 cm dishes (Corning, 430167) in DMEM (HyClone, SH30243.01) supplemented with 10% fetal bovine serum (HyClone, SV30087), at 37 ℃ and 5% carbon dioxide. After several passages, cells at 90% density were plated into 24-well plates (gerbil).
3.2 Three passages after thawing, the cell state was checked and cells in good condition were plated into a 24-well plate. After 18-24 h of culture, the cells were transfected at 80% confluence. The amount of each component per transfection was: SgoCas9-ancBE4max plasmid, 1 μg; pGL3-U6-Sgo gRNA plasmid, 0.5 μg; EZ Trans transfection reagent (Liji Biotech), 4.5 μL.
3.3 The transfection procedure (following the high-efficiency protocol for the EZ Trans transfection reagent from Liji Biotech) was:
3.3.1 Prepare reagent A: for each well of cells, dilute 1.5 μg of plasmid DNA (1 μg of SgoCas9-ancBE4max plasmid + 0.5 μg of pGL3-U6-Sgo gRNA plasmid) into 50 μL of serum-free, antibiotic-free high-glucose DMEM medium (or OPTI-MEM medium) and mix well.
3.3.2 Prepare reagent B: for each well of cells, 4.5 μL of EZ Trans transfection reagent (EZ Trans : plasmid DNA = 2. (Do not dilute the plasmid or the EZ Trans transfection reagent in serum-containing medium: serum contains large amounts of negatively charged proteins that may interfere with adsorption of nucleic acid by the transfection reagent and thereby reduce transfection efficiency.)
3.3.3 Let reagent A and reagent B stand for 5 min at the same time, then promptly add reagent B to reagent A and mix gently. (The order of mixing must not be reversed.)
3.3.4 Let the mixture stand at room temperature for 15 min to form EZ Trans-DNA complexes. Add the prepared EZ Trans-DNA transfection complexes dropwise and evenly to the culture dish containing the cells, and rock or shake the dish gently to disperse the complexes evenly.
3.3.5 Culture in an incubator at 37 ℃ and 5% CO2 for 4-6 h, remove the culture medium containing the EZ Trans-DNA complexes, replace it with fresh medium, and culture for 3 days.
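When several wells are transfected at once, the per-well amounts given above can be scaled into a master mix. The sketch below is only an illustration: the dictionary keys, the function name `scale_transfection` and the optional pipetting excess are assumptions, and only the per-well quantities stated in the text are used.

```python
# Per-well amounts as stated for a 24-well plate.
PER_WELL = {
    "editor_plasmid_ug": 1.0,       # SgoCas9-ancBE4max plasmid
    "grna_plasmid_ug": 0.5,         # pGL3-U6-Sgo gRNA plasmid
    "ez_trans_ul": 4.5,             # EZ Trans transfection reagent
    "reagent_a_medium_ul": 50.0,    # serum-free medium for reagent A
}


def scale_transfection(n_wells, excess=0.0):
    """Scale per-well amounts to n_wells.

    excess is an optional fractional overage for pipetting loss
    (an assumption, not part of the described protocol).
    """
    if n_wells < 1:
        raise ValueError("n_wells must be at least 1")
    factor = n_wells * (1.0 + excess)
    return {name: round(amount * factor, 2) for name, amount in PER_WELL.items()}


for name, amount in scale_transfection(4).items():
    print(f"{name}: {amount}")
```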
3.4 After 3 days of culture, the transfected cells were digested with trypsin and GFP-positive cells (top 15% by FITC fluorescence intensity) were collected by flow sorting; genomic DNA was extracted from the collected cells by the phenol-chloroform method.
3.5 PCR primers were designed and synthesized 100-130 bp upstream and downstream of each selected endogenous target site and diluted with water to 10 μM. Each genomic target-site fragment was amplified by PCR with the Vazyme high-fidelity enzyme kit (Vazyme, p501-d2). The PCR products were gel-purified with the AxyPrep DNA gel recovery kit (Axygen, AP-GX-250G) to remove non-specific bands. The PCR primer sequences are shown in Table 2.
Table 2. PCR primer sequences
Figure BDA0003061763050000101
Figure BDA0003061763050000111
3.6 Gel electrophoresis was used to check preliminarily whether the target fragment had been amplified. Successfully amplified fragments were Sanger-sequenced, and the sequencing results were analyzed for the expected base point mutation (C-to-T or G-to-A) at the target site. The sequencing results are shown in FIG. 5. In the left panel, the first row is a schematic of the target DNA sequence and the second row is the result of the targeted gene-editing experiment, with arrows indicating the C-to-T editing positions; the right panel shows the statistics of C-to-T editing efficiency at different positions within the gRNA range. The figure shows the editing results for 4 editing sites in total. As can be seen from FIG. 5, the gene editing tool SgoCas9-ancBE4max obtained by the invention produces efficient C-to-T conversion.
Example 4
4.1 HEK293T cells were transfected with the base editing system composed of the SgoCas9-ancBE4max plasmid and the pGL3-U6-Sgo gRNA plasmid constructed in Example 1 and Example 2, and the editing efficiency of SgoCas9-ancBE4max was measured at a total of 10 human genomic sites. The editing-efficiency statistics are shown in FIG. 6; the results show that the SgoCas9-ancBE4max base editing system achieves C-to-T conversion to varying degrees at positions 1-24 from the 5' end.
4.2 The editing efficiency within each gRNA range was normalized: the position with the highest editing efficiency was set to 1, and the editing efficiency of every other cytosine was expressed relative to it. All positions within the gRNA range were then pooled to determine the high-efficiency editing window of the SgoCas9-ancBE4max base editing system within the gRNA range, as shown in FIG. 7. The results show that the editing window of the SgoCas9-ancBE4max base editing system obtained in this example spans positions 8-14 from the 5' end of the gRNA. This demonstrates that novel base editing tools and systems can be constructed by searching for Cas9 homologous proteins, and it provides an efficient cytosine base editor for endogenous gene editing with a different targeting range (PAM: NNAAAG) and editing window.
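The normalization described in 4.2 can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the per-site efficiencies are hypothetical and are not data from FIG. 6 or FIG. 7, and averaging the normalized values per position is one plausible way to pool the sites, which the text does not specify.

```python
from collections import defaultdict


def normalize_site(efficiencies):
    """Divide each position's C-to-T efficiency by the site's maximum,
    so the most efficiently edited position of the site becomes 1."""
    peak = max(efficiencies.values())
    if peak <= 0:
        raise ValueError("site has no detectable editing")
    return {pos: eff / peak for pos, eff in efficiencies.items()}


def mean_normalized_by_position(sites):
    """Pool all sites and average the normalized efficiency per position.

    Averaging is an assumption made for this sketch.
    """
    pooled = defaultdict(list)
    for efficiencies in sites.values():
        for pos, value in normalize_site(efficiencies).items():
            pooled[pos].append(value)
    return {pos: sum(vals) / len(vals) for pos, vals in sorted(pooled.items())}


# Hypothetical input: position (counted from the 5' end of the gRNA)
# mapped to percent C-to-T conversion at that cytosine.
example_sites = {
    "site_A": {5: 4.0, 9: 40.0, 12: 20.0},
    "site_B": {9: 30.0, 12: 60.0, 16: 6.0},
}

if __name__ == "__main__":
    for pos, value in mean_normalized_by_position(example_sites).items():
        print(pos, round(value, 2))
```

Positions whose pooled value stays close to 1 across sites would then outline the high-efficiency window.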
Example 5
Human colon carcinoma HCT116 cells and mouse neuroblastoma N2A cells were transfected with the base editing system consisting of the SgoCas9-ancBE4max plasmid and the pGL3-U6-Sgo gRNA plasmid constructed in Examples 1 and 2, as follows:
N2A cells (from ATCC) were thawed and cultured in 10 cm dishes (Corning, 430167) in DMEM (HyClone, SH30243.01) supplemented with 10% fetal bovine serum (HyClone, SV30087). The culture temperature was 37 ℃ and the carbon dioxide concentration was 5%. After several passages, cells at 90% density were plated into 24-well plates.
HCT116 cells (from ATCC) were thawed and cultured in 10 cm dishes (Corning, 430167) in RPMI 1640 medium (Gibco, 11875093) supplemented with 10% fetal bovine serum (HyClone, SV30087). The culture temperature was 37 ℃ and the carbon dioxide concentration was 5%. After several passages, cells at 90% density were plated into 24-well plates. The cell transfection, cell sorting, and editing-efficiency analysis procedures were the same as in Example 3 and Example 4 above.
The editing efficiency of the SgoCas9-ancBE4max system in HCT116 cells is shown in FIG. 8; the targeting sequences were Sgo-1/-2/-3/-6/-8 (see attached Table 1). The editing efficiency of the SgoCas9-ancBE4max system in N2A cells is shown in FIG. 9; the targeting sequence was ggaaactcgatcgcattcattgcatg. In FIG. 8, the abscissa gives the base position with the highest editing efficiency at each of the 5 human genomic loci, and the ordinate gives the C-to-T conversion efficiency. In FIG. 9, arrows indicate the edited positions in N2A cells, and the sequencing peak plot shows efficient C-to-T conversion. As can be seen from FIG. 8 and FIG. 9, the SgoCas9-ancBE4max base editing system also produces efficient C-to-T conversion in HCT116 and N2A cells.
In conclusion, the invention effectively overcomes the limited application range of prior-art base editing tools, including PAM and editing-window limitations. The cytosine base editor of the invention is also smaller than the classic SpCas9-mediated base editor (1710 amino acids in total), so it may be suitable for packaging and delivery with lentivirus or adenovirus and has high industrial utility.
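The size comparison can be checked against the lengths declared in the `<211>` fields of the sequence listing. The sketch below is a consistency check only. It makes two assumptions: the two linker lengths are not stated anywhere and were counted by hand from SEQ ID NO: 5, and the 1710-amino-acid figure is read as the size of the SpCas9-mediated editor.

```python
# Component lengths in amino acids. SEQ ID values come from the <211>
# fields of the sequence listing; linker lengths are hand-counted from
# SEQ ID NO: 5 and are therefore an assumption.
COMPONENTS_AA = [
    ("BPNLS-ancAPOBEC1 (SEQ ID NO: 3)", 235),
    ("N-terminal linker (hand-counted, assumption)", 32),
    ("SgoCas9 D9A nickase (SEQ ID NO: 1)", 1135),
    ("C-terminal linker (hand-counted, assumption)", 10),
    ("2*UGI-BPNLS (SEQ ID NO: 4)", 197),
]

FULL_EDITOR_AA = 1609    # SEQ ID NO: 5, <211>
FULL_EDITOR_NT = 4827    # SEQ ID NO: 6, <211>
NICKASE_NT = 3405        # SEQ ID NO: 2, <211>
SPCAS9_EDITOR_AA = 1710  # figure quoted in the summary paragraph


def total_length(components):
    """Sum the amino-acid lengths of the fused components."""
    return sum(length for _, length in components)


def coding_length(n_amino_acids):
    """Nucleotides needed to encode n amino acids, with no stop codon
    (the listed DNA sequences do not include one)."""
    return 3 * n_amino_acids


if __name__ == "__main__":
    print("sum of components:", total_length(COMPONENTS_AA))
    print("matches SEQ ID NO: 5:", total_length(COMPONENTS_AA) == FULL_EDITOR_AA)
    print("matches SEQ ID NO: 6:", coding_length(FULL_EDITOR_AA) == FULL_EDITOR_NT)
    print("matches SEQ ID NO: 2:", coding_length(1135) == NICKASE_NT)
    print("amino acids saved:", SPCAS9_EDITOR_AA - FULL_EDITOR_AA)
```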
Finally, it should be noted that the above embodiments are only used for illustrating the technical solutions of the present invention and not for limiting the protection scope of the present invention, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions can be made to the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.
SEQUENCE LISTING
<110> Guangzhou University
<120> cytosine single-base editor tool and application thereof
<130> PZS215311-
<160> 9
<170> PatentIn version 3.5
<210> 1
<211> 1135
<212> PRT
<213> SgoCas9 D9A nickase
<400> 1
Asn Gly Leu Val Leu Gly Leu Ala Ile Gly Ile Ala Ser Val Gly Val
1 5 10 15
Gly Ile Leu Glu Lys Asp Thr Gly Lys Ile Ile His Ala Ser Ser Arg
20 25 30
Leu Phe Pro Ala Ala Thr Ala Asp Asn Asn Val Glu Arg Arg Ser Asn
35 40 45
Arg Gln Gly Arg Arg Leu Asn Arg Arg Lys Lys His Arg Ser Val Arg
50 55 60
Leu Gln Asp Leu Phe Glu Gly Tyr Gly Leu Leu Thr Asp Phe Ser Lys
65 70 75 80
Val Ser Met Asn Leu Asn Pro Tyr Gln Leu Arg Val Gln Gly Met Glu
85 90 95
Asn Gln Leu Thr Asn Glu Glu Leu Phe Val Ala Leu Lys Asn Ile Val
100 105 110
Lys Arg Arg Gly Ile Ser Tyr Leu Asp Asp Ala Ser Glu Asp Gly Gly
115 120 125
Thr Val Ser Ser Asp Tyr Gly Lys Ala Val Glu Glu Asn Arg Lys Leu
130 135 140
Leu Ala Glu Lys Thr Pro Gly Gln Ile Gln Leu Glu Arg Phe Glu Lys
145 150 155 160
Tyr Gly Gln Leu Arg Gly Asp Phe Thr Val Glu Glu Asn Gly Glu Lys
165 170 175
His Arg Leu Ile Asn Val Phe Ser Thr Ser Ala Tyr Arg Lys Glu Ala
180 185 190
Glu Arg Ile Leu Arg Lys Gln Gln Glu Phe Asn Ser Lys Ile Thr Asp
195 200 205
Glu Phe Ile Glu Asp Tyr Leu Ile Ile Leu Thr Gly Lys Arg Lys Tyr
210 215 220
Tyr His Gly Pro Gly Asn Glu Lys Ser Arg Thr Asp Tyr Gly Arg Phe
225 230 235 240
Arg Thr Asp Gly Thr Thr Leu Asp Asn Ile Phe Gly Ile Leu Ile Gly
245 250 255
Lys Cys Thr Phe Tyr Thr Glu Glu Tyr Arg Ala Ser Lys Ala Ser Tyr
260 265 270
Thr Ala Gln Glu Phe Asn Leu Leu Asn Asp Leu Asn Asn Leu Thr Val
275 280 285
Pro Thr Glu Thr Lys Lys Leu Ser Glu Glu Gln Lys Lys Leu Ile Ile
290 295 300
Glu Tyr Ala Lys Ser Ala Lys Thr Leu Gly Ala Ser Thr Leu Leu Lys
305 310 315 320
Tyr Ile Ala Lys Met Ile Asp Ala Ser Val Asp Gln Ile Arg Gly Tyr
325 330 335
Arg Val Asp Val Asn Asn Lys Pro Glu Met His Thr Phe Glu Val Tyr
340 345 350
Arg Lys Met Gln Ser Leu Glu Thr Ile Lys Val Glu Glu Leu Pro Arg
355 360 365
Lys Val Leu Asp Glu Leu Ala His Ile Leu Thr Leu Asn Thr Glu Arg
370 375 380
Glu Gly Ile Glu Glu Ala Ile Asn Ser Lys Leu Lys Asp Ile Phe Asn
385 390 395 400
Arg Asp Gln Val Leu Glu Leu Val Gln Phe Arg Lys Asn Asn Ser Ser
405 410 415
Leu Phe Ser Lys Gly Trp His Asn Phe Ser Ile Lys Leu Met Met Glu
420 425 430
Leu Ile Pro Glu Leu Tyr Glu Thr Ser Glu Glu Gln Met Thr Ile Leu
435 440 445
Thr Arg Leu Gly Lys Gln Arg Ser Lys Glu Thr Ser Lys Arg Thr Lys
450 455 460
Tyr Ile Asp Glu Lys Glu Leu Thr Glu Glu Ile Tyr Asn Pro Val Val
465 470 475 480
Ala Lys Ser Val Arg Gln Ala Ile Lys Ile Ile Asn Glu Ala Thr Lys
485 490 495
Lys Tyr Gly Ile Phe Asp Asn Ile Val Ile Glu Met Ala Arg Glu Asn
500 505 510
Asn Glu Glu Asp Ala Lys Lys Asp Tyr Ile Lys Arg Gln Lys Ala Asn
515 520 525
Gln Asp Glu Lys Asn Ala Ala Met Glu Lys Ala Ala Phe Gln Tyr Asn
530 535 540
Gly Lys Lys Glu Leu Pro Asp Asn Ile Phe His Gly His Lys Glu Leu
545 550 555 560
Thr Thr Lys Ile Arg Leu Trp His Gln Gln Gly Glu Lys Cys Leu Tyr
565 570 575
Thr Gly Lys Asn Ile Pro Ile Ser Asp Leu Ile His Asn Gln Tyr Lys
580 585 590
Tyr Glu Ile Asp His Ile Leu Pro Leu Ser Leu Ser Phe Asp Asp Ser
595 600 605
Leu Ser Asn Lys Val Leu Val Leu Ala Thr Ala Asn Gln Glu Lys Gly
610 615 620
Gln Arg Thr Pro Phe Gln Ala Leu Asp Ser Met Asp Asp Ala Trp Ser
625 630 635 640
Tyr Arg Glu Phe Lys Ser Tyr Val Lys Asp Ser Lys Leu Leu Ser Asn
645 650 655
Lys Lys Lys Asp Tyr Leu Leu Thr Glu Glu Asp Ile Ser Lys Ile Glu
660 665 670
Val Lys Gln Lys Phe Ile Glu Arg Asn Leu Val Asp Thr Arg Tyr Ser
675 680 685
Ser Arg Val Val Leu Asn Ala Leu Gln Asp Phe Tyr Lys Ser His Gln
690 695 700
Leu Asp Thr Thr Ile Ser Val Val Arg Gly Gln Phe Thr Ser Gln Leu
705 710 715 720
Arg Arg Lys Trp Gly Ile Glu Lys Ser Arg Glu Thr Tyr His His His
725 730 735
Ala Val Asp Ala Leu Ile Ile Ala Ala Ser Ser Gln Leu Arg Leu Trp
740 745 750
Lys Lys His Ser Asn Pro Leu Ile Ala Tyr Lys Glu Gly Gln Phe Val
755 760 765
Asp Ser Glu Thr Gly Glu Ile Val Ser Leu Ser Asp Glu Glu Tyr Lys
770 775 780
Glu Leu Val Phe Lys Ala Pro Tyr Asp His Phe Val Asp Thr Leu Arg
785 790 795 800
Ser Lys Lys Phe Glu Asp Ser Ile Leu Phe Ser Tyr Gln Val Asp Ser
805 810 815
Lys Tyr Asn Arg Lys Ile Ser Asp Ala Thr Ile Tyr Ala Thr Arg Lys
820 825 830
Ala Lys Leu Asp Lys Glu Lys Lys Glu Tyr Thr Tyr Thr Leu Gly Lys
835 840 845
Ile Lys Asp Ile Tyr Ala Leu Gly Thr Lys Thr Pro Ser Lys Thr Gly
850 855 860
Phe Tyr Lys Phe Leu Asp Leu Tyr Lys Thr Asp Lys Ser Gln Phe Leu
865 870 875 880
Met Tyr Gln Lys Asp Arg Lys Thr Trp Asp Glu Val Ile Glu Lys Ile
885 890 895
Ile Glu Gln Tyr Arg Pro Phe Lys Glu Tyr Asp Lys Asn Gly Lys Glu
900 905 910
Val Asp Phe Asn Pro Phe Glu Lys Tyr Arg Ile Gly Asn Gly Pro Ile
915 920 925
Arg Lys Tyr Ser Lys Lys Gly Asn Gly Pro Glu Ile Lys Ser Leu Lys
930 935 940
Tyr Tyr Asp Ile Leu Leu Gly Lys His Lys Asn Ile Thr Pro Asp Gly
945 950 955 960
Ser Arg Asn Thr Val Ala Leu Leu Ser Leu Asn Pro Trp Arg Thr Asp
965 970 975
Val Tyr Tyr Asn Ser Glu Thr Lys Lys Tyr Glu Phe Leu Gly Leu Lys
980 985 990
Tyr Ala Asp Leu Cys Phe Glu Glu Gly Gly Ala Tyr Gly Ile Ser Glu
995 1000 1005
Val Lys Tyr Lys Lys Ile Arg Glu Lys Glu Gly Ile Gly Lys Asn
1010 1015 1020
Ser Glu Phe Lys Phe Thr Leu Tyr Lys Asn Asp Leu Ile Leu Ile
1025 1030 1035
Lys Asp Thr Glu Thr Asn Cys Gln Gln Phe Phe Arg Phe Trp Ser
1040 1045 1050
Arg Thr Gly Lys Asp Asn Pro Lys Ser Phe Glu Lys His Lys Ile
1055 1060 1065
Glu Leu Lys Pro Tyr Glu Lys Ala Lys Phe Glu Lys Gly Glu Glu
1070 1075 1080
Leu Lys Val Leu Gly Lys Val Pro Pro Ser Ser Asn Gln Phe Gln
1085 1090 1095
Lys Asn Met Gln Ile Glu Asn Leu Ser Ile Tyr Lys Val Lys Thr
1100 1105 1110
Asp Ile Leu Gly Asn Lys His Phe Ile Lys Lys Glu Gly Asp Glu
1115 1120 1125
Pro Lys Leu Lys Phe Lys Lys
1130 1135
<210> 2
<211> 3405
<212> DNA
<213> SgoCas9 D9A nickase
<400> 2
aacggcctgg tgctgggcct ggccatcggc atcgcctctg tgggcgtggg catcctggag 60
aaagacactg gcaagatcat tcacgcttcg agcagactgt tcccagccgc cacagccgac 120
aacaatgtgg agagacggag caatagacag ggcagacggc taaaccggcg gaaaaagcac 180
agatccgtgc ggctgcagga cctgtttgaa ggatacggcc tgctgacaga cttcagcaag 240
gtgtccatga acctgaatcc ctaccagctg cgggtgcagg gaatggaaaa ccagctgacc 300
aacgaggagc tgttcgtggc cctgaagaat atcgtgaaga gaagaggcat cagctacctg 360
gacgatgcca gcgaggacgg cggcaccgtg agtagcgact acggcaaggc tgtggaagaa 420
aacagaaaac tgctggcgga aaagacgccc ggccaaatcc agctggaacg cttcgagaag 480
tatggccagc tgagaggcga cttcaccgtg gaagaaaatg gcgagaagca tagactgatc 540
aacgtgttca gcaccagcgc ctacagaaag gaagctgaac ggatcctgcg gaagcagcag 600
gagttcaaca gcaagatcac agacgagttt attgaggact acctgatcat cctgacagga 660
aaacggaagt actaccacgg acctggcaac gagaagagca gaaccgacta cggcagattc 720
agaaccgacg gcaccaccct ggacaacatc ttcggcatcc tgattggaaa gtgtacattc 780
tacaccgaag agtatcgggc ctctaaggcc agctacacag cccaggagtt caacctgctc 840
aacgatctga acaacctgac cgtgcctacc gagacaaaga aactgagcga ggagcagaag 900
aagctgatca tcgagtacgc caaatctgcc aagaccctcg gcgccagcac cctgctgaaa 960
tatatcgcca aaatgatcga cgccagcgtc gaccagatca gaggctaccg ggtggacgtg 1020
aacaacaagc ccgagatgca caccttcgag gtctaccgaa agatgcagag cctggaaaca 1080
atcaaggtgg aagaactgcc tagaaaggtc ctggatgaac tggcccacat cctcaccctg 1140
aataccgaga gagagggcat cgaggaggcc atcaacagca agctgaagga catcttcaac 1200
cgcgaccagg tgctggagct ggtgcagttc agaaagaaca acagcagtct gttctccaag 1260
ggatggcaca acttcagcat caagctgatg atggaactga tcccagagct gtatgaaaca 1320
tccgaagaac agatgaccat cctgacaaga ctgggcaaac agcgttctaa ggagacctct 1380
aagcggacca aatacatcga tgagaaagaa ctgaccgagg agatctataa ccccgtggtg 1440
gccaaaagcg tccggcaggc catcaagatc atcaacgagg ccactaagaa gtacggcatt 1500
ttcgacaaca tcgtgatcga gatggccaga gaaaacaacg aagaagatgc caagaaagat 1560
tatattaaaa ggcaaaaagc taatcaagat gaaaagaacg ccgccatgga aaaggctgca 1620
ttccagtaca atggcaagaa ggaactgcct gataatatct ttcacggcca caaggagctg 1680
acaacaaaaa ttcggctgtg gcaccagcag ggagaaaagt gcctgtacac cggaaagaat 1740
atccctatct ctgatcttat tcacaaccag tacaagtacg agatcgacca catcctgccc 1800
ctgtccctga gctttgacga ctctctgagc aacaaggttc tggttctggc caccgccaac 1860
caggagaagg gccaaagaac tcctttccag gccctggaca gcatggacga cgcctggagc 1920
tacagagagt tcaagagcta cgtgaaagac tctaaactgc tgtctaacaa gaagaaagac 1980
tacctgttga cagaggagga tatctccaag atcgaggtca agcagaaatt catcgagaga 2040
aatctggtgg ataccagata cagctccaga gtggttctga atgcccttca agacttctac 2100
aagagccacc agctggacac caccatctca gtggtgcggg gccagtttac cagccagctg 2160
cggagaaagt ggggcatcga gaaaagcagg gaaacctacc accaccatgc cgtagacgct 2220
cttatcattg ctgcctctag ccagctgcgg ctgtggaaga agcacagcaa ccctctgatc 2280
gcctataagg agggccagtt tgtggacagc gagacaggcg agatcgtgtc tctgtccgac 2340
gaagaataca aggaactggt gtttaaggcc ccttacgatc actttgtgga taccctgaga 2400
agcaagaaat tcgaagatag catcctgttt agctatcaag tggattctaa gtacaacaga 2460
aagatctccg atgcaacaat ctacgcgacc aggaaggcta agctggataa ggaaaagaag 2520
gagtacacat acaccctcgg aaagatcaaa gatatctacg ccctgggcac aaagacccct 2580
tccaagaccg gattctacaa gttcctggac ctgtacaaga ccgataagag ccagttcctg 2640
atgtaccaaa aggatagaaa gacctgggac gaggtgatcg agaaaatcat cgagcagtac 2700
cggcctttta aggagtacga caagaacggc aaagaggtgg atttcaaccc cttcgagaag 2760
tacagaatcg gcaatggccc catccggaaa tacagcaaga agggcaacgg acctgagatc 2820
aagagtctga aatattacga catcctgctg ggcaaacaca agaacatcac tcctgacgga 2880
tctagaaaca ccgtggccct gctgagcctg aacccttgga gaacagacgt gtactacaac 2940
agcgaaacaa agaagtacga gttcctggga ctcaagtacg ccgacctgtg cttcgaagag 3000
ggcggagcct acggcatcag cgaggtgaag tacaagaaga tcagagaaaa ggagggcatc 3060
ggcaagaata gcgagttcaa gttcaccctg tacaagaacg acctgattct gatcaaggac 3120
accgaaacca actgccagca gttcttcaga ttctggagca gaaccggtaa ggacaaccct 3180
aaatctttcg aaaagcataa gatcgagctg aagccttacg agaaagccaa gttcgagaaa 3240
ggcgaggagc taaaagtgct gggcaaggtg ccaccttctt ccaaccagtt tcagaagaac 3300
atgcaaatcg agaacttgag catctacaag gtcaagacag acatcctggg taacaaacac 3360
tttatcaaaa aggagggaga tgaacccaag ctcaagttca agaag 3405
<210> 3
<211> 235
<212> PRT
<213> BPNLS-ancAPOBEC1
<400> 3
Pro Lys Lys Lys Arg Lys Val Ser Ser Glu Thr Gly Pro Val Ala Val
1 5 10 15
Asp Pro Thr Leu Arg Arg Arg Ile Glu Pro His Glu Phe Glu Val Phe
20 25 30
Phe Asp Pro Arg Glu Leu Arg Lys Glu Thr Cys Leu Leu Tyr Glu Ile
35 40 45
Lys Trp Gly Thr Ser His Lys Ile Trp Arg His Ser Ser Lys Asn Thr
50 55 60
Thr Lys His Val Glu Val Asn Phe Ile Glu Lys Phe Thr Ser Glu Arg
65 70 75 80
His Phe Cys Pro Ser Thr Ser Cys Ser Ile Thr Trp Phe Leu Ser Trp
85 90 95
Ser Pro Cys Gly Glu Cys Ser Lys Ala Ile Thr Glu Phe Leu Ser Gln
100 105 110
His Pro Asn Val Thr Leu Val Ile Tyr Val Ala Arg Leu Tyr His His
115 120 125
Met Asp Gln Gln Asn Arg Gln Gly Leu Arg Asp Leu Val Asn Ser Gly
130 135 140
Val Thr Ile Gln Ile Met Thr Ala Pro Glu Tyr Asp Tyr Cys Trp Arg
145 150 155 160
Asn Phe Val Asn Tyr Pro Pro Gly Lys Glu Ala His Trp Pro Arg Tyr
165 170 175
Pro Pro Leu Trp Met Lys Leu Tyr Ala Leu Glu Leu His Ala Gly Ile
180 185 190
Leu Gly Leu Pro Pro Cys Leu Asn Ile Leu Arg Arg Lys Gln Pro Gln
195 200 205
Leu Thr Phe Phe Thr Ile Ala Leu Gln Ser Cys His Tyr Gln Arg Leu
210 215 220
Pro Pro His Ile Leu Trp Ala Thr Gly Leu Lys
225 230 235
<210> 4
<211> 197
<212> PRT
<213> 2*UGI-BPNLS
<400> 4
Thr Asn Leu Ser Asp Ile Ile Glu Lys Glu Thr Gly Lys Gln Leu Val
1 5 10 15
Ile Gln Glu Ser Ile Leu Met Leu Pro Glu Glu Val Glu Glu Val Ile
20 25 30
Gly Asn Lys Pro Glu Ser Asp Ile Leu Val His Thr Ala Tyr Asp Glu
35 40 45
Ser Thr Asp Glu Asn Val Met Leu Leu Thr Ser Asp Ala Pro Glu Tyr
50 55 60
Lys Pro Trp Ala Leu Val Ile Gln Asp Ser Asn Gly Glu Asn Lys Ile
65 70 75 80
Lys Met Leu Ser Gly Gly Ser Gly Gly Ser Gly Gly Ser Thr Asn Leu
85 90 95
Ser Asp Ile Ile Glu Lys Glu Thr Gly Lys Gln Leu Val Ile Gln Glu
100 105 110
Ser Ile Leu Met Leu Pro Glu Glu Val Glu Glu Val Ile Gly Asn Lys
115 120 125
Pro Glu Ser Asp Ile Leu Val His Thr Ala Tyr Asp Glu Ser Thr Asp
130 135 140
Glu Asn Val Met Leu Leu Thr Ser Asp Ala Pro Glu Tyr Lys Pro Trp
145 150 155 160
Ala Leu Val Ile Gln Asp Ser Asn Gly Glu Asn Lys Ile Lys Met Leu
165 170 175
Ser Gly Gly Ser Lys Arg Thr Ala Asp Gly Ser Glu Phe Glu Pro Lys
180 185 190
Lys Lys Arg Lys Val
195
<210> 5
<211> 1609
<212> PRT
<213> SgoCas9-ancBE4max
<400> 5
Pro Lys Lys Lys Arg Lys Val Ser Ser Glu Thr Gly Pro Val Ala Val
1 5 10 15
Asp Pro Thr Leu Arg Arg Arg Ile Glu Pro His Glu Phe Glu Val Phe
20 25 30
Phe Asp Pro Arg Glu Leu Arg Lys Glu Thr Cys Leu Leu Tyr Glu Ile
35 40 45
Lys Trp Gly Thr Ser His Lys Ile Trp Arg His Ser Ser Lys Asn Thr
50 55 60
Thr Lys His Val Glu Val Asn Phe Ile Glu Lys Phe Thr Ser Glu Arg
65 70 75 80
His Phe Cys Pro Ser Thr Ser Cys Ser Ile Thr Trp Phe Leu Ser Trp
85 90 95
Ser Pro Cys Gly Glu Cys Ser Lys Ala Ile Thr Glu Phe Leu Ser Gln
100 105 110
His Pro Asn Val Thr Leu Val Ile Tyr Val Ala Arg Leu Tyr His His
115 120 125
Met Asp Gln Gln Asn Arg Gln Gly Leu Arg Asp Leu Val Asn Ser Gly
130 135 140
Val Thr Ile Gln Ile Met Thr Ala Pro Glu Tyr Asp Tyr Cys Trp Arg
145 150 155 160
Asn Phe Val Asn Tyr Pro Pro Gly Lys Glu Ala His Trp Pro Arg Tyr
165 170 175
Pro Pro Leu Trp Met Lys Leu Tyr Ala Leu Glu Leu His Ala Gly Ile
180 185 190
Leu Gly Leu Pro Pro Cys Leu Asn Ile Leu Arg Arg Lys Gln Pro Gln
195 200 205
Leu Thr Phe Phe Thr Ile Ala Leu Gln Ser Cys His Tyr Gln Arg Leu
210 215 220
Pro Pro His Ile Leu Trp Ala Thr Gly Leu Lys Ser Gly Gly Ser Ser
225 230 235 240
Gly Gly Ser Ser Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr
245 250 255
Pro Glu Ser Ser Gly Gly Ser Ser Gly Gly Ser Asn Gly Leu Val Leu
260 265 270
Gly Leu Ala Ile Gly Ile Ala Ser Val Gly Val Gly Ile Leu Glu Lys
275 280 285
Asp Thr Gly Lys Ile Ile His Ala Ser Ser Arg Leu Phe Pro Ala Ala
290 295 300
Thr Ala Asp Asn Asn Val Glu Arg Arg Ser Asn Arg Gln Gly Arg Arg
305 310 315 320
Leu Asn Arg Arg Lys Lys His Arg Ser Val Arg Leu Gln Asp Leu Phe
325 330 335
Glu Gly Tyr Gly Leu Leu Thr Asp Phe Ser Lys Val Ser Met Asn Leu
340 345 350
Asn Pro Tyr Gln Leu Arg Val Gln Gly Met Glu Asn Gln Leu Thr Asn
355 360 365
Glu Glu Leu Phe Val Ala Leu Lys Asn Ile Val Lys Arg Arg Gly Ile
370 375 380
Ser Tyr Leu Asp Asp Ala Ser Glu Asp Gly Gly Thr Val Ser Ser Asp
385 390 395 400
Tyr Gly Lys Ala Val Glu Glu Asn Arg Lys Leu Leu Ala Glu Lys Thr
405 410 415
Pro Gly Gln Ile Gln Leu Glu Arg Phe Glu Lys Tyr Gly Gln Leu Arg
420 425 430
Gly Asp Phe Thr Val Glu Glu Asn Gly Glu Lys His Arg Leu Ile Asn
435 440 445
Val Phe Ser Thr Ser Ala Tyr Arg Lys Glu Ala Glu Arg Ile Leu Arg
450 455 460
Lys Gln Gln Glu Phe Asn Ser Lys Ile Thr Asp Glu Phe Ile Glu Asp
465 470 475 480
Tyr Leu Ile Ile Leu Thr Gly Lys Arg Lys Tyr Tyr His Gly Pro Gly
485 490 495
Asn Glu Lys Ser Arg Thr Asp Tyr Gly Arg Phe Arg Thr Asp Gly Thr
500 505 510
Thr Leu Asp Asn Ile Phe Gly Ile Leu Ile Gly Lys Cys Thr Phe Tyr
515 520 525
Thr Glu Glu Tyr Arg Ala Ser Lys Ala Ser Tyr Thr Ala Gln Glu Phe
530 535 540
Asn Leu Leu Asn Asp Leu Asn Asn Leu Thr Val Pro Thr Glu Thr Lys
545 550 555 560
Lys Leu Ser Glu Glu Gln Lys Lys Leu Ile Ile Glu Tyr Ala Lys Ser
565 570 575
Ala Lys Thr Leu Gly Ala Ser Thr Leu Leu Lys Tyr Ile Ala Lys Met
580 585 590
Ile Asp Ala Ser Val Asp Gln Ile Arg Gly Tyr Arg Val Asp Val Asn
595 600 605
Asn Lys Pro Glu Met His Thr Phe Glu Val Tyr Arg Lys Met Gln Ser
610 615 620
Leu Glu Thr Ile Lys Val Glu Glu Leu Pro Arg Lys Val Leu Asp Glu
625 630 635 640
Leu Ala His Ile Leu Thr Leu Asn Thr Glu Arg Glu Gly Ile Glu Glu
645 650 655
Ala Ile Asn Ser Lys Leu Lys Asp Ile Phe Asn Arg Asp Gln Val Leu
660 665 670
Glu Leu Val Gln Phe Arg Lys Asn Asn Ser Ser Leu Phe Ser Lys Gly
675 680 685
Trp His Asn Phe Ser Ile Lys Leu Met Met Glu Leu Ile Pro Glu Leu
690 695 700
Tyr Glu Thr Ser Glu Glu Gln Met Thr Ile Leu Thr Arg Leu Gly Lys
705 710 715 720
Gln Arg Ser Lys Glu Thr Ser Lys Arg Thr Lys Tyr Ile Asp Glu Lys
725 730 735
Glu Leu Thr Glu Glu Ile Tyr Asn Pro Val Val Ala Lys Ser Val Arg
740 745 750
Gln Ala Ile Lys Ile Ile Asn Glu Ala Thr Lys Lys Tyr Gly Ile Phe
755 760 765
Asp Asn Ile Val Ile Glu Met Ala Arg Glu Asn Asn Glu Glu Asp Ala
770 775 780
Lys Lys Asp Tyr Ile Lys Arg Gln Lys Ala Asn Gln Asp Glu Lys Asn
785 790 795 800
Ala Ala Met Glu Lys Ala Ala Phe Gln Tyr Asn Gly Lys Lys Glu Leu
805 810 815
Pro Asp Asn Ile Phe His Gly His Lys Glu Leu Thr Thr Lys Ile Arg
820 825 830
Leu Trp His Gln Gln Gly Glu Lys Cys Leu Tyr Thr Gly Lys Asn Ile
835 840 845
Pro Ile Ser Asp Leu Ile His Asn Gln Tyr Lys Tyr Glu Ile Asp His
850 855 860
Ile Leu Pro Leu Ser Leu Ser Phe Asp Asp Ser Leu Ser Asn Lys Val
865 870 875 880
Leu Val Leu Ala Thr Ala Asn Gln Glu Lys Gly Gln Arg Thr Pro Phe
885 890 895
Gln Ala Leu Asp Ser Met Asp Asp Ala Trp Ser Tyr Arg Glu Phe Lys
900 905 910
Ser Tyr Val Lys Asp Ser Lys Leu Leu Ser Asn Lys Lys Lys Asp Tyr
915 920 925
Leu Leu Thr Glu Glu Asp Ile Ser Lys Ile Glu Val Lys Gln Lys Phe
930 935 940
Ile Glu Arg Asn Leu Val Asp Thr Arg Tyr Ser Ser Arg Val Val Leu
945 950 955 960
Asn Ala Leu Gln Asp Phe Tyr Lys Ser His Gln Leu Asp Thr Thr Ile
965 970 975
Ser Val Val Arg Gly Gln Phe Thr Ser Gln Leu Arg Arg Lys Trp Gly
980 985 990
Ile Glu Lys Ser Arg Glu Thr Tyr His His His Ala Val Asp Ala Leu
995 1000 1005
Ile Ile Ala Ala Ser Ser Gln Leu Arg Leu Trp Lys Lys His Ser
1010 1015 1020
Asn Pro Leu Ile Ala Tyr Lys Glu Gly Gln Phe Val Asp Ser Glu
1025 1030 1035
Thr Gly Glu Ile Val Ser Leu Ser Asp Glu Glu Tyr Lys Glu Leu
1040 1045 1050
Val Phe Lys Ala Pro Tyr Asp His Phe Val Asp Thr Leu Arg Ser
1055 1060 1065
Lys Lys Phe Glu Asp Ser Ile Leu Phe Ser Tyr Gln Val Asp Ser
1070 1075 1080
Lys Tyr Asn Arg Lys Ile Ser Asp Ala Thr Ile Tyr Ala Thr Arg
1085 1090 1095
Lys Ala Lys Leu Asp Lys Glu Lys Lys Glu Tyr Thr Tyr Thr Leu
1100 1105 1110
Gly Lys Ile Lys Asp Ile Tyr Ala Leu Gly Thr Lys Thr Pro Ser
1115 1120 1125
Lys Thr Gly Phe Tyr Lys Phe Leu Asp Leu Tyr Lys Thr Asp Lys
1130 1135 1140
Ser Gln Phe Leu Met Tyr Gln Lys Asp Arg Lys Thr Trp Asp Glu
1145 1150 1155
Val Ile Glu Lys Ile Ile Glu Gln Tyr Arg Pro Phe Lys Glu Tyr
1160 1165 1170
Asp Lys Asn Gly Lys Glu Val Asp Phe Asn Pro Phe Glu Lys Tyr
1175 1180 1185
Arg Ile Gly Asn Gly Pro Ile Arg Lys Tyr Ser Lys Lys Gly Asn
1190 1195 1200
Gly Pro Glu Ile Lys Ser Leu Lys Tyr Tyr Asp Ile Leu Leu Gly
1205 1210 1215
Lys His Lys Asn Ile Thr Pro Asp Gly Ser Arg Asn Thr Val Ala
1220 1225 1230
Leu Leu Ser Leu Asn Pro Trp Arg Thr Asp Val Tyr Tyr Asn Ser
1235 1240 1245
Glu Thr Lys Lys Tyr Glu Phe Leu Gly Leu Lys Tyr Ala Asp Leu
1250 1255 1260
Cys Phe Glu Glu Gly Gly Ala Tyr Gly Ile Ser Glu Val Lys Tyr
1265 1270 1275
Lys Lys Ile Arg Glu Lys Glu Gly Ile Gly Lys Asn Ser Glu Phe
1280 1285 1290
Lys Phe Thr Leu Tyr Lys Asn Asp Leu Ile Leu Ile Lys Asp Thr
1295 1300 1305
Glu Thr Asn Cys Gln Gln Phe Phe Arg Phe Trp Ser Arg Thr Gly
1310 1315 1320
Lys Asp Asn Pro Lys Ser Phe Glu Lys His Lys Ile Glu Leu Lys
1325 1330 1335
Pro Tyr Glu Lys Ala Lys Phe Glu Lys Gly Glu Glu Leu Lys Val
1340 1345 1350
Leu Gly Lys Val Pro Pro Ser Ser Asn Gln Phe Gln Lys Asn Met
1355 1360 1365
Gln Ile Glu Asn Leu Ser Ile Tyr Lys Val Lys Thr Asp Ile Leu
1370 1375 1380
Gly Asn Lys His Phe Ile Lys Lys Glu Gly Asp Glu Pro Lys Leu
1385 1390 1395
Lys Phe Lys Lys Ser Gly Gly Ser Gly Gly Ser Gly Gly Ser Thr
1400 1405 1410
Asn Leu Ser Asp Ile Ile Glu Lys Glu Thr Gly Lys Gln Leu Val
1415 1420 1425
Ile Gln Glu Ser Ile Leu Met Leu Pro Glu Glu Val Glu Glu Val
1430 1435 1440
Ile Gly Asn Lys Pro Glu Ser Asp Ile Leu Val His Thr Ala Tyr
1445 1450 1455
Asp Glu Ser Thr Asp Glu Asn Val Met Leu Leu Thr Ser Asp Ala
1460 1465 1470
Pro Glu Tyr Lys Pro Trp Ala Leu Val Ile Gln Asp Ser Asn Gly
1475 1480 1485
Glu Asn Lys Ile Lys Met Leu Ser Gly Gly Ser Gly Gly Ser Gly
1490 1495 1500
Gly Ser Thr Asn Leu Ser Asp Ile Ile Glu Lys Glu Thr Gly Lys
1505 1510 1515
Gln Leu Val Ile Gln Glu Ser Ile Leu Met Leu Pro Glu Glu Val
1520 1525 1530
Glu Glu Val Ile Gly Asn Lys Pro Glu Ser Asp Ile Leu Val His
1535 1540 1545
Thr Ala Tyr Asp Glu Ser Thr Asp Glu Asn Val Met Leu Leu Thr
1550 1555 1560
Ser Asp Ala Pro Glu Tyr Lys Pro Trp Ala Leu Val Ile Gln Asp
1565 1570 1575
Ser Asn Gly Glu Asn Lys Ile Lys Met Leu Ser Gly Gly Ser Lys
1580 1585 1590
Arg Thr Ala Asp Gly Ser Glu Phe Glu Pro Lys Lys Lys Arg Lys
1595 1600 1605
Val
<210> 6
<211> 4827
<212> DNA
<213> SgoCas9-ancBE4max
<400> 6
ccaaagaaga agcggaaagt cagcagtgaa accggaccag tggcagtgga cccaaccctg 60
aggagacgga ttgagcccca tgaatttgaa gtgttctttg acccaaggga gctgaggaag 120
gagacatgcc tgctgtacga gatcaagtgg ggcacaagcc acaagatctg gcgccacagc 180
tccaagaaca ccacaaagca cgtggaagtg aatttcatcg agaagtttac ctccgagcgg 240
cacttctgcc cctctaccag ctgttccatc acatggtttc tgtcttggag cccttgcggc 300
gagtgttcca aggccatcac cgagttcctg tctcagcacc ctaacgtgac cctggtcatc 360
tacgtggccc ggctgtatca ccacatggac cagcagaaca ggcagggcct gcgcgatctg 420
gtgaattctg gcgtgaccat ccagatcatg acagccccag agtacgacta ttgctggcgg 480
aacttcgtga attatccacc tggcaaggag gcacactggc caagataccc acccctgtgg 540
atgaagctgt atgcactgga gctgcacgca ggaatcctgg gcctgcctcc atgtctgaat 600
atcctgcgga gaaagcagcc ccagctgaca tttttcacca ttgctctgca gtcttgtcac 660
tatcagcggc tgcctcctca tattctgtgg gctacaggcc tgaagtctgg aggatctagc 720
ggaggatcct ctggcagcga gacaccagga acaagcgagt cagcaacacc agagagcagt 780
ggcggcagca gcggcggcag caacggcctg gtgctgggcc tggccatcgg catcgcctct 840
gtgggcgtgg gcatcctgga gaaagacact ggcaagatca ttcacgcttc gagcagactg 900
ttcccagccg ccacagccga caacaatgtg gagagacgga gcaatagaca gggcagacgg 960
ctaaaccggc ggaaaaagca cagatccgtg cggctgcagg acctgtttga aggatacggc 1020
ctgctgacag acttcagcaa ggtgtccatg aacctgaatc cctaccagct gcgggtgcag 1080
ggaatggaaa accagctgac caacgaggag ctgttcgtgg ccctgaagaa tatcgtgaag 1140
agaagaggca tcagctacct ggacgatgcc agcgaggacg gcggcaccgt gagtagcgac 1200
tacggcaagg ctgtggaaga aaacagaaaa ctgctggcgg aaaagacgcc cggccaaatc 1260
cagctggaac gcttcgagaa gtatggccag ctgagaggcg acttcaccgt ggaagaaaat 1320
ggcgagaagc atagactgat caacgtgttc agcaccagcg cctacagaaa ggaagctgaa 1380
cggatcctgc ggaagcagca ggagttcaac agcaagatca cagacgagtt tattgaggac 1440
tacctgatca tcctgacagg aaaacggaag tactaccacg gacctggcaa cgagaagagc 1500
agaaccgact acggcagatt cagaaccgac ggcaccaccc tggacaacat cttcggcatc 1560
ctgattggaa agtgtacatt ctacaccgaa gagtatcggg cctctaaggc cagctacaca 1620
gcccaggagt tcaacctgct caacgatctg aacaacctga ccgtgcctac cgagacaaag 1680
aaactgagcg aggagcagaa gaagctgatc atcgagtacg ccaaatctgc caagaccctc 1740
ggcgccagca ccctgctgaa atatatcgcc aaaatgatcg acgccagcgt cgaccagatc 1800
agaggctacc gggtggacgt gaacaacaag cccgagatgc acaccttcga ggtctaccga 1860
aagatgcaga gcctggaaac aatcaaggtg gaagaactgc ctagaaaggt cctggatgaa 1920
ctggcccaca tcctcaccct gaataccgag agagagggca tcgaggaggc catcaacagc 1980
aagctgaagg acatcttcaa ccgcgaccag gtgctggagc tggtgcagtt cagaaagaac 2040
aacagcagtc tgttctccaa gggatggcac aacttcagca tcaagctgat gatggaactg 2100
atcccagagc tgtatgaaac atccgaagaa cagatgacca tcctgacaag actgggcaaa 2160
cagcgttcta aggagacctc taagcggacc aaatacatcg atgagaaaga actgaccgag 2220
gagatctata accccgtggt ggccaaaagc gtccggcagg ccatcaagat catcaacgag 2280
gccactaaga agtacggcat tttcgacaac atcgtgatcg agatggccag agaaaacaac 2340
gaagaagatg ccaagaaaga ttatattaaa aggcaaaaag ctaatcaaga tgaaaagaac 2400
gccgccatgg aaaaggctgc attccagtac aatggcaaga aggaactgcc tgataatatc 2460
tttcacggcc acaaggagct gacaacaaaa attcggctgt ggcaccagca gggagaaaag 2520
tgcctgtaca ccggaaagaa tatccctatc tctgatctta ttcacaacca gtacaagtac 2580
gagatcgacc acatcctgcc cctgtccctg agctttgacg actctctgag caacaaggtt 2640
ctggttctgg ccaccgccaa ccaggagaag ggccaaagaa ctcctttcca ggccctggac 2700
agcatggacg acgcctggag ctacagagag ttcaagagct acgtgaaaga ctctaaactg 2760
ctgtctaaca agaagaaaga ctacctgttg acagaggagg atatctccaa gatcgaggtc 2820
aagcagaaat tcatcgagag aaatctggtg gataccagat acagctccag agtggttctg 2880
aatgcccttc aagacttcta caagagccac cagctggaca ccaccatctc agtggtgcgg 2940
ggccagttta ccagccagct gcggagaaag tggggcatcg agaaaagcag ggaaacctac 3000
caccaccatg ccgtagacgc tcttatcatt gctgcctcta gccagctgcg gctgtggaag 3060
aagcacagca accctctgat cgcctataag gagggccagt ttgtggacag cgagacaggc 3120
gagatcgtgt ctctgtccga cgaagaatac aaggaactgg tgtttaaggc cccttacgat 3180
cactttgtgg ataccctgag aagcaagaaa ttcgaagata gcatcctgtt tagctatcaa 3240
gtggattcta agtacaacag aaagatctcc gatgcaacaa tctacgcgac caggaaggct 3300
aagctggata aggaaaagaa ggagtacaca tacaccctcg gaaagatcaa agatatctac 3360
gccctgggca caaagacccc ttccaagacc ggattctaca agttcctgga cctgtacaag 3420
accgataaga gccagttcct gatgtaccaa aaggatagaa agacctggga cgaggtgatc 3480
gagaaaatca tcgagcagta ccggcctttt aaggagtacg acaagaacgg caaagaggtg 3540
gatttcaacc ccttcgagaa gtacagaatc ggcaatggcc ccatccggaa atacagcaag 3600
aagggcaacg gacctgagat caagagtctg aaatattacg acatcctgct gggcaaacac 3660
aagaacatca ctcctgacgg atctagaaac accgtggccc tgctgagcct gaacccttgg 3720
agaacagacg tgtactacaa cagcgaaaca aagaagtacg agttcctggg actcaagtac 3780
gccgacctgt gcttcgaaga gggcggagcc tacggcatca gcgaggtgaa gtacaagaag 3840
atcagagaaa aggagggcat cggcaagaat agcgagttca agttcaccct gtacaagaac 3900
gacctgattc tgatcaagga caccgaaacc aactgccagc agttcttcag attctggagc 3960
agaaccggta aggacaaccc taaatctttc gaaaagcata agatcgagct gaagccttac 4020
gagaaagcca agttcgagaa aggcgaggag ctaaaagtgc tgggcaaggt gccaccttct 4080
tccaaccagt ttcagaagaa catgcaaatc gagaacttga gcatctacaa ggtcaagaca 4140
gacatcctgg gtaacaaaca ctttatcaaa aaggagggag atgaacccaa gctcaagttc 4200
aagaagagcg gcgggagcgg cgggagcggg gggagcacta atctgagcga catcattgag 4260
aaggagactg ggaaacagct ggtcattcag gagtccatcc tgatgctgcc tgaggaggtg 4320
gaggaagtga tcggcaacaa gccagagtct gacatcctgg tgcacaccgc ctacgacgag 4380
tccacagatg agaatgtgat gctgctgacc tctgacgccc ccgagtataa gccttgggcc 4440
ctggtcatcc aggattctaa cggcgagaat aagatcaaga tgctgagcgg aggatccgga 4500
ggatctggag gcagcaccaa cctgtctgac atcatcgaga aggagacagg caagcagctg 4560
gtcatccagg agagcatcct gatgctgccc gaagaagtcg aagaagtgat cggaaacaag 4620
cctgagagcg atatcctggt ccataccgcc tacgacgaga gtaccgacga aaatgtgatg 4680
ctgctgacat ccgacgcccc agagtataag ccctgggctc tggtcatcca ggattccaac 4740
ggagagaaca aaatcaaaat gctgtctggc ggctcaaaaa gaaccgccga cggcagcgaa 4800
ttcgagccca agaagaagag gaaagtc 4827
<210> 7
<211> 4941
<212> DNA
<213> pGL3-U6-Sgo gRNA insert site-scaffold
<400> 7
gagggcctat ttcccatgat tccttcatat ttgcatatac gatacaaggc tgttagagag 60
ataattggaa ttaatttgac tgtaaacaca aagatattag tacaaaatac gtgacgtaga 120
aagtaataat ttcttgggta gtttgcagtt ttaaaattat gttttaaaat ggactatcat 180
atgcttaccg taacttgaaa gtatttcgat ttcttggctt tatatatctt gtggaaagga 240
cgaaacaccg tgagaccgag agagggtctc agtttttgta ctctcaagga aacttgcaga 300
agctacaaag ataaggcttc atgccgaatt caacaccctg tcatttatgg cggggtgttt 360
ttttttttaa agaattctcg acctcgagac aaatggcagt attcatccac aattttaaaa 420
gaaaaggggg gattgggggg tacagtgcag gggaaagaat agtagacata atagcaacag 480
acatacaaac taaagaatta caaaaacaaa ttacaaaaat tcaaaatttt cgggtttatt 540
acagggacag cagagatcca ctttggccgc ggctcgaggg ggttggggtt gcgccttttc 600
caaggcagcc ctgggtttgc gcagggacgc ggctgctctg ggcgtggttc cgggaaacgc 660
agcggcgccg accctgggac tcgcacattc ttcacgtccg ttcgcagcgt cacccggatc 720
ttcgccgcta cccttgtggg ccccccggcg acgcttcctg ctccgcccct aagtcgggaa 780
ggttccttgc ggttcgcggc gtgccggacg tgacaaacgg aagccgcacg tctcactagt 840
accctcgcag acggacagcg ccagggagca atggcagcgc gccgaccgcg atgggctgtg 900
gccaatagcg gctgctcagc agggcgcgcc gagagcagcg gccgggaagg ggcggtgcgg 960
gaggcggggt gtggggcggt agtgtgggcc ctgttcctgc ccgcgcggtg ttccgcattc 1020
tgcaagcctc cggagcgcac gtcggcagtc ggctccctcg ttgaccgaat caccgacctc 1080
tctccccagg gggatccatg gtgagcaagg gcgaggagct gttcaccggg gtggtgccca 1140
tcctggtcga gctggacggc gacgtaaacg gccacaagtt cagcgtgtcc ggcgagggcg 1200
agggcgatgc cacctacggc aagctgaccc tgaagttcat ctgcaccacc ggcaagctgc 1260
ccgtgccctg gcccaccctc gtgaccaccc tgacctacgg cgtgcagtgc ttcagccgct 1320
accccgacca catgaagcag cacgacttct tcaagtccgc catgcccgaa ggctacgtcc 1380
aggagcgcac catcttcttc aaggacgacg gcaactacaa gacccgcgcc gaggtgaagt 1440
tcgagggcga caccctggtg aaccgcatcg agctgaaggg catcgacttc aaggaggacg 1500
gcaacatcct ggggcacaag ctggagtaca actacaacag ccacaacgtc tatatcatgg 1560
ccgacaagca gaagaacggc atcaaggtga acttcaagat ccgccacaac atcgaggacg 1620
gcagcgtgca gctcgccgac cactaccagc agaacacccc catcggcgac ggccccgtgc 1680
tgctgcccga caaccactac ctgagcaccc agtccgccct gagcaaagac cccaacgaga 1740
agcgcgatca catggtcctg ctggagttcg tgaccgccgc cgggatcact ctcggcatgg 1800
acgagctgta caagtaaagc ggccgcgact ctagatcata atcagccata ccacatttgt 1860
agaggtttta cttgctttaa aaaacctccc acacctcccc ctgaacctga aacataaaat 1920
gaatgcaatt gttgttgtta acttgtttat tgcagcttat aatggttaca aataaagcaa 1980
tagcatcaca aatttcacaa ataaagcatt tttttcactg cattctagtt gtggtttgtc 2040
caaactcatc aatgtatctt agtcgaccga tgcccttgag agccttcaac ccagtcagct 2100
ccttccggtg ggcgcggggc atgactatcg tcgccgcact tatgactgtc ttctttatca 2160
tgcaactcgt aggacaggtg ccggcagcgc tcttccgctt cctcgctcac tgactcgctg 2220
cgctcggtcg ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta 2280
tccacagaat caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc 2340
aggaaccgta aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag 2400
catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac 2460
caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc 2520
ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt 2580
aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc 2640
gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga 2700
cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta 2760
ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag aagaacagta 2820
tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga 2880
tccggcaaac aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg 2940
cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag 3000
tggaacgaaa actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc 3060
tagatccttt taaattaaaa atgaagtttt aaatcaatct aaagtatata tgagtaaact 3120
tggtctgaca gttaccaatg cttaatcagt gaggcaccta tctcagcgat ctgtctattt 3180
cgttcatcca tagttgcctg actccccgtc gtgtagataa ctacgatacg ggagggctta 3240
ccatctggcc ccagtgctgc aatgataccg cgggacccac gctcaccggc tccagattta 3300
tcagcaataa accagccagc cggaagggcc gagcgcagaa gtggtcctgc aactttatcc 3360
gcctccatcc agtctattaa ttgttgccgg gaagctagag taagtagttc gccagttaat 3420
agtttgcgca acgttgttgc cattgctaca ggcatcgtgg tgtcacgctc gtcgtttggt 3480
atggcttcat tcagctccgg ttcccaacga tcaaggcgag ttacatgatc ccccatgttg 3540
tgcaaaaaag cggttagctc cttcggtcct ccgatcgttg tcagaagtaa gttggccgca 3600
gtgttatcac tcatggttat ggcagcactg cataattctc ttactgtcat gccatccgta 3660
agatgctttt ctgtgactgg tgagtactca accaagtcat tctgagaata gtgtatgcgg 3720
cgaccgagtt gctcttgccc ggcgtcaata cgggataata ccgcgccaca tagcagaact 3780
ttaaaagtgc tcatcattgg aaaacgttct tcggggcgaa aactctcaag gatcttaccg 3840
ctgttgagat ccagttcgat gtaacccact cgtgcaccca actgatcttc agcatctttt 3900
actttcacca gcgtttctgg gtgagcaaaa acaggaaggc aaaatgccgc aaaaaaggga 3960
ataagggcga cacggaaatg ttgaatactc atactcttcc tttttcaata ttattgaagc 4020
atttatcagg gttattgtct catgagcgga tacatatttg aatgtattta gaaaaataaa 4080
caaatagggg ttccgcgcac atttccccga aaagtgccac ctgacgcgcc ctgtagcggc 4140
gcattaagcg cggcgggtgt ggtggttacg cgcagcgtga ccgctacact tgccagcgcc 4200
ctagcgcccg ctcctttcgc tttcttccct tcctttctcg ccacgttcgc cggctttccc 4260
cgtcaagctc taaatcgggg gctcccttta gggttccgat ttagtgcttt acggcacctc 4320
gaccccaaaa aacttgatta gggtgatggt tcacgtagtg ggccatcgcc ctgatagacg 4380
gtttttcgcc ctttgacgtt ggagtccacg ttctttaata gtggactctt gttccaaact 4440
ggaacaacac tcaaccctat ctcggtctat tcttttgatt tataagggat tttgccgatt 4500
tcggcctatt ggttaaaaaa tgagctgatt taacaaaaat ttaacgcgaa ttttaacaaa 4560
atattaacgc ttacaatttg ccattcgcca ttcaggctgc gcaactgttg ggaagggcga 4620
tcggtgcggg cctcttcgct attacgccag cccaagctac catgataagt aagtaatatt 4680
aaggtacggg aggtacttgg agcggccgca ataaaatatc tttattttca ttacatctgt 4740
gtgttggttt tttgtgtgaa tcgatagtac taacatacgc tctccatcaa aacaaaacga 4800
aacaaaacaa actagcaaaa taggctgtcc ccagtgcaag tgcaggtgcc agaacatttc 4860
tctatcgata ggtaccgatt agtgaacgga tctcgacggt atcgatcacg agactagcct 4920
cgagcggccg cccccttcac c 4941
<210> 8
<211> 7
<212> PRT
<213> nuclear localization signal polypeptide
<400> 8
Pro Lys Lys Lys Arg Lys Val
1 5
<210> 9
<211> 3785
<212> DNA
<213> 32aa linker-SgoCas9 D9A-10aa linker-UGI
<400> 9
agcggaggat cctctggcag cgagacacca ggaacaagcg agtcagcaac accagagagc 60
agtggcggca gcagcggcgg cagcaacggc ctggtgctgg gcctggccat cggcatcgcc 120
tctgtgggcg tgggcatcct ggagaaagac actggcaaga tcattcacgc ttcgagcaga 180
ctgttcccag ccgccacagc cgacaacaat gtggagagac ggagcaatag acagggcaga 240
cggctaaacc ggcggaaaaa gcacagatcc gtgcggctgc aggacctgtt tgaaggatac 300
ggcctgctga cagacttcag caaggtgtcc atgaacctga atccctacca gctgcgggtg 360
cagggaatgg aaaaccagct gaccaacgag gagctgttcg tggccctgaa gaatatcgtg 420
aagagaagag gcatcagcta cctggacgat gccagcgagg acggcggcac cgtgagtagc 480
gactacggca aggctgtgga agaaaacaga aaactgctgg cggaaaagac gcccggccaa 540
atccagctgg aacgcttcga gaagtatggc cagctgagag gcgacttcac cgtggaagaa 600
aatggcgaga agcatagact gatcaacgtg ttcagcacca gcgcctacag aaaggaagct 660
gaacggatcc tgcggaagca gcaggagttc aacagcaaga tcacagacga gtttattgag 720
gactacctga tcatcctgac aggaaaacgg aagtactacc acggacctgg caacgagaag 780
agcagaaccg actacggcag attcagaacc gacggcacca ccctggacaa catcttcggc 840
atcctgattg gaaagtgtac attctacacc gaagagtatc gggcctctaa ggccagctac 900
acagcccagg agttcaacct gctcaacgat ctgaacaacc tgaccgtgcc taccgagaca 960
aagaaactga gcgaggagca gaagaagctg atcatcgagt acgccaaatc tgccaagacc 1020
ctcggcgcca gcaccctgct gaaatatatc gccaaaatga tcgacgccag cgtcgaccag 1080
atcagaggct accgggtgga cgtgaacaac aagcccgaga tgcacacctt cgaggtctac 1140
cgaaagatgc agagcctgga aacaatcaag gtggaagaac tgcctagaaa ggtcctggat 1200
gaactggccc acatcctcac cctgaatacc gagagagagg gcatcgagga ggccatcaac 1260
agcaagctga aggacatctt caaccgcgac caggtgctgg agctggtgca gttcagaaag 1320
aacaacagca gtctgttctc caagggatgg cacaacttca gcatcaagct gatgatggaa 1380
ctgatcccag agctgtatga aacatccgaa gaacagatga ccatcctgac aagactgggc 1440
aaacagcgtt ctaaggagac ctctaagcgg accaaataca tcgatgagaa agaactgacc 1500
gaggagatct ataaccccgt ggtggccaaa agcgtccggc aggccatcaa gatcatcaac 1560
gaggccacta agaagtacgg cattttcgac aacatcgtga tcgagatggc cagagaaaac 1620
aacgaagaag atgccaagaa agattatatt aaaaggcaaa aagctaatca agatgaaaag 1680
aacgccgcca tggaaaaggc tgcattccag tacaatggca agaaggaact gcctgataat 1740
atctttcacg gccacaagga gctgacaaca aaaattcggc tgtggcacca gcagggagaa 1800
aagtgcctgt acaccggaaa gaatatccct atctctgatc ttattcacaa ccagtacaag 1860
tacgagatcg accacatcct gcccctgtcc ctgagctttg acgactctct gagcaacaag 1920
gttctggttc tggccaccgc caaccaggag aagggccaaa gaactccttt ccaggccctg 1980
gacagcatgg acgacgcctg gagctacaga gagttcaaga gctacgtgaa agactctaaa 2040
ctgctgtcta acaagaagaa agactacctg ttgacagagg aggatatctc caagatcgag 2100
gtcaagcaga aattcatcga gagaaatctg gtggatacca gatacagctc cagagtggtt 2160
ctgaatgccc ttcaagactt ctacaagagc caccagctgg acaccaccat ctcagtggtg 2220
cggggccagt ttaccagcca gctgcggaga aagtggggca tcgagaaaag cagggaaacc 2280
taccaccacc atgccgtaga cgctcttatc attgctgcct ctagccagct gcggctgtgg 2340
aagaagcaca gcaaccctct gatcgcctat aaggagggcc agtttgtgga cagcgagaca 2400
ggcgagatcg tgtctctgtc cgacgaagaa tacaaggaac tggtgtttaa ggccccttac 2460
gatcactttg tggataccct gagaagcaag aaattcgaag atagcatcct gtttagctat 2520
caagtggatt ctaagtacaa cagaaagatc tccgatgcaa caatctacgc gaccaggaag 2580
gctaagctgg ataaggaaaa gaaggagtac acatacaccc tcggaaagat caaagatatc 2640
tacgccctgg gcacaaagac cccttccaag accggattct acaagttcct ggacctgtac 2700
aagaccgata agagccagtt cctgatgtac caaaaggata gaaagacctg ggacgaggtg 2760
atcgagaaaa tcatcgagca gtaccggcct tttaaggagt acgacaagaa cggcaaagag 2820
gtggatttca accccttcga gaagtacaga atcggcaatg gccccatccg gaaatacagc 2880
aagaagggca acggacctga gatcaagagt ctgaaatatt acgacatcct gctgggcaaa 2940
cacaagaaca tcactcctga cggatctaga aacaccgtgg ccctgctgag cctgaaccct 3000
tggagaacag acgtgtacta caacagcgaa acaaagaagt acgagttcct gggactcaag 3060
tacgccgacc tgtgcttcga agagggcgga gcctacggca tcagcgaggt gaagtacaag 3120
aagatcagag aaaaggaggg catcggcaag aatagcgagt tcaagttcac cctgtacaag 3180
aacgacctga ttctgatcaa ggacaccgaa accaactgcc agcagttctt cagattctgg 3240
agcagaaccg gtaaggacaa ccctaaatct ttcgaaaagc ataagatcga gctgaagcct 3300
tacgagaaag ccaagttcga gaaaggcgag gagctaaaag tgctgggcaa ggtgccacct 3360
tcttccaacc agtttcagaa gaacatgcaa atcgagaact tgagcatcta caaggtcaag 3420
acagacatcc tgggtaacaa acactttatc aaaaaggagg gagatgaacc caagctcaag 3480
ttcaagaaga gcggcgggag cggcgggagc ggggggagca ctaatctgag cgacatcatt 3540
gagaaggaga ctgggaaaca gctggtcatt caggagtcca tcctgatgct gcctgaggag 3600
gtggaggaag tgatcggcaa caagccagag tctgacatcc tggtgcacac cgcctacgac 3660
gagtccacag atgagaatgt gatgctgctg acctctgacg cccccgagta taagccttgg 3720
gccctggtca tccaggattc taacggcgag aataagatca agatgctgag cggaggatcc 3780
ggagg 3785
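
The DNA entries in the listing above are laid out as groups of ten bases per line, each line ending in a running base count. A minimal Python sketch, assuming that layout holds for every line, shows how such lines can be joined back into one contiguous sequence and sanity-checked against their counters. The helper names (`parse_line`, `join_listing_lines`, `counts_consistent`, `translate`) and the deliberately partial codon table are illustrative assumptions, not part of the patent. As a worked check, the example takes the last two lines of the 4827 bp sequence that precedes SEQ ID NO. 7 and confirms that its final seven codons encode the peptide listed as SEQ ID NO. 8.

```python
# Partial codon table: only the codons needed for the example below (illustrative).
CODON_TABLE = {"ccc": "Pro", "aag": "Lys", "aaa": "Lys", "agg": "Arg", "gtc": "Val"}


def parse_line(line):
    """Split one listing line into (bases, running_count)."""
    tokens = line.split()
    # The trailing integer is the cumulative base count, not sequence data.
    count = int(tokens.pop()) if tokens and tokens[-1].isdigit() else None
    return "".join(tokens), count


def join_listing_lines(lines):
    """Concatenate the bases of several listing lines into one string."""
    return "".join(parse_line(line)[0] for line in lines)


def counts_consistent(lines):
    """Check that each running count grows by the number of bases on its line.

    Assumes every line carries a trailing count.
    """
    parsed = [parse_line(line) for line in lines]
    return all(
        cur_count - prev_count == len(cur_bases)
        for (_, prev_count), (cur_bases, cur_count) in zip(parsed, parsed[1:])
    )


def translate(dna):
    """Translate an in-frame DNA string using the partial table above."""
    usable = len(dna) - len(dna) % 3
    return [CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3)]


tail_lines = [
    "ggagagaaca aaatcaaaat gctgtctggc ggctcaaaaa gaaccgccga cggcagcgaa 4800",
    "ttcgagccca agaagaagag gaaagtc 4827",
]
sequence = join_listing_lines(tail_lines)
print(counts_consistent(tail_lines))         # True
print(" ".join(translate(sequence[-21:])))   # Pro Lys Lys Lys Arg Lys Val
```

The translated tail matches the seven residues given under SEQ ID NO. 8. A complete check would need the full standard codon table and the whole sequence rather than these two lines.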

Claims (5)

1. A fusion protein comprising a codon-optimized Cas9 nickase homologous protein derived from Streptococcus gordonii, a cytosine deaminase, and a uracil glycosylase inhibitor protein;
the amino acid sequence of the Cas9 nickase homologous protein is the amino acid sequence at positions 2 to 1136, counted from the N-terminus, of the SgoCas9 D9A nickase shown in SEQ ID NO. 1;
the fusion protein further comprises an N-terminal BPNLS-ancAPOBEC1 polypeptide and a C-terminal 2×UGI-BPNLS polypeptide; wherein the BPNLS-ancAPOBEC1 polypeptide is formed by fusing the BPNLS polypeptide and the ancAPOBEC1 polypeptide, and the 2×UGI-BPNLS polypeptide is formed by fusing the 2×UGI polypeptide and the BPNLS polypeptide;
the amino acid sequence of the BPNLS-ancAPOBEC1 polypeptide is the amino acid sequence shown in SEQ ID NO. 3;
the amino acid sequence of the 2×UGI-BPNLS polypeptide is the amino acid sequence shown in SEQ ID NO. 4;
the fusion protein further comprises a nuclear localization signal polypeptide segment, the amino acid sequence of which is shown in SEQ ID NO. 8;
the fusion protein comprises, in order from the N-terminus to the C-terminus, the BPNLS polypeptide, the ancAPOBEC1 polypeptide, a 32aa linker, the amino acid sequence at positions 2 to 1136 from the N-terminus of the SgoCas9 D9A nickase, a 10aa linker, the 2×UGI polypeptide and the BPNLS polypeptide;
the complete amino acid sequence of the fusion protein is the amino acid sequence shown in SEQ ID NO. 5;
the fusion protein recognizes NNAAAG as the PAM, wherein N represents any base; the fusion protein converts cytosine to thymine within an editing window spanning positions 8-14 counted from the 5' end of the gRNA.
2. A polynucleotide sequence, wherein the polynucleotide sequence encodes the fusion protein of claim 1; the polynucleotide sequence is shown in SEQ ID NO. 6.
3. A cytosine single base editor obtained by integrating a polynucleotide sequence encoding the fusion protein of claim 1 into an expression vector; the cytosine single base editor recognizes NNAAAG as the PAM, wherein N represents any base; the cytosine single base editor converts cytosine to thymine within an editing window spanning positions 8-14 counted from the 5' end of the gRNA.
4. The cytosine single base editor of claim 3, wherein the expression vector comprises a gRNA scaffold derived from the tandem repeat sequence of Streptococcus gordonii, and the nucleotide sequence of the expression vector is shown in SEQ ID NO. 7.
5. A cell expression system comprising the cytosine single base editor of claim 3 or 4; the host cell is a eukaryotic cell; the eukaryotic cell is a mouse brain neuroma cell, a human embryonic kidney cell or a human colon cancer cell.
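
The targeting rule recited in the claims (an NNAAAG PAM and C-to-T conversion at positions 8-14 counted from the 5' end of the gRNA) can be illustrated with a short Python sketch. Several points are assumptions made for illustration only: the 20 nt protospacer length (`SPACER_LEN`), the function name `find_editable_sites`, the synthetic test sequence, the restriction to the strand as given (no reverse-complement search), and the idealized conversion of every cytosine in the window, which real editing would not be expected to achieve uniformly.

```python
import re

# NNAAAG with a lookahead so that overlapping PAM candidates are all reported.
PAM_PATTERN = re.compile(r"(?=([ACGT]{2}AAAG))")
SPACER_LEN = 20         # assumption: the protospacer length is not stated in the claims
WINDOW = range(8, 15)   # positions 8-14, 1-based from the 5' end of the protospacer


def find_editable_sites(dna, spacer_len=SPACER_LEN):
    """List protospacers followed by an NNAAAG PAM and the cytosines in the window."""
    dna = dna.upper()
    sites = []
    for match in PAM_PATTERN.finditer(dna):
        pam_start = match.start()
        if pam_start < spacer_len:
            continue  # not enough sequence upstream of the PAM for a full protospacer
        spacer = dna[pam_start - spacer_len:pam_start]
        c_positions = [p for p in WINDOW if p <= spacer_len and spacer[p - 1] == "C"]
        # Idealized outcome: every windowed C becomes T.
        edited = "".join(
            "T" if (i + 1) in c_positions else base for i, base in enumerate(spacer)
        )
        sites.append(
            {"spacer": spacer, "pam": match.group(1),
             "c_positions": c_positions, "edited": edited}
        )
    return sites


# Synthetic example sequence (not taken from the patent).
example = "GATTGATCAGTCAGTGATGA" + "TGAAAG"
for site in find_editable_sites(example):
    print(site["pam"], site["c_positions"], site["edited"])
# TGAAAG [8, 12] GATTGATTAGTTAGTGATGA
```

In the example the protospacer carries cytosines at positions 8 and 12, both inside the window, so both are reported as candidate C-to-T conversions.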
CN202110519757.1A 2021-05-12 2021-05-12 Cytosine single base editor tool and application thereof Active CN113201517B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110519757.1A CN113201517B (en) 2021-05-12 2021-05-12 Cytosine single base editor tool and application thereof

Publications (2)

Publication Number Publication Date
CN113201517A CN113201517A (en) 2021-08-03
CN113201517B true CN113201517B (en) 2022-11-01

Family

ID=77030981

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110519757.1A Active CN113201517B (en) 2021-05-12 2021-05-12 Cytosine single base editor tool and application thereof

Country Status (1)

Country Link
CN (1) CN113201517B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115704015A (en) * 2021-08-12 2023-02-17 清华大学 Targeted mutagenesis system based on adenine and cytosine double-base editor
CN117683755A (en) * 2024-01-31 2024-03-12 南京农业大学三亚研究院 C-to-G base editing system

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105637087A (en) * 2013-09-18 2016-06-01 科马布有限公司 Methods, cells and organisms
CN110835634A (en) * 2018-08-15 2020-02-25 华东师范大学 Novel base conversion editing system and application thereof
CN110029096A (en) * 2019-05-09 2019-07-19 上海科技大学 A kind of adenine base edit tool and application thereof
CN110467679A (en) * 2019-08-06 2019-11-19 广州大学 A kind of fusion protein, base edit tool and method and its application
WO2021032155A1 (en) * 2019-08-20 2021-02-25 中国科学院遗传与发育生物学研究所 Base editing system and use method therefor
CN111172133A (en) * 2020-03-10 2020-05-19 上海科技大学 Base editing tool and application thereof
CN112266420A (en) * 2020-10-30 2021-01-26 华南农业大学 Plant efficient cytosine single-base editor and construction and application thereof

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
ACCESSION WP_012130469.1, type II CRISPR RNA-guided endonuclease Cas9 [Streptococcus gordonii]; no author; GenBank; 20191009; pp. 1-2 *
Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage; Alexis C. Komor et al.; Nature; 20160519 (No. 533); pp. 420-424 *
Two Compact Cas9 Ortholog-Based Cytosine Base Editors Expand the DNA Targeting Scope and Applications In Vitro and In Vivo; Susu Wu et al.; Frontiers in Cell and Developmental Biology; 20220301 (No. 10); pp. 1-18 *
Creation of high-oleic-acid soybean mutant lines using CRISPR/Cas9 technology; 侯智红 et al.; 《作物学报》; 20190316; Vol. 45 (No. 06); pp. 839-847 *
Research progress of single-base gene editing systems; 刘佳慧 et al.; 《世界科技研究与发展》; 20170915; Vol. 39 (No. 06); pp. 457-462 *
Research progress of cytosine base editors (CBE) at the single-base level; 李广栋 et al.; 《畜牧兽医学报》; 20200622; Vol. 51 (No. 01); pp. 1-8 *
Research progress of single-base editing technology based on the CRISPR/Cas system; 王玥 et al.; 《中国生物工程杂志》; 20201215; Vol. 40 (No. 12); pp. 58-66 *

Also Published As

Publication number Publication date
CN113201517A (en) 2021-08-03

Similar Documents

Publication Publication Date Title
Horwitz et al. Efficient multiplexed integration of synergistic alleles and metabolic pathways in yeasts via CRISPR-Cas
DK3004338T3 (en) LAGLIDADG HOMING ENDONUCLEASE DIVERSE T-CELL RECEPTOR ALPHA GENET AND APPLICATIONS THEREOF
CN113201517B (en) Cytosine single base editor tool and application thereof
CN108822217A (en) A kind of gene base editing machine
KR20200079497A (en) Formulation
KR20200058508A (en) Polynucleotides, compositions and methods for genome editing
HU211605A9 (en) Stem cell inhibiting proteins
PT96104A (en) PROCESS FOR THE PREPARATION OF FUSEO PROTEINS
CN114736893B (en) Method for realizing A/T to G/C editing on mitochondrial DNA
KR20200074134A (en) In vitro method of mRNA delivery using lipid nanoparticles
CN112154206A (en) Production of 2-keto-3-deoxy-D-gluconic acid in filamentous fungi
KR20220149588A (en) Compositions and methods for the treatment of metabolic liver disorders
CN110042067B (en) Method for improving xylose utilization capacity of recombinant saccharomyces cerevisiae strain and mutant strain thereof
US20040161756A1 (en) Substrate linked directed evolution (slide)
CN1156572C (en) In vitro transcription processes for screening natural products and other chemical substances
CN111296364B (en) Gene modification method for mouse animal model and application thereof
CN110484517A (en) A kind of composition and preparation method of the Rift Valley fever virus being used to prepare weak poison, RVFV attenuated vaccine
KR101960382B1 (en) Variant Microorganism Producing Butanol in Aerobic Condition and Method for Preparing Butanol Using the Same
Walter et al. Method for multiplexed integration of synergistic alleles and metabolic pathways in yeasts via CRISPR-Cas9
CN101481703A (en) Avian origin promoter expression vector, construction method and use thereof
CN113832091B (en) Bacillus thuringiensis engineering bacterium for expressing bivalent insecticidal protein, and construction method and application thereof
CN113736790B (en) sgRNA (ribonucleic acid) for knocking out duck hnRNPA3 gene, cell line, construction method and application thereof
CN114364440A (en) Gene editing therapy for AAV-mediated RPGR X-linked retinal degeneration
KR102583349B1 (en) Genetic modification of eremothecium to increase gmp synthetase activity
CN113493855B (en) Kit for detecting HBV cccDNA based on RAA-CRISPR-cas13a

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant