WO2021004456A1 - Improved genome editing system and use thereof - Google Patents

Improved genome editing system and use thereof

Info

Publication number
WO2021004456A1
Authority
WO
WIPO (PCT)
Prior art keywords
cas9
genome editing
sgrna
fusion polypeptide
target
Prior art date
Application number
PCT/CN2020/100664
Other languages
French (fr)
Chinese (zh)
Inventor
邱金龙
刘关稳
尹康权
Original Assignee
Institute of Microbiology, Chinese Academy of Sciences (中国科学院微生物研究所)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Microbiology, Chinese Academy of Sciences (中国科学院微生物研究所)
Publication of WO2021004456A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K19/00 Hybrid peptides, i.e. peptides covalently bound to nucleic acids, or non-covalently bound protein-protein complexes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82 Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8201 Methods for introducing genetic material into plant cells, e.g. DNA, RNA, stable or transient incorporation, tissue culture methods adapted for transformation
    • C12N15/8213 Targeted insertion of genes into the plant genome by homologous recombination
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N5/00 Undifferentiated human, animal or plant cells, e.g. cell lines; Tissues; Cultivation or maintenance thereof; Culture media therefor
    • C12N5/10 Cells modified by introduction of foreign genetic material
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/70 Fusion polypeptide containing domain for protein-protein interaction
    • C07K2319/71 Fusion polypeptide containing domain for protein-protein interaction containing domain for transcriptional activation, e.g. VP16

Definitions

  • The invention relates to the field of genome editing. Specifically, the present invention relates to an improved genome editing system and its application. More specifically, the present invention provides a genome editing fusion polypeptide comprising a CRISPR nuclease domain and a transcription activation domain. The present invention also provides polynucleotides or expression constructs encoding the polypeptides, and genome editing systems comprising the polypeptides, polynucleotides and/or constructs. The present invention also provides a method for editing the genome of a cell using the genome editing system.
  • the CRISPR/Cas9 system has been widely and successfully used in genome engineering of many eukaryotic species. However, in animal and plant cells, the editing efficiency of different genomic sites varies greatly. The low CRISPR/Cas9 editing efficiency at certain sites limits the availability of targets in vivo, thus limiting further applications.
  • eukaryotic genomic DNA wraps around histones and further compresses to form higher-order chromatin structures that may prevent Cas9 from binding to its target.
  • Genome-wide mapping of the binding sites of catalytically inactivated Cas9 (dCas9) in mammalian cells shows that the binding sites are enriched in open chromatin regions.
  • CRISPR/Cas9 induces more insertions and deletions (indels) in open chromatin regions.
  • the proxy-CRISPR strategy uses additional catalytically inactive SpCas9 (dCas9) to bind to nearby locations. This makes the target site accessible to FnCas9, CjCas9, NcCas9 and FnCpf1, thereby improving editing efficiency.
  • However, this method relies on genomic sites that are accessible to SpCas9 and requires the co-expression of two different CRISPR-Cas systems, which inevitably increases the size of the vector and the difficulty of in vivo application.
  • CRISPR-chrom, in which Cas9 orthologs are fused with chromatin regulatory peptides (CMPs), significantly improves Cas9 editing efficiency, especially at refractory sites.
  • However, CMPs are truncated forms of endogenous proteins, and it is not clear whether their overexpression has a dominant negative effect.
  • There remains a need in the art for further methods that improve the accessibility of genomic DNA in eukaryotes, especially plants, so as to increase editing efficiency.
  • the present invention provides a genome editing fusion polypeptide, which comprises a CRISPR nuclease domain and a transcription activation domain.
  • the present invention also provides an isolated polynucleotide encoding the genome editing fusion polypeptide of the present invention.
  • the present invention also provides an expression vector, which comprises the polynucleotide of the present invention.
  • the present invention also provides a host cell, which contains the polynucleotide or expression vector of the present invention.
  • the present invention also provides a genome editing system, which comprises at least one of the following i) to v):
  • the genome editing system of the present invention further comprises or encodes a dsgRNA whose target site is 30-300 bp away from the sgRNA target site, preferably 40-270 bp, most preferably 115-120 bp.
  • the present invention also provides a host cell comprising the polynucleotide or expression vector of the present invention or the genome editing system of the present invention.
  • the present invention also provides a method for genetically modifying cells, including introducing the genome editing system of the present invention into cells, preferably plant cells.
  • Figure 1 shows the effect of chromatin accessibility on the efficiency of Cas9 genome editing in rice.
  • Figure 1a summarizes the number of CRISPR/Cas9-mediated mutations and chromatin accessibility at 70 target sites. The mutagenesis efficiency was measured on regenerated T0 rice plants by PCR/RE. The accessibility of each target site was obtained from the high-resolution map of rice DNase I hypersensitive (DH) sites generated by Zhang et al., 2012.
  • Figure 1c summarizes the indel frequency and chromatin status of the 40 target sites in Figure 1b. The P value was calculated by the two-tailed Mann-Whitney test. **P < 0.01, ***P < 0.001.
  • Figure 2 shows that Cas9 editing in rice is more effective in open chromatin regions than in closed chromatin regions.
  • b summarizes the Cas9 editing efficiency in a. The P value was calculated by the two-tailed Mann-Whitney test. *P < 0.05.
  • c shows that Cas9 cleaves all 10 target sites equally well in the chromatin-free state.
  • the PCR product containing the corresponding target site was incubated with Cas9 ribonucleoprotein (RNP) complex, and observed and measured on an agarose gel.
  • Figure 3 shows that fusing the synthetic transcription activation domain with Cas9 improves its editing efficiency.
  • a is a schematic diagram of the structure of the fusion of transcription activation domain and Cas9 (Cas9-TV).
  • c shows the indel frequencies induced by Cas9 and Cas9-TV at 20 target sites.
  • d shows the indel frequency induced by Cas9 and Cas9-TV at target sites in the open chromatin region.
  • e shows the indel frequency induced by Cas9 and Cas9-TV at target sites in the closed chromatin region.
  • The P value was calculated by the two-tailed Mann-Whitney test. *P < 0.05, ***P < 0.001.
  • Figure 4 shows that proximal targeting of dsgRNA enhances Cas9-TV editing.
  • b shows the fold change of indel frequency induced by Cas9-TV/sgRNA and Cas9-TV/sgRNA-dsgRNA at 10 target sites relative to Cas9/sgRNA.
  • c shows the fold change of indel frequency at target sites in the open chromatin region.
  • d shows the fold change of indel frequency at target sites in the closed chromatin region.
  • The P value was calculated by the two-tailed Mann-Whitney test. ***P < 0.001, ****P < 0.0001.
  • Figure 6 shows that proximal dsgRNA targeting increases Cas9 editing efficiency.
  • b shows the indel frequency of 20 target sites.
  • c shows the indel frequency at target sites in open chromatin.
  • d shows the indel frequency at target sites in closed chromatin.
  • The P value was calculated by the two-tailed Mann-Whitney test. ***P < 0.001, ****P < 0.0001.
  • Figure 7 shows the effect of the position of the proximal dsgRNA on the editing activity of Cas9.
  • The distance between each dsgRNA target site and the Cas9-TV target site, expressed in bp, is indicated by the numbers.
  • Untreated protoplast samples were used as controls.
  • the indel frequency is measured by sequencing the targeted amplicon.
  • Figure 8 shows that Cas9-TV and proximal dsgRNA alter local chromatin accessibility.
  • Rice protoplasts were transfected with Cas9/sgRNA or Cas9-TV/sgRNA-dsgRNA, and the local chromatin accessibility around the target site was analyzed by a small-scale DNase I assay.
  • the fraction of intact genomic DNA is quantified by real-time PCR.
  • the relative amount of intact genomic DNA in the Cas9/sgRNA-treated sample is set as one unit. Error bars indicate SD for three replicates.
  • Figure 9 compares the off-target activity of Cas9/sgRNA, Cas9-TV/sgRNA and Cas9-TV/sgRNA-dsgRNA.
  • Figure 10 shows the indel patterns induced by Cas9 and Cas9-TV at the target site. The figure shows representative results from one of three independent experiments. All three experiments gave similar results.
  • Figure 11 shows the indel patterns produced by Cas9/sgRNA, Cas9/sgRNA-dsgRNA and Cas9-TV/sgRNA-dsgRNA at designated target sites. This figure shows representative results from one of three independent experiments that produced similar results.
  • Figure 12 shows that dsgRNA does not induce indels at the target site.
  • the dsgRNA was co-transformed with Cas9 or Cas9-TV into rice protoplasts.
  • the indel frequency is measured by sequencing the targeted amplicon.
  • Untreated protoplast samples were used as controls.
  • Figure 13 shows the sgRNA and dsgRNA target sites for the partial genomic DNA sequence of LOC_Os11g08760.
  • CRISPR nuclease generally refers to the nuclease that exists in the naturally occurring CRISPR system, as well as its modified form, its variant, its catalytically active fragment, and the like.
  • the term covers any effector protein based on the CRISPR system that can achieve gene targeting (such as gene editing, gene targeted regulation, etc.) in cells.
  • Examples of CRISPR nucleases include Cas9 nuclease or variants thereof.
  • The Cas9 nuclease may be a Cas9 nuclease from different species, such as SpCas9 from S. pyogenes or SaCas9 from S. aureus.
  • “Cas9 nuclease” and “Cas9” are used interchangeably herein and refer to an RNA-guided nuclease comprising the Cas9 protein or fragments thereof (for example, a protein containing the active DNA cleavage domain of Cas9 and/or the gRNA binding domain of Cas9).
  • Cas9 is a component of the CRISPR/Cas (clustered regularly interspaced short palindromic repeats and associated system) genome editing system, which can target and cut DNA target sequences under the guidance of a guide RNA to form DNA double-strand breaks (DSBs).
  • CRISPR nuclease may also include Cpf1 nuclease or a variant thereof such as a highly specific variant.
  • the Cpf1 nuclease may be Cpf1 nuclease from different species, such as Cpf1 nuclease from Francisella novicida U112, Acidaminococcus sp. BV3L6 and Lachnospiraceae bacterium ND2006.
  • A “transcription activation domain (TAD)” is generally a domain in a transcription factor that contains binding sites for other proteins such as transcriptional co-regulatory proteins. TADs are generally classified according to their amino acid composition. These amino acids can be amino acids that are essential for activity or the most abundant amino acids in the TAD. Transcription activation domains are generally divided into acidic activation domains, glutamine-rich domains, proline-rich domains and isoleucine-rich domains.
  • “gRNA” and “guide RNA” are used interchangeably and refer to an RNA molecule that can form a complex with the CRISPR nuclease and, owing to a degree of complementarity with the target sequence, can direct the complex to that target sequence.
  • In Cas9-based systems, the gRNA is usually composed of crRNA and tracrRNA molecules that are partially complementary and form a complex, where the crRNA contains a sequence that is sufficiently complementary to the target sequence to hybridize with it and to guide the CRISPR complex (Cas9+crRNA+tracrRNA) to bind specifically to the target sequence.
  • A single guide RNA (sgRNA) can also be designed, which combines the features of both crRNA and tracrRNA.
  • In Cpf1-based systems, the gRNA is usually composed of a mature crRNA molecule only, where the sequence contained in the crRNA is sufficiently identical to the target sequence to hybridize with the complement of the target sequence and direct the complex (Cpf1+crRNA) to bind specifically to the target sequence.
  • Designing a suitable gRNA sequence based on the CRISPR nuclease used and the target sequence to be edited is within the ability of those skilled in the art.
  • “Dead sgRNA” or “dsgRNA” refers to an sgRNA that has a spacer sequence (target sequence) of only 14 or 15 nt and can therefore guide Cas9 to a target site without inducing a double-strand break (DSB).
  • “Chromatin” refers to the linear composite structure composed of DNA, histones, non-histone proteins, and a small amount of RNA in the interphase cell nucleus; it is the form taken by the genetic material of interphase cells.
  • During cell division, the chromatin of eukaryotic cells condenses into rod-shaped chromosomes.
  • DNA regions that readily bind other proteins are called “open chromatin regions”; DNA regions that are difficult for other proteins to bind are called “closed chromatin regions”.
  • “Genome” covers not only chromosomal DNA present in the nucleus, but also organelle DNA present in subcellular components of the cell (such as mitochondria and plastids).
  • “Cell” includes cells of any organism suitable for genome editing.
  • Examples of organisms include, but are not limited to, mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle, and cats; poultry such as chickens, ducks, and geese; and plants, including monocots and dicots, for example rice, corn, wheat, sorghum, barley, soybean, peanut, Arabidopsis and so on.
  • “Genetically modified organism” or “genetically modified cell” means an organism or cell that contains exogenous polynucleotides or modified genes or expression control sequences in its genome.
  • exogenous polynucleotides can be stably integrated into the genome of organisms or cells, and inherited for successive generations.
  • the exogenous polynucleotide can be integrated into the genome alone or as part of a recombinant DNA construct.
  • the modified gene or expression control sequence contains single or multiple deoxynucleotide substitutions, deletions and additions in the organism or cell genome.
  • “Exogenous” in terms of sequence means a sequence from a foreign species or, if from the same species, a sequence whose composition and/or locus has been significantly altered from its natural form through deliberate human intervention.
  • “Polynucleotide” and “nucleic acid sequence” are used interchangeably and refer to single-stranded or double-stranded RNA or DNA polymers, optionally containing synthetic, non-natural or altered nucleotide bases.
  • Nucleotides are referred to by their single-letter names as follows: “A” is adenosine or deoxyadenosine (for RNA or DNA, respectively), “C” is cytidine or deoxycytidine, “G” is guanosine or deoxyguanosine, “U” is uridine, “T” is deoxythymidine, “R” is purine (A or G), “Y” is pyrimidine (C or T), “K” is G or T, “H” is A or C or T, “I” is inosine, and “N” is any nucleotide.
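The ambiguity codes in this paragraph can be used directly for pattern matching. The following is a minimal Python sketch, assuming DNA-only input (so “U” and “I” are left out); the helper names `iupac_to_regex` and `matches` are hypothetical and chosen only for illustration.

```python
import re

# Ambiguity codes as listed in the paragraph above (DNA alphabet only).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "K": "[GT]",
    "H": "[ACT]",
    "N": "[ACGT]",  # any nucleotide
}

def iupac_to_regex(pattern: str) -> re.Pattern:
    """Compile a pattern written with the codes above into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in pattern.upper()))

def matches(pattern: str, sequence: str) -> bool:
    """Return True if the whole sequence is compatible with the pattern."""
    return iupac_to_regex(pattern).fullmatch(sequence.upper()) is not None
```

For example, `matches("NGG", "TGG")` is true because “N” accepts any nucleotide, while `matches("RY", "CA")` is false because “C” is not a purine.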
  • “Polypeptide”, “peptide”, and “protein” are used interchangeably in the present invention and refer to a polymer of amino acid residues.
  • The term applies to amino acid polymers in which one or more amino acid residues are corresponding artificial chemical analogs of naturally occurring amino acids, as well as to naturally occurring amino acid polymers.
  • The terms “polypeptide”, “peptide”, “amino acid sequence” and “protein” may also include modified forms, including but not limited to glycosylation, lipid linkage, sulfation, gamma carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation.
  • expression construct refers to a vector, such as a recombinant vector, suitable for expression of a nucleotide sequence of interest in an organism.
  • “Expression” refers to the production of a functional product.
  • the expression of a nucleotide sequence may refer to the transcription of the nucleotide sequence (such as transcription to generate mRNA or functional RNA) and/or the translation of RNA into a precursor or mature protein.
  • the "expression construct" of the present invention can be a linear nucleic acid fragment, a circular plasmid, a viral vector, or, in some embodiments, can be an RNA (such as mRNA) that can be translated.
  • the "expression construct" of the present invention may contain regulatory sequences and nucleotide sequences of interest from different sources, or regulatory sequences and nucleotide sequences of interest from the same source but arranged in a way different from those normally occurring in nature.
  • “Regulatory sequence” and “regulatory element” can be used interchangeably and refer to nucleotide sequences located upstream (5' non-coding sequences), within, or downstream (3' non-coding sequences) of a coding sequence that affect the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include, but are not limited to, promoters, translation leader sequences, introns, and polyadenylation recognition sequences.
  • Promoter refers to a nucleic acid fragment capable of controlling the transcription of another nucleic acid fragment.
  • the promoter is a promoter capable of controlling gene transcription in a cell, regardless of whether it is derived from the cell.
  • the promoter can be a constitutive promoter or a tissue-specific promoter or a developmentally regulated promoter or an inducible promoter.
  • “Tissue-specific promoter” and “tissue-preferred promoter” are used interchangeably and refer to a promoter that is expressed mainly, but not necessarily exclusively, in one tissue or organ, and that may also be expressed in a specific cell or cell type.
  • “Developmentally regulated promoter” refers to a promoter whose activity is determined by developmental events.
  • “Inducible promoters” selectively express operably linked DNA sequences in response to endogenous or exogenous stimuli (environment, hormones, chemical signals, etc.).
  • “Operably linked” refers to the connection of regulatory elements (for example, but not limited to, promoter sequences, transcription termination sequences, etc.) to nucleic acid sequences (for example, coding sequences or open reading frames) such that the transcription of the nucleotide sequence is controlled and regulated by the transcription control element.
  • Introducing a nucleic acid molecule (such as a plasmid, linear nucleic acid fragment, RNA, etc.) or protein into an organism refers to transforming the cell of the organism with the nucleic acid or protein so that the nucleic acid or protein can function in the cell.
  • the "transformation” used in the present invention includes stable transformation and transient transformation.
  • Stable transformation refers to the introduction of exogenous nucleotide sequences into the genome, resulting in stable inheritance of the exogenous gene. Once stably transformed, the exogenous nucleic acid sequence is stably integrated into the genome of the organism and any successive generations thereof.
  • Transient transformation refers to the introduction of nucleic acid molecules or proteins into cells to perform functions without stable inheritance of foreign genes. In transient transformation, the foreign nucleic acid sequence is not integrated into the genome.
  • the present invention provides a genome editing fusion polypeptide, which comprises a CRISPR nuclease domain and a transcription activation domain.
  • the CRISPR nuclease of the present invention can be any CRISPR nuclease that can realize genome editing.
  • the CRISPR nuclease is Cas9 or an active fragment thereof, such as Cas9 from Streptococcus pyogenes (SpCas9), Cas9 from Staphylococcus aureus (SaCas9), Cas9 from Francisella novicida (FnCas9), Cas9 (CjCas9) from Campylobacter jejuni and Cas9 (NcCas9) from Neisseria cinerea.
  • The CRISPR nuclease is Cpf1 or an active fragment thereof, such as Cpf1 from Francisella novicida U112 (FnCpf1), Cpf1 from Acidaminococcus sp. BV3L6, and Cpf1 from Lachnospiraceae bacterium ND2006 (LbCpf1).
  • the transcription activation domain (TAD) used in the present invention is not particularly limited as long as it can realize the function of opening chromatin.
  • the transcription activation domain (TAD) comprises an acidic activation domain, a glutamine-rich domain, a proline-rich domain, an isoleucine-rich domain, and any combination thereof.
  • The acidic activation domain is rich in aspartic acid and glutamic acid; examples include, but are not limited to, the TADs of Gal4, Oaf1, Leu3, Rtg3, Pho4, Gln3 and Gcn4 from yeast, the TADs of p53, NFAT and NF-κB from mammals, and the TAD of VP16.
  • the glutamine-rich domain contains multiple repeating sequences similar to "QQQXXXQQQ", including but not limited to TAD from POU2F1 (Oct1), POU2F2 (Oct2) and Sp1.
  • the proline-rich domain contains repeating sequences similar to "PPPXXXPPP”, including but not limited to TAD from c-jun, AP2 and Oct-2.
  • the isoleucine-rich domain contains the repeating sequence "IIXXII", for example, TAD from NTF-1.
  • the transcription activation domain comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more copies of the same or different TAD.
  • the transcription activation domain comprises one or more VP16-TAD.
  • the transcription activation domain comprises one or more transcription activator-like effector TADs (TALE-TAD).
  • the transcription activation domain comprises one or more VP16-TAD and one or more transcription activator-like effector TAD (TALE-TAD).
  • the transcription activation domain contains 8 copies of VP16-TAD and 6 copies of TALE-TAD.
  • the transcription activation domain comprises the amino acid sequence of SEQ ID NO:1.
  • the transcription activation domain consists of the amino acids of SEQ ID NO:1.
  • the transcription activation domain and the CRISPR nuclease domain may be directly or indirectly fused.
  • the transcription activation domain is directly fused to the CRISPR nuclease domain.
  • the transcription activation domain and the CRISPR nuclease domain may be fused indirectly, for example, connected via a linker.
  • The linker may be a non-functional amino acid sequence, 1-50 (e.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or 20-25, 25-50) or more amino acids in length, with no secondary or higher-order structure.
  • the linker may be a flexible linker, such as GGGGS, GS, GAP, (GGGGS)x3, GGS and (GGS)x7, etc.
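The repeat notation used for the flexible linkers can be expanded mechanically. The sketch below is illustrative only: it assumes the notation is either a plain amino acid string or `(UNIT)xN`, and the function names `expand_linker` and `within_typical_length` are hypothetical. The 1-50 bound is the typical range given above, which the text itself treats as non-limiting ("or more").

```python
import re

def expand_linker(notation: str) -> str:
    """Expand notation such as '(GGS)x7' into the full amino acid string."""
    m = re.fullmatch(r"\(([A-Z]+)\)x(\d+)", notation)
    if m is None:
        return notation  # already a plain sequence, e.g. 'GGGGS'
    unit, copies = m.group(1), int(m.group(2))
    return unit * copies

def within_typical_length(linker: str, lo: int = 1, hi: int = 50) -> bool:
    """Check the linker against the typical 1-50 residue range."""
    return lo <= len(linker) <= hi

if __name__ == "__main__":
    for name in ["GGGGS", "GS", "GAP", "(GGGGS)x3", "GGS", "(GGS)x7"]:
        seq = expand_linker(name)
        print(name, seq, len(seq), within_typical_length(seq))
```

All six linkers listed above fall inside the typical range; the longest, (GGS)x7, is 21 residues.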
  • the transcription activation domain is located at the N-terminus or C-terminus of the CRISPR nuclease domain. In some embodiments, the transcription activation domain is fused to the N-terminus of the CRISPR nuclease domain. In some embodiments, the transcription activation domain is fused to the C-terminus of the CRISPR nuclease domain.
  • the polypeptide further comprises a nuclear localization sequence (NLS).
  • one or more NLS in the polypeptide should have sufficient strength to drive the accumulation of the polypeptide in an amount that can achieve its genome editing function in the nucleus of the plant cell.
  • the strength of nuclear localization activity is determined by the number and location of NLS in the polypeptide, one or more specific NLS used, or a combination of these factors.
  • the NLS of the polypeptide of the present invention may be located at the N-terminal and/or C-terminal. In some embodiments of the present invention, the NLS of the polypeptide of the present invention may be located between the transcription activation domain and the CRISPR nuclease domain. In some embodiments, the polypeptide comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more NLS. In some embodiments, the polypeptide comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more NLS at or near the N-terminus. In some embodiments, the polypeptide comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more NLS at or near the C-terminus.
  • The polypeptide includes a combination of these, such as one or more NLS at the N-terminus and one or more NLS at the C-terminus. When there is more than one NLS, each can be selected independently of the others.
  • the polypeptide comprises at least 2 NLS, for example, the at least 2 NLS are located at the C-terminus. In some preferred embodiments, the NLS is located at the C-terminus of the polypeptide. In some preferred embodiments, the polypeptide contains at least 3 NLS. In a more preferred embodiment, the polypeptide comprises at least 3 NLS at the C-terminus. In some preferred embodiments, the polypeptide does not comprise NLS at the N-terminus and/or between the transcription activation domain and the CRISPR nuclease domain.
  • NLS consists of one or more short sequences of positively charged lysine or arginine exposed on the surface of the protein, but other types of NLS are also known.
  • Non-limiting examples of NLS include: KKRKV (nucleotide sequence 5'-AAGAAGAGAAAGGTC-3'), PKKKRKV (nucleotide sequence 5'-CCCAAGAAGAAGAGGAAGGTG-3' or CCAAAGAAGAAGAGGAAGGTT), or SGGSPKKKRKV (nucleotide sequence 5'- TCGGGGGGGAGCCCAAAGAAGAAGCGGAAGGTG-3').
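The pairing between each NLS peptide and its listed nucleotide sequence can be checked by translating the DNA with the standard genetic code. This is a minimal sketch; the names `CODON_TABLE`, `translate` and `NLS_EXAMPLES` are illustrative, and the sequences are copied from the paragraph above.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def translate(dna: str) -> str:
    """Translate a coding DNA sequence (reading frame 1) into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

# Nucleotide sequence -> expected peptide, as given in the text.
NLS_EXAMPLES = {
    "AAGAAGAGAAAGGTC": "KKRKV",
    "CCCAAGAAGAAGAGGAAGGTG": "PKKKRKV",
    "CCAAAGAAGAAGAGGAAGGTT": "PKKKRKV",
    "TCGGGGGGGAGCCCAAAGAAGAAGCGGAAGGTG": "SGGSPKKKRKV",
}

if __name__ == "__main__":
    for dna, peptide in NLS_EXAMPLES.items():
        print(peptide, translate(dna) == peptide)
```

The two sequences given for PKKKRKV differ only in synonymous codons (CCC/CCA for proline, GTG/GTT for valine), so both translate to the same peptide.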
  • the polypeptide comprises two nuclear localization sequences.
  • One nuclear localization sequence is located at the N-terminus of the CRISPR nuclease domain or active fragment thereof, and one nuclear localization sequence is located between the C-terminus of the CRISPR nuclease domain or active fragment thereof and the N-terminus of the transcription activation domain.
  • the polypeptide of the present invention comprises the amino acid sequence of SEQ ID NO: 2. More preferably, the polypeptide consists of the amino acids of SEQ ID NO: 2.
  • the invention also provides isolated polynucleotides encoding the polypeptides of the invention.
  • the polynucleotide comprises the nucleotide sequence of SEQ ID NO: 3 or a degenerate variant thereof.
  • the polynucleotide consists of the nucleotide sequence of SEQ ID NO: 3 or a degenerate variant thereof.
  • the polynucleotide is codon-optimized for the edited organism, such as a plant.
  • Codon optimization refers to modifying a nucleic acid sequence to enhance expression in the host cell of interest by replacing at least one codon (e.g., about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50 or more codons) of the natural sequence with a codon that is used more frequently or most frequently in the genes of the host cell, while maintaining the natural amino acid sequence.
  • Codon preference (the difference in codon usage between organisms) is often related to the translation efficiency of messenger RNA (mRNA), and the translation efficiency is considered to depend on, among other factors, the nature of the codon being translated and the availability of specific transfer RNA (tRNA) molecules.
  • Genes can therefore be tailored for optimal gene expression in a given organism based on codon optimization. Codon usage tables are readily available, for example the “Codon Usage Database” at www.kazusa.or.jp/codon/, and these tables can be adapted in different ways. See Nakamura Y. et al., “Codon usage tabulated from the international DNA sequence databases: status for the year 2000”, Nucl. Acids Res., 28:292 (2000).
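The idea of replacing codons while keeping the amino acid sequence unchanged can be sketched as follows. This is a simplified illustration, not the procedure used in the invention: the `PREFERRED` mapping is a hypothetical placeholder, and real preferences would be taken from a codon usage table for the host organism. The function name `recode` is likewise illustrative.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def translate(dna: str) -> str:
    """Translate a coding DNA sequence (reading frame 1)."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

def recode(dna: str, preferred: dict) -> str:
    """Swap each codon for the preferred synonymous codon, if one is given."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        new_codon = preferred.get(amino_acid, codon)
        if CODON_TABLE[new_codon] != amino_acid:
            raise ValueError(f"{new_codon} does not encode {amino_acid}")
        out.append(new_codon)
    return "".join(out)

# Hypothetical preferences for illustration only (not real usage data).
PREFERRED = {"P": "CCC", "K": "AAG", "R": "AGG", "V": "GTG"}

if __name__ == "__main__":
    original = "CCAAAGAAGAAGAGGAAGGTT"  # encodes PKKKRKV
    optimized = recode(original, PREFERRED)
    print(optimized, translate(original) == translate(optimized))
```

The check at the end confirms the defining property of codon optimization as described above: the nucleotide sequence changes but the encoded amino acid sequence does not.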
  • the present invention provides an improved genome editing system, which comprises at least one of the following i) to v):
  • the guide RNA is a sgRNA
  • the sgRNA targets a closed chromatin region.
  • the method of constructing a suitable sgRNA based on a given target sequence is known in the art. See, for example, Wang, Y. et al. Simultaneous editing of three homoeoalleles in hexaploid bread wheat confers heritable resistance to powdery mildew. Nat. Biotechnol. 32, 947-951 (2014); and Shan, Q. et al.
  • the design of a target sequence that can be recognized and targeted by the CRISPR nuclease and the guide RNA complex belongs to the skill of those of ordinary skill in the art.
  • the target sequence is a sequence complementary to the guide sequence of about 20 nucleotides contained in the guide RNA, and its 3' end is immediately adjacent to the protospacer adjacent motif (PAM).
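As a rough illustration of the target-site layout described above, the following Python sketch scans one DNA strand for 20-nucleotide stretches that are immediately followed at their 3' end by a PAM. It assumes the NGG PAM of S. pyogenes Cas9 and searches the forward strand only; the function name `find_targets` and the two-base flanks added to the example sequence are hypothetical. The 23-nt core of the example is the site 28 sequence listed in the off-target table of this document.

```python
import re


def find_targets(seq, guide_len=20):
    """Return (start, protospacer, pam) for each forward-strand site whose
    protospacer is immediately followed at its 3' end by an NGG PAM."""
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Site 28 (GTCTTTGGACGTAGCCATGG + TGG) with arbitrary two-base flanks.
example = "TT" + "GTCTTTGGACGTAGCCATGGTGG" + "AA"
hits = find_targets(example)
print(hits)
```

A complete design tool would also scan the reverse complement and score candidates for off-target risk; neither is attempted here.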
  • the scaffold sequence of the guide RNA of the present invention is shown in SEQ ID NO: 4.
  • the CRISPR system of the present invention further comprises or encodes a dsgRNA whose target site is 30-300 bp away from the sgRNA target site, preferably 40-270 bp, most preferably 115-120 bp.
  • the dsgRNA contains a guide sequence of only 14 or 15 nucleotides. That is, the dsgRNA targets a target sequence of only 14 or 15 nucleotides. Such a dsgRNA can direct the CRISPR nuclease to its target sequence but cannot cause cleavage.
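The two dsgRNA constraints stated above (a 14 or 15 nucleotide guide, and a spacing of 30-300 bp from the sgRNA target site, with 40-270 bp preferred and 115-120 bp most preferred) can be written as a small check. This is a sketch only: the function name `classify_dsgrna` is hypothetical, and it assumes the distance has already been computed, since the text here does not specify between which positions it is measured.

```python
def classify_dsgrna(spacer, distance_bp):
    """Check the guide length and classify the sgRNA-dsgRNA spacing
    against the ranges stated in the text."""
    if len(spacer) not in (14, 15):
        return "not a dsgRNA (guide must be 14 or 15 nt)"
    if 115 <= distance_bp <= 120:
        return "most preferred (115-120 bp)"
    if 40 <= distance_bp <= 270:
        return "preferred (40-270 bp)"
    if 30 <= distance_bp <= 300:
        return "within range (30-300 bp)"
    return "outside 30-300 bp"


# 14-nt guide of the dsgRNA paired with sgRNA 2 in the table of this
# document (GACATCATCTGGCA followed by a GGG PAM), 50 bp from the sgRNA site.
print(classify_dsgrna("GACATCATCTGGCA", 50))
```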
  • the CRISPR system of the present invention includes at least one of ii) to v) above.
  • the nucleotide sequence encoding the polypeptide of the present invention and/or the nucleotide sequence encoding the guide RNA is operably linked to an expression control sequence, preferably a plant expression control sequence, such as a promoter.
  • Examples of promoters include, but are not limited to: cauliflower mosaic virus 35S promoter (Odell et al. (1985) Nature 313:810-812), maize Ubi-1 promoter, wheat U6 promoter, rice U3 promoter, corn U3 promoter, rice actin promoter, TrpPro5 promoter (US Patent Application No. 10/377,318; filed March 16, 2005), pEMU promoter (Last et al. (1991) Theor. Appl. Genet. 81:581-588), MAS promoter (Velten et al. (1984) EMBO J. 3:2723-2730), maize H3 histone promoter (Lepetit et al.
  • the promoters that can be used in the present invention also include commonly used tissue-specific promoters reviewed in Moore et al. (2006) Plant J.45(4):651-683.
  • the construct of the present invention includes a rice U3 promoter, which includes the nucleotide sequence shown in SEQ ID NO: 5.
  • the present invention provides a method for genetically modifying cells, including introducing the genome editing system of the present invention into the cells.
  • the target sequence to be modified can be located anywhere in the genome, for example in a functional gene such as a protein-coding gene, or in a gene expression control region such as a promoter region or an enhancer region, so as to modify gene function or gene expression.
  • the target sequence is located in a closed chromatin region.
  • substitution, deletion and/or addition in the cell target sequence can be detected by T7EI, PCR/RE or sequencing methods.
  • the genome editing system can be introduced into cells by various methods well known to those skilled in the art.
  • Methods that can be used to introduce the genome editing system of the present invention into cells include, but are not limited to: calcium phosphate transfection, protoplast fusion, electroporation, liposome transfection, microinjection, viral infection (such as baculovirus, vaccinia virus, adenovirus and other viruses), gene gun bombardment, PEG-mediated transformation of protoplasts, and Agrobacterium-mediated transformation.
  • Cells that can be genome edited by the method of the present invention can be derived from, for example, mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle, and cats; poultry such as chickens, ducks, and geese; and plants, including monocotyledonous and dicotyledonous plants, such as rice, corn, wheat, sorghum, barley, soybean, peanut, and Arabidopsis.
  • the cell is a plant cell, such as a rice cell.
  • the methods of the invention are performed in vitro.
  • the cell is an isolated cell.
  • the method of the present invention can also be performed in vivo.
  • the cell is a cell in a living body, and the system of the present invention can be introduced into the cell in vivo by, for example, a virus-mediated method.
  • the cell is a germ cell.
  • the cell is a somatic cell.
  • the present invention also provides a genetically modified organism, which comprises a genetically modified cell produced by the method of the present invention.
  • the organisms include, but are not limited to, mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cows, and cats; poultry such as chickens, ducks, and geese; and plants, including monocots and dicots, such as rice, corn, wheat, sorghum, barley, soybean, peanut, and Arabidopsis.
  • the organism is a plant, preferably rice.
  • VP64 4 copies of VP16-TAD
  • 2TAL (2 copies of TALE-TAD)
  • SEQ ID NO: 6 and SEQ ID NO: 7 GenScript, Nanjing, China.
  • the VP64 coding sequence was fused to the 3' end of Cas9 by overlap PCR, and an Avr II site was introduced between Cas9 and VP64.
  • the Cas9-VP64 fusion gene was cloned into pJIT163 to generate p163-Cas9-VP64.
  • the transfected protoplasts were incubated at 28°C. After 48 hours, the protoplasts were collected, and the genomic DNA was extracted by the CTAB method (see Murray & Thompson (1980). Rapid isolation of high molecular weight plant DNA. Nucleic Acids Res 8, 4321-4325).
  • Genomic DNA extracted from protoplasts is used as a PCR template.
  • specific primers are used to amplify the genomic region flanking the CRISPR target site.
  • in the second round of PCR, primers are used to amplify a 150-250 bp product and to add forward and reverse barcodes to the first-round PCR products.
  • equal amounts of the final PCR products were pooled and sequenced by paired-end sequencing on the Illumina NextSeq 500 platform (GENEWIZ, Suzhou, China), and indels at the sgRNA target sites were detected. Sequencing of each amplicon was repeated three times, using genomic DNA from three independent protoplast samples.
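The figure descriptions in this document report indel frequencies from three independent replicates as mean ± s.e.m. A minimal sketch of that summary is given below; the read counts are hypothetical, the function names are illustrative, and the s.e.m. is assumed to be the sample standard deviation divided by the square root of the number of replicates.

```python
from math import sqrt
from statistics import mean, stdev


def indel_frequency(indel_reads, total_reads):
    """Percentage of amplicon reads carrying an indel at the target site."""
    return 100.0 * indel_reads / total_reads


def mean_sem(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical (indel reads, total reads) for three protoplast samples.
replicates = [(120, 1000), (150, 1000), (180, 1000)]
freqs = [indel_frequency(i, t) for i, t in replicates]
m, sem = mean_sem(freqs)
print(f"indel frequency: {m:.2f} ± {sem:.2f} % (n={len(freqs)})")
```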
  • the micro-sample DNase I digestion assay was performed as previously reported (see Lu et al. (2016). Establishing Chromatin Regulatory Landscape during Mouse Preimplantation Development. Cell 165, 1375-1388).
  • the transfected protoplasts were cultured at 28°C for 24 hours. Samples of 4 × 10⁵ transfected rice protoplasts were resuspended in 45 μL lysis buffer (10 mM Tris-HCl [pH 7.5], 10 mM NaCl, 3 mM MgCl2, 0.1% Triton X-100) and incubated on ice for 5 min, after which DNase I (1000 U/mL, Sigma, AMPD1-1KT) was added to a final concentration of 2 U/mL.
  • the sample was incubated at 37°C for another 5 minutes, and the reaction was then terminated by adding 50 μL of stop buffer (10 mM Tris-HCl [pH 7.5], 10 mM NaCl, 0.15% SDS, 10 mM EDTA) containing 1 U of proteinase K, followed by incubation at 55°C for 1 hour. Genomic DNA was extracted from each sample by the phenol-chloroform method (see Sambrook & Russell (2006). Purification of nucleic acids by extraction with Phenol: Chloroform. CSH Protoc 2006: pdb.prot4455) and analyzed by real-time qPCR (SYBR Premix Ex Taq™ II, Takara).
  • the online tool CRISPR-P (see Liu et al. (2017). CRISPR-P 2.0: An Improved CRISPR-Cas9 Tool for Genome Editing in Plants. Mol Plant 10, 530-532) was used to predict potential off-target sites of sgRNA 24, 28, 34, and 38.
  • Locus-specific primers for these sites are designed to generate PCR products of about 150 to 250 bp.
  • specific primers are used to amplify the genomic regions flanking the target and off-target sites.
  • the resulting PCR product is used as a template for the second round of PCR, and a barcode is added to each end of the PCR product.
  • the PCR products were then combined in equal amounts for next-generation sequencing, and indels at the on-target and potential off-target sites were examined. Sequencing of each amplicon was repeated three times, using genomic DNA from three independent protoplast samples.
  • five independent spacers (sgRNA A to E) were identified whose sequences are present in both open and closed chromatin regions (Table 3).
  • the selected sgRNAs each target two genomic sites with opposite chromatin states.
  • the synthetic transcription activation domain (hereinafter referred to as TV), which contains 6 copies of TALE (transcription activator-like effector)-TAD (transcription activation domain) and 8 copies of VP16, was fused to the C-terminus of Cas9 to generate Cas9-TV (Figure 3a).
  • 20 sgRNAs targeting different chromatin regions (Table 3) were used to study the genome editing efficiency of Cas9-TV in rice protoplasts.
  • Example 4. Using dsgRNA for proximal targeting to improve genome editing
  • sgRNA a | dsgRNA targeting sequence b | Distance c
    sgRNA 2 | GACATCATCTGGCAGGG | 50 bp
    sgRNA 4 | TGCAGGCTTCACGACGG | 32 bp
    sgRNA 6 | TGACCTGATGCCCAAGG | 55 bp
    sgRNA 8 | GCGCTGGTGCTTGCTGG | 57 bp
    sgRNA 10 | CTTCGCGCTCCATGG | 35 bp
    sgRNA 12 | GGCGTGGGCAAGAGCGG | 39 bp
    sgRNA 14 | TACAAGCTCAAGCTCGG | 50 bp
    sgRNA 16 | GGACCTTGGACTCGAGG | 55 bp
    sgRNA 18 | ACCTGATTGGGTGAAGG | 60 bp
    sgRNA 20 | TATGGTAGCGAGCGTGG | 68 bp
    sgRNA 22 | AACAGCTAGGCTCTTGG | 39 bp
    sgRNA 24 | ACTGCAGGCTGCAGG | 59 bp
    sgRNA 26 | ACTCATCGGTGTGTAGG | 92 bp
    sgRNA 28 | GTTGATGGACGAGGTGG
  • a, sgRNA is the same as in Table 2; b, 14 nt guide sequence + PAM; c, the distance between the dsgRNA targeting site and the sgRNA targeting site, expressed in bp.
  • the distance between the sgRNA targeting site and the dsgRNA binding site ranges from 32 to 92 bp.
  • the proximal dsgRNA increased the editing efficiency of all target sites (Figure 4a).
  • the indel frequency obtained by combining Cas9-TV with proximal dsgRNA was 1.5 times higher than Cas9-TV and 2.5 times higher than Cas9 (Figure 4b).
  • Proximal dsgRNA promotes Cas9-TV editing in open and closed chromatin regions (Figure 4c, d), and does not affect the pattern of Cas9-TV-induced indels (Figure 11).
  • dsgRNA 1, 2, 6 and dsgRNA 3, 4, 5 were designed to target sites on either side of the PAM sequence of sgRNA 34.
  • the distance between dsgRNA and sgRNA binding sites ranges from 47 to 266 bp (Figure 5).
  • Each dsgRNA or dsgRNA pair was co-transformed with Cas9-TV and the corresponding sgRNA into rice protoplasts, and the indel frequency was measured by targeted deep sequencing.
  • the off-target effects of Cas9-TV and Cas9-TV/dsgRNA were assessed with sgRNA 24, 28, 34 and 38 by sequencing targeted amplicons at the on-target and off-target sites to measure indel frequencies.
  • OT off-target
  • Target site | Sequence a | Target gene locus
    Site 24 | ACGGCCGCCTCCGTACGCCGCGG | LOC_Os04g18650
    OT24-1 | ACGGCCGC T TCCG C ACGCCGCGG | LOC_Os03g05590
    OT24-2 | C CG CT CGCC C CCGTACGCCGCGG | LOC_Os06g11400
    OT24-3 | G CGGCCGC GG CCGTACGC T GGGG | LOC_Os01g73410
    Site 28 | GTCTTTGGACGTAGCCATGGTGG | LOC_Os04g12220
    OT28-1 | GTCTTTG C AC A TAGCCATGGCGG | LOC_Os05g04110
    OT28-2 | GTCTTT T GA T G C AGC A ATGGAGG | LOC_Os01g56140
    OT28-3 | GT T TTTGGAC T TAGCCA A GGAGG | LOC_Os04g57390
    Site 34 | AGACATCGTCACCAAGGCGCAGG | LOC_Os11g08760
    OT34
  • Cas9-TV has higher on-target activity than Cas9 (Figure 9).

Abstract

Provided is a genome editing fusion polypeptide comprising a CRISPR nuclease domain and a transcription activation domain. Also provided are a polynucleotide and an expression construct encoding the fusion polypeptide, a genome editing system comprising the fusion polypeptide, polynucleotide and/or expression construct, and a method for editing the genome of a cell by means of the genome editing system.

Description

Improved genome editing system and use thereof
Field of the invention
The present invention relates to the field of genome editing. Specifically, the present invention relates to an improved genome editing system and use thereof. More specifically, the present invention provides a genome editing fusion polypeptide comprising a CRISPR nuclease domain and a transcription activation domain. The present invention also provides a polynucleotide or expression construct encoding the polypeptide, and a genome editing system comprising the polypeptide, polynucleotide and/or construct. The present invention also provides a method for editing the genome of a cell using the genome editing system.
Background of the invention
The CRISPR/Cas9 system has been widely and successfully used for genome engineering in many eukaryotic species. However, in animal and plant cells, editing efficiency varies greatly between genomic sites. The low CRISPR/Cas9 editing efficiency at certain sites limits the availability of targets in vivo and thus limits further applications.
Unlike prokaryotic DNA, eukaryotic genomic DNA is wrapped around histones and further compacted into higher-order chromatin structures that may prevent Cas9 from binding to its target. Genome-wide mapping of the binding sites of catalytically inactive Cas9 (dCas9) in mammalian cells showed that the binding sites are enriched in open chromatin regions. In addition, in human cells, CRISPR/Cas9 induces more insertions and deletions (indels) in open chromatin regions. In vitro and in vivo experiments have demonstrated that Cas9 binding and cleavage are inhibited by nucleosomes, the basic units of chromatin. Consistent with this, in HEK293T cells, HeLa cells and human fibroblasts, Cas9-mediated genome editing is more efficient in euchromatic regions than in heterochromatic regions. Interestingly, chromatin structure has a more pronounced inhibitory effect on the off-target activity of CRISPR/Cas9. In contrast, chromatin accessibility was not found to affect CRISPR/Cas9 activity in zebrafish. Whether chromatin accessibility affects Cas9 editing in plant cells is unclear.
Several studies have attempted to alter local accessibility in order to improve Cas9 activity in vivo. The proxy-CRISPR strategy uses an additional catalytically inactive SpCas9 (dCas9) that binds at a nearby position. This makes the target site accessible to FnCas9, CjCas9, NcCas9 and FnCpf1, thereby improving editing efficiency. However, this method relies on the genome being accessible to SpCas9 and requires co-expression of two different CRISPR-Cas systems, which inevitably increases vector size and the difficulty of in vivo application.
Recently, a method called CRISPR-chrom, in which Cas9 orthologs are fused to chromatin-modulating peptides (CMPs), was reported to significantly improve Cas9 editing efficiency, especially at refractory sites. CMPs are truncated forms of endogenous proteins, and it is not yet clear whether their overexpression has dominant-negative effects.
There is a need in the art for further methods that improve the accessibility of genomic DNA in eukaryotes, especially plants, so as to increase editing efficiency.
Summary of the invention
In one aspect, the present invention provides a genome editing fusion polypeptide comprising a CRISPR nuclease domain and a transcription activation domain.
In another aspect, the present invention provides an isolated polynucleotide encoding the genome editing fusion polypeptide of the present invention.
In another aspect, the present invention provides an expression vector comprising the polynucleotide of the present invention.
In another aspect, the present invention provides a host cell comprising the polynucleotide or expression vector of the present invention.
In another aspect, the present invention provides a genome editing system comprising at least one of the following i) to v):
i) the genome editing fusion polypeptide of the present invention, and a guide RNA;
ii) the expression construct of the present invention, and a guide RNA;
iii) the genome editing fusion polypeptide of the present invention, and an expression construct comprising a nucleotide sequence encoding a guide RNA;
iv) the expression construct of the present invention, and an expression construct comprising a nucleotide sequence encoding a guide RNA;
v) an expression construct comprising the polynucleotide of the present invention and a nucleotide sequence encoding a guide RNA.
In some embodiments, the genome editing system of the present invention further comprises or encodes a dsgRNA whose target site is 30-300 bp away from the sgRNA target site, preferably 40-270 bp, most preferably 115-120 bp.
In another aspect, the present invention provides a host cell comprising the polynucleotide or expression vector of the present invention or the genome editing system of the present invention.
In another aspect, the present invention provides a method for genetically modifying a cell, comprising introducing the genome editing system of the present invention into the cell, preferably a plant cell.
Brief description of the drawings
Figure 1 shows the effect of chromatin accessibility on Cas9 genome editing efficiency in rice. Figure 1a summarizes the number of CRISPR/Cas9-mediated mutations and the chromatin accessibility at 70 target sites. Mutagenesis efficiency was measured by PCR/RE in regenerated T0 rice plants. The accessibility of each target site was obtained from the high-resolution map of rice DNase I hypersensitive (DH) sites generated by Zhang et al., 2012. Figure 1b shows the indel frequencies at 40 target sites in 20 rice genes, measured in protoplasts. Two sites in each gene were targeted by independent sgRNAs. Indel frequencies were measured by sequencing targeted amplicons. Data are from three independent biological replicates (n=3) and are shown as mean ± s.e.m. Figure 1c summarizes the insertion frequencies and chromatin states of the 40 target sites in Figure 1b. P values were calculated by two-tailed Mann-Whitney test. **P<0.01, ***P<0.001.
Figure 2 shows that Cas9 editing in rice is more efficient in open chromatin regions than in closed chromatin regions. Panel a compares, pair by pair, the indel frequencies at sgRNA target sites in open and closed chromatin regions. Indel frequencies in rice protoplasts were measured by sequencing targeted amplicons. Data are from three independent biological replicates (n=3) and are shown as mean ± s.e.m. Panel b summarizes the Cas9 editing efficiencies in panel a. P values were calculated by two-tailed Mann-Whitney test, *P<0.05. Panel c shows that, in the absence of chromatin, Cas9 cleaves all 10 target sites equally. PCR products containing the corresponding target sites were incubated with Cas9 ribonucleoprotein (RNP) complexes, then visualized and quantified on agarose gels. Data are from three independent biological replicates (n=3) and are shown as mean ± s.e.m. Panel d shows the indel patterns generated at the 10 target sites. All experiments were repeated three times with similar results.
Figure 3 shows that fusing a synthetic transcription activation domain to Cas9 improves its editing efficiency. Panel a is a schematic diagram of the structure of the fusion of the transcription activation domain with Cas9 (Cas9-TV). Panel b shows the indel frequencies induced by Cas9 and Cas9-TV at 20 target sites in rice protoplasts. Untreated protoplast samples were used as controls. Data are from three independent biological replicates (n=3) and are shown as mean ± s.e.m. Panel c shows the indel frequencies induced by Cas9 and Cas9-TV at the 20 target sites. Panel d shows the insertion frequencies induced by Cas9 and Cas9-TV at target sites in open chromatin regions. Panel e shows the indel frequencies induced by Cas9 and Cas9-TV at target sites in closed chromatin regions. P values were calculated by two-tailed Mann-Whitney test. *P<0.05, ***P<0.001.
Figure 4 shows that proximal targeting by dsgRNA enhances Cas9-TV editing. Panel a shows the indel frequencies of Cas9/sgRNA, Cas9-TV/sgRNA and Cas9-TV/sgRNA-dsgRNA at 20 target sites in rice protoplasts. Untreated protoplast samples were used as controls. Indel frequencies were measured by sequencing targeted amplicons. Data are from three independent biological replicates (n=3) and are shown as mean ± s.e.m. Panel b shows the fold change in indel frequency induced by Cas9-TV/sgRNA and Cas9-TV/sgRNA-dsgRNA at 10 target sites, relative to Cas9/sgRNA. Panel c shows the fold change in indel frequency in open chromatin regions. Panel d summarizes the fold change in indel frequency at target sites in closed chromatin regions. P values were calculated by two-tailed Mann-Whitney test. ***P<0.001, ****P<0.0001.
Figure 5 shows the effect of the position of the proximal dsgRNA on Cas9-TV editing. Distances were calculated as the number of nucleotides between the sgRNA and dsgRNA target sites. Indel frequencies were measured by sequencing targeted amplicons in rice protoplasts. Untreated protoplast samples were used as controls. Data are from three independent biological replicates (n=3) and are shown as mean ± s.e.m.
Figure 6 shows that proximal dsgRNA targeting increases Cas9 editing efficiency. Panel a shows the indel frequencies induced by Cas9/sgRNA and Cas9/sgRNA-dsgRNA at 20 target sites in rice protoplasts. Untreated protoplast samples were used as controls. Indel frequencies were measured by sequencing targeted amplicons. Data are from three independent biological replicates (n=3) and are shown as mean ± s.e.m. Panel b shows the indel frequencies at the 20 target sites. Panel c shows the indel frequencies induced at target sites in open chromatin. Panel d shows the indel frequencies at target sites in closed chromatin. P values were calculated by two-tailed Mann-Whitney test. ***P<0.001, ****P<0.0001.
Figure 7 shows the effect of the position of the proximal dsgRNA on Cas9 editing activity. The distances, in bp, separating the dsgRNA target sites from the Cas9-TV target site are indicated by numbers. Untreated protoplast samples were used as controls. Indel frequencies were measured by sequencing targeted amplicons. Data are from three independent biological replicates (n=3) and are shown as mean ± s.e.m.
Figure 8 shows that Cas9-TV and proximal dsgRNA alter local chromatin accessibility. Rice protoplasts were transfected with Cas9/sgRNA or Cas9-TV/sgRNA-dsgRNA, and local chromatin accessibility around the target sites was analyzed by the micro-sample DNase I assay. The fraction of intact genomic DNA was quantified by real-time PCR. For each site, the relative amount of intact genomic DNA in the Cas9/sgRNA-treated sample was set to one unit. Error bars indicate the SD of three replicates.
Figure 9 compares the off-target activities of Cas9/sgRNA, Cas9-TV/sgRNA and Cas9-TV/sgRNA-dsgRNA. Indel frequencies were measured by sequencing targeted amplicons in rice protoplasts. Untreated protoplast samples were used as controls. Data are from three independent biological replicates (n=3) and are shown as mean ± s.e.m.
Figure 10 shows the indel patterns induced by Cas9 and Cas9-TV at the target sites. The figure shows representative results from one of three independent experiments, all of which gave similar results.
Figure 11 shows the indel patterns produced by Cas9/sgRNA, Cas9/sgRNA-dsgRNA and Cas9-TV/sgRNA-dsgRNA at the indicated target sites. The figure shows representative results from one of three independent experiments, all of which gave similar results.
Figure 12 shows that dsgRNA does not induce indels at its target sites. dsgRNAs were co-transformed with Cas9 or Cas9-TV into rice protoplasts. Indel frequencies were measured by sequencing targeted amplicons. Untreated protoplast samples were used as controls. Data are from three independent biological replicates (n=3) and are shown as mean ± s.e.m.
Figure 13 shows the sgRNA and dsgRNA target sites in a partial genomic DNA sequence of LOC_Os11g08760.
发明详述Detailed description of the invention
一、定义1. Definition
在本发明中,除非另有说明,否则本文中使用的科学和技术名词具有本领域技术人员所通常理解的含义。并且,本文中所用的蛋白质和核酸化学、分子生物学、细胞和组织培养、微生物学、免疫学相关术语和实验室操作步骤均为相应领域内广泛使用的术语和常规步骤。例如,本发明中使用的标准重组DNA和分子克隆技术为本领域技术人员熟知,并且在如下文献中有更全面的描述:Sambrook,J.,Fritsch,E.F.和Maniatis,T.,Molecular Cloning:A Laboratory Manual;Cold Spring Harbor Laboratory Press:Cold Spring Harbor,1989(简称为“Sambrook”)。同时,为了更好地理解本发明,下面提供相关术语的定义和解释。In the present invention, unless otherwise specified, the scientific and technical terms used herein have the meanings commonly understood by those skilled in the art. In addition, protein and nucleic acid chemistry, molecular biology, cell and tissue culture, microbiology, immunology related terms and laboratory procedures used herein are all terms and routine procedures widely used in the corresponding fields. For example, the standard recombinant DNA and molecular cloning techniques used in the present invention are well known to those skilled in the art, and are described more fully in the following documents: Sambrook, J., Fritsch, EF and Maniatis, T., Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press: Cold Spring Harbor, 1989 (abbreviated as "Sambrook"). At the same time, in order to better understand the present invention, definitions and explanations of related terms are provided below.
如本文所用,术语“CRISPR核酸酶”通常指在天然存在的CRISPR系统中存在的核酸酶,以及其修饰形式、其变体、其催化活性片段等。该术语涵盖基于CRISPR系统的能够在细胞内实现基因靶向(例如基因编辑、基因靶向调控等)的任何效应蛋白。As used herein, the term "CRISPR nuclease" generally refers to the nuclease that exists in the naturally occurring CRISPR system, as well as its modified form, its variant, its catalytically active fragment, and the like. The term covers any effector protein based on the CRISPR system that can achieve gene targeting (such as gene editing, gene targeted regulation, etc.) in cells.
“CRISPR核酸酶”的实例包括Cas9核酸酶或其变体。所述Cas9核酸酶可以是来自不同物种的Cas9核酸酶,例如来自化脓链球菌(S.pyogenes)的spCas9或衍生自金黄色葡萄球菌(S.aureus)的SaCas9。“Cas9核酸酶”和“Cas9”在本文中可互换使用,指的是包括Cas9蛋白或其片段(例如包含Cas9的活性DNA切割结构域和/或Cas9的gRNA结合结构域的蛋白)的RNA指导的核酸酶。Cas9是CRISPR/Cas(成簇的规律间隔的短回文重复序列及其相关系统)基因组编辑系统的组分,能在向导RNA的指导下靶向并切割DNA靶序列形成DNA双链断裂(DSB)。Examples of "CRISPR nuclease" include Cas9 nuclease or variants thereof. The Cas9 nuclease may be a Cas9 nuclease from different species, such as spCas9 from S. pyogenes or SaCas9 derived from S. aureus. "Cas9 nuclease" and "Cas9" are used interchangeably herein, and refer to RNA comprising Cas9 protein or fragments thereof (for example, a protein containing the active DNA cleavage domain of Cas9 and/or the gRNA binding domain of Cas9) Guided nuclease. Cas9 is a component of CRISPR/Cas (clustered regularly spaced short palindrome repeats and related systems) genome editing system, which can target and cut DNA target sequences under the guidance of guide RNA to form DNA double-strand breaks (DSB) ).
“CRISPR核酸酶”的实例还可以包括Cpf1核酸酶或其变体例如高特异性变体。所述 Cpf1核酸酶可以是来自不同物种的Cpf1核酸酶,例如来自Francisella novicida U112、Acidaminococcus sp.BV3L6和Lachnospiraceae bacterium ND2006的Cpf1核酸酶。Examples of "CRISPR nuclease" may also include Cpf1 nuclease or a variant thereof such as a highly specific variant. The Cpf1 nuclease may be Cpf1 nuclease from different species, such as Cpf1 nuclease from Francisella novicida U112, Acidaminococcus sp. BV3L6 and Lachnospiraceae bacterium ND2006.
如本文所用,“转录激活结构域(TAD)”一般是转录因子中的结构域,其含有其他蛋白质如转录辅调节蛋白的结合位点。TAD一般根据氨基酸组成分类,这些氨基酸可以是对活性至关重要的氨基酸,也可以是TAD中最丰富的氨基酸。转录激活结构域一般分为酸性激活结构域、富谷氨酰胺结构域、富脯氨酸结构域和富异亮氨酸结构域。As used herein, a "transcription activation domain (TAD)" is generally a domain in a transcription factor that contains binding sites for other proteins such as transcriptional co-regulatory proteins. TAD is generally classified according to the composition of amino acids. These amino acids can be amino acids that are essential for activity or the most abundant amino acids in TAD. Transcription activation domains are generally divided into acid activation domains, glutamine-rich domains, proline-rich domains and isoleucine-rich domains.
As used herein, "gRNA" and "guide RNA" are used interchangeably and refer to an RNA molecule that can form a complex with a CRISPR nuclease and, owing to a degree of complementarity with a target sequence, can direct that complex to the target sequence. For example, in a Cas9-based genome editing system, the gRNA usually consists of a crRNA and a tracrRNA that are partially complementary and form a complex, wherein the crRNA comprises a sequence that is sufficiently complementary to the target sequence to hybridize with it and to direct sequence-specific binding of the CRISPR complex (Cas9+crRNA+tracrRNA) to the target sequence. However, it is known in the art that a single guide RNA (sgRNA) can be designed that combines the features of both crRNA and tracrRNA. In a Cpf1-based genome editing system, the gRNA usually consists only of a mature crRNA molecule, wherein the crRNA comprises a sequence that is sufficiently identical to the target sequence to hybridize with the complement of the target sequence and to direct sequence-specific binding of the complex (Cpf1+crRNA) to the target sequence. Designing a suitable gRNA sequence based on the CRISPR nuclease used and the target sequence to be edited is within the ability of those skilled in the art.
"Dead sgRNA" or "dsgRNA" refers to an sgRNA that can guide Cas9 to a target site without inducing a double-strand break (DSB), and that has a spacer sequence (target sequence) of only 14 or 15 bp.
As used herein, "chromatin" refers to the linear composite structure in the interphase nucleus composed of DNA, histones, non-histone proteins and a small amount of RNA; it is the form in which genetic material exists in interphase cells. During mitosis or meiosis, the chromatin of eukaryotic cells condenses into rod-shaped chromosomes. Within chromatin, DNA regions that readily bind other proteins (such as nucleases, transposases, modifying enzymes, etc.) are called "open chromatin regions", whereas DNA regions that are difficult for other proteins to bind are called "closed chromatin regions".
As used herein, "genome" covers not only the chromosomal DNA present in the nucleus but also the organelle DNA present in subcellular components of the cell (such as mitochondria and plastids).
As used herein, "cell" includes a cell of any organism suitable for genome editing. Examples of organisms include, but are not limited to, mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle and cats; poultry such as chickens, ducks and geese; and plants, including monocots and dicots, for example rice, maize, wheat, sorghum, barley, soybean, peanut, Arabidopsis and the like.
"Genetically modified organism" or "genetically modified cell" means an organism or cell that contains, within its genome, an exogenous polynucleotide or a modified gene or expression control sequence. For example, the exogenous polynucleotide can be stably integrated into the genome of the organism or cell and inherited over successive generations. The exogenous polynucleotide may be integrated into the genome alone or as part of a recombinant DNA construct. A modified gene or expression control sequence is one in which the sequence in the genome of the organism or cell contains one or more deoxynucleotide substitutions, deletions and additions.
"Exogenous", with respect to a sequence, means a sequence from a foreign species or, if from the same species, a sequence whose composition and/or genomic locus has been significantly altered from its native form through deliberate human intervention.
"Polynucleotide", "nucleic acid sequence", "nucleotide sequence" and "nucleic acid fragment" are used interchangeably and denote a single-stranded or double-stranded RNA or DNA polymer, optionally containing synthetic, non-natural or altered nucleotide bases. Nucleotides are referred to by their single-letter designations as follows: "A" is adenosine or deoxyadenosine (for RNA or DNA, respectively), "C" is cytidine or deoxycytidine, "G" is guanosine or deoxyguanosine, "U" is uridine, "T" is deoxythymidine, "R" is a purine (A or G), "Y" is a pyrimidine (C or T), "K" is G or T, "H" is A or C or T, "I" is inosine, and "N" is any nucleotide.
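The single-letter designations above can be used to test whether a concrete DNA sequence fits a degenerate pattern. The following is a minimal sketch in Python that assumes DNA input only (no "U" or "I"); the names `IUPAC_DNA` and `matches_pattern` are illustrative and do not come from the source.

```python
# Degenerate codes listed in the text, restricted to DNA bases.
# "U" (RNA) and "I" (inosine) are left out as a simplifying assumption.
IUPAC_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "K": "GT",
    "H": "ACT",
    "N": "ACGT",  # any nucleotide
}


def matches_pattern(pattern: str, sequence: str) -> bool:
    """Return True if every base of `sequence` is allowed by the code
    at the same position of `pattern` (both must have equal length)."""
    pattern, sequence = pattern.upper(), sequence.upper()
    if len(pattern) != len(sequence):
        return False
    return all(base in IUPAC_DNA[code] for code, base in zip(pattern, sequence))
```

For example, `matches_pattern("NGG", "AGG")` is true because "N" admits any base at the first position.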
"Polypeptide", "peptide" and "protein" are used interchangeably in the present invention and refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residues are artificial chemical analogs of the corresponding naturally occurring amino acids, as well as to naturally occurring amino acid polymers. The terms "polypeptide", "peptide", "amino acid sequence" and "protein" may also include modified forms, including but not limited to glycosylation, lipid attachment, sulfation, gamma-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation.
As used in the present invention, "expression construct" refers to a vector, such as a recombinant vector, suitable for expressing a nucleotide sequence of interest in an organism. "Expression" refers to the production of a functional product. For example, expression of a nucleotide sequence may refer to transcription of the nucleotide sequence (e.g., transcription to produce mRNA or a functional RNA) and/or translation of RNA into a precursor or mature protein.
The "expression construct" of the present invention may be a linear nucleic acid fragment, a circular plasmid or a viral vector or, in some embodiments, a translatable RNA (such as mRNA).
The "expression construct" of the present invention may comprise regulatory sequences and a nucleotide sequence of interest from different sources, or regulatory sequences and a nucleotide sequence of interest from the same source but arranged in a manner different from that normally found in nature.
"Regulatory sequence" and "regulatory element" are used interchangeably and refer to a nucleotide sequence that is located upstream of (5' non-coding sequence), within, or downstream of (3' non-coding sequence) a coding sequence and that affects the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include, but are not limited to, promoters, translation leader sequences, introns and polyadenylation recognition sequences.
"Promoter" refers to a nucleic acid fragment capable of controlling the transcription of another nucleic acid fragment. In some embodiments of the present invention, the promoter is one capable of controlling gene transcription in a cell, whether or not it is derived from that cell. The promoter may be a constitutive promoter, a tissue-specific promoter, a developmentally regulated promoter or an inducible promoter.
"Constitutive promoter" refers to a promoter that will generally cause a gene to be expressed in most cell types under most conditions. "Tissue-specific promoter" and "tissue-preferred promoter" are used interchangeably and refer to a promoter that is expressed mainly, but not necessarily exclusively, in one tissue or organ, and that may also be expressed in one specific cell or cell type. "Developmentally regulated promoter" refers to a promoter whose activity is determined by developmental events. An "inducible promoter" selectively expresses an operably linked DNA sequence in response to an endogenous or exogenous stimulus (environment, hormones, chemical signals, etc.).
As used herein, the term "operably linked" means that a regulatory element (for example, but not limited to, a promoter sequence, a transcription termination sequence, etc.) is linked to a nucleic acid sequence (for example, a coding sequence or an open reading frame) such that transcription of the nucleotide sequence is controlled and regulated by the transcriptional regulatory element. Techniques for operably linking a regulatory element region to a nucleic acid molecule are known in the art.
"Introducing" a nucleic acid molecule (such as a plasmid, a linear nucleic acid fragment, RNA, etc.) or a protein into an organism means transforming cells of the organism with the nucleic acid or protein so that the nucleic acid or protein can function in the cells. "Transformation" as used in the present invention includes stable transformation and transient transformation. "Stable transformation" refers to introducing an exogenous nucleotide sequence into the genome, resulting in stable inheritance of the exogenous gene. Once stably transformed, the exogenous nucleic acid sequence is stably integrated into the genome of the organism and of any successive generations thereof. "Transient transformation" refers to introducing a nucleic acid molecule or protein into a cell, where it performs its function without stable inheritance of an exogenous gene. In transient transformation, the exogenous nucleic acid sequence is not integrated into the genome.
2. Genome editing fusion polypeptide
The present invention provides a genome editing fusion polypeptide comprising a CRISPR nuclease domain and a transcription activation domain.
The CRISPR nuclease of the present invention may be any CRISPR nuclease capable of genome editing. In some embodiments, the CRISPR nuclease is Cas9 or an active fragment thereof, such as Cas9 from Streptococcus pyogenes (SpCas9), Cas9 from Staphylococcus aureus (SaCas9), Cas9 from Francisella novicida (FnCas9), Cas9 from Campylobacter jejuni (CjCas9) and Cas9 from Neisseria cinerea (NcCas9). In some embodiments, the CRISPR nuclease is Cpf1 or an active fragment thereof, such as Cpf1 from Francisella novicida U112 (FnCpf1), Cpf1 from Acidaminococcus sp. BV3L6 and Cpf1 from Lachnospiraceae bacterium ND2006 (LbCpf1).
The transcription activation domain (TAD) used in the present invention is not particularly limited, provided that it can perform the function of opening chromatin. In some embodiments, the transcription activation domain (TAD) comprises an acidic activation domain, a glutamine-rich domain, a proline-rich domain, an isoleucine-rich domain, or any combination thereof. Acidic activation domains are rich in aspartic acid and glutamic acid and include, but are not limited to, the TADs of Gal4, Oaf1, Leu3, Rtg3, Pho4, Gln3 and Gcn4 from yeast and the TADs of p53, NFAT, NF-κB and VP16 from mammals. Glutamine-rich domains contain multiple repeats resembling "QQQXXXQQQ" and include, but are not limited to, the TADs of POU2F1 (Oct1), POU2F2 (Oct2) and Sp1. Proline-rich domains contain repeats resembling "PPPXXXPPP" and include, but are not limited to, the TADs of c-jun, AP2 and Oct-2. Isoleucine-rich domains contain the repeat "IIXXII", for example the TAD of NTF-1.
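The repeat patterns quoted above ("QQQXXXQQQ", "PPPXXXPPP", "IIXXII") lend themselves to a simple pattern search. The sketch below is one illustrative reading, assuming that "X" stands for any single residue; real TADs only resemble these motifs, so an exact match is neither necessary nor sufficient to identify a domain. The function name and the toy sequences in the usage note are hypothetical.

```python
import re

# Assumption: "X" in the quoted motifs means "any one amino acid".
TAD_MOTIFS = {
    "glutamine-rich": re.compile(r"Q{3}.{3}Q{3}"),  # QQQXXXQQQ
    "proline-rich": re.compile(r"P{3}.{3}P{3}"),    # PPPXXXPPP
    "isoleucine-rich": re.compile(r"I{2}.{2}I{2}"), # IIXXII
}


def find_tad_motifs(protein: str) -> dict:
    """Map each motif class to the 0-based start positions of its
    non-overlapping exact matches in `protein` (one-letter code)."""
    protein = protein.upper()
    return {
        name: [m.start() for m in pattern.finditer(protein)]
        for name, pattern in TAD_MOTIFS.items()
    }
```

On the toy sequence `"MAQQQASTQQQLL"`, the glutamine-rich pattern matches at position 2 and the other two classes return empty lists.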
In some embodiments, the transcription activation domain comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more copies of the same or different TADs. In some embodiments, the transcription activation domain comprises one or more VP16-TADs. In some embodiments, the transcription activation domain comprises one or more TADs of a transcription activator-like effector (TALE-TAD). In some embodiments, the transcription activation domain comprises one or more VP16-TADs and one or more TALE-TADs. Preferably, the transcription activation domain comprises 8 copies of VP16-TAD and 6 copies of TALE-TAD. Preferably, the transcription activation domain comprises the amino acid sequence of SEQ ID NO: 1. Preferably, the transcription activation domain consists of the amino acid sequence of SEQ ID NO: 1.
In the polypeptide of the present invention, the transcription activation domain and the CRISPR nuclease domain may be fused directly or indirectly. In some embodiments, the transcription activation domain is fused directly to the CRISPR nuclease domain. In some embodiments, the transcription activation domain and the CRISPR nuclease domain are fused indirectly, for example via a linker. The linker may be a non-functional amino acid sequence, 1-50 (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or 20-25, 25-50) or more amino acids in length, that has no secondary or higher-order structure. For example, the linker may be a flexible linker such as GGGGS, GS, GAP, (GGGGS)x3, GGS or (GGS)x7.
In the polypeptide of the present invention, the transcription activation domain is located at the N-terminus or the C-terminus of the CRISPR nuclease domain. In some embodiments, the transcription activation domain is fused to the N-terminus of the CRISPR nuclease domain. In some embodiments, the transcription activation domain is fused to the C-terminus of the CRISPR nuclease domain.
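The linker notation and the two fusion orientations described above can be made concrete with plain string operations on one-letter amino acid sequences. This is a minimal sketch: `repeat_linker` and `fuse` are hypothetical helper names, and the short sequences used in the usage note are toy placeholders, not the sequences of SEQ ID NO: 1 or 2.

```python
def repeat_linker(unit: str, copies: int) -> str:
    """Expand the '(UNIT)xN' notation, e.g. (GGGGS)x3 -> GGGGSGGGGSGGGGS."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return unit * copies


def fuse(nuclease: str, tad: str, linker: str = "", tad_terminus: str = "C") -> str:
    """Join a nuclease domain and a TAD, directly (empty linker) or via a
    linker, with the TAD at the N- or C-terminus of the nuclease domain."""
    if tad_terminus == "C":
        return nuclease + linker + tad
    if tad_terminus == "N":
        return tad + linker + nuclease
    raise ValueError("tad_terminus must be 'N' or 'C'")
```

For instance, `repeat_linker("GGS", 7)` yields a 21-residue linker, and `fuse("MDKK", "DALDDF", linker="GS")` places the toy TAD at the C-terminus as `"MDKKGSDALDDF"`.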
In some embodiments, the polypeptide further comprises a nuclear localization sequence (NLS). In general, the one or more NLSs in the polypeptide should be strong enough to drive accumulation of the polypeptide in the nucleus of a plant cell in an amount sufficient for its genome editing function. In general, the strength of nuclear localization activity is determined by the number and position of NLSs in the polypeptide, the specific NLS or NLSs used, or a combination of these factors.
In some embodiments of the present invention, the NLS of the polypeptide may be located at the N-terminus and/or the C-terminus. In some embodiments of the present invention, the NLS of the polypeptide may be located between the transcription activation domain and the CRISPR nuclease domain. In some embodiments, the polypeptide comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more NLSs. In some embodiments, the polypeptide comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more NLSs at or near the N-terminus. In some embodiments, the polypeptide comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more NLSs at or near the C-terminus. In some embodiments, the polypeptide comprises a combination of these, such as one or more NLSs at the N-terminus and one or more NLSs at the C-terminus. When more than one NLS is present, each may be selected independently of the others. In some preferred embodiments of the present invention, the polypeptide comprises at least 2 NLSs, for example at least 2 NLSs located at the C-terminus. In some preferred embodiments, the NLS is located at the C-terminus of the polypeptide. In some preferred embodiments, the polypeptide comprises at least 3 NLSs. In more preferred embodiments, the polypeptide comprises at least 3 NLSs at the C-terminus. In some preferred embodiments, the polypeptide does not comprise an NLS at the N-terminus and/or between the transcription activation domain and the CRISPR nuclease domain.
In general, an NLS consists of one or more short sequences of positively charged lysines or arginines exposed on the protein surface, but other types of NLS are also known. Non-limiting examples of NLSs include: KKRKV (nucleotide sequence 5'-AAGAAGAGAAAGGTC-3'), PKKKRKV (nucleotide sequence 5'-CCCAAGAAGAAGAGGAAGGTG-3' or 5'-CCAAAGAAGAAGAGGAAGGTT-3'), and SGGSPKKKRKV (nucleotide sequence 5'-TCGGGGGGGAGCCCAAAGAAGAAGCGGAAGGTG-3').
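The pairing between each NLS peptide and its listed nucleotide sequence can be checked by translating the DNA with the standard genetic code. The sketch below builds the standard codon table from its conventional compact form; the names `CODON_TABLE`, `translate` and `NLS_EXAMPLES` are illustrative choices, while the sequences themselves are copied from the paragraph above.

```python
from itertools import product

# Standard genetic code in compact form: codons enumerated with base
# order T, C, A, G at each position; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# NLS peptides and coding sequences as listed in the text.
NLS_EXAMPLES = {
    "KKRKV": ["AAGAAGAGAAAGGTC"],
    "PKKKRKV": ["CCCAAGAAGAAGAGGAAGGTG", "CCAAAGAAGAAGAGGAAGGTT"],
    "SGGSPKKKRKV": ["TCGGGGGGGAGCCCAAAGAAGAAGCGGAAGGTG"],
}
```

The two coding sequences given for PKKKRKV differ only in synonymous codons (CCC/CCA for proline, GTG/GTT for valine), which is why both translate to the same peptide.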
In a preferred embodiment, the polypeptide comprises two nuclear localization sequences. Preferably, one nuclear localization sequence is located at the N-terminus of the CRISPR nuclease or active fragment thereof, and one nuclear localization sequence is located between the C-terminus of the CRISPR nuclease domain or active fragment thereof and the N-terminus of the transcription activation domain.
In a preferred embodiment, the polypeptide of the present invention comprises the amino acid sequence of SEQ ID NO: 2. More preferably, the polypeptide consists of the amino acid sequence of SEQ ID NO: 2.
The present invention also provides an isolated polynucleotide encoding the polypeptide of the present invention. In some embodiments, the polynucleotide comprises the nucleotide sequence of SEQ ID NO: 3 or a degenerate variant thereof. Preferably, the polynucleotide consists of the nucleotide sequence of SEQ ID NO: 3 or a degenerate variant thereof.
To obtain efficient expression, in some embodiments the polynucleotide is codon-optimized for the organism to be edited, such as a plant.
Codon optimization refers to a method of modifying a nucleic acid sequence to enhance expression in a host cell of interest by replacing at least one codon of the native sequence (e.g., about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50 or more codons) with codons that are more frequently or most frequently used in the genes of that host cell, while maintaining the native amino acid sequence. Different species exhibit particular preferences for certain codons of a given amino acid. Codon bias (differences in codon usage between organisms) often correlates with the translation efficiency of messenger RNA (mRNA), which in turn is believed to depend on the properties of the codons being translated and the availability of particular transfer RNA (tRNA) molecules. The predominance of selected tRNAs in a cell generally reflects the codons used most frequently in peptide synthesis. Accordingly, genes can be tailored for optimal expression in a given organism on the basis of codon optimization. Codon usage tables are readily available, for example in the Codon Usage Database at www.kazusa.or.jp/codon/, and these tables can be adapted in different ways. See Nakamura Y. et al., "Codon usage tabulated from the international DNA sequence databases: status for the year 2000", Nucl. Acids Res., 28:292 (2000).
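The substitution step described above (swap a codon for a host-preferred synonym while leaving the encoded protein unchanged) can be sketched as follows. The `PREFERRED` table is a made-up placeholder for illustration only; it is not real usage data for rice or any other organism, and a practical workflow would load values from a codon usage table. All function and variable names are hypothetical.

```python
from itertools import product

# Standard genetic code (base order T, C, A, G; "*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# HYPOTHETICAL host preferences: amino acid -> preferred codon.
PREFERRED = {"P": "CCG", "K": "AAG", "R": "AGG", "V": "GTG"}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence with the standard code."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def codon_optimize(dna: str, preferred: dict = PREFERRED) -> str:
    """Replace each codon with the host-preferred synonym when one is
    listed; codons for unlisted amino acids are left unchanged."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        out.append(preferred.get(CODON_TABLE[codon], codon))
    optimized = "".join(out)
    # Safety check: the encoded amino acid sequence must be preserved.
    assert translate(optimized) == translate(dna)
    return optimized
```

Choosing only the single most frequent codon per amino acid is the simplest policy; the text also allows "more frequently" used codons, so weighted or partial replacement would be equally consistent with the definition.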
3. Improved genome editing system
The present invention provides an improved genome editing system comprising at least one of the following i) to v):
i) the genome editing fusion polypeptide of the present invention, and a guide RNA;
ii) an expression construct encoding the genome editing fusion polypeptide of the present invention, and a guide RNA;
iii) the genome editing fusion polypeptide of the present invention, and an expression construct comprising a nucleotide sequence encoding a guide RNA;
iv) an expression construct encoding the genome editing fusion polypeptide of the present invention, and an expression construct comprising a nucleotide sequence encoding a guide RNA;
v) an expression construct comprising the polynucleotide of the present invention and a nucleotide sequence encoding a guide RNA.
In some embodiments, the guide RNA is an sgRNA; preferably, the sgRNA targets a closed chromatin region. Methods for constructing a suitable sgRNA for a given target sequence are known in the art. See, for example: Wang, Y. et al. Simultaneous editing of three homoeoalleles in hexaploid bread wheat confers heritable resistance to powdery mildew. Nat. Biotechnol. 32, 947-951 (2014); Shan, Q. et al. Targeted genome modification of crop plants using a CRISPR-Cas system. Nat. Biotechnol. 31, 686-688 (2013); Liang, Z. et al. Targeted mutagenesis in Zea mays using TALENs and the CRISPR/Cas system. J Genet Genomics 41, 63-68 (2014).
The design of target sequences that can be recognized and targeted by the complex of a CRISPR nuclease and a guide RNA is within the skill of those of ordinary skill in the art. In general, the target sequence is a sequence complementary to the guide sequence of about 20 nucleotides contained in the guide RNA, and its 3' end is immediately adjacent to a protospacer adjacent motif (PAM).
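As an illustration of the "about 20 nucleotides plus an adjacent PAM" rule, the sketch below lists candidate sites on one strand of a DNA string. It assumes the NGG PAM commonly used with SpCas9, which the passage above does not itself specify, and it scans only the forward strand; other nucleases use other PAMs and orientations. The function name and the test sequence are hypothetical.

```python
def find_candidate_targets(dna: str, spacer_len: int = 20) -> list:
    """Return (start, protospacer, pam) tuples for every window of
    `spacer_len` bases that is immediately followed at its 3' end by
    an NGG PAM on the given strand. Positions are 0-based."""
    dna = dna.upper()
    hits = []
    # Last valid start leaves room for the spacer plus a 3-base PAM.
    for start in range(len(dna) - spacer_len - 2):
        pam = dna[start + spacer_len:start + spacer_len + 3]
        if pam[1:] == "GG":  # "N" at the first PAM position matches any base
            hits.append((start, dna[start:start + spacer_len], pam))
    return hits
```

A full design tool would also scan the reverse complement and score off-target matches; both are omitted here to keep the sketch short.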
In an exemplary embodiment, the scaffold sequence of the guide RNA of the present invention is as shown in SEQ ID NO: 4.
In some embodiments, the CRISPR system of the present invention further comprises or encodes a dsgRNA whose target site lies 30-300 bp, preferably 40-270 bp, most preferably 115-120 bp, from the site targeted by the sgRNA. In some embodiments, the dsgRNA comprises a guide sequence of only 14 or 15 nucleotides. That is, the dsgRNA targets a target sequence of only 14 or 15 nucleotides. Such a dsgRNA can direct the CRISPR nuclease to its target sequence but cannot cause cleavage.
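The spacing and length constraints for a dsgRNA can be written down as a small check. This is only a sketch: the text does not say which coordinates the distance is measured between, so taking the absolute difference of two site positions is an assumption, and the function names and tier labels are hypothetical.

```python
def is_dsgrna_guide(guide: str) -> bool:
    """A dsgRNA guide sequence is only 14 or 15 nucleotides long."""
    return len(guide) in (14, 15)


def spacing_tier(sgrna_site: int, dsgrna_site: int) -> str:
    """Classify the sgRNA-to-dsgRNA distance (bp) against the stated ranges.

    Assumption: distance is the absolute difference of the two positions.
    """
    distance = abs(sgrna_site - dsgrna_site)
    if 115 <= distance <= 120:
        return "most preferred"
    if 40 <= distance <= 270:
        return "preferred"
    if 30 <= distance <= 300:
        return "allowed"
    return "out of range"
```

For example, sites at positions 1000 and 1118 are 118 bp apart and fall in the most preferred window.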
In some embodiments, the CRISPR system of the present invention comprises at least one of ii) to v) above. In some embodiments, the nucleotide sequence encoding the polypeptide of the present invention and/or the nucleotide sequence encoding the guide RNA is operably linked to an expression control sequence, such as a promoter, preferably a plant expression control sequence.
Examples of promoters that can be used in the present invention include, but are not limited to: the cauliflower mosaic virus 35S promoter (Odell et al. (1985) Nature 313:810-812), the maize Ubi-1 promoter, the wheat U6 promoter, the rice U3 promoter, the maize U3 promoter, the rice actin promoter, the TrpPro5 promoter (U.S. Patent Application No. 10/377,318; filed March 16, 2005), the pEMU promoter (Last et al. (1991) Theor. Appl. Genet. 81:581-588), the MAS promoter (Velten et al. (1984) EMBO J. 3:2723-2730), the maize H3 histone promoter (Lepetit et al. (1992) Mol. Gen. Genet. 231:276-285 and Atanassova et al. (1992) Plant J. 2(3):291-300) and the Brassica napus ALS3 promoter (PCT application WO 97/41228). Promoters that can be used in the present invention also include the commonly used tissue-specific promoters reviewed in Moore et al. (2006) Plant J. 45(4):651-683.
In an exemplary embodiment, the construct of the present invention comprises a rice U3 promoter comprising the nucleotide sequence shown in SEQ ID NO: 5.
4. Method for genetically modifying a cell
In another aspect, the present invention provides a method for genetically modifying a cell, comprising introducing the genome editing system of the present invention into the cell.
The design of target sequences that can be recognized and targeted by the complex of a CRISPR nuclease and a guide RNA is within the skill of those of ordinary skill in the art. In general, the target sequence is a sequence complementary to the guide sequence of about 20 nucleotides contained in the guide RNA, and its 3' end is immediately adjacent to a protospacer adjacent motif (PAM).
In the present invention, the target sequence to be modified may be located anywhere in the genome, for example within a functional gene such as a protein-coding gene, or in a gene expression regulatory region such as a promoter or enhancer region, thereby modifying the function of the gene or its expression. Preferably, the target sequence is located in a closed chromatin region.
Substitutions, deletions and/or additions in the target sequence of the cell can be detected by T7EI, PCR/RE or sequencing methods.
In the method of the present invention, the genome editing system can be introduced into the cell by various methods well known to those skilled in the art.
Methods that can be used to introduce the genome editing system of the present invention into a cell include, but are not limited to: calcium phosphate transfection, protoplast fusion, electroporation, lipofection, microinjection, viral infection (e.g., baculovirus, vaccinia virus, adenovirus and other viruses), the gene gun (biolistic) method, PEG-mediated protoplast transformation and Agrobacterium-mediated transformation.
Cells whose genome can be edited by the method of the present invention may be derived from, for example, mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cattle and cats; poultry such as chickens, ducks and geese; and plants, including monocots and dicots, for example rice, maize, wheat, sorghum, barley, soybean, peanut, Arabidopsis and the like. Preferably, the cell is a plant cell, such as a rice cell.
In some embodiments, the method of the present invention is performed in vitro. For example, the cell is an isolated cell. In other embodiments, the method of the present invention may also be performed in vivo. For example, the cell is a cell within an organism, and the system of the present invention can be introduced into the cell in vivo by, for example, a virus-mediated method. In some embodiments, the cell is a germ cell. In some embodiments, the cell is a somatic cell.
在另一方面,本发明还提供经遗传修饰的生物体,其包含通过本发明的方法产生的经遗传修饰的细胞。In another aspect, the present invention also provides a genetically modified organism, which comprises a genetically modified cell produced by the method of the present invention.
所述生物体包括但不限于哺乳动物如人、小鼠、大鼠、猴、犬、猪、羊、牛、猫;家禽如鸡、鸭、鹅;植物,包括单子叶植物和双子叶植物,例如水稻、玉米、小麦、高粱、大麦、大豆、花生、拟南芥等。优选地,所述生物体是植物,优选水稻。The organisms include, but are not limited to, mammals such as humans, mice, rats, monkeys, dogs, pigs, sheep, cows, and cats; poultry such as chickens, ducks, and geese; plants, including monocots and dicots, For example, rice, corn, wheat, sorghum, barley, soybean, peanut, Arabidopsis and so on. Preferably, the organism is a plant, preferably rice.
Examples
Example 1. Methods
Plasmid construction
The coding sequences of VP64 (four copies of VP16-TAD) and 2TAL (two copies of TALE-TAD) were codon-optimized for rice (Oryza sativa) and synthesized (SEQ ID NO: 6 and SEQ ID NO: 7, respectively) (GenScript, Nanjing, China). The VP64 coding sequence was fused to the 3' end of Cas9 by overlap PCR, and an AvrII site was introduced between Cas9 and VP64. The Cas9-VP64 fusion gene was cloned into pJIT163 to generate p163-Cas9-VP64. One copy of VP64 and three copies of the 2TAL fragment were then inserted sequentially into the AvrII site of p163-Cas9-VP64 to generate p163-Cas9-TV, in which the Cas9-TV sequence is as set forth in the nucleotide sequence of SEQ ID NO: 3. The different sgRNAs were introduced into pOsU3-sgRNA as described previously (see Shan et al. (2014). Genome editing in rice and wheat using the CRISPR/Cas system. Nat Protoc 9, 2395-2410). The sgRNA-dsgRNA co-expression plasmids were constructed as previously reported (see Xing et al. (2014). A CRISPR/Cas9 toolkit for multiplex genome editing in plants. BMC Plant Biol 14, 327).
DNase-seq data analysis
Previously reported DNase-seq data for rice seedlings (GSE26610) were obtained from the NCBI Gene Expression Omnibus (GEO) (see Zhang et al. (2012). High-resolution mapping of open chromatin in the rice genome. Genome Res 22, 151-162). The DNase-seq data were loaded into the GBrowse of the Rice Annotation Project Database (RAP-DB) to inspect the chromatin state of the target sites.
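Deciding whether a target site lies in a DNase I hypersensitive (DH) region amounts to an interval lookup. The following is a minimal sketch only: the DH intervals, the chromosome name, the coordinates and the function name are hypothetical placeholders, not values taken from GSE26610 or from RAP-DB.

```python
# Hypothetical DH intervals keyed by chromosome, as (start, end), 1-based inclusive.
# Real intervals would come from DNase-seq peak calls; these are placeholders.
DH_SITES = {
    "chr04": [(1_000, 1_500), (8_000, 8_400)],
}


def in_dh_site(chrom, position, dh_sites=DH_SITES):
    """Return True if the position falls inside any DH interval on that chromosome."""
    return any(start <= position <= end for start, end in dh_sites.get(chrom, []))


# A site inside the first placeholder interval, and one between intervals.
print(in_dh_site("chr04", 1_200))
print(in_dh_site("chr04", 5_000))
```

A target site classified this way would be labelled "open" when the lookup returns True and "closed" otherwise, which is the grouping used in the later comparisons.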
Protoplast transfection
Two-week-old seedlings of the rice cultivar Nipponbare were used for protoplast isolation. Protoplast isolation and transfection were performed according to a standard protocol (see Shan et al. (2014). Genome editing in rice and wheat using the CRISPR/Cas system. Nat Protoc 9, 2395-2410). Plasmids (10 μg per construct) were introduced into the protoplasts by PEG-mediated transfection.
Extraction of plant genomic DNA
The transfected protoplasts were incubated at 28°C. After 48 hours, the protoplasts were collected and genomic DNA was extracted by the CTAB method (see Murray & Thompson (1980). Rapid isolation of high molecular weight plant DNA. Nucleic Acids Res 8, 4321-4325).
PCR amplification and next-generation sequencing of targeted regions
Genomic DNA extracted from protoplasts was used as the PCR template. In the first round of PCR, the genomic region flanking the CRISPR target site was amplified with site-specific primers. In the second round, 150-250 bp PCR products were amplified with primers that add forward and reverse barcodes to the first-round products. Equal amounts of the final PCR products were pooled and sequenced by paired-end sequencing on the Illumina NextSeq 500 platform (GENEWIZ, Suzhou, China). Indels at the sgRNA target sites were then identified. Each amplicon was sequenced three times, using genomic DNA from three independent protoplast samples.
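The indel frequency at a target site is conventionally the share of amplicon reads that carry an insertion or deletion, averaged over the three replicates. The sketch below illustrates that arithmetic under stated assumptions: the read counts are invented for illustration, and the function names are not taken from the source, which does not describe its analysis pipeline.

```python
from statistics import mean, stdev


def indel_frequency(indel_reads, total_reads):
    """Percentage of reads at a target site that carry an indel."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * indel_reads / total_reads


def summarize_replicates(counts):
    """Mean and sample standard deviation of indel frequency across replicates.

    counts is a list of (indel_reads, total_reads) pairs, one per replicate.
    """
    freqs = [indel_frequency(indel, total) for indel, total in counts]
    return mean(freqs), stdev(freqs)


# Hypothetical read counts for three independent protoplast samples.
replicates = [(150, 1000), (200, 1000), (250, 1000)]
avg, sd = summarize_replicates(replicates)
print(f"indel frequency: {avg:.2f}% (SD {sd:.2f}%)")
```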
In vitro cleavage by Cas9 RNP
In vitro cleavage by Cas9 RNP was performed as previously reported (see Liang et al. (2017). Efficient DNA-free genome editing of bread wheat using CRISPR/Cas9 ribonucleoprotein complexes. Nat Commun 8, 14261). The target DNA sequences were amplified by PCR with specific primers, purified, and eluted in RNase-free water. Cas9 protein (1 μg) and sgRNA (1 μg) were premixed and incubated with the target DNA (200 ng) at 37°C for 1 h. The products were then separated on a 2% agarose gel, and band intensities were measured with ImageJ software to calculate Cas9 cleavage activity.
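The source does not give the formula used to turn band intensities into cleavage activity. A commonly used definition, assumed here, is the intensity of the cleaved bands divided by the total intensity of cleaved and uncut bands. The intensities and function name below are hypothetical.

```python
def cleavage_percent(uncut_intensity, cleaved_intensities):
    """Cleaved fraction as a percentage of total band intensity.

    Assumed formula: sum(cleaved) / (uncut + sum(cleaved)) * 100.
    """
    cleaved = sum(cleaved_intensities)
    total = uncut_intensity + cleaved
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return 100.0 * cleaved / total


# Hypothetical background-subtracted intensities from one gel lane:
# one uncut band and the two cleavage products.
print(cleavage_percent(250.0, [500.0, 250.0]))
```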
Detection of chromatin accessibility
The low-input DNase I digestion assay was performed as previously reported (see Lu et al. (2016). Establishing Chromatin Regulatory Landscape during Mouse Preimplantation Development. Cell 165, 1375-1388). Transfected protoplasts were cultured at 28°C for 24 hours. Samples of 4 × 10^5 transfected rice protoplasts were resuspended in 45 μL of lysis buffer (10 mM Tris-HCl [pH 7.5], 10 mM NaCl, 3 mM MgCl2, 0.1% Triton X-100) and incubated on ice for 5 min, after which DNase I (1000 U/mL, Sigma, AMPD1-1KT) was added to a final concentration of 2 U/mL. The samples were incubated at 37°C for a further 5 min, and the reaction was stopped by adding 50 μL of stop buffer (10 mM Tris-HCl [pH 7.5], 10 mM NaCl, 0.15% SDS, 10 mM EDTA) containing 1 U of proteinase K, followed by incubation at 55°C for 1 hour. Genomic DNA was extracted from each sample by the phenol-chloroform method (see Sambrook & Russell (2006). Purification of nucleic acids by extraction with Phenol:Chloroform. CSH Protoc 2006: pdb.prot4455) and analyzed by real-time qPCR (SYBR Premix Ex Taq II, Takara).
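The DNase I dilution from the 1000 U/mL stock to a 2 U/mL final concentration follows C1·V1 = C2·V2. The sketch below works through that arithmetic. The 50 μL reaction volume is an assumption (the source states only the 45 μL of lysis buffer and the 50 μL of stop buffer), and the function name is illustrative.

```python
def stock_volume_ul(stock_u_per_ml, final_u_per_ml, reaction_volume_ul):
    """Volume of stock (in microliters) needed for the final concentration (C1*V1 = C2*V2)."""
    if stock_u_per_ml <= 0:
        raise ValueError("stock concentration must be positive")
    return final_u_per_ml * reaction_volume_ul / stock_u_per_ml


# Assumed reaction volume of 50 uL; stock and final concentrations are from the text.
volume = stock_volume_ul(stock_u_per_ml=1000, final_u_per_ml=2, reaction_volume_ul=50)
units = 2 * (50 / 1000)  # U/mL times mL gives the enzyme units in the reaction
print(f"{volume:.2f} uL of stock, {units:.2f} U per reaction")
```

Under this assumption the stock volume is far below what can be pipetted accurately, so in practice the enzyme would presumably be added from an intermediate dilution.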
Detection of off-target mutations
Potential off-target sites for sgRNA24, 28, 34 and 38 were predicted with the online tool CRISPR-P (see Liu et al. (2017). CRISPR-P 2.0: An Improved CRISPR-Cas9 Tool for Genome Editing in Plants. Mol Plant 10, 530-532). Locus-specific primers were designed for these sites to generate PCR products of about 150 to 250 bp. In the first round of PCR, the genomic regions flanking the on-target and off-target sites were amplified with the specific primers. The resulting PCR products were used as templates for the second round of PCR, in which barcodes were added to both ends of the products. The PCR products were then pooled in equal amounts for next-generation sequencing, and the on-target and potential off-target sites were examined for indels. Each amplicon was sequenced three times, using genomic DNA from three independent protoplast samples.
Example 2. Cas9 genome editing is more efficient in open chromatin regions of rice
Forty-one rice genes were edited with 70 sgRNAs using the CRISPR-Cas9 system (Table 2). Cas9 and the individual sgRNAs were introduced into rice callus by Agrobacterium-mediated transformation. Editing in the regenerated T0 plants was analyzed by PCR/RE and confirmed by Sanger sequencing. The indel frequencies induced by CRISPR-Cas9 varied widely among target sites (Table 1).
Table 1. Mutagenesis efficiency induced by CRISPR/Cas9 at different genomic sites in rice T0 plants
Figure PCTCN2020100664-appb-000001
Figure PCTCN2020100664-appb-000002
Figure PCTCN2020100664-appb-000003
Figure PCTCN2020100664-appb-000004
Whether indel frequency correlates with chromatin accessibility was then analyzed. Open chromatin is DNase I hypersensitive (DH), and a comprehensive DNase I hypersensitivity dataset is available for the rice genome. Using these data, the frequency of Cas9-induced indels at the tested target sites was found to be significantly higher at DH sites (Figure 1a), indicating that CRISPR-Cas9 activity in rice is affected by chromatin openness. To confirm that chromatin structure affects Cas9 editing in rice, a further 20 genes located in open and closed chromatin regions were tested, selected on the basis of the rice open chromatin map. Two sgRNAs were designed for each gene, one targeting the promoter and the other targeting an exon (Table 2).
Table 2. Information on the 40 selected target sites
Figure PCTCN2020100664-appb-000005
Figure PCTCN2020100664-appb-000006
Cas9 and each of these sgRNAs were introduced into rice protoplasts, and the indel frequencies at all 40 target sites were measured by targeted deep sequencing (Figure 1b). The results confirmed that editing efficiency was higher in open chromatin regions than in closed chromatin regions (Figure 1c).
To rule out a possible effect of spacer sequence composition on editing efficiency, five independent spacers (sgRNA A to E) were identified, each of which has a matching sequence in both an open and a closed chromatin region (Table 3).
Table 3. Selected sgRNAs, each targeting two genomic sites with opposite chromatin states.
Figure PCTCN2020100664-appb-000007
Pairwise comparison of the indel frequencies at these sites showed that Cas9 activity in open chromatin regions was up to 13.4-fold higher than in closed chromatin regions, and that the indel frequencies induced by different sgRNAs varied widely (Figures 2a, 2b). Interestingly, when PCR products or chromatin-free DNA were targeted in vitro, Cas9 edited all of these target sites with nearly identical efficiency (Figure 2c). In addition, the indel patterns produced at the paired target sites were similar (Figure 2d). Taken together, these results indicate that CRISPR-Cas9 genome editing in rice cells is more efficient in open chromatin regions than in closed chromatin regions.
Example 3. Fusion with a synthetic transcription activation domain increases the editing activity of Cas9 in rice
A synthetic transcription activation domain (hereinafter TV), containing 6 copies of TALE (transcription activator-like effector)-TAD (transcription activation domain) and 8 copies of VP16, was fused to the C-terminus of Cas9 to generate Cas9-TV (Figure 3a). The genome editing efficiency of Cas9-TV was examined in rice protoplasts with 20 sgRNAs targeting different chromatin regions (Table 3).
The results showed that the indel frequencies at the target sites induced by Cas9 and Cas9-TV were 1.95% to 29.56% and 3.81% to 44.85%, respectively (Figure 3b), and that the genome editing efficiency of Cas9-TV was higher than that of Cas9 at all tested sites (Figure 3c). On average, the indel frequencies induced by Cas9-TV were 1.87-fold and 1.44-fold those of Cas9 in open and closed chromatin regions, respectively (Figures 3d, 3e).
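A fold change of this kind is the ratio of the Cas9-TV indel frequency to the Cas9 indel frequency at the same site, averaged over a group of sites. The sketch below shows one way to compute it. The per-site values are invented for illustration, and averaging per-site ratios (rather than taking a ratio of averages) is an assumption, since the source does not say which was used.

```python
from statistics import mean


def fold_changes(pairs):
    """Per-site ratio of the Cas9-TV indel frequency to the Cas9 indel frequency.

    pairs is a list of (cas9_percent, cas9_tv_percent) tuples, one per target site.
    """
    ratios = []
    for cas9, cas9_tv in pairs:
        if cas9 <= 0:
            raise ValueError("Cas9 indel frequency must be positive to form a ratio")
        ratios.append(cas9_tv / cas9)
    return ratios


# Hypothetical indel frequencies (%) for two sites in one chromatin class.
sites = [(10.0, 20.0), (5.0, 7.5)]
ratios = fold_changes(sites)
print(ratios, mean(ratios))
```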
The indel patterns generated by Cas9-TV and Cas9 were also found to be similar (Figure 10). These data indicate that the in vivo editing activity of Cas9-TV is increased at target sites in both open and closed chromatin regions.
Example 4. Proximal targeting with dsgRNA improves genome editing
Twenty sgRNA target sites in the rice genome were used (Table 2), and a dsgRNA was designed to target a proximal site near each of them (Table 4).
Table 4. Selected sgRNAs and the positions targeted by their corresponding proximal dsgRNAs
sgRNA (a)    dsgRNA targeting sequence (b)    Distance (c)
sgRNA 2      GACATCATCTGGCAGGG                50 bp
sgRNA 4      TGCAGGCTTCACGACGG                32 bp
sgRNA 6      TGACCTGATGCCCAAGG                55 bp
sgRNA 8      GCGCTGGTGCTTGCTGG                57 bp
sgRNA 10     CTTCGCGCGCTCCATGG                35 bp
sgRNA 12     GGCGTGGGCAAGAGCGG                39 bp
sgRNA 14     TACAAGCTCAAGCTCGG                50 bp
sgRNA 16     GGACCTTGGACTCGAGG                55 bp
sgRNA 18     ACCTGATTGGGTGAAGG                60 bp
sgRNA 20     TATGGTAGCGAGCGTGG                68 bp
sgRNA 22     AACAGCTAGGCTCTTGG                39 bp
sgRNA 24     ACTGCAGGCGCTGCAGG                59 bp
sgRNA 26     ACTCATCGGTGTGTAGG                92 bp
sgRNA 28     GTTGATGGACGAGGTGG                61 bp
sgRNA 30     AGCAGCACGTGCCTCGG                62 bp
sgRNA 32     GGCCAACTGAACGACGG                56 bp
sgRNA 34     GGCCACGTCGCTCGCGG                55 bp
sgRNA 36     CCGATGCAGCCCACCGG                66 bp
sgRNA 38     GCGCATTAGACCAAGGG                83 bp
sgRNA 40     GGCGCGACCAACCACGG                40 bp
a: sgRNAs are the same as in Table 2; b: 14-nt guide sequence + PAM; c: distance in bp between the dsgRNA target site and the sgRNA target site.
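Each dsgRNA targeting sequence in Table 4 is a 14-nt guide followed by a 3-nt PAM. The sketch below splits a few of the table entries into those two parts and checks the PAM against the NGG pattern. The NGG pattern is an assumption (it is the PAM of Streptococcus pyogenes Cas9; the table itself says only "PAM"), and the function names are illustrative.

```python
import re

# Three entries copied from Table 4: name -> (targeting sequence, distance in bp).
TABLE4_EXAMPLES = {
    "sgRNA 2": ("GACATCATCTGGCAGGG", 50),
    "sgRNA 4": ("TGCAGGCTTCACGACGG", 32),
    "sgRNA 26": ("ACTCATCGGTGTGTAGG", 92),
}


def split_guide_and_pam(target, guide_len=14):
    """Split a targeting sequence into its guide and its 3-nt PAM."""
    target = target.upper()
    if len(target) != guide_len + 3:
        raise ValueError(f"expected {guide_len + 3} nt, got {len(target)}")
    return target[:guide_len], target[guide_len:]


def has_ngg_pam(pam):
    """True if the PAM matches NGG (assumed SpCas9-style PAM)."""
    return re.fullmatch(r"[ACGT]GG", pam) is not None


for name, (seq, distance) in TABLE4_EXAMPLES.items():
    guide, pam = split_guide_and_pam(seq)
    print(name, guide, pam, has_ngg_pam(pam), f"{distance} bp")
```

The 14-nt guide length is what distinguishes a dsgRNA here: no indels were detected at the dsgRNA target sites in this document, consistent with the truncated guide directing binding without cleavage.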
The distance between the sgRNA target site and the dsgRNA binding site ranged from 32 to 92 bp. When the dsgRNA was combined with the sgRNA and introduced into rice protoplasts together with Cas9-TV or Cas9, the proximal dsgRNA increased editing efficiency at all target sites compared with the sgRNA alone (Figure 4a). On average, the indel frequency obtained with Cas9-TV combined with a proximal dsgRNA was 1.5-fold higher than with Cas9-TV alone and 2.5-fold higher than with Cas9 (Figure 4b).
In addition, no indels were detected at the dsgRNA target sites (Figure 12).
Proximal dsgRNAs promoted Cas9-TV editing in both open and closed chromatin regions (Figures 4c, 4d) and did not affect the pattern of Cas9-TV-induced indels (Figure 11).
To optimize proximal dsgRNA targeting, dsgRNA 1, 2 and 6 and dsgRNA 3, 4 and 5 were designed to target sites on either side of the PAM sequence of sgRNA34 (Table 5) (Figure 13).
Table 5. dsgRNA targeting sequences and their distances from the sgRNA34 target site
Figure PCTCN2020100664-appb-000008
a: 14-nt guide sequence + PAM; b: distance in bp between the dsgRNA target site and the sgRNA target site.
The distances between the dsgRNA and sgRNA binding sites ranged from 47 to 266 bp (Figure 5). Each dsgRNA or dsgRNA pair was co-transformed into rice protoplasts with Cas9-TV and the corresponding sgRNA, and indel frequencies were measured by targeted deep sequencing.
The results showed that all dsgRNAs enhanced editing, but dsgRNA4, which targets a site 117 bp from the cleavage site, had the greatest effect (Figure 5). The results also showed that the position of the dsgRNA relative to the PAM (downstream versus upstream) did not significantly affect editing efficiency (Figure 5).
In addition, using a pair of dsgRNAs instead of a single dsgRNA did not further increase Cas9-TV-mediated editing (Figure 5). Similar results were obtained for Cas9-mediated editing (Figure 7).
Example 5. Cas9-TV together with a proximal dsgRNA increases chromatin accessibility
Chromatin accessibility at sites 26, 28 and 34 was measured by DNase I digestion assay to determine whether binding of Cas9-TV and the dsgRNA alters the chromatin structure of the target region. The results showed that Cas9-TV plus dsgRNA markedly increased chromatin accessibility at each site (Figure 8). These results indicate that Cas9-TV/dsgRNA can increase the chromatin accessibility of target sites in vivo.
Example 6. Neither TV nor the proximal dsgRNA increases the off-target activity of Cas9
Off-target effects of Cas9-TV and Cas9-TV/dsgRNA were assessed by sequencing targeted amplicons of the on-target and off-target sites of sgRNA 24, 28, 34 and 38 and measuring the indel frequencies.
Three potential off-target (OT) sites with 2 to 4 mismatches were identified for each of sgRNA 24 and sgRNA 28, four for sgRNA 38, and five for sgRNA 34 (Table 6).
Table 6. Potential off-target sites identified in the rice genome for the four sgRNAs
Target site    Sequence (a)               Target gene locus
Site 24        ACGGCCGCCTCCGTACGCCGCGG    LOC_Os04g18650
OT24-1         ACGGCCGCtTCCGcACGCCGCGG    LOC_Os03g05590
OT24-2         cCGctCGCCcCCGTACGCCGCGG    LOC_Os06g11400
OT24-3         gCGGCCGCggCCGTACGCtGGGG    LOC_Os01g73410
Site 28        GTCTTTGGACGTAGCCATGGTGG    LOC_Os04g12220
OT28-1         GTCTTTGcACaTAGCCATGGCGG    LOC_Os05g04110
OT28-2         GTCTTTtGAtGcAGCaATGGAGG    LOC_Os01g56140
OT28-3         GTtTTTGGACtTAGCCAaGGAGG    LOC_Os04g57390
Site 34        AGACATCGTCACCAAGGCGCAGG    LOC_Os11g08760
OT34-1         cGACgcCGaCACCAAGGCGCTGG    LOC_Os04g56110
OT34-2         gGACgTCcTCgCCAAGGCGCAGG    LOC_Os09g38050
OT34-3         gGACATCGTCgtCgAGGCGCTGG    LOC_Os04g32010
OT34-4         cGACgTCGTgACCAAGGtGCCGG    LOC_Os11g04940
OT34-5         AGtCATCcTCAaCAAGGCcCAGG    LOC_Os02g14059
Site 38        TGGGTAATGGTGATATCCCATGG    LOC_Os09g24280
OT38-1         TaGGTgATGaTGATATaCCAAGG    LOC_Os12g29220
OT38-2         TaGGTAgTtGTGATATCaCAGGG    LOC_Os12g39430
OT38-3         TGGGTgATGaTGATATCCatCGG    LOC_Os03g37411
OT38-4         TatGTgATGGTGATATCCtACGG    LOC_Os12g40790
a: Protospacer bases that differ from the on-target sequence are shown in lowercase; the last three nucleotides of each sequence are the PAM.
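The mismatches in Table 6 can be recovered by comparing the first 20 nt (the protospacer) of each off-target sequence with the on-target sequence, position by position. The sketch below does this for site 24 and its three off-target sites, using sequences copied from the table. The function name and the 20-nt protospacer length are illustrative assumptions, and this is not the scoring method used by CRISPR-P.

```python
def mismatch_positions(on_target, off_target, protospacer_len=20):
    """1-based protospacer positions where the off-target differs from the on-target."""
    a = on_target.upper()[:protospacer_len]
    b = off_target.upper()[:protospacer_len]
    if len(a) != protospacer_len or len(b) != protospacer_len:
        raise ValueError("sequences must contain a full-length protospacer")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Sequences copied from Table 6 (20-nt protospacer + 3-nt PAM).
SITE_24 = "ACGGCCGCCTCCGTACGCCGCGG"
OFF_TARGETS_24 = {
    "OT24-1": "ACGGCCGCTTCCGCACGCCGCGG",
    "OT24-2": "CCGCTCGCCCCCGTACGCCGCGG",
    "OT24-3": "GCGGCCGCGGCCGTACGCTGGGG",
}

for name, seq in OFF_TARGETS_24.items():
    positions = mismatch_positions(SITE_24, seq)
    print(name, len(positions), positions)
```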
At all on-target sites, Cas9-TV had higher on-target activity than Cas9 (Figure 9).
On the other hand, at the OT24-2 site of sgRNA24 and the OT34-1 site of sgRNA34, Cas9, Cas9-TV and Cas9-TV/dsgRNA induced indels at similar frequencies. At sites OT24-1 and OT24-3 of sgRNA24, sites OT28-1 and OT28-2 of sgRNA28, sites OT34-2, OT34-3, OT34-4 and OT34-5 of sgRNA34, and all off-target sites of sgRNA38, none of the nucleases induced a significant number of indels. Surprisingly, at the OT28-3 site, the indel frequencies induced by Cas9-TV and Cas9-TV/dsgRNA were lower than that induced by Cas9 (Figure 9).
These results indicate that the combination of TV and a proximal dsgRNA does not alter the off-target activity of Cas9.

Claims (10)

  1. A genome editing fusion polypeptide comprising a CRISPR nuclease domain and a transcription activation domain (TAD), wherein preferably the transcription activation domain is fused to the C-terminus of the CRISPR nuclease domain.
  2. The genome editing fusion polypeptide of claim 1, wherein the CRISPR nuclease is Cas9 or Cpf1.
  3. The genome editing fusion polypeptide of claim 1 or 2, wherein the transcription activation domain comprises one or more VP16-TADs.
  4. The genome editing fusion polypeptide of any one of claims 1-3, wherein the transcription activation domain comprises one or more TALE-TADs.
  5. The genome editing fusion polypeptide of any one of claims 1-4, wherein the transcription activation domain comprises the amino acid sequence of SEQ ID NO: 1.
  6. The genome editing fusion polypeptide of any one of claims 1-5, further comprising one or more nuclear localization sequences, preferably two, wherein preferably one nuclear localization sequence is located at the N-terminus of the CRISPR nuclease domain and one nuclear localization sequence is located between the C-terminus of the CRISPR nuclease domain and the N-terminus of the transcription activation domain.
  7. An improved genome editing system comprising at least one of the following i) to v):
    i) the genome editing fusion polypeptide of any one of claims 1-6, and a guide RNA;
    ii) an expression construct comprising a polynucleotide encoding the genome editing fusion polypeptide of any one of claims 1-6, and a guide RNA;
    iii) the genome editing fusion polypeptide of any one of claims 1-6, and an expression construct comprising a nucleotide sequence encoding a guide RNA;
    iv) an expression construct comprising a polynucleotide encoding the genome editing fusion polypeptide of any one of claims 1-6, and an expression construct comprising a nucleotide sequence encoding a guide RNA;
    v) an expression construct comprising a polynucleotide encoding the genome editing fusion polypeptide of any one of claims 1-6 and a nucleotide sequence encoding a guide RNA.
  8. The genome editing system of claim 7, wherein the guide RNA is an sgRNA, and preferably the sgRNA targets a closed chromatin region.
  9. The genome editing system of claim 8, further comprising or encoding a dsgRNA whose target site is 30-300 bp, preferably 40-270 bp, most preferably 115-120 bp, from the site targeted by the sgRNA.
  10. A method for genetically modifying a cell, comprising introducing the genome editing system of any one of claims 7-9 into the cell, preferably a plant cell.
PCT/CN2020/100664 2019-07-08 2020-07-07 Improved genome editing system and use thereof WO2021004456A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910611416.X 2019-07-08
CN201910611416.XA CN112266418A (en) 2019-07-08 2019-07-08 Improved genome editing system and application thereof

Publications (1)

Publication Number Publication Date
WO2021004456A1 true WO2021004456A1 (en) 2021-01-14

Family

ID=74114361

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/100664 WO2021004456A1 (en) 2019-07-08 2020-07-07 Improved genome editing system and use thereof

Country Status (2)

Country Link
CN (1) CN112266418A (en)
WO (1) WO2021004456A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114438127B (en) * 2022-03-02 2024-03-19 苏州科锐迈德生物医药科技有限公司 Recombinant nucleic acid molecule and application thereof in preparation of circular RNA
CN114686456B (en) * 2022-05-10 2023-02-17 中山大学 Base editing system based on bimolecular deaminase complementation and application thereof

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105658805A (en) * 2013-06-05 2016-06-08 杜克大学 Rna-guided gene editing and gene regulation
CN107722125A (en) * 2017-09-28 2018-02-23 中山大学 A kind of efficient manual transcription activity factor dCas9 TV and its encoding gene and application
WO2018148256A1 (en) * 2017-02-07 2018-08-16 The Regents Of The University Of California Gene therapy for haploinsufficiency
WO2019040650A1 (en) * 2017-08-23 2019-02-28 The General Hospital Corporation Engineered crispr-cas9 nucleases with altered pam specificity

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107794272B (en) * 2016-09-06 2021-10-12 中国科学院上海营养与健康研究所 High-specificity CRISPR genome editing system
AU2017358264A1 (en) * 2016-11-14 2019-02-21 Institute Of Genetics And Developmental Biology, Chinese Academy Of Sciences A method for base editing in plants

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
GUANWEN LIU, YIN KANGQUAN, ZHANG QIANWEI, GAO CAIXIA, QIU JIN-LONG: "Modulating chromatin accessibility by transactivation and targeting proximal dsgRNAs enhances Cas9 editing efficiency in vivo", GENOME BIOLOGY, vol. 20, no. 1, 26 July 2019 (2019-07-26), pages 1 - 11, XP055771921, ISSN: 1465-6906, DOI: 10.1186/s13059-019-1762-8 *

Also Published As

Publication number Publication date
CN112266418A (en) 2021-01-26

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20836214

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 20836214

Country of ref document: EP

Kind code of ref document: A1