WO2021023307A1 - CRISPR/Cas9 gene editing system and application thereof - Google Patents

CRISPR/Cas9 gene editing system and application thereof

Info

Publication number
WO2021023307A1
Authority
WO
WIPO (PCT)
Prior art keywords
seq
protein
cas9
cells
sgrna
Prior art date
Application number
PCT/CN2020/107880
Other languages
English (en)
French (fr)
Inventor
王永明
胡子英
王大奇
王帅
Original Assignee
复旦大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN201910731402.1A (CN110551761B)
Priority claimed from CN201910731803.7A (CN110499335B)
Priority claimed from CN201910731802.2A (CN110551763B)
Priority claimed from CN201910731412.5A (CN110577972B)
Priority claimed from CN201910731390.2A (CN110551760B)
Priority claimed from CN201910731396.XA (CN110577969B)
Priority claimed from CN201910731401.7A (CN110577971B)
Priority claimed from CN201910731398.9A (CN110577970B)
Priority claimed from CN201910731794.1A (CN110551762B)
Priority claimed from CN201910731795.6A (CN110499334A)
Application filed by 复旦大学
Priority to US17/633,354 (US20240175055A1)
Priority to EP20849939.2A (EP4012037A1)
Priority to JP2022507560A (JP2022543451A)
Publication of WO2021023307A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/87 Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/90 Stable introduction of foreign DNA into chromosome
    • C12N15/902 Stable introduction of foreign DNA into chromosome using homologous recombination
    • C12N15/907 Stable introduction of foreign DNA into chromosome using homologous recombination in mammalian cells
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/102 Mutagenizing nucleic acids
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C12N15/86 Viral vectors
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2750/00 ssDNA viruses
    • C12N2750/00011 Details
    • C12N2750/14011 Parvoviridae
    • C12N2750/14111 Dependovirus, e.g. adenoassociated viruses
    • C12N2750/14141 Use of virus, viral particle or viral elements as a vector
    • C12N2750/14143 Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2800/00 Nucleic acids vectors
    • C12N2800/80 Vectors containing sites for inducing double-stranded breaks, e.g. meganuclease restriction sites

Definitions

  • the invention belongs to the technical field of gene editing, and specifically relates to a CRISPR/Cas9 system capable of performing gene editing in cells and its related applications.
  • CRISPR/Cas9 is an adaptive immune system evolved by bacteria and archaea to resist the invasion of foreign viruses or plasmids.
  • crRNA: CRISPR-derived RNA
  • tracrRNA: trans-activating RNA
  • the Cas9 protein recognizes the PAM (protospacer adjacent motif) sequence of the target site, and the crRNA forms a complementary structure with the target DNA sequence
  • Cas9 protein performs the function of cutting DNA, causing DNA breakage and damage.
  • tracrRNA and crRNA can be fused into a single guide RNA (sgRNA) through a connecting sequence.
  • NHEJ: non-homologous end-joining
  • HR: homologous recombination
  • In addition to basic scientific research, CRISPR/Cas9 has a wide range of clinical applications. When the CRISPR/Cas9 system is used for gene therapy, Cas9 and the sgRNA must be introduced into the body. At present, the most effective delivery vector for gene therapy is the AAV virus. However, the DNA packaged by an AAV virus generally does not exceed 4.5 kb. SpCas9 is widely used because of its simple PAM sequence (it recognizes NGG) and high activity. However, the SpCas9 protein itself has 1368 amino acids and, together with the sgRNA and promoters, cannot be effectively packaged into the AAV virus, which limits its clinical application.
  • To overcome this problem, several small Cas9 proteins were developed, including SaCas9 (PAM sequence NNGRRT), St1Cas9 (PAM sequence NNAGAW), NmCas9 (PAM sequence NNNNGATT), Nme2Cas9 (PAM sequence NNNNCC) and CjCas9 (PAM sequence NNNNRYAC).
  • these Cas9 proteins are either prone to off-target activity (i.e., cleavage at non-targeted sites), have complicated PAM sequences, or have low editing activity, so they are difficult to apply widely.
  • the purpose of the present invention is to provide a new CRISPR/Cas9 gene editing system with high editing activity, high specificity, small Cas9 protein, and simple PAM sequence and its application.
  • the present invention provides a CRISPR/Cas9 gene editing system for gene editing in cells or in vitro, characterized in that the CRISPR/Cas9 system is a complex of Cas9 protein and sgRNA that can accurately locate and cut the targeted DNA sequence, causing double-strand break damage to the targeted DNA sequence;
  • the Cas9 protein is:
  • SauriCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 1,
  • ShaCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 2,
  • SlugCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 3,
  • SlutCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 4,
  • Sa-SauriCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 5,
  • Sa-SepCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 6,
  • Sa-SeqCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 7,
  • Sa-ShaCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 8,
  • Sa-SlugCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 9,
  • Sa-SlutCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 10, or
  • SlugCas9-HF protein, which has the amino acid sequence shown in SEQ ID NO: 58, or
  • the Cas9 protein has an amino acid sequence that is at least 80% identical to the amino acid sequence shown in any one of SEQ ID NO: 1-10 and SEQ ID NO: 58;
  • the sgRNA has the nucleotide sequence shown in SEQ ID NO: 11, or is an sgRNA sequence modified based on SEQ ID NO: 11.
  • the present invention provides a method for gene editing in a cell using the CRISPR/Cas9 gene editing system of the first aspect of the present invention, in which the complex of Cas9 protein and sgRNA recognizes and locates the targeted DNA sequence in order to edit it, the method including the following steps:
  • a humanized Cas9 gene sequence for example, a nucleotide sequence shown in any one of SEQ ID NO: 23-32 and SEQ ID NO: 112;
  • the reverse-strand sequence of the oligonucleotide is ligated into a restriction site of the expression vector into which the Cas9 gene sequence has been cloned, such as the BsaI restriction site of the plasmid pAAV2_Cas9_U6_BsaI, to obtain an expression vector that expresses the Cas9 protein and the sgRNA, such as pAAV2_Cas9-hU6-sgRNA, wherein the sgRNA has the nucleotide sequence shown in SEQ ID NO: 11, or a nucleotide sequence that is at least 80% identical to the nucleotide sequence shown in SEQ ID NO: 11,
  • the expression vector expressing the Cas9 protein and the sgRNA is delivered to the cell containing the target site, so as to realize the editing of the target site.
  • the present invention provides a kit of CRISPR/Cas9 gene editing system for gene editing, the kit comprising:
  • Cas9 protein and sgRNA wherein the Cas9 protein is:
  • SauriCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 1,
  • ShaCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 2,
  • SlugCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 3,
  • SlutCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 4,
  • Sa-SauriCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 5,
  • Sa-SepCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 6,
  • Sa-SeqCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 7,
  • Sa-ShaCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 8,
  • Sa-SlugCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 9,
  • Sa-SlutCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 10, or
  • SlugCas9-HF protein, which has the amino acid sequence shown in SEQ ID NO: 58, or
  • the Cas9 protein has an amino acid sequence that is at least 80% identical to the amino acid sequence shown in any one of SEQ ID NO: 1-10 and SEQ ID NO: 58;
  • the sgRNA has the nucleotide sequence shown in SEQ ID NO: 11, or is an sgRNA sequence modified based on SEQ ID NO: 11;
  • (c) is a humanized Cas9 gene sequence, for example, a nucleotide sequence shown in any one of SEQ ID NO: 23-32 and SEQ ID NO: 112, and
  • the sgRNA has the nucleotide sequence shown in SEQ ID NO: 11, or is an sgRNA sequence modified based on SEQ ID NO: 11.
  • the present invention provides use of the CRISPR/Cas9 gene editing system of the first aspect of the present invention in gene knockout, site-directed base changes, site-directed insertion, regulation of gene transcription levels, regulation of DNA methylation, and DNA acetylation.
  • the CRISPR/Cas9 gene editing system of the present invention contains a smaller Cas9 protein. Compared with the prior art, it has fewer amino acids, so it can be packaged effectively; in addition, the PAM sequence targeted by the CRISPR/Cas9 gene editing system of the present invention is simpler, so that more DNA sequences in the genome can be targeted, and the editing efficiency is higher.
  • Figure 1 is a schematic diagram of the CRISPR/Cas9 gene editing system cutting the target DNA sequence.
  • the gray oval represents the Cas9 protein
  • the black curved shape represents the sgRNA sequence
  • the darkened area in the upper chain of the genome represents the PAM sequence.
  • Figure 2 is a schematic map of plasmid pAAV2_Cas9_U6_BsaI, which contains elements such as the AAV2 ITR, CMV enhancer, CMV promoter, SV40 NLS, Cas9, nucleoplasmin NLS, 3x HA, bGH poly(A), human U6 promoter (hU6), BsaI endonuclease site and sgRNA scaffold sequence.
  • Figures 3a to 3j are the partial second-generation sequencing results after the target site DNA sequence has been edited.
  • the edited results contain deletions, insertions or mismatches, and the final 4 bp or 5 bp represent the PAM sequence.
  • the PAM sequences in Figures 3a to 3j are, respectively, NNGG, NNGRM, NNGG, NNGR, NNGG, NNGG, NNGRM, NNGRM, NNGG and NNGRR;
  • Figure 3k shows editing by the SlugCas9-HF gene editing system at two target sites, where the X axis represents the two target sites G4 and G7 and the Y axis represents indel efficiency.
  • Figure 4a to Figure 4j are the results of T7 Endonuclease I digestion at the endogenous site, where the arrow indicates the size of the cut fragment;
  • Figure 4k is the specificity detection result of the SlugCas9-HF gene editing system in the GFP reporter system cell line.
  • the upper center shows a schematic diagram of the GFP reporter system.
  • a specific target DNA sequence and PAM are inserted between the initiation codon ATG and the GFP coding sequence, resulting in a GFP frameshift mutation.
  • the cell's own repair system restores the GFP reading frame in some cells, which then produce green fluorescence.
  • the Y-axis of the histogram in the figure represents the GFP-positive ratio
  • the X-axis represents the sequence of On-target sgRNA and mismatch sgRNA.
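The frameshift logic of the GFP reporter described for Figure 4k can be illustrated with a short sketch. It assumes that GFP is read in frame only when the total length of the sequence between the ATG and the GFP coding sequence is a multiple of three; the insert length of 25 bp and the function name are hypothetical and are not taken from the patent.

```python
def restores_reading_frame(insert_len: int, indel_delta: int) -> bool:
    """Return True if the repaired insert leaves GFP in frame with the ATG.

    insert_len  -- length in bp of the target + PAM cassette placed between
                   the ATG and the GFP coding sequence (hypothetical value)
    indel_delta -- net length change after repair: +n for an n-bp insertion,
                   -n for an n-bp deletion
    """
    return (insert_len + indel_delta) % 3 == 0


# Hypothetical cassette of 25 bp: 25 is not a multiple of 3, so the
# unedited reporter is frameshifted and should not fluoresce.
INSERT_LEN = 25

for delta in (-2, -1, 0, 1, 2, 3):
    state = "in frame" if restores_reading_frame(INSERT_LEN, delta) else "frameshifted"
    print(f"indel {delta:+d} bp: {state}")
```

Under this assumption only a subset of repair outcomes restores fluorescence, which is consistent with the statement that the reading frame is restored in some cells rather than in all edited cells.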
  • the existing CRISPR/Cas9 gene editing system has various problems.
  • the Cas9 protein is too large, so that the system cannot be effectively packaged into a vector such as a virus.
  • the currently targeted PAM sequence is relatively complex, which results in a small editing range and is difficult to be widely used.
  • the editing activity of the currently available small Cas9 proteins is generally low.
  • the purpose of the present invention is to provide a new CRISPR/Cas9 gene editing system with high editing activity, high specificity, small Cas9 protein, and simple PAM sequence and its application.
  • the present invention provides a CRISPR/Cas9 gene editing system, the gene editing being performed in cells or in vitro, wherein the CRISPR/Cas9 system is a complex of Cas9 protein and sgRNA that can accurately locate and cut the targeted DNA sequence and cause double-strand break damage to the targeted DNA sequence;
  • the Cas9 protein is:
  • SauriCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 1,
  • ShaCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 2,
  • SlugCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 3,
  • SlutCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 4,
  • Sa-SauriCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 5,
  • Sa-SepCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 6,
  • Sa-SeqCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 7,
  • Sa-ShaCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 8,
  • Sa-SlugCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 9,
  • Sa-SlutCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 10, or
  • SlugCas9-HF protein, which has the amino acid sequence shown in SEQ ID NO: 58, or
  • the Cas9 protein has an amino acid sequence that is at least 80% identical to the amino acid sequence shown in any one of SEQ ID NOs: 1-10 and 58;
  • the sgRNA has the nucleotide sequence shown in SEQ ID NO: 11, or is an sgRNA sequence modified based on SEQ ID NO: 11.
  • sequences of SEQ ID NO: 1-11 and SEQ ID NO: 58 are as follows:
  • SEQ ID NO: 1 Amino acid sequence of SauriCas9 protein:
  • SEQ ID NO: 3 Amino acid sequence of SlugCas9 protein:
  • SEQ ID NO: 4 Amino acid sequence of SlutCas9 protein:
  • SEQ ID NO: 5 The amino acid sequence of Sa-SauriCas9 protein:
  • SEQ ID NO: 6 Amino acid sequence of Sa-SepCas9 protein:
  • SEQ ID NO: 7 Amino acid sequence of Sa-SeqCas9 protein:
  • SEQ ID NO: 8 Amino acid sequence of Sa-ShaCas9 protein:
  • SEQ ID NO: 9 The amino acid sequence of Sa-SlugCas9 protein:
  • SEQ ID NO: 11 The nucleotide sequence of sgRNA:
  • SEQ ID NO: 58 Amino acid sequence of SlugCas9-HF protein:
  • the present inventors have discovered a variety of Cas9 proteins that can complex with single-stranded guide RNA (sgRNA).
  • sgRNA single-stranded guide RNA
  • the CRISPR/SauriCas9 gene editing system is the system in which the SauriCas9 protein and single-stranded guide RNA (sgRNA) work together to achieve gene editing.
  • the complexes formed by other Cas9 proteins and sgRNA can be named in a similar way, such as CRISPR/ShaCas9 gene editing system, CRISPR/SlugCas9 gene editing system, and so on.
  • All Cas9 proteins of the present invention are very small, with fewer than 1,100 amino acids.
  • the SauriCas9 protein has 1061 amino acids
  • the ShaCas9, Sa-SepCas9, Sa-ShaCas9 and Sa-SlugCas9 proteins each have 1055 amino acids
  • the Sa-SeqCas9 protein has 1053 amino acids
  • the SlugCas9, SlugCas9-HF and SlutCas9 proteins each have 1054 amino acids
  • the Sa-SauriCas9 and Sa-SlutCas9 proteins each have 1056 amino acids.
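To make the packaging argument concrete, the following sketch converts the amino acid counts listed above into approximate coding lengths and compares them with the roughly 4.5 kb AAV capacity mentioned earlier. It is a back-of-the-envelope estimate: it counts three bases per residue plus a stop codon and ignores the NLS, tag, promoter, poly(A) and sgRNA cassette sequences, whose sizes are not given here. The function names are illustrative only.

```python
AAV_LIMIT_BP = 4500  # "generally does not exceed 4.5 kb" per the text

# Amino acid counts as stated in the text (SpCas9 included for comparison).
PROTEIN_LENGTHS_AA = {
    "SpCas9": 1368,
    "SauriCas9": 1061,
    "ShaCas9": 1055,
    "Sa-SepCas9": 1055,
    "Sa-ShaCas9": 1055,
    "Sa-SlugCas9": 1055,
    "Sa-SeqCas9": 1053,
    "SlugCas9": 1054,
    "SlugCas9-HF": 1054,
    "SlutCas9": 1054,
    "Sa-SauriCas9": 1056,
    "Sa-SlutCas9": 1056,
}


def coding_length_bp(n_aa: int) -> int:
    """Open reading frame length: three bases per residue plus one stop codon."""
    return 3 * (n_aa + 1)


def remaining_capacity_bp(n_aa: int, limit_bp: int = AAV_LIMIT_BP) -> int:
    """Space left for promoters, sgRNA cassette and other elements."""
    return limit_bp - coding_length_bp(n_aa)


for name, n_aa in PROTEIN_LENGTHS_AA.items():
    print(f"{name:13s} ORF {coding_length_bp(n_aa)} bp, "
          f"remaining {remaining_capacity_bp(n_aa)} bp")
```

On this estimate the SpCas9 reading frame alone leaves only a few hundred base pairs for everything else, whereas the proteins of about 1055 residues leave more than 1.3 kb.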
  • the Cas9 protein of the present invention has an amino acid sequence that is at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical to any one of SEQ ID NO: 1-10 and SEQ ID NO: 58, or identical at any percentage from 80% to 100%.
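The identity threshold can be illustrated with a minimal sketch. The text does not say how percent identity is calculated, so the definition used here (identical, non-gap columns divided by the total number of alignment columns, on sequences that have already been aligned) is an assumption, and the short sequences are toy strings rather than any SEQ ID NO.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Assumed definition: identical non-gap columns / alignment length * 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True if the pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy example: one substitution in ten aligned positions.
reference = "ACDEFGHIKL"
variant = "ACDEFGHIKM"
print(percent_identity(reference, variant))
print(meets_threshold(reference, variant))
```

Other conventions (for example, dividing by the length of the shorter sequence) give different values for the same pair, so the convention should be fixed before comparing against the 80% cutoff.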
  • the cells include eukaryotic cells and prokaryotic cells
  • the eukaryotic cells include, for example, mammalian cells and plant cells
  • the mammalian cells include, for example, Chinese hamster ovary cells, baby hamster kidney cells, mouse Sertoli cells, mouse breast tumor cells, buffalo rat hepatocytes, rat hepatoma cells, monkey kidney CV1 lines transformed by SV40, monkey kidney cells, canine kidney cells, human cervical cancer cells, human lung cells, human liver cells, NIH/3T3 cells, human U2-OS osteosarcoma cells, human A549 cells, human K562 cells, human HEK293 cells, human HEK293T cells, human HCT116 cells, human MCF-7 cells or TRI cells, but are not limited thereto.
  • the CRISPR/Cas9 system includes Staphylococcus auricularis Cas9 (SauriCas9) protein, which has the amino acid sequence shown in SEQ ID NO: 1, and it works with single-stranded guide RNA (sgRNA) to achieve gene editing.
  • SauriCas9 Staphylococcus auricularis Cas9
  • the SauriCas9 protein is derived from Staphylococcus auricularis, and the UniProt accession number of the SauriCas9 protein is A0A2T4M4R5.
  • the SauriCas9 protein includes a SauriCas9 protein that has no cleavage activity, or only has single-stranded cleavage activity, or has double-stranded cleavage activity.
  • the CRISPR/Cas9 system includes Staphylococcus haemolyticus Cas9 (ShaCas9) protein, which has the amino acid sequence shown in SEQ ID NO: 2, and it works with single-stranded guide RNA (sgRNA) to realize gene editing.
  • ShaCas9 Staphylococcus haemolyticus Cas9
  • the ShaCas9 protein is derived from Staphylococcus haemolyticus, and the UniProt accession number of the ShaCas9 protein is A0A2T4SLN6.
  • the ShaCas9 protein includes ShaCas9 protein that has no cleavage activity, or only has single-stranded cleavage activity, or has double-stranded cleavage activity.
  • the CRISPR/Cas9 system includes Staphylococcus lugdunensis Cas9 (SlugCas9) protein, which has the amino acid sequence shown in SEQ ID NO: 3, and it works with single-stranded guide RNA (sgRNA) to realize gene editing.
  • SlugCas9 Staphylococcus lugdunensis Cas9
  • the SlugCas9 protein is derived from Staphylococcus lugdunensis, and the UniProt accession number of the SlugCas9 protein is A0A133QCR3.
  • the SlugCas9 protein includes a SlugCas9 protein that has no cleavage activity, or only has single-stranded cleavage activity, or has double-stranded cleavage activity.
  • the CRISPR/Cas9 system includes Staphylococcus lutrae Cas9 (SlutCas9) protein, which has the amino acid sequence shown in SEQ ID NO: 4, and it works with single-stranded guide RNA (sgRNA) to realize gene editing.
  • the SlutCas9 protein is derived from Staphylococcus lutrae, and the UniProt accession number of the SlutCas9 protein is A0A1W6BMI2.
  • the SlutCas9 protein includes a SlutCas9 protein that has no cleavage activity, or only has single-stranded cleavage activity, or has double-stranded cleavage activity.
  • the CRISPR/Cas9 system comprises Sa-SauriCas9 protein, which is a fusion protein obtained by replacing the PI domain of SaCas9 with the PI domain of SauriCas9, wherein SauriCas9 is Staphylococcus auricularis Cas9.
  • the Sa-SauriCas9 protein has the amino acid sequence shown in SEQ ID NO: 5.
  • Sa-SauriCas9 fusion protein and single-stranded guide RNA (sgRNA) work together to achieve gene editing.
  • the SauriCas9 protein is derived from Staphylococcus auricularis, and the UniProt accession number of the SauriCas9 protein is A0A2T4M4R5.
  • the Sa-SauriCas9 protein includes a Sa-SauriCas9 protein that has no cleavage activity, or only has single-stranded cleavage activity, or has double-stranded cleavage activity.
  • the CRISPR/Cas9 system includes a Sa-SepCas9 protein, which is a fusion protein obtained by replacing the PI domain of SaCas9 with the PI domain of SepCas9, wherein SepCas9 is Staphylococcus epidermidis Cas9.
  • the Sa-SepCas9 protein has an amino acid sequence shown in SEQ ID NO:6.
  • the Sa-SepCas9 fusion protein and single-stranded guide RNA (sgRNA) work together to realize gene editing.
  • the SepCas9 protein is derived from Staphylococcus epidermidis, the UniProt accession number of the SepCas9 protein is A0A1Q9MLU4, and the NCBI accession number is WP_075777761.1.
  • the Sa-SepCas9 protein includes a Sa-SepCas9 protein that has no cleavage activity, or only has single-stranded cleavage activity, or has double-stranded cleavage activity.
  • the CRISPR/Cas9 system comprises a Sa-SeqCas9 protein, which is a fusion protein obtained by replacing the PI domain of SaCas9 with the PI domain of SeqCas9, wherein SeqCas9 is Staphylococcus equorum Cas9.
  • the Sa-SeqCas9 protein has an amino acid sequence shown in SEQ ID NO:7.
  • the Sa-SeqCas9 fusion protein and single-stranded guide RNA (sgRNA) work together to realize gene editing.
  • the SeqCas9 protein is derived from Staphylococcus equorum, and the UniProt accession number of the SeqCas9 protein is A0A1E5TL62.
  • the Sa-SeqCas9 protein includes a Sa-SeqCas9 protein that has no cleavage activity, or only has single-stranded cleavage activity, or has double-stranded cleavage activity.
  • the CRISPR/Cas9 system includes Sa-ShaCas9 protein, which is a fusion protein in which the PI domain of SaCas9 is replaced with the PI domain of ShaCas9, wherein ShaCas9 is Staphylococcus haemolyticus Cas9.
  • the Sa-ShaCas9 protein has an amino acid sequence shown in SEQ ID NO: 8.
  • the Sa-ShaCas9 fusion protein and single-stranded guide RNA (sgRNA) work together to realize gene editing.
  • the ShaCas9 protein is derived from Staphylococcus haemolyticus, and the UniProt accession number of the ShaCas9 protein is A0A2T4SLN6.
  • the Sa-ShaCas9 protein includes a Sa-ShaCas9 protein that has no cleavage activity, or only has single-stranded cleavage activity, or has double-stranded cleavage activity.
  • the CRISPR/Cas9 system comprises a Sa-SlugCas9 protein
  • the Sa-SlugCas9 protein is a fusion protein obtained by replacing the PI domain of SaCas9 with the PI domain of SlugCas9, wherein the SlugCas9 protein is Staphylococcus lugdunensis Cas9.
  • the Sa-SlugCas9 protein has an amino acid sequence shown in SEQ ID NO: 9.
  • the Sa-SlugCas9 fusion protein and single-stranded guide RNA (sgRNA) work together to realize gene editing.
  • the SlugCas9 protein is derived from Staphylococcus lugdunensis, and the UniProt accession number of the SlugCas9 protein is A0A133QCR3.
  • the Sa-SlugCas9 protein includes a Sa-SlugCas9 protein that has no cleavage activity, or only has single-stranded cleavage activity, or has double-stranded cleavage activity.
  • the CRISPR/Cas9 system includes Sa-SlutCas9 protein, which is a fusion protein obtained by replacing the PI domain of SaCas9 with the PI domain of SlutCas9, wherein SlutCas9 is Staphylococcus lutrae Cas9.
  • the Sa-SlutCas9 protein has an amino acid sequence shown in SEQ ID NO: 10.
  • the Sa-SlutCas9 fusion protein and single-stranded guide RNA (sgRNA) work together to realize gene editing.
  • the SlutCas9 protein is derived from Staphylococcus lutrae, and the UniProt accession number of the SlutCas9 protein is A0A1W6BMI2.
  • the Sa-SlutCas9 protein includes a Sa-SlutCas9 protein that has no cleavage activity, or only has single-stranded cleavage activity, or has double-stranded cleavage activity.
  • the CRISPR/Cas9 system includes the SlugCas9-HF protein, an amino-acid-modified protein in which the R247A, N415A, T421A and R656A mutations are introduced into SlugCas9, wherein SlugCas9-HF is Staphylococcus lugdunensis Cas9-HiFi.
  • SlugCas9-HF and single-stranded guide RNA (sgRNA) work together to achieve gene editing.
  • the complex of the SlugCas9-HF protein and sgRNA has a low off-target rate and high specificity, and has a low tolerance for non-targeting DNA sequences, that is, non-targeting DNA sequences are not cut, or are substantially not cut.
  • the SlugCas9 protein is derived from Staphylococcus lugdunensis, and the UniProt accession number of the SlugCas9 protein is A0A133QCR3.
  • the SlugCas9-HF protein includes a SlugCas9-HF protein that has no cleavage activity, or only has single-stranded cleavage activity, or has double-stranded cleavage activity.
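The substitution notation used for SlugCas9-HF (R247A, N415A, T421A, R656A) reads as wild-type residue, 1-based position, replacement residue. The sketch below shows one way to apply such substitutions to a protein sequence while checking that the stated wild-type residue is actually present. The helper name and the five-residue toy sequence are hypothetical; the sequence is not SlugCas9, and the positions in the example are chosen only to fit the toy string.

```python
import re


def apply_substitutions(seq: str, mutations: list[str]) -> str:
    """Apply point substitutions written as <wild-type><position><new>, e.g. R247A.

    Positions are 1-based. A ValueError is raised if a mutation is malformed,
    out of range, or does not match the residue found in `seq`.
    """
    residues = list(seq)
    for mutation in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
        if match is None:
            raise ValueError(f"unrecognised mutation: {mutation}")
        wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{mutation}: position outside sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"{mutation}: expected {wild_type} at {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = new
    return "".join(residues)


# Toy 5-residue sequence (hypothetical, NOT SlugCas9).
toy = "MRNTR"
print(apply_substitutions(toy, ["R2A", "N3A", "T4A", "R5A"]))
```

Checking the wild-type residue before substituting guards against off-by-one numbering errors, which are easy to introduce when a tagged or truncated construct is numbered differently from the reference sequence.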
  • precisely locating the targeting DNA sequence includes a 20 bp or 21 bp sequence at the 5' end of the sgRNA forming a complementary base-pairing structure with the targeting DNA sequence.
  • precisely locating the targeting DNA sequence includes the complex of the Cas9 protein and sgRNA recognizing the PAM sequence on the targeting DNA sequence.
  • the 20 bp or 21 bp sequence at the 5' end of the sgRNA can form an incomplete complementary base-pairing structure with a non-targeting DNA sequence.
  • the incomplete complementary base-pairing structure includes a base-paired portion and an unpaired portion.
  • the complex of the SlugCas9-HF protein and sgRNA can recognize the PAM sequence on the non-targeting DNA sequence.
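The distinction between complete and incomplete base pairing can be sketched as a mismatch count. The example assumes the common convention of writing the guide's 5' spacer and the candidate site on the same (PAM-containing) strand, so that identical letters correspond to full complementarity with the opposite strand; U in an RNA guide is treated as T. The sequences are hypothetical 20-nt strings, not SEQ ID NO: 12-15.

```python
def count_mismatches(guide: str, site: str) -> int:
    """Count positions where the guide spacer and a candidate site differ.

    Both sequences are written 5'->3' on the PAM-containing strand
    (an assumed convention). U in the guide is treated as T.
    """
    guide = guide.upper().replace("U", "T")
    site = site.upper().replace("U", "T")
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(1 for g, s in zip(guide, site) if g != s)


# Hypothetical 20-nt spacer and two candidate sites.
spacer = "GACGTTACGGATCCGTAAGC"
on_target = "GACGTTACGGATCCGTAAGC"
off_target = "GACGTAACGGATCCGTTAGC"  # differs at two positions

print(count_mismatches(spacer, on_target))
print(count_mismatches(spacer, off_target))
```

A site with zero mismatches corresponds to the fully paired structure; a site with one or more mismatches corresponds to the incomplete pairing described for non-targeting sequences.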
  • the PAM sequence and the targeting DNA sequence are as follows:
  • the PAM is NNGG, and the targeting DNA sequence is shown in SEQ ID NO: 12;
  • the PAM is NNGRM, and the target DNA sequence is shown in SEQ ID NO: 13;
  • the PAM is NNGG, and the targeting DNA sequence is shown in SEQ ID NO: 12;
  • the PAM is NNGR, and the targeting DNA sequence is shown in SEQ ID NO: 14;
  • the PAM is NNGG, and the targeting DNA sequence is shown in SEQ ID NO: 12;
  • the PAM is NNGG, and the target DNA sequence is shown in SEQ ID NO: 12;
  • the PAM is NNGRM, and the target DNA sequence is shown in SEQ ID NO: 13;
  • the PAM is NNGRM, and the target DNA sequence is shown in SEQ ID NO: 13;
  • the PAM is NNGG, and the target DNA sequence is shown in SEQ ID NO: 12;
  • the PAM is NNGRR, and the target DNA sequence is shown in SEQ ID NO: 15;
  • the PAM sequence is NNGG, and the targeting DNA sequence is shown in SEQ ID NO: 12.
  • the nucleotide sequence of SEQ ID NO: 12-15 is as follows:
  • the base N above represents any one of the four bases A (adenine), T (thymine), C (cytosine) and G (guanine).
  • the base M above represents either of the two bases A and C, and the base R above represents either of the two bases A and G.
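The degenerate codes defined above are enough to test whether a stretch of DNA satisfies one of the listed PAMs. The sketch below is illustrative: the lookup table covers only A, C, G, T plus the N, R and M codes defined in the text, and the function names and example sequences are hypothetical.

```python
# Degenerate base codes as defined in the text: N = A/C/G/T, R = A/G, M = A/C.
DEGENERATE = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "ACGT", "R": "AG", "M": "AC",
}


def matches_pam(seq: str, pam: str) -> bool:
    """True if `seq` (same length as `pam`) satisfies the degenerate PAM."""
    seq, pam = seq.upper(), pam.upper()
    if len(seq) != len(pam):
        return False
    return all(base in DEGENERATE[code] for base, code in zip(seq, pam))


def find_pam_sites(dna: str, pam: str) -> list[int]:
    """Return 0-based start positions of every window in `dna` matching `pam`."""
    dna = dna.upper()
    k = len(pam)
    return [i for i in range(len(dna) - k + 1) if matches_pam(dna[i:i + k], pam)]


print(matches_pam("ATGG", "NNGG"))
print(matches_pam("ATGAC", "NNGRM"))
print(matches_pam("ATGAT", "NNGRM"))
print(find_pam_sites("ACTGGA", "NNGG"))
```

Because NNGG fixes only two positions while NNGRM and NNGRR constrain three, a scan such as `find_pam_sites` over the same DNA will generally return at least as many candidate positions for NNGG, in line with the statement that a simpler PAM lets more genomic sequences be targeted.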
  • the complex of Cas9 protein and sgRNA can accurately locate the target DNA sequence, which means that the complex of Cas9 protein and sgRNA can recognize and bind to the target DNA sequence, or that other proteins fused with the Cas9 protein, or proteins that specifically recognize the sgRNA, are brought to the position of the target DNA sequence.
  • the complex of the Cas9 protein and sgRNA, or the other protein fused with the Cas9 protein, or the protein that specifically recognizes the sgRNA, can modify and regulate the targeted DNA region; the modification and regulation include, but are not limited to, regulation of gene transcription level, DNA methylation regulation, DNA acetylation modification, histone acetylation modification, single-base conversion or chromatin imaging tracking.
  • the complex of the SlugCas9-HF protein and sgRNA has a low tolerance for non-target DNA sequences; that is, the complex is basically unable or unable to recognize and bind to non-target DNA sequences, and is basically unable or unable to bring other proteins fused with the SlugCas9-HF protein, or a protein that specifically recognizes the sgRNA, to a position other than the target DNA sequence.
  • the term "basically" in the expression "the complex of the SlugCas9-HF protein and sgRNA is basically unable to recognize and bind to non-target DNA sequences" means that the degree to which the complex recognizes and binds to non-target DNA sequences, if any, has little or no biological and/or statistical significance.
  • the complex of the SlugCas9-HF protein and sgRNA, or the other protein fused with the SlugCas9-HF protein, or the protein that specifically recognizes the sgRNA, is basically unable or unable to modify and regulate non-target DNA regions; the modification and regulation include, but are not limited to, regulation of gene transcription level, DNA methylation regulation, DNA acetylation modification, histone acetylation modification, single-base conversion or chromatin imaging tracking.
  • the term "basically" in the expression "the complex of the SlugCas9-HF protein and sgRNA, or the other protein fused with the SlugCas9-HF protein, or the protein that specifically recognizes the sgRNA, is basically unable or unable to modify and regulate non-target DNA regions" means that the degree to which the complex, the fused protein or the sgRNA-recognizing protein modifies and regulates non-target DNA regions, if any, has little or no biological and/or statistical significance.
  • the single-base conversion includes, but is not limited to, the conversion of the base adenine to guanine, of cytosine to thymine, of cytosine to uracil, or other conversions between bases.
  • the CRISPR/Cas9 gene editing system provided by the present invention has high editing activity and high specificity, and has obvious advantages compared with the existing CRISPR/Cas9 gene editing system.
  • the invention detects the editing efficiency and off-target rate of the CRISPR/Cas9 system through technologies such as gene synthesis, molecular cloning, cell transfection, deep sequencing of PCR products, flow cytometry, bioinformatics analysis and the like.
  • the CRISPR/Cas9 gene editing system of the present invention was verified in a GFP reporter system cell line containing a target site, and it was found that the gene editing system can edit target genes with high specificity and has a low off-target rate.
  • the present invention provides a method for gene editing in cells using the CRISPR/Cas9 gene editing system of the first aspect of the present invention.
  • the method edits the target DNA sequence by having the complex of Cas9 protein and sgRNA recognize and locate that sequence, and includes the following steps:
  • (c) is a humanized Cas9 gene sequence, for example, a nucleotide sequence shown in any one of SEQ ID NO: 23-32 and SEQ ID NO: 112;
  • synthesize the oligonucleotide single-stranded DNA corresponding to the sgRNA, namely the oligonucleotide forward strand sequence (Oligo-F) and the oligonucleotide reverse strand sequence (Oligo-R), anneal the two strands, and ligate the annealed product into the restriction site of the expression vector cloned with the Cas9 gene sequence, such as the BsaI restriction site of plasmid pAAV2_Cas9_U6_BsaI, to obtain an expression vector expressing the Cas9 protein and the sgRNA, such as pAAV2_Cas9-hU6-sgRNA, wherein the sgRNA has the nucleotide sequence shown in SEQ ID NO: 11, or a nucleotide sequence at least 80% identical to the nucleotide sequence shown in SEQ ID NO: 11, or a modification based on that nucleotide sequence, the modification including, for example, phosphorylation, shortening, lengthening, sulfurization, methylation or hydroxylation;
  • the expression vector expressing the Cas9 protein and the sgRNA is delivered to the cell containing the target site, so as to realize the editing of the target site.
  • SEQ ID NO: 23-32 and SEQ ID NO: 112 are as follows:
  • SEQ ID NO: 23 Humanized SauriCas9 gene sequence:
  • SEQ ID NO: 24 Humanized ShaCas9 gene sequence:
  • SEQ ID NO: 25 Humanized SlugCas9 gene sequence:
  • SEQ ID NO: 26 Humanized SlutCas9 gene sequence:
  • SEQ ID NO: 27 Humanized Sa-SauriCas9 gene sequence:
  • SEQ ID NO: 28 Humanized Sa-SepCas9 gene sequence:
  • SEQ ID NO: 29 Humanized Sa-SeqCas9 gene sequence:
  • SEQ ID NO: 30 Humanized Sa-ShaCas9 gene sequence:
  • SEQ ID NO: 31 Humanized Sa-SlugCas9 gene sequence:
  • SEQ ID NO: 32 Humanized Sa-SlutCas9 gene sequence:
  • SEQ ID NO: 112 Humanized SlugCas9-HF gene sequence:
  • the expression vector can be a plasmid vector, a retroviral vector, an adenovirus vector, an adeno-associated virus vector such as pAAV2_ITR and the like.
  • any other suitable expression vectors are also feasible.
  • any targeted sgRNA can be designed for the DNA sequence to be edited according to specific needs, and the sgRNA can be modified to a certain extent as known in the art. Therefore, in one embodiment, the modification to the sgRNA includes, but is not limited to, phosphorylation, shortening, lengthening, sulfurization, methylation, and hydroxylation.
  • any mismatch sgRNA can be designed for the DNA sequence to be edited according to specific needs, and the sgRNA can be modified to a certain extent as known in the art.
  • the modifications include, but are not limited to, phosphorylation, shortening, lengthening, sulfurization, methylation and hydroxylation.
  • the CRISPR/Cas9 system delivered to the cell containing the target site in step (3) may include, but is not limited to, a plasmid vector, retroviral vector, adenoviral vector or adeno-associated virus vector expressing the Cas9 protein and sgRNA of the present invention, or the sgRNA and the protein themselves, according to specific needs.
  • the delivery means include, but are not limited to, liposomes, cationic polymers, nanoparticles, multifunctional envelope nanoparticles, and viral vectors.
  • the cells include, but are not limited to, eukaryotic cells and prokaryotic cells such as bacterial cells, and the eukaryotic cells include, for example, mammalian cells and plant cells.
  • Mammalian cells include, for example, animal cells such as Chinese hamster ovary cells, baby hamster kidney cells, mouse Sertoli cells, mouse breast tumor cells, buffalo rat hepatocytes, rat hepatoma cells, the SV40-transformed monkey kidney CV1 line, monkey kidney cells and dog kidney cells, and human cells such as human cervical cancer cells, human lung cells, human hepatocytes, NIH/3T3 cells, human U2-OS osteosarcoma cells, human A549 cells, human K562 cells, human HEK293 cells, human HEK293T cells, human HCT116 cells, human MCF-7 cells or TRI cells.
  • the modification described in step (2) includes but is not limited to phosphorylation, shortening, lengthening, sulfidation or methylation.
  • the Oligo-F is SEQ ID NO: 16
  • the Oligo-R is SEQ ID NO: 17
  • the Oligo-F and the Oligo-R include the first oligonucleotide forward strand sequence (Oligo-F1) and the first oligonucleotide reverse strand sequence (Oligo-R1) shown in SEQ ID NO: 59 and SEQ ID NO: 60, as well as the second oligonucleotide forward strand sequence (Oligo-F2) and the second oligonucleotide reverse strand sequence (Oligo-R2) shown in SEQ ID NO: 61 and SEQ ID NO: 62.
  • the Oligo-F sequence and Oligo-R sequence need to be annealed to become double-stranded DNA. Therefore, in one embodiment, the annealing reaction system is: 1 μL of 100 μM Oligo-F sequence, 1 μL of 100 μM Oligo-R sequence, and 28 μL of water. After shaking and mixing, place the annealing reaction system in a PCR machine and run the annealing program.
  • the annealing program is as follows: 95°C_5min, 85°C_1min, 75°C_1min, 65°C_1min, 55°C_1min, 45°C_1min, 35°C_1min, 25°C_1min, store at 4°C, cooling rate 0.3°C/s.
  • the expression vector cloned with Cas9 such as the plasmid pAAV2_Cas9_ITR, needs to be linearized with restriction enzymes such as BsaI.
  • the product after annealing of the Oligo-F sequence and the Oligo-R sequence is ligated by DNA ligase with the linearized expression vector containing Cas9, such as the pAAV2_Cas9_ITR backbone vector.
  • the pAAV2_Cas9-hU6-sgRNA is an adeno-associated virus backbone plasmid, which includes AAV2 ITR, CMV enhancer, CMV promoter, SV40 NLS, Cas9, nucleoplasmin NLS, 3x HA, bGH poly(A), human U6 promoter, BsaI endonuclease site and sgRNA scaffold sequence.
  • the ligation product is transformed into competent cells, then the correct clone is verified by Sanger sequencing, and then the plasmid is extracted for use.
  • the cell in step (3) is a HEK293T cell, and the target site contained therein has the nucleotide sequence shown in SEQ ID NO: 18;
  • the target site in the cell in step (3) has the nucleotide sequence shown in SEQ ID NO: 63 and SEQ ID NO: 64, respectively.
  • the delivery tool in step (3) is liposome, including, for example, Or PEI.
  • the method further includes step (4): detecting the editing efficiency at the edited target site, for example, by performing PCR amplification on the edited target site and then performing T7EI digestion or second-generation sequencing.
  • the template used for PCR amplification in step (4) is the edited genomic DNA of HEK293T cells.
  • the primer sequences used for PCR amplification are SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21 and SEQ ID NO: 22;
  • the primer sequences for PCR amplification are SEQ ID NO: 65, SEQ ID NO: 66, SEQ ID NO: 21, SEQ ID NO: 22 and SEQ ID NO: 67.
  • the present invention also provides a kit of CRISPR/Cas9 gene editing system for gene editing, the kit comprising:
  • Cas9 protein and sgRNA wherein the Cas9 protein is:
  • SauriCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 1,
  • ShaCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 2,
  • SlugCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 3,
  • SlutCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 4,
  • Sa-SauriCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 5,
  • Sa-SepCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 6,
  • Sa-SeqCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 7,
  • Sa-ShaCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 8,
  • Sa-SlugCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 9,
  • Sa-SlutCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 10, or
  • SlugCas9-HF protein, which has the amino acid sequence shown in SEQ ID NO: 58, or
  • the Cas9 protein has an amino acid sequence that is at least 80% identical to the amino acid sequence shown in any one of SEQ ID NO: 1-10 and SEQ ID NO: 58;
  • the sgRNA has the nucleotide sequence shown in SEQ ID NO: 11, or is an sgRNA sequence modified based on SEQ ID NO: 11;
  • (c) is a humanized Cas9 gene sequence, for example, a nucleotide sequence shown in any one of SEQ ID NO: 23-32 and SEQ ID NO: 112, and
  • the sgRNA has the nucleotide sequence shown in SEQ ID NO: 11, or is an sgRNA sequence modified based on SEQ ID NO: 11.
  • the present invention also provides the use of the CRISPR/Cas9 gene editing system of the first aspect of the present invention in gene knockout, site-directed base change, site-directed insertion, regulation of gene transcription level, DNA methylation regulation, DNA acetylation modification, histone acetylation modification, single-base conversion or chromatin imaging tracking.
  • compared with the existing CRISPR/Cas9 gene editing systems in the prior art, the CRISPR/Cas9 gene editing system of the present invention contains a smaller Cas9 protein with fewer amino acids, so it can be packaged effectively; in addition, the PAM sequence targeted by the CRISPR/Cas9 gene editing system of the present invention is simpler, so that more DNA sequences in the genome can be targeted, and the editing efficiency is higher.
  • the present invention will be described in more detail through specific embodiments. It should be understood that, unless otherwise specified, the reagents, methods and equipment used in the present invention are all conventional reagents, methods and equipment in the technical field. Unless otherwise specified, the reagents and materials used in the following examples are all commercially available. Experimental methods without specific conditions are usually implemented in accordance with conventional conditions or conditions recommended by the manufacturer.
  • Step (1) Download from UniProt, according to their accession numbers, the amino acid sequences of the SauriCas9, ShaCas9, SlugCas9, SlutCas9, Sa-SauriCas9, Sa-SepCas9, Sa-SeqCas9, Sa-ShaCas9, Sa-SlugCas9 and Sa-SlutCas9 genes.
  • the accession number and amino acid sequence of each Cas9 gene on UniProt are as follows:

    | Cas9 gene | UniProt accession number | Amino acid sequence |
    | SauriCas9 | A0A2T4M4R5 | SEQ ID NO: 1 |
    | ShaCas9 | A0A2T4SLN6 | SEQ ID NO: 2 |
    | SlugCas9 | A0A133QCR3 | SEQ ID NO: 3 |
    | SlutCas9 | A0A1W6BMI2 | SEQ ID NO: 4 |
    | Sa-SauriCas9 | A0A2T4M4R5 | SEQ ID NO: 5 |
    | Sa-SepCas9 | A0A1Q9MLU4 | SEQ ID NO: 6 |
    | Sa-SeqCas9 | A0A1E5TL62 | SEQ ID NO: 7 |
    | Sa-ShaCas9 | A0A2T4SLN6 | SEQ ID NO: 8 |
    | Sa-SlugCas9 | A0A133QCR3 | SEQ ID NO: 9 |
    | Sa-SlutCas9 | A0A1W6BMI2 | SEQ ID NO: 10 |
    | SlugCas9-HF | A0A133QCR3 | SEQ ID NO: 58 |

  • SEQ ID NO: 58 carries the introduced mutations R247A, N415A, T421A and R656A.
  • Step (2) Codon optimization is performed on the amino acid sequence of Cas9 to obtain a coding sequence for the high expression of Cas9 in human cells.
  • Step (3) Synthesize the Cas9 coding sequence obtained in step (2) and clone it into the pAAV2_ITR backbone plasmid to obtain the plasmid pAAV2_Cas9_ITR, as shown in FIG. 2.
  • Step (1) The plasmid pAAV2_Cas9_ITR obtained in Example 1 was digested and linearized with BsaI restriction enzyme.
  • the digestion system was: 1 μg plasmid pAAV2_Cas9_ITR, 5 μL 10x CutSmart buffer, 1 μL BsaI endonuclease, and ddH2O to make up to 50 μL.
  • the digestion system was allowed to react at 37°C for 1 hour.
  • Step (2) The digested product was electrophoresed on a 1% agarose gel at 120 V for 30 minutes.
  • Step (3) Excise the DNA fragments, recover them with a gel recovery kit according to the steps provided by the manufacturer, and finally elute them with ddH2O.
  • the DNA fragments are the linearized plasmids pAAV2_Cas9_ITR containing SauriCas9, ShaCas9, SlugCas9, SlutCas9, Sa-Sauri, Sa-SepCas9, Sa-SeqCas9, Sa-ShaCas9, Sa-SlugCas9, Sa-SlutCas9 and SlugCas9-HF, whose sizes are 7447 bp, 7430 bp, 7427 bp, 7437 bp, 7433 bp, 7430 bp, 7423 bp, 7430 bp, 7430 bp, 7433 bp and 7427 bp, respectively.
  • Step (4) Use NanoDrop to determine the DNA concentration of the recovered linearized plasmid pAAV2_Cas9_ITR, and store it at -20°C for long-term storage.
  • Step (1) Design the sgRNA sequence.
  • Step (2) Add the sticky-end sequences matching the two ends of the linearized plasmid pAAV2_Cas9_ITR to the sense and antisense strands corresponding to the designed sgRNA sequence, and synthesize the oligonucleotide single-stranded DNA.
  • Oligo-F CACCGCTCGGAGATCATCATTGCG (SEQ ID NO: 16);
  • Oligo-R AAACCGCAATGATGATCTCCGAGC (SEQ ID NO: 17).
  • Oligo-F1 CACCAGAGTAGGCTGGTAGATGGAG (SEQ ID NO: 59);
  • Oligo-R1 AAACCTCCATCTACCAGCCTACTCT (SEQ ID NO: 60);
  • Oligo-F2 CACCGTCAGACATGAGATCACAGAT (SEQ ID NO: 61);
  • Oligo-R2 AAACATCTGTGATCTCATGTCTGAC (SEQ ID NO: 62).
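The oligo pairs listed above follow a regular pattern: the forward oligo is a CACC overhang followed by the spacer, and the reverse oligo is an AAAC overhang followed by the reverse complement of the spacer. The sketch below reproduces that pattern for checking oligo orders; the function names are illustrative, and the overhangs are read off the listed sequences rather than stated as a rule in the text.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_oligos(spacer, fwd_overhang="CACC", rev_overhang="AAAC"):
    """Build the (Oligo-F, Oligo-R) pair for a spacer.

    The default overhangs are the ones visible in SEQ ID NO: 16/17 and
    SEQ ID NO: 59-62; they are assumed to match the BsaI-cut vector ends.
    """
    spacer = spacer.upper()
    return fwd_overhang + spacer, rev_overhang + reverse_complement(spacer)


oligo_f, oligo_r = design_oligos("GCTCGGAGATCATCATTGCG")
print(oligo_f)  # CACCGCTCGGAGATCATCATTGCG
print(oligo_r)  # AAACCGCAATGATGATCTCCGAGC
```

Annealing the two oligos leaves 4-nt single-stranded overhangs at each end, which is what allows directional ligation into the linearized vector.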
  • Step (3) The oligonucleotide single-stranded DNA is annealed into double-stranded DNA.
  • the annealing reaction system is: 1 μL of 100 μM Oligo-F, 1 μL of 100 μM Oligo-R, and 28 μL of ddH2O. After shaking and mixing the annealing system, place it in a PCR machine and run the annealing program.
  • the annealing program is: 95°C_5min, 85°C_1min, 75°C_1min, 65°C_1min, 55°C_1min, 45°C_1min, 35°C_1min, 25°C_1min, store at 4°C, cooling rate 0.3°C/s.
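As a quick sanity check on the annealing step, the sketch below computes the final oligo concentration in the 30 μL reaction and an approximate run time for the program. It assumes the 0.3°C/s cooling rate applies to every transition between holds and ignores the final 4°C storage step; the names `HOLDS`, `RAMP_C_PER_S`, `total_program_seconds` and `final_concentration_uM` are illustrative only.

```python
# Hold steps of the annealing program as (temperature in °C, hold time in s).
HOLDS = [(95, 300), (85, 60), (75, 60), (65, 60),
         (55, 60), (45, 60), (35, 60), (25, 60)]
RAMP_C_PER_S = 0.3  # cooling rate given in the text


def total_program_seconds(holds, ramp_c_per_s):
    """Sum the hold times plus the time spent ramping between holds."""
    hold_time = sum(seconds for _, seconds in holds)
    ramp_time = sum(abs(a[0] - b[0]) / ramp_c_per_s
                    for a, b in zip(holds, holds[1:]))
    return hold_time + ramp_time


def final_concentration_uM(stock_uM, oligo_uL, total_uL):
    """Concentration of one oligo after dilution into the full reaction."""
    return stock_uM * oligo_uL / total_uL


# 1 uL of each 100 uM oligo in 1 + 1 + 28 = 30 uL total volume.
print(round(final_concentration_uM(100, 1, 30), 2))           # 3.33
print(round(total_program_seconds(HOLDS, RAMP_C_PER_S) / 60, 1))  # 15.9
```

Under these assumptions each oligo ends up at about 3.33 μM and the program takes roughly 16 minutes before the 4°C hold.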
  • Step (4) Ligate the annealed product with the linearized plasmid pAAV2_Cas9_ITR obtained in Example 2 using DNA ligase, according to the steps provided with the product.
  • Step (5) Take 1 μL of the ligation product for transformation into chemically competent cells, and verify the resulting bacterial clones by Sanger sequencing.
  • Step (6) Culture with shaking the clones verified by sequencing to be correctly ligated, and extract the plasmid pAAV2_Cas9-hU6-sgRNA for use.
  • Step (1) On day 0, according to the needs of transfection, plate the HEK293T cell line containing the GFP reporter system on a 6-well plate with a cell density of about 30%.
  • the sequence of the target site is shown in SEQ ID NO: 18 (GCTCGGAGATCATCATTGCGNNNNN).
  • Step (2) On the first day, perform transfection, the transfection steps are as follows:
  • the cell to be transfected contains the CMV-ATG-target site-NNNNNNN-GFP nucleotide sequence, the nucleotide sequence of which is shown in SEQ ID NO: 113, which contains the sequence shown in SEQ ID NO: 18. It should be noted that the nucleotide sequence has 7 random bases N between the target site and the GFP sequence as the PAM sequence (marked in bold underline).
  • Step (3) Place the cells in a 37°C, 5% CO2 incubator to continue culturing.
  • Step (4) After editing for 3 days, the GFP-positive cells are sorted out by flow sorting and placed in a 37°C, 5% CO2 incubator for continued culture.
  • Step (1) On day 0, according to the requirements of transfection, the HEK293T cell line containing the sgRNA targeting site is plated on a 6-well plate with a cell density of about 30%.
  • the sequence of the G4 targeting site of SlugCas9-HF is shown in SEQ ID NO: 63 (AGAGTAGGCTGGTAGATGGAGNNNN), and the sequence of the G7 targeting site is shown in SEQ ID NO: 64 (ATCTGTGATCTCATGTCTGACNNNN).
  • Step (2) On the first day, perform transfection, the transfection steps are as follows:
  • iii. Mix the diluted transfection reagent with the diluted plasmid, pipette gently to mix, let stand at room temperature for 20 minutes, and then add the mixture to the medium containing the cells to be transfected.
  • Step (3) Place the cells in a 37°C, 5% CO2 incubator to continue culturing.
  • Step (1) Collect the HEK293T cells after editing for 3 days or the GFP reporter HEK293T cell line after flow sorting, and use the DNA kit to extract genomic DNA according to the steps provided by the product.
  • Step (2) Perform the first round of PCR for PCR library construction, and use 2xQ5 Mastermix for PCR reaction.
  • PCR primers are as shown in SEQ ID NO: 19 and SEQ ID NO: 20;
  • For the SlugCas9-HF gene, the PCR primers are as shown in SEQ ID NO: 65 and SEQ ID NO: 66. The details are as follows:
  • the reaction system is as follows:
  • the PCR running program is as follows:
  • Step (3) Perform the second round of PCR for PCR library building, and use 2xQ5 Mastermix for PCR reaction.
  • PCR primers are shown in SEQ ID NO: 21 and SEQ ID NO: 22;
  • For SlugCas9-HF, the first-round PCR products at the G4 site are amplified with the primers shown in SEQ ID NO: 21 and SEQ ID NO: 22, and the first-round PCR products at the G7 site are amplified with the primers shown in SEQ ID NO: 21 and SEQ ID NO: 67.
  • the reaction system is as follows:
  • the PCR running program is as follows:
  • Step (4) Purify the second-round PCR products, DNA fragments of 366 bp or 406 bp (the latter only for SlugCas9-HF), using the gel recovery kit.
  • Step (1) Send the prepared second-generation sequencing library to the company for paired-end sequencing on a HiSeq X Ten.
  • Step (2) Perform bioinformatics analysis of the second-generation sequencing results; some of the editing results are shown in Figure 3a to Figure 3j. It can be seen from the figures that the editing results contain deletions, insertions or mismatches, and the last 4 bp or 5 bp represent the PAM sequence; from Figure 3a to Figure 3j, the PAM sequences are NNGG, NNGRM, NNGG, NNGR, NNGG, NNGG, NNGRM, NNGRM, NNGG and NNGRR, respectively.
  • Figure 3k shows the editing of the SlugCas9-HF gene editing system at two target sites, where the X axis represents the two target sites G4 and G7, and the Y axis represents the indel efficiency.
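The indel efficiency plotted on the Y axis is, in essence, the fraction of sequencing reads at a target site that carry an insertion or deletion. The text does not describe the analysis pipeline, so the following is only a crude sketch under a stated assumption: a read is counted as an indel when its length differs from the unedited reference amplicon. Real analyses align each read to the reference; the function name and the toy reads are hypothetical.

```python
def indel_efficiency(reads, reference):
    """Percentage of reads whose length differs from the reference amplicon.

    Assumption: a length difference is used as a stand-in for an indel call.
    This ignores substitutions and balanced insertion/deletion events.
    """
    if not reads:
        raise ValueError("no reads supplied")
    edited = sum(1 for read in reads if len(read) != len(reference))
    return 100.0 * edited / len(reads)


# Toy example: two of four reads differ in length from the reference.
reference = "ACGTACGT"
reads = ["ACGTACGT", "ACGTCGT", "ACGTACGT", "ACGTAACGT"]
print(indel_efficiency(reads, reference))  # 50.0
```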
  • Step (1) Transfect the plasmid pAAV2_Cas9-hU6-sgRNA expressing Cas9 and sgRNA into HEK293T cells according to the steps provided by the manufacturer.
  • the specific sequences for different Cas9, crRNA and target sites are shown in Table 1;
  • Step (2) Extract the genomic DNA of the cells after 5 days of editing, and use the primers Test-F and Test-R to amplify the targeted DNA sequence with 2x Q5 Master Mix; the specific sequences of the primers Test-F and Test-R are shown in Table 1 below;
  • Step (3) The PCR product is recovered through an agarose gel, and the DNA fragments of different sizes thus obtained are purified.
  • the sizes of the DNA fragments are shown in Table 1;
  • Step (4) Digest the purified DNA fragments according to the instructions of T7 Endonuclease I, and then run the gel for detection.
  • The results are shown in Figures 4a-4j.
  • the left side of each figure is the negative control group, transfected without sgRNA; no cut fragment appears after T7 Endonuclease I digestion of the target sequence, indicating that no editing has occurred;
  • the right side of each figure is the experimental group, transfected with sgRNA; cut fragments appear after T7 Endonuclease I digestion of the target sequence, indicating that editing has occurred.
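The text reads the T7 Endonuclease I gels qualitatively (cut fragments present or absent). If a numerical estimate is wanted, a commonly used approximation converts band intensities into an indel percentage as 100 × (1 − √(1 − f)), where f is the cleaved fraction. This formula is not taken from the patent, and the function and parameter names below are illustrative.

```python
import math


def t7e1_indel_percent(uncut, cut_a, cut_b):
    """Estimate indel % from T7EI gel band intensities.

    uncut        : intensity of the full-length (undigested) band
    cut_a, cut_b : intensities of the two cleavage products
    Uses the common approximation 100 * (1 - sqrt(1 - f_cut)); this is an
    assumption layered on the protocol, not a method stated in the text.
    """
    total = uncut + cut_a + cut_b
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    f_cut = (cut_a + cut_b) / total
    return 100.0 * (1.0 - math.sqrt(1.0 - f_cut))
```

With no cleavage products, as in the no-sgRNA control lanes, the estimate is 0; when half of the signal is in the cleaved bands it is about 29%.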
  • SlugCas9-HF is taken as an example to verify the specificity of the CRISPR/Cas9 system.
  • the specific operations are as follows:
  • Step (1) design on-target sgRNA sequence and mismatch sgRNA sequence.
  • Step (2) add the sticky-end sequences matching the two ends of the linearized plasmid pAAV2_SlugCas9-HF_ITR to the sense and antisense strands corresponding to the designed on-target sgRNA sequence and mismatch sgRNA sequences, and synthesize the oligonucleotide single-stranded DNA. The specific sequences are as follows (wherein the bases in bold underline are mismatch bases):
  • Oligo-F3 CACCGGCTCGGAGATCATCATTGCG (SEQ ID NO: 68) (On-target)
  • Oligo-R3 AAACCGCAATGATGATCTCCGAGCC (SEQ ID NO: 89) (On-target)
  • Oligo-F4 CACC AA CTCGGAGATCATCATTGCG (SEQ ID NO: 69) (mismatch)
  • Oligo-F5 CACCG AT TCGGAGATCATCATTGCG (SEQ ID NO: 70) (mismatch)
  • Oligo-F6 CACCGG TC CGGAGATCATCATTGCG (SEQ ID NO: 71) (mismatch)
  • Oligo-F7 CACCGGC CT GGAGATCATCATTGCG (SEQ ID NO: 72) (mismatch)
  • Oligo-F8 CACCGGCT TA GAGATCATCATTGCG (SEQ ID NO: 73) (mismatch)
  • Oligo-F9 CACCGGCTC AA AGATCATCATTGCG (SEQ ID NO: 74) (mismatch)
  • Oligo-F10 CACCGGCTCG AG GATCATCATTGCG (SEQ ID NO: 75) (mismatch)
  • Oligo-F11 CACCGGCTCGG GA ATCATCATTGCG (SEQ ID NO: 76) (mismatch)
  • Oligo-F12 CACCGGCTCGGA AG TCATCATTGCG (SEQ ID NO: 77) (mismatch)
  • Oligo-F13 CACCGGCTCGGAG GC CATCATTGCG (SEQ ID NO: 78) (mismatch)
  • Oligo-F14 CACCGGCTCGGAGA CT ATCATTGCG (SEQ ID NO: 79) (mismatch)
  • Oligo-F15 CACCGGCTCGGAGAT TG TCATTGCG (SEQ ID NO: 80) (mismatch)
  • Oligo-F16 CACCGGCTCGGAGATC GC CATTGCG (SEQ ID NO: 81) (mismatch)
  • Oligo-F17 CACCGGCTCGGAGATCA CT ATTGCG (SEQ ID NO: 82) (mismatch)
  • Oligo-F18 CACCGGCTCGGAGATCAT TG TTGCG (SEQ ID NO: 83) (mismatch)
  • Oligo-F19 CACCGGCTCGGAGATCATC GC TGCG (SEQ ID NO: 84) (mismatch)
  • Oligo-F20 CACCGGCTCGGAGATCATCA CC GCG (SEQ ID NO: 85) (mismatch)
  • Oligo-F21 CACCGGCTCGGAGATCATCAT CA CG (SEQ ID NO: 86) (mismatch)
  • Oligo-F22 CACCGGCTCGGAGATCATCATT AT G (SEQ ID NO: 87) (mismatch)
  • Oligo-F23 CACCGGCTCGGAGATCATCATTG TA (SEQ ID NO: 88) (mismatch)
  • Oligo-R4 AAACCGCAATGATGATCTCCGAG TT (SEQ ID NO: 90) (mismatch)
  • Oligo-R5 AAACCGCAATGATGATCTCCGA AT C (SEQ ID NO: 91) (mismatch)
  • Oligo-R6 AAACCGCAATGATGATCTCCG GA CC (SEQ ID NO: 92) (mismatch)
  • Oligo-R7 AAACCGCAATGATGATCTCC AG GCC (SEQ ID NO: 93) (mismatch)
  • Oligo-R8 AAACCGCAATGATGATCTC TA AGCC (SEQ ID NO: 94) (mismatch)
  • Oligo-R9 AAACCGCAATGATGATCT TT GAGCC (SEQ ID NO: 95) (mismatch)
  • Oligo-R10 AAACCGCAATGATGATC CT CGAGCC (SEQ ID NO: 96) (mismatch)
  • Oligo-R11 AAACCGCAATGATGAT TC CCGAGCC (SEQ ID NO: 97) (mismatch)
  • Oligo-R12 AAACCGCAATGATGA CT TCCGAGCC (SEQ ID NO: 98) (mismatch)
  • Oligo-R13 AAACCGCAATGATG GC CTCCGAGCC (SEQ ID NO: 99) (mismatch)
  • Oligo-R14 AAACCGCAATGAT AG TCTCCGAGCC (SEQ ID NO: 100) (mismatch)
  • Oligo-R15 AAACCGCAATGA CA ATCTCCGAGCC (SEQ ID NO: 101) (mismatch)
  • Oligo-R16 AAACCGCAATG GC GATCTCCGAGCC (SEQ ID NO: 102) (mismatch)
  • Oligo-R17 AAACCGCAAT AG TGATCTCCGAGCC (SEQ ID NO: 103) (mismatch)
  • Oligo-R18 AAACCGCAA CA ATGATCTCCGAGCC (SEQ ID NO: 104) (mismatch)
  • Oligo-R19 AAACCGCA GC GATGATCTCCGAGCC (SEQ ID NO: 105) (mismatch)
  • Oligo-R20 AAACCGC GG TGATGATCTCCGAGCC (SEQ ID NO: 106) (mismatch)
  • Oligo-R21 AAACCG TG ATGATGATCTCCGAGCC (SEQ ID NO: 107) (mismatch)
  • Oligo-R22 AAACC AT AATGATGATCTCCGAGCC (SEQ ID NO: 108) (mismatch)
  • Oligo-R23 AAAC TA CAATGATGATCTCCGAGCC (SEQ ID NO: 109) (mismatch)
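The mismatch oligos listed above appear to follow a simple rule: a window of two adjacent bases slides along the 21-nt on-target spacer one position at a time, and both bases in the window are replaced by their transition partner (A↔G, C↔T). The sketch below regenerates the 20 mismatch spacers under that inferred rule; the rule is read off the listed sequences and is not stated explicitly in the text, and the names are illustrative.

```python
# Transition substitutions: purine <-> purine, pyrimidine <-> pyrimidine.
TRANSITION = str.maketrans("AGCT", "GATC")


def mismatch_panel(spacer, window=2):
    """Return spacers in which each sliding window of bases is transitioned."""
    spacer = spacer.upper()
    panel = []
    for i in range(len(spacer) - window + 1):
        mutated = spacer[i:i + window].translate(TRANSITION)
        panel.append(spacer[:i] + mutated + spacer[i + window:])
    return panel


on_target = "GGCTCGGAGATCATCATTGCG"  # spacer of Oligo-F3 without CACC
panel = mismatch_panel(on_target)
print(len(panel))   # 20
print(panel[0])     # AACTCGGAGATCATCATTGCG (spacer of Oligo-F4)
print(panel[-1])    # GGCTCGGAGATCATCATTGTA (spacer of Oligo-F23)
```

Prefixing each generated spacer with CACC gives the forward oligos Oligo-F4 to Oligo-F23; the reverse oligos are AAAC followed by the reverse complement of the same spacer.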
  • Step (3) the oligonucleotide single-stranded DNA is annealed to become double-stranded DNA.
  • Annealing reaction system: 1 μL of 100 μM Oligo-F, 1 μL of 100 μM Oligo-R, and 28 μL of ddH2O. Shake and mix well, then place in the PCR machine and run the annealing program: 95°C_5min, 85°C_1min, 75°C_1min, 65°C_1min, 55°C_1min, 45°C_1min, 35°C_1min, 25°C_1min, store at 4°C, cooling rate 0.3°C/s.
  • Step (4) the annealed product and the linearized plasmid pAAV2_SlugCas9-HF_ITR are ligated using DNA ligase according to the steps provided with the product.
  • Step (5) 1 μL of the ligation product is taken for transformation into chemically competent cells, and the resulting bacterial clones are verified by Sanger sequencing.
  • Step (6) the clones verified by sequencing to be correctly ligated are cultured with shaking, and the plasmids pAAV2_SlugCas9-HF-hU6-On target sgRNA and pAAV2_SlugCas9-HF-hU6-mismatch sgRNA are extracted for use.
  • Step (1) On day 0, according to the requirements of transfection, plate the HEK293T cell line containing the GFP reporter system on a 6-well plate with a cell density of about 30%.
  • the target site sequence is shown in SEQ ID NO: 110 (GGCTCGGAGATCATCATTGCGNNNN).
  • Step (2) on the first day, carry out transfection, the transfection system is as follows:
  • the cell to be transfected contains the nucleotide sequence of CMV-ATG-target site-CTGG-GFP, the nucleotide sequence of which is shown in SEQ ID NO: 111, which contains the sequence shown in SEQ ID NO: 110.
  • Step (3) Place the cells in a 37°C, 5% CO2 incubator to continue culturing.
  • Step (1) collect the cells after 3 days of editing and perform flow cytometric analysis
  • Step (2) use FlowJo analysis software to analyze the GFP positive ratio and plot it.
  • Figure 4k shows the specificity test results of the SlugCas9-HF gene editing system in the GFP reporter system HEK293T cell line. It can be seen from the figure that the complex of SlugCas9-HF and sgRNA specifically cleaves the target sequence (the sequence indicated by 1 in the figure), but is basically unable or unable to cleave the non-target sequences (the sequences indicated by 2-21 in the figure). It can be seen that the system of the present invention has high specificity and a low off-target rate.

Abstract

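A CRISPR/Cas9 gene editing system and its application. The gene editing system is a complex formed by a specific Cas9 protein and an sgRNA, which can precisely locate a target DNA sequence and cleave it, causing double-strand break damage in the target DNA sequence; the gene editing is performed in cells or in vitro. The specific Cas9 protein used is small, having only about 1000 amino acids, and recognizes a simple PAM sequence. The Cas9 protein has the amino acid sequence shown in any one of SEQ ID NOs: 1-10 and 58, and the sgRNA has the nucleotide sequence shown in SEQ ID NO: 11.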

Description

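CRISPR/Cas9 gene editing system and application thereof
Technical Field
The present invention belongs to the technical field of gene editing, and specifically relates to a CRISPR/Cas9 system capable of gene editing in cells and to its related applications.
Background Art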
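CRISPR/Cas9 is an acquired immune system that bacteria and archaea evolved to defend against invading foreign viruses or plasmids. In the CRISPR/Cas9 system, after the crRNA (CRISPR-derived RNA), the tracrRNA (trans-activating RNA) and the Cas9 protein form a complex, the complex recognizes the PAM (Protospacer Adjacent Motif) sequence of the target site, the crRNA forms a complementary structure with the target DNA sequence, and the Cas9 protein cleaves the DNA, causing a DNA break. The tracrRNA and crRNA can be fused through a linker sequence into a single guide RNA (sgRNA). After the DNA break occurs, two main cellular DNA damage repair mechanisms carry out the repair: non-homologous end-joining (NHEJ) and homologous recombination (HR). NHEJ repair causes base deletions or insertions and can be used for gene knockout; when a homologous template is provided, HR repair can be used for site-directed gene insertion and precise base replacement.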
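Beyond basic research, CRISPR/Cas9 also has broad prospects for clinical application. When the CRISPR/Cas9 system is used for gene therapy, Cas9 and the sgRNA must be introduced into the body. At present, the most effective delivery vector for gene therapy is the AAV virus, but the DNA packaged by an AAV virus generally does not exceed 4.5 kb. SpCas9 is widely used because of its simple PAM sequence (it recognizes NGG) and high activity. However, the SpCas9 protein itself has 1368 amino acids; together with the sgRNA and promoters, it cannot be packaged efficiently into an AAV virus, which limits its clinical application. To overcome this problem, several small Cas9 proteins have been developed, including SaCas9 (PAM sequence NNGRRT), St1Cas9 (PAM sequence NNAGAW), NmCas9 (PAM sequence NNNNGATT), Nme2Cas9 (PAM sequence NNNNCC) and CjCas9 (PAM sequence NNNNRYAC). However, these Cas9 proteins are either prone to off-target effects (that is, cleavage at non-target sites), have complex PAM sequences, or have low editing activity, which makes them difficult to apply widely.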
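The packaging constraint in this paragraph is simple arithmetic: each amino acid needs three bases of coding sequence, so a 1368-amino-acid protein already consumes most of a 4.5 kb payload. The sketch below works through the numbers; the 4500 bp figure is the limit quoted in the text, the stop codon is added as an assumption, and the function names are illustrative.

```python
AAV_CAPACITY_BP = 4500  # approximate packaging limit quoted in the text


def cds_length_bp(n_amino_acids, include_stop=True):
    """Length of a coding sequence: 3 bases per residue, plus a stop codon."""
    return 3 * n_amino_acids + (3 if include_stop else 0)


def remaining_capacity_bp(n_amino_acids, capacity_bp=AAV_CAPACITY_BP):
    """Space left for promoters, sgRNA cassette, NLS, poly(A) and so on."""
    return capacity_bp - cds_length_bp(n_amino_acids)


print(cds_length_bp(1368))          # 4107 (SpCas9)
print(remaining_capacity_bp(1368))  # 393
print(remaining_capacity_bp(1000))  # 1497 (a Cas9 of about 1000 residues)
```

A Cas9 of about 1000 amino acids leaves roughly 1.5 kb for the remaining expression elements, whereas SpCas9 leaves under 0.4 kb.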
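Therefore, finding a small CRISPR/Cas9 system with high editing activity, high specificity and a simple PAM sequence is the key to solving the above problems.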
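Summary of the Invention
In view of the above problems, the object of the present invention is to provide a new CRISPR/Cas9 gene editing system with high editing activity, high specificity, a small Cas9 protein and a simple PAM sequence, and applications thereof.
Accordingly, in a first aspect, the present invention provides a CRISPR/Cas9 gene editing system, the gene editing being performed in cells or in vitro, characterized in that the CRISPR/Cas9 system is a complex of a Cas9 protein and an sgRNA that can precisely locate a target DNA sequence and cleave it, causing double-strand break damage in the target DNA sequence;
the Cas9 protein is:
SauriCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 1,
ShaCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 2,
SlugCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 3,
SlutCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 4,
Sa-Sauri protein, which has the amino acid sequence shown in SEQ ID NO: 5,
Sa-SepCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 6,
Sa-SeqCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 7,
Sa-ShaCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 8,
Sa-SlugCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 9,
Sa-SlutCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 10, or
SlugCas9-HF protein, which has the amino acid sequence shown in SEQ ID NO: 58, or
the Cas9 protein has an amino acid sequence at least 80% identical to the amino acid sequence shown in any one of SEQ ID NO: 1-10 and SEQ ID NO: 58; and
the sgRNA has the nucleotide sequence shown in SEQ ID NO: 11, or is an sgRNA sequence modified based on SEQ ID NO: 11.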
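In a second aspect, the present invention provides a method for gene editing in cells using the CRISPR/Cas9 gene editing system of the first aspect of the present invention. The method edits the target DNA sequence by having the complex of Cas9 protein and sgRNA recognize and locate that sequence, and includes the following steps:
(1) synthesizing a Cas9 gene sequence and cloning it into an expression vector such as pAAV2_ITR to obtain an expression vector cloned with the Cas9 gene sequence, such as pAAV2_Cas9_ITR, wherein the Cas9 gene sequence:
(a) has a nucleotide sequence encoding the amino acid sequence shown in any one of SEQ ID NO: 1-10 and SEQ ID NO: 58,
(b) has a nucleotide sequence encoding an amino acid sequence at least 80% identical to the amino acid sequence shown in any one of SEQ ID NO: 1-10 and SEQ ID NO: 58; or
(c) is a humanized Cas9 gene sequence, for example one having the nucleotide sequence shown in any one of SEQ ID NO: 23-32 and SEQ ID NO: 112;
(2) synthesizing the oligonucleotide single-stranded DNA corresponding to the sgRNA, namely the oligonucleotide forward strand sequence and the oligonucleotide reverse strand sequence, annealing the two strands, and ligating the annealed product into the restriction site of the expression vector cloned with the Cas9 gene sequence, such as the BsaI restriction site of plasmid pAAV2_Cas9_U6_BsaI, to obtain an expression vector expressing the Cas9 protein and the sgRNA, such as pAAV2_Cas9-hU6-sgRNA, wherein the sgRNA has the nucleotide sequence shown in SEQ ID NO: 11, or a nucleotide sequence at least 80% identical to the nucleotide sequence shown in SEQ ID NO: 11, or a modification based on that nucleotide sequence, the modification including, for example, phosphorylation, shortening, lengthening, sulfurization, methylation or hydroxylation;
(3) delivering the expression vector expressing the Cas9 protein and the sgRNA into cells containing the target site, so as to edit the target site.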
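In a third aspect, the present invention provides a kit of a CRISPR/Cas9 gene editing system for gene editing, the kit comprising:
(1) a Cas9 protein and an sgRNA, wherein the Cas9 protein is:
SauriCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 1,
ShaCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 2,
SlugCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 3,
SlutCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 4,
Sa-Sauri protein, which has the amino acid sequence shown in SEQ ID NO: 5,
Sa-SepCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 6,
Sa-SeqCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 7,
Sa-ShaCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 8,
Sa-SlugCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 9,
Sa-SlutCas9 protein, which has the amino acid sequence shown in SEQ ID NO: 10, or
SlugCas9-HF protein, which has the amino acid sequence shown in SEQ ID NO: 58, or
the Cas9 protein has an amino acid sequence at least 80% identical to the amino acid sequence shown in any one of SEQ ID NO: 1-10 and SEQ ID NO: 58; and
the sgRNA has the nucleotide sequence shown in SEQ ID NO: 11, or is an sgRNA sequence modified based on SEQ ID NO: 11;
or
(2) an expression vector cloned with a Cas9 gene sequence and a sequence expressing the sgRNA, wherein
the Cas9 gene sequence:
(a) has a nucleotide sequence encoding the amino acid sequence shown in any one of SEQ ID NO: 1-10 and SEQ ID NO: 58,
(b) has a nucleotide sequence encoding an amino acid sequence at least 80% identical to the amino acid sequence shown in any one of SEQ ID NO: 1-10 and SEQ ID NO: 58; or
(c) is a humanized Cas9 gene sequence, for example one having the nucleotide sequence shown in any one of SEQ ID NO: 23-32 and SEQ ID NO: 112, and
the sgRNA has the nucleotide sequence shown in SEQ ID NO: 11, or is an sgRNA sequence modified based on SEQ ID NO: 11.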
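In a fourth aspect, the present invention provides use of the CRISPR/Cas9 gene editing system of the first aspect in gene knockout, site-specific base changes, site-specific insertion, regulation of gene transcription, regulation of DNA methylation, DNA acetylation modification, histone acetylation modification, single-base editing, or chromatin imaging and tracking.
Compared with existing CRISPR/Cas9 gene editing systems, the Cas9 proteins in the system of the present invention are smaller, that is, they have fewer amino acids, and can therefore be packaged efficiently. In addition, the PAM sequences recognized by the system of the present invention are simpler, so more DNA sequences in the genome can be targeted, and editing efficiency is higher.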
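Brief Description of the Drawings
Figure 1 is a schematic diagram of the CRISPR/Cas9 gene editing system cleaving a target DNA sequence. The gray ellipse represents the Cas9 protein, the black curved line represents the sgRNA sequence, and the darkened region on the upper strand of the genome represents the PAM sequence.
Figure 2 is a schematic map of plasmid pAAV2_Cas9_U6_BsaI. It includes the elements AAV2 ITR, CMV enhancer, CMV promoter, SV40 NLS, Cas9, nucleoplasmin NLS, 3x HA, bGH poly(A), human U6 promoter (hU6), BsaI restriction sites, and the sgRNA scaffold sequence.
Figures 3a to 3j show part of the next-generation sequencing results after editing of the target-site DNA sequence. The editing outcomes include deletions, insertions, and mismatches, and the last 4 bp or 5 bp represent the PAM sequence, which for Figures 3a to 3j is NNGG, NNGRM, NNGG, NNGR, NNGG, NNGG, NNGRM, NNGRM, NNGG, and NNGRR, respectively. Figure 3k shows editing by the SlugCas9-HF gene editing system at two target sites; the X axis shows the two target sites G4 and G7, and the Y axis shows indel efficiency.
Figures 4a to 4j show T7 Endonuclease I digestion results at endogenous sites; the arrows indicate the sizes of the cleaved fragments. Figure 4k shows the specificity of the SlugCas9-HF gene editing system in a GFP reporter cell line. The top of the figure is a schematic of the GFP reporter: a specific target DNA sequence and PAM are inserted between the ATG start codon and the GFP coding sequence, causing a frameshift in GFP. After the gene editing system cuts the target DNA, cellular repair restores the GFP reading frame in a fraction of the cells, which then produce green fluorescence. In the bar chart, the Y axis shows the percentage of GFP-positive cells and the X axis shows the on-target sgRNA and mismatch sgRNA sequences.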
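Detailed Description
The present invention is further illustrated by the following examples, which do not limit the scope of protection of the invention in any way; rather, the scope of protection is defined by the appended claims.
As described in the background section, existing CRISPR/Cas9 gene editing systems have various problems. For example, the Cas9 protein is too large for the system to be packaged efficiently into a vector such as a virus. The PAM sequences currently targeted are also relatively complex, which narrows the editing range and hinders wide application. Furthermore, the editing activity of current small Cas9 proteins is generally low.
In view of these problems, the object of the present invention is to provide a new CRISPR/Cas9 gene editing system with high editing activity, high specificity, a small Cas9 protein, and a simple PAM sequence, and applications thereof.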
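Accordingly, in a first aspect, the present invention provides a CRISPR/Cas9 gene editing system, the gene editing being carried out in a cell or in vitro, the CRISPR/Cas9 system being a complex of a Cas9 protein and an sgRNA that can precisely locate a target DNA sequence and cleave it, producing a double-strand break in the target DNA sequence;
the Cas9 protein is:
SauriCas9 protein, having the amino acid sequence shown in SEQ ID NO: 1,
ShaCas9 protein, having the amino acid sequence shown in SEQ ID NO: 2,
SlugCas9 protein, having the amino acid sequence shown in SEQ ID NO: 3,
SlutCas9 protein, having the amino acid sequence shown in SEQ ID NO: 4,
Sa-SauriCas9 protein, having the amino acid sequence shown in SEQ ID NO: 5,
Sa-SepCas9 protein, having the amino acid sequence shown in SEQ ID NO: 6,
Sa-SeqCas9 protein, having the amino acid sequence shown in SEQ ID NO: 7,
Sa-ShaCas9 protein, having the amino acid sequence shown in SEQ ID NO: 8,
Sa-SlugCas9 protein, having the amino acid sequence shown in SEQ ID NO: 9,
Sa-SlutCas9 protein, having the amino acid sequence shown in SEQ ID NO: 10, or
SlugCas9-HF protein, having the amino acid sequence shown in SEQ ID NO: 58, or
the Cas9 protein has an amino acid sequence at least 80% identical to the amino acid sequence shown in any one of SEQ ID NOs: 1-10 and SEQ ID NO: 58; and
the sgRNA has the nucleotide sequence shown in SEQ ID NO: 11, or is an sgRNA sequence engineered from SEQ ID NO: 11.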
Herein, the sequences of SEQ ID NOs: 1-11 and SEQ ID NO: 58 are as follows:
SEQ ID NO: 1, amino acid sequence of the SauriCas9 protein:
Figure PCTCN2020107880-appb-000001
SEQ ID NO: 2, amino acid sequence of the ShaCas9 protein:
Figure PCTCN2020107880-appb-000002
Figure PCTCN2020107880-appb-000003
SEQ ID NO: 3, amino acid sequence of the SlugCas9 protein:
Figure PCTCN2020107880-appb-000004
SEQ ID NO: 4, amino acid sequence of the SlutCas9 protein:
Figure PCTCN2020107880-appb-000005
Figure PCTCN2020107880-appb-000006
SEQ ID NO: 5, amino acid sequence of the Sa-SauriCas9 protein:
Figure PCTCN2020107880-appb-000007
SEQ ID NO: 6, amino acid sequence of the Sa-SepCas9 protein:
Figure PCTCN2020107880-appb-000008
Figure PCTCN2020107880-appb-000009
SEQ ID NO: 7, amino acid sequence of the Sa-SeqCas9 protein:
Figure PCTCN2020107880-appb-000010
SEQ ID NO: 8, amino acid sequence of the Sa-ShaCas9 protein:
Figure PCTCN2020107880-appb-000011
Figure PCTCN2020107880-appb-000012
SEQ ID NO: 9, amino acid sequence of the Sa-SlugCas9 protein:
Figure PCTCN2020107880-appb-000013
Figure PCTCN2020107880-appb-000014
SEQ ID NO: 10, amino acid sequence of the Sa-SlutCas9 protein:
Figure PCTCN2020107880-appb-000015
SEQ ID NO: 11, nucleotide sequence of the sgRNA:
Figure PCTCN2020107880-appb-000016
SEQ ID NO: 58, amino acid sequence of the SlugCas9-HF protein:
Figure PCTCN2020107880-appb-000017
Figure PCTCN2020107880-appb-000018
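As described above, the inventors have found several Cas9 proteins that can form a complex with a single guide RNA (sgRNA). The complex of the SauriCas9 protein with the sgRNA is referred to in this application as the CRISPR/SauriCas9 gene editing system (that is, a system in which the SauriCas9 protein and the sgRNA act together to achieve gene editing). Complexes of the other Cas9 proteins with the sgRNA are named in the same way, for example the CRISPR/ShaCas9 gene editing system, the CRISPR/SlugCas9 gene editing system, and so on.
All Cas9 proteins of the present invention are small, each having fewer than 1100 amino acids. Specifically, SauriCas9 has 1061 amino acids; ShaCas9, Sa-SepCas9, Sa-ShaCas9, and Sa-SlugCas9 each have 1055; Sa-SeqCas9 has 1053; SlugCas9, SlugCas9-HF, and SlutCas9 each have 1054; and Sa-SauriCas9 and Sa-SlutCas9 each have 1056.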
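To make the size argument concrete, the following minimal sketch converts the amino-acid counts quoted above into coding-sequence lengths and compares them with the roughly 4.5 kb AAV packaging limit mentioned in the background section. The function names are illustrative, and the arithmetic assumes one stop codon and ignores promoters, NLS and tag sequences, the poly(A) signal, the sgRNA cassette, and the ITRs, so it is only a rough indication of the space left, not a packaging calculation from the patent.

```python
# Amino-acid counts quoted in the text; SpCas9 (1368 aa) comes from the
# background section and serves as the reference.
CAS9_LENGTH_AA = {
    "SpCas9": 1368,
    "SauriCas9": 1061,
    "ShaCas9": 1055,
    "SlugCas9": 1054,
    "SlutCas9": 1054,
    "Sa-SauriCas9": 1056,
    "Sa-SepCas9": 1055,
    "Sa-SeqCas9": 1053,
    "Sa-ShaCas9": 1055,
    "Sa-SlugCas9": 1055,
    "Sa-SlutCas9": 1056,
    "SlugCas9-HF": 1054,
}

AAV_CAPACITY_BP = 4500  # approximate packaging limit quoted in the background


def coding_length_bp(n_aa: int) -> int:
    """Length of an ORF encoding n_aa residues plus one stop codon (assumption)."""
    return 3 * (n_aa + 1)


def remaining_capacity_bp(n_aa: int, capacity_bp: int = AAV_CAPACITY_BP) -> int:
    """Base pairs left over for all non-Cas9 elements of the cassette."""
    return capacity_bp - coding_length_bp(n_aa)


if __name__ == "__main__":
    for name, n_aa in CAS9_LENGTH_AA.items():
        print(f"{name:13s} {coding_length_bp(n_aa):5d} bp coding, "
              f"{remaining_capacity_bp(n_aa):5d} bp left")
```

Under these assumptions the SpCas9 coding sequence alone leaves only a few hundred base pairs for everything else, whereas each of the compact proteins leaves more than 1.3 kb.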
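In one embodiment, the Cas9 protein of the present invention has an amino acid sequence that is at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical, or identical at any percentage from 80% to 100%, to the amino acid sequence shown in any one of SEQ ID NOs: 1-10 and SEQ ID NO: 58.
In one embodiment, the cells include eukaryotic cells and prokaryotic cells. The eukaryotic cells include, for example, mammalian cells and plant cells. The mammalian cells include, for example, Chinese hamster ovary cells, baby hamster kidney cells, mouse Sertoli cells, mouse mammary tumor cells, Buffalo rat liver cells, rat hepatoma cells, the SV40-transformed monkey kidney CV1 line, monkey kidney cells, canine kidney cells, human cervical carcinoma cells, human lung cells, human liver cells, NIH/3T3 cells, human U2-OS osteosarcoma cells, human A549 cells, human K562 cells, human HEK293 cells, human HEK293T cells, human HCT116 cells, human MCF-7 cells, or TRI cells, but are not limited to these.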
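In one embodiment, the CRISPR/Cas9 system comprises the Staphylococcus auricularis Cas9 (SauriCas9) protein, which has the amino acid sequence shown in SEQ ID NO: 1 and acts together with a single guide RNA (sgRNA) to achieve gene editing.
In one embodiment, the SauriCas9 protein is derived from Staphylococcus auricularis, and its UniProt accession number is A0A2T4M4R5.
In one embodiment, the SauriCas9 protein includes SauriCas9 proteins with no cleavage activity, with single-strand cleavage activity only, or with double-strand cleavage activity.
In one embodiment, the CRISPR/Cas9 system comprises the Staphylococcus haemolyticus Cas9 (ShaCas9) protein, which has the amino acid sequence shown in SEQ ID NO: 2 and acts together with a single guide RNA (sgRNA) to achieve gene editing.
In one embodiment, the ShaCas9 protein is derived from Staphylococcus haemolyticus, and its UniProt accession number is A0A2T4SLN6.
In one embodiment, the ShaCas9 protein includes ShaCas9 proteins with no cleavage activity, with single-strand cleavage activity only, or with double-strand cleavage activity.
In one embodiment, the CRISPR/Cas9 system comprises the Staphylococcus lugdunensis Cas9 (SlugCas9) protein, which has the amino acid sequence shown in SEQ ID NO: 3 and acts together with a single guide RNA (sgRNA) to achieve gene editing.
In one embodiment, the SlugCas9 protein is derived from Staphylococcus lugdunensis, and its UniProt accession number is A0A133QCR3.
In one embodiment, the SlugCas9 protein includes SlugCas9 proteins with no cleavage activity, with single-strand cleavage activity only, or with double-strand cleavage activity.
In one embodiment, the CRISPR/Cas9 system comprises the Staphylococcus lutrae Cas9 (SlutCas9) protein, which has the amino acid sequence shown in SEQ ID NO: 4 and acts together with a single guide RNA (sgRNA) to achieve gene editing.
In one embodiment, the SlutCas9 protein is derived from Staphylococcus lutrae, and its UniProt accession number is A0A1W6BMI2.
In one embodiment, the SlutCas9 protein includes SlutCas9 proteins with no cleavage activity, with single-strand cleavage activity only, or with double-strand cleavage activity.
In one embodiment, the CRISPR/Cas9 system comprises the Sa-SauriCas9 protein, a fusion protein obtained by replacing the PI domain of SaCas9 with the PI domain of SauriCas9, where SauriCas9 is Staphylococcus auricularis Cas9. The Sa-SauriCas9 protein has the amino acid sequence shown in SEQ ID NO: 5. The Sa-SauriCas9 fusion protein acts together with a single guide RNA (sgRNA) to achieve gene editing.
In one embodiment, the SauriCas9 protein is derived from Staphylococcus auricularis, and its UniProt accession number is A0A2T4M4R5.
In one embodiment, the Sa-SauriCas9 protein includes Sa-SauriCas9 proteins with no cleavage activity, with single-strand cleavage activity only, or with double-strand cleavage activity.
In one embodiment, the CRISPR/Cas9 system comprises the Sa-SepCas9 protein, a fusion protein obtained by replacing the PI domain of SaCas9 with the PI domain of SepCas9, where SepCas9 is Staphylococcus epidermidis Cas9. The Sa-SepCas9 protein has the amino acid sequence shown in SEQ ID NO: 6. The Sa-SepCas9 fusion protein acts together with a single guide RNA (sgRNA) to achieve gene editing.
In one embodiment, the SepCas9 protein is derived from Staphylococcus epidermidis; its UniProt accession number is A0A1Q9MLU4 and its NCBI accession number is WP_075777761.1.
In one embodiment, the Sa-SepCas9 protein includes Sa-SepCas9 proteins with no cleavage activity, with single-strand cleavage activity only, or with double-strand cleavage activity.
In one embodiment, the CRISPR/Cas9 system comprises the Sa-SeqCas9 protein, a fusion protein obtained by replacing the PI domain of SaCas9 with the PI domain of SeqCas9, where SeqCas9 is Staphylococcus equorum Cas9. The Sa-SeqCas9 protein has the amino acid sequence shown in SEQ ID NO: 7. The Sa-SeqCas9 fusion protein acts together with a single guide RNA (sgRNA) to achieve gene editing.
In one embodiment, the SeqCas9 protein is derived from Staphylococcus equorum, and its UniProt accession number is A0A1E5TL62.
In one embodiment, the Sa-SeqCas9 protein includes Sa-SeqCas9 proteins with no cleavage activity, with single-strand cleavage activity only, or with double-strand cleavage activity.
In one embodiment, the CRISPR/Cas9 system comprises the Sa-ShaCas9 protein, a fusion protein obtained by replacing the PI domain of SaCas9 with the PI domain of ShaCas9, where ShaCas9 is Staphylococcus haemolyticus Cas9. The Sa-ShaCas9 protein has the amino acid sequence shown in SEQ ID NO: 8. The Sa-ShaCas9 fusion protein acts together with a single guide RNA (sgRNA) to achieve gene editing.
In one embodiment, the ShaCas9 protein is derived from Staphylococcus haemolyticus, and its UniProt accession number is A0A2T4SLN6.
In one embodiment, the Sa-ShaCas9 protein includes Sa-ShaCas9 proteins with no cleavage activity, with single-strand cleavage activity only, or with double-strand cleavage activity.
In one embodiment, the CRISPR/Cas9 system comprises the Sa-SlugCas9 protein, a fusion protein obtained by replacing the PI domain of SaCas9 with the PI domain of SlugCas9, where SlugCas9 is Staphylococcus lugdunensis Cas9. The Sa-SlugCas9 protein has the amino acid sequence shown in SEQ ID NO: 9. The Sa-SlugCas9 fusion protein acts together with a single guide RNA (sgRNA) to achieve gene editing.
In one embodiment, the SlugCas9 protein is derived from Staphylococcus lugdunensis, and its UniProt accession number is A0A133QCR3.
In one embodiment, the Sa-SlugCas9 protein includes Sa-SlugCas9 proteins with no cleavage activity, with single-strand cleavage activity only, or with double-strand cleavage activity.
In one embodiment, the CRISPR/Cas9 system comprises the Sa-SlutCas9 protein, a fusion protein obtained by replacing the PI domain of SaCas9 with the PI domain of SlutCas9, where SlutCas9 is Staphylococcus lutrae Cas9. The Sa-SlutCas9 protein has the amino acid sequence shown in SEQ ID NO: 10. The Sa-SlutCas9 fusion protein acts together with a single guide RNA (sgRNA) to achieve gene editing.
In one embodiment, the SlutCas9 protein is derived from Staphylococcus lutrae, and its UniProt accession number is A0A1W6BMI2.
In one embodiment, the Sa-SlutCas9 protein includes Sa-SlutCas9 proteins with no cleavage activity, with single-strand cleavage activity only, or with double-strand cleavage activity.
In one embodiment, the CRISPR/Cas9 system comprises the SlugCas9-HF protein, an amino-acid-modified protein in which the mutations R247A, N415A, T421A, and R656A are introduced into SlugCas9; SlugCas9-HF stands for Staphylococcus lugdunensis Cas9-HiFi. SlugCas9-HF acts together with a single guide RNA (sgRNA) to achieve gene editing. The complex of the SlugCas9-HF protein with the sgRNA has a low off-target rate and high specificity, and has low tolerance for non-target DNA sequences, that is, it essentially does not or does not cleave such non-target DNA sequences.
In one embodiment, the SlugCas9 protein is derived from Staphylococcus lugdunensis, and its UniProt accession number is A0A133QCR3.
In one embodiment, the SlugCas9-HF protein includes SlugCas9-HF proteins with no cleavage activity, with single-strand cleavage activity only, or with double-strand cleavage activity.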
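In one embodiment, precisely locating the target DNA sequence includes the 20-nt or 21-nt sequence at the 5' end of the sgRNA forming a complementary base-paired structure with the target DNA sequence.
In one embodiment, precisely locating the target DNA sequence includes the complex of the Cas9 protein and the sgRNA recognizing the PAM sequence on the target DNA sequence.
In one embodiment, for the complex of the SlugCas9-HF protein and the sgRNA, the 20-nt or 21-nt sequence at the 5' end of the sgRNA can form an incompletely complementary base-paired structure with a non-target DNA sequence. Specifically, in the present invention, the incompletely complementary structure contains a base-paired portion and a non-base-paired portion. In a preferred embodiment, the non-target DNA sequence has two or more base mismatches with the sgRNA.
In a further embodiment, the complex of the SlugCas9-HF protein and the sgRNA can recognize the PAM sequence on the non-target DNA sequence.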
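In one embodiment, the PAM sequence and the target DNA sequence are as follows:
for the SauriCas9 protein, the PAM is NNGG and the target DNA sequence is as shown in SEQ ID NO: 12;
for the ShaCas9 protein, the PAM is NNGRM and the target DNA sequence is as shown in SEQ ID NO: 13;
for the SlugCas9 protein, the PAM is NNGG and the target DNA sequence is as shown in SEQ ID NO: 12;
for the SlutCas9 protein, the PAM is NNGR and the target DNA sequence is as shown in SEQ ID NO: 14;
for the Sa-SauriCas9 protein, the PAM is NNGG and the target DNA sequence is as shown in SEQ ID NO: 12;
for the Sa-SepCas9 protein, the PAM is NNGG and the target DNA sequence is as shown in SEQ ID NO: 12;
for the Sa-SeqCas9 protein, the PAM is NNGRM and the target DNA sequence is as shown in SEQ ID NO: 13;
for the Sa-ShaCas9 protein, the PAM is NNGRM and the target DNA sequence is as shown in SEQ ID NO: 13;
for the Sa-SlugCas9 protein, the PAM is NNGG and the target DNA sequence is as shown in SEQ ID NO: 12;
for the Sa-SlutCas9 protein, the PAM is NNGRR and the target DNA sequence is as shown in SEQ ID NO: 15;
for the SlugCas9-HF protein, the PAM is NNGG and the target DNA sequence is as shown in SEQ ID NO: 12.
The nucleotide sequences of SEQ ID NOs: 12-15 are as follows:
NNNNNNNNNNNNNNNNNNNNNNNGG (SEQ ID NO: 12);
NNNNNNNNNNNNNNNNNNNNNNNGRM (SEQ ID NO: 13);
NNNNNNNNNNNNNNNNNNNNNNNGR (SEQ ID NO: 14);
NNNNNNNNNNNNNNNNNNNNNNNGRR (SEQ ID NO: 15).
As those skilled in the art will understand, the base N above denotes any one of the four bases A (adenine), T (thymine), C (cytosine), and G (guanine); the base M denotes either A or C; and the base R denotes either A or G.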
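The degenerate codes above (N, R, M) map directly onto character classes, so a PAM such as NNGRM can be checked with a regular expression. The sketch below is illustrative only: the function names and the `PAMS` table layout are assumptions, the scan covers only the strand it is given, and it says nothing about editing activity at any site it finds.

```python
import re

# IUPAC codes used in the text: N = A/C/G/T, R = A/G, M = A/C.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "M": "[AC]", "N": "[ACGT]"}

# PAM per Cas9 protein, as listed in the text.
PAMS = {
    "SauriCas9": "NNGG", "ShaCas9": "NNGRM", "SlugCas9": "NNGG",
    "SlutCas9": "NNGR", "Sa-SauriCas9": "NNGG", "Sa-SepCas9": "NNGG",
    "Sa-SeqCas9": "NNGRM", "Sa-ShaCas9": "NNGRM", "Sa-SlugCas9": "NNGG",
    "Sa-SlutCas9": "NNGRR", "SlugCas9-HF": "NNGG",
}


def pam_pattern(pam: str) -> str:
    """Translate a degenerate PAM into a regular-expression fragment."""
    return "".join(IUPAC[base] for base in pam.upper())


def matches_pam(seq: str, pam: str) -> bool:
    """True if seq (same length as pam) satisfies the degenerate PAM."""
    return re.fullmatch(pam_pattern(pam), seq.upper()) is not None


def find_targets(dna: str, pam: str, spacer_len: int = 21):
    """List (start, protospacer, pam_seq) for sites on the given strand only.

    A lookahead is used so that overlapping candidate sites are all reported.
    """
    pattern = re.compile("(?=([ACGT]{%d})(%s))" % (spacer_len, pam_pattern(pam)))
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna.upper())]
```

For example, `find_targets(genome_fragment, PAMS["SlugCas9"])` would list every 21-nt protospacer followed by an NNGG motif on that strand; the reverse strand would need a second pass on the reverse complement.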
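In one embodiment, the statement that the complex of the Cas9 protein and the sgRNA can precisely locate the target DNA sequence means that the complex can recognize and bind the target DNA sequence, or that it brings another protein fused to the Cas9 protein, or a protein that specifically recognizes the sgRNA, to the position of the target DNA sequence.
In one embodiment, the complex of the Cas9 protein and the sgRNA, the other protein fused to the Cas9 protein, or the protein that specifically recognizes the sgRNA can modify and regulate the target DNA region; the modification and regulation include, but are not limited to, regulation of gene transcription, regulation of DNA methylation, DNA acetylation modification, histone acetylation modification, single-base editing, or chromatin imaging and tracking.
In one embodiment, the complex of the SlugCas9-HF protein and the sgRNA has low tolerance for non-target DNA sequences, that is, the complex essentially cannot or cannot recognize and bind a non-target DNA sequence, or essentially cannot or cannot bring another protein fused to the SlugCas9-HF protein, or a protein that specifically recognizes the sgRNA, to the position of a non-target DNA sequence.
Herein, the term "essentially" in the expression "the complex of the SlugCas9-HF protein and the sgRNA essentially cannot recognize and bind a non-target DNA sequence" means that the extent to which the complex recognizes and binds a non-target DNA sequence, if at all, has little or no biological and/or statistical significance.
In a further embodiment, the complex of the SlugCas9-HF protein and the sgRNA, the other protein fused to the SlugCas9-HF protein, or the protein that specifically recognizes the sgRNA essentially cannot or cannot modify and regulate a non-target DNA region; the modification and regulation include, but are not limited to, regulation of gene transcription, regulation of DNA methylation, DNA acetylation modification, histone acetylation modification, single-base editing, or chromatin imaging and tracking.
Similarly, the term "essentially" in the expression "the complex of the SlugCas9-HF protein and the sgRNA, the other protein fused to the SlugCas9-HF protein, or the protein that specifically recognizes the sgRNA essentially cannot or cannot modify and regulate a non-target DNA region" means that the extent to which any of these modifies and regulates a non-target DNA region, if at all, has little or no biological and/or statistical significance.
In one embodiment, the single-base editing includes, but is not limited to, conversion of adenine to guanine, conversion of cytosine to thymine, conversion of cytosine to uracil, or conversion between other bases.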
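The CRISPR/Cas9 gene editing system provided by the present invention has high editing activity and high specificity, which gives it a clear advantage over existing CRISPR/Cas9 gene editing systems.
The present invention measures the editing efficiency and off-target rate of the CRISPR/Cas9 system using gene synthesis, molecular cloning, cell transfection, deep sequencing of PCR products, flow cytometry, bioinformatic analysis, and related techniques.
The CRISPR/Cas9 gene editing system of the present invention was validated in a GFP reporter cell line containing the target site. The results show that the system edits the target gene with high specificity and a low off-target rate.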
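Accordingly, in a second aspect, the present invention provides a method for gene editing in a cell using the CRISPR/Cas9 gene editing system of the first aspect, in which the complex of the Cas9 protein and the sgRNA recognizes and locates a target DNA sequence in order to edit it, the method comprising the following steps:
(1) synthesizing a Cas9 gene sequence and cloning it into an expression vector, for example pAAV2_ITR, to obtain an expression vector carrying the Cas9 gene sequence, for example pAAV2_Cas9_ITR, wherein the Cas9 gene sequence:
(a) has a nucleotide sequence encoding the amino acid sequence shown in any one of SEQ ID NOs: 1-10 and SEQ ID NO: 58,
(b) has a nucleotide sequence encoding an amino acid sequence at least 80% identical to the amino acid sequence shown in any one of SEQ ID NOs: 1-10 and SEQ ID NO: 58; or
(c) is a humanized Cas9 gene sequence, for example one having the nucleotide sequence shown in any one of SEQ ID NOs: 23-32 and SEQ ID NO: 112;
(2) synthesizing single-stranded oligonucleotide DNAs corresponding to the sgRNA, namely a forward oligonucleotide strand (Oligo-F) and a reverse oligonucleotide strand (Oligo-R), annealing them, and ligating the annealed product into a restriction site of the expression vector carrying the Cas9 gene sequence, for example the BsaI site of plasmid pAAV2_Cas9_U6_BsaI, to obtain an expression vector expressing the Cas9 protein and the sgRNA, for example pAAV2_Cas9-hU6-sgRNA, wherein the sgRNA has the nucleotide sequence shown in SEQ ID NO: 11, or a nucleotide sequence at least 80% identical to it, or a modification of that nucleotide sequence, the modification including for example phosphorylation, shortening, lengthening, sulfurization, methylation, or hydroxylation;
(3) delivering the expression vector expressing the Cas9 protein and the sgRNA into cells containing the target site, so that the target site is edited.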
SEQ ID NOs: 23-32 and SEQ ID NO: 112 are as follows:
SEQ ID NO: 23, humanized SauriCas9 gene sequence:
Figure PCTCN2020107880-appb-000019
Figure PCTCN2020107880-appb-000020
Figure PCTCN2020107880-appb-000021
SEQ ID NO: 24, humanized ShaCas9 gene sequence:
Figure PCTCN2020107880-appb-000022
Figure PCTCN2020107880-appb-000023
Figure PCTCN2020107880-appb-000024
SEQ ID NO: 25, humanized SlugCas9 gene sequence:
Figure PCTCN2020107880-appb-000025
Figure PCTCN2020107880-appb-000026
SEQ ID NO: 26, humanized SlutCas9 gene sequence:
Figure PCTCN2020107880-appb-000027
Figure PCTCN2020107880-appb-000028
Figure PCTCN2020107880-appb-000029
SEQ ID NO: 27, humanized Sa-SauriCas9 gene sequence:
Figure PCTCN2020107880-appb-000030
Figure PCTCN2020107880-appb-000031
Figure PCTCN2020107880-appb-000032
SEQ ID NO: 28, humanized Sa-SepCas9 gene sequence:
Figure PCTCN2020107880-appb-000033
Figure PCTCN2020107880-appb-000034
SEQ ID NO: 29, humanized Sa-SeqCas9 gene sequence:
Figure PCTCN2020107880-appb-000035
Figure PCTCN2020107880-appb-000036
Figure PCTCN2020107880-appb-000037
SEQ ID NO: 30, humanized Sa-ShaCas9 gene sequence:
Figure PCTCN2020107880-appb-000038
Figure PCTCN2020107880-appb-000039
Figure PCTCN2020107880-appb-000040
SEQ ID NO: 31, humanized Sa-SlugCas9 gene sequence:
Figure PCTCN2020107880-appb-000041
Figure PCTCN2020107880-appb-000042
SEQ ID NO: 32, humanized Sa-SlutCas9 gene sequence:
Figure PCTCN2020107880-appb-000043
Figure PCTCN2020107880-appb-000044
Figure PCTCN2020107880-appb-000045
SEQ ID NO: 112, humanized SlugCas9-HF gene sequence:
Figure PCTCN2020107880-appb-000046
Figure PCTCN2020107880-appb-000047
Figure PCTCN2020107880-appb-000048
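In one embodiment, the expression vector may be a plasmid vector, a retroviral vector, an adenoviral vector, an adeno-associated viral vector such as pAAV2_ITR, and so on. It will be understood, however, that any other suitable expression vector may also be used.
In the present invention, an sgRNA with any desired target can be designed for the DNA sequence to be edited, and the sgRNA may to some extent carry modifications well known in the art. Accordingly, in one embodiment, the modifications to the sgRNA include, but are not limited to, phosphorylation, shortening, lengthening, sulfurization, methylation, and hydroxylation.
In the present invention, any desired mismatch sgRNA can likewise be designed for the DNA sequence to be edited, and the sgRNA may to some extent carry modifications well known in the art, including but not limited to phosphorylation, shortening, lengthening, sulfurization, methylation, and hydroxylation.
In one embodiment, the CRISPR/Cas9 system delivered in step (3) into cells containing the target site may include, but is not limited to, a plasmid vector, retrovirus, adenovirus, or adeno-associated virus vector expressing the Cas9 protein of the present invention and the sgRNA, or the sgRNA and the protein themselves, as required.
In a further embodiment, in step (3), the delivery tools include, but are not limited to, liposomes, cationic polymers, nanoparticles, multifunctional envelope-type nano devices, and viral vectors.
In a further embodiment, in step (3), the cells include, but are not limited to, eukaryotic cells and prokaryotic cells such as bacterial cells. The eukaryotic cells include, for example, mammalian cells and plant cells. The mammalian cells include, for example, animal cells such as Chinese hamster ovary cells, baby hamster kidney cells, mouse Sertoli cells, mouse mammary tumor cells, Buffalo rat liver cells, rat hepatoma cells, the SV40-transformed monkey kidney CV1 line, monkey kidney cells, and canine kidney cells, and human cells such as human cervical carcinoma cells, human lung cells, human liver cells, NIH/3T3 cells, human U2-OS osteosarcoma cells, human A549 cells, human K562 cells, human HEK293 cells, human HEK293T cells, human HCT116 cells, human MCF-7 cells, or TRI cells.
In a further embodiment, the modification in step (2) includes, but is not limited to, phosphorylation, shortening, lengthening, sulfurization, or methylation.
In a specific embodiment, for Cas9 genes other than the SlugCas9-HF gene, the Oligo-F is SEQ ID NO: 16 and the Oligo-R is SEQ ID NO: 17; for the SlugCas9-HF gene, the Oligo-F and Oligo-R comprise a first forward oligonucleotide strand (Oligo-F1) and a first reverse oligonucleotide strand (Oligo-R1), shown in SEQ ID NO: 59 and SEQ ID NO: 60, and a second forward oligonucleotide strand (Oligo-F2) and a second reverse oligonucleotide strand (Oligo-R2), shown in SEQ ID NO: 61 and SEQ ID NO: 62.
As those skilled in the art will understand, the Oligo-F and Oligo-R sequences must be annealed to form double-stranded DNA. Accordingly, in one embodiment, the annealing reaction contains 1 μL of 100 μM Oligo-F, 1 μL of 100 μM Oligo-R, and 28 μL of water. After vortex mixing, the reaction is placed in a PCR thermocycler and the following annealing program is run: 95°C for 5 min, then 85°C, 75°C, 65°C, 55°C, 45°C, 35°C, and 25°C for 1 min each, followed by a hold at 4°C, with a cooling rate of 0.3°C/s.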
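The annealing recipe above implies a final oligo concentration and an approximate run time that are easy to work out. The sketch below encodes the program as data; the names are illustrative, and it assumes that the 0.3°C/s ramp applies between every pair of consecutive hold temperatures and that the final cool-down to the 4°C hold is not counted.

```python
# Annealing program from the text: (temperature in degrees C, hold time in minutes).
ANNEAL_STEPS = [(95, 5), (85, 1), (75, 1), (65, 1), (55, 1), (45, 1), (35, 1), (25, 1)]
RAMP_C_PER_S = 0.3  # cooling rate quoted in the text


def final_conc_uM(stock_uM: float, volume_uL: float, total_uL: float) -> float:
    """Concentration after diluting volume_uL of stock into total_uL."""
    return stock_uM * volume_uL / total_uL


def program_duration_min(steps, ramp_c_per_s: float) -> float:
    """Hold time plus ramp time between consecutive steps, in minutes."""
    hold_min = sum(minutes for _, minutes in steps)
    ramp_s = sum(abs(a[0] - b[0]) for a, b in zip(steps, steps[1:])) / ramp_c_per_s
    return hold_min + ramp_s / 60.0


if __name__ == "__main__":
    total_uL = 1 + 1 + 28  # Oligo-F + Oligo-R + water
    print(f"each oligo: {final_conc_uM(100, 1, total_uL):.2f} uM")
    print(f"program: about {program_duration_min(ANNEAL_STEPS, RAMP_C_PER_S):.1f} min")
```

Under these assumptions each oligo ends up at roughly 3.3 μM in the 30 μL reaction, and the program takes about 16 minutes before the 4°C hold.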
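In one embodiment, the expression vector carrying Cas9, such as plasmid pAAV2_Cas9_ITR, must be linearized with a restriction enzyme such as BsaI.
In a specific embodiment, the annealed product of the Oligo-F and Oligo-R sequences is ligated by DNA ligase to the linearized expression vector carrying Cas9, such as the pAAV2_Cas9_ITR backbone vector, yielding an expression vector carrying both Cas9 and the sgRNA, such as pAAV2_Cas9-hU6-sgRNA. In a more specific embodiment, pAAV2_Cas9-hU6-sgRNA is an adeno-associated virus backbone plasmid comprising AAV2 ITR, CMV enhancer, CMV promoter, SV40 NLS, Cas9, nucleoplasmin NLS, 3x HA, bGH poly(A), human U6 promoter, BsaI restriction sites, and the sgRNA scaffold sequence.
In a specific embodiment, the ligation product is transformed into competent cells, correct clones are confirmed by Sanger sequencing, and the plasmid is then extracted for later use.
In a specific embodiment, for Cas9 genes other than the SlugCas9-HF gene, the cells in step (3) are HEK293T cells whose target site has the nucleotide sequence shown in SEQ ID NO: 18; for the SlugCas9-HF gene, the target sites in the cells in step (3) have the nucleotide sequences shown in SEQ ID NO: 63 and SEQ ID NO: 64, respectively.
In a specific embodiment, the delivery tool in step (3) is a liposome, including for example
Figure PCTCN2020107880-appb-000049
or PEI.
In an optional embodiment, the method further comprises step (4), measuring the editing efficiency at the edited target site, for example by PCR amplification of the edited target site followed by T7EI digestion or next-generation sequencing.
In a more specific embodiment, the template for PCR amplification in step (4) is genomic DNA from the edited HEK293T cells.
In a specific embodiment, in step (4), for Cas9 genes other than the SlugCas9-HF gene, the primers for PCR amplification are SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, and SEQ ID NO: 22; for the SlugCas9-HF gene, the primers for PCR amplification are SEQ ID NO: 65, SEQ ID NO: 66, SEQ ID NO: 21, SEQ ID NO: 22, and SEQ ID NO: 67.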
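In a third aspect, the present invention also provides a kit of a CRISPR/Cas9 gene editing system for gene editing, the kit comprising:
(1) a Cas9 protein and an sgRNA, wherein the Cas9 protein is:
SauriCas9 protein, having the amino acid sequence shown in SEQ ID NO: 1,
ShaCas9 protein, having the amino acid sequence shown in SEQ ID NO: 2,
SlugCas9 protein, having the amino acid sequence shown in SEQ ID NO: 3,
SlutCas9 protein, having the amino acid sequence shown in SEQ ID NO: 4,
Sa-SauriCas9 protein, having the amino acid sequence shown in SEQ ID NO: 5,
Sa-SepCas9 protein, having the amino acid sequence shown in SEQ ID NO: 6,
Sa-SeqCas9 protein, having the amino acid sequence shown in SEQ ID NO: 7,
Sa-ShaCas9 protein, having the amino acid sequence shown in SEQ ID NO: 8,
Sa-SlugCas9 protein, having the amino acid sequence shown in SEQ ID NO: 9,
Sa-SlutCas9 protein, having the amino acid sequence shown in SEQ ID NO: 10, or
SlugCas9-HF protein, having the amino acid sequence shown in SEQ ID NO: 58, or
the Cas9 protein has an amino acid sequence at least 80% identical to the amino acid sequence shown in any one of SEQ ID NOs: 1-10 and SEQ ID NO: 58; and
the sgRNA has the nucleotide sequence shown in SEQ ID NO: 11, or is an sgRNA sequence engineered from SEQ ID NO: 11;
or
(2) an expression vector carrying a Cas9 gene sequence and the single-stranded oligonucleotide DNA corresponding to the sgRNA, wherein
the Cas9 gene sequence:
(a) has a nucleotide sequence encoding the amino acid sequence shown in any one of SEQ ID NOs: 1-10 and SEQ ID NO: 58,
(b) has a nucleotide sequence encoding an amino acid sequence at least 80% identical to the amino acid sequence shown in any one of SEQ ID NOs: 1-10 and SEQ ID NO: 58; or
(c) is a humanized Cas9 gene sequence, for example one having the nucleotide sequence shown in any one of SEQ ID NOs: 23-32 and SEQ ID NO: 112, and
the sgRNA has the nucleotide sequence shown in SEQ ID NO: 11, or is an sgRNA sequence engineered from SEQ ID NO: 11.
In a fourth aspect, the present invention also provides use of the CRISPR/Cas9 gene editing system of the first aspect in gene knockout, site-specific base changes, site-specific insertion, regulation of gene transcription, regulation of DNA methylation, DNA acetylation modification, histone acetylation modification, single-base editing, or chromatin imaging and tracking.
Compared with existing CRISPR/Cas9 gene editing systems, the Cas9 proteins in the system of the present invention are smaller, that is, they have fewer amino acids, and can therefore be packaged efficiently. In addition, the PAM sequences recognized by the system of the present invention are simpler, so more DNA sequences in the genome can be targeted, and editing efficiency is higher.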
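The present invention is described in more detail below through specific examples, with reference to the drawings. Unless otherwise stated, the reagents, methods, and equipment used in the present invention are those conventional in the art, and the reagents and materials used in the following examples are commercially available. Experimental methods for which no specific conditions are given are generally carried out under conventional conditions or under the conditions recommended by the manufacturer.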
Example 1. Construction of plasmid pAAV2_Cas9_ITR
Step (1): Download the amino acid sequence of each Cas9 gene according to its UniProt accession number.
In the present invention, the amino acid sequences were downloaded according to the UniProt accession numbers of the SauriCas9, ShaCas9, SlugCas9, SlutCas9, Sa-SauriCas9, Sa-SepCas9, Sa-SeqCas9, Sa-ShaCas9, Sa-SlugCas9, Sa-SlutCas9, and SlugCas9-HF genes. The UniProt accession numbers and amino acid sequences of the Cas9 genes are as follows:
Cas9 gene       UniProt accession   Amino acid sequence
SauriCas9       A0A2T4M4R5          SEQ ID NO: 1
ShaCas9         A0A2T4SLN6          SEQ ID NO: 2
SlugCas9        A0A133QCR3          SEQ ID NO: 3
SlutCas9        A0A1W6BMI2          SEQ ID NO: 4
Sa-SauriCas9    A0A2T4M4R5          SEQ ID NO: 5
Sa-SepCas9      A0A1Q9MLU4          SEQ ID NO: 6
Sa-SeqCas9      A0A1E5TL62          SEQ ID NO: 7
Sa-ShaCas9      A0A2T4SLN6          SEQ ID NO: 8
Sa-SlugCas9     A0A133QCR3          SEQ ID NO: 9
Sa-SlutCas9     A0A1W6BMI2          SEQ ID NO: 10
SlugCas9-HF     A0A133QCR3          SEQ ID NO: 58*
*Relative to SEQ ID NO: 3, SEQ ID NO: 58 carries the mutations R247A, N415A, T421A, and R656A.
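Mutation names such as R247A follow the usual "wild-type residue, position, new residue" notation. The sketch below shows how such substitutions can be applied to a sequence and how the resulting percent identity is computed. It uses a short made-up peptide rather than SEQ ID NO: 3, whose residues are not reproduced in this text; the function names, the toy sequence, and the assumption of 1-based numbering are illustrative.

```python
import re


def apply_substitutions(seq: str, mutations) -> str:
    """Apply point substitutions written like 'R247A' (1-based positions)."""
    residues = list(seq)
    for mutation in mutations:
        parsed = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
        if parsed is None:
            raise ValueError(f"cannot parse mutation {mutation!r}")
        wild_type, position, new = parsed.group(1), int(parsed.group(2)), parsed.group(3)
        if residues[position - 1] != wild_type:
            raise ValueError(f"{mutation}: found {residues[position - 1]} at {position}")
        residues[position - 1] = new
    return "".join(residues)


def percent_identity(a: str, b: str) -> float:
    """Identity between two equal-length, already aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    toy = "MKRNT"  # hypothetical peptide, not taken from the patent
    variant = apply_substitutions(toy, ["R3A", "T5A"])
    print(variant, percent_identity(toy, variant))
    # Four substitutions in a 1054-residue protein, as stated for SlugCas9-HF:
    print(round(100.0 * (1054 - 4) / 1054, 2))
```

By the same arithmetic, four substitutions in a 1054-residue protein correspond to about 99.62% identity with the parent sequence, well above the 80% threshold used elsewhere in the text.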
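Step (2): The amino acid sequences of the above Cas9 proteins were codon-optimized to obtain coding sequences that are highly expressed in human cells. The coding sequences for high expression in human cells of the SauriCas9, ShaCas9, SlugCas9, SlutCas9, Sa-SauriCas9, Sa-SepCas9, Sa-SeqCas9, Sa-ShaCas9, Sa-SlugCas9, Sa-SlutCas9, and SlugCas9-HF proteins are shown in SEQ ID NOs: 23-32 and SEQ ID NO: 112, respectively.
Step (3): The Cas9 coding sequences obtained in step (2) were produced by gene synthesis and cloned into the pAAV2_ITR backbone plasmid to give plasmid pAAV2_Cas9_ITR, as shown in Figure 2.
Example 2. Preparation of linearized plasmid pAAV2_Cas9_ITR
Step (1): The plasmid pAAV2_Cas9_ITR obtained in Example 1 was linearized by digestion with BsaI restriction enzyme. The digestion contained 1 μg of plasmid pAAV2_Cas9_ITR, 5 μL of 10x CutSmart buffer, and 1 μL of BsaI, made up to 50 μL with ddH2O. The digestion was incubated at 37°C for 1 hour.
Step (2): The digested product was run on a 1% agarose gel at 120 V for 30 minutes.
Step (3): The DNA fragment was excised and recovered with a gel extraction kit according to the manufacturer's protocol, and finally eluted in ddH2O. The DNA fragments are the linearized pAAV2_Cas9_ITR plasmids containing SauriCas9, ShaCas9, SlugCas9, SlutCas9, Sa-SauriCas9, Sa-SepCas9, Sa-SeqCas9, Sa-ShaCas9, Sa-SlugCas9, Sa-SlutCas9, and SlugCas9-HF, with sizes of 7447 bp, 7430 bp, 7427 bp, 7437 bp, 7433 bp, 7430 bp, 7423 bp, 7430 bp, 7430 bp, 7433 bp, and 7427 bp, respectively.
Step (4): The DNA concentration of the recovered linearized plasmid pAAV2_Cas9_ITR was measured with a NanoDrop, and the plasmid was either used directly or stored at -20°C for long-term storage.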
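Example 3. Construction of plasmid pAAV2_Cas9-hU6-sgRNA
Step (1): Design the sgRNA sequence.
Step (2): Add the sticky-end sequences matching the two ends of the linearized plasmid pAAV2_Cas9_ITR to the sense and antisense strands corresponding to the designed sgRNA sequence, and synthesize the single-stranded oligonucleotide DNAs.
For genes other than SlugCas9-HF, the sequences of the single-stranded oligonucleotide DNAs are:
Oligo-F: CACCGCTCGGAGATCATCATTGCG (SEQ ID NO: 16);
Oligo-R: AAACCGCAATGATGATCTCCGAGC (SEQ ID NO: 17).
For SlugCas9-HF, the sequences of the single-stranded oligonucleotide DNAs are:
Oligo-F1: CACCAGAGTAGGCTGGTAGATGGAG (SEQ ID NO: 59);
Oligo-R1: AAACCTCCATCTACCAGCCTACTCT (SEQ ID NO: 60);
Oligo-F2: CACCGTCAGACATGAGATCACAGAT (SEQ ID NO: 61);
Oligo-R2: AAACATCTGTGATCTCATGTCTGAC (SEQ ID NO: 62).
Step (3): Anneal the single-stranded oligonucleotide DNAs to form double-stranded DNA. The annealing reaction contains 1 μL of 100 μM Oligo-F, 1 μL of 100 μM Oligo-R, and 28 μL of ddH2O. After vortex mixing, place the reaction in a PCR thermocycler and run the following annealing program: 95°C for 5 min, then 85°C, 75°C, 65°C, 55°C, 45°C, 35°C, and 25°C for 1 min each, followed by a hold at 4°C, with a cooling rate of 0.3°C/s.
Step (4): Ligate the annealed product to the linearized plasmid pAAV2_Cas9_ITR obtained in Example 2 using DNA ligase, following the product protocol.
Step (5): Transform 1 μL of the ligation product into chemically competent cells, and verify the resulting bacterial clones by Sanger sequencing.
Step (6): Culture the clones confirmed by sequencing to be correctly ligated, and extract the plasmid, namely pAAV2_Cas9-hU6-sgRNA, for later use.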
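The oligo pairs listed in step (2) follow a simple pattern that can be read off the sequences themselves: the forward oligo is CACC followed by the spacer, and the reverse oligo is AAAC followed by the reverse complement of the spacer. The sketch below reproduces that pattern; the function names are illustrative, and the overhangs are taken from the listed oligos rather than from a stated design rule, so they should be treated as specific to this vector.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_oligos(spacer: str, fwd_overhang: str = "CACC", rev_overhang: str = "AAAC"):
    """Return (Oligo-F, Oligo-R) for cloning a spacer into the BsaI-cut vector.

    The default overhangs are inferred from SEQ ID NOs: 16/17 and 59-62.
    """
    spacer = spacer.upper()
    if set(spacer) - set("ACGT"):
        raise ValueError("spacer must contain only A, C, G, T")
    return fwd_overhang + spacer, rev_overhang + reverse_complement(spacer)


if __name__ == "__main__":
    for spacer in ("GCTCGGAGATCATCATTGCG",      # gives SEQ ID NOs: 16 and 17
                   "AGAGTAGGCTGGTAGATGGAG",     # gives SEQ ID NOs: 59 and 60
                   "GTCAGACATGAGATCACAGAT"):    # gives SEQ ID NOs: 61 and 62
        print(design_oligos(spacer))
```

Once annealed, the two oligos leave 4-nt single-stranded ends (CACC and AAAC) that pair with the overhangs of the BsaI-linearized plasmid.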
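Example 4. Transfection of HEK293T cell lines with plasmid pAAV2_Cas9-hU6-sgRNA expressing the Cas9 protein and the sgRNA
1. Transfection of the GFP reporter HEK293T cell line with pAAV2_Cas9-hU6-sgRNA
Step (1): On day 0, the HEK293T cell line containing the GFP reporter was plated in 6-well plates at approximately 30% confluence, as required for transfection. The target-site sequence is shown in SEQ ID NO: 18 (GCTCGGAGATCATCATTGCGNNNNN).
Step (2): On day 1, transfection was carried out as follows:
i. 2 μg of the plasmid pAAV2_Cas9-hU6-sgRNA to be transfected was added to 100 μL of Opti-MEM medium and mixed by gentle pipetting;
ii. The
Figure PCTCN2020107880-appb-000050
reagent was mixed by gently flicking the tube; 5 μL was added to 100 μL of Opti-MEM medium, mixed gently, and left at room temperature for 5 min;
iii. The diluted
Figure PCTCN2020107880-appb-000051
reagent and the diluted plasmid were combined, mixed by gentle pipetting, left at room temperature for 20 min, and then added to the medium of the cells to be transfected. These cells contain the nucleotide sequence CMV-ATG-target site-NNNNNNN-GFP, shown in SEQ ID NO: 113, which includes the sequence shown in SEQ ID NO: 18. Note that this sequence has 7 random bases N between the target site and the GFP sequence, which serve as the PAM sequence (underlined and in bold in the sequence listing).
Step (3): The cells were returned to a 37°C, 5% CO2 incubator for further culture.
Step (4): Three days after editing, GFP-positive cells were isolated by flow sorting and returned to a 37°C, 5% CO2 incubator for further culture.
The sequence shown in SEQ ID NO: 113:
Figure PCTCN2020107880-appb-000052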
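The logic of the reporter can be sketched with two small calculations: how many PAM variants a 7-base random library contains, and which indel sizes would put GFP back in frame. The insert length used below is hypothetical (the actual length of the insert in SEQ ID NO: 113 is not reproduced in this text), and the function names are illustrative.

```python
def pam_library_size(n_random_bases: int = 7) -> int:
    """Number of distinct sequences in a library of n random bases."""
    return 4 ** n_random_bases


def restores_frame(insert_len: int, indel_net_change: int) -> bool:
    """True if the insert plus the net indel length is a multiple of three.

    insert_len is the number of bases placed between ATG and the GFP coding
    sequence; indel_net_change is positive for insertions and negative for
    deletions.
    """
    return (insert_len + indel_net_change) % 3 == 0


if __name__ == "__main__":
    print(pam_library_size())            # 7 random bases
    hypothetical_insert = 28             # assumed out-of-frame length, not from the patent
    for change in (-4, -3, -2, -1, 1, 2, 3):
        print(change, restores_frame(hypothetical_insert, change))
```

Only a subset of repair outcomes restores the frame, which is consistent with the text's statement that GFP is recovered in a fraction of the cells; the GFP-positive rate is therefore a relative readout rather than a direct measure of cutting frequency.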
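2. Transfection of the HEK293T cell line with pAAV2_SlugCas9-HF-hU6-sgRNA
Step (1): On day 0, the HEK293T cell line containing the sgRNA target sites was plated in 6-well plates at approximately 30% confluence, as required for transfection. The sequence of the G4 target site of SlugCas9-HF is shown in SEQ ID NO: 63 (AGAGTAGGCTGGTAGATGGAGNNNN), and the sequence of the G7 target site is shown in SEQ ID NO: 64 (ATCTGTGATCTCATGTCTGACNNNN).
Step (2): On day 1, transfection was carried out as follows:
i. 2 μg of the plasmid pAAV2_SlugCas9-HF-hU6-sgRNA to be transfected was added to 100 μL of Opti-MEM medium and mixed by gentle pipetting;
ii. The
Figure PCTCN2020107880-appb-000053
reagent was mixed by gently flicking the tube; 5 μL was added to 100 μL of Opti-MEM medium, mixed gently, and left at room temperature for 5 min;
iii. The diluted
Figure PCTCN2020107880-appb-000054
reagent and the diluted plasmid were combined, mixed by gentle pipetting, left at room temperature for 20 min, and then added to the medium of the cells to be transfected.
Step (3): The cells were returned to a 37°C, 5% CO2 incubator for further culture.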
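Example 5. Preparation of the next-generation sequencing library
Step (1): HEK293T cells edited for 3 days, or flow-sorted GFP reporter HEK293T cells, were collected, and genomic DNA was extracted with a DNA kit according to the product protocol.
Step (2): The first round of library PCR was carried out with 2x Q5 Master Mix. For genes other than SlugCas9-HF, the PCR primers are SEQ ID NO: 19 and SEQ ID NO: 20; for the SlugCas9-HF gene, the PCR primers are SEQ ID NO: 65 and SEQ ID NO: 66. The sequences are:
For genes other than SlugCas9-HF:
F1: ACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNGCGAGAAAAGCCTTGTTT (SEQ ID NO: 19);
R1: ACTGGAGTTCAGACGTGTGCTCTTCCGATCTCTGAACTTGTGGCCGTTTAC (SEQ ID NO: 20);
For SlugCas9-HF:
F1: ACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNTGTCAGGCAGCAGAGCTC (SEQ ID NO: 65);
R1: ACTGGAGTTCAGACGTGTGCTCTTCCGATCTNNNNGGCGATGGCTTCCTGGTC (SEQ ID NO: 66).
The reaction mixture is as follows:
Figure PCTCN2020107880-appb-000055
Figure PCTCN2020107880-appb-000056
The PCR program is as follows:
Figure PCTCN2020107880-appb-000057
Step (3): The second round of library PCR was carried out with 2x Q5 Master Mix. For genes other than SlugCas9-HF, the PCR primers are SEQ ID NO: 21 and SEQ ID NO: 22; for SlugCas9-HF, the first-round PCR product of the G4 site was amplified with the primers shown in SEQ ID NO: 21 and SEQ ID NO: 22, and the first-round PCR product of the G7 site was amplified with the primers shown in SEQ ID NO: 21 and SEQ ID NO: 67.
F2: AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGAC (SEQ ID NO: 21), used in all reactions;
R2 (also designated R2-1 for the G4 site): CAAGCAGAAGACGGCATACGAGATCACTGTGTGACTGGAGTTCAGACGTGTG (SEQ ID NO: 22);
R2-2 (G7 site): CAAGCAGAAGACGGCATACGAGATATTGGCGTGACTGGAGTTCAGACGTGTG (SEQ ID NO: 67).
The reaction mixture is as follows:
Figure PCTCN2020107880-appb-000058
The PCR program is as follows:
Figure PCTCN2020107880-appb-000059
Step (4): The second-round PCR product was purified with a gel extraction kit according to the manufacturer's protocol, recovering the DNA fragment of 366 bp or 406 bp (the latter only for SlugCas9-HF). This completes preparation of the next-generation sequencing library.
Example 6. Analysis of the next-generation sequencing results
Step (1): The prepared library was submitted to a sequencing company for paired-end sequencing on a HiSeq X Ten.
Step (2): The next-generation sequencing results were analyzed bioinformatically; part of the editing results are shown in Figures 3a to 3j. As the figures show, the editing outcomes include deletions, insertions, and mismatches, and the last 4 bp or 5 bp represent the PAM sequence, which for Figures 3a to 3j is NNGG, NNGRM, NNGG, NNGR, NNGG, NNGG, NNGRM, NNGRM, NNGG, and NNGRR, respectively. Figure 3k shows editing by the SlugCas9-HF gene editing system at two target sites; the X axis shows the two target sites G4 and G7, and the Y axis shows indel efficiency.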
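The text does not describe the bioinformatic pipeline used to obtain the indel efficiency, so the following is only a deliberately crude sketch of the idea: a read is counted as carrying an indel when its length differs from that of the unedited reference amplicon. It assumes the reads have already been merged, trimmed to the amplicon, and filtered, and it cannot detect substitutions or balanced insertion-plus-deletion events; real analyses align each read to the reference. The function name and the toy sequences are illustrative.

```python
def indel_efficiency(reads, reference: str) -> float:
    """Percentage of reads whose length differs from the reference amplicon.

    Crude length-based proxy for the indel rate; see the caveats in the
    accompanying text.
    """
    if not reads:
        raise ValueError("no reads supplied")
    edited = sum(1 for read in reads if len(read) != len(reference))
    return 100.0 * edited / len(reads)


if __name__ == "__main__":
    reference = "ACGTGCTCGGAGATCATCATTGCGTTGGACGT"   # toy amplicon, not from the patent
    reads = [
        reference,                     # unedited
        reference[:20] + reference[22:],   # 2-bp deletion
        reference[:20] + "A" + reference[20:],  # 1-bp insertion
        reference,                     # unedited
    ]
    print(indel_efficiency(reads, reference))
```

A per-site value computed in this spirit corresponds to the Y axis of Figure 3k, although the actual figure was presumably produced with a proper alignment-based workflow.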
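Example 7. Validation at endogenous sites
Step (1): The plasmid pAAV2_Cas9-hU6-sgRNA expressing Cas9 and the sgRNA was transfected using
Figure PCTCN2020107880-appb-000060
into HEK293T cells according to the manufacturer's protocol; the crRNA and target-site sequences for each Cas9 are shown in Table 1;
Step (2): Genomic DNA was extracted from the cells 5 days after editing, and the target DNA sequence was amplified with 2x Q5 Master Mix using primers Test-F and Test-R, whose sequences are given in Table 1 below;
Step (3): The PCR products were recovered from an agarose gel, and the resulting DNA fragments of different sizes were purified; the fragment sizes are shown in Table 1;
Step (4): The purified DNA fragments were digested with T7 Endonuclease I according to its instructions and then analyzed by gel electrophoresis.
The results are shown in Figures 4a to 4j. In each figure, the left lane is the negative control, transfected without sgRNA: no cleaved fragments appear after T7 Endonuclease I digestion of the target sequence, indicating that no editing occurred. The right lane is the experimental group, transfected with sgRNA: cleaved fragments appear after T7 Endonuclease I digestion of the target sequence, indicating that editing occurred.
Figure PCTCN2020107880-appb-000061
Figure PCTCN2020107880-appb-000062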
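The example reports T7 Endonuclease I results qualitatively (cleaved bands present or absent). When band intensities are quantified, a widely used estimator, which is not stated in this document, converts the cleaved fraction into an approximate indel percentage as 100 × (1 − sqrt(1 − f_cut)), where f_cut is the cleaved-band intensity divided by the total intensity. The sketch below implements that estimator; the function name and the example intensities are made up for illustration.

```python
import math


def t7e1_indel_percent(uncut: float, cut_1: float, cut_2: float) -> float:
    """Estimate indel percentage from T7E1 band intensities.

    uncut is the intensity of the full-length band; cut_1 and cut_2 are the
    intensities of the two cleavage products.
    """
    total = uncut + cut_1 + cut_2
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    fraction_cut = (cut_1 + cut_2) / total
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cut))


if __name__ == "__main__":
    # Hypothetical densitometry values, not data from the patent.
    print(round(t7e1_indel_percent(uncut=64, cut_1=20, cut_2=16), 1))
```

The square root reflects that cleavage requires a heteroduplex, so the cleaved fraction underestimates neither strand alone; the estimate is approximate and depends on complete digestion and accurate densitometry.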
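Example 8. Specificity testing of the CRISPR/Cas9 system
In this example, the specificity of the CRISPR/Cas9 system was verified using SlugCas9-HF as an example, as follows:
1. Construct plasmid pAAV2_SlugCas9-HF_ITR; see Example 1 for the procedure.
2. Prepare linearized plasmid pAAV2_SlugCas9-HF_ITR; see Example 2 for the procedure.
3. Construct plasmids pAAV2_SlugCas9-HF-hU6-On target sgRNA and pAAV2_SlugCas9-HF-hU6-mismatch sgRNA
Step (1): Design the on-target sgRNA sequence and the mismatch sgRNA sequences.
Step (2): Add the sticky-end sequences matching the two ends of the linearized plasmid pAAV2_SlugCas9-HF_ITR to the sense and antisense strands corresponding to the designed on-target and mismatch sgRNA sequences, and synthesize the single-stranded oligonucleotide DNAs. Their sequences are as follows (the mismatched bases, which are underlined and in bold in the original publication, are the two bases immediately after the space within each mismatch sequence):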
Oligo-F3:CACCGGCTCGGAGATCATCATTGCG(SEQ ID NO:68)(On-target)
Oligo-R3:AAACCGCAATGATGATCTCCGAGCC(SEQ ID NO:89)(On-target)
Oligo-F4:CACC AACTCGGAGATCATCATTGCG(SEQ ID NO:69)(mismatch)
Oligo-F5:CACCG ATTCGGAGATCATCATTGCG(SEQ ID NO:70)(mismatch)
Oligo-F6:CACCGG TCCGGAGATCATCATTGCG(SEQ ID NO:71)(mismatch)
Oligo-F7:CACCGGC CTGGAGATCATCATTGCG(SEQ ID NO:72)(mismatch)
Oligo-F8:CACCGGCT TAGAGATCATCATTGCG(SEQ ID NO:73)(mismatch)
Oligo-F9:CACCGGCTC AAAGATCATCATTGCG(SEQ ID NO:74)(mismatch)
Oligo-F10:CACCGGCTCG AGGATCATCATTGCG(SEQ ID NO:75)(mismatch)
Oligo-F11:CACCGGCTCGG GAATCATCATTGCG(SEQ ID NO:76)(mismatch)
Oligo-F12:CACCGGCTCGGA AGTCATCATTGCG(SEQ ID NO:77)(mismatch)
Oligo-F13:CACCGGCTCGGAG GCCATCATTGCG(SEQ ID NO:78)(mismatch)
Oligo-F14:CACCGGCTCGGAGA CTATCATTGCG(SEQ ID NO:79)(mismatch)
Oligo-F15:CACCGGCTCGGAGAT TGTCATTGCG(SEQ ID NO:80)(mismatch)
Oligo-F16:CACCGGCTCGGAGATC GCCATTGCG(SEQ ID NO:81)(mismatch)
Oligo-F17:CACCGGCTCGGAGATCA CTATTGCG(SEQ ID NO:82)(mismatch)
Oligo-F18:CACCGGCTCGGAGATCAT TGTTGCG(SEQ ID NO:83)(mismatch)
Oligo-F19:CACCGGCTCGGAGATCATC GCTGCG(SEQ ID NO:84)(mismatch)
Oligo-F20:CACCGGCTCGGAGATCATCA CCGCG(SEQ ID NO:85)(mismatch)
Oligo-F21:CACCGGCTCGGAGATCATCAT CACG(SEQ ID NO:86)(mismatch)
Oligo-F22:CACCGGCTCGGAGATCATCATT ATG(SEQ ID NO:87)(mismatch)
Oligo-F23:CACCGGCTCGGAGATCATCATTG TA(SEQ ID NO:88)(mismatch)
Oligo-R4:AAACCGCAATGATGATCTCCGAG TT(SEQ ID NO:90)(mismatch)
Oligo-R5:AAACCGCAATGATGATCTCCGA ATC(SEQ ID NO:91)(mismatch)
Oligo-R6:AAACCGCAATGATGATCTCCG GACC(SEQ ID NO:92)(mismatch)
Oligo-R7:AAACCGCAATGATGATCTCC AGGCC(SEQ ID NO:93)(mismatch)
Oligo-R8:AAACCGCAATGATGATCTC TAAGCC(SEQ ID NO:94)(mismatch)
Oligo-R9:AAACCGCAATGATGATCT TTGAGCC(SEQ ID NO:95)(mismatch)
Oligo-R10:AAACCGCAATGATGATC CTCGAGCC(SEQ ID NO:96)(mismatch)
Oligo-R11:AAACCGCAATGATGAT TCCCGAGCC(SEQ ID NO:97)(mismatch)
Oligo-R12:AAACCGCAATGATGA CTTCCGAGCC(SEQ ID NO:98)(mismatch)
Oligo-R13:AAACCGCAATGATG GCCTCCGAGCC(SEQ ID NO:99)(mismatch)
Oligo-R14:AAACCGCAATGAT AGTCTCCGAGCC(SEQ ID NO:100)(mismatch)
Oligo-R15:AAACCGCAATGA CAATCTCCGAGCC(SEQ ID NO:101)(mismatch)
Oligo-R16:AAACCGCAATG GCGATCTCCGAGCC(SEQ ID NO:102)(mismatch)
Oligo-R17:AAACCGCAAT AGTGATCTCCGAGCC(SEQ ID NO:103)(mismatch)
Oligo-R18:AAACCGCAA CAATGATCTCCGAGCC(SEQ ID NO:104)(mismatch)
Oligo-R19:AAACCGCA GCGATGATCTCCGAGCC(SEQ ID NO:105)(mismatch)
Oligo-R20:AAACCGC GGTGATGATCTCCGAGCC(SEQ ID NO:106)(mismatch)
Oligo-R21:AAACCG TGATGATGATCTCCGAGCC(SEQ ID NO:107)(mismatch)
Oligo-R22:AAACC ATAATGATGATCTCCGAGCC(SEQ ID NO:108)(mismatch)
Oligo-R23:AAAC TACAATGATGATCTCCGAGCC(SEQ ID NO:109)(mismatch)
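Step (3): Anneal the single-stranded oligonucleotide DNAs to form double-stranded DNA. The annealing reaction contains 1 μL of 100 μM Oligo-F, 1 μL of 100 μM Oligo-R, and 28 μL of ddH2O. After vortex mixing, place the reaction in a PCR thermocycler and run the annealing program: 95°C for 5 min, then 85°C, 75°C, 65°C, 55°C, 45°C, 35°C, and 25°C for 1 min each, followed by a hold at 4°C, with a cooling rate of 0.3°C/s.
Step (4): Ligate the annealed product to the linearized plasmid pAAV2_SlugCas9-HF_ITR using DNA ligase, following the product protocol.
Step (5): Transform 1 μL of the ligation product into chemically competent cells, and verify the resulting bacterial clones by Sanger sequencing.
Step (6): Culture the clones confirmed by sequencing to be correctly ligated, and extract the plasmids pAAV2_SlugCas9-HF-hU6-On target sgRNA and pAAV2_SlugCas9-HF-hU6-mismatch sgRNA for later use.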
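Comparing the mismatch oligos in step (2) with the on-target oligo shows a regular design: each mismatch guide differs from the 21-nt on-target spacer at two adjacent positions, each changed by a transition (A to G, G to A, C to T, T to C), and the pair of positions slides along the spacer one base at a time. The sketch below regenerates the spacers under that reading; the function name is illustrative, and the rule is inferred from the listed sequences rather than stated in the text.

```python
# Transition partner for each base (purine to purine, pyrimidine to pyrimidine).
TRANSITION = {"A": "G", "G": "A", "C": "T", "T": "C"}


def adjacent_double_mismatches(spacer: str):
    """Return spacers carrying transitions at positions (i, i+1) for every i."""
    spacer = spacer.upper()
    variants = []
    for i in range(len(spacer) - 1):
        variants.append(
            spacer[:i] + TRANSITION[spacer[i]] + TRANSITION[spacer[i + 1]] + spacer[i + 2:]
        )
    return variants


if __name__ == "__main__":
    on_target = "GGCTCGGAGATCATCATTGCG"  # spacer of Oligo-F3 (SEQ ID NO: 68) without CACC
    for number, variant in enumerate(adjacent_double_mismatches(on_target), start=4):
        print(f"Oligo-F{number}: CACC{variant}")
```

Run on the on-target spacer, this yields 20 variants whose CACC-prefixed forms match Oligo-F4 through Oligo-F23, which is consistent with the 20 mismatch sgRNAs tested alongside the single on-target sgRNA.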
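4. Transfect the GFP reporter HEK293T cell line with pAAV2_SlugCas9-HF-hU6-On target sgRNA and pAAV2_SlugCas9-HF-hU6-mismatch sgRNA separately, as follows:
Step (1): On day 0, the HEK293T cell line containing the GFP reporter was plated in 6-well plates at approximately 30% confluence, as required for transfection. The target-site sequence is shown in SEQ ID NO: 110 (GGCTCGGAGATCATCATTGCGNNNN).
Step (2): On day 1, transfection was carried out as follows:
i. 2 μg of the plasmid to be transfected (pAAV2_SlugCas9-HF-hU6-On target sgRNA or pAAV2_SlugCas9-HF-hU6-mismatch sgRNA) was added to 100 μL of Opti-MEM medium and mixed by gentle pipetting;
ii. The
Figure PCTCN2020107880-appb-000063
reagent was mixed by gently flicking the tube; 5 μL was added to 100 μL of Opti-MEM medium, mixed gently, and left at room temperature for 5 min;
iii. The diluted
Figure PCTCN2020107880-appb-000064
reagent and the diluted plasmid were combined, mixed by gentle pipetting, left at room temperature for 20 min, and then added to the medium of the cells to be transfected. These cells contain the nucleotide sequence CMV-ATG-target site-CTGG-GFP, shown in SEQ ID NO: 111, which includes the sequence shown in SEQ ID NO: 110.
Step (3): The cells were returned to a 37°C, 5% CO2 incubator for further culture.
5. Flow cytometric analysis of the editing efficiency and off-target rate of SlugCas9-HF
Step (1): Cells were collected 3 days after editing and processed for flow cytometry;
Step (2): The percentage of GFP-positive cells was analyzed with FlowJo software and plotted.
The sequence shown in SEQ ID NO: 111:
Figure PCTCN2020107880-appb-000065
Figure PCTCN2020107880-appb-000066
Figure 4k shows the specificity of the SlugCas9-HF gene editing system in the GFP reporter HEK293T cell line. As the figure shows, the complex of SlugCas9-HF and the sgRNA specifically cleaved the target sequence (the sequence labeled 1 in the figure) but essentially could not or could not cleave the non-target sequences (the sequences labeled 2-21 in the figure). The system of the present invention therefore has high specificity and a low off-target rate.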

Claims (23)

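  1. A CRISPR/Cas9 gene editing system, the gene editing being carried out in a cell or in vitro, characterized in that the CRISPR/Cas9 system is a complex of a Cas9 protein and an sgRNA that can precisely locate a target DNA sequence and cleave it, producing a double-strand break in the target DNA sequence;
    the Cas9 protein is:
    SauriCas9 protein, having the amino acid sequence shown in SEQ ID NO: 1,
    ShaCas9 protein, having the amino acid sequence shown in SEQ ID NO: 2,
    SlugCas9 protein, having the amino acid sequence shown in SEQ ID NO: 3,
    SlutCas9 protein, having the amino acid sequence shown in SEQ ID NO: 4,
    Sa-SauriCas9 protein, having the amino acid sequence shown in SEQ ID NO: 5,
    Sa-SepCas9 protein, having the amino acid sequence shown in SEQ ID NO: 6,
    Sa-SeqCas9 protein, having the amino acid sequence shown in SEQ ID NO: 7,
    Sa-ShaCas9 protein, having the amino acid sequence shown in SEQ ID NO: 8,
    Sa-SlugCas9 protein, having the amino acid sequence shown in SEQ ID NO: 9,
    Sa-SlutCas9 protein, having the amino acid sequence shown in SEQ ID NO: 10, or
    SlugCas9-HF protein, having the amino acid sequence shown in SEQ ID NO: 58, or
    the Cas9 protein has an amino acid sequence at least 80% identical to the amino acid sequence shown in any one of SEQ ID NOs: 1-10 and SEQ ID NO: 58; and
    the sgRNA has the nucleotide sequence shown in SEQ ID NO: 11, or is an sgRNA sequence engineered from SEQ ID NO: 11.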
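  2. The CRISPR/Cas9 gene editing system according to claim 1, characterized in that the cells include eukaryotic cells and prokaryotic cells; the eukaryotic cells include, for example, mammalian cells and plant cells; the mammalian cells include, for example, Chinese hamster ovary cells, baby hamster kidney cells, mouse Sertoli cells, mouse mammary tumor cells, Buffalo rat liver cells, rat hepatoma cells, the SV40-transformed monkey kidney CV1 line, monkey kidney cells, canine kidney cells, human cervical carcinoma cells, human lung cells, human liver cells, NIH/3T3 cells, human U2-OS osteosarcoma cells, human A549 cells, human K562 cells, human HEK293 cells, human HEK293T cells, human HCT116 cells, human MCF-7 cells, or TRI cells.
  3. The CRISPR/Cas9 gene editing system according to claim 1, characterized in that the Cas9 protein includes Cas9 proteins with no cleavage activity, with single-strand cleavage activity only, or with double-strand cleavage activity.
  4. The CRISPR/Cas9 gene editing system according to claim 1, characterized in that precisely locating the target DNA sequence includes the 20-nt or 21-nt sequence at the 5' end of the sgRNA forming a complementary base-paired structure with the target DNA sequence.
  5. The CRISPR/Cas9 gene editing system according to claim 1, characterized in that precisely locating the target DNA sequence includes the complex of the Cas9 protein and the sgRNA recognizing the PAM sequence on the target DNA sequence.
  6. The CRISPR/Cas9 gene editing system according to claim 1, characterized in that the 20-nt or 21-nt sequence at the 5' end of the sgRNA in the complex of the SlugCas9-HF protein and the sgRNA can form an incompletely complementary base-paired structure with a non-target DNA sequence; preferably, the non-target DNA sequence has two or more base mismatches with the sgRNA.
  7. The CRISPR/Cas9 gene editing system according to claim 6, characterized in that the complex of the SlugCas9-HF protein and the sgRNA can recognize the PAM sequence on the non-target DNA sequence.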
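  8. The CRISPR/Cas9 gene editing system according to any one of claims 5-7, characterized in that:
    for the SauriCas9 protein, the PAM is NNGG and the target DNA sequence is as shown in SEQ ID NO: 12;
    for the ShaCas9 protein, the PAM is NNGRM and the target DNA sequence is as shown in SEQ ID NO: 13;
    for the SlugCas9 protein, the PAM is NNGG and the target DNA sequence is as shown in SEQ ID NO: 12;
    for the SlutCas9 protein, the PAM is NNGR and the target DNA sequence is as shown in SEQ ID NO: 14;
    for the Sa-SauriCas9 protein, the PAM is NNGG and the target DNA sequence is as shown in SEQ ID NO: 12;
    for the Sa-SepCas9 protein, the PAM is NNGG and the target DNA sequence is as shown in SEQ ID NO: 12;
    for the Sa-SeqCas9 protein, the PAM is NNGRM and the target DNA sequence is as shown in SEQ ID NO: 13;
    for the Sa-ShaCas9 protein, the PAM is NNGRM and the target DNA sequence is as shown in SEQ ID NO: 13;
    for the Sa-SlugCas9 protein, the PAM is NNGG and the target DNA sequence is as shown in SEQ ID NO: 12;
    for the Sa-SlutCas9 protein, the PAM is NNGRR and the target DNA sequence is as shown in SEQ ID NO: 15;
    for the SlugCas9-HF protein, the PAM is NNGG and the target DNA sequence is as shown in SEQ ID NO: 12.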
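  9. The CRISPR/Cas9 gene editing system according to claim 1, characterized in that the sgRNA is engineered by phosphorylation, shortening, lengthening, sulfurization, methylation, or hydroxylation.
  10. The CRISPR/Cas9 gene editing system according to claim 1, characterized in that the statement that the complex of the Cas9 protein and the sgRNA can precisely locate the target DNA sequence means that the complex can recognize and bind the target DNA sequence, or that it brings another protein fused to the Cas9 protein, or a protein that specifically recognizes the sgRNA, to the position of the target DNA sequence.
  11. The CRISPR/Cas9 gene editing system according to claim 10, characterized in that the complex of the Cas9 protein and the sgRNA, the other protein fused to the Cas9 protein, or the protein that specifically recognizes the sgRNA can modify and regulate the target DNA region, the modification and regulation including regulation of gene transcription, regulation of DNA methylation, DNA acetylation modification, histone acetylation modification, single-base editing, or chromatin imaging and tracking.
  12. The CRISPR/Cas9 gene editing system according to any one of claims 6-8, characterized in that the complex of the SlugCas9-HF protein and the sgRNA essentially cannot or cannot recognize and bind the non-target DNA sequence, or essentially cannot or cannot bring another protein fused to the SlugCas9-HF protein, or a protein that specifically recognizes the sgRNA, to the position of the non-target DNA sequence.
  13. The CRISPR/Cas9 gene editing system according to claim 12, characterized in that the complex of the SlugCas9-HF protein and the sgRNA, the other protein fused to the SlugCas9-HF protein, or the protein that specifically recognizes the sgRNA essentially cannot or cannot modify and regulate a non-target DNA region, the modification and regulation including regulation of gene transcription, regulation of DNA methylation, DNA acetylation modification, histone acetylation modification, single-base editing, or chromatin imaging and tracking.
  14. The CRISPR/Cas9 gene editing system according to claim 11 or 13, characterized in that the single-base editing includes conversion of adenine to guanine, conversion of cytosine to thymine, conversion of cytosine to uracil, or conversion between other bases.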
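  15. A method for gene editing in a cell using the CRISPR/Cas9 gene editing system according to any one of claims 1-14, in which the complex of the Cas9 protein and the sgRNA recognizes and locates a target DNA sequence in order to edit it, the method comprising the following steps:
    (1) synthesizing a Cas9 gene sequence and cloning it into an expression vector, for example pAAV2_ITR, to obtain an expression vector carrying the Cas9 gene sequence, for example pAAV2_Cas9_ITR, wherein the Cas9 gene sequence:
    (a) has a nucleotide sequence encoding the amino acid sequence shown in any one of SEQ ID NOs: 1-10 and SEQ ID NO: 58,
    (b) has a nucleotide sequence encoding an amino acid sequence at least 80% identical to the amino acid sequence shown in any one of SEQ ID NOs: 1-10 and SEQ ID NO: 58; or
    (c) is a humanized Cas9 gene sequence, for example one having the nucleotide sequence shown in any one of SEQ ID NOs: 23-32 and SEQ ID NO: 112;
    (2) synthesizing single-stranded oligonucleotide DNAs corresponding to the sgRNA, namely a forward oligonucleotide strand and a reverse oligonucleotide strand, annealing them, and ligating the annealed product into a restriction site of the expression vector carrying the Cas9 gene sequence, for example the BsaI site of plasmid pAAV2_Cas9_U6_BsaI, to obtain an expression vector expressing the Cas9 protein and the sgRNA, for example pAAV2_Cas9-hU6-sgRNA, wherein the sgRNA has the nucleotide sequence shown in SEQ ID NO: 11, or a nucleotide sequence at least 80% identical to it, or a modification of that nucleotide sequence, the modification including for example phosphorylation, shortening, lengthening, sulfurization, methylation, or hydroxylation;
    (3) delivering the expression vector expressing the Cas9 protein and the sgRNA into cells containing the target site, so that the target site is edited;
    (4) optionally, measuring the editing efficiency at the edited target site, for example by PCR amplification of the edited target site followed by T7EI digestion or next-generation sequencing.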
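  16. The method according to claim 15, characterized in that pAAV2_Cas9-hU6-sgRNA is an adeno-associated virus backbone plasmid comprising AAV2 ITR, CMV enhancer, CMV promoter, SV40 NLS, Cas9, nucleoplasmin NLS, 3x HA, bGH poly(A), human U6 promoter, BsaI restriction sites, and the sgRNA scaffold sequence.
  17. The method according to claim 15, characterized in that the CRISPR/Cas9 system delivered to the cells includes a plasmid, retrovirus, adenovirus, or adeno-associated virus vector expressing the Cas9 protein and the sgRNA, or the sgRNA and the Cas9 protein themselves.
  18. The method according to claim 15, characterized in that, for Cas9 genes other than the SlugCas9-HF gene, the forward oligonucleotide strand and the reverse oligonucleotide strand have the nucleotide sequences shown in SEQ ID NO: 16 and SEQ ID NO: 17, respectively; for the SlugCas9-HF gene, the forward and reverse oligonucleotide strands comprise a first forward oligonucleotide strand and a first reverse oligonucleotide strand, shown in SEQ ID NO: 59 and SEQ ID NO: 60, and a second forward oligonucleotide strand and a second reverse oligonucleotide strand, shown in SEQ ID NO: 61 and SEQ ID NO: 62.
  19. The method according to claim 15, characterized in that, for Cas9 genes other than the SlugCas9-HF gene, the target site in the cells in step (3) has the nucleotide sequence shown in SEQ ID NO: 18; for the SlugCas9-HF gene, the target sites in the cells in step (3) have the nucleotide sequences shown in SEQ ID NO: 63 and SEQ ID NO: 64, respectively.
  20. The method according to claim 15, characterized in that the template for PCR in step (4) is the edited DNA; for Cas9 genes other than the SlugCas9-HF gene, the primers for PCR amplification are SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, and SEQ ID NO: 22; for the SlugCas9-HF gene, the primers for PCR amplification are SEQ ID NO: 65, SEQ ID NO: 66, SEQ ID NO: 21, SEQ ID NO: 22, and SEQ ID NO: 67.
  21. The method according to claim 11, characterized in that the cells include eukaryotic cells and prokaryotic cells; the eukaryotic cells include mammalian cells and plant cells; the mammalian cells include Chinese hamster ovary cells, baby hamster kidney cells, mouse Sertoli cells, mouse mammary tumor cells, Buffalo rat liver cells, rat hepatoma cells, the SV40-transformed monkey kidney CV1 line, monkey kidney cells, canine kidney cells, human cervical carcinoma cells, human lung cells, human liver cells, NIH/3T3 cells, human U2-OS osteosarcoma cells, human A549 cells, human K562 cells, human HEK293 cells, human HEK293T cells, human HCT116 cells, human MCF-7 cells, or TRI cells.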
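  22. A kit of a CRISPR/Cas9 gene editing system for gene editing, the kit comprising:
    (1) a Cas9 protein and an sgRNA, wherein the Cas9 protein is:
    a SauriCas9 protein having the amino acid sequence shown in SEQ ID NO: 1,
    a ShaCas9 protein having the amino acid sequence shown in SEQ ID NO: 2,
    a SlugCas9 protein having the amino acid sequence shown in SEQ ID NO: 3,
    a SlutCas9 protein having the amino acid sequence shown in SEQ ID NO: 4,
    a Sa-SauriCas9 protein having the amino acid sequence shown in SEQ ID NO: 5,
    a Sa-SepCas9 protein having the amino acid sequence shown in SEQ ID NO: 6,
    a Sa-SeqCas9 protein having the amino acid sequence shown in SEQ ID NO: 7,
    a Sa-ShaCas9 protein having the amino acid sequence shown in SEQ ID NO: 8,
    a Sa-SlugCas9 protein having the amino acid sequence shown in SEQ ID NO: 9,
    a Sa-SlutCas9 protein having the amino acid sequence shown in SEQ ID NO: 10, or
    a SlugCas9-HF protein having the amino acid sequence shown in SEQ ID NO: 58, or
    the Cas9 protein has an amino acid sequence at least 80% identical to the amino acid sequence shown in any one of SEQ ID NO: 1-10 and SEQ ID NO: 58; and
    the sgRNA has the nucleotide sequence shown in SEQ ID NO: 11, or is an sgRNA sequence engineered from SEQ ID NO: 11;
    or
    (2) an expression vector into which a Cas9 gene sequence and single-stranded oligonucleotide DNAs corresponding to an sgRNA have been cloned, wherein
    the Cas9 gene sequence:
    (a) has a nucleotide sequence encoding the amino acid sequence shown in any one of SEQ ID NO: 1-10 and SEQ ID NO: 58,
    (b) has a nucleotide sequence encoding an amino acid sequence at least 80% identical to the amino acid sequence shown in any one of SEQ ID NO: 1-10 and SEQ ID NO: 58; or
    (c) is a humanized Cas9 gene sequence, for example one having the nucleotide sequence shown in any one of SEQ ID NO: 23-32 and SEQ ID NO: 112, and
    the sgRNA has the nucleotide sequence shown in SEQ ID NO: 11, or is an sgRNA sequence engineered from SEQ ID NO: 11.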
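
Claims 15 and 22 rely on an "at least 80% identical" threshold without stating how identity is computed. The sketch below shows one simple convention, assumed here for illustration only: identity is the number of matching columns divided by the number of alignment columns in a pairwise alignment that has already been produced by some alignment tool. The function names and the toy sequences are hypothetical and are not sequences from the patent.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Drop columns in which both sequences carry a gap; they hold no information.
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no informative columns")
    # A column matches only when both residues are present and equal.
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 80.0) -> bool:
    """True if the aligned pair reaches the identity threshold (default 80%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy 10-residue alignment with one substitution.
    reference = "ACDEFGHIKL"
    variant = "ACDEFGHIKV"
    print(percent_identity(reference, variant))
    print(meets_identity_threshold(reference, variant))
```

Other conventions divide by the length of the shorter sequence or exclude gapped columns, and they can give different percentages for the same pair, so the denominator should be fixed before comparing a candidate against the threshold.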
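  23. Use of the CRISPR/Cas9 gene editing system according to any one of claims 1-14 in gene knockout, site-directed base changes, site-directed insertion, regulation of gene transcription levels, regulation of DNA methylation, DNA acetylation modification, histone acetylation modification, single-base converters, or chromatin imaging and tracking.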
PCT/CN2020/107880 2019-08-08 2020-08-07 CRISPR/Cas9 gene editing system and application thereof WO2021023307A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US17/633,354 US20240175055A1 (en) 2019-08-08 2020-08-07 Crispr/cas9 gene editing system and application thereof
JP2022507560A JP2022543451A (ja) 2019-08-08 2020-08-07 CRISPR/Cas9 gene editing system and application thereof
EP20849939.2A EP4012037A1 (en) 2019-08-08 2020-08-07 Crispr/cas9 gene editing system and application thereof

Applications Claiming Priority (20)

Application Number Priority Date Filing Date Title
CN201910731794.1 2019-08-08
CN201910731396.XA CN110577969B (zh) 2019-08-08 2019-08-08 CRISPR/Sa-SlugCas9 gene editing system and application thereof
CN201910731401.7 2019-08-08
CN201910731396.X 2019-08-08
CN201910731390.2 2019-08-08
CN201910731794.1A CN110551762B (zh) 2019-08-08 2019-08-08 CRISPR/ShaCas9 gene editing system and application thereof
CN201910731795.6A CN110499334A (zh) 2019-08-08 2019-08-08 CRISPR/SlugCas9 gene editing system and application thereof
CN201910731398.9A CN110577970B (zh) 2019-08-08 2019-08-08 CRISPR/Sa-SlutCas9 gene editing system and application thereof
CN201910731402.1A CN110551761B (zh) 2019-08-08 2019-08-08 CRISPR/Sa-SepCas9 gene editing system and application thereof
CN201910731803.7A CN110499335B (zh) 2019-08-08 2019-08-08 CRISPR/SauriCas9 gene editing system and application thereof
CN201910731802.2 2019-08-08
CN201910731401.7A CN110577971B (zh) 2019-08-08 2019-08-08 CRISPR/Sa-SauriCas9 gene editing system and application thereof
CN201910731390.2A CN110551760B (zh) 2019-08-08 2019-08-08 CRISPR/Sa-SeqCas9 gene editing system and application thereof
CN201910731795.6 2019-08-08
CN201910731398.9 2019-08-08
CN201910731412.5A CN110577972B (zh) 2019-08-08 2019-08-08 CRISPR/Sa-ShaCas9 gene editing system and application thereof
CN201910731402.1 2019-08-08
CN201910731802.2A CN110551763B (zh) 2019-08-08 2019-08-08 CRISPR/SlutCas9 gene editing system and application thereof
CN201910731803.7 2019-08-08
CN201910731412.5 2019-08-08

Publications (1)

Publication Number Publication Date
WO2021023307A1 (zh) 2021-02-11

Family

ID=74503329

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/107880 WO2021023307A1 (zh) 2019-08-08 2020-08-07 CRISPR/Cas9 gene editing system and application thereof

Country Status (4)

Country Link
US (1) US20240175055A1 (zh)
EP (1) EP4012037A1 (zh)
JP (1) JP2022543451A (zh)
WO (1) WO2021023307A1 (zh)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
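CN110499334A (zh) * 2019-08-08 2019-11-26 Fudan University CRISPR/SlugCas9 gene editing system and application thereof
CN110499335A (zh) * 2019-08-08 2019-11-26 Fudan University CRISPR/SauriCas9 gene editing system and application thereof
CN110551760A (zh) * 2019-08-08 2019-12-10 Fudan University A CRISPR/Sa-SeqCas9 gene editing system and application thereof
CN110551761A (zh) * 2019-08-08 2019-12-10 Fudan University CRISPR/Sa-SepCas9 gene editing system and application thereof
CN110551762A (zh) * 2019-08-08 2019-12-10 Fudan University CRISPR/ShaCas9 gene editing system and application thereof
CN110551763A (zh) * 2019-08-08 2019-12-10 Fudan University CRISPR/SlutCas9 gene editing system and application thereof
CN110577971A (zh) * 2019-08-08 2019-12-17 Fudan University CRISPR/Sa-SauriCas9 gene editing system and application thereof
CN110577972A (zh) * 2019-08-08 2019-12-17 Fudan University CRISPR/Sa-ShaCas9 gene editing system and application thereof
CN110577970A (zh) * 2019-08-08 2019-12-17 Fudan University CRISPR/Sa-SlutCas9 gene editing system and application thereof
CN110577969A (zh) * 2019-08-08 2019-12-17 Fudan University CRISPR/Sa-SlugCas9 gene editing system and application thereof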
EP4144841A1 (en) * 2021-09-07 2023-03-08 Bayer AG Novel small rna programmable endonuclease systems with impoved pam specificity and uses thereof

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024155111A1 (ko) * 2023-01-18 2024-07-25 주식회사 엣진 Extracellular assay for measuring the base-correction ability of a base editor

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
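CN105209621A (zh) * 2012-12-12 2015-12-30 The Broad Institute, Inc. Engineering and optimization of improved systems, methods and enzyme compositions for sequence manipulation
WO2018078134A1 (en) * 2016-10-28 2018-05-03 Genethon Compositions and methods for the treatment of myotonic dystrophy
CN110337493A (zh) * 2016-10-28 2019-10-15 Genethon Compositions and methods for the treatment of myotonic dystrophy
CN110499335A (zh) * 2019-08-08 2019-11-26 Fudan University CRISPR/SauriCas9 gene editing system and application thereof
CN110499334A (zh) * 2019-08-08 2019-11-26 Fudan University CRISPR/SlugCas9 gene editing system and application thereof
CN110551760A (zh) * 2019-08-08 2019-12-10 Fudan University A CRISPR/Sa-SeqCas9 gene editing system and application thereof
CN110551763A (zh) * 2019-08-08 2019-12-10 Fudan University CRISPR/SlutCas9 gene editing system and application thereof
CN110551761A (zh) * 2019-08-08 2019-12-10 Fudan University CRISPR/Sa-SepCas9 gene editing system and application thereof
CN110551762A (zh) * 2019-08-08 2019-12-10 Fudan University CRISPR/ShaCas9 gene editing system and application thereof
CN110577972A (zh) * 2019-08-08 2019-12-17 Fudan University CRISPR/Sa-ShaCas9 gene editing system and application thereof
CN110577970A (zh) * 2019-08-08 2019-12-17 Fudan University CRISPR/Sa-SlutCas9 gene editing system and application thereof
CN110577969A (zh) * 2019-08-08 2019-12-17 Fudan University CRISPR/Sa-SlugCas9 gene editing system and application thereof
CN110577971A (zh) * 2019-08-08 2019-12-17 Fudan University CRISPR/Sa-SauriCas9 gene editing system and application thereof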

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11339437B2 (en) * 2014-03-10 2022-05-24 Editas Medicine, Inc. Compositions and methods for treating CEP290-associated disease
CN105950639A (zh) * 2016-05-04 2016-09-21 广州美格生物科技有限公司 Preparation of a Staphylococcus aureus CRISPR/Cas9 system and its application in constructing mouse models

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
"NCBI", Database accession no. WP_075777761.1
"UniProt", Database accession no. A0A2T4SLN6
DATABASE Protein 19 June 2019 (2019-06-19), ANONYMOUS: "type II CRISPR RNA-guided endonuclease Cas9 [Staphylococcus lugdunensis]", XP055778439, retrieved from NCBI Database accession no. WP_002460848.1 *
DATABASE Protein 20 June 2019 (2019-06-20), ANONYMOUS: "type II CRISPR RNA-guided endonuclease Cas9 [Staphylococcus haemolyticus]", XP055778436, retrieved from NCBI Database accession no. WP_053017934.1 *
DATABASE Protein 20 June 2019 (2019-06-20), ANONYMOUS: "type II CRISPR RNA-guided endonuclease Cas9 [Staphylococcus lutrae]", XP055778441, retrieved from NCBI Database accession no. WP_085237539.1 *
DATABASE Protein 9 October 2019 (2019-10-09), ANONYMOUS: "WP_107392933.1", XP055778430, retrieved from NCBI Database accession no. WP_107392933.1 *
KIRA S MAKAROVA , L ARAVIIND , NICK V GRISHIN , IGOR B ROGOZIN , EUGENE V KOONIN: "A DNA Repair System Specific for Thermophilic Archaea and Bacteria Predicted by Genomic Context Analysis", NUCLEIC ACIDS RESEARCH, vol. 30, no. 2, 15 January 2002 (2002-01-15), pages 482 - 496, XP002784019, ISSN: 1362-4962 *

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
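CN110577969B (zh) * 2019-08-08 2022-07-22 Fudan University CRISPR/Sa-SlugCas9 gene editing system and application thereof
CN110551762A (zh) * 2019-08-08 2019-12-10 Fudan University CRISPR/ShaCas9 gene editing system and application thereof
CN110499334A (zh) * 2019-08-08 2019-11-26 Fudan University CRISPR/SlugCas9 gene editing system and application thereof
CN110577971B (zh) * 2019-08-08 2022-11-18 Fudan University CRISPR/Sa-SauriCas9 gene editing system and application thereof
CN110577972B (zh) * 2019-08-08 2022-10-11 Fudan University CRISPR/Sa-ShaCas9 gene editing system and application thereof
CN110551763A (zh) * 2019-08-08 2019-12-10 Fudan University CRISPR/SlutCas9 gene editing system and application thereof
CN110577971A (zh) * 2019-08-08 2019-12-17 Fudan University CRISPR/Sa-SauriCas9 gene editing system and application thereof
CN110577972A (zh) * 2019-08-08 2019-12-17 Fudan University CRISPR/Sa-ShaCas9 gene editing system and application thereof
CN110577970A (zh) * 2019-08-08 2019-12-17 Fudan University CRISPR/Sa-SlutCas9 gene editing system and application thereof
CN110577969A (zh) * 2019-08-08 2019-12-17 Fudan University CRISPR/Sa-SlugCas9 gene editing system and application thereof
CN110551760A (zh) * 2019-08-08 2019-12-10 Fudan University A CRISPR/Sa-SeqCas9 gene editing system and application thereof
CN110499335A (zh) * 2019-08-08 2019-11-26 Fudan University CRISPR/SauriCas9 gene editing system and application thereof
CN110551761A (zh) * 2019-08-08 2019-12-10 Fudan University CRISPR/Sa-SepCas9 gene editing system and application thereof
CN110577970B (zh) * 2019-08-08 2022-11-18 Fudan University CRISPR/Sa-SlutCas9 gene editing system and application thereof
CN110551761B (zh) * 2019-08-08 2022-11-18 Fudan University CRISPR/Sa-SepCas9 gene editing system and application thereof
CN110551760B (zh) * 2019-08-08 2022-11-18 Fudan University CRISPR/Sa-SeqCas9 gene editing system and application thereof
CN110499335B (zh) * 2019-08-08 2023-03-28 Fudan University CRISPR/SauriCas9 gene editing system and application thereof
CN110551763B (zh) * 2019-08-08 2023-03-10 Fudan University CRISPR/SlutCas9 gene editing system and application thereof
CN110551762B (zh) * 2019-08-08 2023-03-10 Fudan University CRISPR/ShaCas9 gene editing system and application thereof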
WO2023036669A1 (en) * 2021-09-07 2023-03-16 Bayer Aktiengesellschaft Novel small rna programmable endonuclease systems with impoved pam specificity and uses thereof
EP4144841A1 (en) * 2021-09-07 2023-03-08 Bayer AG Novel small rna programmable endonuclease systems with impoved pam specificity and uses thereof

Also Published As

Publication number Publication date
EP4012037A1 (en) 2022-06-15
JP2022543451A (ja) 2022-10-12
US20240175055A1 (en) 2024-05-30

Similar Documents

Publication Publication Date Title
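WO2021023307A1 (zh) CRISPR/Cas9 gene editing system and application thereof
US11667917B2 (en) Composition for genome editing using CRISPR/CPF1 system and use thereof
US20180195089A1 (en) CRISPR Oligonucleotides and Gene Editing
WO2017190664A1 (zh) Use of chemically synthesized crRNA and modified crRNA in CRISPR/Cpf1 gene editing systems
EP3744844A1 (en) Extended single guide rna and use thereof
CN113373130A (zh) Cas12 protein, gene editing system containing Cas12 protein, and application
WO2017215648A1 (zh) Gene knockout method
CN110551761B (zh) CRISPR/Sa-SepCas9 gene editing system and application thereof
CN110577971B (zh) CRISPR/Sa-SauriCas9 gene editing system and application thereof
WO2020173150A1 (zh) Single-base editing causes off-target single-nucleotide variations, and a highly specific, off-target-free single-base gene editing tool that avoids such variations
CN105567718B (zh) Method for constructing a vector that simultaneously expresses multiple sgRNAs
WO2023142594A1 (zh) A precise, PAM-unrestricted adenine base editor and application thereof
CN112159801B (zh) SlugCas9-HF protein, gene editing system containing SlugCas9-HF protein, and application
WO2020087631A1 (zh) Genome editing system and method based on C2c1 nuclease
CN110499335B (zh) CRISPR/SauriCas9 gene editing system and application thereof
CN110551762B (zh) CRISPR/ShaCas9 gene editing system and application thereof
CN110577969B (zh) CRISPR/Sa-SlugCas9 gene editing system and application thereof
CN110499334A (zh) CRISPR/SlugCas9 gene editing system and application thereof
CN110551763B (zh) CRISPR/SlutCas9 gene editing system and application thereof
WO2023016021A1 (zh) A base editing tool and construction method thereof
JP7109009B2 (ja) Gene knockout method
CN116004716A (zh) Method for efficient gene editing using a replicative dCas9-FokI system
CN110551760B (zh) CRISPR/Sa-SeqCas9 gene editing system and application thereof
CN114990093A (zh) Protein sequence mini rfx-cas13d with a small amino acid sequence
CN111662932B (zh) Method for improving homologous recombination repair efficiency in CRISPR-Cas9 gene editing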

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20849939

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2022507560

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2020849939

Country of ref document: EP

Effective date: 20220309